ABLE 2.3.0 07/13/2005 14:21:00
java.lang.Object
  com.ibm.able.AbleObject
    com.ibm.able.beans.AbleAbstractImport
This abstract class provides common interfaces to import data sources for Able Beans.
An Import bean's primary function is to read data from a data source and parse each record into the outputBuffer array when processed. AbleAbstractImport can load all the records into memory or, optionally, cache a limited number of records at a time. The number of records cached is specified by bufferSize. This object handles the caching. Each time all records in the data source have been processed, it sends an end-of-file event and increments the numEpochs value.
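The buffering and epoch counting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ABLE implementation: the class BufferedEpochReader and its next() method are hypothetical, the data source is simulated by an in-memory list, and the end-of-file event is reduced to a counter increment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an import bean; not part of the ABLE API. */
final class BufferedEpochReader {
    private final List<String> source;  // stands in for the underlying data source
    private final int bufferSize;       // 0 = load all records, >0 = records per block
    private List<String> buffer = new ArrayList<>();
    private int bufferRecordIndex = 0;  // position within the current buffer
    private int recordsRead = 0;        // records pulled from the source in this pass
    private long numEpochs = 0;         // completed passes over the source

    BufferedEpochReader(List<String> source, int bufferSize) {
        if (source.isEmpty() || bufferSize < 0) {
            throw new IllegalArgumentException("need a non-empty source and bufferSize >= 0");
        }
        this.source = source;
        this.bufferSize = bufferSize;
    }

    /** Return the next record, refilling the buffer and counting epochs as needed. */
    String next() {
        if (bufferRecordIndex >= buffer.size()) {
            fillBuffer();
        }
        String record = buffer.get(bufferRecordIndex++);
        // Once the last record of the source has been handed out, the pass is complete.
        if (recordsRead == source.size() && bufferRecordIndex == buffer.size()) {
            numEpochs++;      // a real bean would also notify its listeners here
            recordsRead = 0;  // the next call starts a new pass from the beginning
        }
        return record;
    }

    /** Load the next block of records (or the whole source when bufferSize is 0). */
    private void fillBuffer() {
        int blockSize = (bufferSize == 0) ? source.size() : bufferSize;
        int end = Math.min(recordsRead + blockSize, source.size());
        buffer = new ArrayList<>(source.subList(recordsRead, end));
        recordsRead = end;
        bufferRecordIndex = 0;
    }

    long getNumEpochs() {
        return numEpochs;
    }
}
```

With a five-record source and a buffer size of 2, the sketch reads blocks of 2, 2, and 1 records, reports one epoch after the fifth call to next(), and then starts again at the first record.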
An Import uses an AbleImportData object to handle the I/O. Meta-data must be provided in order for an AbleImportData to create field variables. When the data source is first opened, it is scanned to determine the number of records. On this first pass it also computes min/mean/max values for continuous fields, creates symbol-to-index mappings for categorical fields, and creates number-to-index mappings for discrete fields. To force additional data sources within the same agent to use the same definition, set computeStatistics to false.
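As an illustration of the first-pass scan, the sketch below computes min/mean/max for one continuous field and builds a symbol-to-index mapping for one categorical field. The class FirstPassStats, its record layout (a two-element String array), and the choice to assign indices in order of first appearance are assumptions made for the example; ABLE's own field classes are not used.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical first-pass scanner; not part of the ABLE API. */
final class FirstPassStats {
    long count = 0;                        // number of records seen
    double min = Double.POSITIVE_INFINITY; // smallest continuous value seen
    double max = Double.NEGATIVE_INFINITY; // largest continuous value seen
    double sum = 0.0;                      // running total, used for the mean
    // Categorical symbol -> index, assigned in order of first appearance (assumption).
    final Map<String, Integer> symbolToIndex = new LinkedHashMap<>();

    /** Scan records of the form { continuousValueAsText, categoricalSymbol }. */
    void scan(List<String[]> records) {
        for (String[] rec : records) {
            double v = Double.parseDouble(rec[0]);
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
            symbolToIndex.putIfAbsent(rec[1], symbolToIndex.size());
            count++;
        }
    }

    /** Mean of the continuous field, or NaN when nothing has been scanned. */
    double mean() {
        return count == 0 ? Double.NaN : sum / count;
    }
}
```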
An AbleAbstractImport can be used to generate an AbleFilter bean which will translate the data in the manner specified in the meta-data definition file. Field usage can be initialized from a data definition file (a *.dfn file) for text import beans. It can also be specified interactively on the customizer's data panel for import objects such as database imports whose metadata does not include field usage.
When an Import is processed, it populates the outputBuffer array with elements from the data source. If the data consists solely of continuous fields, a double array is used; otherwise, an Object array is populated. Records may be processed sequentially from the data source, or in random sequence. When buffering is used, the records are randomly retrieved from within each buffer. After all records in the buffer have been processed, the next buffer of records is retrieved.
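Random retrieval within a buffer can be pictured as shuffling the buffer's indices and then visiting records in that order. The sketch below is one plausible way to do this, using a Fisher-Yates shuffle; it is not taken from the ABLE source. The class RandomIndexSketch and the drain helper are hypothetical, and the generateRandomIndices shown here takes an extra Random argument that the documented method does not have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of per-buffer random access; not part of the ABLE API. */
final class RandomIndexSketch {

    /** Return the indices 0..bufferSize-1 in random order (Fisher-Yates shuffle). */
    static int[] generateRandomIndices(int bufferSize, Random rng) {
        int[] idx = new int[bufferSize];
        for (int i = 0; i < bufferSize; i++) {
            idx[i] = i;
        }
        for (int i = bufferSize - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);  // pick a position in 0..i
            int tmp = idx[i];
            idx[i] = idx[j];
            idx[j] = tmp;
        }
        return idx;
    }

    /** Visit every record of one buffer exactly once, in shuffled order. */
    static List<String> drain(List<String> buffer, Random rng) {
        int[] order = generateRandomIndices(buffer.size(), rng);
        List<String> out = new ArrayList<>(buffer.size());
        for (int i : order) {
            out.add(buffer.get(i));
        }
        return out;
    }
}
```

Because the shuffle is confined to one buffer, each record is still produced exactly once per pass, and the next buffer is loaded only after the current one is exhausted.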
| Field Summary | |
|---|---|
| protected int | bufferRecordIndex: Current record in the buffer file being processed. |
| protected int | bufferSize: The maximum number of records to read in a block from this data source. |
| protected boolean | computeStatistics: A boolean indicating that metadata is to be opened and field statistics are to be computed when the data source is opened. |
| protected boolean | cycleRelative: A flag indicating the cycleSize is relative to the file size, i.e., a multiplier. |
| protected double | cycleSize: When cycleRelative is false, cycleSize is the raw number of records to process in a cycle. |
| static java.lang.String | defaultName: Value assigned to name by default. |
| protected boolean | eof: When the last record in the file has been processed, eof is true. |
| protected java.util.Vector | fieldList: A Vector of AbleField objects describing the data source. |
| protected AbleImportData | importData: The AbleImportData object referenced by this import. |
| protected long | numEpochs: The number of times this data source has processed all records it contains. |
| protected java.util.Vector | numericData: A Vector of double arrays containing records from the database table. |
| protected int | numFieldsPerRecord: The number of fields in a record from a data source. |
| protected long | numRecords: The total number of records in this data source. |
| double[] | outNum: A double array used in calculating the output buffer. |
| java.lang.Object[] | outSym: A String array used in calculating the output buffer. |
| protected int[] | randomIndices: An array of indices used when records are randomly accessed. |
| protected boolean | randomizeData: Determines whether to output records from the data source in random or sequential order. |
| protected long | recordIndex: Current record in the entire data file being processed. |
| protected long | recordsRead: The number of records read from the start of the data source. |
| protected java.util.Vector | textData: A Vector of String arrays containing records from the database table. |
| Fields inherited from class com.ibm.able.AbleObject |
|---|
| changed, chgSupport, comment, dataFlowEnabled, destBufferConnections, eventQueue, fileName, inputBuffer, listeners, logger, name, outputBuffer, parent, properties, propertyConnectionMgr, sourceBufferConnections, state, stateChgSupport, trace |
| Constructor Summary | |
|---|---|
| AbleAbstractImport() | Construct a default AbleAbstractImport object. |
| AbleAbstractImport(java.lang.String name) | Construct an AbleAbstractImport object with specified name. |
| Method Summary | |
|---|---|
| void | close(): Close the data source, disable data flow, and set its state to Uninitiated. |
| void | endOfFile(): Notify any listeners that we are at the end of the file, and increment the epoch count numEpochs. |
| boolean | eof(): Return whether the data source is at end of file. |
| protected void | generateRandomIndices(int bufferSize): Generate a set of randomized indices to access the data records. |
| protected java.util.Vector | getAgentFieldList(): Get the default fieldList for this object's container agent. |
| int | getBufferSize(): Return the buffer size. |
| boolean | getComputeStatistics(): Return the value of the computeStatistics setting. |
| long | getCurrentRecordIndex(): Get the index of the last record in the entire data file processed. |
| double | getCycleSize(): Return the raw cycle size setting. |
| java.lang.String | getCycleSizeAsString(): Return the raw cycle size formatted appropriately for the cycleRelative flag. |
| java.util.Vector | getFieldList(): Return a Vector of AbleField objects defining each field in the data source. |
| java.util.Vector | getFieldList(java.lang.String usageType): Return a Vector of AbleField objects with the specified usage. |
| void | getNextRecordBlock(): Read the next bufferSize records from the data source. |
| protected java.lang.Object[] | getNextTextRecord(): Return the next array of text from the data source. |
| int | getNormalizedRecordSize(): Return the size of the record after categorical and discrete fields are expanded. |
| int | getNumberOfOutputFields(): Return the number of fields per record in the data source. |
| long | getNumEpochs(): Retrieve the number of passes over the data, or epochs. |
| long | getNumRecords(): Return the number of records in the data source. |
| long | getRecordIndex(): Return the record index in the data source. |
| long | getRecordsRead(): Return the current count of records read from the beginning of the data source. |
| long | getStepsPerCycle(): Calculate and return the number of steps in a cycle from the raw cycle size, using the cycleRelative flag. |
| void | init(): Open the data source. |
| boolean | isAllNumericData(): Return true if all fields are "continuous", and false if any are "discrete" or "categorical"; that is, symbols. |
| boolean | isCycleRelative(): Return whether the raw cycle size is to be interpreted as a factor of the number of records in the data source, or as an absolute number of records. |
| boolean | isRandomizeData(): Return whether records are processed in random sequence or not. |
| boolean | isReady(): Indicate whether the importData is ready to provide data. |
| void | open(): Open the data source if it is ready. |
| void | process(): Get the next record from the data source and place its contents in the outputBuffer. |
| void | processAbleEvent(AbleEvent e): Process an AbleEvent sent by another Able bean. |
| void | processTimerEvent(): Process a timer expiration event synchronously; that is, on the same thread as the caller. |
| void | quitAll(): Close an open data source. |
| void | reset(): Set processing options to default values, and re-initialize (reopen) the data source. |
| void | setBufferSize(int size): Set the buffer size, which determines whether to load the entire data source (=0) or just pieces of it (>0) into memory. |
| void | setComputeStatistics(boolean computeStatistics): Set the value of the computeStatistics flag. |
| void | setCycleSize(double cycleSize, boolean relative): Set the cycle size and definition for its use. |
| protected void | setDefaults(): Set processing options to default values. |
| void | setFieldList(java.util.Vector fieldList): Set a Vector of AbleField objects defining each field in the data source. |
| void | setRandomizeData(boolean state): Set the randomize flag so records are processed in random sequence. |
| void | setRecordIndex(long theRecordIndex): Set the record index in the data source. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |

| Methods inherited from interface com.ibm.able.AbleBean |
|---|
| getComment, getLogger, getName, getParent, getProperties, getProperty, getState, getTraceLogger, init, isChanged, process, removeAllConnections, removeProperty, resumeAll, setChanged, setComment, setLogger, setName, setParent, setProperties, setProperty, setState, setTraceLogger, suspendAll |

| Methods inherited from interface com.ibm.able.AbleEventListener |
|---|
| handleAbleEvent |

| Methods inherited from interface com.ibm.able.AbleEventListenerManager |
|---|
| addAbleEventListener, dataChanged, getAbleEventListeners, notifyAbleEventListeners, removeAbleEventListener |

| Methods inherited from interface com.ibm.able.AbleEventQueueProcessor |
|---|
| processNoEventProcessingEnabledSituation |

| Methods inherited from interface com.ibm.able.AblePropertyChangeManager |
|---|
| addPropertyChangeListener, addPropertyConnection, getPropertyChangeListeners, getPropertyConnectionManager, removeAllPropertyConnections, removePropertyChangeListener, removePropertyConnection |

| Methods inherited from interface com.ibm.able.AbleSerializable |
|---|
| getFileName, restoreFromFile, restoreFromFile, saveToFile, saveToFile, setFileName |
| Field Detail |
public static final java.lang.String defaultName
protected int numFieldsPerRecord
protected int bufferSize
protected long numRecords
protected long recordsRead
protected long numEpochs
protected AbleImportData importData
protected boolean randomizeData
protected int[] randomIndices
protected boolean computeStatistics
protected java.util.Vector fieldList
protected transient java.util.Vector textData
protected transient java.util.Vector numericData
protected long recordIndex
protected int bufferRecordIndex
protected boolean eof
protected double cycleSize
protected boolean cycleRelative
public transient double[] outNum
public transient java.lang.Object[] outSym
| Constructor Detail |
public AbleAbstractImport()
throws AbleException
public AbleAbstractImport(java.lang.String name)
throws AbleException
Parameters: name - A String containing the name used to identify this bean.
| Method Detail |
public void init()
throws AbleException
Specified by: init in interface AbleBean
Overrides: init in class AbleObject
Throws: AbleException - If an error occurs.
See Also: open()
public void open()
throws AbleException
Throws: AbleException
See Also: init()
protected java.util.Vector getAgentFieldList()
throws AbleException
The result is the fieldList from the active data source if there is one, or the first opened data source otherwise. It will be an empty Vector if the container has no open data sources. If this object is not in a container, return the object's current fieldList.
Throws: AbleException
public void process()
throws AbleException
Specified by: process in interface AbleBean
Overrides: process in class AbleObject
Throws: AbleException
See Also: AbleObject.inputBuffer, AbleObject.outputBuffer, AbleBean.process()
public void processTimerEvent()
throws AbleException
This method is called by our AbleEventQueue whenever the following conditions are all true:
This method calls process to populate the output buffer with the next data record.
Specified by: processTimerEvent in interface AbleEventQueueProcessor
Overrides: processTimerEvent in class AbleObject
Throws: AbleException - If an error occurs.
See Also: AbleObject.setSleepTime(long), AbleObject.setTimerEventProcessingEnabled(boolean), AbleObject.startEnabledEventProcessing()
public void reset()
throws AbleException
Specified by: reset in interface AbleBean
Overrides: reset in class AbleObject
Throws: AbleException - If an error occurs.
See Also: AbleBean.reset()
protected void setDefaults()
throws AbleException
Throws: AbleException
public void quitAll()
throws AbleException
Specified by: quitAll in interface AbleBean
Overrides: quitAll in class AbleObject
Throws: AbleException
See Also: close()
public void close()
throws AbleException
Throws: AbleException
public int getNumberOfOutputFields()
Specified by: getNumberOfOutputFields in interface AbleDataSource
public long getNumRecords()
Specified by: getNumRecords in interface AbleDataSource
public long getRecordIndex()
public void setRecordIndex(long theRecordIndex)
public void getNextRecordBlock()
throws AbleException
Throws: AbleException
protected java.lang.Object[] getNextTextRecord()
throws AbleException
Throws: AbleException
public void setBufferSize(int size)
throws AbleException
Throws: AbleException
public int getBufferSize()
public long getRecordsRead()
public long getCurrentRecordIndex()
public void processAbleEvent(AbleEvent e)
throws AbleException
Specified by: processAbleEvent in interface AbleEventQueueProcessor
Overrides: processAbleEvent in class AbleObject
Parameters: e - The event to process.
Throws: AbleException - If an error occurs.
See Also: AbleObject.setAbleEventProcessingEnabled(int), AbleObject.startEnabledEventProcessing(), AbleObject.handleAbleEvent(AbleEvent)
public long getNumEpochs()
Specified by: getNumEpochs in interface AbleDataSource
public java.util.Vector getFieldList()
Specified by: getFieldList in interface AbleDataSource
public void setFieldList(java.util.Vector fieldList)
Specified by: setFieldList in interface AbleDataSource
Parameters: fieldList - A Vector of fields. The Vector may be empty, but not null.
public java.util.Vector getFieldList(java.lang.String usageType)
throws AbleException
AbleData.UsageType(String).
Parameters: usageType - A String denoting usage type.
Throws: AbleException
public int getNormalizedRecordSize()
throws AbleException
A categorical field, for example, is encoded in 1-of-N format in which one field in a boolean vector is used to indicate the value present. The expanded field thus is the same length as the number of unique categorical values.
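The 1-of-N expansion can be sketched as follows. This is an illustration under assumptions rather than ABLE's encoder: the class OneOfNSketch and both methods are hypothetical, each continuous field is assumed to occupy one slot, and discrete fields are left out because the text above only describes the categorical case.

```java
import java.util.List;

/** Hypothetical 1-of-N encoder; not part of the ABLE API. */
final class OneOfNSketch {

    /** Encode one categorical value as a boolean vector with a single true entry. */
    static boolean[] encode(List<String> categories, String value) {
        int i = categories.indexOf(value);
        if (i < 0) {
            throw new IllegalArgumentException("unknown category: " + value);
        }
        boolean[] out = new boolean[categories.size()];  // all false by default
        out[i] = true;
        return out;
    }

    /**
     * Size of a record after expansion: one slot per continuous field (assumption)
     * plus one slot per unique value of every categorical field.
     */
    static int normalizedRecordSize(int numContinuous, List<List<String>> categoricalFields) {
        int size = numContinuous;
        for (List<String> categories : categoricalFields) {
            size += categories.size();
        }
        return size;
    }
}
```

For example, a record with two continuous fields, one three-valued categorical field, and one two-valued categorical field would expand to 2 + 3 + 2 = 7 slots under these assumptions.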
Throws: AbleException
public boolean isAllNumericData()
Specified by: isAllNumericData in interface AbleDataSource
public boolean eof()
public boolean isReady()
throws AbleException
Specified by: isReady in interface AbleDataSource
Throws: AbleException
public void endOfFile()
protected void generateRandomIndices(int bufferSize)
public void setRandomizeData(boolean state)
public boolean isRandomizeData()
public double getCycleSize()
public java.lang.String getCycleSizeAsString()
public boolean isCycleRelative()
public long getStepsPerCycle()
Specified by: getStepsPerCycle in interface AbleDataSource
public void setCycleSize(double cycleSize,
boolean relative)
Parameters:
cycleSize - A double value.
relative - A boolean indicating how to interpret the cycleSize. If true, the number of steps in a cycle is the cycleSize multiplied by the number of records in the data source. If false, the number is the absolute number of records to process in a cycle.
public boolean getComputeStatistics()
public void setComputeStatistics(boolean computeStatistics)
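The relative and absolute interpretations of the cycle size described for setCycleSize can be sketched as a small calculation. The class CycleSizeSketch and its stepsPerCycle method are hypothetical, and rounding to the nearest whole record is an assumption; the documentation above does not say how getStepsPerCycle converts the double to a long.

```java
/** Hypothetical cycle-size calculation; not part of the ABLE API. */
final class CycleSizeSketch {

    /**
     * If relative is true, treat cycleSize as a multiplier of numRecords;
     * otherwise treat it as an absolute record count.
     * Rounding to the nearest long is an assumption of this sketch.
     */
    static long stepsPerCycle(double cycleSize, boolean relative, long numRecords) {
        return relative ? Math.round(cycleSize * numRecords) : Math.round(cycleSize);
    }
}
```

Under these assumptions, a relative cycle size of 2.0 over 150 records gives 300 steps per cycle, while an absolute cycle size of 25.0 gives 25 steps regardless of the record count.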