Simple Feeder (Aspire 2)

Simple Feeder (Aspire 2)
Factory Name  com.searchtechnologies.aspire:N/A
subType  N/A
Inputs  Feeder dependent
Outputs  An AspireObject published to the configured pipeline manager. The content of the document depends on the feeder.

The Simple Feeder is an abstract implementation of a feeder that provides standard feeder functionality: start, stop, statistics reporting and so on. Individual feeders implement the actual feeding methods, such as feed once, feed periodically, full feed and incremental feed. Note that this component must be extended to produce a functioning feeder.

Configuration

The following configuration is available to all feeders based on the Simple Feeder.

Element  Type  Default  Description
autoStart  boolean  false  Set to true to make the feeder start feeding automatically when the component is loaded. Otherwise the feeder must be started manually.
feederLabel  string  Implementation-dependent  The feeder label submitted in the <feederLabel> element of the published document.
loopWait  int  30000 (= 30s)  The number of milliseconds to sleep between feed iterations.
feedWait  int  0  The number of milliseconds to sleep between publishing documents. Increase from 0 to throttle feeding.
maxErrorsRetained  int  10  The number of errors to keep for display via the status page.
statsPeriod  int  10  The number of minute periods for which statistics on documents submitted, processed successfully and processed unsuccessfully are kept for display via the status page.
maxRetries  int  3  The number of times a job that cannot be placed on the pipelineManager's queue is retried before it fails.
failedJobs  string  None  The location on disk where jobs that cannot be published (after the given number of retries) are written. If this option is not set, failed jobs are not written out. Failed jobs are written in a form that can be resubmitted using the Job Error Handler; to do so, configure the directory specified here as a registered directory in the error handler.
branches  -  None  The configuration of the pipeline to publish to. See Branch Configuration below.
metadataMap  -  see below  Standard Metadata Mapper configuration. See Metadata Mapper Configuration below.
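
As a rough illustration of how the settings above might look together, here is a minimal sketch. It assumes that each entry in the table is written as a child element of the concrete feeder's component configuration. The enclosing <component> element, its name, subType and factoryName values, the feeder label and the failedJobs path are all placeholders chosen for this example, not values from a real feeder.

```xml
<!-- Sketch only: the enclosing element, its attribute values and the
     failedJobs path are placeholders, not taken from a real feeder. -->
<component name="ExampleFeeder" subType="default" factoryName="example-feeder">
  <!-- Start feeding as soon as the component is loaded (default false) -->
  <autoStart>true</autoStart>
  <feederLabel>example-feed</feederLabel>
  <!-- Sleep 60s between feed iterations (default 30000) -->
  <loopWait>60000</loopWait>
  <!-- Sleep 100ms between published documents to throttle feeding (default 0) -->
  <feedWait>100</feedWait>
  <maxErrorsRetained>10</maxErrorsRetained>
  <statsPeriod>10</statsPeriod>
  <!-- After 3 failed attempts to queue a job, write it to the failedJobs location -->
  <maxRetries>3</maxRetries>
  <failedJobs>data/failed-jobs</failedJobs>
</component>
```

Any element left out falls back to the default listed in the table. In particular, omitting failedJobs means jobs that exhaust their retries are not written to disk.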


Branch Configuration

The Simple Feeder publishes files using the branch manager. By default it uses the onPublish event. You should therefore include a <branches> element in the configuration to publish to a pipeline within a pipeline manager. See Branch Handler for more details. Feeders based on the Simple Feeder may publish to other events, in which case these events must also be configured in the branch handler.

Element  Type  Description
branches/branch/@event  string  The event to configure. At the very least, you should include the onPublish event.
branches/branch/@pipelineManager  string  The URL of the pipeline manager to publish to. Can be relative.
branches/branch/@pipeline  string  The name of the pipeline to publish to.
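
The three attributes above could be combined as in the following sketch. The element and attribute names come from the table. The relative pipeline manager path and the pipeline name are hypothetical placeholders, to be replaced with the names used in your own configuration.

```xml
<!-- Sketch only: the pipeline manager path and pipeline name are placeholders. -->
<branches>
  <!-- Route the default onPublish event to a pipeline in a pipeline manager -->
  <branch event="onPublish"
          pipelineManager="../ExamplePipelineManager"
          pipeline="example-pipeline"/>
</branches>
```

A feeder that publishes additional events would presumably need one more <branch> element per event, each with its own event attribute.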

Metadata Mapper Configuration

The Simple Feeder maps some metadata fields to fields in the AspireObject. The mapping depends on the implementation of the actual feeder.