Feeder Job Routing

For information on Aspire 3.1, see the separate Aspire 3.1 documentation.


If you're creating an application that includes a feeder, but you want to be able to configure the applications your jobs are sent to after installation, consider using job routing. This allows you to send a job to a pipeline and, at the end of that pipeline, have the job sent on to some other pipeline, based on information that has been attached to the job.

The method below adds a dropdown to your application configuration (shown at installation time using DXF) that allows you to select one or more applications to have a job routed to. The actual routing information is added to the job using the Router component. The Aspire framework itself handles the routing of jobs between the various pipelines.

The potential downside of this approach is that the configuration is done at installation time, and only applications that are installed will show in the dropdown, so you need to ensure you have all the processing applications installed before you install the feeder.

Updates to application-dxf.xml

Add the following to your application-dxf.xml file:

 <routingTable escapeValue="true" display="Job Routing">
   <routeTable>
     <route multiple="true" minCount="1">
       <dxf:attribute name="component" display="Pipeline" type="application" typeFlag="job-input" >
         <dxf:help>The document processing applications to route the jobs from the Fast listener to</dxf:help>
       </dxf:attribute>
       <dxf:attribute name="preference" display="Preference" type="pulldown">
         <dxf:help>Route preference</dxf:help>
         <dxf:option display="Local">LOCAL</dxf:option>
         <dxf:option display="Prefer local">PREFER_LOCAL</dxf:option>
         <dxf:option display="Prefer remote">PREFER_REMOTE</dxf:option>
         <dxf:option display="Remote">REMOTE</dxf:option>
       </dxf:attribute>
     </route>
   </routeTable>
   <dxf:h2 style="font-size:11px;color:#545454;font-family: Arial,Helvetica,sans-serif;font-weight:normal">Route documents to the given document processing applications</dxf:h2>
 </routingTable>


Updates to application.xml

Typically, your feeder will already be configured to send to a pipeline. For the HTTP Feeder, the configuration would look something like this:

 <component name="IsMasterHttpFeeder" factoryName="aspire-http-feeder" subType="default">
   <debug>${debug}</debug>
   <servletName>is_master</servletName>
   <outputMime>application/octet-stream</outputMime>
   <branches>
     <branch event="onPublish" pipelineManager="someManager" pipeline="somePipeline"/>
   </branches> 
 </component>

This will send the jobs from the HTTP Feeder to the somePipeline pipeline on the someManager pipeline manager.

To actually add the routing to the job, you'll need to add a Router stage to the pipeline your job is being sent to. First, add the Router component to the pipeline manager with something like this:

 <component name="Router" subType="router" factoryName="aspire-tools">
   <debug>${debug}</debug>
   ${xml:routingTable}
 </component>

Note that the name in the ${xml:routingTable} variable (less the xml: prefix) must match the top tag of the DXF from above. This causes the output from the DXF to be added to the component configuration.
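
As a rough illustration of that substitution, the sketch below shows how the Router component might look once the DXF output has replaced the variable. It assumes the installer emits the routeTable element from the DXF with one route element per selected application, carrying the component and preference attributes defined there. The application path /ProcessingApp and the exact shape of the generated XML are assumptions for illustration, not output captured from Aspire.

```xml
<!-- Hypothetical result after ${xml:routingTable} has been substituted.
     The element layout mirrors the DXF definition; treat it as an assumption. -->
<component name="Router" subType="router" factoryName="aspire-tools">
  <debug>${debug}</debug>
  <routeTable>
    <!-- "/ProcessingApp" is a placeholder for an application chosen in the dropdown -->
    <route component="/ProcessingApp" preference="LOCAL"/>
  </routeTable>
</component>
```

If more than one application were selected in the dropdown, you would expect one route element per selection under the same assumption.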

Then you'll need to add the Router to the pipeline as a stage:

 <stage component="Router" />

You'll end up with a pipeline manager that looks something like this:

 <component name="someManager" subType="pipeline"  factoryName="aspire-application">
   <debug>${debug}</debug>
   <gatherStatistics>${debug}</gatherStatistics>
 
   <pipelines>
     <pipeline name="somePipeline" default="true">
       <stages>
         <stage component="Router" />
       </stages>
     </pipeline>
   </pipelines>
       
   <components>
     <component name="Router" subType="router" factoryName="aspire-tools">
       <debug>${debug}</debug>
       ${xml:routingTable}
     </component>
   </components>
 </component>

So any job that the HTTP Feeder publishes gets processed on the somePipeline pipeline of the someManager pipeline manager. This pipeline passes the job through the Router stage, which adds the desired job routing.

When the job finishes the somePipeline pipeline, the Aspire framework uses the job's routing information to send the job to the desired application(s).
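
To see how the names line up, the feeder and the pipeline manager from above can be placed side by side. This is only a sketch that repeats the fragments already shown, with comments marking which values have to agree. Where the two components sit within your application.xml depends on your application and is not shown here.

```xml
<!-- Feeder: publishes each job to somePipeline on someManager -->
<component name="IsMasterHttpFeeder" factoryName="aspire-http-feeder" subType="default">
  <debug>${debug}</debug>
  <servletName>is_master</servletName>
  <outputMime>application/octet-stream</outputMime>
  <branches>
    <!-- pipelineManager and pipeline must match the names used below -->
    <branch event="onPublish" pipelineManager="someManager" pipeline="somePipeline"/>
  </branches>
</component>

<!-- Pipeline manager: its only stage attaches the routing to the job -->
<component name="someManager" subType="pipeline" factoryName="aspire-application">
  <debug>${debug}</debug>
  <pipelines>
    <pipeline name="somePipeline" default="true">
      <stages>
        <!-- The stage refers to the Router component by name -->
        <stage component="Router" />
      </stages>
    </pipeline>
  </pipelines>
  <components>
    <component name="Router" subType="router" factoryName="aspire-tools">
      <debug>${debug}</debug>
      <!-- Filled in from the routingTable section of application-dxf.xml -->
      ${xml:routingTable}
    </component>
  </components>
</component>
```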