RDB Row Extractor (Aspire 2)

From wiki.searchtechnologies.com
Jump to: navigation, search

For Information on Aspire 3.1 Click Here


RDB Row Extractor (Aspire 2)
Factory Name  com.searchtechnologies.aspire:aspire-rdbfeeder
subType  rdbRowExtractor
Inputs  AspireObject and RDB
Outputs  The input AspireObject with extra information added. This information is taken from the RDB.

The RDB Row Extractor stage takes an AspireObject and uses this as parameters for a SQL statement. This SQL is then executed against an RDB. The data returned is then added to the document under an output element.

The SQL to execute is held in the configuration file. The configuration allows for parameters to the SQL to be taken from the document. The location of the parameter is given as an xPath.

The extractor has three modes: Single Row, Multiple Row and Compress Multiple Rows. By default, Single Row Mode is used.

Single Row

The SQL is executed and the first row of the results added to the document. Each column of the row will appear as an element under the output element. If there is more than one row, a warning message is issued.

Example document for Single Row mode

 <doc action="insert">
   <ID source="RDBFeederImpl">1</ID>
   <fetchUrl source="RDBFeederImpl">1</fetchUrl>
   .
   .
   <pgTitle>
      <TITLE source="RDBRowExtractor">commission title</TITLE>
      <TITLE_TYPE source="RDBRowExtractor">Group Working Title</TITLE_TYPE>
   </pgTitle>
 </doc>

Multiple Row

The SQL is executed and the rows of results are added to the document. Each row will appear as an element under the output element. Each column of the row will appear as an element under that sub element.

Example document for Multiple Row mode

 <doc action="insert">
   <ID source="RDBFeederImpl">1</ID>
   <fetchUrl source="RDBFeederImpl">1</fetchUrl>
   .
   .
   <pgTitles>
     <pgTitle>
       <TITLE source="RDBRowExtractor">commission title</TITLE>
       <TITLE_TYPE source="RDBRowExtractor">Group Working Title</TITLE_TYPE>
     </pgTitle>
     <pgTitle>
       <TITLE source="RDBRowExtractor">Another commission title</TITLE>
       <TITLE_TYPE source="RDBRowExtractor">Group Working Title</TITLE_TYPE>
     </pgTitle>
   </pgTitles>
 </doc>

Compress Multiple Rows

The SQL is executed and the rows of results are added to the document. Each column of each row will be concatenated together using the configured delimiter. The complete results set will appear under the output element.

Example document for Compress Multiple Row mode

 <doc action="insert">
   <ID source="RDBFeederImpl">1</ID>
   <fetchUrl source="RDBFeederImpl">1</fetchUrl>
   .
   .
   <pgIndexingTerm source="RDBRowExtractor">/multimedia/videos/uk/london;/multimedia/sound/uk/london;/multimedia/web;/multimedia/videos/uk/herefordshire/hendre;</pgIndexingTerm>
 </doc>

Configuration

Element Type Default Description
rdbLocation String rdb The component path within the system of the RDB connection pool component. Used for fetching database connections.
output String None The name of the element in the AspireObject to write the data.
output/@row String None Turn on Multiple Row mode and use the given element name as the sub element for a row.
output/@rowDelim String None Turn on Compress Multiple Row mode and use the the given character as the delimiter.
extract None The SQL to run to extract data. See below.
extract/sql String None The SQL to be run to extract data. The SQL may include ? as place holders for parameters. These will then be substituted in in order.
extract/param/@index Integer None The parameter number of this parameter, with parameter one replacing the first ? in the SQL, parameter two replacing the second ? in the SQL etc.
extract/param/@type String Integer The type of this parameter - either String or Integer.
extract/param/@docPath String None The xpath in the AspireObject to the element that holds the value for this parameter.


Example Configurations

Simple Configuration

 <component name="EM3getPGTitles" subType="rdbRowExtractor" factoryName="aspire-rdbfeeder">
   <rdbLocation>/common/EM3rdb</rdbLocation>
   <output row="pgTitle">pgTitles</output>
   <extract>
     <param index="1" type="Integer" docPath="/doc/PG_ID"/>
     <sql><![CDATA[
         select pgt.title as TITLE,
                tt.name as TITLE_TYPE
         from   em3.product_group_title pgt,
                em3.title_type tt
         where  pgt.product_group_id = ?
            and pgt.title_type_id = tt.id
       ]]></sql>
   </extract>
 </component>