Mahout Store Vector




Store Mahout Vector
Description: Takes the mahoutDocVector attached to each document and writes the vectors to a serialized output file.
Inputs: AspireDocument that has an attached mahoutDocVector variable
Outputs: A serialized data file to store the vectors
Factory: aspire-mahout
Sub Type: store
Object Type: Implements the storage handler and requires a path for the output file

Configuration

Element | Type | Default | Description
xmlIdTag | string | <none> | XML ID tag used to store the key of the document vector. CompareMahoutVectors later reads this key to identify the document when performing document comparisons. Example: APPLICANT_ID.

Sample Configuration

  <component name="storeMahoutDocVector" subType="store" factoryName="aspire-mahout">
    <config>
      <xmlIdTag>APPLICANT_ID</xmlIdTag>
    </config>
  </component>

Sample Storage Handler Configuration

  <open componentRef="../processDOC/storeMahoutDocVector" variableName="vectorModelHandle" path="trainingModel/MODEL.dat" />

Usage

Use this stage after Mahout Create Vector, which produces the mahoutDocVector that this stage stores. Use it before Mahout Compare Vectors, which reads the serialized vector file written by this stage to perform its vector comparisons.
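
The ordering described above could be expressed in a pipeline definition along the following lines. This is only a sketch: the pipeline name processDOC is inferred from the componentRef in the sample storage handler configuration, the component name createMahoutDocVector is a hypothetical name for a Mahout Create Vector stage, and the <pipeline>/<stages>/<stage> layout is assumed to follow the usual Aspire pipeline syntax. Only storeMahoutDocVector comes from this page.

```xml
<!-- Sketch only: "createMahoutDocVector" is a hypothetical component name,
     and "processDOC" is assumed from the storage handler's componentRef. -->
<pipeline name="processDOC" default="true">
  <stages>
    <!-- Assumed Mahout Create Vector stage: attaches mahoutDocVector to the document -->
    <stage component="createMahoutDocVector" />
    <!-- This component: writes the attached vector to the serialized output file -->
    <stage component="storeMahoutDocVector" />
  </stages>
</pipeline>
```

Mahout Compare Vectors would then run in a later pass, once the serialized file has been written.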