Publish to ElasticSearch Tutorial (Aspire 2)

From wiki.searchtechnologies.com

For information on Aspire 3.1, click here.

Step 1: Launch Aspire and open the Content Source Management Page

Aspire Content Source Management Page

Launch Aspire (if it's not already running). See:

Browse to: http://localhost:50505. For details on using the Aspire Content Source Management page, please refer to UI Introduction.


Step 2: Create a new Content Source

For this step, follow the Configuration Tutorial of the connector of your choice; see the Connector list.

Step 3: Add a new Publish to ElasticSearch rule to the Workflow

To add an ElasticSearch publisher, drag the Publish to ElasticSearch rule from the Workflow Library and drop it into the Workflow Tree at the position where you want to add it. This automatically opens the Publish to ElasticSearch window, where you configure the publisher.

Step 3a: Specify Publisher Information

Publish to ElasticSearch Configuration Aspire 2.0.x
Publish to ElasticSearch Configuration Aspire 2.1 +

In the Publish to ElasticSearch window, specify the connection information used to publish to ElasticSearch.

  1. Enter the name of the publisher. (This name must be unique).
  2. Enter the description of the publisher that will be shown in the Workflow Tree.
  3. Enter the index to which the jobs are going to be published.
  4. Specify the ElasticSearch URL:
    • Host and port
      • Enter the ElasticSearch host.
      • Enter the ElasticSearch port (9200 by default)
    • Complete Url
      • Enter the URL of the ElasticSearch bulk index endpoint. It must have this format: <protocol>://<host>:<port>/_bulk
  5. (2.1 Release) Max Results per request: How many documents can be fetched by the search engine for the same query.
  6. (2.1 Release) Page size: How many documents to fetch per page.
  7. (2.1 Release) Url field: Field used to store the URL in ElasticSearch.
  8. (2.1 Release) Id field: Field used to store the id in ElasticSearch. It is used to compare against the content source audit logs.
  9. (2.1 Release) Timestamp field: The name of the timestamp field holding the index timestamp of every document.
  10. Groovy File Path: Keep the default value to use the default JSON transformation file. To use a custom file, follow the instructions in JSON Transformation.
  11. Debug: Check if you want to run the publisher in debug mode.
  12. Click on the Add button.
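
The two ways of giving the ElasticSearch URL in item 4 describe the same endpoint. As a rough illustration only (this is not Aspire code, and the function names build_bulk_url and is_bulk_url are hypothetical), the following Python sketch composes the bulk URL from a host and port and checks whether a complete URL matches the <protocol>://<host>:<port>/_bulk format:

```python
from urllib.parse import urlsplit

DEFAULT_PORT = 9200  # default ElasticSearch port named in the list above


def build_bulk_url(host, port=DEFAULT_PORT, protocol="http"):
    """Compose the bulk endpoint URL from the 'Host and port' values.

    Hypothetical helper for illustration; 'http' as the default protocol
    is an assumption, not something the publisher documentation states.
    """
    return f"{protocol}://{host}:{port}/_bulk"


def is_bulk_url(url):
    """Check a 'Complete Url' value against <protocol>://<host>:<port>/_bulk."""
    parts = urlsplit(url)
    try:
        port = parts.port  # raises ValueError if the port is not a valid number
    except ValueError:
        return False
    return bool(
        parts.scheme
        and parts.hostname
        and port is not None
        and parts.path == "/_bulk"
    )


if __name__ == "__main__":
    url = build_bulk_url("localhost")
    print(url, is_bulk_url(url))
```

A check like this can be useful before pasting a complete URL into the form, since a missing port or a trailing path other than /_bulk does not match the documented format.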

Once you've clicked on the Add button, it will take a moment for Aspire to download all of the necessary components (the Jar files) from the Maven repository and load them into Aspire. Once that's done, the publisher will appear in the Workflow Tree.
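
For orientation, an ElasticSearch _bulk endpoint accepts newline-delimited JSON: one action line followed by one document line, each ending in a newline. The Python sketch below builds such a body under stated assumptions. The function name bulk_index_body, the sample field names "id" and "url", and the index name "aspire" are all hypothetical. The documents the publisher actually sends are determined by the JSON transformation (Groovy) file, so this shows the general request shape rather than Aspire's output.

```python
import json


def bulk_index_body(index, docs, id_field="id", doc_type=None):
    """Build a newline-delimited body for an ElasticSearch _bulk request.

    index    -- target index name (item 3 in the list above)
    docs     -- iterable of dicts, one per document
    id_field -- key holding the document id (assumed name; see 'Id field')
    doc_type -- optional mapping type; older ElasticSearch releases expect
                one, newer releases do not use it
    """
    lines = []
    for doc in docs:
        action = {"_index": index, "_id": doc[id_field]}
        if doc_type is not None:
            action["_type"] = doc_type
        lines.append(json.dumps({"index": action}))  # action line
        lines.append(json.dumps(doc))                # document source line
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    sample = [{"id": "1", "url": "http://example.com/a"}]
    print(bulk_index_body("aspire", sample), end="")
```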

Click this link if you are following the tutorial for Aspire for Elasticsearch.

For details on using the Workflow section, please refer to Workflow introduction.