Aspire Quick Start

For Information on Aspire 3.1 Click Here

The following tutorial gets you started with Aspire in 20 minutes or less. In it, you will install Aspire, an Aspire connector application, and an Aspire publisher application (that simply writes to a file instead of to a search engine for indexing). It will give you an idea of how you install Aspire applications using the System Admin user interface (no programming knowledge needed).

There are also two other quick start tutorials:

There are also tutorials for each type of connector application you might wish to install; these are located in the section for each connector.


Prerequisites

Before you begin, you need to be registered to use Aspire. If you haven't already registered, go to http://aspire.searchtechnologies.com/.

You will need your user registration name and password in order to complete this tutorial.

Step 1: Install Java

The version of Java you should use depends on the Aspire version you are targeting:

  • Aspire 2.1.2 and earlier runs on Java 1.6 or Java 1.7.
  • Aspire 2.2 and later requires Java 1.7.

Note that we recommend installing the Java JDK (Java Development Kit), just in case you want to create your own Aspire Components in the future. But really, only the JRE (Java Runtime Environment) is absolutely required.

  1. Download and install the latest version of the Java JDK appropriate for the system that will run Aspire: http://java.com/en/download/manual.jsp
    • If you have a 64 bit machine, we recommend installing the 64 bit version of Java. That will allow you to create large-memory instances of Aspire.
      • The Aspire framework itself does not use much memory (about 100 MB). But some applications may store big hash tables to improve performance, so it's best to have the 64 bit JVM (Java Virtual Machine), just in case you need it someday.
  2. Test that you can access the "java" command from your console.
    1. Open up a new DOS command-shell (go to the Start menu, enter "cmd" in the "Run" or "Search for Programs" field, and then execute the cmd.exe program).
    2. At the prompt, enter the following, then press the Enter key: java -version
    3. Success is indicated when version information is returned.

Up to Aspire 2.1.2:

 > java -version
 java version "1.6.0_18"
 Java(TM) SE Runtime Environment (build 1.6.0_18-b07)
 Java HotSpot(TM) 64-Bit Server VM (build 16.0-b13, mixed mode)

Or, as of the Aspire 2.2 release:

 > java -version
 java version "1.7.0_79"
 Java(TM) SE Runtime Environment (build 1.7.0_79-b15)
 Java HotSpot(TM) 64-Bit Server VM (build 24.79-b02, mixed mode)
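
The same check can be scripted. The following is a minimal Python sketch, not part of Aspire: the function names are hypothetical, and it reads "requires Java 1.7" strictly, as exactly Java 1.7 for Aspire 2.2 and later. Note that "java -version" writes its report to standard error rather than standard output.

```python
import re
import subprocess


def read_java_version_output():
    """Run `java -version`; the JVM prints the version report on stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr


def parse_java_version(version_output):
    """Return (major, minor), e.g. (1, 7), from `java -version` output."""
    match = re.search(r'version "(\d+)\.(\d+)', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    return int(match.group(1)), int(match.group(2))


def java_ok_for_aspire(java_version, aspire_version):
    """Apply the rule from this step (assumption: 'requires 1.7' means exactly 1.7).

    aspire_version is a tuple such as (2, 1, 2) or (2, 2).
    """
    if aspire_version >= (2, 2):
        return java_version == (1, 7)
    return java_version in ((1, 6), (1, 7))


# Demonstration on the sample output shown above (no JVM needed).
sample = 'java version "1.7.0_79"\nJava(TM) SE Runtime Environment (build 1.7.0_79-b15)'
print(java_ok_for_aspire(parse_java_version(sample), (2, 2)))  # prints True
```

To check the machine you are on, pass read_java_version_output() to parse_java_version() instead of the sample text.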

Step 2: Download the Quick-Start Distribution

Download and unpack https://wiki.searchtechnologies.com/binaries/. For purposes of this tutorial, we'll use "aspire-quick-start" as the directory name to which you unpack Aspire.

Note: This is not the best way to create a new Aspire Distribution. The official method is to use the Distribution Archetype, which requires also downloading a Maven client. There's a separate tutorial for getting started using this method: Aspire Quick Start with Distribution Archetype.

The download will create a directory structure similar to that described in Aspire Directory Structure.


Step 3: Edit the Aspire settings.xml File

Go to the directory where you unpacked Aspire (such as "aspire-quick-start") and change to the "config" directory (for example, with "cd config"). Open the settings.xml file with a text or XML editor. Look for the maven repository tag. You need to replace the placeholder user name and password shown there with the user name and password you used to register for Aspire.

<repository type="maven">
  <defaultVersion>1.0-SNAPSHOT</defaultVersion>
  <remoteRepositories>
    <remoteRepository>
      <id>stPublic</id>
      <url>http://repository.searchtechnologies.com/artifactory/simple/community-public/</url>
      <user>YOUR-REGISTERED-USERNAME</user>
      <password>YOUR-REGISTERED-PASSWORD</password>
    </remoteRepository>
  </remoteRepositories>
</repository>

Once you've entered your user name and password, save and close the file.
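
If you prefer to script this edit, the following Python sketch shows one possible approach using the standard library. It is an illustration only: the function name is hypothetical, and because the root element of settings.xml is not shown in this tutorial, the sample wraps the repository tag in an assumed <settings> element. Be aware that ElementTree discards XML comments when it rewrites a document, so keep a backup of the original file.

```python
import xml.etree.ElementTree as ET


def set_repository_credentials(settings_xml, user, password):
    """Return settings_xml with the stPublic repository credentials replaced."""
    root = ET.fromstring(settings_xml)
    updated = 0
    for remote in root.iter("remoteRepository"):
        # Only touch the repository whose id matches the one in this tutorial.
        if (remote.findtext("id") or "").strip() != "stPublic":
            continue
        user_el = remote.find("user")
        password_el = remote.find("password")
        if user_el is not None and password_el is not None:
            user_el.text = user
            password_el.text = password
            updated += 1
    if updated == 0:
        raise ValueError("no stPublic remoteRepository with user/password found")
    return ET.tostring(root, encoding="unicode")


# Assumed document shape: the <settings> root is a placeholder, not confirmed.
SAMPLE = """<settings>
  <repository type="maven">
    <remoteRepositories>
      <remoteRepository>
        <id>stPublic</id>
        <user>YOUR-REGISTERED-USERNAME</user>
        <password>YOUR-REGISTERED-PASSWORD</password>
      </remoteRepository>
    </remoteRepositories>
  </repository>
</settings>"""

print(set_repository_credentials(SAMPLE, "alice", "s3cret"))
```

To apply it to a real file, read config/settings.xml into a string, pass it through the function, and write the result back.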

Step 4: Start Up Aspire

First, make sure you have access to the internet so that Aspire can download components. Next, still in the Aspire directory you created, change to the bin directory and type "startup" to launch Aspire.

Note that "startup" is a batch script (on Windows) or a shell script (on Unix) that can be modified as necessary if you need more memory or need to set other system properties.

Aspire may take a few seconds to load all of the necessary components.

NOTE: If you are downloading Aspire Community, ignore the error message about being unable to download the com.searchtechnologies:aspire-dcm-enterprise component. The aspire-dcm-enterprise component is available only with Enterprise systems (and is used for Distributed Processing).

Step 5: Go to the Aspire Home Page

[Figure: Aspire Main Admin Page]

Leaving the terminal window open, open a web browser. Access the Aspire System Administration page at the following URL: http://localhost:50505/

You should see a page similar to the one at right.
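
Because Aspire may take a few seconds to load, it can be handy to wait for the admin page from a script rather than reloading the browser. The sketch below is a hypothetical helper, not an Aspire tool: it simply polls the URL from this step until it answers with HTTP 200. The retry count and delay are arbitrary assumptions.

```python
import time
import urllib.request


def wait_for_admin_page(url="http://localhost:50505/", attempts=30, delay=2.0):
    """Poll url until it answers with HTTP 200. Return True on success."""
    # Bypass any configured HTTP proxy, since the target is the local machine.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    for attempt in range(attempts):
        try:
            with opener.open(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers connection refused, timeouts, and urllib's URLError/HTTPError:
            # the server is not ready yet.
            pass
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling wait_for_admin_page() with no arguments waits up to about a minute for the default address used in this tutorial.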


Step 6: Install the CS Manager

All packaged Aspire connector applications, such as the file system connector that you will be installing, depend on a special "Content Source Manager" application. So it's usually a good idea to first install the CS Manager application before installing any specific connector. There needs to be only one CS Manager (or two, for failover) in an Aspire system.

[Figure: Installing the CS Manager]

To install the CS Manager:

  1. From the Aspire Admin page, click on the link to the server on which you just loaded Aspire (this is the IP address link).
  2. You will see an "Edit Server" dialog box like the one on the right. Click on "Add Application".
  3. In the "Application" pulldown, select "CS Manager".
  4. Leave "Use External RDB" unchecked. Optionally, turn on "Debug" so you can explore the data that comes out of the CS Manager.
  5. Click on "Add".


[Figure: CS Manager Running]

At the top of the "Edit Server" dialog box, you will see "CSManager" under Installed Applications. Its status will be LOADING for a minute or two while components are downloaded from the Maven repository. Click the "refresh" link until the status changes to RUNNING (see the example at right).

Close the dialog box to return to the Servers List main page.


Step 7: Install the File System Connector

[Figure: Add File System Connector to a Server]

Next, install the File System connector application in the same way.

  1. From the Aspire Admin page (http://localhost:50505), click on the server link again to get to the "Edit Server" dialog box.
  2. Click on "Add Application".
  3. In the "Application" pulldown, select "File System Connector".
  4. Leave the defaults for Name, Autostart, and the file system snapshot directory. Turn on Debug if you would like to see advanced logs from the application.
  5. Click on "Add" (at the bottom).

Make sure the newly added application is running, then return to the Servers list page.


Step 8: Install the File System Publisher

[Figure: Add File System Publisher to a Server]

The final application you will install is the File System Publisher.

  1. From the Aspire Admin page (http://localhost:50505), click on the server link again to get to the "Edit Server" dialog box.
  2. Click on "Add Application".
  3. In the "Application" pulldown, select "Publish to File".
  4. Leave the default values for number of logged jobs and the file name.
  5. Click on "Add" (at the bottom).

Make sure the newly added application is running, then return to the Servers list page.



Step 9: Add the File System Content Source

So you've installed the CSManager, the File System connector, and the File System publisher. But you still need to identify what content you wish to crawl.

[Figures: Add a Content Source; Select File System Connector; Add Routing]

  1. From the Aspire Admin page (http://localhost:50505), click on the Content Sources link on the left navigation menu.
  2. Click on New Source.
  3. On the Basic Information tab, enter a representative source name and select the Active checkbox. For now, leave the schedule set to Manually.
  4. Select the Connector tab.
  5. In the Application Name drop-down, select "/FileSystemConnector".
  6. Enter a valid URL to a path in your file system. For example, "file://D:/testdata".
    • You can leave Partial Scan unchecked, unless you need it.
    • Check the Index Folders and Scan SubFolders options.
    • Leave the Include and Exclude patterns empty.
  7. Select the Routing Table tab.
  8. Click on Add routing entry.
  9. In the Name drop-down, select "/PublishToFile".
  10. Click on Add to add the new routing entry.
  11. Click on Create at the bottom to create the new content source.
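
Before starting a crawl, it can be worth confirming that the folder behind the URL you entered really exists. The sketch below is a hypothetical pre-flight check, not part of Aspire. It assumes the URL has the same form as the example above ("file://" followed directly by the path) and does not attempt general file URL parsing.

```python
import os


def file_url_to_path(url):
    """Strip the leading "file://" from a URL written like the tutorial example."""
    prefix = "file://"
    if not url.startswith(prefix):
        raise ValueError("expected a URL starting with file://")
    return url[len(prefix):]


def crawl_root_exists(url):
    """Return True if the URL points at an existing local directory."""
    return os.path.isdir(file_url_to_path(url))


print(file_url_to_path("file://D:/testdata"))  # prints D:/testdata
```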


Step 10: Test Out the Connector

Now you have installed all the required applications and created a content source. Let's test it.

[Figures: Starting a Crawl; Check the Statistics]

  1. On the Content Sources view, find your recently created content source. In this example, it is named "My Folder".
  2. Click on Full to start a full crawl of your content source.
  3. Click on Statistics to see detailed information about the scan.
  4. Click on the refresh button to update the statistics.
  5. When the statistics show Status: Success, the crawl is done. In this example, My Folder contained 4 files. Because we selected the Index Folders option, My Folder itself was also added as a document, giving a final count of 5 updates submitted.
  6. Go to your Aspire home folder and check the contents published by the connector. Open the file "{ASPIRE_HOME}\log\PublishToFile\publishToFile.jobs" and check that the content fetched is correct.
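
To sanity-check the statistics, you can estimate the expected number of updates from the folder contents. The sketch below is an assumption-laden illustration, not Aspire's counting logic: it supposes that every file counts as one update and that, with Index Folders selected, every folder (including the top-level one and any subfolders) also counts as one. That matches the 4 files + 1 folder = 5 example above, but the tutorial does not confirm the behavior for nested folders.

```python
import os
import tempfile


def expected_update_count(root, index_folders=True):
    """Estimate crawl updates: one per file, plus one per folder if indexed."""
    file_count = 0
    folder_count = 1  # the top-level folder itself
    for _dirpath, dirnames, filenames in os.walk(root):
        folder_count += len(dirnames)
        file_count += len(filenames)
    return file_count + (folder_count if index_folders else 0)


# Recreate the example from this step: one folder holding 4 files.
with tempfile.TemporaryDirectory() as folder:
    for i in range(4):
        with open(os.path.join(folder, "file%d.txt" % i), "w") as handle:
            handle.write("sample")
    print(expected_update_count(folder))  # prints 5
```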


Step 11: Congratulate yourself! (and shut down)

Congratulations!!

You have completed the 20-minute quick start. We hope it was a fun experience.

We recommend that you continue with the Aspire Quick Start with Distribution Archetype next. That will show you how to build Aspire distributions from scratch using Maven archetypes and Maven component repositories.

To shut down Aspire, go to the home page (http://localhost:50505/aspire) and click on the "Shutdown" button (the red button to the right of the server name). Alternatively, go to the Aspire console window (where you started Aspire with "bin\startup"), type "shutdown", and press the Return or Enter key. Either way will shut down Aspire.

Cheers!

The Aspire Development Team