Connector Scanner Stage Test Harness
- 1 Introduction
- 2 Building the Test Harness
- 3 Running the test harness
- 4 Test Harness Menu Options
- 5 Running the tester without the batch file
Introduction
In version 2.2, we have updated the scanner framework to allow testing of scanner stages without the need for Aspire. The test harness is still under development and is subject to change.
The test harness is a tool that allows developers to scan repositories or download users and groups without needing to start Aspire. It is run from the command line and uses a text-based menu to call scanner functionality. The output can then be sent to a file or to the screen.
The harness supports the "Hierarchical" scanners, the "Linear" scanners and the "Push" scanners, although the actual functionality supported depends on the scanner type.
Building the Test Harness
You can build the Test Harness from the source code of any connector (assuming you've got Maven installed). The pom files include a profile to build/download everything that is required to run the tester.
If you've checked out the code, go to the directory in which the pom.xml exists. Run the command:
mvn clean install -Ptester
This command will run the scanner build, including unit tests (if you want to disable the unit tests, add -DskipTests to the end of the command line). If the build is successful, you should see:
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
In the project target directory, you will then have a scannerTester directory and the connector jar file (aspire-documentum-connector-2.2-SNAPSHOT.jar for instance). The scannerTester directory contains a bin directory with a batch file to run the tester and a lib directory that contains all the required dependencies for the connector.
Running the test harness
In most cases, the test harness can be run using a batch file extracted when the test harness is built. Instructions are below:
- Change to the target/scannerTester directory
- Invoke the tester using the batch file, passing the jar file of the scanner to be tested as a parameter
bin\test.bat -jar ..\aspire-documentum-connector-2.2-SNAPSHOT.jar
Command line options
The scanner test harness supports the following options:
|-jar <jar-file>||Specify the jar file which contains the Aspire scanner component. Not needed if -class is specified|
|-class <full-class-name>||Specify the full class name (e.g. com.searchtechnologies.aspire.components.FileSystemScanner) of the scanner class. Not needed if -jar is specified|
|-home <Aspire-home>||Specify the Aspire Home directory|
|-subType <subType>||Specify the sub-type for the scanner class. If not specified, 'default' is assumed|
NOTE: one of -jar or -class must be specified
See below for information on running the test harness without using the batch file.
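The rules in the options table (one of -jar or -class is required, and -subType falls back to 'default') can be illustrated with a small sketch. The TesterOptions class and its parse method are hypothetical names used for illustration only; this is not the harness's own argument parser, and it assumes every option takes exactly one value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: illustrates the option rules, not the harness's real parser.
class TesterOptions {

    /** Parses "-name value" pairs and applies the rules from the options table. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new HashMap<>();
        for (int i = 0; i < args.length; i++) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("Expected '-option value' at: " + args[i]);
            }
            // Store the option name without its leading dash, then skip its value.
            options.put(args[i].substring(1), args[++i]);
        }
        if (!options.containsKey("jar") && !options.containsKey("class")) {
            throw new IllegalArgumentException("One of -jar or -class must be specified");
        }
        // The sub-type defaults to 'default' when it is not given.
        options.putIfAbsent("subType", "default");
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(new String[] {
            "-jar", "..\\aspire-documentum-connector-2.2-SNAPSHOT.jar"});
        System.out.println(options.get("jar"));
        System.out.println(options.get("subType"));
    }
}
```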
Test Harness Menu Options
When run, the harness displays a menu on the console. Some of the options are common to all scanner types; others are specific to individual scanner types.
The following options are common to all scanner types:
|Initialize or view scanner configuration||Shows the current scanner configuration and allows you to load a new one from an xml file. The scanner configuration is the set of properties passed to the scanner when the component is loaded into Aspire. The format varies from scanner to scanner, but you can use the parameters passed to the component from the application.xml file in the connector app-bundle as an example|
|Load or view content source job||Shows the current scanner content source job and allows you to load a new one from an xml file. The content source job is the set of properties passed to the scanner when a crawl is processed. The format varies from scanner to scanner, but you can use the content-source.xml file from the content-sources directory of an Aspire distribution as an example. The root node of the file should be <doc>, but if the test harness sees <connectorSource> it will wrap a <doc> around it|
|Download users and groups to the cache||Initiates the process to download the users and groups from the repository to the user/group cache (if the scanner supports this)|
|Dump the user-group cache||Dumps the contents of the user/group cache to a file or the screen (if the scanner supports this)|
|Lookup a single user in the user-group cache||Looks up a single user (entered by the user when prompted) in the user/group cache and outputs the results to the screen|
|Download special ACLs to special ACL cache||Not yet implemented|
|Dump the special ACLs cache||Not yet implemented|
|Dump ACL intersections (typically requires a full scan first)||Dumps the contents of the intersection ACL database to a file or the screen (if the scanner supports this). The database is populated by a full scan|
If you wish to quit, press Q followed by return.
Scanner configuration file
You will need to provide a scanner configuration file containing the xml to be passed to the scanner when it is created. The format is (unfortunately) specific to the type of scanner (so the file system scanner is different to the Documentum scanner). However, you will notice some common pieces.
Knowing what to have in the configuration can seem difficult, but if you have the scanner running in Aspire, you can get the configuration of an installed scanner via the debug interface.
Navigate to the page for a scanner using a url of the form http://localhost:50505/aspire/SOURCE-NAME/Main/Scanner/ and view the page source. You should be able to view the configuration in the <config> tag for the component.
If you can't run Aspire, you can look in the application.xml file of the scanner app-bundle to work out the configuration.
Example scanner configuration file
<?xml version="1.0" encoding="UTF-8"?>
<config>
  <debug>false</debug>
  <fullRecovery>incremental</fullRecovery>
  <incrementalRecovery>incremental</incrementalRecovery>
  <metadataMap>
    <map from="action" to="action"/>
    <map from="doc-type" to="docType"/>
    <map from="last-modified-date" to="lastModified"/>
    <map from="content-length-bytes" to="dataSize"/>
    <map from="owner" to="owner"/>
  </metadataMap>
  <snapshotDir>C:\Users\aspire\demo\target\demo-1.0-SNAPSHOT-distribution\data/FileScanner/snapshots</snapshotDir>
  <fileNamePatterns>
    <include pattern=".*"/>
    <exclude pattern=".*tmp$"/>
  </fileNamePatterns>
  <emitCrawlStartJobs>true</emitCrawlStartJobs>
  <emitCrawlEndJobs>false</emitCrawlEndJobs>
  <enableAuditing>true</enableAuditing>
</config>
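Before loading a configuration file into the harness, it can help to check that it is well-formed and to see which simple properties it sets. The following is a minimal sketch using the standard Java DOM parser; ConfigInspector and simpleProperties are hypothetical names, and the harness itself may validate the file differently.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: lists the text-only top-level properties of a <config> file.
class ConfigInspector {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Configuration is not well-formed XML", e);
        }
    }

    static Map<String, String> simpleProperties(String xml) {
        Element root = parse(xml).getDocumentElement();
        if (!"config".equals(root.getTagName())) {
            throw new IllegalArgumentException(
                "Expected <config> root, found <" + root.getTagName() + ">");
        }
        Map<String, String> properties = new LinkedHashMap<>();
        NodeList children = root.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            // Keep only leaf elements; nested blocks such as <metadataMap> are skipped.
            if (child instanceof Element
                    && ((Element) child).getElementsByTagName("*").getLength() == 0) {
                properties.put(child.getNodeName(), child.getTextContent().trim());
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        String xml = "<config><debug>false</debug>"
            + "<fileNamePatterns><include pattern=\".*\"/></fileNamePatterns>"
            + "<enableAuditing>true</enableAuditing></config>";
        System.out.println(simpleProperties(xml)); // prints {debug=false, enableAuditing=true}
    }
}
```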
Content source job
The content source job is also specific to the connector. The easiest way to get an example job is by using a running Aspire instance. Ensure debugging is turned on for the connector and run a crawl. Then visit the url http://localhost:50505/aspire/SOURCE-NAME/Main/IncomingJobLogger?cmd=viewJobs. The jobs here should include the xml you need.
Example content source job
<?xml version="1.0" encoding="UTF-8"?>
<doc>
  <connectorSource>
    <url>c:\testdata\11</url>
    <partialScan>false</partialScan>
    <subDirUrl/>
    <indexContainers>false</indexContainers>
    <scanRecursively>true</scanRecursively>
    <useACLs>false</useACLs>
    <acls/>
    <scanExcludedItems>false</scanExcludedItems>
    <fileNamePatterns/>
  </connectorSource>
</doc>
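The menu description notes that a job file whose root is <connectorSource> is wrapped in a <doc> element by the harness. A sketch of an equivalent transformation with the standard Java DOM API is shown here for illustration; ContentSourceJobWrapper and ensureDocRoot are hypothetical names, and the harness's own implementation is not shown in this document and may differ.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper: mimics the <doc> wrapping described for content source jobs.
class ContentSourceJobWrapper {

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Job is not well-formed XML", e);
        }
    }

    /** Wraps a <connectorSource> root in <doc>; any other root is left untouched. */
    static Document ensureDocRoot(Document job) {
        Element root = job.getDocumentElement();
        if ("connectorSource".equals(root.getTagName())) {
            Element doc = job.createElement("doc");
            job.removeChild(root);   // detach the old root
            job.appendChild(doc);    // <doc> becomes the new root
            doc.appendChild(root);   // re-attach <connectorSource> beneath it
        }
        return job;
    }

    public static void main(String[] args) {
        Document job = ensureDocRoot(parse(
            "<connectorSource><url>c:\\testdata\\11</url></connectorSource>"));
        System.out.println(job.getDocumentElement().getTagName()); // prints doc
    }
}
```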
Hierarchical Scanner Menu Options
The following options are available for Hierarchical scanners:
|Browse Hierarchy||Allows you to (manually) traverse the repository hierarchy, listing "folders" and "documents" and going into "folders" in order to see their contents. The test harness will start at the initial url given in the configuration and then list the contents. The user can then pick an item from the list and display the contents of that|
|Scan a specified URL||Initiates a scan of a url given in response to a prompt from the harness. The results are output to the screen or a file|
|Scan everything (automatically scans nested folders)||Initiates a recursive scan of the initial url given in the connector source job file. The results are output to the screen or a file|
Linear Scanner Menu Options
The following options are available for Linear scanners:
|Scan everything||Initiates a scan of the initial url given in the connector source job file. The results are output to the screen or a file|
Push Scanner Menu Options
The following options are available for Push scanners:
|Crawl everything||Initiates a crawl of the initial url given in the connector source job file. The results are output to the screen or a file|
Running the tester without the batch file
The harness can be invoked by entering a java command from the command line. The full java command must be specified, including the classpath, the full name of the test harness class and any options to pass to it.
The Java classpath must include:
- org.osgi.core-4.2.0.jar & org.osgi.compendium-4.2.0.jar
- the OSGI container jar files
- The Aspire core file containing services and framework
- The Aspire scanner framework
- The Aspire group expansion framework
- The connector under test. Sometimes this could be named aspire-<repository>-connector-<version>.jar
The classpath must also include any (other) jars on which the scanner is dependent. Typically these will be included in the aspire-<repository>-connector-<version>.jar file, but you must extract them and add them to the classpath manually (see below).
To run the test harness, you run Java to invoke a Java class. You must therefore also include the full name of the class to invoke: com.searchtechnologies.aspire.scanner.testtool.ScannerTester
Command line options
In addition to the standard Java command line options, you can specify any of the command line options described above.
NOTE: one of -jar or -class must be specified
Example command line
Below is an example command line for testing the file system scanner
java -cp .\aspire-scanner-2.2-SNAPSHOT.jar;
         .\aspire-core-2.2-SNAPSHOT.jar;
         .\aspire-simple-group-expander-2.2-SNAPSHOT.jar;
         .\aspire-filesystem-connector-2.2-SNAPSHOT.jar;
         .\org.osgi.core-4.2.0.jar;
         .\org.osgi.compendium-4.2.0.jar
     com.searchtechnologies.aspire.scanner.testtool.ScannerTester
     -jar .\aspire-filesystem-connector-2.2-SNAPSHOT.jar
NOTE: New lines have been added for readability in the above command. It is a single command and should be on a single line. The -cp parameter is a single value and should contain no spaces
If all the jar files to be added to the classpath are in one directory, you can use the Java system property java.ext.dirs instead of the classpath (Java 8 and earlier only; later versions no longer support this property):
java -Djava.ext.dirs=./lib
     com.searchtechnologies.aspire.scanner.testtool.ScannerTester
     -jar .\aspire-filesystem-connector-2.2-SNAPSHOT.jar
NOTE: Again, new lines have been added for readability in the above command. It is a single command and should be on a single line.
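As an alternative to typing a long -cp value by hand, the value can be generated from a directory of jar files. The sketch below joins every .jar file in a directory with the platform path separator (; on Windows, : elsewhere). ClasspathBuilder and classpathFor are hypothetical names, and the directory to scan is an assumption; point it at wherever your jars actually live.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: builds a -cp value from the jar files in one directory.
class ClasspathBuilder {

    static String classpathFor(Path libDir) {
        try (Stream<Path> entries = Files.list(libDir)) {
            return entries
                .filter(p -> p.getFileName().toString().endsWith(".jar"))
                .sorted()
                .map(Path::toString)
                // A single value with no spaces around the separator.
                .collect(Collectors.joining(File.pathSeparator));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the current directory when no directory is given.
        Path libDir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println(classpathFor(libDir));
    }
}
```

The printed value can be pasted after -cp. If a jar path contains spaces, the whole value needs quoting on the command line.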
Extracting embedded jar files
Assuming you have the JDK installed, you can use the jar utility to extract embedded jar files from the scanner jar file under test.
To extract the jar file aspire-scanner-2.2-SNAPSHOT.jar from the aspire-filesystem-connector-2.2-SNAPSHOT.jar use the command:
jar xf aspire-filesystem-connector-2.2-SNAPSHOT.jar aspire-scanner-2.2-SNAPSHOT.jar
To inspect the contents of the file (so you know what to extract) use the command:
jar tf aspire-filesystem-connector-2.2-SNAPSHOT.jar
Then look for the .jar files at the root level and extract each file in turn using the commands above.
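The inspection step can also be scripted. The following sketch uses the standard java.util.jar API to list only the .jar entries at the root level of a connector jar, which is the set you would then extract with jar xf. EmbeddedJarLister and rootLevelJars are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper: the scripted equivalent of "jar tf" filtered to root-level jars.
class EmbeddedJarLister {

    static List<String> rootLevelJars(Path jarPath) {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.stream()
                .map(JarEntry::getName)
                // Root-level entries have no directory separator in their name.
                .filter(name -> name.endsWith(".jar") && !name.contains("/"))
                .sorted()
                .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("usage: EmbeddedJarLister <connector-jar>");
            return;
        }
        rootLevelJars(Paths.get(args[0])).forEach(System.out::println);
    }
}
```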