Failover Aspire Zookeeper (Aspire 2)
- 1 Failover basics
- 2 How it works?
- 3 Create your own Aspire 2.0 Failover installation
- 4 How do scheduled crawls work in a failover environment
- 5 Sharing Incremental Data
- 6 Security Concerns
Failover basics
Aspire 2.0 introduces failover for content source crawls across multiple Aspire servers. To do this, Aspire uses Apache ZooKeeper to synchronize configurations (content sources and workflow applications) and to coordinate and resume failed crawls among several Aspire servers.
The failover feature in Aspire is intended to maximize the uptime of content source crawls. When an Aspire server crashes for any reason while running a crawl for a content source, all other Aspire servers connected to the same ZooKeeper notice the failure and one of them resumes the crawl (from the latest snapshot file if a shared drive was configured).
How it works?
- Step 1. A crawl is running on one Aspire server.
- Step 2. The server running the content source fails, so the crawl is resumed on another server.
Aspire synchronizes the content source configurations and the workflow libraries, so if you create a content source or a library on one server, the other servers install that same content source or library.
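The two steps above can be pictured with a small toy model. This is only an illustrative sketch, not Aspire's actual implementation: the `SharedSnapshot` class, the `crawl` and `run_with_failover` functions, and the server names are all hypothetical. The snapshot object stands in for a snapshot file on a shared drive, and it shows why the second server can continue where the first one stopped.

```python
from dataclasses import dataclass, field


@dataclass
class SharedSnapshot:
    """Hypothetical stand-in for a snapshot file on a shared drive."""
    processed: list = field(default_factory=list)


def crawl(server, items, snapshot, crash_after=None):
    """Process every item not yet recorded in the snapshot.

    Returns True if the crawl finished, False if the server "crashed"
    after handling crash_after new items.
    """
    done = 0
    for item in items:
        if item in snapshot.processed:
            continue  # already handled before the failure
        if crash_after is not None and done == crash_after:
            print(f"{server}: crashed")
            return False
        snapshot.processed.append(item)
        done += 1
    print(f"{server}: crawl finished")
    return True


def run_with_failover(servers, items, snapshot, crash_after_on_first):
    """Let the next server resume whenever the current one fails."""
    for index, server in enumerate(servers):
        # Only the first server is made to crash in this toy model.
        crash_after = crash_after_on_first if index == 0 else None
        if crawl(server, items, snapshot, crash_after):
            return server
    return None
```

Calling `run_with_failover(["aspire-1", "aspire-2"], ["a", "b", "c", "d"], SharedSnapshot(), 2)` makes the first server stop after two items; the second server skips those two and finishes the rest, because both read the same snapshot.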
Create your own Aspire 2.0 Failover installation
Install a ZooKeeper Server
By default, Aspire 2.0 starts an embedded ZooKeeper server in stand-alone mode. You could use this same server, but failover will only work as long as that Aspire server is up and running.
We recommend using external ZooKeeper servers (a cluster), so that ZooKeeper's own failover also maximizes the uptime of the overall failover functionality.
For ZooKeeper installation, go to: http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html
Also check the ZooKeeper machine requirements at: http://zookeeper.apache.org/doc/r3.1.2/zookeeperAdmin.html#Single+Machine+Requirements
A shared drive or NFS server is necessary so that all Aspire servers use the same snapshots and statistics. This means that if a crawl has to be resumed by any server, it continues from the same snapshot file that the previous server was using.
We recommend hosting this shared drive or NFS server on a different physical server.
Install Aspire 2.0 in each Server
Follow the steps at Download Install (Aspire 2) on each server to install Aspire.
For each installation, edit the <configAdministration> section of the settings.xml file in the aspire/config directory and change this line:
<zookeeper enabled="false" libraryFolder="config/workflow-libraries" root="/aspire">
to:
<zookeeper enabled="true" libraryFolder="config/workflow-libraries" root="/aspire">
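If you have many servers to update, the change to the enabled attribute could be scripted. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The SAMPLE document is a hypothetical, trimmed-down settings.xml: only the <zookeeper> line is taken from this page, and the <settings> root element is an assumption. Check it against your real file before relying on it.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down settings.xml. Only the <zookeeper> attributes
# come from this page; the <settings> root element is an assumption.
SAMPLE = """<settings>
  <!-- other settings omitted -->
  <configAdministration>
    <zookeeper enabled="false" libraryFolder="config/workflow-libraries" root="/aspire">
    </zookeeper>
  </configAdministration>
</settings>"""


def enable_zookeeper(xml_text):
    """Return xml_text with enabled="true" on configAdministration/zookeeper."""
    # insert_comments=True (Python 3.8+) keeps XML comments in the output,
    # so commented-out settings are not lost when the file is rewritten.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    root = ET.fromstring(xml_text, parser=parser)
    for section in root.iter("configAdministration"):
        for zookeeper in section.findall("zookeeper"):
            zookeeper.set("enabled", "true")
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(enable_zookeeper(SAMPLE))
```

Keep a backup of the original settings.xml before writing the result back, since a rewritten file may differ from the original in whitespace and formatting.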
Failover configuration in each Aspire server
Once you have installed Aspire on each server, edit the config/settings.xml file in each distribution folder.
Uncomment the line with:
<!-- <externalServer>127.0.0.1:2182,127.0.0.1:2183,127.0.0.1:2181</externalServer> -->
Then enter the ZooKeeper server that you installed, as follows:
<externalServer> host:port </externalServer>
If you are using a cluster of ZooKeeper servers, separate the servers with commas:
<externalServer> host1:port1, host2:port2, host3:port3, ... </externalServer>
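Before pasting a server list into <externalServer>, it can help to check that every entry really has the host:port shape. The sketch below is a hypothetical helper, not part of Aspire; the function names are invented for illustration, and stripping the spaces around each entry is a choice made here rather than documented Aspire behavior.

```python
def parse_external_servers(value):
    """Split 'host1:port1, host2:port2' into [(host, port), ...].

    Raises ValueError if an entry is not of the form host:port.
    """
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty pieces such as a trailing comma
        host, separator, port = entry.rpartition(":")
        if not separator or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        port_number = int(port)
        if not 1 <= port_number <= 65535:
            raise ValueError(f"port out of range in {entry!r}")
        servers.append((host, port_number))
    if not servers:
        raise ValueError("no servers listed")
    return servers


def to_external_server_value(servers):
    """Join (host, port) pairs back into a comma-separated string."""
    return ",".join(f"{host}:{port}" for host, port in servers)
```

For example, `parse_external_servers(" host1:2181, host2:2182 ")` yields two (host, port) pairs, while an entry without a port raises ValueError.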
By default, if no external server is specified, Aspire starts an embedded ZooKeeper server on the port specified in the <clientPort> tag, and it is not connected to any other ZooKeeper. This is the default for non-failover installations.
Starting Aspire Servers in failover mode
When you start the Aspire servers, make sure the first server you start is the one with the correct configuration (content sources and workflow libraries), because any data previously stored in ZooKeeper will be replaced with this server's configuration.
Every Aspire server started after the first one replaces its own configuration (content sources and workflow libraries) with the one stored in ZooKeeper (set by the first Aspire server loaded).
To avoid losing configuration unintentionally, make sure the first server you start has the content sources and libraries you want all other servers to share.
To start each Aspire server, execute bin\startup.bat on Windows servers or bin/startup.sh on Linux servers.
Verify your installation
- Once you have started all Aspire servers, open a browser and go to the Home UI of any Aspire server. By default, the UI is available at: http://aspire-server-1:50505
- From the Home UI of Aspire, create any content source, configure it, and save it.
- Wait until the content source is successfully loaded.
- Open the Home UI of the rest of the servers and make sure they all have the same content source configured.
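As a first step before the manual check above, you may want to confirm that the Home UI of every server answers at all. The sketch below only tests reachability on the default port 50505 mentioned above; it does not compare content sources, which still has to be done in the UI. The function names and the injectable `fetch` parameter are hypothetical, chosen so the logic can be tried without a running server.

```python
import urllib.request


def home_ui_url(host, port=50505):
    """Build the Home UI address for a server (50505 is the default port)."""
    return f"http://{host}:{port}"


def _default_fetch(url, timeout):
    """Return the HTTP status code for url."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def check_servers(hosts, fetch=_default_fetch, timeout=5):
    """Return {host: True/False} depending on whether the Home UI answered."""
    results = {}
    for host in hosts:
        try:
            results[host] = fetch(home_ui_url(host), timeout) == 200
        except OSError:
            # urllib's URLError and HTTPError are subclasses of OSError.
            results[host] = False
    return results
```

For example, `check_servers(["aspire-server-1", "aspire-server-2"])` returns a dictionary showing which of the two servers responded.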
How do scheduled crawls work in a failover environment
If you configure a content source to crawl on a schedule, the same configuration is applied to all servers. When the scheduled time arrives, only one server performs the crawl and the others wait.
The crawl is not assigned in any particular order; it is run by the first Aspire server that asks to run it. So, to make sure the incremental information is always up to date on all servers, you need to configure a shared drive or NFS for storing the snapshot files. The next section covers how to do this.
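The "first server that asks runs the crawl" rule can be illustrated with a toy model. This is a sketch under stated assumptions, not how Aspire or ZooKeeper implement it: the `CrawlClaims` class is a hypothetical in-memory stand-in for the shared coordination service, and the threads play the role of Aspire servers reacting to the same schedule.

```python
import threading


class CrawlClaims:
    """Hypothetical in-memory stand-in for a shared coordination service."""

    def __init__(self):
        self._lock = threading.Lock()
        self._owners = {}

    def try_claim(self, crawl_id, server):
        """Return True only for the first server that claims crawl_id."""
        with self._lock:
            if crawl_id in self._owners:
                return False  # another server asked first; this one waits
            self._owners[crawl_id] = server
            return True

    def owner(self, crawl_id):
        """Return the server that claimed crawl_id, or None."""
        with self._lock:
            return self._owners.get(crawl_id)


def fire_schedule(claims, crawl_id, servers):
    """Let every server react to the same scheduled time concurrently."""
    outcome = {}

    def attempt(server):
        outcome[server] = claims.try_claim(crawl_id, server)

    threads = [threading.Thread(target=attempt, args=(s,)) for s in servers]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return outcome
```

Whichever thread reaches `try_claim` first gets True and would run the crawl; all the others get False. Which server wins is not fixed, which is why the snapshot files need to live somewhere every server can read.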
Sharing Incremental Data
Most Aspire connectors use snapshot files to keep incremental information about each crawl performed.
To preserve the incremental information, you need a shared drive or NFS configured on each server, and you need to configure each content source to save its snapshot files in a specific shared directory.
To use shared snapshot files across multiple Aspire servers, do the following:
For each server:
- Go to the Home UI
- Click on the content source for which you want to enable shared snapshot files
- Go to Connector Properties
- Check the checkbox for Advanced Properties or General Configuration
- Change the snapshot directory to the shared directory you want to use (it can be on NFS or a shared drive; make sure you have read and write permission on it)
Repeat the steps for every server and content source that should share snapshot files. (Each content source will be synchronized among all Aspire servers connected to ZooKeeper.)
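Since the steps above depend on read and write permission on the shared directory, it can be worth testing that on each server before changing the content sources. The following sketch is a hypothetical helper, not an Aspire tool: it writes a small probe file into the directory, reads it back, and removes it. Run it as the same user that runs Aspire, because permissions are checked for the current user.

```python
import os
import tempfile


def check_shared_directory(path):
    """Return a list of problems found; an empty list means the checks passed."""
    if not os.path.isdir(path):
        return [f"{path} is not an existing directory"]

    # Write test: create a uniquely named probe file inside the directory.
    try:
        with tempfile.NamedTemporaryFile(
            dir=path, prefix="aspire-check-", delete=False
        ) as handle:
            handle.write(b"probe")
            probe = handle.name
    except OSError as error:
        return [f"cannot write to {path}: {error}"]

    problems = []
    try:
        # Read test: the probe content must come back unchanged.
        with open(probe, "rb") as handle:
            if handle.read() != b"probe":
                problems.append(f"unexpected content read back from {path}")
    except OSError as error:
        problems.append(f"cannot read from {path}: {error}")
    finally:
        try:
            os.remove(probe)  # leave the shared directory clean
        except OSError as error:
            problems.append(f"cannot delete probe file in {path}: {error}")
    return problems
```

An empty list from `check_shared_directory` on every server is a reasonable sign that the directory can be used for snapshot files; it does not prove that all servers see the same underlying share.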
Security Concerns
The failover feature does not force all Aspire servers to use security. It is entirely possible to have a farm of three Aspire servers in which only two have security access restrictions. The third server will not know about these restrictions and will perform its failover crawls without them, so you would end up with an unsecured server crawling your sensitive data.
We recommend configuring security on all Aspire servers if your project requirements demand it. Go to Aspire Security for more details about how to configure security access restrictions for Aspire.