Aspire for Hadoop Introduction (Aspire 2)
For information on Aspire 3.1, click here.
- Aspire for Hadoop allows Aspire instances to be executed on Hadoop Task Tracker nodes as map, reduce, and/or combine tasks.
- Each time a task is executed, the Hadoop Task Tracker running it is responsible for launching and shutting down the Aspire instance that the task requires.
- A Hadoop Writable named AspireObjectWritable is available so that Mapper and Reducer tasks can read AspireObjects from Hadoop input files and write them to Hadoop output files.
- External Aspire implementations can publish their output to HDFS via the Post to HDFS or Post to WebHDFS components, for later use by Aspire for Hadoop map/reduce tasks.
- The configuration of Aspire for Hadoop tasks uses a structure similar to the Aspire application.xml (the map, reduce, and combine applications are all defined in a single file).
- Aspire components are provided to interact with Hadoop's Context and Key/Value pairs (reading and writing pairs).
- The Aspire Groovy component can also be used to interact directly with Hadoop's Context and Key/Value pair objects.
- Aspire 2.2.x for Hadoop works with the Cloudera CDH5 distribution.
- Aspire 2.1.x for Hadoop works with the Cloudera CDH5 distribution.
- Aspire 2.0.x for Hadoop works with the Cloudera CDH4 distribution.
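
As a rough illustration of the Writable mechanism mentioned in the list above, the sketch below shows the general serialization contract that any Hadoop Writable follows: a `write(DataOutput)` method and a `readFields(DataInput)` method. It deliberately uses only the Java standard library. The `SimpleWritable` interface is a local stand-in for Hadoop's `org.apache.hadoop.io.Writable`, and `DocWritable`, with its `id` and `content` fields, is a hypothetical class invented for this example. It is not the real AspireObjectWritable, whose fields and wire format are not described here.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    /** Local stand-in for Hadoop's Writable contract (same two method signatures). */
    interface SimpleWritable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical document wrapper; an assumption, NOT the real AspireObjectWritable. */
    static class DocWritable implements SimpleWritable {
        String id = "";
        String content = "";

        DocWritable() {}

        DocWritable(String id, String content) {
            this.id = id;
            this.content = content;
        }

        @Override
        public void write(DataOutput out) throws IOException {
            // Fields are written in a fixed order...
            out.writeUTF(id);
            out.writeUTF(content);
        }

        @Override
        public void readFields(DataInput in) throws IOException {
            // ...and must be read back in exactly the same order.
            id = in.readUTF();
            content = in.readUTF();
        }
    }

    /** Serializes a writable to bytes, as a framework would when writing an output file. */
    static byte[] serialize(SimpleWritable w) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            w.write(out);
        }
        return bytes.toByteArray();
    }

    /** Rebuilds a DocWritable from bytes, as a framework would when reading an input file. */
    static DocWritable deserialize(byte[] data) throws IOException {
        DocWritable doc = new DocWritable();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            doc.readFields(in);
        }
        return doc;
    }

    public static void main(String[] args) throws IOException {
        DocWritable original = new DocWritable("doc-1", "hello hadoop");
        DocWritable copy = deserialize(serialize(original));
        System.out.println(copy.id + " -> " + copy.content);
    }
}
```

In a real job, Hadoop itself calls `write` and `readFields` when moving values between tasks and files; the `serialize` and `deserialize` helpers here only simulate that round trip in memory.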