Category:2.2 Release



This page maintains a list of all of the updates for version 2.2 of Aspire.

6/22/2015: Version 2.2 has been released and is available for Community, Enterprise, and Aspire for Elasticsearch.


--- Notice ---

Because of a known bug in Aspire 2.1, the Aspire 2.2 release causes 2.1 distributions to start loading components and app bundles from the 2.2 release once they are restarted. This will probably cause issues, because 2.2 components have new properties that 2.1 distributions do not configure. Link to workarounds


New Features

  • Created Aspire for Elasticsearch distribution with expiration control.
  • Non-text document filtering - ability to recognize document types by extension to support an option to not fetch non-text based files during crawling.
  • Ability to crawl inside ZIP archive files for connectors that use snapshots. Available in the File System, CIFS, and Lotus Notes connectors.
  • Support for "Aspire Solutions"
    • Ability to load and configure Aspire Solutions via workflow
    • OCR solution based on Staging repository
    • Semantic Co-occurrence solution
  • Jive 8 is officially supported starting from version 2.2.
  • Fetch Hierarchy option, which allows hierarchy information to be skipped.
  • Fetch Places' ACLs through Entitlements API method. (Jive Cloud Only)
  • Publish to SharePoint 2013
    • Limited the number of items processed by the AspireBDCService.
    • Added BDCs Information into the SharePoint Publisher configuration.
    • SharePoint Components built as SharePoint WSP.
    • CleanUp Timer Job is now part of the Aspire publisher component.
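
The non-text document filtering feature listed above recognizes document types by file extension so that non-text files need not be fetched during a crawl. The following is only a minimal sketch of that idea, not Aspire's implementation: the extension list, the function names, and the skip_non_text flag are all hypothetical and chosen for illustration.

```python
import os

# Hypothetical set of extensions treated as non-text.
# This is NOT Aspire's actual list; it is an assumption for illustration.
NON_TEXT_EXTENSIONS = {".jpg", ".png", ".gif", ".mp3", ".mp4", ".exe", ".iso"}


def is_non_text(path, non_text_extensions=NON_TEXT_EXTENSIONS):
    """Return True when the file extension marks the item as non-text."""
    _, ext = os.path.splitext(path)
    # Compare case-insensitively so "IMG.JPG" and "img.jpg" match alike.
    return ext.lower() in non_text_extensions


def should_fetch(path, skip_non_text=True):
    """Decide whether a crawler would fetch the content of this item.

    skip_non_text is a hypothetical stand-in for the connector option
    that disables fetching of non-text files.
    """
    return not (skip_non_text and is_non_text(path))
```

With the option enabled, should_fetch("video/clip.mp4") is False while should_fetch("docs/report.pdf") is True. Files without an extension are fetched, because nothing identifies them as non-text.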

Bug Fixes

This release addresses the following issues:

Aspire Core

  • Hierarchical scan blocks publisher whilst recursively scanning.
  • Separation of batch and document errors in the statistics.
  • Issue with loading the latest version of the DXF.
  • Default port used in the DCM.
  • Issue where Aspire did not load the latest version of a component when no version was given.
  • Issue with statistics zeroing on failover.
  • Issue when importing the same content source more than once.
  • DXF multiple value is now saved properly on workflow.
  • Workflow applications cannot have the same name as connectors applications.
  • Issue when Aspire is running in failover mode.
  • Test scan cannot be executed when a second Aspire instance is running a crawl.
  • Auditing now performs the index dump when Solr is not the default collection.
  • Issue with Text Extractor when extracting PDF content.
  • Error encrypting passwords.
  • Failure message removed from startup.
  • DXF information is not deleted after reopening content source information.
  • InvalidVersionSpecificationException while including aspire-lucene.
  • Issue with clearing a user in the Group Expansion cache.
  • Issues with Start/Stop/Resume/Pause buttons on Aspire Services.
  • Issue with the Close button in Groovy compilation error reporting.
  • Issue with Control Panel when using Content Source schedule options.
  • Issue with tagger component when a single line contains '-'.
  • Full and Incremental crawl buttons are now separated.
  • UI is not handling content source page error for custom connectors that do not exist.
  • Components are now closing log files on exit.
  • Issues running Aspire in offline mode.
  • Setting app.config.dir is now correct for non-content source applications.
  • Linear scanner is now closing the snapshot file.
  • Error "Cannot find schedule id for source (filesystem) - perhaps it has been deleted".
  • Group Expansion group clear error.
  • Fixed substitution of ${xml:} property inside AspireObject.
  • Tooltip for System Name is no longer the same as the one for a content source in Aspire Services.
  • Zooming in Chrome was causing text placement issues on the content sources tab.
  • Normalization of connectors DXF to have display consistency across connectors.

Connectors

There are some connectors that have not been released for 2.2 yet; we are planning to release them sooner rather than later. Please check the wiki for more details.

Amazon S3

  • Added DXF waitForSubJobs option missing from the configuration.
  • Issues crawling non-existing folders.

Box

  • Last access and refresh tokens are handled in another folder.
  • NPE with full crawls.
  • Incremental crawl is now working (the timestamp was the same for all impersonated users).
  • Issue when user is impersonating the admin user.
  • Crawl does not extract information if URL does not exist.
  • Group Expansion is getting the user as a group.
  • ACLs are being updated when user performs incremental crawl.
  • Two imported Box.com content sources with the same name were not publishing correctly.
  • Updates logged as adds.

CIFS

  • Exception with the disable Text Extraction option.
  • NullPointerException with invalid credentials on Test Connection.
  • Incremental crawl with archive files is now working fine.
  • Sub directory to scan is now required for partial scans.

Confluence

  • Deadlock found during Confluence crawl execution.
  • Error message was improved when plugin is not installed.
  • NullPointerException when given an invalid URL.

Custom connector

  • Custom connector now considers the version field.

Documentum

  • Missing property on the connector's DXF.
  • NullPointerException when initializing the scanner.
  • Error when shutting down Aspire with a Documentum content source.

eRoom

  • ACL info is now correct when the crawl starts in a folder rather than the root.
  • Connector is handling redirects properly.
  • NPE when reading headers.
  • eRoom containers are now shown as updated when a child item is modified.
  • Logs and retry messages showing incorrectly.
  • ACLs not showing in Aspire object.
  • NPE when there is no valid certificate.

Feed One

  • New connector released for this version.

File System

  • Issue with groovy pipelines and text extraction option disabled.
  • Extract Text Timeout field too small for large numbers.
  • Issue with the Sub Directory to Scan field.

Hadoop

  • Reducer scaling problem.

Heritrix

  • Issue with deletes not being detected during incremental crawls.
  • Issue with Crawl Scope filtering after the ACCEPT rules.
  • Pause/Stop crawl actions working properly now.
  • Connector is now taking into account the max hops.
  • Crawl accept patterns and Crawl reject patterns working properly.

Homepage

  • Updated some tooltips on the connector.
  • Improved error message when user selects incorrect header type.

IBM Connections

  • Exception with addUpdate pipeline during full crawls.
  • UnknownHostException when trying to crawl a URL with \ at the end.
  • NullPointerException during DetectNonTextDocuments stage.
  • ClassCastException.
  • Wiki ACLs are LDAP DNs instead of GUIDs.
  • Validation added on LdapSearchBase field.
  • LDAP users are no longer downloaded during Test Connection.
  • Duplicate repItemType "aspire/entry".
  • Bookmarks are now reported as deleted.
  • NullPointerException when running an incremental crawl.

Jive

  • Unnecessary adds/deletes reported during Incremental crawls.
  • Comments not being crawled during Incremental crawls.

Lotus

  • Exception updating user/group database.
  • Non-Text Document Filtering was not working with files inside an archive file.
  • InterceptionGroupMap corrupted when crawl was stopped.
  • "Add Parent Info for Archives Files" option is now working.

RDB via Snapshot

  • Auditing tool is now working properly with the connector.
  • Bad settings and errors are now reported in the UI instead of the console.
  • Misspelled word on the Discovery SQL tooltips.

RDB via Update Tables

  • Incremental is not executed if seq_id is wrong.
  • Added documentation about how to specify the {SLICES} tag in the Full Crawl SQL field.

Salesforce

  • Chatter feeds are now available for crawling.
  • Session timeout is now handled correctly.
  • Exceptions being thrown during incremental crawls.
  • Deletes repeated for incremental crawls.
  • MapDB error when reloading content source.
  • New Opportunities being reported twice for incremental crawls.
  • sQueries.xml is not up to date with latest Salesforce fields.
  • Some exceptions when running consecutive incremental crawls.

SharePoint 2010

  • Updates on containers not triggering updates on children items.
  • Scan Recursively and Index Containers options were not working as expected.
  • Exclude pattern and Index Container not working as expected on Incremental Crawls.
  • Testing connection failing with an invalid URL.
  • Delete action not containing SourceName field.
  • Renamed files were not reported correctly.
  • Incremental crawl with errors after renaming library/list.
  • Deleted items on external lists were not handled on incremental crawls.

SharePoint 2013

  • Attachments not being crawled.
  • Updates on containers not triggering updates on children items.
  • Extension list option not working.
  • Specific site collection not working when specified.
  • Incremental crawl not picking up new items for a SP list.

SharePoint Online

  • Full crawl not working as expected.
  • Group Expansion and ACLs not extracting users from Office 365.
  • Extension list option not working.
  • Updates on containers not triggering updates on children items.
  • Metadata extraction not working as expected.

Staging Repository

  • Updated the code to use UTF-8 encoding when reading metadata.
  • Connector no longer reads one more item than were originally crawled.

Socialcast

  • Deactivated users are not shown on Group Expansion.
  • Socialcast Groups are now extracted by GE.

Teamforge

  • Added index specific item types to TeamForge.
  • Incremental crawl with index containers false doesn't return any result.
  • Incremental crawl shows correct updates with folders.
  • ACL info was added without that option checked.

Publishers

HDFS

  • FolderPath and File Prefix are now required.

SharePoint 2013

  • Sequence number & timestamps for batches are incorrect when running parallel content sources.
  • Aspire not able to communicate with the deployed notification service when SharePoint web site is https.
  • Deletes not working when there are other kinds of updates.
  • Fixed error when running Cleanup Timer Job.
  • AspireBDCService not skipping corrupt batch files.
  • Cleanup Job in SP-Publisher not working when using staging repository.
  • Full crawl in Aspire now triggers full crawl in SharePoint 2013.
  • Background thread log is lost when the publisher is reloaded.
  • Fixed Sequence number reset.
  • Improved error logging in all SP2013 publisher components.
  • NotificationService failing to deploy BDC Model.
  • Fixed issue if Intermediate Repository is too large.
  • Fixed exception when starting a crawl (UpdatedConcurrencyException)
  • Fixed NPE when XSL File Path was changed to an absolute path.
  • CleanUp Timer Job is now part of the Aspire publisher component.

Staging Repository

  • Misspelled word on tooltip.

Services

Group Expansion Manager

  • Validation for lookup optional attribute.

LDAP Group Cache

  • Issue with pagination larger than 1000.
  • Not respecting required User and Password.

Known Issues

  • Audit file comparison fails with default transformation files because of ID field.
  • Issue loading minimum version of a component.
  • Issue when impersonating users in Box, and running a full crawl with Scan Recursively.
  • Issue with incremental crawls in Box, and crawling with co-admin users.
  • Issue keeping custom application in Big Data workflow section.
  • Exclude patterns not working with incremental crawls in Box.
  • eRoom connector is not getting updates for notes in incremental crawls.
  • IBM connector is not extracting text for the Files endpoint.
  • Chatter feed deletes are not being reported for incremental crawls on Salesforce connector.
  • Updates are being handled as Adds on Salesforce connector.
  • InterceptionGroupMap corrupted when crawl is stopped on connectors that handle Interception ACLs.
  • NullPointerException and NotesException when the crawl is stopped on Lotus connector.
  • NullPointerException trying to get the ACLs on Jive connector. Hotfix 2.2.0.6 addresses this issue. Please contact the Support Team for information on how to install this hotfix.
  • Hierarchy on updates is not being generated correctly in SharePoint 2010.

External Technical Limitations

  • Changes in Box notes content are not considered for incremental crawls.
  • New items added to IBM Connections are reported as updated.
  • Changes made to the attachments of the item type Opportunity in Salesforce are not considered for incremental crawls.
  • Entitlements API is not supporting "User Overrides" on Jive connector. In this case, ACLs will not be retrieved.
  • Documents with the exact same date (including milliseconds) will affect the statistics for Incremental Crawls on Jive connector.

Additional release notes

Java 7: As of March 18, 2015, we began using Java 7 to build Aspire. You should update your environment to use Java 7 to run Aspire 2.2.