Mime Type Normalizer (Aspire 2)

From wiki.searchtechnologies.com



Factory Name: com.searchtechnologies.aspire:aspire-tools
subType: mimeTypeNormalizer
Inputs: AspireObject holding a mime type in one of the fields listed under "Mimetype Fields" below
Outputs: AspireObject with normalized mime-type fields
Note: Feature only available with Aspire Enterprise

The Mimetype Normalizer component reads the mime type from the input AspireObject and assigns it to a category, using the known mime types listed in the normalized-mimetypes.xml file.

Mimetype Fields

The Mimetype Normalizer looks for the mime type to classify in the following fields of the input AspireObject. The fields are checked in the order shown, and the first one present is used:

Order  Field
1      mimeType
2      contentType
3      hierarchy/item/@type
4      repItemType
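
The lookup order in the table above can be sketched in a few lines of Python. This only illustrates the precedence rule and is not Aspire's implementation: the function name find_mime_type is hypothetical, the input is modeled as a plain XML document rather than an AspireObject, and skipping empty fields is an assumption.

```python
import xml.etree.ElementTree as ET

# Fields in the order given in the table above.
# "hierarchy/item/@type" means the "type" attribute of hierarchy/item.
FIELD_ORDER = ["mimeType", "contentType", "hierarchy/item/@type", "repItemType"]


def find_mime_type(doc):
    """Return (field, value) for the first field in FIELD_ORDER that has a value."""
    for field in FIELD_ORDER:
        # Split an optional trailing "/@attribute" off the element path.
        path, _, attr = field.partition("/@")
        node = doc.find(path)
        if node is None:
            continue
        value = node.get(attr) if attr else node.text
        if value and value.strip():  # assumption: empty fields are skipped
            return field, value.strip()
    return None


# Made-up input: contentType and repItemType are set, mimeType is absent.
doc = ET.fromstring(
    "<doc>"
    "<contentType>application/msword</contentType>"
    "<repItemType>text/plain</repItemType>"
    "</doc>"
)
print(find_mime_type(doc))
```

In this made-up input, contentType is chosen because mimeType is missing and contentType comes before repItemType in the order.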

Configuration

The Mimetype Normalizer recognizes the following configuration parameter:

Element: mimetypesLocation
Type: String
Default: ${aspire.home}/resources/com.searchtechnologies.aspire.utilities.tools.MimeTypes/normalized-mimetypes.xml
Description: The location of the normalized mimetypes file.

Normalized Mimetypes XML

<?xml version="1.0" encoding="UTF-8"?>
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document"/>
    <mimetype name="application/msword"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.wordprocessingml.websettings+xml"/>
    .
    .
    .
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
    .
    .
    .
  </category>
  .
  .
  .
</mimetypes>
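
To make the file structure concrete, the following Python sketch reads a shortened copy of the file above into a lookup table from mime type to category. It is an illustration under stated assumptions, not the component's code: load_categories is a hypothetical name, the sample holds only entries shown above, and exact, case-sensitive matching is assumed.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the file above, limited to entries shown there.
SAMPLE = """
<mimetypes>
  <category name="application/msword" displayName="Word">
    <mimetype name="application/vnd.lotus-wordpro"/>
    <mimetype name="application/msword"/>
  </category>
  <category name="application/vnd.ms-powerpoint" displayName="PowerPoint">
    <mimetype name="application/vnd.ms-powerpoint"/>
    <mimetype name="application/vnd.openxmlformats-officedocument.presentationml.presentation"/>
  </category>
</mimetypes>
"""


def load_categories(xml_text):
    """Map each listed mime type to (category name, display name)."""
    lookup = {}
    root = ET.fromstring(xml_text.strip())
    for category in root.findall("category"):
        entry = (category.get("name"), category.get("displayName"))
        for mimetype in category.findall("mimetype"):
            # Assumption: names are matched exactly, without case folding.
            lookup[mimetype.get("name")] = entry
    return lookup


lookup = load_categories(SAMPLE)
print(lookup["application/vnd.lotus-wordpro"])
```

Each category element supplies both the normalized type (its name attribute) and the friendly name (its displayName attribute) for every mimetype element nested inside it.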

Output

The Mimetype Normalizer adds three fields to the AspireObject: the original mime type (originalMimeType), the normalized mime type, which is the name of the matching category (normalizedMimeType), and the friendly display name of that category (normalizedMimeName). For example:

<doc>
  <fetchUrl>smb://server/Archive 2011 - DLS Utah presentation.pptx</fetchUrl>
  <mimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</mimeType>
  .
  .
  .
  <originalMimeType>application/vnd.openxmlformats-officedocument.presentationml.presentation</originalMimeType>
  <normalizedMimeType>application/vnd.ms-powerpoint</normalizedMimeType>
  <normalizedMimeName>PowerPoint</normalizedMimeName>
</doc>
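
The output step shown above can be sketched in Python as follows. This is a minimal illustration rather than the component's implementation: add_normalized_fields and LOOKUP are hypothetical names, the document is modeled as plain XML instead of an AspireObject, and leaving the document unchanged for an unlisted mime type is an assumption, since this page does not describe that case.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: mime type -> (category, display name),
# holding a single entry taken from the sample file on this page.
LOOKUP = {
    "application/vnd.openxmlformats-officedocument.presentationml.presentation":
        ("application/vnd.ms-powerpoint", "PowerPoint"),
}


def add_normalized_fields(doc, mime_type, lookup):
    """Append the three output fields to doc; return False if the type is unlisted."""
    if mime_type not in lookup:
        return False  # assumption: unlisted types leave the document unchanged
    category, display_name = lookup[mime_type]
    for tag, text in (
        ("originalMimeType", mime_type),
        ("normalizedMimeType", category),
        ("normalizedMimeName", display_name),
    ):
        ET.SubElement(doc, tag).text = text
    return True


doc = ET.fromstring(
    "<doc><mimeType>"
    "application/vnd.openxmlformats-officedocument.presentationml.presentation"
    "</mimeType></doc>"
)
add_normalized_fields(doc, doc.findtext("mimeType"), LOOKUP)
print(ET.tostring(doc, encoding="unicode"))
```

The printed document keeps the original mimeType element and gains the three fields described in this section.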