360Works Textractor User Guide

Textractor extracts plain text from formatted documents such as PDF, Word, and Excel files. It can read remote or local files: pass in a URL or a container field, and it returns the text.

PDFs

PDFs containing form fields have the form values listed at the top of the resulting text.

Note: PDF extraction requires that Java version 1.5 or higher be installed on your system.

Excel

Excel files can be extracted in full, or as a list of unique cell values in no particular order. See the note on the index flag under extractText.

Word / RTF

Word (.doc) and Rich Text Format (.rtf) documents are extracted verbatim, without any formatting.
The following versions of Word are supported:

Acknowledgements

The following libraries are used in this plugin:
text-mining

See also: extractText

Installation

Requirements

FileMaker version 7 or higher.

Java Virtual Machine (JVM) version 1.4.2 or later. If you are running a JVM earlier than 1.4.2, you should upgrade. Download a JVM from http://www.java.com/en/download/. If you are not sure what version of Java you have installed, run 'java -version' on the command line in Windows or OS X.

Windows, or Mac OS X version 10.4 or higher.

Note to Intel Mac users: running this plugin under Rosetta is not supported. Upgrade to FileMaker 8.5 to run our plugin in native Intel mode.

Install Steps for FileMaker Pro

Drag the plugin from the MAC or WIN folder into your FileMaker Extensions folder, and restart FileMaker. You will need to enter your license key before you can use the plugin. After FileMaker starts up with the plugin installed, open FileMaker preferences, click on the Plug-ins tab, select the plugin from the list, and click the Configure button. Enter your license key and company name in this dialog. You only need to do this once on a given machine. Alternatively, you can use the registration function to register the plugin during a startup script.

This will also enable the plugin for use with Instant Web Publishing from the FileMaker Pro client software.

If the plugin does not load correctly, double-check that you meet the system requirements.

Install steps for FileMaker Web Publishing Engine / Instant Web Publishing

You do not need to do this step unless you plan on using the plugin with Instant Web Publishing or Custom Web Publishing with FileMaker Server Advanced. You will need an Enterprise License to use this feature.

To install into the Web Publishing Engine with FileMaker 9 Server or FileMaker Server Advanced, drag the plugin from the MAC or WIN folder into the FileMaker Server/Web Publishing/publishing-engine/wpc/Plugins folder. If there is no 'Plugins' folder inside the 'wpc' folder, create it manually. Restart FileMaker Web Publishing, and the plugin should be ready to use.

Note that you must use the registration function to register the plugin, since there is no preferences dialog in the FileMaker Web Publishing Engine to enter the license key and company name.

Note that due to a bug which we and other plugin vendors have reported to FileMaker, web plugins do not work in FileMaker Web Publishing Engine 8.0v4 on Mac OS X. You will need to use a later version, like 9, or an earlier version, like 8.0v3. The Windows FileMaker Server 8.0v4 does not have this bug, and will work correctly.

The easiest way to test whether the plugin is working is to have a calculation which calls the version function of the plugin, and display that on an IWP layout. If it shows "?", then the plugin is not working. If it shows a number, then the plugin has been installed successfully.
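
As a minimal sketch of such a test, the calculation below could be used as an unstored calculation field placed on an IWP layout. It uses only txrVersion from this guide and assumes that a missing plugin function evaluates to "?" inside the calculation, as described above. The message wording is illustrative.

```
// Unstored calculation, result type Text.
// Shows a readable status instead of a bare "?" when the plugin is not loaded.
Let ( v = txrVersion ;
  If ( v = "?" ;
    "Textractor is not loaded" ;
    "Textractor version " & v
  )
)
```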

Install steps for FileMaker Server 9

You do not need to do this step unless you plan on using the plugin with scheduled script triggering, a new feature in FileMaker Server 9. You will need an Enterprise License to use this feature.

  1. Drag the plugin from the MAC or WIN folder into the FileMaker Server/Database Server/Extensions folder (Server 8 and older versions of server use the path FileMaker Server/Extensions/Plugins).
  2. Restart FileMaker Server. In the Server Admin application, go to Configuration -> Database Server -> Server Plug-ins.
  3. Check the box that says 'Enable FileMaker Server to use plug-ins', and then check the 'enabled' box for this plugin. You should now be able to write schedules that trigger scripts which use the plugin.

Note that you must use the registration function to register the plugin, since there is no preferences dialog in FileMaker Server to enter the license key and company name.

Feedback

We love to hear your suggestions for improving our products! If you are experiencing problems with this plugin, or have a feature request, or are happy with it, we'd appreciate hearing about it. Send us a message on our website, or email us!

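Function Summary

extractText ( data {; flags} ) - Extracts text from a container file or URL pointing to a file.
txrLastError - Returns detailed information about the last error generated by this plugin.
txrLicenseInfo - Returns license information about the plugin.
txrRegister ( licenseKey ; registeredTo ) - Registers the plugin.
txrVersion - Returns the version number of the plugin.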

Function Detail

extractText ( data {; flags} )

Extracts text from a container file or URL pointing to a file. The supported file formats are the ones described at the top of this guide (PDF, Excel, Word, and RTF). If you need support for an additional format, let us know! Contact us at www.360works.com or by email.
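
A minimal usage sketch follows. The field name Documents::File and the URL are hypothetical placeholders; only extractText itself comes from this guide. Each expression is a separate calculation.

```
// Extract text from a file stored in a container field (hypothetical field name)
extractText ( Documents::File )

// Extract text from a remote file (placeholder URL)
extractText ( "http://www.example.com/files/report.pdf" )
```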

Flags

You can customize the text extraction behavior by supplying optional flags as a second parameter. Multiple flags can be separated by a plus sign. The following flags are supported:
index
Use this flag if the extracted text is only intended to be searched on, not displayed to the user. This usually results in faster indexing and smaller results, but less legibility. For PDF extraction, words may not be in the correct order. For Excel extraction, duplicate cell values are removed, and the values will not appear in the correct order.
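
For example, a calculation that fills a search-only field might look like the sketch below. It assumes that flags are passed as a text value, which this guide does not state explicitly, and Documents::File is a hypothetical container field.

```
// Assumption: the flag is supplied as text in the second parameter.
// The result is meant for searching, not for display.
extractText ( Documents::File ; "index" )
```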

PDF extraction notes

If the PDF document contains form data, it will appear at the beginning of the extracted text. Each input in the form will appear on a separate line. The data will be formatted as:

name=value

Where name is the name of the field, and value is its value.
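
Because each form value sits on its own line, a single value can be picked out with standard FileMaker text functions. The sketch below is one possible approach, not part of the plugin: the field Documents::File and the form field name LastName are hypothetical, it assumes the lines of the extracted text are separated by FileMaker return characters, and GetValue requires FileMaker 8 or later.

```
Let ( [
  extracted = extractText ( Documents::File ) ;
  fieldKey = "LastName=" ;   // hypothetical form field name, followed by "="
  // Prefix a return to both sides so the key only matches at the start of a line.
  // The position found in the prefixed text equals the key's position in the original text.
  keyPos = Position ( ¶ & extracted ; ¶ & fieldKey ; 1 ; 1 ) ;
  rest = Middle ( extracted ; keyPos + Length ( fieldKey ) ; Length ( extracted ) )
] ;
  // The first line of the remainder is the form value; empty if the key was not found
  If ( keyPos > 0 ; GetValue ( rest ; 1 ) ; "" )
)
```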

Parameters:
data - container data or URL for a file to extract the text from.
flags - flags to customize how text is extracted.
Returns: The text in a file, or ERROR if an error occurred. For some file formats, the returned text may only contain unique words.

txrLastError

Returns detailed information about the last error generated by this plugin. If another plugin function returns the text "ERROR", call this function to get a user-presentable description of what went wrong.

Returns: Error text, or null if there was no error.
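
The pattern described above can be sketched as a single calculation. Documents::File is a hypothetical container field and the message prefix is illustrative.

```
Let ( result = extractText ( Documents::File ) ;
  If ( result = "ERROR" ;
    // Extraction failed: show the plugin's description of the problem
    "Extraction failed: " & txrLastError ;
    // Extraction succeeded: return the extracted text
    result
  )
)
```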

txrLicenseInfo

Returns license information about the plugin.


txrRegister ( licenseKey ; registeredTo )

Registers the plugin.
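
A startup script could register the plugin with a calculation along the lines of the sketch below, for example in a Set Variable step. The license key and company name are placeholders. This guide does not document the value txrRegister returns on success, so the sketch only checks for the text "ERROR".

```
// Placeholder values; substitute your own license key and company name.
Let ( result = txrRegister ( "YOUR-LICENSE-KEY" ; "Your Company Name" ) ;
  If ( result = "ERROR" ;
    txrLastError ;   // description of why registration failed
    result
  )
)
```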


txrVersion

Returns the version number of the plugin.