The GoldenGATE Document Editor

Download the demo version of the GoldenGATE editor here (version date 2011.11.21.12.15, archive last modified 2011.11.21.12.15).

The demo version of the GoldenGATE editor includes all the resources needed to convert OCRed biosystematics documents into XML content marked up in the TaxonX XML schema. The description will guide you in marking up any of the test documents:

Description

The intention of the GoldenGATE editor is to build a bridge between NLP components and XML markup of natural language text according to arbitrary XML schemas. It allows NLP components to be deployed to mark up the bodies of literature they were designed for. In this way, it enables transforming the texts into XML content according to an XML schema designed to gain maximum benefit from the knowledge they contain.

The GoldenGATE editor picks up the ideas of plug-in processing resources and pipelined processing implemented in the GATE framework (http://www.gate.co.uk), which has been widely used in many areas of NLP research. At the same time, it provides a full XML editor including assistance for manipulation of both text and markup, thus allowing users to improve data quality by manual intervention.

To achieve maximum flexibility and extensibility, the GoldenGATE editor provides plug-and-play interfaces on many levels: individual automated components for markup creation and manipulation, entire groups of functions, components that access documents in arbitrary storage locations, and components that handle arbitrary document data formats.
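The idea of plug-in processing resources run as a pipeline can be illustrated with a small Java sketch. It is not the GoldenGATE API: every type and method name below (ProcessingResource, Pipeline, RegexTagger, MarkupDocument, Annotation, PipelineSketch) is hypothetical, and the two regular expressions are toy stand-ins for real NLP components. The sketch only shows how independent components behind one shared interface can be chained to add markup to a text, which is then written out as inline XML.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// All names in this file are hypothetical; none are taken from the GoldenGATE API.

/** Entry point: wires two toy taggers into a pipeline and prints the result. */
final class PipelineSketch {
    static String markUp(String text) {
        MarkupDocument doc = new MarkupDocument(text);
        new Pipeline()
            .add(new RegexTagger("year", "\\b\\d{4}\\b"))
            .add(new RegexTagger("taxonName", "\\b[A-Z][a-z]+ [a-z]+\\b"))
            .process(doc);
        return doc.toXml();
    }

    public static void main(String[] args) {
        System.out.println(markUp("Formica rufa was described in 1761."));
    }
}

/** A labeled character span [start, end) in the document text. */
final class Annotation {
    final String type;
    final int start;
    final int end;

    Annotation(String type, int start, int end) {
        this.type = type;
        this.start = start;
        this.end = end;
    }
}

/** Plain text plus the annotations that processing steps have added so far. */
final class MarkupDocument {
    final String text;
    final List<Annotation> annotations = new ArrayList<>();

    MarkupDocument(String text) {
        this.text = text;
    }

    /** Writes inline XML. Assumes annotations do not overlap; does not escape special characters. */
    String toXml() {
        List<Annotation> sorted = new ArrayList<>(annotations);
        sorted.sort(Comparator.comparingInt(a -> a.start));
        StringBuilder xml = new StringBuilder();
        int pos = 0;
        for (Annotation a : sorted) {
            xml.append(text, pos, a.start)
               .append('<').append(a.type).append('>')
               .append(text, a.start, a.end)
               .append("</").append(a.type).append('>');
            pos = a.end;
        }
        return xml.append(text.substring(pos)).toString();
    }
}

/** Plug-in contract: a processing resource adds or changes markup on a document. */
interface ProcessingResource {
    void process(MarkupDocument doc);
}

/** Toy component: annotates every match of a regular expression with one type. */
final class RegexTagger implements ProcessingResource {
    private final String type;
    private final Pattern pattern;

    RegexTagger(String type, String regex) {
        this.type = type;
        this.pattern = Pattern.compile(regex);
    }

    @Override
    public void process(MarkupDocument doc) {
        Matcher m = pattern.matcher(doc.text);
        while (m.find()) {
            doc.annotations.add(new Annotation(type, m.start(), m.end()));
        }
    }
}

/** Runs processing resources in the order they were added; is itself a processing resource. */
final class Pipeline implements ProcessingResource {
    private final List<ProcessingResource> steps = new ArrayList<>();

    Pipeline add(ProcessingResource step) {
        steps.add(step);
        return this;
    }

    @Override
    public void process(MarkupDocument doc) {
        for (ProcessingResource step : steps) {
            step.process(doc);
        }
    }
}
```

Because Pipeline implements the same interface as its steps, a whole pipeline can be plugged in wherever a single component is expected. That is one common way to get the kind of composability described above, and is offered here only as an illustration.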

Publications

Markup Examples

Contact Information

Related Links

The package natively includes:

Resources for automated and semi-automated markup:
Components for viewing documents in specialized ways:
Components for loading and storing documents:
Components providing support for a variety of document formats:

Documentation:

Read the online help for the GoldenGATE editor here.
Read the JavaDoc of the GoldenGATE editor and its backing components. The JavaDoc does not cover the actual implementations of the plugin interfaces.

Update Sources:

URLs from which the GoldenGATE editor can fetch updates. These URLs are not meant to be accessed manually through a browser, but to be listed in the UpdateHosts.cnfg file in your GoldenGATE root folder. The GoldenGATE editor automatically downloads and installs updates from the URLs listed in this file.