
Link U-Compare with Taverna

This page explains how to use U-Compare and its workflows from within a Taverna workflow. There are two ways: the U-Compare Taverna plugin, and the U-Compare command line mode run as a Taverna activity. Please also refer to our paper below.


If you use our plugin or the example workflow, please cite this paper:

Kano, Yoshinobu, Paul Dobson, Mio Nakanishi, Jun'ichi Tsujii and Sophia Ananiadou. Text Mining Meets Workflow: Linking U-Compare with Taverna. Bioinformatics. 2010; doi: 10.1093/bioinformatics/btp463

More related papers can be found on our Publications page.

U-Compare as Command Line

Integrating U-Compare into Taverna as an external command line tool is straightforward. To run U-Compare from the command line, you first prepare a UIMA CPE workflow. The CPE workflow is an XML file, so you can write it by hand, but U-Compare provides an easier way to construct one: in the stand-alone U-Compare (started from the top page of our website), create the workflow visually and save it. Then copy the saved location from the Workflow/Show Location menu and set it in the input port of the command line mode. Details are described in the paper above and here:
Link to the U-Compare developer guide page

You can download the example Taverna workflow that uses the command line mode from myExperiment (id 1377). You also need to download a UIMA workflow from myExperiment. Details are described in our paper above.

U-Compare Taverna Plugin

The U-Compare Taverna Plugin allows Taverna users to embed a U-Compare workflow within a Taverna workflow. Taverna 2.1.0 (exactly this version) and Java 6 are required. Taverna can be downloaded from the Taverna website. We will update the plugin to be Taverna 2.2 compatible soon, in collaboration with the Taverna team.
Please note that this plugin is provided for lightweight testing purposes. For practical use, we recommend the stand-alone U-Compare and the command line mode described above.


From Taverna's menu, select Advanced > updates and plugins > Find new plugins button > Add update site button, and enter the URL of the U-Compare update site. Taverna will then automatically install the U-Compare Taverna plugin.


Users should specify these mandatory options:

  • the name of the U-Compare workflow to embed, from the pull-down box
  • a post-processing Beanshell script with proper I/O ports to appropriately reformat the output for Taverna
Other options include the memory allocation settings for U-Compare.
When the Taverna workflow is run, the U-Compare application starts and runs the specified workflow automatically.

Post Processing Script

After each document (CAS) has been processed in the U-Compare side workflow, the user-defined post-processing script is called. This script should be a BeanShell script, in which a predefined variable "cas" is accessible. This variable is an instance of the CAS class of the UIMA API, so you can use the UIMA 2.2.2 API to retrieve any data held in the cas object. Normally, all of the processing results are included in this object. In addition to the UIMA API, the U-Compare API (e.g. the type system) is available by default. Please refer to the official UIMA documentation for details of the UIMA API.

Please note that you have to create at least one output port for Taverna to receive any data. Once you create an output port in the configuration panel, you can read from and store data to the corresponding variable. This port-related behaviour is the same as that of Taverna's BeanShell Activity, one of the default activities. Please refer to Taverna's documentation for details of the BeanShell Activity and port settings.
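
As an illustration of the kind of reformatting a post-processing script might perform, here is a minimal, self-contained Java sketch. It is not the plugin's own code. The Span class is a hypothetical stand-in for the values a real script would read from the predefined "cas" variable (for example, the document text via cas.getDocumentText() and the begin/end offsets of each annotation), and the tab-separated output layout is only an assumption about what a downstream Taverna step might want.

```java
import java.util.ArrayList;
import java.util.List;

public class PostProcessSketch {

    /** Hypothetical stand-in for one annotation: its type name and character offsets. */
    public static final class Span {
        final String type;
        final int begin;
        final int end;

        public Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    /**
     * Formats each annotation as one tab-separated line:
     * type, begin offset, end offset, covered text.
     */
    public static List<String> toLines(String documentText, List<Span> spans) {
        List<String> lines = new ArrayList<String>();
        for (Span s : spans) {
            // The covered text is the slice of the document between the offsets.
            String covered = documentText.substring(s.begin, s.end);
            lines.add(s.type + "\t" + s.begin + "\t" + s.end + "\t" + covered);
        }
        return lines;
    }

    public static void main(String[] args) {
        // In a real script these values would come from the "cas" object.
        List<Span> spans = new ArrayList<Span>();
        spans.add(new Span("Protein", 0, 4));
        spans.add(new Span("Protein", 14, 19));
        for (String line : toLines("IL-2 binds to IL-2R.", spans)) {
            System.out.println(line);
        }
    }
}
```

In an actual post-processing script, the resulting list would be assigned to the variable of an output port created in the configuration panel; the port name is up to the user.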

Two Input Modes

There are two modes of behavior regarding U-Compare workflow inputs: voluntary and interactive. Normally, a U-Compare workflow has a component called a collection reader, which reads text or a corpus. In the voluntary mode, this collection reader supplies the documents, so no input is required and the output is a list of annotated documents. In the interactive mode, the collection reader is ignored and a list of Strings (depth 1) passed to the “input_text” port of the U-Compare plugin is used as the workflow input.
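
For the interactive mode, an upstream step has to produce the depth-1 list of Strings. The following Java sketch shows one way such a list might be built, assuming (purely for illustration) that the raw input holds several documents separated by blank lines. The class and method names are hypothetical, and the plugin itself does not require this separator convention.

```java
import java.util.ArrayList;
import java.util.List;

public class InputTextSketch {

    /**
     * Splits raw text into documents at blank lines and returns them as a
     * flat (depth 1) list, one String per document.
     */
    public static List<String> splitDocuments(String raw) {
        List<String> docs = new ArrayList<String>();
        // A newline, optional whitespace, then another newline marks a blank line.
        for (String part : raw.split("\\n\\s*\\n")) {
            String doc = part.trim();
            if (!doc.isEmpty()) {
                docs.add(doc);
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        String raw = "First document.\n\nSecond document,\nwith two lines.\n";
        for (String doc : splitDocuments(raw)) {
            System.out.println("[" + doc + "]");
        }
    }
}
```

Each element of the returned list would then be treated as one document when the list is connected to the “input_text” port.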