List of U-Compare compatible UIMA components
This is a list of UIMA components that are compatible with the U-Compare
"comparable" type system. These components are included in the
U-Compare single-click-to-launch package. See components for other components.
Abbreviations: UT or U-Tokyo for the University of Tokyo, UM or U-Man for the University of Manchester, and CCP for the Center for Computational Pharmacology at the University of Colorado Health Science Center.
Syntactic Tools
Sentence Detectors
| Name | Provider | Developer | Description |
| --- | --- | --- | --- |
| GENIA Sentence Detector | U-Tokyo | Yuichiroh Matsubayashi, U-Tokyo | Trained on the GENIA corpus. |
| LingPipe Sentence Detector | CCP | Alias-i | You have to download and import the lingpipe.ucz package separately from our download page. |
| NaCTeM Sentence Breaker | NaCTeM at U-Manchester | Scott Piao, NaCTeM U-Manchester | English sentence boundary detector that employs heuristic rules, including error-correction rules, compiled from corpus resources. |
| OpenNLP Sentence Detector | CCP | OpenNLP | From the OpenNLP project. |
| UIMA Sentence Detector | U-Tokyo | Apache UIMA | From Apache UIMA examples. |
Tokenizers
| Name | Provider | Developer | Description |
| --- | --- | --- | --- |
| GENIA Tagger | U-Tokyo | Yoshimasa Tsuruoka, U-Tokyo (GENIA project) | Trained on the WSJ, GENIA, and PennBioIE corpora. |
| OpenNLP Tokenizer | U-Tokyo | OpenNLP/Apache UIMA | From Apache UIMA examples. |
| Penn Bio Tokenizer | CCP | U-Penn | Part of Penn BioTagger. |
| UIMA Tokenizer | U-Tokyo | Apache UIMA | From Apache UIMA examples. |
Part-of-Speech Taggers
| Name | Provider | Developer | Description |
| --- | --- | --- | --- |
| GENIATagger | U-Tokyo | Yoshimasa Tsuruoka, U-Tokyo (GENIA project) | Trained on the WSJ, GENIA, and PennBioIE corpora. |
| SteppTagger | U-Tokyo | Yoshimasa Tsuruoka, NaCTeM | Based on probabilistic models and tuned to biomedical text. Trained on the WSJ, GENIA, and PennBioIE corpora. |
| LingPipe POS Tagger | CCP | Alias-i | Hidden Markov Model tagger trained on the GENIA corpus. You have to download and import the lingpipe.ucz package separately from our download page. |
| OpenNLPTagger | U-Tokyo | OpenNLP/Apache UIMA | From Apache UIMA examples. |
Lemmatizers
| Name | Provider | Developer | Description |
| --- | --- | --- | --- |
| morpha | NaCTeM/U-Compare | G. Minnen et al., U-Sussex (morph) | A fast and robust morphological analyser for English, based on finite-state techniques, that returns the lemma and inflection type of a word given its word form and part of speech. |
| GENIATagger | U-Tokyo | Yoshimasa Tsuruoka, U-Tokyo (GENIA project) | Trained on the WSJ, GENIA, and PennBioIE corpora. |
| Enju | U-Tokyo | Yusuke Miyao, U-Tokyo (Enju) | HPSG parser with predicate-argument structures (PAS) as well as phrase structures, trained on newswire articles (Penn Treebank). |
Syntactic Parsers (CFG/Deep/Dependency Parsers)
| Name | Provider | Developer | Description |
| --- | --- | --- | --- |
| Enju | U-Tokyo | Yusuke Miyao, U-Tokyo (Enju) | HPSG parser with predicate-argument structures (PAS) as well as phrase structures, trained with newswire articles (Penn Treebank). |
| OpenNLPParser | U-Tokyo | OpenNLP/Apache UIMA | CFG parser from Apache UIMA examples. |