ChunkerAnnotator (org.annotation.wordfreak)

org.annotation.wordfreak.annotator
Class ChunkerAnnotator

java.lang.Object
  org.annotation.wordfreak.annotator.Annotator
      org.annotation.wordfreak.annotator.DocumentProcessor
          org.annotation.wordfreak.annotator.ParagraphProcessor
              org.annotation.wordfreak.annotator.SentenceProcessor
                  org.annotation.wordfreak.annotator.ChunkerAnnotator

All Implemented Interfaces: java.awt.event.ActionListener, AnnotatedFileListener, java.util.EventListener, Plugin

public abstract class ChunkerAnnotator
extends SentenceProcessor

This annotator creates annotations for sequence data by converting a series of tags into appropriate chunks. It can be extended to create named-entity recognizers, noun-phrase detectors, or other annotators of this category.
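
The tag-to-chunk conversion described above can be illustrated with a minimal, standalone sketch. It assumes a B-/I-/O tagging scheme, which this page does not confirm, and it does not use any WordFreak classes; `BioChunkSketch` and `toChunks` are hypothetical names chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical standalone sketch; not part of WordFreak. */
final class BioChunkSketch {

    /**
     * Groups per-token tags such as "B-NP", "I-NP", "O" into chunks.
     * Each chunk is reported as "TYPE:start-end", with end exclusive.
     * Assumes a B-/I-/O scheme and non-null tags.
     */
    static List<String> toChunks(String[] tags) {
        List<String> chunks = new ArrayList<>();
        int start = -1;      // start index of the open chunk, or -1 if none
        String type = null;  // type of the open chunk
        for (int i = 0; i <= tags.length; i++) {
            // A trailing "O" sentinel closes any chunk still open at the end.
            String tag = i < tags.length ? tags[i] : "O";
            boolean continues = start >= 0
                    && tag.startsWith("I-")
                    && tag.substring(2).equals(type);
            if (start >= 0 && !continues) {
                chunks.add(type + ":" + start + "-" + i);
                start = -1;
                type = null;
            }
            // "B-" always opens a chunk; a stray "I-" is treated leniently as an opener.
            if (start < 0 && (tag.startsWith("B-") || tag.startsWith("I-"))) {
                start = i;
                type = tag.substring(2);
            }
        }
        return chunks;
    }
}
```

In a real subclass, the grouping decision would presumably be driven by the `ChunkAction` values returned from `getChunkAction` rather than by string prefixes.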

Author: Tom Morton

Nested Class Summary
`static class`	`ChunkerAnnotator.ChunkAction`
`protected static class`	`ChunkerAnnotator.ChunkActionEnum`

Field Summary
`protected boolean`	`createNons` Set to true when non-chunks should be created along with regular chunks for active-learning.

Fields inherited from class org.annotation.wordfreak.annotator.SentenceProcessor

sentenceTypes

Fields inherited from class org.annotation.wordfreak.annotator.Annotator

annotationFilter, dataDirectory, DEFAULT_ANNOTATOR_NAME, files, guiListener, listeners, loaded, progress, trainingFilter

Constructor Summary
`ChunkerAnnotator(java.lang.String type)`

Method Summary
`protected void`	`endOfDocument()` This function is called after each document has been processed.
`protected abstract ChunkerAnnotator.ChunkAction`	`getChunkAction(java.lang.String tag, Annotation ann)` Determines the chunk action for the specified chunk tag.
`protected abstract java.lang.String[]`	`getChunkTags(java.lang.String[] toks, java.lang.String[] tags, java.lang.String[] pretags, double[] tprobs)` Computes a list of chunk tags which can be converted into chunk actions using `getChunkAction`.
`protected void`	`processDocument(Annotation document, double percentage)` Processes the specified document, which consists of the specified percentage of the total work to be performed by this annotator.
`void`	`processSentence(Annotation sentence, double percentage)` Processes the specified sentence, which consists of the specified percentage of the total work to be performed by this annotator.

Methods inherited from class org.annotation.wordfreak.annotator.SentenceProcessor

processParagraph

Methods inherited from class org.annotation.wordfreak.annotator.DocumentProcessor

annotating

Methods inherited from class org.annotation.wordfreak.annotator.Annotator

actionPerformed, addAnnotatorListener, annotate, annotatedFile, closeAnnotatedFile, done, getDataDirectory, hideWaitDialog, loadAnnotator, loaded, removeAnnotatorListener, setAnnotationFilter, setDataDirectory, setGuiListener, setProgress, setTrainingFilter, showWaitDialog, sortedOutcomes, supportsTraining, train, training, updateProgress

Methods inherited from class java.lang.Object

clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Detail

createNons

protected boolean createNons

Set to true when non-chunks should be created along with regular chunks for active-learning.

Constructor Detail

ChunkerAnnotator

public ChunkerAnnotator(java.lang.String type)

Parameters:
    type - Unused parameter.

Method Detail

processDocument

protected void processDocument(Annotation document,
                               double percentage)

Description copied from class: DocumentProcessor

Processes the specified document, which consists of the specified percentage of the total work to be performed by this annotator.

Overrides: processDocument in class ParagraphProcessor

processSentence

public void processSentence(Annotation sentence,
                            double percentage)

Description copied from class: SentenceProcessor

Processes the specified sentence, which consists of the specified percentage of the total work to be performed by this annotator.

Specified by: processSentence in class SentenceProcessor

Parameters:
    sentence - The sentence to be annotated.
    percentage - The percentage of work this sentence represents.

getChunkTags

protected abstract java.lang.String[] getChunkTags(java.lang.String[] toks,
                                                   java.lang.String[] tags,
                                                   java.lang.String[] pretags,
                                                   double[] tprobs)

Computes a list of chunk tags which can be converted into chunk actions using getChunkAction.

Parameters:
    toks - The tokens to be chunked.
    tags - The POS tags of the tokens.
    pretags - An array containing chunk tags which should be maintained. A null value for a particular entry indicates that no pre-tag needs to be maintained for that token; passing null for the array indicates that no pre-tags are to be maintained.
    tprobs - The chunk tag probabilities for the returned chunk tags. This array is populated by this function.
Returns: A chunk tag for each token in toks.
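
The parameter contract above can be sketched as a standalone method. This is only an illustration under stated assumptions: `PretagSketch` and `chunkTags` are hypothetical names, the noun rule is a toy stand-in for a trained model, the probability values are placeholders, and `tprobs` is assumed to be non-null with one slot per token.

```java
/** Hypothetical standalone sketch; not part of WordFreak. */
final class PretagSketch {

    /**
     * Returns one chunk tag per token and fills tprobs with one value per tag.
     * Assumption: tprobs is non-null and has the same length as toks.
     */
    static String[] chunkTags(String[] toks, String[] tags,
                              String[] pretags, double[] tprobs) {
        String[] out = new String[toks.length];
        for (int i = 0; i < toks.length; i++) {
            if (pretags != null && pretags[i] != null) {
                // A non-null pre-tag is maintained as given.
                out[i] = pretags[i];
                tprobs[i] = 1.0; // assumption: a maintained tag is treated as certain
            } else {
                // Toy rule in place of a model: consecutive nouns form one NP chunk.
                boolean noun = tags[i].startsWith("NN");
                boolean prevInNp = i > 0 && out[i - 1].endsWith("-NP");
                out[i] = !noun ? "O" : (prevInNp ? "I-NP" : "B-NP");
                tprobs[i] = 0.5; // placeholder value, not a real model score
            }
        }
        return out;
    }
}
```

Passing `null` for `pretags` leaves every position to the tagger, while a single non-null entry pins only that token's tag.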

getChunkAction

protected abstract ChunkerAnnotator.ChunkAction getChunkAction(java.lang.String tag,
                                                               Annotation ann)

Determines the chunk action for the specified chunk tag.

Parameters:
    tag - The chunk tag.
    ann - The annotation the tag was applied to.
Returns: A chunk action with the appropriate action and tag fields.

endOfDocument

protected void endOfDocument()

This function is called after each document has been processed. It can be overridden to perform processing between documents, which is useful for the document-level tag caching that helps in named-entity detection.
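
As a rough illustration of document-level tag caching, the sketch below keeps a per-document map that is cleared when a document ends. `DocumentTagCacheSketch` and its methods are hypothetical and standalone; in a real subclass, the `clear()` call would presumably live in an `endOfDocument()` override.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical standalone sketch; not part of WordFreak. */
final class DocumentTagCacheSketch {

    // Maps text already tagged in the current document to the type it received.
    private final Map<String, String> seen = new HashMap<>();

    /** Records the type assigned to a piece of text in the current document. */
    void remember(String text, String type) {
        seen.put(text, type);
    }

    /** Returns the type previously assigned in this document, or null if none. */
    String lookup(String text) {
        return seen.get(text);
    }

    /** Mirrors the hook described above: drop per-document state between documents. */
    void endOfDocument() {
        seen.clear();
    }
}
```

Clearing the cache between documents keeps a name tagged in one document from influencing the tags chosen in the next.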