org.annotation.wordfreak.annotator
Class TokenAnnotator
java.lang.Object
org.annotation.wordfreak.annotator.Annotator
org.annotation.wordfreak.annotator.DocumentProcessor
org.annotation.wordfreak.annotator.ParagraphProcessor
org.annotation.wordfreak.annotator.SentenceProcessor
org.annotation.wordfreak.annotator.TokenAnnotator
- All Implemented Interfaces:
- java.awt.event.ActionListener, AnnotatedFileListener, java.util.EventListener, Plugin
- Direct Known Subclasses:
- SimpleTokenAnnotator
- public abstract class TokenAnnotator
- extends SentenceProcessor
Provides common functionality for automatic annotation of tokens. Most token annotators should
extend this class and implement its abstract methods.
Method Summary |
protected abstract double[] |
getTokProbs()
Returns a confidence associated with each token returned in the
most recent call to tokenize. |
protected abstract void |
initTraining()
Initializes annotator for training. |
protected void |
processSentence(Annotation sentence,
double percentage)
Processes the specified sentence, which constitutes the specified percentage of the total work to be performed by this annotator. |
protected abstract Span[] |
tokenize(java.lang.String text)
Returns the character offsets of the tokens in the text parameter. |
protected abstract void |
train()
Trains a model based on the tokens provided in previous calls to
trainWithTokens. |
void |
training(java.util.List files)
|
void |
training(java.lang.String[] files)
|
protected abstract void |
trainWithTokens(Span[] tokens,
java.lang.String text)
Uses the provided tokens to construct events for training the current
tokenizer model. |
Methods inherited from class org.annotation.wordfreak.annotator.Annotator |
actionPerformed, addAnnotatorListener, annotate, annotatedFile, closeAnnotatedFile, done, getDataDirectory, hideWaitDialog, loadAnnotator, loaded, removeAnnotatorListener, setAnnotationFilter, setDataDirectory, setGuiListener, setProgress, setTrainingFilter, showWaitDialog, sortedOutcomes, supportsTraining, train, updateProgress |
Methods inherited from class java.lang.Object |
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
TokenAnnotator
public TokenAnnotator(java.lang.String type)
tokenize
protected abstract Span[] tokenize(java.lang.String text)
- Returns the character offsets of the tokens in the text parameter.
- Parameters:
text
- the string to be tokenized. Typically a sentence.
- Returns:
- character offsets into text which delimit the tokens
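A minimal sketch of what a tokenize implementation might look like, assuming a purely whitespace-based strategy. DemoSpan is a hypothetical stand-in for the library's Span type, and treating offsets as half-open [start, end) ranges is an assumption, not something this page confirms. A real subclass would override the protected tokenize method instead of using a static helper.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the library's Span type.
// Assumption: offsets are half-open, i.e. [start, end).
final class DemoSpan {
    final int start;
    final int end;

    DemoSpan(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Returns the substring of text covered by this span.
    String coveredText(String text) {
        return text.substring(start, end);
    }
}

final class WhitespaceTokenizerSketch {
    // Returns one span per maximal run of non-whitespace characters.
    static DemoSpan[] tokenize(String text) {
        List<DemoSpan> spans = new ArrayList<>();
        int tokenStart = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < text.length(); i++) {
            boolean whitespace = Character.isWhitespace(text.charAt(i));
            if (!whitespace && tokenStart < 0) {
                tokenStart = i; // a new token begins here
            } else if (whitespace && tokenStart >= 0) {
                spans.add(new DemoSpan(tokenStart, i)); // the token ended before i
                tokenStart = -1;
            }
        }
        if (tokenStart >= 0) {
            spans.add(new DemoSpan(tokenStart, text.length())); // token runs to end of text
        }
        return spans.toArray(new DemoSpan[0]);
    }

    public static void main(String[] args) {
        String sentence = "Tokens are offsets.";
        for (DemoSpan span : tokenize(sentence)) {
            System.out.println(span.start + "-" + span.end + " " + span.coveredText(sentence));
        }
    }
}
```

Returning offsets rather than substrings lets the caller map each token back to its exact position in the sentence.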
getTokProbs
protected abstract double[] getTokProbs()
- Returns a confidence associated with each token returned in the
most recent call to tokenize.
- Returns:
- array of confidences associated with each token returned
in the most recent call to tokenize.
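The contract above implies that the confidence array runs parallel to the spans from the latest tokenize call. The following sketch illustrates one way to keep the two in step. The class name, the {start, end} pair representation, and the punctuation-based confidence heuristic are all hypothetical placeholders, and the sketch does not extend the real TokenAnnotator.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not part of the WordFreak API.
final class ConfidenceTokenizerSketch {
    // Confidences recorded by the most recent tokenize call.
    private double[] lastProbs = new double[0];

    // Splits on single spaces and returns {start, end} pairs (end exclusive).
    // Records one confidence per returned token as a side effect.
    int[][] tokenize(String text) {
        List<int[]> spans = new ArrayList<>();
        List<Double> probs = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= text.length(); i++) {
            if (i == text.length() || text.charAt(i) == ' ') {
                if (i > start) {
                    spans.add(new int[] {start, i});
                    // Placeholder heuristic (an assumption): a token that ends in
                    // punctuation might need further splitting, so trust it less.
                    boolean cleanEnding = Character.isLetterOrDigit(text.charAt(i - 1));
                    probs.add(cleanEnding ? 1.0 : 0.5);
                }
                start = i + 1;
            }
        }
        lastProbs = new double[probs.size()];
        for (int k = 0; k < lastProbs.length; k++) {
            lastProbs[k] = probs.get(k);
        }
        return spans.toArray(new int[0][]);
    }

    // Returns a copy so callers cannot alter the recorded confidences.
    double[] getTokProbs() {
        return lastProbs.clone();
    }
}
```

Because each tokenize call overwrites the stored confidences, getTokProbs should be read before the next sentence is tokenized.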
initTraining
protected abstract void initTraining()
- Initializes annotator for training.
trainWithTokens
protected abstract void trainWithTokens(Span[] tokens,
java.lang.String text)
- Uses the provided tokens to construct events for training the current
tokenizer model.
- Parameters:
tokens
- character offsets into text which are tokens to be
used for training.
text
- the string to which the offsets specified in tokens refer.
train
protected abstract void train()
- Trains a model based on the tokens provided in previous calls to
trainWithTokens.
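The descriptions of initTraining, trainWithTokens, and train suggest a three-step lifecycle: reset state, accumulate events from already-tokenized text, then build a model. The sketch below illustrates that order with a toy model. The class, its character-count "model", and boundaryProbabilityAfter are hypothetical, and how the real training methods drive these calls is an assumption this page does not confirm.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the training lifecycle; not the WordFreak implementation.
final class TrainingLifecycleSketch {
    // For each character: {times a token ended right after it, times it was seen}.
    private Map<Character, int[]> counts = new HashMap<>();
    // Built by train(); maps a character to the observed rate of token ends after it.
    private Map<Character, Double> model = new HashMap<>();

    // Step 1: discard any previously collected events and model.
    void initTraining() {
        counts = new HashMap<>();
        model = new HashMap<>();
    }

    // Step 2: turn one tokenized text into events.
    // tokens holds {start, end} pairs with end exclusive (an assumption).
    void trainWithTokens(int[][] tokens, String text) {
        boolean[] endsToken = new boolean[text.length()];
        for (int[] token : tokens) {
            if (token[1] > token[0]) {
                endsToken[token[1] - 1] = true; // last character of the token
            }
        }
        for (int i = 0; i < text.length(); i++) {
            int[] c = counts.computeIfAbsent(text.charAt(i), key -> new int[2]);
            if (endsToken[i]) {
                c[0]++;
            }
            c[1]++;
        }
    }

    // Step 3: build the model from all events gathered so far.
    void train() {
        model = new HashMap<>();
        for (Map.Entry<Character, int[]> entry : counts.entrySet()) {
            int[] c = entry.getValue();
            model.put(entry.getKey(), (double) c[0] / c[1]);
        }
    }

    // Returns the trained rate, or 0.0 for characters never seen in training.
    double boundaryProbabilityAfter(char ch) {
        return model.getOrDefault(ch, 0.0);
    }
}
```

A caller would invoke initTraining once, trainWithTokens once per annotated text, and train at the end, so that the model reflects every collected event.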
training
public void training(java.util.List files)
training
public void training(java.lang.String[] files)
- Overrides:
training
in class Annotator
processSentence
protected void processSentence(Annotation sentence,
double percentage)
- Description copied from class:
SentenceProcessor
- Processes the specified sentence, which constitutes the specified percentage of the total work to be performed by this annotator.
- Specified by:
processSentence
in class SentenceProcessor
- Parameters:
sentence
- The sentence to be annotated.
percentage
- The percentage of work this sentence represents.
Copyright © 2004 Thomas Morton and Jeremy LaCivita. All Rights Reserved.