de.danielnaber.languagetool.tagging
Interface Tagger

All Known Implementing Classes:
BaseTagger, CzechTagger, DanishTagger, DemoTagger, DutchTagger, EnglishTagger, FrenchTagger, GalicianTagger, GermanTagger, ItalianTagger, PolishTagger, RomanianTagger, RussianTagger, SlovakTagger, SpanishTagger, SwedishTagger, UkrainianMorfoTagger, UkrainianMyspellTagger, UkrainianTagger

public interface Tagger

The part-of-speech tagger interface, whose implementations are usually language-dependent.

Author:
Daniel Naber

Method Summary
 AnalyzedTokenReadings createNullToken(String token, int startPos)
          Create the AnalyzedTokenReadings used for whitespace and other non-words.
 AnalyzedToken createToken(String token, String posTag)
          Create a token specific to the language of the implementing class.
 List<AnalyzedTokenReadings> tag(List<String> sentenceTokens)
          Returns a list of AnalyzedTokenReadings that gives each token in the sentence its part-of-speech information (possibly more than one tag per token).
 

Method Detail

tag

List<AnalyzedTokenReadings> tag(List<String> sentenceTokens)
                                throws IOException
Returns a list of AnalyzedTokenReadings that gives each token in the sentence its part-of-speech information (possibly more than one tag per token).

Note that this method takes exactly one sentence. Implementations may treat the first word of the sentence specially, since it is usually written with an uppercase letter.

Parameters:
sentenceTokens - the text as returned by a WordTokenizer but without whitespace tokens.
Throws:
IOException
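
The contract described above (one sentence in, one entry per token out, possibly several tags per token, optional special handling of the first word) can be illustrated with a small sketch. It does not use the real LanguageTool classes: Readings, LEXICON and ToyTaggerSketch are hypothetical stand-ins so that the example runs on its own, and returning an empty tag list for unknown words is an assumption of this sketch, not documented behaviour.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class ToyTaggerSketch {

    // Hypothetical stand-in for AnalyzedTokenReadings: one token, several possible POS tags.
    static final class Readings {
        final String token;
        final List<String> posTags;

        Readings(String token, List<String> posTags) {
            this.token = token;
            this.posTags = posTags;
        }
    }

    // Hypothetical toy lexicon; a real tagger would load a language-specific dictionary.
    static final Map<String, List<String>> LEXICON = new HashMap<>();
    static {
        LEXICON.put("the", Arrays.asList("DT"));
        LEXICON.put("dog", Arrays.asList("NN"));
        LEXICON.put("walks", Arrays.asList("NNS", "VBZ")); // ambiguous: noun or verb
    }

    // Tags exactly one sentence, given as word tokens without whitespace tokens.
    static List<Readings> tag(List<String> sentenceTokens) {
        List<Readings> result = new ArrayList<>();
        boolean first = true;
        for (String token : sentenceTokens) {
            List<String> tags = LEXICON.get(token);
            if (tags == null && first) {
                // Special case for the sentence-initial word: retry in lowercase.
                tags = LEXICON.get(token.toLowerCase(Locale.ROOT));
            }
            if (tags == null) {
                // Assumption of this sketch: unknown words get no tags at all.
                tags = Collections.emptyList();
            }
            result.add(new Readings(token, tags));
            first = false;
        }
        return result;
    }

    public static void main(String[] args) {
        for (Readings r : tag(Arrays.asList("The", "dog", "walks"))) {
            System.out.println(r.token + " -> " + r.posTags);
        }
    }
}
```

The lowercase retry is applied only to the first token, so a capitalized word in the middle of the sentence is still looked up as written.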

createNullToken

AnalyzedTokenReadings createNullToken(String token,
                                      int startPos)
Create the AnalyzedTokenReadings used for whitespace and other non-words. Use null as the POS tag for this token.
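
As a rough illustration of the null-POS-tag convention, the sketch below walks a token list that still contains whitespace, tracks the character offset of each token, and gives whitespace a null tag. Reading, annotate and the "UNKNOWN" placeholder tag are hypothetical names invented for this example; only the createNullToken(String, int) shape and the null POS tag come from the documentation above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class NullTokenSketch {

    // Hypothetical stand-in for a token reading: text, POS tag (may be null), start offset.
    static final class Reading {
        final String token;
        final String posTag;
        final int startPos;

        Reading(String token, String posTag, int startPos) {
            this.token = token;
            this.posTag = posTag;
            this.startPos = startPos;
        }
    }

    // Mirrors the documented contract: whitespace and other non-words get a null POS tag.
    static Reading createNullToken(String token, int startPos) {
        return new Reading(token, null, startPos);
    }

    // Walks all tokens (including whitespace) and records where each one starts.
    static List<Reading> annotate(List<String> allTokens) {
        List<Reading> out = new ArrayList<>();
        int pos = 0;
        for (String t : allTokens) {
            if (t.trim().isEmpty()) {
                out.add(createNullToken(t, pos));
            } else {
                // Placeholder tag; a real tagger would look the word up.
                out.add(new Reading(t, "UNKNOWN", pos));
            }
            pos += t.length();
        }
        return out;
    }

    public static void main(String[] args) {
        for (Reading r : annotate(Arrays.asList("A", " ", "dog"))) {
            System.out.println(r.startPos + " '" + r.token + "' " + r.posTag);
        }
    }
}
```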


createToken

AnalyzedToken createToken(String token,
                          String posTag)
Create a token specific to the language of the implementing class.



Copyright © 2005-2009 Daniel Naber