de.danielnaber.languagetool.tokenizers.pl
Class PolishSentenceTokenizer
Object
SentenceTokenizer
PolishSentenceTokenizer
- All Implemented Interfaces:
- Tokenizer
public class PolishSentenceTokenizer
- extends SentenceTokenizer
Tokenizes Polish text into sentences by looking for typical end-of-sentence markers,
while accounting for exceptions (e.g. abbreviations).
- Author:
- Marcin Milkowski
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PolishSentenceTokenizer
public PolishSentenceTokenizer()
- Create a sentence tokenizer.
setSingleLineBreaksMarksParagraph
public final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
- Overrides:
setSingleLineBreaksMarksParagraph in class SentenceTokenizer
- Parameters:
lineBreakParagraphs - if true, a single line break is assumed to end a paragraph;
if false, only two or more consecutive line breaks end a paragraph
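The difference between the two settings can be illustrated with a small standalone sketch. It does not use LanguageTool at all: the class ParagraphBreakSketch, the method splitParagraphs and both regular expressions are hypothetical names chosen for illustration, and the real tokenizer may detect paragraph ends differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Illustrative only: not LanguageTool code. Names and patterns are assumptions.
class ParagraphBreakSketch {
    // One or more line breaks count as a paragraph end.
    private static final Pattern SINGLE_BREAK = Pattern.compile("(?:\\r?\\n)+");
    // Only two or more consecutive line breaks count as a paragraph end.
    private static final Pattern DOUBLE_BREAK = Pattern.compile("(?:\\r?\\n){2,}");

    static List<String> splitParagraphs(String text, boolean lineBreakParagraphs) {
        Pattern boundary = lineBreakParagraphs ? SINGLE_BREAK : DOUBLE_BREAK;
        List<String> paragraphs = new ArrayList<>();
        for (String part : boundary.split(text)) {
            // Skip empty pieces produced by leading line breaks.
            if (!part.trim().isEmpty()) {
                paragraphs.add(part);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String text = "Pierwszy wiersz.\nDrugi wiersz.\n\nNowy akapit.";
        System.out.println(splitParagraphs(text, true).size());  // prints 3
        System.out.println(splitParagraphs(text, false).size()); // prints 2
    }
}
```

With true, the single line break after the first line already starts a new paragraph; with false, the first two lines stay together and only the blank line separates them from the third.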
tokenize
public final List<String> tokenize(String s)
- Description copied from class:
SentenceTokenizer
- Tokenize the given string into sentences.
- Specified by:
tokenize in interface Tokenizer
- Overrides:
tokenize in class SentenceTokenizer
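The general idea of splitting at end-of-sentence markers while skipping abbreviations can be sketched as follows. This is a simplified standalone illustration, not the algorithm used by PolishSentenceTokenizer: the class AbbreviationAwareSplitSketch, its split method and the four-entry abbreviation list are assumptions made for the example. Unlike a real tokenizer, the sketch also collapses whitespace between words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative only: not LanguageTool code. Names and data are assumptions.
class AbbreviationAwareSplitSketch {
    // A deliberately tiny, hypothetical list of Polish abbreviations.
    private static final Set<String> ABBREVIATIONS =
        new HashSet<>(Arrays.asList("np.", "prof.", "ul.", "tzn."));

    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // happens only for empty input
            }
            if (current.length() > 0) {
                current.append(' ');
            }
            current.append(token);
            char last = token.charAt(token.length() - 1);
            boolean endMarker = last == '.' || last == '!' || last == '?';
            // A known abbreviation ends with a period but does not end the sentence.
            boolean abbreviation = ABBREVIATIONS.contains(token.toLowerCase(Locale.ROOT));
            if (endMarker && !abbreviation) {
                sentences.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            sentences.add(current.toString()); // trailing text without a marker
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "To jest prof. Kowalski. Mieszka przy ul. Polnej! Prawda?";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

For the sample text, the periods after "prof." and "ul." do not end a sentence, so the sketch returns three sentences rather than five.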
Copyright © 2005-2007 Daniel Naber