A language model that provides information about ngram probabilities.
The algorithm of a language model, independent of how the data is stored (see subclasses for that).
The BerkeleyLM language model.
Just for testing - at least with the pre-built language models from http://tomato.banatao.berkeley.edu:8080/berkeleylm_binaries/, it does not seem possible to get occurrence counts for sentence start symbols, making this unusable (https://github.com/adampauls/berkeleylm/issues/26).
Information about ngram occurrences, taken from Lucene indexes (one index per ngram level).
Produces zero probability for any passed text.
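A zero-probability model of this kind might look like the following minimal sketch (the class and method names are assumptions, not taken from the actual code):

```java
// Hypothetical sketch: a language model that assigns probability zero
// to every passed text, e.g. as a neutral baseline or placeholder.
class ZeroProbabilityModel {
    // Returns 0.0 regardless of the ngram passed in.
    double probability(String[] ngram) {
        return 0.0;
    }
}
```

Such a stub is handy in tests or as a fallback when no real model data is available.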
Combines the results of several language models.
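One way such a combiner could work is equal-weight linear interpolation over the component models; the sketch below assumes a simple probability interface and is not taken from the actual code:

```java
import java.util.List;

// Hypothetical model interface: any source of ngram probabilities.
interface NgramModel {
    double probability(String[] ngram);
}

// Hypothetical combiner: averages the probabilities reported by
// several underlying models (equal-weight linear interpolation;
// the real combination strategy may differ).
class CombinedModel implements NgramModel {
    private final List<NgramModel> models;

    CombinedModel(List<NgramModel> models) {
        this.models = models;
    }

    @Override
    public double probability(String[] ngram) {
        if (models.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (NgramModel m : models) {
            sum += m.probability(ngram);
        }
        return sum / models.size();
    }
}
```

A weighted variant would multiply each component probability by a per-model weight before summing.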