public abstract class PartialPosTagFilter extends RuleFilter
no: the position of the matching token to be considered, as an integer. Starts with 1.
regexp: the regular expression that specifies the part of the token to be considered. For example, (?:in|un)(.*) will consider the part of the token that comes after 'in' or 'un'. Note that the first group is always the one considered, so if you need more parentheses you need to use non-capturing groups (?:...), as in the example.
postag_regexp: a regular expression to match the POS tag of the part of the word, e.g. VB.? to match any verb in English.
negate_postag: if the value is yes, the postag_regexp match is negated (not negated if not specified).
two_groups_regexp: if the value is yes, the regexp must contain 2 groups (if not specified, 1 group).
prefix: a prefix string that is added to the token (since 5.0).
suffix: a suffix string that is added to the token (since 5.0).
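The regexp argument can be illustrated with plain java.util.regex, independent of LanguageTool. The following is a minimal sketch, not the filter's actual implementation: the class PartialTokenDemo and the helper extractPart are hypothetical names, and requiring the whole token to match the regexp is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of LanguageTool.
class PartialTokenDemo {

    // Returns the text captured by the first group if the whole token
    // matches the regexp, or null otherwise. Whole-token matching is an
    // assumption made for this sketch.
    static String extractPart(String regexp, String token) {
        Matcher m = Pattern.compile(regexp).matcher(token);
        if (m.matches() && m.groupCount() >= 1) {
            return m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        // (?:in|un) is non-capturing, so group 1 is the remainder of the token.
        System.out.println(extractPart("(?:in|un)(.*)", "undo"));   // do
        System.out.println(extractPart("(?:in|un)(.*)", "happy"));  // null
        // A postag_regexp such as VB.? accepts VB, VBZ, VBD, and so on.
        System.out.println(Pattern.matches("VB.?", "VBZ"));         // true
        System.out.println(Pattern.matches("VB.?", "NN"));          // false
    }
}
```

Writing the example as ((in|un))(.*) instead would make group 1 capture the 'in' or 'un' itself, which is why the non-capturing form is needed.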
Methods inherited from class RuleFilter: getOptional, getRequired, matches
public RuleMatch acceptRuleMatch(RuleMatch match, Map<String,String> args, int patternTokenPos, AnalyzedTokenReadings[] patternTokens)
Returns the original rule match or a modified one, or null if the rule match is filtered out.
Parameters:
args - the resolved arguments from the args attribute in the XML. Resolved means that e.g. \1 has been resolved to the actual string at that match position.
patternTokens - those tokens of the text that correspond to the matched pattern
Returns:
null if this rule match should be removed, or any other RuleMatch (e.g. the one from the arguments) that properly describes the detected error
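The return contract (the match itself to keep it, null to drop it) can be sketched without the LanguageTool classes. Everything in the block below is a stand-in assumed for illustration: FilterDecisionSketch, its accept method, the plain String tokens, and the tagger function are hypothetical, and the decision logic is only one plausible reading of the arguments described for this class. The regexp is assumed to contain a capturing group.

```java
import java.util.Map;
import java.util.function.Function;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in; the real method works on RuleMatch and
// AnalyzedTokenReadings rather than on generic values and strings.
class FilterDecisionSketch {

    // Returns the match unchanged to keep it, or null to filter it out.
    static <M> M accept(M match, Map<String, String> args, String[] tokens,
                        Function<String, String> tagger) {
        int no = Integer.parseInt(args.get("no"));      // position starts with 1
        String token = tokens[no - 1];
        Matcher m = Pattern.compile(args.get("regexp")).matcher(token);
        if (!m.matches()) {
            return null;                                // token has no such part
        }
        String tag = tagger.apply(m.group(1));          // tag only the partial token
        boolean tagMatches = tag != null && tag.matches(args.get("postag_regexp"));
        boolean negate = "yes".equals(args.get("negate_postag"));
        return (tagMatches != negate) ? match : null;
    }

    public static void main(String[] argv) {
        Map<String, String> lexicon = Map.of("do", "VB");  // toy tagger data
        Map<String, String> args =
            Map.of("no", "1", "regexp", "(?:in|un)(.*)", "postag_regexp", "VB.?");
        // "undo" -> "do" -> VB, which matches VB.?, so the match is kept.
        System.out.println(accept("match", args, new String[] {"undo"}, lexicon::get));
    }
}
```

In this sketch a caller only needs a null check on the result to decide whether the rule match survives.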