public class DefaultFulltextParser extends Object implements FulltextParser
The regexp used to split text into words can be configured using the system property "org.nuxeo.fulltext.wordsplit". The default is "[\\s\\p{Punct}]+" (shown as a Java string literal).
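As an illustration of the configuration described above, here is a minimal sketch of how a splitting regexp could be resolved from the system property, falling back to the default. It is not the Nuxeo implementation: the class `WordSplitSketch` and its methods are hypothetical, and the assumption that the two constants hold the property name and the default regexp is inferred from the description.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch class, not part of Nuxeo.
class WordSplitSketch {

    // Assumed values, taken from the class description.
    static final String WORD_SPLIT_PROP = "org.nuxeo.fulltext.wordsplit";
    static final String WORD_SPLIT_DEF = "[\\s\\p{Punct}]+";

    // Use the system property if it is set, otherwise the default regexp.
    static Pattern wordSplitPattern() {
        return Pattern.compile(System.getProperty(WORD_SPLIT_PROP, WORD_SPLIT_DEF));
    }

    // Split text on runs of whitespace and ASCII punctuation (with the default regexp).
    static List<String> split(String text) {
        return Arrays.asList(wordSplitPattern().split(text));
    }

    public static void main(String[] args) {
        // prints [Hello, world, full, text, search] when the property is not set
        System.out.println(split("Hello, world: full-text search"));
    }
}
```

In this sketch, starting the JVM with `-Dorg.nuxeo.fulltext.wordsplit=\s+` would split on whitespace only, so hyphenated words would stay whole.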
Modifier and Type | Field and Description |
---|---|
static String | WORD_SPLIT_DEF |
static String | WORD_SPLIT_PROP |
Constructor and Description |
---|
DefaultFulltextParser() |
public static final String WORD_SPLIT_PROP
public static final String WORD_SPLIT_DEF
public String parse(String s, String path)
Description copied from interface: FulltextParser
The passed path may be null if the passed string is not coming from a specific path, for instance when it was extracted from binary data.
Specified by: parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
public void parse(String s, String path, List<String> strings)
Like FulltextParser.parse(String, String) but uses the passed list to accumulate words.
The default implementation normalizes text to lowercase and removes punctuation.
Subclasses can override this behavior.
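The behavior described above (lowercasing, dropping punctuation, accumulating words into a caller-supplied list) can be sketched as follows. This is an illustrative approximation under stated assumptions, not the actual Nuxeo code: the class `WordAccumulatorSketch` is hypothetical, the method is static for brevity, and the null handling and the use of `Locale.ROOT` are choices made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch class, not part of Nuxeo.
class WordAccumulatorSketch {

    // Default splitting regexp from the class description.
    static final String WORD_SPLIT = "[\\s\\p{Punct}]+";

    // Splits s into words, lowercases them, and appends them to strings.
    // path is unused in this sketch; it may be null.
    static void parse(String s, String path, List<String> strings) {
        if (s == null) {
            return; // assumption: a null input contributes no words
        }
        for (String word : s.split(WORD_SPLIT)) {
            if (!word.isEmpty()) { // a leading delimiter yields an empty first token
                strings.add(word.toLowerCase(Locale.ROOT));
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>();
        parse("Hello, World!", null, words);
        parse("  Foo-bar", "dc:title", words);
        // prints [hello, world, foo, bar]
        System.out.println(words);
    }
}
```

Passing the list in lets a caller collect words from several property values into one accumulator, as the two calls in `main` show.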
Specified by: parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
strings - the list into which normalized words should be accumulated

Copyright © 2015 Nuxeo SA. All rights reserved.