public class DefaultFulltextParser extends Object implements FulltextParser
The regexp used to split words can be configured using the system property "org.nuxeo.fulltext.wordsplit". The default is "[\\s\\p{Punct}]+".
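As a rough illustration of what the configured regexp does, the sketch below splits a string with the default pattern, falling back to it when the system property is not set. It uses only `java.util.regex`. The class `WordSplitSketch` and its `split` method are hypothetical names, and the use of `Pattern.split` is an assumption, not necessarily how the Nuxeo class is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class WordSplitSketch {
    // Property name and default regexp as quoted in the class description.
    static final String WORD_SPLIT_PROP = "org.nuxeo.fulltext.wordsplit";
    static final String WORD_SPLIT_DEF = "[\\s\\p{Punct}]+";

    // Hypothetical helper: split text with the configured (or default) regexp.
    static List<String> split(String text) {
        String regexp = System.getProperty(WORD_SPLIT_PROP, WORD_SPLIT_DEF);
        List<String> words = new ArrayList<>();
        for (String word : Pattern.compile(regexp).split(text)) {
            // A leading delimiter yields an empty first element; skip it.
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // With the default regexp this prints [Hello, world, Full, text, search]
        System.out.println(split("Hello, world! Full-text search."));
    }
}
```

Starting the JVM with `-Dorg.nuxeo.fulltext.wordsplit=\\s+` would make this sketch split on whitespace only, so punctuation would stay attached to the words.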
Modifier and Type | Field and Description |
---|---|
static String | WORD_SPLIT_DEF |
static String | WORD_SPLIT_PROP |
Constructor and Description |
---|
DefaultFulltextParser() |
public static final String WORD_SPLIT_PROP
public static final String WORD_SPLIT_DEF
public String parse(String s, String path)
Description copied from interface: FulltextParser
The passed path may be null if the passed string does not come from a specific path, for instance when it was extracted from binary data.
Specified by:
parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
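To make "abstracted path" concrete, here is a minimal sketch that replaces numeric list indexes in a property path with `*`. The slash-separated path format, the example path `files/0/file/name`, and the helper `abstractPath` are assumptions made for illustration; the page itself does not say how callers build the path.

```java
public class AbstractedPathSketch {
    // Hypothetical helper: replace each all-digit path segment with "*".
    static String abstractPath(String concretePath) {
        // "/\\d+" matches a numeric segment; the lookahead requires that the
        // segment ends at the next "/" or at the end of the path.
        return concretePath.replaceAll("/\\d+(?=/|$)", "/*");
    }

    public static void main(String[] args) {
        System.out.println(abstractPath("files/0/file/name")); // files/*/file/name
    }
}
```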
public void parse(String s, String path, List<String> strings)
Like FulltextParser.parse(String, String) but uses the passed list to accumulate words.
The default implementation normalizes text to lowercase and removes punctuation.
This can be subclassed.
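The behaviour described here (lowercase, strip punctuation, accumulate into the passed list, open to subclassing) can be sketched without the Nuxeo classes. `SketchParser` below is a stand-in written for this example, not the real `DefaultFulltextParser`, and `ShortWordFilter` is a hypothetical subclass that drops words shorter than three characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

public class AccumulatingParserSketch {
    // Stand-in for the default behaviour described above (not the Nuxeo class).
    static class SketchParser {
        private static final Pattern WORD_SPLIT = Pattern.compile("[\\s\\p{Punct}]+");

        public void parse(String s, String path, List<String> strings) {
            for (String word : WORD_SPLIT.split(s)) {
                if (!word.isEmpty()) {
                    strings.add(word.toLowerCase(Locale.ROOT));
                }
            }
        }
    }

    // Hypothetical subclass: keep only words of at least three characters.
    static class ShortWordFilter extends SketchParser {
        @Override
        public void parse(String s, String path, List<String> strings) {
            List<String> all = new ArrayList<>();
            super.parse(s, path, all);
            for (String word : all) {
                if (word.length() >= 3) {
                    strings.add(word);
                }
            }
        }
    }

    // Convenience wrapper so the result can be inspected as a list.
    static List<String> words(SketchParser parser, String s) {
        List<String> out = new ArrayList<>();
        parser.parse(s, null, out); // null path: text not tied to a property
        return out;
    }

    public static void main(String[] args) {
        System.out.println(words(new SketchParser(), "Hello, World of Nuxeo!"));
        System.out.println(words(new ShortWordFilter(), "Hello, World of Nuxeo!"));
    }
}
```

Because the caller owns the list, several property values can be parsed into the same list in turn, which is the main practical difference from the two-argument `parse`.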
Specified by:
parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
strings - the list into which normalized words should be accumulated

Copyright © 2014 Nuxeo SA. All rights reserved.