public class DefaultFulltextParser extends Object implements FulltextParser
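The regular expression used to split text into words can be configured using the system property "org.nuxeo.fulltext.wordsplit". The default is "[\\s\\p{Punct}]+" (written as a Java string literal), which splits on runs of whitespace and punctuation.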
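As a rough illustration of how such a configurable split pattern behaves, the following self-contained sketch reads the system property and falls back to the documented default. The class name `WordSplitSketch` and its helper methods are hypothetical; only the property name and the default regular expression come from this page, and the way the real class resolves the property is an assumption.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class WordSplitSketch {
    // Property name and default value as documented on this page.
    static final String WORD_SPLIT_PROP = "org.nuxeo.fulltext.wordsplit";
    static final String WORD_SPLIT_DEF = "[\\s\\p{Punct}]+";

    // Assumption: the pattern is compiled from the system property,
    // falling back to the default when the property is not set.
    static Pattern wordSplitPattern() {
        return Pattern.compile(System.getProperty(WORD_SPLIT_PROP, WORD_SPLIT_DEF));
    }

    // Splits a string into words using the configured pattern.
    static String[] split(String s) {
        return wordSplitPattern().split(s);
    }

    public static void main(String[] args) {
        // With the default pattern this prints: [Hello, world, It, s, 2015]
        System.out.println(Arrays.toString(split("Hello, world! It's 2015.")));
    }
}
```

Because the apostrophe is punctuation, "It's" is split into two words with the default pattern. A different pattern can be tried by starting the JVM with `-Dorg.nuxeo.fulltext.wordsplit=...`.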
| Modifier and Type | Field and Description | 
|---|---|
| static String | WORD_SPLIT_DEF | 
| protected static Pattern | WORD_SPLIT_PATTERN | 
| static String | WORD_SPLIT_PROP | 
| Constructor and Description | 
|---|
| DefaultFulltextParser() | 
| Modifier and Type | Method and Description | 
|---|---|
| String | parse(String s, String path) Parses one property value to normalize the fulltext for the database. | 
| void | parse(String s, String path, List<String> strings) Parses one property value to normalize the fulltext for the database. | 
| protected String | preprocessField(String s, String path) Preprocesses one field at the given path. | 
| protected String | removeHtml(String s) | 
public static final String WORD_SPLIT_PROP
public static final String WORD_SPLIT_DEF
protected static final Pattern WORD_SPLIT_PATTERN
public DefaultFulltextParser()
public String parse(String s, String path)
Description copied from interface: FulltextParser
Parses one property value to normalize the fulltext for the database.
The passed path may be null if the passed string does not come from a specific path, for instance when it was extracted from binary data.
Specified by: parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
public void parse(String s, String path, List<String> strings)
 Like FulltextParser.parse(String, String) but uses the passed list to accumulate words.
 
The default implementation normalizes text to lowercase and removes punctuation.
This can be subclassed.
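The behavior described above (split into words, lowercase, accumulate into a list) can be sketched without any Nuxeo dependency. This is a minimal sketch, not the actual implementation: the class `NormalizeSketch` is hypothetical, the path argument is simply ignored, and joining the words with a single space in the two-argument variant is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Pattern;

class NormalizeSketch {
    // Default split pattern documented for DefaultFulltextParser.
    static final Pattern WORD_SPLIT = Pattern.compile("[\\s\\p{Punct}]+");

    // Hypothetical stand-in for parse(String, String, List<String>):
    // accumulates lowercased words into the passed list.
    static void parse(String s, String path, List<String> strings) {
        if (s == null) {
            return;
        }
        for (String word : WORD_SPLIT.split(s)) {
            // A leading separator produces one empty token; skip it.
            if (!word.isEmpty()) {
                strings.add(word.toLowerCase(Locale.ROOT));
            }
        }
    }

    // Hypothetical stand-in for parse(String, String).
    // Assumption: the normalized words are joined with a single space.
    static String parse(String s, String path) {
        List<String> strings = new ArrayList<>();
        parse(s, path, strings);
        return String.join(" ", strings);
    }
}
```

Passing the same list to several calls lets a caller collect the words of multiple property values before building the final fulltext string.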
Specified by: parse in interface FulltextParser
Parameters:
s - the string to be parsed and normalized
path - the abstracted path for the property (where all complex indexes have been replaced by *), or null
strings - the list into which normalized words should be accumulated
protected String preprocessField(String s, String path)
The path is unused for now.
protected String removeHtml(String s)
Copyright © 2015 Nuxeo SA. All rights reserved.