Class HyphenationCompoundWordTokenFilterFactory
java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.compound.HyphenationCompoundWordTokenFilterFactory
- All Implemented Interfaces:
- ResourceLoaderAware
public class HyphenationCompoundWordTokenFilterFactory
extends TokenFilterFactory
implements ResourceLoaderAware
Factory for HyphenationCompoundWordTokenFilter.
This factory accepts the following parameters:
- hyphenator (mandatory): path to the FOP XML hyphenation pattern file. See http://offo.sourceforge.net/hyphenation/.
- encoding (optional): encoding of the XML hyphenation file. Defaults to UTF-8.
- dictionary (optional): dictionary of words. Defaults to no dictionary.
- minWordSize (optional): minimum word length that gets decomposed. Defaults to 5.
- minSubwordSize (optional): minimum length of subwords. Defaults to 2.
- maxSubwordSize (optional): maximum length of subwords. Defaults to 15.
- onlyLongestMatch (optional): if true, adds only the longest matching subword to the stream. Defaults to false.
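To make the interplay of the size parameters and onlyLongestMatch more concrete, here is a minimal plain-Java sketch. It is an illustration only and not Lucene's implementation: the real filter picks candidate split points from the hyphenation patterns, whereas this toy version tries every start offset against a dictionary. The class name CompoundParamsSketch, the decompose method, and the sample words are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration class; it is not part of Lucene.
class CompoundParamsSketch {

    /**
     * Toy dictionary-based decomposition (an assumption for illustration).
     * Unlike the real filter, it ignores hyphenation patterns and simply
     * tests every substring within the configured size bounds.
     */
    static List<String> decompose(String word, Set<String> dictionary,
                                  int minWordSize, int minSubwordSize,
                                  int maxSubwordSize, boolean onlyLongestMatch) {
        List<String> subwords = new ArrayList<>();
        if (word.length() < minWordSize) {
            return subwords; // shorter than minWordSize: not decomposed
        }
        String lower = word.toLowerCase(Locale.ROOT);
        for (int start = 0; start + minSubwordSize <= lower.length(); start++) {
            String longest = null;
            int maxLen = Math.min(maxSubwordSize, lower.length() - start);
            for (int len = minSubwordSize; len <= maxLen; len++) {
                String candidate = lower.substring(start, start + len);
                if (!dictionary.contains(candidate)) {
                    continue;
                }
                if (onlyLongestMatch) {
                    longest = candidate; // len grows, so the last hit is the longest
                } else {
                    subwords.add(candidate);
                }
            }
            if (longest != null) {
                subwords.add(longest);
            }
        }
        return subwords;
    }

    public static void main(String[] args) {
        Set<String> dict = Set.of("fuss", "ball", "fussball", "pumpe");
        // Defaults from the parameter list: 5, 2, 15, false
        System.out.println(decompose("Fussballpumpe", dict, 5, 2, 15, false));
        // Same input, but keeping only the longest match per start offset
        System.out.println(decompose("Fussballpumpe", dict, 5, 2, 15, true));
    }
}
```

In this sketch, the first call yields [fuss, fussball, ball, pumpe], and the second yields [fussball, ball, pumpe] because "fuss" is dropped in favour of the longer match starting at the same offset. The token output of the actual filter can differ, since it depends on the hyphenation grammar in use.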
 <fieldType name="text_hyphncomp" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.HyphenationCompoundWordTokenFilterFactory" hyphenator="hyphenator.xml" encoding="UTF-8"
         dictionary="dictionary.txt" minWordSize="5" minSubwordSize="2" maxSubwordSize="15" onlyLongestMatch="false"/>
   </analyzer>
 </fieldType>
- Since:
- 3.1.0
- SPI Name (case-insensitive; for example, a service named 'htmlStrip' can also be looked up as 'htmlstrip'):
- "hyphenationCompoundWord"
Field Summary
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
- LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
Constructor Summary
Constructors
- Default ctor for compatibility with SPI
- Creates a new HyphenationCompoundWordTokenFilterFactory
Method Summary
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
- availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
- defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Details
NAME
SPI name
Constructor Details
HyphenationCompoundWordTokenFilterFactory
Creates a new HyphenationCompoundWordTokenFilterFactory
HyphenationCompoundWordTokenFilterFactory
public HyphenationCompoundWordTokenFilterFactory()
Default ctor for compatibility with SPI
Method Details
inform
- Specified by:
- inform in interface ResourceLoaderAware
- Throws:
- IOException
create
- Specified by:
- create in class TokenFilterFactory