Class ProtectedTermFilterFactory
java.lang.Object
  org.apache.lucene.analysis.AbstractAnalysisFactory
    org.apache.lucene.analysis.TokenFilterFactory
      org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
        org.apache.lucene.analysis.miscellaneous.ProtectedTermFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class ProtectedTermFilterFactory extends ConditionalTokenFilterFactory
Factory for a ProtectedTermFilter.

CustomAnalyzer example:

    Analyzer ana = CustomAnalyzer.builder()
        .withTokenizer("standard")
        .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
          .addTokenFilter("truncate", "prefixLength", "4")
          .addTokenFilter("lowercase")
        .endwhen()
        .build();

Solr example, in which conditional filters are specified via the wrappedFilters parameter - a comma-separated list of case-insensitive TokenFilter SPI names - and conditional filter args are specified via filterName.argName parameters:

    <fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
      </analyzer>
    </fieldType>

When using the wrappedFilters parameter, each filter name must be unique, so if you need to specify the same filter more than once, you must add case-insensitive unique '-id' suffixes (note that the '-id' suffix is stripped prior to SPI lookup), e.g.:

    <fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="synonymgraph-A,synonymgraph-B"
                synonymgraph-A.synonyms="synonyms-1.txt"
                synonymgraph-B.synonyms="synonyms-2.txt"/>
      </analyzer>
    </fieldType>

See related CustomAnalyzer.Builder.whenTerm(Predicate)
- Since:
- 7.4.0
- SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service).
- "protectedTerm"
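The wrappedFilters naming rules described above can be sketched in plain Java. This is an illustration only, not the Lucene implementation: the class WrappedFiltersSketch and its methods are hypothetical, and the separator values '.' and '-' are assumptions inferred from the filterName.argName and '-id' conventions in the Solr examples.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of the wrappedFilters conventions; not Lucene code. */
final class WrappedFiltersSketch {
  // Assumed values, inferred from "filterName.argName" and the '-id' suffix rule.
  static final char FILTER_ARG_SEPARATOR = '.';
  static final char FILTER_NAME_ID_SEPARATOR = '-';

  /** Strips an optional '-id' suffix to get the name that would be used for SPI lookup. */
  static String spiName(String wrappedName) {
    int idx = wrappedName.indexOf(FILTER_NAME_ID_SEPARATOR);
    return idx < 0 ? wrappedName : wrappedName.substring(0, idx);
  }

  /** Groups "filterName.argName" entries under each name listed in wrappedFilters. */
  static Map<String, Map<String, String>> groupArgs(String wrappedFilters, Map<String, String> args) {
    Map<String, Map<String, String>> byFilter = new LinkedHashMap<>();
    for (String raw : wrappedFilters.split(",")) {
      // Names are case-insensitive, so compare them in lower case.
      String name = raw.trim().toLowerCase(Locale.ROOT);
      if (name.isEmpty()) {
        continue;
      }
      if (byFilter.putIfAbsent(name, new LinkedHashMap<>()) != null) {
        throw new IllegalArgumentException("Duplicate wrapped filter name: " + name);
      }
    }
    for (Map.Entry<String, String> e : args.entrySet()) {
      int sep = e.getKey().indexOf(FILTER_ARG_SEPARATOR);
      if (sep < 0) {
        continue; // not a per-filter argument (e.g. ignoreCase, protected)
      }
      String filter = e.getKey().substring(0, sep).toLowerCase(Locale.ROOT);
      Map<String, String> filterArgs = byFilter.get(filter);
      if (filterArgs != null) {
        filterArgs.put(e.getKey().substring(sep + 1), e.getValue());
      }
    }
    return byFilter;
  }

  public static void main(String[] unused) {
    Map<String, String> args = new LinkedHashMap<>();
    args.put("ignoreCase", "true");
    args.put("synonymgraph-A.synonyms", "synonyms-1.txt");
    args.put("synonymgraph-B.synonyms", "synonyms-2.txt");
    Map<String, Map<String, String>> grouped = groupArgs("synonymgraph-A,synonymgraph-B", args);
    for (String name : grouped.keySet()) {
      System.out.println(name + " -> SPI name " + spiName(name) + ", args " + grouped.get(name));
    }
  }
}
```

Because names are compared in lower case, "synonymgraph-A" and "synonymgraph-a" would count as the same wrapped filter in this sketch, which mirrors the requirement that the '-id' suffixes be unique regardless of case.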
-
-
Field Summary
Fields
Modifier and Type  Field                     Description
static char        FILTER_ARG_SEPARATOR
static char        FILTER_NAME_ID_SEPARATOR
static String      NAME
static String      PROTECTED_TERMS
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                          Description
ProtectedTermFilterFactory()                         Default ctor for compatibility with SPI
ProtectedTermFilterFactory(Map<String,String> args)
-
Method Summary
Modifier and Type                 Method                                                              Description
protected ConditionalTokenFilter  create(TokenStream input, Function<TokenStream,TokenStream> inner)  Modify the incoming TokenStream with a ConditionalTokenFilter
void                              doInform(ResourceLoader loader)                                     Initialises this component with the corresponding ResourceLoader
CharArraySet                      getProtectedTerms()
boolean                           isIgnoreCase()
-
Methods inherited from class org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
create, inform, setInnerFilters
-
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
-
-
-
-
Field Detail
-
NAME
public static final String NAME
- See Also:
- Constant Field Values
-
PROTECTED_TERMS
public static final String PROTECTED_TERMS
- See Also:
- Constant Field Values
-
FILTER_ARG_SEPARATOR
public static final char FILTER_ARG_SEPARATOR
- See Also:
- Constant Field Values
-
FILTER_NAME_ID_SEPARATOR
public static final char FILTER_NAME_ID_SEPARATOR
- See Also:
- Constant Field Values
-
-
Method Detail
-
isIgnoreCase
public boolean isIgnoreCase()
-
getProtectedTerms
public CharArraySet getProtectedTerms()
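
As a rough illustration of how the protected term set and the ignoreCase flag interact, the following plain-Java sketch applies a wrapped transformation only to tokens that are not protected, which is the behaviour of ProtectedTermFilter. It is not Lucene code: ProtectedTermSketch and apply are hypothetical names, and a Set<String> stands in for CharArraySet.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;
import java.util.stream.Collectors;

/** Hypothetical sketch of protected-term gating; not Lucene code. */
final class ProtectedTermSketch {

  /** Applies {@code wrapped} to every token that is not in {@code protectedTerms}. */
  static List<String> apply(List<String> tokens, Set<String> protectedTerms,
                            boolean ignoreCase, UnaryOperator<String> wrapped) {
    // With ignoreCase, compare lower-cased forms on both sides.
    Set<String> lookup = ignoreCase
        ? protectedTerms.stream().map(t -> t.toLowerCase(Locale.ROOT)).collect(Collectors.toSet())
        : protectedTerms;
    List<String> out = new ArrayList<>();
    for (String token : tokens) {
      String key = ignoreCase ? token.toLowerCase(Locale.ROOT) : token;
      // Protected tokens pass through unchanged; all others go through the wrapped filters.
      out.add(lookup.contains(key) ? token : wrapped.apply(token));
    }
    return out;
  }

  public static void main(String[] args) {
    // Stand-in for wrappedFilters="truncate,lowercase" with truncate.prefixLength="4".
    UnaryOperator<String> truncateThenLowercase =
        t -> (t.length() > 4 ? t.substring(0, 4) : t).toLowerCase(Locale.ROOT);
    // prints [foob, LUCENE, solr]
    System.out.println(
        apply(List.of("Foobar", "LUCENE", "Solr"), Set.of("lucene"), true, truncateThenLowercase));
  }
}
```

With ignoreCase set to false in the same call, "LUCENE" would no longer match the protected entry "lucene" and would be truncated and lower-cased like the other tokens.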
-
create
protected ConditionalTokenFilter create(TokenStream input, Function<TokenStream,TokenStream> inner)
Description copied from class: ConditionalTokenFilterFactory
Modify the incoming TokenStream with a ConditionalTokenFilter
- Specified by:
create in class ConditionalTokenFilterFactory
-
doInform
public void doInform(ResourceLoader loader) throws IOException
Description copied from class: ConditionalTokenFilterFactory
Initialises this component with the corresponding ResourceLoader
- Overrides:
doInform in class ConditionalTokenFilterFactory
- Throws:
IOException
-
-