Class ShingleAnalyzerWrapper
- java.lang.Object
-
- org.apache.lucene.analysis.Analyzer
-
- org.apache.lucene.analysis.AnalyzerWrapper
-
- org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- All Implemented Interfaces:
Closeable, AutoCloseable
public final class ShingleAnalyzerWrapper extends AnalyzerWrapper
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer. A shingle is another name for a token-based n-gram.
- Since:
- 3.1
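To make "token-based n-gram" concrete, the following plain-Java sketch builds shingles from a list of tokens that has already been split. It does not use Lucene and is not how ShingleFilter is implemented; the class name ShingleSketch, the method shingles, and the single-space separator are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
public class ShingleSketch {

    /**
     * Returns every run of minSize..maxSize consecutive tokens,
     * joined with the given separator. Sizes are assumed to be at least 2.
     */
    static List<String> shingles(List<String> tokens, int minSize, int maxSize, String separator) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            // Emit shingles of increasing size that begin at this token.
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("please", "divide", "this", "sentence");
        // prints [please divide, please divide this, divide this, divide this sentence, this sentence]
        System.out.println(shingles(tokens, 2, 3, " "));
    }
}
```

Each output string is one shingle: a word-level n-gram, as opposed to a character-level one.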
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
-
Constructor Summary
Constructors
Constructor - Description
ShingleAnalyzerWrapper() - Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(int minShingleSize, int maxShingleSize) - Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, String fillerToken) - Creates a new ShingleAnalyzerWrapper
-
Method Summary
Modifier and Type, Method - Description
String getFillerToken()
int getMaxShingleSize() - The max shingle (token ngram) size
int getMinShingleSize() - The min shingle (token ngram) size
String getTokenSeparator()
Analyzer getWrappedAnalyzer(String fieldName)
boolean isOutputUnigrams()
boolean isOutputUnigramsIfNoShingles()
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
-
Methods inherited from class org.apache.lucene.analysis.AnalyzerWrapper
attributeFactory, createComponents, getOffsetGap, getPositionIncrementGap, initReader, initReaderForNormalization, normalize, wrapReader, wrapReaderForNormalization, wrapTokenStreamForNormalization
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getReuseStrategy, normalize, tokenStream, tokenStream
-
-
-
-
Constructor Detail
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, String fillerToken)
Creates a new ShingleAnalyzerWrapper
- Parameters:
delegate - Analyzer whose TokenStream is to be filtered
minShingleSize - Min shingle (token ngram) size
maxShingleSize - Max shingle (token ngram) size
tokenSeparator - Used to separate input stream tokens in output shingles
outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false for those times when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, then unigrams are always output, regardless of whether any shingles are available.
fillerToken - Filler token to use when positionIncrement is more than 1
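The interaction between outputUnigrams and outputUnigramsIfNoShingles can be easier to follow as code. The sketch below models only the rule described in the parameter documentation, in plain Java without Lucene. The class UnigramOptionsSketch and its analyze method are hypothetical, the ordering of unigrams relative to shingles is an assumption, and fillerToken and position increments are not modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
public class UnigramOptionsSketch {

    /** Models the documented unigram rules; shingle sizes are assumed to be at least 2. */
    static List<String> analyze(List<String> tokens, int minSize, int maxSize, String separator,
                                boolean outputUnigrams, boolean outputUnigramsIfNoShingles) {
        // Shingles exist only when the input has at least minSize tokens.
        boolean shinglesAvailable = tokens.size() >= minSize;
        // outputUnigrams == true always wins; otherwise unigrams appear only as a fallback.
        boolean emitUnigrams = outputUnigrams || (outputUnigramsIfNoShingles && !shinglesAvailable);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // One token, minSize 2: no shingles are possible.
        System.out.println(analyze(List.of("hello"), 2, 2, " ", false, false)); // prints []
        System.out.println(analyze(List.of("hello"), 2, 2, " ", false, true));  // prints [hello]
        // Two tokens: a shingle exists, so the fallback does not apply.
        System.out.println(analyze(List.of("a", "b"), 2, 2, " ", false, true)); // prints [a b]
        System.out.println(analyze(List.of("a", "b"), 2, 2, " ", true, false)); // prints [a, a b, b]
    }
}
```

The fallback matters for short inputs such as single-word queries, which would otherwise produce no tokens at all when outputUnigrams is false.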
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper()
Wraps StandardAnalyzer.
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(int minShingleSize, int maxShingleSize)
Wraps StandardAnalyzer.
-
-
Method Detail
-
getMaxShingleSize
public int getMaxShingleSize()
The max shingle (token ngram) size
- Returns:
- The max shingle (token ngram) size
-
getMinShingleSize
public int getMinShingleSize()
The min shingle (token ngram) size
- Returns:
- The min shingle (token ngram) size
-
getTokenSeparator
public String getTokenSeparator()
-
isOutputUnigrams
public boolean isOutputUnigrams()
-
isOutputUnigramsIfNoShingles
public boolean isOutputUnigramsIfNoShingles()
-
getFillerToken
public String getFillerToken()
-
getWrappedAnalyzer
public final Analyzer getWrappedAnalyzer(String fieldName)
- Specified by:
getWrappedAnalyzer in class AnalyzerWrapper
-
wrapComponents
protected Analyzer.TokenStreamComponents wrapComponents(String fieldName, Analyzer.TokenStreamComponents components)
- Overrides:
wrapComponents in class AnalyzerWrapper
-
-