Package org.apache.lucene.analysis.cjk
Class CJKAnalyzer
java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
org.apache.lucene.analysis.cjk.CJKAnalyzer
All Implemented Interfaces:
Closeable, AutoCloseable

An Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter.

Since:
3.1
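
The chain described above can be illustrated without Lucene on the classpath. The following is a minimal sketch in plain Java, not Lucene's implementation: the class `CjkChainSketch` and its methods are hypothetical names, tokenization is reduced to splitting on non-letter characters, "CJK" is approximated by four Unicode scripts, and the width-folding step is left out. It shows only the order of the remaining steps: lowercase, form overlapping bigrams over each CJK run, then drop stopwords.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class CjkChainSketch {
    // Assumption for this sketch: Han, Hiragana, Katakana and Hangul count as CJK.
    static boolean isCjk(int cp) {
        Character.UnicodeScript s = Character.UnicodeScript.of(cp);
        return s == Character.UnicodeScript.HAN
            || s == Character.UnicodeScript.HIRAGANA
            || s == Character.UnicodeScript.KATAKANA
            || s == Character.UnicodeScript.HANGUL;
    }

    // Lowercase, split into CJK runs and non-CJK words, bigram the runs, filter stopwords.
    static List<String> analyze(String text, Set<String> stopwords) {
        int[] cps = text.toLowerCase(Locale.ROOT).codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < cps.length) {
            int start = i;
            if (isCjk(cps[i])) {
                while (i < cps.length && isCjk(cps[i])) i++;
                addBigrams(cps, start, i, tokens);
            } else if (Character.isLetterOrDigit(cps[i])) {
                while (i < cps.length && Character.isLetterOrDigit(cps[i]) && !isCjk(cps[i])) i++;
                tokens.add(new String(cps, start, i - start));
            } else {
                i++; // skip whitespace and punctuation
            }
        }
        // Stopword removal runs last, mirroring the order in the class description.
        tokens.removeIf(stopwords::contains);
        return tokens;
    }

    // Emit overlapping pairs; a run of length one is emitted as a single character.
    static void addBigrams(int[] cps, int start, int end, List<String> out) {
        if (end - start == 1) {
            out.add(new String(cps, start, 1));
            return;
        }
        for (int j = start; j + 1 < end; j++) {
            out.add(new String(cps, j, 2));
        }
    }

    public static void main(String[] args) {
        // The escapes spell a three-ideograph Han run.
        System.out.println(analyze("The Lucene \u65E5\u672C\u8A9E", Set.of("the")));
    }
}
```

For the input in `main`, the sketch drops "the", keeps "lucene", and turns the three-ideograph run into two overlapping two-character tokens. Overlapping bigrams are a common choice for CJK text because word boundaries are not marked by spaces.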

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer:
Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents

Field Summary

Fields:
DEFAULT_STOPWORD_FILE    File containing default CJK stopwords.

Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase:
stopwords

Fields inherited from class org.apache.lucene.analysis.Analyzer:
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY

Constructor Summary

Constructor                            Description
CJKAnalyzer()                          Builds an analyzer which removes words in getDefaultStopSet().
CJKAnalyzer(CharArraySet stopwords)    Builds an analyzer with the given stop words.

Method Summary

Modifier and Type                           Method                                         Description
protected Analyzer.TokenStreamComponents    createComponents(String fieldName)
static CharArraySet                         getDefaultStopSet()                            Returns an unmodifiable instance of the default stop-words set.
protected TokenStream                       normalize(String fieldName, TokenStream in)

Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase:
getStopwordSet, loadStopwordSet, loadStopwordSet

Methods inherited from class org.apache.lucene.analysis.Analyzer:
attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReader, initReaderForNormalization, normalize, tokenStream, tokenStream

Field Details

DEFAULT_STOPWORD_FILE
File containing default CJK stopwords. It currently contains some common English words that are not usually useful for searching, plus some double-byte punctuation marks.

Constructor Details

CJKAnalyzer
public CJKAnalyzer()
Builds an analyzer which removes words in getDefaultStopSet().

CJKAnalyzer
public CJKAnalyzer(CharArraySet stopwords)
Builds an analyzer with the given stop words.
Parameters:
stopwords - a stopword set

Method Details

getDefaultStopSet
public static CharArraySet getDefaultStopSet()
Returns an unmodifiable instance of the default stop-words set.
Returns:
an unmodifiable instance of the default stop-words set.
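
Because the default stop set is returned as an unmodifiable instance, extending it means copying it first and passing the copy to the stopword-taking constructor. The sketch below illustrates that pattern with `java.util` collections only. It assumes "unmodifiable" has the usual Java meaning, that mutators throw `UnsupportedOperationException`. `StopSetSketch`, `DEFAULTS`, and the three sample words are hypothetical stand-ins, not the contents of Lucene's default set.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class StopSetSketch {
    // Hypothetical stand-in for an unmodifiable default stop set.
    static final Set<String> DEFAULTS =
        Collections.unmodifiableSet(new HashSet<>(Set.of("a", "and", "the")));

    // Returns false when the set rejects mutation.
    static boolean tryAdd(Set<String> set, String word) {
        try {
            set.add(word);
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    // Copy the base set and add extra words, leaving the base untouched.
    static Set<String> extended(Set<String> base, String... extra) {
        Set<String> copy = new HashSet<>(base);
        Collections.addAll(copy, extra);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(tryAdd(DEFAULTS, "lucene"));                       // prints "false"
        System.out.println(extended(DEFAULTS, "lucene").contains("lucene"));  // prints "true"
    }
}
```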

createComponents
protected Analyzer.TokenStreamComponents createComponents(String fieldName)
Specified by:
createComponents in class Analyzer

normalize
protected TokenStream normalize(String fieldName, TokenStream in)
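
This page gives no description for normalize. Assuming it applies the width and case folding named in the class description (an assumption, not stated here), the effect on a single term can be sketched in plain Java. `WidthFoldSketch` and its methods are hypothetical names. The sketch folds only fullwidth ASCII variants (U+FF01 to U+FF5E) to their basic Latin equivalents and then lowercases. It does not handle halfwidth Katakana, which CJKWidthFilter also folds.

```java
import java.util.Locale;

class WidthFoldSketch {
    // Map fullwidth ASCII variants (U+FF01..U+FF5E) onto U+0021..U+007E.
    static String foldFullwidthAscii(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        s.codePoints().forEach(cp -> {
            if (cp >= 0xFF01 && cp <= 0xFF5E) {
                sb.appendCodePoint(cp - 0xFEE0); // fixed offset between the two blocks
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    // Width folding followed by locale-independent lowercasing.
    static String normalizeTerm(String term) {
        return foldFullwidthAscii(term).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        // The escapes spell "LUCENE" in fullwidth letters.
        System.out.println(normalizeTerm("\uFF2C\uFF35\uFF23\uFF25\uFF2E\uFF25")); // prints "lucene"
    }
}
```

Folding query terms the same way as indexed terms lets a fullwidth or mixed-case query match text indexed in its halfwidth, lowercase form.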