Package org.apache.lucene.analysis.ja
Analyzer for Japanese.
Class Summary
GraphvizFormatter - Outputs the dot (graphviz) string for the viterbi lattice.
JapaneseAnalyzer - Analyzer for Japanese that uses morphological analysis.
JapaneseBaseFormFilter - Replaces term text with the BaseFormAttribute.
JapaneseBaseFormFilterFactory - Factory for JapaneseBaseFormFilter.
JapaneseCompletionAnalyzer - Analyzer for Japanese completion suggester.
JapaneseCompletionFilter - A TokenFilter that adds Japanese romanized tokens to the term attribute.
JapaneseCompletionFilterFactory - Factory for JapaneseCompletionFilter.
JapaneseHiraganaUppercaseFilter - A TokenFilter that normalizes small letters (捨て仮名) in hiragana into normal letters.
JapaneseHiraganaUppercaseFilterFactory - Factory for JapaneseHiraganaUppercaseFilter.
JapaneseIterationMarkCharFilter - Normalizes Japanese horizontal iteration marks (odoriji) to their expanded form.
JapaneseIterationMarkCharFilterFactory - Factory for JapaneseIterationMarkCharFilter.
JapaneseKatakanaStemFilter - A TokenFilter that normalizes common katakana spelling variations ending in a long sound character by removing this character (U+30FC).
JapaneseKatakanaStemFilterFactory - Factory for JapaneseKatakanaStemFilter.
JapaneseKatakanaUppercaseFilter - A TokenFilter that normalizes small letters (捨て仮名) in katakana into normal letters.
JapaneseKatakanaUppercaseFilterFactory - Factory for JapaneseKatakanaUppercaseFilter.
JapaneseNumberFilter - A TokenFilter that normalizes Japanese numbers (kansūji) to regular Arabic decimal numbers in half-width characters.
JapaneseNumberFilter.NumberBuffer - Buffer that holds a Japanese number string and a position index used as a parsed-to marker.
JapaneseNumberFilterFactory - Factory for JapaneseNumberFilter.
JapanesePartOfSpeechStopFilter - Removes tokens that match a set of part-of-speech tags.
JapanesePartOfSpeechStopFilterFactory - Factory for JapanesePartOfSpeechStopFilter.
JapaneseReadingFormFilter - A TokenFilter that replaces the term attribute with the reading of a token in either katakana or romaji form.
JapaneseReadingFormFilterFactory - Factory for JapaneseReadingFormFilter.
JapaneseTokenizer - Tokenizer for Japanese that uses morphological analysis.
JapaneseTokenizerFactory - Factory for JapaneseTokenizer.
Token - Analyzed token with morphological data from its dictionary.
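To make the JapaneseKatakanaStemFilter description concrete, the following is a minimal standalone sketch of the idea it describes: dropping a trailing long sound character (U+30FC) from a katakana term. It is not the Lucene implementation and does not use the Lucene API. The class name KatakanaStemSketch, the helper methods, and the minimum length of 4 are all assumptions made for illustration.

```java
// Illustrative sketch only: a plain-Java approximation of katakana stemming,
// not the code used by JapaneseKatakanaStemFilter.
final class KatakanaStemSketch {
    // U+30FC KATAKANA-HIRAGANA PROLONGED SOUND MARK (the "long sound character").
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    // Hypothetical threshold: terms shorter than this are left unchanged,
    // so that short words such as "コピー" keep their final long sound.
    private static final int MIN_LENGTH = 4;

    // Returns true if every code point of the term lies in the Unicode KATAKANA block.
    static boolean isKatakana(String term) {
        if (term.isEmpty()) {
            return false;
        }
        return term.codePoints()
                .allMatch(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.KATAKANA);
    }

    // Removes one trailing U+30FC from an all-katakana term of sufficient length.
    static String stem(String term) {
        boolean endsWithLongSound = !term.isEmpty()
                && term.charAt(term.length() - 1) == PROLONGED_SOUND_MARK;
        if (term.length() >= MIN_LENGTH && endsWithLongSound && isKatakana(term)) {
            return term.substring(0, term.length() - 1);
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stem("コンピューター")); // long katakana term ending in U+30FC
        System.out.println(stem("コピー"));         // shorter than MIN_LENGTH
        System.out.println(stem("ひらがな"));       // not katakana
    }
}
```

With this sketch, spelling variants such as コンピューター and コンピュータ reduce to the same string, which is the kind of normalization the summary describes. In an actual analysis chain the real filter operates on the term attribute of a token stream rather than on plain strings.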
Enum Summary
JapaneseCompletionFilter.Mode - Completion mode.
JapaneseTokenizer.Mode - Tokenization mode: this determines how the tokenizer handles compound and unknown words.
JapaneseTokenizer.Type - Token type reflecting the original source of this token.