Class BinaryDictionary
- java.lang.Object
  - org.apache.lucene.analysis.ja.dict.BinaryDictionary
- All Implemented Interfaces:
Dictionary
- Direct Known Subclasses:
TokenInfoDictionary, UnknownDictionary
public abstract class BinaryDictionary extends Object implements Dictionary
Base class for a binary-encoded in-memory dictionary.
-
-
Nested Class Summary
Nested Classes
- static class BinaryDictionary.ResourceScheme
  Deprecated, for removal: This API element is subject to removal in a future version.
-
Field Summary
Fields
- static String DICT_FILENAME_SUFFIX
- static String DICT_HEADER
- static int HAS_BASEFORM
  Flag that the entry has baseform data.
- static int HAS_PRONUNCIATION
  Flag that the entry has pronunciation data.
- static int HAS_READING
  Flag that the entry has reading data.
- static String POSDICT_FILENAME_SUFFIX
- static String POSDICT_HEADER
- static String TARGETMAP_FILENAME_SUFFIX
- static String TARGETMAP_HEADER
- static int VERSION
Fields inherited from interface org.apache.lucene.analysis.ja.dict.Dictionary
INTERNAL_SEPARATOR
-
-
Constructor Summary
Constructors
- protected BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource)
-
Method Summary
Methods
- String getBaseForm(int wordId, char[] surfaceForm, int off, int len)
  Get base form of word
- String getInflectionForm(int wordId)
  Get inflection form of tokens
- String getInflectionType(int wordId)
  Get inflection type of tokens
- int getLeftId(int wordId)
  Get left id of specified word
- String getPartOfSpeech(int wordId)
  Get Part-Of-Speech of tokens
- String getPronunciation(int wordId, char[] surface, int off, int len)
  Get pronunciation of tokens
- String getReading(int wordId, char[] surface, int off, int len)
  Get reading of tokens
- static InputStream getResource(BinaryDictionary.ResourceScheme scheme, String path)
  Deprecated, for removal: This API element is subject to removal in a future version.
- int getRightId(int wordId)
  Get right id of specified word
- int getWordCost(int wordId)
  Get word cost of specified word
- void lookupWordIds(int sourceId, IntsRef ref)
-
-
-
Field Detail
-
DICT_FILENAME_SUFFIX
public static final String DICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
TARGETMAP_FILENAME_SUFFIX
public static final String TARGETMAP_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
POSDICT_FILENAME_SUFFIX
public static final String POSDICT_FILENAME_SUFFIX
- See Also:
- Constant Field Values
-
DICT_HEADER
public static final String DICT_HEADER
- See Also:
- Constant Field Values
-
TARGETMAP_HEADER
public static final String TARGETMAP_HEADER
- See Also:
- Constant Field Values
-
POSDICT_HEADER
public static final String POSDICT_HEADER
- See Also:
- Constant Field Values
-
VERSION
public static final int VERSION
- See Also:
- Constant Field Values
-
HAS_BASEFORM
public static final int HAS_BASEFORM
Flag that the entry has baseform data. Otherwise the entry is not inflected (the base form is the same as the surface form).
- See Also:
- Constant Field Values
-
HAS_READING
public static final int HAS_READING
Flag that the entry has reading data. Otherwise the reading is the surface form converted to katakana.
- See Also:
- Constant Field Values
-
HAS_PRONUNCIATION
public static final int HAS_PRONUNCIATION
Flag that the entry has pronunciation data. Otherwise the pronunciation is the same as the reading.
- See Also:
- Constant Field Values
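The three HAS_* constants read as bit flags that can be combined in one int per entry. The sketch below shows how such flags might be tested. The numeric values (1, 2, 4), the class name EntryFlagsSketch, and both helper methods are assumptions for illustration; the real values are listed under Constant Field Values.

```java
// Sketch only: stand-in constants, not the values defined by BinaryDictionary.
final class EntryFlagsSketch {
  static final int HAS_BASEFORM = 1;       // assumed value
  static final int HAS_READING = 2;        // assumed value
  static final int HAS_PRONUNCIATION = 4;  // assumed value

  /** Returns true if the given flag bit is set in the entry's flag word. */
  static boolean has(int flags, int flag) {
    return (flags & flag) != 0;
  }

  /** Describes which optional fields are stored and which fall back. */
  static String describe(int flags) {
    StringBuilder sb = new StringBuilder();
    sb.append(has(flags, HAS_BASEFORM) ? "baseform=stored" : "baseform=same as surface");
    sb.append(", ");
    sb.append(has(flags, HAS_READING) ? "reading=stored" : "reading=katakana of surface");
    sb.append(", ");
    sb.append(has(flags, HAS_PRONUNCIATION) ? "pronunciation=stored" : "pronunciation=reading");
    return sb.toString();
  }

  public static void main(String[] args) {
    // An entry that stores a reading and a pronunciation but no base form.
    System.out.println(describe(HAS_READING | HAS_PRONUNCIATION));
  }
}
```

Keeping the fallbacks implicit means an entry only pays for the fields that differ from what can be derived from the surface form.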
-
-
Constructor Detail
-
BinaryDictionary
protected BinaryDictionary(IOSupplier<InputStream> targetMapResource, IOSupplier<InputStream> posResource, IOSupplier<InputStream> dictResource) throws IOException
- Throws:
IOException
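The constructor receives three IOSupplier<InputStream> arguments rather than already-open streams, so each resource can be opened only when it is read. The sketch below illustrates that pattern with a local stand-in interface. IoSupplierSketch, its nested Supplier interface, and readAll are hypothetical names, and the stand-in assumes IOSupplier has a single get() method that may throw IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Sketch only: Supplier is a local stand-in for Lucene's IOSupplier (assumed shape).
final class IoSupplierSketch {
  @FunctionalInterface
  interface Supplier<T> {
    T get() throws IOException;
  }

  /** Opens the supplied stream on demand, reads it fully, and closes it. */
  static byte[] readAll(Supplier<InputStream> resource) {
    try (InputStream in = resource.get()) {
      return in.readAllBytes();
    } catch (IOException e) {
      // Wrapped only to keep the sketch free of checked exceptions.
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    byte[] fake = "dict-bytes".getBytes(StandardCharsets.UTF_8);
    // Nothing is opened until readAll calls get().
    Supplier<InputStream> dictResource = () -> new ByteArrayInputStream(fake);
    System.out.println(readAll(dictResource).length);
  }
}
```

For files on disk, a supplier could be written as a lambda around java.nio.file.Files.newInputStream(path), which likewise defers the open until get() is called.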
-
-
Method Detail
-
getResource
@Deprecated(forRemoval=true, since="9.1") public static final InputStream getResource(BinaryDictionary.ResourceScheme scheme, String path) throws IOException
Deprecated, for removal: This API element is subject to removal in a future version.
- Throws:
IOException
-
lookupWordIds
public void lookupWordIds(int sourceId, IntsRef ref)
-
getLeftId
public int getLeftId(int wordId)
Description copied from interface: Dictionary
Get left id of specified word
- Specified by:
getLeftId in interface Dictionary
- Returns:
left id
-
getRightId
public int getRightId(int wordId)
Description copied from interface: Dictionary
Get right id of specified word
- Specified by:
getRightId in interface Dictionary
- Returns:
right id
-
getWordCost
public int getWordCost(int wordId)
Description copied from interface: Dictionary
Get word cost of specified word
- Specified by:
getWordCost in interface Dictionary
- Returns:
word's cost
-
getBaseForm
public String getBaseForm(int wordId, char[] surfaceForm, int off, int len)
Description copied from interface: Dictionary
Get base form of word
- Specified by:
getBaseForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Base form (only different for inflected words, otherwise null)
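Because getBaseForm returns null for words that are not inflected, a caller that always wants a string has to fall back to the surface text. The sketch below shows one way to do that. BaseFormFallbackSketch and baseFormOrSurface are hypothetical and not part of the Lucene API, and the sketch assumes off and len delimit the token inside the surfaceForm buffer.

```java
// Sketch only: baseFormOrSurface is a hypothetical helper, not part of the Lucene API.
final class BaseFormFallbackSketch {
  /**
   * Returns the dictionary's base form when present, otherwise the surface
   * text taken from surfaceForm[off .. off + len).
   */
  static String baseFormOrSurface(String baseForm, char[] surfaceForm, int off, int len) {
    return baseForm != null ? baseForm : new String(surfaceForm, off, len);
  }

  public static void main(String[] args) {
    // Buffer holds "Tokyo ni itta" written in Japanese (six characters).
    char[] buffer = "\u6771\u4EAC\u306B\u884C\u3063\u305F".toCharArray();
    // Non-inflected token "Tokyo": the lookup is assumed to have returned null.
    System.out.println(baseFormOrSurface(null, buffer, 0, 2));
    // Inflected token "itta": the lookup is assumed to have returned "iku".
    System.out.println(baseFormOrSurface("\u884C\u304F", buffer, 3, 3));
  }
}
```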
-
getReading
public String getReading(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get reading of tokens
- Specified by:
getReading in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Reading of the token
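The HAS_READING field notes that an entry without reading data uses the surface form converted to katakana. The sketch below shows one common way to do that conversion, by shifting the hiragana block up by 0x60. KatakanaFallbackSketch and toKatakana are hypothetical names; this is an illustration of the idea, not necessarily how BinaryDictionary implements it.

```java
// Sketch only: toKatakana is a hypothetical helper illustrating the fallback.
final class KatakanaFallbackSketch {
  /**
   * Converts hiragana in surface[off .. off + len) to katakana by shifting each
   * char in the range 0x3041..0x3096 up by 0x60; other chars are copied unchanged.
   */
  static String toKatakana(char[] surface, int off, int len) {
    char[] out = new char[len];
    for (int i = 0; i < len; i++) {
      char ch = surface[off + i];
      out[i] = (ch >= 0x3041 && ch <= 0x3096) ? (char) (ch + 0x60) : ch;
    }
    return new String(out);
  }

  public static void main(String[] args) {
    // Hiragana "sushi" (0x3059, 0x3057) becomes katakana "sushi" (0x30B9, 0x30B7).
    char[] surface = "\u3059\u3057".toCharArray();
    System.out.println(toKatakana(surface, 0, surface.length));
  }
}
```

The shift works because the hiragana and katakana blocks are laid out in parallel in Unicode, 0x60 code points apart.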
-
getPartOfSpeech
public String getPartOfSpeech(int wordId)
Description copied from interface: Dictionary
Get Part-Of-Speech of tokens
- Specified by:
getPartOfSpeech in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Part-Of-Speech of the token
-
getPronunciation
public String getPronunciation(int wordId, char[] surface, int off, int len)
Description copied from interface: Dictionary
Get pronunciation of tokens
- Specified by:
getPronunciation in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
Pronunciation of the token
-
getInflectionType
public String getInflectionType(int wordId)
Description copied from interface: Dictionary
Get inflection type of tokens
- Specified by:
getInflectionType in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection type, or null
-
getInflectionForm
public String getInflectionForm(int wordId)
Description copied from interface: Dictionary
Get inflection form of tokens
- Specified by:
getInflectionForm in interface Dictionary
- Parameters:
wordId - word ID of token
- Returns:
inflection form, or null
-
-