Class BlockReader
- All Implemented Interfaces:
Accountable, BytesRefIterator
- Direct Known Subclasses:
IntersectBlockReader, STBlockReader
Reads the whole block into blockReadBuffer, then scans the block terms in memory. The
details region is decoded lazily with termStatesReadBuffer, which shares the same byte
array as blockReadBuffer. See BlockWriter and BlockLine for the block
format.
- WARNING: This API is experimental and might change in incompatible ways in the next release.
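As an illustration of two read buffers sharing one byte array, here is a minimal sketch that uses java.nio.ByteBuffer in place of Lucene's ByteArrayDataInput. The class name SharedBufferSketch and the detailsOffset parameter are hypothetical and not part of the BlockReader API.

```java
import java.nio.ByteBuffer;

/** Hypothetical illustration: two independent read cursors over one shared byte array. */
final class SharedBufferSketch {

  /** Returns {first byte seen by the lines view, first byte seen by the details view}. */
  static int[] readThroughTwoViews(byte[] block, int detailsOffset) {
    // Plays the role of blockReadBuffer: starts at the beginning of the block.
    ByteBuffer lines = ByteBuffer.wrap(block);
    // Plays the role of termStatesReadBuffer: same backing array, no copy.
    ByteBuffer details = ByteBuffer.wrap(block);
    // Only the position differs between the two views.
    details.position(detailsOffset);
    return new int[] {lines.get(), details.get()};
  }
}
```

Both views read from the same array, so the block is loaded once and the details region costs nothing until its cursor is actually used.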
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.TermsEnum
TermsEnum.SeekStatus -
Field Summary
Fields
Modifier and Type | Field | Description
protected final BlockDecoder | blockDecoder |
protected int | blockFirstLineStart | Offset of the start of the first line of the current block (just after the header), relative to the block start.
protected BlockHeader | blockHeader | Current block header.
protected BlockHeader.Serializer | blockHeaderReader |
protected IndexInput | blockInput | IndexInput on the block file.
protected BlockLine | blockLine | Current block line.
protected BlockLine.Serializer | blockLineReader |
protected ByteArrayDataInput | blockReadBuffer | In-memory read buffer for the current block.
protected long | blockStartFP | Current block start file pointer, absolute in the block file.
protected IndexDictionary.Browser | dictionaryBrowser | Holds the IndexDictionary.Browser once loaded.
protected final IndexDictionary.BrowserSupplier | dictionaryBrowserSupplier | IndexDictionary.Browser supplier for lazy loading.
protected final FieldMetadata | fieldMetadata |
protected BytesRefBuilder | forcedTerm | Set when seekExact(BytesRef, TermState) is called.
protected int | lineIndexInBlock | Current line index in the block.
protected final PostingsReaderBase | postingsReader |
protected BytesRef | scratchBlockBytes |
protected BlockLine | scratchBlockLine |
protected final BlockTermState | scratchTermState |
protected BlockTermState | termState | Current block line details.
protected boolean | termStateForced | Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
protected DeltaBaseTermStateSerializer | termStateSerializer |
protected ByteArrayDataInput | termStatesReadBuffer | In-memory read buffer for the details region of the current block.
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE -
Constructor Summary
Constructors
Modifier | Constructor | Description
protected | BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) |
-
Method Summary
Modifier and Type | Method | Description
protected void | clearTermState() |
protected int | compareToMiddleAndJump(BytesRef searchedTerm) | Compares the searched term to the middle term of the block.
protected BlockHeader.Serializer | createBlockHeaderSerializer |
protected BlockLine.Serializer | createBlockLineSerializer |
protected DeltaBaseTermStateSerializer | createDeltaBaseTermStateSerializer |
protected BytesRef | decodeBlockBytesIfNeeded(int numBlockBytes) |
int | docFreq() |
protected IndexDictionary.Browser | getOrCreateDictionaryBrowser |
ImpactsEnum | impacts(int flags) |
protected void | initializeBlockReadLazily |
protected void | initializeHeader(BytesRef searchedTerm, long targetBlockStartFP) | Reads and sets blockHeader.
protected boolean | isBeyondLastTerm(BytesRef searchedTerm, long blockStartFP) | Indicates whether the searched term is beyond the last term of the field.
protected boolean | isCurrentTerm(BytesRef searchedTerm) |
protected CorruptIndexException | newCorruptIndexException(String msg, Long fp) |
BytesRef | next() |
protected BytesRef | nextTerm() | Moves to the next term line and reads it; it may be in the next block.
long | ord() |
PostingsEnum | postings(PostingsEnum reuse, int flags) |
long | ramBytesUsed() |
protected BlockHeader | readHeader | Reads the block header.
protected BlockLine | readLineInBlock | Reads the current block line.
protected BlockTermState | readTermState | Reads the BlockTermState on the current line.
protected BlockTermState | readTermStateIfNotRead() | Reads the BlockTermState if it is not already set.
TermsEnum.SeekStatus | seekCeil(BytesRef) |
void | seekExact(long ord) | Not supported.
boolean | seekExact(BytesRef) |
void | seekExact(BytesRef, TermState) | Positions this BlockReader without re-seeking the term dictionary.
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm) | Seeks to the provided term in this block.
protected TermsEnum.SeekStatus | seekInBlock(BytesRef searchedTerm, long blockStartFP) | Seeks to the provided term in the block starting at the provided file pointer.
BytesRef | term() |
TermState | termState() |
long | totalTermFreq() |
Methods inherited from class org.apache.lucene.index.BaseTermsEnum
attributes, prepareSeekExact
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
blockInput
IndexInput on the block file.
-
postingsReader
-
fieldMetadata
-
blockDecoder
-
blockHeaderReader
-
blockLineReader
-
blockReadBuffer
In-memory read buffer for the current block. -
termStatesReadBuffer
In-memory read buffer for the details region of the current block. It shares the same byte array as blockReadBuffer, with a different position.
-
termStateSerializer
-
dictionaryBrowserSupplier
IndexDictionary.Browser supplier for lazy loading.
-
dictionaryBrowser
Holds the IndexDictionary.Browser once loaded.
-
blockStartFP
protected long blockStartFP
Current block start file pointer, absolute in the block file.
-
blockHeader
Current block header. -
blockLine
Current block line. -
termState
Current block line details. -
blockFirstLineStart
protected int blockFirstLineStart
Offset of the start of the first line of the current block (just after the header), relative to the block start.
-
lineIndexInBlock
protected int lineIndexInBlock
Current line index in the block.
-
termStateForced
protected boolean termStateForced
Whether the current TermState has been forced with a call to seekExact(BytesRef, TermState).
-
forcedTerm
Set when seekExact(BytesRef, TermState) is called.
This optimizes the use case where the caller first calls seekExact(BytesRef, TermState) and then postings(PostingsEnum, int). In this case we do not access the terms block file (no seek) and go directly to the postings file, because the TermState already holds the file pointers into the postings file.
-
scratchBlockBytes
-
scratchTermState
-
scratchBlockLine
-
-
Constructor Details
-
BlockReader
protected BlockReader(IndexDictionary.BrowserSupplier dictionaryBrowserSupplier, IndexInput blockInput, PostingsReaderBase postingsReader, FieldMetadata fieldMetadata, BlockDecoder blockDecoder) throws IOException
- Parameters:
dictionaryBrowserSupplier - Used to load the IndexDictionary.Browser lazily in seekCeil(BytesRef).
blockDecoder - Optional block decoder; may be null if none. It can be used for decompression or decryption.
- Throws:
IOException
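The optional decoder can be pictured as a null-safe transformation applied to the raw block bytes. The sketch below is not the BlockDecoder interface: it substitutes java.util.function.UnaryOperator for it, and the names OptionalDecoderSketch, decodeIfNeeded and xor are hypothetical. The XOR step is only a toy stand-in for decryption.

```java
import java.util.function.UnaryOperator;

/** Hypothetical illustration of an optional (nullable) block decoder. */
final class OptionalDecoderSketch {

  /** Returns the bytes unchanged when no decoder is configured, else the decoded bytes. */
  static byte[] decodeIfNeeded(byte[] blockBytes, UnaryOperator<byte[]> decoder) {
    return decoder == null ? blockBytes : decoder.apply(blockBytes);
  }

  /** Toy reversible transformation: XOR every byte with a fixed key. */
  static byte[] xor(byte[] in) {
    byte[] out = new byte[in.length];
    for (int i = 0; i < in.length; i++) {
      out[i] = (byte) (in[i] ^ 0x5A);
    }
    return out;
  }
}
```

Passing null keeps the read path free of any extra copy, which is the case the constructor documentation describes as "none".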
-
-
Method Details
-
seekCeil
- Specified by:
seekCeil in class TermsEnum
- Throws:
IOException
-
seekExact
- Overrides:
seekExact in class BaseTermsEnum
- Throws:
IOException
-
isCurrentTerm
-
isBeyondLastTerm
Indicates whether the searched term is beyond the last term of the field.
- Parameters:
blockStartFP - The current block start file pointer.
-
seekInBlock
protected TermsEnum.SeekStatus seekInBlock(BytesRef searchedTerm, long blockStartFP) throws IOException
Seeks to the provided term in the block starting at the provided file pointer. Does not exceed the block.
- Throws:
IOException
-
seekInBlock
Seeks to the provided term in this block.
Does not exceed this block; TermsEnum.SeekStatus.END is returned if the searched term follows the block.
Compares the line terms with the searchedTerm, taking advantage of the incremental encoding properties.
Scans the terms linearly. Updates the current block line with the current term.
- Throws:
IOException
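The linear scan over incrementally encoded lines can be sketched as follows. This is a simplified illustration, not the BlockReader code: it assumes each line stores the number of leading characters shared with the previous term plus the remaining suffix, it rebuilds every term in full instead of exploiting the encoding to skip comparisons, and it compares Java strings rather than bytes. SeekInBlockSketch and its members are hypothetical names.

```java
/** Hypothetical illustration of a linear seek over prefix-compressed terms in one block. */
final class SeekInBlockSketch {

  enum Status { FOUND, NOT_FOUND, END }

  int lineIndex = -1;   // index of the last line read, like a current-line cursor
  String currentTerm;   // term rebuilt from the last line read

  /**
   * prefixLengths[i] is the number of leading chars line i shares with line i-1;
   * suffixes[i] is the rest of the term. Terms are assumed sorted.
   */
  Status seek(int[] prefixLengths, String[] suffixes, String searched) {
    StringBuilder term = new StringBuilder();
    for (int i = 0; i < suffixes.length; i++) {
      term.setLength(prefixLengths[i]);   // keep the shared prefix of the previous term
      term.append(suffixes[i]);           // append this line's own suffix
      lineIndex = i;
      currentTerm = term.toString();
      int cmp = currentTerm.compareTo(searched);
      if (cmp == 0) {
        return Status.FOUND;
      }
      if (cmp > 0) {
        return Status.NOT_FOUND;          // stopped on the first term after the searched one
      }
    }
    return Status.END;                    // searched term follows every term of the block
  }
}
```

For the terms "apple", "apply", "bat" encoded as prefix lengths {0, 4, 0} and suffixes {"apple", "y", "bat"}, seeking "apply" stops on line 1 with FOUND, and seeking "cat" runs off the end of the block and yields END.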
-
compareToMiddleAndJump
Compares the searched term to the middle term of the block. If the searched term is lexicographically equal to or after the middle term, jumps directly to the second half of the block.
- Returns:
- The comparison between the searched term and the middle term.
- Throws:
IOException
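A minimal sketch of the compare-and-jump step, assuming the block's terms are available as a sorted array and that the middle term sits at index length / 2. Both assumptions are for illustration only, and MiddleJumpSketch is a hypothetical name; the real method works on the encoded block.

```java
/** Hypothetical illustration: skip the first half of a block when the term cannot be there. */
final class MiddleJumpSketch {

  int lineIndex = 0;  // line from which a subsequent linear scan would start

  /** Returns the comparison of searched against the middle term; jumps if it is >= 0. */
  int compareToMiddleAndJump(String[] sortedTerms, String searched) {
    int middle = sortedTerms.length / 2;
    int cmp = searched.compareTo(sortedTerms[middle]);
    if (cmp >= 0) {
      // The searched term is equal to or after the middle term:
      // the first half can be skipped entirely.
      lineIndex = middle;
    }
    return cmp;
  }
}
```

A single comparison therefore halves the expected length of the linear scan that follows, without requiring a full binary search over variable-length lines.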
-
readLineInBlock
Reads the current block line. Sets blockLine and increments lineIndexInBlock.
- Returns:
- The BlockLine; or null if there are no more lines in the block.
- Throws:
IOException
-
seekExact
Positions this BlockReader without re-seeking the term dictionary.
The block containing the term is not read by this method. It is read lazily, only if needed, for example if next() is called. Calling postings(org.apache.lucene.index.PostingsEnum, int) after this method does not require the block to be read.
- Overrides:
seekExact in class BaseTermsEnum
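The effect of a forced term state can be sketched with a counter that records how often the terms block is read. This is a simplified model under stated assumptions, not Lucene code: ForcedStateSketch, blockReads and postingsFilePointer are hypothetical names, and the term state is reduced to a single postings file pointer.

```java
/** Hypothetical illustration: a forced term state lets postings be reached without a block read. */
final class ForcedStateSketch {

  int blockReads = 0;               // number of simulated reads of the terms block
  private boolean termStateForced;  // true after seekExact(term, postingsFP)
  private String forcedTerm;
  private long postingsFP;

  /** Positions on the term using a state the caller already holds; no I/O on the block file. */
  void seekExact(String term, long postingsFP) {
    this.forcedTerm = term;
    this.postingsFP = postingsFP;
    this.termStateForced = true;
  }

  /** Uses the forced state directly; the terms block is not touched. */
  long postingsFilePointer() {
    if (!termStateForced) {
      throw new IllegalStateException("not positioned with a forced term state");
    }
    return postingsFP;
  }

  /** Moving on requires locating the forced term in its block, so the block is read now. */
  void next() {
    if (termStateForced) {
      blockReads++;
      termStateForced = false;
      forcedTerm = null;
    }
  }
}
```

Calling seekExact and then postingsFilePointer leaves blockReads at 0; only a later call to next() pays for the block read.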
-
seekExact
public void seekExact(long ord)
Not supported.
-
next
- Specified by:
next in interface BytesRefIterator
- Throws:
IOException
-
nextTerm
Moves to the next term line and reads it; the line may be in the next block. The term details are not read yet. They are read only when needed, with readTermStateIfNotRead().
- Returns:
- The read term bytes; or null if there is no more term for the field.
- Throws:
IOException
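Iteration that crosses block boundaries can be sketched as below. The model is deliberately simple and hypothetical: blocks are plain in-memory string arrays and NextTermSketch is an illustrative name, whereas the real reader loads each block from the block file.

```java
/** Hypothetical illustration: iterate terms line by line, moving to the next block when needed. */
final class NextTermSketch {

  private final String[][] blocks;   // each inner array stands for the lines of one block
  private int blockIndex = 0;
  private int lineIndexInBlock = 0;

  NextTermSketch(String[][] blocks) {
    this.blocks = blocks;
  }

  /** Returns the next term, or null when there are no more terms. */
  String nextTerm() {
    while (blockIndex < blocks.length) {
      if (lineIndexInBlock < blocks[blockIndex].length) {
        return blocks[blockIndex][lineIndexInBlock++];
      }
      // Current block exhausted: continue with the first line of the next block.
      blockIndex++;
      lineIndexInBlock = 0;
    }
    return null;
  }
}
```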
-
initializeHeader
Reads and sets blockHeader. Sets it to null if there are no more blocks for the field.
- Parameters:
searchedTerm - The searched term; or null if none.
targetBlockStartFP - The file pointer of the block to read.
- Throws:
IOException
-
initializeBlockReadLazily
- Throws:
IOException
-
createBlockHeaderSerializer
-
createBlockLineSerializer
-
createDeltaBaseTermStateSerializer
-
readHeader
Reads the block header. Sets blockHeader.
- Returns:
- The block header; or null if there is no block for the field anymore.
- Throws:
IOException
-
decodeBlockBytesIfNeeded
- Throws:
IOException
-
readTermStateIfNotRead
Reads the BlockTermState if it is not already set. Sets termState.
- Throws:
IOException
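The read-if-not-already-read pattern amounts to memoizing a decode step until the reader moves to another line. The sketch below assumes a generic Supplier as the decode step; LazyTermStateSketch, readIfNotRead and clear are hypothetical names, not BlockReader members.

```java
import java.util.function.Supplier;

/** Hypothetical illustration: decode the term details at most once per line. */
final class LazyTermStateSketch<T> {

  private final Supplier<T> decoder;  // stands for decoding the details region
  private T termState;                // null until the details are first needed
  int decodeCount = 0;                // number of times the decoder actually ran

  LazyTermStateSketch(Supplier<T> decoder) {
    this.decoder = decoder;
  }

  /** Decodes on first use, then returns the cached value. */
  T readIfNotRead() {
    if (termState == null) {
      termState = decoder.get();
      decodeCount++;
    }
    return termState;
  }

  /** Drops the cached value, as would happen when moving to another line. */
  void clear() {
    termState = null;
  }
}
```

Callers that only enumerate terms never trigger the decode step, which is the point of keeping the details region separate from the term lines.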
-
readTermState
Reads the BlockTermState on the current line. Sets termState.
An overriding method may return null if there is no BlockTermState (in this case the extending class must support a null termState).
- Returns:
- The BlockTermState; or null if none.
- Throws:
IOException
-
term
-
ord
public long ord() -
docFreq
- Specified by:
docFreq in class TermsEnum
- Throws:
IOException
-
totalTermFreq
- Specified by:
totalTermFreq in class TermsEnum
- Throws:
IOException
-
termState
- Overrides:
termState in class BaseTermsEnum
- Throws:
IOException
-
postings
- Specified by:
postings in class TermsEnum
- Throws:
IOException
-
impacts
- Specified by:
impacts in class TermsEnum
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
- Specified by:
ramBytesUsed in interface Accountable
-
getOrCreateDictionaryBrowser
- Throws:
IOException
-
clearTermState
protected void clearTermState() -
newCorruptIndexException
-