Package org.apache.lucene.codecs.bloom
Class BloomFilteringPostingsFormat
java.lang.Object
org.apache.lucene.codecs.PostingsFormat
org.apache.lucene.codecs.bloom.BloomFilteringPostingsFormat
- All Implemented Interfaces:
- NamedSPILoader.NamedSPI
A PostingsFormat useful for low doc-frequency fields such as primary keys. Bloom filters
 are maintained in a ".blm" file, which offers "fast-fail" for reads in segments known to have no
 record of the key. A delegate PostingsFormat of your choice records all other postings data.
 A BloomFilterFactory can be passed to tailor Bloom filter settings on a
 per-field basis. The default configuration is DefaultBloomFilterFactory, which allocates a
 ~8 MB bitset and hashes values using MurmurHash64. This should be suitable for most
 purposes.
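The "fast-fail" behaviour can be illustrated with a minimal, self-contained sketch. It is not Lucene's FuzzySet: the class name SimpleBloomFilter, the FNV-style hash (a stand-in for MurmurHash64), and the use of two bit positions per key are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom filter (hypothetical class, not part of Lucene). */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int numBits;

    SimpleBloomFilter(int numBits) {
        this.numBits = numBits;
        this.bits = new BitSet(numBits);
    }

    // Seeded FNV-1a style hash; an assumed stand-in for the real hash function.
    private int position(byte[] key, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : key) {
            h ^= (b & 0xFF);
            h *= 0x01000193;
        }
        return Math.floorMod(h, numBits);
    }

    /** Records a key by setting two bit positions derived from it. */
    void add(String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        bits.set(position(k, 0));
        bits.set(position(k, 0x9E3779B9));
    }

    /**
     * Returns false when the key was definitely never added (the "fast-fail" case),
     * and true when it may have been added (false positives are possible).
     */
    boolean mightContain(String key) {
        byte[] k = key.getBytes(StandardCharsets.UTF_8);
        return bits.get(position(k, 0)) && bits.get(position(k, 0x9E3779B9));
    }
}
```

A reader would consult such a filter first and only fall through to the delegate's postings data when mightContain returns true; a false result lets the segment be skipped without touching the term dictionary.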
 
The format of the blm file is as follows:
- BloomFilter (.blm) --> Header, DelegatePostingsFormatName, NumFilteredFields, Filter^NumFilteredFields, Footer
- Filter --> FieldNumber, FuzzySet
- FuzzySet --> See FuzzySet.serialize(DataOutput)
- Header --> IndexHeader
- DelegatePostingsFormatName --> String. The name of a ServiceProvider-registered PostingsFormat
- NumFilteredFields --> Uint32
- FieldNumber --> Uint32. The number of the field in this segment
- Footer --> CodecFooter
- WARNING: This API is experimental and might change in incompatible ways in the next release.
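As a rough illustration of the .blm layout described above, the following sketch writes and reads the body of such a file with plain java.io streams. Several things here are assumptions and do not match Lucene's on-disk encoding: the Header and Footer are omitted, the string uses DataOutputStream.writeUTF, each FuzzySet is treated as an opaque byte array with an added length prefix, and the class BlmLayoutSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper mirroring the field order of the .blm body (not Lucene code). */
final class BlmLayoutSketch {

    /** Parsed result: the delegate name plus one opaque filter blob per field number. */
    static final class Parsed {
        final String delegateName;
        final Map<Integer, byte[]> filters;

        Parsed(String delegateName, Map<Integer, byte[]> filters) {
            this.delegateName = delegateName;
            this.filters = filters;
        }
    }

    static byte[] write(String delegateName, Map<Integer, byte[]> filters) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(delegateName);                 // DelegatePostingsFormatName
            out.writeInt(filters.size());               // NumFilteredFields
            for (Map.Entry<Integer, byte[]> entry : filters.entrySet()) {
                out.writeInt(entry.getKey());           // FieldNumber
                out.writeInt(entry.getValue().length);  // length prefix (assumption)
                out.write(entry.getValue());            // serialized filter bytes
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    static Parsed read(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String delegateName = in.readUTF();
            int numFilteredFields = in.readInt();
            Map<Integer, byte[]> filters = new LinkedHashMap<>();
            for (int i = 0; i < numFilteredFields; i++) {
                int fieldNumber = in.readInt();
                byte[] blob = new byte[in.readInt()];
                in.readFully(blob);
                filters.put(fieldNumber, blob);
            }
            return new Parsed(delegateName, filters);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }
}
```

The point of the sketch is the ordering: the delegate's name comes first so a reader can resolve the delegate PostingsFormat, then a count, then that many (FieldNumber, filter) pairs.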
Field Summary
Fields inherited from class org.apache.lucene.codecs.PostingsFormat: EMPTY
Constructor Summary
- BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat): Creates Bloom filters for a selection of fields created in the index.
- BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory): Creates Bloom filters for a selection of fields created in the index.
Method Summary
- fieldsConsumer(SegmentWriteState state)
- fieldsProducer(SegmentReadState state)
- toString()
Methods inherited from class org.apache.lucene.codecs.PostingsFormat: availablePostingsFormats, forName, getName, reloadPostingsFormats
Field Details
- BLOOM_CODEC_NAME
- VERSION_START
  public static final int VERSION_START
- VERSION_CURRENT
  public static final int VERSION_CURRENT
Constructor Details
- BloomFilteringPostingsFormat
  public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat, BloomFilterFactory bloomFilterFactory)
  Creates Bloom filters for a selection of fields created in the index. This is recorded as a set of Bitsets held as a segment summary in an additional "blm" file. This PostingsFormat delegates to a delegate PostingsFormat of your choice for encoding all other postings data.
  Parameters:
  - delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. postings info.
  - bloomFilterFactory - The BloomFilterFactory responsible for sizing BloomFilters appropriately.
- BloomFilteringPostingsFormat
  public BloomFilteringPostingsFormat(PostingsFormat delegatePostingsFormat)
  Creates Bloom filters for a selection of fields created in the index. This is recorded as a set of Bitsets held as a segment summary in an additional "blm" file. This PostingsFormat delegates to a delegate PostingsFormat of your choice for encoding all other postings data. This constructor defaults to the DefaultBloomFilterFactory for configuring per-field BloomFilters.
  Parameters:
  - delegatePostingsFormat - The PostingsFormat that records all the non-Bloom-filter data, i.e. postings info.
- BloomFilteringPostingsFormat
  public BloomFilteringPostingsFormat()
Method Details
- fieldsConsumer
  Specified by: fieldsConsumer in class PostingsFormat
  Throws: IOException
- fieldsProducer
  Specified by: fieldsProducer in class PostingsFormat
  Throws: IOException
- toString
  Overrides: toString in class PostingsFormat