@InterfaceAudience.Public
@InterfaceStability.Stable
public final class CountingBloomFilter
extends org.apache.hadoop.util.bloom.Filter
A counting Bloom filter is an improvement to a standard Bloom filter, as it allows dynamic additions and deletions of set membership information. This is achieved through the use of a counting vector instead of a bit vector.
Originally created by European Commission One-Lab Project 034819.
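
To make the counting-vector idea concrete, here is a minimal sketch in plain Java. It is not Hadoop's implementation: the class name ToyCountingFilter, the use of String keys, and the double-hashing scheme in position() are all assumptions made for illustration (the real class works on Key objects and a configurable hash type).

```java
/** Toy counting filter: one counter per position instead of one bit. */
class ToyCountingFilter {
    private final int[] counters;
    private final int nbHash;

    ToyCountingFilter(int vectorSize, int nbHash) {
        this.counters = new int[vectorSize];
        this.nbHash = nbHash;
    }

    // Hypothetical hash scheme: double hashing over String.hashCode().
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // force an odd step
        return Math.floorMod(h1 + i * h2, counters.length);
    }

    /** Increments every counter the key maps to. */
    void add(String key) {
        for (int i = 0; i < nbHash; i++) {
            counters[position(key, i)]++;
        }
    }

    /** Returns false as soon as one of the key's counters is zero. */
    boolean membershipTest(String key) {
        for (int i = 0; i < nbHash; i++) {
            if (counters[position(key, i)] == 0) {
                return false;
            }
        }
        return true;
    }

    /** Decrements the key's counters; does nothing if the key tests as absent. */
    void delete(String key) {
        if (!membershipTest(key)) {
            return;
        }
        for (int i = 0; i < nbHash; i++) {
            int p = position(key, i);
            if (counters[p] > 0) {
                counters[p]--;
            }
        }
    }

    public static void main(String[] args) {
        ToyCountingFilter filter = new ToyCountingFilter(64, 3);
        filter.add("apple");
        System.out.println(filter.membershipTest("apple")); // true
        filter.delete("apple");
        System.out.println(filter.membershipTest("apple")); // false
    }
}
```

A plain bit vector cannot support the delete step, because clearing a bit might also erase another key that shares the position. Counters avoid that by only dropping to zero once every key mapped there has been removed.
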
See Also:
The general behavior of a filter,
Summary cache: a scalable wide-area web cache sharing protocol

| Constructor and Description |
|---|
| CountingBloomFilter(): Default constructor; use with readFields. |
| CountingBloomFilter(int vectorSize, int nbHash, int hashType): Constructor. |

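
The three-argument constructor takes a vector size and a number of hash functions, but this page does not say how to pick them. One common approach, sketched below, is to use the standard Bloom filter sizing formulas m = -n ln(p) / (ln 2)^2 and k = (m / n) ln 2, where n is the expected number of keys and p is the target false positive rate. The class and method names here (FilterSizing, vectorSize, nbHash) are hypothetical helpers, not part of Hadoop.

```java
/** Hypothetical helper for choosing constructor arguments; not part of Hadoop. */
class FilterSizing {

    /** m = -n * ln(p) / (ln 2)^2, rounded up. */
    static int vectorSize(int expectedKeys, double falsePositiveRate) {
        double ln2 = Math.log(2);
        double m = -expectedKeys * Math.log(falsePositiveRate) / (ln2 * ln2);
        return (int) Math.ceil(m);
    }

    /** k = (m / n) * ln 2, rounded to the nearest integer and at least 1. */
    static int nbHash(int vectorSize, int expectedKeys) {
        double k = (double) vectorSize / expectedKeys * Math.log(2);
        return Math.max(1, (int) Math.round(k));
    }

    public static void main(String[] args) {
        int m = vectorSize(1000, 0.01);
        int k = nbHash(m, 1000);
        System.out.println("vectorSize=" + m + ", nbHash=" + k);
    }
}
```

The resulting values would then be passed as the vectorSize and nbHash arguments. These formulas describe a plain Bloom filter's false positive rate; they say nothing about counter overflow, which is governed by the bucket size discussed under approximateCount.
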
| Modifier and Type | Method and Description |
|---|---|
| void | add(org.apache.hadoop.util.bloom.Key key): Adds a key to this filter. |
| void | and(org.apache.hadoop.util.bloom.Filter filter): Performs a logical AND between this filter and a specified filter. |
| int | approximateCount(org.apache.hadoop.util.bloom.Key key): This method calculates an approximate count of the key, i.e. how many times the key was added to the filter. |
| void | delete(org.apache.hadoop.util.bloom.Key key): Removes a specified key from this counting Bloom filter. |
| boolean | membershipTest(org.apache.hadoop.util.bloom.Key key): Determines whether a specified key belongs to this filter. |
| void | not(): Performs a logical NOT on this filter. |
| void | or(org.apache.hadoop.util.bloom.Filter filter): Performs a logical OR between this filter and a specified filter. |
| void | readFields(DataInput in): Deserialize the fields of this object from in. |
| String | toString() |
| void | write(DataOutput out): Serialize the fields of this object to out. |
| void | xor(org.apache.hadoop.util.bloom.Filter filter): Performs a logical XOR between this filter and a specified filter. |

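
The approximateCount method and its 15-insert limit (see the NOTE in the method detail) can be illustrated with a small sketch. It assumes each bucket is a saturating counter that holds values 0 to 15, and that the estimate is the minimum bucket value over the key's positions. Both are assumptions about a typical counting filter, and the names ToyBucketFilter, BUCKET_MAX, and position() are hypothetical; this is not Hadoop's code.

```java
/** Toy filter with saturating buckets limited to 0..15 (assumed 4-bit size). */
class ToyBucketFilter {
    static final int BUCKET_MAX = 15; // largest value a 4-bit bucket can hold

    private final byte[] buckets;
    private final int nbHash; // assumed to be at least 1

    ToyBucketFilter(int vectorSize, int nbHash) {
        this.buckets = new byte[vectorSize];
        this.nbHash = nbHash;
    }

    // Hypothetical hash scheme: double hashing over String.hashCode().
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // force an odd step
        return Math.floorMod(h1 + i * h2, buckets.length);
    }

    /** Increments the key's buckets, but never past BUCKET_MAX. */
    void add(String key) {
        for (int i = 0; i < nbHash; i++) {
            int p = position(key, i);
            if (buckets[p] < BUCKET_MAX) {
                buckets[p]++;
            }
        }
    }

    /** Estimates the count as the smallest bucket value among the key's positions. */
    int approximateCount(String key) {
        int min = Integer.MAX_VALUE;
        for (int i = 0; i < nbHash; i++) {
            min = Math.min(min, buckets[position(key, i)]);
        }
        return min;
    }

    public static void main(String[] args) {
        ToyBucketFilter filter = new ToyBucketFilter(64, 3);
        for (int i = 0; i < 3; i++) {
            filter.add("page");
        }
        System.out.println(filter.approximateCount("page")); // 3
        for (int i = 0; i < 20; i++) {
            filter.add("page");
        }
        System.out.println(filter.approximateCount("page")); // 15 (saturated)
    }
}
```

Once a bucket saturates, the filter can no longer tell 15 inserts from 23, which is why the documentation restricts use to small counts. Other keys that share a saturated bucket also see an inflated value there.
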
public CountingBloomFilter()
Default constructor; use with readFields.

public CountingBloomFilter(int vectorSize,
int nbHash,
int hashType)
Constructor.
Parameters:
vectorSize - The vector size of this filter.
nbHash - The number of hash functions to consider.
hashType - The type of the hashing function (see Hash).

public void add(org.apache.hadoop.util.bloom.Key key)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Adds a key to this filter.
Specified by:
add in class org.apache.hadoop.util.bloom.Filter
Parameters:
key - The key to add.

public void delete(org.apache.hadoop.util.bloom.Key key)
Removes a specified key from this counting Bloom filter.
Invariant: nothing happens if the specified key does not belong to this counting Bloom filter.
Parameters:
key - The key to remove.

public void and(org.apache.hadoop.util.bloom.Filter filter)
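Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical AND between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by:
and in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to AND with.

public boolean membershipTest(org.apache.hadoop.util.bloom.Key key)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Determines whether a specified key belongs to this filter.
Specified by:
membershipTest in class org.apache.hadoop.util.bloom.Filter
Parameters:
key - The key to test.

public int approximateCount(org.apache.hadoop.util.bloom.Key key)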
This method calculates an approximate count of the key, i.e. how many times the key was added to the filter. This allows the filter to be used as an approximate key -> count map.
NOTE: due to the bucket size of this filter, inserting the same key more than 15 times will cause an overflow at all filter positions associated with this key, and it will significantly increase the error rate for this and other keys. For this reason the filter can only be used to store small count values 0 <= N << 15.
Parameters:
key - key to be tested
Returns:
a value v such that v == count with probability equal to the error rate of this filter, and v > count otherwise. Additionally, if the filter experienced an underflow as a result of a delete(Key) operation, the return value may be lower than the count, with the probability of the false negative rate of such a filter.

public void not()
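Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical NOT on this filter.
The result is assigned to this filter.
Specified by:
not in class org.apache.hadoop.util.bloom.Filter

public void or(org.apache.hadoop.util.bloom.Filter filter)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical OR between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by:
or in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to OR with.

public void xor(org.apache.hadoop.util.bloom.Filter filter)
Description copied from class: org.apache.hadoop.util.bloom.Filter
Performs a logical XOR between this filter and a specified filter.
Invariant: The result is assigned to this filter.
Specified by:
xor in class org.apache.hadoop.util.bloom.Filter
Parameters:
filter - The filter to XOR with.

public void write(DataOutput out) throws IOException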
Description copied from interface: Writable
Serialize the fields of this object to out.
Specified by:
write in interface Writable
Overrides:
write in class org.apache.hadoop.util.bloom.Filter
Parameters:
out - DataOutput to serialize this object into.
Throws:
IOException

public void readFields(DataInput in) throws IOException
Description copied from interface: Writable
Deserialize the fields of this object from in.
For efficiency, implementations should attempt to re-use storage in the existing object where possible.
Specified by:
readFields in interface Writable
Overrides:
readFields in class org.apache.hadoop.util.bloom.Filter
Parameters:
in - DataInput to deserialize this object from.
Throws:
IOException

Copyright © 2022 Apache Software Foundation. All rights reserved.