Package org.apache.lucene.facet
Class RandomSamplingFacetsCollector
- java.lang.Object
- org.apache.lucene.search.SimpleCollector
- org.apache.lucene.facet.FacetsCollector
- org.apache.lucene.facet.RandomSamplingFacetsCollector
All Implemented Interfaces:
Collector, LeafCollector
public class RandomSamplingFacetsCollector extends FacetsCollector
Collects hits for subsequent faceting, using sampling if needed. Once you've run a search and collected hits into this collector, instantiate one of the Facets subclasses to do the facet counting. Note that this collector does not collect the scores of matching docs (i.e. FacetsCollector.MatchingDocs.scores is null).
If you require the original set of hits, you can call getOriginalMatchingDocs(). Also, since the counts of the top facets are based on the sampled set, you can amortize the counts by calling amortizeFacetCounts(org.apache.lucene.facet.FacetResult, org.apache.lucene.facet.FacetsConfig, org.apache.lucene.search.IndexSearcher).
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.facet.FacetsCollector
FacetsCollector.MatchingDocs
-
-
Constructor Summary
Constructors
RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
-
Method Summary
FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher)
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method.
static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed)
Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
List<FacetsCollector.MatchingDocs> getMatchingDocs()
Returns the sampled list of the matching documents.
List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
Returns the original matching documents.
double getSamplingRate()
Returns the sampling rate that was used.
Methods inherited from class org.apache.lucene.facet.FacetsCollector
collect, doSetNextReader, finish, getKeepScores, scoreMode, search, search, search, searchAfter, searchAfter, searchAfter, setScorer
-
Methods inherited from class org.apache.lucene.search.SimpleCollector
getLeafCollector
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.search.LeafCollector
collect, competitiveIterator
-
-
-
-
Constructor Detail
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize)
Constructor with the given sample size and default seed.
See Also:
RandomSamplingFacetsCollector(int, long)
-
RandomSamplingFacetsCollector
public RandomSamplingFacetsCollector(int sampleSize, long seed)
Constructor with the given sample size and seed.
Parameters:
sampleSize - The preferred sample size. If the number of hits is greater than the sample size, sampling is done using a sampling ratio of sampleSize / totalN, where totalN is the total number of hits. For example, 1000 hits with a sample size of 10 gives a sampling ratio of 0.01. If the number of hits is lower, no sampling is done at all.
seed - The random seed. If 0, a seed will be chosen for you.
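The ratio described for sampleSize can be reproduced with plain arithmetic. The following is a minimal sketch, not Lucene's code: the class SamplingRatioSketch and its method samplingRatio are hypothetical names, and treating the case where the number of hits equals the sample size as "no sampling" is an assumption.

```java
// Hypothetical helper, not part of Lucene: mirrors the arithmetic described above.
final class SamplingRatioSketch {

    /**
     * Returns the fraction of hits to keep.
     * totalHits:  number of hits the search produced
     * sampleSize: preferred sample size passed to the constructor
     */
    static double samplingRatio(int totalHits, int sampleSize) {
        if (totalHits < 0 || sampleSize <= 0) {
            throw new IllegalArgumentException("totalHits must be >= 0 and sampleSize must be > 0");
        }
        // Assumption: no sampling when the hits do not exceed the sample size.
        if (totalHits <= sampleSize) {
            return 1.0;
        }
        return (double) sampleSize / totalHits;
    }

    public static void main(String[] args) {
        // The example from the parameter description: 1000 hits, sample size 10.
        System.out.println(samplingRatio(1000, 10));
        // Fewer hits than the sample size: every hit is kept.
        System.out.println(samplingRatio(5, 10));
    }
}
```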
-
-
Method Detail
-
getMatchingDocs
public List<FacetsCollector.MatchingDocs> getMatchingDocs()
Returns the sampled list of the matching documents. Note that a FacetsCollector.MatchingDocs instance is returned per segment, even if no hits from that segment are included in the sampled set.
Note: One or more of the MatchingDocs might be empty (not containing any hits) as a result of sampling.
Note: MatchingDocs.totalHits is copied from the original MatchingDocs, and scores is set to null.
Overrides:
getMatchingDocs in class FacetsCollector
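To illustrate why the sampled list keeps one entry per segment and why some entries can be empty, here is a small sketch using plain Java collections. It is not Lucene's sampling algorithm: it swaps in simple per-document Bernoulli sampling with java.util.Random, and the class PerSegmentSamplingSketch, the method sample, and the use of integer lists in place of MatchingDocs are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Hypothetical illustration, not Lucene's sampling algorithm.
final class PerSegmentSamplingSketch {

    /**
     * Keeps each doc ID with probability `ratio`, segment by segment.
     * Always returns one list per input segment; a list may be empty.
     */
    static List<List<Integer>> sample(List<List<Integer>> hitsPerSegment, double ratio, long seed) {
        Random random = new Random(seed); // a fixed seed makes the sample repeatable
        List<List<Integer>> sampled = new ArrayList<>();
        for (List<Integer> segmentHits : hitsPerSegment) {
            List<Integer> kept = new ArrayList<>();
            for (int docId : segmentHits) {
                if (random.nextDouble() < ratio) {
                    kept.add(docId);
                }
            }
            sampled.add(kept); // added even when empty, so segments stay aligned
        }
        return sampled;
    }

    public static void main(String[] args) {
        List<List<Integer>> hits = Arrays.asList(
                Arrays.asList(0, 3, 7), Arrays.asList(1), Arrays.asList(2, 4, 5, 9));
        List<List<Integer>> sampled = sample(hits, 0.25, 42L);
        System.out.println(hits.size() + " segments in, " + sampled.size() + " segments out");
    }
}
```

Callers that iterate over the sampled list should therefore be prepared for segments that contribute no hits.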
-
getOriginalMatchingDocs
public List<FacetsCollector.MatchingDocs> getOriginalMatchingDocs()
Returns the original matching documents.
-
amortizeFacetCounts
public FacetResult amortizeFacetCounts(FacetResult res, FacetsConfig config, IndexSearcher searcher) throws IOException
Note: if you use a counting Facets implementation, you can amortize the sampled counts by calling this method. Uses the FacetsConfig and the IndexSearcher to determine the upper bound for each facet value.
Throws:
IOException
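The idea of amortizing a sampled count against an upper bound can be sketched with plain arithmetic. This is an illustration under stated assumptions, not the library's implementation: the class AmortizeSketch and the method amortize are hypothetical, scaling by dividing the sampled count by the sampling rate is an assumption, and the upper bound is passed in directly rather than looked up through a FacetsConfig and IndexSearcher.

```java
// Hypothetical helper, not Lucene's implementation.
final class AmortizeSketch {

    /**
     * Scales a count observed on the sampled set back toward the full result set.
     * sampledCount: count reported by a counting Facets implementation on the sample
     * samplingRate: a value in (0, 1], such as the one returned by getSamplingRate()
     * upperBound:   largest count the facet value could possibly have
     */
    static long amortize(long sampledCount, double samplingRate, long upperBound) {
        if (samplingRate <= 0.0 || samplingRate > 1.0) {
            throw new IllegalArgumentException("samplingRate must be in (0, 1]");
        }
        // Assumption: extrapolate linearly, then cap at the upper bound.
        long scaled = Math.round(sampledCount / samplingRate);
        return Math.min(scaled, upperBound);
    }

    public static void main(String[] args) {
        System.out.println(amortize(5, 0.25, 1000)); // scaled value is below the bound
        System.out.println(amortize(5, 0.25, 12));   // scaled value is capped by the bound
    }
}
```

Capping matters because extrapolating from a small sample can overshoot the number of documents that actually carry the facet value.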
-
getSamplingRate
public double getSamplingRate()
Returns the sampling rate that was used.
-
createManager
public static CollectorManager<RandomSamplingFacetsCollector,RandomSamplingFacetsCollector> createManager(int sampleSize, long seed)
Creates a CollectorManager for concurrent random sampling through RandomSamplingFacetsCollector.
-
-