Class MultiTermQuery
- java.lang.Object
  - org.apache.lucene.search.Query
    - org.apache.lucene.search.MultiTermQuery
- Direct Known Subclasses:
AutomatonQuery, FuzzyQuery, TermInSetQuery
public abstract class MultiTermQuery extends Query
An abstract Query that matches documents containing a subset of terms provided by a FilteredTermsEnum enumeration.
This query cannot be used directly; you must subclass it and define getTermsEnum(Terms, AttributeSource) to provide a FilteredTermsEnum that iterates through the terms to be matched.
NOTE: if MultiTermQuery.RewriteMethod is either CONSTANT_SCORE_BOOLEAN_REWRITE or SCORING_BOOLEAN_REWRITE, you may encounter an IndexSearcher.TooManyClauses exception during searching, which happens when the number of terms to be searched exceeds IndexSearcher.getMaxClauseCount(). Setting MultiTermQuery.RewriteMethod to CONSTANT_SCORE_BLENDED_REWRITE or CONSTANT_SCORE_REWRITE prevents this.
The recommended rewrite method is CONSTANT_SCORE_BLENDED_REWRITE: it doesn't spend CPU computing unhelpful scores, and is generally the most performant rewrite method for a given query. If you need scoring (like FuzzyQuery), use MultiTermQuery.TopTermsScoringBooleanQueryRewrite, which uses a priority queue to collect only competitive terms and therefore does not hit this limitation.
Note that org.apache.lucene.queryparser.classic.QueryParser produces MultiTermQueries using CONSTANT_SCORE_REWRITE by default.
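The difference between a boolean rewrite that can exceed the clause limit and a top-terms rewrite that keeps only competitive terms can be illustrated with a small toy model. This is a sketch in plain Java, not Lucene code: `TopTermsSketch`, `ScoredTerm`, `booleanRewrite`, and `topTermsRewrite` are hypothetical names, and the thrown `IllegalStateException` merely stands in for IndexSearcher.TooManyClauses.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Toy model only: every type here is a hypothetical stand-in, not a Lucene class.
final class TopTermsSketch {
    /** An expanded term together with the boost assigned to it during enumeration. */
    static final class ScoredTerm {
        final String text;
        final double boost;

        ScoredTerm(String text, double boost) {
            this.text = text;
            this.boost = boost;
        }
    }

    private static final Comparator<ScoredTerm> BY_BOOST =
        Comparator.comparingDouble(t -> t.boost);

    /** Models a boolean rewrite: one clause per term, failing once the limit is exceeded. */
    static List<ScoredTerm> booleanRewrite(List<ScoredTerm> terms, int maxClauseCount) {
        if (terms.size() > maxClauseCount) {
            // Stand-in for the "too many clauses" failure described above.
            throw new IllegalStateException(
                "too many clauses: " + terms.size() + " > " + maxClauseCount);
        }
        return new ArrayList<>(terms);
    }

    /** Models a top-terms rewrite: keep only the {@code size} highest-boost terms. */
    static List<ScoredTerm> topTermsRewrite(List<ScoredTerm> terms, int size) {
        // Min-heap on boost, so the least competitive retained term sits at the head.
        PriorityQueue<ScoredTerm> queue = new PriorityQueue<>(BY_BOOST);
        for (ScoredTerm term : terms) {
            queue.offer(term);
            if (queue.size() > size) {
                queue.poll(); // evict the weakest term seen so far
            }
        }
        List<ScoredTerm> kept = new ArrayList<>(queue);
        kept.sort(BY_BOOST.reversed()); // strongest first
        return kept;
    }
}
```

Because the queue never holds more than `size` entries, the number of resulting clauses is bounded regardless of how many terms the enumeration produces, which is the property the note above relies on.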
-
-
Nested Class Summary
Nested Classes
Modifier and Type | Class | Description
static class | MultiTermQuery.RewriteMethod | Abstract class that defines how the query is rewritten.
static class | MultiTermQuery.TopTermsBlendedFreqScoringRewrite | A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, but adjusts the frequencies used for scoring to be blended across the terms; otherwise the rarest term typically ranks highest (often not useful, e.g. in the set of expanded terms in a FuzzyQuery).
static class | MultiTermQuery.TopTermsBoostOnlyBooleanQueryRewrite | A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, but the scores are only computed as the boost.
static class | MultiTermQuery.TopTermsScoringBooleanQueryRewrite | A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
-
Field Summary
Fields
Modifier and Type | Field | Description
static MultiTermQuery.RewriteMethod | CONSTANT_SCORE_BLENDED_REWRITE | A rewrite method where documents are assigned a constant score equal to the query's boost.
static MultiTermQuery.RewriteMethod | CONSTANT_SCORE_BOOLEAN_REWRITE | Like SCORING_BOOLEAN_REWRITE except scores are not computed.
static MultiTermQuery.RewriteMethod | CONSTANT_SCORE_REWRITE | A rewrite method that first creates a private Filter, by visiting each term in sequence and marking all docs for that term.
static MultiTermQuery.RewriteMethod | DOC_VALUES_REWRITE | A rewrite method that uses DocValuesType.SORTED / DocValuesType.SORTED_SET doc values to find matching docs through a post-filtering type approach.
protected String | field |
protected MultiTermQuery.RewriteMethod | rewriteMethod |
static MultiTermQuery.RewriteMethod | SCORING_BOOLEAN_REWRITE | A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
-
Constructor Summary
Constructors
Constructor | Description
MultiTermQuery(String field, MultiTermQuery.RewriteMethod rewriteMethod) | Constructs a query matching terms that cannot be represented with a single Term.
-
Method Summary
Modifier and Type | Method | Description
boolean | equals(Object other) | Override and implement query instance equivalence properly in a subclass.
String | getField() | Returns the field name for this query.
MultiTermQuery.RewriteMethod | getRewriteMethod() |
long | getTermsCount() | Return the number of unique terms contained in this query, if known up-front.
TermsEnum | getTermsEnum(Terms terms) | Constructs an enumeration that expands the pattern term.
protected abstract TermsEnum | getTermsEnum(Terms terms, AttributeSource atts) | Construct the enumeration to be used, expanding the pattern term.
int | hashCode() | Override and implement query hash code properly in a subclass.
Query | rewrite(IndexSearcher indexSearcher) | To rewrite to a simpler form, instead return a simpler enum from getTermsEnum(Terms, AttributeSource).
void | setRewriteMethod(MultiTermQuery.RewriteMethod method) | Deprecated. Set this using a constructor instead.
-
Methods inherited from class org.apache.lucene.search.Query
classHash, createWeight, rewrite, sameClassAs, toString, toString, visit
-
-
-
-
Field Detail
-
field
protected final String field
-
rewriteMethod
protected MultiTermQuery.RewriteMethod rewriteMethod
-
CONSTANT_SCORE_BLENDED_REWRITE
public static final MultiTermQuery.RewriteMethod CONSTANT_SCORE_BLENDED_REWRITE
A rewrite method where documents are assigned a constant score equal to the query's boost. Maintains a boolean query-like implementation over the most costly terms while pre-processing the less costly terms into a filter bitset. Enforces an upper limit on the number of terms allowed in the boolean query-like implementation.
This method aims to balance the benefits of both CONSTANT_SCORE_BOOLEAN_REWRITE and CONSTANT_SCORE_REWRITE by enabling skipping and early termination over costly terms while limiting the overhead of a BooleanQuery with many terms. It also ensures you cannot hit IndexSearcher.TooManyClauses. For some use cases with all low-cost terms, CONSTANT_SCORE_REWRITE may be more performant, while for some use cases with all high-cost terms, CONSTANT_SCORE_BOOLEAN_REWRITE may be better.
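The split described above, a bounded number of costly terms kept as individual clauses and the remaining terms merged into one bitset, might look roughly like the following toy model. It is a plain-Java sketch and not Lucene's implementation: `BlendedRewriteSketch`, `TermPostings`, `Plan`, and the `maxClauseTerms` parameter are hypothetical, and cost is assumed here to be simply the number of postings for a term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.Comparator;
import java.util.List;

// Toy model only: every type here is a hypothetical stand-in, not a Lucene class.
final class BlendedRewriteSketch {
    /** A term with its postings (sorted doc IDs); cost is assumed to be the postings count. */
    static final class TermPostings {
        final String term;
        final int[] docs;

        TermPostings(String term, int... docs) {
            this.term = term;
            this.docs = docs;
        }

        int cost() {
            return docs.length;
        }
    }

    /** Result of the split: a few costly terms kept as clauses, the rest merged into a bitset. */
    static final class Plan {
        final List<TermPostings> clauseTerms = new ArrayList<>();
        final BitSet filter = new BitSet();
    }

    static Plan plan(List<TermPostings> terms, int maxClauseTerms) {
        List<TermPostings> byCostDesc = new ArrayList<>(terms);
        byCostDesc.sort(Comparator.comparingInt(TermPostings::cost).reversed());
        Plan plan = new Plan();
        for (int i = 0; i < byCostDesc.size(); i++) {
            TermPostings t = byCostDesc.get(i);
            if (i < maxClauseTerms) {
                plan.clauseTerms.add(t); // costly term: kept separate, bounded in number
            } else {
                for (int doc : t.docs) {
                    plan.filter.set(doc); // cheap term: pre-processed into the shared bitset
                }
            }
        }
        return plan;
    }

    /** A doc matches if it is in the bitset or in the postings of any clause term. */
    static boolean matches(Plan plan, int doc) {
        if (plan.filter.get(doc)) {
            return true;
        }
        for (TermPostings t : plan.clauseTerms) {
            if (Arrays.binarySearch(t.docs, doc) >= 0) {
                return true;
            }
        }
        return false;
    }
}
```

Since `maxClauseTerms` caps the clause list, the plan cannot grow past a fixed number of clauses however many terms are supplied; in this model that cap is what rules out a clause-limit failure.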
-
CONSTANT_SCORE_REWRITE
public static final MultiTermQuery.RewriteMethod CONSTANT_SCORE_REWRITE
A rewrite method that first creates a private Filter, by visiting each term in sequence and marking all docs for that term. Matching documents are assigned a constant score equal to the query's boost.
This method is faster than the BooleanQuery rewrite methods when the number of matched terms or matched documents is non-trivial. Also, it will never hit an errant IndexSearcher.TooManyClauses exception.
-
DOC_VALUES_REWRITE
public static final MultiTermQuery.RewriteMethod DOC_VALUES_REWRITE
A rewrite method that uses DocValuesType.SORTED / DocValuesType.SORTED_SET doc values to find matching docs through a post-filtering type approach. This will be very slow if used in isolation, but will likely be the most performant option when combined with a sparse query clause. All matching docs are assigned a constant score equal to the query's boost.
If you don't have doc values indexed, see the other rewrite methods that rely on postings alone (e.g., CONSTANT_SCORE_BLENDED_REWRITE, SCORING_BOOLEAN_REWRITE, etc., depending on scoring needs).
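To see why a post-filtering approach is slow in isolation but cheap behind a sparse clause, consider this toy model in plain Java. It is a sketch under stated assumptions, not Lucene code: `DocValuesPostFilterSketch` and `postFilter` are hypothetical, and the `ordByDoc` array stands in for a single-valued, SORTED-style column that maps each document to a term ordinal.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: every name here is a hypothetical stand-in, not a Lucene class.
final class DocValuesPostFilterSketch {
    /**
     * Checks each candidate document against the set of accepted term ordinals.
     *
     * @param leadDocs     doc IDs produced by another (ideally sparse) clause, in increasing order
     * @param ordByDoc     per-document term ordinal, or -1 when the document has no value
     * @param matchingOrds ordinals of the terms accepted by the multi-term query
     */
    static List<Integer> postFilter(int[] leadDocs, int[] ordByDoc, BitSet matchingOrds) {
        List<Integer> hits = new ArrayList<>();
        for (int doc : leadDocs) {
            int ord = ordByDoc[doc];
            if (ord >= 0 && matchingOrds.get(ord)) {
                hits.add(doc); // one lookup per candidate, no postings traversal
            }
        }
        return hits;
    }
}
```

The work done is proportional to the number of candidates in `leadDocs`. With a sparse lead clause only a handful of lookups are needed, whereas with no other clause every document in the index must be checked, which matches the trade-off described above.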
-
SCORING_BOOLEAN_REWRITE
public static final MultiTermQuery.RewriteMethod SCORING_BOOLEAN_REWRITE
A rewrite method that first translates each term into a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query. Note that typically such scores are meaningless to the user, and require non-trivial CPU to compute, so it's almost always better to use CONSTANT_SCORE_REWRITE instead.
NOTE: This rewrite method will hit IndexSearcher.TooManyClauses if the number of terms exceeds IndexSearcher.getMaxClauseCount().
-
CONSTANT_SCORE_BOOLEAN_REWRITE
public static final MultiTermQuery.RewriteMethod CONSTANT_SCORE_BOOLEAN_REWRITE
Like SCORING_BOOLEAN_REWRITE except scores are not computed. Instead, each matching document receives a constant score equal to the query's boost.
NOTE: This rewrite method will hit IndexSearcher.TooManyClauses if the number of terms exceeds IndexSearcher.getMaxClauseCount().
-
-
Constructor Detail
-
MultiTermQuery
public MultiTermQuery(String field, MultiTermQuery.RewriteMethod rewriteMethod)
Constructs a query matching terms that cannot be represented with a single Term.
-
-
Method Detail
-
getField
public final String getField()
Returns the field name for this query.
-
getTermsEnum
protected abstract TermsEnum getTermsEnum(Terms terms, AttributeSource atts) throws IOException
Construct the enumeration to be used, expanding the pattern term. This method should only be called if the field exists (i.e., implementations can assume the field does exist). This method should not return null (it should instead return TermsEnum.EMPTY if no terms match). The TermsEnum must already be positioned to the first matching term. The given AttributeSource is passed by the MultiTermQuery.RewriteMethod to share information between segments; for example, TopTermsRewrite uses it to share maximum competitive boosts.
- Throws:
IOException
-
getTermsEnum
public final TermsEnum getTermsEnum(Terms terms) throws IOException
Constructs an enumeration that expands the pattern term. This method should only be called if the field exists (i.e., implementations can assume the field does exist). This method never returns null. The returned TermsEnum is positioned to the first matching term.
- Throws:
IOException
-
getTermsCount
public long getTermsCount() throws IOException
Return the number of unique terms contained in this query, if known up-front. If not known, -1 will be returned.
- Throws:
IOException
-
rewrite
public final Query rewrite(IndexSearcher indexSearcher) throws IOException
To rewrite to a simpler form, instead return a simpler enum from getTermsEnum(Terms, AttributeSource). For example, to rewrite to a single term, return a SingleTermsEnum.
- Overrides:
rewrite in class Query
- Throws:
IOException
- See Also:
IndexSearcher.rewrite(Query)
-
getRewriteMethod
public MultiTermQuery.RewriteMethod getRewriteMethod()
-
setRewriteMethod
@Deprecated public void setRewriteMethod(MultiTermQuery.RewriteMethod method)
Deprecated. Set this using a constructor instead.
Sets the rewrite method to be used when executing the query. You can use one of the core rewrite methods, or implement your own subclass of MultiTermQuery.RewriteMethod.
-
hashCode
public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
- Specified by:
hashCode in class Query
- See Also:
Query.equals(Object)
-
equals
public boolean equals(Object other)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
- Specified by:
equals in class Query
- See Also:
Query.sameClassAs(Object), Query.classHash()
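The equality contract described above (same class, identical document-filtering properties) can be sketched with a small stand-alone example. This is plain Java rather than Lucene code: `SketchQuery` and `SketchPrefixQuery` are hypothetical classes, and the two helper methods only imitate the role of the utility methods referenced above; their bodies here are assumptions, not Lucene's implementation.

```java
import java.util.Objects;

// Toy model only: these classes are hypothetical stand-ins, not Lucene's Query hierarchy.
abstract class SketchQuery {
    /** Helper for the repetitive "is it the same class?" check. */
    protected final boolean sameClassAs(Object other) {
        return other != null && getClass() == other.getClass();
    }

    /** Helper providing a per-class hash seed, so different query types hash differently. */
    protected final int classHash() {
        return getClass().getName().hashCode();
    }
}

final class SketchPrefixQuery extends SketchQuery {
    private final String field;
    private final String prefix;

    SketchPrefixQuery(String field, String prefix) {
        this.field = Objects.requireNonNull(field);
        this.prefix = Objects.requireNonNull(prefix);
    }

    @Override
    public boolean equals(Object other) {
        // Equal only if the class matches and every document-filtering property matches.
        if (!sameClassAs(other)) {
            return false;
        }
        SketchPrefixQuery that = (SketchPrefixQuery) other;
        return field.equals(that.field) && prefix.equals(that.prefix);
    }

    @Override
    public int hashCode() {
        // Combine the class seed with the same properties that equals() compares.
        return 31 * classHash() + Objects.hash(field, prefix);
    }
}
```

Keeping `equals` and `hashCode` derived from the same set of properties is what lets a cache keyed on query instances recognize two separately constructed but equivalent queries.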
-
-