Package org.apache.lucene.search.spans
The calculus of spans.
A span is a <doc,startPosition,endPosition> tuple.
The following span query operators are implemented:
- A SpanTermQuery matches all spans containing a particular Term.
- A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQueries) and inter-phrase proximity (when constructed from other SpanNearQueries).
- A SpanOrQuery merges spans from a number of other SpanQueries.
- A SpanNotQuery removes spans matching one SpanQuery which overlap another. This can be used, e.g., to implement within-paragraph search.
- A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.
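To make the set-like behaviour of these operators concrete, here is a minimal toy model in plain Java. It is not Lucene's implementation: `SpanCalculusSketch`, `Span`, `first`, `or`, and `not` are hypothetical names introduced for illustration, spans are held in in-memory lists, and the end position is assumed to be exclusive.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Toy model of span operators; hypothetical names, not Lucene classes. */
class SpanCalculusSketch {

    /** A <doc,startPosition,endPosition> tuple; end is assumed exclusive. */
    static final class Span {
        final int doc, start, end;
        Span(int doc, int start, int end) { this.doc = doc; this.start = start; this.end = end; }
        @Override public String toString() { return "<" + doc + "," + start + "," + end + ">"; }
    }

    /** In the spirit of SpanFirstQuery: keep spans whose end position is less than n. */
    static List<Span> first(List<Span> spans, int n) {
        List<Span> out = new ArrayList<>();
        for (Span s : spans) {
            if (s.end < n) out.add(s);
        }
        return out;
    }

    /** In the spirit of SpanOrQuery: merge spans from several inputs, ordered by doc, start, end. */
    static List<Span> or(List<List<Span>> inputs) {
        List<Span> out = new ArrayList<>();
        for (List<Span> in : inputs) out.addAll(in);
        out.sort(Comparator.comparingInt((Span s) -> s.doc)
                .thenComparingInt(s -> s.start)
                .thenComparingInt(s -> s.end));
        return out;
    }

    /** In the spirit of SpanNotQuery: drop included spans that overlap an excluded span in the same doc. */
    static List<Span> not(List<Span> include, List<Span> exclude) {
        List<Span> out = new ArrayList<>();
        for (Span s : include) {
            boolean overlaps = false;
            for (Span x : exclude) {
                if (s.doc == x.doc && s.start < x.end && x.start < s.end) {
                    overlaps = true;
                    break;
                }
            }
            if (!overlaps) out.add(s);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> term = Arrays.asList(new Span(0, 2, 3), new Span(0, 50, 51), new Span(1, 4, 5));
        List<Span> other = Arrays.asList(new Span(0, 7, 8));
        System.out.println(first(term, 10));                 // [<0,2,3>, <1,4,5>]
        System.out.println(or(Arrays.asList(term, other)));  // [<0,2,3>, <0,7,8>, <0,50,51>, <1,4,5>]

        // Within-paragraph search: drop wide matches that cross a paragraph-break marker.
        List<Span> wide = Arrays.asList(new Span(0, 2, 8), new Span(0, 7, 51));
        List<Span> breaks = Arrays.asList(new Span(0, 40, 41));
        System.out.println(not(wide, breaks));               // [<0,2,8>]
    }
}
```

The last call illustrates the within-paragraph use of SpanNotQuery: if paragraph breaks are indexed as a marker term, any match that overlaps a marker spans two paragraphs and is removed.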
For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:
SpanQuery john = new SpanTermQuery(new Term("content", "john"));
SpanQuery kerry = new SpanTermQuery(new Term("content", "kerry"));
SpanQuery george = new SpanTermQuery(new Term("content", "george"));
SpanQuery bush = new SpanTermQuery(new Term("content", "bush"));
SpanQuery johnKerry = new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);
SpanQuery georgeBush = new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);
SpanQuery johnKerryNearGeorgeBush = new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);
SpanQuery johnKerryNearGeorgeBushAtStart = new SpanFirstQuery(johnKerryNearGeorgeBush, 100);
Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:
BooleanQuery query = new BooleanQuery();
query.add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST);
query.add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST);
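The slop and ordering arguments in the "John Kerry" near "George Bush" example can be worked through with a small toy model. This is only a sketch under stated assumptions, not how SpanNearQuery is implemented: `SpanNearSketch`, `Span`, and `near` are hypothetical names, end positions are assumed exclusive, slop is taken as the number of positions between two non-overlapping spans, and overlapping pairs are simply skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of two-clause proximity matching; hypothetical names, not Lucene classes. */
class SpanNearSketch {

    /** A <doc,startPosition,endPosition> tuple; end is assumed exclusive. */
    static final class Span {
        final int doc, start, end;
        Span(int doc, int start, int end) { this.doc = doc; this.start = start; this.end = end; }
        @Override public String toString() { return "<" + doc + "," + start + "," + end + ">"; }
    }

    /**
     * Pairs each left span with each right span in the same document and keeps
     * the pair when the gap between them is at most slop. With inOrder set,
     * the left span must come first.
     */
    static List<Span> near(List<Span> left, List<Span> right, int slop, boolean inOrder) {
        List<Span> out = new ArrayList<>();
        for (Span a : left) {
            for (Span b : right) {
                if (a.doc != b.doc) continue;
                int gap;
                if (a.end <= b.start) {
                    gap = b.start - a.end;            // a precedes b
                } else if (!inOrder && b.end <= a.start) {
                    gap = a.start - b.end;            // b precedes a, allowed when unordered
                } else {
                    continue;                         // wrong order or overlapping: skipped in this sketch
                }
                if (gap <= slop) {
                    out.add(new Span(a.doc, Math.min(a.start, b.start), Math.max(a.end, b.end)));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> john = Arrays.asList(new Span(0, 5, 6));
        List<Span> kerry = Arrays.asList(new Span(0, 6, 7), new Span(0, 20, 21));
        List<Span> george = Arrays.asList(new Span(0, 12, 13));
        List<Span> bush = Arrays.asList(new Span(0, 13, 14));

        List<Span> johnKerry = near(john, kerry, 0, true);       // adjacent terms behave like a phrase
        List<Span> georgeBush = near(george, bush, 0, true);
        System.out.println(johnKerry);                           // [<0,5,7>]
        System.out.println(near(johnKerry, georgeBush, 10, false)); // [<0,5,14>]
        System.out.println(near(georgeBush, johnKerry, 10, true));  // []
    }
}
```

A slop of 0 with ordering enforced yields phrase-like matching, while the outer unordered call accepts the two phrases in either order as long as at most ten positions separate them.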
Class Summary
- FieldMaskingSpanQuery: Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field.
- NearSpansOrdered: A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them.
- NearSpansUnordered: Similar to NearSpansOrdered, but for the unordered case.
- SpanFirstQuery: Matches spans near the beginning of a field.
- SpanMultiTermQueryWrapper<Q extends MultiTermQuery>: Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes.
- SpanMultiTermQueryWrapper.SpanRewriteMethod: Abstract class that defines how the query is rewritten.
- SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite: A rewrite method that first translates each term into a SpanTermQuery in a BooleanClause.Occur.SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query.
- SpanNearPayloadCheckQuery: Only return those matches that have a specific payload at the given position.
- SpanNearQuery: Matches spans which are near one another.
- SpanNotQuery: Removes matches which overlap with another SpanQuery.
- SpanOrQuery: Matches the union of its clauses.
- SpanPayloadCheckQuery: Only return those matches that have a specific payload at the given position.
- SpanPositionCheckQuery: Base class for filtering a SpanQuery based on the position of a match.
- SpanPositionRangeQuery: Checks to see if the SpanPositionCheckQuery.getMatch() lies between a start and end position.
- SpanQuery: Base class for span-based queries.
- Spans: Expert: an enumeration of span matches.
- SpanScorer: Public for extension only.
- SpanTermQuery: Matches spans containing a term.
- SpanWeight: Expert-only.
- TermSpans: Expert: Public for extension only.
Enum Summary
- SpanPositionCheckQuery.AcceptStatus: Return value if the match should be accepted (YES), rejected (NO), or rejected and enumeration should advance to the next document (NO_AND_ADVANCE).
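The three AcceptStatus values can be illustrated with a toy position-range filter. This is a sketch under stated assumptions rather than Lucene's code: `AcceptStatusSketch`, `Span`, `accept`, and `filter` are hypothetical names, the enum is redeclared locally, and the input list is assumed to be sorted by document and then by start position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of position-based filtering; hypothetical names, not Lucene classes. */
class AcceptStatusSketch {

    /** Local stand-in for the three outcomes described above. */
    enum AcceptStatus { YES, NO, NO_AND_ADVANCE }

    /** A <doc,startPosition,endPosition> tuple. */
    static final class Span {
        final int doc, start, end;
        Span(int doc, int start, int end) { this.doc = doc; this.start = start; this.end = end; }
        @Override public String toString() { return "<" + doc + "," + start + "," + end + ">"; }
    }

    /** Accepts spans that lie between rangeStart and rangeEnd. */
    static AcceptStatus accept(Span s, int rangeStart, int rangeEnd) {
        if (s.start >= rangeEnd) {
            // Later spans in this document start even further right, so give up on the document.
            return AcceptStatus.NO_AND_ADVANCE;
        }
        if (s.start >= rangeStart && s.end <= rangeEnd) {
            return AcceptStatus.YES;
        }
        return AcceptStatus.NO;
    }

    /** Filters spans sorted by (doc, start), skipping the rest of a document on NO_AND_ADVANCE. */
    static List<Span> filter(List<Span> sorted, int rangeStart, int rangeEnd) {
        List<Span> out = new ArrayList<>();
        int i = 0;
        while (i < sorted.size()) {
            Span s = sorted.get(i);
            AcceptStatus status = accept(s, rangeStart, rangeEnd);
            if (status == AcceptStatus.YES) {
                out.add(s);
            }
            if (status == AcceptStatus.NO_AND_ADVANCE) {
                int doc = s.doc;
                while (i < sorted.size() && sorted.get(i).doc == doc) i++;  // jump to the next document
            } else {
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span(0, 1, 2), new Span(0, 8, 12), new Span(0, 30, 31),
                new Span(0, 40, 41), new Span(1, 2, 3));
        System.out.println(filter(spans, 0, 10));  // [<0,1,2>, <1,2,3>]
    }
}
```

The distinction between NO and NO_AND_ADVANCE is a matter of efficiency: NO rejects one candidate and keeps scanning the same document, whereas NO_AND_ADVANCE signals that no later span in that document can match.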