[Xapian-tickets] [Xapian] #700: Support Enquire::matching_terms_begin() without termlist table?

Xapian nobody at xapian.org
Thu Dec 24 05:35:17 GMT 2015


#700: Support Enquire::matching_terms_begin() without termlist table?
----------------------------------+-------------------
        Reporter:  olly           |      Owner:  olly
            Type:  defect         |     Status:  new
        Priority:  normal         |  Milestone:  1.4.x
       Component:  Backend-Glass  |    Version:
        Severity:  normal         |   Keywords:
      Blocked By:                 |   Blocking:
Operating System:  All            |
----------------------------------+-------------------
 ''(Split out of #181)''

 Currently `Enquire::matching_terms_begin()` uses the termlist of the
 document, comparing it to the terms in the query.  This means it doesn't
 work if the database has no termlist.  It's also another item to look up
 for each result, and comparing the two lists of terms isn't free.
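
 As a rough illustration of the comparison described above (not the actual
 Xapian implementation), the current approach amounts to intersecting two
 sorted term lists.  The function name and the use of plain
 `std::vector<std::string>` are assumptions made for this sketch:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical stand-in for the termlist-based approach.  Both inputs are
// assumed to be sorted; the result is simply their intersection.
std::vector<std::string>
matching_terms_via_termlist(const std::vector<std::string>& doc_termlist,
                            const std::vector<std::string>& query_terms)
{
    std::vector<std::string> result;
    std::set_intersection(doc_termlist.begin(), doc_termlist.end(),
                          query_terms.begin(), query_terms.end(),
                          std::back_inserter(result));
    return result;
}
```

 The document's termlist is a required input here, which is why this
 approach can't work for a database without a termlist table.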

 It's also arguably not quite correct in some cases, for example for this
 query:

 {{{
 A OR (B AND NOT C)
 }}}

 It'll report `A` and `B` as matching terms in a document containing all
 three terms, but perhaps only `A` should be reported in such a case, since
 the `B AND NOT C` subquery doesn't match this document and so `B` didn't
 contribute to the match.
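
 To make the stricter interpretation concrete, here is a minimal sketch of a
 query tree evaluator which only reports terms from subqueries that actually
 matched.  The `Node` type and the `make_term()`, `make_op()` and `eval()`
 helpers are hypothetical and are not part of the Xapian API:

```cpp
#include <memory>
#include <set>
#include <string>
#include <utility>

using TermSet = std::set<std::string>;

// Hypothetical query tree node for this sketch.
struct Node {
    enum Kind { TERM, OR, AND, AND_NOT } kind;
    std::string term;                 // used when kind == TERM
    std::shared_ptr<Node> lhs, rhs;   // used by the operators
};

std::shared_ptr<Node> make_term(const std::string& t) {
    return std::make_shared<Node>(Node{Node::TERM, t, nullptr, nullptr});
}

std::shared_ptr<Node> make_op(Node::Kind k, std::shared_ptr<Node> l,
                              std::shared_ptr<Node> r) {
    return std::make_shared<Node>(Node{k, "", std::move(l), std::move(r)});
}

// Returns true if the subquery matches a document containing the terms in
// `doc`.  Terms are added to `matched` only when the subquery they belong
// to matched; a failing subquery leaves `matched` untouched.
bool eval(const Node& n, const TermSet& doc, TermSet& matched) {
    switch (n.kind) {
    case Node::TERM:
        if (!doc.count(n.term)) return false;
        matched.insert(n.term);
        return true;
    case Node::OR: {
        bool l = eval(*n.lhs, doc, matched);
        bool r = eval(*n.rhs, doc, matched);
        return l || r;
    }
    case Node::AND: {
        TermSet tmp;
        if (!eval(*n.lhs, doc, tmp) || !eval(*n.rhs, doc, tmp)) return false;
        matched.insert(tmp.begin(), tmp.end());
        return true;
    }
    case Node::AND_NOT: {
        TermSet tmp, ignored;
        if (!eval(*n.lhs, doc, tmp)) return false;
        if (eval(*n.rhs, doc, ignored)) return false;  // excluded by the RHS
        matched.insert(tmp.begin(), tmp.end());
        return true;
    }
    }
    return false;
}
```

 For `A OR (B AND NOT C)` and a document containing `A`, `B` and `C`, this
 reports only `A`; for a document containing just `A` and `B` it reports
 both.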

 We could record the information about matching terms for each candidate
 entry in the proto-`MSet`, which would solve both of these issues.  The
 tricky part is doing this in a way which doesn't incur a significant space
 or time overhead during the match.  For example, a bitmap with one bit per
 query term is fairly space efficient.
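
 A minimal sketch of the bitmap idea, assuming each distinct query term is
 given an index and the query has at most 64 of them (a real implementation
 would need to handle larger queries).  The helper names are hypothetical:

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Record that the query term with index `term_index` (< 64) matched.
void mark_matched(std::uint64_t& bits, std::size_t term_index) {
    bits |= std::uint64_t(1) << term_index;
}

// Turn a bitmap back into term names, given the query terms in index order.
std::vector<std::string>
decode_matched(std::uint64_t bits,
               const std::vector<std::string>& query_terms) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < query_terms.size() && i < 64; ++i) {
        if (bits & (std::uint64_t(1) << i)) out.push_back(query_terms[i]);
    }
    return out;
}
```

 Under these assumptions the extra cost is eight bytes per candidate entry,
 and the matching terms can be produced later without consulting the
 termlist.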

 If we don't care about the corner cases of which terms match like the one
 above, we could also skip through the posting lists a second time to get
 this information.  More data to decode, but it's likely to already be in
 cache.
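
 The second pass could look roughly like the sketch below, which models each
 posting list as a sorted vector of document ids and uses `std::lower_bound`
 as a stand-in for skipping forward in a real posting list.  The function
 name and data layout are assumptions for illustration, and the result
 docids are assumed to have been sorted into ascending order first:

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

using PostingList = std::vector<unsigned>;  // sorted document ids

// For each result docid, collect the query terms whose posting list
// contains it.  This ignores query structure, so it has the same corner
// cases as the termlist-based approach.
std::map<unsigned, std::vector<std::string>>
matching_terms_second_pass(const std::map<std::string, PostingList>& postings,
                           const std::vector<unsigned>& sorted_result_docids) {
    std::map<unsigned, std::vector<std::string>> out;
    for (const auto& entry : postings) {
        auto pos = entry.second.begin();
        for (unsigned did : sorted_result_docids) {
            // Skip forward to the first posting with docid >= did.
            pos = std::lower_bound(pos, entry.second.end(), did);
            if (pos == entry.second.end()) break;
            if (*pos == did) out[did].push_back(entry.first);
        }
    }
    return out;
}
```

 Each posting list is only walked forwards once per query term, which is
 the access pattern posting lists are designed for.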

 Probably doesn't need API or ABI changes, so suitable for 1.4.x.

--
Ticket URL: <http://trac.xapian.org/ticket/700>
Xapian <http://xapian.org/>