[Xapian-tickets] [Xapian] #700: Support Enquire::matching_terms_begin() without termlist table?
Xapian
nobody at xapian.org
Thu Dec 24 05:35:17 GMT 2015
#700: Support Enquire::matching_terms_begin() without termlist table?
----------------------------------+-------------------
Reporter: olly | Owner: olly
Type: defect | Status: new
Priority: normal | Milestone: 1.4.x
Component: Backend-Glass | Version:
Severity: normal | Keywords:
Blocked By: | Blocking:
Operating System: All |
----------------------------------+-------------------
''(Split out of #181)''
Currently `Enquire::matching_terms_begin()` uses the termlist of the
document, comparing it with the terms in the query. This means it doesn't
work if the database has no termlist. It's also another item to look up for
each result, and comparing the two lists of terms isn't free.
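To make the cost concrete, here is a minimal standalone sketch of the comparison described above. It is not Xapian's actual code: `matching_terms` is a hypothetical helper, and plain sorted vectors stand in for the document termlist and the query's term list.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical helper, not part of the Xapian API: returns the terms that
// appear both in a document's termlist and in the query.
// Assumption: both inputs are sorted in the same (byte) order.
std::vector<std::string>
matching_terms(const std::vector<std::string>& doc_terms,
               const std::vector<std::string>& query_terms)
{
    assert(std::is_sorted(doc_terms.begin(), doc_terms.end()));
    assert(std::is_sorted(query_terms.begin(), query_terms.end()));
    std::vector<std::string> result;
    // One linear merge over both lists per result document.
    std::set_intersection(doc_terms.begin(), doc_terms.end(),
                          query_terms.begin(), query_terms.end(),
                          std::back_inserter(result));
    return result;
}
```

The merge is linear in the lengths of both lists, and it has to be repeated for every result, on top of fetching the termlist in the first place.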
It's also arguably not quite correct in some cases, for example for this
query:
{{{
A OR (B AND NOT C)
}}}
It'll report `A` and `B` as matching terms in a document containing all
three terms, but perhaps only `A` should be reported in such a case, since
the `B AND NOT C` subquery doesn't match this document and so `B` didn't
contribute to the match.
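The stricter interpretation can be sketched with a toy query tree. Everything here (`Node`, `make_term`, `make_op`, `eval`) is hypothetical and only illustrates the semantics; it is not how Xapian's matcher is structured.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical toy query tree, for illustration only.
struct Node {
    enum Op { TERM, OR, AND_NOT } op;
    std::string term;                   // used when op == TERM
    std::shared_ptr<Node> left, right;  // used otherwise
};

std::shared_ptr<Node> make_term(const std::string& t) {
    return std::make_shared<Node>(Node{Node::TERM, t, nullptr, nullptr});
}

std::shared_ptr<Node> make_op(Node::Op op, std::shared_ptr<Node> l,
                              std::shared_ptr<Node> r) {
    return std::make_shared<Node>(Node{op, "", std::move(l), std::move(r)});
}

// Returns true if the subtree matches `doc`.  Terms are added to `out`
// only when they belong to a subquery which actually matched.
bool eval(const Node& n, const std::set<std::string>& doc,
          std::set<std::string>& out)
{
    switch (n.op) {
    case Node::TERM:
        if (!doc.count(n.term)) return false;
        out.insert(n.term);
        return true;
    case Node::OR: {
        assert(n.left && n.right);
        bool l = eval(*n.left, doc, out);
        bool r = eval(*n.right, doc, out);
        return l || r;
    }
    case Node::AND_NOT: {
        assert(n.left && n.right);
        std::set<std::string> left_terms, ignored;
        if (!eval(*n.left, doc, left_terms)) return false;
        // The right-hand side vetoes the match; its terms are never reported.
        if (eval(*n.right, doc, ignored)) return false;
        out.insert(left_terms.begin(), left_terms.end());
        return true;
    }
    }
    return false;
}
```

With the query above and a document containing `A`, `B` and `C`, this reports only `A`; for a document containing just `B` it reports `B`.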
We could record the information about matching terms for each candidate
entry in the proto-`MSet`, which would solve both of these issues. The
tricky part is doing this in a way which doesn't incur a significant space
or time overhead during the match. E.g. a bitmap of matching terms is
fairly space efficient.
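As a rough illustration of the bitmap idea, assuming at most 64 distinct query terms and a made-up `Candidate` record (Xapian's internal proto-`MSet` items look different), something like this would add only 8 bytes per candidate:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical per-candidate record, for illustration only.
struct Candidate {
    unsigned docid = 0;
    double weight = 0.0;
    std::uint64_t matched = 0;  // bit i set => query term i matched
};

// Called by the (hypothetical) matcher when query term `term_index`
// contributes to this candidate.
void mark_matched(Candidate& c, std::size_t term_index)
{
    assert(term_index < 64);  // assumption: queries have at most 64 terms
    c.matched |= std::uint64_t(1) << term_index;
}

// Expands the bitmap back into term names, in query term order.
std::vector<std::string>
matched_terms(const Candidate& c, const std::vector<std::string>& query_terms)
{
    assert(query_terms.size() <= 64);
    std::vector<std::string> result;
    for (std::size_t i = 0; i < query_terms.size(); ++i) {
        if (c.matched & (std::uint64_t(1) << i))
            result.push_back(query_terms[i]);
    }
    return result;
}
```

Queries with more terms than fit in one word would need a variable-length bitmap or some other fallback, which is where the space and time overhead becomes harder to bound.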
If we don't care about corner cases like the one above, where a term is
present but didn't contribute to the match, we could also skip through the
posting lists a second time to get this information. More data to decode,
but it's likely to already be in cache.
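A second pass might look roughly like the following sketch. Sorted vectors of docids stand in for posting lists, `std::lower_bound` stands in for skipping forward in a posting list, and `second_pass` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a posting list: the sorted docids a term occurs in.
using PostingList = std::vector<unsigned>;

// For each result docid, collect the query terms whose posting list
// contains it.  `result_docids` must be sorted so each posting list is
// only ever walked forwards.
std::map<unsigned, std::vector<std::string>>
second_pass(const std::map<std::string, PostingList>& postings,
            const std::vector<unsigned>& result_docids)
{
    assert(std::is_sorted(result_docids.begin(), result_docids.end()));
    std::map<unsigned, std::vector<std::string>> matches;
    for (const auto& entry : postings) {
        const PostingList& pl = entry.second;
        auto pos = pl.begin();
        for (unsigned did : result_docids) {
            // Skip forward to the first posting >= did.
            pos = std::lower_bound(pos, pl.end(), did);
            if (pos == pl.end()) break;
            if (*pos == did) matches[did].push_back(entry.first);
        }
    }
    return matches;
}
```

Note this only tests whether each term is present in each result, so it reports the same terms as the termlist comparison and doesn't fix the `AND NOT` corner case.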
Probably doesn't need API or ABI changes, so suitable for 1.4.x.
--
Ticket URL: <http://trac.xapian.org/ticket/700>
Xapian <http://xapian.org/>