[Xapian-discuss] Phrase Query vs AND Query? Why don't these find the same things?

Olly Betts olly at survex.com
Thu Jun 1 19:19:38 BST 2006


On Thu, Jun 01, 2006 at 12:01:28PM -0400, jarrod roberson wrote:
> mset = enq.get_mset(0, db.get_doccount())

Yeah, that's as good as db.get_lastdocid().  db.get_doccount() will
always be <= db.get_lastdocid(), but as I said in the other message, you
don't need a tight bound here.

> I assume that using the posting position should make it more efficient and
> more exact, right?

OP_PHRASE and OP_NEAR are implemented as OP_AND plus an additional check
on the term positions (for OP_PHRASE, that the terms are in the right
order), so it will be more exact, but slower.
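
To illustrate why the phrase check is stricter and costs extra work, here
is a toy model in plain Python. It is only a sketch: the document layout
(a dict mapping each term to its positions) and the function names
and_match and phrase_match are made up for this example and are not how
Xapian stores or matches postings.

```python
# Toy model only: each document maps a term to the positions where it
# occurs. This is an illustration, not Xapian's actual matcher.

def and_match(doc, terms):
    """True if every term occurs somewhere in the document."""
    return all(term in doc for term in terms)


def phrase_match(doc, terms):
    """AND match plus a positional check: terms adjacent and in order."""
    if not terms or not and_match(doc, terms):
        return False
    # Try each occurrence of the first term as the start of the phrase.
    return any(
        all(start + i in doc[term] for i, term in enumerate(terms))
        for start in doc[terms[0]]
    )


docs = {
    "a": {"program": [1], "files": [2]},  # "program files"
    "b": {"files": [1], "program": [5]},  # both terms, wrong order
}
query = ["program", "files"]
and_hits = [name for name, doc in docs.items() if and_match(doc, query)]
phrase_hits = [name for name, doc in docs.items() if phrase_match(doc, query)]
```

Both documents pass the AND test, but only "a" survives the positional
check, and that second pass over the positions is the extra cost.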

> Since I only want matches where those terms are in that EXACT positional
> order.

You probably want to use OP_PHRASE then - otherwise you'll need to do
the filtering yourself which will probably be slower.

If you're indexing/searching exact pathnames (or even path prefixes),
then you could do it faster by encoding the position in the term.

So "C:\Program Files\Xapian" would be:

LP0:c:
LP1:program files
LP2:xapian

And then you can find everything in "C:\Program Files" with:

LP0:c: AND LP1:program files
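
A minimal sketch of that encoding in plain Python follows. The helper
names path_to_terms and under are hypothetical, the lowercasing simply
mirrors the example terms above, and the set comparison is only a
stand-in for running the AND query against a real index.

```python
def path_to_terms(path, prefix="LP"):
    """Turn a Windows-style path into one positional term per component."""
    parts = [p for p in path.lower().split("\\") if p]
    return [f"{prefix}{i}:{part}" for i, part in enumerate(parts)]


def under(doc_terms, dir_path):
    """Toy stand-in for an AND query over the directory's terms."""
    return set(path_to_terms(dir_path)) <= set(doc_terms)


# Terms that would be indexed for one document.
indexed = path_to_terms("C:\\Program Files\\Xapian")

in_program_files = under(indexed, "C:\\Program Files")
in_c_xapian = under(indexed, "C:\\Xapian")
```

Because the depth is part of each term, "xapian" at depth 2 does not
match a query for "xapian" at depth 1, so no positional check is needed
at search time.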

Cheers,
    Olly


