[Xapian-discuss] Matching exact phrases only

James Aylett james-xapian at tartarus.org
Tue Aug 8 18:08:18 BST 2006


On Tue, Aug 08, 2006 at 04:53:33PM +0000, Chris Good wrote:

> Well it seems to work after a fashion:
> CHEQUERS CENTRE Weight 8.348862 100% relevant, matching: centre
> EVERSLEY CENTRE Weight 8.348862 100% relevant, matching: centre
> TOWN CENTRE Weight 8.348862 100% relevant, matching: centre
> FIELDHEAD BUSINESS CENTRE Weight 7.463948 89% relevant, matching: centre
> 
> Unfortunately it's not quite achieving what we're after, as the weights
> are pretty similar between exact document matches and non-exact ones.  All of
> those results above are ones that we'd want to ignore as they're too
> imprecise. By way of counter-example, for "london" we get:
> 
> LONDON Weight 8.515918 99% relevant, matching: london
> CENTRAL LONDON Weight 7.506283 88% relevant, matching: london
> LITTLE LONDON Weight 7.506283 88% relevant, matching: london 
> LONDON APPRENTICE Weight 7.506283 88% relevant, matching: london
> 
> Of those we actually only care about the "LONDON" match.

Okay. I think you need to figure out an indexing scheme that actually
matches what you're trying to achieve, because at the moment it simply
doesn't. You don't seem to want probabilistic search of freetext,
which is what you've got at the moment.

I'm not entirely certain what you /are/ trying to achieve, but I'm
guessing some kind of location taxonomy is in play, in which case you
should generate your own terms (perhaps with a prefix), which you can
do when generating your input to scriptindex. Whether you'll need a
different query parser will depend on your interface.
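
As a rough sketch of the "generate your own terms" idea, assuming each
place name is collapsed into a single token so that "LONDON" and
"CENTRAL LONDON" end up as different terms. The field name `exact`, the
`XN` prefix and all the function names below are made up for
illustration; nothing here is something scriptindex requires.

```python
import re

# Hypothetical prefix for whole-name terms; choose whatever suits your scheme.
PREFIX = "XN"


def exact_name(name: str) -> str:
    """Collapse a whole place name into one lowercase token.

    "CENTRAL LONDON" becomes "central_london", so it can never be
    confused with the token for plain "LONDON".
    """
    words = re.findall(r"[a-z0-9]+", name.lower())
    return "_".join(words)


def exact_term(name: str) -> str:
    """The prefixed term we would expect to find in the index for this name."""
    return PREFIX + exact_name(name)


def scriptindex_record(name: str) -> str:
    """Format one input record: field=value lines, ended by a blank line."""
    return f"name={name}\nexact={exact_name(name)}\n\n"


if __name__ == "__main__":
    for place in ["LONDON", "CENTRAL LONDON", "LONDON APPRENTICE"]:
        print(scriptindex_record(place), end="")
```

The index script would then need something like `exact : boolean=XN`
to turn that field into a prefixed boolean term. A query for the term
`XNlondon` should match only the document named LONDON, since CENTRAL
LONDON carries `XNcentral_london` instead.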

James

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


