[Xapian-discuss] Xapian and Synonyms

Justin Finkelstein justin at redwiredesign.com
Fri Jul 29 12:13:14 BST 2011


Hello again

So I've enabled synonym support in my index by doing this:

    Xapian_writeable_db::add_synonym('aubergine', 'eggplant');
    Xapian_writeable_db::add_synonym('eggplant', 'aubergine');

And passed in FLAG_AUTO_SYNONYMS to parse_query as described in the
manual.

This now works and I get results using both directions of the synonym
but something peculiar is happening:

    o When I search for 'eggplant', I get 22 results
    o When I search for 'aubergine', I get 66

Looking at the debug output from Xapian, these generate the following
queries:

Xapian::Query(((Zeggplant:(pos=1) SYNONYM aubergine:(pos=1)) AND
<alldocuments>))
Xapian::Query(((Zaubergin:(pos=1) SYNONYM eggplant:(pos=1)) AND
<alldocuments>))

Any thoughts on why the number of results would differ when searching
for one word rather than the other?
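
One guess, going only by the debug output above: in each query the word
I typed appears stemmed (the Z prefix) while the synonym appears
unstemmed. If that is what is happening, 'aubergine' would also match
documents containing 'aubergines', whereas 'eggplant' would only pick up
the exact term 'aubergine'. Below is a toy model of that idea in plain
Python. It is not the Xapian API: toy_stem, index_terms, search and the
sample documents are all made up for illustration, and it assumes the
index holds both unstemmed terms and Z-prefixed stems.

```python
# Toy model only: this does NOT call Xapian. All names here are hypothetical.

def toy_stem(word):
    """Crude stand-in for a real stemmer: strip a trailing 's', then a trailing 'e'."""
    if word.endswith("s"):
        word = word[:-1]
    if word.endswith("e"):
        word = word[:-1]
    return word


def index_terms(text):
    """Index each word twice: as typed, and as a Z-prefixed stem (assumed behaviour)."""
    terms = set()
    for word in text.lower().split():
        terms.add(word)
        terms.add("Z" + toy_stem(word))
    return terms


def search(docs, word, synonym):
    """Match documents holding the stemmed query word OR the unstemmed synonym."""
    wanted = {"Z" + toy_stem(word), synonym}
    return [i for i, text in enumerate(docs) if index_terms(text) & wanted]


docs = [
    "grilled aubergine",
    "stuffed aubergines",
    "aubergines with garlic",
    "fried eggplant",
]

# The two directions of the synonym return different numbers of documents.
print(len(search(docs, "eggplant", "aubergine")))
print(len(search(docs, "aubergine", "eggplant")))
```

In this toy data the plural 'aubergines' is reachable only through the
stemmed form, so the two searches disagree. Whether that accounts for
the 22 versus 66 on my real index is something I have not verified.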

Thanks,

Justin
    
On Thu, 2011-07-28 at 17:34 +0100, Justin Finkelstein wrote: 

> Hi guys
> 
> I've just had a thought about something we do with our search on
> ReportBuyer.com: we cater for both American and British English in our
> searches and we have had plans for a while now to implement something
> that allows users to find 'colour' and 'color', 'tap' and 'faucet' by
> doing some clever programming. 
> 
> Looking at the Xapian docs, though, it appears I can do this using
> synonyms although the documentation's not clear on how this works (sorry
> if this is starting to sound a little annoying). Having dug through the
> mailing list, I can surmise that:
> 
>     o Synonyms only need to be added to a database once and are
> transaction-persistent
>     o They're activated by adding the flag FLAG_AUTO_SYNONYMS to the
> QueryParser
> 
> Which isn't a lot; so what I'd like clarification on is:
> 
>     o Are synonyms database-persistent (i.e. once added, do they stay in
> the database until removed)?
>     o Are they dual-direction (i.e. if I enter 'tap' or 'faucet' will
> both be picked up if a defined synonym is present)?
>     o Is there a way to get a list of all current synonyms from a
> database? I can't see a way to do this in the API.
> 
> If I can get answers to this, I can test it and would be happy to write
> this up for the site.
> 
> Cheers,
> 
> Justin
> 



