[Xapian-discuss] Searching for numbers and roman numerals

Matthew Somerville matthew at mysociety.org
Mon Jul 21 13:25:29 BST 2008


Ben Phillips wrote:
> We're indexing our game database and want to return Grand Theft Auto
> IV when someone searches for Grand Theft Auto 4 (and vice versa) -
> would using synonyms for roman numerals be appropriate here or is
> there a more appropriate solution?

Synonyms sound good to me. Indexing these four "documents":
         'This is a review of Grand Theft Auto 4.',
         'This is a review of Grand Theft Auto IV.',
         'This is a review of Grand Theft Auto Four.',
         'This is a review of Grand Theft Auto 4, known as GTA IV.',

and making "four", "iv", and "4" synonyms of one another (6 calls to
add_synonym, one per ordered pair of terms) means that, with
FLAG_AUTO_SYNONYMS set, a search using any of the three ways of writing 4
returns all four entries.
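
As a rough sketch of where the 6 calls come from (not the code I actually
ran): each of the three terms needs the other two registered as synonyms,
which is every ordered pair. The helper names and the RecordingDatabase
stand-in below are made up for illustration; the only thing assumed of the
real database object is an add_synonym(term, synonym) method, as on Xapian's
WritableDatabase.

```python
from itertools import permutations


def synonym_pairs(terms):
    """Return every ordered (term, synonym) pair of distinct terms."""
    return list(permutations(terms, 2))


def add_mutual_synonyms(db, terms):
    """Register each term as a synonym of every other term.

    `db` is assumed to expose add_synonym(term, synonym); this helper is
    hypothetical and not part of Xapian itself.
    """
    for term, synonym in synonym_pairs(terms):
        db.add_synonym(term, synonym)


class RecordingDatabase:
    """Hypothetical stand-in that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def add_synonym(self, term, synonym):
        self.calls.append((term, synonym))


db = RecordingDatabase()
add_mutual_synonyms(db, ["four", "iv", "4"])
print(len(db.calls))  # 6
```

With a real database you would pass it in place of the stand-in, and the
query parser still needs FLAG_AUTO_SYNONYMS set when parsing for the
synonyms to be used.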

ATB,
Matthew


