[Xapian-discuss] Search relevance
Matthew Somerville
matthew at mysociety.org
Thu Jul 24 13:24:16 BST 2008
Ben Phillips wrote:
> I'm getting some surprising (to me) ordering of results.
If you look at the parsed query, it's looking for the term "IV" when it
should be looking for "iv". Xapian terms are case sensitive, and a
capitalised first letter means something special (by convention it marks a
term prefix, like the "Z" on stemmed terms). Result 5 contains the terms
"iv" and "Ziv", but not "IV", so the synonym never matches that document.
Your synonyms need to be the lowercase versions of the words.
ATB,
Matthew
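
To illustrate the lowercase point, here is a minimal sketch in plain Python.
It does not use the Xapian API; the synonym table and the helper names
(normalise_synonyms, expand) are hypothetical, and the document terms are the
unstemmed ones listed for result 5 in the quoted message. It only shows why
"IV" misses the indexed term "iv" and how lowercasing the synonyms first
avoids that.

```python
def normalise_synonyms(synonyms):
    """Return a copy of the synonym table with keys and synonyms lowercased."""
    return {
        term.lower(): sorted({s.lower() for s in alternatives})
        for term, alternatives in synonyms.items()
    }


def expand(query, synonyms):
    """Turn each query word into a group: the word plus any synonyms."""
    groups = []
    for word in query.lower().split():
        extra = [s for s in synonyms.get(word, []) if s != word]
        groups.append([word] + extra)
    return groups


# Hypothetical synonym table, as it might have been entered originally.
raw = {"four": ["4", "IV"]}
table = normalise_synonyms(raw)

# Unstemmed terms indexed for "Grand Theft Auto IV" (result 5).
doc_terms = {"auto", "grand", "iv", "theft"}

groups = expand("grand theft auto four", table)
# One flag per query word: does any term in its group occur in the document?
matched = [any(t in doc_terms for t in group) for group in groups]
print(groups)
print(matched)
```

With the raw table the last group contains "IV", which is not among the
document's terms, so the "four" part of the query contributes nothing to that
document's score.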
> An example below:
>
> Query string is: grand theft auto four
> Parsed query is: Xapian::Query((Zgrand:(pos=1) OR Ztheft:(pos=2) OR
> Zauto:(pos=3) OR Zfour:(pos=4) OR 4:(pos=4) OR IV:(pos=4)))
> 528 results found.
> Results 1-10:
> 1: 46% ID=51009 TITLE= docid=36694 [Grand Theft Auto Compilation]
> ['XID51009', 'XPLATFORM16', 'Zauto', 'Zcollector', 'Zcompil', 'Zedit',
> 'Zgrand', 'Ztheft', 'auto', "collector's", 'compilation', 'edition',
> 'grand', 'theft']
> 2: 40% ID=6609 TITLE= docid=6569 [Grand Theft Auto Advance]
> ['XID6609', 'XPLATFORM4', 'Zadvanc', 'Zauto', 'Zgrand', 'Ztheft',
> 'advance', 'auto', 'grand', 'theft']
> 3: 40% ID=6614 TITLE= docid=6574 [Grand Theft Auto London 1961]
> ['1961', 'XID6614', 'XPLATFORM15', 'Zauto', 'Zgrand', 'Zlondon',
> 'Ztheft', 'auto', 'grand', 'london', 'theft']
> 4: 39% ID=6607 TITLE= docid=6567 [Grand Theft Auto]
> ['XID6607', 'XPLATFORM15', 'XPLATFORM16', 'XPLATFORM22', 'Zauto',
> 'Zgrand', 'Ztheft', 'auto', 'grand', 'theft']
> 5: 39% ID=24402 TITLE= docid=21668 [Grand Theft Auto IV]
> ['XID24402', 'XPLATFORM34', 'XPLATFORM69', 'Zauto', 'Zgrand', 'Ziv',
> 'Ztheft', 'auto', 'grand', 'iv', 'theft']
>
> four has synonyms so expands to four OR 4 OR IV. I'd expect result 5
> 'Grand Theft Auto IV' to be result number 1 as it's exactly the search
> term. If I search for 'grand theft auto iv' then it is result 1.
>
> Cheers,
> Ben.