[Xapian-discuss] Help with weights
Robert Kaye
rob at eorbit.net
Wed Jul 2 00:15:53 BST 2008
Hi!
Every time I think I've got the Xapian search for MusicBrainz licked, I
ask for more feedback and my community finds yet another test case
that throws a monkey wrench into my project. And the more I try to
understand Xapian's weighting system, the less I really understand it.
Let me ask a specific question -- in my release index (an index of CD
titles, essentially) I have a field called type. When the value of
this field is "album" I give it a termcount of 100. All other values
for this field and all other fields get a termcount of 1.
For the Enquire, I use a stock object: I do not define a weighting
scheme, and I do not tinker with doc order or sort order. When I search
for the term "love" in the release title (a very common term), the top
hits are the ones that contain the word "love" twice. Good.
But among the hits that contain the word "love" only once, I would
expect the releases of type "album" to be near the top. They are not:
http://musicbrainz.homeip.net/search/textsearch.html?query=love&handlearguments=1&limit=25&type=release&adv=0&offset=0
They make up the *bottom* 3-4 pages of the results, meaning they got
ranked BELOW all the non-"album" values:
http://musicbrainz.homeip.net/search/textsearch.html?query=love&handlearguments=1&limit=25&type=release&adv=0&offset=250
I can clearly see that my weighting is having an effect, but it's the
opposite effect from what I am expecting.
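
One possible reading, sketched below in plain Python rather than against
the real Xapian API: if the stock Enquire uses a BM25-style weighting
scheme, and if a document's length is the sum of all its term counts
(both of these are assumptions on my part), then a termcount of 100 on a
term that is not in the query only makes the document look longer, which
lowers the weight of the terms that are in the query. The function name,
the parameter defaults and all the numbers are hypothetical and only
illustrate the direction of the effect.

```python
def bm25_term_weight(term_count, doc_len, avg_doc_len, idf, k1=1.0, b=0.5):
    """Simplified BM25 weight of a single query term in a single document.

    This is a textbook-style approximation, not Xapian's exact code.
    """
    # Length of this document relative to the average document.
    norm_len = doc_len / avg_doc_len
    return idf * (k1 + 1) * term_count / (k1 * ((1 - b) + b * norm_len) + term_count)


# Hypothetical collection statistics, chosen only for illustration.
AVG_DOC_LEN = 5.0
IDF_LOVE = 1.0

# Two releases titled "love songs": two title terms with termcount 1 each,
# plus one type term. Only "love" is in the query.
single_len = 2 + 1    # type term indexed with termcount 1
album_len = 2 + 100   # type "album" indexed with termcount 100

single_weight = bm25_term_weight(1, single_len, AVG_DOC_LEN, IDF_LOVE)
album_weight = bm25_term_weight(1, album_len, AVG_DOC_LEN, IDF_LOVE)

print("non-album weight for 'love':", round(single_weight, 3))
print("album weight for 'love':    ", round(album_weight, 3))
```

Under these assumptions the "album" release scores lower for "love" even
though both titles are identical, which would match what I am seeing.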
What am I missing here? Any tips would be appreciated!
--
--ruaok Somewhere in Texas a village is *still* missing its idiot.
Robert Kaye -- rob at eorbit.net -- http://mayhem-chaos.net