[Xapian-devel] Custom weight factors - pushing the relevancy ranking how we want it
Michiel Roding
michiel at parse.nl
Fri Dec 17 13:56:07 GMT 2004
Olly Betts wrote:
>On Fri, Dec 17, 2004 at 11:06:41AM +0100, Michiel Roding wrote:
>
>
>>In a forum, whether content is relevant to a search is not determined
>>only by the frequency or location of the terms; the date the topic
>>was last modified is important as well.
>>
>>
>
>The match bias code will probably be useful here. I need to tidy up the
>UI, which is slowly bubbling up my todo list. But it'll allow a date
>dependent extra weight term so more recent topics can get a boost.
>
>
So, in time we'll be able to define custom fields that each contain an
integer and carry a definable weight?
Some forums might consider the date very important; others may think the
view count is a better indicator. If they could set this up by modifying
the index.script, that'd be really nice.
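
To make the question concrete, here is a rough Python sketch of the kind of scoring I have in mind. Nothing in it is Xapian or Omega API: `boosted_score`, the `field_weights` mapping and the logarithmic damping are all assumptions made up for illustration.

```python
import math

def boosted_score(text_weight, fields, field_weights):
    """Add a weighted bonus per configured integer field to a text score.

    text_weight   -- relevance score from the normal term matching
    fields        -- integer fields stored with the topic (hypothetical)
    field_weights -- per-forum configuration: field name -> factor
    """
    bonus = 0.0
    for name, factor in field_weights.items():
        # Missing or negative values contribute nothing.
        value = max(fields.get(name, 0), 0)
        # log1p damps large counts so one huge view count
        # cannot swamp the text relevance.
        bonus += factor * math.log1p(value)
    return text_weight + bonus
```

A forum that cares about view counts would pass something like `{"viewcount": 0.5}`; one that does not would pass an empty mapping and get the plain text score back.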
>
>
>>Another issue we find is that the number of results is so overwhelming
>>that the user is unable to find the correct topic for his needs.
>>Looking at some statistics, we found that a very large part of the
>>queries to Omega are the same. Keywords like windows, xp, dvd etc. are
>>very popular.
>>Therefore, we are contemplating building a "does this topic meet your
>>search?" feature to store which topics are most relevant to the queries
>>as defined by the users.
>>
>>
>
>One problem with this approach is that different users may want
>different results for the same query. Some searching for "xp" may want
>windows xp, others extreme programming.
>
>Hopefully you'll end up with a small enough set of favoured results that
>this won't be too much of a problem though.
>
>And if you built a second Xapian database where each topic is indexed
>only by terms which people have voted for, then you could use topterms
>to allow users to narrow in on particular meanings.
>
>
Could you elaborate on this?
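
My reading of the suggestion, as a sketch in plain Python: keep a separate index in which each topic only carries the query terms users voted it relevant for, then offer the terms that co-occur with the current query as ways to narrow down. This uses simple co-occurrence counts rather than Xapian's own term suggestion, and `build_vote_index` and `narrowing_terms` are hypothetical names, so treat it as a guess at the idea and not as how topterms works.

```python
from collections import Counter, defaultdict

def build_vote_index(votes):
    """votes: iterable of (query string, topic id) pairs that users
    marked as a good match. Returns topic id -> Counter of voted terms."""
    index = defaultdict(Counter)
    for query, topic_id in votes:
        index[topic_id].update(query.lower().split())
    return index

def narrowing_terms(index, query, limit=5):
    """Suggest extra terms seen on topics that were voted for every
    term in the query, most frequently voted first."""
    wanted = set(query.lower().split())
    counts = Counter()
    for terms in index.values():
        # Only topics voted relevant for all query terms take part.
        if wanted <= set(terms):
            for term, n in terms.items():
                if term not in wanted:
                    counts[term] += n
    return [term for term, _ in counts.most_common(limit)]
```

With votes for both "windows xp" and "extreme programming xp", a search for "xp" would then offer "windows", "extreme" and "programming" as refinements, which addresses the ambiguity mentioned above.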
>>Other features could be a lame attempt at the PageRank relevancy,
>>storing if a user almost immediately skips a topic (irrelevant) etc.
>>
>>But, this needs to be stored (easy) and processed by Xapian in the sorting.
>>
>>How could we go about this? Does Xapian somehow support these custom
>>weight factors?
>>
>>
>
>The match bias code again. Currently it's hardwired to expect a Unix
>timestamp and to give an exponentially decaying weight from the present.
>But the concept is that it could be used for all sorts of things,
>including this.
>
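
For reference, an exponentially decaying weight computed from a Unix timestamp, as described above, could look roughly like this in Python. This is not the match bias code itself; the function name `recency_bias` and the `half_life_days` and `max_bonus` parameters are assumptions chosen for the sketch.

```python
import math
import time

def recency_bias(timestamp, now=None, half_life_days=30.0, max_bonus=1.0):
    """Extra weight for a topic last modified at `timestamp` (Unix time).

    The bonus is max_bonus for a topic modified right now and halves
    every half_life_days after that.
    """
    if now is None:
        now = time.time()
    # Timestamps in the future are treated as "just modified".
    age_seconds = max(now - timestamp, 0.0)
    half_life_seconds = half_life_days * 86400.0
    return max_bonus * math.exp(-math.log(2) * age_seconds / half_life_seconds)
```

The result would be added to the normal text weight, so a larger `max_bonus` makes recency count for more and a shorter half-life makes old topics fade faster.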
whey :)
thanks for the info.
Michiel