[Xapian-devel] Custom weight factors - pushing the relevancy ranking how we want it

Michiel Roding michiel at parse.nl
Fri Dec 17 13:56:07 GMT 2004


Olly Betts wrote:

>On Fri, Dec 17, 2004 at 11:06:41AM +0100, Michiel Roding wrote:
>  
>
>>As forums are, the content that is relevant to a search is not just 
>>determined by the frequency or location of the terms; the date the topic 
>>has been last modified is important as well.
>>    
>>
>
>The match bias code will probably be useful here.  I need to tidy up the
>UI, which is slowly bubbling up my todo list.  But it'll allow a date
>dependent extra weight term so more recent topics can get a boost.
>  
>
So, in time we'll be able to define custom fields that contain an integer 
with a definable weight?
Some forums might think the date is very important; others may think the 
view count is a better representation. If they can modify the 
index.script to choose, that'd be really nice.
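
To make the question concrete, here is a minimal sketch in plain Python of
what "an integer field with a definable weight" could mean. It is not
Xapian or Omega code: the function names, the field name "viewcount" and
the log damping are all assumptions made for illustration.

```python
import math

def field_boost(value, weight):
    """Boost from one non-negative integer field.

    log1p damps very large values (e.g. a topic with 100000 views) so a
    single field cannot swamp the text relevance. Assumed design choice.
    """
    return weight * math.log1p(max(0, value))

def combined_score(text_weight, fields, field_weights):
    """Add a weighted boost for every configured field to the text weight.

    fields:        dict of field name -> integer stored for the topic
    field_weights: dict of field name -> weight chosen by the forum admin
    """
    boost = sum(field_boost(fields.get(name, 0), weight)
                for name, weight in field_weights.items())
    return text_weight + boost

# A forum that cares about view counts might configure:
weights = {"viewcount": 0.3}
popular = combined_score(2.0, {"viewcount": 5000}, weights)
obscure = combined_score(2.0, {"viewcount": 3}, weights)
```

With equal text weights, the topic with more views ranks higher; a forum
that prefers dates would configure a different field and weight instead.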

>  
>
>>Another issue we find is that the number of results is so overwhelming 
>>that the user is unable to find the right topic for his needs. Combining 
>>this with some statistics, we found that a very large share of the 
>>queries to Omega are the same. Keywords like windows, xp, dvd etc. are 
>>very popular.
>>Therefore, we are contemplating building a "does this topic meet your 
>>search?" feature to store which topics are most relevant to the queries, 
>>as judged by the users.
>>    
>>
>
>One problem with this approach is that different users may want
>different results for the same query.  Some searching for "xp" may want
>windows xp, others extreme programming.
>    
>Hopefully you'll end up with a small enough set of favoured results that
>this won't be too much of a problem though.
>
>And if you built a second Xapian database where each topic is indexed
>only by terms which people have voted for, then you could use topterms
>to allow users to narrow in on particular meanings.
>  
>
Could you elaborate on this?
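
One possible reading of the suggestion, sketched in plain Python rather
than against a real Xapian database: each topic is "indexed" only by the
query terms users voted for, and a simple frequency count stands in for
topterms. The data layout and both function names are assumptions, and
the real topterms ranking is not reproduced here.

```python
from collections import Counter

def build_vote_index(votes):
    """votes: iterable of (topic_id, query_terms), one entry per 'yes' vote.

    Returns topic_id -> Counter of the terms that topic was voted for.
    """
    index = {}
    for topic_id, terms in votes:
        index.setdefault(topic_id, Counter()).update(terms)
    return index

def suggest_terms(index, query_term, limit=5):
    """Terms most often voted alongside query_term, offered as refinements."""
    related = Counter()
    for terms in index.values():
        if query_term in terms:
            for term, count in terms.items():
                if term != query_term:
                    related[term] += count
    return [term for term, _ in related.most_common(limit)]

votes = [
    (1, ["xp", "windows"]),
    (1, ["xp", "windows"]),
    (2, ["xp", "extreme", "programming"]),
    (3, ["dvd"]),
]
index = build_vote_index(votes)
suggestions = suggest_terms(index, "xp")
```

A user searching for "xp" could then be offered "windows", "extreme" and
"programming" as ways to narrow in on the meaning they want.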

>>Other features could be a lame attempt at PageRank-style relevancy, 
>>storing whether a user almost immediately skips a topic (irrelevant), etc.
>>
>>But, this needs to be stored (easy) and processed by Xapian in the sorting.
>>
>>How could we go about this? Does Xapian somehow support these custom 
>>weight factors?
>>    
>>
>
>The match bias code again.  Currently it's hardwired to expect a Unix
>timestamp and to give an exponentially decaying weight from the present.
>But the concept is that it could be used for all sorts of things,
>including this.
>
whey :)

thanks for the info.
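
For reference, the "exponentially decaying weight from the present" that
Olly describes can be sketched in plain Python as below. This is not the
match bias code itself: the half-life parameterisation, the max_bias cap
and the function names are assumptions chosen for illustration.

```python
import time

SECONDS_PER_DAY = 86400.0

def date_bias(timestamp, now=None, max_bias=1.0, half_life_days=30.0):
    """Extra weight for a document, given its Unix timestamp.

    The bias is max_bias for a document modified right now and halves
    every half_life_days. Both parameters are hypothetical knobs.
    """
    if now is None:
        now = time.time()
    # Treat timestamps in the future as "just modified".
    age_seconds = max(0.0, now - timestamp)
    half_life_seconds = half_life_days * SECONDS_PER_DAY
    return max_bias * 0.5 ** (age_seconds / half_life_seconds)

def biased_score(text_weight, timestamp, now=None):
    """Text relevance weight plus the date-dependent boost."""
    return text_weight + date_bias(timestamp, now)
```

Because the boost is added to the text weight rather than multiplied, a
recent topic only overtakes an older one when their text weights are
reasonably close, which seems to be the intent of a "bias".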

Michiel

