[Xapian-discuss] Making SORTAFTER useful in omega?

Arjen van der Meijden acmmailing at tweakers.net
Tue Sep 16 15:26:39 BST 2008


I've patched xapian-core to contain another Sorter, which takes the 
calculated weight, rounds it (actually it just multiplies it by a factor 
and casts it to int) and uses the result as the first part of a sort key. 
The second part is simply the content of a specific value for the 
document, the same value you would already use with the 
set_sort_by_relevance_then_value() call.

To give the sorter access to the weight, I added it to the operator()-call.
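
As an illustration only, the idea above can be sketched in standalone C++. 
This is not the attached patch and does not use the Xapian API: the Hit 
struct, rounded_weight() and sort_hits() are hypothetical names, and a 
comparator stands in for the sort key the patch builds. It assumes the 
value is a string such as a YYYYMMDD date, so that a larger value means a 
newer document.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a matched document; not a Xapian type.
struct Hit {
    double weight;      // relevance weight computed by the matcher
    std::string value;  // sortable document value, e.g. "20080916"
};

// First part of the key: scale the weight and truncate it with an int cast.
inline int rounded_weight(double weight, int factor) {
    assert(factor > 0);
    return static_cast<int>(weight * factor);
}

// Order by rounded weight (highest first); among equal rounded weights,
// order by value (highest first, i.e. newest date first).
inline void sort_hits(std::vector<Hit>& hits, int factor) {
    std::stable_sort(hits.begin(), hits.end(),
        [factor](const Hit& a, const Hit& b) {
            const int ra = rounded_weight(a.weight, factor);
            const int rb = rounded_weight(b.weight, factor);
            if (ra != rb) return ra > rb;
            return a.value > b.value;
        });
}
```

With a factor of 100, hits weighted 1.2341 and 1.2349 both get the rounded 
weight 123, so the value decides their order, while a hit weighted 1.25 
still sorts ahead of both.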

You can enable my RoundedWeightSorter from within omega using the 
'SORTFACTOR=' parameter on top of the normal SORTAFTER= and SORT= 
parameters.

From my first looks and limited tests, a reasonable value seems to be a 
sortfactor of 100, which keeps 2 decimals of the weight. The time to do 
the search seems to be similar to that of a normal value-based sorted 
search.
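
To make the effect of the factor concrete, here is a small standalone C++ 
sketch (hypothetical helper names, not code from the patches). It shows 
that a multiply-and-cast truncates rather than rounds to nearest; the 
std::lround variant is included only for comparison and is an assumption, 
not something the patch does.

```cpp
#include <cmath>

// Multiply-and-cast, as described in the mail: drops the remaining decimals.
inline int truncated_bucket(double weight, int factor) {
    return static_cast<int>(weight * factor);
}

// Alternative for comparison only (an assumption): round to nearest.
inline long nearest_bucket(double weight, int factor) {
    return std::lround(weight * factor);
}
```

With a factor of 100, weights 12.3449 and 12.3401 fall into the same 
bucket and the secondary value decides their order. With a factor of 1000 
they land in different buckets and the weight alone decides.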

The patches are based on Xapian/Omega 1.0.8 and obviously I'd like to 
hear about all flaws in my approach.

Best regards,

Arjen

On 13-9-2008 9:56, Arjen van der Meijden wrote:
> You're probably right. But I'm not too proficient in C++ and the inner 
> workings of Xapian. I already found the place in omega where to alter 
> the weighting-scheme in query.cc, and did try changing it to the 
> TradWeight-scheme. But that didn't help much.
> So what can I change to actually have results get the same score when 
> they would have gotten almost the same score from BM25/TradWeight?
> 
> Do I have to roll my own weighting scheme? Or are there settings for 
> BM25 or TradWeight that already provide such a result set?
> 
> Best regards,
> 
> Arjen
> 
> On 12-9-2008 2:13 alexander lind wrote:
>> Ah, I did misunderstand you there, I thought you wanted to sort the 
>> entire relevance set after a date.
>>
>> I don't think there is a way to make SORTAFTER less sensitive to small 
>> differences in the relevance score, except by hacking it in yourself :/
>>
>> Alec
>>
>> On Sep 11, 2008, at 11:35 AM, Arjen van der Meijden wrote:
>>
>>> Alec,
>>>
>>> I don't really understand your answer or perhaps you didn't understand 
>>> my question.
>>> I want relevance sort. But when two results are more or less similar 
>>> in terms of relevance, I want the newest first (which can indeed be 
>>> supplied with a value).
>>>
>>> This is actually what the SORTAFTER in omega is for. But the chance of 
>>> two results having the same relevance is almost zero, so in practice 
>>> it doesn't work.
>>>
>>> Best regards,
>>>
>>> Arjen
>>>
>>> On 11-9-2008 17:39 alexander lind wrote:
>>>> Can't you just use the reordering parameter 'SORT' to sort on a value 
>>>> number?  In this case, a value number where you'd have put the dates 
>>>> that your users want to sort by?
>>>> Alec
>>>> On Sep 11, 2008, at 7:12 AM, Arjen van der Meijden wrote:
>>>>> Is there no one with some input on this issue?
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Arjen
>>>>>
>>>>> On 1-9-2008 22:43 Arjen van der Meijden wrote:
>>>>>> Hello List,
>>>>>>
>>>>>> Our users keep asking for some more "logical" sorting of search
>>>>>> results. Now the results are sorted on relevance, i.e. the raw
>>>>>> weight, by default. But since the users only see the percentage,
>>>>>> that results in a seemingly random secondary sorting.
>>>>>>
>>>>>> According to the docs and earlier mails, omega has the 'SORTAFTER'
>>>>>> (and docid sorting) functionality to allow date-based secondary
>>>>>> sorting. But according to later mails and the documentation that's
>>>>>> only useful if you don't use the default BM25 weighting.
>>>>>> Unfortunately you can't alter the weighting scheme via Omega calls.
>>>>>> Nor does it seem to help to simply patch query.cc to use TradWeight
>>>>>> rather than BM25.
>>>>>>
>>>>>> Since we've built our set-up around omega, we'd rather not have to
>>>>>> build something similar or patch omega just because it's missing a
>>>>>> small but important feature. Is it somehow possible to make the newer
>>>>>> results among the seemingly similarly relevant results sort on top
>>>>>> within Omega?
>>>>>>
>>>>>> Best regards,
>>>>>>
>>>>>> Arjen
>>>>>>
>>>>>> _______________________________________________
>>>>>> Xapian-discuss mailing list
>>>>>> Xapian-discuss at lists.xapian.org
>>>>>> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>>>>>>
>>
> 
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: roundedweightsorter.patch
Url: http://lists.xapian.org/pipermail/xapian-discuss/attachments/20080916/96f6c68a/attachment.txt 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: roundedweightsorter-omega.patch
Url: http://lists.xapian.org/pipermail/xapian-discuss/attachments/20080916/96f6c68a/attachment-0001.txt 

