[Xapian-devel] [GSOC 2014] Indexing INEX dataset
Jiarong Wei
vcamx3 at gmail.com
Wed Mar 12 02:13:42 GMT 2014
Thank you Parth and Olly! I’ll try it :)
Jiarong Wei
On Mar 11, 2014, at 16:57, Olly Betts <olly at survex.com> wrote:
> On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
>>>
>>> On current trunk, we index the title with prefix "S" by default in
>>> omindex, though with a wdf inc of 5 rather than 1:
>>>
>>> indexer.index_text(title, 5, "S");
>>>
>>> So I don't think you need that change to omindex now.
>>
>> Yes, but please make sure to change 5 to 1; otherwise, divide the final
>> count statistics by 5. :)
>
> We really need to resolve any instances where letor requires code in
> other parts of Xapian to be patched.
>
> In this case, possibly the bias on the title should be done differently,
> but won't this just mean both the wdfs and the field length for the S
> prefix are 5 times larger, and it won't matter?
>
> Cheers,
> Olly
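
Olly's point about the scaling can be checked with a small standalone sketch. It does not use the Xapian API: `FieldStats`, `index_field` and `normalized_tf` are hypothetical helpers that only mimic adding a fixed wdf increment per term occurrence, and the title string in the usage note is made up. It assumes the feature of interest is the ratio of a term's wdf to the field length. Raw counts still differ by the factor of 5, which is the case Parth's suggestion covers.

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical stand-in for per-field statistics (not a Xapian type).
struct FieldStats {
    std::map<std::string, unsigned> wdf;  // within-field frequency per term
    unsigned length = 0;                  // sum of all wdf values in the field
};

// Hypothetical helper: splits `text` on whitespace and adds `wdf_inc`
// to the term's wdf and to the field length for every occurrence.
FieldStats index_field(const std::string& text, unsigned wdf_inc) {
    FieldStats stats;
    std::istringstream in(text);
    std::string term;
    while (in >> term) {
        stats.wdf[term] += wdf_inc;
        stats.length += wdf_inc;
    }
    return stats;
}

// Ratio of a term's wdf to the field length; 0 if the term is absent
// or the field is empty.
double normalized_tf(const FieldStats& stats, const std::string& term) {
    auto it = stats.wdf.find(term);
    if (it == stats.wdf.end() || stats.length == 0) return 0.0;
    return static_cast<double>(it->second) / stats.length;
}
```

Indexing the same title with an increment of 1 and of 5 gives wdfs and field lengths that differ by a factor of 5, while `normalized_tf` returns the same value in both cases. Any feature built from the raw wdf or raw field length alone would still need the rescaling.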