[Xapian-devel] [GSOC 2014] Indexing INEX dataset

Jiarong Wei vcamx3 at gmail.com
Wed Mar 12 02:13:42 GMT 2014


Thank you Parth and Olly! I’ll try it :)

Jiarong Wei

On Mar 11, 2014, at 16:57, Olly Betts <olly at survex.com> wrote:

> On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
>>> 
>>> On current trunk, we index the title with prefix "S" by default in
>>> omindex, though with a wdf inc of 5 rather than 1:
>>> 
>>>            indexer.index_text(title, 5, "S");
>>> 
>>> So I don't think you need that change to omindex now.
>> 
>> Yes, but please make sure to change the 5 to 1; otherwise, divide the final
>> count statistics by 5. :)
> 
> We really need to resolve any instances where letor requires code in
> other parts of Xapian to be patched.
> 
> In this case, possibly the bias on the title should be done differently,
> but won't this just mean both the wdfs and the field length for the S
> prefix are 5 times larger, and it won't matter?
> 
> Cheers,
>    Olly
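
A minimal sketch of the scaling argument in the quoted text, not code from
Xapian or letor. It assumes that a field's length is the sum of the wdf
increments of its terms, and that a feature divides the term's wdf by that
field length. FieldStats, count_field and normalised_tf are hypothetical
names made up for illustration. Under those assumptions the ratio is the
same for a wdf inc of 1 and of 5, while a feature that uses the raw count
would be 5 times larger, which is the case where dividing by 5 applies.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the statistics of one field (e.g. the title,
// prefix "S") in one document.
struct FieldStats {
    double term_wdf;      // accumulated wdf of the term of interest
    double field_length;  // accumulated wdf of every term in the field
};

// Walk the words of a field, adding wdf_inc per occurrence to the field
// length, and to the term's wdf when the word matches `term`.
FieldStats count_field(const std::vector<std::string>& words,
                       const std::string& term, unsigned wdf_inc) {
    FieldStats s{0.0, 0.0};
    for (const auto& w : words) {
        s.field_length += wdf_inc;
        if (w == term) s.term_wdf += wdf_inc;
    }
    return s;
}

// Length-normalised term frequency: the wdf_inc factor cancels here.
double normalised_tf(const FieldStats& s) {
    return s.field_length > 0 ? s.term_wdf / s.field_length : 0.0;
}
```

Calling count_field on the same title with wdf_inc 1 and then 5 gives raw
counts that differ by a factor of 5 but the same normalised_tf. Whether a
given letor feature is normalised like this would need checking against the
letor code.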



