[Xapian-devel] [GSOC 2014] Indexing INEX dataset

Olly Betts olly at survex.com
Tue Mar 11 23:57:55 GMT 2014


On Tue, Mar 11, 2014 at 03:20:31PM +0100, Parth Gupta wrote:
> >
> > On current trunk, we index the title with prefix "S" by default in
> > omindex, though with a wdf inc of 5 rather than 1:
> >
> >             indexer.index_text(title, 5, "S");
> >
> > So I don't think you need that change to omindex now.
> 
> Yes, but please make sure to change 5 to 1; otherwise, divide the final count
> statistics by 5. :)

We really need to resolve any instances where letor requires code in
other parts of Xapian to be patched.

In this case, possibly the bias on the title should be applied differently,
but won't this just mean that both the wdfs and the field length for the S
prefix are 5 times larger, and so it won't matter?
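
To make that concrete, here is a minimal sketch of the arithmetic. It is not
Xapian or letor code: FieldStats, index_field() and normalised_tf() are
hypothetical names, and it assumes that the field length is just the sum of
the wdfs in that field and that the feature in question is a ratio of wdf to
field length.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical per-field statistics; not part of Xapian's or letor's API.
struct FieldStats {
    unsigned wdf = 0;     // accumulated wdf of the term of interest
    unsigned length = 0;  // accumulated wdf of every term in the field
};

// Split text on whitespace and add wdf_inc per occurrence, mimicking the
// effect of the wdf_inc argument passed to index_text().
FieldStats index_field(const std::string& text, const std::string& term,
                       unsigned wdf_inc) {
    assert(wdf_inc > 0);
    FieldStats stats;
    std::istringstream words(text);
    std::string word;
    while (words >> word) {
        stats.length += wdf_inc;                  // every term adds to the length
        if (word == term) stats.wdf += wdf_inc;   // only matches add to the wdf
    }
    return stats;
}

// Length-normalised term frequency (assumed form of the feature).
double normalised_tf(const FieldStats& s) {
    return s.length ? static_cast<double>(s.wdf) / s.length : 0.0;
}
```

With the title "xapian search xapian library" and the term "xapian", a wdf inc
of 1 gives wdf 2 and length 4, while a wdf inc of 5 gives wdf 10 and length 20,
so the ratio is 0.5 either way. Any feature that uses the raw wdf or the raw
field length on its own would still come out 5 times larger.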

Cheers,
    Olly


