[Xapian-devel] [GSOC 2014] Indexing INEX dataset
Olly Betts
olly at survex.com
Thu Mar 20 01:35:20 GMT 2014
On Mon, Mar 17, 2014 at 09:07:29PM +0100, Parth Gupta wrote:
> Wouldn't setting the weight of terms in title back to normal (e.g. 5 to 1)
> by below line, automatically adjust the wdfs and field lengths?
>
> indexer.index_text(title, 5, "S"); -> indexer.index_text(title, 1, "S");
>
> if it does not then we should include that part in the patch too. I like to
> create a patch for xapian-letor for resolving common code of xapian.
I'm not sure I follow.
The reason we use 5 here is that matching terms in the title are usually
a good indicator of a page that should be ranked highly for a search
(note omindex is not usually working in a domain where evil SEOs are
trying to distort the rankings).
If we simply change 5 to 1 here, then the title won't be given any extra
consideration, which would be a regression in this area.
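To make the effect concrete, here is a toy sketch of what the second argument does. It is not Xapian code: `toy_index_text` and `toy_doc_length` are hypothetical helpers that only mimic how a wdf increment accumulates per term, assuming whitespace-separated words and ignoring stemming, case folding and positions. Document length is taken as the sum of the wdfs.

```cpp
#include <map>
#include <sstream>
#include <string>

// Toy stand-in for a document's term list: term -> wdf.
using TermWdfs = std::map<std::string, unsigned>;

// Hypothetical helper: each whitespace-separated word in `text` adds
// `wdf_inc` to the wdf of the term `prefix + word`.
void toy_index_text(TermWdfs& doc, const std::string& text,
                    unsigned wdf_inc, const std::string& prefix = "") {
    std::istringstream in(text);
    std::string word;
    while (in >> word) {
        doc[prefix + word] += wdf_inc;
    }
}

// Hypothetical helper: length of the toy document, i.e. the sum of all wdfs.
unsigned toy_doc_length(const TermWdfs& doc) {
    unsigned total = 0;
    for (const auto& entry : doc) {
        total += entry.second;
    }
    return total;
}
```

With a title of "xapian search", calling `toy_index_text(doc, title, 5, "S")` leaves each of the two S-prefixed terms with a wdf of 5, whereas passing 1 leaves each with a wdf of 1, so a title match then counts no more than a single occurrence in the body.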
Cheers,
Olly