Introductory mail

Olly Betts olly at survex.com
Mon Mar 6 21:07:34 GMT 2017


On Sun, Mar 05, 2017 at 03:58:08PM +0530, Vignesh Rao wrote:
> Hello Xapian Community,

Hi Vignesh,

> I have gone through the project ideas and would like to work on "Weighting
> Schemes". Can someone please help me out to get started with it. Please
> share some tutorials or research papers if available.

We generally recommend starting by getting the code and installing the tools
needed to build it, then working on something small to get familiar with the
code.  We've put a wiki page together to help with this process:

https://trac.xapian.org/wiki/GSoC%20Guide

If you want research papers for suitable weighting schemes, you'll have to
take a look through the literature for them.

We now implement most of the well-known schemes (though, as the project idea
notes, not all the SMART normalisations are supported yet), but papers are
still being published proposing new schemes or variants on existing schemes
(like BM25+/PL2+/Dir+/Piv+, which Vivek implemented last year).
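
To give a flavour of what such a variant looks like, here is a minimal Python
sketch of the per-term BM25+ weight as it is usually stated in the literature.
This is not Xapian's implementation: the function name, the parameter names
and the default values (k1=1.2, b=0.75, delta=1.0) are assumptions chosen for
illustration only.

```python
import math


def bm25_plus_term_weight(tf, doc_len, avg_doc_len, num_docs, doc_freq,
                          k1=1.2, b=0.75, delta=1.0):
    """Weight one query term contributes to one document under BM25+.

    Illustrative sketch only; names and defaults are assumptions,
    not taken from the Xapian source.
    """
    # A term which doesn't occur in the document contributes nothing.
    if tf <= 0:
        return 0.0
    # IDF in the form commonly used when BM25+ is written out.
    idf = math.log((num_docs + 1) / doc_freq)
    # Document length normalisation, as in classic BM25.
    norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    tf_part = (k1 + 1.0) * tf / (norm + tf)
    # The "+" is delta: a lower bound on the term frequency component,
    # so a match in a very long document still earns at least idf * delta.
    return idf * (tf_part + delta)
```

With delta=0 this reduces to the familiar BM25 term weight (ignoring the
query term frequency factor), which is why it counts as a variant rather
than a new scheme.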

There's also scope for tracking more statistics, which would allow some of
the already implemented schemes to be better optimised.

Cheers,
    Olly


