GSoC 2016. Weighting formula

James Aylett james-xapian at
Sat Mar 12 15:22:57 GMT 2016

On Sat, Mar 12, 2016 at 04:31:51AM +0300, Рудольф Лайко wrote:

> My name is Rudolph Layko. I am pursuing my bachelor's degree (2nd year) in
> Applied Mathematics at National Research University - Higher School of
> Economics (Moscow, Russia).

Hi, Rudolph -- welcome to Xapian!

> According to this, I want to ask if there is anything that I can start
> working on right away in order to get better acquainted with the problems,
> or maybe you could provide me with some relevant information for further
> work, beyond what is already linked in the project description.

For the weighting schemes, I'd recommend you start by looking at one
that /is/ linked in the project description: looking at other SMART
normalisations that could be added to our TF/IDF implementation. Note
that Nishad Dawkhar has a PR open for max-wdf
so you'd need to choose a different one (and coordinate openly with
anyone else who might be looking at this, so we don't end up with
people looking at the same normalisation independently). Working on a
normalisation will likely require you to track another document
statistic, and then use that to implement the new normalisation.
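To make "track another statistic, then use it" a little more concrete,
here is a rough sketch in Python. It is not Xapian code and doesn't use
Xapian's API: the function names, the dict-of-wdfs document
representation and the choice to score absent terms as 0 are all
assumptions made purely for illustration. The formulae are the usual
textbook SMART term-frequency components (natural, logarithmic,
augmented, log-average).

```python
import math


def doc_stats(wdfs):
    """Per-document statistics a normalisation might need.

    `wdfs` is a hypothetical {term: wdf} mapping for one document.
    """
    return {
        "doc_length": sum(wdfs.values()),    # total within-document frequency
        "unique_terms": len(wdfs),           # number of distinct terms
        "max_wdf": max(wdfs.values(), default=0),
    }


def wdfn_natural(wdf):
    # SMART 'n': the raw within-document frequency.
    return float(wdf)


def wdfn_log(wdf):
    # SMART 'l': 1 + ln(wdf); absent terms score 0 (an assumption here).
    return 1.0 + math.log(wdf) if wdf > 0 else 0.0


def wdfn_augmented(wdf, max_wdf):
    # SMART 'a': needs the largest wdf in the document.
    if wdf <= 0 or max_wdf <= 0:
        return 0.0
    return 0.5 + 0.5 * wdf / max_wdf


def wdfn_log_average(wdf, doc_length, unique_terms):
    # SMART 'L': needs the average wdf in the document, which is
    # doc_length / unique_terms.
    if wdf <= 0 or unique_terms <= 0:
        return 0.0
    avg_wdf = doc_length / unique_terms
    return (1.0 + math.log(wdf)) / (1.0 + math.log(avg_wdf))


if __name__ == "__main__":
    doc = {"xapian": 4, "search": 2, "index": 2}
    stats = doc_stats(doc)
    for term, wdf in doc.items():
        print(term,
              wdfn_augmented(wdf, stats["max_wdf"]),
              wdfn_log_average(wdf, stats["doc_length"],
                               stats["unique_terms"]))
```

The point is only that the last two functions can't be computed from
the wdf alone; each needs an extra per-document figure to be available
at weighting time.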

However, if that doesn't suit, you could pick any other small task to
get familiar with Xapian. For instance, this is a small change with
a (linked) patch from a number of years ago, and some comments from

If we can avoid ABI changes (there's some discussion on this in
xapian-core/HACKING) then it doesn't hugely matter when we get this
in, but if it requires an ABI change then doing it during the 1.3.x
period (which we're in now) is going to be easiest.

But that's really just a randomly-chosen issue in the tracker that
looks fairly small to me. You can search the tracker by component to
find something you like; for instance, here are all the issues against
the QueryParser:


  James Aylett, occasional trouble-maker
