[Xapian-devel] Project: Weighting Schemes

Olly Betts olly at survex.com
Mon Mar 10 03:28:26 GMT 2014


On Sun, Mar 09, 2014 at 08:34:58AM +0530, Reetesh Ranjan wrote:
> Now, how should I start off? Until the bugs and flaws of the current
> implementation are known, we won't be able to work in the correct
> direction.

I'm not aware of any bugs in the current weighting schemes.

But there's scope for enhancements - as the project idea says, there are
more SMART schemes we could support if we extended the database backend
to track more statistics.

Also, some of the weighting schemes could probably be optimised.  For
example, some of the DfR schemes currently call log() (or log2() or
log10()) more than is necessary, and calculating logarithms is usually
fairly slow compared to more basic floating point operations like
multiplication and addition.  I already improved PL2Weight, so if
you look at the history of weight/pl2weight.cc in git, you can see
the sort of thing I'm talking about.
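
To illustrate the general idea only (this is not the code from
weight/pl2weight.cc, and the names naive_score and FastScorer are
hypothetical), here is a minimal sketch assuming a per-document formula
with a DfR-like shape.  The naive version evaluates logarithms on every
call; the rewritten version uses log(a/b) = log(a) - log(b) to move
everything that depends only on the per-term value lambda into the
constructor, leaving a single log2() call per document.

```cpp
#include <cmath>

// Illustrative constant; not taken from Xapian.
constexpr double kPi = 3.14159265358979323846;

// Naive form: several logarithm evaluations on every call.
// x is a per-document value, lambda a per-term value (both assumed > 0).
double naive_score(double x, double lambda) {
    return x * std::log2(x / lambda)
         + (lambda - x) * std::log2(std::exp(1.0))
         + 0.5 * std::log2(2.0 * kPi * x);
}

// Rewritten form: terms depending only on lambda are computed once.
class FastScorer {
    double per_x;     // -(log2(lambda) + log2(e)), multiplied by x later
    double constant;  // lambda * log2(e) + 0.5 * log2(2 * pi)

  public:
    explicit FastScorer(double lambda) {
        const double log2e = std::log2(std::exp(1.0));
        per_x = -(std::log2(lambda) + log2e);
        constant = lambda * log2e + 0.5 * std::log2(2.0 * kPi);
    }

    // Only one log2() call per document.
    double operator()(double x) const {
        return (x + 0.5) * std::log2(x) + x * per_x + constant;
    }
};
```

Both forms are algebraically equal, so they should agree up to floating
point rounding; whether the rewrite is actually faster for a given scheme
would need to be measured.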

So I'd suggest a good first step would be looking at adding support for
another SMART normalisation, or trying to optimise an existing weighting
scheme.

Cheers,
    Olly
