Weighting Schemes: Evaluation results
James Aylett
james-xapian at tartarus.org
Thu Jul 28 10:33:34 BST 2016
On Thu, Jul 28, 2016 at 12:25:38PM +0530, Parth Gupta wrote:
> I can say FIRE is also a reliable source but INEX/TREC are
> better. INEX can give you free access and TREC is not freely
> available.
>
> I roughly remember that there was a discussion with this year's
> GSoC student Ayush about INEX data. He had also obtained it, so this
> would also be a good way to collaborate with him :) and try to
> establish a common evaluation dataset for the future.
I'd forgotten about INEX; it doesn't seem to be running any more, but
there's still a range of datasets available.
It looks like the bigger datasets are likely to survive a while,
although if we come to use these regularly we should contact the
hosting providers so we get warning if they're going to disappear. The INEX
supporting software is on Google Code, which will disappear at the end
of this year, so ideally someone would convert that to git and make it
available longer-term. (I've grabbed a download of the subversion
repository for now.)
I think ideally we'd have notes on using a variety of datasets, since
they all seem to cover slightly different scenarios. Probably best to
start by creating an 'evaluation' page on the wiki to state which
datasets have been used, hold any notes on them, and serve as
somewhere to drop the results for the time being.
J
--
James Aylett, occasional trouble-maker
xapian.org