[Xapian-devel] Test Dataset for performance and accuracy analysis

Aarsh Shah aarshkshah1992 at gmail.com
Tue Mar 4 15:46:01 GMT 2014


Hi Parth,

I implemented the DFR algorithms in Xapian as part of GSoC last year under
the mentorship of Olly. This year, I want to work on analyzing and
optimizing the performance of the DFR algorithms and comparing them with
BM25. I also want to work on profiling the query expansion schemes and
testing the relevance (precision and recall) and speed (time taken) of the
algorithms.

However, for this I need a well-defined data set containing a considerable
amount of textual data, query logs containing queries that can be run on
it, and a set of relevant or expected documents that can be compared with
the actual results to measure the relevance of the schemes. Could you
please help me with this? Thank you so much for your time.

-Regards
-Aarsh

