<div dir="ltr">Hi Olly,<br><br>I am asking Parth whether he can help me find a dataset containing query logs and expected results. Also, is the evaluation module fully functional? I saw that some issues are still open on it. I initially thought I would write the query log and expected results by hand for some Wikipedia articles, but I realize now that you have a point, as we need to test on a large number of articles.<br>

<br>-Regards<br>-Aarsh</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Mar 4, 2014 at 5:26 PM, Olly Betts <span dir="ltr">&lt;<a href="mailto:olly@survex.com" target="_blank">olly@survex.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On Sat, Mar 01, 2014 at 10:12:36AM +0530, Aarsh Shah wrote:<br>

> I am thinking of working on the following ideas for my GSOC proposal<br>

> based on my discussions with Olly and my own understanding. Rather<br>

> than focusing on an entire perftest module, I have decided to focus on<br>

> implementing performance tests for weighting schemes based on a<br>

> Wikipedia dump and in addition to that, build a framework to measure<br>

> the accuracy and relevance of new and old weighting schemes.<br>

<br>

</div>I mentioned this on IRC (not sure if it was before or after you sent<br>

this mail), but for the benefit of anyone reading who wasn't on IRC<br>

then, we do already have an evaluation module which was originally<br>

written by Andy MacFarlane, and further worked on by Gaurav Arora:<br>

<br>

<a href="https://github.com/samuelharden/xapian-evaluation" target="_blank">https://github.com/samuelharden/xapian-evaluation</a><br>

<br>

> * Measuring the relevance and accuracy of weighting schemes.*<br>

><br>

>    - The accuracy of a weighting scheme can be measured by using the<br>

<div class="">>    concepts of precision and recall:<br>

>    <a href="http://en.wikipedia.org/wiki/Precision_and_recall" target="_blank">http://en.wikipedia.org/wiki/Precision_and_recall</a><br>

</div>>    - Once we have the static Wikipedia dump in place, we can hardcode<br>

<div class="">>    expected results for each query we plan to run on the data set.<br>

<br>

</div>How would you get a list of suitable queries to run against a wikipedia<br>

dump?  I've not seen public query logs for wikipedia.<br>

<br>

How would you get the "expected results for each query"?  Producing a<br>

set of relevance judgements is rather time consuming.  If the relevance<br>

judgements are poor quality, the conclusions of the evaluation become<br>

untrustworthy.<br>
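As a minimal sketch of the precision and recall measures referred to above, the following computes both for a single query, given a result list and a set of relevance judgements. All names here (PrecisionRecall, evaluate_query, the string document identifiers) are hypothetical and are not taken from the xapian-evaluation module.<br>

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type: not part of Xapian or xapian-evaluation.
struct PrecisionRecall {
    double precision;  // fraction of retrieved documents that are relevant
    double recall;     // fraction of relevant documents that were retrieved
};

// Compare one query's retrieved documents against its relevance judgements.
// Document identifiers are plain strings purely for illustration.
PrecisionRecall
evaluate_query(const std::vector<std::string>& retrieved,
               const std::set<std::string>& relevant)
{
    // Count each retrieved document once, even if the list repeats it.
    const std::set<std::string> unique_retrieved(retrieved.begin(),
                                                 retrieved.end());
    std::size_t hits = 0;
    for (const std::string& doc : unique_retrieved) {
        if (relevant.count(doc) != 0) ++hits;
    }

    PrecisionRecall result;
    // Define both measures as 0 when their denominator would be 0.
    result.precision = unique_retrieved.empty()
        ? 0.0 : static_cast<double>(hits) / unique_retrieved.size();
    result.recall = relevant.empty()
        ? 0.0 : static_cast<double>(hits) / relevant.size();
    assert(result.precision >= 0.0 && result.precision <= 1.0);
    assert(result.recall >= 0.0 && result.recall <= 1.0);
    return result;
}
```

For example, if four documents are retrieved and two of them appear among three judged-relevant documents, precision is 2/4 and recall is 2/3. The numbers are only as trustworthy as the relevance judgements fed in.<br>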

<br>

I suspect it would be better to use an existing dataset which included<br>

queries and relevance judgements - Parth might know if there's one we<br>

could use.<br>

<br>

>         *Profiling and Optimizing Weighting/Query Expansion Schemes*<br>

><br>

>    - Profile DFR schemes and identify/optimize bottlenecks.<br>

>    - Profile stemming algorithms and indexing.<br>

>    - For profiling most searches, which are fast, Valgrind-based profilers<br>

<div class="">>    can be used. However, perf can be brought in for slower searches, as we had<br>

>    discussed that Valgrind-based profilers may not be efficient for I/O-bound<br>

>    tasks.<br>

</div>>    - The speed will first be tested using the RealTime::now() function and<br>

<div class="">>    then the profiler will be brought in if the speed appears to be too slow.<br>

</div>>    - As mentioned on the ideas page too, a lot of the optimization can/will<br>

<div class="">>    happen by mapping the formulas used to a smaller set of formulas and reducing<br>

>    the number of times computationally heavy operations such as log() are called.<br>

</div>>    - Create a huge static data-set, preferably a Wikipedia dump.<br>

>    - Test the speed of the DFR schemes against the speed of BM25 and decide<br>

<div class="">>    on a default weighting scheme. Our best bet would be the parameter-free DPH<br>

>    scheme, as the performance of the ones with parameters depends on the input<br>

>    data too.<br>

</div>>    - Similarly, a speed analysis of query expansion schemes will also be<br>

<div class="">>    done to decide on a default query expansion scheme. These can be optimized<br>

>    too.<br>
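To illustrate the point in the list above about reducing the number of log() calls, here is a rough sketch using a made-up weighting formula, wdf * log(N / termfreq). The formula and the names naive_weight and HoistedWeight are assumptions for illustration only, not one of the DFR schemes and not Xapian's API. The idea is that the log() term depends only on per-term statistics, so it can be computed once per term instead of once per document.<br>

```cpp
#include <cassert>
#include <cmath>

// Naive version: calls log() for every document that is scored.
// The formula is a stand-in for illustration, not a real DFR scheme.
double naive_weight(double wdf, double collection_size, double termfreq)
{
    assert(collection_size > 0.0 && termfreq > 0.0);
    return wdf * std::log(collection_size / termfreq);
}

// Hoisted version: the log() depends only on per-term statistics, so it
// is evaluated once when the term is set up and reused for each document.
class HoistedWeight {
    double log_part_;

  public:
    HoistedWeight(double collection_size, double termfreq)
        : log_part_(std::log(collection_size / termfreq))
    {
        assert(collection_size > 0.0 && termfreq > 0.0);
    }

    // Per-document work is now a single multiplication.
    double operator()(double wdf) const { return wdf * log_part_; }
};
```

Both versions should give the same weight for the same inputs; only the number of log() evaluations per query term changes. Whether this matters in practice is something profiling would need to confirm.<br>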

><br>

>         I have not quite been able to decide on an ideal patch for the idea.<br>

> Could you please suggest some ideas for a patch as an initial first<br>

> step to include with my proposal?<br>

<br>

</div>I'd suggest trying out profiling something, to get a feel for how the<br>

profiling tools work, and for how long the process of finding a<br>

bottleneck and fixing it takes.<br>
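Before reaching for a profiler, a simple wall-clock measurement can show whether something is slow enough to be worth profiling. The following is a minimal sketch using the standard std::chrono::steady_clock rather than any Xapian-internal timer; the helper name time_seconds is hypothetical.<br>

```cpp
#include <cassert>
#include <chrono>

// Run func() `repetitions` times and return the mean wall-clock time per
// call in seconds. time_seconds is a hypothetical helper, not a Xapian API.
template <typename Func>
double time_seconds(Func&& func, int repetitions)
{
    assert(repetitions > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const Clock::time_point start = Clock::now();
    for (int i = 0; i < repetitions; ++i) {
        func();
    }
    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return elapsed.count() / repetitions;
}
```

A call such as time_seconds([&] { run_search(); }, 100), where run_search is whatever operation is being measured, gives a per-call average. Repeating the operation smooths out noise, but caching effects between repetitions can make the average look better than a cold run would.<br>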

<br>

Cheers,<br>

    Olly<br>

</blockquote></div><br></div>