<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">Dear Olly,</p><p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">Thank you for your reply. For the tf-idf weighting, I was referring to the traditional tf-idf weighting scheme, which multiplies the term frequency by the term rareness. I would try it with both the intersection-query and the multiword-query approaches.</p>
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">This shouldn't take long. However, since I will have exams during the first two weeks, I have allowed a rather generous two weeks for building and testing the scheme.</p>
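<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">To make the idea concrete, here is a minimal Python sketch of the kind of tf-idf scoring I mean. It is only an illustration under my own assumptions: term frequency is the raw count, term rareness is log(N / df), and the function name <code>tf_idf_scores</code> and the dictionary-of-token-lists input are hypothetical, not part of any existing API.</p>

```python
import math
from collections import Counter


def tf_idf_scores(query_terms, docs):
    """Score each document as the sum over query terms of tf * idf.

    docs maps a document id to its list of tokens (a hypothetical
    in-memory stand-in for a real index).
    """
    n_docs = len(docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    scores = {}
    for doc_id, tokens in docs.items():
        tf = Counter(tokens)  # raw term counts within this document
        score = 0.0
        for term in query_terms:
            if df[term] == 0:
                continue  # term occurs nowhere, so it contributes nothing
            idf = math.log(n_docs / df[term])  # rareness of the term
            score += tf[term] * idf
        scores[doc_id] = score
    return scores


docs = {
    "a": ["xapian", "search", "search"],
    "b": ["search", "engine"],
    "c": ["weather"],
}
scores = tf_idf_scores(["search"], docs)
```

<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">A term that appears in every document gets an idf of zero here, so a smoothed variant may turn out to be preferable in practice.</p>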
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">For the DFR schemes, instead of simply assuming that the randomness follows a binomial or geometric distribution, I was thinking of using the Monte Carlo idea to approximate the probabilities. Many different approximations are possible. I will probably try Monte Carlo complete path (simulate N = mn runs of the random walk, initiated at each page exactly m times) and Monte Carlo complete path stopping at dangling nodes (simulate N = mn runs of the random walk, initiated at each page exactly m times and stopping when a run reaches a dangling node), which will probably be faster than the Monte Carlo end-point with random start approach (simulate N runs of the random walk, each initiated at a randomly chosen page).</p>
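<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">As a rough illustration of the "complete path stopping at dangling nodes" variant, here is a small Python sketch. Several details are my own assumptions rather than a fixed design: the graph is a plain dictionary of out-links, each walk also stops at every step with probability 1 - <code>damping</code>, and the estimate for a page is simply its share of all visits. The name <code>mc_complete_path</code> is hypothetical.</p>

```python
import random


def mc_complete_path(links, m, damping=0.85, seed=0):
    """Estimate visit frequencies with N = m * n random walks.

    links maps each page to a list of pages it links to; every link
    target must also be a key of links. A page with an empty list is
    a dangling node.
    """
    rng = random.Random(seed)
    visits = {page: 0 for page in links}

    for start in links:          # start at each page ...
        for _ in range(m):       # ... exactly m times
            page = start
            while True:
                visits[page] += 1  # count every page on the path
                out = links[page]
                # Stop at a dangling node, or (assumed) with
                # probability 1 - damping at each step.
                if not out or rng.random() >= damping:
                    break
                page = rng.choice(out)

    total = sum(visits.values())
    return {page: count / total for page, count in visits.items()}


graph = {"a": ["b", "c"], "b": ["c"], "c": []}
estimate = mc_complete_path(graph, m=1000)
```

<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">Counting every page along the path, rather than only the end point, is what lets each run contribute several samples.</p>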
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">For the relevance feedback techniques, I will probably focus more on explicit feedback, which takes users' judgements of a document's or query's relevance into account. I'll try implicit feedback, too, for instance the time a user spends on a page before moving to another one.</p>
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">There are quite a number of parameters I will need to fine-tune. It is possible that more than one weighting scheme will be used in the search engine, so I'll need parameters that control how much weight each scheme receives. Say I use both tf-idf weighting and explicit feedback weighting: I will need to add the two weights together, not exactly 1:1 but in some better-tuned proportion. Simply multiplying the two weights won't work, as different weighting schemes should carry different importance according to the results we get.</p>
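<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">A minimal Python sketch of the kind of weighted sum I have in mind is below. The mixing parameter <code>alpha</code> and the function name <code>combine_scores</code> are hypothetical, and the sketch assumes both sets of scores have already been scaled to comparable ranges.</p>

```python
def combine_scores(tfidf, feedback, alpha):
    """Blend two per-document score dictionaries.

    alpha is the share given to the tf-idf score; the remaining
    1 - alpha goes to the feedback score. A document missing from
    one dictionary is treated as scoring 0.0 under that scheme.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    doc_ids = set(tfidf) | set(feedback)
    return {
        doc_id: alpha * tfidf.get(doc_id, 0.0)
        + (1.0 - alpha) * feedback.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


# Give tf-idf 70% of the weight and explicit feedback 30%.
combined = combine_scores({"a": 2.0, "b": 1.0}, {"b": 4.0}, alpha=0.7)
```

<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">The value of <code>alpha</code> would then be chosen by comparing retrieval results for several settings rather than being fixed in advance.</p>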
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">Should I put these in the proposal, too?</p><p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">I am sorry that I am currently traveling overseas and not very responsive, but I will definitely improve the proposal based on your feedback within the next two days.</p>
<p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px"><br></p><p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">Best Regards,</p><p style="font-family:Verdana,Arial,Helvetica,sans-serif;font-size:10px">
Shaohuan</p><br><div class="gmail_quote">On Wed, Apr 4, 2012 at 10:37 AM, Olly Betts <span dir="ltr">&lt;<a href="mailto:olly@survex.com">olly@survex.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="im">On Wed, Apr 04, 2012 at 10:12:53AM +0200, Shaohuan Li wrote:<br>
> I am trying to apply for the weighting scheme project and I've submitted a<br>
> proposal according to the template few days ago. I think the proposal is<br>
> still not good enough, can you help give some suggestions on what else<br>
> shall I do more research on&include in the proposal?<br>
<br>
</div>Check your proposal - I have already made some comments on it.<br>
<br>
Cheers,<br>
Olly<br>
</blockquote></div><br><br clear="all"><div><br></div>-- <br>Best Regards,<div><font color="#000099">Shaohuan Li</font></div><br>