<div dir="ltr"><div><div><div><div><div><div>Hello,<br><br></div>So my idea goes like this. I have been working on Question Answering (QA) systems. I developed a QA system for "when" type questions (sorry, I can't provide the source code at the moment because my paper is under review at SIGIR 2014). I used part-of-speech information and developed a weighted scoring system.<br>

</div>Now I plan to develop a generic QA system that covers a much wider range of questions. The biggest drawback of my previous QA system was the lack of a relevance-measuring mechanism. I want to develop a relevance measure between a query and a sentence. I believe many relevance measures already exist, but as far as I know they relate a query to a document. To develop a relevance measure I need to consider a large number of sentences and questions, so that a generic feature set can be formed and then used in my ML algorithm. This needs a huge dataset of documents, which I don't have due to lack of financial support. I was planning to use the AQUAINT 2 dataset, but it costs $500, which I cannot afford. <br>

</div>If I succeed in building a relevance measure between a query and a sentence, I will consider only the sentences that are relevant. I will then apply my scoring system to those sentences to select the final answer sentence. In my previous project I got an efficiency of ~74% on 200 test queries. I believe that with a proper relevance measure I can cross the 90% mark.<br>
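<br>A minimal sketch of this two-stage idea in Python, under stated assumptions: `relevance` is a plain token-overlap (Jaccard) score standing in for the learned relevance measure, and `answer_score` is a hypothetical placeholder (it counts four-digit, year-like tokens), not the part-of-speech scoring system from the earlier project. All function names and the 0.2 threshold are illustrative only.<br>

```python
import re


def tokens(text):
    """Lowercase alphanumeric tokens; a crude stand-in for real preprocessing."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance(query, sentence):
    """Jaccard overlap of query and sentence tokens.

    ASSUMPTION: stands in for the learned query-sentence relevance measure.
    """
    q, s = tokens(query), tokens(sentence)
    if not q or not s:
        return 0.0
    return len(q.intersection(s)) / len(q.union(s))


def answer_score(sentence):
    """HYPOTHETICAL placeholder scorer for "when" questions:
    counts four-digit, year-like tokens in the sentence."""
    return len(re.findall(r"\b\d{4}\b", sentence))


def select_answer(query, sentences, threshold=0.2):
    """Stage 1: keep sentences whose relevance is at or above the threshold.
    Stage 2: return the kept sentence with the highest answer score,
    or None if no sentence passes stage 1."""
    relevant = [s for s in sentences if relevance(query, s) >= threshold]
    if not relevant:
        return None
    return max(relevant, key=answer_score)


if __name__ == "__main__":
    query = "When was the bridge opened?"
    sentences = [
        "The bridge was opened in 1932.",
        "In 1950 and 1960 the weather was mild.",
        "The bridge is long.",
    ]
    print(select_answer(query, sentences))
```

<br>In this toy input the second sentence has the most year-like tokens, but it is dropped by the relevance filter before scoring, which is the effect the relevance stage is meant to have.<br>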

</div>Please give your suggestions on my project idea. It would be very helpful.<br><br></div>Regards<br></div>Karthik  <br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Feb 26, 2014 at 5:16 PM, Parth Gupta <span dir="ltr">&lt;<a href="mailto:pargup8@gmail.com" target="_blank">pargup8@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>The Letor project involves a decent amount of Machine Learning, while all the ranking-related projects are around IR. It's better to introduce your idea on the mailing list, where all the mentors can have a detailed look at it, potential mentors can respond, and the idea is kind of registered under your name.<br>


<br></div><div>Cheers,<br></div>Parth.<br></div><div class="gmail_extra"><br><br><div class="gmail_quote"><div><div class="h5">On Wed, Feb 26, 2014 at 10:20 AM, Olly Betts <span dir="ltr">&lt;<a href="mailto:olly@survex.com" target="_blank">olly@survex.com</a>&gt;</span> wrote:<br>


</div></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div class="h5"><div><div>On Tue, Feb 25, 2014 at 03:58:09PM +0530, karthik iyer wrote:<br>

>     I am C Karthik Iyer, a 3rd year B Tech student at NITK Surathkal. I am<br>

> interested in working on projects on Information Retrieval and Machine<br>

> Learning. I have previous experience working on projects regarding<br>

> Question Answering Systems.<br>

>     I have a project idea which includes both IR and ML, but I don't know how<br>

> feasible the idea is. Could you say when you will be available on IRC<br>

> so that I can discuss the idea with you?<br>

<br>

</div></div>I can't say for certain when I'll be monitoring IRC, but I'm in UTC+13.<br>

Other mentors are in a variety of timezones.<br>

<br>

If the idea is complex, email might be better though.<br>

<br>

Cheers,<br>

    Olly<br>

<br></div></div><div class="">

_______________________________________________<br>

Xapian-devel mailing list<br>

<a href="mailto:Xapian-devel@lists.xapian.org" target="_blank">Xapian-devel@lists.xapian.org</a><br>

<a href="http://lists.xapian.org/mailman/listinfo/xapian-devel" target="_blank">http://lists.xapian.org/mailman/listinfo/xapian-devel</a><br>

</div></blockquote></div><br></div>

</blockquote></div><br></div>