[Xapian-devel] GSoC 2014

karthik iyer karthikiyer2000 at gmail.com
Fri Feb 28 13:18:10 GMT 2014


Hi,

Frankly, I am not yet sure how this would fit into the Xapian toolkit
either; I had hoped you could help me with that. I haven't gone through
the Xapian documentation properly yet. I will do so and reply in a couple
of days on how my idea could (if at all) fit into your toolkit.

Regards,
Karthik


On Fri, Feb 28, 2014 at 6:13 PM, Olly Betts <olly at survex.com> wrote:

> On Thu, Feb 27, 2014 at 01:11:24PM +0530, karthik iyer wrote:
> > So my idea goes like this. I have been working on Question Answering
> > (QA) systems. I developed a QA system for "when"-type questions (sorry,
> > I can't provide the source code at the moment because my paper is under
> > review at SIGIR 2014). I used part-of-speech information and developed
> > a weighted scoring system.
> > Now I plan to develop a generic QA system that covers a wide range of
> > questions. The biggest drawback of my previous QA system was the lack
> > of a relevance-measuring mechanism. I want to develop a relevance
> > measure between a query and a sentence. I believe many implementations
> > of relevance measures already exist, but as far as I know those relate
> > a query to a document.
>
> The term "document" is what the literature uses, but the mental image
> that might conjure up of a multi-page printout with a staple through the
> corner is misleading.  The "documents" being matched could be single
> sentences.
>
> > To develop a relevance measure I need to consider a large number of
> > sentences and questions, so that a generic feature set can be formed
> > and then used in my ML algorithm. This needs a huge dataset of
> > documents, which I don't have due to a lack of financial support. I was
> > planning to use the AQUAINT 2 dataset, but it costs $500, which I
> > cannot afford.
> > If I succeed in building a relevance measure between a query and a
> > sentence, I will consider only the sentences that are relevant. Then I
> > will apply my scoring system to those sentences to select the final
> > answer sentence. In my previous project I got an efficiency of ~74% on
> > 200 test queries. I believe that with a proper relevance measure I can
> > cross the 90% mark.
> > Please give your suggestions on my project idea. It would be very
> > helpful.
>
> The first concern I have is whether this is something we actually have
> the skills to mentor.  I personally don't have any previous experience
> of Question Answering systems - I don't know about the other mentors.
>
> I'm also unclear where Xapian fits into the picture.
>
> Are you talking about building this as a new feature for Xapian?
>
> Or is it a framework or application built on top of Xapian?
>
> Or is it a separate system entirely?
>
> Cheers,
>     Olly
>