GSOC 2018: Diversification of Search Results

Gaurav Arora gauravarora.daiict at gmail.com
Fri Apr 27 04:07:39 BST 2018


We are equally excited about working with you over the summer.

I think you missed Olly's reply on IRC; you can find it in the logs here:
https://botbot.me/freenode/xapian/2018-04-24/?msg=99336093&page=1

   - olly
   icebyte[m]: i think that probably needs to go through SFC
   (https://sfconservancy.org/) as the "legal entity"
   - 2:05 am <https://botbot.me/freenode/xapian/msg/99336095/>
   icebyte[m]: i can talk to them about it



- Gaurav

On Fri, Apr 27, 2018 at 12:23 AM, Uppinder Chugh <uppinderchugh at gmail.com>
wrote:

> Thanks for selecting my proposal for GSoC, looking forward to
> contributing further to Xapian. I posted this on IRC but didn't
> receive any reply, so I'm presuming this must've been missed and thus
> posting it here. As proposed, I plan to use ClueWeb09 Category B
> dataset for evaluating diversification. A hosted copy is available
> (http://lemurproject.org/clueweb09.php/index.php#Services) which may
> be accessed but requires a license. The license is free and granted to
> an organisation by applying online
> (http://lemurproject.org/clueweb09/organization_agreement.clueweb09.worder.Mar30-18.pdf).
> If a maintainer could have a look at this, that would be great. It's
> mentioned on the website that it takes around 2 weeks to obtain the
> license, and as discussed in the interview, I might evaluate the
> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>
> On Sat, Mar 10, 2018 at 12:08 AM, Uppinder Chugh
> <uppinderchugh at gmail.com> wrote:
> >
> > Hi, I'd like to share my proposal for GSoC and get feedback on it.
> >
> > https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing
> >
> > Thanks,
> > Uppinder Chugh
> >
> > On Mon, Feb 26, 2018 at 2:14 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
> >>
> >> In particular, I have the following questions:
> >>
> >> a) Is wrapping Xapian::Mset matcher::get_set(..) suitable in this
> >> scenario and with the API? Also, how can I let the user manually
> >> enable diversification while configuring their result set via the
> >> Matcher API?
> >>
> >> b) Should I include the LC clustering algorithm in xapian-core/cluster
> >> (as there's the base class Cluster which can be inherited) or make it
> >> part of the diversification implementation?
> >>
> >> c) Apart from the proposed methods, I'd be writing automated tests and
> >> examples and documenting the new feature. Some tips here would be
> >> appreciated, as I've never written tests for code. Also, for
> >> documentation, I believe only getting-started-with-xapian needs to be
> >> updated with examples of using the new feature.
> >>
> >> Apart from the above, if I'm missing something or didn't go into
> >> enough detail, please let me know. :)
> >>
> >
>