GSOC 2018: Diversification of Search Results

Amanda Jayanetti amandajayanetti at gmail.com
Mon Jun 4 19:50:34 BST 2018


Hi Uppinder,

I noticed that you have not updated the journal [1] since May 14th, so I
would appreciate it if you could provide an update on the current status of
the project. Also, have you applied for the TREC ClueWeb09 dataset?

[1] https://trac.xapian.org/wiki/GSoC2018/Diversification/Journal

Best Regards,
Amanda

On Sat, Apr 28, 2018 at 8:53 AM, Amanda Jayanetti <amandajayanetti at gmail.com> wrote:

> Hi Uppinder,
>
> Congratulations on being accepted into GSoC 2018 with Xapian!
>
>> as discussed in the interview, I might evaluate the
>> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>>
>
> We had a discussion with regard to this, and the decision was to perform
> evaluation after the optimizations as you had originally proposed. So let's
> stick to your original plan and complete the implementation of C2-GLS
> before going ahead with evaluation.
>
> Best Regards,
> Amanda
>
> On Fri, Apr 27, 2018 at 8:37 AM, Gaurav Arora <
> gauravarora.daiict at gmail.com> wrote:
>
>> We are equally excited about working with you over the summer.
>>
>> I think you missed the reply by Olly on IRC; you can find it in the logs here:
>> https://botbot.me/freenode/xapian/2018-04-24/?msg=99336093&page=1
>>
>>    - olly
>>    icebyte[m]: i think that probably needs to go through SFC (
>>    https://sfconservancy.org/) as the "legal entity"
>>    - 2:05 am <https://botbot.me/freenode/xapian/msg/99336095/>
>>    icebyte[m]: i can talk to them about it
>>
>>
>>
>> - Gaurav
>>
>> On Fri, Apr 27, 2018 at 12:23 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
>>
>>> Thanks for selecting my proposal for GSoC, looking forward to
>>> contributing further to Xapian. I posted this on IRC but didn't
>>> receive any reply, so I presume it was missed and am therefore
>>> posting it here. As proposed, I plan to use the ClueWeb09 Category B
>>> dataset for evaluating diversification. A hosted copy is available
>>> (http://lemurproject.org/clueweb09.php/index.php#Services) which may
>>> be accessed but requires a license. The license is free and granted to
>>> an organisation by applying online
>>> (http://lemurproject.org/clueweb09/organization_agreement.clueweb09.worder.Mar30-18.pdf).
>>> If a maintainer could have a look at this, that would be great. It's
>>> mentioned on the website that it takes around 2 weeks to obtain the
>>> license, and as discussed in the interview, I might evaluate the
>>> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>>>
>>> On Sat, Mar 10, 2018 at 12:08 AM, Uppinder Chugh
>>> <uppinderchugh at gmail.com> wrote:
>>> >
>>> > Hi, I'd like to share my proposal for GSoC and get feedback on it.
>>> >
>>> > https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing
>>> >
>>> > Thanks,
>>> > Uppinder Chugh
>>> >
>>> > On Mon, Feb 26, 2018 at 2:14 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
>>> >>
>>> >> In particular, I have the following questions:
>>> >>
>>> >> a) Is wrapping Xapian::Mset matcher::get_set(..) suitable in this
>>> >> scenario and with the API? Also, how can I let the user explicitly
>>> >> enable diversification while they configure their result set via the
>>> >> Matcher API?
>>> >>
>>> >> b) Should I include the LC clustering algorithm in
>>> >> xapian-core/cluster (as there's the base class Cluster which can be
>>> >> inherited), or make it part of the diversification implementation?
>>> >>
>>> >> c) Apart from the proposed methods, I'd be writing automated tests
>>> >> and examples and documenting the new feature. Some tips here would be
>>> >> appreciated, as I've never written tests for code. Also, for
>>> >> documentation, I believe only getting-started-with-xapian should be
>>> >> updated with examples for using the new feature.
>>> >>
>>> >> Apart from the above, if I'm missing something or didn't go into
>>> >> enough detail, please let me know. :)
>>> >>
>>> >
>>>
>>
>