GSOC 2018: Diversification of Search Results

Amanda Jayanetti amandajayanetti at gmail.com
Sat Apr 28 04:23:02 BST 2018


Hi Uppinder,

Congratulations on being accepted into GSoC 2018 with Xapian!

> as discussed in the interview, I might evaluate the
> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>

We had a discussion with regard to this, and the decision was to perform
evaluation after the optimizations as you had originally proposed. So let's
stick to your original plan and complete the implementation of C2-GLS
before going ahead with evaluation.

Best Regards,
Amanda

On Fri, Apr 27, 2018 at 8:37 AM, Gaurav Arora <gauravarora.daiict at gmail.com>
wrote:

> We are equally excited about working with you over the summer.
>
> I think you missed Olly's reply on IRC; you can find it in the logs here:
> https://botbot.me/freenode/xapian/2018-04-24/?msg=99336093&page=1
>
>    - olly
>    icebyte[m]: i think that probably needs to go through SFC (
>    https://sfconservancy.org/) as the "legal entity"
>    - 2:05 am <https://botbot.me/freenode/xapian/msg/99336095/>
>    icebyte[m]: i can talk to them about it
>
>
>
> - Gaurav
>
> On Fri, Apr 27, 2018 at 12:23 AM, Uppinder Chugh <uppinderchugh at gmail.com>
> wrote:
>
>> Thanks for selecting my proposal for GSoC; I'm looking forward to
>> contributing further to Xapian. I posted this on IRC but didn't
>> receive a reply, so I presume it was missed and am posting it here.
>> As proposed, I plan to use the ClueWeb09 Category B dataset for
>> evaluating diversification. A hosted copy is available
>> (http://lemurproject.org/clueweb09.php/index.php#Services) which may
>> be accessed but requires a license. The license is free and granted to
>> an organisation by applying online
>> (http://lemurproject.org/clueweb09/organization_agreement.clueweb09.worder.Mar30-18.pdf).
>> If a maintainer could have a look at this, that would be great. It's
>> mentioned on the website that it takes around 2 weeks to obtain the
>> license, and as discussed in the interview, I might evaluate the
>> GLS-MPT implementation before moving on to optimizations (C2-GLS).
>>
>> On Sat, Mar 10, 2018 at 12:08 AM, Uppinder Chugh
>> <uppinderchugh at gmail.com> wrote:
>> >
>> > Hi, I'd like to share my proposal for GSoC and get feedback on it.
>> >
>> > https://docs.google.com/document/d/1A4HF2lZBnLh1TUY3Y2DDUfz-nzbIL1NNAo8Adl3gN-8/edit?usp=sharing
>> >
>> > Thanks,
>> > Uppinder Chugh
>> >
>> > On Mon, Feb 26, 2018 at 2:14 AM, Uppinder Chugh <uppinderchugh at gmail.com> wrote:
>> >>
>> >> In particular, I have the following questions:
>> >>
>> >> a) Is wrapping Xapian::Mset matcher::get_set(..) suitable in this
>> >> scenario and with the API? Also, how can I let the user manually
>> >> enable diversification while configuring their result set via the
>> >> Matcher API?
>> >>
>> >> b) Should I include the LC clustering algorithm in xapian-core/cluster
>> >> (as there's the base class Cluster which can be inherited), or make it
>> >> part of the diversification implementation?
>> >>
>> >> c) Apart from the proposed methods, I'd be writing automated tests and
>> >> examples, and documenting the new feature. Some tips here would be
>> >> appreciated, as I've never written tests for code. Also, for
>> >> documentation, I believe only getting-started-with-xapian should be
>> >> updated with examples for using the new feature.
>> >>
>> >> Apart from the above, if I'm missing something or didn't go into
>> >> enough detail, please let me know. :)
>> >>
>> >
>>
>