GSoC aspirant - guruprasad hegde
Guruprasad Hegde
guruhegde1308 at gmail.com
Tue Apr 24 10:53:10 BST 2018
Hi All,
Thank you for accepting the proposal.
For the next couple of days, I plan to re-read the related papers with the
implementation in mind and to study the Xapian code (backend part).
Please suggest any other activity I could take up.
Regards,
Guruprasad
On Mon, Mar 26, 2018 at 4:05 PM, Guruprasad Hegde <guruhegde1308 at gmail.com>
wrote:
> Please find the draft proposal at this link:
> https://github.com/guruhegde/xapian-gsoc-proposal
> It is still a work in progress.
>
> Question: If we index math terms (symbol pair tuples) in the same DB along
> with the text data, do you think that implicitly adding a new field prefix
> for math terms would help performance in cases like searching only text
> terms or only math terms?
>
> Regards,
> Guruprasad
>
>
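The prefix question above can be illustrated with a toy inverted index. This
is only a sketch, not Xapian code: the "XM" prefix, the comma-joined encoding
of a symbol pair, and the helper names are all assumptions made for
illustration. It shows that a prefix keeps math terms and text terms in
separate key spaces of the same index, so a query can target one kind of term
without colliding with the other. It does not measure performance.

```python
from collections import defaultdict

# Hypothetical prefix for math terms. Xapian's convention is that
# user-defined term prefixes start with "X"; "XM" itself is an assumption.
MATH_PREFIX = "XM"


def index_document(index, doc_id, text_terms, math_terms):
    """Add text terms as-is (lowercased) and math terms under MATH_PREFIX."""
    for term in text_terms:
        index[term.lower()].add(doc_id)
    for term in math_terms:
        index[MATH_PREFIX + term].add(doc_id)


def search(index, term, math=False):
    """Return sorted doc ids matching a single text term or math term."""
    key = MATH_PREFIX + term if math else term.lower()
    return sorted(index.get(key, set()))


index = defaultdict(set)
# Document 1 mentions "energy" and contains the symbol pair (x, 2, above),
# encoded here as the string "x,2,a" (an assumed encoding).
index_document(index, 1, ["energy"], ["x,2,a"])
# Document 2 contains the plain word "x" but no math.
index_document(index, 2, ["energy", "x"], [])

text_hits = search(index, "x")               # text-only lookup
math_hits = search(index, "x,2,a", math=True)  # math-only lookup
```

Because the two kinds of term never share a key, the word "x" in document 2
cannot be mistaken for a math term starting with x in document 1.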
> On Mon, Mar 26, 2018 at 3:27 PM, Gaurav Arora <
> gauravarora.daiict at gmail.com> wrote:
>
>>
>>> I thought that if I start with MathML as input and build the core, I can
>>> then extend the system to support other query/document types by looking
>>> for third-party tools available for C++. At the moment, I don't have any
>>> idea about this. What do you think?
>>>
>>> We can look into the options during the bonding period too. For now, I can
>>> make LaTeX to MathML the first step in the proposal and shuffle the steps
>>> later, right?
>>>
>>
>> The proposal needs to account for that, i.e. it should state that search
>> through LaTeX will be supported and merged before the end of GSoC. That
>> work can be done at any point. It's perfectly fine to build the core using
>> the MathML representation initially.
>>
>>>
>>> Generating the symbol layout tree requires implementing a parser. I guess
>>> it involves a good amount of text processing. Since it's a standard
>>> problem, I hope it will not be hard, but it requires handling many
>>> scenarios. I plan to read about parsers and try implementing small
>>> examples first in the coming days.
>>>
>> That would be great :)
>>
>>>
>>> I feel generating symbol pairs will be easy once I build the tree.
>>>
>>> Do you think I should include some sort of pseudocode in the proposal?
>>>
>> Would definitely help.
>>
>>>
>>>
>>> With other weight metric implementations available and with the existing
>>> indexing structure, I feel getting the stats and implementing this would
>>> not be hard.
>>>
>> A basic check and estimate would help gauge how much time this will take,
>> so the project timeline can be planned accordingly.
>>
>>
>>> I have been working on the draft. I am really sorry about the delay in
>>> the draft. I hope to make up for that with some good work :)
>>>
>> The sooner you show us the draft version, the better your chances of
>> getting feedback from us and improving your proposal.
>>
>>
>> - Gaurav Arora
>>
>
>
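The step discussed in the thread, generating symbol pairs once a symbol layout
tree exists, can be sketched roughly as follows. This is a minimal
illustration under stated assumptions, not the method from the papers or the
proposal: the Node class, the relation labels ("n" for next, "a" for above),
and the (ancestor, descendant, path) tuple shape are all hypothetical, and the
tree is built by hand instead of being parsed from MathML.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One symbol in a layout tree; children are keyed by relation label."""
    symbol: str
    children: dict = field(default_factory=dict)  # e.g. {"n": Node, "a": Node}


def descendants(node, path=""):
    """Yield (descendant, path) for every node below `node`, depth first."""
    for label, child in node.children.items():
        yield child, path + label
        yield from descendants(child, path + label)


def symbol_pairs(root):
    """Return (ancestor symbol, descendant symbol, path) for all such pairs."""
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()
        for desc, path in descendants(node):
            pairs.append((node.symbol, desc.symbol, path))
        stack.extend(node.children.values())
    return pairs


# Hand-built tree for the expression x^2 + y:
# "x" has "2" above it and "+" next to it; "+" has "y" next to it.
y = Node("y")
plus = Node("+", {"n": y})
x = Node("x", {"a": Node("2"), "n": plus})

pairs = symbol_pairs(x)
```

Each tuple could then be serialised into a single string and indexed as a
term. How tuples are encoded, and whether long paths are truncated, are design
choices this sketch leaves open.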