GSoC aspirant - guruprasad hegde
guruhegde1308 at gmail.com
Tue Apr 24 10:53:10 BST 2018
Thank you for accepting the proposal.
For the next couple of days, I plan to reread the related papers with the
implementation in mind and to understand the Xapian code (backend part).
Please suggest any other activity I can do.
On Mon, Mar 26, 2018 at 4:05 PM, Guruprasad Hegde <guruhegde1308 at gmail.com> wrote:
> Please find the draft proposal with this link: https://github.com/
> It is still a work in progress.
> Question: If we index math terms (symbol pair tuples) in the same DB along
> with the text data, do you think implicitly adding a new field prefix
> for math terms would help performance in cases like searching only text
> terms or only math terms?
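A minimal sketch of the prefixing idea in the question above, written in plain Python rather than against the Xapian API. The prefix `XM` and the helper names are hypothetical; the only convention relied on is that user-defined Xapian prefixes conventionally start with `X`, while terms from free text are lowercased. It only shows how a prefix keeps the two term spaces apart, and says nothing about the actual performance effect.

```python
# Hypothetical prefix for math terms (an assumption, not a Xapian built-in).
MATH_PREFIX = "XM"


def index_terms(text_terms, math_terms):
    """Return the term list for one document, with math terms prefixed."""
    # Text terms are lowercased, so they can never collide with the
    # uppercase prefix used for math terms.
    terms = [t.lower() for t in text_terms]
    terms += [MATH_PREFIX + m for m in math_terms]
    return terms


def is_math_term(term):
    """True if the term came from a math expression."""
    return term.startswith(MATH_PREFIX)
```

With this layout, a text-only query never touches the posting lists of math terms, and a math-only query looks up prefixed terms exclusively.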
> On Mon, Mar 26, 2018 at 3:27 PM, Gaurav Arora <
> gauravarora.daiict at gmail.com> wrote:
>>> I thought if I start with MathML as input and build the core, then I
>>> can extend the system to support any other query/document type by looking
>>> for third-party tools available for C++. At the moment, I don't have any
>>> idea about this. What do you think?
>>> We can look into the options in the bonding period too. For now, I can make
>>> LaTeX to MathML the first step in the proposal and shuffle the steps later, right?
>> The proposal needs to account for that, i.e. it should state that search
>> through LaTeX will be supported and merged before the end of GSoC. It
>> can be done anytime. It's perfectly fine to build the core using the MathML
>> representation initially.
>>> Generating the symbol layout tree requires implementing a parser. I guess it
>>> involves a good amount of text processing. Since it's a standard problem, I
>>> hope it should not be hard, but it requires handling many scenarios. I plan to
>>> read about the parser and try implementing small examples first in coming
>> That would be great :)
>>> I feel generating symbol pairs will be easy once I build the tree.
>>> Do you think I should come up with some sort of pseudocode in the proposal?
>> Would definitely help.
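As an illustration of the kind of pseudocode discussed above, here is a rough Python sketch of generating symbol pairs from a symbol layout tree. The `Node` class, the relation labels (`"n"` for next, `"a"` for above) and the `(ancestor, descendant, path)` tuple layout are all assumptions made for the example, not the format settled on in the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    symbol: str
    # Maps a relation label (e.g. "n" = next, "a" = above) to the child node.
    children: dict = field(default_factory=dict)


def descendants(node, path=""):
    """Yield (descendant, relation path from node) for every descendant."""
    for rel, child in node.children.items():
        child_path = path + rel
        yield child, child_path
        yield from descendants(child, child_path)


def symbol_pairs(root):
    """Return (ancestor symbol, descendant symbol, path) for every pair."""
    pairs = []
    stack = [root]
    while stack:
        node = stack.pop()
        for desc, path in descendants(node):
            pairs.append((node.symbol, desc.symbol, path))
        stack.extend(node.children.values())
    return pairs
```

For an expression like x^2 + y, built as `Node("x", {"a": Node("2"), "n": Node("+", {"n": Node("y")})})`, this yields four tuples: three with `x` as the ancestor and one for `+` followed by `y`.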
>>> With other weight metric implementations available and with the existing
>>> indexing structure, I feel getting the stats and implementing this would
>>> not be hard.
>> A basic check and estimate would help to gauge the time this would take,
>> so the project timeline can be planned accordingly.
>>> I have been working on the draft. I am really sorry about the delay in
>>> sharing it. I hope to make up for that with some good work :)
>> The sooner you show us the draft version, the better your chance of
>> getting feedback from us and improving your proposal.
>> - Gaurav Arora