[Xapian-devel] Interested in GSOC projects

saurabh kumar saurabh.catch at gmail.com
Sat Apr 2 19:15:41 BST 2011


Respected sir,

I have made several changes to the proposal.

- I no longer rule out the idea of a tool-generated parser.

- Regarding the timeline, I had already proposed submitting weekly reports.
I have now written a much more detailed timeline.

- I have made several other changes. Please go through it once.
Looking forward to hearing from you.

Thanking you

Saurabh Kumar

On Fri, Apr 1, 2011 at 2:40 AM, saurabh kumar <saurabh.catch at gmail.com> wrote:

> Respected sir,
>
> I have submitted my proposal on gsoc-melange site.
> Please have a look and suggest some improvements.
>
> Meanwhile I am spending time understanding the source code
> of the query parser class in xapian-core.
>
> Looking forward to your comments on my proposal.
>
> Thanking you
> Saurabh Kumar
>
>
>
> On Thu, Mar 31, 2011 at 8:38 AM, Olly Betts <olly at survex.com> wrote:
>
>> On Thu, Mar 31, 2011 at 01:30:30AM +0530, saurabh kumar wrote:
>> > I have some doubts:
>> >
>> > 1) Why is using tools like yacc or bison not a good approach? Can you
>> > illustrate with an example?
>>
>> The parser needs to be forgiving, since the input is typed by (often
>> non-technical) humans.  The input isn't expected to be program code, and
>> "Syntax Error" is rarely an acceptable response (better to correct the
>> query and say "Searched for 'XXX' instead", with a "Did you mean 'YYY'?"
>> if there's an alternative plausible fix-up).
>>
>> Good error recovery in generated parsers is hard to do well, and usually
>> results in adding extra rules to the parser description, and that
>> obfuscates what we're actually trying to do.
>>
>> The grammar is also not something we can always constrain in ways that
>> suit the parser generator.
>>
>> For a formally specified grammar (like a language standard perhaps),
>> there's usually a BNF description of the grammar rules, so it's handy
>> to have the parser description mirror it.  That's not the case here.
>>
>> Currently the lexer does things like tracking the "mode", which is
>> really an indication of where in the grammar we are.
>>
>> > 2) In the proposed project are we NOT going to use any tools like YACC etc.?
>>
>> Well, you're welcome to propose what you like, but you'll need to do
>> a harder sell on this one.
>>
>> If you want to use a parser generator, we currently use lemon, which has
>> a clearer syntax than bison/yacc, and is structured such that the lexer
>> calls the parser (rather than the parser calling the lexer, as in
>> bison/yacc).  That allows the lexer to be simpler, since it doesn't need
>> to "keep its place" with explicit state.  So I'd suggest we probably
>> don't want to move back to using bison (one reason we moved away
>> originally was the lack of reentrancy in bison-generated parsers, but
>> that at least now seems to have been addressed).
>>
>> > Should I mail my proposal to the mailing list or just submit it at google
>> > SOC site? Because certainly I would require your
>> > comments to improve upon the first draft.
>>
>> Just submit it to the site - we can comment there and you can revise it
>> up until the deadline (April 8th, 19:00 UTC).
>>
>> Cheers,
>>     Olly
>>
>
>
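The two points quoted above (a forgiving parser, and a lexer that calls the parser rather than the other way round) can be illustrated with a small sketch. This is a toy written for this thread, not Xapian's query parser and not Lemon-generated code: the class `PushParser`, the function `lex_and_parse`, the token kinds and the fix-up messages are all hypothetical names chosen for illustration. It only shows the shape of the idea: the lexer loops over the input and pushes one token at a time into the parser, and the parser quietly repairs a malformed query and records what it changed instead of reporting a syntax error.

```python
class PushParser:
    """Toy push parser: the caller (the lexer) feeds it one token at a time.

    Hypothetical illustration only; not Xapian's or Lemon's API.
    """

    def __init__(self):
        self.output = []        # accepted terms and operators, in order
        self.pending_op = None  # operator still waiting for a right-hand term
        self.notes = []         # fix-ups applied, suitable for showing the user

    def feed(self, kind, text):
        if kind == "TERM":
            if self.output and self.pending_op is None:
                # Two adjacent terms: assume an implicit AND between them.
                self.pending_op = "AND"
            if self.pending_op is not None:
                self.output.append(self.pending_op)
                self.pending_op = None
            self.output.append(text)
        elif kind == "OP":
            if not self.output or self.pending_op is not None:
                # No left-hand term, or a doubled operator: drop it, don't fail.
                self.notes.append(f"ignored stray operator {text!r}")
            else:
                self.pending_op = text

    def finish(self):
        if self.pending_op is not None:
            # Operator at the end of the query with nothing after it.
            self.notes.append(f"ignored trailing operator {self.pending_op!r}")
            self.pending_op = None
        return " ".join(self.output), self.notes


def lex_and_parse(query):
    """The lexer drives the parse: it owns the loop and calls the parser."""
    parser = PushParser()
    for word in query.split():
        if word in ("AND", "OR"):
            parser.feed("OP", word)
        else:
            parser.feed("TERM", word.lower())
    return parser.finish()


if __name__ == "__main__":
    for q in ("cats AND", "OR cats dogs", "cats AND OR dogs"):
        fixed, notes = lex_and_parse(q)
        print(f"Searched for {fixed!r} instead" if notes else f"Searched for {fixed!r}")
```

Because the loop lives in the lexer, its position in the input is just an ordinary local variable, so it does not have to save and restore its place between calls. The `notes` list is the kind of information a front end could turn into a "Searched for 'XXX' instead" message.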