[Xapian-devel] Interested in GSOC projects

saurabh kumar saurabh.catch at gmail.com
Thu Mar 31 22:10:33 BST 2011


Respected sir,

I have submitted my proposal on the gsoc-melange site.
Please have a look and suggest some improvements.

Meanwhile, I am spending time understanding the source code
of the query parser class in xapian-core.

Looking forward to your comments on my proposal.

Thanking you
Saurabh Kumar



On Thu, Mar 31, 2011 at 8:38 AM, Olly Betts <olly at survex.com> wrote:

> On Thu, Mar 31, 2011 at 01:30:30AM +0530, saurabh kumar wrote:
> > I have some doubts:
> >
> > 1) Why is using tools like yacc or bison not a good approach? Can you
> > illustrate with an example?
>
> The parser needs to be forgiving, since the input is typed by (often
> non-technical) humans.  The input isn't expected to be program code, and
> "Syntax Error" is rarely an acceptable response (better to correct the
> query and say "Searched for 'XXX' instead", with a "Did you mean 'YYY'?"
> if there's an alternative plausible fix up).
>
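To make the "forgiving" behaviour above concrete, here is a minimal Python sketch. It is not Xapian's implementation: the function name `repair_query` and the repair rules (close an unterminated quote, drop stray closing brackets, close any brackets left open) are assumptions chosen only for illustration.

```python
from typing import Optional, Tuple


def repair_query(query: str) -> Tuple[str, Optional[str]]:
    """Return (repaired_query, notice); notice is None if nothing changed.

    Hypothetical helper: repairs the query instead of raising a syntax error.
    """
    out = []
    depth = 0          # number of currently open "(" outside quotes
    in_quote = False   # are we inside a "..." phrase?
    for ch in query:
        if ch == '"':
            in_quote = not in_quote
            out.append(ch)
        elif in_quote:
            out.append(ch)          # brackets inside a phrase are literal
        elif ch == '(':
            depth += 1
            out.append(ch)
        elif ch == ')':
            if depth == 0:
                continue            # stray closing bracket: drop it
            depth -= 1
            out.append(ch)
        else:
            out.append(ch)
    if in_quote:
        out.append('"')             # close an unterminated phrase
    out.append(')' * depth)         # close any brackets left open
    repaired = ''.join(out)
    notice = None if repaired == query else f"Searched for '{repaired}' instead"
    return repaired, notice
```

A well-formed query passes through unchanged with no notice; a malformed one is repaired and the caller gets a message it can show to the user rather than an error.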
> Good error recovery in generated parsers is hard to do well, and usually
> results in adding extra rules to the parser description, and that
> obfuscates what we're actually trying to do.
>
> The grammar is also not something we can always restrain in ways to suit
> the parser generator.
>
> For a formally specified grammar (like a language standard perhaps),
> there's usually a BNF description of the grammar rules, so it's handy
> to have the parser description mirror it.  That's not the case here.
>
> Currently the lexer does things like tracking the "mode", which is
> really an indication of where in the grammar we are.
>
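As a rough illustration of a lexer tracking a "mode", here is a small Python sketch. The mode names, token names and the rule that `OR` is an ordinary word inside quotes are assumptions made for the example; they are not taken from the xapian-core source.

```python
from enum import Enum, auto


class Mode(Enum):
    DEFAULT = auto()     # outside a quoted phrase
    IN_QUOTES = auto()   # inside a quoted phrase


def lex(query):
    """Split a query into (kind, text) tokens, tracking a lexer mode."""
    mode = Mode.DEFAULT
    tokens = []
    # Pad quotes with spaces so they split off as separate words.
    for word in query.replace('"', ' " ').split():
        if word == '"':
            # A quote toggles the mode: it records where in the grammar we are.
            mode = Mode.IN_QUOTES if mode is Mode.DEFAULT else Mode.DEFAULT
            tokens.append(("QUOTE", word))
        elif mode is Mode.DEFAULT and word in ("AND", "OR", "NOT"):
            tokens.append((word, word))           # operator only outside quotes
        else:
            tokens.append(("TERM", word.lower())) # everything else is a term
    return tokens
```

The same input word produces a different token depending on the mode, which is the sense in which the mode stands in for the current position in the grammar.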
> > 2) In the proposed project are we NOT going to use any tools like YACC
> > etc.?
>
> Well, you're welcome to propose what you like, but you'll need to do
> a harder sell on this one.
>
> If you want to use a parser generator, we currently use lemon, which has
> a clearer syntax than bison/yacc, and is structured such that the lexer
> calls the parser (rather than the parser calling the lexer, as in
> bison/yacc).  That allows the lexer to be simpler, since it doesn't need
> to "keep its place" with explicit state.  So I'd suggest we probably
> don't want to move back to using bison (one reason we moved away
> originally was the lack of reentrancy in bison-generated parsers, but
> that at least now seems to have been addressed).
>
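The "lexer calls the parser" arrangement described above can be sketched in Python as follows. This only illustrates the control flow (with lemon, the generated `Parse()` function is handed one token per call); the class `PushParser`, its `feed` method and the toy grammar of terms separated by `OR` are hypothetical and are not Xapian's grammar.

```python
class PushParser:
    """Receives tokens one at a time; all parse state lives in this object."""

    def __init__(self):
        self.groups = [[]]   # OR-separated groups of implicitly ANDed terms
        self.done = False

    def feed(self, kind, value=None):
        if kind == "TERM":
            self.groups[-1].append(value)
        elif kind == "OR":
            if self.groups[-1]:              # ignore a leading or doubled OR
                self.groups.append([])
        elif kind == "END":
            if len(self.groups) > 1 and not self.groups[-1]:
                self.groups.pop()            # ignore a trailing OR
            self.done = True


def lex_and_parse(query):
    """The lexer drives: it is a plain loop and never has to save its place."""
    parser = PushParser()
    for word in query.split():
        if word == "OR":
            parser.feed("OR")
        else:
            parser.feed("TERM", word.lower())
    parser.feed("END")                       # signal end of input
    return parser.groups
```

In the pull style used by bison/yacc the parser would instead call the lexer each time it wanted a token, so the lexer would have to remember its position between calls. Here the loop variable does that implicitly.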
> > Should I mail my proposal to the mailing list or just submit it at google
> > SOC site? Because certainly I would require your
> > comments to improve upon the first draft.
>
> Just submit it to the site - we can comment there and you can revise it
> up until the deadline (April 8th, 19:00 UTC).
>
> Cheers,
>     Olly
>