[Xapian-discuss] Stemming and Query Parsing

Mike Boone mike at boonedocks.net
Tue Oct 19 14:27:33 BST 2004


OK, I've fooled around with making some changes in queryparser.cc in the
yylex2() function to get it to keep my # character, but it's not working, so
I guess I don't yet understand the code well enough.

I copied the block for the + character and modified it:

case '#':
  // Ignore # at end of query
  if (qptr == q.end()) return 0;
  if (isspace(*qptr) || *qptr == '#') {
    /* Ignore ## or # followed by a space */
    /* Note that nethack## and Cl# are handled above */
    ++qptr;
    return yylex();
  }
  /* '#' is NOT used in the grammar rules, but leaving code here as-is */
  return c;

This code block isn't quite what I want since # is not really a grammar
rule, and I don't want it to be.

At any rate, after compiling this change the # character still doesn't
appear in my query.
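
For illustration, the behaviour being aimed at can be sketched outside Xapian
as a standalone function. This is not Xapian's yylex2(); split_terms is a
hypothetical name, and treating '+' the same way as '#' is an assumption. The
idea is that a '#' directly after a word becomes part of the term rather than
being returned as a separate token:

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper, not part of Xapian: split a query into terms,
// keeping any '#' or '+' that immediately follows a word (e.g. "C#").
std::vector<std::string> split_terms(const std::string& q) {
    std::vector<std::string> terms;
    std::string::const_iterator p = q.begin();
    while (p != q.end()) {
        if (!std::isalnum(static_cast<unsigned char>(*p))) {
            // Whitespace, or a '#' that does not follow a word: skip it.
            ++p;
            continue;
        }
        std::string term;
        // Collect the alphanumeric body of the term.
        while (p != q.end() && std::isalnum(static_cast<unsigned char>(*p)))
            term += *p++;
        // Absorb trailing suffix characters into the term itself, so the
        // caller never sees '#' as a separate token.
        while (p != q.end() && (*p == '#' || *p == '+'))
            term += *p++;
        terms.push_back(term);
    }
    return terms;
}
```

With this sketch, a query such as "C# and F# # code" would come out as the
four terms C#, and, F#, code: the suffixes are kept and the stray # is
dropped.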

I'm also not sure if I should add the # sign to the yytname array...it looks
to me like those are only for grammar rules. I haven't tried that yet.

Any suggestions?

(BTW, I'm doing this now with Xapian 0.8.3. The 0.8.1 xapian.so for PHP was
1.8MB, the same file for 0.8.3 is 10MB! This is on Red Hat Enterprise AS
2.1.)

Thanks,
Mike.

-----Original Message-----
From: Olly Betts [mailto:olly at survex.com]
Sent: Monday, October 18, 2004 1:06 PM
To: Mike Boone
Cc: xapian-discuss at lists.xapian.org
Subject: Re: [Xapian-discuss] Stemming and Query Parsing


On Mon, Oct 18, 2004 at 01:58:16PM -0400, Mike Boone wrote:
> I wasn't able to find C_isnotsign in the xapian-core code...can you point
> me to it?

That's what it is in CVS HEAD.  But the C_is* stuff is quite new so it
may not have been like that in the last release.

> I did find some code that looks like it handles the extra characters
> in the function yylex2() of queryparser.cc

That's the place.

> (also in queryparser.yy - what's a .yy file?)

It's Bison source (usually it's .y, but this is C++ so we use .yy) -
queryparser.cc is generated from queryparser.yy using Bison.

If you just want to make a minor tweak locally, modifying the generated
file is a reasonable approach.  If you do modify the .yy be sure to
configure with --enable-maintainer-mode or the file won't be regenerated!

> Is this the correct place to modify the code, or should I look
> elsewhere?

That's the place.

Cheers,
    Olly



