[Xapian-discuss] Making '.' a valid search term character

Olly Betts olly at survex.com
Tue Mar 6 01:22:29 GMT 2007


On Mon, Mar 05, 2007 at 02:25:45PM +0100, Marcus Ramberg wrote:
> Is there any way to make the queryparser not split queries on '.'?  

It's not currently configurable, but it's not hard to patch the code to
do this.

Look in queryparser/queryparser_internal.cc for where '&' is handled.
We treat a single embedded '&' as a word character so things like "AT&T"
are a single word (but C code like "a&&b" isn't).  If you change that to
check for '&' or '.' then you'll probably get the effect you want.

There's also code to handle "initialisms" specially, so "I.B.M." is
treated the same as "IBM".  That only applies to single capitals with
'.' in between, but you might want to disable that if it's likely to
be a problem for you (it's just above where '&' is checked for).

Cheers,
    Olly


