[Xapian-discuss] Searching Databases

Olly Betts olly at survex.com
Tue Dec 6 00:44:08 GMT 2005


On Thu, Dec 01, 2005 at 05:44:11PM -0600, Tony Lambiris wrote:
> When I use quest, if I search for "don't have", it appears as if Xapian 
> is parsing for "don" and "t" -- is there a way to modify or change this 
> behavior?

The Xapian::QueryParser class will turn "don't" into a phrase search for
"don" followed by "t".  Currently that's not configurable short of
modifying the source code, but the change isn't hard.  Line 348 (or
thereabouts) of queryparser/queryparser_internal.cc is:

    if (*it != '&') break;

If you change that to also check for a single quote, the parser will no
longer split a term at a single embedded single quote (which is exactly
what you want):

    if (*it != '&' && *it != '\'') break;
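To illustrate the effect of that check outside of Xapian, here is a
minimal standalone sketch.  It is not the QueryParser code: split_terms()
and is_word_char() are hypothetical names, and the rule that the quote
must be followed by a word character is my assumption about what
"embedded" means here.  Only the break condition is taken from the
patched line above.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not a Xapian function): ASCII letters and digits
// count as word characters.
static bool is_word_char(char ch) {
    return std::isalnum(static_cast<unsigned char>(ch)) != 0;
}

// Hypothetical tokenizer sketch: splits text into lowercase terms, keeping
// a single '&' or '\'' inside a term when it sits between word characters.
std::vector<std::string> split_terms(const std::string& text) {
    std::vector<std::string> terms;
    std::string::const_iterator it = text.begin();
    const std::string::const_iterator end = text.end();
    while (it != end) {
        // Skip anything that cannot start a term.
        if (!is_word_char(*it)) {
            ++it;
            continue;
        }
        std::string term;
        while (it != end) {
            if (is_word_char(*it)) {
                term += static_cast<char>(
                    std::tolower(static_cast<unsigned char>(*it)));
                ++it;
                continue;
            }
            // Same condition as the patched line: any other character
            // ends the term.
            if (*it != '&' && *it != '\'') break;
            // Assumption: the joiner only counts as embedded if a word
            // character follows it directly.
            std::string::const_iterator next = it + 1;
            if (next == end || !is_word_char(*next)) break;
            term += *it;
            it = next;
        }
        terms.push_back(term);
    }
    return terms;
}
```

With this sketch, split_terms("don't have") yields the two terms "don't"
and "have", while a quote at the edge of a word, as in "'hello'", is
still dropped.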

For English at least, it would make sense to always treat an embedded
single quote as a word character, except for the possessive form (e.g.
"Tony's") where you really want to be able to match on "Tony" too.
Perhaps we should just special-case that at index time.  I've done that
in the past for a particular project, but it doesn't really seem the
right approach for a general-purpose piece of code.
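As a rough sketch of what special-casing the possessive at index time
could look like (an assumption on my part, not the code from that
project and not anything in Xapian): index the word as written, and if
it ends in "'s", also index it with the suffix removed.  The function
name terms_to_index() is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical index-time helper: returns the terms to store for one word.
// The word itself is always kept; a trailing "'s" additionally produces the
// bare form, so a search for "tony" can match text containing "tony's".
std::vector<std::string> terms_to_index(const std::string& word) {
    std::vector<std::string> terms;
    terms.push_back(word);

    const std::string suffix = "'s";
    // Require at least one character before the suffix.
    if (word.size() > suffix.size() &&
        word.compare(word.size() - suffix.size(), suffix.size(), suffix) == 0) {
        terms.push_back(word.substr(0, word.size() - suffix.size()));
    }
    return terms;
}
```

Here terms_to_index("tony's") gives "tony's" and "tony", whereas "don't"
is indexed only as "don't".  The rule is English-specific and will also
fire on contractions such as "it's".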

Cheers,
    Olly


