[Xapian-tickets] [Xapian] #113: QueryParser limitation/inconsistency
Xapian
nobody at xapian.org
Fri Feb 20 14:20:31 GMT 2009
#113: QueryParser limitation/inconsistency
-------------------------------+--------------------------------------------
Reporter: federico.schwindt | Owner: olly
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.1.1
Component: QueryParser | Version: SVN trunk
Severity: minor | Resolution:
Keywords: | Blockedby:
Platform: All | Blocking:
-------------------------------+--------------------------------------------
Changes (by olly):
* milestone: 1.1.0 => 1.1.1
New description:
Hi,

I've been using Xapian (0.9.9 and now 0.9.10) at work recently, and I've
found that the exquisite QueryParser (no irony intended) imposes some
serious limitations on certain queries: it treats some characters
specially even when the flags do not contain FLAG_PHRASE.

I'm talking about the method is_phrase_generator(). In the organization
I work for we have a mixed setup of HTML documents and code, which
includes many references to text in the word_word format. Unfortunately
the QueryParser treats underscore as a phrase generator, which makes it
impossible to search for terms that were indexed using whitespace
separators, even when allterms() shows that the term exists in the
database.

I believe this is both an inconsistency and a limitation in the
QueryParser: whatever flags are used, if the query string contains any
of the characters accepted by is_phrase_generator(), the query is
automatically converted to a phrase search (note that these characters
can't be changed).
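
To make the behaviour described above concrete, here is a toy model in
Python. It is not Xapian's code: toy_parse and ASSUMED_PHRASE_GENERATORS
are hypothetical names, and the character set is an assumption (this
report only establishes that underscore is treated as a phrase
generator).

```python
# Toy model only, NOT Xapian's implementation. The character set is an
# assumption for illustration; only "_" is confirmed by this report.
ASSUMED_PHRASE_GENERATORS = "_"


def toy_parse(query):
    """Return a list of ("term", word) or ("phrase", [words]) tuples."""
    result = []
    for chunk in query.split():
        parts = [chunk]
        # Split the chunk on every assumed phrase-generator character.
        for ch in ASSUMED_PHRASE_GENERATORS:
            parts = [piece for part in parts for piece in part.split(ch)]
        parts = [p for p in parts if p]
        if len(parts) > 1:
            # No quotes were typed, yet the chunk becomes a phrase.
            result.append(("phrase", parts))
        elif parts:
            result.append(("term", parts[0]))
    return result


print(toy_parse("word_word other"))
# [('phrase', ['word', 'word']), ('term', 'other')]
```

The term word_word never reaches the query in this model, so it cannot
match a database where word_word was indexed as a single term.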

In an ideal world (mine at least), I'd expect the user to define a
phrase explicitly (using " or any other previously defined character);
otherwise the QueryParser should not try to convert the query into
anything else (except for the defined operations: OR, AND, etc.).
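
A minimal sketch of that expectation, again as a toy in Python rather
than anything from Xapian: toy_parse_explicit_only is a hypothetical
name, and the sketch ignores operators such as OR and AND entirely.
Only text inside double quotes becomes a phrase; everything else is
passed through as a single term.

```python
import re


def toy_parse_explicit_only(query):
    """Treat only double-quoted text as a phrase (toy model, no operators)."""
    result = []
    # Each match is either a double-quoted run or a run of non-whitespace.
    for quoted, bare in re.findall(r'"([^"]*)"|(\S+)', query):
        if quoted:
            result.append(("phrase", quoted.split()))
        elif bare:
            # Underscores (and any other character) stay inside the term.
            result.append(("term", bare))
    return result


print(toy_parse_explicit_only('word_word "two words" other'))
# [('term', 'word_word'), ('phrase', ['two', 'words']), ('term', 'other')]
```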

OTOH, I could change the indexing to strip the underscores (and the
other characters) and treat every part of word_word as a separate term,
but that would also mean that "word word" would match as well, which is
not what you wanted.
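
The false match can be shown with another small Python toy. All names
here (index_whole, index_split, matches) are hypothetical, and the model
ignores term positions, so it is only a rough illustration of the
trade-off, not of how Xapian indexes or matches.

```python
def index_whole(text):
    """Whitespace-only splitting: word_word stays a single term."""
    return set(text.split())


def index_split(text):
    """Strip underscores first, so word_word becomes two 'word' terms."""
    return set(text.replace("_", " ").split())


def matches(query, index_fn, doc):
    """True if every query term (tokenised like the index) is in the doc."""
    return index_fn(query) <= index_fn(doc)


code_doc = "see word_word in the source"
prose_doc = "a word word typo"

# Whole-term indexing: only the document that contains word_word matches.
print(matches("word_word", index_whole, code_doc))   # True
print(matches("word_word", index_whole, prose_doc))  # False

# Split indexing: the prose document now matches too, which is unwanted.
print(matches("word_word", index_split, code_doc))   # True
print(matches("word_word", index_split, prose_doc))  # True
```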

I hope you will take this into consideration. Feel free to contact me if
you need further details or if I can clarify anything else.

Many thanks,
f.-
--
Comment:
Bumping to milestone:1.1.1
(and fix description wiki formatting)
--
Ticket URL: <http://trac.xapian.org/ticket/113#comment:10>
Xapian <http://xapian.org/>