xapian parser bug?

David Bremner david at tethera.net
Mon Oct 1 02:25:33 BST 2018


Olly Betts <olly at survex.com> writes:

> On Sun, Sep 30, 2018 at 09:05:25AM -0300, David Bremner wrote:
>>             if (str.find (' ') != std::string::npos)
>> 		query_str = '"' + str + '"';
>> 	    else
>> 		query_str = str;
>> 
>> 	    return parser.parse_query (query_str, NOTMUCH_QUERY_PARSER_FLAGS, term_prefix);
>
> I wouldn't recommend trying to generate strings to feed to QueryParser
> like this code seems to be doing.  QueryParser aims to parse input from
> humans not machines.

str is the parameter to FieldProcessor::operator().  The field
processor needs a way to approximate the standard probabilistic prefix
parsing in the fallback case. The addition of quotes is to force the
generation of a phrase query, otherwise e.g. subject:"christmas party"
doesn't work out well.

I tried using OP_PHRASE as the default operator, but it doesn't
handle some cases I need.

% quest -o phrase 'bob jones <bob at example.com>'       
UnimplementedError: OP_NEAR and OP_PHRASE only currently support leaf subqueries

If I don't recursively call parse_query, then I guess I need to generate
terms in a compatible way before turning them into a phrase query. Maybe
that's not as hard as I originally thought, since being in a phrase turns
off the stemmer anyway iiuc.  Is there a Xapian API I can use to extract
"bob", "jones", "bob", "example", "com" from the example above? I guess
I could use a throwaway Xapian::Document and a TermGenerator
(basically aping xapian_core/tests/api_termgen.cc).

d



More information about the Xapian-discuss mailing list