[Xapian-discuss] How to make QueryParser select entire word like "H.O.T"

Olly Betts olly at survex.com
Tue Nov 9 02:56:14 GMT 2010


On Tue, Nov 02, 2010 at 06:41:24PM +0800, David wrote:
> I'm using Xapian to build my search engine, but I've run into a problem.
> The code snippet is:
> ----------------------Code begin-------------------------------------------------------------
> Xapian::QueryParser qp;
> qp.add_prefix("Singer", "S");
> Xapian::Query query = qp.parse_query("Singer:s.h.e",
>     Xapian::QueryParser::FLAG_PARTIAL |
>     Xapian::QueryParser::FLAG_AUTO_MULTIWORD_SYNONYMS |
>     Xapian::QueryParser::FLAG_PHRASE);
> cout << "Performing query `" << query.get_description() << "'" << endl;
> ----------------------Code end---------------------------------------------------------------
>  
> This is the output on stdout:
> ----------------------Output begin---------------------------------------------------------
> Performing query `Xapian::Query((Ss:(pos=1) PHRASE 3 Sh:(pos=2) PHRASE 3 Se:(pos=3)))'
> ----------------------Output end-----------------------------------------------------------
>  
> See the problem? "s.h.e" is actually a band from Taiwan, and I want to
> use it as a single query term when searching the singer field.
> Does anyone know how to make the parser produce "Ss.h.e" rather than a split query?

QueryParser doesn't currently let you customise how it interprets word
boundaries.  This is ticket #113:

http://trac.xapian.org/ticket/113

There's currently some special handling for acronyms punctuated with
".", but only if capitalised, so this would work as you want:

    Singer:S.H.E

Perhaps this handling should also work for lower case acronyms.  I can't
think of a good reason for it not to, except that we'd need to change
TermGenerator to match (or else "s.h.e" in the query wouldn't match
"s.h.e" in a document).  That is an incompatible change, so it would
really need to wait for 1.3.0.
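
As a stopgap, one possible workaround is to upper-case such acronyms
yourself before the text reaches QueryParser.  The sketch below is only
an illustration: uppercase_dotted_acronyms() is a hypothetical helper,
not part of Xapian, and it handles ASCII letters only.  It also assumes
you apply the same normalisation to the text you index (for example
before handing it to TermGenerator), since otherwise the query and
document terms would not match.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xapian API): upper-cases runs of single
// ASCII letters separated by single dots, e.g. "s.h.e" -> "S.H.E", so
// that they have the capitalised shape QueryParser treats as an acronym.
// A run needs at least two letters and must not touch other letters,
// digits, or a preceding dot, so "a.bc" and "example.com" are left alone.
std::string uppercase_dotted_acronyms(const std::string& in)
{
    std::string out = in;
    const std::size_t n = out.size();
    auto is_alpha = [&](std::size_t i) {
        return i < n && std::isalpha(static_cast<unsigned char>(out[i]));
    };
    auto is_alnum = [&](std::size_t i) {
        return i < n && std::isalnum(static_cast<unsigned char>(out[i]));
    };

    std::size_t i = 0;
    while (i < n) {
        // A run starts at a letter that is not glued to earlier text.
        const bool glued = i > 0 && (is_alnum(i - 1) || out[i - 1] == '.');
        if (!is_alpha(i) || glued) {
            ++i;
            continue;
        }
        // Extend the run over ".X" pairs: dot followed by a letter.
        std::size_t end = i + 1;   // one past the last letter so far
        std::size_t letters = 1;
        while (end < n && out[end] == '.' && is_alpha(end + 1)) {
            end += 2;
            ++letters;
        }
        // Accept only if the run is not followed by a letter or digit.
        if (letters >= 2 && !is_alnum(end)) {
            for (std::size_t j = i; j < end; j += 2) {
                out[j] = static_cast<char>(
                    std::toupper(static_cast<unsigned char>(out[j])));
            }
        }
        i = end;
    }
    return out;
}
```

For the query in this thread, uppercase_dotted_acronyms("Singer:s.h.e")
returns "Singer:S.H.E", which can then be passed to qp.parse_query() in
place of the original string.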

Cheers,
    Olly


