[Xapian-discuss] understanding stemming and synonyms

Websuche :: Felix Antonius Wilhelm Ostmann ostmann at websuche.de
Mon Sep 26 11:06:56 BST 2011


On 26.09.2011 11:50, James Aylett wrote:
> [Back on list]
> 
> On 26 Sep 2011, at 09:26, Websuche :: Felix Antonius Wilhelm Ostmann wrote:
> 
>>>> That looks fine, but when I now use the query_parser with a stemmer (german2 & STEM_ALL) and parse_query (FLAG_AUTO_SYNONYMS), I get these queries.
>>>
>>> Try STEM_SOME.
>>>
>>> I've poked around a little, and I think we're lacking a clear introduction to the QueryParser, since IIRC this question comes up semi-frequently. I've added a note to MissingDocument; if I'm in error and there is something, feel free to delete it.
>>
>> http://xapian.org/docs/sourcedoc/html/classXapian_1_1QueryParser.html#389713b3969cac6cd98da5fb970f2f8e
>>
>> And it is well documented ... my bad! I think I was misled by a
>> bad howto website for Xapian :-/
> 
> 
> It's documented, but I think my concerns stand. (You have to think to realise it's generally the right choice, and I think from the point of view of getting started thinking is a bad requirement :-)
> 
> There are unfortunately a bunch of howtos for Xapian floating round the internet that are now out of date :-(
> 
> J
> 

I now have a problem with prefixes.

I used STEM_SOME, which works fine for my synonym "problem", but now my
prefixed words no longer work.

I used the following prefix:
$xapianQueryParser->add_prefix("market","QM");

A search for market:de now builds the wrong query:
[QUERY: Xapian::Query(ZQMde:(pos=1))]

It stems the prefixed word too :-/

Reading the documentation for STEM_SOME again, that sounds like a bug:

STEM_SOME: Search for stemmed forms of terms except for those which
start with a capital letter, or are followed by certain characters
(currently: (/@<>=*[{" ), or are used with operators which need
positional information. Stemmed terms are prefixed with 'Z'.


My term does start with a capital letter once the prefix is applied
(QMde), so by that description it should not have been stemmed.


I checked whether the other condition works:

A search for market:de* does what I want:
[QUERY: Xapian::Query(QMde:(pos=1))]
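
The two observations above can be reproduced with a toy model of the rule as the documentation states it. This is only a sketch in plain Python (chosen so it runs without Xapian installed), not Xapian's implementation. The names should_stem and build_term are hypothetical, the stemmer is replaced by an identity placeholder, and the model assumes the capital-letter test looks at the word as typed rather than at the term after the prefix has been added, which is what the two queries suggest.

```python
# Toy model of the STEM_SOME rule as quoted from the documentation.
# Not Xapian code: it ignores the "operators which need positional
# information" clause and does not model case folding of the word.

# Characters that, per the quoted documentation, suppress stemming when
# they directly follow a word.
NO_STEM_FOLLOWERS = set('(/@<>=*[{"')


def should_stem(word, next_char=""):
    """Return True if the documented STEM_SOME rule would stem this word."""
    if word[:1].isupper():
        return False
    if next_char and next_char in NO_STEM_FOLLOWERS:
        return False
    return True


def build_term(term_prefix, word, next_char="", stem=lambda w: w):
    """Compose the term for one prefixed query word.

    `stem` stands in for the real stemmer; identity is enough for "de".
    ASSUMPTION: the capital-letter test is applied to the typed word,
    not to the term after the prefix has been prepended.
    """
    if should_stem(word, next_char):
        return "Z" + term_prefix + stem(word)
    return term_prefix + word


print(build_term("QM", "de"))       # market:de
print(build_term("QM", "de", "*"))  # market:de*
```

Under that assumption the model yields ZQMde for market:de and QMde for market:de*, matching both queries shown above.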


STEM_NONE and STEM_ALL also work for prefixed words (but of course not
for my synonyms).


I am starting to get more confused ;)


P.S.: I also tried a capitalised prefix name, add_prefix("MARKET","QM"),
but that did not work either.


-- 
Kind regards

Felix Antonius Wilhelm Ostmann
-----------------------------------------------------------
Websuche Search Technology GmbH & Co. KG
Martinistraße 3, D-49080 Osnabrück
-----------------------------------------------------------
Tel.: +49 (0) 541 40666 0, Fax: +49 (0) 541 40666 22
Email: info at websuche.de, Web: www.websuche.de
-----------------------------------------------------------
HRA 200252, Osnabrück District Court, VAT ID: DE814737310
-----------------------------------------------------------
General partner: Websuche Search Technology Verwaltungs GmbH
HRB 200359, Osnabrück District Court, Managing Director: Ansas Meyer
-----------------------------------------------------------

The information contained in this email is to be treated as
confidential and is intended solely for the addressee. Any
publication, distribution, or other related action is
expressly prohibited.


