[Xapian-discuss] understanding stemming and synonyms
Websuche :: Felix Antonius Wilhelm Ostmann
ostmann at websuche.de
Mon Sep 26 11:06:56 BST 2011
On 26.09.2011 11:50, James Aylett wrote:
> [Back on list]
>
> On 26 Sep 2011, at 09:26, Websuche :: Felix Antonius Wilhelm Ostmann wrote:
>
>>>> That looks fine, but when I now use the query_parser with a stemmer (german2 & STEM_ALL) and parse_query (FLAG_AUTO_SYNONYMS), I get these queries.
>>>
>>> Try STEM_SOME.
>>>
>>> I've poked around a little, and I think we're lacking a clear introduction to the QueryParser, since IIRC this question comes up semi-frequently. I've added a note to MissingDocument; if I'm in error and there is something, feel free to delete it.
>>
>> http://xapian.org/docs/sourcedoc/html/classXapian_1_1QueryParser.html#389713b3969cac6cd98da5fb970f2f8e
>>
>> And it is well documented ... my bad! I think I was misled by a
>> bad howto website for Xapian :-/
>
>
> It's documented, but I think my concerns stand. (You have to think to realise it's generally the right choice, and I think from the point of view of getting started thinking is a bad requirement :-)
>
> There are unfortunately a bunch of howtos for Xapian floating round the internet that are now out of date :-(
>
> J
>
I now have a problem with prefixes.
I used STEM_SOME, which works fine for my synonym problem, but now my
prefixed words won't work anymore.
I used the following prefix:
$xapianQueryParser->add_prefix("market","QM");
A search for market:de now builds the wrong query:
[QUERY: Xapian::Query(ZQMde:(pos=1))]
It stems the prefixed word too :-/
Reading the docs for STEM_SOME again, that sounds like a bug:
STEM_SOME: Search for stemmed forms of terms except for those which
start with a capital letter, or are followed by certain characters
(currently: (/@<>=*[{" ), or are used with operators which need
positional information. Stemmed terms are prefixed with 'Z'.
My term starts with a capital letter once the prefix has been applied.
I checked whether the other conditions work.
A search for market:de* does what I want:
[QUERY: Xapian::Query(QMde:(pos=1))]
STEM_NONE and STEM_ALL also work for prefixed words (but of course not
for my synonyms).
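To make the quoted rule concrete, here is a minimal sketch in plain Python. It is a toy model, not Xapian code: the names would_stem and model_term are hypothetical, the stemmer itself is left out ("de" stays "de", as in the outputs above), and it assumes the capital-letter check is applied to the word as typed in the query rather than to the term after the prefix has been added. Under that assumption the model reproduces both STEM_SOME results shown above, which may mean the behaviour is intended rather than a bug.

```python
# Toy model of the STEM_SOME rule quoted above. NOT Xapian's implementation.
# Characters that, per the quoted docs, suppress stemming when they follow a word.
NO_STEM_FOLLOWERS = set('(/@<>=*[{"')


def would_stem(word, following="", positional=False):
    """Return True if the quoted STEM_SOME rule would stem `word`.

    `following` is the text right after the word in the query string;
    `positional` marks words used with operators needing positions.
    """
    if word[:1].isupper():
        return False
    if following[:1] in NO_STEM_FOLLOWERS:
        return False
    if positional:
        return False
    return True


def model_term(prefix, word, following=""):
    """Build the term this model predicts for a prefixed query word.

    Assumption: the decision is made on the typed word, before the
    prefix is prepended. Real stemming is omitted in this sketch.
    """
    marker = "Z" if would_stem(word, following) else ""
    return marker + prefix + word


print(model_term("QM", "de"))       # market:de
print(model_term("QM", "de", "*"))  # market:de*
```

In this model the prefix never takes part in the capital-letter check, so changing the prefix name or its case in add_prefix would not be expected to change the outcome.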
I am starting to get more confused ;)
P.S.: I also tried a capitalised prefix, add_prefix("MARKET","QM"),
but that did not work either.
--
Kind regards,
Felix Antonius Wilhelm Ostmann