[Xapian-discuss] php and get_termpos (stemming problem?)
Olly Betts
olly at survex.com
Wed Sep 12 12:47:56 BST 2007
On Wed, Sep 12, 2007 at 12:01:23PM +0200, Wojewsky, Sascha, Heinze wrote:
> if (!$pos->equals($database->positionlist_end($i->get_docid(),
> $terms->get_term()))) {
[...]
> But the if condition was always false.
>
> The term (searchstring) from "$terms->get_term()" starts with the
> stemming-"Z".
See:
http://www.xapian.org/docs/termgenerator.html#stemming
In particular:
    Now we index all terms lowercased with positional information, and also
    stemmed with a 'Z' prefix (unless they start with a digit), but without
    positional information.
So your "if" condition is always false because there's no positional
information stored for 'Z'-prefixed terms. This is done because it
saves a lot of disk space, but we can still provide phrase searching
etc (by using the unstemmed forms).
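
One way to act on this in your loop is to skip the 'Z'-prefixed terms before asking for a position list. The following is only a rough sketch in plain PHP, not Xapian API: is_stemmed_term() and terms_with_positions() are hypothetical helper names, the term list is made up, and it assumes the terms were indexed with the default TermGenerator scheme described above (terms added some other way could legitimately start with "Z").

```php
<?php
// Hypothetical helper (not part of Xapian): true if the term looks like a
// stemmed form generated by TermGenerator, i.e. it starts with 'Z'.
function is_stemmed_term(string $term): bool
{
    return $term !== '' && $term[0] === 'Z';
}

// Hypothetical helper: keep only the terms that can carry positional
// information under the default scheme (everything without the 'Z' prefix).
function terms_with_positions(array $terms): array
{
    return array_values(array_filter($terms, function (string $t): bool {
        return !is_stemmed_term($t);
    }));
}

// Made-up term list, standing in for what the matching-terms iterator yields.
$matching = ['Zsearch', 'searching', 'Zstring', 'strings'];

foreach (terms_with_positions($matching) as $term) {
    // In real code this is where you would request the position list
    // for $term; here we just show which terms survive the filter.
    echo $term, "\n";
}
```

Only the terms that pass this filter are worth handing to the position list calls; for the 'Z' forms the position list will always be empty.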
> If I've called $database->positionlist_end($i->get_docid(),
> 'searchstring') without the leading "Z", I've got a result.
You will, provided that the stemmed form is also an unstemmed word in
the document.
> Any unauthorized copying, disclosure or distribution of the material
> in this e-mail is strictly forbidden.
Please don't post to mailing lists with such disclaimers. Email sent
to this (and most other) mailing lists will be copied, disclosed, and
distributed very widely - that's the very purpose of a mailing list.
If you don't want that, don't send mail to it.
Cheers,
Olly