[Xapian-discuss] php and get_termpos (stemming problem?)

Olly Betts olly at survex.com
Wed Sep 12 12:47:56 BST 2007


On Wed, Sep 12, 2007 at 12:01:23PM +0200, Wojewsky, Sascha, Heinze wrote:
>   if (!$pos->equals($database->positionlist_end($i->get_docid(),
> $terms->get_term()))) {
[...]
> But the if condition was always false.
> 
> The term (searchstring) from "$terms->get_term()" starts with the
> stemming-"Z". 

See:

http://www.xapian.org/docs/termgenerator.html#stemming

In particular:

    Now we index all terms lowercased with positional information, and also
    stemmed with a 'Z' prefix (unless they start with a digit), but without
    positional information.

So your "if" condition is always false because there's no positional
information stored for 'Z'-prefixed terms: their position lists are
empty, so the iterator starts out equal to positionlist_end().  Omitting
the positions saves a lot of disk space, and we can still provide phrase
searching etc. by using the unstemmed forms.
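
To make the scheme concrete, here is a rough toy model in plain PHP.  It
is only a sketch and is not Xapian code: toy_stem() and toy_index() are
hypothetical names, and the "stemmer" just strips a trailing "ing" or "s".
It indexes each lowercased word with its position, and adds the
'Z'-prefixed stem with an empty position list.

```php
<?php
// Toy model of the indexing scheme quoted above. This is NOT Xapian code.

// Hypothetical stand-in for a real stemmer: strips a trailing "ing" or "s".
function toy_stem(string $word): string
{
    return preg_replace('/(ing|s)$/', '', $word);
}

// Returns term => list of positions. An empty list means the term is
// indexed but has no positional information.
function toy_index(string $text): array
{
    $postings = [];
    $pos = 0;
    $words = preg_split('/\W+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        $pos++;
        // Unstemmed, lowercased term: record the position.
        $postings[$word][] = $pos;
        // Stemmed term with 'Z' prefix: record the term only, no position.
        if (!ctype_digit($word[0])) {
            $z = 'Z' . toy_stem($word);
            if (!isset($postings[$z])) {
                $postings[$z] = [];
            }
        }
    }
    return $postings;
}

$postings = toy_index('Searching strings and searching text');
echo count($postings['searching']), "\n"; // 2 positions (1 and 4)
echo count($postings['Zsearch']), "\n";   // 0: no positional information
```

In this model 'Zsearch' is a known term, but asking for its positions
always gives an empty list, which is the situation your "if" ran into.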

> If I've called $database->positionlist_end($i->get_docid(),
> 'searchstring') without the leading "Z", I've got a result.

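You will, but only if the stemmed form also occurs as an unstemmed word
in the document.  If the document only contains other forms of the word,
stripping the "Z" gives you a term with no positions either.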
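
A small sketch of that caveat, again in plain PHP rather than the Xapian
API.  The two postings arrays and the helper positions_without_z() are
made up for illustration; they just map a term to its positions in one
document.

```php
<?php
// Hypothetical postings for two documents (term => positions).
// This is NOT Xapian code.
$doc_a = ['searching' => [1, 4], 'Zsearch' => []];
$doc_b = ['search' => [1], 'searching' => [2], 'Zsearch' => []];

// Hypothetical helper: drop a leading 'Z' and look up the positions of
// whatever term is left.
function positions_without_z(array $postings, string $term): array
{
    if ($term !== '' && $term[0] === 'Z') {
        $term = substr($term, 1);
    }
    return $postings[$term] ?? [];
}

// "search" never occurs unstemmed in $doc_a, so nothing is found.
echo count(positions_without_z($doc_a, 'Zsearch')), "\n"; // 0
// "search" occurs unstemmed in $doc_b, at position 1.
echo count(positions_without_z($doc_b, 'Zsearch')), "\n"; // 1
```

So stripping the "Z" is only a partial workaround; it misses documents
that contain other forms of the word but not the stem itself.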

> Any unauthorized copying, disclosure or distribution of the material
> in this e-mail is strictly forbidden.

Please don't post to mailing lists with such disclaimers.  Email sent
to this (and most other) mailing lists will be copied, disclosed, and
distributed very widely - that's the very purpose of a mailing list.
If you don't want that, don't send mail to it.

Cheers,
    Olly
