[Xapian-discuss] Adding terms of more than one word with PHP bindings

Yannick Warnier ywarnier at beeznest.org
Mon Sep 15 01:02:28 BST 2008


On Sunday 14 September 2008 at 19:14 +0100, James Aylett wrote:
> On Sun, Sep 14, 2008 at 10:00:12AM -0500, Yannick Warnier wrote:
> 
> [Multi-word terms]
> > My use case is that I offer a set of scripts whereby the users can add
> > "tags" to the documents they index. These tags are then kept in the
> > Xapian database using the terms feature.
> > Some of these tags are using multiple words (let's say "summer
> > holiday"). I then offer a search interface which allows for a search
> > based on a combination of tags (boolean search) and normal (statistical)
> > search.
> > The tags are stored correctly using the XapianDocument::add_term()
> > method. They are retrieved correctly using the
> > XapianDatabase::allterms_begin() method.
> >
> > However, when trying to query the Xapian database for my search string
> > (see code appended below), the search string syntax (the first parameter
> > of my xapian_query function), something like 
> > 
> >   sea sex sun T:summer holiday T:beach
> > 
> > doesn't get the tag "summer holiday".
> 
> The query parser isn't really going to help you here, because of word
> splitting. The trouble is, there's no difference in syntax between the
> desired semantic 'tag "summer holiday"' and the desired semantic 'tag
> "summer" AND holiday'. Or rather, you want the syntax above to mean
> the former, but it actually means the latter.
> 
> You may be better off word splitting your tag field and treating it as
> probabilistic rather than boolean, ie making it a freetext metadata
> field rather than a tag. (At least from the point of view of search.)
> You can do that using the TermGenerator, if that's what you're using
> for your regular text.

I may be better off splitting words, but I find it strange to be able to
add a term containing a space and then not be able to search for it
afterwards. That seems illogical to me (or maybe my space-separated
terms are not actually being added correctly?).
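
To make the mismatch concrete, here is a toy sketch of the word splitting
James describes. It is plain Python and does not use Xapian at all: the
function naive_terms, the "T:" prefix handling and the stored_tags set are
all hypothetical stand-ins, and the real QueryParser is certainly more
involved. It only illustrates why a stored term containing a space cannot
be reached from a query string that is split on whitespace.

```python
def naive_terms(query, prefix="T:"):
    """Toy model of whitespace word splitting (NOT Xapian's QueryParser).

    Tokens starting with the prefix are treated as boolean tag terms;
    everything else is treated as an ordinary free-text word.
    """
    tags, words = [], []
    for token in query.split():
        if token.startswith(prefix):
            tags.append(token[len(prefix):])
        else:
            words.append(token)
    return tags, words


# Hypothetical set of tags stored as single terms, spaces included.
stored_tags = {"summer holiday", "beach"}

tags, words = naive_terms("sea sex sun T:summer holiday T:beach")
matched = [t for t in tags if t in stored_tags]

print(tags)     # ['summer', 'beach']
print(words)    # ['sea', 'sex', 'sun', 'holiday']
print(matched)  # ['beach']
```

In this model the term "summer holiday" is stored without any problem, but
the query only ever asks for the tag "summer" plus the free-text word
"holiday", so the stored tag is never looked up.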

Yannick



