[Xapian-discuss] Re: Help on indexscript

Olly Betts olly at survex.com
Thu Apr 27 13:34:10 BST 2006


On Thu, Apr 27, 2006 at 02:26:17PM +0200, Hermann Rokicz wrote:
> Olly Betts wrote:
> >> groups : truncate=200 boolean=G field=groups
> > 
> > Presumably there can be multiple groups?  Currently scriptindex doesn't
> > allow you to generate multiple booleans from one field.  The best fix
> > currently is probably for your conversion script to produce one entry
> > for all the group names (to go in the field) and also one entry per
> > group (to be indexed as boolean=G).
> 
> This is the 'normal' newsgroups-header, like 'Newsgroups: alt.test,
> alt.next.group'. Should my script convert it to something like 'alt.test
> alt.next.group'? What do you mean by 'one entry per group'?

Well, boolean=G takes the whole field value and creates one term from
it.  So you'd get the term:

Galt.test,alt.next.group

Which I doubt you want...

So as things currently stand, your mail parsing script needs to produce
something like:

groups=alt.test,alt.next.group
group=alt.test
group=alt.next.group

With the corresponding index script lines being:

groups : truncate=200 field
group : boolean=G

But ideally scriptindex should be flexible enough to be able to handle
this case directly.
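
As a rough illustration of the conversion step described above, here is a
minimal Python sketch that turns a Newsgroups header value into the
"groups=" and "group=" input lines. The function name newsgroups_to_fields
is made up for this example and is not part of Xapian or of any existing
script; it assumes the header value has already been extracted and unfolded
by the mail parser.

```python
def newsgroups_to_fields(header_value):
    """Turn a Newsgroups header value into scriptindex input lines.

    Hypothetical helper: returns one "groups=" line holding every group
    name (for the field) plus one "group=" line per group (for boolean=G).
    """
    # Split on commas, trimming whitespace and dropping empty entries.
    names = [g.strip() for g in header_value.split(",") if g.strip()]
    if not names:
        return []
    lines = ["groups=" + ",".join(names)]
    lines.extend("group=" + name for name in names)
    return lines


if __name__ == "__main__":
    for line in newsgroups_to_fields("alt.test, alt.next.group"):
        print(line)
```

With the two index script lines above, each "group=" line should then
produce its own G-prefixed boolean term rather than one term for the whole
header.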

> > If you'd find it useful, you're welcome to a copy of the indexer we use
> > for gmane (http://search.gmane.org/), which has a similar purpose.
> 
> That would be nice. My e-mail-address from the header is working.

OK.  It'll probably take me a day or two to sort out.

Cheers,
    Olly


