[Xapian-discuss] some questions with scriptindex

Olly Betts olly at survex.com
Mon Mar 28 12:05:39 BST 2005


On Sun, Mar 27, 2005 at 04:07:21PM -0800, Sabrina Shen wrote:
> This is somewhat unexpected.  It seems to me that
> there shouldn't be a single term longer than 200 in
> the boolean fields. JN (journal name) is separated by
> spaces.

If you index a field as boolean, the *whole field* becomes a single
term (spaces and all), so a long journal name gives you a long term.
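
As a rough illustration of the difference (a sketch only, not Xapian's
actual code; the XJN prefix and both helper names are made up for the
example), a boolean field yields one term whose length grows with the
whole field, while splitting on spaces yields many short terms:

```python
def boolean_term(prefix, value):
    # Boolean indexing: the whole field value becomes a single term,
    # spaces included. "prefix" is a hypothetical term prefix.
    return prefix + value


def word_terms(prefix, value):
    # For contrast only: one term per space-separated word.
    return [prefix + word for word in value.split()]


# A made-up, deliberately long journal name.
journal = ("Journal of Unexpectedly Long Names " * 8).strip()

term = boolean_term("XJN", journal)
print(len(term))                                        # one long term
print(max(len(t) for t in word_terms("XJN", journal)))  # longest word term
```

The first number is well over the 200 mentioned above, even though no
individual word comes anywhere near it.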

> Is there a way that I can check exactly where this
> error happened, say, with which term and which
> document?

Currently the error isn't detected until the indexed data is being
flushed to disk, so the message doesn't say which document it came from.

You could modify the code in backends/quartz/btree.cc that throws the
error you're getting to include the key (the key will include the
termname).  Or use a bit of perl on your dumpfile:

perl -ne 'print if /^(JN|PY|AU|CA)/ && length > 200' dumpfile

That'll show all the long fields which produce boolean terms.  Note that
the length tested is that of the whole line, field name included.
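
If you also want to know where in the dumpfile each long field is, the
same check can be sketched in Python.  This assumes the dumpfile has one
"FIELD=value" entry per line with blank lines between records; the
function name is made up for the example, the field names are the ones
from the one-liner, and 200 is just the figure from this thread rather
than an exact backend limit:

```python
import sys

BOOLEAN_FIELDS = ("JN", "PY", "AU", "CA")  # field names from the perl one-liner
LIMIT = 200  # figure used in this thread, not an exact backend limit


def long_boolean_fields(lines, fields=BOOLEAN_FIELDS, limit=LIMIT):
    """Yield (line_number, record_number, field, value_length) for each
    listed field whose value is longer than limit characters."""
    record = 0
    in_record = False
    for lineno, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if not line.strip():
            # Assumption: a blank line ends the current record.
            in_record = False
            continue
        if not in_record:
            record += 1
            in_record = True
        name, sep, value = line.partition("=")
        if sep and name in fields and len(value) > limit:
            yield lineno, record, name, len(value)


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as dump:
        for lineno, record, name, length in long_boolean_fields(dump):
            print(f"line {lineno}, record {record}: {name} is {length} chars")
```

Save it under any name and pass the dumpfile as the only argument.
Unlike the one-liner, this measures just the value after the "=", and it
counts characters, so non-ASCII text may take more bytes than the number
reported.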

Cheers,
    Olly


