[Xapian-discuss] Empty records & unique key...

Olly Betts olly at survex.com
Wed May 11 14:37:13 BST 2005


On Wed, May 11, 2005 at 05:55:25AM -0700, arjan holscher wrote:
> Here is the data I feed to scriptindex. It looks okay
> to me.

OK, it has DOS/Windows line endings, which isn't a problem - scriptindex
will remove a \r if there is one before the \n.

But some lines have multiple \r characters before the \n,
not just one.  That's rather odd, but shouldn't actually cause
problems, except that boolean terms will include the extra \r characters!
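
If you want to see which lines are affected, something like the following
rough sketch should do it.  This is not what scriptindex does internally,
and the function name is made up purely for illustration; it just reports
lines which end with more than one \r before the \n.

```python
def lines_with_extra_cr(data: bytes) -> list:
    """Return 1-based numbers of lines that have 2+ \\r before the \\n."""
    bad = []
    # Split on \n only (bytes.splitlines() would also split on a lone \r).
    # The last piece is whatever follows the final \n, so it is skipped.
    for number, line in enumerate(data.split(b"\n")[:-1], start=1):
        if line.endswith(b"\r\r"):
            bad.append(number)
    return bad
```

Read the file in binary mode before passing it in (e.g.
open(path, "rb").read()), otherwise newline translation may hide the
extra characters.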

Anyway, with the latest development version on Linux, I get 4284 records
indexed.  Updating adds one record and updates 4283, leaving 4285 in the
database.  If I count the number of "internal=" lines, there are 4283 in
the file, so one record apparently has no internal field, and it makes
sense that it will be re-added on each update (because UNIQUE won't fire).
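
To repeat that count on your side, here is a minimal sketch.  It assumes
the input is laid out as name=value lines with records separated by blank
lines, and it strips every trailing \r (more than scriptindex itself
removes) so the stray characters don't upset the record boundaries.  The
function name is hypothetical.

```python
def count_records(text: str, field: str = "internal"):
    """Return (total records, records with no `field=` line)."""
    total = 0
    missing = 0
    in_record = False
    has_field = False
    for raw in text.split("\n"):
        line = raw.rstrip("\r")  # assumption: ignore any number of trailing \r
        if line == "":
            # A blank line closes the current record, if there is one.
            if in_record:
                total += 1
                if not has_field:
                    missing += 1
            in_record = False
            has_field = False
        else:
            in_record = True
            if line.startswith(field + "="):
                has_field = True
    # Count a final record that isn't followed by a blank line.
    if in_record:
        total += 1
        if not has_field:
            missing += 1
    return total, missing
```

If the second number comes back non-zero, those are the records which
can't be matched by a UNIQUE on that field.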

This doesn't match the 14222 you get, though the "third updated" sort
of fits...

> I discovered something else in the database. Some
> records are only partially filled. Some fields are
> filled with something and some fields are plain empty.
> Maybe this will ring a bell in your head? :S

Are you running on Windows?  That might be the important difference.

Or perhaps it's a bug which was fixed since 0.8.5, although
scriptindex.cc hasn't changed materially since then - just a few comment
fixes.

Cheers,
    Olly


