[Xapian-discuss] Key too long: length was 254 bytes
olly at survex.com
Fri Dec 9 17:29:07 GMT 2005
On Fri, Dec 09, 2005 at 02:24:16PM -0200, Christiano Anderson wrote:
> $ quartzcompact base1 base2 finaldb
> After some minutes it returns this error: postlist ...quartzcompact:
> Key too long: length was 254 bytes, maximum length of a key is
> BTREE_MAX_KEY_LEN bytes
> What could be wrong with the databases and how can I solve this problem?
The btree manager which Quartz uses has a maximum key length of 252
bytes. But because the keys contain more than just term names, the
maximum safe length for a term is 240 bytes (or perhaps a few more,
but 240 is certainly safe). There's one further wrinkle - each zero
byte in a term requires 2 bytes in the quartz key.
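
As a rough illustration of the limit just described, here is a minimal Python sketch of a term length check. It assumes only what is stated above: 240 bytes is a safe term length, and each zero byte costs 2 bytes in the key. The names MAX_SAFE_TERM_LEN, encoded_term_length and is_safe_term are made up for this example and are not part of Xapian.

```python
# Hypothetical helper, not part of Xapian: checks a term against the
# "certainly safe" 240 byte figure quoted in the message.
MAX_SAFE_TERM_LEN = 240


def encoded_term_length(term: bytes) -> int:
    """Return the bytes the term needs if every zero byte costs 2 bytes."""
    return len(term) + term.count(b"\x00")


def is_safe_term(term: bytes) -> bool:
    """True if the term, with zero bytes counted twice, fits in 240 bytes."""
    return encoded_term_length(term) <= MAX_SAFE_TERM_LEN


if __name__ == "__main__":
    for term in (b"short", b"x" * 240, b"x" * 239 + b"\x00"):
        print(len(term), encoded_term_length(term), is_safe_term(term))
```

Running a check like this over the terms before indexing would catch overlong terms before they reach the database.
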
The problem you've run into is that the length check currently only
looks at the assembled key in quartz, not at the term length. The
assembled key for some of the Btree tables has the document id encoded
using a variable length coding, so bigger document ids need more
bytes. I suspect that base2 has a 252 byte key in one of these tables
and that when the databases are merged the document id is larger so the
key becomes too long.
Really we should vet the term lengths themselves to stop this situation
happening, but I'm afraid we don't currently.
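
To show why a larger document id can push a key over the limit, here is a small Python sketch of a variable length integer coding. It assumes a generic scheme with 7 payload bits per byte, which is not necessarily the exact coding Quartz uses, and the document ids and offset in the usage part are invented numbers.

```python
def varint_length(docid: int) -> int:
    """Bytes needed to store docid at 7 payload bits per byte.

    Illustrative coding only; Quartz's real format may differ.
    """
    if docid < 0:
        raise ValueError("document ids are non-negative")
    nbytes = 1
    while docid >= 0x80:
        docid >>= 7
        nbytes += 1
    return nbytes


if __name__ == "__main__":
    # Invented example: a document id that is small in its own database
    # becomes larger once offset by the documents merged in before it.
    docid_before = 100
    offset = 20000
    docid_after = docid_before + offset
    growth = varint_length(docid_after) - varint_length(docid_before)
    print("extra key bytes after merge:", growth)
```

Under these assumptions the same term needs a couple more key bytes after the merge, which is enough to take a key that was exactly at the limit past it.
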
So you'll need to find out what is producing such long terms and make
it stop doing so! If it's something like a URL, you might want to look
at how Omega handles this by hashing the tail of long URLs. The code
is in omindex.cc, function make_url_term.
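
The general idea of hashing the tail of a long term can be sketched in Python as below. This is not a port of make_url_term: the function name shorten_term, the use of MD5 and the 240 byte limit are all assumptions made for illustration, and Omega's actual hash and layout may well differ.

```python
import hashlib

# Assumed limit, taken from the "certainly safe" figure in the message.
MAX_SAFE_TERM_LEN = 240
# Length of an MD5 hex digest in bytes.
DIGEST_LEN = 32


def shorten_term(term: bytes, max_len: int = MAX_SAFE_TERM_LEN) -> bytes:
    """Keep the head of an overlong term and replace its tail with a hash.

    Hypothetical helper: terms that already fit are returned unchanged.
    """
    if max_len <= DIGEST_LEN:
        raise ValueError("max_len must leave room for the digest")
    if len(term) <= max_len:
        return term
    head_len = max_len - DIGEST_LEN
    head, tail = term[:head_len], term[head_len:]
    return head + hashlib.md5(tail).hexdigest().encode("ascii")


if __name__ == "__main__":
    long_url = b"http://example.com/" + b"a" * 400
    print(len(long_url), len(shorten_term(long_url)))
```

Because the tail is hashed rather than simply cut off, two long URLs that share the same first bytes still produce different terms (barring hash collisions), while every term stays within the chosen limit. This sketch does not account for zero bytes, which would need the extra byte each as noted earlier.
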