[Xapian-tickets] [Xapian] #500: Shorter max length for terms that contain zero bytes
Xapian
nobody at xapian.org
Mon Mar 20 04:02:42 GMT 2023
#500: Shorter max length for terms that contain zero bytes
-----------------------------+-------------------------------
Reporter: Versmisse David    | Owner: Olly Betts
Type: defect                 | Status: assigned
Priority: normal             | Milestone: 2.0.0
Component: Backend-Glass     | Version: 1.2.0
Severity: normal             | Resolution:
Keywords:                    | Blocked By:
Blocking:                    | Operating System: All
-----------------------------+-------------------------------
Changes (by Olly Betts):
* milestone: 1.4.x => 2.0.0
Comment:
This is still present.
The original plan for addressing this was to have a custom per-table key
comparison function rather than using a byte-string compare. That has
proved awkward because it prevents us from storing key deltas, which
saves a lot of space (honey implements this), or at least it prevents the
obvious approach from working. I guess we could have a custom per-table
key delta function (or more likely a set of functions), but this seems
like a lot of complexity for a corner case.
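
To illustrate why key deltas lean on byte-string ordering, here is a
minimal sketch of generic prefix-delta encoding. It is not honey's actual
on-disk format; the function names and the (shared length, suffix) layout
are assumptions made purely for illustration. Neighbouring keys in byte
order tend to share a prefix, so each key can be stored as the count of
bytes shared with its predecessor plus the remaining suffix.

```python
def shared_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_deltas(sorted_keys):
    """Hypothetical layout: (bytes shared with previous key, suffix)."""
    out = []
    prev = b""
    for key in sorted_keys:
        shared = shared_prefix_len(prev, key)
        out.append((shared, key[shared:]))
        prev = key
    return out


def decode_deltas(deltas):
    """Rebuild the full keys from the (shared, suffix) pairs."""
    keys = []
    prev = b""
    for shared, suffix in deltas:
        key = prev[:shared] + suffix
        keys.append(key)
        prev = key
    return keys


# Python compares bytes objects byte by byte, like a byte-string compare.
keys = sorted([b"seat", b"searching", b"search", b"searched"])
deltas = encode_deltas(keys)
```

With a comparison function that is not a plain byte-string compare,
adjacent keys need not share long byte prefixes, which is presumably why
the obvious delta scheme stops working.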
Perhaps we need to come up with another way to address this. Allowing
longer keys also creates more complexity, as we'd need to allow two bytes
for the key size in some cases (or accept an extra byte of overhead on
every key length). We could just declare that terms containing zero bytes
aren't supported, which would side-step the problem. I can't really see a
good use-case for them, but outlawing them seems clumsy.
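
A rough sketch of how zero bytes can eat into the supported term length
when the key size is held in a single byte. The escaping scheme (each
zero byte stored as two bytes) and the 255-byte ceiling are assumptions
for illustration only; the ticket does not spell out the exact encoding
or limits, and real keys carry additional overhead not modelled here.

```python
MAX_KEY_LEN = 255  # assumption: largest size a one-byte length can hold


def escape_term(term: bytes) -> bytes:
    """Assumed escaping: each zero byte is stored as two bytes."""
    return term.replace(b"\x00", b"\x00\xff")


def fits_one_byte_length(term: bytes, overhead: int = 0) -> bool:
    """True if the escaped term plus any other key bytes fits the limit."""
    return len(escape_term(term)) + overhead <= MAX_KEY_LEN


plain = b"a" * 200        # stays 200 bytes after escaping
zeros = b"\x00" * 200     # doubles to 400 bytes after escaping

print(len(escape_term(plain)), fits_one_byte_length(plain))
print(len(escape_term(zeros)), fits_one_byte_length(zeros))
```

Under these assumptions a term made entirely of zero bytes can only be
about half as long as one with none, which is the kind of reduction that
would need documenting.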
Maybe we just need to make sure it's clearly documented so people wanting
to use zero bytes are warned up front and can decide if the reduction in
supported term length is a problem or not.
--
Ticket URL: <https://trac.xapian.org/ticket/500#comment:7>
Xapian <https://xapian.org/>