[Xapian-tickets] [Xapian] #500: Shorter max length for terms that contain zero bytes

Xapian nobody at xapian.org
Mon Mar 20 04:02:42 GMT 2023


#500: Shorter max length for terms that contain zero bytes
-----------------------------+-------------------------------
 Reporter:  Versmisse David  |             Owner:  Olly Betts
     Type:  defect           |            Status:  assigned
 Priority:  normal           |         Milestone:  2.0.0
Component:  Backend-Glass    |           Version:  1.2.0
 Severity:  normal           |        Resolution:
 Keywords:                   |        Blocked By:
 Blocking:                   |  Operating System:  All
-----------------------------+-------------------------------
Changes (by Olly Betts):

 * milestone:  1.4.x => 2.0.0

Comment:

 This is still present.

 The original plan for addressing this was to have a custom per-table key
 comparison function rather than using a byte-string compare.  That's
 proved awkward to do as it prevents us from storing key deltas, which
 save a lot of space (honey implements these).  Or at least it prevents
 the obvious approach from working - I guess we could perhaps have a
 custom per-table key delta function (or probably a set of functions), but
 this seems like a lot of complexity for a corner case.
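
 To make "key deltas" concrete, here is a minimal sketch of prefix-delta
 encoding over keys sorted by plain byte-string comparison.  It is not
 honey's on-disk format; KeyDelta, encode_deltas, decode_deltas and
 stored_bytes are hypothetical names used only for illustration.  The
 idea is that neighbouring keys in byte order tend to share a prefix, so
 each key can be stored as a shared-prefix length plus the differing tail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: how many leading bytes a key shares with the
// previous key, plus the bytes that differ.
struct KeyDelta {
    std::size_t shared;  // length of the common prefix with the previous key
    std::string suffix;  // remaining bytes of this key
};

// Encode keys which are already sorted by byte-string comparison.
std::vector<KeyDelta> encode_deltas(const std::vector<std::string>& sorted_keys) {
    std::vector<KeyDelta> out;
    std::string prev;
    for (const std::string& key : sorted_keys) {
        std::size_t n = 0;
        const std::size_t limit = std::min(prev.size(), key.size());
        while (n < limit && prev[n] == key[n]) ++n;
        out.push_back({n, key.substr(n)});
        prev = key;
    }
    return out;
}

// Rebuild the full keys by replaying the deltas in order.
std::vector<std::string> decode_deltas(const std::vector<KeyDelta>& deltas) {
    std::vector<std::string> out;
    std::string prev;
    for (const KeyDelta& d : deltas) {
        assert(d.shared <= prev.size());
        prev = prev.substr(0, d.shared) + d.suffix;
        out.push_back(prev);
    }
    return out;
}

// Count only the suffix bytes, ignoring the cost of the length fields.
std::size_t stored_bytes(const std::vector<KeyDelta>& deltas) {
    std::size_t total = 0;
    for (const KeyDelta& d : deltas) total += d.suffix.size();
    return total;
}
```

 For the keys "apple", "applesauce", "apply", "banana" this stores 17
 suffix bytes instead of 26 key bytes.  The saving depends on adjacent
 keys sharing prefixes, which byte order provides and a custom
 comparison function need not.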

 Perhaps we need to come up with another way to address this.  Allowing
 longer keys also creates more complexity as we'd need to allow two bytes
 for the key size in some cases (or have an extra byte of overhead on every
 key length).  We could just declare that terms containing zero bytes aren't
 supported, which would side-step the problem - I can't really see a good
 use case for them, but outlawing them seems clumsy.

 Maybe we just need to make sure it's clearly documented so people wanting
 to use zero bytes are warned up front and can decide if the reduction in
 supported term length is a problem or not.
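
 For documentation purposes, the effect can be illustrated with a small
 sketch.  It assumes (this is not stated in the ticket) that a term is
 packed into a key by escaping each zero byte as two bytes and adding a
 zero terminator, so that byte-string comparison still sorts by term
 first.  escape_term and term_fits are hypothetical names, and the budget
 is a parameter rather than any real backend limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: escape a term so that other key fields can follow
// it while byte-wise comparison still orders keys by term first.
// Each zero byte becomes "\0\xff" and a single "\0" terminates the term.
std::string escape_term(const std::string& term) {
    std::string out;
    out.reserve(term.size() + 1);
    for (char ch : term) {
        out += ch;
        if (ch == '\0') out += '\xff';
    }
    out += '\0';
    // Every term grows by at least the terminator byte.
    assert(out.size() >= term.size() + 1);
    return out;
}

// Hypothetical check: does the escaped term fit in `budget` bytes?
// Each zero byte in the term uses up one extra byte of the budget.
bool term_fits(const std::string& term, std::size_t budget) {
    return escape_term(term).size() <= budget;
}
```

 Under these assumptions a 10-byte term with no zero bytes needs 11 bytes
 of key space, while a 10-byte term containing two zero bytes needs 13, so
 the longest usable term shrinks by one byte per zero byte it contains.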
-- 
Ticket URL: <https://trac.xapian.org/ticket/500#comment:7>
Xapian <https://xapian.org/>