[Xapian-tickets] [Xapian] #737: Fix/improve $filters
Xapian
nobody at xapian.org
Mon Jul 3 06:02:17 BST 2023
#737: Fix/improve $filters
-------------------------+-------------------------------
 Reporter:  Olly Betts   |             Owner:  Olly Betts
     Type:  enhancement  |            Status:  assigned
 Priority:  normal       |         Milestone:  1.5.0
Component:  Omega        |           Version:
 Severity:  normal       |        Resolution:
 Keywords:               |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+-------------------------------
Comment (by Olly Betts):
Working on this. My WIP so far addresses the first 3 points: any
`START`/`END`/`SPAN` filter is now encoded in the same way as the date
range filters from `START.n`, etc., which gets rid of the `~~` when these
aren't used. Additionally, I've shortened the encoding of date range
filters by a character or two in cases where `SPAN`/`SPAN.n` isn't used.
> The DEFAULTOP character could be omitted when using the default
> DEFAULTOP.
We probably could, but it's a single character and omitting it entirely
seems to complicate things.
> We could combine some/all of DEFAULTOP, DOCIDORDER and the existing
> SORTREVERSE/SORTAFTER characters - there are currently 2, 3 and 2*2
> states, though more DEFAULTOP values are possible, and about
> 10+26*2+19=81 characters which don't need URL encoding, so we could
> support up to 6 DEFAULTOP values and encode all of these into one
> character which shouldn't need URL encoding.
This seems a better approach and potentially saves more.
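
As a rough sketch of the combined-character idea (not what Omega does):
treat the fields as digits of a mixed-radix number and use the result to
index a table of URL-safe characters. The alphabet below is an assumption;
it uses only the 66 RFC 3986 "unreserved" characters rather than the ~81
counted above. The function names and the integer numbering of the states
are hypothetical.

```python
import string

# Assumed alphabet: the 66 RFC 3986 "unreserved" characters.
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase + "-._~"

# State counts taken from the ticket text; the integer numbering of the
# states within each field is hypothetical.
N_DOCIDORDER = 3
N_SORTREVERSE = 2
N_SORTAFTER = 2
STATES_PER_DEFAULTOP = N_DOCIDORDER * N_SORTREVERSE * N_SORTAFTER  # 12


def max_defaultop_values(alphabet_size):
    """How many DEFAULTOP values fit in one character of this alphabet."""
    return alphabet_size // STATES_PER_DEFAULTOP


def pack(defaultop, docidorder, sortreverse, sortafter, n_defaultop=2):
    """Pack the four fields into a single character (mixed-radix encoding)."""
    if n_defaultop > max_defaultop_values(len(ALPHABET)):
        raise ValueError("too many DEFAULTOP values for this alphabet")
    if not (0 <= defaultop < n_defaultop
            and 0 <= docidorder < N_DOCIDORDER
            and 0 <= sortreverse < N_SORTREVERSE
            and 0 <= sortafter < N_SORTAFTER):
        raise ValueError("field out of range")
    index = defaultop
    index = index * N_DOCIDORDER + docidorder
    index = index * N_SORTREVERSE + sortreverse
    index = index * N_SORTAFTER + sortafter
    return ALPHABET[index]


def unpack(ch, n_defaultop=2):
    """Inverse of pack(); raises ValueError for an unknown character."""
    index = ALPHABET.index(ch)
    index, sortafter = divmod(index, N_SORTAFTER)
    index, sortreverse = divmod(index, N_SORTREVERSE)
    defaultop, docidorder = divmod(index, N_DOCIDORDER)
    if defaultop >= n_defaultop:
        raise ValueError("character out of range")
    return defaultop, docidorder, sortreverse, sortafter
```

Each DEFAULTOP value costs 3*2*2 = 12 code points, so 81 characters leave
room for 81 // 12 = 6 DEFAULTOP values, matching the figure quoted above.
The smaller 66-character alphabet assumed here leaves room for 5.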
> We could encode value slot numbers using something like base64 and save
> bytes when slots > 9 are used (or perhaps encode all the slot numbers
> together such that they'd usually all fit in one byte).
Not looked into this.
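
For illustration only, a minimal sketch of what a base64-style slot
encoding might look like. It assumes the URL-safe base64 alphabet from
RFC 4648; the function names are hypothetical, and how several slot
numbers would be delimited or combined is not addressed.

```python
import string

# Assumed digit set: the RFC 4648 URL-safe base64 alphabet (64 characters).
B64URL = string.ascii_uppercase + string.ascii_lowercase + string.digits + "-_"


def encode_slot(slot):
    """Encode a non-negative slot number as base-64 digits, most significant first."""
    if slot < 0:
        raise ValueError("slot must be non-negative")
    digits = []
    while True:
        slot, rem = divmod(slot, 64)
        digits.append(B64URL[rem])
        if slot == 0:
            break
    return "".join(reversed(digits))


def decode_slot(text):
    """Inverse of encode_slot(); raises ValueError on an unknown character."""
    value = 0
    for ch in text:
        value = value * 64 + B64URL.index(ch)
    return value
```

Under these assumptions slots 0 to 63 take one character, so the saving
over decimal starts at slot 10, and slots up to 4095 take two.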
> Lists of B and N are sorted, so could easily be prefix-compressed -
> reducing the size when there are a lot of either, which is a case where
> keeping the size down matters most.
Not looked into this either.
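
For reference, prefix compression of a sorted list could be sketched as
below: each entry is stored as the number of leading characters it shares
with the previous entry plus the remaining suffix. This is a generic
illustration with hypothetical function names and example terms, and it
says nothing about how the pairs would be serialised into the filter
string.

```python
def prefix_compress(sorted_terms):
    """Turn a sorted list of strings into (shared_prefix_length, suffix) pairs."""
    pairs = []
    prev = ""
    for term in sorted_terms:
        limit = min(len(prev), len(term))
        common = 0
        while common < limit and prev[common] == term[common]:
            common += 1
        pairs.append((common, term[common:]))
        prev = term
    return pairs


def prefix_decompress(pairs):
    """Rebuild the original list from (shared_prefix_length, suffix) pairs."""
    terms = []
    prev = ""
    for common, suffix in pairs:
        term = prev[:common] + suffix
        terms.append(term)
        prev = term
    return terms


# Hypothetical filter terms, already sorted.
terms = ["Hexample.com", "Hexample.org", "Htest.org"]
pairs = prefix_compress(terms)
```

Here the second term shares 9 leading characters with the first, so only
`org` needs storing for it.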
--
Ticket URL: <https://trac.xapian.org/ticket/737#comment:3>
Xapian <https://xapian.org/>