[Xapian-tickets] [Xapian] #737: Fix/improve $filters
Xapian
nobody at xapian.org
Fri Oct 7 02:25:14 BST 2016
#737: Fix/improve $filters
--------------------------------+-------------------
Reporter: olly | Owner: olly
Type: enhancement | Status: new
Priority: normal | Milestone: 1.5.0
Component: Omega | Version:
Severity: normal | Keywords:
Blocked By: | Blocking:
Operating System: All |
--------------------------------+-------------------
The current encoding of $filters has at least one bug (which was also
present in the older encoding used in 1.2.x):
* `DOCIDORDER=A` is the default, but produces an `X` in `$filters`, while
`DOCIDORDER=X` is non-default but produces nothing in `$filters`. However,
`A` and `X` currently behave identically, as `DONT_CARE` always results in
`ASCENDING` order, so this doesn't seem worth changing anything for. But
if/when we change the encoding, we should address this.
And it could be more compact:
* Every `N` term is prefixed by `!`, but only the first needs to be.
* Every encoded string has at least `~~` after the character for
`DEFAULTOP`, which isn't necessary.
* The `DEFAULTOP` character could be omitted when using the default
`DEFAULTOP`.
* We could combine some/all of `DEFAULTOP`, `DOCIDORDER` and the existing
`SORTREVERSE`/`SORTAFTER` characters. There are currently 2, 3 and 2*2
states respectively (24 combinations), though more `DEFAULTOP` values are
possible. There are about 10+26*2+19=81 characters which don't need URL
encoding, and 81/(3*4) is 6.75, so we could support up to 6 `DEFAULTOP`
values and still encode all of these into one character which shouldn't
need URL encoding.
* We could encode value slot numbers using something like base64 and save
bytes when slots > 9 are used (or perhaps encode all the slot numbers
together such that they'd usually all fit in one byte).
* Lists of `B` and `N` are sorted, so could easily be prefix-compressed -
reducing the size when there are a lot of either, which is a case where
keeping the size down matters most.
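As a rough illustration of the combined-character idea in the list above,
here is a minimal C++ sketch. It is not Omega's actual code: `FilterState`,
`pack_state()`, `unpack_state()` and the numbering of the states are all
hypothetical. It also assumes a narrower alphabet than the estimate above,
using only the 66 "unreserved" characters from RFC 3986, which would leave
room for 5 `DEFAULTOP` values rather than 6.

```cpp
#include <cassert>
#include <cstring>

// The 66 "unreserved" characters from RFC 3986, which never need URL
// encoding.  (Assumption: a real encoding might allow a larger set.)
static const char ALPHABET[] =
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "-._~";
static const unsigned ALPHABET_SIZE = sizeof(ALPHABET) - 1;

// State counts as given in the ticket: 2, 3 and 2*2.
static const unsigned N_DEFAULTOP = 2;
static const unsigned N_DOCIDORDER = 3;
static const unsigned N_SORT = 4;  // SORTREVERSE x SORTAFTER
static const unsigned N_STATES = N_DEFAULTOP * N_DOCIDORDER * N_SORT;

static_assert(N_STATES <= ALPHABET_SIZE,
              "all combined states must fit in one character");

// Hypothetical container; the numbering of each field is arbitrary.
struct FilterState {
    unsigned default_op;   // 0 .. N_DEFAULTOP - 1
    unsigned docid_order;  // 0 .. N_DOCIDORDER - 1
    unsigned sort_flags;   // 0 .. N_SORT - 1
};

// Combine the three settings into a mixed-radix index and map that
// index to a single character from ALPHABET.
char pack_state(const FilterState& s) {
    assert(s.default_op < N_DEFAULTOP);
    assert(s.docid_order < N_DOCIDORDER);
    assert(s.sort_flags < N_SORT);
    unsigned index =
        (s.default_op * N_DOCIDORDER + s.docid_order) * N_SORT + s.sort_flags;
    return ALPHABET[index];
}

// Reverse of pack_state().  Returns false if ch is not a valid state
// character, in which case s is left unchanged.
bool unpack_state(char ch, FilterState& s) {
    if (ch == '\0') return false;  // strchr() would match the terminator
    const char* p = std::strchr(ALPHABET, ch);
    if (!p) return false;
    unsigned index = static_cast<unsigned>(p - ALPHABET);
    if (index >= N_STATES) return false;
    s.sort_flags = index % N_SORT;
    index /= N_SORT;
    s.docid_order = index % N_DOCIDORDER;
    index /= N_DOCIDORDER;
    s.default_op = index;
    return true;
}
```

If the all-defaults combination were assigned index 0, the state character
could be left out entirely in that case, in the same spirit as omitting
the `DEFAULTOP` character when the default is in use.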
The compactness matters as the length of a URL is limited, and using `GET`
is common for search systems. A longer URL can also look uglier when
pasted, etc.
--
Ticket URL: <https://trac.xapian.org/ticket/737>
Xapian <//xapian.org/>
Xapian