queries for a set of values
Eric Wong
e at 80x24.org
Fri Apr 26 23:37:37 BST 2024
I probably should've used boolean terms in addition to numeric
values when indexing, but currently I only have a set of numeric
values[1] and I'm trying to avoid having to reindex ~250GB DBs
(and asking numerous users to do the same).
Say I have a bunch of values which I want to filter a query against.
If I had boolean terms, I could just OP_OR the whole set.
IOW, this is what notmuch does with terms:
std::set<std::string> terms;
// notmuch populates terms via terms.insert(*i)...
Query(OP_OR, terms.begin(), terms.end());
// Disclaimer: I don't really know C++
With a set of integers I have (after sortable_serialise), would the
best way be to OP_OR a bunch of OP_VALUE_RANGE queries together?
So, perhaps something like:
Query(OP_OR,
Query(OP_VALUE_RANGE, column, v[0], v[0]),
Query(OP_VALUE_RANGE, column, v[1], v[1]),
Query(OP_VALUE_RANGE, column, v[2], v[2]),
...
Query(OP_VALUE_RANGE, column, v[LAST], v[LAST]))
// Or (totally not even compile-tested and I don't know C++)
// something like:
std::vector<Xapian::Query> subq;
for (size_t i = 0; i < nelem; i++) {
    // begin == end, so each range matches exactly one value
    std::string v = sortable_serialise(int_vals[i]);
    subq.push_back(Query(OP_VALUE_RANGE, column, v, v));
}
Query(OP_OR, subq.begin(), subq.end());
It seems what I'm really looking for is an OP_VALUE_OR or OP_VALUE_IN,
but only OP_VALUE_{GE,LE,RANGE} exist.
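If the OP_OR list gets long, one possible way to shrink it is to
merge runs of consecutive integers into a single range before
building the subqueries. The sketch below is only that idea in
plain C++; coalesce_runs is a hypothetical helper, not anything
Xapian provides, and it assumes the column only ever holds
integers (otherwise a merged range would also match values
in between).

```cpp
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Xapian): collapse a sorted set of
// integers into inclusive [first, last] runs of consecutive values.
std::vector<std::pair<int64_t, int64_t>>
coalesce_runs(const std::set<int64_t> &vals)
{
    std::vector<std::pair<int64_t, int64_t>> runs;
    for (int64_t v : vals) {
        // std::set iterates in ascending order without duplicates,
        // so v is always greater than the end of the current run.
        if (!runs.empty() && runs.back().second == v - 1)
            runs.back().second = v;   // v extends the current run
        else
            runs.emplace_back(v, v);  // v starts a new run
    }
    return runs;
}
```

Each resulting pair {lo, hi} would then become one
Query(OP_VALUE_RANGE, column, sortable_serialise(lo),
sortable_serialise(hi)) in the OP_OR list, so a set like
{1, 2, 3, 7} needs two subqueries instead of four. Since
sortable_serialise takes a double, this presumably only holds
for integers that a double can represent exactly.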
[1] Even if I switched to terms, I would still keep the numeric
values since I also rely on Enquire.set_collapse_key on this
column.