MultiDatabase shard count limitations
Eric Wong
e at 80x24.org
Fri Aug 21 10:06:59 BST 2020
Going back to the "prioritizing aggregated DBs" thread from
February 2020, I've got 390 Xapian shards for 130 public inboxes
I want to search against(*). More are on the horizon (we're
expecting tens of thousands of public inboxes).
After bumping RLIMIT_NOFILE and calling ->add_database for every
shard, the actual queries seem to take ~30s (not good :x).
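For readers following along, here is a rough sketch of the descriptor budgeting that step implies. It is in Python rather than the Perl bindings used above and uses only the stdlib `resource` module. The files-per-shard figure and the headroom are assumptions for illustration (the real number depends on the Xapian backend), and both function names are made up for this example.

```python
import resource

# Assumption: number of files kept open per shard. The real figure
# depends on the Xapian backend and is not stated in this message.
ASSUMED_FDS_PER_SHARD = 6


def fds_needed(shards, fds_per_shard=ASSUMED_FDS_PER_SHARD, headroom=64):
    """Rough descriptor budget: shard files plus headroom for sockets, logs, etc."""
    return shards * fds_per_shard + headroom


def raise_nofile_limit(wanted):
    """Raise the soft RLIMIT_NOFILE toward `wanted`, capped at the hard limit.

    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    # Nothing to do if the soft limit is already unlimited or large enough.
    if soft == resource.RLIM_INFINITY or soft >= wanted:
        return soft
    target = wanted if hard == resource.RLIM_INFINITY else min(wanted, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    except (ValueError, OSError):
        # Some platforms refuse the new value; keep the current limit.
        pass
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


if __name__ == "__main__":
    budget = fds_needed(390)
    print("wanted:", budget, "soft limit now:", raise_nofile_limit(budget))
```

Under these assumptions the budget grows linearly with the shard count, so tens of thousands of inboxes would need a limit in the hundreds of thousands before any query runs.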
Now I'm thinking, MultiDatabase isn't the right way to go about
this...
Perhaps creating a new, all-encompassing Xapian index with a
reasonable shard count would be wise, at least for the normal
WWW frontend?
Managing removals of entire inboxes from an all-encompassing
Xapian DB would get much trickier.
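To illustrate why removal gets trickier, here is a toy in-memory model in Python. It is not Xapian code. It assumes the combined index keeps one document per message and tags it with every inbox it was posted to, which is only one possible design; `CombinedIndex` and its methods are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class CombinedIndex:
    """Toy stand-in for a single all-encompassing index (not a Xapian API)."""

    # message id -> set of inbox names the message was posted to
    inboxes_by_msg: dict = field(default_factory=dict)

    def add(self, msgid, inbox):
        """Record that `msgid` appears in `inbox`."""
        self.inboxes_by_msg.setdefault(msgid, set()).add(inbox)

    def remove_inbox(self, inbox):
        """Drop one inbox; return the message ids deleted outright."""
        deleted = []
        for msgid in list(self.inboxes_by_msg):
            tags = self.inboxes_by_msg[msgid]
            tags.discard(inbox)
            # Only delete the document once no inbox references it.
            if not tags:
                del self.inboxes_by_msg[msgid]
                deleted.append(msgid)
        return deleted


if __name__ == "__main__":
    idx = CombinedIndex()
    idx.add("a@example", "git")
    idx.add("a@example", "lkml")  # cross-posted message
    idx.add("b@example", "git")
    print(idx.remove_inbox("git"))
```

With per-inbox DBs, removal is just deleting a directory; in this model a cross-posted message has to survive until its last inbox goes away.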
IMAP search would still require per-mailbox indices, I think,
because UIDs are currently tied to NNTP article numbers.
Some attributes such as INTERNALDATE (Received: time) and
exact byte sizes would differ if the same message is
cross-posted to multiple public mailing lists, too.