SQL-like JOINs on separate DBs?

Eric Wong e at 80x24.org
Fri Feb 26 11:54:44 GMT 2021


Hi all, I'm dealing with two classes of Xapian DBs.

One class of Xapian DBs is large, well-established, and treated
as read-only for this application.

The other class is new and in development, for storing keywords
(with a "kw:" prefix).  It could potentially have as many
documents as the large DBs (or even all the large DBs combined
into one multi-shard monster), but few terms and no positional
data, just keywords (e.g. "seen", "flagged").

I want to be able to run queries like:

	something in a giant database AND kw:seen

	something in a giant database AND NOT kw:seen

Where "kw:" terms would only be stored in the small DB and keyed
against docs in the giant DB by a SHA-1 or SHA-256 hash, or even
by $PATHNAME:$DOCID.

It's entirely possible I missed something, but there's currently
no way in Xapian to combine DBs in a way similar to RDBMS JOINs,
is there?

What I may resort to doing is:

1) parse the query myself and split it into one query per class of DB
   (not sure if QueryParser can help here)

2) run separate queries and output to text files (in parallel, of course :)

3) run the standard join(1) command on resulting text files
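
A rough sketch of what 2)+3) could look like, assuming each query in
step 2 writes one join key per line to a text file.  The file names and
the sample keys are made up for illustration, and the keys are assumed
to contain no whitespace (join(1) splits fields on blanks by default).
The step 2 queries themselves are not shown.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the output of step 2:
# one join key per line (e.g. a SHA-1, or $PATHNAME:$DOCID).
tmp=$(mktemp -d)
printf '%s\n' key3 key1 key2 > "$tmp/large.txt"  # matches from the large DB
printf '%s\n' key2 key4 key1 > "$tmp/seen.txt"   # docs carrying kw:seen

# join(1) needs both inputs sorted on the join field, using the same
# collation as join itself, so pin the locale for every command.
LC_ALL=C sort -u "$tmp/large.txt" > "$tmp/large.sorted"
LC_ALL=C sort -u "$tmp/seen.txt" > "$tmp/seen.sorted"

# "... AND kw:seen": keys present in both files.
and_seen=$(LC_ALL=C join "$tmp/large.sorted" "$tmp/seen.sorted")

# "... AND NOT kw:seen": keys that appear only in the first file.
and_not_seen=$(LC_ALL=C join -v 1 "$tmp/large.sorted" "$tmp/seen.sorted")

printf 'AND kw:seen:\n%s\n' "$and_seen"
printf 'AND NOT kw:seen:\n%s\n' "$and_not_seen"
rm -rf "$tmp"
```

One cost of this approach is that sorting by key discards the ranking
from the large DB, so the order would have to be restored afterwards if
it matters.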

Or what could be done instead of 2)+3) is:

2a) run query on large DB

3a) iterate through results from large DB; include or exclude only
    the ones matching the requested kw: field.
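
The filtering in 3a) can be sketched the same way on text files, again
with made-up file names and keys.  Here the keyword keys are loaded into
an awk array and the large-DB results are streamed past it; the
filter_by_kw helper is hypothetical.  In practice the lookup would go
against the keyword DB rather than a file.

```shell
#!/bin/sh
# Hypothetical sample data: one key per line, large-DB results in rank order.
tmp=$(mktemp -d)
printf '%s\n' key3 key1 key2 > "$tmp/large.txt"  # results from the large DB
printf '%s\n' key2 key4 key1 > "$tmp/seen.txt"   # docs carrying kw:seen

# filter_by_kw WANT KWFILE RESULTS
# WANT=1 keeps results whose key is in KWFILE ("AND kw:...").
# WANT=0 keeps results whose key is not in KWFILE ("AND NOT kw:...").
filter_by_kw() {
    awk -v want="$1" -v kwfile="$2" '
        BEGIN {
            # Load every keyword key into an in-memory set.
            while ((getline line < kwfile) > 0)
                seen[line] = 1
            close(kwfile)
        }
        # "in" yields 1 or 0; print the line when it equals WANT.
        (($0 in seen) == (want + 0))
    ' "$3"
}

and_seen=$(filter_by_kw 1 "$tmp/seen.txt" "$tmp/large.txt")
and_not_seen=$(filter_by_kw 0 "$tmp/seen.txt" "$tmp/large.txt")

printf 'AND kw:seen:\n%s\n' "$and_seen"
printf 'AND NOT kw:seen:\n%s\n' "$and_not_seen"
rm -rf "$tmp"
```

Unlike the sorted join, this keeps the results in the order the large DB
returned them, at the cost of holding the whole keyword set in memory.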

Any other ideas?

I'm using Perl Search::Xapian from Debian stable (buster).

Thanks.
