[Xapian-discuss] Multiple Databases on the same port
Olly Betts
olly at survex.com
Sat Aug 1 12:48:08 BST 2009
On Fri, Jul 31, 2009 at 11:43:50AM -0400, Eddie Drapkin wrote:
> The suggestion to add a database ID or a site ID
> field in the databases doesn't seem too practical, as we have a
> type-ahead search and even that piece of overhead could turn out to be
> more costly than we'd expect, not to mention making the entire search
> process harder to manage.
Yes, doesn't seem a good approach if you care about performance.
> As far as running a xapian-tcpsrv instance
> per database, that's a management and security nightmare, not to
> mention putting a very real limit on how far we can scale out.
It's not ideal for a large number of databases, but it's the only way
to achieve this with the remote backend as things stand.
> It's not really practical to switch search engines right now, not that
> we're not happy with Xapian, but I do know that Sphinx has the
> capability of serving on a single port and having multiple "indexes."
> Is there another, alternative, way to set up our datasets so that
> they're uniquely searchable without being in separate databases or
> having to search based on DatabaseID all the time?
I don't see one. If you put them all in one database, you need a way
to filter out just the ones you want...
I guess you could serve the databases over NFS or similar to the search
boxes, and then just open them without the remote backend.
Another approach is to do the searching on the box with the database
directly and to send the results back using your own daemon and
protocol. There's a "queryserver" in SVN, but I think it is no longer
maintained, and it may not even build unmodified:
http://trac.xapian.org/browser/branches/1.0/xapian-applications/queryserver
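To give a rough idea of the "own daemon and protocol" approach, here is a minimal sketch in Python. It is not the queryserver code, and everything in it is made up for illustration: the line-per-request JSON format, the names search(), QueryHandler and query_once(), and the FAKE_INDEXES dictionary, which stands in for opening the named Xapian database locally and running the query against it. The point is only that one daemon on one port can pick the database per request.

```python
import json
import socket
import socketserver

# Hypothetical stand-in for the real per-database search. A real daemon
# would open the named Xapian database locally and run the query there.
FAKE_INDEXES = {
    "site_a": ["apple pie", "apple tart"],
    "site_b": ["banana bread"],
}


def search(db_name, query):
    """Return the documents in db_name that contain query (placeholder logic)."""
    docs = FAKE_INDEXES.get(db_name)
    if docs is None:
        raise KeyError(db_name)
    return [d for d in docs if query in d]


class QueryHandler(socketserver.StreamRequestHandler):
    """Serve one JSON request per line: {"db": <name>, "q": <query>}."""

    def handle(self):
        for line in self.rfile:
            try:
                req = json.loads(line)
                reply = {"ok": True, "results": search(req["db"], req["q"])}
            except (ValueError, KeyError, TypeError) as e:
                # Malformed JSON, missing fields, or an unknown database.
                reply = {"ok": False, "error": repr(e)}
            self.wfile.write(json.dumps(reply).encode("utf-8") + b"\n")


class Server(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def query_once(host, port, db_name, query):
    """Client side: send one request and return the decoded reply."""
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("rwb") as f:
            request = {"db": db_name, "q": query}
            f.write(json.dumps(request).encode("utf-8") + b"\n")
            f.flush()
            return json.loads(f.readline())
```

Running Server(("0.0.0.0", port), QueryHandler).serve_forever() on the database box and calling query_once() from the search boxes would be enough to try it out. Since the database name travels in the same message as the query, selecting a database costs no extra round trip.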
If that's not a suitable approach, I think your best option is probably
to patch the remote backend to support this. It shouldn't require deep
knowledge of Xapian's code, and I'm happy to point out where you need to
poke.
If you want us to incorporate the patch, try to minimise the number of
exchanges required in the protocol - it's not really an issue on a fast
LAN, but on a higher latency link adding more message exchanges can
measurably slow down searches, so we've striven to avoid doing so. Or
have two different modes (single- and multiple-database) and require the
client and server to be using the same one...
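One way to avoid an extra exchange is to carry the database name inside a message the client already sends, rather than adding a separate "select database" request and reply. The sketch below shows the idea with an invented framing (4-byte length, 1-byte name length, name, then the payload). This is NOT the remote backend's actual wire format, and encode_request() and decode_request() are hypothetical names.

```python
import struct


def encode_request(db_name, payload):
    """Pack the database name and the query payload into a single frame."""
    name = db_name.encode("utf-8")
    if len(name) > 255:
        raise ValueError("database name too long for a 1-byte length")
    body = bytes([len(name)]) + name + payload
    # 4-byte big-endian length prefix, then the body.
    return struct.pack("!I", len(body)) + body


def decode_request(frame):
    """Inverse of encode_request: return (db_name, payload)."""
    if len(frame) < 4:
        raise ValueError("truncated frame")
    (length,) = struct.unpack("!I", frame[:4])
    body = frame[4:4 + length]
    if length < 1 or len(body) != length:
        raise ValueError("truncated frame")
    name_len = body[0]
    if 1 + name_len > length:
        raise ValueError("name length exceeds frame")
    return body[1:1 + name_len].decode("utf-8"), body[1 + name_len:]
```

With something along these lines the server learns which database to use from the first message it receives, so the number of exchanges per search stays the same as in the single-database case.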
Cheers,
Olly