[Xapian-discuss] Re: Remote databases and daemons
Philip Neustrom
philipn at gmail.com
Mon Mar 27 17:57:45 BST 2006
Michael Pelletier: Your code will work as a Xapian search server
that's searching a local Xapian database. This is fine, and I can
easily do something like this (along with indexing), but here's where
I'm stuck:
How do I combine the results from the different servers into a unified
result that makes sense? My guess is to do something like this: have
each server return an MSet to the client, have the client merge the
MSets (sorted by the returned search rank for each match) into a master
MSet, and cut it off at the number of desired results (say 11). The
problem I run into with this approach is that it would be hard to make
it work with non-zero values for the "first" doccount argument of
enquire's get_mset(). For example, the "next ten results" would be hard
to find, because results of differing relevance could arrive from
different databases.
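One way the merge-then-paginate idea could be sketched in plain Python is
below. This is only an illustration, not Xapian API: merged_page and the
fetch callables are hypothetical names, and each hit is assumed to be a
(weight, server, docid) tuple already sorted by descending weight. It
also assumes the weights from different servers are directly comparable,
which may not hold if each server scores against only its own database
statistics. The trick for a non-zero "first" is to ask every server for
its top first + maxitems hits starting at 0, merge, and slice afterwards.

```python
import heapq
from itertools import islice


def merged_page(fetch_fns, first, maxitems):
    """Return hits first .. first+maxitems-1 of the merged ranking.

    fetch_fns: hypothetical callables, one per server, each taking
    (first, maxitems) and returning a list of (weight, server, docid)
    tuples sorted by descending weight.
    """
    needed = first + maxitems
    # Any hit in the global top `needed` must also be in its own
    # server's top `needed`, so fetch that many from each, from 0.
    per_server = [fetch(0, needed) for fetch in fetch_fns]
    # Merge the already-sorted lists by descending weight.
    merged = heapq.merge(*per_server, key=lambda hit: hit[0], reverse=True)
    # Skip the first `first` hits and keep the requested page.
    return list(islice(merged, first, needed))


def make_fetch(hits):
    """Stand-in for a remote server: serves slices of a fixed list."""
    return lambda first, maxitems: hits[first:first + maxitems]


if __name__ == "__main__":
    server_a = [(0.9, "a", 1), (0.5, "a", 2), (0.1, "a", 3)]
    server_b = [(0.8, "b", 7), (0.6, "b", 8), (0.2, "b", 9)]
    fetchers = [make_fetch(server_a), make_fetch(server_b)]
    print(merged_page(fetchers, 2, 2))
```

The cost is that deeper pages get more expensive, since every server has
to return first + maxitems hits rather than just maxitems.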
The reason I want to do this is so that I can spread my database
across multiple machines without any issues.
Any ideas?
--Philip Neustrom
On 3/26/06, Philip Neustrom <philipn at gmail.com> wrote:
> I've looked over the docs on remote backends, the protocol, and a bit
> of the C++ for doing distributed and remote searches. I've got a
> couple of questions:
>
> * The remote protocol is usable only as a Database, not as a
> WriteableDatabase -- is this correct? So, if I don't want my
> application to have a copy of the database on the same machine I'll
> need to write an indexer daemon on the remote machine and talk to it
> over TCP if I want to be able to remotely index?
>
> * socketserver.cc and the corresponding xapian-tcpsrv look like
> they block, even for reads. As far as I know, Xapian currently
> supports "single-write, multiple-reads" of the database, which means
> the tcpserver could be doing more. Am I mistaken in thinking that a
> read will block another read with the tcp server?
>
> I'm building an application that I'd love to have near-real-time
> indexing, e.g. when a user saves a document it's sent to xapian.
> That's how it works now, but it's on such a small scale that issues
> like this don't matter. What's the easiest way to make this work?
>
> Here's what I'm thinking: Write a small xapian-daemon server in
> Python that listens on TCP and can index and search. Because Xapian
> can only do one write at a time (last I checked?), the server will keep
> a queue of index requests and apply them in order on a separate thread
> to avoid blocking. Is this something that's useful or is there a more
> xapiantic way to do this?
>
> --Philip Neustrom
>
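The queued-writer idea in the quoted message could look roughly like the
sketch below. It is a minimal illustration under assumptions, not a
tested daemon: SingleWriter and apply_fn are hypothetical names, and
apply_fn stands in for whatever actually writes a document to the
database. Callers enqueue requests and return immediately, while one
worker thread applies them in arrival order, so there is only ever one
writer.

```python
import queue
import threading

# Sentinel object that tells the worker thread to stop.
_STOP = object()


class SingleWriter:
    """Serialize index requests through one background thread."""

    def __init__(self, apply_fn):
        # apply_fn is a hypothetical callable that performs the
        # actual write for a single request.
        self._apply = apply_fn
        self._requests = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, request):
        """Queue a request without waiting for it to be applied."""
        self._requests.put(request)

    def close(self):
        """Apply everything queued so far, then stop the worker."""
        self._requests.put(_STOP)
        self._thread.join()

    def _run(self):
        # Requests come off the queue in FIFO order, one at a time.
        while True:
            request = self._requests.get()
            if request is _STOP:
                break
            self._apply(request)


if __name__ == "__main__":
    applied = []
    writer = SingleWriter(applied.append)
    for doc in ["doc1", "doc2", "doc3"]:
        writer.submit(doc)
    writer.close()
    print(applied)
```

The sketch leaves out error handling (an exception in apply_fn would end
the worker thread) and the TCP listener that would feed submit().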