[Xapian-discuss] "Rolling" Database

Richard Heycock rgh at roughage.com.au
Wed Dec 2 02:11:21 GMT 2009


I need a database in which the last three months' documents are
searchable; anything older can be archived. So I'm thinking about
implementing a "rolling" database where I have one database per month and
combine them into one. The latest database would be writable and the
previous two would be read-only. When the month ends I would close all the
existing databases and reopen them to include the new month.

For example, December would look like this:

    200912 -> writable
    200911 -> read-only
    200910 -> read-only

And January like this:

    201001 -> writable
    200912 -> read-only
    200911 -> read-only
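
To make the rollover concrete, here is a rough sketch in plain Ruby of how
the set of monthly database names could be derived from the current date.
The helper names (rolling_db_names, rolling_layout) and the YYYYMM naming
are only illustrative assumptions, and no Xapian calls are made here; the
returned names would still have to be opened and combined separately.

```ruby
require 'date'

# Hypothetical helper: names of the monthly databases in the search window
# that ends at `today`, newest first. `months` is the window length.
def rolling_db_names(today, months = 3)
  first_of_month = Date.new(today.year, today.month, 1)
  # Date#<< steps back by whole months, so year boundaries are handled.
  (0...months).map { |i| (first_of_month << i).strftime('%Y%m') }
end

# Hypothetical helper: split the window into the single writable database
# (the current month) and the read-only ones (the earlier months).
def rolling_layout(today, months = 3)
  writable, *read_only = rolling_db_names(today, months)
  { writable: writable, read_only: read_only }
end

rolling_db_names(Date.new(2009, 12, 2))  # => ["200912", "200911", "200910"]
rolling_layout(Date.new(2010, 1, 15))
# => {:writable=>"201001", :read_only=>["200912", "200911"]}
```

Comparing the result for "today" with the set currently open would tell me
when a month has ended and the databases need to be reopened.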


I'm hoping to use this scheme for a number of reasons: the latest database
is read *significantly* more often than any of the earlier databases; I
want to be able to manage the ever-growing size of the database; and I
want to be able to compact the read-only databases.

I'm using the Ruby bindings and I've got a couple of questions.

1) Is it possible to close a database? I can flush the database and set
   the database object to nil but I can't force the database destructor
   to be called even if I run the garbage collector.

2) Is there a good way of calculating an optimal size for a Xapian
   database? For example, I will be getting about 3 million documents
   a month; should I be rolling every month, every two months, etc.?

3) Is there a better way to do this?

rgh


