[Xapian-discuss] Xapian -need help!

Olly Betts olly at survex.com
Wed Apr 27 21:15:09 BST 2005


On Wed, Apr 27, 2005 at 03:13:37PM -0400, Rita wrote:
> 1: When I'm trying to index , why it creates/opens so many files,
> e.g.this is how one of my index looks like:
> % ~/xapian_index> ls
> db_lock          position_baseA   record_DB        termlist_baseA
> meta             postlist_DB      record_baseA     value_DB
> position_DB      postlist_baseA   termlist_DB      value_baseA
> 
> why xapian needs to create so many files?

I'm not sure I understand the motivation behind the question, which
makes it a bit hard to provide a satisfactory answer.

There are 5 tables, each of which has a different purpose.  The 5 _DB
files and 5 _base files could potentially all be merged into one _DB
and one _base, but there doesn't seem to be any particular benefit to
doing so.  The only issue I'm aware of is that you do need 5 file
descriptors per open database, but nobody has ever complained about
that.

On the other hand, not doing so provides extra flexibility sometimes.
For example, you can put the _DB files on different partitions - the
record table is nowhere near as speed critical as the postlist table
so you could put it on a slower disk.
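
To make the layout concrete, here is a small Python sketch that groups the
file names from the listing quoted above into their tables. It is only an
illustration: the rule for spotting table files ("_DB", or "_base" plus
one letter) is an assumption read off that listing, and group_by_table is
a made-up helper name, not part of Xapian.

```python
from collections import defaultdict

# File names copied from the directory listing quoted above.
LISTING = [
    "db_lock", "position_baseA", "record_DB", "termlist_baseA",
    "meta", "postlist_DB", "record_baseA", "value_DB",
    "position_DB", "postlist_baseA", "termlist_DB", "value_baseA",
]


def group_by_table(names):
    """Split file names into per-table groups plus the leftover files."""
    tables = defaultdict(list)
    other = []
    for name in names:
        stem, sep, suffix = name.rpartition("_")
        # Assumption: a table file ends in "_DB" or "_base" + one letter.
        is_db = suffix == "DB"
        is_base = suffix.startswith("base") and len(suffix) == 5
        if sep and (is_db or is_base):
            tables[stem].append(name)
        else:
            other.append(name)
    grouped = {table: sorted(files) for table, files in tables.items()}
    return grouped, sorted(other)


tables, other = group_by_table(LISTING)

for table in sorted(tables):
    print(table, "->", ", ".join(tables[table]))
print("not part of any table:", ", ".join(other))
```

Run against that listing, this gives five groups (position, postlist,
record, termlist, value) of two files each, with db_lock and meta left
over.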

Really, I'd suggest you just treat the xapian_index directory as an
opaque object whose contents you don't need to worry about.

> 2: Also, from the documentation it's not clear to me how xapian
> indexes, what database it uses? Does it uses sleepycat to index the
> documents?

It uses Btrees stored in its own format (we did experiment with using
sleepycat in the early days, but it wasn't really suitable).  Look at
the quartz section in the "internals" documentation if you want to know
more, but again it's not something you need to worry about unless you
actually want to modify the Xapian code.

Cheers,
    Olly
