[Xapian-discuss] Are these numbers reasonable?

Olly Betts olly at survex.com
Mon Feb 5 03:53:28 GMT 2007


On Fri, Jan 19, 2007 at 03:48:06AM -0800, Rafael SDM Sierra wrote:
> I have only one box[1] running 3 sub-systems[2] on my system, are these
> numbers reasonable[3]??
> 
> [1] - From dmesg (FreeBSD 6.1-RELEASE):
> AMD Sempron(tm) Processor 3000+ (1808.33-MHz K8-class CPU)
> real memory  = 2080309248 (1983 MB)
> avail memory = 1997869056 (1905 MB)
> ad0: 76350MB <SAMSUNG SP0802N TK200-04> at ata0-master UDMA33

Probably not a particularly fast disk - SATA is UDMA133, though I suspect
that by that point the bus bandwidth to the drive is no longer the
bottleneck.

> [2] The sub-systems are:
> 1 - A server giving addresses of documents to be indexed
> 2 - A server receiving these documents and replacing (I don't add, just
> replace)

This could be an issue.  Sequential insertions (or appends) into the
underlying B-trees are optimised specially.  For flint this means that
adding or replacing an ascending sequence of adjacent document ids (or
actually an ascending sequence where any doc ids skipped over don't
already exist) is faster and produces a smaller database.

So depending on the pattern of the document ids you specify, calling
replace could be significantly slower than calling add would be.
There's probably scope for improving this case (I know there's scope for
further optimising appending a sequence), but it would be interesting
to know why you want to set the document ids and what the pattern (if
any) is.
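
To make the condition above concrete, here is a minimal Python sketch of
the "ascending sequence where any doc ids skipped over don't already
exist" rule.  It is not Xapian code and says nothing about how flint
implements the check internally; the function name is_sequential_pattern
and its two arguments are hypothetical, chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def is_sequential_pattern(docids, existing):
    """Return True if docids matches the pattern described above.

    docids   -- document ids in the order they would be added/replaced
    existing -- document ids already present in the database

    The pattern holds when docids is strictly ascending and no existing
    id lies strictly between two consecutive ids in docids.
    """
    present = sorted(set(existing))
    for prev, cur in zip(docids, docids[1:]):
        if cur <= prev:
            # Not strictly ascending.
            return False
        # Existing ids strictly between prev and cur were skipped over.
        first_after_prev = bisect_right(present, prev)
        first_at_or_after_cur = bisect_left(present, cur)
        if first_at_or_after_cur > first_after_prev:
            return False
    return True
```

For example, replacing ids 5, 7, 9 fits the pattern if 6 and 8 don't
exist in the database, but not if 6 is already present.  Running a check
like this over a log of the ids you pass to replace would show whether
your workload is likely to hit the optimised path.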

Cheers,
    Olly


