[Xapian-discuss] Weekly replacement of documents.

David dmorris at sirca.org.au
Tue Jun 26 06:23:13 BST 2007


Hi people!

I'm writing an index to a collection of "web" pages. There are about 150k of
them; some change twice a year, others several times per second.

Not much data will go into each Xapian::Document, either via set_data() or as
terms.

But every week we get new data, and most documents will have to be replaced
with Xapian::WritableDatabase::replace_document(). What effect would this have
on the database?
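
For context, a minimal sketch of how a stable unique ID term per page could be
derived, assuming the replace_document() overload that takes a unique term is
used for the weekly update. The helper name unique_term_for, the "Q" prefix and
the 240-byte cap are illustrative assumptions, not taken from this post; the
cap is only meant to stay under the backend's term length limit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: a conservative cap on term length, chosen for illustration.
const std::size_t kMaxTermLen = 240;

// 64-bit FNV-1a hash, used only to keep truncated terms distinct.
std::uint64_t fnv1a64(const std::string& s) {
    std::uint64_t h = 14695981039346656037ULL;
    for (unsigned char c : s) {
        h ^= c;
        h *= 1099511628211ULL;
    }
    return h;
}

// Hypothetical helper: map a page URL to a unique ID term.
// Short URLs are used verbatim; long ones are truncated and
// suffixed with a 16-digit hex hash of the full URL.
std::string unique_term_for(const std::string& url) {
    const std::string prefix = "Q";
    std::string term;
    if (prefix.size() + url.size() <= kMaxTermLen) {
        term = prefix + url;
    } else {
        char hex[17];
        std::snprintf(hex, sizeof hex, "%016llx",
                      static_cast<unsigned long long>(fnv1a64(url)));
        const std::size_t keep = kMaxTermLen - prefix.size() - 16;
        term = prefix + url.substr(0, keep) + hex;
    }
    assert(term.size() <= kMaxTermLen);
    return term;
}
```

With such a term, the weekly pass would add the term to each new
Xapian::Document and call db.replace_document(unique_term_for(url), doc), which
replaces the document indexed by that term or adds a new one if none exists.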

Since the majority of the database will, in effect, be "replaced" on a weekly
basis, how does the database re-organize itself? Would I have to do some sort of
compacting?

And here's some praise: I have found it incredibly easy to get some indexing and
searching happening. I had a proof-of-concept up and running (and in production)
in about 2 days.

Now I just have to go and do it properly :)



