[Xapian-discuss] making my db leaner and meaner
Ben Campbell
ben at scumways.com
Thu Mar 26 16:30:09 GMT 2009
I'm trying to shrink my Xapian database in an effort to reduce load on
the poor server. (I think the database has crept up in size to the
point where the machine is struggling with it a bit.)
My indexing is pretty naive, and I've learnt a lot since I first began.
I suspect there is a lot of fat that could be trimmed...
Here are the improvements I'm planning:
- use a stopword list
I expect this to be a pretty big win, but I'm not yet sure how to pick a
good set of stopwords (I've posted separately asking about this).
- reduce the number of values I use.
Currently I'm using six values, most of which only store things I want
to display in my search results. I'll move those into a serialised form
in the document data (which is currently unused).
I only ever sort using one value (a datetime), so I'll ditch the other five.
- look at running xapian-compact from time to time
I add about 2000 documents per day (and almost never remove documents).
Not sure how much this would help, but you never know, and it's easy to
try it out.
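To make the second item concrete, here is a rough sketch of what "a serialised form in the document data" could look like, plus a sortable encoding for the one datetime value that stays. It is plain Python and does not touch Xapian itself. The field names in DISPLAY_FIELDS and the helper names are made up for illustration (the post doesn't say what the five display values hold), and JSON is just one possible serialisation.

```python
import json
from datetime import datetime, timezone

# Hypothetical field names: assumptions for illustration only.
DISPLAY_FIELDS = ("title", "url", "author", "source", "summary")


def pack_display_data(fields):
    """Serialise the display-only fields into one compact string.

    Keys not listed in DISPLAY_FIELDS are dropped, so nothing else
    leaks into the stored data.
    """
    kept = {k: fields[k] for k in DISPLAY_FIELDS if k in fields}
    return json.dumps(kept, separators=(",", ":"), sort_keys=True)


def unpack_display_data(data):
    """Inverse of pack_display_data: turn the stored string back into a dict."""
    return json.loads(data)


def sortable_datetime(dt):
    """Encode a timezone-aware datetime as a fixed-width UTC string.

    Every field is zero-padded, so plain string comparison orders the
    encoded values chronologically.
    """
    return dt.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
```

With the Xapian bindings, the packed string would presumably be passed to Document.set_data() and the encoded datetime to Document.add_value() on the single remaining slot, then read back with get_data() when rendering results.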
Does this all sound sane? Anything obvious I've missed?
I was toying with the idea of ditching the positional information on
terms, but that would prevent me doing queries like "a walk in the
park", right?
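A toy illustration of why I think that's the case (this is not how Xapian stores things internally, just a minimal positional index in plain Python with made-up helper names): without positions, all you can ask is whether every term occurs in the document; with positions, you can check that the terms are adjacent and in order.

```python
from collections import defaultdict


def build_positions(text):
    """Map each lowercased term to the list of word positions where it occurs."""
    positions = defaultdict(list)
    for pos, term in enumerate(text.lower().split()):
        positions[term].append(pos)
    return dict(positions)


def all_terms_present(positions, phrase):
    """The best test available without positions: every term occurs somewhere."""
    return all(term in positions for term in phrase.lower().split())


def phrase_match(positions, phrase):
    """True only if the phrase's terms occur consecutively and in order."""
    terms = phrase.lower().split()
    if not terms:
        return False
    # Try each occurrence of the first term as a possible phrase start.
    for start in positions.get(terms[0], []):
        if all(start + i in positions.get(term, [])
               for i, term in enumerate(terms)):
            return True
    return False


scrambled = build_positions("in the park a walk")
in_order = build_positions("we took a walk in the park")
query = "a walk in the park"

# Both documents contain every term, but only one contains the phrase.
print(all_terms_present(scrambled, query), phrase_match(scrambled, query))
print(all_terms_present(in_order, query), phrase_match(in_order, query))
```

So dropping positions would leave only the first kind of test, and the two documents above would become indistinguishable for that query.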
Any other ideas welcome :-)
Thanks,
Ben.