[Xapian-discuss] xapian's cache
Andrey
alpha04 at netvigator.com
Fri Nov 23 22:25:31 GMT 2007
Hi
Regarding the "warming-up" of Xapian over the first few queries: at which
layer is the data cached?
Xapian itself / the Xapian bindings / filesystem I/O?
I don't know if that is the right question to ask. Say I have 2 machines:
Machine A) Write head of Xapian, writes to local HD (continuous writing, 24 hrs)
[Python]
Machine B) Read head, network mount of A's Xapian DB folder [PHP]
If I want faster searches, which machine's amount of RAM matters most?
What happens to the cache when the DB is flushed? Is the cache in memory lost,
or is it built up incrementally?
If I use Python to search and warm up the cache, does that benefit PHP searches?
Note that the DB is flushed every 10,000 docs (about every 5 mins). Would search
performance be better if it were separated into 2 DBs, searched together like
this? Would the cache of db1 stay and still be of benefit?
db1 < very large
db2 < only today's documents, flushed every 5 mins / 10,000 docs
One more question, on Enquire.sort_by_value(): does it use string comparison
only? I ask because it is relatively slow compared to sort_by_docid() (my values
are all numeric timestamps).
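To illustrate why the comparison method matters for numeric timestamps, here is a minimal sketch in plain Python (no Xapian calls; the sample timestamps and the padding width are made up for illustration). If values are compared as raw strings, numbers of different lengths sort in the wrong order unless they are stored at a fixed width:

```python
# Hypothetical timestamps (seconds since the epoch); not taken from real data.
timestamps = [999999999, 1195856731, 86400]

# Stored as plain decimal strings, values compare character by character.
as_strings = sorted(str(t) for t in timestamps)

# Zero-padding to a fixed width makes string order match numeric order.
WIDTH = 10  # assumed width: enough digits for 32-bit Unix timestamps
as_padded = sorted(str(t).zfill(WIDTH) for t in timestamps)

numeric_order = sorted(timestamps)

print(as_strings)  # ['1195856731', '86400', '999999999']
print(as_padded)   # ['0000086400', '0999999999', '1195856731']
print([int(s) for s in as_padded] == numeric_order)  # True
```

This only shows the ordering issue; whether it explains the speed difference is a separate question.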
Also, when I use set_collapse_key( MD5(title+domain) ) to remove
duplicate titles under a domain, I find it rather expensive in relative terms:
30M documents with collapse_key: 2-9+ secs
30M documents without collapse_key: 0.01 - 0.9 secs
(my keys are currently 32-byte strings)
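For reference, a minimal sketch of how such a key can be built with Python's hashlib (the function names and the sample title/domain are hypothetical). The 32-byte form is the hex digest; the same MD5 hash as a raw digest is 16 bytes. Whether a shorter key actually makes collapsing cheaper is an assumption I have not tested:

```python
import hashlib


def collapse_key_hex(title, domain):
    """MD5 of title+domain as a 32-character hex string."""
    return hashlib.md5((title + domain).encode("utf-8")).hexdigest()


def collapse_key_raw(title, domain):
    """The same MD5 hash kept as 16 raw bytes, half the size."""
    return hashlib.md5((title + domain).encode("utf-8")).digest()


# Made-up sample values, for illustration only.
title, domain = "Example title", "example.com"
hex_key = collapse_key_hex(title, domain)
raw_key = collapse_key_raw(title, domain)

print(len(hex_key), len(raw_key))  # 32 16
```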
I will keep testing it after tuning the cache part and as the database grows.
Big Thanks
Andrey