[Xapian-devel] FASTER Search

林德 leedeetiger at gmail.com
Thu Jan 17 05:50:25 GMT 2013


I am suffering from slow search performance with Xapian.

I am using Xapian to index about 150,000,000 documents.
My application is implemented in C++.

Search performance is not that fast.
For example, a query containing about 20 terms takes 2 seconds on average.

For searching, I follow these steps:

   1. Construct a QueryParser for the query string.
   2. Parse the string to get a Xapian::Query.
   3. Construct an Enquire and run the search by calling its get_mset method.

Here is the profile of time spent per function while searching:

samples  %        symbol name
75649    28.0401  ChertPostList::move_forward_in_chunk_to_at_least(unsigned int)
30118    11.1635  Xapian::BM25Weight::get_sumpart(unsigned int, unsigned int) const
21291     7.8917  AndMaybePostList::process_next_or_skip_to(double, Xapian::PostingIterator::Internal*)
17803     6.5989  OrPostList::next(double)
12481     4.6262  AndMaybePostList::get_weight() const
10729     3.9768  OrPostList::get_weight() const
10096     3.7422  AndMaybePostList::next(double)
8743      3.2407  ChertDatabase::get_doclength(unsigned int) const
7527      2.7900  LeafPostList::get_weight() const
7504      2.7814  ChertPostListTable::get_doclength(unsigned int, Xapian::Internal::RefCntPtr<ChertDatabase const>) const
5402      2.0023  ChertPostList::jump_to(unsigned int)
4518      1.6746  ChertPostList::skip_to(unsigned int, double)
4341      1.6090  ChertPostList::next_in_chunk()
4207      1.5594  ChertPostList::get_docid() const
4065      1.5067  ChertPostList::at_end() const
3988      1.4782  AndMaybePostList::at_end() const
3899      1.4452  OrPostList::get_docid() const
3655      1.3548  MultiMatch::get_mset(unsigned int, unsigned int, unsigned int, Xapian::MSet&, Xapian::Weight::Internal const&, Xapian::MatchDecider const*, Xapian::MatchDecider const*, Xapian::KeyMaker const*)
3172      1.1757  OrPostList::at_end() const
3061      1.1346  ChertPostList::get_wdf() const

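The largest share of the time is spent in Chert posting-list operations;
ChertPostList::move_forward_in_chunk_to_at_least alone accounts for about 28% of samples.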

Could I split the index into several separate databases to get faster searching?

Would compacting the database help?

How can I reduce the time spent in Chert posting-list operations?



Thanks!!

De Lin