[Xapian-discuss] xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING) so slow!
makao009
makao009 at 126.com
Wed Aug 10 09:44:58 BST 2011
I have 300 million records, and my search script is shown below. I want the newest 10 results that match my query, so I use a boolean search with "enquire.set_docid_order(enquire.DESCENDING)", but this seems a little slow. When I remove "enquire.set_docid_order(enquire.DESCENDING)", it runs much faster.
How can I fetch the newest 10 results as fast as possible?
search.py
# -*- coding: utf-8 -*-
import xapian
import sys, time

t1 = time.time()
db_path = sys.argv[1]
terms = sys.argv[2:]
try:
    database = xapian.Database(db_path)
    terms = ' '.join(terms)
    qp = xapian.QueryParser()
    qp.set_database(database)
    # AND all terms together (the parser's default is OP_OR)
    qp.set_default_op(xapian.Query.OP_AND)
    query = qp.parse_query(terms)
    enquire = xapian.Enquire(database)
    # Pure boolean match: no relevance weighting
    enquire.set_weighting_scheme(xapian.BoolWeight())
    enquire.set_query(query)
    # Highest docid (newest record) first
    enquire.set_docid_order(enquire.DESCENDING)
    matches = enquire.get_mset(0, 10)
    print "%i results found." % matches.get_matches_estimated()
    print "Results 1-%i:" % matches.size()
    for m in matches:
        print "rank=%-4d docid=%-8i" % (m.rank + 1, m.docid),
        print " value:", xapian.sortable_unserialise(m.document.get_value(0))
except Exception, e:
    print "Exception: %s" % str(e)
print 'cost %.3f seconds' % (time.time() - t1)
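
For illustration only, here is a toy model (plain Python, not Xapian code) of one possible reason for the difference. It assumes that matching docids are read from a stream in ascending order and that a boolean match in ascending order can stop as soon as it has 10 hits, while descending order has to read every match before it knows the 10 largest docids. The function names and the 100,000-entry posting list are hypothetical.

```python
import heapq
from itertools import islice


def matching_docids(postings, counter):
    """Yield docids in ascending order, counting how many are read."""
    for docid in postings:
        counter[0] += 1
        yield docid


def first_n_ascending(postings, n, counter):
    # Ascending order: the first n matches are the answer, so stop early.
    return list(islice(matching_docids(postings, counter), n))


def last_n_descending(postings, n, counter):
    # Descending order over an ascending stream: every match must be
    # read before the n largest docids are known.
    return heapq.nlargest(n, matching_docids(postings, counter))


# Hypothetical posting list: 100,000 matching docids, ascending.
postings = range(1, 100001)

asc_reads = [0]
asc = first_n_ascending(postings, 10, asc_reads)

desc_reads = [0]
desc = last_n_descending(postings, 10, desc_reads)
```

In this model the ascending case reads only as many docids as it returns, while the descending case reads the whole list. Whether Xapian behaves this way for a given database and query is an assumption here, not something measured.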