[Xapian-discuss] xapian enquire.set_docid_order(Xapian::Enquire::DESCENDING) so slow!

makao009 makao009 at 126.com
Wed Aug 10 09:44:58 BST 2011


I have 300 million records, and my search script is shown below. I want the newest 10 results that match my query, so I use a boolean search with "enquire.set_docid_order(enquire.DESCENDING)", but this seems rather slow. When I remove the "enquire.set_docid_order(enquire.DESCENDING)" call, it runs much faster.
How can I fetch the newest 10 results as fast as possible?


search.py 
# -*- coding: utf-8 -*-
import sys
import time

import xapian

t1 = time.time()

db_path = sys.argv[1]
terms = ' '.join(sys.argv[2:])
try:
  database = xapian.Database(db_path)
  qp = xapian.QueryParser()
  qp.set_database(database)
  qp.set_default_op(xapian.Query.OP_AND)  # combine terms with AND (the default is OP_OR)
  query = qp.parse_query(terms)

  enquire = xapian.Enquire(database)
  enquire.set_weighting_scheme(xapian.BoolWeight())  # pure boolean match, no ranking
  enquire.set_query(query)

  # Highest docids first; the newest records are assumed to have the highest docids.
  enquire.set_docid_order(enquire.DESCENDING)
  matches = enquire.get_mset(0, 10)
  print("%i results found." % matches.get_matches_estimated())
  print("Results 1-%i:" % matches.size())

  for m in matches:
    value = xapian.sortable_unserialise(m.document.get_value(0))
    print("rank=%-4d docid=%-8i value: %s" % (m.rank + 1, m.docid, value))

except Exception as e:
  print("Exception: %s" % str(e))

print('cost %.3f seconds' % (time.time() - t1))
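
To put numbers on the difference described above, a small timing helper along these lines might be useful. It is only a sketch: `time_docid_orders` and its parameters are hypothetical names and not part of Xapian. The only Xapian calls it relies on are `set_docid_order()` and `get_mset()`, which the script above already uses.

```python
import time


def time_docid_orders(enquire, orders, first=0, maxitems=10):
    """Time enquire.get_mset() once for each docid order.

    `enquire` is assumed to behave like xapian.Enquire, i.e. to provide
    set_docid_order() and get_mset(). `orders` maps a label to the value
    passed to set_docid_order(). Returns {label: (seconds, mset)}.
    """
    results = {}
    for label, order in orders.items():
        enquire.set_docid_order(order)
        start = time.time()
        mset = enquire.get_mset(first, maxitems)
        results[label] = (time.time() - start, mset)
    return results
```

In search.py this could be called after `enquire.set_query(query)`, for example as `time_docid_orders(enquire, {"ascending": xapian.Enquire.ASCENDING, "descending": xapian.Enquire.DESCENDING})`. Timings from a single run can be skewed by disk caching, so it is probably worth repeating the measurement a few times.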

