[Xapian-discuss] Fwd: Re: what is the fastest way to fetch results which are sorted by timestamp ?

Tim Brody tdb2 at ecs.soton.ac.uk
Thu Aug 11 08:49:00 BST 2011


(Forwarded off-list message)

-------- Original Message --------
Subject: Re: [Xapian-discuss] what is the fastest way to fetch results
which are sorted by timestamp ?
Date: Thu, 11 Aug 2011 01:06:36 +0800
From: 潘俊勇 <panjunyong at gmail.com>
To: Tim Brody <tdb2 at ecs.soton.ac.uk>

On Wed, Aug 10, 2011 at 6:39 PM, Tim Brody <tdb2 at ecs.soton.ac.uk> wrote:

> Hi,
>
> In terms of the enquiry, do you mean this?:
> set_weighting_scheme(Xapian::BoolWeight());
> set_docid_order(Xapian::Enquire::DESCENDING);
>
>
In my test, that is more than 10 times slower than:

set_weighting_scheme(Xapian::BoolWeight());
set_docid_order(Xapian::Enquire::ASCENDING);

Why?

> What's the most efficient process to build multiple Xapian indexes? Can
> the "relevance" index provide any hints to building the sorted indexes?
>
> Cheers,
> Tim.
>
> On Tue, 2011-08-09 at 18:04 +0100, Richard Boulton wrote:
> > On 9 August 2011 17:48, makao009 <makao009 at 126.com> wrote:
> > > what is the fastest way to fetch results which are sorted by
> > > timestamp?
> >
> > The fastest possible way is to have your index sorted by timestamp
> > (ie, such that document IDs increase as the timestamp increases).
> > That way, the search can stop as soon as sufficient matches have been
> > found.  It can be very awkward to get an index in such order though,
> > particularly in the face of updates, assuming that you want the sort
> > order to show most recent first.
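To illustrate why a timestamp-ordered index lets the search stop early, here is a minimal pure-Python sketch. It is a model of the idea only, not Xapian code: the function names, the `postings` list and the `timestamps` dict are all hypothetical, and it assumes docids were assigned in increasing timestamp order.

```python
import heapq

def newest_from_sorted_index(postings, limit):
    """postings: ascending docids matching a boolean term.
    Assumes docid order equals timestamp order, so walking the list
    from the high end yields the newest matches first."""
    results = []
    examined = 0
    for docid in reversed(postings):
        examined += 1
        results.append(docid)
        if len(results) == limit:
            break  # enough matches found, the rest are never touched
    return results, examined

def newest_by_value(postings, timestamps, limit):
    """Without a sorted index, any match could be the newest,
    so every match has to be examined before ranking."""
    examined = len(postings)
    results = heapq.nlargest(limit, postings, key=lambda d: timestamps[d])
    return results, examined

# Hypothetical data: 1000 documents, docid order == timestamp order.
timestamps = {docid: 1000 + docid for docid in range(1, 1001)}
postings = [docid for docid in range(1, 1001) if docid % 3 == 0]

print(newest_from_sorted_index(postings, 5))
print(newest_by_value(postings, timestamps, 5))
```

Both functions return the same five docids; the difference is in how many matches each has to look at (5 versus all 333 in this made-up data set).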
> >
> > > I want to use Xapian as my search engine: use
> > > add_boolean_term(something) and
> > > add_value(0,sortable_serialise(get_timestamp())) on a doc, then
> > > search through enquire.set_weighting_scheme(xapian.BoolWeight()) and
> > > enquire.set_sort_by_value(0,True) to ensure that the results are
> > > sorted by the timestamp.
> >
> > That's another approach, certainly.
> >
> > > This method is OK, but is there a faster way to do that? I have
> > > millions of records.
> >
> > Sorting the database, or some variant of that, is the way to get
> > really fast sorted results.
> >
> > There's a variation I experimented with using Xappy, involving sorting
> > as much of the database as possible, keeping track of the range of
> > document IDs for which the values were sorted, and using a custom
> > PostingSource to take advantage of that knowledge to skip past the
> > document IDs which were known to be at too low a value.  This worked
> > pretty well (not quite as fast as using a fully sorted database), but
> > is quite fiddly to maintain the ordering (and you need to use a custom
> > PostingSource, so if you're using one of the language bindings, you'd
> > need to compile your own custom Xapian).
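The partially sorted variation described above might be modelled roughly as follows. This is a pure-Python sketch under stated assumptions, not the Xappy code and not the Xapian PostingSource API: `top_k_partially_sorted`, `values`, `matching` and `sorted_upto` are hypothetical names, and it assumes docids 1..sorted_upto hold non-decreasing sort values while later docids are in arbitrary order.

```python
import heapq
from bisect import bisect_left, bisect_right

def top_k_partially_sorted(values, matching, sorted_upto, k):
    """values[d] is the sort value of docid d (index 0 is unused).
    matching is an ascending list of matching docids.
    Returns (docids with the k highest values, matches examined)."""
    split = bisect_right(matching, sorted_upto)
    head, tail = matching[:split], matching[split:]
    best = []      # min-heap of (value, docid), at most k entries
    examined = 0

    def offer(docid):
        nonlocal examined
        examined += 1
        item = (values[docid], docid)
        if len(best) < k:
            heapq.heappush(best, item)
        elif item > best[0]:
            heapq.heapreplace(best, item)

    # The unsorted tail gives no ordering guarantee: check every match.
    for docid in tail:
        offer(docid)

    # In the sorted range, skip docids whose value is known to be too low.
    i = 0
    while i < len(head):
        if len(best) == k:
            threshold = best[0][0]
            # First docid in the sorted range with a value above threshold.
            first_useful = bisect_right(values, threshold, 1, sorted_upto + 1)
            i = max(i, bisect_left(head, first_useful))
            if i >= len(head):
                break
        offer(head[i])
        i += 1

    ranked = sorted(best, reverse=True)
    return [docid for _, docid in ranked], examined

# Hypothetical data: docids 1..1000 sorted by value, 1001..1005 not.
values = [0] + list(range(1, 1001)) + [990, 1500, 20, 995, 1400]
matching = list(range(1, 1006))
print(top_k_partially_sorted(values, matching, 1000, 3))
```

In this made-up example only 10 of the 1005 matches are examined: the five unsorted documents, plus the five sorted ones above the threshold those set. In real Xapian the skipping would have to happen inside a PostingSource subclass, which is the fiddly part mentioned above.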
> >
>
>
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discuss at lists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>



-- 
潘俊勇

EveryDo cloud office platform
http://everydo.com
The new OA for the Internet era

-- 
All the best,
Tim.


