[Xapian-discuss] Long query times
Olly Betts
olly at survex.com
Thu Sep 29 16:55:20 BST 2005
On Thu, Sep 29, 2005 at 11:43:56AM -0400, tech at dbx.co.uk wrote:
> it could be the nub of the problem, as I'm not sure I understand how Xapian
> works, but all I've got in the data is a number (the id of the CV in a
> MySQL database). The indexing process I go through is basically: get text
> from MySQL -> add each word as a term to a document -> add the id to the
> document as data -> add the modification time as a value -> bin the text
> (as the rest of the application (historically) uses the db). This means
> that all Xapian gives me back is a number.
That's a reasonable way to implement it.
> Just remembered, when I search I order the results by the modification time
> that I store as a value. Maybe it's the sort?
Ah, that'll be the issue. Values aren't stored in a particularly
efficient way considering how they actually get used nowadays (hindsight
is 20-20). Flint will fix that...
If you arrange to add documents in modification time order (and when
updating a document, delete it and add it again rather than replacing it,
so that it gets a new, highest document id) then you can just search
ordered by reverse document id to get "sort by modification time". This
is how the gmane search does "Sort by Date".
It's not as fast as it could be (ideally we want to run the postlists
backwards in this case) but it'll be faster than sorting on a value is
ever likely to be.
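
To illustrate why reverse document id order matches "newest first", here
is a minimal sketch in plain Python. It is a toy model and does not use
the Xapian API: the ToyIndex class, its methods, and the sample CV ids
and timestamps are all hypothetical. The only assumption it makes is
that each newly added document gets an id higher than every earlier one.

```python
import itertools


class ToyIndex:
    """Toy stand-in for an index that hands out increasing document ids.

    Hypothetical illustration only; this is not the Xapian API.
    """

    def __init__(self):
        self._next_docid = itertools.count(1)
        self._docs = {}  # docid -> (cv_id, mtime)

    def add(self, cv_id, mtime):
        # A new document always receives the next, highest docid.
        docid = next(self._next_docid)
        self._docs[docid] = (cv_id, mtime)
        return docid

    def update(self, old_docid, cv_id, mtime):
        # Delete and re-add instead of replacing in place, so the
        # updated document moves to the top of the docid range.
        del self._docs[old_docid]
        return self.add(cv_id, mtime)

    def newest_first(self):
        # "Sort by modification time" is just descending docid order.
        return [self._docs[d][0] for d in sorted(self._docs, reverse=True)]

    def sorted_by_mtime(self):
        # The explicit sort on the stored value, for comparison.
        ordered = sorted(self._docs.values(), key=lambda doc: doc[1], reverse=True)
        return [cv_id for cv_id, _ in ordered]


index = ToyIndex()
first = index.add(cv_id=101, mtime=1000)
index.add(cv_id=102, mtime=2000)
index.add(cv_id=103, mtime=3000)
# CV 101 is modified later: delete and re-add it with the new time.
index.update(first, cv_id=101, mtime=4000)

by_docid = index.newest_first()
by_value = index.sorted_by_mtime()
```

Both orderings agree as long as documents are always added in
modification time order, which is the precondition described above. In
current Xapian releases the equivalent is, I believe, to use boolean
weighting together with Enquire's set_docid_order() set to DESCENDING,
but check the API documentation for the version you are running.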
Cheers,
Olly