[Xapian-discuss] Quickest way to retrieve data for a large match set?
William Crawford
william at sciencephoto.co.uk
Thu Jun 24 12:55:09 BST 2010
We're using the Perl binding to access Xapian in a simple search of image
metadata (title and keywords). Due to the specification for the search engine,
by default we have to sort the results using a function of the search rank,
age (well, newness) and popularity (rated by sales of the image). As a result,
we have to fetch the complete result set and then calculate a new ranking
based on the original rank, perturbed using the ratios of each of the newness
and popularity to the highest values in the result set (i.e. there is no way
to precalculate these at indexing time, alas).
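A minimal sketch of that kind of re-ranking, for illustration only. The field names (weight, newness, sales), the rerank function and the two weighting factors are hypothetical and are not the formula actually in use. It shows why the whole result set is needed first: the maxima are only known once every match has been fetched.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical re-ranking: field names and default weights are assumptions.
# Each result is a hashref with weight (original rank), newness and sales.
# Note that a score key is added to each input hashref.
sub rerank {
    my ($results, %w) = @_;
    return [] unless @$results;
    my $w_new = $w{newness}    // 0.2;
    my $w_pop = $w{popularity} // 0.3;

    # The maxima depend on the complete result set, so they cannot be
    # precalculated at indexing time. Fall back to 1 to avoid dividing by 0.
    my $max_new = max(map { $_->{newness} } @$results) || 1;
    my $max_pop = max(map { $_->{sales} }   @$results) || 1;

    # Perturb the original rank by each ratio to the set's highest value.
    for my $r (@$results) {
        $r->{score} = $r->{weight}
            * (1 + $w_new * $r->{newness} / $max_new
                 + $w_pop * $r->{sales}   / $max_pop);
    }
    # Highest adjusted score first.
    return [ sort { $b->{score} <=> $a->{score} } @$results ];
}

my $sorted = rerank([
    { id => 1, weight => 1.0, newness => 10,  sales => 0  },
    { id => 2, weight => 0.9, newness => 100, sales => 50 },
]);
print join(",", map { $_->{id} } @$sorted), "\n";
```

With these made-up numbers the second image overtakes the first, even though its original rank is lower, because it is both newer and better selling.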
Currently fetching the document data for the results has become something of a
bottleneck (typical searches may generate 50 - 500 matches, but some return
more than 5000).
Code is something like:
...
print STDERR "Query = ", $q->get_description, "\n" if $self->debug;
my $e = $self->index->enquire ($q);
#my $hits = $e->get_mset(0, $self->index->get_doccount, $self->index->get_doccount);
my (@hits) = $e->matches (0, $self->index->get_doccount, $self->index->get_doccount);
my (@results) = map +thaw($_->get_document->get_data), @hits;
return \@results;
}
I'd like to know if there's anything I can do to improve the speed of fetching
the results (in other words, am I doing it wrong)?