How to get the serialise score returned in Xapian::KeyMaker->operator().
Olly Betts
olly at survex.com
Tue Jan 16 19:25:46 GMT 2018
On Mon, Jan 15, 2018 at 08:55:26PM +0800, 张少华 wrote:
> In our case, we want to compute a weight from the user's properties (age,
> gender, price preference) and the products' properties (price, comment
> count, purchased amount by gender or age range). So
> our weight function is complex: whether we use KeyMaker or
> PostingSource, six to eight value slots will be used.
>
> But we find that calling doc.get_value(slot) several times separately in
> each search makes getting results slow.
Each value slot is stored as a separate chunked stream, so fetching many
of them will increase the work required.
> Now we want to construct a forward index (using an unordered map) which
> uses the docid as key and whose value contains the slot values we need;
> the forward index will be built when our application starts.
> Then we can get all the values we use at once, and
> we do not need to call sortable_unserialise().
I'm not sure unordered_map is the best choice here - the values will be
accessed in increasing docid order, and something that has better
locality of access will probably be faster due to caching
considerations.
Unless you have a lot of large gaps in your docids, I'd consider just
using a vector indexed by the docid. Even with a few unused entries,
you save all the hash table overhead that unordered_map will add.
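
As a rough sketch of the vector idea (not tested against a real database):
the DocFeatures struct, its field names and the ForwardIndex class are made
up for illustration, and I'm assuming the table is sized from something like
Database::get_lastdocid() at start-up. Docids start at 1, so element 0 is
simply left unused.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-document record; the field names are illustrative only.
struct DocFeatures {
    double price = 0.0;
    double comment_count = 0.0;
    double purchased_male = 0.0;
    double purchased_female = 0.0;
    bool present = false;  // stays false for unused docids (gaps)
};

// Dense forward index: element i holds the features of docid i.
class ForwardIndex {
  public:
    // max_docid is assumed to be the highest docid in use.
    explicit ForwardIndex(std::uint32_t max_docid)
        : table_(static_cast<std::size_t>(max_docid) + 1) {}

    // Store the features for one docid and mark the entry as used.
    void set(std::uint32_t docid, const DocFeatures& f) {
        assert(docid < table_.size());
        table_[docid] = f;
        table_[docid].present = true;
    }

    // Returns nullptr for docids that are out of range or unused.
    const DocFeatures* find(std::uint32_t docid) const {
        if (docid >= table_.size() || !table_[docid].present) return nullptr;
        return &table_[docid];
    }

  private:
    std::vector<DocFeatures> table_;
};
```

Since the matcher visits docids in increasing order, lookups walk forwards
through contiguous memory, which is where the locality benefit comes from.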
> Do you have some suggestions about this or is there some other way to
> make our search faster?
You could also serialise all the data you want for weighting into a
single value slot, so that each document needs only one get_value()
call. That said, if you have the RAM for it and don't mind the
start-up time overhead, I'd imagine your in-memory approach would be
faster.
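
For the single-slot option, here is a minimal sketch of one way to pack
several numbers into one string. pack_values() and unpack_values() are
hypothetical helpers, not part of Xapian. I'm assuming raw doubles are
acceptable, which means the bytes are not portable between machines with
different endianness and the packed value isn't useful for sorting. At
index time the string would be stored with doc.add_value(slot, packed) and
read back with a single doc.get_value(slot).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Pack the doubles into one binary string, in native byte order.
std::string pack_values(const std::vector<double>& values) {
    std::string out(values.size() * sizeof(double), '\0');
    if (!values.empty()) {
        std::memcpy(&out[0], values.data(), out.size());
    }
    return out;
}

// Reverse of pack_values(): recover the doubles from the packed string.
std::vector<double> unpack_values(const std::string& packed) {
    // A valid packed string is always a whole number of doubles long.
    assert(packed.size() % sizeof(double) == 0);
    std::vector<double> values(packed.size() / sizeof(double));
    if (!values.empty()) {
        std::memcpy(values.data(), packed.data(), packed.size());
    }
    return values;
}
```

This also avoids calling sortable_unserialise() once per field, at the cost
of a format which only your own code understands.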
Cheers,
Olly