About Xapian serialization of float/double variables

Miao LIU miaoliu95 at acm.org
Tue Jan 22 03:37:31 GMT 2019


Dear Members of Xapian Project,

    I apologize for troubling you. I have noticed that Xapian stores Document values in serialized form when the value is a float or double.

    The same approach is used for sort_key related fields: Xapian requires KeyMaker::operator() to return a serialized float/double value. A heap sort then ranks the vector<MSetItem> items (MultiMatch::get_mset() in multimatch.cc) by comparing the serialized sort_keys (std::string) directly, in an order that matches the underlying IEEE-754 doubles. The sort_keys are later unserialized when the user needs to read the actual float/double values while iterating over the result MSet.
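
    To make the mechanism concrete, here is a minimal sketch of an order-preserving encoding, where comparing the encoded std::string values gives the same result as comparing the doubles. It is an illustration only and is not Xapian's actual format (Xapian::sortable_serialise() and Xapian::sortable_unserialise() use their own encoding). The names encode_sortable() and decode_sortable() are hypothetical, and the code assumes a 64-bit IEEE-754 double.

```cpp
#include <cstdint>
#include <cstring>
#include <string>

static_assert(sizeof(double) == 8, "sketch assumes a 64-bit double");

// Hypothetical helper, NOT Xapian's real format: encode a double as 8
// big-endian bytes whose byte-wise order matches numeric order.
std::string encode_sortable(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    if (bits & (1ULL << 63)) {
        bits = ~bits;            // negative: invert so larger magnitude sorts first
    } else {
        bits |= (1ULL << 63);    // non-negative: set top bit to sort after negatives
    }
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<char>((bits >> (56 - 8 * i)) & 0xFF);
    }
    return out;
}

// Hypothetical inverse of encode_sortable(); expects an 8-byte key.
double decode_sortable(const std::string& key) {
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | static_cast<unsigned char>(key[i]);
    }
    if (bits & (1ULL << 63)) {
        bits &= ~(1ULL << 63);   // was non-negative: clear the marker bit
    } else {
        bits = ~bits;            // was negative: undo the inversion
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

    With such an encoding, the sort only needs an ordinary std::string comparison (for example encode_sortable(-2.5) < encode_sortable(0.5)), and the double is decoded only for the items whose value is actually read.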

    Serialization and unserialization clearly cost time. Compared with defining and using sort_key directly as a float/double, I find it hard to see what benefit the serialization brings, in terms of either performance or code simplicity.

    I would be very grateful if you could give a short explanation. I look forward to your reply.

Best Regards,
Miao LIU


