[Xapian-discuss] Re: Xapian and research in IR: a few suggestions from experience

Olly Betts olly at survex.com
Wed Sep 12 18:52:13 BST 2007


On Wed, Sep 05, 2007 at 06:45:01PM +0200, Emmanuel Eckard wrote:
> All these models would call for doubles, or vectors of doubles, to be 
> associated with Documents, TermIterators and Databases.

I wonder how best to store such doubles.  One option is the format we
use for the remote protocol (serialise_double() in
common/serialise-double.cc), which can require up to 11 bytes but
often needs fewer than 8.  It's possible that the encoding could be
made more compact: an extra byte or two isn't a big concern where it
is currently used, so it hasn't been tuned for size.

Another approach would be to use the IEEE format, and carefully convert
to/from that on platforms where it isn't the native format.
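
As a rough illustration of the IEEE approach, here is a minimal sketch
that writes a double as 8 bytes in a fixed (big-endian) order.  The
helper names encode_ieee_double() and decode_ieee_double() are
hypothetical and not part of Xapian.  The sketch assumes the native
double is already IEEE 754 binary64 and that floating point and integer
byte order match; a platform where that isn't true would need an
explicit conversion instead (for example one built on frexp() and
ldexp()).

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>
#include <string>

// Assumption: the native double is IEEE 754 binary64.  Compilation fails
// on platforms where this does not hold, rather than storing wrong data.
static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "this sketch assumes IEEE 754 binary64 doubles");

// Hypothetical helper: encode a double as 8 bytes, most significant first,
// so the stored form does not depend on the host's byte order.
std::string encode_ieee_double(double value) {
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // copy the bit pattern safely
    std::string out(8, '\0');
    for (int i = 0; i < 8; ++i) {
        out[i] = static_cast<char>((bits >> (56 - 8 * i)) & 0xff);
    }
    return out;
}

// Hypothetical helper: inverse of encode_ieee_double().
double decode_ieee_double(const std::string& bytes) {
    assert(bytes.size() == 8);  // precondition: exactly 8 bytes
    std::uint64_t bits = 0;
    for (int i = 0; i < 8; ++i) {
        bits = (bits << 8) | static_cast<unsigned char>(bytes[i]);
    }
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

This always costs exactly 8 bytes per value, which is the fixed-size
baseline that any variable-length encoding would be compared against.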

If you have a sample of the doubles you'd want to store, it would be
interesting to see how large it is after running it through
serialise_double().
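
The kind of measurement meant here can be sketched with a stand-in
encoder.  The code below does NOT reproduce serialise_double(); it
uses a hypothetical toy format (one length byte, then the big-endian
IEEE 754 bytes with trailing zero bytes dropped) purely to show how
the total size of a sample could be compared against 8 bytes per
value.  The names toy_encoded_size() and toy_total_size() are
assumptions made for illustration, and the sketch assumes the native
double is IEEE 754 binary64.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical toy encoding, NOT Xapian's serialise_double(): one length
// byte followed by the IEEE 754 bytes (most significant first) with any
// trailing zero bytes dropped.
std::size_t toy_encoded_size(double value) {
    static_assert(sizeof(double) == 8, "sketch assumes 64-bit doubles");
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    std::size_t payload = 8;
    // The low-order mantissa bytes are zero for "round" values such as
    // 1.0 or 0.5, so those values need fewer payload bytes.
    while (payload > 0 && (bits & 0xff) == 0) {
        bits >>= 8;
        --payload;
    }
    return 1 + payload;  // length byte plus remaining payload
}

// Total encoded size of a sample, to compare with 8 * sample.size().
std::size_t toy_total_size(const std::vector<double>& sample) {
    std::size_t total = 0;
    for (double v : sample) {
        total += toy_encoded_size(v);
    }
    return total;
}
```

With this toy format, values with short mantissas shrink while values
such as 0.1 grow by one byte, so whether a variable-length encoding
pays off depends on the actual distribution of the doubles being
stored.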

Cheers,
    Olly
