[Xapian-tickets] [Xapian] #385: Expanding docids (etc) beyond 32 bit types
Xapian
nobody at xapian.org
Wed Jun 24 12:29:23 BST 2015
#385: Expanding docids (etc) beyond 32 bit types
-------------------------+------------------------------
Reporter: james | Owner: olly
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.3.x
Component: Other | Version: SVN trunk
Severity: minor | Resolution:
Keywords: | Blocked By:
Blocking: | Operating System: All
-------------------------+------------------------------
Comment (by olly):
[14b7af012cd57bb5e6097584f36d8680ca3c8d7e] just changes the testutils
`operator<<` to use `Xapian::docid` instead of `unsigned int`, which
avoids using a template there. Really the issue is just that the wrong
type was being used.
[ decode_length() ]
> I'm thinking this is no longer needed because of the other change you
> made?
Indeed - that should all just work now.
[ sizeof(long long) ]
> I'm not really sure how to find out the answer to this question. Do we
> have a list of devices/hardware/OSs or something to check against?
It's hard to gather a complete list of platforms Xapian works on as people
might have successfully built on something and never told us. We're much
more likely to hear if they fail, or if a few tweaks are needed.
But we can look at the platforms which are generally in active use (and in
this case, anything old isn't going to make `long long` wider than it has
to be, and 64-bit is the minimum requirement).
The current norm is definitely 64-bit long long - e.g. all the
architectures Debian supports have `sizeof(long long) == 8`. The GCC
manual seems to hint that GCC supports platforms where this isn't true,
but I don't know an easy way to find out what they are:
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fint128.html#g_t_005f_005fint128
says "There is no support in GCC for expressing an integer constant of
type __int128 for targets with long long integer less than 128 bits wide"
which suggests that there are targets with long long at least 128 bits
wide.
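One low-tech way to answer the question for a particular platform is to compile and run a tiny probe there. The following is only a sketch: the helper names (`unsigned_long_bits()`, `unsigned_long_long_bits()`, `report_widths()`) are made up for illustration and are not part of Xapian.

```cpp
#include <climits>  // CHAR_BIT
#include <cstdio>   // std::printf

// Width in bits of unsigned long long on the platform being compiled for.
inline unsigned unsigned_long_long_bits() {
    return static_cast<unsigned>(sizeof(unsigned long long) * CHAR_BIT);
}

// Width in bits of unsigned long, for comparison.
inline unsigned unsigned_long_bits() {
    return static_cast<unsigned>(sizeof(unsigned long) * CHAR_BIT);
}

// Print both widths; intended to be called from main() of a small probe
// program built with the target platform's compiler.
inline void report_widths() {
    std::printf("unsigned long:      %u bits\n", unsigned_long_bits());
    std::printf("unsigned long long: %u bits\n", unsigned_long_long_bits());
}
```

Multiplying by `CHAR_BIT` rather than assuming 8-bit bytes keeps the probe honest on unusual targets.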
C++11 provides `uint64_t` in `<cstdint>` (at least if such a type exists),
though so far we've tried to avoid introducing C++11 assumptions in the
API headers (only in the library code) - most compilers currently need
C++11 support enabling with a command-line option, and it seems unhelpful
to force all C++ projects using Xapian to update their build systems to
probe for such an option.
I think we should probably just use `unsigned long long` - it will always
work, and while it may be wider than necessary, that seems mostly a
theoretical worry currently.
> Would the conditional enabling that seems half done through #define
> USE_64BIT_DOCID and #define USE_64BIT_TERMCOUNT be suitable?
We don't want to be defining generically named macros in the API
headers (we risk colliding with macros the application using Xapian is
using) - so the macros should start `XAPIAN_`.
But the basic idea seems OK.
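As a rough sketch of what a suitably prefixed conditional typedef could look like: the macro name `XAPIAN_SKETCH_64BIT_DOCID`, the `sketch` namespace and the `docid_bits()` helper are all hypothetical, and the 32-bit default is assumed to be plain `unsigned`.

```cpp
#include <climits>  // CHAR_BIT

// Hypothetical macro name. In a real build it would be set by configure
// (via a generated header), not defined by hand like this.
// #define XAPIAN_SKETCH_64BIT_DOCID 1

namespace sketch {  // deliberately not namespace Xapian

#ifdef XAPIAN_SKETCH_64BIT_DOCID
// Wide variant: at least 64 bits on any conforming compiler.
typedef unsigned long long docid;
#else
// Assumed default: the platform's unsigned int.
typedef unsigned docid;
#endif

// Width in bits of whichever type was selected.
inline unsigned docid_bits() {
    return static_cast<unsigned>(sizeof(docid) * CHAR_BIT);
}

}  // namespace sketch
```

Because the macro carries the `XAPIAN_` prefix, an application that happens to define its own `USE_64BIT_DOCID` would not change which type is selected.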
> Could we have those somehow enabled in ./configure perhaps ./configure
> --with-64-bit-docids --with-64bit-termcount? Is there an example of a
> configuration step in xapian already that I can look at and try to copy
> that?
This is trickier for things like this which we want to use in the API
headers as we can't just stick `#include <config.h>` in those.
I'd look at `--enable-backend-chert` and `XAPIAN_HAS_CHERT_BACKEND` in
`configure.ac` and `include/xapian/version_h.cc` (which is used to
generate `include/xapian/version.h`).
The options should probably be `--enable-X` (`--with-X` is conventionally
meant to be used when `X` is some other software package and `--enable-X`
when `X` is a feature of this package - e.g. `--with-java` vs
`--enable-backend-chert` - the most obvious consequence is the sections
they are listed under by `configure --help`).
It seems confusing for `docids` to be plural in the option name when
`termcount` isn't; similarly, the two options should be consistent about
`64-bit` vs `64bit`.
> Also is there a way to detect this at compile time so we can pivot based
> on whether or not unsigned long is already 64bit?
I can't think of one unless we force people to select C++11 and just use
`uint64_t`, but it seems a bit soon for that. At some point compilers
will presumably default to C++11 and this won't be a consideration.
--
Ticket URL: <http://trac.xapian.org/ticket/385#comment:13>
Xapian <http://xapian.org/>