[Xapian-devel] [xapian] a bug fixed in brass_database.cc

Olly Betts olly at survex.com
Tue Apr 17 03:14:04 BST 2012


On Sat, Apr 14, 2012 at 07:02:58PM -0400, Weixian Zhou wrote:
> I fixed a bug in brass_database.cc.

Thanks for your patches.

One minor but important issue - your indentation doesn't match the
existing code.  Set your editor to indent by 4 spaces, with tabs
displayed as 8 spaces, and indents of 8 or more spaces tab-filled.
Full details are in HACKING.

Also, in a patch, it's better to just remove old code rather than
commenting it out.  If people want to see the old code, they can look in
the version control system.

Looking at the flush threshold patch first, don't put your initials or
name in comments you add (or at least not as a matter of course) - if
people want to know who wrote a method, the version control system can
tell them (e.g. git blame or svn blame).  Updating copyright attribution
is the big exception to that.

It's better to make comments like that one in brass_inverter.h doxygen
format comments so they appear in the generated documentation for
internal APIs.  Mostly that means use "/**" or "///" to start them,
though there are various formatting rules too.

There's no need to define a destructor if it's empty - you get an
empty destructor by default.

Inverter::get_inverter_size() is just an accessor method, so should
really be const.

The tracking of posting_count seems incorrect to me:

* You increment it for each call to add_posting() (OK so far)
* But you don't increment it at all for remove_posting(), which often
  also adds entries.
* You only decrease it by own when a posting list is written out, not
  by the number of entries written.
* The default threshold of 10000 is going to be too low now.

It would also really be better (though much harder) to track the
memory actually used, since that's what matters more to users.

You don't remove the existing change_count variable, but just remove the
code which updates it when documents are added or updated or removed, so
it will always be zero (unless the user asks for allterms) which will
cause various issues (like commit() not actually flushing changes).
This causes a number of testcases to fail with your changes applied -
try "make check" to see.

> I also modified the simpleindex.cc so that it now supports batch files
> indexing.

Interesting, but we already have omindex for indexing a directory tree,
while simpleindex is intended to be *simple*.  So I'm not sure we want
to change it to scan directories, particularly since the code to do that
depends on the OS (we have a adaptor layer for this for the MSVC build,
but examples should ideally not have to drag in such code as it makes
them harder to understand).

A list of text files on the command line might be a reasonable
alternative (or perhaps better as a separate example).

Cheers,
    Olly



More information about the Xapian-devel mailing list