[Xapian-discuss] Stemming and Query Parsing

Olly Betts olly at survex.com
Thu Oct 21 00:31:39 BST 2004


On Wed, Oct 20, 2004 at 11:37:40AM -0400, Mike Boone wrote:
> OK, I have the # sign stuff working. When the comment said "Note that
> nethack-- and Cl- are handled above", I thought it meant the 'if' statement
> a couple lines above, not way above in another function. It might be more
> clear to change it to "Note that nethack-- and Cl- are handled above in
> p_notplusminus".

I'll improve the comment (though p_notplusminus has gone now!)

> I changed the function to look like this:
> 
> inline static bool p_notplusminus(unsigned int c)
> {
>   // MB adding # sign
>   return c != '+' && c != '-' && c != '#';
> }

That sounds about right, assuming p_notplusminus isn't used elsewhere.
Note that with this change the parser will happily accept terms like
foo-#++#--- -- weird, but probably not actually a problem.
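
To illustrate why such terms get through, here's a minimal sketch (not the
actual query parser code).  term_length is a hypothetical helper, and it
assumes a term is a run of alphanumerics followed by any run of characters
which p_notplusminus rejects:

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Predicate as quoted above: true for anything other than '+', '-' or '#'.
inline static bool p_notplusminus(unsigned int c)
{
    return c != '+' && c != '-' && c != '#';
}

// Hypothetical helper (an assumption for illustration, not Xapian's code):
// returns the length of the term at the start of s, where a term is a run
// of alphanumerics followed by any run of '+', '-' or '#' characters.
static std::size_t term_length(const std::string &s)
{
    std::size_t i = 0;
    // Consume the alphanumeric body of the term.
    while (i < s.size() && std::isalnum(static_cast<unsigned char>(s[i])))
        ++i;
    if (i == 0)
        return 0; // No term starts here.
    // Consume every trailing character which p_notplusminus rejects.
    while (i < s.size() && !p_notplusminus(static_cast<unsigned char>(s[i])))
        ++i;
    return i;
}
```

Under those assumptions, term_length("foo-#++#--- bar") covers the whole of
foo-#++#---, and C#, Cl- and nethack-- each come out as a single term too.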

> Regarding the large PHP4 xapian.so file size: I was just copying the PHP4
> xapian.so from the .libs directory to the place I wanted to keep it, but I
> ran the make install-strip and it cut it down to 3.7MB, still double the
> size of the 0.8.1 version, but better than 10MB.

Hmm, that's more worrying.  There are a few more methods wrapped in
0.8.3 than 0.8.1, but nothing that should double the size.  I suspect
this change is the cause:

* Build the SWIG glue library like we build the others rather than using
  SWIG's -phpfull option.  This avoids problems with newer automake
  versions and means we can build against an uninstalled xapian library.

I'd rather not revert this change, as the new way is an improvement in
several ways.  But building this way shouldn't change the size at all.
So we just need to figure out what the important difference is.

I'll take a look.  The other possibilities I can see are changes due to
different versions of GCC, SWIG, or PHP.  Do you know if you upgraded
any of these between the two builds?

> Is there any way to make
> that strip happen without doing the make install-strip and having the files
> put somewhere I didn't necessarily want them?

You can run strip by hand, but that's fiddly.

We plan to add a configure option to "install" stuff to a non-system
location (mainly to help non-root users use the bindings).  Once
that's done it'll be easier.

For now, a reasonable workaround is to use:

make install-strip DESTDIR=`pwd`/TEMP

DESTDIR is prepended to the install paths, so stuff will be installed
under TEMP/usr/lib/... instead of /usr/lib (DESTDIR is a standard feature
of automake build systems).  You can then copy the stripped xapian.so from
under TEMP to wherever you want to keep it.

Cheers,
    Olly


