Weighting Schemes: Implementing Piv+ Normalization

Vivek Pal vivekpal.dtu at gmail.com
Fri Jul 29 21:25:11 BST 2016


> `ptr` is, if I inferred correctly, a `const char *`. (I'm not sure,
> because I don't know why you're incrementing it. Please push your code
> to github if you need further help so people can see the entire
> context of your changes.)

I've pushed all the changes I've made so far:
https://github.com/xapian/xapian/compare/master...ivmarkp:piv+?diff=split&name=piv%2B

Can you please add some comments on it? Support for normalization weighting
is complete -- just these issues with serialisation.

Meanwhile, I'm working on adding an overloaded constructor that takes the
parameters s and delta. That separates them from the normalization strings
altogether, which are left to the existing constructor to deal with.
Does this sound like a viable approach?
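
For illustration, the idea could look roughly like the sketch below. It is
only a sketch under stated assumptions: `PivPlusParams`, `encode_double()` and
`decode_double()` are hypothetical names, not code from the linked branch, and
the double helpers are simple stand-ins rather than Xapian's own
serialise_double()/unserialise_double(). The point it shows is that the
doubles are written first and read back in the same order, and that the
normalization string is simply the remaining bytes, so no string
unserialiser is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for illustration only. They copy the raw bytes of
// a double and are NOT Xapian's portable serialise_double() encoding.
inline std::string encode_double(double v) {
    return std::string(reinterpret_cast<const char*>(&v), sizeof v);
}

inline double decode_double(const char** p, const char* end) {
    if (static_cast<std::size_t>(end - *p) < sizeof(double))
        throw std::runtime_error("truncated double");
    double v;
    std::memcpy(&v, *p, sizeof v);
    *p += sizeof v;  // advance past the bytes just consumed
    return v;
}

// Hypothetical parameter holder: a normalization string plus s and delta.
struct PivPlusParams {
    std::string normalizations;
    double slope;  // "s" in the discussion
    double delta;

    PivPlusParams(const std::string& n, double s, double d)
        : normalizations(n), slope(s), delta(d) {}
};

// Write the two doubles first, then append the normalization string as-is.
inline std::string serialise(const PivPlusParams& w) {
    std::string out = encode_double(w.slope);
    out += encode_double(w.delta);
    out += w.normalizations;
    return out;
}

// Read back in exactly the order used by serialise(); whatever is left
// after the two doubles is the normalization string.
inline PivPlusParams unserialise(const std::string& s) {
    const char* ptr = s.data();
    const char* end = ptr + s.size();
    double slope = decode_double(&ptr, end);
    double delta = decode_double(&ptr, end);
    return PivPlusParams(std::string(ptr, end), slope, delta);
}
```

Reading the fields in a different order from the one they were written in,
or decoding a double from bytes that were never encoded as one, is the kind
of mismatch that produces errors on the unserialising side.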

Thanks,
Vivek

On Fri, Jul 29, 2016 at 6:09 PM, James Aylett <james-xapian at tartarus.org> wrote:

> On Fri, Jul 29, 2016 at 12:35:57AM +0530, Vivek Pal wrote:
>
> > > I can't tell for sure without seeing the diff. You may mean just
> > > `ptr++`? But it could be something else, depending on what you're
> > > trying to do.
> >
> > I'm trying to unserialise normalization strings (e.g. "nfn", "nbsl",
> > etc.) along with the new double parameters (s and delta) but
> > it isn't turning out to be smooth because there's no method for
> > unserialising strings in serialise-double.h
>
> Serialising is about round-tripping numbers through strings. You
> shouldn't need to serialise a string; you don't even have to worry
> about encoding with those strings, since they're all covered by ASCII
> anyway.
>
> > Although, doing just
> >
> > const string normals = ptr++; or, const string normals =
> > static_cast<const string>ptr++; fixes compile errors.
>
> `ptr` is, if I inferred correctly, a `const char *`. (I'm not sure,
> because I don't know why you're incrementing it. Please push your code
> to github if you need further help so people can see the entire
> context of your changes.)
>
> const string normals = "something";
>
> will work because there's a suitable constructor. `static_cast<>`
> isn't appropriate here (again, providing I've inferred the type of
> `ptr` correctly).
>
> > But tfidfweight3 test case is failing with remote backends :-
> >
> > $ ./runtest gdb ./apitest -v tfidfweight3
> >
> > Running test: tfidfweight3... SerialisationError: REMOTE:Bad encoded
> > double: short mantissa (context: remote:prog(../bin/xapian-progsrv
> > -t300000 .glass/db=apitest_simpledata)
>
> You're deserialising something that wasn't serialised, or wasn't
> serialised properly.
>
> If I put 'xapian bad encoded double short mantissa' into Google, I get
> this page as the top result (it may be further down the page for you):
>
> https://xapian.org/docs/sourcedoc/html/serialise-double_8cc_source.html
>
> The error message is at line 173 of unserialise_double. I don't need
> to read the code to understand what the error is telling me, because
> mantissa is a common term when dealing with floating point numbers
> (again, Google is fairly helpful here).
>
> > I'm wondering if I need to introduce a new method in serialise-double.h
> > for string parameters (normalizations in this case)? To be honest, I have
> > little idea about that part of Xapian so probably a workaround might be
> > better. :)
>
> Again: no. And generally, applying a workaround because you don't
> understand something isn't a good idea, because how will you know if
> the workaround isn't working around some important issues you need to
> address directly?
>
> J
>
> --
>   James Aylett, occasional trouble-maker
>   xapian.org
>
>
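
The point in the quoted reply about building a string from a `const char *`
can be sketched as below, assuming `ptr` really is a `const char *` (the
thread only infers this). The helper names `take_with_increment()` and
`take_range()` are hypothetical and exist only for this illustration. The
first mirrors `const string normals = ptr++;`: it compiles because
`std::string` has a converting constructor from `const char *`, but it copies
up to the first NUL byte and advances the pointer by a single character. The
second uses an explicit range and moves the pointer past what was consumed.

```cpp
#include <cassert>
#include <string>

// Mirrors `const string normals = ptr++;`. The string is built from the
// pointer's old value (up to the first NUL); ptr then advances by one char.
inline std::string take_with_increment(const char*& ptr) {
    const std::string normals = ptr++;
    return normals;
}

// Builds the string from an explicit [ptr, end) range, so no terminating
// NUL is required, and leaves ptr at the end of the consumed bytes.
inline std::string take_range(const char*& ptr, const char* end) {
    std::string normals(ptr, end);
    ptr = end;
    return normals;
}
```

With the first form the pointer ends up one byte in, which is unlikely to be
the position any following unserialise call expects.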