[Xapian-devel] Re: [Xapian-discuss] Fixing issue with PHP5/Windows

Daniel Ménard Daniel.Menard at bdsp.tm.fr
Thu Apr 5 09:40:56 BST 2007


Olly Betts wrote:
> But I just reread the original thread.  To summarise, this has the odd
> bug under MSVC:
>
>     $terms = join(" ", $enq->get_matching_terms($mset->get_hit(0)))
>
> But this works:
>
>     $hit=$mset->get_hit(0);
>     $it=$enq->get_matching_terms_begin($hit);
>     while (! $it->equals($enq->get_matching_terms_end($hit)))
>     {
> 	echo $it->get_term(), ' ';
> 	$it->next();
>     }
>
> I thought someone had said that this worked, but I can't find where
> anyone did
>   
I think I am that someone ;-).
Here is a summary of what I tested:

- join(" ", $enq->get_matching_terms($mset->get_hit(0)));
does not work: each term in the resulting string has a "\0" in place of 
its first char
(we get the string " s  here", i.e. the chars '\0', 's', '<space>', '\0', 
'h', 'e', 'r' and 'e')

- the problem is not with the join(). The code
var_export($enq->get_matching_terms($mset->get_hit(0)));
prints out an array in which each term also starts with a \0 char:

    array (
      0 => ' s',   // first char is \0
      1 => ' here', // same here
    )

- the problem is not with the way Xapian stores terms: both delve -t and 
PHP code like this work:

    $hit=$mset->get_hit(0);
    $it=$enq->get_matching_terms_begin($hit);
    while (! $it->equals($enq->get_matching_terms_end($hit)))
    {
        echo $it->get_term(), ' ';
        $it->next();
    }

(the code prints out the expected terms: 'is' and 'there')

- The problem is probably Windows-specific: I can't reproduce it under 
Debian

- Richard suggested it has something to do with the code at the end of 
utils.i which copies the values to create a native PHP array:
http://permalink.gmane.org/gmane.comp.search.xapian.devel/1053
I think he is right... I added some debug code in xapian_wrap.cc to 
write the terms to a file (bear with me, my knowledge of C is close to 
nil):

        FILE * f=fopen("c:\\terms.txt", "w");
        for (Xapian::TermIterator i = (&result)->first; i != (&result)->second; ++i) {
          char * p = const_cast<char *>((*i).data());
          fprintf(f,"data: [%s], p:[%s]\n",((*i).data()), p);
          add_next_index_stringl(return_value, p, (*i).length(), 1);
        }
        fclose(f);


I got the following output :

    data: [is], p:[]
    data: [there], p:[]

So (*i).data() is OK, but p is not... Something to do with the cast?
In fact, if I avoid the const_cast by replacing the add_next_index_stringl 
line with
      add_next_index_stringl(return_value, (char *) (*i).data(), (*i).length(), 1);
the problem goes away and the test passes... just my two cents...

- Last thing: Charlie said that fixing this was debatable...
http://permalink.gmane.org/gmane.comp.search.xapian.devel/1063
and I agree with him: as long as get_matching_terms_begin/end work as 
expected, I can live without the PHP array.
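
A note on the cast question above. A likely explanation (an assumption, not verified against this build) is that the cast itself is harmless and the real difference is object lifetime. Xapian::TermIterator's operator* returns the term as a std::string by value, so (*i).data() points into a temporary that is destroyed at the end of the statement that created it. Storing that pointer in p and using it on a later line reads freed memory, which is undefined behaviour and can show up differently from one compiler to another. Calling (*i).data() directly inside the argument list keeps the temporary alive until the call returns. The sketch below uses a hypothetical FakeTermIterator and append_copy (stand-ins for the Xapian iterator and for add_next_index_stringl, not real Xapian or PHP APIs) to show a variant that holds each term in a named local before taking its data pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for Xapian::TermIterator: operator* hands back
// the term as a std::string BY VALUE, i.e. as a temporary object.
class FakeTermIterator {
    const std::vector<std::string>* terms_;
    std::size_t pos_;
public:
    FakeTermIterator(const std::vector<std::string>& terms, std::size_t pos)
        : terms_(&terms), pos_(pos) {}
    std::string operator*() const { return (*terms_)[pos_]; }
    FakeTermIterator& operator++() { ++pos_; return *this; }
    bool operator!=(const FakeTermIterator& o) const { return pos_ != o.pos_; }
};

// Hypothetical stand-in for add_next_index_stringl(): copies `len` bytes
// starting at `p` into the result array.
void append_copy(std::vector<std::string>& out, const char* p, std::size_t len) {
    out.push_back(std::string(p, len));
}

// Copies every term in [begin, end) into a new array.
std::vector<std::string> copy_terms(FakeTermIterator begin, FakeTermIterator end) {
    std::vector<std::string> out;
    for (FakeTermIterator i = begin; i != end; ++i) {
        // Unsafe pattern (not used here):
        //   const char* p = (*i).data();   // the temporary dies at the ';'
        //   append_copy(out, p, ...);      // p now dangles
        const std::string term = *i;  // named copy, valid for the whole loop body
        append_copy(out, term.data(), term.length());
    }
    return out;
}
```

Calling copy_terms over a list holding "is" and "there" should give back the same two strings. Holding the term in a named std::string costs one extra copy per term but removes any dependence on how long a temporary lives.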

Best regards,

-- 

Daniel Ménard




