[Xapian-discuss] PHP Bindings... Infinite Loop?

Olly Betts olly at survex.com
Wed Dec 27 23:29:42 GMT 2006


On Wed, Dec 27, 2006 at 02:48:50PM -0500, Ryan Mahoney wrote:
> I recently upgraded to version 0.9.9 of the php bindings.  I am running 
> Apache 2.2.3 and PHP 5.2.  I have re-indexed all my data with the latest 
> version of scriptindex.  The data set I have is somewhat small, maybe 
> 4000 items.  When I run a query, it completes quickly, but as I run a 
> few more (maybe 2 or 3), the Apache process goes out of control, using 
> up 1 CPU completely on a 4 CPU linux server.  Any ideas?

I notice you're using the now-deprecated flat function interface.  I
guess that could be the problem.  Even if it isn't, updating is a good
idea:

http://article.gmane.org/gmane.comp.search.xapian.general/3754

As for the loop, the thing you need to do is narrow down where it is
looping.  You can attach gdb to an existing process with pid PID using
"gdb --pid PID" and then type "bt" to produce a backtrace from the
current point of execution.  This will only help if the problem is
in code compiled with debugging information; otherwise the backtrace
won't tell you very much.

Other than that, I'd suggest the tried and tested debugging technique
of adding `print "got to 1\n";', `print "got to 2\n";', etc through the
code to at least find out where in the PHP script it starts looping.
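
A minimal sketch of that checkpoint technique, assuming the script runs
under Apache.  checkpoint() is a hypothetical helper, not part of PHP or
Xapian.  It writes via error_log() so the markers land in the server's
error log even if page output is buffered, and it includes the pid so
the markers can be matched to the process that is spinning:

```php
<?php
// Hypothetical helper: log a numbered checkpoint and return the message.
function checkpoint($n) {
    // getmypid() identifies which Apache child reached this point.
    $msg = sprintf("got to %d (pid %d)", $n, getmypid());
    error_log($msg);
    return $msg;
}

checkpoint(1);
// ... open the database and build the query parser here ...
checkpoint(2);
// ... run the query and walk the results here ...
checkpoint(3);
```

The last number logged before the CPU usage climbs shows which part of
the script to look at more closely.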

>                //create a query parser
>                define('FLAG_BOOLEAN', 1);

You should use the constants defined by the xapian bindings rather than
defining them for yourself - the numerical values aren't guaranteed to
remain the same when the library ABI changes, but the named constants
will.  With the deprecated flat wrappers, you should be using
QueryParser_FLAG_BOOLEAN (with the OO wrappers for PHP5, it's
XapianQueryParser::FLAG_BOOLEAN).

>                $query_parser = new_queryparser();
> 
>                //set the stemmer and turn on the stemming strategy
>                queryparser_set_stemmer($query_parser, $stemmer);
>                queryparser_set_stemming_strategy($query_parser, 1);

Similar issue - use the named constant (QueryParser_STEM_SOME or
XapianQueryParser::STEM_SOME), not a literal numerical value!
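
As a rough sketch of both points with the OO wrappers, assuming the
PHP5 wrapper file is loaded with include "xapian.php" and that the
stemmer language is English (both are assumptions, since the original
setup code isn't shown):

```php
<?php
// Assumption: the PHP5 OO wrapper is on the include path.
include "xapian.php";

// Assumption: English stemming; the original $stemmer setup isn't shown.
$stemmer = new XapianStem("english");

$query_parser = new XapianQueryParser();
$query_parser->set_stemmer($stemmer);

// Named constant instead of the literal 1.
$query_parser->set_stemming_strategy(XapianQueryParser::STEM_SOME);

// Named constant instead of define('FLAG_BOOLEAN', 1).
$query = $query_parser->parse_query($_REQUEST["search"],
                                    XapianQueryParser::FLAG_BOOLEAN);
```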

>                //count elements in each category
>                foreach($category as $cat_id => $cat_name) {

You don't appear to have posted a complete example here.  If you have,
then $category is undefined at this point.

>                        $cat_enq = new_enquire($db);

You might as well just reuse a single XapianEnquire object here.

>                        //parse and create the query
>                        $cat_query = 
> queryparser_parse_query($query_parser, $_REQUEST["search"] . " 
> category:" . $cat_id);

You shouldn't abuse the QueryParser like this.  Only use it for human
entered queries.  If you want to apply an automatic filter to a human
entered query, do something like:

    $query = $query_parser->parse_query($_REQUEST["search"]);
    $query = new XapianQuery(XapianQuery::OP_FILTER, $query,
                             new XapianQuery("XC" . $cat_id));

Trying to modify the user's query opens a big can of worms - for
example, they might enter `foo NOT' as the query, and your filter will
get applied inverted: `foo NOT category:bar'

A bonus in this case is that you can parse the user's query just once
outside the loop, and combine it with each filter in turn inside the
loop.
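
Put together with the earlier suggestion to reuse one XapianEnquire
object, the per-category loop might look something like the sketch
below.  The database path, the contents of $category and the $counts
array are made up for illustration, and the code assumes the PHP5 OO
wrappers are loaded with include "xapian.php":

```php
<?php
// Assumption: the PHP5 OO wrapper is on the include path.
include "xapian.php";

// Hypothetical database path and category list, for illustration only.
$db = new XapianDatabase("/path/to/database");
$category = array("1" => "Books", "2" => "Music");

$query_parser = new XapianQueryParser();
$query_parser->set_database($db);

// Parse the human-entered text exactly once, outside the loop.
$user_query = $query_parser->parse_query($_REQUEST["search"]);

// One enquire object, reused for every category.
$enquire = new XapianEnquire($db);

$counts = array();
foreach ($category as $cat_id => $cat_name) {
    // "XC" is the term prefix used in the snippet above.
    $filter = new XapianQuery("XC" . $cat_id);
    $enquire->set_query(
        new XapianQuery(XapianQuery::OP_FILTER, $user_query, $filter));

    // Only the estimated match count is kept for each category.
    $mset = $enquire->get_mset(0, 10);
    $counts[$cat_name] = $mset->get_matches_estimated();
}
```

The user's text never has the filter appended to it, so input such as
`foo NOT' can't change what the filter means.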

>                $query = queryparser_parse_query($query_parser, 
> $_REQUEST["search"] . $append);

Same issues here.

>                $item = mset_begin($result_set);
>                $total = mset_get_matches_estimated($result_set);
> 
>                $pages = ceil($total / $results_per_page);
> 
>                //reset array
>                $item = mset_begin($result_set);

I don't understand what "reset array" means here.  Calling mset_begin
twice like this is pointless since you don't touch $item from the first
call.  Should be harmless though, it's just wasted effort.

Cheers,
    Olly


