[Xapian-discuss] PHP Bindings - very slow iteration

Matei Pavel mateipavel at gmail.com
Fri Oct 16 09:55:40 BST 2009


And here is the OProfile log:

http://www.promotii-reviste.ro/oprofile-callgraph.txt





On Fri, Oct 16, 2009 at 10:07 AM, Matei Pavel <mateipavel at gmail.com> wrote:

> An interesting thing I found is that this only happens when the index has
> been updated in any way and only for the first search on a particular term.
> So:
>
> 1. index updated
> 2. search for 'word': 2 -> 7 seconds
> 3. search for 'word' again: 0.1 seconds
> 4. update index
> 5. search for 'word': 2 -> 7 seconds
>
> The times refer to the mset iterator get_document() loop.
>
> Any thoughts?
>
>
>
>
>
> On Thu, Oct 15, 2009 at 10:25 AM, Matei Pavel <mateipavel at gmail.com> wrote:
>
>> Hi Olly, thanks for your reply. I'm working on getting my sysadmin to
>> install oprofile since I don't have root access on the server. Here's the
>> PHP code though:
>>
>> I have a "search_controller" class that handles the user data for
>> searching and a "xapian_controller" that does everything xapian-related.
>> Inside the search controller, this is the piece of code that does the
>> search/loop:
>>
>> $core -> xapian_controller -> read_init();
>>
>> $this -> result_count = $core -> xapian_controller -> search($keyword);
>>
>> while($item = $core -> xapian_controller -> get_result()) {
>>     $this -> search_query -> add_result($item['id'], $item['relevance'], $item['site'], $item['date_published']);
>> }
>>
>> And here is the related code from xapian_controller:
>>
>> public function search($query) {
>>
>>     // Start an enquire session.
>>     $enquire = new XapianEnquire($this -> read_db);
>>
>>     $qp = new XapianQueryParser();
>>
>>     // stopwords
>>     $stopper = new XapianSimpleStopper();
>>     $stopwords = explode(',', config::$xapian['stopwords']);
>>     foreach($stopwords as $stopword)
>>         $stopper->add($stopword);
>>
>>     $qp->set_stopper($stopper);
>>     // prefixes
>>     foreach(config::$xapian['prefixes'] as $key => $prefix) {
>>         $qp -> add_prefix($key, $prefix);
>>     }
>>
>>     $qp -> set_database($this -> read_db);
>>     $query = $qp -> parse_query($query);
>>
>>     $enquire -> set_query($query);
>>     $matches = $enquire -> get_mset(0, $this -> read_db -> get_doccount());
>>
>>     $this -> matches = $matches;
>>     $this -> matches_iteration = $this -> matches -> begin();
>>
>>     return $matches -> get_matches_estimated();
>>
>> }
>>
>> public function get_result() {
>>
>>     if(!$this -> matches_iteration -> equals($this -> matches -> end())) {
>>         $doc = $this -> matches_iteration -> get_document();
>>         $item = array(
>>             'id'             => Xapian::sortable_unserialise($doc -> get_value(0)),
>>             'relevance'      => $this -> matches_iteration -> get_percent(),
>>             'site'           => Xapian::sortable_unserialise($doc -> get_value(3)),
>>             'date_published' => $doc -> get_value(1)
>>         );
>>
>>         $this -> matches_iteration -> next();
>>         return $item;
>>     } else {
>>         return false;
>>     }
>> }
>>
>>
>> Does this contain any obvious "don't do!"s ?
>>
>> Thank you.
>> Matt
>>
>>
>>
>>
>>
>> On Thu, Oct 15, 2009 at 1:27 AM, Olly Betts <olly at survex.com> wrote:
>>
>>> On Wed, Oct 14, 2009 at 11:03:26AM +0300, Matei Pavel wrote:
>>> > I'm using the latest Xapian release and PHP bindings and I only have about
>>> > 40,000 indexed documents, yet when I loop through the results and get
>>> > each result document with get_document(), it takes a really long time.
>>> >
>>> > For example, a query that only returns 200-300 results takes about 2.1 seconds
>>> > to complete the loop.
>>>
>>> It would be useful if you could show us a complete example PHP script which
>>> exhibits this behaviour.
>>>
>>> > Is there anything I can do to speed this up?
>>>
>>> Profile to find where the time is spent.  For some tips see:
>>>
>>> http://trac.xapian.org/wiki/ProfilingXapian
>>>
>>> Cheers,
>>>     Olly
>>>
>>
>>
>
