[Xapian-discuss] PHP Bindings - very slow iteration
Matei Pavel
mateipavel at gmail.com
Fri Oct 16 08:07:51 BST 2009
An interesting thing I found is that this only happens when the index has
been updated in any way and only for the first search on a particular term.
So:
1. index updated
2. search for 'word': 2 to 7 seconds
3. search for 'word' again: 0.1 seconds
4. update index
5. search for 'word': 2 to 7 seconds
The times refer to the MSet iterator get_document() loop.
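
A minimal sketch of how a loop like this could be timed in isolation, so the cold first run and the warm repeat run can be compared. The helper name time_result_loop() and the stand-in closure are illustrative assumptions and not part of Xapian or of the code quoted below; in the real application the closure would simply wrap the get_result() call.

```php
<?php
// Hypothetical helper: calls $step repeatedly until it returns false,
// then reports how many items were produced and how long the loop took.
function time_result_loop(callable $step) {
    $count = 0;
    $start = microtime(true);
    while (($item = $step()) !== false) {
        $count++;
    }
    $elapsed = microtime(true) - $start;
    return array('count' => $count, 'seconds' => $elapsed);
}

// Stand-in for the real get_result() call so the sketch runs without
// Xapian installed: it hands out three fake items, then returns false.
$items = array(array('id' => 1), array('id' => 2), array('id' => 3));
$fake_get_result = function () use (&$items) {
    $next = array_shift($items);   // null once the array is empty
    return $next === null ? false : $next;
};

$timing = time_result_loop($fake_get_result);
printf("%d results in %.4f seconds\n", $timing['count'], $timing['seconds']);
```

Running the same measurement once right after an index update and once again for the same term would give the two numbers being compared above.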
Any thoughts?
On Thu, Oct 15, 2009 at 10:25 AM, Matei Pavel <mateipavel at gmail.com> wrote:
> Hi Olly, thanks for your reply. I'm working on getting my sysadmin to
> install oprofile since I don't have root access on the server. Here's the
> PHP code though:
>
> I have a "search_controller" class that handles the user data for searching
> and a "xapian_controller" that does everything xapian-related. Inside the
> search controller, this is the piece of code that does the search/loop:
>
> $core -> xapian_controller -> read_init();
>
> $this -> result_count = $core -> xapian_controller -> search($keyword);
>
> while($item = $core -> xapian_controller -> get_result()) {
>     $this -> search_query -> add_result($item['id'],
>         $item['relevance'], $item['site'], $item['date_published']);
> }
>
> And here is the related code from xapian_controller:
>
> public function search($query) {
>
>     // Start an enquire session.
>     $enquire = new XapianEnquire($this -> read_db);
>
>     $qp = new XapianQueryParser();
>
>     // stopwords
>     $stopper = new XapianSimpleStopper();
>     $stopwords = explode(',', config::$xapian['stopwords']);
>     foreach($stopwords as $stopword)
>         $stopper->add($stopword);
>
>     $qp->set_stopper($stopper);
>     // prefixes
>     foreach(config::$xapian['prefixes'] as $key => $prefix) {
>         $qp -> add_prefix($key, $prefix);
>     }
>
>     $qp -> set_database($this -> read_db);
>     $query = $qp -> parse_query($query);
>
>     $enquire -> set_query($query);
>     $matches = $enquire -> get_mset(0, $this -> read_db -> get_doccount());
>
>     $this -> matches = $matches;
>     $this -> matches_iteration = $this -> matches -> begin();
>
>     return $matches -> get_matches_estimated();
>
> }
>
> public function get_result() {
>
>     if(!$this -> matches_iteration -> equals($this -> matches -> end())) {
>         $doc = $this -> matches_iteration -> get_document();
>         $item = array(
>             'id' => Xapian::sortable_unserialise($doc -> get_value(0)),
>             'relevance' => $this -> matches_iteration -> get_percent(),
>             'site' => Xapian::sortable_unserialise($doc -> get_value(3)),
>             'date_published' => $doc -> get_value(1)
>         );
>
>         $this -> matches_iteration -> next();
>         return $item;
>     } else {
>         return false;
>     }
> }
>
>
> Does this contain any obvious "don't do!"s?
>
> Thank you.
> Matt
>
>
>
>
>
> On Thu, Oct 15, 2009 at 1:27 AM, Olly Betts <olly at survex.com> wrote:
>
>> On Wed, Oct 14, 2009 at 11:03:26AM +0300, Matei Pavel wrote:
>> > I'm using the latest Xapian release and PHP bindings and I only have
>> > about 40,000 indexed documents, yet when I loop through the results and
>> > get each result document with get_document(), it takes a really long
>> > time.
>> >
>> > For example, a query that only returns 200-300 results takes about 2.1
>> > seconds to complete the loop.
>>
>> It would be useful if you could show us a complete example PHP script
>> which exhibits this behaviour.
>>
>> > Is there anything I can do to speed this up?
>>
>> Profile to find where the time is spent. For some tips see:
>>
>> http://trac.xapian.org/wiki/ProfilingXapian
>>
>> Cheers,
>> Olly
>>
>
>