[Xapian-discuss] writing match deciders / custom handling of terms

Torsten Foertsch torsten.foertsch at gmx.net
Tue Nov 11 09:18:11 GMT 2008


On Tue 11 Nov 2008, Oliver Flimm wrote:
> I've written all this in my project wiki, *but* in German ;-)
>
> http://wiki.openbib.org/index.php?title=Einführung_in_das_Xapian_Perl-API

Just a few questions about that. Why do you use String::Tokenizer for 
indexing instead of the Xapian term generator? For searching you then 
use Xapian::QueryParser. To my knowledge, the query parser and the term 
generator used for indexing must fit together, for example in stemming 
and term prefixes. Otherwise you may end up with terms in your database 
that cannot be searched for, right? I am quite new to Xapian, so please 
forgive me if I am wrong.
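
A minimal sketch of what "fitting together" could look like with the Perl 
bindings, assuming a Search::Xapian version that ships 
Search::Xapian::TermGenerator. The in-memory database, the English 
stemmer, and the sample text are illustrative assumptions, not taken from 
the wiki page:

```perl
use strict;
use warnings;
use Search::Xapian qw(:all);

# In-memory database (assumption: chosen so the sketch leaves nothing on disk).
my $db = Search::Xapian::WritableDatabase->new();

# One stemmer object shared by the indexing and the searching side.
my $stemmer = Search::Xapian::Stem->new('english');

# Indexing side: let the term generator produce the terms.
my $doc = Search::Xapian::Document->new();
$doc->set_data('Searching large databases');
my $indexer = Search::Xapian::TermGenerator->new();
$indexer->set_stemmer($stemmer);
$indexer->set_document($doc);
$indexer->index_text('Searching large databases');
$db->add_document($doc);

# Searching side: the query parser uses the same stemmer.
my $qp = Search::Xapian::QueryParser->new();
$qp->set_stemmer($stemmer);
$qp->set_stemming_strategy(STEM_SOME);
$qp->set_database($db);
my $query = $qp->parse_query('searched database');

my $enq = Search::Xapian::Enquire->new($db);
$enq->set_query($query);
my $mset = $enq->get_mset(0, 10);
printf "query: %s, hits: %d\n", $query->get_description, $mset->size;
```

Because both sides stem the same way, the differently inflected query 
words should still reach the indexed document. With a separate tokenizer 
on the indexing side, that correspondence has to be maintained by hand.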

Then you say one will get up to 100000 matches with "my @matches = 
$enq->matches(0,99999);" (strictly speaking 99999, since the second 
argument is the maximum number of items). While that is true, the next 
statement is not, as far as I know. Again, please forgive me if I am 
wrong. You say the real match count can be obtained with "my 
$fullresultcount = scalar(@matches);". But that statement simply returns 
the length of the list, does it not? So $fullresultcount can never 
exceed the requested maximum. What if your matching set is larger?

I don't know if it is the way to go but I do the following to get the 
estimated match count:

  tie my @matches, 'Search::Xapian::MSet::Tied', $enq->get_mset(0,3);
  my $estimated_matches=(tied @matches)->get_matches_estimated;
  my $upper_bound=(tied @matches)->get_matches_upper_bound;
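
The same numbers can presumably also be read from the MSet object itself, 
without the tie. The following is a small sketch under stated 
assumptions (an in-memory database with five made-up documents that all 
carry the term "xapian"), meant only to show that the size of the 
returned page and the estimated total are different things:

```perl
use strict;
use warnings;
use Search::Xapian;

# Hypothetical test data: five documents, each indexed with one term.
my $db = Search::Xapian::WritableDatabase->new();
for my $n (1 .. 5) {
    my $doc = Search::Xapian::Document->new();
    $doc->set_data("document $n");
    $doc->add_term('xapian');
    $db->add_document($doc);
}

my $enq = Search::Xapian::Enquire->new($db);
$enq->set_query(Search::Xapian::Query->new('xapian'));

# Ask for at most three items, although five documents match.
my $mset = $enq->get_mset(0, 3);

my $page_size = $mset->size;                      # items actually returned
my $estimated = $mset->get_matches_estimated;     # estimate of all matches
my $lower     = $mset->get_matches_lower_bound;
my $upper     = $mset->get_matches_upper_bound;

print "returned $page_size, estimated $estimated ",
      "(between $lower and $upper)\n";
```

The page size is capped by the second argument to get_mset, while the 
estimate and its bounds describe the whole matching set.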

Torsten

--
Need professional mod_perl support?
Just hire me: torsten.foertsch at gmx.net


