[Xapian-discuss] use TermGenerator to get the term

Ying Liu liux0395 at umn.edu
Mon Nov 2 18:31:15 GMT 2009


Hi Olly,

Thanks for your reply.

I asked because I hadn't understood that Xapian is an information retrieval
system: I can start from a word and get its positions, but I can't go in the
opposite direction, just like a one-way street. And that is the reason it is
so fast.
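
To illustrate the one-way street, here is a minimal pure-Perl sketch of a toy inverted index. It is not Xapian code; the names %positions_of and term_at are made up for this example. Looking up a term's positions is a single hash access, while finding the term at a position means walking every term.

```perl
use strict;
use warnings;

# Toy inverted index: each term maps to the list of positions where it occurs.
my @words = split ' ', lc "hello Xapian world hello";
my %positions_of;
my $pos = 0;
push @{ $positions_of{$_} }, ++$pos for @words;

# Forward lookup (term -> positions) is one hash access.
print "hello: @{ $positions_of{hello} }\n";

# Reverse lookup (position -> term) has no direct path, so scan every term.
sub term_at {
    my ($index, $wanted) = @_;
    for my $term (sort keys %$index) {
        return $term if grep { $_ == $wanted } @{ $index->{$term} };
    }
    return;    # no term at that position
}

print "position 2: ", term_at(\%positions_of, 2), "\n";
```

The scan in term_at grows with the number of terms, which is why an index built for term lookups does not offer the reverse direction cheaply.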

For the PositionIterator, positionlist_begin() and positionlist_end() 
will point to the same place only when the list is empty.
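
A rough sketch of how the position-to-term mapping could still be rebuilt for a single document: walk every term and read its position list. This is an illustration rather than code from the thread. It assumes the Search::Xapian module is installed and that no stemmer is set on the TermGenerator; %term_at is a name introduced here.

```perl
use strict;
use warnings;
use Search::Xapian;

# Index the text into a document so the generated terms can be inspected.
my $doc      = Search::Xapian::Document->new;
my $analyzer = Search::Xapian::TermGenerator->new;
$analyzer->set_document($doc);
$analyzer->index_text("hello Xapian world");

# Walk every term, then every position of that term, and invert the mapping.
my %term_at;
my $term_itor = $doc->termlist_begin();
while ($term_itor ne $doc->termlist_end()) {
    my $term     = $term_itor->get_termname();
    my $pos_itor = $term_itor->positionlist_begin();
    # For a term without positions, begin equals end and this loop is skipped.
    while ($pos_itor ne $term_itor->positionlist_end()) {
        $term_at{ $pos_itor->get_termpos() } = $term;
        ++$pos_itor;
    }
    ++$term_itor;
}

# Print the terms in position order instead of alphabetical order.
print "$_: $term_at{$_}\n" for sort { $a <=> $b } keys %term_at;
```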

I have learned a lot from the archived emails. Thank you very much!

-Ying


Olly Betts wrote:
> On Thu, Oct 29, 2009 at 02:26:58PM -0500, Ying Liu wrote:
>   
>> I have a question about using the TermGenerator on its own from Perl.
>> Someone asked this question before, and his code is in C++
>> (http://lists.xapian.org/pipermail/xapian-discuss/2008-November/006109.html).
>> My code is in Perl. Can I get the term at a given position using just the
>> TermGenerator?
>>
>>    my $analyzer = Search::Xapian::TermGenerator->new;
>>    $analyzer->index_text("hello Xapian world");
>>    my $curr_position = $analyzer->get_termpos();
>>    $analyzer->set_termpos(2);
>>    $curr_position = $analyzer->get_termpos();
>>    $analyzer->increase_termpos(1);
>>    $curr_position = $analyzer->get_termpos();
>>
>> If I set the document and then use $doc to iterate over the term list, the
>> terms are ordered alphabetically. I don't know how to use the positer().
>>
>> $analyzer->set_document($doc);
>> my $termlist_begin = $doc->termlist_begin();
>> $termlist_begin++;
>> my $term = $termlist_begin->get_document();
>>     
>
> Um, that line is wrong as TermIterator doesn't have a get_document()
> method.  I think you mean:
>
>     my $term = $termlist_begin->get_termname();
>
> To iterate over the positions for TermIterator $term_itor you'd do:
>
>     my $pos_itor = $term_itor->positionlist_begin();
>     while ($pos_itor ne $term_itor->positionlist_end()) {
>         print $pos_itor->get_termpos(), "\n";
>         ++$pos_itor;
>     }
>
>   
>> Btw, if I want to use ESet, how do I assign the text? Is there a method
>> similar to index_text for ESet?
>>     
>
> The ESet is generated from the same terms used for searching.
>
> Cheers,
>     Olly
>   



