[Xapian-discuss] New to Xapian (coming from Lucene)

Alexander Lind malte at webstay.org
Fri Apr 13 16:55:28 BST 2007


Jeff Anderson wrote:
> Perhaps that was because the Lucene docs were more accessible than the
> Xapian docs. Sorry to sound mean, but the fact is I found Lucene to be
> much easier to understand and to actually use.
I agree that Xapian's docs are pretty hard to follow sometimes. I
hope to contribute some documentation to the wiki once I
understand things more thoroughly.
>
>> Terms come in two forms: postings (which have positional information)
>> and "plain" terms (which don't). So you can do:
>
> I still don't understand the need to specify a numeric position,
> unless this determines some kind of "boost" on the term. In Lucene and
> Kinosearch, the coder can refer to items by a key name, and be able to
> retrieve pieces of data to display in the search results by those key
> names. Leave the numbers to the mathematicians! :P
Are you sure you are not talking about values here? You can store a
value at, say, number 0, which is always, say, a product_id (linking the
item to something in a database, perhaps). Then number 1 could be the
price of an item, and number 2 could be the weight.
The numbers are not boosts; they are just there to let you establish a
structure of values. You can use these values to sort results by price,
size, popularity or whatnot.
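
To make the numbering idea concrete, here is a small sketch in plain Python. It is only a toy model and not the Xapian API (in Xapian itself the corresponding calls are Document::add_value and Enquire::set_sort_by_value). The names make_doc and sort_by_slot, the SLOT_* constants and the sample products are all made up for illustration.

```python
# Toy model of numbered value slots; this is NOT the Xapian API.
# The numbering (0 = product_id, 1 = price, 2 = weight) follows the
# example in the message; everything else is hypothetical.
SLOT_PRODUCT_ID = 0
SLOT_PRICE = 1
SLOT_WEIGHT = 2


def make_doc(product_id, price, weight):
    """Return a document as a dict mapping value number -> stored value."""
    return {
        SLOT_PRODUCT_ID: product_id,
        SLOT_PRICE: price,
        SLOT_WEIGHT: weight,
    }


def sort_by_slot(docs, slot, reverse=False):
    """Order documents by the value stored under one number."""
    return sorted(docs, key=lambda doc: doc[slot], reverse=reverse)


# Made-up sample data.
docs = [
    make_doc(101, 19.99, 2.5),
    make_doc(102, 4.50, 0.3),
    make_doc(103, 12.00, 1.1),
]

# Cheapest first, then heaviest first, reported by product_id.
by_price = [d[SLOT_PRODUCT_ID] for d in sort_by_slot(docs, SLOT_PRICE)]
by_weight_desc = [
    d[SLOT_PRODUCT_ID] for d in sort_by_slot(docs, SLOT_WEIGHT, reverse=True)
]
```

The point is that the number only says which field you mean; it has no effect on how a document is weighted.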

As for the postings: if you want to be able to do a phrase search,
Xapian has to index all the words in the text that you want to
phrase-search, and hence has to store positional information about each
word in that text. But postings and terms are not the same as values.
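
Why positions matter for phrase search can be sketched the same way. Again this is a toy model in plain Python and not how Xapian is implemented (in Xapian the distinction is between Document::add_posting, which records a position, and Document::add_term, which does not). The functions index_text and phrase_match and the sample sentence are hypothetical.

```python
from collections import defaultdict


def index_text(text):
    """Map each lowercased word to the list of positions where it occurs."""
    positions = defaultdict(list)
    for pos, word in enumerate(text.lower().split()):
        positions[word].append(pos)
    return positions


def phrase_match(positions, phrase):
    """Return True if the words of `phrase` occur at consecutive positions."""
    words = phrase.lower().split()
    if not words:
        return False
    # Try every occurrence of the first word as a possible phrase start,
    # then require each later word at exactly start + its offset.
    return any(
        all(start + i in positions.get(word, ()) for i, word in enumerate(words))
        for start in positions.get(words[0], ())
    )


# Made-up sample text.
idx = index_text("the quick brown fox jumps over the lazy dog")

# "Plain" terms only record that a word is present, not where.
plain_terms = set(idx)

in_order = phrase_match(idx, "quick brown fox")
out_of_order = phrase_match(idx, "brown quick")
second_the = phrase_match(idx, "the lazy")
```

With only plain_terms you can tell that "quick" and "brown" both occur, but not that one directly follows the other; that is the information the positions add.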

IMHO you don't need to be a mathematician to use either of these techniques.

Alec


