[Xapian-discuss] New to Xapian (coming from Lucene)

James Aylett james-xapian at tartarus.org
Fri Apr 13 17:25:47 BST 2007


On Fri, Apr 13, 2007 at 10:55:36AM -0400, Jeff Anderson wrote:

> Perhaps that was because the Lucene docs were more accessible than the
> Xapian docs. Sorry to sound mean, but the fact is i found Lucene to be
> much easier to understand and to actually use.

I agree, there's some work that needs doing on improving Xapian's
documentation. There are plans, they just need some time to implement :-/

> >Terms come in two forms: postings (which have positional information)
> >and "plain" terms (which don't). So you can do:
> 
> I still don't understand the need to specify a numeric position,
> unless this determines some kind of "boost" on the term. In Lucene and
> Kinosearch, the coder can refer to items by a key name, and be able to
> retrieve pieces of data to display in the search results by those key
> names. Leave the numbers to the mathematicians! :P

No, positions aren't for this at all. Let me give you a concrete
example. Terms are followed, in brackets, by their position (I've
assumed no stemming, but I've lowercased all terms):

i(1) am(2) going(3) to(4) the(5) zoo(6)

The position is the position of the term within the wider text it came
from. It is used for phrase and proximity matching, so you can (say)
search for "going to the" as three consecutive terms, or "zoo" within
four terms of "going" or something.
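
To make that concrete, here is a rough sketch in plain Python. It is a
toy model, not Xapian's API or implementation: the names build_postings,
phrase_match and near are made up for illustration, and the exact window
semantics of a real proximity operator may differ from this.

```python
from collections import defaultdict


def build_postings(text):
    """Map each lowercased term to the 1-based positions where it occurs."""
    postings = defaultdict(list)
    for pos, term in enumerate(text.lower().split(), start=1):
        postings[term].append(pos)
    return postings


def phrase_match(postings, terms):
    """True if the terms occur at consecutive positions, in the given order."""
    return any(
        all(start + i in postings.get(term, ()) for i, term in enumerate(terms))
        for start in postings.get(terms[0], ())
    )


def near(postings, a, b, window):
    """True if some occurrence of a is at most `window` positions from b."""
    return any(
        abs(pa - pb) <= window
        for pa in postings.get(a, ())
        for pb in postings.get(b, ())
    )


postings = build_postings("I am going to the zoo")
print(postings["zoo"])                                 # [6]
print(phrase_match(postings, ["going", "to", "the"]))  # True
print(phrase_match(postings, ["going", "the", "to"]))  # False
print(near(postings, "zoo", "going", 4))               # True
```

Note that the positions are only ever compared with each other. They are
never used as a key for fetching anything back out.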

Positions and terms have *nothing* to do with the information that you
want to look up once a document has been indexed. For that, you put
something in the document data (which is where you don't like how
Xapian operates).
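
As a sketch of that split, again in plain Python rather than real Xapian
code: ToyDocument is a hypothetical stand-in, and using JSON for the
document data is purely my assumption here, since the data is an opaque
string and the format is up to you. In Xapian itself the corresponding
calls on a Document are add_posting() for the terms and set_data() /
get_data() for the data.

```python
import json


class ToyDocument:
    """Hypothetical stand-in for an indexed document (not a Xapian class)."""

    def __init__(self):
        self.postings = {}  # term -> positions; used only for matching
        self.data = ""      # opaque string; handed back with each result


fields = {"title": "A day out", "body": "I am going to the zoo"}

doc = ToyDocument()

# Index the body text as positional terms, so it can be searched.
for pos, term in enumerate(fields["body"].lower().split(), start=1):
    doc.postings.setdefault(term, []).append(pos)

# Store whatever you want to display, keyed by name, in the document data.
doc.data = json.dumps(fields)

# At display time, the key names come from the data, not from the terms.
shown = json.loads(doc.data)
print(shown["title"])
```

So the key names you are used to live in whatever format you choose for
the data, and the terms stay purely a matching mechanism.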

We have discussed having management of some reasonable data format
that gives you fields built into the library, but we never agreed
quite what it should look like. (I *think* we're agreed it wouldn't be
a bad idea, these days.)

Looking quickly at Kino, it would be entirely possible to build
something similar on top of Xapian. Similar things have been done in
the past (Xapwrap comes to mind). However if Kino does what you want
right now, you're probably better off using that.

J

-- 
/--------------------------------------------------------------------------\
  James Aylett                                                  xapian.org
  james at tartarus.org                               uncertaintydivision.org


