[Xapian-discuss] New to Xapian (coming from Lucene)

Oliver Flimm flimm at sigtrap.de
Fri Apr 13 19:05:27 BST 2007


Hi Jeff,

On Fri, Apr 13, 2007 at 10:55:36AM -0400, Jeff Anderson wrote:
> >The document data is simply a chunk of text or binary data which is
> >stored alongside the document. It is most commonly used to implement
> >what we call FIELDS as name-value pairs ... Xapian doesn't provide
> >any support itself for doing this, but in Perl it's pretty easy to do (you
> >could, in fact, serialise a Perl hash to make life really easy).
> 
> And this is what really gets to me. I just don't agree with this
> philosophy, as I have already stated a few times previously. As I also

I don't see this as a problem because it can be handled quite easily in
Perl. In fact I find this 'container' approach much more flexible and
'natural' to use in Perl than numbered fields. In my own Open Source project
OpenBib (see my previous mail) I use this container to store hashes of
arrays of hashes serialized with Perl's Storable module. In it I store
bibliographic data like title, authors, institutions, year of publication
etc. It's very convenient to retrieve this data structure and just use it.
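
To make the round trip concrete, here is a rough sketch of the idea. It is
not OpenBib's actual code: it is written in Python, with pickle standing in
for Perl's Storable, and the field names and values are made up for
illustration. With Xapian, the serialized bytes would be stored through the
document's set_data() and read back through get_data().

```python
import pickle

# Hypothetical bibliographic record: a hash of arrays of hashes.
# Field names and values are illustrative assumptions only.
record = {
    "title": [{"content": "An Example Title"}],
    "authors": [{"content": "Doe, Jane"}, {"content": "Roe, Richard"}],
    "year": [{"content": "2007"}],
}


def pack(rec):
    """Serialize the nested record to bytes for the document data slot."""
    return pickle.dumps(rec)


def unpack(blob):
    """Restore the nested structure from the stored bytes."""
    return pickle.loads(blob)


blob = pack(record)       # what would be handed to set_data()
restored = unpack(blob)   # what would be rebuilt from get_data()

# The structure can be used directly after de-serializing.
print(restored["authors"][1]["content"])
```

The point is only that the application, not the search engine, decides what
the container looks like; any serializer that round-trips nested data would
do.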

Another advantage of this approach is the ability to use the same container
with different search backends. In OpenBib I can use Xapian as a search
backend as well as data in a MySQL database. Only the code to search and
retrieve the containers differs, but after that I just use the hash (of
arrays of hashes) I get after de-serializing. Very easy, very neat.
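
A minimal sketch of that separation, under loud assumptions: Python with
pickle instead of Perl with Storable, a plain dict standing in for the
search index, SQLite standing in for MySQL, and invented table, function and
field names. Only the two fetch functions know about their backend; the
formatting code sees the same structure either way.

```python
import pickle
import sqlite3

# Hypothetical record; field names and values are illustrative only.
record = {
    "title": [{"content": "An Example Title"}],
    "authors": [{"content": "Doe, Jane"}, {"content": "Roe, Richard"}],
    "year": [{"content": "2007"}],
}
blob = pickle.dumps(record)

# Backend 1: a dict keyed by document id, standing in for a search index.
index_store = {1: blob}


def fetch_from_index(docid):
    """Retrieve and de-serialize the container from the index stand-in."""
    return pickle.loads(index_store[docid])


# Backend 2: an in-memory SQLite table, standing in for a MySQL table.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE containers (id INTEGER PRIMARY KEY, data BLOB)")
db.execute("INSERT INTO containers (id, data) VALUES (?, ?)", (1, blob))


def fetch_from_db(docid):
    """Retrieve and de-serialize the container from the database stand-in."""
    row = db.execute(
        "SELECT data FROM containers WHERE id = ?", (docid,)
    ).fetchone()
    return pickle.loads(row[0])


def format_entry(rec):
    """Backend-independent code: works on the de-serialized structure."""
    authors = "; ".join(a["content"] for a in rec["authors"])
    title = rec["title"][0]["content"]
    year = rec["year"][0]["content"]
    return f"{authors}: {title} ({year})"


print(format_entry(fetch_from_index(1)))
print(format_entry(fetch_from_db(1)))
```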

Just my 0.02 EUR.

Regards,

Oliver

-- 
!- Oliver Flimm - Cologne/Germany | flimm at sigtrap.de | http://www.sigtrap.de/ -!
!       The Ten Commandments have 279 words, the American Declaration of       !
!    Independence has 300 words. The EU regulation on the import of caramel    !
!--------------------------  candies has 25911 words  -------------------------!


