[Xapian-discuss] Practical example/explanation using an existing
database
Jim
jim at fayettedigital.com
Tue Jul 24 12:52:18 BST 2007
Edwin Smulders wrote:
> Hi,
>
> I'm reading up on the usage of Xapian to find out if we can use it for
> Wine's Application Database, and I'm having a bit of trouble seeing
> the general picture. I could use some practical information (through
> words or code) on how to search an existing (mysql) database.
>
> As far as I can tell it can be used with a mysql db, and I read that
> Xapian first makes an index (in its own database/tables ?) and then
> searches through that index. Now a few questions come to mind and I
> couldn't find the answers in the documentation.
As Alexander said, Xapian is just a library that allows you to build and
search an index. Scriptindex is a program built on Xapian that takes two
inputs: a data file containing sets of text fields, and an index file
that says what to do with each field. From these it builds a Xapian
database. Each set of fields in the input would (presumably) represent a
row in your database, and one of the fields should contain a unique value
(like an ID) that would be used to fetch the row after a search.
For example, if we had a MySQL db with the following schema:
id bigint
lastname varchar(40)
firstname varchar(40)
address varchar(80)
You could write a simple script/program to extract the data from the db
and create an input file for scriptindex that looks like:
id=0
lastname=Brown-White
firstname=John
address=123 Main Street
=anywhere,
=VA
=22222
=USA

id=1
lastname=Johnson
firstname=Jack
=aroo
address=234 Story Lane
=somewhere,
=WA
=09876-0988
=USA
Etc. Each block represents a record fetched from the db.
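The extraction script can be quite small. Here is a rough Python sketch of the formatting half, assuming the rows have already been fetched from the db as dicts (the helper names and the sample rows are made up for illustration; in practice the rows would come from whatever MySQL client you use). It relies on two scriptindex input conventions: continuation lines of a field start with "=", and a blank line separates records.

```python
def to_scriptindex_record(row):
    """Format one row (a dict of column name -> value) as one record."""
    lines = []
    for field, value in row.items():
        # A value spanning several lines continues on lines starting with "=".
        first, *rest = str(value).split("\n")
        lines.append(f"{field}={first}")
        lines.extend(f"={part}" for part in rest)
    return "\n".join(lines)


def to_scriptindex_dump(rows):
    """Join the records with a blank line between them."""
    return "\n\n".join(to_scriptindex_record(r) for r in rows) + "\n"


# Hypothetical rows, standing in for the result of a SELECT on the table.
rows = [
    {"id": 0, "lastname": "Brown-White", "firstname": "John",
     "address": "123 Main Street\nanywhere,\nVA\n22222\nUSA"},
    {"id": 1, "lastname": "Johnson", "firstname": "Jack",
     "address": "234 Story Lane\nsomewhere,\nWA\n09876-0988\nUSA"},
]
print(to_scriptindex_dump(rows), end="")
```

Writing that output to a file gives you the data file for scriptindex.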
The index file might look like:
id : field boolean=Q unique=Q
lastname: text
firstname: text
address: text
Scriptindex would then read both files (the data file and the index file)
and create a searchable database that Omega could read.
Omega would then be called via something like:
omega?P=lastname:Johnson%20AND%20firstname:Roger
or
omega?P=address:23123 (to find all the people in zip 23123)
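If you build these URLs from a program, let a library do the escaping rather than typing %20 by hand. A small Python sketch, assuming the same "omega" path and P parameter as in the examples above (omega_url is a made-up helper name, and it only handles the simple case of ANDing field:value terms together):

```python
from urllib.parse import quote, urlencode


def omega_url(base, **fields):
    """Build a query URL that ANDs together one field:value term per field."""
    query = " AND ".join(f"{name}:{value}" for name, value in fields.items())
    # quote_via=quote encodes spaces as %20; safe=":" keeps the colons readable.
    return f"{base}?{urlencode({'P': query}, safe=':', quote_via=quote)}"


print(omega_url("omega", lastname="Johnson", firstname="Roger"))
```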
What omega returns will probably have to be interpreted by a program
that actually goes to the mysql db and fetches the row and formats it in
the way you want the data presented.
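That last step could look roughly like the following Python sketch. It assumes you have already pulled the matching ids out of whatever Omega returned (how you do that depends on your setup, so it is not shown), and it uses an in-memory SQLite table called "people" as a stand-in for the MySQL db so the example runs on its own. The table name and the fetch_rows helper are invented for illustration; with a MySQL driver the placeholder style will likely differ.

```python
import sqlite3


def fetch_rows(conn, ids):
    """Fetch the full rows for the ids a search returned, in the given order."""
    if not ids:
        return []
    placeholders = ",".join("?" for _ in ids)
    cur = conn.execute(
        "SELECT id, lastname, firstname, address FROM people "
        f"WHERE id IN ({placeholders})",
        ids,
    )
    by_id = {row[0]: row for row in cur}
    # Keep the order of the search results; skip ids no longer in the db.
    return [by_id[i] for i in ids if i in by_id]


# Stand-in database with the example schema (hypothetical table name).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people (id INTEGER PRIMARY KEY, lastname TEXT, "
    "firstname TEXT, address TEXT)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        (0, "Brown-White", "John", "123 Main Street"),
        (1, "Johnson", "Jack", "234 Story Lane"),
    ],
)
for row in fetch_rows(conn, [1, 0]):
    print(row)
```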
This is not the best way to index the data, but I left out a lot since
you wanted the concept rather than the details. For simplicity I used
the same field names in Scriptindex as in the database, but that is not
necessary.
Jim.
>
> Firstly, how exactly does the indexing work in regard to telling
> Xapian what to search through? Do we write an SQL query returning all
> the data we want indexed? or maybe do we tell it what tables/columns
> to index (ie. does it generate queries?)
> And how is the index updated, a regular rescan or an update whenever
> data in our system updates?
>
> The other question that came to mind is, once everything is indexed,
> how is the data returned on a search? This is best explained in an
> example:
> If a user would be entering a search term and I (the programmer)
> want to search the database, can I specifically tell Xapian to search
> in for example the application names, or the descriptions, or both?
By using the "field:" syntax you may search any field(s) that you want.
>
> I hope somebody can clarify this for me, right now it all looks quite
> difficult to implement.
Not at all.
>
>
> Edwin Smulders
>
> _______________________________________________
> Xapian-discuss mailing list
> Xapian-discuss at lists.xapian.org
> http://lists.xapian.org/mailman/listinfo/xapian-discuss
>