[Xapian-discuss] Indexing text files with Xapian

Olly Betts olly at survex.com
Sun Oct 19 14:28:43 BST 2008


On Sat, Oct 18, 2008 at 09:02:58PM +0200, Justine Demeyer wrote:
> So, I'm on Ubuntu and I use the Java bindings. I have to index a large
> number of text files and then run some searches on them. So, my question
> is: is it possible to "ask" Xapian to index the content of all the files
> and then ask it to return which files have a certain word in their content?

The Xapian API doesn't provide any functionality to recurse or iterate
directories, or open files and load their contents (such APIs already
exist, so why reinvent the wheel?).  But once you've read a file, you can
pass its contents to Xapian to index and then later search the database
to find matching files.
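
As a rough sketch of the part Xapian leaves to you, something like the
following walks a directory tree with the standard java.nio.file API and
hands each file's path and text to a callback.  The class name
TextFileFeeder, the feed method and the callback are made up for
illustration and are not part of Xapian; the callback is where your own
indexing code would go.  The sketch assumes the files are UTF-8 text and
small enough to read into memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Xapian: it only finds and reads files.
final class TextFileFeeder {

    /**
     * Walks root recursively and passes each regular file's path and
     * contents to indexer.  Returns the number of files handed over.
     */
    static int feed(Path root, BiConsumer<Path, String> indexer) throws IOException {
        List<Path> files;
        // Files.walk returns a lazy stream that must be closed after use.
        try (Stream<Path> paths = Files.walk(root)) {
            files = paths.filter(Files::isRegularFile)
                         .sorted()
                         .collect(Collectors.toList());
        }
        for (Path file : files) {
            // Assumption: the files are UTF-8 text.
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
            // Your Xapian indexing code would be called from here.
            indexer.accept(file, text);
        }
        return files.size();
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Stand-in callback that just reports what it was given.
        int count = feed(root, (path, text) ->
                System.out.println(path + ": " + text.length() + " chars"));
        System.out.println(count + " files read");
    }
}
```

If you store the file's path alongside each document when indexing, a
later search can report which files matched.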

Or just use the omindex program from Omega to do the indexing, and your
own Java code to search.

> I suppose it is possible, but I don't know how to do that. Is there someone
> who can help me, give me some pointers, some examples, or some
> documentation?

The C++ API documentation is the best reference, except that method
names are mapped to match Java conventions (e.g. get_mset -> getMSet).
The Java wrappers are lagging a bit currently, because they're
hand-coded JNI which is painful to update.  I've been slowly working
on some new wrappers based on SWIG, but haven't had time to work
on them for a while I'm afraid.

There are some Java examples in the source code - you can find them
online here:

http://trac.xapian.org/browser/trunk/xapian-bindings/java/org/xapian/examples

Cheers,
    Olly


