[Xapian-discuss] Python Binding - match[xapian.MSET_DOCUMENT].get_data() doesn't return anything!
jarrod roberson
jarrod.roberson at gmail.com
Thu Jan 19 18:44:20 GMT 2006
I am working on creating an OS X Spotlight-like application.
The first task is to index fully qualified paths. I want to be able to
search for filenames first, as an exercise in learning Xapian and the
Python bindings.
I tried using Xapwrap by divmod.org, but that didn't pan out: I could not
get the actual data back after a search. A search would return a document
uid, but I could never get .get_document().get_data() to return anything.
So I decided to just use the "raw" Python bindings provided, and tried the
simpleindex and simplesearch Python example programs.
I think in both cases (Xapwrap and the default Xapian bindings) indexing is
happening, but I can't really tell because I can't get any search results
to confirm anything.
When I tried the Xapian Python bindings directly, I couldn't get the
search to work. Granted, the simplesearch example program is broken, so I am
kind of groping in the dark on how to get the search to return a list of
documents and have get_data() actually return something.
I guess what I need is some simple example code that will allow me to do the
following. Given some data like
/this/is/a/fully/qualified/path/to/a/filename
how do I create a document and add it to an index so that I can search for
it by 'filename'?
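To make the goal concrete, here is a minimal sketch of what "searchable by
'filename'" could mean, assuming each component of the path becomes a term
attached to the document. path_terms is a hypothetical helper, not part of
Xapian, and lowercasing the components is likewise an assumption.

```python
def path_terms(path):
    """Hypothetical helper: split a path into lowercase candidate index terms."""
    # Drop the trailing newline, split on '/', and discard empty components
    # (the leading '/' produces one).
    parts = [p for p in path.strip().split('/') if p]
    terms = []
    for part in parts:
        term = part.lower()
        # Keep each term once, in first-seen order.
        if term not in terms:
            terms.append(term)
    return terms


terms = path_terms('/this/is/a/fully/qualified/path/to/a/filename\n')
print(terms)
```

Each term would then be attached to the document before it is added to the
index, for example with doc.add_term(term) from the Xapian bindings, so that
a query for 'filename' has something to match against.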
This is what I am doing to create documents and add them to the index:
#!/usr/bin/python
# indexer.py
import sys
import xapian

# setup the file to index
fileToIndex = sys.argv[1]
if len(sys.argv) >= 3:
    maxRecordsToIndex = int(sys.argv[2])
else:
    maxRecordsToIndex = 0
recordCount = -1

# setup the xapian database
try:
    db = xapian.WritableDatabase('/tmp/index', xapian.DB_CREATE_OR_OPEN)
    # index the file
    for line in file(fileToIndex):
        doc = xapian.Document()
        doc.set_data(line)
        db.add_document(doc)
        # my input file is 70GB of data, this is to make testing faster
        recordCount = recordCount + 1
        if maxRecordsToIndex > -1 and recordCount >= maxRecordsToIndex:
            break
        elif recordCount % 1000 == 0:
            print 'print processed %s records so far!' % recordCount
    print 'processed %s records' % recordCount
except Exception, e:
    print 'Exception: %s' % str(e)
    sys.exit(1)
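The record cap in the loop above could also be expressed with
itertools.islice. This is only a sketch: limited_lines is a hypothetical
helper, and treating a cap of 0 as "no limit" is an assumption, not something
the script above states.

```python
import itertools


def limited_lines(lines, max_records=0):
    """Hypothetical helper: return an iterator over at most max_records lines.

    A cap of 0 is treated as "no limit" (an assumption).
    """
    if max_records > 0:
        return itertools.islice(lines, max_records)
    return iter(lines)


sample = ['/a/one\n', '/b/two\n', '/c/three\n']
for count, line in enumerate(limited_lines(sample, 2), 1):
    print('record %d: %s' % (count, line.strip()))
```

With this shape, the count comes from enumerate and the loop body only has to
build and add the document.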
And this is what I am doing to try to get the data back from a search. The
problem is I can't get it to find anything.
Given the example data above, when I run: python searcher.py /tmp/index
filename
I get 0 records found!
#!/usr/local/bin/python
# searcher.py
import sys
import xapian

if len(sys.argv) < 3:
    print "usage: %s <path to database> <search terms>" % sys.argv[0]
    sys.exit(1)

try:
    database = xapian.Database(sys.argv[1])
    enquire = xapian.Enquire(database)
    query = xapian.Query(sys.argv[2])
    print "Performing query `%s'" % query.get_description()
    enquire.set_query(query)
    matches = enquire.get_mset(0, 10)
    print "%i results found" % matches.get_matches_estimated()
    for match in matches:
        print "ID %i %i%% [%s]" % (match[xapian.MSET_DID], match[
            xapian.MSET_PERCENT], match[xapian.MSET_DOCUMENT].get_data())
except Exception, e:
    print "Exception: %s" % str(e)
    sys.exit(1)