[Xapian-devel] Fetching document content by Q term in Python
Olly Betts
olly at survex.com
Fri Feb 9 08:01:05 GMT 2007
On Fri, Feb 09, 2007 at 11:18:10AM +1100, Alec Thomas wrote:
> I'd like to be able to retrieve the index's stored copy of the document
> text and tried the following:
>
> terms = self.db.allterms()
> terms.skip_to('Q' + uri.encode('utf-8'))
> term = terms.next()
> doc = self.db.get_document(term[1])
> print doc.get_data()
>
> I just wildly guessed that [1] was the docid, but of course it isn't. So the
> question is, how do I get a docid out of a term?
This will print the data from each document indexed by a particular
term:

    term = 'Q' + uri.encode('utf-8')
    for docid in self.db.postlist(term):
        doc = self.db.get_document(docid)
        print doc.get_data()

You get a PostingIter from db.postlist(term) - see
python/docs/bindings.html for details.
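
For anyone reading this with newer bindings, here is a minimal sketch of the same posting-list walk wrapped in a helper. It assumes the Xapian 1.x Python bindings, where each item yielded by db.postlist(term) is an object with a .docid attribute rather than a bare docid. The name data_for_term is hypothetical and only for illustration.

```python
def data_for_term(db, term):
    """Yield (docid, data) for every document indexed by `term`.

    `db` is expected to behave like an open xapian.Database.
    """
    for posting in db.postlist(term):
        # Assumption: Xapian 1.x bindings, where each posting-list item
        # exposes the document id as the .docid attribute.
        docid = posting.docid
        yield docid, db.get_document(docid).get_data()
```

Calling list(data_for_term(db, 'Q' + uri)) would then give the stored data for each matching document. A Q term is normally a unique ID, so you would usually expect at most one result.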
> Or if I'm completely on the wrong track, how do I get the document from
> a Q term?
Alternatively, you can run a search for the Q-prefixed term. The above
is a little less work though.
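
A rough sketch of the search route, for comparison. It assumes the Xapian 1.x Python bindings (xapian.Enquire, xapian.Query, and match items exposing .docid and .document). The helper name search_for_term and the limit parameter are hypothetical.

```python
def search_for_term(db, term, limit=10):
    """Return (docid, data) pairs for documents matching `term` exactly."""
    # Imported inside the function so this file still loads on machines
    # without the Xapian Python bindings installed.
    import xapian

    enquire = xapian.Enquire(db)
    # A Query built from a single string matches that exact term.
    enquire.set_query(xapian.Query(term))
    # get_mset(first, maxitems): up to `limit` matches, starting at rank 0.
    mset = enquire.get_mset(0, limit)
    # Assumption: 1.x bindings, where each match has .docid and .document.
    return [(match.docid, match.document.get_data()) for match in mset]
```

This builds an Enquire object, a Query and an MSet just to look up one term, which is why walking the posting list directly is a little less work.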
Cheers,
Olly