[Xapian-tickets] [Xapian] #503: Add Python PostingSource example from Xappy to docs

Xapian nobody at xapian.org
Mon Aug 9 14:44:48 BST 2010


#503: Add Python PostingSource example from Xappy to docs
--------------------------------------+-------------------------------------
 Reporter:  jcassee                   |       Owner:  richard
     Type:  enhancement               |      Status:  new    
 Priority:  normal                    |   Milestone:  1.2.3  
Component:  Xapian-bindings (Python)  |     Version:  1.2.2  
 Severity:  normal                    |    Keywords:         
Blockedby:                            |    Platform:  All    
 Blocking:                            |  
--------------------------------------+-------------------------------------
Changes (by olly):

  * owner:  olly => richard
  * version:  => 1.2.2
  * component:  Other => Xapian-bindings (Python)
  * milestone:  => 1.2.3


Old description:

> The Xappy source code contains a perfect example of a
> [http://code.google.com/p/xappy/source/browse/trunk/xappy/searchconnection.py?r=620#1591
> weight-only (non-filtering) !PostingSource written in Python]. This would
> be a good addition to the
> [source:trunk/xapian-core/docs/postingsource.rst postingsource docs]. I
> have slightly edited the original.
>
> {{{
> class ExternalWeightPostingSource(xapian.PostingSource):
>     """
>     A Xapian posting source reading from an ExternalWeightSource.
>     """
>     def __init__(self, xapdb, wtsource):
>         xapian.PostingSource.__init__(self)
>         self.xapdb = xapdb
>         self.wtsource = wtsource
>
>     def init(self, xapdb):
>         self.alldocs = xapdb.postlist('')
>
>     def reset(self, xapdb):
>         # backwards compatibility
>         self.init(xapdb)
>
>     def get_termfreq_min(self): return 0
>     def get_termfreq_est(self): return self.xapdb.get_doccount()
>     def get_termfreq_max(self): return self.xapdb.get_doccount()
>
>     def next(self, minweight):
>         try:
>             self.current = self.alldocs.next()
>         except StopIteration:
>             self.current = None
>
>     def skip_to(self, docid, minweight):
>         try:
>             self.current = self.alldocs.skip_to(docid)
>         except StopIteration:
>             self.current = None
>
>     def at_end(self):
>         return self.current is None
>
>     def get_docid(self):
>         return self.current.docid
>
>     def get_maxweight(self):
>         return self.wtsource.get_maxweight()
>
>     def get_weight(self):
>         xapdoc = self.xapdb.get_document(self.current.docid)
>         doc = ProcessedDocument(self.conn._field_mappings, xapdoc)
>         return self.wtsource.get_weight(doc)
> }}}

New description:

 The Xappy source code contains a perfect example of a
 [http://code.google.com/p/xappy/source/browse/trunk/xappy/searchconnection.py?r=620#1591
 weight-only (non-filtering) PostingSource written in Python]. This would
 be a good addition to the
 [source:trunk/xapian-core/docs/postingsource.rst postingsource docs]. I
 have slightly edited the original.

 {{{
 class ExternalWeightPostingSource(xapian.PostingSource):
     """
     A Xapian posting source reading from an ExternalWeightSource.
     """
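     def __init__(self, conn, xapdb, wtsource):
         xapian.PostingSource.__init__(self)
         # conn is assumed to be the Xappy SearchConnection; get_weight()
         # reads self.conn._field_mappings, which was otherwise never set.
         self.conn = conn
         self.xapdb = xapdb
         self.wtsource = wtsource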

     def init(self, xapdb):
         self.alldocs = xapdb.postlist('')

     def reset(self, xapdb):
         # backwards compatibility
         self.init(xapdb)

     def get_termfreq_min(self): return 0
     def get_termfreq_est(self): return self.xapdb.get_doccount()
     def get_termfreq_max(self): return self.xapdb.get_doccount()

     def next(self, minweight):
         try:
             self.current = self.alldocs.next()
         except StopIteration:
             self.current = None

     def skip_to(self, docid, minweight):
         try:
             self.current = self.alldocs.skip_to(docid)
         except StopIteration:
             self.current = None

     def at_end(self):
         return self.current is None

     def get_docid(self):
         return self.current.docid

     def get_maxweight(self):
         return self.wtsource.get_maxweight()

     def get_weight(self):
         xapdoc = self.xapdb.get_document(self.current.docid)
         doc = ProcessedDocument(self.conn._field_mappings, xapdoc)
         return self.wtsource.get_weight(doc)
 }}}
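
 As a rough illustration of the order in which the methods above get
 called, here is a pure-Python sketch of a driver loop. ToySource and
 drive() are hypothetical names made up for this illustration and are not
 part of Xapian or Xappy; the real matcher is more involved (it also uses
 skip_to() and the minweight argument to prune work).

```python
class ToySource(object):
    """Hypothetical in-memory stand-in exposing the PostingSource methods."""

    def __init__(self, weights):
        self.weights = weights  # {docid: weight}
        self.docids = []
        self.pos = -1

    def init(self, db):
        # db is ignored here; a real source would open a postlist on it.
        self.docids = sorted(self.weights)
        self.pos = -1

    def next(self, minweight):
        self.pos += 1

    def skip_to(self, docid, minweight):
        # Advance to the first entry whose docid is >= the requested one.
        self.pos = max(self.pos, 0)
        while not self.at_end() and self.docids[self.pos] < docid:
            self.pos += 1

    def at_end(self):
        return self.pos >= len(self.docids)

    def get_docid(self):
        return self.docids[self.pos]

    def get_weight(self):
        return self.weights[self.docids[self.pos]]


def drive(source, db=None, minweight=0.0):
    """Call init() once, then next()/at_end()/get_docid()/get_weight() in turn."""
    source.init(db)
    matches = []
    while True:
        source.next(minweight)
        if source.at_end():
            break
        matches.append((source.get_docid(), source.get_weight()))
    return matches
```

 Calling drive(ToySource({3: 0.5, 1: 2.0})) visits the documents in docid
 order, pairing each docid with its weight.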

--

Comment:

 Marking for 1.2.3, though that depends on us being able to relicense this
 in the future.  Richard, who wrote this?  The copyright headers on the file
 list you, Lemur (which shouldn't be a problem, though we should check
 explicitly), and Pablo Hoffman, whom I don't think I know.

 We should drop reset() from it if it really is only there for backward
 compatibility: compatibility with 1.1.x isn't interesting at this point,
 and going forward a clean example is more important.

 Probably also better to rename xapdb to just db for the new context.
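
 A possible shape for the cleaned-up example, as a sketch only: reset() is
 dropped and xapdb is renamed to db, as suggested above. Two things here
 are assumptions that are not in the ticket: the weight source is handed
 the raw document rather than a Xappy ProcessedDocument (so the example
 does not depend on Xappy), and the base class falls back to object when
 the bindings are not installed. Everything else is carried over from the
 posted code and has not been checked against any particular release.

```python
try:
    import xapian
    PostingSourceBase = xapian.PostingSource
except ImportError:
    # Assumption: stand-in base so the sketch can be read and exercised
    # without the Xapian bindings installed.
    PostingSourceBase = object


class ExternalWeightPostingSource(PostingSourceBase):
    """A posting source matching all documents, weighted by wtsource."""

    def __init__(self, db, wtsource):
        PostingSourceBase.__init__(self)
        self.db = db
        self.wtsource = wtsource

    def init(self, db):
        # Iterate over every document via the empty term's posting list.
        self.alldocs = db.postlist('')

    def get_termfreq_min(self): return 0
    def get_termfreq_est(self): return self.db.get_doccount()
    def get_termfreq_max(self): return self.db.get_doccount()

    def next(self, minweight):
        try:
            # The builtin next() works on both Python 2.6+ and Python 3.
            self.current = next(self.alldocs)
        except StopIteration:
            self.current = None

    def skip_to(self, docid, minweight):
        try:
            self.current = self.alldocs.skip_to(docid)
        except StopIteration:
            self.current = None

    def at_end(self):
        return self.current is None

    def get_docid(self):
        return self.current.docid

    def get_maxweight(self):
        return self.wtsource.get_maxweight()

    def get_weight(self):
        # Assumption: wtsource.get_weight() accepts the raw document.
        doc = self.db.get_document(self.current.docid)
        return self.wtsource.get_weight(doc)
```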

-- 
Ticket URL: <http://trac.xapian.org/ticket/503#comment:1>
Xapian <http://xapian.org/>