[Xapian-tickets] [Xapian] #595: Allow omega to index Atom feed MIME type

Xapian nobody at xapian.org
Thu Apr 5 14:41:29 BST 2012


#595: Allow omega to index Atom feed MIME type
-------------------------+--------------------------------------------------
 Reporter:  mihaibivol   |       Owner:  olly     
     Type:  enhancement  |      Status:  new      
 Priority:  normal       |   Milestone:           
Component:  Omega        |     Version:  SVN trunk
 Severity:  minor        |    Keywords:           
Blockedby:               |    Platform:  All      
 Blocking:               |  
-------------------------+--------------------------------------------------
Changes (by olly):

  * version:  => SVN trunk


Comment:

 I tried this out on my blog atom feeds, such as:

 http://survex.com/~olly/blog/xapian/index.atom

 The patch is definitely along the right lines, but I spotted some issues:

  * The "author" field includes the feed uri (from the {{{<uri>}}} tag
 inside {{{<author>}}}) which doesn't seem useful.

  * The handling of {{{<category>}}} tags seems wrong - the code tries to
 pull out any content, but it looks like the category is in a {{{term}}}
 attribute, e.g. {{{<category term="foo" />}}}

  * It seems to ignore {{{<content>}}} which I think probably should be
 treated as body text.

  * It looks like {{{type="html"}}} on various tags needs special handling.

  * Also, it doesn't pick up any of the subtitle text in my blog's feed,
 because our HTML parser doesn't handle CDATA and just throws away the
 {{{<...>}}} it sees (this is really a bug in our HTML parser, rather than
 a bug in this patch):

 {{{
 #!xml
 <subtitle type="html"><![CDATA[
 A blog lacking a good description.
 ]]></subtitle>
 }}}
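
 To illustrate the behaviour I'd expect for {{{type="html"}}} elements, here
 is a rough sketch in Python (not omega's C++ code, and not part of the
 patch). It assumes an XML parser that resolves CDATA sections itself, then
 strips any HTML markup from the resulting text. The {{{TextExtractor}}}
 class and {{{element_text}}} helper are hypothetical names, and the
 {{{<b>}}} tag in the sample is added purely for demonstration:

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 XML namespace


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects text content and drops HTML tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def element_text(elem):
    """Return the plain text of an Atom text construct."""
    # ElementTree has already unwrapped any CDATA section at this point.
    raw = "".join(elem.itertext())
    if elem.get("type") == "html":
        # The payload is HTML markup, so parse it a second time as HTML.
        extractor = TextExtractor()
        extractor.feed(raw)
        extractor.close()
        raw = "".join(extractor.parts)
    # Collapse runs of whitespace, including the newlines around the CDATA.
    return " ".join(raw.split())


feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<subtitle type="html"><![CDATA[\n'
    'A blog <b>lacking</b> a good description.\n'
    ']]></subtitle></feed>'
)
subtitle = element_text(feed.find(ATOM + "subtitle"))
print(subtitle)
```

 The point is just that the CDATA wrapper should be unwrapped rather than
 discarded, and the HTML inside it should then be parsed as HTML.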

 It'd be good to update the documentation to reflect this newly supported
 type too.
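
 For reference, the field mapping I have in mind for the first three points
 looks roughly like this. It's only a Python sketch under my own
 assumptions (the sample entry, the {{{extract_entry}}} helper and the field
 names are all made up for illustration, and omega itself is C++): take the
 author from {{{<name>}}} only, take categories from the {{{term}}}
 attribute, and treat {{{<content>}}} as body text.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom 1.0 XML namespace

# Made-up sample entry, for illustration only.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Example entry</title>
    <author>
      <name>Example Author</name>
      <uri>http://example.org/</uri>
    </author>
    <category term="foo" />
    <category term="bar" />
    <content type="text">Body text of the entry.</content>
  </entry>
</feed>"""


def extract_entry(entry):
    """Hypothetical helper mapping one <entry> to indexable fields."""
    # Author: use only <name>, skipping the <uri> and <email> children.
    authors = [
        name.text.strip()
        for name in entry.findall(f"{ATOM}author/{ATOM}name")
        if name.text
    ]
    # Category: the value is in the term attribute, not the element content.
    categories = [
        cat.get("term")
        for cat in entry.findall(ATOM + "category")
        if cat.get("term")
    ]
    # Content: treat as body text.
    content = entry.find(ATOM + "content")
    body = "".join(content.itertext()).strip() if content is not None else ""
    return {"author": authors, "category": categories, "body": body}


entry = ET.fromstring(SAMPLE).find(ATOM + "entry")
fields = extract_entry(entry)
print(fields)
```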

-- 
Ticket URL: <http://trac.xapian.org/ticket/595#comment:1>
Xapian <http://xapian.org/>