[Xapian-tickets] [Xapian] #599: The Omega HTML parser resets contents if a further <body> tag is found

Xapian nobody at xapian.org
Wed May 16 07:27:17 BST 2012


#599: The Omega HTML parser resets contents if a further <body> tag is found
--------------------+-------------------------------------------------------
 Reporter:  medoc   |       Owner:  olly
     Type:  defect  |      Status:  new 
 Priority:  normal  |   Milestone:      
Component:  Omega   |     Version:      
 Severity:  normal  |    Keywords:      
Blockedby:          |    Platform:  All 
 Blocking:          |  
--------------------+-------------------------------------------------------
 In myhtmlparse.cc, around line 81, the Omega HTML handler resets the
 current content each time an opening <body> tag is found.

 Some badly malformed HTML files contain several opening <body> tags, and
 resetting on each later occurrence discards the content collected so far.

 At least Firefox and Opera ignore further <body> tags. Incidentally, they
 also simply ignore closing </body> and </html> tags.
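
 A minimal sketch of the browser-like behaviour, for illustration only. This
 is not the actual code in myhtmlparse.cc: the BodyTagTracker struct and its
 members (dump, seen_body, opening_tag, text) are hypothetical names, and the
 tag name is assumed to arrive already lowercased.

```cpp
#include <string>

// Hypothetical stand-in for the parser state; NOT Omega's real class.
struct BodyTagTracker {
    std::string dump;        // text collected so far
    bool seen_body = false;  // set once the first <body> has been handled

    // Called for each opening tag (name assumed already lowercased).
    void opening_tag(const std::string& tag) {
        if (tag == "body") {
            if (seen_body) return;  // ignore repeated <body>, as browsers do
            seen_body = true;
            dump.clear();  // drop only what came before the first <body>
        }
    }

    // Called for each run of character data between tags.
    void text(const std::string& s) { dump += s; }
};
```

 With a guard like this, text seen between the first and a later <body> tag
 is kept rather than thrown away.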

 I noticed this through a reported Recoll issue (Recoll uses the Omega parser
 mostly unmodified) and have changed it locally.

-- 
Ticket URL: <http://trac.xapian.org/ticket/599>
Xapian <http://xapian.org/>