[Xapian-tickets] [Xapian] #550: Omega script enhancement: $prettyurl

Xapian nobody at xapian.org
Mon Jun 16 00:55:02 BST 2014


#550: Omega script enhancement: $prettyurl
-------------------------+-----------------------------
 Reporter:  catkin       |             Owner:  olly
     Type:  enhancement  |            Status:  assigned
 Priority:  normal       |         Milestone:  1.3.3
Component:  Omega        |           Version:
 Severity:  normal       |        Resolution:
 Keywords:               |        Blocked By:
 Blocking:               |  Operating System:  All
-------------------------+-----------------------------

Comment (by olly):

 We should discuss encodings explicitly in the docs (I don't think we
 currently do).

 The main issue is actually filenames, though for text/plain documents we
 correctly handle files with an explicit BOM, UTF-8, and also real-world
 cases of ISO-8859-1.

 The ISO-8859-1 handling is because our UTF-8 decoder falls back to
 interpreting invalid UTF-8 sequences as ISO-8859-1 - that's technically
 invalid behaviour these days, but the security implications are very
 limited when parsing documents and queries and changing it would break
 user code that expects it (either deliberately or without realising it).
 The alternative is to sniff the charset in advance to decide between UTF-8
 and ISO-8859-1, then parse as whichever we sniffed, so we'd end up with
 much the same result, but at the cost of an extra pass over the text
 first.
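
 To make the two approaches concrete, here is a minimal Python sketch. It
 is not Xapian's decoder (that lives in the C++ library); the names
 decode_lenient, decode_sniffed and the "latin1_fallback" error handler are
 made up for illustration. It assumes the fallback works byte by byte:
 anything that isn't valid UTF-8 is read as ISO-8859-1.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret the undecodable bytes as ISO-8859-1."""
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # Every byte value is a valid ISO-8859-1 character, so this cannot fail.
    return bad.decode("iso-8859-1"), err.end


# Hypothetical handler name, registered once at import time.
codecs.register_error("latin1_fallback", _latin1_fallback)


def decode_lenient(data: bytes) -> str:
    """Single pass: valid UTF-8 decodes normally, the rest as ISO-8859-1."""
    return data.decode("utf-8", errors="latin1_fallback")


def decode_sniffed(data: bytes) -> str:
    """Two passes: check the whole input first, then pick one charset."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode("iso-8859-1")


if __name__ == "__main__":
    print(decode_lenient(b"caf\xc3\xa9"))  # valid UTF-8
    print(decode_lenient(b"caf\xe9"))      # ISO-8859-1 bytes
    print(decode_sniffed(b"caf\xe9"))      # same text, after an extra pass
```

 For input that is entirely UTF-8 or entirely ISO-8859-1 the two functions
 agree; they only differ on input that mixes both, where the sniffing
 version treats the whole text as ISO-8859-1.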

 There's also the issue of what the encoding of the output (via the
 templates) is - you'd struggle to make that anything but UTF-8 as things
 are currently, but we should say that somewhere.

 And queries will also be expected to be UTF-8, which means you should set
 {{{accept-charset="UTF-8"}}} on search {{{<form>}}} tags (unless the page
 with the search form uses UTF-8 encoding itself).  I think that's true
 even if you use HTTP POST, as I don't think Omega currently looks for a
 charset in the POST request (but presumably it can have one).  GET seems a
 better option for searches though.
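
 As a rough illustration of why the form's charset matters, this Python
 sketch shows how the same query term ends up in a GET query string under
 each encoding, and what a reader that strictly expects UTF-8 makes of it.
 It only uses the standard urllib.parse module and does not model Omega
 itself; the parameter name P and the sample term are assumptions chosen
 for the example.

```python
from urllib.parse import unquote, urlencode

term = "caf\u00e9"  # "café", a sample query term

# What a browser would send if the form is submitted as UTF-8.
utf8_qs = urlencode({"P": term}, encoding="utf-8")

# What it would send if the form is submitted as ISO-8859-1.
latin1_qs = urlencode({"P": term}, encoding="iso-8859-1")

print(utf8_qs)    # P=caf%C3%A9
print(latin1_qs)  # P=caf%E9

# A strict UTF-8 reader recovers the first form exactly...
assert unquote("caf%C3%A9", encoding="utf-8") == term

# ...but cannot decode the second one and substitutes U+FFFD.
assert unquote("caf%E9", encoding="utf-8", errors="replace") == "caf\ufffd"
```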

--
Ticket URL: <http://trac.xapian.org/ticket/550#comment:14>
Xapian <http://xapian.org/>
Xapian