[Xapian-tickets] [Xapian] #550: Omega script enhancement: $prettyurl
Xapian
nobody at xapian.org
Mon Jun 16 00:55:02 BST 2014
#550: Omega script enhancement: $prettyurl
-------------------------+-----------------------------
Reporter: catkin | Owner: olly
Type: enhancement | Status: assigned
Priority: normal | Milestone: 1.3.3
Component: Omega | Version:
Severity: normal | Resolution:
Keywords: | Blocked By:
Blocking: | Operating System: All
-------------------------+-----------------------------
Comment (by olly):
We should discuss encodings explicitly in the docs (I don't think we
currently do).
The main issue is actually filenames, though for text/plain documents we
correctly handle files with an explicit BOM, UTF-8, and also real-world
cases of ISO-8859-1.
The ISO-8859-1 handling is because our UTF-8 decoder falls back to
interpreting invalid UTF-8 sequences as ISO-8859-1 - that's technically
invalid behaviour these days, but the security implications are very
limited when parsing documents and queries and changing it would break
user code that expects it (either deliberately or without realising it).
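
To make the fallback concrete, here is a minimal Python sketch of the idea.
It is not Xapian's decoder (which is C++), and the names `latin1-fallback`
and `decode_lenient` are made up for illustration. It decodes valid UTF-8
normally and maps any bytes that don't form a valid sequence to the
ISO-8859-1 characters with the same values.

```python
import codecs


def _latin1_fallback(err):
    """Error handler: reinterpret the offending bytes as ISO-8859-1."""
    # Only decoding errors are handled here; anything else is re-raised.
    if not isinstance(err, UnicodeDecodeError):
        raise err
    bad = err.object[err.start:err.end]
    # ISO-8859-1 maps each byte 0x00-0xFF to the code point of the same value.
    return bad.decode("iso-8859-1"), err.end


# Hypothetical handler name, registered for use with bytes.decode().
codecs.register_error("latin1-fallback", _latin1_fallback)


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, treating invalid sequences as ISO-8859-1."""
    return data.decode("utf-8", errors="latin1-fallback")
```

With this, both `"café".encode("utf-8")` and the ISO-8859-1 bytes
`b"caf\xe9"` decode to the same string, which is roughly the effect
described above.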
And the alternative is to sniff the charset in advance to decide UTF-8 or
ISO-8859-1, then parse as whichever we sniffed, so we'd end up with much
the same result, just at the cost of an extra pass over the text first.
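
For comparison, the sniffing alternative might look something like the
Python sketch below. The function names are hypothetical and this is only
one simple way to sniff (try a strict UTF-8 decode, and call it ISO-8859-1
if that fails), not something Omega does.

```python
def sniff_charset(data: bytes) -> str:
    """First pass: decide whether the whole input is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "iso-8859-1"
    return "utf-8"


def decode_sniffed(data: bytes) -> str:
    """Second pass: decode using whichever charset was sniffed."""
    return data.decode(sniff_charset(data))
```

For input that is purely UTF-8 or purely ISO-8859-1 this gives the same
text as a per-sequence fallback. It differs only for input that mixes the
two, where the whole text gets read as ISO-8859-1.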
There's also the issue of what the encoding of the output (via the
templates) is - you'd struggle to make that anything but UTF-8 as things
are currently, but we should say that somewhere.
And queries will also be expected to be UTF-8, which means you should set
{{{accept-charset="UTF-8"}}} on search {{{<form>}}} tags (unless the page
with the search form uses UTF-8 encoding itself). I think that's true
even if you use HTTP POST, as I don't think Omega currently looks for a
charset in the POST request (but presumably it can have one). GET seems a
better option for searches though.
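
As a rough illustration of why the form's charset matters, this Python
sketch shows the same query text percent-encoded as a browser would send it
from a UTF-8 form and from an ISO-8859-1 form, and what a UTF-8-only
receiver then makes of each. The parameter name `P` and the sample query
are just assumptions for the example.

```python
from urllib.parse import parse_qs, urlencode

query = {"P": "café"}

# What a form submitting in each charset would put in the query string.
as_utf8 = urlencode(query, encoding="utf-8")
as_latin1 = urlencode(query, encoding="iso-8859-1")

print(as_utf8)    # P=caf%C3%A9
print(as_latin1)  # P=caf%E9

# A receiver that assumes UTF-8 recovers the first form's text exactly...
from_utf8 = parse_qs(as_utf8, encoding="utf-8")["P"][0]

# ...but cannot recover the second without some kind of fallback.
from_latin1 = parse_qs(as_latin1, encoding="utf-8", errors="replace")["P"][0]
```

Here `from_utf8` is the original text, while `from_latin1` is not, since
the lone `%E9` byte isn't valid UTF-8 on its own.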
--
Ticket URL: <http://trac.xapian.org/ticket/550#comment:14>
Xapian <http://xapian.org/>