Hello.
This message asks for opinions and suggestions on how to make
Mozilla products understand «Content-Type» of a Web resource
exactly as specified in HTTP(s) headers.
Content sniffing in browsers is a compromise between standards
and interoperability with “poor” Web sites.
It creates vulnerabilities and, generally, breaks compatibility with (original) HTTP/1.1.
In some cases it conceals protocol data from end-user tools such as Ctrl-I (Page Info).
See http://www.superstructure.info/browser/compromised/toxic-sniffing.html
for some lesser-known information about it.
I have particular concerns about two scenarios.
The first is “media type (a.k.a. MIME) sniffing”,
where the browser overrides the declared media type/subtype.
This is implemented in the toolkit/components/mediasniffer/nsMediaSniffer.cpp component
(and possibly in others; I don’t know).
There is a proposal, https://bugzilla.mozilla.org/show_bug.cgi?id=471020 ,
to make Firefox’s behaviour compatible with MS Internet Explorer
and with https://mimesniff.spec.whatwg.org/#supplied-mime-type-detection-algorithm ,
using «X-Content-Type-Options: nosniff» to switch the sniffing off.
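To make the first scenario concrete, here is a minimal Python sketch of the gating idea. It is not the logic of nsMediaSniffer.cpp and not the WHATWG algorithm: the function name effective_media_type, the lower-case header dictionary, and the three-entry signature table are all made up for illustration. Only the header name and its «nosniff» value come from the proposal above.

```python
# Hypothetical, heavily simplified signature table (NOT the WHATWG table).
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF89a", "image/gif"),
    (b"%PDF-", "application/pdf"),
]


def effective_media_type(headers, body):
    """Return the media type a client would act on in this toy model.

    headers: dict keyed by lower-case header names (an assumption of this sketch).
    body: the first bytes of the response body.
    """
    # Take the type/subtype part of Content-Type, dropping parameters.
    declared = headers.get("content-type", "").split(";")[0].strip().lower()
    nosniff = headers.get("x-content-type-options", "").strip().lower() == "nosniff"

    if nosniff and declared:
        return declared  # the server's label is final

    for magic, sniffed in SIGNATURES:
        if body.startswith(magic):
            return sniffed  # sniffing overrides the declared label

    return declared or "application/octet-stream"
```

In this toy model a body starting with «%PDF-» that is served as text/plain gets treated as application/pdf, unless the server also sends «X-Content-Type-Options: nosniff», in which case text/plain stands.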
The second scenario is the lesser-known “UTF sniffing”,
applicable only to text media types. The browser respects the type proper,
but overrides the «charset=» label with its own guesses.
This is implemented in netwerk/base/nsUnicharStreamLoader.cpp ;
the implementation is based on HTML5 encoding sniffing,
which cannot reasonably be applied to text/plain.
In the case of text/plain it leads to bugs. Simple test cases are available
at http://course.irccity.ru/ya-yu-9-amp.txt (toxic UTF-16 “BOM”)
and http://course.irccity.ru/p-guillemet-yi-ya.txt (toxic UTF-8 “BOM”).
It poses a less immediate security risk, but can still cause data corruption
whenever arbitrary data are allowed into (the beginning of) text/plain documents.
The toxic UTF sniffing was observed in Firefox, MSIE, Google Chrome, and Safari,
and does not seem to correlate with the «X-Content-Type-Options» header mentioned above.
Possible approaches to the toxic UTF sniffing include:
• Just fix it (certainly would cause backlash from people eager to burn
anything except UTF-8).
• Something along the lines of the no-sniff flag.
• Add a new Firefox preference
(e. g. network.http.charset_quirk_level) controlling the browser’s behaviour.
• Make patches for the source code to be used only by those who are interested.
Possible approaches to the relation between the two scenarios include:
• Extend the meaning of the «X-Content-Type-Options: nosniff» to banning the
toxic UTF sniffing.
• Make interpretation of «X-Content-Type-Options» depend on preferences.
• Invent a new value for X-Content-Type-Options, or a new header altogether,
in the hope that other browsers and Web applications will ultimately adopt it.
• Treat the two problems completely separately.
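The difference between the first and the last option can be written down as a small policy function. This is a hedged Python sketch, not a proposed patch: the names declared_charset and may_sniff_encoding, the nosniff_bans_utf_sniffing switch, and the lower-case header dictionary are all hypothetical.

```python
def declared_charset(content_type):
    """Extract the «charset=» parameter from a Content-Type value, or None."""
    for part in content_type.split(";")[1:]:
        name, _, value = part.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"') or None
    return None


def may_sniff_encoding(headers, nosniff_bans_utf_sniffing):
    """Decide whether a client may override the declared charset.

    headers: dict keyed by lower-case header names (an assumption of this sketch).
    nosniff_bans_utf_sniffing: True models "extend the meaning of nosniff",
    False models "treat the two problems completely separately".
    """
    if declared_charset(headers.get("content-type", "")) is None:
        return True  # nothing was declared, so guessing is the only option
    nosniff = headers.get("x-content-type-options", "").strip().lower() == "nosniff"
    return not (nosniff and nosniff_bans_utf_sniffing)
```

With «Content-Type: text/plain; charset=windows-1251» and «X-Content-Type-Options: nosniff», the function returns False under the extended meaning and True when the two problems are kept separate.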
Opinions?
Please note, I’m not (yet) a browser developer and
my main agenda is making a browser I could trust myself.
Regards, Incnis Mrsi
_______________________________________________
dev-platform mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-platform