Key and value are CDATA in the HTML FORM element. According to [1], "CDATA is
a sequence of characters from the document character set".

So, no: you cannot have binary data with the application/x-www-form-urlencoded
form encoding, and no, your initial example is not a valid use case. Also, the
client controls the character set of the HTML file.

What I was about to introduce in the spec is: all strings are valid UTF-8
strings. Transforming a UTF-8 string into valid HTML is the client's
responsibility (e.g. transform '&' into '&amp;', etc.). The same goes for
transforming a UTF-8 string into valid HTTP POST data (refer to [2] for the
details). A rough sketch of both transformations is at the end of this mail.

[1] http://www.w3.org/TR/html401/types.html#type-cdata
[2] http://www.ietf.org/rfc/rfc1738.txt

On Thursday 21 January 2010 at 20:47 +0000, Simon McVittie wrote:
> On Thu, 21 Jan 2010 at 14:39:09 -0500, Nicolas Dufresne wrote:
> > > > > HTTP mostly deals in bytes, not characters.
> > > >
> > > > We only want to support a specialized form of HTTP POST data
> > > > (x-www-form-urlencoded), which is in ASCII. This data can be passed to
> > > > the browser by creating a temporary HTML file with a redirecting form.
> > >
> > > I see. If that's the case, why don't you just put it in a single string?
> >
> > In x-www-form-urlencoded, POST data is represented as key/value pairs. To
> > generate the HTML file, you would have to parse the single string to
> > split those key/value pairs into HTML nodes (<input type='hidden'
> > name='key'>value</input>).
>
> Surely that means that we don't want the keys and values to be encoded, at
> which point there's no guarantee that they are in any particular character
> set, bringing us back where we started?
>
> I agree that x-www-form-urlencoded data ends up as ASCII. However, the
> necessary text to write into an HTML file to produce a desired
> x-www-form-urlencoded POST is not necessarily ASCII.
>
> Quoting from my earlier mail, suppose the required POST data is a map where
> the keys are the bytes 0x01 and 0x80, and the values are equal to the keys.
> (Note that the byte 0x80 is meaningless on its own in UTF-8 - the
> byte-string '\x80' is not valid UTF-8.)
>
> I agree that the HTTP POST we end up with is %01=%01&%80=%80.
>
> However, if the client is going to be writing out temporary HTML, neither
> that nor %01 is actually very useful, since what it actually wants is
> something like:
>
> <input type="hidden"></input>
> <input type="hidden">€</input>
>
> (and then some way to arrange for these to be submitted as latin-1, which is
> a browser-specific can of worms).
>
> If the implicit assumption you're making is that we only support abstract
> keys and values that are *also* ASCII (i.e. a subset of the set of possible
> HTTP POSTs), with the expectation that all practical webmail systems work
> like that, and in particular that the example I quoted cannot be supported
> by this spec, then please state that assumption/limitation in the spec.
>
> Simon
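For concreteness, here is a rough sketch (in Python; not part of the spec, and
the action URL and field names are invented for illustration only) of the two
client-side transformations proposed above: HTML-escaping the UTF-8 strings
into a temporary auto-submitting form, and percent-encoding their UTF-8 bytes
into an application/x-www-form-urlencoded POST body as per [2].

from html import escape
from urllib.parse import urlencode

def form_html(action_url, fields):
    # Temporary HTML page with a hidden, auto-submitting form. Each UTF-8
    # key/value is HTML-escaped ('&' -> '&amp;', '<' -> '&lt;', ...).
    inputs = "\n".join(
        '  <input type="hidden" name="%s" value="%s"/>'
        % (escape(k, quote=True), escape(v, quote=True))
        for k, v in fields.items()
    )
    page = (
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">\n'
        '<form method="post" action="%s">\n'
        '%s\n'
        '</form>\n'
        '<script>document.forms[0].submit()</script>\n'
    ) % (escape(action_url, quote=True), inputs)
    return page

def post_body(fields):
    # application/x-www-form-urlencoded body: the UTF-8 bytes of each key
    # and value are percent-encoded (see [2]).
    return urlencode(fields, encoding="utf-8")

fields = {"user": "bob", "note": "caf\u00e9 & tea"}
print(form_html("https://mail.example.org/post", fields))
print(post_body(fields))   # user=bob&note=caf%C3%A9+%26+tea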
