On Thu, May 7, 2009 at 00:43, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
> Michael Urman wrote:
>> On Wed, May 6, 2009 at 15:42, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
>>> Despite there being also an error handler called "surrogates".
>>
>> Not that I have to be, but I'm not sold on the previous UTF-8 codec
>> behavior becoming an error handler of the name "surrogates" for two
>> reasons (I do respect the obvious PBP argument for the implementation,
>> and have no better name - "lenient"?).
>
> PBP?

Practicality beats purity. From a purity standpoint, the legacy invalid
UTF-8 seems more like an encoding than an error handler to me. From a
practicality standpoint, it's presumably much more convenient to implement
it on top of the new valid UTF-8 codec's behavior. And then any error
handler needs a name.

> Well, there is a way to stack error handlers, although it's not pretty:
> [...]
> codecs.register_error("surrogates_then_replace",
>                       surrogates_then_replace)

That mitigates my arguments significantly, although I'd rather see
something like errors=('surrogates', 'replace') chain the handlers without
additional registrations. But that's a different PEP or arbitrary change. :)

>> The stacking argument also applies to the new utf8b behavior on encode
>> (only, as it handles all errors on decode). This may be a YAGNI
>
> Indeed - in particular, as, in the primary application of this error
> handler (i.e. file IO operations), there is no way of specifying
> an additional error handler anyway.

Would it be useful to allow setting this somewhere? It'd be analogous to
setfsencoding, perhaps a setfsencodingerrors. It's not hard to imagine an
application working on Windows, where all Unicode characters are valid,
and constructing backup filenames by adding some arbitrary character, or
receiving them from a user who doesn't understand encodings. When this
application is taken to a non-Unicode filesystem, without the ability to
say "I really want a valid filename: so replace", that could get messy.

But it may still be a YAGNI, or a "don't do that."

-- 
Michael Urman
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
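
For illustration, here is a minimal sketch of the handler stacking discussed
above. It is not Martin's elided snippet. It assumes the handler name that
PEP 383 eventually shipped with, "surrogateescape" (called "surrogates" and
"utf8b" in this thread), and the helper chain_error_handlers and the
registered name "surrogateescape_then_replace" are made up for the example.

```python
import codecs


def chain_error_handlers(name, *handler_names):
    """Register `name` as an error handler that tries each named handler in turn.

    Hypothetical helper for illustration; not part of the codecs module.
    """
    if not handler_names:
        raise ValueError("at least one handler name is required")
    handlers = [codecs.lookup_error(n) for n in handler_names]

    def chained(exc):
        # Try every handler except the last; a handler signals that it
        # cannot cope by raising, in which case we move on to the next one.
        for handler in handlers[:-1]:
            try:
                return handler(exc)
            except UnicodeError:
                continue
        # The last handler's result (or exception) is final.
        return handlers[-1](exc)

    codecs.register_error(name, chained)
    return name


chain_error_handlers("surrogateescape_then_replace", "surrogateescape", "replace")

# U+DCE9 stands for the undecodable byte 0xE9, so it is written back as-is.
escaped = "caf\udce9".encode("ascii", "surrogateescape_then_replace")
# A character the target encoding cannot represent falls through to "replace".
replaced = "backup\u2192file".encode("ascii", "surrogateescape_then_replace")
print(escaped, replaced)
```

One caveat, as far as I can tell from CPython's behavior: the encoder may
hand a run of adjacent unencodable characters to the handler as a single
error, so if any character in the run is not an escaped byte, the fallback
applies to the whole run.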