Re: [Python-Dev] Encoding detection in the standard library?

David Wolever Tue, 22 Apr 2008 09:11:58 -0700

On 22-Apr-08, at 12:30 AM, Martin v. Löwis wrote:

IMO, encoding estimation is something that many web programs will have
to deal with

Can you please explain why that is? Web programs should not normally
have the need to detect the encoding; instead, it should be specified
always - unless you are talking about browsers specifically, which
need to support web pages that specify the encoding incorrectly.

Two cases come immediately to mind: email and web forms.

When a web browser POSTs data, there is no standard way of communicating which encoding it's using. There are some hints which make it easier (accept-charset attributes, the encoding used to send the page to the browser), but no guarantees.

Email is a smaller problem, because it usually has a helpful content-type header, but that's no guarantee.
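
To make the form-data case concrete, here is a minimal sketch of the kind of fallback logic a web program ends up writing. The function name `decode_form_bytes` and the candidate list are assumptions chosen for illustration (they are not taken from DrProject or from any library), and the approach only tries encodings in order rather than doing real statistical detection.

```python
def decode_form_bytes(raw, candidates=("utf-8", "cp1252")):
    """Decode POSTed bytes with the first candidate encoding that fits.

    Hypothetical helper: `candidates` would normally start with whatever
    hint is available (e.g. the encoding the page was served in).
    Returns a (text, encoding_used) tuple.
    """
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            # This candidate cannot represent the bytes; try the next one.
            continue
    # latin-1 maps every byte to a code point, so this last resort never
    # raises, although the resulting text may be wrong.
    return raw.decode("latin-1"), "latin-1"


text, used = decode_form_bytes("café".encode("utf-8"))
```

Note that a clean decode is not proof of a correct guess: many byte strings are valid in several encodings, which is why the hints above are "no guarantees".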

Now, at the moment, the only data I have to support this claim is my experience with DrProject in non-English locations. If I'm the only one who has had these sorts of problems, I'll go back to "Unicode for Dummies".

so it might as well be built in; I would prefer the option
to run `text=input.encode('guess')` (or something similar) than relying
on an external dependency or worse yet using a hand-rolled algorithm.

Ok, let me try differently then. Please feel free to post a patch to
bugs.python.org, and let other people rip it apart.
For example, I don't think it should be a codec, as I can't imagine it
working on streams.
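
One way to read the concern about streams, sketched under assumptions: a guess made from the first chunk of data can be invalidated by later chunks, whereas a codec working incrementally has to commit to an encoding early. The helper `guess_encoding` below is hypothetical and deliberately naive (it just tries candidates in order); it is not a proposed API.

```python
def guess_encoding(data, candidates=("ascii", "utf-8", "cp1252")):
    """Return the first candidate encoding that decodes `data` cleanly.

    Hypothetical, naive guesser used only to illustrate the stream problem.
    """
    for encoding in candidates:
        try:
            data.decode(encoding)
            return encoding
        except UnicodeDecodeError:
            pass
    return "latin-1"  # every byte sequence decodes as latin-1


# Data arriving in two chunks, as it might from a socket or file.
chunks = [b"Hello, ", b"caf\xc3\xa9"]

# Guess from the first chunk only: nothing but ASCII has been seen yet.
first_guess = guess_encoding(chunks[0])

# Guess from the complete data: the non-ASCII bytes change the answer.
full_guess = guess_encoding(b"".join(chunks))
```

Here the two guesses differ, so a plain function that takes the complete byte string is an easier fit for guessing than a codec that must decode chunk by chunk.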

As things frequently are, it seems like this is a much larger problem than I originally believed.

I'll go back and take another look at the problem, then come back if new revelations appear.

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
