Hi,
On Saturday 09 January 2010 13:45:58, you wrote:
> > Note: I implemented the BOM check in TextIOWrapper; so it's already
> > usable for any file-like object.
>
> Yes, but the implementation is limited to just BOM checking
> and thus only supports UTF-8-SIG, UTF-16 and UTF-32.
Sure, but
On Saturday 09 January 2010 02:12:28, MRAB wrote:
> What about listing the possible encodings? It would try each in turn
> until it found one where the BOM matched or had no BOM:
>
> my_file = open(filename, 'r', encoding='UTF-8-sig|UTF-16|UTF-8')
>
> or is that taking it too far?
Yes, you'
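
A rough sketch of the candidate-list idea quoted above, for illustration only: walk the '|'-separated names in order and stop at the first one whose BOM matches, or which has no BOM at all. The function name first_matching_encoding and the table _BOMS_BY_CODEC are hypothetical; open() does not accept such a string.

```python
import codecs

# BOMs for the codecs that carry one. Any codec not listed here is
# treated as BOM-less and therefore accepted as soon as it is reached.
_BOMS_BY_CODEC = {
    "utf-8-sig": (codecs.BOM_UTF8,),
    "utf-16": (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE),
    "utf-32": (codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE),
}

def first_matching_encoding(head, spec):
    """Return the first encoding in spec that fits the leading bytes.

    head is the first few bytes of the file; spec is a string such as
    'UTF-8-sig|UTF-16|UTF-8' (hypothetical syntax from the thread).
    Returns None if every candidate requires a BOM and none matches.
    """
    for name in spec.split("|"):
        canonical = codecs.lookup(name).name  # normalize the spelling
        boms = _BOMS_BY_CODEC.get(canonical)
        if boms is None or head.startswith(boms):
            return canonical
    return None
```

Order matters in this sketch: the UTF-32 little-endian BOM begins with the UTF-16 little-endian BOM, so UTF-32 would have to be listed before UTF-16 to be detected.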
On Saturday 09 January 2010 01:47:38, you wrote:
> One concern I have with this implementation encoding="BOM" is that if
> there is no BOM it assumes UTF-8.
If no BOM is found, it falls back to the current heuristic: os.device_encoding()
or the system locale.
> (...) Hence, it might be that someon
Victor Stinner wrote:
> (2) Check for a BOM while reading or detect it before?
>
> Everybody agrees that checking for a BOM is an interesting option and should not be
> limited to open().
>
> Marc-Andre proposed a codecs.guess_file_encoding() function accepting a file
> name or a binary file-like obje
On Saturday 09 January 2010 02:23:07, Martin v. Löwis wrote:
> While I would support combining BOM detection in the case where a file
> is opened for reading and no encoding is specified, I see two problems:
> a) if a seek operation is performed before having looked at the BOM,
>no determinat
On 09.01.10 01:47, Glenn Linderman wrote:
> On approximately 1/8/2010 3:59 PM, came the following characters from
> the keyboard of Victor Stinner:
>> Hi,
>>
>> Thanks for all the answers! I will try to sum up all ideas here.
>
> One concern I have with this implementation encoding="BOM" is that
It seems to me that when opening a file, the following is the only
flow that makes sense in the typical case:
if encoding is not None:
    use encoding
elif file has BOM:
    use BOM
else:
    use system default
And hence an encoding='BOM' isn't needed there. Although I'm trying to
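
The flow above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the TextIOWrapper patch discussed in the thread: pick_encoding and _BOM_TABLE are hypothetical names, and the fallback uses locale.getpreferredencoding() alone rather than the full os.device_encoding()/locale heuristic.

```python
import codecs
import locale

# Check the 4-byte UTF-32 BOMs first: the UTF-32 LE BOM starts with
# the same two bytes as the UTF-16 LE BOM.
_BOM_TABLE = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def pick_encoding(head, encoding=None):
    """Choose an encoding from the first bytes of a file (hypothetical helper).

    1. An explicit encoding always wins.
    2. Otherwise a recognized BOM selects the matching codec.
    3. Otherwise fall back to the locale's preferred encoding.
    """
    if encoding is not None:
        return encoding
    for bom, name in _BOM_TABLE:
        if head.startswith(bom):
            return name
    return locale.getpreferredencoding(False)
```

A caller would read the first four bytes in binary mode, pass them to pick_encoding(), and then reopen the file in text mode with the result; the "utf-8-sig", "utf-16" and "utf-32" codecs consume the BOM themselves when decoding.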
On approximately 1/8/2010 5:12 PM, came the following characters from
the keyboard of MRAB:
Glenn Linderman wrote:
On approximately 1/8/2010 3:59 PM, came the following characters from
the keyboard of Victor Stinner:
Hi,
Thanks for all the answers! I will try to sum up all ideas here.
One co
>>> Antoine would like to check for a BOM by default, because both options
>>> (system locale vs checking for BOM) are the same thing.
>>>
>> To be clear, I am not saying it is the same thing. What I think is
>> that it would be a mistake to use a mildly unreliable heuristic by
>> default (the locale +
Glenn Linderman wrote:
On approximately 1/8/2010 3:59 PM, came the following characters from
the keyboard of Victor Stinner:
Hi,
Thanks for all the answers! I will try to sum up all ideas here.
One concern I have with this implementation encoding="BOM" is that if
there is no BOM it assumes U
On approximately 1/8/2010 3:59 PM, came the following characters from
the keyboard of Victor Stinner:
Hi,
Thanks for all the answers! I will try to sum up all ideas here.
One concern I have with this implementation encoding="BOM" is that if
there is no BOM it assumes UTF-8. That is probably
On 09/01/2010 00:09, Antoine Pitrou wrote:
Hello Victor,
Victor Stinner writes:
(1) Change default open() behaviour or make it optional?
[...]
Antoine would like to check for a BOM by default, because both options (system
locale vs checking for BOM) are the same thing
Hello Victor,
Victor Stinner writes:
>
> (1) Change default open() behaviour or make it optional?
>
[...]
>
> Antoine would like to check for a BOM by default, because both options (system
> locale vs checking for BOM) are the same thing.
To be clear, I am not saying it is the same t