[issue3187] os.listdir can return byte strings

2008-09-23 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Guido compiled my patches here: http://codereview.appspot.com/3055 My patches allow bytes for fnmatch.filter(), glob.glob1(), os.path.join() and open(). ___ Python tracker <[EMAIL PROTECTE

[issue3952] _lsprof: clear() should call flush_unmatched()

2008-09-23 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: Example to reproduce the bug (using Python trunk): --- from gc import collect import _lsprof def callMethod(obj): obj.clear() collect() obj = _lsprof.Profiler() obj.enable() callMethod(obj) obj.enable() del obj collect() --

[issue3951] Disable Py_USING_MEMORY_DEBUGGER!

2008-09-24 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @loewis: your patch (revert.diff) includes a change in configure.in about OpenBSD !? ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3954] _hotshot: invalid error control in logreader()

2008-09-24 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: Using the Fusil fuzzer, I found a "minor" bug in the _hotshot module: it doesn't correctly check for errors in hotshot_logreader(). On error, an exception is raised (e.g. by eof_error()) but the result is a pointer to a

[issue3954] _hotshot: invalid error control in logreader()

2008-09-24 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Oops, my previous patch always raises an error, even on a valid file :-p Here is a new patch with *a unit test* (yeah!). Added file: http://bugs.python.org/file11590/_hotshot_logreader2.patch ___

[issue3954] _hotshot: invalid error control in logreader()

2008-09-24 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11589/_hotshot_logreader.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3954] _hotshot: invalid error control in logreader()

2008-09-24 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @georg: give me svn access and I will commit it ;-) ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3967] bytearray().count()

2008-09-25 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: bytes_count() doesn't check the maximum value of start: _adjust_indices() should check that start is smaller than len (smaller or equal? len or len-1?). Example: >>> b = bytearray(3) >>> b.count("x", 149

[issue3967] bytearray().count()

2008-09-25 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Here is a trace of Valgrind: >>> b=bytearray(2) >>> b.count("", 3493403, 0) 0 >>> b.count("", 23131230123012010231023, 0) ==13650== Invalid read of size 1 ==13650==at 0x
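
The clamping that _adjust_indices() is expected to perform can be sketched in Python. This is only an illustration, not the C fix that was committed: the helper names adjust_indices() and safe_count() are hypothetical, and the sketch borrows slice.indices() to do the clamping.

```python
def adjust_indices(start, end, length):
    """Clamp (start, end) into the range [0, length], as slicing does."""
    # slice.indices() accepts arbitrarily large integers and clamps them.
    start, end, _step = slice(start, end).indices(length)
    return start, end


def safe_count(buf, sub, start=0, end=None):
    """Count occurrences of sub in buf[start:end] without reading out of bounds."""
    start, end = adjust_indices(start, end, len(buf))
    if start > end:
        # Empty search window: nothing can match, not even an empty pattern.
        return 0
    return buf[start:end].count(sub)


# Python 3 requires a bytes-like pattern (the 2008 report passed a str).
print(safe_count(bytearray(3), b"x", 10**20))
print(safe_count(bytearray(2), b"", 3493403, 0))
```

Both calls print 0: once start is clamped to the buffer length, an oversized start can no longer lead to a read past the end of the buffer.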

[issue3973] Invalid line number in Exception traceback with header # -*- coding: xxx -*-

2008-09-26 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: Short example: --- # -*- coding: ASCII -*- raise Exception("line 2") --- Result: Traceback (most recent call last): File "plop.py", line 3, in Exception: line 2 The problem is around newtra

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: By setting lineno to 1 (as proposed by ocean-city), the ASCII tests (test1 and test2, see below) work correctly. This change doesn't impact the utf-8/iso-8859-1 charsets (they are a special case). --- test1 --- # coding: ASCII raise

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: ocean-city's testcase is invalid: it uses subprocess.call(), which returns the exit code, not the Python error line number! Here is a better testcase using subprocess.Popen() that checks the line number and also the displayed line. It tests
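
The difference between the exit code and the reported line number can be illustrated with a small script. This is a sketch, not the testcase attached to the issue: the function name traceback_line() and the temporary file name plop.py are chosen here for illustration.

```python
import os
import re
import subprocess
import sys
import tempfile

SCRIPT = '# -*- coding: ASCII -*-\nraise Exception("line 2")\n'


def traceback_line(source):
    """Run source as a script; return (exit code, line number in the traceback)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "plop.py")
        with open(path, "w", encoding="ascii") as f:
            f.write(source)
        # Capture stderr: the exit code alone says nothing about the line number.
        proc = subprocess.Popen([sys.executable, path], stderr=subprocess.PIPE)
        _, stderr = proc.communicate()
    match = re.search(rb'", line (\d+)', stderr)
    lineno = int(match.group(1)) if match else None
    return proc.returncode, lineno


print(traceback_line(SCRIPT))
```

The exit code is 1 for any uncaught exception, so only the parsed traceback shows whether the line number after the coding header is right.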

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Hum, about the empty line error using a multibyte charset, the issue is different. PyTraceBack_Print() calls _Py_DisplaySourceLine() which doesn't take care of the charset. ___ Python trac

[issue3975] PyTraceBack_Print() doesn't respect # coding: xxx header

2008-09-26 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: PyTraceBack_Print() doesn't take care of the "# coding: xxx" header of a Python script. It calls _Py_DisplaySourceLine() which opens the file as a byte stream (and not a Unicode character stream). Because of this p

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Here is a patch fixing this issue: it's almost the same as ocean-city's patch, but I prefer to patch lineno only if set_readline() succeeds. About the truncated traceback for multibyte charsets: see the new issue3975. Added

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Oh! My patch breaks "python -m". Maybe the problem is not in the token parser but... somewhere else? --- test.py --- # coding: ASCII raise Exception("line 2") # try again! --- Python 3.0 trunk unpatch

[issue3975] PyTraceBack_Print() doesn't respect # coding: xxx header

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Here is a new version of _Py_DisplaySourceLine() using PyTokenizer_FindEncoding() to read the coding header, and PyFile_FromFd() to create a "unicode-aware" file. The code could be optimized, but at least it displays co

[issue3975] PyTraceBack_Print() doesn't respect # coding: xxx header

2008-09-26 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11611/traceback_unicode.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3975] PyTraceBack_Print() doesn't respect # coding: xxx header

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: (oops, the first patch included a useless whitespace change) Added file: http://bugs.python.org/file11612/traceback_unicode.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3623] _json: fix raise_errmsg(), py_encode_basestring_ascii() and linecol()

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: > Is this something that needs to be backported to 2.6? Hum, here is a part of my patch which can be applied to python 2.6. I don't know if it fixes real bugs, but the code looks better with the patch: PyErr_SetObject()

[issue3977] Check PyInt_AsSsize_t/PyLong_AsSsize_t error

2008-09-26 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: PyLong_AsSsize_t() returns -1 and sets an error (OverflowError) on overflow, but some modules don't check this case. Here is a first patch for BytesIO() and StringIO(). -- components: Library (Lib) files: py3k_bytes_str
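
Seen from Python code, a checked conversion means that an oversized size argument surfaces as an OverflowError instead of being silently misread. The sketch below shows that behaviour on a current CPython 3; the helper names fits_ssize_t() and try_read() are hypothetical and are not part of the patches.

```python
import io
import sys


def fits_ssize_t(value):
    """True if value fits in a C Py_ssize_t (sys.maxsize is its maximum)."""
    return -sys.maxsize - 1 <= value <= sys.maxsize


def try_read(stream, size):
    """Return ('ok', data), or ('overflow', message) if size is too large."""
    try:
        return "ok", stream.read(size)
    except OverflowError as exc:
        return "overflow", str(exc)


print(fits_ssize_t(2**100))
print(try_read(io.BytesIO(b"abc"), 2))
print(try_read(io.BytesIO(b"abc"), 2**100)[0])
```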

[issue3977] Check PyInt_AsSsize_t/PyLong_AsSsize_t error

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Here is a fix for struct.pack_into(). Added file: http://bugs.python.org/file11616/py3k_struct.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3977] Check PyInt_AsSsize_t/PyLong_AsSsize_t error

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Fix _bytesio of Python 2.6. Added file: http://bugs.python.org/file11617/py26_bytesio.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3977] Check PyInt_AsSsize_t/PyLong_AsSsize_t error

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: py3k_struct.patch can be ported to python trunk: so here is the fix for python trunk (2.6). Added file: http://bugs.python.org/file11618/py26_struct.patch ___ Python tracker <[EMAIL PROTECTE

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @ocean-city: Oops, sorry. Using your patch (set lineno in fp_setreadl()), it works in both cases ("python test.py" or "python -m test"). The new patch includes your fix for tokenizer.c and a new version of the t

[issue3967] bytearray().count()

2008-09-26 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Fixed in Python trunk, rev66631, by amaury.forgeotdarc. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11189/filename.py ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11210/invalid_filename.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-09-27 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: getcwd() fails with "NOT FOUNT" (not foun*d*?) if the current directory filename can't be converted to unicode (str type). Here is a patch to fall back to bytes if creation of the unicode string failed. Added file: http://bugs

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-27 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11610/tokenizer-coding.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue2832] Line numbers reported by extract_stack are offset by the #-*- encoding line

2008-09-27 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: This bug is a duplicate of the issue 2384: I tried your example with the patch tokenizer-coding-2.patch and your bug is fixed: * first example (no coding): this is line 3 * second example (with coding): this is line 4 -

[issue3975] PyTraceBack_Print() doesn't respect # coding: xxx header

2008-09-27 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Ooops, my first version introduces a regression: if file open fails, the traceback printing was stopped. Here is a new version of my patch to support #coding: header in _Py_DisplaySourceLine(). It doesn't print the line of file

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-09-27 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Issue 2832 is a duplicate. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.python.org/issue2384> ___ __

[issue3988] Byte warning mode and b'' != ''

2008-09-28 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Here is a patch for this issue. -- keywords: +patch nosy: +haypo Added file: http://bugs.python.org/file11646/bytes_ne_warning.patch ___ Python tracker <[EMAIL PROTECTED]> <http://

[issue3988] Byte warning mode and b'' != ''

2008-09-28 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @christian.heimes: Oops, i totally forget the bytearray() type. Here is a new patch. Added file: http://bugs.python.org/file11647/bytes_ne_warning-2.patch ___ Python tracker <[EMAIL PROTECTE

[issue3988] Byte warning mode and b'' != ''

2008-09-28 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11646/bytes_ne_warning.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3988] Byte warning mode and b'' != ''

2008-09-28 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: I don't know how to turn BytesWarning into an error (as "python3 -bb" does). Here is a patch with tests that only work with "python3 -bb". Added file: http://bugs.python

[issue3995] iso-xxx/cp1252 inconsistencies in Python 2.* not in 3.*

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: If you write "€" in the Python interpreter (Python2), you will get a *bytes* string encoded in your terminal charset. Example on Linux (utf-8): Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40) >>> '€'

[issue3982] support .format for bytes

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: I don't think that b'...'.format() is a good idea. Programmers will continue to mix characters and bytes, since .format() targets characters. -- nosy: +haypo ___ P

[issue1069092] segfault on printing nested sequences of None/Ellipsis

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: This issue is a stack overflow: your code makes recursive calls to internal_print(). Backtrace from gdb: #0 0xb7e92064 in _IO_new_file_overflow () from /lib/tls/i686/cmov/libc.so.6 #1 0xb7

[issue3982] support .format for bytes

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: > I think Martin's suggesting of encoding back to ascii might be > the best thing to do As I understand, you would like to use bytes as characters, like b'{code} {message}'.format(code=100, message='

[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: About os.getcwd(), another solution is merge_os_getcwd_getcwdu.patch: os.getcwd() always returns a unicode string and raises an error on a unicode decode error, whereas os.getcwd(bytes=True) always returns bytes. The old function os.getcwd

[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: As Steven Bethard proposed, here is a new version of my getcwd() patch: instead of adding a keyword argument "bytes", I created a function getcwdb(): * os.getcwd() -> unicode * os.getcwdb() -> bytes In Python
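
The getcwd()/getcwdb() pair described above exists in current Python 3, so the split can be checked directly. The sketch below also uses os.fsencode(), which was added later (Python 3.2) and is not part of the patch discussed here.

```python
import os

cwd_text = os.getcwd()    # str (unicode)
cwd_bytes = os.getcwdb()  # bytes

print(type(cwd_text).__name__, type(cwd_bytes).__name__)

# os.fsencode() converts a str path to bytes with the filesystem encoding,
# so both functions should describe the same directory.
print(os.fsencode(cwd_text) == cwd_bytes)
```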

[issue3996] PyOS_CheckStack does not work

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: This issue may be related: issue1069092 -- nosy: +haypo ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-09-29 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Patch python3_bytes_filename.patch: - open() supports bytes - listdir(unicode) -> only unicode, *skip* invalid filenames (as asked by Guido) - remove os.getcwdu() - create os.getcwdb() -> bytes - glob.glob() s

[issue3999] Real segmentation fault handler

2008-09-29 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: I would like to be able to catch SIGSEGV in my Python code! So I started to hack Python trunk to support this feature. The idea is to use a signal handler which calls longjmp(), and add setjmp() at Py_EvalFrameEx() entry. See at

[issue3999] Real segmentation fault handler

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @amaury.forgeotdarc: It looks like PyOS_CheckStack() is only implemented for Windows. It uses alloca() + __try/__except + _resetstkoflw(). Neither the GNU libc nor the Linux kernel checks the stack pointer on alloca(), it's just

[issue3999] Real segmentation fault handler

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Note: my patch can be adapted to catch SIGFPE (division by zero or other math error). For int/long types, Python avoids division by zero, but for code written in C ("external modules"), Python is unable to catch such erro

[issue3999] Real segmentation fault handler

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Oops, my patch was broken. I forgot to install the fault handler! Here is a new version of the patch which also catches SIGFPE: it raises an ArithmeticError. Added file: http://bugs.python.org/file11666/segfault-2

[issue3999] Real segmentation fault handler

2008-09-30 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11659/segfault.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3951] Disable Py_USING_MEMORY_DEBUGGER!

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Closing the issue since it's committed in 2.6 and 3.0. My patch configure-memory-debugger.patch is useless; a developer can fix obmalloc.c. -- status: open -> closed ___ Python t

[issue3959] Add Google's ipaddr.py to the stdlib

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: I'm the maintainer of the IPy library, another library for IPv4/IPv6 manipulation. The code is old (was written for Python 2.2?), but released under the BSD license. Main issue of this library: it's unable to manipulation

[issue3959] Add Google's ipaddr.py to the stdlib

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Ooops, the website: http://software.inl.fr/trac/wiki/IPy ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3959] Add Google's ipaddr.py to the stdlib

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Another Python library: http://erlug.linux.it/~da/soft/iplib/ ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-09-30 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: As I wrote, python3_bytes_filename.patch was just an initial support for bytes filename. So as asked by Guido, here is a new version of my patch. Changes: - for all functions, support bytes as well as bytearray - os.readlink(u

[issue4004] missing newline in "Could not convert argument %s to string" error message

2008-09-30 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: Example: $ ./python $(echo -e "\xff"); ./python $(echo -e "\xff"); echo "--" Could not convert argument 1 to stringCould not convert argument 1 to string-- -- files: argv_error_newline.patch

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-01 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: IDLE checksyntax() function doesn't support Unicode. Example with idle-3.0rc1-quits-when-run.py in an ASCII terminal: $ ./python Tools/scripts/idle Exception in Tkinter callback Traceback (most recent call last): File "

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-01 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Hum, the problem is that IDLE asks io.open() to detect the charset whereas open() doesn't know the #coding: header. So if your locale is ASCII, CP1252 or anything other than UTF-8, reading the file will fail. I wrote a patch to

[issue2382] [Py3k] SyntaxError cursor shifted if multibyte character is in line.

2008-10-01 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: See also a related issue: issue3975. -- nosy: +haypo ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4006] os.getenv silently discards env variables with non-UTF-8 values

2008-10-01 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: It's not a bug, it's a feature! Python3 rejects invalid byte sequence (according to the "default system encoding") from the command line or environment variables. listdir(str) will also drop invalid filenames. Yes, w

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11673/idle_encoding.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-02 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Ah! tokenize already has a function detect_encoding(). My new patch uses it to avoid code duplication. Added file: http://bugs.python.org/file11677/idle_encoding-2.patch ___ Python tracker

[issue4016] improve linecache: reuse tokenize.detect_encoding() and io.open()

2008-10-02 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: linecache uses its own code to detect a Python script's encoding whereas a function tokenize.detect_encoding() already exists. It also does the bytes => unicode conversion of the file lines itself whereas open() already supports thi
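
The idea of combining tokenize.detect_encoding() with open() can be sketched as follows. This is an illustration, not the attached linecache patch: read_source_lines() and demo() are hypothetical names.

```python
import os
import tempfile
import tokenize


def read_source_lines(filename):
    """Read a Python source file as text, honouring its coding header."""
    with open(filename, "rb") as f:
        # detect_encoding() reads at most two lines to find a BOM or cookie.
        encoding, _first_lines = tokenize.detect_encoding(f.readline)
    with open(filename, encoding=encoding) as f:
        return f.readlines()


def demo():
    """Write a small Latin-1 script to a temporary file and read it back."""
    fd, path = tempfile.mkstemp(suffix=".py")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b'# coding: latin-1\ns = "\xe9"\n')
        return read_source_lines(path)
    finally:
        os.remove(path)


print(demo())
```

Later Python versions (3.2 and newer) provide tokenize.open(), which performs the same two steps in one call.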

[issue4016] improve linecache: reuse tokenize.detect_encoding() and io.open()

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Added file: http://bugs.python.org/file11679/tokenize_bom_utf8.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Added file: http://bugs.python.org/file11680/python3_bytes_filename-3.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3187] os.listdir can return byte strings

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11667/python3_bytes_filename-2.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-02 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: loewis wrote: > Notice that there is also IOBinding.coding_spec. > Not sure whether this or the one in tokenize is more correct. Oh! IOBinding reimplements many features now available in Python, like universal newlines or funct

[issue4006] os.getenv silently discards env variables with non-UTF-8 values

2008-10-02 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @a.badger: Again, dropping invalid filenames in listdir() is a (very recent) choice of the Python3 design. Please read this document which explains the current situation of bytes vs unicode: http://wiki.python.or

[issue4021] tokenize.detect_encoding(): raise SyntaxError on codecs.lookup() error

2008-10-02 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: tokenize.detect_encoding() raises a LookupError() if the charset is unknown whereas Python raises a SyntaxError. So this patch mimics Python behaviour for tokenize module. Extra: reuse BOM_UTF8 from the codecs module. --
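
The behaviour requested here can be observed on a current Python 3, where tokenize.detect_encoding() reports an unknown codec as a SyntaxError. The wrapper detect() and the codec name no-such-codec are made up for this sketch.

```python
import io
import tokenize


def detect(source_bytes):
    """Return the encoding that tokenize detects for the given source bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines = tokenize.detect_encoding(readline)
    return encoding


# No coding cookie and no BOM: the default is used.
print(detect(b"x = 1\n"))

# An unknown codec name is reported as SyntaxError, like the compiler does.
try:
    detect(b"# coding: no-such-codec\nx = 1\n")
except SyntaxError as exc:
    print("SyntaxError:", exc)
```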

[issue4016] improve linecache: reuse tokenize.detect_encoding() and io.open()

2008-10-02 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: I wrote a different (and better) patch for tokenize module: moved to the issue4021. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4016] improve linecache: reuse tokenize.detect_encoding() and io.open()

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11679/tokenize_bom_utf8.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-02 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @loewis: Ok, I didn't know. I think that it's better to reuse existing code. I also compared the implementations of encoding detection, and the code looks the same in IDLE and tokenize, but I prefer tokenize. tokenize

[issue4023] convert os.getcwdu() to os.getcwd(), and getcwdu() to getcwd()

2008-10-02 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: Python3 removes os.getcwdu() and introduces os.getcwdb(). The patch is a fixer for lib2to3 replacing "os.getcwdu()" with "os.getcwd()", and "getcwdu()" with "getcwd()". I hope that nobody defi

[issue4023] convert os.getcwdu() to os.getcwd(), and getcwdu() to getcwd()

2008-10-02 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Added file: http://bugs.python.org/file11684/fix_getcwdu.py ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue4023] convert os.getcwdu() to os.getcwd(), and getcwdu() to getcwd()

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: On Friday 03 October 2008 04:44:13, Benjamin Peterson wrote: > Your patch looks pretty good. Could you write tests for it, though? My patch doesn't work, that's why I didn't write a unit test :-)

[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: On Friday 03 October 2008 03:45:44, Amaury Forgeot d'Arc wrote: > Here is a patch for Windows: (...) > test_ntpath also runs functions with bytes. Which charset is used when you use a bytes filename? I read somewhe

[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: > The most generic way of allowing all bytes-alike objects is to write: > path = bytes(path) If you use that, any unicode may fail and the function will always return unicode. The goal is to get: func(bytes)->bytes

[issue4024] float(0.0) singleton

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Maybe we need more hardcoded floats. I mean a "cache" of current floats. Example of pseudocode: def cache_float(value): return abs(value) in (0.0, 1.0, 2.0) def create_float(value): try: return cache[value]
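
The pseudocode is cut off in this listing, so the following is only one possible reading of it, written as runnable Python. The names _FLOAT_CACHE and is_cacheable() are invented for the sketch, and keying the cache on the sign is an added assumption: 0.0 and -0.0 compare equal, so a plain value key would merge them.

```python
import math

_FLOAT_CACHE = {}


def is_cacheable(value):
    """Only a handful of very common values are worth sharing."""
    return abs(value) in (0.0, 1.0, 2.0)


def create_float(value):
    """Return a shared float object for common values, a fresh one otherwise."""
    value = float(value)
    if not is_cacheable(value):
        return value
    # Include the sign in the key so that -0.0 and 0.0 stay distinct.
    key = (abs(value), math.copysign(1.0, value))
    return _FLOAT_CACHE.setdefault(key, value)


a = create_float(float("2"))
b = create_float(float("2.0"))
print(a is b)
```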

[issue3187] os.listdir can return byte strings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: path=path is useless in most of the code (unicode path); this code is faster in both cases (bytes or unicode)! if not isinstance(path, str): path = bytes(path) * a if b else c: unicode=0.756730079651; bytes=1.93071103096 * i
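
The isinstance() test quoted above fits the stated goal of returning the same type as the argument. A minimal sketch of that pattern, with a hypothetical function add_suffix() standing in for a real path function:

```python
def add_suffix(path, suffix=".bak"):
    """Append suffix to path: str in gives str out, bytes-like in gives bytes out."""
    if not isinstance(path, str):
        # Normalise bytearray and other bytes-like objects to bytes.
        path = bytes(path)
        return path + suffix.encode("ascii")
    return path + suffix


print(add_suffix("notes.txt"))
print(add_suffix(bytearray(b"notes.txt")))
```

Testing for str first keeps the common unicode case cheap and converts only the bytes-like inputs.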

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @loewis: I guess that your locale is still UTF-8. On Linux (Ubuntu Gutsy) using "env -i DISPLAY=$DISPLAY HOME=$HOME xterm" to get a new empty environment, I get: $ locale LANG= LC_ALL= LC_CTYPE="POSIX" LC_NUME

[issue4035] Support bytes for os.exec*()

2008-10-03 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: os.exec*() functions don't support bytes if the program name isn't an absolute path. The problem is that PATH is used to complete the full path but Python3 disallows bytes+str (which is a good thing!). Example: pyth

[issue4036] Support bytes for subprocess.Popen()

2008-10-03 Thread STINNER Victor
New submission from STINNER Victor <[EMAIL PROTECTED]>: subprocess doesn't support bytes for the "args" argument. - On Windows, subprocess._execute_child() converts args to a string if it was a list - On UNIX, subprocess._execute_child() converts args to a list if it

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Using py3k trunk + fix_latin.diff: - compile(b'# coding: latin-1\nu = "\xC7"\n', '', 'exec') doesn't fail - test_pep3120.py is ok - but executing an ISO-8859-1 script fails: see attached is

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: See also Lib/test/test_shlex.py: trunk is ok, but with fix_latin.diff the test fails. ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: It looks like the problem of fix_latin.diff is the decoding_state: it's set to STATE_NORMAL whereas current behaviour is to stay in state STATE_RAW. I wrote another patch which is a mix of case 1 (utf-8: just set tok->encodi

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: @brett.cannon: I found it: ast.c used a hack for iso-8859-1! Since this hack introduces a bug (your compile(...) example), I prefer to remove it to simplify the code. The new patch just removes the hack in tokenizer.c and ast.c. I

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-03 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: After reading tokenizer.c 1000 times, I finally used grep: $ grep -l -i 'iso.8859.1' $(find -name "*.c") ./Python/ast.c <~~~ WTF? ./Objects/unicodeobject.c ./Parser/tokenizer.c ./Modules/cjkcodecs/_codecs_iso20

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-04 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: IDLE opens the script more than once. There are two cases: (1) first open when IDLE reads the file content to display it (2) second open on pressing F5 key (Run Module) to check the syntax (1) uses IOBinding and fails to open ISO-

[issue4035] Support bytes for os.exec*()

2008-10-04 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: The fix can be changed to be specific to POSIX system: +if name == 'posix' \ +and isinstance(file, bytes): +encoding = sys.getfilesystemencoding() +PATH = (bytes(dir, encoding) for dir in PATH)
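
The proposed fix encodes each PATH entry before joining it with a bytes program name. A standalone sketch of that step follows; candidate_paths() is a hypothetical helper, not the code in os.py.

```python
import os
import sys


def candidate_paths(file, path_dirs):
    """Join file with each PATH directory, keeping the type of file."""
    if isinstance(file, bytes):
        # str + bytes is an error in Python 3, so encode the directories first.
        encoding = sys.getfilesystemencoding()
        path_dirs = [d.encode(encoding) if isinstance(d, str) else d
                     for d in path_dirs]
    return [os.path.join(d, file) for d in path_dirs]


print(candidate_paths(b"ls", ["/bin", "/usr/bin"]))
print(candidate_paths("ls", ["/bin", "/usr/bin"]))
```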

[issue4053] str.split unintentionally strips char 'I' from the string

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Duplicate of issue4054. -- nosy: +haypo resolution: -> duplicate status: open -> closed ___ Python tracker <[EMAIL PROTECTED]> <http://bugs

[issue4054] str.split unintentionally strips char 'I' from the string

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: It's not a bug. Please read the documentation of the split() method: http://docs.python.org/library/stdtypes.html#str.split If you want to get the value after "=", use: value = line.split("FILE=", 1)[1
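
The suggested split() call can be tried on a sample line. The original report is not shown in this listing, so the input below is hypothetical, and the lstrip() contrast is an assumption about how a leading 'I' could go missing, not a statement about the reporter's code.

```python
line = "FILE=INDEX.TXT"

# Split once on the literal prefix and keep what follows it.
value = line.split("FILE=", 1)[1]
print(value)

# By contrast, lstrip() treats its argument as a set of characters,
# so it also removes the 'I' that starts the value.
print(line.lstrip("FILE="))
```

The first call prints INDEX.TXT and the second prints NDEX.TXT.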

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-06 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11698/tokenizer_iso-8859-1.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-06 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Added file: http://bugs.python.org/file11716/tokenizer_iso-8859-1-patch3.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue2384] [Py3k] line number is wrong after encoding declaration

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: benjamin was worried by the comment /* dirty hack */ in my previous comment. After reading tokenizer.c again and again, I can say that the fix is correct: the file is closed and then re-opened by fp_setreadl() (using io.open()), and

[issue4008] IDLE: checksyntax() doesn't support Unicode?

2008-10-06 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11677/idle_encoding-2.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-06 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Removed file: http://bugs.python.org/file11715/python3_bytes_filename-3.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue3574] compile() cannot decode Latin-1 source encodings

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: My patch version 2 included an "unrelated" fix for the issue2384. Added file: http://bugs.python.org/file11715/python3_bytes_filename-3.patch ___ Python tracker <[EMAIL PROTECTED]>

[issue3623] _json: fix raise_errmsg(), py_encode_basestring_ascii() and linecol()

2008-10-06 Thread STINNER Victor
Changes by STINNER Victor <[EMAIL PROTECTED]>: Added file: http://bugs.python.org/file11718/json_test.patch ___ Python tracker <[EMAIL PROTECTED]> <http://bugs.pytho

[issue1565525] gc allowing tracebacks to eat up memory

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Similar issue: issue4034 proposes to be able to set tb.tb_frame=None. It's easy to implement this, I can write a patch for this. -- nosy: +haypo ___ Python tracker <[EMAIL PRO

[issue4034] traceback attribute error

2008-10-06 Thread STINNER Victor
STINNER Victor <[EMAIL PROTECTED]> added the comment: Instead of converting tb_frame attribute to read only, I prefer to allow the user to clear the traceback to free some memory bytes. So I wrote a different patch. marge$ ./python Python 2.7a0 (trunk:66786M, Oct 7 2008, 00:48:32)
