[ python-Bugs-1202493 ] RE parser too loose with {m,n} construct
Bugs item #1202493, was opened at 2005-05-15 21:59
Message generated for change (Comment added) made by niemeyer
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1202493&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: RE parser too loose with {m,n} construct
Initial Comment:
This seems wrong to me:
>>> re.match("(UNIX{})", "UNIX{}").groups()
('UNIX',)
With no numbers or commas, "{}" should not be considered
special in the pattern. The docs identify three numeric
repetition possibilities: {m}, {m,} and {m,n}. There's no
description of {} meaning anything. Either the docs should
say {} implies {1,1}, {} should have no special meaning, or
an exception should be raised during compilation of the
regular expression.
--
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 08:58
Message:
Logged In: YES
user_id=7887
Fixed in:
Lib/sre_parse.py: 1.64 -> 1.65
Lib/test/test_re.py: 1.55 -> 1.56
Misc/NEWS: 1.1360 -> 1.1361
Notice that perl will also handle constructs like '{,2}' as
literals, while Python will consider them as '{0,2}'. I
think it's too late to change that one though, as this
behavior may be relied upon in code out there.
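A minimal sketch of the two behaviours described in this comment, assuming a current Python 3 interpreter rather than the 2.5 tree the fix was committed to. The variable names are illustrative only.

```python
import re

# "{}" carries no repeat counts, so it is matched as two literal characters.
literal_braces = re.match("(UNIX{})", "UNIX{}").groups()
print(literal_braces)

# "{,2}" is still read as the repeat "{0,2}", not as literal text.
as_repeat = re.fullmatch("a{,2}", "aa") is not None
as_literal = re.fullmatch("a{,2}", "a{,2}") is not None
print(as_repeat, as_literal)
```

Code that needs Perl-style literal handling of '{,2}' can escape the opening brace explicitly.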
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 22:16
Message:
Logged In: YES
user_id=1188172
No, you're the expert, so you'll get the honor of fixing it. :P
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-08-31 22:11
Message:
Logged In: YES
user_id=7887
I support Skip's opinion on following whatever perl is currently doing, if
that won't lead to unexpected errors on current running code which was
considered sane (expecting {} to behave like {1,1} is not sane :-).
Your original patch looks under-optimal though (look at the tests around
it). I'll fix it, or if you prefer to do it by yourself, I may apply the
patch/review it/whatever. :-)
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 21:55
Message:
Logged In: YES
user_id=1188172
Any more objections against treating "{}" as literal?
The impact on existing code will be minimal, as I presume no
one will write "{}" in a RE instead of "{1,1}" (well, who
writes "{1,1}" anyway...).
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 19:10
Message:
Logged In: YES
user_id=1188172
Then, I think, we should follow Perl's behaviour and treat
"{}" as a literal, just like every other brace construct
that isn't a repeat specifier.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-06-03 18:46
Message:
Logged In: YES
user_id=80475
Hmm, it looks like they cannot be treated differently
without breaking backwards compatibility.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 18:00
Message:
Logged In: YES
user_id=1188172
Raymond said that braces should always be considered
special. This includes constructs like "{(?P.*)}"
which the string module uses, and which would be a syntax
error then.
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-03 15:13
Message:
Logged In: YES
user_id=44345
Can you elaborate? I fail to see what the string module
has to do with the re module. Can you give an example
of code that would break?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 08:01
Message:
Logged In: YES
user_id=1188172
I just realized that e.g. the string module uses unescaped
braces, so I think we should not become overly strict as it
would break much code...
Perhaps the original patch (sre-brace-diff) is better...
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-02 11:16
Message:
Logged In: YES
user_id=44345
In the absence of strong technical reasons, I'd vote to do what Perl
does. I believe the assumption all along has been that most people
coming to Python who already know how to use regular expressions are
Perl programmers. It wouldn't seem to make sense to throw little land
mines in their paths. I realize that explicit is better than implicit, but
[ python-Bugs-1113484 ] document {m} regex matcher wrt empty matches
Bugs item #1113484, was opened at 2005-01-31 21:46
Message generated for change (Comment added) made by niemeyer
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1113484&group_id=5470
Category: Regular Expressions
Group: None
>Status: Closed
>Resolution: Works For Me
Priority: 5
Submitted By: Wummel (calvin)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: document {m} regex matcher wrt empty matches
Initial Comment:
The {m} matcher seems not to be applicable to (some)
empty matches. For example this will raise a regex
compile error:
>>> re.compile("(a*){4}")
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.3/sre.py", line 179, in compile
return _compile(pattern, flags)
File "/usr/lib/python2.3/sre.py", line 230, in _compile
raise error, v # invalid expression
sre_constants.error: nothing to repeat
However this matcher is compiled without error:
>>> re.compile("(\ba*){4}")
<_sre.SRE_Pattern object at 0xb7f86c58>
I don't know why the first example gives an error, but
it should perhaps be mentioned in the documentation
about the {} regex operator.
--
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 09:17
Message:
Logged In: YES
user_id=7887
Would you be able to come up with an example that would be
useful for that kind of construction?
"(a*){4}" will always match "a" as many times as possible,
and then match the empty string 3 more times. So it has the
effect of "a*", but in addition will kill the grouping
effect since the given group will always be empty. With that
in mind considering it as a syntax error seems correct.
Do you agree?
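Whether such a pattern is rejected depends on the interpreter version, so a small probe may be more useful than a fixed answer. The sketch below uses a hypothetical helper, compiles(), and deliberately makes no claim about the result for the patterns from this report.

```python
import re

def compiles(pattern):
    """Return True if `pattern` compiles, False if re raises an error."""
    try:
        re.compile(pattern)
    except re.error:
        return False
    return True

# Patterns from the report, plus a plain counted repeat for comparison.
for pattern in ("(a*){4}", r"(\ba*){4}", "a{4}"):
    print(repr(pattern), compiles(pattern))
```

Running it on the interpreter in question shows directly whether "nothing to repeat" is still raised there.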
--
Comment By: Wummel (calvin)
Date: 2005-02-03 17:06
Message:
Logged In: YES
user_id=9205
Oops, it should have been:
>>> re.compile(r"(\ba*){4}")
And now the error is consistent (now tested in Python 2.4
instead of 2.3):
Traceback (most recent call last):
File "<stdin>", line 1, in ?
File "/usr/lib/python2.4/sre.py", line 180, in compile
return _compile(pattern, flags)
File "/usr/lib/python2.4/sre.py", line 227, in _compile
raise error, v # invalid expression
sre_constants.error: nothing to repeat
So it seems that {m} operator does not like potentially
empty matches.
--
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1113484&group_id=5470
___
Python-bugs-list mailing list
Unsubscribe:
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1058786 ] r'\10' as replacement pattern loops in compilation
Bugs item #1058786, was opened at 2004-11-02 12:39
Message generated for change (Comment added) made by niemeyer
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1058786&group_id=5470
Category: Regular Expressions
Group: Python 2.3
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Nick Maclaren (nmm1)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: r'\10' as replacement pattern loops in compilation
Initial Comment:
The following program loops under at least Solaris 9 on
SPARC and Linux (kernel 2.6) in x86. From tracebacks, it
seems to be in the internal compilation of the pattern
r'\10'.
from re import compile
line = ""
pat = compile(12 * r'(\d+)')
ltarget = float(pat.sub(r'\10',line))
print ltarget
--
>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 09:34
Message:
Logged In: YES
user_id=7887
It's fixed in the 2.4+, and there's a workaround for
previous versions, so I'm closing that as wontfix for 2.3.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-05-31 11:45
Message:
Logged In: YES
user_id=1188172
Setting group to Python 2.3. If there won't be a 2.3.6 in
the future, it can be closed.
--
Comment By: Nick Maclaren (nmm1)
Date: 2004-11-02 13:28
Message:
Logged In: YES
user_id=652073
I have also checked, and it is fixed. From my point of
view, it isn't worth backporting, as I can upgrade and
don't mind using a beta version.
--
Comment By: Michael Hudson (mwh)
Date: 2004-11-02 13:07
Message:
Logged In: YES
user_id=6656
It does seem to be fixed in 2.4, but not in 2.3(.3, anyway).
I know some of the re changes for 2.4 are fairly large, so I
don't know whether the fix is a backport candidate for
2.3.5. Gustavo might know.
--
Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-11-02 13:07
Message:
Logged In: YES
user_id=469548
I get the following on Python 2.4/Linux 2.6.8, so it does
seem to be fixed:
>>> from re import compile
>>> line = ""
>>> pat = compile(12 * r'(\d+)')
>>> ltarget = float(pat.sub(r'\10',line))
Traceback (most recent call last):
File "<stdin>", line 1, in ?
ValueError: empty string for float()
--
Comment By: Fredrik Lundh (effbot)
Date: 2004-11-02 13:00
Message:
Logged In: YES
user_id=38376
If you need a workaround for 2.2, use a sub callback:
http://effbot.org/zone/re-sub.htm#callbacks
--
Comment By: Fredrik Lundh (effbot)
Date: 2004-11-02 12:58
Message:
Logged In: YES
user_id=38376
Cannot check this right now, but I'm 99% sure that this has
been fixed in 2.4.
--
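The sub-callback workaround mentioned in this thread might look roughly like the sketch below. It is written for a current Python 3 interpreter, and the whitespace separator and the sample line are assumptions added so that the pattern has something to match (the report used 12 * r'(\d+)' with an empty line).

```python
import re

# Twelve numeric groups separated by whitespace (separator added for the demo).
pat = re.compile(r"\s+".join([r"(\d+)"] * 12))
line = "1 2 3 4 5 6 7 8 9 10 11 12"

# Callback form: the replacement template parser is never involved.
via_callback = pat.sub(lambda m: m.group(10), line)

# Explicit group reference, which cannot be confused with an octal escape.
via_explicit_ref = pat.sub(r"\g<10>", line)

print(via_callback, via_explicit_ref)
```

Both forms sidestep any ambiguity in how a two-digit escape such as r'\10' is read.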
[ python-Bugs-1202493 ] RE parser too loose with {m,n} construct
Bugs item #1202493, was opened at 2005-05-15 23:59
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1202493&group_id=5470
Category: Regular Expressions
Group: Python 2.5
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: RE parser too loose with {m,n} construct
Initial Comment:
This seems wrong to me:
>>> re.match("(UNIX{})", "UNIX{}").groups()
('UNIX',)
With no numbers or commas, "{}" should not be considered
special in the pattern. The docs identify three numeric
repetition possibilities: {m}, {m,} and {m,n}. There's no
description of {} meaning anything. Either the docs should
say {} implies {1,1}, {} should have no special meaning, or
an exception should be raised during compilation of the
regular expression.
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 12:58
Message:
Logged In: YES
user_id=1188172
Will you backport the fix?
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 10:58
Message:
Logged In: YES
user_id=7887
Fixed in:
Lib/sre_parse.py: 1.64 -> 1.65
Lib/test/test_re.py: 1.55 -> 1.56
Misc/NEWS: 1.1360 -> 1.1361
Notice that perl will also handle constructs like '{,2}' as
literals, while Python will consider them as '{0,2}'. I
think it's too late to change that one though, as this
behavior may be relied upon in code out there.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-01 00:16
Message:
Logged In: YES
user_id=1188172
No, you're the expert, so you'll get the honor of fixing it. :P
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-01 00:11
Message:
Logged In: YES
user_id=7887
I support Skip's opinion on following whatever perl is currently doing, if
that won't lead to unexpected errors on current running code which was
considered sane (expecting {} to behave like {1,1} is not sane :-).
Your original patch looks under-optimal though (look at the tests around
it). I'll fix it, or if you prefer to do it by yourself, I may apply the
patch/review it/whatever. :-)
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 23:55
Message:
Logged In: YES
user_id=1188172
Any more objections against treating "{}" as literal?
The impact on existing code will be minimal, as I presume no
one will write "{}" in a RE instead of "{1,1}" (well, who
writes "{1,1}" anyway...).
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 21:10
Message:
Logged In: YES
user_id=1188172
Then, I think, we should follow Perl's behaviour and treat
"{}" as a literal, just like every other brace construct
that isn't a repeat specifier.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-06-03 20:46
Message:
Logged In: YES
user_id=80475
Hmm, it looks like they cannot be treated differently
without breaking backwards compatibility.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 20:00
Message:
Logged In: YES
user_id=1188172
Raymond said that braces should always be considered
special. This includes constructs like "{(?P.*)}"
which the string module uses, and which would be a syntax
error then.
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-03 17:13
Message:
Logged In: YES
user_id=44345
Can you elaborate? I fail to see what the string module
has to do with the re module. Can you give an example
of code that would break?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 10:01
Message:
Logged In: YES
user_id=1188172
I just realized that e.g. the string module uses unescaped
braces, so I think we should not become overly strict as it
would break much code...
Perhaps the original patch (sre-brace-diff) is better...
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-02 13:16
Message:
Logged In: YES
user_id=44345
In the absence of strong technical reasons, I'd vote to do what Perl
does. I believe the assumption all along has been that most people
coming
[ python-Bugs-1290333 ] cjkcodec compile error under AIX 5.2 on symbol 100_encode
Bugs item #1290333, was opened at 2005-09-14 02:55
Message generated for change (Settings changed) made by perky
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1290333&group_id=5470
Category: Build
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Deron Meranda (dmeranda)
>Assigned to: Hye-Shik Chang (perky)
Summary: cjkcodec compile error under AIX 5.2 on symbol 100_encode
Initial Comment:
Trying to compile Python 2.4.1 under AIX 5.2 with the
native cc compiler. When compiling the module
cjkcodecs the compiler will fail with these errors on
the source file Modules/cjkcodecs/_codecs_cn.c
building '_codecs_cn' extension
cc -DNDEBUG -O -I.
-I/home/psgtools/aix52/build/Python-2.4.1/./Include
-I/opt/cmax/psgtools/include
-I/home/psgtools/aix52/build/Python-2.4.1/Include
-I/home/psgtools/aix52/build/Python-2.4.1 -c
/home/psgtools/aix52/build/Python-2.4.1/Modules/cjkcodecs/_codecs_cn.c
-o build/temp.aix-5.2-2.4/_codecs_cn.o
"/home/psgtools/aix52/build/Python-2.4.1/Modules/cjkcodecs/_codecs_cn.c",
line 431.3: 1506-206 (S) Suffix of integer constant
100_encode is not valid.
"/home/psgtools/aix52/build/Python-2.4.1/Modules/cjkcodecs/_codecs_cn.c",
line 431.3: 1506-196 (W) Initialization between types
"int(*)(union {...}*,const void*,const unsigned
long**,unsigned long,unsigned char**,unsigned
long,int)" and "int" is not allowed.
and so on.
This is happening because of the "hz" codec. Due to
the way the source file uses the C preprocessor macro,
and the fact that the preprocessor symbol "hz" is
predefined as 100 on this platform, it is producing
invalid lexical symbols such as "100_encode".
The simple solution is to insert the following line in
the source file immediately before the first reference
to the name "hz":
#undef hz
--
[ python-Bugs-1284341 ] re nested conditional matching (?()) doesn't work
Bugs item #1284341, was opened at 2005-09-08 00:36
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1284341&group_id=5470
Category: Regular Expressions
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Erik Demaine (edemaine)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: re nested conditional matching (?()) doesn't work
Initial Comment:
Here is a simple regular expression that should match
\"o, \"{o}, {\"o}, and {\"{o}}: (This example arose as
a simplification of a general accent matcher for
LaTeX/BibTeX.)
r = re.compile(r'(\{)?\\"(\{)?(.)(?(2)\})(?(1)\})')
However, it fails on two out of four of the desired
matches:
r.search(r'\"o') ## returns None (WRONG)
r.search(r'\"{o}').group() ## returns '\\"{o}' (CORRECT)
r.search(r'{\"o}').group() ## returns '\\"o}' (WRONG)
r.search(r'{\"{o}}').group() ## returns '{\\"{o}}' (CORRECT)
The third case is particularly bizarre. Incidentally,
the behavior is different if '(.)' is replaced by '.'
(incorrect in different ways).
I have tested this on Python 2.4.1 on Windows and a CVS
version on Linux. I do not believe it is a platform issue.
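A sketch for checking the four LaTeX accent forms on a given interpreter. The pattern is an assumed reconstruction of the one in the report (a backslash followed by a double quote, with optional inner and outer braces); the names accent, samples and results are illustrative only.

```python
import re

# Optional outer brace, backslash + double quote, optional inner brace,
# one character, then each closing brace only if its opening brace matched.
accent = re.compile(r'(\{)?\\"(\{)?(.)(?(2)\})(?(1)\})')

samples = [r'\"o', r'\"{o}', r'{\"o}', r'{\"{o}}']
results = [accent.search(s).group() for s in samples]

for sample, result in zip(samples, results):
    print(sample, "->", result)
```

On an interpreter that includes the fix, each sample should be matched in full.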
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 19:32
Message:
Logged In: YES
user_id=1188172
The fix is already in Python 2.4 CVS, so I'm closing as Fixed.
--
Comment By: Erik Demaine (edemaine)
Date: 2005-09-08 01:09
Message:
Logged In: YES
user_id=265183
Whoops, I just updated CVS to the latest HEAD and discovered
that the problem has already been solved. Nice work! Sorry
about the extraneous report, but let me turn this into a
request that the fix go into 2.4.2, not just 2.5.
--
[ python-Bugs-1256669 ] Significant memory leak with PyImport_ReloadModule
Bugs item #1256669, was opened at 2005-08-11 08:49
Message generated for change (Comment added) made by collinwinter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1256669&group_id=5470
Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Ben Held (bheld)
Assigned to: Nobody/Anonymous (nobody)
Summary: Significant memory leak with PyImport_ReloadModule
Initial Comment:
Having recently upgraded to Python 2.4, I am having a large
memory leak with the following code built with VC++ 6.0:
PyObject *pName, *pModule;
Py_Initialize();
pName = PyString_FromString(argv[1]);
pModule = PyImport_Import(pName);
Py_DECREF(pName);
PyObject* pModule2 = PyImport_ReloadModule(pModule);
Py_DECREF(pModule2);
Py_DECREF(pModule);
Py_Finalize();
return 0;
I get leaks of over 500 kb. I have another program which is
much more complex, in which every call to
PyImport_ReloadModule is leaking 200+ kb, even though I am
calling Py_DECREF correctly.
--
Comment By: Collin Winter (collinwinter)
Date: 2005-09-14 13:53
Message:
Logged In: YES
user_id=1344176
I've been unable to verify this on Linux. I've tested python
versions 2.2.3, 2.3.5 and 2.4.1, all compiled with gcc 3.3.5
on Debian 3.1 under kernel 2.6.8.
I used the sample program provided by Ben, modified with an
infinite loop over the
PyImport_ReloadModule/Py_DECREF(pModule2) lines, sleeping
for 1 second after every 25 iterations. I tested reloading
the modules distutils, os.path, distutils.command.sdist for
300+ iterations each under each python version. No memory
leak was observed.
--
Comment By: Ben Held (bheld)
Date: 2005-08-16 09:56
Message:
Logged In: YES
user_id=1327580
Boundschecker shows the leak and I have verified this by
watching the process memory increase via the task manager.
--
Comment By: Martin v. Löwis (loewis)
Date: 2005-08-13 09:34
Message:
Logged In: YES
user_id=21627
How do you know there is a memory leak?
--
[ python-Bugs-1256669 ] Significant memory leak with PyImport_ReloadModule
Bugs item #1256669, was opened at 2005-08-11 12:49
Message generated for change (Comment added) made by bheld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1256669&group_id=5470
Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Ben Held (bheld)
Assigned to: Nobody/Anonymous (nobody)
Summary: Significant memory leak with PyImport_ReloadModule
Initial Comment:
Having recently upgraded to Python 2.4, I am having a large
memory leak with the following code built with VC++ 6.0:
PyObject *pName, *pModule;
Py_Initialize();
pName = PyString_FromString(argv[1]);
pModule = PyImport_Import(pName);
Py_DECREF(pName);
PyObject* pModule2 = PyImport_ReloadModule(pModule);
Py_DECREF(pModule2);
Py_DECREF(pModule);
Py_Finalize();
return 0;
I get leaks of over 500 kb. I have another program which is
much more complex, in which every call to
PyImport_ReloadModule is leaking 200+ kb, even though I am
calling Py_DECREF correctly.
--
>Comment By: Ben Held (bheld)
Date: 2005-09-14 18:09
Message:
Logged In: YES
user_id=1327580
This behavior is evident with Python 2.3.5 built on Windows
with VC++ 6.0.
--
Comment By: Collin Winter (collinwinter)
Date: 2005-09-14 17:53
Message:
Logged In: YES
user_id=1344176
I've been unable to verify this on Linux. I've tested python
versions 2.2.3, 2.3.5 and 2.4.1, all compiled with gcc 3.3.5
on Debian 3.1 under kernel 2.6.8.
I used the sample program provided by Ben, modified with an
infinite loop over the
PyImport_ReloadModule/Py_DECREF(pModule2) lines, sleeping
for 1 second after every 25 iterations. I tested reloading
the modules distutils, os.path, distutils.command.sdist for
300+ iterations each under each python version. No memory
leak was observed.
--
Comment By: Ben Held (bheld)
Date: 2005-08-16 13:56
Message:
Logged In: YES
user_id=1327580
Boundschecker shows the leak and I have verified this by
watching the process memory increase via the task manager.
--
Comment By: Martin v. Löwis (loewis)
Date: 2005-08-13 13:34
Message:
Logged In: YES
user_id=21627
How do you know there is a memory leak?
--
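A rough Python-level analogue of the reload-loop test described in this thread, assuming a current Python 3 interpreter. It does not exercise the embedding C API or the Windows/VC++ 6.0 build from the report; reload_growth() is a hypothetical helper and colorsys is just a small stand-in module.

```python
import colorsys
import importlib
import tracemalloc

def reload_growth(module, iterations=300):
    """Reload `module` repeatedly and return the change in traced bytes."""
    tracemalloc.start()
    importlib.reload(module)  # warm up caches before measuring
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(iterations):
        importlib.reload(module)
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before

growth = reload_growth(colorsys)
print("traced growth in bytes:", growth)
```

A figure that keeps rising as the iteration count increases would suggest a leak at the Python level; tracemalloc only sees allocations made through Python's allocators, so it would not catch everything a tool like BoundsChecker reports.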
[ python-Bugs-1066546 ] test_pwd fails on 64bit system (Opteron)
Bugs item #1066546, was opened at 2004-11-15 04:34
Message generated for change (Comment added) made by gvanrossum
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1066546&group_id=5470
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Miki Tebeka (tebeka)
Assigned to: Martin v. Löwis (loewis)
Summary: test_pwd fails on 64bit system (Opteron)
Initial Comment:
test test_pwd failed -- Traceback (most recent call last):
File "/tmp/miki/Python-2.4b2/Lib/test/test_pwd.py",
line 42, in test_values
self.assert_(pwd.getpwuid(e.pw_uid) in
entriesbyuid[e.pw_uid])
OverflowError: signed integer is greater than maximum
$ cat /proc/version
Linux version 2.4.21-20.ELsmp
([EMAIL PROTECTED]) (gcc version 3.2.3
20030502 (Red Hat Linux 3.2.3-42)) #1 SMP Wed Aug 18
20:34:58 EDT 2004
Processor is AMD Opteron 2.4GHz
--
>Comment By: Guido van Rossum (gvanrossum)
Date: 2005-09-14 14:17
Message:
Logged In: YES
user_id=6380
Martin, IMO you can close this now that I've checked in the
AIX patch which should address this with the i->I change
suggested in a comment below. (patch 1284289)
--
Comment By: Neal Norwitz (nnorwitz)
Date: 2005-09-07 17:16
Message:
Logged In: YES
user_id=33168
See this patch which looks like it may fix the same problem
(among others).
https://sourceforge.net/tracker/index.php?func=detail&aid=1284289&group_id=5470&atid=305470
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-03 14:17
Message:
Logged In: YES
user_id=1188172
Is the patch safe to apply? I think so, I haven't seen a
negative uid/gid yet.
--
Comment By: Clark Mobarry (cmobarry)
Date: 2005-09-01 08:57
Message:
Logged In: YES
user_id=1035073
The suggested patch by heffler worked brilliantly for my 64
bit environment. Thanks. My bug submission was on
2005-08-03 14:40.
--
Comment By: Marvin Heffler (heffler)
Date: 2005-08-11 14:19
Message:
Logged In: YES
user_id=298758
I think I figured out the problem with python handling uids and
gids greater than 2147483647 when using the grp.getgrgid
and pwd.getpwuid functions. Both of the functions call
PyArg_ParseTuple with a type of "i", thus indicating the
argument is a signed integer. Instead they should be
using "I" (upper-case i) for an unsigned integer. The fix is
fairly simple. Here are the two patches necessary to the
python source:
diff -Naur Python-2.4.orig/Modules/grpmodule.c Python-2.4/Modules/grpmodule.c
--- Python-2.4.orig/Modules/grpmodule.c 2004-01-20 16:06:00.0 -0500
+++ Python-2.4/Modules/grpmodule.c 2005-08-11 13:36:48.0 -0400
@@ -87,7 +87,7 @@
 {
     int gid;
     struct group *p;
-    if (!PyArg_ParseTuple(args, "i:getgrgid", &gid))
+    if (!PyArg_ParseTuple(args, "I:getgrgid", &gid))
         return NULL;
     if ((p = getgrgid(gid)) == NULL) {
         PyErr_Format(PyExc_KeyError, "getgrgid(): gid not found: %d", gid);
diff -Naur Python-2.4.orig/Modules/pwdmodule.c Python-2.4/Modules/pwdmodule.c
--- Python-2.4.orig/Modules/pwdmodule.c 2004-01-20 16:07:23.0 -0500
+++ Python-2.4/Modules/pwdmodule.c 2005-08-11 13:36:27.0 -0400
@@ -104,7 +104,7 @@
 {
     int uid;
     struct passwd *p;
-    if (!PyArg_ParseTuple(args, "i:getpwuid", &uid))
+    if (!PyArg_ParseTuple(args, "I:getpwuid", &uid))
         return NULL;
     if ((p = getpwuid(uid)) == NULL) {
         PyErr_Format(PyExc_KeyError,
Hopefully, someone from the python project can verify my
patch and get it incorporated into a future release.
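The signed-versus-unsigned distinction behind this patch can be illustrated from Python with the struct module. This is only an analogy: struct format codes are not the PyArg_ParseTuple format units, and fits() is a hypothetical helper.

```python
import struct

LARGE_ID = 2**31  # one past the largest signed 32-bit value (2147483647)

def fits(fmt, value):
    """Return True if `value` can be packed with struct format `fmt`."""
    try:
        struct.pack(fmt, value)
    except struct.error:
        return False
    return True

signed_ok = fits("=i", LARGE_ID)    # signed 32-bit integer
unsigned_ok = fits("=I", LARGE_ID)  # unsigned 32-bit integer
print(signed_ok, unsigned_ok)
```

A uid or gid above 2147483647 overflows the signed form in the same way, which is the OverflowError seen in test_pwd and test_grp.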
--
Comment By: Clark Mobarry (cmobarry)
Date: 2005-08-03 14:40
Message:
Logged In: YES
user_id=1035073
The same error occurs for an Intel P4-521 processor running
RedHat Enterprise Linux WS v4 Intel EM64T 64bit.
$ cat /proc/version
Linux version 2.6.9-5.ELsmp ([EMAIL PROTECTED])
(gcc version 3.4.3 20041212 (Red Hat 3.4.3-9.EL4)) #1 SMP
Wed Jan 5 19:29:47 EST 2005
test test_grp failed -- Traceback (most recent call last):
File
"/home/cmobarry/downloads/Python-2.4.1/Lib/test/test_grp.py",
line 29, in test_values
e2 = grp.getgrgid(e.gr_gid)
OverflowError: signed integer is greater than maximum
test test_pwd failed -- Traceback (most recent call last):
File
"/home/cmobarry/downloads/Python-2.4.1/Lib/test/test_pwd.py",
line 42, in test_values
self.assert_(pwd.getpwuid(e.pw_uid) in
entriesbyuid[e.p
[ python-Bugs-893549 ] skipitem() in getargs.c missing some types
Bugs item #893549, was opened at 2004-02-09 18:30
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=893549&group_id=5470
Category: Python Library
Group: Python 2.2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Kirill A Kryloff (hacocuk)
Assigned to: Nobody/Anonymous (nobody)
Summary: skipitem() in getargs.c missing some types
Initial Comment:
python 2.2.3
looks like skipitem in getargs.c is missing some types
'k' for example
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 21:32
Message:
Logged In: YES
user_id=1188172
Fixed with patch #1212928.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-07-11 21:11
Message:
Logged In: YES
user_id=1188172
I submitted a patch (#1212928) which fixes that.
--
Comment By: Craig Ringer (ringerc)
Date: 2005-07-11 21:08
Message:
Logged In: YES
user_id=639504
It matters all right. Just wasted a bunch of time tracking
this down into the Python sources and confirming it was a
Python bug. It's really nasty for 'es'.
This will cause bizarre errors for
PyArg_ParseTupleAndKeywords(...) calls using the unsupported
format strings after the | optional argument barrier. The
errors will always contain the string:
impossible
The error will, of course, only turn up if the user omits
one or more of the arguments with unsupported formats.
--
Comment By: Petr Vaněk (subik)
Date: 2005-07-11 21:06
Message:
Logged In: YES
user_id=784012
This bug is still present in later versions (2.3, 2.4). We
have a real problem with it (see
http://bugs.scribus.net/view.php?id=2018).
The broken skipitem() in getargs.c makes
PyArg_ParseTupleAndKeywords raise an "impossible" exception
because of missing case conditions.
I would like to ask the developers to fix it (otherwise we
will be forced to provide a patch, which will force us to
learn the Python internals).
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-01 14:10
Message:
Logged In: YES
user_id=1188172
The missing types are u, u#, es, es#, et, et#, k, K, I, U,
t#, w, w# and maybe (...)
I don't know whether this is of any significance though.
--
[ python-Bugs-1285809 ] re special sequence '\w'
Bugs item #1285809, was opened at 2005-09-09 11:40
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1285809&group_id=5470
Category: None
Group: None
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: ChristianJ (cybb20)
Assigned to: Nobody/Anonymous (nobody)
Summary: re special sequence '\w'
Initial Comment:
>>> rexp = re.compile('\w', re.LOCALE)
>>> rexp.findall('_')
['_']
>>> '_'.isalnum()
False
The Python docs say that \w matches the underscore,
but I have to ask: why is this so?
The problem is that I want to match a sequence of
alphanumeric characters but excluding the underscore.
If you defined \w to not support "_" anymore, people
could easily check for the "_" as well with \w|_ .
My locale is "de_DE" but it does affect other locales as
well.
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 21:45
Message:
Logged In: YES
user_id=1188172
\w has matched the underscore ever since \w was introduced in
RE syntax, and that was not in Python. This alone is sufficient
to justify this behavior.
In any case, Python's behavior cannot change either. Many REs
would become erroneous with such a change.
So closing as Won't fix.
--
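For readers who need what the submitter asks for, the following is a minimal sketch of matching letters and digits while excluding the underscore. It assumes a current Python 3 interpreter (the report predates it), and the helper name alnum_runs is made up for illustration. The negated class [^\W_] reads as "a word character that is not an underscore".

```python
import re

# [^\W_] matches any character that \w matches, except the underscore.
ALNUM_RUN = re.compile(r"[^\W_]+")


def alnum_runs(text):
    """Return the runs of letters and digits in text, skipping underscores."""
    return ALNUM_RUN.findall(text)


# \w on its own still matches the underscore, as the report shows.
underscore_is_word_char = re.findall(r"\w", "_") == ["_"]

print(alnum_runs("foo_bar42"))
print(underscore_is_word_char)
```

This leaves \w itself untouched, so existing patterns keep working.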
[ python-Bugs-1281291 ] Erroneous \url command in python.sty
Bugs item #1281291, was opened at 2005-09-03 17:18
Message generated for change (Settings changed) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1281291&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Rory Yorke (ryorke)
>Assigned to: Fred L. Drake, Jr. (fdrake)
Summary: Erroneous \url command in python.sty
Initial Comment:
The \url and \ulink commands in texinputs/python.sty
produce erroneous PDF files. This can be fairly easily
tested using GhostScript (see the attached log file for an
example).
The current Python 2.4.1 PDF docs (as downloadable from
www.python.org) have this error. Although the error does
not prevent correct rendering in viewers such as gv, xpdf
or acroread, the link is absent.
The attached patch partially addresses this, by changing
the arguments to the \pdfstart command used. The changes
are taken straight from
texmf/tex/latex/hyperref/hpdftex.def; that file has the
following version header:
%% File: hyperref.dtx Copyright 1995-2001 Sebastian Rahtz,
%% RCS: $Id: hyperref.dtx 6.71 2000/10/04 rahtz Exp rahtz $
I don't pretend to understand the TeX code, but it does fix
some of the problem. Some URLs, like
http://sourceforge.net/bugs/?group_id=5470, are not linked
to correctly. That particular URL becomes
http://sourceforge.net/bugs/[EMAIL PROTECTED]@[EMAIL PROTECTED]@[EMAIL PROTECTED]@skip%20id=5470
-- I guess that has something to do with the underscore.
The diff was generated relative to Python CVS head of 3
Sept 2005; the python.sty file had version 1.113. The
python executable used was 2.4.1, not CVS.
--
[ python-Bugs-1281408 ] Py_BuildValue k format units don't work with big values
Bugs item #1281408, was opened at 2005-09-04 00:12
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1281408&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Adal Chiriliuc (adalx)
>Assigned to: Martin v. Löwis (loewis)
Summary: Py_BuildValue k format units don't work with big values
Initial Comment:
Python 2.4 on Windows XP SP2
Consider this code:
unsigned long x = 0xaabbccdd;
PyObject* v = Py_BuildValue("k", x);
unsigned long y = PyLong_AsUnsignedLong(v);
y will be equal to -1 because PyLong_AsUnsignedLong
will raise an OverflowError, since Py_BuildValue doesn't
create a long for the "k" format unit, but an int which
will be interpreted as a negative number.
The K format seems to have the same error,
PyLong_FromLongLong is used instead of
PyLong_FromUnsignedLongLong.
The do_mkvalue function from modsupport.c must be
fixed to use PyLong_FromUnsignedLong instead of
PyInt_FromLong for "k".
Also, the BHLkK format units for Py_BuildValue should
be documented. In my Python 2.4 manual they do not appear.
--
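The sign problem in the report can be illustrated without the C API. The sketch below, in Python 3, reinterprets the same bit pattern as a signed 32-bit value, which is what happens when an unsigned long is passed through a signed conversion on a platform with a 32-bit long (an assumption that matches the Windows XP setup mentioned above). The helper name as_signed_32 is hypothetical.

```python
import struct


def as_signed_32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]


x = 0xAABBCCDD
signed = as_signed_32(x)

# The top bit of x is set, so the signed reading is negative.
print(signed < 0)
# Adding 2**32 recovers the original unsigned value.
print(signed + 2**32 == x)
```

A conversion that goes through the unsigned constructor never takes this detour, which is why the report asks for PyLong_FromUnsignedLong.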
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:02
Message:
Logged In: YES
user_id=1188172
I think you're right. Do you too, Martin?
--
[ python-Bugs-1288615 ] Python code.interact() and UTF-8 locale
Bugs item #1288615, was opened at 2005-09-12 13:40
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1288615&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Parser/Compiler
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: STINNER Victor (haypo)
Assigned to: Nobody/Anonymous (nobody)
Summary: Python code.interact() and UTF-8 locale
Initial Comment:
Hi,
I found a bug in Python interactive command line
(program python alone: looks to be code.interact()
function in code.py). With UTF-8 locale, the command <<
u"é" >> returns << u'\xc3\xa9' >> and not << u'\xE9'
>>. Remember: the french e with acute is Unicode 233
(0xE9), encoded \xC3 \xA9 in UTF-8.
Another example of the bug:
#-*- coding: UTF-8 -*-
code = "u\"%s\"" % "\xc3\xa9"
compiled = compile(code,'',"single")
exec compiled
Result :
u'\xc3\xa9'
Expected result :
u'\xe9'
After long hours of debugging (reading the Python
documentation, debugging Python with gdb, reading the Python C
source code, ...) I found the origin of the bug:
function parsestr() in Python/compile.c. This function
translates a string to a unicode string (or a classic
string). The problem is when the encoding declaration
doesn't exist: the string isn't converted.
Solution to the first code:
#-*- coding: ascii -*-
code = """#-*- coding: UTF-8 -*-
u\"%s\"""" % "\xc3\xa9"
compiled = compile(code,'',"single")
exec compiled
Proposition: u"..." and unicode("...") should use
sys.stdin.encoding by default. They will work as
unicode("...", sys.stdin.encoding). Or easier, the
compiler should use sys.stdin.encoding and not ascii as
default encoding.
Sorry if someone already reported this bug. And, is it
a bug or a feature ? ;-)
Bye, Haypo
--
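The difference between the two results in the report comes down to how the two bytes are decoded. A minimal sketch, assuming a Python 3 interpreter where bytes and text are separate types (the report itself is about Python 2.3):

```python
raw = b"\xc3\xa9"  # the two UTF-8 bytes for e-acute

# Decoding as UTF-8 yields one character, U+00E9.
decoded = raw.decode("utf-8")

# Treating each byte as its own character (Latin-1) yields two characters,
# U+00C3 and U+00A9, which is the u'\xc3\xa9' shown in the report.
misread = raw.decode("latin-1")

print(len(decoded), hex(ord(decoded)))
print(len(misread), [hex(ord(c)) for c in misread])
```

The interpreter cannot know which reading is intended unless an encoding is declared or passed explicitly.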
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:03
Message:
Logged In: YES
user_id=1188172
There's no uploaded file! You have to check the
checkbox labeled "Check to Upload & Attach File"
when you upload a file.
Please try again.
(This is a SourceForge annoyance that we can do
nothing about. :-( )
--
Comment By: STINNER Victor (haypo)
Date: 2005-09-12 14:46
Message:
Logged In: YES
user_id=365388
OK, after a long discussion with RexFi on IRC, I understood
that Python can't *guess* the string encoding ... I agree with
that; the system locale or source encoding is not a good choice.
But ... the Python console has a bug. It uses raw_input(). So I
wrote a patch to just add the right unicode cast. But the Python
console doesn't seem to be code.interact().
I attach the patch to this comment.
Haypo
--
[ python-Bugs-1201461 ] suspected cPickle memory leak
Bugs item #1201461, was opened at 2005-05-13 17:49
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1201461&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.2
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: Alan (franz2)
Assigned to: Nobody/Anonymous (nobody)
Summary: suspected cPickle memory leak
Initial Comment:
I believe there is a memory leak in cPickle. I have a
parallel code which uses array() and indices() from Numeric
to massage data buffers before being sent and received by
Pypar. Pypar subsequently uses cPickle to pickle the data.
After many hours of execution, my code crashes with one of
the following error messages (depending upon the run):
a = zeros(shape, typecode, savespace)
MemoryError: can't allocate memory for array
or:
s = dumps(x, 1)
MemoryError: out of memory
I have since modified my code to use a different data
format so cPickle is no longer used from PyPar and now the
code runs fine.
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:18
Message:
Logged In: YES
user_id=1188172
Closing due to lack of response. cPickle is such a complex
module, without a test case the leak cannot be found.
--
Comment By: Facundo Batista (facundobatista)
Date: 2005-05-30 22:34
Message:
Logged In: YES
user_id=752496
Please, could you verify if this problem persists in Python
2.3.4 or 2.4?
If yes, in which version? Can you provide a test case?
If the problem is solved, from which version?
Note that if you fail to answer in one month, I'll close
this bug as "Won't fix".
Thank you!
.Facundo
--
Comment By: Martin v. Löwis (loewis)
Date: 2005-05-30 10:34
Message:
Logged In: YES
user_id=21627
Can you provide a test case that demonstrates how the memory
is exhausted? Without a test case, it is unlikely that we
will be able to find the suspected leak.
--
[ python-Bugs-1283491 ] nit for builtin sum doc
Bugs item #1283491, was opened at 2005-09-07 02:18
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1283491&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 4
Submitted By: daishi harada (daishiharada)
Assigned to: Nobody/Anonymous (nobody)
Summary: nit for builtin sum doc
Initial Comment:
the docstring signature for sum in bltinmodule.c
should be changed from:
sum(sequence, start=0)
to:
sum(sequence[, start])
to reflect the current implementation in builtin_sum.
(or else the implementation should be changed to
accept kwargs.)
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:24
Message:
Logged In: YES
user_id=1188172
If we change the function signature in the docstring, we
must include the "defaults to 0" somewhere.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-09-08 04:37
Message:
Logged In: YES
user_id=80475
"""
>>> sum([x] for x in xrange(10), start=[])
File "", line 1
SyntaxError: invalid syntax
# The problem above is orthogonal to the issue in this bug,
# but I wonder if at some point we'll be able to write such?
"""
FYI, the answer is no. The requirement for parenthesis
cannot change. To see why, parse this: f(g(t) for t in a, b).
--
Comment By: daishi harada (daishiharada)
Date: 2005-09-07 21:02
Message:
Logged In: YES
user_id=493197
This is relatively minor so I don't mean to push
particularly hard, but I'd like to at least show how the
docstring made me stray:
>>> sum([x] for x in xrange(10))
Traceback (most recent call last):
File "", line 1, in ?
TypeError: unsupported operand type(s) for +: 'int' and 'list'
>>> help(sum)
Help on built-in function sum in module __builtin__:
sum(...)
    sum(sequence, start=0) -> value
    Returns the sum of a sequence of numbers (NOT strings)
    plus the value of parameter 'start'. When the sequence
    is empty, returns start.
>>> sum([x] for x in xrange(10), start=[])
File "", line 1
SyntaxError: invalid syntax
# The problem above is orthogonal to the issue in this bug,
# but I wonder if at some point we'll be able to write such?
>>> sum(([x] for x in xrange(10)), start=[])
Traceback (most recent call last):
File "", line 1, in ?
TypeError: sum() takes no keyword arguments
# examine lib docs, which give the signature:
# sum(sequence[, start])
>>> sum(([x] for x in xrange(10)), [])
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>>
# examine bltinmodule.c to confirm that
# sum doesn't accept kwargs.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-09-07 05:11
Message:
Logged In: YES
user_id=80475
While the proposed change is technically correct, I find the
original to be more informative.
--
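The call that finally works in the session above can be restated as a small sketch for a Python 3 interpreter (an assumption; the thread uses Python 2 and xrange). The start value is passed positionally, and the generator expression gets its own parentheses because it is not the only argument.

```python
# A generator of one-element lists: [0], [1], ..., [9].
nested = ([x] for x in range(10))

# start is given positionally; with a list as start, + concatenates.
flat = sum(nested, [])

# With no start argument the default of 0 is used, so an empty
# sequence sums to 0.
empty_total = sum([])

print(flat)
print(empty_total)
```

Leaving out the start value here reproduces the TypeError from the thread, since 0 + [0] is not defined.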
[ python-Bugs-1274828 ] splitunc not documented
Bugs item #1274828, was opened at 2005-08-27 23:40
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1274828&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Documentation
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Poor Yorick (pooryorick)
>Assigned to: Reinhold Birkenfeld (birkenfeld)
Summary: splitunc not documented
Initial Comment:
a description of splitunc is missing from
http://docs.python.org/lib/module-os.path.html
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:42
Message:
Logged In: YES
user_id=1188172
Interesting, since splitunc() is mentioned in the chapter
heading :)
Added a description in Doc/lib/libposixpath.tex r1.43,
r1.40.2.3.
--
[ python-Bugs-1007046 ] os.startfile() doesn't accept Unicode filenames
Bugs item #1007046, was opened at 2004-08-11 08:47
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1007046&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: Python 2.3
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Matthias Huening (huening)
>Assigned to: Reinhold Birkenfeld (birkenfeld)
Summary: os.startfile() doesn't accept Unicode filenames
Initial Comment:
WinXP, Python 2.3.4
os.startfile() seems to have problems with Unicode
filenames. Example:
>>> import tkFileDialog
>>> import os
>>> f = tkFileDialog.askopenfilename()
>>> type(f)
<type 'unicode'>
>>> os.startfile(f)
Traceback (most recent call last):
File "", line 1, in -toplevel-
os.startfile(f)
UnicodeEncodeError: 'ascii' codec can't encode
characters in position 14-16: ordinal not in range(128)
>>>
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 22:52
Message:
Logged In: YES
user_id=1188172
Checked this in now. posixmodule.c r2.340, r2.329.2.4.
--
Comment By: M.-A. Lemburg (lemburg)
Date: 2005-09-01 10:27
Message:
Logged In: YES
user_id=38388
The patch looks OK, but I can't test it on Windows
(os.startfile() is only available on Windows).
A note on style: you should always try to keep lines shorter
than 80 characters, e.g.:
--- CVS-Python/Modules/posixmodule.c    2005-08-15 10:15:27.0 +0200
+++ Dev-Python/Modules/posixmodule.c    2005-09-01 10:23:06.555633134 +0200
@@ -7248,7 +7248,8 @@
 {
        char *filepath;
        HINSTANCE rc;
-       if (!PyArg_ParseTuple(args, "s:startfile", &filepath))
+       if (!PyArg_ParseTuple(args, "et:startfile",
+                             Py_FileSystemDefaultEncoding, &filepath))
                return NULL;
        Py_BEGIN_ALLOW_THREADS
        rc = ShellExecute((HWND)0, NULL, filepath, NULL, NULL, SW_SHOWNORMAL);
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-08-24 07:18
Message:
Logged In: YES
user_id=80475
I'm unicode illiterate. Passing to MAL for review.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-26 23:24
Message:
Logged In: YES
user_id=1188172
Attaching a patch which should fix that.
--
[ python-Bugs-1291446 ] SSLObject breaks read semantics
Bugs item #1291446, was opened at 2005-09-14 22:28
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1291446&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Jonathan Ellis (ellisj)
Assigned to: Nobody/Anonymous (nobody)
Summary: SSLObject breaks read semantics
Initial Comment:
f = socket.ssl(sock)
f.read(n) doesn't always return n bytes, even if the
connection remains open! in particular, it seems to
reproducibly return less than n bytes if the read would span
the boundary between units of 16KB of data.
We've had to work around this with code like the following:
    pieces = []
    while n > 0:
        got = self.realfile.read(n)
        if not got:
            break
        pieces.append(got)
        n -= len(got)
    return ''.join(pieces)
--
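The workaround in the report can be written as a standalone helper. The sketch below targets Python 3 and uses a made-up ShortReader class to simulate a stream that returns short reads; neither name comes from the report, and the 16KB boundary behaviour itself is not reproduced here.

```python
import io


def read_exactly(fileobj, n):
    """Keep reading until n bytes are collected or the stream reports EOF."""
    pieces = []
    while n > 0:
        got = fileobj.read(n)
        if not got:  # empty result: EOF or closed connection
            break
        pieces.append(got)
        n -= len(got)
    return b"".join(pieces)


class ShortReader:
    """Hypothetical stream that never returns more than limit bytes per read."""

    def __init__(self, data, limit):
        self._buf = io.BytesIO(data)
        self._limit = limit

    def read(self, n):
        return self._buf.read(min(n, self._limit))


print(read_exactly(ShortReader(b"abcdefghij", 3), 8))
```

The loop stops early only on an empty read, so a caller can still detect a truncated stream by comparing the length of the result with the requested size.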
[ python-Bugs-1290505 ] strptime(): can't switch locales more than once
Bugs item #1290505, was opened at 2005-09-13 15:50
Message generated for change (Comment added) made by bcannon
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1290505&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: Adam Monsen (meonkeys)
Assigned to: Brett Cannon (bcannon)
Summary: strptime(): can't switch locales more than once
Initial Comment:
After calling strptime() once, it appears that
subsequent efforts to modify the locale settings (so
dates strings in different locales can be parsed) throw
a ValueError. I'm pasting everything here since spacing
is irrelevant:
import locale, time
print locale.getdefaultlocale()# ('en_US', 'utf')
print locale.getlocale(locale.LC_TIME) # (None, None)
# save old locale
old_loc = locale.getlocale(locale.LC_TIME)
locale.setlocale(locale.LC_TIME, 'nl_NL')
print locale.getlocale(locale.LC_TIME) # ('nl_NL', 'ISO8859-1')
# parse local date
date = '10 augustus 2005 om 17:26'
format = '%d %B %Y om %H:%M'
dateTuple = time.strptime(date, format)
# switch back to previous locale
locale.setlocale(locale.LC_TIME, old_loc)
print locale.getlocale(locale.LC_TIME) # (None, None)
date = '10 August 2005 at 17:26'
format = '%d %B %Y at %H:%M'
dateTuple = time.strptime(date, format)
The output I get from this script is:
('en_US', 'utf')
(None, None)
('nl_NL', 'ISO8859-1')
(None, None)
Traceback (most recent call last):
File "switching.py", line 17, in ?
dateTuple = time.strptime(date, format)
File "/usr/lib/python2.4/_strptime.py", line 292, in strptime
raise ValueError("time data did not match format: data=%s fmt=%s" %
ValueError: time data did not match format: data=10 August 2005 at 17:26 fmt=%d %B %Y at %H:%M
One workaround I found is by manually busting the
regular expression cache in _strptime:
import _strptime
_strptime._cache_lock.acquire()
_strptime._TimeRE_cache = _strptime.TimeRE()
_strptime._regex_cache = {}
_strptime._cache_lock.release()
If I do all that, I can change the LC_TIME part of the
locale as many times as I choose.
If this isn't a bug, this should at least be in the
documentation for the locale module and/or strptime().
--
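The general failure mode behind this report, locale-dependent data surviving a locale change inside a cache, can be shown with a toy model. Everything in the sketch below (ToyParser, its month table, the key_on_locale flag) is hypothetical and written for Python 3; it is not the _strptime implementation and does not touch the real locale settings.

```python
class ToyParser:
    """Toy stand-in for a parser that caches locale-dependent patterns."""

    MONTHS = {"en": ["January", "August"], "nl": ["januari", "augustus"]}

    def __init__(self, key_on_locale):
        self.locale = "en"
        self.key_on_locale = key_on_locale
        self.cache = {}

    def compile(self, fmt):
        # The cache key either includes the active locale or ignores it.
        key = (self.locale, fmt) if self.key_on_locale else fmt
        if key not in self.cache:
            # The cached entry bakes in the month names of the locale
            # that was active when the entry was created.
            self.cache[key] = set(self.MONTHS[self.locale])
        return self.cache[key]

    def accepts(self, fmt, month_name):
        return month_name in self.compile(fmt)


stale = ToyParser(key_on_locale=False)
stale.locale = "nl"
dutch_ok = stale.accepts("%B", "augustus")
stale.locale = "en"
english_after_switch = stale.accepts("%B", "August")  # uses the Dutch entry

fixed = ToyParser(key_on_locale=True)
fixed.locale = "nl"
fixed.accepts("%B", "augustus")
fixed.locale = "en"
english_with_fix = fixed.accepts("%B", "August")

print(dutch_ok, english_after_switch, english_with_fix)
```

Keying the cache on the locale, or discarding it when the locale changes, are the two usual remedies; the manual cache reset in the report is a form of the second.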
>Comment By: Brett Cannon (bcannon)
Date: 2005-09-14 19:42
Message:
Logged In: YES
user_id=357491
OK, the problem was that the cache for the locale
information in terms of dates and time was being invalidated
and recreated, but the regex cache was not being touched. It
has now been fixed in rev. 1.41 for 2.5 and in rev. 1.38.2.3
for 2.4.
Thanks for reporting this, Adam.
--
Comment By: Adam Monsen (meonkeys)
Date: 2005-09-13 15:57
Message:
Logged In: YES
user_id=259388
I think there were some long lines in my code. Attaching
test case.
--
[ python-Feature Requests-1237680 ] add dedent() string method
Feature Requests item #1237680, was opened at 2005-07-13 18:48
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1237680&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Interpreter Core
Group: None
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Reinhold Birkenfeld (birkenfeld)
Assigned to: Nobody/Anonymous (nobody)
Summary: add dedent() string method
Initial Comment:
textwrap.dedent() is very useful for in-code multi-line
string literals. However, as it is "hidden" in a module it
does not really fit in, people don't use it and instead
propose new string literal syntax for "dedented".
str.dedent with an efficient C implementation would solve
this.
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-15 07:45
Message:
Logged In: YES
user_id=1188172
Rejected as per discussion on python-dev.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-07-13 23:53
Message:
Logged In: YES
user_id=80475
-1
* Being a top level function in a module doesn't count as
hidden. This is no more hidden than collections.deque,
glob.glob, or re.sub.
* The API requirements are looser in a textwrap context.
For a string method, there would need to be a universally
useful decision about how to handle mixed spaces and tabs
and whether the first line of a triple-quoted string would
be handled differently. Am not sure if universal newlines
present any additional issues.
* The world-view of the string module is character oriented,
not line oriented. A dedent method() is not a perfect fit.
* While the topic comes up every few years, in general,
there is no user demand for this.
--
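For reference, the existing module-level function covers the use case named in the request. A minimal sketch in Python 3; the usage() function and its text are invented for illustration:

```python
import textwrap


def usage():
    """Return help text written as an indented in-code literal."""
    # The backslash after the opening quotes suppresses the first newline;
    # dedent() then strips the indentation common to all lines.
    return textwrap.dedent("""\
        usage: tool [options]
          -h  show help
        """)


print(usage(), end="")
```

Indentation beyond the common prefix is preserved, so the option line keeps its two leading spaces.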
[ python-Bugs-1202493 ] RE parser too loose with {m,n} construct
Bugs item #1202493, was opened at 2005-05-15 14:59
Message generated for change (Comment added) made by josiahcarlson
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1202493&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: RE parser too loose with {m,n} construct
Initial Comment:
This seems wrong to me:
>>> re.match("(UNIX{})", "UNIX{}").groups()
('UNIX',)
With no numbers or commas, "{}" should not be considered
special in the pattern. The docs identify three numeric
repetition possibilities: {m}, {m,} and {m,n}. There's no
description of {} meaning anything. Either the docs should
say {} implies {1,1}, {} should have no special meaning, or
an exception should be raised during compilation of the
regular expression.
--
Comment By: Josiah Carlson (josiahcarlson)
Date: 2005-09-14 23:07
Message:
Logged In: YES
user_id=341410
Was it a bug, or was it merely confusing semantics?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 03:58
Message:
Logged In: YES
user_id=1188172
Will you backport the fix?
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 01:58
Message:
Logged In: YES
user_id=7887
Fixed in:
Lib/sre_parse.py: 1.64 -> 1.65
Lib/test/test_re.py: 1.55 -> 1.56
Misc/NEWS: 1.1360 -> 1.1361
Notice that perl will also handle constructs like '{,2}' as
literals, while Python will consider them as '{0,2}'. I
think it's too late to change that one though, as this
behavior may be relied upon in code out there.
--
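The two behaviours discussed in this thread can be checked on a current interpreter. The sketch below assumes Python 3, where "{}" without numbers is matched literally and "{,2}" is read as "{0,2}"; the variable names are illustrative only.

```python
import re

# "{}" carries no numbers, so it is matched as the two literal characters.
literal = re.match(r"(UNIX{})", "UNIX{}").groups()

# "{,2}" is read as "{0,2}": zero to two repetitions of "x".
bounded = [bool(re.fullmatch(r"x{,2}", s)) for s in ("", "x", "xx", "xxx")]

print(literal)
print(bounded)
```

The second case is the one noted above as kept for backward compatibility, even though Perl treats it as a literal.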
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 15:16
Message:
Logged In: YES
user_id=1188172
No, you're the expert, so you'll get the honor of fixing it. :P
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-08-31 15:11
Message:
Logged In: YES
user_id=7887
I support Skip's opinion on following whatever perl is currently doing, if
that won't lead to unexpected errors on current running code which was
considered sane (expecting {} to behave like {1,1} is not sane :-).
Your original patch looks under-optimal though (look at the tests around
it). I'll fix it, or if you prefer to do it by yourself, I may apply the
patch/review it/whatever. :-)
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 14:55
Message:
Logged In: YES
user_id=1188172
Any more objections against treating "{}" as literal?
The impact on existing code will be minimal, as I presume no
one will write "{}" in a RE instead of "{1,1}" (well, who
writes "{1,1}" anyway...).
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 12:10
Message:
Logged In: YES
user_id=1188172
Then, I think, we should follow Perl's behaviour and treat
"{}" as a literal, just like every other brace construct
that isn't a repeat specifier.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-06-03 11:46
Message:
Logged In: YES
user_id=80475
Hmm, it looks like they cannot be treated differently
without breaking backwards compatibility.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 11:00
Message:
Logged In: YES
user_id=1188172
Raymond said that braces should always be considered
special. This includes constructs like "{(?P.*)}"
which the string module uses, and which would be a syntax
error then.
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-03 08:13
Message:
Logged In: YES
user_id=44345
Can you elaborate? I fail to see what the string module
has to do with the re module. Can you give an example
of code that would break?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 01:01
Message:
Logged In: YES
user_id=1188172
I just realized that e.g. the string module uses unescaped
braces, so I think we should not become overly strict as it
would break much code...
Perhaps the original patch (sre-brace-diff) is better...
--
Comment By:
[ python-Bugs-1202493 ] RE parser too loose with {m,n} construct
Bugs item #1202493, was opened at 2005-05-15 23:59
Message generated for change (Comment added) made by birkenfeld
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1202493&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
Status: Closed
Resolution: Fixed
Priority: 5
Submitted By: Skip Montanaro (montanaro)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: RE parser too loose with {m,n} construct
Initial Comment:
This seems wrong to me:
>>> re.match("(UNIX{})", "UNIX{}").groups()
('UNIX',)
With no numbers or commas, "{}" should not be considered
special in the pattern. The docs identify three numeric
repetition possibilities: {m}, {m,} and {m,n}. There's no
description of {} meaning anything. Either the docs should
say {} implies {1,1}, {} should have no special meaning, or
an exception should be raised during compilation of the
regular expression.
--
>Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-15 08:12
Message:
Logged In: YES
user_id=1188172
I would say bug.
--
Comment By: Josiah Carlson (josiahcarlson)
Date: 2005-09-15 08:07
Message:
Logged In: YES
user_id=341410
Was it a bug, or was it merely confusing semantics?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-14 12:58
Message:
Logged In: YES
user_id=1188172
Will you backport the fix?
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-14 10:58
Message:
Logged In: YES
user_id=7887
Fixed in:
Lib/sre_parse.py: 1.64 -> 1.65
Lib/test/test_re.py: 1.55 -> 1.56
Misc/NEWS: 1.1360 -> 1.1361
Notice that perl will also handle constructs like '{,2}' as
literals, while Python will consider them as '{0,2}'. I
think it's too late to change that one though, as this
behavior may be relied upon in code out there.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-09-01 00:16
Message:
Logged In: YES
user_id=1188172
No, you're the expert, so you'll get the honor of fixing it. :P
--
Comment By: Gustavo Niemeyer (niemeyer)
Date: 2005-09-01 00:11
Message:
Logged In: YES
user_id=7887
I support Skip's opinion on following whatever perl is currently doing, if
that won't lead to unexpected errors on current running code which was
considered sane (expecting {} to behave like {1,1} is not sane :-).
Your original patch looks under-optimal though (look at the tests around
it). I'll fix it, or if you prefer to do it by yourself, I may apply the
patch/review it/whatever. :-)
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-08-31 23:55
Message:
Logged In: YES
user_id=1188172
Any more objections against treating "{}" as literal?
The impact on existing code will be minimal, as I presume no
one will write "{}" in a RE instead of "{1,1}" (well, who
writes "{1,1}" anyway...).
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 21:10
Message:
Logged In: YES
user_id=1188172
Then, I think, we should follow Perl's behaviour and treat
"{}" as a literal, just like every other brace construct
that isn't a repeat specifier.
--
Comment By: Raymond Hettinger (rhettinger)
Date: 2005-06-03 20:46
Message:
Logged In: YES
user_id=80475
Hmm, it looks like they cannot be treated differently
without breaking backwards compatibility.
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 20:00
Message:
Logged In: YES
user_id=1188172
Raymond said that braces should always be considered
special. This includes constructs like "{(?P.*)}"
which the string module uses, and which would be a syntax
error then.
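A minimal sketch of the kind of pattern in question, assuming the current behaviour in which a brace that does not start a valid repeat specifier is taken literally. The group name "inner" and the sample string are hypothetical and are not the string module's actual pattern.

```python
import re

# Hypothetical pattern: unescaped braces wrapped around a named group.
# "{" is not followed by digits or a comma, so it is parsed as a literal.
pattern = re.compile(r"{(?P<inner>.*)}")
match = pattern.match("{abc}")
inner = match.group("inner")
print(inner)
```

If braces were always treated as special, compiling a pattern of this shape would fail instead of matching.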
--
Comment By: Skip Montanaro (montanaro)
Date: 2005-06-03 17:13
Message:
Logged In: YES
user_id=44345
Can you elaborate? I fail to see what the string module
has to do with the re module. Can you give an example
of code that would break?
--
Comment By: Reinhold Birkenfeld (birkenfeld)
Date: 2005-06-03 10:01
Message:
Logged In: YES
user_id=1188172
I just realized that e.g. the string module uses unescaped
braces, so I think we should
[ python-Bugs-1291662 ] Installation of waste by MacPython installer
Bugs item #1291662, was opened at 2005-09-15 08:44
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1291662&group_id=5470
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Macintosh
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Freek Dijkstra (macfreek)
Assigned to: Jack Jansen (jackjansen)
Summary: Installation of waste by MacPython installer
Initial Comment:
Hi,
I just installed MacPython 2.3 (on my Mac with Tiger 10.4), and found
that the IDE did not launch, with this error in the console.log:
Traceback (most recent call last):
  File "/Applications/MacPython/PythonIDE.app/
Contents/Resources/PythonIDE.py", line 58, in ?
    import PythonIDEMain as _PythonIDEMain
  File "/Applications/MacPython/PythonIDE.app/
Contents/Resources/PythonIDEMain.py", line 7, in ?
    import W
  File "/System/Library/Frameworks/Python.framework/
Versions/2.3/Mac/Tools/IDE/W.py", line 7, in ?
    from Wtext import *
  File "/System/Library/Frameworks/Python.framework/
Versions/2.3/Mac/Tools/IDE/Wtext.py", line 6, in ?
    import waste
ImportError: No module named waste
(included line-breaks for readability).
The problem was solved relatively easily: I noticed that
/System/Library/Frameworks/Python.framework/Versions/2.3/lib/python2.3/site-packages
points to /Library/Python/2.3/site-packages/ but that waste.so was
installed (apparently by the MacPython 2.3 installer) in
/Library/Python/2.3/.
mv /Library/Python/2.3/waste.so /Library/Python/2.3/site-packages/
did solve the problem, and PythonIDE did launch.
Is this a bug in the installer? Perhaps the aforementioned symbolic
link changed from /Library/Python/2.3/ to
/Library/Python/2.3/site-packages/ since the release of 10.4.
Kind regards,
Freek Dijkstra
