[ python-Bugs-1653121 ] Double free/corruption?

2007-02-06 Thread SourceForge.net
Bugs item #1653121, was opened at 2007-02-06 10:54
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653121&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jarek Zgoda (zgoda)
Assigned to: Nobody/Anonymous (nobody)
Summary: Double free/corruption?

Initial Comment:
Today I encountered a problem with the system complaining about double 
free/corruption, but as I don't know C, I can't say whether it's a problem with 
Python or with MySQLdb.
Attached is a stack trace that I saw in the screen session termination window. I 
am unable to reproduce this error; I tried a few times, but it does not happen.

If this is a MySQLdb (or even MySQL) problem, I'll report the bug as 
appropriate, just let me know.

The system is a pretty standard FC4. Below is some system information; let me 
know if I should provide anything more.

$ python
Python 2.4.3 (#1, Jun 13 2006, 16:41:18)
[GCC 4.0.2 20051125 (Red Hat 4.0.2-8)] on linux2

$ uname -a
Linux localhost 2.6.17-1.2139_FC4smp #1 SMP Fri Jun 23 21:12:13 EDT 2006 i686 
i686 i386 GNU/Linux

$ yum list installed glibc
Installed Packages
glibc.i686   2.3.6-3

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653121&group_id=5470
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[ python-Bugs-1653121 ] Double free/corruption?

2007-02-06 Thread SourceForge.net
Bugs item #1653121, was opened at 2007-02-06 04:54
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653121&group_id=5470

Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jarek Zgoda (zgoda)
Assigned to: Nobody/Anonymous (nobody)
Summary: Double free/corruption?

>Comment By: Tim Peters (tim_one)
Date: 2007-02-06 05:31

Message:
Logged In: YES 
user_id=31435
Originator: NO

Since the top 6 or 7 stack entries are all in
/usr/lib/mysql/libmysqlclient_r.so.14, and show it trying to use its own
free() function (my_no_flags_free()), it's almost certainly a problem in
the extension module (as opposed to in Python).

--




[ python-Feature Requests-500698 ] Taint a la Perl?

2007-02-06 Thread SourceForge.net
Feature Requests item #500698, was opened at 2002-01-08 03:48
Message generated for change (Comment added) made by jcrocholl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=500698&group_id=5470

Category: Python Interpreter Core
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Peter Scott (sketerpot)
Assigned to: Nobody/Anonymous (nobody)
Summary: Taint a la Perl?

Initial Comment:
This might just add unnecessary bloat, but since Python is being 
used in CGI scripts, it can be used to narrow a security hole. One way 
of breaking security is for a naive programmer (don't try to deny 
their existence) to run an arbitrary command supplied by the page 
viewer.

Perl has developed an interesting mechanism for 
helping with this: taint. The way it works is, when something comes 
directly from the user, like a key in a form, it is considered to have 
taint unless specifically untainted. Things like os.exec() would 
create a warning message if you passed tainted strings to 
them.

As I said, this might just add unnecessary bloat, but for 
an option that can be left out for most builds of Python I think it 
would be pretty nice.

--

Comment By: Johann C. Rocholl (jcrocholl)
Date: 2007-02-06 11:51

Message:
Logged In: YES 
user_id=656137
Originator: NO

http://svn.rocholl.net/taint/trunk/taint.py

--

Comment By: Johann C. Rocholl (jcrocholl)
Date: 2007-02-05 22:55

Message:
Logged In: YES 
user_id=656137
Originator: NO

I have come up with a class called SafeString which is the opposite of a
tainted string. In my model, all strings are tainted by default, and you
have to call untaint() to create a SafeString. Then I replace all
functions in the os module with wrapper functions that check all
parameters first and raise TaintError if any string is not safe. If I can
figure out how to attach a file here, I will post it. Otherwise you may
find it on comp.lang.python by the name of taint.py.
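
The SafeString model described above could look roughly like the
following. This is a minimal sketch written for illustration, not the
contents of taint.py: the names guard and guarded_exists are hypothetical,
and only one os.path function is wrapped instead of the whole os module.

```python
import functools
import os


class TaintError(Exception):
    """Raised when a string that was never untainted reaches a guarded call."""


class SafeString(str):
    """Marker subclass for strings the programmer has explicitly vetted."""


def untaint(value):
    """Declare a string safe. Real code would validate the value first."""
    return SafeString(value)


def guard(func):
    """Wrap func so that every string argument must be a SafeString."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for arg in list(args) + list(kwargs.values()):
            if isinstance(arg, str) and not isinstance(arg, SafeString):
                raise TaintError("tainted string passed to %s: %r"
                                 % (func.__name__, arg))
        return func(*args, **kwargs)
    return wrapper


# Hypothetical example: guard a single function instead of patching os.
guarded_exists = guard(os.path.exists)
```

Calling guarded_exists(".") raises TaintError, while
guarded_exists(untaint(".")) goes through. Note that ordinary str
operations such as concatenation return a plain str, so anything derived
from a SafeString is tainted again by default.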

--

Comment By: Peter Scott (sketerpot)
Date: 2003-02-14 18:21

Message:
Logged In: YES 
user_id=252564

Thanks for the idea, phr. I wrote a small class called 
TaintString, derived from string, that has a taint attribute. This 
is probably the least difficult part. The difficult part will be in 
modifying functions like os.system() to raise warnings or 
exceptions when tainted strings are passed to them. I'm 
currently thinking of making wrapper modules with names like 
taint.os, or taint.cgi, but the problem with this is that you 
have to manually use taint.* for certain functions. If anybody 
can think of something that can simplify this, please post it.

--

Comment By: paul rubin (phr)
Date: 2003-02-14 05:47

Message:
Logged In: YES 
user_id=72053

With new-style classes, maybe this can be done by
subclassing string somehow.  There would be a subclass for
tainted strings and trying to do most things with them would
raise an exception.  With taint checking enabled, functions
like os.getenv and cgi.FieldStorage would make objects
containing tainted strings.  You'd untaint them by passing
them to re.search or re.match and pulling out the match
variables, as in Perl.
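
A rough sketch of the subclass-plus-regex idea, under stated assumptions:
TaintedStr, untaint and is_tainted are hypothetical names, and this
version only marks strings as tainted; it does not make other operations
on them raise, which the proposal above would also require.

```python
import re


class TaintedStr(str):
    """A string that came from outside the program and is not yet validated."""


def untaint(value, pattern):
    """Return a plain str if the whole value matches pattern, else raise."""
    match = re.fullmatch(pattern, value)
    if match is None:
        raise ValueError("value does not match the allowed pattern")
    # str() on a str subclass yields an exact str, dropping the taint marker.
    return str(match.group(0))


def is_tainted(value):
    """True if value still carries the taint marker."""
    return isinstance(value, TaintedStr)
```

For example, untaint(TaintedStr("report_2007"), r"[A-Za-z0-9_]+") returns
an ordinary str, while a value containing shell metacharacters fails the
pattern and raises ValueError.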

--

Comment By: Skip Montanaro (montanaro)
Date: 2003-01-03 02:25

Message:
Logged In: YES 
user_id=44345

Took a while for a response to this feature request. ;-)

Perl's heavy integration of regular expressions with its
taint facility probably wouldn't work all that well in
Python.  For one, Python has more ways of searching
strings than with regular expressions.  Second, regular
expressions are not nearly as tightly wound into Python
as they are in Perl.  I think you'd have to add a taint
attribute to strings and just rely on the programmer to
properly clear that attribute.

I think a first cut at an implementation would go much
further toward getting the concept seriously considered
for addition to Python.


--

Comment By: Neal McBurnett (nealmcb)
Date: 2003-01-02 22:20

Message:
Logged In: YES 
user_id=105956

I really like taint mode.
I think this would make Python a better choice for CGI scripts.

See http://www.perldoc.com/perl5.8.0/pod/perlsec.html
and http://gunther.web66.com/FAQS/taintmode.html
for more background.


--


[ python-Bugs-1653121 ] Double free/corruption?

2007-02-06 Thread SourceForge.net
Bugs item #1653121, was opened at 2007-02-06 10:54
Message generated for change (Comment added) made by zgoda
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653121&group_id=5470

Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jarek Zgoda (zgoda)
Assigned to: Nobody/Anonymous (nobody)
Summary: Double free/corruption?

>Comment By: Jarek Zgoda (zgoda)
Date: 2007-02-06 12:24

Message:
Logged In: YES 
user_id=9
Originator: YES

Thank you, will try my luck with MySQLdb.

--




[ python-Bugs-1653121 ] Double free/corruption?

2007-02-06 Thread SourceForge.net
Bugs item #1653121, was opened at 2007-02-06 09:54
Message generated for change (Settings changed) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653121&group_id=5470

Category: None
Group: Python 2.4
>Status: Pending
Resolution: None
Priority: 5
Private: No
Submitted By: Jarek Zgoda (zgoda)
Assigned to: Nobody/Anonymous (nobody)
Summary: Double free/corruption?




[ python-Bugs-1124861 ] subprocess fails on GetStdHandle in interactive GUI

2007-02-06 Thread SourceForge.net
Bugs item #1124861, was opened at 2005-02-17 17:23
Message generated for change (Comment added) made by astrand
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1124861&group_id=5470

Category: Windows
Group: Python 2.4
>Status: Closed
>Resolution: Fixed
Priority: 7
Private: No
Submitted By: davids (davidschein)
Assigned to: Nobody/Anonymous (nobody)
Summary: subprocess fails on GetStdHandle in interactive GUI

Initial Comment:
Using the subprocess module from within IDLE or PyWindows,
it appears that calling GetStdHandle (STD__HANDLE)
returns None, which causes an error.  (All appears fine
on Linux, the standard Python command line, and ipython.)

For example:
>>> import subprocess
>>> p = subprocess.Popen("dir", stdout=subprocess.PIPE)

Traceback (most recent call last):
  File "", line 1, in -toplevel-
    p = subprocess.Popen("dir", stdout=subprocess.PIPE)
  File "C:\Python24\lib\subprocess.py", line 545, in __init__
    (p2cread, p2cwrite,
  File "C:\Python24\lib\subprocess.py", line 605, in _get_handles
    p2cread = self._make_inheritable(p2cread)
  File "C:\Python24\lib\subprocess.py", line 646, in _make_inheritable
    DUPLICATE_SAME_ACCESS)
TypeError: an integer is required

The error originates in the mswindows implementation of
_get_handles.  It only occurs when at least one of stdin,
stdout, or stderr is set, because the first line in the
method is:

if stdin == None and stdout == None and stderr == None:
    return (None, None, None, None, None, None)

I added "if not handle: return GetCurrentProcess()" to
_make_inheritable() as below and it worked.  Of course,
I really do not know what is going on, so I am letting
go now...

def _make_inheritable(self, handle):
    """Return a duplicate of handle, which is inheritable"""
    if not handle: return GetCurrentProcess()
    return DuplicateHandle(GetCurrentProcess(), handle,
                           GetCurrentProcess(),
                           0, 1,
                           DUPLICATE_SAME_ACCESS)


--

>Comment By: Peter Åstrand (astrand)
Date: 2007-02-06 16:43

Message:
Logged In: YES 
user_id=344921
Originator: NO

I've applied 1124861.3.patch to both trunk (rev 53646) and the
release25-maint branch (rev 53647). 

--

Comment By: Peter Åstrand (astrand)
Date: 2007-01-30 21:05

Message:
Logged In: YES 
user_id=344921
Originator: NO

Please review 1124861.3.patch. 

--

Comment By: Peter Åstrand (astrand)
Date: 2007-01-30 21:04

Message:
Logged In: YES 
user_id=344921
Originator: NO

File Added: 1124861.3.patch

--

Comment By: Peter Åstrand (astrand)
Date: 2007-01-29 22:42

Message:
Logged In: YES 
user_id=344921
Originator: NO

Some ideas of possible solutions for this bug:

1) As Roger Upole suggests, throw a readable error when GetStdHandle
fails. This would not really change much, besides making subprocess a
little less confusing. 

2) Automatically create PIPEs for those handles that fail. The PIPE could
either be left open or closed. A WriteFile in the child would get
ERROR_BROKEN_PIPE, if the parent has closed it. Not as good as
ERROR_INVALID_HANDLE, but pretty close. (Or should I say pretty closed?
:-)

3) Try to attach the handles to a NUL device, as 1238747 suggests. 

4) Hope for the best and actually pass invalid handles in
startupinfo.hStdInput, startupinfo.hStdOutput, or
startupinfo.hStdError. It would be nice if this was possible: If
GetStdHandle fails in the current process, it makes sense that
GetStdHandle will fail in the child as well. But, as far as I understand,
it's not possible or safe to pass invalid handles in the startupinfo
structure. 

Currently, I'm leaning towards solution 2), closing the parent's PIPE
ends. 
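
Solution 2) can be illustrated with ordinary POSIX pipes. This is only a
sketch of the idea, not 1124861.3.patch: it uses os.pipe() as a stand-in
for the Windows pipe calls, the helper names are hypothetical, and it
assumes Python's default of ignoring SIGPIPE so that the failed write
surfaces as an exception instead of killing the process.

```python
import os


def pipe_for_missing_stdout():
    """Create a pipe to stand in for a missing stdout handle.

    Returns the write end (what the child would inherit). The parent's
    read end is closed immediately, as in solution 2).
    """
    read_end, write_end = os.pipe()
    os.close(read_end)
    return write_end


def try_write(fd, data):
    """Write to fd and report "ok" or the name of the exception raised."""
    try:
        os.write(fd, data)
        return "ok"
    except OSError as exc:
        return type(exc).__name__
    finally:
        os.close(fd)
```

With the read end closed, try_write(pipe_for_missing_stdout(), b"x") does
not return "ok"; the writer gets a broken-pipe error, which is the POSIX
counterpart of the ERROR_BROKEN_PIPE result mentioned above.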

--

Comment By: Peter Åstrand (astrand)
Date: 2007-01-22 20:36

Message:
Logged In: YES 
user_id=344921
Originator: NO

The following bugs have been marked as duplicate of this bug:

1358527
1603907
1126208
1238747



--

Comment By: craig (codecraig)
Date: 2006-10-13 17:54

Message:
Logged In: YES 
user_id=1258995

On Windows, this seems to work:

from subprocess import *
p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE)

In some cases (depending on what command you are
executing), a command prompt window may appear.  To avoid
showing a window, use this:

import win32con
p = Popen(cmd, stdin=PIPE, stdout=PIPE, stderr=PIPE,
          creationflags=win32con.CREATE_NO_WINDOW)

Search for "Microsoft Process Creation Flags" for more info.

---

[ python-Bugs-1653416 ] print >> f, "Hello" produces no error: normal?

2007-02-06 Thread SourceForge.net
Bugs item #1653416, was opened at 2007-02-06 17:23
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653416&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: E.-O. Le Bigot (eolebigot)
Assigned to: Nobody/Anonymous (nobody)
Summary: print >> f, "Hello" produces no error: normal?

Initial Comment:
When using
  print >> f, "Hello"
on a file f opened for reading, no exception is raised.  Is this normal?

This situation has to be contrasted with
  f.write("Hello")
which raises an exception.

Details with Python 2.5 (r25:51908, Sep 24 2006) on OS X 10.4.8 / darwin 8.8.0:

In [1]: f=open("start.01")
In [2]: f.write("Hello")
: [Errno 9] Bad file descriptor
In [3]: print >> f, "Hello"
In [4]: f.close()

NB: file f is not modified, despite the "print" statement yielding no error, 
above.
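
For comparison, the same check can be written for current Python 3, where
the print statement has become the print() function. This sketch does not
reproduce the Python 2.5 behaviour reported above; it only shows one way
to test both write paths on a file opened for reading. The helper name
and the temporary scratch file are assumptions made for illustration.

```python
import os
import tempfile


def write_attempts(path):
    """Try f.write() and print(file=f) on a file opened for reading.

    Returns a dict mapping each method to the name of the exception
    class it raised, or None if no exception was raised.
    """
    results = {}
    with open(path) as f:  # default mode "r": read-only
        try:
            f.write("Hello")
            results["write"] = None
        except (OSError, ValueError) as exc:
            results["write"] = type(exc).__name__
        try:
            print("Hello", file=f)
            results["print"] = None
        except (OSError, ValueError) as exc:
            results["print"] = type(exc).__name__
    return results


# Create an empty scratch file to stand in for "start.01".
fd, scratch = tempfile.mkstemp()
os.close(fd)
try:
    outcome = write_attempts(scratch)
finally:
    os.remove(scratch)
```

If both entries of outcome name an exception, the two write paths agree;
a None under "print" would correspond to the silent failure reported here.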

--




[ python-Bugs-1653457 ] Python misbehaves when installed in / (patch attached)

2007-02-06 Thread SourceForge.net
Bugs item #1653457, was opened at 2007-02-06 17:08
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653457&group_id=5470

Category: Build
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Chris Webb (chris_arachsys)
Assigned to: Nobody/Anonymous (nobody)
Summary: Python misbehaves when installed in / (patch attached)

Initial Comment:
reduce() in getpath.c chops down a path to the empty string rather than to /. 
As a result, if you build python with --prefix='' in the usual way for software 
to be installed into /, it tries to find its libraries in the current directory 
instead of in /lib:

$ python
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
'import site' failed; use -v for traceback
Python 2.5 (r25:51908, Feb  6 2007, 16:15:42) 
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 

This is fixed by the attached patch.

$ python
Python 2.5 (r25:51908, Feb  6 2007, 16:19:38) 
[GCC 3.4.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> 
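
The behaviour described in this report can be modelled in a few lines of
Python. This is a sketch of the logic as the report describes it, not the
C code in getpath.c and not the attached patch; reduce_path and
reduce_path_fixed are hypothetical names, and the fix shown is only one
plausible approach.

```python
SEP = "/"


def reduce_path(path):
    """Chop the last component off path, as the report describes reduce().

    Everything from the last separator onward is dropped, so "/lib"
    becomes the empty string.
    """
    i = path.rfind(SEP)
    return path[:max(i, 0)]


def reduce_path_fixed(path):
    """Like reduce_path, but keep "/" when the parent is the root."""
    i = path.rfind(SEP)
    if i == 0:
        return SEP
    return path[:max(i, 0)]
```

With these definitions, reduce_path("/lib") gives "" (which a later
lookup would treat as the current directory), while
reduce_path_fixed("/lib") gives "/". Deeper paths such as "/usr/lib"
reduce to "/usr" in both versions.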



--




[ python-Bugs-1653416 ] print >> f, "Hello" produces no error: normal?

2007-02-06 Thread SourceForge.net
Bugs item #1653416, was opened at 2007-02-06 16:23
Message generated for change (Comment added) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653416&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: E.-O. Le Bigot (eolebigot)
Assigned to: Nobody/Anonymous (nobody)
Summary: print >> f, "Hello" produces no error: normal?

>Comment By: Georg Brandl (gbrandl)
Date: 2007-02-06 17:31

Message:
Logged In: YES 
user_id=849994
Originator: NO

If this happens, it's a bug. Though it doesn't seem to occur under Linux
here.

--




[ python-Bugs-1653416 ] print >> f, "Hello" produces no error: normal?

2007-02-06 Thread SourceForge.net
Bugs item #1653416, was opened at 2007-02-06 17:23
Message generated for change (Comment added) made by eolebigot
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653416&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: E.-O. Le Bigot (eolebigot)
Assigned to: Nobody/Anonymous (nobody)
Summary: print >> f, "Hello" produces no error: normal?

>Comment By: E.-O. Le Bigot (eolebigot)
Date: 2007-02-06 18:45

Message:
Logged In: YES 
user_id=1440667
Originator: YES

Interesting point about Linux. The incorrect behavior is even seen in
the default Python 2.3 that ships with Mac OS X!


--




[ python-Bugs-1653416 ] print >> f, "Hello" produces no error: normal?

2007-02-06 Thread SourceForge.net
Bugs item #1653416, was opened at 2007-02-06 10:23
Message generated for change (Comment added) made by montanaro
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653416&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: E.-O. Le Bigot (eolebigot)
Assigned to: Nobody/Anonymous (nobody)
Summary: print >> f, "Hello" produces no error: normal?

>Comment By: Skip Montanaro (montanaro)
Date: 2007-02-06 12:49

Message:
Logged In: YES 
user_id=44345
Originator: NO

I verified this behavior on my Mac with /usr/bin/python, Python 2.5 and
Python 2.6a0, both built from SVN.

Skip


--




[ python-Feature Requests-1649329 ] gettext.py incompatible with eggs

2007-02-06 Thread SourceForge.net
Feature Requests item #1649329, was opened at 2007-02-01 03:20
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1649329&group_id=5470

>Category: Python Library
>Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Shannon -jj Behrens (jjinux)
Assigned to: Nobody/Anonymous (nobody)
Summary: gettext.py incompatible with eggs

Initial Comment:
If you distribute your code in a zipped egg, you can't use translation catalogs 
stored in that egg.  That's because gettext.py only knows how to find the 
translation catalogs in the actual filesystem.

This wouldn't be so bad if the "find" function didn't serve double duty.  On 
the one hand, it implements the "find" algorithm to pick the right languages 
and fallbacks.  On the other hand, it actually resolves the files in the 
filesystem.  It would be nice if these were separate.  It seems like people in 
projects like Pylons are stuck copying code from the find function in order to 
work around this problem.

Perhaps gettext can be updated to know about eggs.  Perhaps setting localedir 
to something like "egg://..." would enable this functionality.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-06 20:43

Message:
Logged In: YES 
user_id=21627
Originator: NO

I fail to see the bug. gettext.find behaves as specified; if you want
something else, don't use that function.

If you want to load a .mo file from a zip file, you should be able to
create a GNUTranslations object directly, from a file-like object.
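
A minimal sketch of that approach in current Python 3, assuming the
catalog is stored as an ordinary member of a zip archive. The member path
and the helper names are made up for the example, and build_tiny_mo exists
only so that the sketch has a catalog to load; a real project would ship a
.mo file produced by msgfmt.

```python
import gettext
import io
import struct
import zipfile


def build_tiny_mo(msgid, msgstr):
    """Build a minimal little-endian GNU .mo file with one ASCII entry."""
    mid = msgid.encode("ascii")
    mstr = msgstr.encode("ascii")
    header_size = 28              # seven 32-bit header fields
    orig_table = header_size      # one (length, offset) pair
    trans_table = orig_table + 8  # one (length, offset) pair
    mid_off = trans_table + 8
    mstr_off = mid_off + len(mid) + 1
    return b"".join([
        struct.pack("<7I", 0x950412DE, 0, 1, orig_table, trans_table, 0, 0),
        struct.pack("<2I", len(mid), mid_off),
        struct.pack("<2I", len(mstr), mstr_off),
        mid + b"\0",
        mstr + b"\0",
    ])


def load_catalog_from_zip(archive, member):
    """Open a .mo member of a zip archive and parse it with GNUTranslations."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fp:
            return gettext.GNUTranslations(fp)


# Hypothetical layout: the catalog path inside the archive is made up.
member = "locale/fr/LC_MESSAGES/app.mo"
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr(member, build_tiny_mo("hello", "bonjour"))
archive.seek(0)

catalog = load_catalog_from_zip(archive, member)
```

The resulting catalog object translates "hello" through catalog.gettext()
and returns unknown messages unchanged. Picking the right language and
fallbacks, which gettext.find normally does on the filesystem, is left to
the caller in this sketch.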

I don't think that gettext should support eggs directly. If you want it to
do things other than loading from file systems, you should generalize that
appropriately. One appropriate generalization could be the introduction of
a directory-like object, where you can do .exists(relpath), and
.open(relpath). However, introduction of directory-like objects is PEP
material.

Reclassifying this as a feature request.

--




[ python-Bugs-1648268 ] Parameter list mismatches (portation problem)

2007-02-06 Thread SourceForge.net
Bugs item #1648268, was opened at 2007-01-30 23:15
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1648268&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: ked-tao (ked-tao)
Assigned to: Nobody/Anonymous (nobody)
Summary: Parameter list mismatches (portation problem)

Initial Comment:
On the system I'm porting to(*), an application will trap if the caller does 
not pass the exact parameter list that the callee requires. This is causing 
problems running Python.

One common instance where this appears to be causing problems is where 
functions are registered as METH_NOARGS methods. For example, in 
Objects/dictobject.c, dict_popitem() is declared:

static PyObject *dict_popitem(dictobject *mp);

However, as it is declared in the method array as METH_NOARGS, it will be 
called by Objects/methodobject.c:PyCFunction_Call() as "(*meth)(self, NULL)" 
(i.e., an extra NULL parameter is passed for some reason). This will fail on my 
target system.

I've no problem submitting a patch for this (dictobject.c is by no means the 
only place this is happening - it's just the first one encountered because it's 
used so much - though some places _do_ correctly declare a second, ignored 
parameter). However, I'd like to get agreement on the correct form it should be 
changed to before I put the effort in to produce a patch (it's going to be a 
fairly tedious process to identify and fix all these).

In various modules, the functions are called internally as well as being 
registered as METH_NOARGS methods.

Therefore, the change can either be:

static PyObject *foo(PyObject *self)
{
  ...
}

static PyObject *foo_noargs(PyObject *self, void *noargs_null)
{
   return foo(self);
}

... where 'foo' is called internally and 'foo_noargs' is registered as a 
METH_NOARGS method.

or:

static PyObject *foo(PyObject *self, void *noargs_null)
{
  ...
}

... and any internal calls in the module have to pass a second, NULL, argument 
in each call.

The former favours internal module calls over METH_NOARGS calls, the latter 
penalises them. Which is preferred? Should this be raised on a different forum? 
Does anyone care? ;)

Thanks, Kev.

(*) Details on request.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-06 20:49

Message:
Logged In: YES 
user_id=21627
Originator: NO

The current specification says that these should be PyCFunction pointers,
see

http://docs.python.org/api/common-structs.html

My initial implementation of METH_NOARGS had it differently, and nobody
ever bothered fixing them all when this was changed. Please do submit a
patch to correct all such errors, both in code and documentation.

--




[ python-Bugs-1646068 ] Dict lookups fail if sizeof(Py_ssize_t) < sizeof(long)

2007-02-06 Thread SourceForge.net
Bugs item #1646068, was opened at 2007-01-27 19:23
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1646068&group_id=5470

Category: Python Interpreter Core
Group: Python 2.5
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: ked-tao (ked-tao)
Assigned to: Tim Peters (tim_one)
Summary: Dict lookups fail if sizeof(Py_ssize_t) < sizeof(long)

Initial Comment:
Portation problem.

Include/dictobject.h defines PyDictEntry.me_hash as a Py_ssize_t. Everywhere 
else uses a C 'long' for hashes.

On the system I'm porting to, ints and pointers (and ssize_t) are 32-bit, but 
longs and long longs are 64-bit. Therefore, the assignments to me_hash truncate 
the hash and subsequent lookups fail.
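
The truncation can be illustrated with a small Python simulation. This is only a sketch, assuming a 32-bit signed Py_ssize_t and a 64-bit long as described above; the helper names truncate_to_ssize_t and cached_hash_matches are hypothetical and merely imitate the C assignment and the first comparison a dict lookup performs.

```python
def truncate_to_ssize_t(value, bits=32):
    """Simulate storing a wider C integer in a signed field of `bits` bits."""
    mask = (1 << bits) - 1
    value &= mask
    # Reinterpret the low bits as a signed two's-complement number.
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def cached_hash_matches(full_hash, bits=32):
    """Mimic the dict's first check: cached me_hash versus the full hash."""
    me_hash = truncate_to_ssize_t(full_hash, bits)
    return me_hash == full_hash


small_hash = 12345      # fits in 32 bits, so the cached copy is intact
large_hash = 1 << 40    # has a bit set above bit 31

print(cached_hash_matches(small_hash))  # True
print(cached_hash_matches(large_hash))  # False: the lookup would never match
```

Any hash with bits set above the width of the cached field compares unequal to its own cached copy, which is why the entry is never found.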

I've changed the definition of me_hash to 'long' and (in Objects/dictobject.c) 
removed the casting from the various assignments and changed the definition of 
'i' in dict_popitem(). This has fixed my immediate problems, but I guess I've 
just reintroduced whatever problem it got changed for. The comment in the 
header says:

/* Cached hash code of me_key.  Note that hash codes are C longs.
 * We have to use Py_ssize_t instead because dict_popitem() abuses
 * me_hash to hold a search finger.
 */

... but that doesn't really explain what it is about dict_popitem() that 
requires the different type.

Thanks. Kev.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-06 21:03

Message:
Logged In: YES 
user_id=21627
Originator: NO

ked-tao: as for "doesn't really explain", please take a look at this
comment:

/* Set ep to "the first" dict entry with a value.  We abuse the hash
 * field of slot 0 to hold a search finger:
 * If slot 0 has a value, use slot 0.
 * Else slot 0 is being used to hold a search finger,
 * and we use its hash value as the first index to look.
 */

So .popitem first returns (and removes) the item in slot 0. Afterwards, it
does a linear scan through the dictionary, returning one item at a time. To
avoid re-scanning the emptying dictionary over and over again, the me_hash
value of slot 0 indicates the place to start searching when the next
.popitem call is made. Of course, this value may start out bogus and
out-of-range, or may become out-of-range if the dictionary shrinks; in that
case, it starts over at index 1. If it is bogus (i.e. never set as a search
finger) and in-range, that's fine: it will just start searching for a
non-empty slot at me_hash.

Because it is a slot number, me_hash must be large enough to hold a
Py_ssize_t. On some systems (Win64 in particular), long is not large
enough to hold Py_ssize_t.
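
The search-finger scheme described above can be sketched in Python. This is a toy model, not CPython's dict: the class FingerTable and its attribute slot0_hash (standing in for me_hash of slot 0) are hypothetical names, and the table never resizes.

```python
class FingerTable:
    """Toy fixed-size slot table; not CPython's actual dict implementation."""

    def __init__(self, slots):
        self.slots = list(slots)   # each slot is None or a (key, value) pair
        self.slot0_hash = 0        # stands in for me_hash of slot 0
        self.used = sum(s is not None for s in self.slots)

    def popitem(self):
        if self.used == 0:
            raise KeyError("popitem(): table is empty")
        i = 0
        if self.slots[0] is None:
            # Slot 0 is empty, so its hash field is treated as the finger.
            i = self.slot0_hash
            if i >= len(self.slots) or i < 1:
                i = 1              # bogus or out of range: restart at index 1
            while self.slots[i] is None:
                i += 1
                if i >= len(self.slots):
                    i = 1          # wrap around, still skipping slot 0
        item = self.slots[i]
        self.slots[i] = None
        self.used -= 1
        self.slot0_hash = i + 1    # next place to start searching
        return item


table = FingerTable([("a", 1), None, ("b", 2), None, ("c", 3)])
print(table.popitem())  # ('a', 1)
print(table.popitem())  # ('b', 2)
print(table.popitem())  # ('c', 3)
```

Because the finger is an index into the table, the field holding it has to be able to represent any valid slot number, which is the constraint the header comment refers to.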

I believe the proposed patch is fine.

--

Comment By: Jim Jewett (jimjjewett)
Date: 2007-02-04 17:35

Message:
Logged In: YES 
user_id=764593
Originator: NO

Yes, I'm curious about what system this is ... is it a characteristic of
the whole system, or a compiler choice to get longer ints?

As to using a Py_hash_t -- it probably wouldn't be as bad as you think. 
You might get away with just masking it to throw away the high order bits
in dict and set.  (That might not work with perturbation.)  

Even if you have to change it everywhere at the source, then there is some
prior art (from when hash was allowed to be a python long), and it is
almost certainly limited to methods with "hash" in the name which generate
a hash.  (eq/ne on the same objects may use the hash.)  Consumers of hash
really are limited to dict and derivatives.  I think dict, set, and
defaultdict may be the full list for the default distribution.


--

Comment By: ked-tao (ked-tao)
Date: 2007-02-04 15:11

Message:
Logged In: YES 
user_id=1703158
Originator: YES

Hi Jim. I understand what the problem is (perhaps I didn't state it
clearly enough) - me_hash is a cache of the dict item's hash which is
compared against the hash of the object being looked up before going any
further with expensive richer comparisons. On my system, me_hash is a
32-bit quantity but hashes in general are declared 'long' which is a 64-bit
quantity. Therefore for any object whose hash has any of the top 32 bits
set, a dict lookup will fail as it will never get past that first check
(regardless of why that slot is being checked - it has nothing to do with
the perturbation to find the next slot).

The deal is that my system is basically a 32-bit system (sizeof(int) ==
sizeof(void *) == 4, and therefore ssize_t is not unreasonably also
32-bit), but C longs are 64-bit.

You say "popitem assumes it can store a pointer there", but AFAICS it'

[ python-Feature Requests-1649329 ] gettext.py incompatible with eggs

2007-02-06 Thread SourceForge.net
Feature Requests item #1649329, was opened at 2007-01-31 18:20
Message generated for change (Comment added) made by jjinux
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1649329&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Shannon -jj Behrens (jjinux)
Assigned to: Nobody/Anonymous (nobody)
Summary: gettext.py incompatible with eggs

Initial Comment:
If you distribute your code in a zipped egg, you can't use translation catalogs 
stored in that egg.  That's because gettext.py only knows how to find the 
translation catalogs in the actual filesystem.

This wouldn't be so bad if the "find" function didn't serve double duty.  On 
the one hand, it implements the "find" algorithm to pick the right languages 
and fallbacks.  On the other hand, it actually resolves the files in the 
filesystem.  It would be nice if these were separate.  It seems like people in 
projects like Pylons are stuck copying code from the find function in order to 
work around this problem.

Perhaps gettext can be updated to know about eggs.  Perhaps setting localedir 
to something like "egg://..." would enable this functionality.

--

>Comment By: Shannon -jj Behrens (jjinux)
Date: 2007-02-06 14:28

Message:
Logged In: YES 
user_id=30164
Originator: YES

Sorry, yes, this is a feature request.  The most important thing that I'm
requesting is to refactor the code.  That "find" function implements a
certain algorithm that has nothing to do with the filesystem.

--

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-06 11:43

Message:
Logged In: YES 
user_id=21627
Originator: NO

I fail to see the bug. gettext.find behaves as specified; if you want
something else, don't use that function.

If you want to load a .mo file from a zip file, you should be able to
create a GNUTranslations object directly, from a file-like object.

I don't think that gettext should support eggs directly. If you want it to
do things other than loading from file systems, you should generalize that
appropriately. One appropriate generalization could be the introduction of
a directory-like object, where you can do .exists(relpath), and
.open(relpath). However, introduction of directory-like objects is PEP
material.
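
As a rough illustration of the directory-like object idea, the following sketch offers the two operations named above, .exists(relpath) and .open(relpath), over both a real directory and a zip archive. The class names FilesystemDirectory and ZipDirectory are hypothetical; nothing like them exists in the standard library, and relpath is assumed to use forward slashes.

```python
import io
import os
import zipfile


class FilesystemDirectory:
    """Directory-like wrapper around a real directory (hypothetical interface)."""

    def __init__(self, root):
        self.root = root

    def exists(self, relpath):
        return os.path.isfile(os.path.join(self.root, relpath))

    def open(self, relpath):
        return open(os.path.join(self.root, relpath), "rb")


class ZipDirectory:
    """Same interface, backed by a zip archive such as a zipped egg."""

    def __init__(self, archive_path):
        self.archive = zipfile.ZipFile(archive_path)

    def exists(self, relpath):
        return relpath in self.archive.namelist()

    def open(self, relpath):
        # Return an in-memory file-like object holding the member's bytes.
        return io.BytesIO(self.archive.read(relpath))
```

The file-like object returned by either open() method is the kind of object that could be handed to gettext.GNUTranslations, which accepts an open binary file as its argument.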

Reclassifying this as a feature request.

--




[ python-Bugs-1653736 ] Problems in datetime.c and typeobject.c.

2007-02-06 Thread SourceForge.net
Bugs item #1653736, was opened at 2007-02-07 01:15
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653736&group_id=5470

Category: None
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: ked-tao (ked-tao)
Assigned to: Nobody/Anonymous (nobody)
Summary: Problems in datetime.c and typeobject.c.

Initial Comment:
This is related to [python-Bugs-1648268], but I think these problems might be 
important enough to consider fixing prior to any patch being produced for that 
item.

Modules/datetimemodule.c:time_isoformat() is declared in time_methods[] as 
METH_KEYWORDS. However, I believe it is better declared as METH_NOARGS (calling 
it with args and kwargs doesn't raise any exception, but it doesn't accept 
them).

Objects/typeobject.c:4428 - slot_nb_inplace_power is declared with the SLOT1() 
macro. I'm not sure I entirely grok what's going on here (not enough to supply 
a python-level failure case), but it seems to me that it should be declared 
with the SLOT2() macro (it's a ternary op). FWIW, I changed it from:

SLOT1(slot_nb_inplace_power, "__ipow__", PyObject *, "O")

to:

SLOT2(slot_nb_inplace_power, "__ipow__", PyObject *, PyObject *, "OO")

... and that ran the failing tests OK.

Hopefully someone familiar with this code can determine if this is correct or 
not.

Thanks, Kev.




--




[ python-Bugs-1653753 ] crash / abort during install

2007-02-06 Thread SourceForge.net
Bugs item #1653753, was opened at 2007-02-06 17:56
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653753&group_id=5470

Category: Installation
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: SAndreason (sandreas41)
Assigned to: Nobody/Anonymous (nobody)
Summary: crash / abort during install

Initial Comment:
linux-2.4.33
installing Python-2.5.tar.bz2  9357099 bytes

After a successful make,
su # make install
...[clip]...
PYTHONPATH=/usr/local/lib/python2.5   \
./python -Wi -tt /usr/local/lib/python2.5/compileall.py \
-d /usr/local/lib/python2.5 -f \
-x 'bad_coding|badsyntax|site-packages' /usr/local/lib/python2.5
Listing /usr/local/lib/python2.5 ...
Compiling /usr/local/lib/python2.5/BaseHTTPServer.py ...
...[clip]...
Compiling /usr/local/lib/python2.5/xmlrpclib.py ...
Compiling /usr/local/lib/python2.5/zipfile.py ...
make: *** [libinstall] Error 1


Can installation be finished manually? or is there a patch or workaround for 
this?

Stewart

--




[ python-Bugs-1653757 ] configure does not check/warn/stop for tk/tcl

2007-02-06 Thread SourceForge.net
Bugs item #1653757, was opened at 2007-02-06 18:15
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1653757&group_id=5470

Category: Build
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: SAndreason (sandreas41)
Assigned to: Nobody/Anonymous (nobody)
Summary: configure does not check/warn/stop for tk/tcl

Initial Comment:
linux-2.4.33
installing Python-2.5.tar.bz2 9357099 bytes

During the first configure/make process, there were no errors until compilation 
failed.
Looking back at the configure output, I see:
...[clip]...
checking for UCS-4 tcl... no
...[clip]...

Because during make, it said:
...[clip]...
/usr/src/Python-2.5/Modules/_tkinter.c:80:2: #error "Tk older than 8.2 not 
supported"
/usr/src/Python-2.5/Modules/_tkinter.c:92:2: #error "unsupported Tcl 
configuration"
...[clip]...and many pages of:...
/usr/src/Python-2.5/Modules/_tkinter.c:: errors

Ok, so I upgraded the tk and tcl packages without incident.

Now, why do I get the same message during the clean re-configuration, along 
with a matching error in config.log?

make did (appear to) finish ok

Perhaps this may have relevance to the other bug.
[ 1653753 ] crash / abort during install

config.log is attached there.


--




[ python-Feature Requests-1649329 ] gettext.py incompatible with eggs

2007-02-06 Thread SourceForge.net
Feature Requests item #1649329, was opened at 2007-02-01 03:20
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=1649329&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Shannon -jj Behrens (jjinux)
Assigned to: Nobody/Anonymous (nobody)
Summary: gettext.py incompatible with eggs

Initial Comment:
If you distribute your code in a zipped egg, you can't use translation catalogs 
stored in that egg.  That's because gettext.py only knows how to find the 
translation catalogs in the actual filesystem.

This wouldn't be so bad if the "find" function didn't serve double duty.  On 
the one hand, it implements the "find" algorithm to pick the right languages 
and fallbacks.  On the other hand, it actually resolves the files in the 
filesystem.  It would be nice if these were separate.  It seems like people in 
projects like Pylons are stuck copying code from the find function in order to 
work around this problem.

Perhaps gettext can be updated to know about eggs.  Perhaps setting localedir 
to something like "egg://..." would enable this functionality.

--

>Comment By: Martin v. Löwis (loewis)
Date: 2007-02-07 07:45

Message:
Logged In: YES 
user_id=21627
Originator: NO

Do you have a proposal on how to do the refactoring? It would be fine to
split find into two parts; it would not be acceptable (to me) to teach it
egg: URLs.
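
One possible shape for such a split, offered only as a sketch: the language-fallback step and the "does this catalog exist" step become separate pieces, with the existence check passed in as a callable. The names candidate_languages and find_catalog are hypothetical, and the fallback expansion here is deliberately simplified compared with what gettext actually does (for example, it ignores @modifier suffixes).

```python
def candidate_languages(language):
    """Return `language` followed by progressively more general fallbacks.

    Simplified: strips the encoding ("de_DE.UTF-8" -> "de_DE") and then the
    territory ("de_DE" -> "de").
    """
    candidates = []
    base = language.split(".", 1)[0]
    for name in (language, base, base.split("_", 1)[0]):
        if name and name not in candidates:
            candidates.append(name)
    return candidates


def find_catalog(domain, languages, exists):
    """Return the first catalog path for which exists(path) is true, else None."""
    for language in languages:
        for name in candidate_languages(language):
            path = "%s/LC_MESSAGES/%s.mo" % (name, domain)
            if exists(path):
                return path
    return None


# The storage backend is just a callable; here a set stands in for it.
available = {"de/LC_MESSAGES/app.mo"}
print(find_catalog("app", ["de_DE.UTF-8", "en"], available.__contains__))
# de/LC_MESSAGES/app.mo
```

With this arrangement the fallback logic knows nothing about where catalogs are stored, so a filesystem check or a zip archive check could be supplied without teaching the lookup about any particular URL scheme.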

--

Comment By: Shannon -jj Behrens (jjinux)
Date: 2007-02-06 23:28

Message:
Logged In: YES 
user_id=30164
Originator: YES

Sorry, yes, this is a feature request.  The most important thing that I'm
requesting is to refactor the code.  That "find" function implements a
certain algorithm that has nothing to do with the filesystem.

--

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-06 20:43

Message:
Logged In: YES 
user_id=21627
Originator: NO

I fail to see the bug. gettext.find behaves as specified; if you want
something else, don't use that function.

If you want to load a .mo file from a zip file, you should be able to
create a GNUTranslations object directly, from a file-like object.

I don't think that gettext should support eggs directly. If you want it to
do things other than loading from file systems, you should generalize that
appropriately. One appropriate generalization could be the introduction of
a directory-like object, where you can do .exists(relpath), and
.open(relpath). However, introduction of directory-like objects is PEP
material.

Reclassifying this as a feature request.

--




[ python-Bugs-1651995 ] sgmllib _convert_ref UnicodeDecodeError exception, new in 2.

2007-02-06 Thread SourceForge.net
Bugs item #1651995, was opened at 2007-02-04 22:34
Message generated for change (Comment added) made by nagle
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1651995&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: John Nagle (nagle)
Assigned to: Nobody/Anonymous (nobody)
Summary: sgmllib _convert_ref UnicodeDecodeError exception, new in 2.

Initial Comment:
   I'm running a website page through BeautifulSoup.  It parses OK with Python 
2.4, but Python 2.5 fails with an exception:

Traceback (most recent call last):
  File "./sitetruth/InfoSitePage.py", line 268, in httpfetch
    self.pagetree = BeautifulSoup.BeautifulSoup(sitetext) # parse into tree form
  File "./sitetruth/BeautifulSoup.py", line 1326, in __init__
    BeautifulStoneSoup.__init__(self, *args, **kwargs)
  File "./sitetruth/BeautifulSoup.py", line 973, in __init__
    self._feed()
  File "./sitetruth/BeautifulSoup.py", line 998, in _feed
    SGMLParser.feed(self, markup or "")
  File "/usr/lib/python2.5/sgmllib.py", line 99, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/sgmllib.py", line 133, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/sgmllib.py", line 291, in parse_starttag
    self.finish_starttag(tag, attrs)
  File "/usr/lib/python2.5/sgmllib.py", line 340, in finish_starttag
    self.handle_starttag(tag, method, attrs)
  File "/usr/lib/python2.5/sgmllib.py", line 376, in handle_starttag
    method(attrs)
  File "./sitetruth/BeautifulSoup.py", line 1416, in start_meta
    self._feed(self.declaredHTMLEncoding)
  File "./sitetruth/BeautifulSoup.py", line 998, in _feed
    SGMLParser.feed(self, markup or "")
  File "/usr/lib/python2.5/sgmllib.py", line 99, in feed
    self.goahead(0)
  File "/usr/lib/python2.5/sgmllib.py", line 133, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.5/sgmllib.py", line 285, in parse_starttag
    self._convert_ref, attrvalue)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa7 in position 0: ordinal not in range(128)

The code that's failing is in "_convert_ref", which is new in Python 2.5. 
That function wasn't present in 2.4.  I think the code is trying to handle 
single quotes inside of double quotes in HTML attributes, or something like 
that.

To replicate, run

http://www.bankofamerica.com
or
http://www.gm.com

through BeautifulSoup.  

Something about this code doesn't like big companies. Web sites of smaller 
companies are going through OK.

--

>Comment By: John Nagle (nagle)
Date: 2007-02-07 07:57

Message:
Logged In: YES 
user_id=5571
Originator: YES

Found the problem. In sgmllib.py for Python 2.5, in convert_charref, the
code for handling character escapes assumes that ASCII characters have
values up to 255. But the correct limit is 127, of course.

If a Unicode string is run through SGMLParser, and that string has a
character in an attribute with a value between 128 and 255, which is valid
in Unicode, the value is passed through as a character with "chr", creating
a one-character invalid ASCII string.

Then, when the bad string is later converted to Unicode as the output is
assembled, the UnicodeDecodeError exception is raised. 

So the fix is to change 255 to 127 in convert_charref in sgmllib.py,
as shown below.  This forces characters above 127 to be expressed with
escape sequences.  Please patch accordingly.  Thanks.

def convert_charref(self, name):
    """Convert character reference, may be overridden."""
    try:
        n = int(name)
    except ValueError:
        return
    if not 0 <= n <= 127:  # ASCII ends at 127, not 255
        return
    return self.convert_codepoint(n)
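
The failure mode can be reproduced in isolation with a small sketch. It is written for Python 3, where sgmllib no longer exists, so `bytes` stands in for the Python 2 `str` that sgmllib produced; the standalone function convert_charref with a `limit` parameter and the helper can_mix_with_unicode are hypothetical and only imitate the behaviour described above.

```python
def convert_charref(name, limit=127):
    """Return the one-byte string for a numeric character reference, or None.

    Python 3 `bytes` stands in for the Python 2 `str` that sgmllib produced.
    """
    try:
        n = int(name)
    except ValueError:
        return None
    if not 0 <= n <= limit:
        return None
    return bytes([n])


def can_mix_with_unicode(byte_string):
    """Mimic Python 2's implicit ASCII decode when str meets unicode."""
    try:
        byte_string.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


section_sign = convert_charref("167", limit=255)  # old limit: yields b"\xa7"
print(can_mix_with_unicode(section_sign))         # False: 0xa7 is not ASCII
print(convert_charref("167"))                     # None: left as an escape
print(convert_charref("65"))                      # b'A'
```

Character reference 167 is the byte 0xa7 named in the traceback; with the limit at 127 it is never turned into a raw byte, so nothing non-ASCII reaches the later Unicode conversion.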


--

Comment By: wrstl prmpft (wrstlprmpft)
Date: 2007-02-05 07:16

Message:
Logged In: YES 
user_id=801589
Originator: NO

I had a similar problem recently and did not have time to file a
bug-report. Thanks for doing that.

The problem is the code that handles entity and character references in
SGMLParser.parse_starttag. Seems that it is not careful about unicode/str
issues.
(But maybe BeautifulSoup needs to tell it to?)

My quick'n'dirty workaround was to remove the offending char-entity from
the website before feeding it to BeautifulSoup::

  text = text.replace('®', '') # remove rights reserved sign entity

cheers,
stefan


--
