Marc-Andre Lemburg added the comment:
All string length calculations in Python 2.4 are done using ints
which are 32-bit, even on 64-bit platforms.
Since UTF-8 can use up to 4 bytes per Unicode code point, the encoder
overallocates the needed chunk of memory to len*4 bytes. This
will go straight
Martin v. Löwis added the comment:
Just to support Mark's decision: Python 2.4 is no longer maintained; you
are on your own with any problems you encounter with it. So closing it
as "won't fix" would also have been appropriate.
The same holds for 2.5, unless you can demonstrate this to cause
se
Mark Dickinson added the comment:
Well, the signature of PyUnicode_Encode in Python 2.4 (see
Objects/unicodeobject.c) is:
PyObject *PyUnicode_Encode(const Py_UNICODE *s,
int size,
const char *encoding,
const char
Andreas Jung added the comment:
Both systems are Linux system running a narrow Python build.
The problem does not occur with Python 2.5 or 2.6.
Unfortunately this error occurs with Zope 2 which is tied (at least with
versions prior to Zope 2.12 to Python 2.4).
--
status: pending -> o
Mark Dickinson added the comment:
Is the first machine also a Linux machine? Perhaps the difference is that
the first machine has a wide-unicode build (i.e., it uses UCS4 internally)
and the other doesn't?
Unfortunately there's not much that the python-devs can do about this
unless the prob
New submission from Andreas Jung :
We encountered a pretty bizarre behavior of Python 2.4.6 while decoding a 600MB
long unicode string
'data':
Python 2.4.6 (8GB RAM, 64 bit)
(Pdb) type(data)
(Pdb) len(data)
601794657
(Pdb) data2=data.encode('utf-8')
*** SystemError: Negative size passed to