Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Martin v. Löwis
On 16.09.11 00:42, Nick Coghlan wrote: On Fri, Sep 16, 2011 at 7:39 AM, "Martin v. Löwis" wrote: Thinking about this, the following may work: - ASCIIObject: state, length, hash, wstr*, data follow - SingleBlockUnicode: ASCIIObject, wstr_len, utf8*, utf8_len, data follow - UnicodeObject: Sin

Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Nick Coghlan
On Fri, Sep 16, 2011 at 7:39 AM, "Martin v. Löwis" wrote:
> Thinking about this, the following may work:
> - ASCIIObject: state, length, hash, wstr*, data follow
> - SingleBlockUnicode: ASCIIObject, wstr_len, utf8*, utf8_len, data follow
> - UnicodeObject: SingleBlockUnicode
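The nesting proposed in the quoted list can be sketched roughly as follows. This is only an illustration: the `sketch_*` names, the stand-in header, the field types, and the field order are assumptions made so the example compiles without Python.h. The fields of the third level (UnicodeObject) are cut off in the snippet, so they are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Stand-in for PyObject_HEAD (assumption, not the real macro). */
typedef struct { ptrdiff_t refcnt; void *type; } sketch_head;

/* "ASCIIObject: state, length, hash, wstr*, data follow" */
typedef struct {
    sketch_head head;
    int state;
    ptrdiff_t length;
    ptrdiff_t hash;
    wchar_t *wstr;
    /* character data would follow the struct in the same allocation */
} sketch_ascii;

/* "SingleBlockUnicode: ASCIIObject, wstr_len, utf8*, utf8_len, data follow" */
typedef struct {
    sketch_ascii base;      /* smaller layout embedded as the first member */
    ptrdiff_t wstr_len;
    char *utf8;
    ptrdiff_t utf8_len;
} sketch_single_block;

/* Because base is the first member, a pointer to the larger struct can be
   treated as a pointer to the smaller one. */
static_assert(offsetof(sketch_single_block, base) == 0,
              "base must be the first member");

/* Code that only needs the common fields can take the smaller type. */
static ptrdiff_t sketch_length(const sketch_ascii *s)
{
    return s->length;
}
```

With this shape, an ASCII-only string pays only for the fields of the smallest struct, while the larger layout adds the extra bookkeeping fields on top of it.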

Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Martin v. Löwis
I like it. If we start with such an optimization, we can also remove data from strings allocated by the new API (it can be computed: object pointer + size of the structure). See my email for my proposition of structures: Re: [Python-Dev] PEP 393 review Thu Aug 25 00:29:19 2011 I agree
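The remark that the data pointer "can be computed: object pointer + size of the structure" can be illustrated with a minimal sketch. The `sketch_str` header and both helper functions are hypothetical, not part of PEP 393 or the CPython API; the sketch only assumes that the characters are stored directly after the header in a single allocation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical minimal header: note that no data pointer is stored. */
typedef struct {
    size_t length;
} sketch_str;

/* The character data starts right after the header, so its address is
   the object pointer plus the size of the structure. */
static char *sketch_data(sketch_str *s)
{
    return (char *)s + sizeof(sketch_str);
}

/* Allocate header and characters as one block and copy the text in. */
static sketch_str *sketch_new(const char *text)
{
    assert(text != NULL);
    size_t n = strlen(text);
    sketch_str *s = malloc(sizeof(sketch_str) + n + 1);
    if (s == NULL)
        return NULL;
    s->length = n;
    memcpy(sketch_data(s), text, n + 1);  /* includes the trailing NUL */
    return s;
}
```

Dropping the stored pointer saves one pointer-sized field per string, at the cost of a small address computation on each access.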

Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Victor Stinner
On Thursday 15 September 2011 17:50:41, Martin v. Löwis wrote:
> In reviewing memory usage, I found potential for saving more memory for
> ASCII-only strings. (...)
>
> typedef struct {
>     PyObject_HEAD
>     Py_ssize_t length;
>     union {
>         void *any;
>         Py_UCS1 *latin1;

Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Guido van Rossum
On Thu, Sep 15, 2011 at 8:50 AM, "Martin v. Löwis" wrote:
> In reviewing memory usage, I found potential for saving more memory for
> ASCII-only strings. Both Victor and Guido commented that something like
> this be done; Antoine had asked whether there was anything that could
> be done. Here is t

Re: [Python-Dev] PEP 393: Special-casing ASCII-only strings

2011-09-15 Thread Terry Reedy
On 9/15/2011 11:50 AM, "Martin v. Löwis" wrote:
To comply with the C aliasing rules, the structures would look like this:
typedef struct {
    PyObject_HEAD
    Py_ssize_t length;
    union {
        void *any;
        Py_UCS1 *latin1;
        Py_UCS2 *ucs2;
        Py_UCS4 *ucs4;
    } data;
    Py_hash_t hash;
    int state; /* may include SSTATE_
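As a rough illustration of how the `data` union in the quoted struct might be used, the sketch below reads one code point through the member that matches the stored character width. The `sketch_*` typedefs stand in for Py_UCS1, Py_UCS2 and Py_UCS4, and the `width` parameter is an assumption; the snippet is cut off before it shows how the width is actually recorded in `state`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Py_UCS1 / Py_UCS2 / Py_UCS4 (assumed 8/16/32-bit unsigned). */
typedef uint8_t  sketch_ucs1;
typedef uint16_t sketch_ucs2;
typedef uint32_t sketch_ucs4;

/* Same shape as the union in the quoted struct. */
typedef union {
    void *any;
    sketch_ucs1 *latin1;
    sketch_ucs2 *ucs2;
    sketch_ucs4 *ucs4;
} sketch_data;

/* Read code point i through the correctly typed pointer. `width` is the
   number of bytes per character (1, 2 or 4), a hypothetical parameter. */
static uint32_t sketch_read(sketch_data d, int width, size_t i)
{
    assert(width == 1 || width == 2 || width == 4);
    switch (width) {
    case 1:  return d.latin1[i];
    case 2:  return d.ucs2[i];
    default: return d.ucs4[i];
    }
}
```

Each buffer is always accessed through a pointer of its own element type, so no `Py_UCS2` data is ever read through, say, a `Py_UCS4 *`, which is the kind of access the aliasing rules restrict.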