Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Brett Cannon
On 11/15/05, Neal Norwitz <[EMAIL PROTECTED]> wrote:
> On 11/15/05, Jeremy Hylton <[EMAIL PROTECTED]> wrote:
> >
> > Thanks for the message.  I was going to suggest the same thing.  I
> > think it's primarily a question of how to add an arena layer.  The AST
> > phase has a mixture of malloc/free and Python object allocation.  It
> > should be straightforward to change the malloc/free code to use an
> > arena API.  We'd probably need a separate mechanism to associate a set
> > of PyObject* with the arena and have those DECREFed.
>
> Well good.  It seems we all agree there is a problem and on the
> general solution.  I haven't thought about Brett's idea to see if it
> could work or not.  It would be great if we had someone start working
> to improve the situation.  It could well be that we live with the
> current code for 2.5, but it would be great to use arenas for 2.6 at
> least.
>

I have been thinking about this some more (to put off doing homework)
and I have some random ideas I wanted to toss out there, to make sure
I am not thinking about arena memory management incorrectly (I have
never actually encountered it directly before).

I think an arena API is going to be the best solution.  Pulling
trickery by redefining Py_INCREF and the like, as I suggested, seems
like a pain and possibly error-prone.  With the compiler being a
specific corner of the core, a special API for handling the memory of
PyObject* stuff seems reasonable.

We might need PyArena_Malloc() and PyArena_New() to handle malloc()
and PyObject* creation.  We could then have a struct that just stores
pointers to the allocated memory (a linked list with a node per
pointer, which gives high memory overhead, or a linked list of arrays,
which should lower the overhead but makes holes in the array for items
already freed a pain to handle).  We would then have PyArena_FreeAll(),
strategically placed in the code for when bad things happen, that
would just traverse the lists and free everything.  I assume having a
way to free individual items might be useful.  PyArena_New() and
_Malloc() could return structs with the info needed for a
PyArena_Free(location_struct) to free a specific item without
triggering a complete freeing of all memory.  But this usage should be
discouraged and only used when proper memory management is guaranteed.

Boy am I wanting RAII from C++ for automatic freeing when scope is
left.  Maybe we need to come up with a similar thing: all memory that
should be freed once a scope is left must use some special struct that
stores references to all memory created locally, and then a free call
must be made at all exit points of the function using that struct.
Otherwise the pointer is stored in the arena and handled en masse
later.

Hopefully this all made some sense.  =)  Is this the basic strategy
that an arena setup would need?  If not, can someone enlighten me?


-Brett
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Thomas Lee
As the writer of the crappy code that sparked this conversation, I feel 
I should say something :)

Brett Cannon wrote:

>On 11/15/05, Neal Norwitz <[EMAIL PROTECTED]> wrote:
>  
>
>>On 11/15/05, Jeremy Hylton <[EMAIL PROTECTED]> wrote:
>>
>>
>>>Thanks for the message.  I was going to suggest the same thing.  I
>>>think it's primarily a question of how to add an arena layer.  The AST
>>>phase has a mixture of malloc/free and Python object allocation.  It
>>>should be straightforward to change the malloc/free code to use an
>>>arena API.  We'd probably need a separate mechanism to associate a set
>>>of PyObject* with the arena and have those DECREFed.
>>>  
>>>
>>Well good.  It seems we all agree there is a problem and on the
>>general solution.  I haven't thought about Brett's idea to see if it
>>could work or not.  It would be great if we had someone start working
>>to improve the situation.  It could well be that we live with the
>>current code for 2.5, but it would be great to use arenas for 2.6 at
>>least.
>>
>>
>>
>
> I have been thinking about this some more  to put off doing homework
>and I have some random ideas I just wanted to toss out there to make
>sure I am not thinking about arena memory management incorrectly
>(never actually encountered it directly before).
>
>I think an arena API is going to be the best solution.  Pulling
>trickery with redefining Py_INCREF and such like I suggested seems
>like a pain and possibly error-prone.  With the compiler being a
>specific corner of the core having a special API for handling the
>memory for PyObject* stuff seems reasonable.
>
>  
>
I agree. And it raises the learning curve for poor saps like myself. :)

>We might need PyArena_Malloc() and PyArena_New() to handle malloc()
>and PyObject* creation.  We could then have a struct that just stored
>pointers to the allocated memory (linked list for each pointer which
>gives high memory overhead or linked list of arrays that should lower
>memory but make having possible holes in the array for stuff already
>freed a pain to handle).  We would then have PyArena_FreeAll() that
>would be strategically placed in the code for when bad things happen
>that would just traverse the lists and free everything.  I assume
>having a way to free individual items might be useful.  Could have the
>PyArena_New() and _Malloc() return structs with the needed info for a
>PyArena_Free(location_struct) to be able to free the specific item
>without triggering a complete freeing of all memory.  But this usage
>should be discouraged and only used when proper memory management is
>guaranteed.
>
>  
>
An arena/pool (as I understood it from my quick skim) for the AST would 
probably best be implemented (IMHO) as an ADT based on a linked-list:

typedef struct _ast_pool_node {
  struct _ast_pool_node *next;
  PyObject *object; /* == NULL when data != NULL */
  void *data; /* == NULL when object != NULL */
} ast_pool_node;

deallocating a node could then be as simple as:

/* ast_pool_node *n */
if (n->object != NULL)
  PyObject_Free(n->object);
if (n->data != NULL)
  free(n->data);
/* save n->next before freeing n */
free(n);
/* then go on to free the saved next */

I haven't really thought all that deeply about this, so somebody shoot 
me down if I'm completely off-base (Neal? :D). Every allocation of a 
seq/stmt within ast.c would have its memory saved to the pool within the 
function it's allocated in. Then before we return, we can just 
deallocate the pool/arena/whatever you want to call it.

The problem with this is that should we get to the end of the function 
and everything actually went okay (i.e. we return non-NULL), we then 
have to run through and deallocate all the nodes anyway (without 
deallocating n->object or n->data). Bah. Maybe we *would* be better off 
with a monolithic cleanup. I don't know.

>Boy am I wanting RAII from C++ for automatic freeing when scope is
>left.  Maybe we need to come up with a similar thing, like all memory
>that should be freed once a scope is left must use some special struct
>that stores references to all created memory locally and then a free
>call must be made at all exit points in the function using the special
>struct.  Otherwise the pointer is stored in the arena and handled
>en-mass later.
>
>  
>
Which is basically what I just rambled on about up above, I think :)

>Hopefully this all made some sense.  =)  Is this the basic strategy
>that an arena setup would need?  If not, can someone enlighten me?
>
>
>-Brett



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Nick Coghlan
Nick Coghlan wrote:
> Marek Baczek Baczyński wrote:
>> 2005/11/15, Nick Coghlan <[EMAIL PROTECTED]>:
>>> It avoids the potential for labelling problems that arises when goto's are
>>> used for resource cleanup. It's a far cry from real exception handling, but
>>> it's the best solution I've seen within the limits of C.
>> 
>> do {
>> 
>> 
>> } while (0);
>>
>>
>> Same benefit and saves some typing :)
> 
> Heh. Good point. I spend so much time working with a certain language I tend 
> to forget do/while loops exist ;)

Thomas actually tried doing things this way, but the parser/compiler code 
needs to use real loops - inside which a 'break' binds to the inner loop 
rather than the do/while wrapper - so this trick won't work reliably.

So we'll need to do something smarter (such as the arena idea) to deal with 
the memory allocation problem.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread skip

Travis> More to the point, however, these scalar objects were allocated
Travis> using the standard PyObject_New and PyObject_Del functions which
Travis> of course use the Python memory manager.  One user ported his
Travis> (long-running) code to the new scipy core and found much to his
Travis> dismay that what used to consume around 100MB now completely
Travis> dominated his machine consuming up to 2GB of memory after only a
Travis> few iterations.  After searching many hours for memory leaks in
Travis> scipy core (not a bad exercise anyway as some were found), the
Travis> real problem was tracked to the fact that his code ended up
Travis> creating and destroying many of these new array scalars.

What Python object were his array elements a subclass of?

Travis> In the long term, what is the status of plans to re-work the
Travis> Python Memory manager to free memory that it acquires (or
Travis> improve the detection of already freed memory locations).  

None that I'm aware of.  It's seen a great deal of work in the past and
generally doesn't cause problems.  Maybe your user's usage patterns were
a bad corner case.  It's hard to tell without more details.

Skip


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Niko Matsakis
> Boy am I wanting RAII from C++ for automatic freeing when scope is
> left.  Maybe we need to come up with a similar thing, like all memory
> that should be freed once a scope is left must use some special struct
> that stores references to all created memory locally and then a free
> call must be made at all exit points in the function using the special
> struct.  Otherwise the pointer is stored in the arena and handled
> en-mass later.

That made sense.  I think I'd be opposed to what you describe here  
just because I think anything which *requires* that cleanup code be  
placed on every function is error prone.

Depending on how much you care about peak memory usage, you do not  
necessarily need to worry about freeing pointers as you go.  If you  
can avoid thinking about it, it makes things much simpler.

If you are concerned with peak memory usage, it gets more  
complicated, and you will begin to have greater possibility of user  
error.  The problem is that dynamically allocated memory often  
outlives the stack frame in which it was created.  There are several  
possibilities:

- If you use ref-counted memory, you can add to the ref count of the  
memory which outlives the stack frame; the problem is knowing when to  
drop it down again.  I think the easiest is to have two lists: one  
for memory which will go away quickly, and another for more permanent  
memory.  The more permanent memory list goes away at the end of the  
transform and is hopefully rarely used.

- Another idea is to have trees of arenas: when an arena is created,  
it is assigned a parent.  When an arena is freed, any arenas in its  
subtree are also freed.  This way you can have one master arena for  
exception handling, but if there is some sub-region where allocations  
can be grouped together, you create a sub-arena and free it when that  
region is complete.  Note that if you forget to free a sub-arena, it  
will eventually be freed along with its ancestors.

There is no one-size-fits-all solution.  The right one depends on how  
memory is used; but I think all of them are much simpler and less  
error prone than tracking individual pointers.

I'd actually be happy to hack on the AST code and try to clean up the  
memory usage, assuming that the 2.6 release is far enough out that I  
will have time to squeeze it in among the other things I am doing.


Niko


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Thomas Lee
Niko Matsakis wrote:

>>Boy am I wanting RAII from C++ for automatic freeing when scope is
>>left.  Maybe we need to come up with a similar thing, like all memory
>>that should be freed once a scope is left must use some special struct
>>that stores references to all created memory locally and then a free
>>call must be made at all exit points in the function using the special
>>struct.  Otherwise the pointer is stored in the arena and handled
>>en-mass later.
>>
>>
>
>That made sense.  I think I'd be opposed to what you describe here  
>just because I think anything which *requires* that cleanup code be  
>placed on every function is error prone.
>
>  
>
Placing it in every function isn't really the problem: at the moment 
it's more the fact we have to keep track of too many variables at any 
given time to properly deallocate it all. Cleanup code gets tricky very 
fast.

Then it gets further complicated by the fact that 
stmt_ty/expr_ty/mod_ty/etc. deallocate their members (usually asdl_seq 
instances, in my experience) - so once a node has been constructed, you 
suddenly have to make sure you don't deallocate those members a second 
time in the cleanup code :S It gets tricky very quickly.

Even if it meant we had just one function call - one, safe function call 
that deallocated all the memory allocated within a function - that we 
had to put before each and every return, that's better than what we 
have. Is it the best solution? Maybe not. But that's what we're looking 
for here I guess :)



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Thomas Lee
By the way, I liked the sound of the arena/pool tree - really good idea.

Thomas Lee wrote:

> [full quote of the previous message snipped]



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Fredrik Lundh
Thomas Lee wrote:

> Even if it meant we had just one function call - one, safe function call
> that deallocated all the memory allocated within a function - that we
> had to put before each and every return, that's better than what we
> have.

alloca?

(duck)

 





Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Collin Winter
On 11/16/05, Niko Matsakis <[EMAIL PROTECTED]> wrote:
> - Another idea is to have trees of arenas: the idea is that when an
> arena is created, it is assigned a parent.  When an arena is freed,
> any arenas in its subtree are also freed.  This way you can have one
> master arena for exception handling, but if there is some sub-region
> where allocations can be grouped together, you create a sub-arena and
> free it when that region is complete.  Note that if you forget to
> free a sub-arena, it will eventually be freed.

You might be able to draw some inspiration from the Apache Portable
Runtime. It includes a memory pool management scheme that might be of
some interest.

The main project page is http://apr.apache.org, with the docs for the
mempool API located at
http://apr.apache.org/docs/apr/group__apr__pools.html

Collin Winter


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Nick Coghlan
Thomas Lee wrote:
> As the writer of the crappy code that sparked this conversation, I feel 
> I should say something :)

Don't feel bad about it. It turned out the 'helpful' review comments from Neal 
and me didn't originally work out very well either ;)

With the AST compiler being so new, this is the first serious attempt to 
introduce modifications based on it. It's already better than the old CST 
compiler, but that memory management in the parser is a cow :)

>> Hopefully this all made some sense.  =)  Is this the basic strategy
>> that an arena setup would need?  If not, can someone enlighten me?

I think we need to be explicit about the problems we're trying to solve before 
deciding on what kind of solution we want :)

1. Cleaning up after failures in symtable.c and compile.c
   It turns out this is already dealt with in the case of code blocks - the 
compiler state handles a linked list of blocks which it automatically frees 
when the compiler state is cleaned up.
   So the only rule that needs to be followed in these files is to *never* 
call any of the VISIT_* macros while there is a Python object which requires 
DECREF'ing, or a C pointer which needs to be freed.
   This rule was being broken in a couple of places in compile.c (with respect 
to strings). I was the offender in both cases I found - the errors date from 
when this was still on the ast-branch in CVS.
   I've fixed those errors in SVN, and added a note to the comment at the top 
of compile.c, to help others avoid making the same mistake I did.
   It's fragile in some ways, but it does work. It makes the actual 
compilation code look clean (because there isn't any cleanup code), but it 
also makes that code look *wrong* (because the lack of cleanup code makes the 
calls to "compiler_new_block" look unbalanced), which is a little disconcerting.

2. Parsing a token stream into the AST in ast.c
   This is the bit that has caused Thomas grief (the PEP 341 patch only needs 
to modify the front end parser). When building an AST node, each of the 
contained AST nodes or sequences has to be built first. That means that, if 
there's a problem with any of the later subnodes, the earlier subnodes need to 
be freed.
   The key problem with memory management in this module is that the free 
method to be invoked is dependent on the nature of the AST node to be freed. 
In the case of a node sequence, it is dependent on the nature of the contained 
elements.
   So not only do you have to remember to free the memory, you have to 
remember to free it the *right way*.

Would it be worth the extra memory needed to store a pointer to an AST node's 
"free" method in the AST type structure itself? And do the same for ASDL 
sequences?

Then a simple FREE_AST macro would be able to "do the right thing" when it 
came to freeing either AST nodes or sequences. In particular, ASDL sequences 
would be able to free their contents without knowing what those contents 
actually are.

That wouldn't eliminate the problem with memory leaks or double-deletion, but 
it would eliminate some of the mental overhead of dealing with figuring out 
which freeing function to invoke.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-16 Thread Thomas Lee
Just messing around with some ideas. I was trying to avoid the ugly 
macros (note my earlier whinge about a learning curve) but they're the 
cleanest way I could think of to get around the problem without 
resorting to a mass deallocation right at the end of the AST run. Which 
may not be all that bad given we're going to keep everything in-memory 
anyway until an error occurs ... anyway, anyway, I'm getting sidetracked :)

The idea is to ensure that all allocations within a single function are 
made using the pool so that a function finishes what it starts. This 
way, if the function fails it alone is responsible for cleaning up its 
own pool and that's all. No funkyness needed for sequences, because each 
member of the sequence belongs to the pool too. Note that the stmt_ty 
instances are also allocated using the pool.

This breaks interfaces all over the place though. Not exactly a pretty 
change :) But yeah, maybe somebody smarter than I will come up with 
something a bit cleaner.

--

/* snip! */

#define AST_SUCCESS(pool, result) return result
/* wrapped in do/while so the two statements stay together when the
   macro is used as the body of a bare `if` */
#define AST_FAILURE(pool, result) do { asdl_pool_free(pool); return result; } while (0)

static stmt_ty
ast_for_try_stmt(struct compiling *c, const node *n)
{
    /* with the pool stuff, we wouldn't need to declare _all_ the variables
       here either. I'm just lazy. */

    asdl_pool *pool;
    int i;
    const int nch = NCH(n);
    int n_except = (nch - 3)/3;
    stmt_ty result_st = NULL, except_st = NULL;
    asdl_seq *body = NULL, *orelse = NULL, *finally = NULL;
    asdl_seq *inner = NULL, *handlers = NULL;

    REQ(n, try_stmt);

    /* c->pool is the parent of pool. when pool is freed (via AST_FAILURE),
       it is also removed from c->pool's list of children */
    pool = asdl_pool_new(c->pool);
    if (pool == NULL)
        AST_FAILURE(pool, NULL);

    body = ast_for_suite(c, CHILD(n, 2));
    if (body == NULL)
        AST_FAILURE(pool, NULL);

    if (TYPE(CHILD(n, nch - 3)) == NAME) {
        if (strcmp(STR(CHILD(n, nch - 3)), "finally") == 0) {
            if (nch >= 9 && TYPE(CHILD(n, nch - 6)) == NAME) {
                /* we can assume it's an "else",
                   because nch >= 9 for try-else-finally and
                   it would otherwise have a type of except_clause */
                orelse = ast_for_suite(c, CHILD(n, nch - 4));
                if (orelse == NULL)
                    AST_FAILURE(pool, NULL);
                n_except--;
            }

            finally = ast_for_suite(c, CHILD(n, nch - 1));
            if (finally == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
        else {
            /* we can assume it's an "else",
               otherwise it would have a type of except_clause */
            orelse = ast_for_suite(c, CHILD(n, nch - 1));
            if (orelse == NULL)
                AST_FAILURE(pool, NULL);
            n_except--;
        }
    }
    else if (TYPE(CHILD(n, nch - 3)) != except_clause) {
        ast_error(n, "malformed 'try' statement");
        AST_FAILURE(pool, NULL);
    }

    if (n_except > 0) {
        /* process except statements to create a try ... except */
        handlers = asdl_seq_new(pool, n_except);
        if (handlers == NULL)
            AST_FAILURE(pool, NULL);

        for (i = 0; i < n_except; i++) {
            excepthandler_ty e = ast_for_except_clause(c, CHILD(n, 3 + i * 3),
                                                       CHILD(n, 5 + i * 3));
            if (!e)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(handlers, i, e);
        }

        except_st = TryExcept(pool, body, handlers, orelse, LINENO(n));
        if (except_st == NULL)
            AST_FAILURE(pool, NULL);

        /* if a 'finally' is present too, we nest the TryExcept within a
           TryFinally to emulate try ... except ... finally */
        if (finally != NULL) {
            inner = asdl_seq_new(pool, 1);
            if (inner == NULL)
                AST_FAILURE(pool, NULL);
            asdl_seq_SET(inner, 0, except_st);
            result_st = TryFinally(pool, inner, finally, LINENO(n));
            if (result_st == NULL)
                AST_FAILURE(pool, NULL);
        }
        else
            result_st = except_st;
    }
    else {
        /* no exceptions: must be a try ... finally */
        assert(orelse == NULL);
        assert(finally != NULL);
        result_st = TryFinally(pool, body, finally, LINENO(n));
        if (result_st == NULL)
            AST_FAILURE(pool, NULL);
    }

    /* pool deallocated when c->pool is deallocated */
    return AST_SUCCESS(pool, result_st);
}


Nick Coghlan wrote:

> [quote of the previous message snipped]

[Python-Dev] Conclusion: Event loops, PyOS_InputHook, and Tkinter

2005-11-16 Thread Jim Jewett
Phillip J. Eby:

> did you ever try using IPython, and confirm whether it
> does or does not address the issue

As I understand it, using IPython (or otherwise changing
the interactive mode) works fine *if* you just want a point
solution -- get something up in some environment chosen
by the developer.

Michiel is looking to create a component that will work in
whatever environment the *user* chooses.  Telling users
"you must go through this particular interface" is not
acceptable.  Therefore, IPython is only a workaround,
not a solution.

On the other hand, IPython is clearly a *good* workaround.
The dance described in
http://mail.python.org/pipermail/python-dev/2005-November/058057.html
is short enough that a real solution might well be built on
IPython; it just isn't quite done yet.

-jJ


Re: [Python-Dev] Is some magic required to check out new files from svn?

2005-11-16 Thread Armin Rigo
Hi,

On Sun, Nov 13, 2005 at 07:08:15AM -0600, [EMAIL PROTECTED] wrote:
> The full svn status output is
> 
> % svn status
> !  .
> !  Python

The "!" definitely mean that these items are missing, or for
directories, incomplete in some way.  You need to play around until the
"!" goes away; for example, you may try

svn revert -R . # revert to pristine state, recursively

if you have no local changes you want to keep, followed by 'svn up'.  If
it still doesn't help, then I'm lost about the cause and would just
recommend doing a fresh checkout.


A bientot,

Armin.


Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Travis Oliphant
[EMAIL PROTECTED] wrote:

>Travis> More to the point, however, these scalar objects were allocated
>Travis> using the standard PyObject_New and PyObject_Del functions which
>Travis> of course use the Python memory manager.  One user ported his
>Travis> (long-running) code to the new scipy core and found much to his
>Travis> dismay that what used to consume around 100MB now completely
>Travis> dominated his machine consuming up to 2GB of memory after only a
>Travis> few iterations.  After searching many hours for memory leaks in
>Travis> scipy core (not a bad exercise anyway as some were found), the
>Travis> real problem was tracked to the fact that his code ended up
>Travis> creating and destroying many of these new array scalars.
>
>What Python object were his array elements a subclass of?
>  
>
These were all scipy core arrays.  The elements were therefore all 
C-like numbers (floats and integers I think).  If he obtained an element 
in Python, he would get an instance of a new "array" scalar object which 
is a builtin extension type written in C.  The important issue though is 
that these "array" scalars were allocated using PyObject_New and 
deallocated using PyObject_Del.  The problem is that the Python memory 
manager did not free the memory. 

>Travis> In the long term, what is the status of plans to re-work the
>Travis> Python Memory manager to free memory that it acquires (or
>Travis> improve the detection of already freed memory locations).  
>
>None that I'm aware of.  It's seen a great deal of work in the past and
>generally doesn't cause problems.  Maybe your user's usage patterns were
>a bad corner case.  It's hard to tell without more details.
>  
>
I think definitely, his usage pattern represented a "bad" corner case.  
An unusable "corner" case in fact.   At any rate, moving to use the 
system free and malloc fixed the immediate problem.  I mainly wanted to 
report the problem here just as another piece of anecdotal evidence.

-Travis



Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Josiah Carlson

Travis Oliphant <[EMAIL PROTECTED]> wrote:
> 
> [EMAIL PROTECTED] wrote:
> 
> >Travis> More to the point, however, these scalar objects were allocated
> >Travis> using the standard PyObject_New and PyObject_Del functions which
> >Travis> of course use the Python memory manager.  One user ported his
> >Travis> (long-running) code to the new scipy core and found much to his
> >Travis> dismay that what used to consume around 100MB now completely
> >Travis> dominated his machine consuming up to 2GB of memory after only a
> >Travis> few iterations.  After searching many hours for memory leaks in
> >Travis> scipy core (not a bad exercise anyway as some were found), the
> >Travis> real problem was tracked to the fact that his code ended up
> >Travis> creating and destroying many of these new array scalars.
> >
> >What Python object were his array elements a subclass of?
> 
> These were all scipy core arrays.  The elements were therefore all 
> C-like numbers (floats and integers I think).  If he obtained an element 
> in Python, he would get an instance of a new "array" scalar object which 
> is a builtin extension type written in C.  The important issue though is 
> that these "array" scalars were allocated using PyObject_New and 
> deallocated using PyObject_Del.  The problem is that the Python memory 
> manager did not free the memory. 

This is not a bug, and there don't seem to be any plans to change the
behavior: python.org/sf/1338264

If I remember correctly, arrays from the Python standard library (import
array), as well as numarray and Numeric, all store values in their
pure C representations (they don't use PyObject_New unless someone uses
the Python interface to fetch a particular element).  This saves the
overhead of allocating base objects, as well as the 3-5x space blowup
when using Python integers (depending on whether your platform has 32 or
64 bit ints).
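
[Editorially speaking, the space blowup Josiah describes can be sketched
with the stdlib ``array`` module. This is a rough modern measurement, not
from the original thread; exact numbers vary by platform and Python
version:]

```python
import array
import sys

n = 100_000

# A stdlib array stores raw C doubles: 8 bytes per element plus a
# small fixed header.
packed = array.array('d', (0.0,) * n)

# A list of distinct Python floats stores a pointer per element *and*
# a full float object (refcount, type pointer, value) per element.
boxed = [float(i) for i in range(n)]
per_float = sys.getsizeof(1.5)  # size of one boxed float object

packed_bytes = sys.getsizeof(packed)
boxed_bytes = sys.getsizeof(boxed) + n * per_float

# On a typical 64-bit CPython the boxed form is several times larger.
ratio = boxed_bytes / packed_bytes
```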


> I think definitely, his usage pattern represented a "bad" corner case.  
> An unusable "corner" case in fact.   At any rate, moving to use the 
> system free and malloc fixed the immediate problem.  I mainly wanted to 
> report the problem here just as another piece of anecdotal evidence.

On the one hand, using PyObjects embedded in an array in scientific
Python is a good idea; you can use all of the standard Python
manipulations on them.  On the other hand, other similar projects have
found it more efficient to never embed PyObjects in their arrays, and
just allocate them as necessary on access.

 - Josiah



Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Robert Kern
Josiah Carlson wrote:
> Travis Oliphant <[EMAIL PROTECTED]> wrote:

>>I think definitely, his usage pattern represented a "bad" corner case.  
>>An unusable "corner" case in fact.   At any rate, moving to use the 
>>system free and malloc fixed the immediate problem.  I mainly wanted to 
>>report the problem here just as another piece of anecdotal evidence.
> 
> On the one hand, using PyObjects embedded in an array in scientific
> Python is a good idea; you can use all of the standard Python
> manipulations on them.  On the other hand, other similar projects have
> found it more efficient to never embed PyObjects in their arrays, and
> just allocate them as necessary on access.

That's not what we're doing[1]. The scipy_core arrays here are just
blocks of C doubles. However, the offending code (I believe Chris
Fonnesbeck's PyMC, but I could be mistaken) frequently indexes into
these arrays to get scalar values. In scipy_core, we've defined a set of
numerical types that generally behave like Python ints and floats but
have the underlying storage of the appropriate C data type and have the
various array attributes and methods. When the result of an indexing
operation is a scalar (e.g., arange(10)[0]), it always returns an
instance of the appropriate scalar type. We are "just allocat[ing] them
as necessary on access."
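
[As a rough illustration of the design Robert describes, and not
scipy_core's actual code: a scalar type can subclass ``float`` yet carry
array-like attributes, with a fresh instance boxed on every access. The
``Float64`` class and ``getitem`` helper below are hypothetical:]

```python
import array

class Float64(float):
    """Toy stand-in for an "array scalar": behaves like a Python float
    but carries array-style metadata, mimicking the scipy_core idea."""
    dtype = 'float64'
    ndim = 0
    shape = ()

def getitem(storage, index):
    # Indexing boxes the raw C value in the scalar type, allocating a
    # brand-new object on every access -- the pattern that stressed
    # the allocator in the reported case.
    return Float64(storage[index])

data = array.array('d', range(10))
x = getitem(data, 3)
```

Each access allocates a distinct object, which is why a tight loop over
millions of elements creates and destroys millions of these scalars.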

[1] There *is* an array type for general PyObjects in scipy_core, but
that's not being used in the code that blows up and has nothing to do
with the problem Travis is talking about.

-- 
Robert Kern
[EMAIL PROTECTED]

"In the fields of hell where the grass grows high
 Are the graves of dreams allowed to die."
  -- Richard Harter



Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Josiah Carlson

Robert Kern <[EMAIL PROTECTED]> wrote:
> 
> [1] There *is* an array type for general PyObjects in scipy_core, but
> that's not being used in the code that blows up and has nothing to do
> with the problem Travis is talking about.

I seem to have misunderstood the discussion.  Was the original user
accessing and saving copies of many millions of these doubles?  That's
the only way that I would be able to explain the huge overhead, and in
that case, perhaps the user should have been storing them in scipy
arrays (or even Python array.arrays).

 - Josiah



Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Travis Oliphant
Josiah Carlson wrote:

>Robert Kern <[EMAIL PROTECTED]> wrote:
>  
>
>>[1] There *is* an array type for general PyObjects in scipy_core, but
>>that's not being used in the code that blows up and has nothing to do
>>with the problem Travis is talking about.
>>
>>
>
>I seemed to have misunderstood the discussion.  Was the original user
>accessing and saving copies of many millions of these doubles?  
>
He *was* accessing them (therefore generating a call to an array-scalar 
object creation function).  But they *weren't being* saved.  They were 
being deleted soon after access.   That's why it was so confusing that 
his memory usage should continue to grow and grow so terribly.

As verified by removing usage of the Python PyObject_MALLOC function, it 
was the Python memory manager that was performing poorly.   Even though 
the array-scalar objects were deleted, the memory manager would not 
re-use their memory for later object creation. Instead, the memory 
manager kept allocating new arenas to cover the load (when it should 
have been able to re-use the old memory that had been freed by the 
deleted objects--- again, I don't know enough about the memory manager 
to say why this happened).

The fact that it did happen is what I'm reporting on.  If nothing will 
be done about it (which I can understand), at least this thread might 
help somebody else in a similar situation track down why their Python 
process consumes all of their memory even though their objects are being 
deleted appropriately.

Best,

-Travis



Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Neal Norwitz
On 11/16/05, Travis Oliphant <[EMAIL PROTECTED]> wrote:
>
> As verified by removing usage of the Python PyObject_MALLOC function, it
> was the Python memory manager that was performing poorly.   Even though
> the array-scalar objects were deleted, the memory manager would not
> re-use their memory for later object creation. Instead, the memory
> manager kept allocating new arenas to cover the load (when it should
> have been able to re-use the old memory that had been freed by the
> deleted objects--- again, I don't know enough about the memory manager
> to say why this happened).

Can you provide a minimal test case?  It's hard to do anything about
it if we can't reproduce it.
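
[A minimal test case in the spirit Neal asks for might look something
like the following pure-Python sketch. The ``Scalar`` class is a
hypothetical stand-in for the array-scalar extension type; whether this
actually reproduces the arena growth depends on the interpreter version
and allocator behavior:]

```python
import gc

class Scalar:
    """Stand-in for an array-scalar object: small and short-lived,
    allocated through the object allocator."""
    __slots__ = ('value',)
    def __init__(self, value):
        self.value = value

def churn(iterations, batch):
    # Create and immediately drop many small objects, mimicking the
    # access pattern that triggered the reported memory growth.
    for _ in range(iterations):
        block = [Scalar(i) for i in range(batch)]
        del block

churn(100, 10_000)
gc.collect()

# No Scalar instances should survive; the question in the thread is
# whether the *process* nonetheless keeps the freed arenas.
live = sum(1 for o in gc.get_objects() if isinstance(o, Scalar))
```

Watching the process's resident size while this runs (e.g. with ``top``)
is what distinguishes "objects freed" from "memory returned".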

n


Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread skip

>> [1] There *is* an array type for general PyObjects in scipy_core, but
>> that's not being used in the code that blows up and has nothing to do
>> with the problem Travis is talking about.

Josiah> I seemed to have misunderstood the discussion.  

I'm sorry, but I'm confused as well.  If these scipy arrays have elements
that are subclasses of floats shouldn't we be able to provoke this memory
growth using an array.array of floats?  Can you provide a simple script in
pure Python (no scipy) that demonstrates the problem?

Skip


Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Travis Oliphant
Jim Jewett wrote:

>Do you have the code that caused problems?
>  
>
Yes.  I was able to reproduce his trouble and was trying to debug it.

>The things I would check first are
>
>(1)  Is he allocating (peak usage) a type (such as integers) that
>never gets returned to the free pool, in case you need more of that
>same type?
>  
>
No, I don't think so.

>(2)  Is he allocating new _types_, which I think don't get properly
>
> collected.
>  
>

Bingo.  Yes, definitely allocating new _types_ (an awful lot of them...) 
--- that's what the "array scalars" are: new types created in C.  If 
they don't get properly collected then that would definitely have 
created the problem.  It would seem this should be advertised when 
telling people to use PyObject_New for allocating new memory for an object.

>(3)  Is there something in his code that keeps a live reference, or at
>least a spotty memory usage so that the memory can't be cleanly
>released?
>
>  
>
No, that's where I thought the problem was, at first.  I spent a lot of 
time tracking down references.  What finally convinced me it was the 
Python memory manager was when I re-wrote the tp->alloc functions of the 
new types to use the system malloc instead of PyObject_Malloc.  As 
soon as I did this the problems disappeared and memory stayed constant. 

Thanks for your comments,

-Travis





[Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-16 Thread decker
Hello,


I would appreciate feedback concerning these patches before the next
"PythonD" (for DOS/DJGPP) is released.


Thanks in advance.




Regards,
Ben Decker
Systems Integrator
http://www.caddit.net





[Python-Dev] DRAFT: python-dev Summary for 2005-09-16 to 2005-09-30

2005-11-16 Thread Tony Meyer
It's been some time (all that concurrency discussion didn't help ;)  
but here's the second half of September.  Many apologies for the  
delay; hopefully you agree with Guido's 'better late than never', and  
I promise to try harder in the future.  Note that the delay is all my  
bad, and epithets should be directed at me and not Steve.  As usual,  
please read over if you have a chance, and direct comments/ 
corrections to [EMAIL PROTECTED] or [EMAIL PROTECTED]   
(One particular question is whether the concurrency summary is too  
long).

=
Announcements
=

-
QOTF: Quotes of the fortnight
-

We have two quotes this week, one each from the two biggest threads  
of this fortnight: concurrency and conditional expressions.  The  
first quote, from Donovan Barda, puts Python's approach to threading  
into perspective:

 The reality is threads were invented as a low overhead way of  
easily implementing concurrent applications... ON A SINGLE PROCESSOR.  
Taking into account threading's limitations and objectives, Python's  
GIL is the best way to support threads. When hardware (seriously)  
moves to multiple processors, other concurrency models will start to  
shine.

Our second QOTF, by yours truly (hey, who could refuse a nomination  
from Guido?), is a not-so-subtle reminder to leave syntax decisions  
to Guido:

 Please no more syntax proposals! ... We need to leave the syntax  
to Guido.  We've already proved that ... we can't as a community  
agree on a syntax.  That's what we have a BDFL for. =)

Contributing threads:

- `GIL, Python 3, and MP vs. UP `__
- `Adding a conditional expression in Py3.0 `__

[SJB]

---
Compressed MSI file
---

Martin v. Löwis discovered that a little more than a `MiB`_ could be  
saved in the Python installer by using LZX:21 instead of the standard  
MSZIP when compressing the CAB file.  After confirmation from several  
testers that the new format worked, the change (for Python 2.4.2 and  
beyond) was made.

.. _MiB: http://en.wikipedia.org/wiki/Mibibyte

Contributing thread:

- `Compressing MSI files: 2.4.2 candidate? `__

[TAM]

=
Summaries
=

---
Conditional expressions
---

Raymond Hettinger proposed that the ``and`` and ``or`` operators be  
modified in Python 3.0 to produce only booleans instead of producing  
objects, motivating this proposal in part by the common (mis-)use of  
``<cond> and <true-expr> or <false-expr>`` to emulate a conditional  
expression.  In response, Guido suggested that the conditional  
expression discussion of `PEP 308`_ be reopened.  This time around,  
people seemed almost unanimously in support of adding a conditional  
expression, though as before they disagreed on syntax.  Fortunately,  
this time Guido cut the discussion short and pronounced a new syntax:  
``<true-expr> if <cond> else <false-expr>``.  Although it has not  
been implemented yet, the plan is for it to appear in Python 2.5.

.. _PEP 308: http://www.python.org/peps/pep-0308.html
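[The misuse the proposal targets is easy to demonstrate: the old
``and``/``or`` idiom silently picks the wrong branch when the "true"
value happens to be falsy, while the conditional expression does not.
A small sketch, using the syntax as pronounced:]

```python
def pick_andor(cond, a, b):
    # The pre-PEP-308 idiom: breaks whenever `a` is falsy.
    return cond and a or b

def pick_ternary(cond, a, b):
    # The conditional expression pronounced for Python 2.5.
    return a if cond else b

# Both agree when the "true" value is truthy...
ok = pick_andor(True, 'x', 'y')

# ...but the and/or idiom falls through to the wrong branch when the
# "true" value is falsy (here, the empty string).
wrong = pick_andor(True, '', 'y')
right = pick_ternary(True, '', 'y')
```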

Contributing threads:

- `"and" and "or" operators in Py3.0 `__
- `Adding a conditional expression in Py3.0 `__
- `Conditional Expression Resolution `__

[SJB]

-
Concurrency in Python
-

Once again, the subject of removing the global interpreter lock (GIL)  
came up.  Sokolov Yura suggested that the GIL be replaced with a  
system where there are thread-local GILs that cooperate to share  
writing; Martin v. Löwis suggested that he try to implement his  
ideas, and predicted that he would find that doing so would be a lot  
of work, would require changes to all extension modules (likely to  
introduce new bugs, particularly race conditions), and possibly  
decrease performance.  This kicked off several long threads about  
multi-processor coding.

A long time ago (circa Python 1.5), Greg Ward experimented with free  
threading, which did yield around a 1.6 times speedup on a dual- 
processor machine.  To avoid the overhead of multi-processor locking  
on a uniprocessor machine, a separate binary could be distributed.   
Some of the code apparently did make it into Python 1.5, but the  
issue died off because no-one provided working code, or a strategy  
for what to do with existing extension modules.

Guido pointed out that it is not clear at this time how multiple  
processors will be used as they become the norm.  With the threaded  
programming model (e.g. in Java) there are problems with concurrent  
modification

[Python-Dev] DRAFT: python-dev Summary for 2005-10-01 to 2005-10-15

2005-11-16 Thread Tony Meyer
As you have noticed, there has been a summary delay recently.  This  
is my fault (insert your favourite thesis/work/leisure excuse here).   
Steve has generously covered my slackness by doing all of the October  
summaries himself (thanks!).  Anyway, if you have some moments to  
spare, cast your mind back to the start of October, and see if these  
reflect what happened.  Comments/corrections to [EMAIL PROTECTED]  
or [EMAIL PROTECTED]  Thanks!

=
Announcements
=


QOTF: Quote of the Fortnight


 From Phillip J. Eby:

 So, if threads are "easy" in Python compared to other languages,  
it's *because of* the GIL, not in spite of it.

Contributing thread:

- `Pythonic concurrency `__

[SJB]


GCC/G++ Issues on Linux: Patch available


Christoph Ludwig provided the previously `promised patch`_ to address  
some of the issues in compiling Python with GCC/G++ on Linux.  The  
patch_ keeps ELF systems like x86 / Linux from having any  
dependencies on the C++ runtime, and allows systems that require  
main() to be a C++ function to be configured appropriately.

.. _promised patch: http://www.python.org/dev/summary/2005-07-01_2005-07-15.html#gcc-g-issues-on-linux
.. _patch: http://python.org/sf/1324762

Contributing thread:

- `[C++-sig] GCC version compatibility `__

[SJB]

=
Summaries
=

-
Concurrency in Python
-

Michael Sparks spent a bit of time describing the current state and  
future goals of the Kamaelia_ project.  Mainly, Kamaelia aims to make  
concurrency as simple and easy to use as possible.  A scheduler  
manages a set of generators that communicate with each other through  
Queues.  The long term goals include being able to farm the various  
generators off into thread or processes as needed, so that whether  
your concurrency model is cooperative, threaded or process-based,  
your code can basically look the same.
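
[A toy version of this model can be sketched in a few lines: a
round-robin scheduler stepping generators that communicate through a
queue. This is an illustrative sketch of the idea, not Kamaelia's
actual API:]

```python
from collections import deque
import queue

def producer(outbox, items):
    for item in items:
        outbox.put(item)
        yield  # hand control back to the scheduler after each send

def consumer(inbox, results):
    while True:
        try:
            results.append(inbox.get_nowait())
        except queue.Empty:
            pass
        yield

def run(generators, steps):
    # Cooperative round-robin scheduler: each "component" runs until
    # its next yield; finished components are dropped.
    runnable = deque(generators)
    for _ in range(steps):
        if not runnable:
            break
        gen = runnable.popleft()
        try:
            next(gen)
            runnable.append(gen)
        except StopIteration:
            pass

channel = queue.Queue()
results = []
run([producer(channel, [1, 2, 3]), consumer(channel, results)], steps=20)
```

Because components only touch their own state between yields, the same
code could in principle be farmed out to threads or processes, which is
the long-term goal described above.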

There was also continued discussion about how "easy" threads are.   
Shane Hathaway made the point that it's actually locking that's  
"insanely difficult", and approaches that simplify how much you need  
to think about locking can keep threading relatively easy -- this was  
one of the strong points of ZODB.  A fairly large camp also got  
behind the claim that threads are easy if you're limited to only  
message passing.  There were also a few comments about how Python  
makes threading easier, e.g. through the GIL (see `QOTF: Quote of the  
Fortnight`_) and through threading.threads's encapsulation of thread- 
local resources as instance attributes.
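
[The last point, keeping per-thread resources as instance attributes on
a ``threading.Thread`` subclass, can be sketched like this (modern
Python syntax):]

```python
import threading

class Worker(threading.Thread):
    """Per-thread state lives in instance attributes, so no locking is
    needed while each thread works on its own data."""
    def __init__(self, numbers):
        super().__init__()
        self.numbers = numbers   # input private to this worker
        self.total = 0           # result slot written only by this thread

    def run(self):
        for n in self.numbers:
            self.total += n

# Each worker sums a disjoint slice; results are read only after join().
workers = [Worker(range(i * 10, i * 10 + 10)) for i in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()
grand_total = sum(w.total for w in workers)
```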

.. _Kamaelia: http://kamaelia.sourceforge.net

Contributing threads:

- `Pythonic concurrency - cooperative MT `__
- `Pythonic concurrency `__

[SJB]

-
Organization of modules for threading
-

A few people took issue with the current organization of the  
threading modules into Queue, thread and threading.  Guido views  
Queue as an application of threading, so putting it in the threading  
module is inappropriate (though with a deeper package structure, it  
should definitely be a sibling).  Nick Coghlan suggested that Queue  
should be in a threadtools module (in parallel with itertools), while  
Skip proposed a hierarchy of modules with thread and lock being in  
the lowest level one, and Thread and Queue being in the highest  
level.  Aahz suggested (and Guido approved) deprecating the thread  
module and renaming it to _thread at least in Python 3.0.  It seems  
the deprecation may happen sooner though.

Contributing threads:

- `Making Queue.Queue easier to use `__
- `Autoloading? (Making Queue.Queue easier to use) `__
- `threadtools (was Re: Autoloading? (Making Queue.Queue easier to  
use)) `__
- `Threading and synchronization primitives `__

[SJB]

-
Speed of Unicode decoding
-

Tony Nelson found that decoding with a codec like mac-roman or  
iso8859-1 can take around ten times as long as decoding with utf-8.   
Walter Dörwald provided a patch_ that implements the mapping using a  
unicode string of length 256 where undefined characters are mapped to  
u"\ufffd".  This dropped the decode time for mac-roman to nearly the  
speed of the utf-8 decoding.  Hye-Shik Chang showed off a fastmap  
decoder 
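
[The charmap technique can be seen with the stdlib
``codecs.charmap_decode`` function, which accepts a 256-character
decoding table in exactly this form. The table below is a toy example;
the real patch generated a table per codec:]

```python
import codecs

# Build a 256-entry decoding table: position i holds the Unicode
# character for byte i, with undefined bytes mapped to U+FFFD.
table = [chr(i) for i in range(256)]
table[0x80] = '\ufffd'          # pretend byte 0x80 is undefined
decoding_table = ''.join(table)

# charmap_decode returns (decoded_text, bytes_consumed).
text, consumed = codecs.charmap_decode(b'abc\x80', 'strict', decoding_table)
```

A single string lookup per byte is what makes this roughly as fast as
the hand-written utf-8 decoder.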

[Python-Dev] DRAFT: python-dev Summary for 2005-10-16 to 2005-10-31

2005-11-16 Thread Tony Meyer
And this one brings us up-to-date (apart from the fortnight ending  
yesterday).  Again, if you have the time, please send any comments/ 
corrections to us.  Once again thanks to Steve for covering me and  
getting this all out on his own.

=
Announcements
=

--
AST for Python
--

As of October 21st, Python's compiler now uses a real Abstract Syntax  
Tree (AST)!  This should make experimenting with new syntax much  
easier, as well as allowing some optimizations that were difficult  
with the previous Concrete Syntax Tree (CST).  While there is no  
Python interface to the AST yet, one is intended for the not-so- 
distant future.

Thanks again to all who contributed, most notably: Armin Rigo, Brett  
Cannon, Grant Edwards, John Ehresman, Kurt Kaiser, Neal Norwitz, Neil  
Schemenauer, Nick Coghlan and Tim Peters.

Contributing threads:

- `AST branch merge status `__
- `AST branch update `__
- `AST branch is in? `__
- `Questionable AST wibbles `__
- `[Jython-dev] Re: AST branch is in? `__

[SJB]


Python on Subversion


As of October 27th, Python is now on Subversion!  The new repository  
is http://svn.python.org/projects/.  Check the `Developers FAQ`_ for  
information on how to get yourself setup with Subversion.  Thanks  
again to Martin v. Löwis for making this possible!  

.. _Developers FAQ: http://www.python.org/dev/devfaq.html#subversion-svn

Contributing threads:

- `Migrating to subversion `__
- `Freezing the CVS on Oct 26 for SVN switchover `__
- `CVS is read-only `__
- `Conversion to Subversion is complete `__

[SJB]

---
Faster decoding
---

M.-A. Lemburg checked in Walter Dörwald's patches that improve  
decoding speeds by using a character map.  These should make decoding  
into mac-roman or iso8859-1 nearly as fast as decoding into utf-8.   
Thanks again guys!

Contributing threads:

- `Unicode charmap decoders slow `__
- `New codecs checked in `__
- `KOI8_U (New codecs checked in) `__

[SJB]


=
Summaries
=

-
Strings in Python 3.0
-

Guido proposed that in Python 3.0, all character strings would be  
unicode, possibly with multiple internal representations.  Some of  
the issues:

- Multiple implementations could make the C API difficult.  If utf-8,  
utf-16 and utf-32 are all possible, what types should the C API pass  
around?

- Windows expects utf-16, so using any other encoding will mean that  
calls to Windows will have to convert to and from utf-16.  However,  
even in current Python, all strings passed to Windows system calls  
have to undergo 8 bit to utf-16 conversion.

- Surrogates (two code units encoding one code point) can slow  
indexing down because the number of bytes per character isn't  
constant.  Note that even though utf-32 doesn't need surrogates, they  
may still be used (and must be interpreted correctly) in utf-32  
data.  Also, in utf-32, "graphemes" (which correspond better to the  
traditional concept of a "character" than code points do) may still  
be composed of multiple code points, e.g. "é" (e with an acute  
accent) can be written as "e" followed by a combining acute accent.

This last issue was particularly vexing -- Guido thinks "it's a bad  
idea to offer an indexing operation that isn't O(1)".  A number of  
proposals were put forward, including:

- Adding a flag to strings to indicate whether or not they have any  
surrogates in them.  This makes indexing O(1) when no surrogates are  
in a string, but O(N) otherwise.

- Using a B-tree instead of an array for storage.  This would make  
all indexing O(log N).

- Discouraging using the indexing operations by providing an  
alternate API for strings.  This would require creating iterator-like  
objects that keep track of position in the unicode object.  Coming up  
with an API that's as usable as the slicing API seemed difficult though.
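
[The code-point/grapheme distinction and the surrogate issue can both be
seen directly in a few lines of modern Python 3, where all strings are
Unicode:]

```python
import unicodedata

# One user-perceived character, two spellings:
composed = '\u00e9'        # 'é' as a single code point
decomposed = 'e\u0301'     # 'e' + combining acute accent

# len() counts code points, not graphemes: 1 vs. 2 for the "same" text.
lengths = (len(composed), len(decomposed))

# Normalization converts between the two forms.
nfc = unicodedata.normalize('NFC', decomposed)
nfd = unicodedata.normalize('NFD', composed)

# A character outside the Basic Multilingual Plane needs a surrogate
# pair in UTF-16: two 16-bit code units for one code point.
utf16 = '\U0001d11e'.encode('utf-16-be')   # MUSICAL SYMBOL G CLEF
```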

Contributing thread:

- `Divorcing str and unicode (no more implicit conversions). `__

[SJB]

---
Unicode identifiers
---

Martin v. Löwis suggested lifting the restriction that identifiers be  
ASCII.  There was some concern about c

Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread JustFillBug
On 2005-11-16, Travis Oliphant <[EMAIL PROTECTED]> wrote:
> Josiah Carlson wrote:
>>I seemed to have misunderstood the discussion.  Was the original user
>>accessing and saving copies of many millions of these doubles?  
>>
> He *was* accessing them (therefore generating a call to an array-scalar 
> object creation function).  But they *weren't being* saved.  They were 
> being deleted soon after access.   That's why it was so confusing that 
> his memory usage should continue to grow and grow so terribly.
>
> As verified by removing usage of the Python PyObject_MALLOC function, it 
> was the Python memory manager that was performing poorly.   Even though 
> the array-scalar objects were deleted, the memory manager would not 
> re-use their memory for later object creation. Instead, the memory 
> manager kept allocating new arenas to cover the load (when it should 
> have been able to re-use the old memory that had been freed by the 
> deleted objects--- again, I don't know enough about the memory manager 
> to say why this happened).

Well, the user has to call garbage collection before the memory is
freed.  Python won't free memory while it can allocate more.  It's
unfortunate, but that is my experience with Python: when a Python
process starts swapping on my machine, I have to add manual garbage
collection calls to my code.
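
[For reference, forcing a collection is one call. Note that
``gc.collect()`` reclaims reference *cycles*; whether the freed memory
is then returned to the operating system is a separate question, and the
crux of this thread. A minimal sketch:]

```python
import gc

class Node:
    def __init__(self):
        self.ref = None

# Build a reference cycle: refcounting alone can never reclaim it, so
# these objects linger until the cyclic collector runs.
a, b = Node(), Node()
a.ref, b.ref = b, a
del a, b

# Force a full collection now instead of waiting for the automatic
# thresholds; returns the number of unreachable objects found.
collected = gc.collect()
```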





Re: [Python-Dev] Problems with the Python Memory Manager

2005-11-16 Thread Ronald Oussoren

On 17-nov-2005, at 3:15, Travis Oliphant wrote:

> Jim Jewett wrote:
>
>>
>
>> (2)  Is he allocating new _types_, which I think don't get properly
>>
>> collected.
>>
>>
>
> Bingo.  Yes, definitely allocating new _types_ (an awful lot of  
> them...)
> --- that's what the "array scalars" are: new types created in C.

Do you really mean that someArray[1] will create a new type to represent
the second element of someArray? I would guess that you create an
instance of a type defined in your extension.

Ronald
