Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-09-04 Thread Amaury Forgeot d'Arc
Hello, Andrey Zhmoginov wrote: > I don't know if the following question is relevant, but it seems that many > people here are familiar with Python's cyclic garbage collector. > I see Python [v2.5.2 (r252:60911, Jul 31 2008, 17:28:52)] crashing with > a segmentation fault when I extend Python with a very si

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-09-03 Thread Andrey Zhmoginov
Would anyone mind if I did add a public C API for gc.disable() and gc.enable()? I would like to use it as an optimization for the pickle module (I found out that I get a good 2x speedup just by disabling the GC while loading large pickles). Of course, I could simply import the gc module and call

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-26 Thread Greg Ewing
Jeff Hall wrote: I mistakenly thought that was because they were assumed to be small. It sounds like they're ignored because they're automatically collected and so they SHOULD be ignored for object garbage collection. Strings aren't tracked by the cyclic garbage collector because they don't c
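
The point that strings are not tracked by the cyclic collector can be checked directly. The sketch below uses gc.is_tracked(), which was added to the gc module after this thread (Python 2.7 / 3.1), so its availability is an assumption about the interpreter in use; the variable names are illustrative only.

```python
import gc

text = "a string cannot refer to other objects"
number = 12345
container = []          # a list can hold references, including to itself

# Atomic objects such as str and int are not tracked by the cyclic GC;
# they are released by reference counting alone.
string_tracked = gc.is_tracked(text)
int_tracked = gc.is_tracked(number)

# Lists are containers, so CPython tracks them.
list_tracked = gc.is_tracked(container)

print(string_tracked, int_tracked, list_tracked)
```

Untracked objects never add to the allocation counts that trigger a collection, which is why string-heavy workloads do not by themselves cause GC passes.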

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Alexandre Vassalotti
On Thu, Jun 26, 2008 at 12:01 AM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: >> Would it be possible, if not a good idea, to only track object >> deallocations as the GC traversal trigger? As far as I know, dangling >> cyclic references cannot be formed when allocating objects. > > Not sure wha

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Martin v. Löwis
> Would it be possible, if not a good idea, to only track object > deallocations as the GC traversal trigger? As far as I know, dangling > cyclic references cannot be formed when allocating objects. Not sure what you mean by that. x = []; x.append(x); del x creates a cycle with no deallocation o
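
The three-statement example quoted in this message can be run as a small script. This is a minimal sketch: the bookkeeping around it (pausing the collector, the name `unreachable`) is added here for illustration and is not part of the original message.

```python
import gc

was_enabled = gc.isenabled()
gc.disable()             # keep automatic collections out of the picture
gc.collect()             # clear any garbage that already exists

x = []
x.append(x)              # the list now holds a reference to itself
del x                    # the name is gone, but the refcount is still 1

# Reference counting never frees the list; only cycle detection can.
unreachable = gc.collect()

if was_enabled:
    gc.enable()
print("unreachable objects found:", unreachable)
```

No object is deallocated between creating the list and dropping the name, so a trigger that counted only deallocations would never notice this cycle.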

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Alexandre Vassalotti
On Wed, Jun 25, 2008 at 4:55 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > I think exactly the other way 'round. The timing of thing should not > matter at all, only the exact sequence of allocations and deallocations. Would it be possible, if not a good idea, to only track object deallocat

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Martin v. Löwis
> I took the statement, "Current GC only takes into account container > objects, which, most significantly, ignores string objects (of which > most applications create plenty)" to mean that strings were ignored for > deciding when to do garbage collection. I mistakenly thought that was > because th

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Jeff Hall
On Wed, Jun 25, 2008 at 4:55 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > It seems to me that the root problem is allocation spikes of legitimate, > > useful data. Perhaps then we need some sort of "test" to determine if > > those are legitimate. Perhaps checking every nth (with n decreasi

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Martin v. Löwis
> It seems to me that the root problem is allocation spikes of legitimate, > useful data. Perhaps then we need some sort of "test" to determine if > those are legitimate. Perhaps checking every nth (with n decreasing as > allocation bytes increases) object allocated during a "spike" could be > usef

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-25 Thread Jeff Hall
It seems to me that the root problem is allocation spikes of legitimate, useful data. Perhaps then we need some sort of "test" to determine if those are legitimate. Perhaps checking every nth (with n decreasing as allocation bytes increases) object allocated during a "spike" could be useful. Then d

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-24 Thread Antoine Pitrou
Martin v. Löwis writes: > > I'd like to see in an experiment whether this is really true. Right, all those ideas should be implemented and tried out. I don't really have time to spend on it right now. Also, what's missing is a suite of performance/efficiency tests for the garbage

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-23 Thread Martin v. Löwis
> It would not help the quadratic behaviour - and is orthogonal to your > proposal - > , but at least avoid calling the GC too often when lots of small objects are > allocated (as opposed to lots of large objects). I'd like to see in an experiment whether this is really true. Current GC only tak

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-23 Thread Antoine Pitrou
Martin v. Löwis writes: > Currently, only youngest collections are triggered by allocation > rate; middle and old are triggered by frequency of youngest collection. > So would you now specify that the youngest collection should occur > if-and-only-if a new arena is allocated? Or disco

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-22 Thread Martin v. Löwis
> pymalloc needing to allocate a new arena would be a different way to > track an excess of allocations over deallocations, and in some ways > more sensible (since it would reflect an excess of /bytes/ allocated > over bytes freed, rather than an excess in the counts of objects > allocated-over-fre

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-22 Thread Tim Peters
[Antoine Pitrou] >> Would it be helpful if the GC was informed of memory growth by the >> Python memory allocator (that is, each time it either asks or gives back >> a block of memory to the system allocator) ? [Martin v. Löwis] > I don't see how. The garbage collector is already informed about me

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Stephen J. Turnbull
"Martin v. Löwis" writes: > > XEmacs implements this strategy in a way which is claimed to give > > constant amortized time (ie, averaged over memory allocated). > > See my recent proposal. I did, crossed in the mail. To the extent that I understand both systems, your proposal looks like an

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Nick Coghlan
Martin v. Löwis wrote: Antoine Pitrou wrote: On Saturday 21 June 2008 at 17:49 +0200, "Martin v. Löwis" wrote: I don't think any strategies based on timing will be successful. Instead, one should count and analyze objects (although I'm unsure how exactly that could work). Would it be helpful i

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> XEmacs implements this strategy in a way which is claimed to give > constant amortized time (ie, averaged over memory allocated). See my recent proposal. The old trick is to do reorganizations in a fixed fraction of the total size, resulting in a per-increase amortized-constant overhead (assumin
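
The "fixed fraction of the total size" trick can be illustrated with a toy model. This is only a sketch under stated assumptions, not CPython's collector: every collection is assumed to visit all live objects, nothing is ever garbage, and the names `traversal_work`, `fixed_step`, and `fraction`, as well as the values 700 and 0.25, are hypothetical.

```python
def traversal_work(n_objects, fixed_step=None, fraction=None):
    """Toy model: count object visits when every collection walks all live objects.

    Exactly one of fixed_step / fraction should be given. Nothing is ever
    garbage here, which models a burst of legitimate allocations.
    """
    live = 0          # objects currently alive
    since_last = 0    # allocations since the previous collection
    visits = 0        # total traversal work
    for _ in range(n_objects):
        live += 1
        since_last += 1
        if fixed_step is not None:
            limit = fixed_step
        else:
            limit = max(1, int(live * fraction))
        if since_last >= limit:
            visits += live      # a full collection touches every live object
            since_last = 0
    return visits


fixed_small = traversal_work(20_000, fixed_step=700)
fixed_large = traversal_work(40_000, fixed_step=700)
frac_small = traversal_work(20_000, fraction=0.25)
frac_large = traversal_work(40_000, fraction=0.25)

print("fixed step:     ", fixed_small, "->", fixed_large)
print("fixed fraction: ", frac_small, "->", frac_large)
```

In this model, doubling the number of objects roughly quadruples the work with a fixed step but only about doubles it when the trigger is a fraction of the live total, which is the amortized-constant cost per allocation described above.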

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Stephen J. Turnbull
"Martin v. Löwis" writes: > Given the choice of "run slower" and "run out of memory", Python should > always prefer the former. > > One approach could be to measure how successful a GC run was: if GC > finds that more-and-more objects get allocated and very few (or none) > are garbage, it m

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
Antoine Pitrou wrote: > On Saturday 21 June 2008 at 17:49 +0200, "Martin v. Löwis" wrote: >> I don't think any strategies based on timing will be successful. >> Instead, one should count and analyze objects (although I'm unsure >> how exactly that could work). > > Would it be helpful if the GC wa

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Antoine Pitrou
On Saturday 21 June 2008 at 17:49 +0200, "Martin v. Löwis" wrote: > I don't think any strategies based on timing will be successful. > Instead, one should count and analyze objects (although I'm unsure > how exactly that could work). Would it be helpful if the GC was informed of memory growth by

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Terry Reedy
Kevin Jacobs <[EMAIL PROTECTED]> wrote: I can say with complete certainty that of the 20+ programmers I've had working for me, many who have used Python for 3+ years, not a single one would think to question the garbage collector if they observed the kind of quadratic time complexity I've de

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Bill Janssen
> > What follows from that? To me, the natural conclusion is "people who > > witness performance problems just need to despair, or accept them, as > > they can't do anything about it", however, I don't think this is the > > conclusion that you had in mind. > > > > I can say with complete certainty

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> I'm not sure I agree with this. GC IIRC was introduced primarily to > alleviate *long-term* memory starvation. I don't think that's historically the case. GC would not need to be generational if releasing short-lived objects shortly after they become garbage was irrelevant. Of course, it was al

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Aahz
On Sat, Jun 21, 2008, "Martin v. Löwis" wrote: > > In general, any solution of the "do GC less often" needs to deal with > cases where lots of garbage gets produced in a short amount of time > (e.g. in a tight loop), and which run out of memory when GC is done less > often. > > Given the choice

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> Idea 1: Allow GC to run automatically no more often than n CPU seconds, > n being perhaps 5 or 10. I think it's very easy to exhaust the memory with such a policy, even though much memory would still be available. Worse, in a program producing a lot of garbage, performance will go significantly

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Kevin Jacobs
On Sat, Jun 21, 2008 at 11:20 AM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > In general, any solution of the "do GC less often" needs to deal with > cases where lots of garbage gets produced in a short amount of time > (e.g. in a tight loop), and which run out of memory when GC is done less >

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> Well, they could hang themselves or switch to another language (which > some people might view as equivalent :-)), but perhaps optimistically > the various propositions that were sketched out in this thread (by Adam > Olsen and Greg Ewing) could bring an improvement. I don't know how > realistic

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> I can say with complete certainty that of the 20+ programmers I've had > working for me, many who have used Python for 3+ years, not a single one > would think to question the garbage collector if they observed the kind > of quadratic time complexity I've demonstrated. This is not because > they

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Kevin Jacobs
On Sat, Jun 21, 2008 at 4:33 AM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote: > > I don't think expecting people to tweak gc parameters when they witness > > performance problems is reasonable. > > What follows from that? To me, the natural conclusion is "people who > witness performance problems

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Antoine Pitrou
On Saturday 21 June 2008 at 10:33 +0200, "Martin v. Löwis" wrote: > > I don't think expecting people to tweak gc parameters when they witness > > performance problems is reasonable. > > What follows from that? To me, the natural conclusion is "people who > witness performance problems just need to

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-21 Thread Martin v. Löwis
> I don't think expecting people to tweak gc parameters when they witness > performance problems is reasonable. What follows from that? To me, the natural conclusion is "people who witness performance problems just need to despair, or accept them, as they can't do anything about it", however, I do

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Antoine Pitrou
On Friday 20 June 2008 at 17:44 +0200, Amaury Forgeot d'Arc wrote: > In short: the gc is tuned for typical usage. If your usage of python > is specific, > use gc.set_threshold and increase its values. It's fine for people "in the know" who take the time to test their code using various gc para
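
The gc.set_threshold() advice quoted in this message can be sketched as follows. The value 100_000 is purely illustrative and not a recommendation from the thread; the default thresholds also differ between Python versions, so the sketch reads them rather than assuming them.

```python
import gc

original = gc.get_threshold()      # per-generation thresholds currently in effect
print("current thresholds:", original)

# Hypothetical tuning: allow far more allocations before a youngest-generation
# collection is triggered. The value 100_000 is illustrative, not a recommendation.
gc.set_threshold(100_000, *original[1:])
raised = gc.get_threshold()

# ... run the allocation-heavy workload here ...

gc.set_threshold(*original)        # restore the previous settings
restored = gc.get_threshold()
```

Restoring the saved tuple afterwards keeps the change local to the workload that needed it.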

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Adam Olsen
On Fri, Jun 20, 2008 at 9:44 AM, Amaury Forgeot d'Arc <[EMAIL PROTECTED]> wrote: > 2008/6/20 Kevin Jacobs <[EMAIL PROTECTED]>: >> On Fri, Jun 20, 2008 at 10:25 AM, Antoine Pitrou <[EMAIL PROTECTED]> >> wrote: >>> >>> Kevin Jacobs >>> writes: >>> > >

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Amaury Forgeot d'Arc
2008/6/20 Kevin Jacobs <[EMAIL PROTECTED]>: > On Fri, Jun 20, 2008 at 10:25 AM, Antoine Pitrou <[EMAIL PROTECTED]> > wrote: >> >> Kevin Jacobs >> writes: >> > >> > +1 on a C API for enabling and disabling GC. I have several instances >> > where >>

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Kevin Jacobs
On Fri, Jun 20, 2008 at 10:25 AM, Antoine Pitrou <[EMAIL PROTECTED]> wrote: > > Kevin Jacobs > writes: > > > > +1 on a C API for enabling and disabling GC. I have several instances > where > I create a large number of non-cyclic objects where I see huge GC >

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Antoine Pitrou
Hi, Kevin Jacobs writes: > > +1 on a C API for enabling and disabling GC. I have several instances where I create a large number of non-cyclic objects where I see huge GC overhead (30+ seconds with gc enabled, 0.15 seconds when disabled). Could you try t

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-20 Thread Kevin Jacobs
+1 on a C API for enabling and disabling GC. I have several instances where I create a large number of non-cyclic objects where I see huge GC overhead (30+ seconds with gc enabled, 0.15 seconds when disabled). +1000 to fixing the garbage collector to be smart enough to self-regulate itsel
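
A measurement of the kind described in this message might look like the sketch below. The workload, the helper names `build_records` and `timed_build`, and the size `n` are all hypothetical stand-ins; the original test case is not shown in the archive. Collection heuristics have changed in later CPython releases, so the gap on a current interpreter may be far smaller than the figures quoted above.

```python
import gc
import time


def build_records(n):
    """Allocate n small tuples of ints and strings, so no cycles exist."""
    return [(i, str(i)) for i in range(n)]


def timed_build(n, collect_enabled):
    """Return (seconds, record_count) for one build with the GC on or off."""
    was_enabled = gc.isenabled()
    if collect_enabled:
        gc.enable()
    else:
        gc.disable()
    try:
        start = time.perf_counter()
        records = build_records(n)
        elapsed = time.perf_counter() - start
    finally:
        # Put the collector back the way we found it.
        if was_enabled:
            gc.enable()
        else:
            gc.disable()
    return elapsed, len(records)


n = 200_000  # hypothetical size; increase it to make any difference visible
with_gc, count_on = timed_build(n, collect_enabled=True)
without_gc, count_off = timed_build(n, collect_enabled=False)
print(f"gc enabled:  {with_gc:.3f}s")
print(f"gc disabled: {without_gc:.3f}s")
```

Timings depend heavily on the interpreter version and machine, so no expected output is given.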

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-19 Thread Greg Ewing
Alexandre Vassalotti wrote: Do you have any idea how this behavior could be fixed? I am not a GC expert, but I could try to fix this. Perhaps after making a GC pass you could look at the number of objects reclaimed during that pass, and if it's less than some fraction of the objects in existen

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-19 Thread Adam Olsen
On Thu, Jun 19, 2008 at 3:23 PM, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote: > On Sun, Jun 1, 2008 at 12:28 AM, Adam Olsen <[EMAIL PROTECTED]> wrote: >> On Sat, May 31, 2008 at 10:11 PM, Alexandre Vassalotti >> <[EMAIL PROTECTED]> wrote: >>> Would anyone mind if I did add a public C API for gc.

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-06-19 Thread Alexandre Vassalotti
On Sun, Jun 1, 2008 at 12:28 AM, Adam Olsen <[EMAIL PROTECTED]> wrote: > On Sat, May 31, 2008 at 10:11 PM, Alexandre Vassalotti > <[EMAIL PROTECTED]> wrote: >> Would anyone mind if I did add a public C API for gc.disable() and >> gc.enable()? I would like to use it as an optimization for the pickle

Re: [Python-Dev] C API for gc.enable() and gc.disable()

2008-05-31 Thread Adam Olsen
On Sat, May 31, 2008 at 10:11 PM, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote: > Would anyone mind if I did add a public C API for gc.disable() and > gc.enable()? I would like to use it as an optimization for the pickle > module (I found out that I get a good 2x speedup just by disabling the > G
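
At the Python level, the optimization described in the original post (pausing the collector while a large pickle is loaded) can be sketched with the existing gc module. The context manager name `gc_paused` and the sample payload are hypothetical; this is not the C API under discussion, only an illustration of the same idea.

```python
import gc
import pickle
from contextlib import contextmanager


@contextmanager
def gc_paused():
    """Disable the cyclic GC for the duration of a block, then restore its state."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()


# Illustrative payload: many small containers, as a large pickle would hold.
payload = pickle.dumps([{"id": i, "tags": [i, i + 1]} for i in range(10_000)])

with gc_paused():
    loaded = pickle.loads(payload)
```

Saving and restoring the previous state, rather than unconditionally calling gc.enable(), avoids re-enabling the collector for callers who had turned it off themselves.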