Re: [Python-Dev] And the winner is...

2009-03-30 Thread Mike Coleman
Just for curiosity's sake, could someone outline the five (or so) most
significant pluses of hg relative to git?


(My personal feeling is that any of the three is a huge improvement
over subversion.  I think git probably should have been written in
Python with some stuff in C where necessary, and (perhaps) the hg guy
really is right when he claims that Linus should have skipped git and
used hg from the start.  That notwithstanding, though, it kind of
looks like git has won the mindshare war at this point, and I think
the best hg can hope for from this point forward is a sort of *BSD to
git's Linux.  I do hope that it lives on, shutouts being fascist, etc.

Aside: I once worked with the guy maintaining git, and he might have
the greatest sum of talent plus humility of any programmer I ever
met.)


Re: [Python-Dev] And the winner is...

2009-03-31 Thread Mike Coleman
On Mon, Mar 30, 2009 at 9:54 PM, Guido van Rossum  wrote:
> Yeah, I also think I'll just stop developing Python now and suggest
> that you all switch to Java, which has clearly won the mindshare war
> for languages. :-)

Heh.  :-)

Guess I should have said "mindshare among people whose technical
opinions I give weight to".  In that sense, Python mindshare seems to
have been and to still be increasing steadily.  (My Magic 8-ball says
"future unclear" for Java.)

The TIOBE index is entertaining, if you haven't seen it before:

http://www.tiobe.com/content/paperinfo/tpci/index.html


> But is his humility enough to cancel out Linus's attitude?

Why would I want to do that?  :-)


Seriously--thanks for all of your responses.  If it wasn't clear, I
was asking because I was curious about whether and why I should look
some more at hg.  I would never dream of trying to change anyone's
mind...

Mike


Re: [Python-Dev] And the winner is...

2009-03-31 Thread Mike Coleman
On Tue, Mar 31, 2009 at 12:31 AM, Stephen J. Turnbull
 wrote:
> I also just wrote a long post about the comparison of bzr to hg
> responding to a comment on baz...@canonical.com.  I won't recap it
> here but it might be of interest.

I found the post interesting.  Here's a link to the start of the thread:

https://lists.ubuntu.com/archives/bazaar/2009q1/055805.html

There's a bit of bafflement there regarding Python culture.  I can
relate--although I love Python, I don't feel like I understand the
culture either.

> It wouldn't be that hard to do a rewrite in Python, but the git
> programmers are mostly kernel people.  They write in C and shell.

I mentioned this once on the git list and Linus' response was
something like "C lets me see exactly what's going on".  I'm not
unsympathetic to this point of view--I'm really growing to loathe C++
partly because it *doesn't* let me see exactly what's going on--but
I'm not convinced, either.

It looks like there might be a Python clone sprouting here:

http://gitorious.org/projects/git-python/


> People who lean toward the DAG as *recording* history will prefer
> Mercurial or Bazaar. People who tend to see the DAG as a tool for
> *presenting* changes will prefer git.

I've noticed this tension as well.  It seems to me that both uses are
important, so I suspect all three will eventually steal each other's
features with respect to this over time.

Mike


[Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-19 Thread Mike Coleman
I have a program that creates a huge (45GB) defaultdict.  (The keys
are short strings, the values are short lists of pairs (string, int).)
 Nothing but possibly the strings and ints is shared.

The program takes around 10 minutes to run, but longer than 20 minutes
to exit (I gave up at that point).  That is, after executing the final
statement (a print), it is apparently spending a huge amount of time
cleaning up before exiting.  I haven't installed any exit handlers or
anything like that, all files are already closed and stdout/stderr
flushed, and there's nothing special going on.  I have done
'gc.disable()' for performance (which is hideous without it)--I have
no reason to think there are any loops.

Currently I am working around this by doing an os._exit(), which is
immediate, but this seems like a bit of a hack.  Is this something that
needs fixing, or that has already been fixed?
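
(For reference, the workaround is essentially the following -- a
sketch, not my exact code.  Note that os._exit() skips atexit handlers
and buffer flushing, so any pending output must be flushed by hand:)

import os
import sys

# ... build the huge dict, compute, print results ...

sys.stdout.flush()   # os._exit() bypasses normal interpreter shutdown,
sys.stderr.flush()   # so flush buffered output explicitly first
os._exit(0)          # exit immediately; the dict is never deallocated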

Mike


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
On Sat, Dec 20, 2008 at 4:02 AM, Kristján Valur Jónsson
 wrote:
> Can you distill the program into something reproducible?
> Maybe with something slightly less than 45Gb but still exhibiting some 
> degradation of exit performance?
> I can try to point our commercial profiling tools at it and see what it is 
> doing.

I will try next week to see if I can come up with a smaller,
submittable example.  Thanks.


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
Andrew, this is on an (intel) x86_64 box with 64GB of RAM.  I don't
recall the maker or details of the architecture off the top of my
head, but it would be something "off the rack" from Dell or maybe HP.
There were other users on the box at the time, but nothing heavy or
that gave me any reason to think was affecting my program.

It's running CentOS 5 I think, so that might make glibc several years
old.  Your malloc idea sounds plausible to me.  If it is a libc
problem, it would be nice if there was some way we could tell malloc
to "live for today because there is no tomorrow" in the terminal phase
of the program.

I'm not sure exactly how to attack this.  Callgrind is cool, but no
way will work on something this size.  Timed ltrace output might be
interesting.  Or maybe a gprof'ed Python, though that's more work.

Regarding interning, I thought this only worked with strings.  Is
there some way to intern integers?  I'm probably creating 300M
integers more or less uniformly distributed across range(1).
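
(The hand-rolled interning I have in mind would be something like
this -- 'intern_int' is my own name for it, not a builtin:)

_int_cache = {}

def intern_int(i):
    # return the first object seen for this value, so that duplicate
    # values all share a single int object
    return _int_cache.setdefault(i, i)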

Mike





On Sat, Dec 20, 2008 at 4:08 AM, Andrew MacIntyre
 wrote:
> Mike Coleman wrote:
>>
>> I have a program that creates a huge (45GB) defaultdict.  (The keys
>> are short strings, the values are short lists of pairs (string, int).)
>>  Nothing but possibly the strings and ints is shared.
>>
>> The program takes around 10 minutes to run, but longer than 20 minutes
>> to exit (I gave up at that point).  That is, after executing the final
>> statement (a print), it is apparently spending a huge amount of time
>> cleaning up before exiting.  I haven't installed any exit handlers or
>> anything like that, all files are already closed and stdout/stderr
>> flushed, and there's nothing special going on.  I have done
>> 'gc.disable()' for performance (which is hideous without it)--I have
>> no reason to think there are any loops.
>>
>> Currently I am working around this by doing an os._exit(), which is
>> immediate, but this seems like a bit of hack.  Is this something that
>> needs fixing, or that has already been fixed?
>
> You don't mention the platform, but...
>
> This behaviour was not unknown in the distant past, with much smaller
> datasets.  Most of the problems then related to the platform malloc()
> doing funny things as stuff was free()ed, like coalescing free space.
>
> [I once sat and watched a Python script run in something like 30 seconds
>  and then take nearly 10 minutes to terminate, as you describe (Python
>  2.1/Solaris 2.5/Ultrasparc E3500)... and that was only a couple of
>  hundred MB of memory - the Solaris 2.5 malloc() had some undesirable
>  properties from Python's point of view]
>
> PyMalloc effectively removed this as an issue for most cases and platform
> malloc()s have also become considerably more sophisticated since then,
> but I wonder whether the sheer size of your dataset is unmasking related
> issues.
>
> Note that in Python 2.5 PyMalloc does free() unused arenas as a surplus
> accumulates (2.3 & 2.4 never free()ed arenas).  Your platform malloc()
> might have odd behaviour with 45GB of arenas returned to it piecemeal.
> This is something that could be checked with a small C program.
> Calling os._exit() circumvents the free()ing of the arenas.
>
> Also consider that, with the exception of small integers (-1..256), no
> interning of integers is done.  If your data contains large quantities
> of integers with non-unique values (that aren't in the small integer
> range) you may find it useful to do your own interning.
>
> --
> -
> Andrew I MacIntyre "These thoughts are mine alone..."
> E-mail: andy...@bullseye.apana.org.au  (pref) | Snail: PO Box 370
>   andy...@pcug.org.au (alt) |Belconnen ACT 2616
> Web:http://www.andymac.org/   |Australia
>


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
On Sat, Dec 20, 2008 at 2:50 PM, M.-A. Lemburg  wrote:
> If you want a really fast exit, try this:
>
> import os
> os.kill(os.getpid(), 9)
>
> But you better know what you're doing if you take this approach...

This would work, but I think os._exit(os.EX_OK) is probably just as fast,
and allows you to control the exit status...


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
Tim, I left out some details that I believe probably rule out the
"swapped out" theory.  The machine in question has 64GB RAM, but only
16GB swap.  I'd prefer more swap, but in any case only around ~400MB
of the swap was actually in use during my program's entire run.
Furthermore, during my program's exit, it was using 100% CPU, and I'm
95% sure there was no significant "system" or "wait" CPU time for the
system.  (All observations via 'top'.)  So, I think that the problem
is entirely a computational one within this process.

The system does have 8 CPUs.  I'm not sure about its memory
architecture, but if it's some kind of NUMA box, I guess access to
memory could be slower than what we'd normally expect.  I'm skeptical
about that being a significant factor here, though.

Just to clarify, I didn't gc.disable() to address this problem, but
rather because it destroys performance during the creation of the huge
dict.  I don't have a specific number, but I think disabling gc
reduced construction from something like 70 minutes to 5 (or maybe
10).  Quite dramatic.

Mike


From Tim Peters:
BTW, the original poster should try this:  use whatever tools the OS
supplies to look at CPU and disk usage during the long exit.  What I
/expect/ is that almost no CPU time is being used, while the disk is
grinding itself to dust.  That's what happens when a large number of
objects have been swapped out to disk, and exit processing has to page
them all back into memory again (in order to decrement their
refcounts).


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
Re "held" and "intern_it":  Haha!  That's evil and extremely evil,
respectively.  :-)

I will add these to the Python wiki if they're not already there...

Mike


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-20 Thread Mike Coleman
On Sat, Dec 20, 2008 at 5:40 PM, Alexandre Vassalotti
 wrote:
> Could you give us more information about the dictionary. For example,
> how many objects does it contain? Is 45GB the actual size of the
> dictionary or of the Python process?

The 45G was the VM size of the process (resident size was similar).

The dict keys were all uppercase alpha strings of length 7.  I don't
have access at the moment, but maybe something like 10-100M of them
(not sure how redundant the set is).  The values are all lists of
pairs, where each pair is a (string, int).  The pair strings are of
length around 30, and drawn from a "small" fixed set of around 60K
strings.  As mentioned previously, I think the ints are drawn
pretty uniformly from something like range(1).  The length of the
lists depends on the redundancy of the key set, but I think there are
around 100-200M pairs total, for the entire dict.

(If you're curious about the application domain, see 'http://greylag.org'.)

> Have you seen any significant difference in the exit time when the
> cyclic GC is disabled or enabled?

Unfortunately, with GC enabled, the application is too slow to be
useful, because of the greatly increased time for dict creation.  I
suppose it's theoretically possible that with this increased time, the
long time for exit will look less bad by comparison, but I'd be
surprised if it makes any difference at all.  I'm confident that there
are no loops in this dict, and nothing for cyclic gc to collect.

Mike


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
Thanks for all of the useful suggestions.  Here are some preliminary results.

With gc.disable() still in effect, at the end of the program I first
did a gc.collect(), which took about five minutes.  (So, reason
enough not to gc.enable(), at least without Antoine's patch.)

After that, I did a .clear() on the huge dict.  That's where the time
is being spent.  Doing the suggested "poor man's profiling" (repeated
backtraces via gdb), for 20 or so samples, one is within libc free,
but all of the rest are in the same place (same source line) within
PyObject_Free (see below), sometimes within list_dealloc and sometimes
within tuple_dealloc.  So, apparently a lot of time is being spent in
this loop:


/* Case 3:  We have to move the arena towards the end
 * of the list, because it has more free pools than
 * the arena to its right.  ...
 */

    ...

/* Locate the new insertion point by iterating over
 * the list, using our nextarena pointer.
 */
while (ao->nextarena != NULL &&
       nf > ao->nextarena->nfreepools) {
    ao->prevarena = ao->nextarena;
    ao->nextarena = ao->nextarena->nextarena;
}

Investigating further, from one stop, I used gdb to follow the chain
of pointers in the nextarena and prevarena directions.  There were
5449 and 112765 links, respectively.  maxarenas is 131072.

Sampling nf at different breaks gives values in the range(10,20).

This loop looks like an insertion sort.  If it's the case that only a
"few" iterations are ever needed for any given free, this might be
okay--if not, it would seem that this must be quadratic.
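
Here's a toy model of the concern (plain Python, not the allocator
itself -- just the same find-the-insertion-point-by-scanning pattern):

import time

def linear_insert(lst, value):
    # walk to the insertion point, as the arena loop does
    i = 0
    while i < len(lst) and lst[i] <= value:
        i += 1
    lst.insert(i, value)

for n in (1000, 2000, 4000, 8000):
    out = []
    t0 = time.time()
    for v in xrange(n):        # ascending input: every insert scans it all
        linear_insert(out, v)
    print n, time.time() - t0  # roughly quadruples each time n doubles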

I attempted to look further by setting a silent break with counter
within the loop and another break after the loop to inspect the
counter, but gdb's been buzzing away on that for 40 minutes without
coming back.  That might mean that there are a lot of passes through
this loop per free (i.e., that gdb is taking a long time to process
100,000 silent breaks), or perhaps I've made a mistake, or gdb isn't
handling this well.

In any case, this looks like the problem locus.

It's tempting to say "don't do this arena ordering optimization if
we're doing final cleanup", but really the program could have done
this .clear() at any point.  Maybe there needs to be a flag to disable
it altogether?  Or perhaps there's a smarter way to manage the list of
arena/free pool info.

Mike



Program received signal SIGINT, Interrupt.
0x004461dc in PyObject_Free (p=0x5ec043db0) at Objects/obmalloc.c:1064
1064            while (ao->nextarena != NULL &&
(gdb) bt
#0  0x004461dc in PyObject_Free (p=0x5ec043db0) at Objects/obmalloc.c:1064
#1  0x00433478 in list_dealloc (op=0x5ec043dd0) at Objects/listobject.c:281
#2  0x0044075b in PyDict_Clear (op=0x74c7cd0) at Objects/dictobject.c:757
#3  0x004407b9 in dict_clear (mp=0x5ec043db0) at Objects/dictobject.c:1776
#4  0x00485905 in PyEval_EvalFrameEx (f=0x746ca50,
    throwflag=<value optimized out>) at Python/ceval.c:3557
#5  0x0048725f in PyEval_EvalCodeEx (co=0x72643f0,
    globals=<value optimized out>, locals=<value optimized out>, args=0x1,
    argcount=0, kws=0x72a5770, kwcount=0, defs=0x743eba8, defcount=1,
    closure=0x0) at Python/ceval.c:2836
#6  0x004855bc in PyEval_EvalFrameEx (f=0x72a55f0,
    throwflag=<value optimized out>) at Python/ceval.c:3669
#7  0x0048725f in PyEval_EvalCodeEx (co=0x72644e0,
    globals=<value optimized out>, locals=<value optimized out>, args=0x0,
    argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0)
    at Python/ceval.c:2836
#8  0x004872a2 in PyEval_EvalCode (co=0x5ec043db0, globals=0x543e41f10,
    locals=0x543b969c0) at Python/ceval.c:494
#9  0x004a844e in PyRun_FileExFlags (fp=0x7171010,
    filename=0x7af6b419 "/home/mkc/greylag/main/greylag_reannotate.py",
    start=<value optimized out>, globals=0x7194510, locals=0x7194510,
    closeit=1, flags=0x7af69080) at Python/pythonrun.c:1273
#10 0x004a86e0 in PyRun_SimpleFileExFlags (fp=0x7171010,
    filename=0x7af6b419 "/home/mkc/greylag/main/greylag_reannotate.py",
    closeit=1, flags=0x7af69080) at Python/pythonrun.c:879
#11 0x00412275 in Py_Main (argc=<value optimized out>, argv=0x7af691a8)
    at Modules/main.c:523
#12 0x0030fea1d8b4 in __libc_start_main () from /lib64/libc.so.6
#13 0x00411799 in _start ()





On Sun, Dec 21, 2008 at 12:44 PM, Adam Olsen  wrote:
> On Sat, Dec 20, 2008 at 6:09 PM, Mike Coleman  wrote:
>> On Sat, Dec 20, 2008 at 5:40 PM, Alexandre Vassalotti
>>> Have you seen any significant difference in the exit time when the
>>> cyclic GC is disabled or enabled?
>>
>> Unfortunately, 

Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
On Mon, Dec 22, 2008 at 6:20 AM, M.-A. Lemburg  wrote:
> BTW: Rather than using a huge in-memory dict, I'd suggest to either
> use an on-disk dictionary such as the ones found in mxBeeBase or
> a database.

I really want this to work in-memory.  I have 64G RAM, and I'm only
trying to use 45G of it ("only" 45G :-), and I don't need the results
to persist after the program finishes.

Python should be able to do this.  I don't want to hear "Just use Perl
instead" from my co-workers...  ;-)


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
On Mon, Dec 22, 2008 at 2:38 PM, "Martin v. Löwis"  wrote:
>> Or perhaps there's a smarter way to manage the list of
>> arena/free pool info.
>
> If that code is the real problem (in a reproducible test case),
> then this approach is the only acceptable solution. Disabling
> long-running code is not acceptable.

By "disabling", I meant disabling the optimization that's trying to
rearrange the arenas so that more memory can be returned to the OS.
This presumably wouldn't be any worse than things were in Python 2.4,
when memory was never returned to the OS.

(I'm working on a test case.)


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
On Mon, Dec 22, 2008 at 2:54 PM, Ivan Krstić
 wrote:
> It's still not clear to me, from reading the whole thread, precisely what
> you're seeing. A self-contained test case, preferably with generated random
> data, would be great, and save everyone a lot of investigation time.

I'm still working on a test case.  The first couple of attempts,
which only half-heartedly modeled the application's behavior with
respect to this dict, didn't demonstrate the bad behavior.

My impression is that no one's burning much time on this but me at the
moment, aside from offering helpful advice.  If you are, you might
want to wait.  I noticed just now that the original hardware was
throwing some chipkills, so I'm retesting on something else.


> In the
> meantime, can you 1) turn off all swap files and partitions, and 2) confirm
> positively that your CPU cycles are burning up in userland?

For (1), I don't have that much control over the machine.  Plus, based
on watching with top, I seriously doubt the process is using swap in
any way.  For (2), yes, 100% CPU usage.

> (In general, unless you know exactly why your workload needs swap, and have
> written your program to take swapping into account, having _any_ swap on a
> machine with 64GB RAM is lunacy. The machine will grind to a complete
> standstill long before filling up gigabytes of swap.)

The swap is not there to support my application per se.  Clearly if
you're swapping, generally you're crawling.  This host is used by a
reasonably large set of non- and novice programmers, who sometimes
vacuum up VM without realizing it.  If you have a nice, big swap
space, you can 'kill -STOP' these offenders, and allow them to swap
out while you have a leisurely discussion with the owner and possibly
'kill -CONT' later, as opposed to having to do a quick 'kill -KILL' to
save the machine.  That's my thinking, anyway.

Mike


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
On Mon, Dec 22, 2008 at 2:22 PM, Adam Olsen  wrote:
> To make sure that's the correct line please recompile python without
> optimizations.  GCC happily reorders and merges different parts of a
> function.
>
> Adding a counter in C and recompiling would be a lot faster than using
> a gdb hook.

Okay, I did this.  The results are the same, except that now sampling
selects the different source statements within this loop, instead of
just the top of the loop (which makes sense).

I added a counter (static volatile long) as suggested, and a
breakpoint to sample it.  Not every pass through PyObject_Free takes
case 3, but for those that do, this loop runs around 100-25000 times.
I didn't try to graph it, but based on a quick sample, it looks like
more than 5000 iterations on most occasions.

The total counter is 12.4 billion at the moment, and still growing.
That seems high, but I'm not sure what would be expected or hoped for.

I have a script that demonstrates the problem, but unfortunately the
behavior isn't clearly bad until large amounts of memory are used.  I
don't think it shows at 2G, for example.  (A 32G machine is
sufficient.)  Here is a log of running the program at different sizes
($1).  The columns are the size argument, the dict build time, and
the .clear() time, in seconds:

1 4.04686999321 0.696660041809
2 8.1575551033 1.46393489838
3 12.6426320076 2.30558800697
4 16.471298933 3.80377006531
5 20.1461620331 4.96685886383
6 25.150053978 5.48230814934
7 28.9099609852 7.41244196892
8 32.283219099 6.31711483002
9 36.6974511147 7.40236377716
10 40.3126089573 9.01174497604
20 81.7559120655 20.3317198753
30 123.67071104 31.4815018177
40 161.935647011 61.4484620094
50 210.610441923 88.6161060333
60 248.89805007 118.821491003
70 288.944771051 194.166989088
80 329.93295002 262.14109993
90 396.209988832 454.317914009
100 435.610564947 564.191882133

If you plot this, it is clearly quadratic (or worse).

Here is the script:

#!/usr/bin/env python


"""
Try to trigger quadratic (?) behavior during .clear() of a large but simple
defaultdict.

"""


from collections import defaultdict
import time
import sys

import gc; gc.disable()


print >> sys.stderr, sys.version

h = defaultdict(list)

n = 0

lasttime = time.time()


megs = int(sys.argv[1])

print megs,
sys.stdout.flush()

# 100M iterations -> ~24GB? on my 64-bit host

for i in xrange(megs * 1024 * 1024):
    s = '%0.7d' % i
    h[s].append(('', 12345))
    h[s].append(('', 12345))
    h[s].append(('', 12345))
#   if (i % 100) == 0:
#       t = time.time()
#       print >> sys.stderr, t-lasttime
#       lasttime = t

t = time.time()
print t-lasttime,
sys.stdout.flush()
lasttime = t

h.clear()

t = time.time()
print t-lasttime,
sys.stdout.flush()
lasttime = t

print


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
2008/12/22 Ivan Krstić :
> On Dec 22, 2008, at 6:28 PM, Mike Coleman wrote:
>>
>> For (2), yes, 100% CPU usage.
>
> 100% _user_ CPU usage? (I'm trying to make sure we're not chasing some
> particular degeneration of kmalloc/vmalloc and friends.)

Yes, user.  No noticeable sys or wait CPU going on.


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-22 Thread Mike Coleman
I unfortunately don't have time to work out how obmalloc works myself,
but I wonder if any of the constants in that file might need to scale
somehow with memory size.  That is, is it possible that some of them
that work okay with 1G RAM won't work well with (say) 128G or 1024G
(coming soon enough)?


Re: [Python-Dev] extremely slow exit for program having huge (45G) dict (python 2.5.2)

2008-12-23 Thread Mike Coleman
On Sat, Dec 20, 2008 at 6:22 PM, Mike Coleman  wrote:
> Re "held" and "intern_it":  Haha!  That's evil and extremely evil,
> respectively.  :-)

P.S.  I tried the "held" idea out (interning integers in a list), and
unfortunately it didn't make that much difference.  In the example I
tried, there were 104465178 instances of integers from range(33467).
I guess if ints are 12 bytes (per Beazley's book, but not sure if that
still holds), then that would correspond to a 1GB reduction.  Judging
by 'top', it might have been 2 or 3GB instead, from a total of 45G.
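
(What I tried was essentially this -- my reconstruction of the "held"
recipe, not the exact code that was posted:)

# one canonical int object per value; in Python 2, range() returns a
# real list, so indexing it always yields the same object
held = range(33467)

def intern_int(i):
    return held[i]   # duplicate values now share a single int object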

Mike


[Python-Dev] suggest change to "Failed to find the necessary bits to build these modules" message

2008-12-23 Thread Mike Coleman
I was thrown by the "Failed to find the necessary bits to build these
modules" message at the end of newer Python builds, and thought that
this indicated that the Python executable itself was not built.
That's arguably stupidity on my part, but I wonder if others might
trip on this, too.

Would it be possible to change this wording slightly, to something like

Python built, but failed to find the necessary bits to build these modules

?


Re: [Python-Dev] suggest change to "Failed to find the necessary bits to build these modules" message

2008-12-23 Thread Mike Coleman
Done: http://bugs.python.org/issue4731


On Tue, Dec 23, 2008 at 12:13 PM, Brett Cannon  wrote:
> On Tue, Dec 23, 2008 at 09:59, Mike Coleman  wrote:
>> I was thrown by the "Failed to find the necessary bits to build these
>> modules" message at the end of newer Python builds, and thought that
>> this indicated that the Python executable itself was not built.
>> That's arguably stupidity on my part, but I wonder if others will not
>> trip on this, too.
>>
>> Would it be possible to change this wording slightly, to something like
>>
>>Python built, but failed to find the necessary bits to build these modules
>>
>> ?
>
> Sounds reasonable to me. Can you file a bug report at bugs.python.org,
> Mike, so this doesn't get lost?
>
> -Brett
>


[Python-Dev] error in doc for fcntl module

2009-01-07 Thread Mike Coleman
In the doc page for the fcntl module, the example below is given.
This seems like an error, or at least very misleading, as the normal
usage is to get the flags (F_GETFL), set or unset the bits you want to
change, then set the flags (F_SETFL).  A reader might think that the
example below merely sets O_NDELAY, but it also stomps all of the
other bits to zero.

If someone can confirm my thinking, this ought to be changed.

import struct, fcntl, os

f = open(...)
rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NDELAY)
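
The example I'd suggest instead -- a sketch of the get/modify/set
pattern (the filename is just a placeholder):

import fcntl, os

f = open('spool.dat')                  # placeholder filename
flags = fcntl.fcntl(f, fcntl.F_GETFL)  # get the current flags
fcntl.fcntl(f, fcntl.F_SETFL, flags | os.O_NDELAY)  # set only our bit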


Re: [Python-Dev] error in doc for fcntl module

2009-01-08 Thread Mike Coleman
One problem is that API wrappers like this sometimes include extra
functionality.  When I ran across this example, I wondered whether the
Python interface had been enhanced to work like this

# set these three flags
rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NDELAY)
rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_APPEND)
rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NOATIME)

Something like this might be nice, but after staring at it for another
minute, I realized that the Python interface itself was standard, and
that it was the example itself that was confusing me.  (I've been
programming Unix/POSIX for over 20 years, so perhaps I simply
outsmarted myself, or am an idiot.  Still, I found it confusing.)

One of the many virtues of Python is that it's oriented towards
learning/teaching.  It seems like it would be useful in this case to
have an example that shows best practice (as in Stevens/Rago and other
similar texts), rather than one that will merely usually work on
present systems.
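
(Concretely, I mean Python versions of the set_fl/clr_fl helpers from
Stevens/Rago -- a sketch:)

import fcntl

def set_fl(fd, bits):
    # read-modify-write, so unrelated flags are preserved
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | bits)

def clr_fl(fd, bits):
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags & ~bits)

E.g., set_fl(f, os.O_APPEND | os.O_NONBLOCK) sets both bits without
touching anything else.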

If it makes any difference, I'd be happy to send a patch.  Is there
any reason not to change this?

Mike



On Wed, Jan 7, 2009 at 6:36 PM, Guido van Rossum  wrote:
> Well my Linux man page says that the only flags supported are
> O_APPEND,  O_ASYNC,  O_DIRECT, O_NOATIME, and O_NONBLOCK; and all of
> those are typically off -- so I'm not sure that it's a mistake or need
> correcting. These APIs should only be used by people who know what
> they're doing anyways; the examples are meant to briefly show the call
> format.
>
> On Wed, Jan 7, 2009 at 1:31 PM, Mike Coleman  wrote:
>> In the doc page for the fcntl module, the example below is given.
>> This seems like an error, or at least very misleading, as the normal
>> usage is to get the flags (F_GETFL), set or unset the bits you want to
>> change, then set the flags (F_SETFL).  A reader might think that the
>> example below merely sets O_NDELAY, but it also stomps all of the
>> other bits to zero.
>>
>> If someone can confirm my thinking, this ought to be changed.
>>
>> import struct, fcntl, os
>>
>> f = open(...)
>> rv = fcntl.fcntl(f, fcntl.F_SETFL, os.O_NDELAY)
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>