Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, 18 May 2010 08:45:41 +0200 Lennart Regebro wrote: > >> > >> Are you referring to the "New GIL"? > > > > Yes. > > At has been shown, it also in certain cases will race with the OS > scheduler, so this is not already fixed, although apparently improved, > if I understand correctly. "Race" is a strange term here and I'm not sure what you mean. The issue found out by Dave Beazley can't be reasonably described by this word, I think. Please read and understand the issue report mentioned by Nir before trying to make statements based on rumours heard here and there. Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
Nir Aides wrote: > I would like to restart this thread with 2 notes from the lively discussion: > > a) Issue 7946 (and this discussion?) concerns Python 3.2 > b) The GIL problems are not specific to OSX. The old and new GIL > misbehave on GNU/Linux and Windows too. I think Antoine and Bill went off an a bit of a tangent that *is* specific to Mac OS X and the old GIL (where a Python application not only fails to take advantage of additional cores, but actually runs slower than it does on a less powerful single core machine). The convoying problem identified in issue 7946 does indeed apply to the new GIL on multiple platforms. Without reviewing either proposed patch in detail, I personally am slightly inclined to favour David's suggested solution, both because I better understand the explanation of how it works (and simplicity is a virtue from a maintainability point of view), and because the BFS approach appears to run into trouble when it comes to identifying a suitable cross platform time reference. Cheers, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia --- ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
Bill Janssen, 17.05.2010 23:09: Most folks don't use "threads" Seems like a somewhat reasonable assumption to me. they use a higher-level abstraction like the nltk library. I have my doubts that this applies to "most folks" - likely not even to most of those who use threads. Stefan ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, May 18, 2010 at 12:53, Antoine Pitrou wrote: > "Race" is a strange term here and I'm not sure what you mean. > The issue found out by Dave Beazley can't be reasonably described by > this word, I think. OK, maybe "race" is the wrong word. But that doesn't mean the issue doesn't exist. > Please read and understand the issue report mentioned by Nir before > trying to make statements based on rumours heard here and there. Oh, so Dave Beazleys reports is a rumour now. -- Lennart Regebro: Python, Zope, Plone, Grok http://regebro.wordpress.com/ +33 661 58 14 64 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
Le mardi 18 mai 2010 à 14:16 +0200, Lennart Regebro a écrit : > > Please read and understand the issue report mentioned by Nir before > > trying to make statements based on rumours heard here and there. > > Oh, so Dave Beazleys reports is a rumour now. Your and other people's grandiloquent interpretation of them is. Now let me ask you a question: did you witness some of the effects mentioned here? Did it disturb the proper functioning one of your applications, programs or services? If yes, please be so kind as to explain how; it will be an useful datapoint. Bonus points if the issue affects Python 3.2, since that's really what Nir is talking about. If not, then do you have any valuable information to contribute to this discussion? ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, May 18, 2010 at 14:52, Antoine Pitrou wrote: > > Le mardi 18 mai 2010 à 14:16 +0200, Lennart Regebro a écrit : >> > Please read and understand the issue report mentioned by Nir before >> > trying to make statements based on rumours heard here and there. >> >> Oh, so Dave Beazleys reports is a rumour now. > > Your and other people's grandiloquent interpretation of them is. > > Now let me ask you a question: did you witness some of the effects > mentioned here? Did it disturb the proper functioning one of your > applications, programs or services? > If yes, please be so kind as to explain how; it will be an useful > datapoint. Bonus points if the issue affects Python 3.2, since that's > really what Nir is talking about. > > If not, then do you have any valuable information to contribute to this > discussion? I doubt anything I say can be less constructive than your rude comments. -- Lennart Regebro: Python, Zope, Plone, Grok http://regebro.wordpress.com/ +33 661 58 14 64 ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
Lennart Regebro wrote: > On Tue, May 18, 2010 at 14:52, Antoine Pitrou wrote: >> Le mardi 18 mai 2010 à 14:16 +0200, Lennart Regebro a écrit : Please read and understand the issue report mentioned by Nir before trying to make statements based on rumours heard here and there. >>> Oh, so Dave Beazleys reports is a rumour now. >> Your and other people's grandiloquent interpretation of them is. >> >> Now let me ask you a question: did you witness some of the effects >> mentioned here? Did it disturb the proper functioning one of your >> applications, programs or services? >> If yes, please be so kind as to explain how; it will be an useful >> datapoint. Bonus points if the issue affects Python 3.2, since that's >> really what Nir is talking about. >> >> If not, then do you have any valuable information to contribute to this >> discussion? > > I doubt anything I say can be less constructive than your rude comments. I can understand why Antoine is being offended: it's his implementation that you attacked. You literally said "At has been shown, it also in certain cases will race with the OS scheduler, so this is not already fixed", claiming that it is not fixed I believe Antoine does consider it fixed, on the grounds that all counter-examples provided so far are made-up toy examples, rather than actual applications that still suffer from the original problems. So please join us in considering the issue fixed unless you can provide a really world example that demonstrates the contrary. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, May 18, 2010 at 3:43 PM, "Martin v. Löwis" wrote: > So please join us in considering the issue fixed unless you can provide > a really world example that demonstrates the contrary. The server software I maintain (openrpg) experiences this issue with when I tried porting the server code to 3.2. Granted it was only a 5 to 7% speed drop over single core, though with the old GIL (py2.5) there was a 25% to 30% speed drop (It can be running upto 300 IO bound threads & 30 CPU bound threads) so a net improvement of 20% so I am more than happy with the new GIL. I think the new GIL should be given a year or so in the wild before you start trying to optimize theoretical issues you may run into. If in a year people come back and have some examples of where a proper scheduler would help improve speed on multi-core systems even more, then we can address the issue at that time. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Sun, May 16, 2010 at 1:07 PM, Nir Aides wrote: > Relevant Python issue: http://bugs.python.org/issue7946 Is there any chance Antoine's "gilinter" patch from that issue might be applied to python 2.7? I have been experiencing rare long delays in simple io operations in multithreaded python applications, and I suspect that they might be related to this issue. -Mike ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, 18 May 2010 14:39:43 -0700 Mike Klaas wrote: > On Sun, May 16, 2010 at 1:07 PM, Nir Aides wrote: > > > Relevant Python issue: http://bugs.python.org/issue7946 > > Is there any chance Antoine's "gilinter" patch from that issue might > be applied to python 2.7? I have been experiencing rare long delays > in simple io operations in multithreaded python applications, and I > suspect that they might be related to this issue. There's no chance for this since the patch relies on the new GIL. (that's unless there's a rush to backport the new GIL in 2.7, of course) I think your "rare long delays" might be related to the old GIL's own problems, though. How long are they? Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Python versions for Ubuntu 10.10 (Maverick Meerkat)
I just wanted to let the python-dev community know about some tracks we had at the recently concluded Ubuntu Developer Summit in Brussels. Among the several Python-related discussions, we talked about what versions of Python will be supported and default in the next version of Ubuntu (10.10, code name Maverick Meerkat, to be released in October). If you're interested in following and participating in this discussion, I've started a wiki page as the central place to collect information: https://wiki.ubuntu.com/MaverickMeerkat/TechnicalOverview/Python While we don't have consensus yet, and we have a lot of footwork to do before we can make a decision, my goal is to include Python 2.6, 2.7, and 3.2 (beta) in Ubuntu 10.10. I'd like to make Python 2.7 the default if possible, and we will need to get pre-approval to do stable release upgrades to Python 3.2 to track the move to its final release, which will happen after Ubuntu 10.10. is released. It seems that most discussions are happening on the debian-python mailing list right now. they-don't-call-it-maverick-for-nuthin'-ly y'rs, -Barry signature.asc Description: PGP signature ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Incorrect length of collections.Counter objects / Multiplicity function
Hello, everyone. I've checked the new collections.Counter class and I think I've found a bug: > >>> from collections import Counter > >>> c1 = Counter([1, 2, 1, 3, 2]) > >>> c2 = Counter([1, 1, 2, 2, 3]) > >>> c3 = Counter([1, 1, 2, 3]) > >>> c1 == c2 and c3 not in (c1, c2) > True > >>> # Perfect, so far. But... There's always a "but": > ... > >>> len(c1) > 3 The length of a Counter is the amount of unique elements. But the length must be the cardinality, and the cardinality of a multiset is the total number of elements (including duplicates) [1] [2]. The source code mentions that the recipe on ActiveState [3] was one of the references, but that recipe has this right. Also, why is it indexed? The indexes of a multiset call to mind the position of its elements, but there's no such thing in sets. I think this is inconsistent with the built-in set. I would have implemented the multiplicity function as a method instead of the indexes: c1.get_multiplicity(element) # instead of c1[element] Is this the intended behavior? If so, I'd like to propose a proper multiset implementation for the standard library (preferably called "Multiset"; should I create a PEP?). If not, I can write a patch to fix it, although I'm afraid it'd be a backwards incompatible change. Cheers, [1] http://en.wikipedia.org/wiki/Multiset#Overview [2] http://preview.tinyurl.com/smalltalk-bag [3] http://code.activestate.com/recipes/259174/ -- Gustavo Narea . | Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about | ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Unordered tuples/lists
Hello, everybody. I've been searching for a data structure like a tuple/list *but* unordered -- like a set, but duplicated elements shouldn't be removed. I have not even found a recipe, so I'd like to write an implementation and contribute it to the "collections" module in the standard library. This is the situation I have: I have a data structure that represents an arithmetic/boolean operation. Operations can be commutative, which means that the order of their operands don't change the result of the operation. This is, the following operations are equivalent: operation(a, b, c) == operation(c, b, a) == operation(b, a, c) operation(a, b, a) == operation(a, a, b) == operation(b, a, a) operation(a, a) == operation(a, a) So, I need a type to store the arguments/operands so that if two of these collections have the same elements with the same multiplicity, they are equivalent, regardless of the order. A multiset is not exactly what I need: I still need to use the elements in the order they were given. For example, the logical conjuction (aka "and" operator) has a left and right operands; I need to evaluate the first/left one and, if it returned True, then call the second/right one. They must not be evaluated in a random order. To sum up, it would behave like a tuple or a list, except when it's compared with another object: They would be equivalent if they're both unordered tuples/lists, and have the same elements. There can be mutable and immutable editions (UnorderedList and UnorderedTuple, respectively). I will write a PEP to elaborate on this if you think it'd be nice to have. Or, should I have written the PEP first? Cheers, -- Gustavo Narea . | Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about | ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] Summing up
On Tue, 18 May 2010 21:43:30 +0200 "Martin v. Löwis" wrote: > > I can understand why Antoine is being offended: it's his implementation > that you attacked. You literally said "At has been shown, it also in > certain cases will race with the OS scheduler, so this is not already > fixed", claiming that it is not fixed > > I believe Antoine does consider it fixed, on the grounds that all > counter-examples provided so far are made-up toy examples, rather than > actual applications that still suffer from the original problems. Ok, this is a good opportunity to try to sum up, from my point of view. The main problem of the old GIL, which was evidenced in Dave's original study (not this year's, but the previous one) *is* fixed unless someone demonstrates otherwise. It should be noted that witnessing a slight performance degradation on a multi-core machine is not enough to demonstrate such a thing. The degradation could be caused by other factors, such as thread migration, bad OS behaviour, or even locking peculiarities in your own application, which are not related to the GIL. A good test is whether performance improves if you play with sys.setswitchinterval(). Dave's newer study regards another issue, which I must stress is also present in the old GIL algorithm, and therefore must have affected, if it is serious, real-world applications in 2.x. And indeed, the test I recently added to ccbench evidences the huge drop in socket I/Os per second when there's a background CPU thread; this test exercises the same situation as Dave's demos, only with a less trivial CPU workload: == CPython 2.7b2+.0 (trunk:81274M) == == x86_64 Linux on 'x86_64' == --- I/O bandwidth --- Background CPU task: Pi calculation (Python) CPU threads=0: 23034.5 packets/s. CPU threads=1: 6.4 ( 0 %) CPU threads=2: 15.7 ( 0 %) CPU threads=3: 13.9 ( 0 %) CPU threads=4: 20.8 ( 0 %) (note: I've just changed my desktop machine, so these figures are different from what I've posted weeks or months ago) Regardless of the fact that apparently noone reported it in real-world conditions, we *could* decide that the issue needs fixing. If we decide so, Nir's approach is the most rigorous one: it tries to fix the problem thoroughly, rather than graft an additional heuristic. Nir also has tested his patch on a variety of machines, more so than Dave and I did with our own patches; he is obviously willing to go forward. Right now, there are two problems with Nir's proposal: - first, what Nick said: the difficulty of having reliable high-precision cross-platform time sources, which are necessary for the BFS algorithm. Ironically, timestamp counters have their own problems on multi-core machines (they can go out of sync between CPUs). gettimeofday() and clock_gettime() may be precise enough on most Unices, though. - second, the BFS algorithm is not that well-studied, since AFAIK it was refused for inclusion in the Linux kernel; someone in the python-dev community would therefore have to make sense of, and evaluate, its heuristic. I also don't consider my own patch a very satisfactory "solution", although it has the reassuring quality of being simple and short (and easy to revert!). That said, most of us are programmers and we love to invent ways of fixing technical issues. It sometimes leads us to consider some things issues even when they are mostly theoretical. This is why I am lukewarm on this. I think interested people should focus on real-world testing (rather than Dave and I's synthetic tests) of the new GIL, with or without the various patches, and share the results. Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports is also a very good one. Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unordered tuples/lists
This is typically called a "bag". Maybe searching for that will help you find a recipe? On Tue, May 18, 2010 at 3:13 PM, Gustavo Narea wrote: > Hello, everybody. > > I've been searching for a data structure like a tuple/list *but* unordered -- > like a set, but duplicated elements shouldn't be removed. I have not even > found a recipe, so I'd like to write an implementation and contribute it to > the "collections" module in the standard library. > > This is the situation I have: I have a data structure that represents an > arithmetic/boolean operation. Operations can be commutative, which means that > the order of their operands don't change the result of the operation. This is, > the following operations are equivalent: > operation(a, b, c) == operation(c, b, a) == operation(b, a, c) > operation(a, b, a) == operation(a, a, b) == operation(b, a, a) > operation(a, a) == operation(a, a) > > So, I need a type to store the arguments/operands so that if two of these > collections have the same elements with the same multiplicity, they are > equivalent, regardless of the order. > > A multiset is not exactly what I need: I still need to use the elements in the > order they were given. For example, the logical conjuction (aka "and" > operator) has a left and right operands; I need to evaluate the first/left one > and, if it returned True, then call the second/right one. They must not be > evaluated in a random order. > > To sum up, it would behave like a tuple or a list, except when it's compared > with another object: They would be equivalent if they're both unordered > tuples/lists, and have the same elements. There can be mutable and immutable > editions (UnorderedList and UnorderedTuple, respectively). > > I will write a PEP to elaborate on this if you think it'd be nice to have. Or, > should I have written the PEP first? > > Cheers, > -- > Gustavo Narea . > | Tech blog: =Gustavo/(+blog)/tech ~ About me: =Gustavo/about | > ___ > Python-Dev mailing list > Python-Dev@python.org > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unordered tuples/lists
2010/5/18 Guido van Rossum : > This is typically called a "bag". Maybe searching for that will help > you find a recipe? Yes, and we have one in Python 2.7+ called collections.Counter. -- Regards, Benjamin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unordered tuples/lists
On Wed, 19 May 2010 08:13:42 am Gustavo Narea wrote: > Hello, everybody. > > I've been searching for a data structure like a tuple/list *but* > unordered -- like a set, but duplicated elements shouldn't be > removed. I have not even found a recipe, so I'd like to write an > implementation and contribute it to the "collections" module in the > standard library. [...] > So, I need a type to store the arguments/operands so that if two of > these collections have the same elements with the same multiplicity, > they are equivalent, regardless of the order. If I've understood your requirements, this should probably do it: # Untested. class MyCounter(collections.Counter): def __eq__(self, other): try: other = other.items() except AttributeError: return False return sorted(self.items()) == sorted(other) def __ne__(self, other): return not self == other > A multiset is not exactly what I need: I still need to use the > elements in the order they were given. For example, the logical > conjuction (aka "and" operator) has a left and right operands; I need > to evaluate the first/left one and, if it returned True, then call > the second/right one. They must not be evaluated in a random order. These requirements seem specialised enough to me that I expect it would be wasteful to add your class to the standard library. It's not quite an OrderedMultiSet (perhaps built on an OrderedDict), but a MultiSet that sometimes behaves as ordered and sometimes doesn't. > To sum up, it would behave like a tuple or a list, except when it's > compared with another object: They would be equivalent if they're > both unordered tuples/lists, and have the same elements. There can be > mutable and immutable editions (UnorderedList and UnorderedTuple, > respectively). > > I will write a PEP to elaborate on this if you think it'd be nice to > have. Or, should I have written the PEP first? Neither. I think you will need to demonstrate the need for these first. The Python standard library doesn't generally add data types on the basis of theoretical nice-to-have, or even for a single person's use-case (unless that person is Guido *wink*). Your first step should be to publish the classes somewhere else. If they are small enough, the Python recipes on ActiveState would be good, otherwise PyPI. If there is a demonstrated need for these classes, and you (or somebody else) is willing to maintain them, then they may be added to the collections module (perhaps for Python 3.3, since I think 3.2 and 2.7 are in feature-freeze). I suggest you also take this idea to python-l...@python.org or comp.lang.python first, to see whether there is any enthusiasm from the wider Python community. -- Steven D'Aprano ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unordered tuples/lists
On Tue, May 18, 2010 at 11:13:42PM +0100, Gustavo Narea wrote: > To sum up, it would behave like a tuple or a list, except when it's compared > with another object: They would be equivalent if they're both unordered > tuples/lists, and have the same elements. There can be mutable and immutable > editions (UnorderedList and UnorderedTuple, respectively). class UnorderedList(list): def __eq__(self, other): if not isinstance(other, UnorderedList): return False return sorted(self) == sorted(other) def __ne__(self, other): return not self.__eq__(other) Do you need more than that? Oleg. -- Oleg Broytmanhttp://phd.pp.ru/p...@phd.pp.ru Programmers don't die, they just GOSUB without RETURN. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Unordered tuples/lists
Gustavo Narea writes: > I've been searching for a data structure like a tuple/list *but* > unordered -- like a set, but duplicated elements shouldn't be removed. By that description, you're looking for the “Bag” pattern. […] > A multiset is not exactly what I need: I still need to use the > elements in the order they were given. This appears to contradict your explicit request that the collection type be unordered. […] > To sum up, it would behave like a tuple or a list, except when it's > compared with another object: They would be equivalent if they're both > unordered tuples/lists, and have the same elements. There can be > mutable and immutable editions (UnorderedList and UnorderedTuple, > respectively). Okay, so you want something rather special; it would be misleading to call this an unordered type, since you *do* want ordering preserved. It's only the comparison operators that would ignore ordering. AFAIK you'll need to design and implement this yourself. > I will write a PEP to elaborate on this if you think it'd be nice to > have. Or, should I have written the PEP first? Either way; but I think this discussion belongs on ‘python-ideas’ while the behaviour is still being designed. -- \ “On the internet you simply can't outsource parenting.” —Eliza | `\ Cussen, _Top 10 Internet Filter Lies_, The Punch, 2010-03-25 | _o__) | Ben Finney ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, May 18, 2010 at 2:50 PM, Antoine Pitrou wrote: > There's no chance for this since the patch relies on the new GIL. > (that's unless there's a rush to backport the new GIL in 2.7, of course) Thanks I missed that detail. > I think your "rare long delays" might be related to the old GIL's own > problems, though. How long are they? Typically between 20 and 60s. This is the time it takes to send and receive a single small packet on an already-active tcp connection to ensure it is still alive. Most of the time it is < 1ms. I don't have strong evidence that GIL issues are causing the problem, because I can't reliably reproduce the issue. But the general setup is similar (one thread doing light io experiencing odd delays in a process with multiple threads that are often cpu-bound, on a multi-core machine) thanks, -Mike ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Summing up
Antoine, This is a pretty good summary that mirrors my thoughts on the GIL matter as well. In the big picture, I do think it's desirable for Python to address the multicore performance issue--namely to not have the performance needlessly thrashed in that environment. The original new GIL addressed this. The I/O convoy effect problem is more subtle. Personally, I think it's an issue that at least merits further study because trying to overlap I/O with computation is a known programming technique that might be useful for people using Python to do message passing, distributed computation, etc. As an example, the multiprocessing module uses threads as part of its queue implementation. Is it impacted by convoying? I honestly don't know. I agree that getting some more real-world experience would be useful. Cheers, Dave > From: Antoine Pitrou > > Ok, this is a good opportunity to try to sum up, from my point of view. > > The main problem of the old GIL, which was evidenced in Dave's original > study (not this year's, but the previous one) *is* fixed unless someone > demonstrates otherwise. > > It should be noted that witnessing a slight performance degradation on > a multi-core machine is not enough to demonstrate such a thing. The > degradation could be caused by other factors, such as thread migration, > bad OS behaviour, or even locking peculiarities in your own > application, which are not related to the GIL. A good test is whether > performance improves if you play with sys.setswitchinterval(). > > > Dave's newer study regards another issue, which I must stress is also > present in the old GIL algorithm, and therefore must have affected, if > it is serious, real-world applications in 2.x. And indeed, the test I > recently added to ccbench evidences the huge drop in socket I/Os per > second when there's a background CPU thread; this test exercises the > same situation as Dave's demos, only with a less trivial CPU workload: > > == CPython 2.7b2+.0 (trunk:81274M) == > == x86_64 Linux on 'x86_64' == > > --- I/O bandwidth --- > > Background CPU task: Pi calculation (Python) > > CPU threads=0: 23034.5 packets/s. > CPU threads=1: 6.4 ( 0 %) > CPU threads=2: 15.7 ( 0 %) > CPU threads=3: 13.9 ( 0 %) > CPU threads=4: 20.8 ( 0 %) > > (note: I've just changed my desktop machine, so these figures are > different from what I've posted weeks or months ago) > > > Regardless of the fact that apparently noone reported it in real-world > conditions, we *could* decide that the issue needs fixing. If we > decide so, Nir's approach is the most rigorous one: it tries to fix > the problem thoroughly, rather than graft an additional heuristic. Nir > also has tested his patch on a variety of machines, more so than Dave > and I did with our own patches; he is obviously willing to go forward. > > Right now, there are two problems with Nir's proposal: > > - first, what Nick said: the difficulty of having reliable > high-precision cross-platform time sources, which are necessary for > the BFS algorithm. Ironically, timestamp counters have their own > problems on multi-core machines (they can go out of sync between > CPUs). gettimeofday() and clock_gettime() may be precise enough on > most Unices, though. > > - second, the BFS algorithm is not that well-studied, since AFAIK it > was refused for inclusion in the Linux kernel; someone in the > python-dev community would therefore have to make sense of, and > evaluate, its heuristic. > > I also don't consider my own patch a very satisfactory "solution", > although it has the reassuring quality of being simple and short (and > easy to revert!). > > > That said, most of us are programmers and we love to invent ways > of fixing technical issues. It sometimes leads us to consider some > things issues even when they are mostly theoretical. This is why I > am lukewarm on this. I think interested people should focus on > real-world testing (rather than Dave and I's synthetic tests) of the new > GIL, with or without the various patches, and share the results. > > Otherwise, Dj Gilcrease's suggestion of waiting for third-party reports > is also a very good one. > > Regards > > Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
On Tue, 18 May 2010 17:26:44 -0700 Mike Klaas wrote: > > > I think your "rare long delays" might be related to the old GIL's own > > problems, though. How long are they? > > Typically between 20 and 60s. You mean milliseconds I suppose? If it's the case, then you may simply be witnessing garbage collection runs. I've measured garbage collection runs of about 50 ms each on a Web application, with the full framework loaded and a bunch of objects in memory. If you really meant seconds, it looks a bit too high to be GIL-related. What kind of things are the CPU threads doing? Regards Antoine. ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Summing up
On 19/05/10 10:35, David Beazley wrote: Antoine, This is a pretty good summary that mirrors my thoughts on the GIL matter as well. In the big picture, I do think it's desirable for Python to address the multicore performance issue--namely to not have the performance needlessly thrashed in that environment. The original new GIL addressed this. The I/O convoy effect problem is more subtle. Personally, I think it's an issue that at least merits further study because trying to overlap I/O with computation is a known programming technique that might be useful for people using Python to do message passing, distributed computation, etc. As an example, the multiprocessing module uses threads as part of its queue implementation. Is it impacted by convoying? I honestly don't know. I agree that getting some more real-world experience would be useful. My takeaway from this discussion is that: A. we should leave the new GIL in 3.2 in its current (relatively) simple form for now, keeping the various patches in issue 7946 in our back pocket if someone finds real world examples of the convoying effect discussed there. The idea here being that we shouldn't complicate the implementation without some solid evidence that doing so is actually necessary for real world workloads. B. some more thought should be given to incorporating the new GIL into 2.7. However, this requires two things: - an update to the patch in 7753 to either retain the old GIL for platforms not supported by the new GIL or else to make the new GIL a configure option - Benjamin accepting that patch (as it would likely mean adding another beta release to the cycle) In the absence of an updated version of the 7753 patch, backporting the new GIL to 2.7 isn't really a serious option. Regards, Nick. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia --- ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Fixing the GIL (with a BFS scheduler)
> I think the new GIL should be given a year or so in the wild before > you start trying to optimize theoretical issues you may run into. If > in a year people come back and have some examples of where a proper > scheduler would help improve speed on multi-core systems even more, > then we can address the issue at that time. Exactly my feelings. Regards, Martin ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com