Re: [Python-Dev] sum(...) limitation
On 12 Aug 2014 11:21, "Chris Barker - NOAA Federal" wrote:
>
> Sorry for the bike shedding here, but:
>
>> The quadratic behaviour of repeated str summation is a subtle, silent error.
>
> OK, fair enough. I suppose it would be hard and ugly to catch those instances and raise an exception pointing users to "".join.
>
>> It *is* controversial that CPython silently optimises some cases of it away, since it can cause problems when porting affected code to other interpreters that don't use refcounting and thus have a harder time implementing such a trick.
>
> Is there anything in the language spec that says string concatenation is O(n^2)? Or for that matter any of the performance characteristics of built-in types? Those strike me as implementation details that SHOULD be particular to the implementation.

If you implement strings so they have multiple data segments internally (as is the case for StringIO these days), yes, you can avoid quadratic time concatenation behaviour. Doing so makes it harder to meet other complexity expectations (like O(1) access to arbitrary code points), and isn't going to happen in CPython regardless due to C API backwards compatibility constraints.

For the explicit loop with repeated concatenation, we can't say "this is slow, don't do it". People do it anyway, so we've opted for the "fine, make it as fast as we can" option as being preferable to an obscure and relatively hard to debug performance problem.

For sum(), we have the option of being more direct and just telling people Python's answer to the string concatenation problem (i.e. str.join). That is decidedly *not* the series of operations described in sum's documentation as "Sums start and the items of an iterable from left to right and returns the total."

Regards,
Nick.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
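The contrast drawn in this message can be sketched roughly as follows. This is only an illustration, assuming CPython; the helper names concat_in_loop and concat_with_join are made up for the example and do not come from the thread.

```python
def concat_in_loop(parts):
    """Build a string by repeated left-to-right concatenation."""
    result = ""
    for part in parts:
        # Each step may copy everything accumulated so far, which is
        # where the quadratic worst case comes from.
        result = result + part
    return result


def concat_with_join(parts):
    """Build the same string with str.join, which sizes the result once."""
    return "".join(parts)


parts = ["spam", "eggs", "cheese"]
looped = concat_in_loop(parts)
joined = concat_with_join(parts)

# sum() refuses a str start value and points at str.join instead.
try:
    sum(parts, "")
except TypeError as exc:
    refusal = str(exc)
else:
    refusal = None
```

Both helpers return the same string; only the amount of copying differs, and the TypeError is the "more direct" answer described for sum().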
Re: [Python-Dev] sum(...) limitation
Hi all,

The core of the matter is that if we repeatedly __add__ strings from a long list, we get O(n**2) behavior. From one point of view, the reason is that the additions proceed in left-to-right order. Indeed, sum() could proceed in a more balanced tree-like order: from [x0, x1, x2, x3, ...], reduce the list to [x0+x1, x2+x3, ...]; then repeat until there is only one item in the final list. This order ensures that sum(list_of_strings) is at worst O(n log n). In practice it might be close enough to linear not to matter. It also greatly improves the precision of sum(list_of_floats) (though not reaching the same precision levels as math.fsum()).

Just a thought,

Armin.
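The tree-like order described above could be sketched like this. It is a hypothetical pure-Python helper (pairwise_sum is an invented name), not how the built-in sum() works, and it makes no claim about real-world speed.

```python
import math


def pairwise_sum(items, start=0):
    """Add items in a balanced, tree-like order instead of left to right."""
    level = list(items)
    if not level:
        return start
    while len(level) > 1:
        # [x0, x1, x2, x3, x4] -> [x0+x1, x2+x3, x4]; order is preserved,
        # so this also works for non-commutative additions such as str.
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            paired.append(level[-1])  # the odd item out moves up unchanged
        level = paired
    return start + level[0]


words = pairwise_sum(["a", "b", "c", "d", "e"], "")
total = pairwise_sum(range(10))
floats = [0.1] * 10
float_gap = abs(pairwise_sum(floats) - math.fsum(floats))
```

Each round halves the list, so a string of total length n is copied about log2(len(items)) times rather than once per item.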
Re: [Python-Dev] Multiline with statement line continuation
I think this thread is probably Python-Ideas territory...

On Mon, Aug 11, 2014 at 4:08 PM, Allen Li wrote:
> Currently, this works with explicit line continuation, but as all style
> guides favor implicit line continuation over explicit, it would be nice
> if you could do the following:
>
>     with (open('foo') as foo,
>           open('bar') as bar,
>           open('baz') as baz,
>           open('spam') as spam,
>           open('eggs') as eggs):
>         pass

The parentheses seem unnecessary/redundant/weird. Why not allow newlines in-between "with" and the terminating ":"?

    with open('foo') as foo,
         open('bar') as bar,
         open('baz') as baz:
        pass

-- Devin
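For reference, the explicit line continuation that the quoted text says already works can be sketched as below. io.StringIO objects stand in for real files purely so the snippet is self-contained; that substitution is an assumption of the example.

```python
import io

# Backslashes join the physical lines into one logical "with" statement.
with io.StringIO("foo") as foo, \
     io.StringIO("bar") as bar, \
     io.StringIO("baz") as baz:
    combined = foo.read() + bar.read() + baz.read()

# All three context managers have been exited at this point.
all_closed = foo.closed and bar.closed and baz.closed
```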
Re: [Python-Dev] Multiline with statement line continuation
On Tue, Aug 12, 2014 at 10:28:14AM +1000, Nick Coghlan wrote:
> On 12 Aug 2014 09:09, "Allen Li" wrote:
> >
> > This is a problem I sometimes run into when working with a lot of files
> > simultaneously, where I need three or more `with` statements:
> >
> >     with open('foo') as foo:
> >         with open('bar') as bar:
> >             with open('baz') as baz:
> >                 pass
> >
> > Thankfully, support for multiple items was added in 3.1:
> >
> >     with open('foo') as foo, open('bar') as bar, open('baz') as baz:
> >         pass
> >
> > However, this calls for a multiline form, especially when
> > working with three or more items:
> >
> >     with open('foo') as foo, \
> >          open('bar') as bar, \
> >          open('baz') as baz, \
> >          open('spam') as spam, \
> >          open('eggs') as eggs:
> >         pass
>
> I generally see this kind of construct as a sign that refactoring is
> needed. For example, contextlib.ExitStack offers a number of ways to manage
> multiple context managers dynamically rather than statically.

I don't think that ExitStack is the right solution for when you have a small number of context managers known at edit-time. The extra effort of writing your code, and reading it, in a dynamic manner is not justified. Compare the natural way of writing this:

    with open("spam") as spam, open("eggs", "w") as eggs, frobulate("cheese") as cheese:
        # do stuff with spam, eggs, cheese

versus the dynamic way:

    with ExitStack() as stack:
        spam, eggs = [stack.enter_context(open(fname, mode))
                      for fname, mode in zip(("spam", "eggs"), ("r", "w"))]
        cheese = stack.enter_context(frobulate("cheese"))
        # do stuff with spam, eggs, cheese

I prefer the first, even with the long line.

-- Steven
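A runnable variant of the "dynamic way" might look like the sketch below. The temporary directory, the file names and the copy step are assumptions added so the example can run anywhere; frobulate() from the message is left out because it is not a real function.

```python
import os
import tempfile
from contextlib import ExitStack

with tempfile.TemporaryDirectory() as tmpdir:
    spam_path = os.path.join(tmpdir, "spam")
    eggs_path = os.path.join(tmpdir, "eggs")
    with open(spam_path, "w") as f:
        f.write("spam contents")

    # (path, mode) pairs that are only known at run time.
    specs = [(spam_path, "r"), (eggs_path, "w")]
    with ExitStack() as stack:
        # enter_context() registers each file so the stack closes it on exit.
        spam, eggs = [stack.enter_context(open(path, mode))
                      for path, mode in specs]
        eggs.write(spam.read())

    all_closed = spam.closed and eggs.closed
    with open(eggs_path) as f:
        copied = f.read()
```

ExitStack pays off when the list of specs is built at run time; with a fixed handful of files the plain multi-item with statement stays shorter.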
Re: [Python-Dev] Multiline with statement line continuation
On Tue, Aug 12, 2014 at 7:15 AM, Steven D'Aprano wrote:
> On Tue, Aug 12, 2014 at 10:28:14AM +1000, Nick Coghlan wrote:
>> On 12 Aug 2014 09:09, "Allen Li" wrote:
>> >
>> > This is a problem I sometimes run into when working with a lot of files
>> > simultaneously, where I need three or more `with` statements:
>> >
>> >     with open('foo') as foo:
>> >         with open('bar') as bar:
>> >             with open('baz') as baz:
>> >                 pass
>> >
>> > Thankfully, support for multiple items was added in 3.1:
>> >
>> >     with open('foo') as foo, open('bar') as bar, open('baz') as baz:
>> >         pass
>> >
>> > However, this calls for a multiline form, especially when
>> > working with three or more items:
>> >
>> >     with open('foo') as foo, \
>> >          open('bar') as bar, \
>> >          open('baz') as baz, \
>> >          open('spam') as spam, \
>> >          open('eggs') as eggs:
>> >         pass
>>
>> I generally see this kind of construct as a sign that refactoring is
>> needed. For example, contextlib.ExitStack offers a number of ways to manage
>> multiple context managers dynamically rather than statically.
>
> I don't think that ExitStack is the right solution for when you have a
> small number of context managers known at edit-time. The extra effort of
> writing your code, and reading it, in a dynamic manner is not justified.
> Compare the natural way of writing this:
>
>     with open("spam") as spam, open("eggs", "w") as eggs, frobulate("cheese") as cheese:
>         # do stuff with spam, eggs, cheese
>
> versus the dynamic way:
>
>     with ExitStack() as stack:
>         spam, eggs = [stack.enter_context(open(fname, mode))
>                       for fname, mode in zip(("spam", "eggs"), ("r", "w"))]
>         cheese = stack.enter_context(frobulate("cheese"))
>         # do stuff with spam, eggs, cheese
>
> I prefer the first, even with the long line.

I agree with Steven for *small* numbers of context managers. Once they become too long though, either refactoring is severely needed or the user should use ExitStack.

To quote Ben Hoyt:

> Is it meaningful to use "with" with a tuple, though? Because a tuple
> isn't a context manager with __enter__ and __exit__ methods. For
> example:
>
> >>> with (1,2,3): pass
> ...
> Traceback (most recent call last):
>   File "", line 1, in
> AttributeError: __exit__
>
> So -- although I'm not arguing for it here -- you'd be turning an error
> (a runtime AttributeError) into valid syntax.

I think by introducing parentheses we are going to risk seriously confusing users who may then try to write an assignment like

    a = (open('spam') as spam, open('eggs') as eggs)

because it looks like a tuple but isn't, and I think the extra complexity this would add to the language would not be worth the benefit.

If we simply look at Ruby for what happens when you have an overloaded syntax that means two different things, you can see why I'm against modifying this syntax. In Ruby, parentheses for method calls are optional and curly braces (i.e., {}) are used for blocks and hash literals. With a method on a class that takes a parameter and a block, you get some confusing errors. Take for example:

    class Spam
      def eggs(ham)
        puts ham
        yield if block_given?
      end
    end

    s = Spam.new
    s.eggs {monty: 'python'}
    SyntaxError: ...

But

    s.eggs({monty: 'python'})

will print out the hash. The interpreter isn't intelligent enough to know if you're attempting to pass a hash as a parameter or a block to be executed. This may seem like a stretch to apply to Python, but the concept of muddling the meaning of something already very well defined seems like a bad idea.
Re: [Python-Dev] Multiline with statement line continuation
On Tue, Aug 12, 2014 at 3:43 AM, Devin Jeanpierre wrote:
> I think this thread is probably Python-Ideas territory...
>
> On Mon, Aug 11, 2014 at 4:08 PM, Allen Li wrote:
> > Currently, this works with explicit line continuation, but as all style
> > guides favor implicit line continuation over explicit, it would be nice
> > if you could do the following:
> >
> >     with (open('foo') as foo,
> >           open('bar') as bar,
> >           open('baz') as baz,
> >           open('spam') as spam,
> >           open('eggs') as eggs):
> >         pass
>
> The parentheses seem unnecessary/redundant/weird. Why not allow
> newlines in-between "with" and the terminating ":"?
>
>     with open('foo') as foo,
>          open('bar') as bar,
>          open('baz') as baz:
>         pass

That way lies Coffeescript. Too much guessing.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Multiline with statement line continuation
Hi,

On 12 August 2014 01:08, Allen Li wrote:
>     with (open('foo') as foo,
>           open('bar') as bar,
>           open('baz') as baz,
>           open('spam') as spam,
>           open('eggs') as eggs):
>         pass

+1. It's exactly the same grammar extension as for "from import" statements, for the same reason.

Armin
Re: [Python-Dev] Multiline with statement line continuation
On 08/12/2014 06:57 PM, Armin Rigo wrote:
> Hi,
>
> On 12 August 2014 01:08, Allen Li wrote:
>>     with (open('foo') as foo,
>>           open('bar') as bar,
>>           open('baz') as baz,
>>           open('spam') as spam,
>>           open('eggs') as eggs):
>>         pass
>
> +1. It's exactly the same grammar extension as for "from import"
> statements, for the same reason.

Not the same: in import statements it unambiguously replaces a list of (optionally as-renamed) identifiers. Here, it would replace an arbitrary expression, which I think would mean that we couldn't differentiate between e.g.

    with (expr).meth():   # a line break in "expr"
                          # would make the parens useful

and

    with (expr1, expr2):

cheers,
Georg
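The first of the two forms above is already meaningful Python, which is what makes the proposed parentheses awkward to parse. A small sketch, where the Connection class and its transaction() method are hypothetical names invented for the illustration:

```python
from contextlib import contextmanager


class Connection:
    """Hypothetical resource whose method returns a context manager."""

    def __init__(self, name):
        self.name = name

    @contextmanager
    def transaction(self):
        yield "transaction on " + self.name


use_spam = True

# The parentheses group an ordinary expression that spans several lines;
# the context manager is the result of calling .transaction() on it.
with (Connection("spam")
      if use_spam else
      Connection("eggs")).transaction() as label:
    seen = label
```

A parser that meets "with (" cannot tell at that point whether a grouped expression like this one or a list of context managers follows.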
Re: [Python-Dev] sum(...) limitation
On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull wrote:
> I'm referring to removing the unnecessary information that there's a
> better way to do it, and simply raising an error (as in Python 3.2,
> say) which is all a RealProgrammer[tm] should ever need!

I can't imagine anyone is suggesting that -- disallow it, but don't tell anyone why?

The only thing that is remotely on the table here is:

1) remove the special case for strings -- buyer beware -- but consistent and less "ugly"

2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.

And I doubt anyone really is pushing for anything but (2)

Stephen Turnbull wrote:
> IMO we'd also want a homogeneous_iterable ABC

Actually, I've thought for years that that would open the door to a lot of optimizations -- but that's a much broader question than sum(). I even brought it up probably over ten years ago -- but no one was the least bit interested -- nor are they now -- I know this was a rhetorical suggestion to make the point about what not to do

> Because obviously we'd want the
> attractive nuisance of "if you have __add__, there's a default
> definition of __sum__"

now I'm confused -- isn't that exactly what we have now?

> It's possible that Python could provide some kind of feature that
> would allow an optimized sum function for every type that has __add__,
> but I think this will take a lot of thinking.

does it need to be every type? As it is the common ones work fine already except for strings -- so if we add an optimized string sum() then we're done.

> *Somebody* will do it
> (I don't think anybody is +1 on restricting sum() to a subset of types
> with __add__).

uhm, that's exactly what we have now -- you can use sum() with anything that has an __add__, except strings. And by that logic, if we thought there were other inefficient use cases, we'd restrict those too.
But users can always define their own classes that have a __sum__ and are really inefficient -- so unless sum() becomes just for a certain subset of built-in types -- does anyone want that? Then we are back to the current situation:

sum() can be used for any type that has an __add__ defined.

But naive users are likely to try it with strings, and that's bad, so we want to prevent that, and have a special case check for strings.

What I fail to see is why it's better to raise an exception and point users to a better way, than to simply provide an optimization so that it's a moot issue.

The only justification offered here is that it will teach people that summing strings (and some other objects?) is order(N^2) and a bad idea. But:

a) Python's primary purpose is practical, not pedagogical (not that it isn't great for that)

b) I doubt any naive users learn anything other than "I can't use sum() for strings, I should use "".join()". Will they make the leap to "I shouldn't use string concatenation in a loop, either"? Oh, wait, you can use string concatenation in a loop -- that's been optimized. So will they learn: "some types of objects have poor performance with repeated concatenation and shouldn't be used with sum(). So if I write such a class, and want to sum them up, I'll need to write an optimized version of that code"?

I submit that no naive user is going to get any closer to a proper understanding of algorithmic Order behavior from this small hint. Which leaves no reason to prefer an Exception to an optimization.

One other point: perhaps this will lead a naive user into thinking -- "sum() raises an exception if I try to use it inefficiently, so it must be OK to use for anything that doesn't raise an exception" -- that would be a bad lesson to mis-learn

-Chris

PS: Armin Rigo wrote:
> It also improves a
> lot the precision of sum(list_of_floats) (though not reaching the same
> precision levels of math.fsum()).
while we are at it, having the default sum() for floats be fsum() would be nice -- I'd rather the default was better accuracy and lower performance. Folks that really care about performance could call math.fastsum(), or really, use numpy...

This does turn sum() into a function that does type-based dispatch, but isn't python full of those already? do something special for the types you know about, call the generic dunder method for the rest.

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

chris.bar...@noaa.gov
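The accuracy gap behind the fsum() suggestion can be seen with a small sketch. An explicit loop stands in for plain left-to-right float addition here, on the assumption that the built-in sum() may treat floats differently depending on the interpreter version; naive_sum is a name made up for the example.

```python
import math


def naive_sum(values):
    """Plain left-to-right float addition, rounding after every step."""
    total = 0.0
    for value in values:
        total += value
    return total


values = [0.1] * 10
naive = naive_sum(values)   # accumulates a rounding error
exact = math.fsum(values)   # tracks partial sums to avoid losing precision
gap = abs(naive - exact)
```

math.fsum() is a real standard-library function; math.fastsum() in the message above is a hypothetical name for a possible future addition.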
Re: [Python-Dev] sum(...) limitation
I know I have nothing to decide here, since I'm no contributor and just a silent watcher on this list. However, I just wanted to point out that I fully agree with Chris Barker's position. Couldn't have stated it better. Performance should be an interpreter implementation issue, not a language issue.

> 2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood -- no more code than the exception check.

I would give it a +1 if my opinion counts for anything.

Cheers
Stefan

Sent: Tuesday, 12 August 2014 at 21:11
From: "Chris Barker"
To: No recipient
Cc: "Python Dev"
Subject: Re: [Python-Dev] sum(...) limitation
Re: [Python-Dev] Multiline with statement line continuation
On Tue, Aug 12, 2014 at 8:12 AM, Guido van Rossum wrote:
> On Tue, Aug 12, 2014 at 3:43 AM, Devin Jeanpierre wrote:
>> The parentheses seem unnecessary/redundant/weird. Why not allow
>> newlines in-between "with" and the terminating ":"?
>>
>>     with open('foo') as foo,
>>          open('bar') as bar,
>>          open('baz') as baz:
>>         pass
>
> That way lies Coffeescript. Too much guessing.

There's no syntactic ambiguity, so what guessing are you talking about?

What *really* requires guessing, is figuring out where in Python's syntax parentheses are allowed vs not allowed ;). For example, "from foo import (bar, baz)" is legal, but "import (bar, baz)" is not. Sometimes it feels like Python is slowly and organically evolving into a parenthesis-delimited language.

-- Devin
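The two import forms mentioned above can be checked without importing anything by compiling the source text. This is a minimal sketch; is_valid_syntax is an invented helper name.

```python
def is_valid_syntax(source):
    """Return True if the source text parses as Python, False otherwise."""
    try:
        compile(source, "<example>", "exec")  # parse only, nothing is executed
    except SyntaxError:
        return False
    return True


from_import_ok = is_valid_syntax("from foo import (bar, baz)")
plain_import_ok = is_valid_syntax("import (bar, baz)")
```

Because compile() only parses, the module foo does not need to exist for the first check to succeed.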
Re: [Python-Dev] sum(...) limitation
Chris Barker writes:
> What I fail to see is why it's better to raise an exception and point users
> to a better way, than to simply provide an optimization so that it's a moot
> issue.
>
> The only justification offered here is that it will teach people that summing
> strings (and some other objects?) is order(N^2) and a bad idea. But:
>
> a) Python's primary purpose is practical, not pedagogical (not that it
> isn't great for that)
>
> b) I doubt any naive users learn anything other than "I can't use sum() for
> strings, I should use "".join()". Will they make the leap to "I shouldn't
> use string concatenation in a loop, either"? Oh, wait, you can use string
> concatenation in a loop -- that's been optimized. So will they learn: "some
> types of objects have poor performance with repeated concatenation and
> shouldn't be used with sum(). So if I write such a class, and want to sum
> them up, I'll need to write an optimized version of that code"?
>
> I submit that no naive user is going to get any closer to a proper
> understanding of algorithmic Order behavior from this small hint. Which
> leaves no reason to prefer an Exception to an optimization.
>
> One other point: perhaps this will lead a naive user into thinking --
> "sum() raises an exception if I try to use it inefficiently, so it must be
> OK to use for anything that doesn't raise an exception" -- that would be a
> bad lesson to mis-learn

AOL to that.

Best,
-Nikolaus

--
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«
Re: [Python-Dev] Multiline with statement line continuation
On Tue, Aug 12, 2014 at 08:04:35AM -0500, Ian Cordasco wrote:

> I think by introducing parentheses we are going to risk seriously
> confusing users who may then try to write an assignment like
>
>     a = (open('spam') as spam, open('eggs') as eggs)

Seriously? If they try it, they will get a syntax error. Now, admittedly Python's syntax error messages tend to be terse and cryptic, but it's still enough to show that you can't do that.

    py> a = (open('spam') as spam, open('eggs') as eggs)
      File "", line 1
        a = (open('spam') as spam, open('eggs') as eggs)
                           ^
    SyntaxError: invalid syntax

I don't see this as a problem. There's no limit to the things that people *might* do if they don't understand Python semantics:

    for module in sys, math, os,
        import module

(and yes, I once tried this as a beginner) but they try it once, realise it doesn't work, and never do it again.

> Because it looks like a tuple but isn't and I think the extra
> complexity this would add to the language would not be worth the
> benefit.

Do we have a problem with people thinking that, since tuples are normally interchangeable with lists, they can write this?

    from module import [fe, fi, fo, fum, spam, eggs, cheese]

and then being "seriously confused" by the syntax error they receive? Or writing this?

    from (module import fe, fi, fo, fum, spam, eggs, cheese)

It's not sufficient that people might try it, see it fails, and move on. Your claim is that it will cause serious confusion. I just don't see that happening.

> If we simply look at Ruby for what happens when you have an
> overloaded syntax that means two different things, you can see why I'm
> against modifying this syntax.

That ship has sailed in Python, oh, 20+ years ago. Parens are used for grouping, for tuples[1], for function calls, for parameter lists, class base-classes, generator expressions and line continuations. I cannot think of any examples where these multiple uses for parens have caused meaningful confusion, and I don't think this one will either.

[1] Technically not, since it's the comma, not the ( ), which makes a tuple, but a lot of people don't know that and treat it as if the parens were compulsory.

-- Steven
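Footnote [1] can be checked directly. A small sketch with illustrative variable names:

```python
with_parens = (1, 2)
without_parens = 1, 2    # the comma alone builds the tuple
single = 1,              # trailing comma: a one-element tuple
not_a_tuple = (1)        # parentheses alone only group; this is the int 1
empty = ()               # the empty tuple is the one case that needs parens
```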
Re: [Python-Dev] sum(...) limitation
Redirecting to python-ideas, so trimming less than I might.

Chris Barker writes:
> On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull wrote:
>
> > I'm referring to removing the unnecessary information that there's a
> > better way to do it, and simply raising an error (as in Python 3.2,
> > say) which is all a RealProgrammer[tm] should ever need!
>
> I can't imagine anyone is suggesting that -- disallow it, but don't tell
> anyone why?

As I said, it's a regression. That's exactly the behavior in Python 3.2.

> The only thing that is remotely on the table here is:
>
> 1) remove the special case for strings -- buyer beware -- but consistent
> and less "ugly"

It's only consistent if you believe that Python has strict rules for use of various operators. It doesn't, except as far as they are constrained by precedence. For example, I have an application where I add bytestrings bytewise modulo N <= 256, and concatenate them. In fact I use function call syntax, but the obvious operator syntax is '+' for the bytewise addition, and '*' for the concatenation.

It's not in the Zen, but I believe in the maxim "If it's worth doing, it's worth doing well." So for me, 1) is out anyway.

> 2) add a special case for strings that is fast and efficient -- may be as
> simple as calling "".join() under the hood -- no more code than the
> exception check.

Sure, but what about all the other immutable containers with __add__ methods? What about mappings with key-wise __add__ methods whose values might be immutable but have __add__ methods? Where do you stop with the special-casing? I consider this far more complex and ugly than the simple "sum() is for numbers" rule (and even that is way too complex considering accuracy of summing floats).

> And I doubt anyone really is pushing for anything but (2)

I know that, but I think it's the wrong solution to the problem (which is genuine IMO). The right solution is something generic, possibly a __sum__ method.
The question is whether that leads to too much work to be worth it (eg, "homogeneous_iterable").

> > Because obviously we'd want the attractive nuisance of "if you
> > have __add__, there's a default definition of __sum__"
>
> now I'm confused -- isn't that exactly what we have now?

Yes and my feeling (backed up by arguments that I admit may persuade nobody but myself) is that what we have now kinda sucks[tm]. It seemed like a good idea when I first saw it, but then, my apps don't scale to where the pain starts in my own usage.

> > It's possible that Python could provide some kind of feature that
> > would allow an optimized sum function for every type that has
> > __add__, but I think this will take a lot of thinking.
>
> does it need to be every type? As it is the common ones work fine already
> except for strings -- so if we add an optimized string sum() then we're
> done.

I didn't say provide an optimized sum(), I said provide a feature enabling people who want to optimize sum() to do so. So yes, it needs to be every type (the optional __sum__ method is a proof of concept, modulo it actually being implementable ;-).

> > *Somebody* will do it (I don't think anybody is +1 on restricting
> > sum() to a subset of types with __add__).
>
> uhm, that's exactly what we have now

Exactly. Who's arguing that the sum() we have now is a ticket to Paradise? I'm just saying that there's probably somebody out there negative enough on the current situation to come up with an answer that I think is general enough (and I suspect that python-dev consensus is that demanding, too).

> sum() can be used for any type that has an __add__ defined.

I'd like to see that be mutable types with __iadd__.

> What I fail to see is why it's better to raise an exception and
> point users to a better way, than to simply provide an optimization
> so that it's a mute issue.

Because inefficient sum() is an attractive nuisance, easy to overlook, and likely to bite users other than the author.
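One way to read the "mutable types with __iadd__" remark is sketched below. inplace_sum is a hypothetical helper written for this illustration; it is not an existing built-in and not something proposed verbatim in the thread.

```python
import copy
import operator


def inplace_sum(iterable, start):
    """Accumulate with += on a private copy of start (hypothetical helper)."""
    total = copy.copy(start)  # keep the caller's start object untouched
    for item in iterable:
        # operator.iadd uses __iadd__ when the type defines it, so a list
        # accumulator grows in place instead of being rebuilt on every step.
        total = operator.iadd(total, item)
    return total


chunks = [[1, 2], [3], [4, 5]]
start = []
flattened = inplace_sum(chunks, start)
rebuilt = sum(chunks, [])  # same result, but creates a new list per addition
```

For lists the in-place version avoids re-copying the accumulated prefix, which is the cost that makes __add__-based summation quadratic.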
> The only justification offered here is that it will teach people that summing
> strings (and some other objects?)

Summing tuples works (with appropriate start=tuple()). Haven't benchmarked, but I bet that's O(N^2).

> is order(N^2) and a bad idea. But:
>
> a) Python's primary purpose is practical, not pedagogical (not that it
> isn't great for that)

My argument is that in practical use sum() is a bad idea, period, until you book up on the types and applications where it *does* work. N.B. It doesn't even work properly for numbers (inaccurate for floats).

> b) I doubt any naive users learn anything other than "I can't use sum() for
> strings, I should use "".join()".

For people who think that special-casing strings is a good idea, I think this is about as much benefit as you can expect. Why go farther?<0.5 wink/>

> I submit that no naive user is going to get any closer to a proper
> understanding of algorithmic Order behavior from this