date:20140812

Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Nick Coghlan

On 12 Aug 2014 11:21, "Chris Barker - NOAA Federal" 
wrote:
>
> Sorry for the bike shedding here, but:
>
>> The quadratic behaviour of repeated str summation is a subtle, silent
>> error.
>
> OK, fair enough. I suppose it would be hard and ugly to catch those
> instances and raise an exception pointing users to "".join.
>>
>> *is* controversial that CPython silently optimises some cases of it
>> away, since it can cause problems when porting affected code to other
>> interpreters that don't use refcounting and thus have a harder time
>> implementing such a trick.
>
> Is there anything in the language spec that says string concatenation is
> O(n^2)? Or for that matter any of the performance characteristics of built-in
> types? Those strike me as implementation details that SHOULD be particular to
> the implementation.

If you implement strings so they have multiple data segments internally (as
is the case for StringIO these days), yes, you can avoid quadratic time
concatenation behaviour. Doing so makes it harder to meet other complexity
expectations (like O(1) access to arbitrary code points), and isn't going
to happen in CPython regardless due to C API backwards compatibility
constraints.

For the explicit loop with repeated concatenation, we can't say "this is
slow, don't do it". People do it anyway, so we've opted for the "fine, make
it as fast as we can" option as being preferable to an obscure and
relatively hard to debug performance problem.

For sum(), we have the option of being more direct and just telling people
Python's answer to the string concatenation problem (i.e. str.join). That
is decidedly *not* the series of operations described in sum's
documentation as "Sums start and the items of an iterable from left to
right and returns the total."
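
A minimal sketch of the behaviour discussed above, assuming a small list of short strings (the names `words`, `concatenated` and `joined` are illustrative only). It shows sum() rejecting a str start value, the explicit left-to-right loop that sum's documentation describes, and str.join producing the same result:

```python
words = ["spam", "eggs", "cheese"]

# sum() rejects a str start value; its error message points at ''.join.
message = None
try:
    sum(words, "")
except TypeError as exc:
    message = str(exc)

# The explicit loop: left-to-right concatenation, one new str per step
# unless the interpreter manages to optimise the copy away.
concatenated = ""
for word in words:
    concatenated += word

# str.join is the recommended way to build the same result.
joined = "".join(words)

print(message)
print(concatenated == joined)
```

Both spellings give the same string; the difference under discussion is only how the cost grows with the number of items.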

Regards,
Nick.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Armin Rigo

Hi all,

The core of the matter is that if we repeatedly __add__ strings from a
long list, we get O(n**2) behavior.  From one point of view, the
reason is that the additions proceed in left-to-right order.  Indeed,
sum() could proceed in a more balanced tree-like order: from [x0, x1,
x2, x3, ...], reduce the list to [x0+x1, x2+x3, ...]; then repeat
until there is only one item in the final list.  This order ensures
that sum(list_of_strings) is at worst O(n log n).  It might be in
practice close enough to linear to not matter.  It also greatly
improves the precision of sum(list_of_floats) (though not reaching the
same precision levels as math.fsum()).
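
The reduction order described above can be sketched as follows. This is an illustration only, not how the built-in sum() works; `pairwise_sum` is a hypothetical helper name, and it assumes a non-empty input whose items support `+`:

```python
import math


def pairwise_sum(items):
    """Add adjacent pairs repeatedly until a single item remains."""
    level = list(items)
    if not level:
        raise ValueError("pairwise_sum() needs at least one item")
    while len(level) > 1:
        # [x0, x1, x2, x3, ...] -> [x0+x1, x2+x3, ...]
        paired = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            # An odd item out is carried to the next round unchanged,
            # which keeps the left-to-right order of the operands.
            paired.append(level[-1])
        level = paired
    return level[0]


strings = ["a", "b", "c", "d", "e"]
floats = [0.1] * 1024

print(pairwise_sum(strings))
print(pairwise_sum(floats), math.fsum(floats))
```

Because operands are only ever combined with their neighbours, the result for strings is the same concatenation that a left-to-right loop would produce.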


Just a thought,

Armin.

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Devin Jeanpierre

I think this thread is probably Python-Ideas territory...

On Mon, Aug 11, 2014 at 4:08 PM, Allen Li  wrote:
> Currently, this works with explicit line continuation, but as all style
> guides favor implicit line continuation over explicit, it would be nice
> if you could do the following:
>
> with (open('foo') as foo,
>       open('bar') as bar,
>       open('baz') as baz,
>       open('spam') as spam,
>       open('eggs') as eggs):
>     pass

The parentheses seem unnecessary/redundant/weird. Why not allow
newlines in-between "with" and the terminating ":"?

with open('foo') as foo,
     open('bar') as bar,
     open('baz') as baz:
    pass

-- Devin

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Steven D'Aprano

On Tue, Aug 12, 2014 at 10:28:14AM +1000, Nick Coghlan wrote:
> On 12 Aug 2014 09:09, "Allen Li"  wrote:
> >
> > This is a problem I sometimes run into when working with a lot of files
> > simultaneously, where I need three or more `with` statements:
> >
> > with open('foo') as foo:
> >     with open('bar') as bar:
> >         with open('baz') as baz:
> >             pass
> >
> > Thankfully, support for multiple items was added in 3.1:
> >
> > with open('foo') as foo, open('bar') as bar, open('baz') as baz:
> >     pass
> >
> > However, this begs the need for a multiline form, especially when
> > working with three or more items:
> >
> > with open('foo') as foo, \
> >      open('bar') as bar, \
> >      open('baz') as baz, \
> >      open('spam') as spam, \
> >      open('eggs') as eggs:
> >     pass
> 
> I generally see this kind of construct as a sign that refactoring is
> needed. For example, contextlib.ExitStack offers a number of ways to manage
> multiple context managers dynamically rather than statically.

I don't think that ExitStack is the right solution for when you have a 
small number of context managers known at edit-time. The extra effort of 
writing your code, and reading it, in a dynamic manner is not justified. 
Compare the natural way of writing this:

with open("spam") as spam, open("eggs", "w") as eggs, frobulate("cheese") as cheese:
    # do stuff with spam, eggs, cheese

versus the dynamic way:

with ExitStack() as stack:
    spam, eggs = [stack.enter_context(open(fname, mode)) for fname, mode in
                  zip(("spam", "eggs"), ("r", "w"))]
    cheese = stack.enter_context(frobulate("cheese"))
    # do stuff with spam, eggs, cheese

I prefer the first, even with the long line.
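
For comparison, here is a runnable sketch of both styles. It assumes throwaway files in a temporary directory, and it leaves out frobulate(), which is only a placeholder in the message above; the file names and the `specs` list are illustrative:

```python
import os
import tempfile
from contextlib import ExitStack

with tempfile.TemporaryDirectory() as tmp:
    spam_path = os.path.join(tmp, "spam")
    eggs_path = os.path.join(tmp, "eggs")
    with open(spam_path, "w") as f:
        f.write("spam contents")

    # Static form: every context manager is named in the with statement.
    with open(spam_path) as spam, open(eggs_path, "w") as eggs:
        eggs.write(spam.read())

    # Dynamic form: ExitStack enters each manager in turn and closes
    # them all, in reverse order, when the block exits.
    specs = [(spam_path, "r"), (eggs_path, "a")]
    with ExitStack() as stack:
        spam, eggs = [stack.enter_context(open(path, mode))
                      for path, mode in specs]
        eggs.write(spam.read().upper())

    with open(eggs_path) as f:
        result = f.read()

print(result)
```

The dynamic form earns its keep when the list of files is only known at run time; for a fixed handful, the static form reads more directly.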


-- 
Steven

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Ian Cordasco

On Tue, Aug 12, 2014 at 7:15 AM, Steven D'Aprano  wrote:
> On Tue, Aug 12, 2014 at 10:28:14AM +1000, Nick Coghlan wrote:
>> On 12 Aug 2014 09:09, "Allen Li"  wrote:
>> >
>> > This is a problem I sometimes run into when working with a lot of files
>> > simultaneously, where I need three or more `with` statements:
>> >
>> > with open('foo') as foo:
>> >     with open('bar') as bar:
>> >         with open('baz') as baz:
>> >             pass
>> >
>> > Thankfully, support for multiple items was added in 3.1:
>> >
>> > with open('foo') as foo, open('bar') as bar, open('baz') as baz:
>> >     pass
>> >
>> > However, this begs the need for a multiline form, especially when
>> > working with three or more items:
>> >
>> > with open('foo') as foo, \
>> >      open('bar') as bar, \
>> >      open('baz') as baz, \
>> >      open('spam') as spam, \
>> >      open('eggs') as eggs:
>> >     pass
>>
>> I generally see this kind of construct as a sign that refactoring is
>> needed. For example, contextlib.ExitStack offers a number of ways to manage
>> multiple context managers dynamically rather than statically.
>
> I don't think that ExitStack is the right solution for when you have a
> small number of context managers known at edit-time. The extra effort of
> writing your code, and reading it, in a dynamic manner is not justified.
> Compare the natural way of writing this:
>
> with open("spam") as spam, open("eggs", "w") as eggs, frobulate("cheese") as cheese:
>     # do stuff with spam, eggs, cheese
>
> versus the dynamic way:
>
> with ExitStack() as stack:
>     spam, eggs = [stack.enter_context(open(fname, mode)) for fname, mode in
>                   zip(("spam", "eggs"), ("r", "w"))]
>     cheese = stack.enter_context(frobulate("cheese"))
>     # do stuff with spam, eggs, cheese
>
> I prefer the first, even with the long line.

I agree with Steven for *small* numbers of context managers. Once they
become too long though, either refactoring is severely needed or the
user should use ExitStack.

To quote Ben Hoyt:

> Is it meaningful to use "with" with a tuple, though? Because a tuple
> isn't a context manager with __enter__ and __exit__ methods. For
> example:
>
> >>> with (1,2,3): pass
> ...
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: __exit__
>
> So -- although I'm not arguing for it here -- you'd be turning an error
> (a runtime AttributeError) into valid syntax.

I think by introducing parentheses we are going to risk seriously
confusing users who may then try to write an assignment like

a = (open('spam') as spam, open('eggs') as eggs)

That looks like a tuple but isn't, and I think the extra
complexity this would add to the language would not be worth the
benefit. If we simply look at Ruby for what happens when you have an
overloaded syntax that means two different things, you can see why I'm
against modifying this syntax. In Ruby, parentheses for method calls
are optional and curly braces (i.e., {}) are used for blocks and hash
literals. With a method on a class that takes a parameter and a block,
you get some confusing errors. Take, for example:

class Spam
  def eggs(ham)
    puts ham
    yield if block_given?
  end
end

s = Spam.new
s.eggs {monty: 'python'}
SyntaxError: ...

But

s.eggs({monty: 'python'})

Will print out the hash. The interpreter isn't intelligent enough to
know if you're attempting to pass a hash as a parameter or a block to
be executed. This may seem like a stretch to apply to Python, but the
concept of muddling the meaning of something already very well defined
seems like a bad idea.

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Guido van Rossum

On Tue, Aug 12, 2014 at 3:43 AM, Devin Jeanpierre 
wrote:

> I think this thread is probably Python-Ideas territory...
>
> On Mon, Aug 11, 2014 at 4:08 PM, Allen Li  wrote:
> > Currently, this works with explicit line continuation, but as all style
> > guides favor implicit line continuation over explicit, it would be nice
> > if you could do the following:
> >
> > with (open('foo') as foo,
> >       open('bar') as bar,
> >       open('baz') as baz,
> >       open('spam') as spam,
> >       open('eggs') as eggs):
> >     pass
>
> The parentheses seem unnecessary/redundant/weird. Why not allow
> newlines in-between "with" and the terminating ":"?
>
> with open('foo') as foo,
>      open('bar') as bar,
>      open('baz') as baz:
>     pass
>

That way lies Coffeescript. Too much guessing.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Armin Rigo

Hi,

On 12 August 2014 01:08, Allen Li  wrote:
> with (open('foo') as foo,
>       open('bar') as bar,
>       open('baz') as baz,
>       open('spam') as spam,
>       open('eggs') as eggs):
>     pass

+1.  It's exactly the same grammar extension as for "from import"
statements, for the same reason.


Armin

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Georg Brandl

On 08/12/2014 06:57 PM, Armin Rigo wrote:
> Hi,
> 
> On 12 August 2014 01:08, Allen Li  wrote:
>> with (open('foo') as foo,
>>       open('bar') as bar,
>>       open('baz') as baz,
>>       open('spam') as spam,
>>       open('eggs') as eggs):
>>     pass
> 
> +1.  It's exactly the same grammar extension as for "from import"
> statements, for the same reason.

Not the same: in import statements it unambiguously replaces a list
of (optionally as-renamed) identifiers.  Here, it would replace an
arbitrary expression, which I think would mean that we couldn't
differentiate between e.g.

   with (expr).meth():    # a line break in "expr"
                          # would make the parens useful

and

   with (expr1, expr2):

cheers,
Georg


Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Chris Barker

On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull 
wrote:

> I'm referring to removing the unnecessary information that there's a
>  better way to do it, and simply raising an error (as in Python 3.2,
> say) which is all a RealProgrammer[tm] should ever need!
>

I can't imagine anyone is suggesting that -- disallow it, but don't tell
anyone why?

The only thing that is remotely on the table here is:

1) remove the special case for strings -- buyer beware -- but consistent
and less "ugly"

2) add a special case for strings that is fast and efficient -- may be as
simple as calling "".join() under the hood --no more code than the
exception check.

And I doubt anyone really is pushing for anything but (2)
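
To make option (2) concrete, here is a rough sketch of what such a special case could look like if written in pure Python. The wrapper name `sum_with_str_fast_path` is hypothetical and this is not how the built-in sum() behaves; it only illustrates the dispatch being proposed:

```python
def sum_with_str_fast_path(iterable, start=0):
    """Hypothetical wrapper: like sum(), but joins when start is a str."""
    if isinstance(start, str):
        # Delegate to str.join, which raises TypeError for non-str items.
        return start + "".join(iterable)
    # Everything else takes the ordinary left-to-right sum() path.
    return sum(iterable, start)


print(sum_with_str_fast_path(["spam", "eggs"], ""))
print(sum_with_str_fast_path([1, 2, 3]))
```

The branch is about the same size as the check that currently raises the exception, which is the point being made above.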

Stephen Turnbull wrote:

>   IMO we'd also want a homogeneous_iterable ABC

Actually, I've thought for years that that would open the door to a lot of
optimizations -- but that's a much broader question than sum(). I even
brought it up probably over ten years ago -- but no one was the least bit
interested -- nor are they now -- I know this was a rhetorical suggestion
to make the point about what not to do.

> Because obviously we'd want the
> attractive nuisance of "if you have __add__, there's a default
> definition of __sum__"

now I'm confused -- isn't that exactly what we have now?

> It's possible that Python could provide some kind of feature that
> would allow an optimized sum function for every type that has __add__,
> but I think this will take a lot of thinking.

does it need to be every type? As it is the common ones work fine already
except for strings -- so if we add an optimized string sum() then we're
done.

> *Somebody* will do it
> (I don't think anybody is +1 on restricting sum() to a subset of types
> with __add__).

uhm, that's exactly what we have now -- you can use sum() with anything
that has an __add__, except strings. And by that logic, if we thought there
were other inefficient use cases, we'd restrict those too.

But users can always define their own classes that have a __sum__ and are
really inefficient -- so unless sum() becomes just for a certain subset of
built-in types -- does anyone want that? Then we are back to the current
situation:

sum() can be used for any type that has an __add__ defined.

But naive users are likely to try it with strings, and that's bad, so we
want to prevent that, and have a special case check for strings.

What I fail to see is why it's better to raise an exception and point users
to a better way, than to simply provide an optimization so that it's a moot
issue.

The only justification offered here is that it will teach people that summing
strings (and some other objects?) is O(N^2) and a bad idea. But:

a) Python's primary purpose is practical, not pedagogical (not that it
isn't great for that)

b) I doubt any naive users learn anything other than "I can't use sum() for
strings, I should use "".join()". Will they make the leap to "I shouldn't
use string concatenation in a loop, either"? Oh, wait, you can use string
concatenation in a loop -- that's been optimized. So will they learn: "some
types of objects have poor performance with repeated concatenation and
shouldn't be used with sum(). So if I write such a class, and want to sum
them up, I'll need to write an optimized version of that code"?

I submit that no naive user is going to get any closer to a proper
understanding of algorithmic Order behavior from this small hint. Which
leaves no reason to prefer an Exception to an optimization.

One other point: perhaps this will lead a naive user into thinking --
"sum() raises an exception if I try to use it inefficiently, so it must be
OK to use for anything that doesn't raise an exception" -- that would be a
bad lesson to mis-learn

-Chris

PS:
Armin Rigo wrote:

> It also improves a
> lot the precision of sum(list_of_floats) (though not reaching the same
> precision levels of math.fsum()).

While we are at it, having the default sum() for floats be fsum() would be
nice -- I'd rather the default was better accuracy, lower performance. Folks
that really care about performance could call math.fastsum(), or really,
use numpy...

This does turn sum() into a function that does type-based dispatch, but
isn't Python full of those already? Do something special for the types you
know about, call the generic dunder method for the rest.

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

chris.bar...@noaa.gov

Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Stefan Richthofer

I know I have nothing to decide here, since I'm not a contributor, just a silent watcher on this list.

However, I just wanted to point out that I fully agree with Chris Barker's position. I couldn't have stated
it better. Performance should be an interpreter implementation issue, not a language issue.

> 2) add a special case for strings that is fast and efficient -- may be as simple as calling "".join() under the hood --no more code than the exception check.

I would give it a +1 if my opinion counts for anything.

Cheers

Stefan


Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Devin Jeanpierre

On Tue, Aug 12, 2014 at 8:12 AM, Guido van Rossum  wrote:
> On Tue, Aug 12, 2014 at 3:43 AM, Devin Jeanpierre 
> wrote:
>> The parentheses seem unnecessary/redundant/weird. Why not allow
>> newlines in-between "with" and the terminating ":"?
>>
>> with open('foo') as foo,
>>      open('bar') as bar,
>>      open('baz') as baz:
>>     pass
>
>
> That way lies Coffeescript. Too much guessing.

There's no syntactic ambiguity, so what guessing are you talking about?

What *really* requires guessing, is figuring out where in Python's
syntax parentheses are allowed vs not allowed ;). For example, "from
foo import (bar, baz)" is legal, but "import (bar, baz)" is not.
Sometimes it feels like Python is slowly and organically evolving into
a parenthesis-delimited language.
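
The asymmetry between the two import forms can be checked without importing anything, by asking the compiler whether each statement parses. This is a small sketch; `compiles` is a hypothetical helper name:

```python
def compiles(source):
    """Return True if source parses as a module, False on SyntaxError."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Parentheses are accepted around the names in a from-import...
from_import_ok = compiles("from os import (path,\n                sep)")
# ...but not around the module list of a plain import.
plain_import_ok = compiles("import (os, sys)")

print(from_import_ok, plain_import_ok)
```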

-- Devin

Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Nikolaus Rath

Chris Barker  writes:
> What I fail to see is why it's better to raise an exception and point users
> to a better way, than to simply provide an optimization so that it's a moot
> issue.
>
> The only justification offered here is that it will teach people that summing
> strings (and some other objects?) is O(N^2) and a bad idea. But:
>
> a) Python's primary purpose is practical, not pedagogical (not that it
> isn't great for that)
>
> b) I doubt any naive users learn anything other than "I can't use sum() for
> strings, I should use "".join()". Will they make the leap to "I shouldn't
> use string concatenation in a loop, either"? Oh, wait, you can use string
> concatenation in a loop -- that's been optimized. So will they learn: "some
> types of objects have poor performance with repeated concatenation and
> shouldn't be used with sum(). So if I write such a class, and want to sum
> them up, I'll need to write an optimized version of that code"?
>
> I submit that no naive user is going to get any closer to a proper
> understanding of algorithmic Order behavior from this small hint. Which
> leaves no reason to prefer an Exception to an optimization.
>
> One other point: perhaps this will lead a naive user into thinking --
> "sum() raises an exception if I try to use it inefficiently, so it must be
> OK to use for anything that doesn't raise an exception" -- that would be a
> bad lesson to mis-learn

AOL to that.

Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«

Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Steven D'Aprano

On Tue, Aug 12, 2014 at 08:04:35AM -0500, Ian Cordasco wrote:

> I think by introducing parentheses we are going to risk seriously
> confusing users who may then try to write an assignment like
> 
> a = (open('spam') as spam, open('eggs') as eggs)

Seriously?

If they try it, they will get a syntax error. Now, admittedly Python's 
syntax error messages tend to be terse and cryptic, but it's still 
enough to show that you can't do that.

py> a = (open('spam') as spam, open('eggs') as eggs)
  File "<stdin>", line 1
    a = (open('spam') as spam, open('eggs') as eggs)
                      ^
SyntaxError: invalid syntax

I don't see this as a problem. There's no limit to the things that 
people *might* do if they don't understand Python semantics:

for module in sys, math, os,
    import module

(and yes, I once tried this as a beginner) but they try it once, realise 
it doesn't work, and never do it again.

> Because it looks like a tuple but isn't and I think the extra
> complexity this would add to the language would not be worth the
> benefit. 

Do we have a problem with people thinking that, since tuples are
normally interchangeable with lists, they can write this?

from module import [fe, fi, fo, fum,
                    spam, eggs, cheese]

and then being "seriously confused" by the syntax error they receive? Or
writing this?

from (module import fe, fi, fo, fum,
      spam, eggs, cheese)

It's not sufficient that people might try it, see it fails, and move on. 
Your claim is that it will cause serious confusion. I just don't see 
that happening.

> If we simply look at Ruby for what happens when you have an
> overloaded syntax that means two different things, you can see why I'm
> against modifying this syntax. 

That ship has sailed in Python, oh, 20+ years ago. Parens are used for
grouping, for tuples[1], for function calls, for parameter lists, class
base-classes, generator expressions and line continuations. I cannot
think of any examples where these multiple uses for parens have caused
meaningful confusion, and I don't think this one will either.

[1] Technically not, since it's the comma, not the ( ), which makes a
tuple, but a lot of people don't know that and treat it as if the
parens were compulsory.

-- 
Steven

Re: [Python-Dev] sum(...) limitation

2014-08-12 Thread Stephen J. Turnbull

Redirecting to python-ideas, so trimming less than I might.

Chris Barker writes:
 > On Mon, Aug 11, 2014 at 11:07 PM, Stephen J. Turnbull 
 > wrote:
 > 
 > > I'm referring to removing the unnecessary information that there's a
 > >  better way to do it, and simply raising an error (as in Python 3.2,
 > > say) which is all a RealProgrammer[tm] should ever need!
 > >
 > 
 > I can't imagine anyone is suggesting that -- disallow it, but don't tell
 > anyone why?

As I said, it's a regression.  That's exactly the behavior in Python 3.2.

 > The only thing that is remotely on the table here is:
 > 
 > 1) remove the special case for strings -- buyer beware -- but consistent
 > and less "ugly"

It's only consistent if you believe that Python has strict rules for
use of various operators.  It doesn't, except as far as they are
constrained by precedence.  For example, I have an application where I
add bytestrings bytewise modulo N <= 256, and concatenate them.  In
fact I use function call syntax, but the obvious operator syntax is
'+' for the bytewise addition, and '*' for the concatenation.

It's not in the Zen, but I believe in the maxim "If it's worth doing,
it's worth doing well."  So for me, 1) is out anyway.

 > 2) add a special case for strings that is fast and efficient -- may be as
 > simple as calling "".join() under the hood --no more code than the
 > exception check.

Sure, but what about all the other immutable containers with __add__
methods?  What about mappings with key-wise __add__ methods whose
values might be immutable but have __add__ methods?  Where do you stop
with the special-casing?  I consider this far more complex and ugly
than the simple "sum() is for numbers" rule (and even that is way too
complex considering accuracy of summing floats).

 > And I doubt anyone really is pushing for anything but (2)

I know that, but I think it's the wrong solution to the problem (which
is genuine IMO).  The right solution is something generic, possibly a
__sum__ method.  The question is whether that leads to too much work
to be worth it (eg, "homogeneous_iterable").

 > > Because obviously we'd want the attractive nuisance of "if you
 > > have __add__, there's a default definition of __sum__"
 > 
 > now I'm confused -- isn't that exactly what we have now?

Yes and my feeling (backed up by arguments that I admit may persuade
nobody but myself) is that what we have now kinda sucks[tm].  It
seemed like a good idea when I first saw it, but then, my apps don't
scale to where the pain starts in my own usage.

 > > It's possible that Python could provide some kind of feature that
 > > would allow an optimized sum function for every type that has
 > > __add__, but I think this will take a lot of thinking.
 > 
 > does it need to be every type? As it is the common ones work fine already
 > except for strings -- so if we add an optimized string sum() then we're
 > done.

I didn't say provide an optimized sum(), I said provide a feature
enabling people who want to optimize sum() to do so.  So yes, it needs
to be every type (the optional __sum__ method is a proof of concept,
modulo it actually being implementable ;-).

 > > *Somebody* will do it (I don't think anybody is +1 on restricting
 > > sum() to a subset of types with __add__).
 > 
 > uhm, that's exactly what we have now

Exactly.  Who's arguing that the sum() we have now is a ticket to
Paradise?  I'm just saying that there's probably somebody out there
negative enough on the current situation to come up with an answer
that I think is general enough (and I suspect that python-dev
consensus is that demanding, too).

 > sum() can be used for any type that has an __add__ defined.

I'd like to see that be mutable types with __iadd__.

 > What I fail to see is why it's better to raise an exception and
 > point users to a better way, than to simply provide an optimization
 > so that it's a mute issue.

Because inefficient sum() is an attractive nuisance, easy to overlook,
and likely to bite users other than the author.

 > The only justification offered here is that will teach people that summing
 > strings (and some other objects?)

Summing tuples works (with appropriate start=tuple()).  Haven't
benchmarked, but I bet that's O(N^2).
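
A small sketch of the tuple case, not a benchmark. Each `+` on tuples builds a new tuple containing everything accumulated so far, which is where the suspected quadratic cost would come from; itertools.chain is shown as one way to get the same result in a single pass. The `chunks` data is made up for illustration:

```python
from itertools import chain

chunks = [(1, 2), (3,), (4, 5, 6)]

# Works, but every step copies the running total into a new tuple.
summed = sum(chunks, ())

# Flattening with chain.from_iterable visits each item once.
chained = tuple(chain.from_iterable(chunks))

print(summed == chained)
```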

 > is order(N^2) and a bad idea. But:
 > 
 > a) Python's primary purpose is practical, not pedagogical (not that it
 > isn't great for that)

My argument is that in practical use sum() is a bad idea, period,
until you book up on the types and applications where it *does* work.
N.B. It doesn't even work properly for numbers (inaccurate for floats).
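
The float inaccuracy can be seen with plain left-to-right addition. The sketch below uses an explicit loop rather than sum() itself, on the assumption that the loop is the behaviour meant here; recent CPython releases are documented as using a more accurate algorithm for floats inside sum(), so the built-in may not reproduce it:

```python
import math

values = [0.1] * 10

# Naive left-to-right accumulation: rounding error builds up at each step.
naive = 0.0
for value in values:
    naive += value

# math.fsum tracks the lost low-order bits and rounds only once.
accurate = math.fsum(values)

print(naive, accurate)
```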

 > b) I doubt any naive users learn anything other than "I can't use sum() for
 > strings, I should use "".join()".

For people who think that special-casing strings is a good idea, I
think this is about as much benefit as you can expect.  Why go
farther?<0.5 wink/>

 > I submit that no naive user is going to get any closer to a proper
 > understanding of algorithmic Order behavior from this
