Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-26 Thread Steven D'Aprano
On Fri, Jun 27, 2014 at 03:07:46AM +0300, Paul Sokolovsky wrote:

> With my MicroPython hat on, os.scandir() would make things only worse.
> With current interface, one can either have inefficient implementation
> (like CPython chose) or efficient implementation (like MicroPython
> chose) - all transparently. os.scandir() supposedly opens up efficient
> implementation for everyone, but at the price of bloating API and
> introducing heavy-weight objects to wrap info. 

os.scandir is not part of the Python language; it is not a built-in function. 
It is part of the CPython standard library. That means (in my opinion) 
that there is an expectation that other Pythons should provide it, but 
not an absolute requirement. Especially for the os module, which by 
definition is platform-specific. In my opinion that means you have four 
options:

1. provide os.scandir, with exactly the same semantics as on CPython;

2. provide os.scandir, but change its semantics to be more lightweight 
   (e.g. return an ordinary tuple, as you already suggest);

3. don't provide os.scandir at all; or

4. do something different depending on whether the platform is Linux
   or an embedded system.

I would consider any of those acceptable for a library feature, but not 
for a language feature.
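
Option 2 above is straightforward to sketch. The following is a minimal, illustrative implementation built on the existing os.listdir/os.lstat API, yielding plain tuples instead of heavyweight DirEntry objects; the tuple layout and the name scandir_lite are my own invention, not part of any proposal:

```python
import os
import stat

def scandir_lite(path='.'):
    """Yield (name, is_dir, stat_result) tuples for entries in path.

    A lightweight stand-in for os.scandir(): no DirEntry wrapper
    objects, just the data from a single lstat() per entry.
    """
    for name in os.listdir(path):
        st = os.lstat(os.path.join(path, name))
        yield name, stat.S_ISDIR(st.st_mode), st
```

An implementation like this stays transparent to small platforms, at the cost of giving up the Windows-specific optimisation that motivates the PEP.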


[...]
> But reusing os.stat struct is glaringly not what's proposed. And
> it's clear where that comes from - "[DirEntry.]lstat(): like os.lstat(),
> but requires no system calls on Windows". Nice, but OS "FooBar" can do
> much more than Windows - it has a system call to send a file by email,
> right when scanning a directory containing it. So, why not to have
> DirEntry.send_by_email(recipient) method? I hear the answer - it's
> because CPython strives to support Windows well, while doesn't care
> about "FooBar" OS.

Correct. If there is sufficient demand for FooBar, then CPython may 
support it. Until then, FooBarPython can support it, and offer whatever 
platform-specific features are needed within its standard library.


> And then it again leads to the question I posed several times - where's
> line between "CPython" and "Python"? Is it grounded for CPython to add
> (or remove) to Python stdlib something which is useful for its users,
> but useless or complicating for other Python implementations?

I think so. And other implementations are free to do the same thing.

Of course there is an expectation that the standard library of most 
implementations will be broadly similar, but not that they will be 
identical.

I am surprised that both Jython and IronPython offer a non-functioning 
dis module: you can import it successfully, but if there's a way to 
actually use it, I haven't found it:


steve@orac:~$ jython
Jython 2.5.1+ (Release_2_5_1, Aug 4 2010, 07:18:19)
[OpenJDK Server VM (Sun Microsystems Inc.)] on java1.6.0_27
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis(lambda x: x+1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/share/jython/Lib/dis.py", line 42, in dis
    disassemble(x)
  File "/usr/share/jython/Lib/dis.py", line 64, in disassemble
    linestarts = dict(findlinestarts(co))
  File "/usr/share/jython/Lib/dis.py", line 183, in findlinestarts
    byte_increments = [ord(c) for c in code.co_lnotab[0::2]]
AttributeError: 'tablecode' object has no attribute 'co_lnotab'


IronPython gives a different exception:

steve@orac:~$ ipy
IronPython 2.6 Beta 2 DEBUG (2.6.0.20) on .NET 2.0.50727.1433
Type "help", "copyright", "credits" or "license" for more information.
>>> import dis
>>> dis.dis(lambda x: x+1)
Traceback (most recent call last):
TypeError: don't know how to disassemble code objects


It's quite annoying, I would have rather that they just removed the 
module altogether. Better still would have been to disassemble code 
objects to whatever byte code the Java and .Net platforms use. But 
there's surely no requirement to disassemble to CPython byte code!
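
For comparison, CPython's dis module handles the same lambda without complaint. A quick sketch (opcode names vary between CPython versions, e.g. BINARY_ADD in older releases versus BINARY_OP in newer ones):

```python
import dis

# Disassemble a small lambda into CPython bytecode instruction names.
instructions = [ins.opname for ins in dis.get_instructions(lambda x: x + 1)]
print(instructions)
```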



-- 
Steven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-26 Thread Steven D'Aprano
On Thu, Jun 26, 2014 at 09:37:50PM -0400, Ben Hoyt wrote:
> I don't mind iterdir() and would take it :-), but I'll just say why I
> chose the name scandir() -- though it wasn't my suggestion originally:
> 
> iterdir() sounds like just an iterator version of listdir(), kinda
> like keys() and iterkeys() in Python 2. Whereas in actual fact the
> return values are quite different (DirEntry objects vs strings), and
> so the name change reflects that difference a little.

+1 

I think that's a good objective reason to prefer scandir, which suits 
me, because my subjective opinion is that "iterdir" is an inelegant 
and less than attractive name.


-- 
Steven


Re: [Python-Dev] PEP 471 -- os.scandir() function -- a better and faster directory iterator

2014-06-29 Thread Steven D'Aprano
On Sat, Jun 28, 2014 at 03:55:00PM -0400, Ben Hoyt wrote:
> Re is_dir etc being properties rather than methods:
[...]
> The problem with this is that properties "look free", they look just
> like attribute access, so you wouldn't normally handle exceptions when
> accessing them. But .lstat() and .is_dir() etc may do an OS call, so
> if you're needing to be careful with error handling, you may want to
> handle errors on them. Hence I think it's best practice to make them
> functions().

I think this one could go either way. Methods look like they actually 
re-test the value each time you call it. I can easily see people not 
realising that the value is cached and writing code like this toy 
example:


# Detect a file change.
t = the_file.lstat().st_mtime
while the_file.lstat().st_mtime == t:
 sleep(0.1)
print("Changed!")


I know that's not the best way to detect file changes, but I'm sure 
people will do something like that and not realise that the call to 
lstat is cached.
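
One way to sidestep the stale-cache trap in the toy example above is to bypass the cached entry entirely and call os.stat() afresh on each iteration. A minimal sketch (still polling, still a toy):

```python
import os
import time

def wait_for_change(path, interval=0.1):
    """Block until path's modification time changes.

    Calls os.stat() directly each time round the loop, so every
    check is a fresh system call -- nothing is cached.
    """
    t = os.stat(path).st_mtime
    while os.stat(path).st_mtime == t:
        time.sleep(interval)
```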

Personally, I would prefer a property. If I forget to wrap a call in a 
try...except, it will fail hard and I will get an exception. But with a 
method call, the failure is silent and I keep getting the cached result.

Speaking of caching, is there a way to freshen the cached values?


-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Steven D'Aprano
On Mon, Jul 07, 2014 at 04:52:17PM -0700, Ethan Furman wrote:
> On 07/07/2014 04:49 PM, Benjamin Peterson wrote:
> >
> >Probably the best argument for the behavior is that "x is y" should
> >imply "x == y", which precludes raising an exception. No such invariant
> >is desired for ordering, so default implementations of < and > are not
> >provided in Python 3.
> 
> Nice.  This bit should definitely make it into the doc patch if not already 
> in the docs.

However, saying this should not preclude classes where this is not the 
case, e.g. IEEE-754 NANs. I would not like this wording (which otherwise 
is very nice) to be used in the future to force reflexivity on object 
equality.

https://en.wikipedia.org/wiki/Reflexive_relation

To try to cut off arguments:

- Yes, it is fine to have the default implementation of __eq__ 
  assume reflexivity.

- Yes, it is fine for standard library containers (lists, dicts,
  etc.) to assume reflexivity of their items.

- I'm fully aware that some people think the non-reflexivity of 
  NANs is logically nonsensical and a mistake. I do not agree 
  with them.

- I'm not looking to change anything here, the current behaviour
  is fine, I just want to ensure that an otherwise admirable doc
  change does not get interpreted in the future in a way that 
  prevents classes from defining __eq__ to be non-reflexive.
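
A class is already free to opt out of reflexivity, just as float NANs do. A quick sketch with a toy class of my own:

```python
class NonReflexive:
    """Toy class whose instances never compare equal, even to themselves."""
    def __eq__(self, other):
        return False

x = NonReflexive()
print(x == x)    # False: equality is non-reflexive
print(x in [x])  # True: containers check identity before equality
```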



-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x - summary

2014-07-07 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 01:53:06AM +0200, Andreas Maier wrote:
> Thanks to all who responded.
> 
> In absence of class-specific equality test methods, the default 
> implementations revert to use the identity (=address) of the object as a 
> basis for the test, in both Python 2 and Python 3.

Scrub out the "= address" part. Python does not require that objects 
even have an address, that is not part of the language definition. (If I 
simulate a Python interpreter in my head, what is the address of the 
objects?) CPython happens to use the address of objects as their 
identity, but that is an implementation-specific trick, not a language 
guarantee, and it is documented as such. Neither IronPython nor Jython 
use the address as ID.


> In absence of specific ordering test methods, the default 
> implementations revert to use the identity (=address) of the object as a 
> basis for the test, in Python 2. 

I don't think that is correct. This is using Python 2.7:

py> a = (1, 2)
py> b = "Hello World!"
py> id(a) < id(b)
True
py> a < b
False

And just to be sure that neither a nor b are controlling this:

py> a.__lt__(b)
NotImplemented
py> b.__gt__(a)
NotImplemented


So the identity of the instances a and b are not used for < , although 
the identity of their types may be:

py> id(type(a)) < id(type(b))
False


Using the identity of the instances would be silly, since that would 
mean that sorting a list of mixed types would depend on the items' 
history, not their values.


> In Python 3, an exception is raised in that case.

I don't think the ordering methods are terribly relevant to the 
behaviour of equals.
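
For the record, the Python 3 behaviour is easy to show directly (a sketch):

```python
# Python 3 raises TypeError for ordering comparisons between
# unrelated types, rather than falling back on any default ordering.
try:
    (1, 2) < "Hello World!"
    raised = False
except TypeError:
    raised = True
print(raised)  # True
```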


> The bottom line of the discussion seems to be that this behavior is 
> intentional, and a lot of code depends on it.
> 
> We still need to figure out how to document this. Options could be:

I'm not sure it needs to be documented other than to say that the 
default object.__eq__ compares by identity. Everything else is, in my 
opinion, over-thinking it.
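
That one-line rule can be demonstrated directly (a sketch):

```python
class Plain:
    pass  # inherits object.__eq__, which compares by identity

a, b = Plain(), Plain()
print(a == a)  # True: same object
print(a == b)  # False: distinct objects, regardless of contents
```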


> 1. We define that the default for the value of an object is its 
> identity. That allows to describe the behavior of the equality test 
> without special casing such objects, but it does not work for ordering. 

Why does it need to work for ordering? Not all values define ordering 
relations.

Unlike type and identity, "value" does not have a single concrete 
definition, it depends on the class designer. In the case of object, the 
value of an object instance is itself, i.e. its identity. I don't think 
we need more than that.



-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-07 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 02:59:30AM +0100, Rob Cliffe wrote:

> >- "*Every object has an identity, a type and a value.*"
>
> Hm, is that *really* true?

Yes. It's pretty much true by definition: objects are *defined* to have 
an identity, type and value, even if that value is abstract rather than 
concrete.


> Every object has an identity and a type, sure.
> Every *variable* has a value, which is an object (an instance of some 
> class).  (I think? :-) )

I don't think so. Variables can be undefined, which means they don't 
have a value:

py> del x
py> print x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'x' is not defined


> But ISTM that the notion of the value of an *object* exists more in our 
> minds than in Python.

Pretty much. How could it be otherwise? Human beings define the 
semantics of objects, that is, their value, not Python.


[...]
> If I came across an int object and had no concept of what an integer 
> number was, how would I know what its "value" is supposed to be?

You couldn't, any more than you would know what the value of a Watzit 
object was if you knew nothing about Watzits. The value of an object is 
intimately tied to its semantics, what the object represents and what it 
is intended to be used for. In general, we can say nothing about the 
value of an object until we've read the documentation for the object.

But we can be confident that the object has *some* value, otherwise what 
would be the point of it? In some cases, that value might be nothing 
more than its identity, but that's okay.

I think the problem we're having here is that some people are looking 
for a concrete definition of what the value of an object is, but there 
isn't one.


[...]
> And can the following *objects* (class instances) be said to have a 
> (obvious) value?
> obj1 = object()
> def obj2(): pass
> obj3 = (x for x in range(3))
> obj4 = xrange(4)

The value as understood by a human reader, as opposed to the value as 
assumed by Python, is not necessarily the same. As far as Python is 
concerned, the value of all four objects is the object itself, i.e. its 
identity. (For avoidance of doubt, not its id(), which is just a 
number.) 

A human reader could infer more than Python:

- the second object is a "do nothing" function;
- the third object is a lazy sequence (0, 1, 2);
- the fourth object is a lazy sequence (0, 1, 2, 3);

but since the class designer didn't deem it important enough, or 
practical enough, to implement an __eq__ method that takes those things 
into account, *for the purposes of equality* (but perhaps not other 
purposes) we say that the value is just the object itself, its identity.



> And is there any sensible way of comparing two such similar objects, e.g.
> obj3  = (x for x in range(3))
> obj3a = (x for x in range(3))
> except by id?

In principle, one might peer into the two generators and note that they 
perform exactly the same computations on exactly the same input, and 
therefore should be deemed to have the same value. But since that's 
hard, and "exactly the same" is not always well-defined, Python doesn't 
try to be too clever and just uses a simpler idea: the value is the 
object itself.


> Well, possibly in some cases.  You might define two functions as equal 
> if their code objects are identical (I'm outside my competence here, so 
> please no-one correct me if I've got the technical detail wrong).  But I 
> don't see how you can compare two generators (other than by id) except 
> by calling them both destructively (possibly an infinite number of 
> times, and hoping that neither has unpredictable behaviour, side 
> effects, etc.).

Generator objects have code objects as well.

py> x = (a for a in (1, 2))
py> x.gi_code
<code object <genexpr> at 0xb7ee39f8, file "<stdin>", line 1>
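
In current CPython the same inspection can be written as (a sketch):

```python
# A generator expression carries a code object, reachable via gi_code.
g = (a for a in (1, 2))
print(g.gi_code.co_name)  # <genexpr>
```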

> >- "An object's /identity/ never changes once it has been created;  
> >The /value/ of some objects can change. Objects whose value can change 
> >are said to be /mutable/; objects whose value is unchangeable once 
> >they are created are called /immutable/."
>
> ISTM it needs to be explicitly documented for each class what the 
> "value" of an instance is intended to be.

Why? What value (pun intended) is there in adding an explicit statement 
of value to every single class?

"The value of a str is the str's sequence of characters."
"The value of a list is the list's sequence of items."
"The value of an int is the int's numeric value."
"The value of a float is the float's numeric value, or in the case of 
 INFs and NANs, that they are an INF or NAN."
"The value of a complex number is the ordered pair of its real and 
 imaginary components."
"The value of a re MatchObject is the MatchObject itself."

I don't see any benefit to forcing all classes to explicitly document 
this sort of thing. It's nearly always redundant and unnecessary.



-- 
Steven

Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 04:53:50PM +0900, Stephen J. Turnbull wrote:
> Chris Angelico writes:
> 
>  > The reason NaN isn't equal to itself is because there are X bit
>  > patterns representing NaN, but an infinite number of possible
>  > non-numbers that could result from a calculation.
> 
> I understand that.  But you're missing at least two alternatives that
> involve raising on some calculations involving NaN, as well as the
> fact that forcing inequality of two NaNs produced by equivalent
> calculations is arguably just as wrong as allowing equality of two
> NaNs produced by the different calculations.  

I don't think so. Floating point == represents *numeric* equality, not
(for example) equality in the sense of "All Men Are Created Equal". Not
even numeric equality in the most general sense, but specifically in the
sense of (approximately) real-valued numbers, so it's an extremely 
precise definition of "equal", not fuzzy in any way.

In an early post, you suggested that NANs don't have a value, or that 
they have a value which is not a value. I don't think that's a good way 
to look at it. I think the obvious way to think of it is that NAN's 
value is Not A Number, exactly like it says on the box. Now, if 
something is not a number, obviously you cannot compare it numerically:

"Considered as numbers, is the sound of rain on a tin roof
 numerically equal to the sight of a baby smiling?"

Some might argue that the only valid answer to this question is "Mu",

https://en.wikipedia.org/wiki/Mu_%28negative%29#.22Unasking.22_the_question

but if we're forced to give a Yes/No True/False answer, then clearly
False is the only sensible answer. No, Virginia, Santa Claus is not the 
same number as Santa Claus.

To put it another way, if x is not a number, then x != y for all 
possible values of y -- including x.

[Disclaimer: despite the name, IEEE-754 arguably does not intend NANs to 
be Not A Number in the sense that Santa Claus is not a number, but more 
like "it's some number, but it's impossible to tell which". However, 
despite that, the standard specifies behaviour which is best thought of 
in terms of the Santa Claus model.]



> That's where things get
> fuzzy for me -- in Python I would expect that preserving invariants
> would be more important than computational efficiency, but evidently
> it's not.  

I'm not sure what you're referring to here. Is it that containers such 
as lists and dicts are permitted to optimize equality tests with 
identity tests for speed?

py> NAN = float('NAN')
py> a = [1, 2, NAN, 4]
py> NAN in a  # identity is checked before equality
True
py> any(x == NAN for x in a)
False


When this came up for discussion last time, the clear consensus was that 
this is reasonable behaviour. NANs and other such "weird" objects are 
too rare and too specialised for built-in classes to carry the burden of 
having to allow for them. If you want a "NAN-aware list", you can make 
one yourself.


> I assume that I would have a better grasp on why Python
> chose to go this way rather than that if I understood IEEE 754 better.

See the answer by Stephen Canon here:

http://stackoverflow.com/questions/1565164/

[quote]

It is not possible to specify a fixed-size arithmetic type that 
satisfies all of the properties of real arithmetic that we know and 
love. The 754 committee has to decide to bend or break some of them. 
This is guided by some pretty simple principles:

When we can, we match the behavior of real arithmetic.
When we can't, we try to make the violations as predictable and as 
easy to diagnose as possible.

[end quote]


In particular, reflexivity for NANs was dropped for a number of reasons, 
some stronger than others:

- One of the weaker reasons for NAN non-reflexivity is that it preserved
  the identity x == y <=> x - y == 0. Although that is the cornerstone 
  of real arithmetic, it's violated by IEEE-754 INFs, so violating it
  for NANs is not a big deal either.

- Dropping reflexivity preserves the useful property that NANs compare 
  unequal to everything.

- Practicality beats purity: dropping reflexivity allowed programmers
  to identify NANs without waiting years or decades for programming 
  languages to implement isnan() functions. E.g. before Python had 
  math.isnan(), I made my own:

  def isnan(x):
      return isinstance(x, float) and x != x

- Keeping reflexivity for NANs would have implied some pretty nasty
  things, e.g. if log(-3) == log(-5), then -3 == -5.


Basically, and I realise that many people disagree with their decision 
(notably Bertrand Meyer of Eiffel fame, and our own Mark Dickenson), the 
IEEE-754 committee led by William Kahan decided that the problems caused 
by having NANs compare unequal to themselves were much less than the 
problems that would have been caused without it.



-- 
Steven

Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 04:58:33PM +0200, Anders J. Munch wrote:

> For two NaNs computed differently to compare equal is no worse than 2+2 
> comparing equal to 1+3.  You're comparing values, not their history.

a = -23
b = -42
if log(a) == log(b):
    print "a == b"


-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x

2014-07-08 Thread Steven D'Aprano
On Tue, Jul 08, 2014 at 06:33:31PM +0100, MRAB wrote:

> The log of a negative number is a complex number.

Only in complex arithmetic. In real arithmetic, the log of a negative 
number isn't a number at all.

-- 
Steven


Re: [Python-Dev] == on object tests identity in 3.x - list delegation to members?

2014-07-13 Thread Steven D'Aprano
On Sun, Jul 13, 2014 at 05:13:20PM +0200, Andreas Maier wrote:

> Second, if not by delegation to equality of its elements, how would the 
> equality of sequences defined otherwise?

Wow. I'm impressed by the amount of detailed effort you've put into 
investigating this. (Too much detail to absorb, I'm afraid.) But perhaps 
you might have just asked on the python-l...@python.org mailing list, or 
here, where we would have told you the answer:

list __eq__ first checks element identity before going on
to check element equality.


If you can read C, you might like to check the list source code:

http://hg.python.org/cpython/file/22e5a85ba840/Objects/listobject.c

but if I'm reading it correctly, list.__eq__ conceptually looks 
something like this:

def __eq__(self, other):
if not isinstance(other, list):
return NotImplemented
if len(other) != len(self):
return False
for a, b in zip(self, other):
if not (a is b or a == b):
return False
return True

(The actual code is a bit more complex than that, since there is a 
single function, list_richcompare, which handles all the rich 
comparisons.)

The critical test is PyObject_RichCompareBool here:

http://hg.python.org/cpython/file/22e5a85ba840/Objects/object.c

which explicitly says:

/* Quick result when objects are the same.
   Guarantees that identity implies equality. */


[...]
> I added this test only to show that float NaN is a special case,

NANs are not a special case. List __eq__ treats all object types 
identically (pun intended):

py> class X:
... def __eq__(self, other): return False
...
py> x = X()
py> x == x
False
py> [x] == [X()]
False
py> [x] == [x]
True


[...]
> Case 6.c) is the surprising case. It could be interpreted in two ways 
> (at least that's what I found):
> 
> 1) The comparison is based on identity of the float objects. But that is 
> inconsistent with test #4. And why would the list special-case NaN 
> comparison in such a way that it ends up being inconsistent with the 
> special definition of NaN (outside of the list)?

It doesn't. NANs are not special cased in any way.

This was discussed to death some time ago, both on python-dev and 
python-ideas. If you're interested, you can start here:

https://mail.python.org/pipermail/python-list/2012-October/633992.html

which is in the middle of one of the threads, but at least it gets you 
to the right time period.


> 2) The list does not always delegate to element equality, but attempts 
> to optimize if the objects are the same (same identity).

Right! It's not just lists -- I believe that tuples, dicts and sets 
behave the same way.


> We will see 
> later that that happens. Further, when comparing float NaNs of the same 
> identity, the list implementation forgot to special-case NaNs. Which 
> would be a bug, IMHO.

"Forgot"? I don't think the behaviour of list comparisons is an 
accident.

NAN equality is non-reflexive. Very few other things are the same. It 
would be seriously weird if alist == alist could return False. You'll 
note that the IEEE-754 standard has nothing to say about the behaviour 
of Python lists containing NANs, so we're free to pick whatever 
behaviour makes the most sense for Python, and that is to minimise the 
"Gotcha!" factor.

NANs are a gotcha to anyone who doesn't know IEEE-754, and possibly even 
some who do. I will go to the barricades to fight to keep the 
non-reflexivity of NANs *in isolation*, but I believe that Python has 
made the right decision to treat lists containing NANs the same as 
everything else.

NAN == NAN  # obeys IEEE-754 semantics and returns False

[NAN] == [NAN]  # obeys standard expectation that equality is reflexive

This behaviour is not a bug, it is a feature. As far as I am concerned, 
this only needs documenting. If anyone needs list equality to honour the 
special behaviour of NANs, write a subclass or an equal() function.
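
Such a subclass might look like this; a sketch, with the class name my own invention:

```python
class NanAwareList(list):
    """A list whose == delegates purely to element equality, with no
    identity shortcut, so it honours NAN != NAN even for the same object."""
    def __eq__(self, other):
        return (len(self) == len(other)
                and all(a == b for a, b in zip(self, other)))
    __hash__ = None  # mutable, like list

nan = float('nan')
print([nan] == [nan])                # True  (built-in list, identity check)
print(NanAwareList([nan]) == [nan])  # False (element equality only)
```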



-- 
Steven


Re: [Python-Dev] Exposing the Android platform existence to Python modules

2014-08-01 Thread Steven D'Aprano
On Sat, Aug 02, 2014 at 05:53:45AM +0400, Akira Li wrote:

> Python uses os.name, sys.platform, and various functions from `platform`
> module to provide version info:
[...]
> If Android is posixy enough (would `posix` module work on Android?)
> then os.name could be left 'posix'.

Does anyone know what kivy does when running under Android?


-- 
Steven



Re: [Python-Dev] sum(...) limitation

2014-08-02 Thread Steven D'Aprano
On Fri, Aug 01, 2014 at 10:57:38PM -0700, Allen Li wrote:
> On Fri, Aug 01, 2014 at 02:51:54PM -0700, Guido van Rossum wrote:
> > No. We just can't put all possible use cases in the docstring. :-)
> > 
> > 
> > On Fri, Aug 1, 2014 at 2:48 PM, Andrea Griffini  wrote:
> > 
> > help(sum) tells clearly that it should be used to sum numbers and not
> > strings, and with strings actually fails.
> > 
> > However sum([[1,2,3],[4],[],[5,6]], []) concatenates the lists.
> > 
> > Is this to be considered a bug?
> 
> Can you explain the rationale behind this design decision?  It seems
> terribly inconsistent.  Why are only strings explicitly restricted from
> being sum()ed?  sum() should either ban everything except numbers or
> accept everything that implements addition (duck typing).

Repeated list and str concatenation both have quadratic O(N**2) 
performance, but people frequently build up strings with + and rarely do 
the same for lists. String concatenation with + is an attractive 
nuisance for many people, including some who actually know better but 
nevertheless do it. Also, for reasons I don't understand, many people 
dislike or cannot remember to use ''.join.

Whatever the reason, repeated string concatenation is common whereas 
repeated list concatenation is much, much rarer (and repeated tuple 
concatenation even rarer), so sum(strings) is likely to be a land mine 
buried in your code while sum(lists) is not. Hence the decision that 
beginners in particular need to be protected from the mistake of using 
sum(strings) but bothering to check for sum(lists) is a waste of time.

Personally, I wish that sum would raise a warning rather than an 
exception.

As for prohibiting anything except numbers with sum(), that in my 
opinion would be a bad idea. sum(vectors), sum(numeric_arrays), 
sum(angles) etc. should all be allowed. The general sum() built-in 
should accept any type that allows + (unless explicitly black-listed), 
while specialist numeric-only sums could go into modules (like 
math.fsum).
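
For the concrete case that started the thread, the idiomatic linear-time alternative to sum(lists, []) is itertools.chain (a sketch):

```python
from itertools import chain

# Flatten a list of lists in O(N) total work, instead of the
# O(N**2) behaviour of repeated + via sum(lists, []).
lists = [[1, 2, 3], [4], [], [5, 6]]
flat = list(chain.from_iterable(lists))
print(flat)  # [1, 2, 3, 4, 5, 6]
```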



-- 
Steven


Re: [Python-Dev] sum(...) limitation

2014-08-02 Thread Steven D'Aprano
On Sat, Aug 02, 2014 at 10:52:07AM -0400, Alexander Belopolsky wrote:
> On Sat, Aug 2, 2014 at 3:39 AM, Steven D'Aprano  wrote:
> 
> > String concatenation with + is an attractive
> > nuisance for many people, including some who actually know better but
> > nevertheless do it. Also, for reasons I don't understand, many people
> > dislike or cannot remember to use ''.join.
> >
> 
> Since sum() already treats strings as a special case, why can't it simply
> call (an equivalent of) ''.join itself instead of telling the user to do
> it?  It does not matter why "many people dislike or cannot remember to use
> ''.join" - if this is a fact - it should be considered by language
> implementors.

It could, of course, but there is virtue in keeping sum simple, 
rather than special-casing who knows how many different types. If sum() 
tries to handle strings, should it do the same for lists? bytearrays? 
array.array? tuple? Where do we stop?

Ultimately it comes down to personal taste. Some people are going to 
wish sum() tried harder to do the clever thing with more types, some 
people are going to wish it was simpler and didn't try to be clever at 
all.

Another argument against excessive cleverness is that it ties sum() to 
one particular idiom or implementation. Today, the idiomatic and 
efficient way to concatenate a lot of strings is with ''.join, but 
tomorrow there might be a new str.concat() method. Who knows? sum() 
shouldn't have to care about these details, since they are secondary to 
sum()'s purpose, which is to add numbers. Anything else is a 
bonus (or perhaps a nuisance).

So, I would argue that when faced with something that is not a number, 
there are two reasonable approaches for sum() to take:

- refuse to handle the type at all; or
- fall back on simple-minded repeated addition.


By the way, I think this whole argument would have been easily 
side-stepped if + was only used for addition, and & used for 
concatenation. Then there would be no question about what sum() should 
do for lists and tuples and strings: raise TypeError.



-- 
Steven


Re: [Python-Dev] sum(...) limitation

2014-08-04 Thread Steven D'Aprano
On Mon, Aug 04, 2014 at 09:25:12AM -0700, Chris Barker wrote:

> Good point -- I was trying to make the point about .join() vs + for strings
> in an intro python class last year, and made the mistake of having the
> students test the performance.
> 
> You need to concatenate a LOT of strings to see any difference at all --  I
> know that O() of algorithms is unavoidable, but between efficient python
> optimizations and an apparently good memory allocator, it's really a
> practical non-issue.

If only that were the case, but it isn't. Here's a cautionary tale for 
how using string concatenation can blow up in your face:

Chris Withers asks for help debugging HTTP slowness:
https://mail.python.org/pipermail/python-dev/2009-August/091125.html

and publishes some times:
https://mail.python.org/pipermail/python-dev/2009-September/091581.html

(notice that Python was SIX HUNDRED times slower than wget or IE)

and Simon Cross identified the problem:
https://mail.python.org/pipermail/python-dev/2009-September/091582.html

leading Guido to describe the offending code as an embarrassment.

It shouldn't be hard to demonstrate the difference between repeated 
string concatenation and join, all you need do is defeat sum()'s 
prohibition against strings. Run this bit of code, and you'll see a 
significant difference in performance, even with CPython's optimized 
concatenation:

# --- cut ---
class Faker:
    def __add__(self, other):
        return other

x = Faker()
strings = list("Hello World!")
assert ''.join(strings) == sum(strings, x)

from timeit import Timer
setup = "from __main__ import x, strings"
t1 = Timer("''.join(strings)", setup)
t2 = Timer("sum(strings, x)", setup)

print (min(t1.repeat()))
print (min(t2.repeat()))
# --- cut ---


On my computer, using Python 2.7, I find the version using sum is nearly 
4.5 times slower, and with 3.3 about 4.2 times slower. That's with a 
mere twelve substrings, hardly "a lot". I tried running it on IronPython 
with a slightly larger list of substrings, but I got sick of waiting for 
it to finish.

If you want to argue that microbenchmarks aren't important, well, I 
might agree with you in general, but in the specific case of string 
concatenation there's that pesky factor of 600 slowdown in real world 
code to argue with.


> Blocking sum( some_strings) because it _might_ have poor performance seems
> awfully pedantic.

The rationale for explicitly prohibiting strings while merely implicitly 
discouraging other non-numeric types is that beginners, who are least 
likely to understand why their code occasionally and unpredictably 
becomes catastrophically slow, are far more likely to sum strings than 
sum tuples or lists.

(I don't entirely agree with this rationale, I'd prefer a warning rather 
than an exception.)



-- 
Steven


Re: [Python-Dev] sum(...) limitation

2014-08-08 Thread Steven D'Aprano
On Fri, Aug 08, 2014 at 10:20:37PM -0400, Alexander Belopolsky wrote:
> On Fri, Aug 8, 2014 at 8:56 PM, Ethan Furman  wrote:
> 
> > I don't use sum at all, or at least very rarely, and it still irritates me.
> 
> 
> You are not alone.  When I see sum([a, b, c]), I think it is a + b + c, but
> in Python it is 0 + a + b + c.  If we had a "join" operator for strings
> that is different form + - then sure, I would not try to use sum to join
> strings, but we don't.

I've long believed that + is the wrong operator for concatenating 
strings, and that & makes a much better operator. We wouldn't be having 
these interminable arguments about using sum() to concatenate strings 
(and lists, and tuples) if the & operator was used for concatenation and 
+ was only used for numeric addition.


> I have always thought that sum(x) is just a
> shorthand for reduce(operator.add, x), but again it is not so in Python.

The signature of reduce is:

reduce(...)
reduce(function, sequence[, initial]) -> value

so sum() is (at least conceptually) a shorthand for reduce:

def sum(values, initial=0):
    return reduce(operator.add, values, initial)

but that's an implementation detail, not a language promise, and sum() 
is free to differ from that simple version. Indeed, even the public 
interface is different, since sum() prohibits using a string as the 
initial value and only promises to work with numbers. The fact that it 
happens to work with lists and tuples is somewhat of an accident of 
implementation.
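
For concreteness, the conceptual (not actual) equivalence looks like
this -- a sketch only, since the real built-in is implemented in C and
additionally rejects str start values:

```python
from functools import reduce
import operator

def conceptual_sum(values, initial=0):
    # What sum() is "morally" doing; the real built-in also
    # special-cases strings with a TypeError.
    return reduce(operator.add, values, initial)

assert conceptual_sum([1, 2, 3]) == sum([1, 2, 3]) == 6
assert conceptual_sum([[1], [2]], []) == sum([[1], [2]], []) == [1, 2]
```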


> While "sum should only be used for numbers,"  it turns out it is not a
> good choice for floats - use math.fsum.

Correct. And if you (generic you, not you personally) do not understand 
why simple-minded addition of floats is troublesome, then you're going 
to have a world of trouble. Anyone who is disturbed by the question of 
"should I use sum or math.fsum?" probably shouldn't be writing serious 
floating point code at all. Floating point computations are hard, and 
there is simply no escaping this fact.
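
The classic demonstration of why naive float addition is troublesome,
and what math.fsum buys you:

```python
import math

values = [0.1] * 10
print(sum(values))        # 0.9999999999999999 -- rounding error accumulates
print(math.fsum(values))  # 1.0 -- fsum compensates for intermediate rounding
```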


> While "strings are blocked because
> sum is slow," numpy arrays with millions of elements are not.

That's not a good example. Strings are potentially O(N**2), which means 
not just "slow" but *agonisingly* slow, as in taking a week -- no 
exaggeration -- to concat a million strings. If it takes a microsecond to 
concat two strings, then 1e6**2 such concatenations could take over 
eleven days. Slowness of such magnitude might as well be "the process 
has locked up".

In comparison, summing a numpy array with a million entries is not 
really slow in that sense. The time taken is proportional to the number 
of entries, and differs from summing a list only by a constant factor.

Besides, in the case of strings it is quite simple to decide "is the 
initial value a string?", whereas with lists or numpy arrays it's quite 
hard to decide "is the list or array so huge that the user will consider 
this too slow?". What counts as "too slow" depends on the machine it is 
running on, what other processes are running, and the user's mood, and 
leads to the silly result that summing an array of N items succeeds but 
N+1 items doesn't. So in the case of strings, it is easy to make a
blanket prohibition, but in the case of lists or arrays, there is no 
reasonable place to draw the line.
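
To see the difference without sum() in the picture at all, here is a
minimal sketch comparing naive repeated concatenation against ''.join
(exact timings are machine-dependent, and CPython's in-place
concatenation optimisation can mask the effect for local variables):

```python
import timeit

def concat(parts):
    result = ""
    for part in parts:
        result = result + part  # each step can copy everything built so far
    return result

parts = ["x"] * 5000
t_concat = timeit.timeit(lambda: concat(parts), number=10)
t_join = timeit.timeit(lambda: "".join(parts), number=10)
print("concat/join ratio: %.1f" % (t_concat / t_join))
```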


> And try to
> explain to someone that sum(x) is bad on a numpy array, but abs(x) is fine.

I think that's because sum() has to box up each and every element in the 
array into an object, which is wasteful, while abs() can delegate to a 
specialist array.__abs__ method. Although that's not something beginners 
should be expected to understand, no serious Python programmer should be 
confused by this. As a programmer, we should expect to have some 
understanding of our tools, how they work, their limitations, and when 
to use a different tool. That's why numpy has its own version of sum 
which is designed to work specifically on numpy arrays. Use a specialist 
tool for a specialist job:

py> with Stopwatch():
... sum(carray)  # carray is a numpy array of 7500 floats.
...
11250.0
time taken: 52.659770 seconds
py> with Stopwatch():
... numpy.sum(carray)
...
11250.0
time taken: 0.161263 seconds


>  Why have builtin sum at all if its use comes with so many caveats?

Because sum() is a perfectly reasonable general purpose tool for adding 
up small amounts of numbers where high floating point precision is not 
required. It has been included as a built-in because Python comes with 
"batteries included", and a basic function for adding up a few numbers 
is an obvious, simple battery. But serious programmers should be 
comfortable with the idea that you use the right tool for the right job.

If you visit a hardware store, you will find that even something as 
simple as the hammer exists in many specialist varieties. There are tack 
hammers, claw hammers, framing hammers, lump hammers, rubber and wooden 
mallets, "brass" non-sparking 

Re: [Python-Dev] class Foo(object) vs class Foo: should be clearly explained in python 2 and 3 doc

2014-08-09 Thread Steven D'Aprano
On Sat, Aug 09, 2014 at 02:44:10PM -0400, John Yeuk Hon Wong wrote:
> Hi.
> 
> Referring to my discussion on [1] and then on #python this afternoon.
> 
> A little background would help people to understand where this was 
> coming from.
> 
> 1. I write Python 2 code and have done zero Python-3 specific code.
> 2. I have always been using class Foo(object) so I do not know the new 
> style is no longer required in Python 3. I feel "stupid" and "wrong" by 
> thinking (object) is still a convention in Python 3.

But object is still a convention in Python 3.

It is certainly required when writing code that will behave the same in 
version 2 and 3, and it's optional in 3-only code, but certainly not 
frowned upon or discouraged. There's nothing wrong with explicitly 
inheriting from object in Python 3, and with the Zen of Python "Explicit 
is better than implicit" I would argue that *leaving it out* should be 
very slightly discouraged.

class Spam:  # okay, but a bit lazy
class Spam(object):  # better

Perhaps PEP 8 should make a recommendation, but if so, I think it should 
be a very weak one. In Python 3, it really doesn't matter which you 
write. My own personal practice is to explicitly inherit from object 
when the class is "important" or more than half a dozen lines, and leave 
it out if the class is a stub or tiny.
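
In Python 3 the two spellings are exactly equivalent, which is easy to
verify (the Python 2 behaviour noted in the comments is from memory):

```python
class Implicit:          # Python 3: still a new-style class
    pass

class Explicit(object):  # identical in Python 3
    pass

assert Implicit.__mro__ == (Implicit, object)
assert type(Implicit) is type(Explicit) is type
# Under Python 2, `class Implicit:` would instead be a classic class:
# type(Implicit) would be <type 'classobj'>, and descriptors, super()
# and __mro__ would not work on it.
```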


> 3. Many Python 2 tutorials do not use object as the base class whether 
> for historical reason, or lack of information/education, and can cause 
> confusing to newcomers searching for answers when they consult the 
> official documentation.

We can't do anything about third party tutorials :-(


> While Python 3 code no longer requires object be the base class for the 
> new-style class definition, I believe (object) is still required if one 
> has to write a 2-3 compatible code. But this was not explained or warned 
> anywhere in Python 2 and Python 3 code, AFAIK. (if I am wrong, please 
> correct me)

It's not *always* required, only if you use features which require 
new-style classes, e.g. super, or properties.


> I propose the followings:
> 
> * It is desirable to state boldly to users that (object) is no longer 
> needed in Python-3 **only** code 

I'm against that. Stating this boldly will be understood by some readers 
that object should not be used, and I'm strongly against that. I believe 
explicitly inheriting from object should be mildly preferred, not 
strongly discouraged.


> and warn users to revert to (object) 
> style if the code needs to be 2 and 3 compatible.

I don't think that should be necessary, but have no objections to it 
being mentioned. I think it should be obvious: if you need new-style 
behaviour in Python 2, then obviously you have to inherit from object 
otherwise you have a classic class. That requirement doesn't go away 
just because your code will sometimes run under Python 3.


Looking at your comment here:

> [1]: https://news.ycombinator.com/item?id=8154471

there is a reply from zeckalpha, who says:

   "Actually, leaving out `object` is the preferred convention for 
Python 3, as they are semantically equivalent."

How does (s)he justify this claim?

   "Explicit is better than implicit."

which is not logical. If you leave out `object`, that's implicit, not 
explicit.


-- 
Steven


Re: [Python-Dev] class Foo(object) vs class Foo: should be clearly explained in python 2 and 3 doc

2014-08-10 Thread Steven D'Aprano
On Sun, Aug 10, 2014 at 11:51:51AM -0400, Alexander Belopolsky wrote:
> On Sat, Aug 9, 2014 at 8:44 PM, Steven D'Aprano  wrote:
> 
> > It is certainly required when writing code that will behave the same in
> > version 2 and 3
> >
> 
> This is not true.  An alternative is to put
> 
> __metaclass__ = type
> 
> at the top of your module to make all classes in your module new-style in
> python2.

So it is. I forgot about that, thank you for the correction.


-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Steven D'Aprano
On Tue, Aug 12, 2014 at 10:28:14AM +1000, Nick Coghlan wrote:
> On 12 Aug 2014 09:09, "Allen Li"  wrote:
> >
> > This is a problem I sometimes run into when working with a lot of files
> > simultaneously, where I need three or more `with` statements:
> >
> > with open('foo') as foo:
> > with open('bar') as bar:
> > with open('baz') as baz:
> > pass
> >
> > Thankfully, support for multiple items was added in 3.1:
> >
> > with open('foo') as foo, open('bar') as bar, open('baz') as baz:
> > pass
> >
> > However, this begs the need for a multiline form, especially when
> > working with three or more items:
> >
> > with open('foo') as foo, \
> >  open('bar') as bar, \
> >  open('baz') as baz, \
> >  open('spam') as spam \
> >  open('eggs') as eggs:
> > pass
> 
> I generally see this kind of construct as a sign that refactoring is
> needed. For example, contextlib.ExitStack offers a number of ways to manage
> multiple context managers dynamically rather than statically.

I don't think that ExitStack is the right solution for when you have a 
small number of context managers known at edit-time. The extra effort of 
writing your code, and reading it, in a dynamic manner is not justified. 
Compare the natural way of writing this:

with open("spam") as spam, open("eggs", "w") as eggs, \
        frobulate("cheese") as cheese:
    # do stuff with spam, eggs, cheese

versus the dynamic way:

with ExitStack() as stack:
    spam, eggs = [stack.enter_context(open(fname, mode))
                  for fname, mode in zip(("spam", "eggs"), ("r", "w"))]
    cheese = stack.enter_context(frobulate("cheese"))
    # do stuff with spam, eggs, cheese

I prefer the first, even with the long line.
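
Where ExitStack genuinely earns its keep is when the number of context
managers is only known at run time -- a sketch (read_all is a made-up
helper name):

```python
from contextlib import ExitStack

def read_all(filenames):
    """Open any number of files; all are closed together on exit."""
    with ExitStack() as stack:
        files = [stack.enter_context(open(fn)) for fn in filenames]
        return [f.read() for f in files]
```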


-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-12 Thread Steven D'Aprano
On Tue, Aug 12, 2014 at 08:04:35AM -0500, Ian Cordasco wrote:

> I think by introducing parentheses we are going to risk seriously
> confusing users who may then try to write an assignment like
> 
> a = (open('spam') as spam, open('eggs') as eggs)

Seriously?

If they try it, they will get a syntax error. Now, admittedly Python's 
syntax error messages tend to be terse and cryptic, but it's still 
enough to show that you can't do that.

py> a = (open('spam') as spam, open('eggs') as eggs)
  File "<stdin>", line 1
    a = (open('spam') as spam, open('eggs') as eggs)
                      ^
SyntaxError: invalid syntax

I don't see this as a problem. There's no limit to the things that 
people *might* do if they don't understand Python semantics:

for module in sys, math, os:
    import module

(and yes, I once tried this as a beginner) but they try it once, realise 
it doesn't work, and never do it again.

 
> Because it looks like a tuple but isn't and I think the extra
> complexity this would add to the language would not be worth the
> benefit. 

Do we have a problem with people thinking that, since tuples are 
normally interchangable with lists, they can write this?

from module import [fe, fi, fo, fum,
spam, eggs, cheese]


and then being "seriously confused" by the syntax error they receive? Or 
writing this?

from (module import fe, fi, fo, fum,
spam, eggs, cheese)


It's not sufficient that people might try it, see it fails, and move on. 
Your claim is that it will cause serious confusion. I just don't see 
that happening.


> If we simply look at Ruby for what happens when you have an
> overloaded syntax that means two different things, you can see why I'm
> against modifying this syntax. 

That ship has sailed in Python, oh, 20+ years ago. Parens are used for 
grouping, for tuples[1], for function calls, for parameter lists, class 
base-classes, generator expressions and line continuations. I cannot 
think of any examples where these multiple uses of parens have caused 
meaningful confusion, and I don't think this one will either.


[1] Technically not, since it's the comma, not the ( ), which makes a 
tuple, but a lot of people don't know that and treat it as if the 
parens were compulsory.


-- 
Steven


Re: [Python-Dev] Reviving restricted mode?

2014-08-13 Thread Steven D'Aprano
On Thu, Aug 14, 2014 at 02:26:29AM +1000, Chris Angelico wrote:
> On Wed, Aug 13, 2014 at 11:11 PM, Isaac Morland  wrote:
> > While I would not claim a Python sandbox is utterly impossible, I'm
> > suspicious that the whole "consenting adults" approach in Python is
> > incompatible with a sandbox.  The whole idea of a sandbox is to absolutely
> > prevent people from doing things even if they really want to and know what
> > they are doing.

The point of a sandbox is that I, the consenting adult writing the 
application in the first place, may want to allow *untrusted others* to 
call Python code without giving them control of the entire application. 
The consenting adults rule applies to me, the application writer, not 
them, the end-users, even if they happen to be writing Python code. If 
they want unrestricted access to the Python interpreter, they can run 
their code on their own machine, not mine.


> It's certainly not *fundamentally* impossible to sandbox Python.
> However, the question becomes one of how much effort you're going to
> go to and how much you're going to restrict the code.

I believe that PyPy has an effective sandbox, but to what degree of 
effectiveness I don't know.

http://pypy.readthedocs.org/en/latest/sandbox.html

I've had rogue Javascript crash my browser or make my entire computer 
effectively unusable often enough that I am skeptical about claims that 
Javascript in the browser is effectively sandboxed, so I'm doubly 
cautious about Python.


-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-13 Thread Steven D'Aprano
On Wed, Aug 13, 2014 at 08:08:51PM +0300, yoav glazner wrote:
[...]
> Just a thought, would it bit wierd that:
> with (a as b, c as d): "works"
> with (a, c): "boom"
> with(a as b, c): ?

If this proposal is accepted, there is no need for the "boom". The 
syntax should allow:

# Without parens, limited to a single line.
with a [as name], b [as name], c [as name], ...:
    block

# With parens, not limited to a single line.
with (a [as name],
      b [as name],
      c [as name],
      ...
      ):
    block

where the "as name" part is always optional. In both these cases, 
whether there are parens or not, it will be interpreted as a series of 
context managers and never as a single tuple.

Note two things:

(1) this means that even in the unlikely event that tuples become 
context managers in the future, you won't be able to use a tuple 
literal:

with (1, 2, 3):  # won't work as expected

t = (1, 2, 3)
with t:  # will work as expected

But I cannot imagine any circumstances where tuples will become context 
managers.


(2) Also note that *this is already the case*, since tuples are made by 
the commas, not the parentheses. E.g. this succeeds:

# Not a tuple, actually two context managers.
with open("/tmp/foo"), open("/tmp/bar", "w"):
    pass




-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-15 Thread Steven D'Aprano
On Fri, Aug 15, 2014 at 02:08:42PM -0700, Ethan Furman wrote:
> On 08/13/2014 10:32 AM, Steven D'Aprano wrote:
> >
> >(2) Also note that *this is already the case*, since tuples are made by
> >the commas, not the parentheses. E.g. this succeeds:
> >
> ># Not a tuple, actually two context managers.
> >with open("/tmp/foo"), open("/tmp/bar", "w"):
> >pass
> 
> Thanks for proving my point!  A comma, and yet we did *not* get a tuple 
> from it.

Um, sorry, I don't quite get you. Are you agreeing or disagreeing with 
me? I spent half of yesterday reading the static typing thread over on 
Python-ideas and it's possible my brain has melted down *wink* but I'm 
confused by your response.

Normally when people say "Thanks for proving my point", the implication 
is that the person being thanked (in this case me) has inadvertently 
undercut their own argument. I don't think I have. I'm suggesting that 
the argument *against* the proposal:

"Multi-line with statements should not be allowed, because:

with (spam,
  eggs,
  cheese):
...

is syntactically a tuple"


is a poor argument (that is, I'm disagreeing with it), since *single* 
line parens-free with statements are already syntactically a tuple:

with spam, eggs, cheese:  # Commas make a tuple, not parens.
...

I think the OP's suggestion is a sound one, and while Nick's point that 
bulky with-statements *may* be a sign that some re-factoring is needed, 
there are many things that are a sign that re-factoring is needed and 
I don't think this particular one warrants rejecting what is otherwise 
an obvious and clear way of using multiple context managers.


-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-15 Thread Steven D'Aprano
On Fri, Aug 15, 2014 at 08:29:09PM -0700, Ethan Furman wrote:
> On 08/15/2014 08:08 PM, Steven D'Aprano wrote:

[...]
> >is a poor argument (that is, I'm disagreeing with it), since *single*
> >line parens-free with statements are already syntactically a tuple:
> >
> > with spam, eggs, cheese:  # Commas make a tuple, not parens.
> 
> This point I do not understand -- commas /can/ create a tuple, but don't 
> /necessarily/ create a tuple.  So, semantically: no tuple.

Right! I think we are in agreement. It's not that with statements 
actually generate a tuple, but that they *look* like they include a 
tuple. That's what I meant by "syntactically a tuple", sorry if that was 
confusing. I didn't mean to suggest that Python necessarily builds a 
tuple of context managers.

If people were going to be prone to mistake

with (a, b, c): ...

as including a tuple, they would have already mistaken:

with a, b, c: ...

the same way. But they haven't.


-- 
Steven


Re: [Python-Dev] Multiline with statement line continuation

2014-08-16 Thread Steven D'Aprano
On Sat, Aug 16, 2014 at 05:25:33PM +1000, Ben Finney wrote:
[...] 
> > they would have already mistaken:
> >
> > with a, b, c: ...
> >
> > the same way. But they haven't.
> 
> Right. The presence or absence of parens make a big semantic difference.

from silly.mistakes.programmers.make import (
 hands, up, anyone, who, thinks, this, is_, a, tuple)

def function(how, about, this, one): ...


But quite frankly, even if there is some person somewhere who gets 
confused and tries to write:

context_managers = (open("a"), open("b", "w"), open("c", "w"))
with context_managers as things:
    text = things[0].read()
    things[1].write(text)
    things[2].write(text.upper())


I simply don't care. They will try it, discover that tuples are not 
context managers, fix their code, and move on. (I've made sillier 
mistakes, and became a better programmer from it.)

We cannot paralyse ourselves out of fear that somebody, somewhere, will 
make a silly mistake. You can try that "with tuple" code right now, and 
you will get nice runtime exception. I admit that the error message is 
not the most descriptive I've ever seen, but I've seen worse, and any 
half-decent programmer can do what they do for any other unexpected 
exception: read the Fine Manual, or ask for help, or otherwise debug the 
problem. Why should this specific exception be treated as so harmful 
that we have to forgo a useful piece of functionality to avoid it?

Some designs are bug-magnets, like the infamous "except A,B" syntax, 
which fails silently, doing the wrong thing. Unless someone has a 
convincing rationale for how and why this multi-line with will likewise 
be a bug-magnet, I don't think that some vague similarity between it and 
tuples is justification for rejecting the proposal.


-- 
Steven


Re: [Python-Dev] PEP 4000 to explicitly declare we won't be doing a Py3k style compatibility break again?

2014-08-16 Thread Steven D'Aprano
On Sun, Aug 17, 2014 at 11:28:48AM +1000, Nick Coghlan wrote:
> I've seen a few people on python-ideas express the assumption that
> there will be another Py3k style compatibility break for Python 4.0.

I used to refer to Python 4000 as the hypothetical compatibility break 
version. Now I refer to Python 5000.

> I've also had people express the concern that "you broke compatibility
> in a major way once, how do we know you won't do it again?".

Even languages with ISO standards behind them and release schedules 
measured in decades make backward-incompatible changes. For example, I 
see that Fortran 95 (despite being classified as a minor revision) 
deleted at least six language features. To expect Python to never break 
compatibility again is asking too much.

But I think it is fair to promise that Python won't make *so 
many* backwards incompatible changes all at once again, and has no 
concrete plans to make backwards incompatible changes to syntax in the 
foreseeable future. (That is, not before Python 5000 :-)


[...]
> If folks (most signficantly, Guido) are amenable to the idea, it
> shouldn't take long to put such a PEP together, and I think it could
> help reduce some of the confusions around the expectations for Python
> 4.0 and the evolution of 3.x in general.

I think it's a good idea, so long as there's no implied or explicit 
promise that Python language is now set in stone never to change.


-- 
Steven


Re: [Python-Dev] Bytes path support

2014-08-22 Thread Steven D'Aprano
On Fri, Aug 22, 2014 at 04:42:29AM +0200, Oleg Broytman wrote:
> On Thu, Aug 21, 2014 at 05:30:14PM -0700, Chris Barker - NOAA Federal 
>  wrote:
> > This brings up the other key problem. If file names are (almost)
> > arbitrary bytes, how do you write one to/read one from a text file
> > with a particular encoding? ( or for that matter display it on a
> > terminal)
> 
>There is no such thing as an encoding of text files.

I don't understand this comment. It seems to me that *text* files have 
to have an encoding, otherwise you can't interpret the contents as text. 
Files, of course, only contain bytes, but to treat those bytes as text 
you need some way of transforming byte N to char C (or multiple bytes 
to C), which is an encoding.

Perhaps you just mean that encodings are not recorded in the text file 
itself?

To answer Chris' question, you typically cannot include arbitrary 
bytes in text files, and displaying them to the user is likewise 
problematic. The usual solution is to support some form of 
escaping, like \t, &#x0A; or %0D, to give a few examples.
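
A minimal sketch of one such escaping scheme (percent-encoding in the
style of URLs; both helper names are invented for illustration):

```python
def escape_bytes(raw: bytes) -> str:
    """Render arbitrary bytes printable by %-escaping anything non-ASCII."""
    return "".join(
        chr(b) if 0x20 <= b < 0x7F and b != 0x25 else "%%%02X" % b
        for b in raw
    )

def unescape_bytes(text: str) -> bytes:
    """Invert escape_bytes, recovering the original byte string."""
    out, i = bytearray(), 0
    while i < len(text):
        if text[i] == "%":
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)

name = b"music\xff\x00.txt"
print(escape_bytes(name))                        # music%FF%00.txt
assert unescape_bytes(escape_bytes(name)) == name
```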


-- 
Steven


Re: [Python-Dev] Bytes path support

2014-08-23 Thread Steven D'Aprano
On Fri, Aug 22, 2014 at 11:53:01AM -0700, Chris Barker wrote:

> The point is that if you are reading a file name from the system, and then
> passing it back to the system, then you can treat it as just bytes -- who
> cares? And if you add the byte value of 47 thing, then you can even do
> basic path manipulations. But once you want to do other things with your
> file name, then you need to know the encoding. And it is very, very common
> for users to need to do other things with filenames, and they almost always
> want them as text that they can read and understand.
> 
> Python3 supports this case very well. But it does indeed make it hard to
> work with filenames when you don't know the encoding they are in.

Just "not knowing" is not sufficient. In that case, you'll likely get a 
Unicode string containing mojibake:

# I write a file name using UTF-8 on my system:
filename = 'music by Наӥв.txt'.encode('utf-8')
# You try to use it assuming ISO-8859-7 (Greek)
filename.decode('iso-8859-7')
=> 'music by Π\x9dΠ°Σ₯Π².txt'

which, even though it looks wrong, still lets you refer to the file 
(provided you then encode back to bytes with ISO-8859-7 again). This 
won't always be the case, sometimes the encoding you guess will be 
wrong.

When I started this email, I originally began to say that the actual 
problem was with byte file names that cannot be decoded into Unicode 
using the system encoding (typically UTF-8 on Linux systems). But I've 
actually had difficulty demonstrating that it actually is a problem. I 
started with a byte sequence which is invalid UTF-8, namely:

b'ZZ\xdb\xdf\xfa\xff'

created a file with that name, and then tried listing it with 
os.listdir. Even in Python 3.1 it worked fine. I was able to list the 
directory and open the file, so I'm not entirely sure where the problem 
lies exactly. Can somebody demonstrate the failure mode?
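
For what it's worth, the reason the listing worked is PEP 383: since
3.1, Python decodes OS filenames with the 'surrogateescape' error
handler, which smuggles undecodable bytes into lone surrogates so they
round-trip cleanly:

```python
raw = b"ZZ\xdb\xdf\xfa\xff"                    # not valid UTF-8
name = raw.decode("utf-8", "surrogateescape")  # bad bytes -> U+DC80..U+DCFF
assert name == "ZZ\udcdb\udcdf\udcfa\udcff"
assert name.encode("utf-8", "surrogateescape") == raw
```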


-- 
Steven


Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-10 Thread Steven D'Aprano
On Wed, Sep 10, 2014 at 05:17:57PM +1000, Nick Coghlan wrote:
> Since it may come in handy when discussing "Why was Python 3
> necessary?" with folks, I wanted to point out that my article on the
> transition to multilingual programming has now been reposted on the
> Red Hat developer blog:
> http://developerblog.redhat.com/2014/09/09/transition-to-multilingual-programming-python/

That's awesome! Thank you Nick.

-- 
Steven


Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-16 Thread Steven D'Aprano
On Wed, Sep 17, 2014 at 11:14:15AM +1000, Chris Angelico wrote:
> On Wed, Sep 17, 2014 at 5:29 AM, R. David Murray  
> wrote:

> > Basically, we are pretending that the each smuggled
> > byte is single character for string parsing purposes...but they don't
> > match any of our parsing constants.  They are all "any character" matches
> > in the regexes and what have you.
> 
> This is slightly iffy, as you can't be sure that one byte represents
> one character, but as long as you don't much care about that, it's not
> going to be an issue.

This discussion would probably be a lot easier to follow, with fewer 
miscommunications, if there were some examples. Here is my example, 
perhaps someone can tell me if I'm understanding it correctly.

I want to send an email including the header line:

'Subject: “NOBODY expects the Spanish Inquisition!”'

Note the curly quotes. I've read the manifesto "UTF-8 Everywhere" so I 
do the right thing and encode it as UTF-8:

b'Subject: \xe2\x80\x9cNOBODY expects the Spanish Inquisition!\xe2\x80\x9d'

but my mail package, not being written in a language as awesome as 
Python, is just riddled with bugs, and somehow I end up with this 
corrupted byte-string instead:

b'Subject: \x9c\x80\xe2NOBODY expects the Spanish Inquisition!\xe2\x80\x9d'

Note that the bytes of the first curly quote are in the wrong 
order, but the second quote is okay. (Like I said, it's just *riddled* with 
bugs.) That means that trying to decode those bytes will fail in Python:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9c in position 9: 
invalid start byte

but it's not up to Python's email package to throw those invalid bytes 
out or permanently replace them with something else. Also, we want to work 
with Unicode strings, not byte strings, so there has to be a way to 
smuggle those three bytes into Unicode, without ending up with either 
the replacement bytes:

# using the 'replace' error handler
'Subject: ���NOBODY expects the Spanish Inquisition!”'

or incorrectly interpreting them as valid, but wrong, code points. (If 
we do the second, we end up with two control characters "\x9c\x80" 
followed by "â".) We want to be able to round-trip back to the same 
bytes we received.

Am I right so far?

So the email package uses the surrogate-escape error handler and ends up 
with this Unicode string:

'Subject: \udc9c\udc80\udce2NOBODY expects the Spanish Inquisition!”'

which can be encoded back to the bytes we started with.
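That round trip can be checked directly; a sketch using the corrupted 
header from above:

```python
corrupted = (b'Subject: \x9c\x80\xe2NOBODY expects '
             b'the Spanish Inquisition!\xe2\x80\x9d')

# Strict decoding fails on the shuffled bytes
err = None
try:
    corrupted.decode('utf-8')
except UnicodeDecodeError as exc:
    err = exc.reason
assert err == 'invalid start byte'

# surrogateescape smuggles each undecodable byte 0xNN in as U+DCNN
text = corrupted.decode('utf-8', errors='surrogateescape')
assert text == ('Subject: \udc9c\udc80\udce2NOBODY expects '
                'the Spanish Inquisition!\u201d')

# ...and encoding with the same handler restores the original bytes
assert text.encode('utf-8', errors='surrogateescape') == corrupted
```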

Note that technically those three \u... code points are NOT classified 
as "noncharacters". They are actually surrogate code points:

http://www.unicode.org/faq/private_use.html#nonchar4
http://www.unicode.org/glossary/#surrogate_code_point

and they're supposed to be reserved for UTF-16. I'm not sure of the 
implication of that.


> I'm fairly sure you're never going to find an
> encoding in which one unknown byte represents two characters,

There are encodings which use a "shift" mechanism, whereby a byte X 
represents one character by default, and a different character after the 
shift mechanism. But I don't think that matters, since we're not able to 
interpret those bytes. If we were, we'd just decode them to a text 
string and be done with it.
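ISO-2022-JP is one such shift-based encoding; a quick illustration (the 
ESC byte 0x1b introduces a charset shift, so the bytes between shifts 
cannot be read one byte per character):

```python
text = 'AこんにちはA'
data = text.encode('iso2022_jp')

# The encoded form contains ESC sequences shifting between ASCII
# and JIS X 0208; it still round-trips through the same codec.
assert b'\x1b' in data
assert data.decode('iso2022_jp') == text
```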


> but
> there are cases where it takes more than one byte to make up a
> character (or the bytes are just shift codes or something). 

Multi-byte encodings are very common. All the Unicode encodings are 
multi-byte. So are many East Asian encodings.


> Does that
> ever throw off your regexes? It wouldn't be an issue to a .* between
> two character markers, but if you ever say .{5} then it might match
> incorrectly.

I don't think the idea is to match on these smuggled bytes specifically. 
I think the idea is to match *around* them. In the example above, we 
might match everything from "Subject: " to the end of the line. So long 
as we never end up with a situation where the smuggled bytes are 
replaced by something else, or shuffled around into different positions, 
we should be fine.

David, is my understanding correct?



-- 
Steven


Re: [Python-Dev] Multilingual programming article on the Red Hat Developer blog

2014-09-17 Thread Steven D'Aprano
On Wed, Sep 17, 2014 at 09:21:56AM +0900, Stephen J. Turnbull wrote:

> Guido's mantra is something like "Python's str doesn't contain
> characters or even code points[1], it contains code units."

But is that true? If it were true, I would expect to be able to make 
Python text strings containing code units that aren't code points, e.g. 
something like "\U1234" or chr(0x1234) should work, but neither 
do. As far as I can tell, there is no way to build a string containing 
items which aren't code points.

I don't think it is useful to say that strings *contain* code units, 
more that they *are made up from* code units. Code units are the 
implementation: 16-bit code units in narrow builds, 32-bit code units 
in wide builds, and either 8-, 16- or 32-bit code units in Python 3.3 and 
beyond. (I don't know of any Python implementation which uses UTF-8 
internally, but if there was one, it would use 8-bit code units.)

It isn't very useful to say that in Python 3.3 the string "A" *contains*
the 8-bit code unit 0x41. That's conflating two different levels of 
explanation (the high-level interface and the underlying implementation) 
and potentially leads to user confusion like

# 8-bit code units are bytes, right?
assert b'\x41' in "A"

which is Not Even Wrong.
http://rationalwiki.org/wiki/Not_even_wrong
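Indeed, the "assertion" above never even gets as far as being false — a 
sketch:

```python
caught = None
try:
    b'\x41' in 'A'
except TypeError as exc:
    # mixing bytes and str in a containment test is rejected outright
    caught = type(exc).__name__
assert caught == 'TypeError'
```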

I think it is correct to say that Python strings are sequences of 
Unicode code points U+0000 through U+10FFFF. There are no other 
restrictions, e.g. strings can contain surrogates, noncharacters, or 
nonsensical combinations of code points such as a U+0300 COMBINING GRAVE 
ACCENT combined with U+000A (newline).
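These claims are easy to verify:

```python
# chr() accepts exactly the code point range U+0000 .. U+10FFFF
assert chr(0x10FFFF) == '\U0010FFFF'

err = None
try:
    chr(0x110000)            # one past the last code point
except ValueError as exc:
    err = type(exc).__name__
assert err == 'ValueError'

# Lone surrogates and noncharacters are nevertheless legal in a str...
s = '\ud800\ufdd0'
assert len(s) == 2

# ...even though a strict UTF-8 encode of the surrogate fails
err = None
try:
    s.encode('utf-8')
except UnicodeEncodeError as exc:
    err = type(exc).__name__
assert err == 'UnicodeEncodeError'
```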


> Implying
> that dealing with characters (or the grapheme globs that occasionally
> raise their ugly heads here) is an issue for higher-level facilities
> than str to deal with.

Agreed that Python doesn't offer a string type based on graphemes, and 
that such a facility belongs as a high-level library, not a built-in 
type.

Also agreed that talking about characters is sloppy. Nevertheless, for 
English speakers at least, "code point = character" isn't too awful a 
first approximation.


> The point being that
> 
>  > Basically, we are pretending that the each smuggled byte is single
>  > character
> 
> is something of a misstatement (good enough for present purpose of
> discussing email, but not good enough for the general case of
> understanding how this is supposed to work when porting the construct
> to other Python implementations), while
> 
>  > for string parsing purposes...but they don't match any of our
>  > parsing constants.
> 
> is precisely Pythonically correct.  You might want to add "because all
> parsing constants contain only valid characters by construction."

I don't understand what you are trying to say here.


>  > [*] I worried a lot that this was re-introducing the bytes/string
>  > problem from python2.
> 
> It isn't, because the bytes/str problem was that given a str object
> out of context you could not tell whether it was a binary blob or
> text, and if text, you couldn't tell if it was external encoded text
> or internal abstract text.
> 
> That is not true here because the representations of characters vs.
> smuggled bytes in str are disjoint sets.

Nor am I sure what you are trying to say here either.


> Footnotes: 
> [1]  In Unicode terminology, a code unit is the smallest computer
> object that can represent a character (this is uniquely and sanely
> defined for all real Unicode transformation formats aka UTFs).  A code
> point is an integer 0 - (17*256*256-1) that can represent a character,
> but many code points such as surrogates and 0xFFFF are defined to be
> non-characters.

Actually not quite. "Noncharacter" is concretely defined in Unicode, and 
there are only 66 of them, many fewer than the surrogate code points 
alone. Surrogates are reserved, not noncharacters.

http://www.unicode.org/glossary/#surrogate_code_point
http://www.unicode.org/faq/private_use.html#nonchar1
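Per the Unicode FAQ linked above, the 66 noncharacters are 
U+FDD0..U+FDEF plus the last two code points of each of the 17 planes; 
a quick count:

```python
# 32 noncharacters in the BMP "FDD0 block"
noncharacters = set(range(0xFDD0, 0xFDF0))
# ...plus 2 per plane, for 17 planes
noncharacters |= {plane * 0x10000 + offset
                  for plane in range(17)
                  for offset in (0xFFFE, 0xFFFF)}
assert len(noncharacters) == 66

# The surrogate range U+D800..U+DFFF is a separate, much larger
# set of reserved code points
assert len(range(0xD800, 0xE000)) == 2048
```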

It is wrong to talk about "surrogate characters", but perhaps you mean 
to say that surrogates (by which I understand you to mean surrogate code 
points) are "not human-meaningful characters", which is not the same 
thing as a Unicode noncharacter.


> Characters are those code points that may be assigned
> an interpretation as a character, including undefined characters
> (private space and reserved).

So characters are code points which are characters, including undefined 
characters? :-)

http://www.unicode.org/glossary/#character



-- 
Steven


Re: [Python-Dev] PEP 394 - Clarification of what "python" command should invoke

2014-09-19 Thread Steven D'Aprano
On Fri, Sep 19, 2014 at 04:44:26AM -0400, Donald Stufft wrote:
> 
> > On Sep 19, 2014, at 3:31 AM, Bohuslav Kabrda  wrote:
> > 
> > Hi, as Fedora is getting closer to having python3 as a default, I'm 
> > being more and more asked by Fedora users/contributors what'll 
> > "/usr/bin/python" invoke when we achieve this (Fedora 22 hopefully). 
> > So I was rereading PEP 394 and I think I need a small clarification 
> > regarding two points in the PEP: - "for the time being, all 
> > distributions should ensure that python refers to the same target as 
> > python2." - "Similarly, the more general python command should be 
> > installed whenever any version of Python is installed and should 
> > invoke the same version of Python as either python2 or python3."
> > 
> > The important word in the second point is, I think, *whenever*. 
> > Trying to apply these two points to Fedora 22 situation, I can think 
> > of several approaches:

> > - /usr/bin/python will always point to python3 (seems to go against 
> > the first mentioned PEP recommendation)

Definitely not that.

Arch Linux pointed /usr/bin/python at Python 3 some years ago, and I 
understand that this has caused no end of trouble for the folks on 
#python. I haven't seen any sign of this being an issue on the tutor@ or 
python-l...@python.org mailing lists, but the demographics are quite 
different so that's not surprising.


> > - /usr/bin/python will always point to python2 (seems to go against 
> > the second mentioned PEP recommendation, there is no /usr/bin/python 
> > if python2 is not installed)

My understanding is that this is the intention of the PEP, at least 
until such time as Python 2 is end-of-lifed.

My interpretation would be that the second recommendation in the PEP is 
just confused :-) Perhaps the PEP author could clarify what the 
intention is.


> > - /usr/bin/python will point to python3 if python2 is not installed, 
> > else it will point to python2 (inconsistent; also the user doesn't 
> > know he's running and what libraries he'll be able to import - the 
> > system can have different sets of python2-* and python3-* extension 
> > modules installed)

Likely to cause all sorts of problems, and I understood that this was 
not the intention. Perhaps it was added *only* as a "grandfather 
clause" so that people don't yell at Arch Linux "See, the PEP says 
you're doing it wrong!".


> > - there will be no /usr/bin/python (goes against PEP and seems just wrong)

Seems like the least-worst to me.

If you think of "python == Python 2.x" (at least for the next few 
years), then if Python 2.x isn't installed, there should be no 
/usr/bin/python either.

> I don’t know for a fact, but I assume that as long as Python 2.x is 
> installed by default than ``python`` should point to ``python2``. If 
> Python 3.x is the default version and Python 2.x is the “optional” 
> version than I think personally it makes sense to switch eventually. 
> Maybe not immediately to give people time to update though?

Agreed. Once Python 2 is finally end-of-lifed in 2023 or thereabouts, 
then we can reconsider pointing /usr/bin/python at Python 3 (or 4, 
whatever is current by then). If Arch Linux jumped the gun by a decade 
or so, that's their problem :-)


-- 
Steven


Re: [Python-Dev] PEP 394 - Clarification of what "python" command should invoke

2014-09-19 Thread Steven D'Aprano
On Fri, Sep 19, 2014 at 10:41:58AM -0400, Barry Warsaw wrote:
> On Sep 19, 2014, at 10:23 AM, Donald Stufft wrote:
> 
> >My biggest problem with ``python3``, is what happens after 3.9.
> 
> FWIW, 3.9 by my rough calculation is 7 years away.

That makes it 2021, one year after Python 2.7 free support ends, but two 
years before Red Hat commercial support for it ends.

> I seem to recall Guido saying that *if* there's a 4.0, it won't be a major
> break like Python 3, whatever that says about the numbering scheme after 3.9.
> 
> Is 7 years enough to eradicate Python 2 the way we did for Python 1?  Then
> maybe Python 4 can reclaim /usr/bin/python.

I expect not quite. Perhaps 10 years though.



-- 
Steven


Re: [Python-Dev] Critical bash vulnerability CVE-2014-6271 may affect Python on *n*x and OSX

2014-09-25 Thread Steven D'Aprano
On Fri, Sep 26, 2014 at 12:17:46AM +0200, Antoine Pitrou wrote:
> On Thu, 25 Sep 2014 13:00:16 -0700
> Bob Hanson  wrote:
> > Critical bash vulnerability CVE-2014-6271 may affect Python on
> > *n*x and OSX:
[...]

See also:

http://adminlogs.info/2014/09/25/again-bash-cve-2014-7169/


> Fortunately, Python's subprocess has its `shell` argument default to
> False. However, `os.system` invokes the shell implicitly and is
> therefore a possible attack vector.

Perhaps I'm missing something, but aren't there easier ways to attack 
os.system than the bash env vulnerability? If I'm accepting and running 
arbitrary strings from an untrusted user, there's no need for them to go 
to the trouble of feeding me:

"env x='() { :;}; echo gotcha'  bash -c 'echo do something useful'"

when they can just feed me:

"echo gotcha"

In other words, os.system is *already* an attack vector, unless you only 
use it with trusted strings. I don't think the bash env vulnerability 
adds to the attack surface.
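For contrast, a sketch of why subprocess with its default shell=False 
avoids this whole class of problem: the argument list goes straight to 
the new process, with no shell interpreting it (the payload string here 
is just an illustration):

```python
import subprocess

payload = "'; echo gotcha; '"

# With an argument list and shell=False (the default), the payload
# is delivered to echo as literal text, not executed by a shell.
result = subprocess.run(['echo', payload],
                        capture_output=True, text=True)
assert result.stdout.strip() == payload
```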

Have I missed something?



-- 
Steven


Re: [Python-Dev] [OFF-TOPIC] It is true that is impossible write in binary code, the lowest level of programming that you can write is in hex code?

2014-11-03 Thread Steven D'Aprano
This is off-topic for this mailing list, as you know. There are some 
mailing lists which approve of off-topic conversations, but this is not 
one of those.

You could ask on the python-l...@python.org mailing list, where it will 
still be off-topic, but the people there are more likely to answer. But 
even better would be to look for a mailing list or forum for assembly 
programming, machine code, or micro-code.


On Mon, Nov 03, 2014 at 09:19:46PM -0200, françai s wrote:
> I intend to write in lowest level of computer programming as a hobby.
[...]


-- 
Steven


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-21 Thread Steven D'Aprano
On Sat, Nov 22, 2014 at 12:53:41AM +1100, Chris Angelico wrote:
> On Sat, Nov 22, 2014 at 12:47 AM, Raymond Hettinger
>  wrote:
> > Also, the proposal breaks a reasonably useful pattern of calling
> > next(subiterator) inside a generator and letting the generator terminate
> > when the data stream  ends.  Here is an example that I have taught for
> > years:
> >
> > def izip(iterable1, iterable2):
> > it1 = iter(iterable1)
> > it2 = iter(iterable2)
> > while True:
> > v1 = next(it1)
> > v2 = next(it2)
> > yield v1, v2
> 
> Is it obvious to every user that this will consume an element from
> it1, then silently terminate if it2 no longer has any content?

"Every user"? Of course not. But it should be obvious to those who think 
carefully about the specification of zip() and what is available to 
implement it.

zip() can't detect that the second argument is empty except by calling 
next(), which it doesn't do until after it has retrieved a value from 
the first argument. If it turns out the second argument is empty, what 
can it do with that first value? It can't shove it back into the 
iterator. It can't return a single value, or pad it with some sentinel 
value (that's what izip_longest does). Since zip() is documented as 
halting on the shorter argument, it can't raise an exception. So what 
other options are there apart from silently consuming the value?

Indeed that is exactly what the built-in zip does:

py> a = iter("abcdef")
py> b = iter("abc")
py> list(zip(a, b))
[('a', 'a'), ('b', 'b'), ('c', 'c')]
py> next(a)
'e'
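Under PEP 479 semantics, Raymond's izip pattern can be preserved by 
making the StopIteration-into-termination step explicit — a sketch:

```python
def izip(iterable1, iterable2):
    it1 = iter(iterable1)
    it2 = iter(iterable2)
    while True:
        try:
            v1 = next(it1)
            v2 = next(it2)
        except StopIteration:
            return          # explicit, PEP 479-safe termination
        yield v1, v2

# Same behaviour as the built-in zip: halts on the shorter argument,
# silently consuming one extra value from the longer one.
a = iter("abcdef")
assert list(izip(a, "abc")) == [('a', 'a'), ('b', 'b'), ('c', 'c')]
assert next(a) == 'e'
```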


-- 
Steven


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-21 Thread Steven D'Aprano
On Thu, Nov 20, 2014 at 11:36:54AM -0800, Guido van Rossum wrote:

[...]
> That said, I think for most people the change won't matter, some people
> will have to apply one of a few simple fixes, and a rare few will have to
> rewrite their code in a non-trivial way (sometimes this will affect
> "clever" libraries).
> 
> I wonder if the PEP needs a better transition plan, e.g.
> 
> - right now, start an education campaign
> - with Python 3.5, introduce "from __future__ import generator_return", and
> silent deprecation warnings
> - with Python 3.6, start issuing non-silent deprecation warnings
> - with Python 3.7, make the new behavior the default (subject to some kind
> of review)

I fear that there is one specific corner case that will be impossible to 
deal with in a backwards-compatible way supporting both Python 2 and 3 
in one code base: the use of `return value` in a generator.

In Python 2.x through 3.2, `return value` is a syntax error inside 
generators. Currently, the only way to handle this case in 2+3 code is 
by using `raise StopIteration(value)` but if that changes in 3.6 or 3.7 
then there will be no (obvious?) way to deal with this case.
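In Python 3.3+, `return value` in a generator attaches the value to the 
StopIteration that the caller (or `yield from`) sees; the 2-and-3 
spelling `raise StopIteration(value)` is exactly what PEP 479 would turn 
into a RuntimeError. A sketch of the 3.x-only behaviour:

```python
def gen():
    yield 1
    return 42   # Python 3.3+; a SyntaxError inside a generator before 3.3

g = gen()
assert next(g) == 1
try:
    next(g)
except StopIteration as exc:
    value = exc.value   # the returned value rides on the exception
assert value == 42
```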


-- 
Steven


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-23 Thread Steven D'Aprano
On Sun, Nov 23, 2014 at 08:17:00AM -0800, Ethan Furman wrote:

> While I am in favor of PEP 479, and I have to agree with Raymond that 
> this isn't pretty.
> 
> Currently, next() accepts an argument of what to return if the 
> iterator is empty.  Can we enhance that in some way so that the 
> overall previous behavior could be retained?
[...]
> Then, if the iterator is empty, instead of raising StopIteration, or 
> returning some value that would then have to be checked, it could 
> raise some other exception that is understood to be normal generator 
> termination.

We *already* have an exception that is understood to be normal 
generator termination. It is called StopIteration.

Removing the long-standing ability to halt generators with 
StopIteration, but then recreating that ability under a different name 
is the worst of both worlds:

- working code is still broken;
- people will complain that the new exception X is silently swallowed by 
  generators, just as they complained about StopIteration;
- it is yet another subtle difference between Python 2 and 3;
- it involves a code smell ("no constant arguments to functions");
- and does nothing to help generators that don't call next().

The current behaviour is nice and clean and has worked well for over a 
decade. The new behaviour exchanges consistency in one area (generators 
behave like all other iterators) for consistency in another (generator 
expressions will behave like comprehensions in the face of 
StopIteration). But trying to have both at the same time via a new 
exception adds even more complexity and would leave everyone unhappy.



-- 
Steven


Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Steven D'Aprano
On Sun, Nov 23, 2014 at 08:55:50AM -0800, Guido van Rossum wrote:

> But I strongly believe that if we want to do the right thing for the 
> long term, we should switch to GitHub.

Encouraging a software, or social, monopoly is never the right thing for 
the long term.

http://nedbatchelder.com/blog/201405/github_monoculture.html


> I promise you that once the pain of the switch is over you will feel 
> much better about it. I am also convinced that we'll get more 
> contributions this way.

I'm sure that we'll get *more* contributions, but will they be *better* 
contributions?

I know that there are people who think that mailing lists are old and 
passe, and that we should shift discussion to a social media site like 
Reddit. If we did, we'd probably get twenty times as many comments, and 
the average quality would probably plummet. More is not necessarily a 
good thing.


-- 
Steven


Re: [Python-Dev] Move selected documentation repos to PSF BitBucket account?

2014-11-23 Thread Steven D'Aprano
On Sun, Nov 23, 2014 at 06:08:07PM -0600, Brian Curtin wrote:
> On Sun, Nov 23, 2014 at 5:57 PM, Steven D'Aprano  wrote:

> > I'm sure that we'll get *more* contributions, but will they be *better*
> > contributions?
> >
> > I know that there are people who think that mailing lists are old and
> > passe, and that we should shift discussion to a social media site like
> > Reddit. If we did, we'd probably get twenty times as many comments, and
> > the average quality would probably plummet. More is not necessarily a
> > good thing.
> 
> If we need to ensure that we're getting better contributions than we
> are now, then we should be interviewing committers, rejecting
> newcomers (or the opposite, multiplying core-mentors by 100), and
> running this like a business. I've written some crappy code that got
> committed, so I should probably be fired.

None of those things are guaranteed to lead to better contributions. The 
quality of code from the average successful business is significantly 
lower than that from successful FOSS projects like Python. Interviews 
just weed out people who are poor interviewees, not poor performers. And 
any organisation that fires contributors for relatively trivial mistakes 
like "crappy code" would soon run out of developers.

My point is that increasing the number of contributions is not, in and 
of itself, a useful aim to have. More contributions is just a means to 
an end, the end we want is better Python.


> Enabling our community to be active contributors is an important
> thing. Give them a means to level up and we'll all be better off from
> it.

Right. But this isn't a feel-good exercise where anyone who wants a Gold 
Star for contributing gets commit privileges. (That would "enable our 
community to be active contributors" too.) Barriers to contribute work 
two ways:

(1) we miss out on good contributions we would want;

(2) we also miss out on poor contributions that would 
just have to be rejected.


Enabling more people to contribute increases both.



-- 
Steven


Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-25 Thread Steven D'Aprano
On Mon, Nov 24, 2014 at 10:22:54AM +1100, Chris Angelico wrote:

> My point is that doing the same errant operation on a list or a dict
> will give different exceptions. In the same way, calling next() on an
> empty iterator will raise StopIteration normally, but might raise
> RuntimeError instead. It's still an exception, it still indicates a
> place where code needs to be changed

I wouldn't interpret it like that.

Calling next() on an empty iterator raises StopIteration. That's not a 
bug indicating a failure, it's the protocol working as expected. Your 
response to that may be to catch the StopIteration and ignore it, or to 
allow it to bubble up for something else to deal with it. Either way, 
next() raising StopIteration is not a bug, it is normal behaviour.

(Failure to deal with any such StopIteration may be a bug.)

However, if next() raises RuntimeError, that's not part of the protocol 
for iterators, so it is almost certainly a bug to be fixed. (Probably 
coming from an explicit "raise StopIteration" inside a generator 
function.) Your fix for the bug may be to refuse to fix it and just 
catch the exception and ignore it, but that's kind of nasty and hackish 
and shouldn't be considered good code.
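Under PEP 479 semantics (opt-in via `from __future__ import 
generator_stop` in 3.5+, and the default from Python 3.7), the 
conversion described above looks like this:

```python
def buggy():
    raise StopIteration   # explicit raise inside a generator...
    yield

caught = None
try:
    next(buggy())
except RuntimeError as exc:
    # ...is converted to RuntimeError instead of silently ending
    # the iteration, flagging the bug to be fixed
    caught = type(exc).__name__
assert caught == 'RuntimeError'
```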

Do you agree this is a reasonable way to look at it?


-- 
Steven


Re: [Python-Dev] advice needed: best approach to enabling "metamodules"?

2014-11-29 Thread Steven D'Aprano
On Sun, Nov 30, 2014 at 11:07:57AM +1300, Greg Ewing wrote:
> Nathaniel Smith wrote:
> >So pkgname/__new__.py might look like:
> >
> >import sys
> >from pkgname._metamodule import MyModuleSubtype
> >sys.modules[__name__] = MyModuleSubtype(__name__, docstring)
> >
> >To start with, the 'from
> >pkgname._metamodule ...' line is an infinite loop,
> 
> Why does MyModuleSubtype have to be imported from pkgname?
> It would make more sense for it to be defined directly in
> __new__.py, wouldn't it? Isn't the purpose of separating
> stuff out into __new__.py precisely to avoid circularities
> like that?

Perhaps I'm missing something, but won't that imply that every module 
which wants to use a "special" module type has to re-invent the wheel?

If this feature is going to be used, I would expect to be able to re-use 
pre-written module types. E.g. having written "module with properties" 
(so to speak) once, I can just import it and use it in my next project.


-- 
Steven


Re: [Python-Dev] PEP 481 - Migrate Some Supporting Repositories to Git and Github

2014-11-30 Thread Steven D'Aprano
I have some questions and/or issues with the PEP, but first I'm going to 
add something to Nick's comments:

On Sun, Nov 30, 2014 at 11:12:17AM +1000, Nick Coghlan wrote:

> Beyond that, GitHub is indeed the most expedient option. My two main
> reasons for objecting to taking the expedient path are:
> 
> 1. I strongly believe that the long term sustainability of the overall open
> source community requires the availability and use of open source
> infrastructure. While I admire the ingenuity of the "free-as-in-beer" model
> for proprietary software companies fending off open source competition, I
> still know a proprietary platform play when I see one (and so do venture
> capitalists looking to extract monopoly rents from the industry in the
> future). (So yes, I regret relenting on this principle in previously
> suggesting the interim use of another proprietary hosted service)
> 
> 2. I also feel that this proposal is far too cavalier in not even
> discussing the possibility of helping out the Mercurial team to resolve
> their documentation and usability issues rather than just yelling at them
> "your tool isn't popular enough for us, and we find certain aspects of it
> too hard to use, so we're switching to something else rather than working
> with you to address our concerns". We consider the Mercurial team a
> significant enough part of the Python ecosystem that Matt was one of the
> folks specifically invited to the 2014 language summit to discuss their
> concerns around the Python 3 transition. Yet we'd prefer to switch to
> something else entirely rather than organising a sprint with them at PyCon
> to help ensure that our existing Mercurial based infrastructure is
> approachable for git & GitHub users? (And yes, I consider some of the core
> Mercurial devs to be friends, so this isn't an entirely abstract concern
> for me)


Thanks Nick, I think these are excellent points, particularly the 
second. It would be a gross strawman to say that we should "only" use 
software developed in Python, but we should eat our own dogfood whenever 
practical and we should support and encourage the Python ecosystem, 
including Mercurial.

Particularly since hg and git are neck and neck feature-wise, we should 
resist the tendency to jump on bandwagons. If git were clearly the 
superior product, then maybe there would be an argument for using the 
best tool for the job, but it isn't.

As for the question of using Github hosting, there's another factor 
which has been conspicuous by its absence. Has GitHub's allegedly toxic 
and bullying culture changed since Julie Horvath quit in March? And if 
it has not, do we care?

I'm not a saint, but I do try to choose ethical companies and 
institutions over unethical ones whenever it is possible and practical. 
I'm not looking for a witch-hunt against GitHub, but if the allegations 
made by Horvath earlier this year are true, and I don't believe anyone 
has denied them, then so long as GitHub's internal culture remains 
sexist and hostile to the degree reported, then I do not believe that we 
should use GitHub's services even if we shift some repos to git.

I have serious doubts about GitHub's compatibility with the ideals 
expressed by the PSF. Even if our code of conduct does not explicitly 
forbid it, I think that it goes against the principles that we say we 
aspire to.

Given Horvath's experiences, and the lack of clear evidence that 
anything has changed in GitHub, I would be deeply disappointed if Python 
lent even a smidgeon of legitimacy to their company, and I personally 
will not use their services.

I acknowledge that it's hard to prove a negative, and GitHub may have 
difficulty proving to my satisfaction that they have changed. (My 
experience is that company culture rarely changes unless there is a 
change in management, and even then only slowly.) Particularly given 
GitHub's supposed egalitarian, non-hierarchical, and meritocratic 
structure, that nobody apparently saw anything wrong with the bullying 
of staff and workplace sexism until it became public knowledge suggests 
that it is not just a few bad apples but a problem all through the 
company.



-- 
Steven


Re: [Python-Dev] PEP 481 - Migrate Some Supporting Repositories to Git and Github

2014-12-01 Thread Steven D'Aprano
On Sun, Nov 30, 2014 at 02:56:22PM -0500, Donald Stufft wrote:

> As I mentioned in my other email, we’re already supporting two 
> different tools, and it’s a hope of mine to use this as a sort of 
> testbed to moving the other repositories as well.

If we go down this path, can we have some *concrete* and *objective* 
measures of success? If moving to git truly does improve things, then 
the move can be said to be a success. But if it makes no concrete 
difference, then we've wasted our time. In six months time, how will we 
know which it is?

Can we have some concrete and objective measures of what would count as 
success, and some Before and After measurements?

Just off the top of my head... if the number of documentation patches 
increases significantly (say, by 30%) after six months, that's a sign 
the move was successful.

It's one thing to say that using hg is discouraging contributors, and 
that hg is much more popular. It's another thing to say that moving to 
git will *actually make a difference*. Maybe all the would-be 
contributors using git are too busy writing kernel patches for Linus or 
using Node.js and wouldn't be caught dead with Python :-)

With concrete and objective measures of success, you will have 
ammunition to suggest moving the rest of Python to git in a few years' 
time. And if the measures show no improvement, we'll have good evidence 
that any further migration to git may be a waste of time and effort, and 
that we should focus our energy elsewhere rather than on git vs hg holy 
wars.


[...]
> I also think it’s hard to look at a company like bitbucket, for 
> example, and say they are *better* than Github just because they 
> didn’t have a public and inflammatory event.

We can't judge companies on what they might be doing behind closed 
doors, only on what we can actually see of them. Anybody might be rotten 
bounders and cads in private, but how would we know? It's an imperfect 
world and we have imperfect knowledge but still have to make a decision 
as best we can.


> Attempting to reduce the cognitive burden for contributing and 
> aligning ourselves with the most popular tools allows us to take 
> advantage of the network effects of these tools' popularity. This can 
> be the difference between someone with limited amount of time being 
> able to contribute or not, which can make real inroads towards making 
> it easier for under privileged people to contribute much more than 
> refusing to use a product of one group of people over another just 
> because the other group hasn't had a public and inflammatory event.

In other contexts, that could be a pretty awful excuse for inaction 
against the most egregiously bad behaviour. "Sure, Acme Inc might have 
adulterated baby food with arsenic, but other companies might have done 
worse things that we haven't found out about. So we should keep buying 
Acme's products, because they're cheaper and that's good for the poor."

Not that I'm comparing GitHub's actions with poisoning babies. What 
GitHub did was much worse. *wink*


-- 
Steven


Re: [Python-Dev] PEP 481 - Migrate Some Supporting Repositories to Git and Github

2014-12-01 Thread Steven D'Aprano
On Tue, Dec 02, 2014 at 12:37:22AM +1100, Steven D'Aprano wrote:
[...]
> It's one thing to say that using hg is discouraging contributors, and 
> that hg is much more popular.

/s/more/less/ 


-- 
Steven


Re: [Python-Dev] Python 2.x and 3.x use survey, 2014 edition

2014-12-12 Thread Steven D'Aprano
On Fri, Dec 12, 2014 at 10:24:15AM -0800, Mark Roberts wrote:
> So, I'm more than aware of how to write Python 2/3 compatible code. I've
> ported 10-20 libraries to Python 3 and write Python 2/3 compatible code at
> work. I'm also aware of how much writing 2/3 compatible code makes me hate
> Python as a language.

I'm surprised by the strength of feeling there.

Most of the code I write supports 2.4+, with the exception of 3.0 where 
I say "it should work, but if it doesn't, I don't care". I'll be *very* 
happy when I can drop support for 2.4, but with very few exceptions I 
have not found many major problems supporting both 2.7 and 3.3+ in the 
one code-base, and nothing I couldn't work around (sometimes by just 
dropping support for a specific feature in certain versions).

I'm not disputing that your experiences are valid, but I am curious what 
specific issues you have come across and wondering if there are things 
which 3.5 can include to ease that transition. E.g. 3.3 re-added support 
for u'' syntax.



-- 
Steven


Re: [Python-Dev] also

2015-01-28 Thread Steven D'Aprano
On Wed, Jan 28, 2015 at 09:39:25AM -0500, Alan Armour wrote:
> if you can do this
> 
> a chemical physics and element physics like everything from melting points
> to how much heat you need to add two chemicals together
> 
> and physics like aerodynamics, space dynamics, and hydrodynamics etcetera
> for propellers and motors and stuff.
> 
> just having this in a main language seems to make a shit ton of sense.

You should check out Frink:

http://futureboy.us/frinkdocs/


-- 
Steve


Re: [Python-Dev] subclassing builtin data structures

2015-02-12 Thread Steven D'Aprano
On Thu, Feb 12, 2015 at 06:14:22PM -0800, Ethan Furman wrote:
> On 02/12/2015 05:46 PM, MRAB wrote:
> > On 2015-02-13 00:55, Guido van Rossum wrote:

> >> Actually, the problem is that the base class (e.g. int) doesn't know how
> >> to construct an instance of the subclass -- there is no reason (in
> >> general) why the signature of a subclass constructor should match the
> >> base class constructor, and it often doesn't.
> >>
> >> So this is pretty much a no-go. It's not unique to Python -- it's a
> >> basic issue with OO.
> >>
> > Really?
> 
> What I was asking about, and Guido responded to, was not having to 
> specifically override __add__, __mul__, __sub__, and
> all the others; if we do override them then there is no problem.

I think you have misunderstood MRAB's comment. My interpretation is 
that MRAB is suggesting that methods in the base classes should use 
type(self) rather than hard-coding their own type.

E.g. if int were written in pure Python, it might look something like 
this:

class int(object):
    def __new__(cls, arg):
        ...

    def __add__(self, other):
        return int(self, other)

(figuratively, rather than literally). But if it looked like this:

    def __add__(self, other):
        return type(self)(self, other)


then sub-classing would "just work" without the sub-class having to 
override each and every method.



-- 
Steve


Re: [Python-Dev] subclassing builtin data structures

2015-02-14 Thread Steven D'Aprano
On Fri, Feb 13, 2015 at 06:03:35PM -0500, Neil Girdhar wrote:
> I personally don't think this is a big enough issue to warrant any changes,
> but I think Serhiy's solution would be the ideal best with one additional
> parameter: the caller's type.  Something like
> 
> def __make_me__(self, cls, *args, **kwargs)
> 
> and the idea is that any time you want to construct a type, instead of
> 
> self.__class__(assumed arguments…)
> 
> where you are not sure that the derived class' constructor knows the right
> argument types, you do
> 
> def SomeCls:
>  def some_method(self, ...):
>return self.__make_me__(SomeCls, assumed arguments…)
> 
> Now the derived class knows who is asking for a copy.

What if you wish to return an instance from a classmethod? You don't 
have a `self` available.

class SomeCls:
    def __init__(self, x, y, z):
        ...
    @classmethod
    def from_spam(cls, spam):
        x, y, z = process(spam)
        return cls.__make_me__(self, cls, x, y, z)  # oops, no self


Even if you are calling from an instance method, and self is available, 
you cannot assume that the information needed for the subclass 
constructor is still available. Perhaps that information is used in the 
constructor and then discarded.

The problem we wish to solve is that when subclassing, methods of some
base class blindly return instances of itself, instead of self's type:


py> class MyInt(int):
... pass
...
py> n = MyInt(23)
py> assert isinstance(n, MyInt)
py> assert isinstance(n+1, MyInt)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
AssertionError


The means that subclasses often have to override all the parent's 
methods, just to ensure the type is correct:

class MyInt(int):
    def __add__(self, other):
        o = super().__add__(other)
        if o is not NotImplemented:
            o = type(self)(o)
        return o


Something like that, repeated for all the int methods, should work:

py> n = MyInt(23)
py> type(n+1)
<class '__main__.MyInt'>


This is tedious and error prone, but at least once it is done, 
subclasses of MyInt will Just Work:


py> class MyOtherInt(MyInt):
... pass
...
py> a = MyOtherInt(42)
py> type(a + 1000)
<class '__main__.MyOtherInt'>


(At least, *in general* they will work. See below.)

So, why not have int's methods use type(self) instead of hard coding 
int? The answer is that *some* subclasses might override the 
constructor, which would cause the __add__ method to fail:

# this will fail if the constructor has a different signature
o = type(self)(o)


Okay, but changing the constructor signature is quite unusual. Mostly, 
people subclass to add new methods or attributes, or to override a 
specific method. The dict/defaultdict situation is relatively uncommon.

Instead of requiring *every* subclass to override all the methods, 
couldn't we require the base classes (like int) to assume that the 
signature is unchanged and call type(self), and leave it up to the 
subclass to override all the methods *only* if the signature has 
changed? (Which they probably would have to do anyway.)
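Here is a sketch of both halves of that argument, using invented class names: a base class whose __add__ wraps results in type(self), and a subclass with a changed constructor signature that consequently breaks:

```python
class FlexInt(int):
    # Base class: wrap arithmetic results in the actual subclass type.
    def __add__(self, other):
        result = super().__add__(other)
        if result is NotImplemented:
            return result
        return type(self)(result)


class TaggedInt(FlexInt):
    # Subclass with a changed constructor signature: type(self)(result)
    # inside FlexInt.__add__ no longer knows how to call it.
    def __new__(cls, value, tag):
        obj = super().__new__(cls, value)
        obj.tag = tag
        return obj


print(type(FlexInt(2) + FlexInt(3)).__name__)   # FlexInt
try:
    TaggedInt(2, "a") + TaggedInt(3, "b")
except TypeError as err:
    print("changed signature breaks it:", err)
```

So TaggedInt would have to override the methods anyway, which is exactly the point: only signature-changing subclasses pay that cost.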

As the MyInt example above shows, or datetime in the standard library, 
this actually works fine in practice:

py> from datetime import datetime
py> class MySpecialDateTime(datetime):
... pass
...
py> t = MySpecialDateTime.today()
py> type(t)
<class '__main__.MySpecialDateTime'>


Why can't int, str, list, tuple etc. be more like datetime?



-- 
Steve


Re: [Python-Dev] subclassing builtin data structures

2015-02-14 Thread Steven D'Aprano
On Sat, Feb 14, 2015 at 01:26:36PM -0500, Alexander Belopolsky wrote:
> On Sat, Feb 14, 2015 at 7:23 AM, Steven D'Aprano 
> wrote:
> 
> > Why can't int, str, list, tuple etc. be more like datetime?
> 
> 
> They are.  In all these types, class methods call subclass constructors but
> instance methods don't.

But in datetime, instance methods *do*.

Sorry that my example with .today() was misleading.

py> from datetime import datetime
py> class MyDatetime(datetime):
... pass
...
py> MyDatetime.today()
MyDatetime(2015, 2, 15, 12, 45, 38, 429269)
py> MyDatetime.today().replace(day=20)
MyDatetime(2015, 2, 20, 12, 45, 53, 405889)


> In the case of int, there is a good reason for this behavior - bool.  In
> python, we want True + True == 2.

Sure. But bool is only one subclass. I expect that it should be bool's 
responsibility to override __add__ etc. to return an instance of the 
parent class (int) rather have nearly all subclasses have to override 
__add__ etc. to return instances of themselves.



-- 
Steve


Re: [Python-Dev] PEP 488: elimination of PYO files

2015-03-06 Thread Steven D'Aprano
On Fri, Mar 06, 2015 at 09:37:05PM +0100, Antoine Pitrou wrote:
> On Fri, 06 Mar 2015 18:11:19 +
> Brett Cannon  wrote:
> > And the dropping of docstrings does have an impact on
> > memory usage when you use Python at scale.
> 
> What kind of "scale" are you talking about? Do you have any numbers
> about such impact?
> 
> > You're also assuming that we will never develop an AST optimizer
> 
> No, the assumption is that we don't have such an optimizer *right now*.
> Having command-line options because they might be useful some day is
> silly.

Quoting the PEP:

This issue is only compounded when people optimize Python 
code beyond what the interpreter natively supports, e.g., 
using the astoptimizer project [2]_.


Brett, I'm a very strong +1 on the PEP. It's well-written and gives a 
good explanation for why such a thing is needed. The current behaviour 
of re-using the same .pyo file for two distinct sets of bytecode is 
out-and-out buggy:

[steve@ando ~]$ python3.3 -O -c "import dis; print(dis.__doc__[:32])"
Disassembler of Python byte code
[steve@ando ~]$ python3.3 -OO -c "import dis; print(dis.__doc__[:32])"
Disassembler of Python byte code

The second should fail, since doc strings should be removed under -OO 
optimization, but because the .pyo file already exists it doesn't.

Even if CPython drops -O and -OO altogether, this PEP should still be 
accepted to allow third party optimizers like astoptimizer to interact 
without getting in each other's way.

(And for the record, I'm an equally strong -1 on dropping -O and -OO.)

Thank you.


-- 
Steve


Re: [Python-Dev] PEP 488: elimination of PYO files

2015-03-07 Thread Steven D'Aprano
On Fri, Mar 06, 2015 at 08:00:20PM -0500, Ron Adam wrote:

> Have you considered doing this by having different magic numbers in the 
> .pyc file for standard, -O, and -OO compiled bytecode files?  Python 
> already checks that number and recompiles the files if it's not what it's 
> expected to be.  And it wouldn't require any naming conventions or new 
> cache directories.  It seems to me it would be much easier to do as well.

And it would fail to solve the problem. The problem isn't just that the 
.pyo file can contain the wrong byte-code for the optimization level, 
that's only part of the problem. Another issue is that you cannot have 
pre-compiled byte-code for multiple different optimization levels. You 
can have a "no optimization" byte-code file, the .pyc file, but only one 
"optimized" byte-code file at the same time.

Brett's proposal will allow -O optimized and -OO optimized byte-code 
files to co-exist, as well as setting up a clear naming convention for 
future optimizers in either the Python compiler or third-party 
optimizers.
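For anyone who wants to see the co-existence for themselves once the PEP is implemented (the session below assumes Python 3.5+ and a throwaway module; the cache tag in the file names will match your interpreter):

```shell
cd "$(mktemp -d)"
echo 'x = 1' > demo.py             # throwaway module
python3 -c "import demo"           # writes demo.cpython-XY.pyc
python3 -O -c "import demo"        # writes demo.cpython-XY.opt-1.pyc
python3 -OO -c "import demo"       # writes demo.cpython-XY.opt-2.pyc
ls __pycache__                     # all three co-exist
```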

No new cache directories are needed. The __pycache__ directory has been 
used since Python 3.2 (PEP 3147). 



-- 
Steve


Re: [Python-Dev] boxing and unboxing data types

2015-03-08 Thread Steven D'Aprano
On Sun, Mar 08, 2015 at 08:31:30PM -0700, Ethan Furman wrote:

> When data is passed from Python to a native library (such as in an O/S 
> call), how does the unboxing of data types occur?
[...]
> So the real question: anywhere in Python where an int is expected (for 
> lower-level API work), but not directly received, should __int__ (or 
> __index__) be called?  and failure to do so is a bug?

I think the answer is in the docs:

https://docs.python.org/3/reference/datamodel.html#object.__int__

Immediately below that __index__ is described, with this note:

In order to have a coherent integer type class, when 
__index__() is defined __int__() should also be defined, 
and both should return the same value.


The PEP adding __index__ is also useful:

https://www.python.org/dev/peps/pep-0357/


My summary is as follows:

__int__ is used as the special method for int(), and it should coerce 
the object to an integer. This may be lossy e.g. int(2.999) --> 2 or may 
involve a conversion from a non-numeric type to integer e.g. int("2").

__index__ is used when the object in question actually represents an 
integer of some kind, e.g. a fixed-width integer. Conversion should be 
lossless and conceptually may be thought of as a way of telling Python 
"this value actually is an int, even though it doesn't inherit from int" 
(for some definition of "is an int").

There's no built-in way of calling __index__ that I know of (no 
equivalent to int(obj)), but slicing at the very least will call it, 
e.g. seq[a:] will call type(a).__index__.

If you define __index__ for your class, you should also define __int__ 
and have the two return the same value. I would expect that an IntFlags 
object should inherit from int, and if that is not possible, practical 
or desirable for some reason, then it should define __index__ and 
__int__.

Failure to call __index__ is not necessarily a bug. I think it is 
allowed for functions to insist on an actual int, as slicing did for 
many years, but it is an obvious enhancement to allow such functions to 
accept arbitrary int-like objects.
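To pull that together, a minimal sketch (the Fixed class is invented for illustration; the operator module's index() function will also invoke __index__ directly):

```python
import operator

class Fixed:
    # Invented example: a type that "actually is" an integer without
    # inheriting from int, so it defines both __index__ and __int__.
    def __init__(self, value):
        self._value = value

    def __index__(self):
        return self._value

    def __int__(self):          # should agree with __index__
        return self._value


seq = list(range(10))
print(seq[Fixed(3):])            # slicing calls __index__
print(operator.index(Fixed(7)))  # so does operator.index()
print(int(Fixed(7)))             # int() calls __int__
```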

Does that answer your questions?



-- 
Steve


Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Steven D'Aprano
On Mon, Mar 09, 2015 at 09:52:01AM -0400, Neil Girdhar wrote:

> Here is a list of methods on
> int that should not be on IntFlags in my opinion (give or take a couple):
> 
> __abs__, __add__, __delattr__, __divmod__, __float__, __floor__,
> __floordiv__, __index__, __lshift__, __mod__, __mul__, __pos__, __pow__,
> __radd__, __rdivmod__, __rfloordiv__, __rlshift__, __rmod__, __rmul__,
> __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__,
> __sub__, __truediv__, __trunc__, conjugate, denominator, imag, numerator,
> real.
> 
> I don't think __index__ should be exposed either since are you really going
> to slice a list using IntFlags?  Really?

In what way is this an *Int*Flags object if it is nothing like an int? 
It sounds like what you want is a bunch of Enum inside a set with a custom 
__str__, not IntFlags.


-- 
Steve


Re: [Python-Dev] Tunning binary insertion sort algorithm in Timsort.

2015-03-09 Thread Steven D'Aprano
On Sun, Mar 08, 2015 at 10:57:30PM -0700, Ryan Smith-Roberts wrote:
> I suspect that you will find the Python community extremely conservative
> about any changes to its sorting algorithm, given that it took thirteen
> years and some really impressive automated verification software to find
> this bug:

On the other hand, the only person who really needs to be convinced is 
Tim Peters. It's really not up to the Python community.

The bug tracker is the right place for discussing this.

-- 
Steve


Re: [Python-Dev] PEP 488: elimination of PYO files

2015-03-11 Thread Steven D'Aprano
On Wed, Mar 11, 2015 at 05:34:10PM +, Brett Cannon wrote:

> I have a poll going on G+ to see what people think of the various proposed
> file name formats at
> https://plus.google.com/u/0/+BrettCannon/posts/fZynLNwHWGm . Feel free to
> vote if you have an opinion.

G+ hates my browser and won't let me vote. I click on the button and 
nothing happens. I have Javascript enabled and I'm not using any ad 
blockers.

For the record, I think only the first two options 

importlib.cpython-35.opt-0.pyc
importlib.cpython-35.opt0.pyc


are sane, and I prefer the first. I'm mildly inclined to leave out the 
opt* part for default, unoptimized code. In other words, the file name 
holds two or three '.' delimited fields, plus the extension:

<module>.<implementation>-<version>.[opt-<level>].pyc

where [...] is optional and the optimization codes for CPython will be 1 
for -O and 2 for -OO. And 0 for unoptimized, if you decide that it 
should be mandatory.

Thank you for moving forward on this, I think it is a good plan.



-- 
Steve


Re: [Python-Dev] PEP 8 update

2015-04-06 Thread Steven D'Aprano
On Tue, Apr 07, 2015 at 03:11:30AM +0100, Rob Cliffe wrote:

> As a matter of interest, how far away from mainstream am I in 
> preferring, *in this particular example* (obviously it might be 
> different for more complicated computation),
> 
> def foo(x):
> return math.sqrt(x) if x >= 0 else None
> 
> I probably have a personal bias towards compact code, but it does seem 
> to me that the latter says exactly what it means, no more and no less, 
> and therefore is somewhat more readable.  (Easier to keep the reader's 
> attention for 32 non-whitespace characters than 40.)

In my opinion, code like that is a good example of why the ternary if 
operator was resisted for so long :-) Sometimes you can have code which 
is just too compact.

My own preference would be:

def foo(x):
    if x >= 0: 
        return math.sqrt(x)
    return None

but I'm not terribly fussed about whether the "else" is added or not, 
whether the return is on the same line as the if, and other minor 
variations. 

-- 
Steve


Re: [Python-Dev] PEP 8 update

2015-04-07 Thread Steven D'Aprano
On Tue, Apr 07, 2015 at 08:47:25AM -0400, Ben Hoyt wrote:
> > My own preference would be:
> >
> > def foo(x):
> >     if x >= 0:
> >         return math.sqrt(x)
> >     return None
> 
> Kind of getting into the weeds here, but I would always invert this to
> "return errors early, and keep the normal flow at the main indentation
> level". Depends a little on what foo() means, but it seems to me the
> "return None" case is the exceptional/error case, so this would be:
> 
> def foo(x):
>     if x < 0:
>         return None
>     return math.sqrt(x)

While *in general* I agree with "handle the error case early", there are 
cases where "handle the normal case early" is better, and I think that 
this is one of them. Also, inverting the comparison isn't appropriate, 
due to float NANs. With the first version, foo(NAN) returns None (which 
I assumed was deliberate by the OP). In your version, it returns NAN.
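For the avoidance of doubt, the NAN difference is easy to check (these are just the two versions from this thread, renamed so they can sit side by side):

```python
import math

def foo_normal_case_first(x):
    if x >= 0:
        return math.sqrt(x)
    return None

def foo_error_case_first(x):
    if x < 0:
        return None
    return math.sqrt(x)

nan = float("nan")
print(foo_normal_case_first(nan))   # None, because nan >= 0 is False
print(foo_error_case_first(nan))    # nan, because nan < 0 is also False
```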

But as you say, we're now deep into the weeds...


-- 
Steve


Re: [Python-Dev] Keyword-only parameters

2015-04-14 Thread Steven D'Aprano
On Tue, Apr 14, 2015 at 01:40:40PM -0400, Eric V. Smith wrote:

> But, I don't see a lot of keyword-only parameters being added to stdlib
> code. Is there some position we've taken on this? Barring someone saying
> "stdlib APIs shouldn't contain keyword-only params", I'm inclined to
> make numeric_owner keyword-only.

I expect that's because keyword-only parameters are quite recent (3.x 
only) and most of the stdlib is quite old.

Keyword-only feels right for this to me too.

-- 
Steve


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-20 Thread Steven D'Aprano
On Mon, Apr 20, 2015 at 11:34:51PM +0100, Harry Percival wrote:
> exactly.  yay stub files!  we all agree! everyone loves them!

Not even close.


-- 
Steve


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-20 Thread Steven D'Aprano
On Mon, Apr 20, 2015 at 07:30:39PM +0100, Harry Percival wrote:
> Hi all,
> 
> tldr; type hints in python source are scary. Would reserving them for stub
> files be better?

No no no, a thousand times no it would not!

Please excuse my extreme reaction, but over on the python-list mailing 
list (comp.lang.python if you prefer Usenet) we already had this 
discussion back in January.

Anyone wishing to read those conversations should start here:

https://mail.python.org/pipermail/python-list/2015-January/697202.html

https://mail.python.org/pipermail/python-list/2015-January/697315.html

Be prepared for a long, long, long read. Nothing in your post hasn't 
already been discussed (except for your proposal to deprecate 
annotations altogether). So if you feel that I'm giving any of your 
ideas or concerns short shrift, I'm not; it's just that I've already 
given them more time than I can afford. And now I get to do it all 
again, yay.

(When reading those threads, please excuse my occasional snark towards 
Rick, he is a notorious troll on the list and sometimes I let myself be 
goaded into somewhat less than professional responses.)

While I sympathise with your thought that "it's scary", I think it is 
misguided and wrong. As a thought-experiment, let us say that we roll 
back the clock to 1993 or thereabouts, just as Python 1.0 (or so) was 
about to be released, and Guido proposed adding *default values* to 
function declarations [assuming they weren't there from the start]. If 
we were used to Python's clean syntax:

def zipmap(f, xx, yy):

the thought of having to deal with default values:

def zipmap(f=None, xx=(), yy=()):

might be scary. Especially since those defaults could be arbitrarily 
complex expressions. Twenty years on, what should we think about such 
fears?

Type hinting or declarations are extremely common in programming 
languages, and I'm not just talking about older languages like C and 
Java. New languages, both dynamic and static, like Cobra, Julia, 
Haskell, Go, Boo, D, F#, Fantom, Kotlin, Rust and many more include 
optional or mandatory type declarations. You cannot be a programmer 
without expecting to deal with type hints/declarations somewhere. As 
soon as you read code written in other languages (and surely you do 
that, don't you? you practically cannot escape Java and C code on the 
internet) and in my opinion Python cannot be a modern language without 
them.

I'm going to respond to your recommendation to use stub files in 
another post (replying to Barry Warsaw), here I will discuss your 
concerns first.

> My first reaction to type hints was "yuck", and I'm sure I'm not the only
> one to think that.  viz (from some pycon slides):
> 
> def zipmap(f: Callable[[int, int], int], xx: List[int],
>yy: List[int]) -> List[Tuple[int, int, int]]:
> 
> arg.  and imagine it with default arguments.

You've picked a complex example and written it poorly. I'd say yuck too, 
but let's use something closer to PEP-8 formatting:

def zipmap(f: Callable[[int, int], int],
           xx: List[int],
           yy: List[int]
           ) -> List[Tuple[int, int, int]]:

Not quite so bad with each parameter on its own line. It's actually 
quite readable, once you learn what the annotations mean. Like all new 
syntax, of course you need to learn it. But the type hints are just 
regular Python expressions.


> Of course, part of this reaction is just a knee-jerk reaction to the new
> and unfamiliar, and should be dismissed, entirely justifiably, as mere
> irrationality.  But I'm sure sensible people agree that they do make our
> function definitions longer, more complex, and harder to read.

Everything has a cost and all features, or lack of features, are a 
trade-off. Function definitions could be even shorter and simpler and 
easier to read if we didn't have default values.


[...] 
> I'm not so sure.  My worry is that once type hinting gets standardised,
> then they will become a "best practice", and there's a particular
> personality type out there that's going to start wanting to add type hints
> to every function they write.  Similarly to mindlessly obeying PEP8 while
> ignoring its intentions, hobgoblin-of-little-minds style, I think we're
> very likely to see type hints appearing in a lot of python source, or a lot
> of pre-commit-hook checkers.  Pretty soon it will be hard to find any open
> source library code that doesn't have type hints, or any project style
> guide that doesn't require them.

I doubt that very much. I'm not a betting man, but if I were, I would 
put money on it.

Firstly: libraries tend to be multi-version, and these days they are 
often hybrid Python 2 & 3 code. Since annotations are 3 only, libraries 
cannot use these type hints until they drop support for Python 2.7, 
which will surely be *no less* than five years away. Probably more like 
ten. So "annotations everywhere" are, at best, many years away.

Secondly, more importantl

Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-20 Thread Steven D'Aprano
On Mon, Apr 20, 2015 at 02:41:06PM -0400, Barry Warsaw wrote:
> On Apr 20, 2015, at 07:30 PM, Harry Percival wrote:
> 
> >tldr; type hints in python source are scary. Would reserving them for stub
> >files be better?
> 
> I think so.  I think PEP 8 should require stub files for stdlib modules and
> strongly encourage them for 3rd party code.

A very, very strong -1 to that.

Stub files are a necessary evil. Except where absolutely necessary, 
they should be strongly discouraged. A quote from the Go FAQs:

Dependency management is a big part of software development 
today but the “header files” of languages in the C tradition 
are antithetical to clean dependency analysis—and fast 
compilation.

http://golang.org/doc/faq#What_is_the_purpose_of_the_project


Things that go together should be together. A function parameter and 
its type information (if any) go together: the type is as much a part 
of the parameter declaration as the name and the default. Putting them 
together is the best situation:

def func(n: Integer): ...


and should strongly be prefered as best practice for when you choose to 
use type hinting at all. Alternatives are not as good. Second best is to 
put them close by, as in a decorator:

@typing(n=Integer)  # Don't Repeat Yourself violation
def func(n): ...


A distant third best is a docstring. Not only does it also violate DRY, 
but it also increases the likelihood of errors:

def func(n):
    """Blah blah blah

    blah blah blah

    Arguments:

        m:  Integer

    """

Keeping documentation and code in synch is hard, and such mistakes are 
not uncommon.

Putting the type information in a stub file is an exponentially more 
distant fourth best, or to put it another way, *the worst* solution for 
where to put type hints. Not only do you Repeat Yourself with the name 
of the parameter, but also the name of the function (or method and 
class) AND module. The type information *isn't even in the same file*, 
which increases the chance of it being lost, forgotten, deleted, out of 
date, unmaintained, etc.
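To make the duplication concrete, here is a sketch of what a stub file for the earlier `func` might look like (the `.pyi` convention is real, but the module name and `int` types here are invented for illustration):

```python
# spam.pyi -- hypothetical stub file, kept in a *separate* file from the
# implementation.  Note that the function name AND the parameter name
# must both be repeated here, away from the code they describe:
def func(n: int) -> int: ...
```

If the implementation later renames `n` or changes its type, nothing forces this file to be updated alongside it.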


-- 
Steve
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Steven D'Aprano
On Mon, Apr 20, 2015 at 08:37:28PM -0700, Guido van Rossum wrote:
> On Mon, Apr 20, 2015 at 4:41 PM, Jack Diederich  wrote:
> 
> > Twelve years ago a wise man said to me "I suggest that you also propose a
> > new name for the resulting language"
> >
> 
> The barrage of FUD makes me feel like the woman who asked her doctor for a
> second opinion and was told "you're ugly too."

Don't worry Guido, some of us are very excited to see this coming to 
fruition :-)

It's been over ten years since your first blog post on optional typing 
for Python. At least nobody can accuse you of rushing into this.

http://www.artima.com/weblogs/viewpost.jsp?thread=85551


-- 
Steve



Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Steven D'Aprano
On Tue, Apr 21, 2015 at 11:56:15AM +0100, Rob Cliffe wrote:

> (Adding a type hint that restricted the argument to say a 
> sequence of numbers turns out to be a mistake.

Let's find out how big a mistake it is with a test run.

py> def sorter(alist: List[int]) -> List[int]:
...     return sorted(alist)
...
py> data = (chr(i) + 'ay' for i in range(97, 107))
py> type(data)
<class 'generator'>
py> sorter(data)
['aay', 'bay', 'cay', 'day', 'eay', 'fay', 'gay', 'hay', 'iay', 'jay']


When we say that type checking is optional, we mean it.

[Disclaimer: I had to fake the List object, since I don't have the 
typing module, but everything else is exactly as you see it.]

Annotations will be available for type checking. If you don't want to 
type check, don't type check. If you want to go against the type hints, 
you can go against the type hints, and get exactly the same runtime 
errors as you have now:

py> sorter(None)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in sorter
TypeError: 'NoneType' object is not iterable

There is no compile time checking unless you choose to run a type 
checker or linter -- and even if the type checker flags errors, you can 
ignore it and run the program regardless. Just like today with linters 
like PyFlakes, PyLint and similar.

For those who choose not to run a type checker, the annotations will be 
nothing more than introspectable documentation.
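And that documentation is introspectable at runtime with nothing but the language itself. A minimal sketch, using the builtin `list` type rather than `typing.List` so it runs anywhere:

```python
def sorter(alist: list) -> list:
    """Sketch of the function above, annotated with the builtin list type."""
    return sorted(alist)

# With no checker running, annotations are just stored metadata:
print(sorter.__annotations__)

# ...and they have no effect whatsoever on execution:
print(sorter((3, 1, 2)))  # a tuple, not a list -- works fine
```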


> And what is a number?
>  Is Fraction?  What about complex numbers, which can't be 
> sorted?  What if the function were written before the Decimal class?)

I know that you are intending these as rhetorical questions, but Python 
has had a proper numeric tower since version 2.5 or 2.6. So:

py> from numbers import Number
py> from decimal import Decimal
py> isinstance(Decimal("1.25"), Number)
True
py> isinstance(2+3j, Number)
True
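The same goes for Fraction, and the "complex numbers can't be sorted" point is already enforced at runtime anyway. A quick sketch:

```python
from fractions import Fraction
from numbers import Number

# Fraction slots into the numeric tower just like Decimal and complex:
assert isinstance(Fraction(1, 3), Number)

# complex is a Number, but ordering it (and hence sorting) is a TypeError:
try:
    sorted([2j, 1j])
except TypeError:
    print("complex numbers cannot be sorted")
```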


> Errors are often not caught until run time that would be caught at 
> compile time in other languages (though static code checkers help).

Yes they do help, which is exactly the point.


> (Not much of a disadvantage because of Python's superb error 
> diagnostics.)

That's certainly very optimistic of you.

If I had to pick just one out of compile time type checking versus run 
time unit tests, I'd pick run time tests. But it is naive to deny the 
benefits of compile time checks in catching errors that you otherwise 
might not have found even with extensive unit tests (and lets face it, 
we never have enough unit tests).

Ironically, type hinting will *reduce* the need for intrusive, 
anti-duck-testing explicit calls to isinstance() at runtime:

def func(x:float):
    if isinstance(x, float): ...
    else: raise TypeError


Why bother making that expensive isinstance call every single time the 
function is called, if the type checker can prove that x is always a 
float?


> Python code typically says what it is doing, with the minimum of 
> syntactic guff.  (Well, apart from colons after if/while/try etc. :-) )
> Which makes it easy to read.
> Now it seems as if this proposal wants to start turning Python in the 
> C++ direction, encouraging adding ugly boilerplate code.  (This may only 
> be tangentially relevant, but I want to scream when I see some 
> combination of public/private/protected/static/extern etc., most of 
> which I don't understand.)

Perhaps if you understood it you would be less inclined to scream.


> Chris A makes the valid point (if I understand correctly) that
> Authors of libraries should make it as easy as possible to
>   (i) know what object types can be passed to functions
>   (ii) diagnose when the wrong type of object is passed
> Authors of apps are not under such obligation, they can basically 
> do what they want.
> 
> Well,
> (i) can be done with good documentation (docstrings etc.).
> (ii) can be done with appropriate runtime checks and good error 
> messages.

How ironic. After singing the praises of duck-typing, now you are 
recommending runtime type checks.

As far as good error messages go, they don't help you one bit when the 
application suddenly falls over in a totally unexpected place due to a 
bug in your code.

I can't go into too many details due to commercial confidentiality, but 
we experienced something similar recently. A situation nobody foresaw, 
that wasn't guarded against, and wasn't tested for, came up after 
deployment. There was a traceback, of course, but a failure in the field 
200km away with a stressed customer and hundreds of angry users is not 
as useful as a compile-time failure during development.


> You see where I'm going with this - adding type hints to Python feels a 
> bit like painting feet on the snake.

Pythons are one of the few snakes which have vestigial legs:

http://en.wikipedia.org/wiki/Pelvic_spur


-- 
Steve

Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Steven D'Aprano
On Tue, Apr 21, 2015 at 01:25:34PM +0100, Chris Withers wrote:

> Anyway, I've not posted much to python-dev in quite a while, but this is 
> a topic that I would be kicking myself in 5-10 years time when I've had 
> to move to Javascript or  because everyone 
> else has drifted away from Python as it had become ugly...


Facebook released Flow, a static typechecker for Javascript, to a very 
positive reaction. From their announcement:

Flow’s type checking is opt-in — you do not need to type check all 
your code at once. However, underlying the design of Flow is the 
assumption that most JavaScript code is implicitly statically typed; 
even though types may not appear anywhere in the code, they are in 
the developer’s mind as a way to reason about the correctness of the 
code. Flow infers those types automatically wherever possible, which 
means that it can find type errors without needing any changes to 
the code at all. On the other hand, some JavaScript code, especially 
frameworks, make heavy use of reflection that is often hard to 
reason about statically. For such inherently dynamic code, type 
checking would be too imprecise, so Flow provides a simple way to 
explicitly trust such code and move on. This design is validated by 
our huge JavaScript codebase at Facebook: Most of our code falls in 
the implicitly statically typed category, where developers can check 
their code for type errors without having to explicitly annotate 
that code with types.

Quoted here: 

http://blog.jooq.org/2014/12/11/the-inconvenient-truth-about-dynamic-vs-static-typing/


More about flow: 

http://flowtype.org/


Matz is interested in the same sort of gradual type checking for Ruby as 
Guido wants to add to Python:

https://www.omniref.com/blog/blog/2014/11/17/matz-at-rubyconf-2014-will-ruby-3-dot-0-be-statically-typed/


Julia already includes this sort of hybrid dynamic+static type checking:

http://julia.readthedocs.org/en/latest/manual/types/

I could keep going, but I hope I've made my point. Whatever language you 
are using in 5-10 years time, it will almost certainly be either mostly 
static with some dynamic features like Java, or dynamic with optional 
and gradual typing.



-- 
Steven


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Steven D'Aprano
On Tue, Apr 21, 2015 at 03:08:27PM +0200, Antoine Pitrou wrote:
> On Tue, 21 Apr 2015 22:47:23 +1000
> Steven D'Aprano  wrote:
> > 
> > Ironically, type hinting will *reduce* the need for intrusive, 
> > anti-duck-testing explicit calls to isinstance() at runtime:
> 
> It won't, since as you pointed out yourself, type checks are purely
> optional and entirely separate from compilation and runtime evaluation.

Perhaps you are thinking of libraries, where the library function has to 
deal with whatever junk people throw at it. To such libraries, I believe 
that the major benefit of type hints is not so much in proving the 
library's correctness in the face of random arguments, but as 
documentation. In any case, of course you are correct that public 
library functions and methods will continue to need to check their 
arguments. (Private functions, perhaps not.)

But for applications, the situation is different. If my application 
talks to a database and extracts a string which it passes on to its own 
function spam(), then it will be a string. Not a string-like object. Not 
something that quacks like a string. A string. Once the type checker is 
satisfied that spam() always receives a string, then further isinstance 
checks inside spam() is a waste of time. If spam()'s caller changes and 
might return something which is not a string, then the type checker 
will flag that.

Obviously to get this benefit you need to actually use a type checker. I 
didn't think I needed to mention that.


-- 
Steve


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Steven D'Aprano
On Tue, Apr 21, 2015 at 03:51:05PM +0100, Cory Benfield wrote:
> On 21 April 2015 at 15:31, Chris Angelico  wrote:
> > Granted, there are some
> > vague areas - how many functions take a "file-like object", and are
> > they all the same? - but between MyPy types and the abstract base
> > types that already exist, there are plenty of ways to formalize duck
> > typing.
> 
> Are there? Can I have a link or an example, please? I feel like I
> don't know how I'm supposed to do this, and I'd like to see how that
> works. I'll even give a concrete use-case: I want to be able to take a
> file-like object that has a .read() method and a .seek() method.

I've never done this before, so I might not quite have done it 
correctly, but this appears to work just fine:

py> import abc
py> class SeekableReadable(metaclass=abc.ABCMeta):
...     @classmethod
...     def __subclasshook__(cls, C):
...         if hasattr(C, 'seek') and hasattr(C, 'read'):
...             return True
...         return NotImplemented
...
py> f = open('/tmp/foo')
py> isinstance(f, SeekableReadable)
True
py> from io import StringIO
py> issubclass(StringIO, SeekableReadable)
True
py> issubclass(int, SeekableReadable)
False


That gives you your runtime check for an object with seek() and read() 
methods. For compile-time checking, I expect you would define 
SeekableReadable as above, then make the declaration:

def read_from_start(f:SeekableReadable, size:int):
    f.seek(0)
    return f.read(size)

So now you have runtime interface checking via an ABC, plus 
documentation for the function parameter type via annotation.

But will the static checker understand that annotation? My guess is, 
probably not as it stands. According to the docs, MyPy currently 
doesn't support this sort of duck typing, but will:

[quote]
There are also plans to support more Python-style “duck typing” 
in the type system. The details are still open.
[end quote]

http://mypy.readthedocs.org/en/latest/class_basics.html#abstract-base-classes-and-multiple-inheritance


I expect that dealing with duck typing will be very high on the list 
of priorities for the future. In the meantime, for this specific use-case, 
you're probably not going to be able to statically check this type hint. 
Your choices would be:

- don't type check anything;

- don't type check the read_from_start() function, but type check 
  everything else;

- don't type check the f parameter (remove the SeekableReadable 
  annotation, or replace it with Any, but leave the size:int 
  annotation);

- possibly some type checkers will infer from the function body that f 
  must have seek() and read() methods, and you don't have to declare 
  anything (structural typing instead of nominal?);

- (a bad idea, but just for the sake of completeness) leave the 
  annotation in, and ignore false negatives.


Remember that there is no built-in Python type checker. If you have no 
checker, the annotations are just documentation and nothing else will 
have changed. If you don't like the checker you have, you'll be able to 
replace it with another.



-- 
Steve


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-23 Thread Steven D'Aprano
On Thu, Apr 23, 2015 at 03:25:30PM +0100, Harry Percival wrote:
> lol @ the fact that the type hints are breaking github's syntax highlighter
> :)

That just tells us that Github's syntax highlighter has been broken for 
over five years. Function annotations go back to Python 3.0, more than 
five years ago. The only thing which is new about type hinting is that 
we're adding a standard *use* for those annotations.

I just tested a version of kwrite from 2005, ten years old, and it 
highlights the following annotated function perfectly:

def func(a:str='hello', b:int=int(x+1)) -> None:
    print(a + b)


Of course, I'm hoping that any decent type checker won't need the type 
hints. It should be able to infer from the default values that a is a 
string and b an int, and only require a type hint if you want to accept 
other types as well.

(It should also highlight that a+b cannot succeed.)


-- 
Steve


Re: [Python-Dev] async/await in Python; v2

2015-04-24 Thread Steven D'Aprano
On Thu, Apr 23, 2015 at 01:51:52PM -0400, Barry Warsaw wrote:

> Why "async def" and not "def async"?
> 
> My concern is about existing tools that already know that "def" as the first
> non-whitespace on the line starts a function/method definition.  Think of a
> regexp in an IDE that searches backwards from the current line to find the
> function its defined on.  Sure, tools can be updated but it is it *necessary*
> to choose a syntax that breaks tools?

Surely its the other way? If I'm searching for the definition of a 
function manually, I search for "def spam". `async def spam` will still 
be found, while `def async spam` will not.

It seems to me that tools that search for r"^\s*def\s+spam\s*\(" are 
going to break whichever choice is made, while a less pedantic search 
like r"def\s+spam\s*\(" will work only if async comes first.
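A quick check with the re module bears this out (the sample lines are invented):

```python
import re

# The pedantic pattern anchors "def" to the start of the line;
# the relaxed pattern searches for it anywhere.
pedantic = re.compile(r"^\s*def\s+spam\s*\(")
relaxed = re.compile(r"def\s+spam\s*\(")

async_first = "async def spam():"
def_first = "def async spam():"

assert not pedantic.search(async_first)  # anchored search misses both forms
assert not pedantic.search(def_first)
assert relaxed.search(async_first)       # unanchored search finds "async def"
assert not relaxed.search(def_first)     # ...but not "def async"
```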


-- 
Steve


Re: [Python-Dev] typeshed for 3rd party packages

2015-04-24 Thread Steven D'Aprano
On Wed, Apr 22, 2015 at 11:26:14AM -0500, Ian Cordasco wrote:

> On a separate thread Cory provided an example of what the hints would look
> like for *part* of one function in the requests public functional API.
> While our API is outwardly simple, the values we accept in certain cases
> are actually non-trivially represented. Getting the hints *exactly* correct
> would be extraordinarily difficult.

I don't think you need to get them exactly correct. The type-checker 
does two things:

(1) catch type errors involving types which should not be allowed;

(2) allow code which involves types which should be allowed.

If the type hints are wrong, there are two errors: false positives, when 
code which should be allowed is flagged as a type error; and false 
negatives, when code which should be flagged as an error is not.
Ideally, there should be no false positives. But false negatives are not 
so important, since you will still be doing runtime checks. All that 
means is that the static type-checker will be a little less capable of 
picking up type errors at compile time.


-- 
Steve


Re: [Python-Dev] typeshed for 3rd party packages

2015-04-24 Thread Steven D'Aprano
On Fri, Apr 24, 2015 at 03:44:45PM +0100, Cory Benfield wrote:
> On 24 April 2015 at 15:21, Steven D'Aprano  wrote:
> 
> > If the type hints are wrong, there are two errors: false positives, when
> > code which should be allowed is flagged as a type error; and false
> > negatives, when code which should be flagged as an error is not.
> > Ideally, there should be no false positives. But false negatives are not
> > so important, since you will still be doing runtime checks. All that
> > means is that the static type-checker will be a little less capable of
> > picking up type errors at compile time.
> 
> I think that's a rational view that will not be shared as widely as I'd like.

I can't tell if you are agreeing with me, or disagreeing. The above 
sentence seems to be agreeing with me, but you later end your message 
with "do it properly or not at all" which disagrees. So I'm confused.


> Given that the purpose of a type checker is to catch bugs caused by
> passing incorrectly typed objects to a function, it seems entirely
> reasonable to me to raise a bug against a type hint that allows code
> that was of an incorrect type where that incorrectness *could* have
> been caught by the type hint.

Of course it is reasonable for people to submit bug reports to do with 
the type hints. And it is also reasonable for the package maintainer to 
reject the bug report as "Won't Fix" if it makes the type hint too 
complex.

The beauty of gradual typing is that unlike Java or Haskell, you can 
choose to have as little or as much type checking as works for you. You 
don't have to satisfy the type checker over the entire program before 
the code will run, you only need check the parts you want to check.


> Extending from that into the general
> ratio of "reports that are actually bugs" versus "reports that are
> errors on the part of the reporter", I can assume that plenty of
> people will raise bug reports for incorrect cases as well.

Okay. Do you get many false positive bug reports for your tests too?


> From the perspective of sustainable long-term maintenance, I think the
> only way to do type hints is to have them be sufficiently exhaustive
> that a user would have to actively *try* to hit an edge case false
> negative. I believe that requests' API is too dynamically-typed to fit
> into that category at this time.

I think we agree that, static type checks or no static type checks, 
requests is going to need to do runtime type checks. So why does it 
matter if it misses a few type errors at compile time?

I think we're all in agreement that for extremely dynamic code like 
requests, you may not get as much value from static type checks as some 
other libraries or applications. You might even decide that you get no 
value at all. Okay, that's fine. I'm just suggesting that you don't have 
just two choices, "all or nothing". The whole point of gradual typing is 
to give developers more options.


> PS: I should mention that, as Gary Bernhardt pointed out at PyCon,
> people often believe (incorrectly) that types are a replacement for
> tests.

They *can* be a replacement for tests. You don't see Java or Haskell
programmers writing unit tests to check that their code never tries to 
add a string to a float. Even if they could write such as test, they 
don't bother because the type checker will catch that sort of error.

The situation in Python is a bit different, and as Antoine points out, 
libraries cannot rely on their callers obeying the type restrictions of 
the public API. (Private functions are different -- if you call my 
private function with the wrong type and blow up your computer, it's 
your own fault.) For libraries, I see type checks as complementing 
tests, not replacing them.

But for application code, type checks may replace unit tests, provided 
that nobody checks in production code until both the type checker and 
the unit tests pass. If you work under that rule, there's no point in 
having the unit tests check what the type checker already tested.


> For that reason I feel like underspecified type hints are
> something of an attractive nuisance. Again, I really think this is a
> case of do it properly or not at all.

In my opinion, underspecified type hints are no more of an attractive 
nuisance than a test suite which doesn't test enough. Full coverage is 
great, but 10% coverage is better than 5% coverage, which is better than 
nothing. That applies whether we are talking about tests, type checks, 
or documentation.


-- 
Steve


Re: [Python-Dev] async/await in Python; v2

2015-04-24 Thread Steven D'Aprano
On Fri, Apr 24, 2015 at 09:32:51AM -0400, Barry Warsaw wrote:
> On Apr 24, 2015, at 11:17 PM, Steven D'Aprano wrote:
> 
> >It seems to me that tools that search for r"^\s*def\s+spam\s*\(" are
> 
> They would likely search for something like r"^\s*def\s+[a-zA-Z0-9_]+" which
> will hit "def async spam" but not "async def".

Unless somebody wants to do a survey of editors and IDEs and other 
tools, arguments about what regex they may or may not use to search for 
function definitions is an exercise in futility. They may use regexes 
anchored to the start of the line. They may not. They may deal with "def 
async" better than "async def", or the other way around. Either way, 
it's a pretty thin argument for breaking the invariant that the token 
following `def` is the name of the function.

Whatever new syntax is added, something is going to break.

-- 
Steve


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-24 Thread Steven D'Aprano
On Sat, Apr 25, 2015 at 02:05:15AM +0100, Ronan Lamy wrote:

> * Hints have no run-time effect. The interpreter cannot assume that they 
> are obeyed.

I know what you mean, but just for the record, annotations are runtime 
inspectable, so people can (and probably have already started) to write 
runtime argument checking decorators or frameworks which rely on the 
type hints.


> * PEP484 hints are too high-level. Replacing an 'int' object with a 
> single machine word would be useful, but an 'int' annotation gives no 
> guarantee that it's correct (because Python 3 ints can have arbitrary 
> size and because subclasses of 'int' can override any operation to 
> invoke arbitrary code).

Then create your own int16, uint64 etc types.
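Such a type is easy to sketch (`int16` is an invented name here; nothing in PEP 484 or the stdlib provides it):

```python
class int16(int):
    """Sketch of a 16-bit signed integer: an int that range-checks itself."""
    def __new__(cls, value=0):
        value = int(value)
        if not -2**15 <= value < 2**15:
            raise OverflowError("int16 must be in [-32768, 32767]")
        return super().__new__(cls, value)
```

A checker would simply treat `int16` as a subclass of `int`; whether a particular compiler or JIT can exploit the range guarantee is its own business.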


> * A lot more information is needed to produce good code (e.g. “this f() 
> called here really means this function there, and will never be 
> monkey-patched” – same with len() or list(), btw).
> * Most of this information cannot easily be expressed as a type
> * If the interpreter gathers all that information, it'll probably have 
> gathered a superset of what PEP484 can provide anyway.

All this is a red herring. If type hints are useful to PyPy, that's a 
bonus. Cython uses its own system of type hints, a future version may be 
able to use PEP 484 hints instead. But any performance benefit is a 
bonus. PEP 484 is for increasing correctness, not speed.



-- 
Steve


Re: [Python-Dev] What's missing in PEP-484 (Type hints)

2015-04-30 Thread Steven D'Aprano
On Thu, Apr 30, 2015 at 01:41:53PM +0200, Dima Tisnek wrote:

> # Syntactic sugar
> "Beautiful is better than ugly, " thus nice syntax is needed.
> Current syntax is very mechanical.
> Syntactic sugar is needed on top of current PEP.

I think the annotation syntax is beautiful. It reminds me of Pascal.


> # internal vs external
> @intify
> def foo() -> int:
> b = "42"
> return b  # check 1
> x = foo() // 2  # check 2
> 
> Does the return type apply to implementation (str) or decorated callable 
> (int)?

I would expect that a static type checker would look at foo, and flag 
this as an error. The annotation says that foo returns an int, but it 
clearly returns a string. That's an obvious error.

Here is how I would write that:


import functools
from typing import Callable

# Perhaps typing should have a Function type?
def intify(func: Callable[[], str]) -> Callable[[], int]:
    @functools.wraps(func)
    def inner() -> int:
        return int(func())
    return inner


@intify
def foo() -> str:
    b = "42"
    return b


That should, I hope, pass the type check, and without lying about the 
signature of *undecorated* foo.

The one problem with this is that naive readers will assume that 
*decorated* foo also has a return type of str, and be confused. That's a 
problem. One solution might be, "don't write decorators that change the 
return type", but that seems horribly restrictive. Another solution 
might be to write a comment:

@intify  # changes return type to int
def foo() -> str:
...

but that's duplicating information already in the intify decorator, and 
it relies on the programmer writing a comment, which people don't do 
unless they really need to.

I think that the only solution is education: given a decorator, you 
cannot assume that the annotations still apply unless you know what the 
decorator does.


> How can same annotation or a pair of annotations be used to:
> * validate return statement type
> * validate subsequent use
> * look reasonable in the source code
> 
> 
> # lambda
> Not mentioned in the PEP, omitted for convenience or is there a rationale?
> f = lambda x: None if x is None else str(x ** 2)
> Current syntax seems to preclude annotation of `x` due to colon.
> Current syntax sort of allows lamba return type annotation, but it's
> easy to confuse with `f`.

I don't believe that you can annotate lambda functions with current 
syntax. For many purposes, I do not think that is important: a good type 
checker will often be able to infer the return type of the lambda, and 
from that infer what argument types are permitted:

lambda arg: arg + 1

Obviously arg must be a Number, since it has to support addition with 
ints.


> # local variables
> Not mentioned in the PEP
> Non-trivial code could really use these.

Normally local variables will have their type inferred from the 
operations done to them:

s = arg[1:]  # s has the same type as arg

When that is not satisfactory, you can annotate variables with a comment:

s = arg[1:]  #type: List[int]

https://www.python.org/dev/peps/pep-0484/#id24


> # global variables
> Not mentioned in the PEP
> Module-level globals are part of API, annotation is welcome.
> What is the syntax?

As above.


> # comprehensions
> [3 * x.data for x in foo if "bar" in x.type]
> Arguable, perhaps annotation is only needed on `foo` here, but then
> how complex comprehensions, e.g. below, the intermediate comprehension
> could use an annotation
> [xx for y in [...] if ...]

A list comprehension is obviously of type List. If you need to give a 
more specific hint:

result = [expr for x in things if cond(x)]  #type: List[Whatever]

See also the discussion of "cast" in the PEP.

https://www.python.org/dev/peps/pep-0484/#id25


> # class attributes
> s = socket.socket(...)
> s.type, s.family, s.proto  # int
> s.fileno  # callable
> If annotations are only available for methods, it will lead to
> Java-style explicit getters and setters.
> Python language and data model prefers properties instead, thus
> annotations are needed on attributes.

class Thing:
    a = 42  # can be inferred
    b = []  # inferred as List[Any]
    c = []  #type: List[float]



-- 
Steve


Re: [Python-Dev] PEP 492 quibble and request

2015-05-01 Thread Steven D'Aprano
On Wed, Apr 29, 2015 at 06:12:37PM -0700, Guido van Rossum wrote:
> On Wed, Apr 29, 2015 at 5:59 PM, Nick Coghlan  wrote:
> 
> > On 30 April 2015 at 10:21, Ethan Furman  wrote:
> > > From the PEP:
> > >
> > >> Why not a __future__ import
> > >>
> > >> __future__ imports are inconvenient and easy to forget to add.
> > >
> > > That is a horrible rationale for not using an import.  By that logic we
> > > should have everything in built-ins.  ;)
> >
> 
> This response is silly. The point is not against import but against
> __future__. A __future__ import definitely is inconvenient -- few people I
> know could even recite the correct constraints on their placement.

Are you talking about actual Python programmers, or people who dabble 
with the odd Python script now and again? I'm kinda shocked if it's the 
first.

It's not a complex rule: the __future__ import must be the first line of 
actual executable code in the file, so it can come after any encoding 
cookie, module docstring, comments and blank lines, but before any other 
code. The only part I didn't remember was that you can have multiple 
__future__ imports, I thought they all had to be on one line. (Nice to 
learn something new!)
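The rule in code form, as a sketch of a minimal module layout (the particular __future__ features imported are just examples):

```python
# -*- coding: utf-8 -*-
"""Module docstring: allowed before the __future__ import."""
# Comments and blank lines are also fine up here.

from __future__ import annotations  # the first *executable* statement
from __future__ import division     # multiple __future__ imports are allowed

x = 1  # ordinary code must come after all __future__ imports
```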



[...]
> > 'as' went through the "not really a keyword" path, and
> > it's a recipe for complexity in the code generation toolchain and
> > general quirkiness as things behave in unexpected ways.
> >
> 
> I don't recall that -- but it was a really long time ago so I may
> misremember (did we even have __future__ at the time?).

I have a memory of much rejoicing when "as" was made a keyword, and an 
emphatic "we're never going to do that again!" about semi-keywords. I've 
tried searching for the relevant post(s), but cannot find anything. 
Maybe I imagined it?

But I do have Python 2.4 available, when we could write lovely code like 
this:

py> import math as as
py> as
<module 'math' (built-in)>

I'm definitely not looking forward to anything like that again.



-- 
Steve


Re: [Python-Dev] PEP 492 quibble and request

2015-05-01 Thread Steven D'Aprano
On Wed, Apr 29, 2015 at 07:31:22PM -0700, Guido van Rossum wrote:

> Ah, but here's the other clever bit: it's only interpreted this way
> *inside* a function declared with 'async def'. Outside such functions,
> 'await' is not a keyword, so that grammar rule doesn't trigger. (Kind of
> similar to the way that the print_function __future__ disables the
> keyword-ness of 'print', except here it's toggled on or off depending on
> whether the nearest surrounding scope is 'async def' or not. The PEP could
> probably be clearer about this; it's all hidden in the Transition Plan
> section.)

You mean we could write code like this?

def await(x):
    ...


if condition:
    async def spam():
        await (eggs or cheese)
else:
    def spam():
        await(eggs or cheese)


I must admit that's kind of cool, but I'm sure I'd regret it.



-- 
Steve


Re: [Python-Dev] PEP 492: async/await in Python; version 4

2015-05-03 Thread Steven D'Aprano
On Fri, May 01, 2015 at 09:24:47PM +0100, Arnaud Delobelle wrote:

> I'm not convinced that allowing an object to be both a normal and an
> async iterator is a good thing.  It could be a recipe for confusion.

In what way?

I'm thinking that the only confusion would be if you wrote "async for" 
instead of "for", or vice versa, and instead of getting an exception you 
got the (a)synchronous behaviour you didn't want.

But I have no intuition for how likely it is that you could write an 
asynchronous for loop, leave out the async, and still have the code do 
something meaningful.

Other than that, I think it would be fine to have an object be both a 
synchronous and asynchronous iterator. You specify the behaviour you want 
by how you use it. We can already do that, e.g. unittest's assertRaises 
is both a test assertion and a context manager.
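For example, both spellings of assertRaises work today:

```python
import unittest

class TestAssertRaises(unittest.TestCase):
    def test_call_form(self):
        # Used as a plain assertion: pass the callable and its arguments.
        self.assertRaises(ZeroDivisionError, lambda: 1 / 0)

    def test_context_manager_form(self):
        # The very same method used as a context manager.
        with self.assertRaises(ZeroDivisionError):
            1 / 0

suite = unittest.TestLoader().loadTestsFromTestCase(TestAssertRaises)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```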

Objects can have multiple roles, and it's not usually abused, or 
confusing. I'm not sure that async iterables will be any different.



-- 
Steve


Re: [Python-Dev] PEP 557: Data Classes

2017-09-08 Thread Steven D'Aprano
On Fri, Sep 08, 2017 at 10:37:12AM -0700, Nick Coghlan wrote:

> >   def __eq__(self, other):
> >   if other.__class__ is self.__class__:
> >   return (self.name, self.unit_price, self.quantity_on_hand) ==
> > (other.name, other.unit_price, other.quantity_on_hand)
> >   return NotImplemented
> 
> My one technical question about the PEP relates to the use of an exact
> type check in the comparison methods, rather than "isinstance(other,
> self.__class__)".

I haven't read the whole PEP in close detail, but that method stood out 
for me too. Only, unlike Nick, I don't think I agree with the decision.

I'm also not convinced that we should be adding ordered comparisons 
(__lt__ __gt__ etc) by default, if these DataClasses are considered 
more like structs/records than tuples. The closest existing equivalent 
to a struct in the std lib (apart from namedtuple) is, I think, 
SimpleNamespace, and they are unorderable.



-- 
Steve


Re: [Python-Dev] PEP 559 - built-in noop()

2017-09-10 Thread Steven D'Aprano
On Mon, Sep 11, 2017 at 07:39:07AM +1000, Chris Angelico wrote:

[...]
> As a language change, definitely not. But I like this idea for
> PYTHONBREAKPOINT. You set it to the name of a function, or to "pass"
> if you want nothing to be done. It's a special case that can't
> possibly conflict with normal usage.

I disagree -- it's a confusion of concepts. "pass" is a do-nothing 
statement, not a value, so you can't set something to pass. Expect a lot 
of StackOverflow questions asking why this doesn't work:

sys.breakpoint = pass

In fact, in one sense pass is not even a statement. It has no runtime 
effect, it isn't compiled into any bytecode. It is a purely syntactic 
feature to satisfy the parser.

Of course env variables are actually strings, so we can choose "pass" to 
mean "no break point" if we wanted. But I think there are already two 
perfectly good candidates for that usage which don't mix the concepts of 
statements and values: the empty string, and None:

setenv PYTHONBREAKPOINT=""
setenv PYTHONBREAKPOINT=None
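A sketch of how a runtime could honour that convention (resolve_breakpoint_hook is a hypothetical name, and PEP 553's final behaviour was still being decided at the time -- this only illustrates the empty-string/None spelling suggested above):

```python
import importlib

def resolve_breakpoint_hook(env):
    """Hypothetical resolver: empty string or "None" disables the hook,
    anything else names a dotted callable (default: pdb.set_trace)."""
    value = env.get("PYTHONBREAKPOINT", "pdb.set_trace")
    if value in ("", "None"):
        return lambda *args, **kwargs: None   # do-nothing hook
    modname, _, funcname = value.rpartition(".")
    module = importlib.import_module(modname)
    return getattr(module, funcname)

disabled = resolve_breakpoint_hook({"PYTHONBREAKPOINT": ""})
named = resolve_breakpoint_hook({"PYTHONBREAKPOINT": "os.getcwd"})
```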


-- 
Steve


Re: [Python-Dev] PEP 544

2017-10-04 Thread Steven D'Aprano
On Wed, Oct 04, 2017 at 03:56:14PM -0700, VERY ANONYMOUS wrote:
> i want to learn

Start by learning to communicate in full sentences. You want to learn 
what? Core development? Python? How to program? English?

This is not a mailing list for Python beginners.  Try the "tutor" or 
"python-list" mailing lists.


-- 
Steve


Re: [Python-Dev] iso8601 parsing

2017-10-25 Thread Steven D'Aprano
On Wed, Oct 25, 2017 at 04:32:39PM -0400, Alexander Belopolsky wrote:
> On Wed, Oct 25, 2017 at 3:48 PM, Alex Walters  wrote:
> >  Why make parsing ISO time special?
> 
> It's not the ISO format per se that is special, but parsing of str(x).
> For all numeric types, int, float, complex and even
> fractions.Fraction, we have a roundtrip invariant T(str(x)) == x.
> Datetime types are a special kind of numbers, but they don't follow
> this established pattern.  This is annoying when you deal with time
> series where it is common to have text files with a mix of dates,
> timestamps and numbers.  You can write generic code to deal with ints
> and floats, but have to special-case anything time related.

Maybe I'm just being slow today, but I don't see how you can write 
"generic code" to convert text to int/float/complex/Fraction, but not 
times. The only difference is that instead of calling the type directly, 
you call the appropriate classmethod.

What am I missing?
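Concretely, a generic converter only needs a mapping from type to parsing callable (PARSERS and parse_field are invented names for illustration, and strptime stands in for whichever classmethod one prefers):

```python
from datetime import datetime
from fractions import Fraction

# One callable per column type: the type itself for numbers, the
# appropriate (class)method for datetimes.
PARSERS = {
    int: int,
    float: float,
    complex: complex,
    Fraction: Fraction,
    datetime: lambda s: datetime.strptime(s, "%Y-%m-%dT%H:%M:%S"),
}

def parse_field(text, typ):
    return PARSERS[typ](text)

row = [
    parse_field("42", int),
    parse_field("2.5", float),
    parse_field("2017-10-25T16:32:39", datetime),
]
```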



-- 
Steven


Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-28 Thread Steven D'Aprano
On Sun, Oct 29, 2017 at 12:31:01AM +0100, MRAB wrote:

> Not that I'm planning on making any further additions, just bug fixes 
> and updates to follow the Unicode updates. I think I've crammed enough 
> into it already. There's only so much you can do with the regex syntax 
> with its handful of metacharacters and possible escape sequences...

What do you think of the Perl 6 regex syntax?

https://en.wikipedia.org/wiki/Perl_6_rules#Changes_from_Perl_5



-- 
Steve


Re: [Python-Dev] PEP 563: Postponed Evaluation of Annotations

2017-11-02 Thread Steven D'Aprano
On Wed, Nov 01, 2017 at 03:48:00PM -0700, Lukasz Langa wrote:

> PEP: 563
> Title: Postponed Evaluation of Annotations

> This PEP proposes changing function annotations and variable annotations
> so that they are no longer evaluated at function definition time.
> Instead, they are preserved in ``__annotations__`` in string form.

This means that now *all* annotations, not just forward references, are 
no longer validated at runtime and will allow arbitrary typos and 
errors:

def spam(n:itn):  # now valid
...

Up to now, it has been only forward references that were vulnerable to 
that sort of thing. Of course running a type checker should pick those 
errors up, but the evaluation of annotations ensures that they are 
actually valid (not necessarily correct, but at least a valid name), 
even if you happen to not be running a type checker. That's useful.

Are we happy to live with that change?


> Rationale and Goals
> ===
> 
> PEP 3107 added support for arbitrary annotations on parts of a function
> definition.  Just like default values, annotations are evaluated at
> function definition time.  This creates a number of issues for the type
> hinting use case:
> 
> * forward references: when a type hint contains names that have not been
>   defined yet, that definition needs to be expressed as a string
>   literal;

After all the discussion, I still don't see why this is an issue. 
Strings makes perfectly fine forward references. What is the problem 
that needs solving? Is this about people not wanting to type the leading 
and trailing ' around forward references?


> * type hints are executed at module import time, which is not
>   computationally free.

True; but is that really a performance bottleneck? If it is, that should 
be stated in the PEP, and state what typical performance improvement 
this change should give.

After all, if we're going to break people's code in order to improve 
performance, we should at least be sure that it improves performance :-)


> Postponing the evaluation of annotations solves both problems.

Actually it doesn't. As your PEP says later:

> This PEP is meant to solve the problem of forward references in type
> annotations.  There are still cases outside of annotations where
> forward references will require usage of string literals.  Those are
> listed in a later section of this document.

So the primary problem this PEP is designed to solve, isn't actually 
solved by this PEP.

(See Guido's comments, quoted later.)



> Implementation
> ==
> 
> In Python 4.0, function and variable annotations will no longer be
> evaluated at definition time.  Instead, a string form will be preserved
> in the respective ``__annotations__`` dictionary.  Static type checkers
> will see no difference in behavior,

Static checkers don't see __annotations__ at all, since that's not 
available at edit/compile time. Static checkers see only the source 
code. The checker (and the human reader!) will no longer have the useful 
clue that something is a forward reference:

# before
class C:
    def method(self, other:'C'): 
        ...

since the quotes around C will be redundant and almost certainly left 
out. And if they aren't left out, then what are we to make of the 
annotation? Will the quotes be stripped out, or left in?

In other words, will method's __annotations__ contain 'C' or "'C'"? That 
will make a difference when the type hint is eval'ed.
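To make that difference concrete:

```python
class C:
    pass

# If the stored annotation is the bare string "C", eval resolves it to
# the class:
resolved = eval("C", globals())

# If the quotes are preserved verbatim, eval just produces another
# string, and a second eval would be needed to reach the class:
still_a_string = eval("'C'", globals())
```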


> If an annotation was already a string, this string is preserved
> verbatim.

That's ambiguous. See above.


> Annotations can only use names present in the module scope as postponed
> evaluation using local names is not reliable (with the sole exception of
> class-level names resolved by ``typing.get_type_hints()``).

Even if you call get_type_hints from inside the function defining the 
local names?

def function():
    A = something()
    def inner(x:A)->int:
        ...
    d = typing.get_type_hints(inner)
    return (d, inner)

I would expect that should work. Will it?


> For code which uses annotations for other purposes, a regular
> ``eval(ann, globals, locals)`` call is enough to resolve the
> annotation.

Let's just hope nobody doing that has allowed any tainted strings to 
be stuffed into __annotations__.


> * modules should use their own ``__dict__``.

Which is better written as ``vars()`` with no argument, I believe. Or 
possibly ``globals()``.


> If a function generates a class or a function with annotations that
> have to use local variables, it can populate the given generated
> object's ``__annotations__`` dictionary directly, without relying on
> the compiler.

I don't understand this paragraph.


> The biggest controversy on the issue was Guido van Rossum's concern
> that untokenizing annotation expressions back to their string form has
> no precedent in the Python programming language and feels like a hacky
> workaround.  He said:
> 
> One thing that comes to mind is that i

Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 01:07:51PM +1000, Nick Coghlan wrote:

> That means our choices for 3.7 boil down to:
> 
> * make this a language level guarantee that Python devs can reasonably rely on
> * deliberately perturb dict iteration in CPython the same way the
> default Go runtime does [1]

I agree with this choice.

My preference is for the first: having dicts be unordered has never been 
a positive virtue in itself, but always the cost we paid for fast O(1) 
access. Now that we have fast O(1) access *without* dicts being 
unordered, we should make it a language guarantee.

Provided of course that we can be reasonably certain that other 
implementations can do the same. And it looks like we can.

But if we wanted to still keep our options open, how about weakening the 
requirement that globals() and object __dicts__ be specifically the same 
type as builtin dict? That way if we discover a super-fast and compact 
dict implementation (maybe one that allows only string keys?) that is 
unordered, we can use it for object namespaces without affecting the 
builtin dict.


> When we did the "insertion ordered hash map" availability review, the
> main implementations we were checking on behalf of were Jython & VOC
> (JVM implementations), Batavia (JavaScript implementation), and
> MicroPython (C implementation). Adding IronPython (C# implementation)
> to the mix gives:

Shouldn't we check with Nuitka (C++) and Cython as well?

I'd be surprised if this is a problem for either of them, but we should 
ask.


> Since the round-trip behaviour that comes from guaranteed order
> preservation is genuinely useful, and we're comfortable with folks
> switching to more specialised containers when they need different
> performance characteristics from what the builtins provide, elevating
> insertion order preservation to a language level requirements makes
> sense.

+1


OrderedDict could then become a thin wrapper around regular dicts.

-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 12:27:54PM +0100, Antoine Pitrou wrote:

> The ordered-ness of dicts could instead become one of those stable
> CPython implementation details, such as the fact that resources are
> cleaned up timely by reference counting, that people nevertheless
> should not rely on if they're writing portable code.

Given that (according to others) none of IronPython, Jython, Batavia, 
Nuitka, or even MicroPython, should have trouble implementing an 
insertion-order preserving dict, and that PyPy already has, why should 
we say it is a CPython implementation detail?


-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 12:18:17PM +0200, Paul Sokolovsky wrote:

> > I don't think that situation should change the decision, 
> 
> Indeed, it shouldn't. What may change it is the simple and obvious fact
> that there's no need to change anything, as proven by the 25-year
> history of the language.

I disagree -- the history of Python shows that having dicts be unordered 
is a PITA for many Python programmers. Python eventually gained an 
ordered dict because it provides useful functionality that developers 
demand.

Every new generation of Python programmers comes along and gets confused 
by why dicts mysteriously change their order from how they were entered, 
why doctests involving dicts break, why keyword arguments lose their 
order, why they have to import a module to get ordered dicts instead of 
having it be built-in, etc. Historically, we had things like 
ConfigParser reordering ini files when you write them.

Having dicts be unordered is not a positive virtue, it is a limitation. 
Up until now, it was the price we've paid for having fast, O(1) dicts. 
Now we have a dict implementation which is fast, O(1) and ordered. Why 
pretend that we don't? This is a long-requested feature, and the cost 
appears to be small: by specifying this, all we do is rule out some, but 
not all, hypothetical future optimizations.

Unordered dicts served CPython well for 20+ years, but I doubt many 
people will miss them.


> What happens now borders on technologic surrealism - the CPython, after
> many years of persuasion, switched its dict algorithm, rather
> inefficient in terms of memory, to something else, less inefficient
> (still quite inefficient, taking "no overhead" as the baseline).

Trading off space for time is a very common practice. You said that 
lookups on MicroPython's dicts are O(N). How efficient is µPy when doing 
a lookup of a dict with ten million keys?

µPy has chosen to optimize for space, rather than time. That's great. 
But I don't think you should sneer at CPython's choice to optimize for 
time instead.

And given that µPy's dicts already fail to meet the expected O(1) dict 
behaviour, and the already large number of functional differences (not 
just performance differences) between µPy and Python:

http://docs.micropython.org/en/latest/pyboard/genrst/index.html

I don't think that this will make much practical difference. MicroPython 
users already cannot expect to run arbitrary Python code that works in 
other implementations: the Python community is fragmented between µPy 
code written for tiny machines, and Python code for machines with lots 
of memory.


> That
> algorithm randomly had another property. Now there's a seemingly
> serious talk of letting that property leak into the *language spec*,

It will no more be a "leak" than any other deliberate design choice.


> despite the fact that there can be unlimited number of dictionary
> algorithms, most of them not having that property. 

Sure. So what? There's an unlimited number of algorithms that don't 
provide the functionality that we want. There are an unlimited number of 
sort algorithms, but Python guarantees that we're only going to use 
those that are stable. The same applies to method resolution (which µPy 
already violates), strings, etc.


> What it will lead to is further fragmentation of the community.

Aren't you concerned about fragmenting the community because of the 
functional differences between MicroPython and the specs?

Sometimes a small amount of fragmentation is unavoidable, and not 
necessarily a bad thing.


> > P.S. If anyone does want to explore MicroPython's dict implementation,
> > and see if there might be an alternate implementation strategy that
> > offers both O(1) lookup and guaranteed ordering without using
> > additional memory
> 
> That would be the first programmer in the history to have a cake and
> eat it too. Memory efficiency, runtime efficiency, sorted order: choose
> 2 of 3.

Given that you state that µPy dicts are O(N) and unordered, does that 
mean you picked only 1 out of 3?


-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 11:33:10AM -0800, Barry Warsaw wrote:

> If we did make the change, it’s possible we would need a way to 
> explicit say that order is not preserved.  That seems a little weird 
> to me, but I suppose it could be useful. 

Useful for what?

Given that we will hypothetically have order-preserving dicts that 
perform no worse than unordered dicts, I'm struggling to think of a 
reason (apart from performance) why somebody would intentionally use a 
non-ordered dict. If performance was an issue, sure, it makes sense to 
have a non-ordered dict for when you don't want to pay the cost of 
keeping insertion order. But performance seems to be a non-issue.

I can see people wanting a SortedDict which automatically sorts the keys 
into some specified order. If I really work at it, I can imagine that 
there might even be a use-case for randomizing the key order (like 
calling random.shuffle on the keys). But if you are willing to use a 
dict with arbitrary order, that means that *you don't care* what order 
the keys are in. If you don't care, then insertion order should be no 
better or worse than any other implementation-defined arbitrary order.


> I like the idea previously 
> brought up that iteration order be deliberately randomized in that 
> case, but we’d still need a good way to spell that.

That would only be in the scenario that we decide *not* to guarantee 
insertion-order preserving semantics for dicts, in order to prevent 
users from relying on an implementation feature that isn't a language 
guarantee.


-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 08:05:07PM -0800, David Mertz wrote:
> I strongly opposed adding an ordered guarantee to regular dicts. If the
> implementation happens to keep that, great. 

That's the worst of both worlds. The status quo is that unless we 
deliberately perturb the dictionary order, developers will come to rely 
on implementation order (because that's what the CPython reference 
implementation actually offers, regardless of what the docs say). 
Consequently:

- people will be writing non-portable code, whether they know it or not;

- CPython won't be able to change the implementation, because it will 
break too much code;

- other implementations will be pressured to match CPython's 
implementation.

The only difference is that on the one hand we are honest and up-front 
about requiring order-preserving dicts, and on the other we still 
require it, but pretend that we don't.

And frankly, it seems rather perverse to intentionally perturb 
dictionary order just to keep our options open that someday there might 
be some algorithm which offers sufficiently better performance but 
doesn't preserve order. Preserving order is useful, desirable, often 
requested functionality, and now that we have it, it would have to be 
one hell of an optimization to justify dropping it again.

(It is like Timsort and stability. How much faster would sorting have 
to be to justify giving up sort stability? 50% faster? 100%? We 
wouldn't have done it for a 1% speedup.)

It would be better to relax the requirement that builtin dict is used 
for those things that would benefit from improved performance. Is there 
any need for globals() to be the same mapping type as dict? Probably 
not. If somebody comes up with a much more efficient, 
non-order-preserving map ideal for globals, it would be better to 
change globals than dict. In my opinion.


> Maybe OrderedDict can be
> rewritten to use the dict implementation. But the evidence that all
> implementations will always be fine with this restraint feels poor,

I think you have a different definition of "poor" to me :-)

Nick has already done a survey of PyPy (which already has 
insertion-order preserving dicts), Jython, VOC, and Batavia, and they 
don't have any problem with this. IronPython is built on C#, which has 
order-preserving mappings. Nuitka is built on C++, and if C++ can't implement 
an order-preserving mapping, there is something terribly wrong with the 
world. Cython (I believe) uses CPython's implementation, as does 
Stackless.

The only well-known implementation that may have trouble with this is 
MicroPython, but it already changes the functionality of a lot of 
builtins and core language features, e.g. it uses a different method 
resolution order (so multiple inheritance won't work right), some 
builtins don't support slicing with three arguments, etc.

I think the evidence is excellent that other implementations shouldn't 
have a problem with this, unless (like MicroPython) they are targetting 
machines with tiny memory resources. µPy runs on the PyBoard, which I 
believe has under 200K of memory. I think we can all forgive µPy if it 
only *approximately* matches Python semantics.


-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 06:35:48PM +0200, Paul Sokolovsky wrote:

> For MicroPython, it would lead to quite an overhead to make
> dictionary items be in insertion order. As I mentioned, MicroPython
> optimizes for very low bookkeeping memory overhead, so lookups are
> effectively O(n), but orderedness will increase constant factor
> significantly, perhaps 5x.

Paul, it would be good if you could respond to Raymond's earlier 
comments where he wrote:

I've just looked at the MicroPython dictionary implementation and 
think they won't have a problem implementing O(1) compact dicts with 
ordering.

The likely reason for the confusion is that they are already have an 
option for an "ordered array" dict variant that does a brute-force 
linear search.  However, their normal hashed lookup is very similar 
to ours and is easily amenable to being compact and ordered.

See:  
https://github.com/micropython/micropython/blob/77a48e8cd493c0b0e0ca2d2ad58a110a23c6a232/py/map.c#L139

Raymond has also volunteered to assist with this.


> Also, arguably any algorithm which would *maintain* insertion order
> over mutating operations would be more complex and/or require more
> memory that one which doesn't.

I think it would be reasonable to say that builtin dicts only maintain 
insertion order for insertions, lookups, and changing the value. Any 
mutation which deletes keys may arbitrarily re-order the dict.

If the user wants a stronger guarantee, then they should use 
OrderedDict.
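For comparison, OrderedDict already gives that stronger guarantee today:

```python
from collections import OrderedDict

# Under the weaker rule sketched above, a plain dict would be free to
# reorder its surviving keys after `del`.  OrderedDict is not: the
# remaining keys keep their relative insertion order.
od = OrderedDict([("A", 1), ("B", 2), ("C", 3), ("D", 4)])
del od["B"]
od["E"] = 5

keys = list(od)  # A, C, D keep their order; E is appended at the end
```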




-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Mon, Nov 06, 2017 at 10:17:23PM -0200, Joao S. O. Bueno wrote:

> And also, forgot along the discussion, is the big disadvantage that
> other Python implementations would have a quite
> significant overhead on mandatory ordered dicts.

I don't think that is correct. Nick already did a survey, and found that
C# (IronPython), Java (Jython and VOC) and Javascript (Batavia) all have
acceptable insertion-order preserving mappings. C++ (Nuitka) surely
won't have any problem with this (if C++ cannot implement an efficient
order-preserving map, there is something terribly wrong with the world).

As for other languages that somebody might choose to build Python on 
(the Parrot VM, Haskell, D, Rust, etc) surely we shouldn't be limiting 
what Python does for the sake of hypothetical implementations in 
"underpowered" languages?

I don't mean to imply that any of those examples are necessarily
underpowered, but if language Foo is incapable of supporting an
efficient ordered map, then language Foo is simply not good enough for a
serious Python implementation. We shouldn't allow Python's evolution to
be hamstrung by the requirement to support arbitrarily weak
implementation languages.


> One that was mentioned along the way is transpilers, with
> Brython as an example - but there might be others.

Since Brython transpiles to Javascript, couldn't it use the standard
Map object, which preserves insertion order?

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Map

Quote:

Description
A Map object iterates its elements in insertion order

The ECMAScript 6 standard specifies that Map.prototype.forEach operates
over the key/value pairs in insertion order:

https://tc39.github.io/ecma262/#sec-map-objects



-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-06 Thread Steven D'Aprano
On Tue, Nov 07, 2017 at 05:28:24PM +1000, Nick Coghlan wrote:
> On 7 November 2017 at 16:21, Steven D'Aprano  wrote:
> > On Mon, Nov 06, 2017 at 08:05:07PM -0800, David Mertz wrote:
> >> Maybe OrderedDict can be
> >> rewritten to use the dict implementation. But the evidence that all
> >> implementations will always be fine with this restraint feels poor,
> >
> > I think you have a different definition of "poor" to me :-)
> 
> While I think "poor" is understating the case, I think "excellent"
> (which you use later on) is overstating it. My own characterisation
> would be "at least arguably good enough".

Fair enough, and thanks for elaborating.



-- 
Steve


Re: [Python-Dev] The current dict is not an "OrderedDict"

2017-11-07 Thread Steven D'Aprano
On Tue, Nov 07, 2017 at 03:32:29PM +0100, Antoine Pitrou wrote:

[...]
> > "Insertion ordered until the first key removal" is the only guarantee
> > that's being proposed.
> 
> Is it?  It seems to me that many arguments being made are only relevant
> under the hypothesis that insertion is ordered even after the first key
> removal.  For example the user-friendliness argument, for I don't
> think it's very user-friendly to have a guarantee that disappears
> forever on the first __del__.

Don't let the perfect be the enemy of the good.

For many applications, keys are never removed from the dict, so this 
doesn't matter. If you never delete a key, then the remaining keys will 
never be reordered.

I think that Nick's intent was not to say that after a single deletion, 
the ordering guarantee goes away "forever", but that a deletion may be 
permitted to reorder the keys, after which further additions will honour 
insertion order. At least, that's how I interpret him.

To clarify: if we start with an empty dict, add keys A...D, delete B, 
then add E...H, we could expect:

{A: 1}
{A: 1, B: 2}
{A: 1, B: 2, C: 3}
{A: 1, B: 2, C: 3, D: 4}
{D: 4, A: 1, C: 3}  # some arbitrary reordering
{D: 4, A: 1, C: 3, E: 5}
{D: 4, A: 1, C: 3, E: 5, F: 6}
{D: 4, A: 1, C: 3, E: 5, F: 6, G: 7}
{D: 4, A: 1, C: 3, E: 5, F: 6, G: 7, H: 8}


Nick, am I correct that this was your intent?



-- 
Steve


Re: [Python-Dev] The current dict is not an "OrderedDict"

2017-11-07 Thread Steven D'Aprano
On Tue, Nov 07, 2017 at 05:37:15PM +0200, Serhiy Storchaka wrote:
> 07.11.17 16:56, Steven D'Aprano wrote:
> >To clarify: if we start with an empty dict, add keys A...D, delete B,
> >then add E...H, we could expect:
[...]

> Rather
> 
> {A: 1, D: 4, C: 3}  # move the last item in place of removed
> {A: 1, D: 4, C: 3, E: 5}

Thanks for the correction.


-- 
Steve


Re: [Python-Dev] Comment on PEP 562 (Module __getattr__ and __dir__)

2017-11-19 Thread Steven D'Aprano
On Sun, Nov 19, 2017 at 08:24:00PM +, Mark Shannon wrote:
> Hi,
> 
> Just one comment. Could the new behaviour of attribute lookup on a 
> module be spelled out more explicitly please?
> 
> 
> I'm guessing it is now something like:
> 
> `module.__getattribute__` is now equivalent to:
> 
> def __getattribute__(mod, name):
> try:
> return object.__getattribute__(mod, name)
> except AttributeError:
> try:
> getter = mod.__dict__["__getattr__"]
> except KeyError:
> raise AttributeError(
> f"module {mod.__name__!r} has no attribute {name!r}")
> return getter(name)

A minor point: this should(?) be written in terms of the public 
interface for accessing namespaces, namely:

  getter = vars(mod)["__getattr__"]
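
(For readers unfamiliar with PEP 562, here is a minimal sketch of the 
mechanism on Python 3.7+; the module name "demo" and the attribute "answer" 
are invented purely for illustration:)

```python
import types

# Build a throwaway module object. On Python 3.7+ (PEP 562), attribute
# lookup on a module falls back to a module-level __getattr__ found in
# the module's namespace dict.
mod = types.ModuleType("demo")

def module_getattr(name):
    if name == "answer":
        return 42
    raise AttributeError(f"module 'demo' has no attribute {name!r}")

mod.__getattr__ = module_getattr  # ends up in vars(mod)["__getattr__"]

print(mod.answer)   # found via the __getattr__ fallback
```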



-- 
Steve


Re: [Python-Dev] Comment on PEP 562 (Module __getattr__ and __dir__)

2017-11-19 Thread Steven D'Aprano
On Sun, Nov 19, 2017 at 05:34:35PM -0800, Guido van Rossum wrote:
> On Sun, Nov 19, 2017 at 4:57 PM, Steven D'Aprano 
> wrote:

> > A minor point: this should(?) be written in terms of the public
> > interface for accessing namespaces, namely:
> >
> >   getter = vars(mod)["__getattr__"]
> 
> Should it? The PEP is not proposing anything for other namespaces. What
> difference do you envision this way of specifying it would make?

I don't know if it should -- that's why I included the question mark.

But my idea is that __dict__ is the implementation and vars() is the 
interface to __dict__, and we should prefer using the interface rather 
than the implementation unless there's a good reason not to.

(I'm not talking here about changing the actual name lookup code to go 
through vars(). I'm just talking about how we write the equivalent 
recipe.)

It's not a big deal either way: __dict__ is already heavily used and 
vars() poorly known. Call it a matter of taste, if you like, but in my 
opinion the fewer times we directly reference dunders, the better.
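
(The relationship is easy to demonstrate: for a module, vars() simply 
returns the very same namespace mapping as the dunder attribute:)

```python
import math

# vars() is the documented public interface to an object's __dict__;
# for a module, the two are the same mapping object.
assert vars(math) is math.__dict__
print(vars(math)["pi"] == math.pi)   # True
```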


-- 
Steve


Re: [Python-Dev] What's the status of PEP 505: None-aware operators?

2017-12-01 Thread Steven D'Aprano
On Thu, Nov 30, 2017 at 11:54:39PM -0500, Random832 wrote:

> The OP isn't confusing anything; it's Eric who is confused. The quoted
> paragraph of the PEP clearly and unambiguously claims that the sequence
> is "arguments -> function -> call", meaning that something that happens
> after the "function" stage [i.e. a None check] cannot short-circuit the
> "arguments" stage. But in fact the sequence is "function -> arguments ->
> call".

I'm more confused than ever. You seem to be arguing that Python 
functions CAN short-circuit their arguments and avoid evaluating them. 
Is that the case?

If not, then I fail to see the difference between 

"arguments -> function -> call"

"function -> arguments -> call"

In *both cases* the arguments are fully evaluated before the function is 
called, and so there is nothing the function can do to delay evaluating 
its arguments.
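
(The order itself is easy to observe directly; the helper names below are 
invented for the demonstration:)

```python
# Observe CPython's evaluation order for a call expression
# f_expr(arg_expr): the callee expression is evaluated first,
# then the argument expressions, then the call itself is made.
order = []

def get_func():
    order.append("function")
    def f(x):
        order.append("call")
        return x
    return f

def get_arg():
    order.append("arguments")
    return 42

result = get_func()(get_arg())
print(order)    # ['function', 'arguments', 'call']
```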

If this is merely about when the name "function" is looked up, then I 
don't see why that's relevant to the PEP.

What am I missing?


-- 
Steve


Re: [Python-Dev] What's the status of PEP 505: None-aware operators?

2017-12-01 Thread Steven D'Aprano
On Fri, Dec 01, 2017 at 08:24:05AM -0500, Random832 wrote:
> On Fri, Dec 1, 2017, at 05:31, Steven D'Aprano wrote:
> > I'm more confused than ever. You seem to be arguing that Python 
> > functions CAN short-circuit their arguments and avoid evaluating them. 
> > Is that the case?
> 
> > If this is merely about when the name "function" is looked up, then I 
> > don't see why that's relevant to the PEP.
> > 
> > What am I missing?
> 
> You're completely missing the context of the discussion,

Yes I am. That's why I asked.

> which was the
> supposed reason that a *new* function call operator, with the proposed
> syntax function?(args), that would short-circuit (based on the
> 'function' being None) could not be implemented.

Given that neither your post (which I replied to) nor the post you were 
replying to mentioned anything about function?() syntax, perhaps I might 
be forgiven for having no idea what you were talking about?

The PEP only mentions function?() as a rejected idea, so I don't know 
why we're even talking about it. The PEP is deferred, with considerable 
opposition and lukewarm support; even the PEP author has said he's not 
going to push for it, and we're arguing about a pedantic point related 
to a part of the PEP which was rejected...

:-)



-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-18 Thread Steven D'Aprano
On Mon, Dec 18, 2017 at 06:11:05PM -0800, Chris Barker wrote:

> Now that dicts are order-preserving, maybe we should change prettyprint:
> 
> In [7]: d = {'one':1, 'two':2, 'three':3}
> 
> In [8]: print(d)
> {'one': 1, 'two': 2, 'three': 3}
> 
> order preserved.
> 
> In [9]: pprint.pprint(d)
> {'one': 1, 'three': 3, 'two': 2}
> 
> order not preserved ( sorted, I presume? )

Indeed. pprint.PrettyPrinter has separate methods for OrderedDict and 
regular dicts, and the method for printing dicts calls sorted() while 
the other does not.
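
(A quick demonstration; pprint.pformat makes the sorting visible without 
needing a terminal:)

```python
import pprint

d = {'one': 1, 'two': 2, 'three': 3}

# repr shows insertion order; pprint sorts the keys before display.
print(d)                  # {'one': 1, 'two': 2, 'three': 3}
print(pprint.pformat(d))  # {'one': 1, 'three': 3, 'two': 2}
```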


> With arbitrary order, it made sense to sort, so as to always give the same
> "pretty" representation. But now that order is "part of" the dict itself,
> it seems prettyprint should present the preserved order of the dict.

I disagree. Many uses of dicts are still conceptually unordered, even if 
the dict now preserves insertion order. For those use-cases, insertion 
order is of no interest whatsoever, and sorting is still "prettier".



-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-18 Thread Steven D'Aprano
On Mon, Dec 18, 2017 at 07:37:03PM -0800, Nathaniel Smith wrote:
> On Mon, Dec 18, 2017 at 7:02 PM, Barry Warsaw  wrote:
> > On Dec 18, 2017, at 21:11, Chris Barker  wrote:
> >
> >> Will changing pprint be considered a breaking change?
> >
> > Yes, definitely.
> 
> Wait, what? Why would changing pprint (so that it accurately reflects
> dict's new underlying semantics!) be a breaking change?

I have a script which today prints data like so:

{'Aaron': 62,
 'Anne': 51,
 'Bob': 23,
 'George': 30,
 'Karen': 45,
 'Sue': 17,
 'Sylvester': 34}

Tomorrow, it will suddenly start printing:

{'Bob': 23,
 'Karen': 45,
 'Sue': 17,
 'George': 30,
 'Aaron': 62,
 'Anne': 51,
 'Sylvester': 34}


and my users will yell at me that my script is broken because the data 
is now in random order. Now, maybe that's my own damn fault for using 
pprint instead of writing my own pretty printer... but surely the point 
of pprint is so I don't have to write my own?

Besides, the docs say very prominently:

"Dictionaries are sorted by key before the display is computed."

https://docs.python.org/3/library/pprint.html

so I think I can be excused having relied on that feature.



-- 
Steve


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-18 Thread Steven D'Aprano
On Mon, Dec 18, 2017 at 08:28:52PM -0800, Chris Barker wrote:
> On Mon, Dec 18, 2017 at 7:41 PM, Steven D'Aprano 
> wrote:
> 
> > > With arbitrary order, it made sense to sort, so as to always give the
> > same
> > > "pretty" representation. But now that order is "part of" the dict itself,
> > > it seems prettyprint should present the preserved order of the dict.
> >
> > I disagree. Many uses of dicts are still conceptually unordered, even if
> > the dict now preserves insertion order. For those use-cases, insertion
> > order is of no interest whatsoever, and sorting is still "prettier".
> >
> 
> and many uses of dicts have "sorted" order as completely irrelevant, and
> sorting them arbitrarily is not necessarily pretty (you can't provide a
> sort key can you? -- so yes, it's arbitrary)

I completely agree. We might argue that it was a mistake to sort dicts 
in the first place, or at least a mistake to *always* sort them without 
allowing the caller to provide a sort key. But what's done is done: the 
fact that dicts are sorted by pprint is not merely an implementation 
detail, but a documented behaviour.


> I'm not necessarily saying we should break things, but I won't agree that
> pprint sorting dicts is the "right" interface for what is actually an
> order-preserving mapping.

If sorting dicts was the "right" behaviour in Python 3.4, it remains the 
right behaviour -- at least for use-cases that don't care about 
insertion order. Anyone using pprint on dicts *now* doesn't care about 
insertion order. If they did, they would be using OrderedDict.

That will change in the future, but even in the future there are lots of 
use-cases for dicts where insertion order might as well be random. The 
order in which some dict happens to be constructed may not be "pretty" or 
significant or even consistent from one dict to the next.

(If your key/values pairs are coming in from an external source, they 
might not always come in the same order.)

I'm not denying that sometimes it would be nice to see dicts in 
insertion order. Right now, those use-cases are handled by OrderedDict 
but in the future many of them will be taken over by regular dicts. So 
we have a conflict:

- for some use-cases, insertion order is the "right" way for pprint
  to display the dict;

- but for others, sorting by keys is the "pretty" way for pprint to
  display the dict;

- and there's no way for pprint to know which is which just by 
  inspecting the dict.

How to break this tie? Backwards compatibility trumps all. If we want 
to change the default behaviour of pprint, we need to go through a 
deprecation period.

Or add a flag sorted=True, and let the caller decide.
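
(For what it's worth, that is the route eventually taken: Python 3.8 added 
just such a keyword, sort_dicts, defaulting to True for backwards 
compatibility:)

```python
import pprint

d = {'one': 1, 'two': 2, 'three': 3}

# Default behaviour is unchanged (sorted by key); the caller can opt out.
print(pprint.pformat(d))                    # {'one': 1, 'three': 3, 'two': 2}
print(pprint.pformat(d, sort_dicts=False))  # insertion order, Python 3.8+
```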


> I would think it was only the right choice in the first place in order (get
> it?) to get a consistent representation, not because sorting was a good
> thing per se.

*shrug* That's arguable. As you said yourself, dicts were sorted by key 
to give a "pretty" representation. I'm not so sure that consistency is 
the justification. What does that even mean? If you print the same dict 
twice, with no modifications, it will print the same whether you sort 
first or not. If you print two different dicts, who is to say that they 
were constructed in the same order?

But the point is moot: whatever the justification, the fact that pprint 
sorts dicts by key is the defined behaviour, and even if it was a 
mistake to guarantee it, we can't just change it without a deprecation 
period.


-- 
Steve


[Python-Dev] Revisiting old enhancement requests

2017-12-19 Thread Steven D'Aprano
What is the best practice for revisiting old enhancement requests on the 
tracker, if I believe that the time is right to revisit a rejected issue 
from many years ago? (Nearly a decade.)

Should I raise a new enhancement request and link back to the old one, 
or re-open the original?


Thanks,



-- 
Steve

