Re: [Python-Dev] Adding numbering to PEP 20, the Zen of Python

2014-09-18 Thread Tim Peters
[Ben Hoyt]
> I was emailing someone today about implementing something (for PEP
> 471, as it happens) and wanted to link to the Zen of Python [1] and
> note a particular clause (in this case "If the implementation is hard
> to explain, it's a bad idea."). However, there are no clause numbers,
> so you can't refer to specific phrases.
>
> I know it's a short enough document that it probably doesn't matter.
> And maybe numbering them would make it less Zen. Would be handy in
> code reviews and the like, for example: "Not very Pythonic. See PEP 20
> point 5." Is it just my pedantic self, or have others wanted to do
> this too?
>
> [1] http://legacy.python.org/dev/peps/pep-0020/

See Revelation 22:18.  Adding numbers is how heresy begins ;-)

http://biblehub.com/revelation/22-18.htm
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.

2015-03-09 Thread Tim Peters
[nha pham]
> Statement_1: With an array of size N or less than N, we need at most log2(N)
> comparisons to find a value
> (or a position, incase the search miss), using the binary search algorithm.
>
> proof: This statement is trivia, and I believe, someone outthere already
> proved it.

Sorry for the quick message here.  It's just a simple point where it
will pay not to get off on a wrong foot ;-)

Correct:  for an array of size N, binary search can require as many as
ceiling(log2(N+1)) comparisons.

That's because there are N+1 possible results for an array of size N.
For example, for an array of size 3, [A, B, C], "the answer" may be
"before A", "between A and B", "between B and C", or "after C".  3
elements, 3+1 = 4 possible results.  log2(3) comparisons are not
enough to distinguish among 4 results.

Make it trivial: an array of length 1.  Then 1 comparison is obviously
necessary and sufficient in all cases.  And, indeed,
ceiling(log2(1+1)) = 1.  log2(1) equals 0, too small.
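A quick brute-force check of that bound, counting comparisons in a plain binary search over the N+1 possible results (a sketch, not the code under discussion; `compares_to_find_slot` is a made-up helper name):

```python
import math

def compares_to_find_slot(n, slot):
    # Count comparisons a textbook binary search makes to pin down
    # one of the n+1 possible insertion slots in a sorted array of size n.
    lo, hi, count = 0, n, 0
    while lo < hi:
        mid = (lo + hi) // 2
        count += 1            # one comparison against a[mid]
        if slot <= mid:
            hi = mid
        else:
            lo = mid + 1
    return count

for n in (1, 2, 3, 7, 8, 100):
    worst = max(compares_to_find_slot(n, s) for s in range(n + 1))
    assert worst == math.ceil(math.log2(n + 1))
```

The asserts pass for every size tried: the worst case is exactly ceiling(log2(N+1)), never log2(N).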

For the rest, I haven't been able to understand your description or
your pseudo-code.  I'll try harder.  Some things clearly aren't doing
what you _intend_ them to do.  For example, in your Python code, each
time through the outer loop you're apparently trying to sort the next
CHUNK elements, but you end up appending CHUNK+1 values to data2 (or
data3).

Or in this part:

for i in range(low,high):
x = data[i]
if x >= data[i-1]:

the first time that loop is executed low == 0, and so i == 0 on the
first iteration, and so the conditional is

   if x >= data[0-1]

That's referencing data[-1], which is the very last element in data -
which has nothing to do with the CHUNK you're trying to sort at the
time.

So there are a number of errors here, which makes it that much harder
to sort out (pun intended) what you're trying to do.  It would
help you to add some asserts to verify your code is doing what you
_hope_ it's doing.  For example, add

assert data2[low: high] == sorted(data[low: high])
assert len(data2) == high

to the end of your `sample` loop, and similarly for data3 in your
`new` loop.  Until those asserts all pass, you're not testing code
that's actually sorting correctly.  Repair the errors and you almost
certainly won't find `new` running over 10 times faster than `sample`
anymore.  I don't know what you _will_ discover, though.  If the code
doesn't have to sort correctly, there are much simpler ways to make it
run _very_ much faster ;-)
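A toy version of that checking pattern - `CHUNK`, `data2`, and the loop shape here are hypothetical stand-ins for the code under discussion, with `sorted()` in place of the insertion logic being tested - just to show where the asserts sit:

```python
CHUNK = 4
data = [5, 2, 9, 1, 7, 3, 8, 6, 4]
data2 = []
for low in range(0, len(data), CHUNK):
    high = min(low + CHUNK, len(data))
    # The real code would binary-insert data[low:high] one element at
    # a time; sorted() stands in for that logic in this sketch.
    data2.extend(sorted(data[low:high]))
    # These asserts catch a loop that isn't really sorting each chunk,
    # or that appends the wrong number of elements:
    assert data2[low:high] == sorted(data[low:high])
    assert len(data2) == high
```

If the insertion logic were buggy - sorting the wrong slice, or appending CHUNK+1 values - the asserts would fail on the first bad chunk.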


Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.

2015-03-09 Thread Tim Peters
[nha pham]
> Thank you very much. I am very happy that I got a reply from Tim Peters.

My pleasure to speak with you too :-)


> You are correct, my mistake.
>
> The python code should be:
> for i in range(low+1,high):  //because we already add
> data[low]
> x = data[i]
> if x >= data[i-1]:
>
> After I fix it, here is the result:
>
> random array 10^6:
> Old binsort:  1.3322
> New binsort: 1.0015
> ratio: 0.33
>
> You are right, it is not ten times faster anymore. I will update other
> results soon.
>
> I do check the result of two sorting methods many times to make sure they
> are the same. It is just because I do not know how to put assert into the
> timeit.Timer class.

`assert` is just another Python statement.  You simply add it to the
code - there's nothing tricky about this.  You could, e.g., simply
copy and paste the `assert`s I suggested last time.

Before you do, try adding `print index` to your inner loops, and
make SIZE much smaller (say, 1000) so you're not overwhelmed with
output.  You'll be surprised by what you see on the second (and
following) CHUNKs.  For example, in both `sample` and `new` it will
print 900 ninety-nine times in a row when doing the last CHUNK.  The
code still isn't doing what you intend.  Until it does, timing it
makes little sense :-)

> I am pretty sure about this.

Note that I'm talking about the Python code here, the code you run
through timeit.  You cannot have checked the results of running _that_
specific code, because it doesn't work at all.  You may have checked
_other_ code many times.  We may get to that later, but since I speak
Python, I'm not going to understand what you're doing until we have
Python code that works ;-)


Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.

2015-03-10 Thread Tim Peters
OK - here's what the current binsort does, ignoring that it skips an
already-sorted prefix (if any), and creating a new list instead of
modifying in-place:

def oldsort(a):
from bisect import bisect_right as br
assert a
result = [a[0]]
for i in range(1, len(a)):
x = a[i]
index = br(result, x)
result.insert(index, x)
return result

And here's my best guess as to what you _intend_ the new version to
do.  Please confirm that, or, if I'm guessing wrong, please give a
Python function that _does_ implement your intent:

def newsort(a):
from bisect import bisect_right as br
assert a
oldx = a[0]
result = [oldx]
index = 0
for i in range(1, len(a)):
x = a[i]
if x < oldx:
index = br(result, x, 0, index)
else:
index = br(result, x, index + 1)
result.insert(index, x)
oldx = x
return result

Now assuming that's right, I don't care about timing it ;-)  The only
basic question to me is whether it in fact reduces the number of
comparisons.  So here's an integer wrapper that bumps a global counter
whenever it's asked to compare:

class IntWrap(object):
def __init__(self, i):
self.i = i

def __cmp__(a, b):
global gncmp
gncmp += 1
return cmp(a.i, b.i)

def __repr__(self):
return repr(self.i)

Now we can build lists containing that, and get exact comparison
counts.  To start, for a given length `n`, this counts the total
number of comparisons needed to sort all possible permutations of a
list of length n, under both the old and new ways:

def drive(n):
import itertools
global gncmp
base = [IntWrap(i) for i in range(n)]
oldcount = newcount = 0
numperms = 0
for p in itertools.permutations(base):
numperms += 1

gncmp = 0
oldresult = oldsort(p)
oldcount += gncmp

gncmp = 0
newresult = newsort(p)
newcount += gncmp

assert oldresult == newresult == base
print 'n', n, 'perms', numperms
print 'old compares', oldcount
print 'new compares', newcount
print 'diff %', (newcount - oldcount) * 100.0 / max(oldcount, 1)

And, finally, a tiny loop to drive it:

for n in range(1, 13):
print
drive(n)

It's still running as I type this, but the results aren't promising so
far - as soon as the list length gets non-trivial, the new way
requires more comparisons than the old way:

n 1 perms 1
old compares 0
new compares 0
diff % 0.0

n 2 perms 2
old compares 2
new compares 2
diff % 0.0

n 3 perms 6
old compares 16
new compares 16
diff % 0.0

n 4 perms 24
old compares 112
new compares 116
diff % 3.57142857143

n 5 perms 120
old compares 848
new compares 880
diff % 3.77358490566

n 6 perms 720
old compares 7008
new compares 7296
diff % 4.1095890411

n 7 perms 5040
old compares 63456
new compares 66432
diff % 4.68986384266

n 8 perms 40320
old compares 628608
new compares 662496
diff % 5.39095907147

n 9 perms 362880
old compares 6826752
new compares 7202304
diff % 5.50118123523

n 10 perms 3628800
old compares 80605440
new compares 85006080
diff % 5.45948263542

I believe it would be very difficult to analyze this rigorously - and
even if I saw an analysis it would be hard to trust it.  Raw counts
from simple code are hard to argue with ;-)

FYI, here are two ideas I had way back when, but didn't pursue:

1. Merge "2 at a time" instead of just 1.  That is, first "sort" the
next 2 elements to be merged (1 compare and a possible swap).  Then
binary search to find where the smaller belongs, and a shorter binary
search to find where the larger belongs.  Then shift both into place.
This can win on two counts:

A. Less data movement, since the already-sorted values after the
larger element get moved only once instead of twice.

B. A possible cut in number of compares.  Merging a sorted list of N
elements with a sorted list of 2 elements has been studied a lot
(e.g., search for "optimal merging of 2 elements" and find the paper
by Hwang and Lin).  The minimum average theoretical number of compares
needed is
ceiling(log2((N+2)*(N+1)/2)).

2. Instead of binary insertion sort, do an ordinary (but optimized)
bottom-up merge sort.  That may not cut the number of compares, but
would slash worst-case data movement cost from O(n**2) to O(n*log(n)).

As to why your code is sometimes faster, for the Python code in your
timing harness, well, that didn't actually sort anything, so wasn't
measuring anything interesting (or even explainable ;-) ).  For the
Java code, I have no guess - I don't know enough about Java internals.
Maybe "lucky" data, maybe cache effects, maybe a mistake - don't know,
and can't guess.  Or maybe my guess (above) at the intent of 

Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.

2015-03-11 Thread Tim Peters
[Tim]
>> 1. Merge "2 at a time" instead of just 1.  That is, first "sort" the
>> next 2 elements to be merged (1 compare and a possible swap).  Then
>> binary search to find where the smaller belongs, and a shorter binary
>> search to find where the larger belongs.  Then shift both into place.

[Armin]
> Good idea, but when I tried that it seemed to increase the total
> number of comparisons (on random inputs with only up to 136 items).
> The increase is on the order of 5%.  I'm not sure reduced data
> movement can compensate for that in Python.

Which is another way of saying "bad idea" - that must be why I didn't
pursue it to begin with ;-)

Thanks for trying!  I plugged a similar routine into the code I showed
before to count the # of comparisons in Nha Pham's idea, and this
"merge 2 at a time" thing has a higher average # of compares (over all
permutations) than Nha's (which in turn has a higher average than the
status quo).

That makes some sense, thinking about what they do.  Nha's algorithm
has some supernaturally good cases (input array already ascending or
already descending), but "merge 2 at a time" doesn't appear to have
any.

In any case, the information-theoretic minimum average number of
comparisons for merging N sorted elements with 2 sorted elements is
("where do the 2 belong in the final list of N+2 elements?" =
comb(N+2, 2)):

log2((N+2)*(N+1)/2) = log2(N+2) + log2(N+1) - 1

Add a comparison to get the 2 elements in order to begin with, and we're up to

log2(N+2) + log2(N+1)

Two independent binary inserts (first to a list of size N, and then to
a list of size N+1) comes out to the same.  So even being
supernaturally clever can't reduce the average number of compares this
way.  And since, in context, we're only looking at short arrays, a
marginal saving in data movement costs (which are still O(N**2) worst
case) are unlikely to be significant.

Still, if anyone wants to go nuts ... ;-)


Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector

2017-09-08 Thread Tim Peters
[Neil Schemenauer]
> Python objects that participate in cyclic GC (things like lists, dicts,
> sets but not strings, ints and floats) have extra memory overhead.  I
> think it is possible to mostly eliminate this overhead.  Also, while
> the GC is running, this GC state is mutated, which destroys
> copy-on-write optimizations.  This change would mostly fix that
> issue.
>
> All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC
> bit set in their type.  That causes an extra chunk of memory to be
> allocated *before* the ob_refcnt struct member.  This is the PyGC_Head
> struct.
>
> The whole object looks like this in memory (PyObject pointer is at
> arrow):
>
> union __gc_head *gc_next;
> union __gc_head *gc_prev;
> Py_ssize_t gc_refs;
> -->
> Py_ssize_t ob_refcnt
> struct _typeobject *ob_type;
> [rest of PyObject members]
>
>
> So, 24 bytes of overhead on a 64-bit machine.  The smallest Python
> object that can have a pointer to another object (e.g. a single PyObject
> * member) is 48 bytes.  Removing PyGC_Head would cut the size of these
> objects in half.
>
> Carl Shapiro questioned me today on why we use a doubly linked list and
> not the memory bitmap.  I think the answer is that there is no good
> reason. We use a double linked list only due to historical constraints
> that are no longer present.

Since you wrote this code to begin with, it will come back to you ;-)
that the real purpose of the doubly-linked lists is to _partition_
(not just find) the tracked objects.  Between collections, they're
partitioned by generation, and within a collection equivalence classes
are first merged (up through the oldest generation to be scanned in
this run), and then temporarily partitioned internally in various ways
(based on things like whether objects turn out to be reachable from
outside, and whether they have finalizers).  The linked list
representation makes all the required operations cheap:  iteration,
merging classes, moving an object from one class to another, removing
an object entirely _while_ iterating over its equivalence class.
Don't know whether all that can be done efficiently with a bitmap
representation instead.

> Long ago, Python objects could be allocated using the system malloc or
> other memory allocators.  Since we could not control the memory
> location, bitmaps would be inefficient.  Today, we allocate all Python
> objects via our own function.  Python objects under a certain size are
> allocated using our own malloc, obmalloc, and are stored in memory
> blocks known as "arenas".
>
> The PyGC_Head struct performs three functions.  First, it allows the GC
> to find all Python objects that will be checked for cycles (i.e. follow
> the linked list).

As above, the set of tracked objects is partitioned into more than one
linked list.

> Second, it stores a single bit of information to let
> the GC know if it is safe to traverse the object, set with
> PyObject_GC_Track().

?  An object is "tracked" if and only if it appears in _some_
doubly-linked list.  There is no bit set (or cleared) for this.
Untracking an object removes it entirely from whichever linked list
it's in (leaving it in no linked lists), and tracking an object
consists of adding it to the "generation 0" linked list.  Unless the
code has changed a whole lot recently.

For clarity, the top N-1 bits of gc_refs (which you cover next) are
also set to a special _PyGC_REFS_UNTRACKED constant when an object is
untracked:

"""
/* True if the object is currently tracked by the GC. */
#define _PyObject_GC_IS_TRACKED(o) \
(_PyGC_REFS(o) != _PyGC_REFS_UNTRACKED)
"""

But I believe it could just as well check to see whether the object's
gc_next is NULL.


>  Finally, it has a scratch area to compute the
> effective reference count while tracing refs (gc_refs).

As above, the top N-1 bits of that are also used between collections
to record whether an object is tracked.

The least significant bit of gc_refs now (not back when you or I were
mucking with this code) records whether the object has a finalizer
that has already been run, and that state needs to be preserved across
gc runs.  So that's another bit that would need to be stored somewhere
else.


> Here is a sketch of how we can remove the PyGC_Head struct for small
> objects (say less than 512 bytes).  Large objects or objects created by
> a different memory allocator will still have the PyGC_Head overhead.
>
> * Have memory arenas that contain only objects with the
>   Py_TPFLAGS_HAVE_GC flag.  Objects like ints, strings, etc will be
>   in different arenas, not have bitmaps, not be looked at by the
>   cyclic GC.
>
> * For those arenas, add a memory bitmap.  The bitmap is a bit array that
>   has a bit for each fixed size object in the arena.  The memory used by
>   the bitmap is a fraction of what is needed by PyGC_Head.  E.g. an
>   arena that holds up to 1024 objects of 48 bytes in size would have a
>   bitmap of 1024 bits.

If it's based on

Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector

2017-09-08 Thread Tim Peters
[Tim]
>> In that case, it's because Python
>> _does_ mutate the objects' refcount members under the covers, and so
>> the OS ends up making fresh copies of the memory anyway.

[Greg Ewing ]
> Has anyone ever considered addressing that by moving the
> refcounts out of the objects and keeping them somewhere
> else?

Not that I know of.  I know Larry Hastings was considering doing it as
part of his experiments with removing the GIL, but that had nothing to
do with reducing cross-process copy-on-write surprises (it had to do
with "batching" refcount operations to eliminate a need for
fine-grained locking).

As-is, I'd say it's "a feature" that the refcount is part of the
object header.  Ref count manipulations are very frequent, and as part
of the object header a refcount tends to show up in cache lines "for
free" as a side effect of accessing the object's type pointer.


Re: [Python-Dev] Compiling Python-3.6.3 fails two tests test_math and test_cmath

2017-10-16 Thread Tim Peters
[Richard Hinerfeld]
> Compiling Python-3.6.3 on Linux fails two tests: test_math and test_cmath

Precisely which version of Linux?  The same failure has already been
reported on OpenBSD here:

https://bugs.python.org/issue31630


Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-27 Thread Tim Peters
Note that Matthew Barnett's `regex` module already supports \G, and a
great many other features that weren't around 15 years ago either ;-):

https://pypi.python.org/pypi/regex/

I haven't followed this in detail.  I'm just surprised once per year
that it hasn't been folded into the core ;-)
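For the stdlib-only crowd, \G's "matches must be adjacent" behavior can be approximated with `re` by anchoring each attempt at the previous match's end, since `Pattern.match(string, pos)` only matches starting exactly at `pos`. A sketch (`findall_adjacent` is a made-up helper name, not anything in the stdlib):

```python
import re

def findall_adjacent(pattern, text, flags=0):
    """Emulate Perl's \\G with re: only accept matches that start
    exactly where the previous match ended."""
    pat = re.compile(pattern, flags)
    pos, out = 0, []
    while pos <= len(text):
        m = pat.match(text, pos)   # match() anchors at pos, like \G
        if not m:
            break
        out.append(m.group())
        if m.end() == pos:         # guard against empty-match loops
            pos += 1
        else:
            pos = m.end()
    return out

print(findall_adjacent(r"\d", "123a456"))  # ['1', '2', '3'] - stops at 'a'
```

Unlike `re.findall`, which would return all six digits, only the leading run of adjacent matches survives.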

[nothing new below]

On Fri, Oct 27, 2017 at 10:35 AM, Guido van Rossum wrote:
> The "why" question is not very interesting -- it probably wasn't in PCRE and
> nobody was familiar with it when we moved off PCRE (maybe it wasn't even in
> Perl at the time -- it was ~15 years ago).
>
> I didn't understand your description of \G so I googled it and found a
> helpful StackOverflow article:
> https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex.
> From this I understand that when using e.g. findall() it forces successive
> matches to be adjacent.
>
> In general this seems to be a unique property of \G: it preserves *state*
> from one match to the next. This will make it somewhat difficult to
> implement -- e.g. that state should probably be thread-local in case
> multiple threads use the same compiled regex. It's also unclear when that
> state should be reset. (Only when you compile the regex? Each time you pass
> it a different source string?)
>
> So I'm not sure it's reasonable to add. But I also don't see a reason why it
> shouldn't be added -- presuming we can decide on good answer for the
> questions above about the "scope" of the anchor.
>
> I think it's okay to start a discussion on bugs.python.org about the precise
> specification of \G for Python. OTOH I expect that most core devs won't find
> this a very interesting problem (Python relies on regexes for parsing a lot
> less than Perl does).
>
> Good luck!
>
> On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote:
>>
>> All,
>>
>> perl has a regex assertion (\G) that allows multiple-match regular
>> expressions to be able to use the position of the last match. Perl's
>> documentation puts it this way:
>>
>> \G Match only at pos() (e.g. at the end-of-match position of prior
>> m//g)
>>
>> Anyways, this is exceedingly powerful for matching regularly
>> structured free-form records, and I was really surprised when I found
>> out that python did not have it. For example, if findall supported
>> this, it would be possible to write things like this (a quick and
>> dirty ifconfig parser):
>>
>> pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S)
>>
>> val = """
>> eth2  Link encap:Ethernet  HWaddr xx
>>  inet addr: xx.xx.xx.xx  Bcast:xx.xx.xx.xx  Mask:xx.xx.xx.xx
>> ...
>> loLink encap:Local Loopback
>>inet addr:127.0.0.1  Mask:255.0.0.0
>> """
>>  matches = re.findall(pat, val)
>>
>> So - why doesn't python have this? is it something that simply was
>> overlooked, or is there another method of doing the same thing with
>> arbitrarily complex freeform records?
>>
>> thanks much..
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>


Re: [Python-Dev] The current dict is not an "OrderedDict"

2017-11-07 Thread Tim Peters
[Peter Ludemann]
> Does it matter whether the dict order after pop/delete is explicitly
> specified, or just specified that it's deterministic?

Any behavior whatsoever becomes essential after it becomes known ;-)

For example, dicts as currently ordered easily support LRU (least
recently used) purging like so:

On access:

result = d.pop(key)
d[key] = result

This moves `key` from wherever it was to the _most_ recently used position.

To purge the `numtopurge` least recently used keys (since traversing
the dict is always from least-recently to most-recently added):

topurge = tuple(itertools.islice(d, numtopurge))
for key in topurge:
del d[key]

Is it worth guaranteeing that will always "work" (as intended)?  Not
to me, but I do have code that relies on it now - and we can count on
someone else saying it's utterly crucial ;-)
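The two idioms above, stitched into one runnable sketch (the helper names are mine, not anything in the stdlib):

```python
import itertools

def touch(d, key):
    # Re-inserting after pop() moves `key` to the newest position.
    d[key] = d.pop(key)

def purge_lru(d, numtopurge):
    # Iteration runs oldest-to-newest, so the first keys are the
    # least recently used ones.
    for key in tuple(itertools.islice(d, numtopurge)):
        del d[key]

d = dict.fromkeys("abcd")   # insertion order: a, b, c, d
touch(d, "a")               # "a" is now the most recently used
purge_lru(d, 2)             # drops "b" and "c", the two oldest
assert list(d) == ["d", "a"]
```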


Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-12-15 Thread Tim Peters
[Eric Snow]
> Does that include preserving order after deletion?

Given that we're blessing current behavior:

- At any moment, iteration order is from oldest to newest.  So, "yes"
to your question.

- While iteration starts with the oldest, .popitem() returns the
youngest.  This is analogous to how lists work, viewing a dict
similarly ordered "left to right" (iteration starts at the left,
.pop() at the right, for lists and dicts).
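Both halves of that are easy to see interactively:

```python
d = dict.fromkeys("abc")           # insertion order: a, b, c
assert next(iter(d)) == "a"        # iteration starts at the oldest key
assert d.popitem() == ("c", None)  # popitem() removes the newest
assert list(d) == ["a", "b"]       # survivors keep their order
```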


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Tim Peters
[Guido]
>  as_integer_ratio() seems mostly cute (it has Tim Peters all
> over it),

Nope!  I had nothing to do with it.  I would have been -0.5 on adding
it had I been aware at the time.

- I expect the audience is tiny.

- While, ya, _I_ have uses for it, I had a utility function for it
approximately forever (it's easily built on top of math.frexp()).

- Especially now, fractions.Fraction(some_float) is the same thing
except for return type.


> OTOH it looks like Decimal has it,

Looks like ints got it first, and then spread to Decimal because "why
not?" ;-)  The first attempt to spread it to Decimal I found was
rejected (which would have been my vote too):

https://bugs.python.org/issue8947


> so I think this ship has sailed too and maybe it's best to add it to the
> numeric tower just to be done with it.

Or rip it out of everything.  Either way works for me ;-)


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Tim Peters
[David Mertz]
> ...
> I can see no sane reason why anyone would ever call float.is_integer()
> actually. That should always be spelled math.isclose(x, int(x)) because
> IEEE-754. Attractive nuisance is probably too generous, I'd simply call the
> method a bug.

Sometimes it's necessary to know, and especially when _implementing_
754-conforming functions.  For example, what negative infinity raised
to a power needs to return depends on whether the power is an integer
(specifically on whether it's an odd integer):

>>> (-math.inf) ** random.random()
inf
>>> (-math.inf) ** random.random()
inf
>>> (-math.inf) ** random.random()
inf
>>> (-math.inf) ** 3.1
inf
>>> (-math.inf) ** 3.0 # NOTE THIS ONE
-inf
>>> (-math.inf) ** 2.9
inf

But, ya, for most people most of the time I agree is_integer() is an
attractive nuisance.  People implementing math functions are famous
for cheerfully enduring any amount of pain needed to get the job done
;-)
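The two questions really are different - exact integrality versus closeness - and the contrast is easy to demonstrate (a quick sketch using `math.isclose` with its default tolerances):

```python
import math

assert (3.0).is_integer()            # exactly an integral value
assert not (2.9).is_integer()
assert (1e300).is_integer()          # every float >= 2**52 is integral
# isclose answers "near an integer?", not "is one?":
assert math.isclose(2.0 + 1e-12, 2.0)
assert not (2.0 + 1e-12).is_integer()
```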


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Tim Peters
[Tim Peters]
>> ...
>> >>> (-math.inf) ** 3.1
>> inf

[David Mertz]
> Weird. I take it that's what IEEE-754 says. NaN would sure be more intuitive
> here since inf+inf-j is not in the domain of Reals. Well, technically
> neither is inf, but at least it's the limit of the domain. :-).

Mathematical reals have all sorts of properties floats fail to
capture, while mathematical reals don't distinguish between -0 and +0
at all.  "Practical' symmetry arguments often underlie what float
standards require.  At heart , the rules for infinite arguments are
often _consequences_ of "more obvious" rules for signed zero
arguments, following from replacing +-inf with 1/+-0 in the latter.

More explanation here:

https://stackoverflow.com/questions/10367011/why-is-pow-infinity-positive-non-integer-infinity

But we're not required to _like_ it; we just have to implement it ;-)


>> >>> (-math.inf) ** 3.0 # NOTE THIS ONE
>> -inf
>> >>> (-math.inf) ** 2.9
>> inf
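The signed-zero/infinity correspondence Tim describes can be checked directly in Python (a quick sanity check of the IEEE-754 pow rules, not a spec):

```python
import math

# (-0.0) ** odd_integer keeps the sign; non-integer powers give +0.0:
assert math.copysign(1.0, (-0.0) ** 3.0) == -1.0
assert math.copysign(1.0, (-0.0) ** 2.9) == 1.0
# Replacing +-0 with 1/+-0 = +-inf, the infinity rules mirror them:
assert (-math.inf) ** 3.0 == -math.inf
assert (-math.inf) ** 2.9 == math.inf
```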


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-12 Thread Tim Peters
[Tim. on as_integer_ratio()]
>> - I expect the audience is tiny.

[Alexander Belopolsky]
> The datetime module would benefit from having as_integer_ratio()
> supported by more types.  It's been hard to resist requests to allow
> Decimal in timedelta constructors and/or arithmetics

I don't see the connection.  That timedelta construction may use
as_integer_ratio() today doesn't mean it _has_ to use
as_integer_ratio() forever, and is no reason (to my mind) to add
as_integer_ratio all over the place.

Why not drop that, and in oddball cases see whether
fractions.Fraction() can handle the input?

>>> fractions.Fraction(decimal.Decimal("1.76"))
Fraction(44, 25)

Probably less efficient, but I don't care ;-)  And then, e.g.,
timedelta would also automagically allow Fraction arguments (which,
BTW, don't support as_integer_ratio() either).  Bonus:  if datetime is
bothering with hand-coding rational arithmetic now out of concern to
get every bit right, Fraction could handle that too by itself.

At heart, the Fraction() constructor is _all about_ creating integer
ratios, so is the most natural place to put knowledge of how to do so.
A protocol for allowing new numeric types to get converted to Fraction
would be more generally useful than just a weird method only datetime
uses ;-)
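Both conversion routes in code, reusing the Decimal value from the example above (`Decimal.as_integer_ratio()` exists since Python 3.6 and returns the pair in lowest terms):

```python
from decimal import Decimal
from fractions import Fraction

d = Decimal("1.76")
# Fraction knows how to consume a Decimal directly...
assert Fraction(d) == Fraction(44, 25)
# ...and as_integer_ratio() is the hook that conversion leans on:
assert d.as_integer_ratio() == (44, 25)
assert Fraction(*d.as_integer_ratio()) == Fraction(44, 25)
```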


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Tim Peters
[Tim]
>> At heart, the Fraction() constructor is _all about_ creating integer
>> ratios, so is the most natural place to put knowledge of how to do so.
>> A protocol for allowing new numeric types to get converted to Fraction
>> would be more generally useful than just a weird method only datetime
>> uses ;-)

[Guido]
> Ironically, the various Fraction constructors *calls* as_integer_ratio() for
> floats and Decimals. From which follows IMO that the float and Decimal
> classes are the right place to encapsulate the knowledge on how to do it.

It appears that as_integer_ratio was slammed into floats and Decimals
precisely _so that_ Fraction() could call them, while Fraction has its
own self-contained knowledge of how to convert ints and Fractions and
strings and numbers.Rationals to Fraction (and the former types don't
support as_integer_ratio).

That's fine, but my objection is subtler:  the actual answer to "can
this thing be converted to an integer ratio?" is not "does it support
as_integer_ratio?",  but rather "can Fraction() deal with it?" - and
there's currently no way for a new numeric type to say "and here's how
I can be converted to Fraction".

An obvious way to extend it is for Fraction() to look for a special
method too, say "_as_integer_ratio()".  The leading underscore would
reflect the truth:  that this wasn't really intended to be a public
method on its own, but is an internal protocol for use by the
Fraction() constructor.

Then it would be obvious that, e.g., it would be just plain stupid ;-)
for `int` to bother implementing _as_integer_ratio.  The only real
point of the method is to play nice with the Fraction constructor.
_As is_, it's jarring that int.as_integer_ratio() doesn't exist - for
the same reason it's jarring int.hex() doesn't exist.

If Mark or I wanted to use float._as_integer_ratio() directly too,
that's fine: we're numeric grownups and won't throw a hissy fit if
ints don't support it too ;-)
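To make the protocol idea concrete, here's a sketch from the numeric-type author's side. The `Fixed` type and its scaling are invented for illustration; whether (and how) the Fraction constructor probes for the method is exactly what's being debated, so the conversion is spelled explicitly:

```python
from fractions import Fraction

class Fixed:
    """Hypothetical fixed-point type: value == units / 100."""
    def __init__(self, units):
        self.units = units

    def as_integer_ratio(self):
        # No need to reduce the ratio; Fraction normalizes it itself.
        return (self.units, 100)

# Until the constructor grows duck-typed support, spell it out:
print(Fraction(*Fixed(250).as_integer_ratio()))  # 5/2
```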


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Tim Peters
[Guido]
> So let's make as_integer_ratio() the standard protocol for "how to make a
> Fraction out of a number that doesn't implement numbers.Rational". We
> already have two examples of this (float and Decimal) and perhaps numpy or
> the sometimes proposed fixed-width decimal type can benefit from it too.

Yup, that works.  I only would have preferred that you went back in
time to add a leading underscore.


> If this means we should add it to int, that's fine with me.

Given that int.numerator and int.denominator already exist, there's no
plausible "good reason" to refuse to return them as twople.  Still,
I'd wait for someone to complain ;-)
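For the record, ints already carry the pieces, and the complaint did eventually arrive: `int.as_integer_ratio()` was added in Python 3.8. On 3.8+:

```python
n = 42
print((n.numerator, n.denominator))  # (42, 1)
print(n.as_integer_ratio())          # (42, 1) -- added in Python 3.8
```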


Re: [Python-Dev] Symmetry arguments for API expansion

2018-03-13 Thread Tim Peters
[Tim]
>> An obvious way to extend it is for Fraction() to look for a special
>> method too, say "_as_integer_ratio()".

[Greg Ewing]
> Why not __as_integer_ratio__?

Because, at this point, that would be beating a dead horse ;-)


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
[David Mertz]
> I've been using and teaching python for close to 20 years and I never
> noticed that x.is_integer() exists until this thread.

Except it was impossible to notice across most of those years, because
it didn't exist across most of those years ;-)


> I would say the "one obvious way" is less than obvious.

When it was introduced, it _became_ the one obvious way.


> On the other hand, `x == int(x)` is genuinely obvious.

But a bad approach:  it can raise OverflowError (for infinite x); it
can raise ValueError (for x a NaN); and can waste relative mountains
of time creating huge integers, e.g.,

>>> int(1e306)
117216064596736454828831087825013238982328892017892380671244575047987920451875459594568606138861698291060311049225532948520696938805711440650122628514669428460356992624968028329550689224175284346730060716088829214255439694630119794546505512415617982143262670862918816362862119154749127262208

In Python 2, x == math.floor(x) was much better on the latter count,
but not in Python 3 (math.floor used to return a float, but returns an
int now).


As to Serhiy's `not x % 1.0`, after 5 minutes I gave up trying to
prove it's always correct.  Besides infinities and NaNs, there's also
that Python's float mod can be surprising:

>>> (-1e-20) % 1.0
1.0

There isn't a "clean" mathematical definition of what Python's float %
does, which is why proof is strained.  In general, the "natural"
result is patched when and if needed to maintain that

x == y*(x//y) + x%y

is approximately true.  The odd % result above is a consequence of
that, and that (-1e-20) // 1.0 is inarguably -1.0.
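A small session making the invariant (and the patching) concrete; `math.fmod`, which tracks the sign of the dividend instead, is shown for contrast:

```python
import math

x, y = -1e-20, 1.0
q, r = x // y, x % y
print(q, r)             # -1.0 1.0
print(q * y + r)        # 0.0 -- only *approximately* x == y*(x//y) + x%y
print(math.fmod(x, y))  # -1e-20 -- C-style remainder keeps x's sign exactly
```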


> and it immediately suggests the probably better `math.isclose(x, int(x))` that
> is what you usually mean.

Even in some of the poor cases Serhiy found, that wouldn't be a lick
better.  For example,

math.isclose(x/5, int(x/5))

is still a plain wrong way to check whether x is divisible by 5.

>>> x = 1e306
>>> math.isclose(x/5, int(x/5))
True
>>> x/5 == int(x/5)
True
>>> int(x) % 5
3

The problem there isn't  how "is it an integer?" is spelled, it's that
_any_ way of spelling "is it an integer?" doesn't answer the question
they're trying to answer.  They're just plain confused about how
floating point works.  The use of `.is_integer()` (however spelled!)
isn't the cause of that, it's a symptom.
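To make the confusion concrete: every float-side spelling agrees the division is "exact", because the low-order information was lost long before 1e306; only integer arithmetic answers the question actually being asked:

```python
x = 1e306
assert (x / 5).is_integer()   # float-side: "divisible"
assert x / 5 == int(x / 5)    # float-side: still "divisible"
assert int(x) % 5 == 3        # the integer x exactly represents is not
```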


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
Note:  this is a top-posted essay much more about floating-point
philosophy than about details.  Details follow from the philosophy,
and if philosophies don't match the desired details will never match
either.

Understanding floating point requires accepting that they're a funky
subset of rational numbers, augmented with some oddballs (NaNs,
"infinities", minus zero).  At best the reals are a vague inspiration,
and floats have their own terminology serving their actual nature.
Thinking about reals instead is often unhelpful.

For example, it's bog standard terminology to call all IEEE-754 values
that aren't infinities or NaNs "finite".  Which, by no coincidence, is
how Python's math.isfinite() discriminates.  Within the finites -
which are all rational numbers - the distinction between integers and
non-integers is obvious, but only after you're aware of it and give it
some thought.  Which most people aren't and don't - but that's no
reason to prevent the rest of us from getting work done ;-)

This isn't anything new in Python - it's as old as floating-point.
For example, look up C's ancient "modf" function (which breaks a
float/double into its "integer" and "fractional" parts, and treats all
finite floats of sufficiently large magnitude as having fractional
parts of 0.0 - because they in are fact exact integers).

The idea that floats are "just approximations - so all kinds of slop
is acceptable and all kinds of fear inescapable" went out of style
when IEEE-754 was introduced.  That standard codified an alternative
view:  that functions on floats should behave as if their inputs were
_exactly_ correct, and - given that - produce the closest
representable value to the infinitely precise result.  That proved to
be extremely valuable in practice, allowing the development of
shorter, faster, more robust, and more accurate numerical algorithms.

The trend ever since has been to do more & more along those lines,
from trig functions doing argument reduction as if pi were represented
with infinite precision, to adding single-rounding dot product
primitives (all again acting as if all the inputs were exactly
correct).

Since that approach has been highly productive in real life, it's the
one I favor.  Arguments like "no floating point number on the order of
1e306 is sufficiently precise as to be an integer in any meaningful
sense" don't even start to get off the ground in that approach.  Maybe
in 1970 ;-)  You can have no actual idea of whether 1e306 is exactly
right or off by a factor of a million just from staring at it, and
real progress has been made by assuming all values are exactly what
they appear to be, then acting accordingly.  If you want to model that
some values are uncertain, that's fine, but then you need something
like interval arithmetic instead.

From that fundamental "take floats exactly at face value" view, what
.is_integer() should do for floats is utterly obvious:  there is no
possible argument about whether a given IEEE-754 float is or is not an
integer, provided you're thinking about IEEE-754 floats (and not,
e.g., about mathematical reals), and making even a tiny attempt to
honor the spirit of the IEEE-754 standard.

Whether that's _useful_ to you depends on the application you're
writing at the time.  The advantage of the philosophy is that it often
gives clear guidance about what implementations "should do"
regardless, and following that guidance has repeatedly proved to be a
boon to those writing numerical methods.  And, yes, also a pain in the
ass ;-)

--- nothing new below ---

On Wed, Mar 21, 2018 at 3:49 PM, David Mertz  wrote:
> On Wed, Mar 21, 2018 at 3:02 PM, Tim Peters  wrote:
>>
>> [David Mertz]
>> > I've been using and teaching python for close to 20 years and I never
>> > noticed that x.is_integer() exists until this thread.
>>
>> Except it was impossible to notice across most of those years, because
>> it didn't exist across most of those years ;-)
>
>
> That's probably some of the reason.  I wasn't sure if someone used the time
> machine to stick it back into Python 1.4.
>
>>
>> > On the other hand, `x == int(x)` is genuinely obvious..
>>
>> But a bad approach:  it can raise OverflowError (for infinite x); it
>> can raise ValueError (for x a NaN);
>
>
> These are the CORRECT answers! Infinity neither is nor is not an integer.
> Returning a boolean as an answer is bad behavior; I might argue about
> *which* exception is best, but False is not a good answer to
> `float('inf').is_integer()`.  Infinity is neither in the Reals nor in the
> Integers, but it's just as much the limit of either.
>
> Likewise Not-a-Number isn'

Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
[David Mertz ]
>> For example, this can be true (even without reaching inf):
>>
>> >>> x.is_integer()
>> True
>> >>> (math.sqrt(x**2)).is_integer()
>> False

[Mark Dickinson  ]
> If you have a moment to share it, I'd be interested to know what value of
> `x` you used to achieve this, and what system you were on. This can't happen
> under IEEE 754 arithmetic.

I expect it might happen under one of the directed rounding modes
(like "to +infinity").

But under 754 binary round-nearest/even arithmetic, it's been formally
proved that sqrt(x*x) == x exactly for all non-negative finite x such
that x*x neither overflows nor underflows (and .is_integer() has
nothing to do with that very strong result):

https://hal.inria.fr/hal-01148409/document
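A quick empirical spot-check of that result - no substitute for the proof, and assuming a conforming IEEE-754 platform under round-nearest/even - staying well inside the no-overflow/no-underflow range:

```python
import math
import random

random.seed(12345)
for _ in range(100_000):
    x = random.uniform(1e-150, 1e150)  # x*x can't overflow or underflow
    assert math.sqrt(x * x) == x
print("ok")
```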

OTOH, the paper notes that it's not necessarily true for IEEE decimal
arithmetic; e.g.,

>>> import decimal
>>> decimal.getcontext().prec = 4
>>> (decimal.Decimal("31.66") ** 2).sqrt()  # result is 1 ulp smaller
Decimal('31.65')

>>> decimal.getcontext().prec = 5
>>> (decimal.Decimal("31.660") ** 2).sqrt() # result is 1 ulp larger
Decimal('31.661')


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
[Devin Jeanpierre ]
> PyPy (5.8):
> >>> x = 1e300
> >>> x.is_integer()
> True
> >>> math.sqrt(x**2).is_integer()
> False
> >>> x**2
> inf

I think you missed that David said "even without reaching inf" (you
did reach inf), and that I said "such that x*x neither overflows nor
underflows".  Those are technical words related to IEEE-754:  your x*x
sets the IEEE overflow flag, although CPython may or may not raise the
Python OverflowError exception.

>
> (It gives an OverflowError on my CPython installs.)
>
> I believe this is allowed, and Python is not required to raise
> OverflowError here:
> https://docs.python.org/3.6/library/exceptions.html#OverflowError
> says:
>
>> for historical reasons, OverflowError is sometimes raised for integers that 
>> are outside a required range. Because of the lack of standardization of 
>> floating point exception handling in C, most floating point operations are 
>> not checked

You can avoid the OverflowError (but not the IEEE overflow condition!)
under CPython by multiplying instead:

>>> x = 1e300
>>> x*x
inf


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
[Chris Barker ]
> ...
> ... "is it the "right" thing to do in most cases, when deployed by folks
> that haven't thought deeply about floating point.

Gimme a break ;-)  Even people who _believe_ they've thought about
floating point still litter the bug tracker with

>>> .1 + .2
0.30000000000000004

"bug reports".  .is_integer() is easy to explain compared to that  -
and you have to go out of your way to use it.

> ...
> I think pretty much all the real world code that's been shown here for using
> .is_integer() is really about type errors (issues). The function at hand
> really wants integer inputs -- but wants to allow the user to be sloppy and
> provide a float type that happens to be an int. Given Python's duck-typing
> nature, maybe that's a good thing? I know I really discourage dynamic type
> checking

So you identified a use case.  One you don't approve of (nor do I),
but not strongly enough to demand they suffer instead ;-)


> Also, every example has been for small-ish integers -- exponents,
> factorials, etc -- not order 1e300 -- or inf or NaN, etc.
>
> Finally, the use-cases where the value that happens-to-be-an-int is computed
> via floating point -- .is_integer() is probably the wrong check -- you
> probably want  isclose().

Everyone who has implemented a production math library can recall
cases where the functionality was needed.  Here, that includes at
least Stefan Krah and me.  You could also follow the link from Mark
Dickinson to SciPy's implementation of the beta function.

In every case I've needed the functionality, isclose() would have been
utterly useless.  Behold:

>>> (-1.0) ** 3.0
-1.0
>>> (-1.0) ** 3.000000000001  # different result _type_
(-1-3.142007854859299e-12j)
>>> math.isclose(3.0, 3.000000000001)
True

And another showing that the same functionality is needed regardless
of how large the power:

>>> (-1.0) ** 1e300  # an even integer power
1.0

When implementing an externally defined standard, when it says "and if
such-and-such is an integer ...", it _means_ exactly an integer, not
"or a few ULP away from an integer".  IEEE pow()-like functions
bristle with special cases for integers.

>>> (-math.inf) ** 3.1
inf
>>> (-math.inf) ** 3.0 # note: this one has a negative result (odd integer 
>>> power)
-inf
>>> (-math.inf) ** 2.9
inf
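A toy model of that special-casing (the function name is invented; only the `(-inf) ** y` results themselves are real Python behavior), restricted to finite exponents y > 0:

```python
import math

def neg_inf_pow(y):
    """Sketch of IEEE pow's special case for (-inf) ** y, finite y > 0:
    the result is -inf exactly when y is an *exact* odd integer."""
    if y.is_integer() and int(y) % 2 == 1:
        return -math.inf  # odd integer exponent keeps the sign
    return math.inf

assert neg_inf_pow(3.0) == (-math.inf) ** 3.0 == -math.inf
assert neg_inf_pow(2.9) == (-math.inf) ** 2.9 == math.inf
assert neg_inf_pow(3.1) == (-math.inf) ** 3.1 == math.inf
```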


> ...
> All this points to: we don't need .is_integer

I'll grant that you don't think you need it.  So don't use it ;-)


> All the being said -- the standard for depreciation is much higher bar than
> not-adding-it-in-the-first-place.

I would not have added it as a method to begin with - but I agree with
Guido that it doesn't reach the bar for deprecation.  The only
examples of "bad" uses we saw were from people still so naive about
floating-point behavior that they'll easily fall into other ways to
get it wrong.  What we haven't seen:  a single person here saying "you
know, I think _I'd_ be seduced into misusing it!".  It's not
_inherently_ confusing at all.


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-21 Thread Tim Peters
[Tim]
>> from trig functions doing argument reduction as if pi were represented
>> with infinite precision,

[Greg Ewing ]
> That sounds like an interesting trick! Can you provide
> pointers to any literature describing how it's done?
>
> Not doubting it's possible, just curious.

As I recall, when it was first done a "lazy" routine produced as many
bits of pi as a given argument required, doing gonzo arbitrary
precision arithmetic.

Later, computer-aided analysis based on continued fraction expansions
identified the worst possible case across all IEEE doubles (&
singles).  For example, it's possible in reasonable time to find the
IEEE double that comes closest to being an exact integer multiple of
pi/4 (or whatever other range you want to reduce to).  Then it's only
necessary to precompute pi to as many bits as needed to handle the
worst case.

In practice, falling back to that is necessary only for "large"
arguments, and the usual double-precision numeric tricks suffice for
smaller arguments.

Search the web for "trig argument reduction" for whatever the state of
the art may be today ;-)

For actual code, FDLIBM does "as if infinite precision" trig argument
reduction, using a precomputed number of pi bits sufficient to handle
the worst possible IEEE double case, and is available for free from
NETLIB:

http://www.netlib.org/fdlibm/

The code is likely to be baffling, though, as there's scant
explanation.  Reading a paper or two first would be a huge help.
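A small illustration of why the heroics matter: naive reduction subtracts a multiple of the *double* nearest 2*pi, so for huge arguments essentially every bit of the reduced value is noise. The exact printed values depend on the platform's libm, so none are asserted here:

```python
import math

x = 1e22
naive = math.fmod(x, 2 * math.pi)  # reduce by the double nearest 2*pi
print(math.sin(x))      # about -0.8522008497671888 with correct reduction
print(math.sin(naive))  # computed from a garbage reduced argument
```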


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-22 Thread Tim Peters
[Kirill Balunov ]
> I apologize that I get into the discussion. Obviously in some situations it
> will be useful to check that a floating-point number is integral, but from
> the examples given it is clear that they are very rare. Why was the variant
> of including this functionality in the math module not considered at all?

Nobody here really discussed the history, and I don't know.  The
questions here have been about what to do given that `is_integer` and
`as_integer_ratio` are _already_ advertised (public) methods on some
numeric types.


> If the answer is - consistency upon the numeric tower - will it go
> for complex type and what will it mean (there can be two point of views)?

I haven't seen anyone suggest either method be added to Complex.
There are lots of methods that don't show up in the tower before
hitting Real.  For example, given that Complex doesn't support
__float__, it would be bizarre if it _did_ support as_integer_ratio.


> Is this functionality so often used and practical to be a method of float,
> int, ..., and not just to be an auxiliary function?
>
> p.s.: The same thoughts about `as_integer_ratio` discussion.

I would have added them as functions in the `math` module instead,
perhaps supported by dunder methods (__as_integer_ratio__,
__is_integer__).  But that's not what happened, and whether or not
they have double underscores on each end doesn't really make all that
much difference except to dedicated pedants ;-)


Re: [Python-Dev] Deprecating float.is_integer()

2018-03-22 Thread Tim Peters
[Kirill Balunov ]
> ...
> In spite of the fact that the pronouncement has
> already been made, there may still be an opportunity to influence this
> decision.

That's not really how this works.  Guido has been doing this for
decades, and when he Pronounces he's done with it :-)


> I do not think that this is only a matter of choice, how this
> functionality will be accessed through a method or function, in fact these
> highly specialized methods heavily pollute the API

"Heavily"?  Seems oversold.


> and open the door for persistent questions.

That's a door that can never be closed, no matter what.


> Given the frequency and activity of using this `.is_integer` method, its
> deprecation is unlikely to greatly affect anyone (for `as_integer_ratio` I
> think the bar is higher). Summarizing this thread, it seems to me that
> deprecating the `is_integer` method and adding an `is_integer` function to
> the math module will make everyone happy:

Not at all, but that's already been explained.  Deprecation is
_serious_ business:  it's not only the presumably relative handful of
direct users who are directly annoyed, but any number of worldwide web
pages, blogs, books, papers, slides, handouts, message boards ... that
so much as mentioned the now-deprecated feature.  The language
implementation is the tiniest part of what's affected, yet is the
_only_ part we (Python developers) can repair.

Deprecation really requires that something is a security hole that
can't be repaired, impossible to make work as intended, approximately
senseless, or is superseded by a new way to accomplish a thing that's
near-universally agreed to be vastly superior.  Maybe others?
Regardless, they're all "really big deals".

The "harm" done by keeping these methods seems approximately
insignificant.  Serhiy certainly found examples where uses made no
good sense, but that's _common_ among floating-point features.  For
example, here's a near-useless implementation of Newton's method for
computing square roots:

def mysqrt(x):
    guess = x / 2.0
    while guess ** 2 != x:
        guess = (guess + x / guess) / 2.0
    return guess

And here I'll use it:

>>> mysqrt(25.0)
5.0
>>> mysqrt(25.2)
5.019960159204453

Works great!  Ship it :-)

>>> mysqrt(25.1)

Oops.  It just sits there, consuming cycles.

That's because there is no IEEE double x such that x*x == 25.1.  While
that's not at all obvious, it's true.  Some people really have argued
to deprecate (in)equality testing of floats because of "things like
that", but that's fundamentally nuts.  We may as well remove floats
entirely then.
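For contrast, a terminating variant can be sketched: stop when the iteration stops making progress, instead of demanding an exact float equality that may never hold. The function name is invented and this is illustrative only (use math.sqrt in real code):

```python
def mysqrt2(x):
    """Newton's method for sqrt(x), x >= 0, with a robust stopping test."""
    if x == 0.0:
        return 0.0
    guess = max(x, 1.0)  # guarantees guess >= sqrt(x) to start
    while True:
        better = (guess + x / guess) / 2.0
        if better >= guess:  # iteration stopped decreasing: done
            return guess
        guess = better

print(mysqrt2(25.0))  # 5.0
print(mysqrt2(25.1))  # ~5.00999... -- and it terminates
```

The key point is that Newton's iteration for sqrt, started from above the root, decreases monotonically until rounding makes it stall; testing for the stall terminates even when no float squares exactly to x.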

In short, that an fp feature can be misused, and _is_ misused, is no
argument for deprecating it.  If it can _only_ be misused, that's
different, but that doesn't apply to is_integer.

That someone - or even almost everyone - is merely annoyed by seeing
an API they have no personal use for doesn't get close to "really big
deal".  The time to stop it was before it was added.


> PROS:
> ...
> 5. Make everyone happy and stop this thread :)

This thread ended before you replied to it - I'm just a ghost haunting
its graveyard to keep you from feeling ignored ;-)


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Serhiy Storchaka ]
> ...
> This is not new. The optimizer already changes semantic.
> Non-optimized "if a and True:" would call bool(a) twice, but optimized code
> calls it only once.

I have a hard time imagining how that could have come to be, but if it's
true I'd say the unoptimized code was plain wrong.  The dumbest
possible way to implement `f() and g()` is also the correct ;-) way:

result = f()
if not bool(result):
    result = g()

For the thing you really care about here, the language guarantees `a`
will be evaluated before `b` in:

'{}{}'.format(a, b)

but I'm not sure it says anything about how the format operations are
interleaved.  So your proposed transformation is fine by me (your #3:
still evaluate `a` before `b` but ignore that the format operations
may occur in a different order with respect to those).


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim]
> I have a hard time imagining how that could have come to be, but if it's
> true I'd say the unoptimized code was plain wrong.  The dumbest
> possible way to implement `f() and g()` is also the correct ;-) way:
>
> result = f()
> if not bool(result):
>     result = g()

Heh - that's entirely wrong, isn't it?  That's how `or` is implemented ;-)

Same top-level point, though:

result = f()
if bool(result):
    result = g()


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim]
> Same top-level point, though: [for evaluating `f() and g()`]:
>
> result = f()
> if bool(result):
>     result = g()

Ah, I think I see your point now.  In the _context_ of `if f() and
g()`, the dumbest possible code generation would do the above, and
then go on to do

if bool(result):


If in fact `f()` returned a false-like value, an optimizer could note
that `bool(result)` had already been evaluated and skip the redundant
evaluation.  I think that's fine either way:  what the language
guarantees is that `f()` will be evaluated exactly once, and `g()` no
more than once, and that's all so regardless.
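That guarantee is easy to observe with a type whose `__bool__` counts its calls; in current CPython, `if f() and g():` tests each operand's truth exactly once, and skips `g()` entirely when `f()` is falsey:

```python
calls = []

class Noisy:
    def __init__(self, value):
        self.value = value

    def __bool__(self):
        calls.append(1)  # record every truth test
        return self.value

def f():
    return Noisy(False)

def g():
    raise AssertionError("g() must not run when f() is falsey")

if f() and g():
    branch = "taken"
else:
    branch = "not taken"

print(branch, len(calls))  # not taken 1
```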


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-28 Thread Tim Peters
[Tim Delaney ]
> ...
> If I'm not mistaken, #3 would result in the optimiser changing str.format()
> into an f-string in-place. Is this correct? We're not talking here about
> people manually changing the code from str.format() to f-strings, right?

All correct.  It's a magical transformation from one spelling to another.


> I would argue that any optimisation needs to have the same semantics as the
> original code - in this case, that all arguments are evaluated before the
> string is formatted.

That's why Serhiy is asking about it - there _are_ potentially visible
changes in behavior under all but one of his suggestions.


> I also assumed (not having actually used an f-string) that all its
> formatting arguments were evaluated before formatting.

It's a string - it doesn't have "arguments" as such.  For example:

def f(a, b, n):
    return f"{a+b:0{n}b}"  # the leading "f" makes it an f-string

Then

>>> f(2, 3, 12)
'000000000101'

The generated code currently interleaves evaluating expressions with
formatting the results in a more-than-less obvious way, waiting until
the end to paste all the formatted fragments together.  As shown in
the example, this can be more than one level deep (the example needs
to paste together "0", str(n), and "b" to _build_ the format code for
`a+b`).


> So my preference would be (if my understanding in the first line is
> correct):
>
> 1: +0

That's the only suggestion with no potentially visible changes.  I'll
add another:  leave `.format()` alone entirely - there's no _need_ to
"optimize" it, it's just a maybe-nice-to-have.


> 2a: +0.5
> 2b: +1

Those two don't change the behaviors of `.format()`, but _do_ change
some end-case behaviors of f-strings.  If you're overly ;-) concerned
about the former, it would be consistent to be overly concerned about
the latter too.


> 3: -1

And that's the one that proposes to let .format() also interleave
expression evaluation (but still strictly "left to right") with
formatting.

If it were a general code transformation, I'm sure everyone would be
-1.  As is, it's  hard to care.  String formatting is a tiny area, and
format methods are generally purely functional (no side effects).  If
anyone has a non-contrived example where the change would make a lick
of real difference, they shouldn't be shy about posting it :-)  I
looked, and can't find any in my code.


Re: [Python-Dev] Subtle difference between f-strings and str.format()

2018-03-30 Thread Tim Peters
[Steven D'Aprano ]
> ...
> Is there a down-side to 2b? It sounds like something you might end up
> doing at a later date regardless of what you do now.

There are always downsides ;-)

As Serhiy noted later, the idea that "it's faster" is an educated
guess - you can't know before it's implemented.  Changes to the very
complicated eval loop often have not only surprising speed
consequences on one platform, but even consequences in opposite
directions across platforms.  Not necessarily in the part you directly
changed, either.  Optimizing C compilers just can't reliably guess
what's most important in such a massive pile of test-and-branch laden
code.  Indeed, which paths through the eval loop _are_ most important
depend on the Python program you're running at the time (which is,
e.g., why "profile-guided optimization" was invented).

So there's an ocean of potential complications there, and wading
through those has opportunity cost too:  Serhiy is a very productive
contributor, but time he spends on this is time he won't be spending
on other things of potentially greater value.  That's all up to him,
though.

I'm not keen on changing the behavior of f-strings regardless (2a or
2b).  While their implementation details aren't documented, they were
intentional, and follow the pattern increasingly large parts of the
language and std library adopted after the iterator protocol was
introduced:  compute intermediate results as they're needed, not all
in advance.  That's proved to have many advantages.

It's certainly possible to write custom purely functional (no side
effects) __format__ methods such that memory use in an f-string
remains bounded under the current implementation, but can grow without
bound if all __format__ arguments need to be evaluated before any
formatting begins.  It's akin to the difference between iterating over
range() and xrange() in Python 2.

I don't know that there's any real f-string code out there _relying_
on that - but don't know that there isn't either.  It's more plausible
to me than that there are non-functional real __format__ methods.

I'd be happiest if no behaviors changed in anything.  Then the only
downsides to optimizing are code bloat, code churn, new bugs, subtler
corner cases, less predictable behavior for end users, and increased
implementation complexity forever after ;-)


Re: [Python-Dev] IDLE colorizer

2018-04-01 Thread Tim Peters
[MRAB]
> A thread on python-ideas is talking about the prefixes of string literals,
> and the regex used in IDLE.
>
> Line 25 of Lib\idlelib\colorizer.py is:
>
> stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?"
>
> which looks slightly wrong to me.
>
> The \b will apply only to the first choice.
>
> Shouldn't it be more like:
>
> stringprefix = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"
>
> ?

I believe the change would capture its real intent.  It doesn't seem
to matter a whole lot, though - IDLE isn't a syntax checker, and
applies heuristics to color on the fly based on best guesses.  As is,
if you type this fragment into an IDLE shell:

kr"sdf"

only the last 5 characters get "string colored", presumably because of
the leading \br in the original regexp.  But if you type in

ku"sdf"

the last 6 characters get "string colored", because - as you pointed
out - the \b part of the original regexp has no effect on anything
other than the r following \b.

But in neither case is the fragment legit Python.  If you do type in
legit Python, it makes no difference (legit string literals always
start at a word boundary, regardless of whether the regexp checks for
that).
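The difference is easy to see in isolation with a simplified literal pattern (the quoted-string part here is a stand-in, not IDLE's real rule - only the prefix regexps are the ones under discussion):

```python
import re

LITERAL = r"(?:'[^']*'|\"[^\"]*\")"  # simplified stand-in for IDLE's rule
old = re.compile(r"(?i:\br|u|f|fr|rf|b|br|rb)?" + LITERAL)
new = re.compile(r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?" + LITERAL)

print(old.search('ku"sdf"').group())  # u"sdf" -- \b guarded only the r
print(new.search('ku"sdf"').group())  # "sdf"  -- \b guards every prefix
print(old.search('kr"sdf"').group())  # "sdf"  -- \br already blocked this one
```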


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-17 Thread Tim Peters
I'll channel that Guido would be happiest if this rule were followed:

Given an assignment statement using "=", the meaning is the same if
"=" is replaced with ":=".

In particular, the expression at the far right is evaluated once, and
- in case of chained assignments - is applied in turn to each target
left-to-right.

Otherwise the semantics of "=" and ":=" can be very different indeed.

So, then, e.g., and assuming the rule above always applies:

[Nick]
> Tuple unpacking:
>
> What's the result type for "a, b, c := range(3)"? Is it a range()
> object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2,
> 3)" or "(a, b, range(3))"?

It's the range object range(3).  Same as in:

x = a, b, c = range(3)

`x` is bound to the range object range(3).


> Once you have your answer, what about "a, b, c := iter(range(3))"

A range_iterator object, same as what `x` is bound to in:

x = a, b, c = iter(range(3))

However, `list(x)` then returns an empty list, because iter(range(3))
was evaluated only once, and the iterator was run to exhaustion when
unpacking it for the `a, b, c` target.
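The assignment-statement behavior being channeled here is easy to check directly:

```python
it = iter(range(3))
x = a, b, c = it  # `it` is evaluated once, then applied to each target

assert (a, b, c) == (0, 1, 2)
assert x is it        # x is bound to the iterator object itself...
assert list(x) == []  # ...which the a, b, c unpacking already exhausted
```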

> or "a, b, *c := range(10)"?

The range object range(10).


> Whichever answers we chose would be surprising at least some of the
> time, so it seems simplest to disallow such ambiguous constructs, such
> that the only possible interpretation is as "(a, b, range(3))"

That's why Guido would be happiest with the rule at the top.  "The
answers" can already be surprising at times with current assignment
statements, but they are well defined.  It would be mondo bonkers to
make up entirely different subtle answers ;-)


> Subscript assignment:
>
> What's the final value of "result" in "seq = list(); result =
> (seq[:] := range(3))"? Is it "range(3)"? Or is it "[1, 2, 3]"?

As above, it's range(3).


> As for tuple unpacking, does your preferred answer change for the
> case of "seq[:] := iter(range(3))"?

As above, a range_iterator object, but one that's already been run to
exhaustion.


> More generally, if I write  "container[k] := value", does only
> "type(container).__setitem__" get called, or does
> "type(container).__getitem__" get called as well?

The rule at the top implies __setitem_ is called once, and __getitem__
not at all.  The value of the assignment is the object  `value` was
bound to at the start, regardless of how tricky __setitem__ may be.
And in

k := container[k] := value

`k` is bound to `value` before `container[k]` is evaluated.  Why?
Because that's how assignment _statements_ have always worked.
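Both claims - __setitem__ only, and left-to-right target binding - can be verified with the statement form today; here's a small sketch using a hypothetical dict subclass that records which hooks run:

```python
class Tracking(dict):
    # Hypothetical dict subclass recording which hooks get called.
    calls = []

    def __setitem__(self, key, value):
        self.calls.append(("set", key))
        super().__setitem__(key, value)

    def __getitem__(self, key):
        self.calls.append(("get", key))
        return super().__getitem__(key)

container = Tracking()
k = "old"
# The far-right value is computed once, then bound to each target left
# to right - so `k` is already "value" by the time container[k] runs.
k = container[k] = "value"
assert k == "value"
assert container.calls == [("set", "value")]  # __getitem__ never ran
```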


> Attribute assignment:
>
> If I write  "obj.attr := value", does only "type(obj).__setattr__"
> get called, or does "type(obj).__getattribute__" get called as well?

As above, only __setattr__.


> While I can't think of a simple obviously ambiguous example using
> builtins or the standard library, result ambiguity exists even for the
> attribute access case, since type or value coercion may occur either
> when setting the attribute, or when retrieving it, so it makes a
> difference as to whether a reference to the right hand side is passed
> through directly as the assignment expression result, or if the
> attribute is stored and then retrieved again.

This is already defined for assignment statements.  While the PEP
doesn't say "and the same for assignment expressions", my guess is
that it won't be accepted unless it does.

Or, indeed, the target is limited to a name.  But Guido wasn't keen on that.

In short, I think the PEP's chance of acceptance increases the _more_
assignment expressions act like assignment statements, not the less,
and is highest if they act exactly the same (except for returning a
value; e.g., while

>>> a = 3

at a shell displays nothing,

>>> a := 3

should display 3).
(Assuming the shell would treat a bare `a := 3` as an expression statement.)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-17 Thread Tim Peters
[Tim]
>> I'll channel that Guido would be happiest if this rule were followed:
>>
>> Given an assignment statement using "=", the meaning is the same if
>> "=" is replaced with ":=".

[Chris]
> That's broadly the intention. At the moment, there are two exceptions:
>
> 1) Augmented assignment isn't a thing

Doesn't have to be :-)  "Augmented assignment statement" is already a
different thing than "assignment statement" (for example, in an
augmented assignment statement, there is no chaining, and the sole
target can't be, e.g., a slice or any form of unpacking syntax).


> 2) Chained assignment isn't a thing, which means that the assignments
> operate right-to-left

>> In particular, the expression at the far right is evaluated once, and
>> - in case of chained assignments - is applied in turn to each target
>> left-to-right.

> I'll toy with this and see if I can implement it sanely. If so,
> that'll eliminate one more distinction.

>> Otherwise the semantics of "=" and ":=" can be very different indeed.

> TBH, the common cases won't actually be much affected.

Or at all!  That's not the point here, though:  if making assignment
expressions work as exactly like assignment statements as possible is
what's needed for the PEP to pass, it's the _annoying_ cases that have
to be looked at.

Personally, after considerable staring at my own code, I would be
perfectly happy to settle for assignment expressions no fancier than

identifier ":=" expression

That alone covers over 99% of the cases I'd be tempted to use the new
feature at all, and then gobs of general-case assignment-statement
difficulties go away, including the "right-to-left or left-to-right?"
distinction (there's no way to tell which order bindings happen in `x
:= y := z := 3` short of staring at the generated code).

But so far I haven't gotten the impression that Guido is fond of that.
He should be, though ;-)


> You give this example:
>
> k := container[k] := value
>
> but that's not going to be more common. What I'm more likely to see is
> something like this:

Not about what's common, but about the full range of what's possible to express.

...

[Nick]
>>> Tuple unpacking:
>>>
>>> What's the result type for "a, b, c := range(3)"? Is it a range()
>>> object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2,
>>> 3)" or "(a, b, range(3))"?

>> It's the range object range(3).  Same as in:
>>
>> x = a, b, c = range(3)
>>
>> `x` is bound to the range object range(3).

> At the moment, "x = a, b, c := range(3)" will set c to range(3), then
> build a tuple of that with the existing values of a and b. You can,
> however, parenthesize the (a, b, c) part, and then it'll behave as you
> say.

Which would be really annoying to "repair".


>>> Whichever answers we chose would be surprising at least some of the
>>> time, so it seems simplest to disallow such ambiguous constructs, such
>>> that the only possible interpretation is as "(a, b, range(3))"

>> That's why Guido would be happiest with the rule at the top.  "The
>> answers" can already be surprising at times with current assignment
>> statements, but they are well defined.  It would be mondo bonkers to
>> make up entirely different subtle answers ;-)

> Wholeheartedly agreed.

I'd like Guido to chime in again, because I'm pretty sure he won't
accept what's currently on the table.  There are two plausible ways to
repair that:

1. Continue down the road of making assignment expressions "exactly
like" assignment statements in their full generality.

2. Back off and limit assignment expressions to what appears to be the
overwhelmingly most common case motivated by looking at real code (as
opposed to constructing examples to illustrate pitfalls &
obscurities):

identifier ":=" expression


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-17 Thread Tim Peters
[Guido, makes peace with `identifier := expression`]
> ...
> I am fine with this, it certainly seems the easiest to implement, with the
> fewest corner cases, and the easiest restriction to explain.
>
> (I was thinking there would be a use case for basic tuple unpacking, like
> seen a lot in for-loop, but the only examples I tried to come up with were
> pretty sub-optimal, so I don't worry about that any more.)

Chris's pain threshold appears to be higher than ours ;-)

So I would really like to see if anyone has plausibly realistic uses
for fancier forms of assignment expression.

I have plenty of code that does stuff like this:

while True:
    x, y = func_returning_tuple()
    if y is None:
        break
    ...

Maybe it's just that I'm used to it, but I find that very easy to
understand now.  If we had fancy assignment expressions, my first
thought was I could write it like so instead:

while ((x, y) := func_returning_tuple()) and y is not None:
    ...

and pray that I put in enough parens to get the intended meaning.

And maybe it's just that I'm _not_ used to that, but I do find it
harder to understand.  Contributing factor:  I don't really want "and"
there - what the context requires is really more like C's comma
operator (take only the last value from a sequence of expressions).
As is, I'm relying on that a 2-tuple is truthy regardless of its
content (so that `and` always goes on to evaluate its RHS).

And, for some reason, I find this even worse:

while ((x, y) := func_returning_tuple())[1] is not None:
    ...

The rub there:  I gave `y` a name but can't use it in the test?!


And those are the same kinds of headaches I saw over & over in my own
"fancier" code:  stuff that's already perfectly clear would become
more obscure instead.
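With the plain-name form (the one Python 3.8 ultimately shipped), the loop-and-a-half can at least name the tuple, if not its components, in the test - sketched here with a made-up stand-in for func_returning_tuple:

```python
def func_returning_tuple(_state=iter([(1, 2), (3, None)])):
    # Hypothetical stand-in: returns (x, y) pairs, ending with y == None.
    return next(_state)

results = []
while (s := func_returning_tuple())[1] is not None:
    x, y = s
    results.append((x, y))
assert results == [(1, 2)]
```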

Tuple unpacking works great in for-loops because the only effect there
is to give names to the tuple components, none of which are needed
_in_ the `for` statement itself.  But in a `while" or `if` statement,
I would typically _also_ want to use the names _in_ the `while` or
`if` tests.  But, as in C, that's what the comma operator is for, not
the assignment operator.

while (s = function_returning_struct_with_x_and_y_members(), s.y != NULL) {
    ...
}

In contrast, ,many plausible uses I saw for `identifier := expression`
in a `while` or `if` statement would have been improvements, and most
of the rest neutral:  I'm still wondering whether this one is better
or worse ;-):

def newton(f, fprime, x):
    import math
    while not math.isclose((next_x := x - f(x) / fprime(x)), x):
        x = next_x
    return next_x
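That example runs unchanged on Python 3.8+; computing sqrt(2), for instance:

```python
import math

def newton(f, fprime, x):
    # Newton's method; the binding expression names the next iterate
    # right in the loop test.
    while not math.isclose((next_x := x - f(x) / fprime(x)), x):
        x = next_x
    return next_x

root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)
assert math.isclose(root, math.sqrt(2))
```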


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-17 Thread Tim Peters
[Paul Moore]
>> the next question will likely be "so why does = exist at all?"

[Greg Ewing ]
> And if we decide to make ':=' the official assigment operator and
> deprectate '=', the next question will be "Why do we have '=='
> instead of '='?"

Which would be a fine question!  In Python's very early days, it
didn't have "==" at all:  plain "=" was used for both assignment and
equality testing.

From the HISTORY file:

"""
New features in 0.9.6:
...
- '==' is now the only equality operator; "../demo/scripts/eqfix.py" is
  a script that fixes old Python modules
"""

That script crawled a source tree and replaced instances of "=" used
for equality testing with the new-fangled "==".  We can obviously do
something similar to replace instances of "=" used for assignment when
that's removed, and I'm sure nobody will complain about that either
;-)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-18 Thread Tim Peters
[Tim]
>> And, for some reason, I find this even worse:
>>
>> while ((x, y) := func_returning_tuple())[1] is not None:
>>     ...
>>
>> The rub there:  I gave `y` a name but can't use it in the test?!
>>
>> And those are the same kinds of headaches I saw over & over in my own
>> "fancier" code:  stuff that's already perfectly clear would become
>> more obscure instead.

[Nick]
> Whereas I think:
>
> while (s := func_returning_tuple())[1] is not None:
>     s = x, y
>     ...
>
> compares favourably with the loop-and-a-half version.

Obviously not, since it really needs to be

x, y = s

instead ;-)

In context, I was looking for realistic cases in which assignment
expressions _fancier than_

identifier ":=" expression

is a real improvement.  You found an improvement instead by
_replacing_ a "fancier than" instance with a plain-single-name target.
I already have lots of examples from real code where plain-single-name
target reads better to me.  I don't have any yet from real code where
something fancier does.

In this specific case, I find your rewriting about as readable as the
loop-and-a-half, except for the obvious drawback of the former:

> It does make the guarantee that "y is not None" harder to spot than it
> is in the loop-and-a-half version, though.

Over time, the functions in the real codes from which the example was
synthesized change, sometimes carrying more or less state in tuples.
When that happens, the original

x, y = s

will helpfully blow up (size mismatch in unpacking),  But, if the
tuple length increased, is it still the case that I want to test the
1'th component?  The test is now divorced from the unpacking.  I do
know that I'll still want to test the component I think of as being
"the 'y' component", and the loop-and-a-half version accommodates that
naturally.

Then again, I could switch to new-fanged namedtuples instead, and do

while (s := func_returning_tuple()).y is not None:

to get the best of all worlds.
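That namedtuple variant also works as advertised on 3.8+, again with a made-up stand-in function:

```python
from collections import namedtuple

Pair = namedtuple("Pair", "x y")

def func_returning_tuple(_state=iter([Pair(1, 2), Pair(3, None)])):
    # Hypothetical stand-in: yields Pair namedtuples, ending with y == None.
    return next(_state)

seen = []
while (s := func_returning_tuple()).y is not None:
    seen.append(s.x)
assert seen == [1]
```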


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-20 Thread Tim Peters
[Guido, about
 g(items[idx] := idx := f())
]
> Does the PEP currently propose to *allow* that horrible example? I thought
> Tim Peters successfully pleaded to *only* allow a single "NAME := ".

I was "successful" only in that the two of us agreed that would be far
less disruptive, and quite possibly an actual improvement ;-)  But I
only argued for limiting assignment expressions to the form

 identifier ":=" expression

I expected that, given that expressions "naturally nest", chained
targets could still be specified:

a := b := c := 5

but since they're all plain names there's no way to tell whether the
bindings occur "left to right" or "right to left" short of staring at
the generated code.  I have no use case for chaining plain-name
targets in assignment expressions, but didn't see a good reason to
torture the implementation to forbid it.  I expected chaining would
just be an unused-in-practice possibility.  Much like, e.g.,

a in b in c in d

is an unused-in-practice possibility.

And I'll take this opportunity to repeat the key point for me:  I
tried hard, but never found a single case based on staring at real
code where allowing _fancier_ (than "plain name") targets would be a
real improvement.  In every case I thought it _might_ help, it turned
out that it really didn't unless Python _also_ grew an analog to C's
"comma operator" (take only the last result from a sequence of
expressions).  I'll also note that I asked if anyone else had a
real-life example, and got no responses.


There were lots of "real life" cases where plain-name targets allowed
for code improvement, though.


> You don't have to implement this restriction -- we know it's possible to
> implement, and if specifying this alone were to pull enough people from -1
> to +0 there's a lot of hope!

Given my experience with _trying_ to find use cases for fancier
targets, and getting burned every time, I'm on the minus side of the
current PEP, because - best I can tell - all the extra complexity
would create an "attractive nuisance" :-(


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-20 Thread Tim Peters
[Chris Angelico ]
> I don't see much value in restricting the assignment target to names
> only, but if that's what it takes, it can be restricted, at least
> initially.

I believe this point was made most clearly before by Terry Reedy, but
it bears repeating :-)  This is from the PEP's motivation:

"""
Naming the result of an expression is an important part of
programming, allowing a descriptive name to be used in place of a
longer expression, and permitting reuse.
"""

As "head arguments" go, that's a good one!  But restricting assignment
expressions to

identifier ":=" expression

satisfies it.  If what's of value is to name the result of an
expression, that single case handles that and _only_ that.  In a
sense, it's "the simplest thing that could possibly work", and that's
generally a good thing to aim for.

Python assignment _statements_ are way more complex than that.
Besides just giving names to expression results, they can also
implicitly invoke arbitrarily complex __setitem__ and __setattr__
methods on targets, rely on all sorts of side effects across chained
assignments, and support funky syntax for magically iterating over an
expression's iterable result.

While that can all be useful _in_ an assignment statement, the PEP's
motivation doesn't say a word about why any of _that_ would also be
useful buried inside an assignment expression.  There doesn't appear
to be a good "head argument" for why, besides "why not?".  That's not
enough.

I think it's no coincidence that every example of an _intended_ use is
of the simple

identifier ":=" expression

form. There are no examples of fancier targets in the PEP, and - more
importantly - also none I saw in the hundreds of mailing-list messages
since this started.  Except for a few of mine, where I tried to
demonstrate why _trying_ fancier targets in examples derived from real
code made the original "loop and a half" code _worse_.  And where other
people were illustrating how incomprehensibly code _could_ be written
(which isn't a real interest of mine).

Short course:  e.g., while a general assignment expression can
"unpack" an iterable expression result, giving names to its elements,
there's no clean way to _use_ the names bound by the unpacking _in_
the "if" or "while" tests.   That's fine for `for` loops (only the
_body_ of the loop needs the names), but in conditional constructs you
typically want to use the names _in_ the condition being tested.

if ((a, b, c) := func_returning_triple()) and b > 0:
    process(a+b, b+c, a+c)

seems to be as good as it gets, but inherently relies on "a trick":
that a 3-tuple is always truthy, regardless of content.  OTOH,

if ((a, b, c) := func_returning_triple())[1] > 0:

doesn't rely on a trick, but can't use the name `b` in the test(!).

if [((a, b, c) := func_returning_triple()), b > 0][-1]:

manages to avoid "a trick", and to use the natural `b > 0`, but is ...
strained ;-)

So, to my eyes, this is a clear improvement over all of those:

a, b, c = func_returning_triple()
if b > 0:
    process(a+b, b+c, a+c)

Of course I could be cherry-picking a bad example there, but that's
not the intent:  I'm still waiting for anyone to post an example where
a "fancy" assignment-expression target would actually make code
clearer.  I haven't found one.

There are lots of examples when the target is a plain single name.

Why the stark difference?  I don't need deep theoretical reasons to
see that there _is_ one, or to conclude that - in the absence of
compelling use cases - complex assignment-expression targets are
probably a Poor Idea.


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-20 Thread Tim Peters
[Tim]
>> And I'll take this opportunity to repeat the key point for me:  I
>> tried hard, but never found a single case based on staring at real
>> code where allowing _fancier_ (than "plain name") targets would be a
>> real improvement.  In every case I thought it _might_ help, it turned
>> out that it really didn't unless Python _also_ grew an analog to C's
>> "comma operator" (take only the last result from a sequence of
>> expressions).  I'll also note that I asked if anyone else had a
>> real-life example, and got no responses.

[MRAB ]
> Could a semicolon in a parenthesised expression be an equivalent to C's
> "comma operator"?

I expect it could, but I it's been many years since I tried hacking
Python's grammar, and I wouldn't want a comma operator anyway ;-)

To recycle a recently-posted example, instead of one of these 3:

if ((a, b, c) := func_returning_triple()) and b > 0:
    process(a+b, b+c, a+c)

if ((a, b, c) := func_returning_triple())[1] > 0:


if [((a, b, c) := func_returning_triple()), b > 0][-1]:
    ...

it would allow this instead:

if ((a, b, c) := func_returning_triple(); b > 0):
    ...

That's better than any of the first three, but I'm not sure it's
better than the original

a, b, c = func_returning_triple()
if b > 0:
    ...

It _may_ be more readable in other complex-target examples, though.

It's also what's wanted in one of the running plain-name target
examples, _not_ involving a conditional context:

r1, r2 = (D := sqrt(b**2 - 4*a*c); a2 := 2*a; ((-b+D)/a2, (-b-D)/a2))

And if I saw enough code like that, I'd write a PEP suggesting that
Python introduce separate assignment statements where name bindings
persisted across statement boundaries ;-)
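For what it's worth, the := form that was eventually accepted can express that quadratic example without any semicolon operator, since a binding made in the first tuple element is visible in the second:

```python
from math import sqrt

a, b, c = 1.0, -3.0, 2.0  # x**2 - 3x + 2 has roots 2 and 1
# Tuple elements evaluate left to right, so D and a2 exist by the time
# the second element needs them.
r1, r2 = (-b + (D := sqrt(b * b - 4 * a * c))) / (a2 := 2 * a), (-b - D) / a2
assert (r1, r2) == (2.0, 1.0)
```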


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-21 Thread Tim Peters
[Matthew Woodcraft ]
> I would like to suggest one more motivating example for "Capturing
> condition values": multiple regex matches with 'elif'.
>
>>> if match := re.search(pat1, text):
>>>     print("Found one:", match.group(0))
>>> elif match := re.search(pat2, text):
>>>     print("Found two:", match.group(0))
>>> elif match := re.search(pat3, text):
>>>     print("Found three:", match.group(0))
>
> Without assignment expressions, you have an annoying choice between a
> cascade of 'else's with an ever-increasing indent and evaluating all the
> matches up front (so doing unnecessary work).

That's a reasonable use, but would more likely be written like so today:

for tag, pat in (("one", pat1), ("two", pat2), ("three", pat3),
                 ("four", pat4), ...):
    match = re.search(pat, text)
    if match:
        print("Found", tag + ":", match.group(0))
        break

Which would still read a bit nicer if the first two loop body lines
could be collapsed to

if match := re.search(pat, text):
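Made concrete with throwaway patterns (pat1 etc. are hypothetical in the original), the collapsed loop runs like so on 3.8+:

```python
import re

text = "carrots: three"
pats = (("one", r"\bone\b"), ("two", r"\btwo\b"), ("three", r"\bthree\b"))

found = None
for tag, pat in pats:
    if match := re.search(pat, text):  # binding expression, Python 3.8+
        found = (tag, match.group(0))
        break
assert found == ("three", "three")
```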


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-21 Thread Tim Peters
[Christoph Groth ]
> Tim, thanks for this clear analysis.  Here's the best use case of more
> general assignment expressions that I can come up with (from real code
> I'm currently working on):
>
> class Basis:
>     def __init__(self, parent, periods=()):
>         self._parent = parent
>         if len(self._periods := np.asarray(periods, int)):
>             ...
>         else:
>             # In absence of periods, treat them as an (0, n)-shaped array.
>             # This avoids a special code path below.
>             self._periods = np.empty((0, len(parent.periods)), int)
>
> But since this is a weak counterexample, it actually serves to
> strengthen your point that
>
> identifier ":=" expression
>
> is all that is needed.

That's a decent example.  In truth, I have no real objection to
binding an attribute - but am willing to throw out a bit of soap with
the bathwater if doing so can avoid throwing the baby out too ;-)


> Such minimal assignment expressions have the (IMHO important) advantage
> of not being inconsistent with assignment statements.
>
> 
>
> Still, it seems weird to have two different ways of binding names in the
> language where one would be sufficient (i.e. the old one would remain
> only for backwards compatibility).  From the point of view of someone
> who's new to the language that's two things to learn instead of just
> one.

But they're very different in a key respect.  the value of an
assignment expression is the value assigned.  Asking "what's the value
of a statement?" doesn't even make sense in Python (whether an
assignment statement or any other kind of statement).

For that reason, _if_ a PEP is reworked to suggest a "binding
expression" (I'd prefer the name change to nudge people away from
conflating it with the far more general assignment statement), the
usage pragmatics are clear:  use a binding expression if the context
requires using the value bound, else use a simple assignment
statement.

":=" doesn't _just_ mean "bind the simple name on the left" in that
world, but also "and return the value of the expression on the right".

For that reason, e.g.,

i = 1

would be strongly preferred to

i := 1

as a standalone line, except perhaps when typing at an interactive
shell (where you may _want_ to see the value being bound - but usually
don't).


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-21 Thread Tim Peters
[Tim]
>> I expected that, given that expressions "naturally nest", chained
>> targets could still be specified:
>>
>> a := b := c := 5
>>
>> but since they're all plain names there's no way to tell whether the
>> bindings occur "left to right" or "right to left" short of staring at
>> the generated code.

[Nick Coghlan ]
> The fact class namespaces are ordered by default now allow us to
> demonstrate the order of multiple target assignments and tuple
> unpacking without staring at generated code:
>
> >>> class AssignmentOrder:
> ...     a = b = c = 0
> ...     d, e, f = range(3)
> ...
> >>> class ReversedAssignmentOrder:
> ...     c = b = a = 0
> ...     f, e, d = range(3)
> ...
> >>> [attr for attr in AssignmentOrder.__dict__ if not attr.startswith("_")]
> ['a', 'b', 'c', 'd', 'e', 'f']
> >>> [attr for attr in ReversedAssignmentOrder.__dict__ if not attr.startswith("_")]
> ['c', 'b', 'a', 'f', 'e', 'd']
>
> So that's a situation where "name = alias = value" could end up
> matching "alias := name := value"

Cool!  So this is really a killer-strong argument for getting rid of
classes - way overdue, too ;-)


> (Even in earlier versions, you can illustrate the same assignment
> ordering behaviour with the enum module, and there it makes even more
> of a difference, as it affects which name binding is considered the
> canonical name, and which are considered aliases).

So if binding expressions can be chained, they'll need to ape
"left-to-right" binding order.

Or they can't be allowed to chain to begin with.

Either way would be fine by me.


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-21 Thread Tim Peters
[Christoph Groth ]
>> > Still, it seems weird to have two different ways of binding names in
>> > the language where one would be sufficient (i.e. the old one would
>> > remain only for backwards compatibility).  From the point of view of
>> > someone who's new to the language that's two things to learn instead
>> > of just one.

[Tim]
>> But they're very different in a key respect.  the value of an
>> assignment expression is the value assigned.  Asking "what's the value
>> of a statement?" doesn't even make sense in Python (whether an
>> assignment statement or any other kind of statement).

[Christoph]
> There are also no function call statements in Python.  People are
> happily using function call expressions as statements when not
> interested in their value.

Sure.

> I hope to have shown [1] that the same could be done for assignments.  A
> consistent value can be defined for any assignment statement.  So, all
> assignment statements could be redefined as expressions and the language
> would continue to work and even be (perfectly?) backwards-compatible.

Except for shells.  When I type, e.g.,

>>> xs = sorted(iterator_returning_a_billion_strings)

I really don't want to wait for hours before I can type again ;-)  In
the same way now, when someone calls a function at a shell but doesn't
want to see its result, they do something like

>>> xxx = function(a, b, c)

knowing that an assignment statement never displays any output on its
own.  If an assignment statement did return a result, almost all
shells would display it.  Shells typically don't care at all what you
typed at them, they just care whether or not executing the compiled
code returns None:

result = execute_code()
if result is not None:
    display(repr(result))
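CPython's real hook for this is sys.displayhook, which prints the repr of any non-None result and stays silent for None - exactly why assignment statements display nothing at a shell:

```python
import io
import sys

captured = io.StringIO()
saved = sys.stdout
sys.stdout = captured
try:
    sys.displayhook(None)  # like an assignment statement: silent
    sys.displayhook(3)     # like an expression statement: repr printed
finally:
    sys.stdout = saved

assert captured.getvalue() == "3\n"
```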

There's also that you're not considering the other half:  that every
existing assignment statement could be viewed as being as expression
does not imply that every existing assignment statement could be used
everywhere an expression can be used.  Syntax matters, and function
call argument lists in particular already bristle with their own
meanings for commas, equal signs, and asterisks.  The language was
designed with "and the twain shall never meet" in mind ;-) For
example, what would

f(a=b)

mean?

The worst possible ;-) answer is "well, since

a=b

is fine as an assignment statement, it must mean that we bind the
value of `b` to name `a` and then pass b's value to `f()` as its first
positional argument".  That reading would break countless lines of
code using keyword arguments.  If you're willing to concede that's too
much breakage to bear, then you have to identify and spell out "the
rules" for every case in which something that "looks like an
assignment expression really isn't, depending on context".

But since I have no interest in pursuing this, I'll stop there :-)


> Syntax-wise, if replacing = by := everywhere is unthinkable, as it
> seems, there's still the possibility (not completely ruled out by Guido
> ;-) to use = for assignment expressions but require extra parens for
> safety.

That would be received less well than the current PEP.  The people it
would hurt the most are newcomers from other languages who habitually
put _every_ "if" and "while" test in parentheses, because that's what
they're used to doing (e.g., in C).  Many of us still remember our
initial relief when we realized we'd never piss away hours debugging
an

assert(n=1)
or
if (x=0.0)

typo/thinko again.  Reintroducing that possibility would get an
instant -1 from me, because I don't want to debug that same mistake
for other people on Stackoverflow either - my time there is wholly
consumed by explaining why .1 + .2 doesn't display exactly "0.3" ;-)


> Thus, it seems to me that redefining assignments as expressions
> everywhere is a feasible, if radical, idea.  Compared to a dedicated
> syntax for "binding expressions" it would be conceptually simpler, but
> would provide more possibilities to shoot oneself in the foot.

As above, it wouldn't remain so simple after hammering out the
detailed rules for deciding when and where something that "looks like
an assignment expression" really is one.

For an example of a fine language that makes no distinction between
"statements" and "expressions" at all, Icon is top on my list.  That
_can_ work out fine - but Icon was designed that way from the start.
And, of course, like every sane language that has wholly general
assignment expressions, Icon uses ":=" as the assignment operator, and
"=" for numeric equality testing ;-)



> [1] https://mail.python.org/pipermail/python-dev/2018-April/152780.html
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-21 Thread Tim Peters
[Matthew Woodcraft]
>>> Well, that's a reason to make the example a bit more realistic, then.
>>>
>>> Say:
>>>
>>> if match := re.search(pat1, text):
>>> do_something_with(match.group(0))
>>> elif match := re.search(pat2, text):
>>> do_something_else_with(match.group(0), match.group(1))
>>> elif match := re.search(pat3, text):
>>> do_some_other_things_with(match.group(0))
>>> and_also_with(match.group(1), match.group(2))

[Steven D'Aprano ]
>> I don't think that a bunch of generic "do_something_with" functions is
>> precisely "realistic".
>>
>> If I saw something like that, I'd try very hard to find a way to
>> refactor it into code like this:
>>
>> for handler in handlers:
>> if handler.match(text):
>> handler.process()
>> break
>> else:
>> # handle no-match case here
>>
>> where the knowledge of what to search for, where to search for it, how
>> to search for it, and what to do when found, was encapsulated in the
>> handler objects. Your tastes may vary.
>>
>> But your point is well-taken that the version with binding assignment
>> (thanks Tim!) is nicer to read than the current procedural version:
>>
>> match = re.search(pat1, text)
>> if match:
>> do_something_with(match.group(0))
>> else:
>> match = re.search(pat2, text)
>> if match:
>> do_something_else_with(match.group(0), match.group(1))
>> else:
>> match = re.search(pat3, text)
>> do_some_other_things_with(match.group(0))
>> and_also_with(match.group(1), match.group(2))
>>
>> I just don't think it counts as a motivating use-case distinct from the
>> single match case.

[Guido]
> The version of this code found in reality is not as regular as the example
> quoted, and the rebuttal "but I would rewrite it with a loop" shoots a straw
> man. To me the if-elif-elif portion of the example is very much a separate
> motivation, since being able to put the assignment in the elif clause avoids
> runaway indentation. I've regretted not being able to use elif in this kind
> of situation many times, whereas in the single match case I don't find it a
> burden to assign the variable in a separate statement preceding the
> if-clause. (I guess this is a case of "flat is better than nested" -- thanks
> Tim! :-)

Au contraire - thank you for forcing me to channel you succinctly lo
those many years ago ;-)

And for pointing out this real use case, which I'm not sure has been
stressed before.  The PEP could clearly use more motivating examples,
and this is a fine class of them.  Few things are more maddening than
runaway cascading indentation :-(

And noting again that a simple "binding expression" (my neologism for
`identifier ":=" expression`, to break the reflexive horror at
imagining the full complexity of assignment statements being allowed
everywhere expressions are allowed) is sufficient to address it.
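For readers wanting to see the flattening concretely, here is a minimal sketch of the if/elif pattern being discussed (the patterns and return values are invented for illustration; requires Python 3.8+, which postdates this thread):

```python
import re

def classify(line):
    # Each fallback pattern gets its own elif; without the binding
    # expression, every re.search would force another nesting level.
    if m := re.search(r"(\d+)-(\d+)", line):
        return ("range", m.group(1), m.group(2))
    elif m := re.search(r"(\d+)", line):
        return ("number", m.group(1))
    else:
        return ("no match",)

print(classify("pages 10-20"))  # ('range', '10', '20')
print(classify("page 7"))       # ('number', '7')
```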


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-22 Thread Tim Peters
[Guido]
> In reality there often are other conditions being applied to the match for
> which `if expr as name` is inadequate. The simplest would be something like
>
>   if ...:
>   
>   elif (m := re.match('(.*):(.*)', line)) and m.group(1) == m.group(2):
> 
>
> And the match() call may not even be the first thing to check -- e.g. we
> could have
>
> elif line is not None and (m := re.match('(.*):(.*)', line)) and 
> m.group(1) == m.group(2):

I find myself warming more to binding expressions the more I keep them
in mind while writing new code.  And I think it may be helpful to
continue showing real examples where they would help.

Today's example:  I happened to code this a few hours ago:

diff = x - x_base
if diff:
g = gcd(diff, n)
if g > 1:
return g

It's not really hard to follow, but two levels of nesting "feels
excessive", as does using the names "diff" and "g" three times each.
It's _really_ an "and" test:  if the diff isn't 0 and gcd(diff, n) >
1, return the gcd.  That's how I _thought_ of it from the start.

Which this alternative expresses directly:

if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
return g

That's so Pythonic I could cry ;-)
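(A runnable rendering of that fragment; the function wrapper and sample arguments are mine, not from Tim's actual code, and it needs Python 3.8+:)

```python
from math import gcd

def shared_factor(x, x_base, n):
    # If diff is nonzero and shares a nontrivial factor with n,
    # return that factor; otherwise return None.
    if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
        return g
    return None

print(shared_factor(7, 3, 10))  # 2, since gcd(4, 10) == 2
print(shared_factor(5, 5, 10))  # None, since diff == 0
```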


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Tim]
>> Which this alternative expresses directly:
>>
>> if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
>> return g
>>
>> That's so Pythonic I could cry ;-)

[Antoine]
> It looks like C to me.  That won't make me cry (I write C++ code daily
> these days), but it's certainly not the same language as Python.
>
> The second part, especially, where you use the result of an
> assignment expression as a comparison operand, looks definitely
> un-Pythonic.

You snipped the part explaining _what's_ "Pythonic" about it:

It's _really_ an "and" test:  if the diff isn't 0 and gcd(diff, n) >
1, return the gcd.  That's how I _thought_ of it from the start.

"Expresses directly" is the Pythonic part; the syntax is minor to me.
Seeing that the _intent_ is an "and test" is a pattern-matching puzzle
in the original spelling (which essentially turned me into a compiler,
writing low-level code for the _concepts_ I had in mind from the
start):

diff = x - x_base
if diff:
g = gcd(diff, n)
if g > 1:
return g

But note that the part of the PEP I support is just the "binding
expression" part:  giving a simple name (binding an identifier) to the
result of an expression.  I don't want the full potential complexity
of assignment statements in expressions.  There's nothing
"un-Pythonic" about merely giving a name to an expression result,
apart from that there are few contexts that currently support that in
a sanely usable way.


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Sven R. Kunze ]
> What about
>
> diff = x - x_base
> if diff and gcd(diff, n) > 1:
> return gcd(diff, n)
>
> # or
>
> if (x - x_base) and gcd(x - x_base, n) > 1:
> return gcd(x - x_base, n)
>
>
> and have the interpreter handle the optimization, or apply an lru_cache? ;-)

Surely you're joking.  This is math.gcd(), which is expensive for
multi-thousand bit integers, and the interpreter knows nothing about
it.  Adding a cache of _any_ kind (LRU or otherwise) would make it
even slower (in the application, there's no reason to expect that x -
x_base will repeat a value before O(sqrt(n)) iterations, which itself
can be thousands of bits - a cache hit would be a miracle).


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Tim]
>> Surely you're joking.  This is math.gcd(), which is expensive for
>> multi-thousand bit integers, and the interpreter knows nothing about
>> it.  Adding a cache of _any_ kind (LRU or otherwise) would make it
>> even slower.

[Sven R. Kunze ]
> Alright, if that problem is just about performance,

It's not, but others had already pointed out that it's generally
considered Poor Practice (and for good reasons) to textually repeat
expressions, so I didn't echo that.  Even in purely functional
languages, where textually equal snippets are guaranteed to evaluate
to the same result every time, "give these expressions these brief
names" constructs are heavily used (see, e.g., "let" and "where" in
Haskell).


> then there must be a better way to resolve it rather than
> inventing a new syntax.

Why?  "Give the result of an expression a name" is already heavily
used in Python - it's just that the _contexts_ in which it can be done
are very limited now.


> Btw. storing the result in a local var is also a cache IMHO. And
> if gcd is immutable, I think Python can do a great job here of
> optimizing.

After decades, CPython still does nothing of the sort, short of having
eventually made, e.g., "None" and "True" and "False" reserved words so
at least it can optimize uses of those.  It knows nothing at all about
which library functions are pure - and there's no code in the
implementation currently capable of exploiting such information even
if it were known.  That remains a fantasy in CPython.


> Anyway, your example is the best one I've seen so far.

Guido gave better ones, where binding expressions would allow to
collapse arbitrarily deep levels of nesting to just one (if ... elif
... elif ... elif ...).  My example only eliminated a single level of
artificial indentation.  But my example did have the advantage of
being taken verbatim from actual, working code ;-)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Ethan Furman ]
> So I really like being able to make the assignment in the expression, but I
> have a really hard time parsing it with the name first.
>
> ...
>
> On the other hand, if it were using the "as" keyword:
>
> if (x - x_base as diff) and (gcd(diff, n) as g) > 1:
> return g
>
> I would parse as:
>
>   if
> x - x_base
> as diff
>   and
> gcd(diff, n)
> as g
>   > 1:
>   return g
>
> For me at least, the last is much more readable.  Thinking about it some
> more, the problem (or maybe just my problem) is that I see an "if" or
> "while" and the I look for the thing that is True or False, and using the
> ":=" syntax the first thing I see is a placeholder for a result that doesn't
> exist yet, making me constantly scan backwards and forwards to put all the
> pieces in the correct place.
>
> With "as", it just flows forwards.

I can read it fine either way, and don't much care.  A possible
advantage of an "as" operator is that its precedence could be set to
bind just a tad stronger than comparisons (which include "is" and "is
not" in Python), and then, e.g.,

if f() as result is not None:
do something with result

could work as intended.  So long as people can't get "assignment
_statements_" out of their heads,

if result := f() is not None:

groups instead as

if result := (f() is not None):

which would almost never be _intended_.  Maybe spelling it "as"
instead could break that.
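The grouping trap is easy to demonstrate (Python 3.8+; `f` here is just an illustrative stand-in):

```python
def f():
    return 42

# ':=' has lower precedence than comparisons, so the name captures
# the *boolean* comparison result, not f()'s return value:
if r1 := f() is not None:
    pass
print(r1)  # True

# Parentheses capture the value itself, as usually intended:
if (r2 := f()) is not None:
    pass
print(r2)  # 42
```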

However, against "as" is that its current use in "with" statements
does something quite different:

with f() as name:

does not bind the result of `f()` to `name`, but the result of
`f().__enter__()`.  Whether that "should be" fatal, I don't know, but
it's at least annoying ;-)
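That difference is easy to show with a toy context manager (illustrative code, not from the thread):

```python
class CM:
    def __enter__(self):
        # This return value, not the CM instance, is what 'as' binds.
        return "enter-result"
    def __exit__(self, *exc):
        return False

with CM() as name:
    print(name)  # enter-result
```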


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Tim]
>> if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
>> return g

[Greg Ewing ]
> My problem with this is -- how do you read such code out loud?

In the message in which I first gave that example:

if the diff isn't 0 and gcd(diff, n) > 1, return the gcd.
   That's how I _thought_ of it from the start.

In my mind, `x - x_base` doesn't even exist except as a low-level
definition of what "diff" means.  It's different for the other test:
_there_ `g` doesn't exist except as a shorthand for "the gcd".  In one
case it's the name that's important to me, and in the other case the
expression.  The entire function from which this came is doing all
arithmetic modulo `n`, so `n` isn't in my mind either - it's a
ubiquitous part of the background in this specific function.

But you did ask how_I_ would read that code ;-)  Anyone else is free
to read it however they like.  I naturally read it in the way that
makes most sense to me in its context.


> From my Pascal days I'm used to reading ":=" as "becomes". So
> this says:
>
>"If diff becomes x - x_base and g becomes gcd(diff, n) is
> greater than 1 then return g."
>
> But "diff becomes x - x_base" is not what we're testing!

I don't really follow that.  In Python,

if f() and g > 1:

first tests whether `f()` "is truthy", regardless of whether it does
or doesn't appear in a binding expression.  Because this code is
working with integers, there's an _implied_ "!= 0" comparison.


> That makes it sound like the result of x - x_base may or may not
> get assigned to diff, which is not what's happening at all.

Then I suggest the problem you're having doesn't stem from the binding
expression, but from that you're omitting to fill in the != 0 part:
if you're not thrown by "greater than 1", I can't see how you can be
thrown by "not zero".


> The "as" variant makes more sense when you read it as an
> English sentence:
>
>if ((x - x_base) as diff) and ...
>
>"If x - x_base (and by the way, I'm going to call that
> diff so I can refer to it later) is not zero ..."

So read the original as "if diff (which is x - x_base) is not zero ...".

Regardless, Guido has already said "as" is DOA (Dead On Arrival)
(illustrating that it's also common enough in English to give a short
name before its long-winded meaning ;-) ).


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Steve Holden ]
>> ...
>> The assignment expression seems like a vary natural way to introduce
>> variables of limited (controlled?) scope, [...]

[Antoine Pitrou ]
> AFAIU, the scope isn't limited to the "if" block, it's a regular local
> variable.  I might have misread.

You're right about the current version of the PEP.  No new scoping
rules are introduced.  The PEP does suggest some changes to corner
case scoping semantics, though.


> ...
> Regardless, my three questions about this are:
> - does it make Python more powerful?

Goodness no.


> - does it make Python more readable?

There are cases where it would, and cases where it wouldn't.  People
shouldn't use it in the latter cases ;-)  I very recently wrote this
block of code:

outside = p2units[p1][tgt_kind] - units[2]
if outside:
if not all(self.crossout(q, n, undo)
 for q in outside):
return False

The opening pair is a very common minor annoyance; it's marginally
more readable like so:

if outside := p2units[p1][tgt_kind] - units[2]:

Saving an essentially useless line with a duplicated name is worth
something to me, because it comes up so very often.

But that's indeed "minor".  In my diff/gcd example, it reduced 5 lines
of code to 2; saved a level of annoying (semantically misleading)
indentation; and cut the number of instances of both "diff" and "g"
from 3 each to 2 each (ideal:  one each to bind the name, and then one
each to use the name later).  That's relatively substantial by any
measure.

In Guido's if/elif/elif/elif/elif ... complex text processing example
template, it can save an unbounded number of semantically needless
indentation levels.

So the readability benefits can range from highly negative to highly positive.


> - does it make Python easier to learn and teach?

By whom?  Almost no addition has ever made a language easier to learn
for raw beginners:  every addition is something they eventually need
to learn.  We could make Python easier to learn for beginners by
throwing out virtually everything added since version 0.9.6 ;-)

But people coming _from_ most other very widely used languages (C,
C++, Java, Javascript, Perl, ...) are already familiar with assignment
expressions.  The limited (to a plain identifier target) "binding
expression" PEP simplification I favor would be nothing new to them at
all (whereas the full complexity of Python's assignment statements is
indeed beyond what they're used to, but needs to be taught & learned
regardless of this PEP's fate).

At least when restricted to binding expressions, the syntax is simple
and the semantics are the very simplest case of what people (convert
or raw beginner) need to learn for Python's assignment statements
regardless.


> My answer would be "no" to all three, but YMMV.

And it did ;-)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-23 Thread Tim Peters
[Antoine]
>>> - does it make Python easier to learn and teach?

[Tim]
>> By whom?  Almost no addition has ever made a language easier to learn
>> for raw beginners:  every addition is something they eventually need
>> to learn.  We could make Python easier to learn for beginners by
>> throwing out virtually everything added since version 0.9.6 ;-)

[Antoine]
> Constructs like "with ..." or "try / except / finally" make the
> language easier to learn compared to the dances they are meant to
> replace.

They nevertheless need to be taught & learned (and try/except/finally
was essentially always in the language).  You snipped the parts
pointing out that binding expressions are already familiar to people
coming from most other languages, and even for raw beginners the
semantics are the tiniest part of what they need to learn anyway about
Python's assignment expressions.

So that was my point:  they don't make Python any harder to learn or
teach.  To the contrary, for people coming from other languages, it's
one less thing they're used to they wouldn't have to _un_learn.


> "await" is a more readable and less confusing improvement
> over "yield from".

Heh.  Not to me.  I literally have no idea what to do with "await"
(I use generators heavily, but have had no use yet for coroutines),
but use

yield from an_iterable

routinely.  That use is perfectly clear, to the point that it _can't_
be improved on:  it already does exactly what I want from it, with no
effort at all.  It's simply impossible that, whatever "await" does, it
could be more readable or less confusing than what I use "yield from"
for.


>  Format strings dispense from the older, more convoluted formulations.

But they didn't _replace_ them.  They're Yet Another Way to Format
Strings everyone has to learn.  That made teaching/learning harder,
not easier, but you can make a case they make Python easier to _use_
for people who eschew the older forms.


> Iteration is much simpler than the longer forms we would have to write
> if generalized iterators didn't exist.

I'll buy that one.  Now go through the HISTORY file and count all the
changes you didn't name ;-)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-24 Thread Tim Peters
[Antoine Pitrou ]
> ...
> Having to break things out over multiple lines is a fact of life, if
> only for readability when implementing (and maintaining!) non-trivial
> processing routines. It's a good thing to be used to it, and to learn to
> choose good names for intermediate variables.

Well, the last part is overselling:  by its very nature, a binding
expression does not relieve the programmer one whit from needing to
pick good names.  The name is part of the binding expression.  The
sheer number of names needed is the same with or without binding
expressions, although the latter allow for less repetitive typing (&
reading) of those names.

For the rest, _needing_ to split a simple bind-and-test across two
lines doesn't really build character,  or have any other virtue
(besides familiarity to old-time Python programmers) I can see.
Neither does falling into indentation hell have any virtue in the
rarer cases where binding expressions really shine.  Simple things
_should_ be simple to do; indeed, when they are, that's an incentive
to keep things simple.

There will still be plenty of code where splitting multiple bindings
across multiple lines is obviously better.


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-24 Thread Tim Peters
[Stephen J. Turnbull]
>> Neologisms are usually written in the other order:
>> "dead on arrival (DOA, for short)." ;-)

[Greg Ewing ]
> Maybe we can make use of that?
>
>if (x - x_base) (diff) and gcd(diff, n) (g) > 1:
>
> That doesn't work, because the (...) look like function
> calls. But what if we used a different set of bracketing
> characters:
>
>if (x - x_base) {diff} and gcd(diff, n) {g} > 1:
>
> I think that's unambiguous, because you can't currently
> put {...} straight after an expression.

As Guido noted more than once when this was still on python-ideas,
this isn't a "a puzzle" to be solved by any technical tricks
conceivable.  He's not going to accept anything in his language that
isn't at least plausibly evident.  There's a long & distinguished
history of other languages using ":=" for binding, which is why that
one gained traction before this moved to python-dev.


> To make it look even more like a neologism definition,
> we could require the bound names to be all-uppercase. :-)
>
>if (x - x_base) {DIFF} and gcd(DIFF, n) {G} > 1:
>   return G

Yes - now you're on the right track ;-)


Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-24 Thread Tim Peters
[Antoine]
> ...
> Yes... I think most will agree that Python is generally easy to take up
> for people coming from C++ etc., so my "easier to learn and teach" was
> mostly about non-programmers.

[Tim]
>> even for raw beginners the semantics are the tiniest part of what
>> they need to learn anyway about Python's assignment expressions.

> I'm not sure what you mean by that.  If it's the tiniest part, what's
> the overwhelming part?

I was hoping it was clear from context that I was talking about
"binding expressions", not the PEP's wholly general "assignment
expressions".


> Is the new assigment expression that delicate to use that it requires
> reading a long and intimidating design document ? I didn't get that
> impression, so it seems you may be making a stronger point than me
> for rejecting the PEP :-)

I'm -1 myself on the PEP's assignment expressions, because there are
no compelling use cases yet for any but the simplest ("binding
expressions") cases.  And, yes, understanding Python's assignment
statements is challenging.  Just understanding their grammar is
challenging:

assignment_stmt ::=  (target_list "=")+ (starred_expression | yield_expression)
target_list ::=  target ("," target)* [","]
target  ::=  identifier
 | "(" [target_list] ")"
 | "[" [target_list] "]"
 | attributeref
 | subscription
 | slicing
 | "*" target

Followed by pages of dense text explaining what all those
possibilities mean.  A binding expression is more like:

binding_expression ::= identifier ":=" expression

and the only part of the assignment statement docs needed to explain
the meaning is the brief "If the target is an identifier (name)"
section, augmented with "and the value of `expression` is the value of
the binding expression".  If someone has learned what

i = 1

means, they already learned almost all of what binding expressions
mean too.  The target in a binding expression can't be more
complicated than the `i` in that example.


>>> "await" is a more readable and less confusing improvement
>>> over "yield from".

>> Heh.  Not to me.  I literally have no idea what to do with "await"
>> (I use generators heavily, but have had no use yet for coroutines),
>> but use
>>
>> yield from an_iterable
>>
>> routinely.

> Yeah... "yield from" is fine for that, except that it was explicitly
> meant for the coroutine use case as well (I'm not sure what the
> timeline is, but probably Guido was already thinking/dreaming about
> tulip/asyncio back then). And trying to shoehorn both in a single
> construct made it confusing and inadequate.
>
> When you want to express two abstractly different concepts (generating
> a stream of values, or suspending a task until some asynchronous subtask
> finishes), it makes things easier if those two concepts have two
> different concrete expressions. Hence "await" making the language
> easier to learn for those whose use cases benefit from it.

All of which I remain blissfully unaware of :-)


...
>> It's simply impossible that, whatever "await" does, it
>> could be more readable or less confusing than what I use "yield from"
>> for.

> Probably because "await" wouldn't work at all for you, then :-)

I'm glad people who need "await" got it - they'd have to pry _my_ uses
of "yield from" from my cold, dead fingers ;-)  Despite that all my
uses could be trivially replaced by

for _ in an_iterable:
yield _

"yield from" saves typing, indentation, and conceptual noise for me.
It's the "binding expressions" of nested generators ;-)
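A tiny example of the delegation being described (my illustration):

```python
def flatten(iterables):
    for it in iterables:
        yield from it  # same effect as: for x in it: yield x

print(list(flatten([[1, 2], (3,), "ab"])))  # [1, 2, 3, 'a', 'b']
```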


>>>  Format strings dispense from the older, more convoluted formulations.

>> But they didn't _replace_ them.  That made teaching/learning harder,
>> not easier,

> Intuitively, it sounds easier to teach f'some {value}' rather than
> either the .format() or %-formatting alternatives.  The whole goal of
> f-strings, after all, is to make string formatting more approachable.
>
> Learning a language is not learning the whole spec.  When you learn C,
> you don't need to learn the oddities of pre-ANSI function
> declarations :-)

A difference is that there still are mountains of code using earlier
string formatting methods, and my guess is that there always will be.
f-strings aren't always "better".  For example, any number of
generators (including the combinatoric generators from itertools)
yield a sequence of tuples, and

format_string % a_tuple

is often the simplest way to format the tuple components.  Breaking
the tuple apart first, whether via explicit indexing in an f-string,
or via unpacking into a tuple of names for use in an f-string, is
often needless complication.
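For instance (my example, not from the mail), the itertools combinatoric generators yield tuples that % consumes whole:

```python
from itertools import combinations

# A %-format string consumes a whole tuple at once, so tuples coming
# off a generator need no unpacking step first:
lines = ["%s beats %s" % pair for pair in combinations("abc", 2)]
print(lines)  # ['a beats b', 'a beats c', 'b beats c']
```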

So % formatting needs to be learned by anyone who wants to read
_other_ peoples' code.

Then again, that's fine by me, because I don't really care whether
something new needs to be learned.  What I do care about is whether
the benefits exceed the costs.

Re: [Python-Dev] PEP 572: Assignment Expressions

2018-04-24 Thread Tim Peters
[Chris Angelico ]
> Hopefully you have seen, or soon will see, the latest posting of the
> PEP, in which assignment targets are restricted to simple names. :)

I haven't yet, but look forward to it!  You have the patience of a
saint to endure all this - I would have given up 6 years ago ;-)


> Though I still talk about "assignment expressions". I don't see a
> problem with calling them that, but I also don't see a problem with
> calling them "binding expressions" if you prefer.

It's psychology ;-)  So long as the PEP calls them assignment
expressions, people are going to imagine facing the horrors of things
like the current

*b, c = a[c] = a

assignment statement buried deep inside expressions.  But in
conventional use, "binding" is restricted to identifiers, which vastly
simplifies the mental model for "the worst" that can happen.

Since fear is the most potent motivator, "don't scare people" is rule #1 ;-)

But, in the absence of Guido chiming in, it's really up to you.  A few
people have expressed positive feelings about changing the name to
"binding expressions", and none opposed it (that I saw), but the
sample size is too small to claim that "proves" anything.


Re: [Python-Dev] PEP 572: Write vs Read, Understand and Control Flow

2018-04-24 Thread Tim Peters
[Victor Stinner]
...
> Tim Peters gave the following example. "LONG" version:
>
> diff = x - x_base
> if diff:
> g = gcd(diff, n)
> if g > 1:
>return g
>
> versus the "SHORT" version:
>
> if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
>return g
>
> == Write ==
>
> If your job is to write code: the SHORT version can be preferred since
> it's closer to what you have in mind and the code is shorter. When you
> read your own code, it seems straightforward and you like to see
> everything on the same line.

All so, but a bit more:  in context, this is just one block in a
complex algorithm.  The amount of _vertical_ screen space it consumes
directly affects how much of what comes before and after it can be
seen without scrolling.  Understanding this one block in isolation is
approximately useless unless you can also see how it fits into the
whole.  Saving 3 lines of 5 is substantial, but it's more often saving
1 of 5 or 6.  Regardless, they add up.


> The LONG version looks like your expressiveness is limited by the
> computer. It's like having to use simple words when you talk to a
> child, because a child is unable to understand more subtle and
> advanced sentences. You want to write beautiful code for adults,
> right?

I want _the whole_ to be as transparent as possible.  That's a
complicated balancing act in practice.


> == Read and Understand ==
>
> In my professional experience, I spent most of my time on reading
> code, rather than writing code. By reading, I mean: try to understand
> why this specific bug that cannot occur... is always reproduced by the
> customer, whereas we fail to reproduce it in our test lab :-) This bug
> is impossible, you know it, right?
>
> So let's say that you never read the example before, and it has a bug.

Then you're screwed - pay me to fix it ;-)  Seriously, as above, this
block on its own is senseless without understanding both the
mathematics behind what it's doing, and on how all the code before it
picked `x` and `x_base` to begin with.


> By "reading the code", I really mean understanding here. In your
> opinion, which version is easier to *understand*, without actually
> running the code?

Honestly, I find the shorter version a bit easier to understand:
fewer indentation levels, and less semantically empty repetition of
names.


> IMHO the LONG version is simpler to understand, since the code is
> straightforward, it's easy to "guess" the *control flow* (guess in
> which order instructions will be executed).

You're saying you don't know that in "x and y" Python evaluates x
first, and only evaluates y if x "is truthy"?  Sorry, but this seems
trivial to me in either spelling.


> Print the code on paper and try to draw lines to follow the control
> flow. It may be easier to understand how SHORT is more complex to
> understand than LONG.

Since they're semantically identical, there's _something_ suspect
about a conclusion that one is _necessarily_ harder to understand than
the other ;-)  I don't have a problem with you finding the longer
version easier to understand, but I do have a problem if you have a
problem with me finding the shorter easier.


> == Debug ==
>
> Now let's imagine that you can run the code (someone succeeded to
> reproduce the bug in the test lab!). Since it has a bug, you now
> likely want to try to understand why the bug occurs using a debugger.
>
> Sadly, most debuggers are designed as if a single line of code can only
> execute a single instruction. I tried pdb: you cannot run only (diff
> := x - x_base) and then get the "diff" value before running the second
> assignment; you can only execute the *full line* at once.
>
> I would say that the LONG version is easier to debug, at least using pdb.

That might be a good reason to avoid, say, list comprehensions (highly
complex expressions of just about any kind), but I think this
overlooks the primary _point_ of "binding expressions":  to give names
to intermediate results.  I couldn't care less if pdb executes the
whole "if" statement in one gulp, because I get exactly the same info
either way:  the names `diff` and `g` bound to the results of the
expressions they named.  What actual difference does it make whether
pdb binds the names one at a time, or both, before it returns to the
prompt?

Binding expressions are debugger-friendly in that they _don't_ just
vanish without a trace.  It's their purpose to _capture_ the values of
the expressions they name.  Indeed, you may want to add them all over
the place inside expressions, never intending to use the names, just
so that you can see otherwise-ephemeral intra-expression results in
your debugger ;-)


> ... Think about tracebacks. If you get an exception at "line 1" in the
> SHORT example (the long "if" expression), what can you deduce
> from the line number? What happened?
>
> If you get an exception in the LONG example, the line number gives you
> a little bit more information... maybe just enough to understand the
> bug?


Re: [Python-Dev] PEP 572: Write vs Read, Understand and Control Flow

2018-04-24 Thread Tim Peters
[Tim]
>> Binding expressions are debugger-friendly in that they _don't_ just
>> vanish without a trace.  It's their purpose to _capture_ the values of
>> the expressions they name.  Indeed, you may want to add them all over
>> the place inside expressions, never intending to use the names, just
>> so that you can see otherwise-ephemeral intra-expression results in
>> your debugger ;-)


[Steven D'Aprano]
> That's a fantastic point and I'm surprised nobody has thought of it
> until now (that I've seen).
>
> Chris, if you're still reading this and aren't yet heartedly sick and
> tired of the PEP *wink* this ought to go in as another motivating point.

You know, I thought I was joking when I wrote that - but after I sent
it I realized I wasn't ;-)

It would actually be quite convenient, and far less error-prone, to
add a binding construct inside a complicated expression for purposes
of running under a debugger.  The alternative is typing the
sub-expression(s) of interest by hand at the debugger prompt, or
adding print()s, both of which are prone to introducing typos, or
changing results radically due to triggering side effects in the code
invoked by the duplicated sub-expression(s).  Adding a binding
construct wouldn't change anything about how the code worked (apart
from possibly clobbering a local name).
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Tim Peters
[Guido]
>> You don't seem to grasp the usability improvements this will give.
>> I hear you but at this point appeals to Python's "Zen" don't help you.

[Łukasz Langa ]
> This reads dismissive to me. I did read the PEP and followed the discussion on
> python-dev. I referred to PEP 20 because it distills what's unique about the
> value proposition of Python. It's our shared vocabulary.
>
> Can you address the specific criticism I had? To paraphrase it without PEP 20
> jargon:

>  (name := expression) makes code less uniform.  It inserts more information
>   into a place that is already heavily packed with information (logic tests).

I'll take a crack at that.  It's not about "head arguments" at all.  I
sat out the first hundred messages about this on python-ideas, and
looked at code instead.  What I found had little to do with any of the
head (abstract) arguments passionately debated for the duration ;-)

In real life, I found a great many conditional tests that not only
weren't "heavily packed" with information, they were simply of the
form:

NAME = expression
if NAME:
... use NAME ...

That looks more like assembly language than Python ;-)  I saw no harm
at all, and a little gain, in

if NAME := expression:
... use NAME ...

instead.  But even a little gain adds up when it happens so often.
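To make the pattern concrete, here is a small self-contained sketch (the regex and names are invented for illustration; the `:=` form needs Python 3.8+):

```python
import re

line = "error: disk full"

# Longhand: bind the name, then test it.
m = re.match(r"error: (.*)", line)
if m:
    detail_long = m.group(1)

# Binding-expression spelling of the same thing.
if m := re.match(r"error: (.*)", line):
    detail_short = m.group(1)

print(detail_long, detail_short)  # both spellings bind "disk full"
```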

Of course there have been better examples given of bigger gains.  But
in no case have the tests in those examples been "heavily packed with
information".  If they had been, I would have suggested instead
breaking the test clauses _out_ of the conditional statements, and
giving them names each on their own dedicated lines, with comments
explaining what the heck the _intents_ are, even at the cost of adding
an indentation level or two. Sometimes conditionals are _already_ "too
dense".  But more often they're very sparse.

This becomes a question of seasoned judgment.  For example, here's a
real loop summing a series expansion, until the new terms become so
small they make no difference to the running total (a common enough
pattern in code slinging floats or decimals):

while True:
old = total
total += term
if old == total:
return total
term *= mx2 / (i*(i+1))
i += 2

To my eyes, this is genuinely harder to follow, despite its relative brevity:

while total != (total := total + term):
term *= mx2 / (i*(i+1))
i += 2
return total

So I wouldn't use binding expressions in that case.  I don't have a
compelling head argument for _why_ I find the latter spelling harder
to follow, but I don't need a theory to know that I in fact do.
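For what it's worth, both spellings really do compute the same series. Here is a hedged, self-contained check: the choice of cosh's Taylor series and the function names are mine, not from the original message, and the second form needs Python 3.8+.

```python
import math

def cosh_series(x):
    # Sum the Taylor series cosh(x) = 1 + x^2/2! + x^4/4! + ...
    # using the "longhand" loop pattern from the message above.
    mx2 = x * x
    total = 0.0
    term = 1.0
    i = 1
    while True:
        old = total
        total += term
        if old == total:   # new term no longer changes the total
            return total
        term *= mx2 / (i * (i + 1))
        i += 2

def cosh_series_walrus(x):
    # The same computation using the binding-expression spelling.
    mx2 = x * x
    total = 0.0
    term = 1.0
    i = 1
    while total != (total := total + term):
        term *= mx2 / (i * (i + 1))
        i += 2
    return total

print(abs(cosh_series(1.0) - math.cosh(1.0)) < 1e-12)   # True
print(cosh_series(1.0) == cosh_series_walrus(1.0))      # True
```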

But neither do I need a compelling head argument for "why" to know
that in many other cases I find that the use of binding expressions
improves the code.  You shouldn't believe me even if I pretended to
have one and passionately argued for it.  But, by the same token, I'm
spectacularly unmoved by other peoples' head arguments.

For that reason, the messages that sway me are those showing real
code, or at least plausibly realistic code.  In the majority of those
so far, binding expressions would be a small-to-major win.


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Tim Peters
[Tim]
>> To my eyes, this is genuinely harder to follow, despite its relative brevity:
>>
>> while total != (total := total + term):

[Antoine]
> Does it even work?  Perhaps if the goal is to stop when total is NaN,
> but otherwise?

I don't follow you.  You snipped all the text explaining why it would
work, so trying reading that again?  When, e.g., `total` reaches 1.0
and `term` reaches 1e-30, this becomes:

while 1.0 != (total := 1.0 + 1e-30):

which leaves `total` unchanged (1.0 + 1e-30 == 1.0) and then

while 1.0 != 1.0:

causes the loop to exit (`while False:`).
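That absorption is easy to verify directly:

```python
# 1e-30 is far below 1.0's unit in the last place (about 2.2e-16),
# so adding it leaves the float 1.0 unchanged - which is exactly the
# loop's termination condition.
total = 1.0
term = 1e-30
print(total + term == total)  # True
```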


>> For that reason, the messages that sway me are those showing real
>> code, or at least plausibly realistic code.  In the majority of those
>> so far, binding expressions would be a small-to-major win.

> I'm sure it's possible to find thousands of line of code where binding
> expressions wouldn't be a win, but I'm not sure that would be a
> constructive use of mailing-list bandwidth.

And that "argument" is? ;-)

Note that I managed to move the PEP _away_ from general "assignment
expressions" to the much simpler "binding expressions" precisely _by_
illustrating, via real code, why the generality of the former wasn't
actually useful in any case I looked at.  If something is always - or
almost always - useless, that can be shown via considering realistic
code.  That was far more productive than endless abstract debates.


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Tim Peters
[Tim]
>>>> To my eyes, this is genuinely harder to follow, despite its relative
>>>> brevity:
>>>>
>>>> while total != (total := total + term):

[Antoine]
>>> Does it even work?  Perhaps if the goal is to stop when total is NaN,
>>> but otherwise?

[Chris]
>> Yes, it does, because the first "total" is looked up before the
>> rebinding happens. It's 100% unambiguous to the compiler... but still
>> pretty unclear to a human. And I think the multiple use of 'total' is
>> to blame for that. So I agree with Tim that this particular example is
>> better in longhand.

[Antoine]
> "Better" is an understatement :-(  Now that I understood it (thanks
> for the explanation),

Ah, sorry - I had no idea it was the "left to right evaluation" part
you weren't seeing.  Next time explain why you think something is
broken?


> the shorthand version appears completely bonkers.

I wouldn't go that far, but I already said I wouldn't write it that way.

However, without looking at real code, people are just flat-out
guessing about how bad - or good - things _can_ get, no matter how
confident they sound.

So at least give me credit for presenting the _worst_ brief
binding-expression example you've seen too ;-)


Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?

2018-04-25 Thread Tim Peters
[Ryan Gonzalez ]
> I have to say I'm not overly thrilled with PEP 572...it's almost odd,
> because if you asked me back when I first joined this list when I was 13, I
> would've no doubt said *YES*. But, since then, I've gone across many
> projects and languages, and fundamentally *I have never felt hurt by the
> lack of assignment in an expression*, and I always regretted every time I
> tried it in C or Crystal. I understand this experience is pretty
> insignificant in comparison to many of the wizards here, but I thought I'd
> still share it as an opener for what I'm about to say.

The older you get, the more you'll regret not still being 13 ;-)


> With this being said, I'd encourage everyone to take a bit of a step back:
> what exactly are people looking for in PEP 572?
>
> I see two main goals:
>
> - Assignment in a conditional structure.
> - Assignment in a list comprehension.
>
> Most other use cases would significantly hurt readability and seem pretty
> rare.

I haven't been much impressed by suggested uses outside conditional
contexts either.


> Now let's break down the top one:
>
> - Assignment in an if condition.
> - Assignment in a while condition.
>
> So there are roughly three main goals here overall. Now, are there better
> ways to solve these?
> ...
> C++ has recently solved the if condition by allowing declarations inside the
> conditions:

But C++ has always had assignment expressions.  This:

> if (auto a = 123; a != 456) {

is solving a different (albeit related) problem:  that C/C++ require
declaring variables before use.  Python doesn't.  They could have done
the same via, e.g.,

{
auto a = 123;
if (a != 456) {
 ...
}
}

and still have had the scope of `a` limited to one block.
auto-initializers in conditionals just gave a bit of syntactic sugar
for what was already easily (although with more typing) done.


> Many languages have a 'let' expression (using Felix as my example):
>
> if let a = 1, b = 2 in a == b then

I don't read Felix, but I assume the _scope_ of `a` & `b` there ends
immediately before the "then".  If the names can't be used in the
_body_ of a Python `if` (or `while`) block, it's essentially useless
to allow binding names for use solely in the conditional test.

So it would help if you picked "real Python examples" from the many
other earlier messages in these threads.  Python expressions can't
span Python statement boundaries - only Python blocks can do that.  A
form of `let` that _would_ work would be block-structured:

let m = regexp.match(pattern, line) in:
if m:
print(m.group(0))

That solves "a scope problem" the current version of the PEP gave up
on, but in all other respects seems a step back from the current:

m = regexp.match(pattern, line)
if m:
print(m.group(0))


> Swift has taken a bit of a hybrid between the above two:
>
> if let a = 1, b = 2, a == b {

That seems plain incoherent ;-)


> Now, what's the common theme here? **Declarations should be separate from
> expressions.** We've got languages that range from baggage-filled to
> functional to a bit of all of the above, and none of them have added
> assignment *inside* an expression.

C++ and C have always had assignment expressions .  Ditto Java,
Javascript, Perl, Icon,  ... (many, many others).  I don't see a good
reason to grant that Felix and Swift are necessarily improvements over
the former (with the exception of Icon, which I'm merely fond of) very
widely used languages.


> The argument is roughly the same across all boards: you're putting major but
> easy-to-miss side effects in the midst of expressions that *seem* pure.
>
> All this is to say: I'd really encourage everyone here to think a bit more
> about *why* exactly you want this feature, and then think if there's really
> no better way. Any solution that separates declarations would be far more
> readable, (arguably) more Pythonic, and play more nicely with the new-ish
> typing features to boot

People have been trying for years.  If you come up with a realistic
(for Python) idea, that's great - share it!  But it's probably better
suited to python-ideas than python-dev.

>  ...


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Tim Peters
[Raymond Hettinger ]
> After re-reading all the proposed code samples, I believe that
> adopting the PEP will make the language harder to teach to people
> who are not already software engineers.

Can you elaborate on that?  I've used dozens of languages over the
decades, most of which did have some form of embedded assignment.
Yes, I'm a software engineer, but I've always pitched in on "help
forums" too.  One language feature conspicuous by absence in newbie
confusions was, consistently, assignment expressions.  Read any book
or tutorial for such a language, and you'll find very little space
devoted to them too.

What's to learn?  If they understand "binding a name" _at all_ (which
they must to even begin to write a non-trivial program), the only
twist is that a binding expression returns the value being bound.
Binding expressions certainly wouldn't be the _first_ thing to teach
people.  But by the time it would make sense to teach them, it's hard
for me to grasp how a student could struggle with such a tiny
variation on what they've already learned (all the subtleties are in
what - exactly - "binding" means - which they already faced the first
time they saw "j = 1").


> To my eyes, the examples give ample opportunity for being
> misunderstood and will create a need to puzzle-out the intended semantics.

Some do, many don't.  The same can be said of a great many constructs ;-)

> ...


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-25 Thread Tim Peters
[Tim]
>> One language feature conspicuous by absence in newbie
>> confusions was, consistently, assignment expressions.  Read any book
>> or tutorial for such a language, and you'll find very little space
>> devoted to them too.

[Łukasz Langa ]
> Well, you have an entire code style built around this feature called Yoda
> conditions. You teach people on Day 1 to never ever confuse == with =. Some
> compilers even warn about this because so many people did it wrong.

Sorry, I couldn't follow that.  In languages like C that use easily
confused operator symbols, sure, people are forever typing "=" when
they mean "==".  That's nothing to do with whether they _understand_
what the different operators do, though.  They do.  In languages like
Icon (that use "=" for numeric comparison and ":=" for assignment),
that never occurs.  But I'm not sure that addressed the point you were
making.


>> What's to learn?  If they understand "binding a name" _at all_ (which
>> they must to even begin to write a non-trivial program), the only
>> twist is that a binding expression returns the value being bound.

> Ha, not in Python! Here we have *different syntax* for assignments in
> expressions.

Yes, binding expressions in the current PEP support an extremely
limited subset of what Python's assignment statements support.  That
they use different operator symbols is irrelevant to that the meaning
of "binding a name" is exactly the same for both.  _That's_ the "hard
part" to learn.


> Well, you can also use it as a statement. But don't!

Why not?  _Every_ expression in Python can be used as a statement.
Nothing forbids it, and that's even (very!) useful at an interactive
prompt.


> We have a better one for that.

As a matter of style, sure, it's best to use the simplest thing that
works.  As a statement in a program (as opposed to typed at a shell),
"a := 3" has the unnecessary (in that context) property of returning
(and discarding 3), so it's better style to use "a = 3" in that
context.


> And that one supports type annotations, can unpack and assign to many
> targets at the same time, and can even increment, multiply and so on, at once.
> But the other one can't.

So?  math.sqrt() blows up when passed -1, but cmath.sqrt() doesn't.
Different tools for different tasks.


> So only use the Pascal one in expressions. But don't forget parentheses,
> otherwise it will bind the thing you probably didn't want anyway.

[Raymond]
>>> To my eyes, the examples give ample opportunity for being
>>> misunderstood and will create a need to puzzle-out the intended
>>> semantics.

>> Some do, many don't.

> As soon as we have to wrap a part of an expression in parentheses, parsing
> the entire thing becomes more complex. Often enough it will cause the
> expression to exceed whatever line length limit the codebase pledged not to
> exceed, causing one line to become three. And again, making it trickier for
> a regular Łukasz to understand what's going on.

At this point I think you must have a lower opinion of Python
programmers than I have ;-)  If adding even a dozen characters to a
line makes it exceed a reasonable line-length guide, the code was
almost certainly too confusingly dense to begin with.  All the
binding-expression examples I've given as "improvements" had _oceans_
of horizontal white space to swim in.

Guido's if/elif/elif/elif/ ... complex text-processing example didn't,
but because the current lack of an ability to bind-and-test in one
gulp forced the `elif` parts to be ever-more-deeply-indented `if`
blocks instead.

So, to match your sarcasm, here's mine:  try using a feature for what
it's good at instead of for what it's bad at ;-)


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-26 Thread Tim Peters
[Raymond Hettinger ]
>>> After re-reading all the proposed code samples, I believe that
>>> adopting the PEP will make the language harder to teach to people
>>> who are not already software engineers.

[Tim]
>> Can you elaborate on that?

[Raymond]
> Just distinguishing between =, :=, and == will be a forever recurring
> discussion, far more of a source of confusion than the occasional
> question of why Python doesn't have embedded assignment.

To be clear, is distinguishing between "=" and "==" already a forever
recurring discussion in your experience?  Or are you predicting that
adding ":=" will create that situation?


> Also, it is of concern that a number of prominent core dev
> respondents to this thread have reported difficulty scanning
> the posted code samples.

Yes, it is - although some of the examples sucked ;-)


>> I've used dozens of languages over the
>> decades, most of which did have some form of embedded assignment.

> Python is special, in part, because it is not one of those languages.
> It has virtues that make it suitable even for elementary school children.
> We can show well-written Python code to non-computer folks and walk
> them through what it does without their brains melting (something I can't
> do with many of the other languages I've used).  There is a virtue
> in encouraging simple statements that read like English sentences
> organized into English-like paragraphs, presenting itself like
> "executable pseudocode".

It's certainly possible to stick to a subset of Python for which
that's true.  But I didn't mention those dozens of languages because I
seek to emulate them, but to establish that I've had decades of
experience with embedded assignments in a wide variety of languages
and language communities.


> Perl does it or C++ does it is unpersuasive.

Wasn't meant to be.


> Its omission from Python was always something that I thought Guido had
> left-out on purpose, intentionally stepping away from constructs that would
> be of help in an obfuscated Python contest.

He left out lots of stuff at first, but warmed to it later.  Probably
the most profound:  there were exactly and only 3 scopes at first:
local, global, and builtin.  Functions (for example) could still nest,
but had no way to access names local to enclosing functions save via
deep trickery.  That was a noble experiment (it was a deliberate
attempt to avoid complex scoping rules), but eventually proved too
restrictive in practice.

This is nothing compared to that ;-)  But it's a tiny bit related in
that biting the arbitrarily-deeply-nested-scopes bullet was aimed more
at experienced programmers than at newbies.  The scoping rules became
far harder to explain as a result - but far more what experienced
programmers expected.


>> Yes, I'm a software engineer, but I've always pitched in on "help
>> forums" too.

> That's not really the same.

I believe it!

>  I've taught Python to many thousands of professionals, almost
> every week for over six years.  That's given me a keen sense of
> what is hard to teach.  It's okay to not agree with my assessment,
> but I would like for fruits of my experience to not be dismissed in a
> single wisp of a sentence.

I asked you to elaborate - I didn't dismiss anything.  You merely made
a raw assertion in your original message, without enough detail to
even know _what_ it is you thought would be hard to teach.  Your
elaboration is helping.

> Any one feature in isolation is usually easy to explain, but showing
> how to combine them into readable, expressive code is another matter.

OK, so it's not binding expressions in isolation that you expect will
be hard to teach if they're added, but ... how to use them
intelligently (if ever)?  That's progress, if so.

That part I can see having major trouble with.  Even the proponents of
this PEP don't always agree with each other about which examples are
"good ones".


> And as Yuri aptly noted, we spend more time reading code than writing code.
> If some fraction of our users finds the code harder to scan
> because the new syntax, then it would be a net loss for the language.

It would be a tradeoff pitting their losses against others' gains, of
course.  I don't know how to quantify that (not even to the extent of
determining the sign bit) in advance.  I'm also at least as concerned
about - indeed - professional software engineers as beginners.


> I hesitated to join this thread because you and Guido seemed to be
> pushing back so hard against anyone's who design instincts didn't favor
> the new syntax.

That's just vigorous debate, at least on my part.  Guido gets annoyed
by emotional tirades and FUD, of which there's always plenty in
threads that have gone on for hundreds of messages (I don't know
whether you followed any of this on python-ideas, but most arguments
on python-dev were already many-times-over old by the time it first
appeared here).


> It would be nice to find some common ground and perhaps stipulate that the
> g

Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!

2018-04-26 Thread Tim Peters
[Guido]
> Maybe the order for d[k] = v should also be reconsidered?

There's certainly code in the wild relying on the order "v, then d,
then k", because that's just how assignment statements have always
worked (dicts or not).  I'd rather change the dict comprehension code,
because I think the analogy to what `d[k] = v` does is weak.  I
expect:

    {f(): g() for _ in [1]}

to do the same as

    {f(): g()}

The former currently evaluates g() first; the latter f().  Since I
also expect both of those to do the same as

dict((f(), g()) for _ in [1])

f() has to win ;-)
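A quick way to see the order question concretely (the helper names are mine; note that Python 3.8 did in fact sync dict comprehensions with dict displays so the key is evaluated first - on 3.8+ both spellings below record the same order):

```python
calls = []

def f():
    calls.append('f')
    return 'k'

def g():
    calls.append('g')
    return 'v'

d1 = {f(): g()}                    # dict display: key expression first
display_order = list(calls)

calls.clear()
d2 = {f(): g() for _ in [1]}       # dict comprehension
comp_order = list(calls)

print(display_order, comp_order)
```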


Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!

2018-04-26 Thread Tim Peters
[Kirill Balunov ]
> Not sure, but if additional motivating examples are required, there is a
> common pattern for dynamic attribute lookup (snippet from `copy.py`):
>
> reductor = dispatch_table.get(cls)
> if reductor:
> rv = reductor(x)
> else:
> reductor = getattr(x, "__reduce_ex__", None)
> if reductor:
> rv = reductor(4)
> else:
> reductor = getattr(x, "__reduce__", None)
> if reductor:
> rv = reductor()
> else:
> raise Error("un(shallow)copyable object of type %s" % cls)
>
> which can with the current `binding expression` syntax simplified to:
>
> if reductor := dispatch_table.get(cls):
> rv = reductor(x)
> elif reductor := getattr(x, "__reduce_ex__", None):
> rv = reductor(4)
> elif reductor := getattr(x, "__reduce__", None):
> rv = reductor()
> else:
> raise Error("un(shallow)copyable object of type %s" % cls)
>
> which becomes much clearer, at least in my opinion.
>
> With kind regards,
> -gdg

Thanks for sharing that!  While nobody else seems to, I absolutely
love real code ;-)

This is effectively an instance of Guido's "if/elif/elif/elif/..."
example template, where binding expressions shine.  But actual real
code can make a point viscerally that "consider stuff akin to the
following semi-abstract pattern" can't.
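For anyone who wants to run the pattern, here is a toy modeled on that snippet. The `Point` class and its return values are invented for illustration - this is not the real `copy.py` - and the `:=` form needs Python 3.8+.

```python
class Point:
    def __reduce_ex__(self, protocol):
        # Toy stand-in for the real pickling protocol method.
        return ('via __reduce_ex__', protocol)

dispatch_table = {}   # empty, so the first branch is skipped
x = Point()
cls = type(x)

# The if/elif chain from the message: try each reducer in turn,
# binding and testing in one gulp.
if reductor := dispatch_table.get(cls):
    rv = reductor(x)
elif reductor := getattr(x, "__reduce_ex__", None):
    rv = reductor(4)
elif reductor := getattr(x, "__reduce__", None):
    rv = reductor()
else:
    raise TypeError("un(shallow)copyable object of type %s" % cls)

print(rv)  # ('via __reduce_ex__', 4)
```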


Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!

2018-04-26 Thread Tim Peters
[Kirill Balunov]
> Not sure, but if additional motivating examples are required, there is a
> common pattern for dynamic attribute lookup (snippet from `copy.py`):
>
> reductor = dispatch_table.get(cls)
> if reductor:
> rv = reductor(x)
> else:
> reductor = getattr(x, "__reduce_ex__", None)
> if reductor:
> rv = reductor(4)
> else:
> reductor = getattr(x, "__reduce__", None)
> if reductor:
> rv = reductor()
> else:
> raise Error("un(shallow)copyable object of type %s" % cls)
>
> which can with the current `binding expression` syntax simplified to:
>
> if reductor := dispatch_table.get(cls):
> rv = reductor(x)
> elif reductor := getattr(x, "__reduce_ex__", None):
> rv = reductor(4)
> elif reductor := getattr(x, "__reduce__", None):
> rv = reductor()
> else:
> raise Error("un(shallow)copyable object of type %s" % cls)
>
> which becomes much clearer, at least in my opinion.

[Larry Hastings ]
> I hate to be pedantic--there's enough of that going on in this thread--but I
> can't agree with the word "simplifed" above.  I agree that the code using
> binding expressions is shorter.  But considering that emit the two code
> examples implement the exact same algorithm, to the point where their
> bytecode would look nearly* identical, ISTM that the two code examples are
> of identical complexity.

In the absence of defining an objectively computable complexity
measure,  I expect you're doomed to arguing taste.  For example, argue
that both spellings have the same formal "cyclomatic complexity"
measure (which they do).  By other formal measures (e.g., total number
of identifier instances), the latter spelling is "objectively
simpler".  By yet others (e.g., total number of non-whitespace
characters divided by total number of lines), the former spelling is
"objectively simpler".

But that all kinda misses the point to me:  the latter spelling is
"obviously simpler" in a way that _actually matters_, for the same
reason, e.g., a case statement with N cases is "obviously simpler"
than the semantically equivalent spelling using N nested if/else
if/else if/else if/else ... blocks.  The latter spelling above is
indeed visually very much like a case statement:  all the tests are at
the same indentation level, and all the conditional actions are too.
It's obvious _at a glance_ in the latter that exactly one of the
action blocks will be performed.  That's how if/elif/elif/else always
works.  It's not at all obvious at a glance (certainly not to me) in
the original spelling.

> Comparing the two, the code using the binding expressions obviates four
> newlines, three uses of the identifier "reductor", and allows folding two
> "else / if"s into "elif"s.  In exchange, it adds three extra colons,

If it weren't for that you hate being pedantic, I'd add that you're
overlooking the piles of leading whitespace characters also saved in
the latter ;-)  The number of those saved grows quadratically in the
number of uselessly indented blocks shifted left.

> and the density of complexity per line has shot up.

Average non-whitespace character count per line has certainly shot up,
but I don't actually know what you mean by "density of complexity"
there.

Just FYI, when I write long if/elif/elif/... chains, I typically put a
blank line before each elif, to give better visual separation of the
peer (both semantically and visually) test-action blocks.  Which has
nothing to do with any formal notion of complexity, because I don't
much care about that - readability is what I value, and that's not the
same as any formal notion of complexity I've ever seen.

>  ...


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-26 Thread Tim Peters
[Tim]
>> So, to match your sarcasm, here's mine:  try using a feature for what
>> it's good at instead of for what it's bad at ;-)

[Lukasz Langa ]
> Yes, this is the fundamental wisdom.  Judging which is which is left as an
> exercise to the programmer.
>
> With this, I'm leaving the discussion.  With Guido and you on board for PEP
> 572, I feel that Chris' streak is indeed about to break.

I still expect it could go either way, but do wish people didn't
believe it will be a major loss if "the other side wins".  I'll be
fine regardless - and so will everyone else.  Guido rarely makes
language design mistakes.  In this case he's seeing serious opposition
from several core developers, and you shouldn't believe either that he
just dismisses that.


 [Łukasz Langa]
>>> Well, you have an entire code style built around this feature called Yoda
>>> conditions. You teach people on Day 1 to never ever confuse == with =. Some
>>> compilers even warn about this because so many people did it wrong.

>> Sorry, I couldn't follow that.

Part of the problem here is that I had never seen "Yoda conditions"
before, and had no idea what it meant.  Some later Googling suggests
it's "a thing" youngsters say at times ;-)


> You implied that newbies don't have to even know about assignments in
> expressions.  I wanted to demonstrate that this isn't really the case because
> mistaking `=` for `==` is a relatively common occurrence for newbies.  If you
> want to argue that it isn't, I'd like to point out that the WordPress code
> style *requires* Yoda conditions because it was enough of a hindrance.  ESLint
> (a JavaScript linter) also has a warning about assignment in a conditional.

What does that have to do with Python?  If they try to use "=" in an
expression now, they get a SyntaxError.  The PEP doesn't change
anything about that.  Indeed, that's why it uses ":=" instead.  I have
experience in other languages with embedded assignments that also use
":=", and it's _never_ the case that people type ":=" when they intend
"equality test" in those.  The horrid "I typed = when I meant =="
mistakes are unique to languages that mindlessly copied C.  The
mistakes aren't primarily due to embedded assignments, they're due to
that even highly experienced programmers sometimes type "=" when
they're _thinking_ "equals".  Nobody types ":=" when they're thinking
"equals".


> ...
> What you're saying is true.  But for it to be true, newbies *have to* learn
> the distinction, and the fact that yes, sometimes the programmer indeed meant to
> put a single `=` sign in the conditional.

Again, the PEP is about Python:  a single "=" in a conditional is, and
will remain, a SyntaxError.  So nobody can sanely intend to put a
single "=" in a condition _in Python_ unless they're writing a test
intending to provoke a syntax error.


> That's why we'll end up with the Pascal assignment operator.

":=" is already in the PEP.


>  And that *is* a thing that you will have to explain to newbies when they
> encounter it for the first time.

Sure.  That doesn't frighten me, though.  It's easy to explain what it
does - although it may be hard to explain when it's _desirable_ to use
it.


> Sadly, googling for a colon followed by an equal sign isn't trivial if you
> don't know what you're looking for.

To judge from Stackoverflow volume, the single most misunderstood of
all Python operators - by far - is "is" - try Googling for that ;-)
In far second and third places are "and" and "or", for which searches
are also useless.

Regardless, I'm not concerned about one-time tiny learning curves.
Don't know what ":=" means already?  Ask someone.  If you know what
"=" means, you're already close to done.  Given that you already
understand what "binding a name" means, ":=" may well be the simplest
of all Python's operators (there's no computation _to_ be understood,
and no possibility either of a dunder method changing its meaning
depending on operand type(s)).


>>> Well, you can also use it as a statement. But don't!

>> Why not?  _Every_ expression in Python can be used as a statement.
>> Nothing forbids it, and that's even (very!) useful at an interactive
>> prompt.

> Because it suggests different intent, because it's limited, because it's
> slower at runtime, and because PEP 572 says so itself.

I didn't say you're _required_ to use it as a statement.  Regardless
of what PEPs say, people will do what they find most useful.  I trust
people to figure this out quickly for themselves.


>> At this point I think you must have a lower opinion of Python
>> programmers than I have ;-)  If adding even a dozen characters to a
>> line makes it exceed a reasonable line-length guide, the code was
>> almost certainly too confusingly dense to begin with.

> Around 5% of if and elif statements in the standard library don't fit a single
> line *as is*.  Sure, that's a low percentage but that's over 1,000 statements.
> If you're putting an `if` statement in a method, you

Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?

2018-04-26 Thread Tim Peters
[Mike Miller]
>> - How are other modern languages solving this issue?

[Greg Ewing ]
> In all the languages I can think of that allow assignments in
> expressions, there is only one assignment operator -- a stand
> alone assignment is just a bare assignment expression.

Pretty much so, but I don't know what "modern" means to Mike.  The R
language may set a record for, umm, innovation here:

"""
There are three different assignment operators: two of them have
leftwards and rightwards forms.[1]
"""

So there are 5 assignment operator spellings in R:

=
<-
->
<<-
->>

Note that the link doesn't tell the whole story either; e.g., they
don't all have the same precedence level.  And, in addition to the 5
infix spellings shown above, there are also prefix (looks like a
2-argument function call) spellings.

Back on Earth ;-) , I think it's worth it to point out that only
languages (with assignment expressions) aping C use "=" for assignment
and "==" for equality.  That was a Really Bad Idea that all other (not
aping C) languages I know of avoided.

But I'm not sure any of this is relevant to what Mike meant by "this issue".


> But those languages were all designed that way from the start.
> I'm not aware of any that began by forbidding assignment in
> expressions and then added it later.

Me neither.  It's certainly the case that Guido would not have
designed a language that aped C's poor decision here.  At its very
start, Python used "=" for both assignment and equality testing (and
== was a syntax error).  So I think it's evident that, at the time, he
didn't envision ever adding assignment expressions.


[1] https://www.rdocumentation.org/packages/base/versions/3.5.0/topics/assignOps


Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!

2018-04-26 Thread Tim Peters
[Larry Hastings ]
>>> I hate to be pedantic--there's enough of that going on in this thread--but I
>>> can't agree with the word "simplifed" above.  I agree that the code using
>>> binding expressions is shorter.  But considering that emit the two code
>>> examples implement the exact same algorithm, to the point where their
>>> bytecode would look nearly* identical, ISTM that the two code examples are
>>> of identical complexity.

[Tim]
>> In the absence of defining an objectively computable complexity
>> measure,  I expect you're doomed to arguing taste.

[Larry]
> As are you!

I didn't claim otherwise.

> I haven't seen any arguments that binding expressions allow us
> to express programs that were inexpressible in Python before.

They don't.

> I'm not even sure that binding expressions fall under the heading
> of "syntactic sugar", given their negligible semantics (and, imo,
> negligible benefit).  What else is left, on both sides of the debate,
> if not a debate over aesthetics?

I prefer to look at effects on real code.  Other people prefer to philosophize.


>>  For example, argue that both spellings have the same formal
>> "cyclomatic complexity" measure (which they do).  By other formal
>> measures (e.g., total number of identifier instances), the latter
>> spelling is "objectively simpler".  By yet others (e.g., total number
>> of non-whitespace characters divided by total number of lines), the
>> former spelling is "objectively simpler".

> What is this "objective simplicity" measurement you cite?

There are many ways you can (and various programs do) attempt to
define, quantitatively, what "program complexity" means.  Under any
such objectively defined measure, two pieces of code can be
"objectively compared".  I use scare quotes with their ordinary
meaning:  that it's "objective" only if you're silly enough to believe
that _whatever_ numbers you're computing are going to settle the issue
;-)

> I understand that the code example cited had fewer identifiers, so when
> measuring "number of identifiers used" in isolation, the code example using
> binding expressions had fewer of them.

Then you necessarily agree that _if_ our objective definition of
complexity is "total number of identifier instances", the
binding-expression version is "objectively simpler".  It's been
reduced, by definition, to a question of determining which of two
integers is smaller.

> But this is so narrow as to be almost meaningless.

Of course!  As is your original claim that "the two code examples are
of identical complexity" "because" "their bytecode would look nearly
identical".  Well, sure, _if_ that's how we define program complexity,
the conclusion follows.  But there's no reason I can see to accept
that definition to begin with either.  I suspect _you_ like it
primarily because you found it supported the conclusion you had
already reached ;-)

> Perhaps I'm misunderstanding you, but I read this as saying that there's a
> larger, well-established concept called "objective simplicity", of which
> this measurement is a part.  Can you tell me more about it?  Google was no
> help here.

The metrics I mentioned are used by a number of programs that claim to
quantify program complexity.  For example, among many other things,
this program computes cyclomatic complexity, and uses N_2 for "total
number of operands" (which I called "identifiers" instead to
specialize it to the specific example) under the widely used "Halstead
Metrics":

http://radon.readthedocs.io/en/latest/intro.html

My favorite part is where the numerator of the "Maintainability Index" adds in

50 * sin(sqrt(2.4 * C))

where "C is the percent of comment lines (important: converted to
radians)".  WTF?! ;-)  But they're not joking:  some people take this
stuff very seriously.
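For the curious, that comment term is easy enough to compute; here's a sketch of just that piece of the formula as radon's docs describe it (the function name is mine):

```python
import math

def comment_term(percent_comment_lines):
    # The comment-percentage contribution to radon's Maintainability Index:
    # 50 * sin(sqrt(2.4 * C)), where C is the percentage converted to radians.
    c = math.radians(percent_comment_lines)
    return 50 * math.sin(math.sqrt(2.4 * c))
```

So a file with no comments contributes 0, and about 10% comments contributes roughly 30 points - make of that what you will ;-)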


>> But that all kinda misses the point to me:  the latter spelling is
>> "obviously simpler" in a way that _actually matters_, for the same
>> reason, e.g., a case statement with N cases is "obviously simpler"
>> than the semantically equivalent spelling using N nested if/else
>> if/else if/else if/else ... blocks.

> As I already mentioned, the with-binding-expressions code expresses the same
> code, the same concept, and likely results in the same bytecode, as the
> without-binding-expressions code.

And as I already explained in some detail, while I agree with (almost)
all that, it leaves me cold as a dead fish.  The test-action pairs in
the code are _semantically_ peers, not a nesting of subordinates.
It's _clearer_ to human eyes if the syntactic structure of the code
reflects the peer relationship directly.  I couldn't care less that
the byte code turns out being nearly the same.  I'm not a PVM - I need
to _reason_ about the code I read.  In failing to visually reflect the
peer relationship, the original code obscures a key simplicity.

> In contrast, a switch statement is simpler than a series of nested if
> statements.  It's a different code construct, it has differen

Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-26 Thread Tim Peters
[Zero Piraeus]
>> Since it now looks like it really *does* have a decent chance, and
>> maybe another -1 has a small chance of tipping the balance: my
>> reaction to the proposal is also emotional. Visceral, in fact, to the
>> extent that I'd aim to read and write less Python if it became
>> commonplace.

[Steven D'Aprano ]
> Funnily enough, that's what some people said about decorator syntax,
> ternary if, type annotations and list comprehensions.
>
> All of them have become great additions to the language.
>
> I hated the idea of aping C and adding += operators and swore I'd never
> use them. That lasted, well, about a month.
>
> Just sayin'.

Well - I've come to respect your opinion, so ... OK, I'll give += a
try.  Frankly, I've grown tired of editing it out of all the packages
I download anyway ;-)


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-27 Thread Tim Peters
[Lukasz]
>> >  And that *is* a thing that you will have to explain to newbies when
>> > they encounter it for the first time.

[Tim]
>> Sure.  That doesn't frighten me, though.  It's easy to explain what it
>> does - although it may be hard to explain when it's _desirable_ to use
>> it.

[Chris Barker ]
> I'm with Raymond here -- though I'm not sure "newbies" is quite right --
> I've found that newbies fall into two camps: folks to whom programming comes
> naturally, and those that it doesn't (OK, it's a distribution, but a bimodal
> one). And folks that are struggling with programming can struggle even with
> simple assignment (name binding), particularly when you add even function
> local scope.

Sure.  What I wrote was shorthand for what's already been covered at
length many times:  what a binding expression does is "easy to
explain" GIVEN THAT someone ALREADY UNDERSTANDS how binding a name
works.  The latter in fact seems difficult for a significant number of
people to learn, but it's utterly unavoidable that they learn it if
they're ever to write non-trivial Python programs.  That's been true
since Python's first release.

Binding expressions would be introduced much later in any sane course.
At THAT point, for students who haven't already dropped out, the
semantics are darned-near trivial to explain:  it binds the name to
the object the expression evaluates to (all of which they _already_
understand by this point), and the value of the binding expression is
that object (the only new bit).
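The whole of that "only new bit", in a toy demonstration:

```python
# The one new bit: the binding expression's value is the object that was bound.
result = (x := 2 + 3) * 10
assert x == 5        # := bound the name, exactly as an assignment statement would
assert result == 50  # ...and the expression itself evaluated to that object, 5
```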

Unlike as for most other operators, you don't even have to weasel-word
it to account for that a magical dunder method may change what ":="
does.  As for the "is" operator, the meaning is baked into the
language and can't be altered in the slightest.


> So having one more way to do assignment WILL make it harder to
> teach, not because it's that hard, but because it's one more thing to learn.

On a scale of 1 to a million, try to quantify how much harder ;-)  As
above, I can't see it getting beyond a single digit, GIVEN THAT a
student has already mastered the far more complex assignment
_statement_ (binding expressions are limited to the single simplest
case of the many things an assignment statement can do).  "And it
returns the object" is a yawn.  But, as I already granted, it may be
truly hard to explain when it's a desirable thing to use.  That takes
experience and "good judgment", which - according to me - can be
learned but can't really be taught.


> But the fact is that as Python has evolved (particularly with the jump to
> py3) it has become less and less of a "scripting" language, and more of a
> "systems" language. And also harder to learn. Anyone remember CP4E? Python
> is not as good choice as a "newbie" language as it once was.

I agree - although I expect sticking to a subset of Python could make
life easier for beginners.  For example, would anyone in their right
mind even mention async gimmicks when teaching beginners?

Against that, though, one of the most unintentionally funny tech
things I ever read was Bjarne Stroustrup writing about why C++ is an
excellent choice for beginners.  But he does have a point:  if you
throw away the bulk of everything C++ added, there's an easily usable
little language exceedingly well hidden under it all ;-)


> Adding := will move it a little bit more along the complexity path -- not
> much, and that's where Python has gone anyway, so as Tim said, no one's
> going to suffer either way this decision goes.

Yet there will be much wailing and gnashing of teeth anyway ;-)

...
>> To judge from Stackoverflow volume, the single most misunderstood of
>> all Python operators - by far - is "is" -

> You know, I think instructors like me are partly responsible. "is" is rarely
> useful outside of comparing to singletons. Yet I use it early in instruction
> to do checks on name binding and show things with mutablilty, etc which
> has the unfortunate side effect of making it seem like a more common
> operator than it is.
>
> I've even had students write code like:
>
> if x is 3:
>
> and thanks to interning, it appears to work!

Yup, that's the real problem with "is":  its semantics are dead
simple, but "but under exactly what conditions are `x` and `y` bound
to the same object?" is intractable.  It seems to take a long time to
get across the point, that the question itself is misguided.  A full
answer requires delving into transient implementation details, which
is counterproductive because they _are_ accidents of the
implementation du jour.  What questioners need to be nudged into
asking instead is for examples of when using "is" is thoroughly sane.
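A minimal illustration of the trap, where the identity result is an accident of the implementation du jour:

```python
# `is` compares object identity, never values.  That `x is 3` can appear to
# work is an accident of CPython's small-int caching, not a language promise.
a = 1000
b = int("1000")   # built at runtime: typically a distinct object in CPython
assert a == b     # equal values - always true
same_object = a is b   # an implementation detail; don't rely on it either way
```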


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-27 Thread Tim Peters
[Chris Angelico ]
> ...
> I don't understand why people bring up all these arguments that have
> absolutely nothing to do with the proposal at hand. None of this has
> in any way changed.

That's easy:  any time there's a long thread to which Guido has
contributed at least twice, it will be seen as a Golden Opportunity to
re-litigate every decision that's ever been made ;-)

Some amount of that seems healthy to me (people are thinking about
"language design" from a larger view than the proposal du jour).  In
this specific case, line-oriented coverage tools have missed
accounting for all possible code paths since day #1; e.g.,

x = f() or g()
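A sketch of the kind of path such tools have always missed (toy functions, obviously):

```python
# The path line-oriented coverage misses: when f() is truthy throughout the
# test suite, g() is never executed, yet the single line is reported covered.
calls = []
def f(): calls.append("f"); return 1
def g(): calls.append("g"); return 2
x = f() or g()
assert x == 1 and calls == ["f"]   # g() never ran
```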

You don't need to reply to messages so obviously irrelevant to the PEP
unless you want to.  It's not like Guido will read them and go "oh!  a
binding expression in a ternary conditional is a fundamentally new
potential problem for a line-oriented coverage tool!  that's fatal"
;-)


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-27 Thread Tim Peters
Wes, sorry, but I really don't follow what you're saying.  For example,

[Wes Turner ]
> Do not do this:
>
> x = 2
> if (x == 3) or (x := 3):
>    print(x)
>
> What do we call that mistake?

It displays 3 - while it appears to be silly code, there's nothing
about it that's undefined.  So I fail to see how showing that example
anywhere would do anyone any good.
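To spell out why it's well-defined:

```python
# Nothing undefined here: strict left-to-right evaluation.  The equality
# test fails, so `or` evaluates the binding expression, rebinding x to 3.
x = 2
taken = (x == 3) or (x := 3)
assert taken == 3 and x == 3
```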

You can do the same kind of thing today via, e.g.,

class Bindable:
    def __init__(self, value):
        self.bind(value)

    def bind(self, value):
        self.value = value
        return value

    def __bool__(self):
        return bool(self.value)

    def __eq__(self, other):
        return self.value == other

    def __str__(self):
        return str(self.value)

Then:

>>> x = Bindable(2)
>>> if x == 3 or x.bind(3):
...     print(x)
3

And I wouldn't put that example anywhere in any docs either ;-)


Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20

2018-04-27 Thread Tim Peters
[Tres Seaver ]
> FWIW, Ned Batchelder's 'coverage.py' does a good job with branch coverage.
> I haven't seen anything in this discussion which indicates that binding
> expressions will change that at all.

I don't think you missed anything relevant either ;-)  Binding
operators are exactly as irrelevant to control-flow analyzers as,
e.g., introducing a floor division operator (//) was.

Data-flow analyzers (if there are any for Python) are a different
story, since they need to be aware of all (re)binding operations -
although at the byte code level, all such sites remain equally
apparent (no new flavor of "store" operation is added by this PEP).


Re: [Python-Dev] PEP 572: Usage of assignment expressions in C

2018-04-30 Thread Tim Peters
[Raymond Hettinger ]
> Thanks Antoine, this is an important point that I hope doesn't get lost.
> In a language with exceptions, assignment expressions are less needful.
> Also, the pattern of having of having mutating methods return None
> further limits the utility.

It doesn't diminish the utility one whit in cases where binding
expressions are helpful ;-)

What you're saying is that there are _fewer_ such opportunities in
Python than in C.  Which may or may not be true (depending on the code
you're working with).  If you believe it is true, fine, then that also
argues against that people will rush to abuse the feature (to the
extent that it's even plausibly useful less often, to that extent also
will there be less temptation to use it at all).

But then I only care about use cases at heart, and have presented
real-life examples wherein binding expressions read both better and
worse than what they're replacing.  I intend to limit myself to the
cases where they read better :-)  Which are most of the cases I even
considered, BTW - in the vast majority of cases in real code I'd use
them, they'd be replacing the annoyingly bare-bones yet somehow
repetitive anyway:

value = f()
if value:
    doing something with value

with the still bare-bones but minimally repetitive:

if value := f():
    doing something with value

For example, tons of functions I write and use return None or 0 or
False when they want to communicate "I have nothing useful to return
in this case - but since you expected that might happen, I'm not going
to annoy you with an exception".  That pattern isn't solely limited to
regexp search and match functions.

The "win" above is minor but frequent.  It adds up.
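A minimal sketch of that pattern (the helper is hypothetical, standing in for the regexp search/match flavor of function):

```python
import re

def extract_version(text):
    # Returns None to say "nothing useful here" instead of raising.
    m = re.search(r"version (\d+)", text)
    return m and m.group(1)

# The annoyingly bare-bones yet repetitive spelling:
value = extract_version("version 3 released")
if value:
    found = value

# The minimally repetitive spelling:
if value := extract_version("version 3 released"):
    found = value
```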

There are other cases where binding expressions really shine, but
they're much rarer in all the code I looked at (e.g., see the
uselessly ever-increasing indentation levels near the end of `copy()`
in the std library's copy.py).

In all, I expect I'd use them significantly more often than ternary
`if`, but far less often than augmented assignments.  If the PEP is
accepted, that's what all Pythoneers will be saying 5 years from now
;-)


Re: [Python-Dev] Drop/deprecate Tkinter?

2018-05-03 Thread Tim Peters
[Matěj Cepl ]
> It absolutely impossible to remove Tkinter IMHO (it has been
> part of stdlib since like forever and people expect it there;
> its removal would be betrayal on the level of switching = to
> :=), I have my doubts about IDLE though. I know, the same
> argument applies, but really, does anybody use IDLE for
> development for long time, what is its real value for the
> community? Although, even this argument is questionable, because
> Python has some affinity with the learning, and IDLE is a nice
> for first steps nibbling into Python.

IDLE isn't just for eager beginners, but also for those so old &
senile they're incapable of learning anything new ever again.  As
proof, IDLE is still _my_ primary Python development environment, used
multiple times every day, and I'm so old & out-of-it that I'm +1 on
the binding expressions PEP ;-)


Re: [Python-Dev] Slow down...

2018-05-08 Thread Tim Peters
[Steven D'Aprano ]
> ...
> If we could look forward to 2028, when we're running Python 3.14 or so
> (4.7 if you prefer), how many fantastic language features that we cannot
> bear to give up would we be missing out on?

This, for just one:

k = 6 if >!{myobj.meth(arg)[2]} elsenone 7 elsenan 8 else 5

Which is really annoying to write today, but would be much clearer
with binding expressions:

if myobj is None:
    k = 7
elif (t := myobj.meth) is None:
    k = 7
elif (t := t(arg)) is None:
    k = 7
elif (t := t[2]) is None:
    k = 7
elif math.isnan(t):
    k = 8
elif t:
    k = 6
else:
    k = 5

The future is blindingly bright :-)


Re: [Python-Dev] Looking for examples: proof that a list comp is a function

2018-05-14 Thread Tim Peters
[Chris Angelico  ...
> With current semantics, you can easily prove that a list comp is
> implemented with a function by looking at how it interacts with other
> scopes (mainly class scope), but Tim's proposal may change that.

Absolutely not.  I haven't considered for a nanosecond that anything
_essential_ would change in the current implementation.  In effect, my
proposal to

bind assignment statement targets
that appear in a listcomp or genexp
in the blocks that immediately contain their synthetic functions

"merely" sprinkles in some `nonlocal` and/or `global` declarations to
change the targets' scopes.  Indeed, it _relies_ on that they're
implemented as (potentially nested) synthetic functions today.  And if
you haven't read my proposed changes to the reference manual, they
explicitly state that they're talking about the synthetic functions
created to implement genexps and listcomps.


> So I'm looking for examples that prove that a list comp is executed
> inside an implicit function. Ideally, examples that are supported by
> language guarantees, but something that's "CPython has done it this
> way since 3.0" is important too.

I don't believe you'll find that - but, of course, may be wrong about that.

> I'm aware of just two: the name lookup interaction that may be
> changing, and the fact that there's an extra line in a traceback. And
> the latter, as far as I know, is not guaranteed (and I doubt anyone
> would care if it changed). Are there any other provable points?

Nick pointed me to these future docs that _will_ pretty much imply it:

https://docs.python.org/dev/reference/expressions.html#displays-for-lists-sets-and-dictionaries

In part:

"""
However, aside from the iterable expression in the leftmost for
clause, the comprehension is executed in a separate implicitly nested
scope. This ensures that names assigned to in the target list don’t
“leak” into the enclosing scope.

The iterable expression in the leftmost for clause is evaluated
directly in the enclosing scope and then passed as an argument to the
implicitly nested scope.
"""

I say "pretty much" because, for whatever reason(s), it seems to be
trying hard _not_ to use the word "function".  But I can't guess what
"then passed as an argument to the implicitly nested scope" could
possibly mean otherwise (it doesn't make literal sense to "pass an
argument" to "a scope").
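The class-scope interaction Chris mentions is the easiest observable evidence; a small demonstration:

```python
# Only the leftmost iterable is evaluated in the enclosing (here, class)
# scope; the rest of the comprehension runs inside the synthetic function,
# which cannot see class-level names.
class C:
    xs = [1, 2, 3]
    doubled = [x * 2 for x in xs]   # fine: `xs` is the leftmost iterable
    # pairs = [x + y for x in xs for y in xs]
    # ...would raise NameError: the second `xs` lookup happens inside the
    # implicit nested scope, which skips the class scope.
```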


Re: [Python-Dev] [Python-checkins] bpo-33038: Fix gzip.GzipFile for file objects with a non-string name attribute. (GH-6095)

2018-05-15 Thread Tim Peters
Sorry about approving this message (I'm a python-dev list moderator)!
There will be a few more like it.

Looking closer, it appears to be another variation of pure-nuisance
spam that's been flooding all sorts of python.org lists.  You've been
spared many hundreds of those here, but since this one appeared to
contain actual Python-related content, I reflexively approved it.

On Tue, May 15, 2018 at 3:43 PM, nataliemorrisonxm980xm--- via
Python-Dev  wrote:
>
>
> 
> From: Serhiy Storchaka 
> To: python-check...@python.org
> Sent: Wednesday, 9 May 2018, 10:14
> Subject: [Python-checkins] bpo-33038: Fix gzip.GzipFile for file objects
> with a non-string name attribute. (GH-6095)
>
...


Re: [Python-Dev] Looking for examples: proof that a list comp is a function

2018-05-15 Thread Tim Peters
[Tim, about the most recent version of the docs at

https://docs.python.org/dev/reference/expressions.html#displays-for-lists-sets-and-dictionaries
]
>> I say "pretty much" because, for whatever reason(s), it seems to be
>> trying hard _not_ to use the word "function".  But I can't guess what
>> "then passed as an argument to the implicitly nested scope" could
>> possibly mean otherwise (it doesn't make literal sense to "pass an
>> argument" to "a scope").

[Nick Coghlan ]
> I think my motivation was to avoid promising *exact* equivalence with a
> regular nested function, since the define-and-call may allow us
> opportunities for optimization that don't exist when those two are separated
> (e.g. Guido's point in another thread that we actually avoid calling "iter"
> twice even though the nominal expansion implies that we should). However,
> you're right that just calling it a function may be clearer than relying on
> the ill-defined phrase "implicitly nested scope".

Plus that, as noted, what passing an argument "to a scope" means is mysterious.

Language standard committees struggle for years with how to phrase
things so that no more than is intended appears to be promised.  It's
hard!  For example, if you were to show a workalike function and note
that the exact placement - and number - of `iter()` calls is not
guaranteed, someone else would point out that you need to explicitly
say that by "iter" you mean the builtin function of that name, not one
user code may have overridden it with in the current scope.   Then
someone else will note that it's tedious to say things like that
whenever they're needed, and more-general text will be added elsewhere
in the docs saying that the _rest_ of the docs always mean the
language-supplied versions of such-&-such explicitly named
functions/classes/modules/...

I'd say "nested function" anyway ;-)  And for another reason:  not
just someone from Mars is prone to misreading 'scope", but just about
anyone on Earth coming from another language.  The idea that the word
"scope" all by itself implies "and in general any name bound to within
the top-level code spanned by the scope is implicitly local to the
scope unless explicitly declared `global` or `nonlocal` in the scope"
may be unique to Python.


> For Chris's actual question, this is part of why I think adding
> "parentlocal" would actually make the scoping proposal easier to explain, as
> it means the name binding semantics aren't a uniquely magical property of
> binding expressions (however spelled), they're just a new form of target
> scope declaration that the compiler understands, and the binding expression
> form implies. Note: eas*ier*, not easy ;)

Adding an explanation of `parentlocal` to the docs could be a useful
pedagogical device, but I don't think I'd support adding that
statement to the _language_.  It's too weird, and seems to be at a
wrong level for plausible future language developments.

Let's step way back for a minute.  In many languages with
full-blown closures, first-class functions, and nested lexical scopes,
it's pretty common to define the meaning of various language
constructs in terms of calling derived lexically nested functions.  In
those languages, any "work variables" needed by the synthetic
functions are declared as being local to those functions, and _that's
the end of it_.  They're done.  All other names inside the expansions
mean exactly the same as what they mean in whatever chunks of
user-supplied code the construct interpolates into the synthesized
functions.  It doesn't matter one whit in which context(s) they
appear.

That's the only long-term sane way to go about defining constructs in
terms of calling synthesized functions interpolating user-supplied
pieces of code.

Now _if_ Python had been able to do that, the meaning of genexps and
listcomps would have been defined, from the start, in terms of
synthesized functions that declared all & only the for-target names
"local".  And, in fact, the change I'm suggesting wouldn't have
required changing the comprehension implementation _at all_ when
assignment expressions were added.  Instead the implementation would
need to change to _add_ assignment expression targets to the things
declared local if it was decided that those targets should be _local_
to the derived functions instead.

That's why this all seems so bloody obvious to me ;-)  It's how
virtually every other language in the business of defining constructs
in terms of nested synthesized functions works.

So if that's something we may ever do again - and possibly even if we
don't expect to ever do it again - I suggest a more generally useful
approach would be to add a new flavor of _function_ to Python.  Namely
one wherein the only locals are the formal arguments and those
explicitly declared local.  Whether or not a name is bound in the body
would be irrelevant.  To avoid a new keyword, `local` could be spelled
`not nonlocal` ;-)

Note that the only use for `parentlocal` s

Re: [Python-Dev] My fork lacks a 3.7 branch - can I create it somehow?

2018-05-22 Thread Tim Peters
[Nathaniel Smith ]
> ...
> As far as git is concerned, the main repo on github, your fork on
> github, and your local repo are 3 independent repositories, equally
> valid. The relationships between them are purely a matter of
> convention.

Thanks for that!  It instantly cleared up several mysteries for me.
I'm just starting to learn git & github, and am starkly reminded of an
old truth:  there is absolutely nothing "obvious" about source-control
systems, or workflows, before you already know them ;-)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

2018-06-24 Thread Tim Peters
[Guido]
> A quick follow-up: PEP 572 currently has two ideas: (a) introduce := for
inline
> assignment, (b) when := is used in a comprehension, set the scope for the
> target as if the assignment occurred outside any comprehensions. It seems
> we have more support for (a) than for (b) -- at least Nick and Greg seem
to
> be +0 or better for (a) but -1 for (b). IIRC (b) originated with Tim. But
his
> essay on the topic, included as Appendix A
> (
https://www.python.org/dev/peps/pep-0572/#appendix-a-tim-peters-s-findings)
> does not even mention comprehensions.

I was writing up my observations about simple changes to existing code.
Since there's nothing sanely akin to binding non-for-targets possible in
comprehensions now, comprehensions were out of scope for that effort (which
was limited to staring at existing code already doing bindings).


> However, he did post his motivation for (b) on python-ideas, IIRC a bit
> before PyCon; and the main text of the PEP gives a strong motivation
> (https://www.python.org/dev/peps/pep-0572/#scope-of-the-target).
Nevertheless,
> maybe we should compromise and drop (b)?

Two things to say about that.  First, the original example I gave would be
approximately as well addressed by allowing to declare intended scopes in
magically synthesized functions; like (say)

p = None # to establish the intended scope of `p`
while any(  # split across lines just for readability
n % p == 0 for p in small_primes):
n //= p

It didn't really require an inline assignment, just a way to override the
unwanted (in this case) "all `for` targets are local to the invisible
function" rigid consequence of the implementation du jour.

Second, if it's dropped, then the PEP needs more words to define what
happens in cases like the following, because different textual parts of a
comprehension execute in different scopes, and that can  become visible
when bindings can be embedded:

def f():
y = -1
ys = [y for _ in range(y := 5)]
print(y, ys)

Here `range(y := 5)` is executed in f's scope.  Presumably the `y` in `y
for` also refers to f's scope, despite that `y` textually _appears_ to be
assigned to in the body of the listcomp, and so would - for that reason -
expected to be local to the synthesized function, and so raise
`UnboundLocalError` when referenced.  It's incoherent without detailed
knowledge of the implementation.

def g():
y = -1
ys = [y for y in range(y := 5)]
print(y, ys)

And here the `y` in `y for y` is local to the synthesized function, and
presumably has nothing to do with the `y` in the `range()` call.  That's
incoherent in its own way.

Under the current PEP, all instances of `y` in `f` refer to the f-local
`y`, and the listcomp in `g` is a compile-time error.
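Expanded by hand (hypothetical `_listcomp` name, with the effect of `y := 5` written out), `f` behaves like this sketch:

```python
# Hand expansion of f(): only the outermost iterable expression - including
# `y := 5` - runs in f's scope; the loop body runs in a synthesized function
# where `y` is a closure reference back to f's `y`.
def f_expanded():
    y = -1
    y = 5                      # effect of evaluating `range(y := 5)` in f's scope
    def _listcomp(_outermost_iter):
        _result = []
        for _ in _outermost_iter:
            _result.append(y)  # free variable: f's y
        return _result
    ys = _listcomp(iter(range(5)))
    return y, ys

assert f_expanded() == (5, [5, 5, 5, 5, 5])
```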


Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

2018-06-24 Thread Tim Peters
[Tim]
>> First, the original example I gave would be approximately as well
>> addressed by allowing to declare intended scopes in magically synthesized
>> functions; like (say)
>>
>> p = None # to establish the intended scope of `p`
>> while any(  # split across lines just for readability
>> n % p == 0 for p in small_primes):
>> n //= p
>>
>> It didn't really require an inline assignment, just a way to override the
>> unwanted (in this case) "all `for` targets are local to the invisible
>> function" rigid consequence of the implementation du jour.
>>
>
[Guido]

> Hm, that's more special syntax.
>

Of course - I'm anticipating that the PEP will be changed to throw out
useful assignment expressions in comprehensions, but I still want a way to
"export" comprehension for-targets at times ;-)


> The nice bit about (b) as currently specified is that it adds no syntax --
> it adds a scope rule, but (as IIRC Steven has convincingly argued) few
> people care about those. Python's scope rules, when fully specified, are
> intricate to the point of being arcane (e.g. for class scopes) but all that
> has a purpose -- to make them so DWIM ("Do what I Mean") that in practice
> you almost never have to worry about them, *especially* when reading
> non-obfuscated code (and also when writing, except for a few well-known
> patterns).
>

You and Steven and i appear to be on the same page here - but it's in a
book nobody else seems to own :-(  To me it's just screamingly obvious that

total = 0
cumsums = [total := total + value for value in data]

"should do" what it obviously intends to do - and that the only thing
stopping that is a bass-ackwards focus on what most trivially falls out of
the current implementation.
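The intent can be spelled out today with an explicit nested function plus `nonlocal` - a sketch of the binding behavior the example is after:

```python
# An explicit nested function with `nonlocal` stands in for the binding
# semantics `total := total + value` "obviously intends".
def cumulative_sums(data):
    total = 0
    def step(value):
        nonlocal total
        total = total + value   # stands in for `total := total + value`
        return total
    return [step(value) for value in data], total

assert cumulative_sums([1, 2, 3, 4]) == ([1, 3, 6, 10], 10)
```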

...

>> def f():
>> y = -1
>> ys = [y for _ in range(y := 5)]
>> print(y, ys)
>>
>> Here `range(y := 5)` is executed in f's scope.  Presumably the `y` in `y
>> for` also refers to f's scope, despite that `y` textually _appears_ to be
>> assigned to in the body of the listcomp, and so would - for that reason -
>> expected to be local to the synthesized function, and so raise
>> `UnboundLocalError` when referenced.  It's incoherent without detailed
>> knowledge of the implementation.
>>
>
> That code should have the same meaning regardless of whether we accept (b)
> or not -- there is only one `y`, in f's scope. I don't mind if we have to
> add more words to the PEP's scope rules to make this explicit, though I
> doubt it -- the existing weirdness (in the comprehension spec) about the
> "outermost iterable" being evaluated in the surrounding scope specifies
> this. I wouldn't call it incoherent -- I think what I said about scope
> rules above applies here, it just does what you expect.
>

Remove "y = -1" and - voila! - we have the dreaded "parent local scoping"
Nick finds so baffling to explain (or so he claims).  That is, "has exactly
the same scope in the comprehension as in the parent block, and will create
a local in the latter if the name is otherwise unknown in the parent" comes
with assignment expressions, regardless of whether _all_ such targets
"leak" (the current PEP) or only targets in the expression defining the
iterable of the outermost `for` (the PEP without leaking assignment
expressions in comprehensions).

As to whether it "does what you expect", no, not really! In a world where
_all_ binding targets in a comprehension are claimed to be local to the
comprehension, I _expect_ that `y := 5` appearing inside the listcomp means
`y` is local to the listcomp.  "Oh - unless the binding appears in the
expression defining the iterable of the outermost `for`" comes from Mars.

Not that it really matters much, but (b) provides consistent semantics in
these cases.  No need to search Mars for weird exceptions ;-)

...

> A "neutral" argument about (b) is that despite the "horrified" reactions
> that Nick saw, in practice it's going to confuse very few people (again,
> due to my point about Python's scope rules). I'd wager that the people who
> might be most horrified about it would be people who feel strongly that the
> change to the comprehension scope rules in Python 3 is a big improvement,
> and who are familiar with the difference in implementation of
> comprehensions (though not generator expressions) in Python 2 vs. 3.
>

I also doubt it will generally confuse people in practice (to the contrary,
I expect they _will_ be confused if things like the cumulative sums example
blow up with UnboundLocalError).

But I still don't get the source of the "horror".  Assignment expression
semantics are wholly consistent with ordinary nested lexical scoping, with
or without (b).  The only difference is in the scopes picked for assignment
expression target names (except for those appearing in the expression
defining the iterable yadda yadda yadda).

Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

2018-06-27 Thread Tim Peters
[Nick Coghlan]
> > ... actually made those semantics available as an explicit
> > "parentlocal NAME" declaration ...:
> >
> > def _list_comp(_outermost_iter):
> > parentlocal item
> > _result = []
> > for x in _outermost_iter:
> > item = x
> > _result.append(x)
> > return _result
> >
> > _expr_result = _list_comp(items)
>

[Greg Ewing]
> I'm not sure that's possible. If I understand correctly,
> part of the definition of "parent local" is that "parent"
> refers to the nearest enclosing *non-comprehension* scope,
> to give the expected result for nested comprehensions.
> If that's so, then it's impossible to fully decouple its
> definition from comprehensions.

Nick's "parentlocal" does refer to the parent, but makes no distinction
between synthesized and user-written functions.  If the parent has a
matching parentlocal declaration for the same name then the original really
refers to the grandparent - and so on.  Ultimately, it resolves to the
closest enclosing scope in which the name is _not_ declared parentlocal.
In that scope, a "nonlocal" or "global" declaration settles it if one
appears, else the name is local to that scope.

So a nested comprehension would declare its assignment expression targets
as parentlocal in its synthesized function, and in all the containing
synthesized functions generated for containing comprehensions.
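A by-hand illustration of that chaining, with made-up names throughout, using `nonlocal` to stand in for "parentlocal" in each synthesized function:

```python
# Each synthesized function declares the assignment-expression target
# `nonlocal` (standing in for "parentlocal"), so the binding chains up
# past both comprehension scopes and lands in demo_nested(), the closest
# non-synthesized scope.
def demo_nested(rows):
    y = None                    # where the binding ultimately lands
    def _outer(it):
        nonlocal y              # "parentlocal y" in the outer comprehension
        result = []
        for row in it:
            def _inner(it2):
                nonlocal y      # "parentlocal y" again; chains to demo_nested
                r = []
                for x in it2:
                    y = x       # the effect of `y := x`
                    r.append(x)
                return r
            result.append(_inner(iter(row)))
        return result
    return _outer(iter(rows)), y

assert demo_nested([[1, 2], [3, 4]]) == ([[1, 2], [3, 4]], 4)
```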

This appears in some strained ;-) way "natural" only because there is no
explicit way to declare something "local" in Python.  In just about any
other language with closures and nested lexical scopes, comprehensions and
generator expressions would have been implemented via nested functions that
explicitly declared their "for" target names "local", and nothing else.
The only change needed then for PEP 572 (b) semantics would be to declare
assignment expression target names local (if their scope wasn't already
known) in the closest containing non-synthesized block.

None of which really matters.  The real question is which semantics are
desired.


Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)

2018-06-27 Thread Tim Peters
[Nick Coghlan]

> However, PEP 572 in its current form takes the position "parent local
> scoping is sufficiently useful to make it a required pre-requisite for
> adding assignment expressions, but not useful enough to expose as a
> new scope declaration primitive",
>

Of course the PEP doesn't take that position at all:  it doesn't even
contain the term "parent local scoping".  That's your term, which nobody
else uses unless they're replying to you ;-)

What the PEP does say:

"""
an assignment expression occurring in a list, set or dict comprehension or
in a generator expression (below collectively referred to as
"comprehensions") binds the target in the containing scope, honoring a
nonlocal or global declaration for the target in that scope, if one exists.
For the purpose of this rule the containing scope of a nested comprehension
is the scope that contains the outermost comprehension. A lambda counts as
a containing scope.
"""
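Those rules are concrete enough to exercise directly; this runs on Python 3.8+, where PEP 572 eventually landed (`demo` and `last` are arbitrary names):

```python
# The target of `last := x` binds in the scope containing the *outermost*
# comprehension - here, demo() - even though it appears in the nested one.
def demo():
    rows = [[1, 2], [3, 4]]
    doubled = [[(last := x) * 2 for x in row] for row in rows]
    return doubled, last       # `last` is a local of demo(), per the rule

assert demo() == ([[2, 4], [6, 8]], 4)
```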

It's a small collection of plainly stated rules for specifying the intended
semantics.  If you want to claim that this _is_ "useful enough to expose as
a new scope declaration primitive", it's really on you to present use cases
to justify that claim.  I'd present some for you, but I don't have any (I
don't care that "by hand" conversion of nested comprehensions to workalike
Python nested functions may require a bit of thought to establish the
intended scope of assignment expression target names - all of which is
easily doable without adding any new statements).

I don't _expect_ that other good use cases exist.  The gimmick's purpose is
to make code that visually _appears_ to belong to a block act as if
embedded assignments do occur in that block.  If there's an explicitly
nested function, that fundamental motivation no longer applies.


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-24 Thread Tim Peters
[ISAAC J SCHWABACHER ]
> ...
> I disagree with the view Tim had of time zones when he wrote that comment
> (and that code). It sounds like he views US/Eastern and US/Central as time
> zones (which they are), but thinks of the various America/Indiana zones as
> switching back and forth between them, rather than being time zones in their
> own right

You can think of them anyway you like.  The point of the code was to
provide a simple & efficient way to convert from UTC to local time in
all "time zones" in known actual use at the time; the point of the
comment was to explain the limitations of the code.  Although, as
Alexander noted, the stated assumptions are stronger than needed.

> I think the right perspective is that a time zone *is* the function that its
> `fromutc()` method implements,

Fine by me ;-)

> although of course we need additional information in order to actually
> compute (rather than merely mathematically define) its inverse. Daylight 
> Saving
> Time is a red herring,

Overstated.  DST is in fact the _only_ real complication in 99.99% of
time zones (perhaps even 99.9913% ;-) ).  As the docs say, if you have
some crazy-ass time zone in mind, fine, that's why fromutc() was
exposed (so your crazy-ass tzinfo class can override it).

> and assumptions 2 and 4

Nitpick:  4 is a consequence of 2, not an independent assumption.

> in that exposition are just wrong from this point of view.

As above, there is no particular POV in this code:  just a specific
fromutc() implementation, comments that explain its limitations, and
an invitation in the docs to override it if it's not enough for your
case.

> In the worst case, Asia/Riyadh's two years of solar time completely shatter
> these assumptions.

Sure.  But, honestly, who cares?  Riyadh Solar Time was so
off-the-wall that even the Saudis gave up on it 25 years ago (after a
miserable 3-year experiment with it).  "Practicality beats purity".

> [eliding a more-general view of what time zones "really" are]

I'm not eliding it because I disagree with it, but because time zones
are political constructions.  "The math" we make up may or may not be
good enough to deal with all future political abominations; for
example:

> ...
> This assumption would be violated if, for example, some jurisdiction
> decided to fall back two hours by falling back one hour and then
> immediately falling back a second hour.  I recommend the overthrow
> of any such jurisdiction and its (annexation by the Netherlands)[3].

That's not objectively any more bizarre than Riyadh Solar Time.
Although, if I've lived longer than you, I may be more wary about the
creative stupidity of political schemes ;-)


> ... (Lennart, I think this third assumption is the important part of your "no
> changes within 48 hours of each other" assumption,

The "48 hours" bit came from Alexander.  I'm personally unclear on
what Lennart's problems are.

> ...
> All of these computations can be accomplished by searches of ordered lists
> and applications of $fromlocal_i$.

Do you have real-world use cases in mind beyond supporting
long-abandoned Riyadh Solar time?

> ...
> With this perspective, arithmetic becomes "translate to UTC, operate, 
> translate
> back", which is as it should be.

There _was_ a POV in the datetime design about that:  no, that's not
how it should be.  Blame Guido ;-)  If I add, say, 24 hours to noon
today, I want to get noon tomorrow, and couldn't care less whether DST
started or stopped (or any other political adjustment was made) in
between.  For that reason, it was wholly intentional that datetime +
timedelta treats datetime as "naive".  If that's not what someone
wants, fine, but then they don't want Python's datetime arithmetic.
BTW, there's no implication that they're "wrong" for wanting something
different; what would be wrong is insisting that datetime's POV is
"wrong".  Both views are valid and useful, depending on the needs of
the application.  One had to be picked as the built-in behavior, and
"naive" won.
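A concrete illustration of that choice (the dates are arbitrary, picked to straddle a US DST transition):

```python
from datetime import datetime, timedelta

# Naive datetime arithmetic ignores political adjustments entirely:
# noon plus 24 hours is noon tomorrow, even though US clocks sprang
# forward on 2015-03-08.
noon = datetime(2015, 3, 7, 12, 0)
assert noon + timedelta(hours=24) == datetime(2015, 3, 8, 12, 0)
```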

> ...
> But IIUC what Lennart is complaining about

I don't, and I wish he would be more explicit about what "the
problem(s)" is(are).

> is the fact that the DST flag isn't part of and can't be embedded into a 
> local time,
> so it's impossible to fold the second parameter to $fromlocal$ into $t$.  
> Without
> that, a local time isn't rich enough to designate a single point in time and 
> the
> whole edifice breaks.

You can blame Guido for that too ;-) , but in this case I disagree(d)
with him:  Guido was overly (IMO) annoyed by that the only apparent
purpose for a struct tm's tm_isdst flag was to disambiguate local
times in a relative handful of cases.  His thought:  an entire bit
just for that?!  My thought:  get over it, it's one measly bit.

my-kingdom-for-bit-ingly y'rs  - tim

Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-24 Thread Tim Peters
[Tim]
> Sure.  But, honestly, who cares?  Riyadh Solar Time was so
> off-the-wall that even the Saudis gave up on it 25 years ago (after a
> miserable 3-year experiment with it).  "Practicality beats purity".

Heh.  It's even sillier than that - the Saudis never used "Riyadh
Solar Time", and it's been removed from release 2015e of the tz
database:

https://www.ietf.org/timezones/data/NEWS
Release 2015e - 2015-06-13 10:56:02 -0700
...
The files solar87, solar88, and solar89 are no longer distributed.
They were a negative experiment - that is, a demonstration that
tz data can represent solar time only with some difficulty and error.
Their presence in the distribution caused confusion, as Riyadh
civil time was generally not solar time in those years.

Looking back, Paul Eggert explained more in 2013, but it took this
long for the patch to land:

http://comments.gmane.org/gmane.comp.time.tz/7717
> did Saudi Arabia really use this as clock time?

Not as far as I know, for civil time.  There was some use
for religious purposes but it didn't use the approximation
in those files.

These files probably cause more confusion than they're worth,
so I'll propose a couple of patches to remove them, in two followup
emails.  I haven't pushed these patches to the experimental
github version.

The position of the sun is vital to establishing prayer times in
Islam, but that's got little to do with civil time in Islamic
countries.  And Olson didn't take his "Riyadh Solar Time" rules from
the Saudis, he made up the times himself:  "Times were computed using
formulas in the U.S. Naval Observatory's Almanac for Computers
1987[89]".  The formulas only produced approximations, and then
rounded to 5-second boundaries because the tz data format didn't have
enough bits.

So, as a motivating example, it's hard to get less compelling:  Riyadh
Solar is a wholly artificial "time zone" made up by a time zone wonk
to demonstrate some limitations of the tz database he maintained.
Although I expect he could have done so just as effectively by writing
a brief note about it ;-)


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-24 Thread Tim Peters
[Lennart Regebro ]
> And I would want to remind everyone again that this is not a question
> of the problem being impossible. It's just really complex to get right
> in all cases, and that always having the UTC timestamp around gets rid
> of most of that complexity.

Could you please be explicit about what "the problem" is?  Everyone
here is guessing at what you think "the problem" is.


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-24 Thread Tim Peters
[Tim]
>> The formulas only produced approximations, and then
>> rounded to 5-second boundaries because the tz data format didn't have
>> enough bits.

[ISAAC J SCHWABACHER ]
> Little known fact: if you have a sub-minute-resolution UTC offset when a
> leap second hits, it rips open a hole in the space-time continuum and
> you find yourself in New Netherlands.

Tell me about it!  Last time that happened I had to grow stinking
tulips for 3 years to get enough money to sail back home.  I'll never
use a sub-minute-resolution UTC offset again ;-)


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-25 Thread Tim Peters
[Lennart Regebro ]
>>> And I would want to remind everyone again that this is not a question
>>> of the problem being impossible. It's just really complex to get right
>>> in all cases, and that always having the UTC timestamp around gets rid
>>> of most of that complexity.

[Tim]
>> Could you please be explicit about what "the problem" is?  Everyone
>> here is guessing at what you think "the problem" is.

[Lennart]
> The problem is that it is exceedingly complicated to get all the
> calculations back and forth between local time and UTC to be correct
> at all times and for all cases. It really doesn't get more specific
> than that. I don't remember which exact problem it was that made me
> decide that this was not the correct solution and that we should use
> UTC internally, but I don't think that matters, because I'm also sure
> that it was not the last case, as I was far from near the end in
> adding testcases.

I believe everyone here is saying it "shouldn't be" exceedingly
complicated, or even particularly hard, if you add the is_dst flags
the PEP says it would add.

But is the PEP complete?  Under the "Postponement" section, it says:

The implementation has turned out to be exceedingly complex,
due to having to convert back and forth between the local time
and UTC during arithmetic and adjusting the DST for each
arithmetic step, with ambiguous times being particularly hard
to get right.

However, the _body_ of the PEP said nothing whatsoever about altering
arithmetic.  The body of the PEP sounds like it's mainly just
proposing to fold the pytz package into the core.  Perhaps doing
_just_ that much would get this project unstuck?  Hope springs eternal
:-)

> Once again I'm sure it's not impossible to somehow come up with an
> implementation and an API that can do this based on local time, but
> once again I am of the opinion that it is the wrong thing to do. We
> should switch to using UTC internally, because that will make
> everything so much simpler.

Like what?  I'm still looking for a concrete example of what "the
problem" is (or even "a" problem).

> I am in no way against other people implementing this PEP, but I think
> you will end up with very complex code that will be hard to maintain.

Somebody first needs to define what "the problem" is ;-)

> There really is a reason every other date time implementation I know
> of uses UTC internally,

Yes, but the fundamental reason datetime does not is that Guido
consciously and deliberately decided that "naive datetime" would be
most useful most often for most users.  That's why "naive" objects are
the default.  And even for "aware" objects, arithmetic staying within
a single time zone was deliberately specified to be "naive" too.  My
guess is that all other datetime implementations you know of have no
concept of "naive" datetimes, let alone make naive datetimes primary.
Small wonder, if so, that they're all different in this way.

That's a design decision not everyone likes, and certainly isn't
suitable for all purposes, but the debate over that ended a dozen
years ago when the decision was made.  If your vision of PEP 431
_changes_ that design decision (which it sure _sounds_ like it wants
to based on what you're typing here, but which PEP 431 itself does not
appear to say - impossible to tell which from here without any
specific example(s)), that may account for all sorts of complications
that aren't apparent to me.

> and there really is a reason why everyone always recommends storing date
> times in UTC with the time zone or offset separately.

Well, that's the second thing they recommend - and they can already do
that.  The first thing to recommend is to use naive objects in any
application where that's possible, so that you don't have to bother
with _any_ time zone esoterica, surprises, complications or overheads.
After all, it's 7:54 PM as I type this, and that's perfectly clear to
me ;-)


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-25 Thread Tim Peters
[ISAAC J SCHWABACHER ]
>>> ...
>>> I think the right perspective is that a time zone *is* the function that its
>>> `fromutc()` method implements,

[Tim]
>> Fine by me ;-)

[Isaac]
> My issue is that you're computing `fromutc()`, which is a function, in
> terms of `dst()` and `utcoffset()`, which aren't.

I know.  That's not "an issue" that will gain traction, though ;-)

> I think this is backwards; `dst()` and `utcoffset()` should be computed
> from `fromutc()` plus some additional information that has to be present
> anyway in order to implement `fromutc()`.

Memory lane:  that additional information doesn't exist now.  I think
it "should have", but at the time, as I recall there was fatal
opposition to storing an `isdst` flag because it would consume an
extra byte in the pickle format.  That was enough to kill it:
datetime development was paid for by a company very concerned about
pickle sizes ;-)

> With the extra bit, `dst()` and `utcoffset()` become partial functions, which
> makes it *possible* to get the right answer in all cases, but it's still 
> simpler
> to start with the total function and work from there.

Well, maybe simpler for you, but I think not in general.  At the time,
all aspects of datetime's development were vigorously debated, but
mostly on Zope Corp (the company paying for it) wikis and mailing
lists.  While some people didn't care about time zones at all, most
did.  Of the latter:

- All were keenly aware of the need to incorporate UTC offsets.
- All were keenly aware of the need to accommodate "daylight time" schemes.
- None gave a fig about anything else.

Very late in the game, examples were given of localities that had in
fact changed their UTC offsets from time to time, but as curiosities
rather than as "issues".  That's when I created fromutc() - it was a
last-second addition.  I cared enough to make it _possible_ to
accommodate such cases, but there was no interest (or time) to cater
to them directly.  Instead fromutc() was written to use only the
already-existing utcoffset() and dst().  Everyone already knew how to
use the latter:  they directly corresponded to the two things everyone
cared about keenly from the start.

That doesn't _preclude_ anyone from writing a more-general fromutc(),
and I encourage, for example, you to do so ;-)  I agree it's the most
fundamental thing from an abstract mathematical view, but "UTC offset"
and "DST offset" fit most peoples' brains a hell of a lot better than
"collection of piecewise continuous monotonically increasing functions
whose images don't overlap too much" ;-)

>>>  Daylight Saving Time is a red herring,

>> Overstated 

> I stand by what I meant by this, even if I did a bad job of expressing
> the point. Assuming that all time zone discontinuities are due to DST
> changes breaks many time zones (really almost all of the Olson time
> zones, though only for a vanishingly small fraction of datetimes),

It's use cases that are missing here:  who needs to convert historic
times to/from UTC, and where the "problem times" are generally
arranged by politicians to occur when most people are sleeping?
That's why nobody really cared about offset-changing zones at the
start.  Yes, such zones exist, but times recorded in such zones are in
yesterday's databases we don't care about anymore except maybe to
display the values.

> but that's not the point I was making. The point is that it doesn't buy us
> anything.

Au contraire:  as above, it bought datetime a model people thought
they understood at once, since almost everyone has wrestled with UTC
offsets and daylight-time switches in ordinary daily life.  Implement
utcoffset() and dst(), and you're done.  Even if you're really not,
you _think_ you are, so you slumber peacefully then ;-)

> Though this is probably obscured by all the markup, the more general
> algorithm I gave is also simpler than the one in the comment in datetime.py,
> and the reason for that is that it solves an easier problem, but one that
> serves our practical purposes just as well.

It's heavily obscured by the heavy markup.  Write some Python code
instead?  I expect few people will try to untangle the meaning
otherwise.

As for whether it's simpler - eh, don't know.  Here's the actual code,
stripped of error-checking:

def fromutc(self, dt):
    dtoff = dt.utcoffset()
    dtdst = dt.dst()
    delta = dtoff - dtdst
    if delta:
        dt += delta
        dtdst = dt.dst()
    return dt + dtdst

Will your code run faster?  Have fewer conditionals?  Fewer lines?
Shorter lines?  Less nesting?  Fewer operations?  Important to me,
though, is that your code should be far more self-evidently _correct_,
provided the reader understands the math underlying it (which will
require - as this code does - referring to a relatively massive wall
of text to explain it).
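To make the algorithm concrete, here's a toy DST-observing tzinfo run
through the fromutc() above.  Everything here is illustrative:  the
"Eastern" class uses hard-coded, simplified rules (roughly the 2004 US
schedule, ignoring the 2 a.m. transition minutiae), not anything from
the stdlib or pytz:

```python
# Illustration only: a toy US-Eastern-style tzinfo with simplified,
# hard-coded DST rules, used to exercise the default fromutc() above.
from datetime import datetime, timedelta, tzinfo

class Eastern(tzinfo):
    def utcoffset(self, dt):
        return timedelta(hours=-5) + self.dst(dt)

    def dst(self, dt):
        # Pretend DST runs from April 4 through October 30 -- close
        # enough to the 2004 rules for a demo.
        d = dt.replace(tzinfo=None)
        if datetime(d.year, 4, 4) <= d < datetime(d.year, 10, 31):
            return timedelta(hours=1)
        return timedelta(0)

    def tzname(self, dt):
        return "EDT" if self.dst(dt) else "EST"

# fromutc() is handed a datetime whose tzinfo is already self, but
# whose fields hold a UTC time -- here noon UTC on 1 July.
eastern = Eastern()
local = eastern.fromutc(datetime(2004, 7, 1, 12, tzinfo=eastern))
print(local)  # 2004-07-01 08:00:00-04:00  (noon UTC is 8 a.m. EDT)
```

The first dst() call sees the raw UTC fields, so the "delta" correction
and the second dst() call are what repair the answer near transitions.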

> ...
> I went too far in inferring your viewpoint from your code.  I don't find fault
> with the explanation on its own terms.

Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-25 Thread Tim Peters
[Tim]
>> However, the _body_ of the PEP said nothing whatsoever about altering
>> arithmetic.  The body of the PEP sounds like it's mainly just
>> proposing to fold the pytz package into the core.  Perhaps doing
>> _just_ that much would get this project unstuck?  Hope springs eternal :-)

[Lennart Regebro ]
> The pytz package has an API and a usage that is different from the
> datetime() module. One of the things you need to do is that after each
> time you do arithmetic, you have to normalize the result. This is done
> because the original API design did not realize the difficulties and
> complexities of timezone handling and therefore left out things like
> ambiguous times.

Oh, they were realized - indeed, the pytz docs point to Python's
tzinfo docs to explain the ambiguities, and the latter docs existed
before ;-) day 1.

The Python docs also are quite clear about that all arithmetic within
a single timezone is "naive".  That was intentional.  The _intended_
way to do "aware" arithmetic was always to convert to UTC, do the
arithmetic, then convert back.

You never _have_ to normalize() in pytz.  But it's needed if you
_don't_ follow pytz's explicit

The preferred way of dealing with times is to always work in UTC,
converting to localtime only when generating output to be read by
humans

advice, and want to do "aware" arithmetic directly in a non-UTC time
zone.  Python's datetime never intended to support that directly.
Quite the contrary.  I know people who feel otherwise tend to think of
that as a lazy compromise (or some such), but naive arithmetic was
intended to be "a feature".  Fight the design every step of the way,
and, yup, you get problems every step of the way.
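A sketch of that intended convert/compute/convert-back dance.  It uses
the stdlib zoneinfo module, which arrived in Python 3.9 and postdates
this thread; pytz's normalize() served the same end:

```python
# The intended "aware" arithmetic: convert to UTC, do the arithmetic
# there, convert back.  zoneinfo (Python 3.9+) postdates this thread.
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")
start = datetime(2021, 3, 13, 12, tzinfo=tz)           # noon, last day of EST
in_utc = start.astimezone(timezone.utc)                # hop to UTC
later = (in_utc + timedelta(hours=24)).astimezone(tz)  # 24 real hours, back to local
print(later)  # 24 elapsed hours later the wall clock reads 1 p.m.,
              # because spring-forward skipped an hour overnight
```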

> The PEP attemps to improved the datetime modules API so that it can
> handle the ambiguous times.

No problem with that.  I always thought the lack of storing
is_dst-like info was datetime's biggest wart.

> It also says that the implementation will be based on pytz, because it
> was my assumption that this would be easy, since pytz already handles
> ambiguous times. During my attempt of implementing it I realized it
> wasn't easy at all, and it wasn't as easy as folding pytz into the core.

Is it the case that pytz also "fails" in the cases your attempts "fail"?

In any case, if you're trying to change how "aware" datetime
arithmetic works, that's a major and backward-incompatible change.
Does Guido realize it?  As before, it's not at all clear from the PEP.

> Yes, the PEP gives that impression, because that was the assumption
> when I wrote the draft. Just folding pytz into the core without
> modifying the API defeats the whole purpose of the PEP, since
> installing pytz is a trivial task.

"Batteries included" has some attractions all on its own.  On top of
that, adding is_dst-like flags to appropriate methods may have major
attractions.  Changing the semantics of datetime arithmetic has major
attractions to some people, but also major drawbacks - regardless,
since changing it turns Guido's original design on its head, he really
needs to Pronounce on that part.

>> Like what?  I'm still looking for a concrete example of what "the
>> problem" is (or even "a" problem).

> A problem is that you have a datetime, and add a timedelata to it, and
> it should then result in a datetime that is actually that timedelta
> later. And if you subtract the same timedelta from the result, it
> should return a datetime that is equal to the original datetime.
>
> This sounds ridiculously simple

Ah, but it already happens that way - because the builtin datetime
arithmetic is "naive".  The docs have always promised this:

"""
datetime2 = datetime1 + timedelta (1)
datetime2 = datetime1 - timedelta (2)

1) datetime2 is a duration of timedelta removed from datetime1, moving
forward in time if timedelta.days > 0, or backward if timedelta.days <
0. The result has the same tzinfo attribute as the input datetime, and
datetime2 - datetime1 == timedelta after. OverflowError is raised if
datetime2.year would be smaller than MINYEAR or larger than MAXYEAR.
Note that no time zone adjustments are done even if the input is an
aware object.

2) Computes the datetime2 such that datetime2 + timedelta ==
datetime1. As for addition, the result has the same tzinfo attribute
as the input datetime, and no time zone adjustments are done even if
the input is aware. This isn’t quite equivalent to datetime1 +
(-timedelta), because -timedelta in isolation can overflow in cases
where datetime1 - timedelta does not.
"""

>, and is ridiculously difficult to make happen in all cases that we want to
> support (Riyahd time zone and leap seconds not included).
> That IS the specific, concrete problem, and if you don't believe me, there
> is nothing I can do to convince you.

I apologize if I've come off as unduly critical - I truly have been
_only_ trying to find out what "the problem" is.  That helps!  Thank
you.  Note that I've had nothing to do with datetime (except

Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-26 Thread Tim Peters
[Paul Moore ]
> I think the current naive semantics are useful and should not be
> discarded lightly. At an absolute minimum, there should be a clear,
> documented way to get the current semantics under any changed
> implementation.

Realistically, default arithmetic behavior can't change in Python 3
(let alone Python 2).  Pushing for a different design is fine, but
that can't be sold on the grounds that current behavior is "a bug" -
it's working as designed, as intended, and as documented, and hasn't
materially changed in the dozen-or-so years since it was introduced.
It's not even that the proposed alternative arithmetic is "better",
either:  while it's certainly more suitable for some applications,
it's certainly worse for others.  Making an incompatible change would
be (& should be) a hard sell even if there were a much stronger case
for it than there is here.

But that's just arithmetic.  Some way to disambiguate local times, and
support for most zoneinfo time zones, are different issues.


> As an example, consider an alarm clock. I want it to go off at 7am
> each morning. I'd feel completely justified in writing tomorrows_alarm
> = todays_alarm + timedelta(days=1).
>
> If the time changes to DST overnight, I still want the alarm to go off
> at 7am. Even though +1 day is in this case actually + 25 (or is it
> 23?) hours. That's the current semantics.

There was a long list of use cases coming to the same conclusion.  The
current arithmetic allows uniform patterns in local time to be coded
in uniform, straightforward ways.  Indeed, in "the obvious" ways.  The
alternative behavior favors uniform patterns in UTC, but who cares?
;-)  Few local clocks show UTC.  Trying to code uniform local-time
behaviors using "aware arithmetic" (which is uniform in UTC. but may
be "lumpy" in local time) can be a nightmare.

The canonical counterexample is a nuclear reactor that needs to be
vented every 24 hours.  To which the canonical rejoinder is that the
programmer in charge of that system is criminally incompetent if
they're using _any_ notion of time other than UTC ;-)

> To be honest, I would imagine, from experience with programmers
> writing naive algorithms, that the current semantics is a lot less
> prone to error when used by such people. People forget about timezones
> until they are bitten by them, and if they are using the convert to
> UTC->calculate->convert back model, their code ends up with
> off-by-1-hour bugs. Certainly such mistakes can be fixed, and the
> people who make them educated, but I like the fact that Python's
> typical behaviour is to do what a non-expert would expect. By all
> means have the more sophisticated approach available, but if it's the
> default then naive users have to either (1) learn the subtleties of
> timezones, or (2) learn how to code naive datetime behaviour in Python
> before they can write their code. If the current behaviour remains the
> default, then *when* the naive user learns about the subtleties of
> timezones, they can switch to the TZ-aware datetime - but that's a
> single learning step, and it can be taken when the user is ready.

There is a design flaw here, IMO:  when they switch to a TZ-aware
datetime, they _still_ get "naive" arithmetic within that time zone.
It's at best peculiar that such a datetime is _called_ "aware" yet
still ignores the time zone rules when doing arithmetic.  I would have
preferred a sharper distinction, like "completely naive" (tzinfo
absent) versus "completely aware" (tzinfo present).  But, again, it's
working as designed, intended and documented.

One possibility to get "the other" behavior in a backward-compatible
way:  recognize a new magic attribute on a tzinfo instance, say,
__aware_arithmetic__.  If it's present, arithmetic on a datetime with
such a tzinfo member "acts as if" arithmetic were done by converting
to UTC first, doing the arithmetic, then converting back.  Otherwise
(magic new attribute not present) arithmetic remains naive.  Bonus:
then you could stare at datetime code and have no idea which kind of
arithmetic is being used ;-)
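A sketch of how that magic attribute might work.  Every name here
(__aware_arithmetic__, AwareDateTime, Fixed) is hypothetical - nothing
like it exists in the stdlib - it only shows the opt-in dispatch:

```python
# Hypothetical sketch of the "__aware_arithmetic__" idea; none of
# these names exist in the stdlib.
from datetime import datetime, timedelta, timezone, tzinfo

class AwareDateTime(datetime):
    def __add__(self, delta):
        if getattr(self.tzinfo, "__aware_arithmetic__", False):
            # "As if" done in UTC: convert, add, convert back.
            return (self.astimezone(timezone.utc) + delta).astimezone(self.tzinfo)
        return super().__add__(delta)  # default: naive arithmetic

class Fixed(tzinfo):
    """Fixed-offset zone that opts in to UTC-based arithmetic."""
    __aware_arithmetic__ = True
    def __init__(self, hours):
        self._off = timedelta(hours=hours)
    def utcoffset(self, dt):
        return self._off
    def dst(self, dt):
        return timedelta(0)

fx = Fixed(-5)
d = AwareDateTime(2021, 1, 1, 12, tzinfo=fx) + timedelta(hours=24)
# With a fixed offset both kinds of arithmetic agree; with a
# DST-observing zone they would diverge near transitions.
assert d == datetime(2021, 1, 2, 12, tzinfo=fx)
```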

> PS I don't think the above is particularly original - IIRC, it's
> basically Guido's argument for naive datetimes from when they were
> introduced. I think his example was checking his watch while on a
> transatlantic plane flight, but the principle is the same.

Yup, your account is fair (according to me ;-) ).  Here's Guido's
first message on the topic:

https://mail.python.org/pipermail/python-dev/2002-March/020648.html
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-26 Thread Tim Peters
[Tim]
>> The Python docs also are quite clear about that all arithmetic within
>> a single timezone is "naive".  That was intentional.  The _intended_
>> way to do "aware" arithmetic was always to convert to UTC, do the
>> arithmetic, then convert back.

[Lennart]
> We can't explicitly implement incorrect timezone aware arithmetic and
> then expect people to not use it.

Python didn't implement timezone-aware arithmetic at all within a
single time zone.  Read what I wrote just above.  It implements naive
arithmetic within a single time zone.

> We can make the arithmetic correct,

The naive arithmetic within a timezone is already correct, by its own
internal criteria.  It's also useful (see the original discussions, or
Paul Moore's recent brief account).  That it's not the arithmetic you
want doesn't make it "incorrect", it makes it different from what you
want.  That's fine - you're allowed to want anything ;-)  But it's a
dozen years too late to change that decision.  Maybe for Python 4.

> and we can raise an error when doing tz-aware arithmetic in a
> non-fixed timezone.

Sorry, I don't know what that means.  Under any plausible
interpretation, I don't see any need to raise an exception.

> But having an implementation we know is incorrect

You really have to get over insisting it's incorrect.  It's
functioning exactly the way it was intended to function.  It's
_different_ from what you favor.  Note that I'm not calling what you
favor "incorrect".  It's different.  Both kinds of arithmetic are
useful for different purposes, although I still agree with Guido's
original belief that the current arithmetic is most useful most often
for most programmers.

> and telling people "don't do that" doesn't seem like a good solution
> here.

We don't tell people "don't do that".  It's perfectly usable exactly
as-is for many applications.  Not all.  For those applications needing
the other kind of arithmetic, the convert-to/from-UTC dance was the
intended solution.

> Why do we even have timezone aware datetimes if we don't intend them
> for usage?

They are intended for usage.  But a single way of using them is not
suitable for all possible applications.

>> ...
>> Python's datetime never intended to support that directly.

> I think it should.

Ya, I picked that up ;-)  I don't, but it's too late to break backward
compatibility regardless.

> It's expected that it supports it,

By some people, yes.  Not by all.

> and there is no real reason not to support it.

Backward compatibility is a gigantic reason to continue with the
status quo.  See Paul Moore's post for a start on why naive arithmetic
was picked to begin with.

> The timezone handling becomes complicated if you base yourself on
> localtime, and simple if you base yourself on UTC.

That's an implementation detail unrelated (in principle) to how
arithmetic works.  Although as a practical matter it cuts both ways:
naive local-time arithmetic is complicated if the internal time is
stored in UTC, but simple if stored in local time.

> As you agree, we recommend to people to use UTC at all times,

I recommend people don't use tzinfo at all if they can avoid it.
Beyond that, there are many attractions to using UTC, and to
explicitly use UTC.  Not all applications need to care, though.

> and only use timezones for input and output. Well, what I'm now
> proposing is to take that recommendation to heart, and change
> datetime's implementation so it does exactly that.

Suppose I'm correct in my belief that there's scant chance of getting
approval for changing the default datetime arithmetic in Python 3 (or
Python 2).  Would you still be keen to replace the internals with UTC
format?  Note that there are many consequences to that implementation
detail.  For example, it was an explicit requirement of the datetime
design that the month, day, hour, minute and second components be very
cheap to extract.  If you have to do conversion every time one is
accessed, it's much slower; if you cache the "local time" components
separately, the memory burden increases.  Etc.

> I saw the previous mention of "pure" vs "practical", and that is often
> a concern. Here it clearly is not. This is a choice between impure,
> complicated and impractical, and pure, simple and practical.

There is nothing in the datetime world simpler than naive arithmetic
;-)  "Practical" is relevant to a specific application's specific
needs, and neither kind of arithmetic is "practical" for all
applications.  Guido believed naive arithmetic is most practical
overall.  But even believing that too, datetime certainly "should be"
beefed up to solve the _other_ problems:  like resolving ambiguous
times, and supporting the full range of zoneinfo possibilities

>> Is it the case that pytz also "fails" in the cases your attempts "fail"?

> No, that is not the case. And if you wonder why I just don't do it
> like pytz does it, it's because that leads to infinite recursion, much
> as discussions on this mailing list

Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-27 Thread Tim Peters
[Paul Moore ]
>> 
>> As an example, consider an alarm clock. I want it to go off at 7am
>> each morning. I'd feel completely justified in writing
>> tomorrows_alarm = todays_alarm + timedelta(days=1).

[Lennart Regebro ]
> That's a calendar operation made with a timedelta.

It's an instance of single-timezone datetime arithmetic, of the
datetime + timedelta form.  Your examples have been of the same form.
Note that after Paul's

 tomorrows_alarm = todays_alarm + timedelta(days=1)

it's guaranteed that

assert tomorrows_alarm - todays_alarm == timedelta(days=1)

will succeed too.
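Spelled out with a concrete zone - zoneinfo (Python 3.9+) postdates
this thread and is used purely for illustration:

```python
# Paul's alarm-clock pattern across the US spring-forward, where the
# "day" in question is only 23 real hours long.
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo

tz = ZoneInfo("America/New_York")
todays_alarm = datetime(2021, 3, 13, 7, tzinfo=tz)  # 7 a.m., day before the jump
tomorrows_alarm = todays_alarm + timedelta(days=1)

assert tomorrows_alarm.hour == 7                    # still 7 a.m. on the wall
assert tomorrows_alarm - todays_alarm == timedelta(days=1)
```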

> The "days" attribute here is indeed confusing as it doesn't mean 1 day,
> it means 24 hours.

Which, in naive arithmetic, are exactly the same thing.  That's
essentially why naive arithmetic is the default:  it doesn't insist on
telling people that everything they know is wrong ;-)  There's nothing
confusing about Paul's example _provided that_ single-timezone
arithmetic is naive.  It works exactly as he intends every time, and
obviously so.

Seriously, try this exercise:  how would you code Paul's example if
"your kind" of arithmetic were in use instead?  For a start, you have
no idea in advance how many hours you may need to add to get to "the
same local time tomorrow".  24 won't always work.  Indeed, no _whole_
number of hours may work (according to one source I found, Australia's
Lord Howe Island uses a 30-minute DST adjustment).  So maybe you don't
want to do it by addition.  What then?  Pick apart the year, month and
day components, then simulate "naive arithmetic" by hand?

The point is that there's no _obvious_ way to do it then.  I'd
personally strip off the tzinfo member, leaving a wholly naive
datetime where arithmetic "works correctly" ;-) , add the day, then
attach the original tzinfo member again.

But for a dozen years it's sufficed to do what Paul did.


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-27 Thread Tim Peters
[Ronald Oussoren ]
> IMHO “+ 1 days” and “+ 24 hours” are two different things.
> Date arithmetic is full of messy things like that.

But it's a fact that they _are_ the same in naive time, which Python's
datetime single-timezone arithmetic implements:

- A minute is exactly 60 seconds.
- An hour is exactly 60 minutes.
- A day is exactly 24 hours.
- A week is exactly 7 days.

No context is necessary:  those are always true in naive time, and
that lack of mess is "a feature" to those who accept it for what it
is.
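timedelta encodes exactly those identities - its constructor
normalizes every argument through them, with no calendar or time zone
context involved:

```python
# The four identities above, as timedelta sees them.
from datetime import timedelta

assert timedelta(minutes=1) == timedelta(seconds=60)
assert timedelta(hours=1) == timedelta(minutes=60)
assert timedelta(days=1) == timedelta(hours=24)
assert timedelta(weeks=1) == timedelta(days=7)
```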

> “+ 1 month” is another example of that (which the datetime
> module punts completely and can be a source of endless
> bikeshidding).

Note that the only units timedelta accepts have clear (utterly
inarguable) meanings in naive time.  That's intentional too.  For
example, "a month" and "a year" have no clear meanings (as durations)
in naive time, so timedelta doesn't even pretend to support them.
Despite all appearance to the contrary in this thread, naive time is
bikeshed-free:  it's easy for someone to know all there is to know
about it by the time they're 12 ;-)

datetime + timedelta(days=1)

is equivalent to

datetime + timedelta(hours=24)

is equivalent to

datetime + timedelta(minutes=60*24)

is equivalent to

datetime + timedelta(seconds=60*60*24)

is equivalent to

datetime + timedelta(microseconds=1000000*60*60*24)
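All five spellings in the chain above name the same timedelta, hence
the same resulting datetime under naive arithmetic:

```python
# The equivalence chain above, checked directly: every spelling of
# "one day" collapses to the same result.
from datetime import datetime, timedelta

base = datetime(2015, 7, 27)
results = {
    base + timedelta(days=1),
    base + timedelta(hours=24),
    base + timedelta(minutes=60 * 24),
    base + timedelta(seconds=60 * 60 * 24),
    base + timedelta(microseconds=1000000 * 60 * 60 * 24),
}
assert results == {datetime(2015, 7, 28)}
```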

Naive time is easy to understand, reason about, and work with.  When
it comes to the real world, political adjustments to and within time
zones can make the results dodgy, typically in the two DST-transition
hours per year when most people living in a given time zone are
sleeping.  How much complexity do you want to endure in case they wake
up? ;-)  Guido's answer was "none in arithmetic - push all the
complexity into conversions - then most uses can blissfully ignore the
complexities".

And note that because DST transitions "cancel out" over the span of a
year, the benefits and the few dodgy cases don't really change
regardless of whether you add one week or a hundred thousand weeks
(although there's no way to predict what governments will decide the
local clock "should say" a hundred thousand weeks from now - it's only
predictable in naive time).


Re: [Python-Dev] Status on PEP-431 Timezones

2015-07-27 Thread Tim Peters


>>> The "days" attribute here is indeed confusing as it doesn't mean 1 day,
>>> it means 24 hours.

>> Which, in naive arithmetic, are exactly the same thing.

[Terry Reedy]
> I think using the word 'naive' is both inaccurate and a mistake.  The issue
> is civil or legal time versus STEM time, where the latter includes
> applications like baking cakes.

Sorry, never heard of "STEM time" before - & a quick Google search didn't help.

> It could also be called calendar time versus elapsed time.  (Financial/legal
> arithmetic versus STEM arithmetic is a somewhat similar contrast.)

And I am, alas, equally unclear on what any of those others mean
(exactly) to you.

> The idea that an hour can be sliced out of a somewhat random March day and
> inserting it into a somewhat random October day is rather sophisticated.  It
> came from the minds of government bureaucrats.  It might be smart, dumb, or
> just a cunning way for civil authorities to show who is in charge by making
> us all jump.  But not 'naive'.

I agree.  Python's "naive time" single-timezone arithmetic
intentionally ignores all that:  it ignores leap seconds, it ignores
DST transition points, it ignores governments deciding to change the
base UTC offset within a pre-existing time zone, ...  It's time so
naive that it thinks 24 hours is the same thing as a day ;-)

> 'Naive' means simple, primitive, or deficient in informed judgement. It is
> easy to take it as connoting 'wrong'.

While some people in this thread seem convinced Python's naive time
_is_ "wrong", it's not because it's called "naive".  In any case,
Guido decided to call it "naive" over 13 years ago, starting here, and
that terminology has been in use ever since:

https://mail.python.org/pipermail/python-dev/2002-March/020648.html

> Tim, you have been arguing that civil/legal time arithmetic is not naive.

Yes.  But that's not "an argument", it's a plain fact that Python's
"naive time" (note that "naive" here is technical term, used widely in
the datetime docs) is not civil/legal time (assuming I understand what
you mean by that phrase).

> Calling civil time naive undercuts this claim.

I don't see that I ever said civil time is naive.  Adding a day is
_not_ always the same as adding 24 hours in (at least Lennart's
beliefs about) civil time.  They _are_ always the same in Python's
("naive") datetime arithmetic.  And the latter is all I said in the
quote at the top of this msg.

What am I missing?  It's always something ;-)

