Re: [Python-Dev] Adding numbering to PEP 20, the Zen of Python
[Ben Hoyt]
> I was emailing someone today about implementing something (for PEP
> 471, as it happens) and wanted to link to the Zen of Python [1] and
> note a particular clause (in this case "If the implementation is hard
> to explain, it's a bad idea."). However, there are no clause numbers,
> so you can't refer to specific phrases.
>
> I know it's a short enough document that it probably doesn't matter.
> And maybe numbering them would make it less Zen. Would be handy in
> code reviews and the like, for example: "Not very Pythonic. See PEP 20
> point 5." Is it just my pedantic self, or have others wanted to do
> this too?
>
> [1] http://legacy.python.org/dev/peps/pep-0020/

See Revelation 22:18. Adding numbers is how heresy begins ;-)

http://biblehub.com/revelation/22-18.htm
Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.
[nha pham]
> Statement_1: With an array of size N or less than N, we need at most
> log2(N) comparisons to find a value (or a position, in case the search
> misses), using the binary search algorithm.
>
> proof: This statement is trivial, and I believe someone out there has
> already proved it.

Sorry for the quick message here. It's just a simple point where it will pay not to get off on a wrong foot ;-)

Correct: for an array of size N, binary search can require as many as ceiling(log2(N+1)) comparisons. That's because there are N+1 possible results for an array of size N. For example, for an array of size 3, [A, B, C], "the answer" may be "before A", "between A and B", "between B and C", or "after C". 3 elements, 3+1 = 4 possible results. log2(3) comparisons are not enough to distinguish among 4 results.

Make it trivial, an array of length 1. Then 1 comparison is obviously necessary and sufficient in all cases. And, indeed, ceiling(log2(1+1)) = 1. log2(1) equals 0, too small.

For the rest, I haven't been able to understand your description or your pseudo-code. I'll try harder. Some things clearly aren't doing what you _intend_ them to do. For example, in your Python code, each time through the outer loop you're apparently trying to sort the next CHUNK elements, but you end up appending CHUNK+1 values to data2 (or data3). Or in this part:

    for i in range(low, high):
        x = data[i]
        if x >= data[i-1]:

the first time that loop is executed low == 0, and so i == 0 on the first iteration, and so the conditional is

    if x >= data[0-1]

That's referencing data[-1], which is the very last element in data - which has nothing to do with the CHUNK you're trying to sort at the time.

So there are a number of errors here, which makes it that much harder to sort out (pun intended) what you're trying to do. It would help you to add some asserts to verify your code is doing what you _hope_ it's doing. For example, add

    assert data2[low: high] == sorted(data[low: high])
    assert len(data2) == high

to the end of your `sample` loop, and similarly for data3 in your `new` loop. Until those asserts all pass, you're not testing code that's actually sorting correctly.

Repair the errors and you almost certainly won't find `new` running over 10 times faster than `sample` anymore. I don't know what you _will_ discover, though. If the code doesn't have to sort correctly, there are much simpler ways to make it run _very_ much faster ;-)
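For the record, the ceiling(log2(N+1)) bound is easy to check empirically. Here's a small sketch (mine, not code from the thread) that counts comparisons made by bisect's binary search and compares the worst case against the formula:

    import math
    from bisect import bisect_right

    ncmp = 0

    class C:
        def __init__(self, v):
            self.v = v
        def __lt__(self, other):
            global ncmp
            ncmp += 1
            return self.v < other.v

    for N in (1, 3, 7, 8, 100):
        a = [C(i) for i in range(N)]
        worst = 0
        for gap in range(N + 1):        # all N+1 possible "miss" results
            ncmp = 0
            bisect_right(a, C(gap - 0.5))
            worst = max(worst, ncmp)
        print(N, worst, math.ceil(math.log2(N + 1)))  # last two columns agree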
Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.
[nha pham]
> Thank you very much. I am very happy that I got a reply from Tim Peters.

My pleasure to speak with you too :-)

> You are correct, my mistake.
>
> The python code should be:
>
>     for i in range(low+1, high):  # because we already add data[low]
>         x = data[i]
>         if x >= data[i-1]:
>
> After I fix it, here is the result:
>
> random array 10^6:
> Old binsort: 1.3322
> New binsort: 1.0015
> ratio: 0.33
>
> You are right, it is not ten times faster anymore. I will update other
> results soon.
>
> I do check the result of two sorting methods many times to make sure
> they are the same. It is just because I do not know how to put assert
> into the timeit.Timer class.

`assert` is just another Python statement. You simply add it to the code - there's nothing tricky about this. You could, e.g., simply copy and paste the `assert`s I suggested last time.

Before you do, try adding `print index` to your inner loops, and make SIZE much smaller (say, 1000) so you're not overwhelmed with output. You'll be surprised by what you see on the second (and following) CHUNKs. For example, in both `sample` and `new` it will print 900 ninety nine times in a row when doing the last CHUNK. The code still isn't doing what you intend. Until it does, timing it makes little sense :-)

> I am pretty sure about this.

Note that I'm talking about the Python code here, the code you run through timeit. You cannot have checked the results of running _that_ specific code, because it doesn't work at all. You may have checked _other_ code many times. We may get to that later, but since I speak Python, I'm not going to understand what you're doing until we have Python code that works ;-)
Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.
OK - here's what the current binsort does, ignoring that it skips an already-sorted prefix (if any), and creating a new list instead of modifying in-place:

    def oldsort(a):
        from bisect import bisect_right as br
        assert a
        result = [a[0]]
        for i in range(1, len(a)):
            x = a[i]
            index = br(result, x)
            result.insert(index, x)
        return result

And here's my best guess as to what you _intend_ the new version to do. Please confirm that, or, if I'm guessing wrong, please give a Python function that _does_ implement your intent:

    def newsort(a):
        from bisect import bisect_right as br
        assert a
        oldx = a[0]
        result = [oldx]
        index = 0
        for i in range(1, len(a)):
            x = a[i]
            if x < oldx:
                index = br(result, x, 0, index)
            else:
                index = br(result, x, index + 1)
            result.insert(index, x)
            oldx = x
        return result

Now assuming that's right, I don't care about timing it ;-) The only basic question to me is whether it in fact reduces the number of comparisons. So here's an integer wrapper that bumps a global counter whenever it's asked to compare:

    class IntWrap(object):
        def __init__(self, i):
            self.i = i

        def __cmp__(a, b):
            global gncmp
            gncmp += 1
            return cmp(a.i, b.i)

        def __repr__(self):
            return repr(self.i)

Now we can build lists containing that, and get exact comparison counts. To start, for a given length `n`, this counts the total number of comparisons needed to sort all possible permutations of a list of length n, under both the old and new ways:

    def drive(n):
        import itertools
        global gncmp
        base = [IntWrap(i) for i in range(n)]
        oldcount = newcount = 0
        numperms = 0
        for p in itertools.permutations(base):
            numperms += 1
            gncmp = 0
            oldresult = oldsort(p)
            oldcount += gncmp
            gncmp = 0
            newresult = newsort(p)
            newcount += gncmp
            assert oldresult == newresult == base
        print 'n', n, 'perms', numperms
        print 'old compares', oldcount
        print 'new compares', newcount
        print 'diff %', (newcount - oldcount) * 100.0 / max(oldcount, 1)

And, finally, a tiny loop to drive it:

    for n in range(1, 13):
        drive(n)
        print

It's still running as I type this, but the results aren't promising so far - as soon as the list length gets non-trivial, the new way requires more comparisons than the old way so far:

    n 1 perms 1
    old compares 0
    new compares 0
    diff % 0.0

    n 2 perms 2
    old compares 2
    new compares 2
    diff % 0.0

    n 3 perms 6
    old compares 16
    new compares 16
    diff % 0.0

    n 4 perms 24
    old compares 112
    new compares 116
    diff % 3.57142857143

    n 5 perms 120
    old compares 848
    new compares 880
    diff % 3.77358490566

    n 6 perms 720
    old compares 7008
    new compares 7296
    diff % 4.1095890411

    n 7 perms 5040
    old compares 63456
    new compares 66432
    diff % 4.68986384266

    n 8 perms 40320
    old compares 628608
    new compares 662496
    diff % 5.39095907147

    n 9 perms 362880
    old compares 6826752
    new compares 7202304
    diff % 5.50118123523

    n 10 perms 3628800
    old compares 80605440
    new compares 85006080
    diff % 5.45948263542

I believe it would be very difficult to analyze this rigorously - and even if I saw an analysis it would be hard to trust it. Raw counts from simple code are hard to argue with ;-)

FYI, here are two ideas I had way back when, but didn't pursue:

1. Merge "2 at a time" instead of just 1. That is, first "sort" the
   next 2 elements to be merged (1 compare and a possible swap). Then
   binary search to find where the smaller belongs, and a shorter binary
   search to find where the larger belongs. Then shift both into place.
   This can win on two counts:

   A. Less data movement, since the already-sorted values after the
      larger element get moved only once instead of twice.

   B. A possible cut in number of compares.
   Merging a sorted list of N elements with a sorted list of 2 elements
   has been studied a lot (e.g., search for "optimal merging of 2
   elements" and find the paper by Hwang and Lin). The minimum average
   theoretical number of compares needed is
   ceiling(log2((N+2)*(N+1)/2)).

2. Instead of binary insertion sort, do an ordinary (but optimized)
   bottom-up merge sort. That may not cut the number of compares, but
   would slash worst-case data movement cost from O(n**2) to
   O(n*log(n)).

As to why your code is sometimes faster, for the Python code in your timing harness, well, that didn't actually sort anything, so wasn't measuring anything interesting (or even explainable ;-)). For the Java code, I have no guess - I don't know enough about Java internals. Maybe "lucky" data, maybe cache effects, maybe a mistake - don't know, and can't guess. Or maybe my guess (above) at the intent of your code is simply wrong.
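For concreteness, here's one way idea #1 could look in Python (my sketch, not code from the thread; it does two list inserts rather than one combined shift, so it only illustrates the comparison pattern, not the data-movement savings):

    from bisect import bisect_right

    def sort_two_at_a_time(a):
        result = []
        i, n = 0, len(a)
        while i + 1 < n:
            x, y = a[i], a[i + 1]
            if y < x:                  # 1 compare and a possible swap
                x, y = y, x
            j = bisect_right(result, y)        # place the larger element
            result.insert(j, y)
            k = bisect_right(result, x, 0, j)  # shorter search for the smaller
            result.insert(k, x)
            i += 2
        if i < n:                              # odd length: one element left
            result.insert(bisect_right(result, a[i]), a[i])
        return result

    assert sort_two_at_a_time([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]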
Re: [Python-Dev] Tuning binary insertion sort algorithm in Timsort.
[Tim]
>> 1. Merge "2 at a time" instead of just 1. That is, first "sort" the
>> next 2 elements to be merged (1 compare and a possible swap). Then
>> binary search to find where the smaller belongs, and a shorter binary
>> search to find where the larger belongs. Then shift both into place.

[Armin]
> Good idea, but when I tried that it seemed to increase the total
> number of comparisons (on random inputs with only up to 136 items).
> The increase is on the order of 5%. I'm not sure reduced data
> movement can compensate for that in Python.

Which is another way of saying "bad idea" - that must be why I didn't pursue it to begin with ;-) Thanks for trying!

I plugged a similar routine into the code I showed before to count the # of comparisons in Nha Pham's idea, and this "merge 2 at a time" thing has a higher average # of compares (over all permutations) than Nha's (which in turn has a higher average than the status quo).

That makes some sense, thinking about what they do. Nha's algorithm has some supernaturally good cases (input array already ascending or already descending), but "merge 2 at a time" doesn't appear to have any.

In any case, the information-theoretic minimum average number of comparisons for merging N sorted elements with 2 sorted elements is ("where do the 2 belong in the final list of N+2 elements?" = comb(N+2, 2)):

    log2((N+2)*(N+1)/2) = log2(N+2) + log2(N+1) - 1

Add a comparison to get the 2 elements in order to begin with, and we're up to

    log2(N+2) + log2(N+1)

Two independent binary inserts (first to a list of size N, and then to a list of size N+1) comes out to the same. So even being supernaturally clever can't reduce the average number of compares this way.

And since, in context, we're only looking at short arrays, a marginal saving in data movement costs (which are still O(N**2) worst case) is unlikely to be significant. Still, if anyone wants to go nuts ... ;-)
Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector
[Neil Schemenauer]
> Python objects that participate in cyclic GC (things like lists, dicts,
> sets but not strings, ints and floats) have extra memory overhead. I
> think it is possible to mostly eliminate this overhead. Also, while
> the GC is running, this GC state is mutated, which destroys
> copy-on-write optimizations. This change would mostly fix that
> issue.
>
> All objects that participate in cyclic GC have the Py_TPFLAGS_HAVE_GC
> bit set in their type. That causes an extra chunk of memory to be
> allocated *before* the ob_refcnt struct member. This is the PyGC_Head
> struct.
>
> The whole object looks like this in memory (PyObject pointer is at
> arrow):
>
>         union __gc_head *gc_next;
>         union __gc_head *gc_prev;
>         Py_ssize_t gc_refs;
>     --> Py_ssize_t ob_refcnt;
>         struct _typeobject *ob_type;
>         [rest of PyObject members]
>
> So, 24 bytes of overhead on a 64-bit machine. The smallest Python
> object that can have a pointer to another object (e.g. a single
> PyObject * member) is 48 bytes. Removing PyGC_Head would cut the size
> of these objects in half.
>
> Carl Shapiro questioned me today on why we use a doubly linked list and
> not the memory bitmap. I think the answer is that there is no good
> reason. We use a doubly linked list only due to historical constraints
> that are no longer present.

Since you wrote this code to begin with, it will come back to you ;-) that the real purpose of the doubly-linked lists is to _partition_ (not just find) the tracked objects. Between collections, they're partitioned by generation, and within a collection equivalence classes are first merged (up through the oldest generation to be scanned in this run), and then temporarily partitioned internally in various ways (based on things like whether objects turn out to be reachable from outside, and whether they have finalizers).

The linked list representation makes all the required operations cheap: iteration, merging classes, moving an object from one class to another, removing an object entirely _while_ iterating over its equivalence class. Don't know whether all that can be done efficiently with a bitmap representation instead.

> Long ago, Python objects could be allocated using the system malloc or
> other memory allocators. Since we could not control the memory
> location, bitmaps would be inefficient. Today, we allocate all Python
> objects via our own function. Python objects under a certain size are
> allocated using our own malloc, obmalloc, and are stored in memory
> blocks known as "arenas".
>
> The PyGC_Head struct performs three functions. First, it allows the GC
> to find all Python objects that will be checked for cycles (i.e. follow
> the linked list).

As above, the set of tracked objects is partitioned into more than one linked list.

> Second, it stores a single bit of information to let
> the GC know if it is safe to traverse the object, set with
> PyObject_GC_Track().

? An object is "tracked" if and only if it appears in _some_ doubly-linked list. There is no bit set (or cleared) for this. Untracking an object removes it entirely from whichever linked list it's in (leaving it in no linked lists), and tracking an object consists of adding it to the "generation 0" linked list. Unless the code has changed a whole lot recently.

For clarity, the top N-1 bits of gc_refs (which you cover next) are also set to a special _PyGC_REFS_UNTRACKED constant when an object is untracked:

    /* True if the object is currently tracked by the GC. */
    #define _PyObject_GC_IS_TRACKED(o) \
        (_PyGC_REFS(o) != _PyGC_REFS_UNTRACKED)

But I believe it could just as well check to see whether the object's gc_next is NULL.

> Finally, it has a scratch area to compute the
> effective reference count while tracing refs (gc_refs).

As above, the top N-1 bits of that are also used between collections to record whether an object is tracked. The least significant bit of gc_refs now (not back when you or I were mucking with this code) records whether the object has a finalizer that has already been run, and that state needs to be preserved across gc runs. So that's another bit that would need to be stored somewhere else.

> Here is a sketch of how we can remove the PyGC_Head struct for small
> objects (say less than 512 bytes). Large objects or objects created by
> a different memory allocator will still have the PyGC_Head overhead.
>
> * Have memory arenas that contain only objects with the
>   Py_TPFLAGS_HAVE_GC flag. Objects like ints, strings, etc will be
>   in different arenas, not have bitmaps, not be looked at by the
>   cyclic GC.
>
> * For those arenas, add a memory bitmap. The bitmap is a bit array that
>   has a bit for each fixed size object in the arena. The memory used by
>   the bitmap is a fraction of what is needed by PyGC_Head. E.g. an
>   arena that holds up to 1024 objects of 48 bytes in size would have a
>   bitmap of 1024 bits. If it's based on
Re: [Python-Dev] Memory bitmaps for the Python cyclic garbage collector
[Tim]
>> In that case, it's because Python _does_ mutate the objects' refcount
>> members under the covers, and so the OS ends up making fresh copies of
>> the memory anyway.

[Greg Ewing]
> Has anyone ever considered addressing that by moving the
> refcounts out of the objects and keeping them somewhere
> else?

Not that I know of. I know Larry Hastings was considering doing it as part of his experiments with removing the GIL, but that had nothing to do with reducing cross-process copy-on-write surprises (it had to do with "batching" refcount operations to eliminate a need for fine-grained locking).

As-is, I'd say it's "a feature" that the refcount is part of the object header. Ref count manipulations are very frequent, and as part of the object header a refcount tends to show up in cache lines "for free" as a side effect of accessing the object's type pointer.
Re: [Python-Dev] Compiling Python-3.6.3 fails two tests test_math and test_cmath
[Richard Hinerfeld]
> Compiling Python-3.6.3 on Linux fails two tests: test_math and
> test_cmath

Precisely which version of Linux? The same failure has already been reported on OpenBSD here:

https://bugs.python.org/issue31630
Re: [Python-Dev] \G (match last position) regex operator non-existent in python?
Note that Matthew Barnett's `regex` module already supports \G, and a great many other features that weren't around 15 years ago ;-) either:

https://pypi.python.org/pypi/regex/

I haven't followed this in detail. I'm just surprised once per year that it hasn't been folded into the core ;-)

[nothing new below]

On Fri, Oct 27, 2017 at 10:35 AM, Guido van Rossum wrote:
> The "why" question is not very interesting -- it probably wasn't in
> PCRE and nobody was familiar with it when we moved off PCRE (maybe it
> wasn't even in Perl at the time -- it was ~15 years ago).
>
> I didn't understand your description of \G so I googled it and found a
> helpful StackOverflow article:
> https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex
> From this I understand that when using e.g. findall() it forces
> successive matches to be adjacent.
>
> In general this seems to be a unique property of \G: it preserves
> *state* from one match to the next. This will make it somewhat
> difficult to implement -- e.g. that state should probably be
> thread-local in case multiple threads use the same compiled regex.
> It's also unclear when that state should be reset. (Only when you
> compile the regex? Each time you pass it a different source string?)
>
> So I'm not sure it's reasonable to add. But I also don't see a reason
> why it shouldn't be added -- presuming we can decide on a good answer
> for the questions above about the "scope" of the anchor.
>
> I think it's okay to start a discussion on bugs.python.org about the
> precise specification of \G for Python. OTOH I expect that most core
> devs won't find this a very interesting problem (Python relies on
> regexes for parsing a lot less than Perl does).
>
> Good luck!
>
> On Thu, Oct 26, 2017 at 11:03 PM, Ed Peschko wrote:
>>
>> All,
>>
>> perl has a regex assertion (\G) that allows multiple-match regular
>> expressions to be able to use the position of the last match. Perl's
>> documentation puts it this way:
>>
>>     \G  Match only at pos() (e.g. at the end-of-match position of
>>         prior m//g)
>>
>> Anyways, this is exceedingly powerful for matching regularly
>> structured free-form records, and I was really surprised when I found
>> out that python did not have it. For example, if findall supported
>> this, it would be possible to write things like this (a quick and
>> dirty ifconfig parser):
>>
>>     pat = re.compile(r'\G(\S+)(.*?\n)(?=\S+|\Z)', re.S)
>>
>>     val = """
>>     eth2  Link encap:Ethernet  HWaddr xx
>>           inet addr: xx.xx.xx.xx Bcast:xx.xx.xx.xx Mask:xx.xx.xx.xx
>>     ...
>>     lo    Link encap:Local Loopback
>>           inet addr:127.0.0.1 Mask:255.0.0.0
>>     """
>>     matches = re.findall(pat, val)
>>
>> So - why doesn't python have this? is it something that simply was
>> overlooked, or is there another method of doing the same thing with
>> arbitrarily complex freeform records?
>>
>> thanks much..
Re: [Python-Dev] The current dict is not an "OrderedDict"
[Peter Ludemann]
> Does it matter whether the dict order after pop/delete is explicitly
> specified, or just specified that it's deterministic?

Any behavior whatsoever becomes essential after it becomes known ;-)

For example, dicts as currently ordered easily support LRU (least recently used) purging like so:

On access:

    result = d.pop(key)
    d[key] = result

This moves `key` from wherever it was to the _most_ recently used position. To purge the `numtopurge` least recently used keys (since traversing the dict is always from least-recently to most-recently added):

    topurge = tuple(itertools.islice(d, numtopurge))
    for key in topurge:
        del d[key]

Is it worth guaranteeing that will always "work" (as intended)? Not to me, but I do have code that relies on it now - and we can count on someone else saying it's utterly crucial ;-)
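A quick demonstration of those two recipes (my example, assuming CPython 3.6+ insertion-ordered dicts):

    import itertools

    cache = dict.fromkeys("abcd")   # insertion order: a, b, c, d
    cache["a"] = cache.pop("a")     # touch "a": order is now b, c, d, a

    numtopurge = 2
    topurge = tuple(itertools.islice(cache, numtopurge))
    for key in topurge:             # evicts "b" and "c", the two oldest
        del cache[key]

    print(list(cache))              # ['d', 'a']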
Re: [Python-Dev] Guarantee ordered dict literals in v3.7?
[Eric Snow]
> Does that include preserving order after deletion?

Given that we're blessing current behavior:

- At any moment, iteration order is from oldest to newest. So, "yes" to
  your question.

- While iteration starts with the oldest, .popitem() returns the
  youngest. This is analogous to how lists work, viewing a dict
  similarly ordered "left to right" (iteration starts at the left,
  .pop() at the right, for lists and dicts).
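Both points are easy to see in a quick session (my example):

    d = dict.fromkeys("abc")   # inserted in order: a, b, c
    print(next(iter(d)))       # 'a' - iteration starts at the oldest
    print(d.popitem())         # ('c', None) - popitem returns the youngest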
Re: [Python-Dev] Symmetry arguments for API expansion
[Guido]
> as_integer_ratio() seems mostly cute (it has Tim Peters all
> over it),

Nope! I had nothing to do with it. I would have been -0.5 on adding it had I been aware at the time.

- I expect the audience is tiny.

- While, ya, _I_ have uses for it, I had a utility function for it
  approximately forever (it's easily built on top of math.frexp()).

- Especially now, fractions.Fraction(some_float) is the same thing
  except for return type.

> OTOH it looks like Decimal has it,

Looks like floats got it first, and then it spread to Decimal because "why not?" ;-) The first attempt to spread it to Decimal I found was rejected (which would have been my vote too):

https://bugs.python.org/issue8947

> so I think this ship has sailed too and maybe it's best to add it to
> the numeric tower just to be done with it.

Or rip it out of everything. Either way works for me ;-)
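For the curious, here's roughly what such a math.frexp()-based utility can look like (my sketch, not Tim's actual function):

    import math

    def as_integer_ratio(x):
        # x == m * 2**e exactly, with 0.5 <= abs(m) < 1; m has at most
        # 53 significant bits, so m * 2**53 is an exact integer
        m, e = math.frexp(x)
        n = int(m * 2.0 ** 53)
        e -= 53
        if e >= 0:
            return n * 2 ** e, 1
        # shed common factors of 2 to put the ratio in lowest terms
        shift = min(-e, (n & -n).bit_length() - 1) if n else -e
        return n >> shift, 1 << (-e - shift)

    assert as_integer_ratio(2.5) == (2.5).as_integer_ratio() == (5, 2)
    assert as_integer_ratio(-0.1) == (-0.1).as_integer_ratio()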
Re: [Python-Dev] Symmetry arguments for API expansion
[David Mertz]
> ...
> I can see no sane reason why anyone would ever call float.is_integer()
> actually. That should always be spelled math.isclose(x, int(x)) because
> IEEE-754. Attractive nuisance is probably too generous, I'd simply call
> the method a bug.

Sometimes it's necessary to know, and especially when _implementing_ 754-conforming functions. For example, what negative infinity raised to a power needs to return depends on whether the power is an integer (specifically on whether it's an odd integer):

    >>> (-math.inf) ** random.random()
    inf
    >>> (-math.inf) ** random.random()
    inf
    >>> (-math.inf) ** random.random()
    inf
    >>> (-math.inf) ** 3.1
    inf
    >>> (-math.inf) ** 3.0   # NOTE THIS ONE
    -inf
    >>> (-math.inf) ** 2.9
    inf

But, ya, for most people most of the time I agree is_integer() is an attractive nuisance. People implementing math functions are famous for cheerfully enduring any amount of pain needed to get the job done ;-)
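To make the implementation angle concrete, here's a sketch (mine; the restriction to finite positive powers is an assumption for brevity) of the kind of special-casing a 754-style pow has to do, and why an exact is-integer test is the right tool:

    import math

    def pow_neg_inf(y):
        """(-inf) ** y for finite y > 0, per the IEEE-754 pow rules."""
        # the result is -inf exactly when y is an odd integer;
        # isclose() would get this wrong for any y merely near an integer
        if y.is_integer() and int(y) % 2 == 1:
            return -math.inf
        return math.inf

    assert pow_neg_inf(3.0) == -math.inf == (-math.inf) ** 3.0
    assert pow_neg_inf(3.1) == math.inf == (-math.inf) ** 3.1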
Re: [Python-Dev] Symmetry arguments for API expansion
[Tim Peters]
>> ...
>> >>> (-math.inf) ** 3.1
>> inf

[David Mertz]
> Weird. I take it that's what IEEE-754 says. NaN would sure be more
> intuitive here since inf+inf*j is not in the domain of Reals. Well,
> technically neither is inf, but at least it's the limit of the domain.
> :-)

Mathematical reals have all sorts of properties floats fail to capture, while mathematical reals don't distinguish between -0 and +0 at all. "Practical" symmetry arguments often underlie what float standards require.

At heart, the rules for infinite arguments are often _consequences_ of "more obvious" rules for signed zero arguments, following from replacing +-inf with 1/+-0 in the latter. More explanation here:

https://stackoverflow.com/questions/10367011/why-is-pow-infinity-positive-non-integer-infinity

But we're not required to _like_ it; we just have to implement it ;-)

>> >>> (-math.inf) ** 3.0   # NOTE THIS ONE
>> -inf
>> >>> (-math.inf) ** 2.9
>> inf
Re: [Python-Dev] Symmetry arguments for API expansion
[Tim, on as_integer_ratio()]
>> - I expect the audience is tiny.

[Alexander Belopolsky]
> The datetime module would benefit from having as_integer_ratio()
> supported by more types. It's been hard to resist requests to allow
> Decimal in timedelta constructors and/or arithmetic.

I don't see the connection. That timedelta construction may use as_integer_ratio() today doesn't mean it _has_ to use as_integer_ratio() forever, and is no reason (to my mind) to add as_integer_ratio all over the place.

Why not drop that, and in oddball cases see whether fractions.Fraction() can handle the input?

    >>> fractions.Fraction(decimal.Decimal("1.76"))
    Fraction(44, 25)

Probably less efficient, but I don't care ;-) And then, e.g., timedelta would also automagically allow Fraction arguments (which, BTW, don't support as_integer_ratio() either).

Bonus: if datetime is bothering with hand-coding rational arithmetic now out of concern to get every bit right, Fraction could handle that too by itself. At heart, the Fraction() constructor is _all about_ creating integer ratios, so is the most natural place to put knowledge of how to do so. A protocol for allowing new numeric types to get converted to Fraction would be more generally useful than just a weird method only datetime uses ;-)
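Here's what that could look like in practice - a hypothetical helper (my sketch; td_from_seconds is not a real datetime API) that routes any Fraction-convertible type into timedelta:

    from datetime import timedelta
    from decimal import Decimal
    from fractions import Fraction

    def td_from_seconds(x):
        f = Fraction(x)               # handles Decimal, float, int, ...
        whole, frac = divmod(f, 1)    # exact integer and fractional parts
        return timedelta(seconds=int(whole),
                         microseconds=round(frac * 1_000_000))

    print(td_from_seconds(Decimal("1.76")))    # 0:00:01.760000
    print(td_from_seconds(Fraction(44, 25)))   # same thing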
Re: [Python-Dev] Symmetry arguments for API expansion
[Tim]
>> At heart, the Fraction() constructor is _all about_ creating integer
>> ratios, so is the most natural place to put knowledge of how to do so.
>> A protocol for allowing new numeric types to get converted to Fraction
>> would be more generally useful than just a weird method only datetime
>> uses ;-)

[Guido]
> Ironically, the various Fraction constructors *call* as_integer_ratio()
> for floats and Decimals. From which follows IMO that the float and
> Decimal classes are the right place to encapsulate the knowledge on how
> to do it.

It appears that as_integer_ratio was slammed into floats and Decimals precisely _so that_ Fraction() could call them, while Fraction has its own self-contained knowledge of how to convert ints and Fractions and strings and numbers.Rationals to Fraction (and the former types don't support as_integer_ratio).

That's fine, but my objection is subtler: the actual answer to "can this thing be converted to an integer ratio?" is not "does it support as_integer_ratio?", but rather "can Fraction() deal with it?" - and there's currently no way for a new numeric type to say "and here's how I can be converted to Fraction".

An obvious way to extend it is for Fraction() to look for a special method too, say "_as_integer_ratio()". The leading underscore would reflect the truth: that this wasn't really intended to be a public method on its own, but is an internal protocol for use by the Fraction() constructor. Then it would be obvious that, e.g., it would be just plain stupid ;-) for `int` to bother implementing _as_integer_ratio. The only real point of the method is to play nice with the Fraction constructor.

_As is_, it's jarring that int.as_integer_ratio() doesn't exist - for the same reason it's jarring int.hex() doesn't exist. If Mark or I wanted to use float._as_integer_ratio() directly too, that's fine: we're numeric grownups and won't throw a hissy fit if ints don't support it too ;-)
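A sketch of the protocol being proposed (mine; "_as_integer_ratio" is the hypothetical name from this message, not anything Fraction actually looks for):

    from fractions import Fraction

    def to_fraction(x):
        """Convert x to Fraction, preferring the private protocol."""
        method = getattr(type(x), "_as_integer_ratio", None)
        if method is not None:
            num, den = method(x)
            return Fraction(num, den)
        return Fraction(x)   # fall back on what Fraction already knows

    class Half:
        # a new numeric type opting in to the hypothetical protocol
        def _as_integer_ratio(self):
            return 1, 2

    print(to_fraction(Half()))   # 1/2
    print(to_fraction(0.25))     # 1/4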
Re: [Python-Dev] Symmetry arguments for API expansion
[Guido]
> So let's make as_integer_ratio() the standard protocol for "how to make
> a Fraction out of a number that doesn't implement numbers.Rational". We
> already have two examples of this (float and Decimal) and perhaps numpy
> or the sometimes proposed fixed-width decimal type can benefit from it
> too.

Yup, that works. I only would have preferred that you went back in time to add a leading underscore.

> If this means we should add it to int, that's fine with me.

Given that int.numerator and int.denominator already exist, there's no plausible "good reason" to refuse to return them as a 2-tuple. Still, I'd wait for someone to complain ;-)
Re: [Python-Dev] Symmetry arguments for API expansion
[Tim]
>> An obvious way to extend it is for Fraction() to look for a special
>> method too, say "_as_integer_ratio()".

[Greg Ewing]
> Why not __as_integer_ratio__?

Because, at this point, that would be beating a dead horse ;-)
Re: [Python-Dev] Deprecating float.is_integer()
[David Mertz]
> I've been using and teaching python for close to 20 years and I never
> noticed that x.is_integer() exists until this thread.

Except it was impossible to notice across most of those years, because it didn't exist across most of those years ;-)

> I would say the "one obvious way" is less than obvious.

When it was introduced, it _became_ the one obvious way.

> On the other hand, `x == int(x)` is genuinely obvious.

But a bad approach: it can raise OverflowError (for infinite x); it can raise ValueError (for x a NaN); and can waste relative mountains of time creating huge integers, e.g.,

    >>> int(1e306)
    117216064596736454828831087825013238982328892017892380671244575047987920451875459594568606138861698291060311049225532948520696938805711440650122628514669428460356992624968028329550689224175284346730060716088829214255439694630119794546505512415617982143262670862918816362862119154749127262208

In Python 2, x == math.floor(x) was much better on the latter count, but not in Python 3 (math.floor used to return a float, but returns an int now).

As to Serhiy's `not x % 1.0`, after 5 minutes I gave up trying to prove it's always correct. Besides infinities and NaNs, there's also that Python's float mod can be surprising:

    >>> (-1e-20) % 1.0
    1.0

There isn't a "clean" mathematical definition of what Python's float % does, which is why proof is strained. In general, the "natural" result is patched when and if needed to maintain that x == y*(x//y) + x%y is approximately true. The odd % result above is a consequence of that, and that (-1e-20) // 1.0 is inarguably -1.0.

> and it immediately suggests the probably better `math.isclose(x,
> int(x))` that is what you usually mean.

Even in some of the poor cases Serhiy found, that wouldn't be a lick better. For example, math.isclose(x/5, int(x/5)) is still a plain wrong way to check whether x is divisible by 5.

    >>> x = 1e306
    >>> math.isclose(x/5, int(x/5))
    True
    >>> x/5 == int(x/5)
    True
    >>> int(x) % 5
    3

The problem there isn't how "is it an integer?" is spelled, it's that _any_ way of spelling "is it an integer?" doesn't answer the question they're trying to answer. They're just plain confused about how floating point works. The use of `.is_integer()` (however spelled!) isn't the cause of that, it's a symptom.
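If you want the semantics of float.is_integer() without the method, a spelling that avoids all three pitfalls above is possible (my sketch): math.modf never raises on infinities or NaNs and never builds a huge int.

    import math

    def is_integral(x):
        # fractional part exactly 0.0; the isfinite guard handles
        # inf (modf(inf) == (0.0, inf)) and NaN
        return math.isfinite(x) and math.modf(x)[0] == 0.0

    assert is_integral(1e306) and not is_integral(-1e-20)
    assert not is_integral(math.inf) and not is_integral(math.nan)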
Re: [Python-Dev] Deprecating float.is_integer()
Note: this is a top-posted essay much more about floating-point philosophy than about details. Details follow from the philosophy, and if philosophies don't match the desired details will never match either.

Understanding floating point requires accepting that they're a funky subset of rational numbers, augmented with some oddballs (NaNs, "infinities", minus zero). At best the reals are a vague inspiration, and floats have their own terminology serving their actual nature. Thinking about reals instead is often unhelpful.

For example, it's bog standard terminology to call all IEEE-754 values that aren't infinities or NaNs "finite". Which, by no coincidence, is how Python's math.isfinite() discriminates.

Within the finites - which are all rational numbers - the distinction between integers and non-integers is obvious, but only after you're aware of it and give it some thought. Which most people aren't and don't - but that's no reason to prevent the rest of us from getting work done ;-)

This isn't anything new in Python - it's as old as floating-point. For example, look up C's ancient "modf" function (which breaks a float/double into its "integer" and "fractional" parts, and treats all finite floats of sufficiently large magnitude as having fractional parts of 0.0 - because they are in fact exact integers).

The idea that floats are "just approximations - so all kinds of slop is acceptable and all kinds of fear inescapable" went out of style when IEEE-754 was introduced. That standard codified an alternative view: that functions on floats should behave as if their inputs were _exactly_ correct, and - given that - produce the closest representable value to the infinitely precise result. That proved to be extremely valuable in practice, allowing the development of shorter, faster, more robust, and more accurate numerical algorithms. The trend ever since has been to do more & more along those lines, from trig functions doing argument reduction as if pi were represented with infinite precision, to adding single-rounding dot product primitives (all again acting as if all the inputs were exactly correct).

Since that approach has been highly productive in real life, it's the one I favor. Arguments like "no floating point number on the order of 1e306 is sufficiently precise as to be an integer in any meaningful sense" don't even start to get off the ground in that approach. Maybe in 1970 ;-) You can have no actual idea of whether 1e306 is exactly right or off by a factor of a million just from staring at it, and real progress has been made by assuming all values are exactly what they appear to be, then acting accordingly. If you want to model that some values are uncertain, that's fine, but then you need something like interval arithmetic instead.

From that fundamental "take floats exactly at face value" view, what .is_integer() should do for floats is utterly obvious: there is no possible argument about whether a given IEEE-754 float is or is not an integer, provided you're thinking about IEEE-754 floats (and not, e.g., about mathematical reals), and making even a tiny attempt to honor the spirit of the IEEE-754 standard.

Whether that's _useful_ to you depends on the application you're writing at the time. The advantage of the philosophy is that it often gives clear guidance about what implementations "should do" regardless, and following that guidance has repeatedly proved to be a boon to those writing numerical methods.
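(A quick illustration of the modf point above, in Python rather than C - my example:)

    import math

    print(math.modf(123.625))    # (0.625, 123.0)
    print(math.modf(1e306))      # (0.0, 1e306) - fractional part exactly 0.0
    print((1e306).is_integer())  # True, by the same "face value" reasoning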
And, yes, also a pain in the ass ;-)

--- nothing new below ---

On Wed, Mar 21, 2018 at 3:49 PM, David Mertz wrote:
> On Wed, Mar 21, 2018 at 3:02 PM, Tim Peters wrote:
>>
>> [David Mertz]
>> > I've been using and teaching python for close to 20 years and I
>> > never noticed that x.is_integer() exists until this thread.
>>
>> Except it was impossible to notice across most of those years, because
>> it didn't exist across most of those years ;-)
>
> That's probably some of the reason. I wasn't sure if someone used the
> time machine to stick it back into Python 1.4.
>
>> > On the other hand, `x == int(x)` is genuinely obvious.
>>
>> But a bad approach: it can raise OverflowError (for infinite x); it
>> can raise ValueError (for x a NaN);
>
> These are the CORRECT answers! Infinity neither is nor is not an
> integer. Returning a boolean as an answer is bad behavior; I might
> argue about *which* exception is best, but False is not a good answer
> to `float('inf').is_integer()`. Infinity is neither in the Reals nor in
> the Integers, but it's just as much the limit of either.
>
> Likewise Not-a-Number isn'
Re: [Python-Dev] Deprecating float.is_integer()
[David Mertz]
>> For example, this can be true (even without reaching inf):
>>
>> >>> x.is_integer()
>> True
>> >>> (math.sqrt(x**2)).is_integer()
>> False

[Mark Dickinson]
> If you have a moment to share it, I'd be interested to know what value
> of `x` you used to achieve this, and what system you were on. This
> can't happen under IEEE 754 arithmetic.

I expect it might happen under one of the directed rounding modes (like "to +infinity"). But under 754 binary round-nearest/even arithmetic, it's been formally proved that sqrt(x*x) == x exactly for all non-negative finite x such that x*x neither overflows nor underflows (and .is_integer() has nothing to do with that very strong result):

https://hal.inria.fr/hal-01148409/document

OTOH, the paper notes that it's not necessarily true for IEEE decimal arithmetic; e.g.,

    >>> import decimal
    >>> decimal.getcontext().prec = 4
    >>> (decimal.Decimal("31.66") ** 2).sqrt()   # result is 1 ulp smaller
    Decimal('31.65')
    >>> decimal.getcontext().prec = 5
    >>> (decimal.Decimal("31.660") ** 2).sqrt()  # result is 1 ulp larger
    Decimal('31.661')
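The binary half of that claim is easy to spot-check (my sketch; the range is chosen so x*x can neither overflow nor underflow):

    import math
    import random

    for _ in range(100_000):
        x = random.uniform(1e-150, 1e150)
        assert math.sqrt(x * x) == x   # exact, per the cited proof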
Re: [Python-Dev] Deprecating float.is_integer()
[Devin Jeanpierre]
> PyPy (5.8):
>
> >>>> x = 1e300
> >>>> x.is_integer()
> True
> >>>> math.sqrt(x**2).is_integer()
> False
> >>>> x**2
> inf

I think you missed that David said "even without reaching inf" (you did reach inf), and that I said "such that x*x neither overflows nor underflows". Those are technical words related to IEEE-754: your x*x sets the IEEE overflow flag, although CPython may or may not raise the Python OverflowError exception.

> (It gives an OverflowError on my CPython installs.)
>
> I believe this is allowed, and Python is not required to raise
> OverflowError here:
> https://docs.python.org/3.6/library/exceptions.html#OverflowError
> says:
>
>> for historical reasons, OverflowError is sometimes raised for integers
>> that are outside a required range. Because of the lack of
>> standardization of floating point exception handling in C, most
>> floating point operations are not checked

You can avoid the OverflowError (but not the IEEE overflow condition!) under CPython by multiplying instead:

    >>> x = 1e300
    >>> x*x
    inf
Re: [Python-Dev] Deprecating float.is_integer()
[Chris Barker]
> ...
> ... is it the "right" thing to do in most cases, when deployed by folks
> that haven't thought deeply about floating point?

Gimme a break ;-) Even people who _believe_ they've thought about floating point still litter the bug tracker with

    >>> .1 + .2
    0.30000000000000004

"bug reports". .is_integer() is easy to explain compared to that - and you have to go out of your way to use it.

> ...
> I think pretty much all the real world code that's been shown here for
> using .is_integer() is really about type errors (issues). The function
> at hand really wants integer inputs -- but wants to allow the user to
> be sloppy and provide a float type that happens to be an int. Given
> Python's duck-typing nature, maybe that's a good thing? I know I really
> discourage dynamic type checking

So you identified a use case. One you don't approve of (nor do I), but not strongly enough to demand they suffer instead ;-)

> Also, every example has been for small-ish integers -- exponents,
> factorials, etc -- not order 1e300 -- or inf or NaN, etc.
>
> Finally, the use-cases where the value that happens-to-be-an-int is
> computed via floating point -- .is_integer() is probably the wrong
> check -- you probably want isclose().

Everyone who has implemented a production math library can recall cases where the functionality was needed. Here, that includes at least Stefan Krah and me. You could also follow the link from Mark Dickinson to SciPy's implementation of the beta function. In every case I've needed the functionality, isclose() would have been utterly useless. Behold:

    >>> (-1.0) ** 3.0
    -1.0
    >>> (-1.0) ** 3.000000000001   # different result _type_
    (-1-3.142007854859299e-12j)
    >>> math.isclose(3.0, 3.000000000001)
    True

And another showing that the same functionality is needed regardless of how large the power:

    >>> (-1.0) ** 1e300   # an even integer power
    1.0

When implementing an externally defined standard, when it says "and if such-and-such is an integer ...", it _means_ exactly an integer, not "or a few ULP away from an integer". IEEE pow()-like functions bristle with special cases for integers.

    >>> (-math.inf) ** 3.1
    inf
    >>> (-math.inf) ** 3.0   # note: negative result (odd integer power)
    -inf
    >>> (-math.inf) ** 2.9
    inf

> ...
> All this points to: we don't need .is_integer

I'll grant that you don't think you need it. So don't use it ;-)

> All that being said -- the standard for deprecation is a much higher
> bar than not-adding-it-in-the-first-place.

I would not have added it as a method to begin with - but I agree with Guido that it doesn't reach the bar for deprecation. The only examples of "bad" uses we saw were from people still so naive about floating-point behavior that they'll easily fall into other ways to get it wrong. What we haven't seen: a single person here saying "you know, I think _I'd_ be seduced into misusing it!". It's not _inherently_ confusing at all.
Re: [Python-Dev] Deprecating float.is_integer()
[Tim]
>> from trig functions doing argument reduction as if pi were represented
>> with infinite precision,

[Greg Ewing]
> That sounds like an interesting trick! Can you provide
> pointers to any literature describing how it's done?
>
> Not doubting it's possible, just curious.

As I recall, when it was first done a "lazy" routine produced as many bits of pi as a given argument required, doing gonzo arbitrary precision arithmetic.

Later, computer-aided analysis based on continued fraction expansions identified the worst possible case across all IEEE doubles (& singles). For example, it's possible in reasonable time to find the IEEE double that comes closest to being an exact integer multiple of pi/4 (or whatever other range you want to reduce to). Then it's only necessary to precompute pi to as many bits as needed to handle the worst case. In practice, falling back to that is necessary only for "large" arguments, and the usual double-precision numeric tricks suffice for smaller arguments.

Search the web for "trig argument reduction" for whatever the state of the art may be today ;-)

For actual code, FDLIBM does "as if infinite precision" trig argument reduction, using a precomputed number of pi bits sufficient to handle the worst possible IEEE double case, and is available for free from NETLIB:

http://www.netlib.org/fdlibm/

The code is likely to be baffling, though, as there's scant explanation. Reading a paper or two first would be a huge help.
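Here's a toy version of the "gonzo arbitrary precision" approach, using the pi() recipe from the decimal module's docs to reduce an argument modulo 2*pi at high precision (my sketch - nothing like fdlibm's actual precomputed-bits scheme, but it shows the idea):

    import math
    from decimal import Decimal, getcontext

    def pi():
        """Compute pi to the current precision (recipe from the decimal docs)."""
        getcontext().prec += 2
        three = Decimal(3)
        lasts, t, s, n, na, d, da = 0, three, 3, 1, 0, 0, 24
        while s != lasts:
            lasts = s
            n, na = n + na, na + 8
            d, da = d + da, da + 32
            t = (t * n) / d
            s += t
        getcontext().prec -= 2
        return +s

    def sin_reduced(x, digits=400):
        # enough digits that even the largest doubles reduce accurately
        getcontext().prec = digits
        r = Decimal(x) % (2 * pi())   # Decimal(x) is the double's exact value
        return math.sin(float(r))

    print(sin_reduced(1e22))   # -0.852200849767..., agreeing with a
                               # correctly-reducing libm's math.sin(1e22)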
Re: [Python-Dev] Deprecating float.is_integer()
[Kirill Balunov]
> I apologize that I get into the discussion. Obviously in some
> situations it will be useful to check that a floating-point number is
> integral, but from the examples given it is clear that they are very
> rare. Why was the variant of putting this functionality into the math
> module not considered at all?

Nobody here really discussed the history, and I don't know. The questions here have been about what to do given that `is_integer` and `as_integer_ratio` are _already_ advertised (public) methods on some numeric types.

> If the answer is - consistency upon the numeric tower - will it go
> for the complex type, and what will it mean (there can be two points of
> view)?

I haven't seen anyone suggest either method be added to Complex. There are lots of methods that don't show up in the tower before hitting Real. For example, given that Complex doesn't support __float__, it would be bizarre if it _did_ support as_integer_ratio.

> Is this functionality so often used and practical to be a method of
> float, int, ..., and not just to be an auxiliary function?
>
> p.s.: The same thoughts about the `as_integer_ratio` discussion.

I would have added them as functions in the `math` module instead, perhaps supported by dunder methods (__as_integer_ratio__, __is_integer__). But that's not what happened, and whether or not they have double underscores on each end doesn't really make all that much difference except to dedicated pedants ;-)
Re: [Python-Dev] Deprecating float.is_integer()
[Kirill Balunov]
> ...
> In spite of the fact that the pronouncement has
> already been made, there may still be an opportunity to influence this
> decision.

That's not really how this works. Guido has been doing this for decades, and when he Pronounces he's done with it :-)

> I do not think that this is only a matter of choice, how this
> functionality will be accessed through a method or function, in fact
> these highly specialized methods heavily pollute the API

"Heavily"? Seems oversold.

> and open the door for persistent questions.

That's a door that can never be closed, no matter what.

> Given the frequency and activity of using this
> `.is_integer` method the deprecation of this method is unlikely to
> greatly affect someone. (for `as_integer_ratio` I think the bar is
> higher).
> Summarizing this thread, it seems to me that deprecating the
> `is_integer` method and adding an `is_integer` function in the math
> module will make everyone happy:

Not at all, but that's already been explained. Deprecation is _serious_ business: it's not only the presumably relative handful of direct users who are directly annoyed, but any number of worldwide web pages, blogs, books, papers, slides, handouts, message boards ... that so much as mentioned the now-deprecated feature. The language implementation is the tiniest part of what's affected, yet is the _only_ part we (Python developers) can repair.

Deprecation really requires that something is a security hole that can't be repaired, impossible to make work as intended, approximately senseless, or is superseded by a new way to accomplish a thing that's near-universally agreed to be vastly superior. Maybe others? Regardless, they're all "really big deals".

The "harm" done by keeping these methods seems approximately insignificant. Serhiy certainly found examples where uses made no good sense, but that's _common_ among floating-point features. For example, here's a near-useless implementation of Newton's method for computing square roots:

    def mysqrt(x):
        guess = x / 2.0
        while guess ** 2 != x:
            guess = (guess + x / guess) / 2.0
        return guess

And here I'll use it:

    >>> mysqrt(25.0)
    5.0
    >>> mysqrt(25.2)
    5.019960159204453

Works great! Ship it :-)

    >>> mysqrt(25.1)

Oops. It just sits there, consuming cycles. That's because there is no IEEE double x such that x*x == 25.1. While that's not at all obvious, it's true.

Some people really have argued to deprecate (in)equality testing of floats because of "things like that", but that's fundamentally nuts. We may as well remove floats entirely then.

In short, that an fp feature can be misused, and _is_ misused, is no argument for deprecating it. If it can _only_ be misused, that's different, but that doesn't apply to is_integer. That someone - or even almost everyone - is merely annoyed by seeing an API they have no personal use for doesn't get close to "really big deal". The time to stop it was before it was added.

> PROS:
> ...
> 5. Make everyone happy and stop this thread :)

This thread ended before you replied to it - I'm just a ghost haunting its graveyard to keep you from feeling ignored ;-)
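(For contrast with the mysqrt example above, here's a version - my sketch - that always terminates: it tests for the fixed point of the iteration rather than for exact equality of guess**2 with x.)

    def mysqrt_fixed(x):
        assert x > 0.0
        guess = (x + 1.0) / 2.0   # >= sqrt(x), by AM-GM
        while True:
            new = (guess + x / guess) / 2.0
            if new >= guess:      # no more progress: done
                return guess
            guess = new

    print(mysqrt_fixed(25.1))     # terminates; agrees with math.sqrt(25.1)
                                  # to within 1 unit in the last place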
Re: [Python-Dev] Subtle difference between f-strings and str.format()
[Serhiy Storchaka]
> ...
> This is not new. The optimizer already changes semantics.
> Non-optimized "if a and True:" would call bool(a) twice, but optimized
> code calls it only once.

I have a hard time imagining how that could have come to be, but if it's true I'd say the unoptimized code was plain wrong. The dumbest possible way to implement `f() and g()` is also the correct ;-) way:

    result = f()
    if not bool(result):
        result = g()

For the thing you really care about here, the language guarantees `a` will be evaluated before `b` in:

    '{}{}'.format(a, b)

but I'm not sure it says anything about how the format operations are interleaved. So your proposed transformation is fine by me (your #3: still evaluate `a` before `b` but ignore that the format operations may occur in a different order with respect to those).
Re: [Python-Dev] Subtle difference between f-strings and str.format()
[Tim]
> I have a hard time imagining how that could have come to be, but if
> it's true I'd say the unoptimized code was plain wrong. The dumbest
> possible way to implement `f() and g()` is also the correct ;-) way:
>
>     result = f()
>     if not bool(result):
>         result = g()

Heh - that's entirely wrong, isn't it? That's how `or` is implemented ;-) Same top-level point, though:

    result = f()
    if bool(result):
        result = g()
Re: [Python-Dev] Subtle difference between f-strings and str.format()
[Tim]
> Same top-level point, though [for evaluating `f() and g()`]:
>
>     result = f()
>     if bool(result):
>         result = g()

Ah, I think I see your point now. In the _context_ of `if f() and g()`, the dumbest possible code generation would do the above, and then go on to do

    if bool(result):

If in fact `f()` returned a false-like value, an optimizer could note that `bool(result)` had already been evaluated and skip the redundant evaluation. I think that's fine either way: what the language guarantees is that `f()` will be evaluated exactly once, and `g()` no more than once, and that's all so regardless.
Re: [Python-Dev] Subtle difference between f-strings and str.format()
[Tim Delaney]
> ...
> If I'm not mistaken, #3 would result in the optimiser changing
> str.format() into an f-string in-place. Is this correct? We're not
> talking here about people manually changing the code from str.format()
> to f-strings, right?

All correct. It's a magical transformation from one spelling to another.

> I would argue that any optimisation needs to have the same semantics as
> the original code - in this case, that all arguments are evaluated
> before the string is formatted.

That's why Serhiy is asking about it - there _are_ potentially visible changes in behavior under all but one of his suggestions.

> I also assumed (not having actually used an f-string) that all its
> formatting arguments were evaluated before formatting.

It's a string - it doesn't have "arguments" as such. For example:

    def f(a, b, n):
        return f"{a+b:0{n}b}"   # the leading "f" makes it an f-string

Then

    >>> f(2, 3, 12)
    '000000000101'

The generated code currently interleaves evaluating expressions with formatting the results in a more-or-less obvious way, waiting until the end to paste all the formatted fragments together. As shown in the example, this can be more than one level deep (the example needs to paste together "0", str(n), and "b" to _build_ the format code for `a+b`).

> So my preference would be (if my understanding in the first line is
> correct):
>
> 1: +0

That's the only suggestion with no potentially visible changes. I'll add another: leave `.format()` alone entirely - there's no _need_ to "optimize" it, it's just a maybe-nice-to-have.

> 2a: +0.5
> 2b: +1

Those two don't change the behaviors of `.format()`, but _do_ change some end-case behaviors of f-strings. If you're overly ;-) concerned about the former, it would be consistent to be overly concerned about the latter too.

> 3: -1

And that's the one that proposes to let .format() also interleave expression evaluation (but still strictly "left to right") with formatting. If it were a general code transformation, I'm sure everyone would be -1. As is, it's hard to care. String formatting is a tiny area, and format methods are generally purely functional (no side effects). If anyone has a non-contrived example where the change would make a lick of real difference, they shouldn't be shy about posting it :-) I looked, and can't find any in my code.
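The interleaving is easy to observe with a __format__ method that announces itself (my example; output order as observed under CPython 3.6+):

    class Chatty:
        def __init__(self, name):
            self.name = name
        def __format__(self, spec):
            print("formatting", self.name)
            return self.name

    def make(name):
        print("evaluating", name)
        return Chatty(name)

    s = f"{make('A')}{make('B')}"
    # evaluating A / formatting A / evaluating B / formatting B

    s = "{}{}".format(make('A'), make('B'))
    # evaluating A / evaluating B / formatting A / formatting B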
Re: [Python-Dev] Subtle difference between f-strings and str.format()
[Steven D'Aprano ] > ... > Is there a down-side to 2b? It sounds like something you might end up > doing at a later date regardless of what you do now. There are always downsides ;-) As Serhiy noted later, the idea that "it's faster" is an educated guess - you can't know before it's implemented. Changes to the very complicated eval loop often have not only surprising speed consequences on one platform, but even consequences in opposite directions across platforms. Not necessarily in the part you directly changed, either. Optimizing C compilers just can't reliably guess what's most important in such a massive pile of test-and-branch laden code. Indeed, which paths through the eval loop _are_ most important depend on the Python program you're running at the time (which is, e.g., why "profile-guided optimization" was invented). So there's an ocean of potential complications there, and wading through those has opportunity cost too: Serhiy is a very productive contributor, but time he spends on this is time he won't be spending on other things of potentially greater value. That's all up to him, though. I'm not keen on changing the behavior of f-strings regardless (2a or 2b). While their implementation details aren't documented, they were intentional, and follow the pattern increasingly large parts of the language and std library adopted after the iterator protocol was introduced: compute intermediate results as they're needed, not all in advance. That's proved to have many advantages. It's certainly possible to write custom purely functional (no side effects) __format__ methods such that memory use in an f-string remains bounded under the current implementation, but can grow without bound if all __format__ arguments need to be evaluated before any formatting begins. It's akin to the difference between iterating over range() and xrange() in Python 2. I don't know that there's any real f-string code out there _relying_ on that - but don't know that there isn't either. It's more plausible to me than that there are non-functional real __format__ methods. I'd be happiest if no behaviors changed in anything. Then the only downsides to optimizing are code bloat, code churn, new bugs, subtler corner cases, less predictable behavior for end users, and increased implementation complexity forever after ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
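The interleaving described above is easy to observe with a side-effecting __format__ method; the class and function names here are invented for the demo:

    class Chatty:
        def __init__(self, tag):
            self.tag = tag
        def __format__(self, spec):
            print("formatting", self.tag)
            return self.tag

    def make(tag):
        print("evaluating", tag)
        return Chatty(tag)

    s = f"{make('a')} {make('b')}"
    # f-string: evaluating a, formatting a, evaluating b, formatting b
    t = "{} {}".format(make('a'), make('b'))
    # .format(): evaluating a, evaluating b, formatting a, formatting b

In the f-string each argument object can be released as soon as its fragment is formatted, which is the bounded-memory point above.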
Re: [Python-Dev] IDLE colorizer
[MRAB ] > A thread on python-ideas is talking about the prefixes of string literals, > and the regex used in IDLE. > > Line 25 of Lib\idlelib\colorizer.py is: > > stringprefix = r"(?i:\br|u|f|fr|rf|b|br|rb)?" > > which looks slightly wrong to me. > > The \b will apply only to the first choice. > > Shouldn't it be more like: > > stringprefix = r"(?:\b(?i:r|u|f|fr|rf|b|br|rb))?" > > ? I believe the change would capture its real intent. It doesn't seem to matter a whole lot, though - IDLE isn't a syntax checker, and applies heuristics to color on the fly based on best guesses. As is, if you type this fragment into an IDLE shell: kr"sdf" only the last 5 characters get "string colored", presumably because of the leading \br in the original regexp. But if you type in ku"sdf" the last 6 characters get "string colored", because - as you pointed out - the \b part of the original regexp has no effect on anything other than the r following \b. But in neither case is the fragment legit Python. If you do type in legit Python, it makes no difference (legit string literals always start at a word boundary, regardless of whether the regexp checks for that). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
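The coloring difference is easy to check with simplified versions of the two patterns (IDLE's real pattern also handles triple quotes and more, so this is just a sketch):

    import re

    old = re.compile(r'(?i:\br|u|f|fr|rf|b|br|rb)?"[^"]*"')
    new = re.compile(r'(?:\b(?i:r|u|f|fr|rf|b|br|rb))?"[^"]*"')

    print(old.search('kr"sdf"').group())  # "sdf"  - 5 chars: \b rejects the r after k
    print(old.search('ku"sdf"').group())  # u"sdf" - 6 chars: \b never guarded the u
    print(new.search('ku"sdf"').group())  # "sdf"  - 5 chars: \b now guards every prefix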
Re: [Python-Dev] PEP 572: Assignment Expressions
I'll channel that Guido would be happiest if this rule were followed: Given an assignment statement using "=", the meaning is the same if "=" is replaced with ":=". In particular, the expression at the far right is evaluated once, and - in case of chained assignments - is applied in turn to each target left-to-right. Otherwise the semantics of "=" and ":=" can be very different indeed. So, then, e.g., and assuming the rule above always applies: [Nick] > Tuple unpacking: > > What's the result type for "a, b, c := range(3)"? Is it a range() > object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, > 3)" or "(a, b, range(3))"? It's the range object range(3). Same as in: x = a, b, c = range(3) `x` is bound to the range object range(3). > Once you have your answer, what about "a, b, c := iter(range(3))" A range_iterator object, same as what `x` is bound to in: x = a, b, c = iter(range(3)) However, `list(x)` then returns an empty list, because iter(range(3)) was evaluated only once, and the iterator was run to exhaustion when unpacking it for the `a, b, c` target. > or "a, b, *c := range(10)"? The range object range(10). > Whichever answers we chose would be surprising at least some of the > time, so it seems simplest to disallow such ambiguous constructs, such > that the only possible interpretation is as "(a, b, range(3))" That's why Guido would be happiest with the rule at the top. "The answers" can already be surprising at times with current assignment statements, but they are well defined. It would be mondo bonkers to make up entirely different subtle answers ;-) > Subscript assignment: > > What's the final value of "result" in "seq = list(); result = > (seq[:] := range(3))"? Is it "range(3)"? Or is it "[1, 2, 3]"? As above, it's range(3). > As for tuple unpacking, does your preferred answer change for the > case of "seq[:] := iter(range(3))"? As above, a range_iterator object, but one that's already been run to exhaustion. > More generally, if I write "container[k] := value", does only > "type(container).__setitem__" get called, or does > "type(container).__getitem__" get called as well? The rule at the top implies __setitem__ is called once, and __getitem__ not at all. The value of the assignment is the object `value` was bound to at the start, regardless of how tricky __setitem__ may be. And in k := container[k] := value `k` is bound to `value` before `container[k]` is evaluated. Why? Because that's how assignment _statements_ have always worked. > Attribute assignment: > > If I write "obj.attr := value", does only "type(obj).__setattr__" > get called, or does "type(obj).__getattribute__" get called as well? As above, only __setattr__. > While I can't think of a simple obviously ambiguous example using > builtins or the standard library, result ambiguity exists even for the > attribute access case, since type or value coercion may occur either > when setting the attribute, or when retrieving it, so it makes a > difference as to whether a reference to the right hand side is passed > through directly as the assignment expression result, or if the > attribute is stored and then retrieved again. This is already defined for assignment statements. While the PEP doesn't say "and the same for assignment expressions", my guess is that it won't be accepted unless it does. Or, indeed, unless the target is limited to a name. But Guido wasn't keen on that.
In short, I think the PEP's chance of acceptance increases the _more_ assignment expressions act like assignment statements, not the less, and is highest if they act exactly the same (except for returning a value; e.g., while >>> a = 3 at a shell displays nothing, >>> a := 3 should display 3). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
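The assignment-statement behaviors being channeled above are easy to check today:

    x = a, b, c = range(3)
    print(x, a, b, c)        # range(0, 3) 0 1 2 - x gets the range object itself

    x = a, b, c = iter(range(3))
    print(a, b, c, list(x))  # 0 1 2 [] - evaluated once; unpacking exhausted it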
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> I'll channel that Guido would be happiest if this rule were followed: >> >> Given an assignment statement using "=", the meaning is the same if >> "=" is replaced with ":=". [Chris] > That's broadly the intention. At the moment, there are two exceptions: > > 1) Augmented assignment isn't a thing Doesn't have to be :-) "Augmented assignment statement" is already a different thing than "assignment statement" (for example, in an augmented assignment statement, there is no chaining, and the sole target can't be, e.g., a slice or any form of unpacking syntax). > 2) Chained assignment isn't a thing, which means that the assignments > operate right-to-left >> In particular, the expression at the far right is evaluated once, and >> - in case of chained assignments - is applied in turn to each target >> left-to-right. > I'll toy with this and see if I can implement it sanely. If so, > that'll eliminate one more distinction. >> Otherwise the semantics of "=" and ":=" can be very different indeed. > TBH, the common cases won't actually be much affected. Or at all! That's not the point here, though: if making assignment expressions work as exactly like assignment statements as possible is what's needed for the PEP to pass, it's the _annoying_ cases that have to be looked at. Personally, after considerable staring at my own code, I would be perfectly happy to settle for assignment expressions no fancier than identifier ":=" expression That alone covers over 99% of the cases I'd be tempted to use the new feature at all, and then gobs of general-case assignment-statement difficulties go away, including the "right-to-left or left-to-right?" distinction (there's no way to tell which order bindings happen in `x := y := z := 3` short of staring at the generated code). But so far I haven't gotten the impression that Guido is fond of that. He should be, though ;-) > You give this example: > > k := container[k] := value > > but that's not going to be more common. What I'm more likely to see is > something like this: Not about what's common, but about the full range of what's possible to express. ... [Nick] >>> Tuple unpacking: >>> >>> What's the result type for "a, b, c := range(3)"? Is it a range() >>> object? Or is it a 3-tuple? If it's a 3-tuple, is that 3-tuple "(1, 2, >>> 3)" or "(a, b, range(3))"? >> It's the range object range(3). Same as in: >> >> x = a, b, c = range(3) >> >> `x` is bound to the range object range(3). > At the moment, "x = a, b, c := range(3)" will set c to range(3), then > build a tuple of that with the existing values of a and b. You can, > however, parenthesize the (a, b, c) part, and then it'll behave as you > say. Which would be really annoying to "repair". >>> Whichever answers we chose would be surprising at least some of the >>> time, so it seems simplest to disallow such ambiguous constructs, such >>> that the only possible interpretation is as "(a, b, range(3))" >> That's why Guido would be happiest with the rule at the top. "The >> answers" can already be surprising at times with current assignment >> statements, but they are well defined. It would be mondo bonkers to >> make up entirely different subtle answers ;-) > Wholeheartedly agreed. I'd like Guido to chime in again, because I'm pretty sure he won't accept what's currently on the table. There are two plausible ways to repair that: 1. Continue down the road of making assignment expressions "exactly like" assignment statements in their full generality. 2. 
Back off and limit assignment expressions to what appears to be the overwhelmingly most common case motivated by looking at real code (as opposed to constructing examples to illustrate pitfalls & obscurities): identifier ":=" expression ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Guido, makes peace with `identifier := expression`] > ... > I am fine with this, it certainly seems the easiest to implement, with the > fewest corner cases, and the easiest restriction to explain. > > (I was thinking there would be a use case for basic tuple unpacking, like > seen a lot in for-loop, but the only examples I tried to come up with were > pretty sub-optimal, so I don't worry about that any more.) Chris's pain threshold appears to be higher than ours ;-) So I would really like to see if anyone has plausibly realistic uses for fancier forms of assignment expression. I have plenty of code that does stuff like this: while True: x, y = func_returning_tuple() if y is None: break ... Maybe it's just that I'm used to it, but I find that very easy to understand now. If we had fancy assignment expressions, my first thought was I could write it like so instead: while ((x, y) := func_returning_tuple()) and y is not None: ... and pray that I put in enough parens to get the intended meaning. And maybe it's just that I'm _not_ used to that, but I do find it harder to understand. Contributing factor: I don't really want "and" there - what the context requires is really more like C's comma operator (take only the last value from a sequence of expressions). As is, I'm relying on that a 2-tuple is truthy regardless of its content (so that `and` always goes on to evaluate its RHS). And, for some reason, I find this even worse: while ((x, y) := func_returning_tuple())[1] is not None: ... The rub there: I gave `y` a name but can't use it in the test?! And those are the same kinds of headaches I saw over & over in my own "fancier" code: stuff that's already perfectly clear would become more obscure instead. Tuple unpacking works great in for-loops because the only effect there is to give names to the tuple components, none of which are needed _in_ the `for` statement itself. But in a `while` or `if` statement, I would typically _also_ want to use the names _in_ the `while` or `if` tests. But, as in C, that's what the comma operator is for, not the assignment operator. while (s = function_returning_struct_with_x_and_y_members(), s.y != NULL) { ... } In contrast, many plausible uses I saw for `identifier := expression` in a `while` or `if` statement would have been improvements, and most of the rest neutral: I'm still wondering whether this one is better or worse ;-): def newton(f, fprime, x): import math while not math.isclose((next_x := x - f(x) / fprime(x)), x): x = next_x return next_x ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
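For what it's worth, that last example runs as-is on Python 3.8+, where ":=" was ultimately adopted; here's a quick check against the positive root of x**2 - 2:

    import math

    def newton(f, fprime, x):
        while not math.isclose((next_x := x - f(x) / fprime(x)), x):
            x = next_x
        return next_x

    print(newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0))  # ~1.41421356, i.e. sqrt(2)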
Re: [Python-Dev] PEP 572: Assignment Expressions
[Paul Moore] >> the next question will likely be "so why does = exist at all?" [Greg Ewing ] > And if we decide to make ':=' the official assignment operator and > deprecate '=', the next question will be "Why do we have '==' > instead of '='?" Which would be a fine question! In Python's very early days, it didn't have "==" at all: plain "=" was used for both assignment and equality testing. From the HISTORY file: """ New features in 0.9.6: ... - '==' is now the only equality operator; "../demo/scripts/eqfix.py" is a script that fixes old Python modules """ That script crawled a source tree and replaced instances of "=" used for equality testing with the new-fangled "==". We can obviously do something similar to replace instances of "=" used for assignment when that's removed, and I'm sure nobody will complain about that either ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> And, for some reason, I find this even worse: >> >> while ((x, y) := func_returning_tuple())[1] is not None: >> ... >> >> The rub there: I gave `y` a name but can't use it in the test?! >> >> And those are the same kinds of headaches I saw over & over in my own >> "fancier" code: stuff that's already perfectly clear would become >> more obscure instead. [Nick] > Whereas I think: > > while (s := func_returning_tuple())[1] is not None: > s = x, y > ... > > compares favourably with the loop-and-a-half version. Obviously not, since it really needs to be x, y = s instead ;-) In context, I was looking for realistic cases in which assignment expressions _fancier than_ identifier ":=" expression is a real improvement. You found an improvement instead by _replacing_ a "fancier than" instance with a plain-single-name target. I already have lots of examples from real code where plain-single-name target reads better to me. I don't have any yet from real code where something fancier does. In this specific case, I find your rewriting about as readable as the loop-and-a-half, except for the obvious drawback of the former: > It does make the guarantee that "y is not None" harder to spot than it > is in the loop-and-a-half version, though. Over time, the functions in the real codes from which the example was synthesized change, sometimes carrying more or less state in tuples. When that happens, the original x, y = s will helpfully blow up (size mismatch in unpacking). But, if the tuple length increased, is it still the case that I want to test the 1'th component? The test is now divorced from the unpacking. I do know that I'll still want to test the component I think of as being "the 'y' component", and the loop-and-a-half version accommodates that naturally. Then again, I could switch to new-fangled namedtuples instead, and do while (s := func_returning_tuple()).y is not None: to get the best of all worlds. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
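With a namedtuple stand-in for the hypothetical func_returning_tuple, that last version runs on Python 3.8+ (all names here are invented for the demo):

    from collections import namedtuple

    Pair = namedtuple("Pair", "x y")

    def func_returning_tuple(_feed=iter([Pair(1, 2), Pair(3, None)])):
        return next(_feed)  # canned data, standing in for real work

    while (s := func_returning_tuple()).y is not None:
        print(s.x, s.y)     # prints "1 2"; the Pair(3, None) ends the loop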
Re: [Python-Dev] PEP 572: Assignment Expressions
[Guido, about g(items[idx] := idx := f()) ] > Does the PEP currently propose to *allow* that horrible example? I thought > Tim Peters successfully pleaded to *only* allow a single "NAME := ". I was "successful" only in that the two of us agreed that would be far less disruptive, and quite possibly an actual improvement ;-) But I only argued for limiting assignment expressions to the form identifier ":=" expression I expected that, given that expressions "naturally nest", chained targets could still be specified: a := b := c := 5 but since they're all plain names there's no way to tell whether the bindings occur "left to right" or "right to left" short of staring at the generated code. I have no use case for chaining plain-name targets in assignment expressions, but didn't see a good reason to torture the implementation to forbid it. I expected chaining would just be an unused-in-practice possibility. Much like, e.g., a in b in c in d is an unused-in-practice possibility. And I'll take this opportunity to repeat the key point for me: I tried hard, but never found a single case based on staring at real code where allowing _fancier_ (than "plain name") targets would be a real improvement. In every case I thought it _might_ help, it turned out that it really didn't unless Python _also_ grew an analog to C's "comma operator" (take only the last result from a sequence of expressions). I'll also note that I asked if anyone else had a real-life example, and got no responses. There were lots of "real life" cases where plain-name targets allowed for code improvement, though. > You don't have to implement this restriction -- we know it's possible to > implement, and if specifying this alone were to pull enough people from -1 > to +0 there's a lot of hope! Given my experience with _trying_ to find use cases for fancier targets, and getting burned every time, I'm on the minus side of the current PEP, because - best I can tell - all the extra complexity would create an "attractive nuisance" :-( ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Chris Angelico ] > I don't see much value in restricting the assignment target to names > only, but if that's what it takes, it can be restricted, at least > initially. I believe this point was made most clearly before by Terry Reedy, but it bears repeating :-) This is from the PEP's motivation: """ Naming the result of an expression is an important part of programming, allowing a descriptive name to be used in place of a longer expression, and permitting reuse. """ As "head arguments" go, that's a good one! But restricting assignment expressions to identifier ":=" expression satisfies it. If what's of value is to name the result of an expression, that single case handles that and _only_ that. In a sense, it's "the simplest thing that could possibly work", and that's generally a good thing to aim for. Python assignment _statements_ are way more complex than that. Besides just giving names to expression results, they can also implicitly invoke arbitrarily complex __setitem__ and __setattr__ methods on targets, rely on all sorts of side effects across chained assignments, and support funky syntax for magically iterating over an expression's iterable result. While that can all be useful _in_ an assignment statement, the PEP's motivation doesn't say a word about why any of _that_ would also be useful buried inside an assignment expression. There doesn't appear to be a good "head argument" for why, besides "why not?". That's not enough. I think it's no coincidence that every example of an _intended_ use is of the simple identifier ":=" expression form. There are no examples of fancier targets in the PEP, and - more importantly - also none I saw in the hundreds of mailing-list messages since this started. Except for a few of mine, where I tried to demonstrate why _trying_ fancier targets in examples derived from real code made the original "loop and a half" code _worse_. And where other people were illustrating how incomprehensible code _could_ be written (which isn't a real interest of mine). Short course: e.g., while a general assignment expression can "unpack" an iterable expression result, giving names to its elements, there's no clean way to _use_ the names bound by the unpacking _in_ the "if" or "while" tests. That's fine for `for` loops (only the _body_ of the loop needs the names), but in conditional constructs you typically want to use the names _in_ the condition being tested. if ((a, b, c) := func_returning_triple()) and b > 0: process(a+b, b+c, a+c) seems to be as good as it gets, but inherently relies on "a trick": that a 3-tuple is always truthy, regardless of content. OTOH, if ((a, b, c) := func_returning_triple())[1] > 0: doesn't rely on a trick, but can't use the name `b` in the test(!). if [((a, b, c) := func_returning_triple()), b > 0][-1]: manages to avoid "a trick", and to use the natural `b > 0`, but is ... strained ;-) So, to my eyes, this is a clear improvement over all of those: a, b, c = func_returning_triple() if b > 0: process(a+b, b+c, a+c) Of course I could be cherry-picking a bad example there, but that's not the intent: I'm still waiting for anyone to post an example where a "fancy" assignment-expression target would actually make code clearer. I haven't found one. There are lots of examples when the target is a plain single name. Why the stark difference? I don't need deep theoretical reasons to see that there _is_ one, or to conclude that - in the absence of compelling use cases - complex assignment-expression targets are probably a Poor Idea. 
___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> And I'll take this opportunity to repeat the key point for me: I >> tried hard, but never found a single case based on staring at real >> code where allowing _fancier_ (than "plain name") targets would be a >> real improvement. In every case I thought it _might_ help, it turned >> out that it really didn't unless Python _also_ grew an analog to C's >> "comma operator" (take only the last result from a sequence of >> expressions). I'll also note that I asked if anyone else had a >> real-life example, and got no responses. [MRAB ] > Could a semicolon in a parenthesised expression be an equivalent to C's > "comma operator"? I expect it could, but it's been many years since I tried hacking Python's grammar, and I wouldn't want a comma operator anyway ;-) To recycle a recently-posted example, instead of one of these 3: if ((a, b, c) := func_returning_triple()) and b > 0: process(a+b, b+c, a+c) if ((a, b, c) := func_returning_triple())[1] > 0: if [((a, b, c) := func_returning_triple()), b > 0][-1]: ... it would allow this instead: if ((a, b, c) := func_returning_triple(); b > 0): ... That's better than any of the first three, but I'm not sure it's better than the original a, b, c = func_returning_triple() if b > 0: ... It _may_ be more readable in other complex-target examples, though. It's also what's wanted in one of the running plain-name target examples, _not_ involving a conditional context: r1, r2 = (D := sqrt(b**2 - 4*a*c); a2 := 2*a; ((-b+D)/a2, (-b-D)/a2)) And if I saw enough code like that, I'd write a PEP suggesting that Python introduce separate assignment statements where name bindings persisted across statement boundaries ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
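For comparison, here is the quadratic-roots computation in today's plain statements (a sketch assuming real roots, i.e. a nonnegative discriminant):

    import math

    def quadratic_roots(a, b, c):
        D = math.sqrt(b**2 - 4*a*c)  # square root of the discriminant
        a2 = 2*a
        return (-b + D) / a2, (-b - D) / a2

    print(quadratic_roots(1, -3, 2))  # (2.0, 1.0)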
Re: [Python-Dev] PEP 572: Assignment Expressions
[Matthew Woodcraft ] > I would like to suggest one more motivating example for "Capturing > condition values": multiple regex matches with 'elif'. > > if match := re.search(pat1, text): > print("Found one:", match.group(0)) > elif match := re.search(pat2, text): > print("Found two:", match.group(0)) > elif match := re.search(pat3, text): > print("Found three:", match.group(0)) > > Without assignment expressions, you have an annoying choice between a > cascade of 'else's with an ever-increasing indent and evaluating all the > matches up front (so doing unnecessary work). That's a reasonable use, but would more likely be written like so today: for tag, pat in (("one", pat1), ("two", pat2), ("three", pat3), ("four", pat4), ...): match = re.search(pat, text) if match: print("Found", tag + ":", match.group(0)) break Which would still read a bit nicer if the first two loop body lines could be collapsed to if match := re.search(pat, text): ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
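A runnable version of the collapsed loop, with invented patterns and text, on Python 3.8+ where that spelling became legal:

    import re

    text = "found three: widgets"
    for tag, pat in (("one", r"one:\s*\w+"),
                     ("two", r"two:\s*\w+"),
                     ("three", r"three:\s*\w+")):
        if match := re.search(pat, text):
            print("Found", tag + ":", match.group(0))  # Found three: three: widgets
            break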
Re: [Python-Dev] PEP 572: Assignment Expressions
[Christoph Groth ] > Tim, thanks for this clear analysis. Here's the best use case of more > general assignment expressions that I can come up with (from real code > I'm currently working on): > > class Basis: > def __init__(self, parent, periods=()): > self._parent = parent > if len(self._periods := np.asarray(periods, int)): > ... > else: > # In absence of periods, treat them as an (0, n)-shaped array. > # This avoids a special code path below. > self._periods = np.empty((0, len(parent.periods)), int) > > But since this is a weak counterexample, it actually serves to > strengthen your point that > > identifier ":=" expression > > is all that is needed. That's a decent example. In truth, I have no real objection to binding an attribute - but am willing to throw out a bit of soap with the bathwater if doing so can avoid throwing the baby out too ;-) > Such minimal assignment expressions have the (IMHO important) advantage > of not being inconsistent with assignment statements. > > > > Still, it seems weird to have two different ways of binding names in the > language where one would be sufficient (i.e. the old one would remain > only for backwards compatibility). From the point of view of someone > who's new to the language that's two things to learn instead of just > one. But they're very different in a key respect: the value of an assignment expression is the value assigned. Asking "what's the value of a statement?" doesn't even make sense in Python (whether an assignment statement or any other kind of statement). For that reason, _if_ a PEP is reworked to suggest a "binding expression" (I'd prefer the name change to nudge people away from conflating it with the far more general assignment statement), the usage pragmatics are clear: use a binding expression if the context requires using the value bound, else use a simple assignment statement. ":=" doesn't _just_ mean "bind the simple name on the left" in that world, but also "and return the value of the expression on the right". For that reason, e.g., i = 1 would be strongly preferred to i := 1 as a standalone line, except perhaps when typing at an interactive shell (where you may _want_ to see the value being bound - but usually don't). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> I expected that, given that expressions "naturally nest", chained >> targets could still be specified: >> >> a := b := c := 5 >> >> but since they're all plain names there's no way to tell whether the >> bindings occur "left to right" or "right to left" short of staring at >> the generated code. [Nick Coghlan ] > The fact that class namespaces are ordered by default now allows us to > demonstrate the order of multiple target assignments and tuple > unpacking without staring at generated code: > > >>> class AssignmentOrder: > ... a = b = c = 0 > ... d, e, f = range(3) > ... > >>> class ReversedAssignmentOrder: > ... c = b = a = 0 > ... f, e, d = range(3) > ... > >>> [attr for attr in AssignmentOrder.__dict__ if not attr.startswith("_")] > ['a', 'b', 'c', 'd', 'e', 'f'] > >>> [attr for attr in ReversedAssignmentOrder.__dict__ if not > >>> attr.startswith("_")] > ['c', 'b', 'a', 'f', 'e', 'd'] > > So that's a situation where "name = alias = value" could end up > matching "alias := name := value" Cool! So this is really a killer-strong argument for getting rid of classes - way overdue, too ;-) > (Even in earlier versions, you can illustrate the same assignment > ordering behaviour with the enum module, and there it makes even more > of a difference, as it affects which name binding is considered the > canonical name, and which are considered aliases). So if binding expressions can be chained, they'll need to ape "left-to-right" binding order. Or they can't be allowed to chain to begin with. Either way would be fine by me. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Christoph Groth ] >> > Still, it seems weird to have two different ways of binding names in >> > the language where one would be sufficient (i.e. the old one would >> > remain only for backwards compatibility). From the point of view of >> > someone who's new to the language that's two things to learn instead >> > of just one. [Tim] >> But they're very different in a key respect. the value of an >> assignment expression is the value assigned. Asking "what's the value >> of a statement?" doesn't even make sense in Python (whether an >> assignment statement or any other kind of statement). [Christoph] > There are also no function call statements in Python. People are > happily using function call expressions as statements when not > interested in their value. Sure. > I hope to have shown [1] that the same could be done for assignments. A > consistent value can be defined for any assignment statement. So, all > assignment statements could be redefined as expressions and the language > would continue to work and even be (perfectly?) backwards-compatible. Except for shells. When I type, e.g., >>> xs = sorted(iterator_returning_a_billion_strings) I really don't want to wait for hours before I can type again ;-) In the same way now, when someone calls a function at a shell but doesn't want to see its result, they do something like >>> xxx = function(a, b, c) knowing that an assignment statement never displays any output on its own. If an assignment statement did return a result, almost all shells would display it. Shells typically don't care at all what you typed at them, they just care whether or not executing the compiled code returns None: result = execute_code() if result is not None: display(repr(result)) There's also that you're not considering the other half: that every existing assignment statement could be viewed as being as expression does not imply that every existing assignment statement could be used everywhere an expression can be used. Syntax matters, and function call argument lists in particular already bristle with their own meanings for commas, equal signs, and asterisks. The language was designed with "and the twain shall never meet" in mind ;-) For example, what would f(a=b) mean? The worst possible ;-) answer is "well, since a=b is fine as an assignment statement, it must mean that we bind the value of `b` to name `a` and then pass b's value to `f()` as its first positional argument". That reading would break countless lines of code using keyword arguments. If you're willing to concede that's too much breakage to bear, then you have to identify and spell out "the rules" for every case in which something that "looks like an assignment expression really isn't, depending on context". But since I have no interest in pursuing this, I'll stop there :-) > Syntax-wise, if replacing = by := everywhere is unthinkable, as it > seems, there's still the possibility (not completely ruled out by Guido > ;-) to use = for assignment expressions but require extra parens for > safety. That would be received less well than the current PEP. The people it would hurt the most are newcomers from other languages who habitually put _every_ "if" and "while" test in parentheses, because that's what they're used to doing (e.g., in C). Many of us still remember our initial relief when we realized we'd never piss away hours debugging an assert(n=1) or if (x=0.0) typo/thinko again. 
Reintroducing that possibility would get an instant -1 from me, because I don't want to debug that same mistake for other people on Stackoverflow either - my time there is wholly consumed by explaining why .1 + .2 doesn't display exactly "0.3" ;-) > Thus, it seems to me that redefining assignments as expressions > everywhere is a feasible, if radical, idea. Compared to a dedicated > syntax for "binding expressions" it would be conceptually simpler, but > would provide more possibilities to shoot oneself in the foot. As above, it wouldn't remain so simple after hammering out the detailed rules for deciding when and where something that "looks like an assignment expression" really is one. For an example of a fine language that makes no distinction between "statements" and "expressions" at all, Icon is top on my list. That _can_ work out fine - but Icon was designed that way from the start. And, of course, like every sane language that has wholly general assignment expressions, Icon uses ":=" as the assignment operator, and "=" for numeric equality testing ;-) > [1] https://mail.python.org/pipermail/python-dev/2018-April/152780.html ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Matthew Woodcraft] >>> Well, that's a reason to make the example a bit more realistic, then. >>> >>> Say: >>> >>> if match := re.search(pat1, text): >>> do_something_with(match.group(0)) >>> elif match := re.search(pat2, text): >>> do_something_else_with(match.group(0), match.group(1)) >>> elif match := re.search(pat3, text): >>> do_some_other_things_with(match.group(0)) >>> and_also_with(match.group(1), match.group(2)) [Steven D'Aprano ] >> I don't think that a bunch of generic "do_something_with" functions is >> precisely "realistic". >> >> If I saw something like that, I'd try very hard to find a way to >> refactor it into code like this: >> >> for handler in handlers: >> if handler.match(text): >> handler.process() >> break >> else: >> # handle no-match case here >> >> where the knowledge of what to search for, where to search for it, how >> to search for it, and what to do when found, was encapsulated in the >> handler objects. Your tastes may vary. >> >> But your point is well-taken that the version with binding assignment >> (thanks Tim!) is nicer to read than the current procedural version: >> >> match = re.search(pat1, text) >> if match: >> do_something_with(match.group(0)) >> else: >> match = re.search(pat2, text) >> if match: >> do_something_else_with(match.group(0), match.group(1)) >> else: >> match = re.search(pat3, text) >> do_some_other_things_with(match.group(0)) >> and_also_with(match.group(1), match.group(2)) >> >> I just don't think it counts as a motivating use-case distinct from the >> single match case. [Guido] > The version of this code found in reality is not as regular as the example > quoted, and the rebuttal "but I would rewrite it with a loop" shoots a straw > man. To me the if-elif-elif portion of the example is very much a separate > motivation, since being able to put the assignment in the elif clause avoids > runaway indentation. I've regretted not being able to use elif in this kind > of situation many times, whereas in the single match case I don't find it a > burden to assign the variable in a separate statement preceding the > if-clause. (I guess this is a case of "flat is better than nested" -- thanks > Tim! :-) Au contraire - thank you for forcing me to channel you succinctly lo those many years ago ;-) And for pointing out this real use case, which I'm not sure has been stressed before. The PEP could clearly use more motivating examples, and this is a fine class of them. Few things are more maddening than runaway cascading indentation :-( And noting again that a simple "binding expression" (my neologism for `identifier ":=" expression`, to break the reflexive horror at imagining the full complexity of assignment statements being allowed everywhere expressions are allowed) is sufficient to address it. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
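A compact runnable instance of the elif pattern Guido describes (Python 3.8+; the patterns and text are invented for the demo):

    import re

    line = "colour: red"
    if m := re.match(r"color:\s*(\w+)", line):
        print("US spelling:", m.group(1))
    elif m := re.match(r"colour:\s*(\w+)", line):
        print("UK spelling:", m.group(1))  # UK spelling: red
    else:
        print("no match")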
Re: [Python-Dev] PEP 572: Assignment Expressions
[Guido] > In reality there often are other conditions being applied to the match for > which `if expr as name` is inadequate. The simplest would be something like > > if ...: > > elif (m := re.match('(.*):(.*)', line)) and m.group(1) == m.group(2): > > > And the match() call may not even be the first thing to check -- e.g. we > could have > > elif line is not None and (m := re.match('(.*):(.*)', line)) and > m.group(1) == m.group(2): I find myself warming more to binding expressions the more I keep them in mind while writing new code. And I think it may be helpful to continue showing real examples where they would help. Today's example: I happened to code this a few hours ago: diff = x - x_base if diff: g = gcd(diff, n) if g > 1: return g It's not really hard to follow, but two levels of nesting "feels excessive", as does using the names "diff" and "g" three times each. It's _really_ an "and" test: if the diff isn't 0 and gcd(diff, n) > 1, return the gcd. That's how I _thought_ of it from the start. Which this alternative expresses directly: if (diff := x - x_base) and (g := gcd(diff, n)) > 1: return g That's so Pythonic I could cry ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
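As a self-contained sketch of that last snippet (the function name and sample numbers are invented; Python 3.8+):

    from math import gcd

    def nontrivial_factor(x, x_base, n):
        if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
            return g
        return None

    print(nontrivial_factor(10, 4, 9))  # 3 == gcd(6, 9)
    print(nontrivial_factor(4, 4, 9))   # None - diff is 0, so gcd is never evaluated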
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> Which this alternative expresses directly: >> >> if (diff := x - x_base) and (g := gcd(diff, n)) > 1: >> return g >> >> That's so Pythonic I could cry ;-) [Antoine] > It looks like C to me. That won't make me cry (I write C++ code daily > these days), but it's certainly not the same language as Python. > > The second part, especially, where you use the result of an > assignment expression as a comparison operand, looks definitely > un-Pythonic. You snipped the part explaining _what's_ "Pythonic" about it: It's _really_ an "and" test: if the diff isn't 0 and gcd(diff, n) > 1, return the gcd. That's how I _thought_ of it from the start. "Expresses directly" is the Pythonic part; the syntax is minor to me. Seeing that the _intent_ is an "and test" is a pattern-matching puzzle in the original spelling (which essentially turned me into a compiler, writing low-level code for the _concepts_ I had in mind from the start): diff = x - x_base if diff: g = gcd(diff, n) if g > 1: return g But note that the part of the PEP I support is just the "binding expression" part: giving a simple name (binding an identifier) to the result of an expression. I don't want the full potential complexity of assignment statements in expressions. There's nothing "un-Pythonic" about merely giving a name to an expression result, apart from that there are few contexts that currently support that in a sanely usable way. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Sven R. Kunze ] > What about > > diff = x - x_base > if diff and gcd(diff, n) > 1: > return gcd(diff, n) > > # or > > if (x - x_base) and gcd(x - x_base, n) > 1: > return gcd(x - x_base, n) > > > and have the interpreter handle the optimization, or apply an lru_cache? ;-) Surely you're joking. This is math.gcd(), which is expensive for multi-thousand bit integers, and the interpreter knows nothing about it. Adding a cache of _any_ kind (LRU or otherwise) would make it even slower (in the application, there's no reason to expect that x - x_base will repeat a value before O(sqrt(n)) iterations, which itself can be thousands of bits - a cache hit would be a miracle). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> Surely you're joking. This is math.gcd(), which is expensive for >> multi-thousand bit integers, and the interpreter knows nothing about >> it. Adding a cache of _any_ kind (LRU or otherwise) would make it >> even slower. [Sven R. Kunze ] > Alright, if that problem is just about performance, It's not, but others had already pointed out that it's generally considered Poor Practice (and for good reasons) to textually repeat expressions, so I didn't echo that. Even in purely functional languages, where textually equal snippets are guaranteed to evaluate to the same result every time, "give these expressions these brief names" constructs are heavily used (see, e.g., "let" and "where" in Haskell). > then there must be a better way to resolve it rather than > inventing a new syntax. Why? "Give the result of an expression a name" is already heavily used in Python - it's just that the _contexts_ in which it can be done are very limited now. > Btw. storing the result in a local var is also a cache IMHO. And > if gcd is immutable, I think Python can do a great job here of > optimizing. After decades, CPython still does nothing of the sort, short of having eventually made, e.g., "None" and "True" and "False" reserved words so at least it can optimize uses of those. It knows nothing at all about which library functions are pure - and there's no code in the implementation currently capable of exploiting such information even if it were known. That remains a fantasy in CPython. > Anyway, your example is the best one I've seen so far. Guido gave better ones, where binding expressions would allow collapsing arbitrarily deep levels of nesting to just one (if ... elif ... elif ... elif ...). My example only eliminated a single level of artificial indentation. But my example did have the advantage of being taken verbatim from actual, working code ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Ethan Furman ] > So I really like being able to make the assignment in the expression, but I > have a really hard time parsing it with the name first. > > ... > > On the other hand, if it were using the "as" keyword: > > if (x - xbase as diff) and (gcd(diff, n) as g) > 1: > return g > > I would parse as: > > if > x - x_base > as diff > and > gcd(diff, n) > as g > > 1: > return g > > For me at least, the last is much more readable. Thinking about it some > more, the problem (or maybe just my problem) is that I see an "if" or > "while" and the I look for the thing that is True or False, and using the > ":=" syntax the first thing I see is a placeholder for a result that doesn't > exist yet, making me constantly scan backwards and forwards to put all the > pieces in the correct place. > > With "as", it just flows forwards. I can read it fine either way, and don't much care. A possible advantage of an "as" operator is that its precedence could be set to bind just a tad stronger than comparisons (which include "is" and "is not" in Python), and then, e.g., if f() as result is not None: do something with result could work as intended. So long as people can't get "assignment _statements_" out of their heads, if result := f() is not None: groups instead as if result := (f() is not None): which would almost never be _intended_. Maybe spelling it "as" instead could break that. However, against "as" is that its current use in "with" statements does something quite different: with f() as name: does not bind the result of `f()` to `name`, but the result of `f().__enter__()`. Whether that "should be" fatal, I don't know, but it's at least annoying ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
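The grouping pitfall is real in the ":=" spelling that was eventually adopted (Python 3.8+):

    def f():
        return 0        # false-like, but not None

    if result := f() is not None:   # parses as result := (f() is not None)
        pass
    print(result)                   # True - captured the comparison, not f()

    if (result := f()) is not None: # parentheses give the intended grouping
        pass
    print(result)                   # 0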
Re: [Python-Dev] PEP 572: Assignment Expressions
[Tim] >> if (diff := x - x_base) and (g := gcd(diff, n)) > 1: >> return g [Greg Ewing ] > My problem with this is -- how do you read such code out loud? In the message in which I first gave that example: if the diff isn't 0 and gcd(diff, n) > 1, return the gcd. That's how I _thought_ of it from the start. In my mind, `x - x_base` doesn't even exist except as a low-level definition of what "diff" means. It's different for the other test: _there_ `g` doesn't exist except as a shorthand for "the gcd". In one case it's the name that's important to me, and in the other case the expression. The entire function from which this came is doing all arithmetic modulo `n`, so `n` isn't in my mind either - it's a ubiquitous part of the background in this specific function. But you did ask how _I_ would read that code ;-) Anyone else is free to read it however they like. I naturally read it in the way that makes most sense to me in its context. > From my Pascal days I'm used to reading ":=" as "becomes". So > this says: > >"If diff becomes x - base and g becomes gcd(diff, n) is > greater than 1 then return g." > > But "diff becomes x - base" is not what we're testing! I don't really follow that. In Python, if f() and g > 1: first tests whether `f()` "is truthy", regardless of whether it does or doesn't appear in a binding expression. Because this code is working with integers, there's an _implied_ "!= 0" comparison. > That makes it sound like the result of x - base may or may not > get assigned to diff, which is not what's happening at all. Then I suggest the problem you're having doesn't stem from the binding expression, but from that you're omitting to fill in the != 0 part: if you're not thrown by "greater than 1", I can't see how you can be thrown by "not zero". > The "as" variant makes more sense when you read it as an > English sentence: > >if ((x - x_base) as diff) and ... > >"If x - x_base (and by the way, I'm going to call that > diff so I can refer to it later) is not zero ..." So read the original as "if diff (which is x - x_base) is not zero ...". Regardless, Guido has already said "as" is DOA (Dead On Arrival) (illustrating that it's also common enough in English to give a short name before its long-winded meaning ;-) ). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Steve Holden ] >> ... >> The assignment expression seems like a very natural way to introduce >> variables of limited (controlled?) scope, [...] [Antoine Pitrou ] > AFAIU, the scope isn't limited to the "if" block, it's a regular local > variable. I might have misread. You're right about the current version of the PEP. No new scoping rules are introduced. The PEP does suggest some changes to corner case scoping semantics, though. > ... > Regardless, my three questions about this are: > - does it make Python more powerful? Goodness no. > - does it make Python more readable? There are cases where it would, and cases where it wouldn't. People shouldn't use it in the latter cases ;-) I very recently wrote this block of code: outside = p2units[p1][tgt_kind] - units[2] if outside: if not all(self.crossout(q, n, undo) for q in outside): return False The opening pair is a very common minor annoyance; it's marginally more readable like so: if outside := p2units[p1][tgt_kind] - units[2]: Saving an essentially useless line with a duplicated name is worth something to me, because it comes up so very often. But that's indeed "minor". In my diff/gcd example, it reduced 5 lines of code to 2; saved a level of annoying (semantically misleading) indentation; and cut the number of instances of both "diff" and "g" from 3 each to 2 each (ideal: one each to bind the name, and then one each to use the name later). That's relatively substantial by any measure. In Guido's if/elif/elif/elif/elif ... complex text processing example template, it can save an unbounded number of semantically needless indentation levels. So the readability benefits can range from highly negative to highly positive. > - does it make Python easier to learn and teach? By whom? Almost no addition has ever made a language easier to learn for raw beginners: every addition is something they eventually need to learn. We could make Python easier to learn for beginners by throwing out virtually everything added since version 0.9.6 ;-) But people coming _from_ most other very widely used languages (C, C++, Java, Javascript, Perl, ...) are already familiar with assignment expressions. The limited (to a plain identifier target) "binding expression" PEP simplification I favor would be nothing new to them at all (whereas the full complexity of Python's assignment statements is indeed beyond what they're used to, but needs to be taught & learned regardless of this PEP's fate). At least when restricted to binding expressions, the syntax is simple and the semantics are the very simplest case of what people (convert or raw beginner) need to learn for Python's assignment statements regardless. > My answer would be "no" to all three, but YMMV. And it did ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Antoine] >>> - does it make Python easier to learn and teach? [Tim] >> By whom? Almost no addition has ever made a language easier to learn >> for raw beginners: every addition is something they eventually need >> to learn. We could make Python easier to learn for beginners by >> throwing out virtually everything added since version 0.9.6 ;-) [Antoine] > Constructs like "with ..." or "try / except / finally" make the > language easier to learn compared to the dances they are meant to > replace. They nevertheless need to be taught & learned (and try/except/finally was essentially always in the language). You snipped the parts pointing out that binding expressions are already familiar to people coming from most other languages, and even for raw beginners the semantics are the tiniest part of what they need to learn anyway about Python's assignment expressions. So that was my point: they don't make Python any harder to learn or teach. To the contrary, for people coming from other languages, it's one less thing they'd have to _un_learn. > "await" is a more readable and less confusing improvement > over "yield from". Heh. Not to me. I literally have no idea what to do with "await" (I use generators heavily, but have had no use yet for coroutines), but use yield from an_iterable routinely. That use is perfectly clear, to the point that it _can't_ be improved on: it already does exactly what I want from it, with no effort at all. It's simply impossible that, whatever "await" does, it could be more readable or less confusing than what I use "yield from" for. > Format strings dispense from the older, more convoluted formulations. But they didn't _replace_ them. They're Yet Another Way to Format Strings everyone has to learn. That made teaching/learning harder, not easier, but you can make a case they make Python easier to _use_ for people who eschew the older forms. > Iteration is much simpler than the longer forms we would have to write > if generalized iterators didn't exist. I'll buy that one. Now go through the HISTORY file and count all the changes you didn't name ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
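The trivial replacement and the delegation really are interchangeable for plain iteration (yield from additionally forwards send() and throw(), which this use doesn't need):

    def inner():
        yield 1
        yield 2

    def flat_manual(it):
        for x in it:     # the spelled-out version
            yield x

    def flat_delegate(it):
        yield from it    # what "yield from" saves

    print(list(flat_manual(inner())), list(flat_delegate(inner())))  # [1, 2] [1, 2]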
Re: [Python-Dev] PEP 572: Assignment Expressions
[Antoine Pitrou ] > ... > Having to break things out over multiple lines is a fact of life, if > only for readability when implementing (and maintaining!) non-trivial > processing routines. It's a good thing to be used to it, and to learn to > choose good names for intermediate variables. Well, the last part is overselling: by its very nature, a binding expression does not relieve the programmer one whit from needing to pick good names. The name is part of the binding expression. The sheer number of names needed is the same with or without binding expressions, although the latter allow for less repetitive typing (& reading) of those names. For the rest, _needing_ to split a simple bind-and-test across two lines doesn't really build character, or have any other virtue (besides familiarity to old-time Python programmers) I can see. Neither does falling into indentation hell have any virtue in the rarer cases where binding expressions really shine. Simple things _should_ be simple to do; indeed, when they are, that's an incentive to keep things simple. There will still be plenty of code where splitting multiple bindings across multiple lines is obviously better. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Stephen J. Turnbull] >> Neologisms are usually written in the other order: >> "dead on arrival (DOA, for short)." ;-) [Greg Ewing ] > Maybe we can make use of that? > >if (x - x_base) (diff) and gcd(diff, n) (g) > 1: > > That doesn't work, because the (...) look like function > calls. But what if we used a different set of bracketing > characters: > >if (x - x_base) {diff} and gcd(diff, n) {g} > 1: > > I think that's unambiguous, because you can't currently > put {...} straight after an expression. As Guido noted more than once when this was still on python-ideas, this isn't a "puzzle" to be solved by any technical tricks conceivable. He's not going to accept anything in his language that isn't at least plausibly evident. There's a long & distinguished history of other languages using ":=" for binding, which is why that one gained traction before this moved to python-dev. > To make it look even more like a neologism definition, > we could require the bound names to be all-uppercase. :-) > >if (x - x_base) {DIFF} and gcd(DIFF, n) {G} > 1: > return G Yes - now you're on the right track ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Assignment Expressions
[Antoine] > ... > Yes... I think most will agree that Python is generally easy to take up > for people coming from C++ etc., so my "easier to learn and teach" was > mostly about non-programmers. [Tim] >> even for raw beginners the semantics are the tiniest part of what >> they need to learn anyway about Python's assignment expressions. > I'm not sure what you mean by that. If it's the tiniest part, what's > the overwhelming part? I was hoping it was clear from context that I was talking about "binding expressions", not the PEP's wholly general "assignment expressions". > Is the new assignment expression that delicate to use that it requires > reading a long and intimidating design document? I didn't get that > impression, so it seems you may be making a stronger point than me > for rejecting the PEP :-) I'm -1 myself on the PEP's assignment expressions, because there are no compelling use cases yet for any but the simplest ("binding expressions") cases. And, yes, understanding Python's assignment statements is challenging. Just understanding their grammar is challenging:

    assignment_stmt ::= (target_list "=")+ (starred_expression | yield_expression)
    target_list     ::= target ("," target)* [","]
    target          ::= identifier
                        | "(" [target_list] ")"
                        | "[" [target_list] "]"
                        | attributeref
                        | subscription
                        | slicing
                        | "*" target

Followed by pages of dense text explaining what all those possibilities mean. A binding expression is more like:

    binding_expression ::= identifier ":=" expression

and the only part of the assignment statement docs needed to explain the meaning is the brief "If the target is an identifier (name)" section, augmented with "and the value of `expression` is the value of the binding expression". If someone has learned what i = 1 means, they already learned almost all of what binding expressions mean too. The target in a binding expression can't be more complicated than the `i` in that example. >>> "await" is a more readable and less confusing improvement >>> over "yield from". >> Heh. Not to me. I literally have no idea what to do with "await" >> (I use generators heavily, but have had no use yet for coroutines), >> but use >> >> yield from an_iterable >> >> routinely. > Yeah... "yield from" is fine for that, except that it was explicitly > meant for the coroutine use case as well (I'm not sure what the > timeline is, but probably Guido was already thinking/dreaming about > tulip/asyncio back then). And trying to shoehorn both in a single > construct made it confusing and inadequate. > > When you want to express two abstractly different concepts (generating > a stream of values, or suspending a task until some asynchronous subtask > finishes), it makes things easier if those two concepts have two > different concrete expressions. Hence "await" making the language > easier to learn for those whose use cases benefit from it. All of which I remain blissfully unaware of :-) ... >> It's simply impossible that, whatever "await" does, it >> could be more readable or less confusing than what I use "yield from" >> for. > Probably because "await" wouldn't work at all for you, then :-) I'm glad people who need "await" got it - they'd have to pry _my_ uses of "yield from" from my cold, dead fingers ;-) Despite that all my uses could be trivially replaced by

    for _ in an_iterable:
        yield _

"yield from" saves typing, indentation, and conceptual noise for me. It's the "binding expressions" of nested generators ;-) >>> Format strings dispense from the older, more convoluted formulations. 
>> But they didn't _replace_ them. That made teaching/learning harder, >> not easier, > Intuitively, it sounds easier to teach f'some {value}' rather than > either the .format() or %-formatting alternatives. The whole goal of > f-strings, after all, is to make string formatting more approachable. > > Learning a language is not learning the whole spec. When you learn C, > you don't need to learn the oddities of pre-ANSI function > declarations :-) A difference is that there still are mountains of code using earlier string formatting methods, and my guess is that there always will be. f-strings aren't always "better". For example, any number of generators (including the combinatoric generators from itertools) yield a sequence of tuples, and

    format_string % a_tuple

is often the simplest way to format the tuple components. Breaking the tuple apart first, whether via explicit indexing in an f-string, or via unpacking into a tuple of names for use in an f-string, is often needless complication. So % formatting needs to be learned by anyone who wants to read _other_ people's code. Then again, that's fine by me, because I don't really care whether something new needs to be learned. What I do care about is whether the benefits exceed the costs.
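A small sketch of the tuple-formatting point (the generator and format strings here are invented for illustration):

    from itertools import combinations

    for pair in combinations(range(4), 2):
        # %-formatting consumes the yielded tuple directly:
        print("%d < %d" % pair)
        # while the f-string spelling has to break the tuple apart first:
        # a, b = pair
        # print(f"{a} < {b}")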
Re: [Python-Dev] PEP 572: Assignment Expressions
[Chris Angelico ] > Hopefully you have seen, or soon will see, the latest posting of the > PEP, in which assignment targets are restricted to simple names. :) I haven't yet, but look forward to it! You have the patience of a saint to endure all this - I would have given up 6 years ago ;-) > Though I still talk about "assignment expressions". I don't see a > problem with calling them that, but I also don't see a problem with > calling them "binding expressions" if you prefer. It's psychology ;-) So long as the PEP calls them assignment expressions, people are going to imagine facing the horrors of things like the current

    *b, c = a[c] = a

assignment statement buried deep inside expressions. But in conventional use, "binding" is restricted to identifiers, which vastly simplifies the mental model for "the worst" that can happen. Since fear is the most potent motivator, "don't scare people" is rule #1 ;-) But, in the absence of Guido chiming in, it's really up to you. A few people have expressed positive feelings about changing the name to "binding expressions", and none opposed it (that I saw), but the sample size is too small to claim that "proves" anything. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
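For the record, a hedged sketch of the gap between the two mental models (the subscript case is what the simple-name restriction rules out; it is rejected outright in the ":=" that eventually shipped):

    a = [0, 1, 2]
    c = 0
    *b, c = a[c] = a    # an assignment statement: two targets, assigned left to right

    (x := a)            # a binding expression: a single plain name is fine
    # (a[0] := 1)       # forbidden: only an identifier may appear left of ":="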
Re: [Python-Dev] PEP 572: Write vs Read, Understand and Control Flow
[Victor Stinner] ... > Tim Peters gave the following example. "LONG" version:
>
>     diff = x - x_base
>     if diff:
>         g = gcd(diff, n)
>         if g > 1:
>             return g
>
> versus the "SHORT" version:
>
>     if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
>         return g
>
> == Write == > > If your job is to write code: the SHORT version can be preferred since > it's closer to what you have in mind and the code is shorter. When you > read your own code, it seems straightforward and you like to see > everything on the same line. All so, but a bit more: in context, this is just one block in a complex algorithm. The amount of _vertical_ screen space it consumes directly affects how much of what comes before and after it can be seen without scrolling. Understanding this one block in isolation is approximately useless unless you can also see how it fits into the whole. Saving 3 lines of 5 is substantial, but it's more often saving 1 of 5 or 6. Regardless, they add up. > The LONG version looks like your expressiveness is limited by the > computer. It's like having to use simple words when you talk to a > child, because a child is unable to understand more subtle and > advanced sentences. You want to write beautiful code for adults, > right? I want _the whole_ to be as transparent as possible. That's a complicated balancing act in practice. > == Read and Understand == > > In my professional experience, I spent most of my time on reading > code, rather than writing code. By reading, I mean: try to understand > why this specific bug that cannot occur... is always reproduced by the > customer, whereas we fail to reproduce it in our test lab :-) This bug > is impossible, you know it, right? > > So let's say that you never read the example before, and it has a bug. Then you're screwed - pay me to fix it ;-) Seriously, as above, this block on its own is senseless without understanding both the mathematics behind what it's doing, and how all the code before it picked `x` and `x_base` to begin with. > By "reading the code", I really mean understanding here. In your > opinion, which version is easier to *understand*, without actually > running the code? Honestly, I find the shorter version a bit easier to understand: fewer indentation levels, and less semantically empty repetition of names. > IMHO the LONG version is simpler to understand, since the code is > straightforward, it's easy to "guess" the *control flow* (guess in > which order instructions will be executed). You're saying you don't know that in "x and y" Python evaluates x first, and only evaluates y if x "is truthy"? Sorry, but this seems trivial to me in either spelling. > Print the code on paper and try to draw lines to follow the control > flow. It may be easier to understand how SHORT is more complex to > understand than LONG. Since they're semantically identical, there's _something_ suspect about a conclusion that one is _necessarily_ harder to understand than the other ;-) I don't have a problem with you finding the longer version easier to understand, but I do have a problem if you have a problem with me finding the shorter easier. > == Debug == > > Now let's imagine that you can run the code (someone succeeded to > reproduce the bug in the test lab!). Since it has a bug, you now > likely want to try to understand why the bug occurs using a debugger. > > Sadly, most debuggers are designed as if a single line of code can only > execute a single instruction. 
> I tried pdb: you cannot only run (diff := x - x_base) and then get "diff" value, before running the second > assignment, you can only execute the *full line* at once. > > I would say that the LONG version is easier to debug, at least using pdb. That might be a good reason to avoid, say, list comprehensions (highly complex expressions of just about any kind), but I think this overlooks the primary _point_ of "binding expressions": to give names to intermediate results. I couldn't care less if pdb executes the whole "if" statement in one gulp, because I get exactly the same info either way: the names `diff` and `g` bound to the results of the expressions they named. What actual difference does it make whether pdb binds the names one at a time, or both, before it returns to the prompt? Binding expressions are debugger-friendly in that they _don't_ just vanish without a trace. It's their purpose to _capture_ the values of the expressions they name. Indeed, you may want to add them all over the place inside expressions, never intending to use the names, just so that you can see otherwise-ephemeral intra-expression results in your debugger ;-) > ... Think about tracebacks. If you get an exception at "line 1" in the > SHORT example (the long "if" expression), what can you deduce > from the line number? What happened? > > If you get an exception in the LONG example, the line number gives you > a little bit more information... maybe just enough to understand the > bug?
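A runnable shape of that point (the function wrapper and values are mine, just to give pdb something to step through):

    from math import gcd

    def probe(x, x_base, n):
        # One line, two captured intermediates.
        if (diff := x - x_base) and (g := gcd(diff, n)) > 1:
            return g
        return None

    # After pdb steps over the single "if" line:
    #     (Pdb) p diff, g
    # both intermediate results are still inspectable by name.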
Re: [Python-Dev] PEP 572: Write vs Read, Understand and Control Flow
[Tim] >> Binding expressions are debugger-friendly in that they _don't_ just >> vanish without a trace. It's their purpose to _capture_ the values of >> the expressions they name. Indeed, you may want to add them all over >> the place inside expressions, never intending to use the names, just >> so that you can see otherwise-ephemeral intra-expression results in >> your debugger ;-) [Steven D'Aprano ] > That's a fantastic point and I'm surprised nobody has thought of it > until now (that I've seen). > > Chris, if you're still reading this and aren't yet heartily sick and > tired of the PEP *wink* this ought to go in as another motivating point. You know, I thought I was joking when I wrote that - but after I sent it I realized I wasn't ;-) It would actually be quite convenient, and far less error-prone, to add a binding construct inside a complicated expression for purposes of running under a debugger. The alternative is typing the sub-expression(s) of interest by hand at the debugger prompt, or adding print()s, both of which are prone to introducing typos, or changing results radically due to triggering side effects in the code invoked by the duplicated sub-expression(s). Adding a binding construct wouldn't change anything about how the code worked (apart from possibly clobbering a local name). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
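For instance (f, g and x are stand-ins), a binding added purely so the debugger has something to show:

    def f(v):
        return v * 2

    def g(v):
        return v + 1

    x = 20
    y = g(mid := f(x))   # `mid` is never used by the program itself -
    print(mid, y)        # it exists only so f's result is observable: 40 41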
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Guido] >> You don't seem to grasp the usability improvements this will give. >> I hear you but at this point appeals to Python's "Zen" don't help you. [Łukasz Langa ] > This reads dismissive to me. I did read the PEP and followed the discussion on > python-dev. I referred to PEP 20 because it distills what's unique about the > value proposition of Python. It's our shared vocabulary. > > Can you address the specific criticism I had? To paraphrase it without PEP 20 > jargon: > (name := expression) makes code less uniform. It inserts more information > into a place that is already heavily packed with information (logic tests). I'll take a crack at that. It's not about "head arguments" at all. I sat out the first hundred messages about this on python-ideas, and looked at code instead. What I found had little to do with any of the head (abstract) arguments passionately debated for the duration ;-) In real life, I found a great many conditional tests that not only weren't "heavily packed" with information, they were simply of the form:

    NAME = expression
    if NAME:
        ... use NAME ...

That looks more like assembly language than Python ;-) I saw no harm at all, and a little gain, in

    if NAME := expression:
        ... use NAME ...

instead. But even a little gain adds up when it happens so often. Of course there have been better examples given of bigger gains. But in no case have the tests in those examples been "heavily packed with information". If they had been, I would have suggested instead breaking the test clauses _out_ of the conditional statements, and giving them names each on their own dedicated lines, with comments explaining what the heck the _intents_ are, even at the cost of adding an indentation level or two. Sometimes conditionals are _already_ "too dense". But more often they're very sparse. This becomes a question of seasoned judgment. For example, here's a real loop summing a series expansion, until the new terms become so small they make no difference to the running total (a common enough pattern in code slinging floats or decimals):

    while True:
        old = total
        total += term
        if old == total:
            return total
        term *= mx2 / (i*(i+1))
        i += 2

To my eyes, this is genuinely harder to follow, despite its relative brevity:

    while total != (total := total + term):
        term *= mx2 / (i*(i+1))
        i += 2
    return total

So I wouldn't use binding expressions in that case. I don't have a compelling head argument for _why_ I find the latter spelling harder to follow, but I don't need a theory to know that I in fact do. But neither do I need a compelling head argument for "why" to know that in many other cases I find that the use of binding expressions improves the code. You shouldn't believe me even if I pretended to have one and passionately argued for it. But, by the same token, I'm spectacularly unmoved by other people's head arguments. For that reason, the messages that sway me are those showing real code, or at least plausibly realistic code. In the majority of those so far, binding expressions would be a small-to-major win. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Tim] >> To my eyes, this is genuinely harder to follow, despite its relative brevity: >> >> while total != (total := total + term): [Antoine] > Does it even work? Perhaps if the goal is to stop when total is NaN, > but otherwise? I don't follow you. You snipped all the text explaining why it would work, so try reading that again? When, e.g., `total` reaches 1.0 and `term` reaches 1e-30, this becomes:

    while 1.0 != (total := 1.0 + 1e-30):

which leaves `total` unchanged (1.0 + 1e-30 == 1.0) and then

    while 1.0 != 1.0:

causes the loop to exit (`while False:`). >> For that reason, the messages that sway me are those showing real >> code, or at least plausibly realistic code. In the majority of those >> so far, binding expressions would be a small-to-major win. > I'm sure it's possible to find thousands of lines of code where binding > expressions wouldn't be a win, but I'm not sure that would be a > constructive use of mailing-list bandwidth. And that "argument" is? ;-) Note that I managed to move the PEP _away_ from general "assignment expressions" to the much simpler "binding expressions" precisely _by_ illustrating, via real code, why the generality of the former wasn't actually useful in any case I looked at. If something is always - or almost always - useless, that can be shown via considering realistic code. That was far more productive than endless abstract debates. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
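The arithmetic is easy to check at a prompt (on an interpreter that has ":="):

    >>> 1.0 + 1e-30 == 1.0          # 1e-30 is far below 1 ulp of 1.0
    True
    >>> 1.0 != (total := 1.0 + 1e-30)
    False
    >>> total
    1.0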
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Tim] >>>> To my eyes, this is genuinely harder to follow, despite its relative brevity: >>>> >>>> while total != (total := total + term): [Antoine] >>> Does it even work? Perhaps if the goal is to stop when total is NaN, >>> but otherwise? [Chris] >> Yes, it does, because the first "total" is looked up before the >> rebinding happens. It's 100% unambiguous to the compiler... but still >> pretty unclear to a human. And I think the multiple use of 'total' is >> to blame for that. So I agree with Tim that this particular example is >> better in longhand. [Antoine] > "Better" is an understatement :-( Now that I understood it (thanks > for the explanation), Ah, sorry - I had no idea it was the "left to right evaluation" part you weren't seeing. Next time explain why you think something is broken? > the shorthand version appears completely bonkers. I wouldn't go that far, but I already said I wouldn't write it that way. However, without looking at real code, people are just flat-out guessing about how bad - or good - things _can_ get, no matter how confident they sound. So at least give me credit for presenting the _worst_ brief binding-expression example you've seen too ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?
[Ryan Gonzalez ] > I have to say I'm not overly thrilled with PEP 572...it's almost odd, > because if you asked me back when I first joined this list when I was 13, I > would've no doubt said *YES*. But, since then, I've gone across many > projects and languages, and fundamentally *I have never felt hurt by the > lack of assignment in an expression*, and I always regretted every time I > tried it in C or Crystal. I understand this experience is pretty > insignificant in comparison to many of the wizards here, but I thought I'd > still share it as an opener for what I'm about to say. The older you get, the more you'll regret not still being 13 ;-) > With this being said, I'd encourage everyone to take a bit of a step back: > what exactly are people looking for in PEP 572? > > I see two main goals: > > - Assignment in a conditional structure. > - Assignment in a list comprehension. > > Most other use cases would significantly hurt readability and seem pretty > rare. I haven't been much impressed by suggested uses outside conditional contexts either. > Now let's break down the top one: > > - Assignment in an if condition. > - Assignment in a while condition. > > So there are roughly three main goals here overall. Now, are there better > ways to solve these? > ... > C++ has recently solved the if condition by allowing declarations inside the > conditions: But C++ has always had assignment expressions. This: > if (auto a = 123; a != 456) { is solving a different (albeit related) problem: that C/C++ require declaring variables before use. Python doesn't. They could have done the same via, e.g.,

    { auto a = 123; if (a != 456) { ... } }

and still have had the scope of `a` limited to one block. auto-initializers in conditionals just gave a bit of syntactic sugar for what was already easily (although with more typing) done. > Many languages have a 'let' expression (using Felix as my example): > > if let a = 1, b = 2 in a == b then I don't read Felix, but I assume the _scope_ of `a` & `b` there ends immediately before the "then". If the names can't be used in the _body_ of a Python `if` (or `while`) block, it's essentially useless to allow binding names for use solely in the conditional test. So it would help if you picked "real Python examples" from the many other earlier messages in these threads. Python expressions can't span Python statement boundaries - only Python blocks can do that. A form of `let` that _would_ work would be block-structured:

    let m = regexp.match(pattern, line) in:
        if m:
            print(m.group(0))

That solves "a scope problem" the current version of the PEP gave up on, but in all other respects seems a step back from the current:

    m = regexp.match(pattern, line)
    if m:
        print(m.group(0))

> Swift has taken a bit of a hybrid between the above two: > > if let a = 1, b = 2, a == b { That seems plain incoherent ;-) > Now, what's the common theme here? **Declarations should be separate from > expressions.** We've got languages that range from baggage-filled to > functional to a bit of all of the above, and none of them have added > assignment *inside* an expression. C++ and C have always had assignment expressions. Ditto Java, Javascript, Perl, Icon, ... (many, many others). I don't see a good reason to grant that Felix and Swift are necessarily improvements over the former, very widely used languages (with the exception of Icon, which I'm merely fond of). 
> The argument is roughly the same across all boards: you're putting major but > easy-to-miss side effects in the midst of expressions that *seem* pure. > > All this is to say: I'd really encourage everyone here to think a bit more > about *why* exactly you want this feature, and then think if there's really > no better way. Any solution that separates declarations would be far more > readable, (arguably) more Pythonic, and play more nicely with the new-ish > typing features to boot. People have been trying for years. If you come up with a realistic (for Python) idea, that's great - share it! But it's probably better suited to python-ideas than python-dev. > ... ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Raymond Hettinger ] > After re-reading all the proposed code samples, I believe that > adopting the PEP will make the language harder to teach to people > who are not already software engineers. Can you elaborate on that? I've used dozens of languages over the decades, most of which did have some form of embedded assignment. Yes, I'm a software engineer, but I've always pitched in on "help forums" too. One language feature conspicuous by its absence in newbie confusions was, consistently, assignment expressions. Read any book or tutorial for such a language, and you'll find very little space devoted to them too. What's to learn? If they understand "binding a name" _at all_ (which they must to even begin to write a non-trivial program), the only twist is that a binding expression returns the value being bound. Binding expressions certainly wouldn't be the _first_ thing to teach people. But by the time it would make sense to teach them, it's hard for me to grasp how a student could struggle with such a tiny variation on what they've already learned (all the subtleties are in what - exactly - "binding" means - which they already faced the first time they saw "j = 1"). > To my eyes, the examples give ample opportunity for being > misunderstood and will create a need to puzzle-out the intended semantics. Some do, many don't. The same can be said of a great many constructs ;-) > ... ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
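The "tiny variation" really is tiny - a sketch:

    x = 5            # statement: binds x, but is not itself an expression
    print(y := 5)    # expression: binds y AND evaluates to 5, so this prints 5
    print(x == y)    # True - the binding itself worked exactly like "="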
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Tim] >> One language feature conspicuous by its absence in newbie >> confusions was, consistently, assignment expressions. Read any book >> or tutorial for such a language, and you'll find very little space >> devoted to them too. [Łukasz Langa ] > Well, you have an entire code style built around this feature called Yoda > conditions. You teach people on Day 1 to never ever confuse == with =. Some > compilers even warn about this because so many people did it wrong. Sorry, I couldn't follow that. In languages like C that use easily confused operator symbols, sure, people are forever typing "=" when they mean "==". That's nothing to do with whether they _understand_ what the different operators do, though. They do. In languages like Icon (that use "=" for numeric comparison and ":=" for assignment), that never occurs. But I'm not sure that addressed the point you were making. >> What's to learn? If they understand "binding a name" _at all_ (which >> they must to even begin to write a non-trivial program), the only >> twist is that a binding expression returns the value being bound. > Ha, not in Python! Here we have *different syntax* for assignments in > expressions. Yes, binding expressions in the current PEP support an extremely limited subset of what Python's assignment statements support. That they use different operator symbols is irrelevant to the fact that the meaning of "binding a name" is exactly the same for both. _That's_ the "hard part" to learn. > Well, you can also use it as a statement. But don't! Why not? _Every_ expression in Python can be used as a statement. Nothing forbids it, and that's even (very!) useful at an interactive prompt. > We have a better one for that. As a matter of style, sure, it's best to use the simplest thing that works. As a statement in a program (as opposed to typed at a shell), "a := 3" has the unnecessary (in that context) property of returning (and discarding) 3, so it's better style to use "a = 3" in that context. > And that one supports type annotations, can unpack and assign to many > targets at the same time, and can even increment, multiply and so on, at once. > But the other one can't. So? math.sqrt() blows up when passed -1, but cmath.sqrt() doesn't. Different tools for different tasks. > So only use the Pascal one in expressions. But don't forget parentheses, > otherwise it will bind the thing you probably didn't want anyway. [Raymond] >>> To my eyes, the examples give ample opportunity for being >>> misunderstood and will create a need to puzzle-out the intended >>> semantics. >> Some do, many don't. > As soon as we have to wrap a part of an expression in parentheses, parsing > the entire thing becomes more complex. Often enough it will cause the > expression to exceed whatever line length limit the codebase pledged not to > exceed, causing one line to become three. And again, making it trickier for > a regular Łukasz to understand what's going on. At this point I think you must have a lower opinion of Python programmers than I have ;-) If adding even a dozen characters to a line makes it exceed a reasonable line-length guide, the code was almost certainly too confusingly dense to begin with. All the binding-expression examples I've given as "improvements" had _oceans_ of horizontal white space to swim in. Guido's if/elif/elif/elif/ ... complex text-processing example didn't, but because the current lack of an ability to bind-and-test in one gulp forced the `elif` parts to be ever-more-deeply-indented `if` blocks instead. 
So, to match your sarcasm, here's mine: try using a feature for what it's good at instead of for what it's bad at ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Raymond Hettinger ] >>> After re-reading all the proposed code samples, I believe that >>> adopting the PEP will make the language harder to teach to people >>> who are not already software engineers. [Tim] >> Can you elaborate on that? [Raymond] > Just distinguishing between =, :=, and == will be a forever recurring > discussion, far more of a source of confusion than the occasional > question of why Python doesn't have embedded assignment. To be clear, is distinguishing between "=" and "==" already a forever recurring discussion in your experience? Or are you predicting that adding ":=" will create that situation? > Also, it is of concern that a number of prominent core dev > respondents to this thread have reported difficulty scanning > the posted code samples. Yes, it is - although some of the examples sucked ;-) >> I've used dozens of languages over the >> decades, most of which did have some form of embedded assignment. > Python is special, in part, because it is not one of those languages. > It has virtues that make it suitable even for elementary school children. > We can show well-written Python code to non-computer folks and walk > them through what it does without their brains melting (something I can't > do with many of the other languages I've used). There is a virtue > in encouraging simple statements that read like English sentences > organized into English-like paragraphs, presenting itself like > "executable pseudocode". It's certainly possible to stick to a subset of Python for which that's true. But I didn't mention those dozens of languages because I seek to emulate them, but to establish that I've had decades of experience with embedded assignments in a wide variety of languages and language communities. > "Perl does it" or "C++ does it" is unpersuasive. Wasn't meant to be. > Its omission from Python was always something that I thought Guido had > left out on purpose, intentionally stepping away from constructs that would > be of help in an obfuscated Python contest. He left out lots of stuff at first, but warmed to it later. Probably the most profound: there were exactly and only 3 scopes at first: local, global, and builtin. Functions (for example) could still nest, but had no way to access names local to enclosing functions save via deep trickery. That was a noble experiment (it was a deliberate attempt to avoid complex scoping rules), but eventually proved too restrictive in practice. This is nothing compared to that ;-) But it's a tiny bit related in that biting the arbitrarily-deeply-nested-scopes bullet was aimed more at experienced programmers than at newbies. The scoping rules became far harder to explain as a result - but far more what experienced programmers expected. >> Yes, I'm a software engineer, but I've always pitched in on "help >> forums" too. > That's not really the same. I believe it! > I've taught Python to many thousands of professionals, almost > every week for over six years. That's given me a keen sense of > what is hard to teach. It's okay to not agree with my assessment, > but I would like for the fruits of my experience to not be dismissed in a > single wisp of a sentence. I asked you to elaborate - I didn't dismiss anything. You merely made a raw assertion in your original message, without enough detail to even know _what_ it is you thought would be hard to teach. Your elaboration is helping. > Any one feature in isolation is usually easy to explain, but showing > how to combine them into readable, expressive code is another matter. 
OK, so it's not binding expressions in isolation that you expect will be hard to teach if they're added, but ... how to use them intelligently (if ever)? That's progress, if so. That part I can see having major trouble with. Even the proponents of this PEP don't always agree with each other about which examples are "good ones". > And as Yuri aptly noted, we spend more time reading code than writing code. > If some fraction of our users finds the code harder to scan > because of the new syntax, then it would be a net loss for the language. It would be a tradeoff pitting their losses against others' gains, of course. I don't know how to quantify that (not even to the extent of determining the sign bit) in advance. I'm also at least as concerned about - indeed - professional software engineers as beginners. > I hesitated to join this thread because you and Guido seemed to be > pushing back so hard against anyone whose design instincts didn't favor > the new syntax. That's just vigorous debate, at least on my part. Guido gets annoyed by emotional tirades and FUD, of which there's always plenty in threads that have gone on for hundreds of messages (I don't know whether you followed any of this on python-ideas, but most arguments on python-dev were already many-times-over old by the time it first appeared here). > It would be nice to find some common ground and perhaps stipulate that the g
Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!
[Guido] > Maybe the order for d[k] = v should also be reconsidered? There's certainly code in the wild relying on the order "v, then d, then k", because that's just how assignment statements have always worked (dicts or not). I'd rather change the dict comprehension code, because I think the analogy to what `d[k] = v` does is weak. I expect:

    {f(): g() for _ in [1]}

to do the same as

    {f(): g()}

The former currently evaluates g() first; the latter f(). Since I also expect both of those to do the same as

    dict((f(), g()) for _ in [1])

f() has to win ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
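A hedged sketch of the orderings in question (the tracing functions are mine; the comprehension's behavior shown is as it stood at the time - Python 3.8 later synced it with the display, so f() did win):

    def f():
        print("key")
        return "k"

    def g():
        print("value")
        return "v"

    {f(): g()}                       # display: prints "key", then "value"
    dict((f(), g()) for _ in [1])    # genexp tuple: also "key", then "value"
    {f(): g() for _ in [1]}          # comprehension: printed "value" first back then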
Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!
[Kirill Balunov ] > Not sure, but if additional motivating examples are required, there is a > common pattern for dynamic attribute lookup (snippet from `copy.py`):
>
>     reductor = dispatch_table.get(cls)
>     if reductor:
>         rv = reductor(x)
>     else:
>         reductor = getattr(x, "__reduce_ex__", None)
>         if reductor:
>             rv = reductor(4)
>         else:
>             reductor = getattr(x, "__reduce__", None)
>             if reductor:
>                 rv = reductor()
>             else:
>                 raise Error("un(shallow)copyable object of type %s" % cls)
>
> which can, with the current `binding expression` syntax, be simplified to:
>
>     if reductor := dispatch_table.get(cls):
>         rv = reductor(x)
>     elif reductor := getattr(x, "__reduce_ex__", None):
>         rv = reductor(4)
>     elif reductor := getattr(x, "__reduce__", None):
>         rv = reductor()
>     else:
>         raise Error("un(shallow)copyable object of type %s" % cls)
>
> which becomes much clearer, at least in my opinion. > > With kind regards, -gdg Thanks for sharing that! While nobody else seems to, I absolutely love real code ;-) This is effectively an instance of Guido's "if/elif/elif/elif/..." example template, where binding expressions shine. But actual real code can make a point viscerally that "consider stuff akin to the following semi-abstract pattern" can't. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!
[Kirill Balunov] > Not sure, but if additional motivating examples are required, there is a > common pattern for dynamic attribute lookup (snippet from `copy.py`):
>
>     reductor = dispatch_table.get(cls)
>     if reductor:
>         rv = reductor(x)
>     else:
>         reductor = getattr(x, "__reduce_ex__", None)
>         if reductor:
>             rv = reductor(4)
>         else:
>             reductor = getattr(x, "__reduce__", None)
>             if reductor:
>                 rv = reductor()
>             else:
>                 raise Error("un(shallow)copyable object of type %s" % cls)
>
> which can, with the current `binding expression` syntax, be simplified to:
>
>     if reductor := dispatch_table.get(cls):
>         rv = reductor(x)
>     elif reductor := getattr(x, "__reduce_ex__", None):
>         rv = reductor(4)
>     elif reductor := getattr(x, "__reduce__", None):
>         rv = reductor()
>     else:
>         raise Error("un(shallow)copyable object of type %s" % cls)
>
> which becomes much clearer, at least in my opinion. [Larry Hastings ] > I hate to be pedantic--there's enough of that going on in this thread--but I > can't agree with the word "simplified" above. I agree that the code using > binding expressions is shorter. But considering that the two code > examples implement the exact same algorithm, to the point where their > bytecode would look nearly* identical, ISTM that the two code examples are > of identical complexity. In the absence of defining an objectively computable complexity measure, I expect you're doomed to arguing taste. For example, argue that both spellings have the same formal "cyclomatic complexity" measure (which they do). By other formal measures (e.g., total number of identifier instances), the latter spelling is "objectively simpler". By yet others (e.g., total number of non-whitespace characters divided by total number of lines), the former spelling is "objectively simpler". But that all kinda misses the point to me: the latter spelling is "obviously simpler" in a way that _actually matters_, for the same reason, e.g., a case statement with N cases is "obviously simpler" than the semantically equivalent spelling using N nested if/else if/else if/else if/else ... blocks. The latter spelling above is indeed visually very much like a case statement: all the tests are at the same indentation level, and all the conditional actions are too. It's obvious _at a glance_ in the latter that exactly one of the action blocks will be performed. That's how if/elif/elif/else always works. It's not at all obvious at a glance (certainly not to me) in the original spelling. > Comparing the two, the code using the binding expressions obviates four > newlines, three uses of the identifier "reductor", and allows folding two > "else / if"s into "elif"s. In exchange, it adds three extra colons, If it weren't for that you hate being pedantic, I'd add that you're overlooking the piles of leading whitespace characters also saved in the latter ;-) The number of those saved grows quadratically in the number of uselessly indented blocks shifted left. > and the density of complexity per line has shot up. Average non-whitespace character count per line has certainly shot up, but I don't actually know what you mean by "density of complexity" there. Just FYI, when I write long if/elif/elif/... chains, I typically put a blank line before each elif, to give better visual separation of the peer (both semantically and visually) test-action blocks. 
Which has nothing to do with any formal notion of complexity, because I don't much care about that - readability is what I value, and that's not the same as any formal notion of complexity I've ever seen. > ... ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Tim] >> So, to match your sarcasm, here's mine: try using a feature for what >> it's good at instead of for what it's bad at ;-) [Lukasz Langa ] > Yes, this is the fundamental wisdom. Judging which is which is left as an > exercise to the programmer. > > With this, I'm leaving the discussion. With Guido and you on board for PEP > 572, I feel that Chris' streak is indeed about to break. I still expect it could go either way, but do wish people didn't believe it will be a major loss if "the other side wins". I'll be fine regardless - and so will everyone else. Guido rarely makes language design mistakes. In this case he's seeing serious opposition from several core developers, and you shouldn't believe either that he just dismisses that. [Łukasz Langa] >>> Well, you have an entire code style built around this feature called Yoda >>> conditions. You teach people on Day 1 to never ever confuse == with =. Some >>> compilers even warn about this because so many people did it wrong. >> Sorry, I couldn't follow that. Part of the problem here is that I had never seen "Yoda conditions" before, and had no idea what it meant. Some later Googling suggests it's "a thing" youngsters say at times ;-) > You implied that newbies don't have to even know about assignments in > expressions. I wanted to demonstrate that this isn't really the case because > mistaking `=` for `==` is a relatively common occurrence for newbies. If you > want to argue that it isn't, I'd like to point out that the WordPress code > style *requires* Yoda conditions because it was enough of a hindrance. ESLint > (a JavaScript linter) also has a warning about assignment in a conditional. What does that have to do with Python? If they try to use "=" in an expression now, they get a SyntaxError. The PEP doesn't change anything about that. Indeed, that's why it uses ":=" instead. I have experience in other languages with embedded assignments that also use ":=", and it's _never_ the case that people type ":=" when they intend "equality test" in those. The horrid "I typed = when I meant ==" mistakes are unique to languages that mindlessly copied C. The mistakes aren't primarily due to embedded assignments, they're due to that even highly experienced programmers sometimes type "=" when they're _thinking_ "equals". Nobody types ":=" when they're thinking "equals". > ... > What you're saying is true. But for it to be true, newbies *have to* learn > the > distinction, and the fact that yes, sometimes the programmer indeed meant to > put a single `=` sign in the conditional. Again, the PEP is about Python: a single "=" in a conditional is, and will remain, a SyntaxError. So nobody can sanely intend to put a single "=" in a condition _in Python_ unless they're writing a test intending to provoke a syntax error. > That's why we'll end up with the Pascal assignment operator. ":=" is already in the PEP. > And that *is* a thing that you will have to explain to newbies when they > encounter > it for the first time. Sure. That doesn't frighten me, though. It's easy to explain what it does - although it may be hard to explain when it's _desirable_ to use it. > Sadly, googling for a colon followed by an equal sign isn't trivial if you > don't > know what you're looking for. To judge from Stackoverflow volume, the single most misunderstood of all Python operators - by far - is "is" - try Googling for that ;-) In far second and third places are "and" and "or", for which searches are also useless. 
Regardless, I'm not concerned about one-time tiny learning curves. Don't know what ":=" means already? Ask someone. If you know what "=" means, you're already close to done. Given that you already understand what "binding a name" means, ":=" may well be the simplest of all Python's operators (there's no computation _to_ be understood, and no possibility either of a dunder method changing its meaning depending on operand type(s)). >>> Well, you can also use it as a statement. But don't! >> Why not? _Every_ expression in Python can be used as a statement. >> Nothing forbids it, and that's even (very!) useful at an interactive >> prompt. > Because it suggests different intent, because it's limited, because it's > slower > at runtime, and because PEP 572 says so itself. I didn't say you're _required_ to use it as a statement. Regardless of what PEPs say, people will do what they find most useful. I trust people to figure this out quickly for themselves. >> At this point I think you must have a lower opinion of Python >> programmers than I have ;-) If adding even a dozen characters to a >> line makes it exceed a reasonable line-length guide, the code was >> almost certainly too confusingly dense to begin with. > Around 5% of if and elif statements in the standard library don't fit a single > line *as is*. Sure, that's a low percentage but that's over 1,000 statements. > If you're putting an `if` statement in a method, you
Re: [Python-Dev] Is PEP 572 really the most effective way to solve the problems it's targeting?
[Mike Miller] >> - How are other modern languages solving this issue? [Greg Ewing ] > In all the languages I can think of that allow assignments in > expressions, there is only one assignment operator -- a stand > alone assignment is just a bare assignment expression. Pretty much so, but I don't know what "modern" means to Mike. The R language may set a record for, umm, innovation here: """ There are three different assignment operators: two of them have leftwards and rightwards forms.[1] """ So there are 5 assignment operator spellings in R:

    =  <-  ->  <<-  ->>

Note that the link doesn't tell the whole story either; e.g., they don't all have the same precedence level. And, in addition to the 5 infix spellings shown above, there are also prefix (looks like a 2-argument function call) spellings. Back on Earth ;-), I think it's worth it to point out that only languages (with assignment expressions) aping C use "=" for assignment and "==" for equality. That was a Really Bad Idea that all other (not aping C) languages I know of avoided. But I'm not sure any of this is relevant to what Mike meant by "this issue". > But those languages were all designed that way from the start. > I'm not aware of any that began by forbidding assignment in > expressions and then added it later. Me neither. It's certainly the case that Guido would not have designed a language that aped C's poor decision here. At its very start, Python used "=" for both assignment and equality testing (and == was a syntax error). So I think it's evident that, at the time, he didn't envision ever adding assignment expressions. [1] https://www.rdocumentation.org/packages/base/versions/3.5.0/topics/assignOps ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] The new and improved PEP 572, same great taste with 75% less complexity!
[Larry Hastings ] >>> I hate to be pedantic--there's enough of that going on in this thread--but I >>> can't agree with the word "simplified" above. I agree that the code using >>> binding expressions is shorter. But considering that the two code >>> examples implement the exact same algorithm, to the point where their >>> bytecode would look nearly* identical, ISTM that the two code examples are >>> of identical complexity. [Tim] >> In the absence of defining an objectively computable complexity >> measure, I expect you're doomed to arguing taste. [Larry] > As are you! I didn't claim otherwise. > I haven't seen any arguments that binding expressions allow us > to express programs that were inexpressible in Python before. They don't. > I'm not even sure that binding expressions fall under the heading > of "syntactic sugar", given their negligible semantics (and, imo, > negligible benefit). What else is left, on both sides of the debate, > if not a debate over aesthetics? I prefer to look at effects on real code. Other people prefer to philosophize. >> For example, argue that both spellings have the same formal >> "cyclomatic complexity" measure (which they do). By other formal >> measures (e.g., total number of identifier instances), the latter >> spelling is "objectively simpler". By yet others (e.g., total number >> of non-whitespace characters divided by total number of lines), the >> former spelling is "objectively simpler". > What is this "objective simplicity" measurement you cite? There are many ways you can (and various programs do) attempt to define, quantitatively, what "program complexity" means. Under any such objectively defined measure, two pieces of code can be "objectively compared". I use scare quotes with their ordinary meaning: that it's "objective" only if you're silly enough to believe that _whatever_ numbers you're computing are going to settle the issue ;-) > I understand that the code example cited had fewer identifiers, so when > measuring "number of identifiers used" in isolation, the code example using > binding expressions had fewer of them. Then you necessarily agree that _if_ our objective definition of complexity is "total number of identifier instances", the binding-expression version is "objectively simpler". It's been reduced, by definition, to a question of determining which of two integers is smaller. > But this is so narrow as to be almost meaningless. Of course! As is your original claim that "the two code examples are of identical complexity" "because" "their bytecode would look nearly identical". Well, sure, _if_ that's how we define program complexity, the conclusion follows. But there's no reason I can see to accept that definition to begin with either. I suspect _you_ like it primarily because you found it supported the conclusion you had already reached ;-) > Perhaps I'm misunderstanding you, but I read this as saying that there's a > larger, well-established concept called "objective simplicity", of which > this measurement is a part. Can you tell me more about it? Google was no > help here. The metrics I mentioned are used by a number of programs that claim to quantify program complexity. 
For example, among many other things, this program computes cyclomatic complexity, and uses N_2 for "total number of operands" (which I called "identifiers" instead to specialize it to the specific example) under the widely used "Halstead Metrics": http://radon.readthedocs.io/en/latest/intro.html My favorite part is where the numerator of the "Maintainability Index" adds in 50 * sin(sqrt(2.4 * C)) where "C is the percent of comment lines (important: converted to radians)". WTF?! ;-) But they're not joking: some people take this stuff very seriously. >> But that all kinda misses the point to me: the latter spelling is >> "obviously simpler" in a way that _actually matters_, for the same >> reason, e.g., a case statement with N cases is "obviously simpler" >> than the semantically equivalent spelling using N nested if/else >> if/else if/else if/else ... blocks. > As I already mentioned, the with-binding-expressions code expresses the same > code, the same concept, and likely results in the same bytecode, as the > without-binding-expressions code. And as I already explained in some detail, while I agree with (almost) all that, it leaves me cold as a dead fish. The test-action pairs in the code are _semantically_ peers, not a nesting of subordinates. It's _clearer_ to human eyes if the syntactic structure of the code reflects the peer relationship directly. I couldn't care less that the byte code turns out being nearly the same. I'm not a PVM - I need to _reason_ about the code I read. In failing to visually reflect the peer relationship, the original code obscures a key simplicity. > In contrast, a switch statement is simpler than a series of nested if > statements. It's a different code construct, it has differen
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Zero Piraeus] >> Since it now looks like it really *does* have a decent chance, and >> maybe another -1 has a small chance of tipping the balance: my >> reaction to the proposal is also emotional. Visceral, in fact, to the >> extent that I'd aim to read and write less Python if it became >> commonplace. [Steven D'Aprano ] > Funnily enough, that's what some people said about decorator syntax, > ternary if, type annotations and list comprehensions. > > All of them have become great additions to the language. > > I hated the idea of aping C and adding += operators and swore I'd never > use them. That lasted, well, about a month. > > Just sayin'. Well - I've come to respect your opinion, so ... OK, I'll give += a try. Frankly, I've grown tired of editing it out of all the packages I download anyway ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Lukasz] >> > And that *is* a thing that you will have to explain to newbies when >> > they encounter it for the first time. [Tim] >> Sure. That doesn't frighten me, though. It's easy to explain what it >> does - although it may be hard to explain when it's _desirable_ to use >> it. [Chris Barker ] > I'm with Raymond here -- though I'm not sure "newbies" is quite right -- > I've found that newbies fall into two camps: folks to whom programming comes > naturally, and those that it doesn't (OK, it's a distribution, but a bimodal > one). And folks that are struggling with programming can struggle even with > simple assignment (name binding), particularly when you add even function > local scope. Sure. What I wrote was shorthand for what's already been covered at length many times: what a binding expression does is "easy to explain" GIVEN THAT someone ALREADY UNDERSTANDS how binding a name works. The latter in fact seems difficult for a significant number of people to learn, but it's utterly unavoidable that they learn it if they're ever to write non-trivial Python programs. That's been true since Python's first release. Binding expressions would be introduced much later in any sane course. At THAT point, for students who haven't already dropped out, the semantics are darned-near trivial to explain: it binds the name to the object the expression evaluates to (all of which they _already_ understand by this point), and the value of the binding expression is that object (the only new bit). Unlike for most other operators, you don't even have to weasel-word it to account for the possibility that a magical dunder method may change what ":=" does. As for the "is" operator, the meaning is baked into the language and can't be altered in the slightest. > So having one more way to do assignment WILL make it harder to > teach, not because it's that hard, but because it's one more thing to learn. On a scale of 1 to a million, try to quantify how much harder ;-) As above, I can't see it getting beyond a single digit, GIVEN THAT a student has already mastered the far more complex assignment _statement_ (binding expressions are limited to the single simplest case of the many things an assignment statement can do). "And it returns the object" is a yawn. But, as I already granted, it may be truly hard to explain when it's a desirable thing to use. That takes experience and "good judgment", which - according to me - can be learned but can't really be taught. > But the fact is that as Python has evolved (particularly with the jump to > py3) it has become less and less of a "scripting" language, and more of a > "systems" language. And also harder to learn. Anyone remember CP4E? Python > is not as good a choice as a "newbie" language as it once was. I agree - although I expect sticking to a subset of Python could make life easier for beginners. For example, would anyone in their right mind even mention async gimmicks when teaching beginners? Against that, though, one of the most unintentionally funny tech things I ever read was Bjarne Stroustrup writing about why C++ is an excellent choice for beginners. But he does have a point: if you throw away the bulk of everything C++ added, there's an easily usable little language exceedingly well hidden under it all ;-) > Adding := will move it a little bit more along the complexity path -- not > much, and that's where Python has gone anyway, so as Tim said, no one's > going to suffer either way this decision goes. Yet there will be much wailing and gnashing of teeth anyway ;-) ... 
>> To judge from Stackoverflow volume, the single most misunderstood of
>> all Python operators - by far - is "is" -

> You know, I think instructors like me are partly responsible. "is" is rarely
> useful outside of comparing to singletons. Yet I use it early in instruction
> to do checks on name binding and show things with mutability, etc., which
> has the unfortunate side effect of making it seem like a more common
> operator than it is.
>
> I've even had students write code like:
>
>     if x is 3:
>
> and thanks to interning, it appears to work!

Yup, that's the real problem with "is": its semantics are dead simple, but "under exactly what conditions are `x` and `y` bound to the same object?" is intractable. It seems to take a long time to get across the point that the question itself is misguided. A full answer requires delving into transient implementation details, which is counterproductive because they _are_ accidents of the implementation du jour. What questioners need to be nudged into asking instead is for examples of when using "is" is thoroughly sane.

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
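A sketch of the interning trap above, relying on CPython behavior that is explicitly not a language guarantee:

    # CPython caches small integers, so `is` can *appear* to work:
    small, also_small = int("3"), int("3")
    print(small is also_small)   # True in CPython - small-int cache at work

    big, also_big = int("100000"), int("100000")
    print(big is also_big)       # False in CPython - equal but distinct objects
    print(big == also_big)       # True - equality is what was actually wanted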
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Chris Angelico ]
> ...
> I don't understand why people bring up all these arguments that have
> absolutely nothing to do with the proposal at hand. None of this has
> in any way changed.

That's easy: any time there's a long thread to which Guido has contributed at least twice, it will be seen as a Golden Opportunity to re-litigate every decision that's ever been made ;-) Some amount of that seems healthy to me (people are thinking about "language design" from a larger view than the proposal du jour).

In this specific case, line-oriented coverage tools have missed accounting for all possible code paths since day #1; e.g.,

    x = f() or g()

You don't need to reply to messages so obviously irrelevant to the PEP unless you want to. It's not like Guido will read them and go "oh! a binding expression in a ternary conditional is a fundamentally new potential problem for a line-oriented coverage tool! that's fatal" ;-)

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
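Spelling out that pre-existing blind spot (f and g here are hypothetical stand-ins):

    def f():
        return 0    # falsy, so the `or` goes on to call g()

    def g():
        return 2

    x = f() or g()  # one line, two possible paths; a line-oriented tool
                    # counts the line as covered whether or not g() ran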
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
Wes, sorry, but I really don't follow what you're saying. For example,

[Wes Turner ]
> Do not do this:
>
>     x = 2
>     if (x == 3) or (x := 3):
>         print(x)
>
> What do we call that mistake?

It displays 3 - while it appears to be silly code, there's nothing about it that's undefined. So I fail to see how showing that example anywhere would do anyone any good. You can do the same kind of thing today via, e.g.,

    class Bindable:
        def __init__(self, value):
            self.bind(value)

        def bind(self, value):
            self.value = value
            return value

        def __bool__(self):
            return bool(self.value)

        def __eq__(self, other):
            return self.value == other

        def __str__(self):
            return str(self.value)

Then:

    >>> x = Bindable(2)
    >>> if x == 3 or x.bind(3):
    ...     print(x)
    3

And I wouldn't put that example anywhere in any docs either ;-)

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] (name := expression) doesn't fit the narrative of PEP 20
[Tres Seaver ] > FWIW, Ned Batchelder's 'coverage.py' does a good job with branch coverage. > I haven't seen anything in this discussion which indicates that binding > expressions will change that at all. I don't think you missed anything relevant either ;-) Binding operators are exactly as irrelevant to control-flow analyzers as, e.g., introducing a floor division operator (//) was. Data-flow analyzers (if there are any for Python) are a different story, since they need to be aware of all (re)binding operations - although at the byte code level, all such sites remain equally apparent (no new flavor of "store" operation is added by this PEP). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 572: Usage of assignment expressions in C
[Raymond Hettinger ]
> Thanks Antoine, this is an important point that I hope doesn't get lost.
> In a language with exceptions, assignment expressions are less needful.
> Also, the pattern of having mutating methods return None
> further limits the utility.

It doesn't diminish the utility one whit in cases where binding expressions are helpful ;-) What you're saying is that there are _fewer_ such opportunities in Python than in C. Which may or may not be true (depending on the code you're working with).

If you believe it is true, fine, then that also argues against that people will rush to abuse the feature (to the extent that it's even plausibly useful less often, to that extent also will there be less temptation to use it at all).

But then I only care about use cases at heart, and have presented real-life examples wherein binding expressions read both better and worse than what they're replacing. I intend to limit myself to the cases where they read better :-) Which are most of the cases I even considered, BTW - in the vast majority of cases in real code I'd use them, they'd be replacing the annoyingly bare-bones yet somehow repetitive anyway:

    value = f()
    if value:
        doing something with value

with the still bare-bones but minimally repetitive:

    if value := f():
        doing something with value

For example, tons of functions I write and use return None or 0 or False when they want to communicate "I have nothing useful to return in this case - but since you expected that might happen, I'm not going to annoy you with an exception". That pattern isn't solely limited to regexp search and match functions. The "win" above is minor but frequent. It adds up.

There are other cases where binding expressions really shine, but they're much rarer in all the code I looked at (e.g., see the uselessly ever-increasing indentation levels near the end of `copy()` in the std library's copy.py).

In all, I expect I'd use them significantly more often than ternary `if`, but far less often than augmented assignments. If the PEP is accepted, that's what all Pythoneers will be saying 5 years from now ;-)

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
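A concrete instance of that frequent minor win, using the regexp case as a stand-in (any None-returning lookup works the same way; the `:=` spelling needs Python 3.8+):

    import re

    line = "error: disk full"

    # Bare-bones yet repetitive:
    match = re.search(r"error: (.*)", line)
    if match:
        print(match.group(1))

    # Minimally repetitive:
    if match := re.search(r"error: (.*)", line):
        print(match.group(1))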
Re: [Python-Dev] Drop/deprecate Tkinter?
[Matěj Cepl ]
> It is absolutely impossible to remove Tkinter IMHO (it has been
> part of stdlib since like forever and people expect it there;
> its removal would be betrayal on the level of switching = to
> :=), I have my doubts about IDLE though. I know, the same
> argument applies, but really, does anybody use IDLE for
> development for a long time, what is its real value for the
> community? Although, even this argument is questionable, because
> Python has some affinity with learning, and IDLE is nice
> for first steps nibbling into Python.

IDLE isn't just for eager beginners, but also for those so old & senile they're incapable of learning anything new ever again. As proof, IDLE is still _my_ primary Python development environment, used multiple times every day, and I'm so old & out-of-it that I'm +1 on the binding expressions PEP ;-)

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Slow down...
[Steven D'Aprano ]
> ...
> If we could look forward to 2028, when we're running Python 3.14 or so
> (4.7 if you prefer), how many fantastic language features that we cannot
> bear to give up would we be missing out on?

This, for just one:

    k = 6 if >!{myobj.meth(arg)[2]} elsenone 7 elsenan 8 else 5

Which is really annoying to write today, but would be much clearer with binding expressions:

    if myobj is None:
        k = 7
    elif (t := myobj.meth) is None:
        k = 7
    elif (t := t(arg)) is None:
        k = 7
    elif (t := t[2]) is None:
        k = 7
    elif math.isnan(t):
        k = 8
    elif t:
        k = 6
    else:
        k = 5

The future is blindingly bright :-)

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Looking for examples: proof that a list comp is a function
[Chris Angelico ]
> ...
> With current semantics, you can easily prove that a list comp is
> implemented with a function by looking at how it interacts with other
> scopes (mainly class scope), but Tim's proposal may change that.

Absolutely not. I haven't considered for a nanosecond that anything _essential_ would change in the current implementation. In effect, my proposal to bind assignment statement targets that appear in a listcomp or genexp in the blocks that immediately contain their synthetic functions "merely" sprinkles in some `nonlocal` and/or `global` declarations to change the targets' scopes. Indeed, it _relies_ on that they're implemented as (potentially nested) synthetic functions today. And if you haven't read my proposed changes to the reference manual, they explicitly state that they're talking about the synthetic functions created to implement genexps and listcomps.

> So I'm looking for examples that prove that a list comp is executed
> inside an implicit function. Ideally, examples that are supported by
> language guarantees, but something that's "CPython has done it this
> way since 3.0" is important too.

I don't believe you'll find that - but, of course, may be wrong about that.

> I'm aware of just two: the name lookup interaction that may be
> changing, and the fact that there's an extra line in a traceback. And
> the latter, as far as I know, is not guaranteed (and I doubt anyone
> would care if it changed). Are there any other provable points?

Nick pointed me to these future docs that _will_ pretty much imply it:

https://docs.python.org/dev/reference/expressions.html#displays-for-lists-sets-and-dictionaries

In part:

"""
However, aside from the iterable expression in the leftmost for clause, the comprehension is executed in a separate implicitly nested scope. This ensures that names assigned to in the target list don’t “leak” into the enclosing scope. The iterable expression in the leftmost for clause is evaluated directly in the enclosing scope and then passed as an argument to the implicitly nested scope.
"""

I say "pretty much" because, for whatever reason(s), it seems to be trying hard _not_ to use the word "function". But I can't guess what "then passed as an argument to the implicitly nested scope" could possibly mean otherwise (it doesn't make literal sense to "pass an argument" to "a scope").

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
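The class-scope interaction Chris mentions is easy to demonstrate today (CPython 3.x behavior):

    class C:
        x = 2
        ys = [i for i in range(x)]    # fine: range(x), the outermost iterable
                                      # expression, is evaluated in class scope
        # zs = [x for i in range(3)]  # NameError if uncommented: the listcomp
        #                             # body runs in a hidden function, which
        #                             # cannot see the class scope's x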
Re: [Python-Dev] [Python-checkins] bpo-33038: Fix gzip.GzipFile for file objects with a non-string name attribute. (GH-6095)
Sorry about approving this message (I'm a python-dev list moderator)! There will be a few more like it. Looking closer, it appears to be another variation of pure-nuisance spam that's been flooding all sorts of python.org lists. You've been spared many hundreds of those here, but since this one appeared to contain actual Python-related content, I reflexively approved it.

On Tue, May 15, 2018 at 3:43 PM, nataliemorrisonxm980xm--- via Python-Dev wrote:
>
> From: Serhiy Storchaka
> To: python-check...@python.org
> Sent: Wednesday, 9 May 2018, 10:14
> Subject: [Python-checkins] bpo-33038: Fix gzip.GzipFile for file objects
> with a non-string name attribute. (GH-6095)
> ...

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Looking for examples: proof that a list comp is a function
[ Tim, about the most recent version of the docs at
https://docs.python.org/dev/reference/expressions.html#displays-for-lists-sets-and-dictionaries ]

>> I say "pretty much" because, for whatever reason(s), it seems to be
>> trying hard _not_ to use the word "function". But I can't guess what
>> "then passed as an argument to the implicitly nested scope" could
>> possibly mean otherwise (it doesn't make literal sense to "pass an
>> argument" to "a scope").

[Nick Coghlan ]
> I think my motivation was to avoid promising *exact* equivalence with a
> regular nested function, since the define-and-call may allow us
> opportunities for optimization that don't exist when those two are separated
> (e.g. Guido's point in another thread that we actually avoid calling "iter"
> twice even though the nominal expansion implies that we should). However,
> you're right that just calling it a function may be clearer than relying on
> the ill-defined phrase "implicitly nested scope".

Plus that, as noted, what passing an argument "to a scope" means is mysterious.

Language standard committees struggle for years with how to phrase things so that no more than is intended appears to be promised. It's hard! For example, if you were to show a workalike function and note that the exact placement - and number - of `iter()` calls is not guaranteed, someone else would point out that you need to explicitly say that by "iter" you mean the builtin function of that name, not one user code may have overridden it with in the current scope. Then someone else will note that it's tedious to say things like that whenever they're needed, and more-general text will be added elsewhere in the docs saying that the _rest_ of the docs always mean the language-supplied versions of such-&-such explicitly named functions/classes/modules/...

I'd say "nested function" anyway ;-) And for another reason: not just someone from Mars is prone to misreading "scope", but just about anyone on Earth coming from another language. The idea that the word "scope" all by itself implies "and in general any name bound to within the top-level code spanned by the scope is implicitly local to the scope unless explicitly declared `global` or `nonlocal` in the scope" may be unique to Python.

> For Chris's actual question, this is part of why I think adding
> "parentlocal" would actually make the scoping proposal easier to explain, as
> it means the name binding semantics aren't a uniquely magical property of
> binding expressions (however spelled), they're just a new form of target
> scope declaration that the compiler understands, and the binding expression
> form implies. Note: eas*ier*, not easy ;)

Adding an explanation of `parentlocal` to the docs could be a useful pedagogical device, but I don't think I'd support adding that statement to the _language_. It's too weird, and seems to be at a wrong level for plausible future language developments.

Let's step way back for a minute. In many languages with full-blown closures, first-class functions, and nested lexical scopes, it's pretty common to define the meaning of various language constructs in terms of calling derived lexically nested functions. In those languages, any "work variables" needed by the synthetic functions are declared as being local to those functions, and _that's the end of it_. They're done. All other names inside the expansions mean exactly the same as what they mean in whatever chunks of user-supplied code the construct interpolates into the synthesized functions.
It doesn't matter one whit in which context(s) they appear. That's the only long-term sane way to go about defining constructs in terms of calling synthesized functions interpolating user-supplied pieces of code. Now _if_ Python had been able to do that, the meaning of genexps and listcomps would have been defined, from the start, in terms of synthesized functions that declared all & only the for-target names "local". And, in fact, the change I'm suggesting wouldn't have required changing the comprehension implementation _at all_ when assignment expressions were added. Instead the implementation would need to change to _add_ assignment expression targets to the things declared local if it was decided that those targets should be _local_ to the derived functions instead. That's why this all seems so bloody obvious to me ;-) It's how virtually every other language in the business of defining constructs in terms of nested synthesized functions works. So if that's something we may ever do again - and possibly even if we don't expect to ever do it again - I suggest a more generally useful approach would be to add a new flavor of _function_ to Python. Namely one wherein the only locals are the formal arguments and those explicitly declared local. Whether or not a name is bound in the body would be irrelevant. To avoid a new keyword, `local` could be spelled `not nonlocal` ;-) Note that the only use for `parentlocal` s
Re: [Python-Dev] My fork lacks a 3.7 branch - can I create it somehow?
[Nathaniel Smith ] > ... > As far as git is concerned, the main repo on github, your fork on > github, and your local repo are 3 independent repositories, equally > valid. The relationships between them are purely a matter of > convention. Thanks for that! It instantly cleared up several mysteries for me. I'm just starting to learn git & github, and am starkly reminded of an old truth: there is absolutely nothing "obvious" about source-control systems, or workflows, before you already know them ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)
[Guido]
> A quick follow-up: PEP 572 currently has two ideas: (a) introduce := for inline
> assignment, (b) when := is used in a comprehension, set the scope for the
> target as if the assignment occurred outside any comprehensions. It seems
> we have more support for (a) than for (b) -- at least Nick and Greg seem to
> be +0 or better for (a) but -1 for (b). IIRC (b) originated with Tim. But his
> essay on the topic, included as Appendix A
> (https://www.python.org/dev/peps/pep-0572/#appendix-a-tim-peters-s-findings)
> does not even mention comprehensions.

I was writing up my observations about simple changes to existing code. Since there's nothing sanely akin to binding non-for-targets possible in comprehensions now, comprehensions were out of scope for that effort (which was limited to staring at existing code already doing bindings).

> However, he did post his motivation for (b) on python-ideas, IIRC a bit
> before PyCon; and the main text of the PEP gives a strong motivation
> (https://www.python.org/dev/peps/pep-0572/#scope-of-the-target). Nevertheless,
> maybe we should compromise and drop (b)?

Two things to say about that. First, the original example I gave would be approximately as well addressed by allowing to declare intended scopes in magically synthesized functions; like (say)

    p = None  # to establish the intended scope of `p`
    while any(  # split across lines just for readability
            n % p == 0 for p in small_primes):
        n //= p

It didn't really require an inline assignment, just a way to override the unwanted (in this case) "all `for` targets are local to the invisible function" rigid consequence of the implementation du jour.

Second, if it's dropped, then the PEP needs more words to define what happens in cases like the following, because different textual parts of a comprehension execute in different scopes, and that can become visible when bindings can be embedded:

    def f():
        y = -1
        ys = [y for _ in range(y := 5)]
        print(y, ys)

Here `range(y := 5)` is executed in f's scope. Presumably the `y` in `y for` also refers to f's scope, despite that `y` textually _appears_ to be assigned to in the body of the listcomp, and so would - for that reason - be expected to be local to the synthesized function, and so raise `UnboundLocalError` when referenced. It's incoherent without detailed knowledge of the implementation.

    def g():
        y = -1
        ys = [y for y in range(y := 5)]
        print(y, ys)

And here the `y` in `y for y` is local to the synthesized function, and presumably has nothing to do with the `y` in the `range()` call. That's incoherent in its own way.

Under the current PEP, all instances of `y` in `f` refer to the f-local `y`, and the listcomp in `g` is a compile-time error.

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
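For the record, the rules described here are the ones PEP 572 was ultimately accepted with, so the behavior can be checked in released Python 3.8+ (a sketch; observed output noted in comments):

    def f():
        y = -1
        ys = [y for _ in range(y := 5)]
        print(y, ys)        # 5 [5, 5, 5, 5, 5] - every `y` here is f's `y`

    f()

    # def g():
    #     y = -1
    #     ys = [y for y in range(y := 5)]   # SyntaxError in 3.8+: assignment
    #                                       # expression cannot rebind the
    #                                       # comprehension iteration variable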
Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)
[Tim]
>> First, the original example I gave would be approximately as well
>> addressed by allowing to declare intended scopes in magically synthesized
>> functions; like (say)
>>
>>     p = None  # to establish the intended scope of `p`
>>     while any(  # split across lines just for readability
>>             n % p == 0 for p in small_primes):
>>         n //= p
>>
>> It didn't really require an inline assignment, just a way to override the
>> unwanted (in this case) "all `for` targets are local to the invisible
>> function" rigid consequence of the implementation du jour.

[Guido]
> Hm, that's more special syntax.

Of course - I'm anticipating that the PEP will be changed to throw out useful assignment expressions in comprehensions, but I still want a way to "export" comprehension for-targets at times ;-)

> The nice bit about (b) as currently specified is that it adds no syntax --
> it adds a scope rule, but (as IIRC Steven has convincingly argued) few
> people care about those. Python's scope rules, when fully specified, are
> intricate to the point of being arcane (e.g. for class scopes) but all that
> has a purpose -- to make them so DWIM ("Do what I Mean") that in practice
> you almost never have to worry about them, *especially* when reading
> non-obfuscated code (and also when writing, except for a few well-known
> patterns).

You and Steven and I appear to be on the same page here - but it's in a book nobody else seems to own :-( To me it's just screamingly obvious that

    total = 0
    cumsums = [total := total + value for value in data]

"should do" what it obviously intends to do - and that the only thing stopping that is a bass-ackwards focus on what most trivially falls out of the current implementation.

...

>> def f():
>>     y = -1
>>     ys = [y for _ in range(y := 5)]
>>     print(y, ys)
>>
>> Here `range(y := 5)` is executed in f's scope. Presumably the `y` in `y
>> for` also refers to f's scope, despite that `y` textually _appears_ to be
>> assigned to in the body of the listcomp, and so would - for that reason -
>> be expected to be local to the synthesized function, and so raise
>> `UnboundLocalError` when referenced. It's incoherent without detailed
>> knowledge of the implementation.

[Guido]
> That code should have the same meaning regardless of whether we accept (b)
> or not -- there is only one `y`, in f's scope. I don't mind if we have to
> add more words to the PEP's scope rules to make this explicit, though I
> doubt it -- the existing weirdness (in the comprehension spec) about the
> "outermost iterable" being evaluated in the surrounding scope specifies
> this. I wouldn't call it incoherent -- I think what I said about scope
> rules above applies here, it just does what you expect.

Remove "y = -1" and - voila! - we have the dreaded "parent local scoping" Nick finds so baffling to explain (or so he claims). That is, "has exactly the same scope in the comprehension as in the parent block, and will create a local in the latter if the name is otherwise unknown in the parent" comes with assignment expressions, regardless of whether _all_ such targets "leak" (the current PEP) or only targets in the expression defining the iterable of the outermost `for` (the PEP without leaking assignment expressions in comprehensions).

As to whether it "does what you expect", no, not really! In a world where _all_ binding targets in a comprehension are claimed to be local to the comprehension, I _expect_ that `y := 5` appearing inside the listcomp means `y` is local to the listcomp.
"Oh - unless the binding appears in the expression defining the iterable of the outermost `for`" comes from Mars. Not that it really matters much, but (b) provides consistent semantics in these cases. No need to search Mars for weird exceptions ;-) ... > A "neutral" argument about (b) is that despite the "horrified" reactions > that Nick saw, in practice it's going to confuse very few people (again, > due to my point about Python's scope rules). I'd wager that the people who > might be most horrified about it would be people who feel strongly that the > change to the comprehension scope rules in Python 3 is a big improvement, > and who are familiar with the difference in implementation of > comprehensions (though not generator expressions) in Python 2 vs. 3. > I also doubt it will generally confuse people in practice (to the contrary, I expect they _will_ be confused if things like the cumulative sums example blow up with UnboundLocalError). But I still don't get the source of the "horror". Assignment expression semantics are wholly consistent with ordinary nested lexical scoping, with or without (b). The only difference is in the scopes picked for assignment expression target names (except for those appearing in the expression defining the iterable yadda yadda yadda). ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/l
Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)
[Nick Coghlan]
> > actually made those semantics available as an explicit
> > "parentlocal NAME" declaration ...:
> >
> >     def _list_comp(_outermost_iter):
> >         parentlocal item
> >         _result = []
> >         for x in _outermost_iter:
> >             item = x
> >             _result.append(x)
> >         return _result
> >
> >     _expr_result = _list_comp(items)

[Greg Ewing]
> I'm not sure that's possible. If I understand correctly,
> part of the definition of "parent local" is that "parent"
> refers to the nearest enclosing *non-comprehension* scope,
> to give the expected result for nested comprehensions.
> If that's so, then it's impossible to fully decouple its
> definition from comprehensions.

Nick's "parentlocal" does refer to the parent, but makes no distinction between synthesized and user-written functions. If the parent has a matching parentlocal declaration for the same name then the original really refers to the grandparent - and so on. Ultimately, it resolves to the closest enclosing scope in which the name is _not_ declared parentlocal. In that scope, a "nonlocal" or "global" declaration settles it if one appears, else the name is local to that scope.

So a nested comprehension would declare its assignment expression targets as parentlocal in its synthesized function, and in all the containing synthesized functions generated for containing comprehensions.

This appears in some strained ;-) way "natural" only because there is no explicit way to declare something "local" in Python. In just about any other language with closures and nested lexical scopes, comprehensions and generator expressions would have been implemented via nested functions that explicitly declared their "for" target names "local", and nothing else. The only change needed then for PEP 572 (b) semantics would be to declare assignment expression target names local (if their scope wasn't already known) in the closest containing non-synthesized block.

None of which really matters. The real question is which semantics are desired.

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
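Since "parentlocal" isn't real syntax, here's a runnable sketch of the same resolution using today's spelling - plain `nonlocal` in a synthesized function - applied to the cumulative-sums example from earlier in the thread (helper names invented):

    def f(data):
        total = 0

        def _list_comp(_outermost_iter):
            nonlocal total             # the "parentlocal total" effect: this
            _result = []               # rebinds f's total, not a comp-local one
            for value in _outermost_iter:
                total = total + value  # what `total := total + value` would do
                _result.append(total)
            return _result

        cumsums = _list_comp(iter(data))
        return total, cumsums

    print(f([1, 2, 3]))                # (6, [1, 3, 6])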
Re: [Python-Dev] Informal educator feedback on PEP 572 (was Re: 2018 Python Language Summit coverage, last part)
[Nick Coghlan] > However, PEP 572 in its current form takes the position "parent local > scoping is sufficiently useful to make it a required pre-requisite for > adding assignment expressions, but not useful enough to expose as a > new scope declaration primitive", > Of course the PEP doesn't take that position at all: it doesn't even contain the term "parent local scoping". That's your term, which nobody else uses unless they're replying to you ;-) What the PEP does say: """ an assignment expression occurring in a list, set or dict comprehension or in a generator expression (below collectively referred to as "comprehensions") binds the target in the containing scope, honoring a nonlocal or global declaration for the target in that scope, if one exists. For the purpose of this rule the containing scope of a nested comprehension is the scope that contains the outermost comprehension. A lambda counts as a containing scope. """ It's a small collection of plainly stated rules for specifying the intended semantics. If you want to claim that this _is_ "useful enough to expose as a new scope declaration primitive", it's really on you to present use cases to justify that claim. I'd present some for you, but I don't have any (I don't care that "by hand" conversion of nested comprehensions to workalike Python nested functions may require a bit of thought to establish the intended scope of assignment expression target names - all of which is easily doable without adding any new statements). I don't _expect_ that other good use cases exist. The gimmick's purpose is to make code that visually _appears_ to belong to a block act as if embedded assignments do occur in that block. If there's an explicitly nested function, that fundamental motivation no longer applies. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status on PEP-431 Timezones
[ISAAC J SCHWABACHER ]
> ...
> I disagree with the view Tim had of time zones when he wrote that comment
> (and that code). It sounds like he views US/Eastern and US/Central as time
> zones (which they are), but thinks of the various America/Indiana zones as
> switching back and forth between them, rather than being time zones in their
> own right

You can think of them any way you like. The point of the code was to provide a simple & efficient way to convert from UTC to local time in all "time zones" in known actual use at the time; the point of the comment was to explain the limitations of the code. Although, as Alexander noted, the stated assumptions are stronger than needed.

> I think the right perspective is that a time zone *is* the function that its
> `fromutc()` method implements,

Fine by me ;-)

> although of course we need additional information in order to actually
> compute (rather than merely mathematically define) its inverse. Daylight
> Saving Time is a red herring,

Overstated. DST is in fact the _only_ real complication in 99.99% of time zones (perhaps even 99.9913% ;-) ). As the docs say, if you have some crazy-ass time zone in mind, fine, that's why fromutc() was exposed (so your crazy-ass tzinfo class can override it).

> and assumptions 2 and 4

Nitpick: 4 is a consequence of 2, not an independent assumption.

> in that exposition are just wrong from this point of view.

As above, there is no particular POV in this code: just a specific fromutc() implementation, comments that explain its limitations, and an invitation in the docs to override it if it's not enough for your case.

> In the worst case, Asia/Riyadh's two years of solar time completely shatter
> these assumptions.

Sure. But, honestly, who cares? Riyadh Solar Time was so off-the-wall that even the Saudis gave up on it 25 years ago (after a miserable 3-year experiment with it). "Practicality beats purity".

> [eliding a more-general view of what time zones "really" are]

I'm not eliding it because I disagree with it, but because time zones are political constructions. "The math" we make up may or may not be good enough to deal with all future political abominations; for example:

> ...
> This assumption would be violated if, for example, some jurisdiction
> decided to fall back two hours by falling back one hour and then
> immediately falling back a second hour. I recommend the overthrow
> of any such jurisdiction and its (annexation by the Netherlands)[3].

That's not objectively any more bizarre than Riyadh Solar Time. Although, if I've lived longer than you, I may be more wary about the creative stupidity of political schemes ;-)

> ... (Lennart, I think this third assumption is the important part of your "no
> changes within 48 hours of each other" assumption,

The "48 hours" bit came from Alexander. I'm personally unclear on what Lennart's problems are.

> ...
> All of these computations can be accomplished by searches of ordered lists
> and applications of $fromlocal_i$.

Do you have real-world use cases in mind beyond supporting long-abandoned Riyadh Solar time?

> ...
> With this perspective, arithmetic becomes "translate to UTC, operate,
> translate back", which is as it should be.

There _was_ a POV in the datetime design about that: no, that's not how it should be. Blame Guido ;-) If I add, say, 24 hours to noon today, I want to get noon tomorrow, and couldn't care less whether DST started or stopped (or any other political adjustment was made) in between.
For that reason, it was wholly intentional that datetime + timedelta treats datetime as "naive". If that's not what someone wants, fine, but then they don't want Python's datetime arithmetic. BTW, there's no implication that they're "wrong" for wanting something different; what would be wrong is insisting that datetime's POV is "wrong". Both views are valid and useful, depending on the needs of the application. One had to be picked as the built-in behavior, and "naive" won.

> ...
> But IIUC what Lennart is complaining about

I don't, and I wish he would be more explicit about what "the problem(s)" is(are).

> is the fact that the DST flag isn't part of and can't be embedded into a
> local time, so it's impossible to fold the second parameter to $fromlocal$
> into $t$. Without that, a local time isn't rich enough to designate a single
> point in time and the whole edifice breaks.

You can blame Guido for that too ;-) , but in this case I disagree(d) with him: Guido was overly (IMO) annoyed by that the only apparent purpose for a struct tm's tm_isdst flag was to disambiguate local times in a relative handful of cases. His thought: an entire bit just for that?! My thought: get over it, it's one measly bit.

my-kingdom-for-bit-ingly y'rs - tim

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org
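That "noon + 24 hours = noon tomorrow" promise is easy to see in action; a sketch using the zoneinfo module from much later Pythons (3.9+), crossing a US DST transition:

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo   # Python 3.9+; may need the tzdata package

    tz = ZoneInfo("America/New_York")
    noon = datetime(2015, 3, 7, 12, 0, tzinfo=tz)  # day before DST starts
    print(noon + timedelta(hours=24))   # 2015-03-08 12:00:00-04:00 - still
                                        # noon, though only 23 real hours later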
Re: [Python-Dev] Status on PEP-431 Timezones
[Tim] > Sure. But, honestly, who cares? Riyadh Solar Time was so > off-the-wall that even the Saudis gave up on it 25 years ago (after a > miserable 3-year experiment with it). "Practicality beats purity". Heh. It's even sillier than that - the Saudis never used "Riyadh Solar Time", and it's been removed from release 2015e of the tz database: https://www.ietf.org/timezones/data/NEWS Release 2015e - 2015-06-13 10:56:02 -0700 ... The files solar87, solar88, and solar89 are no longer distributed. They were a negative experiment - that is, a demonstration that tz data can represent solar time only with some difficulty and error. Their presence in the distribution caused confusion, as Riyadh civil time was generally not solar time in those years. Looking back, Paul Eggert explained more in 2013, but it took this long for the patch to land: http://comments.gmane.org/gmane.comp.time.tz/7717 > did Saudi Arabia really use this as clock time? Not as far as I know, for civil time. There was some use for religious purposes but it didn't use the approximation in those files. These files probably cause more confusion than they're worth, so I'll propose a couple of patches to remove them, in two followup emails. I haven't pushed these patches to the experimental github version. The position of the sun is vital to establishing prayer times in Islam, but that's got little to do with civil time in Islamic countries. And Olson didn't take his "Riyadh Solar Time" rules from the Saudis, he made up the times himself: "Times were computed using formulas in the U.S. Naval Observatory's Almanac for Computers 1987[89]". The formulas only produced approximations, and then rounded to 5-second boundaries because the tz data format didn't have enough bits. So, as a motivating example, it's hard to get less compelling: Riyadh Solar is a wholly artificial "time zone" made up by a time zone wonk to demonstrate some limitations of the tz database he maintained. Although I expect he could have done so just as effectively by writing a brief note about it ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status on PEP-431 Timezones
[Lennart Regebro ] > And I would want to remind everyone again that this is not a question > of the problem being impossible. It's just really complex to get right > in all cases, and that always having the UTC timestamp around gets rid > of most of that complexity. Could you please be explicit about what "the problem" is? Everyone here is guessing at what you think "the problem" is. ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status on PEP-431 Timezones
[Tim] >> The formulas only produced approximations, and then >> rounded to 5-second boundaries because the tz data format didn't have >> enough bits. [ISAAC J SCHWABACHER ] > Little known fact: if you have a sub-minute-resolution UTC offset when a > leap second hits, it rips open a hole in the space-time continuum and > you find yourself in New Netherlands. Tell me about it! Last time that happened I had to grow stinking tulips for 3 years to get enough money to sail back home. I'll never use a sub-minute-resolution UTC offset again ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status on PEP-431 Timezones
[Lennart Regebro ]
>>> And I would want to remind everyone again that this is not a question
>>> of the problem being impossible. It's just really complex to get right
>>> in all cases, and that always having the UTC timestamp around gets rid
>>> of most of that complexity.

[Tim]
>> Could you please be explicit about what "the problem" is? Everyone
>> here is guessing at what you think "the problem" is.

[Lennart]
> The problem is that it is exceedingly complicated to get all the
> calculations back and forth between local time and UTC to be correct
> at all times and for all cases. It really doesn't get more specific
> than that. I don't remember which exact problem it was that made me
> decide that this was not the correct solution and that we should use
> UTC internally, but I don't think that matters, because I'm also sure
> that it was not the last case, as I was far from near the end in
> adding testcases.

I believe everyone here is saying it "shouldn't be" exceedingly complicated, or even particularly hard, if you add the is_dst flags the PEP says it would add.

But is the PEP complete? Under the "Postponement" section, it says:

    The implementation has turned out to be exceedingly complex, due to
    having to convert back and forth between the local time and UTC during
    arithmetic and adjusting the DST for each arithmetic step, with
    ambiguous times being particularly hard to get right.

However, the _body_ of the PEP said nothing whatsoever about altering arithmetic. The body of the PEP sounds like it's mainly just proposing to fold the pytz package into the core. Perhaps doing _just_ that much would get this project unstuck? Hope springs eternal :-)

> Once again I'm sure it's not impossible to somehow come up with an
> implementation and an API that can do this based on local time, but
> once again I am of the opinion that it is the wrong thing to do. We
> should switch to using UTC internally, because that will make
> everything so much simpler.

Like what? I'm still looking for a concrete example of what "the problem" is (or even "a" problem).

> I am in no way against other people implementing this PEP, but I think
> you will end up with very complex code that will be hard to maintain.

Somebody first needs to define what "the problem" is ;-)

> There really is a reason every other date time implementation I know
> of uses UTC internally,

Yes, but the fundamental reason datetime does not is that Guido consciously and deliberately decided that "naive datetime" would be most useful most often for most users. That's why "naive" objects are the default. And even for "aware" objects, arithmetic staying within a single time zone was deliberately specified to be "naive" too. My guess is that all other datetime implementations you know of have no concept of "naive" datetimes, let alone make naive datetimes primary. Small wonder, if so, that they're all different in this way.

That's a design decision not everyone likes, and certainly isn't suitable for all purposes, but the debate over that ended a dozen years ago when the decision was made. If your vision of PEP 431 _changes_ that design decision (which it sure _sounds_ like it wants to based on what you're typing here, but which PEP 431 itself does not appear to say - impossible to tell which from here without any specific example(s)), that may account for all sorts of complications that aren't apparent to me.

> and there really is a reason why everyone always recommends storing date
> times in UTC with the time zone or offset separately.
Well, that's the second thing they recommend - and they can already do that. The first thing to recommend is to use naive objects in any application where that's possible, so that you don't have to bother with _any_ time zone esoterica, surprises, complications or overheads. After all, it's 7:54 PM as I type this, and that's perfectly clear to me ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Status on PEP-431 Timezones
[ISAAC J SCHWABACHER ]
>>> ...
>>> I think the right perspective is that a time zone *is* the function that its
>>> `fromutc()` method implements,

[Tim]
>> Fine by me ;-)

[Isaac]
> My issue is that you're computing `fromutc()`, which is a function, in
> terms of `dst()` and `utcoffset()`, which aren't.

I know. That's not "an issue" that will gain traction, though ;-)

> I think this is backwards; `dst()` and `utcoffset()` should be computed
> from `fromutc()` plus some additional information that has to be present
> anyway in order to implement `fromutc()`.

Memory lane: that additional information doesn't exist now. I think it "should have", but at the time, as I recall there was fatal opposition to storing an `isdst` flag because it would consume an extra byte in the pickle format. That was enough to kill it: datetime development was paid for by a company very concerned about pickle sizes ;-)

> With the extra bit, `dst()` and `utcoffset()` become partial functions, which
> makes it *possible* to get the right answer in all cases, but it's still
> simpler to start with the total function and work from there.

Well, maybe simpler for you, but I think not in general. At the time, all aspects of datetime's development were vigorously debated, but mostly on Zope Corp (the company paying for it) wikis and mailing lists. While some people didn't care about time zones at all, most did. Of the latter:

- All were keenly aware of the need to incorporate UTC offsets.
- All were keenly aware of the need to accommodate "daylight time" schemes.
- None gave a fig about anything else.

Very late in the game, examples were given of localities that had in fact changed their UTC offsets from time to time, but as curiosities rather than as "issues". That's when I created fromutc() - it was a last-second addition. I cared enough to make it _possible_ to accommodate such cases, but there was no interest (or time) to cater to them directly. Instead fromutc() was written to use only the already-existing utcoffset() and dst(). Everyone already knew how to use the latter: they directly corresponded to the two things everyone cared about keenly from the start.

That doesn't _preclude_ anyone from writing a more-general fromutc(), and I encourage, for example, you to do so ;-) I agree it's the most fundamental thing from an abstract mathematical view, but "UTC offset" and "DST offset" fit most peoples' brains a hell of a lot better than "collection of piecewise continuous monotonically increasing functions whose images don't overlap too much" ;-)

>>> Daylight Saving Time is a red herring,

>> Overstated

> I stand by what I meant by this, even if I did a bad job of expressing
> the point. Assuming that all time zone discontinuities are due to DST
> changes breaks many time zones (really almost all of the Olson time
> zones, though only for a vanishingly small fraction of datetimes),

It's use cases that are missing here: who needs to convert historic times to/from UTC, and where the "problem times" are generally arranged by politicians to occur when most people are sleeping? That's why nobody really cared about offset-changing zones at the start. Yes, such zones exist, but times recorded in such zones are in yesterday's databases we don't care about anymore except maybe to display the values.

> but that's not the point I was making. The point is that it doesn't buy us
> anything.
Au contraire: as above, it bought datetime a model people thought they understood at once, since almost everyone has wrestled with UTC offsets and daylight-time switches in ordinary daily life. Implement utcoffset() and dst(), and you're done. Even if you're really not, you _think_ you are, so you slumber peacefully then ;-)

> Though this is probably obscured by all the markup, the more general
> algorithm I gave is also simpler than the one in the comment in datetime.py,
> and the reason for that is that it solves an easier problem, but one that
> serves our practical purposes just as well.

It's heavily obscured by the heavy markup. Write some Python code instead? I expect few people will try to untangle the meaning otherwise. As for whether it's simpler - eh, don't know. Here's the actual code, stripped of error-checking:

    def fromutc(self, dt):
        dtoff = dt.utcoffset()
        dtdst = dt.dst()
        delta = dtoff - dtdst
        if delta:
            dt += delta
            dtdst = dt.dst()
        return dt + dtdst

Will your code run faster? Have fewer conditionals? Fewer lines? Shorter lines? Less nesting? Fewer operations? Important to me, though, is that your code should be far more self-evidently _correct_, provided the reader understands the math underlying it (which will require - as this code does - referring to a relatively massive wall of text to explain it).

> ...
> I went too far in inferring your viewpoint from your code. I don't find fault
> with the explanation on its own terms
Re: [Python-Dev] Status on PEP-431 Timezones
[Tim]
>> However, the _body_ of the PEP said nothing whatsoever about altering
>> arithmetic. The body of the PEP sounds like it's mainly just
>> proposing to fold the pytz package into the core. Perhaps doing
>> _just_ that much would get this project unstuck? Hope springs eternal :-)

[Lennart Regebro ]
> The pytz package has an API and a usage that is different from the
> datetime module. One of the things you need to do is that after each
> time you do arithmetic, you have to normalize the result. This is done
> because the original API design did not realize the difficulties and
> complexities of timezone handling and therefore left out things like
> ambiguous times.

Oh, they were realized - indeed, the pytz docs point to Python's tzinfo docs to explain the ambiguities, and the latter docs existed before pytz's day 1 ;-) The Python docs also are quite clear about that all arithmetic within a single timezone is "naive". That was intentional. The _intended_ way to do "aware" arithmetic was always to convert to UTC, do the arithmetic, then convert back.

You never _have_ to normalize() in pytz. But it's needed if you _don't_ follow pytz's explicit

    The preferred way of dealing with times is to always work in UTC,
    converting to localtime only when generating output to be read by humans

advice, and want to do "aware" arithmetic directly in a non-UTC time zone. Python's datetime never intended to support that directly. Quite the contrary. I know people who feel otherwise tend to think of that as a lazy compromise (or some such), but naive arithmetic was intended to be "a feature". Fight the design every step of the way, and, yup, you get problems every step of the way.

> The PEP attempts to improve the datetime module's API so that it can
> handle the ambiguous times.

No problem with that. I always thought the lack of storing is_dst-like info was datetime's biggest wart.

> It also says that the implementation will be based on pytz, because it
> was my assumption that this would be easy, since pytz already handles
> ambiguous times. During my attempt at implementing it I realized it
> wasn't easy at all, and it wasn't as easy as folding pytz into the core.

Is it the case that pytz also "fails" in the cases your attempts "fail"? In any case, if you're trying to change how "aware" datetime arithmetic works, that's a major and backward-incompatible change. Does Guido realize it? As before, it's not at all clear from the PEP.

> Yes, the PEP gives that impression, because that was the assumption
> when I wrote the draft. Just folding pytz into the core without
> modifying the API defeats the whole purpose of the PEP, since
> installing pytz is a trivial task.

"Batteries included" has some attractions all on its own. On top of that, adding is_dst-like flags to appropriate methods may have major attractions. Changing the semantics of datetime arithmetic has major attractions to some people, but also major drawbacks - regardless, since changing it turns Guido's original design on its head, he really needs to Pronounce on that part.

>> Like what? I'm still looking for a concrete example of what "the
>> problem" is (or even "a" problem).

> A problem is that you have a datetime, and add a timedelta to it, and
> it should then result in a datetime that is actually that timedelta
> later. And if you subtract the same timedelta from the result, it
> should return a datetime that is equal to the original datetime.
> This sounds ridiculously simple

Ah, but it already happens that way - because the builtin datetime arithmetic is "naive". The docs have always promised this:

"""
datetime2 = datetime1 + timedelta (1)
datetime2 = datetime1 - timedelta (2)

1) datetime2 is a duration of timedelta removed from datetime1, moving forward in time if timedelta.days > 0, or backward if timedelta.days < 0. The result has the same tzinfo attribute as the input datetime, and datetime2 - datetime1 == timedelta after. OverflowError is raised if datetime2.year would be smaller than MINYEAR or larger than MAXYEAR. Note that no time zone adjustments are done even if the input is an aware object.

2) Computes the datetime2 such that datetime2 + timedelta == datetime1. As for addition, the result has the same tzinfo attribute as the input datetime, and no time zone adjustments are done even if the input is aware. This isn’t quite equivalent to datetime1 + (-timedelta), because -timedelta in isolation can overflow in cases where datetime1 - timedelta does not.
"""

> , and is ridiculously difficult to make happen in all cases that we want to
> support (Riyadh time zone and leap seconds not included).
> That IS the specific, concrete problem, and if you don't believe me, there
> is nothing I can do to convince you.

I apologize if I've come off as unduly critical - I truly have been _only_ trying to find out what "the problem" is. That helps! Thank you. Note that I've had nothing to do with datetime (except
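For readers without the pytz context, a sketch of the normalize() dance Lennart describes (assumes pytz is installed; offsets are for US/Eastern in 2015):

    from datetime import datetime, timedelta
    import pytz

    eastern = pytz.timezone("US/Eastern")
    start = eastern.localize(datetime(2015, 3, 7, 12, 0))  # EST, UTC-5
    later = start + timedelta(days=1)     # naive arithmetic leaves a stale offset
    print(later)                          # 2015-03-08 12:00:00-05:00
    print(eastern.normalize(later))       # 2015-03-08 13:00:00-04:00 - the
                                          # same instant, corrected to EDT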
Re: [Python-Dev] Status on PEP-431 Timezones
[Paul Moore ] > I think the current naive semantics are useful and should not be > discarded lightly. At an absolute minimum, there should be a clear, > documented way to get the current semantics under any changed > implementation. Realistically, default arithmetic behavior can't change in Python 3 (let alone Python 2). Pushing for a different design is fine, but that can't be sold on the grounds that current behavior is "a bug" - it's working as designed, as intended, and as documented, and hasn't materially changed in the dozen-or-so years since it was introduced. It's not even that the proposed alternative arithmetic is "better", either: while it's certainly more suitable for some applications, it's certainly worse for others. Making an incompatible change would be (& should be) a hard sell even if there were a much stronger case for it than there is here. But that's just arithmetic. Some way to disambiguate local times, and support for most zoneinfo time zones, are different issues. > As an example, consider an alarm clock. I want it to go off at 7am > each morning. I'd feel completely justified in writing tomorrows_alarm > = todays_alarm + timedelta(days=1). > > If the time changes to DST overnight, I still want the alarm to go off > at 7am. Even though +1 day is in this case actually + 25 (or is it > 23?) hours. That's the current semantics. There was a long list of use cases coming to the same conclusion. The current arithmetic allows uniform patterns in local time to be coded in uniform, straightforward ways. Indeed, in "the obvious" ways. The alternative behavior favors uniform patterns in UTC, but who cares? ;-) Few local clocks show UTC. Trying to code uniform local-time behaviors using "aware arithmetic" (which is uniform in UTC. but may be "lumpy" in local time) can be a nightmare. The canonical counterexample is a nuclear reactor that needs to be vented every 24 hours. To which the canonical rejoinder is that the programmer in charge of that system is criminally incompetent if they're using _any_ notion of time other than UTC ;-) > To be honest, I would imagine, from experience with programmers > writing naive algorithms, that the current semantics is a lot less > prone to error when used by such people. People forget about timezones > until they are bitten by them, and if they are using the convert to > UTC->calculate->convert back model, their code ends up with > off-by-1-hour bugs. Certainly such mistakes can be fixed, and the > people who make them educated, but I like the fact that Python's > typical behaviour is to do what a non-expert would expect. By all > means have the more sophisticated approach available, but if it's the > default then naive users have to either (1) learn the subtleties of > timezones, or (2) learn how to code naive datetime behaviour in Python > before they can write their code. If the current behaviour remains the > default, then *when* the naive user learns about the subtleties of > timezones, they can switch to the TZ-aware datetime - but that's a > single learning step, and it can be taken when the user is ready. There is a design flaw here, IMO: when they switch to a TZ-aware datetime, they _still_ get "naive" arithmetic within that time zone. It's at best peculiar that such a datetime is _called_ "aware" yet still ignores the time zone rules when doing arithmetic. I would have preferred a sharper distinction, like "completely naive" (tzinfo absent) versus "completely aware" (tzinfo present). 
But, again, it's working as designed, intended and documented. One possibility to get "the other" behavior in a backward-compatible way: recognize a new magic attribute on a tzinfo instance, say, __aware_arithmetic__. If it's present, arithmetic on a datetime with such a tzinfo member "acts as if" arithmetic were done by converting to UTC first, doing the arithmetic, then converting back. Otherwise (magic new attribute not present) arithmetic remains naive. Bonus: then you could stare at datetime code and have no idea which kind of arithmetic is being used ;-) > PS I don't think the above is particularly original - IIRC, it's > basically Guido's argument for naive datetimes from when they were > introduced. I think his example was checking his watch while on a > transatlantic plane flight, but the principle is the same. Yup, your account is fair (according to me ;-) ). Here's Guido's first message on the topic: https://mail.python.org/pipermail/python-dev/2002-March/020648.html ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
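To make the two kinds of arithmetic concrete, a sketch (zoneinfo is from much later Pythons, 3.9+; __aware_arithmetic__ itself remains hypothetical):

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo   # Python 3.9+

    tz = ZoneInfo("America/New_York")
    alarm = datetime(2015, 3, 7, 7, 0, tzinfo=tz)  # 7am, day before DST starts

    # Naive (default) arithmetic: the same wall-clock time tomorrow.
    print(alarm + timedelta(days=1))               # 2015-03-08 07:00:00-04:00

    # What "aware" arithmetic would do: convert to UTC, add, convert back -
    # 24 real hours later, which is 8am on the wall clock.
    utc = alarm.astimezone(timezone.utc) + timedelta(days=1)
    print(utc.astimezone(tz))                      # 2015-03-08 08:00:00-04:00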
Re: [Python-Dev] Status on PEP-431 Timezones
[Tim] >> The Python docs also are quite clear that all arithmetic within >> a single timezone is "naive". That was intentional. The _intended_ >> way to do "aware" arithmetic was always to convert to UTC, do the >> arithmetic, then convert back. [Lennart] > We can't explicitly implement incorrect timezone aware arithmetic and > then expect people to not use it. Python didn't implement timezone-aware arithmetic at all within a single time zone. Read what I wrote just above. It implements naive arithmetic within a single time zone. > We can make the arithmetic correct, The naive arithmetic within a timezone is already correct, by its own internal criteria. It's also useful (see the original discussions, or Paul Moore's recent brief account). That it's not the arithmetic you want doesn't make it "incorrect", it makes it different from what you want. That's fine - you're allowed to want anything ;-) But it's a dozen years too late to change that decision. Maybe for Python 4. > and we can raise an error when doing tz-aware arithmetic in a > non-fixed timezone. Sorry, I don't know what that means. Under any plausible interpretation, I don't see any need to raise an exception. > But having an implementation we know is incorrect You really have to get over insisting it's incorrect. It's functioning exactly the way it was intended to function. It's _different_ from what you favor. Note that I'm not calling what you favor "incorrect". It's different. Both kinds of arithmetic are useful for different purposes, although I still agree with Guido's original belief that the current arithmetic is most useful most often for most programmers. > and telling people "don't do that" doesn't seem like a good solution > here. We don't tell people "don't do that". It's perfectly usable exactly as-is for many applications. Not all. For those applications needing the other kind of arithmetic, the convert-to/from-UTC dance was the intended solution. > Why do we even have timezone aware datetimes if we don't intend them > for usage? They are intended for usage. But a single way of using them is not suitable for all possible applications. >> ... >> Python's datetime never intended to support that directly. > I think it should. Ya, I picked that up ;-) I don't, but it's too late to break backward compatibility regardless. > It's expected that it supports it, By some people, yes. Not by all. > and there is no real reason not to support it. Backward compatibility is a gigantic reason to continue with the status quo. See Paul Moore's post for a start on why naive arithmetic was picked to begin with. > The timezone handling becomes complicated if you base yourself on > localtime, and simple if you base yourself on UTC. That's an implementation detail unrelated (in principle) to how arithmetic works. Although as a practical matter it cuts both ways: naive local-time arithmetic is complicated if the internal time is stored in UTC, but simple if stored in local time. > As you agree, we recommend to people to use UTC at all times, I recommend people don't use tzinfo at all if they can avoid it. Beyond that, there are many attractions to using UTC, and to explicitly use UTC. Not all applications need to care, though. > and only use timezones for input and output. > Well, what I'm now > proposing is to take that recommendation to heart, and change > datetime's implementation so it does exactly that.
Suppose I'm correct in my belief that there's scant chance of getting approval for changing the default datetime arithmetic in Python 3 (or Python 2). Would you still be keen to replace the internals with UTC format? Note that there are many consequences to that implementation detail. For example, it was an explicit requirement of the datetime design that the month, day, hour, minute and second components be very cheap to extract. If you have to do conversion every time one is accessed, it's much slower; if you cache the "local time" components separately, the memory burden increases. Etc. > I saw the previous mention of "pure" vs "practical", and that is often > a concern. Here it clearly is not. This is a choice between impure, > complicated and impractical, and pure, simple and practical. There is nothing in the datetime world simpler than naive arithmetic ;-) "Practical" is relevant to a specific application's specific needs, and neither kind of arithmetic is "practical" for all applications. Guido believed naive arithmetic is most practical overall. But even believing that too, datetime certainly "should be" beefed up to solve the _other_ problems: like resolving ambiguous times, and supporting the full range of zoneinfo possibilities. >> Is it the case that pytz also "fails" in the cases your attempts "fail"? > No, that is not the case. And if you wonder why I just don't do it > like pytz does it, it's because that leads to infinite recursion, much > as discussions on this mailing list.
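For reference, the intended convert-to-UTC dance looks like this in code. The sketch uses zoneinfo (Python 3.9+, which postdates this thread - pytz, with its own normalize() step, was the era's equivalent) and a 2015 US DST date chosen purely for illustration:

    from datetime import datetime, timedelta, timezone
    from zoneinfo import ZoneInfo

    eastern = ZoneInfo("America/New_York")
    dt = datetime(2015, 3, 7, 12, tzinfo=eastern)  # noon, day before spring-forward

    # Default naive arithmetic: same wall-clock time tomorrow.
    naive = dt + timedelta(days=1)                 # 2015-03-08 12:00 EDT

    # The intended "aware" dance: to UTC, add, back to local.
    aware = (dt.astimezone(timezone.utc) + timedelta(days=1)).astimezone(eastern)
    # 2015-03-08 13:00 EDT: 24 elapsed hours later, but 1pm on the local clock.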
Re: [Python-Dev] Status on PEP-431 Timezones
[Paul Moore ] >> >> As an example, consider an alarm clock. I want it to go off at 7am >> each morning. I'd feel completely justified in writing >> tomorrows_alarm = todays_alarm + timedelta(days=1). [Lennart Regebro ] > That's a calendar operation made with a timedelta. It's an instance of single-timezone datetime arithmetic, of the datetime + timedelta form. Your examples have been of the same form. Note that after Paul's

    tomorrows_alarm = todays_alarm + timedelta(days=1)

it's guaranteed that

    assert tomorrows_alarm - todays_alarm == timedelta(days=1)

will succeed too. > The "days" attribute here is indeed confusing as it doesn't mean 1 day, > it means 24 hours. Which, in naive arithmetic, are exactly the same thing. That's essentially why naive arithmetic is the default: it doesn't insist on telling people that everything they know is wrong ;-) There's nothing confusing about Paul's example _provided that_ single-timezone arithmetic is naive. It works exactly as he intends every time, and obviously so. Seriously, try this exercise: how would you code Paul's example if "your kind" of arithmetic were in use instead? For a start, you have no idea in advance how many hours you may need to add to get to "the same local time tomorrow". 24 won't always work. Indeed, no _whole_ number of hours may work (according to one source I found, Australia's Lord Howe Island uses a 30-minute DST adjustment). So maybe you don't want to do it by addition. What then? Pick apart the year, month and day components, then simulate "naive arithmetic" by hand? The point is that there's no _obvious_ way to do it then. I'd personally strip off the tzinfo member, leaving a wholly naive datetime where arithmetic "works correctly" ;-) , add the day, then attach the original tzinfo member again. But for a dozen years it's sufficed to do what Paul did.
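A sketch of that strip-the-tzinfo workaround, again using zoneinfo for illustration (the helper name is made up). Under today's default arithmetic this is just a long way of writing dt + timedelta(days=1); it would only earn its keep in a hypothetical aware-by-default world:

    from datetime import datetime, timedelta
    from zoneinfo import ZoneInfo  # Python 3.9+, for illustration

    def same_local_time_tomorrow(dt):
        # Drop the tzinfo so naive arithmetic is guaranteed, add the
        # day, then reattach the original tzinfo.
        return (dt.replace(tzinfo=None) + timedelta(days=1)).replace(tzinfo=dt.tzinfo)

    todays_alarm = datetime(2015, 3, 7, 7, tzinfo=ZoneInfo("America/New_York"))
    tomorrows_alarm = same_local_time_tomorrow(todays_alarm)  # still 7am on the wall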
Re: [Python-Dev] Status on PEP-431 Timezones
[Ronald Oussoren ] > IMHO “+ 1 days” and “+ 24 hours” are two different things. > Date arithmetic is full of messy things like that. But it's a fact that they _are_ the same in naive time, which Python's datetime single-timezone arithmetic implements:

- A minute is exactly 60 seconds.
- An hour is exactly 60 minutes.
- A day is exactly 24 hours.
- A week is exactly 7 days.

No context is necessary: those are always true in naive time, and that lack of mess is "a feature" to those who accept it for what it is. > “+ 1 month” is another example of that (which the datetime > module punts completely and can be a source of endless > bikeshedding). Note that the only units timedelta accepts have clear (utterly inarguable) meanings in naive time. That's intentional too. For example, "a month" and "a year" have no clear meanings (as durations) in naive time, so timedelta doesn't even pretend to support them. Despite all appearances to the contrary in this thread, naive time is bikeshed-free: it's easy for someone to know all there is to know about it by the time they're 12 ;-)

    datetime + timedelta(days=1)
      is equivalent to datetime + timedelta(hours=24)
      is equivalent to datetime + timedelta(minutes=60*24)
      is equivalent to datetime + timedelta(seconds=60*60*24)
      is equivalent to datetime + timedelta(microseconds=1000000*60*60*24)

Naive time is easy to understand, reason about, and work with. When it comes to the real world, political adjustments to and within time zones can make the results dodgy, typically in the two DST-transition hours per year when most people living in a given time zone are sleeping. How much complexity do you want to endure in case they wake up? ;-) Guido's answer was "none in arithmetic - push all the complexity into conversions - then most uses can blissfully ignore the complexities". And note that because DST transitions "cancel out" over the span of a year, the benefits and the few dodgy cases don't really change regardless of whether you add one week or a hundred thousand weeks (although there's no way to predict what governments will decide the local clock "should say" a hundred thousand weeks from now - it's only predictable in naive time).
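Those equivalences are easy to check: timedelta normalizes every argument to a fixed days/seconds/microseconds representation, so all the spellings above compare equal:

    from datetime import timedelta

    day = timedelta(days=1)
    assert day == timedelta(hours=24)
    assert day == timedelta(minutes=60 * 24)
    assert day == timedelta(seconds=60 * 60 * 24)
    assert day == timedelta(microseconds=1000000 * 60 * 60 * 24)
    # Everything is normalized to days/seconds/microseconds:
    assert (day.days, day.seconds, day.microseconds) == (1, 0, 0)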
Re: [Python-Dev] Status on PEP-431 Timezones
>>> The "days" attribute here is indeed confusing as it doesn't mean 1 day, >>> it means 24 hours. >> Which, in naive arithmetic, are exactly the same thing. [Terry Reedy] > I think using the word 'naive' is both inaccurate and a mistake. The issue > is civil or legal time versus STEM time, where the latter includes > applications like baking cakes. Sorry, never heard of "STEM time" before - & a quick Google search didn't help. > It could also be called calendar time versus elapsed time. (Financial/legal > arithmetic versus STEM arithmetic is a somewhat similar contrast.) And I am, alas, equally unclear on what any of those others mean (exactly) to you. > The idea that an hour can be sliced out of a somewhat random March day and > inserting it into a somewhat random October day is rather sophisticated. It > came from the minds of government bureaucrats. It might be smart, dumb, or > just a cunning way for civil authorities to show who is in charge by making > us all jump. But not 'naive'. I agree. Python's "naive time" single-timezone arithmetic intentionally ignores all that: it ignores leap seconds, it ignores DST transition points, it ignores governments deciding to change the base UTC offset within a pre-existing time zone, ... It's time s naive that it thinks 24 hours is the same thing as a day ;-) > 'Naive' means simple, primitive, or deficient in informed judgement. It is > easy to take it as connoting 'wrong'. While some people in this thread seem convinced Python's naive time _is_ "wrong", it's not because it's called "naive". In any case, Guido decided to call it "naive" over 13 years ago, starting here, and that terminology has been in use ever since: https://mail.python.org/pipermail/python-dev/2002-March/020648.html > Tim, you have been arguing that civil/legal time arithmetic is not naive. Yes. But that's not "an argument", it's a plain fact that Python's "naive time" (note that "naive" here is technical term, used widely in the datetime docs) is not civil/legal time (assuming I understand what you mean by that phrase). > Calling civil time naive undercuts this claim. I don't see that I ever said civil time is naive. Adding a day is _not_ always the same as adding 24 hours in (at least Lennart's beliefs about) civil time. They _are_ always the same in Python's ("naive") datetime arithmetic. And the latter is all I said in the quote at the top of this msg. What am I missing? It's always something ;-) ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com