[issue38200] Adding itertools.pairwise to the standard library?

2019-09-18 Thread Tim Peters
Tim Peters added the comment: There's an eternal culture clash here: functional languages have a long history of building in just about everything of plausible use, regardless of how trivial to build on other stuff. This started when LISP was barely released before (cadr x) was intro

[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-09-19 Thread Tim Peters
Tim Peters added the comment: Sorry, but there was nothing wrong with the CHECK_SMALL_INT macro, to my eyes, to begin with - except that it was burdened with an over-elaborate "do ... while(0)" wrapper. Not all macros are _intended_ to be "cheap functions". Lik

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-27 Thread Tim Peters
Tim Peters added the comment: tp_clear implementations are necessary to reclaim trash cycles. They're always best practice for objects that may be in trash cycles. tuples are just "cute rebels" that way ;-) Best guess is that the (some) extension isn't playing by t

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-28 Thread Tim Peters
Tim Peters added the comment: > call_func and remove are part of a reference cycle. A forced garbage > collection breaks the cycle and removes the two objects, but they are > not removed in the expected order: > > * first: call_func > * then: remove > > The crash

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-29 Thread Tim Peters
Tim Peters added the comment: Sorry, this is very hard for me - broke an arm near the shoulder on Tuesday, and between bouts of pain and lack of good sleep, concentration is nearly impossible. Typing with one hand just makes it worse :-( We must know that F is trash, else we never would

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-29 Thread Tim Peters
Tim Peters added the comment: Fleshing out something I left implicit: if there's a trash object T with a finalizer but we don't KNOW it's trash, we won't force-run its finalizer before delete_garbage starts either. Then, really the same thing: we may tp_clear som

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-29 Thread Tim Peters
Tim Peters added the comment: > Note that my flags show that W *is* in 'unreachable'. It has > to be otherwise F would not have tp_clear called on it. Right! W holds a strong reference to F, so if W were thought to be reachable, F would be too. But F isn't. >

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-29 Thread Tim Peters
Tim Peters added the comment: > I see that handle_weakrefs() calls _PyWeakref_ClearRef() and that > will clear the weakref even if it doesn't have callback. So, I > think that takes care for the hole I was worried about. I.e. a > __del__ method could have a weakref to an

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-29 Thread Tim Peters
Tim Peters added the comment: Ah, nevermind my last comment - yes. handle_weakrefs will clear all weakrefs to the objects we know are trash. -- ___ Python tracker <https://bugs.python.org/issue38

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: > Would the attached rough patch (gc_disable_wr_callback.txt) > be a possible fix? When we find W inside handle_weakrefs(), > we mark it as trash and will not execute the callback. It's semantically correct since we never wanted to execute a c

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: Neil, how about this alternative: leave the weakref implementation alone. If we find a trash weakref, simply clear it instead. That would prevent callbacks too, & would also prevent the weakref from being used to retrieve its possibly-trash-too refe

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: Neil, my brief msg 10 minutes before yours suggested the same thing (just clear the weakref), so it must be right ;-)

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: FWIW, I agree with Neil in all respects about the release: his patch is the best approach, plugs segfaulting holes that have been there for many years, and the earlier patches aren't needed anymore.

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: It's unclear to me whether BPO-33418 was a bug or a contrived annoyance :-) If someone believes it was worth addressing, then what it did is the only way to fix it, so should be restored now.

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: Yes, it's better to have tp_clear than not for a variety of reasons (including setting examples of best practice). Best I can tell, the patch for BPO-33418 was reverted _only_ to worm around the crash in _this_ report. That's no longer needed. Or

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-09-30 Thread Tim Peters
Tim Peters added the comment: Łukasz, all type objects have tp_clear slots, and always did. The patch in question put something useful in the function object's tp_clear slot instead of leaving it NULL. No interface, as such, changes eithe

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-01 Thread Tim Peters
Tim Peters added the comment: Neil, about this comment: # - ct is not yet trash (it actually is but the GC doesn't know because of # the missing tp_traverse method). I believe gc should know ct is trash. ct is in the cf list, and the latter does have tp_traverse. What gc won'

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-03 Thread Tim Peters
Tim Peters added the comment: Loose ends. Telegraphic because typing is hard. 1. Docs should be changed to encourage implementing the full gc protocol for "all" containers. Spell out what can go wrong if they don't. Be upfront about that history has, at times, proved us

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-03 Thread Tim Peters
Tim Peters added the comment: My understanding is that the CFFI types at issue don't even have Py_TPFLAGS_HAVE_GC. They're completely invisible to gc. As Armin says in the CFFI issue report (linked to earlier), he never got the impression from the docs that he needed to implemen

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-04 Thread Tim Peters
Tim Peters added the comment: BTW, the phrase "missing tp_traverse" is misleading. If an object with a NULL tp_traverse appears in a gc generation, gc will blow up the next time that generation is collected. That's always been so - gc doesn't check whether tp_trave

[issue38373] List overallocation strategy

2019-10-04 Thread Tim Peters
Tim Peters added the comment: WRT pymalloc, it will always copy on growing resize in this context. A pymalloc pool is dedicated to blocks of the same size class, so if the size class increases (they're 16 bytes apart now), the data must be copied to a different pool (dedicated to bloc

[issue38373] List overallocation strategy

2019-10-04 Thread Tim Peters
Tim Peters added the comment: Don't know. Define "the problem" ;-) As soon as the allocation is over 512 bytes (64 pointers), it's punted to the system malloc family. Before then, do a relative handful of relatively small memcpy's really matter? pymalloc is f

[issue38379] finalizer resurrection in gc

2019-10-05 Thread Tim Peters
New submission from Tim Peters : While people are thinking about gc, zleak.py shows a small bug, and a possible opportunity for improvement, in the way gc treats finalizers that resurrect objects. The bug: the stats keep claiming gc is collecting an enormous number of objects, but in fact

[issue38379] finalizer resurrection in gc

2019-10-06 Thread Tim Peters
Tim Peters added the comment: Just noting that check_garbage() currently only determines which trash objects are now directly reachable from outside. To be usable for the intended purpose, it would need to go on to compute which trash objects are reachable from those too. Maybe a new

[issue38382] statistics.harmonic_mean fails to raise error with negative input that follows a 0

2019-10-06 Thread Tim Peters
Tim Peters added the comment: I don't have a problem with the current behavior (early out on zero, even if later arguments are senseless). So: > * Just document that there is an early-out for zero.

[issue38379] finalizer resurrection in gc

2019-10-08 Thread Tim Peters
Change by Tim Peters : -- keywords: +patch pull_requests: +16241 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/16658

[issue38379] finalizer resurrection in gc

2019-10-08 Thread Tim Peters
Tim Peters added the comment: PR 16658 aims to repair the stats reported.

[issue38379] finalizer resurrection in gc

2019-10-09 Thread Tim Peters
Tim Peters added the comment: New changeset ecbf35f9335b0420cb8adfda6f299d6747a16515 by Tim Peters in branch 'master': bpo-38379: don't claim objects are collected when they aren't (#16658) https://github.com/python/cpython/commit/ecbf35f9335b0420cb8a

[issue38379] finalizer resurrection in gc

2019-10-09 Thread Tim Peters
Tim Peters added the comment: I checked the stat fix into master, but GH failed to backport to 3.7 or 3.8 and I'm clueless. More info in the PR. Does someone else here know how to get a backport done? -- stage: patch review -> backport needed versions: +Python 3.7, Py

[issue38437] Set GC_DEBUG for debug builds of the interpreter

2019-10-10 Thread Tim Peters
Tim Peters added the comment: +1. This code got quite brittle when they decided to fit two pointers, a fat integer, and 3 flags into a struct with room for only the two pointers ;-) It's a mine field now. Enabling one of the few automated mine detectors is thoroughly sen

[issue38379] finalizer resurrection in gc

2019-10-13 Thread Tim Peters
Tim Peters added the comment: Everything here has been addressed, so closing this. zleak.py can apparently run forever now without leaking a byte :-) -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-14 Thread Tim Peters
Tim Peters added the comment: > I'm often amazed it works at all, let alone perfectly. ;-P Indeed! Every time I take a break from gc and come back, I burn another hour wondering why it doesn't recycle _everything_ ;-) > But what happens if the GC doesn't see that W

[issue38006] Crash in remove() weak reference callback of weakref.WeakValueDictionary at Python exit

2019-10-14 Thread Tim Peters
Tim Peters added the comment: > While Neil & I haven't thought of ways that can go wrong now > beyond that a "surprise finalizer" may get run any number of > times ... Speaking of which, I no longer believe that's true. Thanks to the usual layers of baffli

[issue38490] statistics: add covariance and Pearson's correlation

2019-10-17 Thread Tim Peters
Tim Peters added the comment: I'm in favor of adding all of this (covariance, coefficient, linear regression). It's still at the level of elementary statistics, and even taught in watered down "business statistics" classes. It's about the minimum that can be do

[issue38626] small change at bisect_left function for easy understanding

2019-10-28 Thread Tim Peters
Tim Peters added the comment: So as far as possible, CPython only uses __lt__ ("<") element comparisons for its order-sensitive algorithms. This is documented for list.sort(), but the bisect and heapq modules strive to do the same. The point is to minimize the number of compa
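
A minimal sketch of the point above, assuming a hypothetical `Version` class (not from the original thread) that defines only `__lt__`. It should be enough for `list.sort()`, `bisect`, and `heapq`, since each of them compares elements with `<`:

```python
import bisect
import heapq


class Version:
    """Hypothetical class defining only __lt__ (no __le__, __gt__, __eq__)."""

    def __init__(self, major, minor):
        self.key = (major, minor)

    def __lt__(self, other):
        return self.key < other.key

    def __repr__(self):
        return f"Version{self.key}"


versions = [Version(3, 9), Version(3, 7), Version(3, 10)]

# list.sort() is documented to use only "<" between elements.
versions.sort()

# bisect and heapq also compare elements with "<", so they work here too.
position = bisect.bisect_left(versions, Version(3, 8))
heap = list(versions)
heapq.heapify(heap)
smallest = heapq.heappop(heap)

print(versions, position, smallest)
```

Because nothing beyond `__lt__` is needed, a user-defined type has to supply just one comparison method to take part in these algorithms.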

[issue43420] Optimize rational arithmetics

2021-03-12 Thread Tim Peters
Tim Peters added the comment: Terry, we could do that, but the opposition here isn't strong, and is pretty baffling anyway ;-) : the suggested changes are utterly ordinary for implementations of rationals, require very little code, are not delicate, and are actually straightforward t

[issue43420] Optimize rational arithmetics

2021-03-21 Thread Tim Peters
Tim Peters added the comment: New changeset 690aca781152a498f5117682524d2cd9aa4d7657 by Sergey B Kirpichev in branch 'master': bpo-43420: Simple optimizations for Fraction's arithmetics (GH-24779) https://github.com/python/cpython/commit/690aca781152a498f51176825

[issue43420] Optimize rational arithmetics

2021-03-21 Thread Tim Peters
Tim Peters added the comment: Thanks, all! This has been merged now. If someone wants to continue pursuing things left hanging, I'd suggest opening a different BPO report. -- resolution: -> fixed stage: patch review -> resolved status: ope

[issue43420] Optimize rational arithmetics

2021-03-21 Thread Tim Peters
Tim Peters added the comment: If experience is any guide, nothing about anything here will go smoothly ;-) For example, setting up a module global `_gcd` name for `math.gcd` is a very standard, widespread kind of micro-optimization. But - if that's thought to be valuable (who knows?

[issue43420] Optimize rational arithmetics

2021-03-22 Thread Tim Peters
Tim Peters added the comment: This report is closed. Please open a different report. We've already demonstrated that, as predicted, nothing can be said here without it being taken as invitation to open-ended discussion. So it goes, but it doesn't belong on _this_ repo

[issue43618] random.shuffle loses most of the elements

2021-03-24 Thread Tim Peters
Tim Peters added the comment: Are you sure it's "a list"? At least print out `type(questions_element)`. `random.shuffle()` doesn't contain any code _capable_ of changing a list's length. It only does indexed accessing of the list: ... for i in reversed(range(1,
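
A small check along the lines suggested above, using a made-up `questions` list (the reporter's actual data is not shown here). It illustrates that shuffling a real list rearranges it in place without changing its length or contents:

```python
import random
from collections import Counter

# Hypothetical stand-in for the reporter's data.
questions = list(range(20))
print(type(questions))          # confirm it really is a list

before = Counter(questions)
result = random.shuffle(questions)   # shuffles in place

# Same length and same elements, only the order may differ.
print(len(questions), Counter(questions) == before)

# shuffle() returns None, so "questions = random.shuffle(questions)"
# would discard the list rather than shorten it.
print(result)
```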

[issue43593] pymalloc is not aware of Memory Tagging Extension (MTE) and crashes

2021-03-31 Thread Tim Peters
Tim Peters added the comment: I'm skeptical ;-) If MTE is actually being used, system software assigns "random" values to 4 of the higher-order bits. When obmalloc punts to the system malloc, presumably those bits will be randomized in the addresses returned by malloc. The

[issue43684] Add combined opcodes

2021-04-02 Thread Tim Peters
Tim Peters added the comment: """ My philosophy here (which I learned from Tim Peters in the early 2000s) is that even though each individual improvement has no measurable effect on a general benchmark (as shown in the same comment), the combined effect of a number of tiny i

[issue43689] difflib: mention other "problematic" characters in documentation

2021-04-02 Thread Tim Peters
Tim Peters added the comment: Lines beginning with "?" are entirely synthetic: they were not present in either input. So that's what that part means. I'm not clear on what else could be materially clearer without greatly bloating the text. For example, >>> d

[issue43593] pymalloc is not aware of Memory Tagging Extension (MTE) and crashes

2021-04-03 Thread Tim Peters
Tim Peters added the comment: Can't really know without a system to try it on, but my best guess is that these asserts are the only thing that will fail with tagging enabled. The obvious "fix" is indeed just to skip them on a platform with tagging enabled. They're mea

[issue43593] pymalloc is not aware of Memory Tagging Extension (MTE) and crashes

2021-04-03 Thread Tim Peters
Tim Peters added the comment: BTW, your cache WIP https://github.com/python/cpython/pull/25130/files partly moves to tracking pool (instead of byte) addresses, but any such attempt faces a subtlety: it's not necessarily the case that a pool is entirely "owned" by obmalloc o

[issue43593] pymalloc is not aware of Memory Tagging Extension (MTE) and crashes

2021-04-04 Thread Tim Peters
Tim Peters added the comment: I think it's time to change what address_in_range() tries to answer. It currently gives a precise answer to "is this byte address in a region obmalloc owns?". But that's stronger than what it needs to do its job: the real question is "

[issue43689] difflib: mention other "problematic" characters in documentation

2021-04-05 Thread Tim Peters
Tim Peters added the comment: Terry, your suggested replacement statement looks like an improvement to me. Perhaps the longer explanation could be placed in a footnote. Note that I'm old ;-) I grew up on plain old ASCII, decades & decades ago, and tabs are in fact the only "ch

[issue43475] Worst-case behaviour of hash collision with float NaN

2021-04-10 Thread Tim Peters
Tim Peters added the comment: I agree hashing a NaN acting like the generic object hash (return rotated address) is a fine workaround, although I'm not convinced it addresses a real-world problem ;-) But why not? It might. But that's for CPython. I'm loath to guarantee anyt

[issue43955] Test Failures on Windows 10

2021-04-27 Thread Tim Peters
Tim Peters added the comment: I expect parallelism is a red herring: early in the test output attached to this report: 0:00:04 Run tests sequentially and there's no other evidence in the output that multiple tests are running simultaneously. Also on Win10, the 4 failing tests here

[issue43955] Test Failures on Windows 10

2021-04-27 Thread Tim Peters
Tim Peters added the comment: Shreyan Avigyan: > And the "(Pdb) continue (...) actually is manually entered by me. Victor Stinner: Do you mean that you modified the Python source code? Me: Doubt it. For me, with more words: the "(Pdb) " prompt appears all by itself, by m

[issue37387] test_compileall fails randomly on Windows when tests are run in parallel

2021-04-29 Thread Tim Peters
Tim Peters added the comment: @Sheyvan, whether it's possible to delete (rename, etc) an open file is a property not of Python, but of the operating system. Windows doesn't allow it; Linux (for example) does. It's generally considered to be "a bug" in CPython

[issue37387] test_compileall fails randomly on Windows when tests are run in parallel

2021-04-29 Thread Tim Peters
Tim Peters added the comment: A "good" solution would be one that runs the test in such a way that it doesn't fail only on Windows ;-) There are presumably many ways that could be accomplished, including ugly ones. For example, if test_compileall is in the collection of tests

[issue37387] test_compileall fails randomly on Windows when tests are run in parallel

2021-04-30 Thread Tim Peters
Tim Peters added the comment: Yes, test_compileall can still fail for this reason on Windows. From a run just now with -j0 (same as -j10 on this box, which has 8 logical cores: a -j value <= 0 is treated the same as "2 + number of logical cores"): """ Comp

[issue44034] Incorrect type casting of float into int

2021-05-04 Thread Tim Peters
Tim Peters added the comment: Please study the docs first: https://docs.python.org/3/tutorial/floatingpoint.html That will give you the background to understand why `int()` has nothing to do with this. >>> 1. 2.0 That is, `int()` was passed 2.0 to begin with, be

[issue44054] 2**53+1 != float(2**53+1)

2021-05-06 Thread Tim Peters
Tim Peters added the comment: [Stefan] > I found it surprising that a comparison uses a different > method of conversion than the (obvious) user-side > conversion, with a different outcome. This seems to be > implementation details leaking into the user side. It's "spir
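
A short sketch of the behavior under discussion, assuming the usual IEEE-754 double format for `float`. The mixed int/float comparison is exact, while the explicit conversion rounds:

```python
n = 2**53 + 1

# float() must round, because 2**53 + 1 is not representable as a double.
as_float = float(n)

# Mixed comparison is done exactly, so the int and the rounded float differ.
mixed_equal = (n == as_float)

# Converting first and then comparing hides the rounding.
converted_equal = (float(n) == as_float)

print(as_float == 2**53, mixed_equal, converted_equal)
```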

[issue44154] Optimize Fraction pickling

2021-05-16 Thread Tim Peters
Tim Peters added the comment: Oh yes - please do. It's not just pickle size - going through str() makes (un)pickling quadratic time in both directions if components are large. Pickle the component ints instead, and the more recent pickle protocol(s) can do both directions in linear

[issue44197] [request feature] Itertools extended combinations to limited number of repetition

2021-05-20 Thread Tim Peters
Tim Peters added the comment: Dennis, combinations("aaabbbcccddd") isn't a valid call - the function requires a "how many?" argument too. If, e.g., we were asking for groups of 4, then combinations("aaabbbcccddd", 4) generates the 4-tuple ('a',
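
A small illustration with a shorter, made-up input. `combinations()` requires the group size, and it treats elements as distinct by position, so equal values can yield the same tuple more than once:

```python
from collections import Counter
from itertools import combinations

# The "how many?" argument is mandatory.
try:
    combinations("aab")
    missing_r_rejected = False
except TypeError:
    missing_r_rejected = True

# Elements are distinguished by position, not by value,
# so ('a', 'b') is produced once for each of the two 'a's.
groups = Counter(combinations("aab", 2))
print(missing_r_rejected, groups)
```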

[issue44197] [request feature] Itertools extended combinations to limited number of repetition

2021-05-20 Thread Tim Peters
Change by Tim Peters : -- nosy: +rhettinger

[issue44339] Discrepancy between math.pow(0.0, -inf) and 0.0**-inf

2021-06-07 Thread Tim Peters
Tim Peters added the comment: +1. Although, to be fair, I'd personally be happy if (+-0)**inf returned, say, 1.375 instead ;-) -- nosy: +tim.peters

[issue44376] Improve performance of integer exponentiation

2021-06-10 Thread Tim Peters
Tim Peters added the comment: Under the released 3.9.5 for 64-bit Win10, raising to the power 2 is clearly much slower than multiplying directly: C:\Windows\System32>py -3 -m timeit -s "x=151" "x*x" 1000 loops, best of 5: 30 nsec per loop C:\Windows\System32>

[issue44376] Improve performance of integer exponentiation

2021-06-10 Thread Tim Peters
Change by Tim Peters : -- keywords: +patch pull_requests: +25248 stage: -> patch review pull_request: https://github.com/python/cpython/pull/26662

[issue44376] Improve performance of integer exponentiation

2021-06-10 Thread Tim Peters
Tim Peters added the comment: This is a stab at reducing overhead for small exponents, along the lines I sketched: https://github.com/python/cpython/pull/26662 Unfortunately, I've been unable to convince BPO and GitHub to recognize that the PR is related to this report. Did something

[issue44376] Improve performance of integer exponentiation

2021-06-12 Thread Tim Peters
Tim Peters added the comment: New changeset 9d8dd8f08aae4ad6e73a9322a4e9dee965afebbc by Tim Peters in branch 'main': bpo-44376 - reduce pow() overhead for small exponents (GH-26662) https://github.com/python/cpython/commit/9d8dd8f08aae4ad6e73a9322a4e9de

[issue44376] Improve performance of integer exponentiation

2021-06-12 Thread Tim Peters
Tim Peters added the comment: Closing this now because the pull request did, I believe, all that can be done at the function level. Exponents of 1 and 2 are well within a factor of 2 of repeated multiplication now, and it's essentially a tie at exponent 3 now. Above that, pow() wins no

[issue44571] itertools: takedowhile()

2021-07-10 Thread Tim Peters
Tim Peters added the comment: I agree Raymond's `before_and_after()` looks like an elegant, efficient, and usable approach to this. One minor nit: there's no need for the `iter()` call in: yield from iter(transition) Indeed, it confused me at first, because `yield from x

[issue44571] itertools: takedowhile()

2021-07-12 Thread Tim Peters
Tim Peters added the comment: If you don't use the `after` iterator, then of course you'll never see the values (if any) it would have yielded. How could it possibly be otherwise? By design and construction, the `before` iterator ends before yielding the first (if any) trans

[issue44571] itertools: takedowhile()

2021-07-12 Thread Tim Peters
Tim Peters added the comment: That said, if you really do want those semantics, it's easy to build on top of Raymond's API: def takewhile_plus_one_more_if_any(pred, iterable): from itertools import islice, chain before, after = before_and_after(pred, iterable) return ch

[issue44611] CPython uses deprecated randomness API

2021-07-12 Thread Tim Peters
Tim Peters added the comment: Dan, the Microsoft URL in your message gives a 404 for me. Did you perhaps mean to end it with "cng-portal" (instead of "cng-por")? -- nosy: +tim.peters

[issue44663] Possible bug in datetime utc

2021-07-17 Thread Tim Peters
Tim Peters added the comment: > It looks like the difference one would expect from (fast) human input) Nope, the timestamps in the original report are about 3 hours apart (10808+ seconds). Reports like these are often much clearer if they state the timezone of the system they're ru

[issue44663] Possible bug in datetime utc

2021-07-18 Thread Tim Peters
Tim Peters added the comment: If you want to pursue changing what utcnow() does, python-ideas or python-dev would probably need to be involved. Backward-incompatible changes are very hard sells. As Paul Ganssle noted here, https://blog.ganssle.io/articles/2019/11/utcnow.html in Python 2

[issue44692] Const folding in parser with negative numbers doesn't match float/int behaviour

2021-07-20 Thread Tim Peters
Tim Peters added the comment: The binary power operator (`**`) has higher precedence than the unary negation operator (`-`). That is, -x**y groups as -(x**y). Not a bug - that's how it was designed and how it's documented. Note that this isn't novel, either. For example, t
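
A quick sketch of the grouping described above, using arbitrary example values. Parentheses are needed to raise a negative number to a power:

```python
# Unary minus binds less tightly than **, so this is -(2 ** 2).
without_parens = -2 ** 2

# Parenthesize to apply the negation first.
with_parens = (-2) ** 2

# The same grouping applies to variables.
x, y = 3, 2
grouped_like = (-x ** y == -(x ** y))

print(without_parens, with_parens, grouped_like)
```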

[issue44770] float('nan') is True

2021-07-28 Thread Tim Peters
Tim Peters added the comment: Sorry, I'm just going to close this. For values of all numeric types now, `bool(x)` returns the same as `x != type(x)(0)`. Besides being backward-incompatible, making an exception for NaN would be jarringly inconsistent. Note that you don't nee
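
A minimal check of the rule stated above over a handful of sample values (the sample list and the helper name `truth_matches_zero_test` are illustrative only). NaN compares unequal to zero, so it is truthy like any other nonzero value:

```python
import math
from decimal import Decimal
from fractions import Fraction


def truth_matches_zero_test(x):
    """Return True if bool(x) agrees with x != type(x)(0)."""
    return bool(x) == (x != type(x)(0))


samples = [
    0, 7, 0.0, -0.0, 2.5, math.nan, math.inf,
    0j, 1j, Fraction(0), Fraction(1, 3), Decimal(0), Decimal("1.5"),
]

all_consistent = all(truth_matches_zero_test(x) for x in samples)
print(all_consistent, bool(math.nan))

# To test for NaN explicitly, use math.isnan() rather than truthiness.
print(math.isnan(math.nan))
```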

[issue44835] What does "Python for Windows will still be Python for DOS" mean?

2021-08-04 Thread Tim Peters
Tim Peters added the comment: The CPython Windows installer has a "thank you" box at the end: """ Special Windows thanks to Mark Hammond, without whose years of freely shared Windows expertise, Python for Windows would still be Python for DOS. """

[issue34561] Replace list sorting merge_collapse()?

2021-08-28 Thread Tim Peters
Tim Peters added the comment: The merge order was mentioned on python-dev today, and a quick web search turned up a revision of Vincent Jugé's "Adaptive Shivers Sort: An Alternative Sorting Algorithm" paper I hadn't seen before: https://arxiv.org/pdf/1809.08411.pdf

[issue34561] Replace list sorting merge_collapse()?

2021-08-29 Thread Tim Peters
Tim Peters added the comment: Added new runstack.py. New `shivers2()` adds the implementation of adaptive ShiversSort from Vincent's later paper. While the code is simpler, it appears to behave identically. New `shivers3()` adds, from the same paper, the new "length-adaptive S

[issue45045] Optimize mapping patterns of structural pattern matching

2021-08-29 Thread Tim Peters
Tim Peters added the comment: And another runstack.py adds `shivers4()`, which reworks `shivers3()` (length-adaptive ShiversSort) so that division, log2(), and floor() aren't used anymore. It does need a loop, though, which needs to go around a number of times `k` such that k i

[issue45045] Optimize mapping patterns of structural pattern matching

2021-08-29 Thread Tim Peters
Change by Tim Peters : -- Removed message: https://bugs.python.org/msg400568

[issue45045] Optimize mapping patterns of structural pattern matching

2021-08-29 Thread Tim Peters
Change by Tim Peters : Removed file: https://bugs.python.org/file50242/runstack.py

[issue34561] Replace list sorting merge_collapse()?

2021-08-29 Thread Tim Peters
Tim Peters added the comment: And another runstack.py adds `shivers4()`, which reworks `shivers3()` (length-adaptive ShiversSort) so that division, log2(), and floor() aren't used anymore. It does need a loop, though, which needs to go around a number of times `k` such that k i

[issue34561] Replace list sorting merge_collapse()?

2021-08-30 Thread Tim Peters
Tim Peters added the comment: New runstack.py mostly adds comments about a surprise: the idea that length-adaptive ShiversSort ekes out better results than powersort appears nearly unique to the specific "0.80" cutoff used in the random-case generation code to pick between t

[issue34561] Replace list sorting merge_collapse()?

2021-08-31 Thread Tim Peters
Change by Tim Peters : -- keywords: +patch pull_requests: +26550 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/28108

[issue34561] Replace list sorting merge_collapse()?

2021-09-01 Thread Tim Peters
Tim Peters added the comment: I created a PR that implements the powersort merge strategy: https://github.com/python/cpython/pull/28108 Across all the time this issue report has been open, that strategy continues to be the top contender. Enough already ;-) It's indeed a more diff

[issue34561] Replace list sorting merge_collapse()?

2021-09-06 Thread Tim Peters
Tim Peters added the comment: New changeset 5cb4c672d855033592f0e05162f887def236c00a by Tim Peters in branch 'main': bpo-34561: Switch to Munro & Wild "powersort" merge strategy. (#28108) https://github.com/python/cpython/commit/5cb4c672d8550335

[issue34561] Replace list sorting merge_collapse()?

2021-09-06 Thread Tim Peters
Change by Tim Peters : -- resolution: -> fixed stage: patch review -> resolved status: open -> closed

[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()

2021-09-12 Thread Tim Peters
Tim Peters added the comment: Unfortunately, you're getting hurt by the "autojunk" feature (see the docs). If you turn it off, you'll get a result much more to your liking: >>> print(SequenceMatcher(None, a, b).ratio()) 0.3431803896920176 >>> print(Sequ
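
A small sketch of the "autojunk" effect, using made-up strings rather than the reporter's data. The heuristic only applies when the second sequence has at least 200 items, and it stops treating very frequent items as match anchors:

```python
from difflib import SequenceMatcher

# Hypothetical inputs: identical except for the first character,
# with 'a' and 'b' each making up far more than 1% of the text.
a = "x" + "ab" * 150
b = "y" + "ab" * 150

# Default: autojunk=True, so the frequent 'a' and 'b' are not used as anchors.
with_autojunk = SequenceMatcher(None, a, b).ratio()

# Turning the heuristic off lets the long common run be found.
without_autojunk = SequenceMatcher(None, a, b, autojunk=False).ratio()

print(with_autojunk, without_autojunk)
```

Passing `autojunk=False` costs more time on long inputs, which is the tradeoff the heuristic was designed around.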

[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()

2021-09-15 Thread Tim Peters
Tim Peters added the comment: I have no idea why you think the result should be 0.2. 0.5630188679245283 looks correct to me with autojunk disabled: sm = SequenceMatcher(None, a, b, autojunk=False) total = 0 for m in sm.get_matching_blocks(): print(m, repr(a[m.a : m.a + m.size

[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()

2021-09-15 Thread Tim Peters
Tim Peters added the comment: Please stop re-opening this. The issue tracker is not a "help desk", and your confusions aren't necessarily Python bugs ;-) If you post something that looks like an actual bug, I'll re-open the report. SequenceMatcher works on sequences

[issue45348] math.log(243, 3) value issue

2021-10-02 Thread Tim Peters
Tim Peters added the comment: CPython's log() builds on the platform C's libm facilities, and C simply doesn't define primitives capable of producing a worst-case < 1 ulp error 2-argument log in reasonable time. Instead we have to build it out of two separate log operatio

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
New submission from Tim Peters : The code could typically be faster if it did what its comments imply it does: skip the expense of PyObject_RichCompareBool() entirely for the first pair of tuple elements. It actually always calls PyObject_RichCompareBool() on the first pair, and only if that

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Tim Peters added the comment: FYI, this is fallout from a StackOverflow mystery: https://stackoverflow.com/questions/69468552/efficiency-of-sorting-by-multiple-keys-in-python/69610671#

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Tim Peters added the comment: The attached tupsort.py gives a simple, focused example. Typical output on my box: float 3.10 (float,) 11.75 [float] 25.68 It's sorting a large list of floats. In the first line the list contains plain floats. In the second line, each floa

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Change by Tim Peters : -- keywords: +patch pull_requests: +27345 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/29076

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Change by Tim Peters : -- assignee: -> tim.peters

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Tim Peters added the comment: Stefan, I have scant memory of ever caring, but, if I did, I got over it ;-) >>> math.nan == math.nan False >>> {math.nan : 5}[math.nan] 5 That is, PyObject_RichCompareBool() takes object identity as overriding __eq__; that's why the
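
A short sketch of the identity shortcut mentioned above, as it shows up in ordinary container operations. The variable names are illustrative only:

```python
import math

nan = math.nan

# NaN is not equal to itself under ==.
equal_to_itself = (nan == nan)

# Containers check identity before ==, so the same object is "found".
found_in_list = nan in [nan]
lists_equal = ([nan] == [nan])

# A different NaN object fails both the identity and the == test.
other_nan = float("nan")
found_other = other_nan in [nan]

print(equal_to_itself, found_in_list, lists_equal, found_other)
```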

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Tim Peters added the comment: Stefan, I looked at that old PR and can't find anywhere I suggested that he change the unsafe_tuple_compare() logic. I just _asked_ him "I'm curious about what the patched Python prints for this program:". And, in fact, that program sho

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-19 Thread Tim Peters
Tim Peters added the comment: > Elliot shortly after retreated from the approach, saying he > rewrote unsafe_tuple_compare to move the less-than after > the equality testing, to make sure it's 100% consistent". I remember at the time having no idea what he meant by that com

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-20 Thread Tim Peters
Tim Peters added the comment: It's rare that an optimization is a _pure_ win. Some cases win, others lose. There's almost never "a proof" of net gain ending with "QED". Of course it's dead easy to construct examples where "many duplicates in the first

[issue45542] Using multiple comparison operators can cause performance issues

2021-10-20 Thread Tim Peters
Tim Peters added the comment: I think Dennis's example is fatal: from section 6.10 ("Comparisons"): """ Comparisons can be chained arbitrarily, e.g., `x < y <= z` is equivalent to `x < y and y <= z`, except that y is evaluated only once (but in bo
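
A minimal sketch of the documented equivalence, using a hypothetical `middle()` function that records each call. The chained form evaluates the middle operand once, while the spelled-out `and` form evaluates it twice:

```python
calls = []


def middle():
    """Hypothetical middle operand that records every evaluation."""
    calls.append("y")
    return 5


# Chained: middle() is evaluated a single time.
chained = 1 < middle() <= 10
calls_chained = len(calls)

calls.clear()

# Expanded by hand: middle() appears, and runs, twice.
expanded = 1 < middle() and middle() <= 10
calls_expanded = len(calls)

print(chained, calls_chained, expanded, calls_expanded)
```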

[issue45530] Improve listobject.c's unsafe_tuple_compare()

2021-10-21 Thread Tim Peters
Tim Peters added the comment: > I see you mentioned that PyObject_RichCompareBool(..., Py_EQ) might be > faster for example because it checks identity. For example, in tupsort.py replace xs = [random() for _ in range(length)] with xs = ['z' * 100 for _ in rang
