[issue41972] bytes.find consistently hangs in a particular scenario

2020-12-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Removed file: https://bugs.python.org/file49672/twoway_demo.py ___ Python tracker <https://bugs.python.org/issue41972> ___ ___ Python-bug

[issue41972] bytes.find consistently hangs in a particular scenario

2020-12-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Added file: https://bugs.python.org/file49674/twoway_demo.py ___ Python tracker <https://bugs.python.org/issue41972> ___ ___ Python-bugs-list m

[issue42632] Reassgining ZeroDivisionError will lead to bug in Except clause

2020-12-14 Thread Dennis Sweeney
Dennis Sweeney added the comment: This is just how local/nonlocal/global/builtin variables work in Python. When you assign to a name anywhere inside of a function, all occurrences of that name refer by default to a local variable. So the line "ZeroDivisionError = 1" tells the foo(
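A minimal sketch of the scoping rule described in this comment. The body of `foo` is a guess at the shape of the reported code, not a copy of it, and `outcome` is a hypothetical helper added only to capture the result. Because `ZeroDivisionError` is assigned somewhere inside `foo`, the name is local for the whole function, so the `except` clause looks up a local that has not been bound yet:

```python
def foo():
    try:
        1 / 0
    except ZeroDivisionError:   # local name, still unbound at this point
        ZeroDivisionError = 1   # this assignment is what makes the name local


def outcome():
    # Hypothetical helper: report which exception escapes foo().
    try:
        foo()
    except UnboundLocalError as exc:
        return type(exc).__name__
    return "no error"


print(outcome())
```

Renaming the local variable, or dropping the assignment, lets the `except` clause find the builtin again.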

[issue42734] Outdated CodeType call in "bogus_code_obj.py"

2020-12-25 Thread Dennis Sweeney
Dennis Sweeney added the comment: To make it slightly more readable and future-proof so such things don't become outdated again in the future, you could use the CodeType.replace() method. See also https://bugs.python.org/issue42422 -- nosy: +Dennis Sw
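A short sketch of what `CodeType.replace()` (available since Python 3.8) looks like in use. The function `f` and the field being changed are illustrative choices, not taken from `bogus_code_obj.py`:

```python
def f():
    return 1


# replace() builds a new code object with only the named fields changed,
# so the call does not need to list every positional CodeType argument.
renamed = f.__code__.replace(co_name="g")

print(renamed.co_name, f.__code__.co_name)
```

Since only the fields of interest are named, the call should keep working when the `CodeType` constructor gains or reorders parameters.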

[issue42793] Bug of round function

2020-12-31 Thread Dennis Sweeney
Dennis Sweeney added the comment: This is not a bug. See https://stackoverflow.com/a/10825998/11461120 -- nosy: +Dennis Sweeney ___ Python tracker <https://bugs.python.org/issue42

[issue42804] Unable to compile the cpython code x86 windows

2021-01-01 Thread Dennis Sweeney
Dennis Sweeney added the comment: What commands did you enter? Are you following the instructions at https://devguide.python.org/setup/ ? -- nosy: +Dennis Sweeney ___ Python tracker <https://bugs.python.org/issue42

[issue42804] Unable to compile the cpython code x86 windows

2021-01-01 Thread Dennis Sweeney
Dennis Sweeney added the comment: Were you able to run PCbuild\get_externals.bat ? -- ___ Python tracker <https://bugs.python.org/issue42804> ___ ___ Python-bug

[issue42808] Add PyType_Type.tp_vectorcall for type(obj) performance

2021-01-02 Thread Dennis Sweeney
New submission from Dennis Sweeney : # common case .\python.bat -m pyperf timeit "type(17)" Master: 49.9 ns +- 1.8 ns ---> PR: 33.3 ns +- 1.4 ns # uncommon case .\python.bat -m pyperf timeit "type('A', (object, ), {})" Master: 5.14 us +- 0.

[issue42808] Add PyType_Type.tp_vectorcall for type(obj) performance

2021-01-02 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +22892 stage: -> patch review pull_request: https://github.com/python/cpython/pull/24058 ___ Python tracker <https://bugs.python.org/issu

[issue41972] bytes.find consistently hangs in a particular scenario

2021-01-17 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 22904 now adds a text document explaining how the Two-Way algorithm works in much more detail. I was looking at more benchmarking results, and I came to a couple of conclusions about cutoffs. There's a consistent benefit to using the two-way algo

[issue41545] gc API requiring matching number of gc.disable - gc.enable calls

2021-01-18 Thread Dennis Sweeney
Dennis Sweeney added the comment: It looks like this was a duplicate of https://bugs.python.org/issue31356 -- ___ Python tracker <https://bugs.python.org/issue41

[issue31356] Add context manager to temporarily disable GC

2021-01-18 Thread Dennis Sweeney
Dennis Sweeney added the comment: https://bugs.python.org/issue41545 is a duplicate of this. In that report, there's an example of something that can go wrong with the save-a-boolean-per-context-manager approach even when threads are not used, but when concurrent generators are used,

[issue43057] Timezone 'Etc/GMT-5' functions as Timezone 'Etc/GMT+5'

2021-01-28 Thread Dennis Sweeney
Dennis Sweeney added the comment: I reproduced something similar in 3.9 with standard library modules: >>> from datetime import datetime >>> from zoneinfo import ZoneInfo >>> zone = ZoneInfo(key='Etc/GMT-3') >>> zone.tzname(datetime(2021, 1

[issue43078] Equality Errors when Unpickling and Pickling a Dictionary with a nan

2021-01-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: I think this is the expected behavior. It is expected by IEEE 754 that nan != nan, so that behavior exists: >>> nan = float('nan') >>> nan2 = float('nan') >>> assert nan != nan2 >>> assert nan != nan
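A small sketch of how this plays out with pickling. The dictionary contents are made up for illustration. Container comparison checks identity before equality, so a dict compares equal to itself even with a nan inside, while a round trip through pickle produces a distinct nan object:

```python
import pickle

nan = float("nan")
original = {"x": nan}

# Same nan object on both sides: the identity shortcut makes this True.
same_dict_equal = (original == original)

# Unpickling creates a new float object, so the comparison falls back to
# nan == nan, which is False under IEEE 754.
restored = pickle.loads(pickle.dumps(original))
round_trip_equal = (original == restored)

print(same_dict_equal, round_trip_equal)
```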

[issue43130] Should this construct throw an exception?

2021-02-04 Thread Dennis Sweeney
Dennis Sweeney added the comment: This is the expected behavior. From https://docs.python.org/3/tutorial/introduction.html#first-steps-towards-programming """ In Python, like in C, any non-zero integer value is true; zero is false. The condition may also be a str

[issue43134] (list have item) instate (item in list)

2021-02-05 Thread Dennis Sweeney
Dennis Sweeney added the comment: You could express this as: a = [input() for i in range(10)] x = input() print(x in a) This is more clear IMO, because if you want to have something happen before something else, it's clearest to put them on separate lines, one after the other. I

[issue43151] is with literals in 3.8 release

2021-02-06 Thread Dennis Sweeney
Dennis Sweeney added the comment: This was a very intentional change from the commit 3bcbedc9f1471d957a30a90f9d1251516b422416 It's not safe to check `x is y` when x and y are strings. You should always use `x == y` for strings instead. In CPython, if the names x and y both refer to the
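A brief sketch of the difference, using strings built at run time so that no literal is compared with `is`. The variable names are illustrative. Whether the two strings share one object is an implementation detail, which is why only `==` is safe to rely on:

```python
a = "".join(["py", "thon"])
b = "".join(["py", "thon"])

equal = (a == b)        # compares contents: always True here
same_object = (a is b)  # compares identity: depends on the implementation

print(equal, same_object)
```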

[issue43151] is with literals in 3.8 release

2021-02-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: There may be reason to re-open this. With Python 3.9.0, I see the inconsistent behavior that Gary describes only when using IDLE. So this is likely an IDLE issue. I attached a screenshot of the difference. -- Added file: https://bugs.python.org

[issue43151] is with literals in 3.8 release

2021-02-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: I think the strangeness is happening because sometimes, the warning is printed to stderr, while other times, IDLE's parser notices the "is " anti-pattern and raises a SyntaxError. See the attached screenshot for the IDLE output versus the con

[issue43198] Operations on sets more than hundred times less efficient with python3.9 than with previous versions

2021-02-10 Thread Dennis Sweeney
Dennis Sweeney added the comment: I bisected the change to here: https://github.com/python/cpython/pull/19881 commit 3dd2157febae5087cad24f69b6de9cbd13cd Author: Raymond Hettinger Date: Sun May 3 04:51:05 2020 -0700 Simplify set entry insertion logic. (GH-19881) """

[issue43280] additional argument for str.join()

2021-02-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: This seems like a very specific use case. Too specific IMO for a method on all string objects for anyone using Python anywhere in the world. Why not just write a function like this? def my_join(strings, sep=", ", last_sep=", and "
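The body of the suggested function is cut off in this listing. One possible shape for such a helper is sketched below; everything past the signature is an assumption and may differ from what was actually posted:

```python
def my_join(strings, sep=", ", last_sep=", and "):
    # Assumed behavior: join all items with `sep`, except that the final
    # item is attached with `last_sep`.
    strings = list(strings)
    if len(strings) <= 1:
        return "".join(strings)
    return sep.join(strings[:-1]) + last_sep + strings[-1]


print(my_join(["a", "b", "c"]))
```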

[issue43289] step bug in turtle's for loop

2021-02-21 Thread Dennis Sweeney
Dennis Sweeney added the comment: If I understand correctly, changing the -1 to a -2 does not actually make the program "crash" -- you just only see one black circle. The reason is that range(40, 0, -2) produces 40, 38, 36, etc., all of which are even numbers, so rad % 2 is always
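The parity argument can be checked directly. This sketch only looks at the `rad % 2` values and leaves the turtle drawing out; the list names are illustrative:

```python
# With a step of -2 starting from 40, every radius is even.
even_only = [rad % 2 for rad in range(40, 0, -2)]

# With a step of -1 the parity alternates between 0 and 1.
alternating = [rad % 2 for rad in range(40, 0, -1)]

print(set(even_only), set(alternating))
```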

[issue41972] bytes.find consistently hangs in a particular scenario

2021-02-27 Thread Dennis Sweeney
Dennis Sweeney added the comment: Any chance PR 22904 can make it into 3.10 before the May 03 feature freeze? The text document in that PR has some notes on how the algorithm works, so that may be a good place to start reviewing if anyone is interested

[issue41972] bytes.find consistently hangs in a particular scenario

2021-02-28 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +23457 pull_request: https://github.com/python/cpython/pull/24672 ___ Python tracker <https://bugs.python.org/issue41

[issue41361] Converting collections.deque methods to Argument Clinic

2021-03-06 Thread Dennis Sweeney
Dennis Sweeney added the comment: If the argument clinic is too disruptive, would it be okay to inline the equivalent code like this? diff --git a/Modules/_collectionsmodule.c b/Modules/_collectionsmodule.c index 90bafb0ea8..d75388abc8 100644 --- a/Modules/_collectionsmodule.c +++ b/Modules

[issue41361] Converting collections.deque methods to Argument Clinic

2021-03-08 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +23564 pull_request: https://github.com/python/cpython/pull/24796 ___ Python tracker <https://bugs.python.org/issue41

[issue40679] show class name in method invocation TypeError

2020-05-19 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch nosy: +Dennis Sweeney nosy_count: 2.0 -> 3.0 pull_requests: +19523 stage: -> patch review pull_request: https://github.com/python/cpython/pull/20236 ___ Python tracker <https://bugs.p

[issue40679] show class name in method invocation TypeError

2020-05-19 Thread Dennis Sweeney
Dennis Sweeney added the comment: The attached PR isn't exactly what you requested, but it's a very minimal code change that uses the existing __qualname__ functionality to change the message to TypeError: A.foo() takes 1 positional argument but 2 were given Does that add

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: While trying to write tests, I stumbled across something interesting: _PyObject_FunctionString as discussed here ( https://bugs.python.org/issue37645 ) returns a string that also includes the module name where applicable. For example, the module name is

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: I got this: >>> class A: ... def f(): ... pass ... >>> A.f(1) Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: A.f() takes 0 positional arguments but 1 was given >>> A.f(**{1:2}) Traceback (most rec

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: Never mind; I think you're right, and https://github.com/python/cpython/blob/master/Objects/call.c#L1009 is the line. -- ___ Python tracker <https://bugs.python.org/is

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: I just ran the entire test suite with: --- a/Python/ceval.c +++ b/Python/ceval.c @@ -4179,6 +4179,7 @@ _PyEval_EvalCode(PyThreadState *tstate, Py_ssize_t j; if (keyword == NULL || !PyUnicode_Check(keyword)) { +printf("THIS

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: Sure -- I'll file the issue. -- ___ Python tracker <https://bugs.python.org/issue40679> ___ ___ Python-bugs-list m

[issue40706] Unreachable code in _PyEval_EvalCode

2020-05-20 Thread Dennis Sweeney
New submission from Dennis Sweeney : When I was looking into https://bugs.python.org/issue40679, I couldn't come up with a test case for the following block, so I added a print statement: --- a/Python/ceval.c +++ b/Python/ceval.c @@ -4179,6 +4179,7 @@ _PyEval_EvalCode(PyThreadState *t

[issue40679] show class name in method invocation TypeError

2020-05-20 Thread Dennis Sweeney
Dennis Sweeney added the comment: https://bugs.python.org/issue40706 -- ___ Python tracker <https://bugs.python.org/issue40679> ___ ___ Python-bugs-list mailin

[issue40722] test_ttk_guionly times out on Ubuntu CI

2020-05-21 Thread Dennis Sweeney
New submission from Dennis Sweeney : One of the tests (test_ttk_guionly.test_variable_change) on the Ubuntu CI is intermittently hanging on this code: https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Lib/tkinter/test/test_ttk/test_extensions.py#L147

[issue38938] Possible performance improvement for heapq.merge()

2020-05-22 Thread Dennis Sweeney
Dennis Sweeney added the comment: key_and_reverse.py employs the same strategy as winners.py, but uses lists as the nodes of the tree rather than using Node instances. It also eliminates the recursion of treeify, and adds (with neither much of a performance hit nor much code duplication

[issue40696] exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")

2020-05-22 Thread Dennis Sweeney
Dennis Sweeney added the comment: Wouldn't Floyd's or Brent's cycle detection algorithms be better here than the allocation of a new set? I believe they might also eliminate the need to fast-path the first 100 or however many. As in: https://en.wikipedia.org/wiki/

[issue38938] Possible performance improvement for heapq.merge()

2020-05-28 Thread Dennis Sweeney
Dennis Sweeney added the comment: less_movement.py is my favorite so far. It still handles key and reverse, but using instance attributes instead of the list indices I tried before. It does this by only advancing the "key" and "leaf" attributes up toward the root (where

[issue40696] exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")

2020-05-30 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +19782 stage: needs patch -> patch review pull_request: https://github.com/python/cpython/pull/20539 ___ Python tracker <https://bugs.python.org/issu

[issue40696] exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")

2020-05-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: I believe PR 20539 solves the more general problem (using Floyd's Tortoise and Hare Algorithm), and I would appreciate review / some more ideas for test cases. -- ___ Python tracker <https://bugs.py
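For readers unfamiliar with the technique, here is a Python sketch of Floyd's tortoise-and-hare cycle detection on a singly linked chain. The `Node` class and its `context` attribute are hypothetical stand-ins for an exception's `__context__` link; this is not the C code from the PR:

```python
class Node:
    # Hypothetical stand-in for an exception with a __context__ link.
    def __init__(self, name):
        self.name = name
        self.context = None


def has_cycle(start):
    # The slow pointer moves one link per step and the fast pointer two.
    # If the chain loops, the fast pointer eventually lands on the slow one;
    # if it ends, the fast pointer reaches None first.
    slow = fast = start
    while fast is not None and fast.context is not None:
        slow = slow.context
        fast = fast.context.context
        if slow is fast:
            return True
    return False


a, b, c = Node("A"), Node("B"), Node("C")
a.context, b.context = b, c
print(has_cycle(a))   # chain ends at C
c.context = a
print(has_cycle(a))   # C links back to A
```

The appeal over collecting visited nodes in a set is that this needs only two pointers and no allocation.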

[issue40696] exception chain cycles cause hangs (was "Exception handling with "await" can hang in Python3.9.0b1")

2020-05-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: > Is it related to issue25782? Yes -- I didn't see that issue. I'm a little confused about the resolution of that issue though. For clarification, the existing behavior on master: When trying to raise the exception H, F -> G -

[issue25782] CPython hangs on error __context__ set to the error itself

2020-05-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: For clarification, the existing behavior on master: When trying to raise the exception H, F -> G -> H -> I -> NULL becomes H -> F -> G -> NULL But when trying to set the exception A on top of

[issue25782] CPython hangs on error __context__ set to the error itself

2020-05-30 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +19788 pull_request: https://github.com/python/cpython/pull/20539 ___ Python tracker <https://bugs.python.org/issue25

[issue38938] Possible performance improvement for heapq.merge()

2020-05-30 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +19794 pull_request: https://github.com/python/cpython/pull/20550 ___ Python tracker <https://bugs.python.org/issue38

[issue38938] Possible performance improvement for heapq.merge()

2020-05-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 20550 uses a linked structure like what we've been talking about. -- ___ Python tracker <https://bugs.python.org/is

[issue40879] Strange regex cycle

2020-06-05 Thread Dennis Sweeney
Dennis Sweeney added the comment: It looks like only the first 200 characters of the input string's repr are used as the compiled pattern's repr for some reason: https://github.com/python/cpython/blob/master/Modules/_sre.c#L1294 I don't know if there is a good reason, e

[issue40890] Dict views should be introspectable

2020-06-06 Thread Dennis Sweeney
Dennis Sweeney added the comment: I'd be happy to write a PR. Method names could be "mapping", "target", "target_mapping", "target_dict", "referent_dict", etc. I like the choice of "target_mapping": d = dict()

[issue40890] Dict views should be introspectable

2020-06-06 Thread Dennis Sweeney
Dennis Sweeney added the comment: I think this will also require types.MappingProxyType to change a bit, since it would make a proxy's underlying dict accessible: >>> d = dict() >>> proxy = MappingProxyType(d) >>> type(proxy.items()) is ty

[issue40890] Dict views should be introspectable

2020-06-07 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +19906 stage: -> patch review pull_request: https://github.com/python/cpython/pull/20691 ___ Python tracker <https://bugs.python.org/issu

[issue40890] Dict views should be introspectable

2020-06-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: Indeed, with PR 20691 applied, the following crashes: >>> vars(str).items().mapping.clear() >>> "uh oh" -- ___ Python tracker <htt

[issue40890] Dict views should be introspectable

2020-06-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: Would the best way to address this be adding new KeysProxy, ValuesProxy, and ItemsProxy types? -- ___ Python tracker <https://bugs.python.org/issue40

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-07 Thread Dennis Sweeney
Dennis Sweeney added the comment: What about returning another dict_items instead of a set? As in (using the convention `d.items().mapping is d`): dict_items = type({}.items()) def __xor__(self: dict_items, other): if isinstance(other, dict_items): new

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +19928 stage: -> patch review pull_request: https://github.com/python/cpython/pull/20718 ___ Python tracker <https://bugs.python.org/issu

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 20718 helps somewhat by only creating and hashing the tuples that wind up in the final set. Here's a benchmark: -m pyperf timeit -s "d1 = {i:i for i in range(100_000)}; d2 = {i:i|1 for i in range(100_000)}" "d1.items() ^ d2.items()"

[issue40889] Symmetric difference on dict_views is inefficient

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: A demo: >>> class Int(int): ... hash_calls = 0 ... def __hash__(self): ... Int.hash_calls += 1 ... return super().__hash__() ... >>> left = {Int(1): -1, Int(2): -2, Int(3): -3, Int(4): -4, Int(5): -5, Int(6):

[issue40890] Dict views should be introspectable

2020-06-08 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +19954 pull_request: https://github.com/python/cpython/pull/20749 ___ Python tracker <https://bugs.python.org/issue40

[issue40890] Dict views should be introspectable

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 20749 gives each dict view access to a mappingproxy for the original dict, although I don't know if that defeats the original purpose. It might be hard to sensibly make MappingProxy(d).items() return something other than d.items(), since this is al

[issue40890] Dict views should be introspectable

2020-06-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: Here's a workaround that's possible with PR 20749 applied: >>> d = {"a":1, "b":2} # fill up the dict... >>> DICT = object() >>> d[DICT] = d >>> items = d.items() >>

[issue40890] Dict views should be introspectable

2020-06-12 Thread Dennis Sweeney
Dennis Sweeney added the comment: Would it be better to have a dictview.mapping() method rather than an attribute, since it constructs a new object of a different type and since that's what keys(), values(), and items() are? -- ___ Python tr

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Change by Dennis Sweeney : -- nosy: +Dennis Sweeney nosy_count: 3.0 -> 4.0 pull_requests: +20038 pull_request: https://github.com/python/cpython/pull/20845 ___ Python tracker <https://bugs.python.org/issu

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Dennis Sweeney added the comment: I just added PR 20845, but I'm concerned about performance. I'm attaching the results of a run of pyperformance before and after PR 20845. -- Added file: https://bugs.python.org/file49228/pushpo

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Added file: https://bugs.python.org/file49229/master_perf.txt ___ Python tracker <https://bugs.python.org/issue40925> ___ ___ Python-bug

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Added file: https://bugs.python.org/file49230/perf_diff.txt ___ Python tracker <https://bugs.python.org/issue40925> ___ ___ Python-bugs-list m

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Removed file: https://bugs.python.org/file49229/master_perf.txt ___ Python tracker <https://bugs.python.org/issue40925> ___ ___ Python-bug

[issue40925] Remove redundant macros used for stack manipulation in interpreter

2020-06-12 Thread Dennis Sweeney
Change by Dennis Sweeney : Removed file: https://bugs.python.org/file49228/pushpop_perf.txt ___ Python tracker <https://bugs.python.org/issue40925> ___ ___ Python-bug

[issue40890] Dict views should be introspectable

2020-06-14 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +20061 stage: resolved -> patch review pull_request: https://github.com/python/cpython/pull/20873 ___ Python tracker <https://bugs.python.org/issu

[issue41166] CLASS ATTRIBUTES

2020-06-30 Thread Dennis Sweeney
Dennis Sweeney added the comment: I think the word "attribute" is preferred over "value" because "value" can mean just about anything, whereas, according to https://docs.python.org/3/glossary.html?highlight=glossary , an attribute is specifically: ""

[issue41299] Python3 threading.Event().wait time is twice as large as Python27

2020-07-15 Thread Dennis Sweeney
Dennis Sweeney added the comment: I reproduced something similar on Python 3.9.0b1, Windows 64-bit version: py -m pyperf timeit -s "import threading; E = threading.Event()" "E.wait()" NUMBER Mean +- std dev --- 0.0

[issue41356] Convert bool.__new__ to argument clinic

2020-07-21 Thread Dennis Sweeney
New submission from Dennis Sweeney : Benchmarked on my machine (Windows 10): .\python.bat -m pyperf timeit -s "from collections import deque; x = [[], [1]] * 1_000_000" "deque(map(bool, x), maxlen=0)" --- Win32 build configuration --- Master: 105

[issue41356] Convert bool.__new__ to argument clinic

2020-07-21 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +20723 stage: -> patch review pull_request: https://github.com/python/cpython/pull/21581 ___ Python tracker <https://bugs.python.org/issu

[issue41356] Convert bool.__new__ to argument clinic

2020-07-22 Thread Dennis Sweeney
Dennis Sweeney added the comment: More microbenchmarks: pyperf timeit "bool()" Before: 63.1 ns +- 0.7 ns After: 51.7 ns +- 1.2 ns pyperf timeit "bool(0)" Before: 77.4 ns +- 1.9 ns After: 67.2 ns +- 1.3 ns pyperf timeit "bool(17)" Befor

[issue41545] gc API requiring matching number of gc.disable - gc.enable calls

2020-08-15 Thread Dennis Sweeney
Dennis Sweeney added the comment: This is exactly the motivation for context managers, no? I attached no_gc.py, which works when nested and should additionally be thread-safe. Usage: from no_gc import no_gc with no_gc(): # collection disabled with no_gc(): # collection is

[issue41545] gc API requiring matching number of gc.disable - gc.enable calls

2020-08-17 Thread Dennis Sweeney
Dennis Sweeney added the comment: The save-a-boolean-for-each-context-manager approach has an issue if used with concurrent generators, where the lifetimes of two generator objects might be overlapping but not completely nested, as shown below. The same issue should arise when using
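One way around the overlapping-lifetime problem is to count active users instead of saving a boolean per context manager. The sketch below is an illustration under that assumption; it is not the attached `no_gc.py`, and the module-level names `_lock`, `_depth`, and `_was_enabled` are made up here:

```python
import gc
import threading
from contextlib import contextmanager

_lock = threading.Lock()
_depth = 0             # number of no_gc() blocks currently active
_was_enabled = False   # collector state seen by the outermost block


@contextmanager
def no_gc():
    global _depth, _was_enabled
    with _lock:
        if _depth == 0:
            _was_enabled = gc.isenabled()
            gc.disable()
        _depth += 1
    try:
        yield
    finally:
        with _lock:
            _depth -= 1
            # Only the last block to finish restores the collector.
            if _depth == 0 and _was_enabled:
                gc.enable()


gc.enable()
with no_gc():
    with no_gc():
        pass
    print(gc.isenabled())   # the inner exit must not re-enable collection
print(gc.isenabled())
```

Because the counter does not care which block exits first, two blocks that overlap without nesting still leave the collector disabled until both have finished.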

[issue41545] gc API requiring matching number of gc.disable - gc.enable calls

2020-08-17 Thread Dennis Sweeney
Dennis Sweeney added the comment: FWIW I forgot the gc.disable() line in the contextmanager, but what I said still applies. -- ___ Python tracker <https://bugs.python.org/issue41

[issue41621] defaultdict miss behave when using default_factory passed as kwargs

2020-08-24 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch nosy: +Dennis Sweeney nosy_count: 3.0 -> 4.0 pull_requests: +21055 stage: -> patch review pull_request: https://github.com/python/cpython/pull/21945 ___ Python tracker <https://bugs.p

[issue41621] defaultdict miss behave when using default_factory passed as kwargs

2020-08-24 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 21945 changes the signature: - defaultdict(default_factory[, ...]) + defaultdict(default_factory=None, /, [...]) -- ___ Python tracker <https://bugs.python.org/issue41
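The positional-only marker documents behavior that can be observed directly: a `default_factory` keyword is treated as an ordinary dict item rather than as the factory. A small sketch with illustrative variable names:

```python
from collections import defaultdict

by_keyword = defaultdict(default_factory=list)   # becomes a key, not the factory
by_position = defaultdict(list)                  # sets the factory

print(by_keyword.default_factory, dict(by_keyword))
print(by_position["missing"])
```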

[issue41678] File-level, optionally external sorting

2020-08-31 Thread Dennis Sweeney
Dennis Sweeney added the comment: If we were to do this, I think a better API might be to accept an arbitrary iterable, then produce a sorted iterable: def sorted_on_disk(iterable, key=None, reverse=False) -> Iterable: ... It would sort chunks of the input and store them in files
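A minimal sketch of how such an API could work, assuming picklable items. The `chunk_size` parameter and the two helper functions are additions for illustration and are not part of the proposal, nor is this the attached proof of concept:

```python
import heapq
import pickle
import tempfile
from itertools import islice


def _dump_chunk(chunk):
    # Write one sorted chunk to an anonymous temporary file.
    f = tempfile.TemporaryFile()
    for item in chunk:
        pickle.dump(item, f)
    f.seek(0)
    return f


def _load_chunk(f):
    # Lazily read items back until the file is exhausted.
    try:
        while True:
            yield pickle.load(f)
    except EOFError:
        pass
    finally:
        f.close()


def sorted_on_disk(iterable, key=None, reverse=False, chunk_size=100_000):
    it = iter(iterable)
    files = []
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            break
        chunk.sort(key=key, reverse=reverse)
        files.append(_dump_chunk(chunk))
    # heapq.merge streams the sorted runs, holding one item per run in memory.
    return heapq.merge(*map(_load_chunk, files), key=key, reverse=reverse)


print(list(sorted_on_disk([5, 3, 1, 4, 2], chunk_size=2)))
```

Only one chunk is held in memory while sorting, and the merge step reads the temporary files incrementally.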

[issue41678] File-level, optionally external sorting

2020-08-31 Thread Dennis Sweeney
Dennis Sweeney added the comment: Attached is a proof of concept. -- Added file: https://bugs.python.org/file49436/disksort.py ___ Python tracker <https://bugs.python.org/issue41

[issue41724] SQLite returns "str" instead of "datetime.datetime" with aggregate queries.

2020-09-05 Thread Dennis Sweeney
Dennis Sweeney added the comment: Here's a reproducer. -- nosy: +Dennis Sweeney Added file: https://bugs.python.org/file49447/reproducer.py ___ Python tracker <https://bugs.python.org/is

[issue41850] inspect.py: access block stack

2020-09-24 Thread Dennis Sweeney
Dennis Sweeney added the comment: I believe the block stack is only for catching exceptions. Loops do not interact with the block stack. Only SETUP_FINALLY adds to the block stack: https://docs.python.org/3/library/dis.html#opcode-SETUP_FINALLY Meanwhile, loops and conditionals and the like

[issue41873] Add vectorcall for float()

2020-09-28 Thread Dennis Sweeney
New submission from Dennis Sweeney : I got these benchmarks: .\python.bat -m pyperf timeit "float(0)" Before: Mean +- std dev: 79.0 ns +- 1.0 ns After: Mean +- std dev: 51.5 ns +- 1.6 ns -- components: Interpreter Core messages: 377590 nosy: Dennis Sweeney priority: norma

[issue41873] Add vectorcall for float()

2020-09-28 Thread Dennis Sweeney
Change by Dennis Sweeney : -- keywords: +patch pull_requests: +21463 stage: -> patch review pull_request: https://github.com/python/cpython/pull/22432 ___ Python tracker <https://bugs.python.org/issu

[issue41356] Convert bool.__new__ to argument clinic

2020-09-29 Thread Dennis Sweeney
Dennis Sweeney added the comment: Superseded by https://bugs.python.org/issue41870 -- nosy: -larry resolution: -> works for me stage: patch review -> resolved status: open -> closed ___ Python tracker <https://bugs.python.or

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: Adding some hasty printf-debugging to fastsearch.h, I see this: >>> with open('data.bin', 'rb') as f: ... s = f.read() ... >>> base = 15403807 * b'\xff' >>> longer = base + b

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-08 Thread Dennis Sweeney
Dennis Sweeney added the comment: Indeed, this is just a very unlucky case. >>> n = len(longer) >>> from collections import Counter >>> Counter(s[:n]) Counter({0: 9056995, 255: 6346813}) >>> s[n-30:n+30].replace(b'

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-10 Thread Dennis Sweeney
Dennis Sweeney added the comment: I agree that skip could do 1 better. --- > I don't know whether it "should be" applied I don't think I'm convinced: the second check fixes only the very specific case when s[len(p):].startswith(p). Perturbations of repro

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-10 Thread Dennis Sweeney
Dennis Sweeney added the comment: > Offhand do you know what the _best_ timing for two-way search is in a > pattern-not-found case? Looking at the glibc implementation, in the top-level "else" clause (for when the string isn't completely periodic), we have:

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-12 Thread Dennis Sweeney
Dennis Sweeney added the comment: Here is a C implementation of the two-way algorithm that should work as a drop-in replacement for Objects/stringlib/fastsearch.h. Benchmarking so far, it looks like it is a bit slower in a lot of cases. But it's also a bit faster in some other case

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-12 Thread Dennis Sweeney
Change by Dennis Sweeney : -- pull_requests: +21650 stage: -> patch review pull_request: https://github.com/python/cpython/pull/22679 ___ Python tracker <https://bugs.python.org/issu

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-12 Thread Dennis Sweeney
Dennis Sweeney added the comment: PR 22679 is a draft that does the two-way algorithm but also adds both of the tricks from Fredrik's implementation: a bit-set "bloom filter" and remembering the skip-distance between some pair of characters. -- Added file: https://

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-12 Thread Dennis Sweeney
Dennis Sweeney added the comment: I used random_bench.py to compare PR 22679 to Master, and the results are in bench_results.txt. Results were varied. I suppose this depends on what cases we want to optimize for. -- Added file: https://bugs.python.org/file49512/random_bench.py

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Dennis Sweeney
Dennis Sweeney added the comment: bench_table.txt gives my results (`ref` is Master, `change` is with PR 22679). The change gives 342 faster cases and 275 slower cases, and 9 cases with no change. I chose a random word of length 10**6 with a zipf character distribution for the haystack

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Dennis Sweeney
Dennis Sweeney added the comment: Another algorithmic possibility: Instead of the bitset, we could have a stack-allocated uint8_t jump[32]; // maybe 64? Maybe uint16_t? It would say this: If the last character lined up in the haystack is congruent to i mod (1 << 8), then jump ah

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Dennis Sweeney
Dennis Sweeney added the comment: @Tim I got this again for that benchmark: length=3442, value=ASXABCDHAB...: Mean +- std dev: 2.39 ms +- 0.01 ms Unfortunately not a ghost. -- ___ Python tracker <https://bugs.python.org/issue41

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-13 Thread Dennis Sweeney
Dennis Sweeney added the comment: That test needle happened to end with a G and not have another G until much earlier. The status quo took advantage of that, but the PR only takes advantage of the skip value for a certain middle character. Perhaps it could do both

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-14 Thread Dennis Sweeney
Dennis Sweeney added the comment: The most recent batch of commits added a jump table. Between master and PR 22679 now, there are 151 cases slower than master and 463 that are faster than master. The slower cases are at most twice as slow, but the faster cases are often 10-20x faster. I could

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-16 Thread Dennis Sweeney
Dennis Sweeney added the comment: I'm doing a couple more timing tests to try to understand exactly when the cutoff should be applied (based on some combination of needle and haystack lengths). Can the rolling hash algorithm be made to go sublinear like O(n/m)? It looked like it was p

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-17 Thread Dennis Sweeney
Dennis Sweeney added the comment: I added the cutoff for strings >= 10 characters, and I converted the PR from a draft to "Ready to Review." When running stringbench.py before and after the PR, I get these results: Summary: Unicode Before: 81.82 Bytes Before: 92.62

[issue41972] bytes.find consistently hangs in a particular scenario

2020-10-17 Thread Dennis Sweeney
Dennis Sweeney added the comment: > But there _also_ seem to be real (but much smaller) benefits > for the "search backward" cases, which I don't recall seeing > when I tried it. Do you have a guess as to why? I did change `skip = mlast - 1;` to `skip = mlast;` as yo
