[issue46205] Race condition in runtest_mp leads to hangs (never exits)
New submission from Sam Gross:

The runtest_mp.py module has a race condition between checking worker.is_alive() and processing the queue that can lead to indefinite hangs. The hang happens when all the results from the self.output queue have been processed but at least one of the workers hasn't finished exiting.

https://github.com/python/cpython/blob/8d7644fa64213207b8dc6f555cb8a02bfabeced2/Lib/test/libregrtest/runtest_mp.py#L394-L418

The main thread tries to get a result from the output queue, but the queue is empty and remains empty. Although the queue.get() operation eventually times out (after 30 seconds), the main thread does not re-check whether all the workers have exited (!), but instead retries the queue.get() in the "while True" loop.

https://github.com/python/cpython/blob/8d7644fa64213207b8dc6f555cb8a02bfabeced2/Lib/test/libregrtest/runtest_mp.py#L415-L418

To reproduce, apply the patch below, which introduces a small delay to more reliably trigger the hang.

    curl "https://gist.githubusercontent.com/colesbury/fe3769f43dfb724c86ecbb182b1f6749/raw/e29a4eaeebb8d5252cdd66f3f8a70f7bc5fa14e7/runtest_mp.diff" | patch -p1
    ./python -m test test_regrtest -m test_module_from_test_autotest -v

-- messages: 409374 nosy: colesbury priority: normal severity: normal status: open title: Race condition in runtest_mp leads to hangs (never exits) versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue46205> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
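The report above suggests that the loop should re-check worker liveness whenever queue.get() times out. The following is only a minimal sketch of that idea, not the code from runtest_mp.py or from any patch: `drain_results`, `slow_exit_worker`, and `demo` are hypothetical names, plain threads stand in for the real workers, and the timeouts are shortened for illustration.

```python
import queue
import threading
import time


def drain_results(workers, output, poll_timeout=0.05):
    """Collect results until every worker has exited and the queue is empty."""
    results = []
    while True:
        try:
            results.append(output.get(timeout=poll_timeout))
            continue
        except queue.Empty:
            pass
        # The get() timed out: re-check liveness instead of retrying blindly.
        if not any(w.is_alive() for w in workers):
            # All workers are gone; pick up anything enqueued just before exit.
            while True:
                try:
                    results.append(output.get_nowait())
                except queue.Empty:
                    return results


def slow_exit_worker(output, value):
    """Hypothetical worker: publishes its result, then is slow to exit."""
    output.put(value)
    time.sleep(0.1)


def demo():
    output = queue.Queue()
    workers = [
        threading.Thread(target=slow_exit_worker, args=(output, i))
        for i in range(3)
    ]
    for w in workers:
        w.start()
    return sorted(drain_results(workers, output))
```

Calling `demo()` returns once the three stand-in workers have exited, even though each of them lingers after publishing its result.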
[issue46205] Race condition in runtest_mp leads to hangs (never exits)
Change by Sam Gross : -- type: -> behavior ___ Python tracker <https://bugs.python.org/issue46205> ___
[issue46205] Race condition in runtest_mp leads to hangs (never exits)
Change by Sam Gross : -- nosy: +vstinner ___ Python tracker <https://bugs.python.org/issue46205> ___
[issue46205] test.libregrtest: Race condition in runtest_mp leads to hangs (never exits)
Change by Sam Gross : -- keywords: +patch pull_requests: +28673 stage: -> patch review pull_request: https://github.com/python/cpython/pull/30470 ___ Python tracker <https://bugs.python.org/issue46205> ___
[issue38395] proxy_contains (weakref.proxy) can access an object with 0 refcount
New submission from Sam Gross:

The implementation of weakref.proxy's methods call back into the Python API using a "borrowed reference" of the weakly referenced object (acquired via PyWeakref_GET_OBJECT). This API call may delete the last reference to the object (either directly or via GC), leaving a dangling pointer, which can be subsequently dereferenced.

Tested with Python 3.8.0b4 (debug build). The following code crashes with a debug build of Python 3.8.0b4 on Linux.

    import weakref

    obj = None

    class MyObj:
        def __iter__(self):
            global obj
            del obj
            return NotImplemented

    obj = MyObj()
    p = weakref.proxy(obj)
    print(5 in p)

This particular test case does not crash with a release build (on Linux).

The implementation of `in` on a proxy object calls proxy_contains:

    return PySequence_Contains(PyWeakref_GET_OBJECT(proxy), value);

https://github.com/python/cpython/blob/v3.8.0b4/Objects/weakrefobject.c#L556

This eventually calls _PySequence_IterSearch. The call to PyObject_GetIter can call arbitrary code, which can lead to seq (the proxy's referent) being deleted. The subsequent call to type_error dereferences a dead object.

    it = PyObject_GetIter(seq);
    if (it == NULL) {
        type_error("argument of type '%.200s' is not iterable", seq);
        return -1;
    }

https://github.com/python/cpython/blob/v3.8.0b4/Objects/abstract.c#L2003-L2007

I believe some functions, like proxy_length, may be immune to this problem because they do not access the borrowed referent after calling into user code. However, this is hard to verify from reading the code and may be fragile -- small changes to PyObject_Length/Size, for example, might .

See also https://bugs.python.org/issue16602

-- components: Interpreter Core messages: 354102 nosy: colesbury priority: normal severity: normal status: open title: proxy_contains (weakref.proxy) can access an object with 0 refcount type: crash versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue38395> ___
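The hazard described above comes from using a borrowed reference across a call into user code. A Python-level analogue of the usual remedy (hold a strong reference for the duration of the call) can be sketched as follows. This is an illustration only, not the C fix: `Victim` and `contains_with_strong_ref` are hypothetical names, and it uses weakref.ref rather than weakref.proxy so that the strong reference is explicit.

```python
import weakref


class Victim:
    def __iter__(self):
        global obj
        del obj  # drop the module-level strong reference during the call
        return iter(())


def contains_with_strong_ref(ref, value):
    """Evaluate `value in referent` while holding a strong reference."""
    target = ref()  # promote the weak reference to a strong one first
    if target is None:
        raise ReferenceError("weakly-referenced object no longer exists")
    found = value in target  # runs Victim.__iter__, which deletes `obj`
    still_alive = ref() is not None  # `target` keeps the referent alive
    return found, still_alive


obj = Victim()
result = contains_with_strong_ref(weakref.ref(obj), 5)
```

Here the referent survives the `del obj` inside `__iter__` because the local name `target` still refers to it until the function returns.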
[issue44657] instancemethod_call should use PyInstanceMethod_GET_FUNCTION macro
New submission from Sam Gross : The instancemethod_call function should use the PyInstanceMethod_GET_FUNCTION macro instead of the PyMethod_GET_FUNCTION macro. The current code is incorrect, but still works okay (doesn't crash) because PyInstanceMethodObject.func is at the same offset as PyMethodObject.im_func. https://github.com/python/cpython/blob/c90c591e5158ab7b531dcd6e2a5f00bc70ba7637/Objects/classobject.c#L465 -- components: Interpreter Core messages: 397660 nosy: colesbury priority: normal severity: normal status: open title: instancemethod_call should use PyInstanceMethod_GET_FUNCTION macro type: enhancement versions: Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue44657> ___
[issue44657] instancemethod_call should use PyInstanceMethod_GET_FUNCTION macro
Change by Sam Gross : -- type: enhancement -> behavior ___ Python tracker <https://bugs.python.org/issue44657> ___
[issue28737] Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set
Change by Sam Gross : -- assignee: docs@python -> colesbury ___ Python tracker <https://bugs.python.org/issue28737> ___
[issue28737] Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set
Sam Gross added the comment: Antoine Pitrou already fixed the "noddy4" example (now renamed to "custom4") and updated the newtypes_tutorial, but I think it's still worth mentioning PyObject_GC_UnTrack in a few additional places. -- ___ Python tracker <https://bugs.python.org/issue28737> ___
[issue28737] Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set
Change by Sam Gross : -- keywords: +patch pull_requests: +27510 stage: -> patch review pull_request: https://github.com/python/cpython/pull/29246 ___ Python tracker <https://bugs.python.org/issue28737> ___
[issue28737] Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set
Change by Sam Gross : -- pull_requests: +27513 pull_request: https://github.com/python/cpython/pull/29249 ___ Python tracker <https://bugs.python.org/issue28737> ___
[issue45809] Race condition in WeakKeyDictionary/WeakKeyDictionary
New submission from Sam Gross:

The issue described in issue7105 (and maybe issue7060) still exists due to a race condition in WeakKeyDictionary. This shows up as a test failure that looks like:

    test test_weakref failed -- Traceback (most recent call last):
      File "Lib/test/test_weakref.py", line 1960, in test_threaded_weak_value_dict_deepcopy
        self.check_threaded_weak_dict_copy(weakref.WeakValueDictionary, True)
      File "Lib/test/test_weakref.py", line 1940, in check_threaded_weak_dict_copy
        raise exc[0]
      File "Lib/test/test_weakref.py", line 1897, in dict_copy
        _ = copy.deepcopy(d)
      File "Lib/copy.py", line 153, in deepcopy
        y = copier(memo)
      File "Lib/weakref.py", line 189, in __deepcopy__
        for key, wr in self.data.items():
    RuntimeError: dictionary changed size during iteration

The cause is that the check of "self._iterating" and the call to "_atomic_removal" are not performed atomically together. By the time _atomic_removal() is called, an iteration might have already started.

https://github.com/python/cpython/blob/ec382fac0db6d9159c2d3496a70b7a605545957e/Lib/weakref.py#L109-L114

-- components: Library (Lib) messages: 406357 nosy: colesbury priority: normal severity: normal status: open title: Race condition in WeakKeyDictionary/WeakKeyDictionary type: behavior versions: Python 3.10, Python 3.11, Python 3.9 ___ Python tracker <https://bugs.python.org/issue45809> ___
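The check-then-act interleaving described in this report can be replayed deterministically in a single thread. The sketch below is an illustration under stated assumptions, not the weakref.py code: `replay_race` is a hypothetical helper, a plain dict stands in for self.data, and a plain set stands in for self._iterating.

```python
def replay_race():
    """Replay the interleaving: check, then iteration starts, then removal."""
    data = {"a": 1, "b": 2}  # stand-in for self.data
    iterating = set()        # stand-in for self._iterating

    # Step 1 (removal callback): nobody is iterating yet, so removal looks safe.
    safe_to_remove = not iterating

    # Step 2 (other thread): an iteration begins *after* that check.
    iterating.add("copy-in-progress")
    it = iter(data.items())
    next(it)

    # Step 3 (removal callback): acts on the stale result of step 1.
    if safe_to_remove:
        del data["a"]

    # Step 4 (other thread): the iteration notices the size change.
    try:
        next(it)
    except RuntimeError as exc:
        return str(exc)
    return None
```

The returned message matches the one in the traceback above, which shows why the check and the removal need to happen as one atomic step.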
[issue45809] Race condition in WeakKeyDictionary/WeakKeyDictionary
Sam Gross added the comment:

The attached patch (issue45809-repro.patch) introduces artificial delays to make reproduction of the underlying issue easier. To reproduce the issue:

    patch -p1 < issue45809-repro.patch
    ./python -m test test_weakref -m test_threaded_weak_value_dict_deepcopy -v

-- keywords: +patch Added file: https://bugs.python.org/file50440/issue45809-repro.patch ___ Python tracker <https://bugs.python.org/issue45809> ___
[issue42540] Debug pymalloc crash when using os.fork() [regression]
Change by Sam Gross : -- keywords: +patch nosy: +colesbury nosy_count: 2.0 -> 3.0 pull_requests: +27812 stage: -> patch review pull_request: https://github.com/python/cpython/pull/29564 ___ Python tracker <https://bugs.python.org/issue42540> ___
[issue42969] pthread_exit & PyThread_exit_thread from PyEval_RestoreThread etc. are harmful
Sam Gross added the comment:

The `pthread_exit` behavior has been a problem for PyTorch and related libraries since Python 3.9. The PyTorch team has tried working around the problems without success (i.e. they keep getting bug reports involving crashes in PyEval_SaveThread/RestoreThread).

Hanging (pausing) the thread, as suggested by jbms and gps, seems like the only reliable option. This is also what the Java VM does when returning from native code and the JVM has exited.

I believe it's not difficult to hang a thread in a cross-platform way: create a mutex, acquire it in the main thread (before setting PyRuntime._finalizing), never release it. Other threads can acquire that same mutex to block until the application exits.

The crashes can occur even without daemon threads if the user presses ctrl-c while _thread_shutdown is running.

-- nosy: +colesbury ___ Python tracker <https://bugs.python.org/issue42969> ___
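The mutex-based parking scheme described above concerns the C runtime, but its shape can be sketched at the Python level. This is only an analogue under stated assumptions: `_park_lock`, `begin_finalization`, `park_current_thread`, and `demo` are hypothetical names, and a threading.Lock stands in for the native mutex.

```python
import threading

# Hypothetical "parking" lock: acquired once at shutdown and never released.
_park_lock = threading.Lock()


def begin_finalization():
    """Main thread: take the lock before finalization starts; never release it."""
    _park_lock.acquire()


def park_current_thread():
    """Other threads: block here until the process exits."""
    _park_lock.acquire()


def demo():
    begin_finalization()
    t = threading.Thread(target=park_current_thread, daemon=True)
    t.start()
    t.join(timeout=0.2)  # the parked thread does not finish
    return t.is_alive()
```

In `demo()` the daemon thread stays blocked on the lock rather than exiting or unwinding, which is the behavior the comment proposes for threads that try to re-enter the interpreter during finalization.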
[issue42540] Debug pymalloc crash when using os.fork() [regression]
Change by Sam Gross : -- pull_requests: +27843 pull_request: https://github.com/python/cpython/pull/29600 ___ Python tracker <https://bugs.python.org/issue42540> ___
[issue45835] Race condition in test_queue can lead to test failures
New submission from Sam Gross:

The test_queue suite has a race condition that can lead to test failures in test_many_threads, test_many_threads_nonblock, and test_many_threads_timeout.

Consumers are signaled to exit by a sentinel value (None). The sentinel values are at the end of the input list, but that doesn't mean they are necessarily enqueued at the end of the inter-thread queue when there are multiple "feeder" threads. In particular, a feeder thread may be delayed in enqueueing a non-sentinel value. The other feeder threads may finish popping and enqueueing the remaining values, including all the sentinels, so the delayed non-sentinel value arrives AFTER all the sentinels. The "consumer" threads then exit before processing all the values, which triggers the assertion error in run_threads() in test_queue.py:

    self.assertTrue(q.empty())

I will attach a patch that adds a delay in feed() to make the race condition occur more frequently so that the issue is easier to reproduce.

-- components: Tests messages: 406498 nosy: colesbury priority: normal severity: normal status: open title: Race condition in test_queue can lead to test failures type: behavior versions: Python 3.10, Python 3.11, Python 3.9 ___ Python tracker <https://bugs.python.org/issue45835> ___
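One way to avoid the ordering problem described above is to enqueue the sentinels only after every feeder thread has finished. The sketch below shows that approach under stated assumptions; it is not the test_queue.py code or the eventual patch, and `run`, `feed`, and `consume` are hypothetical names.

```python
import queue
import threading


def run(items, n_feeders=3, n_consumers=3):
    """Feed `items` through a queue; sentinels are sent only after feeders finish."""
    q = queue.Queue()
    pending = list(items)
    lock = threading.Lock()
    seen = []

    def feed():
        while True:
            with lock:
                if not pending:
                    return
                item = pending.pop()
            q.put(item)

    def consume():
        while True:
            item = q.get()
            if item is None:  # sentinel: no more work will arrive
                return
            with lock:
                seen.append(item)

    feeders = [threading.Thread(target=feed) for _ in range(n_feeders)]
    consumers = [threading.Thread(target=consume) for _ in range(n_consumers)]
    for t in feeders + consumers:
        t.start()
    for t in feeders:
        t.join()  # every real item is in the queue past this point
    for _ in consumers:
        q.put(None)  # one sentinel per consumer, guaranteed to come last
    for t in consumers:
        t.join()
    return sorted(seen), q.empty()
```

Because the sentinels are enqueued after all feeders have joined, no non-sentinel value can arrive behind them, so the queue is empty when the consumers exit.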
[issue45835] Race condition in test_queue can lead to test failures
Change by Sam Gross : -- keywords: +patch Added file: https://bugs.python.org/file50447/issue45835_repro.patch ___ Python tracker <https://bugs.python.org/issue45835> ___
[issue45835] Race condition in test_queue can lead to test failures
Change by Sam Gross : -- pull_requests: +27844 stage: -> patch review pull_request: https://github.com/python/cpython/pull/29601 ___ Python tracker <https://bugs.python.org/issue45835> ___
[issue45838] Incorrect line numbers in GDB Python backtraces [3.9]
New submission from Sam Gross:

Starting in Python 3.6, the line number table contains a *signed* byte indicating the line delta. The calculation in Tools/gdb/libpython.py was not updated to handle signed bytes, leading to incorrect line numbers when running "py-bt" (or printing frames) in GDB.

This issue does not exist in Python 3.10 or later because the line number table was changed (and libpython.py was updated) in GH-23113.

-- components: Demos and Tools messages: 406560 nosy: colesbury priority: normal severity: normal status: open title: Incorrect line numbers in GDB Python backtraces [3.9] type: behavior versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue45838> ___
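The signed-byte handling described above can be sketched as a standalone function. This is an illustration of decoding a 3.6 to 3.9 style co_lnotab (pairs of an unsigned address increment and a signed line increment), not the code from Tools/gdb/libpython.py; the function signature and the sample table are made up for the example.

```python
def addr2line(first_lineno, lnotab, addrq):
    """Map a bytecode offset to a line number using a 3.6-3.9 style co_lnotab."""
    lineno = first_lineno
    addr = 0
    for addr_incr, line_incr in zip(lnotab[::2], lnotab[1::2]):
        addr += addr_incr
        if addr > addrq:
            return lineno
        # The line increment is a *signed* byte: 0x80..0xFF mean -128..-1.
        if line_incr >= 0x80:
            line_incr -= 0x100
        lineno += line_incr
    return lineno


# Made-up table: +3 lines at offset 2, then -2 lines at offset 6.
SAMPLE_LNOTAB = bytes([2, 3, 4, 0xFE])
lines = [addr2line(10, SAMPLE_LNOTAB, offset) for offset in (0, 2, 6)]
```

Without the sign adjustment, the last entry would be read as +254 lines instead of -2, which is the kind of wrong line number the report describes.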
[issue45838] Incorrect line numbers in GDB Python backtraces [3.9]
Change by Sam Gross : -- keywords: +patch pull_requests: +27860 stage: -> patch review pull_request: https://github.com/python/cpython/pull/29628 ___ Python tracker <https://bugs.python.org/issue45838> ___
[issue39571] clang warns "warning: redefinition of typedef 'PyTypeObject' is a C11 feature [-Wtypedef-redefinition]"
New submission from Sam Gross:

A recent commit added a typedef for PyTypeObject in Include/object.h:

https://github.com/python/cpython/commit/0e4e735d06967145b49fd00693627f3624991dbc

This duplicates the typedef in Include/cpython/object.h. Building with clang now issues a warning:

    ./Include/cpython/object.h:274:3: warning: redefinition of typedef 'PyTypeObject' is a C11 feature [-Wtypedef-redefinition]

This is due to the combination of `-Wall` and `-std=c99`. GCC will only warn if the `-pedantic` option is specified.

-- components: C API messages: 361497 nosy: colesbury, vstinner priority: normal severity: normal status: open title: clang warns "warning: redefinition of typedef 'PyTypeObject' is a C11 feature [-Wtypedef-redefinition]" versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39571> ___
[issue39571] clang warns "warning: redefinition of typedef 'PyTypeObject' is a C11 feature [-Wtypedef-redefinition]"
Sam Gross added the comment:

Alternatively, I think you can just remove the typedef from Include/cpython/object.h since Include/object.h is always included first, i.e. change:

    typedef struct _typeobject { ... } PyTypeObject;

to simply:

    struct _typeobject { ... };

-- ___ Python tracker <https://bugs.python.org/issue39571> ___
[issue39571] clang warns "warning: redefinition of typedef 'PyTypeObject' is a C11 feature [-Wtypedef-redefinition]"
Sam Gross added the comment: Yes, that fixes the warnings for me. Thanks! -- ___ Python tracker <https://bugs.python.org/issue39571> ___
[issue40120] Undefined C behavior going beyond end of struct via a char[1].
Sam Gross added the comment: It may be worth considering C-API extensions written in C++. Flexible array members are not part of the C++ standard, although GCC, Clang, and MSVC support them as an extension. GCC and Clang will issue warnings with `-Wpedantic` and MSVC will issue warnings with `/Wall`. Currently, C++ code that includes `` is warning-free in GCC (g++) even with `-Wpedantic`. That won't be true after this change, unless Py_LIMITED_API is defined. Note that GCC also explicitly supports trailing one-element arrays (the current pattern) as an extension. https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html -- nosy: +colesbury ___ Python tracker <https://bugs.python.org/issue40120> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue28737] Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set
New submission from Sam Gross:

In general, a PyTypeObject that has Py_TPFLAGS_HAVE_GC set must call PyObject_GC_UnTrack() before it frees any PyObject* references it owns. The only reference to this requirement I found is in https://docs.python.org/3/c-api/gcsupport.html#c._PyObject_GC_TRACK.

This requirement should be documented in:

1. https://docs.python.org/3/c-api/typeobj.html#c.PyTypeObject.tp_dealloc
2. https://docs.python.org/3/extending/newtypes.html

A call to PyObject_GC_UnTrack() should also be added to the official "noddy4" example. Currently, the example is incorrect and can crash if a referred-to object triggers a GC from its destructor. See the following example, which segfaults: https://github.com/colesbury/noddy

It may be worthwhile to have _Py_Dealloc call PyObject_GC_UnTrack() if the PyTypeObject has Py_TPFLAGS_HAVE_GC set. Considering that the official Python extension example is missing the call, it seems likely that extension writers often forget to include it.

-- assignee: docs@python components: Documentation, Extension Modules messages: 281146 nosy: colesbury, docs@python priority: normal severity: normal status: open title: Document that tp_dealloc handler must call PyObject_GC_UnTrack if Py_TPFLAGS_HAVE_GC is set versions: Python 3.5 ___ Python tracker <http://bugs.python.org/issue28737> ___