[issue39776] Crash in decimal module in heavy-multithreaded scenario
New submission from Evgeny Boytsov:

Hello everybody!

We are using Python 3.7 on CentOS 7 x64. Python is used as a library to create dynamic extensions for our app server. Some time ago we began to experience crashes in the decimal module in some heavily multithreaded scenarios. After some testing and debugging I was able to reproduce the crash without our own code, using only the pybind11 library to simplify embedding (the real app uses boost.python). I built Python 3.8 with clang 7 and AddressSanitizer enabled and got a "use-after-free" error with some additional data. Please find attached the C++ source file, the Python module, and the ASAN output.

Is it really a bug (most probably a data race), or is there something wrong with this embedding scenario?

--
components: Interpreter Core
files: decimal_crash.zip
messages: 362807
nosy: boytsovea
priority: normal
severity: normal
status: open
title: Crash in decimal module in heavy-multithreaded scenario
versions: Python 3.7, Python 3.8
Added file: https://bugs.python.org/file48923/decimal_crash.zip

___
Python tracker <https://bugs.python.org/issue39776>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

Please note that UnlockGILandSleep takes the GIL back before returning. In the real production code there is a database query at this point; in this example I emulate it with a random sleep. So I don't see any problems here.
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

Thank you for the feedback. I will try to reproduce the issue with 3.6.

By the way, did you use gdb with the Python pretty-printers enabled to examine the state of the program? I got the same error message when I interrupted execution in the debugger and tried to examine the call stacks of the threads that were stuck in UnlockGILandSleep. The reason is clear: when the debugger builds a call stack, some of the pretty-printers execute Python code to give a better representation of interpreter objects. That code is executed on top of the stack of the examined thread. Since this thread explicitly released the GIL before going to sleep, these functions hit the assertion about calling the memory allocator without holding the GIL. Disabling the pretty-printers makes these error messages disappear.
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

I am unable to reproduce either my issue or yours with Python 3.6. The program runs indefinitely, as it is meant to. Can you please give me the C++ traceback from the core dump that was created when you ran my program?
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

Your call stack is very strange. At line 30 of main.cpp the GIL is obviously held:

// importing module in this thread
gstate = PyGILState_Ensure();
py::module crash_test = py::module::import( "crash_test" ); <-- import
PyGILState_Release( gstate );

I suppose that there is something wrong with your setup. Maybe the working directory of the main executable is wrong and doesn't contain crash_test.py.

Also, I tried reverting this patch for 3.7: https://github.com/python/cpython/pull/5278. That makes the problem disappear: 1 hour of stable work under ASAN. So I suppose it is the source of the bug. I will try to tweak _testembed.c.

--
resolution: -> not a bug
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

I rewrote my example without pybind and eliminated the C++ module (I realized that time.sleep() also releases the GIL, so we achieve the same effect). The results are still the same: with Python 3.7.3 the app crashes with the attached ASAN output; with Python 3.7.3 without https://github.com/python/cpython/pull/5278 it works just fine. To run main.cpp you should add the directory containing crash_test.py to PYTHONPATH.

--
Added file: https://bugs.python.org/file48930/threaded_crash.zip
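The attached crash_test.py is not reproduced in this thread. As a rough illustration of the workload shape described above (Decimal arithmetic interleaved with a GIL-releasing sleep), a minimal sketch might look like the following. The names `worker` and `run` and all parameters are hypothetical, and the threads here are created by `threading` rather than by an embedding C++ program, so nothing in this sketch has been shown to trigger the crash.

```python
import random
import threading
import time
from decimal import Decimal


def worker(iterations, results, index):
    """Do Decimal arithmetic, sleeping between steps so the GIL is released."""
    total = Decimal(0)
    for i in range(iterations):
        # Decimal arithmetic consults the current thread's decimal context.
        total += Decimal(i) / Decimal(3)
        # time.sleep() releases the GIL; it stands in for the database query.
        time.sleep(random.uniform(0, 0.001))
    results[index] = total


def run(thread_count=8, iterations=50):
    """Start thread_count workers, wait for them, and return their totals."""
    results = [None] * thread_count
    threads = [
        threading.Thread(target=worker, args=(iterations, results, n))
        for n in range(thread_count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Every worker performs the same computation in its own context, so all entries returned by `run()` should be equal; a mismatch or a crash would point at state leaking between threads.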
[issue39776] Crash in decimal module in heavy-multithreaded scenario
Evgeny Boytsov added the comment:

I also understood the source of your crash with my initial example. Since you didn't use CMake to configure the project, pybind didn't set up the macros required to enable threading support. So there are no issues in pybind.
[issue39776] PyContextVar_Get(): crash due to race condition in updating tstate->id
Evgeny Boytsov added the comment:

I checked both my test example and the real production code with your patch. I'm unable to reproduce the bug, so I think it is fixed now. Thank you!
[issue30929] AttributeErrors after import in multithreaded environment
New submission from Evgeny Boytsov:

Hello everybody!

We are using Python 3.4 on CentOS 7 x64 and are experiencing problems with simultaneous imports of modules from different threads of execution. The attached archive contains a simple example that demonstrates the issue.

There is a Python package called "mod". It contains two sub-packages called "foo" and "bar". They, in turn, contain modules "foo" and "bar" respectively. Another package called "threader" contains the function "run_threads()", which starts two threads of execution (using threading) and joins them. The first thread imports "mod.foo.foo" and calls its function. The second thread does the same for the module "mod.bar.bar". Finally, there is a module "test.py" which calls "threader.run_threads()" and, when it returns, tries to import "mod.foo.foo" and call its function.

To reproduce the bug you need to run "test.py" several times. Most of the runs complete successfully, but sometimes we encounter an AttributeError during import, like this:

'module' object has no attribute 'foo'

We did some investigation to clarify this behaviour and added an import hook (you can see it in test.py) that logs all imports that take place.
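The hook from test.py is not shown in this message. One way such a logger could be written is as a meta path finder that records every module name it is asked about and then defers to the regular finders. `LoggingFinder` and `install_logging_finder` are hypothetical names chosen for this sketch, and the "[py][import]" prefix is only copied from the log format used in this report.

```python
import importlib.abc
import sys


class LoggingFinder(importlib.abc.MetaPathFinder):
    """Log each import request, then let the normal finders handle it."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        print("[py][import]", fullname)
        # Returning None tells the import system to try the next finder.
        return None


def install_logging_finder():
    """Put a LoggingFinder at the front of sys.meta_path and return it."""
    finder = LoggingFinder()
    sys.meta_path.insert(0, finder)
    return finder
```

Meta path finders are consulted only when the requested name is not already in sys.modules, so a name that such a hook logs twice was looked up twice while missing from the module cache.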

When the run completes successfully, the order of imports looks like this:

$ python3 test.py
[py][import] threader
[py][import] threading
[py][import] time
[py][import] traceback
[py][import] linecache
[py][import] tokenize
[py][import] collections
[py][import] _collections
[py][import] operator
[py][import] _operator
[py][import] keyword
[py][import] heapq
[py][import] itertools
[py][import] _heapq
[py][import] reprlib
[py][import] re
[py][import] sre_compile
[py][import] _sre
[py][import] sre_parse
[py][import] sre_constants
[py][import] _locale
[py][import] copyreg
[py][import] token
[py][import] mod
[py][import] mod.foo
[py][import] mod.bar
[py][import] mod.foo.foo
[py][import] mod.bar.bar
print from foo
print from bar

When the issue arises we see the following:

$ python3 test.py
[py][import] threader
[py][import] threading
[py][import] time
[py][import] traceback
[py][import] linecache
[py][import] tokenize
[py][import] collections
[py][import] _collections
[py][import] operator
[py][import] _operator
[py][import] keyword
[py][import] heapq
[py][import] itertools
[py][import] _heapq
[py][import] reprlib
[py][import] re
[py][import] sre_compile
[py][import] _sre
[py][import] sre_parse
[py][import] sre_constants
[py][import] _locale
[py][import] copyreg
[py][import] token
[py][import] mod
[py][import] mod.foo
[py][import] mod
[py][import] mod.bar
[py][import] mod.foo.foo
[py][import] mod.bar.bar
thread of foo: import error 'module' object has no attribute 'foo'
print from bar

That is, when the issue arises there are two imports of the package "mod". The most confusing thing about this scenario is that even after "run_threads()" has completed, the interpreter cannot import "mod.foo.foo" and raises the same AttributeError in the for-loop inside test.py.

--
components: Interpreter Core
files: mod.zip
messages: 298349
nosy: boytsovea
priority: normal
severity: normal
status: open
title: AttributeErrors after import in multithreaded environment
type: behavior
versions: Python 3.4
Added file: http://bugs.python.org/file47014/mod.zip

___
Python tracker <http://bugs.python.org/issue30929>
___
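The contents of mod.zip are not included in this message. The scenario described above can be sketched in a self-contained way by generating the package tree on disk and importing the two submodules from two threads. The helper names `make_package_tree` and `import_and_call` and the function name `hello` are assumptions made for this sketch; only the package layout, `run_threads()`, and the output strings follow the report.

```python
import pathlib
import threading


def make_package_tree(root):
    """Create mod/foo/foo.py and mod/bar/bar.py under root."""
    base = pathlib.Path(root) / "mod"
    for name in ("foo", "bar"):
        pkg = base / name
        pkg.mkdir(parents=True, exist_ok=True)
        (pkg / "__init__.py").touch()
        (pkg / (name + ".py")).write_text(
            "def hello():\n    return 'print from " + name + "'\n"
        )
    (base / "__init__.py").touch()


def import_and_call(module_name, out):
    """Import a dotted module name and call its hello() function."""
    try:
        # Like "import mod.foo.foo": __import__ returns the top package, and
        # the submodule is then reached through attribute access.
        obj = __import__(module_name)
        for part in module_name.split(".")[1:]:
            obj = getattr(obj, part)
        out[module_name] = obj.hello()
    except (ImportError, AttributeError) as exc:
        out[module_name] = "import error %s" % exc


def run_threads():
    """Import mod.foo.foo and mod.bar.bar concurrently from two threads."""
    out = {}
    threads = [
        threading.Thread(target=import_and_call, args=(name, out))
        for name in ("mod.foo.foo", "mod.bar.bar")
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return out
```

To try it, call `make_package_tree()` on a fresh temporary directory, put that directory on sys.path, and run `run_threads()` in a new interpreter process each time, since a module stays cached in sys.modules after the first successful import. The attribute walk is deliberate: the reported AttributeError comes from resolving "foo" on a package object, which `importlib.import_module()` would bypass by returning the submodule directly.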
[issue30929] AttributeErrors after import in multithreaded environment
Evgeny Boytsov added the comment:

The same behaviour is reproducible on Ubuntu 16.04 with Python 3.5.2. A colleague of mine said that he was able to reproduce the issue with Python 3.6 as well.