[issue37971] Wrong trace with multiple decorators (linenumber wrong in frame)
New submission from Joran van Apeldoorn:

When applying multiple decorators to a function, a traceback from the top decorator shows the bottom decorator instead. For example

    def printingdec(f):
        raise Exception()
        return f

    def dummydec(f):
        return f

    @printingdec
    @dummydec
    def foo():
        pass

gives

    Traceback (most recent call last):
      File "bug.py", line 9, in <module>
        @dummydec
      File "bug.py", line 2, in printingdec
        raise Exception()
    Exception

instead of

    Traceback (most recent call last):
      File "bug.py", line 8, in <module>
        @printingdec
      File "bug.py", line 2, in printingdec
        raise Exception()
    Exception

Digging around with sys._getframe(), it seems that the frame's line number is set wrong internally, leading to the wrong line being displayed. The AST does show the correct line number.

--
messages: 350686
nosy: control-k
priority: normal
severity: normal
status: open
title: Wrong trace with multiple decorators (linenumber wrong in frame)
type: behavior
versions: Python 3.6

___
Python tracker <https://bugs.python.org/issue37971>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
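A minimal sketch, not part of the original report, for checking which line the module-level frame reports without reading the traceback by eye. The helper name `module_frame_lineno` is hypothetical, and the source is the `bug.py` example from the report compiled from a string. The number it prints depends on the interpreter version, so no expected value is given.

```python
import traceback

# The example from the report; line 8 is @printingdec, line 9 is @dummydec.
SRC = """def printingdec(f):
    raise Exception()
    return f

def dummydec(f):
    return f

@printingdec
@dummydec
def foo():
    pass
"""


def module_frame_lineno(src, filename="bug.py"):
    """Run src and return the line reported for its module-level frame
    when an exception propagates, or None if nothing is raised.

    Hypothetical helper, written only for this illustration.
    """
    code = compile(src, filename, "exec")
    try:
        exec(code, {})
    except Exception as exc:
        # Walk the traceback entries and pick the one for the compiled module.
        for frame in traceback.extract_tb(exc.__traceback__):
            if frame.filename == filename and frame.name == "<module>":
                return frame.lineno
    return None


print(module_frame_lineno(SRC))
```

A result of 8 would point at `@printingdec`, the decorator that actually raised; 9 would reproduce the behaviour described in the report.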
[issue37971] Wrong trace with multiple decorators (linenumber wrong in frame)
Joran van Apeldoorn added the comment:

Digging around with the disassembler shows that this originates in the bytecode. Code:

    import dis
    src = """
    def printingdec(f):
        raise Exception()
        return f

    def dummydec(f):
        return f

    @printingdec
    @dummydec
    def foo():
        pass
    """
    code = compile(src,filename="bug.py",mode='exec')
    print(dis.dis(code))

gives on 3.6:

      2           0 LOAD_CONST               0 ()
                  2 LOAD_CONST               1 ('printingdec')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (printingdec)

      6           8 LOAD_CONST               2 ()
                 10 LOAD_CONST               3 ('dummydec')
                 12 MAKE_FUNCTION            0
                 14 STORE_NAME               1 (dummydec)

      9          16 LOAD_NAME                0 (printingdec)

     10          18 LOAD_NAME                1 (dummydec)
                 20 LOAD_CONST               4 ()
                 22 LOAD_CONST               5 ('foo')
                 24 MAKE_FUNCTION            0
                 26 CALL_FUNCTION            1
                 28 CALL_FUNCTION            1
                 30 STORE_NAME               2 (foo)
                 32 LOAD_CONST               6 (None)
                 34 RETURN_VALUE
    None

and on 3.9:

      2           0 LOAD_CONST               0 ()
                  2 LOAD_CONST               1 ('printingdec')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (printingdec)

      6           8 LOAD_CONST               2 ()
                 10 LOAD_CONST               3 ('dummydec')
                 12 MAKE_FUNCTION            0
                 14 STORE_NAME               1 (dummydec)

      9          16 LOAD_NAME                0 (printingdec)

     10          18 LOAD_NAME                1 (dummydec)

     11          20 LOAD_CONST               4 ()
                 22 LOAD_CONST               5 ('foo')
                 24 MAKE_FUNCTION            0
                 26 CALL_FUNCTION            1
                 28 CALL_FUNCTION            1
                 30 STORE_NAME               2 (foo)
                 32 LOAD_CONST               6 (None)
                 34 RETURN_VALUE

    Disassembly of :
      3           0 LOAD_GLOBAL              0 (Exception)
                  2 CALL_FUNCTION            0
                  4 RAISE_VARARGS            1

      4           6 LOAD_FAST                0 (f)
                  8 RETURN_VALUE

    Disassembly of :
      7           0 LOAD_FAST                0 (f)
                  2 RETURN_VALUE

    Disassembly of :
     12           0 LOAD_CONST               0 (None)
                  2 RETURN_VALUE
    None

The change from 3.6 seems to be that a new line number is introduced for instruction 20, loading the function code, which seems reasonable. It would feel natural if the line number of the decorator were used for instructions 26 and 28, the decorator calls.

--
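The same information can be pulled out programmatically rather than read off the `dis` listing. The following is a rough sketch, not taken from the thread: the helper name `call_instruction_lines` is hypothetical, and it relies only on `dis.findlinestarts` and `dis.get_instructions`. Opcode names and line attribution differ between interpreter versions, so the output is not shown.

```python
import dis
from bisect import bisect_right

# Same source as in the message above (note the leading newline, which
# shifts every line number by one compared with bug.py).
SRC = """
def printingdec(f):
    raise Exception()
    return f

def dummydec(f):
    return f

@printingdec
@dummydec
def foo():
    pass
"""


def call_instruction_lines(code):
    """Return (offset, opname, line) for each call instruction in code.

    Hypothetical helper: the line is looked up from the line-start table,
    i.e. the last entry whose offset is <= the instruction's offset.
    """
    starts = list(dis.findlinestarts(code))   # (offset, line), ascending offset
    offsets = [offset for offset, _ in starts]
    result = []
    for ins in dis.get_instructions(code):
        if not ins.opname.startswith("CALL"):
            continue
        idx = bisect_right(offsets, ins.offset) - 1
        line = starts[idx][1] if idx >= 0 else None
        result.append((ins.offset, ins.opname, line))
    return result


code = compile(SRC, filename="bug.py", mode="exec")
for offset, opname, line in call_instruction_lines(code):
    print(offset, opname, line)
```

If the two decorator calls are attributed to lines 9 and 10 (the decorator lines in this source), the traceback would name the decorator that was being applied; attribution to line 11 corresponds to the 3.9 listing above.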
[issue37971] Wrong trace with multiple decorators (linenumber wrong in frame)
Joran van Apeldoorn added the comment:

After compiling 3.7 and 3.8 as well, it seems that the change happened between those versions. I was able to patch compile.c for 3.9 to make it work (first time changing CPython internals, so no guarantees). Patch is attached.

This trips up one of the tests in test_trace, however, since both the LOAD_NAME before the function def and the CALL_FUNCTION after are counted as a visit to the decorator line. However, this is also the case for your example with the decorators written out. Running:

    def deco1(f):
        return f

    def deco2(f):
        return f

    def go():
        f = 5
        f = (
            deco1(
                deco2(
                    f
                )
            )
        )

    import trace
    tracer = trace.Trace(count=1,trace=0,countfuncs=0, countcallers=0)
    tracer.run('go()')
    for k,v in tracer.results().counts.items():
        print(k,v)

gives

    ('<string>', 1) 1
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 8) 1
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 10) 2
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 11) 2
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 12) 1
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 5) 1
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 2) 1
    ('/home/user/projects/ShortUse/tracebug/cpython3.9clean/mytracetest.py', 9) 1

while clearly each function is only called once.

In addition, to get back to the 3.6/3.7 problem as well, on 3.6 the slight modification

    def deco1(f):
        raise Exception()
        return f

    def deco2(f):
        return f

    f = 5
    f = (
        deco1(
            deco2(
                f
            )
        )
    )

gives

    Traceback (most recent call last):
      File "sixtest.py", line 12, in <module>
        f
      File "sixtest.py", line 2, in deco1
        raise Exception()
    Exception

So the problem is not only with decorators; it is with function calls spread over multiple lines, in all versions. It seems that:

1. The problem with tracebacks for function calls on multiple lines has been fixed in going from 3.7 to 3.8 (should this fix be merged down as well?)
2. The same problem for decorators has not been fixed (patch attached for 3.9)
3. The fix in 3.8 introduced a bug in the trace module which seems hard to fix.

--
keywords: +patch
Added file: https://bugs.python.org/file48567/decolinenumbers.patch
[issue45869] Unicode and acii regular expressions do not agree on ascii space characters
New submission from Joran van Apeldoorn:

The expectation would be that the re.A (or re.ASCII) flag should not impact the matching behavior of a regular expression on strings consisting only of ASCII characters. However, for the characters 0x1c through 0x1f, the classes \s and \S differ. In ASCII mode these characters are not considered space characters, while in Unicode mode they are. Note that Python strings do consider these characters spaces, as '\x1c'.isspace() gives True.

All other classes and characters stay the same for Unicode and ASCII matching.

--
components: Regular Expressions
files: unicode-ascii-space.py
messages: 406773
nosy: control-k, ezio.melotti, mrabarnett
priority: normal
severity: normal
status: open
title: Unicode and acii regular expressions do not agree on ascii space characters
versions: Python 3.10, Python 3.11, Python 3.8, Python 3.9
Added file: https://bugs.python.org/file50457/unicode-ascii-space.py
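The attached unicode-ascii-space.py is not reproduced in this archive. As a stand-in, here is a minimal sketch of the comparison the report describes: for every code point below 128, test a one-character class with and without re.ASCII and collect the code points where the two disagree. The function name `ascii_class_differences` is an assumption made for this illustration.

```python
import re


def ascii_class_differences(cls):
    """Return the ASCII code points on which the class `cls` (e.g. r"\\s")
    matches differently with and without the re.ASCII flag.

    Hypothetical helper, not the script attached to the issue.
    """
    unicode_pat = re.compile(cls)
    ascii_pat = re.compile(cls, re.ASCII)
    differing = []
    for cp in range(128):
        ch = chr(cp)
        in_unicode = unicode_pat.fullmatch(ch) is not None
        in_ascii = ascii_pat.fullmatch(ch) is not None
        if in_unicode != in_ascii:
            differing.append(cp)
    return differing


for cls in (r"\s", r"\d", r"\w"):
    print(cls, [hex(cp) for cp in ascii_class_differences(cls)])
```

For \s this lists the four code points named in the report (0x1c through 0x1f); \d and \w come back empty.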
[issue45869] Unicode and acii regular expressions do not agree on ascii space characters
Change by Joran van Apeldoorn:

--
type:  -> behavior
[issue45869] Unicode and acii regular expressions do not agree on ascii space characters
Joran van Apeldoorn added the comment:

Small addition: the sre categories CATEGORY_LINEBREAK and CATEGORY_UNI_LINEBREAK also do not agree on ASCII characters. The first is only '\n', while the second also includes, for example, '\r' and some others. However, these do not seem to correspond to anything and are never used in sre_parse.py or sre_compile.py.

--
[issue45869] Unicode and acii regular expressions do not agree on ascii space characters
Joran van Apeldoorn added the comment:

Hi, I was not suggesting that the documentation literally says they should be the same, but it might be unexpected for users if ASCII characters change properties depending on whether they are considered in a Unicode or pure ASCII setting.

The documentation says about re.A: "Make \w, \W, \b, \B, \d, \D, \s and \S perform ASCII-only matching instead of full Unicode matching." The problem might be that there is no clear notion of "ASCII-only matching". I assumed this meant matching ASCII characters only, i.e., the character classes are simply limited to codes below 128.

About \s the documentation says: "Matches Unicode whitespace characters (which includes [ \t\n\r\f\v], and also many other characters, for example the non-breaking spaces mandated by typography rules in many languages). If the ASCII flag is used, only [ \t\n\r\f\v] is matched." This heavily implies that there are non-ASCII characters in Unicode that might be considered spaces, but that the ASCII space characters are [ \t\n\r\f\v], although again, this is not stated literally.

There might be valid reasons to change the definition (even for ASCII characters) depending on re.A, but should it then not follow the Unicode standard for white space in the Unicode case (which would coincide with the current ASCII case)? There seem to be many different places where Python is opinionated about what a space is, but not much consistency behind it.

I am a bit worried about the undocumented nature of the precise definitions of the regex classes in general. How is a user supposed to know that the default behavior of \s, when no flag is passed, is to also match ASCII characters other than those mentioned for the ASCII case? In contrast to this, the \d class is directly defined as the Unicode category [Nd].
It is likely too hard to change, and too many things depend on it, but the following definitions would make more sense to me, and hopefully to others:

- Character classes are defined as a set of Unicode properties/categories, following the same definitions as elsewhere in Python.
- If re.A is passed, they are this same set but limited to codes below 128.

After some digging in the code I traced the current definitions as follows:

- For Unicode, Py_UNICODE_ISSPACE is called, which either does a lookup in the constant table _Py_ascii_whitespace or calls _PyUnicode_IsWhitespace for non-ASCII characters. Both of these define a space as "Unicode characters having the bidirectional type 'WS', 'B' or 'S' or the category 'Zs'", i.e., this is simply the Unicode string isspace() definition.
- For ASCII, Py_ISSPACE is called, which does a lookup in _Py_ctype_table. It is unclear to me how this table was made.

So sre just follows the other Python definitions. In searching around I found issue #18236, which also considers how the Python definition differs from the Unicode one.

--
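The whitespace rule quoted in the message (bidirectional type 'WS', 'B' or 'S', or category 'Zs') can be checked against str.isspace() from Python itself. This is a small sketch under that assumption, using only the unicodedata module. `is_space_by_rule` and `ASCII_FLAG_SPACES` are names introduced here for illustration, and the check is restricted to code points below 128.

```python
import unicodedata

# The characters the re documentation lists for \s when re.A is used.
ASCII_FLAG_SPACES = set(" \t\n\r\f\v")


def is_space_by_rule(ch):
    """Apply the rule quoted above: category Zs, or bidirectional
    type WS, B or S. Hypothetical helper for this illustration."""
    return (unicodedata.category(ch) == "Zs"
            or unicodedata.bidirectional(ch) in ("WS", "B", "S"))


ascii_chars = [chr(cp) for cp in range(128)]
rule_spaces = {ch for ch in ascii_chars if is_space_by_rule(ch)}
isspace_spaces = {ch for ch in ascii_chars if ch.isspace()}

# Does the quoted rule agree with str.isspace() on ASCII?
print(rule_spaces == isspace_spaces)

# ASCII characters that the rule treats as spaces but re.A does not.
print(sorted(hex(ord(ch)) for ch in rule_spaces - ASCII_FLAG_SPACES))
```

On ASCII the rule and str.isspace() agree, and the only characters outside the re.A set are the four separators 0x1c through 0x1f, which is the discrepancy this issue is about.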