[issue44699] Simple regex appears to take exponential time in length of input
New submission from János Brezniczky : I don't know if it's normal, but it's impractical. I seem to have come by an expression consuming o(c^n) CPU cycles with c around 2. Regex: \*([^*]+)*\* Resulted in times (in seconds) of 0.17 (* is there anybody out?) 0.34 1 0.69 12 1.36 123 2.73 1234 5.44 12345 11.1 123456 (* is there anybody123456 out?) Please see the source for more details/repro. -- components: Regular Expressions files: regex_test.py messages: 397948 nosy: brezniczky, ezio.melotti, mrabarnett priority: normal severity: normal status: open title: Simple regex appears to take exponential time in length of input type: performance versions: Python 3.8 Added file: https://bugs.python.org/file50169/regex_test.py ___ Python tracker <https://bugs.python.org/issue44699> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44699] Simple regex appears to take exponential time in length of input
János Brezniczky added the comment: Might be related to https://bugs.python.org/issue43075, as in would trigger ReDOSability had my syntax highlighting-related changes (which eventually broke golden master tests, fortunately :) ) hit the public. (Meaning my reduced - possibly minimal - corner case stems from a practical scenario as well.) -- ___ Python tracker <https://bugs.python.org/issue44699> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44699] Simple regex appears to take exponential time in length of input
János Brezniczky added the comment: I'd also raise for consideration the introduction a (default?) timeout on regexes, similarly to how such a feature seems available in .NET. Given the DOS vector vs. occasionally non-trivially complex expressions, this could draw developer attention to this security aspect and stimulate the evolution of a more secure ecosystem. https://docs.microsoft.com/en-us/dotnet/api/system.text.regularexpressions.regex.matchtimeout?view=net-5.0 -- ___ Python tracker <https://bugs.python.org/issue44699> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue44699] Simple regex appears to take exponential time in length of input
János Brezniczky added the comment: Hello! Thanks for the reply - you are probably right, and I accept that it's normal. However, there may be numerous frameworks getting built on the concept of collaboratively contributed regexes. (Python likes prototyping, prototyping likes portable programming constructs, regular expressions are portable, for this and other factors and as past practice suggests, these will be a temptingly plausible choice as a basis for frameworks, etc.) So I do accept it's normal (I haven't been deeply into regexes for cca 10 years, but I vaguely recall what's complex depends a lot on the type of engine, maybe A or B? pass on that one). However people make mistakes that attackers seek... (Not to lose that point, I am otherwise okay to let it go as "not a bug" if you're certain! I guess something should be done though.) -- ___ Python tracker <https://bugs.python.org/issue44699> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com