[issue24384] difflib.SequenceMatcher faster quick_ratio with lower bound specification
floyd added the comment:

Yes, I agree this should be closed, especially because my proposed code is so incredibly bad (e.g. regarding performance) that it should be rejected. Back then I was horribly wrong and didn't understand the problem well enough yet.

If somebody would like to have such a function, this is all it needs:

    def quick_ratio_ge(self, a, b, threshold):
        return threshold <= 2.0*(len(a))/(len(a)+len(b))

Here is how I actually use it in code:
https://github.com/modzero/burp-ResponseClusterer/blob/master/ResponseClusterer.py#L343

Sorry for the fuss

--
resolution:  -> rejected
stage:  -> resolved
status: open -> closed

___
Python tracker <https://bugs.python.org/issue24384>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
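For readers who want to try the length-only check from the comment above, here is a minimal sketch. It is not the code from the linked repository. The function name length_upper_bound_ge is hypothetical, and using min(len(a), len(b)) instead of len(a) is my assumption: it makes the bound symmetric and matches what difflib's own real_quick_ratio() computes. The one-liner as posted is still a valid filter when a is the longer sequence, just a looser one.

```python
import difflib


def length_upper_bound_ge(a, b, threshold):
    """Return False only when ratio() cannot reach threshold, judged by lengths alone.

    Hypothetical helper: the number of matching elements can never exceed the
    length of the shorter sequence, so 2*min(la, lb)/(la + lb) bounds ratio().
    """
    total = len(a) + len(b)
    if total == 0:
        # difflib reports a ratio of 1.0 for two empty sequences.
        return threshold <= 1.0
    return threshold <= 2.0 * min(len(a), len(b)) / total


# The same bound is exposed by the standard library as real_quick_ratio().
a, b = "abcd", "abcdefgh"
sm = difflib.SequenceMatcher(None, a, b)
assert length_upper_bound_ge(a, b, 0.7) == (sm.real_quick_ratio() >= 0.7)
```

A True result only means the threshold is still reachable; the actual ratio has to be computed to confirm it.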
[issue24384] difflib.SequenceMatcher faster quick_ratio with lower bound specification
New submission from floyd:

I guess a lot of users of difflib call the SequenceMatcher in the following way (where a and b often have different lengths):

    if difflib.SequenceMatcher.quick_ratio(None, a, b) >= threshold:

However, for this use case the current quick_ratio wastes a lot of time. Therefore I propose adding an additional, optimized version, quick_ratio_ge, which would be called like this:

    if difflib.SequenceMatcher.quick_ratio_ge(None, a, b, threshold):

Because we can calculate an upper bound for the ratio from the lengths of a and b alone, this function would return much faster in a lot of cases.

An example of how quick_ratio_ge could be implemented is attached.

--
components: Library (Lib)
files: difflib_SequenceMatcher_quick_ratio_ge.py
messages: 244840
nosy: floyd
priority: normal
severity: normal
status: open
title: difflib.SequenceMatcher faster quick_ratio with lower bound specification
type: enhancement
Added file: http://bugs.python.org/file39625/difflib_SequenceMatcher_quick_ratio_ge.py
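To make the threshold use case concrete, here is a minimal sketch that uses only methods difflib already provides. It is not the attached file. The helper name is_similar is hypothetical, and the calls in the message above are shorthand: in current difflib the sequences go to the SequenceMatcher constructor and quick_ratio() takes no further arguments. The sketch relies on real_quick_ratio() and quick_ratio() both being upper bounds on ratio(), so the cheap checks can rule a pair out early.

```python
import difflib


def is_similar(a, b, threshold):
    """Hypothetical helper: True if ratio() >= threshold, trying cheap bounds first."""
    sm = difflib.SequenceMatcher(None, a, b)
    # Both quick variants are upper bounds on ratio(), so if either falls
    # below the threshold the expensive ratio() call is skipped entirely.
    return (sm.real_quick_ratio() >= threshold   # lengths only
            and sm.quick_ratio() >= threshold    # element counts, order ignored
            and sm.ratio() >= threshold)         # full matching


if __name__ == "__main__":
    print(is_similar("hello world", "hello wurld", 0.8))
    print(is_similar("abcd", "abcdefgh", 0.7))
```

Because `and` short-circuits, a pair with very different lengths is rejected after the length check alone.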
[issue24384] difflib.SequenceMatcher faster quick_ratio with lower bound specification
floyd added the comment:

Now that I've given it another thought, I think it would be better to simply add threshold as a named parameter of quick_ratio.
[issue24384] difflib.SequenceMatcher faster quick_ratio with lower bound specification
floyd added the comment:

I agree with the separate function (especially as the return value would change from float to bool).

In my experience this is one of the most frequently occurring use cases for difflib in practice. Another reason is that it is not obvious to users that they can optimize it with the attached version.

Some more opinions would be nice. If this suggestion is rejected, we could include a performance warning for this use case in the docs, and/or I'll write an online code recipe that can be linked to.
[issue5849] Idle 3.01 - invalid syntec error
New submission from R.D. floyd:

I recently upgraded to OS 10.5. I am an experienced Fortran, Basic, et al. programmer learning Python. IDLE 3.01 gives an invalid syntax error when running the attached program. IDLE from MacPython 2.xx runs it OK!

--
components: IDLE
files: odbchelper.py
messages: 86609
nosy: r2d2floyd
severity: normal
status: open
title: Idle 3.01 - invalid syntec error
type: compile error
versions: Python 3.0
Added file: http://bugs.python.org/file13793/odbchelper.py

___
Python tracker <http://bugs.python.org/issue5849>
___