[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()
New submission from Nabeel Alzahrani:

difflib.SequenceMatcher.ratio() gives 0.3 instead of 1.0, or at least 0.9, for the following two strings a and b:

a=""" #include #include using namespace std; int main() { string userWord; unsigned int i; cin >> userWord; for(i = 0; i < userWord.size(); i++) { if(userWord.at(i) == 'i') { userWord.at(i) = '1'; } if(userWord.at(i) == 'a') { userWord.at(i) = '@'; } if(userWord.at(i) == 'm') { userWord.at(i) = 'M'; } if(userWord.at(i) == 'B') { userWord.at(i) = '8'; } if(userWord.at(i) == 's') { userWord.at(i) = '$'; } userWord.push_back('!'); } cout << userWord << endl; return 0; } """

b=""" #include #include using namespace std; int main() { string userWord; unsigned int i; cin >> userWord; userWord.push_back('!'); for(i = 0; i < userWord.size(); i++) { if(userWord.at(i) == 'i') { userWord.at(i) = '1'; } if(userWord.at(i) == 'a') { userWord.at(i) = '@'; } if(userWord.at(i) == 'm') { userWord.at(i) = 'M'; } if(userWord.at(i) == 'B') { userWord.at(i) = '8'; } if(userWord.at(i) == 's') { userWord.at(i) = '$'; } } cout << userWord << endl; return 0; } """

--
components: Library (Lib)
messages: 401683
nosy: nalza001
priority: normal
severity: normal
status: open
title: possible wrong result for difflib.SequenceMatcher.ratio()
type: behavior
versions: Python 3.10, Python 3.11, Python 3.6, Python 3.7, Python 3.8, Python 3.9

___
Python tracker <https://bugs.python.org/issue45180>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()
Nabeel Alzahrani added the comment:

But when I turn off the "autojunk" feature for the following example, I get a wrong ratio of 0.5, whereas with autojunk enabled I get the correct ratio of 0.2.

a=""" #include #include using namespace std; int main() { string userPass; int sMaxIndex; char indivChar; int i; cin >> userPass; sMaxIndex = userPass.size() - 1; for (i = 0; i <= sMaxIndex; ++i) { indivChar = userPass.at(i); if (indivChar == 'i') { indivChar = '1'; cout << indivChar; } else if (indivChar == 'a') { indivChar = '@'; cout << indivChar; } else if (indivChar == 'm') { indivChar = 'M'; cout << indivChar; } else if (indivChar == 'B') { indivChar = '8'; cout << indivChar; } else if (indivChar == 's') { indivChar = '$'; cout << indivChar; } else { cout << indivChar; } } cout << "!" << endl; return 0; } """

b=""" #include #include using namespace std; int main() { string ori; cin >> ori; for (int i = 0; i < ori.size(); i++){ if (ori.at(i) == 'i') ori.at(i) = '1'; if (ori.at(i) == 'a') ori.at(i) = '@'; if (ori.at(i) == 'm') ori.at(i) = 'M'; if (ori.at(i) == 'B') ori.at(i) = '8'; if (ori.at(i) == 's') ori.at(i) = '$'; } cout << ori << endl; return 0; } """
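For context on the "autojunk" feature mentioned above: according to the difflib documentation, when a sequence is at least 200 items long, SequenceMatcher marks any item whose duplicates account for more than 1% of that sequence as "popular" and treats it as junk while matching. The sketch below uses made-up strings (not the programs from this report) to show how far that heuristic alone can move ratio(). The inputs and variable names are illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical inputs, not the strings from the report: b is a with one
# extra leading character, so the two are nearly identical.
a = "abcdefghij" * 30      # 300 characters, each letter repeated 30 times
b = "X" + a                # 301 characters

# Default behaviour (autojunk=True). b has more than 200 characters and
# every letter in it is frequent enough to be marked "popular".
with_autojunk = SequenceMatcher(None, a, b).ratio()

# Same comparison with the heuristic switched off.
without_autojunk = SequenceMatcher(None, a, b, autojunk=False).ratio()

print(f"autojunk=True : {with_autojunk:.3f}")
print(f"autojunk=False: {without_autojunk:.3f}")
```

On these inputs the default setting scores far lower than autojunk=False, although the strings differ by a single character. This does not by itself establish which setting gives the "right" answer for the strings in this report; it only shows that the two settings can disagree widely on character-level input.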
[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()
Change by Nabeel Alzahrani:

--
resolution: not a bug ->
status: closed -> open
[issue45180] possible wrong result for difflib.SequenceMatcher.ratio()
Nabeel Alzahrani added the comment:

Here are the steps that I used to calculate 0.2 for the last example:

1. I used the class difflib.HtmlDiff to find the number of changed chars (addedChars, deletedChars, and changedChars), which is 1172 (let us call it delta).
2. The size of both strings a and b in this example is 1470 (totalSize).
3. I calculated the similarity ratio using 1-(delta/totalSize) = 1-(1172/1470) = 0.2.

I am assuming that the classes difflib.SequenceMatcher and difflib.HtmlDiff use the same algorithms and arguments, and if so, they should produce the same ratio. Is that right?

--
status: closed -> open
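As a point of comparison for the calculation above: the difflib documentation defines ratio() as 2.0*M/T, where T is the total number of elements in both sequences and M is the number of matches. The sketch below checks that a formula of the form 1-(delta/totalSize) agrees with that definition when delta is counted from SequenceMatcher's own character-level opcodes. The helper names ratio_from_blocks and ratio_from_delta are hypothetical, and the "abcd"/"bcde" pair is the small example from the ratio() documentation, not data from this report.

```python
from difflib import SequenceMatcher


def ratio_from_blocks(a, b):
    """Documented definition: 2.0 * M / T."""
    sm = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in sm.get_matching_blocks())
    total = len(a) + len(b)
    return 2.0 * matched / total if total else 1.0


def ratio_from_delta(a, b):
    """1 - delta / T, with delta counted from SequenceMatcher's opcodes."""
    sm = SequenceMatcher(None, a, b, autojunk=False)
    delta = 0
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag != "equal":
            # Characters of a that were deleted or replaced, plus
            # characters of b that were inserted or used as replacements.
            delta += (i2 - i1) + (j2 - j1)
    total = len(a) + len(b)
    return 1.0 - delta / total if total else 1.0


a, b = "abcd", "bcde"
print(SequenceMatcher(None, a, b).ratio())
print(ratio_from_blocks(a, b))
print(ratio_from_delta(a, b))
```

The two helpers agree because every character is either matched or counted in delta, so T = 2*M + delta. That identity holds only when delta comes from the same alignment. HtmlDiff is built on the line-oriented Differ machinery, which first aligns whole lines and then compares characters within similar lines, using its own junk settings, so a delta read off its output need not match the one SequenceMatcher would compute on the raw strings.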