[issue18236] int() and float() do not accept strings with trailing separators

2013-06-23 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: keywords: +patch. Added file: http://bugs.python.org/file30677/5c934626d44d.diff

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-23 Thread Alexander Belopolsky
Changes by Alexander Belopolsky: hgrepos: +201

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-23 Thread STINNER Victor
Changes by STINNER Victor: nosy: +haypo

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-23 Thread Marc-Andre Lemburg
Marc-Andre Lemburg added the comment: I agree with Martin. At the time Unicode was added to Python, there was no single Unicode property for white space, so I had to deduce this from the other available properties. Now that we have a white space property in Unicode, we should start using it.

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-23 Thread Martin v. Löwis
Martin v. Löwis added the comment: I stand by that comment: IsWhiteSpace should use the Unicode White_Space property. Since FS/GS/RS/US are not in the White_Space property, it's correct that the int conversion fails. It's incorrect that .isspace() gives true. There are really several bugs here

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-22 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: Martin v. Löwis wrote at #13391 (msg147634):
> I do think that _PyUnicode_IsWhitespace should use the White_Space
> property (from PropList.txt). I'm not quite sure how they computed
> that property (or whether it's manually curated). Since that's a
> behav

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-22 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: I did a little more investigation, and it looks like information separators have been included in whitespace since the unicode type was first implemented in Python:
guido 11967 Fri Mar 10 22:52:46 2000 +: /* Returns 1 for Unicode characters having the typ

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-22 Thread Alexander Belopolsky
Alexander Belopolsky added the comment: It looks like str.isspace() is incorrect. The proper definition of Unicode whitespace seems to include 26 characters:
# 0009..000D; White_Space # Cc [5] .. 0020 ; White_Space # Zs SPA
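
To see which characters str.isspace() actually accepts on a given interpreter, one can enumerate them directly. This is only a rough sketch: the helper name `isspace_code_points` is made up for illustration, and since the standard `unicodedata` module does not (as far as I know) expose the White_Space property itself, the listing shows the general category and bidirectional class instead, which can then be compared by hand against PropList.txt.

```python
import sys
import unicodedata


def isspace_code_points():
    """Return every code point whose one-character string passes str.isspace().

    Hypothetical helper for illustration; the result depends on the
    interpreter's Unicode database version.
    """
    return [cp for cp in range(sys.maxunicode + 1) if chr(cp).isspace()]


if __name__ == "__main__":
    for cp in isspace_code_points():
        ch = chr(cp)
        # category() and bidirectional() are the properties available
        # from the stdlib; White_Space itself is not printed here.
        print(f"U+{cp:04X} category={unicodedata.category(ch)} "
              f"bidi={unicodedata.bidirectional(ch)}")
```

The number of lines printed can be compared against the 26 characters mentioned above; any surplus entries are candidates for the discrepancy under discussion.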

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-22 Thread Terry J. Reedy
Terry J. Reedy added the comment: I see your point now. Since RS is not whitespace by any definition I knew of previously, why is RS.isspace True? Apparent answer: Doc says '''Return true if there are only whitespace characters in the string and there is at least one character, false otherwise

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-22 Thread Alexander Belopolsky
Alexander Belopolsky added the comment:
> You stated facts: what is your proposal?
There is a bug somewhere. We cannot simultaneously have
>>> '\N{RS}'.isspace()
True
and not accept '\N{RS}' as whitespace when parsing numbers. I believe int(x) should be equivalent to int(x.strip()). This is
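
The proposed invariant, that int(x) behaves like int(x.strip()), can be checked per input with a small sketch like the one below. The helper names `parse_or_error` and `strip_invariant_holds` are hypothetical, and the outcome for inputs containing the information separators depends on the interpreter version, so no result is claimed for them here.

```python
def parse_or_error(text):
    """Return ("ok", value) if int(text) succeeds, else ("error", None)."""
    try:
        return ("ok", int(text))
    except ValueError:
        return ("error", None)


def strip_invariant_holds(text):
    """True when int(text) and int(text.strip()) agree.

    They agree if both succeed with the same value or both raise ValueError.
    """
    return parse_or_error(text) == parse_or_error(text.strip())


if __name__ == "__main__":
    # U+001E is RECORD SEPARATOR, the character used in the report.
    for sample in [" 42\n", "42", "abc", "42\x1e", "\x1e42"]:
        print(repr(sample), strip_invariant_holds(sample))
```

An input for which this prints False is one where str.strip() and int() disagree about what counts as whitespace.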

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-21 Thread Terry J. Reedy
Terry J. Reedy added the comment: You stated facts: what is your proposal? The fact that Unicode calls characters 'space' does not make them whitespace as commonly understood, or as defined by C, or even as defined by the Unicode database. Unicode apparently has a WSpace property. According to

[issue18236] int() and float() do not accept strings with trailing separators

2013-06-16 Thread Alexander Belopolsky
New submission from Alexander Belopolsky: ASCII information separators, hex codes 1C through 1F, are classified as space:
>>> all(c.isspace() for c in '\N{FS}\N{GS}\N{RS}\N{US}')
True
but int()/float() do not accept strings with leading or trailing separators:
>>> int('123\N{RS}')
Traceback (mo
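
The report can be reproduced on any interpreter with a short script along these lines. It is a sketch, not part of the original submission: the helper `accepts` and the constant `SEPARATORS` are illustrative names, the separators are written as hex escapes rather than \N{...} names, and the printed results depend on the Python version in use.

```python
# ASCII information separators FS, GS, RS, US (U+001C through U+001F).
SEPARATORS = "\x1c\x1d\x1e\x1f"


def accepts(parser, text):
    """Return True if parser(text) succeeds, False if it raises ValueError."""
    try:
        parser(text)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for ch in SEPARATORS:
        # Compare the str.isspace() classification with what the
        # numeric constructors tolerate as trailing "whitespace".
        print(f"U+{ord(ch):04X} isspace={ch.isspace()} "
              f"int={accepts(int, '123' + ch)} "
              f"float={accepts(float, '1.5' + ch)}")
```

A row showing isspace=True next to int=False or float=False is the inconsistency described in this issue.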