[Python-Dev] Documenting Python's float.__str__()
Hello,

There appears to be extremely minimal documentation on how floats are
formatted on output.  All I really see is that float.__str__() is
float.__repr__().  So that means that float->str->float does not
result in a different value.

It would be nice if the output format for float was documented, to
the extent this is possible.  #python suggested that I propose a
patch, but I see no way to write a documentation patch without having
any clue about what Python promises, whether in the CPython
implementation or as part of a specification for Python.

What are the promises Python makes about the str() of a float?  Will
it produce 1.0 today and 1.0e0 or +1.0 tomorrow?  When is the result
in exponential notation and when not?  Does any of this depend on the
underlying OS or hardware or Python implementation?  Etc.

I'm guessing that Python is consistent with an IEEE 754 "external
character sequence", but don't know what the IEEE specification says
or whether Python conforms.

I don't really care whether there's documentation for __str__() or
__repr__() or something else.  I'm just thinking that there should be
some way to guarantee a well defined "useful" float output
formatting.  By "useful" I mean in exponential notation when
non-exponential notation is over-long.

I am writing a program that sometimes prints Python floats and want
to be able to document what is printed.  Right now I can't truly
guarantee anything, other than the nan and inf and -inf
representations.  (I feel comfortable with nan and the like because I
don't see it likely that their representations will change.)

Of course I could always re-implement Python's float.__repr__() in
Python so as to have full control, but this should be pointless.
Python's output representation is unlikely to change and Python
should be able to make sufficient promises about its existing float
representation.

I suppose there are similar issues with integers, but the varieties
of floating point number implementations and the existence of both
exponential and non-exponential representations make float
particularly problematic and representations potentially mercurial.

I also don't know if documentation changes with regard to external
representations would require a PEP.

I have found the following related information:

Use shorter float repr when possible
https://bugs.python.org/issue1580
https://github.com/python/cpython/blob/master/Python/pystrtod.c#L831

String conversion and formatting
https://docs.python.org/3/c-api/conversion.html

sys.float_repr_style
https://docs.python.org/3/library/sys.html#sys.float_repr_style

object.__str__(self)
https://docs.python.org/3/reference/datamodel.html#object.__str__

At the end of the day I don't _really_ care.  But having put thought
into the matter I care enough to write this email and ask the
question.

Regards,

Karl
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FV22TKT3S2Q3P7PNN6MCXI6IX3HRRNAL/
Code of Conduct: http://python.org/psf/codeofconduct/
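
A minimal sketch of the behaviour asked about in the message above,
assuming CPython 3 with sys.float_repr_style == 'short'.  The points
at which the output switches to exponential notation are observed
behaviour, not a documented promise, and the helper name round_trips
is hypothetical.

```python
import sys


def round_trips(x: float) -> bool:
    """Return True if float -> repr -> float gives back the same value."""
    return float(repr(x)) == x


# Values chosen to straddle the points where CPython is observed to
# switch between fixed and exponential notation.
samples = [1.0, 0.1, 1e15, 1e16, 0.0001, 0.00001, 1e22, 5e-324]

print("float_repr_style:", sys.float_repr_style)
for x in samples:
    print(f"{repr(x):>22}  round-trips: {round_trips(x)}")

# The special values are printed separately; nan never compares equal
# to itself, so it is left out of the round-trip check.
for text in ("inf", "-inf", "nan"):
    print(repr(float(text)))
```

On such a build 1e15 prints as 1000000000000000.0 while 1e16 prints
as 1e+16, and 0.0001 stays in fixed notation while 0.00001 prints as
1e-05.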
[Python-Dev] Re: Documenting Python's float.__str__()
On Tue, 21 Jan 2020 21:09:57 +1100
Steven D'Aprano wrote:

> On Mon, Jan 20, 2020 at 09:59:07PM -0600, Karl O. Pinc wrote:
>
> > It would be nice if the output format for float was documented, to
> > the extent this is possible.
>
> I don't think we should make any promises about the repr() of floats.
> We've already changed the format at least twice:
>
> - once to switch to the shortest unambiguous representation;
> - and once to shift to a more consistent output for NANs.
>
> (NANs on Windows prior to 2.6 used to be displayed as '1.#IND', if I
> recall correctly.)
>
> We may never want to change output format again, but if we document a
> certain format that will be read by people as a guarantee, and that
> closes the door to any change without a long and tedious deprecation
> period.

Understood.  But you still might want to document, or even define in
the language, that you're outputting the shortest unambiguous
representation.  Or other such broad principles like IEEE 754
representation compatibility.  This is a suggestion, I don't want to
advocate.

> If anyone wants a guaranteed output format for floats, they ought to
> use the various string formatting operations, which offer guaranteed
> formatting outputs. Or build your own formatter.
>
> I think that the most we should promise is that (with the exception
> of NANs) float -> repr -> float should round-trip with no change in
> value.

That would be nice, and is the sort of general principle I'm thinking
of.  Another one might be "a sign is only printed for negative
numbers".

I guess I will advocate for _some_ specification built into Python's
definition.  Otherwise everybody should _always_ build their own
formatter; lest they wake up one morning and find that int zero prints
as "+0".

As mentioned, parts of this discussion could also apply to other
numeric types.

> > I don't really care whether there's documentation for __str__() or
> > __repr__() or something else.  I'm just thinking that there should
> > be some way to guarantee a well defined "useful" float output
> > formatting.
>
> https://docs.python.org/3/library/stdtypes.html#printf-style-string-formatting
>
> https://docs.python.org/3/library/string.html#format-string-syntax

Thanks.  For some reason nobody in #python pointed me to the 'g'
format type.  That resolves my issue.

Unfortunately, because 'g' can strip the trailing ".0", floats
formatted with it no longer satisfy the float->str->float
immutability property.  I can always use:

  out = f'{num:g}'
  print(out if 'e' in out or '.' in out else f'{out}.0')

sort of logic.  (With handling for INF and NAN.)  A cleaner format
would be nice but this works.  (The #g format leaves multiple
trailing zeros, which is too different from the "minimal" form
__repr__() produces.)

FYI.  It wouldn't hurt to have the PyOS_double_to_string() docs
https://docs.python.org/3/c-api/conversion.html
point out that "format" uses the codes as defined in your formatting
links above.  Digging around got me to PyOS_double_to_string()
whereupon I was left in the dark about the meaning of the "format"
codes.

Thank you all for the help.

Regards,

Karl
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MP5OKKVGWLCCYJE7EQ2DOPXFHACGTRN4/
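
The 'g'-based workaround from the message above can be fleshed out
roughly as follows.  This is a minimal sketch, not a vetted
formatter: the function name format_float is hypothetical, and the
inf/nan branch is only one possible reading of "with handling for INF
and NAN".  Note that 'g' with no explicit precision keeps six
significant digits, so values needing more digits are rounded and
will not survive a float->str->float round trip.

```python
import math


def format_float(num: float) -> str:
    """Format num with 'g', restoring a trailing '.0' on integral values.

    Hypothetical helper: 'g' defaults to 6 significant digits, so this
    rounds and does not preserve the exact value the way repr() does.
    """
    out = f'{num:g}'
    # inf and nan contain neither 'e' nor '.', so return them unchanged
    # rather than producing 'inf.0' or 'nan.0'.
    if not math.isfinite(num):
        return out
    return out if 'e' in out or '.' in out else f'{out}.0'


for value in (1.0, 0.5, 100000.0, 1000000.0, 1e20,
              float('inf'), float('nan')):
    print(format_float(value))
```

Passing an explicit precision (for example '.17g') would keep enough
digits to round-trip an IEEE 754 double, at the cost of no longer
producing the short form that repr() gives.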
[Python-Dev] Re: Documenting Python's float.__str__()
On Tue, 21 Jan 2020 09:01:29 -0600
"Karl O. Pinc" wrote:

> I guess I will advocate for _some_ specification built into Python's
> definition.  Otherwise everybody should _always_ build their own
> formatter; lest they wake up one morning and find that int zero prints
> as "+0".

Having made a suggestion I've followed up with a pull request.

https://github.com/python/cpython/pull/18111

I think I have come up with a very minimal and sane set of
restrictions on the default Numeric string representations.

Having done that, I'm less interested in spending a lot more time on
this.  I'd be happy to explain my wording choices, and equally happy
to have the pull request immediately rejected.

The pull request is presently failing the check for news.  (I'm not
entirely clear on how to satisfy the requirement, or whether I could
come up with a good news entry.  I'll wait to resolve this if it
looks like the patch is going anywhere.)

There should probably also be unit tests.  But again, I'll wait to
see if this is going anywhere.

FYI, it was remarkably easy to build the docs.  But the contribution
process goes through an annoying number of corporations (github, the
contributor signature...) and login steps.  (The contributor
signature needs to clear at your end.)

Regards,

Karl
Free Software:  "You don't pay back, you pay forward."
                 -- Robert A. Heinlein
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FDC772QSZB5IE7TY4DQILHWBZS2WYKKQ/