Re: pytest segfault, not with -v
On Fri, Nov 19, 2021 at 9:49 AM Marco Sulla wrote: > I have a battery of tests done with pytest. My tests break with a > segfault if I run them normally. If I run them using pytest -v, the > segfault does not happen. > > What could cause this quantical phenomenon? > Pure python code shouldn't do this, unless you're using ctypes or similar (which arguably isn't pure python). But C extension modules sure can. See: https://stromberg.dnsalias.org/~strombrg/checking-early.html . It uses Fortran to make its point, but the same thing very much applies to C. BTW, if you're using C extension modules, the troublesome one doesn't necessarily have to be one you wrote. It could be a dependency created by someone else too. -- https://mail.python.org/mailman/listinfo/python-list
Re: pytest segfault, not with -v
Indeed I have introduced a command line parameter in my bench.py script that simply specifies the number of times the benchmarks are performed. This way I have a sort of segfault checker. But I don't bench any part of the library. I suppose I have to create a separate script that does a simple loop for all the cases, and remove the optional parameter from bench. How boring. PS: is there a way to monitor the Python consumed memory inside Python itself? In this way I could also trap memory leaks. On Sat, 20 Nov 2021 at 01:46, MRAB wrote: > > On 2021-11-19 23:44, Marco Sulla wrote: > > On Fri, 19 Nov 2021 at 20:38, MRAB wrote: > >> > >> On 2021-11-19 17:48, Marco Sulla wrote: > >> > I have a battery of tests done with pytest. My tests break with a > >> > segfault if I run them normally. If I run them using pytest -v, the > >> > segfault does not happen. > >> > > >> > What could cause this quantical phenomenon? > >> > > >> Are you testing an extension that you're compiling? That kind of problem > >> can occur if there's an uninitialised variable or incorrect reference > >> counting (Py_INCREF/Py_DECREF). > > > > Ok, I know. But why can't it be reproduced if I do pytest -v? This way > > I don't know which test fails. > > Furthermore I noticed that if I remove the __pycache__ dir of tests, > > pytest does not crash, until I re-ran it with the __pycache__ dir > > present. > > This way is very hard for me to understand what caused the segfault. > > I'm starting to think pytest is not good for testing C extensions. > > > If there are too few Py_INCREF or too many Py_DECREF, it'll free the > object too soon, and whether or when that will cause a segfault will > depend on whatever other code is running. That's the nature of the > beast: it's unpredictable! > > You could try running each of the tests in a loop to see which one > causes a segfault. (Trying several in a loop will let you narrow it down > more quickly.) > > pytest et al. are good for testing behaviour, but not for narrowing down > segfaults. > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
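One way to find out which test is crashing, without relying on -v: the standard faulthandler module prints the Python-level traceback when a segfault happens. This is only a sketch of the idea (recent pytest releases may already enable faulthandler for you):

    # Run the suite with the fault handler enabled:
    #
    #   python -X faulthandler -m pytest
    #
    # or turn it on from conftest.py:
    import faulthandler
    faulthandler.enable()   # on SIGSEGV, dump the traceback of the crashing test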
No right operator in tp_as_number?
I checked the documentation: https://docs.python.org/3/c-api/typeobj.html#number-structs and it seems that, in the Python C API, the reflected (right-hand) operators do not exist. For example, there is nb_add, which in Python is __add__, but there's no nb_right_add, which in Python would be __radd__. Am I missing something? -- https://mail.python.org/mailman/listinfo/python-list
Re: pytest segfault, not with -v
On 2021-11-20 17:40, Marco Sulla wrote: Indeed I have introduced a command line parameter in my bench.py script that simply specifies the number of times the benchmarks are performed. This way I have a sort of segfault checker. But I don't bench any part of the library. I suppose I have to create a separate script that does a simple loop for all the cases, and remove the optional parameter from bench. How boring. PS: is there a way to monitor the Python consumed memory inside Python itself? In this way I could also trap memory leaks. I'm on Windows 10, so I debug in Microsoft Visual Studio. I also have a look at the memory usage in Task Manager. If the program uses more memory when there are more iterations, then that's a sign of a memory leak. For some objects I'd look at the reference count to see if it's increasing or decreasing for each iteration when it should be constant over time. On Sat, 20 Nov 2021 at 01:46, MRAB wrote: On 2021-11-19 23:44, Marco Sulla wrote: > On Fri, 19 Nov 2021 at 20:38, MRAB wrote: >> >> On 2021-11-19 17:48, Marco Sulla wrote: >> > I have a battery of tests done with pytest. My tests break with a >> > segfault if I run them normally. If I run them using pytest -v, the >> > segfault does not happen. >> > >> > What could cause this quantical phenomenon? >> > >> Are you testing an extension that you're compiling? That kind of problem >> can occur if there's an uninitialised variable or incorrect reference >> counting (Py_INCREF/Py_DECREF). > > Ok, I know. But why can't it be reproduced if I do pytest -v? This way > I don't know which test fails. > Furthermore I noticed that if I remove the __pycache__ dir of tests, > pytest does not crash, until I re-ran it with the __pycache__ dir > present. > This way is very hard for me to understand what caused the segfault. > I'm starting to think pytest is not good for testing C extensions. > If there are too few Py_INCREF or too many Py_DECREF, it'll free the object too soon, and whether or when that will cause a segfault will depend on whatever other code is running. That's the nature of the beast: it's unpredictable! You could try running each of the tests in a loop to see which one causes a segfault. (Trying several in a loop will let you narrow it down more quickly.) pytest et al. are good for testing behaviour, but not for narrowing down segfaults. -- https://mail.python.org/mailman/listinfo/python-list
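To watch a reference count over iterations, sys.getrefcount can be printed inside the loop MRAB describes. A minimal sketch (the list is just a stand-in for the extension object under test):

    import sys

    obj = []                              # stand-in for the object being exercised
    for i in range(5):
        obj.append(i)                     # stand-in for the operation under test
        print(i, sys.getrefcount(obj))    # should stay constant over time

Note that sys.getrefcount itself holds one extra temporary reference, so the absolute number matters less than whether it drifts.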
Re: pytest segfault, not with -v
I know how to check the refcounts, but I don't know how to check the memory usage, since it's not a program, it's a simple library. Is there not a way to check inside Python the memory usage? I have to use a bash script (I'm on Linux)? On Sat, 20 Nov 2021 at 19:00, MRAB wrote: > > On 2021-11-20 17:40, Marco Sulla wrote: > > Indeed I have introduced a command line parameter in my bench.py > > script that simply specifies the number of times the benchmarks are > > performed. This way I have a sort of segfault checker. > > > > But I don't bench any part of the library. I suppose I have to create > > a separate script that does a simple loop for all the cases, and > > remove the optional parameter from bench. How boring. > > PS: is there a way to monitor the Python consumed memory inside Python > > itself? In this way I could also trap memory leaks. > > > I'm on Windows 10, so I debug in Microsoft Visual Studio. I also have a > look at the memory usage in Task Manager. If the program uses more > memory when there are more iterations, then that's a sign of a memory > leak. For some objects I'd look at the reference count to see if it's > increasing or decreasing for each iteration when it should be constant > over time. > > > On Sat, 20 Nov 2021 at 01:46, MRAB wrote: > >> > >> On 2021-11-19 23:44, Marco Sulla wrote: > >> > On Fri, 19 Nov 2021 at 20:38, MRAB wrote: > >> >> > >> >> On 2021-11-19 17:48, Marco Sulla wrote: > >> >> > I have a battery of tests done with pytest. My tests break with a > >> >> > segfault if I run them normally. If I run them using pytest -v, the > >> >> > segfault does not happen. > >> >> > > >> >> > What could cause this quantical phenomenon? > >> >> > > >> >> Are you testing an extension that you're compiling? That kind of problem > >> >> can occur if there's an uninitialised variable or incorrect reference > >> >> counting (Py_INCREF/Py_DECREF). > >> > > >> > Ok, I know. But why can't it be reproduced if I do pytest -v? This way > >> > I don't know which test fails. > >> > Furthermore I noticed that if I remove the __pycache__ dir of tests, > >> > pytest does not crash, until I re-ran it with the __pycache__ dir > >> > present. > >> > This way is very hard for me to understand what caused the segfault. > >> > I'm starting to think pytest is not good for testing C extensions. > >> > > >> If there are too few Py_INCREF or too many Py_DECREF, it'll free the > >> object too soon, and whether or when that will cause a segfault will > >> depend on whatever other code is running. That's the nature of the > >> beast: it's unpredictable! > >> > >> You could try running each of the tests in a loop to see which one > >> causes a segfault. (Trying several in a loop will let you narrow it down > >> more quickly.) > >> > >> pytest et al. are good for testing behaviour, but not for narrowing down > >> segfaults. > > > -- > https://mail.python.org/mailman/listinfo/python-list -- https://mail.python.org/mailman/listinfo/python-list
Re: No right operator in tp_as_number?
On 2021-11-20 17:45, Marco Sulla wrote: I checked the documentation: https://docs.python.org/3/c-api/typeobj.html#number-structs and it seems that, in the Python C API, the right operators do not exist. For example, there is nb_add, that in Python is __add__, but there's no nb_right_add, that in Python is __radd__ Am I missing something? A quick Google came up with this: Python's __radd__ doesn't work for C-defined types https://stackoverflow.com/questions/18794169/pythons-radd-doesnt-work-for-c-defined-types It's about Python 2.7, but the principle is the same. -- https://mail.python.org/mailman/listinfo/python-list
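For what it's worth, here is a rough Python-level analogue of the rule described there: a single nb_add slot serves both the __add__ and the __radd__ case, and the interpreter passes the operands in their original left/right order either way, so the slot has to check both sides itself. The Metres class is purely illustrative:

    class Metres:
        """Toy stand-in for a C extension type."""
        def __init__(self, value):
            self.value = float(value)

        @staticmethod
        def _nb_add(left, right):
            # Plays the role of the nb_add slot: operands arrive in their
            # original order whether we are the left or the right operand.
            if isinstance(left, Metres) and isinstance(right, (int, float)):
                return Metres(left.value + right)
            if isinstance(right, Metres) and isinstance(left, (int, float)):
                return Metres(left + right.value)      # the "__radd__" case
            return NotImplemented

        def __add__(self, other):
            return Metres._nb_add(self, other)

        def __radd__(self, other):
            return Metres._nb_add(other, self)         # restore original order

    print((Metres(2) + 3).value)   # 5.0
    print((3 + Metres(2)).value)   # 5.0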
Re: pytest segfault, not with -v
On Sat, Nov 20, 2021 at 10:09 AM Marco Sulla wrote: > I know how to check the refcounts, but I don't know how to check the > memory usage, since it's not a program, it's a simple library. Is > there not a way to check inside Python the memory usage? I have to use > a bash script (I'm on Linux)? >

    ps auxww

...can show you how much memory is in use for the entire process. It's commonly combined with grep, like:

    ps auxww | head -1
    ps auxww | grep my-program-name

Have a look at the %MEM, VSZ and RSS columns. But being out of memory doesn't necessarily lead to a segfault - it can (EG if a malloc failed, and some C programmer neglected to do decent error checking), but an OOM kill is more likely. -- https://mail.python.org/mailman/listinfo/python-list
Re: pytest segfault, not with -v
On Sat, Nov 20, 2021 at 10:59 AM Dan Stromberg wrote: > > > On Sat, Nov 20, 2021 at 10:09 AM Marco Sulla > wrote: > >> I know how to check the refcounts, but I don't know how to check the >> memory usage, since it's not a program, it's a simple library. Is >> there not a way to check inside Python the memory usage? I have to use >> a bash script (I'm on Linux)? >> > > ps auxww > ...can show you how much memory is in use for the entire process. > > It's commonly combined with grep, like: > ps auxww | head -1 > ps auxww | grep my-program-name > > Have a look at the %MEM, VSZ and RSS columns. > > But being out of memory doesn't necessarily lead to a segfault - it can > (EG if a malloc failed, and some C programmer neglected to do decent error > checking), but an OOM kill is more likely. > The above can be used to detect a leak in the _process_. Once it's been established (if it's established) that the process is getting oversized, you can sometimes see where the memory is going with: https://www.fugue.co/blog/diagnosing-and-fixing-memory-leaks-in-python.html But again, a memory leak isn't necessarily going to lead to a segfault. -- https://mail.python.org/mailman/listinfo/python-list
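For the "inside Python itself" part of the question, the standard tracemalloc module (Python 3.4+) can compare allocation snapshots. A minimal sketch (the list comprehension is just a placeholder for the code under test):

    import tracemalloc

    tracemalloc.start()
    snapshot1 = tracemalloc.take_snapshot()

    data = [list(range(1000)) for _ in range(100)]   # code under test goes here

    snapshot2 = tracemalloc.take_snapshot()
    for stat in snapshot2.compare_to(snapshot1, 'lineno')[:5]:
        print(stat)                                  # top allocation growth by line

Repeating the snapshot/compare cycle across iterations of a test loop shows whether allocations keep growing, which is the signature of a leak.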
Re: getting source code line of error?
Stefan Ram wrote: > [email protected] (Stefan Ram) writes: > >except Exception as inst: > >print( traceback.format_exc() ) > > More to the point of getting the line number: As I wrote in my initial posting: I already have the line number. I am looking for the source code line! So far I use:

    m = re.search(r'\n\s*(.+)\n.*\n$', traceback.format_exc())
    if m:
        print('%s %s' % (prefix, m.group(1)))

-- Ullrich Horlacher  Server und Virtualisierung  Rechenzentrum TIK  Universitaet Stuttgart
E-Mail: [email protected]  Allmandring 30a  Tel: ++49-711-68565868
70569 Stuttgart (Germany)  WWW: http://www.tik.uni-stuttgart.de/
-- https://mail.python.org/mailman/listinfo/python-list
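Incidentally, given a filename and a line number, the standard linecache module returns the source line directly, which may be simpler than parsing the formatted traceback. A sketch (fetching line 3 of the running script is an arbitrary example):

    import linecache

    # linecache.getline returns '' rather than raising if the line is absent.
    line = linecache.getline(__file__, 3)
    print(line.rstrip())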
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
Chris Angelico writes: > On Sat, Nov 20, 2021 at 3:41 PM Ben Bacarisse wrote: >> >> Chris Angelico writes: >> >> > On Sat, Nov 20, 2021 at 12:43 PM Ben Bacarisse >> > wrote: >> >> >> >> Chris Angelico writes: >> >> >> >> > On Sat, Nov 20, 2021 at 9:07 AM Ben Bacarisse >> >> > wrote: >> >> >> >> >> >> Chris Angelico writes: >> >> >> >> >> >> > On Sat, Nov 20, 2021 at 5:08 AM ast wrote: >> >> >> >> >> >> >> >>> 0.3 + 0.3 + 0.3 == 0.9 >> >> >> >> False >> >> >> > >> >> >> > That's because 0.3 is not 3/10. It's not because floats are >> >> >> > "unreliable" or "inaccurate". It's because the ones you're entering >> >> >> > are not what you think they are. >> >> >> > >> >> >> > When will people understand this? >> >> >> > >> >> >> > (Probably never. Sigh.) >> >> >> >> >> >> Most people understand what's going on when it's explained to them. >> >> >> And >> >> >> I think that being initially baffled is not unreasonable. After all, >> >> >> almost everyone comes to computers after learning that 3/10 can be >> >> >> written as 0.3. And Python "plays along" with the fiction to some >> >> >> extent. 0.3 prints as 0.3, 3/10 prints as 0.3 and 0.3 == 3/10 is True. >> >> > >> >> > In grade school, we learn that not everything can be written that way, >> >> > and 1/3 isn't actually equal to 0.33. >> >> >> >> Yes. We learn early on that 0.33 means 33/100. >> >> We don't learn that 0.33 is a special notation for machines that >> >> have something called "binary floating point hardware" that does not >> >> mean 33/100. That has to be learned later. And every >> >> generation has to learn it afresh. >> > >> > But you learn that it isn't the same as 1/3. That's my point. You >> > already understand that it is *impossible* to write out 1/3 in >> > decimal. Is it such a stretch to discover that you cannot write 3/10 >> > in binary? >> > >> > Every generation has to learn about repeating fractions, but most of >> > us learn them in grade school. Every generation learns that computers >> > talk in binary. Yet, putting those two concepts together seems beyond >> > many people, to the point that they feel that floating point can't be >> > trusted. >> >> Binary is a bit of a red herring here. It's the floating point format >> that needs to be understood. Three tenths can be represented in many >> binary formats, and even decimal floating point will have some surprises >> for the novice. > > Not completely a red herring; binary floating-point as used in Python > (IEEE double-precision) is defined as a binary mantissa and a scale, > just as "blackboard arithmetic" is generally defined as a decimal > mantissa and a scale. (At least, I don't think I've ever seen anyone > doing arithmetic on a blackboard in hex or octal.) You seem to be agreeing with me. It's the floating point part that is the issue, not the base itself. >> >> Yes, agreed, but I was not commenting on the odd (and incorrect) view >> >> that floating point operations are not reliable and well-defined, but on >> >> the reasonable assumption that a clever programming language might take >> >> 0.3 to mean what I was taught it meant in grade school. >> > >> > It does mean exactly what it meant in grade school, just as 1/3 means >> > exactly what it meant in grade school. Now try to represent 1/3 on a >> > blackboard, as a decimal fraction. If that's impossible, does it mean >> > that 1/3 doesn't mean 1/3, or that 1/3 can't be represented? >> >> As you know, it is possible, but let's say we outlaw any finite notation >> for repeated digits... 
Why should I convert 1/3 to this particular >> apparently unsuitable representation? I will write 1/3 and manipulate >> that number using factional notation. > > If you want that, the fractions module is there for you. Yes, I know. The only point of disagreement (as far as I can see) is that literals like 0.3 appear to be confusing for beginners. You think they should know that "binary" (which may be all they know about computers and numbers) means fixed-width binary floating point (or at least might imply a format that can't represent three tenths), where I think it's not unreasonable for them to suppose that 0.3 is manipulated as the rational number it so clearly is. > And again, > grade school, we learned about ratios as well as decimals (or vulgar > fractions and decimal fractions). They have different tradeoffs. For > instance, I learned pi as both 22/7 and 3.14, because sometimes it'd > be convenient to use the rational form and other times the decimal. > >> The novice programmer might similarly expect that when they write 0.3, >> the program will manipulate that number as the faction it clearly is. >> They may well be surprised by the fact that it must get put into a >> format that can't represent what those three characters mean, just as I >> would be surprised if you insisted I write 1/3 as a finite decimal (with >> no repeat notation). > Excep
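As a concrete illustration of the two readings of 0.3, the fractions module keeps it as the exact rational 3/10, while the float literal does not survive the repeated-addition test:

    from fractions import Fraction

    x = Fraction(3, 10)
    print(x + x + x == Fraction(9, 10))   # True, exactly
    print(0.3 + 0.3 + 0.3 == 0.9)         # False with binary floats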
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
This discussion gets tiresome for some.

Mathematics is a pristine world that is NOT the real world. It handles near-infinities fairly gracefully but many things in the real world break down because our reality is not infinitely divisible and some parts are neither contiguous nor fixed but in some sense wavy and probabilistic or worse.

So in any computer, or computer language, we have realities to deal with when someone asks for say the square root of 2 or transcendental numbers like pi or e or things like sin(x), as often they are numbers which in decimal require an infinite number of digits and in many cases do not repeat. Something as simple as the fractions for 1/7, in decimal, has an interesting repeating pattern but is otherwise infinite.

    .142857142857142857 ... ->> 1/7
    .285714285714285714 ... ->> 2/7
    .428571 ...
    .571428 ...
    .714285 ...
    .857142 ...

No matter how many digits you set aside, you cannot capture such numbers exactly IN BASE 10. You may be able to capture some such things in another base but then yet others cannot be seen in various other bases. I suspect someone has considered a data type that stores results in arbitrary bases and delays evaluation as late as possible, but even those cannot handle many numbers.

So the reality is that most computer programming is ultimately BINARY as in BASE 2. At some level almost anything is rounded and imprecise. About all we want to guarantee is that any rounding or truncation done is as consistent as possible so every time you ask for pi or the square root of 2, you get the same result stored as bits. BUT if you ask a slightly different question, why expect the same results? sqrt(2) operates on the number 2. But sqrt(6*(1/3)) first evaluates 1/3 and stores it as bits, then multiplies it by the bit representation of 6 and stores a result which then is handed to sqrt(), and if the bits are not identical, there is no guarantee that the result is identical.

I will say this. Python has perhaps an improved handling of large integers. Many languages have an assortment of integer sizes you can use such as 16 bits or 32 or 64 and possibly many others, including using 8 or even 1 bit for limited cases. But for larger numbers, there is a problem where the result overflows what can be shown in that many bits and the result either is seen as an error or, worse, as a smaller number where some of the overflow bits are thrown away. Python has indefinite-length integers that work fine. But if I take a real number with the same value and do a similar operation, I get what I consider a truncated result:

>>> 256**40
2135987035920910082395021706169552114602704522356652769947041607822219725780640550022962086936576
>>> 256.0**40
2.13598703592091e+96

That is because Python has not chosen to implement a default floating point method that allows larger storage formats that could preserve more digits. Could we design a more flexible storage form? I suspect we could BUT it would not solve certain problems. I mean, consider these two squarings:

>>> .123456789123456789 * .123456789123456789
0.015241578780673677
>>> 123456789123456789 * 123456789123456789
15241578780673678515622620750190521

Clearly a fuller answer to the first part, based on the second, is .015241578780673678515622620750190521

So one way to implement such extended functionality might be to have an object that has a storage of the decimal part of something as an extended integer variation along with storage of other parts like the exponent.
SOME operations would then use the integer representation and then be converted back as needed. But such an item would not conform to existing standards and would not trivially be integrated everywhere a normal floating point is expected and thus may be truncated in many cases or have to be converted before use. But even such an object faces a serious problem as asking for a fraction like 1/7 might lead to an infinite regress as the computer keeps lengthening the data representation indefinitely. It has to be terminated eventually, and some of the examples shown, where the whole does not seem to be the same when viewed several ways, would still show the anomalies some invoke.

Do note pure Mathematics is just as confusing at times. The number .999... (where the dot-dot-dot notation means "go on forever") is mathematically equivalent to the number 1, as is any infinite series that asymptotically approaches 1, as in

    1/2 + 1/4 + 1/8 + ... + 1/(2**N) + ...

It is not seen by many students how continually appending a 9 can ever be the same as a number like 1.0 since every single digit is always not a match. But the mathematical theorems about limits are now well understood and in the limit as N approaches infinity, the two come to mean the same thing.

Python is a tool. More specifically, it is a changing platform that hosts many additional tools. For the moment the tools are built on bits which are both very precise but also cannot finitely represent everythin
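In fairness, the standard decimal module already provides much of that "more flexible storage form": its precision is user-configurable, so both examples above can be carried to full width. A sketch (prec=100 is an arbitrary choice):

    from decimal import Decimal, getcontext

    getcontext().prec = 100
    print(Decimal(256) ** 40)                      # all 97 digits, exact
    print(Decimal('0.3') * 3 == Decimal('0.9'))    # True: no binary rounding

It still cannot terminate 1/7, of course; a repeating fraction is infinite in base 10 no matter how large prec is made.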
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 6:51 AM Ben Bacarisse wrote: > > Chris Angelico writes: > > > On Sat, Nov 20, 2021 at 3:41 PM Ben Bacarisse wrote: > >> > >> Chris Angelico writes: > >> > >> > On Sat, Nov 20, 2021 at 12:43 PM Ben Bacarisse > >> > wrote: > >> >> > >> >> Chris Angelico writes: > >> >> > >> >> > On Sat, Nov 20, 2021 at 9:07 AM Ben Bacarisse > >> >> > wrote: > >> >> >> > >> >> >> Chris Angelico writes: > >> >> >> > >> >> >> > On Sat, Nov 20, 2021 at 5:08 AM ast wrote: > >> >> >> > >> >> >> >> >>> 0.3 + 0.3 + 0.3 == 0.9 > >> >> >> >> False > >> >> >> > > >> >> >> > That's because 0.3 is not 3/10. It's not because floats are > >> >> >> > "unreliable" or "inaccurate". It's because the ones you're entering > >> >> >> > are not what you think they are. > >> >> >> > > >> >> >> > When will people understand this? > >> >> >> > > >> >> >> > (Probably never. Sigh.) > >> >> >> > >> >> >> Most people understand what's going on when it's explained to them. > >> >> >> And > >> >> >> I think that being initially baffled is not unreasonable. After all, > >> >> >> almost everyone comes to computers after learning that 3/10 can be > >> >> >> written as 0.3. And Python "plays along" with the fiction to some > >> >> >> extent. 0.3 prints as 0.3, 3/10 prints as 0.3 and 0.3 == 3/10 is > >> >> >> True. > >> >> > > >> >> > In grade school, we learn that not everything can be written that way, > >> >> > and 1/3 isn't actually equal to 0.33. > >> >> > >> >> Yes. We learn early on that 0.33 means 33/100. > >> >> We don't learn that 0.33 is a special notation for machines that > >> >> have something called "binary floating point hardware" that does not > >> >> mean 33/100. That has to be learned later. And every > >> >> generation has to learn it afresh. > >> > > >> > But you learn that it isn't the same as 1/3. That's my point. You > >> > already understand that it is *impossible* to write out 1/3 in > >> > decimal. Is it such a stretch to discover that you cannot write 3/10 > >> > in binary? > >> > > >> > Every generation has to learn about repeating fractions, but most of > >> > us learn them in grade school. Every generation learns that computers > >> > talk in binary. Yet, putting those two concepts together seems beyond > >> > many people, to the point that they feel that floating point can't be > >> > trusted. > >> > >> Binary is a bit of a red herring here. It's the floating point format > >> that needs to be understood. Three tenths can be represented in many > >> binary formats, and even decimal floating point will have some surprises > >> for the novice. > > > > Not completely a red herring; binary floating-point as used in Python > > (IEEE double-precision) is defined as a binary mantissa and a scale, > > just as "blackboard arithmetic" is generally defined as a decimal > > mantissa and a scale. (At least, I don't think I've ever seen anyone > > doing arithmetic on a blackboard in hex or octal.) > > You seem to be agreeing with me. It's the floating point part that is > the issue, not the base itself. Mostly, but all the problems come from people expecting decimal floats when they're using binary floats. > >> >> Yes, agreed, but I was not commenting on the odd (and incorrect) view > >> >> that floating point operations are not reliable and well-defined, but on > >> >> the reasonable assumption that a clever programming language might take > >> >> 0.3 to mean what I was taught it meant in grade school. 
> >> > > >> > It does mean exactly what it meant in grade school, just as 1/3 means > >> > exactly what it meant in grade school. Now try to represent 1/3 on a > >> > blackboard, as a decimal fraction. If that's impossible, does it mean > >> > that 1/3 doesn't mean 1/3, or that 1/3 can't be represented? > >> > >> As you know, it is possible, but let's say we outlaw any finite notation > >> for repeated digits... Why should I convert 1/3 to this particular > >> apparently unsuitable representation? I will write 1/3 and manipulate > >> that number using factional notation. > > > > If you want that, the fractions module is there for you. > > Yes, I know. The only point of disagreement (as far as can see) is > that literals like 0.3 appears to be confusing for beginners. You think > they should know that "binary" (which may be all they know about > computers and numbers) means fixed-width binary floating point (or at > least might imply a format that can't represent three tenths), where I > think it's not unreasonable for them to suppose that 0.3 is manipulated > as the rational number it so clearly is. Rationals are mostly irrelevant. We don't use int/int for most purposes. When you're comparing number systems between the way people write them and the way computers do, the difference isn't "0.3" and "3/10". If people are prepared to switch their thinking to rationals instead of decimals, then sure, the computer can represent those precise
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 8:32 AM Avi Gross via Python-list wrote: > > This discussion gets tiresome for some. > > Mathematics is a pristine world that is NOT the real world. It handles > near-infinities fairly gracefully but many things in the real world break > down because our reality is not infinitely divisible and some parts are > neither contiguous nor fixed but in some sense wavy and probabilistic or > worse.

But the purity of mathematics isn't the problem. The problem is people's expectations around computers. (The problem is ALWAYS people's expectations.)

> So in any computer, or computer language, we have realities to deal with > when someone asks for say the square root of 2 or transcendental > numbers like pi or e or things like sin(x) as often they are numbers > which in decimal require an infinite number of digits and in many cases do > not repeat. Something as simple as the fractions for 1/7, in decimal, has an > interesting repeating pattern but is otherwise infinite. > > .142857142857142857 ... ->> 1/7 > .285714285714285714 ... ->> 2/7 > .428571 ... > .571428 ... > .714285 ... > .857142 ... > > No matter how many digits you set aside, you cannot capture such numbers > exactly IN BASE 10.

Right, and people understand this. Yet as soon as you switch from base 10 to base 2, it becomes impossible for people to understand that 1/5 now becomes the exact same thing: an infinitely repeating expansion for the rational number.

> You may be able to capture some such things in another base but then yet > others cannot be seen in various other bases. I suspect someone has > considered a data type that stores results in arbitrary bases and delays > evaluation as late as possible, but even those cannot handle many numbers.

More likely it would just store rationals as rationals - or, in other words, fractions.Fraction().

> So the reality is that most computer programming is ultimately BINARY as in > BASE 2. At some level almost anything is rounded and imprecise. About all we > want to guarantee is that any rounding or truncation done is as consistent > as possible so every time you ask for pi or the square root of 2, you get > the same result stored as bits. BUT if you ask a slightly different > question, why expect the same results? sqrt(2) operates on the number 2. But > sqrt(6*(1/3)) first evaluates 1/3 and stores it as bits then multiplies it > by the bit representation of 6 and stores a result which then is handed to > sqrt() and if the bits are not identical, there is no guarantee that the > result is identical.

This is what I take issue with. Binary doesn't mean "rounded and imprecise". It means "base two". People get stroppy at a computer's inability to represent 0.3 correctly, because they think that it should be perfectly obvious what that value is. Nobody's bothered by sqrt(2) not being precise, but they're very much bothered by 1/10 not "working".

> Do note pure Mathematics is just as confusing at times. The number > .999... where the dot-dot-dot notation means go on forever, is > mathematically equivalent to the number 1 as is any infinite series that > asymptotically approaches 1 as in > > 1/2 + 1/4 + 1/8 + ... + 1/(2**N) + ... > > It is not seen by many students how continually appending a 9 can ever be > the same as a number like 1.0 since every single digit is always not a > match. But the mathematical theorems about limits are now well understood > and in the limit as N approaches infinity, the two come to mean the same > thing.

Mathematics is confusing.
That's not a problem. To be quite frank, the real world is far more confusing than the pristine beauty that we have inside a computer. The problem isn't the difference between reality and mathematics, or between reality and computers, or anything like that; the problem, as always, is between people's expectations and what computers do. Tell me: if a is equal to b and b is equal to c, is a equal to c? Mathematicians say "of course it is". Engineers say "there's no way you can rely on that". Computer programmers side with whoever makes most sense right this instant. > So, what should be stressed, and often is, is to use tools available that > let you compare numbers for being nearly equal. No. No no no no no. You don't need to use a "nearly equal" comparison just because floats are "inaccurate". It isn't like that. It's this exact misinformation that I am trying to fight, because floats are NOT inaccurate. They're just in binary, same as everything that computers do. > I note how unamused I was when making a small table in EXCEL (Note, not > Python) of credit card numbers and balances when I saw the darn credit card > numbers were too long and a number like: > > 4195032150199578 > > was displayed by EXCEL as: > > 4195032150199570 > > It looks like I just missed having significant stored digits and EXCEL > reconstructed it by filling in a zero for the missing extra. The problem is > I had to check balanc
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 2021-11-20, Chris Angelico wrote: > But you learn that it isn't the same as 1/3. That's my point. You > already understand that it is *impossible* to write out 1/3 in > decimal. Is it such a stretch to discover that you cannot write 3/10 > in binary? For many people, it seems to be. There are plenty of people trying to write code who don't even understand the concept of different bases. I remember trying to explain the concept of CPU registers, stacks, interrupts, and binary representations to VAX/VMS FORTRAN programmers and getting absolutely nowhere. Years later, I went through the same exercise with a bunch of Windows C++ programmers, and they seemed similarly baffled. Perhaps I was just a bad teacher. -- Grant -- https://mail.python.org/mailman/listinfo/python-list
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 2021-11-20, Ben Bacarisse wrote: > You seem to be agreeing with me. It's the floating point part that is > the issue, not the base itself. No, it's the base. Floating point can't represent 3/10 _because_ it's base 2 floating point. Floating point in base 10 doesn't have any problem representing 3/10. -- Grant -- https://mail.python.org/mailman/listinfo/python-list
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 9:22 AM Grant Edwards wrote: > > On 2021-11-20, Chris Angelico wrote: > > > But you learn that it isn't the same as 1/3. That's my point. You > > already understand that it is *impossible* to write out 1/3 in > > decimal. Is it such a stretch to discover that you cannot write 3/10 > > in binary? > > For many people, it seems to be. > > There are plenty of people trying to write code who don't even under > the concept of different bases. > > I remember trying to explain the concept of CPU registers, stacks, > interrupts, and binary representations to VAX/VMS FORTRAN programmers > and getting absolutely nowhere. > > Years later, I went through the same exercise with a bunch of Windows > C++ programmers, and they seemed similarly baffled. > > Perhaps I was just a bad teacher. > And to some extent, that's not really surprising; not everyone can think the way other people do, and not everyone can think the way computers do. But it seems that, in this one specific case, there's a massive tendency to (a) misunderstand, and then (b) belligerently assume that the computer acts the way they want it to act. And then sometimes (c) get really annoyed at the computer for not being a person, and start the cargo cult practice of "always use a nearly-equal function instead of testing for equality", which we've seen in this exact thread. That's what I take issue with: the smug "0.1 + 0.2 != 0.3, therefore computers are wrong" people, and the extremely unhelpful "never use == with floats" people. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
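The distinction being drawn here fits in three lines: sums of floats that are exact binary fractions compare exactly, with no tolerance needed; the surprises are confined to values that are not exact binary fractions:

    print(0.5 + 0.25 == 0.75)   # True: all three are exact binary fractions
    print(0.1 + 0.2 == 0.3)     # False: none of these is
    print(0.1 + 0.2)            # 0.30000000000000004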
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
Chris, I generally agree with your comments albeit I might take a different slant. What I meant is that people who learn mathematics (as I and many here obviously did) can come away with idealized ideas that they then expect to be replicable everywhere. But there are grey lines along the way where some mathematical proofs do weird things like IGNORE parts of a calculation by suggesting they are going to zero much faster than other parts and then wave a mathematical wand about what happens when they approach a limit like zero and voila, we just "proved" that the derivative of X**2 is 2*X or the more general derivative of A*(X**N) is N*A*(X**(N-1)) and then extend that to N being negative or fractional or a transcendental number and beyond.

Computers generally use finite methods, sometimes too finite. Yes, the problem is not Mathematics as a field. It is how humans often generalize or analogize from one area into something a bit different. I do not agree with any suggestion that a series of bits that encodes a result that is rounded or truncated is CORRECT. A representation of 0.3 in a binary version of some floating point format is not technically correct. Storing it as 3/10 and carefully later multiplying it by 20 and then carefully canceling part will result in exactly 6. While storing it digitally and then multiplying it in registers or whatever by 20 may get a result slightly different than the storage representation of 6.00... and that is a fact and risk we generally are willing to take.

But consider a different example. If I have a filesystem or URL or anything that does not care about whether parts are in upper or lower case, then "filename" and "FILENAME" and many variations like "fIlEnAmE" are all assumed to mean the same thing. A program may even simply store all of them in the same way as all uppercase. But when you ask to compare two versions with a function where case matters, they all test as unequal! So there are ways to ask for a comparison that is approximately equal given the constraints that case does not matter:

>>> alpha="Hello"
>>> beta="hELLO"
>>> alpha == beta
False
>>> alpha.lower() == beta.lower()
True

I see no reason why a comparison cannot be done like this in cases where you are concerned with small errors creeping in:

>>> from math import isclose
>>> isclose(1, .999999999999)
True
>>> isclose(1, .9999999999)
True
>>> isclose(1, .999)
False

I will agree with you that binary is not any more imprecise than base 10. Computer hardware is much easier to design though that works with binary. So floats by themselves are not inaccurate but realistically the results of operations ARE. I mean if I ask a long number to be stored that does not fully fit, it is often silently truncated and what the storage location now represents accurately is not my number but the shorter version that is at the limit of tolerance.

But consider another analogy often encountered in mathematics. If I measure several numbers in the real world such as weight and height and temperature and so on, some are considered accurate only to a limited number of digits. Your weight on a standard digital scale may well be 189.8 but if I add a feather or subtract one, the reading may well shift one unit up or down. Heck, the same person measured just minutes later may shift. If I used a deluxe scale that measures to more decimal places, it may get hard to get the exact same number twice in a row as just taking a deeper breath may make a change.
So what happens if I measure a box in three dimensions to the nearest .1 inch and decide it is 10.1 by 20.2 by 20.3 inches? What is the volume, ignoring pesky details about the width of the cardboard or whatever?

A straightforward multiplication yields 4141.606 cubic inches. You may have been told to round that to something like 4141.6 because the potential error in each measure cannot result in more precision. In reality, you might even calculate two sets of numbers assuming the true width may have been a tad more or less and come up with the volume being BETWEEN a somewhat smaller number and a somewhat larger number.

I claim a similar issue plagues using a computer to deal with stored numbers, perhaps not stored 100% perfectly as discussed, and doing calculations. The result often comes out more precisely than warranted. I suspect there are modules out there that might do multi-step calculations where at each step, numbers generated with extra precision are throttled back so the extra precision is set to zeroes after rounding to avoid the small increments adding up. Others may just do the calculations and keep track and remove extra precision at the end. And again, this is not because the implementation of numbers is in any way wrong but because a real-world situation requires the humans to sort of dial back how they are used and not over-reach.

So comparing for close-enough equality is not necessarily a reflection on floats but on the design not acco
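That box example also shows what an explicit tolerance buys: math.isclose accepts rel_tol/abs_tol arguments, so "how close counts as equal" can be stated to match the accuracy of the measurements rather than left implicit:

    from math import isclose

    print(isclose(4141.606, 4141.6, rel_tol=1e-4))   # True: equal to ~4 digits
    print(isclose(4141.606, 4141.6, rel_tol=1e-7))   # False at a tighter tolerance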
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 20/11/2021 22:59, Avi Gross via Python-list wrote: there are grey lines along the way where some mathematical proofs do weird things like IGNORE parts of a calculation by suggesting they are going to zero much faster than other parts and then wave a mathematical wand about what happens when they approach a limit like zero and voila, we just "proved" that the derivative of X**2 is 2*X or the more general derivative of A*(X**N) is N*A*(X**(N-1)) and then extend that to N being negative or fractional or a transcendental number and beyond. You seem to be maligning mathematicians. What you say was true in the time of Newton, Leibniz and Bishop Berkeley, but analysis was made completely rigorous by the efforts of Weierstrass and others. There are no "grey lines". Proofs do not "suggest", they PROVE (else they are not proofs, they are plain wrong). It is not the fault of mathematicians (or mathematics) if some people produce sloppy hand-wavy "proofs" as justification for their conclusions. I am absolutely sure you know all this, but your post does not read as if you do. And it could give a mistaken impression to a non-mathematician. I think we have had enough denigration of experts. Best Rob Cliffe -- https://mail.python.org/mailman/listinfo/python-list
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 10:01 AM Avi Gross via Python-list wrote: > Computers generally use finite methods, sometimes too finite. Yes, the > problem is not Mathematics as a field. It is how humans often generalize or > analogize from one area into something a bit different. I do not agree with > any suggestion that a series of bits that encodes a result that is rounded > or truncated is CORRECT. A representation of 0.3 in a binary version of some > floating point format is not technically correct. Storing it as 3/10 and > carefully later multiplying it by 20 and then carefully canceling part will > result in exactly 6. While storing it digitally and then multiplying it in > registers or whatever by 20 may get a result slightly different than the > storage representation of 6.00... and that is a fact and risk we > generally are willing to take.

Do you accept that storing the floating point value 1/4, then multiplying by 20, will give precisely 5? Because that is *guaranteed*. You don't have to expect a result "slightly different" from 5, it will be absolutely exactly five:

>>> (1/4) * 20 == 5.0
True

This is what I'm talking about. Some numbers can be represented perfectly, others can't. If you try to represent the square root of two as a decimal number, then multiply it by itself, you won't get back precisely 2, because you can't have written out the *exact* square root of two. But you most certainly CAN write "1.875" on a piece of paper, and it really truly does exactly mean fifteen eighths. And you can write that number as a binary float, too, and it'll mean the exact same value.

> But consider a different example. If I have a filesystem or URL or anything > that does not care about whether parts are in upper or lower case, then > "filename" and "FILENAME" and many variations like "fIlEnAmE" are all > assumed to mean the same thing. A program may even simply store all of them > in the same way as all uppercase. But when you ask to compare two versions > with a function where case matters, they all test as unequal! So there are > ways to ask for a comparison that is approximately equal given the > constraints that case does not matter:

A URL has distinct parts to it: the domain has some precise folding done (most notably case folding), the path does not, and you can consider "http://example.com:80/foo" to be the same as "http://example.com/foo" because 80 is the default port.

> >>> alpha="Hello" > >>> beta="hELLO" > >>> alpha == beta > False > >>> alpha.lower() == beta.lower() > True

That's a terrible way to compare URLs, because it's both too sloppy AND too strict at the same time. But if you have a URL representation tool, it should be able to consider two things equal. Floats are representations of numbers that can be compared for equality if they truly represent the same number. The value 3/6 is precisely equal to the value 7/14:

>>> 3/6 == 7/14
True

You don't need an "approximately equal" function here. They are the same value. They are equal.

> I see no reason why a comparison cannot be done like this in cases where you are > concerned with small errors creeping in: > > >>> from math import isclose > >>> isclose(1, .999999999999) > True > >>> isclose(1, .9999999999) > True > >>> isclose(1, .999) > False

This is exactly the problem though: HOW close counts as equal? The only way to answer that question is to know the accuracy of your inputs, and the operations done.

> So floats by themselves are not inaccurate but realistically the results of > operations ARE.
I mean if I ask a long number to be stored that does not > fully fit, it is often silently truncated and what the storage location now > represent accurately is not my number but the shorter version that is at the > limit of tolerance. But consider another analogy often encountered in > mathematics. Not true. Operations are often perfectly accurate. > If I measure several numbers in the real world such as weight and height and > temperature and so on, some are considered accurate only to a limited number > of digits. Your weight on a standard digital scale may well be 189.8 but if > I add a feather or subtract one, the reading may well shift to one unit up > or down. Heck, the same person measured just minutes later may shift. If I > used a deluxe scale that measures to more decimal places, it may get hard to > get the exact same number twice in a row as just taking a deeper breath may > make a change. > > So what happens if I measure a box in three dimensions to the nearest .1 > inch and decide it is 10.1 by 20.2 by 30.3 inches? What is the volume, > ignoring pesky details about the width of the cardboard or whatever? > > A straightforward multiplication yields 4141.606 cubic inches. You may have > been told to round that to something like 4141.6 because the potential error > in each measure cannot result in more precision. In reality, you might even > calculate two sets of numbers assuming the true width may have
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
Grant Edwards writes: > On 2021-11-20, Ben Bacarisse wrote: > >> You seem to be agreeing with me. It's the floating point part that is >> the issue, not the base itself. > > No, it's the base. Floating point can't represent 3/10 _because_ it's > base 2 floating point. Floating point in base 10 doesn't have any > problem representing 3/10. Every base has the same problem for some numbers. It's the floating point part that causes the problem. Binary and decimal stand out because we write a lot of decimals in source code and computers use binary, but if decimal floating point were common (as it increasingly is) different fractions would become the oft quoted "surprise" results. -- Ben. -- https://mail.python.org/mailman/listinfo/python-list
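Ben's point is easy to demonstrate with the decimal module, which is decimal floating point: the "surprise" just moves to fractions like 1/3 whose denominators don't divide a power of ten:

    from decimal import Decimal

    print(Decimal(1) / Decimal(3))            # 0.3333333333333333333333333333
    print(Decimal(1) / Decimal(3) * 3 == 1)   # False: the rounded third doesn't triple back to 1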
Re: getting source code line of error?
On 20.11.21 at 20:15, Ulli Horlacher wrote: Stefan Ram wrote: [email protected] (Stefan Ram) writes: except Exception as inst: print( traceback.format_exc() ) More to the point of getting the line number: As I wrote in my initial posting: I already have the line number. I am looking for the source code line! So far I use: m = re.search(r'\n\s*(.+)\n.*\n$',traceback.format_exc()) if m: print('%s %s' % (prefix,m.group(1))) Stefan Ram's solution missed only the line content. Here it is.

    import traceback

    try:
        1/0
    except ZeroDivisionError as exception:
        tr = traceback.TracebackException.from_exception(exception)
        x = tr.stack[0]
        print("Exception %s in line %s: %s" % (exception, x.lineno, x.line))

The traceback object does not only contain the lineno but also the content of the offending line. -- Paolo -- https://mail.python.org/mailman/listinfo/python-list
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 10:55 AM Ben Bacarisse wrote: > > Grant Edwards writes: > > > On 2021-11-20, Ben Bacarisse wrote: > > > >> You seem to be agreeing with me. It's the floating point part that is > >> the issue, not the base itself. > > > > No, it's the base. Floating point can't represent 3/10 _because_ it's > > base 2 floating point. Floating point in base 10 doesn't have any > > problem representing 3/10. > > Every base has the same problem for some numbers. It's the floating > point part that causes the problem. > > Binary and decimal stand out because we write a lot of decimals in > source code and computers use binary, but if decimal floating point were > common (as it increasingly is) different fractions would become the oft > quoted "surprise" results. > And if decimal floating point were common, other "surprise" behaviour would be cited, like how x < y and (x+y)/2 < x. ChrisA -- https://mail.python.org/mailman/listinfo/python-list
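That midpoint surprise is easy to reproduce with the decimal module by shrinking the precision until the effect is visible (prec=3 is an arbitrary, deliberately tiny choice):

    from decimal import Decimal, getcontext

    getcontext().prec = 3
    x, y = Decimal('0.996'), Decimal('0.998')
    print(x < y)             # True
    print((x + y) / 2)       # 0.995: the sum 1.994 rounds to 1.99 first
    print((x + y) / 2 < x)   # True: the "midpoint" lands below both inputs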
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
Can I suggest a way to look at it, Grant?

In base 10, we represent all numbers as the (possibly infinite) sum of ten raised to some integral power.

123 is 3 times 1 (ten to the zero power) plus 2 times 10 (ten to the one power) plus 1 times 100 (ten to the two power).

123.456 just extends this with 4 times 1/10 (ten to the minus one power) plus 5 times 1/100 (10**-2) plus 6 times 1/1000 (10**-3).

In binary, all the powers are not powers of 10 but powers of two. So IF you wrote something like 111 it means 1 times 1 plus 1 times 2 plus 1 times 4, or 7. A zero anywhere just skips a 2 to that power. If you added a decimal point to make 111.111 the latter part would be 1/2 plus 1/4 plus 1/8, or 7/8, which combined might be 7 and 7/8. So any fractions of the form something over 2**N can be made easily and almost everything else cannot be made in finite stretches. How would you make 2/3 or 3/10?

But the opposite is somewhat true. In decimal, to make the above it becomes 7.875 and to make other fractions of the kind, you need more and more digits. As it happens, all such base-2 compatible streams can be made because each is in some sense a divide by two.

    7/16 = 1/2 * .875 = .4375
    7/32 = 1/2 * .4375 = .21875

and so on. But this ability is a special-case artifact caused by a terminal digit 5 always being able to be divided in two to make a 25 that is one digit longer, and then again and again. Note 2 and 5 are factors of 10. In the more general case, this fails. In base 7, 3/7 is written easily as 0.3 but the same fraction in decimal is a repeating copy of .428571... which never terminates. A number like 3/7 + 4/49 + 5/343 generally cannot be written in base 10, but the opposite is also true: only an approximation of numbers in base 2 or base 10 can ever be written in base 7. I am, of course, talking about the part to the right of the decimal. Integers to the left can be written in any base. It is fractional parts that can end up being non-terminating.

What about pi and e and the square root of 2? I suspect all of them have an infinite sequence with no real repetition (over long enough stretches) in any base! I mean an integer base, of course. The constant e in base e is just 1.

As has been hammered home, computers have generally always dealt in one or more combined on/off or Boolean ideas so deep down they tend to have binary circuits. At one point, programmers sometimes used base 8, octal, to group three binary digits together, as in setting flags for a file's permissions: you may use 01, 02 and 04 to be OR'ed with the current value to turn on execute/write/read bits, or a combination like 7 (1+2+4) to set all of them at once. And, again, for some purposes, base 16 (hexadecimal) is often used with numerals extended to include a-f to represent a nibble or half byte, as in some programs that let you set colors or whatever. But they are just a convenience as ultimately they are used as binary for most purposes.

In high school, for a while, and just for fun, I annoyed one teacher by doing much of my math in base 32, leaving them very perplexed as to how I got the answers right. As far as I know, nobody seriously uses any bases not already a power of two even for intermediate steps, outside of some interesting stuff in number theory.

I think there have been attempts to use a decimal representation in some accounting packages or database applications that allow any decimal numbers to be faithfully represented and used in calculations.
Generally this is not a very efficient process but it can handle 0.3 albeit still have no way to deal with transcendental numbers. As such, since this is a Python Forum let me add you can get limited support for some of this using the decimal module: https://www.askpython.com/python-modules/python-decimal-module But I doubt Python can be said to do things worse than just about any other computer language when storing and using floating point. As hammered in repeatedly, it is doing whatever is allowed in binary and many things just cannot easily or at all be done in binary. Let me leave you with Egyptian mathematics. Their use of fractions, WAY BACK WHEN, only had the concept of a reciprocal of an integer. As in for any integer N, there was a fraction of 1/N. They had a concept of 1/3 but not of 2/3 or 4/9. So they added reciprocals to make any more complex fractions. To make 2/3 they added 1/2 plus 1/6 for example. Since they were not stuck with any one base, all kinds of such combined fractions could be done but of course the square root of 2 or pi were a bit beyond them and for similar reasons. https://en.wikipedia.org/wiki/Egyptian_fraction My point is there are many ways humans can choose to play with numbers and not all of them can easily do the same thing. Roman Numerals were (and remain) a horror to do much mathematics with and especially when they play games based on whether a symbol like X is to the left or right of another like C as XC is 90 and CX is 110. To do programming learn the rules that only w
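The fractions module makes it easy to check such Egyptian decompositions exactly:

    from fractions import Fraction

    print(Fraction(1, 2) + Fraction(1, 6) == Fraction(2, 3))   # True
    print(Fraction(1, 3) + Fraction(1, 9) == Fraction(4, 9))   # True: 4/9 as unit fractions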
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 11:39 AM Avi Gross via Python-list wrote: > > Can I suggest a way to look at it, Grant? > > In base 10, we represent all numbers as the (possibly infinite) sum of ten > raised to some integral power.

Not infinite. If you allow an infinite sequence of digits, you create numerous paradoxes, not to mention the need for infinite storage.

> 123 is 3 times 1 (ten to the zero power) plus > 2 times 10 (ten to the one power) plus > 1 times 100 (ten to the two power) > > 123.456 just extends this with > 4 times 1/10 (ten to the minus one power) plus > 5 times 1/100 (10**-2) plus > 6 times 1/1000 (10**-3) > > In binary, all the powers are not powers of 10 but powers of two. > > So IF you wrote something like 111 it means 1 times 1 plus 1 times 2 plus 1 > times 4, or 7. A zero anywhere just skips a 2 to that power. If you added a > decimal point to make 111.111 the latter part would be 1/2 plus 1/4 plus 1/8, > or 7/8, which combined might be 7 and 7/8. So any fractions of the form > something over 2**N can be made easily and almost everything else cannot be > made in finite stretches. How would you make 2/3 or 3/10?

Right, this is exactly how place value works.

> But the opposite is somewhat true. In decimal, to make the above it becomes > 7.875 and to make other fractions of the kind, you need more and more digits. > As it happens, all such base-2 compatible streams can be made because each is > in some sense a divide by two. > > 7/16 = 1/2 * .875 = .4375 > 7/32 = 1/2 * .4375 = .21875 > > and so on. But this ability is a special-case artifact caused by a terminal > digit 5 always being able to be divided in two to make a 25 that is one digit > longer, and then again and again. Note 2 and 5 are factors of 10. In the more > general case, this fails. In base 7, 3/7 is written easily as 0.3 but the > same fraction in decimal is a repeating copy of .428571... which never > terminates. A number like 3/7 + 4/49 + 5/343 generally cannot be written in > base 10, but the opposite is also true: only an approximation of numbers in > base 2 or base 10 can ever be written in base 7. I am, of course, talking about the > part to the right of the decimal. Integers to the left can be written in any > base. It is fractional parts that can end up being non-terminating.

If you have a number with a finite binary representation, you can guarantee that it can be represented finitely in decimal too. Infinitely repeating expansions come from denominators with a prime factor that the base lacks (which is why 1/5 repeats in base 2 but not in base 10).

> What about pi and e and the square root of 2? I suspect all of them have an > infinite sequence with no real repetition (over long enough stretches) in > any base! I mean an integer base, of course. The constant e in base e is > just 1.

More than "suspect". This has been proven. That's what irrational means. I don't think "base e" means the same thing that "base ten" does. (Normally you'd talk about a base e *logarithm*, which is a completely different concept.) But if you try to work with a transcendental base like that, it would be impossible to represent any integer finitely.

(Side point: There are other representations that have different implications about what repeats and what doesn't. For instance, the decimal expansion for a square root doesn't repeat, but the continued fraction for the same square root will. For instance, 7**0.5 is 2;1,1,1,4,1,1,1,4... with an infinitely repeating four-element unit.)
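That side point about sqrt(7) can be checked in a few lines of float arithmetic (nine terms is safely within float precision; the pattern 2; 1, 1, 1, 4 then repeats):

    import math

    x = math.sqrt(7)
    for _ in range(9):
        a = int(x)              # next continued-fraction term
        print(a, end=' ')
        x = 1 / (x - a)         # recurse into the fractional part
    print()                     # prints: 2 1 1 1 4 1 1 1 4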
> I think there have been attempts to use a decimal representation in some
> accounting packages or database applications that allow any decimal numbers
> to be faithfully represented and used in calculations. Generally this is
> not a very efficient process, but it can handle 0.3, albeit it still has no
> way to deal with transcendental numbers.

Fixed point has been around for a long time (the simplest example being "work in cents and use integers"), but actual decimal floating-point is quite unusual. Some databases support it, and REXX used that as its only numeric form, but it's not hugely popular.

> Let me leave you with Egyptian mathematics. Their use of fractions, WAY
> BACK WHEN, only had the concept of a reciprocal of an integer. As in, for
> any integer N, there was a fraction of 1/N. They had a concept of 1/3 but
> not of 2/3 or 4/9.
>
> So they added reciprocals to make any more complex fractions. To make 2/3
> they added 1/2 plus 1/6, for example.
>
> Since they were not stuck with any one base, all kinds of such combined
> fractions could be done, but of course the square root of 2 or pi were a
> bit beyond them, and for similar reasons.
>
> https://en.wikipedia.org/wiki/Egyptian_fraction

It's interesting as a curiosity, but it makes arithmetic extremely difficult.

> My point is there are many ways humans can choose to play with numbers and
> not all of them can easily do the same thing. Roman Numerals were (and
> remain) a horror to do much mathematics with and especially when they play
> games based on whether a symbol like X is
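The "work in cents" idea above needs no library at all -- a toy sketch with made-up prices:

    # Keep money as integer cents so sums and comparisons stay exact.
    price_cents = 19_99   # $19.99
    tax_cents = 1_60      # $1.60
    total_cents = price_cents + tax_cents
    assert total_cents == 21_59   # exact integer arithmetic -- no epsilon needed
    print(f"${total_cents // 100}.{total_cents % 100:02d}")   # $21.59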
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 2021-11-21, Chris Angelico wrote:
>> I think there have been attempts to use a decimal representation in some
>> accounting packages or database applications that allow any decimal
>> numbers to be faithfully represented and used in calculations. Generally
>> this is not a very efficient process, but it can handle 0.3, albeit it
>> still has no way to deal with transcendental numbers.
>
> Fixed point has been around for a long time (the simplest example being
> "work in cents and use integers"), but actual decimal floating-point is
> quite unusual. Some databases support it, and REXX used that as its only
> numeric form, but it's not hugely popular.

My recollection is that it was quite common back in the days before FP hardware was "a thing" on small computers. CP/M and DOS compilers for various languages often gave the user a choice between binary FP and decimal (BCD) FP. If you were doing accounting, you chose decimal. If you were doing science, you chose binary (better range and precision for the same number of bits of storage). Once binary FP hardware became available, decimal FP support was abandoned.

-- 
Grant
-- 
https://mail.python.org/mailman/listinfo/python-list
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
Not at all, Rob. I am not intending to demean mathematicians, as one of my degrees is in that subject and I liked it. I mean that some things in mathematics are not as intuitive to people when they first encounter them, let alone to those who never see them and then marvel at results and have expectations.

The example I gave is NOW indeed on quite firm footing, but for quite a while it was not.

What we have in this forum recently is people taking pot shots at aspects of Python where, in a similar way, they know not what is actually happening and insist it be some other way. Some people also assume that an email message works any way they want and post things to a text-only group that others cannot see or that become badly formatted, or complain when a very large attachment makes a message be rejected. They also expect SPAM checkers to be perfect and never reject valid messages, and so on.

Things are what they are, not what we wish them to be. And many kinds of pure mathematics live in a Platonic world and must be used with care. Calculus is NOT on a firm footing when any of the ideas in it are violated. A quantum-mechanical universe at a deep level does not have continuity, so continuous functions may not really exist, and there can be no such thing as an infinitesimal smaller than any epsilon, and so on. Much of what we see at that level includes things like a probabilistic view of an electron cloud forming the probability that an electron (which is not a mathematical point) is at any moment at a particular location around an atom. But some, like the p-orbital, have a sort of 3-D figure-eight shape (sort of a pair of teardrops) where there is a plane between the two halves with a mathematically zero probability of the electron ever being there. Yet quantum tunneling effects let it cross through that plane without actually ever being in the plane, because various kinds of quantum jumps in a very wiggly space-time fabric can and will happen in a way normal mathematics may not predict or allow.

Which brings me back to the Python analogy of algorithms implemented so that they gradually zoom in on an answer you might view as a local maximum or minimum. It may be that with infinite-precision calculations you might zoom in ever closer to the optimal answer, where the tangent to such a curve has slope zero. Your program would never halt, though, if the condition was that it be exactly at that point to an infinite number of decimal places. This is a place where I disagree with the idea that being near the answer (or in this case near zero) is not a good enough heuristic: there are many iterative problems (and recursive ones) where a close-enough condition is adequate. Some libraries incorporated into languages like Python use an infinite series to calculate something like sin(x) and many other such things, including potentially e and pi and various roots. Many of them can safely stop after N significant digits are locked into place, and especially when all available significant digits are locked. Running them further gains nothing much. So code like:

(previous_estimate - current_estimate) == 0

may be a bad idea compared to something like:

abs(previous_estimate - current_estimate) < epsilon

No disrespect to mathematics intended. My understanding is that mathematics can only be used validly if all underlying axioms are assumed to be true. When (as in the real world or computer programs) some axioms are violated, watch out.
Matrix multiplication does not have that symmetry, so A*B in general is not the same as B*A and, even worse, may be a matrix of a different dimension. A 4x2 matrix and a 2x4 matrix can multiply to either a 2x2 or a 4x4, for example. The violation of that rule may bother some people but is not really an issue, as any mathematics that has an axiom for, say, an abelian group simply is not expected to apply to a non-abelian case.

-----Original Message-----
From: Python-list On Behalf Of Rob Cliffe via Python-list
Sent: Saturday, November 20, 2021 6:19 PM
To: 
Subject: Re: Unexpected behaviour of math.floor, round and int functions (rounding)

On 20/11/2021 22:59, Avi Gross via Python-list wrote:
> there are grey lines along the way where some mathematical proofs do
> weird things like IGNORE parts of a calculation by suggesting they are
> going to zero much faster than other parts and then wave a
> mathematical wand about what happens when they approach a limit like
> zero and voila, we just "proved" that the derivative of X**2 is 2*X or
> the more general derivative of A*(X**N) is N*A*(X**(N-1)) and then
> extend that to N being negative or fractional or a transcendental number
> and beyond.

You seem to be maligning mathematicians. What you say was true in the time of Newton, Leibniz and Bishop Berkeley, but analysis was made completely rigorous by the efforts of Weierstrass and others. There are no "grey lines". Proofs do not "suggest",
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 12:56 PM Avi Gross via Python-list wrote:
>
> Not at all, Rob. I am not intending to demean mathematicians, as one of my
> degrees is in that subject and I liked it.
>
> [snip]
>
> So code like:
>
> (previous_estimate - current_estimate) == 0
>
> may be a bad idea compared to something like:
>
> abs(previous_estimate - current_estimate) < epsilon
>
> No disrespect to mathematics intended. My understanding is that mathematics
> can only be used validly if all underlying axioms are assumed to be true.
> When (as in the real world or computer programs) some axioms are violated,
> watch out. Matrix multiplication does not have that symmetry, so A*B in
> general is not the same as B*A and, even worse, may be a matrix of a
> different dimension. A 4x2 matrix and a 2x4 matrix can multiply to either a
> 2x2 or a 4x4, for example. The violation of that rule may bother some
> people but is not really an issue, as any mathematics that has an axiom
> for, say, an abelian group simply is not expected to apply to a non-abelian
> case.

All of this is true, but utterly irrelevant to floating-point. If your algorithm is inherently based on repeated estimates (Newton's method, for instance), then you can iterate until you're "happy enough" with the result. That's fine. But that has nothing whatsoever to do with the complaint that 0.1+0.2!=0.3 or that you should "never use == with floats" or any of those claims. It's as relevant as saying that my ruler claims to be 30cm long but is actually nearly 310mm long, and therefore the centimeter is an inherently unreliable unit and anything measured in it should be treated as an estimate.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
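To make the "iterate until happy enough" pattern concrete, here is a minimal Newton's-method sketch (illustrative only; epsilon is an arbitrary tolerance, not a recommended value):

    def newton_sqrt(x, epsilon=1e-12):
        """Approximate the square root of x > 0 by Newton's method,
        stopping when successive estimates agree to within epsilon
        instead of demanding exact equality."""
        prev, cur = 0.0, x
        while abs(prev - cur) >= epsilon:
            prev, cur = cur, (cur + x / cur) / 2
        return cur

    print(newton_sqrt(2.0))              # ~1.4142135623730951
    print(newton_sqrt(2.0) == 2 ** 0.5)  # implementation detail -- don't rely on it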
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 21/11/2021 01:02, Chris Angelico wrote:
> If you have a number with a finite binary representation, you can
> guarantee that it can be represented finitely in decimal too.
> Infinitely repeating expansions come from denominators that are
> coprime with the numeric base.

Not quite, e.g. 1/14 is a repeating decimal but 14 and 10 are not coprime. I believe it is correct to say that infinitely recurring expansions occur when the denominator is divisible by a prime that does not divide the base.

Rob Cliffe
-- 
https://mail.python.org/mailman/listinfo/python-list
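Rob's rule is mechanical enough to check in code (a quick sketch; terminates() is an illustrative name, not a library function):

    from fractions import Fraction

    def terminates(numerator, denominator, base=10):
        """True if numerator/denominator has a finite expansion in `base`,
        i.e. every prime factor of the reduced denominator divides base."""
        d = Fraction(numerator, denominator).denominator  # reduce first
        for p in range(2, d + 1):
            while d % p == 0:
                if base % p != 0:
                    return False
                d //= p
        return d == 1

    print(terminates(1, 14))         # False: 7 divides 14 but not 10
    print(terminates(7, 14))         # True: 7/14 reduces to 1/2
    print(terminates(7, 8, base=2))  # True: 0.111 in binary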
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
Chris,

You know I am going to fully agree with you that, within some bounds, any combination of numbers that can accurately be represented will continue to be adequately represented under operations like addition, subtraction and multiplication, up to the point where they overflow (or underflow) the storage mechanism. Division may be problematic, especially division by zero. But bring in any number that is not fully and accurately representable, and it can poison everything, much in the way an NA poisons any attempt to take a sum or mean. Any calculation that includes an e is an example. Of course, there is not much in computing that necessarily relies on representable numbers, especially not when the numbers are dynamically gotten, as in from a file or user, and already not quite what is representable. I can even imagine a situation where some fraction is represented in a double and then "copied" into a regular single-precision float and some of it is lost/truncated.

I get your point about URLs, but I was really focused at that point on filenames as an example on systems where they are not case sensitive. Some programming languages had a similar concept. Yes, you can have URLs with more complex comparison functions needed, including when something lengthens them or whatever. In one weird sense, as in you GET TO THE SAME PAGE, any URL that redirects you to another might be considered synonymous even if the two look nothing at all alike.

To continue, I do not mean to give the impression that comparing representable numbers with == is generally wrong. I am saying there are places where there may be good reasons for the alternative. I can imagine an algorithm that starts with representable numbers and maybe at each stage continues to generate representable numbers, such as one of the hill-climbing algorithms I am talking about. It may end up overshooting a bit past the peak and next round overshooting back to the other side and getting stuck in a loop. One way out is to keep track of past locations and abort when the cycle is seen to be repeating. Another is to leave when the result seems close enough. However, my comments about over/underflow may apply here, as enough iterations with representable numbers may at some point result in the kind of rounding error that warps the results of further calculations.

I note some of your argument is the valid difference between when your knowledge of the input numbers is uncertain and what the computer does with them. Yes, my measures of the height/width/depth may be uncertain, and it is not the fault of a Python program if it multiplies them to provide an exact answer as if in a mathematical world where numbers are normally precise. I am saying that the human using the program needs external info before they use the answer. In my example, I would note the rule that when dealing with numbers that are only significant to some number of digits, the final result should often be rounded according to those rules. So instead of printing out the volume as 6181.806, the program may call some function like round(), as in round(10.1*20.2*30.3, 1), so it displays 6181.8 instead. The Python language does what you ask and not what you do not ask.

Now a statistical program, or perhaps an AI or machine learning program I write, might actually care about the probabilistic effects.
I often create graphs that include perhaps a smoothed curve of some kind that approximates the points in the data, as well as a light gray ribbon representing error bands above and below, which suggests the line not be taken too seriously: there may be something like a 95% chance the true values are within the gray zone, and even some chance they are beyond it, in an even lighter series of gray (color is not the issue) zones representing a 1% chance or less.

Such approaches apply if the measurement errors are assumed to be as much as 0.1 inches for each measure, independently. The smallest volume would be:

(10.1 - 0.1)*(20.2 - 0.1)*(30.3 - 0.1) = 6070.2

The largest possible volume, if all my measures were off by that amount in the other direction, would be:

(10.1 + 0.1)*(20.2 + 0.1)*(30.3 + 0.1) = 6294.6

The above are truncated to one decimal place. The Python program evaluates all the above representable numbers perfectly, albeit I doubt they are all representable in binary. But for human purposes, the actual answer for a volume has some uncertainty built in to the method of measurement, and perhaps others: the inner sides of the box may not be perfectly flat, the angles things join at may not be precisely 90 degrees, and something like oranges may not fit in much greater quantity if you enlarge the box a tad, as they may not stack much better after a minor change. Python is not to blame in these cases if not programmed well enough. And I suggest the often minor errors introduced by a r
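For what it's worth, the bounds above reproduce directly (a sketch; the 0.1 tolerance is the one assumed in the text, and math.prod needs Python 3.8+):

    import math

    sides = [10.1, 20.2, 30.3]   # measured dimensions, inches
    tol = 0.1                    # assumed measurement uncertainty

    print(round(math.prod(sides), 1))                   # 6181.8  (nominal)
    print(round(math.prod(s - tol for s in sides), 1))  # 6070.2  (smallest)
    print(round(math.prod(s + tol for s in sides), 1))  # 6294.6  (largest)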
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On Sun, Nov 21, 2021 at 1:20 PM Rob Cliffe via Python-list wrote:
>
> On 21/11/2021 01:02, Chris Angelico wrote:
> > If you have a number with a finite binary representation, you can
> > guarantee that it can be represented finitely in decimal too.
> > Infinitely repeating expansions come from denominators that are
> > coprime with the numeric base.
>
> Not quite, e.g. 1/14 is a repeating decimal but 14 and 10 are not coprime.
> I believe it is correct to say that infinitely recurring expansions
> occur when the denominator is divisible by a prime that does not divide
> the base.
> Rob Cliffe

True, my bad. I can't remember if there's a term for that, but your description is correct.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list
RE: Unexpected behaviour of math.floor, round and int functions (rounding)
Sorry Chris, I was talking mathematically, where a number like pi or like 1/7 conceptually has an infinite number of digits that are added to a growing sum using ever smaller powers of 10, in the decimal case. In programming, and in the binary storage, the number of such digits is clearly limited. Is there any official limit on the maximum size of a Python integer other than available memory?

And replying sparsely: yes, pretty much nothing can be represented completely in base e other than integral multiples of e, perhaps. No other numbers, especially integers, can be made as linear combinations of e or of e raised to an integral power. Having said that, if you throw in another transcendental called pi, and expand to include the complex number i, then you can weirdly combine them another way to make -1. I am sure you have seen equations like:

e**(pi*i) + 1 = 0

By extension, you can make any integer by adding multiple such entities together.

On another point, an indefinitely repeating continued fraction is sort of similar to an indefinitely summed series. Both can exist and demonstrate a regularity even when the actual digits of the number seemingly show no patterns.

-----Original Message-----
From: Python-list On Behalf Of Chris Angelico
Sent: Saturday, November 20, 2021 8:03 PM
To: [email protected]
Subject: Re: Unexpected behaviour of math.floor, round and int functions (rounding)

[snip]
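Both of those points are quick to poke at in Python (a sketch using the stdlib cmath module; and on the integer question: CPython ints are arbitrary-precision, bounded only by available memory):

    import cmath

    # Euler's identity holds up to floating-point error; the tiny imaginary
    # residue is itself the kind of rounding artifact discussed in this thread:
    print(cmath.exp(1j * cmath.pi) + 1)   # 1.2246467991473532e-16j, not exactly 0

    # Python integers have no fixed maximum size:
    print(len(str(10 ** 10000)))          # 10001 digits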
Re: Unexpected behaviour of math.floor, round and int functions (rounding)
On 21/11/21 2:18 pm, Grant Edwards wrote:
> My recollection is that it was quite common back in the days before FP
> hardware was "a thing" on small computers. CP/M and DOS compilers for
> various languages often gave the user a choice between binary FP and
> decimal (BCD) FP.

It's also very common for handheld calculators to work in decimal. Most of HP's classic calculators used a CPU that was specifically designed for doing BCD arithmetic, and many versions of it didn't even have a way of doing arithmetic in binary!

-- 
Greg
-- 
https://mail.python.org/mailman/listinfo/python-list
Re: pytest segfault, not with -v
Marco Sulla wrote at 2021-11-20 19:07 +0100:
> I know how to check the refcounts, but I don't know how to check the
> memory usage, since it's not a program, it's a simple library. Is
> there not a way to check inside Python the memory usage? Do I have to
> use a bash script (I'm on Linux)?

If Python was compiled appropriately (with "PYMALLOC_DEBUG"), `sys` contains the function `_debugmallocstats`, which prints details about Python's memory allocation and free lists. I was not able to compile Python 2.7 in this way, but the (system) Python 3.6 of Ubuntu was compiled appropriately.

Note that memory leaks usually do not cause segfaults (unless the application runs out of memory due to the leak).

Your observation shows (apparently) non-deterministic behavior. In those cases, minor differences (e.g. with/without "-v") can significantly change the behavior (e.g. segfault or not). Memory management bugs (releasing memory still in use) are a primary cause for this kind of behavior in Python applications.
-- 
https://mail.python.org/mailman/listinfo/python-list
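For the earlier question about monitoring memory from inside Python itself, the standard-library tracemalloc module is one option (a minimal sketch; the growing list just simulates a leak):

    import tracemalloc

    tracemalloc.start()

    leak = []
    for _ in range(1000):
        leak.append(bytearray(1024))   # simulate a growing allocation

    # Top allocation sites, grouped by source line:
    snapshot = tracemalloc.take_snapshot()
    for stat in snapshot.statistics("lineno")[:3]:
        print(stat)

    current, peak = tracemalloc.get_traced_memory()
    print(f"current={current} bytes, peak={peak} bytes")

Running the test suite in such a loop and watching whether "current" keeps climbing is one way to spot a leak without leaving Python.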
