Re: pytest segfault, not with -v

2021-11-20 Thread Dan Stromberg
On Fri, Nov 19, 2021 at 9:49 AM Marco Sulla 
wrote:

> I have a battery of tests done with pytest. My tests break with a
> segfault if I run them normally. If I run them using pytest -v, the
> segfault does not happen.
>
> What could cause this quantical phenomenon?
>

Pure python code shouldn't do this, unless you're using ctypes or similar
(which arguably isn't pure python).

But C extension modules sure can.  See:
https://stromberg.dnsalias.org/~strombrg/checking-early.html .  It uses
Fortran to make its point, but the same thing very much applies to C.

BTW, if you're using C extension modules, the troublesome one doesn't
necessarily have to be one you wrote. It could be a dependency created by
someone else too.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread Marco Sulla
Indeed I have introduced a command line parameter in my bench.py
script that simply specifies the number of times the benchmarks are
performed. This way I have a sort of segfault checker.

But I don't bench any part of the library. I suppose I have to create
a separate script that does a simple loop for all the cases, and
remove the optional parameter from bench. How boring.
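
A throwaway sketch of that kind of crash-hunting loop, done at the pytest
level instead (assuming pytest is on the PATH and the tests live under
tests/): every collected test runs in its own pytest process, so a segfault
kills only that one run and the offending test id gets printed.

import subprocess
import sys

# Collect the test ids first; lines containing "::" are individual tests.
collected = subprocess.run(
    ["pytest", "--collect-only", "-q", "tests/"],
    capture_output=True, text=True, check=True,
)
test_ids = [line for line in collected.stdout.splitlines() if "::" in line]

for test_id in test_ids:
    result = subprocess.run(["pytest", "-x", test_id])
    if result.returncode < 0:
        # On POSIX a negative return code means the run died from a signal.
        print("%s died with signal %d" % (test_id, -result.returncode),
              file=sys.stderr)
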
PS: is there a way to monitor the Python consumed memory inside Python
itself? In this way I could also trap memory leaks.

On Sat, 20 Nov 2021 at 01:46, MRAB  wrote:
>
> On 2021-11-19 23:44, Marco Sulla wrote:
> > On Fri, 19 Nov 2021 at 20:38, MRAB  wrote:
> >>
> >> On 2021-11-19 17:48, Marco Sulla wrote:
> >> > I have a battery of tests done with pytest. My tests break with a
> >> > segfault if I run them normally. If I run them using pytest -v, the
> >> > segfault does not happen.
> >> >
> >> > What could cause this quantical phenomenon?
> >> >
> >> Are you testing an extension that you're compiling? That kind of problem
> >> can occur if there's an uninitialised variable or incorrect reference
> >> counting (Py_INCREF/Py_DECREF).
> >
> > Ok, I know. But why can't it be reproduced if I do pytest -v? This way
> > I don't know which test fails.
> > Furthermore I noticed that if I remove the __pycache__ dir of tests,
> > pytest does not crash, until I re-ran it with the __pycache__ dir
> > present.
> > This way is very hard for me to understand what caused the segfault.
> > I'm starting to think pytest is not good for testing C extensions.
> >
> If there are too few Py_INCREF or too many Py_DECREF, it'll free the
> object too soon, and whether or when that will cause a segfault will
> depend on whatever other code is running. That's the nature of the
> beast: it's unpredictable!
>
> You could try running each of the tests in a loop to see which one
> causes a segfault. (Trying several in a loop will let you narrow it down
> more quickly.)
>
> pytest et al. are good for testing behaviour, but not for narrowing down
> segfaults.
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


No right operator in tp_as_number?

2021-11-20 Thread Marco Sulla
I checked the documentation:
https://docs.python.org/3/c-api/typeobj.html#number-structs
and it seems that, in the Python C API, the right operators do not exist.
For example, there is nb_add, that in Python is __add__, but there's
no nb_right_add, that in Python is __radd__

Am I missing something?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread MRAB

On 2021-11-20 17:40, Marco Sulla wrote:

Indeed I have introduced a command line parameter in my bench.py
script that simply specifies the number of times the benchmarks are
performed. This way I have a sort of segfault checker.

But I don't bench any part of the library. I suppose I have to create
a separate script that does a simple loop for all the cases, and
remove the optional parameter from bench. How boring.
PS: is there a way to monitor the Python consumed memory inside Python
itself? In this way I could also trap memory leaks.

I'm on Windows 10, so I debug in Microsoft Visual Studio. I also have a 
look at the memory usage in Task Manager. If the program uses more 
memory when there are more iterations, then that's a sign of a memory 
leak. For some objects I'd look at the reference count to see if it's 
increasing or decreasing for each iteration when it should be constant 
over time.
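
A bare-bones sketch of that kind of refcount check (sys.getrefcount itself
holds one temporary reference, so it is the before/after trend that matters;
exercise() below is just a stand-in for whatever extension call is being
tested):

import sys

def exercise(obj):
    # stand-in for the C-extension call actually being tested
    return repr(obj)

victim = object()
before = sys.getrefcount(victim)
for _ in range(100_000):
    exercise(victim)
after = sys.getrefcount(victim)
print(before, after)        # should be equal if the refcounting is balanced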



On Sat, 20 Nov 2021 at 01:46, MRAB  wrote:


On 2021-11-19 23:44, Marco Sulla wrote:
> On Fri, 19 Nov 2021 at 20:38, MRAB  wrote:
>>
>> On 2021-11-19 17:48, Marco Sulla wrote:
>> > I have a battery of tests done with pytest. My tests break with a
>> > segfault if I run them normally. If I run them using pytest -v, the
>> > segfault does not happen.
>> >
>> > What could cause this quantical phenomenon?
>> >
>> Are you testing an extension that you're compiling? That kind of problem
>> can occur if there's an uninitialised variable or incorrect reference
>> counting (Py_INCREF/Py_DECREF).
>
> Ok, I know. But why can't it be reproduced if I do pytest -v? This way
> I don't know which test fails.
> Furthermore I noticed that if I remove the __pycache__ dir of tests,
> pytest does not crash, until I re-ran it with the __pycache__ dir
> present.
> This way is very hard for me to understand what caused the segfault.
> I'm starting to think pytest is not good for testing C extensions.
>
If there are too few Py_INCREF or too many Py_DECREF, it'll free the
object too soon, and whether or when that will cause a segfault will
depend on whatever other code is running. That's the nature of the
beast: it's unpredictable!

You could try running each of the tests in a loop to see which one
causes a segfault. (Trying several in a loop will let you narrow it down
more quickly.)

pytest et al. are good for testing behaviour, but not for narrowing down
segfaults.



--
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread Marco Sulla
I know how to check the refcounts, but I don't know how to check the
memory usage, since it's not a program, it's a simple library. Is
there not a way to check inside Python the memory usage? I have to use
a bash script (I'm on Linux)?

On Sat, 20 Nov 2021 at 19:00, MRAB  wrote:
>
> On 2021-11-20 17:40, Marco Sulla wrote:
> > Indeed I have introduced a command line parameter in my bench.py
> > script that simply specifies the number of times the benchmarks are
> > performed. This way I have a sort of segfault checker.
> >
> > But I don't bench any part of the library. I suppose I have to create
> > a separate script that does a simple loop for all the cases, and
> > remove the optional parameter from bench. How boring.
> > PS: is there a way to monitor the Python consumed memory inside Python
> > itself? In this way I could also trap memory leaks.
> >
> I'm on Windows 10, so I debug in Microsoft Visual Studio. I also have a
> look at the memory usage in Task Manager. If the program uses more
> memory when there are more iterations, then that's a sign of a memory
> leak. For some objects I'd look at the reference count to see if it's
> increasing or decreasing for each iteration when it should be constant
> over time.
>
> > On Sat, 20 Nov 2021 at 01:46, MRAB  wrote:
> >>
> >> On 2021-11-19 23:44, Marco Sulla wrote:
> >> > On Fri, 19 Nov 2021 at 20:38, MRAB  wrote:
> >> >>
> >> >> On 2021-11-19 17:48, Marco Sulla wrote:
> >> >> > I have a battery of tests done with pytest. My tests break with a
> >> >> > segfault if I run them normally. If I run them using pytest -v, the
> >> >> > segfault does not happen.
> >> >> >
> >> >> > What could cause this quantical phenomenon?
> >> >> >
> >> >> Are you testing an extension that you're compiling? That kind of problem
> >> >> can occur if there's an uninitialised variable or incorrect reference
> >> >> counting (Py_INCREF/Py_DECREF).
> >> >
> >> > Ok, I know. But why can't it be reproduced if I do pytest -v? This way
> >> > I don't know which test fails.
> >> > Furthermore I noticed that if I remove the __pycache__ dir of tests,
> >> > pytest does not crash, until I re-ran it with the __pycache__ dir
> >> > present.
> >> > This way is very hard for me to understand what caused the segfault.
> >> > I'm starting to think pytest is not good for testing C extensions.
> >> >
> >> If there are too few Py_INCREF or too many Py_DECREF, it'll free the
> >> object too soon, and whether or when that will cause a segfault will
> >> depend on whatever other code is running. That's the nature of the
> >> beast: it's unpredictable!
> >>
> >> You could try running each of the tests in a loop to see which one
> >> causes a segfault. (Trying several in a loop will let you narrow it down
> >> more quickly.)
> >>
> >> pytest et al. are good for testing behaviour, but not for narrowing down
> >> segfaults.
> >
> --
> https://mail.python.org/mailman/listinfo/python-list
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: No right operator in tp_as_number?

2021-11-20 Thread MRAB

On 2021-11-20 17:45, Marco Sulla wrote:

I checked the documentation:
https://docs.python.org/3/c-api/typeobj.html#number-structs
and it seems that, in the Python C API, the right operators do not exist.
For example, there is nb_add, that in Python is __add__, but there's
no nb_right_add, that in Python is __radd__

Am I missing something?


A quick Google came up with this:

Python's __radd__ doesn't work for C-defined types
https://stackoverflow.com/questions/18794169/pythons-radd-doesnt-work-for-c-defined-types

It's about Python 2.7, but the principle is the same.
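
The short version: in the C number protocol there is only one slot per
operator, and CPython handles the reflected case by also offering the
operation to the right-hand operand's slot, with the arguments still in
their original order, so a C type's nb_add has to check which operand it
is. Very roughly, in Python (a much-simplified sketch of the dispatch in
Objects/abstract.c, ignoring the rule that a subclass on the right gets
first try):

def binary_op(v, w, slot_of):
    # slot_of(cls) stands in for looking up cls->tp_as_number->nb_add;
    # there is no separate "radd" slot to look for.
    slot_v = slot_of(type(v))
    slot_w = slot_of(type(w)) if type(w) is not type(v) else None

    for slot in (slot_v, slot_w):
        if slot is not None:
            result = slot(v, w)          # always called as slot(left, right)
            if result is not NotImplemented:
                return result
    return NotImplemented

class MyNum:
    """Toy stand-in for a C extension type with an nb_add slot."""
    def __init__(self, x):
        self.x = x

    @staticmethod
    def nb_add(v, w):
        # a C nb_add must cope with being either the left or the right operand
        left = v.x if isinstance(v, MyNum) else v
        right = w.x if isinstance(w, MyNum) else w
        return MyNum(left + right)

def lookup(cls):
    return getattr(cls, "nb_add", None)

print(binary_op(MyNum(3), 4, lookup).x)   # 7, the "__add__" direction
print(binary_op(3, MyNum(4), lookup).x)   # 7, the "__radd__" direction
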
--
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread Dan Stromberg
On Sat, Nov 20, 2021 at 10:09 AM Marco Sulla 
wrote:

> I know how to check the refcounts, but I don't know how to check the
> memory usage, since it's not a program, it's a simple library. Is
> there not a way to check inside Python the memory usage? I have to use
> a bash script (I'm on Linux)?
>

ps auxww
...can show you how much memory is in use for the entire process.

It's commonly combined with grep, like:
ps auxww | head -1
ps auxww | grep my-program-name

Have a look at the %MEM, VSZ and RSS columns.

But being out of memory doesn't necessarily lead to a segfault - it can (EG
if a malloc failed, and some C programmer neglected to do decent error
checking), but an OOM kill is more likely.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread Dan Stromberg
On Sat, Nov 20, 2021 at 10:59 AM Dan Stromberg  wrote:

>
>
> On Sat, Nov 20, 2021 at 10:09 AM Marco Sulla 
> wrote:
>
>> I know how to check the refcounts, but I don't know how to check the
>> memory usage, since it's not a program, it's a simple library. Is
>> there not a way to check inside Python the memory usage? I have to use
>> a bash script (I'm on Linux)?
>>
>
> ps auxww
> ...can show you how much memory is in use for the entire process.
>
> It's commonly combined with grep, like:
> ps auxww | head -1
> ps auxww | grep my-program-name
>
> Have a look at the %MEM, VSZ and RSS columns.
>
> But being out of memory doesn't necessarily lead to a segfault - it can
> (EG if a malloc failed, and some C programmer neglected to do decent error
> checking), but an OOM kill is more likely.
>

The above can be used to detect a leak in the _process_.

Once it's been established (if it's established) that the process is
getting oversized, you can sometimes see where the memory is going with:
https://www.fugue.co/blog/diagnosing-and-fixing-memory-leaks-in-python.html

But again, a memory leak isn't necessarily going to lead to a segfault.
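
You can also watch it from inside Python with the standard-library
tracemalloc module -- a minimal sketch (note that tracemalloc only sees
allocations made through Python's allocator, so a C extension calling
malloc() directly won't show up there):

import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()

leaky = []
for _ in range(10_000):
    leaky.append(object())      # stand-in for whatever library calls you exercise

current = tracemalloc.take_snapshot()
for stat in current.compare_to(baseline, "lineno")[:5]:
    print(stat)                 # the biggest growth, attributed to source lines
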
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: getting source code line of error?

2021-11-20 Thread Ulli Horlacher
Stefan Ram  wrote:
> [email protected] (Stefan Ram) writes:
> >except Exception as inst:
> >print( traceback.format_exc() )
> 
>   More to the point of getting the line number:

As I wrote in my initial posting:
I already have the line number. I am looking for the source code line! 

So far I use:

  m = re.search(r'\n\s*(.+)\n.*\n$',traceback.format_exc())
  if m: print('%s %s' % (prefix,m.group(1)))

-- 
Ullrich Horlacher  Server und Virtualisierung
Rechenzentrum TIK 
Universitaet Stuttgart E-Mail: [email protected]
Allmandring 30aTel:++49-711-68565868
70569 Stuttgart (Germany)  WWW:http://www.tik.uni-stuttgart.de/
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Ben Bacarisse
Chris Angelico  writes:

> On Sat, Nov 20, 2021 at 3:41 PM Ben Bacarisse  wrote:
>>
>> Chris Angelico  writes:
>>
>> > On Sat, Nov 20, 2021 at 12:43 PM Ben Bacarisse  
>> > wrote:
>> >>
>> >> Chris Angelico  writes:
>> >>
>> >> > On Sat, Nov 20, 2021 at 9:07 AM Ben Bacarisse  
>> >> > wrote:
>> >> >>
>> >> >> Chris Angelico  writes:
>> >> >>
>> >> >> > On Sat, Nov 20, 2021 at 5:08 AM ast  wrote:
>> >> >>
>> >> >> >>  >>> 0.3 + 0.3 + 0.3 == 0.9
>> >> >> >> False
>> >> >> >
>> >> >> > That's because 0.3 is not 3/10. It's not because floats are
>> >> >> > "unreliable" or "inaccurate". It's because the ones you're entering
>> >> >> > are not what you think they are.
>> >> >> >
>> >> >> > When will people understand this?
>> >> >> >
>> >> >> > (Probably never. Sigh.)
>> >> >>
>> >> >> Most people understand what's going on when it's explained to them.  
>> >> >> And
>> >> >> I think that being initially baffled is not unreasonable.  After all,
>> >> >> almost everyone comes to computers after learning that 3/10 can be
>> >> >> written as 0.3.  And Python "plays along" with the fiction to some
>> >> >> extent.  0.3 prints as 0.3, 3/10 prints as 0.3 and 0.3 == 3/10 is True.
>> >> >
>> >> > In grade school, we learn that not everything can be written that way,
>> >> > and 1/3 isn't actually equal to 0.33.
>> >>
>> >> Yes.  We learn early on that 0.33 means 33/100.
>> >> We don't learn that 0.33 is a special notation for machines that
>> >> have something called "binary floating point hardware" that does not
>> >> mean 33/100.  That has to be learned later.  And every
>> >> generation has to learn it afresh.
>> >
>> > But you learn that it isn't the same as 1/3. That's my point. You
>> > already understand that it is *impossible* to write out 1/3 in
>> > decimal. Is it such a stretch to discover that you cannot write 3/10
>> > in binary?
>> >
>> > Every generation has to learn about repeating fractions, but most of
>> > us learn them in grade school. Every generation learns that computers
>> > talk in binary. Yet, putting those two concepts together seems beyond
>> > many people, to the point that they feel that floating point can't be
>> > trusted.
>>
>> Binary is a bit of a red herring here.  It's the floating point format
>> that needs to be understood.  Three tenths can be represented in many
>> binary formats, and even decimal floating point will have some surprises
>> for the novice.
>
> Not completely a red herring; binary floating-point as used in Python
> (IEEE double-precision) is defined as a binary mantissa and a scale,
> just as "blackboard arithmetic" is generally defined as a decimal
> mantissa and a scale. (At least, I don't think I've ever seen anyone
> doing arithmetic on a blackboard in hex or octal.)

You seem to be agreeing with me.  It's the floating point part that is
the issue, not the base itself.

>> >> Yes, agreed, but I was not commenting on the odd (and incorrect) view
>> >> that floating point operations are not reliable and well-defined, but on
>> >> the reasonable assumption that a clever programming language might take
>> >> 0.3 to mean what I was taught it meant in grade school.
>> >
>> > It does mean exactly what it meant in grade school, just as 1/3 means
>> > exactly what it meant in grade school. Now try to represent 1/3 on a
>> > blackboard, as a decimal fraction. If that's impossible, does it mean
>> > that 1/3 doesn't mean 1/3, or that 1/3 can't be represented?
>>
>> As you know, it is possible, but let's say we outlaw any finite notation
>> for repeated digits...  Why should I convert 1/3 to this particular
>> apparently unsuitable representation?  I will write 1/3 and manipulate
>> that number using factional notation.
>
> If you want that, the fractions module is there for you.

Yes, I know.  The only point of disagreement (as far as can see) is
that literals like 0.3 appears to be confusing for beginners.  You think
they should know that "binary" (which may be all they know about
computers and numbers) means fixed-width binary floating point (or at
least might imply a format that can't represent three tenths), where I
think it's not unreasonable for them to suppose that 0.3 is manipulated
as the rational number it so clearly is.

> And again,
> grade school, we learned about ratios as well as decimals (or vulgar
> fractions and decimal fractions). They have different tradeoffs. For
> instance, I learned pi as both 22/7 and 3.14, because sometimes it'd
> be convenient to use the rational form and other times the decimal.
>
>> The novice programmer might similarly expect that when they write 0.3,
>> the program will manipulate that number as the faction it clearly is.
>> They may well be surprised by the fact that it must get put into a
>> format that can't represent what those three characters mean, just as I
>> would be surprised if you insisted I write 1/3 as a finite decimal (with
>> no repeat notation).
>
> Excep

RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
This discussion gets tiresome for some.

Mathematics is a pristine world that is NOT the real world. It handles
near-infinities fairly gracefully but many things in the real world break
down because our reality is not infinitely divisible and some parts are
neither contiguous nor fixed but in some sense wavy and probabilistic or
worse.

So in any computer, or computer language, we have realities to deal with
when someone asks for say the square root of 2 or other transcendental
numbers like pi or e or things like the sin(x) as often they are numbers
which in decimal require an infinite number of digits and in many cases do
not repeat. Something as simple as the fractions for 1/7, in decimal, has an
interesting repeating pattern but is otherwise infinite.

.142857142857142857 ... ->> 1/7
.285714285714285714 ... ->> 2/7
.428571 ...
.571428 ...
.714285 ...
.857142 ...

No matter how many bits you set aside, you cannot capture such numbers
exactly IN BASE 10.

You may be able to capture some such things in another base but then yet
others cannot be seen in various other bases. I suspect someone has
considered a data type that stores results in arbitrary bases and delays
evaluation as late as possible, but even those cannot handle many numbers.

So the reality is that most computer programming is ultimately BINARY as in
BASE 2. At some level almost anything is rounded and imprecise. About all we
want to guarantee is that any rounding or truncation done is as consistent
as possible so every time you ask for pi or the square root of 2, you get
the same result stored as bits. BUT if you ask a slightly different
question, why expect the same results? sqrt(2) operates on the number 2. But
sqrt(6*(1/3)) first evaluates 1/3 and stores it as bits then multiplies it
by the bit representation of 6 and stores a result which then is handed to
sqrt() and if the bits are not identical, there is no guarantee that the
result is identical.

I will say this. Python has perhaps an improved handling of large integers.
Many languages have an assortment of integer sizes you can use such as 16
bits or 32 or 64 and possibly many others including using 8 or 1bits for
limited cases. But for larger numbers, there is a problem where the result
overflows what can be shown in that many bits and the result either is seen
as an error or worse, as a smaller number where some of the overflow bits
are thrown away. Python has indefinite length integers that work fine. But
if I take a real number with the same value and do a similar operation, I
get what I consider a truncated result:

>>> 256**40
2135987035920910082395021706169552114602704522356652769947041607822219725780640550022962086936576
>>> 256.0**40
2.13598703592091e+96

That is because Python has not chosen to implement a default floating point
method that allows larger storage formats that could preserve more digits.

Could we design a more flexible storage form? I suspect we could BUT it
would not solve certain problems. I mean, consider these two squarings:

>>> .123456789123456789 * .123456789123456789
0.015241578780673677
>>> 123456789123456789 * 123456789123456789
15241578780673678515622620750190521

Clearly a fuller answer to the first part, based on the second, is
.015241578780673678515622620750190521

So one way to implement such extended functionality might be to have an
object that has a storage of the decimal part of something as an extended
integer variation along with storage of other parts like the exponent. SOME
operations would then use the integer representation and then be converted
back as needed. But such an item would not conform to existing standards and
would not trivially be integrated everywhere a normal floating point is
expected and thus may be truncated in many cases or have to be converted
before use.

But even such an object faces a serious problem as asking for a fraction
like 1/7 might lead to an infinite regress as the computer keeps lengthening
the data representation indefinitely. It has to be terminated eventually and
some of the examples shown where the whole does not seem to be the same
when viewed several ways, would still show the anomalies some invoke.

Do note pure Mathematics is just as confusing at times. The number
0.999... where the dot-dot-dot notation means go on forever, is
mathematically equivalent to the number 1 as is any infinite series that
asymptotically approaches 1 as in 

1/2 + 1/4 + 1/8 + ... + 1/(2**N) + ...

It is not seen by many students how continually appending a 9 can ever be
the same as a number like 1.0 since every single digit is always not a
match. But the mathematical theorems about limits are now well understood
and in the limit as N approaches infinity, the two come to mean the same
thing. 

Python is a tool. More specifically, it is a changing platform that hosts
many additional tools. For the moment the tools are built on bits which are
both very precise but also cannot finitely represent everythin

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 6:51 AM Ben Bacarisse  wrote:
>
> Chris Angelico  writes:
>
> > On Sat, Nov 20, 2021 at 3:41 PM Ben Bacarisse  wrote:
> >>
> >> Chris Angelico  writes:
> >>
> >> > On Sat, Nov 20, 2021 at 12:43 PM Ben Bacarisse  
> >> > wrote:
> >> >>
> >> >> Chris Angelico  writes:
> >> >>
> >> >> > On Sat, Nov 20, 2021 at 9:07 AM Ben Bacarisse  
> >> >> > wrote:
> >> >> >>
> >> >> >> Chris Angelico  writes:
> >> >> >>
> >> >> >> > On Sat, Nov 20, 2021 at 5:08 AM ast  wrote:
> >> >> >>
> >> >> >> >>  >>> 0.3 + 0.3 + 0.3 == 0.9
> >> >> >> >> False
> >> >> >> >
> >> >> >> > That's because 0.3 is not 3/10. It's not because floats are
> >> >> >> > "unreliable" or "inaccurate". It's because the ones you're entering
> >> >> >> > are not what you think they are.
> >> >> >> >
> >> >> >> > When will people understand this?
> >> >> >> >
> >> >> >> > (Probably never. Sigh.)
> >> >> >>
> >> >> >> Most people understand what's going on when it's explained to them.  
> >> >> >> And
> >> >> >> I think that being initially baffled is not unreasonable.  After all,
> >> >> >> almost everyone comes to computers after learning that 3/10 can be
> >> >> >> written as 0.3.  And Python "plays along" with the fiction to some
> >> >> >> extent.  0.3 prints as 0.3, 3/10 prints as 0.3 and 0.3 == 3/10 is 
> >> >> >> True.
> >> >> >
> >> >> > In grade school, we learn that not everything can be written that way,
> >> >> > and 1/3 isn't actually equal to 0.33.
> >> >>
> >> >> Yes.  We learn early on that 0.33 means 33/100.
> >> >> We don't learn that 0.33 is a special notation for machines that
> >> >> have something called "binary floating point hardware" that does not
> >> >> mean 33/100.  That has to be learned later.  And every
> >> >> generation has to learn it afresh.
> >> >
> >> > But you learn that it isn't the same as 1/3. That's my point. You
> >> > already understand that it is *impossible* to write out 1/3 in
> >> > decimal. Is it such a stretch to discover that you cannot write 3/10
> >> > in binary?
> >> >
> >> > Every generation has to learn about repeating fractions, but most of
> >> > us learn them in grade school. Every generation learns that computers
> >> > talk in binary. Yet, putting those two concepts together seems beyond
> >> > many people, to the point that they feel that floating point can't be
> >> > trusted.
> >>
> >> Binary is a bit of a red herring here.  It's the floating point format
> >> that needs to be understood.  Three tenths can be represented in many
> >> binary formats, and even decimal floating point will have some surprises
> >> for the novice.
> >
> > Not completely a red herring; binary floating-point as used in Python
> > (IEEE double-precision) is defined as a binary mantissa and a scale,
> > just as "blackboard arithmetic" is generally defined as a decimal
> > mantissa and a scale. (At least, I don't think I've ever seen anyone
> > doing arithmetic on a blackboard in hex or octal.)
>
> You seem to be agreeing with me.  It's the floating point part that is
> the issue, not the base itself.

Mostly, but all the problems come from people expecting decimal floats
when they're using binary floats.

> >> >> Yes, agreed, but I was not commenting on the odd (and incorrect) view
> >> >> that floating point operations are not reliable and well-defined, but on
> >> >> the reasonable assumption that a clever programming language might take
> >> >> 0.3 to mean what I was taught it meant in grade school.
> >> >
> >> > It does mean exactly what it meant in grade school, just as 1/3 means
> >> > exactly what it meant in grade school. Now try to represent 1/3 on a
> >> > blackboard, as a decimal fraction. If that's impossible, does it mean
> >> > that 1/3 doesn't mean 1/3, or that 1/3 can't be represented?
> >>
> >> As you know, it is possible, but let's say we outlaw any finite notation
> >> for repeated digits...  Why should I convert 1/3 to this particular
> >> apparently unsuitable representation?  I will write 1/3 and manipulate
> >> that number using factional notation.
> >
> > If you want that, the fractions module is there for you.
>
> Yes, I know.  The only point of disagreement (as far as can see) is
> that literals like 0.3 appears to be confusing for beginners.  You think
> they should know that "binary" (which may be all they know about
> computers and numbers) means fixed-width binary floating point (or at
> least might imply a format that can't represent three tenths), where I
> think it's not unreasonable for them to suppose that 0.3 is manipulated
> as the rational number it so clearly is.

Rationals are mostly irrelevant. We don't use int/int for most
purposes. When you're comparing number systems between the way people
write them and the way computers do, the difference isn't "0.3" and
"3/10". If people are prepared to switch their thinking to rationals
instead of decimals, then sure, the computer can represent those
precise

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 8:32 AM Avi Gross via Python-list
 wrote:
>
> This discussion gets tiresome for some.
>
> Mathematics is a pristine world that is NOT the real world. It handles
> near-infinities fairly gracefully but many things in the real world break
> down because our reality is not infinitely divisible and some parts are
> neither contiguous nor fixed but in some sense wavy and probabilistic or
> worse.

But the purity of mathematics isn't the problem. The problem is
people's expectations around computers. (The problem is ALWAYS
people's expectations.)

> So in any computer, or computer language, we have realities to deal with
> when someone asks for say the square root of 2 or other transcendental
> numbers like pi or e or things like the sin(x) as often they are numbers
> which in decimal require an infinite number of digits and in many cases do
> not repeat. Something as simple as the fractions for 1/7, in decimal, has an
> interesting repeating pattern but is otherwise infinite.
>
> .142857142857142857 ... ->> 1/7
> .285714285714285714 ... ->> 2/7
> .428571 ...
> .571428 ...
> .714285 ...
> .857142 ...
>
> No matter how many bits you set aside, you cannot capture such numbers
> exactly IN BASE 10.

Right, and people understand this. Yet as soon as you switch from base
10 to base 2, it becomes impossible for people to understand that 1/5
now becomes the exact same thing: an infinitely repeating expansion
for the rational number.
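
To make it concrete, here is the same 0.2 literal printed three ways;
nothing is "fuzzy" on Python's side, the stored value is simply the nearest
representable double:

from fractions import Fraction

print(f"{0.2:.20f}")     # 0.20000000000000001110
print((0.2).hex())       # 0x1.999999999999ap-3 -- the repeating pattern, rounded at 53 bits
print(Fraction(0.2))     # 3602879701896397/18014398509481984, exactly what is stored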

> You may be able to capture some such things in another base but then yet
> others cannot be seen in various other bases. I suspect someone has
> considered a data type that stores results in arbitrary bases and delays
> evaluation as late as possible, but even those cannot handle many numbers.

More likely it would just store rationals as rationals - or, in other
words, fractions.Fraction().

> So the reality is that most computer programming is ultimately BINARY as in
> BASE 2. At some level almost anything is rounded and imprecise. About all we
> want to guarantee is that any rounding or truncation done is as consistent
> as possible so every time you ask for pi or the square root of 2, you get
> the same result stored as bits. BUT if you ask a slightly different
> question, why expect the same results? sqrt(2) operates on the number 2. But
> sqrt(6*(1/3)) first evaluates 1/3 and stores it as bits then multiplies it
> by the bit representation of 6 and stores a result which then is handed to
> sqrt() and if the bits are not identical, there is no guarantee that the
> result is identical.

This is what I take issue with. Binary doesn't mean "rounded and
imprecise". It means "base two". People get stroppy at a computer's
inability to represent 0.3 correctly, because they think that it
should be perfectly obvious what that value is. Nobody's bothered by
sqrt(2) not being precise, but they're very much bothered by 1/10 not
"working".

> Do note pure Mathematics is just as confusing at times. The number
> 0.999... where the dot-dot-dot notation means go on forever, is
> mathematically equivalent to the number 1 as is any infinite series that
> asymptotically approaches 1 as in
>
> 1/2 + 1/4 + 1/8 + ... + 1/(2**N) + ...
>
> It is not seen by many students how continually appending a 9 can ever be
> the same as a number like 1.0 since every single digit is always not a
> match. But the mathematical theorems about limits are now well understood
> and in the limit as N approaches infinity, the two come to mean the same
> thing.

Mathematics is confusing. That's not a problem. To be quite frank, the
real world is far more confusing than the pristine beauty that we have
inside a computer. The problem isn't the difference between reality
and mathematics, or between reality and computers, or anything like
that; the problem, as always, is between people's expectations and
what computers do.

Tell me: if a is equal to b and b is equal to c, is a equal to c?
Mathematicians say "of course it is". Engineers say "there's no way
you can rely on that". Computer programmers side with whoever makes
most sense right this instant.

> So, what should be stressed, and often is, is to use tools available that
> let you compare numbers for being nearly equal.

No. No no no no no. You don't need to use a "nearly equal" comparison
just because floats are "inaccurate". It isn't like that. It's this
exact misinformation that I am trying to fight, because floats are NOT
inaccurate. They're just in binary, same as everything that computers
do.

> I note how unamused I was when making a small table in EXCEL (Note, not
> Python) of credit card numbers and balances when I saw the darn credit card
> numbers were too long and a number like:
>
> 4195032150199578
>
> was displayed by EXCEL as:
>
> 4195032150199570
>
> It looks like I just missed having significant stored digits and EXCEL
> reconstructed it by filling in a zero for the missing extra. The problem is
> I had to check balanc

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Grant Edwards
On 2021-11-20, Chris Angelico  wrote:

> But you learn that it isn't the same as 1/3. That's my point. You
> already understand that it is *impossible* to write out 1/3 in
> decimal. Is it such a stretch to discover that you cannot write 3/10
> in binary?

For many people, it seems to be.

There are plenty of people trying to write code who don't even understand
the concept of different bases.

I remember trying to explain the concept of CPU registers, stacks,
interrupts, and binary representations to VAX/VMS FORTRAN programmers
and getting absolutely nowhere.

Years later, I went through the same exercise with a bunch of Windows
C++ programmers, and they seemed similarly baffled.

Perhaps I was just a bad teacher.
 
--
Grant

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Grant Edwards
On 2021-11-20, Ben Bacarisse  wrote:

> You seem to be agreeing with me.  It's the floating point part that is
> the issue, not the base itself.

No, it's the base. Floating point can't represent 3/10 _because_ it's
base 2 floating point. Floating point in base 10 doesn't have any
problem representing 3/10.

--
Grant
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 9:22 AM Grant Edwards  wrote:
>
> On 2021-11-20, Chris Angelico  wrote:
>
> > But you learn that it isn't the same as 1/3. That's my point. You
> > already understand that it is *impossible* to write out 1/3 in
> > decimal. Is it such a stretch to discover that you cannot write 3/10
> > in binary?
>
> For many people, it seems to be.
>
> There are plenty of people trying to write code who don't even under
> the concept of different bases.
>
> I remember trying to explain the concept of CPU registers, stacks,
> interrupts, and binary representations to VAX/VMS FORTRAN programmers
> and getting absolutely nowhere.
>
> Years later, I went through the same exercise with a bunch of Windows
> C++ programmers, and they seemed similarly baffled.
>
> Perhaps I was just a bad teacher.
>

And to some extent, that's not really surprising; not everyone can
think the way other people do, and not everyone can think the way
computers do. But it seems that, in this one specific case, there's a
massive tendency to (a) misunderstand, and then (b) belligerently
assume that the computer acts the way they want it to act. And then
sometimes (c) get really annoyed at the computer for not being a
person, and start the cargo cult practice of "always use a
nearly-equal function instead of testing for equality", which we've
seen in this exact thread.

That's what I take issue with: the smug "0.1 + 0.2 != 0.3, therefore
computers are wrong" people, and the extremely unhelpful "never use ==
with floats" people.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
Chris,

I generally agree with your comments albeit I might take a different slant.

What I meant is that people who learn mathematics (as I and many here
obviously did) can come away with idealized ideas that they then expect to
be replicable everywhere. But there are grey lines along the way where some
mathematical proofs do weird things like IGNORE parts of a calculation by
suggesting they are going to zero much faster than other parts and then wave
a mathematical wand about what happens when they approach a limit like zero
and voila, we just "proved" that the derivative of X**2 is 2*X or the more
general derivative of A*(X**N) is N*A*(X**(N-1)) and then extend that to N
being negative or fractional or a transcendental number and beyond.

Computers generally use finite methods, sometimes too finite. Yes, the
problem is not Mathematics as a field. It is how humans often generalize or
analogize from one area into something a bit different. I do not agree with
any suggestion that a series of bits that encodes a result that is rounded
or truncated is CORRECT. A representation of 0.3 in a binary version of some
floating point format is not technically correct. Storing it as 3/10 and
carefully later multiplying it by 20 and then carefully canceling part will
result in exactly 6. While storing it digitally and then multiplying it in
registers or whatever by 20 may get a result slightly different than the
storage representation of 6.00... and that is a fact and risk we
generally are willing to take.

But consider a different example. If I have a filesystem or URL or anything
that does not care about whether parts are in upper or lower case, then
"filename" and "FILENAME" and many variations like "fIlEnAmE" are all
assumed to mean the same thing. A program may even simply store all of them
in the same way as all uppercase. But when you ask to compare two versions
with a function where case matters, they all test as unequal! So there are
ways to ask for a comparison that is approximately equal given the
constraints that case does not matter:

>>> alpha="Hello"
>>> beta="hELLO"
>>> alpha == beta
False
>>> alpha.lower() == beta.lower()
True

I see no reason why a comparison cannot be done like this in cases you are
concerned with small errors creeping in:

>>> from math import isclose
>>> isclose(1, .9999999999)
True
>>> isclose(1, .99999999999999)
True
>>> isclose(1, .999)
False

I will agree with you that binary is not any more imprecise than base 10.
Computer hardware that works with binary is much easier to design, though.

So floats by themselves are not inaccurate but realistically the results of
operations ARE. I mean if I ask a long number to be stored that does not
fully fit, it is often silently truncated and what the storage location now
represent accurately is not my number but the shorter version that is at the
limit of tolerance. But consider another analogy often encountered in
mathematics.

If I measure several numbers in the real world such as weight and height and
temperature and so on, some are considered accurate only to a limited number
of digits. Your weight on a standard digital scale may well be 189.8 but if
I add a feather or subtract one, the reading may well shift to one unit up
or down. Heck, the same person measured just minutes later may shift. If I
used a deluxe scale that measures to more decimal places, it may get hard to
get the exact same number twice in a row as just taking a deeper breath may
make a change. 

So what happens if I measure a box in three dimensions to the nearest .1
inch and decide it is 10.1 by 20.2 by 30.3 inches? What is the volume,
ignoring pesky details about the width of the cardboard or whatever?

A straightforward multiplication yields 6181.806 cubic inches. You may have
been told to round that to something like 6181.8 because the potential error
in each measure cannot result in more precision. In reality, you might even
calculate two sets of numbers assuming the true width may have been a tad
more or less and come up with the volume being BETWEEN a somewhat smaller
number and a somewhat larger number.

I claim a similar issue plagues using a computer to deal with stored
numbers, perhaps not stored 100% perfectly as discussed, and doing
calculations. The result often comes out more precisely than warranted. I
suspect there are modules out there that might do multi-step calculations
where at each step, numbers generated with extra precision are throttled
back so the extra precision is set to zeroes after rounding to avoid the
small increments adding up. Others may just do the calculations and keep
track and remove extra precision at the end.

And again, this is not because the implementation of numbers is in any way
wrong but because a real-world situation requires the humans to sort of dial
back how they are used and not over-reach.

So comparing for close-enough inequality is not necessarily a reflection on
floats but on the design not acco

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Rob Cliffe via Python-list



On 20/11/2021 22:59, Avi Gross via Python-list wrote:

there are grey lines along the way where some
mathematical proofs do weird things like IGNORE parts of a calculation by
suggesting they are going to zero much faster than other parts and then wave
a mathematical wand about what happens when they approach a limit like zero
and voila, we just "proved" that the derivative of X**2 is 2*X or the more
general derivative of A*(X**N) is N*A*(X**(N-1)) and then extend that to N
being negative or fractional or a transcendental number and beyond.



    You seem to be maligning mathematicians.
    What you say was true in the time of Newton, Leibniz and Bishop 
Berkeley, but analysis was made completely rigorous by the efforts of 
Weierstrass and others.  There are no "grey lines".  Proofs do not 
"suggest", they PROVE (else they are not proofs, they are plain wrong).  
It is not the fault of mathematicians (or mathematics) if some people 
produce sloppy hand-wavy "proofs" as justification for their conclusions.
    I am absolutely sure you know all this, but your post does not read 
as if you do.  And it could give a mistaken impression to a 
non-mathematician.  I think we have had enough denigration of experts.

Best
Rob Cliffe



--
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 10:01 AM Avi Gross via Python-list
 wrote:
> Computers generally use finite methods, sometimes too finite. Yes, the
> problem is not Mathematics as a field. It is how humans often generalize or
> analogize from one area into something a bit different. I do not agree with
> any suggestion that a series of bits that encodes a result that is rounded
> or truncated is CORRECT. A representation of 0.3 in a binary version of some
> floating point format is not technically correct. Storing it as 3/10 and
> carefully later multiplying it by 20 and then carefully canceling part will
> result in exactly 6. While storing it digitally and then multiplying it in
> registers or whatever by 20 may get a result slightly different than the
> storage representation of 6.00... and that is a fact and risk we
> generally are willing to take.

Do you accept that storing the floating point value 1/4, then
multiplying by 20, will give precisely 5? Because that is
*guaranteed*. You don't have to expect a result "slightly different"
from 5, it will be absolutely exactly five:

>>> (1/4) * 20 == 5.0
True

This is what I'm talking about. Some numbers can be represented
perfectly, others can't. If you try to represent the square root of
two as a decimal number, then multiply it by itself, you won't get
back precisely 2, because you can't have written out the *exact*
square root of two. But you most certainly CAN write "1.875" on a
piece of paper, and it really truly does exactly mean fifteen eighths.
And you can write that number as a binary float, too, and it'll mean
the exact same value.
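
A quick check in the interpreter that both claims hold:

print(1.875 == 15/8)    # True
print((1.875).hex())    # 0x1.e000000000000p+0 -- 1 + 1/2 + 1/4 + 1/8, stored exactly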

> But consider a different example. If I have a filesystem or URL or anything
> that does not care about whether parts are in upper or lower case, then
> "filename" and "FILENAME" and many variations like "fIlEnAmE" are all
> assumed to mean the same thing. A program may even simply store all of them
> in the same way as all uppercase. But when you ask to compare two versions
> with a function where case matters, they all test as unequal! So there are
> ways to ask for a comparison that is approximately equal given the
> constraints that case does not matter:

A URL has distinct parts to it: the domain has some precise folding
done (most notably case folding), the path does not, and you can
consider "http://example.com:80/foo"; to be the same as
"http://example.com/foo"; because 80 is the default port.

> >>> alpha="Hello"
> >>> beta="hELLO"
> >>> alpha == beta
> False
> >>> alpha.lower() == beta.lower()
> True
>

That's a terrible way to compare URLs, because it's both too sloppy
AND too strict at the same time. But if you have a URL representation
tool, it should be able to consider two things equal.

Floats are representations of numbers that can be compared for
equality if they truly represent the same number. The value 3/6 is
precisely equal to the value 7/14:

>>> 3/6 == 7/14
True

You don't need an "approximately equal" function here. They are the
same value. They are equal.

> I see no reason why a comparison cannot be done like this in cases you are
> concerned with small errors creeping in:
>
> >>> from math import isclose
> >>> isclose(1, .9999999999)
> True
> >>> isclose(1, .99999999999999)
> True
> >>> isclose(1, .999)
> False

This is exactly the problem though: HOW close counts as equal? The
only way to answer that question is to know the accuracy of your
inputs, and the operations done.

> So floats by themselves are not inaccurate but realistically the results of
> operations ARE. I mean if I ask a long number to be stored that does not
> fully fit, it is often silently truncated and what the storage location now
> represent accurately is not my number but the shorter version that is at the
> limit of tolerance. But consider another analogy often encountered in
> mathematics.

Not true. Operations are often perfectly accurate.

> If I measure several numbers in the real world such as weight and height and
> temperature and so on, some are considered accurate only to a limited number
> of digits. Your weight on a standard digital scale may well be 189.8 but if
> I add a feather or subtract one, the reading may well shift to one unit up
> or down. Heck, the same person measured just minutes later may shift. If I
> used a deluxe scale that measures to more decimal places, it may get hard to
> get the exact same number twice in a row as just taking a deeper breath may
> make a change.
>
> So what happens if I measure a box in three dimensions to the nearest .1
> inch and decide it is 10.1 by 20.2 by 30.3 inches? What is the volume,
> ignoring pesky details about the width of the cardboard or whatever?
>
> A straightforward multiplication yields 6181.806 cubic inches. You may have
> been told to round that to something like 6181.8 because the potential error
> in each measure cannot result in more precision. In reality, you might even
> calculate two sets of numbers assuming the true width may have

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Ben Bacarisse
Grant Edwards  writes:

> On 2021-11-20, Ben Bacarisse  wrote:
>
>> You seem to be agreeing with me.  It's the floating point part that is
>> the issue, not the base itself.
>
> No, it's the base. Floating point can't represent 3/10 _because_ it's
> base 2 floating point. Floating point in base 10 doesn't have any
> problem representing 3/10.

Every base has the same problem for some numbers.  It's the floating
point part that causes the problem.

Binary and decimal stand out because we write a lot of decimals in
source code and computers use binary, but if decimal floating point were
common (as it increasingly is) different fractions would become the oft
quoted "surprise" results.

-- 
Ben.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: getting source code line of error?

2021-11-20 Thread Paolo G. Cantore

Am 20.11.21 um 20:15 schrieb Ulli Horlacher:

Stefan Ram  wrote:

[email protected] (Stefan Ram) writes:

except Exception as inst:
print( traceback.format_exc() )


   More to the point of getting the line number:


As I wrote in my initial posting:
I already have the line number. I am looking for the source code line!

So far I use:

   m = re.search(r'\n\s*(.+)\n.*\n$',traceback.format_exc())
   if m: print('%s %s' % (prefix,m.group(1)))


Stefan Ram's solution missed only the line content. Here it is.


import sys
import traceback

try:
    1/0
except ZeroDivisionError as exception:
    tr = traceback.TracebackException.from_exception( exception )
    x = tr.stack[0]
    print("Exception %s in line %s: %s" % (exception, x.lineno, x.line))


The traceback object contains not only the lineno but also the
content of the offending line.


--
Paolo
--
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 10:55 AM Ben Bacarisse  wrote:
>
> Grant Edwards  writes:
>
> > On 2021-11-20, Ben Bacarisse  wrote:
> >
> >> You seem to be agreeing with me.  It's the floating point part that is
> >> the issue, not the base itself.
> >
> > No, it's the base. Floating point can't represent 3/10 _because_ it's
> > base 2 floating point. Floating point in base 10 doesn't have any
> > problem representing 3/10.
>
> Every base has the same problem for some numbers.  It's the floating
> point part that causes the problem.
>
> Binary and decimal stand out because we write a lot of decimals in
> source code and computers use binary, but if decimal floating point were
> common (as it increasingly is) different fractions would become the oft
> quoted "surprise" results.
>

And if decimal floating point were common, other "surprise" behaviour
would be cited, like how x < y and (x+y)/2 < x.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
Can I suggest a way to look at it, Grant?

In base 10, we represent all numbers as the (possibly infinite) sum of ten
raised to some integral power.

123 is 3 times 1 (ten to the zero power) plus
2 times 10 (ten to the one power) plus
1 times 100 (ten to the two power)

123.456 just extends this with
4 times 1/10 (ten to the minus one power) plus
5 times 1/100 (10**-2) plus
6 time 1/1000 (10**-3)

In binary, all the powers are not powers of 10 but powers of two.

So IF you wrote something like 111 it means 1 times 1 plus 1 times 2 plus 1
times 4 or 7. A zero anywhere just skips a 2 to that power. If you added a
decimal point to make 111.111 the latter part would be 1/2 plus 1/4 plus 1/8
or 7/8 which combined might be 7 and 7/8. So any fractions of the form
something over 2**N can be made easily and almost everything else cannot be
made in finite stretches. How would you make 2/3 or 3 /10?
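
(Checking that arithmetic in Python:)

whole = int("111", 2)                     # 1*4 + 1*2 + 1*1 = 7
frac = sum(1 / 2**k for k in (1, 2, 3))   # 1/2 + 1/4 + 1/8 = 0.875, exact in binary
print(whole + frac)                       # 7.875, i.e. 7 and 7/8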

But the opposite is somewhat true. In decimal, the above becomes 7.875, and to
make other fractions of this kind you need more and more digits. As it
happens, all such base-2 compatible streams can be made because each is in
some sense a divide by two.

7/16 = 1/2 * .875 = .4375
7/32 = 1/2 * .4375 = .21875

and so on. But this ability is a special case artifact caused by a terminal
digit 5 always being able to be divided in two to make a 25 a unit longer
and then again and again. Note 2 and 5 are factors of 10.  In the more
general case, this fails. In base 7, 3/7 is written easily as 0.3 but the
same fraction in decimal is a repeating copy of .428571... which never
terminates. A number like 3/7 + 4/49 + 5/343 cannot be written finitely in
base 10 or base 2, while the opposite is also true: only an approximation of
most base-10 or base-2 fractions can ever be written in base 7. I am, of
course, talking about the
part to the right of the decimal. Integers to the left can be written in any
base. It is fractional parts that can end up being nonrepeating.

What about pi and e and the square root of 2? I suspect all of them have an
infinite sequence with no real repetition (over long enough stretches) in
any base! I mean an integer base, of course. The constant e in base e is
just 1.

As has been hammered home, computers have generally always dealt in one or
more combined on/off or Boolean idea so deep down they tend to have binary
circuits. At one point, programmers sometimes used base 8, octal, to group
three binary digits together as in setting flags for a file's permissions,
may use 01, 02 and 04 to be OR'ed with the current value to turn on
read/write/execute bits, or a combination like 7 (1+2+4) to set all of them
at once. And, again, for some purposes, base 16 (hexadecimal) is often used
with numerals extended to include a-f to represent a nibble or half byte, as
in some programs that let you set colors or whatever. But they are just a
convenience as ultimately they are used as binary for most purposes. In high
school, for a while, and just for fun, I annoyed one teacher by doing much
of my math in base 32 leaving them very perplexed as to how I got the
answers right. As far as I know, nobody seriously uses any bases not already
a power of two even for intermediate steps, outside of some interesting
stuff in number theory.

I think there have been attempts to use a decimal representation in some
accounting packages or database applications that allow any decimal numbers
to be faithfully represented and used in calculations. Generally this is not
a very efficient process but it can handle 0.3 albeit still have no way to
deal with transcendental numbers.

As such, since this is a Python Forum let me add you can get limited support
for some of this using the decimal module:

https://www.askpython.com/python-modules/python-decimal-module

But I doubt Python can be said to do things worse than just about any other
computer language when storing and using floating point. As hammered in
repeatedly, it is doing whatever is allowed in binary and many things just
cannot easily or at all be done in binary.

Let me leave you with Egyptian mathematics. Their use of fractions, WAY BACK
WHEN, only had the concept of a reciprocal of an integer. As in for any
integer N, there was a fraction of 1/N. They had a concept of 1/3 but not of
2/3 or 4/9.

So they added reciprocals to make any more complex fractions. To make 2/3
they added 1/2 plus 1/6 for example.

Since they were not stuck with any one base, all kinds of such combined
fractions could be done but of course the square root of 2 or pi were a bit
beyond them and for similar reasons.

https://en.wikipedia.org/wiki/Egyptian_fraction
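
The greedy construction attributed to Fibonacci is easy to play with in
Python (a small sketch using the fractions module, not anything the
Egyptians wrote down):

from fractions import Fraction
from math import ceil

def egyptian(frac):
    # repeatedly peel off the largest unit fraction that still fits
    parts = []
    while frac > 0:
        unit = Fraction(1, ceil(1 / frac))
        parts.append(unit)
        frac -= unit
    return parts

print(egyptian(Fraction(2, 3)))   # [Fraction(1, 2), Fraction(1, 6)]
print(egyptian(Fraction(4, 9)))   # [Fraction(1, 3), Fraction(1, 9)]
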

My point is there are many ways humans can choose to play with numbers and
not all of them can easily do the same thing. Roman Numerals were (and
remain) a horror to do much mathematics with and especially when they play
games based on whether a symbol like X is to the left or right of another
like C as XC is 90 and CX is 110.

To do programming learn the rules that only w

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 11:39 AM Avi Gross via Python-list
 wrote:
>
> Can I suggest a way to look at it, Grant?
>
> In base 10, we represent all numbers as the (possibly infinite) sum of ten
> raised to some integral power.

Not infinite. If you allow an infinite sequence of digits, you create
numerous paradoxes, not to mention the need for infinite storage.

> 123 is 3 times 1 (ten to the zero power) plus
> 2 times 10 (ten to the one power) plus
> 1 times 100 (ten to the two power)
>
> 123.456 just extends this with
> 4 times 1/10 (ten to the minus one power) plus
> 5 times 1/100 (10**-2) plus
> 6 time 1/1000 (10**-3)
>
> In binary, all the powers are not powers of 10 but powers of two.
>
> So IF you wrote something like 111 it means 1 times 1 plus 1 times 2 plus 1
> times 4 or 7. A zero anywhere just skips a 2 to that power. If you added a
> decimal point to make 111.111 the latter part would be 1/2 plus 1/4 plus 1/8
> or 7/8 which combined might be 7 and 7/8. So any fractions of the form
> something over 2**N can be made easily and almost everything else cannot be
> made in finite stretches. How would you make 2/3 or 3 /10?

Right, this is exactly how place value works.

> But the opposite is something true. In decimal, to make the above it becomes
> 7.875 and to make other fractions of the kind, you need more and more As it
> happens, all such base-2 compatible streams can be made because each is in
> some sense a divide by two.
>
> 7/16 = 1/2 * .875 = .4375
> 7/32 = 1/2 * .4375 = .21875
>
> and so on. But this ability is a special case artifact caused by a terminal
> digit 5 always being able to be divided in tow to make a 25 a unit longer
> and then again and again. Note 2 and 5 are factors of 10.  In the more
> general case, this fails. In base 7, 3/7 is written easily as 0.3 but the
> same fraction in decimal is a repeating copy of .428571... which never
> terminates. A number like 3/7 + 4/49 + 5/343 generally cannot be written in
> base 7 but the opposite is also true that only a approximation of numbers in
> base 2 or base 10 can ever be written. I am, of course, talking about the
> part to the right of the decimal. Integers to the left can be written in any
> base. It is fractional parts that can end up being nonrepeating.

If you have a number with a finite binary representation, you can
guarantee that it can be represented finitely in decimal too.
Infinitely repeating expansions come from denominators that are
coprime with the numeric base.

> What about pi and e and the square root of 2? I suspect all of them have an
> infinite sequence with no real repetition (over long enough stretches) in
> any base! I mean an integer base, of course. The constant e in base e is
> just 1.

More than "suspect". This has been proven. That's what transcendental means.

I don't think "base e" means the same thing that "base ten" does.
(Normally you'd talk about a base e *logarithm*, which is a completely
different concept.) But if you try to work with a transcendental base
like that, it would be impossible to represent any integer finitely.

(Side point: There are other representations that have different
implications about what repeats and what doesn't. For instance, the
decimal expansion for a square root doesn't repeat, but the continued
fraction for the same square root will. For instance, 7**0.5 is
2;1,1,1,4,1,1,1,4... with an infinitely repeating four-element unit.)
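
A minimal sketch that reproduces that expansion with exact integer
arithmetic, using the standard recurrence for square-root continued
fractions (sqrt_cf is just an illustrative name):

from math import isqrt

def sqrt_cf(n, terms=9):
    # Standard integer recurrence for the continued fraction of sqrt(n):
    # m -> d*a - m, d -> (n - m*m) // d, a -> (a0 + m) // d
    a0 = isqrt(n)
    if a0 * a0 == n:
        return [a0]
    cf, m, d, a = [a0], 0, 1, a0
    for _ in range(terms - 1):
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        cf.append(a)
    return cf

print(sqrt_cf(7))    # [2, 1, 1, 1, 4, 1, 1, 1, 4]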

> I think there have been attempts to use a decimal representation in some
> accounting packages or database applications that allow any decimal numbers
> to be faithfully represented and used in calculations. Generally this is not
> a very efficient process but it can handle 0.3 albeit still have no way to
> deal with transcendental numbers.

Fixed point has been around for a long time (the simplest example
being "work in cents and use integers"), but actual decimal
floating-point is quite unusual. Some databases support it, and REXX
used that as its only numeric form, but it's not hugely popular.
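
As a minimal sketch of the cents-and-integers style (the price and the 8%
rate are made up), everything stays exact because only integers are stored:

# All amounts are integers of cents, so the arithmetic stays exact.
price_cents = 1999                      # $19.99
tax_cents = price_cents * 8 // 100      # made-up 8% tax, truncated to whole cents
total_cents = price_cents + tax_cents
print(f"${total_cents // 100}.{total_cents % 100:02d}")    # $21.58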

> Let me leave you with Egyptian mathematics. Their use of fractions, WAY BACK
> WHEN, only had the concept of a reciprocal of an integer. As in for any
> integer N, there was a fraction of 1/N. They had a concept of 1/3 but not of
> 2/3 or 4/9.
>
> So they added reciprocals to make any more complex fractions. To make 2/3
> they added 1/2 plus 1/6 for example.
>
> Since they were not stuck with any one base, all kinds of such combined
> fractions could be done but of course the square root of 2 or pi were a bit
> beyond them and for similar reasons.
>
> https://en.wikipedia.org/wiki/Egyptian_fraction

It's interesting as a curiosity, but it makes arithmetic extremely difficult.

> My point is there are many ways humans can choose to play with numbers and
> not all of them can easily do the same thing. Roman Numerals were (and
> remain) a horror to do much mathematics with and especially when they play
> games based on whether a symbol like X is 

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Grant Edwards
On 2021-11-21, Chris Angelico  wrote:

>> I think there have been attempts to use a decimal representation in some
>> accounting packages or database applications that allow any decimal numbers
>> to be faithfully represented and used in calculations. Generally this is not
>> a very efficient process but it can handle 0.3 albeit still have no way to
>> deal with transcendental numbers.
>
> Fixed point has been around for a long time (the simplest example
> being "work in cents and use integers"), but actual decimal
> floating-point is quite unusual. Some databases support it, and REXX
> used that as its only numeric form, but it's not hugely popular.

My recollection is that it was quite common back in the days before FP
hardware was "a thing" on small computers. CPM and DOS compilers for
various languages often gave the user a choice between binary FP and
decimal (BCD) FP.

If you were doing accounting you chose decimal. If you were doing
science, you chose binary (better range and precision for the same
number of bits of storage).

Once binary FP hardware became available, decimal FP support was
abandoned.

--
Grant
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
Not at all, Robb. I am not intending to demean Mathematicians as one of my 
degrees is in that subject and I liked it. I mean that some things in 
mathematics are not as intuitive to people when they first encounter them, let 
alone those who never see them and then marvel at results and have expectations.

The example I gave is NOW indeed on quite firm footing, but for quite a while 
was not.

What we have in this forum recently is people taking pot shots at aspects of 
Python where in a similar way, they know not what is actually happening and 
insist it be some other way. Some people also assume that an email message works 
any way they want and post things to a text-only group that others cannot see or 
become badly formatted or complain why a very large attachment makes a message 
be rejected. They also expect SPAM checkers to be perfect and never reject 
valid messages and so on.

Things are what they are, not what we wish them to be. And many kinds of pure 
mathematics live in a Platonic world and must be used with care. Calculus is 
NOT on a firm footing when any of the ideas in it are violated. A quantum 
Mechanical universe at a deep level does not have continuity so continuous 
functions may not really exist and there can be no such thing as an 
infinitesimal smaller than any epsilon and so on. Much of what we see at that 
level includes things like a probabilistic view of an electron cloud forming 
the probability that an electron (which is not a mathematical point) is at any 
moment at a particular location around an atom. But some like the p-orbital 
have a sort of 3-D figure eight shape (sort of a pair of teardrops) where there 
is a plane between the two halves with a mathematically zero probability of the 
electron ever being there. Yet, quantum tunneling effects let it cross through 
that plane without actually ever being in the plane, because various kinds of 
quantum jumps in a very wiggly space-time fabric can and will happen in a way 
normal mathematics may not predict or allow. 

Which brings me back to the python analogy of algorithms implemented that 
gradually zoom in on an answer you might view as a local maximum or minimum. It 
may be that with infinite precision calculations, you might zoom in ever closer 
to the optimal answer where the tangent to such a curve has slope zero. Your 
program would never halt though if the condition was that it be exactly at that 
point to an infinite number of decimal places. This is a place where I disagree 
with the view that being near the answer (or in this case being near zero) is 
not a good enough heuristic solution. There are many iterative problems (and 
recursive ones) where  a close-enough condition is adequate. Some libraries 
incorporated into languages like Python use an infinite series to calculate 
something like sin(x) and many other such things, including potentially e and 
pi and various roots. Many of them can safely stop after N significant digits 
are locked into place, and especially when all available significant digits 
are locked. Running them further gains nothing much. So code like:

(previous_estimate - current_estimate) == 0

may be a bad idea compared to something like:

abs(previous_estimate - current_estimate) < epsilon
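
As a minimal sketch of that stopping rule, here is Newton's method for a
square root with a made-up tolerance (newton_sqrt is just an illustrative
name):

def newton_sqrt(a, epsilon=1e-12):
    # Newton's method for sqrt(a), assuming a > 0: stop when successive
    # estimates agree to within epsilon rather than demanding equality.
    current = a
    while True:
        previous, current = current, 0.5 * (current + a / current)
        if abs(previous - current) < epsilon:
            return current

print(newton_sqrt(2.0))    # 1.4142135623730951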

No disrespect to mathematics intended. My understanding is that mathematics can 
only be used validly if all underlying axioms are assumed to be true. When (as 
in the real world or computer programs) some axioms are violated, watch out. 
Matrix multiplication does not have a symmetry so A*B in general is not the 
same as B*A and even worse, may be a matrix of a different dimension. A 4x2 
matrix and a 2x4 matrix can result in either a 2x2 or 4x4 for example. The 
violation of that rule may bother some people but is not really an issue as any 
mathematics that has an axiom for say an abelian group, simply is not expected 
to apply for a non-abelian case.
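
A minimal sketch of that shape behaviour, assuming numpy is available (it
is not part of the standard library):

import numpy as np    # third-party, assumed installed

a = np.ones((4, 2))
b = np.ones((2, 4))
print((a @ b).shape)    # (4, 4)
print((b @ a).shape)    # (2, 2)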




-Original Message-
From: Python-list  On 
Behalf Of Rob Cliffe via Python-list
Sent: Saturday, November 20, 2021 6:19 PM
To: 
Subject: Re: Unexpected behaviour of math.floor, round and int functions 
(rounding)



On 20/11/2021 22:59, Avi Gross via Python-list wrote:
> there are grey lines along the way where some mathematical proofs do 
> weird things like IGNORE parts of a calculation by suggesting they are 
> going to zero much faster than other parts and then wave a 
> mathematical wand about what happens when they approach a limit like 
> zero and voila, we just "proved" that the derivative of X**2 is 2*X or 
> the more general derivative of A*(X**N) is N*A*(X**(N-1)) and then 
> extend that to N being negative or fractional or a transcendental number and 
> beyond.
>
>
 You seem to be maligning mathematicians.
 What you say was true in the time of Newton, Leibniz and Bishop Berkeley, 
but analysis was made completely rigorous by the efforts of Weierstrass and 
others.  There are no "grey lines".  Proofs do not "suggest", 

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 12:56 PM Avi Gross via Python-list
 wrote:
>
> Not at all, Robb. I am not intending to demean Mathematicians as one of my 
> degrees is in that subject and I liked it. I mean that some things in 
> mathematics are not as intuitive to people when they first encounter them, 
> let alone those who never see them and then marvel at results and have 
> expectations.
>
> The example I gave, is NOW, indeed on quite firm footing but for quite a 
> while was not.
>
> What we have in this forum recently is people taking pot shots at aspects of 
> Python where in a similar way, they know not what is actually happening and 
> insist it be some other way. Some people also assume that an email message 
> work any way they want and post things to a text-only group that other cannot 
> see or become badly formatted or complain why a very large attachment makes a 
> message be rejected. They also expect SPAM checkers to be perfect and never 
> reject valid messages and so on.
>
> Things are what they are, not what we wish them to be. And many kinds of pure 
> mathematics live in a Platonic world and must be used with care. Calculus is 
> NOT on a firm footing when any of the ideas in it are violated. A quantum 
> Mechanical universe at a deep level does not have continuity so continuous 
> functions may not really exist and there can be no such thing as an 
> infinitesimal smaller than any epsilon and so on. Much of what we see at that 
> level includes things like a probabilistic view of an electron cloud forming 
> the probability that an electron (which is not a mathematical point) is at 
> any moment at a particular location around an atom. But some like the 
> p-orbital have a sort of 3-D figure eight shape (sort of a pair of teardrops) 
> where there is a plane between the two halves with a mathematically zero 
> probability of the electron ever being there. Yet, quantum tunneling effects 
> let it dross through that plane without actually ever being in the plane 
> because various kinds of
>  quantum jumps in a very wiggly space-time fabric can and will happen in a 
> way normal mathematics may not predict or allow.
>
> Which brings me back to the python analogy of algorithms implemented that 
> gradually zoom in on an answer you might view as a local maximum or minimum. 
> It may be that with infinite precision calculations, you might zoom in ever 
> closer to the optimal answer where the tangent to such a curve has slope 
> zero. Your program would never halt though if the condition was that it be 
> exactly at that point to an infinite number of decimal places. This is a 
> place I do not agree that the concept of being near the answer (or in this 
> case being near zero) is not a good enough heuristic solution. There are many 
> iterative problems (and recursive ones) where  a close-enough condition is 
> adequate. Some libraries incorporated into languages like Python use an 
> infinite series to calculate something like sin(x) and many other such 
> things, including potentially e and pi and various roots. Many of them can 
> safely stop after N significant digits are locked into place, and especially 
> when all available significant digits are locked. Running them further gains 
> nothing much. So code like:
>
> (previous_estimate - current_estimate) == 0
>
> may be a bad idea compared to something like:
>
> abs(previous_estimate - current_estimate) < epsilon
>
> No disrespect to mathematics intended. My understanding is that mathematics 
> can only be used validly if all underlying axioms are assumed to be true. 
> When (as in the real world or computer programs) some axioms are violated, 
> watch out. Matrix multiplication does not have a symmetry so A*B in general 
> is not the same as B*A and even worse, may be a matrix of a different 
> dimension. A 4x2 matrix and a 2x4 matrix can result in either a 2x2 or 4x4 
> for example. The violation of that rule may bother some people but is not 
> really an issue as any mathematics that has an axiom for say an abelian 
> group, simply is not expected to apply for a non-abelian case.
>

All of this is true, but utterly irrelevant to floating-point. If your
algorithm is inherently based on repeated estimates (Newton's method,
for instance), then you can iterate until you're "happy enough" with
the result. That's fine. But that is nothing whatsoever to do with the
complaint that 0.1+0.2!=0.3 or that you should "never use == with
floats" or any of those claims. It's as relevant as saying that my
ruler claims to be 30cm long but is actually nearly 310mm long, and
therefore the centimeter is an inherently unreliable unit and anything
measured in it should be treated as an estimate.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Rob Cliffe via Python-list




On 21/11/2021 01:02, Chris Angelico wrote:


If you have a number with a finite binary representation, you can
guarantee that it can be represented finitely in decimal too.
Infinitely repeating expansions come from denominators that are
coprime with the numeric base.



Not quite, e.g. 1/14 is a repeating decimal but 14 and 10 are not coprime.
I believe it is correct to say that infinitely recurring expansions 
occur when the denominator is divisible by a prime that does not divide 
the base.
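
A minimal sketch of that test (the helper name terminates is made up; the
rule is applied by stripping from the denominator every factor it shares
with the base):

from math import gcd

def terminates(num, den, base=10):
    # True if num/den has a finite expansion in the given base: reduce to
    # lowest terms, then strip every factor the denominator shares with
    # the base; anything left is a prime that does not divide the base.
    den //= gcd(num, den)
    g = gcd(den, base)
    while g > 1:
        while den % g == 0:
            den //= g
        g = gcd(den, base)
    return den == 1

print(terminates(1, 14))           # False: 7 divides 14 but not 10
print(terminates(7, 14))           # True: 7/14 reduces to 1/2
print(terminates(3, 8, base=2))    # True: finite in binary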

Rob Cliffe
--
https://mail.python.org/mailman/listinfo/python-list


RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
Chris,

You know I am going to fully agree with you that within some bounds, any 
combination of numbers that can accurately be represented will continue to be 
adequately represented under some operations like addition and subtraction and 
multiplication up to any point where they do not overflow (or underflow) the 
storage mechanism.

Division may be problematic and especially division by zero. 

But bring in any number that is not fully and accurately representable, and it 
can poison everything, much in the way that including an NA poisons any attempt to 
take a sum or mean. Any calculation that includes an e is an example.

Of course there is not much in computing that necessarily relies on 
representable numbers and especially not when the numbers are dynamically 
gotten as in from a file or user and already not quite what is representable. I 
can even imagine a situation where some fraction is represented in a double and 
then "copied" into a regular singular float and some of it is lost/truncated. 

I get your point about URL's but I was really focused at that point on 
filenames as an example on systems where they are not case sensitive. Some 
programming languages had a similar concept. Yes, you can have URL with more 
complex comparison functions needed including when something lengthens them or 
whatever. In one weird sense, as in you GET TO THE SAME PAGE, any URL that 
redirects you to another might be considered synonymous even if the two look 
nothing at all alike.

To continue, I do not mean to give the impression that comparing representable 
numbers with == is generally wrong. I am saying there are places where there 
may be good reasons for the alternative.

I can imagine an algorithm that starts with representable numbers and maybe at 
each stage continues to generate representable numbers, such as one of the hill 
climbing algorithms I am talking about. It may end up overshooting a bit past 
the peak and next round overshooting back to the other side and getting stuck 
in a loop. One way out is to keep track of past locations and abort when the 
cycle is seen to be repeating. Another is to leave when the result seems close 
enough. 
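
A minimal sketch of that keep-track-of-past-locations idea (everything
here, including the toy oscillating step, is made up for illustration):

def iterate_until_cycle(step, start, limit=1000):
    # Apply step repeatedly, but abort as soon as an estimate repeats.
    seen = set()
    x = start
    for _ in range(limit):
        if x in seen:
            return x    # been here before: stuck in a loop
        seen.add(x)
        x = step(x)
    return x

# A toy step that overshoots back and forth between two values.
print(iterate_until_cycle(lambda x: round(2.0 - x, 6), 0.4))    # 0.4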

However, my comments about over/underflow may apply here as enough iterations 
with representable numbers may at some point result in the kind of rounding 
error that warps the results of further calculations.

I note some of your argument is the valid difference between when your 
knowledge of the input numbers is uncertain and what the computer does with 
them. Yes, my measures of the height/width/depth may be uncertain and it is not 
the fault of a python program if it multiplies them to provide an exact answer 
as if in a mathematical world where numbers are normally precise. I am saying 
that the human using the program needs external info before they use the 
answer. In my example, I would note the rule that when dealing with numbers 
that are only significant to some number of digits, the final calculation 
should often be rounded down according to some rules. So instead of printing 
out the volume as 6181.806, the program may call some function like round() as 
in round(10.1*20.2*30.3, 1) so it displays 6181.8 instead. The Python language 
does what you ask and not what you do not ask. 

Now a statistical program or perhaps an AI or Machine Learning program I write, 
might actually care about the probabilistic effects. I often create graphs that 
include perhaps a smoothed curve of some kind that approximates the points in 
the data as well as a light gray ribbon that represents some kind of error 
bands above and below and which suggest the line not be taken too seriously and 
there may be something like a 95% chance the true values are within the gray 
zone and even some chance they may be beyond it in an even lighter series of 
gray (color is not the issue) zones representing a 1% chance or even less. 

Such approaches apply if the measurement errors are assumed to be as much as .1 
inches for each measure independently. The smallest volume would be (10.1 - 
0.1)*(20.2 - 0.1)*(30.3 - 0.1) = 6070.2

The largest possible volume if all my measures were off by that amount in the 
other direction would be:

(10.1 + 0.1)*(20.2 + 0.1)*(30.3 + 0.1) = 6294.6 

The above are truncated to one decimal place. 

The Python program evaluates all the above representable numbers perfectly, 
albeit I doubt they are all representable in binary. But for human purposes, 
the actual answer for a volume has some uncertainty built-in to the method of 
measurement and perhaps others such as the inner sides of the box may not be 
perfectly flat, or the angles things join at may not be precisely 90 degrees, and 
filling it with something like oranges may not fit much more if you enlarge 
it a tad, as they may not stack much better after a minor change.

Python is not to blame in these cases if not programmed well enough. And I 
suggest the often minor errors introduced by a r

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Chris Angelico
On Sun, Nov 21, 2021 at 1:20 PM Rob Cliffe via Python-list
 wrote:
>
>
>
> On 21/11/2021 01:02, Chris Angelico wrote:
> >
> > If you have a number with a finite binary representation, you can
> > guarantee that it can be represented finitely in decimal too.
> > Infinitely repeating expansions come from denominators that are
> > coprime with the numeric base.
> >
> >
> Not quite, e.g. 1/14 is a repeating decimal but 14 and 10 are not coprime.
> I believe it is correct to say that infinitely recurring expansions
> occur when the denominator is divisible by a prime that does not divide
> the base.
> Rob Cliffe

True, my bad. I can't remember if there's a term for that, but your
description is correct.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


RE: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Avi Gross via Python-list
Sorry Chris,

I was talking mathematically, where a number like pi or like 1/7 conceptually
needs an infinite number of digits, each added to a growing sum
of ever smaller powers of 10, in the decimal case.

In programming, and in the binary storage, the number of such is clearly
limited.

Is there any official limit on the maximum size of a python integer other
than available memory?

And replying sparsely, yes, pretty much nothing can be represented completely
in base e other than integral multiples of e, perhaps. No other numbers,
especially integers, can be linear combinations of e or of e raised to an
integral power.

Having said that, if you throw in another transcendental called pi and
expand to include the complex number i, then you can weirdly combine them
another way to make -1. I am sure you have seen equations like:

e**(pi*i) +1 = 0

By extension, you can make any integer by adding multiple such entities
together.
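
As a minimal sketch, evaluating that identity with ordinary floats via the
standard cmath module lands only near -1; the leftover imaginary part is
the usual binary rounding residue, not a flaw in the identity:

import cmath

z = cmath.exp(1j * cmath.pi)    # mathematically exactly -1
print(z)                        # roughly (-1+1.2e-16j)
print(z + 1)                    # roughly 1.2e-16j rather than exactly 0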

On another point, an indefinitely repeating continued fraction is sort of
similar to an indefinitely summed series. Both can exist and demonstrate a
regularity when the actual digits of the number seemingly show no patterns.



-Original Message-
From: Python-list  On
Behalf Of Chris Angelico
Sent: Saturday, November 20, 2021 8:03 PM
To: [email protected]
Subject: Re: Unexpected behaviour of math.floor, round and int functions
(rounding)

On Sun, Nov 21, 2021 at 11:39 AM Avi Gross via Python-list
 wrote:
>
> Can I suggest a way to look at it, Grant?
>
> In base 10, we represent all numbers as the (possibly infinite) sum of 
> ten raised to some integral power.

Not infinite. If you allow an infinite sequence of digits, you create
numerous paradoxes, not to mention the need for infinite storage.

> 123 is 3 times 1 (ten to the zero power) plus
> 2 times 10 (ten to the one power) plus
> 1 times 100 (ten to the two power)
>
> 123.456 just extends this with
> 4 times 1/10 (ten to the minus one power) plus
> 5 times 1/100 (10**-2) plus
> 6 time 1/1000 (10**-3)
>
> In binary, all the powers are not powers of 10 but powers of two.
>
> So IF you wrote something like 111 it means 1 times 1 plus 1 times 2 
> plus 1 times 4 or 7. A zero anywhere just skips a 2 to that power. If 
> you added a decimal point to make 111.111 the latter part would be 1/2 
> plus 1/4 plus 1/8 or 7/8 which combined might be 7 and 7/8. So any 
> fractions of the form something over 2**N can be made easily and 
> almost everything else cannot be made in finite stretches. How would you
make 2/3 or 3 /10?

Right, this is exactly how place value works.

> But the opposite is something true. In decimal, to make the above it 
> becomes
> 7.875 and to make other fractions of the kind, you need more and more 
> As it happens, all such base-2 compatible streams can be made because 
> each is in some sense a divide by two.
>
> 7/16 = 1/2 * .875 = .4375
> 7/32 = 1/2 * .4375 = .21875
>
> and so on. But this ability is a special case artifact caused by a 
> terminal digit 5 always being able to be divided in tow to make a 25 a 
> unit longer and then again and again. Note 2 and 5 are factors of 10.  
> In the more general case, this fails. In base 7, 3/7 is written easily 
> as 0.3 but the same fraction in decimal is a repeating copy of 
> .428571... which never terminates. A number like 3/7 + 4/49 + 5/343 
> generally cannot be written in base 7 but the opposite is also true 
> that only a approximation of numbers in base 2 or base 10 can ever be 
> written. I am, of course, talking about the part to the right of the 
> decimal. Integers to the left can be written in any base. It is fractional
parts that can end up being nonrepeating.

If you have a number with a finite binary representation, you can guarantee
that it can be represented finitely in decimal too.
Infinitely repeating expansions come from denominators that are coprime with
the numeric base.

> What about pi and e and the square root of 2? I suspect all of them 
> have an infinite sequence with no real repetition (over long enough 
> stretches) in any base! I mean an integer base, of course. The 
> constant e in base e is just 1.

More than "suspect". This has been proven. That's what transcendental means.

I don't think "base e" means the same thing that "base ten" does.
(Normally you'd talk about a base e *logarithm*, which is a completely
different concept.) But if you try to work with a transcendental base like
that, it would be impossible to represent any integer finitely.

(Side point: There are other representations that have different
implications about what repeats and what doesn't. For instance, the decimal
expansion for a square root doesn't repeat, but the continued fraction for
the same square root will. For instance, 7**0.5 is 2;1,1,1,4,1,1,1,4... with
an infinitely repeating four-element unit.)

> I think there have been attempts to use a decimal representation in 
> some accounting packages or database applications that allow any 
> d

Re: Unexpected behaviour of math.floor, round and int functions (rounding)

2021-11-20 Thread Greg Ewing

On 21/11/21 2:18 pm, Grant Edwards wrote:

My recollection is that it was quite common back in the days before FP
hardware was "a thing" on small computers. CPM and DOS compilers for
various languages often gave the user a choice between binary FP and
decimal (BCD) FP.


It's also very common for handheld calculators to work in decimal.
Most of HP's classic calculators used a CPU that was specifically
designed for doing BCD arithmetic, and many versions of it didn't
even have a way of doing arithmetic in binary!

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: pytest segfault, not with -v

2021-11-20 Thread Dieter Maurer
Marco Sulla wrote at 2021-11-20 19:07 +0100:
>I know how to check the refcounts, but I don't know how to check the
>memory usage, since it's not a program, it's a simple library. Is
>there not a way to check inside Python the memory usage? I have to use
>a bash script (I'm on Linux)?

If Python was compiled appropriately (with "PYMALLOC_DEBUG"), `sys` contains
the function `_debugmallocstats` which prints details
about Python's memory allocation and free lists.

I was not able to compile Python 2.7 in this way. But the (system) Python 3.6
of Ubuntu was compiled appropriately.


Note that memory leaks usually do not cause segfaults (unless the application
runs out of memory due to the leak).
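
For watching memory consumption from inside Python without a special build,
a minimal sketch with the standard `tracemalloc` module (the toy allocation
is only for illustration):

import tracemalloc

tracemalloc.start()
data = [list(range(1000)) for _ in range(100)]     # a toy allocation to measure
current, peak = tracemalloc.get_traced_memory()    # bytes traced since start()
print(f"current: {current / 1024:.1f} KiB, peak: {peak / 1024:.1f} KiB")
tracemalloc.stop()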

Your observation shows (apparently) non-deterministic behavior. In those cases,
minor differences (e.g. with/without "-v") can significantly change
the behavior (e.g. segfault or not). Memory management bugs (releasing memory
still in use) are a primary cause for this kind of behavior in Python
applications.
-- 
https://mail.python.org/mailman/listinfo/python-list