[Python-Dev] Snapshot formats in tracemalloc vs profiler

2019-06-27 Thread Yonatan Zunger
Hi everyone,

Something occurred to me while trying to analyze code today: profile and
cProfile emit their data in pstats format, which various tools and
libraries consume. tracemalloc, on the other hand, uses a completely
separate format which nonetheless contains similar data. In fact, in many
non-Python applications I've worked in, heap and CPU profiles were always
emitted in identical formats, which allowed things like visual
representations of stack traces where memory is allocated, and these have
proven quite useful in practice and allowed lots of sharing of tools across
many applications.

Is there a particular design reason why these formats are different in
Python right now? Would it make sense to consider allowing them to match,
e.g. having a tracemalloc.dump_pstats() method?
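
To make the comparison concrete, here is a minimal sketch of what each module hands back today. It uses only documented standard-library calls; the `work` function is a made-up workload for illustration, not something from this thread.

```python
import cProfile
import pstats
import tracemalloc


def work():
    # Hypothetical workload: allocate a hundred 1 kB buffers.
    return [bytes(1000) for _ in range(100)]


# CPU side: pstats keys each entry by (filename, lineno, funcname).
profiler = cProfile.Profile()
profiler.enable()
work()
profiler.disable()
cpu_stats = pstats.Stats(profiler).stats

# Heap side: tracemalloc groups live allocations by traceback.
tracemalloc.start(10)  # keep up to 10 frames per allocation
buffers = work()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()
heap_stats = snapshot.statistics("traceback")

for key, (cc, nc, tt, ct, callers) in list(cpu_stats.items())[:3]:
    print("cpu :", key, "calls:", nc, "cumulative seconds:", ct)
for stat in heap_stats[:3]:
    print("heap:", stat.traceback[0], "bytes:", stat.size, "blocks:", stat.count)
```

Both sides are keyed by source location, but the values attached to each key differ: timing and call counts on one side, byte and block counts on the other.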

Yonatan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3JFFWGJ57LQRZI3CVJXF5P7NYRCEWCJB/


[Python-Dev] Re: Snapshot formats in tracemalloc vs profiler

2019-06-27 Thread Yonatan Zunger
It's similar, but not quite the same -- I was just trying to see if I could
build a neatly Pythonic library to do the conversion. The CPU profilers are
basically building a dict from (filename, lineno, funcname) to a tuple
(from a comment in profile.py):

[0] = The number of times this function was called, not counting direct
      or indirect recursion,
[1] = Number of times this function appears on the stack, minus one
[2] = Total time spent internal to this function
[3] = Cumulative time that this function was present on the stack.  In
      non-recursive functions, this is the total execution time from start
      to finish of each invocation of a function, including time spent in
      all subfunctions.
[4] = A dictionary indicating for each function name, the number of times
      it was called by us.

pstats serializes this dict in a particular format which various other
tools can read, like gprof2dot. The challenge in translating is that
building (4), or handling recursion for (0) and (3), really requires
instrumentation at the CPU trace points as well, which would probably be a
good answer to my original question of why not. :)
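
A rough sketch of where a naive translation runs out of data. `snapshot_to_pstats_like` is a hypothetical helper, not an existing API, and the mapping of counts and sizes onto the pstats fields is purely an assumption made for illustration.

```python
import tracemalloc


def snapshot_to_pstats_like(snapshot):
    """Hypothetical helper: reshape a tracemalloc snapshot like a pstats table."""
    table = {}
    for stat in snapshot.statistics("lineno"):
        frame = stat.traceback[0]
        # tracemalloc frames carry no function name, so use a placeholder.
        key = (frame.filename, frame.lineno, "<unknown>")
        # Assumption: block counts stand in for call counts and byte counts
        # stand in for the two time fields. The callers dict stays empty
        # because a snapshot has no caller/callee edges to fill it with.
        table[key] = (stat.count, stat.count, stat.size, stat.size, {})
    return table


tracemalloc.start()
payload = [bytearray(2048) for _ in range(50)]
table = snapshot_to_pstats_like(tracemalloc.take_snapshot())
tracemalloc.stop()

for key, value in list(table.items())[:3]:
    print(key, value)
```

The placeholder function name and the empty callers dict are exactly the pieces that would need extra instrumentation to fill in properly.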

However, there are other profiling formats which are used outside the
Python community, have good tooling support, and could be much easier to
deal with; for example, there's the pprof format
<https://github.com/google/pprof/tree/master/proto>, which is almost
ludicrously versatile; it's meant for profiling both compiled and
interpreted languages, so it's very flexible as to what constitutes a
"line."

So if I have the time, and knowing that there's no intrinsic thing to fear
in all of this, I'll see if I can implement a pprof translator for
tracemalloc snapshots.


Although while I have you here, I do have a further question about how
tracemalloc works: If I'm reading the code correctly, traces get removed by
tracemalloc when objects are freed, which means that at equilibrium (e.g. at
the end of a function) the trace would show just the data which leaked.
That's very useful in most cases, but I'm trying to hunt down a situation
where memory usage is transiently spiking -- which might be due to
something being actively used, or to something building up and overwhelming
the GC, or to evil elves in the CPU for all I can tell so far. Would it be
completely insane for tracemalloc to have a mode where it either records
frees separately (e.g. as a malloc of negative space, at the trace where
the free is happening), or where it simply ignores frees altogether?
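
For reference, a minimal sketch of the behaviour described above, using only documented tracemalloc calls. The `spike` function is a made-up stand-in for a transient allocation. It suggests that the peak counter records that a spike happened, while a snapshot taken afterwards no longer says where it came from.

```python
import tracemalloc

tracemalloc.start()


def spike():
    # Transient allocation: roughly 5 MB that is freed on return.
    scratch = bytearray(5_000_000)
    return len(scratch)


spike()
current, peak = tracemalloc.get_traced_memory()
snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Only allocations that are still alive appear in the snapshot.
live_bytes = sum(stat.size for stat in snapshot.statistics("lineno"))
print(f"peak={peak} current={current} live_in_snapshot={live_bytes}")
```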

On Thu, Jun 27, 2019 at 3:08 PM Victor Stinner  wrote:

> Hi,
>
> I designed tracemalloc with Charles-François Natali in PEP 454. The
> API is a lightweight abstraction on top of the internal C structures
> used by the C _tracemalloc module which is designed to minimize the
> memory footprint.
>
> I'm not aware of the pstats format. Adding a new
> tracemalloc.dump_pstats() function looks like a good idea. Does pstats
> allow to attach arbitrary data to a traceback? The root structure of
> tracemalloc is basically the tuple (size: int, traceback) (trace_t
> structure in C).
>
> Victor
>
> Le jeu. 27 juin 2019 à 21:03, Yonatan Zunger  a écrit :
> >
> >
> > Hi everyone,
> >
> > Something occurred to me while trying to analyze code today: profile
> and cProfile emit their data in pstats format, which various tools and
> libraries consume. tracemalloc, on the other hand, uses a completely
> separate format which nonetheless contains similar data. In fact, in many
> non-Python applications I've worked in, heap and CPU profiles were always
> emitted in identical formats, which allowed things like visual
> representations of stack traces where memory is allocated, and these have
> proven quite useful in practice and allowed lots of sharing of tools across
> many applications.
> >
> > Is there a particular design reason why these formats are different in
> Python right now? Would it make sense to consider allowing them to match,
> e.g. having a tracemalloc.dump_pstats() method?
> >
> > Yonatan
>
>
>
> --
> Night gathers, and now my watch begins. It shall not end until my death.
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AFVTRDKP3HZHYQJJEZTQB3QHXELV2PVK/


[Python-Dev] Re: Snapshot formats in tracemalloc vs profiler

2019-06-27 Thread Yonatan Zunger
Well, then. I think I'm going to have some fun with this. :)

Thank you!

On Thu, Jun 27, 2019 at 4:17 PM Victor Stinner  wrote:

> Le ven. 28 juin 2019 à 01:03, Yonatan Zunger  a écrit :
> > Although while I have you here, I do have a further question about how
> tracemalloc works: If I'm reading the code correctly, traces get removed by
> tracemalloc when objects are freed, which means that at equilibrium (e.g. at
> the end of a function) the trace would show just the data which leaked.
> That's very useful in most cases, but I'm trying to hunt down a situation
> where memory usage is transiently spiking -- which might be due to
> something being actively used, or to something building up and overwhelming
> the GC, or to evil elves in the CPU for all I can tell so far. Would it be
> completely insane for tracemalloc to have a mode where it either records
> frees separately (e.g. as a malloc of negative space, at the trace where
> the free is happening), or where it simply ignores frees altogether?
>
> My very first implementation of tracemalloc produced a log of malloc
> and free calls. Problem: transferring the log from a slow set top box
> to a desktop computer was slow, and parsing the log was very slow.
> Parsing complexity is O(n), where n is the number of malloc or free
> calls, knowing that Python calls malloc(), realloc() or free()
> 270,000 times per second on average:
> https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
>
> tracemalloc is built on top of PEP 445 -- Add new APIs to customize
> Python memory allocators:
> https://www.python.org/dev/peps/pep-0445/
>
> Using these PEP 445 hooks, you should be able to do whatever you want
> on Python memory allocations and free :-)
>
> Example of toy project to inject memory allocation failures:
> https://github.com/vstinner/pyfailmalloc
>
> Victor
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5NC67BB24DK246UUVY4NYOXPLTUE564X/


[Python-Dev] Sanity check about ctypes

2022-01-05 Thread Yonatan Zunger
Hey everyone.

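Quick sanity check: The ctypes docs
<https://docs.python.org/3.10/library/ctypes.html#ctypes._CData> refer
to _CData as a non-public class which is in the module, but _ctypes.c doesn't
actually export it
<https://github.com/python/cpython/blob/main/Modules/_ctypes/_ctypes.c#L5680>.
(I discovered this because it turns out that typeshed *is* referencing
_CData, e.g. in its type annotations for RawIOBase
<https://github.com/python/typeshed/blob/master/stdlib/_typeshed/__init__.pyi#L190>
)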

Is this intended behavior in CPython (in which case the docs are a bit off
and typeshed has a bug), or is it unexpected to people on this list (in
which case it's an issue in _ctypes.c)?
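
A small sketch of the situation, assuming CPython. It looks the name up on the module without assuming the result, then finds the same class through the method resolution order of a public ctypes type.

```python
import ctypes

# Public lookup: may well be None, since the name is not exported.
direct = getattr(ctypes, "_CData", None)

# The class still exists: it sits in the MRO of every ctypes data type.
cdata_base = ctypes.c_int.__mro__[-2]

print("ctypes._CData via getattr:", direct)
print("base class found via MRO:", cdata_base.__name__)
print("Structure derives from it:", issubclass(ctypes.Structure, cdata_base))
```

Reaching the class through `__mro__` relies on an implementation detail, so it is shown here only to illustrate that the class exists at runtime.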

Yonatan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/7L6DNNI3MJ4UIM3C7A7KAIWHX562MRZL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Sanity check about ctypes

2022-01-06 Thread Yonatan Zunger
I think it makes good sense for the type-checking reason: _CData *does* declare
a fairly useful base interface that other classes also expose, so saying
that a function takes a _CData argument can make good sense (as a bunch of
the methods in the io library do, for example). typeshed hacks it for the
case where one is using .pyi files for type annotations, but there's no
great way to just do that in ordinary Python code.

(Really, I think it would be lovely to stick proper type signatures
directly into the base libraries like io, but that's a whole separate
conversation. I like strong typing, does it show?)

Yonatan

On Wed, Jan 5, 2022 at 5:15 PM Gregory P. Smith  wrote:

>
> On Wed, Jan 5, 2022 at 3:17 PM Yonatan Zunger  wrote:
>
>> Hey everyone.
>>
>> Quick sanity check: The ctypes docs
>> <https://docs.python.org/3.10/library/ctypes.html#ctypes._CData> refer
>> to _CData as a non-public class which is in the module, but _ctypes.c doesn't
>> actually export it
>> <https://github.com/python/cpython/blob/main/Modules/_ctypes/_ctypes.c#L5680>.
>> (I discovered this because it turns out that typeshed *is* referencing
>> _CData, e.g. in its type annotations for RawIOBase
>> <https://github.com/python/typeshed/blob/master/stdlib/_typeshed/__init__.pyi#L190>
>> )
>>
>> Is this intended behavior in CPython (in which case the docs are a bit
>> off and typeshed has a bug), or is it unexpected to people on this list (in
>> which case it's an issue in _ctypes.c)?
>>
>
> typeshed is presumably referring to itself. It defines an interface for
> ctypes._CData in
> https://github.com/python/typeshed/blob/master/stdlib/ctypes/__init__.pyi#L82
>
> The CPython ctypes docs *seem* reasonable to me. There is such a class.
> It is not public, so you cannot access ctypes._CData in any direct manner.
> That it gets called a class may be somewhat historical - its purpose is to
> provide a common interface. What code would ever actually care that it used
> class mechanisms as an internal implementation detail to do that?
>
> -gps
>
>
>>
>> Yonatan
>>
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RFEECGPCUIYAWYIC7C4EFDR4JAH7I4MP/


[Python-Dev] Exceptions and tracebacks and frames, oh my!

2022-07-14 Thread Yonatan Zunger
Hi everyone,

Apologies for pinging the dev list; I'm shaving a very hairy yak, and the
python-help crew, the docs, and the source are all unable to help.

Context: I'm working on a library function to dump stack traces for all
threads, sorta like _Py_DumpTracebackThreads but in Python and meant to be
invoked from a high layer, and thus be pretty and readable and so on rather
than signal-safe and robust. The hard part is combining data from thread
stack frames and exceptions, since people generally want to know about
these.
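
As a starting point, the per-thread half of this can be sketched with `sys._current_frames()` and `traceback.format_stack()`. `dump_all_stacks` and the `demo-worker` thread are hypothetical names used only for this illustration, and the sketch ignores exceptions entirely.

```python
import sys
import threading
import traceback


def dump_all_stacks():
    """Return {thread name: formatted stack} for every running thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    stacks = {}
    for ident, frame in sys._current_frames().items():
        name = names.get(ident, f"unknown-{ident}")
        stacks[name] = "".join(traceback.format_stack(frame))
    return stacks


started = threading.Event()
release = threading.Event()


def worker():
    started.set()
    release.wait()  # park here so the thread has a stack to dump


thread = threading.Thread(target=worker, name="demo-worker")
thread.start()
started.wait()
stacks = dump_all_stacks()
release.set()
thread.join()

print(stacks["demo-worker"])
```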

What I'm having trouble understanding is the relationship between an
exception's TB and the thread's frame. Two important differences I've
noticed are that (a) the output of traceback.print_tb(sys.exc_info()[2])
and traceback.print_stack(sys.exc_info()[2].tb_frame) are markedly
different -- the former gives data about where the exception was raised,
while the latter shows where the thread currently is, usually in an
exception handler; and (b) if I'm in a unittest and dump the stack for a
thread starting from its current frame (using traceback.format_stack and
sys._current_frames), it gives a very deep stack that includes all the
unittest harness code etc., but if I dump the stack for an exception
starting from its TB object, it instead is very short and focused on where
the exception happened.

All of which tells me that I don't understand the relationship between the
exception's traceback object and the frames it points to at all.
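
The two views can be reproduced in a few lines. In this sketch (function names are made up), the traceback chain is followed via `tb_next` from the handling frame toward the raise point, while the frame stack is followed via `f_back` from the handling frame toward its callers. The two walks start at the same frame and head in opposite directions.

```python
import traceback


def inner():
    raise ValueError("boom")


def middle():
    inner()


def outer():
    try:
        middle()
    except ValueError as exc:
        tb = exc.__traceback__
        # Walks tb -> tb.tb_next: handler frame down to the raise point.
        raised_at = [entry.name for entry in traceback.extract_tb(tb)]
        # Walks frame -> frame.f_back: handler frame up through its callers.
        handler_stack = [
            entry.name for entry in traceback.extract_stack(tb.tb_frame)
        ]
        return raised_at, handler_stack


raised_at, handler_stack = outer()
print("traceback view:", raised_at)
print("frame view:    ", handler_stack)
```

The first traceback entry and the last frame-stack entry are the same frame, which is why joining the two lists at that point gives one continuous picture.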

Is there somewhere that explains this particular bit of magic, or someone
who's well-versed in it? I have to admit that after a few hours of
spelunking the code, I can't even figure out how the exception traceback is
being set in the first place. :)

Again, apologies for bugging the dev list with what (I hope) is a simple
question -- if there's a FM I should be reading, please let me know!

Yonatan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IRPWIRAZ3K5Y2F466ZISGSF5JBF657CB/


[Python-Dev] Re: Snapshot formats in tracemalloc vs profiler

2019-08-14 Thread Yonatan Zunger
Update: Thanks to Victor's advice and the PEP445 hooks, I put together a
pretty comprehensive logging/sampling heap profiler for Python, and it
works great. The package is now available via pip
<https://pypi.org/project/heapprof/> for anyone who needs it!

On Thu, Jun 27, 2019 at 4:21 PM Yonatan Zunger  wrote:

> Well, then. I think I'm going to have some fun with this. :)
>
> Thank you!
>
> On Thu, Jun 27, 2019 at 4:17 PM Victor Stinner 
> wrote:
>
>> Le ven. 28 juin 2019 à 01:03, Yonatan Zunger  a écrit :
>> > Although while I have you here, I do have a further question about how
>> tracemalloc works: If I'm reading the code correctly, traces get removed by
>> tracemalloc when objects are freed, which means that at equilibrium (e.g. at
>> the end of a function) the trace would show just the data which leaked.
>> That's very useful in most cases, but I'm trying to hunt down a situation
>> where memory usage is transiently spiking -- which might be due to
>> something being actively used, or to something building up and overwhelming
>> the GC, or to evil elves in the CPU for all I can tell so far. Would it be
>> completely insane for tracemalloc to have a mode where it either records
>> frees separately (e.g. as a malloc of negative space, at the trace where
>> the free is happening), or where it simply ignores frees altogether?
>>
>> My very first implementation of tracemalloc produced a log of malloc
>> and free calls. Problem: transferring the log from a slow set top box
>> to a desktop computer was slow, and parsing the log was very slow.
>> Parsing complexity is O(n), where n is the number of malloc or free
>> calls, knowing that Python calls malloc(), realloc() or free()
>> 270,000 times per second on average:
>>
>> https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
>>
>> tracemalloc is built on top of PEP 445 -- Add new APIs to customize
>> Python memory allocators:
>> https://www.python.org/dev/peps/pep-0445/
>>
>> Using these PEP 445 hooks, you should be able to do whatever you want
>> on Python memory allocations and free :-)
>>
>> Example of toy project to inject memory allocation failures:
>> https://github.com/vstinner/pyfailmalloc
>>
>> Victor
>>
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IFWYZFSL5LTMJ4LRY2LICDGAMLR3FNTR/


[Python-Dev] Helpers for dynamic bytecode generation

2019-10-24 Thread Yonatan Zunger
Hi everyone,

I've found myself recently writing Python code that dynamically generates
bytecode.¹ I now have yet another case where I'm having to do this, in
which my nice situation of being able to easily precompute all the jump
addresses no longer holds. So I'm starting to write a helper to make it
easy to write bytecode from Python, with its basic API calls being
write(opcode, arg) and nextLine(optional label). The argument can be an
int, name, local name, constant, label, etc., depending on the opcode, and
it maintains all the appropriate tables and finally dumps a code object at
the end.

All of which is well and good and makes life much easier, but... I am *not*
looking forward to writing the logic that basically duplicates that of assemble()
in compile.c, of splitting all of this into basic blocks and computing the
correct jump positions and so on before finally dumping out the bytecode.
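
To show why the jump bookkeeping is the painful part, this sketch uses the standard `dis` module to list the offsets that other instructions jump to in an ordinary compiled function. The `sample` function is made up for the example, and the exact opcodes and offsets vary between CPython versions.

```python
import dis


def sample(values):
    total = 0
    for value in values:
        if value > 0:
            total += value
    return total


instructions = list(dis.get_instructions(sample))
# Offsets that some other instruction jumps to.
targets = [ins.offset for ins in instructions if ins.is_jump_target]

for ins in instructions:
    marker = ">>" if ins.is_jump_target else "  "
    print(marker, ins.offset, ins.opname, ins.argrepr)
print("jump targets:", targets)
```

Because targets are expressed as positions in the instruction stream, inserting or resizing a single instruction can shift every jump that crosses it, which is what a label-based writer has to resolve at the end.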

Has anyone already done this that people know of? (Searching the
Internetz didn't turn anything up) Failing that, to what extent is it
reasonable to either consider assemble() as some kind of sane API point
into compile.c, and/or add some new API in compile.h which implements all
of the stuff described above in C?

(I'm fully expecting the answer to these latter questions to be "you have
got to be kidding me," but figured it was wiser to check than to reinvent
this particular wheel if it isn't necessary)

Yonatan



¹ Not out of masochism, in case you're wondering; there was a real use
case. A storage system would receive a read request that specified a bunch
of (key, condition) pairs, where conditions were either return any value,
return an exact value, or return values in a range. It would then receive
between 1 and 1M (depending on the request parameters) candidate cells from
the underlying storage layers, each of which had a tuple of bytes as its
actual key values; it had to compare each of those tuples against the
request parameters, and yield the values which matched. Because it's an
inner loop and can easily be called 1M times, doing this in pure Python
slows things down by a lot. Because it's also only called once, doing some
really expensive overhead like synthesizing Python code and calling
compile() on it would also slow things down a lot. But converting a bunch
of (key, condition) pairs to a really efficient function from tuples of
bytes to bools was pretty easy.
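
To illustrate the shape of the problem only: the sketch below builds the same kind of tuple-of-bytes predicate with closures, not with generated bytecode, so it does not have the performance profile the footnote is after. The condition encoding (`("any",)`, `("eq", value)`, `("range", low, high)`) and the inclusive range bounds are assumptions invented for this example.

```python
def make_matcher(conditions):
    """Build a predicate from hypothetical per-position condition specs."""
    checks = []
    for index, spec in enumerate(conditions):
        kind = spec[0]
        if kind == "any":
            continue  # nothing to test for this key position
        if kind == "eq":
            checks.append(lambda key, i=index, v=spec[1]: key[i] == v)
        elif kind == "range":
            checks.append(
                lambda key, i=index, lo=spec[1], hi=spec[2]: lo <= key[i] <= hi
            )
        else:
            raise ValueError(f"unknown condition kind: {kind!r}")

    def matches(key):
        # A cell matches only if every non-trivial condition holds.
        return all(check(key) for check in checks)

    return matches


matcher = make_matcher([("eq", b"user"), ("any",), ("range", b"a", b"m")])
cells = [(b"user", b"x", b"c"), (b"user", b"y", b"z"), (b"admin", b"x", b"c")]
matched = [cell for cell in cells if matcher(cell)]
print(matched)
```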
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/E57DTM65LFEROFZLHKRV442JPPFAWNJU/


[Python-Dev] Re: Helpers for dynamic bytecode generation

2019-10-25 Thread Yonatan Zunger
@jjevnik -- Oops! I meant to send it there and sent it here by mistake.

But thank you, everyone, for the recommendations! I'll check them all out
and (mercifully) will probably not try to write this from scratch.

On Thu, Oct 24, 2019 at 11:05 PM Brandt Bucher 
wrote:

> Interesting that you bring this up. Just earlier this week I published the
> first version of a new package that lets you write compiled bytecode
> instructions inline with pure-Python syntax. The code's still a bit messy,
> being only a week old and all, but it works as advertised for CPython 3.6.2
> through 3.9.0a0, and even includes neat features like labeled jumps, unused
> name/constant removal, stack size adjustments, etc... Perhaps it'll be
> useful to you (or at least you'll find it interesting):
> https://github.com/brandtbucher/hax
>
> Victor Stinner's Bytecode package (already mentioned) is surely better
> for *dynamic* generation... I've never used it personally, but it looks
> great.
>
> Definitely not for the faint of heart, though! ;)
>
> Brandt
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/PAQUZZ4MHZ6TJFIBNCPZGZBW6DOZFUJG/
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YOQT3QSXUY3TI3FXOSHLRY24GJBJHLWG/


[Python-Dev] Adding a scarier warning to object.__del__?

2019-12-31 Thread Yonatan Zunger
Hey everyone,

I just encountered yet another reason to beware of __del__: when it's
called during interpreter shutdown, for reasons which are kind of obvious
in retrospect, if it calls notify() on a threading.Condition, the waiting
thread may or may not ever actually receive it, and so if it does that and
then tries to join() the thread the interpreter may hang in a hard-to-debug
way.

This isn't something that can reasonably be fixed, and (like in most cases
involving __del__) there's a very simple fix of using weakref.finalize
instead. My question for the dev list: How would people feel about changing
the documentation for the method to more bluntly warn people against using
it, and refer them to weakref.finalize
and/or atexit.register as an alternative? The text already has an undertone
of "lasciate ogni speranza, voi ch'entrate" but it may be helpful to be
more explicit to avoid people getting caught in unexpected pitfalls.
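
A minimal sketch of the weakref.finalize pattern referred to above. The `Connection` class and the `closed` list are hypothetical stand-ins for a real resource and its cleanup. The finalizer doubles as an explicit `close()` that is safe to call more than once.

```python
import weakref

closed = []


class Connection:
    def __init__(self, name):
        self.name = name
        # The callback must not reference self, or the object never dies.
        self._finalizer = weakref.finalize(self, closed.append, name)

    def close(self):
        # Safe to call repeatedly: a finalizer runs at most once.
        self._finalizer()


conn = Connection("demo")
conn.close()
conn.close()
print(closed, conn._finalizer.alive)
```

Passing the cleanup data as arguments, not through `self`, is what lets the cleanup run without the half-destroyed object being involved.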

Yonatan
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/AAZQBWD6PHC4PVNCCPX4A2745SS7B3LS/


[Python-Dev] Re: PEP 558: Defined semantics for locals() (December 2019 edition)

2019-12-31 Thread Yonatan Zunger
Thanks for writing this up, Nick!

My main question is about the remaining difference between semantics at the
class/module versus function level: is it worth the additional cognitive
complexity to have the class/module behavior be different from the function
behavior?

The "mutable" class/module behavior is really nice for debuggers, of
course, but when you want that you really want it at *all* scopes, so
having it available only some of the time feels like an unpredictable tool.
I could imagine having a mode where everything (including function scopes)
is required to permit mutability, but that would (a) be a lot of work (as
you pointed out in the design discussion) and (b) would cause "debug mode"
binaries to have to work so differently from "opt mode" ones that debugging
would get harder for other reasons.

I'm not sure that it *isn't* worth it to have this extra flexibility at the
higher scope levels, but it isn't obvious to me that it is, and if that
could lead to some code simplification, I wouldn't be sad!
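
For readers who have not run into the difference in question, a short sketch of the CPython behaviour at each scope. The names are invented for the example, and the module-scope case is emulated with `exec` on a single namespace so that it does not depend on how the snippet itself is run.

```python
def function_scope():
    x = 1
    locals()["x"] = 2  # writes to a snapshot, not to the fast local
    return x


class ClassScope:
    locals()["injected"] = 42  # the class namespace really is this mapping


module_ns = {}
exec("same = locals() is globals()", module_ns)  # emulate module scope

print("function:", function_scope())
print("class:   ", ClassScope.injected)
print("module:  ", module_ns["same"])
```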

Yonatan

On Mon, Dec 30, 2019 at 10:47 PM Nick Coghlan  wrote:

> Hi folks,
>
> I've finally updated PEP 558 and its reference implementation based
> on Nathaniel's feedback back in May.
>
> The latest version of the PEP can now be found at
> https://www.python.org/dev/peps/pep-0558/, and I've created a
> discussion thread on Discourse:
> https://discuss.python.org/t/pep-558-defined-semantics-for-locals/2936
>
> The latest version implements Nathaniel's "independent snapshot"
> proposal, and I like how that has turned out. The one thing that
> changed from the May discussion thread is that the refcount semantics
> of PyEval_GetLocals() (it returns a borrowed reference) meant that it
> had to keep the old behaviour of returning a reference to the internal
> dynamically updated shared "snapshot" at function scope, with a new
> API, PyEval_GetPyLocals(), providing the C equivalent of the locals()
> builtin.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/GHN2GYJQRU77EZBXX4SQUZ5XEMEONSFL/
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/NH2ILSTNK5KGHAXPT7FUPBQSRJBKQKMW/


[Python-Dev] Re: PEP 558: Defined semantics for locals() (December 2019 edition)

2020-01-01 Thread Yonatan Zunger
Makes sense. Thanks for the clarification!

On Wed, Jan 1, 2020 at 1:30 AM Nick Coghlan  wrote:

> On Wed, 1 Jan 2020 at 10:42, Yonatan Zunger  wrote:
> >
> > Thanks for writing this up, Nick!
> >
> > My main question is about the remaining difference between semantics at
> the class/module versus function level: is it worth the additional
> cognitive complexity to have the class/module behavior be different from
> the function behavior?
>
> Mutating values through locals() has long worked at class and module
> scope, and folks rely on that often enough that we don't want to break
> their code. Correctly supporting mutation via locals() also falls
> naturally out of the runtime semantics of those scopes (at module
> scope, locals() returns the same namespace as globals() does, while at
> class scope it returns the same namespace as will eventually be passed
> to the metaclass to create the new class object).
>
> By contrast, mutation via locals() at function scope hasn't worked
> consistently since fast locals were introduced, and that predates even
> statically nested scopes (I went trawling through the repo history to
> try to work out when they were first added, and made it as far as
> "some time before 1997, since [1] changed how they worked to make them
> faster, but was replacing an even older list-based mechanism").
>
> Cheers,
> Nick.
>
> [1]
> https://github.com/python/cpython/commit/f3e85a0356e679ed9ff8d13236ff8e9f77a9fd0b#diff-f1de0b7194da5b0cdf48379eca8678d7
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/S3MOFZMX6NWUCNL33ROALUNJDGN3YQ5H/


[Python-Dev] Re: Adding a scarier warning to object.__del__?

2020-01-01 Thread Yonatan Zunger
Oh, I'm absolutely thinking about clarity. Something like:



This method is called when the instance is about to be destroyed. Because
it may be called by either the ordinary Python execution flow or by the
garbage collector, it has very unusual semantics, and must be treated with
great care. In almost all cases, you should instead use either
weakref.finalize (if you want logic to be called when an object is no
longer in use) or atexit.register (if you want to clean up before the
interpreter exits).

The precise semantics of when __del__ is called on an object are
implementation-dependent. For example:
* It might be invoked during the normal interpreter flow at a moment like
function return, if the reference count for an object has dropped to zero;
* It might be invoked by the garbage collector, which happens during the
normal interpreter flow but is called from an arbitrary thread;
* It might be invoked during interpreter shutdown, after various global
variables (including modules!) have already been deleted, or other threads'
interpreters have already stopped.
* It may not be invoked at all; it is *not guaranteed* that __del__()
methods are called for objects that still exist when the interpreter exits.

In particular, this means that, unless your program is structured so that
you can deterministically control when an object will be destroyed,
implementations of __del__():
* Must not assume that any variables outside self exist, including modules
or global variables;
* Must not attempt to acquire locks or otherwise block, since they may
(e.g.) be invoked inside a thread which already holds those resources;
* Must not rely on being able to signal other threads and wait for their
responses (even thread joins), since those threads' interpreters may have
exited;
* Must not cause failures (e.g. leaving external devices in an invalid
state) if they are *never* invoked.

Note that del x doesn’t directly call x.__del__() — the former decrements
the reference count for x by one, and the latter is only called when x’s
reference count reaches zero. Depending on the implementation, it is
possible for a reference cycle to prevent the reference count of an object
from going to zero. (e.g., in CPython, a common cause of reference cycles
is when an exception is caught and stored in a local variable; the
exception contains a reference to the traceback, which in turn references
the locals of all frames caught in the traceback.) In this case, the cycle
will be later detected and deleted by the cyclic garbage collector.
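
The distinction between `del x` and `x.__del__()` can be checked directly. In this sketch the `Tracked` class and the `calls` list are made up for illustration. The `gc.collect()` call is there so that the result does not depend on reference counting alone.

```python
import gc

calls = []


class Tracked:
    def __del__(self):
        calls.append("finalized")


first = Tracked()
second = first            # two references to one object
del first                 # drops one reference; __del__ does not run
after_first_del = list(calls)
del second                # last reference gone
gc.collect()              # for implementations without reference counting
after_second_del = list(calls)

print(after_first_del, after_second_del)
```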

If a base class has a __del__() method, the derived class’s __del__()
method, if any, must explicitly call it to ensure proper deletion of the
base class part of the instance.

It is possible (though not recommended!) for the __del__() method to
postpone destruction of the instance by creating a new reference to it.
This is called object resurrection. It is implementation-dependent whether
__del__() is called a second time when a resurrected object is about to be
destroyed; the current CPython implementation only calls it once.



On Wed, Jan 1, 2020 at 12:57 PM Andrew Svetlov 
wrote:

> If the warning text tells about the delayed execution -- I'm +1.
> -1 for something like "just don't use __del__, it is dangerous".
>
> On Wed, Jan 1, 2020, 21:51 Gregory P. Smith  wrote:
>
>>
>>
>> On Wed, Jan 1, 2020 at 6:40 AM Andrew Svetlov 
>> wrote:
>>
>>> __del__ is very useful not for interpreter shutdown but as a regular
>>> destructor in object's lifecycle.
>>>
>>
>> The reason we should warn people against ever implementing __del__ is
>> that people rarely actually understand object lifecycle.  Its presence
>> guarantees delayed garbage collection and that its code will execute
>> in a context way outside of normal control flow that all other methods are
>> invoked from.
>>
>> -gps
>>
>>
>>>
>>> Action on the shutdown is another beast.
>>> Personally, I prefer to do all finalization works by explicit calls
>>> instead.
>>>
>>> On Wed, Jan 1, 2020 at 2:39 AM Yonatan Zunger  wrote:
>>> >
>>> > Hey everyone,
>>> >
>>> > I just encountered yet another reason to beware of __del__: when it's
>>> called during interpreter shutdown, for reasons which are kind of obvious
>>> in retrospect, if it calls notify() on a threading.Condition, the waiting
>>> thread may or may not ever actually receive it, and so if it does that and
>>> then tries to join() the thread the interpreter may hang in a hard-to-debug
>>> way.
>>> >
>>> > This isn't something that can reasonably be fixed, and (like in most
>>> cases involving __del__) there's a very simple fix of

[Python-Dev] Re: Adding a scarier warning to object.__del__?

2020-01-02 Thread Yonatan Zunger
weakref.finalize always executes during the ordinary Python flow, IIUC --
i.e., it happens before the interpreter starts shutting down. I guess it does still
have the "arbitrary thread" limitation -- which is a good point, I may have
some code bugs of my own to fix. But it's a huge difference in terms of
predictability of the interpreter state, and in particular both the issues
around deleted modules and the ones around threading no longer working
don't apply to it.

atexit is a pretty corner-case thing -- I agree that it shouldn't be the
common thing at all, and the recommendation should be specifically for
those use cases where it makes sense. FD's are a classic thing to close
with weakref.finalize.
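
Roughly what I have in mind, as a sketch only (FdHolder and fd_is_open are hypothetical names; weakref.finalize, os.open and os.close are the real stdlib calls). Note that the callback is handed the fd rather than self, so it doesn't keep the object alive:

```python
import os
import tempfile
import weakref

class FdHolder:
    """Hypothetical wrapper that owns a raw file descriptor."""
    def __init__(self, path):
        self.fd = os.open(path, os.O_RDONLY)
        # The callback receives the fd, not self, so it holds no reference
        # to the object it is cleaning up after.
        self._finalizer = weakref.finalize(self, os.close, self.fd)

    def close(self):
        # Explicit close; a finalizer only ever runs its callback once.
        self._finalizer()

def fd_is_open(fd):
    try:
        os.fstat(fd)
        return True
    except OSError:
        return False

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    path = tmp.name

holder = FdHolder(path)
fd = holder.fd
open_before = fd_is_open(fd)
del holder                   # CPython: object collected, finalizer closes the fd
open_after = fd_is_open(fd)
os.unlink(path)
```

By default weakref.finalize also registers itself to run at interpreter exit if the object is still alive by then.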

On Thu, Jan 2, 2020 at 7:47 AM Andrew Svetlov 
wrote:

> I would say that the "recommended" weakref.finalize() shares very many
> limitations of __del__(), that's why hard to buy it.
> atexit.register() is not a common thing, the recommendation of using
> atexit for file descriptor closing *in general* looks weird, while it
> can be a good solution in some particular case.
>
> On Thu, Jan 2, 2020 at 1:05 PM Armin Rigo  wrote:
> >
> > Hi,
> >
> > On Thu, 2 Jan 2020 at 03:59, Yonatan Zunger  wrote:
> > > It is possible (though not recommended!) for the __del__() method to
> postpone destruction of the instance by creating a new reference to it.
> This is called object resurrection. It is implementation-dependent whether
> __del__() is called a second time when a resurrected object is about to be
> destroyed; the current CPython implementation only calls it once.
> >
> > "...in most cases."
> >
> >
> > Armin Rigo
>
>
>
> --
> Thanks,
> Andrew Svetlov
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FIM4PII72PCUZ7WVZBGCPHRNXYRH7WIX/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Adding a scarier warning to object.__del__?

2020-01-03 Thread Yonatan Zunger
Jeff, I like all of your proposals. It does get a bit long, but I think
it's not so long that it feels out-of-place in the data model docs, which
is where you expect weird stuff to live. (I mean, look how long the
discussion of metaclasses is -- and I certainly wouldn't want to eliminate
that!)

On Fri, Jan 3, 2020 at 3:56 AM Jeff Allen  wrote:

> On 02/01/2020 02:53, Yonatan Zunger wrote:
>
> Oh, I'm absolutely thinking about clarity. ...
>
> Could any revision also be clear what is *required of Python the language*
> vs. what is a CPython implementation detail? I always appreciate this care.
> There is good practice here and elsewhere in the existing documentation,
> but drift is easy for those steeped in CPython implementation. In the
> present case, it's a matter of avoiding an explicit requirement for a
> reference-counting approach to lifecycle management. Here are the places in
> the proposal where a change could achieve that.
>
>
> The precise semantics of when __del__ is called on an object are
> implementation-dependent. For example:
> * It might be invoked during the normal interpreter flow at a moment like
> function return, ...
>
> We should continue here "... immediately the object is no longer
> referenced;"
>
> (It might not be called immediately, but that's implied by your
> implementation-dependent "might be".)
>
> Note that del x doesn’t directly call x.__del__() — the former decrements
> the reference count for x by one, and the latter is only called when x’s
> reference count reaches zero. Depending on the implementation, it is
> possible for a reference cycle to prevent the reference count of an object
> from going to zero. (e.g., in CPython, a common cause of reference cycles
> is when an exception is caught and stored in a local variable; the
> exception contains a reference to the traceback, which in turn references
> the locals of all frames caught in the traceback.) In this case, the cycle
> will be later detected and deleted by the cyclic garbage collector.
>
> I realise that most of this paragraph is existing text rearranged, and
> currently it fails to make the distinction I'm looking for in the "note"
> part. But it is clear in the next paragraph. I think it better to say,
> closer to the current text:
>
> """Note:: ``del x`` does not call ``x.__del__()`` directly. After ``del
> x``, variable ``x`` is undefined (or unbound). If the object it referenced
> is now no longer referenced at all,  that object's ``__del__()`` might be
> called, immediately or later, subject to the caveats already given.
>
> *CPython implementation detail:* ``del x`` decrements the reference count
> of the object by one, and if that makes it zero, ``x.__del__()`` will be
> called immediately. It is possible for a reference cycle to prevent the
> reference count of any object in it from going to zero. A common cause of
> reference cycles is when an exception is caught and stored in a local
> variable; the exception contains a reference to the traceback, which in
> turn references the locals of all frames caught in the traceback. In this
> case, the cycle will be detected later and its objects deleted by the
> cyclic garbage collector."""
>
>
> If a base class has a __del__() method, the derived class’s __del__()
> method, if any, must explicitly call it to ensure proper deletion of the
> base class part of the instance.
>
> Possibly this thought belongs with the "implementations of __del__(): *
> Must ..." paragraph.
>
> But also, while I think there is scope for a better guidance, this is
> getting a bit long. Should there be a "HOW TO write a __del__ method (and
> how to avoid it)" to contain the advisory points being made? In-lining
> advice here, on how to survive the infernal circles of __del__, dilutes the
> scariness of the warning not to enter at all.
>
> ---
>
> Jeff Allen
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/A32QEUP4R6XTY5LQW56LKWJ3XBUZCHOR/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5FO4CRYFPJGVZJDYYXMUQBTLVBZ5DMVN/


[Python-Dev] Intended invariants for signals in CPython

2020-06-24 Thread Yonatan Zunger via Python-Dev
Hi everyone,

I'm in the process of writing some code to defer signals during critical
regions, which has involved a good deal of reading through the CPython
implementation to understand the behaviors. Something I've found is that
there appears to be a lot of thoughtfulness about where the signal handlers
can be triggered, but this thoughtfulness is largely undocumented. I've put
together a working list of behaviors from staring at the code, but what I'd
like to figure out is which of these behaviors the devs think of as
intended to be invariants, versus which are just accidents of how the code
currently works and might change unpredictably.

And if there are things which are intended to be genuine invariants, would
it be reasonable to document these formally and make them part of the
language, not just for inside the CPython codebase?

What appears to be true is this:

   - Signal handlers are only invoked in the main thread (documented with
   the signal library)
   - High-level: Signal handlers may be invoked at any instruction
   boundary. External C libraries *may* invoke them as well, but there are
   no general guarantees. (Documented with the signal library)
   - Low-level: Certain functions can be described as "interruptable," and
   signal handlers may be invoked whenever these functions are called.
   - Signal handlers are thus partially reentrant: a signal handler may be
   interrupted by another signal iff it invokes an interruptable function.
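
The first bullet can be checked with a small experiment. This is only a sketch: the handler and the `seen` list are made up, and SIGINT is used purely because it exists on every platform. The signal is raised from a worker thread, yet the Python-level handler runs in the main thread:

```python
import signal
import threading
import time

seen = []   # names of the threads in which the Python-level handler ran

def handler(signum, frame):
    seen.append(threading.current_thread().name)

previous = signal.signal(signal.SIGINT, handler)   # only allowed in the main thread
try:
    # Deliver the signal from a worker thread.
    worker = threading.Thread(target=signal.raise_signal,
                              args=(signal.SIGINT,), name="worker")
    worker.start()
    worker.join()
    # Give the main thread a few instruction boundaries to notice the signal.
    deadline = time.monotonic() + 2.0
    while not seen and time.monotonic() < deadline:
        time.sleep(0.01)
finally:
    signal.signal(signal.SIGINT,
                  previous if previous is not None else signal.SIG_DFL)
```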

In particular, the thing whose intentionality I'm not sure about is whether
the notion of an interruptable function or instruction is meant to be an
actual property of the language and/or of the CPython runtime, or whether
it's actually intended that only the "high-level" rule above be true, and
that all signal handlers should be considered to be fully reentrant at all
times. The comments in sysmodule.c about avoiding triggering
PyErr_CheckSignals() suggest that there definitely is some thinking about
this within the CPython code itself.

The reason it would be useful to document this is so that if I'm trying to
write a fairly generic library that handles signals (like the one I'm doing
now) I can reason about where I need to be defensive about an instruction
being interrupted by yet another signal, and maybe avoid calls to certain
functions which are known to be interruptable, much like I would avoid
calling malloc() in a C signal handler.

In the current implementation, the interruptable functions and instructions
are:

Big categories:

   - Any function which calls PyErr_SetFromErrno, *if* errno == EINTR.
   (Catalogue needs to be made of these -- it's a much smaller set than the
   set of all calls to PyErr_SetFromErrno)
   - Basically any open, read, or write method of a raw or buffered file
   object.
   - Likewise, any open, read, or write method on a socket.
   - In any interactive console readline, or in input().
   - object.__str__, object.__repr__, and PyObject_Print, and anything that
   falls back to these.

Specific instructions:

   -
   - Multiplication, division, or stringification of long integers.

More specific functions:

   - In `multiprocessing.shared_memory`, SharedMemory.__init__, .close, and
   .unlink.
   - In `multiprocessing.semaphore`, Semaphore.acquire. (But interestingly,
   *not* threading.Semaphore.acquire)
   - In `signal`, pause, signal, sigwaitinfo, sigtimedwait, pthread_kill,
   and pthread_sigmask.
   - In `fcntl`, fcntl and ioctl.
   - In `traceback`, any of the print methods.
   - In `faulthandler`, dump_traceback
   - In `select`, all of the methods. (select, epoll, etc)
   - In `time`, sleep.
   - In `curses`, whenever you look for key input.
   - In `tkinter`, during the main loop of a Tcl/Tk app.
   - During an SSL handshake.

-- 

Yonatan Zunger

Distinguished Engineer and Chief Ethics Officer

He / Him

zun...@humu.com

100 View St, Suite 101

Mountain View, CA 94041

Humu.com <https://www.humu.com>  · LinkedIn
<https://www.linkedin.com/company/humuhq>  · Twitter
<https://twitter.com/humuinc>
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W5LGEEWGGO7ODIAJXM54YSI2PZR5UO6Y/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-24 Thread Yonatan Zunger via Python-Dev
... Reading through more of the code, I realized that I greatly
underestimated the number of interruptible operations.

That said, the meta-question still applies: Are there things which are
generally intended *not* to be interruptible by signals, and if so, is
there some consistent way of indicating this?


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
I'm taking it from this thread that suppressing signals in a small window
is not something anyone in their right mind would really want to attempt.
:) (Or that if they did, it would have to be through a proper change to the
runtime, not something higher-level)

On Thu, Jun 25, 2020 at 7:14 AM Antoine Pitrou  wrote:

>
> Le 25/06/2020 à 16:00, Guido van Rossum a écrit :
> > On Thu, Jun 25, 2020 at 02:02 Antoine Pitrou wrote:
> >
> > ...  The intent, though, is that any function
> > waiting on an external event (this can be a timer, a socket, a
> > lock, a directory...) should be interruptible so that Ctrl-C works in
> > an interactive prompt.
> >
> > That’s not really true though right? Locks can block the REPL.
>
> On POSIX they don't.  On Windows it's a long-standing bug:
> https://bugs.python.org/issue29971
>
> Regards
>
> Antoine.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/4TNEA5KNWCYTJVIPISUZKVXVDK2BQJWT/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BYZD3YJ52BGHIGS7M5IAWO3MJYLWAVAD/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
Also, just to sanity-check that I understand things correctly: Python
signal handlers *are* reentrant, in that a signal handler can be
interrupted by another signal, is that right? Is there any general
recommendation on how to write signal handlers in order to manage that?

(Antoine, I *so* wish I could be doing less with signals and signal
handlers right now. Alas, I have a combination of a SIGTERM-happy runtime
environment and a long-story situation involving wacky multiprocessing to
avoid issues in someone else's C library that make that impossible. So
instead I'm trying to write a general library to help simplify the task,
and so thinking about a lot of slightly nutty corner cases...)

On Thu, Jun 25, 2020 at 10:33 AM Yonatan Zunger  wrote:

> I'm taking it from this thread that suppressing signals in a small window
> is not something anyone in their right mind would really want to attempt.
> :) (Or that if they did, it would have to be through a proper change to the
> runtime, not something higher-level)
>
> On Thu, Jun 25, 2020 at 7:14 AM Antoine Pitrou  wrote:
>
>>
>> Le 25/06/2020 à 16:00, Guido van Rossum a écrit :
>> > On Thu, Jun 25, 2020 at 02:02 Antoine Pitrou wrote:
>> >
>> > ...  The intent, though, is that any function
>> > waiting on an external event (this can be a timer, a socket, a
>> > lock, a directory...) should be interruptible so that Ctrl-C works in
>> > an interactive prompt.
>> >
>> > That’s not really true though right? Locks can block the REPL.
>>
>> On POSIX they don't.  On Windows it's a long-standing bug:
>> https://bugs.python.org/issue29971
>>
>> Regards
>>
>> Antoine.
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/4TNEA5KNWCYTJVIPISUZKVXVDK2BQJWT/
>> Code of Conduct: http://python.org/psf/codeofconduct/
>>

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZSR73MQWMFUXFOADDMEG5JBXOJSZ232Y/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
I had not -- thank you!

On Thu, Jun 25, 2020 at 1:49 PM Chris Jerdonek 
wrote:

> On Wed, Jun 24, 2020 at 5:15 PM Yonatan Zunger via Python-Dev <
> python-dev@python.org> wrote:
>
>> That said, the meta-question still applies: Are there things which are
>> generally intended *not* to be interruptible by signals, and if so, is
>> there some consistent way of indicating this?
>>
>
> Yonatan, Nathaniel Smith wrote an interesting post a few years ago that
> includes some background about signal handling:
> https://vorpus.org/blog/control-c-handling-in-python-and-trio/
> Have you seen that?
>
> --Chris
>
>
>>

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/BG35OR3HK7NRKPT7Q6L5Y36WQK2MWQK6/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
HOLY CRAP THIS IS MADNESS. I kind of love it. :)

And it's related to some other problems that have been on my mind (how to
"paint" stack frames with user-defined variables, with those variables then
being used by things like CPU/heap profilers as smart annotations), and I
have to say it's a damned clever solution to the problem.

On Thu, Jun 25, 2020 at 6:35 PM Yonatan Zunger  wrote:

> I had not -- thank you!
>
> On Thu, Jun 25, 2020 at 1:49 PM Chris Jerdonek 
> wrote:
>
>> On Wed, Jun 24, 2020 at 5:15 PM Yonatan Zunger via Python-Dev <
>> python-dev@python.org> wrote:
>>
>>> That said, the meta-question still applies: Are there things which are
>>> generally intended *not* to be interruptible by signals, and if so, is
>>> there some consistent way of indicating this?
>>>
>>
>> Yonatan, Nathaniel Smith wrote an interesting post a few years ago that
>> includes some background about signal handling:
>> https://vorpus.org/blog/control-c-handling-in-python-and-trio/
>> Have you seen that?
>>
>> --Chris
>>
>>
>>>

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UQZDHEKI5OXULCHWCYA4AAPI52HZ3JK2/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-25 Thread Yonatan Zunger via Python-Dev
What, weird edge cases involving *signals?* Never! :)

Here's a nice simple one: it takes at least a few opcodes to set said
global flag, during which (depending on the whims of how eval_break gets
set) yet another signal might get raised and handled.
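
To make the flag idea and its gap concrete, here is a rough sketch. All names (in_handler, deferred, do_work) are hypothetical, and the "delivery" at the bottom is simulated with direct calls rather than real signals so that it is repeatable:

```python
import signal

in_handler = False   # the global flag from the suggestion quoted below
deferred = []        # signals that arrived while the handler was already running
handled = []

def do_work(signum):
    """Hypothetical handler body; stands in for real cleanup logic."""
    handled.append(signum)

def handler(signum, frame):
    global in_handler
    if in_handler:
        # Reentrant call: remember the signal and return quickly.
        deferred.append(signum)
        return
    # A second signal can still arrive between the test above and the
    # assignment below; that is the few-opcode window described here.
    in_handler = True
    try:
        do_work(signum)
        while deferred:
            do_work(deferred.pop(0))
    finally:
        in_handler = False

# Simulate delivery without real signals: a call that arrives while the flag is set.
in_handler = True
handler(signal.SIGTERM, None)    # treated as reentrant, so only queued
in_handler = False
handler(signal.SIGINT, None)     # runs its own work, then drains the queue
```

The flag narrows the window but cannot close it from pure Python, which is the edge case in question.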

I did just make a post to python-ideas about the possibility of adding a
"sys.suppress_signals" method; it seems like it would be surprisingly easy
in CPython (basically by just adding another check at the start of
_PyErr_CheckSignalsTstate) but would also be a truly impressive footgun.
Not sure if I'm going to try to climb that particular mountain yet, but I
figured I'd see what obvious holes other people could poke in it.
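
For comparison, and explicitly not the sys.suppress_signals idea itself: POSIX already has an OS-level mask, exposed as signal.pthread_sigmask, which holds a signal as pending for the calling thread until it is unblocked. A sketch, assuming a POSIX platform (SIGUSR1 and pthread_sigmask are not available on Windows) and with made-up variable names:

```python
import signal

received = []

def handler(signum, frame):
    received.append(signum)

previous = signal.signal(signal.SIGUSR1, handler)
try:
    # Block SIGUSR1 for this thread; the kernel holds it as pending.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        signal.raise_signal(signal.SIGUSR1)
        pending_inside = signal.SIGUSR1 in signal.sigpending()
        received_inside = list(received)     # handler has not run yet
    finally:
        # Restoring the old mask lets the pending signal through.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)
    received_after = list(received)
finally:
    signal.signal(signal.SIGUSR1, previous)
```

The mask is per-thread, so a signal sent to the whole process can still be delivered through another thread that has not blocked it.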

Thanks for your help!

On Thu, Jun 25, 2020 at 1:27 PM Antoine Pitrou  wrote:

> On Thu, 25 Jun 2020 11:18:13 -0700
> Yonatan Zunger via Python-Dev  wrote:
> > Also, just to sanity-check that I understand things correctly: Python
> > signal handlers *are* reentrant, in that a signal handler can be
> > interrupted by another signal, is that right? Is there any general
> > recommendation on how to write signal handlers in order to manage that?
>
> To be honest, I've never thought about that.  If you need to care about
> reentrancy, you should perhaps use some kind of global flag to detect
> it (hopefully you won't run into weird edge cases...).
>
> > (Antoine, I *so* wish I could be doing less with signals and signal
> > handlers right now. Alas, I have a combination of a SIGTERM-happy runtime
> > environment and a long-story situation involving wacky multiprocessing to
> > avoid issues in someone else's C library that make that impossible. So
> > instead I'm trying to write a general library to help simplify the task,
> > and so thinking about a lot of slightly nutty corner cases...)
>
> Ha, I wish you good luck with that :-)
>
> Best regards
>
> Antoine.
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/KBD7XG5QPRQRP52FVPAFLZ3G6PSPPVYE/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PZUPF2X6KVV5SQLIWUCOK4CN34ENLKDU/


[Python-Dev] Re: Intended invariants for signals in CPython

2020-06-29 Thread Yonatan Zunger via Python-Dev
Whew. Nick, Antoine, and Chris, thanks to each of you for your feedback --
with it, I *think* I've managed to write a pure-Python signal suppression
library. I'm nowhere near confident enough in its handling of corner cases
yet to release it to the general public, but hopefully I'll be able to
acquire that faith in it over time and do that.

(It ended up involving a new & improved Semaphore class with some more
functions like pausability, having a signal handler that puts things in a
SimpleQueue [thanks, Antoine] and dequeues them when the semaphore is empty
and paused, creative use of with instead of try/finally to leverage some of
the ideas in the blog post Chris linked and manage reentrancy, and with all
that allow the main thread to meaningfully know when it needs to defer
dealing with a signal until later so that the threads can safely finish.
Whew.)
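
A heavily simplified sketch of just the queueing part, to show the shape of the idea. This is not the library: there is no semaphore and no threading here, and the names critical_region, act_on and deferring are hypothetical. The handler only records the signal in a SimpleQueue while a critical region is active, and the region drains the queue on exit:

```python
import queue
import signal
from contextlib import contextmanager

pending = queue.SimpleQueue()   # SimpleQueue.put is documented as reentrant
deferring = 0                   # depth of nested critical regions
acted_on = []

def act_on(signum):
    """Hypothetical reaction to a signal, e.g. starting a clean shutdown."""
    acted_on.append(signum)

def handler(signum, frame):
    if deferring:
        pending.put(signum)     # just record it; deal with it later
    else:
        act_on(signum)

@contextmanager
def critical_region():
    global deferring
    deferring += 1
    try:
        yield
    finally:
        deferring -= 1
        if deferring == 0:
            # Drain whatever arrived while the region was active.
            while True:
                try:
                    act_on(pending.get_nowait())
                except queue.Empty:
                    break

previous = signal.signal(signal.SIGINT, handler)
try:
    with critical_region():
        signal.raise_signal(signal.SIGINT)
        inside = list(acted_on)      # nothing acted on while deferring
    outside = list(acted_on)         # handled once the region has exited
finally:
    signal.signal(signal.SIGINT,
                  previous if previous is not None else signal.SIG_DFL)
```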

Good rule of thumb: If your Python code has comments talking about specific
opcodes, you are writing some Really Interesting Python Code. :)

Yonatan

On Sat, Jun 27, 2020 at 10:26 PM Nick Coghlan  wrote:

> On Fri., 26 Jun. 2020, 7:02 am Chris Jerdonek, 
> wrote:
>
>> On Wed, Jun 24, 2020 at 5:15 PM Yonatan Zunger via Python-Dev <
>> python-dev@python.org> wrote:
>>
>>> That said, the meta-question still applies: Are there things which are
>>> generally intended *not* to be interruptible by signals, and if so, is
>>> there some consistent way of indicating this?
>>>
>>
>> Yonatan, Nathaniel Smith wrote an interesting post a few years ago that
>> includes some background about signal handling:
>> https://vorpus.org/blog/control-c-handling-in-python-and-trio/
>>
>
> Related to that is this CPython bug report:
> https://bugs.python.org/issue29988
>
> The short version is that Greg Smith and I tried to close some of the
> remaining signal safety holes a couple of years ago, and I made it as far
> as building better tools for provoking the bugs (this is the origin of
> per-opcode tracing hooks in CPython), but we never came up with an actual
> solution.
>
> So the workaround remains to run anything that absolutely cannot be
> interrupted by poorly timed signals in a subthread, and dedicate the main
> thread to signal handling.
>
> Cheers,
> Nick.
>
>
>
>>
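
The workaround Nick describes above can be sketched like this (uninterruptible_work and stop_requested are hypothetical names, and raise_signal stands in for a signal arriving from outside). Python-level handlers only run in the main thread, so the worker's code is never interleaved with them:

```python
import signal
import threading

stop_requested = threading.Event()
results = []

def handler(signum, frame):
    # Runs in the main thread only; it just flips a flag.
    stop_requested.set()

def uninterruptible_work():
    """Hypothetical job that must not be cut short by a signal handler."""
    total = 0
    for i in range(100_000):
        total += i
    results.append(total)

previous = signal.signal(signal.SIGINT, handler)
try:
    worker = threading.Thread(target=uninterruptible_work)
    worker.start()
    signal.raise_signal(signal.SIGINT)   # stands in for an external signal
    # The main thread only waits, so handlers run here, not in the worker.
    while worker.is_alive():
        worker.join(timeout=0.1)
finally:
    signal.signal(signal.SIGINT,
                  previous if previous is not None else signal.SIG_DFL)
```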

Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VMKL26EOUELDRG24JLYN6RIQW25N4BXV/