Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
On 29/03/2019 01.02, Victor Stinner wrote:
> Hi,
>
> I read quickly the PEP, I'm not sure that I understood it correctly,
> so here are some early questions more about the usage of the PEP, than
> its implementation.
>
>> This is not sandboxing, as this proposal does not attempt to prevent
>> malicious behavior (though it enables some new options to do so).
>> See the `Why Not A Sandbox`_ section below for further discussion.
>
> I don't understand well the overall security model. If malicious
> behaviors can still occur, what is the purpose of auditing? For
> example, if an audit hook writes events into a local log file, the
> attacker can easily remove this log file, no?

An attacker may not have permission to mess with the auditing subsystem. For example, an attacker may be able to modify an application like a web server or web application, but audit loggers typically run in a different, more protected context. On modern, hardened operating systems, root / Administrator aren't all-powerful either; they are typically restricted by additional policies like SELinux. Furthermore, servers also send auditing data to remote nodes for analysis.

Keep in mind that auditing is not primarily about preventing compromises. It's about detecting what was compromised, when, by whom, and how.

>> Verified Open Hook
>> ------------------
>>
>> Most operating systems have a mechanism to distinguish between files
>> that can be executed and those that can not. For example, this may be an
>> execute bit in the permissions field, a verified hash of the file
>> contents to detect potential code tampering, or file system path
>> restrictions. These are an important security mechanism for preventing
>> execution of data or code that is not approved for a given environment.
>> Currently, Python has no way to integrate with these when launching
>> scripts or importing modules.
>
> In my experience, it doesn't work just because Python has too many
> functions opening files indirectly or call external C libraries which
> open files.
>
> I vaguely recall an exploit in my pysandbox project which uses the
> internal code of Python which displays a traceback... to read the
> content of an arbitrary file on the disk :-( Game over. I would never
> expect that there are so many ways to read a file in Python...

The verified open hook is not about sandboxing. It's a mechanism to prevent a class of attacks such as directory traversal. On Linux, the open-for-import hook could refuse access to .py and .pyc files that do not have the user.python_code or root.python_code extended file attribute. A verified open hook like this could have prevented the compromise of wiki.python.org many years ago.

> Even when I restricted pysandbox to the bare minimum of the Python
> language (with no import), multiple exploits have been found.
> Moreover, at the end, Python just became useful.
>
> More generally, there are a lot of codes in Python which allow
> arbitrary code injection :-( (Most of them are now fixed, hopefully!)
>
> I did my best to modify as much functions as possible to implement the
> PEP 446 "Make newly created file descriptors non-inheritable", but I
> know that *many* functions call directly open() or fopen() and so
> create inheritable file descriptors. For example, the Python ssl
> module takes directly filenames and OpenSSL open directly files. It's
> just one example.
>
> You will never be able to cover all cases.

I agree. Don't draw the wrong conclusion from your statement. PEP 578 adds hooks for auditing, which in turn can be used to harden and log an application. Unlike secure sandboxing, it doesn't have to be perfect.
Alex Gaynor summed this up in his blog post https://alexgaynor.net/2018/jul/20/worst-truism-in-infosec/

> Having a single function which allows to open an arbitrary file
> without triggering an audit event would defeat the whole purpose of
> auditing, no? Again, maybe I didn't understand well the overall
> purpose of the PEP, sorry.

This case can be detected during the development and QE phases. You simply have to count the number of open syscalls and compare it to the number of open auditing events.

>> The important performance impact is the case where events are being
>> raised but there are no hooks attached. This is the unavoidable case -
>> once a developer has added audit hooks they have explicitly chosen to
>> trade performance for functionality.
>
> (The Linux kernel uses advance tooling to inject hooks: it has no
> impact on performances when no hook is used. Machine code of functions
> is patched to inject a hook. Impressive stuff :-))
>
> Here I expect a small overhead. But the global overhead will be
> proportional to the number of hooks, no? Maybe it's not significant
> with the proposed list of events, but it will be more significant with
> 100 or 1000 events?
>
> I'm not saying that it's a blocker issue, I'm just thinking aloud to
> make sure that I understood correctly :-)

The performan
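The development-time check described above (comparing open syscalls with open auditing events) could be sketched roughly as follows. This assumes the APIs as they later shipped in Python 3.8, where the built-in open() raises an audit event named "open"; the helper name install_open_counter is made up for illustration, and the syscall side of the comparison (for example from strace output) is outside the snippet.

```python
import os
import sys
import tempfile


def install_open_counter():
    """Install an audit hook that counts "open" events (hypothetical helper)."""
    counts = {"open": 0}

    def hook(event, args):
        # Hooks see every audit event, so filter on the event name.
        if event == "open":
            counts["open"] += 1

    sys.addaudithook(hook)
    return counts


counts = install_open_counter()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    before = counts["open"]
    with open(path, "w") as fh:   # first audited open
        fh.write("hello")
    with open(path) as fh:        # second audited open
        fh.read()
    audited = counts["open"] - before

# Compare this figure with the number of open()/openat() syscalls that an
# external tracer reports for the same code path.
print("audited open events:", audited)
```

If the tracer reports noticeably more opens of interest than the hook does, some code path is opening files without raising an event.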
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
On 28/03/2019 23.35, Steve Dower wrote: > Audit Hook > -- > > In order to observe actions taken by the runtime (on behalf of the > caller), an API is required to raise messages from within certain > operations. These operations are typically deep within the Python > runtime or standard library, such as dynamic code compilation, module > imports, DNS resolution, or use of certain modules such as ``ctypes``. > > The following new C APIs allow embedders and CPython implementors to > send and receive audit hook messages:: > > # Add an auditing hook > typedef int (*hook_func)(const char *event, PyObject *args, > void *userData); > int PySys_AddAuditHook(hook_func hook, void *userData); > > # Raise an event with all auditing hooks > int PySys_Audit(const char *event, PyObject *args); > > # Internal API used during Py_Finalize() - not publicly accessible > void _Py_ClearAuditHooks(void); > > The new Python APIs for receiving and raising audit hooks are:: > > # Add an auditing hook > sys.addaudithook(hook: Callable[[str, tuple]]) > > # Raise an event with all auditing hooks > sys.audit(str, *args) > > > Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time, > including before ``Py_Initialize()``, or by calling > ``sys.addaudithook()`` from Python code. Hooks cannot be removed or > replaced. Hi Steve, I wonder if the hooks could be replaced by a more efficient mechanism. These days, Linux, macOS, and most recently Windows [1] support dtrace probes. DTrace is a very powerful and efficient mechanism to trace user-space processes from Kernel space. At least we should consider to add DTrace probes to the auditing framework. Regards, Christian [1] https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902 ___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
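For readers who want to see the Python-level API quoted above in action, here is a minimal sketch using sys.addaudithook() and sys.audit() as they later shipped in Python 3.8. The event name "example.transfer" and the make_recorder helper are invented for this illustration and are not part of the PEP.

```python
import sys


def make_recorder(prefix):
    """Return (hook, seen): the hook records events whose name starts with prefix."""
    seen = []

    def hook(event, args):
        # Every hook receives every event, so filter for the ones we care about.
        if event.startswith(prefix):
            seen.append((event, args))

    return hook, seen


hook, seen = make_recorder("example.")
sys.addaudithook(hook)          # cannot be removed again, as the PEP notes

# Application code raises its own event; args arrive in the hook as a tuple.
sys.audit("example.transfer", "alice", 42)

print(seen)
```

Because hooks cannot be removed, the filtering has to live inside the hook itself rather than in add/remove calls around the code of interest.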
Re: [Python-Dev] A request for PEP announcement format [was: PEP 570]
On Thu, Mar 28, 2019 at 11:28 PM Jeroen Demeyer wrote:
> On 2019-03-29 04:08, Stephen J. Turnbull wrote:
> > In this case, it's here:
> >
> > https://discuss.python.org/t/pep-570-Python-Positional-Only-Parameters/1078
>
> So, are we supposed to discuss PEPs on discuss.python.org now? That's
> fine for me, should I create a thread like that for PEP 580 too?

It is up to the PEP author where a PEP will be discussed. You have the option of using python-dev, Discourse or a dedicated mailing list.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
On 29Mar2019 0334, Christian Heimes wrote:
> On 28/03/2019 23.35, Steve Dower wrote:
>> Audit Hook
>> ----------
>>
>> In order to observe actions taken by the runtime (on behalf of the
>> caller), an API is required to raise messages from within certain
>> operations. These operations are typically deep within the Python
>> runtime or standard library, such as dynamic code compilation, module
>> imports, DNS resolution, or use of certain modules such as ``ctypes``.
>>
>> The following new C APIs allow embedders and CPython implementors to
>> send and receive audit hook messages::
>>
>>    # Add an auditing hook
>>    typedef int (*hook_func)(const char *event, PyObject *args,
>>                             void *userData);
>>    int PySys_AddAuditHook(hook_func hook, void *userData);
>>
>>    # Raise an event with all auditing hooks
>>    int PySys_Audit(const char *event, PyObject *args);
>>
>>    # Internal API used during Py_Finalize() - not publicly accessible
>>    void _Py_ClearAuditHooks(void);
>>
>> The new Python APIs for receiving and raising audit hooks are::
>>
>>    # Add an auditing hook
>>    sys.addaudithook(hook: Callable[[str, tuple]])
>>
>>    # Raise an event with all auditing hooks
>>    sys.audit(str, *args)
>>
>> Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
>> including before ``Py_Initialize()``, or by calling
>> ``sys.addaudithook()`` from Python code. Hooks cannot be removed or
>> replaced.
>
> Hi Steve,
>
> I wonder if the hooks could be replaced by a more efficient mechanism.
> These days, Linux, macOS, and most recently Windows [1] support dtrace
> probes. DTrace is a very powerful and efficient mechanism to trace
> user-space processes from Kernel space. At least we should consider to
> add DTrace probes to the auditing framework.
>
> [1] https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902

Calling into those frameworks will still require as much work as these hooks do, and will also make it very difficult to do things like auditing unit tests or using pure-Python code when you're not concerned about initialization (e.g. in a long-running web server).

So I'm inclined to say that if you want those probes, you can enable them by adding a hook that calls into them. A similar argument is made for using ETW on Windows (which, unlike DTrace, works on versions of Windows that have already been released; and yes, this is a real argument I've already had over this proposal ;) ), so I really think leaving it open-ended and Python-specific is the best approach.

(Reading further down the link you provided, it seems DTrace on Windows will only be enabled for essentially-administrators. So that rules it out as a substitute for this proposal, in my opinion.)

Cheers,
Steve
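The suggestion above, a hook that calls into an external tracing framework, might look something like this sketch. It assumes the sys.addaudithook() API as it later shipped in Python 3.8; the emit callable stands in for whatever DTrace/ETW binding a deployment actually has (none is implied to exist in the standard library), and make_forwarding_hook and the "demo." event names are invented for illustration.

```python
import sys


def make_forwarding_hook(emit, prefixes):
    """Build an audit hook that forwards selected events to an external backend.

    emit     -- callable(event, args); a stand-in for a real probe/ETW binding
    prefixes -- tuple of event-name prefixes worth forwarding
    """
    def hook(event, args):
        # str.startswith accepts a tuple, so one call checks all prefixes.
        if event.startswith(prefixes):
            emit(event, args)

    return hook


# Stand-in backend: a list instead of a real tracing framework.
probe_log = []
sys.addaudithook(
    make_forwarding_hook(lambda event, args: probe_log.append((event, args)),
                         ("demo.",))
)

sys.audit("demo.connect", "example.org", 443)
print(probe_log)
```

Keeping the backend behind a plain callable is what lets the same hook be exercised in unit tests without the tracing framework being present.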
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
Thanks Christian for responding - I endorse and support all your comments. (I'd hoped that by explicitly saying "this is not a sandbox" it would avoid people thinking it was a sandbox, but apparently I would have been better just to avoid the keyword completely...)

On 29Mar2019 0324, Christian Heimes wrote:
> On 29/03/2019 01.02, Victor Stinner wrote:
>> Here I expect a small overhead. But the global overhead will be
>> proportional to the number of hooks, no? Maybe it's not significant
>> with the proposed list of events, but it will be more significant with
>> 100 or 1000 events?
>>
>> I'm not saying that it's a blocker issue, I'm just thinking aloud to
>> make sure that I understood correctly :-)
>
> The performance impact can be remedied and reduced with a simple check.
> If there is no audit hook installed, it's just a matter of a pointer
> deref + JNZ.

Yep, the performance case we care about is when there are no hooks attached, since that's the only time a user cannot do anything to improve performance themselves. See the "Performance Impact" section in the PEP.

In my implementation the cost is about as low as I can make it - see https://github.com/python/cpython/pull/12613/files#diff-f38879f4833a6b6847e556b9a07bf4edR115 (looking at it again I can probably avoid the exception preservation and a few conditionals at the end).

Basically, PySys_Audit takes a format string and arguments, rather than making callers eagerly construct the tuple that gets passed to the hook, and only actually allocates when there is a hook to call. There aren't even any Py_INCREFs if there are no hooks. And as Christian says, it's a deref+JNZ.

Now, if someone has implemented a hook and that hook has performance issues, yeah, things will slow down. In general, the places where we are interested in hooks are where calls are being made into the operating system, so most of them will also involve a few syscalls and the cost of the hook should be minimal in comparison.

But there isn't another way to provide the functionality - offloading it to the OS just means the OS is going to suffer the performance penalty, so it really is just moving the blame elsewhere. I dislike playing that game.

Cheers,
Steve
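The "no hooks attached" fast path described above can be modelled in a few lines. This is only a pure-Python illustration of the idea (check for hooks first, build the argument tuple only if someone will receive it); it is not the C implementation, and AuditRegistry, build_args and tuples_built are names invented for the sketch.

```python
class AuditRegistry:
    """Toy model of an audit-hook registry with a cheap no-hook fast path."""

    def __init__(self):
        self._hooks = []
        self.tuples_built = 0   # instrumentation for the demo only

    def add_hook(self, hook):
        self._hooks.append(hook)

    def audit(self, event, build_args):
        # Fast path: with no hooks installed, do no allocation at all.
        if not self._hooks:
            return
        # Slow path: build the argument tuple once, then call every hook.
        args = build_args()
        self.tuples_built += 1
        for hook in self._hooks:
            hook(event, args)


registry = AuditRegistry()

# No hooks yet: the argument tuple is never constructed.
registry.audit("demo.open", lambda: ("/tmp/x",))
print(registry.tuples_built)

received = []
registry.add_hook(lambda event, args: received.append((event, args)))
registry.audit("demo.open", lambda: ("/tmp/x",))
print(registry.tuples_built, received)
```

The total cost once hooks exist is still whatever the hooks themselves do, which matches the point made above that only the hook-free case is in the runtime's control.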
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
Like in the mktemp thread earlier, I would request a threat model (what use cases are supposed to be protected (in this case, by reporting rather than preventing) and from what threats) -- in the discussion, and eventually, in the PEP. Without one, any claims and talks about whether something would be an effective security measure are pointless -- 'cuz you would never know if you accounted for everything and would not even have the definition of that "everything". On 29.03.2019 1:35, Steve Dower wrote: Hi all Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into Python 3.8. Here's the current text for review and comment before I submit to the Steering Council. The formatted text is at https://www.python.org/dev/peps/pep-0578/ (update just pushed, so give it an hour or so, but it's fundamentally the same as what's there) No Discourse post, because we don't have a python-dev equivalent there yet, so please reply here for this one. Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and my backport to 3.7 (https://github.com/zooba/cpython/tree/pep-578-3.7/) is already getting some real use (though this will not be added to 3.7, unless people *really* want it, so the backport is just for reference). Cheers, Steve = PEP: 578 Title: Python Runtime Audit Hooks Version: $Revision$ Last-Modified: $Date$ Author: Steve Dower Status: Draft Type: Standards Track Content-Type: text/x-rst Created: 16-Jun-2018 Python-Version: 3.8 Post-History: Abstract This PEP describes additions to the Python API and specific behaviors for the CPython implementation that make actions taken by the Python runtime visible to auditing tools. Visibility into these actions provides opportunities for test frameworks, logging frameworks, and security tools to monitor and optionally limit actions taken by the runtime. 
This PEP proposes adding two APIs to provide insights into a running Python application: one for arbitrary events, and another specific to the module import system. The APIs are intended to be available in all Python implementations, though the specific messages and values used are unspecified here to allow implementations the freedom to determine how best to provide information to their users. Some examples likely to be used in CPython are provided for explanatory purposes. See PEP 551 for discussion and recommendations on enhancing the security of a Python runtime making use of these auditing APIs. Background == Python provides access to a wide range of low-level functionality on many common operating systems. While this is incredibly useful for "write-once, run-anywhere" scripting, it also makes monitoring of software written in Python difficult. Because Python uses native system APIs directly, existing monitoring tools either suffer from limited context or auditing bypass. Limited context occurs when system monitoring can report that an action occurred, but cannot explain the sequence of events leading to it. For example, network monitoring at the OS level may be able to report "listening started on port 5678", but may not be able to provide the process ID, command line, parent process, or the local state in the program at the point that triggered the action. Firewall controls to prevent such an action are similarly limited, typically to process names or some global state such as the current user, and in any case rarely provide a useful log file correlated with other application messages. Auditing bypass can occur when the typical system tool used for an action would ordinarily report its use, but accessing the APIs via Python do not trigger this. For example, invoking "curl" to make HTTP requests may be specifically monitored in an audited system, but Python's "urlretrieve" function is not. 
Within a long-running Python application, particularly one that processes user-provided information such as a web app, there is a risk of unexpected behavior. This may be due to bugs in the code, or deliberately induced by a malicious user. In both cases, normal application logging may be bypassed resulting in no indication that anything out of the ordinary has occurred. Additionally, and somewhat unique to Python, it is very easy to affect the code that is run in an application by manipulating either the import system's search path or placing files earlier on the path than intended. This is often seen when developers create a script with the same name as the module they intend to use - for example, a ``random.py`` file that attempts to import the standard library ``random`` module. This is not sandboxing, as this proposal does not attempt to prevent malicious behavior (though it enables some new options to do so). See the `Why Not A Sandbox`_ section below for further discussion. Overview of Changes === The aim of these changes is to enable both application developers and system administrators to integrate Python into their existing monito
[Python-Dev] Summary of Python tracker Issues
ACTIVITY SUMMARY (2019-03-22 - 2019-03-29)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7048 ( -3)
  closed 41176 (+76)
  total  48224 (+73)

Open issues with patches: 2809

Issues opened (50)
==================

#22189: collections.UserString missing some str methods
https://bugs.python.org/issue22189  reopened by rhettinger

#28190: Cross-build _curses failed if host ncurses headers and target
https://bugs.python.org/issue28190  reopened by doko

#33356: Windows 10 buildbot: test__xxsubinterpreters.test_already_runn
https://bugs.python.org/issue33356  reopened by vstinner

#35155: Clarify Protocol Handlers in urllib.request Docs
https://bugs.python.org/issue35155  reopened by orsenthil

#36150: Possible assertion failures due to _ctypes.c's PyCData_reduce(
https://bugs.python.org/issue36150  reopened by serhiy.storchaka

#36344: install_certificates.command too complicated
https://bugs.python.org/issue36344  reopened by ned.deily

#36402: test_threading: test_threads_join_2() failed with "Fatal Pytho
https://bugs.python.org/issue36402  opened by vstinner

#36403: AsyncIterator on 3.7: __aiter__ no longer honors finally block
https://bugs.python.org/issue36403  opened by asksol

#36404: Document PendingDeprecationWarning as deprecated
https://bugs.python.org/issue36404  opened by inada.naoki

#36406: doctest.testmod(empty_package) raises TypeError in 3.7 (and no
https://bugs.python.org/issue36406  opened by Dutcho

#36408: Tkinter multi-processing performance, Linux 10-25 times faster
https://bugs.python.org/issue36408  opened by james.mccormack

#36409: plistlib old API should be removed
https://bugs.python.org/issue36409  opened by bigfootjon

#36411: Python 3 f.tell() gets out of sync with file pointer in binary
https://bugs.python.org/issue36411  opened by pewscorner

#36414: Multiple test failures in GCC and Clang optional builds on Tra
https://bugs.python.org/issue36414  opened by xtreak

#36416: bytes.rpartition bug in online documentation
https://bugs.python.org/issue36416  opened by pewscorner

#36417: unicode.isdecimal bug in online Python 2 documentation
https://bugs.python.org/issue36417  opened by pewscorner

#36418: urllib.parse.*Result: support _replace for additional computed
https://bugs.python.org/issue36418  opened by cheryl.sabella

#36419: IDLE autocomplete: refactor and polish code and tests
https://bugs.python.org/issue36419  opened by terry.reedy

#36422: tempfile.TemporaryDirectory() removes entire directory tree ev
https://bugs.python.org/issue36422  opened by riccardomurri

#36424: Pickle fails on frozen dataclass that has slots
https://bugs.python.org/issue36424  opened by drhagen

#36426: exec() issue when used inside function
https://bugs.python.org/issue36426  opened by schperplata

#36427: Document that PyEval_RestoreThread and PyGILState_Ensure can t
https://bugs.python.org/issue36427  opened by pablogsal

#36429: Fix starting IDLE with pyshell
https://bugs.python.org/issue36429  opened by terry.reedy

#36432: Running python test suite fails on macOS 10.14.4 with resource
https://bugs.python.org/issue36432  opened by ned.deily

#36434: Zipfile breaks if signalled during write()
https://bugs.python.org/issue36434  opened by and800

#36437: method_descriptor surprising error message when self is passed
https://bugs.python.org/issue36437  opened by vstinner

#36438: Python 3.5.7 import error on Cross compile
https://bugs.python.org/issue36438  opened by Brian Spratke

#36439: Inconsistencies with datetime.fromtimestamp(t) when t < 0
https://bugs.python.org/issue36439  opened by Tiger-222

#36440: more helpful diagnostics for parser module
https://bugs.python.org/issue36440  opened by A. Skrobov

#36442: Different ValueError for the same operation in List and Tuple
https://bugs.python.org/issue36442  opened by HardikPatel

#36445: bus error in test_gil test on armhf running with 64bit kernel
https://bugs.python.org/issue36445  opened by doko

#36448: Message "You will need to rebuild pythoncore to see the change
https://bugs.python.org/issue36448  opened by jdemeyer

#36449: __aexit__ is not called when a context manager is used in an a
https://bugs.python.org/issue36449  opened by Andrey Lemets

#36453: pkgutil.get_importer only return the first valid path_hook(imp
https://bugs.python.org/issue36453  opened by Windson Yang

#36454: test_time: test_monotonic() failed on AMD64 FreeBSD 10-STABLE
https://bugs.python.org/issue36454  opened by vstinner

#36456: task.cancel unbound recursion
https://bugs.python.org/issue36456  opened by Dima.Tisnek

#36457: functools.singledispatchmethod interacts poorly with subclasse
https://bugs.python.org/issue36457  opened by Tim Mitchell2

#36460: Add AMP MIME type support
https://bugs.python.org/issue36460  opened by Daniel Black

#36461: timeit: Additional changes for autorange
https://bugs.python
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
PEP 551 (referenced from this one) contains information about using these hooks for security purposes, along with other approaches to minimize the risk of having Python in your production environments. Threat models have to be designed by the user; we can't predict what it looks like for the incredibly diverse user base we have. This PEP is explicitly only about the API changes. Cheers, Steve On 29Mar2019 1044, Ivan Pozdeev via Python-Dev wrote: Like in the mktemp thread earlier, I would request a threat model (what use cases are supposed to be protected (in this case, by reporting rather than preventing) and from what threats) -- in the discussion, and eventually, in the PEP. Without one, any claims and talks about whether something would be an effective security measure are pointless -- 'cuz you would never know if you accounted for everything and would not even have the definition of that "everything". On 29.03.2019 1:35, Steve Dower wrote: Hi all Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into Python 3.8. Here's the current text for review and comment before I submit to the Steering Council. The formatted text is at https://www.python.org/dev/peps/pep-0578/ (update just pushed, so give it an hour or so, but it's fundamentally the same as what's there) No Discourse post, because we don't have a python-dev equivalent there yet, so please reply here for this one. Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and my backport to 3.7 (https://github.com/zooba/cpython/tree/pep-578-3.7/) is already getting some real use (though this will not be added to 3.7, unless people *really* want it, so the backport is just for reference). 
[Python-Dev] Please take your time reading PEPs (was: PEP 578: Python Runtime Audit Hooks)
On Thu, Mar 28, 2019 at 5:03 PM Victor Stinner wrote:
> Hi,
>
> I read quickly the PEP

I would like to encourage everyone to read PEPs so that they never feel the need to write those words ever again. ;)

PEPs are never decided in less than 24 hours, so there is no rush to read a PEP as quickly as possible in order to reply ASAP. We also have so much volume as it is when discussing PEPs that I think we should be encouraging people to take the time to be informed by reading thoroughly before replying so the back-and-forth is minimized and optimized for impactful discussions (personally, I would love it if we all aimed for one, thorough response/day when discussing PEPs, but that's just me).

Otherwise we end up with way more time spent in replies to things that would not have been necessary to ask if we took our time reading. Remember, every email you send is read by tons of other people and so there's a real time commitment you're asking of the world every time you hit that Reply button.
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
On 28/03/2019 23.35, Steve Dower wrote: > Hi all > > Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into > Python 3.8. Here's the current text for review and comment before I > submit to the Steering Council. > > The formatted text is at https://www.python.org/dev/peps/pep-0578/ > (update just pushed, so give it an hour or so, but it's fundamentally > the same as what's there) > > No Discourse post, because we don't have a python-dev equivalent there > yet, so please reply here for this one. > > Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and > my backport to 3.7 (https://github.com/zooba/cpython/tree/pep-578-3.7/) > is already getting some real use (though this will not be added to 3.7, > unless people *really* want it, so the backport is just for reference). > > Cheers, > Steve > > = > > PEP: 578 > Title: Python Runtime Audit Hooks > Version: $Revision$ > Last-Modified: $Date$ > Author: Steve Dower > Status: Draft > Type: Standards Track > Content-Type: text/x-rst > Created: 16-Jun-2018 > Python-Version: 3.8 > Post-History: > > Abstract > > > This PEP describes additions to the Python API and specific behaviors > for the CPython implementation that make actions taken by the Python > runtime visible to auditing tools. Visibility into these actions > provides opportunities for test frameworks, logging frameworks, and > security tools to monitor and optionally limit actions taken by the > runtime. > > This PEP proposes adding two APIs to provide insights into a running > Python application: one for arbitrary events, and another specific to > the module import system. The APIs are intended to be available in all > Python implementations, though the specific messages and values used > are unspecified here to allow implementations the freedom to determine > how best to provide information to their users. Some examples likely > to be used in CPython are provided for explanatory purposes. 
> > See PEP 551 for discussion and recommendations on enhancing the > security of a Python runtime making use of these auditing APIs. > > Background > == > > Python provides access to a wide range of low-level functionality on > many common operating systems. While this is incredibly useful for > "write-once, run-anywhere" scripting, it also makes monitoring of > software written in Python difficult. Because Python uses native system > APIs directly, existing monitoring tools either suffer from limited > context or auditing bypass. > > Limited context occurs when system monitoring can report that an > action occurred, but cannot explain the sequence of events leading to > it. For example, network monitoring at the OS level may be able to > report "listening started on port 5678", but may not be able to > provide the process ID, command line, parent process, or the local > state in the program at the point that triggered the action. Firewall > controls to prevent such an action are similarly limited, typically > to process names or some global state such as the current user, and > in any case rarely provide a useful log file correlated with other > application messages. > > Auditing bypass can occur when the typical system tool used for an > action would ordinarily report its use, but accessing the APIs via > Python do not trigger this. For example, invoking "curl" to make HTTP > requests may be specifically monitored in an audited system, but > Python's "urlretrieve" function is not. > > Within a long-running Python application, particularly one that > processes user-provided information such as a web app, there is a risk > of unexpected behavior. This may be due to bugs in the code, or > deliberately induced by a malicious user. In both cases, normal > application logging may be bypassed resulting in no indication that > anything out of the ordinary has occurred. 
> > Additionally, and somewhat unique to Python, it is very easy to affect > the code that is run in an application by manipulating either the > import system's search path or placing files earlier on the path than > intended. This is often seen when developers create a script with the > same name as the module they intend to use - for example, a > ``random.py`` file that attempts to import the standard library > ``random`` module. > > This is not sandboxing, as this proposal does not attempt to prevent > malicious behavior (though it enables some new options to do so). > See the `Why Not A Sandbox`_ section below for further discussion. > > Overview of Changes > === > > The aim of these changes is to enable both application developers and > system administrators to integrate Python into their existing > monitoring systems without dictating how those systems look or behave. > > We propose two API changes to enable this: an Audit Hook and Verified > Open Hook. Both are available from Python and native code, allowing > applications and frameworks written in pure Python co
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
On 29Mar2019 1218, Christian Heimes wrote:
> On 28/03/2019 23.35, Steve Dower wrote:
>> The ``importlib.util.open_for_import()`` function is a drop-in
>> replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
>> to open a file for raw, binary access. To change the behaviour a new
>> handler should be set. Handler functions only accept ``str`` arguments.
>> The C API ``PyImport_OpenForImport`` function assumes UTF-8 encoding.
>>
>> [...]
>>
>> All import and execution functionality involving code from a file will
>> be changed to use ``open_for_import()`` unconditionally. It is important
>> to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
>> through this function - an audit hook that includes the code from these
>> calls is the best opportunity to validate code that is read from the
>> file. Given the current decoupling between import and execution in
>> Python, most imported code will go through both ``open_for_import()``
>> and the log hook for ``compile``, and so care should be taken to avoid
>> repeating verification steps.
>>
>> There is no Python API provided for changing the open hook. To modify
>> import behavior from Python code, use the existing functionality
>> provided by ``importlib``.
>
> I think that the import hook needs to be extended. It only works for
> simple Python files or pyc files. There are at least two other important
> scenarios: zipimport and shared libraries.
>
> For example, how does the import hook work with regard to alternative
> importers like zipimport? What does the import hook 'see' for an import
> from a zipfile?

Yes, good point. I think opening the zip file with open_for_import() is
the right place to do it, as this operation relates to opening the file
on disk rather than files within it.

> Shared libraries are trickier. libc doesn't define a way to dlopen()
> from a file descriptor. dlopen() takes a file name, but a file name
> leaves the audit hook open to a TOCTOU attack.

For Windows, at least, the operating system can run its own validation
on native modules (if you're using a feature like DeviceGuard, for
example), so the hook likely isn't necessary for those purposes. I
believe some configurations of Linux allow this as well?

But there's likely no better option here than a combination of good ACLs
and checking by filename, which at least lets you whitelist the files
you know you want to allow. Similarly for the zip file - if you trust a
particular file and trust your ACLs, checking by filename is fine.

That said, specific audit events for "I'm about to open this zip/dlopen
this file for import" are very easy to add. (The PEP proposes many
examples, but is not trying to be exhaustive. If accepted, we should
feel free to add new events as we identify places where they matter.)

Aside: an important aspect of this per-file approach to execution is
that the idea is generally to *enable* the files you trust, rather than
disable the files that are bad. So the detection routines are typically
"does this match a known hash" or "is this in a secure location", which
for a carefully deployed system are already known values, rather than
trying to figure out whether a file might do a bad thing. If you can't
validate that the files in your deployment match the ones you thought
you were deploying, you are so far from needing this that it doesn't
even matter, but most of the deployments I work with are *at least* this
well controlled.

Cheers,
Steve
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
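To make the "does this match a known hash" style of check concrete, here is a minimal Python sketch. It is only an illustration: the names `open_if_trusted` and `TRUSTED_SHA256` are hypothetical and do not come from the PEP, and this is not how the proposed hook is installed. The one idea it shows is hashing the exact bytes that are handed back to the caller, so that what was verified is what gets used.

```python
import hashlib
import io

# Hypothetical allowlist: SHA-256 hex digests of the files a deployment trusts.
TRUSTED_SHA256 = set()


def open_if_trusted(path):
    """Return a binary stream for `path` only if its contents are allowlisted.

    The file is read once and the same bytes are both hashed and returned,
    so the file cannot be swapped between the check and the use.
    """
    with open(path, "rb") as f:
        data = f.read()
    digest = hashlib.sha256(data).hexdigest()
    if digest not in TRUSTED_SHA256:
        raise PermissionError(f"{path!r} is not an approved file")
    return io.BytesIO(data)
```

A deployment would fill `TRUSTED_SHA256` from a manifest produced at build time. Checking by filename, as suggested above for shared libraries, trades this content check for reliance on ACLs.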
[Python-Dev] Strange umask(?)/st_mode issue
Hi

I'm trying to track down the cause of the failed Linux tests on this
build (it's also been happening on random PR builds for the last day or
two, but this is the first I've seen it happen on already merged code):

https://dev.azure.com/Python/cpython/_build/results?buildId=40189

One of the failed tests is this one from test_tarfile.py (and I think
this is representative of the others):

    def test_file_mode(self):
        # Test for issue #8464: Create files with correct
        # permissions.
        if os.path.exists(tmpname):
            support.unlink(tmpname)

        original_umask = os.umask(0o022)
        try:
            tar = tarfile.open(tmpname, self.mode)
            tar.close()
            mode = os.stat(tmpname).st_mode & 0o777
            self.assertEqual(mode, 0o644, "wrong file permissions")
        finally:
            os.umask(original_umask)

It's failing because the mode of the new TAR file is 0o666 rather than
0o644. I'd guess that this means the umask() call is not having effect?
But honestly this is beyond me - I don't have any idea what would cause
Python to ignore that, and I don't see anything in the commit history
that could be at fault.

I'm hoping someone can tell me that it's definitely a platform
configuration issue or it's definitely interference between two of our
tests (which run in random order, which would explain the intermittent
nature of this, but I haven't spotted any tests that seem likely either).

To see the other failed PR builds, the full list is at
https://dev.azure.com/Python/cpython/_build?definitionId=9 and most of
the ones from today have failed because of whatever is causing it.

Any help?

Thanks,
Steve
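For anyone wanting to check a build host outside the test suite, the expectation the test encodes can be isolated in a few lines. This is a minimal sketch, not part of the CPython tests; the helper name `mode_created_under_umask` is made up for illustration. It creates a file requesting 0o666 under a given umask and reports the permission bits that actually result.

```python
import os
import stat
import tempfile


def mode_created_under_umask(mask, requested=0o666):
    """Create a file asking for `requested` bits under `mask`; return its mode bits."""
    directory = tempfile.mkdtemp()
    path = os.path.join(directory, "probe")
    previous = os.umask(mask)
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, requested)
        os.close(fd)
        return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        # Restore the caller's umask and clean up the probe file.
        os.umask(previous)
        if os.path.exists(path):
            os.unlink(path)
        os.rmdir(directory)


print(oct(mode_created_under_umask(0o022)))
print(oct(mode_created_under_umask(0o000)))
```

On a typical POSIX filesystem the first call is expected to report 0o644, because the umask clears the group and other write bits. If it reports 0o666 on the affected hosts, that would suggest something other than the process umask is deciding the mode there; one documented possibility is a default ACL on the parent directory, which umask(2) says takes precedence over the umask.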
Re: [Python-Dev] Strange umask(?)/st_mode issue
That does sound strange. How easily can you reproduce it? That majorly
affects how I would try to debug something like this...

If you're able to get an strace of a failed run then that would probably
tell us a *lot*.

On Fri, Mar 29, 2019, 16:28 Steve Dower wrote:
> Hi
>
> I'm trying to track down the cause of the failed Linux tests on this
> build (it's also been happening on random PR builds for the last day or
> two, but this is the first I've seen it happen on already merged code):
>
> https://dev.azure.com/Python/cpython/_build/results?buildId=40189
>
> One of the failed tests is this one from test_tarfile.py (and I think
> this is representative of the others):
>
>     def test_file_mode(self):
>         # Test for issue #8464: Create files with correct
>         # permissions.
>         if os.path.exists(tmpname):
>             support.unlink(tmpname)
>
>         original_umask = os.umask(0o022)
>         try:
>             tar = tarfile.open(tmpname, self.mode)
>             tar.close()
>             mode = os.stat(tmpname).st_mode & 0o777
>             self.assertEqual(mode, 0o644, "wrong file permissions")
>         finally:
>             os.umask(original_umask)
>
> It's failing because the mode of the new TAR file is 0o666 rather than
> 0o644. I'd guess that this means the umask() call is not having effect?
> But honestly this is beyond me - I don't have any idea what would cause
> Python to ignore that, and I don't see anything in the commit history
> that could be at fault.
>
> I'm hoping someone can tell me that it's definitely a platform
> configuration issue or it's definitely interference between two of our
> tests (which run in random order, which would explain the intermittent
> nature of this, but I haven't spotted any tests that seem likely either).
>
> To see the other failed PR builds, the full list is at
> https://dev.azure.com/Python/cpython/_build?definitionId=9 and most of
> the ones from today have failed because of whatever is causing it.
>
> Any help?
>
> Thanks,
> Steve
Re: [Python-Dev] New Python Initialization API
The purpose of PEP 587 is to have a working document so everyone
can look at the proposed API (staying focused on the API rather than
bothering with the implementation). IMHO it's now time to get more
people looking at the Python Initialization.

> But there are enough of us
> with fuzzy but valid ideas in our heads that we really need that
> brainstorming session to mix them together and find something feasible.
> Maybe we're best to put it off until PyCon at this point.

The Python 3.8 feature freeze is scheduled for the end of May, less than
one month after PyCon. It might be a little bit too late, no?

Would you mind elaborating on these ideas?

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
Re: [Python-Dev] Strange umask(?)/st_mode issue
On 29Mar.2019 1731, Nathaniel Smith wrote:
> That does sound strange.

Oh good, it's not just me :)

> How easily can you reproduce it? That majorly
> affects how I would try to debug something like this...

No idea. Looking at the builds on Pipelines again, it seems that all the
builds since about 10am PDT have been failing, so perhaps they rolled
out something that changed how umask works? I don't have any idea
whether that's possible though.

> If you're able to get an strace of a failed run then that would probably
> tell us a *lot*.

I'll need help with that :) But if it's just commands in the
.azure-pipelines/posix-steps.yml file then sending a PR with them added
should do it.

Though given there seems to be a clear start time and no relevant
change, I'm inclined to report it to the Pipelines team (which I already
have) and see what they say. Normally my internal CPython builds catch
issues in the hosted VMs before they reach public accounts, but this may
have been a hotfix for some other issue.

Cheers,
Steve
Re: [Python-Dev] Strange umask(?)/st_mode issue
On 29Mar2019 19:30, Steve Dower wrote:
> On 29Mar.2019 1731, Nathaniel Smith wrote:
>> That does sound strange.
>
> Oh good, it's not just me :)
>
>> How easily can you reproduce it? That majorly
>> affects how I would try to debug something like this...
>
> No idea. Looking at the builds on Pipelines again, it seems that all the
> builds since about 10am PDT have been failing, so perhaps they rolled
> out something that changed how umask works? I don't have any idea
> whether that's possible though.
>
>> If you're able to get an strace of a failed run then that would probably
>> tell us a *lot*.
>
> I'll need help with that :) But if it's just commands in the
> .azure-pipelines/posix-steps.yml file then sending a PR with them added
> should do it.

Can you get a branch into your pipeline? Then you could just hack the
tarfile test with something quick and dirty like:

    pid = os.getpid()
    os.system("strace -p %d 2>/path/to/strace.out &" % pid)
    time.sleep(2)  # give strace heaps of time to start

just before the tarfile open. A ghastly hack but it would get you
debugging info. You could even decide to remove the strace.out file if
the umask issue doesn't show, if it is erratic (can't see why it would
be though).

Cheers,
Cameron Simpson
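The same hack can be written with `subprocess` so that the strace process can be stopped again afterwards. This is a rough sketch under assumptions: the helper names `strace_command` and `attach_strace` are made up for illustration, strace must be installed on the build agent, and attaching to the parent process may be refused on hosts that restrict ptrace.

```python
import os
import subprocess
import time


def strace_command(pid, out_path):
    """Build the argv that attaches strace to `pid` and writes the trace to `out_path`."""
    # -f follows threads and child processes, -p attaches to an existing
    # process, -o writes the trace to a file instead of stderr.
    return ["strace", "-f", "-p", str(pid), "-o", out_path]


def attach_strace(out_path, settle_seconds=2.0):
    """Start strace against the current process and wait briefly for it to attach."""
    proc = subprocess.Popen(strace_command(os.getpid(), out_path))
    time.sleep(settle_seconds)  # crude: give strace time to attach
    return proc
```

In the test this would be called just before `tarfile.open(...)`, with `proc.terminate()` once the mode has been checked, so the trace covers only the calls of interest.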
Re: [Python-Dev] New Python Initialization API
On 29Mar.2019 1830, Victor Stinner wrote:
> The purpose of the PEP 587 is to have a working document so everyone
> can look at the proposed API (stay focused to the API rather than
> bothering with the implementation). IMHO it's now time to get more
> people looking at the Python Initialization.
>
>> But there are enough of us
>> with fuzzy but valid ideas in our heads that we really need that
>> brainstorming session to mix them together and find something feasible.
>> Maybe we're best to put it off until PyCon at this point.
>
> Python 3.8 feature freeze is scheduled at the end of May, less than
> one month after the PyCon. It might be a little bit too late, no?

I don't think we want to rush this in for 3.8 at this point anyway. The
design of how Python is embedded is one of those things that could
drastically affect the scenarios it gets used for in the future
(probably half of my tasks at work right now involve embedding CPython),
so I'd like to get it right.

> Would you mind to elaborate these ideas?

I'd love to, but I don't have them all straight right now, and one of
the problems with putting them in writing is I don't get immediate
feedback when I'm not being clear enough or if there is missing context.

I know you personally have seen most of my ideas, because I keep pinging
you on them ;) My big one is what I posted on capi-sig about being able
to classify our APIs better and define scenarios where they are ready
for use, as well as breaking up unnecessary dependencies so that
embedders have more flexibility (the rings and layers post).

I posted a few examples of how initialization "could" be on various bugs
I've had to deal with relating to it, and obviously I've been pushing
the embeddable distro for Windows for a while (which is surprisingly
popular with a very specific subset of users), as well as using it
myself, so there are things that just annoy me enough about what we
currently have.

But I really do think this should start as a high bandwidth, in-person
brainstorm session to get through the first few big scenarios. Then
it'll be easy to open those up to review and let anyone submit their
needs for hosting Python. And once we've collated a good set of "needs"
we'll have a chance of designing the configuration and initialization
APIs that will satisfy most/all of them. Maybe in time for 3.9 (or 3.10,
if our RM gets the accelerated release cycle he wants ;) ).

I personally think being able to embed Python easily and safely in other
applications will be a powerful feature that will allow many
non-developers to write code to get their work done, as we already see
with Jupyter (and family). More are coming, but the responsibility is on
us to make it successful. I want to get it right.

Cheers,
Steve
Re: [Python-Dev] Strange umask(?)/st_mode issue
On 29Mar.2019 1939, Cameron Simpson wrote:
> Can you get a branch into your pipeline? Then you could just hack the
> tarfile test with something quick and dirty like:
>
>     pid = os.getpid()
>     os.system("strace -p %d 2>/path/to/strace.out &" % pid)
>     time.sleep(2)  # give strace heaps of time to start
>
> just before the tarfile open. A ghastly hack but it would get you
> debugging info. You could even decide to remove the strace.out file if
> the umask issue doesn't show, if it is erratic (can't see why it would
> be though).

Perfect, I'll give this a go. Thanks!
Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks
I don't like adding more Python callbacks from low-level code.

The Python runtime is already very complicated, especially around
__del__, the shutdown process, and multithreading. Python callbacks from
low-level code are always a source of very difficult bugs.

Additionally, if we used this PEP for logging in a complex application,
the log would be unreliable. For example:

1. C code wants to open file A, and calls the callback.
2. In the callback, "A is opened" is logged.
3. In the same callback, an import may happen and be logged.
4. In the same callback, another thread may run, and anything can be
   logged.
5. Many other things happen, and the callback is called again.
6. Then file A is actually opened.

In this example, the ordering and timing of the logged events are very
different from the ordering and timing of the real events.

I prefer low-level tools for tracing low-level things, even though they
lack some application context. Maybe DTrace will become a more important
tool.

https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902

Regards,
--
Inada Naoki
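The gap between steps 2 and 6 can be observed directly. The sketch below assumes the `sys.addaudithook` API as it later shipped in Python 3.8 (the PEP was still a draft when this thread was written), and the path used is a made-up one that should not exist. It records "open" events and then attempts to open a missing file: the event is logged even though the open itself never succeeds.

```python
import sys

events = []


def record_open(event, args):
    # The hook runs before the operation is attempted, so only record here;
    # importing or doing I/O inside the hook would raise further events.
    if event == "open":
        events.append(args[0])


# Note: an audit hook cannot be removed once it has been added.
sys.addaudithook(record_open)

try:
    open("/nonexistent/dir/file-a")
except OSError:
    pass

print(events)
```

The recorded path shows that the log reflects the intent to open the file rather than a completed open, which is the ordering concern raised above.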