Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Christian Heimes
On 29/03/2019 01.02, Victor Stinner wrote:
> Hi,
> 
> I read quickly the PEP, I'm not sure that I understood it correctly,
> so here are some early questions more about the usage of the PEP, than
> its implementation.
> 
>> This is not sandboxing, as this proposal does not attempt to prevent
>> malicious behavior (though it enables some new options to do so).
>> See the `Why Not A Sandbox`_ section below for further discussion.
> 
> I don't understand well the overall security model. If malicious
> behaviors can still occur, what is the purpose of auditing? For
> example, if an audit hook writes events into a local log file, the
> attacker can easily remove this log file, no?

An attacker may not have permission to tamper with the auditing subsystem.
For example, an attacker may be able to modify an application such as a
web server or web application, but audit loggers typically run in a
different, more protected context. On modern, hardened operating systems,
root / Administrator aren't all-powerful either. They are typically
restricted by additional policies like SELinux.

Furthermore, servers also send auditing data to remote nodes for
analysis. Keep in mind that auditing is not primarily about preventing
compromises. It's about detecting what was compromised, when, by whom,
and how.
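
As a rough illustration of the point above, here is a minimal sketch of a
hook that forwards events to a logger, assuming Python 3.8 or later (where
sys.addaudithook() and sys.audit() exist). The logger name "audit-demo",
the ListHandler class and the "demo.login" event are hypothetical; a real
deployment would attach a handler that ships records to a protected or
remote collector instead of keeping them in memory.

```python
import logging
import sys

# Hypothetical logger; in practice its handler would write to a protected
# or remote destination rather than to a list in the same process.
audit_log = logging.getLogger("audit-demo")
audit_log.setLevel(logging.INFO)
audit_log.propagate = False


class ListHandler(logging.Handler):
    """Stand-in for a remote handler: keeps formatted messages in memory."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record.getMessage())


collector = ListHandler()
audit_log.addHandler(collector)


def forward_to_log(event, args):
    # Forward only the demo events, so that events raised while logging
    # itself runs are ignored and cannot cause recursion.
    if event.startswith("demo."):
        audit_log.info("%s %r", event, args)


sys.addaudithook(forward_to_log)
sys.audit("demo.login", "alice")
```

The hook only observes; it does nothing to stop the audited action.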



>> Verified Open Hook
>> ------------------
>>
>> Most operating systems have a mechanism to distinguish between files
>> that can be executed and those that can not. For example, this may be an
>> execute bit in the permissions field, a verified hash of the file
>> contents to detect potential code tampering, or file system path
>> restrictions. These are an important security mechanism for preventing
>> execution of data or code that is not approved for a given environment.
>> Currently, Python has no way to integrate with these when launching
>> scripts or importing modules.
> 
> In my experience, it doesn't work, simply because Python has too many
> functions that open files indirectly or call external C libraries which
> open files.
> 
> I vaguely recall an exploit in my pysandbox project which uses the
> internal code of Python which displays a traceback... to read the
> content of an arbitrary file on the disk :-( Game over. I would never
> expect that there are so many ways to read a file in Python...

The verified open hook is not about sandboxing. It's a mechanism to
prevent a class of attacks like directory traversal attacks. On Linux,
the open-for-import hook could refuse access to .py and .pyc files that
do not have the user.python_code or root.python_code extended file
attribute. This verified open hook could have prevented the compromise of
wiki.python.org many years ago.
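
A minimal sketch of such a check, under stated assumptions: the attribute
names are the ones mentioned above, os.getxattr() is only available on
Linux, and the helper names (is_approved, verified_open) are hypothetical.
This is not the hook API from the PEP, only the policy such a hook might
apply before handing a file to the import system.

```python
import os

# Attribute names as given in the message above.
APPROVED_ATTRS = ("user.python_code", "root.python_code")


def _read_xattr(path, name):
    # os.getxattr() exists only on Linux; elsewhere this raises
    # AttributeError, which is deliberately not handled in this sketch.
    return os.getxattr(path, name)


def is_approved(path, read_attr=_read_xattr):
    """Return True if any of the approval attributes is set on path."""
    for name in APPROVED_ATTRS:
        try:
            read_attr(path, name)
        except OSError:
            continue  # attribute missing or unsupported: try the next one
        return True
    return False


def verified_open(path, read_attr=_read_xattr):
    """Open path for reading code, refusing files that are not approved."""
    if not is_approved(path, read_attr):
        raise PermissionError(f"{path!r} is not marked as approved Python code")
    return open(path, "rb")
```

The read_attr parameter exists so the policy can be exercised without real
extended attributes, for example in unit tests.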


> Even when I restricted pysandbox to the bare minimum of the Python
> language (with no import), multiple exploits have been found.
> Moreover, at the end, Python just became useless.
> 
> More generally, there is a lot of code in Python which allows
> arbitrary code injection :-( (Most of it is now fixed, hopefully!)
> 
> I did my best to modify as many functions as possible to implement
> PEP 446 "Make newly created file descriptors non-inheritable", but I
> know that *many* functions call open() or fopen() directly and so
> create inheritable file descriptors. For example, the Python ssl
> module takes filenames directly and OpenSSL opens the files directly.
> It's just one example.
> 
> You will never be able to cover all cases.

I agree, but don't draw the wrong conclusion from your statement. PEP 578
adds hooks for auditing, which in turn can be used to harden and log
an application. Unlike secure sandboxing, it doesn't have to be perfect.
Alex Gaynor summed this up in his blog post:
https://alexgaynor.net/2018/jul/20/worst-truism-in-infosec/

> Having a single function which allows opening an arbitrary file
> without triggering an audit event would defeat the whole purpose of
> auditing, no? Again, maybe I didn't understand the overall purpose
> of the PEP well, sorry.

This case can be detected during the development and QE phases. You simply
have to count the number of open syscalls and compare it to the number
of open auditing events.
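
The Python half of that comparison can be sketched as follows, assuming
Python 3.8 or later, where the built-in open() raises an "open" audit
event. The names event_counts and count_events are hypothetical. The
syscall count would have to come from an external tool such as strace,
which is not shown here.

```python
import collections
import os
import sys
import tempfile

# Tally of audit events by name, filled in by the hook below.
event_counts = collections.Counter()


def count_events(event, args):
    event_counts[event] += 1


sys.addaudithook(count_events)

# Create a scratch file, then measure how many "open" events one
# built-in open() call produces.
fd, path = tempfile.mkstemp()
os.close(fd)
before = event_counts["open"]
with open(path, "w") as fh:
    fh.write("hello")
opens_seen = event_counts["open"] - before
os.remove(path)
```

If the syscall trace for the same run shows more opens than opens_seen,
some code path opened a file without raising an event.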


>> The important performance impact is the case where events are being
>> raised but there are no hooks attached. This is the unavoidable case -
>> once a developer has added audit hooks they have explicitly chosen to
>> trade performance for functionality.
> 
> (The Linux kernel uses advanced tooling to inject hooks: it has no
> impact on performance when no hook is used. Machine code of functions
> is patched to inject a hook. Impressive stuff :-))
> 
> Here I expect a small overhead. But the global overhead will be
> proportional to the number of hooks, no? Maybe it's not significant
> with the proposed list of events, but it will be more significant with
> 100 or 1000 events?
> 
> I'm not saying that it's a blocker issue, I'm just thinking aloud to
> make sure that I understood correctly :-)

The performan

Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Christian Heimes
On 28/03/2019 23.35, Steve Dower wrote:
> Audit Hook
> ----------
> 
> In order to observe actions taken by the runtime (on behalf of the
> caller), an API is required to raise messages from within certain
> operations. These operations are typically deep within the Python
> runtime or standard library, such as dynamic code compilation, module
> imports, DNS resolution, or use of certain modules such as ``ctypes``.
> 
> The following new C APIs allow embedders and CPython implementors to
> send and receive audit hook messages::
> 
>    # Add an auditing hook
>    typedef int (*hook_func)(const char *event, PyObject *args,
>     void *userData);
>    int PySys_AddAuditHook(hook_func hook, void *userData);
> 
>    # Raise an event with all auditing hooks
>    int PySys_Audit(const char *event, PyObject *args);
> 
>    # Internal API used during Py_Finalize() - not publicly accessible
>    void _Py_ClearAuditHooks(void);
> 
> The new Python APIs for receiving and raising audit hooks are::
> 
>    # Add an auditing hook
>    sys.addaudithook(hook: Callable[[str, tuple]])
> 
>    # Raise an event with all auditing hooks
>    sys.audit(str, *args)
> 
> 
> Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
> including before ``Py_Initialize()``, or by calling
> ``sys.addaudithook()`` from Python code. Hooks cannot be removed or
> replaced.

Hi Steve,

I wonder if the hooks could be replaced by a more efficient mechanism.
These days, Linux, macOS, and most recently Windows [1] support DTrace
probes. DTrace is a very powerful and efficient mechanism to trace
user-space processes from kernel space. At least we should consider
adding DTrace probes to the auditing framework.

Regards,
Christian

[1]
https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] A request for PEP announcement format [was: PEP 570]

2019-03-29 Thread Guido van Rossum
On Thu, Mar 28, 2019 at 11:28 PM Jeroen Demeyer  wrote:

> On 2019-03-29 04:08, Stephen J. Turnbull wrote:
> > In this case, it's here:
> >
> >   > https://discuss.python.org/t/pep-570-Python-Positional-Only-Parameters/1078
>
> So, are we supposed to discuss PEPs on discuss.python.org now? That's
> fine for me, should I create a thread like that for PEP 580 too?
>

It is up to the PEP author where a PEP will be discussed. You have the
option of using python-dev, Discourse or a dedicated mailing list.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Steve Dower

On 29Mar2019 0334, Christian Heimes wrote:
> On 28/03/2019 23.35, Steve Dower wrote:
>> Audit Hook
>> ----------
>>
>> In order to observe actions taken by the runtime (on behalf of the
>> caller), an API is required to raise messages from within certain
>> operations. These operations are typically deep within the Python
>> runtime or standard library, such as dynamic code compilation, module
>> imports, DNS resolution, or use of certain modules such as ``ctypes``.
>>
>> The following new C APIs allow embedders and CPython implementors to
>> send and receive audit hook messages::
>>
>>    # Add an auditing hook
>>    typedef int (*hook_func)(const char *event, PyObject *args,
>>     void *userData);
>>    int PySys_AddAuditHook(hook_func hook, void *userData);
>>
>>    # Raise an event with all auditing hooks
>>    int PySys_Audit(const char *event, PyObject *args);
>>
>>    # Internal API used during Py_Finalize() - not publicly accessible
>>    void _Py_ClearAuditHooks(void);
>>
>> The new Python APIs for receiving and raising audit hooks are::
>>
>>    # Add an auditing hook
>>    sys.addaudithook(hook: Callable[[str, tuple]])
>>
>>    # Raise an event with all auditing hooks
>>    sys.audit(str, *args)
>>
>>
>> Hooks are added by calling ``PySys_AddAuditHook()`` from C at any time,
>> including before ``Py_Initialize()``, or by calling
>> ``sys.addaudithook()`` from Python code. Hooks cannot be removed or
>> replaced.
>
> Hi Steve,
>
> I wonder if the hooks could be replaced by a more efficient mechanism.
> These days, Linux, macOS, and most recently Windows [1] support DTrace
> probes. DTrace is a very powerful and efficient mechanism to trace
> user-space processes from kernel space. At least we should consider
> adding DTrace probes to the auditing framework.
>
> [1]
> https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902


Calling into those frameworks will still require as much work as these 
hooks do, and also make it very difficult to do things like auditing 
unit tests or using pure-Python code when you're not concerned about 
initialization (e.g. in a long-running web server).


So I'm inclined to say that if you want those probes, you can enable 
them by adding a hook that calls into them? A similar argument is made 
for using ETW on Windows (which will work on versions of Windows that 
have been released, unlike DTrace) (and yes, this is a real argument 
I've already had over this proposal ;) ), so I really think leaving it 
open-ended and Python-specific is the best approach.
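
A minimal sketch of that idea, assuming Python 3.8 or later. The factory
make_forwarding_hook, the "demo.probe." prefix and the list used as a
backend are all hypothetical; a real hook would pass emit a function that
calls into whatever tracing facility is in use (a DTrace or ETW binding,
for instance), which is not shown here.

```python
import sys


def make_forwarding_hook(emit, prefixes):
    """Build an audit hook that forwards matching events to emit()."""
    prefixes = tuple(prefixes)  # str.startswith() accepts a tuple

    def hook(event, args):
        if event.startswith(prefixes):
            emit(event, args)

    return hook


# Stand-in backend: collect forwarded events in a list.
forwarded = []
sys.addaudithook(
    make_forwarding_hook(lambda event, args: forwarded.append((event, args)),
                         ["demo.probe."])
)

sys.audit("demo.probe.start", 42)  # forwarded
sys.audit("other.event", 1)        # ignored by the prefix filter
```

Keeping the backend behind a plain callable is what leaves the mechanism
open-ended: the same hook shape works for any tracing system.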


(Reading further down the link you provided, it seems DTrace in Windows 
will only be enabled for essentially-administrators. So that rules it 
out as a substitute for this proposal in my opinion.)


Cheers,
Steve


Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Steve Dower
Thanks Christian for responding - I endorse and support all your 
comments. (I'd hoped that by explicitly saying "this is not a sandbox" 
it would avoid people thinking it was a sandbox, but apparently I would 
have been better just to avoid the keyword completely...)


On 29Mar2019 0324, Christian Heimes wrote:
> On 29/03/2019 01.02, Victor Stinner wrote:
>> Here I expect a small overhead. But the global overhead will be
>> proportional to the number of hooks, no? Maybe it's not significant
>> with the proposed list of events, but it will be more significant with
>> 100 or 1000 events?
>>
>> I'm not saying that it's a blocker issue, I'm just thinking aloud to
>> make sure that I understood correctly :-)
>
> The performance impact can be remedied and reduced with a simple check.
> If there is no audit hook installed, it's just a matter of a pointer
> deref + JNZ.


Yep, the performance case we care about is when there are no hooks 
attached, since that's the only time a user cannot do anything to 
improve performance themselves. See the "Performance Impact" section in 
the PEP.


In my implementation the cost is about as low as I can make it - see 
https://github.com/python/cpython/pull/12613/files#diff-f38879f4833a6b6847e556b9a07bf4edR115 
(looking at it again I can probably avoid the exception preservation and 
a few conditionals at the end)


Basically, PySys_Audit takes a format string and arguments, rather than 
making callers eagerly construct the tuple that gets passed to the hook, 
and only actually allocates when there is a hook to call. There aren't 
even any Py_INCREF's if there are no hooks. And as Christian says, it's 
a deref+JNZ.
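
The shape of that fast path can be modelled in a few lines of Python. This
is a toy model only, not the CPython implementation: the ToyAudit class is
hypothetical, and a callable stands in for the format string that the C
function uses to defer building the argument tuple.

```python
class ToyAudit:
    """Toy model of an audit entry point with a no-hook fast path."""

    def __init__(self):
        self.hooks = []
        self.args_built = 0  # how many times an argument tuple was built

    def add_hook(self, hook):
        self.hooks.append(hook)

    def audit(self, event, build_args):
        # Fast path: with no hooks installed, return before doing any work.
        if not self.hooks:
            return
        # Slow path: build the arguments once, then call every hook.
        args = build_args()
        self.args_built += 1
        for hook in self.hooks:
            hook(event, args)


registry = ToyAudit()
registry.audit("demo.open", lambda: ("/tmp/x", "r"))  # no hooks: nothing built
built_without_hooks = registry.args_built

seen = []
registry.add_hook(lambda event, args: seen.append((event, args)))
registry.audit("demo.open", lambda: ("/tmp/x", "r"))  # hook present: built once
```

The cost with no hooks is one emptiness check; the tuple is only built
when at least one hook will receive it.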


Now, if someone has implemented a hook and that hook has performance 
issues, then yes, things will slow down. In general, the places where we 
are interested in hooks are where calls are being made into the operating 
system, so most of them will also involve a few syscalls, and the cost of 
the hook should be minimal in comparison. But there isn't another way to 
provide the functionality - offloading it to the OS just means the OS is 
going to suffer the performance penalty, so it really is just moving the 
blame elsewhere. I dislike playing that game.


Cheers,
Steve


Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Ivan Pozdeev via Python-Dev
Like in the mktemp thread earlier, I would request a threat model (what use cases are supposed to be protected (in this case, by reporting 
rather than preventing) and from what threats) -- in the discussion, and eventually, in the PEP.
Without one, any claims and talks about whether something would be an effective security measure are pointless -- 'cuz you would never know 
if you accounted for everything and would not even have the definition of that "everything".


On 29.03.2019 1:35, Steve Dower wrote:

Hi all

Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into Python 3.8. Here's the current text for review and comment before I 
submit to the Steering Council.


The formatted text is at https://www.python.org/dev/peps/pep-0578/ (update just pushed, so give it an hour or so, but it's fundamentally 
the same as what's there)


No Discourse post, because we don't have a python-dev equivalent there yet, so 
please reply here for this one.

Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and my backport to 3.7 
(https://github.com/zooba/cpython/tree/pep-578-3.7/) is already getting some real use (though this will not be added to 3.7, unless people 
*really* want it, so the backport is just for reference).


Cheers,
Steve

=

PEP: 578
Title: Python Runtime Audit Hooks
Version: $Revision$
Last-Modified: $Date$
Author: Steve Dower 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 16-Jun-2018
Python-Version: 3.8
Post-History:

Abstract
========

This PEP describes additions to the Python API and specific behaviors
for the CPython implementation that make actions taken by the Python
runtime visible to auditing tools. Visibility into these actions
provides opportunities for test frameworks, logging frameworks, and
security tools to monitor and optionally limit actions taken by the
runtime.
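
To make "monitor and optionally limit" concrete, here is a minimal sketch,
assuming Python 3.8 or later, where an exception raised by a hook
propagates out of sys.audit() and so aborts the audited operation. The
event name "demo.delete_everything" and the function guarded_operation are
hypothetical.

```python
import sys

# Hypothetical policy: events that must never be allowed to proceed.
BLOCKED_EVENTS = {"demo.delete_everything"}


def deny_blocked(event, args):
    if event in BLOCKED_EVENTS:
        raise PermissionError(f"audit policy blocks {event}")


sys.addaudithook(deny_blocked)


def guarded_operation(target):
    # Raise the event first; if a hook objects, the work below never runs.
    sys.audit("demo.delete_everything", target)
    return f"deleted {target}"


try:
    outcome = guarded_operation("cache")
except PermissionError as exc:
    outcome = f"refused: {exc}"
```

Events that are not in BLOCKED_EVENTS pass through the hook untouched, so
the same hook can sit alongside pure logging hooks.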

This PEP proposes adding two APIs to provide insights into a running
Python application: one for arbitrary events, and another specific to
the module import system. The APIs are intended to be available in all
Python implementations, though the specific messages and values used
are unspecified here to allow implementations the freedom to determine
how best to provide information to their users. Some examples likely
to be used in CPython are provided for explanatory purposes.

See PEP 551 for discussion and recommendations on enhancing the
security of a Python runtime making use of these auditing APIs.

Background
==========

Python provides access to a wide range of low-level functionality on
many common operating systems. While this is incredibly useful for
"write-once, run-anywhere" scripting, it also makes monitoring of
software written in Python difficult. Because Python uses native system
APIs directly, existing monitoring tools either suffer from limited
context or auditing bypass.

Limited context occurs when system monitoring can report that an
action occurred, but cannot explain the sequence of events leading to
it. For example, network monitoring at the OS level may be able to
report "listening started on port 5678", but may not be able to
provide the process ID, command line, parent process, or the local
state in the program at the point that triggered the action. Firewall
controls to prevent such an action are similarly limited, typically
to process names or some global state such as the current user, and
in any case rarely provide a useful log file correlated with other
application messages.

Auditing bypass can occur when the typical system tool used for an
action would ordinarily report its use, but accessing the APIs via
Python does not trigger this. For example, invoking "curl" to make HTTP
requests may be specifically monitored in an audited system, but
Python's "urlretrieve" function is not.

Within a long-running Python application, particularly one that
processes user-provided information such as a web app, there is a risk
of unexpected behavior. This may be due to bugs in the code, or
deliberately induced by a malicious user. In both cases, normal
application logging may be bypassed resulting in no indication that
anything out of the ordinary has occurred.

Additionally, and somewhat unique to Python, it is very easy to affect
the code that is run in an application, either by manipulating the
import system's search path or by placing files earlier on the path than
intended. This is often seen when developers create a script with the
same name as the module they intend to use - for example, a
``random.py`` file that attempts to import the standard library
``random`` module.
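
The shadowing effect can be shown with a minimal sketch. It uses
importlib.machinery.PathFinder to ask where a module name would resolve
for a given search path, without importing anything. The helper names
resolve_origin and resolves_outside_stdlib are hypothetical, and the check
assumes the standard library lives in a regular directory rather than a
zip archive.

```python
import importlib.machinery
import os
import sys
import sysconfig
import tempfile


def resolve_origin(name, search_path):
    """Return the file a module name resolves to on search_path, or None."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    return None if spec is None else spec.origin


def resolves_outside_stdlib(name, search_path):
    """True if name resolves to a file outside the stdlib directory."""
    origin = resolve_origin(name, search_path)
    if origin is None:
        return False
    stdlib_dir = os.path.realpath(sysconfig.get_paths()["stdlib"])
    return not os.path.realpath(origin).startswith(stdlib_dir + os.sep)


# A stray random.py in a directory that comes first on the search path
# wins over the standard library module of the same name.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "random.py"), "w") as fh:
        fh.write("# accidental shadow of the stdlib module\n")
    shadowed = resolves_outside_stdlib("random", [tmp] + sys.path)
```

An audit hook on import-related events could apply the same comparison at
run time and report the unexpected origin.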

This is not sandboxing, as this proposal does not attempt to prevent
malicious behavior (though it enables some new options to do so).
See the `Why Not A Sandbox`_ section below for further discussion.

Overview of Changes
===================

The aim of these changes is to enable both application developers and
system administrators to integrate Python into their existing
monito

[Python-Dev] Summary of Python tracker Issues

2019-03-29 Thread Python tracker

ACTIVITY SUMMARY (2019-03-22 - 2019-03-29)
Python tracker at https://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    7048 ( -3)
  closed 41176 (+76)
  total  48224 (+73)

Open issues with patches: 2809 


Issues opened (50)
==

#22189: collections.UserString missing some str methods
https://bugs.python.org/issue22189  reopened by rhettinger

#28190: Cross-build _curses failed if host ncurses headers and target 
https://bugs.python.org/issue28190  reopened by doko

#33356: Windows 10 buildbot: test__xxsubinterpreters.test_already_runn
https://bugs.python.org/issue33356  reopened by vstinner

#35155: Clarify Protocol Handlers in urllib.request Docs
https://bugs.python.org/issue35155  reopened by orsenthil

#36150: Possible assertion failures due to _ctypes.c's PyCData_reduce(
https://bugs.python.org/issue36150  reopened by serhiy.storchaka

#36344: install_certificates.command too complicated
https://bugs.python.org/issue36344  reopened by ned.deily

#36402: test_threading: test_threads_join_2() failed with "Fatal Pytho
https://bugs.python.org/issue36402  opened by vstinner

#36403: AsyncIterator on 3.7: __aiter__ no longer honors finally block
https://bugs.python.org/issue36403  opened by asksol

#36404: Document PendingDeprecationWarning as deprecated
https://bugs.python.org/issue36404  opened by inada.naoki

#36406: doctest.testmod(empty_package) raises TypeError in 3.7 (and no
https://bugs.python.org/issue36406  opened by Dutcho

#36408: Tkinter multi-processing performance, Linux 10-25 times faster
https://bugs.python.org/issue36408  opened by james.mccormack

#36409: plistlib old API should be removed
https://bugs.python.org/issue36409  opened by bigfootjon

#36411: Python 3 f.tell() gets out of sync with file pointer in binary
https://bugs.python.org/issue36411  opened by pewscorner

#36414: Multiple test failures in GCC and Clang optional builds on Tra
https://bugs.python.org/issue36414  opened by xtreak

#36416: bytes.rpartition bug in online documentation
https://bugs.python.org/issue36416  opened by pewscorner

#36417: unicode.isdecimal bug in online Python 2 documentation
https://bugs.python.org/issue36417  opened by pewscorner

#36418: urllib.parse.*Result: support _replace for additional computed
https://bugs.python.org/issue36418  opened by cheryl.sabella

#36419: IDLE autocomplete: refactor and polish code and tests
https://bugs.python.org/issue36419  opened by terry.reedy

#36422: tempfile.TemporaryDirectory() removes entire directory tree ev
https://bugs.python.org/issue36422  opened by riccardomurri

#36424: Pickle fails on frozen dataclass that has slots
https://bugs.python.org/issue36424  opened by drhagen

#36426: exec() issue when used inside function
https://bugs.python.org/issue36426  opened by schperplata

#36427: Document that PyEval_RestoreThread and PyGILState_Ensure can t
https://bugs.python.org/issue36427  opened by pablogsal

#36429: Fix starting IDLE with pyshell
https://bugs.python.org/issue36429  opened by terry.reedy

#36432: Running python test suite fails on macOS 10.14.4 with resource
https://bugs.python.org/issue36432  opened by ned.deily

#36434: Zipfile breaks if signalled during write()
https://bugs.python.org/issue36434  opened by and800

#36437: method_descriptor surprising error message when self is passed
https://bugs.python.org/issue36437  opened by vstinner

#36438: Python 3.5.7 import error on Cross compile
https://bugs.python.org/issue36438  opened by Brian Spratke

#36439: Inconsistencies with datetime.fromtimestamp(t) when t < 0
https://bugs.python.org/issue36439  opened by Tiger-222

#36440: more helpful diagnostics for parser module
https://bugs.python.org/issue36440  opened by A. Skrobov

#36442: Different ValueError for the same operation in List and Tuple
https://bugs.python.org/issue36442  opened by HardikPatel

#36445: bus error in test_gil test on armhf running with 64bit kernel
https://bugs.python.org/issue36445  opened by doko

#36448: Message "You will need to rebuild pythoncore to see the change
https://bugs.python.org/issue36448  opened by jdemeyer

#36449: __aexit__ is not called when a context manager is used in an a
https://bugs.python.org/issue36449  opened by Andrey Lemets

#36453: pkgutil.get_importer only return the first valid path_hook(imp
https://bugs.python.org/issue36453  opened by Windson Yang

#36454: test_time: test_monotonic() failed on AMD64 FreeBSD 10-STABLE 
https://bugs.python.org/issue36454  opened by vstinner

#36456: task.cancel unbound recursion
https://bugs.python.org/issue36456  opened by Dima.Tisnek

#36457: functools.singledispatchmethod interacts poorly with subclasse
https://bugs.python.org/issue36457  opened by Tim Mitchell2

#36460: Add AMP MIME type support
https://bugs.python.org/issue36460  opened by Daniel Black

#36461: timeit: Additional changes for autorange
https://bugs.python

Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Steve Dower
PEP 551 (referenced from this one) contains information about using 
these hooks for security purposes, along with other approaches to 
minimize the risk of having Python in your production environments. 
Threat models have to be designed by the user; we can't predict what it 
looks like for the incredibly diverse user base we have.


This PEP is explicitly only about the API changes.

Cheers,
Steve

On 29Mar2019 1044, Ivan Pozdeev via Python-Dev wrote:
Like in the mktemp thread earlier, I would request a threat model (what 
use cases are supposed to be protected (in this case, by reporting 
rather than preventing) and from what threats) -- in the discussion, and 
eventually, in the PEP.
Without one, any claims and talks about whether something would be an 
effective security measure are pointless -- 'cuz you would never know if 
you accounted for everything and would not even have the definition of 
that "everything".


[Python-Dev] Please take your time reading PEPs (was: PEP 578: Python Runtime Audit Hooks)

2019-03-29 Thread Brett Cannon
On Thu, Mar 28, 2019 at 5:03 PM Victor Stinner  wrote:

> Hi,
>
> I read quickly the PEP
>

I would like to encourage everyone to read PEPs so that they never feel the
need to write those words ever again. ;)

PEPs are never decided in less than 24 hours, so there is no rush to read a
PEP as quickly as possible in order to reply ASAP. We also have so much
volume as it is when discussing PEPs that I think we should be encouraging
people to take the time to be informed by reading thoroughly before
replying so the back-and-forth is minimized and optimized for impactful
discussions (personally, I would love it if we all aimed for one, thorough
response/day when discussing PEPs, but that's just me). Otherwise we end up
with way more time spent in replies to things that would not have been
necessary to ask if we took our time reading. Remember, every email you
send is read by tons of other people and so there's a real time commitment
you're asking of the world every time you hit that Reply button.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Christian Heimes
On 28/03/2019 23.35, Steve Dower wrote:
> Hi all
> 
> Time is short, but I'm hoping to get PEP 578 (formerly PEP 551) into
> Python 3.8. Here's the current text for review and comment before I
> submit to the Steering Council.
> 
> The formatted text is at https://www.python.org/dev/peps/pep-0578/
> (update just pushed, so give it an hour or so, but it's fundamentally
> the same as what's there)
> 
> No Discourse post, because we don't have a python-dev equivalent there
> yet, so please reply here for this one.
> 
> Implementation is at https://github.com/zooba/cpython/tree/pep-578/ and
> my backport to 3.7 (https://github.com/zooba/cpython/tree/pep-578-3.7/)
> is already getting some real use (though this will not be added to 3.7,
> unless people *really* want it, so the backport is just for reference).
> 
> Cheers,
> Steve
> 
> =
> 
> PEP: 578
> Title: Python Runtime Audit Hooks
> Version: $Revision$
> Last-Modified: $Date$
> Author: Steve Dower 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 16-Jun-2018
> Python-Version: 3.8
> Post-History:
> 
> Abstract
> ========
> 
> This PEP describes additions to the Python API and specific behaviors
> for the CPython implementation that make actions taken by the Python
> runtime visible to auditing tools. Visibility into these actions
> provides opportunities for test frameworks, logging frameworks, and
> security tools to monitor and optionally limit actions taken by the
> runtime.
> 
> This PEP proposes adding two APIs to provide insights into a running
> Python application: one for arbitrary events, and another specific to
> the module import system. The APIs are intended to be available in all
> Python implementations, though the specific messages and values used
> are unspecified here to allow implementations the freedom to determine
> how best to provide information to their users. Some examples likely
> to be used in CPython are provided for explanatory purposes.
> 
> See PEP 551 for discussion and recommendations on enhancing the
> security of a Python runtime making use of these auditing APIs.
> 
> Background
> ==========
> 
> Python provides access to a wide range of low-level functionality on
> many common operating systems. While this is incredibly useful for
> "write-once, run-anywhere" scripting, it also makes monitoring of
> software written in Python difficult. Because Python uses native system
> APIs directly, existing monitoring tools either suffer from limited
> context or auditing bypass.
> 
> Limited context occurs when system monitoring can report that an
> action occurred, but cannot explain the sequence of events leading to
> it. For example, network monitoring at the OS level may be able to
> report "listening started on port 5678", but may not be able to
> provide the process ID, command line, parent process, or the local
> state in the program at the point that triggered the action. Firewall
> controls to prevent such an action are similarly limited, typically
> to process names or some global state such as the current user, and
> in any case rarely provide a useful log file correlated with other
> application messages.
> 
> Auditing bypass can occur when the typical system tool used for an
> action would ordinarily report its use, but accessing the APIs via
> Python does not trigger this. For example, invoking "curl" to make HTTP
> requests may be specifically monitored in an audited system, but
> Python's "urlretrieve" function is not.
> 
> Within a long-running Python application, particularly one that
> processes user-provided information such as a web app, there is a risk
> of unexpected behavior. This may be due to bugs in the code, or
> deliberately induced by a malicious user. In both cases, normal
> application logging may be bypassed resulting in no indication that
> anything out of the ordinary has occurred.
> 
> Additionally, and somewhat unique to Python, it is very easy to affect
> the code that is run in an application by manipulating either the
> import system's search path or placing files earlier on the path than
> intended. This is often seen when developers create a script with the
> same name as the module they intend to use - for example, a
> ``random.py`` file that attempts to import the standard library
> ``random`` module.
> 
> This is not sandboxing, as this proposal does not attempt to prevent
> malicious behavior (though it enables some new options to do so).
> See the `Why Not A Sandbox`_ section below for further discussion.
> 
> Overview of Changes
> ===================
> 
> The aim of these changes is to enable both application developers and
> system administrators to integrate Python into their existing
> monitoring systems without dictating how those systems look or behave.
> 
> We propose two API changes to enable this: an Audit Hook and Verified
> Open Hook. Both are available from Python and native code, allowing
> applications and frameworks written in pure Python co

Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Steve Dower

On 29Mar2019 1218, Christian Heimes wrote:
> On 28/03/2019 23.35, Steve Dower wrote:
>> The ``importlib.util.open_for_import()`` function is a drop-in
>> replacement for ``open(str(pathlike), 'rb')``. Its default behaviour is
>> to open a file for raw, binary access. To change the behaviour a new
>> handler should be set. Handler functions only accept ``str`` arguments.
>> The C API ``PyImport_OpenForImport`` function assumes UTF-8 encoding.
>
> [...]
>
>> All import and execution functionality involving code from a file will
>> be changed to use ``open_for_import()`` unconditionally. It is important
>> to note that calls to ``compile()``, ``exec()`` and ``eval()`` do not go
>> through this function - an audit hook that includes the code from these
>> calls is the best opportunity to validate code that is read from the
>> file. Given the current decoupling between import and execution in
>> Python, most imported code will go through both ``open_for_import()``
>> and the log hook for ``compile``, and so care should be taken to avoid
>> repeating verification steps.
>>
>> There is no Python API provided for changing the open hook. To modify
>> import behavior from Python code, use the existing functionality
>> provided by ``importlib``.
>
> I think that the import hook needs to be extended. It only works for
> simple Python files or pyc files. There are at least two other important
> scenarios: zipimport and shared libraries.
>
> For example, how does the import hook work with regard to alternative
> importers like zipimport? What does the import hook 'see' for an import
> from a zipfile?

Yes, good point. I think opening the zip file with open_for_import() is
the right place to do it, as this operation relates to opening the file
on disk rather than files within it.

> Shared libraries are trickier. libc doesn't define a way to dlopen()
> from a file descriptor. dlopen() takes a file name, but a file name
> leaves the audit hook open to a TOCTOU attack.


For Windows, at least, the operating system can run its own validation 
on native modules (if you're using a feature like DeviceGuard, for 
example), so the hook likely isn't necessary for those purposes. I 
believe some configurations of Linux allow this as well?


But there's likely no better option here than a combination of good ACLs 
and checking by filename, which at least lets you whitelist the files 
you know you want to allow. Similarly for the zip file - if you trust a 
particular file and trust your ACLs, checking by filename is fine. That 
said, specific audit events for "I'm about to open this zip/dlopen this 
file for import" are very easy to add. (The PEP proposes many examples, 
but is not trying to be exhaustive. If accepted, we should feel free to 
add new events as we identify places where they matter.)


Aside: an important aspect of this per-file approach to execution is 
that the idea is generally to *enable* the files you trust, rather than 
disable the files that are bad. So the detection routines are typically 
"does this match a known hash" or "is this in a secure location", which 
for a carefully deployed system are already known values, rather than 
trying to figure out whether a file might do a bad thing. If you can't 
validate the files in your deployment match the ones you thought you 
were deploying, you are so far from needing this that it doesn't even 
matter, but most of the deployments I work with are *at least* this well 
controlled.
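
To make the "does this match a known hash" style of check above concrete, here is a minimal sketch. It is not part of the PEP or its implementation: the helper names `sha256_of` and `is_trusted` and the in-memory allowlist are assumptions chosen purely for illustration, and a real deployment would also need the ACLs discussed above to protect both the files and the allowlist.

```python
import hashlib
import os
import tempfile


def sha256_of(path):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_trusted(path, allowed_hashes):
    """Hypothetical check: allow a file only if its hash is on the allowlist."""
    return sha256_of(path) in allowed_hashes


# Demo: "deploy" one file, record its hash, then tamper with it.
with tempfile.TemporaryDirectory() as tmp:
    module_path = os.path.join(tmp, "mod.py")
    with open(module_path, "w") as f:
        f.write("x = 1\n")
    allowed = {sha256_of(module_path)}
    trusted_before = is_trusted(module_path, allowed)

    with open(module_path, "a") as f:
        f.write("import os\n")  # simulated tampering after deployment
    trusted_after = is_trusted(module_path, allowed)
```

The check only ever answers "is this one of the files I deployed?", which is the *enable what you trust* approach described above, rather than trying to decide whether unknown code is harmful.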


Cheers,
Steve
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Strange umask(?)/st_mode issue

2019-03-29 Thread Steve Dower

Hi

I'm trying to track down the cause of the failed Linux tests on this 
build (it's also been happening on random PR builds for the last day or 
two, but this is the first I've seen it happen on already merged code):


https://dev.azure.com/Python/cpython/_build/results?buildId=40189

One of the failed tests is this one from test_tarfile.py (and I think 
this is representative of the others):


    def test_file_mode(self):
        # Test for issue #8464: Create files with correct
        # permissions.
        if os.path.exists(tmpname):
            support.unlink(tmpname)

        original_umask = os.umask(0o022)
        try:
            tar = tarfile.open(tmpname, self.mode)
            tar.close()
            mode = os.stat(tmpname).st_mode & 0o777
            self.assertEqual(mode, 0o644, "wrong file permissions")
        finally:
            os.umask(original_umask)

It's failing because the mode of the new TAR file is 0o666 rather than 
0o644. I'd guess that this means the umask() call is not having effect? 
But honestly this is beyond me - I don't have any idea what would cause 
Python to ignore that, and I don't see anything in the commit history 
that could be at fault.
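
For anyone wanting to check the platform independently of tarfile, the following is a minimal sketch of the behaviour the test relies on. It assumes a POSIX system and a temporary directory without default ACLs; the function name `mode_after_create` is made up for this illustration. A file created with a requested mode of 0o666 should end up with 0o666 masked by the current umask.

```python
import os
import tempfile


def mode_after_create(umask_value):
    """Create a file requesting 0o666 under the given umask; return its mode bits."""
    old = os.umask(umask_value)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "probe")
            # Request rw-rw-rw-; the kernel clears the bits set in the umask.
            fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o666)
            os.close(fd)
            return os.stat(path).st_mode & 0o777
    finally:
        # Always restore the previous umask, as the test above does.
        os.umask(old)


mode_022 = mode_after_create(0o022)
mode_000 = mode_after_create(0o000)
print(oct(mode_022), oct(mode_000))
```

If the first value comes back as 0o666 instead of 0o644 on the build machine, that would point at the environment rather than at tarfile.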


I'm hoping someone can tell me that it's definitely a platform 
configuration issue or it's definitely interference between two of our 
tests (which run in random order, which would explain the intermittent 
nature of this, but I haven't spotted any tests that seem likely either).


To see the other failed PR builds, the full list is at 
https://dev.azure.com/Python/cpython/_build?definitionId=9 and most of 
the ones from today have failed because of whatever is causing it.


Any help?

Thanks,
Steve


Re: [Python-Dev] Strange umask(?)/st_mode issue

2019-03-29 Thread Nathaniel Smith
That does sound strange. How easily can you reproduce it? That majorly
affects how I would try to debug something like this...

If you're able to get an strace of a failed run then that would probably
tell us a *lot*.

On Fri, Mar 29, 2019, 16:28 Steve Dower  wrote:

> Hi
>
> I'm trying to track down the cause of the failed Linux tests on this
> build (it's also been happening on random PR builds for the last day or
> two, but this is the first I've seen it happen on already merged code):
>
> https://dev.azure.com/Python/cpython/_build/results?buildId=40189
>
> One of the failed tests is this one from test_tarfile.py (and I think
> this is representative of the others):
>
>     def test_file_mode(self):
>         # Test for issue #8464: Create files with correct
>         # permissions.
>         if os.path.exists(tmpname):
>             support.unlink(tmpname)
>
>         original_umask = os.umask(0o022)
>         try:
>             tar = tarfile.open(tmpname, self.mode)
>             tar.close()
>             mode = os.stat(tmpname).st_mode & 0o777
>             self.assertEqual(mode, 0o644, "wrong file permissions")
>         finally:
>             os.umask(original_umask)
>
> It's failing because the mode of the new TAR file is 0o666 rather than
> 0o644. I'd guess that this means the umask() call is not having effect?
> But honestly this is beyond me - I don't have any idea what would cause
> Python to ignore that, and I don't see anything in the commit history
> that could be at fault.
>
> I'm hoping someone can tell me that it's definitely a platform
> configuration issue or it's definitely interference between two of our
> tests (which run in random order, which would explain the intermittent
> nature of this, but I haven't spotted any tests that seem likely either).
>
> To see the other failed PR builds, the full list is at
> https://dev.azure.com/Python/cpython/_build?definitionId=9 and most of
> the ones from today have failed because of whatever is causing it.
>
> Any help?
>
> Thanks,
> Steve


Re: [Python-Dev] New Python Initialization API

2019-03-29 Thread Victor Stinner
The purpose of PEP 587 is to have a working document so everyone
can look at the proposed API (staying focused on the API rather than
bothering with the implementation). IMHO it's now time to get more
people looking at the Python initialization.

> But there are enough of us
> with fuzzy but valid ideas in our heads that we really need that
> brainstorming session to mix them together and find something feasible.
> Maybe we're best to put it off until PyCon at this point.

The Python 3.8 feature freeze is scheduled for the end of May, less than
one month after PyCon. That might be a little too late, no?

Would you mind elaborating on these ideas?

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.


Re: [Python-Dev] Strange umask(?)/st_mode issue

2019-03-29 Thread Steve Dower
On 29Mar.2019 1731, Nathaniel Smith wrote:
> That does sound strange. 

Oh good, it's not just me :)

> How easily can you reproduce it? That majorly
> affects how I would try to debug something like this...

No idea. Looking at the builds on Pipelines again, it seems that all the
builds since about 10am PDT have been failing, so perhaps they rolled
out something that changed how umask works? I don't have any idea
whether that's possible though.

> If you're able to get an strace of a failed run then that would probably
> tell us a *lot*.

I'll need help with that :) But if it's just commands in the
.azure-pipelines/posix-steps.yml file then sending a PR with them added
should do it.

Though given there seems to be a clear start time and no relevant
change, I'm inclined to report it to the Pipelines team (which I already
have) and see what they say. Normally my internal CPython builds catch
issues in the hosted VMs before they reach public accounts, but this may
have been a hotfix for some other issue.

Cheers,
Steve


Re: [Python-Dev] Strange umask(?)/st_mode issue

2019-03-29 Thread Cameron Simpson

On 29Mar2019 19:30, Steve Dower wrote:
> On 29Mar.2019 1731, Nathaniel Smith wrote:
>> That does sound strange.
>
> Oh good, it's not just me :)
>
>> How easily can you reproduce it? That majorly
>> affects how I would try to debug something like this...
>
> No idea. Looking at the builds on Pipelines again, it seems that all the
> builds since about 10am PDT have been failing, so perhaps they rolled
> out something that changed how umask works? I don't have any idea
> whether that's possible though.
>
>> If you're able to get an strace of a failed run then that would probably
>> tell us a *lot*.
>
> I'll need help with that :) But if it's just commands in the
> .azure-pipelines/posix-steps.yml file then sending a PR with them added
> should do it.


Can you get a branch into your pipeline? Then you could just hack the
tarfile test with something quick and dirty like:

   pid = os.getpid()
   os.system("strace -p %d 2>/path/to/strace.out &" % pid)
   time.sleep(2)   # give strace heaps of time to start

just before the tarfile open. A ghastly hack but it would get you 
debugging info. You could even decide to remove the strace.out file if 
the umask issue doesn't show, if it is erratic (can't see why it would 
be though).


Cheers,
Cameron Simpson 


Re: [Python-Dev] New Python Initialization API

2019-03-29 Thread Steve Dower
On 29Mar.2019 1830, Victor Stinner wrote:
> The purpose of PEP 587 is to have a working document so everyone
> can look at the proposed API (staying focused on the API rather than
> bothering with the implementation). IMHO it's now time to get more
> people looking at the Python initialization.
> 
>> But there are enough of us
>> with fuzzy but valid ideas in our heads that we really need that
>> brainstorming session to mix them together and find something feasible.
>> Maybe we're best to put it off until PyCon at this point.
> 
> The Python 3.8 feature freeze is scheduled for the end of May, less than
> one month after PyCon. That might be a little too late, no?

I don't think we want to rush this in for 3.8 at this point anyway. The
design of how Python is embedded is one of those things that could
drastically affect the scenarios it gets used for in the future
(probably half of my tasks at work right now involve embedding CPython),
so I'd like to get it right.

> Would you mind elaborating on these ideas?

I'd love to, but I don't have them all straight right now, and one of
the problems with putting them in writing is I don't get immediate
feedback when I'm not being clear enough or if there is missing context.
I know you personally have seen most of my ideas, because I keep pinging
you on them ;)

My big one is what I posted on capi-sig about being able to classify our
APIs better and define scenarios where they are ready for use, as well
as breaking up unnecessary dependencies so that embedders have more
flexibility (the rings and layers post). I posted a few examples of how
initialization "could" be on various bugs I've had to deal with relating
to it, and obviously I've been pushing the embeddable distro for Windows
for a while (which is surprisingly popular with a very specific subset
of users), as well as using it myself, so there are things that just
annoy me enough about what we currently have.

But I really do think this should start as a high bandwidth, in-person
brainstorm session to get through the first few big scenarios. Then
it'll be easy to open those up to review and let anyone submit their
needs for hosting Python. And once we've collated a good set of "needs"
we'll have a chance of designing the configuration and initialization
APIs that will satisfy most/all of them. Maybe in time for 3.9 (or 3.10,
if our RM gets the accelerated release cycle he wants ;) ).

I personally think being able to embed Python easily and safely in other
applications will be a powerful feature that will allow many
non-developers to write code to get their work done, as we already see
with Jupyter (and family). More are coming, but the responsibility is on
us to make it successful. I want to get it right.

Cheers,
Steve


Re: [Python-Dev] Strange umask(?)/st_mode issue

2019-03-29 Thread Steve Dower
On 29Mar.2019 1939, Cameron Simpson wrote:
> Can you get a branch into your pipeline? Then you could just hack the
> tarfile test with something quick and dirty like:
>
>    pid = os.getpid()
>    os.system("strace -p %d 2>/path/to/strace.out &" % pid)
>    time.sleep(2)   # give strace heaps of time to start
> 
> just before the tarfile open. A ghastly hack but it would get you
> debugging info. You could even decide to remove the strace.out file if
> the umask issue doesn't show, if it is erratic (can't see why it would
> be though).

Perfect, I'll give this a go. Thanks!


Re: [Python-Dev] PEP 578: Python Runtime Audit Hooks

2019-03-29 Thread Inada Naoki
I don't like adding more Python callbacks from low-level code.

The Python runtime is already very complicated, especially __del__,
the shutdown process, and multithreading.  Python callbacks from low-level
code are always a source of very difficult bugs.

Additionally, if we used the PEP for logging in a complex application,
the log would be unreliable.  For example:

1. C code wants to open file A and calls the callback.
2. In the callback, "A is opened" is logged.
3. In the same callback, an import may happen and be logged.
4. In the same callback, another thread may run and anything can be logged.
5. Many other things happen and the callback is called again.
6. Then file A is actually opened.

In this example, the ordering and timing of the logged events are very
different from the ordering and timing of the real events.
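
The gap between the logged event and the real event can be seen in a small sketch. It assumes the `sys.addaudithook` API and the "open" event as they shipped in CPython 3.8, which may differ in detail from the draft under discussion; the names `hook`, `events` and `missing` are illustrative only. The hook runs before the file is opened, so an open that never succeeds is still logged.

```python
import os
import sys
import tempfile

events = []


def hook(event, args):
    # Record only "open" events and do as little work as possible here,
    # since the hook runs in the middle of the operation being audited.
    if event == "open":
        events.append(args[0])


# Note: audit hooks cannot be removed once added (CPython 3.8+).
sys.addaudithook(hook)

with tempfile.TemporaryDirectory() as tmp:
    missing = os.path.join(tmp, "does-not-exist.txt")
    try:
        open(missing)
        opened = True
    except FileNotFoundError:
        opened = False

# The event was recorded even though the file was never actually opened.
logged = missing in events
print("logged:", logged, "opened:", opened)
```

A log consumer therefore has to treat such an entry as "an open was attempted", not "the file was opened".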

I prefer low-level tools for tracing low-level things, even though they
lack some application context.  DTrace may become a more important tool.

https://techcommunity.microsoft.com/t5/Windows-Kernel-Internals/DTrace-on-Windows/ba-p/362902

Regards,
-- 
Inada Naoki  