Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Robert Collins
On 11 April 2016 at 13:49, Tres Seaver  wrote:
> On 04/10/2016 06:31 PM, Jon Ribbens wrote:
>> Unless someone knows a way to get to an object's __dict__ or its type
>> without using vars() or type() or underscore attributes...
>
> Hmm, 'classmethod'-wrapped functions get passed the type.

yeah, but to access that you need to assign the descriptor to the type
- circular loop. If you can arrange that assignment it's easy:


thetype = []
class gettype:
    def __get__(self, obj, type=None):
        thetype.append((obj, type))
        return None

classIwant.query = gettype()
classIwant().query
thetype[0][1]...

but you've already gotten to classIwant there.
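A self-contained version of the descriptor trick, as a minimal sketch. The names `GetType`, `Target`, and `captured` are hypothetical stand-ins (with `Target` playing the role of `classIwant`), and the assignment step still assumes you already hold a reference to the class:

```python
captured = []


class GetType:
    """Descriptor that records the (instance, owner type) pair it is looked up with."""

    def __get__(self, obj, objtype=None):
        captured.append((obj, objtype))
        return None


class Target:  # hypothetical stand-in for "classIwant"
    pass


# Assigning the descriptor requires already having the class object.
Target.query = GetType()

instance = Target()
instance.query  # attribute lookup calls GetType.__get__(instance, Target)

leaked_type = captured[0][1]
```

The lookup hands the owner type to `__get__` without any call to `type()` or any underscore attribute access in the calling code, which is the point being made; the circularity is in the assignment line.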

-Rob

-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Serhiy Storchaka

On 11.04.16 00:53, Jon Ribbens wrote:

Try following example:

 it = iter([1])
 for i in range(100):
     it = filter(None, it)
 next(it)


That does indeed segfault. I guess you should report that as a bug!


There is an old issue that doesn't have an adequate solution. And this is
only one example; you can get a segfault with other recursive iterators.
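For readers who want to see the shape of the construction without risking a crash, here is a minimal sketch assuming Python 3 (where `filter` is lazy). The helper name `nested_filter` is hypothetical. Each `filter` object wraps the previous one, so a single `next()` call descends through every layer; the crash discussed above is reported for far deeper chains than the one used here.

```python
def nested_filter(depth):
    """Wrap a one-element iterator in `depth` layers of lazy filter objects."""
    it = iter([1])
    for _ in range(depth):
        it = filter(None, it)
    return it


# A shallow chain is harmless: the value simply passes through every layer.
value = next(nested_filter(100))
```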





Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Oleg Broytman
On Mon, Apr 11, 2016 at 08:06:34AM +0200, Oleg Broytman  wrote:
> On Mon, Apr 11, 2016 at 12:42:47AM -0500, Wes Turner  
> wrote:
> > On Sun, Apr 10, 2016 at 10:50 PM, Oleg Broytman  wrote:
> > 
> > > On Mon, Apr 11, 2016 at 01:09:19PM +1000, Steven D'Aprano <
> > > st...@pearwood.info> wrote:
> > > > On Sun, Apr 10, 2016 at 08:12:30PM -0400, Jonathan Goble wrote:
> > > > > On Sun, Apr 10, 2016 at 7:02 PM, Oscar Benjamin
> > > > >  wrote:
> > > > > > I haven't looked at your sandbox but for a different approach try
> > > this one:
> > > > > >
> > > > > >   L = [None]
> > > > > >   L.extend(iter(L))
> > > > > >
> > > > > > On my Linux machine that doesn't just crash Python.
> > > > >
> > > > > For the record: don't try this if you have unsaved files open on your
> > > > > computer, because you will lose them. When I typed these two lines
> > > > > into the Py3.5 interactive prompt, it completely and totally froze
> > > > > Windows to the point that nothing would respond and I had to resort to
> > > > > the old trick of holding the power button down for five seconds to
> > > > > forcibly shut the computer down.
> > > >
> > > >
> > > > I think this might improve matters:
> > > >
> > > > http://bugs.python.org/issue26351
> > > >
> > > > although I must admit I don't understand why the entire OS is affected.
> > >
> > >Memory exhaustion?
> > *
> > https://docs.docker.com/compose/compose-file/#cpu-shares-cpu-quota-cpuset-domainname-hostname-ipc-mac-address-mem-limit-memswap-limit-privileged-read-only-restart-stdin-open-tty-user-working-dir
> > 
> > * https://github.com/jupyter/dockerspawner/blob/master/systemuser/Dockerfile
> 
>I think memory control groups in Linux can be used to limit memory
> usage. I have mem. c. g. configured and I'll try to find time to
> experiment with the code above.

   With limited memory it was fast:

$ ulimit -d 5 -m 8 -s 1 -v 10
$ python
Python 2.7.9 (default, Mar  1 2015, 18:22:53) 
[GCC 4.9.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> L = [None]
>>> L.extend(iter(L))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
MemoryError

   Memory control groups don't help because they don't limit virtual
memory so the process simply starts thrashing.
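Why the two-line snippet never finishes on its own can be shown safely, as a sketch: a list iterator re-checks the list's length on every step, so extending a list from its own iterator keeps producing items as fast as they are appended. Bounding the iterator with `itertools.islice` (an addition for this illustration, not part of the original snippet) makes the effect visible without exhausting memory.

```python
from itertools import islice

L = [None]

# Unbounded, `L.extend(iter(L))` would keep growing until memory runs out.
# Capping the iterator at 1000 items shows it never runs dry by itself.
L.extend(islice(iter(L), 1000))

final_length = len(L)
```

The list started with one element, and the capped iterator still delivered the full 1000 items, all of them produced by the growth it was causing.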

> > > > --
> > > > Steve

Oleg.
-- 
     Oleg Broytman            http://phdru.name/            p...@phdru.name
   Programmers don't die, they just GOSUB without RETURN.


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Victor Stinner
2016-04-10 18:43 GMT+02:00 Jon Ribbens :
> On Sat, Apr 09, 2016 at 02:43:19PM +0200, Victor Stinner wrote:
>>Please don't lose time trying yet another sandbox inside CPython. It's
>>just a waste of time. It's broken by design.
>>
>>Please read my email about my attempt (pysandbox):
>>https://lwn.net/Articles/574323/
>>
>>And the LWN article:
>>https://lwn.net/Articles/574215/
>>
>>There are a lot of safe ways to run CPython inside a sandbox (and not the
>>opposite).
>>
>>I started as you, add more and more things to a blacklist, but it doesn't
>>work.
>
> That's the opposite of my approach though - I'm starting small and
> adding things, not starting with everything and removing stuff. Even
> if what we end up with is an extremely restricted subset of Python,
> there are still cases where that could be a useful tool to have.

Your design relies on the assumption that CPython is only pure Python.
That's wrong. A *lot* of Python features are implemented in C and
"ignore" your sandboxing code. Quick reminder: 50% of CPython is
written in the C language.

It means that your protections like hiding builtin functions from the
Python scope don't work. If an attacker gets access to a C function
giving access to the hidden builtin, the game is over.

pysandbox is based on the idea of tav (his project safelite.py):
remove features in the dictionary of builtin C types like FrameType,
CodeObject, etc. See sandbox/attributes.py. It's not enough to be 100%
safe, a C function can still access fields of the C structure
directly, but it was enough to protect "most" C functions.

It's hard to list all features of the C code which are indirectly
accessible from the Python scope. Some examples: warnings and
tracebacks. These features killed the pysandbox project because they
open files on the filesystem directly; it's not possible to control
these features from the Python scope.

Another example which exposes a vulnerability of your sandbox:
str.format() gets directly object attributes without the getattr()
builtin function, so it's possible to escape your sandbox. Example:
"{0.__class__}".format(obj) shows the type of an object.
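A minimal sketch of the `str.format()` point. The `Account` class and its `owner` attribute are hypothetical; the only thing relied on is that format fields may contain attribute lookups, which are performed by the formatting machinery rather than by a `getattr()` call or an attribute expression in the source:

```python
class Account:  # hypothetical class, for illustration only
    def __init__(self):
        self.owner = "alice"


obj = Account()

# The attribute names live inside a string literal, so a checker that only
# inspects attribute expressions in the code never sees them.
type_name = "{0.__class__.__name__}".format(obj)
attr_value = "{0.owner}".format(obj)
```

Note that what comes back is always a string, not the object itself.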

Think also about the new f-string which allows arbitrary Python code: f"{code}".


> However on the other hand, nobody has tried before to do what I am
> doing (static code analysis),

You're wrong.

Zope Security ("RestrictedPython") has a similar design. Analyzing AST
is a common design to build a sandbox. But it's not safe.

The "See also" section of my pysandbox has a long list of Python
sandboxes with various designs.


> so it's not necessarily a safe
> assumption that the idea is doomed. For example, as far as I can see,
> none of the methods used to break out of your pysandbox would work to
> break out of my experiment.

What I see is that you asked to break your sandbox, and less than 1
hour later, a first vulnerability was found (exec called with two
parameters). A few hours later, a second vulnerability was found
(async generator and cr_frame). By the way, are you sure that you
fixed the vulnerability? You blacklisted "cb_frame", not cr_frame ;-)

You should look closer, pysandbox is very close to your project. It
also uses whitelists for some protections (ex: builtins) and blacklist
for other protections (ex: hide sensitive attributes). You are using a
blacklist for attributes. By the way, you hide cr_frame but not
cr_code. I'm quite sure that it's possible to execute arbitrary
bytecode in your sandbox, I just don't have enough time to dig into
the code. Your sandbox is not fully based on whitelists.
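To make the `cr_frame` / `cr_code` remark concrete, here is a small sketch assuming Python 3.5 or later. The coroutine name `compute` is hypothetical. It only lists which of the two attributes a coroutine object exposes and reads the code object's name; it does not attempt any escape.

```python
async def compute():
    return 42


coro = compute()

# Both attributes are reachable on an ordinary coroutine object.
exposed = sorted(name for name in ("cr_frame", "cr_code") if hasattr(coro, name))
code_name = coro.cr_code.co_name

coro.close()  # avoid a "coroutine was never awaited" warning
```

Blocking one of the two names by itself leaves the other reachable, which is the blacklist weakness being pointed out.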

Victor


Re: [Python-Dev] Pathlib enhancments - method name only

2016-04-11 Thread Koos Zevenhoven
On Sat, Apr 9, 2016 at 10:48 AM, Nick Coghlan  wrote:
> On 9 April 2016 at 04:25, Brett Cannon  wrote:
>> On Fri, 8 Apr 2016 at 11:13 Ethan Furman  wrote:
>>> On 04/08/2016 10:46 AM, Koos Zevenhoven wrote:
>>>  > On Fri, Apr 8, 2016 at 7:42 PM, Chris Barker  wrote:
>>>  >> On Fri, Apr 8, 2016 at 9:02 AM, Koos Zevenhoven wrote:
>>>  >>>
>>>  >>> I'm still thinking a little bit about 'pathname', which to me sounds
>>>  >>> more like a string than fspath does.
>>>  >>
>>>  >>
>>>  >> I like that a lot - or even "__pathstr__" or "__pathstring__"
>>>  >> after all, we're making a big deal out of the fact that a path is
>>>  >> *not a string*, but rather a string is a *representation* (or
>>>  >> serialization) of a path.
>>>
>>> That's a decent point.
>>>
>>> So the plausible choices are, I think:
>>>
>>> - __fspath__  # File System Path -- possible confusion with Path
>>
>> +1
>
> I like __fspath__, but I'm also sympathetic to Koos' point that we're
> really dealing with path *names* being produced via this protocol,
> rather than the paths themselves.
>
> That would bring the completely explicit "__fspathname__" into the
> mix, which would be comparable in length to "__getattribute__" as a
> magic method name (both in terms of number of syllables and number of
> characters).
>
> Considering the helper function usage, here's some examples in
> combination with os.fsencode and os.fsdecode:
>
> # Status quo for binary/text path conversions
> text_path = os.fsdecode(bytes_path)
> bytes_path = os.fsencode(text_path)
>
> # Getting a text path from an arbitrary object
> text_path = os.fspath(obj) # This doesn't scream "returns text!" to me
> text_path = os.fspathname(obj) # This does
>
> # Getting a binary path from an arbitrary object
> bytes_path = os.fsencode(os.fspath(obj))
> bytes_path = os.fsencode(os.fspathname(obj))
>
> I'm starting to think the semantic nudge from the "name" suffix when
> reading the code is worth the extra four characters when writing it
> (keeping in mind that the whole point of this exercise is that most
> folks *won't* be writing explicit conversions - the stdlib will handle
> it on their behalf).
>

Regarding the name, I completely agree with Nick's reasoning (above).
I'm not sure it's a high priority to make dunder-method names short.
They are not typed very often, and when the number of these
"protocols" increases, you face potentially ambiguous names more and
more often (there already is a '__path__' and a '__file__' etc., as
has been brought up earlier in these threads.). In other words, it's a
good idea to have some information in the name.

> I also think the more explicit name helps answer some of the type
> signature questions that have arisen:
>
> 1. Does os.fspathname return rich Path objects? No, it returns names
> as str objects

Or byte strings, it seems, unfortunately.

> 2. Will file descriptors pass through os.fspathname? No, as they're
> not names, they're numeric descriptors.
> 3. Will bytes-like objects pass through os.fspathname? No, as they're
> not names, they're encodings of names
>

If fspathname(...) is to be used in os.path.*, it will break things if
it starts to turn encoded bytes pathnames into str pathnames, which it
did not previously do.

And if fspathname is not to be used in os.path.*, who would be our
intended user of fspathname? I assume we don't want to encourage
typical 'users' to manipulate pathnames by hand.

>> I personally still like __ospath__ as well.
>
> That one fails the "Is it ambiguous when spoken aloud?" test for me:
> if someone mentions "oh-ess-path", are they talking about os.path or
> __ospath__? With "eff-ess-path" or "eff-ess-path-name", that problem
> doesn't arise.
>

+1 to this too.

-Koos


Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #25910: Fixed more links in the docs.

2016-04-11 Thread Tim Golden
On 11/04/2016 15:38, serhiy.storchaka wrote:
> -  `__.
> +  `__.

Is there any intended irony in our link to openssl not being via https?

:)

TJG


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Jon Ribbens
On Mon, Apr 11, 2016 at 11:40:05AM +0200, Victor Stinner wrote:
> 2016-04-10 18:43 GMT+02:00 Jon Ribbens :
> > That's the opposite of my approach though - I'm starting small and
> > adding things, not starting with everything and removing stuff. Even
> > if what we end up with is an extremely restricted subset of Python,
> > there are still cases where that could be a useful tool to have.
> 
> Your design relies on the assumption that CPython is only pure Python.

No it doesn't. Obviously I know CPython is written in C - the clue is
in the name. I'm not sure what you mean here. 

> It means that your protections like hiding builtin functions from the
> Python scope don't work. If an attacker gets access to a C function
> giving access to the hidden builtin, the game is over.

The former is only true if you assume the latter is possible.
Is there any reason to believe it is?

> It's hard to list all features of the C code which are indirectly
> accessible from the Python scope. Some examples: warnings and
> tracebacks. These features killed the pysandbox project because they
> open directly files on the filesystem, it's not possible to control
> these features from the Python scope.

I think what you're referring to is when they show context for errors,
for which they try and find the source code lines to display by
identifying the filename, and you can subvert that process by changing
__file__ and/or __name__. If so, you can't do that within my
experiment because you're not allowed to access either of those names.

> Another example which exposes a vulnerability of your sandbox:
> str.format() gets directly object attributes without the getattr()
> builtin function, so it's possible to escape your sandbox. Example:
> "{0.__class__}".format(obj) shows the type of an object.

Yes, I'd thought of that. However getting access to a string which
contains the name or a representation of an object is not at all the
same thing as getting access to the object itself. 

> Think also about the new f-string which allows arbitrary Python
> code: f"{code}".

Obviously I can't speak to features of future versions of Python.
I'd have to see the ast generated by an f-string to know if they
pose a problem or not, but I suspect they would compile to
expression nodes and hence be caught by the existing checks.
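The AST question can be checked directly on an interpreter that has f-strings (Python 3.6 or later, which did not exist when this was written). A sketch, not a statement about any particular sandbox: it parses an f-string and a `str.format()` call and lists the attribute names that appear as `ast.Attribute` nodes in each.

```python
import ast


def attribute_names(source):
    """Return the names of all attribute accesses visible in the parsed AST."""
    tree = ast.parse(source, mode="eval")
    return [n.attr for n in ast.walk(tree) if isinstance(n, ast.Attribute)]


fstring_tree = ast.parse('f"{obj.__class__}"', mode="eval")
node_types = [type(n).__name__ for n in ast.walk(fstring_tree)]

# In the f-string, the lookup is a real expression node ...
fstring_attrs = attribute_names('f"{obj.__class__}"')
# ... while with str.format() only the method name itself is visible.
format_attrs = attribute_names('"{0.__class__}".format(obj)')
```

On such an interpreter the f-string contributes a `JoinedStr` containing a `FormattedValue`, whose value is an ordinary attribute expression, so an AST-based name check would see `__class__` there. The `str.format()` version hides the same name inside a string constant.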

> > However on the other hand, nobody has tried before to do what I am
> > doing (static code analysis),
> 
> You're wrong.
> 
> Zope Security ("RestrictedPython") has a similar design. Analyzing AST
> is a common design to build a sandbox. But it's not safe.

Ah, I hadn't seen that one. Yes, they are doing something similar
(but also much more complex!) I don't know why you say this is
a "common design" though, that one is the only one that appears to
use it.

> What I see is that you asked to break your sandbox, and less than 1
> hour later, a first vulnerability was found (exec called with two
> parameters). A few hours later, a second vulnerability was found
> (async generator and cr_frame).

The former was just a stupid bug, it says nothing about the viability
of the methodology. The latter was a new feature in a Python version
later than I have ever used, and again does not imply anything much
about the viability. I think now I've blocked the names of frame
object attributes it wouldn't be a vulnerability any more anyway.

> By the way, are you sure that you fixed the vulnerability? You
> blacklisted "cb_frame", not cr_frame ;-)

Ah, thanks. As above, I think this doesn't actually make any
difference, but I've updated the code regardless.

> You should look closer, pysandbox is very close to your project.

I've just looked through it all again, and I don't understand why you
are saying that. It's nothing like my experiment. It's trying to alter
the global Python environment so that arbitrary code can be executed,
whereas I am not even trying to allow execution of arbitrary code and
am not altering the global environment.


Re: [Python-Dev] Pathlib enhancments - method name only

2016-04-11 Thread Antoine Pitrou
Ethan Furman writes:
> 
> That's a decent point.
> 
> So the plausible choices are, I think:
> 
> - __fspath__  # File System Path -- possible confusion with Path

This would have my preference.

Regards

Antoine.




Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Antoine Pitrou
Ethan Furman writes:
>  > I also think the more explicit name helps answer some of the type
>  > signature questions that have arisen:
>  >
>  > 1. Does os.fspathname return rich Path objects? No, it returns names
>  > as str objects
>  > 2. Will file descriptors pass through os.fspathname? No, as they're
>  > not names, they're numeric descriptors.
>  > 3. Will bytes-like objects pass through os.fspathname? No, as they're
>  > not names, they're encodings of names
> 
> If we add os.fspath(), but don't allow bytes to be returned from it, our 
> above example looks more like:
> 
>    if isinstance(a_path_thingy, bytes):
>        # because os can accept bytes
>        pass
>    else:
>        a_path_thingy = os.fspath(a_path_thingy)
>    # do something with the path
> 
> Yes, it's better -- but it still requires a pre-check before calling 
> os.fspath().
> 
> It is my contention that this is better:
> 
>a_path_thingy = os.fspath(a_path_thingy)

It's not better, because a_path_thingy then may be a bytes object,
and the os.fspath() caller has to deal with it.  Conversely, if
os.fspath() is guaranteed to return a unicode string, then the caller
only has to worry about bytes paths if it really wants to; most callers
probably don't care.

I know what some people say: support for bytes paths is necessary
for "low-level functions" (definition required ;-)).  But in a
PEP 383 world, it's not necessary at all.
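The PEP 383 argument rests on the `surrogateescape` error handler: bytes that cannot be decoded are smuggled through `str` as lone surrogates and restored unchanged on encoding. A minimal sketch, using an explicit UTF-8 codec so the result does not depend on the platform's filesystem encoding (the file name is made up for the example):

```python
# A file name that is not valid UTF-8 (0xE9 is Latin-1 "é").
raw = b"caf\xe9.txt"

# Decoding with surrogateescape never fails; the bad byte becomes U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = text.encode("utf-8", "surrogateescape")
```

On POSIX systems `os.fsdecode()` and `os.fsencode()` apply this same handler with the filesystem encoding, which is why a str-only API can still address every file name.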

> 2) pathlib.Path accepts bytes --

Does it? Or are you proposing such a change?

>>> pathlib.Path(b".")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/antoine/35/lib/python3.5/pathlib.py", line 956, in __new__
    self = cls._from_parts(args, init=False)
  File "/home/antoine/35/lib/python3.5/pathlib.py", line 638, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/home/antoine/35/lib/python3.5/pathlib.py", line 630, in _parse_args
    % type(a))
TypeError: argument should be a path or str object, not <class 'bytes'>


Regards

Antoine.




Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Koos Zevenhoven
On Mon, Apr 11, 2016 at 9:27 AM, Nick Coghlan  wrote:
> On 11 April 2016 at 02:16, Ethan Furman  wrote:
>>
>> I guess I don't see the point of this.  Either DirEntry's [1] only get
>> partial support (which is only marginally better than the no support pathlib
>> currently has), or stdlib code will need to catch those errors and then do
>> an isinstance check to see if it knows what the type is and how to deal with it
>> [1].
>
> What's wrong with only gaining partial support? Standard library code
> that doesn't currently support DirEntry at all will gain the ability
> to support str-based DirEntry objects, while bytes-based DirEntry
> objects will continue to be a low level object that isn't
> interoperable with most other APIs (which is fine - anyone writing low
> level POSIX-specific code can deal with unpacking the values
> explicitly, it just won't happen implicitly anywhere).
>

While I'm also tempted to lean towards 'marginalizing bytes support',
it seems a little bit dangerous to me. Currently, os.path is heavily
based on duck typing of str and bytes, so there may be code out there
that does all kinds of things with paths without knowing whether it
deals with bytes or str objects. If such code gets in contact with
this pathname protocol, it will raise an exception whenever it happens
to be fed a bytes path. That is, if the approach of 'partial support'
is taken.
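The duck typing mentioned here is easy to observe, as a short sketch: `os.path.join` accepts all-str or all-bytes arguments and returns the same type it was given, but refuses a mixture. The path components are arbitrary.

```python
import os.path

text_result = os.path.join("spam", "eggs")      # str in, str out
bytes_result = os.path.join(b"spam", b"eggs")   # bytes in, bytes out

# Mixing the two types is rejected.
try:
    os.path.join("spam", b"eggs")
    mixed_error = None
except TypeError as exc:
    mixed_error = exc
```

Code written against this behaviour never needs to know which of the two types it is handling, which is why a str-only protocol could surprise it.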

And still there is the question I just posted in another branch of
this mess: Who should use os.fspathname(...)? If it's os.path.* and
other traditional (low-level?) functions that deal with paths, then
fspathname should, in the name of backwards compatibility, be able to
deal with bytes and return bytes in those cases.  Otherwise fspathname
would do nothing for you, and all the work of
isinstance/hasattr/whatever would be left to the caller of
os.fspathname (or maybe this is what you want?).

So a somewhat useful fspathname might indeed look something like this:

    def fspathname(pathlike) -> Union[str, bytes]:
        pathname = getattr(pathlike, '__fspathname__', pathlike)
        if not isinstance(pathname, (str, bytes)):
            raise TypeError("your thing is not pathlike")
        return pathname

But maybe it is enough to have the __fspathname__ attribute, and make
fspathname() some internal implementation detail of os.path.* and the
like.
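A runnable rendering of the sketch above, with the `typing` import it needs and a few calls showing the pass-through behaviour. `DemoPath` is a hypothetical class, and `__fspathname__` is treated as a plain attribute, as in the sketch; none of this is a settled API.

```python
from typing import Union


def fspathname(pathlike) -> Union[str, bytes]:
    """Return the path name for `pathlike`; str and bytes pass through unchanged."""
    pathname = getattr(pathlike, "__fspathname__", pathlike)
    if not isinstance(pathname, (str, bytes)):
        raise TypeError("your thing is not pathlike")
    return pathname


class DemoPath:  # hypothetical path-like object
    __fspathname__ = "/tmp/demo"


from_str = fspathname("a/b")          # unchanged str
from_bytes = fspathname(b"a/b")       # unchanged bytes
from_object = fspathname(DemoPath())  # value of the attribute

try:
    fspathname(3)
    rejected = False
except TypeError:
    rejected = True
```

With this shape, callers in `os.path` keep their existing str-or-bytes behaviour and only gain support for objects that carry the attribute.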

-Koos


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Paul Moore
On 11 April 2016 at 15:46, Jon Ribbens  wrote:
> It's trying to alter
> the global Python environment so that arbitrary code can be executed,
> whereas I am not even trying to allow execution of arbitrary code and
> am not altering the global environment.

However, it's not at all clear (to me at least) what you *are* trying
to do. You're limiting the subset of Python that people can use,
understood. And you're trying to ensure that people can't do "bad
things". Again, understood. But what subset are you actually allowing,
and what things are you trying to protect against? (For example, I
can't calculate sin(1.2) using the math module - why is that not
allowed? It's just as safe as using the built in exponential
operator, and indeed I could write a sin() function in pure Python,
although it would be too slow to be useful, unlike math.sin...)
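For illustration, a pure-Python sine of the kind alluded to can be written with nothing but arithmetic. This is a sketch using a truncated Taylor series; the function name `sin_taylor` and the term count are arbitrary choices, and without range reduction it is only suitable for small arguments.

```python
def sin_taylor(x, terms=20):
    """Approximate sin(x) by summing the first `terms` terms of its Taylor series."""
    total = 0.0
    term = x  # first term: x^1 / 1!
    for n in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2n+2)(2n+3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total


result = sin_taylor(1.2)
```

It uses only multiplication, division, and addition, so it exercises nothing that the exponentiation operator does not already allow.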

It feels at the moment as if I'm playing a game where I don't know the
rules, and every time I think I scored a point, the rules are changed
to retroactively disallow it.

Paul


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 07:56 AM, Antoine Pitrou wrote:
>> 2) pathlib.Path accepts bytes --
>
> Does it? Or are you proposing such a change?

It used to (I posted a couple examples from 3.5.0).  I finally rebuilt 
with the latest and it no longer does.


--
~Ethan~


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Nikolaus Rath
On Apr 11 2016, Jon Ribbens  wrote:
>> What I see is that you asked to break your sandbox, and less than 1
>> hour later, a first vulnerability was found (exec called with two
>> parameters). A few hours later, a second vulnerability was found
>> (async generator and cr_frame).
>
> The former was just a stupid bug, it says nothing about the viability
> of the methodology. The latter was a new feature in a Python version
> later than I have ever used, and again does not imply anything much
> about the viability.

It implies that new versions of Python may break your sandbox. That
doesn't sound like a viable long-term solution.

> I think now I've blocked the names of frame
> object attributes it wouldn't be a vulnerability any more anyway.

It seems like you're playing whack-a-mole. 


Best,
-Nikolaus

-- 
GPG encrypted emails preferred. Key id: 0xD113FCAC3C4E599F
Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

 »Time flies like an arrow, fruit flies like a Banana.«


[Python-Dev] tp_new selection regression in the 2.7 branch

2016-04-11 Thread Julien Cristau
Hi,

changeset https://hg.python.org/cpython/rev/e7062dd9085e in the 2.7
branch changes how tp_new is assigned, and causes regressions with
multiple inheritance from extension classes.
http://bugs.python.org/issue25731#msg262922 has a fairly simple
reproducer using cython.  The __base__ attribute is set correctly, but
tp_new is now wrong and thus the object initialization is broken.

Can this change be fixed or reverted before the next 2.7.x release?

(I have not verified if this regression also affects the 3.5 branch)

Thanks,
Julien




Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Isaac Morland

On Mon, 11 Apr 2016, Victor Stinner wrote:
> 2016-04-10 18:43 GMT+02:00 Jon Ribbens :
>> That's the opposite of my approach though - I'm starting small and
>> adding things, not starting with everything and removing stuff. Even
>> if what we end up with is an extremely restricted subset of Python,
>> there are still cases where that could be a useful tool to have.
>
> Your design relies on the assumption that CPython is only pure Python.
> That's wrong. A *lot* of Python features are implemented in C and
> "ignore" your sandboxing code. Quick reminder: 50% of CPython is
> written in the C language.
>
> It means that your protections like hiding builtin functions from the
> Python scope don't work. If an attacker gets access to a C function
> giving access to the hidden builtin, the game is over.

[]

Non-Python core developer, non-expert-specifically-in-computer-security 
here, so won't take up much room on this list.


I know enough about almost everything in Computer Science to know just how 
ignorant I am about almost everything in Computer Science.


But I would not use for security purposes a Python sandbox that was not 
formally verified to be correct and unbreakable.  Of course in order for 
this to be possible, there first has to be a formal semantics for Python. 
Has anybody made a formal semantics for Python?  If not, then this project 
is missing a pretty important pre-requisite.


Isaac Morland   CSCF Web Guru
DC 2619, x36650 WWW Software Specialist


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Chris Angelico
On Mon, Apr 11, 2016 at 9:04 PM, Isaac Morland  wrote:
> But I would not use for security purposes a Python sandbox that was not
> formally verified to be correct and unbreakable.  Of course in order for
> this to be possible, there first has to be a formal semantics for Python.
> Has anybody made a formal semantics for Python?  If not, then this project
> is missing a pretty important pre-requisite.

Formal semantics for the language? Yes; most of docs.python.org is
about the language, independently of any particular implementation.
(There are odd notes here and there about "CPython implementation
detail" and such, and there are some entire modules that are
specifically stated as being implementation-specific, but they're a
tiny proportion.) You can also read through the PEPs, which (again,
for the most part) deal with language-level concerns ahead of
implementation details.

However, even with that information, it's virtually impossible to
formally verify that the sandbox is unbreakable. A Python-in-Python
sandbox is almost guaranteed to leak information across the boundary,
and when information is leaked, it's extremely hard to prove that
privilege escalation is impossible.

ChrisA


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/10/2016 11:27 PM, Nick Coghlan wrote:
> On 11 April 2016 at 02:16, Ethan Furman  wrote:
>>> DirEntry can still get the check, it can just throw TypeError when it
>>> represents a binary path (that's one of the advantages of using a
>>> method-based protocol - exceptions on method calls are more acceptable
>>> than exceptions on property access).
>>
>> I guess I don't see the point of this.  Either DirEntry's [1] only get
>> partial support (which is only marginally better than the no support pathlib
>> currently has), or stdlib code will need to catch those errors and then do
>> an isinstance check to see if it knows what the type is and how to deal with it
>> [1].
>
> What's wrong with only gaining partial support? Standard library code
> that doesn't currently support DirEntry at all will gain the ability
> to support str-based DirEntry objects, while bytes-based DirEntry
> objects will continue to be a low level object [...]


Let's consider two functions, one that accepts bytes/str for the path,
and one that only accepts str:



  str-only support
  ----------------
  # before new protocol
  def do_fritz(a_path):
      if not isinstance(a_path, str):
          raise TypeError('str required')
      ...

  # after new protocol with str-only support
  def do_fritz(a_path):
      a_path = fspath(a_path)
      ...

  # after new protocol with bytes/str support
  def do_fritz(a_path):
      a_path = fspath(a_path)
      if not isinstance(a_path, str):
          raise TypeError('str required')
      ...


  bytes/str support
  -----------------
  # before new protocol
  def zingar(a_path):
      if not isinstance(a_path, (bytes,str)):
          raise TypeError('bytes or str required')
      ...

  # after new protocol with str-only support
  def zingar(a_path):
      if not isinstance(a_path, bytes):
          try:
              a_path = fspath(a_path)
          except FSPathError:
              raise TypeError('bytes or str required')
      ...

  # after new protocol with bytes/str support
  def zingar(a_path):
      a_path = fspath(a_path)
      if not isinstance(a_path, (bytes,str)):
          raise TypeError('bytes or str required')
      ...


If those examples are anywhere close to accurate, an fspath protocol 
that supported both bytes and str seems a lot easier to work with.


--
~Ethan~


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Anders Munch
Steven D'Aprano:
> although I must admit I don't understand why the entire OS is effected.

A consequence of memory overcommit, I'd wager.  The crasher code not only 
allocates vast swathes of memory, but accesses it as well, which is bad news 
for Linux with overcommit enabled. When the OS runs out of backing store to 
handle page faults, anything can happen.
 
- Anders



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Zachary Ware
On Mon, Apr 11, 2016 at 11:18 AM, Ethan Furman  wrote:
> If those examples are anywhere close to accurate, an fspath protocol that
> supported both bytes and str seems a lot easier to work with.

But why are you working with bytes paths in the first place? Where did
you get them from, and why couldn't you decode them at that boundary?
In 7ish years of working with Python (almost exclusively Python 3) on
Windows and UNIX, I have never used bytes paths on any platform.

-- 
Zach


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Jon Ribbens
On Mon, Apr 11, 2016 at 08:35:11AM -0700, Nikolaus Rath wrote:
> On Apr 11 2016, Jon Ribbens  wrote:
> >> What I see is that you asked to break your sandbox, and less than 1
> >> hour later, a first vulnerability was found (exec called with two
> >> parameters). A few hours later, a second vulnerability was found
> >> (async generator and cr_frame).
> >
> > The former was just a stupid bug, it says nothing about the viability
> > of the methodology. The latter was a new feature in a Python version
> > later than I have ever used, and again does not imply anything much
> > about the viability.
> 
> It implies that new versions of Python may break your sandbox. That
> doesn't sound like a viable long-term solution.

That is obviously always going to be true of major new versions with
major new features, no matter what language we're talking about or
what method is being used to sandbox - unless the sandboxing were to
be built in to the language itself, which I have deliberately not
suggested.

But having said that, I already pointed out in the message you're
responding to that with the method I'm using now, coroutines would
not have been an issue even if I hadn't specifically fixed them.

> > I think now I've blocked the names of frame
> > object attributes it wouldn't be a vulnerability any more anyway.
> 
> It seems like you're playing whack-a-mole. 

Well, no, quite the opposite in fact. If that was true then I would
have given up already as the method having been proved useless.
So far it looks like blocking "_*" and the frame object attributes
appears to be sufficient.
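
A rough sketch of what "blocking `_*` and the frame object attributes" could look like at the AST level. This is not the sandbox code under discussion in this thread; `check_source`, `BLOCKED_ATTRS`, and the contents of that set are assumptions made up for illustration, and the attribute list is certainly incomplete.

```python
import ast

# Hypothetical, incomplete list of frame-related attributes to refuse.
BLOCKED_ATTRS = {"cr_frame", "gi_frame", "tb_frame", "f_back",
                 "f_globals", "f_locals", "f_builtins", "f_code"}


def _is_blocked(name):
    # Anything starting with an underscore covers dunder and private names.
    return name.startswith("_") or name in BLOCKED_ATTRS


def check_source(source):
    """Raise ValueError if the source mentions a blocked name or attribute."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Attribute) and _is_blocked(node.attr):
            raise ValueError("blocked attribute: " + node.attr)
        if isinstance(node, ast.Name) and _is_blocked(node.id):
            raise ValueError("blocked name: " + node.id)
```

A check of this kind only sees names written literally in the source; it does nothing about `getattr(obj, "__class__")`, so the set of available builtins would have to be restricted separately.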


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Jon Ribbens
On Mon, Apr 11, 2016 at 04:04:21PM +0100, Paul Moore wrote:
> However, it's not at all clear (to me at least) what you *are* trying
> to do.

I'm trying to see to what extent we can use ast node inspection to
remedy the failures of prior attempts at Python sandboxing. Is there
*any* extent to which Python can be sandboxed, or is even trying to
use it as a calculator function unfixably insecure?

> You're limiting the subset of Python that people can use,
> understood. And you're trying to ensure that people can't do "bad
> things". Again, understood. But what subset are you actually allowing,
> and what things are you trying to protect against? (For example, I
> can't calculate sin(1.2) using the math module - why is that not
> alllowed?

It wasn't allowed in the earlier version because I wasn't allowing
import at all, because this is just an experiment. As it happens,
I added 'import' yesterday so yes you can use math.sin.

> It feels at the moment as if I'm playing a game where I don't know the
> rules, and every time I think I scored a point, the rules are changed
> to retroactively disallow it.

The challenge is to show some code that will escape from the sandbox,
in a way that is not trivially fixable with a tiny patch, or in a way
that demonstrates that such a large number of tiny patches would be
required as to be unworkable.


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Chris Angelico
On Tue, Apr 12, 2016 at 2:53 AM, Jon Ribbens
 wrote:
> On Mon, Apr 11, 2016 at 04:04:21PM +0100, Paul Moore wrote:
>> However, it's not at all clear (to me at least) what you *are* trying
>> to do.
>
> I'm trying to see to what extent we can use ast node inspection to
> remedy the failures of prior attempts at Python sandboxing. Is there
> *any* extent to which Python can be sandboxed, or is even trying to
> use it as a calculator function unfixably insecure?
>

It all depends on how much functionality you want. If all you need is
a numeric expression evaluator, that's not too hard - disallow all
forms of attribute access, etc, and just have simple numbers and
operators. That's pretty useful, and safe. Alternatively, go
completely the other way. Let people run whatever code they like... in
an environment where it can't hurt anyone else. That's what PyPyJS
does - don't bother looking for security holes in it, because all
you're doing is attacking your own computer.

The hard part comes when you want to allow *some*, but not all,
interaction with the outside world. When I was looking into this kind
of sandboxing (although it was Python-in-C++ rather than
Python-in-Python), it was to allow untrusted users to control certain
parts of server-side execution. The result was dismal, because it's
fundamentally impossible to allow the level of control I wanted
without allowing a level of control I didn't want.

So before you can ask whether Python is unfixably insecure, you first
have to decide what the minimum level of functionality is that you'll
accept. Do you need basic arithmetic plus trigonometric functions? Easy
enough - disallow all attribute access and imports, and populate
builtins with "from math import *". Need them to be able to assign
variables and define functions? That's gonna be harder.
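
As a minimal sketch of the "simple numbers and operators" end of that spectrum, assuming Python 3.8 or later (where literals parse to `ast.Constant`): the name `safe_eval` and the operator tables are illustrative, not taken from any sandbox mentioned in this thread. Exponentiation is left out on purpose, since something like `9**9**9` can exhaust CPU and memory.

```python
import ast
import operator

_BIN_OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
            ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expression):
    """Evaluate +, -, *, / over int/float literals; reject everything else."""
    def _eval(node):
        # type() rather than isinstance() so that True/False are rejected.
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, subscripts, etc. all end up here.
        raise ValueError("unsupported syntax: " + type(node).__name__)

    return _eval(ast.parse(expression, mode="eval").body)
```

Because the evaluator walks the tree itself and never calls `eval()` or `exec()`, anything not explicitly whitelisted is refused rather than blacklisted after the fact.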

ChrisA


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 09:32 AM, Zachary Ware wrote:

On Mon, Apr 11, 2016 at 11:18 AM, Ethan Furman wrote:



If those examples are anywhere close to accurate, an fspath protocol that
supported both bytes and str seems a lot easier to work with.


But why are you working with bytes paths in the first place? Where did
you get them from, and why couldn't you decode them at that boundary?
In 7ish years of working with Python (almost exclusively Python 3) on
Windows and UNIX, I have never used bytes paths on any platform.


I'm not saying that bytes paths are common -- and if this was a 
brand-new feature I wouldn't be pushing for it so hard;  however, bytes 
paths are already supported and it seems to me to be much less of a 
headache to continue the support in this new protocol instead of drawing 
an artificial line in the sand.


Also, let me be clear that the new protocol will not adversely affect my 
own library, as it directly subclasses bytes and strings (bPath and 
uPath), so they will pass through either way (or be appropriately 
rejected if the function only supports str -- are there any?).


This kind of feels like PEP 461 again -- the vast majority of Python 
programmers do not need %-interpolation for bytes, but what a pain in 
the rear for those that did!  (Yes, I was one of those.)  Admittedly, 
the pain from this will not be nearly as severe as that was, but why 
should we have any unnecessary pain at all?


Asked another way, what are we gaining by disallowing bytes in this new 
way of getting paths versus the pain caused when bytes are needed and/or 
accepted?


From my point of view the pain of simply implementing this without 
bytes support in the existing os and os.path modules is not worth 
excluding bytes.


--
~Ethan~


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Donald Stufft

> On Apr 11, 2016, at 1:12 PM, Ethan Furman  wrote:
> 
> Asked another way, what are we gaining by disallowing bytes in this new way 
> of getting paths versus the pain caused when bytes are needed and/or accepted?


It seems fine to me to allow __fspath__ to return bytes as well as str. The 
only argument I can think against it is that something like pathlib.Path() 
would not work with a bytes returning __fspath__, but that’s not any different 
than what happens if you pass a bytes object directly into pathlib.Path as well.
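
A small sketch of that comparison, under the assumption that it runs on Python 3.6 or later, where pathlib rejects bytes outright. `BytesReturning` and `accepts` are hypothetical names introduced only for this illustration.

```python
import pathlib


class BytesReturning:
    """Hypothetical path-like object whose __fspath__ returns bytes."""

    def __fspath__(self):
        return b"/tmp/example"


def accepts(value):
    """Report whether pathlib is willing to build a path from value."""
    try:
        pathlib.PurePosixPath(value)
    except TypeError:
        return False
    return True


print(accepts("/tmp/example"))     # a plain str is fine
print(accepts(b"/tmp/example"))    # bytes passed in directly
print(accepts(BytesReturning()))   # bytes arriving via __fspath__
```

On the interpreters this assumes, the second and third calls are refused in the same way, which is the equivalence described above.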

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA





Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Brett Cannon
On Mon, 11 Apr 2016 at 10:13 Ethan Furman  wrote:

> On 04/11/2016 09:32 AM, Zachary Ware wrote:
> > On Mon, Apr 11, 2016 at 11:18 AM, Ethan Furman wrote:
>
> >> If those examples are anywhere close to accurate, an fspath protocol
> that
> >> supported both bytes and str seems a lot easier to work with.
> >
> > But why are you working with bytes paths in the first place? Where did
> > you get them from, and why couldn't you decode them at that boundary?
> > In 7ish years of working with Python (almost exclusively Python 3) on
> > Windows and UNIX, I have never used bytes paths on any platform.
>
> I'm not saying that bytes paths are common -- and if this was a
> brand-new feature I wouldn't be pushing for it so hard;  however, bytes
> paths are already supported and it seems to me to be much less of a
> headache to continue the support in this new protocol instead of drawing
> an artificial line in the sand.
>

Headache for you? The stdlib? Library authors? Users of libraries? There
are a lot of users of this who have varying levels of pain for this.


>
> Also, let me be clear that the new protocol will not adversely affect my
> own library, as it directly subclasses bytes and strings (bPath and
> uPath), so they will pass through either way (or be appropriately
> rejected if the function only supports str -- are there any?) .
>

Well, technically it depends on whether we prefer the protocol or explicit
type checking and how we define the protocol. If we say __ospath__ has to
return str and we check for that first then that would be bad for you. If
we do isinstance() checks before calling the protocol or allow both str and
bytes then we open it up.


>
> This kind of feels like PEP 461 again -- the vast majority of Python
> programmers do not need %-interpolation for bytes, but what a pain in
> the rear for those that did!  (Yes, I was one of those.)  Admittedly,
> the pain from this will not be nearly as severe as that was, but why
> should we have any unnecessary pain at all?
>
> Asked another way, what are we gaining by disallowing bytes in this new
> way of getting paths versus the pain caused when bytes are needed and/or
> accepted?
>

Type consistency. E.g. if I pass in a DirEntry object into os.fspath() and
I don't know what the heck I'm getting back then that can lead to subtle
bugs, especially when you didn't check ahead of time what DirEntry.path
was. To me, that bumps up against "In the face of ambiguity, refuse the
temptation to guess". Having the return type vary even when the argument type
doesn't can get messy if you don't expect it to vary (i.e. this isn't getattr()).
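
To make the DirEntry point concrete, here is a short sketch of the existing `os.scandir()` behaviour: the type of `DirEntry.path` follows the type of the directory argument. The helper `first_entry_path` is a name invented for this example, and the `with os.scandir(...)` form assumes Python 3.6 or later.

```python
import os
import tempfile


def first_entry_path(directory):
    """Return DirEntry.path for the first entry found in directory."""
    with os.scandir(directory) as entries:
        return next(entries).path


with tempfile.TemporaryDirectory() as tmp:
    # Create one file so the directory is not empty.
    with open(os.path.join(tmp, "example.txt"), "w"):
        pass
    str_path = first_entry_path(tmp)                 # str in, str out
    bytes_path = first_entry_path(os.fsencode(tmp))  # bytes in, bytes out

print(type(str_path).__name__, type(bytes_path).__name__)
```

Code that receives a DirEntry from elsewhere cannot tell which of the two it will get without inspecting `.path`, which is the ambiguity being objected to.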


>
>  From my point of view the pain of simply implementing this without
> bytes support in the existing os and os.path modules is not worth
> excluding bytes.
>

How about we take something from the "explicit is better than implicit"
playbook and add a keyword argument to os.fspath() to allow bytes to pass
through?

  def fspath(path, *, allow_bytes=False):
      if isinstance(path, str):
          return path
      # Allow bytearray?
      elif allow_bytes and isinstance(path, bytes):
          return path
      try:
          protocol_path = path.__fspath__()
      except AttributeError:
          pass
      else:
          # Explicit type check worth it, or better to rely on duck typing?
          if isinstance(protocol_path, str):
              return protocol_path
      raise TypeError("expected a path-like object, str, or bytes (if "
                      f"allowed), not {type(path)}")

For DirEntry users who use bytes, they will simply have to pass around
DirEntry.path which is not as nice as simply passing around DirEntry, but
it does allow them to continue to operate without having to decode the
bytes if allow_bytes is True. We get type consistency in the protocol, as
we can continue to expect people to return strings from __fspath__. And
those APIs where supporting bytes won't be an issue can explicitly
choose whether to support bytes, and need not juggle support for both
str and bytes if they choose not to. IOW consenting adults who want bytes
paths don't get cut out or face a ton of hoops to jump through, as long as
they opt in, while those adults who don't consent to bytes paths have their
lives simplified.


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Antoine Pitrou
Ethan Furman writes:
> 
> On 04/11/2016 07:56 AM, Antoine Pitrou wrote:
> 
> >> 2) pathlib.Path accepts bytes --
> >
> > Does it? Or are you proposing such a change?
> 
> It used to (I posted a couple examples from 3.5.0).  I finally rebuilt 
> with the latest and it no longer does.

This is surprising, since in its entire lifetime, pathlib was never
supposed to support bytes inputs. See the argument check in the
initial checkin of pathlib.py:
https://hg.python.org/cpython/rev/43377dcfb801/#l6.571

Perhaps that slipped through at some point (and obviously no test was
there to prevent it :-)).

Regards

Antoine.




Re: [Python-Dev] PEP 506 secrets module

2016-04-11 Thread Steven D'Aprano
On Sun, Apr 10, 2016 at 11:43:08AM -0700, Guido van Rossum wrote:
> Hi Steven,
> 
> No problem with the delay -- it's still before 3.6.0. I do think it's
> just about a record gap in the PEP review process. :-)
> 
> I will approve the PEP as soon as you've updated the two function
> names in the PEP. (If you don't have write access to the peps repo,
> send the new version to p...@python.org -- or send a link to the new
> draft somewhere online, e.g. github if you're using that. If you do
> have peps repo write access, just reply here when it's done.)

I have done that, and updated the API and Implementation section to be 
less wishy-washy and more committal about what exactly will be included. 
Hope it meets with your approval, and thanks for your guidance!


-- 
Steve


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Random832
On Mon, Apr 11, 2016, at 13:36, Brett Cannon wrote:
> How about we take something from the "explicit is better than implicit"
> playbook and add a keyword argument to os.fspath() to allow bytes to pass
> through?

Except, we already know how to convert a bytes-path into a str (and vice
versa) with sys.getfilesystemencoding and surrogateescape. So why not
just have the argument specify what return type is desired?

def fspath(path, *, want_bytes=False):
    if isinstance(path, (bytes, str)):
        ppath = path
    else:
        try:
            ppath = path.__fspath__()
        except AttributeError:
            raise TypeError
    if isinstance(ppath, str):
        return ppath.encode(...) if want_bytes else ppath
    elif isinstance(ppath, bytes):
        return ppath if want_bytes else ppath.decode(...)
    else:
        raise TypeError

This way the posix os module can call the function and have the bytes
value already prepared for it to pass to the real open() syscall.

You could even add the same thing in other places, e.g. os.path.join
(defaulting to whether the first argument is bytes).
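
One way the elided encode/decode steps could be filled in is with the existing `os.fsencode()` and `os.fsdecode()` helpers, which apply the filesystem encoding and its error handler. The sketch below is an assumption about how that might look, not part of the proposal itself; the error messages are invented for illustration.

```python
import os


def fspath(path, *, want_bytes=False):
    """Return path as str, or as bytes when want_bytes is true."""
    if isinstance(path, (bytes, str)):
        ppath = path
    else:
        try:
            ppath = path.__fspath__()
        except AttributeError:
            raise TypeError("expected str, bytes, or an object "
                            "with __fspath__") from None
    if isinstance(ppath, str):
        # os.fsencode() uses the filesystem encoding and error handler.
        return os.fsencode(ppath) if want_bytes else ppath
    if isinstance(ppath, bytes):
        return ppath if want_bytes else os.fsdecode(ppath)
    raise TypeError("__fspath__ returned neither str nor bytes")
```

With this shape the caller always knows the return type in advance, whichever type the path object holds internally.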


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 10:36 AM, Brett Cannon wrote:

On Mon, 11 Apr 2016 at 10:13 Ethan Furman wrote:



I'm not saying that bytes paths are common -- and if this was a
brand-new feature I wouldn't be pushing for it so hard;  however, bytes
paths are already supported and it seems to me to be much less of a
headache to continue the support in this new protocol instead of drawing
an artificial line in the sand.


Headache for you? The stdlib? Library authors? Users of libraries? There
are a lot of users of this who have varying levels of pain for this.


Yes, yes, maybe, maybe.  :)


Asked another way, what are we gaining by disallowing bytes in this new
way of getting paths versus the pain caused when bytes are needed and/or
accepted?


Type consistency. E.g. if I pass in a DirEntry object into os.fspath()
and I don't know what the heck I'm getting back then that can lead to
subtle bugs [...]



How about we take something from the "explicit is better than implicit"
playbook and add a keyword argument to os.fspath() to allow bytes to
pass through?

   def fspath(path, *, allow_bytes=False):
       if isinstance(path, str):
           return path
       # Allow bytearray?
       elif allow_bytes and isinstance(path, bytes):
           return path
       try:
           protocol_path = path.__fspath__()
       except AttributeError:
           pass
       else:
           # Explicit type check worth it, or better to rely on duck typing?
           if isinstance(protocol_path, str):
               return protocol_path
       raise TypeError("expected a path-like object, str, or bytes (if "
                       f"allowed), not {type(path)}")


I think that might work.  We currently have four path related things: 
bytes, str, Path, DirEntry -- two are str-only, one is bytes-only, and 
one can be either.


I would write the above as:

  def fspath(path, *, allow_bytes=False):
      try:
          path = path.__fspath__()
      except AttributeError:
          pass
      if isinstance(path, str):
          return path
      elif allow_bytes and isinstance(path, bytes):
          return path
      else:
          raise SomeError()


For DirEntry users who use bytes, they will simply have to pass around
DirEntry.path which is not as nice as simply passing around DirEntry,


If we go with the above we allow DirEntry.__fspath__ to return bytes and 
still get type-consistency of str unless the user explicitly declares 
they're okay with getting either (and even then the field is narrowed 
from four possible source types (or more as time goes on) to two).


To recap, this would allow both str & bytes in __fspath__, but the 
fspath() function defaults to only allowing str through.
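
A runnable sketch of that recap, with a few assumptions spelled out: `DemoPath` is a hypothetical stand-in for any path object (it is not DirEntry or pathlib.Path), and TypeError stands in for the unspecified SomeError.

```python
class DemoPath:
    """Hypothetical path object; may wrap either a str or a bytes path."""

    def __init__(self, raw):
        self._raw = raw

    def __fspath__(self):
        return self._raw


def fspath(path, *, allow_bytes=False):
    try:
        path = path.__fspath__()
    except AttributeError:
        pass
    if isinstance(path, str):
        return path
    if allow_bytes and isinstance(path, bytes):
        return path
    # TypeError is an assumption; the message above leaves the error open.
    raise TypeError("expected str" + (" or bytes" if allow_bytes else ""))


print(fspath(DemoPath("spam.txt")))                     # str passes by default
print(fspath(DemoPath(b"spam.txt"), allow_bytes=True))  # bytes only on opt-in
```

Calling `fspath(DemoPath(b"spam.txt"))` without the flag raises, so str-only callers never see bytes unless they ask for them.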


I can live with that.

--
~Ethan~


Re: [Python-Dev] [Python-checkins] cpython (2.7): Issue #25910: Fixed more links in the docs.

2016-04-11 Thread Serhiy Storchaka

On 11.04.16 17:41, Tim Golden wrote:

On 11/04/2016 15:38, serhiy.storchaka wrote:

-  `__.
+  `__.


Is there any intended irony in our link to openssl not being via https?

:)


http://bugs.python.org/issue26736




Re: [Python-Dev] PEP 506 secrets module

2016-04-11 Thread Guido van Rossum
Most excellent! PEP 506 is hereby approved. Congrats again.

On Mon, Apr 11, 2016 at 10:50 AM, Steven D'Aprano  wrote:
> On Sun, Apr 10, 2016 at 11:43:08AM -0700, Guido van Rossum wrote:
>> Hi Steven,
>>
>> No problem with the delay -- it's still before 3.6.0. I do think it's
>> just about a record gap in the PEP review process. :-)
>>
>> I will approve the PEP as soon as you've updated the two function
>> names in the PEP. (If you don't have write access to the peps repo,
>> send the new version to p...@python.org -- or send a link to the new
>> draft somewhere online, e.g. github if you're using that. If you do
>> have peps repo write access, just reply here when it's done.)
>
> I have done that, and updated the API and Implementation section to be
> less wishy-washy and more committal about what exactly will be included.
> Hope it meets with your approval, and thanks for your guidance!
>
>
> --
> Steve
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/guido%40python.org



-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 506 secrets module

2016-04-11 Thread Ethan Furman

On 04/11/2016 11:35 AM, Guido van Rossum wrote:


Most excellent! PEP 506 is hereby approved. Congrats again.


Congratulations, Steven!

--
~Ethan~



Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Brett Cannon
On Mon, 11 Apr 2016 at 11:28 Ethan Furman  wrote:

> On 04/11/2016 10:36 AM, Brett Cannon wrote:
> > On Mon, 11 Apr 2016 at 10:13 Ethan Furman wrote:
>
> >> I'm not saying that bytes paths are common -- and if this was a
> >> brand-new feature I wouldn't be pushing for it so hard;  however, bytes
> >> paths are already supported and it seems to me to be much less of a
> >> headache to continue the support in this new protocol instead of drawing
> >> an artificial line in the sand.
> >
> > Headache for you? The stdlib? Library authors? Users of libraries? There
> > are a lot of users of this who have varying levels of pain for this.
>
> Yes, yes, maybe, maybe.  :)
>
> >> Asked another way, what are we gaining by disallowing bytes in this new
> >> way of getting paths versus the pain caused when bytes are needed and/or
> >> accepted?
> >
> > Type consistency. E.g. if I pass in a DirEntry object into os.fspath()
> > and I don't know what the heck I'm getting back then that can lead to
> > subtle bugs [...]
>
> > How about we take something from the "explicit is better than implicit"
> > playbook and add a keyword argument to os.fspath() to allow bytes to
> > pass through?
> >
> >   def fspath(path, *, allow_bytes=False):
> >       if isinstance(path, str):
> >           return path
> >       # Allow bytearray?
> >       elif allow_bytes and isinstance(path, bytes):
> >           return path
> >       try:
> >           protocol_path = path.__fspath__()
> >       except AttributeError:
> >           pass
> >       else:
> >           # Explicit type check worth it, or better to rely on
> >           # duck typing?
> >           if isinstance(protocol_path, str):
> >               return protocol_path
> >       raise TypeError("expected a path-like object, str, or bytes "
> >                       f"(if allowed), not {type(path)}")
>
> I think that might work.  We currently have four path related things:
> bytes, str, Path, DirEntry -- two are str-only, one is bytes-only, and
> one can be either.
>
> I would write the above as:
>
>   def fspath(path, *, allow_bytes=False):
>       try:
>           path = path.__fspath__()
>       except AttributeError:
>           pass
>       if isinstance(path, str):
>           return path
>       elif allow_bytes and isinstance(path, bytes):
>           return path
>       else:
>           raise SomeError()
>
> > For DirEntry users who use bytes, they will simply have to pass around
> > DirEntry.path which is not as nice as simply passing around DirEntry,
>
> If we go with the above we allow DirEntry.__fspath__ to return bytes and
> still get type-consistency of str unless the user explicitly declares
> they're okay with getting either (and even then the field is narrowed
> from four possible source types (or more as time goes on) to two).
>

You get type consistency from os.fspath(), not the protocol, though.


>
> To recap, this would allow both str & bytes in __fspath__, but the
> fspath() function defaults to only allowing str through.
>
> I can live with that.
>

I'm -0 on allowing __fspath__ to return bytes, but we can see what others
think.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 12:00 PM, Brett Cannon wrote:

On Mon, 11 Apr 2016 at 11:28 Ethan Furman wrote:



I would write the above as:

def fspath(path, *, allow_bytes=False):


You get type consistency from os.fspath(), not the protocol, though.


Well, since the protocol is also a function, we could put the 
allow_bytes on that as well -- not sure if that is a good idea or not.


--
~Ethan~



[Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Alexander Walters
In reviewing the ongoing arguments about how to make pathlib better, 
there have been circular arguments about if it is even broken, if it 
should support bytes, if there should be a path protocol that all 
functions that touch the filesystem should use, if that protocol should 
support bytes, how that protocol should be open or closed to allow third 
party modules to act as paths, etc., etc.


If there is headway being made, I do not see it.

I don't think we can come to an agreement that will make anyone happy, 
or have any effect on the adoption of the pathlib module in the standard 
library.  Maybe, just maybe, since there is an ecosystem of third party 
modules already doing this job (and arguably doing it much better than 
pathlib, and for more supported versions of python than any future 
version of pathlib will), it should be dropped from the standard library 
and left on pypi as a third party module.



Re: [Python-Dev] pathlib+os/shutil feedback

2016-04-11 Thread Sven R. Kunze

On 10.04.2016 16:51, Paul Moore wrote:

On 10 April 2016 at 15:07, Sven R. Kunze  wrote:

If there's some agreement to change things with respect to those 5 points, I
am willing to put some time into it.

In broad terms I agree with these points. Thanks for doing the
research. It would certainly be good to try to improve pathlib based
on this sort of feedback while it is still provisional.


I'd appreciate some guidance on this. Just let me know what I can do 
since I don't know the processes of hacking CPython.



"""
Path.rglob(pattern)
Walk down a given path; a wrapper for "os.scandir"/"os.listdir".
"""

However, at least in 3.5, Path.rglob does *not* wrap scandir. There's
a difference in principle, in that scandir (DirEntry) objects cache
stat data, where pathlib does not. Whether that makes using scandir in
Path.rglob impossible, I don't know. Ideally I'd like to see pathlib
modified to use scandir (because otherwise there will always be people
saying "use os.walk rather than scandir, as it's faster") - or if it's
not possible to do so because of the difference in principle, then I'd
like to see a clear discussion of the issue in the docs, including the
recommended approach for people who want scandir performance *without*
having to abandon pathlib for lower level functions.


Good point. The proposed docstring was just to illustrate the 
functionality to the uninformed reader. People mostly trust the docs 
without digging deeper but they should be accurate of course.
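
For reference, a recursive walk built directly on `os.scandir()` can be sketched as below. `iter_files` is a hypothetical helper, not a pathlib or os API, and the `with os.scandir(...)` form assumes Python 3.6 or later. The `is_dir()` and `is_file()` calls on a DirEntry can usually be answered from information gathered during the directory scan, which is where the speed difference mentioned above comes from.

```python
import os


def iter_files(top):
    """Yield the paths of all regular files below top, recursively."""
    with os.scandir(top) as entries:
        for entry in entries:
            # follow_symlinks=False avoids looping through symlinked dirs.
            if entry.is_dir(follow_symlinks=False):
                yield from iter_files(entry.path)
            elif entry.is_file(follow_symlinks=False):
                yield entry.path
```

This yields plain strings rather than Path objects, so a pathlib user would still need to wrap each result in `pathlib.Path()`.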



Best,
Sven


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread marky1991 .
Neverending email chains aside, as a mere user, I like pathlib even as it
is today and like the convenience of it being in the stdlib. (And would
like it even more if the stdlib played nicely with it) I would be
disappointed if it were taken out. (It's one of the few recent additions
that I find useful to be honest)


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Victor Stinner
2016-04-11 21:00 GMT+02:00 Brett Cannon :
> I'm -0 on allowing __fspath__ to return bytes, but we can see what others
> think.

With PEP 383, a bytes filename can be stored as str using the
surrogateescape error handler. So DirEntry can convert a bytes path to
str using os.fsdecode().
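
A short sketch of the surrogateescape round trip. It spells out the codec explicitly so the result does not depend on the platform's filesystem encoding; `to_text` and `to_bytes` are illustrative names, and os.fsdecode()/os.fsencode() do the equivalent with the filesystem encoding on POSIX systems.

```python
def to_text(raw, encoding="utf-8"):
    """Decode a bytes filename, smuggling undecodable bytes as surrogates."""
    return raw.decode(encoding, "surrogateescape")


def to_bytes(text, encoding="utf-8"):
    """Reverse to_text(), restoring the original bytes exactly."""
    return text.encode(encoding, "surrogateescape")


raw = b"caf\xe9.txt"   # Latin-1 encoded name, not valid UTF-8
text = to_text(raw)    # the stray 0xE9 byte becomes the lone surrogate U+DCE9
assert to_bytes(text) == raw
```

The resulting str cannot be encoded with the default strict error handler, so it still needs care when printed or written to a file.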

A "byte string" is unclear in Python. There is the immutable "bytes"
type. But there is also the mutable "bytearray" type. And the buffer
protocol which can have different shapes.

I like the idea of a simple protocol: only allow a single type, str.

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Sven R. Kunze

On 11.04.2016 22:33, Alexander Walters wrote:

If there is headway being made, I do not see it.


Funny that you brought it up. I was about to post something myself. I 
cannot agree completely. But starting with a comment from Paul, I 
realized that pathlib is something different than a string. After doing 
the research into our issues with pathlib, I found:



- pathlib just needs to be improved (see my 5 points)
- os[.path] should not be tinkered with


I know that all of those discussions of a new protocol (path->str, 
__fspath__ etc. etc.) might be rendered worthless by these two 
statements. But that's my conclusion.


"os" and "os.path" are just lower level. "pathlib" is a high-level, 
convenience library. When using it, I don't want to use "os" or 
"os.path" anymore. If I still do, "pathlib" needs improving. *Not "os" 
nor "os.path"*.



Best,
Sven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Ethan Furman

On 04/11/2016 01:33 PM, Alexander Walters wrote:


In reviewing the ongoing arguments about how to make pathlib better,
there have been circular arguments about if it is even broken, if it
should support bytes, if there should be a path protocol that all
functions that touch the filesystem should use, if that protocol should
support bytes, how that protocol should be open or closed to allow third
party modules to act as paths, etc., etc.


Do not take lots of discussion as a negative.  It's better to thrash it 
out thoroughly first.



If there is headway being made, I do not see it.


It's being made, and I dare say we are close to the end.

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Alexander Walters

If I had my druthers, this thread would be kept to either:

"Shut up, Alex, we are really close to figuring this out"

or

"Ok, maybe you have a point."

Every conceivable way to fix pathlib has already been argued.  Are any 
of them worth doing?  Can we get consensus enough to implement one of 
them?  If not, we should consider either dropping the matter or dropping 
the module.



On 4/11/2016 16:48, Sven R. Kunze wrote:

On 11.04.2016 22:33, Alexander Walters wrote:

If there is headway being made, I do not see it.


Funny that you brought it up. I was about to post something myself. I 
cannot agree completely. But starting with a comment from Paul, I 
realized that pathlib is something different than a string. After 
doing the research into our issues with pathlib, I found:



- pathlib just needs to be improved (see my 5 points)
- os[.path] should not be tinkered with


I know that all of those discussions of a new protocol (path->str, 
__fspath__ etc. etc.) might be rendered worthless by these two 
statements. But that's my conclusion.


"os" and "os.path" are just lower level. "pathlib" is a high-level, 
convenience library. When using it, I don't want to use "os" or 
"os.path" anymore. If I still do, "pathlib" needs improving. *Not "os" 
nor "os.path"*.



Best,
Sven


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Alexander Walters

That is great news.  I just couldn't see it myself in the threads

On 4/11/2016 16:51, Ethan Furman wrote:

If there is headway being made, I do not see it.


It's being made, and I dare say we are close to the end. 


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 01:42 PM, Victor Stinner wrote:

2016-04-11 21:00 GMT+02:00 Brett Cannon:



I'm -0 on allowing __fspath__ to return bytes, but we can see what others
think.


With the PEP 383, a bytes filename can be stored as str using the
surrogateescape error handler. So DirEntry can convert a bytes path to
str using os.fsdecode().


I am far from a unicode expert, but if I understand this correctly you 
are proposing that DirEntry.__whatever__ can always return a str using 
the surrogateescape (SE) method.


However, before this SE string can be used, it would need to be 
converted back to bytes, and with the same SE method, yes?  And this has 
already been implemented in the stdlib?


So my concern in such a case is what happens if we pass this SE string 
somewhere else: a UTF-8 file, or over a socket, or into a database? 
Does this have issues that we wouldn't face if we just used bytes?


--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Random832
On Mon, Apr 11, 2016, at 16:48, Sven R. Kunze wrote:
> On 11.04.2016 22:33, Alexander Walters wrote:
> > If there is headway being made, I do not see it.
> 
> Funny that you brought it up. I was about to post something myself. I 
> cannot agree completely. But starting with a comment from Paul, I 
> realized that pathlib is something different than a string. After doing 
> the research into our issues with pathlib, I found:
> 
> 
> - pathlib just needs to be improved (see my 5 points)
> - os[.path] should not be tinkered with

I'm not so sure. Is there any particular reason os.path.join should
require its arguments to be homogenous, rather than allowing
os.path.join('a', b'b', Path('c')) to return 'a/b/c'?
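
For reference, a short sketch of how os.path.join treats mixed argument
types in Python 3 today: homogeneous str or bytes arguments are accepted,
while mixing the two is rejected with a TypeError. Only str and bytes are
shown, and the component names are arbitrary.

```python
import os.path

# Mixing str and bytes components is rejected rather than coerced.
try:
    os.path.join("a", b"b")
except TypeError as exc:
    mixed_error = exc
    print("rejected:", exc)
else:
    mixed_error = None

# Homogeneous arguments work, and the result has the same type.
str_result = os.path.join("a", "b")
bytes_result = os.path.join(b"a", b"b")
print(type(str_result).__name__, type(bytes_result).__name__)
```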

> I know that all of those discussions of a new protocol (path->str, 
> __fspath__ etc. etc.) might be rendered worthless by these two 
> statements. But that's my conclusion.
> 
> "os" and "os.path" are just lower level. "pathlib" is a high-level, 
> convenience library. When using it, I don't want to use "os" or 
> "os.path" anymore. If I still do, "pathlib" needs improving. *Not "os" 
> nor "os.path"*.

The problem isn't you using os. It's you using other modules that use
os. or io, shutil, or builtins.open. Or pathlib, if what *you're* using
is some other path library. Are you content living in a walled garden
where there is only your code and pathlib, and you never might want to
pass a Path to some function someone else (who didn't use pathlib)
wrote?

os is being used as an example because fixing os probably gets you most
other things (that just pass it through to builtins.open which passes it
through to os.open) for free.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Sven R. Kunze

On 11.04.2016 22:55, Alexander Walters wrote:
Every conceivable way to fix pathlib has already been argued. Are any 
of them worth doing?  Can we get consensus enough to implement one of 
them?  If not, we should consider either dropping the matter or 
dropping the module.


Right now, I don't see pathlib removed. Why? Because using strings alone 
has its caveats (we all know that). So, I cannot imagine an alternative 
concept to pathlib right now. We might call it differently, but the 
concept stays unchanged.


MAYBE, if there's an alternative concept, I could be convinced to 
support dropping the module.


Best,
Sven

PS: The only way out that I can imagine is to fix pathlib. I am not in 
favor of fixing functions of "os" and "os.path" to accept "path" 
objects, which is what the majority here is discussing now with the new 
__fspath__ protocol. But shaping what we have is definitely worth it.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Random832
On Mon, Apr 11, 2016, at 17:04, Sven R. Kunze wrote:
> PS: The only way out that I can imagine is to fix pathlib. I am not in 
> favor of fixing functions of "os" and "os.path" to accept "path" 
> objects;

Why not?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Ethan Furman

On 04/11/2016 02:04 PM, Sven R. Kunze wrote:

On 11.04.2016 22:55, Alexander Walters wrote:



Every conceivable way to fix pathlib has already been argued. Are any
of them worth doing?  Can we get consensus enough to implement one of
them?  If not, we should consider either dropping the matter or
dropping the module.


Right now, I don't see pathlib removed. Why? Because using strings alone
has its caveats (we all know that). So, I cannot imagine an alternative
concept to pathlib right now. We might call it differently, but the
concept stays unchanged.


We've pretty much decided that we have two options:

1. remove pathlib
2. make the stdlib work with pathlib

So we're trying to make option 2 work before falling back to option 1.

If you have a way to make pathlib work with the stdlib that doesn't 
involve "fixing" os and os.path, now is the time to speak up.


--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Alexander Walters
This stance was probably already argued in the threads in question. This 
thread is more of a health-check.  As an observer, it did not look like 
any headway was being made, and I suggested the Solomonic solution.  It 
has been pointed out to me that headway IS being made and they are close 
to a solution.  I think this thread can safely be sunset.


On 4/11/2016 17:04, Sven R. Kunze wrote:

On 11.04.2016 22:55, Alexander Walters wrote:
Every conceivable way to fix pathlib has already been argued. Are 
any of them worth doing?  Can we get consensus enough to implement 
one of them?  If not, we should consider either dropping the matter 
or dropping the module.


Right now, I don't see pathlib removed. Why? Because using strings 
alone has its caveats (we all know that). So, I cannot imagine an 
alternative concept to pathlib right now. We might call it 
differently, but the concept stays unchanged.


MAYBE, if there's an alternative concept, I could be convinced to 
support dropping the module.


Best,
Sven

PS: The only way out that I can imagine is to fix pathlib. I am not in 
favor of fixing functions of "os" and "os.path" to accept "path" 
objects, which is what the majority here is discussing now with the new 
__fspath__ protocol. But shaping what we have is definitely worth it.



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Sven R. Kunze

On 11.04.2016 23:08, Random832 wrote:

On Mon, Apr 11, 2016, at 17:04, Sven R. Kunze wrote:

PS: The only way out that I can imagine is to fix pathlib. I am not in
favor of fixing functions of "os" and "os.path" to accept "path"
objects;

Why not?


It occurred to me after pondering over Paul's comments.

"os" and "os.path" are just a completely different level of abstraction. 
There is just no need to mess with them.


The initial failure of my colleague and me to use pathlib can be 
solely attributed to pathlib's lack of functionality, not to the 
incompatibility of "os" or "os.path" with "Path" objects.



Best,
Sven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Sven R. Kunze

On 11.04.2016 23:05, Random832 wrote:

On Mon, Apr 11, 2016, at 16:48, Sven R. Kunze wrote:

On 11.04.2016 22:33, Alexander Walters wrote:

If there is headway being made, I do not see it.

Funny that you brought it up. I was about to post something myself. I
cannot agree completely. But starting with a comment from Paul, I
realized that pathlib is something different than a string. After doing
the research into our issues with pathlib, I found:


- pathlib just needs to be improved (see my 5 points)
- os[.path] should not be tinkered with

I'm not so sure. Is there any particular reason os.path.join should
require its arguments to be homogenous, rather than allowing
os.path.join('a', b'b', Path('c')) to return 'a/b/c'?


Besides the fact that I don't like mixing types (this was something 
that worried me about the discussion from the beginning), you can 
achieve the same using pathlib alone.


There's no need for it, let alone for the maintenance burden and slowdown 
of these implicit conversions.



I know that all of those discussions of a new protocol (path->str,
__fspath__ etc. etc.) might be rendered worthless by these two
statements. But that's my conclusion.

"os" and "os.path" are just lower level. "pathlib" is a high-level,
convenience library. When using it, I don't want to use "os" or
"os.path" anymore. If I still do, "pathlib" needs improving. *Not "os"
nor "os.path"*.

The problem isn't you using os. It's you using other modules that use
os. or io, shutil, or builtins.open. Or pathlib, if what *you're* using
is some other path library. Are you content living in a walled garden
where there is only your code and pathlib, and you never might want to
pass a Path to some function someone else (who didn't use pathlib)
wrote?

os is being used as an example because fixing os probably gets you most
other things (that just pass it through to builtins.open which passes it
through to os.open) for free.


Hypothetical assumptions meeting implicit type conversions. You might 
prefer those; I don't, for good reason. I was one of those who started 
the discussion around pathlib improvements. I understand now that this 
is one of its minor issues. And btw., using some "other pathlib" is no 
argument for or against improving "THE pathlib".


The .path attribute will do it from what I can see.


Best,
Sven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib+os/shutil feedback

2016-04-11 Thread Brett Cannon
On Mon, 11 Apr 2016 at 13:40 Sven R. Kunze  wrote:

> On 10.04.2016 16:51, Paul Moore wrote:
> > On 10 April 2016 at 15:07, Sven R. Kunze  wrote:
> >> If there's some agreement to change things with respect to those 5
> >> points, I am willing to put some time into it.
> > In broad terms I agree with these points. Thanks for doing the
> > research. It would certainly be good to try to improve pathlib based
> > on this sort of feedback while it is still provisional.
>
> I'd appreciate some guidance on this. Just let me know what I can do
> since I don't know the processes of hacking CPython.
>

https://docs.python.org/devguide/ and
https://mail.python.org/mailman/listinfo/core-mentorship are your friends.
:)

For new features of a module you can discuss it on python-ideas first
before proposing a patch if you're worried a patch implementing the feature
might get rejected and you don't want to risk wasting your time.

-Brett
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Ben Finney
Alexander Walters  writes:

> That is great news.  I just couldn't see it myself in the threads

Agreed. A summary posting, from someone who has a good handle on the
issue and outcome, would be very helpful.

-- 
 \   “Firmness in decision is often merely a form of stupidity. It |
  `\indicates an inability to think the same thing out twice.” |
_o__)—Henry L. Mencken |
Ben Finney

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Brett Cannon
On Mon, 11 Apr 2016 at 14:11 Ethan Furman  wrote:

> On 04/11/2016 01:42 PM, Victor Stinner wrote:
> > 2016-04-11 21:00 GMT+02:00 Brett Cannon:
>
> >> I'm -0 on allowing __fspath__ to return bytes, but we can see what
> >> others think.
> >
> > With the PEP 383, a bytes filename can be stored as str using the
> > surrogateescape error handler. So DirEntry can convert a bytes path to
> > str using os.fsdecode().
>
> I am far from a unicode expert, but if I understand this correctly you
> are proposing that DirEntry.__whatever__ can always return a str using
> the surrogateescape (SE) method.
>
> However, before this SE string can be used, it would need to be
> converted back to bytes, and with the same SE method, yes?  And this has
> already been implemented in the stdlib?
>
> So my concern in such a case is what happens if we pass this SE string
> somewhere else: a UTF-8 file, or over a socket, or into a database?
> Does this have issues that we wouldn't face if we just used bytes?
>

This is my worry as well and why I have not proposed this kind of universal
normalizing of bytes paths using os.fsdecode() w/ surrogateescape. Doing
this sort of thing at the system boundary, and documenting it as such as PEP
383 proposed, makes a bit more sense as the expectation is more controlled
and there is a clear input boundary.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Sven R. Kunze

On 11.04.2016 23:15, Ethan Furman wrote:

We've pretty much decided that we have two options:

1. remove pathlib
2. make the stdlib work with pathlib

So we're trying to make option 2 work before falling back to option 1.

If you have a way to make pathlib work with the stdlib that doesn't 
involve "fixing" os and os.path, now is the time to speak up.


As I said, I don't like messing with os or os.path. They are built with 
a different level of abstraction in mind.



What makes people want to go down from pathlib to os (speaking in terms 
of abstraction) is the fact that pathlib suggests/promises a convenience 
that it cannot deliver. You might have seen my "feedback" post here on 
python-dev. If those points were corrected in a reasonable way, we 
wouldn't have had the need to go down to os or other stdlib modules. As 
it presents itself, it feels like a poor wrapper for os and os.path. I 
hope that makes sense.


So, I might add:

3. add more high-level features to pathlib to prevent a downgrade to os 
or os.path



Best,
Sven
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Summary of the pathlib discussion (Re: Maybe, just maybe, pathlib doesn't belong.)

2016-04-11 Thread Brett Cannon
On Mon, 11 Apr 2016 at 14:42 Ben Finney  wrote:

> Alexander Walters  writes:
>
> > That is great news.  I just couldn't see it myself in the threads
>
> Agreed. A summary posting, from someone who has a good handle on the
> issue and outcome, would be very helpful.
>


   - Guido has put Chris Angelico and myself in charge of drafting a
   proposal once we are done discussing things as a PEP (probably an amendment
   to the pathlib PEP where I will also explain why we are still not
   subclassing str)
   - Ethan Furman has volunteered to help out with code work (as have I)
   - Name bikeshedding never seems to end, but there seems to be coalescing
   around __fspath__ or __fspathname__ (I think, although __fspath__ seems to
   be what everyone has been typing today; I'm trying to stay out of it so as
   to not influence too much)
   - We are only discussing two things still (all going on in the threads
   relating to return values, arguments, types, etc. in their titles)...
  - Should path.__fspath__() be allowed to return bytes on top of
  strings? (we seem to have found an amicable way to allow os.fspath()
  to let a bytes argument pass through just like str in an explicit
  fashion)
  - Should we explicitly type check in os.fspath() what
  path.__fspath__() returns or just let it fall through and hope people do
  the right thing?
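
To make the two open questions concrete, here is a rough pure-Python sketch
of the behaviour under discussion. It is not an agreed design: the function
name fspath_sketch, the allow_bytes switch and the DemoPath class are all
hypothetical, introduced only to show where the "allow bytes?" and "type
check the return value?" decisions would sit.

```python
def fspath_sketch(path, allow_bytes=True):
    """Hypothetical stand-in for the proposed os.fspath()."""
    allowed = (str, bytes) if allow_bytes else (str,)
    # Plain str (and, optionally, bytes) arguments pass straight through.
    if isinstance(path, allowed):
        return path
    # Look the method up on the type, as is usual for special methods.
    try:
        method = type(path).__fspath__
    except AttributeError:
        raise TypeError("no __fspath__ on " + type(path).__name__) from None
    result = method(path)
    # The explicit check being debated: reject unexpected return types
    # instead of letting them fall through to the caller.
    if not isinstance(result, allowed):
        raise TypeError("__fspath__ returned " + type(result).__name__)
    return result


class DemoPath:
    """Hypothetical path object that only implements the protocol."""

    def __init__(self, value):
        self._value = value

    def __fspath__(self):
        return self._value


print(fspath_sketch(DemoPath("a/b")))
```

Dropping the final isinstance check gives the "let it fall through" variant;
passing allow_bytes=False gives the str-only variant.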

That's pretty much it unless Chris or Ethan disagree. So I think pathlib is
far from being as dead as a parrot. ;)
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Ethan Furman

name:
-----

We are down to two choices:

- __fspath__, or
- __fspathname__

The final choice I suspect will be affected by the choice to allow (or 
not) bytes.


method or attribute:
--------------------

method


built-in:
---------

Almost - we'll put it in the os module


add to str:
-----------

No, not all strings are paths.


add to C API:
-------------

Yes.  Possible names include PyUnicode_FromFSPath and PyObject_Path -- 
again, the choice of bytes inclusion will affect the final choice of name.


add a Path ABC:
---------------

undecided


Sticking points:
----------------

Do we allow bytes to be returned from os.fspath()?  If yes, then do we 
allow bytes from __fspath__()?


--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Summary of the pathlib discussion (Re: Maybe, just maybe, pathlib doesn't belong.)

2016-04-11 Thread Ethan Furman

On 04/11/2016 02:55 PM, Brett Cannon wrote:


That's pretty much it unless Chris or Ethan disagree. So I think pathlib
is far from being as dead as a parrot. ;)


That's nearly exactly what I wrote in my summary.  :)

So, yes, we are nearly there!

--
~Ethan~
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Wes Turner
You seem to be defining a (restricted subset of an existing) language;
which will need version strings and ABI tags for compatibility purposes:

* Build Tags (for Python variants):
   * https://www.python.org/dev/peps/pep-0425/
     * Python tag
     * ABI tag
     * Platform tag
   * https://www.python.org/dev/peps/pep-0513/ manylinux1
   * https://www.python.org/dev/peps/pep-3149/ .so file tags
   * RestrictedPython does not have ABI tags

An Android CPython build discussion about just exposing an extra attribute
in the platform module (the Android build also ships without some modules
IIRC):
* https://mail.python.org/pipermail/python-dev/2014-August/135606.html
* https://mail.python.org/pipermail/python-dev/2014-August/thread.html#135640

On 11 April 2016 at 15:46, Jon Ribbens 
wrote:
> It's trying to alter
> the global Python environment so that arbitrary code can be executed,
> whereas I am not even trying to allow execution of arbitrary code and
> am not altering the global environment.

However, it's not at all clear (to me at least) what you *are* trying
to do. You're limiting the subset of Python that people can use,
understood. And you're trying to ensure that people can't do "bad
things". Again, understood. But what subset are you actually allowing,
and what things are you trying to protect against? (For example, I
can't calculate sin(1.2) using the math module - why is that not
allowed? It's just as safe as using the built in exponential
operator, and indeed I could write a sin() function in pure Python,
although it would be too slow to be useful, unlike math.sin...)

It feels at the moment as if I'm playing a game where I don't know the
rules, and every time I think I scored a point, the rules are changed
to retroactively disallow it.

Paul
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Donald Stufft

> On Apr 11, 2016, at 5:58 PM, Ethan Furman  wrote:
> 
> name:
> 
> 
> We are down to two choices:
> 
> - __fspath__, or
> - __fspathname__
> 
> The final choice I suspect will be affected by the choice to allow (or not) 
> bytes.


+1 on __fspath__, -0 on __fspathname__

> 
> 
> 
> add a Path ABC:
> --
> 
> undecided


I think it makes sense to add it, but maybe only in 3.6? Path accepting code 
could be updated to do something like `isinstance(obj, (bytes, str, PathMeta))` 
which seems like a net win to me.
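
One way such an ABC could look, as a sketch only: FSPathABC, ThirdPartyPath
and accepts_path are hypothetical names (the message above calls the ABC
PathMeta), and the structural check via __subclasshook__ is an assumption
about how third-party classes would be recognised without inheriting from it.

```python
import abc


class FSPathABC(abc.ABC):
    """Hypothetical ABC: any type that defines __fspath__ qualifies."""

    @abc.abstractmethod
    def __fspath__(self):
        raise NotImplementedError

    @classmethod
    def __subclasshook__(cls, subclass):
        # Structural check: look for __fspath__ anywhere in the MRO.
        if cls is FSPathABC and any(
            "__fspath__" in vars(base) for base in subclass.__mro__
        ):
            return True
        return NotImplemented


class ThirdPartyPath:
    """Hypothetical path class that never heard of the ABC."""

    def __fspath__(self):
        return "some/where"


def accepts_path(obj):
    # The kind of check path-accepting code could then perform.
    return isinstance(obj, (bytes, str, FSPathABC))


print(accepts_path(ThirdPartyPath()), accepts_path("x"), accepts_path(42))
```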

> 
> 
> Sticking points:
> ---
> 
> Do we allow bytes to be returned from os.fspath()?  If yes, then do we allow 
> bytes from __fspath__()?

I think yes and yes, it seems like making it needlessly harder to deal with a 
bytes path in the scenarios that you’re actually dealing with them is the kind 
of change that 3.0 made that ended up getting rolled back where it could.

-
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Jon Ribbens
On Tue, Apr 12, 2016 at 03:02:54AM +1000, Chris Angelico wrote:
> On Tue, Apr 12, 2016 at 2:53 AM, Jon Ribbens
>  wrote:
> > On Mon, Apr 11, 2016 at 04:04:21PM +0100, Paul Moore wrote:
> >> However, it's not at all clear (to me at least) what you *are* trying
> >> to do.
> >
> > I'm trying to see to what extent we can use ast node inspection to
> > remedy the failures of prior attempts at Python sandboxing. Is there
> > *any* extent to which Python can be sandboxed, or is even trying to
> > use it as a calculator function unfixably insecure?
> 
> It all depends on how much functionality you want. If all you need is
> a numeric expression evaluator, that's not too hard - disallow all
> forms of attribute access, etc, and just have simple numbers and
> operators. That's pretty useful, and safe.

By "calculator" I didn't necessarily mean to imply numeric-only,
sorry if I was unclear. Also perhaps I should have said "non-trivial",
inasmuch as if we restrict it that far then it would quite possibly be
simpler and quicker just to write the expression evaluator from scratch
and not use the Python interpreter at all.
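
As an illustration of the "numbers and operators only" end of that spectrum,
here is a small sketch that walks the AST and evaluates only whitelisted
node types. It is not the sandbox under discussion in this thread; safe_eval
and the operator tables are hypothetical, it assumes Python 3.8+ (where
literals parse to ast.Constant), and it deliberately leaves out ** so that
expressions like 9**9**9 cannot be used to exhaust memory or CPU.

```python
import ast
import operator

# Whitelisted operators; anything not listed here is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_eval(expr):
    """Evaluate a purely numeric expression; raise ValueError otherwise."""

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        # Only plain int/float literals (bool, str, None etc. are refused).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](walk(node.operand))
        # Names, calls, attribute access, subscripts, ... all end up here.
        raise ValueError("disallowed syntax: " + type(node).__name__)

    return walk(ast.parse(expr, mode="eval"))


print(safe_eval("1 + 2 * 3"))
```

Because nothing is ever handed to eval() or exec(), attribute tricks such as
().__class__ are refused at the AST stage rather than filtered afterwards.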

> Alternatively, go completely the other way. Let people run whatever
> code they like... in an environment where it can't hurt anyone else.
> That's what PyPyJS does - don't bother looking for security holes in
> it, because all you're doing is attacking your own computer.

That's a very specific use case though: running client-side in the
user's browser.

> So before you can ask whether Python is unfixably insecure, you first
> have to decide what the minimum level of functionality is that you'll
> accept. Do you need basic arithmetic plus trigonometric functions? Easy
> enough - disallow all attribute access and imports, and populate
> builtins with "from math import *". Need them to be able to assign
> variables and define functions? That's gonna be harder.

I think calling functions and accessing variables and attributes is
likely a minimum. Defining functions would be useful, and of course
defining classes would be another useful step further.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Random832
On Mon, Apr 11, 2016, at 17:15, Ethan Furman wrote:
> So we're trying to make option 2 work before falling back to option 1.
> 
> If you have a way to make pathlib work with the stdlib that doesn't 
> involve "fixing" os and os.path, now is the time to speak up.

Fully general re-dispatch from argument types on any call to a function
that raises TypeError or NotImplemented? [e.g. call
Path.__missing_func__(os.open, path, mode)]

Have pathlib monkey-patch things at import?


On Mon, Apr 11, 2016, at 17:43, Sven R. Kunze wrote:
> So, I might add:
> 
> 3. add more high-level features to pathlib to prevent a downgrade to os 
> or os.path

3. reimplement the entire ecosystem in every walled garden so no-one has
to leave their walled gardens.

What's the point of batteries being included if you can't wire them to
anything?

I don't get what you mean by this whole "different level of abstraction"
thing, anyway. The fact that there is one obvious thing to want to do
with open and a Path strongly suggests that that should be able to be
done by passing the Path to open.

Also, what level of abstraction is builtin open? Maybe we should _just_
leave os alone on the grounds of some holy sacred lowest-level-itude,
but allow io and shutil to accept Path?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Victor Stinner
On 11 Apr 2016 at 11:11 PM, "Ethan Furman" wrote:
> So my concern in such a case is what happens if we pass this SE string
> somewhere else: a UTF-8 file, or over a socket, or into a database? Does
> this have issues that we wouldn't face if we just used bytes?

"SE strings" are returned by os.listdir(str), os.walk(str), os.getenv(str),
sys.argv[int], ... since Python 3.3. Nothing new under the sun.

Trying to encode a surrogate to ascii, latin1 or utf8 raises an encoding
error. A surrogate is created to store an undecodable byte in a filename.
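
To make the surrogate behaviour concrete, here is a minimal sketch. It
uses an explicit UTF-8 codec with the surrogateescape error handler
instead of os.fsdecode(), so the result does not depend on the
platform's filesystem encoding. The file name is made up for the example.

```python
# A Latin-1 encoded name: the byte 0xE9 is not valid UTF-8 on its own.
raw = b"caf\xe9.txt"

# surrogateescape maps each undecodable byte 0xNN to the lone surrogate U+DCNN.
name = raw.decode("utf-8", "surrogateescape")
assert name == "caf\udce9.txt"

# A strict encode refuses the lone surrogate instead of writing garbage.
try:
    name.encode("utf-8")
except UnicodeEncodeError:
    strict_encode_failed = True
else:
    strict_encode_failed = False

# Encoding with the same handler restores the original bytes exactly.
roundtrip = name.encode("utf-8", "surrogateescape")
```

The strict encode fails loudly, while the escaped round trip is lossless.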

IMHO it's safer to get an encoding error rather than no error when you
concatenate two byte strings encoded to two different encodings (mojibake).

print(os.fspath(obj)) will more likely do what you expect if os.fspath()
always returns str. I mean that it will encode your filename to the encoding
of the terminal, which can be different from the filesystem encoding.

If fspath() can return bytes, you should write
print(os.fsdecode(os.fspath(obj))).

--

On Linux, open(DirEntry) for a bytes entry (os.scandir(bytes)) would have
to first decode a bytes filename with os.fsdecode() to then encode it back
with os.fsencode().

Yeah, that's inefficient. But we now have super fast codecs (e.g. encoding
and decoding are almost a memcpy for pure ASCII). And filenames are usually
very short (less than 300 bytes). IMHO the interface matters more than
performance.

As I showed with my print example, filenames are not only used to access
the filesystem, you also want to display them. Using Unicode avoids bad
surprises (mojibake).

--

Well, the question is more why you want to get bytes in the first place.
Why not use only Unicode?

I understood that some people expect mojibake when using Unicode, whereas
using bytes cannot lead to mojibake. Well, in practice it's simply the
opposite :-)

Maybe devs read that Linux syscalls and C functions take bytes, so using
bytes gives access to any filename, including "invalid filenames". That's
true. But it's also true for Unicode if you use os.fsdecode().

Maybe devs don't understand, don't know, and fear Unicode :-)

My goal is more to educate users and help them to avoid mojibake.

Did I mention that you must not use bytes filenames on Windows? So using
Unicode everywhere helps to write really portable code. On Windows, using
Unicode is required to be able to open any file.

Victor


Re: [Python-Dev] Summary of the pathlib discussion (Re: Maybe, just maybe, pathlib doesn't belong.)

2016-04-11 Thread Chris Angelico
On Tue, Apr 12, 2016 at 7:55 AM, Brett Cannon  wrote:
> That's pretty much it unless Chris or Ethan disagree. So I think pathlib is
> far from being as dead as a parrot. ;)

That looks like an accurate summary!

ChrisA


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Ethan Furman

On 04/11/2016 01:42 PM, Victor Stinner wrote:


> With the PEP 383, a bytes filename can be stored as str using the
> surrogateescape error handler. So DirEntry can convert a bytes path to
> str using os.fsdecode().


Does this mean that os.fsdecode() is simply a wrapper that sets the 
errors to the surrogateescape handler?
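
As a rough answer in code, here is a simplified sketch of what
os.fsdecode() does for bytes input. The name fsdecode_sketch is
hypothetical, sys.getfilesystemencodeerrors() needs Python 3.6 or later,
and the real function handles more input types on newer versions, so
treat this as an approximation and not as the stdlib implementation. On
POSIX systems the error handler is typically surrogateescape; on Windows
it can differ.

```python
import os
import sys

def fsdecode_sketch(filename):
    """Approximate os.fsdecode() for str and bytes input only."""
    if isinstance(filename, str):
        return filename  # already text: returned unchanged
    if isinstance(filename, bytes):
        # Decode with the filesystem encoding and its error handler.
        return filename.decode(sys.getfilesystemencoding(),
                               sys.getfilesystemencodeerrors())
    raise TypeError("expected str or bytes, not " + type(filename).__name__)

# For a plain ASCII name the sketch and the real function agree.
print(fsdecode_sketch(b"abc.txt") == os.fsdecode(b"abc.txt"))
```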


--

~Ethan~


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread INADA Naoki
Sorry, I forgot to use "Reply All".

On Tue, Apr 12, 2016 at 9:49 AM, INADA Naoki  wrote:

>> IMHO it's safer to get an encoding error rather than no error when you
>> concatenate two byte strings encoded to two different encodings (mojibake).
>>
>> print(os.fspath(obj)) will more likely do what you expect if os.fspath()
>> always return str. I mean that it will encode your filename to the encoding
>> of the terminal which can be different than the filesystem encoding.
>>
>> If fspath() can return bytes, you should write
>> print(os.fsdecode(os.fspath(obj))).
>>
>>
> Why not print(obj)?
> str() is normal high-level API, and __fspath__ and os.fspath() should be
> low level API.
> Normal users shouldn't use __fspath__ and os.fspath().  Only library
> developers should use it.
>
> --
> INADA Naoki  
>

-- 
INADA Naoki  


Re: [Python-Dev] Pathlib enhancements - acceptable inputs and outputs for __fspath__ and os.fspath()

2016-04-11 Thread Greg Ewing

Ethan Furman wrote:

> # after new protocol with bytes/str support
> def zingar(a_path):
>     a_path = fspath(a_path)
>     if not isinstance(a_path, (bytes, str)):
>         raise TypeError('bytes or str required')
>     ...


I think that one would be just

   def zingar(a_path):
       a_path = fspath(a_path)

because fspath() would presumably check the result for
str/bytesness itself. At least I can't think of a reason
for it not to, since returning either str or bytes is
part of its contract.
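
A minimal sketch of an fspath() that enforces that contract itself
follows. The names fspath_sketch and Config are hypothetical and exist
only for this illustration; the protocol was still under discussion in
this thread, so this shows one possible shape, not a settled API.

```python
def fspath_sketch(path):
    """Return str or bytes for `path`, or raise TypeError."""
    if isinstance(path, (str, bytes)):
        return path  # plain paths pass through unchanged
    try:
        method = type(path).__fspath__  # look the hook up on the type
    except AttributeError:
        raise TypeError("expected str, bytes or an object with __fspath__, "
                        "not " + type(path).__name__) from None
    result = method(path)
    if not isinstance(result, (str, bytes)):
        # The check lives here, so callers do not have to repeat it.
        raise TypeError("__fspath__ must return str or bytes, not "
                        + type(result).__name__)
    return result

class Config:
    """Hypothetical path-like object."""
    def __init__(self, location):
        self.location = location

    def __fspath__(self):
        return self.location

def zingar(a_path):
    # No isinstance check needed: fspath_sketch already guarantees the type.
    return fspath_sketch(a_path)
```

With the check inside the helper, zingar() stays a one-liner and still
gets a clear TypeError for bad input.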

--
Greg


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Greg Ewing

Jon Ribbens wrote:

> So far it looks like blocking "_*" and the frame object attributes
> appears to be sufficient.


Even if your sandbox as it currently exists is secure, it's
only an extremely restricted subset. You seem to be assuming
that if your technique works so far, then it can be extended
to cover a larger subset, but I don't think that's certain.

One problem that's been raised is how to prevent untrusted
code from monkeypatching imported modules. Possibly that
could be addressed by giving the untrusted code a copy of
the module, but I'm not entirely sure -- accidentally
importing two copies of the same source file is a well-known
source of bugs, after all.

A related, but more difficult problem is that if we allow
the untrusted code to import any pure-Python classes, it
will be able to monkeypatch them. So it seems like it will
need its own copy of those classes as well -- and having
two copies of the same class around is *another* well
known source of bugs.
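
To illustrate the "two copies of the same class" problem, here is a
small sketch. It builds two module objects from the same source text
with exec(); the module name "shapes", the class Point and the helper
load_copy are all made up for the example, and a real sandbox would
load its copies differently.

```python
import types

SOURCE = """
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y
"""

def load_copy(name):
    """Execute SOURCE in a fresh module object and return it."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    return module

trusted = load_copy("shapes")
sandboxed = load_copy("shapes")

p = sandboxed.Point(1, 2)

# The two classes look identical but are distinct objects.
same_name = trusted.Point.__name__ == sandboxed.Point.__name__
same_class = trusted.Point is sandboxed.Point
recognised = isinstance(p, trusted.Point)

# Monkeypatching the sandboxed copy leaves the trusted copy untouched.
sandboxed.Point.norm = lambda self: 0
leaked = hasattr(trusted.Point, "norm")
```

The copy protects the trusted class from patching, but an instance made
by the sandboxed code fails isinstance() checks against the trusted
class, which is the kind of bug being described.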

--
Greg


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Wes Turner
On Mon, Apr 11, 2016 at 8:08 PM, Greg Ewing 
wrote:

> Jon Ribbens wrote:
>
>> So far it looks like blocking "_*" and the frame object attributes
>> appears to be sufficient.
>>
>
> Even if your sandbox as it currently exists is secure, it's
> only an extremely restricted subset. You seem to be assuming
> that if your technique works so far, then it can be extended
> to cover a larger subset, but I don't think that's certain.
>

How would you test that?


> One problem that's been raised is how to prevent untrusted
> code from monkeypatching imported modules. Possibly that
> could be addressed by giving the untrusted code a copy of
> the module, but I'm not entirely sure -- accidentally
> importing two copies of the same source file is a well-known
> source of bugs, after all.
>

https://en.wikipedia.org/wiki/Monkey_patch#Pitfalls

* https://pypi.python.org/pypi?%3Aaction=search&term=monkeypatch&submit=search

  * https://pypi.python.org/pypi/apparmor_monkeys
  * http://eventlet.net/doc/patching.html#monkeypatching-the-standard-library
  * http://www.gevent.org/gevent.monkey.html
  * https://docs.python.org/3/library/asyncio-sync.html#locks
  * https://docs.python.org/2/library/threading.html#lock-objects
  * https://docs.python.org/2/library/sets.html?highlight=immutable#sets.ImmutableSet
  * http://doc.pypy.org/en/latest/stm.html#locks
    - "Infinite recursion just segfaults for now."
  * https://github.com/tobgu/pyrsistent #justfoundthis
    - https://github.com/tobgu/pyrsistent#invariants
    - https://github.com/tobgu/pyrsistent#freeze-and-thaw
      - freeze, thaw

  * define a @property (and no @propname.setter)
    - https://docs.python.org/2/howto/descriptor.html#properties
    - https://docs.python.org/2/library/functions.html#property
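
A short sketch of the last item, a property without a setter. The class
Settings and its attribute names are invented for the example. Note the
limits: this only stops plain attribute assignment, and anything that
can reach the underscore attribute or the class itself can still change
the value.

```python
class Settings:
    def __init__(self, limit):
        self._limit = limit

    @property
    def limit(self):
        # Read-only view: no @limit.setter is defined.
        return self._limit

s = Settings(10)

try:
    s.limit = 99
except AttributeError:
    blocked = True
else:
    blocked = False

# The protection is shallow: the backing attribute is still writable.
s._limit = 99
```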


> A related, but more difficult problem is that if we allow
> the untrusted code to import any pure-Python classes, it
> will be able to monkeypatch them. So it seems like it will
> need its own copy of those classes as well --


* https://docs.python.org/3/library/importlib.html#importlib.__import__


> and having
> two copies of the same class around is *another* well
> known source of bugs.


One way to reduce the likelihood of this is to
bundle all dependencies into a self-contained
PEX ZIP package
and specify entry points.

* http://legacy.python.org/dev/peps/pep-0441/
* https://pex.readthedocs.org/en/stable/buildingpex.html#specifying-entry-points
* https://pex.readthedocs.org/en/stable/buildingpex.html#tailoring-pex-execution-at-build-time


>
>
> --
> Greg


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Jon Ribbens
On Tue, Apr 12, 2016 at 01:08:36PM +1200, Greg Ewing wrote:
> Jon Ribbens wrote:
> >So far it looks like blocking "_*" and the frame object attributes
> >appears to be sufficient.
> 
> Even if your sandbox as it currently exists is secure, it's
> only an extremely restricted subset.

I'm not sure what you think the restrictions are, but yes a highly
restricted Python that was secure would be very useful sometimes.

> You seem to be assuming that if your technique works so far, then it
> can be extended to cover a larger subset, but I don't think that's
> certain.

No, I'm not assuming that.

> One problem that's been raised is how to prevent untrusted
> code from monkeypatching imported modules. Possibly that
> could be addressed by giving the untrusted code a copy of
> the module,

Yes, that's what it does.

> but I'm not entirely sure -- accidentally importing two copies of
> the same source file is a well-known source of bugs, after all.

I'm not sure what you mean by that.

> A related, but more difficult problem is that if we allow
> the untrusted code to import any pure-Python classes, it
> will be able to monkeypatch them. So it seems like it will
> need its own copy of those classes as well

Yes, that's also what it does.

> -- and having two copies of the same class around is *another* well
> known source of bugs.

I'm not sure what you mean by that either.


Re: [Python-Dev] Challenge: Please break this! (a.k.a restricted mode revisited)

2016-04-11 Thread Chris Angelico
On Tue, Apr 12, 2016 at 8:43 AM, Jon Ribbens
 wrote:
> On Tue, Apr 12, 2016 at 03:02:54AM +1000, Chris Angelico wrote:
>> It all depends on how much functionality you want. If all you need is
>> a numeric expression evaluator, that's not too hard - disallow all
>> forms of attribute access, etc, and just have simple numbers and
>> operators. That's pretty useful, and safe.
>
> By "calculator" I didn't necessarily mean to imply numeric-only,
> sorry if I was unclear. Also perhaps I should have said "non-trivial",
> inasmuch as if we restrict it that far then it would quite possibly be
> simpler and quicker just to write the expression evaluator from scratch
> and not use the Python interpreter at all.

I'm aware you wanted more. My point is that it's not hard to secure
the trivially simple, and it doesn't have to be entirely useless. But
every bit of additional power brings with it additional risk.
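
As a sketch of the "trivially simple" end of that spectrum, here is a
numeric evaluator that walks the AST and accepts only numbers and a few
operators. The function names and the chosen operator set are
assumptions for the example; it relies on ast.Constant (Python 3.8 or
later), and it leaves out exponentiation on purpose because very large
powers can consume unbounded time and memory. It is not a reviewed
security boundary.

```python
import ast
import operator

# Whitelisted operators; anything else is rejected.
_BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
           ast.Mult: operator.mul, ast.Div: operator.truediv}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _eval(node):
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval(node.operand))
    # Names, calls, attribute access, strings, etc. all end up here.
    raise ValueError("disallowed expression: " + type(node).__name__)

def evaluate(expression):
    """Evaluate a purely numeric expression without calling eval()."""
    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)
```

For instance, evaluate("2 + 3 * 4") gives 14, while an expression such
as "().__class__" is rejected because attribute access is not on the
whitelist.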

ChrisA


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Nick Coghlan
On 12 April 2016 at 07:58, Ethan Furman  wrote:
> Sticking points:
> ---
>
> Do we allow bytes to be returned from os.fspath()?  If yes, then do we allow
> bytes from __fspath__()?

I've come around to the point of view that allowing both str and
bytes-like objects to pass through unchanged makes sense, with the
rationale being the one someone mentioned regarding ease-of-use in
os.path.

Consider os.path.join: with a permissive os.fspath, the necessary
update should just be to introduce "map(os.fspath, args)" (or its C
equivalent), and then continue with the existing bytes vs str handling
logic.
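
A sketch of that update, written against os.fspath() as it exists in
Python 3.6 and later. The wrapper join_sketch and the class Location are
hypothetical; the point is only that the conversion is a single map()
in front of the existing str/bytes logic.

```python
import os
import os.path

class Location:
    """Hypothetical path-like object used only for this example."""
    def __init__(self, text):
        self._text = text

    def __fspath__(self):
        return self._text

def join_sketch(first, *rest):
    # Convert every argument up front, then reuse the existing logic,
    # which already rejects a mix of str and bytes components.
    parts = list(map(os.fspath, (first,) + rest))
    return os.path.join(*parts)

print(join_sketch(Location("base"), "sub", Location("file.txt")))
```

Because both str and bytes pass through os.fspath() unchanged, an
all-bytes call keeps working and a mixed call still raises TypeError
from the existing checks.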

Functions consuming os.fspath can then decide on a case-by-case basis
how they want to handle binary paths: either use them as is (which
will usually work on mostly-ASCII systems), convert them to text with
os.fsdecode (which will usually work on *nix systems), or disallow
them entirely (which would probably only be appropriate for libraries
that wanted to ensure support for non-ASCII paths on Windows systems).
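
The three policies could look roughly like this. The function names are
invented for the example, and os.fspath() is used as it exists in
Python 3.6 and later.

```python
import os

def path_as_is(path):
    """Policy 1: accept whatever fspath yields, str or bytes."""
    return os.fspath(path)

def path_as_text(path):
    """Policy 2: always hand back text, decoding bytes with os.fsdecode."""
    return os.fsdecode(os.fspath(path))

def path_text_only(path):
    """Policy 3: refuse binary paths outright."""
    result = os.fspath(path)
    if isinstance(result, bytes):
        raise TypeError("bytes paths are not supported here")
    return result
```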

That then cascades into the other open questions mentioned:

- permitted return types for both fspath and __fspath__ would be (str, bytes)
- the names would be fspath and __fspath__, since the result may be
either a path name as text, or an encoded path name as bytes

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Nick Coghlan
On 12 April 2016 at 13:45, Nick Coghlan  wrote:
> Consider os.path.join: with a permissive os.fspath, the necessary
> update should just be to introduce "map(os.fspath, args)" (or its C
> equivalent), and then continue with the existing bytes vs str handling
> logic.

That does remind me: once a patch is available, we should check the
benchmark numbers with the patch applied. I'd expect the new protocol
overhead to be swamped by the actual IO costs, but this kind of low
level change can have surprising consequences.

Regarding the type checks, PyObject_AsFilesystemPath (or whatever we
call it) will be implemented in C, with os.fspath just calling that,
so doing "PyUnicode_Check(path) || PyBytes_Check(path)" on the result
will be both cheap and convenient for API consumers (since it means
they know they only have to cope with bytes or str instances
internally, and will get a clear error message if handed something
else).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Chris Barker - NOAA Federal
>  with the
> rationale being the one someone mentioned regarding ease-of-use in
> os.path.
>
> Consider os.path.join:

Why in the world do the os.path functions need to work with Path
objects (and other conforming objects)?

This all started with the goal of using Path objects in the stdlib,
but that's for opening files, etc. Path is an alternative to os.path
-- you don't need to use both.

And if you do have a bytes path, you can stick with os.path.

BTW,

I'm confused about what a bytes path IS -- is it encoded? Can you
assume it can be decoded? It seems to me that the ONLY time you
should get a bytes path is from a low level system call on a posix
system, and you may have no idea how it's encoded. So the ONLY thing
you should do with it is pass it along to another low level system
call.

I can't see why we should support anything else with bytes objects.

> - the names would be fspath and __fspath__, since the result may be
> either a path name as text, or an encoded path name as bytes

You just used the phrase "path name as bytes" -- so why is
__pathname__ inappropriate if it might return bytes?

I like __pathname__ better because this entire effort is because we've
decided it's important to make the distinction between a "path" and
the text representation of said path.

Just sayin'

-CHB


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Stephen J. Turnbull
Donald Stufft writes:

 > I think yes and yes [__fspath__ and fspath should be allowed to
 > handle bytes, otherwise] it seems like making it needlessly harder
 > to deal with a bytes path

It's not needless.  This kind of polymorphism makes it hard to review
code locally.  Once bytes get a foothold inside a text application,
they metastasize altogether too easily, and you end up with TypeErrors
or UnicodeErrors quite far from the origin.  Debugging often requires
tracing data flows over hill and over dale while choking from the
dusty trail, or band-aids like a top-level "except UnicodeError:
log_and_quarantine(bytes)".  I can't prove that returning bytes from
these APIs is a big risk in this sense, but I can't see a way to prove
that it's not, either, given that their point is duck-typing, and
therefore they may be generalized in the future, and by third parties.

I understand that there are applications where it's bytes all the way
down, but by the very nature of computing systems, there are systems
where bytes are decoded to text.  For historical reasons (the encoding
Tower of Babel), it's very error-prone to do that on demand.  Best
practice is to do the conversion as close to the boundary as possible,
and process only text internally.
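
A toy sketch of that practice: decode once on the way in, work on str
only, and encode once on the way out. The function names and the sample
data are made up, and UTF-8 is assumed as the wire encoding.

```python
def read_records(raw_lines, encoding="utf-8"):
    """Input boundary: decode once, so the rest of the program sees only str."""
    return [line.decode(encoding).rstrip("\n") for line in raw_lines]

def shout(records):
    """Internal processing: text in, text out, no bytes anywhere."""
    return [record.upper() for record in records]

def write_records(records, encoding="utf-8"):
    """Output boundary: encode once, on the way out."""
    return [(record + "\n").encode(encoding) for record in records]

incoming = [b"caf\xc3\xa9\n", b"na\xc3\xafve\n"]
outgoing = write_records(shout(read_records(incoming)))
```

If the input is not valid UTF-8, the failure happens in read_records(),
right at the boundary, and not somewhere deep inside the processing code.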

In text applications, "bytes as carcinogen" is an apt metaphor.

Now, I'm not Dutch, so I can't tell you it's obvious that the risk to
text-processing applications is more important than the inconvenience
to byte-shoveling applications.  But there is a need to be
parsimonious with polymorphism.



Re: [Python-Dev] thoughts on backporting __wrapped__ to 2.7?

2016-04-11 Thread Robert Collins
On 6 April 2016 at 15:03, Stephen J. Turnbull  wrote:
> Robert Collins writes:
>
>  > Sadly that has the ordering bug of assigning __wrapped__ first and appears
>  > a little unmaintained based on the bug tracker :(
>
> You can fix two problems with one patch, then!
>

Not really - taking over a project is somewhat long winded; it would
be centralising yet another backport which
may-or-may-not-be-a-good-thing, and I'm not exactly overflowing with
spare tuits. If someone wants to do it - great, more power to them,
but the last thing we need is to move it from one unmaintained spot to
another unmaintained spot.

-Rob



-- 
Robert Collins 
Distinguished Technologist
HP Converged Cloud


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Greg Ewing

Chris Barker - NOAA Federal wrote:

> Why in the world do the os.path functions need to work with Path
> objects?


So that applications using path objects can pass them
to library code that uses os.path to manipulate them.


> I'm confused about what a bytes path IS -- is it encoded?


It's a sequence of bytes identifying a file. Often it
will be an encoding of some piece of text in the file
system encoding, but there's no guarantee of that.


> Can you assume it can be decoded?


Only if you use an encoding in which all byte sequences
are valid, such as latin1 or utf8+surrogateescape.


> So the ONLY thing
> you should do with it is pass it along to another low level system
> call.


Not quite -- you can separate it into components and
work with them. Essentially the same set of operations
that os.path provides.
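
Both points can be checked with a short sketch. It uses posixpath
directly so that the splitting behaves the same on every platform; the
sample path is invented and deliberately contains bytes that are not
valid UTF-8.

```python
import posixpath

# Every possible byte value decodes under both schemes and round-trips.
every_byte = bytes(range(256))
as_latin1 = every_byte.decode("latin-1")
as_escaped = every_byte.decode("utf-8", "surrogateescape")
latin1_ok = as_latin1.encode("latin-1") == every_byte
escaped_ok = as_escaped.encode("utf-8", "surrogateescape") == every_byte

# A bytes path can be taken apart without ever decoding it.
raw_path = b"/srv/data/\xff\xfe/report.txt"
directory, filename = posixpath.split(raw_path)
stem, extension = posixpath.splitext(filename)
```

The undecodable directory name survives untouched, because the split
only looks for the separator and the extension dot.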


>> - the names would be fspath and __fspath__, since the result may be
>> either a path name as text, or an encoded path name as bytes
>
> I like __pathname__ better because this entire effort is because we've
> decided it's important to make the distinction between a "path" and
> the text representation of said path.


I agree -- the term "pathname" can cover both text and
bytes. When posix talks about pathnames it's really
talking about bytes.

--
Greg


Re: [Python-Dev] pathlib - current status of discussions

2016-04-11 Thread Ethan Furman

On 04/11/2016 10:14 PM, Chris Barker - NOAA Federal wrote:


>> Consider os.path.join:
>
> Why in the world do the os.path functions need to work with Path
> objects? (and other conforming objects)


Because library XYZ that takes a path and wants to open it shouldn't 
have to care whether that path is a string or pathlib.Path -- but if 
os.open can't use pathlib.Path then the library has to care (or the user 
has to care).



> This all started with the goal of using Path objects in the stdlib,
> but that's for opening files, etc.


Etc. as in os.path.join? os.stat? os.path.split?


> Path is an alternative to os.path -- you don't need to use both.


As a user you don't, no.  As a library that has no control over what 
kind of "path" is passed to you -- well, if os and os.path can accept 
Path objects then you can just use os and os.path; otherwise you have to 
use os and os.path if passed a str or bytes, and pathlib.Path if passed 
a pathlib.Path -- so you do have to use both.



>> - the names would be fspath and __fspath__, since the result may be
>> either a path name as text, or an encoded path name as bytes
>
> You just used the phrase "path name as bytes" -- so why is
> __pathname__ inappropriate if it might return bytes?


No, he used the phrase "*encoded* path name as bytes".  Names are 
typically represented as text, and since bytes might be returned we 
don't want a signal that says text.



> I like __pathname__ better because this entire effort is because we've
> decided it's important to make the distinction between a "path" and
> the text representation of said path.


No, this entire effort is to make pathlib work with the rest of the stdlib.

--
~Ethan~


[Python-Dev] Maybe, just maybe, pathlib doesn't belong.

2016-04-11 Thread Stephen J. Turnbull
Alexander Walters writes:

 > If there is headway being made, I do not see it.

Filter out everything but the posts by Brett, and see if you still
feel that way.  (Other people have contributed[1], but that filter
has about 20dB better S/N than the whole thread does.)


Footnotes: 
[1]  Brett may even claim none of the ideas are his.
