[Python-Dev] Re: PEP 622: Structural Pattern Matching

2020-06-27 Thread Daniel.
On Sat, 27 Jun 2020 at 11:12, Richard Damon <
rich...@damon-family.org> wrote:

> On 6/27/20 5:36 AM, Stephen J. Turnbull wrote:
> > Richard Damon writes:
> >
> >  > I thought _ was also commonly used as:
> >  >
> >  > first, _, last = (1, 2, 3)
> >  >
> >  > as a generic don't care about assignment.
> >
> > It is.  But there are other options (eg, 'ignored') if '_' is used for
> > translation in the same scope.
> >
> >  > I guess since the above will create a local, it won't overwrite a
> >  > 'global' function _ used for translations, so the above usage works as
> >  > long as that function (or whatever namespace you are in) doesn't
> >  > use _ for translations.
> >
> > Exactly.
> >
> >  > As long as the bindings in match also make the symbol a local
> >  > (which seems reasonable) then you would get a similar restriction.
> >
> > It's quite different.  First, it surely won't make other symbols
> > match-local.  Of course there will be times when you do all the work
> > inside the match statement.  But often you'll want to do bindings in a
> > match statement, then use those outside.  The second problem is that
> > this use of '_' isn't optional.  It's part of the syntax.  That means
> > that you can't use the traditional marking of a translatable string
> > (and it's not just tradition; there is a lot of external software that
> > expects it) in that scope.
> >
> > So it's practically important, if not theoretically necessary, that
> > 'case _' not bind '_'.
> >
> > Steve
>
> I wasn't implying local to the match statement, but if the match is used
> inside a function, where using the binding operator = will create a
> local name even if there is a corresponding global name that matches
> (unless you use the global statement), will a match statement that binds
> to a name that hasn't been made a local name by an explicit
> assignment to it actually bind to a global that might be present, or
> will it create a local? My first feeling is that binding to the global
> would be surprising.
>
> i.e.
>
> foo = 1
>
> def bar(baz):
>     match baz:
>         case 1: print('baz was one')
>         case foo: print('baz was ', foo)
>
> bar(2)
>
> print(foo)
>
>
> will this script create a new foo name inside bar, so that when we
> return, the module global foo is still 1, or did it bind to the global
> and change it?
>
> Rebinding a global without a global statement would be unexpected
> (normally we can mutate the global, but not rebind it)
>
> --
> Richard Damon
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/A2YKBTEHILHRNLN62LIPNAXCDG73ACD6/
> Code of Conduct: http://python.org/psf/codeofconduct/
>

I think that global binding makes no sense; it would silently break a lot
of code. Think about this:

def bar(baz):
    match baz:
        case bar: pass


IMHO, the most obvious solution is that the binding should be available only
inside the case block. If you need to change a global or a nonlocal, you do
that explicitly inside the case block, and in that case you can pick a binding
name that doesn't shadow the desired variable. This way the intention to
overwrite a global/nonlocal is clear in the code.
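
A sketch of what that explicit style could look like (the names here are
invented, and the syntax follows the PEP as proposed):

count = 0

def handle(command):
    global count              # rebinding the module global is stated explicitly
    match command:
        case "reset":
            count = 0
        case unknown:         # a capture pattern binds an ordinary local name
            count += 1
            print("unrecognised command:", unknown)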


-- 
“If you're going to try, go all the way. Otherwise, don't even start. ..."
  Charles Bukowski


Re: [Python-Dev] Sysadmin tasks

2014-10-01 Thread Daniel Holth
On Wed, Oct 1, 2014 at 1:59 PM, Mark Shannon  wrote:
> Hi,
>
> http://speed.python.org/
> could do with some love.

+1


Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-15 Thread Daniel Holth
Go ahead, make my pep.

I will appreciate seeing it happen.
On Feb 15, 2015 8:47 AM, "Nick Coghlan"  wrote:

> On 15 February 2015 at 23:21, Paul Moore  wrote:
> > On 15 February 2015 at 08:59, Nick Coghlan  wrote:
> >> The other option would be to cut PEP 441 way back to *just* be about
> >> standardising and registering the file associations, and recommending
> >> the use of pip to obtain the build machinery with (whether pyzaa,
> >> pyzzer or Twitter's more comprehensive pex). It would be a short PEP,
> >> but potentially still worth it for the improved visibility of the
> >> decision when folks are trying to figure out what "pyz" and "pyzw"
> >> files are later.
> >
> > Ok, thinking about this a little more.
> >
> > Getting the extension support is the key thing on Windows - at the
> > moment, people are faced with adding their own file associations or
> > putting binary data in a .py file, neither of which is a nice choice.
> > Tooling is important, though - sure, you can zip the data up and put a
> > header on, but it's fiddly.
> >
> > Which brings us full circle. A simple module, executable as "python -m
> > zipapp" (see below re name) which exports a single function, pack()
> > that creates the archive. If we want to provide a script to wrap the
> > module, like pyvenv.py does for venv, I've no objection to that -
> > presumably it would go in Tools/Scripts? If people (like me) want to
> > experiment with a more programmatic API for building pyz files, they
> > can do so on PyPI, and if such a thing becomes sufficiently mature we
> > can look then at proposing it for inclusion in the stdlib, as an
> > extension to the zipapp module.
> >
> > Regarding naming, I'm happy to go with zipapp if it's your preference.
> > Presumably the wrapper in Tools/Scripts would be pyzipapp.py?
>
> Or we just skip the wrapper and make "python -m zipapp" the
> recommended invocation style. Adding a wrapper later is fairly easy,
> but removing it would be difficult.
>
> >
> > So the usage would be something like
> >
> > python -m zipapp [options] dir_to_zip
> >
> > Options:
> > -p                  The interpreter to use on the shebang line
> >                     (defaulting to /usr/bin/env python)
> > -o archive_name     The name of the output file (defaulting to the
> >                     source directory name with a .pyz extension).
> >                     If the argument has no extension, add '.pyz'
> > -m module:function  The entry point to call (written to __main__.py).
> >                     Using this is an error if there is a
> >                     __main__.py, and mandatory if there isn't
> >
> > If you want anything more complex, it's easy enough to write your own
> > script based on zipfile, or use one of the modules on PyPI.
> >
> > Does this sound reasonable? If it's OK, I'll go ahead and prepare an
> > update to the PEP and an implementation. (Steve, looks like I may be
> > learning how to maintain the wix files after all - wish me luck :-))
> > If I hear no objections in the next couple of days, I'll assume
> > everyone's OK with it and I'll prepare a PEP update and a patch.
>
> Sounds good to me.
>
> Daniel, do you mind if Paul becomes a co-author on PEP 441 and updates
> it as described? It seems a bit tidier than allocating a new PEP
> number and rejecting PEP 441, when the revised proposal is essentially
> just a simplified and renamed version of your original idea.
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>
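
To make the pack() helper sketched above concrete, here is a rough, hypothetical
version built only on zipfile and os - not a shipped API, just the shape under
discussion:

import os
import stat
import zipfile

def pack(directory, target=None, interpreter="/usr/bin/env python"):
    # Hypothetical sketch; name and behaviour follow the proposal above.
    target = target or os.path.basename(os.path.abspath(directory)) + ".pyz"
    with open(target, "wb") as out:
        out.write(b"#!" + interpreter.encode("utf-8") + b"\n")   # shebang prefix
        with zipfile.ZipFile(out, "w") as zf:
            for root, dirs, files in os.walk(directory):
                for name in files:
                    path = os.path.join(root, name)
                    zf.write(path, os.path.relpath(path, directory))
    # Mark the result executable so the shebang line can take effect.
    os.chmod(target, os.stat(target).st_mode | stat.S_IEXEC)
    return target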


Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-17 Thread Daniel Holth
On Sun, Feb 15, 2015 at 1:45 PM, Serhiy Storchaka  wrote:
> On 15.02.15 18:21, Thomas Wouters wrote:
>>
>> which requires that extension modules are stored uncompressed (simple)
>> and page-aligned (harder, as the zipfile.ZipFile class doesn't directly
>> support page-aligning anything
>
>
> It is possible to add this feature to ZipFile. It can be useful because it
> will allow mmapping uncompressed files in a ZIP file.

One of the cool things about ZIP files is that they ignore any data
that is not referenced by the index (which is at the end of the file).
That is why we can put a #! at the beginning of the file, prefix .exe
self-extractors, or put extra data between files in the archive. You
could probably even overlap archive members if you were motivated. But
it would be best to let the uncompressed extensions be standard ZIP
members.
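
A quick illustration of that property with zipfile (a toy, nothing pyz-specific):

import io
import zipfile

buf = io.BytesIO()
buf.write(b"#!/usr/bin/env python\n")        # unreferenced data before the archive
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("__main__.py", "print('hello from a prefixed zip')\n")

# The index (central directory) at the end of the file still resolves members.
with zipfile.ZipFile(io.BytesIO(buf.getvalue())) as zf:
    print(zf.namelist())                     # ['__main__.py']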


Re: [Python-Dev] easy_install ?

2015-02-24 Thread Daniel Holth
On Tue, Feb 24, 2015 at 1:31 PM, Alexander Belopolsky
 wrote:
>
> On Tue, Feb 24, 2015 at 11:44 AM, Paul Moore  wrote:
>>
>> And if pip won't work, it would be good to
>> know why.
>
>
> Is there a recommended way to invoke pip from setup.py?  When I specify
> "tests_require=" and run "python setup.py test", the requirements get
> installed using setuptools' easy_install function.

The solution is to not do that. A substitute is to specify your test
requirements in a [test] extra and install them with pip or to run
tests with tox. This gives control of the installer back to the user
instead of the setup.py author.

https://pythonhosted.org/setuptools/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies
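
A minimal sketch of that layout (package and requirement names are placeholders):

# setup.py
from setuptools import setup

setup(
    name="example",
    version="1.0",
    packages=["example"],
    extras_require={
        # test dependencies live in an extra instead of tests_require=
        "test": ["pytest"],
    },
)

The user (or tox) then installs them explicitly, e.g. "pip install -e .[test]",
and runs the test runner directly.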


Re: [Python-Dev] easy_install ?

2015-02-24 Thread Daniel Holth
That might mostly do what you want, since tox could install any
additional test requirements based on its configuration.

On Tue, Feb 24, 2015 at 4:21 PM, Alexander Belopolsky
 wrote:
>
> On Tue, Feb 24, 2015 at 2:03 PM, Daniel Holth  wrote:
>>
>> > Is there a recommended way to invoke pip from setup.py?  When I specify
>> > "tests_require=" and run "python setup.py test", the requirements get
>> > installed using setuptools' easy_install function.
>>
>> The solution is to not do that. A substitute is to specify your test
>> requirements in a [test] extra and install them with pip or to run
>> tests with tox. This gives control of the installer back to the user
>> instead of the setup.py author.
>
>
> Isn't this a chicken and egg problem?  I currently have
>
> tests_require=['tox'],
>
> and this is exactly what tox recommends:
>
> https://testrun.org/tox/latest/example/basic.html#integration-with-setuptools-distribute-test-commands
>
>
> Note that my CI box is a CentOS 6.5 with Python 2.6.6, setuptools 0.6.  This
> is still a very common server configuration.  What is the recommended way to
> bootstrap tox in such an environment?
>


Re: [Python-Dev] easy_install ?

2015-02-24 Thread Daniel Holth
The other option might be to use http://bitbucket.org/dholth/setup-requires

It uses pip to install requirements into an isolated directory before
setup.py runs, doing exactly what you requested.
On Feb 24, 2015 5:44 PM, "Nick Coghlan"  wrote:

>
> On 25 Feb 2015 07:23, "Alexander Belopolsky" <
> alexander.belopol...@gmail.com> wrote:
> >
> >
> > On Tue, Feb 24, 2015 at 2:03 PM, Daniel Holth  wrote:
> > >
> > > > Is there a recommended way to invoke pip from setup.py?  When I
> specify
> > > > "tests_require=" and run "python setup.py test", the requirements get
> > > > installed using setuptools' easy_install function.
> > >
> > > The solution is to not do that. A substitute is to specify your test
> > > requirements in a [test] extra and install them with pip or to run
> > > tests with tox. This gives control of the installer back to the user
> > > instead of the setup.py author.
> >
> >
> > Isn't this a chicken and egg problem?  I currently have
> >
> > tests_require=['tox'],
> >
> > and this is exactly what tox recommends:
> >
> >
> https://testrun.org/tox/latest/example/basic.html#integration-with-setuptools-distribute-test-commands
> >
> >
> > Note that my CI box is a CentOS 6.5 with Python 2.6.6, setuptools 0.6.
> This is still a very common server configuration.  What is the recommended
> way to bootstrap tox in such an environment?
>
> If running in the system Python isn't absolutely essential, then the
> Python 2.7 collection from softwarecollections.org is the preferred way
> to get a newer Python 2 (including pip et al) on CentOS. You can also get
> access to Python 3 that way.
>
> Failing that, pip & virtualenv are also available from the EPEL 6 repos.
>
> Both of those approaches rely on the system package manager to do the
> bootstrapping of the Python specific tooling.
>
> If both softwarecollections.org and EPEL are considered unacceptable
> dependencies, then you're going to have to do your own bootstrapping for
> PyPI access on CentOS (which may include relying on easy_install to
> bootstrap pip and/or virtualenv)
>
> Regards,
> Nick.
>
> >
> >
> > ___
> > Python-Dev mailing list
> > Python-Dev@python.org
> > https://mail.python.org/mailman/listinfo/python-dev
> > Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/ncoghlan%40gmail.com
> >
>


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-21 Thread Daniel Holth
On Tue, Apr 21, 2015 at 1:10 PM, Guido van Rossum  wrote:
> On Tue, Apr 21, 2015 at 9:17 AM, R. David Murray 
> wrote:
>>
>> Please be respectful rather than inflammatory.  If you read what I
>> wrote, I did not say that I was going to stop contributing, I
>> specifically talked about that gut reaction being both emotional and
>> illogical.  That doesn't make the reaction any less real, and the fact
>> that such reactions exist is a data point you should consider in
>> conducting your PR campaign for this issue.  (I don't mean that last as
>> a negative:  this issue *requires* an honest PR campaign.)
>
>
> Well, my own reactions at this point in the flame war are also quite
> emotional. :-(
>
> I have done my best in being honest in my PR campaign. But I feel like the
> opposition (not you, but definitely some others -- have you seen Twitter?)
> are spreading FUD based on an irrational conviction that this will destroy
> Python. It will not. It may not prove the solution to all Python's problems
> -- there's always 3.6. (Oh wait, Python 2.7 is perfect. I've heard that
> before -- Paul Everitt famously said the same of Python 1.5.2. Aren't you
> glad I didn't take him literally? :-P )
>
> --
> --Guido van Rossum (python.org/~guido)

I am looking forward to using type annotations. I am not very good at
remembering the operations that are available on the objects passed
around in my code, but I am very good at typing CTRL-space. To get
that I am happy to modify my code with weird docstrings or any other
notation. Good support for completion, aided by standard annotations,
eliminates a huge amount of cross-referencing while coding.

I'm also hopeful that static type checking, aided by annotations,
will help with Unicode porting. Duck typing does not work very well
when you are trying to differentiate between bytes and str.
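
For example (the function names are invented), annotations like these give both
a completer and a checker enough information to flag a stray bytes/str mix-up:

def parse_header(raw: bytes) -> str:
    # a checker can reject callers that pass str here
    return raw.split(b":", 1)[1].strip().decode("ascii")

def format_header(name: str, value: str) -> bytes:
    return ("%s: %s" % (name, value)).encode("ascii")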

Also, Python 1.5.2 was pretty good :-)


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-23 Thread Daniel Holth
On Thu, Apr 23, 2015 at 5:59 AM, Paul Sokolovsky  wrote:
> Hello,
>
> On Thu, 23 Apr 2015 10:43:52 +0200
> Wolfgang Langner  wrote:
>
> []
>
>> Also ask why no one used type specifiers, even though they have been
>> possible since Python 3.0?
>> Because it is the wrong way for Python.
>
> That's an example of how perceptions differ. In my list, everyone(*)
> uses them - MyPy, MicroPython, etc. Even more should use them (any JIT
> module, which are many), but sit in the bushes, waiting for a kick, like
> PEP484 provides.

It's OK that type hints are only to assist the programmer. PyPy's FAQ
has an explanation of why type hints are not for performance.
http://pypy.readthedocs.org/en/latest/faq.html#would-type-annotations-help-pypy-s-performance


Re: [Python-Dev] Type hints -- a mediocre programmer's reaction

2015-04-23 Thread Daniel Holth
On Thu, Apr 23, 2015 at 9:55 AM, Paul Sokolovsky  wrote:
> Hello,
>
> On Thu, 23 Apr 2015 09:15:44 -0400
> Daniel Holth  wrote:
>
> []
>
>> >> Also ask why no one used type specifiers, even though they have been
>> >> possible since Python 3.0?
>> >> Because it is the wrong way for Python.
>> >
>> > That's an example of how perceptions differ. In my list, everyone(*)
>> > uses them - MyPy, MicroPython, etc. Even more should use them (any
>> > JIT module, which are many), but sit in the bushes, waiting for a
>> > kick, like PEP484 provides.
>>
>> It's OK that type hints are only to assist the programmer.
>
> Yes, it's OK to have a situation where type hints assist only a
> programmer. It's not OK to think that type hints may be useful only for
> programmer, instead of bunch more purposes, several of which
> were already shown in the long previous discussion.
>
>> PyPy's FAQ
>> has an explanation of why type hints are not for performance.
>> http://pypy.readthedocs.org/en/latest/faq.html#would-type-annotations-help-pypy-s-performance
>
> You probably intended to write "why type hints are not for *PyPy's*
> performance". There're many other language implementations and modules
> for which it may be useful, please don't limit your imagination by a
> single case.
>
> And speaking of PyPy, it really should think how to improve its
> performance - not of generated programs, but of generation itself. If
> compilation of a trivial program on pumpy hardware takes 5 minutes
> and gigabytes of RAM and disk space, few people will use it for other
> purposes beyond curiosity. There's something very un-Pythonic in
> waiting 5 minutes just to run a 10-line script. Type hints can help here
> too ;-) (by not wasting resources propagating types thru the same old
> standard library for example).

Naturally, PyPy is very controversial.

Type annotations can help to compile Python into a subset of Python.


[Python-Dev] PEP 0492 __aenter__ & __aexit__

2016-02-06 Thread Daniel Miller
Hi Python-Dev Group,

I am trying to implement __aenter__ and __aexit__ for the RethinkDB
<https://rethinkdb.com> Python driver. Looking at the PEP I don't see any
definitions as to what the expected parameters that __exit__ are supposed
to take and couldn't find any other similar implementations. Is there a
piece of documentation I should be looking at that I'm missing?

https://www.python.org/dev/peps/pep-0492/#asynchronous-context-managers-and-async-with

Many thanks,

Daniel


Re: [Python-Dev] PEP 0492 __aenter__ & __aexit__

2016-02-07 Thread Daniel Miller
Awesome, I missed that. Thank you Brett.

Am I understanding correctly that, if I'd like to avoid `async with await
EXPR`, whatever is returned from EXPR must implement `__await__` as a
non-coroutine method, and then I'd just be able to use `async with ...`?



2016-02-06 16:05 GMT-06:00 Brett Cannon :

>
>
> On Sat, 6 Feb 2016 at 13:50 Daniel Miller 
> wrote:
>
>> Hi Python-Dev Group,
>>
>> I am trying to implement __aenter__ and __aexit__ for the RethinkDB
>> <https://rethinkdb.com> Python driver. Looking at the PEP I don't see
>> any definition of the parameters that __aexit__ is supposed to take, and I
>> couldn't find any other similar implementations. Is
>> there a piece of documentation I should be looking at that I'm missing?
>>
>>
>> https://www.python.org/dev/peps/pep-0492/#asynchronous-context-managers-and-async-with
>>
>
> The arguments to __aexit__ are the same as __exit__ in a normal context
> manager. See
> https://docs.python.org/3.5/reference/datamodel.html#object.__aexit__ for
> the official docs for __aexit__.
>
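
So a minimal asynchronous context manager looks like this (the class name and
method bodies are placeholders, not the actual RethinkDB driver):

import asyncio

class Connection:
    async def __aenter__(self):
        # e.g. await self._open() in a real driver
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb):
        # same three arguments as __exit__; return False to propagate exceptions
        return False

async def main():
    async with Connection() as conn:
        print("inside", conn)

asyncio.get_event_loop().run_until_complete(main())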


[Python-Dev] Requesting review for the patch in issue26271

2016-03-05 Thread Daniel Shaulov
Hi,

issue26271 has a patch attached that fixes it. Can someone please review
it? It is a very small and straightforward patch.

(pinging here as suggested in the devguide -
https://docs.python.org/devguide/patch.html#reviewing)


[Python-Dev] Adding a threadlocal to the Python interpreter

2016-05-18 Thread Daniel Holth
I would like to take another stab at adding a threadlocal "str(bytes)
raises an exception" to the Python interpreter, but I had a very hard time
understanding both how to add a threadlocal value to either the interpreter
state or the threadlocal dict that is part of that state, and then how to
access the same value from both Python and CPython code. The structs were
there but it was just hard to understand. Can someone explain it to me?

Thanks,

Daniel Holth


Re: [Python-Dev] Adding a threadlocal to the Python interpreter

2016-05-26 Thread Daniel Holth
Thank you. I started my patch http://bugs.python.org/issue27134 to allow
Python code to set a flag causing the evil str(b'bytes') to raise an
exception. I wasn't sure exactly which module to put it in, so it's in
_string. Please let me know the best place to put the feature and what I
should polish to get it in.

Thanks,

Daniel Holth

On Thu, May 19, 2016 at 6:34 AM Christian Heimes 
wrote:

> On 2016-05-19 04:30, Nick Coghlan wrote:
> > On 18 May 2016 at 23:20, Daniel Holth  wrote:
> >> I would like to take another stab at adding a threadlocal "str(bytes)
> raises
> >> an exception" to the Python interpreter, but I had a very hard time
> >> understanding both how to add a threadlocal value to either the
> interpreter
> >> state or the threadlocal dict that is part of that state, and then how
> to
> >> access the same value from both Python and CPython code. The structs
> were
> >> there but it was just hard to understand. Can someone explain it to me?
> >
> > Christian covered the C aspects of the API, while the general purpose
> > Python aspects live in the threading module.
> >
> > However, the Python level thread-local API doesn't provide direct
> > access to the thread state dict. Instead, it provides access to
> > subdicts stored under per-object keys in that dict, keyed as
> > "thread.local.":
>
> In case you wonder about the subdicts: they are required to support multiple
> thread-local objects. Each thread-local instance has its own key in each
> thread state dict. That approach gives every thread-local object independent
> per-thread storage.
>
> Christian
>
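
At the Python level that boils down to each threading.local() instance getting
its own independent per-thread storage, for example:

import threading

ctx = threading.local()

def worker(name):
    ctx.user = name                      # stored only for the current thread
    print(threading.current_thread().name, "sees", ctx.user)

threads = [threading.Thread(target=worker, args=(n,)) for n in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()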


Re: [Python-Dev] New hash algorithms: SHA3, SHAKE, BLAKE2, truncated SHA512

2016-05-27 Thread Daniel Holth
OpenSSL sucks. Python would only have to bundle a reference implementation
of the new hash algorithm(s), and unlike TLS suites they tend to just work.

BLAKE2 is important, since it removes the last objection to replacing MD5 -
speed - that has made it hard for cryptography fans to convince MD5 users
to upgrade.
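
For reference, usage stays as simple as the existing hashlib constructors
(blake2b/blake2s and the SHA-3/SHAKE family did end up in hashlib in Python 3.6):

import hashlib

print(hashlib.blake2b(b"payload", digest_size=32).hexdigest())
print(hashlib.sha3_256(b"payload").hexdigest())
print(hashlib.shake_128(b"payload").hexdigest(16))   # SHAKE takes an output length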

On Fri, May 27, 2016 at 7:13 AM M.-A. Lemburg  wrote:

> On 27.05.2016 13:03, Donald Stufft wrote:
> >
> >> On May 27, 2016, at 6:54 AM, M.-A. Lemburg  wrote:
> >>
> >> IMO, relying on OpenSSL is a better strategy than providing
> >> (and maintaining) our own compatibility versions. Until OpenSSL
> >> has them, people can use Björn's package:
> >
> > Even now, hashlib doesn’t rely on OpenSSL if I recall, I mean it will
> > use it if OpenSSL is available but otherwise it has internal
> implementations
> > too.
>
> I know, but still don't think that's a good idea. It makes
> sense in case you don't want to carry around OpenSSL all the
> time, but how often does that happen nowadays ?
>
> BTW: If I recall correctly, those hash implementations predate
> the deeper support for OpenSSL we now have in Python.
>
> --
> Marc-Andre Lemburg
> eGenix.com
>
> Professional Python Services directly from the Experts (#1, May 27 2016)
> >>> Python Projects, Coaching and Consulting ...  http://www.egenix.com/
> >>> Python Database Interfaces ...   http://products.egenix.com/
> >>> Plone/Zope Database Interfaces ...   http://zope.egenix.com/
> 
>
> ::: We implement business ideas - efficiently in both time and costs :::
>
>eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
> D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
>Registered at Amtsgericht Duesseldorf: HRB 46611
>http://www.egenix.com/company/contact/
>   http://www.malemburg.com/
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
>


Re: [Python-Dev] PEP 0404 and VS 2010

2013-11-21 Thread Daniel Holth
On Thu, Nov 21, 2013 at 4:12 PM, Paul Moore  wrote:
> On 21 November 2013 21:02, Greg Ewing  wrote:
>> Is that much different from package authors having to
>> release binaries for different versions of Python,
>> if they want to support older versions?
>>
>> Having multiple binaries for the same x.y version
>> is different from what's been done before, but it
>> seems to me an unavoidable consequence of supporting
>> one x.y version for longer than usual.
>
> None of the currently available binary distribution formats
> distinguish Windows binaries by anything other than minor version. For
> wheels (and I think eggs), this is a showstopper as the name is
> essential metadata (compatibility tags) for the other formats (wininst
> and msi) the name is merely informational - packagers could rename,
> but (a) they will forget, and (b) the users won't know if they have or
> not.
>
> Before we can cleanly support multiple ABIs for a single minor version
> on Windows, we need to have a resolution of this dilemma (which may be
> nothing more than "only binaries for the python.org builds are allowed
> on PyPI"...)
>
> Paul

As for wheel, it actually does support an ABI tag that is separate from
the Python version and the architecture: it's the second component of the
pyversion-abi-platform triple, as in py27-none-any or cp27-cp27m-linux_i686.
The build tool and installer would have to be modified to be aware of any
newly defined ABI tags.


Re: [Python-Dev] flaky tests caused by repr() sort order

2013-11-21 Thread Daniel Holth
+1 on unsorted repr(). It makes it obvious that the collection is not sorted.

On Thu, Nov 21, 2013 at 1:10 PM, Christian Heimes  wrote:
> On 21.11.2013 18:57, Tim Peters wrote:
>> Best to change the failing tests.  For example, _they_ can sort the
>> dict keys if they rely on a fixed order.  Sorting in general is a
>> dubious idea because it can be a major expense with no real benefit
>> for most uses.
>
> I don't consider repr() as a performance critical function. It's mostly
> used for debugging.
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
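
A sketch of the kind of test fix Tim suggests, comparing data rather than repr()
output so the test no longer depends on iteration order:

import unittest

class TestConfig(unittest.TestCase):
    def test_keys(self):
        d = {'b': 2, 'a': 1}
        # compare sorted keys and the dict itself, not repr(d)
        self.assertEqual(sorted(d), ['a', 'b'])
        self.assertEqual(d, {'a': 1, 'b': 2})

if __name__ == "__main__":
    unittest.main()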


[Python-Dev] thread issues when embedding Python

2013-12-17 Thread Daniel Pocock


I've successfully embedded Python for a single thread

I tried to extend the implementation for multiple threads (a worker
thread scenario) and I'm encountering either deadlocks or seg faults
depending upon how I go about it.

There seems to be some inconsistency between what is covered in the docs
here:

http://docs.python.org/2/c-api/init.html#non-python-created-threads

and the experiences of other users trying the same thing, e.g.

http://bugs.python.org/issue19576
http://wiki.blender.org/index.php/Dev:2.4/Source/Python/API/Threads

Can anybody comment on the situation, in particular,

Is the non-python-created-threads documentation accurate for v2.7?

If a main thread does things like importing a module and obtaining a
reference to a Python method, can those things be used by other C++
threads or do they have to repeat those lookups?

Is there any logic that needs to be executed once only as each thread is
started? (the doc suggests just PyGILState_Ensure/PyGILState_Release
each time a thread accesses Python methods - is there anything else?)

Given the bug 19576, what is the most portable way to code this to work
reliably on unfixed Python versions?  (e.g. should users always
explicitly call PyEval_InitThreads() in their main thread or worker
threads or both?)




Here is my actual source code:

https://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/

  (see example.py for a trivial example of what it does)

The problem that I encounter:

- the init stuff runs fine in PyRoutePlugin.cxx,
it calls Py_Initialize, PyEval_InitThreads, PyImport_ImportModule
  and looks up the "provide_route" method in the module
it creates a PyRouteWorker object,
  giving it a reference to "provide_route"
it creates a thread pool to run the worker

- the PyRouteWorker::process() method is invoked in one of those threads

- it crashes when trying to call the "provide_route" method
PyRouteWorker.cxx:
  routes = mAction.apply(args);


Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x730b8700 (LWP 23965)]
0x73d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
(gdb) bt
#0  0x73d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
#1  0x73d6b647 in PyEval_CallObjectWithKeywords ()
   from /usr/lib/libpython2.7.so.1.0
#2  0x7414885a in apply (args=..., this=)
at /usr/include/python2.7/CXX/Python2/Objects.hxx:3215
#3  repro::PyRouteWorker::process (this=0x6f00a0, msg=)
at PyRouteWorker.cxx:98
#4  0x77b879e1 in repro::WorkerThread::thread (this=0x68e110)
at WorkerThread.cxx:36
#5  0x770b7a2f in threadIfThreadWrapper (threadParm=)
at ThreadIf.cxx:51
#6  0x765ffb50 in start_thread (arg=)
at pthread_create.c:304
#7  0x75999a7d in clone ()
at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8  0x in ?? ()


Re: [Python-Dev] thread issues when embedding Python

2013-12-18 Thread Daniel Pocock

Another link that fills in some gaps and finally helped me make this work:

http://www.codevate.com/blog/7-concurrency-with-embedded-python-in-a-multi-threaded-c-application

In particular, I found that PyGILState_Ensure/PyGILState_Release as
described in the Python docs is not sufficient - as described in that
blog link, I had to

a) obtain PyInterpreterState from the first thread where Py_Initialize()
was called

b) when each worker thread starts, call
PyThreadState_New(mInterpreterState)  and save the result in a thread
local mPyThreadState

c) use the mPyThreadState with PyEval_RestoreThread and
PyEval_SaveThread before and after calling Python methods

Is this a bug in PyGILState_Ensure or is it a deficiency in the
documentation?

I also found one bug in my own code, although that was not related to
the problem just described with PyGILState_Ensure and I had to fix both
problems to make it work.  Specifically, the PyWorkerThread constructor
was taking an object argument when it should have taken a reference
argument and this was creating an invalid Py::Callable member in my worker.

On 18/12/13 00:19, Daniel Pocock wrote:
>
> I've successfully embedded Python for a single thread
>
> I tried to extend the implementation for multiple threads (a worker
> thread scenario) and I'm encountering either deadlocks or seg faults
> depending upon how I go about it.
>
> There seems to be some inconsistency between what is covered in the docs
> here:
>
> http://docs.python.org/2/c-api/init.html#non-python-created-threads
>
> and the experiences of other users trying the same thing, e.g.
>
> http://bugs.python.org/issue19576
> http://wiki.blender.org/index.php/Dev:2.4/Source/Python/API/Threads
>
> Can anybody comment on the situation, in particular,
>
> Is the non-python-created-threads documentation accurate for v2.7?
>
> If a main thread does things like importing a module and obtaining a
> reference to a Python method, can those things be used by other C++
> threads or do they have to repeat those lookups?
>
> Is there any logic that needs to be executed once only as each thread is
> started? (the doc suggests just PyGILState_Ensure/PyGILState_Release
> each time a thread accesses Python methods - is there anything else?)
>
> Given the bug 19576, what is the most portable way to code this to work
> reliably on unfixed Python versions?  (e.g. should users always
> explicitly call PyEval_InitThreads() in their main thread or worker
> threads or both?)
>
>
>
>
> Here is my actual source code:
>
> https://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/
>
>   (see example.py for a trivial example of what it does)
>
> The problem that I encounter:
>
> - the init stuff runs fine in PyRoutePlugin.cxx,
> it calls Py_Initialize, PyEval_InitThreads, PyImport_ImportModule
>   and looks up the "provide_route" method in the module
> it creates a PyRouteWorker object,
>   giving it a reference to "provide_route"
> it creates a thread pool to run the worker
>
> - the PyRouteWorker::process() method is invoked in one of those threads
>
> - it crashes when trying to call the "provide_route" method
> PyRouteWorker.cxx:
>   routes = mAction.apply(args);
>
>
> Program received signal SIGSEGV, Segmentation fault.
> [Switching to Thread 0x730b8700 (LWP 23965)]
> 0x73d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
> (gdb) bt
> #0  0x73d6ad06 in PyObject_Call () from /usr/lib/libpython2.7.so.1.0
> #1  0x73d6b647 in PyEval_CallObjectWithKeywords ()
>from /usr/lib/libpython2.7.so.1.0
> #2  0x7414885a in apply (args=..., this=)
> at /usr/include/python2.7/CXX/Python2/Objects.hxx:3215
> #3  repro::PyRouteWorker::process (this=0x6f00a0, msg=)
> at PyRouteWorker.cxx:98
> #4  0x77b879e1 in repro::WorkerThread::thread (this=0x68e110)
> at WorkerThread.cxx:36
> #5  0x770b7a2f in threadIfThreadWrapper (threadParm=)
> at ThreadIf.cxx:51
> #6  0x765ffb50 in start_thread (arg=)
> at pthread_create.c:304
> #7  0x75999a7d in clone ()
> at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #8  0x in ?? ()
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/daniel%40pocock.com.au



Re: [Python-Dev] thread issues when embedding Python

2013-12-18 Thread Daniel Pocock
On 18/12/13 16:02, Chris Angelico wrote:
> On Wed, Dec 18, 2013 at 9:26 PM, Daniel Pocock  wrote:
>> b) when each worker thread starts, call
>> PyThreadState_New(mInterpreterState)  and save the result in a thread
>> local mPyThreadState
>>
>> c) use the mPyThreadState with PyEval_RestoreThread and
>> PyEval_SaveThread before and after calling Python methods
>>
>> Is this a bug in PyGILState_Ensure or is it a deficiency in the
>> documentation?
> It doesn't surprise me that you would need to do step b - I do seem to
> recall the need to call that for each new thread. Not so sure about c.
> Once you fixed the unrelated bug, do you still need that step? (Been a
> while since I last embedded Python though, and I might well be wrong.)

Yes, I definitely needed to use this PyThreadState_New call even after
my unrelated bug fix

Should it be added to the documentation?

I created a C++ wrapper around this logic, it is here:

https://github.com/resiprocate/resiprocate/blob/master/repro/plugins/pyroute/PyThreadSupport.hxx

and the use case is something like:

    // in constructor:
    PyExternalUser* mPyUser = new PyExternalUser(mInterpreterState);

and each time Python calls are made, just do:

    {
        PyExternalUser::Use use(*mPyUser);
        // now call Python code
    }

When the PyExternalUser::Use instance is created it does PyEval_RestoreThread()

When the PyExternalUser::Use instance goes out of scope it is destroyed and 
PyEval_SaveThread() is called




Re: [Python-Dev] Backward-incompatible change to random.randrange in 2.7.6

2013-12-18 Thread Daniel Holth
But who could forget nzjrs' wasp UAV software line 107, using
int=float?
https://github.com/nzjrs/wasp/blob/master/sw/groundstation/wasp/__init__.py#L107

On Wed, Dec 18, 2013 at 10:49 AM, Guido van Rossum  wrote:
> BTW, I bet a lavish dinner at PyCon that it is *only* Zope/ZODB that
> does this. In the time we added this bogus dependency on undocumented
> parameters, the PythonLabs team was at Zope and we didn't always get
> our boundaries straight.
>
> On Wed, Dec 18, 2013 at 2:51 AM, Nick Coghlan  wrote:
>> On 18 December 2013 20:17, Chris Angelico  wrote:
>>> On Wed, Dec 18, 2013 at 8:43 PM, Serhiy Storchaka  
>>> wrote:
 On 18.12.13 04:40, Benjamin Peterson wrote:

> Mostly yes, but at least you could tell people to upgrade straight to
> 2.7.7 and skip 2.7.6.


 It'll make people postpone the upgrade to 2.7.6 (which fixes many
 security bugs) until the 2.7.7 release, instead of correcting their
 morally-broken programs.
>>>
>>> If this is considered enough breakage to be a problem, would it be
>>> possible to issue a 2.7.6.1 or 2.7.6+fixed release that's identical to
>>> 2.7.6 but with this change reverted? It'd be a minor mess, but then
>>> people would still get those security fixes, and it means not breaking
>>> stuff in a point release.
>>
>> If we revert it, it means we do 2.7.7 ASAP so that *everyone* can just
>> skip straight to 2.7.7.
>>
>> That kind of user visible change shouldn't have been made in a point
>> release, regardless of what the docs said. It just isn't worth the
>> risk of breaking the code of people that are relying on what's
>> possible rather than what the docs say.
>>
>> Cheers,
>> Nick.
>>
>>
>> --
>> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe: 
>> https://mail.python.org/mailman/options/python-dev/guido%40python.org
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com


Re: [Python-Dev] Backward-incompatible change to random.randrange in 2.7.6

2013-12-18 Thread Daniel Holth
On Dec 18, 2013 11:54 AM, "Tim Peters"  wrote:

> [Daniel Holth]
> > But who could forget njzrs' wasp UAV software line 107, using
> > int=float?
> https://github.com/nzjrs/wasp/blob/master/sw/groundstation/wasp/__init__.py#L107
>
> I could forget it ;-)  The remarkable thing about the two instances of:
>
>  random.randrange(0.0,1.0, int=float)
>
> in that file is that they're obscure and inefficient ways to spell:
>
>  random.random()
>

You can keep your "premature optimizations" thank you :-)


Re: [Python-Dev] thread issues when embedding Python

2013-12-18 Thread Daniel Pocock


On 18/12/13 16:29, Victor Stinner wrote:
> 2013/12/18 Antoine Pitrou :
>> You only need to call PyEval_InitThreads() once in the main Python
>> thread.
> 
> This is not well documented. For your information, PyGILState_Ensure()
> now calls PyEval_InitThreads() in Python 3.4, see:
> http://bugs.python.org/issue19576


I did see that - but from my own experience, I do not believe it is
calling PyThreadState_New(..) and it is not even checking if
PyThreadState_New(..) has ever been called for the active thread

Consequently, the thread is blocked or there is a seg fault

I've now written up a much more thorough overview of my experience on my
blog:

http://danielpocock.com/embedding-python-multi-threaded-cpp


Re: [Python-Dev] thread issues when embedding Python

2013-12-19 Thread Daniel Pocock
On 19/12/13 12:22, Nick Coghlan wrote:
> On 19 December 2013 07:58, Daniel Pocock  wrote:
>>
>> On 18/12/13 16:29, Victor Stinner wrote:
>>> 2013/12/18 Antoine Pitrou :
>>>> You only need to call PyEval_InitThreads() once in the main Python
>>>> thread.
>>> This is not well documented. For your information, PyGILState_Ensure()
>>> now calls PyEval_InitThreads() in Python 3.4, see:
>>> http://bugs.python.org/issue19576
>>
>> I did see that - but from my own experience, I do not believe it is
>> calling PyThreadState_New(..) and it is not even checking if
>> PyThreadState_New(..) has ever been called for the active thread
>>
>> Consequently, the thread is blocked or there is a seg fault
>>
>> I've now written up a much more thorough overview of my experience on my
>> blog:
>>
>> http://danielpocock.com/embedding-python-multi-threaded-cpp
> You absolutely should *NOT* have to call PyThreadState_New before
> calling PyGILState_Ensure, as it is designed to call it for you (see
> http://hg.python.org/cpython/file/2.7/Python/pystate.c#l598). If
> calling PyThreadState_New works, but calling PyGILState_Ensure does
> not, then something else is broken (such as not initialising the
> thread local storage for the GIL state APIs).
>
> I don't see anything in your article about how you ensure that the
> main thread of the application *before anything else related to the
> embedded Python happens* calls both Py_Initialize() and
> PyEval_InitThreads(). If you don't do that, then all bets are off in
> terms of multithreading support.

I definitely do both of those things in the method PyRoutePlugin::init(..)

It is in PyRoutePlugin.cxx:

http://svn.resiprocate.org/viewsvn/resiprocate/main/repro/plugins/pyroute/PyRoutePlugin.cxx?view=markup#l88




Re: [Python-Dev] Interested in GSoC 2014

2013-12-23 Thread Daniel Pocock
On 24/12/13 07:41, Prasad Joshi wrote:
> Hello All,
>
> I am interested in participating in GSoC 2014. I have gone through last
> year's eligibility criteria, and I think I am qualified to participate. I
> know GSoC 2014 is still more than 2-3 months away; however, I would like to
> start early. Please let me know your suggestions.


Hi Prasad,

We recently introduced a Python scripting API into reSIProcate

You would be welcome to contribute some code to exercise the API.

Here are some details:

   http://www.resiprocate.org/Python

It would be particularly interesting to try and integrate the new WebRTC
/ WebSocket cookie support with some existing Python-based web
framework, e.g. Django.  This would allow a user with a Django login,
shopping cart or some other long running session to pass their state
through the cookies.

Regards,

Daniel




[Python-Dev] Thanks for python!

2014-01-02 Thread Daniel Benamy
Hey all,

I just wanted to say a quick thanks for python. The language, libs, docs,
and really the whole ecosystem are so well done, and I really appreciate
all your amazing work.

Best,
Dan


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-07 Thread Daniel Holth
+1

I have always been delighted that it is possible to manipulate binary
data in Python using string operations. It's not just immoral
non-Unicode text processing. A poor man's ASN.1 generator is an
example of a very non-text thing that might be convenient to write
with a few %s fill-in-the-blanks.
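
For instance, a toy tag-length-value blob written fill-in-the-blanks style
(illustrative only; bytes %-formatting itself eventually landed in Python 3.5):

payload = b"hello"
record = b"\x04%c%s" % (len(payload), payload)   # tag, length, value
assert record == b"\x04\x05hello"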

Isn't it true that if you have bytes > 127 or surrogate escapes then
encoding to latin1 is no longer as fast as memcpy?

On Tue, Jan 7, 2014 at 8:22 AM, Serhiy Storchaka  wrote:
> Most popular formatting codes in Mercurial sources:
>
>2519 %s
> 493 %d
> 102 %r
>  48 %Y
>  47 %M
>  41 %H
>  39 %S
>  38 %m
>  33 %i
>  29 %b
>  23 %ld
>  19 %ln
>  12 %.3f
>  10 %a
>  10 %.1f
>   9 %(val)r
>   9 %p
>   9 %.2f
>   8 %I
>   6 %n
>   5 %(val)s
>   5 %.0f
>   5 %02x
>   4 %f
>   4 %c
>   4 %12s
>   3 %(user)s
>   3 %(id)s
>   3 %h
>   3 %(bzdir)s
>   3 %0.2f
>   3 %02d
>
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-08 Thread Daniel Holth
On Tue, Jan 7, 2014 at 10:36 AM, Stephen J. Turnbull  wrote:
> Daniel Holth writes:
>
>  > Isn't it true that if you have bytes > 127 or surrogate escapes then
>  > encoding to latin1 is no longer as fast as memcpy?
>
> Be careful.  As phrased, the question makes no sense.  You don't "have
> bytes" when you are encoding, you have characters.
>
> If you mean "what happens when my str contains characters in the range
> 128-255?", the answer is encoding a str in 8-bit representation to
> latin1 is effectively memcpy.  If you read in latin1, it's memcpy all
> the way (unless you combine it with a non-latin1 string, in which case
> you're in the cases below).
>
> If you mean "what happens when my str contains characters in the range
>> 255", you have to truncate 16-bit units to 8 bit units; no memcpy.
>
> Surrogates require >= 16 bits; no memcpy.

That is neat.


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-08 Thread Daniel Holth
On Wed, Jan 8, 2014 at 2:17 PM, Stefan Behnel  wrote:
> Victor Stinner, 06.01.2014 14:24:
>> Abstract
>> 
>> Add ``bytes % args`` operator and ``bytes.format(args)`` method to
>> Python 3.5.
>
> Here is a counterproposal. Let someone who needs this feature write a
> library that does byte string formatting. That properly handles it, a full
> featured tool set. Write it in Cython if you need raw speed, that will also
> help in making it run in both Python 2 and Python 3, or in providing easy
> integration with buffers like the array module, various byte containers,
> NumPy, etc.

> I'm confident that this will show that the current Py2 code that
> (legitimately) does byte string formatting can actually be improved,
> simplified or sped up, at least in some corners. I'm sure Py2 byte string
> formatting wasn't perfect for this use case either, it just happened to be
> there, so everyone used it and worked around its particular quirks for the
> particular use case at hand. (Think of "%s" % some_unicode_value, for 
> example.)
>
> Instead of waiting for 3.5, a third party library allows users to get
> started porting their code earlier, and to make it work unchanged with
> Python versions before 3.5.

Maybe we can enumerate some of the stated drawbacks of b''.format()

Convenient string processing tools for bytes will make people ignore
Unicode or fail to notice it or do it wrong? (As opposed to the
alternative causing them to learn how to process and produce Unicode
correctly?)

Similar APIs on bytes and str will prevent implicit "assert
isinstance(x, str)" checks?

More-prevalent bytes will propagate across the program causing bugs?
A-la open(b'filename').name vs open('filename').name ?

It will take a long time.

Hopeful benefits may include easier porting and greater Py3 adoption,
fewer encoding dances and/or decoding non-Unicode into Unicode just to
make things work, hopefully fewer surrogate-encoded bytes and
therefore fewer encoding-bugs-distant-from-source-of-invalid-text, ...


Re: [Python-Dev] Python3 "complexity" (was RFC: PEP 460: Add bytes...)

2014-01-09 Thread Daniel Holth
So the customer you're looking for is the person who cares a lot about
encodings, knows how to do Unicode correctly, and has noticed that
certain valid cases not limited to imperialist simpletons (dealing
with specific common things invented before 1996, dealing with mixed
encodings, doing what Nick describes as "ASCII compatible binary
protocols") are *more complicated to do correctly* in Python 3 because
Python 3 undeniably has more complicated though probably better
*Unicode* support. N.b. WSGI, email, url parsing etc. The same person
loves Python, all the other Python 3 features, and probably you
personally, but mostly does not write programs in the domains that
Python 3 makes easier. They emphatically do not want the Python 2
model especially not implicit coercion. They only want additional
tools for text or string processing in Python 3.


Re: [Python-Dev] RFC: PEP 460: Add bytes % args and bytes.format(args) to Python 3.5

2014-01-11 Thread Daniel Holth
On Sat, Jan 11, 2014 at 4:28 PM, Terry Reedy  wrote:
> On 1/11/2014 1:44 PM, Stephen J. Turnbull wrote:
>
>> We already *have* a type in Python 3.3 that provides text
>> manipulations on arrays of 8-bit objects: str (per PEP 393).
>>
>>   > BTW: I don't know why so many people keep asking for use cases.
>>   > Isn't it obvious that text data without known (but ASCII compatible)
>>   > encoding or multiple different encodings in a single data chunk
>>   > is part of life ?
>>
>> Isn't it equally obvious that if you create or read all such ASCII-
>> compatible chunks as (encoding='ascii', errors='surrogateescape') that
>> you *don't need* string APIs for bytes?
>>
>> Why do these "text chunks" need to be bytes in the first place?
>> That's why we ask for use cases.  AFAICS, reading and writing ASCII-
>> compatible text data as 'latin1' is just as fast as bytes I/O.  So
>> it's not I/O efficiency, and (since in this model we don't do any
>> en/decoding on bytes/str), it's not redundant en/decoding of bytes to
>> str and back.
>
>
> The problem with some criticisms of using 'unicode in Python 3' is that
> there really is no such thing. Unicode in 3.0 to 3.2 used the old internal
> model inherited from 2.x. Unicode in 3.3+ uses a different internal model
> that is a game changer with respect to certain issues of space and time
> efficiency (and cross-platform correctness and portability). So at least
> some the valid criticisms based on the old model are out of date and no
> longer valid.

-1 on adding more surrogateescapes by default. It's a pain to track
down where the encoding errors came from.
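
For anyone who hasn't been bitten by it yet, the pain is that the failure
shows up far away from the read that caused it (ordinary Python 3 behaviour,
nothing hypothetical):

    >>> raw = b'caf\xe9'                             # latin-1 bytes, not valid UTF-8
    >>> s = raw.decode('utf-8', 'surrogateescape')   # "succeeds"
    >>> s
    'caf\udce9'
    >>> s.encode('utf-8')                            # blows up later, somewhere else
    Traceback (most recent call last):
      ...
    UnicodeEncodeError: 'utf-8' codec can't encode character '\udce9' in position 3: surrogates not allowed

By the time the encode fails you may have no idea which read produced the
lone surrogate.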


Re: [Python-Dev] PEP 460 reboot

2014-01-12 Thread Daniel Holth
On Sun, Jan 12, 2014 at 8:27 PM, Ethan Furman  wrote:
> On 01/12/2014 04:47 PM, Guido van Rossum wrote:
>>
>>
>> %s seems the trickiest: I think with a bytes argument it should just
>> insert those bytes (and the padding modifiers should work too), and
>> for other types it should probably work like %a, so that it works as
>> expected for numeric values, and with a string argument it will return
>> the ascii()-variant of its repr(). Examples:
>>
>> b'%s' % 42 == b'42'
>> b'%s' % 'x' == b"'x'" (i.e. the three-byte string containing an 'x'
>> enclosed in single quotes)
>
>
> I'm not sure about the quotes.  Would anyone ever actually want those in the
> byte stream?
>
> --
> ~Ethan~

Is there a formatting character that means "anything except a unicode
string" to prevent accidentally interpolating a Unicode string into a
bytes string without [a sane] encoding?


Re: [Python-Dev] PEP 460 reboot

2014-01-12 Thread Daniel Holth
On Sun, Jan 12, 2014 at 9:18 PM, Guido van Rossum  wrote:
> On Sun, Jan 12, 2014 at 6:07 PM, Daniel Holth  wrote:
>> Is there a formatting character that means "anything except a unicode
>> string" to prevent accidentally interpolating a Unicode string into a
>> bytes string without [a sane] encoding?
>
> No, and we shouldn't introduce one. An operation should either work
> for no type, one type, a few specific types, or all types. Something
> that works for all but one type will *appear* to work for all types to
> a casually experimenting user and may pass extensive unittests,
> leaving a bomb that can detonate when you least expect it.

That pretty much describes how I feel about str(bytes). I would accept
"only a bytes" or "only a string" as consolation formatting characters
:-)
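
To be explicit about what I mean by a consolation character, something with
the effect of this little helper would do; the name and the latin-1
round-trip are mine, it is only a sketch of "bytes only, or blow up":

    def fmt_bytes(template, *args):
        """Hypothetical strict interpolation: bytes (or numbers) in, bytes out, never str."""
        prepared = []
        for arg in args:
            if isinstance(arg, str):
                raise TypeError('str not allowed here; encode it explicitly')
            if isinstance(arg, (bytes, bytearray)):
                arg = bytes(arg).decode('latin-1')   # latin-1 round-trips all 256 byte values
            prepared.append(arg)
        return (template.decode('latin-1') % tuple(prepared)).encode('latin-1')

    >>> fmt_bytes(b'Content-Length: %d\r\n', 1024)
    b'Content-Length: 1024\r\n'
    >>> fmt_bytes(b'%s', 'oops')                     # loud failure instead of b"'oops'"
    Traceback (most recent call last):
      ...
    TypeError: str not allowed here; encode it explicitly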


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Daniel Holth
On Mon, Jan 13, 2014 at 12:42 PM, R. David Murray  wrote:
> On Mon, 13 Jan 2014 12:41:18 +0100, Antoine Pitrou  
> wrote:
>> On Sun, 12 Jan 2014 18:11:47 -0800
>> Guido van Rossum  wrote:
>> > On Sun, Jan 12, 2014 at 5:27 PM, Ethan Furman  wrote:
>> > > On 01/12/2014 04:47 PM, Guido van Rossum wrote:
>> > >> %s seems the trickiest: I think with a bytes argument it should just
>> > >> insert those bytes (and the padding modifiers should work too), and
>> > >> for other types it should probably work like %a, so that it works as
>> > >> expected for numeric values, and with a string argument it will return
>> > >> the ascii()-variant of its repr(). Examples:
>> > >>
>> > >> b'%s' % 42 == b'42'
>> > >> b'%s' % 'x' == b"'x'" (i.e. the three-byte string containing an 'x'
>> > >> enclosed in single quotes)
>> > >
>> > > I'm not sure about the quotes.  Would anyone ever actually want those in 
>> > > the
>> > > byte stream?
>> >
>> > Perhaps not, but it's a hint that you should probably think about an
>> > encoding. It's symmetric with how '%s' % b'x' returns "b'x'". Think of
>> > it as payback time. :-)
>>
>> What is the use case for embedding a quoted ASCII-encoded representation
>> in a byte stream?
>
> There is no use case in the sense you are asking, just like there is no
> real use case for '%s' % b'x' producing "b'x'".  But the real use case
> is exactly the same: to let you know your code is screwed up without
> actually blowing up with a encoding Exception.
>
> For the record, I like Guido's logic and proposal.  I don't understand
> Nick's objection, since I don't see the difference between the situation
> here where a string gets interpolated into bytes as 'xxx' and the
> corresponding situation where bytes gets interpolated into a string
> as b'xxx'.  Why struggle to keep bytes interpolation "pure" if string
> interpolation isn't?
>
> Guido's proposal makes the language more symmetric, and thus more
> consistent and less surprising.  Exactly the hallmarks of Python's design
> sense, IMO.  (Big surprise, right? :)
>
> Of course, this point of view *is* based on the idea that when you are
> doing interpolation using %/.format, you are in fact primarily concerned
> with ASCII compatible byte streams.  This is a Practicality sort of
> argument.  It is, after all, by far the most common use case when
> doing interpolation[*].
>
> If you wanted to do a purist version of this symmetry, you'd have bytes(x)
> calling __bytes__ if it was defined and falling back to calling a
> __brepr__ otherwise.
>
> But what would __brepr__ implement?  The variety of format codes in
> the struct module argues that there is no "one obvious" binary
> repr for most types.  (Those that have one would implement __bytes__).
> And what would be the __brepr__ of an arbitrary 'object'?
>
> Faced with the impracticality of defining __brepr__ usefully in any "pure
> bytes" form, it seems sensible to admit that the most useful __brepr__
> is the ascii() encoding of the __repr__.  Which naturally produces 'xxx'
> as the __brepr__ of a string.
>
> This does cause things to get a little un-pretty when you are operating
> at the python prompt:
>
> >>> b'%s' % object
> b'"<class \'object\'>"'
>
> But then again that is most likely really not what you mean to do, so
> it becomes a big red flag...just like b'xxx' is a small red flag when
> you accidentally interpolate unencoded bytes into a string.
>
> --David
>
> PS: When I first read Guido's remark that the result of interpolating a
> string should be 'xxx', I went Wah?  I had to reason my way through to
> it as above, but to him it was just the natural answer.  Guido isn't
> always right, but this kind of automatic language design consistency
> is one reason he's the BDFL.
>
> [*] I still think that you mostly want to design your library so that
> you are handling the text parts as text and the bytes parts as bytes,
> and encoding/gluing them as appropriate at the IO boundary.  But if Guido
> says his real code would benefit by being able to interpolate ASCII into
> bytes at certain points, I'll believe him.



If you think corrupted data is easier or more pleasant to track down
than encoding exceptions then I think you are strange. It makes
porting really difficult while you are still trying to figure out
where the bytes/str boundaries are. I am now deeply suspicious of all
% formatting.
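
Concretely, this is the kind of quiet corruption I mean (plain Python 3,
no proposal involved):

    >>> body = b'{"ok": true}'                 # arrived from a socket as bytes
    >>> 'X-Payload: %s' % body                 # no exception, just wrong data
    'X-Payload: b\'{"ok": true}\''

An exception on that line would have pointed straight at the missing decode;
the stray b'...' repr gets noticed three components later, if at all.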


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Daniel Holth
I see it now. b"foo%sbar" % b'baz' should also expand to b"foob'baz'bar"

Instead of "%b" could "%j" mean "I should have used + or join() here
but was too lazy" and work on str too?

On Mon, Jan 13, 2014 at 2:51 PM, Terry Reedy  wrote:
> On 1/13/2014 1:40 PM, Brett Cannon wrote:
>
>> > So bytes formatting really needn't (and shouldn't, IMO) mirror str
>> > formatting.
>
>
> This was my presumption in writing byteformat().
>
>
>> I think one of the things about Guido's proposal that bugs me is that it
>> breaks the mental model of the .format() method from str in terms of how
>> the mini-language works. For str.format() you have the conversion and
>> the format spec (e.g. "{!r}" and "{:d}", respectively). You apply the
>> conversion by calling the appropriate built-in, e.g. 'r' calls repr().
>> The format spec semantically gets passed with the object to format()
>> which calls the object's __format__() method: ``format(number, 'd')``.
>>
>> Now Guido's suggestion has two parts that affect the mini-language for
>> .format(). One is that for bytes.format() the default conversion is
>> bytes() instead of str(), which is fine (probably want to add 'b' as a
>> conversion value as well to be consistent). But the other bit is that
>> the format spec goes from semantically meaning ``format(thing,
>> format_spec)`` to ``format(thing, format_spec).encode('ascii',
>> 'strict')`` for at least numbers. That implicitness bugs me as I have
>> always thought of format specs just leading to a call to format(). I
>> think I can live with it, though, as long as it is **consistently**
>> applied across the board for bytes.format(); every use of a format spec
>> leads to calling ``format(thing, format_spec).encode('ascii',
>> 'strict')`` no matter what type 'thing' would be and it is clearly
>> documented that this is done to ease porting and handle the common case
>> then I can live with it.
>
>
> This is how my byteformat function works, except that when no format_spec is
> given, byte and bytearrary objects are left unchanged rather than being
> decoded and encoded again.
>
>
>> This even gives people in-place ASCII encoding for strings by always
>> using '{:s}' with text which they can do when they port their code to
>> run under both Python 2 and 3. So you should be able to do
>> ``b'Content-Type: {:s}'.format('image/jpeg')`` and have it give ASCII.
>> If you want more explicit encoding to latin-1 then you need to do it
>> explicitly and not rely on the mini-language to do tricks for you.
>>
>> IOW I want to treat the format mini-language as a language and thus not
>> have any special-casing or massive shifts in meaning between
>> str.format() and bytes.format() so my mental model doesn't have to
>> contort based on whether it's str or bytes. My preference is not have
>> any, but if Guido is going say PBP here then I want absolute consistency
>> across the board in how bytes.format() tweaks things.
>>
>> As for %s for the % operator calling ascii(), I think that will be a
>> porting nightmare of finding out why your bytes suddenly stopped being
>> formatted properly and then having to crawl through all of your code for
>> that one use of %s which is getting bytes in. By raising a TypeError you
>> will very easily detect where your screw-up occurred thanks to the
>> traceback; do so otherwise feels too much like implicit type conversion
>> and ask any JavaScript developer how that can be a bad thing.
>
>
> I personally would not add 'bytes % whatever'.
>
> --
> Terry Jan Reedy
>
>


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Daniel Holth
On Mon, Jan 13, 2014 at 3:11 PM, Yury Selivanov  wrote:
> On January 13, 2014 at 3:08:43 PM, Daniel Holth (dho...@gmail.com) wrote:
>>
>> I see it now. b"foo%sbar" % b'baz' should also expand to b"foob'baz'bar"
>>
>> Instead of "%b" could "%j" mean "I should have used + or join()
>> here
>> but was too lazy" and work on str too?
>
> Isn’t this just error prone? Since it’s a new format character, many,
> probably, would write %s by mistake. And, besides, there was no %j
> in python2.

Merely a flesh wound.


Re: [Python-Dev] PEP 460 reboot

2014-01-13 Thread Daniel Holth
On Mon, Jan 13, 2014 at 4:59 PM, Guido van Rossum  wrote:
> On Mon, Jan 13, 2014 at 1:29 PM, Glenn Linderman  
> wrote:
>> On 1/13/2014 12:09 PM, Guido van Rossum wrote:
>>
>> Yeah, the %s behavior with a string argument was a messy attempt at
>> compromise. I was hoping to mimick a common use of %s in Python 2,
>> where it can be used with either an 8-bit string or a number as
>> argument, acting like %b in the former case and like %d in the latter
>> case. Not having %s at all in Python 3 means that porting requires
>> more thinking (== more opportunity for mistakes when you're converting
>> in bulk) and there's no easy way to write code that works in Python 2
>> and 3.
>>
>> If we have %b for strictly interpolating bytes, I'm fine with adding
>> %a for calling ascii() on the argument and then interpolating the
>> result after ASCII-encoding it.
>>
>> If somehow (unlikely though it seems) we end up keeping %s (e.g.
>> strictly to ease porting), we could also keep %r as an alias for %a.
>>
>>
>> %s for strictly interpolating bytes eases porting. Sad name, but good for
>> compatibility. When the blowup happens, due to having a str type passed, the
>> porter adds the appropriate .encode(...) to the parameter, so it doesn't
>> blow up on Py 3, and it'll be OK for Py 2 as well, will it not?
>
> Lots of code uses %s with numbers too, and probably the occasional
> None or list (relying on the Python 2 near-guarantee that most
> objects' str() is their repr() and that repr() nearly guarantees to
> return only ASCII).
>
> E.g. I'm sure you can find live code doing something like
>
> headers.append('Content-Length: %s\r\n' % len(body))

But if the alternative is spurious quotes then the choice is clear...


Re: [Python-Dev] PEP 460 reboot

2014-01-14 Thread Daniel Holth
On Tue, Jan 14, 2014 at 1:52 PM, Guido van Rossum  wrote:
> On Tue, Jan 14, 2014 at 9:45 AM, Chris Barker  wrote:
>> On Tue, Jan 14, 2014 at 9:29 AM, Yury Selivanov 
>> wrote:
>>>
>>>  - Try str(), and do ".encode('ascii', 'strict')" on the result.
>>
>>
>> please no -- that's the source of a lot of pain in py2 now.
>>
>> having a failure as a result of the value, rather than the type, of an
>> object just makes hard-to-test for bugs. Everything will be hunky dory for
>> development and testing, then in deployment some idiot ( ;-) ) will pass in
>> some non-ascii compatible string and you get  failure. And the person who
>> gets the failure doesn't understand why, or they wouldn't have passed in
>> non-ascii values in the first place...
>>
>> Ease of porting is nice, but let's not make it easy to port bug-prone code.
>
> Right. This is a big red flag to me as well.
>
> I think there is some inherent conflict between the extensible design
> of str.format() and the practical needs of people who are actually
> going to use formatting operations (either % or .format()) with bytes.
>
> The *practical* needs are mostly limited to supporting basic number
> formatting (decimal, hex, padding) and interpolation of anything that
> supports the buffer interface. It would also be nice if you didn't
> have to specify the type at all in the format string, i.e. {} should
> do the right thing for numbers and (all sorts of) bytes.
>
> But the way to arrive at this behavior without duplicating a whole lot
> of code seems to be to call the existing text-based __format__ API and
> convert the result to bytes -- for numbers this should be safe (their
> formatting produces just ASCII digits and a selected few other ASCII
> characters) but leads to an undesirable outcome for other types -- not
> just str but also e.g. lists or dicts containing str instances, since
> those call __repr__ on the contained items, and repr() may produce
> non-ASCII bytes.
>
> This is why my earlier proposal used ascii(), which is a "nerfed"(*)
> version of repr(). This does the right thing for numbers as well as
> for many other types (e.g. None, bool) and does something unpleasant
> for text strings that is perhaps better than the alternative.
>
> Which reminds me. Quite a few people have spoken out in favor of loud
> failures rather than silent "wrong" output. But I think that in the
> specific context of formatting output, there is a long and IMO good
> tradition of producing (slightly) wrong output in favor of more strict
> behavior. Consider for example what to do when a number doesn't fit in
> the given width. Would you rather raise an exception, truncate the
> value, or mess up the formatting? All languages newer than Fortran
> that I've used have chosen the latter, and I still agree it's a good
> idea. Similar with infinities, NaN, or None. (Yes, it's embarrassing
> to have a website displaying 'null'. But isn't a 500 even *more*
> embarrassing?)
>
> This doesn't mean I'm insensitive to the argument in favor of loud and
> early failure. It's just that I can see both sides of the coin, and
> I'm still deciding which argument is more important.
>
> (*) Gamer slang for a weapon made less dangerous. :-)

I think loud and early failure is important for porting while you
might still be trying to pound out the previously blurry encode/decode
boundaries. In this code str and bytes will be wrong everywhere. Some
APIs might return either str or bytes based on the input. Let it fail,
find the boundaries, and fix it until it does something useful without
failing. And it kindof depends on the context whether it is worse to
display weird ephemeral output or write the same weird output to long
term storage.

I'm not sure what to think about content-dependent failures on
protocols that are supposed to be ASCII-only-without-repr-noise.
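
By "content-dependent" I mean this sort of thing: same code, same types,
success or failure decided by whatever value happens to show up (ordinary
Python 3, independent of the PEP):

    >>> ('Location: %s\r\n' % 'example.com/index').encode('ascii', 'strict')
    b'Location: example.com/index\r\n'
    >>> ('Location: %s\r\n' % 'example.com/índex').encode('ascii', 'strict')
    Traceback (most recent call last):
      ...
    UnicodeEncodeError: 'ascii' codec can't encode character '\xed' in position 22: ordinal not in range(128)

Development and the test suite only ever see the first case.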


Re: [Python-Dev] Making proxy types easier to write and maintain

2014-03-20 Thread Daniel Holth
pypy's transparent proxy feature:
http://pypy.readthedocs.org/en/latest/objspace-proxies.html#transparent-proxies



On Thu, Mar 20, 2014 at 1:56 PM, Larry Hastings  wrote:
> On 03/20/2014 12:49 AM, Nick Coghlan wrote:
>
> So long as Graham's willing to go along with it, he doesn't have to to
> be the one to write the PEP.
>
>
> PEP? Why does it need a PEP?  I didn't think it'd even be a new top-level
> library module, it'd be an addition / additions to functools.
>
>
> /arry
>


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-25 Thread Daniel Holth
I love it.

On Tue, Mar 25, 2014 at 6:37 PM, Ethan Furman  wrote:
> Okay, I included that last round of comments (from late February).
>
> Barring typos, this should be the final version.
>
> Final comments?
>
> -
> PEP: 461
> Title: Adding % formatting to bytes and bytearray
> Version: $Revision$
> Last-Modified: $Date$
> Author: Ethan Furman 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 2014-01-13
> Python-Version: 3.5
> Post-History: 2014-01-14, 2014-01-15, 2014-01-17, 2014-02-22, 2014-03-25
> Resolution:
>
>
> Abstract
> ========
>
> This PEP proposes adding % formatting operations similar to Python 2's
> ``str``
> type to ``bytes`` and ``bytearray`` [1]_ [2]_.
>
>
> Rationale
> =========
>
> While interpolation is usually thought of as a string operation, there are
> cases where interpolation on ``bytes`` or ``bytearrays`` make sense, and the
> work needed to make up for this missing functionality detracts from the
> overall
> readability of the code.
>
>
> Motivation
> ==========
>
> With Python 3 and the split between ``str`` and ``bytes``, one small but
> important area of programming became slightly more difficult, and much more
> painful -- wire format protocols [3]_.
>
> This area of programming is characterized by a mixture of binary data and
> ASCII compatible segments of text (aka ASCII-encoded text).  Bringing back a
> restricted %-interpolation for ``bytes`` and ``bytearray`` will aid both in
> writing new wire format code, and in porting Python 2 wire format code.
>
> Common use-cases include ``dbf`` and ``pdf`` file formats, ``email``
> formats, and ``FTP`` and ``HTTP`` communications, among many others.
>
>
> Proposed semantics for ``bytes`` and ``bytearray`` formatting
> ==============================================================
>
> %-interpolation
> ---------------
>
> All the numeric formatting codes (such as ``%x``, ``%o``, ``%e``, ``%f``,
> ``%g``, etc.) will be supported, and will work as they do for str, including
> the padding, justification and other related modifiers.  The only difference
> will be that the results from these codes will be ASCII-encoded text, not
> unicode.  In other words, for any numeric formatting code `%x`::
>
>b"%x" % val
>
> is equivalent to
>
>("%x" % val).encode("ascii")
>
> Examples::
>
>>>> b'%4x' % 10
>b'   a'
>
>>>> b'%#4x' % 10
>b' 0xa'
>
>>>> b'%04X' % 10
>b'000A'
>
> ``%c`` will insert a single byte, either from an ``int`` in range(256), or
> from
> a ``bytes`` argument of length 1, not from a ``str``.
>
> Examples::
>
> >>> b'%c' % 48
> b'0'
>
> >>> b'%c' % b'a'
> b'a'
>
> ``%s`` is included for two reasons:  1) `b` is already a format code for
> ``format`` numerics (binary), and 2) it will make 2/3 code easier as Python
> 2.x
> code uses ``%s``; however, it is restricted in what it will accept::
>
>   - input type supports ``Py_buffer`` [6]_?
> use it to collect the necessary bytes
>
>   - input type is something else?
> use its ``__bytes__`` method [7]_ ; if there isn't one, raise a
> ``TypeError``
>
> In particular, ``%s`` will not accept numbers (use a numeric format code for
> that), nor ``str`` (encode it to ``bytes``).
>
> Examples::
>
> >>> b'%s' % b'abc'
> b'abc'
>
> >>> b'%s' % 'some string'.encode('utf8')
> b'some string'
>
> >>> b'%s' % 3.14
> Traceback (most recent call last):
> ...
> TypeError: b'%s' does not accept numbers, use a numeric code instead
>
> >>> b'%s' % 'hello world!'
> Traceback (most recent call last):
> ...
> TypeError: b'%s' does not accept 'str', it must be encoded to `bytes`
>
>
> ``%a`` will call ``ascii()`` on the interpolated value.  This is intended
> as a debugging aid, rather than something that should be used in production.
> Non-ASCII values will be encoded to either ``\xnn`` or ``\unnnn``
> representation.  Use cases include developing a new protocol and writing
> landmarks into the stream; debugging data going into an existing protocol
> to see if the problem is the protocol itself or bad data; a fall-back for a
> serialization format; or even a rudimentary serialization format when
> defining ``__bytes__`` would not be appropriate [8].
>
> .. note::
>
> If a ``str`` is passed into ``%a``, it will be surrounded by quotes.
>
>
> Unsupported codes
> -----------------
>
> ``%r`` (which calls ``__repr__`` and returns a ``str``) is not supported.
>
>
> Proposed variations
> ===================
>
> It was suggested to let ``%s`` accept numbers, but since numbers have their
> own
> format codes this idea was discarded.
>
> It has been suggested to use ``%b`` for bytes as well as ``%s``.  This was
> rejected as not adding any value either in clarity or simplicity.
>
> It has been proposed to automatically use ``.encode('ascii','strict')`` for
> ``str`` arguments to ``%s``.
>
>   - Rejected as thi

Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Daniel Holth
I feel not including %s is nuts. Should I write .replace('%b', '%s')?
All I desperately need are APIs that provide enough unicode / str type
safety that I get an exception when mixing them accidentally... in my
own code, dynamic typing is usually a bug. As has been endlessly
discussed, %s for bytes is a bit like exposing sprintf()...

On Thu, Mar 27, 2014 at 2:41 PM, Guido van Rossum  wrote:
> Much better, but I'm still not happy with including %s at all. Otherwise
> it's accept-worthy. (How's that for pressure. :-)
>
>
> On Thu, Mar 27, 2014 at 11:04 AM, Ethan Furman  wrote:
>>
>> On 03/27/2014 10:55 AM, Ethan Furman wrote:
>>>
>>> On 03/27/2014 10:29 AM, Guido van Rossum wrote:


 I also don't understand why we can't use %b instead of %s. AFAIK %b
 currently doesn't mean anything and I somehow don't
 expect we're likely to add it for other reasons (unless there's a
 proposal I'm missing?). Just like we use %a instead of
 %r to remind people that it's not quite the same (since it applies
 .encode('ascii', 'backslashreplace')), shouldn't we
 use anything *but* %s to remind people that that is also not the same
 (not at all, in fact)? The PEP's argument against
 %b ("rejected as not adding any value either in clarity or simplicity")
 is hardly a good reason.
>>>
>>>
>>> The biggest reason to use %s is to support a common code base for 2/3
>>> endeavors.  The biggest reason to not include %b
>>> is that it means binary number in format(); given that each type can
>>> invent it's own mini-language, this probably isn't
>>> a very strong argument against it.
>>>
>>> I have moderate feelings for keeping %s as a synonym for %b for backwards
>>> compatibility with Py2 code (when it's
>>> appropriate).
>>
>>
>> Changed to:
>>
>> --
>> ``%b`` will insert a series of bytes.  These bytes are collected in one of
>> two
>> ways:
>>
>>   - input type supports ``Py_buffer`` [4]_?
>>
>> use it to collect the necessary bytes
>>
>>   - input type is something else?
>> use its ``__bytes__`` method [5]_ ; if there isn't one, raise a
>> ``TypeError``
>>
>> In particular, ``%b`` will not accept numbers nor ``str``.  ``str`` is
>> rejected
>> as the string to bytes conversion requires an encoding, and we are
>> refusing to
>> guess; numbers are rejected because:
>>
>>   - what makes a number is fuzzy (float? Decimal? Fraction? some user
>> type?)
>>
>>   - allowing numbers would lead to ambiguity between numbers and textual
>> representations of numbers (3.14 vs '3.14')
>>
>>   - given the nature of wire formats, explicit is definitely better than
>> implicit
>>
>> ``%s`` is included as a synonym for ``%b`` for the sole purpose of making
>> 2/3 code
>> bases easier to maintain.  Python 3 only code should use ``%b``.
>>
>> Examples::
>>
>> >>> b'%b' % b'abc'
>> b'abc'
>>
>> >>> b'%b' % 'some string'.encode('utf8')
>> b'some string'
>>
>> >>> b'%b' % 3.14
>>
>> Traceback (most recent call last):
>> ...
>> TypeError: b'%b' does not accept 'float'
>>
>> >>> b'%b' % 'hello world!'
>>
>> Traceback (most recent call last):
>> ...
>> TypeError: b'%b' does not accept 'str'
>>
>> --
>>
>>
>> --
>> ~Ethan~
>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>


Re: [Python-Dev] PEP 461: Adding % formatting to bytes and bytearray -- Final, Take 3

2014-03-27 Thread Daniel Holth
On Thu, Mar 27, 2014 at 2:53 PM, Guido van Rossum  wrote:
> So what's the use case for Python 2/3 compatible code? IMO the main use case
> for the PEP is simply to be able to construct bytes from a combination of a
> template and some input that may include further bytes and numbers. E.g. in
> asyncio when you write an HTTP client or server you have to construct bytes
> to write to the socket, and I'd be happy if I could write b'HTTP/1.0 %d
> %b\r\n' % (status, message) rather than having to use
> str(status).encode('ascii') and concatenation or join().

It seems to be notoriously difficult to understand or explain why
Unicode can still be very hard in Python 3 or in code that is in the
middle of being ported or has to run in both interpreters. As far as I
can tell, part of it is when a symbol has type(str or bytes) depending on
context (declared as if we had a static type system with union types); some of
it is because incorrect mixing can happen without an exception, only
to be discovered later and far away in space and time from the error
(worst of all in a serialized file); and part of it is all of the not
easily checkable "types" a particular Unicode object has depending on
whether it contains surrogates or codes > n. Sometimes you might
simply disagree about whether an API should be returning bytes or
Unicode in mildly ambiguous cases like base64 encoding. Sometimes
Unicode is just intrinsically complicated.

For me this PEP holds the promise of being able to do work in the
bytes domain, with no accidental mixing ever, when I *really* want
bytes. For 2+3 I would get exceptions sometimes in Python 2 and
exceptions all the time in Python 3 for mistakes. I hope this is less
error prone in strict domains than for example u"string
processing".encode('latin1'). And I hope that there is very little
type(str or int) in HTTP for example or other "legitimate" bytes
domains but I don't know; I suspect that if you have a lot of problems
with bytes' %s then it's a clue you should use (u"%s" %
(argument)).encode() instead.

sprintf()'s version of %s just takes a char* and puts it in without
doing any type conversion of course. IANACL (I am not a C lawyer).
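
To make the "no accidental mixing ever" hope concrete, this is what I expect
Guido's HTTP example to look like under Ethan's draft quoted earlier in the
thread (illustrative only; the %d/%b codes and the error text are the
draft's, not shipped behaviour):

    >>> status, message = 200, b'OK'
    >>> b'HTTP/1.0 %d %b\r\n' % (status, message)
    b'HTTP/1.0 200 OK\r\n'
    >>> b'HTTP/1.0 %d %b\r\n' % (status, 'OK')       # a str sneaks in
    Traceback (most recent call last):
      ...
    TypeError: b'%b' does not accept 'str'

That loud failure at the format site is the whole attraction for me.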


Re: [Python-Dev] PEP 465: A dedicated infix operator for matrix multiplication

2014-04-08 Thread Daniel Holth
On Tue, Apr 8, 2014 at 12:08 AM, Nick Coghlan  wrote:
>
> On 7 Apr 2014 21:58, "MRAB"  wrote:
>>
>> On 2014-04-08 02:45, Guido van Rossum wrote:
>>>
>>> So what? Aren't we allowed to have fun? :-)
>>>
>> Next thing you know, he'll be threatening people with The Comfy Chair!
>
> You may want to take a look at the packaging metadata 2.0 spec ;)
>
> I was also going to add a +1 for the actual topic of this thread, but
> Guido's acceptance of the PEP rendered that point rather moot :)

@ [1]


Re: [Python-Dev] this is what happens if you freeze all the modules required for startup

2014-04-15 Thread Daniel Holth
IIRC it is no longer the case that ZIP imports (involving only one
file for a lot of modules) are much faster than regular FS imports?

On Tue, Apr 15, 2014 at 10:34 AM, Eric Snow  wrote:
> On Tue, Apr 15, 2014 at 1:45 AM, Chris Angelico  wrote:
>> Specific use-case that I can see: Mercurial. In a git vs hg shoot-out,
>> git will usually win on performance, and hg is using Py2; migrating hg
>> to Py3 will (if I understand the above figures correctly) widen that
>> gap, so any improvement done to startup performance will give a very
>> real advantage.
>
> Perhaps not so much "a very real advantage" as "less of a
> distraction". It's still significantly slower than 2.7.  :)
>
> -eric


Re: [Python-Dev] Mercurial sluggishness (was: this is what happens if you freeze all the modules required for startup)

2014-04-15 Thread Daniel Holth
I find Python's startup time to be very sluggish. I wish it was less
than 50 milliseconds (0.05 seconds) including running hg, which is the
common threshold for "instant". On my machine 'python -c ""' takes
about 0.05 seconds but 'python3 -c ""' takes 0.125 seconds. I will be
very happy to see any speedup.
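
For anyone who wants to reproduce those numbers, this is roughly how I
measure it; nothing fancy, just spawning each interpreter with an empty
program (the interpreter names are whatever is on your PATH):

    import subprocess, time

    def startup(interpreter, runs=20):
        """Average wall-clock seconds to start an interpreter and run an empty script."""
        start = time.perf_counter()
        for _ in range(runs):
            subprocess.check_call([interpreter, '-c', ''])
        return (time.perf_counter() - start) / runs

    for interp in ('python', 'python3'):
        print(interp, '%.3f' % startup(interp))

(time.perf_counter() is 3.3+, so run the measuring script itself under
python3 and point it at both interpreters.)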

On Tue, Apr 15, 2014 at 11:34 AM, Skip Montanaro  wrote:
> Eric wrote:
>
>> Perhaps not so much "a very real advantage" as "less of a
>> distraction".  It's still significantly slower than 2.7.  :)
>
> I'm still confused. I peeked in /usr/bin/hg. The only "system" modules
> it imports directly are os and sys (maybe I'm using an ancient
> version?). After that, it imports its own lazy import module. This
> suggests to me that Mercurial's import slowness is mostly in its own
> code (I counted 104 Python modules and six shared objects in its
> mercurial package), which isn't going to be affected (much?) by
> freezing the Python standard modules.
>
> I'm not trying to be difficult here. I thought that way back in the
> day a huge amount of work was done to remove needless filesystem
> activity, and zip packaging has been around for quite awhile.
>
> As an experiment, I ran hg pull as
>
> /usr/bin/python -v /usr/bin/hg pull
>
> in my cpython repo then looked at the -v output. Summarizing the
> output I saw the following:
>
> 30 imports (0 dlopens)
>
> Python banner printed
>
> 86 imports (18 dlopens)
>
> adding changesets message
>
> 5 imports (2 dlopens)
>
> adding manifests message
>
> 1 import (0 dlopens)
>
> adding file changes message
>
> 7 imports (3 dlopens)
>
> added ... changesets message
>
> 4 imports (0 dlopens)
>
> run 'hg update' message
>
> (I missed a "searching" message in there somewhere.)
>
> That's a total of 133 imports, 23 of which were satisfied by loading
> an extension module. The imports break down as follows:
>
> 51 imports (4 dlopens) from the mercurial package
> 5 imports from the hgext package
> 7 imports from the encodings package
>
> Everything else is imported from the top level, and at a glance
> appears to be all Python stdlib stuff.  The key period of execution
> looks to be between the printing of the Python banner and the printing
> of the adding changesets message. I see 46 imports (2 dlopens) from
> the mercurial or hgext packages. That leaves 40 stdlib imports, of
> which 16 were satisfied by dlopen.
>
> As a final check, I saved all the stdlib import statements from the -v
> output (77 statements) to a file and timed its execution:
>
> % time /usr/bin/python ~/tmp/pyimp.py
>
> real0m0.162s
> user0m0.034s
> sys 0m0.010s
>
> It doesn't take much time to import every stdlib module that Mercurial
> needs.  I don't know how much slower all this import machinery is in
> 3.x (and can't test at work, as we don't have a copy laying about). It
> would probably have to be 3x or more slower for it to have much
> visible impact on even simple hg commands.  I find it hard to believe
> that freezing the stdlib is going to lower the barrier enough for the
> Mercurial folks, if, in fact, import slowness is their main reason for
> not moving to 3.x.
>
> Skip


Re: [Python-Dev] Mercurial sluggishness (was: this is what happens if you freeze all the modules required for startup)

2014-04-15 Thread Daniel Holth
In case you were wondering, I'm using Ubuntu's "2.7.5+" and "3.3.2+".

My feeling has long been that the speed of getting at the "--help"
option or any initial user feedback from Mercurial or git is a big
driver in perceived speed as opposed to how long the entire operation
might take. But for me any initial speed improvements from git are
fully offset by the feeling of irritation afterwards. /troll

For me Python's startup time (warm) takes about 1/4 of the hg startup
time in the worst case. I expect to both notice and appreciate any
speedups and encourage all optimizers to optimize.

On Tue, Apr 15, 2014 at 12:47 PM, Antoine Pitrou  wrote:
> Le 15/04/2014 17:42, Daniel Holth a écrit :
>
>> I find Python's startup time to be very sluggish. I wish it was less
>> than 50 milliseconds (0.05 seconds) including running hg, which is the
>> common threshold for "instant". On my machine 'python -c ""' takes
>> about 0.05 seconds but 'python3 -c ""' takes 0.125 seconds.
>
>
> Please quote exact versions. Different versions of Python 3 will have
> different startup characteristics.
>
> Regards
>
> Antoine.
>
>
>


Re: [Python-Dev] Mercurial sluggishness (was: this is what happens if you freeze all the modules required for startup)

2014-04-15 Thread Daniel Holth
On Tue, Apr 15, 2014 at 1:29 PM, Antoine Pitrou  wrote:
> Le 15/04/2014 19:09, Daniel Holth a écrit :
>
>> In case you were wondering, I'm using Ubuntu's "2.7.5+" and "3.3.2+".
>>
>> My feeling has long been that the speed of getting at the "--help"
>> option or any initial user feedback from Mercurial or git is a big
>> driver in perceived speed as opposed to how long the entire operation
>> might take. But for me any initial speed improvements from git are
>> fully offset by the feeling of irritation afterwards. /troll
>>
>> For me Python's startup time (warm) takes about 1/4 of the hg startup
>> time in the worst case. I expect to both notice and appreciate any
>> speedups and encourage all optimizers to optimize.
>
>
> Well, if we optimize 11% out of that 1/4, I don't expect you to notice the
> speedup at all ;-)
>
>
> Regards
>
> Antoine.

No one expects the Spanish Inquisition.


Re: [Python-Dev] Software integrators vs end users (was Re: Language Summit notes)

2014-04-18 Thread Daniel Holth
On Fri, Apr 18, 2014 at 12:55 PM, Paul Moore  wrote:
> On 18 April 2014 16:58, Nick Coghlan  wrote:
>> As part of thrashing out the respective distribution ecosystem roles
>> of pip and conda (still a work in progress), we're at least converging
>> on the notion that there are actually now *two* main ways of consuming
>> Python: as a "software integrator" (the way most of us have
>> traditionally consumed it, and the way that dominates most project
>> documentation outside the scientific Python community) and as an "end
>> user" (the way Linux system administrators have long consumed it, and
>> the way scientists, financial analysts and folks just learning Python
>> are likely best off consuming it).
>>
>> Making these different personas explicit is a process that has barely
>> begun (this email is mostly based on some conversations I had in
>> person at PyCon and via email during the sprints), but here's the gist
>> (based on listing examples):
>
> Interesting perspective. However, I'm not convinced it's complete.
> Specifically, there's one group of people who I encounter relatively
> often, who don't seem to me to fit well into either category you're
> proposing. That is, (Windows in my experience, but maybe Linux as
> well) users who want to write "simple scripts" and for whom batch
> files or similar are not sufficient. Such people typically don't have
> the sort of "single application area" focus that your "end user"
> category seems to imply, but on the other hand they don't really fit
> the "software integrators" role in the sense of necessarily being
> comfortable setting up their own development environment.
>
> I worry that your classification risks ignoring that group (maybe
> because Unix users are well served with other alternatives than Python
> for this type of task, or because on Unix "use the system Python" is
> the right answer).
>
> Your list of "end user" targeted distributions seem to be limited to:
>
>   - Linux distribution vendors
>   - Vendors focused on the essentially scientific community (in the
> broadest sense)
>   - Embedded Python
>
> That's very far from being complete coverage of all the people *I'd*
> like to be able to recommend Python to. Specifically, unless we're not
> interested in "generic" Windows users, I think we need to offer *some*
> form of equivalent of the OS-packaged Python on Linux for Windows
> users. That's what the python.org builds, plus pip, wheels and PyPI,
> give for Windows users now. Hmm, if we assume that supporting that
> remains a priority, is what you're really saying that we *don't* try
> to extend that to work for Linux/OSX, as doing so competes with the OS
> vendors - but rather we see python.org binaries and binary
> infrastructure like wheels as being focused on the Windows user
> experience?
>
> (I wish I'd been at PyCon, this would have been a very interesting
> discussion to have face to face. Email isn't ideal for this...)

One more group that I find interesting is application users. These
people should not need to notice that Python is in use at all, in
contrast to the "build a virtualenv / install / pass through fire and
death / use" workflow that is sometimes promoted. They are well served
by good tools that make single-file zip distributions or py2exe etc.
from a collection of wheels or sdists. Application users are using
Python because a best-in-class program is written in Python and not
because a Python program integrates better with other Python
libraries.


Conda is interesting because it is a system package manager, except it
installs everything into what they sometimes describe as "C-level
virtualenvs" rather than /. They've come up with a design that works
well with Python programs but isn't particularly Python-specific at
all.


Re: [Python-Dev] pep8 reasoning

2014-04-24 Thread Daniel Holth
Fortunately, Unicode provides us with the COMBINING LOW LINE
character, combining the horizontal space-savings of camelCase with
the underscore-indicates-separation properties of _. And it's a valid
Python identifier.

convertx̲mlt̲oj̲son
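
In case anyone doubts the identifier claim, Python 3 will confirm it
(U+0332 is the combining low line):

    >>> import unicodedata
    >>> unicodedata.name('\u0332')
    'COMBINING LOW LINE'
    >>> 'convertx\u0332mlt\u0332oj\u0332son'.isidentifier()
    True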



On Thu, Apr 24, 2014 at 12:25 PM, Chris Angelico  wrote:
> On Thu, Apr 24, 2014 at 11:59 PM, Barry Warsaw  wrote:
>> I will say this: the original preference for underscore_names in PEP 8 was
>> spurred by user studies some of our early non-native English speaking users
>> conducted many years ago.  We learned that it was more difficult for many of
>> them to parse mixedCase names than underscore_names.  I'm afraid I probably 
>> no
>> longer have references to those studies, but the difference was pronounced,
>> IIRC, and I think it's easy to see why.  Underscores can be scanned by the 
>> eye
>> as spaces, while I'd hypothesize that the brain has to do more work to read
>> mixedCase names.
>
> Underscores also play much more nicely with initialisms. How would you
> spell a function named "Add HTTP Header"?
>
> add_HTTP_header
> add_http_header
>
> addHTTPHeader
> addHttpHeader
>
> Four options to choose from. The first two clearly separate the
> initialism from the other words; take your pick whether you want it
> uppercased or not, because it's separated either way. In mixedCase,
> the first one merges the H of Header in with HTTP; with something less
> well known, that can be a nasty readability problem. The second one is
> probably more readable, but looks weird. Or, here's another one:
> converting one thing into another, where both are named by their
> initials:
>
> convert_XML_to_JSON
> convert_xml_to_json
>
> convertXMLToJSON
> convertXmlToJson
>
> Same four options. Which is the more readable?
>
> ChrisA


[Python-Dev] Some notes about MicroPython from an observer

2014-06-04 Thread Daniel Holth
- micropython is designed to run on a machine with 192 kilobytes of
RAM and perhaps a megabyte of FLASH. The controller can execute
read-only code directly from FLASH. There is no dynamic linker in this
environment. (It also has a UNIX port).
- However it does include a full Python parser and REPL, so the board
can be programmed without a separate computer as opposed to, say,
having to upload bytecode compiled on a regular computer.
- It's definitely going to be a subset of Python. For example,
func.__name__ is not supported - to make it more micro?
- They have a C API. It is much different than the CPython C API.
- It has more than one code emitter. A certain decorator causes a
function to be compiled to ARM Thumb code instead of bytecode.
- It even has an inline assembler that translates Python-syntax ARM
assembly (to re-use the same parser) into machine code.

Most information from
https://www.kickstarter.com/projects/214379695/micro-python-python-for-microcontrollers/posts
and http://micropython.org/
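
For the curious, the emitter selection looks roughly like this. This is
based on my reading of the docs and Kickstarter updates, so treat the
decorator names (micropython.native, micropython.asm_thumb) as subject to
change:

    import micropython

    @micropython.native              # compile this function to native machine code
    def scale(buf, factor):
        for i in range(len(buf)):
            buf[i] = (buf[i] * factor) & 0xFF

    @micropython.asm_thumb           # inline assembler: Python syntax, ARM Thumb semantics
    def asm_add(r0, r1):
        add(r0, r0, r1)              # result comes back in r0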


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Daniel Holth
Can of worms, opened.
On Jun 4, 2014 7:20 AM, "Chris Angelico"  wrote:

> On Wed, Jun 4, 2014 at 9:12 PM, Paul Sokolovsky  wrote:
> > An alternative view is that the discussion on the tracker showed Python
> > developers' mind-fixation on implementing something the way CPython does
> > it. And I didn't yet go to that argument, but in the end, MicroPython
> > does not try to rewrite CPython or compete with it. So, having few
> > choices with pros and cons leading approximately to the tie among them,
> > it's the least productive to make the same choice as CPython did.
>
> I'm not a CPython dev, nor a Python dev, and I don't think any of the
> big names of CPython or Python has showed up on that tracker as yet.
> But why is "be different from CPython" such a valuable choice? CPython
> works. It's had many hours of dev time put into it. Problems have been
> identified and avoided. Throwing that out means throwing away a
> freely-given shoulder to stand on, in an Isaac Newton way.
>
> http://www.joelonsoftware.com/articles/fog69.html
>
> ChrisA


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Daniel Holth
If we're voting I think representing Unicode internally in micropython
as utf-8 with O(N) indexing is a great idea, partly because I'm not
sure indexing into strings is a good idea - lots of Unicode code
points don't make sense by themselves; see also grapheme clusters. It
would probably work great.
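
To make the trade-off concrete, here is a toy of what "UTF-8 storage with
O(N) indexing" means; ordinary Python as a model of the idea, not
MicroPython source:

    class Utf8Str:
        """Toy text type stored as UTF-8; indexing scans from the start (O(N))."""

        def __init__(self, text):
            self._buf = text.encode('utf-8')

        def __len__(self):
            # Count lead bytes (anything that is not a 10xxxxxx continuation byte).
            return sum(1 for b in self._buf if b & 0xC0 != 0x80)

        def __getitem__(self, index):
            # Walk the buffer until the index-th code point starts (no negative indices).
            if index < 0:
                raise IndexError('negative indices left out of this toy')
            count = -1
            start = None
            for pos, byte in enumerate(self._buf):
                if byte & 0xC0 != 0x80:
                    if count == index:
                        return self._buf[start:pos].decode('utf-8')
                    count += 1
                    start = pos
            if count == index and start is not None:
                return self._buf[start:].decode('utf-8')
            raise IndexError('string index out of range')

    >>> s = Utf8Str('héllo wörld')
    >>> len(s), s[1], s[7]
    (11, 'é', 'ö')

Memory cost is the UTF-8 bytes and nothing else; the price is the scan.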

On Wed, Jun 4, 2014 at 7:49 AM, Paul Sokolovsky  wrote:
> Hello,
>
> On Wed, 4 Jun 2014 20:53:46 +1000
> Chris Angelico  wrote:
>
>> On Wed, Jun 4, 2014 at 8:38 PM, Paul Sokolovsky 
>> wrote:
>> > And I'm saying that not to discourage Unicode addition to
>> > MicroPython, but to hint that "force-force" approach implemented by
>> > CPython3 and causing rage and split in the community is not
>> > appreciated.
>>
>> FWIW, it's Python 3 (the language) and not CPython 3.x (the
>> implementation) that specifies Unicode strings in this way.
>
> Yeah, but it's CPython that dictates how the language evolves (some people
> even think that it dictates how the language should be implemented!), so all
> good parts belong to Python3, and all bad parts - to CPython3,
> right? ;-)
>
>> I don't
>> know why it has to cause a split in the community; this is the one way
>> to make sure *everyone's* strings work perfectly, rather than having
>> ASCII strings work fine and others start tripping over problems in
>> various APIs.
>
> It did cause a split in the community, that's a fact, that's why
> Python2 and Python3 are at their respective positions. Anyway, I'm not
> interested in participating in that split; I have not yet uttered my
> opinion on that publicly enough, so I seized a chance to drop some
> witty remarks, but I don't want to start yet another Unicode flame.
>
>
>
> So, let's please be back to Unicode storage representation in
> MicroPython. So, https://github.com/micropython/micropython/issues/657
> discussed technical aspects, in a recent mail on this list I expressed
> my opinion why following CPython way is not productive (for development
> satisfaction and evolution of Python community, to be explicit).
>
> Final argument I would have is that you certainly can implement Unicode
> support the PEP393 way - it would be enormous help and would be gladly
> accepted. The question, how useful it will be for MicroPython. It
> certainly will be useful to report passing of testsuites. But will it
> be *really* used?
>
> For a microcontroller board, it might be too heavy (put simply, with it,
> people will be able to do less (== heap running out sooner)), than
> without it, so one may expect it to be disabled by default. Then POSIX
> port is there surely not to let people replace "python" command
> with "micropython" and run Django, but to let people develop and debug
> their apps with more comfort than on embedded board. So, it should
> behave close to MCU version, and would follow with MCU choice
> re: Unicode.
>
> That's actually the reason why I keep up this discussion - not for the
> sake of argument or to bash Python3's Unicode choices. With recent
> MicroPython announcement, we surely looked for more people to
> contribute to its development. But then we (or at least I can speak for
> myself), would like to make sure that these contribution are actually
> the most useful ones (for both MicroPython, and Python community in
> general, which gets more choices, rather than just getting N% smaller
> CPython rewrite).
>
> So, you're not sure how O(N) string indexing will work? But MicroPython
> offers a great opportunity to try! And it's something new and exciting,
> which surely will be useful (== will save people memory), not just
> something old and boring ;-).
>
>
>>
>> ChrisA
>
>
> --
> Best regards,
>  Paul  mailto:pmis...@gmail.com


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Daniel Holth
MicroPython is going to be significantly incompatible with Python
anyway. But you should be able to run your mp code on regular Python.

On Wed, Jun 4, 2014 at 9:39 AM, Serhiy Storchaka  wrote:
> 04.06.14 04:17, Steven D'Aprano написав(ла):
>
>> Would either of these trade-offs be acceptable while still claiming
>> "Python 3.4 compatibility"?
>>
>> My own feeling is that O(1) string indexing operations are a quality of
>> implementation issue, not a deal breaker to call it a Python. I can't
>> see any requirement in the docs that str[n] must take O(1) time, but
>> perhaps I have missed something.
>
>
> I think that breaking the O(1) expectation for indexing makes the implementation
> significantly incompatible with Python. Virtually all string operations in
> Python operates with indices.
>
> O(1) indexing operations can be kept with minimal memory requirements if we
> implement Unicode internally as modified UTF-8 plus an optional array of
> offsets for every, say, 32nd character (which can even be compressed to an
> array of 16-bit or 32-bit integers).
>
>


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-04 Thread Daniel Holth
On Wed, Jun 4, 2014 at 10:12 AM, Steven D'Aprano  wrote:
> On Wed, Jun 04, 2014 at 01:14:04PM +, Steve Dower wrote:
>> I'm agree with Daniel. Directly indexing into text suggests an
>> attempted optimization that is likely to be incorrect for a set of
>> strings.
>
> I'm afraid I don't understand this argument. The language semantics says
> that a string is an array of code points. Every index relates to a
> single code point, no code point extends over two or more indexes.
> There's a 1:1 relationship between code points and indexes. How is
> direct indexing "likely to be incorrect"?

"Useful" is probably a better word. When you get into the complicated
languages and you want to know how wide something is, and you might
have y with two dots on it as one code point or two and left-to-right
and right-to-left indicators and who knows what else... then looking
at individual code points only works sometimes. I get the slicing
idea.

I like the idea that encoding to utf-8 would be the fastest thing you
can do with a string. You could consider doing regexps in that domain,
and other implementation specific optimizations in exactly the same
way that any Python implementation has them.

None of this would make it harder to move a servo.
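
The "y with two dots" case, spelled out (standard Python 3; unicodedata is
in the stdlib):

    >>> import unicodedata
    >>> one = '\u00ff'        # ÿ as a single precomposed code point
    >>> two = 'y\u0308'       # y + COMBINING DIAERESIS: one grapheme, two code points
    >>> len(one), len(two)
    (1, 2)
    >>> one == two
    False
    >>> unicodedata.normalize('NFC', two) == one
    True
    >>> two[0]                # indexing hands back half of what the reader sees
    'y'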


Re: [Python-Dev] Internal representation of strings and Micropython

2014-06-05 Thread Daniel Holth
On Thu, Jun 5, 2014 at 11:59 AM, Paul Moore  wrote:
> On 5 June 2014 14:15, Nick Coghlan  wrote:
>> As I've said before in other contexts, find me Windows, Mac OS X and
>> JVM developers, or educators and scientists that are as concerned by
>> the text model changes as folks that are primarily focused on Linux
>> system (including network) programming, and I'll be more willing to
>> concede the point.
>
> There is once again a strong selection bias in this discussion, by its
> very nature. People who like the new model don't have anything to
> complain about, and so are not heard.
>
> Just to support Nick's point, I for one find the Python 3 text model a
> huge benefit, both in practical terms of making my programs more
> robust, and educationally, as I have a far better understanding of
> encodings and their issues than I ever did under Python 2. Whenever a
> discussion like this occurs, I find it hard not to resent the people
> arguing that the new model should be taken away from me and replaced
> with a form of the old error-prone (for me) approach - as if it was in
> my best interests.
>
> Internal details don't bother me - using UTF8 and having indexing be
> potentially O(N) is of little relevance. But make me work with a
> string type that *doesn't* abstract a string as a sequence of Unicode
> code points and I'll get very upset.

Once you get past whether str + bytes throws an exception, which seems
to be the divide most people focus on, you can discover new things
like dance-encoded strings, bytes decoded using an incorrect encoding
intended to be transcoded into the correct encoding later, surrogates
that work perfectly until .encode(), str(bytes), APIs that disagree
with you about whether the result should be str or bytes, APIs that
return either string or bytes depending on their initializers and so
on. Unicode can still be complicated in Python 3 independent of any
judgement about whether it is worse, better, or different than Python
2.
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Feature request: Change a Dependency Package Version During Package Initiation

2019-05-17 Thread Daniel Holth
This sounds exactly like what people used to do with eggs. You could have
multiple versions of a package on the path as eggs and then require a
version at runtime. The approach has problems. Ruby also abandoned a
strategy where random app code depends on package management code at
runtime.

One better strategy is to set up a python path in a wrapper script.
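
A minimal sketch of that wrapper-script strategy (the paths and the entry
point are hypothetical): pin the dependency versions by putting their
directories on sys.path before importing the application, instead of
selecting versions at runtime inside the package.

import sys

# Hypothetical locations of the pinned dependency versions.
sys.path.insert(0, "/opt/myapp/vendor/tornado-2.2.1")
sys.path.insert(0, "/opt/myapp/vendor/numpy-1.16.3")

from myapp.main import main   # hypothetical application entry point

if __name__ == "__main__":
    sys.exit(main())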

On Fri, May 17, 2019, 11:27 Brett Cannon  wrote:

> Thanks for the idea but there are currently no plans to support such a
> feature. If you would like to see it then you will need to write a PEP with
> a proof-of-concept to demonstrate how you would expect such a feature to
> work.
>
> On Fri., May 17, 2019, 07:55 Q via Python-Dev, 
> wrote:
>
>> A lot of the Python code we use in production is used directly as
>> imports in other Python distributions (such as the Python that comes with
>> the finite element software Abaqus and MSC Marc); many packages (such as
>> matplotlib, numpy) may have varying versioned dependencies.
>>
>> I was wondering if this could be expanded to allow a version to be set
>> within a package and have
>> that propagate to all modules in that package. For example in the root
>> __init__.py if I set
>> multiversion(tornado, 2.2.1) then all modules in that package will use
>> tornado 2.2.1 when I import
>> tornado.
>>
>> See a relevant issue on github:
>> https://github.com/mitsuhiko/multiversion/issues/1
>>
>> Thank you!
>> Qiang
>>
>> ___
>> Python-Dev mailing list
>> Python-Dev@python.org
>> https://mail.python.org/mailman/listinfo/python-dev
>> Unsubscribe:
>> https://mail.python.org/mailman/options/python-dev/brett%40python.org
>>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] PEP 594: update 1

2019-05-25 Thread Daniel Moisset
Hi, thanks for the work on this proposal. I think this is at least the tip of
the iceberg and a good start on the bigger question of how the stdlib
should evolve.

I think that the PEP should include an idea of what should happen if
existing stdlib pieces depend on these deprecated modules. For example,
email.mime.audio is part of a non-deprecated module but it depends on
sndhdr, which is being planned for deprecation. Should that part of the
functionality be deprecated too? Rewritten to not depend on the deprecated
module (but keeping the relevant pieces of code)? Made to depend on optional
3rd party modules and degrade if those are not available?

The sndhdr case is just an example but I can imagine there are others
(optparse is no longer scheduled for removal in your PEP, but would have
been another case, being used by the profile module).

Best,
Daniel

On Tue, 21 May 2019 at 15:15, Christian Heimes  wrote:

> Hi,
>
> I have updated the PEP with feedback from discussions. The highlights are:
>
> * Deprecate parser module
> * Keep fileinput module
> * Elaborate why crypt and spwd are dangerous and bad
> * Improve sections for cgitb, colorsys, nntplib, and smtpd modules
> * The colorsys, crypt, imghdr, sndhdr, and spwd sections now list suitable
> substitutions.
> * Mention that socketserver is going to stay for http.server and
> xmlrpc.server
> * The future maintenance section now states that the deprecated modules
> may be adopted by Python community members.
>
> https://github.com/python/peps/compare/7799178a...2d536899?diff=unified#diff-ae358c21fa7968ee3b6c64479e051574
>
>
> I'll be traveling the next couple of days and will only have limited
> opportunities to respond on feedback.
>
> Christian
>
> ---
> PEP: 594
> Title: Removing dead batteries from the standard library
> Author: Christian Heimes 
> Status: Draft
> Type: Standards Track
> Content-Type: text/x-rst
> Created: 20-May-2019
> Post-History: 21-May-2019
>
>
> Abstract
> 
>
> This PEP proposes a list of standard library modules to be removed from the
> standard library. The modules are mostly historic data formats and APIs
> that
> have been superseded a long time ago, e.g. Mac OS 9 and Commodore.
>
>
> Rationale
> =
>
> Back in the early days of Python, the interpreter came with a large set of
> useful modules. This was often referred to as the "batteries included"
> philosophy and was one of the cornerstones of Python's success story.
> Users didn't have to figure out how to download and install separate
> packages in order to write a simple web server or parse email.
>
> Times have changed. With the introduction of the cheese shop (PyPI),
> setuptools, and later pip, it became simple and straightforward to
> download and install
> packages. Nowadays Python has a rich and vibrant ecosystem of third party
> packages. It's pretty much standard to either install packages from PyPI or
> use one of the many Python or Linux distributions.
>
> On the other hand, Python's standard library is piling up cruft,
> unnecessary
> duplication of functionality, and dispensable features. This is undesirable
> for several reasons.
>
> * Any additional module increases the maintenance cost for the Python core
>   development team. The team has limited resources, and reduced
>   maintenance cost frees development time for other improvements.
> * Modules in the standard library are generally favored and seen as the
>   de-facto solution for a problem. A majority of users only pick 3rd party
>   modules to replace a stdlib module when they have a compelling reason,
>   e.g. lxml instead of `xml`. The removal of an unmaintained stdlib module
>   increases the chances of a community-contributed module becoming widely
>   used.
> * A lean and mean standard library benefits platforms with limited
> resources
>   like devices with just a few hundred kilobytes of storage (e.g. BBC
>   Micro:bit). Python on mobile platforms like BeeWare or WebAssembly
>   (e.g. pyodide) also benefit from reduced download size.
>
> The modules in the PEP have been selected for deprecation because their
> removal is either least controversial or most beneficial. For example,
> least controversial are 30-year-old multimedia formats like the ``sunau``
> audio format, which was used on SPARC and NeXT workstations in the late
> 1980s. The ``crypt`` module has fundamental flaws that are better solved
> outside the standard library.
>
> This PEP also designates some modules as not scheduled for removal. Some
> modules have been deprecated for several releases or seem unnecessary at
> first glance. However it is benefic

[Python-Dev] python3 -bb and hash collisions

2019-06-18 Thread Daniel Holth
set([u"foo", b"foo]) will error because the two kinds of string have the
same hash, and this causes a comparison. Is that correct?
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MRKTBTY7ZTU3I6SCWRHB533O2QMYA4VR/


[Python-Dev] Re: python3 -bb and hash collisions

2019-06-21 Thread Daniel Holth
Thanks. I think I might like an option to disable str(bytes) without
disabling str != bytes. Unless the second operation would also corrupt
output.

I came across this kind of set in the hyper http library, which uses a set to
accept certain headers with either str or bytes keys.

On Tue, Jun 18, 2019, 13:05 Christian Heimes  wrote:

> On 18/06/2019 18.32, Daniel Holth wrote:
> > set([u"foo", b"foo]) will error because the two kinds of string have the
> > same hash, and this causes a comparison. Is that correct?
>
> Yes, it will fail with -bb, because it turns comparison between str and
> bytes into an error. This can also happen with other strings when
> hash(u'somestring') & mask == hash(b'otherbytes') & mask. The mask of a
> set starts with PySet_MINSIZE - 1 == 7 and increases over time.
>
> Christian
>
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/ZIF2MRBWSMSCFP6E7PZOBI5KYP46QZPK/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R6E7FAR36UO6XHQSIAVF4DIM7G23ADJP/


[Python-Dev] Re: python3 -bb and hash collisions

2019-06-21 Thread Daniel Holth
The answer to bytes == str is just False. That doesn't put b'' in your
database by accident. It could be useful to separate the two kinds of
warnings.

On Fri, Jun 21, 2019, 18:57 Ivan Pozdeev via Python-Dev <
python-dev@python.org> wrote:

> On 22.06.2019 1:08, Daniel Holth wrote:
>
> Thanks. I think I might like an option to disable str(bytes) without
> disabling str != bytes. Unless the second operation would also corrupt
> output.
>
> You can't compare str to bytes without knowing the encoding the bytes are
> supposed to be in (see
> https://stackoverflow.com/questions/49991870/python-default-string-encoding
> for details).
>
> And if you do know the encoding, you can as well compare
> `str.encode(encoding) != bytes` / `str != bytes.decode(encoding)`.
>
>
> Came across this kind of set in the hyper http library which uses a set to
> accept certain headers with either str or bytes keys.
>
> On Tue, Jun 18, 2019, 13:05 Christian Heimes  wrote:
>
>> On 18/06/2019 18.32, Daniel Holth wrote:
>> > set([u"foo", b"foo]) will error because the two kinds of string have the
>> > same hash, and this causes a comparison. Is that correct?
>>
>> Yes, it will fail with -bb, because it turns comparison between str and
>> bytes into an error. This can also happen with other strings when
>> hash(u'somestring') & mask == hash(b'otherbytes') & mask. The mask of a
>> set starts with PySet_MINSIZE - 1 == 7 and increases over time.
>>
>> Christian
>>
>>
>> ___
>> Python-Dev mailing list -- python-dev@python.org
>> To unsubscribe send an email to python-dev-le...@python.org
>> https://mail.python.org/mailman3/lists/python-dev.python.org/
>> Message archived at
>> https://mail.python.org/archives/list/python-dev@python.org/message/ZIF2MRBWSMSCFP6E7PZOBI5KYP46QZPK/
>>
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-leave@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/R6E7FAR36UO6XHQSIAVF4DIM7G23ADJP/
>
> --
> Regards,
> Ivan
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/XAN44UH5X5PYNSHY5ONULXIJF4DLBXF6/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/36DKFLVTBABEZPDX7MYHP7H2TVDZTOHG/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-18 Thread Daniel Moisset
[sorry for the duplicate, meant to reply-all]

Thank you for this approach, I find it really helpful to put the
conversation in these terms (semantics and guiding principles).

This is not an answer to the proposal (which I've read and helps me
contextualize) but to your points below and how they apply to PEP-634. I'm
also answering personally, with a reasonable guess about what the other
authors of 634-636 would agree with, but they may correct me if I'm wrong.

On Mon, 16 Nov 2020 at 14:44, Mark Shannon  wrote:

> (...)
> I believe that a pattern matching implementation must have the following
> properties:
>
> * The semantics must be precisely defined.
> * It must be implemented efficiently.
> * Failed matches must not pollute the enclosing namespace.
> * Objects should be able to determine which patterns they match.
> * It should be able to handle erroneous patterns, beyond just syntax
> errors.
>
> PEP 634 and PEP 642 don't have *any* of these properties.
>

Let me answer this one by one:

1. "The semantics must be precisely defined":
If this happens in PEP634 I don't think it was intentional, and I'm pretty
sure the authors would be happy to complete any incompleteness that it has.
I would happily have a more accurate description (I drafted a non-official
one for a much earlier version of PEP-622,
https://github.com/dmoisset/notebook/blob/master/python/pep622/semantic-specs.md
).
Can you clarify where you see these imprecisions?

2. "It must be implemented efficiently":
I don't think "efficient implementation" was a priority in PEP634,
although I saw your proposal defines this as "same performance as the
equivalent if statement", and I'm quite sure that level of performance can
be achieved (if it isn't already by Brandt's implementation). Finding the
best way to optimise wasn't a priority, but I think if there was anything
in our implementation that would make optimisations harder we would
consider changing it. Do you think anything like that has been
presented?

3. "Failed matches must not pollute the enclosing namespace":
This for me is one of the less-desirable parts of the proposal, and was
agreed more as a matter of practicality and an engineering tradeoff. If you
have a reasonable way of solving this (like putting matched variables on
the stack and popping them later), I'd be much happier putting that
in.

4. "Objects should be able determine which patterns they match."
This is something that you and I, and most of the authors of 622 agree on.
What we found out when discussing this is that it wasn't clear to us what
to open for customization, or how. Some customization options added a lot
of complexity at the cost of performance, while others were very simple but
it wasn't clear that they would be actually useful, or extensible in the
future. This has a lot to do with this being a somewhat new paradigm in
Python, and our lack of knowledge on what the user community may do with it
beyond what we imagined. So the decision was "pattern matching as it is
presented without extensibility is useful, let's get this in, and once we
see how it is used in the wild we'll understand better what kind of
extensibility is valuable". For a crude analogy, imagine trying to get the
descriptor protocol right when the basic python object model was being
implemented. These things happened as different times as the use of the
language evolved, and my opinion is that customization of the match
protocol must follow a similar path.

5. "It should be able to handle erroneous patterns, beyond just syntax
errors."
I'll be answering this based on the example in your document, matching
RemoteCount(success=True,
count=count) where RemoteCount is a namedtuple. The code is likely an
error, and I'm in general all for reporting errors early, but the kind of
error detection proposed here is the kind of errors that python normally
ignore. I find that example really similar to the kind of error you could
make writing "if remcount.count == 3: ..." or going beyond this example
"maxelems = configfile.read(); if len(elems) == maxelems: ...". There are
many type errors that python ignores (especially related to equality), and
Python has already made the decision of allowing mixed type equality
comparisons everywhere, so why do you think pattern matching should be
different with respect to this? In my opinion trying to "fix" this (or even
agreeing on whether this is a bug or not) is a much more general issue unrelated to
pattern matching. Given the current status-quo I normally trust python type
checkers to help me with these errors, and I'd expect them to do the same
with the "erroneous" match statement.
If there are other examples you had in mind when you wrote this I'd also be
happy

[Python-Dev] Re: Questions about about the DLS 2020

2020-11-19 Thread Daniel Moisset
A notable example here of the "not many" is this proposal (i.e. not part
of the language yet) for C++:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1371r0.pdf . I
think it's an interesting example given that this is a very mature
language, not originally designed with pattern matching in mind, where _ is
normally an identifier.

Best, D.

On Tue, 17 Nov 2020 at 19:30, Brett Cannon  wrote:

>
>
> On Mon, Nov 16, 2020 at 9:03 AM Tobias Kohn  wrote:
>
>> Hi Mark,
>>
>> Thank you for your interest and the questions.
>>
>>
>> 1.  This really comes down to how you look at it, or how you define
>> pattern matching.  The issue here is that the concept of pattern matching
>> has grown into a large and somewhat diverse flock of interpretations and
>> implementations (as a side note: interestingly enough, some of the only
>> universally agreed-upon standards are to use `_` as a wildcard and not to
>> mark names that capture/bind values---which are quite exactly the points
>> most fiercely debated here).
>>
> How many of those languages added pattern matching *later* and not at the
> earliest stages of the language (if not from the beginning)? And for those
> that added it later, how many of those didn't already have a convention
> surrounding "_"? My suspicion is "not many" and "not many". 😉
>
> -Brett
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/VZHF7GJ637YULIU53MCCH26T4LWJ2YP6/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/RBVIQ3WG4AFNJ7WKMNSXISR5X3LSJ2KO/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-20 Thread Daniel Moisset
dent and then fixed, it shouldn't break anything (it's
   like the change in dictionary order moving from "undefined/arbitrary" to
   "preserving insertion order") and can be done later on

Thanks again,

Daniel

On Mon, 16 Nov 2020 at 14:44, Mark Shannon  wrote:

> Hi everyone,
>
> There has been much discussion on the syntax of pattern matching for
> Python (in case you hadn't noticed ;)
>
> Unfortunately the semantics seem to have been somewhat overlooked.
> What pattern matching actually does seems at least as important as the
> syntax.
>
>
> I believe that a pattern matching implementation must have the following
> properties:
>
> * The semantics must be precisely defined.
> * It must be implemented efficiently.
> * Failed matches must not pollute the enclosing namespace.
> * Objects should be able to determine which patterns they match.
> * It should be able to handle erroneous patterns, beyond just syntax
> errors.
>
> PEP 634 and PEP 642 don't have *any* of these properties.
>
>
> I've written up a document to specify a possible semantics of pattern
> matching for Python that has the above properties, and includes reasons
> why they are necessary.
>
>
> https://github.com/markshannon/pattern-matching/blob/master/precise_semantics.rst
>
> It's in the format of a PEP, but it isn't a complete PEP as it lacks
> surface syntax.
>
> Please, let me know what you think.
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/BTPODVYLPKY5IHWFKYQJICONTNTRNDB2/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J55MO5VWDYAAAXEE266BVXGLZJLFW2O2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: The semantics of pattern matching for Python

2020-11-22 Thread Daniel Moisset
Others have already replied covering most of my opinion, but there's a
point I'd like to highlight here.

On Fri, 20 Nov 2020 at 14:23, Mark Shannon  wrote:

> Hi Daniel,
>
> On 20/11/2020 10:50 am, Daniel Moisset wrote:
> >  (... snipping for brevity ...)
> >
> >  1. You mention a goal about "erroneous patterns" (which I'm still not
> > sure I agree on), and your proposal addresses that by forcing
> > classes to be explicit (via __atributes__ and __deconstruct__) about
> > what attributes are accepted as matches. This is against one design
> > principle that's not in your list but it was (at least implicitly)
> > in PEP622 and 634: "pattern matching must allow matching objects on
> > types not specifically designed for it"; this will allow to apply
> > this feature to classes that you can not modify (like instances
> > created by a 3rd party library ). That's why PEP634 relies on
> > getattr() ; that could be extended in the feature (providing some
> > override using attributes like yours) but that wouldn't be required
> > by default
>
> Why force pattern matching onto library code that was not designed for
> pattern matching? It seems risky.
>
> Fishing arbitrary attributes out of an object and assuming that the
> values returned by attribute lookup are equivalent to the internal
> structure breaks abstraction and data-hiding.
>
> An object's API may consist of methods only. Pulling arbitrary
> attributes out of that object may have all sorts of unintended
> side-effects.
>
> PEP 634 and the DLS paper assert that deconstruction, by accessing
> attributes of an object, is the opposite of construction.
> This assertion seems false in OOP.
>
>
I think your description about what you and I call OOP lies at the center
of why we're unlikely to agree on what's the best approach on class
patterns.

The Python I write and tend to use allows and encourages the use of the dot
operator (aka "fishing arbitrary attributes out of an object"), and the
libraries I use expose important parts of their API through it, which
doesn't follow the style you mention where "An object's API may consist of
methods only". Of course I rely on properly documented attributes, like
datetime.date.year, Fraction.denominator, JSONDecodeError.msg (in the
stdlib), or fields I defined in my django models, or things like
requests.Response.status_code (in 3rd party libraries that I use as is or
by extending). Those are all types of objects that I'd like the proposal to
allow as subjects of a match statement and attributes that I'd like to have
in lookups. Given that I cannot force a lot of 3rd party libraries to add
a new special attribute `__attributes__`, our proposal relies on the
standard `getattr()` API (which is the core operation in Python's object
model for doing most things with an object, even calling methods).

I've used other OOP languages with that style, where attributes have to be
explicitly exposed, and even so that's discouraged and they follow the
principle of "API = only methods". And I can imagine some people could choose
to write Python like that, even if I don't and most of the libraries I use
don't either. For people with that philosophy about OOP, our pattern
matching proposal will be horrible and useless and they can choose not to
use the match statement, ever. The argumentation behind the proposal is
implicitly based on the assumption that *most* Python developers consider
the dot operator one of the "natural" ways to expose object interfaces in
Python, even if it can pull attributes and use non-method APIs. If you
think that assumption is false, we'll have to agree to disagree but I don't
see how we could get to a common vision on how class patterns should behave.

(...)
> When we added the "with" statement, there was no attempt to force
> existing code to support it. We made the standard library support it,
> and let the community add support as and when it suited them.
>
> We should do the same with pattern matching.
>

With our proposal, nothing forces the existing code to support it; classes
support it naturally using their existing attribute access API (through the
dot operator). One advantage that the with statement had is that it's quite
easy to wrap an object that wasn't designed to be used as a context manager
in another that can (the stdlib even provides a common example of that with
contextlib.closing). Trying to use something like that (writing "match
wrapper(obj): ..." where obj doesn't support some match protocol and the
wrapper adds it) is not viable in pattern matching, because there are many
cases where the object

[Python-Dev] Re: Words rather than sigils in Structural Pattern Matching

2020-11-22 Thread Daniel Moisset
On Sun, 22 Nov 2020 at 00:31, Greg Ewing 
wrote:

> On 22/11/20 6:47 am, David Mertz wrote:
> > I'm convinced by Guido,
> > Brandt, and others that the binding  use will be far more common, so
> > adding extra characters for the 90% case does not feel desirable
>
> Minimising the number of characters is not the only consideration.
> Readability counts too, and I think the proposed DWIM rules suffer
> in the readability area.
>
> There are about five different contexts in which a bare name can
> appear as part of a match case:
>
> * As a constructor name
> * As a bare name in an argument position
> * As part of a dotted expression
> * On the left of an =
> * On the right of an =
>
> Only in some of those contexts is it treated as a name to be
> assigned. That's a fairly complex bit of mental parsing to do
> when reading a case.
>
>
Hi Greg,

Look at the following (non-pattern-matching) snippet:

event = datetime.date(x, month=y, day=z)


you have bare names appearing as:
* a constructor name (date)
* bare name in an argument position (x)
* part of a dotted expression (datetime, date)
* left of = (event, month, day)
* right of = (y, z)

some of them are lookups in the namespace (datetime, x, y, z). Others are
targets to write (event). "date" is actually a literal to be used as
attribute lookup, for reading (a key in the module dictionary). "month" and
"day" are literals to lookup in the list of formal parameters of a
function, for writing into the new function stack frame.

I never found Python students struggle too much with the line above, and I
don't expect it to be much different for names in patterns. Names in Python
can appear with a lot of different "roles" in a lot of different contexts,
but the context is usually easy to recognise (is it left or right of a
chain of dots? does it have an = on the left or right? is it in an argument
list? a parameter list?).

Of course I can never deny if you tell me "I find this syntax hard to read"
(only a person can decide what's easy or hard to read for them), but I
don't think this "bare names appear in many contexts" is a solid
explanation.

A hypothesis I have is that many readers of the PEP have been scared of
these semantics because they have been laid bare when explaining them in
detail, but would never notice in day to day life, in the same way that
most Python users would never stop twice to look at the meaning of "a.b.c =
0", even if a semantic description requires us to describe that each of the
three names has a different role ("a" is a lookup in the current scopes,
"b" is a constant name for the purpose of reading attributes, and "c" is a
constant name for the purpose of writing attributes). But looking at the
details description of what's going on under the hood is scarier than the
intuitive look, and when people read the PEP they are looking under the
hood rather than looking at this as users. This is guesswork and could be
absolutely wrong, but I hope it helps you read this with a different set of
eyes.



> --
> Greg
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/LMALNSPBLL3MHFOJHBGGO24IDS6CI5R3/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6VKTXLNWA6ZAXS6TQRT6Z3HS7F2OQLIL/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Pattern Matching Scope

2020-12-04 Thread Daniel Moisset
This is an answer to "what PEP 634 proposes":

On Fri, 4 Dec 2020 at 19:18, Jim J. Jewett  wrote:

> (...)
> I'm getting a bit confused over when people mean "the PEP currently says"
> vs "the implementation probably should" vs "the PEP should additionally
> require" vs "the PEP should instead say".
>
> To be more specific, I'm not sure what is intended for the 2nd or 3rd case
> below, which reuse a variable "bound" by the first (failed) match.  Nor am
> I sure whether it matters that the first match fails on the guard
> predicate, instead of immediately on the match.
>
> case (a, b, c) if f():  # assume f() returns false
>

This will guarantee binding of a, b, and c when the subject is a sequence
of 3 elements. Otherwise, which are assigned is implementation defined.
None of this is affected in any way by the behaviour of f, and the
binding is guaranteed to be done before f is called.


>
> case (a, b) if a == c:  # is a still bound from case above?  Is that
> implementation-dependent?
>

This will guarantee binding of a and b when the subject is a sequence of 2
elements. Otherwise, which are assigned is implementation defined. None
of this is affected in any way by the behaviour of a==c, and the binding is
guaranteed to be done before the equality is computed.


>
> case (d = a):  # is a still bound from case above?  Is that
> implementation-dependent?  Is it even still possible to put restrictions in
> before the guard clause, like d=4?
>

This is a SyntaxError, (d=a) is not a pattern.

To summarize:
* on a successful match of the *pattern* (ignoring the guard), all captured
variables are bound, and that happens before the guard is evaluated
* on a failed match, an arbitrary subset of the variables may be bound, and
the guard is guaranteed to not be evaluated.
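
A small sketch (mine, requires Python 3.10+) of the rules summarized above:
the names in a pattern are bound before the guard runs, and a failed guard
does not undo those bindings.

def describe(subject):
    match subject:
        case (a, b, c) if a > c:   # a, b, c are already bound when the guard runs
            return f"ordered triple starting with {a}"
        case (a, b):               # fresh bindings of a and b for this case
            return f"pair {a}, {b}"
        case _:
            return "no match"

print(describe((3, 2, 1)))   # ordered triple starting with 3
print(describe((1, 2, 3)))   # guard fails, so this falls through to "no match"
print(describe((1, 2)))      # pair 1, 2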

> My previous belief was that this was implementation defined, because the
> cases could be processed in parallel, so that the first case might not have
> finished by the time variable a was needed in the later cases.  My reading
> of PEP 634 suggests that there is a linearization, but only of the guards,
> so ... now I am not sure.
>
> -jJ
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/F5KRIPR4HUDG6TIUWLYBD6CBUFKPLYVF/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/V5DDEGA6CQLBQCMMRCS5QUKGTN37R6BC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Concerns about PEP 634

2021-02-06 Thread Daniel Moisset
Hi Mark,

I think some of these issues have already been raised and replied to (even if
no agreement has been reached), but this is a good summary, so let me reply
with a summary of responses to it.

On Sat, 6 Feb 2021 at 15:51, Mark Shannon  wrote:

> Hi,
>
> Since a decision on PEP 634 is imminent, I'd like to reiterate some
> concerns that I voiced last year.
>
> I am worried about the semantics and implementation of PEP 634.
> I don't want to comment on the merits of pattern matching in general, or
> the proposed syntax in PEP 634 (or PEP 640 or PEP 642).
>
> Semantics
> -
>
> 1. PEP 634 eschews the object model, in favour of adhoc instance checks,
> length checks and attribute accesses.
>
> This is in contrast to almost all of the the rest of the language, which
> uses special (dunder) methods:
>All operators,
>subscripting,
>attribute lookup,
>iteration,
>calls,
>tests,
>object creation,
>conversions,
>and the with statement
>
> AFAICT, no justification is given for this.
> Why can't pattern matching use the object model?
>

No one has said that "we can't". It's just that "we don't have to". The
underlying mechanisms used by pattern matching (instance check, length,
attribute access) already have their defined protocols and support within
the object model. It's analogous to the way in which iterable unpacking
didn't need to define its own object model special methods, because the
existing iteration mechanism used in for loops was sufficient.
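
A tiny illustration of that analogy (my own example): unpacking reuses the
ordinary iteration protocol rather than a dedicated "unpacking" dunder, just
as class patterns reuse isinstance() and getattr().

class Pair:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __iter__(self):        # the normal iteration protocol...
        yield self.x
        yield self.y

a, b = Pair(1, 2)              # ...is all that "a, b = obj" needs
print(a, b)                    # 1 2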

This does not exclude possible future extensions to the object model to
include a richer protocol like described in
https://www.python.org/dev/peps/pep-0622/#custom-matching-protocol (where
it also describes why we're not proposing that *now*, why it can be done
later, and why we think it's best to do it later)


>
> PEP 343 (the "with" statement) added the __enter__ and __exit__ methods
> to the object model, and that works very well.
>
>
> 2. PEP 634 deliberately introduces a large area of undefined behaviour
> into Python.
>
>
> https://www.python.org/dev/peps/pep-0634/#side-effects-and-undefined-behavior
>
> Python has, in general, been strict about not having undefined behaviour.
> Having defined semantics means that you can reason about programs, even
> non-idiomatic ones.
> [This is not unique to Python, it is really only C and C++ that have
> areas of undefined behaviour]
>

The C standard uses a very peculiar definition of "undefined behaviour"
(I'm not familiar enough with the C++ one to assert anything; I'll assume
it's the same), where for a certain set of programs, any resulting behaviour
is valid, even at compile time (so a compiler that deletes all your files
when trying to compile "void f() {int a[10]; a[10]=0;}" is standard compliant).
Comparing that with the use of the term "undefined behaviour" in the PEP is
not very useful, because even if they are the same words, they don't have
the same meaning.

If you want to compare it with the C standards, the term we'd use would be
"implementation defined behaviour". Python has a lot of those. For example,
the output of all these python programs can change between implementations
(and some of these even change between versions of CPython):

   - print({3,2,1})
   - print(hash("hello"))
   - if id(1) == id(1): print("hello")
   - import sys; print(sys.getsizeof([]))

Some order of operations is also implementation dependent, for example in
def foo():
   print(open("/etc/passwd").read())
foo()

The moment when the file is closed is implementation dependent.

The section you linked to introduces behaviour in similar lines to all of
the above.


> I can see no good reason for adding undefined behaviour. It doesn't help
> anyone.
>
>
It helps with point 3 that you mention (see below).


> The lack of precise semantics makes programs harder to understand, and
> it makes the language harder to implement.
> If the semantics aren't specified, then the implementation becomes the
> specification.
> This bakes bugs into the language and makes it harder to maintain,
> as bug-for-bug compatibility must be maintained.
>
>
> 3. Performance
>
> PEP 634 says that each pattern must be checked in turn.
> That means that multiple redundant checks must be performed on (for
> example) a sequence if there are several mapping patterns.
> This is unnecessarily slow.
>

What PEP 634 says about this (unless we're reading different sections, a
quote could help here) is that the semantics are those of selecting the *first
case block whose pattern successfully matches the subject and whose guard
condition (if present) is "truthy"*.

As long as the semantics are respected, a smart compiler could reduce some
of the redundant checks (and that's *precisely* why the order of the checks
is left as implementation dependent).

It's important here to separate "sequential semantics" vs "implemented as
sequential checks". For an analogy, look at the "serializable" concept in
databases, where the outcome of multipl

[Python-Dev] Re: PEP 653: Precise Semantics for Pattern Matching

2021-02-23 Thread Daniel Moisset
In addition to the changes proposed here that go beyond PEP-634 (which
other people discuss), I find that many of the definitions fail to capture
some of the basic features of PEP-634, especially when nesting patterns.

Take for example: "case [int(), str()]". According to
https://www.python.org/dev/peps/pep-0653/#sequence-patterns that desugars
to:

if $kind & MATCH_SEQUENCE:
if $list is None:
$list = list($value)
if len($list) == len([int(), str()]):
int(), str() = $list   *# This is invalid!*
if guard:
   DONE
In general, the way these semantics are defined non-recursively over
patterns means they won't be able to handle nested patterns.

I value the idea of having a more accurate definition of what the semantics
are, but this approach doesn't seem to work.

Best,   D.


On Thu, 18 Feb 2021 at 16:45, Mark Shannon  wrote:

> Hi everyone,
>
> I'd like to announce a new PEP.
> https://www.python.org/dev/peps/pep-0653/
>
> It builds on PEP 634 (Structural Pattern Matching: Specification), adding:
>
> More precise semantics.
> A bit more customization for complex classes.
>
> Its also a bit more robust and should be faster (eventually).
>
> The syntax is unchanged, and in most cases the semantics are the same.
>
> As always, comments and suggestions are welcome.
>
> Cheers,
> Mark.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/O4I345VHIQUXPDJWPDA54RWVXMYEEBRM/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6RGSDHU7ZB77RTWUS7IBTMCZNLV4NBJH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: aiter/anext review request

2021-03-20 Thread Daniel Pope
As someone who was involved in implementing these, I think they should not
be in builtins if that means they have to be in C.

My argument is from the point of view of maintainability. Writing them was
plenty of effort: Josh had originally written them in idiomatic async Python;
my contribution was to unroll that to sync Python code,
and then port that to (sync) C code. It was a lot of effort and a lot of
code - several hundred lines and 4(?) new types. The Python code was a few
lines - very readable and likely to be practically as fast. We weren't
writing this in C to speed it up or to make the code better, but because we
*had to*.
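
For reference, a rough sketch (mine, not the actual CPython or original
Python code) of what such pure-Python versions could look like, assuming
they simply delegate to __aiter__/__anext__ and optionally return a default
on exhaustion:

def aiter_sketch(async_iterable):
    # Mirror of iter(): delegate to the type's __aiter__.
    return type(async_iterable).__aiter__(async_iterable)

async def anext_sketch(async_iterator, *default):
    # Mirror of next(): delegate to __anext__, returning a default on exhaustion.
    try:
        return await type(async_iterator).__anext__(async_iterator)
    except StopAsyncIteration:
        if default:
            return default[0]
        raise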

Implementing async functionality in C is a pain because to implement an
awaitable type you need not just that awaitable type, but a new type to
represent the iterator that am_await returns. I could imagine having
generic type objects and other helpers for implementing async PyObjects in
C but I don't really envisage anyone doing that; if you want to write async
helpers for Python the best framework is Python.

As Josh can attest I was in two minds while implementing this change; I
argued firstly that having them in the operator module is fine, and later,
that if we want async builtins in general, maybe we could implement them in
Python and freeze them into the binary. We pushed on with the C approach
mostly because we were already 70% done, and this was what Yury asked for,
so it seemed more likely that this would get merged.

But, if we're still discussing whether this should be merged in builtins or
operator, and that dictates whether it is in Python or C, I'm 100% behind
having this code be Python.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VZFDUDH3NIBZX4ADPJ4E7VG2WAWOUBAA/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: PEP-376 and PEP-427 interpretation

2021-03-31 Thread Daniel Holth
I meant to exclude md5 and sha1, i.e. hash functions with known problems.
SHA224 would be a weird choice but it wouldn't personally offend me
otherwise. It would be fun to see how many wheel handlers support
non-sha256 hash functions.
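
For anyone experimenting with this, a small sketch (mine) of producing a
RECORD-style hash entry as those PEPs describe it: the algorithm name, then
the urlsafe base64 of the digest with the trailing padding stripped.

import base64
import hashlib

def record_hash(data: bytes, algorithm: str = "sha256") -> str:
    digest = hashlib.new(algorithm, data).digest()
    b64 = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{algorithm}={b64}"

print(record_hash(b"example wheel member contents"))
print(record_hash(b"example wheel member contents", "sha224"))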

On Mon, Mar 29, 2021 at 9:56 PM Theallredman via Python-Dev <
python-dev@python.org> wrote:

> Thank you.  I can't think of a compelling reason someone would want to
> choose SHA224 over SHA256 in the context of wheel generation.  It just that
> the PEPs are usually pretty explicit and SHA224 seemed to be implicitly
> excluded from RECORD files.  And I'm considering the details of making a
> pretty pedantic wheel generation PEP517 backend.
>
> Eldon
>
> ‐‐‐ Original Message ‐‐‐
> On Monday, March 29, 2021 2:16 PM, Paul Moore  wrote:
>
> > On Mon, 29 Mar 2021 at 17:40, Theallredman via Python-Dev
> > python-dev@python.org wrote:
> >
> > > So going back to my actual question: SHA224 is disallowed in RECORD
> files because its bit length is less than 256?
> >
> > It doesn't look like it's ever been excluded. The only explicit
> > exclusions are MD5 and SHA1 as you point out. Do you have a particular
> > reason to want to use SHA224? Pretty much everyone is using SHA256, as
> > far as I know.
> >
> > Paul
>
>
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/NB6MLDDDJHRTRMOEWDFG5IYGZCP65K6V/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/P4575PDTQCUYMYZVOSJPYGD3J5YKNCFY/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] cmd.Cmd.prompt should be an instance attribute

2021-06-26 Thread Daniel Walker
I was recently using the cmd module for a project where my CLI
could connect to and interact with another host.  I implemented prompt in
such a way that it would show the IP address when connected.  I.e.,

class MyCmd(cmd.Cmd):
...

@property
def prompt(self) -> str:
if self.remote_host.connected():
return f'> ({self.remote_host.ip}) '
else:
return '> '

This worked perfectly fine... until I ran mypy.  mypy complained because,
in cmd.Cmd, prompt is a class attribute.

Looking at cmd.py, this seems like an odd design choice as all of the
references to prompt are through the instance (i.e., self.prompt).

While it's easy to fix this in my subclass by overriding the onecmd method
and setting self.prompt there, this seems kludgy ("Beautiful is better than
ugly").  More importantly, this whole arrangement obscures the fact that
prompt is a de facto attribute of the instance and not the class.  It's not
the class that's displaying the prompt but the instance.
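
For concreteness, a sketch of that onecmd workaround (it assumes the
remote_host attribute from the example above is set up elsewhere):

import cmd

class MyCmd(cmd.Cmd):
    def onecmd(self, line):
        # Recompute the prompt before handling each command; self.remote_host
        # is assumed to be initialized elsewhere, as in the example above.
        if self.remote_host.connected():
            self.prompt = f'> ({self.remote_host.ip}) '
        else:
            self.prompt = '> '
        return super().onecmd(line)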

It seems more intuitive for cmd.Cmd to be implemented thus:

   class Cmd:
PROMPT = '> '

def __init__(self):
...

self.prompt = self.PROMPT
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PCXEIWFOJ23ZVFWXW7NGTOSMAKKGMQ4Z/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: cmd.Cmd.prompt should be an instance attribute

2021-06-26 Thread Daniel Walker
Sorry, that implementation should have been

class Cmd:
PROMPT = '> '

@property
def prompt(self) -> str:
return self.PROMPT
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Q7TSFJD4EPUXXJFT5TMBUODIVL4CLR4O/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: cmd.Cmd.prompt should be an instance attribute

2021-06-26 Thread Daniel Walker
I wasn't looking at the type stub but cmd.py itself.  It has

PROMPT = '(Cmd) '
...

class Cmd:
prompt = PROMPT
...

On Sat, Jun 26, 2021 at 6:04 PM Guido van Rossum  wrote:

> On Sat, Jun 26, 2021 at 9:25 AM Daniel Walker  wrote:
>
>> I was recently using the cmd module for a project where my CLI
>> could connect to and interact with another host.  I implemented prompt in
>> such a way that it would show the IP address when connected.  I.e.,
>>
>> class MyCmd(cmd.Cmd):
>> ...
>>
>> @property
>> def prompt(self) -> str:
>> if self.remote_host.connected():
>> return f'> ({self.remote_host.ip}) '
>> else:
>> return '> '
>>
>> This worked perfectly fine... until I ran mypy.  mypy complained because,
>> in cmd.Cmd, prompt is a class attribute.
>>
>> Looking at cmd.py, this seems like an odd design choice as all of the
>> references to cmd are through the instance (i.e., self.prompt).
>>
>
> You misread the typeshed stub. Where you see these lines in cmd.pyi
>
> class Cmd:
> prompt: str
> identchars: str
> ruler: str
> ...
>
> those are all instance attribute declarations.
>
> I think that you're running into a different mypy bug, which is that you
> can't override a plain attribute with a property in a subclass.
>
> I think there's already a bug for that in the mypy tracker, but I can't
> find it right now.
>
> --
> --Guido van Rossum (python.org/~guido)
> *Pronouns: he/him **(why is my pronoun here?)*
> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XAENJVNK7US2BRUPOIFIUMY2N7MTHCWT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: cmd.Cmd.prompt should be an instance attribute

2021-06-26 Thread Daniel Walker
Found the mypy bug:

https://github.com/python/mypy/issues/4125

On Sat, Jun 26, 2021 at 7:35 PM Daniel Walker  wrote:

> I wasn't looking at the type stub but cmd.py itself.  It has
>
> PROMPT = '(Cmd) '
> ...
>
> class Cmd:
> prompt = PROMPT
> ...
>
> On Sat, Jun 26, 2021 at 6:04 PM Guido van Rossum  wrote:
>
>> On Sat, Jun 26, 2021 at 9:25 AM Daniel Walker 
>> wrote:
>>
>>> I was recently using the cmd module for a project where my CLI
>>> could connect to and interact with another host.  I implemented prompt in
>>> such a way that it would show the IP address when connected.  I.e.,
>>>
>>> class MyCmd(cmd.Cmd):
>>> ...
>>>
>>> @property
>>> def prompt(self) -> str:
>>> if self.remote_host.connected():
>>> return f'> ({self.remote_host.ip}) '
>>> else:
>>> return '> '
>>>
>>> This worked perfectly fine... until I ran mypy.  mypy complained
>>> because, in cmd.Cmd, prompt is a class attribute.
>>>
>>> Looking at cmd.py, this seems like an odd design choice as all of the
>>> references to cmd are through the instance (i.e., self.prompt).
>>>
>>
>> You misread the typeshed stub. Where you see these lines in cmd.pyi
>>
>> class Cmd:
>> prompt: str
>> identchars: str
>> ruler: str
>> ...
>>
>> those are all instance attribute declarations.
>>
>> I think that you're running into a different mypy bug, which is that you
>> can't override a plain attribute with a property in a subclass.
>>
>> I think there's already a bug for that in the mypy tracker, but I can't
>> find it right now.
>>
>> --
>> --Guido van Rossum (python.org/~guido)
>> *Pronouns: he/him **(why is my pronoun here?)*
>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>>
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZJ6FN2DNDYN66RMC6UEYBQPTQ5SAPQEG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: cmd.Cmd.prompt should be an instance attribute

2021-06-26 Thread Daniel Walker
Thank you

On Sat, Jun 26, 2021 at 8:10 PM Guido van Rossum  wrote:

> Okay, then Chris Barker’s explanation applies.
>
> On Sat, Jun 26, 2021 at 16:35 Daniel Walker  wrote:
>
>> I wasn't looking at the type stub but cmd.py itself.  It has
>>
>> PROMPT = '(Cmd) '
>> ...
>>
>> class Cmd:
>> prompt = PROMPT
>> ...
>>
>> On Sat, Jun 26, 2021 at 6:04 PM Guido van Rossum 
>> wrote:
>>
>>> On Sat, Jun 26, 2021 at 9:25 AM Daniel Walker 
>>> wrote:
>>>
>>>> I was recently using the cmd module for a project where my CLI
>>>> could connect to and interact with another host.  I implemented prompt in
>>>> such a way that it would show the IP address when connected.  I.e.,
>>>>
>>>> class MyCmd(cmd.Cmd):
>>>> ...
>>>>
>>>> @property
>>>> def prompt(self) -> str:
>>>> if self.remote_host.connected():
>>>> return f'> ({self.remote_host.ip}) '
>>>> else:
>>>> return '> '
>>>>
>>>> This worked perfectly fine... until I ran mypy.  mypy complained
>>>> because, in cmd.Cmd, prompt is a class attribute.
>>>>
>>>> Looking at cmd.py, this seems like an odd design choice as all of the
>>>> references to cmd are through the instance (i.e., self.prompt).
>>>>
>>>
>>> You misread the typeshed stub. Where you see these lines in cmd.pyi
>>>
>>> class Cmd:
>>> prompt: str
>>> identchars: str
>>> ruler: str
>>> ...
>>>
>>> those are all instance attribute declarations.
>>>
>>> I think that you're running into a different mypy bug, which is that you
>>> can't override a plain attribute with a property in a subclass.
>>>
>>> I think there's already a bug for that in the mypy tracker, but I can't
>>> find it right now.
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>> *Pronouns: he/him **(why is my pronoun here?)*
>>> <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-change-the-world/>
>>>
>> --
> --Guido (mobile)
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/K5QNGOF4WOYL6NN2KKDLBGFQKN5THXNJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Python multithreading without the GIL

2021-10-08 Thread Daniel Pope
On Fri, 8 Oct 2021 at 03:50, Sam Gross  wrote:
> My goal with the proof-of-concept is to demonstrate that removing the GIL is 
> feasible and worthwhile, and that the technical ideas of the project could 
> serve as a basis of such an effort.

I'm a novice C programmer, but I'm unsure about the safety of your
thread-safe collections description. You describe an algorithm for
lock-free read access to list items as

1. Load the version counter from the collection
2. Load the “backing array” from the collection
3. Load the address of the item (from the “backing array”)
4. Increment the reference count of the item, if it is non-zero
(otherwise retry)
5. Verify that the item still exists at the same location in the
collection (otherwise retry)
6. Verify that the version counter did not change (otherwise retry)
7. Return the address of the item

But you do the bounds check for the index before this, here[1]. If the
thread is suspended after this and before you read the address of the
backing array [2], the list could have been resized (shrunk), and the
backing array reallocated from a new memory block. So the pointer you
read at 3 could be from uninitialized memory that is beyond the size
of the array (or within the array but larger than the current number
of items). And then you write to it at 4 which is then a write into a
random memory location.

[1] 
https://github.com/colesbury/nogil/blob/fb6aabede5f7f1936a21c2f48ec7fcc0848d74bf/Objects/listobject.c#L137
[2] 
https://github.com/colesbury/nogil/blob/fb6aabede5f7f1936a21c2f48ec7fcc0848d74bf/Objects/listobject.c#L141
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/6J6XFEACF2C6XPLZRVABUFFHJICUTZCS/
Code of Conduct: http://python.org/psf/codeofconduct/


Re: [Python-Dev] BDFL delegation for PEP 426 (PyPI metadata 1.3)

2013-02-15 Thread Daniel Holth
On Fri, Feb 15, 2013 at 12:27 PM, Erik Bray  wrote:

> On Sun, Feb 3, 2013 at 5:24 PM, Vinay Sajip 
> wrote:
> > Éric Araujo  netwok.org> writes:
> >
> >> Looks like we agree that a basic tool able to bootstrap the packaging
> >> story is needed :)
> >
> > Agreed. Just because distutils can't easily/reliably build things that
> are
> > better built with SCons/WAF/tup/whatever, doesn't mean that we shouldn't
> have
> > the ability to build pure-Python distributions and distributions
> including C
> > libs and extensions, with the ability to extend easily by third-party
> tools. It
> > just needs to be done in a way which is easy to build on, so the included
> > battery stays small and simple. Easier said than done, I know :-)
> >
> > Regards,
> >
> > Vinay Sajip
>
> Sorry to revive an old-ish discussion--I'm just catching up on things.
>  But I just wanted to add that distutils is still pretty okay for
> building reasonably complex projects.  Although it does not rise to
> the level of complexity of Numpy or SciPy, the Astropy project
> (https://github.com/astropy/astropy) has managed to put together a
> pretty nice build system on top of mostly-plain distutils (it does use
> distribute but primarily just for 2to3 support).
>
>
> This has necessitated a number of hacks to overcome shortcomings and
> bugs in distutils, but most of those shortcomings could probably be
> fixed in distutils within the framework of a slightly lifted freeze.
> But in any case I haven't found it worthwhile to switch to something
> like bento when the batteries included in the stdlib have been mostly
> Good Enough. Having fewer installation dependencies has also made it
> significantly easier for non-advanced users to install. Even the
> distribute requirement doesn't add too much overhead, as most users
> have it on their systems by default now, and for those who don't
> distribute_setup.py works okay.
>
> TL;DR, strong -1 on the stdlib "getting out of the build business".
> Also as I think Nick already mentioned one of the wins of
> Setup-Requires-Dist is to have a standard way to bring in extra build
> requirements (such as bento) so if we have better support for a
> feature like that it's not necessary to "bless" any preferred tool.
>

Distutils is not really going away. We need it to build the existing 28,000
packages. However, empirically it seems that if you try to write a significant
extension to or improvement of distutils, you are likely to get burnt out and
switch careers.

Instead of literally killing distutils we hope to make it very easy to use
other build tools when you need them and not use any build tools at all
when you don't. As a thought experiment: what if one of those third party
build tools hosted on pypi was distutils itself? What would you need to do
to make that happen?

The packaging PEPs (PEP 376 and so on) are brilliant because they are simple
enough to be implemented twice. If we had better ways to separate interface
from implementation in Python, I'd like to see two implementations of whatever
else we come up with for packaging.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] Submitting PEP 425 (compatibility tags)

2013-02-15 Thread Daniel Holth
This is the improved compatibility tags PEP 425, specifying how part of the
wheel (PEP 427) filenames works. Last time we discussed whether replacing `.`
with `_` was ugly, but I concluded it was harmless.

Submitted for your consideration,

PEP: 425
Title: Compatibility Tags for Built Distributions
Version: $Revision$
Last-Modified: 07-Aug-2012
Author: Daniel Holth 
BDFL-Delegate: Nick Coghlan 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 27-Jul-2012
Python-Version: 3.4
Post-History: 8-Aug-2012, 18-Oct-2012, 15-Feb-2013


Abstract
========

This PEP specifies a tagging system to indicate with which versions of
Python a built or binary distribution is compatible.  A set of three
tags indicate which Python implementation and language version, ABI,
and platform a built distribution requires.  The tags are terse because
they will be included in filenames.


PEP Editor's Note
=================

While the naming scheme described in this PEP will not be supported directly
in the standard library until Python 3.4 at the earliest, draft
implementations may be made available in third party projects.


Rationale
=========

Today "python setup.py bdist" generates the same filename on PyPy
and CPython, but an incompatible archive, making it inconvenient to
share built distributions in the same folder or index.  Instead, built
distributions should have a file naming convention that includes enough
information to decide whether or not a particular archive is compatible
with a particular implementation.

Previous efforts come from a time where CPython was the only important
implementation and the ABI was the same as the Python language release.
This specification improves upon the older schemes by including the Python
implementation, language version, ABI, and platform as a set of tags.

By comparing the tags it supports with the tags listed by the
distribution, an installer can make an educated decision about whether
to download a particular built distribution without having to read its
full metadata.

Overview
========

The tag format is {python tag}-{abi tag}-{platform tag}

python tag
    'py27', 'cp33'
abi tag
    'cp32dmu', 'none'
platform tag
    'linux_x86_64', 'any'

For example, the tag py27-none-any indicates compatible with Python 2.7
(any Python 2.7 implementation) with no abi requirement, on any platform.

Use
===

The `wheel` built package format includes these tags in its filenames,
of the form ``{distribution}-{version}(-{build tag})?-{python tag}-{abi
tag}-{platform tag}.whl``.  Other package formats may have their own
conventions.
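
As an illustration, an installer might pull the tags back out of such a
filename with something like the following sketch (not part of this PEP;
the helper name and error handling are made up)::

    def parse_wheel_filename(filename):
        # Illustrative only: split a wheel filename into its components.
        if not filename.endswith(".whl"):
            raise ValueError("not a wheel filename: %r" % filename)
        parts = filename[:-len(".whl")].split("-")
        if len(parts) == 5:
            distribution, version, python, abi, platform = parts
            build = None
        elif len(parts) == 6:
            distribution, version, build, python, abi, platform = parts
        else:
            raise ValueError("unexpected number of fields in %r" % filename)
        return distribution, version, build, python, abi, platform

    # ('distribution', '1.0', None, 'py27', 'none', 'any')
    parse_wheel_filename("distribution-1.0-py27-none-any.whl")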

Details
=======

Python Tag
----------

The Python tag indicates the implementation and version required by
a distribution.  Major implementations have abbreviated codes, initially:

* py: Generic Python (does not require implementation-specific features)
* cp: CPython
* ip: IronPython
* pp: PyPy
* jy: Jython

Other Python implementations should use `sys.implementation.name`.

The version is `py_version_nodot`.  CPython gets away with no dot, but
if one is needed the underscore `_` is used instead.  PyPy should
probably use its own versions here, e.g. `pp18`, `pp19`.

The version can be just the major version `2` or `3` (giving `py2`,
`py3`) for many pure-Python distributions.

Importantly, major-version-only tags like `py2` and `py3` are not
shorthand for `py20` and `py30`.  Instead, these tags mean the packager
intentionally released a cross-version-compatible distribution.

A single-source Python 2/3 compatible distribution can use the compound
tag `py2.py3`.  See `Compressed Tag Sets`, below.
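
A sketch of how a tool might derive an implementation-specific Python tag
from the rules above (illustrative only; it assumes a 3.3+ interpreter
where `sys.implementation` exists)::

    import sys
    import sysconfig

    # Map the well-known implementations to the abbreviations listed
    # above; fall back to the full implementation name otherwise.
    ABBREVIATIONS = {"cpython": "cp", "pypy": "pp",
                     "ironpython": "ip", "jython": "jy"}

    impl = ABBREVIATIONS.get(sys.implementation.name, sys.implementation.name)
    version = (sysconfig.get_config_var("py_version_nodot")
               or "%d%d" % sys.version_info[:2])      # e.g. "33"
    python_tag = "%s%s" % (impl, version)             # e.g. "cp33"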

ABI Tag
-------

The ABI tag indicates which Python ABI is required by any included
extension modules.  For implementation-specific ABIs, the implementation
is abbreviated in the same way as the Python Tag, e.g. `cp33d` would be
the CPython 3.3 ABI with debugging.

The CPython stable ABI is `abi3` as in the shared library suffix.

Implementations with a very unstable ABI may use the first 6 bytes (as
8 base64-encoded characters) of the SHA-256 hash of their source code
revision and compiler flags, etc., but will probably not have a great need
to distribute binary distributions. Each implementation's community may
decide how best to use the ABI tag.

Platform Tag
------------

The platform tag is simply `distutils.util.get_platform()` with all
hyphens `-` and periods `.` replaced with underscore `_`.

* win32
* linux_i386
* linux_x86_64
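
A one-line sketch of that substitution (illustrative only)::

    import distutils.util

    # e.g. 'linux-x86_64' -> 'linux_x86_64'
    platform_tag = distutils.util.get_platform().replace("-", "_").replace(".", "_")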

Use
===

The tags are used by installers to decide which built distribution
(if any) to download from a list of potential built distributions.
The installer maintains a list of (pyver, abi, arch) tuples that it
will support.  If the built distribution's tag is `in` the list, then
it can be installed.

For example, an installer running under CPython 3.3 on a linux_x86_64
system might support::

 1. cp33-cp33m-linux_x86_64
 2. cp33-abi3-linux_x86_64
 3. cp33-none-linux_x86_64
 4. cp33-none-any
 5. cp3-none-any
 6. cp32-none
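
In code the check is plain membership of the wheel's tag triple in that
list, something like the following sketch (illustrative only; the supported
list is abbreviated from the example above)::

    supported = [
        ("cp33", "cp33m", "linux_x86_64"),
        ("cp33", "abi3", "linux_x86_64"),
        ("cp33", "none", "linux_x86_64"),
        ("cp33", "none", "any"),
        ("cp3", "none", "any"),
        # ... and so on, in order of preference
    ]

    def is_installable(python_tag, abi_tag, platform_tag):
        return (python_tag, abi_tag, platform_tag) in supported

    is_installable("cp33", "none", "any")   # True
    is_installable("py27", "none", "any")   # False for this installer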

Re: [Python-Dev] PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-17 Thread Daniel Holth
On Sun, 17 Feb 2013 20:11:23 +1000
Nick Coghlan  wrote:
>
> Major changes since the last draft:
>
> 1. Metadata-Version is 2.0 rather than 1.3, and the field now has the
> same major.minor semantics as are defined for wheel versions in PEP
> 427 (i.e. if a tool sees a major version number it doesn't recognise,
> it should give up rather than trying to guess what to do with it,
> while it's OK to process a higher minor version)

So is it possible to bundle several versioned metadata files in order
to maximize compatibility? Or does a distribution with 2.0 metadata
have to abandon all hope of being compatible with all tools?

Regards

Antoine.

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com

Not likely to matter for a while as the current md v1 tools don't
understand this new obsolescence rule :-)
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fwd: PEP 426 is now the draft spec fordistribution metadata 2.0

2013-02-19 Thread Daniel Holth
On Feb 19, 2013 6:57 AM,  wrote:

> > I've never seen environment markers being used or supported
> > in the wild.
> >
> > I'm not against modernizing the format, but given that version 1.2
> > has been out for around 8 years now, without much following,
> > I think we need to make the implementation bit a requirement
> > before accepting the PEP.
>
> Elephant in the room? Something that's been around for so long and hasn't
> got any sort of mainstream support is *not* a good candidate for a
> requirement, but maybe it's me being naive.
>

I think 1.2 was started and then amended over a long period of time to
become what it is today. I wanted to edit it again just to add
Provides-Extra. It turned out to be more practical to make a new PEP. Nick
explains that for complicated reasons the implementation of Metadata 1.2
was not successful. For Metadata 2.0 we had a different strategy: put it
into distribute first. You will find that this aids adoption. distlib will
provide another implementation that is in many ways interchangeable.

We also have three implementations of an installer for the related wheel
format, three implementations of the wheel builder, and two surprisingly
short implementations of environment markers. Yes, the wheel project has
the same name as the wheel format.

Metadata 2.0's Provides-Extra field is there to represent the setuptools
"extras" feature which is necessary for a very large number of
distributions on pypi. For a while the most common environment markers will
look like just "extra == 'foo'". So it should not be surprising that
distribute is the first tool to recognize Provides-Extra. Previously
distribute did not use METADATA / PKG-INFO for dependencies, instead
reading from requires.txt.

If you can get past the fact that we've incorporated most of another PEP
(PEP 386) into v2.0, and rewritten and explained everything to death, I would
argue that 2.0 is actually simpler than 1.2. Only four headers are actually
required in a valid Metadata 2.0 file, and a number of the fields are
informational and do not have to be understood by the installer.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fwd: PEP 426 is now the draft spec fordistribution metadata 2.0

2013-02-19 Thread Daniel Holth
On Tue, Feb 19, 2013 at 11:26 AM, Paul Moore  wrote:

> On 19 February 2013 13:59, Nick Coghlan  wrote:
> > It's OK if people don't want to read the detailed rationale provided
> > for each of the major changes as part of the PEP, or if they want to
> > dispute a particular piece of that rationale. But merely going "it's
> > too complicated!" or "metadata 1.2 failed, so 2.0 will fail as well!"
> > is not a reasonable response. Software distribution is complicated -
> > trying to oversimplify it is one of the reasons setuptools became
> > necessary.
>
> Nevertheless, the landscape is confusing. PEPs 241, 314, 345, 426,
> 390, 376 and 386 are all relevant to one extent or another, and only
> PEPs 426 (Metadata 2.0) and 390 (Static Metadata for Distutils -
> setup.cfg) are still in Draft format. The others are all Accepted or
> Final. And yet PEPs 345, 390 and 386 are unused and likely to remain
> so.
>
> I believe that the only ones that are *actually* of use are 241, 314
> and 426 (Metadata 1.0, 1.1 and 2.0, but not 1.2) and 376 (Database of
> Installed Python Distributions) although 376 still has a number of
> flaws. I'd suggest that these should be marked as Final, and the
> others as Rejected, so that we have a clear statement of what is
> actually supported.
>
> Making sure that users have the means to write code that *uses* these
> standards using functionality available in the stdlib is then the next
> step as you say. It is critical that this is done, because packaging
> tools are unique in that the barrier to using external dependencies is
> particularly high for them - for example, pip's reliance on
> distribute/setuptools is necessary, but has been problematic at times.
>
> Paul
>
> PS Apologies for using a load of PEP numbers without explanation.
> Here's a glossary:
>
> Metadata 1.0 - PEP 241
> Metadata 1.1 - PEP 314
> Metadata 1.2 - PEP 345
> Metadata 1.3 - PEP 426
> Static Metadata for Distutils - PEP 390
> Database of Installed Python Distributions - PEP 376
> Changing the version comparison module in Distutils - PEP 386
>

The documentation is bad.

The standard library will almost certainly grow code that can interpret
these PEPs. Packaging tool authors can handle these specs but no sane
person reads PEPs to figure out how to submit a simple library to pypi. No
end user cares about metadata versions either or even notices that sdists
contain PKG-INFO at their root. You get what setup() produces and that's
it.

Wheel installers can work without being installed into the target
environment at all. This is very hard to understand if you have been using
the distutils model (the installed package imports the installer and
packages are always rebuilt from source) for more than a decade. This
feature does something to remedy the setuptools chicken/egg problem. We
have eliminated the egg ;-)

Once the installer doesn't need to be installed, repeatedly, in every
virtualenv, hundreds of times a day, users could just have a single
up-to-date standalone copy of pip invoked as "python pip.zip install x".
Additional packages needed only for builds (even setuptools) may be installed
automatically by a build system (like pip) when building packages rather than
when using them.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread Daniel Holth
On Tue, Feb 19, 2013 at 3:25 PM, Paul Moore  wrote:

> On 19 February 2013 13:40, Nick Coghlan  wrote:
> >> If a tools wants to support metadata 2.0, it has to support all
> >> the complicated stuff as well, i.e. handle the requires fields,
> >> the environment markers and version comparisons/sorting.
> >
> > Which is what distutils2 can be used for now, and what distlib will
> > provide without the unwanted build system infrastructure in
> > distutils2.
>
> One particular item in Metadata 2.0 which needs to be addressed is
> requirements specifiers, as used in (for example) Requires-Dist. These
> take forms like "DistName (>= ver)". There are a number of potential
> issues here:
>
> 1. This differs from the pkg_resources format (which omits the
> parentheses). Having 2 subtly different formats is not a good idea in
> the long term. At the moment, pkg_resources format is used in pip
> requirements (on the command line and in requirement files) as well as
> in setuptools/distribute and possibly elsewhere.
> 2. There is currently no code that I am aware of that implements this
> spec. I don't believe distlib does (yet - give Vinay 5 minutes and who
> knows? :-)), pkg_resources as I said implements a different format,
> and distutils2, apart from being a big dependency to cover just this
> point, only matches the version (not the project name) and presumably
> does so using the distutils2 version ordering (which is incompatible
> with Metadata 2.0).
> 3. The format is fiddly to parse and process - nobody is likely to
> implement it for themselves without a library that does it (especially
> not when using pkg_resources is so easy).
>
> The PEP needs a rationale as to why the pkg_resources format wasn't
> used, and should suggest a migration path for projects which currently
> use the pkg_resources style (such as pip's requirements files).
>
> Paul


We did add full support for Metadata 2.0 Requires-Dist in pkg_resources.
The fiddling happens here:
https://bitbucket.org/tarek/distribute/src/tip/pkg_resources.py?at=default#cl-2469

The code removes the parentheses and adds == when no operator was specified,
converting the requirement back to the pkg_resources format. The rest of
pkg_resources carries on as before. Wheel installs would not pull in
dependencies otherwise.
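
The conversion is roughly this (a sketch of the idea, not the actual
distribute code; the regular expression is an assumption):

    import re

    def metadata_to_pkg_resources(req):
        # Turn "DistName (>= 1.0, < 2.0)" into "DistName>=1.0,<2.0" and
        # "DistName (1.0)" into "DistName==1.0".
        m = re.match(r"\s*([^\s(]+)\s*(?:\((.*)\))?\s*$", req)
        name, constraints = m.group(1), (m.group(2) or "").strip()
        if not constraints:
            return name
        parts = []
        for clause in constraints.split(","):
            clause = "".join(clause.split())  # drop internal whitespace
            if clause and clause[0].isdigit():
                clause = "==" + clause        # a bare version means ==
            parts.append(clause)
        return name + ",".join(parts)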
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread Daniel Holth
On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg  wrote:

> On 19.02.2013 14:40, Nick Coghlan wrote:
> > On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg  wrote:
> >> * PEP 426 doesn't include any mention of the egg distribution format,
> >>   even though it's the most popular distribution format at the moment.
> >>   It should at least include the location of the metadata file
> >>   in eggs (EGG-INFO/PKG-INFO) and egg installations
> >>   (/EGG-INFO/PKG-INFO).
> >
> > "Other tools involved in Python distribution may also use this format."
> >
> > The egg format has never been, and never will be, officially endorsed
> > by python-dev. The wheel format is the standard format for binary
> > distribution, and PEP 376 defines the standard location for metadata
> > on installed distributions.
>
> Oh, come on, Nick, that's just silly. setuptools was included in stdlib
> for a short while, so the above is simply wrong. Eggs are the most
> widely used binary distribution format for Python package on PyPI:
>
> # wc *files.csv
>   25585   25598 1431013 2013-02-19-egg-files.csv
>46194640  236694 2013-02-19-exe-files.csv
> 254 255   13402 2013-02-19-msi-files.csv
>  104691  104853 5251962 2013-02-19-tar-gz-files.csv
>  24  241221 2013-02-19-whl-files.csv
>   17937   18022  905913 2013-02-19-zip-files.csv
>  153110  153392 7840205 total
>
> (based on todays PyPI stats)
>
> It doesn't really help ignoring realities... and I'm saying
> that as one of the core devs who got setuptools kicked out of
> the stdlib again.
>
> --
> Marc-Andre Lemburg
> eGenix.com
>

The wheel philosophy is that it should be supported by both python-dev and
setuptools and that you should feel happy about using setuptools if you
like it whether or not python-dev (currently) endorses that. If you are
using setuptools (distribute's pkg_resources) then you can use both at the
same time.

Distribute, distutils and setuptools' problems have not been well
understood, which I think is why there has been a need to discredit
setuptools by calling it non-standard. It is the de facto standard: if your
packages have dependencies, there is no other choice. Wheel tries to solve
the real problem by allowing you to build a package with setuptools while
giving the end-user the choice of installing setuptools or not.

Of course eggs are the most popular right now. The wheel format is very
egg-like while avoiding some of egg's problems. See the comparison in the
PEP or read the story on wheel's rtfd. The wheel project includes tools to
losslessly convert eggs or bdist_wininst to wheel.

I am confident distlib can thrive outside of the standard library! Why the
rush to kill it before its prime?
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fwd: PEP 426 is now the draft spec for distribution metadata 2.0

2013-02-19 Thread Daniel Holth
On Tue, Feb 19, 2013 at 5:10 PM, M.-A. Lemburg  wrote:

> On 19.02.2013 23:01, Daniel Holth wrote:
> > On Tue, Feb 19, 2013 at 4:34 PM, M.-A. Lemburg  wrote:
> >
> >> On 19.02.2013 14:40, Nick Coghlan wrote:
> >>> On Tue, Feb 19, 2013 at 11:23 PM, M.-A. Lemburg 
> wrote:
> >>>> * PEP 426 doesn't include any mention of the egg distribution format,
> >>>>   even though it's the most popular distribution format at the moment.
> >>>>   It should at least include the location of the metadata file
> >>>>   in eggs (EGG-INFO/PKG-INFO) and egg installations
> >>>>   (/EGG-INFO/PKG-INFO).
> >>>
> >>> "Other tools involved in Python distribution may also use this format."
> >>>
> >>> The egg format has never been, and never will be, officially endorsed
> >>> by python-dev. The wheel format is the standard format for binary
> >>> distribution, and PEP 376 defines the standard location for metadata
> >>> on installed distributions.
> >>
> >> Oh, come on, Nick, that's just silly. setuptools was included in stdlib
> >> for a short while, so the above is simply wrong. Eggs are the most
> >> widely used binary distribution format for Python package on PyPI:
> >>
> >> # wc *files.csv
> >>   25585   25598 1431013 2013-02-19-egg-files.csv
> >>46194640  236694 2013-02-19-exe-files.csv
> >> 254 255   13402 2013-02-19-msi-files.csv
> >>  104691  104853 5251962 2013-02-19-tar-gz-files.csv
> >>  24  241221 2013-02-19-whl-files.csv
> >>   17937   18022  905913 2013-02-19-zip-files.csv
> >>  153110  153392 7840205 total
> >>
> >> (based on todays PyPI stats)
> >>
> >> It doesn't really help ignoring realities... and I'm saying
> >> that as one of the core devs who got setuptools kicked out of
> >> the stdlib again.
> >>
> >> --
> >> Marc-Andre Lemburg
> >> eGenix.com
> >>
> >
> > The wheel philosophy is that it should be supported by both python-dev
> and
> > setuptools and that you should feel happy about using setuptools if you
> > like it whether or not python-dev (currently) endorses that. If you are
> > using setuptools (distribute's pkg_resources) then you can use both at
> the
> > same time.
> >
> > Distribute, distutils and setuptools' problems have not been well
> > understood which I think is why there has been a need to discredit
> > setuptools by calling it non-standard. It is the defacto standard. If
> your
> > packages have dependencies there is no other choice. Wheel tries to solve
> > the real problem by allowing you to build a package with setuptools while
> > giving the end-user the choice of installing setuptools or not.
> >
> > Of course eggs are the most popular right now. The wheel format is very
> > egg-like while avoiding some of egg's problems. See the comparison in the
> > PEP or read the story on wheel's rtfd. The wheel project includes tools
> to
> > losslessly convert eggs or bdist_wininst to wheel.
>
> That's all fine, but it doesn't explain the refusal to add the
> documentation of the location of the PKG-INFO file in eggs ?


It would just be a sentence; I wouldn't have a problem with it, but I also
don't see why it would be necessary. Even setuptools doesn't usually touch
the file. Right now distribute's pkg_resources only understands Requires-Dist
if it is inside a .dist-info directory.

Sorry, Chris must have meant http://hg.python.org/distlib/ . I was
struggling to imagine a world where that is more visible than something on
bitbucket. Half the comments have been about putting something in stdlib
right away, something that I do not understand at all as a voracious PyPI
or https://crate.io/ user.

You should secure both the metadata and the packages but it doesn't have
anything to do with the serialization format.

Wheel is designed so that the metadata is at the end of the .zip and can be
retrieved with an HTTP partial request. The feature may or may not be
useful.
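
A rough sketch of how a client could use that (illustrative only; it assumes
the server answers the Range request with a 206 partial response and that the
last 64 KB happen to contain both the zip central directory and the .dist-info
entries, which wheel writes last):

    import io
    import urllib.request
    import zipfile

    def remote_wheel_metadata(url, tail_size=65536):
        # Fetch only the tail of the remote wheel.
        req = urllib.request.Request(
            url, headers={"Range": "bytes=-%d" % tail_size})
        with urllib.request.urlopen(req) as resp:
            tail = resp.read()
            # "Content-Range: bytes 123-456/457" -> total size 457
            total = int(resp.headers["Content-Range"].rsplit("/", 1)[1])
        # Pad the front with zeros so the offsets recorded in the central
        # directory still point at the right places; only the tail is real.
        buf = io.BytesIO(b"\0" * (total - len(tail)) + tail)
        with zipfile.ZipFile(buf) as zf:
            name = next(n for n in zf.namelist()
                        if n.endswith(".dist-info/METADATA"))
            return zf.read(name).decode("utf-8")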
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cffi in stdlib

2013-02-26 Thread Daniel Holth
+1


On Tue, Feb 26, 2013 at 12:14 PM, Maciej Fijalkowski wrote:

> On Tue, Feb 26, 2013 at 7:07 PM, Eli Bendersky  wrote:
> >
> >
> >
> > On Tue, Feb 26, 2013 at 8:39 AM, Maciej Fijalkowski 
> > wrote:
> >>
> >> On Tue, Feb 26, 2013 at 6:34 PM, Eli Bendersky 
> wrote:
> >> >
> >> > On Tue, Feb 26, 2013 at 7:42 AM, Nick Coghlan 
> >> > wrote:
> >> >>
> >> >> On Wed, Feb 27, 2013 at 1:13 AM, Maciej Fijalkowski <
> fij...@gmail.com>
> >> >> wrote:
> >> >> > Hello.
> >> >> >
> >> >> > I would like to discuss on the language summit a potential
> inclusion
> >> >> > of cffi[1] into stdlib.
> >> >>
> >> >> I think cffi is well worth considering as a possible inclusion for
> >> >> Python 3.4. (In particular, I'm a fan of the fact it just uses C
> >> >> syntax to declare what you're trying to talk to)
> >> >
> >> >
> >> > I'm cautiously +0.5 because I'd really like to see a strong comparison
> >> > case
> >> > being made vs. ctypes. I've used ctypes many times and it was easy and
> >> > effortless (well, except the segfaults when wrong argument types are
> >> > declared :-). I'll be really interesting in seeing concrete examples
> >> > that
> >> > demonstrate how CFFI is superior.
> >>
> >> My main issue with ctypes, other than confusing API, which is up to
> >> taste, is that you just cannot wrap some libraries, like OpenSSL
> >> because of API vs ABI. OpenSSL uses macros extensively. Another point
> >> is that even C POSIX stdlib gives you incomplete structs and you have
> >> to guess where to put what fields.
> >
> >
> > Yep, I can think of some reasons too. But just mentioning it so you
> remember
> > explicitly listing the advantages when it comes to writing a PEP or some
> > other sort of formal proposal. An FWIW, I think there's already enough
> > positive feedback to at least start drafting a PEP. It can always be
> > rejected later ;-)
> >
> > Eli
> >
>
> It's copy-pasted in the original mail and it's in the docs. Anyway,
> this is just to familiarize people with the library. We're going to
> discuss it on the language summit at pycon.
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] cffi in stdlib

2013-02-27 Thread Daniel Holth
On Wed, Feb 27, 2013 at 4:18 AM, Ronald Oussoren  wrote:
>
> On 27 Feb, 2013, at 10:06, Maciej Fijalkowski  wrote:
>
>> On Wed, Feb 27, 2013 at 9:29 AM, Ronald Oussoren  
>> wrote:
>>>
>>> On 26 Feb, 2013, at 16:13, Maciej Fijalkowski  wrote:
>>>
 Hello.

 I would like to discuss on the language summit a potential inclusion
 of cffi[1] into stdlib.
>>>
>>> The API in general looks nice, but I do have some concens w.r.t. including 
>>> cffi in the stdlib.
>>>
>>> 1. Why is cffi completely separate from ctypes, instead of layered on top 
>>> of it? That is, add a utility module to ctypes that can parse C 
>>> declarations and generate the right ctypes definitions.
>>
>> Because ctypes API is a mess and magic. We needed a cleaner (and much
>> smaller) model.
>
> The major advantages of starting over is probably that you can hide the 
> complexity and that opens opportunities for optimizations. That said, I'm not 
> convinced that ctypes is unnecessarily complex.

cffi actually does have a ctypes backend in addition to the ffi and
"compile a CPython extension" backends. But the ctypes backend is
guaranteed to be slow and messy because it is ctypes.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python Language Summit at PyCon: Agenda

2013-03-04 Thread Daniel Holth
On Mon, Mar 4, 2013 at 2:26 PM, Berker Peksağ  wrote:
> On Mon, Mar 4, 2013 at 9:14 PM, Barry Warsaw  wrote:
>> On Mar 04, 2013, at 07:41 PM, Antoine Pitrou wrote:
>>
 $ python -m unittest discover
 $ python setup.py test
 $ python setup.py nosetests
 $ python -m nose test
 $ nosetests-X.Y

 Besides having a multitude of choices, there's almost no way to
 automatically discover (e.g. by metadata inspection or some such) how to
 invoke the tests.  You're often lucky if there's a README.test and it's
 still accurate.
>>>
>>>I hope we can have a "pytest" utility that does the right thing in 3.4 :-)
>>>Typing "python -m unittest discover" is too cumbersome.
>>
>> Where is this work being done (e.g. is there a PEP)?
>
> There is an open issue on the tracker: http://bugs.python.org/issue14266
>
> --Berker

setup.py's setup(test_suite="x")... not sure if this is a distutils or
setuptools feature. PEP 426 has an extension mechanism that could do
the job.
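
For reference, the existing hook is just a setup() keyword, something like
this sketch (the dotted names are made up):

    from setuptools import setup  # or distribute

    setup(
        name="example",
        version="0.1",
        packages=["example"],
        test_suite="example.tests",  # "python setup.py test" runs this suite
    )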
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python Language Summit at PyCon: Agenda

2013-03-04 Thread Daniel Holth
On Mon, Mar 4, 2013 at 3:14 PM, Barry Warsaw  wrote:
> On Mar 04, 2013, at 03:02 PM, Daniel Holth wrote:
>
>>setup.py's setup(test_suite="x")... not sure if this is a distutils or
>>setuptools feature. PEP 426 has an extension mechanism that could do
>>the job.
>
> Shouldn't "testing" be a first order feature?

Unfortunately there are so many potential first-order features that
we've had to leave some out in order to save time. "How to run the
tests" is not something that you need to know when searching PyPI for
a distribution and its dependencies.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] running tests; mebs

2013-03-04 Thread Daniel Holth
>> As a community, we know how important tests are, so I think our tools should
>> reflect that and make it easy for those tests to be expressed.  As a selfish
>> side-effect, I want to reduce the amount of guesswork I need to perform in
>> order to know how to run a package's test when I `$vcs clone` their
>> repository. ;)
>>
>
>
> Distutils2 had a way of specifying this in the metadata. It looks like this 
> hasn't made it into the reboot:
>
>  http://alexis.notmyidea.org/distutils2/distutils/newcommands.html
>
> Michael
>
>> Cheers,
>> -Barry

I'm not aware of a reboot of the setup.py replacement / improvement
effort. The work that has been done has proceeded backwards from the
installer end of things.

I had a proposal called "mebs, not an actual project". A completely
plugin-based system would recognize any sdist format and provide a
minimal, consistent interface. Add tests to the below text from
October.

...

A very simple meta-build system "mebs" is used to recognize sdists and
build binary packages. Build systems provide plugins having three
methods,

.recognize()
.metadata()
.build()

An installer downloads an sdist. For each installed build plugin,
.recognize(dir) is called. The first plugin to return True is used.
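
A minimal sketch of that interface (names invented, not an actual project):

    import os

    class DistutilsBuildPlugin:
        def recognize(self, sdist_dir):
            # Claim any sdist that still ships a setup.py.
            return os.path.exists(os.path.join(sdist_dir, "setup.py"))

        def metadata(self, sdist_dir):
            raise NotImplementedError  # return the sdist's metadata

        def build(self, sdist_dir, dest_dir):
            raise NotImplementedError  # produce a binary package in dest_dir

    def build_sdist(sdist_dir, dest_dir, plugins):
        # The installer loop from the text: first plugin to recognize() wins.
        for plugin in plugins:
            if plugin.recognize(sdist_dir):
                return plugin.build(sdist_dir, dest_dir)
        raise RuntimeError("no build plugin recognized %s" % sdist_dir)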
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] built-in Python test runner (was: Python Language Summit at PyCon: Agenda)

2013-03-05 Thread Daniel Holth
On Tue, Mar 5, 2013 at 2:49 PM, Barry Warsaw  wrote:
> On Mar 05, 2013, at 02:11 AM, Donald Stufft wrote:
>
>>Doesn't setuptools/distribute already have a setup.py test command? That
>>seems like the easiest way forward?
>
> Yes, and in theory it can make `python setup.py test` work well.  But there
> are lots of little details (such as API differences for ensuring that doctests
> run, "additional tests" discovery, etc.) that make this often not work so well
> in practice.  Some of that is social and some of it is technical.  I still
> claim that including test suite information in a package's metadata would be a
> win, but maybe that's just too much to hope for right now.

It would be a win, but "parsing the metadata" is just not what happens
right now, let alone writing anything about which and where the
modules are defined in the sdist. We can barely install packages by
using the dependency metadata from PKG-INFO; pip always re-generates
it from "setup.py egg_info".

Your testing metadata prototype would only have to write two lines to
the metadata instead of one, along the lines of "Extension: flufl" and
"flufl/test_suite: nose.collector"; document the extension; and write some
tool to actually parse the metadata and invoke the tests. It may become a
core feature in the next version, or having a monolithic specification may
become less important.

Thanks,

Daniel Holth
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PyCon Sprints - Thank you

2013-03-20 Thread Daniel Wozniak

David and Senthil,

I won't make it to the sprints today because my ride wants to go into 
San Francisco to do touristy things. I'll be flying back to Arizona this 
evening. I still have a fair amount of code that has not been submitted 
to the issue tracker. I will sprint from my house tomorrow and will 
submit whatever else I have ready tomorrow night.


In the future, I'd also like to continue contributing by continuing to 
improve test coverage since that is helping me get familiar with both 
the code base, style, and general processes of contributing. Thank you 
both for all your help and patience. You have been very welcoming. I am 
sure Python will continue to grow and attract new contributors so long 
as there are mentors such as yourselves there to support n00bs like me.


Thanks again,

Daniel Wozniak
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] IDLE in the stdlib

2013-03-20 Thread Daniel Holth
On Wed, Mar 20, 2013 at 3:54 PM, Barry Warsaw  wrote:
> On Mar 20, 2013, at 12:40 PM, Guido van Rossum wrote:
>
>>I didn't hear any at the sprint here.
>
> JFDI! :)
>
> -Barry

+1 why are we still talking show me the patches
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] IDLE in the stdlib

2013-03-21 Thread Daniel Holth
I showed IDLE to my 6-year-old on the Raspberry Pi and I'm convinced
it is cool. Gave up on trying to (slowly) install bpython. We were
multiplying large numbers and counting to 325,000 in no time. It might
not be for *me* but I'm not going to teach my daughter a large IDE any
time soon.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Safely importing zip files with C extensions

2013-03-27 Thread Daniel Holth
Jim Fulton is right that weird failures are a characteristic of zipped
eggs, so one of the top requests for setuptools is how to prohibit
zipping from ever happening. This is an important reason why wheel is
billed as an installation format -- fewer users with pitchforks. It's
very cool that it works, though. Debugging is slightly easier than it
was in the old days because pdb can now read the source code from the
zip.

An unzipped wheel as a directory with the same name as the wheel would
be a more reliable solution that might be interesting to work with. It
would work the same as an egg unless you had important files in the
.data/ directory (currently mostly used for console scripts and include
files). However, it was always confusing to have more than one kind
(zipped, unzipped) of egg.
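
For a pure-Python wheel the simple approach already works today, since a
wheel is a plain zip that zipimport can serve; C extensions are exactly the
hard case this thread is about. A minimal sketch (the wheel name is made up):

    import sys

    # Put the archive itself on sys.path; zipimport does the rest.
    sys.path.append("/tmp/example-1.0-py2.py3-none-any.whl")

    import example  # loaded straight out of the wheel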

On Wed, Mar 27, 2013 at 4:41 PM, Vinay Sajip  wrote:
> Amaury Forgeot d'Arc  gmail.com> writes:
>
>
>> Better: just put the wheel path to sys.path
> sys.path.append('/tmp/simplejson-3.1.2-cp27-none-linux_x86_64.whl')
>> and let a sys.path_hook entry do the job.
>
> That's what the mount() actually does - adds the wheel to a registry that an
> import hook uses. You also need a place to check that the wheel being mounted
> is compatible with the Python doing the mounting - I'm not sure whether what
> the import hook should do if e.g. there is a compatibility problem with the
> wheel (e.g. is it clear that it should always raise an ImportError? Or ignore
> the wheel - seems wrong? Or do something else?)
>
> Regards,
>
> Vinay Sajip
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] noob contributions to unit tests

2013-03-28 Thread Daniel Wozniak

Sean,

During the PyCon sprints I was helping work on unittests in urllib. I 
think as it stands right now urllib/error.py and urllib/parse.py are at 
100% line coverage. I have some additions to urllib/request.py which I 
have yet to submit a patch for (anything above line 700 is covered thus 
far) and I noticed there is a large chunk of code there which has line 
coverage when running tests on OSX. There is a note to refactor that to 
run on all platforms since there is nothing OSX specific in those tests. 
I believe if those tests were running on all platforms it would 
drastically increase line coverage for request.py, assuming your not on 
OSX of coarse. I have not looked at response.py or robotparser.py yet.


Just wanted to give you a little brain dump in case it can save you some 
time.


~Daniel

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Accepting PEP 434, Idle Enhancement Exception

2013-03-30 Thread Daniel Holth
Yes, it would probably make more sense to split the editor and shell
processes as many Python IDEs do, with IDLE running in CPython and the
user's computation running in the chosen interpreter.

On Sat, Mar 30, 2013 at 8:35 AM, Maciej Fijalkowski  wrote:
> On Sat, Mar 30, 2013 at 5:26 AM, Antoine Pitrou  wrote:
>> On Sat, 30 Mar 2013 08:33:38 +0200
>> Simon Cross  wrote:
>>
>>> Having a standalone version of IDLE might be really useful to
>>> alternative Python implementations.
>>
>> Why?
>
> I don't think it's worth discussing - tkinter does not work on any
> other implementation than CPython and it seems it won't work. It's a
> bit pity, but I guess if I felt really bad about it, I should just
> make it work.
>
> PS. are there idle projects in SoC? Maybe we should put a more
> pypy-friendly one there too?
>
> Cheers,
> fijal
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe: 
> http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] What code in Python creates .exe launchers for 'entry_points' in 'setup.py'?

2013-04-01 Thread Daniel Holth
The new "distlib" also provides that feature
On Apr 1, 2013 4:58 PM, "Éric Araujo"  wrote:

> Hello,
>
> Le 01/04/2013 16:51, Grigory Petrov a écrit :
> > But what python code is responsible for creation of this bootstrapper
> > executables? I have searched python 2.7.3 source code for some time, but
> > can't find a place.
>
> Setuptools is not in the standard library, so you would need to search
> the setuptools or distribute codebases.
>
> Regards
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> http://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> http://mail.python.org/mailman/options/python-dev/dholth%40gmail.com
>
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] PEP 4XX: pyzaa "Improving Python ZIP Application Support"

2013-04-01 Thread Daniel Holth
https://docs.google.com/document/d/1MKXgPzhWD5wIUpoSQX7dxmqgTZVO6l9iZZis8dnri78/edit?usp=sharing

PEP: 4XX

Title: Improving Python ZIP Application Support

Author: Daniel Holth 

Status: Draft

Type: Standards Track

Python-Version: 3.4

Created: 30 March 2013

Post-History: 30 March 2013, 1 April 2013


Improving Python ZIP Application Support


Python has had the ability to execute directories or ZIP-format
archives as scripts since version 2.6. When invoked with a zip file or
directory as its first argument the interpreter adds that directory to
sys.path and executes the __main__ module. These archives provide a
great way to publish software that needs to be distributed as a single
file script but is complex enough to need to be written as a
collection of modules.
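
For anyone who has not seen the feature, a minimal sketch (the file and
archive names are made up):

    import zipfile

    # Bundle a __main__ module at the root of a zip archive ...
    with zipfile.ZipFile("myapp.pyz", "w") as zf:
        zf.writestr("__main__.py", "import mylib\nmylib.main()\n")
        zf.writestr("mylib.py",
                    "def main():\n    print('hello from a zip app')\n")

    # ... and hand the archive straight to the interpreter:
    #     python myapp.pyz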


This feature is not as popular as it should be, mainly because few people
have heard of it (it was not promoted as part of Python 2.6), but also
because Windows users do not have a file extension (other than .py) to
associate with the launcher.

This PEP proposes to fix these problems by re-publicising the feature,
defining the .pyz and .pyzw extensions as “Python ZIP Applications”
and “Windowed Python ZIP Applications”, and providing some simple
tooling to manage the format.

A New Python ZIP Application Extension


The Python 3.4 installer will associate .pyz and .pyzw “Python ZIP
Applications” with the platform launcher so they can be executed. A
.pyz archive is a console application and a .pyzw archive is a
windowed application, indicating whether the console should appear
when running the app.


Why not use .zip or .py? Users expect a .zip file to be opened with an
archive tool, and they expect a .py file to be opened with a text editor.
Both would be confusing for this use case.


For UNIX users, .pyz applications should be prefixed with a #! line
pointing to the correct Python interpreter and an optional
explanation.


#!/usr/bin/env python3

# This is a Python application stored in a ZIP archive.

(binary contents of archive)


As background, ZIP archives are defined with a footer containing
relative offsets from the end of the file. They remain valid when
concatenated to the end of any other file. This feature is completely
standard and is how self-extracting ZIP archives and the bdist_wininst
installer format work.
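
A sketch of that prepending step (roughly what the proposed -p option would
do; file names are made up):

    import os
    import stat

    with open("myapp.pyz", "rb") as f:
        archive = f.read()

    # Prepend the #! line; the zip footer offsets are relative to the end
    # of the file, so the archive stays valid.
    with open("myapp-cli.pyz", "wb") as f:
        f.write(b"#!/usr/bin/env python3\n")
        f.write(archive)

    mode = os.stat("myapp-cli.pyz").st_mode
    os.chmod("myapp-cli.pyz", mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)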

Minimal Tooling: The pyzaa Module

This PEP also proposes including a simple application for working with
these archives: The Python Zip Application Archiver “pyzaa” (rhymes
with “huzzah” or “pizza”). “pyzaa” can archive or extract these files,
compile bytecode, and can write the __main__ module if it is not
present.

Usage

python -m pyzaa (pack | compile)


python -m pyzaa pack [-o path/name] [-m module.submodule:callable]
[-c] [-w] [-p interpreter] directory:

ZIP the contents of directory as directory.pyz or [-w]
directory.pyzw. Adds the executable flag to the archive.

-c compile .pyc files and add them to the archive

-p interpreter include #!interpreter as the first line of the archive

-o path/name archive is written to path/name.pyz[w] instead of
dirname. The extension is added if not specified.

-m module.submodule:callable __main__.py is written as “import
module.submodule; module.submodule.callable()”


pyzaa pack will warn if the directory contains C extensions or if
it doesn’t contain __main__.py.


python -m pyzaa compile arcname.pyz[w]

The Python files in arcname.pyz[w] are compiled and appended to
the ZIP file.


A standard ZIP utility or Python’s zipfile module can unpack the archives.

FAQ

Q. Isn’t pyzaa just a very thin wrapper over zipfile and compileall?

A. Yes.


Q. How does this compete with existing sdist/bdist formats?

A. There is some overlap, but .pyz files are especially interesting as
a way to distribute an installer. They may also prove useful as a way
to deliver applications when users shouldn’t be asked to perform
virtualenv + “pip install”.

References

[1] http://bugs.python.org/issue1739468 “Allow interpreter to execute
a zip file”

[2] http://bugs.python.org/issue17359 “Feature is not documented”

Copyright

This document has been placed into the public domain.
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

