[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread M.-A. Lemburg
On 02.02.2021 00:33, Inada Naoki wrote:
> On Tue, Feb 2, 2021 at 12:43 AM M.-A. Lemburg  wrote:
>>
>> Hi Inada-san,
>>
>> thank you for adding some comments, but they are not really capturing
>> what I think is missing:
>>
>> """
>> Removing these APIs removes ability to use codec without temporary Unicode.
>>
>> Codecs can not encode Unicode buffer directly without temporary Unicode
>> object since Python 3.3. All these APIs creates temporary Unicode object for
>> now. So removing them doesn't reduce any abilities.
>> """
>>
>> The point is that while the decoders allow going from a C object
>> to a Python object directly, we are missing a way to do the same
>> for the encoders, since the Python 3.3 change in the Unicode internals.
>>
>> At the very least, we should have such APIs for going from wchar_t*
>> to a Python object.
> 
> We already have PyUnicode_FromWideChar(). So I assume you mean
> "wchar_t* to Python bytes object".

Yes, that's what I meant. Encoding from wchar_t* to a Python bytes
object. This is what the encoder APIs all implement. They have become
less efficient with Python 3.3, but this can be resolved, while
at the same time removing Py_UNICODE and replacing it with wchar_t
in those encoder APIs.
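
For illustration, here is a minimal sketch of the two-step route that exists today (wchar_t* to a temporary str, then str to bytes), driven from Python through ctypes. It assumes CPython, where ctypes.pythonapi exposes PyUnicode_FromWideChar(); the variable names are made up for this example and are not part of any proposal.

```python
import ctypes

# PyUnicode_FromWideChar(const wchar_t *w, Py_ssize_t size) returns a new reference.
from_wide = ctypes.pythonapi.PyUnicode_FromWideChar
from_wide.argtypes = [ctypes.c_wchar_p, ctypes.c_ssize_t]
from_wide.restype = ctypes.py_object

text = "h\u00e9llo"                      # ctypes hands this over as a wchar_t* buffer
temporary = from_wide(text, len(text))   # step 1: wchar_t* -> temporary str object
encoded = temporary.encode("utf-8")      # step 2: str -> bytes
```

A direct wchar_t* encoder, as discussed above, would skip the temporary object created in step 1.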

>>
>> The alternatives you provide all require creating an intermediate
>> Python object for this purpose. The APIs you want to remove do that
>> as well, but that's not the point. The point is to expose the codecs'
>> encode mechanism which is available in the C code, but currently
>> not exposed via C APIs, e.g. ucs4lib_utf8_encode().
>>
>> It would be a breaking change, but those APIs in your list could
>> simply be changed from using Py_UNICODE to using wchar_t instead
>> and then interface directly to the internal functions we have for
>> the encoders.
>>
> 
> OK, I see codecs.h has three encoders.
> 
> * utf8_encode
> * utf16_encode
> * utf32_encode
>
> But there are 13 encoders in my PEP:
> 
> PyUnicode_Encode()
> PyUnicode_EncodeASCII()
> PyUnicode_EncodeLatin1()
> PyUnicode_EncodeUTF7()
> PyUnicode_EncodeUTF8()
> PyUnicode_EncodeUTF16()
> PyUnicode_EncodeUTF32()
> PyUnicode_EncodeUnicodeEscape()
> PyUnicode_EncodeRawUnicodeEscape()
> PyUnicode_EncodeCharmap()
> PyUnicode_TranslateCharmap()
> PyUnicode_EncodeDecimal()
> PyUnicode_TransformDecimalToASCII()
> 
> Do you want to keep all encoders? or 3 encoders?

We could keep all encoders, replacing Py_UNICODE with wchar_t
in the API.

For the ones where we have separate implementations
as private functions, we can move back to direct encoding.

For the others, we can keep using the temporary Unicode object
or refactor the code to expose the native encoders working
directly on the internal buffers as private functions
and then use those in the same way for direct encoding.

The Unicode API was meant and designed as a rich API, making
it easy to use and providing a complete set for extension
writers and CPython to use. I believe we should keep it that
way.

>> That would keep extensions working after a recompile, since
>> Py_UNICODE is already a typedef to wchar_t.
>>
> 
> That idea is written in the PEP already.
> https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t

Right and I think this is a more workable approach than removing
APIs.

BTW: I don't understand this comment:
"They are inefficient on platforms wchar_t* is UTF-16. It is because
built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."

Windows is one such platform. Java (indirectly) is another. They both
store UTF-16LE in those arrays and Python's codecs handle this just
fine.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, Feb 02 2021)
>>> Python Projects, Coaching and Support ...https://www.egenix.com/
>>> Python Product Development ...https://consulting.egenix.com/


::: We implement business ideas - efficiently in both time and costs :::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   https://www.egenix.com/company/contact/
 https://www.malemburg.com/
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PRFDSXHVNITI5PKQPI7DJJJ6DPIKRYM5/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Phil Thompson via Python-Dev

On 01/02/2021 23:50, Greg Ewing wrote:

On 2/02/21 12:13 am, Phil Thompson via Python-Dev wrote:

TypeError: object.__new__(B) is not safe, use B.__new__()


It's not safe because object.__new__ doesn't know about any
C-level initialisation that A or B need.


But A.__new__ is calling object.__new__ and so can take care of its own 
needs after the latter returns.



At the C level, there is always a *single* inheritance hierarchy.


Why?


The right thing is for B's tp_new to directly call A's tp_new,
which calls object's tp_new.


I want my C-implemented class's __new__ to support cooperative 
multi-inheritance so my A class cannot assume that object.__new__ is the 
next in the MRO.


I did try to call the next-in-MRO's tp_new directly (rather than calling 
its __new__ attribute) but that gave me recursion errors.



Don't worry about Python-level multiple inheritance; the
interpreter won't let you create an inheritance structure
that would mess this up.


Phil
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/GZ2RF7TJ6MXDODPWCJB3PDC2Z3VDSQIQ/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Phil Thompson via Python-Dev

On 01/02/2021 19:06, Guido van Rossum wrote:

That code is quite old. This comment tries to explain it:
```
/* Check that the use doesn't do something silly and unsafe like
   object.__new__(dict). To do this, we check that the
   most derived base that's not a heap type is this type. */
```
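
The effect of that check can be observed from Python. This is a small sketch assuming CPython; try_unsafe_new and Plain are hypothetical names used only for this illustration.

```python
def try_unsafe_new(cls):
    """Call object.__new__ on cls and return the error message, or None."""
    try:
        object.__new__(cls)
    except TypeError as exc:
        return str(exc)
    return None


class Plain:
    """A heap type whose most derived non-heap base is object itself."""


dict_message = try_unsafe_new(dict)    # dict has its own C-level tp_new
plain_message = try_unsafe_new(Plain)  # allowed: nothing beyond object to set up
```

On CPython, dict_message holds the "is not safe" text and plain_message is None.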


I understand what it is checking, but I don't understand why it is 
"silly and unsafe".


I think you may have to special-case this and arrange for B.__new__() to be
called, like it or not.


But it's already been called. The check fails when trying to 
subsequently call object.__new__().


(If you want us to change the code, please file a bpo bug report. I know
that's no fun, but it's the way to get the right people involved.)


Happy to do that but I first wanted to check if I was doing something 
"silly" - I'm still not sure.


Phil


On Mon, Feb 1, 2021 at 3:27 AM Phil Thompson via Python-Dev <
python-dev@python.org> wrote:


Hi,

I'm trying to understand the purpose of the check in tp_new_wrapper() of
typeobject.c that results in the "is not safe" exception.

I have the following class hierarchy...

B -> A -> object

...where B and A are implemented in C. Class A has an implementation of
tp_new which does a few context-specific checks before calling
PyBaseObject_Type.tp_new() directly to actually create the object. This
works fine.

However I want to allow class B to be used with a Python mixin. A's
tp_new() then has to do something similar to super().__new__(). I have
tried to implement this by locating the type object after A in B's MRO,
getting its '__new__' attribute and calling it (using PyObject_Call())
with B passed as the only argument. However I then get the "is not safe"
exception, specifically...

TypeError: object.__new__(B) is not safe, use B.__new__()

I take the same approach for __init__() and that works fine.

If I comment out the check in tp_new_wrapper() then everything works
fine.

So, am I doing something unsafe? If so, what?

Or, is the check at fault in not allowing the case of a C extension type
with its own tp_new?

Thanks,
Phil
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/HRGDEMURCJ5DSNEPMQPQR3R7VVDFA4ZX/


___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/ZNJK6BJLXCMOOZNEDGNZZKT2YG4XUV57/


[Python-Dev] Re: PEP 597: Add optional EncodingWarning

2021-02-02 Thread Victor Stinner
On Tue, Feb 2, 2021 at 5:40 AM Inada Naoki  wrote:
> > In Python 3.10, I added _locale._get_locale_encoding() function which
> > is exactly what the encoding used by open() when no encoding is
> > specified (encoding=None) and when os.device_encoding(fd) returns
> > None. See _Py_GetLocaleEncoding() for the C implementation
> > (Python/fileutils.c).
> >
> > Maybe we should add a public locale.get_locale_encoding() function? On
> > Unix, this function uses nl_langinfo(CODESET) *without* setting
> > LC_CTYPE locale to the user preferred locale.
> >
>
> I can not imagine any use case. Isn't it just confusing?

It's the same as locale.getpreferredencoding(False) but with a more
explicit name, no argument and a *sane default behavior* (don't change
the LC_CTYPE locale temporarily).

The use case is to pass text to the OS (or get text from the OS) when
you cannot pass text directly, but must encode it (or decode it)
manually. Not all use cases involve files ;-)

Example of locale.getpreferredencoding() usage:

* XML ElementTree uses locale.getpreferredencoding() when
encoding="unicode" is used
* Deprecated gettext functions use it to encode to bytes
* the cgi module uses it to encode the URL query string for the CGI
stdin (GET and HEAD methods)

I dislike getpreferredencoding() because by default it temporarily
changes the LC_CTYPE locale, which affects all threads, and this is
bad.
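
A minimal sketch of the manual encode/decode use case, using the existing locale.getpreferredencoding(False) call, which leaves the LC_CTYPE locale untouched. The variable names are illustrative only.

```python
import codecs
import locale

# False means: do not call setlocale(); just report the current locale encoding.
encoding_name = locale.getpreferredencoding(False)
codec_info = codecs.lookup(encoding_name)    # confirms Python has a matching codec

payload = "hello".encode(encoding_name)      # text -> bytes to hand to the OS
round_trip = payload.decode(encoding_name)   # bytes from the OS -> text
```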

Well, it doesn't have to be part of the PEP ;-)

> > I understand that encoding=locale.get_locale_encoding() would be
> > different from encoding="locale":
> > encoding=locale.get_locale_encoding() doesn't call
> > os.device_encoding(), right?
> >
>
> Yes.

Would it be useful to add a io.get_locale_encoding(fd)->str (maybe
"get_default_encoding"?) function which gives the chosen encoding from
a file descriptor, similar to open(fd, encoding="locale").encoding?
The os.device_encoding() call is not obvious.
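
To make the os.device_encoding() step visible, here is a small sketch under the assumption that the descriptor refers to a regular temporary file rather than a terminal. All names are illustrative, and same_codec is computed without any claim about platforms I have not checked.

```python
import codecs
import locale
import os
import tempfile

with tempfile.TemporaryFile("w+b") as raw:
    fd = raw.fileno()
    device = os.device_encoding(fd)       # None: a regular file is not a terminal
    # The encoding argument is omitted on purpose to observe the default choice.
    with open(fd, "r", closefd=False) as text:
        chosen = text.encoding

fallback = locale.getpreferredencoding(False)
same_codec = codecs.lookup(chosen).name == codecs.lookup(fallback).name
```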


> > Maybe the PEP should also explain (in a "How to teach this" section?)
> > when encoding="locale" is better than a specific encoding, like
> > encoding="utf-8" or encoding="cp1252". In my experience, it's mostly
> > for the inter-operability with other applications which also use the
> > current locale encoding.
>
> This option is for experts who are publishing cross-platform
> libraries, frameworks, etc.
>
> For students, I am suggesting another idea that makes UTF-8 mode more 
> accessible.

Maybe just say that in "How to teach this" section in the PEP?

In case of doubt, pass encoding="utf-8". Only use encoding="locale" if
you understand that the encoding changes depending on the platform and
the user locale. The common issue with encoding="locale" is that files
should not be exchanged between two computers. encoding="locale" is
good for files which remain local. It's also good for interoperability
with other applications which use the locale encoding and with the
terminal.


> > > Opt-in warning
> > > ---
> > >
> > > Although ``DeprecationWarning`` is suppressed by default, emitting
> > > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > > would be too noisy.
> >
> > The PEP is not very clear. Does "-X warn_encoding" only emit the
> > warning, or does it also display it by default? Does it add a warning
> > filter for EncodingWarning?
> >
>
> This section is not the spec. This section is the rationale for adding
> EncodingWarning instead of using DeprecationWarning.
>
> As the spec says, EncodingWarning is a subclass of Warning. So it is
> displayed by default. But it is not emitted by default.
>
> When -X encoding_warning (or -X warn_default_encoding) is used, the
> warning is emitted and shown unless the user suppresses warnings.

I understand that EncodingWarning is always displayed by default
(default warning filters don't ignore it, whereas DeprecationWarning
is ignored by default), but no warning is emitted by default. Ok,
that makes sense. Maybe try to say it explicitly in the PEP.
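
The difference between "displayed" and "emitted" can be sketched with the warnings module. StandInEncodingWarning is a hypothetical stand-in for the proposed EncodingWarning, and the filter list below only imitates the relevant part of the default filters (DeprecationWarning ignored); it is not the real default configuration.

```python
import warnings


class StandInEncodingWarning(Warning):
    """Hypothetical stand-in for the proposed EncodingWarning."""


with warnings.catch_warnings(record=True) as shown:
    warnings.resetwarnings()
    warnings.simplefilter("always")                      # display everything...
    warnings.simplefilter("ignore", DeprecationWarning)  # ...except this category
    warnings.warn("old API", DeprecationWarning)         # emitted, not displayed
    warnings.warn("encoding not specified", StandInEncodingWarning)  # emitted and displayed

shown_categories = [w.category.__name__ for w in shown]
```

If nothing calls warnings.warn() in the first place, there is nothing to display, which corresponds to the behaviour without the -X option.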


> This PEP doesn't have "backward compatibility" section because the PEP
> doesn't break any backward compatibility.

IMO it's a good thing to always have the section, just to say that you
took time to think about backward compatibility ;-) The section can be
empty, like just say "there is no incompatible change" ;-)


> And if developers want to support Python ~3.9 and use -X
> warn_default_encoding on 3.10, they need to write
> `encoding=getattr(io, "LOCALE_ENCODING", None)`, as written in the
> spec.

Maybe repeat it in the Backward Compatibility section.

It's important to provide a way to prevent the warning without losing
the support for old Python versions.
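
A sketch of that compatibility idiom. io.LOCALE_ENCODING is the name proposed in this thread and may not exist in a given Python; in that case getattr() falls back to None, which is the old default behaviour. The file name is made up for this example.

```python
import io
import os
import tempfile

# Assumption: io.LOCALE_ENCODING is only a proposed name; None is used when absent.
encoding = getattr(io, "LOCALE_ENCODING", None)

path = os.path.join(tempfile.mkdtemp(), "local-notes.txt")
with open(path, "w", encoding=encoding) as f:
    f.write("hello\n")
with open(path, encoding=encoding) as f:
    content = f.read()
```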


> > The main question is if it's possible to use encoding="locale" on
> > Python 3.6-3.9 (maybe using some ugly hacks).
>
> No.

Hum. To write code compatible with Python 3.9, I understand that
encoding=None is the closest to encoding="locale".

And I understand that encoding=getattr(io, "LOCALE_ENCODING", None) is
backward and forward compatib

[Python-Dev] PEP 637 - Support for indexing with keyword arguments: request for feedback for SC submission

2021-02-02 Thread Stefano Borini
Hi all,

I would like to request feedback by python-dev on the current
implementation of PEP 637 - Support for indexing with keyword
arguments.

https://www.python.org/dev/peps/pep-0637/

The PEP is ready for SC submission and it has a prototype
implementation ready, available here (note, not reviewed, but
apparently fully functional)

https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2

(note: not sure if there's a preference for the link to be to the diff
or to the branch, let me know if you prefer I change the PEP link)

Thank you for your help.


-- 
Kind regards,

Stefano Borini
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/E3AMOIB3GKYAGN6IVSLEEKVP4VUEC2V3/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Inada Naoki
On Tue, Feb 2, 2021 at 7:37 PM M.-A. Lemburg  wrote:
>
> >> That would keep extensions working after a recompile, since
> >> Py_UNICODE is already a typedef to wchar_t.
> >>
> >
> > That idea is written in the PEP already.
> > https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t
>
> Right and I think this is a more workable approach than removing
> APIs.
>
> BTW: I don't understand this comment:
> "They are inefficient on platforms wchar_t* is UTF-16. It is because
> built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."
>
> Windows is one such platform. Java (indirectly) is another. They both
> store UTF-16LE in those arrays and Python's codecs handle this just
> fine.
>

I'm sorry that the section is not clear.

For example, if wchar_t* is UCS4, ucs4_utf8_encoder() can encode
wchar_t* into UTF-8.

But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
surrogate escape.
We need to use a temporary Unicode object. That is what "inefficient" means.
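
The problem can be sketched in pure Python (this is an illustration, not the C encoder): a code point outside the BMP occupies two 16-bit units in UTF-16, so an encoder that treats every 16-bit unit as one UCS-2 character produces the wrong UTF-8 bytes unless the pair is combined first.

```python
import struct

ch = "\U0001F600"                         # a code point outside the BMP
utf16 = ch.encode("utf-16-le")
high, low = struct.unpack("<2H", utf16)   # two 16-bit code units (a surrogate pair)

# Treating each 16-bit unit as an independent character gives 3 bytes per unit.
naive = b"".join(chr(u).encode("utf-8", "surrogatepass") for u in (high, low))

# Combining the pair first recovers the real code point and valid UTF-8.
code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
correct = chr(code_point).encode("utf-8")
```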

I will update the section to elaborate on this.

Regards,
-- 
Inada Naoki  
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/QUGBVLQNBFVNX25AEIL77WSFOHQES6LJ/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Emily Bowman
On Tue, Feb 2, 2021 at 3:47 AM Inada Naoki  wrote:

> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
> surrogate escape.
> We need to use a temporary Unicode object. That is what "inefficient"
> means.
>

Since real UCS-2 is effectively dead, maybe it should be flipped around:
Make UTF-16 be the efficient path and UCS-2 be the path that needs to
round-trip through Unicode. But I suppose that's out of scope for this PEP.

-Em
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KWEJN7FKF4GUEW34KZZWXOD4W5HCLVIH/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Greg Ewing

On 3/02/21 12:07 am, Phil Thompson wrote:

On 01/02/2021 23:50, Greg Ewing wrote:

At the C level, there is always a *single* inheritance hierarchy.


Why?


Because a C struct can only extend one other C struct.

I want my C-implemented class's __new__ to support cooperative 
multi-inheritance


I don't think this is possible. Here is what the C API docs have to
say about the matter:

---

Note

If you are creating a co-operative tp_new (one that calls a base type’s 
tp_new or __new__()), you must not try to determine what method to call 
using method resolution order at runtime. Always statically determine 
what type you are going to call, and call its tp_new directly, or via 
type->tp_base->tp_new. If you do not do this, Python subclasses of your 
type that also inherit from other Python-defined classes may not work 
correctly. (Specifically, you may not be able to create instances of 
such subclasses without getting a TypeError.)


---

(Source: https://docs.python.org/3.5/extending/newtypes.html)

This doesn't mean that your type can't be used in multiple inheritance,
just that __new__ methods in particular can't be cooperative.

--
Greg
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KVUEGIRAXWPVD6BZLHPKUI5X7UBH3G2M/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Phil Thompson via Python-Dev

On 02/02/2021 14:18, Greg Ewing wrote:

On 3/02/21 12:07 am, Phil Thompson wrote:

On 01/02/2021 23:50, Greg Ewing wrote:

At the C level, there is always a *single* inheritance hierarchy.


Why?


Because a C struct can only extend one other C struct.


Yes - I misunderstood what you meant by "at the C level".

I want my C-implemented class's __new__ to support cooperative 
multi-inheritance


I don't think this is possible. Here is what the C API docs have to
say about the matter:

---

Note

If you are creating a co-operative tp_new (one that calls a base
type’s tp_new or __new__()), you must not try to determine what method
to call using method resolution order at runtime. Always statically
determine what type you are going to call, and call its tp_new
directly, or via type->tp_base->tp_new. If you do not do this, Python
subclasses of your type that also inherit from other Python-defined
classes may not work correctly. (Specifically, you may not be able to
create instances of such subclasses without getting a TypeError.)

---

(Source: https://docs.python.org/3.5/extending/newtypes.html)

This doesn't mean that your type can't be used in multiple inheritance,
just that __new__ methods in particular can't be cooperative.


Thanks - that's fairly definitive, although I don't really understand 
why __new__ has this particular requirement.


Phil
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/FWSIZUAGD4QRZQ2ZDKLE7MP4P76EIMKL/


[Python-Dev] Re: Constructing expected_opinfo_* lists in test_dis.py

2021-02-02 Thread Skip Montanaro
> The problem is not that dis.get_instructions can't be trusted, but that
> the test isn't testing the dis module at all. It is testing whether the
> output from the compiler has changed.
> A lot of the tests in test_dis do that.

Thanks. Perhaps such tests belong in a different test_* module? (I ask
this in a rhetorical sense.) I realize that there can not be (nor
should be) perfect isolation of test cases so that (for example)
test_sys.py includes all tests of sys module functionality. Still, if
a fairly large chunk of the contents of test_dis.py don't test dis
module functionality (I'm guessing >= 50%), perhaps moving them to
test_compiler.py or something similar would be a stronger signal about
their intent.
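
To illustrate why such tests are sensitive to the compiler: the opcode sequence reported by dis.get_instructions() for the same source can differ between Python versions. A minimal sketch follows; sample is a throwaway function for this example.

```python
import dis


def sample(x):
    return x + 1


# The exact list depends on the bytecode the compiler of this Python emits.
opnames = [instruction.opname for instruction in dis.get_instructions(sample)]
```

A test that hard-codes the expected opnames therefore breaks whenever the compiler changes, even if the dis module itself is untouched.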

Skip
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/BZASBG64R2ZBFROEYPEW3GGSPJOQFJT5/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Martin Teichmann
Hi Phil, Hi List,

unfortunately you do not give enough code to reproduce what you are doing,
but just guessing roughly:

you say that you have a hierarchy like B -> A -> object, with B and A
implemented in C, and then want to use B with a mixin. Programmers with a
non-python background then often write

class MyClass(B, Mixin):
  "whatever"

this leads to an MRO of MyClass -> B -> Mixin -> A -> object. This is
a horror if B and A are written in C, because suddenly B needs to deal
with Python code whenever it has to do something with its superclass
Mixin, such as creating a new object. I am just guessing that this
is what your code tries to do. And this is what the comment considers silly.

With

class MyClass(Mixin, B):
 "whatever"

there is no problem at all. We get an MRO of MyClass -> Mixin -> B -> A ->
object. There is no need for B to do anything special, being written in C
it already knows by itself how to construct A as well, no need to fiddle
with Python at all. In general, it is usually not necessary to deal with
super() in C code at all.

The statement that there is only single inheritance on the C level becomes
obvious once you look at the MRO: that one is always linear, it is always
effectively a single inheritance. This is also why you have to call super()
only once even if you have multiple superclasses: super() just follows the
effective single inheritance of the MRO.
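
The linear orders can be checked with pure-Python stand-ins (hypothetical classes, not Phil's C types). Note that C3 linearization places the mixin between B and A only when the mixin itself derives from A; a mixin that derives directly from object ends up after A.

```python
class A:
    pass


class B(A):
    pass


class Mixin:              # derives directly from object
    pass


class MixinFromA(A):      # a mixin that itself derives from A
    pass


def mro_names(cls):
    """Return the class names of cls.__mro__ in order."""
    return [c.__name__ for c in cls.__mro__]


class First(B, Mixin):
    pass


class Second(Mixin, B):
    pass


class Third(B, MixinFromA):
    pass


first_mro = mro_names(First)
second_mro = mro_names(Second)
third_mro = mro_names(Third)
```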

Hope that helps.

Cheers

Martin
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/WHTRIPKQMVCQPOYBFWRW6HI6KOBUYJU3/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Inada Naoki
On Tue, Feb 2, 2021 at 9:40 PM Emily Bowman  wrote:
>
> On Tue, Feb 2, 2021 at 3:47 AM Inada Naoki  wrote:
>>
>> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
>> surrogate escape.
>> We need to use a temporary Unicode object. That is what "inefficient" means.
>
>
> Since real UCS-2 is effectively dead, maybe it should be flipped around: Make 
> UTF-16 be the efficient path and UCS-2 be the path that needs to round-trip 
> through Unicode. But I suppose that's out of scope for this PEP.
>
> -Em

Note that ucs2_utf8_encoder() is currently used only for encoding Python
Unicode objects.
A Unicode object is latin1, UCS2, or UCS4. It is never UTF-16.

So if we add UTF-16 support to ucs2_utf8_encoder(), it means
we need to add and maintain code only for PyUnicode_EncodeUTF8 (encode
from wchar_t* into char*).

I don't think it is a good deal. As described in the PEP, encoder APIs
are used very rarely.
We must not add any maintenance costs for them.

Regards,
-- 
Inada Naoki  
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/KDYTBQDA4UFE6XWYENOV32ZRTCTAYEPC/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Greg Ewing

On 3/02/21 4:52 am, Phil Thompson wrote:

Thanks - that's fairly definitive, although I don't really understand 
why __new__ has this particular requirement.


The job of tp_new is to initialise the C struct. To do this,
it first has to initialise the fields of the struct it
inherits from, then initialise any fields of its own that
it adds, in that order.

Initialising the inherited fields must be done by calling
the tp_new for the struct that it inherits from. You don't
want to call the tp_new of some other class that might have
got inserted into the MRO, because you have no idea what
kind of C struct it expects to get.

Cooperative calling is a nice idea, but it requires rather
special conditions to make it work. All the methods must
have exactly the same signature, and it mustn't matter what
order they're called in. Those conditions don't apply to
__new__, especially at the C level where everything is much
more strict type-wise.

--
Greg
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/5AGXKE35LDQA6JP6EKXW7FUFR7YCC3RR/


[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs

2021-02-02 Thread Victor Stinner
On Tue, Feb 2, 2021 at 11:47 PM Inada Naoki  wrote:
> So if we add UTF-16 support to ucs2_utf8_encoder(), it means
> we need to add and maintain code only for PyUnicode_EncodeUTF8 (encode
> from wchar_t* into char*).
>
> I don't think it is a good deal. As described in the PEP, encoder APIs
> are used very rarely.
> We must not add any maintenance costs for them.

I fixed tons of bugs in the Python 2.7 and Python 3 codecs before
PEP 393 (compact strings) to properly handle 16-bit wchar_t, that is, to
properly handle surrogate characters. The implementation was complex and
slow. I would prefer not to move backwards to that :-(

If you are curious, look into PyUnicode_FromWideChar() implementation,
search for find_maxchar_surrogates(), to have an idea of the cost of
handling UTF-16 surrogate pairs. For a full codec, it's way more
complex, painful to write and to maintain. I'm happy that we were able
to remove that thanks to PEP 393!

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/OAPVKJAU6QZCMEWRQSYEDTGO6VAO5ZAN/


[Python-Dev] Re: Understanding "is not safe" in typeobject.c

2021-02-02 Thread Greg Ewing

On 3/02/21 11:05 am, Martin Teichmann wrote:


     class MyClass(B, Mixin):
           "whatever"

this leads to an MRO of MyClass -> B -> Mixin -> A -> object.


If you do the tp_new stuff correctly at the C level, you can still
create such a class. The only limitation is that if Mixin has a __new__
method written in Python, it won't get called. So if Mixin needs to
do any initialisation, it will have to be in __init__, *and* all the
__init__ methods in the chain will need to be designed for
cooperative calling.
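
A pure-Python sketch of such a cooperative __init__ chain. The classes are stand-ins for the C types under discussion, and the label keyword is invented for this example. Every __init__ accepts keyword arguments, consumes its own, and passes the rest along the MRO.

```python
calls = []


class A:
    def __init__(self, **kwargs):
        calls.append("A")
        super().__init__(**kwargs)


class B(A):
    def __init__(self, **kwargs):
        calls.append("B")
        super().__init__(**kwargs)


class Mixin:
    def __init__(self, *, label="", **kwargs):
        calls.append("Mixin")
        self.label = label          # the mixin's own initialisation
        super().__init__(**kwargs)


class MyClass(B, Mixin):
    pass


obj = MyClass(label="demo")
```

Here the mixin is initialised even though no __new__ of its own is involved.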

--
Greg
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/YOMFSCDKYYFNV4IRGFAMLQU7WQYNTO6C/


[Python-Dev] PEP 563: get_type_hints should use closure when available

2021-02-02 Thread Caleb Donovick
The discussion around PEP 649 got me thinking about what I believe is the
largest downside to PEP 563: the inability to evaluate annotations created
with closures.  While this is in general unavoidable,  if the type is ever
referenced in an annotated function (including as an annotation) it should
be resolvable via `__closure__`.

For example:
```
from __future__ import annotations
import typing
def gen(T):
    def f(x: T):
        y: T = ...
    return f

f = gen(int)
nonlocal_vars = {
  var : cell.cell_contents
  for var, cell in zip(f.__code__.co_freevars, f.__closure__)
}
assert typing.get_type_hints(f, localns=nonlocal_vars)  == {'x': int}
```

I would just open a PR to have `get_type_hints` attempt to resolve closure
variables by default.  However, this would require an update to PEP 563 and
I don't know what the protocol is there.
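
As a sketch of what such a default could look like, the closure lookup can be wrapped in a helper. closure_namespace is a hypothetical name, not an existing typing API. The annotation is written as a string, which is how PEP 563 would store it, and the function body references T so that T is certainly a closure variable.

```python
import typing


def closure_namespace(func):
    """Map each free variable name of func to the value in its closure cell."""
    cells = func.__closure__ or ()
    values = (cell.cell_contents for cell in cells)
    return dict(zip(func.__code__.co_freevars, values))


def gen(T):
    def f(x: "T"):        # string annotation, as PEP 563 would store it
        return T(x)       # the body references T, so T is captured in the closure
    return f


f = gen(int)
hints = typing.get_type_hints(f, localns=closure_namespace(f))
```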

-  Caleb Donovick
___
Message archived at
https://mail.python.org/archives/list/python-dev@python.org/message/L2XH2EIFCEPT5DQRPRFR6WYEM6M2LM2V/