[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs
On 02.02.2021 00:33, Inada Naoki wrote:
> On Tue, Feb 2, 2021 at 12:43 AM M.-A. Lemburg wrote:
>>
>> Hi Inada-san,
>>
>> thank you for adding some comments, but they are not really capturing
>> what I think is missing:
>>
>> """
>> Removing these APIs removes ability to use codec without temporary Unicode.
>>
>> Codecs can not encode Unicode buffer directly without temporary Unicode
>> object since Python 3.3. All these APIs creates temporary Unicode object for
>> now. So removing them doesn't reduce any abilities.
>> """
>>
>> The point is that while the decoders allow going from a C object
>> to a Python object directly, we are missing a way to do the same
>> for the encoders, since the Python 3.3 change in the Unicode internals.
>>
>> At the very least, we should have such APIs for going from wchar_t*
>> to a Python object.
>
> We already have PyUnicode_FromWideChar(). So I assume you mean
> "wchar_t* to Python bytes object".

Yes, that's what I meant. Encoding from wchar_t* to a Python bytes object.

This is what the encoder APIs all implement. They have become less efficient with Python 3.3, but this can be resolved, while at the same time removing Py_UNICODE and replacing it with wchar_t in those encoder APIs.

>>
>> The alternatives you provide all require creating an intermediate
>> Python object for this purpose. The APIs you want to remove do that
>> as well, but that's not the point. The point is to expose the codecs'
>> encode mechanism which is available in the C code, but currently
>> not exposed via C APIs, e.g. ucs4lib_utf8_encode().
>>
>> It would be a breaking change, but those APIs in your list could
>> simply be changed from using Py_UNICODE to using wchar_t instead
>> and then interface directly to the internal functions we have for
>> the encoders.
>>
>
> OK, I see codecs.h has three encoders.
>
> * utf8_encode
> * utf16_encode
> * utf32_encode
>
> But there are 13 encoders in my PEP:
>
> PyUnicode_Encode()
> PyUnicode_EncodeASCII()
> PyUnicode_EncodeLatin1()
> PyUnicode_EncodeUTF7()
> PyUnicode_EncodeUTF8()
> PyUnicode_EncodeUTF16()
> PyUnicode_EncodeUTF32()
> PyUnicode_EncodeUnicodeEscape()
> PyUnicode_EncodeRawUnicodeEscape()
> PyUnicode_EncodeCharmap()
> PyUnicode_TranslateCharmap()
> PyUnicode_EncodeDecimal()
> PyUnicode_TransformDecimalToASCII()
>
> Do you want to keep all encoders? or 3 encoders?

We could keep all encoders, replacing Py_UNICODE with wchar_t in the API. For the ones where we have separate implementations as private functions, we can move back to direct encoding. For the others, we can keep using the temporary Unicode object or refactor the code to expose the native encoders working directly on the internal buffers as private functions and then use those in the same way for direct encoding.

The Unicode API was meant and designed as a rich API, making it easy to use and providing a complete set for extension writers and CPython to use. I believe we should keep it that way.

>> That would keep extensions working after a recompile, since
>> Py_UNICODE is already a typedef to wchar_t.
>>
>
> That idea is written in the PEP already.
> https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t

Right and I think this is a more workable approach than removing APIs.

BTW: I don't understand this comment:
"They are inefficient on platforms wchar_t* is UTF-16. It is because built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."

Windows is one such platform. Java (indirectly) is another. They both store UTF-16LE in those arrays and Python's codecs handle this just fine.
--
Marc-Andre Lemburg
eGenix.com
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PRFDSXHVNITI5PKQPI7DJJJ6DPIKRYM5/
Code of Conduct: http://python.org/psf/codeofconduct/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 01/02/2021 23:50, Greg Ewing wrote:
> On 2/02/21 12:13 am, Phil Thompson via Python-Dev wrote:
>> TypeError: object.__new__(B) is not safe, use B.__new__()
>
> It's not safe because object.__new__ doesn't know about any C-level
> initialisation that A or B need.

But A.__new__ is calling object.__new__ and so can take care of its own needs after the latter returns.

> At the C level, there is always a *single* inheritance hierarchy.

Why?

> The right thing is for B's tp_new to directly call A's tp_new, which
> calls object's tp_new.

I want my C-implemented class's __new__ to support cooperative multi-inheritance so my A class cannot assume that object.__new__ is the next in the MRO.

I did try to call the next-in-MRO's tp_new directly (rather than calling its __new__ attribute) but that gave me recursion errors.

> Don't worry about Python-level multiple inheritance; the interpreter
> won't let you create an inheritance structure that would mess this up.

Phil

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GZ2RF7TJ6MXDODPWCJB3PDC2Z3VDSQIQ/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 01/02/2021 19:06, Guido van Rossum wrote:
> That code is quite old. This comment tries to explain it:
> ```
> /* Check that the use doesn't do something silly and unsafe like
>    object.__new__(dict).  To do this, we check that the
>    most derived base that's not a heap type is this type. */
> ```

I understand what it is checking, but I don't understand why it is "silly and unsafe".

> I think you may have to special-case this and arrange for B.__new__() to
> be called, like it or not.

But it's already been called. The check fails when trying to subsequently call object.__new__().

> (If you want us to change the code, please file a bpo bug report. I know
> that's no fun, but it's the way to get the right people involved.)

Happy to do that but I first wanted to check if I was doing something "silly" - I'm still not sure.

Phil

> On Mon, Feb 1, 2021 at 3:27 AM Phil Thompson via Python-Dev <python-dev@python.org> wrote:
>> Hi,
>>
>> I'm trying to understand the purpose of the check in tp_new_wrapper() of
>> typeobject.c that results in the "is not safe" exception.
>>
>> I have the following class hierarchy...
>>
>> B -> A -> object
>>
>> ...where B and A are implemented in C. Class A has an implementation of
>> tp_new which does a few context-specific checks before calling
>> PyBaseObject_Type.tp_new() directly to actually create the object. This
>> works fine.
>>
>> However I want to allow class B to be used with a Python mixin. A's
>> tp_new() then has to do something similar to super().__new__(). I have
>> tried to implement this by locating the type object after A in B's MRO,
>> getting its '__new__' attribute and calling it (using PyObject_Call())
>> with B passed as the only argument. However I then get the "is not safe"
>> exception, specifically...
>>
>> TypeError: object.__new__(B) is not safe, use B.__new__()
>>
>> I take the same approach for __init__() and that works fine.
>>
>> If I comment out the check in tp_new_wrapper() then everything works
>> fine.
>>
>> So, am I doing something unsafe? If so, what?
>>
>> Or, is the check at fault in not allowing the case of a C extension type
>> with its own tp_new?
>>
>> Thanks,
>> Phil

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZNJK6BJLXCMOOZNEDGNZZKT2YG4XUV57/
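For readers following along, the check discussed above can be observed from pure Python. The sketch below is only an illustration (the helper name `try_unsafe_new` and the class `Plain` are made up for this note): it asks `object.__new__` to build an instance of `dict`, a C type with its own `tp_new`, and then an instance of a plain Python class that adds no C-level state.

```python
def try_unsafe_new(base, target):
    """Call base.__new__(target); return the TypeError text, or None on success."""
    try:
        base.__new__(target)
    except TypeError as exc:
        return str(exc)
    return None


# dict is implemented in C and has its own tp_new, so bypassing it is refused.
message = try_unsafe_new(object, dict)
print(message)


# A plain Python class inherits object's allocation, so the same call is allowed.
class Plain:
    pass


plain_result = try_unsafe_new(object, Plain)
print(plain_result)
```

The first call is the same situation as `object.__new__(B)` in the report: the most derived non-heap base of the requested type is not `object`.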
[Python-Dev] Re: PEP 597: Add optional EncodingWarning
On Tue, Feb 2, 2021 at 5:40 AM Inada Naoki wrote:
> > In Python 3.10, I added _locale._get_locale_encoding() function which
> > is exactly what the encoding used by open() when no encoding is
> > specified (encoding=None) and when os.device_encoding(fd) returns
> > None. See _Py_GetLocaleEncoding() for the C implementation
> > (Python/fileutils.c).
> >
> > Maybe we should add a public locale.get_locale_encoding() function? On
> > Unix, this function uses nl_langinfo(CODESET) *without* setting
> > LC_CTYPE locale to the user preferred locale.
> >
>
> I can not imagine any use case. Isn't it just confusing?

It's the same as locale.getpreferredencoding(False) but with a more explicit name, no argument and a *sane default behavior* (don't change the LC_CTYPE locale temporarily).

The use case is to pass text to the OS (or get text from the OS) when you cannot pass text directly, but must encode it (or decode it) manually. Not all use cases involve files ;-)

Example of locale.getpreferredencoding() usage:

* XML ElementTree uses locale.getpreferredencoding() when encoding="unicode" is used
* Deprecated gettext functions use it to encode to bytes
* the cgi module uses it to encode the URL query string for the CGI stdin (GET and HEAD methods)

I dislike getpreferredencoding() because by default it temporarily changes the LC_CTYPE locale, which affects all threads, and this is bad.

Well, it doesn't have to be part of the PEP ;-)

> > I understand that encoding=locale.get_locale_encoding() would be
> > different from encoding="locale":
> > encoding=locale.get_locale_encoding() doesn't call
> > os.device_encoding(), right?
> >
>
> Yes.

Would it be useful to add a io.get_locale_encoding(fd)->str (maybe "get_default_encoding"?) function which gives the chosen encoding from a file descriptor, similar to open(fd, encoding="locale").encoding? The os.device_encoding() call is not obvious.

> > Maybe the PEP should also explain (in a "How to teach this" section?)
> > when encoding="locale" is better than a specific encoding, like
> > encoding="utf-8" or encoding="cp1252". In my experience, it's mostly
> > for the inter-operability with other applications which also use the
> > current locale encoding.
>
> This option is for experts who are publishing cross-platform
> libraries, frameworks, etc.
>
> For students, I am suggesting another idea that make UTF-8 mode more
> accessible.

Maybe just say that in "How to teach this" section in the PEP?

In case of doubt, pass encoding="utf-8". Only use encoding="locale" if you understand that the encoding changes depending on the platform and the user locale. The common issue with encoding="locale" is that files should not be exchanged between two computers. encoding="locale" is good for files which remain local. It's also good for interoperability with other applications which use the locale encoding and with the terminal.

> > > Opt-in warning
> > > ---
> > >
> > > Although ``DeprecationWarning`` is suppressed by default, emitting
> > > ``DeprecationWarning`` always when ``encoding`` option is omitted
> > > would be too noisy.
> >
> > The PEP is not very clear. Does "-X warn_encoding" only emits the
> > warning, or does it also display it by default? Does it add a warning
> > filter for EncodingWarning?
> >
>
> This section is not the spec. This section is the rationale for adding
> EncodingWarning instead of using DeprecationWarning.
>
> As spec saying, EncodingWarning is a subclass of Warning. So it is
> displayed by default. But it is not emitted by default.
>
> When -X encoding_warning (or -X warn_default_encoding) is used, the
> warning is emitted and shown unless the user suppresses warnings.

I understand that EncodingWarning is always displayed by default (default warning filters don't ignore it, whereas DeprecationWarning are ignored by default), but no warning is emitted by default. Ok, that makes sense. Maybe try to say it explicitly in the PEP.
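As a small illustration of the "in case of doubt, pass encoding="utf-8"" advice above, here is a sketch that is not part of the original message. It uses only existing standard-library calls (`locale.getpreferredencoding(False)`, `tempfile.mkstemp`, `open`); the helper name `roundtrip` is made up for this note. It writes and re-reads a file once with an explicit UTF-8 encoding and once with the current locale encoding, queried without changing the LC_CTYPE locale.

```python
import locale
import os
import tempfile


def roundtrip(text, encoding):
    """Write text to a temporary file with the given encoding and read it back."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# Portable choice: the bytes on disk are the same on every platform.
portable = roundtrip("naïve café", "utf-8")

# Locale-dependent choice: the encoding name varies with platform and user settings.
# Passing False avoids temporarily switching the LC_CTYPE locale.
locale_name = locale.getpreferredencoding(False)
local_only = roundtrip("plain ascii", locale_name)

print(locale_name)
```

The second file is only guaranteed to be readable on a machine with the same locale encoding, which is the "files which remain local" case described above.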
> This PEP doesn't have "backward compatibility" section because the PEP
> doesn't break any backward compatibility.

IMO it's a good thing to always have the section, just to say that you took time to think about backward compatibility ;-) The section can be empty, like just say "there is no incompatible change" ;-)

> And if developers want to support Python ~3.9 and use -X
> warn_default_encoding on 3.10, they need to write
> `encoding=getattr(io, "LOCALE_ENCODING", None)`, as written in the
> spec.

Maybe repeat it in the Backward Compatibility section.

It's important to provide a way to prevent the warning without losing the support for old Python versions.

> > The main question is if it's possible to use encoding="locale" on
> > Python 3.6-3.9 (maybe using some ugly hacks).
>
> No.

Hum. To write code compatible with Python 3.9, I understand that encoding=None is the closest to encoding="locale". And I understand that encoding=getattr(io, "LOCALE_ENCODING", None) is backward and forward compatib
[Python-Dev] PEP 637 - Support for indexing with keyword arguments: request for feedback for SC submission
Hi all,

I would like to request feedback by python-dev on the current implementation of PEP 637 - Support for indexing with keyword arguments.

https://www.python.org/dev/peps/pep-0637/

The PEP is ready for SC submission and it has a prototype implementation ready, available here (note, not reviewed, but apparently fully functional)

https://github.com/python/cpython/compare/master...stefanoborini:PEP-637-implementation-attempt-2

(note: not sure if there's a preference for the link to be to the diff or to the branch, let me know if you prefer I change the PEP link)

Thank you for your help.

--
Kind regards,
Stefano Borini

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/E3AMOIB3GKYAGN6IVSLEEKVP4VUEC2V3/
[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs
On Tue, Feb 2, 2021 at 7:37 PM M.-A. Lemburg wrote:
>
> >> That would keep extensions working after a recompile, since
> >> Py_UNICODE is already a typedef to wchar_t.
> >>
> >
> > That idea is written in the PEP already.
> > https://www.python.org/dev/peps/pep-0624/#replace-py-unicode-with-wchar-t
>
> Right and I think this is a more workable approach than removing
> APIs.
>
> BTW: I don't understand this comment:
> "They are inefficient on platforms wchar_t* is UTF-16. It is because
> built-in codecs supports only UCS-1, UCS-2, and UCS-4 input."
>
> Windows is one such platform. Java (indirectly) is another. They both
> store UTF-16LE in those arrays and Python's codecs handle this just
> fine.
>

I'm sorry that the section is not clear.

For example, if wchar_t* is UCS4, ucs4_utf8_encoder() can encode wchar_t* into UTF-8.

But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle surrogate escape. We need to use a temporary Unicode object. That is what "inefficient" means.

I will update the section to be more elaborate.

Regards,
--
Inada Naoki

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QUGBVLQNBFVNX25AEIL77WSFOHQES6LJ/
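To make the UCS-2 versus UTF-16 point concrete, here is a small Python sketch. It is an illustration only and says nothing about how CPython's internal encoders are written; it also assumes that "surrogate escape" above refers to UTF-16 surrogate pairs. It shows that a code point outside the BMP occupies two 16-bit units, and that encoding each unit independently, as a pure UCS-2 encoder would, does not produce the correct UTF-8. The helper `combine` is a made-up name.

```python
import struct

text = "\U0001F600"  # a single code point outside the Basic Multilingual Plane

# In UTF-16 this code point is stored as two 16-bit units (a surrogate pair).
utf16_units = struct.unpack("<2H", text.encode("utf-16-le"))

# A UCS-2 style encoder treats every 16-bit unit as a complete character.
# Encoding the two surrogates separately yields 6 bytes that are not valid UTF-8.
naive_utf8 = b"".join(
    chr(unit).encode("utf-8", "surrogatepass") for unit in utf16_units
)


def combine(high, low):
    """Join a UTF-16 surrogate pair into the code point it represents."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# A UTF-16 aware encoder has to join the pair first, then emit 4 bytes.
correct_utf8 = chr(combine(*utf16_units)).encode("utf-8")

print([hex(u) for u in utf16_units], naive_utf8, correct_utf8)
```

The extra pairing step is what a direct `wchar_t*` encoder would need on platforms where `wchar_t` is 16 bits wide.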
[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs
On Tue, Feb 2, 2021 at 3:47 AM Inada Naoki wrote:
> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
> surrogate escape.
> We need to use a temporary Unicode object. That is what "inefficient"
> means.
>

Since real UCS-2 is effectively dead, maybe it should be flipped around: Make UTF-16 be the efficient path and UCS-2 be the path that needs to round-trip through Unicode. But I suppose that's out of scope for this PEP.

-Em

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KWEJN7FKF4GUEW34KZZWXOD4W5HCLVIH/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 3/02/21 12:07 am, Phil Thompson wrote:
> On 01/02/2021 23:50, Greg Ewing wrote:
>> At the C level, there is always a *single* inheritance hierarchy.
>
> Why?

Because a C struct can only extend one other C struct.

> I want my C-implemented class's __new__ to support cooperative
> multi-inheritance

I don't think this is possible. Here is what the C API docs have to say about the matter:

---
Note

If you are creating a co-operative tp_new (one that calls a base type’s tp_new or __new__()), you must not try to determine what method to call using method resolution order at runtime. Always statically determine what type you are going to call, and call its tp_new directly, or via type->tp_base->tp_new. If you do not do this, Python subclasses of your type that also inherit from other Python-defined classes may not work correctly. (Specifically, you may not be able to create instances of such subclasses without getting a TypeError.)
---
(Source: https://docs.python.org/3.5/extending/newtypes.html)

This doesn't mean that your type can't be used in multiple inheritance, just that __new__ methods in particular can't be cooperative.

--
Greg

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KVUEGIRAXWPVD6BZLHPKUI5X7UBH3G2M/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 02/02/2021 14:18, Greg Ewing wrote:
> On 3/02/21 12:07 am, Phil Thompson wrote:
>> On 01/02/2021 23:50, Greg Ewing wrote:
>>> At the C level, there is always a *single* inheritance hierarchy.
>>
>> Why?
>
> Because a C struct can only extend one other C struct.

Yes - I misunderstood what you meant by "at the C level".

>> I want my C-implemented class's __new__ to support cooperative
>> multi-inheritance
>
> I don't think this is possible. Here is what the C API docs have to
> say about the matter:
>
> ---
> Note
>
> If you are creating a co-operative tp_new (one that calls a base
> type’s tp_new or __new__()), you must not try to determine what method
> to call using method resolution order at runtime. Always statically
> determine what type you are going to call, and call its tp_new
> directly, or via type->tp_base->tp_new. If you do not do this, Python
> subclasses of your type that also inherit from other Python-defined
> classes may not work correctly. (Specifically, you may not be able to
> create instances of such subclasses without getting a TypeError.)
> ---
> (Source: https://docs.python.org/3.5/extending/newtypes.html)
>
> This doesn't mean that your type can't be used in multiple
> inheritance, just that __new__ methods in particular can't be
> cooperative.

Thanks - that's fairly definitive, although I don't really understand why __new__ has this particular requirement.

Phil

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/FWSIZUAGD4QRZQ2ZDKLE7MP4P76EIMKL/
[Python-Dev] Re: Constructing expected_opinfo_* lists in test_dis.py
> The problem is not that dis.get_instructions can't be trusted, but that
> the test isn't testing the dis module at all. It is testing whether the
> output from the compiler has changed.
> A lot of the tests in test_dis do that.

Thanks. Perhaps such tests belong in a different test_* module? (I ask this in a rhetorical sense.) I realize that there can not be (nor should be) perfect isolation of test cases so that (for example) test_sys.py includes all tests of sys module functionality. Still, if a fairly large chunk of the contents of test_dis.py don't test dis module functionality (I'm guessing >= 50%), perhaps moving them to test_compiler.py or something similar would be a stronger signal about their intent.

Skip

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/BZASBG64R2ZBFROEYPEW3GGSPJOQFJT5/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
Hi Phil, Hi List,

unfortunately you do not give enough code to reproduce what you are doing, but just guessing roughly:

you say that you have a hierarchy like B -> A -> object, with B and A implemented in C, and then want to use B with a mixin. Programmers with a non-python background then often write

    class MyClass(B, Mixin):
        "whatever"

this leads to an MRO of MyClass -> B -> Mixin -> A -> object. This is horror if B and A are written in C, because suddenly B needs to do something with Python code if it wants to have to do something with its superclass Mixin, like creating a new object. I am just guessing that this is what your code tries to do. And this is what the comment considers silly.

With

    class MyClass(Mixin, B):
        "whatever"

there is no problem at all. We get an MRO of MyClass -> Mixin -> B -> A -> object. There is no need for B to do anything special, being written in C it already knows by itself how to construct A as well, no need to fiddle with Python at all. In general, it is usually not necessary to deal with super() in C code at all.

The statement that there is only single inheritance on the C level becomes obvious once you look at the MRO: that one is always linear, it is always effectively a single inheritance. This is also why you have to call super() only once even if you have multiple superclasses: super() just follows the effective single inheritance of the MRO.

Hope that helps.

Cheers

Martin

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/WHTRIPKQMVCQPOYBFWRW6HI6KOBUYJU3/
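The two base orders discussed above can be checked quickly with pure-Python stand-ins. This is only a sketch: `A`, `B`, `Mixin` and `MixinFromA` here are hypothetical Python classes, not the C types from the thread, and `mro_names` is a made-up helper. One detail worth noting: when the mixin derives directly from `object`, Python's C3 linearization places `A` before `Mixin` for `class MyClass(B, Mixin)`; the order `B -> Mixin -> A` appears when the mixin itself derives from `A`.

```python
class A:
    pass


class B(A):
    pass


class Mixin:  # a mixin deriving directly from object
    pass


class MixinFromA(A):  # a mixin that itself derives from A
    pass


def mro_names(*bases):
    """Build a throwaway class with the given bases and list its MRO by name."""
    cls = type("MyClass", bases, {})
    return [c.__name__ for c in cls.__mro__]


b_first = mro_names(B, Mixin)
mixin_first = mro_names(Mixin, B)
b_first_mixin_from_a = mro_names(B, MixinFromA)

print(b_first)
print(mixin_first)
print(b_first_mixin_from_a)
```

In the first and third cases a Python-defined mixin sits somewhere after `B` in the MRO, so a cooperative `__new__` in the C-implemented classes would have to hand control to Python code, which is the situation being discussed.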
[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs
On Tue, Feb 2, 2021 at 9:40 PM Emily Bowman wrote:
>
> On Tue, Feb 2, 2021 at 3:47 AM Inada Naoki wrote:
>>
>> But when wchar_t* is UTF-16, ucs2_utf8_encoder() can not handle
>> surrogate escape.
>> We need to use a temporary Unicode object. That is what "inefficient" means.
>
>
> Since real UCS-2 is effectively dead, maybe it should be flipped around: Make
> UTF-16 be the efficient path and UCS-2 be the path that needs to round-trip
> through Unicode. But I suppose that's out of scope for this PEP.
>
> -Em

Note that ucs2_utf8_encoder() is currently used only for encoding Python Unicode objects. A Unicode object is latin1, UCS2, or UCS4. It is never UTF-16.

So if we add UTF-16 support to ucs2_utf8_encoder(), it means we need to add and maintain code only for PyUnicode_EncodeUTF8 (encode from wchar_t* into char*).

I don't think it is a good deal. As described in the PEP, encoder APIs are used very rarely. We must not add any maintenance costs for them.

Regards,
--
Inada Naoki

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KDYTBQDA4UFE6XWYENOV32ZRTCTAYEPC/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 3/02/21 4:52 am, Phil Thompson wrote:
> Thanks - that's fairly definitive, although I don't really understand
> why __new__ has this particular requirement.

The job of tp_new is to initialise the C struct. To do this, it first has to initialise the fields of the struct it inherits from, then initialise any fields of its own that it adds, in that order.

Initialising the inherited fields must be done by calling the tp_new for the struct that it inherits from. You don't want to call the tp_new of some other class that might have got inserted into the MRO, because you have no idea what kind of C struct it expects to get.

Cooperative calling is a nice idea, but it requires rather special conditions to make it work. All the methods must have exactly the same signature, and it mustn't matter what order they're called in. Those conditions don't apply to __new__, especially at the C level where everything is much more strict type-wise.

--
Greg

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5AGXKE35LDQA6JP6EKXW7FUFR7YCC3RR/
[Python-Dev] Re: PEP 624: Remove Py_UNICODE encoder APIs
On Tue, Feb 2, 2021 at 11:47 PM Inada Naoki wrote:
> So if we add UTF-16 support to ucs2_utf8_encoder(), it means
> we need to add and maintain code only for PyUnicode_EncodeUTF8 (encode
> from wchar_t* into char*).
>
> I don't think it is a good deal. As described in the PEP, encoder APIs
> are used very rarely.
> We must not add any maintenance costs for them.

I fixed tons of bugs in the Python 2.7 and Python 3 codecs before PEP 393 (compact strings) so that they properly handle 16-bit wchar_t, that is, surrogate characters. The implementation was complex and slow. I would prefer to not move backwards to that :-(

If you are curious, look into PyUnicode_FromWideChar() implementation, search for find_maxchar_surrogates(), to have an idea of the cost of handling UTF-16 surrogate pairs. For a full codec, it's way more complex, painful to write and to maintain.

I'm happy that we were able to remove that thanks to the PEP 393!

Victor
--
Night gathers, and now my watch begins. It shall not end until my death.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/OAPVKJAU6QZCMEWRQSYEDTGO6VAO5ZAN/
[Python-Dev] Re: Understanding "is not safe" in typeobject.c
On 3/02/21 11:05 am, Martin Teichmann wrote:
> class MyClass(B, Mixin):
>     "whatever"
>
> this leads to an MRO of MyClass -> B -> Mixin -> A -> object.

If you do the tp_new stuff correctly at the C level, you can still create such a class. The only limitation is that if Mixin has a __new__ method written in Python, it won't get called. So if Mixin needs to do any initialisation, it will have to be in __init__, *and* all the __init__ methods in the chain will need to be designed for cooperative calling.

--
Greg

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YOMFSCDKYYFNV4IRGFAMLQU7WQYNTO6C/
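The behaviour described above can be imitated in pure Python. The sketch below is an analogy under stated assumptions, not the actual C code: the Python classes `A` and `B` stand in for C types whose `tp_new` statically calls the base type's `tp_new`, which is modelled here by calling `A.__new__` and `object.__new__` explicitly instead of using `super()`. All class names and the `calls` list are made up for this note.

```python
calls = []


class A:
    def __new__(cls):
        # Stand-in for a C tp_new that statically calls its base's tp_new.
        calls.append("A.__new__")
        return object.__new__(cls)

    def __init__(self):
        calls.append("A.__init__")
        super().__init__()  # __init__ stays cooperative


class B(A):
    def __new__(cls):
        # Statically chosen base, never the next class in the MRO.
        calls.append("B.__new__")
        return A.__new__(cls)

    def __init__(self):
        calls.append("B.__init__")
        super().__init__()


class Mixin:
    def __new__(cls):
        calls.append("Mixin.__new__")
        return super().__new__(cls)

    def __init__(self):
        calls.append("Mixin.__init__")
        super().__init__()


class MyClass(B, Mixin):
    pass


obj = MyClass()
print(calls)
```

Creating `MyClass()` succeeds, `Mixin.__new__` is skipped, and `Mixin.__init__` still runs because every `__init__` in the chain calls `super().__init__()`.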
[Python-Dev] PEP 563: get_type_hints should use closure when available
The discussion around PEP 649 got me thinking about what I believe is the largest downside to PEP 563: the inability to evaluate annotations created with closures. While this is in general unavoidable, if the type is ever referenced in an annotated function (including as an annotation) it should be resolvable via `__closure__`.

For example:
```
from __future__ import annotations

import typing

def gen(T):
    def f(x: T):
        y: T = ...
    return f

f = gen(int)
nonlocal_vars = {
    var: cell.cell_contents
    for var, cell in zip(f.__code__.co_freevars, f.__closure__)
}
assert typing.get_type_hints(f, localns=nonlocal_vars) == {'x': int}
```

I would just open a PR to have `get_type_hints` attempt to resolve closure variables by default. However, this would require an update to PEP 563 and I don't know what the protocol is there.

- Caleb Donovick

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/L2XH2EIFCEPT5DQRPRFR6WYEM6M2LM2V/
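The idea above could be packaged as a small helper. The following is a sketch only, not the proposed change to `typing` itself: `closure_namespace` and `hints_with_closure` are hypothetical names. To stay independent of how a given Python version treats annotation-only references, the example writes the annotation as an explicit string and references `T` in the function body, so `T` is a genuine closure variable.

```python
import typing


def closure_namespace(func):
    """Map each free variable name of func to the value held in its closure cell."""
    names = func.__code__.co_freevars
    cells = func.__closure__ or ()
    namespace = {}
    for name, cell in zip(names, cells):
        try:
            namespace[name] = cell.cell_contents
        except ValueError:
            # The cell is still empty (the variable was never assigned).
            continue
    return namespace


def hints_with_closure(func):
    """Resolve annotations, letting closure variables act as local names."""
    return typing.get_type_hints(func, localns=closure_namespace(func))


def gen(T):
    def f(x: "T"):
        return isinstance(x, T)  # T is used in the body, so it is captured
    return f


f = gen(int)
hints = hints_with_closure(f)
print(hints)
```

A helper like this only sees names that actually ended up in `__closure__`; a type that is not captured by the function stays unresolvable, which matches the "in general unavoidable" caveat above.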