[Python-Dev] Re: Discrepancy between what aiter() and `async for` requires on purpose?
I think the C implementation of PyAiter_Check was a translation of the Python code `isinstance(..., collections.abc.AsyncIterator)`, but I agree that it would be more consistent to just check for __anext__. There were comments at the time here: https://github.com/python/cpython/pull/8895#discussion_r532833905 ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/VLT475NI2ORNSHFZZVZVD4QYOPG66SQK/ Code of Conduct: http://python.org/psf/codeofconduct/
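The difference between the two checks can be seen with a small sketch. The class names `OnlyAnext` and `Source` and the helper `has_anext` are hypothetical, made up for illustration; the point is only that an object with `__anext__` but no `__aiter__` is accepted by `async for` while failing the `collections.abc.AsyncIterator` test.

```python
import asyncio
from collections.abc import AsyncIterator


class OnlyAnext:
    """Hypothetical iterator that defines __anext__ but not __aiter__."""

    def __init__(self, limit):
        self.i = 0
        self.limit = limit

    async def __anext__(self):
        if self.i >= self.limit:
            raise StopAsyncIteration
        self.i += 1
        return self.i


class Source:
    """Hypothetical async iterable whose __aiter__ returns an OnlyAnext."""

    def __aiter__(self):
        return OnlyAnext(3)


async def collect():
    # `async for` only needs the object returned by __aiter__ to have __anext__.
    return [item async for item in Source()]


def has_anext(obj):
    # The looser check discussed above: look only for __anext__ on the type.
    return hasattr(type(obj), "__anext__")


result = asyncio.run(collect())
print(result)
print(isinstance(OnlyAnext(1), AsyncIterator), has_anext(OnlyAnext(1)))
```

The ABC check requires both `__aiter__` and `__anext__`, which is why the two answers on the last line differ.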
[Python-Dev] Re: Why doesn't peephole optimise away operations with fast locals?
STORE_FAST can also be caused by the assignment to a loop variable, so STORE/LOAD pairs can come about with things like this:

    >>> def f():
    ...     for x in stuff:
    ...         x.work()
    ...
    >>> from dis import dis
    >>> dis(f)
      2           0 LOAD_GLOBAL              0 (stuff)
                  2 GET_ITER
            >>    4 FOR_ITER                 6 (to 18)
                  6 STORE_FAST               0 (x)

      3           8 LOAD_FAST                0 (x)
                 10 LOAD_METHOD              1 (work)
                 12 CALL_METHOD              0
                 14 POP_TOP
                 16 JUMP_ABSOLUTE            2 (to 4)

      2     >>   18 LOAD_CONST               0 (None)
                 20 RETURN_VALUE

I'd guess that they'd be somewhat common in comprehensions too:

    >>> dis(x**2 for x in range(1000))
                  0 GEN_START                0

      1           2 LOAD_FAST                0 (.0)
            >>    4 FOR_ITER                 7 (to 20)
                  6 STORE_FAST               1 (x)
                  8 LOAD_FAST                1 (x)
                 10 LOAD_CONST               0 (2)
                 12 BINARY_POWER
                 14 YIELD_VALUE
                 16 POP_TOP
                 18 JUMP_ABSOLUTE            2 (to 4)
            >>   20 LOAD_CONST               1 (None)
                 22 RETURN_VALUE

In fact, there's already a bpo issue from 2019: https://bugs.python.org/issue38381

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/73BMYW3TY7PJB7KRQ3Q3OROGU5UJVJAW/
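As a rough way to look for such pairs, here is a minimal sketch. The helper names `store_load_pairs` and `pairs_in` are hypothetical, and the `LISTING` data is transcribed by hand from the `dis(f)` output in the message above. Opcode names change between CPython versions, so running `pairs_in` on live code may give different results from the transcribed listing.

```python
import dis


def store_load_pairs(instructions):
    """Return (offset, name) for each STORE_FAST that is immediately
    followed by a LOAD_FAST of the same local variable.

    `instructions` is any iterable of (offset, opname, argval) triples.
    """
    pairs = []
    prev = None
    for offset, opname, argval in instructions:
        if (prev is not None and prev[1] == "STORE_FAST"
                and opname == "LOAD_FAST" and prev[2] == argval):
            pairs.append((prev[0], argval))
        prev = (offset, opname, argval)
    return pairs


def pairs_in(func):
    # Feed the live bytecode of `func` through the same scan.
    return store_load_pairs(
        (ins.offset, ins.opname, ins.argval)
        for ins in dis.get_instructions(func)
    )


# The dis(f) listing from the message, transcribed by hand.
LISTING = [
    (0, "LOAD_GLOBAL", "stuff"),
    (2, "GET_ITER", None),
    (4, "FOR_ITER", 18),
    (6, "STORE_FAST", "x"),
    (8, "LOAD_FAST", "x"),
    (10, "LOAD_METHOD", "work"),
    (12, "CALL_METHOD", 0),
    (14, "POP_TOP", None),
    (16, "JUMP_ABSOLUTE", 4),
    (18, "LOAD_CONST", None),
    (20, "RETURN_VALUE", None),
]

print(store_load_pairs(LISTING))
```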
[Python-Dev] Re: PEP 677 (Callable Type Syntax): Final Call for Comments
Like others expressed, I don't like the idea of the typing and non-typing parts of Python separating.

Has anyone considered adding a new special method like `__arrow__` or something, that would be user-definable, but also defined for tuples and types as returning a Callable? For example `int -> str` could mean Callable[[int], str], and `(int, str) -> bool` could mean Callable[[int, str], bool]. I would find that sort of semantics more agreeable since Python already has operators that dispatch to dunder methods, and anyone who knows how that bit of Python works would automatically mostly know how the new operator works.

If I understand right, this is a sort of combination of two things for which there is more precedent: first, adding a new operator based on the needs of a subset of users (the @ operator and __matmul__), and second, adding new operators to existing objects for the sake of typing (like the list[int] syntax in which type.__getitem__ was implemented to dispatch to the_type.__class_getitem__).

If people don't want to add a new operator and dunder, I assume using the right shift operator like `(int, bool) >> str` would be too cheesy?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GTYLK4QA6DHQDZH7NLYLELYCFUKOTNDT/
[Python-Dev] Do you ever use ceval.c's LLTRACE feature?
Hi everyone,

I'm looking to improve the output of the interpreter's LLTRACE feature to make it more understandable. This "feature" is undocumented, but it prints out the opcode and oparg of every bytecode instruction executed, along with some info about stack operations, whenever you've built with Py_DEBUG and the name `__ltrace__` is defined in the module.

I've found this useful for debugging bits of the compiler and bytecode interpreter. For example, if I make some tweak that introduces an off-by-one error, by default I get a segfault or, at best, a rather unhelpful assertion failure at `assert(EMPTY())` or `assert(STACK_LEVEL() <= frame->f_code->co_stacksize)` or similar, with no indication as to where or why that assertion is failing. But if I enable `__ltrace__` by either setting `__ltrace__=1` in some module or by manually setting `lltrace=1;` in the C code, I can follow what was happening in the interpreter just before the crash.

I'd like the output in that scenario to be a bit more helpful. I propose printing opcode names rather than decimal digits, and printing out the name of the current function whenever a stack frame begins executing. I also propose to print out the full stack contents (almost never very deep) before each bytecode, rather than printing the state piecemeal at each PUSH/POP/STACK_ADJUST macro. I opened issue https://github.com/python/cpython/issues/91462 and PR https://github.com/python/cpython/pull/91463

I later found that this had been explored before by https://github.com/python/cpython/issues/69757, and there was a suggestion that this could be folded into a more generalized bytecode-level tracing feature that is pluggable with Python code, similar to sys.settrace(). I would tend to think "YAGNI" -- lltrace is a feature for debugging the C internals of the interpreter, and there are already separate existing features like the `trace` module for tracing through Python code with different goals.
I appreciate the simplicity of printf statements at the C level -- it feels more trustworthy than adding a complicated extra feature involving Python calls and global state. It's as if I just littered the code with my own debugging print statements, but re-usable and better.

I see no documentation anywhere, and there's only one test case in test_lltrace, just testing that there's no segfault. Looking back through the git history, I see that the basic `printf("%d: %d, %d\n", ...);` format goes back to 1990: https://github.com/python/cpython/blob/3f5da24ea304e674a9abbdcffc4d671e32aa70f1/Python/ceval.c#L696-L710

I'm essentially writing to ask: how do you use lltrace? Does anyone rely on the particular format of the output? Would some of these improvements be helpful to you? What else could make it more helpful?

Thanks,
Dennis Sweeney

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DELHX3N5PCZDWIK2DLU5JDG6JREQ42II/
[Python-Dev] Re: Decreasing refcount for locals before popping frame
I don't know if there's anything specifically stopping this, but from what I understand, the precise moment that a finalizer gets called is unspecified, so relying on any sort of behavior there is undefined and non-portable. Implementations like PyPy don't always use reference counting, so their garbage collection might get called some unspecified amount of time later.

I'm not familiar with Airflow, but would you be able to decorate the create() function to check for good return values? Something like

    import functools

    def dag_initializer(func):
        @functools.wraps(func)
        def wrapper():
            with DAG(...) as dag:
                result = func(dag)
            del dag
            if not isinstance(result, DAG):
                raise ValueError(f"{func.__name__} did not return a dag")
            return result
        return wrapper

    @dag_initializer
    def create(dag):
        "some code here"

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EBCLFYZLCTANUYSPZ55GFHG5I7DDTR76/
[Python-Dev] Looking for a sponsor and feedback on PEP 616: string methods for removing prefixes and suffixes
Hello all! I'm a relatively new contributor looking for a Core Dev sponsor for the following PEP: https://github.com/python/peps/pull/1332

Related:
- Python-Ideas Thread: https://mail.python.org/archives/list/python-id...@python.org/thread/RJARZSUKCXRJIP42Z2YBBAEN5XA7KEC3/
- Bug Tracker Issue: https://bugs.python.org/issue39939
- Github PR for implementation: https://github.com/python/cpython/pull/18939

Abstract

This is a proposal to add two new methods, ``cutprefix`` and ``cutsuffix``, to the APIs of Python's various string objects. In particular, the methods would be added to Unicode ``str`` objects, binary ``bytes`` and ``bytearray`` objects, and ``collections.UserString``.

If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then ``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has been removed. If ``s`` does not have ``pre`` as a prefix, an unchanged copy of ``s`` is returned. In summary, ``s.cutprefix(pre)`` is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.

The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is roughly equivalent to ``s[:-len(suf)] if suf and s.endswith(suf) else s``.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XC3D3QGONENQ7PIAUM2SNNEP5BWA6Q4J/
[Python-Dev] PEP 616 -- String methods to remove prefixes and suffixes
Browser Link: https://www.python.org/dev/peps/pep-0616/

PEP: 616
Title: String methods to remove prefixes and suffixes
Author: Dennis Sweeney
Sponsor: Eric V. Smith
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Mar-2020
Python-Version: 3.9
Post-History: 30-Aug-2002

Abstract
========

This is a proposal to add two new methods, ``cutprefix`` and ``cutsuffix``, to the APIs of Python's various string objects. In particular, the methods would be added to Unicode ``str`` objects, binary ``bytes`` and ``bytearray`` objects, and ``collections.UserString``.

If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then ``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has been removed. If ``s`` does not have ``pre`` as a prefix, an unchanged copy of ``s`` is returned. In summary, ``s.cutprefix(pre)`` is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.

The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is roughly equivalent to ``s[:-len(suf)] if suf and s.endswith(suf) else s``.

Rationale
=========

There have been repeated issues [#confusion]_ on the Bug Tracker and StackOverflow related to user confusion about the existing ``str.lstrip`` and ``str.rstrip`` methods. These users are typically expecting the behavior of ``cutprefix`` and ``cutsuffix``, but they are surprised that the parameter for ``lstrip`` is interpreted as a set of characters, not a substring. This repeated issue is evidence that these methods are useful, and the new methods allow a cleaner redirection of users to the desired behavior.

As another testimonial for the usefulness of these methods, several users on Python-Ideas [#pyid]_ reported frequently including similar functions in their own code for productivity. The implementation often contained subtle mistakes regarding the handling of the empty string (see `Specification`_).
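The confusion described in the Rationale can be reproduced in a few lines. This is only a sketch: the free function `cutprefix` below is a stand-in written from the Abstract's one-line equivalence, not the proposed method itself, and the sample hostname is made up.

```python
def cutprefix(s: str, pre: str) -> str:
    # Stand-in for the proposed method, following the Abstract's one-liner.
    return s[len(pre):] if s.startswith(pre) else s


host = "www.wonderful.com"

# lstrip treats its argument as a set of characters {'w', '.'}, so it keeps
# stripping past the intended prefix and eats the 'w' of "wonderful".
stripped = host.lstrip("www.")

# cutprefix removes the substring "www." once, and nothing more.
cut = cutprefix(host, "www.")

print(stripped)  # onderful.com
print(cut)       # wonderful.com
```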
Specification
=============

The builtin ``str`` class will gain two new methods with roughly the following behavior::

    def cutprefix(self: str, pre: str, /) -> str:
        if self.startswith(pre):
            return self[len(pre):]
        return self[:]

    def cutsuffix(self: str, suf: str, /) -> str:
        if suf and self.endswith(suf):
            return self[:-len(suf)]
        return self[:]

The only difference between the real implementation and the above is that, as with other string methods like ``replace``, the methods will raise a ``TypeError`` if any of ``self``, ``pre`` or ``suf`` is not an instance of ``str``, and will cast subclasses of ``str`` to builtin ``str`` objects.

Note that without the check for the truthiness of ``suf``, ``s.cutsuffix('')`` would be mishandled and always return the empty string due to the unintended evaluation of ``self[:-0]``.

Methods with the corresponding semantics will be added to the builtin ``bytes`` and ``bytearray`` objects. If ``b`` is either a ``bytes`` or ``bytearray`` object, then ``b.cutsuffix()`` and ``b.cutprefix()`` will accept any bytes-like object as an argument.

Note that the ``bytearray`` methods return a copy of ``self``; they do not operate in place.

The following behavior is considered a CPython implementation detail, and is not guaranteed by this specification::

    >>> x = 'foobar' * 10**6
    >>> x.cutprefix('baz') is x is x.cutsuffix('baz')
    True
    >>> x.cutprefix('') is x is x.cutsuffix('')
    True

That is, for CPython's immutable ``str`` and ``bytes`` objects, the methods return the original object when the affix is not found or if the affix is empty.
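The empty-suffix pitfall noted in the Specification is easy to demonstrate. In this sketch, `cutsuffix_naive` is a hypothetical variant that omits the truthiness check, shown next to a free-function version of the specified behavior.

```python
def cutsuffix_naive(s: str, suf: str) -> str:
    # Hypothetical buggy variant: no check that suf is non-empty.
    if s.endswith(suf):
        return s[:-len(suf)]   # with suf == '', this is s[:-0], i.e. s[:0]
    return s[:]


def cutsuffix(s: str, suf: str) -> str:
    # Mirrors the specified behavior, written as a free function.
    if suf and s.endswith(suf):
        return s[:-len(suf)]
    return s[:]


print(repr(cutsuffix_naive("foobar", "")))  # ''
print(repr(cutsuffix("foobar", "")))        # 'foobar'
print(repr(cutsuffix("foobar", "bar")))     # 'foo'
```

The prefix case needs no such guard because ``s[len(''):]`` is ``s[0:]``, which is the whole string.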
Because these types test for equality using shortcuts for identity and length, the following equivalent expressions are evaluated at approximately the same speed, for any ``str`` objects (or ``bytes`` objects) ``x`` and ``y``::

    >>> (True, x[len(y):]) if x.startswith(y) else (False, x)
    >>> (True, z) if x != (z := x.cutprefix(y)) else (False, x)

The two methods will also be added to ``collections.UserString``, where they rely on the implementation of the new ``str`` methods.

Motivating examples from the Python standard library
====================================================

The examples below demonstrate how the proposed methods can make code one or more of the following:

Less fragile:
    The code will not depend on the user to count the length of a literal.
More performant:
    The code does not require a call to the Python built-in ``len`` function.
More descriptive:
    The methods give a higher-level API for code readability, as opposed to the traditional method of string slicing.

refactor.py
-----------

- Current::

    if fix_name.startswith(self.FILE_PREFIX):
        fix_name = fix_name[len(self.FILE_PREFIX):]

- Improved::

    fix_name = fix_name.cutprefix(
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Thanks for the feedback!

I meant mnemonic as in the broader sense of "way of remembering things", not some kind of rhyming device or acronym. Maybe "mnemonic" isn't the perfect word. I was just trying to say that the structure of how the methods are named should how their behavior relates to one another, which it seems you agree with.

Fair enough that ``[l/r]strip`` and the proposed methods share the behavior of "removing something from the end of a string". From that perspective, they're similar. But my thought was that ``s.lstrip("abc")`` has extremely similar behavior when changing "lstrip" to "rstrip" or "strip" -- the argument is interpreted in exactly the same way (as a character set) in each case. Looking at how the argument is used, I'd argue that ``lstrip``/``rstrip``/``strip`` are much more similar to each other than they are to the proposed methods, and that the proposed methods are perhaps more similar to something like ``str.replace``. But it does seem pretty subjective what the threshold is for behavior similar enough to have related names -- I see where you're coming from.

Also, the docs at ( https://docs.python.org/3/library/stdtypes.html?highlight=lstrip#string-methods ) are alphabetical, not grouped by "similar names", so even ``lstrip``, ``strip``, and ``rstrip`` are already in different places. Maybe the name "stripprefix" would be more discoverable when "Ctrl-f"ing the docs, if it weren't for the following addition in the linked PR:

     .. method:: str.lstrip([chars])

        Return a copy of the string with leading characters removed. The *chars*
        argument is a string specifying the set of characters to be removed. If omitted
        or ``None``, the *chars* argument defaults to removing whitespace.
        The *chars* argument is not a prefix; rather, all combinations of its values are stripped::

           >>> ' spacious '.lstrip()
           'spacious '
           >>> 'www.example.com'.lstrip('cmowz.')
           'example.com'

    +   See :meth:`str.cutprefix` for a method that will remove a single prefix
    +   string rather than all of a set of characters.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/55QJHR6PP4IWFLBRTFL4TZX5QOBJQFO5/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Dennis Sweeney wrote:
> to say that the structure of how the methods are named should how their
> behavior relates

...should be a reminder of how...

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IFGANJGTE5RQ5J6FBJJNAWY7HRZSRED5/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
For clarity, I'll change

    If ``s`` does not have ``pre`` as a prefix, an unchanged copy of ``s`` is returned.

to

    If ``s`` does not have ``pre`` as a prefix, then ``s.cutprefix(pre)`` returns ``s`` or an unchanged copy of ``s``.

For consistency with the Specification section, I'll also change

    s[len(pre):] if s.startswith(pre) else s

to

    s[len(pre):] if s.startswith(pre) else s[:]

and similarly change the ``cutsuffix`` snippet.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ULKK7K47QKFHXFXKNEAVF2UVNV6ZJNSD/
[Python-Dev] (no subject)
Thanks for the review!

> In short, I propose:
> def cutprefix(self: str, prefix: str, /) -> str:
>     if self.startswith(prefix) and prefix:
>         return self[len(prefix):]
>     else:
>         return self
>
> I call startswith() before testing if pre is non-empty to inherit of
> startswith() input type validation. For example, "a".startswith(b'x')
> raises a TypeError.

This still erroneously accepts tuples and would return str subclasses unchanged. If we want the Python code to serve as the spec, with accurate type-checking, then perhaps we want:

    def cutprefix(self: str, prefix: str, /) -> str:
        if not isinstance(prefix, str):
            raise TypeError(f'cutprefix() argument must be str, '
                            f'not {type(prefix).__qualname__}')
        self = str(self)
        prefix = str(prefix)
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self

I like the idea to always require these to return the unmodified string. Is there a reason this isn't specified by the ``replace()`` or ``strip`` methods?

For accepting multiple prefixes, I can't tell if there's a consensus about whether ``s = s.cutprefix("a", "b", "c")`` should be the same as

    for prefix in ["a", "b", "c"]:
        s = s.cutprefix(prefix)

or

    for prefix in ["a", "b", "c"]:
        if s.startswith(prefix):
            s = s.cutprefix(prefix)
            break

The latter seems to be harder for users to implement through other means, and it's the behavior that test_concurrent_futures.py has implemented now, so maybe that's what we want.

Also, it seems more elegant to me to accept variadic arguments, rather than a single tuple of arguments. Is it worth it to match the related-but-not-the-same API of "startswith" if it makes for uglier Python? My gut reaction is to prefer the varargs, but maybe someone has a different perspective.

I can submit a revision to the PEP with some changes tomorrow.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GXGP5T5KC6ZEBZ5AON4G3MHIKO6MAU35/
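The two candidate meanings for multiple prefixes in the preceding message give different answers on the same input. A minimal sketch, using free functions rather than methods; the names `cut_each` and `cut_first_match` are hypothetical labels for the two loops.

```python
def cutprefix(s: str, prefix: str) -> str:
    # Single-prefix behavior, written as a free function for illustration.
    return s[len(prefix):] if s.startswith(prefix) else s


def cut_each(s: str, *prefixes: str) -> str:
    # First reading: try every prefix in turn, each against the current result.
    for prefix in prefixes:
        s = cutprefix(s, prefix)
    return s


def cut_first_match(s: str, *prefixes: str) -> str:
    # Second reading: remove only the first prefix that matches, then stop.
    for prefix in prefixes:
        if s.startswith(prefix):
            return cutprefix(s, prefix)
    return s


print(repr(cut_each("abcd", "a", "b", "c")))         # 'd'
print(repr(cut_first_match("abcd", "a", "b", "c")))  # 'bcd'
```

With the first reading, earlier removals can expose later prefixes; with the second, at most one prefix is ever removed.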
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Hi Victor. I accidentally created a new thread, but I intended everything below as a response:

Thanks for the review!

> In short, I propose:
> def cutprefix(self: str, prefix: str, /) -> str:
>     if self.startswith(prefix) and prefix:
>         return self[len(prefix):]
>     else:
>         return self
> I call startswith() before testing if pre is non-empty to inherit of
> startswith() input type validation. For example, "a".startswith(b'x')
> raises a TypeError.

This still erroneously accepts tuples and would return str subclasses unchanged. If we want the Python code to serve as the spec, with accurate type-checking, then perhaps we want:

    def cutprefix(self: str, prefix: str, /) -> str:
        if not isinstance(prefix, str):
            raise TypeError(f'cutprefix() argument must be str, '
                            f'not {type(prefix).__qualname__}')
        self = str(self)
        prefix = str(prefix)
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self

For accepting multiple prefixes, I can't tell if there's a consensus about whether ``s = s.cutprefix("a", "b", "c")`` should be the same as

    for prefix in ["a", "b", "c"]:
        s = s.cutprefix(prefix)

or

    for prefix in ["a", "b", "c"]:
        if s.startswith(prefix):
            s = s.cutprefix(prefix)
            break

The latter seems to be harder for users to implement through other means, and it's the behavior that test_concurrent_futures.py has implemented now, so maybe that's what we want.

Also, it seems more elegant to me to accept variadic arguments, rather than a single tuple of arguments. Is it worth it to match the related-but-not-the-same API of "startswith" if it makes for uglier Python? My gut reaction is to prefer the varargs, but maybe someone has a different perspective.

I can submit a revision to the PEP with some changes soon.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NYVDSQ7XB3KOXREY5FUALEILB2UCUVD3/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Even then, it seems that prefix is an established computer science term:

[1] https://en.wikipedia.org/wiki/Substring#Prefix
[2] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). Chapter 15.4: Longest common subsequence

And a quick search reveals that it's used hundreds of times in the docs: https://docs.python.org/3/search.html?q=prefix

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R7CC6LEZHVLTILXGYFYGVXYTDANVJFNF/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
> But Dennis, welcome to the wonderful world of change proposals, where
> you will experience insane amounts of pushback and debate on the
> finest points of bikeshedding, whether or not people actually even
> support the proposal at all...

Lol -- thanks!

In my mind, another reason that I like including the words "prefix" and "suffix" over "start" and "end" is that, even though using the verb "end" in "endswith" is unambiguous, the noun "end" can be used as either the initial or final end, as in "remove this thing from both ends of the string." So "suffix" feels more precise to me.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7UHLOAR6NTVNLN3RBQP6ONHTLTDGXLQW/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Is there a proven use case for anything other than the empty string as the replacement? I prefer your "replacewhatever" to another "stripwhatever" name, and I think it's clear and nicely fits the behavior you proposed. But should we allow a naming convenience to dictate that the behavior should be generalized to a use case we're not sure exists, where the same argument is passed 99% of the time?

I think a downside would be that a pass-a-string-or-a-tuple-of-strings interface would be more mental effort to keep track of than a ``*args`` variadic interface for "(cut/remove/without/trim)prefix", even if the former is how ``startswith()`` works.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KS25JX4V5LR3ZCV4EXU763RLTT24D4JT/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I like "removeprefix" and "removesuffix". My only concern before had been length, but three more characters than "cut***fix" is a small price to pay for clarity.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Y4O2AIODGI2Z45A32UK5EHR7A7RLQFOK/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Here's an updated version.

Online: https://www.python.org/dev/peps/pep-0616/
Source: https://raw.githubusercontent.com/python/peps/master/pep-0616.rst

Changes:
- More complete Python implementation to match what the type checking in the C implementation would be
- Clarified that returning ``self`` is an optimization
- Added links to past discussions on Python-Ideas and Python-Dev
- Specified ability to accept a tuple of strings
- Shorter abstract section and fewer stdlib examples
- Mentioned
- Typo and formatting fixes

I didn't change the name because it didn't seem like there was a strong consensus for an alternative yet. I liked the suggestions of ``dropprefix`` or ``removeprefix``.

All the best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RY7GS4GF7OT7CLZVEDSULMY53QZYDN5Y/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Much appreciated! I will add that single quote and change those snippets to::

    >>> s = 'FooBar' * 100 + 'Baz'
    >>> prefixes = ('Bar', 'Foo')
    >>> while len(s) != len(s := s.cutprefix(prefixes)): pass
    >>> s
    'Baz'

and::

    >>> s = 'FooBar' * 100 + 'Baz'
    >>> prefixes = ('Bar', 'Foo')
    >>> while s.startswith(prefixes): s = s.cutprefix(prefixes)
    >>> s
    'Baz'

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QJ54X6WHQQ5HFROSJOLGJF4QMFINMAPY/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
This should be fixed now.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TQQXDLROEKI5ANEF3J7ESFO2VNYRVDYB/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Steven D'Aprano wrote:
> Having confirmed that prefix is a tuple, you call tuple() to
> make a copy of it in order to iterate over it. Why?
>
> Having confirmed that option is a string, you call str() on
> it to (potentially) make a copy. Why?

This was an attempt to ensure no one can do funny business with tuple or str subclassing. I was trying to emulate the ``PyTuple_Check`` followed by ``PyTuple_GET_SIZE`` and ``PyTuple_GET_ITEM`` that are done by the C implementation of ``str.startswith()`` to ensure that only the tuple/str methods are used, not arbitrary user subclass code. It seems that that's what most of the ``str`` methods force.

I was mistaken in how to do this with pure Python. I believe I actually wanted something like:

    def cutprefix(self, prefix, /):
        if not isinstance(self, str):
            raise TypeError()

        if isinstance(prefix, tuple):
            for option in tuple.__iter__(prefix):
                if not isinstance(option, str):
                    raise TypeError()
                if str.startswith(self, option):
                    return str.__getitem__(
                        self, slice(str.__len__(option), None))
            return str.__getitem__(self, slice(None, None))

        if not isinstance(prefix, str):
            raise TypeError()

        if str.startswith(self, prefix):
            return str.__getitem__(self, slice(str.__len__(prefix), None))
        else:
            return str.__getitem__(self, slice(None, None))

... which looks even uglier.

> We ought to get some real-life exposure to the simple case first, before
> adding support for multiple prefixes/suffixes.

I could be (and have been) convinced either way about whether or not to generalize to tuples of strings. I thought Victor made a good point about compatibility with ``startswith()``.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CQVVWGPC454LWATA2Y7BZ5OEAGVSTHEZ/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I think my confusion is about just how precise this sort of "reference implementation" should be. Should it behave with ``str`` and ``tuple`` subclasses exactly how it would when implemented? If so, I would expect the following to work:

    class S(str):
        __len__ = __getitem__ = __iter__ = None

    class T(tuple):
        __len__ = __getitem__ = __iter__ = None

    x = str.cutprefix("FooBar", T(("a", S("Foo"), 17)))
    assert x == "Bar"
    assert type(x) is str

and so I think the ``str.__getitem__(self, slice(str.__len__(prefix), None))`` monstrosity would be the most technically correct, unless I'm missing something. But I've never seen Python code so ugly. And I suppose this is a slippery slope -- should it also guard against people redefining ``len = lambda x: 5`` and ``str = list`` in the global scope? Clearly not. I think then maybe it would be preferred to use something like the following in the PEP:

    def cutprefix(self, prefix, /):
        if isinstance(prefix, str):
            if self.startswith(prefix):
                return self[len(prefix):]
            return self[:]
        elif isinstance(prefix, tuple):
            for option in prefix:
                if self.startswith(option):
                    return self[len(option):]
            return self[:]
        else:
            raise TypeError()

    def cutsuffix(self, suffix):
        if isinstance(suffix, str):
            if self.endswith(suffix):
                return self[:len(self)-len(suffix)]
            return self[:]
        elif isinstance(suffix, tuple):
            for option in suffix:
                if self.endswith(option):
                    return self[:len(self)-len(option)]
            return self[:]
        else:
            raise TypeError()

The above would fail the assertions as written before, but would pass them for subclasses ``class S(str): pass`` and ``class T(tuple): pass`` that do not override any dunder methods. Is this an acceptable compromise if it appears alongside a clarifying sentence like the following?

    These methods should always return base ``str`` objects, even when called on ``str`` subclasses.

I'm looking for guidance as to whether that's an appropriate level of precision for a PEP. If so, I'll make that change.
All the best, Dennis ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/PV6ANJL7KN4VHPSNPZSAZGQCEWHEKYG2/ Code of Conduct: http://python.org/psf/codeofconduct/
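For readers who want to try the compromise described above, here is a minimal sketch. It copies the proposed ``cutprefix`` as a free function (it is not a real ``str`` method) and uses trivial subclasses ``S`` and ``T`` that override nothing. That ``S(...)[:]`` yields a base ``str`` is CPython behavior and is assumed here rather than guaranteed by the language.

```python
# Free-function copy of the proposed method, for illustration only.
def cutprefix(self, prefix, /):
    if isinstance(prefix, str):
        if self.startswith(prefix):
            return self[len(prefix):]
        return self[:]
    elif isinstance(prefix, tuple):
        for option in prefix:
            if self.startswith(option):
                return self[len(option):]
        return self[:]
    else:
        raise TypeError()


class S(str):
    pass  # no dunder overrides


class T(tuple):
    pass  # no dunder overrides


# A tuple subclass holding a str subclass is accepted like a plain tuple.
x = cutprefix("FooBar", T(("a", S("Foo"))))
assert x == "Bar"
assert type(x) is str

# No prefix matches, so the result comes from self[:].
y = cutprefix(S("FooBar"), "xyz")
assert y == "FooBar"
```

The version with ``__len__ = __getitem__ = __iter__ = None`` would break this sketch, which is exactly the trade-off being asked about.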
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Steven D'Aprano wrote:
> On Tue, Mar 24, 2020 at 08:14:33PM -, Dennis Sweeney wrote:
> > I think then maybe it would be preferred to
> > use something like the following in the PEP:
> >     def cutprefix(self, prefix, /):
> >         if isinstance(prefix, str):
> >             if self.startswith(prefix):
> >                 return self[len(prefix):]
> >             return self[:]
> >
> Didn't we have a discussion about not mandating a copy when nothing
> changes? For strings, I'd just return self. It is only bytearray that
> requires a copy to be made.

It appears that in CPython, ``self[:] is self`` is true for base ``str`` objects, so I think ``return self[:]`` is consistent with (1) the premise that returning self is an implementation detail that is neither mandated nor forbidden, and (2) the premise that the methods should return base ``str`` objects even when called on ``str`` subclasses.

> >         elif isinstance(prefix, tuple):
> >             for option in prefix:
> >                 if self.startswith(option):
> >                     return self[len(option):]
> >
> I'd also remove the entire multiple substrings feature, for reasons I've
> already given. "Compatibility with startswith" is not a good reason to
> add this feature and you haven't established any good use-cases for it.
> A closer analog is str.replace(substring, ''), and after almost 30 years
> of real-world experience, that method still only takes a single
> substring, not a tuple.

The ``test_concurrent_futures.py`` example seemed to be a good use case to me. I agree that it would be good to see how common that actually is though. But it seems to me that any alternative behavior, e.g. repeated removal, could be implemented by a user on top of the remove-only-the-first-found behavior or by fluently chaining multiple method calls. Maybe you're right that it's too complex, but I think it's at least worth discussing.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TRTHGTLOEQXSYYXKQ6RFEXMGDI7O57EL/
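The two premises about ``self[:]`` can be checked directly. The following is only a sketch: whether ``base[:] is base`` holds is a CPython implementation detail, so the snippet records it without asserting it, while the subclass case is asserted as observed on CPython. The class name ``S`` is just an illustration.

```python
class S(str):
    pass  # plain subclass, no overrides


base = "FooBar"
sub = S("FooBar")

# Implementation detail: CPython may hand back the very same object here.
same_object = base[:] is base

# Slicing a subclass instance with the inherited __getitem__ gives a base str.
sliced_sub = sub[:]
assert type(sliced_sub) is str
assert sliced_sub == "FooBar"
assert sliced_sub is not sub
```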
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
It seems that there is a consensus on the names ``removeprefix`` and ``removesuffix``. I will update the PEP accordingly. I'll also simplify the sample Python implementation to primarily reflect *intent* over strict type-checking correctness, and I'll adjust the accompanying commentary accordingly.

Lastly, since the issue of multiple prefixes/suffixes is more controversial and it seems that it would not affect how the single-affix cases would work, I can remove that from this PEP and allow someone else with a stronger opinion about it to propose and defend a set of semantics in a different PEP. Is there any objection to deferring this to a different PEP?

All the best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/5JJ5YDUPCLVYSCCFOI4MQG64SLY22HU5/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
There were at least two comments suggesting keeping it to one affix at a time:

https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/
https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/

But I didn't see any big objections to the rest of the PEP, so I think maybe we keep it restricted for now.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QBCB2QMUMYBLPXHB6VKIKFK7OODYVKX5/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I'm removing the tuple feature from this PEP. So now, if I understand correctly, I don't think there's disagreement about behavior, just about how that behavior should be summarized in Python code.

Ethan Furman wrote:
> > It appears that in CPython, self[:] is self is true for base str
> > objects, so I think return self[:] is consistent with (1) the premise
> > that returning self is an implementation detail that is neither mandated
> > nor forbidden, and (2) the premise that the methods should return base
> > str objects even when called on str subclasses.
> The Python interpreter in my head sees self[:] and returns a copy. A
> note that says a str is returned would be more useful than trying to
> exactly mirror internal details in the Python "roughly equivalent" code.

I think I'm still in the camp that ``return self[:]`` more precisely prescribes the desired behavior. It would feel strange to me to write ``return self`` and then say "but you don't actually have to return self, and in fact you shouldn't when working with subclasses". To me, it feels like

    return (the original object unchanged, or a copy of the object, depending on implementation details, but always make a copy when working with subclasses)

is well-summarized by

    return self[:]

especially if followed by the text

    Note that ``self[:]`` might not actually make a copy -- if the affix is empty or not found, and if ``type(self) is str``, then these methods may, but are not required to, make the optimization of returning ``self``. However, when called on instances of subclasses of ``str``, these methods should return base ``str`` objects, not ``self``.

...which is a necessary explanation regardless.

Granted, ``return self[:]`` isn't perfect if ``__getitem__`` is overridden, but at the cost of three characters, the Python gains accuracy over both the optional nature of returning ``self`` in all cases and the impossibility (assuming no dunders are overridden) of returning self for subclasses. It also dissuades readers from relying on the behavior of returning self, which we're specifying is an implementation detail.

Is that text explanation satisfactory?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/4E77QD52JCMHSP7O62C57XILLQN6SPCT/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I was surprised by the following behavior:

    class MyStr(str):
        def __getitem__(self, key):
            # slice objects have start/stop/step (there is no .end attribute)
            if isinstance(key, slice) and key.start is key.stop is key.step is None:
                return self
            return type(self)(super().__getitem__(key))

    my_foo = MyStr("foo")
    MY_FOO = MyStr("FOO")
    My_Foo = MyStr("Foo")
    empty = MyStr("")

    assert type(my_foo.casefold()) is str
    assert type(MY_FOO.capitalize()) is str
    assert type(my_foo.center(3)) is str
    assert type(my_foo.expandtabs()) is str
    assert type(my_foo.join(())) is str
    assert type(my_foo.ljust(3)) is str
    assert type(my_foo.lower()) is str
    assert type(my_foo.lstrip()) is str
    assert type(my_foo.replace("x", "y")) is str
    assert type(my_foo.split()[0]) is str
    assert type(my_foo.splitlines()[0]) is str
    assert type(my_foo.strip()) is str
    assert type(empty.swapcase()) is str
    assert type(My_Foo.title()) is str
    assert type(MY_FOO.upper()) is str
    assert type(my_foo.zfill(3)) is str

    assert type(my_foo.partition("z")[0]) is MyStr
    assert type(my_foo.format()) is MyStr

I was under the impression that all of the ``str`` methods exclusively returned base ``str`` objects. Is there any reason why those last two (``partition`` and ``format``) are different, and is there a reason that would apply to ``removeprefix`` and ``removesuffix`` as well?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TVDATHMCK25GT4OTBUBDWG3TBJN6DOKK/
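For comparison, the methods as they eventually shipped (``str.removeprefix`` and ``str.removesuffix``, available from Python 3.9) follow the majority pattern in the list above. A small sketch, assuming a 3.9+ interpreter and a plain subclass with no overrides:

```python
class MyStr(str):
    pass  # plain subclass, nothing overridden


s = MyStr("foobar")

removed = s.removeprefix("foo")    # prefix found
unchanged = s.removeprefix("xyz")  # prefix not found
trimmed = s.removesuffix("bar")    # suffix found

# Base str comes back in every case, even when nothing was removed.
assert type(removed) is str and removed == "bar"
assert type(unchanged) is str and unchanged == "foobar"
assert type(trimmed) is str and trimmed == "foo"
```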
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
> I imagine it's an implementation detail of which ones depend on
> ``__getitem__``.

If we write

    class MyStr(str):
        def __getitem__(self, key):
            raise ZeroDivisionError()

then all of the assertions from before still pass, so in fact *none* of the methods rely on ``__getitem__``. As of now ``str`` does not behave as an ABC at all. But it's an interesting proposal to essentially make it an ABC. Although it makes me curious what all of the different reasons people actually have for subclassing ``str``. All of the examples I found in the stdlib were either (1) contrived test cases, (2) strings (e.g. grammar tokens) with some extra attributes along for the ride, or (3) string-based enums. None of types (2) or (3) ever overrode ``__getitem__``, so it doesn't feel like that common of a use case.

> I don't see removeprefix and removesuffix explicitly being implemented
> in terms of slicing operations as a huge win - you've demonstrated that
> someone who wants a persistent string subclass still would need to
> override a /lot/ of methods, so two more shouldn't hurt much - I just
> think that "consistent with most of the other methods" is a
> /particularly/ good reason to avoid explicitly defining these operations
> in terms of __getitem__.

Making sure I understand: would you prefer the PEP to say ``return self`` rather than ``return self[:]``? I never had the intention of ``self[:]`` meaning "this must have exactly the behavior of ``self.__getitem__(slice(None, None))`` regardless of type", but I can understand if that's how you're saying it could be interpreted.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/A64Q6BXTXJYNTA4NX2GHBMOG6FPZUCZP/
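The claim that none of the built-in methods go through a user-supplied ``__getitem__`` is easy to spot-check. This sketch only samples a handful of methods, so it illustrates the point rather than proving it for the whole API:

```python
class MyStr(str):
    def __getitem__(self, key):
        # Any use of indexing or slicing on an instance blows up.
        raise ZeroDivisionError()


s = MyStr("  Foo  ")

# These all succeed, so they cannot be calling MyStr.__getitem__.
results = [
    s.strip(),
    s.lower(),
    s.upper(),
    s.replace("F", "f"),
    s.split()[0],  # indexing the returned list, not the string
]

# Explicit indexing of the string itself does hit the override.
try:
    s[0]
except ZeroDivisionError:
    explicit_indexing_fails = True
else:
    explicit_indexing_fails = False
```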
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
> I don't understand that list bit -- surely, if I'm bothering to implement
> removeprefix and removesuffix in my subclass, I would also want to
> return self to keep my subclass? Why would I want to go through the extra
> overhead of either calling my own __getitem__ method, or have the
> str.__getitem__ method discard my subclass?

I should clarify: by "when working with subclasses" I meant "when str.removeprefix() is called on a subclass that does not override removeprefix", and in that case it should return a base str. I was not taking a stance on how the methods should be overridden, and I'm not sure there are many use cases where it should be.

> However, if you are saying that self[:] will call
> self.__class__.__getitem__ so my subclass only has to override
> __getitem__ instead of removeprefix and removesuffix, that I can be
> happy with.

I was only saying that the new methods should match 20 other methods in the str API by always returning a base str (the exceptions being format, format_map, and (r)partition for some reason). I did not mean to suggest that they should ever call user-supplied ``__getitem__`` code -- I don't think they need to. I haven't found anyone trying to use ``str`` as a mixin class/ABC, and it seems that this would be very difficult to do given that none of its methods currently rely on ``self.__class__.__getitem__``.

If ``return self[:]`` in the PEP is too closely linked to "must call user-supplied ``__getitem__`` methods" for it not to be true, and so you're suggesting ``return self`` is more faithful, I can understand.

So now if I understand the dilemma up to this point we have:

Benefits of writing ``return self`` in the PEP:
- Makes it clear that the optimization of not copying is allowed
- Makes it clear that ``self.__class__.__getitem__`` isn't used

Benefits of writing ``return self[:]`` in the PEP:
- Makes it clear that returning self is an implementation detail
- For subclasses not overriding ``__getitem__`` (the majority of cases), makes it clear that this method will return a base str like the other str methods.

Did I miss anything?

All the best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EQVVXMC7XQJSQIHEB7ND2OLWBQLC7QYM/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I appreciate the input and attention to detail!

Using the ``str()`` constructor was sort of what I had thought originally, and that's why I had gone overboard with "casting" in one iteration of the sample code. When I realized that this isn't quite "casting" and that ``__str__`` can be overridden, I went even more overboard and suggested that ``str.__getitem__(self, ...)`` and ``str.__len__(self)`` could be written, which does have the behavior of effectively "casting", but looks nasty. Do you think that the following is a happy medium?

    def removeprefix(self: str, prefix: str, /) -> str:
        # coerce subclasses to str
        self_str = str(self)
        prefix_str = str(prefix)
        if self_str.startswith(prefix_str):
            return self_str[len(prefix_str):]
        else:
            return self_str

    def removesuffix(self: str, suffix: str, /) -> str:
        # coerce subclasses to str
        self_str = str(self)
        suffix_str = str(suffix)
        if suffix_str and self_str.endswith(suffix_str):
            return self_str[:-len(suffix_str)]
        else:
            return self_str

Followed by the text:

    If ``type(self) is str`` (rather than a subclass) and if the given affix is empty or is not found, then these methods may, but are not required to, make the optimization of returning ``self``.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/W6DMWMSF22HPKG6MYYCXQ6QE7QIWBNSI/
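To make the "this isn't quite casting" remark concrete, here is a small sketch of how ``str(self)`` and the explicit base-class call can disagree once ``__str__`` is overridden. The class name ``Shouty`` and its return value are made up for the illustration.

```python
class Shouty(str):
    def __str__(self):
        # Deliberately unrelated to the underlying string data.
        return "something else entirely"


s = Shouty("FooBar")

# str() on a subclass instance dispatches to the overridden __str__ ...
via_constructor = str(s)

# ... whereas the explicit base-class call reads the real contents.
via_base_slice = str.__getitem__(s, slice(None))

assert via_constructor == "something else entirely"
assert via_base_slice == "FooBar"
assert type(via_base_slice) is str
```

So a sample built on ``str(self)`` is still only "roughly equivalent" for subclasses that override ``__str__``.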
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I was trying to start with the intended behavior of the str class, then move on to generalizing to other classes, because I think completing a single example and *then* generalizing is an instructional style that's easier to digest, whereas intermixing all of the examples at once can get confusing (can I call str.removeprefix(object(), 17)?).

Is something missing that's not already there in the following sentence in the PEP?

    Although the methods on the immutable ``str`` and ``bytes`` types may make the aforementioned optimization of returning the original object, ``bytearray.removeprefix()`` and ``bytearray.removesuffix()`` should always return a copy, never the original object.

Best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IO33NJUQTN27TU342NAJAAMR7YGEPQRE/
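The reason the ``bytearray`` methods must copy is mutability: if the original object came back, a later in-place change to the result would silently alter the input. A minimal sketch, assuming Python 3.9+ where ``bytearray.removeprefix`` exists:

```python
ba = bytearray(b"FooBar")

# Prefix not found: the contents are equal, but it must be a new object.
out = ba.removeprefix(b"xyz")
assert out == ba
assert out is not ba

# Mutating the result leaves the original untouched.
out += b"!"
assert ba == bytearray(b"FooBar")
assert out == bytearray(b"FooBar!")

# For immutable bytes, returning the original is allowed but not required,
# so only the value is checked here.
b_out = b"FooBar".removeprefix(b"xyz")
assert b_out == b"FooBar"
```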
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
I like how that would take the pressure off of the Python sample. How's something like this?

    Specification
    =============

    The builtin ``str`` class will gain two new methods which will behave as follows when ``type(self) is str``::

        def removeprefix(self: str, prefix: str, /) -> str:
            if self.startswith(prefix):
                return self[len(prefix):]
            else:
                return self

        def removesuffix(self: str, suffix: str, /) -> str:
            if suffix and self.endswith(suffix):
                return self[:-len(suffix)]
            else:
                return self

    These methods, even when called on ``str`` subclasses, should always return base ``str`` objects. One should not rely on the behavior of ``self`` being returned (as in ``s.removesuffix('') is s``) -- this optimization should be considered an implementation detail. To test whether any affixes were removed during the call, one may use the constant-time behavior of comparing the lengths of the original and new strings::

        >>> string = 'Python String Input'
        >>> new_string = string.removeprefix('Py')
        >>> modified = (len(string) != len(new_string))
        >>> modified
        True

    One may also continue using ``startswith()`` and ``endswith()`` methods for control flow instead of testing the lengths as above.

    Note that without the check for the truthiness of ``suffix``, ``s.removesuffix('')`` would be mishandled and always return the empty string due to the unintended evaluation of ``self[:-0]``.

    Methods with the corresponding semantics will be added to the builtin ``bytes`` and ``bytearray`` objects. If ``b`` is either a ``bytes`` or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()`` will accept any bytes-like object as an argument. Although the methods on the immutable ``str`` and ``bytes`` types may make the aforementioned optimization of returning the original object, ``bytearray.removeprefix()`` and ``bytearray.removesuffix()`` should *always* return a copy, never the original object.

    The two methods will also be added to ``collections.UserString``, with similar behavior.

My hesitation to write "return self" is resolved by saying that it should not be relied on, so I think this is a win.

Best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/YZD2BTB5RT6DZUTEGHTRNAJZHBMRATPS/
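The ``self[:-0]`` pitfall mentioned in the draft is worth seeing once. In this sketch, ``naive_removesuffix`` is a hypothetical buggy variant written only for contrast, and ``checked_removesuffix`` mirrors the sample from the draft as a free function:

```python
def naive_removesuffix(s, suffix):
    # Hypothetical buggy variant: no truthiness check on suffix.
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def checked_removesuffix(s, suffix):
    # Mirrors the draft: an empty suffix never reaches the slice.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


# -0 == 0, so s[:-0] is s[:0], the empty string.
assert "FooBar"[:-0] == ""

# Every string ends with "", so the naive version wipes the input.
assert naive_removesuffix("FooBar", "") == ""
assert checked_removesuffix("FooBar", "") == "FooBar"

# Both agree whenever the suffix is non-empty.
assert naive_removesuffix("FooBar", "Bar") == "Foo"
assert checked_removesuffix("FooBar", "Bar") == "Foo"
```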
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
> > One may also continue using ``startswith()`` and ``endswith()``
> > methods for control flow instead of testing the lengths as above.
>
> That's worse, in a sense, since "foofoobar".removeprefix("foo") returns
> "foobar" which still starts with "foo".

I meant that startswith might be called before removeprefix, as it was in the ``deccheck.py`` example.

> If I saw that in a code review I'd flag it for non-obviousness. One should
> use 'string != new_string' unless there is severe pressure to squeeze
> every nanosecond out of this particular code (and it better be inside an
> inner loop).

I thought that someone had suggested that such things go in the PEP, but since these are more stylistic considerations, I would be more than happy to trim it down to just

    The builtin ``str`` class will gain two new methods which will behave as follows when ``type(self) is type(prefix) is str``::

        def removeprefix(self: str, prefix: str, /) -> str:
            if self.startswith(prefix):
                return self[len(prefix):]
            else:
                return self[:]

        def removesuffix(self: str, suffix: str, /) -> str:
            # suffix='' should not call self[:-0].
            if suffix and self.endswith(suffix):
                return self[:-len(suffix)]
            else:
                return self[:]

    These methods, even when called on ``str`` subclasses, should always return base ``str`` objects.

    Methods with the corresponding semantics will be added to the builtin ``bytes`` and ``bytearray`` objects. If ``b`` is either a ``bytes`` or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()`` will accept any bytes-like object as an argument.

    The two methods will also be added to ``collections.UserString``, with similar behavior.
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/HQRI26F6UPWL24LJOFFMKNAMYJSC2CAL/
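The "foofoobar" point, and the earlier remark that repeated removal can be layered on top of single removal, can both be sketched in a few lines. ``removeprefix`` below is a free-function copy of the sample above, and ``remove_all_prefixes`` is a hypothetical helper, not something proposed in the PEP.

```python
def removeprefix(s, prefix):
    # Free-function copy of the sample: strips at most one occurrence.
    if s.startswith(prefix):
        return s[len(prefix):]
    return s[:]


def remove_all_prefixes(s, prefix):
    # Hypothetical user-level helper built on single removal.
    if not prefix:
        return s  # avoid looping forever on the empty prefix
    while s.startswith(prefix):
        s = removeprefix(s, prefix)
    return s


original = "foofoobar"
once = removeprefix(original, "foo")

assert once == "foobar"
assert once.startswith("foo")  # so startswith() afterwards says little
assert once != original        # the plain comparison suggested in review
assert remove_all_prefixes(original, "foo") == "bar"
```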
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
PEP 616 -- String methods to remove prefixes and suffixes is available here: https://www.python.org/dev/peps/pep-0616/

Changes:
- Only accept single affixes, not tuples
- Make the specification more concise
- Make fewer stylistic prescriptions for usage
- Fix typos

A reference implementation GitHub PR is up to date here: https://github.com/python/cpython/pull/18939

Are there any more comments for it before submission?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/UJE3WCQXSZI76IW54D2SKKL6OFQ2VFMA/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Sure -- I can add in a short list of those major changes.

Best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TKCHV76P3CYYSZDSB3TH3I4UTFCUNKU5/
[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes
Hello all,

It seems that most of the discussion has settled down, but I didn't quite understand from reading PEP 1 what the next step should be -- is this an appropriate time to open an issue on the Steering Council GitHub repository requesting pronouncement on PEP 616?

Best,
Dennis

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ZXKU3EM6HEG6R7C65L7UN65IGTBB7VHH/
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
What about "balanced", "uniform", or "rectangular"?

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/2XXSYMZELCV5EMAEIDFISLF7RDD6ICE5/
[Python-Dev] Re: Cycles in the __context__ chain
Worth noting is that there is an existing loop-breaking mechanism, but only for the newest exception being raised. In particular, option (4) is actually the current behavior if the most recent exception participates in a cycle:

    Python 3.9.0b1
    >>> A, B, C, D, E = map(Exception, "ABCDE")
    >>> A.__context__ = B
    >>> B.__context__ = C
    >>> C.__context__ = D
    >>> D.__context__ = E
    >>> try:
    ...     raise A
    ... except Exception:
    ...     raise C
    ...
    Exception: B

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 2, in <module>
    Exception: A

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "<stdin>", line 4, in <module>
    Exception: C

This cycle-breaking is not due to any magic in ``PyException_SetContext()``, which is currently a basic one-liner, but instead comes from ``_PyErr_SetObject`` in errors.c, which has something along the lines of:

    def _PyErr_SetObject(new_exc):
        top = existing_topmost_exc()

        if top is None:
            # no context
            set_top_exception(new_exc)
            return

        # convert new_exc class to instance if applicable.
        ...

        if top is new_exc:
            # already on top
            return

        e = top
        while True:
            context = e.__context__
            if context is None:
                # no loop
                break
            if context is new_exc:
                # unlink the existing exception
                e.__context__ = None
                break
            e = context

        new_exc.__context__ = top
        set_top_exception(new_exc)

The only trouble comes about when there is a "rho-shaped" linked list, in which we have a cycle not involving the new exception being raised. For instance, raising A on top of (B -> C -> D -> C -> D -> C -> ...) results in an infinite loop.

Two possible fixes would be to either (I) use a magical ``__context__`` setter to ensure that there is never a rho-shaped sequence, or (II) allow arbitrary ``__context__`` graphs and then correctly handle rho-shaped sequences in ``_PyErr_SetObject`` (i.e. at raise-time).

Fix type (I) could result in surprising things like:

    >>> A = Exception()
    >>> A.__context__ = A
    >>> A.__context__ is None
    True

so I propose fix type (II). This PR is such a fix: https://github.com/python/cpython/pull/20539

It basically extends the existing behavior (4) to the rho-shaped case. It also prevents the cycle-detecting logic from sitting in two places (both _PyErr_SetObject and PyException_SetContext) and does not make any visible functionality more magical. The only Python-visible change should be that the infinite loop is no longer possible.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R5J5JVUJX3V4DBKVLUI2SUBRD3TRF6PV/
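To illustrate what "fix type (II)" asks for, here is a rough, runnable model of the walk from the pseudocode with one extra guard for rho-shaped chains. It is only a sketch: the function name ``link_context`` is made up, the "topmost exception" is passed in explicitly instead of living in interpreter state, and the ``seen`` set is simply one easy way to notice a pre-existing cycle. It is not claimed to be how the linked PR does it.

```python
def link_context(new_exc, top):
    """Attach `top` as the context of `new_exc`, breaking any cycle
    through `new_exc` and stopping on cycles that do not involve it."""
    if top is None or top is new_exc:
        return
    seen = set()  # ids of exceptions already visited (assumed approach)
    e = top
    while True:
        seen.add(id(e))
        context = e.__context__
        if context is None:
            break  # chain ends normally, no loop
        if context is new_exc:
            e.__context__ = None  # unlink, as in the existing behavior
            break
        if id(context) in seen:
            break  # rho shape: a cycle not involving new_exc, stop walking
        e = context
    new_exc.__context__ = top


# Existing case: the new exception is already somewhere down the chain.
A, B, C = map(Exception, "ABC")
A.__context__ = B
B.__context__ = C
link_context(C, A)  # models `raise C` while handling A

# Rho-shaped case: Q -> R -> S -> R -> ..., with P raised on top of Q.
P, Q, R, S = map(Exception, "PQRS")
Q.__context__ = R
R.__context__ = S
S.__context__ = R
link_context(P, Q)  # terminates instead of looping forever
```

In this model the pre-existing cycle between the last two exceptions is left alone; only the traversal is made safe.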
[Python-Dev] Re: Changing Python's string search algorithms
Here's my attempt at some heuristic motivation:

Try to construct a needle that will perform as poorly as possible when using the naive two-nested-for-loops algorithm. You'll find that if there isn't some sort of vague periodicity in your needle, then you won't ever get *that* unlucky; each particular alignment will fail early, and if it doesn't then some future alignment would be pigeonholed to fail early.

So Crochemore and Perrin's algorithm explicitly handles this "worst case" of periodic strings. Once we've identified in the haystack some period from the needle, there's no need to re-match it. We can keep a memory of how many periods we currently remember matching up, and never re-match them. This is what gives the O(n) behavior for periodic strings.

But wait! There are some bad needles that aren't quite periodic. For instance:

    >>> 'ABCABCAABCABC' in 'ABC'*1_000_000

The key insight though is that the worst strings are still "periodic enough", and if we have two different patterns going on, then we can intentionally split them apart. For example, `"xyxyxyxyabcabc" --> "xyxyxyxy" + "abcabc"`. I believe the goal is to line it up so that if the right half matches but not the left then we can be sure to skip somewhat far ahead. This might not correspond exactly with splitting up two patterns.

This is glossing over some details that I'm admittedly still a little hazy on as well, but hopefully that gives at least a nudge of intuition.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/MXMS5XIV6WJFFRHTH7TBHAO3TC4QIHBZ/
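The first step of that intuition (periodic needles are what hurt the naive algorithm) can be made measurable by counting character comparisons. This is a sketch of the naive two-nested-loops search only, with a made-up function name; it says nothing about how the Crochemore-Perrin two-way algorithm is implemented.

```python
def naive_find(haystack, needle):
    """Naive search; returns (index or -1, number of character comparisons)."""
    comparisons = 0
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):        # every alignment of the needle
        for j in range(m):            # compare left to right
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break                 # this alignment fails
        else:
            return i, comparisons     # inner loop finished: full match
    return -1, comparisons


haystack = "A" * 50

# Highly periodic needle: each of the 41 alignments matches 9 characters
# before failing on the 10th.
periodic = naive_find(haystack, "A" * 9 + "B")

# Needle with no repetition: each alignment fails on its 2nd character.
aperiodic = naive_find(haystack, "ABCDEFGHIJ")

assert periodic == (-1, 410)
assert aperiodic == (-1, 82)
```

Scaling the haystack and needle up makes the gap grow roughly like the product of the two lengths for the periodic case, versus roughly the haystack length for the other.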