[Python-Dev] Re: Discrepancy between what aiter() and `async for` requires on purpose?

2021-09-03 Thread Dennis Sweeney
I think the C implementation of PyAiter_Check was a translation of the Python 
code `isinstance(..., collections.abc.AsyncIterator)`, but I agree that it 
would be more consistent to just check for __anext__. There were comments at 
the time here: https://github.com/python/cpython/pull/8895#discussion_r532833905
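
To make the difference concrete, here is a minimal sketch (the class names are hypothetical, not from CPython) of why an `isinstance(..., collections.abc.AsyncIterator)`-style check is stricter than a bare check for `__anext__`: the ABC also looks for `__aiter__`.

```python
import collections.abc


class OnlyAnext:
    """Hypothetical object that defines __anext__ but not __aiter__."""

    async def __anext__(self):
        raise StopAsyncIteration


class Both(OnlyAnext):
    """Same object, plus an __aiter__ that returns itself."""

    def __aiter__(self):
        return self


# The ABC-style check requires both methods...
abc_accepts_only_anext = isinstance(OnlyAnext(), collections.abc.AsyncIterator)
abc_accepts_both = isinstance(Both(), collections.abc.AsyncIterator)
# ...while a bare __anext__ check would accept either object.
anext_check_accepts_only_anext = hasattr(OnlyAnext, "__anext__")
```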
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VLT475NI2ORNSHFZZVZVD4QYOPG66SQK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Why doesn't peephole optimise away operations with fast locals?

2021-10-10 Thread Dennis Sweeney
STORE_FAST can also be caused by the assignment to a loop variable, so 
STORE/LOAD pairs can come about with things like this:

>>> def f():
...     for x in stuff:
...         x.work()
...
>>> from dis import dis
>>> dis(f)
  2           0 LOAD_GLOBAL              0 (stuff)
              2 GET_ITER
        >>    4 FOR_ITER                 6 (to 18)
              6 STORE_FAST               0 (x)

  3           8 LOAD_FAST                0 (x)
             10 LOAD_METHOD              1 (work)
             12 CALL_METHOD              0
             14 POP_TOP
             16 JUMP_ABSOLUTE            2 (to 4)

  2     >>   18 LOAD_CONST               0 (None)
             20 RETURN_VALUE

I'd guess that they'd be somewhat common in comprehensions too:

>>> dis(x**2 for x in range(1000))
              0 GEN_START                0

  1           2 LOAD_FAST                0 (.0)
        >>    4 FOR_ITER                 7 (to 20)
              6 STORE_FAST               1 (x)
              8 LOAD_FAST                1 (x)
             10 LOAD_CONST               0 (2)
             12 BINARY_POWER
             14 YIELD_VALUE
             16 POP_TOP
             18 JUMP_ABSOLUTE            2 (to 4)
        >>   20 LOAD_CONST               1 (None)
             22 RETURN_VALUE


In fact, there's already a bpo issue from 2019: 
https://bugs.python.org/issue38381
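
As a rough way to see how often such pairs occur, the sketch below counts a STORE_FAST that is immediately followed by a LOAD_FAST of the same local. The helper names are made up for illustration, and the sample listing is copied by hand from the loop disassembly in this message, so the count does not depend on the CPython version running it.

```python
import dis


def count_store_load_pairs(instructions):
    """Count STORE_FAST immediately followed by LOAD_FAST of the same local.

    `instructions` is any iterable of (opname, argval) pairs.
    """
    count = 0
    previous = None
    for opname, argval in instructions:
        if previous == ("STORE_FAST", argval) and opname == "LOAD_FAST":
            count += 1
        previous = (opname, argval)
    return count


# Hand-copied from the `for x in stuff` disassembly in this message.
loop_listing = [
    ("LOAD_GLOBAL", "stuff"), ("GET_ITER", None), ("FOR_ITER", 18),
    ("STORE_FAST", "x"), ("LOAD_FAST", "x"), ("LOAD_METHOD", "work"),
    ("CALL_METHOD", 0), ("POP_TOP", None), ("JUMP_ABSOLUTE", 4),
    ("LOAD_CONST", None), ("RETURN_VALUE", None),
]
pairs_in_listing = count_store_load_pairs(loop_listing)


def pairs_in(func):
    """Same count for a live function; opcode names vary between versions."""
    return count_store_load_pairs(
        (ins.opname, ins.argval) for ins in dis.get_instructions(func))
```

Running `pairs_in` over real functions gives version-dependent results, since newer interpreters may compile such pairs differently.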
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/73BMYW3TY7PJB7KRQ3Q3OROGU5UJVJAW/


[Python-Dev] Re: PEP 677 (Callable Type Syntax): Final Call for Comments

2022-01-12 Thread Dennis Sweeney
Like others expressed, I don't like the idea of the typing and non-typing parts 
of Python separating.

Has anyone considered adding a new special method like `__arrow__` or 
something, that would be user-definable, but also defined for tuples and types 
as returning a Callable? For example `int -> str` could mean Callable[[int], 
str], and (int, str) -> bool could mean Callable[[int, str], bool]. I would 
find that sort of semantics more agreeable since Python already has operators 
that dispatch to dunder methods, and anyone who knows how that bit of Python 
works would automatically mostly know how the new operator works.

If I understand right, this is a sort of combination of two things for which 
there is more precedent: first, adding a new operator based on the needs of a 
subset of users (the @ operator and __matmul__), and second, adding new 
operators to existing objects for the sake of typing (like the list[int] syntax 
in which type.__getitem__ was implemented to dispatch to 
the_type.__class_getitem__).

If people don't want to add a new operator and dunder, I assume using the right 
shift operator like `(int, bool) >> str` would be too cheesy?
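
For what it's worth, the right-shift spelling can be tried today, but only with a wrapper object, since builtin tuples and classes cannot be given new operators from Python code. The sketch below is purely illustrative: the `Returns` class is a hypothetical helper, not part of PEP 677 or of the `__arrow__` idea above.

```python
from typing import Callable


class Returns:
    """Hypothetical marker for a return type; not part of any PEP."""

    def __init__(self, return_type):
        self.return_type = return_type

    def __rrshift__(self, params):
        # Reached for `params >> Returns(...)` because neither tuples nor
        # classes define __rshift__ themselves.
        if not isinstance(params, tuple):
            params = (params,)
        return Callable[list(params), self.return_type]


one_arg = int >> Returns(str)            # Callable[[int], str]
two_args = (int, str) >> Returns(bool)   # Callable[[int, str], bool]
```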
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GTYLK4QA6DHQDZH7NLYLELYCFUKOTNDT/


[Python-Dev] Do you ever use ceval.c's LLTRACE feature?

2022-04-14 Thread Dennis Sweeney
Hi everyone,

I'm looking to improve the output of the interpreter's LLTRACE feature to make 
it more understandable. This "feature" is undocumented, but it prints out the 
opcode and oparg of every bytecode instruction executed, along with some info 
about stack operations, whenever you've built with Py_DEBUG and the name 
`__ltrace__` is defined in the module.

I've found this useful for debugging bits of the compiler and bytecode 
interpreter. For example, if I make some tweak that introduces an off-by-one 
error, by default I get a segfault or a rather unhelpful assertion failure at 
`assert(EMPTY())` or `assert(STACK_LEVEL() <= frame->f_code->co_stacksize)` or 
similar, at best, with no indication as to where or why that assertion is 
failing. But if I enable `__ltrace__` by either setting `__ltrace__=1` in some 
module or by manually setting `lltrace=1;` in the c code, I can follow what was 
happening in the interpreter just before the crash.

I'd like the output in that scenario to be a bit more helpful. I propose 
printing opcode names rather than decimal digits, and printing out the name of 
the current function whenever a stack frame begins executing. I also propose 
to print out the full stack contents (almost never very deep) before each 
bytecode, rather than printing the state piecemeal at each 
PUSH/POP/STACK_ADJUST macro. I opened issue 
https://github.com/python/cpython/issues/91462 and PR 
https://github.com/python/cpython/pull/91463
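
To illustrate the "names rather than decimal digits" part only, here is a small Python sketch. It is not the C change in the PR and it lists bytecode statically instead of tracing execution; the field order (offset, opcode, oparg) and both helper names are assumptions made for the example.

```python
import dis


def format_old(offset, opcode, oparg):
    # Assumed field order for the historical "%d: %d, %d" line.
    return "%d: %d, %d" % (offset, opcode, oparg)


def format_named(offset, opcode, oparg):
    # Same fields, with the opcode number replaced by its name.
    return "%d: %s %d" % (offset, dis.opname[opcode], oparg)


def raw_instructions(code):
    """Yield (offset, opcode, oparg) from 2-byte wordcode (CPython 3.6+)."""
    raw = code.co_code
    for offset in range(0, len(raw), 2):
        yield offset, raw[offset], raw[offset + 1]


def sample(a, b):
    return a + b


old_lines = [format_old(*ins) for ins in raw_instructions(sample.__code__)]
named_lines = [format_named(*ins) for ins in raw_instructions(sample.__code__)]
```

The exact opcodes printed for `sample` differ between CPython versions, but the named form stays readable without an opcode table at hand.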

I later found that this had been explored before by 
https://github.com/python/cpython/issues/69757, and there was a suggestion that 
this could be folded into a more generalized bytecode-level tracing feature 
that is pluggable with python code, similar to sys.settrace(). I would tend to 
think "YAGNI" -- lltrace is a feature for debugging the c internals of the 
interpreter, and there are already separate existing features like the `trace` 
module for tracing through Python code with different goals. I appreciate the 
simplicity of printf statements at the c level -- it feels more trustworthy 
than adding a complicated extra feature involving python calls and global 
state. It's as if I just littered the code with my own debugging print 
statements, but re-usable and better.

I see no documentation anywhere, and there's only one test case in 
test_lltrace, just testing that there's no segfault. Looking back through the 
git history, I see that the basic `printf("%d: %d, %d\n", ...);` format goes 
back to 1990: 
https://github.com/python/cpython/blob/3f5da24ea304e674a9abbdcffc4d671e32aa70f1/Python/ceval.c#L696-L710

I'm essentially writing to ask: how do you use lltrace? Does anyone rely on the 
particular format of the output? Would some of these improvements be helpful to 
you? What else could make it more helpful?

Thanks,
Dennis Sweeney
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/DELHX3N5PCZDWIK2DLU5JDG6JREQ42II/


[Python-Dev] Re: Decreasing refcount for locals before popping frame

2022-04-28 Thread Dennis Sweeney
I don't know if there's anything specifically stopping this, but from what I 
understand, the precise moment that a finalizer gets called is unspecified, so 
relying on any sort of behavior there is undefined and non-portable. 
Implementations like PyPy don't always use reference counting, so their garbage 
collection might get called some unspecified amount of time later.

I'm not familiar with Airflow, but would you be able to decorate the create() 
function to check for good return values? Something like

    import functools

    def dag_initializer(func):
        @functools.wraps(func)
        def wrapper():
            with DAG(...) as dag:
                result = func(dag)
            del dag
            if not isinstance(result, DAG):
                raise ValueError(f"{func.__name__} did not return a dag")
            return result
        return wrapper

    @dag_initializer
    def create(dag):
        "some code here"
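
For anyone who wants to try the decorator idea without Airflow installed, here is a self-contained sketch. The `DAG` class below is only a stand-in context manager (an assumption, not Airflow's real API), and `create_badly` is a hypothetical function added to show the failure path.

```python
import functools


class DAG:
    """Minimal stand-in for Airflow's DAG class; not the real API."""

    def __init__(self, dag_id):
        self.dag_id = dag_id

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False


def dag_initializer(func):
    @functools.wraps(func)
    def wrapper():
        with DAG("example") as dag:
            result = func(dag)
        del dag  # drop the wrapper's own reference before validating
        if not isinstance(result, DAG):
            raise ValueError(f"{func.__name__} did not return a dag")
        return result
    return wrapper


@dag_initializer
def create(dag):
    return dag


@dag_initializer
def create_badly(dag):
    return None  # forgetting to return the dag triggers the ValueError
```

The check runs on every call, so a missing return value is reported explicitly and nothing depends on when a finalizer happens to run.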
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EBCLFYZLCTANUYSPZ55GFHG5I7DDTR76/


[Python-Dev] Looking for a sponsor and feedback on PEP 616: string methods for removing prefixes and suffixes

2020-03-20 Thread Dennis Sweeney
Hello all! I'm a relatively new contributor looking for a Core Dev sponsor for 
the following PEP:

https://github.com/python/peps/pull/1332

Related:
- Python-Ideas Thread: 
https://mail.python.org/archives/list/python-id...@python.org/thread/RJARZSUKCXRJIP42Z2YBBAEN5XA7KEC3/
- Bug Tracker Issue: https://bugs.python.org/issue39939
- Github PR for implementation: https://github.com/python/cpython/pull/18939

Abstract
========

This is a proposal to add two new methods, ``cutprefix`` and
``cutsuffix``, to the APIs of Python's various string objects.  In
particular, the methods would be added to Unicode ``str`` objects, 
binary ``bytes`` and ``bytearray`` objects, and
``collections.UserString``. 

If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then
``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has
been removed.  If ``s`` does not have ``pre`` as a prefix, an 
unchanged copy of ``s`` is returned.  In summary, ``s.cutprefix(pre)``
is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.

The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is
roughly equivalent to 
``s[:-len(suf)] if suf and s.endswith(suf) else s``.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/XC3D3QGONENQ7PIAUM2SNNEP5BWA6Q4J/


[Python-Dev] PEP 616 -- String methods to remove prefixes and suffixes

2020-03-20 Thread Dennis Sweeney
Browser Link: https://www.python.org/dev/peps/pep-0616/

PEP: 616
Title: String methods to remove prefixes and suffixes
Author: Dennis Sweeney 
Sponsor: Eric V. Smith 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Mar-2020
Python-Version: 3.9
Post-History: 30-Aug-2002


Abstract
========

This is a proposal to add two new methods, ``cutprefix`` and
``cutsuffix``, to the APIs of Python's various string objects.  In
particular, the methods would be added to Unicode ``str`` objects, 
binary ``bytes`` and ``bytearray`` objects, and
``collections.UserString``. 

If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then
``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has
been removed.  If ``s`` does not have ``pre`` as a prefix, an 
unchanged copy of ``s`` is returned.  In summary, ``s.cutprefix(pre)``
is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.

The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is
roughly equivalent to 
``s[:-len(suf)] if suf and s.endswith(suf) else s``.


Rationale
=========

There have been repeated issues [#confusion]_ on the Bug Tracker 
and StackOverflow related to user confusion about the existing 
``str.lstrip`` and ``str.rstrip`` methods.  These users are typically
expecting the behavior of ``cutprefix`` and ``cutsuffix``, but they 
are surprised that the parameter for ``lstrip`` is interpreted as a
set of characters, not a substring.  This repeated issue is evidence
that these methods are useful, and the new methods allow a cleaner
redirection of users to the desired behavior.

As another testimonial for the usefulness of these methods, several
users on Python-Ideas [#pyid]_ reported frequently including similar
functions in their own code for productivity.  The implementation
often contained subtle mistakes regarding the handling of the empty
string (see `Specification`_).
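
The confusion is easy to reproduce. The sketch below uses stand-alone helper functions built from the one-line equivalents in the Abstract (the proposed methods do not exist on ``str`` in this draft, so the helpers are only an approximation) and compares them with ``lstrip`` and ``rstrip``:

```python
def cutprefix(s, pre):
    # Rough equivalent quoted in the Abstract.
    return s[len(pre):] if s.startswith(pre) else s


def cutsuffix(s, suf):
    # Rough equivalent quoted in the Abstract.
    return s[:-len(suf)] if suf and s.endswith(suf) else s


# lstrip/rstrip treat the argument as a set of characters to remove.
stripped_left = "www.wikipedia.org".lstrip("www.")   # 'ikipedia.org'
stripped_right = "happy.py".rstrip(".py")            # 'ha'

# The proposed behavior removes the affix as a whole, at most once.
cut_left = cutprefix("www.wikipedia.org", "www.")    # 'wikipedia.org'
cut_right = cutsuffix("happy.py", ".py")             # 'happy'
```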


Specification
=============

The builtin ``str`` class will gain two new methods with roughly the
following behavior::

    def cutprefix(self: str, pre: str, /) -> str:
        if self.startswith(pre):
            return self[len(pre):]
        return self[:]

    def cutsuffix(self: str, suf: str, /) -> str:
        if suf and self.endswith(suf):
            return self[:-len(suf)]
        return self[:]

The only difference between the real implementation and the above is
that, as with other string methods like ``replace``, the 
methods will raise a ``TypeError`` if any of ``self``, ``pre`` or 
``suf`` is not an instance of ``str``, and will cast subclasses of
``str`` to builtin ``str`` objects.

Note that without the check for the truthiness of ``suf``, 
``s.cutsuffix('')`` would be mishandled and always return the empty 
string due to the unintended evaluation of ``self[:-0]``.
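
A short demonstration of that pitfall, written as plain functions for illustration (``cutsuffix_buggy`` is a hypothetical variant that leaves out the guard):

```python
def cutsuffix_buggy(s, suf):
    # Omits the `suf and` guard from the Specification.
    return s[:-len(suf)] if s.endswith(suf) else s


def cutsuffix(s, suf):
    return s[:-len(suf)] if suf and s.endswith(suf) else s


# Every string ends with '', and "spam"[:-0] is the same as "spam"[:0].
buggy = cutsuffix_buggy("spam", "")   # ''
fixed = cutsuffix("spam", "")         # 'spam'
```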

Methods with the corresponding semantics will be added to the builtin 
``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.cutsuffix()`` and ``b.cutprefix()``
will accept any bytes-like object as an argument.

Note that the ``bytearray`` methods return a copy of ``self``; they do
not operate in place.

The following behavior is considered a CPython implementation detail,
but is not guaranteed by this specification::

    >>> x = 'foobar' * 10**6
    >>> x.cutprefix('baz') is x is x.cutsuffix('baz')
    True
    >>> x.cutprefix('') is x is x.cutsuffix('')
    True

That is, for CPython's immutable ``str`` and ``bytes`` objects, the 
methods return the original object when the affix is not found or if
the affix is empty.  Because these types test for equality using 
shortcuts for identity and length, the following equivalent 
expressions are evaluated at approximately the same speed, for any 
``str`` objects (or ``bytes`` objects) ``x`` and ``y``::

    >>> (True, x[len(y):]) if x.startswith(y) else (False, x)
    >>> (True, z) if x != (z := x.cutprefix(y)) else (False, x)


The two methods will also be added to ``collections.UserString``, 
where they rely on the implementation of the new ``str`` methods.


Motivating examples from the Python standard library
====================================================

The examples below demonstrate how the proposed methods can make code
one or more of the following:

Less fragile:
    The code will not depend on the user to count the length of a
    literal.
More performant:
    The code does not require a call to the Python built-in 
    ``len`` function.
More descriptive:
    The methods give a higher-level API for code readability, as
    opposed to the traditional method of string slicing.


refactor.py
-----------

- Current::

    if fix_name.startswith(self.FILE_PREFIX):
        fix_name = fix_name[len(self.FILE_PREFIX):]

- Improved::

    fix_name = fix_name.cutprefix(

[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-20 Thread Dennis Sweeney
Thanks for the feedback!

I meant mnemonic as in the broader sense of "way of remembering things", not 
some kind of rhyming device or acronym. Maybe "mnemonic" isn't the perfect 
word. I was just trying to say that the structure of how the methods are named 
should how their behavior relates to one another, which it seems you agree with.

Fair enough that ``[l/r]strip`` and the proposed methods share the behavior of 
"removing something from the end of a string". From that perspective, they're 
similar. But my thought was that ``s.lstrip("abc")`` has extremely similar 
behavior when changing "lstrip" to "rstrip" or "strip" -- the argument is 
interpreted in exactly the same way (as a character set) in each case. Looking 
at how the argument is used, I'd argue that ``lstrip``/``rstrip``/``strip`` are 
much more similar to each other than they are to the proposed methods, and that 
the proposed methods are perhaps more similar to something like 
``str.replace``. But it does seem pretty subjective what the threshold is for 
behavior similar enough to have related names -- I see where you're coming from.

Also, the docs at ( 
https://docs.python.org/3/library/stdtypes.html?highlight=lstrip#string-methods 
) are alphabetical, not grouped by "similar names", so even ``lstrip``, 
``strip``, and ``rstrip`` are already in different places. Maybe the name 
"stripprefix" would be more discoverable when "Ctrl-f"ing the docs, if it 
weren't for the following addition in the linked PR:

 .. method:: str.lstrip([chars])

    Return a copy of the string with leading characters removed.  The *chars*
    argument is a string specifying the set of characters to be removed.  If omitted
    or ``None``, the *chars* argument defaults to removing whitespace.  The *chars*
    argument is not a prefix; rather, all combinations of its values are stripped::

       >>> '   spacious   '.lstrip()
       'spacious   '
       >>> 'www.example.com'.lstrip('cmowz.')
       'example.com'

+   See :meth:`str.cutprefix` for a method that will remove a single prefix
+   string rather than all of a set of characters.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/55QJHR6PP4IWFLBRTFL4TZX5QOBJQFO5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-20 Thread Dennis Sweeney
Dennis Sweeney wrote:
> to say that the structure of how the methods are named should how their 
> behavior relates

...should be a reminder of how...
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/IFGANJGTE5RQ5J6FBJJNAWY7HRZSRED5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-20 Thread Dennis Sweeney
For clarity, I'll change

 If ``s`` does not have ``pre`` as a prefix, an unchanged copy of ``s`` is 
returned.

to

If ``s`` does not have ``pre`` as a prefix, then ``s.cutprefix(pre)`` 
returns ``s`` or an unchanged copy of ``s``.

For consistency with the Specification section, I'll also change

s[len(pre):] if s.startswith(pre) else s

to

s[len(pre):] if s.startswith(pre) else s[:]

and similarly change the ``cutsuffix`` snippet.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ULKK7K47QKFHXFXKNEAVF2UVNV6ZJNSD/


[Python-Dev] (no subject)

2020-03-20 Thread Dennis Sweeney
Thanks for the review!

> In short, I propose:
>     def cutprefix(self: str, prefix: str, /) -> str:
>         if self.startswith(prefix) and prefix:
>             return self[len(prefix):]
>         else:
>             return self
> 
> I call startswith() before testing if pre is non-empty to inherit of
> startswith() input type validation. For example, "a".startswith(b'x')
> raises a TypeError.

This still erroneously accepts tuples and would return str
subclasses unchanged. If we want to make the Python be the spec with accuracy 
about type-checking, then perhaps we want:

    def cutprefix(self: str, prefix: str, /) -> str:
        if not isinstance(prefix, str):
            raise TypeError(f'cutprefix() argument must be str, '
                            f'not {type(prefix).__qualname__}')
        self = str(self)
        prefix = str(prefix)
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self

I like the idea to always require these to return the unmodified string. Is 
there a reason this isn't specified by the ``replace()`` or ``strip`` methods?

For accepting multiple prefixes, I can't tell if there's a consensus about 
whether ``s = s.cutprefix("a", "b", "c")``
should be the same as

    for prefix in ["a", "b", "c"]:
        s = s.cutprefix(prefix)

or

    for prefix in ["a", "b", "c"]:
        if s.startswith(prefix):
            s = s.cutprefix(prefix)
            break

The latter seems to be harder for users to implement through other means, and 
it's the behavior that test_concurrent_futures.py has implemented now, so maybe 
that's what we want. Also, it seems more elegant to me to accept variadic 
arguments, rather than a single tuple of arguments. Is it worth it to match the 
related-but-not-the-same API of "startswith" if it makes for uglier Python? My 
gut reaction is to prefer the varargs, but maybe someone has a different 
perspective.
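
To show that the two readings really differ, here is a small sketch with stand-alone functions (the names ``cut_each`` and ``cut_first_match`` are made up for this example):

```python
def cutprefix(s, prefix):
    return s[len(prefix):] if s.startswith(prefix) else s


def cut_each(s, *prefixes):
    # First reading: try to remove every prefix, in order.
    for prefix in prefixes:
        s = cutprefix(s, prefix)
    return s


def cut_first_match(s, *prefixes):
    # Second reading: remove only the first prefix that matches.
    for prefix in prefixes:
        if s.startswith(prefix):
            return cutprefix(s, prefix)
    return s


each = cut_each("abcd", "a", "b", "c")          # 'd'
first = cut_first_match("abcd", "a", "b", "c")  # 'bcd'
```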

I can submit a revision to the PEP with some changes tomorrow.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/GXGP5T5KC6ZEBZ5AON4G3MHIKO6MAU35/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
Hi Victor. I accidentally created a new thread, but I intended everything below 
as a response:

Thanks for the review!

> In short, I propose:
>     def cutprefix(self: str, prefix: str, /) -> str:
>         if self.startswith(prefix) and prefix:
>             return self[len(prefix):]
>         else:
>             return self
> I call startswith() before testing if pre is non-empty to inherit of
> startswith() input type validation. For example, "a".startswith(b'x')
> raises a TypeError.

This still erroneously accepts tuples and would return str
subclasses unchanged. If we want to make the Python be the spec with accuracy 
about type-checking, then perhaps we want:

    def cutprefix(self: str, prefix: str, /) -> str:
        if not isinstance(prefix, str):
            raise TypeError(f'cutprefix() argument must be str, '
                            f'not {type(prefix).__qualname__}')
        self = str(self)
        prefix = str(prefix)
        if self.startswith(prefix):
            return self[len(prefix):]
        else:
            return self

For accepting multiple prefixes, I can't tell if there's a consensus about 
whether ``s = s.cutprefix("a", "b", "c")`` should be the same as

    for prefix in ["a", "b", "c"]:
        s = s.cutprefix(prefix)

or

    for prefix in ["a", "b", "c"]:
        if s.startswith(prefix):
            s = s.cutprefix(prefix)
            break

The latter seems to be harder for users to implement through other means, and 
it's the behavior that test_concurrent_futures.py has implemented now, so maybe 
that's what we want. Also, it seems more elegant to me to accept variadic 
arguments, rather than a single tuple of arguments. Is it worth it to match the 
related-but-not-the-same API of "startswith" if it makes for uglier Python? My 
gut reaction is to prefer the varargs, but maybe someone has a different 
perspective.

I can submit a revision to the PEP with some changes soon.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NYVDSQ7XB3KOXREY5FUALEILB2UCUVD3/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
Even then, it seems that prefix is an established computer science term:

[1] https://en.wikipedia.org/wiki/Substring#Prefix
[2] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). 
Introduction to Algorithms (1st ed.). Chapter 15.4: Longest common subsequence

And a quick search reveals that it's used hundreds of times in the docs: 
https://docs.python.org/3/search.html?q=prefix
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/R7CC6LEZHVLTILXGYFYGVXYTDANVJFNF/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
> But Dennis, welcome to the wonderful world of change proposals, where
> you will experience insane amounts of pushback and debate on the
> finest points of bikeshedding, whether or not people actually even
> support the proposal at all...

Lol -- thanks!

In my mind, another reason that I like including the words "prefix" and 
"suffix" over "start" and "end" is that, even though using the verb "end" in 
"endswith" is unambiguous, the noun "end" can be used as either the initial or 
final end, as in "remove this thing from both ends of the string." So "suffix" 
feels more precise to me.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/7UHLOAR6NTVNLN3RBQP6ONHTLTDGXLQW/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
Is there a proven use case for anything other than the empty string as the 
replacement? I prefer your "replacewhatever" to another "stripwhatever" name, 
and I think it's clear and nicely fits the behavior you proposed. But should we 
allow a naming convenience to dictate that the behavior should be generalized 
to a use case we're not sure exists, where the same argument is passed 99% 
of the time? I think a downside would be that a 
pass-a-string-or-a-tuple-of-strings interface would be more mental effort to 
keep track of than a ``*args`` variadic interface for 
"(cut/remove/without/trim)prefix", even if the former is how ``startswith()`` 
works.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/KS25JX4V5LR3ZCV4EXU763RLTT24D4JT/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-21 Thread Dennis Sweeney
I like "removeprefix" and "removesuffix". My only concern before had been 
length, but three more characters than "cut***fix" is a small price to pay for 
clarity.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/Y4O2AIODGI2Z45A32UK5EHR7A7RLQFOK/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Dennis Sweeney
Here's an updated version.

Online: https://www.python.org/dev/peps/pep-0616/
Source: https://raw.githubusercontent.com/python/peps/master/pep-0616.rst

Changes:
- More complete Python implementation to match what the type checking in 
the C implementation would be
- Clarified that returning ``self`` is an optimization
- Added links to past discussions on Python-Ideas and Python-Dev
- Specified ability to accept a tuple of strings
- Shorter abstract section and fewer stdlib examples
- Mentioned 
- Typo and formatting fixes

I didn't change the name because it didn't seem like there was a strong 
consensus for an alternative yet. I liked the suggestions of ``dropprefix`` or 
``removeprefix``.

All the best,
Dennis
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/RY7GS4GF7OT7CLZVEDSULMY53QZYDN5Y/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-22 Thread Dennis Sweeney
Much appreciated! I will add that single quote and change those snippets to::

 >>> s = 'FooBar' * 100 + 'Baz'
 >>> prefixes = ('Bar', 'Foo')
 >>> while len(s) != len(s := s.cutprefix(prefixes)): pass
 >>> s
 'Baz'

and::

 >>> s = 'FooBar' * 100 + 'Baz'
 >>> prefixes = ('Bar', 'Foo')
 >>> while s.startswith(prefixes): s = s.cutprefix(prefixes)
 >>> s
 'Baz'
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QJ54X6WHQQ5HFROSJOLGJF4QMFINMAPY/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Dennis Sweeney
This should be fixed now.
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/TQQXDLROEKI5ANEF3J7ESFO2VNYRVDYB/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-23 Thread Dennis Sweeney
Steven D'Aprano wrote:
> Having confirmed that prefix is a tuple, you call tuple() to 
> make a copy of it in order to iterate over it. Why?
> 
> Having confirmed that option is a string, you call str() on
> it to (potentially) make a copy. Why?

This was an attempt to ensure no one can do funny business with tuple or str 
subclassing. I was trying to emulate the ``PyTuple_Check`` followed by 
``PyTuple_GET_SIZE`` and ``PyTuple_GET_ITEM`` that are done by the C 
implementation of ``str.startswith()`` to ensure that only the tuple/str 
methods are used, not arbitrary user subclass code. It seems that that's what 
most of the ``str`` methods force.

I was mistaken in how to do this with pure Python. I believe I actually wanted 
something like:

    def cutprefix(self, prefix, /):
        if not isinstance(self, str):
            raise TypeError()

        if isinstance(prefix, tuple):
            for option in tuple.__iter__(prefix):
                if not isinstance(option, str):
                    raise TypeError()

                if str.startswith(self, option):
                    return str.__getitem__(
                        self, slice(str.__len__(option), None))

            return str.__getitem__(self, slice(None, None))

        if not isinstance(prefix, str):
            raise TypeError()

        if str.startswith(self, prefix):
            return str.__getitem__(self, slice(str.__len__(prefix), None))
        else:
            return str.__getitem__(self, slice(None, None))

... which looks even uglier.
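
The reason for spelling everything through ``str.__len__`` and ``str.__getitem__`` can be seen with a deliberately misbehaving subclass. This is only a sketch; ``Liar`` is a hypothetical class invented for the example:

```python
class Liar(str):
    """Hypothetical str subclass that overrides dunder methods."""

    def __len__(self):
        return 0

    def __getitem__(self, index):
        return "overridden"


s = Liar("FooBar")

via_builtin = len(s)          # 0: the subclass override is used
via_str = str.__len__(s)      # 6: the base str behavior

sliced_normally = s[3:]                               # 'overridden'
sliced_via_str = str.__getitem__(s, slice(3, None))   # 'Bar'
```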

> We ought to get some real-life exposure to the simple case first, before 
> adding support for multiple prefixes/suffixes.

I could be (and have been) convinced either way about whether or not to 
generalize to tuples of strings. I thought Victor made a good point about 
compatibility with ``startswith()``
___
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/CQVVWGPC454LWATA2Y7BZ5OEAGVSTHEZ/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
I think my confusion is about just how precise this sort of "reference 
implementation" should be. Should it behave with ``str`` and ``tuple`` 
subclasses exactly how it would when implemented? If so, I would expect the 
following to work:

class S(str): __len__ = __getitem__ = __iter__ = None
class T(tuple): __len__ = __getitem__ = __iter__ = None

x = str.cutprefix("FooBar", T(("a", S("Foo"), 17)))
assert x == "Bar"
assert type(x) is str

and so I think the ``str.__getitem__(self, slice(str.__len__(prefix), None))`` 
monstrosity would be the most technically correct, unless I'm missing 
something. But I've never seen Python code so ugly. And I suppose this is a 
slippery slope -- should it also guard against people redefining ``len = lambda 
x: 5`` and ``str = list`` in the global scope? Clearly not. I think then maybe 
it would be preferable to use something like the following in the PEP:

def cutprefix(self, prefix, /):
    if isinstance(prefix, str):
        if self.startswith(prefix):
            return self[len(prefix):]
        return self[:]
    elif isinstance(prefix, tuple):
        for option in prefix:
            if self.startswith(option):
                return self[len(option):]
        return self[:]
    else:
        raise TypeError()


def cutsuffix(self, suffix):
    if isinstance(suffix, str):
        if self.endswith(suffix):
            return self[:len(self)-len(suffix)]
        return self[:]
    elif isinstance(suffix, tuple):
        for option in suffix:
            if self.endswith(option):
                return self[:len(self)-len(option)]
        return self[:]
    else:
        raise TypeError()

The above would fail the assertions as written before, but would pass them for 
subclasses ``class S(str): pass`` and ``class T(tuple): pass`` that do not 
override any dunder methods. Is this an acceptable compromise if it appears 
alongside a clarifying sentence like the following?

These methods should always return base ``str`` objects, even when called 
on ``str`` subclasses.

I'm looking for guidance as to whether that's an appropriate level of precision 
for a PEP. If so, I'll make that change.
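
To make the trade-off concrete, here is a minimal sketch comparing the two styles against the dunder-less subclasses from earlier in this message. The functions are standalone rather than real ``str`` methods, the names ``cutprefix_strict`` and ``cutprefix_simple`` are made up for illustration, and the type checks are omitted for brevity:

```python
class S(str):
    __len__ = __getitem__ = __iter__ = None


class T(tuple):
    __len__ = __getitem__ = __iter__ = None


def cutprefix_strict(self, prefix, /):
    # Calls the base-class methods directly, so overridden dunders are ignored.
    if isinstance(prefix, tuple):
        for option in tuple.__iter__(prefix):
            if str.startswith(self, option):
                return str.__getitem__(self, slice(str.__len__(option), None))
        return str.__getitem__(self, slice(None, None))
    if str.startswith(self, prefix):
        return str.__getitem__(self, slice(str.__len__(prefix), None))
    return str.__getitem__(self, slice(None, None))


def cutprefix_simple(self, prefix, /):
    # Uses ordinary syntax, so it goes through whatever the subclass defines.
    if isinstance(prefix, tuple):
        for option in prefix:
            if self.startswith(option):
                return self[len(option):]
        return self[:]
    if self.startswith(prefix):
        return self[len(prefix):]
    return self[:]


strict_result = cutprefix_strict("FooBar", T(("a", S("Foo"), 17)))

try:
    cutprefix_simple("FooBar", T(("a", S("Foo"), 17)))
    simple_failed = False
except TypeError:
    # T.__iter__ is None, so the plain for-loop cannot iterate over it.
    simple_failed = True
```

The strict version returns a base ``str`` even for these hostile subclasses, while the simple version raises ``TypeError`` as soon as it tries to iterate over ``T``.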

All the best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PV6ANJL7KN4VHPSNPZSAZGQCEWHEKYG2/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
Steven D'Aprano wrote:
> On Tue, Mar 24, 2020 at 08:14:33PM -, Dennis Sweeney wrote:
> > I think then maybe it would be preferred to
> > use the something like the following in the PEP:
> > def cutprefix(self, prefix, /):
> >     if isinstance(prefix, str):
> >         if self.startswith(prefix):
> >             return self[len(prefix):]
> >         return self[:]
>
> Didn't we have a discussion about not mandating a copy when nothing
> changes? For strings, I'd just return self. It is only bytearray that
> requires a copy to be made.

It appears that in CPython, ``self[:] is self`` is true for base ``str`` 
objects, so I think ``return self[:]`` is consistent with (1) the premise that 
returning self is an implementation detail that is neither mandated nor 
forbidden, and (2) the premise that the methods should return base ``str`` 
objects even when called on ``str`` subclasses.
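
A small sketch of the two premises, assuming a trivial subclass named ``MyStr`` (a name chosen only for this example). Whether ``base[:]`` is the very same object as ``base`` is an implementation detail, so the sketch records it without relying on it:

```python
class MyStr(str):
    pass


base = "spam"
sub = MyStr("spam")

base_copy = base[:]
sub_copy = sub[:]

# Implementation detail: CPython happens to reuse the object for a base str,
# but nothing here should depend on that.
same_object_for_base = base_copy is base

# Slicing a str subclass (with no overridden __getitem__) yields a base str.
sub_copy_type = type(sub_copy)
```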

> >     elif isinstance(prefix, tuple):
> >         for option in prefix:
> >             if self.startswith(option):
> >                 return self[len(option):]
>
> I'd also remove the entire multiple substrings feature, for reasons I've
> already given. "Compatibility with startswith" is not a good reason to
> add this feature and you haven't established any good use-cases for it.
> A closer analog is str.replace(substring, ''), and after almost 30 years
> of real-world experience, that method still only takes a single
> substring, not a tuple.

The ``test_concurrent_futures.py`` example seemed to be a good use case to me. 
I agree that it would be good to see how common that actually is though. But it 
seems to me that any alternative behavior, e.g. repeated removal, could be 
implemented by a user on top of the remove-only-the-first-found behavior or by 
fluently chaining multiple method calls. Maybe you're right that it's too 
complex, but I think it's at least worth discussing.
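
As a rough sketch of that layering, here is how repeated removal could be built on top of a remove-only-the-first-found helper. Both function names are hypothetical, and skipping empty prefixes is a choice made here only to keep the loop from spinning:

```python
def remove_first_prefix(s, prefixes):
    """Remove at most one prefix: the first one in the tuple that matches."""
    for prefix in prefixes:
        # Empty prefixes are skipped; they would "match" without removing anything.
        if prefix and s.startswith(prefix):
            return s[len(prefix):]
    return s


def remove_all_prefixes(s, prefixes):
    """Keep removing prefixes until none of them matches any more."""
    while True:
        shorter = remove_first_prefix(s, prefixes)
        if len(shorter) == len(s):
            return s
        s = shorter


once = remove_first_prefix("foofoobar", ("foo", "baz"))
repeated = remove_all_prefixes("ababc", ("ab",))
# Chaining two single-prefix removals is the other alternative mentioned above.
chained = remove_first_prefix(remove_first_prefix("xyz", ("x",)), ("y",))
```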
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TRTHGTLOEQXSYYXKQ6RFEXMGDI7O57EL/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
It seems that there is a consensus on the names ``removeprefix`` and 
``removesuffix``. I will update the PEP accordingly. I'll also simplify the sample
Python implementation to primarily reflect *intent* over strict type-checking 
correctness, and I'll adjust the accompanying commentary accordingly.

Lastly, since the issue of multiple prefixes/suffixes is more controversial and
it seems that it would not affect how the single-affix cases would work, I can
remove that from this PEP and allow someone else with a stronger opinion about 
it to propose and defend a set of semantics in a different PEP. Is there any 
objection to deferring this to a different PEP?

All the best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5JJ5YDUPCLVYSCCFOI4MQG64SLY22HU5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-24 Thread Dennis Sweeney
There were at least two comments suggesting keeping it to one affix at a time:

https://mail.python.org/archives/list/python-dev@python.org/message/GPXSIDLKTI6WKH5EKJWZEG5KR4AQ6P3J/

https://mail.python.org/archives/list/python-dev@python.org/message/EDWFPEGQBPTQTVZV5NDRC2DLSKCXVJPZ/

But I didn't see any big objections to the rest of the PEP, so I think maybe we 
keep it restricted for now.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/QBCB2QMUMYBLPXHB6VKIKFK7OODYVKX5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Dennis Sweeney
I'm removing the tuple feature from this PEP. So now, if I understand
correctly, I don't think there's disagreement about behavior, just about
how that behavior should be summarized in Python code. 

Ethan Furman wrote:
> > It appears that in CPython, self[:] is self is true for base str
> > objects, so I think return self[:] is consistent with (1) the premise
> > that returning self is an implementation detail that is neither mandated
> > nor forbidden, and (2) the premise that the methods should return base
> > str objects even when called on str subclasses.
> The Python interpreter in my head sees self[:] and returns a copy. A
> note that says a str is returned would be more useful than trying to
> exactly mirror internal details in the Python "roughly equivalent" code.

I think I'm still in the camp that ``return self[:]`` more precisely prescribes
the desired behavior. It would feel strange to me to write ``return self``
and then say "but you don't actually have to return self, and in fact
you shouldn't when working with subclasses". To me, it feels like

    return (the original object unchanged, or a copy of the object,
            depending on implementation details,
            but always make a copy when working with subclasses)

is well-summarized by

    return self[:]

especially if followed by the text

Note that ``self[:]`` might not actually make a copy -- if the affix
is empty or not found, and if ``type(self) is str``, then these methods
may, but are not required to, make the optimization of returning ``self``.
However, when called on instances of subclasses of ``str``, these
methods should return base ``str`` objects, not ``self``.

...which is a necessary explanation regardless. Granted, ``return self[:]``
isn't perfect if ``__getitem__`` is overridden, but at the cost of three
characters, the Python code gains accuracy over both the optional nature of
returning ``self`` in all cases and the impossibility (assuming no dunders
are overridden) of returning self for subclasses. It also dissuades readers
from relying on the behavior of returning self, which we're specifying is
an implementation detail.

Is that text explanation satisfactory?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/4E77QD52JCMHSP7O62C57XILLQN6SPCT/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-25 Thread Dennis Sweeney
I was surprised by the following behavior:

class MyStr(str):
    def __getitem__(self, key):
        # slice objects expose start/stop/step (there is no "end" attribute)
        if isinstance(key, slice) and key.start is key.stop is key.step is None:
            return self
        return type(self)(super().__getitem__(key))


my_foo = MyStr("foo")
MY_FOO = MyStr("FOO")
My_Foo = MyStr("Foo")
empty = MyStr("")

assert type(my_foo.casefold()) is str
assert type(MY_FOO.capitalize()) is str
assert type(my_foo.center(3)) is str
assert type(my_foo.expandtabs()) is str
assert type(my_foo.join(())) is str
assert type(my_foo.ljust(3)) is str
assert type(my_foo.lower()) is str
assert type(my_foo.lstrip()) is str
assert type(my_foo.replace("x", "y")) is str
assert type(my_foo.split()[0]) is str
assert type(my_foo.splitlines()[0]) is str
assert type(my_foo.strip()) is str
assert type(empty.swapcase()) is str
assert type(My_Foo.title()) is str
assert type(MY_FOO.upper()) is str
assert type(my_foo.zfill(3)) is str

assert type(my_foo.partition("z")[0]) is MyStr
assert type(my_foo.format()) is MyStr

I was under the impression that all of the ``str`` methods exclusively returned 
base ``str`` objects. Is there any reason why those two are different, and is 
there a reason that would apply to ``removeprefix`` and ``removesuffix`` as 
well?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TVDATHMCK25GT4OTBUBDWG3TBJN6DOKK/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Dennis Sweeney
> I imagine it's an implementation detail of which ones depend on 
> ``__getitem__``.

If we write

class MyStr(str):
    def __getitem__(self, key):
        raise ZeroDivisionError()

then all of the assertions from before still pass, so in fact *none* of
the methods rely on ``__getitem__``. As of now ``str`` does not behave
as an ABC at all.
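
A self-contained sketch of that experiment, checking only a handful of the methods from the earlier list (the selection is arbitrary):

```python
class MyStr(str):
    def __getitem__(self, key):
        raise ZeroDivisionError()


my_foo = MyStr("foo")

# None of these calls goes through the overridden __getitem__.
results = [
    my_foo.upper(),
    my_foo.lower(),
    my_foo.strip(),
    my_foo.replace("x", "y"),
]

# Explicit indexing, on the other hand, does hit the override.
try:
    my_foo[0]
    indexing_raised = False
except ZeroDivisionError:
    indexing_raised = True
```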

But it's an interesting proposal to essentially make it an ABC, although
it makes me curious about the reasons people actually have for
subclassing ``str``. All of the examples I found in the stdlib were
either (1) contrived test cases, (2) strings (e.g. grammar tokens) with
some extra attributes along for the ride, or (3) string-based enums.
None of types (2) or (3) ever overrode ``__getitem__``, so it doesn't
feel like that common of a use case.

> I don't see removeprefix and removesuffix explicitly being implemented
> in terms of slicing operations as a huge win - you've demonstrated that
> someone who wants a persistent string subclass still would need to
> override a /lot/ of methods, so two more shouldn't hurt much - I just
> think that "consistent with most of the other methods" is a
> /particularly/ good reason to avoid explicitly defining these operations
> in terms of __getitem__. 

Making sure I understand: would you prefer the PEP to say ``return self``
rather than ``return self[:]``? I never had the intention of ``self[:]``
meaning "this must have exactly the behavior of 
``self.__getitem__(slice(None, None))`` regardless of type", but I can 
understand if that's how you're saying it could be interpreted.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/A64Q6BXTXJYNTA4NX2GHBMOG6FPZUCZP/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-26 Thread Dennis Sweeney
> I don't understand that list bit -- surely, if I'm bothering to implement
> removeprefix and removesuffix in my subclass, I would also want to
> return self to keep my subclass?  Why would I want to go through the extra
> overhead of either calling my own __getitem__ method, or have the
> str.__getitem__ method discard my subclass?

I should clarify: by "when working with subclasses" I meant "when
str.removeprefix() is called on a subclass that does not override
removeprefix", and in that case it should return a base str. I was
not taking a stance on how the methods should be overridden, and
I'm not sure there are many use cases where it should be.

> However, if you are saying that self[:] will call
> self.__class__.__getitem__ so my subclass only has to override
> __getitem__ instead of removeprefix and removesuffix, that I can be
> happy with.

I was only saying that the new methods should match 20 other methods in 
the str API by always returning a base str (the exceptions being format,
format_map, and (r)partition for some reason). I did not mean to suggest
that they should ever call user-supplied ``__getitem__`` code -- I don't
think they need to. I haven't found anyone trying to use ``str`` as a
mixin class/ABC, and it seems that this would be very difficult to do
given that none of its methods currently rely on 
``self.__class__.__getitem__``.

If ``return self[:]`` in the PEP is too closely linked to "must call 
user-supplied ``__getitem__`` methods" for it not to be true, and so you're
suggesting ``return self`` is more faithful, I can understand.

So now if I understand the dilemma up to this point we have:

Benefits of writing ``return self`` in the PEP:
- Makes it clear that the optimization of not copying is allowed
- Makes it clear that ``self.__class__.__getitem__`` isn't used

Benefits of writing ``return self[:]`` in the PEP:
- Makes it clear that returning self is an implementation detail
- For subclasses not overriding ``__getitem__`` (the majority of cases),
  makes it clear that this method will return a base str like the other
  str methods.

Did I miss anything?

All the best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EQVVXMC7XQJSQIHEB7ND2OLWBQLC7QYM/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I appreciate the input and attention to detail!

Using the ``str()`` constructor was sort of what I had thought originally, and 
that's why I had gone overboard with "casting" in one iteration of the sample 
code. When I realized that this isn't quite "casting" and that ``__str__`` can 
be overridden, I went even more overboard and suggested that 
``str.__getitem__(self, ...)`` and ``str.__len__(self)`` could be written, 
which does have the behavior of effectively "casting", but looks nasty. Do you 
think that the following is a happy medium?

def removeprefix(self: str, prefix: str, /) -> str:
    # coerce subclasses to str
    self_str = str(self)
    prefix_str = str(prefix)
    if self_str.startswith(prefix_str):
        return self_str[len(prefix_str):]
    else:
        return self_str

def removesuffix(self: str, suffix: str, /) -> str:
    # coerce subclasses to str
    self_str = str(self)
    suffix_str = str(suffix)
    if suffix_str and self_str.endswith(suffix_str):
        return self_str[:-len(suffix_str)]
    else:
        return self_str

Followed by the text:

If ``type(self) is str`` (rather than a subclass) and if the given affix is 
empty or is not found, then these methods may, but are not required to, make 
the optimization of returning ``self``.
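
For reference, a minimal sketch of why ``str()`` is not quite a cast: a subclass can override ``__str__``, whereas going through ``str.__getitem__`` reads the underlying characters directly. The class name ``Shouty`` and its return value are invented for this example:

```python
class Shouty(str):
    def __str__(self):
        return "something else entirely"


s = Shouty("prefix_body")

# str() dispatches to the overridden __str__ ...
via_constructor = str(s)

# ... while the base-class slot ignores it and returns a plain str copy.
via_base_slot = str.__getitem__(s, slice(None, None))
```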
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/W6DMWMSF22HPKG6MYYCXQ6QE7QIWBNSI/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I was trying to start with the intended behavior of the str class, then
move on to generalizing to other classes, because I think completing a single
example and *then* generalizing is an instructional style that's easier to
digest, whereas intermixing all of the examples at once can get confusing (can I
call str.removeprefix(object(), 17)?). Is something missing that's not already
there in the following sentence in the PEP?

Although the methods on the immutable ``str`` and ``bytes`` types may make the 
aforementioned optimization of returning the original object, 
``bytearray.removeprefix()`` and ``bytearray.removesuffix()`` should always 
return a copy, never the original object.
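
A short sketch of why the copy matters for the mutable type, assuming an interpreter where the methods already exist (Python 3.9 or later):

```python
data = bytearray(b"spam")

# The prefix is not found, so the content is unchanged,
# but the result is still expected to be a fresh object.
unchanged = data.removeprefix(b"eggs")

# Mutating the result must therefore leave the original alone.
unchanged.append(0x21)  # ASCII "!"
```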

Best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IO33NJUQTN27TU342NAJAAMR7YGEPQRE/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
I like how that would take the pressure off of the Python sample. How's 
something like this?

Specification
=============

The builtin ``str`` class will gain two new methods which will behave
as follows when ``type(self) is str``::

def removeprefix(self: str, prefix: str, /) -> str:
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self

def removesuffix(self: str, suffix: str, /) -> str:
    if suffix and self.endswith(suffix):
        return self[:-len(suffix)]
    else:
        return self

These methods, even when called on ``str`` subclasses, should always
return base ``str`` objects.  One should not rely on the behavior
of ``self`` being returned (as in ``s.removesuffix('') is s``) -- this
optimization should be considered an implementation detail.  To test
whether any affixes were removed during the call, one may use the
constant-time behavior of comparing the lengths of the original and
new strings::

>>> string = 'Python String Input'
>>> new_string = string.removeprefix('Py')
>>> modified = (len(string) != len(new_string))
>>> modified
True

One may also continue using ``startswith()`` and ``endswith()``
methods for control flow instead of testing the lengths as above.

Note that without the check for the truthiness of ``suffix``,
``s.removesuffix('')`` would be mishandled and always return the empty
string due to the unintended evaluation of ``self[:-0]``.

Methods with the corresponding semantics will be added to the builtin
``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()``
will accept any bytes-like object as an argument.  Although the methods
on the immutable ``str`` and ``bytes`` types may make the aforementioned
optimization of returning the original object, ``bytearray.removeprefix()``
and ``bytearray.removesuffix()`` should *always* return a copy, never the
original object.

The two methods will also be added to ``collections.UserString``, with
similar behavior.

My hesitation to write "return self" is resolved by saying that it should not 
be relied on, so I think this is a win.
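
The ``self[:-0]`` pitfall mentioned in the draft can be seen in a small sketch. Both helper names below are made up, and they are plain functions rather than methods:

```python
def removesuffix_naive(s, suffix):
    # No truthiness check: for suffix == "" this evaluates s[:-0], i.e. s[:0].
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def removesuffix_checked(s, suffix):
    # The extra "suffix and" guard keeps the empty suffix from reaching s[:-0].
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


naive_empty = removesuffix_naive("spam", "")
checked_empty = removesuffix_checked("spam", "")
checked_normal = removesuffix_checked("spam", "am")
```

With an empty suffix the naive version returns the empty string, while the checked version returns the input unchanged.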

Best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YZD2BTB5RT6DZUTEGHTRNAJZHBMRATPS/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
> > One may also continue using ``startswith()`` and ``endswith()``
> > methods for control flow instead of testing the lengths as above.
>
> That's worse, in a sense, since "foofoobar".removeprefix("foo") returns
> "foobar" which still starts with "foo".

I meant that startswith might be called before removeprefix, as it was 
in the ``deccheck.py`` example.

> If I saw that in a code review I'd flag it for non-obviousness. One should
> use 'string != new_string' unless there is severe pressure to squeeze
> every nanosecond out of this particular code (and it better be inside an
> inner loop).

I thought that someone had suggested that such things go in the PEP, but 
since these are more stylistic considerations, I would be more than happy to
trim it down to just

The builtin ``str`` class will gain two new methods which will behave
as follows when ``type(self) is type(prefix) is str``::

def removeprefix(self: str, prefix: str, /) -> str:
    if self.startswith(prefix):
        return self[len(prefix):]
    else:
        return self[:]

def removesuffix(self: str, suffix: str, /) -> str:
    # suffix='' should not call self[:-0].
    if suffix and self.endswith(suffix):
        return self[:-len(suffix)]
    else:
        return self[:]

These methods, even when called on ``str`` subclasses, should always
return base ``str`` objects.

Methods with the corresponding semantics will be added to the builtin
``bytes`` and ``bytearray`` objects.  If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.removeprefix()`` and ``b.removesuffix()``
will accept any bytes-like object as an argument. The two methods will
also be added to ``collections.UserString``, with similar behavior.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/HQRI26F6UPWL24LJOFFMKNAMYJSC2CAL/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-27 Thread Dennis Sweeney
PEP 616 -- String methods to remove prefixes and suffixes
is available here: https://www.python.org/dev/peps/pep-0616/

Changes:
- Only accept single affixes, not tuples
- Make the specification more concise
- Make fewer stylistic prescriptions for usage
- Fix typos

A reference implementation GitHub PR is up to date here:
https://github.com/python/cpython/pull/18939

Are there any more comments for it before submission?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UJE3WCQXSZI76IW54D2SKKL6OFQ2VFMA/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-03-28 Thread Dennis Sweeney
Sure -- I can add in a short list of those major changes.

Best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/TKCHV76P3CYYSZDSB3TH3I4UTFCUNKU5/


[Python-Dev] Re: PEP 616 -- String methods to remove prefixes and suffixes

2020-04-01 Thread Dennis Sweeney
Hello all,

It seems that most of the discussion has settled down, but I didn't quite 
understand from reading PEP 1 what the next step should be -- is this an 
appropriate time to open an issue on the Steering Council GitHub repository 
requesting pronouncement on PEP 616?

Best,
Dennis
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZXKU3EM6HEG6R7C65L7UN65IGTBB7VHH/


[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip

2020-05-20 Thread Dennis Sweeney
What about "balanced", "uniform", or "rectangular"?
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2XXSYMZELCV5EMAEIDFISLF7RDD6ICE5/


[Python-Dev] Re: Cycles in the __context__ chain

2020-06-15 Thread Dennis Sweeney
Worth noting is that there is an existing loop-breaking mechanism,
but only for the newest exception being raised. In particular, option (4)
is actually the current behavior if the most recent exception
participates in a cycle:

Python 3.9.0b1
>>> A, B, C, D, E = map(Exception, "ABCDE")
>>> A.__context__ = B
>>> B.__context__ = C
>>> C.__context__ = D
>>> D.__context__ = E
>>> try:
...     raise A
... except Exception:
...     raise C
...
Exception: B

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
Exception: A

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 4, in <module>
Exception: C

This cycle-breaking is not due to any magic in ``PyException_SetContext()``,
which is currently a basic one-liner, but instead comes from
``_PyErr_SetObject`` in errors.c, which has something along the lines of:

def _PyErr_SetObject(new_exc):
    top = existing_topmost_exc()

    if top is None:
        # no context
        set_top_exception(new_exc)
        return

    # convert new_exc class to instance if applicable.
    ...

    if top is new_exc:
        # already on top
        return

    e = top
    while True:
        context = e.__context__
        if context is None:
            # no loop
            break
        if context is new_exc:
            # unlink the existing exception
            e.__context__ = None
            break
        e = context

    new_exc.__context__ = top
    set_top_exception(new_exc)

The only trouble comes about when there is a "rho-shaped" linked list,
in which we have a cycle not involving the new exception being raised.
For instance,

Raising A on top of (B -> C -> D -> C -> D -> C -> ...)
results in an infinite loop.

Two possible fixes would be to either (I) use a magical ``__context__``
setter to ensure that there is never a rho-shaped sequence, or (II)
allow arbitrary ``__context__`` graphs and then correctly handle
rho-shaped sequences in ``_PyErr_SetObject`` (i.e. at raise-time).

Fix type (I) could result in surprising things like:

>>> A = Exception()
>>> A.__context__ = A
>>> A.__context__ is None
True

so I propose fix type (II). This PR is such a fix:
https://github.com/python/cpython/pull/20539

It basically extends the existing behavior (4) to the rho-shaped case.

It also prevents the cycle-detecting logic from sitting in two places
(both _PyErr_SetObject and PyException_SetContext) and does not make any
visible functionality more magical. The only Python-visible change
should be that the infinite loop is no longer possible.
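
For intuition, fix type (II) can be sketched in pure Python. This is only a hedged illustration, not the code from the PR: the ``Exc`` class, its ``context`` attribute, and the function name ``link_new_exception`` are stand-ins invented here, and the cycle check uses Floyd's tortoise-and-hare technique as one plausible way to notice a pre-existing loop while walking the chain:

```python
class Exc:
    """Stand-in for an exception object; 'context' plays the role of __context__."""

    def __init__(self, name):
        self.name = name
        self.context = None


def link_new_exception(top, new_exc):
    """Put new_exc on top of the chain starting at 'top' without hanging on cycles."""
    if top is None or top is new_exc:
        return

    e = top
    slow = top              # trails behind 'e' at half speed
    advance_slow = False
    while True:
        context = e.context
        if context is None:
            break           # reached the end of the chain: no loop
        if context is new_exc:
            e.context = None  # unlink new_exc so that no new cycle is created
            break
        e = context
        if e is slow:
            break           # pre-existing cycle that does not involve new_exc
        if advance_slow:
            slow = slow.context
        advance_slow = not advance_slow

    new_exc.context = top


# Rho-shaped chain: B -> C -> D -> C -> D -> ...
A, B, C, D = (Exc(name) for name in "ABCD")
B.context = C
C.context = D
D.context = C
link_new_exception(B, A)   # terminates instead of looping forever

# New exception already in the chain: X -> Y -> Z, then Z is raised on top of X.
X, Y, Z = (Exc(name) for name in "XYZ")
X.context = Y
Y.context = Z
link_new_exception(X, Z)
```

In the first case the existing rho-shaped chain is left untouched and ``A`` is simply placed on top of it; in the second, ``Z`` is unlinked from ``Y`` before being placed on top of ``X``, mirroring the existing behavior (4).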
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/R5J5JVUJX3V4DBKVLUI2SUBRD3TRF6PV/


[Python-Dev] Re: Changing Python's string search algorithms

2020-10-15 Thread Dennis Sweeney
Here's my attempt at some heuristic motivation:

Try to construct a needle that will perform as poorly as possible when
using the naive two-nested-for-loops algorithm. You'll find that if
there isn't some sort of vague periodicity in your needle, then you
won't ever get *that* unlucky; each particular alignment will fail
early, and if it doesn't then some future alignment would be pigeonholed 
to fail early.
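
As a rough illustration of that worst case, here is a sketch of the naive two-nested-for-loops search with a comparison counter. The function name ``naive_find`` and the particular needles are made up for this example; the point is only that a needle resembling the haystack's repetition fails late at every alignment, while one that differs in its first character fails immediately:

```python
def naive_find(haystack, needle):
    """Return (index, comparisons) for a naive nested-loop substring search."""
    comparisons = 0
    n, m = len(haystack), len(needle)
    for i in range(n - m + 1):          # every alignment of the needle
        for j in range(m):              # compare character by character
            comparisons += 1
            if haystack[i + j] != needle[j]:
                break
        else:
            return i, comparisons       # inner loop finished: full match
    return -1, comparisons


haystack = "A" * 1000

# Needle that mimics the haystack: each alignment fails only at the last character.
_, bad = naive_find(haystack, "A" * 20 + "B")

# Same characters, but the mismatch comes first: each alignment fails at once.
_, good = naive_find(haystack, "B" + "A" * 20)
```

Here there are 980 alignments either way, but the first needle costs 21 comparisons per alignment and the second only one.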

So Crochemore and Perrin's algorithm explicitly handles this "worst case"
of periodic strings. Once we've identified in the haystack some period
from the needle, there's no need to re-match it. We can keep a memory
of how many periods we currently remember matching up, and never re-match
them. This is what gives the O(n) behavior for periodic strings.

But wait! There are some bad needles that aren't quite periodic.
For instance:

>>> 'ABCABCAABCABC' in 'ABC'*1_000_000

The key insight though is that the worst strings are still
"periodic enough", and if we have two different patterns going on,
then we can intentionally split them apart. For example,
`"xyxyxyxyabcabc" --> "xyxyxyxy" + "abcabc"`. I believe the goal is to
line it up so that if the right half matches but not the left then we
can be sure to skip somewhat far ahead. This might not correspond
exactly with splitting up two patterns. This is glossing over some
details that I'm admittedly still a little hazy on as well, but
hopefully that gives at least a nudge of intuition.
___
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/MXMS5XIV6WJFFRHTH7TBHAO3TC4QIHBZ/