[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Antoine Pitrou
On Wed, 10 Nov 2021 21:12:17 -0600
Tim Peters  wrote:
> [Bob Fang ]
> > This is a modest proposal to consider having sorted containers
> > (http://www.grantjenks.com/docs/sortedcontainers/) in standard library.  
> 
> +1 from me, but if and only if Grant Jenks (its author) wants that too.
> 
> It's first-rate code in all respects, including that it's a fine
> example _of_ Python programming (it's not written in C - in Python).

Agreed with Tim.  This is a perfect example of some basic and perennial
facility that would fit very well in the stdlib.

Regards

Antoine.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/WJ4ANAMAYPVRHJNXMIASNAAWAZ45GPSZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Paul Moore
On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  wrote:
>
> On Wed, 10 Nov 2021 21:12:17 -0600
> Tim Peters  wrote:
> > [Bob Fang ]
> > > This is a modest proposal to consider having sorted containers
> > > (http://www.grantjenks.com/docs/sortedcontainers/) in standard library.
> >
> > +1 from me, but if and only if Grant Jenks (its author) wants that too.
> >
> > It's first-rate code in all respects, including that it's a fine
> > example _of_ Python programming (it's not written in C - in Python).
>
> Agreed with Tim.  This is a perfect example of some basic and perennial
> facility that would fit very well in the stdlib.

I agree as well. Is anyone interested enough to ask the library author
if he supports doing this? That seems to be the main unanswered
question here.

But if anyone wants to argue the "the stdlib should be shrinking, not
growing" position, I suggest they do so *before* someone reaches out
to the module author. No point in us making the suggestion and then
being forced to withdraw it.

Paul
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/5SURNB4C5FGJ6LSXUPVW2EFP22ERKSGB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Remove asyncore, asynchat and smtpd modules

2021-11-11 Thread Victor Stinner
Hi,

The asyncore module is a very old module of the Python stdlib for
asynchronous programming, usually to handle network sockets
concurrently. It's a common event loop, but its design has many flaws.

The asyncio module was added to Python 3.4 with a well designed
architecture. Twisted developers, who have like 10 to 20 years of
experience in asynchronous programming, helped to design the asyncio
API. By design, asyncio doesn't have flaws which would be really hard
to fix in asyncore and asynchat.

It was decided to start deprecating the asyncore, asynchat and smtpd
modules in Python 3.6 released in 2016, 5 years ago. Python 3.10 emits
DeprecationWarning. asynchat and smtpd are implemented with asyncore.
Open issues in asyncore, asynchat and smtpd have been closed as "wont
fix" because these modules are deprecated. These modules are basically
no longer maintained.

I propose to remove asyncore, aynchat and smtpd in Python 3.11 to
reduce the Python maintenance burden, while asyncio remains available
in stdlib and is maintained:

* asyncore and asynchat can be replaced with asyncio
* smtpd can be replaced with aiosmtpd which is based on asyncio:
https://aiosmtpd.readthedocs.io/

If someone wants to continue using asyncore, asynchat or smtpd, it's
trivial to copy Python 3.10 asyncore.py, asynchat.py and smtpd.py to
their project, and maintain these files there. Someone is also free to
continue maintaining these modules as third-party projects on PyPI.

The removal is discussed at:
https://bugs.python.org/issue28533

I wrote a PR to remove the 3 modules:
https://github.com/python/cpython/pull/29521


... in short, the intent is to move the asyncore, asynchat and smtpd
maintenance outside the Pyhon project ;-) (if anyone still use them)


Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LZOOLX5EKOITW55TW7JQYKLXJUPCAJB4/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Remove asyncore, asynchat and smtpd modules

2021-11-11 Thread Guido van Rossum
Yes please. Long overdue.

On Thu, Nov 11, 2021 at 4:38 AM Victor Stinner  wrote:

> Hi,
>
> The asyncore module is a very old module of the Python stdlib for
> asynchronous programming, usually to handle network sockets
> concurrently. It's a common event loop, but its design has many flaws.
>
> The asyncio module was added to Python 3.4 with a well designed
> architecture. Twisted developers, who have like 10 to 20 years of
> experience in asynchronous programming, helped to design the asyncio
> API. By design, asyncio doesn't have flaws which would be really hard
> to fix in asyncore and asynchat.
>
> It was decided to start deprecating the asyncore, asynchat and smtpd
> modules in Python 3.6 released in 2016, 5 years ago. Python 3.10 emits
> DeprecationWarning. asynchat and smtpd are implemented with asyncore.
> Open issues in asyncore, asynchat and smtpd have been closed as "wont
> fix" because these modules are deprecated. These modules are basically
> no longer maintained.
>
> I propose to remove asyncore, aynchat and smtpd in Python 3.11 to
> reduce the Python maintenance burden, while asyncio remains available
> in stdlib and is maintained:
>
> * asyncore and asynchat can be replaced with asyncio
> * smtpd can be replaced with aiosmtpd which is based on asyncio:
> https://aiosmtpd.readthedocs.io/
>
> If someone wants to continue using asyncore, asynchat or smtpd, it's
> trivial to copy Python 3.10 asyncore.py, asynchat.py and smtpd.py to
> their project, and maintain these files there. Someone is also free to
> continue maintaining these modules as third-party projects on PyPI.
>
> The removal is discussed at:
> https://bugs.python.org/issue28533
>
> I wrote a PR to remove the 3 modules:
> https://github.com/python/cpython/pull/29521
>
>
> ... in short, the intent is to move the asyncore, asynchat and smtpd
> maintenance outside the Pyhon project ;-) (if anyone still use them)
>
>
> Victor
> --
> Night gathers, and now my watch begins. It shall not end until my death.
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/LZOOLX5EKOITW55TW7JQYKLXJUPCAJB4/
> Code of Conduct: http://python.org/psf/codeofconduct/
>


-- 
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KLIUO6O6XSN7QOVZFQSLQMX5YJMSPMLH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Christopher Barker
Earlier in the thread, we were pointed to multiple implementations.

Is this particular one clearly the “best”[*]?

If so, then sure.

-CHB

[*] best meaning “most appropriate for the stdlib”. A couple folks have
already pointed to the quality of the code. But my understanding is that
different algorithms are more or less appropriate for different use cases.
So is this one fairly “universal”?



On Thu, Nov 11, 2021 at 4:31 AM Paul Moore  wrote:

> On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  wrote:
> >
> > On Wed, 10 Nov 2021 21:12:17 -0600
> > Tim Peters  wrote:
> > > [Bob Fang ]
> > > > This is a modest proposal to consider having sorted containers
> > > > (http://www.grantjenks.com/docs/sortedcontainers/) in standard
> library.
> > >
> > > +1 from me, but if and only if Grant Jenks (its author) wants that too.
> > >
> > > It's first-rate code in all respects, including that it's a fine
> > > example _of_ Python programming (it's not written in C - in Python).
> >
> > Agreed with Tim.  This is a perfect example of some basic and perennial
> > facility that would fit very well in the stdlib.
>
> I agree as well. Is anyone interested enough to ask the library author
> if he supports doing this? That seems to be the main unanswered
> question here.
>
> But if anyone wants to argue the "the stdlib should be shrinking, not
> growing" position, I suggest they do so *before* someone reaches out
> to the module author. No point in us making the suggestion and then
> being forced to withdraw it.
>
> Paul
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/5SURNB4C5FGJ6LSXUPVW2EFP22ERKSGB/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
-- 
Christopher Barker, PhD (Chris)

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/35YUCU4UZET6NDHCI3N2AY5YQSOX6DTF/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Daniel Pope
Serhiy Storchaka wrote:
> From my C++ and Java experience, hashtable-based containers are much
> more useful than tree-based containers. They are more compact and fast.
> In most cases the only reason of using sorted container is supporting
> deterministic iteration order, but often it is enough to sort data only
> for output. 

The concern is not constant speed factor but asymptotic complexity.
Inserting or deleting in a tree-based sorted collection is O(log n),
but lists are O(n): the bisect is O(log n) but moving trailing list elements
is O(n). It's memmove fast but dominates as n gets large.

There are many algorithms that are hard to implement as fast as they
can be without a tree. Something like rolling median would be a good
test case, e.g. "for a list N containing numbers, produce a list M of
length len(N) - 1 where M[i] == statistics.median(N[:i+1])". I think the
best you can do with the standard library and without writing a
balanced binary tree from scratch is O(n²) (if you can beat this I'd like
to know how!). With a tree-based sorted list it's O(n log n).

Dicts and sets cannot maintain an arbitrary order, except that
OrderedDict.move_to_end() very occasionally turns out to be all you
need (and even then, I find it tricky to reason about and apply
correctly in an algorithm: it requires you to maintain an invariant,
where SortedList maintains the sorted invariant for you).

> There is no such need
> of tree-based implementation. It is not needed in Python core, and is
> not needed for most users. So it is good to keep it in third-party
> libraries. Maintaining it in the stdlib has a cost and the benefit/cost
> value would be low.

The situation where this bites my firm the most is in interviews. This is
exactly the situation where we're asking candidates to demonstrate
their CS knowledge and produce an algorithm that has low asymptotic
complexity.

We use various services including HackerRank, which provide a
sandboxed Python environment that cannot use PyPI. There are quite
a few services in this kind of Educational+Challenge space including
some that execute Python in the browser using Brython etc.

My team believes that candidates with a Java background find the
challenge easier than candidates with a Python background, because
they just use NavigableSet and move on. I find that kind of embarrassing.
It requires us to apologise for Python's deficiency and fudge how we
score the question, which makes it hard to know we're providing a level
playing field.

(I would argue that we should not ask those kinds of questions or not
use these sandboxed interview environments but institutional friction
makes that magnitude of change very difficult.)

Putting something in the standard library means that it will, in due
course, appear in HackerRank and other services we use, and we can
expect candidates to be familiar with it and incorporate it in their
solutions.
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VTOYBWO5OWXID4OIH5FWCKWSWRDXKSTB/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Richard Levasseur
On Thu, Nov 11, 2021 at 4:30 AM Paul Moore  wrote:

> On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  wrote:
> >
> > On Wed, 10 Nov 2021 21:12:17 -0600
> > Tim Peters  wrote:
> > > [Bob Fang ]
> > > > This is a modest proposal to consider having sorted containers
> > > > (http://www.grantjenks.com/docs/sortedcontainers/) in standard
> library.
> > >
> > > +1 from me, but if and only if Grant Jenks (its author) wants that too.
> > >
> > > It's first-rate code in all respects, including that it's a fine
> > > example _of_ Python programming (it's not written in C - in Python).
> >
> > Agreed with Tim.  This is a perfect example of some basic and perennial
> > facility that would fit very well in the stdlib.
>
> I agree as well. Is anyone interested enough to ask the library author
> if he supports doing this? That seems to be the main unanswered
> question here.
>
> But if anyone wants to argue the "the stdlib should be shrinking, not
> growing" position, I suggest they do so *before* someone reaches out
> to the module author. No point in us making the suggestion and then
> being forced to withdraw it.
>

A couple 2 cents and IMHO.

-
The tldr is not so much "don't put it in the standard lib", but that the
stdlib would be better having particular implementations instead of a
one-size-fits-all e.g. SortedList.

Should the stdlib have e.g. SortedList? Probably not, because the use cases
of such data types are too niche to a one-size-fits-all implementation, and
there are too many implementations with too many of their own settings.
Should it have e.g. BTree, RedBlackTree, SortedArrayList, etc? Probably so,
because those are somewhat fundamental data structures and implementing,
testing, and verifying them is very much non-trivial. While niche, having
them at your immediate disposal is very powerful.
-

Last year, for fun, after wishing there was a SortedSet in the standard
lib, I ended up implementing a Red-Black Tree and BTree based sorted
dictionary/set[1]. After then trying to use them for my use case[2], I
found that, in order to fully and truly exploit their benefits, the basic
Sequence/Collection/Set/Dict APIs didn't really suffice. I needed APIs that
would let me, e.g. binary search to a particular spot and then iterate, or
give me a range between two points, etc. Such apis tend to be pretty
specialized to exploit their under data structure (which isn't to say an
abstract API can't be created, see Java's NavigableMap, but that level of
abstraction deserves careful consideration).

In the end, it was a fun exercise, but in practice a dictionary and
sorted() got me 90% of the way there and sufficed. Optimizing that last 10%
wasn't worth the effort.

Anyways, I came to two particular, IMHO, conclusions:
1. Sorted collections are very niche. I *could* have used one, but probably
would have spent as much time fiddling with them as using a dict/sorted().
2. If you're in a niche use case, especially for performance, then
abstractions aren't really helpful. Using e.g. a BTree and knowing the
particulars of the implementation are going to be more helpful than having
an abstract SortedList and working out how to use its special APIs for
particular actions.

This basically echos Christopher Baker's sentiment of "whats the use case"
and "why doesn't a sorted() call at the end suffice" to some extent. IME,
Python's sort algorithm is just really good, so it's hard to beat simply
calling sorted() when you're done processing in both runtime performance
and effort-spent-optimizing. I struggle to think of or remember a
particular case where dropping in a sorted collection would be a clear and
obvious win. Maybe a more memory constrained case where sorted() creating a
new list is too much? IDK.

[1] As an aside, RB didn't perform nearly as well as sortedcontainers for
the majority of cases, but the naive BTree was about on par. IIRC,
sortedcollections is basically a specialized BTree.
[2] A wish that came about because I had sorted sets of exact/prefix
strings of 1,000 to 10,000 elements and needed them to interact in various
ways and wanted to preserve their sortedness.

-

re: "Is this particular one clearly the “best”[*]?"

Performance wise, it's probably about the best insofar as a BTree of
depth=2 and high fanout (10,000 or something? I forget what
sortedcontainers defaults to) is. In my limited experience, it's docs and
claims about its performance were pretty accurate[3].

API wise, it has its share of idiosyncrasies like anything else. e.g.
SortedList claims to implement MutableSequence, but actually raises errors
when calling e.g. append(), which is a bit misleading / wrong to some
extent (to be fair, given MutableSequence's api contract, it's not like
there's particularly clear/right/good choice one can make here).

(disclaimer: i've minimal experience with sortedcontainers; when I looked
at it, it was really only for comparing my own toy BTree/RBT
behavior/performance with a known implementation)

[3

[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Antoine Pitrou
On Thu, 11 Nov 2021 11:01:32 -0800
Richard Levasseur  wrote:
> 
> In the end, it was a fun exercise, but in practice a dictionary and
> sorted() got me 90% of the way there and sufficed. Optimizing that last 10%
> wasn't worth the effort.
> 
> Anyways, I came to two particular, IMHO, conclusions:
> 1. Sorted collections are very niche. I *could* have used one, but probably
> would have spent as much time fiddling with them as using a dict/sorted().

So the fact that sorted collections did not help for one single use
case leads you to conclude that "sorted collections are very niche"? 
I don't find this very convincing.

Unfortunately, PyPI doesn't seem to offer a way to query the reverse
dependencies of a package, otherwise we could know how many packages
depend on the "sortedcontainers" package.

[...]

> 2. If you're in a niche use case, especially for performance, then
> abstractions aren't really helpful. Using e.g. a BTree and knowing the
> particulars of the implementation are going to be more helpful than having
> an abstract SortedList and working out how to use its special APIs for
> particular actions.

You don't have "an abstract SortedList", you have a concrete SortedList
with a well-defined implementation with stable performance
characteristics.  So, again, this seems to be a red herring.

> This basically echos Christopher Baker's sentiment of "whats the use case"
> and "why doesn't a sorted() call at the end suffice" to some extent.

The use case is obviously when you have to *maintain* sorting at all
times.  Yes, there are such situations.  Some rather high-profile
Python packages, such as dask/distributed, rely on this for rather
elaborate algorithmic work (dask/distributed is a distributed task
scheduler, you may also think of it as a replacement for Spark).


I won't go over your performance considerations, but I'll just mention
that the "sortedcontainers" documentation has not one but *several*
sections exploring performance characteristics in several dimensions.
It is extremely rare for Python packages to provide such a detailed
insight on the matter (even the built-in `dict` is much less thoroughly
characterised).
http://www.grantjenks.com/docs/sortedcontainers/#user-guide

Regards

Antoine.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EHZGBXSIXZE6ZDBGJPESDQQG4PCNOMLW/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Andrew Svetlov
sortedcontainers looks great!
I very much appreciate it if such well-done library have type hints
(embedded or at least in form of types-sortedcontainers pypi package).

>From my experience of multidict library support, it is not a super-hard
challenge but something that needs attention and time resource from the
library maintainers or third-party volunteers.

On Thu, Nov 11, 2021 at 9:06 PM Richard Levasseur 
wrote:

>
>
> On Thu, Nov 11, 2021 at 4:30 AM Paul Moore  wrote:
>
>> On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  wrote:
>> >
>> > On Wed, 10 Nov 2021 21:12:17 -0600
>> > Tim Peters  wrote:
>> > > [Bob Fang ]
>> > > > This is a modest proposal to consider having sorted containers
>> > > > (http://www.grantjenks.com/docs/sortedcontainers/) in standard
>> library.
>> > >
>> > > +1 from me, but if and only if Grant Jenks (its author) wants that
>> too.
>> > >
>> > > It's first-rate code in all respects, including that it's a fine
>> > > example _of_ Python programming (it's not written in C - in Python).
>> >
>> > Agreed with Tim.  This is a perfect example of some basic and perennial
>> > facility that would fit very well in the stdlib.
>>
>> I agree as well. Is anyone interested enough to ask the library author
>> if he supports doing this? That seems to be the main unanswered
>> question here.
>>
>> But if anyone wants to argue the "the stdlib should be shrinking, not
>> growing" position, I suggest they do so *before* someone reaches out
>> to the module author. No point in us making the suggestion and then
>> being forced to withdraw it.
>>
>
> A couple 2 cents and IMHO.
>
> -
> The tldr is not so much "don't put it in the standard lib", but that the
> stdlib would be better having particular implementations instead of a
> one-size-fits-all e.g. SortedList.
>
> Should the stdlib have e.g. SortedList? Probably not, because the use
> cases of such data types are too niche to a one-size-fits-all
> implementation, and there are too many implementations with too many of
> their own settings.
> Should it have e.g. BTree, RedBlackTree, SortedArrayList, etc? Probably
> so, because those are somewhat fundamental data structures and
> implementing, testing, and verifying them is very much non-trivial. While
> niche, having them at your immediate disposal is very powerful.
> -
>
> Last year, for fun, after wishing there was a SortedSet in the standard
> lib, I ended up implementing a Red-Black Tree and BTree based sorted
> dictionary/set[1]. After then trying to use them for my use case[2], I
> found that, in order to fully and truly exploit their benefits, the basic
> Sequence/Collection/Set/Dict APIs didn't really suffice. I needed APIs that
> would let me, e.g. binary search to a particular spot and then iterate, or
> give me a range between two points, etc. Such apis tend to be pretty
> specialized to exploit their under data structure (which isn't to say an
> abstract API can't be created, see Java's NavigableMap, but that level of
> abstraction deserves careful consideration).
>
> In the end, it was a fun exercise, but in practice a dictionary and
> sorted() got me 90% of the way there and sufficed. Optimizing that last 10%
> wasn't worth the effort.
>
> Anyways, I came to two particular, IMHO, conclusions:
> 1. Sorted collections are very niche. I *could* have used one, but
> probably would have spent as much time fiddling with them as using a
> dict/sorted().
> 2. If you're in a niche use case, especially for performance, then
> abstractions aren't really helpful. Using e.g. a BTree and knowing the
> particulars of the implementation are going to be more helpful than having
> an abstract SortedList and working out how to use its special APIs for
> particular actions.
>
> This basically echos Christopher Baker's sentiment of "whats the use case"
> and "why doesn't a sorted() call at the end suffice" to some extent. IME,
> Python's sort algorithm is just really good, so it's hard to beat simply
> calling sorted() when you're done processing in both runtime performance
> and effort-spent-optimizing. I struggle to think of or remember a
> particular case where dropping in a sorted collection would be a clear and
> obvious win. Maybe a more memory constrained case where sorted() creating a
> new list is too much? IDK.
>
> [1] As an aside, RB didn't perform nearly as well as sortedcontainers for
> the majority of cases, but the naive BTree was about on par. IIRC,
> sortedcollections is basically a specialized BTree.
> [2] A wish that came about because I had sorted sets of exact/prefix
> strings of 1,000 to 10,000 elements and needed them to interact in various
> ways and wanted to preserve their sortedness.
>
> -
>
> re: "Is this particular one clearly the “best”[*]?"
>
> Performance wise, it's probably about the best insofar as a BTree of
> depth=2 and high fanout (10,000 or something? I forget what
> sortedcontainers defaults to) is. In my limited experience, it's docs and
> claims about it

[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Bob Fang
> But if anyone wants to argue the "the stdlib should be shrinking, not
> growing" position, I suggest they do so *before* someone reaches out
> to the module author. No point in us making the suggestion and then
> being forced to withdraw it.

So I suppose we can take a poll on this? If majority of us agree then I am 
happy to reach out to the author since I am the one who started this thread.

Thanks,
Bob

> On 11 Nov 2021, at 12:28, Paul Moore  wrote:
> 
> On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  > wrote:
>> 
>> On Wed, 10 Nov 2021 21:12:17 -0600
>> Tim Peters  wrote:
>>> [Bob Fang ]
 This is a modest proposal to consider having sorted containers
 (http://www.grantjenks.com/docs/sortedcontainers/) in standard library.
>>> 
>>> +1 from me, but if and only if Grant Jenks (its author) wants that too.
>>> 
>>> It's first-rate code in all respects, including that it's a fine
>>> example _of_ Python programming (it's not written in C - in Python).
>> 
>> Agreed with Tim.  This is a perfect example of some basic and perennial
>> facility that would fit very well in the stdlib.
> 
> I agree as well. Is anyone interested enough to ask the library author
> if he supports doing this? That seems to be the main unanswered
> question here.
> 
> But if anyone wants to argue the "the stdlib should be shrinking, not
> growing" position, I suggest they do so *before* someone reaches out
> to the module author. No point in us making the suggestion and then
> being forced to withdraw it.
> 
> Paul
> ___
> Python-Dev mailing list -- python-dev@python.org 
> 
> To unsubscribe send an email to python-dev-le...@python.org 
> 
> https://mail.python.org/mailman3/lists/python-dev.python.org/ 
> 
> Message archived at 
> https://mail.python.org/archives/list/python-dev@python.org/message/5SURNB4C5FGJ6LSXUPVW2EFP22ERKSGB/
>  
> 
> Code of Conduct: http://python.org/psf/codeofconduct/ 
> 
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UUAFSF2JMWVVMUXV2QGXGJOFPPQJSFMH/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Alex Waygood
I'll join Christopher Barker and Chris Angelico in voicing scepticism --
personally, I feel like I'm yet to be persuaded that the use case is strong
enough. sortedcontainers is a wonderful package, and I would completely
support a prominent link to the library in the Python documentation. But
the use case seems too niche and specific, in my opinion, for the library
to be added to the standard library. I see the standard library as a
collection of basic building blocks that can be used to create a multitude
of other things. sortedcontainers feel like a specialised solution to a
small collection of problems, rather than a general solution to a large
collection of problems. PyPI feels like the right place for this library.

Best,
Alex

On Thu, Nov 11, 2021 at 12:33 PM Paul Moore  wrote:

> On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  wrote:
> >
> > On Wed, 10 Nov 2021 21:12:17 -0600
> > Tim Peters  wrote:
> > > [Bob Fang ]
> > > > This is a modest proposal to consider having sorted containers
> > > > (http://www.grantjenks.com/docs/sortedcontainers/) in standard
> library.
> > >
> > > +1 from me, but if and only if Grant Jenks (its author) wants that too.
> > >
> > > It's first-rate code in all respects, including that it's a fine
> > > example _of_ Python programming (it's not written in C - in Python).
> >
> > Agreed with Tim.  This is a perfect example of some basic and perennial
> > facility that would fit very well in the stdlib.
>
> I agree as well. Is anyone interested enough to ask the library author
> if he supports doing this? That seems to be the main unanswered
> question here.
>
> But if anyone wants to argue the "the stdlib should be shrinking, not
> growing" position, I suggest they do so *before* someone reaches out
> to the module author. No point in us making the suggestion and then
> being forced to withdraw it.
>
> Paul
> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/5SURNB4C5FGJ6LSXUPVW2EFP22ERKSGB/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EM3CCN2YNKVUUQIGNXE5XIIR5TJKSTCZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Bob Fang
Just want to mention that we do have this nice website to look at, although I 
am not sure how up to date it is:

https://www.wheelodex.org/projects/sortedcontainers/rdepends/?page=1

> On 11 Nov 2021, at 19:20, Antoine Pitrou  wrote:
> 
> Unfortunately, PyPI doesn't seem to offer a way to query the reverse
> dependencies of a package, otherwise we could know how many packages
> depend on the "sortedcontainers" package.

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/44V6343BQQJQJP4CCFBIZAI5JQOM3XB7/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Antoine Pitrou
On Thu, 11 Nov 2021 20:58:29 +
Bob Fang  wrote:
> Just want to mention that we do have this nice website to look at, although I 
> am not sure how up to date it is:
> 
> https://www.wheelodex.org/projects/sortedcontainers/rdepends/?page=1

Wow, thank you, that is very nice!

Best regards

Antoine.


___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2YSXRGLMRFWGACMATQFIKX3PFWPP5P3W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Antoine Pitrou
On Thu, 11 Nov 2021 12:39:50 +
Bob Fang  wrote:
> > But if anyone wants to argue the "the stdlib should be shrinking, not
> > growing" position, I suggest they do so *before* someone reaches out
> > to the module author. No point in us making the suggestion and then
> > being forced to withdraw it.  
> 
> So I suppose we can take a poll on this? If majority of us agree then I am 
> happy to reach out to the author since I am the one who started this thread.

Decisions are not made through voting, so that would be informative,
but little more :-)

I think it's better that you contact the author upfront, asking for a
provisional agreement about stdlib inclusion.  If he's not interested,
then no need to pursue it further.  If he's interested, then you can
start working on a more formal proposal (for example a PEP).

Regards

Antoine.



> 
> Thanks,
> Bob
> 
> > On 11 Nov 2021, at 12:28, Paul Moore  wrote:
> > 
> > On Thu, 11 Nov 2021 at 11:51, Antoine Pitrou  > > wrote:  
> >> 
> >> On Wed, 10 Nov 2021 21:12:17 -0600
> >> Tim Peters  wrote:  
> >>> [Bob Fang ]  
>  This is a modest proposal to consider having sorted containers
>  (http://www.grantjenks.com/docs/sortedcontainers/) in standard library.  
> >>> 
> >>> +1 from me, but if and only if Grant Jenks (its author) wants that too.
> >>> 
> >>> It's first-rate code in all respects, including that it's a fine
> >>> example _of_ Python programming (it's not written in C - in Python).  
> >> 
> >> Agreed with Tim.  This is a perfect example of some basic and perennial
> >> facility that would fit very well in the stdlib.  
> > 
> > I agree as well. Is anyone interested enough to ask the library author
> > if he supports doing this? That seems to be the main unanswered
> > question here.
> > 
> > But if anyone wants to argue the "the stdlib should be shrinking, not
> > growing" position, I suggest they do so *before* someone reaches out
> > to the module author. No point in us making the suggestion and then
> > being forced to withdraw it.
> > 
> > Paul
> > ___
> > Python-Dev mailing list -- python-dev@python.org 
> > 
> > To unsubscribe send an email to python-dev-le...@python.org 
> > 
> > https://mail.python.org/mailman3/lists/python-dev.python.org/ 
> > 
> > Message archived at 
> > https://mail.python.org/archives/list/python-dev@python.org/message/5SURNB4C5FGJ6LSXUPVW2EFP22ERKSGB/
> >  
> > 
> > Code of Conduct: http://python.org/psf/codeofconduct/ 
> >   
> 



___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/3UDYPFVMENHMLAZETGDVSUBAO2BY3C6W/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: containment and the empty container

2021-11-11 Thread Ethan Furman

On 11/9/21 9:02 AM, Guido van Rossum wrote:
> On Mon, Nov 8, 2021 at 10:29 PM Ethan Furman wrote:

>> The way I see it, the following should hold
>>
>>   empty_flag = RegexFlag(0)
>>   any_case = RegexFlag.IGNORECASE
>>   any_case_on_any_line = RegexFlag.IGNORECASE | RegexFlag.MULTILINE
>>
>>   any_case in empty_flag is False
>>   any_case_on_any_line in empty_flag is False
>>
>>   empty_flag in any_case is False
>>   empty_flag in any_case_on_any_line is False
>>
>
> The latter two defy all logic. Please don't. Your 'in' operator clearly means "is 
a subset of", and the empty set
> emphatically is a subset of all sets (this is the most basic mainstream set 
theory you can think of).

Thank you everyone for your feedback.  As is probably evident, my degree is not is CS nor maths, so I greatly appreciate 
your thoughts.


RegexFlags(0) in any_regex_flags_whatsoever will continue to be True.

--
~Ethan~
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/VTDPADRF7ZGY6OFNC3ZA5NLU7IQTICOZ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Tim Peters
[Christopher Barker ]
> Earlier in the thread, we were pointed to multiple implementations.
>
> Is this particular one clearly the “best”[*]?
>
> If so, then sure.
>
> -CHB
>
> [*] best meaning “most appropriate for the stdlib”. A couple folks have
> already pointed to the quality of the code. But my understanding is that
> different algorithms are more or less appropriate for different use
> cases. So is this one fairly “universal”?

As noted earlier, SortedContainers is downloaded from PyPI hundreds of
thousands of times every day, and that's been so for a long time..
Best I can tell, no similar package is in the same universe as that.

The author identified over a dozen potential competitors here[1], but
most didn't look like serious projects to him. He ran extensive timing
tests against the rest, reported there too. HIs package usually wins,
and is at worst competitive. It usually wins on memory burden too.

Some of these things are people intrigued by "a data structure" they
read about and implement for fun and self-education. That sometimes
shines through in data-structure-specific APIs. While the
SortedContainers extensive docs explain how things are implemented (if
you look in the right spots for that), the API is remarkably agnostic
(& useful)). For example, this is a method of SortedList:

irange(minimum=None, maximum=None, inclusive=(True, True), reverse=False)

which returns an iterator over a contiguous sorted range of values
(neither, either, or both of whose endpoints may be included,
depending on what you pass for `inclusive` This "makes sense"
regardless of how the structure may be implemented. Of course
SortedSet and SortedDict support `irange()` too.

[1] http://www.grantjenks.com/docs/sortedcontainers/performance.html
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/FD3DSXZUBDUC3PFQH2SS47NIZ36TTOL6/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Richard Levasseur
On Thu, Nov 11, 2021 at 11:22 AM Antoine Pitrou  wrote:

> On Thu, 11 Nov 2021 11:01:32 -0800
> Richard Levasseur  wrote:
> >
> > In the end, it was a fun exercise, but in practice a dictionary and
> > sorted() got me 90% of the way there and sufficed. Optimizing that last
> 10%
> > wasn't worth the effort.
> >
> > Anyways, I came to two particular, IMHO, conclusions:
> > 1. Sorted collections are very niche. I *could* have used one, but
> probably
> > would have spent as much time fiddling with them as using a
> dict/sorted().
>
> So the fact that sorted collections did not help for one single use
> case leads you to conclude that "sorted collections are very niche"?
> I don't find this very convincing.
>

It was just an example demonstrating that, despite all the apparent "things
need to be kept sorted"-ness of the problem, the strict requirement that
data be fully sorted at every step of the code wasn't actually present.
That sorting at the end is quite sufficient. It just had the mere
appearance of being a requirement. This example is not unlike the exchange
Paul Bryan and Christopher Baker had upthread where Paul gave an example,
and Christopher pointed out sorting at the end was sufficient.

And no, that I didn't choose[1] to use a sorted collection for that
particular case didn't make me conclude they were very niche. Reflecting on
my experience coding and reviewing code over, eh, 20 years? Or w/e. lead me
to realize that very rarely have I seen, needed, or recommended using a
sorted collection. In my experience, the order of elements in a collection
mattering is the exception, not the rule. Hence, I conclude they're niche.

[1] As my example described, a sorted collection would have been able to
realize some optimal performance characteristic, but eking out that last
bit wasn't worth the effort.


>
> Unfortunately, PyPI doesn't seem to offer a way to query the reverse
> dependencies of a package, otherwise we could know how many packages
> depend on the "sortedcontainers" package.
>
>
Querying my employer's code base, the number of hits for sortedcontainers
isn't particularly impressive. It's not so low to be dismissed, but it's
also not so high as to be anywhere near unequivacable. Forgive me for not
sharing the actual numbers; I'm just not really sure on what sort of code
base statistics and numbers I'm permitted to share under what circumstances
etc.


> [...]
>
> > 2. If you're in a niche use case, especially for performance, then
> > abstractions aren't really helpful. Using e.g. a BTree and knowing the
> > particulars of the implementation are going to be more helpful than
> having
> > an abstract SortedList and working out how to use its special APIs for
> > particular actions.
>
> You don't have "an abstract SortedList", you have a concrete SortedList
> with a well-defined implementation with stable performance
> characteristics.  So, again, this seems to be a red herring.
>

My point is that there are many ways to implement a collection that
maintains its elements in sorted order that has well defined performance.
One can pick one and accept the pros/cons of the particular underlying
algorithm. But it seems misleading to call that *the* SortedList
implementation. Hence why I say "abstract SortedList".

To emphasize this point a bit, let's presume for the moment Python chose to
have a data type named SortedList, and it was, effectively, the
sortedcontainers implementation. What does one do with the "load factor"
setting that sortedcontainers.SortedList has? This arg is, essentially, a
BTree implementation detail leaking out. IIRC, it controls the fan out of
the underlying BTree (because one size doesn't always fit all), which is
one of the biggest determiners of the overall performance. If such a
parameter is exposed, then it's no longer really a generic "sorted list",
it's a "sorted list as implemented using a btree". So might as well call it
e.g. BTreeList or some such instead.

If such a parameter *isn't* exposed, then a one-size-fits-all value must be
chosen[2] -- I'm skeptical that's really tenable because, if a sorted
collection is needed, it's probably for performance reasons, and you need
knobs like that to better fit the algorithm to your data and access
patterns.

And I guess I have to state upfront that my claim of "it's probably for
performance reasons" is just what my experience leads me to believe, so
it'll just have to be taken with the dosage of salt that best fits, too.

[2] As an aside, that a list's growth factor isn't controllable works
against things like sortedcontainers. Which isn't to say `list` should
expose it, but that if, e.g. ArrayList(growth_factor) was a thing, then a
more optimal implementation of e.g. BTree would be possible.


>
> > This basically echos Christopher Baker's sentiment of "whats the use
> case"
> > and "why doesn't a sorted() call at the end suffice" to some extent.
>
> The use case is obviously when you have to *maintain* sorting at all
> ti

[Python-Dev] Adding pep-8-casing-compliant aliases for unittest and logging

2021-11-11 Thread Ethan Furman

On 11/11/21 11:53 AM, Matt del Valle wrote:

> Okay, so from the replies so far it looks like this is very quickly going 
into the 'never gonna happen'
> dumpster, so in the interests of salvaging *something* out of it:

[...]

> I just dislike having to settle for 'it's what we've got'. With these two 
modules in particular, a lot
> of the arguments that have been made so far either don't apply or are not as 
strong (such as the
> confusion of having builtins.list, builtins.List and typing.List). 
Additionally, these are significantly
> more ancillary portions of the stdlib than the builtins, much less likely to 
cause as severe of a
> disruption (I personally don't believe a backward-compatible change like this 
which only adds aliases
> would be as disruptive as many people claim, but concede that that's 
subjective), and much less likely
> to have the implementation change so drastically as to want to change out 
types for factory functions
> or vice-versa.
>
> So perhaps we could narrow the scope of this down to just adding snake_case 
aliases to the logging and
> unittest modules [...]

+1

I would happily change all my unittest and logging usage to snake case if those 
aliases existed.

--
~Ethan~
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/EPUVFEBVIQCTL47IBR73UDGXV7DAC3U2/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Adding pep-8-casing-compliant aliases for unittest and logging

2021-11-11 Thread Ethan Furman

Woops, wrong list -- please disregard.

--
~Ethan~
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/SU4UOT6GVE4VJUWITFDXUP5TPW2TFCUT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Having Sorted Containers in stdlib?

2021-11-11 Thread Steven D'Aprano
On Thu, Nov 11, 2021 at 11:01:32AM -0800, Richard Levasseur wrote:

> Should the stdlib have e.g. SortedList? Probably not, because the use cases
> of such data types are too niche to a one-size-fits-all implementation, and
> there are too many implementations with too many of their own settings.
> Should it have e.g. BTree, RedBlackTree, SortedArrayList, etc? Probably so,
> because those are somewhat fundamental data structures and implementing,
> testing, and verifying them is very much non-trivial. While niche, having
> them at your immediate disposal is very powerful.

By that reasoning, we shouldn't have dicts, for exactly the same reason: 
for anyone wanting an associative array, there are so many implementation 
variants to choose from:

- hash table with linear probing
- hash table with chaining
- AVL tree
- red-black tree
- judy array
- splay tree
- treap
- scapegoat tree

and many more, and most of them can be tuned.

If Python didn't already have dict, your argument against SortedList 
would equally apply to it: they are "fundamental data structures and 
implementing, testing, and verifying them is very much non-trivial".

So if your argument is correct, that would imply that standardizing on 
one single dict implementation, one that isn't even tunable by the 
caller, was a mistake. We should have provided a dozen different hash 
tables, trees and treaps.

But... we didn't, and I don't think that Python is a worse language 
because we only have one associative array implementation in the stdlib.

Whenever I need an associative array, I don't lose sleep over whether I 
could get 2% better performance for hits, at a cost of 17% worse 
performance for misses, by swapping over to some other implementation. I 
just reach for dict, knowing that it will almost always be good enough.


> Last year, for fun, after wishing there was a SortedSet in the standard
> lib, I ended up implementing a Red-Black Tree and BTree based sorted
> dictionary/set[1]. After then trying to use them for my use case[2], I
> found that, in order to fully and truly exploit their benefits, the basic
> Sequence/Collection/Set/Dict APIs didn't really suffice. I needed APIs that
> would let me, e.g. binary search to a particular spot and then iterate, or
> give me a range between two points, etc.

I believe that sortedcontainers.SortedSet provides that functionality 
via the irange() method.

http://www.grantjenks.com/docs/sortedcontainers/sortedlist.html#sortedcontainers.SortedList.irange


-- 
Steve
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/DWGJAXOVAY64XU7FBQI6D3D5NNFKZEHW/
Code of Conduct: http://python.org/psf/codeofconduct/