[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Steven D'Aprano
On Sun, Dec 15, 2019 at 09:00:31PM -0800, Larry Hastings wrote:

> I think we should decide the question "should set 
> objects maintain insertion order?" literally without any consideration 
> about performance implications.

In your opening post:

> I also want FAST lookup because I soemtimes remove jobs when they're 
> subsequently marked "not ready".
[emphasis added]

So do you care about performance or not?

:-)

If you do care (a bit) about performance, what slowdown would you be 
willing to take to get ordered sets? That would give us a good idea of 
the potential trade-offs that might be acceptable.

Without being facetious[1] if you don't care about performance, you 
don't need a set, you could use a list.

There's a serious point here: there's nothing sets can do that couldn't 
be implemented using lists. The reason we have sets, rather than a 
bunch of extra methods like intersection and symmetric_difference on 
lists, is because we considered performance.



[1] Well... maybe a tiny bit... *wink*


-- 
Steven
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/H7GRS7MF5QH7EYMP4PLNMPXW6IDM42LG/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Larry Hastings


On 12/17/19 2:02 AM, Steven D'Aprano wrote:
Without being facetious[1] if you don't care about performance, you 
don't need a set, you could use a list.



Lists don't enforce uniqueness.  Apart from that a list would probably 
work fine for my needs; in my admittedly-modest workloads I would 
probably never notice a performance difference.  My anecdote was merely 
a jumping-off point for the discussion.



"I don't care about performance" is not because I'm aching for Python to 
run my code slowly.  It's because I'm 100% confident that the Python 
community will lovingly optimize the implementation.  So when I have my 
language designer hat on, I really don't concern myself with 
performance.  As I thought I said earlier in the thread, I think we 
should figure out the semantics we want /first,/ and /then/ we figure 
out how to make it fast.


I'll also cop to "a foolish consistency is the hobgoblin of little 
minds".  I lack this strongly mathematical view of sets others have 
espoused; instead I view them more like "dicts without values".  I'm 
therefore disgruntled by this inconsistency between what are I see as 
closely related data structures, and it makes sense to me that they'd 
maintain their insertion order the same way that dictionaries now do.



//arry/

___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/XECNAHYHWKA2NIVJNM6652SK26NLO4FT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Paul Moore
On Tue, 17 Dec 2019 at 11:13, Larry Hastings  wrote:
> I lack this strongly mathematical view of sets others have espoused; instead 
> I view them more like "dicts without values".  I'm therefore disgruntled by 
> this inconsistency between what are I see as closely related data structures, 
> and it makes sense to me that they'd maintain their insertion order the same 
> way that dictionaries now do.

I'll admit to a mathematical background, which probably influences my
views. But I view sets as "collections" of values - values are "in"
the set or not. I don't see them operationally, in the sense that you
add and remove things from the set. Adding and removing are mechanisms
for changing the set, but they aren't fundamentally what sets are
about (which for me is membership). So insertion order on sets is
largely meaningless for me (it doesn't matter *how* it got there, what
matters is whether it's there now).

Having said that, I also see dictionaries as mappings from key to
value, so insertion order is mostly not something I care about for
dictionaries either. I can see the use cases for ordered dictionaries,
and now we have insertion order, it's useful occasionally, but it's
not something that was immediately obvious to me based on my
intuition. Similarly, I don't see sets as *needing* insertion order,
although I concede that there probably are use cases, similar to
dictionaries. The question for me, as with dictionaries, is whether
the cost (in terms of clear semantics, constraints on future
implementation choices, and actual performance now) is justified by
the additional capabilities (remembering that personally, I'd consider
insertion order preservation as very much a niche requirement).

Paul
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/IFXPRC4BPYC5ZAZYCCQ5PQTBX5ALIOUK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Ivan Levkivskyi
On Tue, 17 Dec 2019 at 11:48, Paul Moore  wrote:

> On Tue, 17 Dec 2019 at 11:13, Larry Hastings  wrote:
> > I lack this strongly mathematical view of sets others have espoused;
> instead I view them more like "dicts without values".  I'm therefore
> disgruntled by this inconsistency between what are I see as closely related
> data structures, and it makes sense to me that they'd maintain their
> insertion order the same way that dictionaries now do.
>
> I'll admit to a mathematical background, which probably influences my
> views. But I view sets as "collections" of values - values are "in"
> the set or not. I don't see them operationally, in the sense that you
> add and remove things from the set. Adding and removing are mechanisms
> for changing the set, but they aren't fundamentally what sets are
> about (which for me is membership). So insertion order on sets is
> largely meaningless for me (it doesn't matter *how* it got there, what
> matters is whether it's there now).
>
> Having said that, I also see dictionaries as mappings from key to
> value, so insertion order is mostly not something I care about for
> dictionaries either. I can see the use cases for ordered dictionaries,
> and now we have insertion order, it's useful occasionally, but it's
> not something that was immediately obvious to me based on my
> intuition. Similarly, I don't see sets as *needing* insertion order,
> although I concede that there probably are use cases, similar to
> dictionaries. The question for me, as with dictionaries, is whether
> the cost (in terms of clear semantics, constraints on future
> implementation choices, and actual performance now) is justified by
> the additional capabilities (remembering that personally, I'd consider
> insertion order preservation as very much a niche requirement).
>

As a random data point, I often see the pattern where one needs to remove
duplicates
from the list while preserving the order of first appearance.
This is for example needed to get stability in various type-checking
situations (like union items, type variables in base classes, type queries
etc.)

One can write a four line helper to achieve this, but I can see that having
order preserving set could be useful.
So again, it is "nice to have but not really critical".

--
Ivan
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/STQVILVCC7ECBRF53ZQCIWQS45IYOIKJ/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Serhiy Storchaka

17.12.19 14:05, Ivan Levkivskyi пише:
As a random data point, I often see the pattern where one needs to 
remove duplicates

from the list while preserving the order of first appearance.
This is for example needed to get stability in various type-checking 
situations (like union items, type variables in base classes, type 
queries etc.)


One can write a four line helper to achieve this, but I can see that 
having order preserving set could be useful.

So again, it is "nice to have but not really critical".


This is a one-liner:

list(dict.fromkeys(iterable))
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/2P43G6N2MKQPSNVB4TBWY2ZHAOEISDPT/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Tim Peters
...

[Larry]
>> One prominent Python core developer** wanted this feature for years, and I 
>> recall
>> them saying something like:
>>
>> Guido says, "When a programmer iterates over a dictionary and they see the 
>> keys
>> shift around when the dictionary changes, they learn something!"  To that I 
>> say--"Yes!
>> They learn that Python is unreliable and random!"

[Tim]
> I never wanted ordered dicts, but never objected to them either.  All
> in all, given that _I_ haven't seen a performance degradation, I like
> that they're more predictable, and love the memory savings.

That reads like I was correcting Larry for misquoting _me_.  Sorry!
He wasn't quoting me, and I didn't think he was.  What he did quote
was delightfully snarky enough that I didn't want to snip it, but not
substantial enough to be worth the bother of addressing.  So I just
moved on to summarize what I said at the time (i.e., nothing).
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/LXJU3EKB2KM3BT6MTGDLWVHTZUUIZGJK/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: Should set objects maintain insertion order too?

2019-12-17 Thread Tim Peters
[Larry]
> "I don't care about performance" is not because I'm aching for Python to
> run my code slowly.  It's because I'm 100% confident that the Python
> community will lovingly optimize the implementation.

I'm not ;-)

>  So when I have my language designer hat on, I really don't concern myself
> with performance.  As I thought I said earlier in the thread, I think we 
> should
> figure out the semantics we want first, and then we figure out how to make it 
> fast.

Pretty much the opposite of how Python started.  The original lists
and dicts were deliberately designed to have dirt simple
implementations, in reaction against ABC's "theoretically optimal"
data structures that were a nightmare to maintain, and had so much
overhead to support "optimality" in all cases that they were, in fact,
much slower in common cases.

There isn't magic waiting to be uncovered here:  if you want O(1)
deletion at arbitrary positions in an ordered sequence, a doubly
linked list is _the_ way to do it.  That's why, e.g., Raymond said he
still uses a doubly linked list, instead of an ordered dict, in the
LRU cache implementation.  If that isn't clear, a cache can be
maintained in least-to-most recently accessed order with an ordered
dict like so:

if key in cache:
cached_value = cache.pop(key)  # remove key
   else:
compute cached_value
assert key not in cache
cache[key] = cached_value # most recently used at the end now
return cached_value

and deleting the least recently used is just "del
cache[next(iter(cache))]" (off topic, just noting this is a fine use
for the "first()" function that's the subject of a different thread).

We _could_ structure an ordered dict's hash+key+value records as a
doubly linked list instead (and avoid the need for O(N)
reorganizations).  But then we piss away much of the memory savings
(to store the new links) that was the _point_ of compact dicts to
begin with.

So there was a compromise.  No links, deletion on its own is O(1), but
can periodically require O(N) under-the-covers table reorganizations
to squash out the holes.  "Suitable enough" for ordered dicts, but
_thought_ to be unsuitable for ordered sets (which appear to have
higher rates of mixing deletions with insertions - the LRU cache
example being an exception).

But there are also other optimizations in the current set
implementation, so "fine, add the doubly linked list to sets but not
to dicts" is only part of it.

Which may or may not be possible to match, let alone beat, in an
ordered set implementation.  A practical barrier now is that Python is
too mature to bank on loving optimizations _after_ a change to a core
feature is released.  It's going to need a highly polished
implementation first.

I certainly don't object to anyone trying, but it won't be me ;-)
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/UXWGNLGC3BG2XSMYN5H73FVKP3O2XUUC/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] sqlite3 module and thread-safe SQLite library

2019-12-17 Thread Kacvinsky, Tom
We ran into an issue where having the SQLite library built with 
-DSQLITE_THREADSAFE=0,
but then the sqlite3 module (really, the _sqlite3.so0 crashing in threading 
code.  So I  have
to ask if it is intended that the sqlite3 Python module always  be built with a 
thread safe
SQLite library.

Thanks,

Tom
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/PKHFTUQSLXWYOZNX4QMSNCPAF5Y3TL2H/
Code of Conduct: http://python.org/psf/codeofconduct/


[Python-Dev] Re: sqlite3 module and thread-safe SQLite library

2019-12-17 Thread Gregory P. Smith
On Tue, Dec 17, 2019 at 12:30 PM Kacvinsky, Tom 
wrote:

> We ran into an issue where having the SQLite library built with
> -DSQLITE_THREADSAFE=0,
> but then the sqlite3 module (really, the _sqlite3.so0 crashing in
> threading code.  So I  have
> to ask if it is intended that the sqlite3 Python module always  be built
> with a thread safe
> SQLite library.
>
> Thanks,
>
> Tom
>

Given your experience I think you've answered the question: Yes.  But it
would be good if we could detect this at build time (is there a way to do
that?) and prevent the module from being built against the sqlite3 library
compiled in this unusual mode.  Threading support is required in order to
be a valid CPython platform, so all libraries used should be aware of that.

-gps


> ___
> Python-Dev mailing list -- python-dev@python.org
> To unsubscribe send an email to python-dev-le...@python.org
> https://mail.python.org/mailman3/lists/python-dev.python.org/
> Message archived at
> https://mail.python.org/archives/list/python-dev@python.org/message/PKHFTUQSLXWYOZNX4QMSNCPAF5Y3TL2H/
> Code of Conduct: http://python.org/psf/codeofconduct/
>
___
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/KJV2KH7PZ55TX6XQKQ63WZ5WE7ILO26E/
Code of Conduct: http://python.org/psf/codeofconduct/