Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-20 Thread Dima Tisnek
Totally random thought:

Can lru_cache be simplified to use an ordered dict instead of dict +
linked list?

On 15 September 2016 at 20:30, Serhiy Storchaka  wrote:
> On 15.09.16 19:13, Antoine Pitrou wrote:
>>
>> Since this micro-benchmark creates the keys in order just before
>> filling the dict with them, randomizing the insertion order destroys
>> the temporal locality of object header accesses when iterating over the
>> dict keys. *This* looks like the right explanation, not branch
>> mispredicts due to NULL checks.
>>
>> This also shows that a micro-benchmark that merely looks ok can actually
>> be a terrible proxy of actual performance.
>
>
> Thanks you for great explanation Antoine! I came to the same conclusions
> about randomized integers example, but didn't notice that this also is a
> main cause of the speed up of strings example.
>
>> As a further validation of this theory, let's dramatically decrease the
>> working set size on the initial benchmark:
>>
>> $ ./python -m timeit -s "d=dict.fromkeys(map(str,range(10**3)))"
>> "list(d)"
>>
>> -> Python 3.5: 10 loops, best of 3: 10.9 usec per loop
>> -> Python 3.6: 10 loops, best of 3: 9.72 usec per loop
>>
>> When the working set fits in the cache, this micro-benchmark is
>> only 12% slower on 3.5 compared to 3.6.
>> *This* much smaller difference (a mere 1.2ns difference per dict
>> element) could be attributed to eliminating the NULL checks, or to any
>> other streamlining of the core iteration logic.
>
>
> Yet one example, with random hashes and insertion order independent from the
> creation order.
>
> $ ./python -m timeit -s "import random; a = list(map(str, range(10**6)));
> random.shuffle(a); d = dict.fromkeys(a)" -- "list(d)"
>
> Python 3.5: 180, 180, 180 msec per loop
> Python 3.6: 171, 172, 171 msec per loop
>
> Python 3.6 is 5% faster and this looks closer to the actual performance.
>
>
>
> ___
> Python-Dev mailing list
> Python-Dev@python.org
> https://mail.python.org/mailman/listinfo/python-dev
> Unsubscribe:
> https://mail.python.org/mailman/options/python-dev/dimaqq%40gmail.com
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-20 Thread INADA Naoki
On Tue, Sep 20, 2016 at 7:02 PM, INADA Naoki  wrote:
> On Tue, Sep 20, 2016 at 6:56 PM, Dima Tisnek  wrote:
>> Totally random thought:
>>
>> Can lru_cache be simplified to use an ordered dict instead of dict +
>> linked list?
>>
>
> I think so.
> See also: http://bugs.python.org/issue28199#msg276938
>

FYI, current dict implementation is not optimized for removing first
item like this:

```
// When hit max_size
Py_ssize_t pos;
PyObject *key;
if (PyDict_Next(d, &pos, &key, NULL)) {
if (PyDict_DelItem(key) < 0) {
// error.
}
}
```

So, before changing lru_cache implementation, I (or someone else) should rewrite
OrderedDict which has O(1) "remove first item" method. (At least
max_size is not None).

But both of OrderedDict and lru_cache improvements can't be in 3.6
since 3.6 is beta now.
I'll try it after 3.6rc1.

-- 
INADA Naoki  
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-20 Thread Eric Snow
On Tue, Sep 20, 2016 at 5:11 AM, INADA Naoki  wrote:
> But both of OrderedDict and lru_cache improvements can't be in 3.6
> since 3.6 is beta now.
> I'll try it after 3.6rc1.

When you do, make sure you keep in mind the performance constraints of
*all* the OrderedDict methods.  The constraints are discussed somewhat
at the top of https://hg.python.org/cpython/file/tip/Objects/odictobject.c.

-eric
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Python 3.6 dict becomes compact and gets a private version; and keywords become ordered

2016-09-20 Thread Maciej Fijalkowski
On Thu, Sep 15, 2016 at 1:27 PM, Paul Moore  wrote:
> On 15 September 2016 at 10:43, Raymond Hettinger
>  wrote:
>> Something like this will reveal the true and massive improvement in 
>> iteration speed:
>>
>>  $ ./python.exe -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" 
>> "list(d)"
>
>>py -3.5 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 10 loops, best of 3: 66.2 msec per loop
>>py -3.6 -m timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 10 loops, best of 3: 27.8 msec per loop
>
> And for Victor:
>
>>py -3.5 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 
> Median +- std dev: 65.7 ms +- 3.8 ms
>>py -3.6 -m perf timeit -s "d=dict.fromkeys(map(str,range(10**6)))" "list(d)"
> 
> Median +- std dev: 27.9 ms +- 1.2 ms
>
> Just as a side point, perf provided essentially identical results but
> took 2 minutes as opposed to 8 seconds for timeit to do so. I
> understand why perf is better, and I appreciate all the work Victor
> did to create it, and analyze the results, but for getting a quick
> impression of how a microbenchmark performs, I don't see timeit as
> being *quite* as bad as Victor is claiming.
>
> I will tend to use perf now that I have it installed, and now that I
> know how to run a published timeit invocation using perf. It's a
> really cool tool. But I certainly won't object to seeing people
> publish timeit results (any more than I'd object to *any*
> mirobenchmark).
>
> Paul

How about we just make timeit show average and not disable the GC then
(two of the complaints that will not change the execution time)?
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com