Re: [Python-Dev] PEP 7 and braces { .... } on if

2017-06-01 Thread Victor Stinner
2017-05-31 19:27 GMT+02:00 Guido van Rossum :
> I interpret the PEP (...)

Right, the phrasing requires one to "interpret" it :-)

> (...) as saying that you should use braces everywhere but not
> to add them in code that you're not modifying otherwise. (I.e. don't go on a
> brace-adding rampage.) If author and reviewer of a PR disagree I would go
> with "add braces" since that's clearly the PEP's preference. This is C code.
> We should play it safe.

Would someone be kind enough to try to rephrase PEP 7 to make that
explicit? Just to avoid further tedious discussions of the C coding style...

Victor
___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings



When CPython's small block allocator was first merged in late February 
2001, it allocated memory in gigantic chunks it called "arenas".  These 
arenas were a massive 256 KILOBYTES apiece.


This tunable parameter has not been touched in the intervening 16 
years.  Yet CPython's memory consumption continues to grow.  By the time 
a current "trunk" build of CPython reaches the REPL prompt it's already 
allocated 16 arenas.


I propose we make the arena size larger.  By how much?  I asked Victor 
to run some benchmarks with arenas of 1mb, 2mb, and 4mb.  The results 
with 1mb and 2mb were mixed, but his benchmarks with a 4mb arena size 
showed measurable (>5%) speedups on ten benchmarks and no slowdowns.


What would be the result of making the arena size 4mb?

 * CPython could no longer run on a computer where, at startup, it could
   not allocate at least one contiguous 4 MB block of memory.
 * CPython programs would die slightly sooner in out-of-memory conditions.
 * CPython programs would use more memory.  How much?  Hard to say.  It
   depends on their allocation strategy.  I suspect most of the time it
   would be < 3 MB additional memory.  But for pathological allocation
   strategies the difference could be significant.  (e.g.: lots of
   allocs, followed by lots of frees, but the occasional object lives
   forever, which means that the arena it's in can never be freed.  If
   1 out of every 16 256 KB arenas is kept alive this way, and the objects
   are spaced out precisely such that now it's 1 for every 4 MB arena,
   max memory use would be the same but later stable memory use would
   hypothetically be 16x current.)
 * Many programs would be slightly faster now and then, simply because
   we call malloc() 1/16 as often.
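The arithmetic in that pathological case can be put in numbers with a toy model (purely illustrative; the function below is not pymalloc code, and the pinning pattern is the hypothetical one from the parenthetical above):

```python
KB, MB = 1024, 1024 * 1024

def retained_fraction(arena_size, pin_spacing):
    """Fraction of arenas that can never be freed, assuming one
    long-lived object lands every `pin_spacing` bytes and pins the
    whole arena it lives in."""
    if pin_spacing <= arena_size:
        return 1.0                       # every arena holds a pinned object
    return arena_size / pin_spacing      # 1 arena out of (spacing / size)

old = retained_fraction(256 * KB, 4 * MB)  # 1 out of every 16 arenas pinned
new = retained_fraction(4 * MB, 4 * MB)    # every 4 MB arena pinned
print(new / old)  # 16.0 -- the hypothetical "16x current" stable memory use
```

In this model max memory use is unchanged; only the long-term floor grows.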


What say you?  Vote for your favorite color of bikeshed now!


//arry/



Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings



On 06/01/2017 12:38 AM, Larry Hastings wrote:
I propose we make the arena size larger.  By how much?  I asked Victor 
to run some benchmarks with arenas of 1mb, 2mb, and 4mb. The results 
with 1mb and 2mb were mixed, but his benchmarks with a 4mb arena size 
showed measurable (>5%) speedups on ten benchmarks and no slowdowns.


Oh, sorry!  Meant to add: thanks, Victor, for running these benchmarks 
for me!


Where are my manners?!


//arry/


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Victor Stinner
2017-06-01 9:38 GMT+02:00 Larry Hastings :
> When CPython's small block allocator was first merged in late February 2001,
> it allocated memory in gigantic chunks it called "arenas".  These arenas
> were a massive 256 KILOBYTES apiece.

The arena size defines the strict minimum memory usage of Python: with
256 kB arenas, the smallest possible memory usage is 256 kB.

> What would be the result of making the arena size 4mb?

A minimum memory usage of 4 MB. It also means that if you allocate 4
MB + 1 byte, Python will take 8 MB from the operating system.
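Since arenas are allocated whole, usage rounds up to a multiple of the arena size; a quick sketch of that arithmetic (illustrative only):

```python
MB = 1024 * 1024

def bytes_taken_from_os(live_bytes, arena_size):
    # Ceiling division: even one extra byte costs a whole new arena.
    arenas = -(-live_bytes // arena_size)
    return arenas * arena_size

# Victor's example: one byte past 4 MB costs a second 4 MB arena.
print(bytes_taken_from_os(4 * MB + 1, 4 * MB) // MB)  # 8
```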

The GNU libc malloc uses a variable threshold to choose between sbrk()
(heap memory) or mmap(). It starts at 128 kB or 256 kB, and then is
adapted depending on the workload (I don't know how exactly).

I would prefer an adaptive arena size: for example, start at 256 kB
and then double the arena size as memory usage grows. The problem is
that pymalloc is currently designed for a fixed arena size; I have no
idea how hard it would be to make the size vary per allocated arena.
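The adaptive idea might look roughly like this toy policy (hypothetical class and thresholds; pymalloc has nothing like it today):

```python
class AdaptiveArenaSizer:
    """Start small and double the size of newly allocated arenas as
    total allocation grows, up to a cap (all numbers illustrative)."""

    def __init__(self, start=256 * 1024, cap=4 * 1024 * 1024):
        self.next_size = start
        self.cap = cap
        self.total = 0          # bytes handed out so far

    def next_arena_size(self):
        size = self.next_size
        self.total += size
        # once we've allocated 4x the current size, double the next arena
        if self.total >= 4 * self.next_size:
            self.next_size = min(self.next_size * 2, self.cap)
        return size
```

The hard part is not the policy itself but making the arena bookkeeping cope with mixed sizes.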

I have read that CPUs support "large pages" of between 2 MB and 1 GB,
instead of just 4 kB. Using large pages can have a significant impact
on performance. I don't know if we can do something to help the Linux
kernel use large pages for our memory, nor how we could do that :-)
Maybe using mmap() sizes closer to the large page size would help
Linux coalesce them into a big page? (Linux has something magic to
make applications use big pages transparently.)

More generally: I'm strongly in favor of making our memory allocator
more efficient :-D

When I wrote my tracemalloc PEP 454, I measured that Python calls
malloc(), realloc() or free() 270,000 times per second on average
when running the Python test suite:
https://www.python.org/dev/peps/pep-0454/#log-calls-to-the-memory-allocator
(Now I don't recall if it was really "malloc" or PyObject_Malloc, but
either way, we do a lot of memory allocations and deallocations ;-))

When I analyzed the timeline of CPython master's performance, I was
surprised to see that my change making PyMem_Malloc() use pymalloc
was one of the most significant "optimizations" of Python 3.6!
http://pyperformance.readthedocs.io/cpython_results_2017.html#pymalloc-allocator

CPython's performance depends heavily on the performance of our
memory allocator, or at least on the performance of pymalloc (the
specialized allocator for blocks <= 512 bytes).

By the way, Naoki INADA also proposed a different idea:

"Global freepool: Many types has it’s own freepool. Sharing freepool
can increase memory and cache efficiency. Add PyMem_FastFree(void*
ptr, size_t size) to store memory block to freepool, and PyMem_Malloc
can check global freepool first."
http://faster-cpython.readthedocs.io/cpython37.html

IMHO it's worth investigating this change as well.

Victor


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings

On 06/01/2017 12:57 AM, Victor Stinner wrote:

I would prefer to have an adaptative arena size. For example start at
256 kB and then double the arena size while the memory usage grows.
The problem is that pymalloc is currently designed for a fixed arena
size. I have no idea how hard it would be to make the size per
allocated arena.


It's not hard.  The major pain point is that it'd make the 
address_in_range() inline function slightly more expensive. Currently 
that code has ARENA_SIZE hardcoded inside it; if the size were dynamic 
we'd have to look up the size of the arena every time. This function is 
called every time we free a pointer, so it runs hundreds of thousands 
of times per second (as you point out).
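A simplified model of that difference (this is not the real address_in_range() code, which also consults per-pool headers; it just illustrates why a compile-time constant is cheaper):

```python
ARENA_SIZE = 256 * 1024  # a compile-time constant in today's obmalloc

def in_arena_fixed(p, arena_base):
    # Fixed size: one subtraction and one compare against a constant,
    # easy for the compiler to inline.
    return 0 <= p - arena_base < ARENA_SIZE

def in_arena_dynamic(p, arena_base, arena_sizes):
    # Dynamic size: every call also pays a lookup of this arena's size.
    return 0 <= p - arena_base < arena_sizes[arena_base]
```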


It's worth trying the experiment to see if dynamic arena sizes would 
make programs notably faster.  However... why not both?  Changing to 4 MB 
arenas now is a one-line change that on first examination seems mostly 
harmless and yields an easy (if tiny) performance win.  If someone 
wants to experiment with dynamic arenas, they could go right ahead, and 
if it works well we could merge that too.



//arry/


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 00:38:09 -0700
Larry Hastings  wrote:
>   * CPython programs would use more memory.  How much?  Hard to say.  It
> depends on their allocation strategy.  I suspect most of the time it
> would be < 3mb additional memory.  But for pathological allocation
> strategies the difference could be significant.  (e.g: lots of
> allocs, followed by lots of frees, but the occasional object lives
> forever, which means that the arena it's in can never be freed.  If
> 1 out of every 16 256k arenas is kept alive this way, and the objects
> are spaced out precisely such that now it's 1 for every 4mb arena,
> max memory use would be the same but later stable memory use would
> hypothetically be 16x current)

Yes, this is the same kind of reason the default page size is still 4KB
on many platforms today, despite typical memory size having grown by a
huge amount.  Apart from the cost of fragmentation as you mentioned,
another issue is when many small Python processes are running on a
machine: a 2MB overhead per process can compound to large numbers if
you have many (e.g. hundreds) such processes.

I would suggest we exert caution here.  Small benchmarks generally have
nice memory behaviour: not only do they not allocate a lot of memory,
but they often release it all at once after a single run.  Perhaps
some of those benchmarks would even be better off if we allocated 64MB
up front and never released it :-)

Long-running applications can be less friendly than that, keeping
various pieces of internal state with unpredictable lifetimes
(especially when they talk over the network with peers which come and go).
And long-running applications are typically where Python memory usage is
a sensitive matter.

If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.

>   * Many programs would be slightly faster now and then, simply because
> we call malloc() 1/16 as often.

malloc() you said?  Arenas are allocated using mmap() nowadays, right?

Regards

Antoine.




Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread INADA Naoki
Hello.

AFAIK, allocating an arena doesn't consume real (physical) memory.

* On Windows, VirtualAlloc is used for arenas.  A real memory page is
  assigned when the page is first used.
* On Linux and some other *nix, anonymous mmap is used.  A real page is
  assigned on first touch, as on Windows.

Arena size matters more for **freeing** memory.
Python returns memory to the system as follows:

1. When no block in a pool is used, the pool is returned to the arena.
2. When no pool in an arena is used, the arena is returned to the system.

So a single live memory block can prevent the whole arena from being returned.
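That freeing rule can be sketched as a toy check (names and the pool count here are illustrative, not obmalloc's):

```python
POOLS_PER_ARENA = 64  # e.g. a 256 KB arena divided into 4 KB pools

def arena_freeable(live_blocks_per_pool):
    # An arena goes back to the system only when *every* pool is empty.
    return all(n == 0 for n in live_blocks_per_pool)

pools = [0] * POOLS_PER_ARENA
pools[17] = 1                 # a single surviving object...
print(arena_freeable(pools))  # False -- ...pins the whole arena
```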


Some VMs (e.g. mono) use special APIs to return "real pages" from
allocated space:

* On Windows, VirtualFree() + VirtualAlloc() can be used to unassign pages.
* On Linux, madvise(..., MADV_DONTNEED) can be used.
* On other *nix, madvise(..., MADV_DONTNEED) + madvise(..., MADV_FREE)
  can be used.

See also:

https://github.com/corngood/mono/blob/ef186403b5e95a5c95c38f1f19d0c8d061f2ac37/mono/utils/mono-mmap.c#L204-L208
(Windows)
https://github.com/corngood/mono/blob/ef186403b5e95a5c95c38f1f19d0c8d061f2ac37/mono/utils/mono-mmap.c#L410-L424
(Unix)

I think we could return not-recently-used free pools to the system in the
same way, achieving both a larger arena size and better memory efficiency.

But I need to experiment more.

Regards,


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Victor Stinner
2017-06-01 10:19 GMT+02:00 Antoine Pitrou :
> Yes, this is the same kind of reason the default page size is still 4KB
> on many platforms today, despite typical memory size having grown by a
> huge amount.  Apart from the cost of fragmentation as you mentioned,
> another issue is when many small Python processes are running on a
> machine: a 2MB overhead per process can compound to large numbers if
> you have many (e.g. hundreds) such processes.
>
> I would suggest we exert caution here.  Small benchmarks generally have
> a nice memory behaviour: not only do they not allocate a lot of memory,
> but often they will release it all at once after a single run.  Perhaps
> some of those benchmarks would even be better off if we allocated 64MB
> up front and never released it :-)

By the way, the performance benchmark suite supports different ways to
trace memory usage:

* using tracemalloc
* using /proc/pid/smaps
* using VmPeak of /proc/pid/status (max RSS memory)

I wrote the code but I haven't tried it yet :-) Maybe we should check the
memory usage before deciding to change the arena size?

Victor


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Victor Stinner
2017-06-01 10:23 GMT+02:00 INADA Naoki :
> AFAIK, allocating arena doesn't eat real (physical) memory.
>
> * On Windows, VirtualAlloc is used for arena.  Real memory page is assigned
>   when the page is used first time.
> * On Linux and some other *nix, anonymous mmap is used.  Real page is
>   assigned when first touch, like Windows.

Memory fragmentation is also a real problem in pymalloc. I don't think
pymalloc was designed to reduce memory fragmentation.

I know one worst case: the Python parser allocates small objects
which are only freed when the parser completes, while other,
longer-lived objects are created in the meantime:
https://github.com/haypo/misc/blob/master/memory/python_memleak.py

In a perfect world, the parser would use a separate memory allocator
for this. But currently, the Python API doesn't offer that level of
granularity.

Victor


[Python-Dev] "Global freepool"

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 09:57:04 +0200
Victor Stinner  wrote:
> 
> By the way, Naoki INADA also proposed a different idea:
> 
> "Global freepool: Many types has it’s own freepool. Sharing freepool
> can increase memory and cache efficiency. Add PyMem_FastFree(void*
> ptr, size_t size) to store memory block to freepool, and PyMem_Malloc
> can check global freepool first."

This is already exactly how PyObject_Malloc() works.  Really, the fast
path for PyObject_Malloc() is:

size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
pool = usedpools[size + size];
if (pool != pool->nextpool) {
    /*
     * There is a used pool for this size class.
     * Pick up the head block of its free list.
     */
    ++pool->ref.count;
    bp = pool->freeblock;
    assert(bp != NULL);
    if ((pool->freeblock = *(block **)bp) != NULL) {
        UNLOCK();
        return (void *)bp;   // <- fast path!
    }


I don't think you can get much faster than that in a generic allocation
routine (unless you have a compacting GC where allocating memory is
basically bumping a single global pointer). IMHO the main advantage the
private freelists have is precisely that they're *private*, so they can
avoid a couple of conditional branches.
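For contrast, here is a minimal sketch of what a private freelist does (purely illustrative; CPython's real freelists are C arrays of object pointers, e.g. for tuples and frames):

```python
class PrivateFreelist:
    """Recycle objects of one known type/size: no size-class math,
    no pool bookkeeping, just a bounded push/pop."""
    MAXFREE = 2000  # bound, so idle memory doesn't grow forever

    def __init__(self, factory):
        self._factory = factory
        self._free = []

    def alloc(self):
        if self._free:
            return self._free.pop()  # fast path: one branch, one pop
        return self._factory()       # slow path: generic allocation

    def dealloc(self, obj):
        if len(self._free) < self.MAXFREE:
            self._free.append(obj)   # keep it for reuse
```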

Regards

Antoine.




Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings

On 06/01/2017 01:19 AM, Antoine Pitrou wrote:

If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.


I understand the desire for caution.  But I was hoping maybe we could 
experiment with 4 MB in trunk for a while?  We could change it to 1 MB, or 
even 256 kB, before beta 1 if we get anxious.




   * Many programs would be slightly faster now and then, simply because
 we call malloc() 1/16 as often.

malloc() you said?  Arenas are allocated using mmap() nowadays, right?


malloc() and free().  See _PyObject_ArenaMalloc (etc) in Objects/obmalloc.c.

On Windows Python uses VirtualAlloc(), and I don't know what the 
implications are of that.



//arry/


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings

On 06/01/2017 01:41 AM, Larry Hastings wrote:

On 06/01/2017 01:19 AM, Antoine Pitrou wrote:

malloc() you said?  Arenas are allocated using mmap() nowadays, right?
malloc() and free().  See _PyObject_ArenaMalloc (etc) in 
Objects/obmalloc.c.


Oh, sorry, I forgot how to read.  If ARENAS_USE_MMAP is on, it uses 
mmap().  I can't figure out when or how MAP_ANONYMOUS gets set, but if I 
step into _PyObject_Arena.alloc() it indeed calls 
_PyObject_ArenaMmap(), which uses mmap().  So, huzzah!, we use mmap() to 
allocate our enormous 256 kB arenas.



//arry/


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 01:41:15 -0700
Larry Hastings  wrote:
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
> > If you'd like to go that way anyway, I would suggest 1MB as a starting
> > point in 3.7.  
> 
> I understand the desire for caution.  But I was hoping maybe we could 
> experiment with 4mb in trunk for a while?  We could change it to 1mb--or 
> even 256k--before beta 1 if we get anxious.

Almost nobody tests "trunk" (or "master" :-)) on production systems.  At
best, a couple of rare open source projects run their test suites on
the pre-release betas, but that's all.  So we are unlikely to spot
memory-usage ballooning before the final release.

> >>* Many programs would be slightly faster now and then, simply because
> >>  we call malloc() 1/16 as often.  
> > malloc() you said?  Arenas are allocated using mmap() nowadays, right?  
> 
> malloc() and free().  See _PyObject_ArenaMalloc (etc) in Objects/obmalloc.c.

_PyObject_ArenaMalloc should only be used if the OS doesn't support
mmap() or MAP_ANONYMOUS (see ARENAS_USE_MMAP).  Otherwise
_PyObject_ArenaMmap is used.

Apparently OS X doesn't have MAP_ANONYMOUS, but it has the synonymous
MAP_ANON:
https://github.com/HaxeFoundation/hashlink/pull/12

Regards

Antoine.




Re: [Python-Dev] "Global freepool"

2017-06-01 Thread INADA Naoki
Hi,

As you said, I think PyObject_Malloc() is fast enough.
But PyObject_Free() is somewhat complex.

Actually, there are some freelists (e.g. tuple, dict, frame) and
they improve performance significantly.


My "global unified freelist" idea is unify them.  The merit is:

* Unify _PyXxx_DebugMallocStats().  Some freelists provide
  it but some doesn't.

* Unify PyXxx_ClearFreeList().  Some freelists doesn't provide
  it and it may disturb returning memory to system.

* Potential better CPU cache hit ratio by unifying LRU if some
  freelists has same memory block size.

This idea is partially implemented in https://github.com/methane/cpython/pull/3
But there are no significant difference about speed or memory usage.

Regards,

On Thu, Jun 1, 2017 at 5:40 PM, Antoine Pitrou  wrote:
> On Thu, 1 Jun 2017 09:57:04 +0200
> Victor Stinner  wrote:
>>
>> By the way, Naoki INADA also proposed a different idea:
>>
>> "Global freepool: Many types has it’s own freepool. Sharing freepool
>> can increase memory and cache efficiency. Add PyMem_FastFree(void*
>> ptr, size_t size) to store memory block to freepool, and PyMem_Malloc
>> can check global freepool first."
>
> This is already exactly how PyObject_Malloc() works.  Really, the fast
> path for PyObject_Malloc() is:
>
> size = (uint)(nbytes - 1) >> ALIGNMENT_SHIFT;
> pool = usedpools[size + size];
> if (pool != pool->nextpool) {
> /*
>  * There is a used pool for this size class.
>  * Pick up the head block of its free list.
>  */
> ++pool->ref.count;
> bp = pool->freeblock;
> assert(bp != NULL);
> if ((pool->freeblock = *(block **)bp) != NULL) {
> UNLOCK();
> return (void *)bp;   // <- fast path!
> }
>
>
> I don't think you can get much faster than that in a generic allocation
> routine (unless you have a compacting GC where allocating memory is
> basically bumping a single global pointer). IMHO the main thing the
> private freelists have is that they're *private* precisely, so they can
> avoid a couple of conditional branches.
>
> Regards
>
> Antoine.
>
>


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Wed, 31 May 2017 14:09:20 -0600
Jim Baker  wrote:
> 
> But I object to a completely new feature being added to 2.7 to support the
> implementation of event loop SSL usage. This feature cannot be construed as
> a security fix, and therefore does not qualify as a feature that can be
> added to CPython 2.7 at this point in its lifecycle.

I agree with this sentiment.  Also see comments by Ben Darnell and
others here:
https://github.com/python/peps/pull/272#pullrequestreview-41388700

Moreover, I think that a 2.7 policy decision shouldn't depend on
whatever future plans there are for Requests.  The slippery slope of
relaxing the maintenance policy on 2.7 has reached absurd extremes.
If Requests is to remain 2.7-compatible, it's up to Requests to do the
necessary work to do so.

Regards

Antoine.




Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Victor Stinner
2017-06-01 10:41 GMT+02:00 Larry Hastings :
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
> If you'd like to go that way anyway, I would suggest 1MB as a starting
> point in 3.7.
>
> I understand the desire for caution.  But I was hoping maybe we could
> experiment with 4mb in trunk for a while?  We could change it to 1mb--or
> even 256k--before beta 1 if we get anxious.

While I can't explain why in depth, I would prefer *not* to touch
the default arena size at this point.

We need more data: for example, measuring the memory usage of different
workloads under different arena sizes.

It's really hard to tune a memory allocator for *all* use cases.

A simple enhancement would be to add an environment variable to change
the arena size at Python startup. Example: PYTHONARENASIZE=1M. If you
*know* that your application will allocate at least 2 GB, you may even
want to try PYTHONARENASIZE=1G, which is more likely to use a single
large page... Such a parameter couldn't be used by default: it would make
the default Python memory usage insane ;-)
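Parsing such a hypothetical PYTHONARENASIZE variable would be straightforward; a sketch (this variable does not exist in CPython, and the suffix syntax here is an assumption):

```python
import os

_SUFFIXES = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}

def arena_size_from_env(default=256 * 1024):
    """Read a hypothetical PYTHONARENASIZE value like "262144", "1M", "1G"."""
    raw = os.environ.get("PYTHONARENASIZE", "").strip()
    if not raw:
        return default
    suffix = raw[-1].upper()
    if suffix in _SUFFIXES:
        return int(raw[:-1]) * _SUFFIXES[suffix]
    return int(raw)
```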

Victor


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Victor Stinner
2017-06-01 10:57 GMT+02:00 Antoine Pitrou :
> If Requests is to remain 2.7-compatible, it's up to Requests to do the
> necessary work to do so.

In practice, CPython does include Requests in ensurepip. Because of
that, Requests cannot use any C extension. CPython 2.7's ensurepip
thus constrains the evolution of Requests even on Python 3.7. Is my
rationale broken somehow?

The root issue is getting a very secure TLS connection in pip to
download packages from pypi.python.org. In CPython 3.6, we made
multiple small steps to include more and more features in the stdlib
ssl module, but I understand that the lack of root certificate
authorities (CAs) on Windows and macOS is still a major blocker
for pip. That's why pip uses Requests, which uses certifi (Mozilla's
bundled root certificate authorities).

pip, and therefore Requests, are part of the current success of the Python
community. I disagree that Requests' practical issues are not our
problem.

--

Moreover, the PEP 546 rationale includes not only Requests, but also
the important PEP 543, which would make CPython 3.7 more secure in the
long term. Do you also disagree on the need for PEP 546 (the backport)
to make PEP 543 (the new TLS API) feasible in practice?

Victor


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread INADA Naoki
> * On Linux, madvise(..., MADV_DONTNEED) can be used.

Recent Linux has MADV_FREE, which is faster than MADV_DONTNEED:
https://lwn.net/Articles/591214/


Re: [Python-Dev] "Global freepool"

2017-06-01 Thread Victor Stinner
2017-06-01 10:40 GMT+02:00 Antoine Pitrou :
> This is already exactly how PyObject_Malloc() works. (...)

Oh ok, good to know...

> IMHO the main thing the
> private freelists have is that they're *private* precisely, so they can
> avoid a couple of conditional branches.

I would like to understand how private free lists are "so much"
faster. In fact, I don't recall anyone ever *measuring* the speedup
of these free lists :-)

By the way, the Linux kernel uses a "slab" allocator for its most
common object types, like inodes. I'm curious whether CPython would
benefit from a similar allocator for our most common object types,
for example those which already use a free list.

https://en.wikipedia.org/wiki/Slab_allocation
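The slab idea in miniature (an illustrative toy, far simpler than the kernel's SLAB/SLUB): each slab serves exactly one object size, carved from a larger chunk, and freed slots go on a per-slab free list for O(1) reuse.

```python
class Slab:
    def __init__(self, obj_size, slab_size=4096):
        # A slab is one chunk, pre-divided into fixed-size slots.
        self.obj_size = obj_size
        self.capacity = slab_size // obj_size
        self.free_slots = list(range(self.capacity))  # all slots start free

    def alloc(self):
        # O(1) pop; no searching, no size-class computation per call.
        return self.free_slots.pop() if self.free_slots else None

    def free(self, slot):
        self.free_slots.append(slot)  # O(1) push back for reuse
```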

Victor


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread INADA Naoki
Thanks for the detailed info.

But I don't think it's a big problem. Arenas are returned to the
system only opportunistically, so other processes shouldn't rely on it.

And I don't propose to stop returning arenas to the system.
I just mean that per-pool (a pool being part of an arena) MADV_DONTNEED
could reduce RSS.

If we used a very large arena, or stopped returning arenas to the system,
it could become a problem.

Regards,

On Thu, Jun 1, 2017 at 6:05 PM, Siddhesh Poyarekar  wrote:
> On Thursday 01 June 2017 01:53 PM, INADA Naoki wrote:
>> * On Linux, madvise(..., MADV_DONTNEED) can be used.
>
> madvise does not reduce the commit charge in the Linux kernel, so in
> high consumption scenarios (and where memory overcommit is disabled or
> throttled) you'll see programs dying with OOM despite the MADV_DONTNEED.
>  The way we solved it in glibc was to use mprotect to drop PROT_READ and
> PROT_WRITE in blocks that we don't need when we detect that the system
> is not configured to overcommit (using /proc/sys/vm/overcommit_memory).
> You'll need to fix the protection again though if you want to reuse the
> block.
>
> Siddhesh


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou

Le 01/06/2017 à 11:13, Victor Stinner a écrit :
> That's why pip uses Requests which uses certifi (Mozilla
> bundled root certificate authorities.)

pip could use certifi without using Requests.  My guess is that Requests
is used mostly because it eases coding.

> pip and so Requests are part of the current success of the Python
> community.

pip is, but I'm not convinced about Requests.  If Requests didn't exist,
people (including pip's developers) would use another HTTP-fetching
library; they wouldn't switch to Go or Ruby.

> Do you also disagree on the need of the need of the PEP 546
> (backport) to make the PEP 543 (new TLS API) feasible in practice?

Yes, I disagree.  We needn't backport that new API to Python 2.7.
Perhaps it's time to be reasonable: Python 2.7 has been in bugfix-only
mode for a very long time.  Python 3.6 is out.  We should move on.

Regards

Antoine.


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Thomas Wouters
On Thu, Jun 1, 2017 at 10:45 AM, Larry Hastings  wrote:

> On 06/01/2017 01:41 AM, Larry Hastings wrote:
>
> On 06/01/2017 01:19 AM, Antoine Pitrou wrote:
>
> malloc() you said?  Arenas are allocated using mmap() nowadays, right?
>
> malloc() and free().  See _PyObject_ArenaMalloc (etc) in
> Objects/obmalloc.c.
>
>
> Oh, sorry, I forgot how to read.  If ARENAS_USE_MMAP is on it uses
> mmap().  I can't figure out when or how MAP_ANONYMOUS gets set,
>

MAP_ANONYMOUS is set by sys/mman.h (where the system supports it), just
like the other MAP_* defines.


> but if I step into the _PyObject_Arena.alloc() it indeed calls
> _PyObject_ArenaMmap() which uses mmap().  So, huzzah!, we use mmap() to
> allocate our enormous 256kb arenas.
>
>
> */arry*
>


-- 
Thomas Wouters 

Hi! I'm an email virus! Think twice before sending your email to help me
spread!


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread INADA Naoki
x86's huge pages are 2 MB, and some Linux systems enable the
"Transparent Huge Pages" feature.

Maybe a 2 MB arena size would be better for TLB efficiency,
especially for servers with massive memory.


On Thu, Jun 1, 2017 at 4:38 PM, Larry Hastings  wrote:
>
>
> When CPython's small block allocator was first merged in late February 2001,
> it allocated memory in gigantic chunks it called "arenas".  These arenas
> were a massive 256 KILOBYTES apiece.
>
> This tunable parameter has not been touched in the intervening 16 years.
> Yet CPython's memory consumption continues to grow.  By the time a current
> "trunk" build of CPython reaches the REPL prompt it's already allocated 16
> arenas.
>
> I propose we make the arena size larger.  By how much?  I asked Victor to
> run some benchmarks with arenas of 1mb, 2mb, and 4mb.  The results with 1mb
> and 2mb were mixed, but his benchmarks with a 4mb arena size showed
> measurable (>5%) speedups on ten benchmarks and no slowdowns.
>
> What would be the result of making the arena size 4mb?
>
> CPython could no longer run on a computer where at startup it could not
> allocate at least one 4mb contiguous block of memory.
> CPython programs would die slightly sooner in out-of-memory conditions.
> CPython programs would use more memory.  How much?  Hard to say.  It depends
> on their allocation strategy.  I suspect most of the time it would be < 3mb
> additional memory.  But for pathological allocation strategies the
> difference could be significant.  (e.g: lots of allocs, followed by lots of
> frees, but the occasional object lives forever, which means that the arena
> it's in can never be freed.  If 1 out of every 16 256k arenas is kept alive
> this way, and the objects are spaced out precisely such that now it's 1 for
> every 4mb arena, max memory use would be the same but later stable memory
> use would hypothetically be 16x current)
> Many programs would be slightly faster now and then, simply because we call
> malloc() 1/16 as often.
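
The pathological case in that list can be put into rough numbers (a toy model for illustration, not a measurement):

```python
KB = 1024
MB = 1024 * KB

ARENA_OLD = 256 * KB
ARENA_NEW = 4 * MB

# Worst case from the quoted mail: one long-lived object per 4MB of
# address space, spaced so it pins exactly one arena in both layouts.
span = 256 * MB                                  # transient allocations
pinned_old = (span // ARENA_NEW) * ARENA_OLD     # one 256KB arena kept per 4MB
pinned_new = (span // ARENA_NEW) * ARENA_NEW     # one whole 4MB arena kept per 4MB

print(pinned_new // pinned_old)   # factor of increase in stable memory use
```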
>
>
> What say you?  Vote for your favorite color of bikeshed now!
>
>
> /arry
>
>


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Louie Lu
For the ARENA_SIZE, would it be better to set it via ./configure,
rather than hard-coding it in the C files?

2017-06-01 17:37 GMT+08:00 INADA Naoki :
> x86's hugepage is 2MB.
> And some Linux enables "Transparent Huge Page" feature.
>
> Maybe, 2MB arena size is better for TLB efficiency.
> Especially, for servers having massive memory.
>
>
> On Thu, Jun 1, 2017 at 4:38 PM, Larry Hastings  wrote:
>>
>>
>> When CPython's small block allocator was first merged in late February 2001,
>> it allocated memory in gigantic chunks it called "arenas".  These arenas
>> were a massive 256 KILOBYTES apiece.
>>
>> This tunable parameter has not been touched in the intervening 16 years.
>> Yet CPython's memory consumption continues to grow.  By the time a current
>> "trunk" build of CPython reaches the REPL prompt it's already allocated 16
>> arenas.
>>
>> I propose we make the arena size larger.  By how much?  I asked Victor to
>> run some benchmarks with arenas of 1mb, 2mb, and 4mb.  The results with 1mb
>> and 2mb were mixed, but his benchmarks with a 4mb arena size showed
>> measurable (>5%) speedups on ten benchmarks and no slowdowns.
>>
>> What would be the result of making the arena size 4mb?
>>
>> CPython could no longer run on a computer where at startup it could not
>> allocate at least one 4mb contiguous block of memory.
>> CPython programs would die slightly sooner in out-of-memory conditions.
>> CPython programs would use more memory.  How much?  Hard to say.  It depends
>> on their allocation strategy.  I suspect most of the time it would be < 3mb
>> additional memory.  But for pathological allocation strategies the
>> difference could be significant.  (e.g: lots of allocs, followed by lots of
>> frees, but the occasional object lives forever, which means that the arena
>> it's in can never be freed.  If 1 out of every 16 256k arenas is kept alive
>> this way, and the objects are spaced out precisely such that now it's 1 for
>> every 4mb arena, max memory use would be the same but later stable memory
>> use would hypothetically be 16x current)
>> Many programs would be slightly faster now and then, simply because we call
>> malloc() 1/16 as often.
>>
>>
>> What say you?  Vote for your favorite color of bikeshed now!
>>
>>
>> /arry
>>
>>


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 18:37:17 +0900
INADA Naoki  wrote:
> x86's hugepage is 2MB.
> And some Linux enables "Transparent Huge Page" feature.
> 
> Maybe, 2MB arena size is better for TLB efficiency.
> Especially, for servers having massive memory.

But, since Linux is able to merge pages transparently, we perhaps
needn't allocate large pages explicitly.

Another possible strategy is: allocate several arenas at once (using a
larger mmap() call), and use MADV_DONTNEED to relinquish individual
arenas.
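
That strategy can be sketched with the stdlib mmap module (a hedged illustration rather than a drop-in design: mmap.madvise() only exists on Python 3.8+, and MADV_DONTNEED is platform-specific, so both are feature-tested here):

```python
import mmap

ARENA = 4 * 1024 * 1024
NARENAS = 4

# One big anonymous mapping holding several "arenas" at once.
block = mmap.mmap(-1, NARENAS * ARENA)
block[0:ARENA] = b"\x01" * ARENA          # touch the first arena's pages

# Relinquish the first arena's physical pages without unmapping:
# the address range stays reserved so the arena can be reused later.
if hasattr(mmap, "MADV_DONTNEED") and hasattr(block, "madvise"):
    block.madvise(mmap.MADV_DONTNEED, 0, ARENA)

block.close()
```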

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Chris Angelico
On Thu, Jun 1, 2017 at 7:23 PM, Antoine Pitrou  wrote:
>> Do you also disagree on the need of the PEP 546
>> (backport) to make the PEP 543 (new TLS API) feasible in practice?
>
> Yes, I disagree.  We needn't backport that new API to Python 2.7.
> Perhaps it's time to be reasonable: Python 2.7 has been in bugfix-only
> mode for a very long time.  Python 3.6 is out.  We should move on.

But it is in *security fix* mode for at least another three years
(ish). Proper use of TLS certificates is a security question.

How hard would it be for the primary codebase of Requests to be
written to use a C extension, but with a fallback *for pip's own
bootstrapping only* that provides one single certificate - the
authority that signs pypi.python.org? The point of the new system is
that back-ends can be switched out; a stub back-end that authorizes
only one certificate would theoretically be possible, right? Or am I
completely misreading which part needs C?

ChrisA


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 19:50:22 +1000
Chris Angelico  wrote:
> On Thu, Jun 1, 2017 at 7:23 PM, Antoine Pitrou  wrote:
> >> Do you also disagree on the need of the PEP 546
> >> (backport) to make the PEP 543 (new TLS API) feasible in practice?  
> >
> > Yes, I disagree.  We needn't backport that new API to Python 2.7.
> > Perhaps it's time to be reasonable: Python 2.7 has been in bugfix-only
> > mode for a very long time.  Python 3.6 is out.  We should move on.  
> 
> But it is in *security fix* mode for at least another three years
> (ish). Proper use of TLS certificates is a security question.

Why are you bringing up "proper use of TLS certificates"?  Python 2.7
doesn't need another backport for that.  The certifi package is
available for Python 2.7 and can be integrated simply with the existing
ssl module.
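
The integration Antoine describes is indeed small; something along these lines (a sketch only, with certifi, a third-party package, handled as optional):

```python
import ssl

try:
    import certifi              # third-party: Mozilla's CA bundle as a package
    cafile = certifi.where()    # path to its bundled cacert.pem
except ImportError:
    cafile = None               # fall back to the platform's default CAs

ctx = ssl.create_default_context(cafile=cafile)
assert ctx.verify_mode == ssl.CERT_REQUIRED   # cert checking is on by default
```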

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Chris Angelico
On Thu, Jun 1, 2017 at 8:01 PM, Antoine Pitrou  wrote:
> On Thu, 1 Jun 2017 19:50:22 +1000
> Chris Angelico  wrote:
>> On Thu, Jun 1, 2017 at 7:23 PM, Antoine Pitrou  wrote:
>> >> Do you also disagree on the need of the PEP 546
>> >> (backport) to make the PEP 543 (new TLS API) feasible in practice?
>> >
>> > Yes, I disagree.  We needn't backport that new API to Python 2.7.
>> > Perhaps it's time to be reasonable: Python 2.7 has been in bugfix-only
>> > mode for a very long time.  Python 3.6 is out.  We should move on.
>>
>> But it is in *security fix* mode for at least another three years
>> (ish). Proper use of TLS certificates is a security question.
>
> Why are you bringing up "proper use of TLS certificates"?  Python 2.7
> doesn't need another backport for that.  The certifi package is
> available for Python 2.7 and can be integrated simply with the existing
> ssl module.

As stated in this thread, OS-provided certificates are not handled by
that. For instance, if a local administrator distributes a self-signed
cert for the intranet server, web browsers will use it, but pip will
not.

ChrisA


Re: [Python-Dev] "Global freepool"

2017-06-01 Thread INADA Naoki
I thought pymalloc was a SLAB allocator.
What is the difference between a SLAB allocator and pymalloc?
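
For readers following along, the "private free list" idea being compared in this sub-thread can be sketched in pure Python (a toy illustration only; the real free lists, e.g. for floats and tuples, live in C):

```python
class FreeList:
    """Recycle dead objects of one type instead of hitting the allocator."""

    def __init__(self, factory, maxsize=80):
        self._factory = factory
        self._free = []           # LIFO: reuse the most recently freed object
        self._maxsize = maxsize   # bound the memory kept alive by the pool

    def alloc(self):
        # Pop from the free list when possible; allocate only when empty.
        return self._free.pop() if self._free else self._factory()

    def dealloc(self, obj):
        if len(self._free) < self._maxsize:
            self._free.append(obj)

pool = FreeList(dict)
d = pool.alloc()
pool.dealloc(d)
assert pool.alloc() is d   # the freed object was reused, no new allocation
```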

On Thu, Jun 1, 2017 at 6:20 PM, Victor Stinner  wrote:
> 2017-06-01 10:40 GMT+02:00 Antoine Pitrou :
>> This is already exactly how PyObject_Malloc() works. (...)
>
> Oh ok, good to know...
>
>> IMHO the main thing the
>> private freelists have is that they're *private* precisely, so they can
>> avoid a couple of conditional branches.
>
> I would like to understand how private free lists are "so much"
> faster. In fact, I don't recall if someone *measured* the performance
> speedup of these free lists :-)
>
> By the way, the Linux kernel uses a "SLAB" allocator for the most
> common object types like inode. I'm curious to know if CPython would
> benefit of a similar allocator for our most common object types? For
> example types which already use a free list.
>
> https://en.wikipedia.org/wiki/Slab_allocation
>
> Victor


Re: [Python-Dev] "Global freepool"

2017-06-01 Thread Serhiy Storchaka

01.06.17 12:20, Victor Stinner wrote:

2017-06-01 10:40 GMT+02:00 Antoine Pitrou :

This is already exactly how PyObject_Malloc() works. (...)


Oh ok, good to know...


IMHO the main thing the
private freelists have is that they're *private* precisely, so they can
avoid a couple of conditional branches.


I would like to understand how private free lists are "so much"
faster. In fact, I don't recall if someone *measured* the performance
speedup of these free lists :-)


I measured the performance boost of adding a free list for dict keys 
structures. [1] This proposition was withdrawn in favor of using 
PyObject_Malloc(). The latter solution is slightly slower, but simpler.


But even private free lists are not fast enough. That is why some 
functions (zip, dict.items iterator, property getter, etc) have private 
caches for tuples and the FASTCALL protocol added so much speedup.


At the end we have multiple levels of free lists and caches, and every level 
adds a good speedup (otherwise it wouldn't be used).


I also have found that much time is spent in the dealloc functions for tuples 
called before placing an object back in a free list or memory pool. They 
use the trashcan mechanism to guard against stack overflow, and it is 
costly in comparison with clearing a 1-element tuple.


[1] https://bugs.python.org/issue16465



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 20:05:48 +1000
Chris Angelico  wrote:
> 
> As stated in this thread, OS-provided certificates are not handled by
> that. For instance, if a local administrator distributes a self-signed
> cert for the intranet server, web browsers will use it, but pip will
> not.

That's true.  But:
1) pip could grow a config entry to set an alternative or additional CA
path
2) it is not a "security fix", as not being able to recognize
privately-signed certificates is not a security breach.  It's a new
feature.

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 10:23, Antoine Pitrou  wrote:
> 
> Yes, I disagree.  We needn't backport that new API to Python 2.7.
> Perhaps it's time to be reasonable: Python 2.7 has been in bugfix-only
> mode for a very long time.  Python 3.6 is out.  We should move on.

Who is the “we” that should move on? Python core dev? Or the Python ecosystem? 
Because if it’s the latter, then I’m going to tell you right now that the 
ecosystem did not get the memo. If you check the pip download numbers for 
Requests in the last month you’ll see that 80% of our downloads (9.4 million) 
come from Python 2. That is an enormous proportion: far too many to consider 
not supporting that user-base. So Requests is basically bound to support that 
userbase.

Requests is stuck in a place from which it cannot move. We feel we cannot drop 
2.7 support. We want to support as many TLS backends as possible. We want to 
enable the pip developers to focus on their features, rather than worrying 
about HTTP and TLS. And we want people to adopt the async/await keywords as 
much as possible. It turns out that we cannot satisfy all of those desires with 
the status quo, so we proposed an alternative that involves backporting 
MemoryBIO.
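
For context, MemoryBIO is what lets the TLS state machine run without owning a socket, which is exactly what an async framework needs. A minimal Python 3 sketch:

```python
import ssl

ctx = ssl.create_default_context()
incoming = ssl.MemoryBIO()   # bytes received from the network are fed in here
outgoing = ssl.MemoryBIO()   # bytes to be sent to the peer come out of here
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.com")

try:
    tls.do_handshake()       # no socket involved: pure in-memory state machine
except ssl.SSLWantReadError:
    pass                     # it needs peer data that we have not fed it yet

hello = outgoing.read()      # the ClientHello, ready to send over any transport
assert hello                 # non-empty: the framework ships these bytes itself
```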

So, to the notion of “we need to move on”, I say this: we’re trying. We really, 
genuinely, are. I don’t know how much stronger of a signal I can give about how 
much Requests cares about Python 3 than to signal that we’re trying to adopt 
async/await and be compatible with asyncio. I believe that Python 3 is the 
present and future of this language. But right now, we can’t properly adopt it 
because we have a userbase that you want to leave behind, and we don’t.

I want to move on, but I want to bring that 80% of our userbase with us when we 
do. My reading of your post is that you would rather Requests not adopt the 
async/await paradigm than backport MemoryBIO: is my understanding correct? If 
so, fair enough. If not, I’d like to try to work with you to a place where we 
can all get what we want.

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou

Le 01/06/2017 à 12:23, Cory Benfield a écrit :
> 
> No it can’t.
> 
> OpenSSL builds chains differently, and disregards some metadata that Windows 
> and macOS store, which means that cert validation will work differently than 
> in the system store. This can lead to pip accepting a cert marked as 
> “untrusted for SSL”, for example, which would be pretty bad.

Are you claiming that OpenSSL certificate validation is insecure and
shouldn't be used at all?  I have never heard that claim before.

Regards

Antoine.


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 11:18, Antoine Pitrou  wrote:
> 
> On Thu, 1 Jun 2017 20:05:48 +1000
> Chris Angelico  wrote:
>> 
>> As stated in this thread, OS-provided certificates are not handled by
>> that. For instance, if a local administrator distributes a self-signed
>> cert for the intranet server, web browsers will use it, but pip will
>> not.
> 
> That's true.  But:
> 1) pip could grow a config entry to set an alternative or additional CA
> path

No it can’t.

Exporting the Windows or macOS security store to a big file of PEM is a 
security vulnerability because the macOS and Windows security stores expect to 
work with their own certificate chain building algorithms. OpenSSL builds 
chains differently, and disregards some metadata that Windows and macOS store, 
which means that cert validation will work differently than in the system 
store. This can lead to pip accepting a cert marked as “untrusted for SSL”, for 
example, which would be pretty bad.

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread David Wilson
Hi Cory,

On Thu, Jun 01, 2017 at 11:22:21AM +0100, Cory Benfield wrote:

>  We want to support as many TLS backends as possible.

Just a wild idea, but have you investigated a pure-Python fallback for
2.7 such as TLSlite? Of course the fallback need only be used during
bootstrapping, and the solution would be compatible with every stable
LTS Linux distribution release that was not shipping the latest and
greatest 2.7.


David


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 11:22:21 +0100
Cory Benfield  wrote:
> 
> Who is the “we” that should move on? Python core dev? Or the Python ecosystem?

Sorry.  Python core dev certainly.  As for the rest of the ecosystem, it
is moving on as well.

> Requests is stuck in a place from which it cannot move.
> We feel we cannot drop 2.7 support. We want to support as many TLS
> backends as possible.

Well, certain features could be 3.x-only, couldn't they?

> We want to enable the pip developers to focus on
> their features, rather than worrying about HTTP and TLS. And we want
> people to adopt the async/await keywords as much as possible.

I don't get what async/await keywords have to do with this.  We're
talking about backporting the ssl memory BIO object...

(also, as much as I think asyncio is a good thing, I'm not sure it will
do much for the problem of downloading packages from HTTP, even in
parallel)

> I want to move on, but I want to bring that 80% of our userbase with us when 
> we do. My reading of your post is that you would rather Requests not adopt 
> the async/await paradigm than backport MemoryBIO: is my understanding correct?

Well you cannot use async/await on 2.7 in any case, and you cannot use
asyncio on 2.7 (Trollius, which was maintained by Victor, has been
abandoned AFAIK).  If you want to use coroutines in 2.7, you need to
use Tornado or Twisted.  Twisted may not, but Tornado works fine with
the stdlib ssl module.

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 11:28, Antoine Pitrou  wrote:
> 
> 
> Le 01/06/2017 à 12:23, Cory Benfield a écrit :
>> 
>> No it can’t.
>> 
>> OpenSSL builds chains differently, and disregards some metadata that Windows 
>> and macOS store, which means that cert validation will work differently than 
>> in the system store. This can lead to pip accepting a cert marked as 
>> “untrusted for SSL”, for example, which would be pretty bad.
> 
> Are you claiming that OpenSSL certificate validation is insecure and
> shouldn't be used at all?  I have never heard that claim before.

Of course I’m not.

I am claiming that using OpenSSL certificate validation with root stores that 
are not intended for OpenSSL can be. This is because trust of a certificate is 
non-binary. For example, consider WoSign. The Windows TLS implementation will 
distrust certificates that chain up to WoSign as a root certificate that were 
issued after October 21 2016. This is not something that can currently be 
represented as a PEM file. Therefore, the person exporting the certs needs to 
choose: should that be exported or not? If it is, then OpenSSL will happily 
trust it even in situations where the system trust store would not.

More generally, macOS allows the administrator to configure graduated trust: 
that is, to override whether or not a root should be trusted for certificate 
validation in some circumstances. Again, exporting this to a PEM does not 
persist this information.

Cory



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 11:39, David Wilson  wrote:
> 
> Hi Cory,
> 
> On Thu, Jun 01, 2017 at 11:22:21AM +0100, Cory Benfield wrote:
> 
>> We want to support as many TLS backends as possible.
> 
> Just a wild idea, but have you investigated a pure-Python fallback for
> 2.7 such as TLSlite? Of course the fallback need only be used during
> bootstrapping, and the solution would be compatible with every stable
> LTS Linux distribution release that was not shipping the latest and
> greatest 2.7.

I have, but discarded the idea. There are no pure-Python TLS implementations 
that are both feature-complete and actively maintained. Additionally, doing 
crypto operations in pure-Python is a bad idea, so any implementation that did 
crypto in Python code would be ruled out immediately (which rules out TLSLite), 
so I’d need what amounts to a custom library: pure-Python TLS with crypto from 
OpenSSL, which is not currently exposed by any Python module. Ultimately it’s 
just not a winner.

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 11:45:14 +0100
Cory Benfield  wrote:
> 
> I am claiming that using OpenSSL certificate validation with root stores that 
> are not intended for OpenSSL can be. This is because trust of a certificate 
> is non-binary. For example, consider WoSign. The Windows TLS implementation 
> will distrust certificates that chain up to WoSign as a root certificate that 
> were issued after October 21 2016. This is not something that can currently 
> be represented as a PEM file. Therefore, the person exporting the certs needs 
> to choose: should that be exported or not? If it is, then OpenSSL will 
> happily trust it even in situations where the system trust store would not.

I was not talking about exporting the whole system CA as a PEM file, I
was talking about adding an option for system administrators to
configure an extra CA certificate to be recognized by pip.

> More generally, macOS allows the administrator to configure graduated trust: 
> that is, to override whether or not a root should be trusted for certificate 
> validation in some circumstances. Again, exporting this to a PEM does not 
> persist this information.

How much of this is relevant to pip?

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 11:39, Antoine Pitrou  wrote:
> 
> On Thu, 1 Jun 2017 11:22:21 +0100
> Cory Benfield  wrote:
>> 
>> Who is the “we” that should move on? Python core dev? Or the Python 
>> ecosystem?
> 
> Sorry.  Python core dev certainly.  As for the rest of the ecosystem, it
> is moving on as well.

Moving, sure, but slowly. Again, I point to the 80% download number.

>> Requests is stuck in a place from which it cannot move.
>> We feel we cannot drop 2.7 support. We want to support as many TLS
>> backends as possible.
> 
> Well, certain features could be 3.x-only, couldn't they?

In principle, sure. In practice, that means most of our users don’t use those 
features and so we don’t get any feedback on whether they’re good solutions to 
the problem. This is not great. Ideally we want features to be available across 
as wide a deploy base as possible, otherwise we risk shipping features that 
don’t solve the actual problem very well. Good software comes, in part, from 
getting user feedback.

>> We want to enable the pip developers to focus on
>> their features, rather than worrying about HTTP and TLS. And we want
>> people to adopt the async/await keywords as much as possible.
> 
> I don't get what async/await keywords have to do with this.  We're
> talking about backporting the ssl memory BIO object…

All of this is related. I wrote a very, very long email initially and deleted 
it all because it was just too long to expect any normal human being to read 
it, but the TL;DR here is that we also want to support async/await, and doing 
so requires a memory BIO object.

>> I want to move on, but I want to bring that 80% of our userbase with us when 
>> we do. My reading of your post is that you would rather Requests not adopt 
>> the async/await paradigm than backport MemoryBIO: is my understanding 
>> correct?
> 
> Well you cannot use async/await on 2.7 in any case, and you cannot use
> asyncio on 2.7 (Trollius, which was maintained by Victor, has been
> abandoned AFAIK).  If you want to use coroutines in 2.7, you need to
> use Tornado or Twisted.  Twisted may not, but Tornado works fine with
> the stdlib ssl module.

I can use Twisted on 2.7, and Twisted has great integration with async/await 
and asyncio when they are available. Great and getting greater, in fact, thanks 
to the work of the Twisted and asyncio teams.

As to Tornado, the biggest concern there is that there is no support for 
composing the TLS over non-TCP sockets as far as I am aware. The wrapped socket 
approach is not suitable for some kinds of stream-based I/O that users really 
should be able to use with Requests (e.g. UNIX pipes). Not a complete 
non-starter, but also not something I’d like to forego.

Cory



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 11:51, Antoine Pitrou  wrote:
> 
> On Thu, 1 Jun 2017 11:45:14 +0100
> Cory Benfield  wrote:
>> 
>> I am claiming that using OpenSSL certificate validation with root stores 
>> that are not intended for OpenSSL can be. This is because trust of a 
>> certificate is non-binary. For example, consider WoSign. The Windows TLS 
>> implementation will distrust certificates that chain up to WoSign as a root 
>> certificate that were issued after October 21 2016. This is not something 
>> that can currently be represented as a PEM file. Therefore, the person 
>> exporting the certs needs to choose: should that be exported or not? If it 
>> is, then OpenSSL will happily trust it even in situations where the system 
>> trust store would not.
> 
> I was not talking about exporting the whole system CA as a PEM file, I
> was talking about adding an option for system administrators to
> configure an extra CA certificate to be recognized by pip.

Generally speaking system administrators aren’t wild about this option, as it 
means that they can only add to the trust store, not remove from it. So, while 
possible, it’s not a complete solution to this issue. I say this because the 
option *already* exists, at least in part, via the REQUESTS_CA_BUNDLE 
environment variable, and we nonetheless still get many complaints from system 
administrators.

>> More generally, macOS allows the administrator to configure graduated trust: 
>> that is, to override whether or not a root should be trusted for certificate 
>> validation in some circumstances. Again, exporting this to a PEM does not 
>> persist this information.
> 
> How much of this is relevant to pip?

Depends. If the design goal is “pip respects the system administrator”, then 
the answer is “all of it”. An administrator wants to be able to configure their 
system trust settings. Ideally they want to do this once, and once only, such 
that all applications on their system respect it.

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread David Wilson
On Thu, Jun 01, 2017 at 11:47:31AM +0100, Cory Benfield wrote:

> I have, but discarded the idea.

I'm glad to hear it was given sufficient thought. :)

I have one final 'crazy' idea, and actually it does not seem too bad at
all: can't you just fork a subprocess or spawn threads to handle the
blocking SSL APIs?

Sure it wouldn't be beautiful, but it is more appealing than forcing an
upgrade on all 2.7 users just so they can continue to use pip. (Which,
ironically, seems to resonate strongly with the motivation behind all of
this work -- allowing users to continue with their old environments
without forcing an upgrade to 3.x!)
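
For what it's worth, the shape of that fallback would be roughly this (a sketch, with a placeholder standing in for the real blocking TLS call):

```python
from concurrent.futures import ThreadPoolExecutor

def blocking_tls_fetch(host):
    # Placeholder: in the real fallback this would perform a blocking
    # wrap_socket()-style HTTPS request and return the response.
    return "response-from-" + host

# The async core hands the blocking SSL work off to a small thread pool
# and awaits the futures instead of blocking the event loop.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(blocking_tls_fetch, ["a.example", "b.example"]))

print(results)
```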


David


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 12:09, David Wilson  wrote:
> 
> On Thu, Jun 01, 2017 at 11:47:31AM +0100, Cory Benfield wrote:
> 
>> I have, but discarded the idea.
> 
> I'm glad to hear it was given sufficient thought. :)
> 
> I have one final 'crazy' idea, and actually it does not seem too bad at
> all: can't you just fork a subprocess or spawn threads to handle the
> blocking SSL APIs?
> 
> Sure it wouldn't be beautiful, but it is more appealing than forcing an
> upgrade on all 2.7 users just so they can continue to use pip. (Which,
> ironically, seems to resonate strongly with the motivation behind all of
> this work -- allowing users to continue with their old environments
> without forcing an upgrade to 3.x!)

So, this will work, but at a performance and code cleanliness cost. This 
essentially becomes a Python-2-only code-path, and a very large and complex one 
at that. This has the combined unfortunate effects of meaning a) a 
proportionally small fraction of our users get access to the code path we want 
to take forward into the future, and b) the majority of our users get an 
inferior experience of having a library either spawn threads or processes under 
their feet, which in Python has a tendency to get nasty fast (I for one have 
experienced the joy of having to ctrl+c multiple times to get a program using 
paramiko to actually die).

Again, it’s worth noting that this change will not just affect pip but also the 
millions of Python 2 applications using Requests. I am ok with giving those 
users access to only part of the functionality that the Python 3 users get, but 
I’m not ok with that smaller part also being objectively worse than what we do 
today.

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 12:01:41 +0100
Cory Benfield  wrote:
> In principle, sure. In practice, that means most of our users don’t use those 
> features and so we don’t get any feedback on whether they’re good solutions 
> to the problem.

On bugs.python.org we get plenty of feedback from people using Python
3's features, and we have been for years.

Your concern would have been very valid in the Python 3.2 timeframe,
but I don't think it is anymore.

> All of this is related. I wrote a very, very long email initially and deleted 
> it all because it was just too long to expect any normal human being to read 
> it, but the TL;DR here is that we also want to support async/await, and doing 
> so requires a memory BIO object.

async/await doesn't require a memory BIO object.  For example, Tornado
supports async/await (*) even though it doesn't use a memory BIO object
for its SSL layer.  And asyncio started with a non-memory BIO SSL
implementation while still using "yield from".

(*) Despite the fact that Tornado's own coroutines are yield-based
generators.

> As to Tornado, the biggest concern there is that there is no support for 
> composing the TLS over non-TCP sockets as far as I am aware. The wrapped 
> socket approach is not suitable for some kinds of stream-based I/O that users 
> really should be able to use with Requests (e.g. UNIX pipes).

Hmm, why would you use TLS on UNIX pipes except as an academic
experiment?  Tornado is far from a full-fledged networking package like
Twisted, but its HTTP(S) support should be very sufficient
(understandably, since it is the core use case for it).

Regards

Antoine.


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 12:20, Antoine Pitrou  wrote:
> 
> On Thu, 1 Jun 2017 12:01:41 +0100
> Cory Benfield  wrote:
>> In principle, sure. In practice, that means most of our users don’t use 
>> those features and so we don’t get any feedback on whether they’re good 
>> solutions to the problem.
> 
> On bugs.python.org we get plenty of feedback from people using Python
> 3's features, and we have been for years.
> 
> Your concern would have been very valid in the Python 3.2 timeframe,
> but I don't think it is anymore.

Ok? I guess?

I don’t know what to do with that answer, really. I gave you some data (80%+ of 
requests downloads over the last month were Python 2), and you responded with 
“it doesn’t cause us problems”. That’s good for you, I suppose, and well done, 
but it doesn’t seem immediately applicable to the concern I have.

>> All of this is related. I wrote a very, very long email initially and 
>> deleted it all because it was just too long to expect any normal human being 
>> to read it, but the TL;DR here is that we also want to support async/await, 
>> and doing so requires a memory BIO object.
> 
> async/await doesn't require a memory BIO object.  For example, Tornado
> supports async/await (*) even though it doesn't use a memory BIO object
> for its SSL layer.  And asyncio started with a non-memory BIO SSL
> implementation while still using "yield from".
> 
> (*) Despite the fact that Tornado's own coroutines are yield-based
> generators.

You are right, sorry. I should not have used the word “require”. Allow me to 
rephrase.

MemoryBIO objects are vastly, vastly more predictable and tractable than 
wrapped sockets when combined with non-blocking I/O. Using wrapped sockets and 
select/poll/epoll/kqueue, while possible, requires extremely subtle code that 
is easy to get wrong, and can nonetheless still have awkward bugs in it. I 
would be extremely loathe to use such an implementation, but you are correct, 
such an implementation can exist.
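
To illustrate why the MemoryBIO model is so much more tractable, here is a minimal sketch of driving the client side of a handshake entirely in memory with `SSLContext.wrap_bio` — no socket, no select loop. Certificate verification is disabled purely to keep the sketch self-contained; there is no real peer here:

```python
import ssl

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
ctx.check_hostname = False          # sketch only: there is no peer to verify
ctx.verify_mode = ssl.CERT_NONE

incoming = ssl.MemoryBIO()          # bytes received from the transport
outgoing = ssl.MemoryBIO()          # bytes to be written to the transport
tls = ctx.wrap_bio(incoming, outgoing)

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # Expected: the handshake cannot finish until the peer answers.
    pass

# The ClientHello is now sitting in the outgoing BIO; the caller can send
# it over *any* transport (socket, pipe, IOCP completion) at its leisure.
client_hello = outgoing.read()
assert client_hello[:1] == b"\x16"  # TLS handshake record type
```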

>> As to Tornado, the biggest concern there is that there is no support for 
>> composing the TLS over non-TCP sockets as far as I am aware. The wrapped 
>> socket approach is not suitable for some kinds of stream-based I/O that 
>> users really should be able to use with Requests (e.g. UNIX pipes).
> 
> Hmm, why would you use TLS on UNIX pipes except as an academic
> experiment?  Tornado is far from a full-fledged networking package like
> Twisted, but its HTTP(S) support should be very sufficient
> (understandably, since it is the core use case for it).

Let me be clear that there is no intention to use either Tornado or Twisted’s 
HTTP/1.1 parsers or engines. With all due respect to both projects, I have 
concerns about both their client implementations. Tornado’s default is 
definitely not suitable for use in Requests, and the curl backend is but, 
surprise surprise, requires a C extension and oh god we’re back here again. I 
have similar concerns about Twisted’s default HTTP/1.1 client. Tornado’s 
HTTP/1.1 server is certainly sufficient, but also not of much use to Requests. 
Requests very much intends to use our own HTTP logic, not least because we’re 
sick of relying on someone else’s.

Literally what we want is to have an event loop backing us that we can 
integrate with async/await and that requires us to reinvent as few wheels as 
possible while giving an overall better end-user experience. If I were to use 
Tornado, because I would want to integrate PEP 543 support into Tornado I’d 
ultimately have to rewrite Tornado’s TLS implementation *anyway* to replace it 
with a PEP 543 version. If I’m doing that, I’d much rather do it with MemoryBIO 
than wrapped sockets, for all of the reasons above.

As a final note, because I think we’re getting into the weeds here: this is not 
*necessary*. None of this is *necessary*. Requests exists, and works today. 
We’ll get Windows TLS support regardless of anything that’s done here, because 
I’ll just shim it into urllib3 like we did for macOS. What I am pushing for 
with PEP 543 is an improvement that would benefit the whole ecosystem: all I 
want to do is to make it possible for me to actually use it and ship it to 
users in the tools I maintain.

It is reasonable and coherent for python-dev to say “well, good luck, but no 
backports to help you out”. The result of that is that I put PEP 543 on the 
backburner (because it doesn’t solve Requests/urllib3’s problems, and 
ultimately my day job is about resolving those issues), and probably that we 
shutter the async discussion for Requests until we drop Python 2 support. 
That’s fine, Python is your project, not mine. But I don’t see that there’s any 
reason for us not to ask for this backport. After all, the worst you can do is 
say no, and my problems remain the same.

Cory

Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou

Le 01/06/2017 à 15:12, Cory Benfield a écrit :
> 
> I don’t know what to do with that answer, really. I gave you some data (80%+ 
> of requests downloads over the last month were Python 2), and you responded 
> with “it doesn’t cause us problems”.

And indeed it doesn't.  Unless the target user base for pip is widely
different than Python's, it shouldn't cause you any problems either.

> As a final note, because I think we’re getting into the weeds here: this is 
> not *necessary*. None of this is *necessary*. Requests exists, and works 
> today.

And pip could even bundle a frozen 2.7-compatible version of Requests if
it wanted/needed to...

> Let me be clear that there is no intention to use either Tornado or
> Twisted’s HTTP/1.1 parsers or engines. [...] Requests very much intends
> to use our own HTTP logic, not least because we’re sick of relying on
> someone else’s.

Then the PEP is really wrong or misleading in the way it states its own
motivations.

Regards

Antoine.


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 14:21, Antoine Pitrou  wrote:
> 
> 
> Le 01/06/2017 à 15:12, Cory Benfield a écrit :
>> 
>> I don’t know what to do with that answer, really. I gave you some data (80%+ 
>> of requests downloads over the last month were Python 2), and you responded 
>> with “it doesn’t cause us problems”.
> 
> And indeed it doesn't.  Unless the target user base for pip is widely
> different than Python's, it shouldn't cause you any problems either.

Maybe not now, but I think it’s fair to say that it did, right? As I recall, 
Python spent a long time with two fully supported Python versions, and then an 
even longer time with a version that was getting bugfixes. Tell me, which did 
you get more feedback on during that time?

Generally speaking it is fair to say that at this point *every line of code in 
Requests* is exercised or depended on by one of our users. If we write new code 
available to a small fraction of them, and it is in any way sizeable, then that 
stops being true. Again, we should look at the fact that most libraries that 
successfully support Python 2 and Python 3 do so through having codebases that 
share as much code as possible between the two implementations. Each line of 
code that is exercised in only one implementation becomes a vector for a long, 
lingering bug.

Anyway, all I know is that the last big project to do this kind of hard cut was 
Python, and while many of us are glad that Python 3 is real and glad that we 
pushed through the pain, I don’t think anyone would argue that the move was 
painless. A lesson can be learned there, especially for Requests which is not 
currently nursing a problem as fundamental to it as Python was.

>> As a final note, because I think we’re getting into the weeds here: this is 
>> not *necessary*. None of this is *necessary*. Requests exists, and works 
>> today.
> 
> And pip could even bundle a frozen 2.7-compatible version of Requests if
> it wanted/needed to…

Sure, if pip wants to internalise supporting and maintaining that version. One 
of the advantages of the pip/Requests relationship is that pip gets to stop 
worrying about HTTP: if there’s a HTTP problem, that’s on someone else to fix. 
Bundling that would remove that advantage.

> 
>> Let me be clear that there is no intention to use either Tornado or
>> Twisted’s HTTP/1.1 parsers or engines. [...] Requests very much intends
>> to use our own HTTP logic, not least because we’re sick of relying on
>> someone else’s.
> 
> Then the PEP is really wrong or misleading in the way it states its own
> motivations.

How so? TLS is not a part of the HTTP parser. It’s an intermediary layer 
between the transport (resolutely owned by the network layer in 
Twisted/Tornado) and the parsing layer (resolutely owned by Requests). Ideally 
we would not roll our own.

Cory



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 14:37:55 +0100
Cory Benfield  wrote:
> > 
> > And indeed it doesn't.  Unless the target user base for pip is widely
> > different than Python's, it shouldn't cause you any problems either.  
> 
> Maybe not now, but I think it’s fair to say that it did, right?

Until Python 3.2 and perhaps 3.3, yes. Since 3.4, definitely not.  For
example asyncio quickly grew a sizable community around it, even though
it had established Python 2-compatible competitors.

> > Then the PEP is really wrong or misleading in the way it states its own
> > motivations.  
> 
> How so?

In the sentence "There are plans afoot to look at moving Requests to a
more event-loop-y model, and doing so basically mandates a MemoryBIO",
and also in the general feeling it gives that the backport is motivated
by security reasons primarily.

I understand that some users would like more features in Python 2.7.
That has been the case since it was decided that feature development in
the 2.x line would end in favour of Python 3 development.  But our
maintenance policy has been and is to develop new features on Python 3
(which some people have described as a "carrot" for migrating, which is
certainly true).

Regards

Antoine.




Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 14:53, Antoine Pitrou  wrote:
> 
> On Thu, 1 Jun 2017 14:37:55 +0100
> Cory Benfield  wrote:
>>> 
>>> And indeed it doesn't.  Unless the target user base for pip is widely
>>> different than Python's, it shouldn't cause you any problems either.  
>> 
>> Maybe not now, but I think it’s fair to say that it did, right?
> 
> Until Python 3.2 and perhaps 3.3, yes. Since 3.4, definitely not.  For
> example asyncio quickly grew a sizable community around it, even though
> it had established Python 2-compatible competitors.

Sure, but “until 3.2” covers a long enough time to take us from now to 
“deprecation of Python 2”. Given that the Requests team is 4 people, unlike 
python-dev’s much larger number, I suspect we’d have at least as much pain 
proportionally as Python did. I’m not wild about signing up for that.

>>> Then the PEP is really wrong or misleading in the way it states its own
>>> motivations.  
>> 
>> How so?
> 
> In the sentence "There are plans afoot to look at moving Requests to a
> more event-loop-y model, and doing so basically mandates a MemoryBIO",
> and also in the general feeling it gives that the backport is motivated
> by security reasons primarily.

Ok, let’s address those together.

There are security reasons to do the backport, but they are “it helps us build 
a pathway to PEP 543”. Right now there are a lot of people interested in seeing 
PEP 543 happen, but vastly fewer in a position to do the work. I am, but only 
if I can actually use it for the things that are in my job. If I can’t, then 
PEP 543 becomes an “evenings and weekends” activity for me *at best*, and 
something I have to drop entirely at worst.

Adopting PEP 543 *would* be a security benefit, so while this PEP itself is not 
directly in and of itself a security benefit, it builds a pathway to something 
that is.

As to the plans to move Requests to a more event loop-y model, I think that it 
does stand in the way of this, but only insomuch as, again, we want our event 
loopy model to be as bug-free as possible. But I can concede that rewording on 
that point would be valuable.

*However*, it’s my understanding that even if I did that rewording, you’d still 
be against it. Is that correct? 

Cory



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread David Wilson
On Thu, Jun 01, 2017 at 12:18:48PM +0100, Cory Benfield wrote:

> So, this will work, but at a performance and code cleanliness cost.
> This essentially becomes a Python-2-only code-path, and a very large
> and complex one at that.

"Doctor, it hurts when I do this .."

Fine, then how about rather than exporting pip's problems on to the rest
of the world (which an API change to a security module in a stable
branch most certainly is, 18 months later when every other Python
library starts depending on it), just drop SSL entirely, it will almost
certainly cost less pain in the long run, and you can even arrange for
the same code to run in both major versions.

Drop SSL? But that's madness!

Serve the assets over plain HTTP and tack a signature somewhere
alongside it, either side-by-side in a file, embedded in a URL query
string, or whatever. Here[0] is 1000 lines of pure Python that can
validate a public key signature over a hash of the asset as it's
downloaded. Embed the 32 byte public key in the pip source and hey
presto.

[0] https://github.com/jfindlay/pure_pynacl/blob/master/pure_pynacl/tweetnacl.py
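
The plumbing for that scheme is genuinely small. A sketch of the download side, hashing the asset as it streams in and refusing to hand over the bytes until the signature over the digest checks out. The verifier here is a deliberately fake stand-in (a keyed hash, not a real signature scheme); in a real deployment the Ed25519 verify from something like [0] would slot into that callback:

```python
import hashlib

def download_and_verify(chunks, signature, verify):
    """Hash the asset as it streams in, then check the signature over
    the digest before handing the bytes to the caller."""
    h = hashlib.sha256()
    body = bytearray()
    for chunk in chunks:
        h.update(chunk)
        body.extend(chunk)
    if not verify(h.digest(), signature):
        raise ValueError("bad signature: discarding download")
    return bytes(body)

# Stand-in for a real public-key verify: we fake "signing" by hashing
# the digest with a shared secret, purely so the example is runnable.
def fake_sign(digest):
    return hashlib.sha256(b"demo-key" + digest).digest()

asset = [b"package-", b"contents"]
sig = fake_sign(hashlib.sha256(b"package-contents").digest())
data = download_and_verify(asset, sig, lambda d, s: s == fake_sign(d))
assert data == b"package-contents"
```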

Finding someone to audit the signature checking capabilities of [0] will
have vastly lower net cost than getting the world into a situation where
pip no longer runs on the >1e6 EC2 instances that will be running Ubuntu
14.04/16.04 LTS until the turn of the next decade.

Requests can't be installed without a working SSL implementation? Then
drop requests, it's not like it does much for pip anyway. Downloads
worldwide get a huge speedup due to lack of TLS handshake latency, a
million Squid caching reverse proxies worldwide jump into action caching
tarballs they previously couldn't see, pip's _vendor directory drops by
4.2MB, and Python package security depends on 1k lines of memory-safe
code rather than possibly *the* worst example of security-unconscious C
to come into existence since the birth of our industry. Sounds like a
win to me.

Maybe set a standard rather than blindly follow everyone else, at the
cost of.. everyone else.


David


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Ben Darnell
Trying to transfer github comments from
https://github.com/python/peps/pull/272#pullrequestreview-41388700:

I said:
> Tornado has been doing TLS in an event-loop model in python 2.5+ with
just wrap_socket, no MemoryBIO necessary. What am I missing? MemoryBIO
certainly gives some extra flexibility, but nothing I can see that's
strictly required for an HTTP client. (Maybe it comes up in some proxy
scenarios that Tornado hasn't implemented?)

There were three main responses:
- MemoryBIO is necessary to support TLS on windows with IOCP. Tornado's
approach requires the less-efficient select() interface. This is valid and
IMHO the biggest argument against using Tornado instead of Twisted in
requests. Even if requests is willing to accept the limitation of not being
able to use IOCP on Python 2, it may be tricky to arrange things so it can
support both Tornado's select-based event loop on Python 2 and the
IOCP-based interfaces in Python 3's asyncio (I'd volunteer to help with
this if the requests team is interested in pursuing it, though).

- wrap_socket is difficult to use correctly with an event loop; Twisted was
happy to move away from it to the MemoryBIO model. My response: MemoryBIO
is certainly a *better* solution for this problem, but it's not a
*requirement*. Twisted prefers to do as little buffering as possible, which
contributes to the difficulty of using wrap_socket. The buffering in
Tornado's SSLIOStream simplifies this. Glyph reports that there are still
some difficult-to-reproduce bugs; that may be but I haven't heard any other
reports of this. I believe that whatever bugs might remain in this area are
resolvable.

- MemoryBIO supports a wider variety of transports, including pipes.
There's a question about unix domain sockets - Tornado supports these
generally but I haven't tried them with TLS. I would expect it to work.


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Siddhesh Poyarekar
On Thursday 01 June 2017 01:53 PM, INADA Naoki wrote:
> * On Linux, madvise(..., MADV_DONTNEED) can be used.

madvise does not reduce the commit charge in the Linux kernel, so in
high consumption scenarios (and where memory overcommit is disabled or
throttled) you'll see programs dying with OOM despite the MADV_DONTNEED.
 The way we solved it in glibc was to use mprotect to drop PROT_READ and
PROT_WRITE in blocks that we don't need when we detect that the system
is not configured to overcommit (using /proc/sys/vm/overcommit_memory).
You'll need to fix the protection again though if you want to reuse the
block.
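
For what it's worth, the effect is easy to poke at from Python itself via the mmap module (the `madvise` method exists from Python 3.8, and `MADV_DONTNEED` only on platforms that define it, hence the guard):

```python
import mmap

PAGE = mmap.PAGESIZE

# Anonymous private mapping: dirty one page, then tell the kernel we no
# longer need it. On Linux the page is dropped (later reads return
# zeroes) but, per the above, the commit charge is not reduced.
mm = mmap.mmap(-1, 4 * PAGE)
mm[:PAGE] = b"x" * PAGE

dropped = False
if hasattr(mm, "madvise") and hasattr(mmap, "MADV_DONTNEED"):
    mm.madvise(mmap.MADV_DONTNEED, 0, PAGE)
    dropped = mm[0] == 0        # the dirtied page now reads back as zero
mm.close()
```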

Siddhesh


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Victor Stinner
It seems very complex and not portable at all to "free" a part of an
arena. We already support freeing a whole arena using munmap(). It was
a huge enhancement in our memory allocator. Change made in Python 2.5?
I don't recall, ask Evan Jones:
http://www.evanjones.ca/memoryallocator/ :-)

I'm not sure that it's worth it to increase the arena size and try to
implement the MADV_DONTNEED / MADV_FREE thing.

Victor

2017-06-01 11:21 GMT+02:00 INADA Naoki :
> Thanks for detailed info.
>
> But I don't think it's a big problem.
> Arenas are returned to the system by chance.  So other processes
> shouldn't rely on it.
>
> And I don't propose to stop returning arena to system.
> I just mean per-pool (part of an arena) MADV_DONTNEED can reduce RSS.
>
> If we use very large arena, or stop returning arena to system,
> it can be problem.
>
> Regards,
>
> On Thu, Jun 1, 2017 at 6:05 PM, Siddhesh Poyarekar  
> wrote:
>> On Thursday 01 June 2017 01:53 PM, INADA Naoki wrote:
>>> * On Linux, madvise(..., MADV_DONTNEED) can be used.
>>
>> madvise does not reduce the commit charge in the Linux kernel, so in
>> high consumption scenarios (and where memory overcommit is disabled or
>> throttled) you'll see programs dying with OOM despite the MADV_DONTNEED.
>>  The way we solved it in glibc was to use mprotect to drop PROT_READ and
>> PROT_WRITE in blocks that we don't need when we detect that the system
>> is not configured to overcommit (using /proc/sys/vm/overcommit_memory).
>> You'll need to fix the protection again though if you want to reuse the
>> block.
>>
>> Siddhesh


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 15:10, David Wilson  wrote:

> Finding someone to audit the signature checking capabilities of [0] will
> have vastly lower net cost than getting the world into a situation where
> pip no longer runs on the >1e6 EC2 instances that will be running Ubuntu
> 14.04/16.04 LTS until the turn of the next decade.

So for the record I’m assuming most of the previous email was a joke: certainly 
it’s not going to happen. ;)

But this is a real concern that does need to be addressed: Requests can’t 
meaningfully use this as its only TLS backend until it propagates to the wider 
2.7 ecosystem, at least far enough such that pip can drop Python 2.7 releases 
lower than 2.7.14 (or wherever MemoryBIO ends up, if backported). So a concern 
emerges: if you grant my other premises about the utility of the backport, is 
it worth backporting at all?

The answer to that is honestly not clear to me. I chatted with the pip 
developers, and they have 90%+ of their users currently on Python 2, but more 
than half of those are on 2.7.9 or later. This shows some interest in upgrading 
to newer Python 2s. The question, I think, is: do we end up in a position where 
a good number of developers are on 2.7.14 or later and only a very small 
fraction on 2.7.13 or earlier before the absolute number of Python 2 devs drops 
low enough to just drop Python 2?

I don’t have an answer to that question. I have a gut instinct that says yes, 
probably, but a lack of certainty. My suspicion is that most of the core dev 
community believe the answer to that is “no”. But I’d say that from my 
perspective this is the crux of the problem. We can hedge against this by just 
choosing to backport and accepting that it may never become useful, but a 
reasonable person can disagree and say that it’s just not worth the effort.

Frankly, I think that amidst all the other arguments this is the one that most 
concretely needs answering, because if we don’t think Requests can ever 
meaningfully rely on the presence of MemoryBIO on 2.7 (where “rely on” can be 
approximated to 90%+ of 2.7 users having access to it AND 2.7 still having 
non-trivial usage numbers) then ultimately this PEP doesn’t grant me much 
benefit.

There are others who believe there are a few other benefits we could get from 
it (helping out Twisted etc.), but I don’t know that I’m well placed to make 
those arguments. (I also suspect I’d get accused of moving the goalposts.)

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Antoine Pitrou
On Thu, 1 Jun 2017 15:09:41 +0100
Cory Benfield  wrote:
> 
> As to the plans to move Requests to a more event loop-y model, I think that 
> it does stand in the way of this, but only insomuch as, again, we want our 
> event loopy model to be as bug-free as possible. But I can concede that 
> rewording on that point would be valuable.
> 
> *However*, it’s my understanding that even if I did that rewording,
> you’d still be against it. Is that correct? 

Yes.  It's just that it would more fairly inform the people reading it.

Regards

Antoine.


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Chris Angelico
On Fri, Jun 2, 2017 at 1:01 AM, Cory Benfield  wrote:
> The answer to that is honestly not clear to me. I chatted with the pip 
> developers, and they have 90%+ of their users currently on Python 2, but more 
> than half of those are on 2.7.9 or later. This shows some interest in 
> upgrading to newer Python 2s. The question, I think, is: do we end up in a 
> position where a good number of developers are on 2.7.14 or later and only a 
> very small fraction on 2.7.13 or earlier before the absolute number of Python 
> 2 devs drops low enough to just drop Python 2?
>
> I don’t have an answer to that question. I have a gut instinct that says yes, 
> probably, but a lack of certainty. My suspicion is that most of the core dev 
> community believe the answer to that is “no”.
>

Let's see.

Python 2 users include people on Windows who install it themselves,
and then have no mechanism for automatic updates. They'll probably
stay on whatever 2.7.x they first got, until something forces them to
update. But it also includes people on stable Linux distros, where
they have automatic updates provided by Red Hat or Debian or whomever,
so a change like this WILL propagate - particularly (a) as the window
is three entire years, and (b) if the change is considered important
by the distro managers, which is a smaller group of people to convince
than the users themselves.

By 2020, Windows 7 will be out of support. By various estimates, Win 7
represents roughly half of all current Windows users. That means that,
by 2020, at least half of today's Windows users will either have
upgraded to a new OS (likely with a wipe-and-fresh-install, so they'll
get a newer Python), or be on an unsupported OS, on par with people
still running XP today. The same is true for probably close to 100% of
Linux users, since any supported Linux distro will be shipping updates
between now and 2020, and I don't know much about Mac OS updates, but
I rather suspect that they'll also be updating. (Can anyone confirm?)

So I'd be in the "yes" category. Across the next few years, I strongly
suspect that 2.7.14 will propagate reasonably well. And I also
strongly suspect that, even once 2020 hits and Python 2 stops getting
updates, it will still be important to a lot of people. These numbers
aren't backed by much, but it's slightly better than mere gut
instinct.

Do you have figures for how many people use pip on Windows vs Linux vs Mac OS?

ChrisA


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Cory Benfield

> On 1 Jun 2017, at 17:14, Chris Angelico  wrote:
> 
> 
> Do you have figures for how many people use pip on Windows vs Linux vs Mac OS?


I have figures for the download numbers, which are an awkward proxy because 
most people don’t CI on Windows and macOS, but they’re the best we have. Linux 
has approximately 20x the download numbers of either Windows or macOS, and both 
Windows and macOS are pretty close together. These numbers are a bit confounded 
due to the fact that 1/4 of Linux’s downloads are made up of systems that don’t 
report their platform, so the actual ratio could be anywhere from about 25:1 to 
3:1 in favour of Linux for either Windows or macOS. All of this is based on the 
downloads made in the last month.

Again, an enormous number of these downloads are going to be CI downloads which 
overwhelmingly favour Linux systems.

For some extra perspective, the next highest platform by download count is 
FreeBSD, with 0.04% of the downloads of Linux.

HTH,

Cory


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Chris Angelico
On Fri, Jun 2, 2017 at 2:35 AM, Cory Benfield  wrote:
> I have figures for the download numbers, which are an awkward proxy because 
> most people don’t CI on Windows and macOS, but they’re the best we have. 
> Linux has approximately 20x the download numbers of either Windows or macOS, 
> and both Windows and macOS are pretty close together. These numbers are a bit 
> confounded due to the fact that 1/4 of Linux’s downloads are made up of 
> systems that don’t report their platform, so the actual ratio could be 
> anywhere from about 25:1 to 3:1 in favour of Linux for either Windows or 
> macOS. All of this is based on the downloads made in the last month.
>
> Again, an enormous number of these downloads are going to be CI downloads 
> which overwhelmingly favour Linux systems.

Hmm. So it's really hard to know. Pity. I suppose it's too much to ask
for IP-based stat exclusion for the most commonly-used CI systems
(Travis, Circle, etc)? Still, it does look like most pip usage happens
on Linux. Also, it seems likely that the people who use Python and pip
heavily are going to be the ones who most care about keeping
up-to-date with point releases, so I still stand by my belief that
yes, 2.7.14+ could take the bulk of 2.7's marketshare before 2.7
itself stops being significant.

Thanks for the figures.

ChrisA


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Victor Stinner
2017-06-01 18:51 GMT+02:00 Chris Angelico :
> Hmm. So it's really hard to know. Pity. I suppose it's too much to ask
> for IP-based stat exclusion for the most commonly-used CI systems
> (Travis, Circle, etc)? Still, it does look like most pip usage happens
> on Linux. Also, it seems likely that the people who use Python and pip
> heavily are going to be the ones who most care about keeping
> up-to-date with point releases, so I still stand by my belief that
> yes, 2.7.14+ could take the bulk of 2.7's marketshare before 2.7
> itself stops being significant.

It seems like PyPI statistics are public:
https://langui.sh/2016/12/09/data-driven-decisions/

Another article on PyPI stats:
https://hynek.me/articles/python3-2016/

  2.7: 419 millions (89%)
  3.3+3.4+3.5+3.6: 51 millions (11%)
  (I ignored 2.6)

Victor


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Barry Warsaw
On Jun 02, 2017, at 02:14 AM, Chris Angelico wrote:

>But it also includes people on stable Linux distros, where they have
>automatic updates provided by Red Hat or Debian or whomever, so a change like
>this WILL propagate - particularly (a) as the window is three entire years,
>and (b) if the change is considered important by the distro managers, which
>is a smaller group of people to convince than the users themselves.
[...]
>So I'd be in the "yes" category. Across the next few years, I strongly
>suspect that 2.7.14 will propagate reasonably well.

I'm not so sure about that, given long term support releases.  For Ubuntu, LTS
releases live for 5 years:

https://www.ubuntu.com/info/release-end-of-life

By 2020, only Ubuntu 16.04 and 18.04 will still be maintained, so while 18.04
will likely contain whatever the latest 2.7 is available at that time, 16.04
won't track upstream point releases, but instead will get select cherry
picks.  For good reason, there's a lot of overhead to backporting fixes into
stable releases, and something as big as being suggested here would, in my
best guess, have a very low chance of showing up in stable releases.

-Barry


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Nathaniel Smith
On Jun 1, 2017 9:20 AM, "Chris Angelico"  wrote:

On Fri, Jun 2, 2017 at 1:01 AM, Cory Benfield  wrote:
> The answer to that is honestly not clear to me. I chatted with the pip
developers, and they have 90%+ of their users currently on Python 2, but
more than half of those are on 2.7.9 or later. This shows some interest in
upgrading to newer Python 2s. The question, I think, is: do we end up in a
position where a good number of developers are on 2.7.14 or later and only
a very small fraction on 2.7.13 or earlier before the absolute number of
Python 2 devs drops low enough to just drop Python 2?
>
> I don’t have an answer to that question. I have a gut instinct that says
yes, probably, but a lack of certainty. My suspicion is that most of the
core dev community believe the answer to that is “no”.
>

Let's see.

Python 2 users include people on Windows who install it themselves,
and then have no mechanism for automatic updates. They'll probably
stay on whatever 2.7.x they first got, until something forces them to
update. But it also includes people on stable Linux distros, where
they have automatic updates provided by Red Hat or Debian or whomever,
so a change like this WILL propagate - particularly (a) as the window
is three entire years, and (b) if the change is considered important
by the distro managers, which is a smaller group of people to convince
than the users themselves.


I believe that for answering this question about the ssl module, it's
really only Linux users that matter, since pip/requests/everyone else
pushing for this only want to use ssl.MemoryBIO on Linux. Their plan on
Windows/MacOS (IIUC) is to stop using the ssl module entirely in favor of
new ctypes bindings for their respective native TLS libraries.

(And yes, in principle it might be possible to write new ctypes-based
bindings for openssl, but (a) this whole project is already teetering on
the verge of being impossible given the resources available, so adding any
major extra deliverable is likely to sink the whole thing, and (b) compared
to the proprietary libraries, openssl is *much* harder and riskier to wrap
at the ctypes level because it has different/incompatible ABIs depending on
its micro version and the vendor who distributed it. This is why manylinux
packages that need openssl have to ship their own, but pip can't and
shouldn't ship its own openssl for many hopefully obvious reasons.)

-n


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Victor Stinner
2017-06-01 19:09 GMT+02:00 Barry Warsaw :
> By 2020, only Ubuntu 16.04 and 18.04 will still be maintained, so while 18.04
> will likely contain whatever the latest 2.7 is available at that time, 16.04
> won't track upstream point releases, but instead will get select cherry
> picks.  For good reason, there's a lot of overhead to backporting fixes into
> stable releases, and something as big as being suggested here would, in my
> best guess, have a very low chance of showing up in stable releases.

I can help Canonical to backport MemoryBIO *if they want* to
cherry-pick this feature ;-)

Victor


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Victor Stinner
2017-06-01 19:10 GMT+02:00 Nathaniel Smith :
> (...) since pip/requests/everyone else pushing for
> this only want to use ssl.MemoryBIO on Linux. Their plan on Windows/MacOS
> (IIUC) is to stop using the ssl module entirely in favor of new ctypes
> bindings for their respective native TLS libraries.

The long term plan includes one Windows implementation, one macOS
implementation, and one implementation using the stdlib ssl module. But
it seems like right now Cory is working alone and has limited time to
implement his PEP 543 (new TLS API). The short term plan is to
implement the strict minimum: the implementation relying on the
existing stdlib ssl module.

Backporting MemoryBIO makes it possible to get the new TLS API "for
free" on Python 2.7. IMHO Python 2.7 support is a requirement to make
the PEP popular enough to make it successful.

The backport is supposed to fix a chicken-and-egg issue :-)

> (And yes, in principle it might be possible to write new ctypes-based
> bindings for openssl, but (...))

A C extension can also be considered, but I trust code in the CPython
stdlib more, since it would be well tested by our big farm of buildbots
and have more eyes looking at the code.

--

It seems like the PEP 546 (backport MemoryBIO) should make it more
explicit that MemoryBIO support will be "optional": it's ok if Jython
or PyPy doesn't implement it. It's ok if old Python 2.7 versions don't
implement it. I expect anyway to use a fallback for those. It's just
that I would prefer to avoid a fallback (likely a C extension)
whenever possible, since it would cause various issues, especially for
C code using OpenSSL: OpenSSL API changed many times :-/

Victor


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings



On 06/01/2017 02:03 AM, Victor Stinner wrote:

2017-06-01 10:41 GMT+02:00 Larry Hastings :

On 06/01/2017 01:19 AM, Antoine Pitrou wrote:

If you'd like to go that way anyway, I would suggest 1MB as a starting
point in 3.7.


I understand the desire for caution.  But I was hoping maybe we could
experiment with 4mb in trunk for a while?  We could change it to 1mb--or
even 256k--before beta 1 if we get anxious.

While I fail to explain why in depth, I would prefer to *not* touch
the default arena size at this point.

We need more data, for example measure the memory usage on different
workloads using different arena sizes.


I can't argue with collecting data at this point in the process.  My 
thesis is simply "the correct value for this tunable parameter in 2001 
is probably not the same value in 2017".  I don't mind proceeding 
*slowly* or gathering more data or what have you for now.  But I would 
like to see it change somehow between now and 3.7.0b1, because my sense 
is that we can get some performance for basically free by updating the 
value.


If ARENA_SIZE tracked Moore's Law, meaning that we doubled it every 18 
months like clockwork, it'd currently be 2**10 times bigger: 256MB, and 
we'd be changing it to 512MB at the end of August.


(And yes, as a high school student I was once bitten by a radioactive 
optimizer, so these days when I'm near possible optimizations my 
spider-sense--uh, I mean, my optimization-sense--starts tingling.)



A simple enhancement would be to add an environment variable to change
the arena size at Python startup. Example: PYTHONARENASIZE=1M.


Implementing this would slow down address_in_range, which currently 
compiles the arena size in as a constant.  It'd be by a tiny amount, but 
this inline function gets called very, very frequently.  It's possible 
this wouldn't hurt performance, but my guess is it'd offset the gains we 
got from larger arenas, and the net result would be no faster or 
slightly slower.



//arry/


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Barry Warsaw
On Jun 01, 2017, at 07:22 PM, Victor Stinner wrote:

>I can help Canonical to backport MemoryBIO *if they want* to
>cherry-pick this feature ;-)

(Pedantically speaking, this falls under the Ubuntu project's responsibility,
not directly Canonical.)

Writing the patch is only part of the process:

https://wiki.ubuntu.com/StableReleaseUpdates

There's also Debian to consider.

Cheers,
-Barry




Re: [Python-Dev] "Global freepool"

2017-06-01 Thread Larry Hastings



On 06/01/2017 02:20 AM, Victor Stinner wrote:
I would like to understand how private free lists are "so much" 
faster. In fact, I don't recall if someone *measured* the performance 
speedup of these free lists :-)


I have, recently, kind of by accident.  When working on the Gilectomy I 
turned off some freelists as they were adding "needless" complexity and 
presumably weren't helping performance that much. Then I turned them 
back on because it turned out they really did help.


My intuition is that they help in two major ways:

 * Since they're a known size, you don't need to go through the
   general-case code of looking up the right spot in usedpools (etc) to
   get one / put one back in malloc/free.
 * The code that recycles these objects assumes that objects from its
   freelist are already mostly initialized, so it doesn't need to
   initialize them.

The really crazy one is PyFrameObjects.  The global freelist for these 
stores up to 200 (I think) in a stack, implemented as a simple linked 
list.  When CPython wants a new frame object, it takes the top one off 
the stack and uses it.  Where it gets crazy is: PyFrameObjects are 
dynamically sized, based on the number of arguments + local variables + 
stack + freevars + cellvars.  So the frame you pull off the free list 
might not be big enough.  If it isn't big enough, the code calls 
*realloc* on it then uses it.  This seems like such a weird approach to 
me.  But it's obviously a successful approach, and I've learned not to 
argue with success.
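(A rough Python sketch of that C logic -- the Frame class and names here are illustrative, not CPython's; the real code lives in C and recycles variable-sized PyFrameObjects:)

```python
class Frame:
    """Stand-in for the variable-sized PyFrameObject."""
    def __init__(self, nslots):
        self.slots = [None] * nslots
        self.next = None           # freelist link

_free_list = None                  # top of the freelist stack
_free_count = 0
MAX_FREE = 200

def frame_alloc(nslots):
    global _free_list, _free_count
    if _free_list is not None:
        f, _free_list = _free_list, _free_list.next
        _free_count -= 1
        # The recycled frame may be too small: "realloc" it bigger,
        # mirroring the realloc call in the C implementation.
        if len(f.slots) < nslots:
            f.slots.extend([None] * (nslots - len(f.slots)))
        return f
    return Frame(nslots)           # freelist empty: allocate fresh

def frame_free(f):
    global _free_list, _free_count
    if _free_count < MAX_FREE:     # otherwise it would really be freed
        f.next, _free_list = _free_list, f
        _free_count += 1
```

A freed small frame really does come back for a bigger request: allocate a 4-slot frame, free it, then ask for 8 slots and you get the same object, grown in place.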


p.s. Speaking of freelists, at one point Serhiy had a patch adding a 
freelist for single- and I think two-digit ints.  Right now the only int 
creation optimization we have is the array of constant "small ints"; if 
the int you're constructing isn't one of those, we use the normal slow 
allocation path with PyObject_Alloc etc.  IIRC this patch made things 
faster.  Serhiy, what happened to that patch?  Was it actually a bad 
idea, or did it just get forgotten?



//arry/


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Paul Moore
On 1 June 2017 at 17:14, Chris Angelico  wrote:
> Python 2 users include people on Windows who install it themselves,
> and then have no mechanism for automatic updates. They'll probably
> stay on whatever 2.7.x they first got, until something forces them to
> update. But it also includes people on stable Linux distros, where
> they have automatic updates provided by Red Hat or Debian or whomever,
> so a change like this WILL propagate - particularly (a) as the window
> is three entire years, and (b) if the change is considered important
> by the distro managers, which is a smaller group of people to convince
> than the users themselves.

However, it is trivial for Windows users to upgrade if asked to, as
there's no issue around system packages depending on a particular
version (or indeed, much of anything depending - 3rd party
applications on Windows bundle their own Python, they don't use the
globally installed one). So in principle, there should be no problem
expecting Windows users to be on the latest version of 2.7.x. In fact,
I suspect that the proportion of Windows users on Python 3 is
noticeably higher than the proportion of Linux/Mac OS users on Python
3 (for the same reason). So this problem may overall be less pressing
for Windows users. I have no evidence that isn't anecdotal to back
this last assertion up, though.

Linux users often use the OS-supplied Python, and so getting the
distributions to upgrade, and to backport upgrades to old versions of
their OS (and push those backports as required updates), is the route
to get the bulk of the users there. Experience on pip seems to
indicate this is unlikely to happen, in practice. Mac OS users who use
the system Python are, as I understand it, stuck with a pretty broken
version (I don't know if newer versions of the OS change that). But
distributions like Macports are more common and more up to date.

Apart from the Windows details, these are purely my impressions.

> Do you have figures for how many people use pip on Windows vs Linux vs Mac OS?

No. But we do get plenty of bug reports from Windows users, so I don't
think there's any reason to assume it's particularly low (given the
relative numbers of *python* users - in fact, it may be
proportionately higher as Windows users don't have alternative options
like yum).

Paul


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread David Wilson
On Thu, Jun 01, 2017 at 04:01:54PM +0100, Cory Benfield wrote:

> > lower net cost than getting the world into a situation where pip no
> > longer runs on the >1e6 EC2 instances that will be running Ubuntu
> > 14.04/16.04 LTS until the turn of the next decade.

> So for the record I’m assuming most of the previous email was a joke:
> certainly it’s not going to happen. ;)

> But this is a real concern that does need to be addressed

Unfortunately it wasn't, but at least I'm glad to have accidentally made
a valid point amidst the cloud of caffeine-fuelled irritability :/

Apologies for the previous post, it was hardly constructive.


David


Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Donald Stufft

> On Jun 1, 2017, at 1:09 PM, Barry Warsaw  wrote:
> 
> On Jun 02, 2017, at 02:14 AM, Chris Angelico wrote:
> 
>> But it also includes people on stable Linux distros, where they have
>> automatic updates provided by Red Hat or Debian or whomever, so a change like
>> this WILL propagate - particularly (a) as the window is three entire years,
>> and (b) if the change is considered important by the distro managers, which
>> is a smaller group of people to convince than the users themselves.
> [...]
>> So I'd be in the "yes" category. Across the next few years, I strongly
>> suspect that 2.7.14 will propagate reasonably well.
> 
> I'm not so sure about that, given long term support releases.  For Ubuntu, LTS
> releases live for 5 years:
> 
> https://www.ubuntu.com/info/release-end-of-life
> 
> By 2020, only Ubuntu 16.04 and 18.04 will still be maintained, so while 18.04
> will likely contain whatever the latest 2.7 is available at that time, 16.04
> won't track upstream point releases, but instead will get select cherry
> picks.  For good reason, there's a lot of overhead to backporting fixes into
> stable releases, and something as big as being suggested here would, in my
> best guess, have a very low chance of showing up in stable releases.
> 


Using 2.7.9 as a sort of benchmark here: currently 26% of downloads from PyPI 
are using a version of Python older than 2.7.9; two months ago that number was 
31%. (That’s the total across all Python versions.) Python >= 2.7.9, <3 is at 43% 
(previously 53%).

So in ~2.5 years 2.7.9+ has become > 50% of all downloads from PyPI while older 
versions of Python 2.7 are down to only ~25% of the total number of downloads 
made by pip. I was also curious about how this had changed over the past year 
instead of just the past two months, a year ago >=2.7,<2.7.9 accounted for 
almost 50% of all downloads from PyPI (compared to the 25% today). It *looks* 
like on average we’re dropping somewhere between 1.5% and 2% each month, so as a 
conservative estimate, if these numbers hold, we’d be looking at single-digit 
numbers for >=2.7,<2.7.9 in roughly 11 months, or 3.5 years after the release 
of 2.7.9.
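(A quick back-of-envelope check of that projection, assuming the decline stays linear at the observed rates:)

```python
share_now = 25.0             # % of pip downloads on >=2.7,<2.7.9 today
monthly_drop = (2.0, 1.5)    # observed range, percentage points/month

months_to_single_digits = [(share_now - 10.0) / d for d in monthly_drop]
assert months_to_single_digits == [7.5, 10.0]
# At the conservative 1.5%/month rate: ~10-11 months to single digits.
```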

If we assume that the hypothetical 2.7.14 w/ MemoryBio support would follow a 
similar adoption curve, we would expect to be able to mandate it for pip/etc, 
in a worst case scenario, 3-4 years after release.

In addition to that, pip 9 comes with a new feature that makes it easier to 
sunset support for versions of Python without breaking the world [1]. The 
likely scenario is that while pip 9+ is increasing in share Python <2.7.14 will 
be decreasing, and that would mean that we could *likely* start mandating it 
earlier, maybe at the 2 year mark or so.


[1] An astute reader might ask, why could you not use this same mechanism to 
simply move on to only supporting Python 3? It’s true we could do that, however 
as a rule we generally try to keep support for Pythons until the usage drops 
below some threshold, where that threshold varies based on how hard it is to 
continue supporting that version of Python and what the “win” is in terms of 
dropping it. Since we’re still at 90% of downloads from PyPI being done using 
Python 2, that suggests the threshold for Python 3.x is very far away and will 
extend beyond 2020 (I mean, we’re just *now* finally able to drop support for 
Python 2.6).

In case it’s not obvious, I am very much in support of this PEP.

—
Donald Stufft





Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Donald Stufft

> On Jun 1, 2017, at 3:51 PM, Paul Moore  wrote:
> 
> Linux users often use the OS-supplied Python, and so getting the
> distributions to upgrade, and to backport upgrades to old versions of
> their OS and (push those backports as required updates) is the route
> to get the bulk of the users there. Experience on pip seems to
> indicate this is unlikely to happen, in practice. Mac OS users who use
> the system Python are, as I understand it, stuck with a pretty broken
> version (I don't know if newer versions of the OS change that). But
> distributions like Macports are more common and more up to date.
> 

Note that on macOS, within the next year macOS users using the system Python 
are going to be unable to talk to PyPI anyways (unless Apple does something 
here, which I think they will), but in either case, Apple was pretty good about 
upgrading to 2.7.9 (I think they had the first OS released that supported 
2.7.9?).

—
Donald Stufft





Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Donald Stufft

> On Jun 1, 2017, at 3:57 PM, Donald Stufft  wrote:
> 
> Note that on macOS, within the next year macOS users using the system Python 
> are going to be unable to talk to PyPI anyways (unless Apple does something 
> here, which I think they will), but in either case, Apple was pretty good 
> about upgrading to 2.7.9 (I think they had the first OS released that 
> supported 2.7.9?).


Forgot to mention that pip 10.0 will work around this, thus forcing macOS users 
to upgrade or be cut off.

—
Donald Stufft





Re: [Python-Dev] "Global freepool"

2017-06-01 Thread Serhiy Storchaka

On 01.06.17 21:44, Larry Hastings wrote:
p.s. Speaking of freelists, at one point Serhiy had a patch adding a 
freelist for single- and I think two-digit ints.  Right now the only int 
creation optimization we have is the array of constant "small ints"; if 
the int you're constructing isn't one of those, we use the normal slow 
allocation path with PyObject_Alloc etc.  IIRC this patch made things 
faster.  Serhiy, what happened to that patch?  Was it actually a bad 
idea, or did it just get forgotten?


The issue [1] is still open. The patches were neither applied nor rejected. 
They show a speedup in microbenchmarks, but it is not large: up to 40% 
for iterating over enumerate() and 5-7% for hard integer computations 
like base85 encoding or the spectral_norm benchmark.


[1] https://bugs.python.org/issue25324



Re: [Python-Dev] RFC: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7

2017-06-01 Thread Steve Dower

On 01Jun2017 1010, Nathaniel Smith wrote:
I believe that for answering this question about the ssl module, it's 
really only Linux users that matter, since pip/requests/everyone else 
pushing for this only want to use ssl.MemoryBIO on Linux. Their plan on 
Windows/MacOS (IIUC) is to stop using the ssl module entirely in favor 
of new ctypes bindings for their respective native TLS libraries.


(And yes, in principle it might be possible to write new ctypes-based 
bindings for openssl, but (a) this whole project is already teetering on 
the verge of being impossible given the resources available, so adding 
any major extra deliverable is likely to sink the whole thing, and (b) 
compared to the proprietary libraries, openssl is *much* harder and 
riskier to wrap at the ctypes level because it has 
different/incompatible ABIs depending on its micro version and the 
vendor who distributed it. This is why manylinux packages that need 
openssl have to ship their own, but pip can't and shouldn't ship its own 
openssl for many hopefully obvious reasons.)


How much of a stop-gap would it be (for Windows at least) to override 
OpenSSL's certificate validation with a call into the OS? This leaves 
most of the work with OpenSSL, but lets the OS say yes/no to the 
certificates based on its own configuration.


For Windows, this is under 100 lines of C code in (probably) _ssl, and 
while I think an SChannel based approach is the better way to go 
long-term,[1] offering platform-specific certificate validation as the 
default in 2.7 is far more palatable than backporting new public API.


I can't speak to whether there is an equivalent function for Mac 
(validate a certificate chain given the cert blob).


Cheers,
Steve

[1]: though I've now spent hours looking at it and still have no idea 
how it's supposed to actually work...



Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Siddhesh Poyarekar
On Thursday 01 June 2017 01:27 PM, Victor Stinner wrote:
> The GNU libc malloc uses a variable threshold to choose between sbrk()
> (heap memory) or mmap(). It starts at 128 kB or 256 kB, and then is
> adapted depending on the workload (I don't know how exactly).

The threshold starts at 128K and increases whenever an mmap'd block is
freed.  For example, if the program allocates 2M (which is returned
using mmap) and then frees that block, glibc malloc assumes that 2M
blocks will be needed again and optimizes that allocation by increasing
the threshold to 2M.

This works well in practice for common programs but it has been known to
cause issues in some cases, which is why there's MALLOC_MMAP_THRESHOLD_
to fix the threshold.

> I already read that CPU support "large pages" between 2 MB and 1 GB,
> instead of just 4 kB. Using large pages can have a significant impact
> on performance. I don't know if we can do something to help the Linux
> kernel to use large pages for our memory? I don't know neither how we
> could do that :-) Maybe using mmap() closer to large pages will help
> Linux to join them to build a big page? (Linux has something magic to
> make applications use big pages transparently.)

There's MAP_HUGETLB and friends for mmap flags, but it's generally
better to just let the kernel do this for you transparently (using
Transparent Huge Pages) by making sure that your arena allocations are
either contiguous or big enough.
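(For the "big enough" part, the alignment arithmetic is simple: round arena requests up to the transparent-huge-page size, 2 MiB on x86-64, so a whole arena can be backed by one huge page. A hedged illustration only, not a concrete proposal from this thread:)

```python
HUGE_PAGE = 2 * 1024 * 1024   # typical x86-64 transparent-huge-page size

def round_up_to_huge_page(nbytes):
    # Align a request so the kernel can back it with whole huge pages.
    return (nbytes + HUGE_PAGE - 1) & ~(HUGE_PAGE - 1)

assert round_up_to_huge_page(256 * 1024) == HUGE_PAGE
assert round_up_to_huge_page(4 * 1024 * 1024) == 4 * 1024 * 1024
```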

Siddhesh


Re: [Python-Dev] "Global freepool"

2017-06-01 Thread Victor Stinner
2017-06-01 22:16 GMT+02:00 Serhiy Storchaka :
> The issue [1] still is open. Patches neither applied nor rejected. They
> exposes the speed up in microbenchmarks, but it is not large. Up to 40% for
> iterating over enumerate() and 5-7% for hard integer computations like
> base85 encoding or spectral_norm benchmark.
>
> [1] https://bugs.python.org/issue25324

Hum, I think that the right issue is:
http://bugs.python.org/issue24165

Victor


Re: [Python-Dev] PEP 7 and braces { .... } on if

2017-06-01 Thread Brett Cannon
If you create an issue at github.com/python/peps and assign it to me I will
get to it someday. :)

On Thu, 1 Jun 2017 at 00:19 Victor Stinner  wrote:

> 2017-05-31 19:27 GMT+02:00 Guido van Rossum :
> > I interpret the PEP (...)
>
> Right, the phrasing requires to "interpret" it :-)
>
> > (...) as saying that you should use braces everywhere but not
> > to add them in code that you're not modifying otherwise. (I.e. don't go
> on a
> > brace-adding rampage.) If author and reviewer of a PR disagree I would go
> > with "add braces" since that's clearly the PEP's preference. This is C
> code.
> > We should play it safe.
>
> Would someone be nice enough to try to rephrase the PEP 7 to explain
> that? Just to avoid further boring discussion on the C coding style...
>
> Victor


Re: [Python-Dev] PEP 7 and braces { .... } on if

2017-06-01 Thread Barry Warsaw
https://github.com/python/peps/pull/280/files

On Jun 01, 2017, at 09:08 PM, Brett Cannon wrote:

>If you create an issue at github.com/python/peps and assign it to me I will
>get to it someday. :)




[Python-Dev] 2017 Python Language Summit coverage -- Round 2

2017-06-01 Thread Jake Edge

Hola python-dev,

Thanks for all the positive feedback on the coverage (and the
corrections/clarifications in the comments too)!

There is, it seems, always more to do, but I do have three additional
articles from the summit up now and should complete the coverage over
the next week.

The starting point is the overview article, here:
https://lwn.net/Articles/723251/ which should now be free for anyone to
see (and the first four articles too).  LWN subscribers can see the
content right away, but one week after they are published in the weekly
edition, they become freely available for everyone.  Until then,
though, feel free to share the SubscriberLinks I am posting here.  I
have been asked about our policy on appropriate places to share
SubscriberLinks; blogs, tweets, social media, mailing lists, etc. are
all perfectly fine with us.

The new articles are:

Keeping Python competitive: https://lwn.net/Articles/723949/ or
https://lwn.net/SubscriberLink/723949/56a392defaae995c/

Trio and the future of asynchronous execution in
Python: https://lwn.net/Articles/724082/ or
https://lwn.net/SubscriberLink/724082/43c399adca8006f0/

Python ssl module update: https://lwn.net/Articles/724209/ or
https://lwn.net/SubscriberLink/724209/8460ca8b51c00634/

stay tuned sometime next week for the thrilling conclusion :)

jake

-- 
Jake Edge - LWN - j...@lwn.net - http://lwn.net


Re: [Python-Dev] The untuned tunable parameter ARENA_SIZE

2017-06-01 Thread Larry Hastings


On 06/01/2017 02:50 AM, Antoine Pitrou wrote:

> Another possible strategy is: allocate several arenas at once (using a
> larger mmap() call), and use MADV_DONTNEED to relinquish individual
> arenas.


Thus adding a *fourth* layer of abstraction over memory we get from the OS?

   block -> pool -> arena -> "multi-arena" -> OS

Y'know, this might actually make things faster.  These multi-arenas 
could be the dynamically growing thing Victor wants to try.  We allocate 
16mb, then carve it up into arenas (however big those are), then next 
time allocate 32mb or what have you. Since the arenas remain a fixed 
size, we don't make the frequently-used code path (address_in_range) any 
slower.  The code to deal with the multi-arenas would add a little 
complexity--to an admittedly already complex allocator implementation, 
but then what allocator isn't complex internally?--but it'd be an 
infrequent code path and I bet it'd be an improvement over simply 
calling malloc / mmap / VirtualAlloc.  What do you think, Victor?


And to think I started this reply ironically,


//arry/


Re: [Python-Dev] Aligning the packaging.python.org theme with the rest of the docs

2017-06-01 Thread Nick Coghlan
On 30 May 2017 at 22:08, Antoine Pitrou  wrote:
> On Tue, 30 May 2017 21:49:19 +1000
> Nick Coghlan  wrote:
>>
>> Here's an alternate wording for the README that would focus on those
>> considerations without explicitly asking folks not to use the theme:
>>
>> "Note that when adopting this theme, you're also borrowing an element
>> of the trust and credibility established by the CPython core
>> developers over the years, as well as the legal credibility arising
>> from their close association with the Python Software Foundation.
>
> The statement about "legal credibility" sounds wishy-washy and could
> lure users into thinking that they're doing something illegal by
> borrowing the theme.
>
> Also I'm not sure what is that "legal credibility" you're talking
> about.  If it's about the PSF license and the Python CLA then
> better to voice that explicitly, IMO.

It's probably better to just drop that clause and call the repository
"cpython-docs-theme" rather than "psf-docs-theme".

Explicitly affiliating the theme with the PSF made sense if we were
reserving the right to seek trade dress protections in the future, but
it sounds like folks are pretty solidly against that idea, so we can
instead leave the PSF out of it entirely.

>> That's fine, and you're welcome to do so for other Python community
>> projects if you so choose, but please keep in mind that in doing so
>> you're also choosing to become a co-steward of that collective trust
>> :)"
>
> "Becoming a co-steward of that collective trust" sounds serious enough
> (even though I don't understand what it means concretely), so why
> the smiley?

Mainly to convey that the situation isn't necessarily as profound as
that wording might suggest.

Rephrasing that part, and incorporating the amendment from above:

"Note that when adopting this theme, you're also borrowing an element
of the trust and credibility established by the CPython core
developers over the years. That's fine, and you're welcome to do so
for other Python community projects if you so choose, but please keep
in mind that in doing so you're also choosing to accept some of the
responsibility for maintaining that collective trust."

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia