[Python-Dev] Re: PEP 594: update 1

2019-06-06 Thread Christian Tismer
On 05.06.19 02:21, Victor Stinner wrote:
> So what is happening for this PEP since Python 3.8 beta1 has been
> released? Is it too late for Python 3.8 or not?
> 
> It seems like most people are confused by the intent of the PEP. IMHO
> it would be better to rewrite "Remove packages from the stdlib" as
> "Move some stdlib modules to PyPI". But that would require to rewrite
> some parts of the PEP to explain how modules are moved, who become the
> new maintainers, how to support modules both in stdlib (old Python
> versions) and in PyPI (new Python), etc.

And I would like to add something as well:

The stdlib has been a set of well-known modules.
Maybe not the latest and greatest, but you could be quite sure
that these modules were stable and would stay around.

With the move to PyPI, that promise is partially lost:

PyPI has a great many good modules, but also some less good ones.
With the stdlib, you had more or less one choice.
With PyPI, you have far too many modules, and you no longer have
the feeling "this seems to be right in the BDFL's mind".

I think what is missing is a replacement for this feature:
for the modules in the stdlib, being in the stdlib is itself
the quality indicator.
I now need a structure that replaces that indicator,
something like

"This one is eligible to go into stdlib"

Do we already have such a replacement?

-- 
Christian Tismer :^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://github.com/PySide
14482 Potsdam: GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023
Python-Dev mailing list -- python-dev(a)python.org
To unsubscribe send an email to python-dev-leave(a)python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Steven D'Aprano
On Wed, Jun 05, 2019 at 11:08:13AM +0200, Victor Stinner wrote:

> Enhancements:
> * More reliable "permalink" URLs to emails

This is excellent... 

but the old pipermail URLs were typically something like 70-80
characters, which means they fit nicely into emails (and comments in
source code). However, the new URLs are much longer. Picking two examples
at random:

https://mail.python.org/pipermail/python-ideas/2019-March/055976.html

https://mail.python.org/archives/list/python-id...@python.org/thread/B7QPHTQSBVN4NFO3SEVR57AIGYPM3MUM/

i.e. 25-40% longer. Is there a shorter permalink form available, like
the ones goo.gl, bitly, youtu.be etc. use? It would be awesome if we
could use those instead.



-- 
Steven


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Victor Stinner
On Thu, Jun 6, 2019 at 14:18, Steven D'Aprano wrote:
> i.e. 25-40% longer. Is there a shorter permalink form available, like
> goo.gl, bitly, youtu.be etc use? That would be awesome if we could use
> them instead.

I really dislike URL shorteners.

From my point of view, URL shorteners are the opposite of permanent
links. They add a new single point of failure.

If a full URL becomes broken (HTTP error 404), I can still try to
find the information in a different way using the full URL. For
example, there are services like archive.org that archive public
websites. Or I can look in other emails "around" this email; maybe the
missing email is quoted in another email.

Victor
-- 
Night gathers, and now my watch begins. It shall not end until my death.

[Python-Dev] Re: PEP 594: update 1

2019-06-06 Thread Steve Dower

On 04Jun2019 1721, Victor Stinner wrote:

> So what is happening for this PEP since Python 3.8 beta1 has been
> released? Is it too late for Python 3.8 or not?
>
> It seems like most people are confused by the intent of the PEP. IMHO
> it would be better to rewrite "Remove packages from the stdlib" as
> "Move some stdlib modules to PyPI". But that would require to rewrite
> some parts of the PEP to explain how modules are moved, who become the
> new maintainers, how to support modules both in stdlib (old Python
> versions) and in PyPI (new Python), etc.


I think the problem with this is that it sounds like the longer-term plan to
maintain some stdlib packages independently of the CPython repo, *but 
still include them in the distro* (ensurepip style). This is a much 
broader set of packages that would still be available by default 
(assuming that distributors care about their users) but would also 
benefit from a single code base (rather than 3-4 branches) and being 
independently upgradeable (for backports and/or security fixes). They 
would probably still be managed by core developers, and it may be easier 
to attract new contributors for them since they don't need to manage the 
entire CPython repo.


By contrast, the packages in this PEP are just being deprecated and 
removed. We don't actually have to explain what happens next, and we 
don't have to block the PEP on those discussions.


But perhaps we should go for the big refactor now? Then at least we have 
an explanation for unmaintained packages too - they still show up in 
distros but are not part of the CPython source tree.


Cheers,
Steve


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread David Mertz
The old URL is definitely a lot friendlier, even apart from the length. I
know at a glance the month and list where the thread occurred, which is
perhaps the most important metadata. In the new link I only know what
mailing list it happened on, and nothing else meaningful.

On Thu, Jun 6, 2019, 8:20 AM Steven D'Aprano wrote:

> On Wed, Jun 05, 2019 at 11:08:13AM +0200, Victor Stinner wrote:
>
> > Enhancements:
> > * More reliable "permalink" URLs to emails
>
> This is excellent...
>
> but the old pipermail URLs were typically something like 70-80
> characters, which means they fit nicely into emails (and comments in
> source code). However the new URLs are much longer. Picking two examples
> at random:
>
> https://mail.python.org/pipermail/python-ideas/2019-March/055976.html
>
>
> https://mail.python.org/archives/list/python-id...@python.org/thread/B7QPHTQSBVN4NFO3SEVR57AIGYPM3MUM/
>
> i.e. 25-40% longer. Is there a shorter permalink form available, like
> goo.gl, bitly, youtu.be etc use? That would be awesome if we could use
> them instead.
>
>
>
> --
> Steven
>


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Barry Warsaw
On Jun 6, 2019, at 09:15, David Mertz wrote:
> 
> The old URL is definitely a lot friendlier, even apart from the length.

Unfortunately, the old URLs aren’t really permanent.  Pipermail calculated them 
sequentially and if you rebuild the archive from the source-of-truth mbox 
files, there’s no way to guarantee the original sequence (for various reasons 
including removal of messages for spam/legal/request purposes, corruption in 
the mbox files, etc.).  In fact, lists on python.org *have* had their archives 
rebuilt and *have* gotten reshuffled, breaking existing links.

The new URLs are guaranteed to be reproducible from the original message 
source.  The downside is that they are less friendly.
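
A minimal sketch of why such an identifier is reproducible. It assumes the hash is the base32-encoded SHA-1 digest of the Message-ID header with its angle brackets stripped, which is my understanding of how Mailman 3 derives its X-Message-ID-Hash; treat that as an assumption rather than a specification. The Message-ID below is made up for illustration.

```python
import base64
import hashlib


def message_id_hash(message_id: str) -> str:
    """Derive a stable archive identifier from a Message-ID header value."""
    # Assumption: the surrounding angle brackets are not part of the hash input.
    if message_id.startswith("<") and message_id.endswith(">"):
        message_id = message_id[1:-1]
    digest = hashlib.sha1(message_id.encode("utf-8")).digest()
    # 20 digest bytes are 160 bits, i.e. exactly 32 base32 characters, no padding.
    return base64.b32encode(digest).decode("ascii")


# Hypothetical Message-ID, for illustration only.
example = "<20190606.example@mail.example.org>"
print(message_id_hash(example))
```

Because the identifier depends only on the message itself, rebuilding the archive from the mbox files yields the same value, whereas a sequence number depends on what else is in the archive.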

-Barry





[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Tim Peters
[Antoine Pitrou]
> The interesting thing here is that in many situations, the size is
> known up front when deallocating - it is simply not communicated to the
> deallocator because the traditional free() API takes a sole pointer,
> not a size.  But CPython could communicate that size easily if we
> would like to change the deallocation API.  Then there's no bother
> looking up the allocated size in sophisticated lookup structures.

That could work (to make it possible to increase obmalloc's pool
size).  Except ...

> I'll note that jemalloc provides such APIs:
> http://jemalloc.net/jemalloc.3.html
>
> """The dallocx() function causes the memory referenced by ptr to be
> made available for future allocations.
>
> The sdallocx() function is an extension of dallocx() with a size
> parameter to allow the caller to pass in the allocation size as an
> optimization."""

obmalloc doesn't intend to be a general-purpose allocator - it only
aims at optimizing "small" allocations, punting to the system for
everything beyond that.  Unless the size is _always_ passed in (on
every free() and realloc() spelling it supports), an "optimization"
doesn't help it much.  It needs a bulletproof way to determine whether
it, or system malloc/realloc, originally obtained an address passed
in.  If the size is always passed in, no problem (indeed, a single bit
could suffice).  But if it's ever possible that the size won't be
passed in, all the runtime machinery to figure that out on its own
needs to be able to handle all addresses.

Like now:  if the size were passed in, obmalloc could test the size
instead of doing the `address_in_range()` dance(*).  But if it's ever
possible that the size won't be passed in, all the machinery
supporting `address_in_range()` still needs to be there, and every
obmalloc spelling of malloc/realloc needs to ensure that machinery
will work if the returned address is passed back to an obmalloc
free/realloc spelling without the size.
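
A toy model of that dispatch, not obmalloc's actual code. The 512-byte threshold is obmalloc's cutoff for "small" requests; `owner_on_free`, `in_arena` and the arena bounds are hypothetical names and values chosen for this sketch.

```python
SMALL_REQUEST_THRESHOLD = 512  # bytes; obmalloc's cutoff for "small" requests


def owner_on_free(address, size=None, address_in_range=None):
    """Decide which allocator should release a block (toy model)."""
    if size is not None:
        # Size supplied by the caller: a cheap comparison is enough.
        return "obmalloc" if 0 < size <= SMALL_REQUEST_THRESHOLD else "system"
    # Size missing: fall back to the address-based check, which therefore
    # has to keep working for every address ever handed out.
    if address_in_range is None:
        raise ValueError("no size given, so an address check is required")
    return "obmalloc" if address_in_range(address) else "system"


# Hypothetical 256 KB arena owned by the small-object allocator.
ARENA_START, ARENA_SIZE = 0x10000, 256 * 1024


def in_arena(address):
    return ARENA_START <= address < ARENA_START + ARENA_SIZE


print(owner_on_free(0x10040, size=64))
print(owner_on_free(0x900000, size=4096))
print(owner_on_free(0x10040, address_in_range=in_arena))
```

The third call is the case described above: as long as any caller may omit the size, the address-based machinery cannot be removed.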

The "only" problem with address_in_range is that it limits us to a
maximum pool size of 4K.  Just for fun, I boosted that to 8K to see
how likely segfaults really are, and a Python built that way couldn't
even get to its first prompt before dying with an access violation
(Windows-speak for segfault).

Alas, that makes it hard to guess how much value there would be for
Python programs if the pool size could be increased - can't even get
Python started.
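
A sketch of the arithmetic behind that limit, assuming a 4 KiB system page. address_in_range reads a pool header at the pool-aligned address below the pointer it is given, even for pointers that came from the system malloc; that read is only known to be safe if the header sits on the same page as the pointer. The helper names and the example address are hypothetical.

```python
PAGE_SIZE = 4 * 1024  # assumed system page size


def pool_header_address(p: int, pool_size: int) -> int:
    """Round p down to the start of the pool that would contain it."""
    return p & ~(pool_size - 1)


def header_on_same_page(p: int, pool_size: int) -> bool:
    """True if the would-be pool header lies on the same page as p."""
    return pool_header_address(p, pool_size) // PAGE_SIZE == p // PAGE_SIZE


p = 0x7F3A1010  # hypothetical address obtained from the system malloc
print(header_on_same_page(p, 4 * 1024))  # header on p's own, mapped page
print(header_on_same_page(p, 8 * 1024))  # header may be on an unmapped page
```

With 8 KiB pools the rounded-down address can fall on the preceding page, which nothing guarantees is mapped, hence the access violation.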

We could eliminate the pool size restriction in many ways.  For
example, we could store the addresses obtained from the system
malloc/realloc - but not yet freed - in a set, perhaps implemented as
a radix tree to cut the memory burden.  But digging through 3 or 4
levels of a radix tree to determine membership is probably
significantly slower than address_in_range.
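
For illustration, a small Python model of such a set, using nested dicts as a fixed-depth radix tree. The class name, the four 16-bit levels, and the example addresses are assumptions of this sketch; a C version would use arrays of child pointers instead.

```python
class AddressRadixSet:
    """Set of 64-bit addresses stored as a fixed-depth radix tree."""

    def __init__(self, levels=4, bits_per_level=16):
        self.levels = levels
        self.bits = bits_per_level
        self.root = {}

    def _keys(self, address):
        mask = (1 << self.bits) - 1
        # Most significant chunk first.
        return [(address >> (self.bits * i)) & mask
                for i in reversed(range(self.levels))]

    def add(self, address):
        node = self.root
        *inner, leaf = self._keys(address)
        for key in inner:
            node = node.setdefault(key, {})
        node[leaf] = True

    def discard(self, address):
        node = self.root
        *inner, leaf = self._keys(address)
        for key in inner:
            node = node.get(key)
            if node is None:
                return
        node.pop(leaf, None)  # empty interior nodes are not pruned here

    def __contains__(self, address):
        node = self.root
        for key in self._keys(address):  # one lookup per level
            node = node.get(key)
            if node is None:
                return False
        return True


system_blocks = AddressRadixSet()
system_blocks.add(0x7F3A1010)
print(0x7F3A1010 in system_blocks, 0x7F3A1018 in system_blocks)
```

Every membership test walks one node per level, which is the per-lookup cost being compared against address_in_range.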

I can think of a way to do it slightly faster than (but related to)
address_in_range, but it would (on a 64-bit box) require adding 24
more bytes for each system-malloc/realloc allocation.  8 of those
bytes would be pure waste, because the Alignment Gods appear to
require 16-byte alignment for every allocation on a 64-bit box now.

In stark contrast, the extra memory burden of the current
address_in_range is an insignificant 8 bytes per _arena_ (256 KB, and
"should be" larger now).

Another approach:  keep address_in_range as-is, but add new internal
structure to larger pools, to repeat the arena index every 4KB.  But
that fights somewhat against the goal of larger pools.

Etc. ;-)

(*) Actually not quite true.  If a "large" object is obtained from
obmalloc now (meaning it actually came from the system malloc), then
cut back to a "small" size by a realloc, it _remains_ under the
control of the system malloc now.  Passing in the newer "small" size
to a free() later would cause obmalloc to get fatally confused about
that.


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Antoine Pitrou
On Thu, 6 Jun 2019 13:57:37 -0500
Tim Peters wrote:
> [Antoine Pitrou]
> > The interesting thing here is that in many situations, the size is
> > known up front when deallocating - it is simply not communicated to the
> > deallocator because the traditional free() API takes a sole pointer,
> > not a size.  But CPython could communicate that size easily if we
> > would like to change the deallocation API.  Then there's no bother
> > looking up the allocated size in sophisticated lookup structures.  
> 
> That could work (to make it possible to increase obmalloc's pool
> size).  Except ...
> 
> > I'll note that jemalloc provides such APIs:
> > http://jemalloc.net/jemalloc.3.html
> >
> > """The dallocx() function causes the memory referenced by ptr to be
> > made available for future allocations.
> >
> > The sdallocx() function is an extension of dallocx() with a size
> > parameter to allow the caller to pass in the allocation size as an
> > optimization."""  
> 
> obmalloc doesn't intend to be a general-purpose allocator - it only
> aims at optimizing "small" allocations, punting to the system for
> everything beyond that.

But my response was under the assumption that we would want obmalloc to
deal with all allocations.  Which is more or less required anyway to
have an efficient GC that doesn't have to walk linked lists and access
memory in random order to iterate over known objects.

Regards

Antoine.


[Python-Dev] Re: PEP 594: update 1

2019-06-06 Thread Brett Cannon
On Thu, Jun 6, 2019 at 12:25 AM Christian Tismer wrote:

> On 05.06.19 02:21, Victor Stinner wrote:
> > So what is happening for this PEP since Python 3.8 beta1 has been
> > released? Is it too late for Python 3.8 or not?
> >
> > It seems like most people are confused by the intent of the PEP. IMHO
> > it would be better to rewrite "Remove packages from the stdlib" as
> > "Move some stdlib modules to PyPI". But that would require to rewrite
> > some parts of the PEP to explain how modules are moved, who become the
> > new maintainers, how to support modules both in stdlib (old Python
> > versions) and in PyPI (new Python), etc.
>
> And I would like to add something as well:
>
> The stdlib has been a set of well-known modules.
> Maybe not the latest and greatest, but you knew for quite sure
> that these modules are guaranteed to be stable and quite persistent.
>
> With the move to PyPI, I am missing this promise, partially:
>
> PyPI has very many good modules, but also some less good ones.
> With the stdlib, you had almost one choice to choose from.
> With PyPI, you have way too many modules, and you have no longer
> the feeling "this seems to be right in BDFL mind".
>
> I think what is missing is replacement of this feature:
> The set of modules in the stdlib has exactly that being in the
> stdlib as a quality indicator.
> I need now a structure that replaces that quality,
> like
>
> "This one is eligible to go into stdlib"
>
> Do we have such a replacement implemented, already?
>

Are you asking us to bless packages on PyPI as being of a quality that the
core devs approve of? Or something else? If it's the former, we do have
links pointing to other projects already (e.g. linking to 'requests' from
https://docs.python.org/3/library/urllib.request.html#module-urllib.request).


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Tim Peters
[Antoine Pitrou]
> But my response was under the assumption that we would want obmalloc to
> deal with all allocations.

I didn't know that.  I personally have no interest in that:  if we
want an all-purpose allocator, there are several already to choose
from.  There's no reason to imagine we could write a better one.

> Which is more or less required anyway to have an efficient GC that doesn't
> have to walk linked lists and access memory in random order to iterate over
> known objects.

As the parent thread showed, obmalloc does at least as well as any
general-purpose allocator known _for Python's purposes_ (a great many
(de)allocations of "small" objects).  Already explained too that
size-segregated linked free lists are the core of _why_ it does so
well.  Besides making carving off, and freeing, blocks dirt cheap,
linked lists also naturally support recycling memory blocks in MRU
("freshness in cache") order.
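
A toy model of size-segregated free lists, to show why both operations are cheap and why reuse is most-recently-freed first. Blocks are represented by integer addresses; the 8-byte size-class granularity and all names are assumptions of this sketch, not obmalloc's actual layout.

```python
ALIGNMENT = 8  # hypothetical size-class granularity for this sketch


def size_class(nbytes):
    """Map a request size to its size class (1 for 1..8 bytes, 2 for 9..16, ...)."""
    return (nbytes + ALIGNMENT - 1) // ALIGNMENT


class FreeLists:
    def __init__(self):
        self.head = {}  # size class -> most recently freed block
        self.next = {}  # free block -> next free block in the same class

    def free(self, block, nbytes):
        """Push the block on the front of its class's list: O(1)."""
        cls = size_class(nbytes)
        self.next[block] = self.head.get(cls)
        self.head[cls] = block

    def alloc(self, nbytes):
        """Pop the most recently freed block of the right class: O(1)."""
        cls = size_class(nbytes)
        block = self.head.get(cls)
        if block is None:
            return None  # a real allocator would carve a fresh block here
        self.head[cls] = self.next.pop(block)
        return block


lists = FreeLists()
lists.free(0x1000, 24)
lists.free(0x2000, 24)
print(hex(lists.alloc(24)))  # the block freed last comes back first
```

In a real allocator the "next" link is stored inside the free block itself, so the lists cost no extra memory.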

But I don't know what you mean by "access memory in random order to
iterate over known objects".  obmalloc never needs to iterate over
known objects - indeed, it contains no code capable of doing that.
Our cyclic gc does, but that's independent of obmalloc.  Over on
Discourse, Neil is speculating about using radix trees for cyclic gc
instead of _its_ linked lists.  In obmalloc, allocated memory regions
aren't linked at all.  It's free regions that are linked, and
helpfully so in MRU order.


[Python-Dev] Re: PEP 594: update 1

2019-06-06 Thread Christian Tismer
On 06.06.19 21:27, Brett Cannon wrote:
> On Thu, Jun 6, 2019 at 12:25 AM Christian Tismer wrote:
>
> > On 05.06.19 02:21, Victor Stinner wrote:
> > > So what is happening for this PEP since Python 3.8 beta1 has been
> > > released? Is it too late for Python 3.8 or not?
> > >
> > > It seems like most people are confused by the intent of the PEP. IMHO
> > > it would be better to rewrite "Remove packages from the stdlib" as
> > > "Move some stdlib modules to PyPI". But that would require to rewrite
> > > some parts of the PEP to explain how modules are moved, who become the
> > > new maintainers, how to support modules both in stdlib (old Python
> > > versions) and in PyPI (new Python), etc.
> >
> > And I would like to add something as well:
> >
> > The stdlib has been a set of well-known modules.
> > Maybe not the latest and greatest, but you knew for quite sure
> > that these modules are guaranteed to be stable and quite persistent.
> >
> > With the move to PyPI, I am missing this promise, partially:
> >
> > PyPI has very many good modules, but also some less good ones.
> > With the stdlib, you had almost one choice to choose from.
> > With PyPI, you have way too many modules, and you have no longer
> > the feeling "this seems to be right in BDFL mind".
> >
> > I think what is missing is replacement of this feature:
> > The set of modules in the stdlib has exactly that being in the
> > stdlib as a quality indicator.
> > I need now a structure that replaces that quality,
> > like
> >
> > "This one is eligible to go into stdlib"
> >
> > Do we have such a replacement implemented, already?
>
> Are you asking for us to bless packages on PyPI as of a quality that the
> core devs approve of it? Or something else? If it's the former we do
> have links pointing to other projects already (e.g. linking to
> 'requests' from
> https://docs.python.org/3/library/urllib.request.html#module-urllib.request).


Yes, I'm asking for blessing some packages.
And I have not spent much time these days with the topic,
so please ignore my uninformed question.

-- 
Christian Tismer :^)   tis...@stackless.com
Software Consulting  : http://www.stackless.com/
Karl-Liebknecht-Str. 121 : https://github.com/PySide
14482 Potsdam: GPG key -> 0xFB7BEE0E
phone +49 173 24 18 776  fax +49 (30) 700143-0023


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Antoine Pitrou
On Thu, 6 Jun 2019 16:03:03 -0500
Tim Peters wrote:
> But I don't know what you mean by "access memory in random order to
> iterate over known objects".  obmalloc never needs to iterate over
> known objects - indeed, it contains no code capable of doing that.
> Our cyclic gc does, but that's independent of obmalloc.

It's not.  Cyclic GC needs its own linked lists *because* the allocator
doesn't allow it to iterate over allocated objects.

Regards

Antoine.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/YZIG3RZKM4XSTM3PYPDAPG3UHOB5QKM4/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Tim Peters
[Tim]
>> But I don't know what you mean by "access memory in random order to
>> iterate over known objects".  obmalloc never needs to iterate over
>> known objects - indeed, it contains no code capable of doing that.
>> Our cyclic gc does, but that's independent of obmalloc.

[Antoine]
> It's not.  Cyclic GC needs its own linked lists *because* the allocator
> doesn't allow it to iterate over allocated objects.

The doubly linked lists in gc primarily support efficient
_partitioning_ of objects for gc's purposes (a union of disjoint sets,
with constant-time moving of an object from one set to another, and
constant-time union of disjoint sets).  "All objects" is almost never
interesting to it (it is only when the oldest non-frozen generation is
being collected).

Between collections, the partitioning is by generation.

During a collection, the youngest generations are combined into one,
and then that's sub-partitioned in various ways as collection goes
along, ending with a partition into reachable and unreachable objects.
In between, ephemeral partitions are constructed (e.g., the set of
"tentatively unreachable" objects).

None of that was driven by obmalloc's (or malloc's) inability to
iterate over objects.  Doubly linked lists were the obvious way to
implement the required operations on partitions efficiently and
simply.
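
A Python model of those operations on circular doubly linked lists with a sentinel head. The function names loosely mirror the helpers in CPython's gcmodule.c, but this is an illustrative sketch, not that code; `Node` and `members` are hypothetical.

```python
class Node:
    def __init__(self, name=None):
        self.name = name
        self.prev = self.next = self  # a lone node is a one-element ring


def gc_list_append(node, head):
    """Link node in just before head, i.e. at the end of head's list: O(1)."""
    last = head.prev
    node.prev, node.next = last, head
    last.next = head.prev = node


def gc_list_remove(node):
    """Unlink node from whatever list it is in: O(1)."""
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = node


def gc_list_move(node, head):
    """Move node from its current partition to the end of head's: O(1)."""
    gc_list_remove(node)
    gc_list_append(node, head)


def gc_list_merge(src, dst):
    """Append all of src's nodes to dst and leave src empty: O(1)."""
    if src.next is not src:
        first, last = src.next, src.prev
        tail = dst.prev
        tail.next, first.prev = first, tail
        last.next, dst.prev = dst, last
        src.prev = src.next = src


def members(head):
    out, node = [], head.next
    while node is not head:
        out.append(node.name)
        node = node.next
    return out


young, old = Node(), Node()  # sentinel heads for two partitions
a, b, c = Node("a"), Node("b"), Node("c")
for n in (a, b, c):
    gc_list_append(n, young)
gc_list_move(b, old)
gc_list_merge(young, old)
print(members(young), members(old))
```

Neither the move nor the merge needs to know how many objects a partition holds, which is what makes repartitioning during a collection cheap.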

In any case, it appears to be "a feature" now that people can use any
flavor of the malloc family they like in CPython, so I expect that any
attempt to tie cyclic gc to a specific flavor of malloc would be a
very hard sell.  Which, BTW, was the intended meaning of
"independent":  cyclic gc right now couldn't care less which version
of malloc a user plugs in - nor could obmalloc care less which cyclic
gc algorithm is used.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/OR4NFZRQIOCK2N3XBCPI6GESI5BYRD3D/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Antoine Pitrou
On Thu, 6 Jun 2019 17:26:17 -0500
Tim Peters wrote:
> 
> The doubly linked lists in gc primarily support efficient
> _partitioning_ of objects for gc's purposes (a union of disjoint sets,
> with constant-time moving of an object from one set to another, and
> constant-time union of disjoint sets).  "All objects" is almost never
> interesting to it (it is only when the oldest non-frozen generation is
> being collected).

Right.  But the oldest generation is precisely the pain point, since
full collections can induce very long pauses.  IOW, perhaps it would be
fine to keep dedicated lists for the young generations, while doing a
heap walk for the full collection case.

> In any case, it appears to be "a feature" now that people can use any
> flavor of the malloc family they like in CPython, so I expect that any
> attempt to tie cyclic gc to a specific flavor of malloc would be a
> very hard sell.

Depends on the benefits of course ;-) Most people use some pre-built
binaries that "naturally" come with obmalloc enabled.

Of course, it's only a very small minority of applications that hit
real performance issues with the GC - and in probably many of those
cases, tuning the full collection threshold can still alleviate the
problem.
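
As a sketch of what such tuning can look like with the standard `gc` module: the helper name and the factor of 10 are arbitrary choices for illustration, and how the third threshold is interpreted depends on the CPython version, so this should be measured rather than assumed to help.

```python
import gc


def raise_full_collection_threshold(factor=10):
    """Make full (oldest-generation) collections rarer; return old settings."""
    gen0, gen1, gen2 = gc.get_threshold()
    # Only the third value is changed; the young generations keep their pace.
    gc.set_threshold(gen0, gen1, max(gen2, 1) * factor)
    return gen0, gen1, gen2


previous = raise_full_collection_threshold()
print("before:", previous, "after:", gc.get_threshold())

# Restore the original settings when done experimenting.
gc.set_threshold(*previous)
```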

Regards

Antoine.
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZQEHZTW77MFCKCJMR7XB4AVBF6KCB3ZW/


[Python-Dev] PEP 596 proposes doubling the release cadence

2019-06-06 Thread Łukasz Langa
PEP body and discussion link:
https://discuss.python.org/t/pep-596-python-3-9-release-schedule-doubling-the-release-cadence/1828

- Ł


Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/Y4XFD7OW5J4ULITO2SSPAWSZ2L76UGMU/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Neil Schemenauer
On 2019-06-06, Tim Peters wrote:
> Like now:  if the size were passed in, obmalloc could test the size
> instead of doing the `address_in_range()` dance(*).  But if it's ever
> possible that the size won't be passed in, all the machinery
> supporting `address_in_range()` still needs to be there, and every
> obmalloc spelling of malloc/realloc needs to ensure that machinery
> will work if the returned address is passed back to an obmalloc
> free/realloc spelling without the size.


We can almost make it work for GC objects, the use of obmalloc is
quite well encapsulated.  I think I intentionally designed the
PyObject_GC_New/PyObject_GC_Del/etc APIs that way.

Quick and dirty experiment is here:

https://github.com/nascheme/cpython/tree/gc_malloc_free_size

The major hitch seems to be my new gc_obj_size() function.  We can't be
sure the 'nbytes' passed to _PyObject_GC_Malloc() is the same as
what is computed by gc_obj_size().  It usually works but there are
exceptions (freelists for frame objects and tuple objects, for example).

A nasty problem is the weirdness with PyType_GenericAlloc() and the
sentinel item.  _PyObject_GC_NewVar() doesn't include space for the
sentinel but PyType_GenericAlloc() does.  When you get to
gc_obj_size(), you don't know if you should use "nitems" or "nitems+1".
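
The ambiguity can be seen with a little arithmetic. The sketch below models CPython's `_PyObject_VAR_SIZE` rounding in Python; the 64-bit pointer size and the example type (24-byte header, 8-byte items) are assumptions, and the GC header that precedes each object is left out.

```python
SIZEOF_VOID_P = 8  # assuming a 64-bit build


def var_size(basicsize, itemsize, nitems):
    """Model of _PyObject_VAR_SIZE: round the raw size up to pointer alignment."""
    raw = basicsize + nitems * itemsize
    return (raw + SIZEOF_VOID_P - 1) & ~(SIZEOF_VOID_P - 1)


# Hypothetical variable-size type: 24-byte header, 8-byte items, 3 items.
basicsize, itemsize, nitems = 24, 8, 3

size_newvar = var_size(basicsize, itemsize, nitems)       # no sentinel slot
size_generic = var_size(basicsize, itemsize, nitems + 1)  # with sentinel slot
print(size_newvar, size_generic)  # 48 56
```

Two objects of the same type and item count can therefore have been allocated with different sizes, and nothing stored in the object says which path created it.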

I'm not sure how to fix the sentinel issue.  Maybe a new type slot
or a type flag?  In any case, making a change like my git branch
above would almost certainly break extensions that don't play
nicely.  It won't be hard to make it a build option, like the
original gcmodule was.  Then, assuming there is a performance boost,
people can enable it if their extensions are friendly.


> The "only" problem with address_in_range is that it limits us to a
> maximum pool size of 4K.  Just for fun, I boosted that to 8K to see
> how likely segfaults really are, and a Python built that way couldn't
> even get to its first prompt before dying with an access violation
> (Windows-speak for segfault).

If we can make the above idea work, you could set the pool size to
8K without issue.  A possible problem is that the obmalloc and
gcmalloc arenas are separate.  I suppose that affects 
performance testing.

> We could eliminate the pool size restriction in many ways.  For
> example, we could store the addresses obtained from the system
> malloc/realloc - but not yet freed - in a set, perhaps implemented as
> a radix tree to cut the memory burden.  But digging through 3 or 4
> levels of a radix tree to determine membership is probably
> significantly slower than address_in_range.

You are likely correct. I'm hoping to benchmark the radix tree idea.
I'm not too far from having it working such that it can replace
address_in_range().  Maybe allocating gc_refs as a block would
offset the radix tree cost vs address_in_range().  If the above idea
works, we know the object size at free() and realloc(), so we don't
need address_in_range() for those code paths.

Regards,

  Neil
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ILFK2MTCVA7GB7JGBVSUWASKJ7T4LLJE/


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Stephen J. Turnbull
Barry Warsaw writes:
 > On Jun 6, 2019, at 09:15, David Mertz wrote:
 > > 
 > > The old URL is definitely a lot friendlier, even apart from the length.
 > 
 > Unfortunately, the old URLs aren’t really permanent.

True.  That could be addressed in theory, but it would be fragile (ie,
vulnerable to loss or corruption of the external database mapping
messages to URLs).  Calculating from the message itself means that if
you have the message you can always get where you want to go.

 > The new URLs are guaranteed to be reproducible from the original
 > message source.  The downside is that they are less friendly.

They could, however, be made more friendly than they currently are.
There's no reason (in principle, of course it requires changing code
and the DNS) why your message, currently given the Archived-At URL

https://mail.python.org/archives/list/python-dev@python.org/message/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/

couldn't be given (A is for Archives)

https://a.python.org/python-dev@python.org/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/

which gets it down to an RFC-conformant 76 characters. ;-)  Of course
many lists would overflow that, and I agree with David that

https://a.python.org/python-dev@python.org/2019/06/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/

would be better still.  Although the risk of collision would be orders
of magnitude higher (the date buys us some leeway, but not much), we
could make the ID-Hash be 2019/06/B5VZSRS3DDXZN6ET (arbitrarily choosing
the middle 16 characters), giving

https://a.python.org/python-dev@python.org/2019/06/B5VZSRS3DDXZN6ET

(67 characters, allowing a few more characters for domain names and/or
list names -- note with the current scheme, a domain name which is 1
character longer probably uses up two more characters of space).
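
The lengths quoted above can be checked directly. This sketch only rebuilds the URLs given in this message; the variable names are mine, and a.python.org is the hypothetical host proposed above, not an existing service.

```python
LIST = "python-dev@python.org"
ID_HASH = "EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS"

current = f"https://mail.python.org/archives/list/{LIST}/message/{ID_HASH}/"
short = f"https://a.python.org/{LIST}/{ID_HASH}/"
dated = f"https://a.python.org/{LIST}/2019/06/{ID_HASH}/"

# "Middle 16": drop the first and last 8 of the 32 hash characters.
middle = ID_HASH[8:24]
dated_short = f"https://a.python.org/{LIST}/2019/06/{middle}"

for url in (current, short, dated, dated_short):
    print(len(url), url)
```

The second and fourth URLs come out at 76 and 67 characters, matching the figures in the text.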

None of this is very attractive to me, for reasons I will go into on
Mailman-Developers or gitlab.com/mailman/mailman/issues if you want to
file one.  Briefly, people who want bit.ly-length short URLs won't be
satisfied, and the proposed URLs are more useful but still ugly.

Personally I think we should all just switch to reStructuredText- and
Markdown-capable MUAs, and kill off both ugly visible URLs and HTML
email with one big ol' rock.

Steve
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/O3T27UUHKKXATOPJT4KEQHREUGYVMELV/


[Python-Dev] Re: obmalloc (was Have a big machine and spare time? Here's a possible Python bug.)

2019-06-06 Thread Neil Schemenauer
On 2019-06-06, Tim Peters wrote:
> The doubly linked lists in gc primarily support efficient
> _partitioning_ of objects for gc's purposes (a union of disjoint sets,
> with constant-time moving of an object from one set to another, and
> constant-time union of disjoint sets).  "All objects" is almost never
> interesting to it (it is only when the oldest non-frozen generation is
> being collected).

My current idea is to put partitioning flags on the interior radix
tree nodes.  If you mark an object as "finalizer reachable", for
example, it would mark all the nodes on the path from the root with
that flag.  Then, when you want to iterate over all the GC objects
with a flag, you can avoid uninteresting branches of the tree.
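
A rough sketch of the flag-propagation idea, in Python rather than C for brevity. Everything here is hypothetical: the `FlaggedRadixTree` class, the 12-bit integer keys standing in for object addresses, and the `FINALIZER_REACHABLE` constant. It is not obmalloc's or gc's actual layout, only an illustration of marking every node on the path from the root and then pruning unflagged branches while iterating with an explicit stack.

```python
BITS_PER_LEVEL = 4
LEVELS = 3                   # 12-bit keys are enough for a demonstration
FINALIZER_REACHABLE = 0x1    # hypothetical partition flag


class Node:
    __slots__ = ("children", "flags")

    def __init__(self):
        self.children = {}   # chunk of key bits -> child Node
        self.flags = 0       # union of the flags set anywhere below this node


def chunks(key):
    """Split a key into LEVELS chunks of bits, most significant first."""
    mask = (1 << BITS_PER_LEVEL) - 1
    return [(key >> (BITS_PER_LEVEL * level)) & mask
            for level in reversed(range(LEVELS))]


class FlaggedRadixTree:
    def __init__(self):
        self.root = Node()

    def add(self, key):
        """Insert a key; the node at depth LEVELS stands for the object."""
        node = self.root
        for chunk in chunks(key):
            node = node.children.setdefault(chunk, Node())

    def mark(self, key, flag):
        """Set flag on the object and on every node from the root down to it."""
        node = self.root
        node.flags |= flag
        for chunk in chunks(key):
            node = node.children[chunk]
            node.flags |= flag

    def iter_flagged(self, flag):
        """Yield keys carrying flag, skipping subtrees that lack it."""
        stack = [(self.root, 0, 0)]          # (node, depth, key prefix)
        while stack:
            node, depth, prefix = stack.pop()
            if not node.flags & flag:
                continue                     # uninteresting branch: prune
            if depth == LEVELS:
                yield prefix
                continue
            for chunk, child in node.children.items():
                stack.append((child, depth + 1,
                              (prefix << BITS_PER_LEVEL) | chunk))
```

Clearing a flag is deliberately left out: an interior node's flag can only be dropped once no other child still carries it, which is the kind of bookkeeping a real implementation would have to decide on.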

For generations, maybe tracking them at the pool level is good
enough.  Interior nodes can track generations too (i.e. the youngest
generation contained under them).

My gut feeling is that the prev/next pointer updates done by
move_unreachable() and similar functions must be quite expensive.
Doing the traversal with an explicit stack is a lot less elegant but
I think it should be faster.  At least, when you are dealing with a
big set of GC objects that don't fit in the CPU cache.

Regards,

  Neil
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/J422RWENKJAYHMXSZVRV5KGWSHNMAMJF/


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Chris Angelico
On Fri, Jun 7, 2019 at 4:30 PM Stephen J. Turnbull wrote:
> They could, however be made more friendly than they currently are.
> There's no reason (in principle, of course it requires changing code
> and the DNS) why your message, currently given the Archived-At URL
>
> https://mail.python.org/archives/list/python-dev@python.org/message/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/
>
> couldn't be given (A is for Archives)
>
> https://a.python.org/python-dev@python.org/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/
>
> which gets it down to an RFC-conformant 76 characters. ;-)

Can the list name be abbreviated to just "python-dev"? That'd give
some extra room to play with. It'd require that lists hosted on that
server be unique without their domain names; are there any known
collisions?

ChrisA
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ATA5JGGJM6NDMGPP2PJ2QVMZXH6FWETY/


[Python-Dev] Re: python-ideas and python-dev migrated to Mailman 3/HyperKitty

2019-06-06 Thread Wes Turner
Thanks for getting these upgraded. IMHO, being able to copy URLs from list
message footers as references in e.g. issues will be a great boost in
productivity.

On Friday, June 7, 2019, Stephen J. Turnbull <
turnbull.stephen...@u.tsukuba.ac.jp> wrote:

> Barry Warsaw writes:
>  > On Jun 6, 2019, at 09:15, David Mertz  wrote:
>  > >
>  > > The old URL is definitely a lot friendlier, even apart from the length.
>  >
>  > Unfortunately, the old URLs aren’t really permanent.
>
> True.  That could be addressed in theory, but it would be fragile (ie,
> vulnerable to loss or corruption of the external database mapping
> messages to URLs).  Calculating from the message itself means that if
> you have the message you can always get where you want to go.
>
>  > The new URLs are guaranteed to be reproducible from the original
>  > message source.  The downside is that they are less friendly.
>
> They could, however be made more friendly than they currently are.
> There's no reason (in principle, of course it requires changing code
> and the DNS) why your message, currently given the Archived-At URL
>
> https://mail.python.org/archives/list/python-dev@python.org/message/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/
>
> couldn't be given (A is for Archives)
>
> https://a.python.org/python-dev@python.org/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/
>
> which gets it down to an RFC-conformant 76 characters. ;-)  Of course
> many lists would overflow that, and I agree with David that
>
> https://a.python.org/python-dev@python.org/2019/06/EFHTPGCSB5VZSRS3DDXZN6ETYP5H6NDS/
>
> would be better still.  Although the risk of collision would be orders
> of magnitude higher (the date buys us some leeway but not much, we
> could make the ID-Hash be 2019/06/B5VZSRS3DDXZN6ET (arbitrarily chose
> middle 16), giving
>
> https://a.python.org/python-dev@python.org/2019/06/B5VZSRS3DDXZN6ET
>
> (67 characters, allowing a few more characters for domain names and/or
> list names -- note with the current scheme, a domain name which is 1
> character longer probably uses up two more characters of space).


Are these message IDs or hashes?
Do they have to be (is this) base-36?
Could they instead be base-62? (26+10+26)
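
On the hash and base questions, a small Python sketch may help. It assumes the ID-Hash is the base32 encoding of a SHA-1 digest of the Message-ID header: the 32-character length and the A-Z/2-7 alphabet in the examples in this thread are consistent with that, but treat it as an assumption. The Message-ID in the usage part and both function names are hypothetical.

```python
import base64
import hashlib


def message_id_hash(message_id):
    """Assumed scheme: base32(SHA-1(Message-ID without angle brackets))."""
    digest = hashlib.sha1(message_id.strip("<>").encode("utf-8")).digest()
    return base64.b32encode(digest).decode("ascii")


def chars_needed(bits, base):
    """Smallest number of base-`base` digits that can hold `bits` bits."""
    n = 0
    while base ** n < 2 ** bits:
        n += 1
    return n


if __name__ == "__main__":
    # Hypothetical Message-ID, not one from this thread.
    print(message_id_hash("<hypothetical-id@example.org>"))
    for base in (32, 36, 62):
        print(base, chars_needed(160, base))
```

For a 160-bit digest this works out to 32 characters in base-32, 31 in base-36 and 27 in base-62, so a denser alphabet saves only a handful of characters, and base-62 would make the URLs case-sensitive.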


>
> None of this is very attractive to me, for reasons I will go into on
> Mailman-Developers or gitlab.com/mailman/mailman/issues if you want to
> file one.  Briefly, people who want bit.ly-length short URLs won't be
> satisfied, and the proposed URLs are more useful but still ugly.


We shouldn't just drop extra date information from the URL and only lookup
by the messageid unless we add a redirect to the correct dated URL; because
caching and trickery.


>
> Personally I think we should all just switch to RestructuredText- and
> Markdown-capable MUAs, and kill off both ugly visible URLs and HTML
> email with one big ol' rock.


While I personally prefer .rst and .md, hovering over URL anchor text takes
unnecessary time (and I'll remember whether I've been to the actual
http://URL, but not 'here' and 'there').
So I'm fine with ridiculous, preposterously long links (even in the middle of
the email, without footnotes to scroll back and forth to).


>
> Steve
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/N62MYXZCUHPDTEQWWTH33BU2FIZ3ROPF/