[Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-24 Thread Serhiy Storchaka
I have submitted a number of patches that fix the currently broken 
Unicode-disabled build of Python 2.7 (built with the --disable-unicode 
configure option). I suppose it was broken in 2.7 when the C 
implementation of the io module was introduced.


http://bugs.python.org/issue21833 -- main patch which fixes the io 
module and adds helpers for testing.


http://bugs.python.org/issue21834 -- a lot of minor fixes for tests.

The following issues fix individual modules and their tests:

http://bugs.python.org/issue21854 -- cookielib
http://bugs.python.org/issue21838 -- ctypes
http://bugs.python.org/issue21855 -- decimal
http://bugs.python.org/issue21839 -- distutils
http://bugs.python.org/issue21843 -- doctest
http://bugs.python.org/issue21851 -- gettext
http://bugs.python.org/issue21844 -- HTMLParser
http://bugs.python.org/issue21850 -- httplib and SimpleHTTPServer
http://bugs.python.org/issue21842 -- IDLE
http://bugs.python.org/issue21853 -- inspect
http://bugs.python.org/issue21848 -- logging
http://bugs.python.org/issue21849 -- multiprocessing
http://bugs.python.org/issue21852 -- optparse
http://bugs.python.org/issue21840 -- os.path
http://bugs.python.org/issue21845 -- plistlib
http://bugs.python.org/issue21836 -- sqlite3
http://bugs.python.org/issue21837 -- tarfile
http://bugs.python.org/issue21835 -- Tkinter
http://bugs.python.org/issue21847 -- xmlrpc
http://bugs.python.org/issue21841 -- xml.sax
http://bugs.python.org/issue21846 -- zipfile

Most fixes are trivial, only a few lines of code each.

___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-24 Thread Serhiy Storchaka

24.06.14 14:50, Victor Stinner wrote:

2014-06-24 13:04 GMT+02:00 Skip Montanaro :

I can't see any reason to make a backwards-incompatible change to
Python 2 to only support Unicode. You're bound to break somebody's
setup. Wouldn't it be better to fix bugs as Serhiy has done?


According to the long list of issues, I don't think that it's possible
to compile and use Python stdlib when Python is compiled without
Unicode support. So I'm not sure that we can say that it's a
backward-incompatible change.


Python has about 300 modules, and my patches fix about 30 of them (only 8 
of which cause compile errors). That is almost everything. What remains is 
pickle, json, etree, email and the Unicode-specific modules (codecs, 
unicodedata and encodings). Apart from pickle, I'm not sure that the 
others can be fixed.


The fact that only a small fraction of modules need fixes means that 
Python without Unicode support can be pretty usable.


The main problem was with testing itself. The test suite depends on 
tempfile, which now uses io.open, which didn't work without Unicode 
support (at least since 2.7).




Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka

25.06.14 00:03, Jim J. Jewett wrote:

It would be good to fix the tests (and actual library issues).
Unfortunately, some of the specifically proposed changes (such as
defining and using _unicode instead of unicode within python code)
look to me as though they would trigger problems in the normal build
(where the unicode object *does* exist, but would no longer be used).


This idiom is recommended by MvL [1] and widely used (19 times in the 
source code).


[1] http://bugs.python.org/issue8767#msg159473
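
For readers who have not seen it, the idiom under discussion looks roughly like the sketch below. This is an assumption about its general shape, not a copy of any particular stdlib module; the placeholder class and the is_text helper are hypothetical names used only for this example.

```python
# Sketch of the "_unicode" fallback idiom (the placeholder class is an assumption).
try:
    _unicode = unicode          # normal Python 2 build: the builtin exists
except NameError:
    # Unicode-disabled Python 2 build (or Python 3): no "unicode" builtin.
    class _unicode(object):
        """Placeholder type that no real object is an instance of."""


def is_text(obj):
    # Hypothetical helper: the check is simply False when unicode is absent.
    return isinstance(obj, _unicode)


print(is_text(b"abc"))
```

In the normal build the name _unicode is just an alias for unicode, so isinstance checks behave exactly as before.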


Other changes, such as the use of \x escapes, appear correct, but make
the tests harder to read -- and might end up removing a test for
correct unicode functionality across different spellings.





Even if we assume that the tests are fine, and I'm just an idiot who
misread them, the fact that there is any confusion means that these
particular changes may be tricky enough to be a bad tradeoff for 2.7.

It *might* work if you could make a more focused change.  For example,
instead of leaving the 'unicode' name unbound, provide an object that
simply returns false for isinstance and raises a UnicodeError for any
other method call.  Even *this* might be too aggressive to 2.7, but the
fact that it would only appear in the --disable-unicode builds, and
would make them more similar to the regular build are points in its
favor.


No, the existing code uses a different approach: "unicode" doesn't exist, 
while the encode/decode methods exist but are useless. If my memory 
doesn't fail me, there is even a special explanatory comment about this 
historical decision somewhere. The decision was made many years ago.



Before doing that, though, please document what the --disable-unicode
mode is actually *supposed* to do when interacting with byte-streams
that a standard defines as UTF-8.  (For example, are the changes to
_xml_dumps and _xml_loads at
 http://bugs.python.org/file35758/multiprocessing.patch
correct, or do those functions assume they get bytes as input, or
should the functions raise an exception any time they are called?)


Looking more carefully, I see that there is a bug in the Unicode-enabled 
build (an incorrect backport from 3.x). In 2.x xmlrpclib.dumps already 
produces a UTF-8 encoded string, while in 3.x xmlrpc.client.dumps produces 
a unicode string. multiprocessing should fail with a non-ASCII str or 
unicode.


A side benefit of my patches is that they expose existing errors in the 
Unicode-enabled build.




Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka

24.06.14 22:54, Ned Deily wrote:

Benefit:
- Fixes documented feature that may be of benefit to users of Python in
applications with very limited memory available, although there aren't
any open issues from users requesting this (AFAIK).  No benefit to the
overwhelming majority of Python users, who only use Unicode-enabled
builds.


Other benefit: patches exposed several bugs in code (mainly errors in 
backporting from 3.x).





Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-25 Thread Serhiy Storchaka

25.06.14 16:29, Victor Stinner wrote:

2014-06-25 14:58 GMT+02:00 Serhiy Storchaka :

Other benefit: patches exposed several bugs in code (mainly errors in
backporting from 3.x).


Oh, interesting. Do you have examples of such bugs?


In posixpath the branches for unicode and str should be swapped.
In multiprocessing, .encode('utf-8') is applied to an already UTF-8 
encoded str (in Python 3 this is a unicode string), and there is a similar 
error in at least one other place. The tests for bytearray actually test 
bytes, not bytearray. That is what I remember.
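
To make the multiprocessing case concrete, here is a small Python 3 simulation of what happens in Python 2 when .encode('utf-8') is called on a str that is already UTF-8 encoded. It is only a sketch: the implicit ASCII decoding step that Python 2 performs is written out explicitly, and py2_style_encode is a hypothetical name.

```python
# Python 3 simulation of the Python 2 behaviour described above.
encoded = "caf\u00e9".encode("utf-8")      # already UTF-8 encoded bytes


def py2_style_encode(data):
    # Python 2's str.encode() first decodes implicitly with the default
    # (normally ASCII) codec; that hidden step is spelled out here.
    return data.decode("ascii").encode("utf-8")


print(py2_style_encode(b"cafe"))           # pure ASCII passes through unchanged
try:
    py2_style_encode(encoded)
except UnicodeDecodeError as exc:
    print("non-ASCII input fails:", exc.reason)
```

This is why such a bug can stay hidden for a long time: it only shows up with non-ASCII data.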




Re: [Python-Dev] Fix Unicode-disabled build of Python 2.7

2014-06-26 Thread Serhiy Storchaka

26.06.14 02:28, Nick Coghlan wrote:

OK, *that* sounds like an excellent reason to keep the Unicode disabled
builds functional, and make sure they stay that way with a buildbot: to
help make sure we're not accidentally running afoul of the implicit
interoperability between str and unicode when backporting fixes from
Python 3.

Helping to ensure correct handling of str values makes this capability
something of benefit to *all* Python 2 users, not just those that turn
off the Unicode support. It also makes it a potentially useful testing
tool when assessing str/unicode handling in general.


Would you like to review some of the patches?




Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-29 Thread Serhiy Storchaka

30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381



Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-29 Thread Serhiy Storchaka

30.07.14 06:59, Serhiy Storchaka wrote:

30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?

[1] http://bugs.python.org/issue15381


Using microbenchmark from issue22003:

$ cat i.py
import io
word = b'word'
line = (word * int(79/len(word))) + b'\n'
ar = line * int((4 * 1048576) / len(line))
def readlines():
    return len(list(io.BytesIO(ar)))
print('lines: %s' % (readlines(),))
$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop




Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-30 Thread Serhiy Storchaka

30.07.14 16:59, Antoine Pitrou wrote:


On 30/07/2014 02:11, Serhiy Storchaka wrote:

30.07.14 06:59, Serhiy Storchaka wrote:

30.07.14 02:45, antoine.pitrou wrote:

http://hg.python.org/cpython/rev/79a5fbe2c78f
changeset:   91935:79a5fbe2c78f
parent:  91933:fbd104359ef8
user:Antoine Pitrou 
date:Tue Jul 29 19:41:11 2014 -0400
summary:
   Issue #22003: When initialized from a bytes object, io.BytesIO() now
defers making a copy until it is mutated, improving performance and
memory use on some use cases.

Patch by David Wilson.


Did you compare this with issue #15381 [1]?


Not really, but David's patch is simple enough and does a good job of
accelerating the read-only BytesIO case.


Ignoring tests and comments, my patch adds/removes/modifies about 200 
lines, and David's patch about 150 lines of code. But its __sizeof__ 
looks incorrect, and correcting it requires changing about 50 lines. In 
sum, the complexity of both patches is about equal.



$ ./python -m timeit -s 'import i' 'i.readlines()'

Before patch: 10 loops, best of 3: 46.9 msec per loop
After issue22003 patch: 10 loops, best of 3: 36.4 msec per loop
After issue15381 patch: 10 loops, best of 3: 27.6 msec per loop


I'm surprised your patch does better here. Any idea why?


I haven't looked at David's patch closely yet. But my patch includes an 
optimization for end-of-line scanning.





Re: [Python-Dev] cpython: Issue #22003: When initialized from a bytes object, io.BytesIO() now

2014-07-31 Thread Serhiy Storchaka

31.07.14 00:23, Antoine Pitrou wrote:

On 30/07/2014 15:48, Serhiy Storchaka wrote:
I meant that David's approach is conceptually simpler, which makes it
easier to review.
Regardless, there is no exclusive-OR here: if you can improve over the
current version, there's no reason not to consider it.


Unfortunately the implementations have nothing in common. 
Conceptually, in his last patch David arrived at the same idea as in 
issue15381, but with a different and less general implementation. To apply 
my patch you first need to roll back the issue22003 changes (except the 
tests).




[Python-Dev] Documenting enum types

2014-08-13 Thread Serhiy Storchaka
Should the new enum types recently added to collect module constants be 
documented at all? For example, AddressFamily is absent from 
socket.__all__ [1].


[1] http://bugs.python.org/issue20689



Re: [Python-Dev] PEP 467: Minor API improvements for bytes & bytearray

2014-08-15 Thread Serhiy Storchaka

15.08.14 08:50, Nick Coghlan wrote:

* add bytes.zeros() and bytearray.zeros() as a replacement


b'\0' * n and bytearray(b'\0') * n look like good replacements to me. No 
need to learn a new method. And they work right now.
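
A quick check of the two spellings in current Python 3, as a minimal sketch (the variable names are only illustrative):

```python
n = 4

# Repetition of a one-byte sequence gives n zero bytes.
zeros_bytes = b'\0' * n
zeros_array = bytearray(b'\0') * n

# Both match what the integer-argument constructors produce today.
print(zeros_bytes == bytes(n), zeros_array == bytearray(n))
print(type(zeros_bytes).__name__, type(zeros_array).__name__)
```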



* add bytes.iterbytes(), bytearray.iterbytes() and memoryview.iterbytes()


What are the use cases for this? I suppose the main use case may be 
writing code compatible with both 2.7 and 3.x. But in that case you need 
a wrapper (because these types have no iterbytes() method in 2.7). And 
how large would the advantage of this method be over 
``map(bytes.byte, data)``?
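
Such a wrapper could be as small as the sketch below. The function name is borrowed from the proposal, but this is a hypothetical stand-alone helper, shown here for bytes input only; it relies on the fact that slicing a bytes object yields bytes on both 2.7 and 3.x.

```python
def iterbytes(data):
    """Yield each byte of a bytes object as a length-1 bytes object."""
    for i in range(len(data)):
        # Slicing (unlike indexing in 3.x) always returns a bytes object.
        yield data[i:i + 1]


print(list(iterbytes(b'abc')))
```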





[Python-Dev] "embedded NUL character" exceptions

2014-08-17 Thread Serhiy Storchaka
Currently most functions that accept a string argument which is then 
passed to a C function as a NUL-terminated string reject strings with an 
embedded NUL character and raise TypeError. ValueError looks more 
appropriate here, because the argument type is correct (str) and only its 
value is wrong. But this is a backward-incompatible change.


I think that we should get rid of this legacy inconsistency sooner or 
later. Why not fix it right now? I have opened an issue on the tracker 
[1], but it requires a broader discussion.


[1] http://bugs.python.org/issue22215
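
A small probe makes it easy to see what a given interpreter does. This is only a sketch: nul_error_type is a hypothetical helper, and which of the two exceptions is reported depends on the Python version and on the function being called.

```python
import os


def nul_error_type(func, arg):
    """Return the name of the exception raised for an embedded NUL, if any."""
    try:
        func(arg)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__
    return None


# Both calls hand the string to C code that expects a NUL-terminated string.
print("os.stat:", nul_error_type(os.stat, "a\0b"))
print("open:   ", nul_error_type(open, "a\0b"))
```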



[Python-Dev] Bytes path support

2014-08-19 Thread Serhiy Storchaka
The builtin open(), io classes, os and os.path functions and some other 
functions in the stdlib support bytes paths as well as str paths. But 
many functions don't. There are requests to add this support 
([1], [2]) to some modules. It is easy (just call os.fsdecode() on the 
argument) but I'm not sure it is worth doing. Pathlib doesn't support 
bytes paths, and this looks intentional. What is the general policy on 
supporting bytes paths in the stdlib?


[1] http://bugs.python.org/issue19997
[2] http://bugs.python.org/issue20797
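
The "easy" change mentioned above would amount to something like this sketch, where normalize_path is a hypothetical helper and not an existing stdlib function:

```python
import os


def normalize_path(path):
    """Accept a str or bytes path and always return str."""
    # os.fsdecode() passes str through and decodes bytes with the
    # filesystem encoding and its error handler.
    return os.fsdecode(path)


print(normalize_path(b"archive.zip"), normalize_path("archive.zip"))
```

The open question in the post is not whether this works, but whether every path-taking function should grow such a call.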



Re: [Python-Dev] Bytes path support

2014-08-19 Thread Serhiy Storchaka

19.08.14 20:02, Guido van Rossum wrote:

The official policy is that we want them to go away, but reality so far
has not budged. We will continue to hold our breath though. :-)


Does this mean that we should reject all proposals to add bytes path 
support to existing functions (in particular issue19997 (imghdr) and 
issue20797 (zipfile))?





Re: [Python-Dev] Critical bash vulnerability CVE-2014-6271 may affect Python on *n*x and OSX

2014-09-26 Thread Serhiy Storchaka

On 26.09.14 01:17, Antoine Pitrou wrote:

Fortunately, Python's subprocess has its `shell` argument default to
False. However, `os.system` invokes the shell implicitly and is
therefore a possible attack vector.


Fortunately dash (which is used as /bin/sh in Debian and Ubuntu) is not 
vulnerable.


$ x='() { :;}; echo gotcha'  ./python -c 'import os; os.system("echo do something useful")'

do something useful




Re: [Python-Dev] bytes-like objects

2014-10-05 Thread Serhiy Storchaka

On 06.10.14 00:24, Greg Ewing wrote:

anatoly techtonik wrote:

That's a cool stuff. `bytes-like object` is really a much better name
for users.


I'm not so sure. Usually when we talk about an "xxx-like object" we
mean one that supports a certain Python interface, e.g. a "file-like
object" is one that has read() and/or write() methods. But you can't
create an object that supports the buffer protocol by implementing
Python methods.

I'm worried that using the term "bytes-like object" will lead
people to ask "What methods do I have to implement to make my
object bytes-like?", to which the answer is "mu".


Other (rarely used) alternatives are "buffer-like object" and 
"buffer-compatible object".

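
The distinction being discussed can be seen with memoryview(), which only accepts objects that expose a buffer. The sketch below reflects the situation described in the thread; FakeBytes and supports_buffer are hypothetical names, and newer Python versions may offer additional ways to expose a buffer.

```python
import array


class FakeBytes:
    """Looks sequence-like from Python, but exposes no buffer."""

    def __init__(self, data):
        self._data = data

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[index]


def supports_buffer(obj):
    # memoryview() raises TypeError for objects without buffer support.
    try:
        memoryview(obj)
    except TypeError:
        return False
    return True


for obj in (b"abc", bytearray(b"abc"), array.array("B", [1, 2]), FakeBytes(b"abc")):
    print(type(obj).__name__, supports_buffer(obj))
```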



Re: [Python-Dev] PEP 479: Change StopIteration handling inside generators

2014-11-20 Thread Serhiy Storchaka

On 20.11.14 21:58, Antoine Pitrou wrote:

To me "generator_return" sounds like the addition to generator syntax
allowing for return statements (which was done as part of the "yield
from" PEP). How about "generate_escape"?


Or maybe "generator_stop_iteration"?




Re: [Python-Dev] Email from Rietveld Code Review Tool is classified as spam

2014-12-24 Thread Serhiy Storchaka

On 25.12.14 05:56, Sky Kok wrote:

Anyway, sometimes when people review my patches for CPython, they send
me a notice through Rietveld Code Review Tool which later will send an
email to me. However, my GMail spam filter is aggressive so the email
will always be classified as spam because it fails spf checking. So if
Taylor Swift clicks 'send email' in Rietveld after reviewing my patch,
Rietveld will send email to me but the email pretends as if it is sent
from tay...@swift.com. Hence, failing spf checking.

Take an example where R. David Murray commented on my patch, I
wouldn't know about it if I did not click Spam folder out of the blue.
I remember in the past I had ignored Serhiy Storchaka's advice for
months because his message was buried in spam folder.

Maybe we shouldn't pretend as someone else when sending email through Rietveld?


http://psf.upfronthosting.co.za/roundup/meta/issue554




Re: [Python-Dev] More compact dictionaries with faster iteration

2014-12-31 Thread Serhiy Storchaka

On 10.12.12 03:44, Raymond Hettinger wrote:

The current memory layout for dictionaries is
unnecessarily inefficient.  It has a sparse table of
24-byte entries containing the hash value, key pointer,
and value pointer.

Instead, the 24-byte entries should be stored in a
dense table referenced by a sparse table of indices.


FYI, PHP 7 will use this technique [1]. In conjunction with other 
optimizations this will decrease the memory consumption of PHP hashtables 
by up to 4 times.


[1] http://nikic.github.io/2014/12/22/PHPs-new-hashtable-implementation.html
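
For readers unfamiliar with the layout, here is a toy Python sketch of the idea: a sparse table that stores only small indices, pointing into a dense list of (hash, key, value) entries. It is an illustration under simplifying assumptions (linear probing, no deletion) and not the actual CPython or PHP implementation; CompactDict is a hypothetical name.

```python
class CompactDict:
    """Toy insert/lookup-only mapping: sparse index table + dense entries."""

    def __init__(self, size=8):
        self.indices = [None] * size   # sparse: slot -> position in entries
        self.entries = []              # dense: (hash, key, value), insertion order

    def _probe(self, key, h):
        # Linear probing: return (slot, position) for key, or the first
        # free slot with position None if the key is absent.
        mask = len(self.indices) - 1   # table size is always a power of two
        i = h & mask
        while True:
            pos = self.indices[i]
            if pos is None:
                return i, None
            entry_hash, entry_key, _ = self.entries[pos]
            if entry_hash == h and entry_key == key:
                return i, pos
            i = (i + 1) & mask

    def __setitem__(self, key, value):
        h = hash(key)
        slot, pos = self._probe(key, h)
        if pos is None:
            self.indices[slot] = len(self.entries)
            self.entries.append((h, key, value))
            if len(self.entries) * 3 >= len(self.indices) * 2:
                self._resize()
        else:
            self.entries[pos] = (h, key, value)

    def __getitem__(self, key):
        _, pos = self._probe(key, hash(key))
        if pos is None:
            raise KeyError(key)
        return self.entries[pos][2]

    def _resize(self):
        # Only the small index table is rebuilt; entries stay in place.
        self.indices = [None] * (len(self.indices) * 2)
        for pos, (h, key, _) in enumerate(self.entries):
            slot, _ = self._probe(key, h)
            self.indices[slot] = pos

    def keys(self):
        # Iteration walks the dense list, skipping no empty slots.
        return [key for _, key, _ in self.entries]


d = CompactDict()
for number, letter in enumerate("abcdefghij"):
    d[letter] = number
print(d["c"], d.keys())
```

Iteration only touches the dense list, which is where the "faster iteration" in the subject line comes from.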


Re: [Python-Dev] Any grammar experts?

2015-01-26 Thread Serhiy Storchaka

On 25.01.15 17:08, Antoine Pitrou wrote:

On Sat, 24 Jan 2015 21:10:51 -0500
Neil Girdhar  wrote:

To finish PEP 448, I need to update the grammar for syntax such as
{**x for x in it}

Is this seriously allowed by the PEP? What does it mean exactly?


I would understand this as

   {k: v for x in it for k, v in x.items()}




Re: [Python-Dev] Any grammar experts?

2015-01-26 Thread Serhiy Storchaka

On 26.01.15 00:59, Guido van Rossum wrote:

Interestingly, the non-dict versions can all be written today using a
double-nested comprehension, e.g. {**x for x in it} can be written as:

 {x for x in xs for xs in it}


 {x for xs in it for x in xs}


But it's not so straightforward for dict comprehensions -- you'd have to
switch to calling dict():

 dict(x for x in xs for xs in it)


 {k: v for xs in it for k, v in xs.items()}

So this is actually just syntactic sugar.
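
A runnable check of the two nested comprehensions, with made-up sample data. Note that the for clauses are written in the same order as the equivalent nested loops, outer loop first.

```python
it = [{'a': 1, 'b': 2}, {'b': 3, 'c': 4}]

# Set version: iterating a dict yields its keys.
keys = {x for xs in it for x in xs}

# Dict version: later mappings override earlier ones for duplicate keys.
merged = {k: v for xs in it for k, v in xs.items()}

print(sorted(keys), merged)
```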




Re: [Python-Dev] cpython: Mirco-optimizations to reduce register spills and reloads observed on CLANG and

2015-02-09 Thread Serhiy Storchaka

On 09.02.15 14:48, raymond.hettinger wrote:

https://hg.python.org/cpython/rev/dc820b44ce21
changeset:   94572:dc820b44ce21
user:Raymond Hettinger 
date:Mon Feb 09 06:48:29 2015 -0600
summary:
   Mirco-optimizations to reduce register spills and reloads observed on CLANG 
and GCC.

files:
   Objects/setobject.c |  6 --
   1 files changed, 4 insertions(+), 2 deletions(-)


diff --git a/Objects/setobject.c b/Objects/setobject.c
--- a/Objects/setobject.c
+++ b/Objects/setobject.c
@@ -84,8 +84,9 @@
  return set_lookkey(so, key, hash);
  if (cmp > 0)  /* likely */
  return entry;
+mask = so->mask; /* help avoid a register spill */


Could you please explain in more detail what this line does? The mask 
variable is actually constant, and so->mask isn't changed in this loop.





Re: [Python-Dev] (no subject)

2015-02-10 Thread Serhiy Storchaka

On 10.02.15 04:06, Ethan Furman wrote:

 return func(*(args + fargs), **{**keywords, **fkeywords})


We don't use [*args, *fargs] for concatenating lists, but args + fargs. 
Why not use "+" or "|" operators for merging dicts?
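
For comparison, here are the spellings side by side with made-up data. The dict "|" operator did not exist when this was written; it became available later (Python 3.9 and newer), so the last line of the sketch assumes such an interpreter.

```python
args, fargs = [1, 2], [3]
keywords, fkeywords = {'a': 1, 'b': 2}, {'b': 20, 'c': 30}

# Lists: "+" and unpacking give the same result.
print(args + fargs == [*args, *fargs])

# Dicts: unpacking, with the right-hand mapping winning on duplicate keys.
merged = {**keywords, **fkeywords}

# Dicts: operator spelling, requires Python 3.9+.
merged_op = keywords | fkeywords

print(merged, merged == merged_op)
```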





Re: [Python-Dev] PEP 471 (scandir): Poll to choose the implementation (full C or C+Python)

2015-02-13 Thread Serhiy Storchaka

On 13.02.15 12:07, Victor Stinner wrote:

TL,DR: are you ok to add 800 lines of C code for os.scandir(), 4x
faster than os.listdir() when the file type is checked?


You could try to make the Python implementation faster by:

1) Not setting attributes to None in the constructor.

2) Implementing scandir as (with functools.partial):

def scandir(path):
    return map(partial(DirEntry, path), _scandir(path))

3) Or passing DirEntry to _scandir:

def scandir(path):
    yield from _scandir(path, DirEntry)




Re: [Python-Dev] PEP 471 (scandir): Poll to choose the implementation (full C or C+Python)

2015-02-13 Thread Serhiy Storchaka

On 13.02.15 12:07, Victor Stinner wrote:

* C implementation: scandir is at least 3.5x faster than listdir, up
to 44.6x faster on Windows


The results on Windows were obtained in a benchmark that doesn't drop 
disk caches and runs listdir before scandir.




Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka

On 13.02.15 05:41, Ethan Furman wrote:

So there are basically two choices:

1) always use the type of the most-base class when creating new instances

pros:
  - easy
  - speedy code
  - no possible tracebacks on new object instantiation

cons:
  - a subclass that needs/wants to maintain itself must override all
methods that create new instances, even if the only change is to
the type of object returned

2) always use the type of self when creating new instances

pros:
  - subclasses automatically maintain type
  - much less code in the simple cases [1]

cons:
  - if constructor signatures change, must override all methods which
create new objects


And switching to (2) would break existing code that uses subclasses 
whose constructors have a different signature (e.g. defaultdict).


The third choice is to use a different, specially designed constructor.

>>> class A(int):
...     def __add__(self, other):
...         return self.__make_me__(int(self) + int(other))
...     def __repr__(self):
...         return 'A(%d)' % self
...
>>> A.__make_me__ = A
>>> A(2) + 3
A(5)
>>> class B(A):
...     def __repr__(self):
...         return 'B(%d)' % self
...
>>> B.__make_me__ = B
>>> B(2) + 3
B(5)

We could add a special attribute, used for creating the results of 
operations, to all basic classes. By default it would be equal to the 
base class constructor.




Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka

On 14.02.15 03:12, Ethan Furman wrote:

The third choice is to use a different, specially designed constructor.

--> class A(int):
...     def __add__(self, other):
...         return self.__make_me__(int(self) + int(other))
...     def __repr__(self):
...         return 'A(%d)' % self


How would this help in the case of defaultdict?  __make_me__ is a class method, 
but it needs instance info to properly
create a new dict with the same default factory.


In the case of defaultdict (if dict were to have __add__ and the like), 
either __make_me__ == dict (then defaultdict's methods will return 
dicts) or it will be an instance method:


def __make_me__(self, other):
    return defaultdict(self.default_factory, other)
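
A runnable sketch of the instance-method variant follows. __make_me__ is the hypothetical hook from this thread, not an existing protocol, and merged_with is an invented operation that stands in for any method that has to build a new result object.

```python
from collections import defaultdict


class MergingDefaultDict(defaultdict):
    def __make_me__(self, other):
        # Instance method, so the default factory of self can be reused.
        return defaultdict(self.default_factory, other)

    def merged_with(self, other):
        # Invented operation that needs to create a new mapping.
        result = dict(self)
        result.update(other)
        return self.__make_me__(result)


d = MergingDefaultDict(list, a=[1])
m = d.merged_with({'b': [2]})
print(type(m).__name__, m.default_factory is list, m['missing'])
```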




Re: [Python-Dev] subclassing builtin data structures

2015-02-13 Thread Serhiy Storchaka

On 14.02.15 01:03, Neil Girdhar wrote:

Now the derived class knows who is asking for a copy.  In the case of
defaultdict, for example, he can implement __make_me__ as follows:

def __make_me__(self, cls, *args, **kwargs):
    if cls is dict:
        return default_dict(self.default_factory, *args, **kwargs)
    return default_dict(*args, **kwargs)

essentially the caller is identifying himself so that the receiver knows
how to interpret the arguments.


No, my idea was that __make_me__ has the same signature in all 
subclasses. It takes exactly one argument and creates an instance of a 
concrete class, so it never fails. If you want to create an instance of a 
different class in the derived class, you should explicitly override 
__make_me__.




Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-15 Thread Serhiy Storchaka

On 15.02.15 10:47, Paul Moore wrote:

On 15 February 2015 at 08:14, Paul Moore  wrote:

Maybe it would be better to
put something on PyPI and let it develop outside the stdlib first?


The only place where a ".pyz" file can't easily be manipulated with
the stdlib zipfile module is in writing a shebang line to the start of
the archive (i.e. adding "prefix" bytes before the start of the
zipfile data). It would be nice if the ZipFile class supported this
(because to do it properly you need access to the file object that the
ZipFile object wraps). Would it be reasonable to add methods to the
ZipFile class to read and write the prefix data?


But the stdlib zipfile module supports this.

with open(filename, 'wb') as f:
    f.write(shebang)
    with zipfile.PyZipFile(f, 'a') as zf:
        ...
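
A self-contained variant of the same approach, using an in-memory buffer instead of a real file so that it can be run anywhere. It relies on the documented behaviour that mode 'a' appends a new archive when the file does not already contain one; the shebang text and the member name are just sample values.

```python
import io
import zipfile

shebang = b"#!/usr/bin/env python3\n"

buf = io.BytesIO()
buf.write(shebang)                        # prefix bytes go in first
with zipfile.ZipFile(buf, "a") as zf:     # "a" appends after the existing data
    zf.writestr("__main__.py", "print('hello')\n")

data = buf.getvalue()

# The result still opens as a ZIP archive despite the leading prefix.
with zipfile.ZipFile(io.BytesIO(data)) as zf:
    names = zf.namelist()

print(data.startswith(shebang), names)
```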




Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-15 Thread Serhiy Storchaka

On 15.02.15 18:21, Thomas Wouters wrote:

which requires that extension modules are stored uncompressed (simple)
and page-aligned (harder, as the zipfile.ZipFile class doesn't directly
support page-aligning anything


It would be possible to add this feature to ZipFile. It could be useful 
because it would allow uncompressed files in a ZIP file to be mmapped.




Re: [Python-Dev] PEP 441 - Improving Python ZIP Application Support

2015-02-17 Thread Serhiy Storchaka

On 17.02.15 23:25, Barry Warsaw wrote:

I'm not sure sys.getfilesystemencoding() is the right encoding, rather than
sys.getdefaultencoding(), if you're talking about the encoding of the shebang
line rather than the encoding of the resulting pyz filename.


On POSIX, sys.getfilesystemencoding() is the right encoding because the 
shebang is read by the system loader, which doesn't encode or decode 
anything but uses the file name as a raw byte string. On Mac OS it is 
always UTF-8, while sys.getdefaultencoding() can be ASCII.
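
In code, that choice would look something like the following sketch. make_shebang is a hypothetical helper written for this example, not the function used by any particular tool.

```python
import sys


def make_shebang(interpreter):
    """Build the shebang line as raw bytes for the given interpreter path."""
    # The loader treats the path as bytes, so encode it the same way
    # file names are encoded on this system.
    encoding = sys.getfilesystemencoding()
    return b"#!" + interpreter.encode(encoding) + b"\n"


print(make_shebang("/usr/bin/env python3"))
```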




[Python-Dev] TypeError messages

2015-02-19 Thread Serhiy Storchaka

Different patterns for TypeError messages are used in the stdlib:

expected X, Y found
expected X, found Y
expected X, but Y found
expected X instance, Y found
X expected, not Y
expect X, not Y
need X, Y found
X is required, not Y
Z must be X, not Y
Z should be X, not Y

and more.

Which pattern is preferable?

Some messages use the article before X or Y. Should the article be used 
or omitted?


Some messages (only in C) truncate the actual type name (%.50s, %.80s, 
%.200s, %.500s). Should the type name be truncated at all, and if so to 
what limit? Type names are never truncated in TypeError messages raised 
in Python code.


Some messages enclose the actual type name in single quotes ('%s', 
'%.200s'). Should the type name be quoted? It is uncommon for a type 
name to contain spaces.
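
For concreteness, here is a small sketch of one of the listed patterns ("Z must be X, not Y") as it might be produced from Python code. The helper check_type is hypothetical and only illustrates the wording; it is not a recommendation of this particular pattern.

```python
def check_type(value, expected, name):
    """Raise TypeError in the "Z must be X, not Y" form."""
    if not isinstance(value, expected):
        # No article, no quotes, and no truncation of the type name.
        raise TypeError('%s must be %s, not %s'
                        % (name, expected.__name__, type(value).__name__))
    return value


try:
    check_type('42', int, 'count')
except TypeError as exc:
    print(exc)
```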




Re: [Python-Dev] TypeError messages

2015-02-21 Thread Serhiy Storchaka

On 20.02.15 01:57, MRAB wrote:

Messages tend not to be complete sentences anyway, so I think that it
would be fitting to omit articles.


Thanks. This post was prompted by your note about articles on the tracker.




Re: [Python-Dev] TypeError messages

2015-02-21 Thread Serhiy Storchaka

On 20.02.15 18:11, Eric V. Smith wrote:

I asked about this years ago, and was told it was in case the type name
pointer was bad, and to limit the amount of garbage printed. Whether
that's an actual problem or not, I can't say. It seems more likely that
you'd get a segfault, but maybe if it was pointing to reused memory it
could be useful.


Thank you. This makes sense and explains why type names are not 
truncated in Python code.




Re: [Python-Dev] TypeError messages

2015-02-21 Thread Serhiy Storchaka

On 21.02.15 10:03, Guido van Rossum wrote:

IIRC there's a limited buffer used for the formatting.


Currently the formatting buffer is not limited. But the other arguments are valid.



Re: [Python-Dev] TypeError messages

2015-02-21 Thread Serhiy Storchaka

On 21.02.15 16:26, Nick Coghlan wrote:

Likewise, although as Rob noted, it's sometimes desirable to use
the "Z should be X, not Y" form if the shorter form would be
ambiguous.


Z is not always available.


Perhaps this should be a recommendation in both PEP 7 & 8? It's
exactly the kind of issue where having a recommended way to do it can
save a fair bit of time considering the exact phrasing of error
messages, as well as improving the developer experience by providing
more consistent feedback on error details.


That would be great. I'm going to change the standard messages in 
PyArg_Parse* and common converting functions (such as PyLong_AsLong and 
PyObject_GetBuffer) to make them uniform.





[Python-Dev] Generate all Argument Clinic code into separate files

2015-02-21 Thread Serhiy Storchaka
Currently, for some files converted to use Argument Clinic the generated 
code is written into a separate file; for other files it is inlined.


I'm going to make a number of small enhancements to Argument Clinic to 
generate faster code for parsing arguments (see for example issue23492). 
Every such step will produce large diffs in the generated code and will 
create code churn if that code is inlined and mixed up with handwritten 
code. It would be better if only generated files were changed. So I 
suggest moving all inlined generated code into separate files. What do 
you think?




[Python-Dev] Prefixes and namespaces

2015-02-21 Thread Serhiy Storchaka

/* Namespaces are one honking great idea -- let's do more of those! */

There are two ways to avoid name conflicts: prefixes and namespaces. 
Programming languages that lack namespaces (such as C) need to use 
prefixes. For example: PROTOCOL_SSLv2, PROTOCOL_SSLv3, PROTOCOL_SSLv23. 
Python used the same prefixed names when exposing C constants as module 
level Python globals. But when converting integer constants to IntEnum, 
do we need to preserve the common prefix? Or maybe we can drop it, 
because the enum class name plays its role?


class Protocol(IntEnum):
    PROTOCOL_SSLv2 = ...
    PROTOCOL_SSLv3 = ...
    PROTOCOL_SSLv23 = ...

or

class Protocol(IntEnum):
    SSLv2 = ...
    SSLv3 = ...
    SSLv23 = ...

? Protocol.PROTOCOL_SSLv2 or Protocol.SSLv2?



Re: [Python-Dev] Prefixes and namespaces

2015-02-21 Thread Serhiy Storchaka

On 21.02.15 21:49, Ian Cordasco wrote:

So I like the latter (Protocol.SSLv2) but would qualify that with the
request that ssl.PROTOCOL_SSLv2 continue to work until Python 2 is
dead and libraries like requests, urllib3, httplib2, etc. no longer
need to support those versions.


Of course, ssl.PROTOCOL_SSLv2 will continue to work until Python 2.7 and 
3.4 are dead.
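
One way both spellings could coexist is sketched below: the enum uses the short names, and the old prefixed names remain as module-level aliases. The numeric values are placeholders for this example, not the real values used by the ssl module.

```python
from enum import IntEnum


class Protocol(IntEnum):
    # Placeholder values for this sketch only.
    SSLv2 = 0
    SSLv3 = 1
    SSLv23 = 2


# Old prefixed module-level names kept as aliases for existing code.
PROTOCOL_SSLv2 = Protocol.SSLv2
PROTOCOL_SSLv3 = Protocol.SSLv3
PROTOCOL_SSLv23 = Protocol.SSLv23

print(PROTOCOL_SSLv3 is Protocol.SSLv3)
```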




Re: [Python-Dev] cpython: Issue #23152: Renames attribute_data_to_stat to _Py_attribute_data_to_stat

2015-02-21 Thread Serhiy Storchaka

On 21.02.15 20:04, steve.dower wrote:

https://hg.python.org/cpython/rev/307713759a62
changeset:   94720:307713759a62
parent:      94718:4f6f4aa0d80f
user:        Steve Dower
date:        Sat Feb 21 10:04:10 2015 -0800
summary:
   Issue #23152: Renames attribute_data_to_stat to _Py_attribute_data_to_stat


What about time_t_to_FILE_TIME?




Re: [Python-Dev] cpython: Issue #23152: Renames attribute_data_to_stat to _Py_attribute_data_to_stat

2015-02-21 Thread Serhiy Storchaka

On 22.02.15 01:26, Steve Dower wrote:

Thanks! Fixed now.


Clinic removes the declaration of _Py_time_t_to_FILE_TIME.




Re: [Python-Dev] Generate all Argument Clinic code into separate files

2015-02-22 Thread Serhiy Storchaka

On 22.02.15 05:03, Nick Coghlan wrote:

On 22 February 2015 at 06:58, Brett Cannon  wrote:

+1 to moving to a separate file for all .c files. Might be painful now but
the long-term benefits are worth it.


Yeah, agreed. Originally I was a fan of having everything inline so I
could see them while I was working on the code, but I eventually
changed my mind to favour making it a clearer build step with a
separate generated file. I suspect it was a matter of starting to
trust AC to do the right thing, so having it implicitly asking me to
check its work all the time ultimately became annoying rather than
reassuring :)


OK. Opened an issue: https://bugs.python.org/issue23501




Re: [Python-Dev] cpython: Issue #23152: Renames attribute_data_to_stat to _Py_attribute_data_to_stat

2015-02-22 Thread Serhiy Storchaka

On 22.02.15 16:12, Steve Dower wrote:

Why does it do that?


Because it is in the section of generated code.




[Python-Dev] Emit SyntaxWarning on unrecognized backslash escapes?

2015-02-23 Thread Serhiy Storchaka
See the topic "Unrecognized backslash escapes in string literals" on the 
Python list [1]. I agree that this is a problem, especially for novices 
(but even experienced users can make a typo). Maybe we should emit a 
SyntaxWarning on unrecognized backslash escapes? An exception is already 
raised on invalid octal or hexadecimal escapes. '\x' is a syntax error, 
not the two characters '\\' and 'x'.
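
The difference can be seen in a short sketch. The literals are evaluated from source text so that the example file itself contains no questionable escapes, and any warning a given Python version may emit for '\d' is suppressed. Behaviour for unrecognized escapes may differ in other versions than the ones discussed here.

```python
import warnings

with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    # An unrecognized escape: the backslash is kept in the result.
    unrecognized = eval(r"'\d'")

print(len(unrecognized))

# An incomplete hexadecimal escape is rejected at compile time.
try:
    compile(r"'\x'", '<example>', 'eval')
except SyntaxError as exc:
    print('SyntaxError:', exc.msg)
```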


[1] http://comments.gmane.org/gmane.comp.python.general/772455



Re: [Python-Dev] Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

2015-02-23 Thread Serhiy Storchaka

On 23.02.15 21:22, Ethan Furman wrote:

This could be a completely stupid question, but how does the zip file know 
where the individual files are?  More to the
point, does the index work via relative or absolute offset?  If absolute, 
wouldn't the index have to be rewritten if the
zip portion of the file moves?


Absolute.
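
To make the index concrete: the central directory at the end of the archive records, for each member, the offset of that member's local file header. A minimal sketch that prints the recorded offsets through ZipInfo.header_offset for an archive built in memory (the member names are arbitrary):

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as zf:
    zf.writestr('a.txt', b'first')
    zf.writestr('b.txt', b'second')

raw = buf.getvalue()
with zipfile.ZipFile(io.BytesIO(raw)) as zf:
    offsets = {info.filename: info.header_offset for info in zf.infolist()}

for name, offset in offsets.items():
    # Each recorded offset points at a local file header signature.
    print(name, offset, raw[offset:offset + 4])
```

How a reader treats an archive whose zip portion has been moved is not covered by this sketch.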




Re: [Python-Dev] Request for Pronouncement: PEP 441 - Improving Python ZIP Application Support

2015-02-25 Thread Serhiy Storchaka

On 24.02.15 21:01, Guido van Rossum wrote:

On Tue, Feb 24, 2015 at 10:50 AM, Paul Moore wrote:

On 24 February 2015 at 18:24, Guido van Rossum wrote:
> I'd specify that when the output argument is a file open for writing, it is
> the caller's responsibility to close the file.  Also, can the file be a
> pipe?  (I.e. are we using seek()/tell() or not?)  And what about the input
> archive?  Can that be a file open for reading?

I'll clarify all of these points. They are mostly "it can be whatever
the zipfile module accepts", though, which isn't explicitly stated
itself :-(


See issue23252.

https://bugs.python.org/issue23252




[Python-Dev] Make review

2015-03-05 Thread Serhiy Storchaka
If you have patches that are ready and waiting for review and commit, 
tell me. Send me privately no more than 5 links to issues per person (for 
a start) and I'll try to review them if I'm familiar with the affected 
modules or areas.




Re: [Python-Dev] subprocess, buffered files, pipes and broken pipe errors

2015-03-06 Thread Serhiy Storchaka

On 06.03.15 14:53, Victor Stinner wrote:

I propose to ignore BrokenPipeError in Popen.__exit__, as done in
communicate(), for convenience:
http://bugs.python.org/issue23570

Serhiy wants to keep BrokenPipeError, he wrote that file.close()
should not ignore write errors (read the issue for details).


I was rather talking about file.__exit__.


I consider that BrokenPipeError on a pipe is different than a write
error on a regular file.

EPIPE and SIGPIPE are designed to notify that the pipe is closed and
that it's now inefficient to continue to write into this pipe.


And into the file like open('/dev/stdout', 'wb').


Ignoring BrokenPipeError in Popen.__exit__() respects this constraint
because the method closes stdin and only returns when the process
exited. So the caller will not write anything into stdin anymore.


And the caller will not write anything into the file after calling 
file.__exit__.


I don't see a big difference between open('file', 'wb') and 
Popen('cat >file', stdin=PIPE), or between sys.stdout with stdout 
redirected and running an external program with Pipe().




Re: [Python-Dev] boxing and unboxing data types

2015-03-08 Thread Serhiy Storchaka

On 09.03.15 06:33, Ethan Furman wrote:

I guess it could boil down to:  if IntEnum was not based on 'int', but instead 
had the __int__ and __index__ methods
(plus all the other __xxx__ methods that int has), would it still be a drop-in 
replacement for actual ints?  Even when
being used to talk to non-Python libs?


If you don't call isinstance(x, int) (PyLong_Check* in C).

Most conversions from Python to C implicitly call __index__ or __int__, 
but unfortunately not all.


>>> float(Thin(42))
42.0
>>> float(Wrap(42))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number, not 'Wrap'

>>> '%*s' % (Thin(5), 'x')
'    x'
>>> '%*s' % (Wrap(5), 'x')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: * wants int

>>> OSError(Thin(2), 'No such file or directory')
FileNotFoundError(2, 'No such file or directory')
>>> OSError(Wrap(2), 'No such file or directory')
OSError(<__main__.Wrap object at 0xb6fe81ac>, 'No such file or directory')

>>> re.match('(x)', 'x').group(Thin(1))
'x'
>>> re.match('(x)', 'x').group(Wrap(1))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: no such group

And to be an ideal drop-in replacement, IntEnum would have to override 
such methods as __eq__ and __hash__ (so it could be used as a mapping 
key). If all methods have to be overridden to quack like an int, why not 
just use an int?
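
The definitions of Thin and Wrap are not shown in the message. A plausible reading, assumed here, is that Thin is a plain int subclass while Wrap holds an int and exposes it only through __int__ and __index__. Under that assumption the sketch below shows the parts of the contrast that do not depend on the Python version; the exact results of the session above can differ between versions and implementations.

```python
import operator


class Thin(int):
    """An int subclass: passes isinstance(x, int) checks."""


class Wrap:
    """Holds an int and exposes it only through __int__ and __index__."""

    def __init__(self, value):
        self._value = value

    def __int__(self):
        return self._value

    def __index__(self):
        return self._value


print(isinstance(Thin(42), int), isinstance(Wrap(42), int))
# Code that goes through __index__ accepts both kinds of object.
print(operator.index(Wrap(1)), ['a', 'b', 'c'][Wrap(1)])
# Without __eq__ and __hash__, Wrap(1) does not match the key 1.
print({1: 'one'}.get(Thin(1)), {1: 'one'}.get(Wrap(1)))
```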





Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka

On 09.03.15 08:12, Ethan Furman wrote:

On 03/08/2015 11:07 PM, Serhiy Storchaka wrote:


If you don't call isinstance(x, int) (PyLong_Check* in C).

Most conversions from Python to C implicitly call __index__ or __int__, but 
unfortunately not all.


[snip examples]

Thanks, Serhiy, that's what I was looking for.


Maybe most if not all of these examples can be considered bugs and 
slowly fixed, but we can't control third-party code.





Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka

On 09.03.15 10:19, Maciej Fijalkowski wrote:

Not all your examples are good.

* float(x) calls __float__ (not __int__)

* re.group requires __eq__ (and __hash__)

* I'm unsure about OSError

* the % thing at the very least works on pypy


Yes, all these examples are implementation-defined and can differ 
between CPython and PyPy. There are about a dozen similar examples in 
the C part of CPython alone. What most of them have in common is that 
the behavior of the function depends on the argument type. For example, 
in the case of re.group the argument is either an integer index or a 
string group name. The OSError constructor can produce an OSError 
subtype if the first argument is a known integer errno. float either 
converts a number to float or parses a string (or bytes).


Python functions can be more lenient (if they allow duck typing) or more 
strict (if they explicitly check the type). They rarely call __index__ 
or __int__.
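
The type-dependent behaviour described above can be seen with ordinary ints and strings. This is only a sketch of the three functions mentioned; the group name "letter" is made up for the example.

```python
import errno
import re

# OSError picks a subclass when the first argument is a known errno value.
err = OSError(errno.ENOENT, 'No such file or directory')
print(type(err).__name__)

# Match.group() treats an int as a group index and a str as a group name.
m = re.match(r'(?P<letter>x)', 'x')
print(m.group(1), m.group('letter'))

# float() converts a number or parses a string.
print(float(3), float('3.5'))
```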




Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka
Monday, 09-Mar-2015 10:18:50 you wrote:
> On Mon, Mar 9, 2015 at 10:10 AM, Serhiy Storchaka wrote:
> > Monday, 09-Mar-2015 09:52:01 you wrote:
> > > On Mon, Mar 9, 2015 at 2:07 AM, Serhiy Storchaka wrote:
> > > > And to be ideal drop-in replacement IntEnum should override such methods
> > > > as __eq__ and __hash__ (so it could be used as mapping key). If all
> > > > methods should be overridden to quack as int, why not take an int?
> > > 
> > > You're absolutely right that if *all* the methods should be overridden to
> > > quack as int, then you should subclass int (the Liskov substitution
> > > principle).  But all methods should not be overridden — mainly the methods
> > > you overrode in your patch should be exposed.  Here is a list of methods on
> > > int that should not be on IntFlags in my opinion (give or take a couple):
> > > 
> > > __abs__, __add__, __delattr__, __divmod__, __float__, __floor__,
> > > __floordiv__, __index__, __lshift__, __mod__, __mul__, __pos__, __pow__,
> > > __radd__, __rdivmod__, __rfloordiv__, __rlshift__, __rmod__, __rmul__,
> > > __round__, __rpow__, __rrshift__, __rshift__, __rsub__, __rtruediv__,
> > > __sub__, __truediv__, __trunc__, conjugate, denominator, imag, numerator,
> > > real.
> > > 
> > > I don't think __index__ should be exposed either since are you really going
> > > to slice a list using IntFlags?  Really?
> > 
> > Definitely __index__ should be exposed. __int__ is for lossy conversion to int
> > (as in float). __index__ is for lossless conversion.
> 
> Is it?  __index__ promises lossless conversion, but __index__ is *for*
> indexing.

In spite of its name, it is for any lossless conversion.
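
A small sketch of the lossy/lossless distinction, using a float as the value that should not be silently converted:

```python
import operator

# int() accepts a float and silently drops the fractional part.
print(int(3.7))

# operator.index() only accepts objects that define __index__; float does
# not, so the lossy conversion is refused.
try:
    operator.index(3.7)
except TypeError as exc:
    print('TypeError:', exc)
```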

> > __add__ should be exposed because some code can use + instead of | for
> > combining flags. But it shouldn't preserve the type, because this is not
> > recommended way.
> 
> I think it should be blocked because it can lead to all kinds of weird
> bugs.  If the flag is already set and you add it a copy, it silently spills
> over into other flags.  This is a mistake that a good interface prevents.

I think this is a case where backward compatibility carries more weight.
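
The spill-over described in the quoted paragraph is easy to reproduce with plain integers. The flag names below are hypothetical.

```python
READ = 0b001
WRITE = 0b010
EXEC = 0b100

flags = READ

# OR-ing in a flag that is already set changes nothing.
print(flags | READ == READ)

# Adding it again carries into the next bit: READ + READ equals WRITE.
print(flags + READ == WRITE)
```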

> > For the same reason I think __lshift__, __rshift__, __sub__,
> > __mul__, __divmod__, __floordiv__, __mod__, etc should be exposed too. So the
> > majority of the methods should be exposed, and there is a risk that we lose
> > something.
> 
> I totally disagree with all of those.
> 
> > For good compatibility with Python code IntFlags should expose also
> > __subclasscheck__ or __subclasshook__. And when we are at this point, why not
> > use int subclass?
> 
> Here's another reason.  What if someone wants to use an IntFlags object,
> but wants to use a fixed width type for storage, say numpy.int32?   Why
> shouldn't they be able to do that?  By using composition, you can easily
> provide such an option.

You can design an abstract interface Flags that can be combined with int or 
another type. But why would you want to use numpy.int32 as storage? This 
doesn't save much memory, because with composition the IntFlags class weighs 
more than an int subclass.



Re: [Python-Dev] boxing and unboxing data types

2015-03-09 Thread Serhiy Storchaka

On 09.03.15 17:48, Neil Girdhar wrote:

So you agree that the ideal solution is composition, but you prefer
inheritance in order to not break code?


Yes, I agree. There are two advantages to inheritance: better backward 
compatibility and a simpler implementation.



Then, I think the big question
is how much code would actually break if you presented the ideal
interface.  I imagine that 99% of the code using flags only uses __or__
to compose and __and__, __invert__ to erase flags.


I don't know and don't want to guess. Let's just follow the way of bool 
and IntEnum. Once users have been encouraged to use IntEnum and IntFlags 
instead of plain ints, we could consider the idea of dropping the 
inheritance of bool, IntEnum and IntFlags from int. This is not for the 
near future.



> Here's another reason.  What if someone wants to use an IntFlags object,
> but wants to use a fixed width type for storage, say numpy.int32?   Why
> shouldn't they be able to do that?  By using composition, you can easily
> provide such an option.

You can design abstract interface Flags that can be combined with
int or other type. But why you want to use numpy.int32 as storage?
This doesn't save much memory, because with composition the IntFlags
class weighs more than int subclass.

Maybe you're storing a bunch of flags in a numpy array having dtype
np.int32?  It's contrived, I agree.


I'm afraid that composition will not help you with this. Can a numpy 
array pack int-like objects into a fixed-width integer array and then 
restore the original type on unboxing?





[Python-Dev] Not documented special methods

2015-03-11 Thread Serhiy Storchaka
There are many special names used in the Python core and the stdlib 
that are absent from the documentation index [1]. If you see names that 
are defined or used in a module or area you maintain, please add 
references to these names to the index and document them if needed.


I repeat the lists here.

Module level names used in pydoc:
__author__
__credits__
__date__
__version__

Module level name used in doctest:
__test__

Other module level names:
__about__               (heapq only)
__copyright__           (many modules)
__cvsid__               (tarfile only)
__docformat__           (doctest only)
__email__               (test_with and test_keywordonlyarg only)
__libmpdec_version__    (decimal only)
__status__              (logging only)


type attributes (mostly used in tests):
__abstractmethods__     (used in abc, functools)
__base__
__basicsize__
__dictoffset__
__flags__               (used in inspect, copyreg)
__itemsize__
__weakrefoffset__

super() attributes:
__self_class__
__thisclass__

Used in sqlite:
__adapt__
__conform__

Used in ctypes:
__ctype_be__
__ctype_le__
__ctypes_from_outparam__

Used in unittest:
__unittest_expecting_failure__
__unittest_skip__
__unittest_skip_why__

float methods, for testing:
__getformat__
__setformat__

Used in IDLE RPC:
__attributes__
__methods__

Others:
__alloc__               (bytearray method)
__args__                (used in bdb)
__build_class__         (builtins function, used in eval loop)
__builtins__            (module attribute)
__decimal_context__     (used in decimal)
__exception__           (used in pdb)
__getinitargs__         (used in pickle, datetime)
__initializing__        (used in importlib)
__isabstractmethod__    (function/method/descriptor attribute,
                         used in abc, functools, types)
__ltrace__              (used in eval loop, never set)
__members__             (Enum attribute, used in many modules)
__mp_main__             (used in multiprocessing)
__new_member__          (Enum attribute, used in enum internally)
__newobj__              (copyreg function,
                         used in pickle, object.__reduce_ex__)
__newobj_ex__           (copyreg function,
                         used in pickle, object.__reduce_ex__)
__objclass__            (descriptor/enum attribute, used in
                         inspect, pydoc, doctest, multiprocessing)
__prepare__             (metaclass method,
                         used in builtins.__build_class__, types)
__pycache__             (cache directory name)
__return__              (used in pdb)
__signature__           (used in inspect, never set)
__sizeof__              (standard method, used in sys.getsizeof)
__slotnames__           (used in object.__getstate__ for caching)
__text_signature__      (function/method/descriptor attribute,
                         used in inspect)
__trunc__               (used in math.trunc, int, etc)
__warningregistry__     (used in warnings)
__weakref__             (used in weakref)
__wrapped__             (used in inspect, functools, contextlib,
                         asyncio)
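
For readers unfamiliar with some of these names, here is a short sketch that touches a few of them. All classes and functions in it (traced, greet, Color, Base, Child) are invented for the example; it only shows where the attributes appear, not how they should be documented.

```python
import enum
import functools


def traced(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@traced
def greet():
    return 'hello'


class Color(enum.Enum):
    RED = 1
    GREEN = 2


class Base:
    pass


class Child(Base):
    def bound_super(self):
        return super()


print(greet.__wrapped__.__name__)           # set by functools.wraps
print(list(Color.__members__))              # Enum member names
proxy = Child().bound_super()
print(proxy.__thisclass__.__name__, proxy.__self_class__.__name__)
print(int.__basicsize__, int.__itemsize__)  # values are build-dependent
```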


[1] http://bugs.python.org/issue23639



Re: [Python-Dev] backwards and forwards compatibility, the exact contents of pickle files, and IntEnum

2015-03-15 Thread Serhiy Storchaka

On 15.03.15 07:52, Ethan Furman wrote:

So how do we fix it?  There are a couple different options:

   - modify IntEnum pickle methods to return the name only

   - modify IntEnum pickle methods to pickle just like the int they represent

The first option has the advantage that in 3.4 and above, you'll get back the 
IntEnum version.

The second option has the advantage that the actual pickle contents are the 
same [1] as in previous versions.

So, the final question:  Do the contents of a pickle file at a certain protocol
have to be the same between versions, or is it enough if unpickling them
returns the correct object?


With the second option you lose the type even in 3.5+. This is a step back.
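
The loss of type can be sketched with any IntEnum from the stdlib. HTTPStatus is used below only because it is convenient; it is not one of the constants discussed in this thread. The first round trip pickles the enum member itself, the second emulates the "pickle just like the int" option by converting to int first.

```python
import pickle
from http import HTTPStatus  # a stdlib IntEnum, used only as an example

as_enum = pickle.loads(pickle.dumps(HTTPStatus.NOT_FOUND))
as_int = pickle.loads(pickle.dumps(int(HTTPStatus.NOT_FOUND)))

print(type(as_enum).__name__, type(as_int).__name__)
```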




[Python-Dev] Needed reviews

2015-03-19 Thread Serhiy Storchaka
Here is a list of my patches that are ready for review.  It is incomplete 
and contains only patches for which I don't expect objections or long 
discussion.  Most of them are relatively easy and need only a formal 
review.  Most of them have been waiting for review for many months.



https://bugs.python.org/issue23681
Have -b warn when directly comparing ints and bytes

https://bugs.python.org/issue23676
Add support of UnicodeTranslateError in standard error handlers

https://bugs.python.org/issue23671
string.Template doesn't work with the self keyword argument

https://bugs.python.org/issue23637
Outputting warnings fails when file path is not ASCII and message 
is unicode on 2.7.


https://bugs.python.org/issue23622
Deprecate unrecognized backslash+letter escapes in re

https://bugs.python.org/issue23611
Pickle nested names (e.g. unbound methods) with protocols < 4

https://bugs.python.org/issue23583
IDLE: printing unicode subclasses broken (again)

https://bugs.python.org/issue23573
Avoid redundant memory allocations in str.find and like

https://bugs.python.org/issue23509
Speed up Counter operators

https://bugs.python.org/issue23502
Tkinter doesn't support large integers (out of 32-bit range)

https://bugs.python.org/issue23488
Random objects twice as big as necessary on 64-bit builds

https://bugs.python.org/issue23466
PEP 461: Inconsistency between str and bytes formatting of integers

https://bugs.python.org/issue23419
Faster default __reduce__ for classes without __init__

https://bugs.python.org/issue23290
Faster set copying

https://bugs.python.org/issue23252
Add support of writing to unseekable file (e.g. socket) in zipfile

https://bugs.python.org/issue23502
pprint: added support for mapping proxy

https://bugs.python.org/issue23001
Accept mutable bytes-like objects in builtins that for now support 
only read-only bytes-like objects


https://bugs.python.org/issue22995
Restrict default pickleability. Fail earlier for some types instead 
of producing incorrect data.


https://bugs.python.org/issue22958
Constructors of weakref mapping classes don't accept "self" and 
"dict" keyword arguments


https://bugs.python.org/issue22831
Use "with" to avoid possible fd leaks. Large patch with many simple 
changes.


https://bugs.python.org/issue22826
Support context management protocol in bkfile and simplify 
Tools/freeze/bkfile.py


https://bugs.python.org/issue22721
pprint output for sets and dicts is not stable

https://bugs.python.org/issue22687
horrible performance of textwrap.wrap() with a long word

https://bugs.python.org/issue22682
Add support of KZ1048 (RK1048) encoding

https://bugs.python.org/issue22681
Add support of KOI8-T encoding

https://bugs.python.org/issue23171
Accept arbitrary iterables in csv.writerow()

https://bugs.python.org/issue23136
Fix inconsistency in handling week 0 in _strptime()

https://bugs.python.org/issue22557
Speed up local import

https://bugs.python.org/issue22493
Deprecate the use of flags not at the start of regular expression

https://bugs.python.org/issue22390
test.regrtest should complain if a test doesn't remove temporary files

https://bugs.python.org/issue22364
Improve some re error messages using regex for hints

https://bugs.python.org/issue22115
Add new methods to trace Tkinter variables

https://bugs.python.org/issue22035
Fatal error in dbm.gdbm

https://bugs.python.org/issue21802
Reader of BufferedRWPair is not closed if writer's close() fails

https://bugs.python.org/issue21859
Add Python implementation of FileIO

https://bugs.python.org/issue21717
Exclusive mode for ZipFile

https://bugs.python.org/issue21708
Deprecate nonstandard behavior of a dumbdbm database

https://bugs.python.org/issue21526
Support new booleans in Tkinter

https://bugs.python.org/issue20168
Derby: Convert the _tkinter module to use Argument Clinic

https://bugs.python.org/issue20159
Derby: Convert the ElementTree module to use Argument Clinic

https://bugs.python.org/issue20148
Derby: Convert the _sre module to use Argument Clinic

https://bugs.python.org/issue19930
os.makedirs('dir1/dir2', 0) always fails

https://bugs.python.org/issue18684
Pointers point out of array bound in _sre.c

https://bugs.python.org/issue18473
Some objects pickled by Python 3.x are not unpicklable in Python 2.x

https://bugs.python.org/issue17711
Persistent id in pickle with protocol version 0

https://bugs.python.org/issue17530
pprint could use line continuation for long bytes literals

https://bugs.python.org/issue16314
Support xz compression in distutils

https://bugs.python.org/issue15490
Correct __sizeof__ support for StringIO

https://bugs.python.org/issue15133
Make tkinter.getboolean() and BooleanVar.get() support Tcl_Obj and 
alway

[Python-Dev] How to document functions with optional positional parameters?

2015-03-20 Thread Serhiy Storchaka

How to document functions with optional positional parameters?

For example binascii.crc32(). It has two positional parameters: one is 
mandatory, and one is optional with default value 0. With Argument 
Clinic its signature is crc32(data, crc=0, /). In the documentation it 
is written as crc32(data[, crc]) (with a note that the default value of 
the second parameter is 0). Neither is valid Python syntax. Can the 
documentation be changed to crc32(data, crc=0)?
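
The behaviour that the signature has to describe can be checked directly. This sketch assumes CPython, where both parameters of binascii.crc32() are positional-only.

```python
import binascii

data = b'hello world'

# The second argument defaults to 0.
print(binascii.crc32(data) == binascii.crc32(data, 0))

# It can carry a running checksum across chunks.
running = binascii.crc32(b'hello ')
print(binascii.crc32(b'world', running) == binascii.crc32(data))

# Passing the second argument by keyword is rejected.
try:
    binascii.crc32(data, crc=0)
except TypeError as exc:
    print('TypeError:', exc)
```

This is why writing the signature as crc32(data, crc=0) without any marker could mislead readers into trying the keyword form.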


Related issues:

https://bugs.python.org/issue21488 (changed encode(obj, 
encoding='ascii', errors='strict') to encode(obj, [encoding[, errors]]))


https://bugs.python.org/issue22832 (changed ioctl(fd, op[, arg[, 
mutate_flag]]) to ioctl(fd, request, arg=0, mutate_flag=True))


https://bugs.python.org/issue22341 (discussed changing crc32(data[, 
crc]) to crc32(data, crc=0))




Re: [Python-Dev] How to document functions with optional positional parameters?

2015-03-22 Thread Serhiy Storchaka

On 21.03.15 13:03, Victor Stinner wrote:

The / is useful, it indicates that you cannot use keywords.


Wouldn't it confuse users?


If you want to drop /, modify the function to accept keywords.


Yes, this is a solution. But parsing keyword arguments is slower than 
parsing positional arguments. And I'm working on patches that optimize 
the parsing code generated by Argument Clinic. My first patches will 
handle only positional parameters; with keywords it is harder.




Re: [Python-Dev] Needed reviews

2015-03-22 Thread Serhiy Storchaka

On 21.03.15 13:46, Nick Coghlan wrote:

On 19 March 2015 at 19:28, Serhiy Storchaka  wrote:

Here is a list of my patches that are ready for review.  It is incomplete and
contains only patches for which I don't expect objections or long discussion.
Most of them are relatively easy and need only a formal review.  Most of them
have been waiting for a review for many months.


It's worth noting that if there are changes you feel are genuinely low
risk, you can go ahead and merge them based on your own commit review
(even if you also wrote the original patch).


Yes, but four eyes are better than two. I make mistakes. In some 
issues I am unsure about the documentation part. In some issues (issue14260 
and issue22721) I provided two alternative solutions and need a hint to 
choose between them. While I am mostly sure about the correctness of a 
patch, I often hesitate about the direction. Is the bug worth fixing? 
Is the new feature worth adding to Python?


Thanks to Alexander, Amaury, Benjamin, Berker, Demian, Éric, Ethan, Martin, 
Paul, Victor and others who responded to my request.




Re: [Python-Dev] cpython: #23657 Don't explicitly do an isinstance check for str in zipapp

2015-03-23 Thread Serhiy Storchaka

On 22.03.15 17:33, paul.moore wrote:

https://hg.python.org/cpython/rev/0b2993742650
changeset:   95126:0b2993742650
user:        Paul Moore 
date:        Sun Mar 22 15:32:36 2015 +
summary:
   #23657 Don't explicitly do an isinstance check for str in zipapp

As a result, explicitly support pathlib.Path objects as arguments.
Also added tests for the CLI interface.


Congratulations on your first commit, Paul!




Re: [Python-Dev] peps: New PEP 490: Chain exceptions at C level

2015-03-26 Thread Serhiy Storchaka

On 26.03.15 10:08, victor.stinner wrote:

https://hg.python.org/peps/rev/7daf3bfd9586
changeset:   5741:7daf3bfd9586
user:        Victor Stinner 
date:        Thu Mar 26 09:08:08 2015 +0100
summary:
   New PEP 490: Chain exceptions at C level



+Python 3.5 introduces a new private ``_PyErr_ChainExceptions()`` function which
+is enough to chain manually exceptions.


It was also added in Python 3.4.3.

I am considering adding _PyErr_ReplaceException() in 2.7 to simplify 
backporting patches from 3.x.



+Functions like ``PyErr_SetString()`` don't chain automatically exceptions. To
+make usage of ``_PyErr_ChainExceptions()`` easier, new functions are added:
+
+* PyErr_SetStringChain(exc_type, message)
+* PyErr_FormatChaine(exc_type, format, ...)


Typo.


+* PyErr_SetNoneChain(exc_type)
+* PyErr_SetObjectChain(exc_type, exc_value)


I would first make these functions private, like _PyErr_ChainExceptions(). 
After proving their usefulness in the stdlib, they can be made public.
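
For comparison, this is what chaining looks like at the Python level, where it happens implicitly. It is only a sketch of the behaviour that the C functions discussed above would have to reproduce by hand; parse_port() is a made-up example function.

```python
def parse_port(text):
    """Convert text to a port number (hypothetical example function)."""
    try:
        return int(text)
    except ValueError:
        # Raising inside an except block chains implicitly: the new
        # exception keeps the one being handled in its __context__.
        raise RuntimeError("invalid port: %r" % (text,))

try:
    parse_port("http")
except RuntimeError as exc:
    chained = exc.__context__
```

Here chained is the original ValueError, so the traceback shows both errors. C code that calls PyErr_SetString() while another exception is set replaces the old exception instead.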




Re: [Python-Dev] Needed reviews

2015-03-27 Thread Serhiy Storchaka

On 27.03.15 02:16, Victor Stinner wrote:

2015-03-19 10:28 GMT+01:00 Serhiy Storchaka :



https://bugs.python.org/issue23502
 Tkinter doesn't support large integers (out of 32-bit range)


closed
(note: the title was different, "pprint: added support for mapping proxy")


My fault. The correct issue is https://bugs.python.org/issue16840.


I stop here for tonight.


Many thanks Victor!




Re: [Python-Dev] Sporadic failures of test_subprocess and test_multiprocessing_spawn

2015-03-28 Thread Serhiy Storchaka

On 28.03.15 11:39, Victor Stinner wrote:

Can you please take a look at the following issue and try to reproduce it?
http://bugs.python.org/issue23771

The following tests sometimes hang on "x86 Ubuntu Shared 3.x" and
"AMD64 Debian root 3.x" buildbots:

- test_notify_all() of test_multiprocessing_spawn
- test_double_close_on_error() of test_subprocess
- other sporadic failures of test_subprocess

I'm quite sure that they are regressions, maybe related to the
implementation of the PEP 475. In the middle of all PEP 475 changes, I
changed some functions to release the GIL on I/O, it wasn't the case
before. It may be related.

Are you able to reproduce these issues? I'm unable to reproduce them
on Fedora 21. Maybe they are more likely on Debian-like operating
systems?


Just run the tests with a low memory limit.

(ulimit -v 6; ./python -m test.regrtest -uall -v test_multiprocessing_spawn;)


test_io also hangs.




Re: [Python-Dev] cpython: Issue #23752: When built from an existing file descriptor, io.FileIO() now only

2015-03-29 Thread Serhiy Storchaka

On 30.03.15 04:22, victor.stinner wrote:

https://hg.python.org/cpython/rev/bc2a22eaa0af
changeset:   95269:bc2a22eaa0af
user:        Victor Stinner 
date:        Mon Mar 30 03:21:06 2015 +0200
summary:
   Issue #23752: When built from an existing file descriptor, io.FileIO() now only
calls fstat() once. Before fstat() was called twice, which was not necessary.

files:
   Misc/NEWS|  26 ++
   Modules/_io/fileio.c |  24 
   2 files changed, 26 insertions(+), 24 deletions(-)


diff --git a/Misc/NEWS b/Misc/NEWS
--- a/Misc/NEWS
+++ b/Misc/NEWS
@@ -2,6 +2,32 @@
  Python News
  +++

+What's New in Python 3.5.0 alpha 4?
+===


Bring the time machine back, Victor. The current version is 3.5.0a2+.




Re: [Python-Dev] OpenBSD buildbot has many failures

2015-04-01 Thread Serhiy Storchaka

On 01.04.15 07:52, Davin Potts wrote:

I am personally interested in seeing all tests pass on OpenBSD and am willing 
to put forth effort to help that be so.  I would be happy to be added to any 
issues that get opened against OpenBSD.  That said, I have concerns about the 
nature of when and how these failures came about — specifically I worry that 
other devs have committed the changes which prompted these failures yet they 
did not pay attention nor take responsibility when it happened.  Having 
monitored certain buildbots for a while to see how the community behaves and 
devs fail to react when a failure is triggered by a commit, I think we should 
do much better in taking individual responsibility for prompting these failures.


http://bugs.python.org/issue?%40columns=id%2Cactivity%2Ctitle%2Ccreator%2Cassignee%2Cstatus%2Ctype&%40filter=status&%40search_text=openbsd&submit=search&status=1




[Python-Dev] Free lists

2015-05-09 Thread Serhiy Storchaka
Here are statistics for the types for which PyObject_INIT or PyObject_INIT_VAR 
is called most often (collected while running the Python tests on 32-bit Linux).


type                            count       %   acc.%

builtin_function_or_method  116012007  36.29%  36.29%
method                       52465386  16.41%  52.70%
int                          42828741  13.40%  66.09%
str                          37017098  11.58%  77.67%
generator                    14026583   4.39%  82.06%
list_iterator                 8731329   2.73%  84.79%
bytes                         7217934   2.26%  87.04%
tuple_iterator                5042563   1.58%  88.62%
float                         4672980   1.46%  90.08%
set                           3319699   1.04%  91.12%
_io.StringIO                  3000369   0.94%  92.06%
str_iterator                  2126838   0.67%  92.73%
list                          2031059   0.64%  93.36%
dict                          1691993   0.53%  93.89%
method-wrapper                1573139   0.49%  94.38%
function                      1472062   0.46%  94.84%
traceback                     1388278   0.43%  95.28%
tuple                         1132071   0.35%  95.63%
memoryview                    1092173   0.34%  95.97%
cell                          1049496   0.33%  96.30%
managedbuffer                 1036889   0.32%  96.63%
bytearray                      711969   0.22%  96.85%
range_iterator                 496924   0.16%  97.00%
range                          483971   0.15%  97.15%
super                          472447   0.15%  97.30%
map                            449567   0.14%  97.44%
frame                          427320   0.13%  97.58%
set_iterator                   423392   0.13%  97.71%
Leaf                           398705   0.12%  97.83%
symtable                       374412   0.12%  97.95%

Types for which free lists are already used: builtin_function_or_method, 
method, float, tuple, list, dict, frame. Some free list implementations 
(e.g. for tuple) don't call PyObject_INIT/PyObject_INIT_VAR. That is why 
the numbers are so low for tuples.


Perhaps it is worth adding free lists for other types: int, str, bytes, 
generator, list and tuple iterators?


Shortened tables for variable-sized objects (those that call PyObject_INIT_VAR):

int      42828741  13.40%
     0     425353   0.99%   0.99%
     1   21399290  49.96%  50.96%
     2   10496856  24.51%  75.47%
     3    4873346  11.38%  86.85%
     4    1021563   2.39%  89.23%
     5    1246444   2.91%  92.14%
     6     733676   1.71%  93.85%
     7     123074   0.29%  94.14%
     8     139203   0.33%  94.47%
...

bytes     7217934   2.26%
     0        842   0.01%   0.01%
     1     179469   2.49%   2.50%
     2     473306   6.56%   9.06%
     3     254968   3.53%  12.59%
     4    1169164  16.20%  28.79%
     5      72806   1.01%  29.79%
     6     128668   1.78%  31.58%
     7     169694   2.35%  33.93%
     8     155154   2.15%  36.08%
     9      67320   0.93%  37.01%
    10      51703   0.72%  37.73%
    11      42574   0.59%  38.32%
    12     108947   1.51%  39.83%
    13      40812   0.57%  40.39%
    14     126783   1.76%  42.15%
    15      37873   0.52%  42.67%
    16     447482   6.20%  48.87%
    17     194320   2.69%  51.56%
    18     251685   3.49%  55.05%
    19     159435   2.21%  57.26%
    20     212521   2.94%  60.20%
...
    31      18751   0.26%  67.32%
    32     159781   2.21%  69.54%
    33       8332   0.12%  69.65%
...
    63      19841   0.27%  79.21%
    64     144982   2.01%  81.22%
    65       5216   0.07%  81.29%
...
   127       1354   0.02%  85.44%
   128     376539   5.22%  90.66%
   129      17468   0.24%  90.90%
...
   255        178   0.00%  92.39%
   256      11993   0.17%  92.55%
   257        124   0.00%  92.56%


Re: [Python-Dev] Free lists

2015-05-09 Thread Serhiy Storchaka

On 09.05.15 22:51, Larry Hastings wrote:

On 05/09/2015 12:01 PM, Serhiy Storchaka wrote:

Here is a statistic for most called PyObject_INIT or PyObject_INIT_VAR
for types (collected during running Python tests on 32-bit Linux).


Can you produce these statistics for a 64-bit build?


Sorry, no. All my computers run 32-bit Linux.



Re: [Python-Dev] Free lists

2015-05-09 Thread Serhiy Storchaka

On 10.05.15 02:25, Ian Cordasco wrote:

Can you share how you gathered them so someone could run them on a
64-bit build?


This is a quick and dirty patch. It generates an 8 GB log file!

patch --merge -p1 PyObject_INIT.log
python3 PyObject_INIT_stat.py PyObject_INIT.stat

Perhaps compiling with COUNT_ALLOCS will produce similar statistics for 
types (but without statistics for sizes) and should be much faster.
diff -r f7cc54086cd2 Include/objimpl.h
--- a/Include/objimpl.h	Sat May 09 11:37:23 2015 -0700
+++ b/Include/objimpl.h	Sat May 09 22:05:04 2015 +0300
@@ -137,9 +137,9 @@ PyAPI_FUNC(PyVarObject *) _PyObject_NewV
 /* Macros trading binary compatibility for speed. See also pymem.h.
Note that these macros expect non-NULL object pointers.*/
 #define PyObject_INIT(op, typeobj) \
-( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), (op) )
+( Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), fprintf(stderr, "PyObject_INIT %.200s\n", Py_TYPE(op)->tp_name), (op) )
 #define PyObject_INIT_VAR(op, typeobj, size) \
-( Py_SIZE(op) = (size), PyObject_INIT((op), (typeobj)) )
+( Py_SIZE(op) = (size), Py_TYPE(op) = (typeobj), _Py_NewReference((PyObject *)(op)), fprintf(stderr, "PyObject_INIT_VAR %.200s %zd\n", Py_TYPE(op)->tp_name, Py_SIZE(op)), (op) )
 
 #define _PyObject_SIZE(typeobj) ( (typeobj)->tp_basicsize )
 
import sys, collections

stat1 = collections.Counter()
stat2 = collections.defaultdict(collections.Counter)
for line in sys.stdin:
    if not line.startswith('PyObject_INIT'):
        continue
    try:
        s, t, *r = line.split()
        stat1[t] += 1
        if s == 'PyObject_INIT_VAR':
            stat2[t][int(r[0])] += 1
    except Exception as e:
        print('*** ERROR: %r %r' % (line, e), file=sys.stderr)

total = sum(stat1.values())
acc = 0
print('%-30s %10s %7s %7s' % ('type', 'count', '%', 'acc.%'))
print()
for t, c in stat1.most_common(30):
    acc += c
    print('%-30s %10d %6.2f%% %6.2f%%' % (t, c, 100 * c / total, 100 * acc / total))

for t in sorted(stat2, key=stat1.__getitem__, reverse=True)[:20]:
    c = stat1[t]
    print()
    print('%-30s %10d %6.2f%%' % (t, c, 100 * c / total))
    acc = 0
    for s, c2 in sorted(stat2[t].items()):
        acc += c2
        print('%30d %10d %6.2f%% %6.2f%%' % (s, c2, 100 * c2 / c, 100 * acc / c))
    c2 = c - sum(stat2[t].values())
    if c2:
        print('%30s %10d %6.2f%% %6.2f%%' % ('-', c2, 100 * c2 / c, 100))


Re: [Python-Dev] Free lists

2015-05-10 Thread Serhiy Storchaka
Here are comparable statistics collected from tests run with an executable 
built with COUNT_ALLOCS.


type                            count       %   acc.%

tuple                       448855278  29.50%  29.50%
frame                       203515969  13.38%  42.88%
str                         182658237  12.01%  54.89%
builtin_function_or_method  156724634  10.30%  65.19%
int                         106561963   7.00%  72.19%
method                       88269762   5.80%  78.00%
list                         50340630   3.31%  81.31%
slice                        36650028   2.41%  83.71%
dict                         34429310   2.26%  85.98%
generator                    33035375   2.17%  88.15%
bytes                        29230573   1.92%  90.07%
function                     24953392   1.64%  91.71%
list_iterator                21236155   1.40%  93.11%
tuple_iterator               16800947   1.10%  94.21%
cell                         16369317   1.08%  95.29%
float                         7079162   0.47%  95.75%
_sre.SRE_Match                6342612   0.42%  96.17%
set                           5322829   0.35%  96.52%
TokenInfo                     5077251   0.33%  96.85%
code                          3643664   0.24%  97.09%
traceback                     3510709   0.23%  97.32%
memoryview                    2860799   0.19%  97.51%
managedbuffer                 2762975   0.18%  97.69%
method-wrapper                2590642   0.17%  97.86%
Name                          1681233   0.11%  97.97%
bytearray                     1598429   0.11%  98.08%
_io.StringIO                  1439456   0.09%  98.17%
weakref                       1341485   0.09%  98.26%
super                          911811   0.06%  98.32%
range                          798254   0.05%  98.37%




Re: [Python-Dev] Is it kosher to use a buffer after release?

2015-05-10 Thread Serhiy Storchaka

On 10.05.15 21:28, Larry Hastings wrote:

In Python's argument parsing code (convertsimple in Python/getargs.c), a
couple of format units* accept "read-only bytes-like objects", aka
read-only buffer objects.  They call a helper function called
convertbuffer() which uses the buffer protocol to extract a pointer to
the memory.

Here's the relevant bit of code:

static Py_ssize_t
convertbuffer(PyObject *arg, void **p, char **errmsg)
{
    Py_buffer view;
    ...

    if (getbuffer(arg, &view, errmsg) < 0)
        return -1;
    count = view.len;
    *p = view.buf;
    PyBuffer_Release(&view);
    return count;
}


getbuffer() uses the buffer protocol to fill in the "view" buffer. If
it's successful, "view" is a valid buffer.  We store the pointer to the
buffer's memory in output parameter p.

THEN WE RELEASE THE BUFFER.

THEN WE RETURN TO THE CALLER.

In case you missed the big helpful capital letters, we are returning a
pointer given to us by PyObject_GetBuffer(), which we have already
released by calling PyBuffer_Release().  The buffer protocol
documentation for bf_releasebuffer makes it sound like this pointer
could easily be invalid after the release call finishes.

Am I missing something, or is this code relying on an implementation
detail it shouldn't--namely that you can continue using a pointer to
some (most? all?) buffer memory even after releasing it?


You are missing the following code:

    if (pb != NULL && pb->bf_releasebuffer != NULL) {
        *errmsg = "read-only bytes-like object";
        return -1;
    }

convertbuffer() is applicable only to types for which 
PyBuffer_Release() is a no-op. That is why there are different format 
units for read-only buffers and for general buffers. That is why the new 
buffer protocol was introduced.
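
The difference is visible from Python as well. The sketch below uses bytearray as an example of a type for which releasing the buffer is not a no-op: while a view is exported, the object refuses to resize, because that could move the memory the view points at. This only illustrates the idea; it is not the code path taken by getargs.c.

```python
data = bytearray(b"abc")
view = memoryview(data)   # exports the buffer; bytearray tracks the export

try:
    data.extend(b"def")   # resizing could move the memory under the view
except BufferError:
    resize_blocked = True
else:
    resize_blocked = False

view.release()            # the Python-level analogue of PyBuffer_Release()
data.extend(b"def")       # allowed again once no exports remain
```

An immutable bytes object has nothing to undo on release, which is why a pointer into it stays valid for as long as the object itself is alive.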




Re: [Python-Dev] cpython: Issue #20172: Convert the winsound module to Argument Clinic.

2015-05-12 Thread Serhiy Storchaka

On 13.05.15 09:32, zach.ware wrote:

https://hg.python.org/cpython/rev/d3582826d24c
changeset:   96006:d3582826d24c
user:        Zachary Ware 
date:        Wed May 13 01:21:21 2015 -0500
summary:
   Issue #20172: Convert the winsound module to Argument Clinic.



+/*[clinic input]
+winsound.PlaySound
+
+sound: Py_UNICODE(nullable=True)


I think this is no longer correct syntax. It should be 
Py_UNICODE(accept={str, NoneType}).





Re: [Python-Dev] cpython: inspect: Micro-optimize __eq__ for Signature, Parameter and BoundArguments

2015-05-14 Thread Serhiy Storchaka

On 15.05.15 01:23, yury.selivanov wrote:

https://hg.python.org/cpython/rev/f0b10980b19e
changeset:   96056:f0b10980b19e
parent:      96054:15701e89d710
user:        Yury Selivanov 
date:        Thu May 14 18:20:01 2015 -0400
summary:
   inspect: Micro-optimize __eq__ for Signature, Parameter and BoundArguments

Provide __ne__ method for consistency.

files:
   Lib/inspect.py |  32 ++--
   1 files changed, 22 insertions(+), 10 deletions(-)


diff --git a/Lib/inspect.py b/Lib/inspect.py
--- a/Lib/inspect.py
+++ b/Lib/inspect.py
@@ -2353,11 +2353,15 @@
         return hash((self.name, self.kind, self.annotation, self.default))

     def __eq__(self, other):
-        return (issubclass(other.__class__, Parameter) and
-                self._name == other._name and
-                self._kind == other._kind and
-                self._default == other._default and
-                self._annotation == other._annotation)
+        return (self is other or
+                (issubclass(other.__class__, Parameter) and
+                 self._name == other._name and
+                 self._kind == other._kind and
+                 self._default == other._default and
+                 self._annotation == other._annotation))


It would be better to return NotImplemented if other is not an instance 
of Parameter.


        if self is other:
            return True
        if not isinstance(other, Parameter):
            return NotImplemented
        return (self._name == other._name and
                self._kind == other._kind and
                self._default == other._default and
                self._annotation == other._annotation)

And why do you use issubclass() instead of isinstance()?


+    def __ne__(self, other):
+        return not self.__eq__(other)


This is not needed (and is incorrect if __eq__ returns NotImplemented). The 
default __ne__ implementation calls __eq__ and correctly handles 
NotImplemented.
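
A small self-contained sketch of the pattern suggested above, using a made-up Point class rather than inspect.Parameter. It defines only __eq__ and relies on the default __ne__ in Python 3.

```python
class Point:
    """Hypothetical value class used only to demonstrate the pattern."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if self is other:
            return True
        if not isinstance(other, Point):
            # Let Python try the reflected comparison on the other operand.
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    # No __ne__ here: the inherited one calls __eq__ and inverts the result,
    # passing NotImplemented through unchanged.


p = Point(1, 2)
same = p == Point(1, 2)
different = p != Point(1, 3)
vs_string = p != "not a point"
raw_ne = p.__ne__("not a point")
```

With a hand-written `return not self.__eq__(other)`, the NotImplemented result would be treated as a truth value instead of being passed on, so the other operand would never get a chance to handle the comparison.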





Re: [Python-Dev] cpython: inspect: Add __slots__ to BoundArguments.

2015-05-14 Thread Serhiy Storchaka

On 14.05.15 00:38, yury.selivanov wrote:

https://hg.python.org/cpython/rev/ee31277386cb
changeset:   96038:ee31277386cb
user:        Yury Selivanov 
date:        Wed May 13 17:18:41 2015 -0400
summary:
   inspect: Add __slots__ to BoundArguments.


Note that adding __slots__ breaks pickleability.



Re: [Python-Dev] cpython (merge 3.4 -> default): Added tests for more builtin types.

2015-05-16 Thread Serhiy Storchaka

On 17.05.15 02:44, Ned Deily wrote:

In article <20150516183940.21146.77...@psf.io>,
  serhiy.storchaka  wrote:

https://hg.python.org/cpython/rev/7b350f712c0e
changeset:   96099:7b350f712c0e
parent:      96096:f0c94892ac31
parent:      96098:955dffec3d94
user:        Serhiy Storchaka 
date:        Sat May 16 21:35:56 2015 +0300
summary:
   Added tests for more builtin types.
Made test_pprint discoverable.

files:
   Lib/test/test_pprint.py |  17 -
   1 files changed, 8 insertions(+), 9 deletions(-)


==
ERROR: test_coverage (test.test_trace.TestCoverage)
--
Traceback (most recent call last):
   File "/py/dev/3x/root/uxd/lib/python3.5/test/test_trace.py", line 312,
in test_coverage
 self._coverage(tracer)
   File "/py/dev/3x/root/uxd/lib/python3.5/test/test_trace.py", line 305,
in _coverage
 tracer.run(cmd)
   File "/py/dev/3x/root/uxd/lib/python3.5/trace.py", line 500, in run
 self.runctx(cmd, dict, dict)
   File "/py/dev/3x/root/uxd/lib/python3.5/trace.py", line 508, in runctx
 exec(cmd, globals, locals)
   File "", line 1, in 
AttributeError: module 'test.test_pprint' has no attribute 'test_main'

==
ERROR: test_coverage_ignore (test.test_trace.TestCoverage)
--
Traceback (most recent call last):
   File "/py/dev/3x/root/uxd/lib/python3.5/test/test_trace.py", line 327,
in test_coverage_ignore
 self._coverage(tracer)
   File "/py/dev/3x/root/uxd/lib/python3.5/test/test_trace.py", line 305,
in _coverage
 tracer.run(cmd)
   File "/py/dev/3x/root/uxd/lib/python3.5/trace.py", line 500, in run
 self.runctx(cmd, dict, dict)
   File "/py/dev/3x/root/uxd/lib/python3.5/trace.py", line 508, in runctx
 exec(cmd, globals, locals)
   File "", line 1, in 
AttributeError: module 'test.test_pprint' has no attribute 'test_main'

Also breaks 3.4.



Thank you, Ned. I opened issue24215 for this, because just restoring 
test_main is perhaps not the best way.




Re: [Python-Dev] PEP 559 - built-in noop()

2017-09-09 Thread Serhiy Storchaka

09.09.17 21:46, Barry Warsaw wrote:

One use case would be for PEP 553, where you could set the breakpoint
environment variable to the following in order to effectively disable it::

 $ setenv PYTHONBREAKPOINT=noop


Are there other use cases? PEP 553 is still not approved, and you could 
use another syntax for disabling breakpoint(), e.g. setting 
PYTHONBREAKPOINT to an empty value.


It looks to me that in all other cases it can be replaced with `lambda 
*args, **kwds: None` (actually the expression can be even simpler in 
concrete cases). I can't remember any case when I needed a noop() 
function (unlike an identity() function or a null context manager).
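
A short sketch of the lambda replacement mentioned above. The name noop is bound locally here as a stand-in for the proposed built-in, and process() is a made-up function with an optional callback.

```python
# Hypothetical stand-in for the proposed built-in: accepts anything, returns None.
noop = lambda *args, **kwds: None


def process(items, on_item=noop):
    """Sum the items, calling an optional per-item callback."""
    total = 0
    for item in items:
        on_item(item)
        total += item
    return total


seen = []
quiet_total = process([1, 2, 3])
logged_total = process([1, 2, 3], on_item=seen.append)
```

In a concrete case like this one the default could be as simple as `lambda item: None`, since the callback always receives exactly one argument.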




Re: [Python-Dev] Why aren't decorators just expressions?

2017-09-16 Thread Serhiy Storchaka

16.09.17 12:39, Larry Hastings wrote:
So why don't decorators allow arbitrary expressions?  The PEP discusses 
the syntax for decorators, but that whole debate only concerned itself 
with where the decorator goes relative to "def", and what funny 
punctuation might it use.  It never says "decorators shouldn't permit 
arbitrary expressions because--".  Nor is there any info on wiki page 
with the extensive summary of alternative syntax proposals.


Anybody remember?


https://mail.python.org/pipermail/python-dev/2004-August/046711.html

I'm not proposing that we allow arbitrary expressions as decorators... 
well, I'm not doing that /yet/ at least.  But like I said, the syntax 
has been this way for 13 years and I don't recall anybody complaining.


This may be an argument for not changing the syntax.

Actually, I remember that somebody raised this question a year or two ago, 
but I don't remember the details.
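
As a sketch of what the restriction means in practice: under the grammar being discussed, a decorator had to be a dotted name, optionally followed by a single call, so an expression such as a subscript had to be bound to a name first. The registry and function names below are made up for illustration.

```python
def shout(func):
    """Decorator that upper-cases the wrapped function's result."""
    def wrapper(*args, **kwds):
        return func(*args, **kwds).upper()
    return wrapper


# Hypothetical registry of decorators, looked up by key.
decorators = {"shout": shout}

# "@decorators['shout']" is a subscript expression, which the restricted
# decorator grammar rejects, so the expression is bound to a name first.
selected = decorators["shout"]


@selected
def greet(name):
    return "hello, " + name
```

The extra assignment is cheap, which may be part of the reason so few people complained about the restriction.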




Re: [Python-Dev] To reduce Python "application" startup time

2017-09-23 Thread Serhiy Storchaka

05.09.17 16:02, INADA Naoki wrote:

While I can't attend to sprint, I saw etherpad and I found
Neil Schemenauer and Eric Snow will work on startup time.

I want to share my current knowledge about startup time.

For bare (e.g. `python -c pass`) startup time,  I'm waiting C
implementation of ABC.

But application startup time is more important.  And we can improve
them with optimize importing common stdlib.

Current `python -v` is not useful to optimize import.
So I use this patch to profile import time.
https://gist.github.com/methane/e688bb31a23bcc437defcea4b815b1eb

With this profile, I tried optimize `python -c 'import asyncio'`, logging
and http.client.

https://gist.github.com/methane/1ab97181e74a33592314c7619bf34233#file-0-optimize-import-patch

With this small patch:

logging: 14.9ms -> 12.9ms
asyncio: 62.1ms -> 58.2ms
http.client: 43.8ms -> 36.1ms


See also https://bugs.python.org/issue30152, which optimizes the import 
time of argparse using a similar technique. I think these patches overlap.




Re: [Python-Dev] API design question: how to extend sys.settrace()?

2017-09-27 Thread Serhiy Storchaka

27.09.17 15:56, Victor Stinner wrote:

In bpo-29400, it was proposed to add the ability to trace not only
function calls but also instructions at the bytecode level. I like the
idea, but I don't see how to extend sys.settrace() to add a new
"trace_instructions: bool" optional (keyword-only?) parameter without
breaking the backward compatibility. Should we add a new function
instead?


I am afraid that this change breaks an assumption in frame_setlineno() 
about the state of the stack. This can corrupt the stack if you jump 
from an instruction that is part of a larger Python operation. For example, 
FOR_ITER expects an iterator on the stack. If you jump to the end of the 
loop from the middle of an assignment and skip, say, STORE_FAST, 
you will leave an arbitrary value on the stack. This can lead to 
unpredictable consequences.
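
To see the instructions involved, the dis module can list the bytecode of a simple loop. This is only a sketch: the exact opcodes differ between CPython versions (newer releases may fuse STORE_FAST with a neighbouring instruction), and copy_items() is an arbitrary example function.

```python
import dis


def copy_items(items):
    out = []
    for item in items:
        out.append(item)
    return out


# Collect the opcode names in order; FOR_ITER keeps the iterator on the
# stack for the whole loop, and the loop variable is stored from the stack.
ops = [ins.opname for ins in dis.get_instructions(copy_items)]
has_for_iter = "FOR_ITER" in ops
has_store_fast = any(op.startswith("STORE_FAST") for op in ops)
```

Between FOR_ITER pushing the next value and the store that consumes it, the stack holds an extra item, so a jump taken at that point would leave the stack depth inconsistent with what the target instruction expects.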




Re: [Python-Dev] bpo-30806 netrc.__repr__() is broken for writing to file (GH-2491)

2017-10-01 Thread Serhiy Storchaka

30.09.17 10:10, INADA Naoki wrote:

https://github.com/python/cpython/commit/b24cd055ecb3eea9a15405a6ca72dafc739e6531
commit: b24cd055ecb3eea9a15405a6ca72dafc739e6531
branch: master
author: James Sexton 
committer: INADA Naoki 
date: 2017-09-30T16:10:31+09:00
summary:

bpo-30806 netrc.__repr__() is broken for writing to file (GH-2491)

netrc file format doesn't support quotes and escapes.

See https://linux.die.net/man/5/netrc


The commit message looks confusing to me. Is netrc.__repr__() broken 
now? Or does this change make the netrc file format support quotes and 
escapes now?


Please read the following thread: 
https://mail.python.org/pipermail/python-dev/2011-May/111303.html.




Re: [Python-Dev] Python startup optimization: script vs. service

2017-10-02 Thread Serhiy Storchaka

02.10.17 16:26, Victor Stinner wrote:

While "import module" is fast, maybe we should use sometimes a global
variable to cache the import.

    module = None
    def func():
        global module
        if module is None: import module
        ...


I optimized "import module", and I think it can be optimized even more, 
up to making the above trick unnecessary. Currently there is an overhead 
of checking that the module found in sys.modules is not being imported right now.




Re: [Python-Dev] Make re.compile faster

2017-10-02 Thread Serhiy Storchaka

03.10.17 06:29, INADA Naoki wrote:

Before deferring re.compile, can we make it faster?

I profiled `import string` and small optimization can make it 2x faster!
(but it's not backward compatible)


Please open an issue for this.


I found:

* RegexFlag.__and__ and __new__ is called very often.
* _optimize_charset is slow, because re.UNICODE | re.IGNORECASE

diff --git a/Lib/sre_compile.py b/Lib/sre_compile.py
index 144620c6d1..7c662247d4 100644
--- a/Lib/sre_compile.py
+++ b/Lib/sre_compile.py
@@ -582,7 +582,7 @@ def isstring(obj):

  def _code(p, flags):

-    flags = p.pattern.flags | flags
+    flags = int(p.pattern.flags) | int(flags)
      code = []

      # compile info block


Maybe cast flags to int earlier, in sre_compile.compile()?


diff --git a/Lib/string.py b/Lib/string.py
index b46e60c38f..fedd92246d 100644
--- a/Lib/string.py
+++ b/Lib/string.py
@@ -81,7 +81,7 @@ class Template(metaclass=_TemplateMetaclass):
      delimiter = '$'
      idpattern = r'[_a-z][_a-z0-9]*'
      braceidpattern = None
-    flags = _re.IGNORECASE
+    flags = _re.IGNORECASE | _re.ASCII

      def __init__(self, template):
          self.template = template

patched:
import time:      1191 |       8479 | string

Of course, this patch is not backward compatible. [a-z] doesn't match 
with 'ı' or 'ſ' anymore.

But who cares?


This looks like a bug fix. I'm wondering if it is worth backporting it 
to 3.6. But the change itself can break user code that changes 
idpattern without touching flags. There is another way, but it should be 
discussed on the bug tracker.
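
The behaviour difference mentioned in the quoted message can be checked directly. This sketch only exercises the re module with the two characters named there; it does not touch string.Template.

```python
import re

dotless_i = "\u0131"   # 'ı', LATIN SMALL LETTER DOTLESS I
long_s = "\u017f"      # 'ſ', LATIN SMALL LETTER LONG S

# With Unicode matching, IGNORECASE makes [a-z] accept these letters too.
unicode_hits = [bool(re.fullmatch(r"[a-z]", ch, re.IGNORECASE))
                for ch in (dotless_i, long_s)]

# Adding re.ASCII restricts case-insensitive matching to ASCII letters.
ascii_hits = [bool(re.fullmatch(r"[a-z]", ch, re.IGNORECASE | re.ASCII))
              for ch in (dotless_i, long_s)]
```

So code that relies on the default Template flags and accepts such identifiers today would stop matching them once re.ASCII is added.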




Re: [Python-Dev] Make re.compile faster

2017-10-02 Thread Serhiy Storchaka

03.10.17 06:29, INADA Naoki wrote:
More optimization can be done with implementing sre_parse and 
sre_compile in C.

But I have no time for it in this year.


And please don't do this! This would make maintaining the re module 
hard. The performance of the compiler is less important than correctness 
and performance of matching and searching.




Re: [Python-Dev] Make re.compile faster

2017-10-03 Thread Serhiy Storchaka

03.10.17 17:21, Barry Warsaw wrote:

What if the compiler could recognize constant arguments to re.compile() and do 
the regex compilation at that point?  You’d need a way to represent the 
precompiled regex in the bytecode, and it would technically be a semantic 
change since regex problems would be discovered at compilation time instead of 
runtime - but that might be a good thing.  You could also make that an 
optimization flag for opt-in, or a flag to allow opt out.


The representation of the compiled regex is an implementation detail. It 
is not even exposed once the regex is compiled. And it changes 
faster than the bytecode and marshal formats. It can be changed even in a 
bugfix release.
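
As a small illustration of this point (a sketch based on how I understand the current CPython re module, not a documented guarantee): pickling a compiled pattern stores only the pattern string and the flags, and unpickling compiles it again. The sample pattern is made up.

```python
import copyreg
import pickle
import re

p = re.compile(r'[a-z]+\d*', re.IGNORECASE)

# The reduction that the re module registers for pattern objects:
# a callable plus the arguments to call it with.
func, args = copyreg.dispatch_table[type(p)](p)

# Round trip: the compiled form itself is never serialized.
restored = pickle.loads(pickle.dumps(p))

print(args)
print(restored.pattern, restored.flags)
```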


For implementing this idea we need:

1. Invent a universal portable regex bytecode. It shouldn't contain 
flaws and limitations and should support all features of Unicode regexps 
and possible extensions. It should also anticipate future Unicode changes 
and be able to encode them.


2. Add support for regex objects to the marshal format.

3. Implement an advanced AST optimizer.

4. Rewrite the regex compiler in C or make the AST optimizer able to 
execute Python code.


I think we are far away from this. Any of the above problems is much 
larger, and solving it can give a larger benefit, than saving several 
microseconds at startup.


Forget about this. Let's first get rid of the GIL!



Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

2017-10-03 Thread Serhiy Storchaka

03.10.17 18:15, Guido van Rossum wrote:
It's really not that hard. You just check the magic number and if it's 
the new one, skip 4 words. No need to understand the internals of the 
header.


Hence you should know all old magic numbers to determine whether the 
magic number you read is the new one. Right?




Re: [Python-Dev] Intention to accept PEP 552 soon (deterministic pyc files)

2017-10-03 Thread Serhiy Storchaka

26.09.17 23:47, Guido van Rossum wrote:
I've read the current version of PEP 552 over and I think everything 
looks good for acceptance. I believe there are no outstanding objections 
(or they have been adequately addressed in responses).


Therefore I intend to accept PEP 552 this Friday, unless grave 
objections are raised on this mailing list (python-dev).


Congratulations Benjamin. Gotta love those tristate options!


While PEP 552 is accepted, I would like to see some changes.

1. Increase the size of the constant part of the signature to at least 
32 bits. Currently only the third and fourth bytes are constant, and they 
are '\r\n', which often occurs in text files. The first two bytes 
can be different in every Python version. This makes it hard to detect 
pyc files with utilities like file(1).


2. Split the "version" of pyc files into "major" and "minor" parts. Every 
major version is incompatible with other major versions; the interpreter 
accepts only one particular major version. It can't be changed in a 
bugfix release. But all minor versions inside the same major version are 
forward and backward compatible. The interpreter should be able to 
execute a pyc file with an arbitrary minor version, but it can use the 
minor version of the pyc file to handle errors in older versions. The 
minor version can be changed in a bugfix release. I hope this can help us 
with issues like https://bugs.python.org/issue29537. Currently 3.5 
supports two magic numbers.


If we change the pyc format, it would be easy to make the above changes.
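
For readers who want to look at the signature being discussed, here is a small sketch, assuming Python 3.7 or later, where the PEP 552 header is 16 bytes (four 32-bit words). The temporary file names are made up for the example.

```python
import importlib.util
import os
import py_compile
import tempfile

magic = importlib.util.MAGIC_NUMBER                  # 4 bytes
version_part = int.from_bytes(magic[:2], 'little')   # differs between Python versions
constant_part = magic[2:]                            # the only constant bytes

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'example.py')
    with open(src, 'w') as f:
        f.write('x = 1\n')
    # Compile to an explicit path and read back the header.
    pyc = py_compile.compile(src, cfile=os.path.join(tmp, 'example.pyc'))
    with open(pyc, 'rb') as f:
        header = f.read(16)

file_magic = header[:4]
flags = int.from_bytes(header[4:8], 'little')        # PEP 552 bit field

print(version_part, constant_part, flags)
```

A tool like file(1) can only rely on constant_part, which is why a longer constant signature would help.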



Re: [Python-Dev] PEP 553; the interaction between $PYTHONBREAKPOINT and -E

2017-10-05 Thread Serhiy Storchaka

04.10.17 21:06, Barry Warsaw wrote:

Victor brings up a good question in his review of the PEP 553 implementation.

https://github.com/python/cpython/pull/3355
https://bugs.python.org/issue31353

The question is whether $PYTHONBREAKPOINT should be ignored if -E is given?

I think it makes sense for $PYTHONBREAKPOINT to be sensitive to -E, but in 
thinking about it some more, it might make better sense for the semantics to be 
that when -E is given, we treat it like PYTHONBREAKPOINT=0, i.e. disable the 
breakpoint, rather than fallback to the `pdb.set_trace` default.

My thinking is this: -E is often used in production environments to prevent 
stray environment settings from affecting the Python process.  In those 
environments, you probably also want to prevent stray breakpoints from stopping 
the process, so it’s more helpful to disable breakpoint processing when -E is 
given rather than running pdb.set_trace().

If you have a strong opinion either way, please follow up here, on the PR, or 
on the bug tracker.


What if we make the default value depend on the debug level? In debug 
mode it would be "pdb.set_trace", in optimized mode "0". Then in 
production environments you could use -E -O to ignore environment 
settings and disable breakpoints.
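
The proposed rule can be sketched as a pure function. This is a hypothetical model of the suggestion, not what CPython does; the function name breakpoint_target and its parameters are invented for illustration.

```python
def breakpoint_target(env_value=None, ignore_env=False, optimized=False):
    """Return the dotted name breakpoint() would call, or "0" to disable it.

    Hypothetical model: env_value stands for $PYTHONBREAKPOINT,
    ignore_env for -E, and optimized for -O.
    """
    # Proposed default: depends on the debug level only.
    default = "0" if optimized else "pdb.set_trace"
    # With -E, or with an unset/empty variable, fall back to the default.
    if ignore_env or not env_value:
        return default
    return env_value


print(breakpoint_target())                                   # plain run
print(breakpoint_target(ignore_env=True, optimized=True))    # -E -O
print(breakpoint_target(env_value="ipdb.set_trace"))         # variable honoured
```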




Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-29 Thread Serhiy Storchaka

27.10.17 18:35, Guido van Rossum wrote:
The "why" question is not very interesting -- it probably wasn't in PCRE 
and nobody was familiar with it when we moved off PCRE (maybe it wasn't 
even in Perl at the time -- it was ~15 years ago).


I didn't understand your description of \G so I googled it and found a 
helpful StackOverflow article: 
https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. 
From this I understand that when using e.g. findall() it forces 
successive matches to be adjacent.


This looks too Perlish to me. In Perl, regular expressions are part of 
the language syntax; they can even contain Perl expressions. Arguments to 
them are passed implicitly (as well as to Perl's analogs of str.strip() 
and str.split()) and results are saved in global special variables. 
Loops can also be implicit.


It seems to me that \G makes sense only for re.findall() and 
re.finditer(), not for re.match(), re.search() or re.split().


In Python all this is explicit. Compiled regular expressions are 
objects, and you can pass start and end positions to Pattern.match(). 
The Python equivalent of \G looks to me like:


p = re.compile(...)
i = 0
while True:
    m = p.match(s, i)
    if not m: break
    ...
    i = m.end()


One can also use the undocumented Pattern.scanner() method. Actually 
Pattern.finditer() is implemented as iter(Pattern.scanner().search). 
iter(Pattern.scanner().match) would return an iterator of adjacent matches.
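
A runnable sketch of both approaches, with a made-up pattern and input string. Pattern.scanner() is undocumented, so treat the second variant as relying on an implementation detail; the None sentinel passed to iter() is my assumption of how the shorthand above would be spelled in practice.

```python
import re

p = re.compile(r'\d+,?')
s = '1,22,333 x 4,5'

# Explicit loop: every match must start where the previous one ended.
# (A pattern that can match the empty string would need extra care here.)
adjacent = []
i = 0
while True:
    m = p.match(s, i)
    if not m:
        break
    adjacent.append(m.group())
    i = m.end()

# Same idea via the undocumented scanner object; iteration stops at the
# first position where match() returns None.
via_scanner = [m.group() for m in iter(p.scanner(s).match, None)]

# For comparison, finditer() also finds matches after the gap.
everything = [m.group() for m in p.finditer(s)]

print(adjacent)
print(via_scanner)
print(everything)
```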


I think it would be more Pythonic (and much easier) to add a boolean 
parameter to finditer() and findall() than introduce a \G operator.




[Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Serhiy Storchaka
The copy() methods of list, dict, bytearray, set, frozenset, 
WeakValueDictionary, WeakKeyDictionary return an instance of the base 
type containing the content of the original collection.


The copy() methods of deque, defaultdict, OrderedDict, Counter, 
ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an 
instance of the same type as the original collection.


The copy() method of mappingproxy returns a copy of the underlying 
mapping (using its copy() method).


os.environ.copy() returns a dict.
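
The difference is easy to observe with trivial subclasses. A minimal sketch (the My* class names are made up; only a few of the types listed above are checked):

```python
import os
from collections import Counter, OrderedDict, deque


class MyList(list): pass
class MyDict(dict): pass
class MySet(set): pass
class MyDeque(deque): pass
class MyOrderedDict(OrderedDict): pass
class MyCounter(Counter): pass


# True if copy() returns an instance of the subclass, False if it
# returns an instance of some other (base) type.
keeps_subclass = {
    cls.__name__: type(cls().copy()) is cls
    for cls in (MyList, MyDict, MySet, MyDeque, MyOrderedDict, MyCounter)
}
environ_copy_type = type(os.environ.copy())

print(keeps_subclass)
print(environ_copy_type)
```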

Shouldn't it be more consistent?



Re: [Python-Dev] Migrate python-dev to Mailman 3?

2017-10-29 Thread Serhiy Storchaka

26.10.17 12:24, Victor Stinner wrote:

We are using Mailman 3 for the new buildbot-status mailing list and it
works well:

https://mail.python.org/mm3/archives/list/buildbot-sta...@python.org/

I prefer to read archives with this UI, it's simpler to follow
threads, and it's possible to reply on the web UI!

To be honest, we got some issues when the new security-announce
mailing list was quickly migrated from Mailman 2 to Mailman 3, but
issues were quickly fixed as well.

Would it be possible to migrate python-dev to Mailman 3? Do you see
any blocker issue?


+1! The current UI is almost unusable. When you read a message, the only 
navigation links available are "prev/next in the thread" and back to 
the global list of messages. So you either have to read all messages 
sequentially in some linearized order and lose context when jumping from 
the end of one branch to the start of another, or switch to the 
tree view, open every message in a separate tab and switch between 
tabs. I preferred to use Gmane, but its web interface doesn't work now.


Does Mailman 3 provide an NNTP interface? The NNTP interface of Gmane 
still works, but it can be switched off at any time. It would be more 
reliable not to depend on an unstable third-party service.




Re: [Python-Dev] The type of the result of the copy() method

2017-10-31 Thread Serhiy Storchaka

29.10.17 19:04, Guido van Rossum wrote:
It's somewhat problematic. If I subclass dict with a different 
constructor, but I don't overload copy(), how can the dict.copy() method 
construct a correct instance of the subclass? Even if the constructor 
signatures match, how can dict.copy() make sure it copies all attributes 
properly? Without an answer to these questions I think it's better to 
admit defeat and return a dict instance -- classes that want to do 
better should overload copy().


I notice that Counter.copy() has all the problems I indicate here -- it 
works as long as you don't add attributes or change the constructor 
signature. I bet this isn't documented anywhere.


I am familiar with these reasons, and agree with them. But I'm curious 
why some collections chose the way of creating an instance of the same 
class. For creating an instance of the same class we have the __copy__() 
method.
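
The problem with a changed constructor signature can be reproduced in a few lines. This is a sketch assuming that Counter.copy() calls the class with the original as the only argument, which is how I read the current implementation; TaggedCounter is a made-up subclass.

```python
from collections import Counter


class TaggedCounter(Counter):
    # The constructor signature differs from Counter's: tag comes first.
    def __init__(self, tag, iterable=()):
        super().__init__(iterable)
        self.tag = tag


original = TaggedCounter('letters', 'aab')
clone = original.copy()

# No exception is raised, but the copy is silently wrong: the original
# counter was passed as `tag` and nothing was counted.
print(type(clone).__name__, len(original), len(clone))
print(clone.tag is original)
```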


An attempt to preserve a class in the returned value can cause problems. 
For example, the __add__() and __mul__() methods of deque first make a 
copy of the same type, and this can cause a crash [1]. Of course this 
does not occur in real code; it is just one more way of crashing the 
interpreter from Python code. list and tuple are free from this problem 
since their corresponding methods (as well as copy()) create an instance 
of the corresponding base type.


I think there were reasons for copying the type in results. It would be 
nice to formalize the criteria: in which cases copy() and other methods 
should return an instance of the base class, and in which cases they 
should create an instance of the same type as the original object. This 
would help with new types. And maybe we need to change some existing 
types (the inconsistency between WeakKeyDictionary and WeakSet looks weird).


[1] https://bugs.python.org/issue31608



[Python-Dev] The syntax of replacement fields in format strings

2017-10-31 Thread Serhiy Storchaka
According to the specification of format string syntax [1] (I mean 
str.format(), not f-strings), both the argument name and the attribute 
name must be Python identifiers.


But the current implementation is more lenient and allows arbitrary 
sequences of characters as long as they don't contain '.', '[', ']', '{', 
'}', ':', '!'.


>>> '{#}'.format_map({'#': 42})
'42'
>>> import types
>>> '{0.#}'.format(types.SimpleNamespace(**{'#': 42}))
'42'

This can be confusing due to the similarity between the format string 
syntaxes of str.format() and f-strings.


>>> name = 'abc'
>>> f'{name.upper()}'
'ABC'
>>> '{name.upper()}'.format(name='abc')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'str' object has no attribute 'upper()'

If only identifiers were accepted, we could produce a more specific error message.
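
A rough sketch of what such a stricter check could look like, written in pure Python on top of string.Formatter.parse(). The function check_field_names and its error messages are hypothetical, it ignores nested replacement fields in format specs, and it is not how str.format() is implemented.

```python
import re
import string


def check_field_names(format_string):
    """Raise ValueError if an argument or attribute name is not an identifier."""
    for _, field_name, _, _ in string.Formatter().parse(format_string):
        if not field_name:
            continue  # literal text only, or an auto-numbered '{}'
        # Drop '[...]' index parts; they may legitimately contain dots.
        stripped = re.sub(r'\[[^\]]*\]', '', field_name)
        first = re.match(r'[^.]*', stripped).group()
        # The argument name may be empty, a number, or an identifier.
        if first and not (first.isidentifier() or first.isdigit()):
            raise ValueError(f'invalid argument name {first!r}')
        for attr in re.findall(r'\.([^.]*)', stripped):
            if not attr.isidentifier():
                raise ValueError(f'invalid attribute name {attr!r}')


check_field_names('{0.real} {name.upper} {a[b.c]}')   # accepted
try:
    check_field_names('{name.upper()}')
except ValueError as exc:
    message = str(exc)
print(message)
```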

Is there a bug in the documentation or in the implementation?

[1] https://docs.python.org/3/library/string.html#format-string-syntax



[Python-Dev] Reorganizing re and curses related code

2017-11-03 Thread Serhiy Storchaka
Currently the implementation of re and curses related modules is 
scattered across several files:


re:
Lib/re.py
Lib/sre_compile.py
Lib/sre_constants.py
Lib/sre_parse.py

_sre:
Modules/_sre.c
Modules/sre_constants.h
Modules/sre.h
Modules/sre_lib.h

_curses:
Include/py_curses.h
Modules/_cursesmodule.c
Modules/_curses_panel.c

I want to make the re module a package, and move the sre_*.py files into 
it. Maybe later I'll add a sre_optimize.py file for separating 
optimization from parsing and compiling to the internal code. The 
original sre_*.py files will be left for compatibility for a long time, 
but they will just import their content from the re package.


The _sre implementation will be moved into the Modules/_sre/ directory. 
This will just put the files in one place and decrease the number of 
files in the Modules/ directory.


The implementations of the _curses and _curses_panel modules together 
with the common header file will be moved into the Modules/_curses/ 
directory. Excluding py_curses.h from the set of global headers will 
speed up rebuilding when only the _curses implementation is modified 
(I have done this a lot recently). In the future, implementations of 
the menu and forms extensions will be added (the patch for menu was 
provided years ago). Since _cursesmodule.c is one of the largest files 
(it defines hundreds of functions), it may be worth extracting the 
implementation of the _curses.window class into a separate file. And I 
want to implement support for "soft function-key labels". All this will 
increase the number of _curses related files to 7.


curses already is a package.

Since virtually all changes in these files in recent years have been 
made by me, I don't think this will harm other core developers. Are 
there any objections?




Re: [Python-Dev] Reorganizing re and curses related code

2017-11-03 Thread Serhiy Storchaka

03.11.17 12:29, Nick Coghlan wrote:

On 3 November 2017 at 20:01, Serhiy Storchaka  wrote:


Since virtually all changes in these files in recent years have been made by
me, I don't think this will harm other core developers. Are there any
objections?


Sounds fine to me (and you may want to add an underscore prefix to the
sre_*.py files in their new home).

The one caveat I'll note is that this may limit automatic backporting
of fixes to these files (I'm not sure how good 'git cherry-pick' is at
handling file renames).


I'm aware of this and tried to fix all known bugs (which can't be 
classified as a lack of a feature) in these modules before doing this 
change. There are two old bugs left in _sre, but they don't have fixes yet.




[Python-Dev] Remove typing from the stdlib (was: Reminder: 12 weeks to 3.7 feature code cutoff)

2017-11-03 Thread Serhiy Storchaka

03.11.17 16:36, Guido van Rossum wrote:
> Maybe we should remove typing from the stdlib?
> https://github.com/python/typing/issues/495

I haven't used typing, but AFAIK the most used feature of typing is 
NamedTuple. If NamedTuple and maybe other convenient classes not 
directly related to typing were moved into the collections or types 
modules, I think removing typing from the stdlib would stress people less.



Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Serhiy Storchaka

04.11.17 19:30, Stefan Krah wrote:

would it be possible to guarantee that dict literals are ordered in v3.7?


The issue is well-known and the workarounds are tedious, example:

https://mail.python.org/pipermail/python-ideas/2015-December/037423.html


If the feature is guaranteed now, people can rely on it around v3.9.


Do you suggest making dictionary displays produce OrderedDict instead 
of dict?




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Serhiy Storchaka

05.11.17 20:39, Stefan Krah wrote:

On Sun, Nov 05, 2017 at 01:14:54PM -0500, Paul G wrote:

2. Someone invents a new arbitrary-ordered container that would improve on the 
memory and/or CPU performance of the current dict implementation


I would think this is very unlikely, given that the previous dict implementation
has always been very fast. The new one is very fast, too.


A modification of the current implementation that doesn't preserve the 
initial order after deletion would be more compact and faster.




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Serhiy Storchaka

05.11.17 21:20, Stefan Krah wrote:

On Sun, Nov 05, 2017 at 09:01:40PM +0200, Serhiy Storchaka wrote:

Do you suggest to make dictionary displays producing OrderedDict
instead of dict?


No, this is essentially a language spec doc issue that would guarantee
the ordering properties of the current dict implementation.


Wouldn't it be enough to guarantee just the ordering of dicts before the 
first deletion? Or before the first resizing (the maximal size of 
dictionary displays is known at compile time, so they can be presized)?
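
For reference, this is what the ordering looks like on a CPython whose dict preserves insertion order (3.6 as an implementation detail, later versions as far as I know by specification). The keys and values are made up.

```python
d = {'a': 1, 'b': 2, 'c': 3}

# Order of a freshly evaluated dictionary display: the order of the keys
# in the source.
order_initial = list(d)

# After a deletion and re-insertion the key moves to the end; this is the
# part that a weaker guarantee would leave unspecified.
v = d.pop('a')
d['a'] = v
order_after = list(d)

print(order_initial)
print(order_after)
```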




Re: [Python-Dev] Guarantee ordered dict literals in v3.7?

2017-11-05 Thread Serhiy Storchaka

05.11.17 21:30, Stefan Krah wrote:

On Sun, Nov 05, 2017 at 09:09:37PM +0200, Serhiy Storchaka wrote:

05.11.17 20:39, Stefan Krah wrote:

On Sun, Nov 05, 2017 at 01:14:54PM -0500, Paul G wrote:

2. Someone invents a new arbitrary-ordered container that would improve on the 
memory and/or CPU performance of the current dict implementation


I would think this is very unlikely, given that the previous dict implementation
has always been very fast. The new one is very fast, too.


A modification of the current implementation that doesn't preserve
the initial order after deletion would be more compact and faster.


How much faster?


I didn't try to implement this. But the current implementation requires 
periodic reallocation if items are added and removed. The following loop 
reallocates the dict every len(d) iterations, while the size of the dict 
doesn't change and half of its storage is empty.


while True:
    v = d.pop(k)
    ...
    d[k] = v



Re: [Python-Dev] Proposal: go back to enabling DeprecationWarning by default

2017-11-05 Thread Serhiy Storchaka

06.11.17 04:05, Nick Coghlan wrote:

On the 12-weeks-to-3.7-feature-freeze thread, Jose Bueno & I both
mistakenly thought the async/await deprecation warnings were missing
from 3.6.

They weren't missing, we'd just both forgotten those warnings were off
by default (7 years after the change to the default settings in 2.7 &
3.2).


Following the issues on GitHub related to new Python releases, I have 
found that many projects try to fix deprecation warnings, but there are 
also projects that are surprised when deprecation periods end and 
features are removed.



So my proposal is simple (and not really new): let's revert back to
the way things were in 2.6 and earlier, with DeprecationWarning being
visible by default, and app devs having to silence it explicitly
during application startup (before they start importing third party
modules) if they don't want their users seeing it when running on the
latest Python version (e.g. this would be suitable for open source
apps that get integrated into Linux distros and use the system Python
there).

This will also restore the previously clear semantic and behavioural
difference between PendingDeprecationWarning (hidden by default) and
DeprecationWarning (visible by default).


There was a proposal to make DeprecationWarning visible by default in 
debug builds and in the interactive interpreter.


What if we first implement this idea in 3.7 and make DeprecationWarning 
visible by default in production scripts only in 3.8? This would cause 
less breakage.
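
Whatever the default ends up being, both behaviours can already be selected explicitly with the warnings module. A small sketch (old_api and count_shown are made-up names) that records whether a DeprecationWarning gets through under a given filter action:

```python
import warnings


def old_api():
    # Stand-in for a deprecated function in some third-party module.
    warnings.warn("old_api() is deprecated", DeprecationWarning, stacklevel=2)


def count_shown(action):
    """Call old_api() with the given filter action; return how many warnings were reported."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter(action, DeprecationWarning)
        old_api()
    return len(caught)


shown = count_shown("default")   # what "visible by default" would look like
hidden = count_shown("ignore")   # what an app silencing it at startup gets

print(shown, hidden)
```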



