Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-29 Thread Serhiy Storchaka

27.10.17 18:35, Guido van Rossum wrote:
The "why" question is not very interesting -- it probably wasn't in PCRE 
and nobody was familiar with it when we moved off PCRE (maybe it wasn't 
even in Perl at the time -- it was ~15 years ago).


I didn't understand your description of \G so I googled it and found a 
helpful StackOverflow article: 
https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. 
From this I understand that when using e.g. findall() it forces 
successive matches to be adjacent.


This looks too Perlish to me. In Perl, regular expressions are part of 
the language syntax and can even contain Perl expressions. Arguments are 
passed to them implicitly (as they are to Perl's analogs of str.strip() 
and str.split()), and results are saved in special global variables. 
Loops can also be implicit.


It seems to me that \G makes sense only to re.findall() and 
re.finditer(), not to re.match(), re.search() or re.split().


In Python all this is explicit. Compiled regular expressions are 
objects, and you can pass start and end positions to Pattern.match(). 
The Python equivalent of \G looks to me like:


import re

p = re.compile(...)
i = 0
while True:
    m = p.match(s, i)   # anchor the match at position i
    if not m:
        break
    ...
    i = m.end()         # the next match must start where this one ended


One can also use the undocumented Pattern.scanner() method. Actually, 
Pattern.finditer() is implemented as iter(Pattern.scanner().search). 
iter(Pattern.scanner().match) would return an iterator of adjacent matches.
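
A minimal sketch of the scanner approach described above. Pattern.scanner() is undocumented, so this illustrates current CPython behavior and is not a supported API. The pattern and the sample string are made up for the example.

```python
import re

# Hypothetical pattern and input, chosen only for illustration.
p = re.compile(r"\d+,?")
s = "10,20,30 and 40"

# Pattern.scanner() is undocumented; this relies on CPython behavior.
# Each call to .search() resumes scanning from where the previous match ended.
searched = [m.group() for m in iter(p.scanner(s).search, None)]

# Each call to .match() must match exactly where the previous match ended,
# so iteration stops at the first gap (the " and " in this input).
adjacent = [m.group() for m in iter(p.scanner(s).match, None)]

print(searched)  # ['10,', '20,', '30', '40']
print(adjacent)  # ['10,', '20,', '30']
```

The two-argument form iter(callable, sentinel) keeps calling the bound method until it returns None, which is what ends both loops.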


I think it would be more Pythonic (and much easier) to add a boolean 
parameter to finditer() and findall() than introduce a \G operator.


___
Python-Dev mailing list
Python-Dev@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Serhiy Storchaka
The copy() methods of list, dict, bytearray, set, frozenset, 
WeakValueDictionary, WeakKeyDictionary return an instance of the base 
type containing the content of the original collection.


The copy() methods of deque, defaultdict, OrderedDict, Counter, 
ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an 
instance of the same type as the original collection.


The copy() method of mappingproxy returns a copy of the underlying 
mapping (using its copy() method).


os.environ.copy() returns a dict.

Shouldn't it be more consistent?
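
The difference listed above can be observed with trivial subclasses. This is a small sketch; the subclass names are hypothetical and only a few of the listed types are covered.

```python
from collections import Counter, OrderedDict, deque

# Hypothetical empty subclasses, used only to observe what copy() returns.
class MyList(list): pass
class MyDict(dict): pass
class MySet(set): pass
class MyDeque(deque): pass
class MyOrderedDict(OrderedDict): pass
class MyCounter(Counter): pass

def copy_type_name(obj):
    """Return the name of the type produced by obj.copy()."""
    return type(obj.copy()).__name__

results = {
    cls.__name__: copy_type_name(cls())
    for cls in (MyList, MyDict, MySet, MyDeque, MyOrderedDict, MyCounter)
}

for name, copied in results.items():
    print(f"{name}().copy() -> {copied}")
```

The first three report the base type (list, dict, set), while the last three report the subclass itself.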



Re: [Python-Dev] Migrate python-dev to Mailman 3?

2017-10-29 Thread Serhiy Storchaka

26.10.17 12:24, Victor Stinner wrote:

We are using Mailman 3 for the new buildbot-status mailing list and it
works well:

https://mail.python.org/mm3/archives/list/buildbot-sta...@python.org/

I prefer to read archives with this UI, it's simpler to follow
threads, and it's possible to reply on the web UI!

To be honest, we got some issues when the new security-announce
mailing list was quickly migrated from Mailman 2 to Mailman 3, but
issues were quickly fixed as well.

Would it be possible to migrate python-dev to Mailman 3? Do you see
any blocker issue?


+1! The current UI is almost unusable. When you read a message, the only 
navigation links available are "prev/next in the thread" and a link back 
to the global list of messages. So you must either read all messages 
sequentially in some linearized order, losing context when you jump from 
the end of one branch to the start of another, or switch to the tree 
view, open every message in a separate tab, and switch between tabs. I 
preferred to use Gmane, but its web interface no longer works.


Does Mailman 3 provide an NNTP interface? The NNTP interface of Gmane 
still works, but it could be switched off at any time. It would be more 
reliable not to depend on an unstable third-party service.




Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Brett Cannon
It probably should be more consistent and I have a vague recollection that
this has been brought up before.

On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka,  wrote:

> The copy() methods of list, dict, bytearray, set, frozenset,
> WeakValueDictionary, WeakKeyDictionary return an instance of the base
> type containing the content of the original collection.
>
> The copy() methods of deque, defaultdict, OrderedDict, Counter,
> ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an
> instance of the same type as the original collection.
>
> The copy() method of mappingproxy returns a copy of the underlying
> mapping (using its copy() method).
>
> os.environ.copy() returns a dict.
>
> Shouldn't it be more consistent?


Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-29 Thread MRAB

On 2017-10-29 12:27, Serhiy Storchaka wrote:

27.10.17 18:35, Guido van Rossum wrote:
The "why" question is not very interesting -- it probably wasn't in PCRE 
and nobody was familiar with it when we moved off PCRE (maybe it wasn't 
even in Perl at the time -- it was ~15 years ago).


I didn't understand your description of \G so I googled it and found a 
helpful StackOverflow article: 
https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. 
From this I understand that when using e.g. findall() it forces 
successive matches to be adjacent.


This looks too Perlish to me. In Perl, regular expressions are part of
the language syntax and can even contain Perl expressions. Arguments are
passed to them implicitly (as they are to Perl's analogs of str.strip()
and str.split()), and results are saved in special global variables.
Loops can also be implicit.

It seems to me that \G makes sense only to re.findall() and
re.finditer(), not to re.match(), re.search() or re.split().

In Python all this is explicit. Compiled regular expressions are
objects, and you can pass start and end positions to Pattern.match().
The Python equivalent of \G looks to me like:

p = re.compile(...)
i = 0
while True:
  m = p.match(s, i)
  if not m: break
  ...
  i = m.end()


You're correct. \G matches at the start position, so .search(r'\G\w+') 
behaves the same as .match(r'\w+').


findall and finditer perform a series of searches, but with \G at the 
start they'll perform a series of matches, each anchored at where the 
previous one ended.
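
The contrast between a series of searches and a series of anchored matches can be sketched with the standard re module alone. The helper name adjacent_matches, the pattern, and the sample text are assumptions made for illustration, and the empty-match guard is an addition that the loop quoted above does not have.

```python
import re

def adjacent_matches(pattern, string, pos=0):
    """Yield successive matches, each anchored where the previous one ended."""
    while True:
        m = pattern.match(string, pos)
        if m is None:
            return
        yield m
        if m.end() == pos:
            # Guard added for this sketch: an empty match would loop forever.
            return
        pos = m.end()

# Hypothetical pattern and input.
word = re.compile(r"\w+\s*")
text = "alpha beta, gamma"

# finditer() searches, so it skips over the comma and finds "gamma".
found = [m.group() for m in word.finditer(text)]

# Anchored matching stops at the first position where the pattern fails.
chained = [m.group() for m in adjacent_matches(word, text)]

print(found)    # ['alpha ', 'beta', 'gamma']
print(chained)  # ['alpha ', 'beta']
```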



One can also use the undocumented Pattern.scanner() method. Actually,
Pattern.finditer() is implemented as iter(Pattern.scanner().search).
iter(Pattern.scanner().match) would return an iterator of adjacent matches.

I think it would be more Pythonic (and much easier) to add a boolean
parameter to finditer() and findall() than introduce a \G operator.




Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Raymond Hettinger

> On Oct 29, 2017, at 8:19 AM, Serhiy Storchaka  wrote:
> 
> The copy() methods of list, dict, bytearray, set, frozenset, 
> WeakValueDictionary, WeakKeyDictionary return an instance of the base type 
> containing the content of the original collection.
> 
> The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, 
> UserDict, UserList, WeakSet, ElementTree.Element return an instance of the 
> same type as the original collection.
> 
> The copy() method of mappingproxy returns a copy of the underlying mapping 
> (using its copy() method).
> 
> os.environ.copy() returns a dict.
> 
> Shouldn't it be more consistent?

Not really.  It is up to the class designer to make a decision about what the 
most useful behavior would be for subclassers.

Note for a regular Python class, copy.copy() by default creates an instance of 
the subclass.  On the other hand, instances like int() are harder to subclass 
because all the int operations such as __add__ produce exact int() instances 
(this is likely because so few assumptions can be made about the subclass and 
because it isn't clear what the semantics would be otherwise).
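
Both observations can be checked with a short sketch. The class names Point, LabeledPoint, and Meters are hypothetical and exist only for this illustration.

```python
import copy

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class LabeledPoint(Point):
    """Hypothetical subclass with no copy support of its own."""

# copy.copy() on an ordinary class instance preserves the subclass.
lp = LabeledPoint(1, 2)
clone = copy.copy(lp)
print(type(clone).__name__, clone.x, clone.y)  # LabeledPoint 1 2

class Meters(int):
    """Hypothetical int subclass."""

# Arithmetic on an int subclass falls back to an exact int.
total = Meters(3) + Meters(4)
print(type(total).__name__, total)  # int 7
```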

Also, the time to argue and change APIs is BEFORE they are released, not a 
decade or two after they've lived successfully in the wild.


Raymond





Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Guido van Rossum
It's somewhat problematic. If I subclass dict with a different constructor,
but I don't overload copy(), how can the dict.copy() method construct a
correct instance of the subclass? Even if the constructor signatures match,
how can dict.copy() make sure it copies all attributes properly? Without an
answer to these questions I think it's better to admit defeat and return a
dict instance -- classes that want to do better should overload copy().

I notice that Counter.copy() has all the problems I indicate here -- it
works as long as you don't add attributes or change the constructor
signature. I bet this isn't documented anywhere.
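
Both points can be illustrated with a small sketch. The subclasses Config and TaggedCounter are hypothetical and deliberately change the constructor signature.

```python
from collections import Counter

class Config(dict):
    """Hypothetical dict subclass with a different constructor."""
    def __init__(self, name, **items):
        super().__init__(**items)
        self.name = name

cfg = Config("prod", debug=False)
plain = cfg.copy()

# dict.copy() returns a plain dict, so the extra attribute is not carried over.
print(type(plain).__name__)    # dict
print(hasattr(plain, "name"))  # False

class TaggedCounter(Counter):
    """Hypothetical Counter subclass with a changed constructor signature."""
    def __init__(self, tag, data):
        super().__init__(data)
        self.tag = tag

tc = TaggedCounter("words", ["a", "b", "a"])

# Counter.copy() calls self.__class__(self), which no longer fits the
# two-argument constructor.
try:
    tc.copy()
except TypeError as exc:
    copy_error = exc
else:
    copy_error = None

print(type(copy_error).__name__)  # TypeError
```

A subclass like TaggedCounter would need its own copy() for copying to work.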

On Sun, Oct 29, 2017 at 9:40 AM, Brett Cannon  wrote:

> It probably should be more consistent and I have a vague recollection that
> this has been brought up before.
>
> On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka,  wrote:
>
>> The copy() methods of list, dict, bytearray, set, frozenset,
>> WeakValueDictionary, WeakKeyDictionary return an instance of the base
>> type containing the content of the original collection.
>>
>> The copy() methods of deque, defaultdict, OrderedDict, Counter,
>> ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an
>> instance of the same type as the original collection.
>>
>> The copy() method of mappingproxy returns a copy of the underlying
>> mapping (using its copy() method).
>>
>> os.environ.copy() returns a dict.
>>
>> Shouldn't it be more consistent?


-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Migrate python-dev to Mailman 3?

2017-10-29 Thread Barry Warsaw
On Oct 29, 2017, at 11:42, Serhiy Storchaka  wrote:

> Does Mailman 3 provide a NNTP interface? The NNTP interface of Gmane still 
> works, but it can be switched off at any time. It would be more reliable to 
> not depend on an unstable third-party service.

I use the NNTP interface of Gmane too (although not for python-dev), and agree 
with everything you're saying here.  Right now, however, MM3 does not have a 
built-in NNTP server.

Cheers,
-Barry





Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Raymond Hettinger

> On Oct 29, 2017, at 10:04 AM, Guido van Rossum  wrote:
> 
> Without an answer to these questions I think it's better to admit defeat and 
> return a dict instance 

I think it is better to admit success and recognize that these APIs have fared 
well in the wild.

Focusing just on OrderedDict() and dict(),  I don't see how to change the 
copy() method for either of them without breaking existing code.  OrderedDict 
*is* a dict subclass but really does need to have copy() return an OrderedDict.

The *default* behavior for any pure python class is for copy.copy() to return 
an instance of that class.  We really don't want ChainMap() to return a dict 
instance -- that would defeat the whole purpose of having a ChainMap in the 
first place.

And unlike the original builtin classes, most of the collection classes were 
specifically designed to be easily subclassable (not making the subclasser do 
work unnecessarily).  These aren't accidental behaviors:

    class ChainMap(MutableMapping):

        def copy(self):
            'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
            return self.__class__(self.maps[0].copy(), *self.maps[1:])

Do you really want that changed to:

            return ChainMap(self.maps[0].copy(), *self.maps[1:])

Or to:

            return dict(self)

Do you really want Serhiy to sweep through the code and change all of these 
long standing APIs, overriding the decisions of the people who designed those 
classes, and breaking all user code that reasonably relied on those useful and 
intentional behaviors?
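
The ChainMap.copy() behavior quoted above can be checked with a short sketch. The subclass ScopedMap and the sample dictionaries are hypothetical.

```python
from collections import ChainMap

class ScopedMap(ChainMap):
    """Hypothetical subclass used only for this illustration."""

local_scope = {"x": 1}
global_scope = {"y": 2}

original = ScopedMap(local_scope, global_scope)
clone = original.copy()

# copy() goes through self.__class__, so the subclass is preserved.
print(type(clone).__name__)           # ScopedMap

# maps[0] is a new dict; the remaining maps are shared by reference.
print(clone.maps[0] is local_scope)   # False
print(clone.maps[1] is global_scope)  # True

# Writes to the copy land in its own first mapping.
clone["x"] = 99
print(local_scope["x"])               # 1
```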


Raymond


P.S.  Possibly related:  We've gone out of our way in many classes to have a 
__repr__ that uses the name of the subclass.  Presumably, this is to make life 
easier for subclassers (one less method they have to override), but it does 
make an assumption about what the subclass signature looks like.  IIRC, our 
position on that has been that a subclasser who changes the signature would 
then need to override the __repr__.   ISTM that similar reasoning would apply 
to copy.
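
A brief sketch of the __repr__ behavior mentioned in the P.S. The subclass names are hypothetical. The exact text inside the parentheses varies between Python versions for some of these types, so only the leading class name is relied on here.

```python
from collections import Counter, OrderedDict, deque

# Hypothetical subclasses that do not override __repr__.
class Inventory(Counter): pass
class History(deque): pass
class Settings(OrderedDict): pass

reprs = [
    repr(Inventory(apples=2)),
    repr(History([1, 2])),
    repr(Settings(a=1)),
]

# Each repr starts with the subclass name, not the base class name.
for text in reprs:
    print(text)
```

If a subclass changes its constructor signature, the inherited repr still names the right class but no longer shows a valid constructor call, which is the assumption the P.S. refers to.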



Re: [Python-Dev] The type of the result of the copy() method

2017-10-29 Thread Guido van Rossum
On Sun, Oct 29, 2017 at 10:41 AM, Raymond Hettinger <
raymond.hettin...@gmail.com> wrote:

>
> > On Oct 29, 2017, at 10:04 AM, Guido van Rossum  wrote:
> >
> > Without an answer to these questions I think it's better to admit defeat
> > and return a dict instance
>
> I think it is better to admit success and recognize that these APIs have
> fared well in the wild.
>

Oh, I agree!

> Focusing just on OrderedDict() and dict(),  I don't see how to change the
> copy() method for either of them without breaking existing code.
> OrderedDict *is* a dict subclass but really does need to have copy() return
> an OrderedDict.
>

And I wasn't proposing that. I like what OrderedDict does -- I was just
suggesting that the *default* dict.copy() needn't worry about this.


> The *default* behavior for any pure python class is for copy.copy() to
> return an instance of that class.  We really don't want ChainMap() to
> return a dict instance -- that would defeat the whole purpose of having a
> ChainMap in the first place.
>

Of course.

> And unlike the original builtin classes, most of the collection classes
> were specifically designed to be easily subclassable (not making the
> subclasser do work unnecessarily).  These aren't accidental behaviors:
>
>     class ChainMap(MutableMapping):
>
>         def copy(self):
>             'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
>             return self.__class__(self.maps[0].copy(), *self.maps[1:])
>
> Do you really want that changed to:
>
>             return ChainMap(self.maps[0].copy(), *self.maps[1:])
>
> Or to:
>
>             return dict(self)
>

I think you've misread what I meant. (The defeat I referred to was
accepting the status quo, no matter how inconsistent it seems, not a
withdrawal to some other seemingly inconsistent but different rule.)


> Do you really want Serhiy to sweep through the code and change all of
> these long standing APIs, overriding the decisions of the people who
> designed those classes, and breaking all user code that reasonably relied
> on those useful and intentional behaviors?
>

No, and I never said that. Calm down.

> Raymond
>
>
> P.S.  Possibly related:  We've gone out of way in many classes to have a
> __repr__ that uses the name of the subclass.  Presumably, this is to make
> life easier for subclassers (one less method they have to override), but it
> does make an assumption about what the subclass signature looks like.
> IIRC, our position on that has been that a subclasser who changes the
> signature would then need to override the __repr__.   ISTM that similar
> reasoning would apply to copy.
>

I don't think the same reasoning applies. When the string returned doesn't
indicate the true class of the object, debugging becomes a lot harder. If
the signature in the repr() output is wrong, the user can probably deal
with that. And yes, the subclasser who wants the best possible repr() needs
to override it, but the use cases don't match.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] \G (match last position) regex operator non-existant in python?

2017-10-29 Thread Jakub Wilk

* Guido van Rossum , 2017-10-28, 14:05:
even if we outright switched there would *still* be two versions, 
because regex itself has an internal versioning scheme where V0 claims 
to be strictly compatible with re and V1 explicitly changes the 
matching rules in some cases. (I don't know if this means that you have 
to request V1 to use \G though.)


No, \G is available in the V0 mode.

--
Jakub Wilk


Re: [Python-Dev] PEP 561: Distributing and Packaging Type Information

2017-10-29 Thread Ethan Smith
On Fri, Oct 27, 2017 at 12:44 AM, Nathaniel Smith  wrote:

> On Thu, Oct 26, 2017 at 3:42 PM, Ethan Smith  wrote:
> > However, the stubs may be put in a sub-folder
> > of the Python sources, with the same name the ``*.py`` files are in. For
> > example, the ``flyingcircus`` package would have its stubs in the folder
> > ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are
> > not found in ``flyingcircus/`` the type checker may treat the subdirectory as
> > a normal package.
>
> I admit that I find this aesthetically unpleasant. Wouldn't something
> like __typestubs__/ be a more Pythonic name? (And also avoid potential
> name clashes, e.g. my async_generator package has a top-level export
> called async_generator; normally you do 'from async_generator import
> async_generator'. I think that might cause problems if I created an
> async_generator/async_generator/ directory, especially post-PEP 420.)
>

I agree, this is unpleasant. I now think that if maintainers do not wish to
ship stubs alongside their Python code, they should just create separate
stub-only packages. I don't think there is a particular need to
special-case this for minor convenience.



> > Type Checker Module Resolution Order
> > 
> >
> > The following is the order that type checkers supporting this PEP should
> > resolve modules containing type information:
> >
> > 1. User code - the files the type checker is running on.
> >
> > 2. Stubs or Python source manually put in the beginning of the path. Type
> >    checkers should provide this to allow the user complete control of which
> >    stubs to use, and patch broken stubs/inline types from packages.
> >
> > 3. Third party stub packages - these packages can supersede the installed
> >    untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``,
> >    however it is encouraged to check the package's metadata using packaging
> >    query APIs such as ``pkg_resources`` to assure that the package is meant
> >    for type checking, and is compatible with the installed version.
>
> Am I right that this means you need to be able to map from import
> names to distribution names? I.e., if you see 'import foo', you need
> to figure out which *.dist-info directory contains metadata for the
> 'foo' package? How do you plan to do this?
>
>
> The problem is that technically, import names and distribution names
> are totally unrelated namespaces -- for example, the '_pytest' package
> comes from the 'pytest' distribution, the 'pylab' package comes from
> 'matplotlib', and 'pip install scikit-learn' gives you a package
> imported as 'sklearn'. Namespace packages are also challenging,
> because a single top-level package might actually be spread across
> multiple distributions.
>
>
This is a problem. What I now realize is that the typing metadata is needed
for *packages* and not distributions. I will work on a new proposal that
makes the metadata per-package. It will require a slightly more complicated
proposal, but I feel that it is necessary. Thank you for pointing out this
issue with my proposal; I probably should have caught it earlier.

> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
>