Re: [Python-Dev] \G (match last position) regex operator non-existant in python?
27.10.17 18:35, Guido van Rossum wrote:
> The "why" question is not very interesting -- it probably wasn't in PCRE and nobody was familiar with it when we moved off PCRE (maybe it wasn't even in Perl at the time -- it was ~15 years ago). I didn't understand your description of \G so I googled it and found a helpful StackOverflow article: https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. From this I understand that when using e.g. findall() it forces successive matches to be adjacent.

This looks too Perlish to me. In Perl, regular expressions are part of the language syntax, and they can even contain Perl expressions. Arguments are passed to them implicitly (as they are to Perl's analogs of str.strip() and str.split()), and results are saved in global special variables. Loops can also be implicit.

It seems to me that \G makes sense only for re.findall() and re.finditer(), not for re.match(), re.search() or re.split().

In Python all this is explicit. Compiled regular expressions are objects, and you can pass start and end positions to Pattern.match(). The Python equivalent of \G looks to me like:

    p = re.compile(...)
    i = 0
    while True:
        m = p.match(s, i)
        if not m: break
        ...
        i = m.end()

One can also use the undocumented Pattern.scanner() method. Actually, Pattern.finditer() is implemented as iter(Pattern.scanner().search). iter(Pattern.scanner().match) would return an iterator of adjacent matches.

I think it would be more Pythonic (and much easier) to add a boolean parameter to finditer() and findall() than to introduce a \G operator.

___ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
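The explicit loop described in this message can be wrapped in a small generator. This is only a sketch: the helper name adjacent_matches, the sample pattern and the sample text are made up for illustration, and the guard against empty matches is an addition that is not part of the message.

```python
import re

def adjacent_matches(pattern, s):
    """Yield successive matches, each anchored where the previous one ended."""
    p = re.compile(pattern)
    i = 0
    while True:
        m = p.match(s, i)
        if not m:
            break
        yield m
        if m.end() == i:
            # Empty match: stop rather than loop forever (added guard).
            break
        i = m.end()

text = "1,22,333 x 4"
adjacent = [m.group() for m in adjacent_matches(r"\d+,?", text)]
everything = re.findall(r"\d+,?", text)
print(adjacent)    # ['1,', '22,', '333']
print(everything)  # ['1,', '22,', '333', '4']
```

The anchored loop stops at the first gap (the " x " in the sample text), while findall() skips over it and also reports the trailing 4.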
[Python-Dev] The type of the result of the copy() method
The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection. The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection. The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method). os.environ.copy() returns a dict. Shouldn't it be more consistent?
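The difference is easy to observe by subclassing a few of the listed types and checking what copy() returns. A minimal sketch follows; the helper copy_type_name and the throwaway subclass name Sub are hypothetical, and only a handful of the listed types are checked.

```python
import os
from collections import Counter, OrderedDict, defaultdict, deque

def copy_type_name(cls, *args):
    """Subclass cls, build an instance, and report the type name of .copy()."""
    sub = type("Sub", (cls,), {})
    return type(sub(*args).copy()).__name__

# Built-in types: copy() on a subclass instance returns the base type.
base_results = {cls.__name__: copy_type_name(cls)
                for cls in (list, dict, set, bytearray)}

# collections types: copy() returns an instance of the subclass.
same_results = {cls.__name__: copy_type_name(cls)
                for cls in (deque, OrderedDict, Counter)}
same_results["defaultdict"] = copy_type_name(defaultdict, int)

print(base_results)
print(same_results)
print(type(os.environ.copy()).__name__)  # dict
```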
Re: [Python-Dev] Migrate python-dev to Mailman 3?
26.10.17 12:24, Victor Stinner wrote:
> We are using Mailman 3 for the new buildbot-status mailing list and it works well: https://mail.python.org/mm3/archives/list/buildbot-sta...@python.org/
> I prefer to read archives with this UI, it's simpler to follow threads, and it's possible to reply on the web UI!
> To be honest, we got some issues when the new security-announce mailing list was quickly migrated from Mailman 2 to Mailman 3, but issues were quickly fixed as well.
> Would it be possible to migrate python-dev to Mailman 3? Do you see any blocker issue?

+1! The current UI is almost unusable. When you read a message, the only navigation links available are "prev/next in the thread" and back to the global list of messages. So you either read all messages sequentially in some linearized order and lose context when jumping from the end of one branch to the start of another, or switch to the tree view, open every message in a separate tab, and switch between tabs.

I preferred to use Gmane, but its web interface no longer works. Does Mailman 3 provide an NNTP interface? The NNTP interface of Gmane still works, but it can be switched off at any time. It would be more reliable not to depend on an unstable third-party service.
Re: [Python-Dev] The type of the result of the copy() method
It probably should be more consistent and I have a vague recollection that this has been brought up before.

On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka wrote:
> The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection.
>
> The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection.
>
> The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method).
>
> os.environ.copy() returns a dict.
>
> Shouldn't it be more consistent?
Re: [Python-Dev] \G (match last position) regex operator non-existant in python?
On 2017-10-29 12:27, Serhiy Storchaka wrote:
> 27.10.17 18:35, Guido van Rossum wrote:
>> The "why" question is not very interesting -- it probably wasn't in PCRE and nobody was familiar with it when we moved off PCRE (maybe it wasn't even in Perl at the time -- it was ~15 years ago). I didn't understand your description of \G so I googled it and found a helpful StackOverflow article: https://stackoverflow.com/questions/21971701/when-is-g-useful-application-in-a-regex. From this I understand that when using e.g. findall() it forces successive matches to be adjacent.
>
> This looks too Perlish to me. In Perl, regular expressions are part of the language syntax, and they can even contain Perl expressions. Arguments are passed to them implicitly (as they are to Perl's analogs of str.strip() and str.split()), and results are saved in global special variables. Loops can also be implicit.
>
> It seems to me that \G makes sense only for re.findall() and re.finditer(), not for re.match(), re.search() or re.split().
>
> In Python all this is explicit. Compiled regular expressions are objects, and you can pass start and end positions to Pattern.match(). The Python equivalent of \G looks to me like:
>
>     p = re.compile(...)
>     i = 0
>     while True:
>         m = p.match(s, i)
>         if not m: break
>         ...
>         i = m.end()

You're correct. \G matches at the start position, so .search(r'\G\w+') behaves the same as .match(r'\w+'). findall and finditer perform a series of searches, but with \G at the start they'll perform a series of matches, each anchored at where the previous one ended.

> One can also use the undocumented Pattern.scanner() method. Actually, Pattern.finditer() is implemented as iter(Pattern.scanner().search). iter(Pattern.scanner().match) would return an iterator of adjacent matches.
>
> I think it would be more Pythonic (and much easier) to add a boolean parameter to finditer() and findall() than to introduce a \G operator.
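The scanner-based variant mentioned in the quoted text can be tried directly. Note the assumptions: Pattern.scanner() is undocumented, so this sketch relies on current CPython behavior that may change, and the pattern and sample text are invented for illustration. The two-argument form iter(callable, sentinel) is used so that iteration stops when the scanner returns None.

```python
import re

p = re.compile(r"\d+,?")
text = "1,22,333 x 4"

# Pattern.scanner() is undocumented; this relies on current CPython behavior.
searching = [m.group() for m in iter(p.scanner(text).search, None)]
matching = [m.group() for m in iter(p.scanner(text).match, None)]

print(searching)  # same items as p.findall(text)
print(matching)   # stops at the first gap between matches
```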
Re: [Python-Dev] The type of the result of the copy() method
> On Oct 29, 2017, at 8:19 AM, Serhiy Storchaka wrote:
>
> The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection.
>
> The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection.
>
> The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method).
>
> os.environ.copy() returns a dict.
>
> Shouldn't it be more consistent?

Not really. It is up to the class designer to make a decision about what the most useful behavior would be for subclassers. Note for a regular Python class, copy.copy() by default creates an instance of the subclass. On the other hand, instances like int() are harder to subclass because all the int operations such as __add__ produce exact int() instances (this is likely because so few assumptions can be made about the subclass and because it isn't clear what the semantics would be otherwise).

Also, the time to argue and change APIs is BEFORE they are released, not a decade or two after they've lived successfully in the wild.

Raymond
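Both behaviors mentioned here can be checked in a few lines. The classes Meters, Point and Point3D below are hypothetical and exist only for this sketch.

```python
import copy

class Meters(int):
    """Hypothetical int subclass used only for illustration."""

class Point:
    """Hypothetical plain Python class."""
    def __init__(self, x, y):
        self.x, self.y = x, y

class Point3D(Point):
    """Hypothetical subclass with a different constructor signature."""
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

# int arithmetic on subclass instances produces an exact int.
total = Meters(2) + Meters(3)

# copy.copy() on a regular class instance produces an instance of the subclass.
clone = copy.copy(Point3D(1, 2, 3))

print(type(total).__name__)           # int
print(type(clone).__name__, clone.z)  # Point3D 3
```

For a plain class, copy.copy() rebuilds the object from its state rather than by calling __init__, which is why the changed constructor signature of Point3D does not get in the way.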
Re: [Python-Dev] The type of the result of the copy() method
It's somewhat problematic. If I subclass dict with a different constructor, but I don't override copy(), how can the dict.copy() method construct a correct instance of the subclass? Even if the constructor signatures match, how can dict.copy() make sure it copies all attributes properly? Without an answer to these questions I think it's better to admit defeat and return a dict instance -- classes that want to do better should override copy().

I notice that Counter.copy() has all the problems I indicate here -- it works as long as you don't add attributes or change the constructor signature. I bet this isn't documented anywhere.

On Sun, Oct 29, 2017 at 9:40 AM, Brett Cannon wrote:
> It probably should be more consistent and I have a vague recollection that this has been brought up before.
>
> On Sun, Oct 29, 2017, 08:21 Serhiy Storchaka wrote:
>> The copy() methods of list, dict, bytearray, set, frozenset, WeakValueDictionary, WeakKeyDictionary return an instance of the base type containing the content of the original collection.
>>
>> The copy() methods of deque, defaultdict, OrderedDict, Counter, ChainMap, UserDict, UserList, WeakSet, ElementTree.Element return an instance of the same type as the original collection.
>>
>> The copy() method of mappingproxy returns a copy of the underlying mapping (using its copy() method).
>>
>> os.environ.copy() returns a dict.
>>
>> Shouldn't it be more consistent?

--
--Guido van Rossum (python.org/~guido)
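The two Counter.copy() failure modes described in this message can be reproduced with small subclasses. Both TaggedCounter and PlainCounter are hypothetical; the sketch assumes the CPython implementation of Counter.copy(), which builds the copy by calling the instance's class with the original as the only argument.

```python
from collections import Counter

class TaggedCounter(Counter):
    """Hypothetical subclass whose constructor requires an extra argument."""
    def __init__(self, iterable=(), *, tag):
        super().__init__(iterable)
        self.tag = tag

class PlainCounter(Counter):
    """Hypothetical subclass that keeps the constructor signature."""

# Changed signature: copy() cannot supply the required 'tag' argument.
tagged = TaggedCounter("aab", tag="demo")
try:
    tagged.copy()
    copy_error = None
except TypeError as exc:
    copy_error = exc

# Same signature, extra attribute: the copy is built but the attribute is lost.
plain = PlainCounter("aab")
plain.note = "set after construction"
plain_copy = plain.copy()

print(type(copy_error).__name__)
print(type(plain_copy).__name__, hasattr(plain_copy, "note"))
```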
Re: [Python-Dev] Migrate python-dev to Mailman 3?
On Oct 29, 2017, at 11:42, Serhiy Storchaka wrote:
> Does Mailman 3 provide a NNTP interface? The NNTP interface of Gmane still works, but it can be switched off at any time. It would be more reliable to not depend on an unstable third-party service.

I use the NNTP interface of Gmane too (although not for python-dev), and agree with everything you're saying here. Right now, however, MM3 does not have a built-in NNTP server.

Cheers,
-Barry
Re: [Python-Dev] The type of the result of the copy() method
> On Oct 29, 2017, at 10:04 AM, Guido van Rossum wrote:
>
> Without an answer to these questions I think it's better to admit defeat and return a dict instance

I think it is better to admit success and recognize that these APIs have fared well in the wild.

Focusing just on OrderedDict() and dict(), I don't see how to change the copy() method for either of them without breaking existing code. OrderedDict *is* a dict subclass but really does need to have copy() return an OrderedDict.

The *default* behavior for any pure python class is for copy.copy() to return an instance of that class. We really don't want ChainMap() to return a dict instance -- that would defeat the whole purpose of having a ChainMap in the first place.

And unlike the original builtin classes, most of the collection classes were specifically designed to be easily subclassable (not making the subclasser do work unnecessarily). These aren't accidental behaviors:

    class ChainMap(MutableMapping):

        def copy(self):
            'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
            return self.__class__(self.maps[0].copy(), *self.maps[1:])

Do you really want that changed to:

            return ChainMap(self.maps[0].copy(), *self.maps[1:])

Or to:

            return dict(self)

Do you really want Serhiy to sweep through the code and change all of these long standing APIs, overriding the decisions of the people who designed those classes, and breaking all user code that reasonably relied on those useful and intentional behaviors?

Raymond

P.S. Possibly related: We've gone out of our way in many classes to have a __repr__ that uses the name of the subclass. Presumably, this is to make life easier for subclassers (one less method they have to override), but it does make an assumption about what the subclass signature looks like. IIRC, our position on that has been that a subclasser who changes the signature would then need to override the __repr__. ISTM that similar reasoning would apply to copy.
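The ChainMap.copy() behavior quoted in this message (subclass preserved, first mapping copied, remaining mappings shared) can be observed with a small subclass. The class Settings and the sample dictionaries are hypothetical.

```python
from collections import ChainMap

class Settings(ChainMap):
    """Hypothetical ChainMap subclass."""

defaults = {"color": "blue", "size": 1}
overrides = {"size": 2}
settings = Settings(overrides, defaults)
clone = settings.copy()

# Writes go to the first mapping, which copy() duplicated.
clone["size"] = 3

print(type(clone).__name__)              # Settings
print(overrides["size"], clone["size"])  # 2 3
print(clone.maps[1] is defaults)         # True
```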
Re: [Python-Dev] The type of the result of the copy() method
On Sun, Oct 29, 2017 at 10:41 AM, Raymond Hettinger <raymond.hettin...@gmail.com> wrote:
> > On Oct 29, 2017, at 10:04 AM, Guido van Rossum wrote:
> >
> > Without an answer to these questions I think it's better to admit defeat and return a dict instance
>
> I think it is better to admit success and recognize that these APIs have fared well in the wild.

Oh, I agree!

> Focusing just on OrderedDict() and dict(), I don't see how to change the copy() method for either of them without breaking existing code. OrderedDict *is* a dict subclass but really does need to have copy() return an OrderedDict.

And I wasn't proposing that. I like what OrderedDict does -- I was just suggesting that the *default* dict.copy() needn't worry about this.

> The *default* behavior for any pure python class is for copy.copy() to return an instance of that class. We really don't want ChainMap() to return a dict instance -- that would defeat the whole purpose of having a ChainMap in the first place.

Of course.

> And unlike the original builtin classes, most of the collection classes were specifically designed to be easily subclassable (not making the subclasser do work unnecessarily). These aren't accidental behaviors:
>
>     class ChainMap(MutableMapping):
>
>         def copy(self):
>             'New ChainMap or subclass with a new copy of maps[0] and refs to maps[1:]'
>             return self.__class__(self.maps[0].copy(), *self.maps[1:])
>
> Do you really want that changed to:
>
>             return ChainMap(self.maps[0].copy(), *self.maps[1:])
>
> Or to:
>
>             return dict(self)

I think you've misread what I meant. (The defeat I referred to was accepting the status quo, no matter how inconsistent it seems, not a withdrawal to some other seemingly inconsistent but different rule.)

> Do you really want Serhiy to sweep through the code and change all of these long standing APIs, overriding the decisions of the people who designed those classes, and breaking all user code that reasonably relied on those useful and intentional behaviors?

No, and I never said that. Calm down.

> Raymond
>
> P.S. Possibly related: We've gone out of our way in many classes to have a __repr__ that uses the name of the subclass. Presumably, this is to make life easier for subclassers (one less method they have to override), but it does make an assumption about what the subclass signature looks like. IIRC, our position on that has been that a subclasser who changes the signature would then need to override the __repr__. ISTM that similar reasoning would apply to copy.

I don't think the same reasoning applies. When the string returned doesn't indicate the true class of the object, debugging becomes a lot harder. If the signature in the repr() output is wrong, the user can probably deal with that. And yes, the subclasser who wants the best possible repr() needs to override it, but the use cases don't match.

--
--Guido van Rossum (python.org/~guido)
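The __repr__ point can be seen with deque, one of the classes whose repr uses the subclass name. History and BoundedHistory are hypothetical subclasses invented for this sketch.

```python
from collections import deque

class History(deque):
    """Hypothetical subclass that keeps deque's constructor signature."""

class BoundedHistory(deque):
    """Hypothetical subclass that changes the constructor signature."""
    def __init__(self, limit):
        super().__init__(maxlen=limit)

h = History([1, 2])
b = BoundedHistory(3)
b.append("x")

# The repr names the true class in both cases.
print(repr(h))
# Here the argument list shown follows deque's signature, not BoundedHistory's.
print(repr(b))
```

In the second case the class name is still accurate, which helps debugging, but the text is not a valid BoundedHistory(...) call; a subclass that wants an exact repr has to override __repr__.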
Re: [Python-Dev] \G (match last position) regex operator non-existant in python?
* Guido van Rossum, 2017-10-28, 14:05:
> even if we outright switched there would *still* be two versions, because regex itself has an internal versioning scheme where V0 claims to be strictly compatible with re and V1 explicitly changes the matching rules in some cases. (I don't know if this means that you have to request V1 to use \G though.)

No, \G is available in the V0 mode.

--
Jakub Wilk
Re: [Python-Dev] PEP 561: Distributing and Packaging Type Information
On Fri, Oct 27, 2017 at 12:44 AM, Nathaniel Smith wrote:
> On Thu, Oct 26, 2017 at 3:42 PM, Ethan Smith wrote:
> > However, the stubs may be put in a sub-folder of the Python sources, with the same name the ``*.py`` files are in. For example, the ``flyingcircus`` package would have its stubs in the folder ``flyingcircus/flyingcircus/``. This path is chosen so that if stubs are not found in ``flyingcircus/`` the type checker may treat the subdirectory as a normal package.
>
> I admit that I find this aesthetically unpleasant. Wouldn't something like __typestubs__/ be a more Pythonic name? (And also avoid potential name clashes, e.g. my async_generator package has a top-level export called async_generator; normally you do 'from async_generator import async_generator'. I think that might cause problems if I created an async_generator/async_generator/ directory, especially post-PEP 420.)

I agree, this is unpleasant. I am now of the opinion that if maintainers do not wish to ship stubs alongside their Python code, they should just create separate stub-only packages. I don't think there is a particular need to special-case this for minor convenience.

> > Type Checker Module Resolution Order
> >
> > The following is the order that type checkers supporting this PEP should resolve modules containing type information:
> >
> > 1. User code - the files the type checker is running on.
> >
> > 2. Stubs or Python source manually put in the beginning of the path. Type checkers should provide this to allow the user complete control of which stubs to use, and patch broken stubs/inline types from packages.
> >
> > 3. Third party stub packages - these packages can supersede the installed untyped packages. They can be found at ``pkg-stubs`` for package ``pkg``, however it is encouraged to check the package's metadata using packaging query APIs such as ``pkg_resources`` to assure that the package is meant for type checking, and is compatible with the installed version.
>
> Am I right that this means you need to be able to map from import names to distribution names? I.e., if you see 'import foo', you need to figure out which *.dist-info directory contains metadata for the 'foo' package? How do you plan to do this?
>
> The problem is that technically, import names and distribution names are totally unrelated namespaces -- for example, the '_pytest' package comes from the 'pytest' distribution, the 'pylab' package comes from 'matplotlib', and 'pip install scikit-learn' gives you a package imported as 'sklearn'. Namespace packages are also challenging, because a single top-level package might actually be spread across multiple distributions.

This is a problem. What I now realize is that the typing metadata is needed for *packages* and not distributions. I will work on a new proposal that makes the metadata per-package. It will require a slightly more complicated proposal, but I feel that it is necessary. Thank you for pointing out this issue with my proposal; I probably should have caught it earlier.

> -n
>
> --
> Nathaniel J. Smith -- https://vorpus.org
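The import-name versus distribution-name mismatch raised in the quoted text can be inspected on a current interpreter. The sketch below uses importlib.metadata.packages_distributions(), a standard-library helper added in Python 3.10, well after this thread; it is not what the PEP draft proposed, and the wrapper name distributions_for is hypothetical. On older interpreters the wrapper simply returns an empty list.

```python
from importlib import metadata

def distributions_for(import_name):
    """Return the installed distribution names providing a top-level import name."""
    # packages_distributions() exists only on Python 3.10+ (assumption noted above).
    finder = getattr(metadata, "packages_distributions", None)
    if finder is None:
        return []
    top_level = import_name.split(".")[0]
    return list(finder().get(top_level, []))

# Results depend entirely on what is installed in the current environment.
print(distributions_for("sklearn"))
print(distributions_for("no_such_package_xyz"))
```

A single import name can map to several distributions (namespace packages), which is why the helper returns a list rather than one name.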