[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On Tue, 12 May 2020 at 07:53, Brandt Bucher wrote:
>
> > > However, zip_longest is really another beast entirely
> > No, it isn't.
>
> It has a completely independent implementation, a different interface, lives
> in a separate namespace, and doesn't even reference zip in its documentation.
> So it seems to me that it is indeed another beast entirely.
>
> > > so it makes sense that it would live in itertools while zip grows
> > > in-place.
> > No, it doesn't
>
> See above for why I think it does.

... so it's another beast because (among other reasons) it lives in a separate namespace, and it should live in a separate namespace because it's another beast? That's circular logic. If we were to put zip_strict into itertools, you could use *precisely* this logic to argue that it was the right thing to do.

> > > The goal here is not just to provide a way to catch bugs, but to also
> > > make it easy (even tempting) for a user to enable the check whenever
> > > using zip at a call site with this property.
> > Importing necessary functions is not an anti-pattern.
>
> Um, agreed?

So importing zip_strict from itertools is an entirely reasonable way for users to enable the check, then.

> > > Another proposed idiom, per-module shadowing of the built-in zip with
> > > some subtly different variant from itertools, is an anti-pattern that
> > > shouldn't be encouraged.
> > Source?
>
> Point taken. I probably went a bit far labeling this a straight-up
> "anti-pattern", but it is certainly annoying to find that someone has added
> `from pprint import pprint as print` at the top of a module, for example
> (which has actually happened to me before). Very hard to figure out what's
> happening.

Also irrelevant. It's very easy to suggest bad ways of using a feature. That doesn't make the feature bad. You seem to be arguing that zip_strict is bad because people can misuse it. We could probably remove 99% of the Python language by that argument...
Paul ___ Python-Dev mailing list -- python-dev@python.org To unsubscribe send an email to python-dev-le...@python.org https://mail.python.org/mailman3/lists/python-dev.python.org/ Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/QYQHK6BWILSORA2XSGV7AUEZJTBUOSIL/ Code of Conduct: http://python.org/psf/codeofconduct/
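For concreteness, the separate-function option discussed in this message could look roughly like the sketch below. The name `zip_strict` comes from the thread; the implementation, the `_MISSING` sentinel, and the error message are assumptions made for illustration, not code taken from PEP 618 or from itertools.

```python
from itertools import zip_longest

# Unique marker object; assumed never to appear in the caller's data.
_MISSING = object()


def zip_strict(*iterables):
    """Like zip(), but raise ValueError if the iterables differ in length.

    Hypothetical sketch only, not the PEP 618 reference implementation.
    """
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        # A padded slot means at least one iterable ran out before the others.
        if any(item is _MISSING for item in combo):
            raise ValueError("zip_strict() arguments have different lengths")
        yield combo


print(list(zip_strict("ab", [1, 2])))
```

Because it is a generator, the mismatch is only reported once iteration reaches the point where one input is exhausted, not when `zip_strict` is called.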
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On Tue, May 12, 2020 at 5:20 PM Paul Moore wrote:
>
> On Tue, 12 May 2020 at 07:53, Brandt Bucher wrote:
> > > > Another proposed idiom, per-module shadowing of the built-in zip with
> > > > some subtly different variant from itertools, is an anti-pattern that
> > > > shouldn't be encouraged.
> > > Source?
> >
> > Point taken. I probably went a bit far labeling this a straight-up
> > "anti-pattern", but it is certainly annoying to find that someone has added
> > `from pprint import pprint as print` at the top of a module, for example
> > (which has actually happened to me before). Very hard to figure out what's
> > happening.
>
> Also irrelevant. It's very easy to suggest bad ways of using a
> feature. That doesn't make the feature bad. You seem to be arguing
> that zip_strict is bad because people can misuse it. We could probably
> remove 99% of the Python language by that argument...
>

And considering that "from __future__ import print_function" is an officially-sanctioned way to cause a semantic change to print, I don't think it's really that strong an argument. Python is *deliberately* designed so that you can shadow things. I am most in favour of the separate-functions option *because* it makes shadowing easy. Not an anti-pattern at all.

ChrisA

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/EWNW7SQGN55NIME6LD3NVVJUWIKXZO4I/
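The per-module shadowing being debated here can be sketched as follows. The local `zip_strict` is a hypothetical stand-in (the thread only proposes such a function), and the module-level rebinding is one possible way to shadow, shown as an illustration rather than a recommendation.

```python
import builtins
from itertools import zip_longest

_MISSING = object()  # hypothetical sentinel, assumed absent from real data


def zip_strict(*iterables):
    # Hypothetical stand-in for the function proposed in the thread.
    for combo in zip_longest(*iterables, fillvalue=_MISSING):
        if any(item is _MISSING for item in combo):
            raise ValueError("arguments have different lengths")
        yield combo


# Module-level shadowing: every later use of `zip` in this module is strict.
zip = zip_strict

pairs = list(zip("ab", [1, 2]))

# The original built-in is untouched and still reachable explicitly.
truncated = list(builtins.zip("abc", [1, 2]))

print(pairs, truncated)
```

The rebinding only affects the module's own global namespace; other modules keep seeing the built-in.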
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On Tue, 12 May 2020 06:48:32 - "Brandt Bucher" wrote:
>
> > > A good rule of thumb is that "mode-switches" which change return types or
> > > significantly alter functionality are indeed an anti-pattern,
> > Source?
>
> This was based on a chat with someone who has chosen not to become involved
> in the larger discussion, and it was lifted almost verbatim from my notes
> into the draft.

It's also something that has more or less consensus among core developers, IMHO.

Regards

Antoine.

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/ABJPSG65M5BPEZYMZ62NCA2KPDLOR6G7/
[Python-Dev] inspect.getdoc and (Not) returning type/superclass docstrings in 3.9
Hi All,

# Too long didn't read:

In 3.9 the behavior of inspect.getdoc(instance) was changed: it no longer returns the documentation of type(instance) or its superclass(es). I think this change is problematic for some projects, and for interactive use when getting info on objects that users rarely construct directly, for example a pandas dataframe obtained via `pandas.read_csv(filepath)`.

I'd like to ask for reconsideration, and to suggest that the change of behavior is better suited to a new function, potentially deprecating the old one.

# Longer version

https://bugs.python.org/issue40257 attempts to improve the output of `pydoc`; in particular, it is difficult to have fine-grained logic depending on where the documentation comes from (instance, class, superclass, etc.), which can sometimes lead to nonsensical help. The following are given as examples:

> inspect.getdoc(1) returns the same as inspect.getdoc(int)

or

    >>> import wave
    >>> help(wave.Error)
    Help on class Error in module wave:

    class Error(builtins.Exception)
     |  Common base class for all non-exit exceptions.
     |
     |  Method resolution order:
     ...

In 3.9 the behavior of `inspect.getdoc()` has been changed to be much more restrictive in what it returns, often returning None where it used to return docstrings.

I agree with the end goal of having a more controllable way of finding where the documentation/docstring comes from and of avoiding incorrect docs in pydoc and help, though I think that changing the behavior of `getdoc()` might not be the right approach.

I'm quite worried that many projects rely on the current behavior of `getdoc()`; at least IPython/Jupyter does, to provide users with help/superhelp accessible via obj? and obj??.

I would also argue that inaccurate help is often better than no help.

With the current state of Python 3.9, a few things, like asking for help on a pandas dataframe instance, will lose information.

    >>> import pandas as pd
    >>> from inspect import getdoc
    >>> df = pd.read_csv('mydata.csv')
    >>> print(getdoc(df))
    None

I'm taking pandas as the example because a dataframe is typically the kind of object you don't construct directly, but get via for example `read_csv()`, or that another API/package returns to you.

I haven't yet been able to confirm exactly how this affects sphinx rendering of docs, how other IDEs provide help (Spyder, Pycharm...), or other projects that use `getdoc()`. I've found mentions of `getdoc()` in numpy, scipy, jedi, matplotlib ... as well (sphinx extensions and various dynamic docs), and I am working on building them on 3.9 to check the effect.

In general, though, my feeling is that the effect of `getdoc()` rarely seems to be tested, as it is directly user facing. I was lucky to catch it in IPython/Jupyter, as the failing test was unrelated and indirectly relied on the exact output of a subprocess.

From the IPython/Jupyter perspective I would prefer to keep the current behavior of `inspect.getdoc()`, potentially deprecating it if you wish to, and to provide an alternative with a behavior of your choosing. Dealing with functions whose behavior changes slightly across Python versions is not the best experience, and this would give the ecosystem some chance to adapt. Updated projects rarely get released in synchrony with new Python versions.

Your thoughts on this issue are welcome, thanks for all your work on core python, and I'll support any decision that gets made.

--
Matthias

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/6QO2XI5B7RVZDW3YZV24LYD75VGRITFU/
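As an illustration of how a downstream tool might shield itself from differences in `inspect.getdoc()` behavior, here is a minimal sketch. `getdoc_with_fallback` is a hypothetical helper, not part of the `inspect` module and not what IPython/Jupyter actually does; it simply retries on the object's class and base classes when `getdoc()` returns None.

```python
import inspect


def getdoc_with_fallback(obj):
    """Return a docstring for obj, falling back to its class hierarchy.

    Hypothetical helper for illustration; not part of the inspect module.
    """
    doc = inspect.getdoc(obj)
    if doc is not None:
        return doc
    cls = obj if inspect.isclass(obj) else type(obj)
    for klass in cls.__mro__:
        if klass is object:
            continue  # skip the generic docstring of `object`
        # Look only at the docstring defined on this class itself.
        doc = klass.__dict__.get("__doc__")
        if isinstance(doc, str):
            return inspect.cleandoc(doc)
    return None


class Base:
    """Documented base class."""


class Child(Base):
    pass


print(getdoc_with_fallback(Child()))  # -> Documented base class.
```

A helper like this behaves the same whichever way `getdoc()` treats instances, at the cost of sometimes showing help that belongs to a superclass.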
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On 05/11/2020 11:48 PM, Brandt Bucher wrote:
> On 05/10/2020 14:39 PM, Ethan Furman wrote:
> > On 05/10/2020 09:04 AM, Brandt Bucher wrote:
> > > However, zip_longest is really another beast entirely
> >
> > No, it isn't.
>
> It has a completely independent implementation, a different interface, lives in a separate namespace,

- both take an unknown number of iterables
- both return tuples
- both names start with `zip`
- both stop at exhaustion
  - one as soon as possible
  - the other as late as possible
- one has one extra parameter

Those seem like very similar beasts to me.

> and doesn't even reference zip in its documentation.

So update the docs.

--
~Ethan~

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LJ3MDIPO3AZCTXIFNQ3AM4F2O72GUTZ3/
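The similarities and the one difference listed above can be seen side by side in a short example. The sample data and the `"?"` fill value are arbitrary choices for illustration.

```python
from itertools import zip_longest

letters = "ab"
numbers = [1, 2, 3]

# zip stops as soon as the shortest iterable is exhausted.
shortest = list(zip(letters, numbers))
print(shortest)  # [('a', 1), ('b', 2)]

# zip_longest stops only when the longest iterable is exhausted,
# padding the gaps with its one extra parameter, fillvalue.
longest = list(zip_longest(letters, numbers, fillvalue="?"))
print(longest)  # [('a', 1), ('b', 2), ('?', 3)]
```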
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On 05/11/2020 11:48 PM, Brandt Bucher wrote:
> On 05/10/2020 14:39, Ethan Furman wrote:
> > On 05/10/2020 09:04 PM, Brandt Bucher wrote:
> > > the author has counted dozens of other call sites in Python's standard library and tooling where it would be appropriate to enable this new feature immediately.
> >
> > References, please.
>
> Here are two dozens:
>
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/_pydecimal.py#L3394
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/_pydecimal.py#L3418
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/_pydecimal.py#L3435

These are all after a function that ensures the iterables are the same length -- hardly seems a good idea to slow them down with an extra check for each digit.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/ast.py#L94-L95

This one already has a check.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/ast.py#L1184
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/ast.py#L1275

Reasonable.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/ast.py#L1363

Already has a check.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/ast.py#L1391

Reasonable.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/copy.py#L217
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/csv.py#L142

Mismatch cannot happen.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/dis.py#L462

Unsure if mismatch can happen.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/filecmp.py#L142
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/filecmp.py#L143

Mismatch cannot happen.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/inspect.py#L1440
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/inspect.py#L2095

Reasonable.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/os.py#L510

Wow -- I don't even know how to parse that!

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/plistlib.py#L577

Maybe.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/tarfile.py#L1317
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/tarfile.py#L1323

Definitely.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/tarfile.py#L1339

Mismatch cannot happen.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/turtle.py#L3015
> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/turtle.py#L3071

Reasonable.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/turtle.py#L3901

Mismatch cannot happen.

> - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/setup.py#L455

Mismatch cannot happen.

So half of your examples are actually counter-examples. Did you vet them, or just pick matches against `zip(` ?

Also, if a flag is used, won't that slow down every call to zip even when the flag is False? I know in many cases it probably won't matter, but I can see where it could in _pydecimal.

--
~Ethan~

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/LZUNHS7RBAAJKYADI7P6WTRR25CQFD2Q/
[Python-Dev] Re: PEP 618: Add Optional Length-Checking To zip
On Wed, May 13, 2020 at 4:38 AM Ethan Furman wrote:
>
> > - https://github.com/python/cpython/blob/27c0d9b54abaa4112d5a317b8aa78b39ad60a808/Lib/os.py#L510
>
> Wow -- I don't even know how to parse that!
>

Wow, that's quite an example. Of something, I'm not sure what, but definitely an example.

Based on two booleans, entries is either None or a list. If it's None, this loops over just the directory names; if it's a list, then it's been populated in perfect parallel to dirs (see the preceding loop), thus guaranteeing that the two lists are perfectly parallel. But in that case, "name" actually gets a tuple of (name,entry), and then inside the loop, it does a three-way branch that is guaranteed (and asserted) to split out the name and entry ONLY when there actually will be one.

Definitely an odd piece of code. But it can never zip over things of different lengths.

ChrisA

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/P6K3JGADX5A34CY2VOKVCJDYTDX3WQEY/
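The shape of the loop described above can be illustrated with a much-simplified sketch. `iter_dirs` is a hypothetical function written for this note; only the names `dirs`, `entries`, and `name` come from the message, and this is not the actual code in Lib/os.py.

```python
def iter_dirs(dirs, entries=None):
    """Yield (name, entry) pairs; entry is None when no entries were kept.

    Hypothetical simplification of the pattern described above, not the
    real standard-library code.
    """
    # When entries is a list it is assumed to have been built in lockstep
    # with dirs, so the two have equal length and zip cannot truncate.
    for name in dirs if entries is None else zip(dirs, entries):
        if entries is None:
            entry = None
        else:
            # In this branch `name` is really a (name, entry) tuple.
            name, entry = name
        yield name, entry


print(list(iter_dirs(["a", "b"])))
print(list(iter_dirs(["a", "b"], ["entry_a", "entry_b"])))
```

The loop variable carries two different types depending on a condition evaluated outside the loop, which is what makes the original hard to read.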
[Python-Dev] Detect memory leaks in unit tests
Hello there,

I would like to discuss a proposal regarding one aspect which AFAIK is currently missing from CPython's test suite: the ability to detect memory leaks in functions implemented in the C extension modules.

In psutil I use a test class/framework which calls a function many times, and fails if the process memory increased after doing so. I do this in order to quickly detect missing free() or Py_DECREF calls in the C code, but I suppose there may be other use cases. Here's the class:
https://github.com/giampaolo/psutil/blob/913d4b1d6dcce88dea6ef9382b93883a04a66cd7/psutil/tests/__init__.py#L901

Detecting a memory leak is no easy task, because the process memory fluctuates. Sometimes it may increase (or even decrease!) even if there's no leak, I suppose because of how the OS handles memory, Python's garbage collector, the fact that RSS is an approximation, and who knows what else. In order to compensate for fluctuations I did the following: in case of failure (mem > 0 after calling fun() N times) I retry the test up to 5 times, increasing N (repetitions) each time, so I consider the test a failure only if the memory keeps increasing across all runs. So for instance, here's a legitimate failure:

    psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_disk_partitions ...
    Run #1: extra-mem=696.0K, per-call=3.5K, calls=200
    Run #2: extra-mem=1.4M, per-call=3.5K, calls=400
    Run #3: extra-mem=2.1M, per-call=3.5K, calls=600
    Run #4: extra-mem=2.7M, per-call=3.5K, calls=800
    Run #5: extra-mem=3.4M, per-call=3.5K, calls=1000
    FAIL

If, on the other hand, the memory increased on one run (say 200 calls) but decreased on the next run (say 400 calls), then it is clearly a false positive: memory consumption may be > 0 on the second run, but if it's lower than on the previous run with fewer repetitions, then it cannot possibly represent a leak (just a fluctuation):

    psutil.tests.test_memory_leaks.TestModuleFunctionsLeaks.test_net_connections ...
    Run #1: extra-mem=568.0K, per-call=2.8K, calls=200
    Run #2: extra-mem=24.0K, per-call=61.4B, calls=400
    OK

This is the best I could come up with as a simple leak detection mechanism to integrate with CI services, and to keep more advanced tools like Valgrind out of the picture (I just wanted to know if there's a leak, not to debug the leak itself). In addition, since psutil is able to get the number of fds (UNIX) and handles (Windows) opened by a process, I also run a separate set of tests to make sure I didn't forget to call close(2) or CloseHandle() in C.

Would something like this make sense to have in CPython? Here's a quick PoC I put together just to show what this would look like in practice:
https://github.com/giampaolo/cpython/pull/2/files

Proper work in terms of API coverage would end up being quite large (test all C modules), and ideally should also include cases where functions raise an exception when fed improper input. The biggest blocker here is, of course, psutil, since it's a third party dep, but before getting to that I wanted to see how this idea is perceived in general.

Cheers,

--
Giampaolo - http://grodola.blogspot.com

Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/NFHW4TP3ALY3CVRBVKRI4SRG7BOLZLJH/
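The retry strategy described in this message can be sketched as below. This is a minimal illustration under stated assumptions, not the psutil test class linked above: `check_leak`, its parameters, and the pluggable `measure` callable are hypothetical names chosen here, and the memory source is injected so the logic can be exercised without psutil.

```python
import gc


def check_leak(fun, measure, calls=200, retries=5):
    """Return True if fun appears to leak, False otherwise.

    Hypothetical sketch of the retry logic described above. `measure` is
    any zero-argument callable returning current memory use in bytes.
    """
    previous_increase = 0
    for attempt in range(1, retries + 1):
        repetitions = calls * attempt  # 200, 400, 600, ... calls per run
        gc.collect()
        before = measure()
        for _ in range(repetitions):
            fun()
        gc.collect()
        increase = measure() - before
        # No growth at all, or no more growth than the previous (shorter)
        # run: treat it as a fluctuation rather than a leak.
        if increase <= previous_increase:
            return False
        previous_increase = increase
    # Memory kept increasing across every run.
    return True


# Simulated memory counter standing in for a real process measurement.
fake_memory = {"bytes": 0}


def leaky():
    fake_memory["bytes"] += 10  # pretend each call loses 10 bytes


print(check_leak(leaky, lambda: fake_memory["bytes"]))  # -> True
```

With psutil installed, one assumed way to supply a real measurement would be `measure=lambda: psutil.Process().memory_info().rss`; which memory metric gives the most reliable signal is a separate question that this sketch does not address.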