[issue33084] Computing median, median_high an median_low in statistics library
New submission from Luc:

When a list or DataFrame series contains NaN(s), the statistics library in Python 3.6.4 still computes median, median_low and median_high, but the results are wrong. It should either return NaN, as it does when computing a mean, or tell the user to drop the NaNs before computing those statistics.

Example:

    import numpy as np
    import statistics as stats

    data = [75, 90, 85, 92, 95, 80, np.nan]
    Median = stats.median(data)
    Median_low = stats.median_low(data)
    Median_high = stats.median_high(data)

All three calls return 90, which is incorrect. The correct answers, ignoring the NaN, are:

    Median = 87.5
    Median_low = 85
    Median_high = 90

Thanks,
Luc

--
components: Library (Lib)
messages: 313933
nosy: dcasmr
priority: normal
severity: normal
status: open
title: Computing median, median_high an median_low in statistics library
type: behavior
versions: Python 3.6

___
Python tracker <https://bugs.python.org/issue33084>
___
___
Python-bugs-list mailing list
Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33084] Computing median, median_high an median_low in statistics library
Luc added the comment:

To keep the focus on the issue: the reported bug is in the statistics library, not in numpy. It occurs when the data contains at least one missing value, and it affects median, median_low and median_high. The test was performed on Python 3.6.4.

When the data contains no missing values (NaNs), median, median_high and median_low from the statistics library work fine. So yes, removing the NaNs (or imputing them) before computing the medians resolves the issue. However, just as statistics.mean(data) returns nan when data has missing values, median, median_high and median_low should behave the same way.

    >>> import numpy as np
    >>> import statistics as stats
    >>> data = [75, 90, 85, 92, 95, 80, np.nan]
    >>> Median = stats.median(data)
    >>> Median_high = stats.median_high(data)
    >>> Median_low = stats.median_low(data)
    >>> print("The incorrect Median is", Median)
    The incorrect Median is 90
    >>> print("The incorrect median high is", Median_high)
    The incorrect median high is 90
    >>> print("The incorrect median low is", Median_low)
    The incorrect median low is 90

    ## Mean returns nan
    >>> Mean = stats.mean(data)
    >>> print("The mean is", Mean)
    The mean is nan

Now, when we drop the missing values, we have:

    >>> data2 = [75, 90, 85, 92, 95, 80]
    >>> stats.median(data2)
    87.5
    >>> stats.median_high(data2)
    90
    >>> stats.median_low(data2)
    85
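The behavior described above can be reproduced without numpy. The following is a minimal sketch, assuming only the standard library (math.nan stands in for np.nan): every ordering comparison against NaN is false, so sorting cannot place the NaN meaningfully, and the medians are then read off a list that still contains it. Filtering the NaN out first gives the values quoted for data2.

```python
import math
import statistics

data = [75, 90, 85, 92, 95, 80, math.nan]

# Every ordering comparison involving NaN evaluates to False, so sorted()
# has no meaningful position for it; where it lands depends on input order.
print(math.nan < 90, math.nan > 90)
print(sorted(data))

# Dropping the NaN before calling the statistics functions gives the
# values reported for data2 in the comment above.
clean = [x for x in data if not math.isnan(x)]
print(statistics.median(clean))
print(statistics.median_low(clean))
print(statistics.median_high(clean))
```

The list comprehension is just one way to drop missing values; imputing them would work equally well for this illustration.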
[issue33084] Computing median, median_high an median_low in statistics library
Luc added the comment:

If we are going to fix this, the behavior should match what the statistics library does when computing the mean or harmonic mean of data with missing values. That way it is at least consistent with how the library handles NaNs elsewhere. It should also be mentioned somewhere in the docs.

    >>> import statistics as stats
    >>> import numpy as np
    >>> import pandas as pd
    >>> data = [75, 90, 85, 92, 95, 80, np.nan]
    >>> stats.mean(data)
    nan
    >>> stats.harmonic_mean(data)
    nan
    >>> stats.stdev(data)
    nan

As you can see, when there is a missing value, computing the mean, harmonic mean and sample standard deviation with the statistics library returns nan. With median, median_high and median_low, however, the library computes a wrong result while the missing values are still present in the data. It would be better to return nan and let the user drop (or otherwise resolve) any missing values before computing.

    ## Another example using a pandas Series
    >>> df = pd.DataFrame(data, columns=['data'])
    >>> df.head()
       data
    0  75.0
    1  90.0
    2  85.0
    3  92.0
    4  95.0
    5  80.0
    6   NaN

    ### Use the statistics library to compute the median of the series
    >>> stats.median(df['data'])
    90

    ## pandas returns the correct median by dropping the missing values
    ## Now use pandas to compute the median of the series with the missing value
    >>> df['data'].median()
    87.5

I have not tested median_grouped in the statistics library yet, but will report back if it is affected as well.
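The behavior requested in this comment can be sketched as a small wrapper. This is only an illustration of the proposal, not how the statistics module is implemented: median_propagating_nan is a hypothetical helper name, and it assumes the data holds plain numbers or float NaNs (Decimal and Fraction inputs are not handled).

```python
import math
import statistics


def median_propagating_nan(values):
    """Hypothetical helper: return NaN if any value is NaN, else the median.

    Mirrors what the comment reports for statistics.mean(), which
    returns nan when the data contains a NaN.
    """
    values = list(values)
    # Only floats can be NaN here; ints are skipped by the isinstance check.
    if any(isinstance(v, float) and math.isnan(v) for v in values):
        return math.nan
    return statistics.median(values)


with_nan = [75, 90, 85, 92, 95, 80, math.nan]
without_nan = [75, 90, 85, 92, 95, 80]

print(median_propagating_nan(with_nan))
print(median_propagating_nan(without_nan))
```

Returning NaN rather than raising keeps the wrapper consistent with the mean, harmonic mean and standard deviation results shown above; raising an error would be the other reasonable design.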
[issue30739] pypi ssl errors [CERTIFICATE_VERIFY_FAILED]
New submission from Luc Zimmermann:

Hi guys,

I'm seeing strange behavior. We use Python to configure our new boxes with OpenWrt and CoovaChilli. Since yesterday, when I ask pip to download PyJWT, json-cfg and speedtest-cli, some boxes can download these packages and some can't.

root@OpenWrt:~# cat /root/.pip/pip.log
/usr/bin/pip run on Thu Apr 13 18:46:19 2017
Downloading/unpacking PyJWT
  Getting page https://pypi.python.org/simple/PyJWT/
  Could not fetch URL https://pypi.python.org/simple/PyJWT/: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] unknown error (_ssl.c)
  Will skip URL https://pypi.python.org/simple/PyJWT/ when looking for download links for PyJWT
  Getting page https://pypi.python.org/simple/
  Could not fetch URL https://pypi.python.org/simple/: connection error: HTTPSConnectionPool(host='pypi.python.org', port=443): Max r)
  Will skip URL https://pypi.python.org/simple/ when looking for download links for PyJWT
  Cannot fetch index base URL https://pypi.python.org/simple/
  URLs to search for versions for PyJWT:
  * https://pypi.python.org/simple/PyJWT/
  Getting page https://pypi.python.org/simple/PyJWT/
  Could not fetch URL https://pypi.python.org/simple/PyJWT/: connection error: [SSL: CERTIFICATE_VERIFY_FAILED] unknown error (_ssl.c)
  Will skip URL https://pypi.python.org/simple/PyJWT/ when looking for download links for PyJWT
  Could not find any downloads that satisfy the requirement PyJWT
Cleaning up...
  Removing temporary dir /tmp/pip_build_root...
No distributions at all found for PyJWT
Exception information:
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/basecommand.py", line 122, in main
    status = self.run(options, args)
  File "/usr/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/commands/install.py", line 278, in run
    requirement_set.prepare_files(finder, force_root_egg_info=self.bundle, bundle=self.bundle)
  File "/usr/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/req.py", line 1177, in prepare_files
    url = finder.find_requirement(req_to_install, upgrade=self.upgrade)
  File "/usr/lib/python2.7/site-packages/pip-1.5.6-py2.7.egg/pip/index.py", line 277, in find_requirement
    raise DistributionNotFound('No distributions at all found for %s' % req)
DistributionNotFound: No distributions at all found for PyJWT

--
messages: 296708
nosy: Luc Zimmermann
priority: normal
severity: normal
status: open
title: pypi ssl errors [CERTIFICATE_VERIFY_FAILED]
type: resource usage
versions: Python 2.7

___
Python tracker <http://bugs.python.org/issue30739>
___
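Since the report says some boxes succeed and others fail, one way to compare them is to print what each box's Python uses for certificate verification. The sketch below is a diagnostic aid only and makes assumptions the report does not confirm: that the cause lies in the local CA store or the system clock, and that the interpreter provides ssl.get_default_verify_paths() (available in Python 3 and in later 2.7 releases).

```python
import ssl
import time

# Where this interpreter's OpenSSL looks for trusted CA certificates.
# A value of None means no file or directory was found at the default location.
paths = ssl.get_default_verify_paths()
print("CA file: {}".format(paths.cafile))
print("CA directory: {}".format(paths.capath))

# The OpenSSL build linked into this interpreter.
print("OpenSSL: {}".format(ssl.OPENSSL_VERSION))

# Certificate validity is checked against the local clock, so a box with a
# wrong date can reject an otherwise valid certificate.
print("Clock (UTC): {}".format(time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime())))
```

Running this on one working and one failing box and comparing the output would show whether the two environments differ in any of these respects.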
[issue30741] https://www.pypi-mirrors.org/ error 503
New submission from Luc Zimmermann:

Is this linked to the certificate error on PyPI? You redirect HTTP requests to HTTPS, but are you still listening on port 80 and not on 443?

--
messages: 296721
nosy: Luc Zimmermann
priority: normal
severity: normal
status: open
title: https://www.pypi-mirrors.org/ error 503
type: security

___
Python tracker <http://bugs.python.org/issue30741>
___
[issue30941] Missing line in example program
New submission from Luc Bougé:

On page <https://docs.python.org/3.6/_sources/library/stdtypes.txt>, the following program is listed. It raises a syntax error when typed into the interactive interpreter: an empty line is missing after "... n += val" to close the loop body.

>>> # iteration
>>> n = 0
>>> for val in values:
...     n += val
>>> print(n)
504

--
assignee: docs@python
components: Documentation
messages: 298445
nosy: docs@python, lucbouge
priority: normal
severity: normal
status: open
title: Missing line in example program
type: resource usage
versions: Python 3.6

___
Python tracker <http://bugs.python.org/issue30941>
___
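The need for the empty line can be shown with the standard library's code module, which emulates the interactive interpreter. This is a sketch under stated assumptions: the list bound to values is a stand-in chosen so that the sum matches the 504 in the example, since the report does not show how values is defined.

```python
import code

# Assumed stand-in for the `values` object used in the documentation example.
namespace = {"values": [2, 1, 1, 500], "n": 0}
console = code.InteractiveConsole(locals=namespace)

# push() returns True while the interpreter is still waiting for more input.
waiting_after_header = console.push("for val in values:")
waiting_after_body = console.push("    n += val")

# The empty line closes the loop body; only at this point is the loop executed.
waiting_after_blank = console.push("")

print(waiting_after_header, waiting_after_body, waiting_after_blank)
print(namespace["n"])
```

In a script file the blank line is not required, because the dedented next statement ends the loop body; the requirement applies only to interactive input.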
[issue24515] docstring of isinstance
New submission from Luc Saffre:

The docstring of the built-in function 'isinstance' should explain that if classinfo is a tuple, the object must be an instance of *any* (not *all*) of the class objects.

--
assignee: docs@python
components: Documentation
messages: 245841
nosy: Luc Saffre, docs@python
priority: normal
severity: normal
status: open
title: docstring of isinstance
type: enhancement
versions: Python 2.7

___
Python tracker <http://bugs.python.org/issue24515>
___
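The "any, not all" semantics described above can be illustrated with a short example. The variable names are chosen for illustration only.

```python
value = 3

# An int is an instance of at least one class in the tuple, so this is True
# even though it is not an instance of str.
matches_any = isinstance(value, (int, str))

# No class in the tuple matches, so this is False.
matches_none = isinstance(value, (str, bytes))

# The tuple form behaves like any() over the individual checks.
spelled_out = any(isinstance(value, cls) for cls in (int, str))

print(matches_any, matches_none, spelled_out)
```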