Source: scipy, dask
Control: found -1 scipy/1.7.1-1
Control: found -1 dask/2021.01.0+dfsg-1
Severity: serious
Tags: sid bookworm
X-Debbugs-CC: debian...@lists.debian.org
User: debian...@lists.debian.org
Usertags: breaks needs-update
Dear maintainer(s),

With a recent upload of scipy the autopkgtest of dask fails in testing
when that autopkgtest is run with the binary packages of scipy from
unstable. It passes when run with only packages from testing. In tabular
form:

                       pass            fail
scipy                  from testing    1.7.1-1
dask                   from testing    2021.01.0+dfsg-1
all others             from testing    from testing

I copied some of the output at the bottom of this report. Currently this
regression is blocking the migration of scipy to testing [1]. Due to the
nature of this issue, I filed this bug report against both packages. Can
you please investigate the situation and reassign the bug to the right
package?

More information about this bug and the reason for filing it can be found on
https://wiki.debian.org/ContinuousIntegration/RegressionEmailInformation

Paul

[1] https://qa.debian.org/excuses.php?package=scipy

https://ci.debian.net/data/autopkgtest/testing/amd64/d/dask/14750989/log.gz

=================================== FAILURES ===================================
_________________________ test_two[chisquare-kwargs4] __________________________

kind = 'chisquare', kwargs = {}

    @pytest.mark.parametrize(
        "kind, kwargs",
        [
            ("ttest_ind", {}),
            ("ttest_ind", {"equal_var": False}),
            ("ttest_1samp", {}),
            ("ttest_rel", {}),
            ("chisquare", {}),
            ("power_divergence", {}),
            ("power_divergence", {"lambda_": 0}),
            ("power_divergence", {"lambda_": -1}),
            ("power_divergence", {"lambda_": "neyman"}),
        ],
    )
    def test_two(kind, kwargs):
        a = np.random.random(size=30)
        b = np.random.random(size=30)
        a_ = da.from_array(a, 3)
        b_ = da.from_array(b, 3)

        dask_test = getattr(dask.array.stats, kind)
        scipy_test = getattr(scipy.stats, kind)

        with pytest.warns(None):  # maybe overflow warning (powrer_divergence)
            result = dask_test(a_, b_, **kwargs)
>           expected = scipy_test(a, b, **kwargs)

/usr/lib/python3/dist-packages/dask/array/tests/test_stats.py:88:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/usr/lib/python3/dist-packages/scipy/stats/stats.py:6852: in chisquare
    return power_divergence(f_obs, f_exp=f_exp, ddof=ddof, axis=axis,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

f_obs = array([0.50616202, 0.24047559, 0.47385697, 0.2617551 , 0.39114576,
       0.88770032, 0.34302868, 0.14302513, 0.725594...42, 0.59473997,
       0.6900566 , 0.53797374, 0.57797282, 0.03873117, 0.25516211,
       0.61014154, 0.07151085, 0.45349848])
f_exp = array([0.74205809, 0.02842818, 0.8787515 , 0.73989809, 0.62863789,
       0.29149379, 0.35539664, 0.14194145, 0.256345...17, 0.30981448,
       0.79533776, 0.38182638, 0.07065248, 0.82259669, 0.67148202,
       0.52457137, 0.24092418, 0.9464332 ])
ddof = 0, axis = 0, lambda_ = 1

    def power_divergence(f_obs, f_exp=None, ddof=0, axis=0, lambda_=None):
        """Cressie-Read power divergence statistic and goodness of fit test.

        This function tests the null hypothesis that the categorical data
        has the given frequencies, using the Cressie-Read power divergence
        statistic.

        Parameters
        ----------
        f_obs : array_like
            Observed frequencies in each category.
        f_exp : array_like, optional
            Expected frequencies in each category. By default the categories
            are assumed to be equally likely.
        ddof : int, optional
            "Delta degrees of freedom": adjustment to the degrees of freedom
            for the p-value. The p-value is computed using a chi-squared
            distribution with ``k - 1 - ddof`` degrees of freedom, where `k`
            is the number of observed frequencies. The default value of
            `ddof` is 0.
        axis : int or None, optional
            The axis of the broadcast result of `f_obs` and `f_exp` along
            which to apply the test. If axis is None, all values in `f_obs`
            are treated as a single data set. Default is 0.
        lambda_ : float or str, optional
            The power in the Cressie-Read power divergence statistic. The
            default is 1. For convenience, `lambda_` may be assigned one of
            the following strings, in which case the corresponding numerical
            value is used::

                String              Value   Description
                "pearson"             1     Pearson's chi-squared statistic.
                                            In this case, the function is
                                            equivalent to `stats.chisquare`.
                "log-likelihood"      0     Log-likelihood ratio. Also known
                                            as the G-test [3]_.
                "freeman-tukey"      -1/2   Freeman-Tukey statistic.
                "mod-log-likelihood" -1     Modified log-likelihood ratio.
                "neyman"             -2     Neyman's statistic.
                "cressie-read"        2/3   The power recommended in [5]_.

        Returns
        -------
        statistic : float or ndarray
            The Cressie-Read power divergence test statistic. The value is
            a float if `axis` is None or if `f_obs` and `f_exp` are 1-D.
        pvalue : float or ndarray
            The p-value of the test. The value is a float if `ddof` and the
            return value `stat` are scalars.

        See Also
        --------
        chisquare

        Notes
        -----
        This test is invalid when the observed or expected frequencies in
        each category are too small. A typical rule is that all of the
        observed and expected frequencies should be at least 5.

        Also, the sum of the observed and expected frequencies must be the
        same for the test to be valid; `power_divergence` raises an error if
        the sums do not agree within a relative tolerance of ``1e-8``.

        When `lambda_` is less than zero, the formula for the statistic
        involves dividing by `f_obs`, so a warning or error may be generated
        if any value in `f_obs` is 0.

        Similarly, a warning or error may be generated if any value in
        `f_exp` is zero when `lambda_` >= 0.

        The default degrees of freedom, k-1, are for the case when no
        parameters of the distribution are estimated. If p parameters are
        estimated by efficient maximum likelihood then the correct degrees
        of freedom are k-1-p. If the parameters are estimated in a
        different way, then the dof can be between k-1-p and k-1. However,
        it is also possible that the asymptotic distribution is not a
        chisquare, in which case this test is not appropriate.

        This function handles masked arrays. If an element of `f_obs` or
        `f_exp` is masked, then data at that position is ignored, and does
        not count towards the size of the data set.

        .. versionadded:: 0.13.0

        References
        ----------
        .. [1] Lowry, Richard. "Concepts and Applications of Inferential
               Statistics". Chapter 8.
               https://web.archive.org/web/20171015035606/http://faculty.vassar.edu/lowry/ch8pt1.html
        .. [2] "Chi-squared test", https://en.wikipedia.org/wiki/Chi-squared_test
        .. [3] "G-test", https://en.wikipedia.org/wiki/G-test
        .. [4] Sokal, R. R. and Rohlf, F. J. "Biometry: the principles and
               practice of statistics in biological research", New York:
               Freeman (1981)
        .. [5] Cressie, N. and Read, T. R. C., "Multinomial Goodness-of-Fit
               Tests", J. Royal Stat. Soc. Series B, Vol. 46, No. 3 (1984),
               pp. 440-464.

        Examples
        --------
        (See `chisquare` for more examples.)

        When just `f_obs` is given, it is assumed that the expected
        frequencies are uniform and given by the mean of the observed
        frequencies. Here we perform a G-test (i.e. use the log-likelihood
        ratio statistic):

        >>> from scipy.stats import power_divergence
        >>> power_divergence([16, 18, 16, 14, 12, 12], lambda_='log-likelihood')
        (2.006573162632538, 0.84823476779463769)

        The expected frequencies can be given with the `f_exp` argument:

        >>> power_divergence([16, 18, 16, 14, 12, 12],
        ...                  f_exp=[16, 16, 16, 16, 16, 8],
        ...                  lambda_='log-likelihood')
        (3.3281031458963746, 0.6495419288047497)

        When `f_obs` is 2-D, by default the test is applied to each column.

        >>> obs = np.array([[16, 18, 16, 14, 12, 12], [32, 24, 16, 28, 20, 24]]).T
        >>> obs.shape
        (6, 2)
        >>> power_divergence(obs, lambda_="log-likelihood")
        (array([ 2.00657316,  6.77634498]), array([ 0.84823477,  0.23781225]))

        By setting ``axis=None``, the test is applied to all data in the
        array, which is equivalent to applying the test to the flattened
        array.

        >>> power_divergence(obs, axis=None)
        (23.31034482758621, 0.015975692534127565)
        >>> power_divergence(obs.ravel())
        (23.31034482758621, 0.015975692534127565)

        `ddof` is the change to make to the default degrees of freedom.

        >>> power_divergence([16, 18, 16, 14, 12, 12], ddof=1)
        (2.0, 0.73575888234288467)

        The calculation of the p-values is done by broadcasting the
        test statistic with `ddof`.

        >>> power_divergence([16, 18, 16, 14, 12, 12], ddof=[0,1,2])
        (2.0, array([ 0.84914504,  0.73575888,  0.5724067 ]))

        `f_obs` and `f_exp` are also broadcast. In the following, `f_obs`
        has shape (6,) and `f_exp` has shape (2, 6), so the result of
        broadcasting `f_obs` and `f_exp` has shape (2, 6). To compute the
        desired chi-squared statistics, we must use ``axis=1``:

        >>> power_divergence([16, 18, 16, 14, 12, 12],
        ...                  f_exp=[[16, 16, 16, 16, 16, 8],
        ...                         [8, 20, 20, 16, 12, 12]],
        ...                  axis=1)
        (array([ 3.5 ,  9.25]), array([ 0.62338763,  0.09949846]))

        """
        # Convert the input argument `lambda_` to a numerical value.
        if isinstance(lambda_, str):
            if lambda_ not in _power_div_lambda_names:
                names = repr(list(_power_div_lambda_names.keys()))[1:-1]
                raise ValueError("invalid string for lambda_: {0!r}. "
                                 "Valid strings are {1}".format(lambda_, names))
            lambda_ = _power_div_lambda_names[lambda_]
        elif lambda_ is None:
            lambda_ = 1

        f_obs = np.asanyarray(f_obs)
        f_obs_float = f_obs.astype(np.float64)

        if f_exp is not None:
            f_exp = np.asanyarray(f_exp)
            bshape = _broadcast_shapes(f_obs_float.shape, f_exp.shape)
            f_obs_float = _m_broadcast_to(f_obs_float, bshape)
            f_exp = _m_broadcast_to(f_exp, bshape)
            rtol = 1e-8  # to pass existing tests
            with np.errstate(invalid='ignore'):
                f_obs_sum = f_obs_float.sum(axis=axis)
                f_exp_sum = f_exp.sum(axis=axis)
                relative_diff = (np.abs(f_obs_sum - f_exp_sum) /
                                 np.minimum(f_obs_sum, f_exp_sum))
                diff_gt_tol = (relative_diff > rtol).any()
            if diff_gt_tol:
                msg = (f"For each axis slice, the sum of the observed "
                       f"frequencies must agree with the sum of the "
                       f"expected frequencies to a relative tolerance "
                       f"of {rtol}, but the percent differences are:\n"
                       f"{relative_diff}")
>               raise ValueError(msg)
E               ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
E               0.025894044619658885

/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs5] ______________________

kind = 'power_divergence', kwargs = {}

    [... test source, random f_obs/f_exp values and pytest's listing of
    scipy.stats.power_divergence are identical in structure to the first
    failure above; ddof = 0, axis = 0, lambda_ = 1 ...]

>           expected = scipy_test(a, b, **kwargs)

E               ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
E               0.0255236948989553

/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs6] ______________________

kind = 'power_divergence', kwargs = {'lambda_': 0}

    [...; ddof = 0, axis = 0, lambda_ = 0 ...]

>           expected = scipy_test(a, b, **kwargs)

E               ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
E               0.11828134694186175

/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs7] ______________________

kind = 'power_divergence', kwargs = {'lambda_': -1}

    [...; ddof = 0, axis = 0, lambda_ = -1 ...]

>           expected = scipy_test(a, b, **kwargs)

E               ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
E               0.11063329565866321

/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
______________________ test_two[power_divergence-kwargs8] ______________________

kind = 'power_divergence', kwargs = {'lambda_': 'neyman'}

    [...; ddof = 0, axis = 0, lambda_ = -2 ...]

>           expected = scipy_test(a, b, **kwargs)

E               ValueError: For each axis slice, the sum of the observed frequencies must agree with the sum of the expected frequencies to a relative tolerance of 1e-08, but the percent differences are:
E               0.1911078274656955

/usr/lib/python3/dist-packages/scipy/stats/stats.py:6694: ValueError
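
For reference, the error quoted above comes from a consistency check in
scipy's power_divergence(): the sums of the observed and expected
frequencies must agree within a relative tolerance of 1e-8, while the dask
test feeds two independent uniform random arrays whose sums essentially
never match. The following minimal sketch (not part of the autopkgtest log;
the rescaling shown is only one possible way to satisfy the check, and any
real fix belongs in dask's test_stats.py upstream) reproduces the failure
with scipy 1.7.x and shows data that passes:

    # Hypothetical standalone reproduction of the new scipy behaviour.
    import numpy as np
    import scipy.stats

    rng = np.random.default_rng(0)
    a = rng.random(30)   # plays the role of f_obs in dask's test_two
    b = rng.random(30)   # plays the role of f_exp; sum(b) != sum(a)

    try:
        # With scipy 1.7.x this raises ValueError because the sums of
        # observed and expected frequencies differ by more than rtol=1e-8.
        scipy.stats.chisquare(a, b)
    except ValueError as exc:
        print(exc)

    # Rescaling the expected frequencies so both sums agree satisfies
    # the check and the test statistic is computed as before.
    b_scaled = b * a.sum() / b.sum()
    print(scipy.stats.chisquare(a, b_scaled))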