This is an automated email from the ASF dual-hosted git repository.

aherbert pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/commons-statistics.git
The following commit(s) were added to refs/heads/master by this push:
     new b18e686  STATISTICS-6: Add interval module
b18e686 is described below

commit b18e68612ac0b7ad21c421f22891b6e349b28eed
Author: Alex Herbert <aherb...@apache.org>
AuthorDate: Wed Jun 11 17:37:41 2025 +0100

    STATISTICS-6: Add interval module

    Port o.a.c.math4.legacy.stat.interval package.
---
 README.md                                          |   1 +
 .../src/main/resources-filtered/bom.xml            |   5 +
 commons-statistics-docs/pom.xml                    |   5 +
 commons-statistics-interval/LICENSE                | 201 ++++++++++++++++++++
 commons-statistics-interval/NOTICE                 |   5 +
 README.md => commons-statistics-interval/README.md |  29 +--
 commons-statistics-interval/pom.xml                |  55 ++++++
 .../commons/statistics/interval/BaseInterval.java  |  50 +++++
 .../interval/BinomialConfidenceInterval.java       | 208 +++++++++++++++++++++
 .../commons/statistics/interval/Interval.java      |  39 ++++
 .../commons/statistics/interval/package-info.java  |  23 +++
 .../src/site/resources/profile.jacoco              |  17 ++
 commons-statistics-interval/src/site/site.xml      |  32 ++++
 .../src/site/xdoc/index.xml                        |  53 ++++++
 .../interval/BinomialConfidenceIntervalTest.java   | 139 ++++++++++++++
 .../commons/statistics/interval/UserGuideTest.java |  48 +++++
 dist-archive/pom.xml                               |  19 ++
 doc/release/copyLongTermJavadoc.sh                 |   1 +
 pom.xml                                            |   1 +
 src/changes/changes.xml                            |   4 +
 src/conf/checkstyle/checkstyle-suppressions.xml    |   1 +
 src/site/xdoc/userguide/index.xml                  |  43 +++++
 22 files changed, 956 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index dc68f49..62f340a 100644
--- a/README.md
+++ b/README.md
@@ -59,6 +59,7 @@ The [Javadoc](https://commons.apache.org/proper/commons-statistics/commons-stati
 - [Commons Statistics Descriptive](https://commons.apache.org/proper/commons-statistics/commons-statistics-descriptive/apidocs/)
 - [Commons Statistics Distribution](https://commons.apache.org/proper/commons-statistics/commons-statistics-distribution/apidocs/)
 - [Commons Statistics Inference](https://commons.apache.org/proper/commons-statistics/commons-statistics-inference/apidocs/)
+- [Commons Statistics Interval](https://commons.apache.org/proper/commons-statistics/commons-statistics-interval/apidocs/)
 - [Commons Statistics Ranking](https://commons.apache.org/proper/commons-statistics/commons-statistics-ranking/apidocs/)
 
 Questions related to the usage of Apache Commons Statistics should be posted to the [user mailing list](https://commons.apache.org/mail-lists.html).
diff --git a/commons-statistics-bom/src/main/resources-filtered/bom.xml b/commons-statistics-bom/src/main/resources-filtered/bom.xml
index 5ad2ca3..7cde60c 100644
--- a/commons-statistics-bom/src/main/resources-filtered/bom.xml
+++ b/commons-statistics-bom/src/main/resources-filtered/bom.xml
@@ -46,6 +46,11 @@
         <artifactId>commons-statistics-inference</artifactId>
         <version>${version}</version>
       </dependency>
+      <dependency>
+        <groupId>org.apache.commons</groupId>
+        <artifactId>commons-statistics-interval</artifactId>
+        <version>${version}</version>
+      </dependency>
       <dependency>
         <groupId>org.apache.commons</groupId>
         <artifactId>commons-statistics-ranking</artifactId>
diff --git a/commons-statistics-docs/pom.xml b/commons-statistics-docs/pom.xml
index 8161ad0..61e34e7 100644
--- a/commons-statistics-docs/pom.xml
+++ b/commons-statistics-docs/pom.xml
@@ -77,6 +77,11 @@
       <artifactId>commons-statistics-inference</artifactId>
       <version>1.2-SNAPSHOT</version>
     </dependency>
+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-statistics-interval</artifactId>
+      <version>1.2-SNAPSHOT</version>
+    </dependency>
   </dependencies>
 
   <build>
diff --git a/commons-statistics-interval/LICENSE b/commons-statistics-interval/LICENSE
new file mode 100644
index 0000000..261eeb9
--- /dev/null
+++ b/commons-statistics-interval/LICENSE
@@ -0,0 +1,201 @@
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND
DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. 
For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of 
the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
+ + END OF TERMS AND CONDITIONS + + APPENDIX: How to apply the Apache License to your work. + + To apply the Apache License to your work, attach the following + boilerplate notice, with the fields enclosed by brackets "[]" + replaced with your own identifying information. (Don't include + the brackets!) The text should be enclosed in the appropriate + comment syntax for the file format. We also recommend that a + file or class name and description of purpose be included on the + same "printed page" as the copyright notice for easier + identification within third-party archives. + + Copyright [yyyy] [name of copyright owner] + + Licensed under the Apache License, Version 2.0 (the "License"); + you may not use this file except in compliance with the License. + You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. diff --git a/commons-statistics-interval/NOTICE b/commons-statistics-interval/NOTICE new file mode 100644 index 0000000..dc42354 --- /dev/null +++ b/commons-statistics-interval/NOTICE @@ -0,0 +1,5 @@ +Apache Commons Statistics +Copyright 2018-2024 The Apache Software Foundation + +This product includes software developed at +The Apache Software Foundation (http://www.apache.org/). 
diff --git a/README.md b/commons-statistics-interval/README.md
similarity index 80%
copy from README.md
copy to commons-statistics-interval/README.md
index dc68f49..cf5ae8e 100644
--- a/README.md
+++ b/commons-statistics-interval/README.md
@@ -40,49 +40,32 @@
  |                                                                      |
  +======================================================================+
 --->
-Apache Commons Statistics
+Apache Commons Statistics Interval
 ===================
 
 [](https://github.com/apache/commons-statistics/actions/workflows/maven.yml)
 [](https://app.codecov.io/gh/apache/commons-statistics)
-[](https://search.maven.org/artifact/org.apache.commons/commons-statistics-bom/)
-[](https://sonarcloud.io/dashboard?id=commons-statistics)
+[](https://search.maven.org/artifact/org.apache.commons/commons-statistics-interval/)
 
-The Apache Commons Statistics project provides tools for statistics.
+Statistical intervals.
 
 Documentation
 -------------
 
 More information can be found on the [Apache Commons Statistics homepage](https://commons.apache.org/proper/commons-statistics).
-The [Javadoc](https://commons.apache.org/proper/commons-statistics/commons-statistics-docs/apidocs) for each of the modules can be browsed:
-
-- [Commons Statistics Descriptive](https://commons.apache.org/proper/commons-statistics/commons-statistics-descriptive/apidocs/)
-- [Commons Statistics Distribution](https://commons.apache.org/proper/commons-statistics/commons-statistics-distribution/apidocs/)
-- [Commons Statistics Inference](https://commons.apache.org/proper/commons-statistics/commons-statistics-inference/apidocs/)
-- [Commons Statistics Ranking](https://commons.apache.org/proper/commons-statistics/commons-statistics-ranking/apidocs/)
-
+The [Javadoc](https://commons.apache.org/proper/commons-statistics/commons-statistics-interval/apidocs) can be browsed.
 Questions related to the usage of Apache Commons Statistics should be posted to the [user mailing list](https://commons.apache.org/mail-lists.html).
 
 Getting the latest release
 --------------------------
 You can download source and binaries from our [download page](https://commons.apache.org/proper/commons-statistics/download_statistics.cgi).
 
-Alternatively, you can pull it from the central Maven repositories, for example:
+Alternatively, you can pull it from the central Maven repositories:
 
 ```xml
 <dependency>
   <groupId>org.apache.commons</groupId>
-  <artifactId>commons-statistics-descriptive</artifactId>
-  <version>1.1</version>
-</dependency>
-<dependency>
-  <groupId>org.apache.commons</groupId>
-  <artifactId>commons-statistics-distribution</artifactId>
-  <version>1.1</version>
-</dependency>
-<dependency>
-  <groupId>org.apache.commons</groupId>
-  <artifactId>commons-statistics-inference</artifactId>
+  <artifactId>commons-statistics-interval</artifactId>
   <version>1.1</version>
 </dependency>
 ```
diff --git a/commons-statistics-interval/pom.xml b/commons-statistics-interval/pom.xml
new file mode 100644
index 0000000..fc04b91
--- /dev/null
+++ b/commons-statistics-interval/pom.xml
@@ -0,0 +1,55 @@
+<?xml version="1.0"?>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+
+  <parent>
+    <groupId>org.apache.commons</groupId>
+    <artifactId>commons-statistics-parent</artifactId>
+    <version>1.2-SNAPSHOT</version>
+  </parent>
+
+  <artifactId>commons-statistics-interval</artifactId>
+  <name>Apache Commons Statistics Interval</name>
+
+  <description>Statistical intervals.</description>
+
+  <properties>
+    <!-- OSGi -->
+    <commons.osgi.symbolicName>org.apache.commons.statistics.interval</commons.osgi.symbolicName>
+    <commons.osgi.export>org.apache.commons.statistics.interval</commons.osgi.export>
+    <!-- Java 9+ -->
+    <commons.module.name>org.apache.commons.statistics.interval</commons.module.name>
+    <!-- Workaround to avoid duplicating config files. -->
+    <statistics.parent.dir>${basedir}/..</statistics.parent.dir>
+    <!-- Reproducible builds -->
+    <project.build.outputTimestamp>${statistics.build.outputTimestamp}</project.build.outputTimestamp>
+    <statistics.jira.component>interval</statistics.jira.component>
+  </properties>
+
+  <dependencies>
+
+    <dependency>
+      <groupId>org.apache.commons</groupId>
+      <artifactId>commons-statistics-distribution</artifactId>
+      <version>1.2-SNAPSHOT</version>
+    </dependency>
+
+  </dependencies>
+
+</project>
diff --git a/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BaseInterval.java b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BaseInterval.java
new file mode 100644
index 0000000..08e2f2a
--- /dev/null
+++ b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BaseInterval.java
@@ -0,0 +1,50 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.commons.statistics.interval;
+
+/**
+ * Base class representing an {@link Interval}.
+ *
+ * @since 1.2
+ */
+final class BaseInterval implements Interval {
+    /** Lower bound. */
+    private final double lower;
+    /** Upper bound. */
+    private final double upper;
+
+    /**
+     * Create an instance.
+     *
+     * @param lower Lower bound.
+     * @param upper Upper bound.
+     */
+    BaseInterval(double lower, double upper) {
+        this.lower = lower;
+        this.upper = upper;
+    }
+
+    @Override
+    public double getLowerBound() {
+        return lower;
+    }
+
+    @Override
+    public double getUpperBound() {
+        return upper;
+    }
+}
diff --git a/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BinomialConfidenceInterval.java b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BinomialConfidenceInterval.java
new file mode 100644
index 0000000..078cbc8
--- /dev/null
+++ b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/BinomialConfidenceInterval.java
@@ -0,0 +1,208 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.statistics.interval; + +import org.apache.commons.statistics.distribution.BetaDistribution; +import org.apache.commons.statistics.distribution.NormalDistribution; + +/** + * Generate confidence intervals for a binomial proportion. + * + * <p>Note: To avoid <em>overshoot</em>, the confidence intervals are clipped to be in the + * {@code [0, 1]} interval in the case of the {@link #NORMAL_APPROXIMATION normal + * approximation} and {@link #AGRESTI_COULL Agresti-Coull} methods. + * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval">Binomial + * proportion confidence interval (Wikipedia)</a> + * + * @since 1.2 + */ +public enum BinomialConfidenceInterval { + /** + * Implements the normal approximation method for creating a binomial proportion + * confidence interval. + * + * <p>This method clips the confidence interval to be in {@code [0, 1]}. 
+ * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Problems_with_using_a_normal_approximation_or_%22Wald_interval%22"> + * Normal approximation interval (Wikipedia)</a> + */ + NORMAL_APPROXIMATION { + @Override + Interval create(int n, int x, double alpha) { + final double z = NORMAL_DISTRIBUTION.inverseSurvivalProbability(alpha * 0.5); + final double p = (double) x / n; + final double distance = z * Math.sqrt(p * (1 - p) / n); + // This may exceed the interval [0, 1] + return new BaseInterval(clip(p - distance), clip(p + distance)); + } + }, + /** + * Implements the Wilson score method for creating a binomial proportion confidence + * interval. + * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Wilson_score_interval"> + * Normal approximation interval (Wikipedia)</a> + */ + WILSON_SCORE { + @Override + Interval create(int n, int x, double alpha) { + final double z = NORMAL_DISTRIBUTION.inverseSurvivalProbability(alpha * 0.5); + final double z2 = z * z; + final double p = (double) x / n; + final double denom = 1 + z2 / n; + final double centre = (p + 0.5 * z2 / n) / denom; + final double distance = z * Math.sqrt(p * (1 - p) / n + z2 / (4.0 * n * n)) / denom; + return new BaseInterval(centre - distance, centre + distance); + } + }, + /** + * Implements the Jeffreys method for creating a binomial proportion confidence + * interval. + * + * <p>In order to avoid the coverage probability tending to zero when {@code p} tends + * towards 0 or 1, when {@code x = 0} the lower limit is set to 0, and when + * {@code x = n} the upper limit is set to 1. + * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Jeffreys_interval"> + * Jeffreys interval (Wikipedia)</a> + */ + JEFFREYS { + @Override + Interval create(int n, int x, double alpha) { + final BetaDistribution d = BetaDistribution.of(x + 0.5, n - x + 0.5); + final double lower = x == 0 ? 
0 : d.inverseCumulativeProbability(alpha * 0.5); + final double upper = x == n ? 1 : d.inverseSurvivalProbability(alpha * 0.5); + return new BaseInterval(lower, upper); + } + }, + /** + * Implements the Clopper-Pearson method for creating a binomial proportion confidence + * interval. + * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Clopper%E2%80%93Pearson_interval"> + * Clopper-Pearson interval (Wikipedia)</a> + */ + CLOPPER_PEARSON { + @Override + Interval create(int n, int x, double alpha) { + double lower = 0; + double upper = 1; + // Use closed form expressions + if (x == 0) { + upper = 1 - Math.pow(alpha * 0.5, 1.0 / n); + } else if (x == n) { + lower = Math.pow(alpha * 0.5, 1.0 / n); + } else { + lower = BetaDistribution.of(x, n - x + 1).inverseCumulativeProbability(alpha * 0.5); + upper = BetaDistribution.of(x + 1, n - x).inverseSurvivalProbability(alpha * 0.5); + } + return new BaseInterval(lower, upper); + } + }, + /** + * Implements the Agresti-Coull method for creating a binomial proportion confidence + * interval. + * + * <p>This method clips the confidence interval to be in {@code [0, 1]}. + * + * @see <a + * href="https://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval#Agresti%E2%80%93Coull_interval"> + * Agresti-Coull interval (Wikipedia)</a> + */ + AGRESTI_COULL { + @Override + Interval create(int n, int x, double alpha) { + final double z = NORMAL_DISTRIBUTION.inverseSurvivalProbability(alpha * 0.5); + final double zSquared = z * z; + final double nc = n + zSquared; + final double p = (x + 0.5 * zSquared) / nc; + final double distance = z * Math.sqrt(p * (1 - p) / nc); + // This may exceed the interval [0, 1] + return new BaseInterval(clip(p - distance), clip(p + distance)); + } + }; + + /** The standard normal distribution. 
*/ + static final NormalDistribution NORMAL_DISTRIBUTION = NormalDistribution.of(0, 1); + + /** + * Create a confidence interval for the true probability of success of an unknown + * binomial distribution with the given observed number of trials, successes and error + * rate. + * + * <p>The error rate {@code alpha} is related to the confidence level that the + * interval contains the true probability of success as + * {@code alpha = 1 - confidence}, where {@code confidence} is the confidence level + * in {@code [0, 1]}. For example a 95% confidence level is an {@code alpha} of 0.05. + * + * @param numberOfTrials Number of trials. + * @param numberOfSuccesses Number of successes. + * @param alpha Desired error rate that the true probability of success falls <em>outside</em> + * the returned interval. + * @return Confidence interval containing the probability of success with error rate + * {@code alpha} + * @throws IllegalArgumentException if {@code numberOfTrials <= 0}, if + * {@code numberOfSuccesses < 0}, if {@code numberOfSuccesses > numberOfTrials}, or if + * {@code alpha} is not in the open interval {@code (0, 1)}. 
+ */ + public Interval fromErrorRate(int numberOfTrials, int numberOfSuccesses, double alpha) { + if (numberOfTrials <= 0) { + throw new IllegalArgumentException("Number of trials is not strictly positive: " + numberOfTrials); + } + if (numberOfSuccesses < 0) { + throw new IllegalArgumentException("Number of successes is not positive: " + numberOfSuccesses); + } + if (numberOfSuccesses > numberOfTrials) { + throw new IllegalArgumentException( + String.format("Number of successes (%d) must be less than or equal to number of trials (%d)", + numberOfSuccesses, numberOfTrials)); + } + // Negation of alpha inside the interval (0, 1) detects NaN + if (!(alpha > 0 && alpha < 1)) { + throw new IllegalArgumentException("Error rate is not in (0, 1): " + alpha); + } + return create(numberOfTrials, numberOfSuccesses, alpha); + } + + /** + * Create a confidence interval for the true probability of success of an unknown + * binomial distribution with the given observed number of trials, successes and error + * rate. + * + * @param n Number of trials. + * @param x Number of successes. + * @param alpha Desired error rate. + * @return Confidence interval + */ + abstract Interval create(int n, int x, double alpha); + + /** + * Clip the probability to [0, 1]. + * + * @param p Probability. + * @return the probability in [0, 1] + */ + static double clip(double p) { + return Math.min(1, Math.max(0, p)); + } +} diff --git a/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/Interval.java b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/Interval.java new file mode 100644 index 0000000..e8bffce --- /dev/null +++ b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/Interval.java @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. 
See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.commons.statistics.interval; + +/** + * Interface representing an interval. + * + * @since 1.2 + */ +public interface Interval { + + /** + * Get the lower bound of the interval. + * + * @return the lower end point of the interval + */ + double getLowerBound(); + + /** + * Get the upper bound of the interval. + * + * @return the upper end point of the interval + */ + double getUpperBound(); +} diff --git a/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/package-info.java b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/package-info.java new file mode 100644 index 0000000..5c8cb19 --- /dev/null +++ b/commons-statistics-interval/src/main/java/org/apache/commons/statistics/interval/package-info.java @@ -0,0 +1,23 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +/** + * Classes providing statistical intervals. + * + * @since 1.2 + */ +package org.apache.commons.statistics.interval; diff --git a/commons-statistics-interval/src/site/resources/profile.jacoco b/commons-statistics-interval/src/site/resources/profile.jacoco new file mode 100644 index 0000000..a12755f --- /dev/null +++ b/commons-statistics-interval/src/site/resources/profile.jacoco @@ -0,0 +1,17 @@ +# Licensed to the Apache Software Foundation (ASF) under one or more +# contributor license agreements. See the NOTICE file distributed with +# this work for additional information regarding copyright ownership. +# The ASF licenses this file to You under the Apache License, Version 2.0 +# (the "License"); you may not use this file except in compliance with +# the License. You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+# ----------------------------------------------------------------------------- +# +# Empty file used to automatically trigger JaCoCo profile from commons parent pom diff --git a/commons-statistics-interval/src/site/site.xml b/commons-statistics-interval/src/site/site.xml new file mode 100644 index 0000000..7812358 --- /dev/null +++ b/commons-statistics-interval/src/site/site.xml @@ -0,0 +1,32 @@ +<?xml version="1.0" encoding="ISO-8859-1"?> +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--> +<site name="Statistics"> + <!-- Using a full URL allows a correct banner for the modules. 
--> + <bannerRight name="Apache Commons Statistics" href="https://commons.apache.org/proper/commons-statistics/index.html"> + <image src="https://commons.apache.org/proper/commons-statistics/images/commons_statistics.small.png"/> + </bannerRight> + + <body> + <menu name="Statistics Interval"> + <item name="Overview" href="index.html"/> + <item name="Latest API docs (development)" + href="apidocs/index.html"/> + </menu> + + </body> +</site> diff --git a/commons-statistics-interval/src/site/xdoc/index.xml b/commons-statistics-interval/src/site/xdoc/index.xml new file mode 100644 index 0000000..fe07b59 --- /dev/null +++ b/commons-statistics-interval/src/site/xdoc/index.xml @@ -0,0 +1,53 @@ +<?xml version="1.0"?> + +<!-- + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. + --> + +<document> + + <properties> + <title>Apache Commons Statistics Interval</title> + </properties> + + <body> + + <section name="Apache Commons Statistics: Interval" href="summary"> + <p> + Apache Commons Statistics provides statistical intervals. 
+ </p> + + <p> + Example: + </p> + +<source class="prettyprint">import org.apache.commons.statistics.interval.Interval; +import org.apache.commons.statistics.interval.BinomialConfidenceInterval; + +int n = 400; +int x = 20; +double alpha = 0.05; +Interval interval = BinomialConfidenceInterval.CLOPPER_PEARSON.fromErrorRate(n, x, alpha); +</source> + + <p> + Browse the <a href="apidocs/index.html">Javadoc</a> for more information. + </p> + </section> + + </body> + +</document> diff --git a/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/BinomialConfidenceIntervalTest.java b/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/BinomialConfidenceIntervalTest.java new file mode 100644 index 0000000..de8a355 --- /dev/null +++ b/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/BinomialConfidenceIntervalTest.java @@ -0,0 +1,139 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.commons.statistics.interval; + +import java.util.stream.Stream; +import java.util.stream.Stream.Builder; +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.params.ParameterizedTest; +import org.junit.jupiter.params.provider.Arguments; +import org.junit.jupiter.params.provider.EnumSource; +import org.junit.jupiter.params.provider.MethodSource; + +/** + * Test cases for {@link BinomialConfidenceInterval}. + */ +class BinomialConfidenceIntervalTest { + @ParameterizedTest + @EnumSource + void testInvalidArgumentsThrow(BinomialConfidenceInterval method) { + int n = 10; + int x = 5; + double alpha = 0.05; + Assertions.assertDoesNotThrow(() -> method.fromErrorRate(n, x, alpha)); + // n <= 0 + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(-1, x, alpha)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(0, x, alpha)); + // x < 0 + Assertions.assertDoesNotThrow(() -> method.fromErrorRate(n, 0, alpha)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, -1, alpha)); + // x > n + Assertions.assertDoesNotThrow(() -> method.fromErrorRate(n, n, alpha)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, n + 1, alpha)); + // alpha not in (0, 1) + Assertions.assertDoesNotThrow(() -> method.fromErrorRate(n, x, Math.nextUp(0.0))); + Assertions.assertDoesNotThrow(() -> method.fromErrorRate(n, x, Math.nextDown(1.0))); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, x, 0.0)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, x, 1.0)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, x, -0.01)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, x, 1.01)); + Assertions.assertThrows(IllegalArgumentException.class, () -> method.fromErrorRate(n, x, 
Double.NaN)); + } + + @ParameterizedTest + @MethodSource() + void testInterval(BinomialConfidenceInterval method, int n, int x, double alpha, + double lower, double upper, double relativeError) { + final Interval i = method.fromErrorRate(n, x, alpha); + Assertions.assertEquals(lower, i.getLowerBound(), lower * relativeError, "lower"); + Assertions.assertEquals(upper, i.getUpperBound(), upper * relativeError, "upper"); + } + + static Stream<Arguments> testInterval() { + final Builder<Arguments> builder = Stream.builder(); + // Cases taken from Commons Math. + // Results generated using Python statsmodels.stats.proportion.proportion_confint + // with method parameter: + // normal : asymptotic normal approximation + // agresti_coull : Agresti-Coull interval + // beta : Clopper-Pearson interval based on Beta distribution + // wilson : Wilson Score interval + // jeffreys : Jeffreys Bayesian Interval + // E.g. + // proportion_confint(0,10,method='beta') = (0, 0.3084971078187608) + final int n = 500; + final int x = 50; + final double alpha = 0.1; + add(builder, BinomialConfidenceInterval.NORMAL_APPROXIMATION, n, x, alpha, 0.07793197286259657, 0.12206802713740345, 1e-15); + add(builder, BinomialConfidenceInterval.WILSON_SCORE, n, x, alpha, 0.0800391858824593, 0.12426638582141426, 1e-15); + add(builder, BinomialConfidenceInterval.JEFFREYS, n, x, alpha, 0.07963646817350203, 0.1237728842019873, 1e-15); + add(builder, BinomialConfidenceInterval.CLOPPER_PEARSON, n, x, alpha, 0.07873857004520295, 0.1248658074138089, 1e-15); + add(builder, BinomialConfidenceInterval.AGRESTI_COULL, n, x, alpha, 0.07993520614825012, 0.12437036555562345, 1e-15); + // Expand Commons Math Clopper-Pearson test to all methods. 
+ // Test MATH-1421: lower >= 0 when x=0 + BinomialConfidenceInterval method; + method = BinomialConfidenceInterval.NORMAL_APPROXIMATION; + add(builder, method, 10, 0, 0.05, 0, 0, 1e-15); + add(builder, method, 10, 10, 0.05, 1, 1, 1e-15); + add(builder, method, 10, 3, 0.05, 0.015974234910674567, 0.5840257650893255, 1e-15); + add(builder, method, 400, 20, 0.05, 0.028641787646026474, 0.07135821235397354, 1e-15); + add(builder, method, 19436, 0, 0.05, 0, 0, 1e-15); + // Interval requires clipping to [0, 1] + add(builder, method, 100, 1, 0.05, 0, 0.02950139541798788, 1e-15); + add(builder, method, 100, 99, 0.05, 0.9704986045820121, 1, 1e-15); + method = BinomialConfidenceInterval.WILSON_SCORE; + add(builder, method, 10, 0, 0.05, 0.0, 0.27753279986288926, 1e-15); + add(builder, method, 10, 10, 0.05, 0.7224672001371106, 1.0, 1e-15); + add(builder, method, 10, 3, 0.05, 0.10779126740630104, 0.6032218525388546, 1e-15); + add(builder, method, 400, 20, 0.05, 0.03259742983714725, 0.07596363506371961, 1e-15); + add(builder, method, 19436, 0, 0.05, 1.3552527156068805e-20, 0.00019760751798472573, 1e-15); + method = BinomialConfidenceInterval.JEFFREYS; + // Note: Java implementation sets limits when x=0 or n to 0 or 1 respectively + add(builder, method, 10, 0, 0.05, /*4.7890433157581984e-05*/ 0, 0.21719626750921053, 1e-14); + add(builder, method, 10, 10, 0.05, 0.7828037324907895, /*0.9999521095668424*/ 1, 1e-15); + add(builder, method, 10, 3, 0.05, 0.09269459393815319, 0.6058183181486713, 1e-15); + add(builder, method, 400, 20, 0.05, 0.031795039152749435, 0.07467797318472456, 1e-15); + add(builder, method, 19436, 0, 0.05, /*2.5263852458384886e-08*/ 0, 0.00012923175911984633, 1e-13); + method = BinomialConfidenceInterval.CLOPPER_PEARSON; + add(builder, method, 10, 0, 0.05, 0, 0.3084971078187608, 1e-15); + add(builder, method, 10, 10, 0.05, 0.6915028921812392, 1, 1e-15); + add(builder, method, 10, 3, 0.05, 0.06673951117773447, 0.6524528500599972, 1e-15); + add(builder, method, 400, 
20, 0.05, 0.030805241143265938, 0.07616697275514255, 1e-15); + // proportion_confint does not match this implementation. + // Computed using code from Wikipedia: + // from scipy.stats import beta + // import numpy as np + // k = 0 + // n = 19436 + // alpha = 0.05 + // p_u, p_o = beta.ppf([alpha / 2, 1 - alpha / 2], [k, k + 1], [n - k + 1, n - k]) + add(builder, method, 19436, 0, 0.05, 0.0, 0.0001897782161226719, 1e-12); + method = BinomialConfidenceInterval.AGRESTI_COULL; + add(builder, method, 10, 0, 0.05, 0.0, 0.3208873057505458, 1e-15); + add(builder, method, 10, 10, 0.05, 0.6791126942494543, 1.0, 1e-15); + add(builder, method, 10, 3, 0.05, 0.10333841792242526, 0.6076747020227304, 1e-15); + add(builder, method, 400, 20, 0.05, 0.03218289448554276, 0.0763781704153241, 1e-15); + add(builder, method, 19436, 0, 0.05, 0.0, 0.00023852647189663768, 1e-15); + return builder.build(); + } + + private static void add(Builder<Arguments> builder, BinomialConfidenceInterval method, + int n, int x, double alpha, + double lower, double upper, double relativeError) { + builder.accept(Arguments.of(method, n, x, alpha, lower, upper, relativeError)); + } +} diff --git a/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/UserGuideTest.java b/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/UserGuideTest.java new file mode 100644 index 0000000..d7b4d01 --- /dev/null +++ b/commons-statistics-interval/src/test/java/org/apache/commons/statistics/interval/UserGuideTest.java @@ -0,0 +1,48 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.commons.statistics.interval; + +import org.junit.jupiter.api.Assertions; +import org.junit.jupiter.api.Test; + +/** + * Test code used in the interval section of the user guide. + */ +class UserGuideTest { + @Test + void testInterval1() { + // Results generated using Python statsmodels.stats.proportion.proportion_confint + // >>> print('%.5f, %.5f' % proportion_confint(5, 10, 0.05, method='wilson')) + // 0.23659, 0.76341 + BinomialConfidenceInterval method = BinomialConfidenceInterval.WILSON_SCORE; + double alpha = 0.05; + + Interval interval = method.fromErrorRate(10, 5, alpha); + Assertions.assertEquals(0.23659, interval.getLowerBound(), 1e-5); + Assertions.assertEquals(0.76341, interval.getUpperBound(), 1e-5); + + assertInterval(method.fromErrorRate(100, 50, alpha), 0.40383, 0.59617, 1e-5); + assertInterval(method.fromErrorRate(1000, 500, alpha), 0.46907, 0.53093, 1e-5); + assertInterval(method.fromErrorRate(10000, 5000, alpha), 0.49020, 0.50980, 1e-5); + } + + private static void assertInterval(Interval interval, double lower, double upper, double relError) { + Assertions.assertEquals(lower, interval.getLowerBound(), lower * relError, "lower"); + Assertions.assertEquals(upper, interval.getUpperBound(), upper * relError, "upper"); + } +} diff --git a/dist-archive/pom.xml b/dist-archive/pom.xml index 9a3196a..1f1dc0e 100644 --- a/dist-archive/pom.xml +++ b/dist-archive/pom.xml @@ -118,6 +118,25 @@ under the License. 
<classifier>javadoc</classifier> </dependency> + <!-- Module: Interval --> + <dependency> + <groupId>org.apache.commons</groupId> + <artifactId>commons-statistics-interval</artifactId> + <version>1.2-SNAPSHOT</version> + </dependency> + <dependency> + <groupId>org.apache.commons</groupId> + <artifactId>commons-statistics-interval</artifactId> + <version>1.2-SNAPSHOT</version> + <classifier>sources</classifier> + </dependency> + <dependency> + <groupId>org.apache.commons</groupId> + <artifactId>commons-statistics-interval</artifactId> + <version>1.2-SNAPSHOT</version> + <classifier>javadoc</classifier> + </dependency> + <!-- Module: Ranking --> <dependency> <groupId>org.apache.commons</groupId> diff --git a/doc/release/copyLongTermJavadoc.sh b/doc/release/copyLongTermJavadoc.sh index 484e262..d1ddabd 100755 --- a/doc/release/copyLongTermJavadoc.sh +++ b/doc/release/copyLongTermJavadoc.sh @@ -22,6 +22,7 @@ set -e MODULES=(commons-statistics-descriptive \ commons-statistics-distribution \ commons-statistics-inference \ + commons-statistics-interval \ commons-statistics-ranking) while getopts r:v: option diff --git a/pom.xml b/pom.xml index e860b67..8bd26c5 100644 --- a/pom.xml +++ b/pom.xml @@ -629,6 +629,7 @@ This is avoided by creating an empty directory when svn is not available. <module>commons-statistics-distribution</module> <module>commons-statistics-ranking</module> <module>commons-statistics-inference</module> + <module>commons-statistics-interval</module> <module>commons-statistics-regression</module> <module>commons-statistics-bom</module> <!-- Include an aggregate module to build aggregate javadoc and test coverage reports --> diff --git a/src/changes/changes.xml b/src/changes/changes.xml index 99d1f0d..d837c90 100644 --- a/src/changes/changes.xml +++ b/src/changes/changes.xml @@ -56,6 +56,10 @@ If the output is not quite correct, check for invisible trailing spaces! 
<release version="1.2" date="TBD" description=" New features, updates and bug fixes (requires Java 8). "> + <action dev="aherbert" type="add" issue="STATISTICS-6"> + Add a commons-statistics-interval module for statistical intervals. This ports and + updates functionality in org.apache.commons.math4.legacy.stat.interval. + </action> <action dev="aherbert" type="add" issue="STATISTICS-90"> Support creation of descriptive statistics from an array range defined using [from, to) indices. diff --git a/src/conf/checkstyle/checkstyle-suppressions.xml b/src/conf/checkstyle/checkstyle-suppressions.xml index 4e2dbc6..7e47f14 100644 --- a/src/conf/checkstyle/checkstyle-suppressions.xml +++ b/src/conf/checkstyle/checkstyle-suppressions.xml @@ -40,6 +40,7 @@ <suppress checks="ParameterNumber" files=".*[/\\]MannWhitneyUTestTest.java" /> <suppress checks="ParameterNumber" files=".*[/\\]WilcoxonSignedRankTestTest.java" /> <suppress checks="ParameterNumber" files=".*[/\\]UnconditionedExactTestTest.java" /> + <suppress checks="ParameterNumber" files=".*[/\\]BinomialConfidenceIntervalTest.java" /> <suppress checks="MethodLength" files=".*[/\\]WilcoxonSignedRankTestTest.java" /> <suppress checks="IllegalCatch" files=".*[/\\]TestHelper.java" lines="295-410" /> <suppress checks="IllegalCatch" files=".*[/\\]BaseStatisticTest.java" lines="280-400" /> diff --git a/src/site/xdoc/userguide/index.xml b/src/site/xdoc/userguide/index.xml index a5dac66..cff4d81 100644 --- a/src/site/xdoc/userguide/index.xml +++ b/src/site/xdoc/userguide/index.xml @@ -77,6 +77,9 @@ </li> </ul> </li> + <li> + <a href="#interval">Interval</a> + </li> <li> <a href="#ranking">Ranking</a> </li> @@ -109,6 +112,10 @@ <code><a href="../commons-statistics-inference/index.html"> commons-statistics-inference</a></code> - Provides hypothesis testing. </li> + <li> + <code><a href="../commons-statistics-interval/index.html"> + commons-statistics-interval</a></code> - Provides statistical intervals. 
+ </li> <li> <code><a href="../commons-statistics-ranking/index.html"> commons-statistics-ranking</a></code> - Provides rank transformations. @@ -722,6 +729,42 @@ result.reject(0.001); // true </table> </subsection> </section> + <section name="Interval" id="interval"> + <p> + The <code>commons-statistics-interval</code> module provides statistical intervals. + </p> + <p> + The <code>Interval</code> interface provides a bounded interval with a lower and upper + bound: \( [l, u] \). + </p> + <p> + The <code>BinomialConfidenceInterval</code> enumeration provides methods + to create a confidence interval for a binomial proportion. This is an interval + estimate of the probability of success given a series of success-failure experiments. + The interval is constructed using a confidence level. For example, a 95% confidence interval + will contain the true proportion of successes 95% of the times that the procedure + for constructing the confidence interval is employed. The target error rate \( \alpha \) + is defined as \( 1 - confidence \) when expressing the confidence level as a probability + in \( (0, 1) \). + </p> + <p> + The following example demonstrates an ideal coin toss experiment. Note how the 95% + confidence interval for the true probability narrows as the number of trials + increases. + </p> +<source class="prettyprint"> +BinomialConfidenceInterval method = BinomialConfidenceInterval.WILSON_SCORE; +double alpha = 0.05; + +Interval interval = method.fromErrorRate(10, 5, alpha); +interval.getLowerBound(); // 0.23659 +interval.getUpperBound(); // 0.76341 + +method.fromErrorRate(100, 50, alpha); // 0.40383, 0.59617 +method.fromErrorRate(1000, 500, alpha); // 0.46907, 0.53093 +method.fromErrorRate(10000, 5000, alpha); // 0.49020, 0.50980 +</source> + </section> <section name="Ranking" id="ranking"> <p> The <code>commons-statistics-ranking</code> module provides rank transformations.