Dear list,
I'm computing a p-value for the Student test and discover some inconsistencies
with the cdf pt().
The observed statistic is 11.23995 for 95 observations, so the p-value is very
small
> t_score <- 11.23995
> n <- 95
> print(pt(t_score, df = n-2, lower=FALSE), digits=22)
[1] 2.539746620181247991746e-19
> print(integrate(dt, lower=t_score, upper=Inf, df=n-2)$value, digits = 22)
[1] 2.539746631161970791961e-19
But if I compute with pt(lower=TRUE), I got 0
> print(1-pt(t_score, df = n-2, lower=TRUE), digits=22)
[1] 0
Indeed, the p-value is lower than the epsilon machine
> pt(t_score, df = n-2, lower=FALSE) < .Machine$double.eps
[1] TRUE
Using the square of t statistic which follows a Fisher distribution, I got the
same issue:
> print(pf(z, 1, n-2, lower=FALSE), digits=22)
[1] 5.079493240362495983491e-19
> print(integrate(df, lower=z, upper=Inf, df1=1, df2=n-2)$value, digits = 22)
[1] 5.079015231299358486828e-19
> print(1-pf(z, 1, n-2, lower=TRUE), digits=22)
[1] 0
When using the t.test() function, the p-value is naturally printed : p-value <
2.2e-16.
Any comment is welcome.
Christophe
> R.version
_
platform aarch64-apple-darwin20
arch aarch64
os darwin20
system aarch64, darwin20
status
major 4
minor 5.1
year 2025
month 06
day 13
svn rev 88306
language R
version.string R version 4.5.1 (2025-06-13)
nickname Great Square Root
-------------------------------------------------
Christophe DUTANG
LJK, Ensimag, Grenoble INP, UGA, France
ILB research fellow
Web: http://dutangc.free.fr
-------------------------------------------------
[[alternative HTML version deleted]]
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide https://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.