Re: [R] [EXT] Re: A very small p-value

Viechtbauer, Wolfgang (NP) via R-help Wed, 05 Nov 2025 06:04:53 -0800

Eik, thanks for posting this. I thought that the page was making the usual 
(just somewhat flawed) argument that once the dfs are sufficiently large, 
whether one does pnorm(...) or pt(..., df=<>) makes little difference (although 
far out in the tails it still does).
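
To make the point about the tails concrete, here is a minimal sketch. The df of 93 mirrors the n-2 from the original post further down; the cutoffs 2 and 5 are arbitrary choices for illustration and do not come from the thread.

```r
# Upper-tail probabilities from the normal vs. the t-distribution.
# df_t = 93 corresponds to n - 2 with n = 95; the cutoffs are illustrative.
df_t <- 93
x <- c(2, 5, 11.23995)

p_norm <- pnorm(x, lower.tail = FALSE)          # normal upper tail
p_t    <- pt(x, df = df_t, lower.tail = FALSE)  # t upper tail
ratio  <- p_t / p_norm                          # how far apart the two are

print(data.frame(x, p_norm, p_t, ratio))
```

Near the centre the two tail probabilities are close, but the ratio should grow by many orders of magnitude as x moves far out, since the t-distribution has heavier tails.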


Your post made me look at the page, and I hope nobody takes anything written 
there seriously. The argument is utterly wrong. I am absolutely flabbergasted 
that somebody could write so many pages of text based on such a flawed 
understanding of basic statistical concepts.

Just to give some examples:

"The next issue I have is that I can't see the underlying data. So I don't know 
what the actual shape of the distribution is, but it's probably fair to say 
it's normally distributed (assuming the Central Limit Theorem applies)." The 
CLT says nothing about the distribution of the raw data; it concerns the 
sampling distribution of the (standardized) sample mean.
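
A small simulation can illustrate this. It is only a sketch: the exponential distribution, the seed, and the number of replications are my own assumptions, and the sample size of 95 is borrowed from the original post.

```r
# Raw data stay skewed no matter how large n is; only the sample means
# become approximately normal.
set.seed(1)    # arbitrary seed, for reproducibility only
n    <- 95     # observations per simulated sample
reps <- 5000   # number of simulated samples (arbitrary)

raw   <- rexp(n * reps, rate = 1)              # strongly right-skewed raw data
means <- rowMeans(matrix(raw, nrow = reps))    # one mean per sample of size n

# Simple moment-based skewness (hypothetical helper, not a base R function)
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

print(c(raw = skew(raw), means = skew(means)))
```

The skewness of the raw values should stay close to that of the exponential distribution, while the skewness of the means should be much closer to zero.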

"As the sample size increases, samples will begin to operate and appear more 
and more like the population they are drawn from. This is the Law of Large 
Numbers." The law of large numbers has nothing to do with this; it concerns 
the convergence of the sample mean to the population mean.
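
As a rough sketch of what the law of large numbers does describe (the exponential distribution with mean 1 and the seed are assumptions chosen for illustration):

```r
# Running sample mean of i.i.d. draws approaching the population mean.
set.seed(2)                        # arbitrary seed
x <- rexp(1e5, rate = 1)           # population mean is 1 / rate = 1
running_mean <- cumsum(x) / seq_along(x)

# Sample mean after 10, 100, 1000, and 100000 observations
print(running_mean[c(10, 100, 1000, 1e5)])
```

The running mean should settle near 1 as more observations accumulate, which is a statement about the mean, not about the shape of the sample.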

And as Eik already pointed out, the 'z-test' the author is describing is not a 
test at all, but essentially just calculates the standardized mean difference 
(and computing a p-value from it makes no sense).
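
The difference between the two quantities can be sketched as follows. All numbers are hypothetical summary statistics (two equally sized groups with a common, known SD), not values taken from the page or from this thread.

```r
# Standardized mean difference vs. z statistic for the same mean difference
# and SD, with varying group sizes (all values hypothetical).
mean_diff   <- 0.5
sd_common   <- 1
n_per_group <- c(10, 95, 1000)

d  <- mean_diff / sd_common                  # effect size: no n involved
se <- sd_common * sqrt(2 / n_per_group)      # standard error of the difference
z  <- mean_diff / se                         # z statistic: grows with sqrt(n)
p  <- 2 * pnorm(abs(z), lower.tail = FALSE)  # two-sided p-value

print(data.frame(n_per_group, d, z, p))
```

The effect size is the same in every row, while z increases and the p-value decreases as the group size grows.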

Best,
Wolfgang

> -----Original Message-----
> From: R-help <[email protected]> On Behalf Of Eik Vettorazzi via R-help
> Sent: Tuesday, November 4, 2025 20:44
> To: Petr Pikal <[email protected]>; Christophe Dutang <[email protected]>
> Cc: [email protected]
> Subject: Re: [R] [EXT] Re: A very small p-value
>
> Hi,
> Stepping briefly outside the R context, I noticed a statistical point in
> the text you linked that, in my opinion, isn't quite right. I believe
> there's a key misunderstanding here: The statement that the z-test does
> not depend on the number of cases is incorrect. The p-value of the
> z-test is —just like other tests— very much dependent on the sample
> size, assuming the same mean difference and standard deviation.
> The text you linked is actually calculating an Effect Size, which is
> (largely) independent of the sample size. Effect Size answers the
> question of how "relevant" or "large" the difference between groups is.
> This is fundamentally different from testing for "significant" differences.
> Specifically, the crucial 1/\sqrt{n} term, which is necessary for
> calculating the standard error of the mean difference, seems to be
> missing from the presented formula for the z-score. I just wanted to
> quickly point this out.
>
> Best regards
>
> On 27.10.2025 at 14:12, Petr Pikal wrote:
> > Hello
> >
> > The t test is probably not the best option in your case. With 95
> > observations your data behave more like a population and you may get
> > better insight using z-test. See
> > https://toxictruthblog.com/avoiding-little-known-problems-with-the-t-test/
> >
> > Best regards.
> > Petr
> >
> > On Sat, 25. 10. 2025 at 11:46, Christophe Dutang <[email protected]>
> > wrote:
> >
> >> Dear list,
> >>
> >> I'm computing a p-value for the Student t-test and have discovered some
> >> inconsistencies with the cdf pt().
> >>
> >> The observed statistic is 11.23995 for 95 observations, so the p-value is
> >> very small
> >>
> >>> t_score <- 11.23995
> >>> n <- 95
> >>> print(pt(t_score, df = n-2, lower=FALSE), digits=22)
> >> [1] 2.539746620181247991746e-19
> >>> print(integrate(dt, lower=t_score, upper=Inf, df=n-2)$value, digits = 22)
> >> [1] 2.539746631161970791961e-19
> >>
> >> But if I compute it with pt(lower=TRUE), I get 0
> >>
> >>> print(1-pt(t_score, df = n-2, lower=TRUE), digits=22)
> >> [1] 0
> >>
> >> Indeed, the p-value is lower than the machine epsilon
> >>
> >>> pt(t_score, df = n-2, lower=FALSE) < .Machine$double.eps
> >> [1] TRUE
> >>
> >> Using the square of t statistic which follows a Fisher distribution, I got
> >> the same issue:
> >>
> >>> z <- t_score^2  # square of the t statistic; definition missing from the original post
> >>> print(pf(z, 1, n-2, lower=FALSE), digits=22)
> >> [1] 5.079493240362495983491e-19
> >>> print(integrate(df, lower=z, upper=Inf, df1=1, df2=n-2)$value, digits = 22)
> >> [1] 5.079015231299358486828e-19
> >>> print(1-pf(z, 1, n-2, lower=TRUE), digits=22)
> >> [1] 0
> >>
> >> When using the t.test() function, the p-value is naturally printed as
> >> p-value < 2.2e-16.
> >>
> >> Any comment is welcome.
> >>
> >> Christophe
