In the general setting of observational studies, your point is
undoubtedly true, and apparently you believe it to be true even in the
setting of designed experiments. Perhaps I should have confined myself
to my first sentence.
--
David.
On Aug 2, 2010, at 2:05 PM, Bert Gunter wrote:
David et al.:
I take issue with this. It is the lack of independence that is the
major issue. In particular, clustering, split-plotting, and so forth
due to "convenience order" experimentation, lack of randomization,
and exogenous effects such as systematic effects of measurement
method/location do the most to induce bias and distort inference.
Non-normality and unequal variances typically pale into
insignificance compared to this.
Obviously, IMHO.
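A small simulation can make the independence point concrete. The sketch below is an illustration only, in base R: the function name `one_p_value`, the batch layout, and every number in it are assumptions invented for the example, not anything from the original experiment. Each treatment is run in its own batches, every batch carries a shared random shift, there is no true treatment effect, and the analysis wrongly treats all observations as independent.

```r
set.seed(123)

# One simulated experiment with NO true treatment effect.
# Treatments are confounded with batches ("convenience order"),
# and each batch adds its own random shift to all of its observations.
one_p_value <- function(n_batch = 6, per_batch = 50, batch_sd = 1) {
  batch <- rep(seq_len(n_batch), each = per_batch)
  trt <- factor(rep(rep(c("A", "B", "C"), length.out = n_batch),
                    each = per_batch))
  y <- rnorm(n_batch, sd = batch_sd)[batch] + rnorm(n_batch * per_batch)
  # One-way ANOVA that ignores the batch structure entirely
  summary(aov(y ~ trt))[[1]][["Pr(>F)"]][1]
}

p <- replicate(500, one_p_value())

# Share of simulated experiments declared "significant" at the 5% level;
# under valid inference this should be close to 0.05
false_positive_rate <- mean(p < 0.05)
false_positive_rate
```

With errors that are exactly normal and of equal variance, the rejection rate under the null still ends up far above the nominal 5%, purely because the batch dependence is ignored.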
Note 1: George Box noted this at least 50 years ago in the early
'60s when he and Jenkins developed ARIMA modeling.
Note 2: If you can, have a look at Jack Youden's classic paper
"Enduring Values", which comments to some extent on these issues,
here: http://www.jstor.org/pss/1266913
Cheers,
Bert
Bert Gunter
Genentech Nonclinical Biostatistics
On Mon, Aug 2, 2010 at 10:32 AM, David Winsemius <dwinsem...@comcast.net> wrote:
On Aug 2, 2010, at 9:33 AM, wwreith wrote:
I am conducting an experiment with four independent variables, each
of which has three or more factor levels. The sample size is quite
large, i.e. several thousand. The dependent variable data does not
pass a normality test but "visually" looks close to normal, so is
there a way to compute the effect this would have on the p-value for
ANOVA, or is there a way to perform a nonparametric test in R that
will handle this many independent variables? Simply saying ANOVA is
robust to small departures from normality is not going to be good
enough for my client.
The statistical assumption of normality for linear models does not
apply to the distribution of the dependent variable, but rather to
the residuals after a model is estimated. Furthermore, it is the
homoskedasticity assumption that is more commonly violated and is
also a greater threat to validity. (And if you don't already know
both of these points, then you desperately need to review your basic
modeling practices.)
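As an illustration of the distinction between the response and the residuals, here is a minimal sketch in base R on simulated data. The data frame `d`, the factors `f1` and `f2`, and all effect sizes are invented for the example and are not taken from the original question. The pooled response is far from normal because it mixes groups with different means, while the model assumption concerns the residuals; `bartlett.test()` is shown as one possible check of equal variances across cells.

```r
set.seed(1)

# Hypothetical balanced design: two factors with three levels each
d <- expand.grid(f1 = factor(1:3), f2 = factor(1:3), rep = 1:200)

# Large group effects plus normal, constant-variance errors
d$y <- 10 * as.integer(d$f1) + 3 * as.integer(d$f2) + rnorm(nrow(d))

fit <- aov(y ~ f1 * f2, data = d)

# Normality test on the raw response: it mixes nine group means,
# so it looks non-normal even though the errors are normal
p_raw <- shapiro.test(d$y)$p.value

# Normality test on the residuals: this is what the assumption refers to
p_res <- shapiro.test(residuals(fit))$p.value

# Equal-variance check across the nine cells
p_var <- bartlett.test(y ~ interaction(f1, f2), data = d)$p.value

c(raw = p_raw, residuals = p_res, variance = p_var)

# With several thousand observations, plots are usually more useful
# than formal tests:
# plot(fit, which = 1)   # residuals vs fitted values
# plot(fit, which = 2)   # normal Q-Q plot of the residuals
```

At sample sizes in the thousands, a formal normality test will flag departures too small to matter, which is why the graphical checks in the commented lines are often the better guide.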
I need to compute an error amount for
ANOVA or find a nonparametric equivalent.
You might get a better answer if you expressed the first part of
that question in unambiguous terminology. What is "error amount"?
For the second part, there is an entire Task View on Robust
Statistical Methods.
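One nonparametric route that copes with several factors at once is a permutation test of the overall F statistic. The sketch below uses base R only and is not drawn from the Task View; the data, the factor names `f1` to `f4`, and the helper `overall_f` are hypothetical. It tests only the global null hypothesis that the response is unrelated to all four factors, and it still assumes the observations are exchangeable, so it does nothing to repair a lack of independence.

```r
set.seed(42)

# Hypothetical data: four factors with three levels each, skewed errors
n <- 600
d <- data.frame(
  f1 = factor(sample(1:3, n, replace = TRUE)),
  f2 = factor(sample(1:3, n, replace = TRUE)),
  f3 = factor(sample(1:3, n, replace = TRUE)),
  f4 = factor(sample(1:3, n, replace = TRUE))
)
d$y <- 0.5 * (d$f1 == "3") + rexp(n)   # only f1 has an effect

# Overall F statistic of the main-effects model for a given response
overall_f <- function(y, data) {
  data$y <- y
  fit <- lm(y ~ f1 + f2 + f3 + f4, data = data)
  unname(summary(fit)$fstatistic["value"])
}

f_obs <- overall_f(d$y, d)

# Reference distribution: shuffle the response, refit, recompute F
n_perm <- 999
f_perm <- replicate(n_perm, overall_f(sample(d$y), d))

# Permutation p-value; the +1 counts the observed arrangement itself
p_perm <- (1 + sum(f_perm >= f_obs)) / (n_perm + 1)
p_perm
```

Because the reference distribution is built from the data themselves, the p-value does not rely on normal errors, which speaks directly to the concern about the normality test.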
--
David Winsemius, MD
West Hartford, CT
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help