In the general setting of observational studies, your point is undoubtedly true, and apparently you believe it to be true even in the setting of designed experiments. Perhaps I should have confined myself to my first sentence.

--
David.


On Aug 2, 2010, at 2:05 PM, Bert Gunter wrote:

David et al.:

I take issue with this. It is the lack of independence that is the major issue. In particular, clustering, split-plotting, and so forth due to "convenience order" experimentation, lack of randomization, and exogenous effects such as systematic effects of measurement method/location are what chiefly induce bias and distort inference. Non-normality and unequal variances typically pale into insignificance compared to this.

Obviously, IMHO.
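
To illustrate the independence point, here is a minimal R sketch on simulated data. The function name sim_once and all parameter values are hypothetical, chosen only for illustration. Treatment is applied to whole clusters and there is no true treatment effect, yet an analysis that treats the observations as independent rejects the null far more often than the nominal 5%.

```r
# Simulate clustered data with NO true treatment effect and analyze it
# as if all observations were independent.
set.seed(42)

sim_once <- function(n_clusters = 10, per_cluster = 50, sd_cluster = 1) {
  cluster <- rep(seq_len(n_clusters), each = per_cluster)
  # Treatment is assigned to whole clusters ("convenience order")
  trt <- factor(ifelse(cluster <= n_clusters / 2, "a", "b"))
  # Shared cluster effect plus independent observation-level noise
  y <- rnorm(n_clusters, sd = sd_cluster)[cluster] + rnorm(length(cluster))
  # Naive p-value for the treatment effect, ignoring the clustering
  summary(lm(y ~ trt))$coefficients["trtb", "Pr(>|t|)"]
}

p_naive <- replicate(500, sim_once())
reject_rate <- mean(p_naive < 0.05)
reject_rate  # would be close to 0.05 if the observations were independent
```

Analyzing the cluster means, or adding a cluster-level error stratum such as aov(y ~ trt + Error(cluster)) with cluster coded as a factor, is one way to account for the dependence.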

Note 1: George Box noted this at least 50 years ago, in the early '60s, when he and Jenkins developed ARIMA modeling.

Note 2: If you can, have a look at Jack Youden's classic paper "Enduring Values", which comments to some extent on these issues, here: http://www.jstor.org/pss/1266913

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics



On Mon, Aug 2, 2010 at 10:32 AM, David Winsemius <dwinsem...@comcast.net > wrote:

On Aug 2, 2010, at 9:33 AM, wwreith wrote:


I am conducting an experiment with four independent variables, each of which has three or more factor levels. The sample size is quite large, i.e. several thousand. The dependent variable data does not pass a normality test but "visually" looks close to normal, so is there a way to compute the effect this would have on the p-value for ANOVA, or is there a way to perform a nonparametric test in R that will handle this many independent variables? Simply saying ANOVA is robust to small departures from normality is not going to be good enough for my client.

The statistical assumption of normality for linear models does not apply to the distribution of the dependent variable, but rather to the residuals after a model is estimated. Furthermore, it is the homoskedasticity assumption that is more commonly violated and is also the greater threat to validity. (And if you don't already know both of these points, then you desperately need to review your basic modeling practices.)
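
A minimal sketch of what checking the residuals, rather than the raw response, could look like in R. The data are simulated and the names d, A, B, C, D and y are hypothetical stand-ins for the real design; this is an illustration under those assumptions, not a prescribed workflow.

```r
# Simulated stand-in for a four-factor experiment with several thousand rows
set.seed(1)
n <- 3000
d <- data.frame(
  A = factor(sample(letters[1:3], n, replace = TRUE)),
  B = factor(sample(letters[1:3], n, replace = TRUE)),
  C = factor(sample(letters[1:4], n, replace = TRUE)),
  D = factor(sample(letters[1:3], n, replace = TRUE))
)
# The response depends on A and C, so y itself is a mixture of shifted
# normals even though the errors are exactly normal
d$y <- 10 + 2 * as.integer(d$A) + as.integer(d$C) + rnorm(n)

fit <- aov(y ~ A + B + C + D, data = d)
d$res <- residuals(fit)

# Normality is judged on the residuals, not on y
qqnorm(d$res)
qqline(d$res)

# Constant variance: residuals against fitted values, plus a formal
# test of equal residual variance across the levels of one factor
plot(fitted(fit), d$res, xlab = "Fitted", ylab = "Residual")
bt <- bartlett.test(res ~ A, data = d)
bt$p.value
```

With several thousand observations a formal test will flag even trivial departures, so the plots are usually the more informative part.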


I need to compute an error amount for ANOVA or find a nonparametric equivalent.

You might get a better answer if you expressed the first part of that question in unambiguous terminology. What is "error amount"?

For the second part, there is an entire CRAN Task View on Robust Statistical Methods.
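
As one example of the kind of method covered there, here is a minimal sketch using rlm() from the MASS package, which is assumed to be installed (it ships as a recommended package with most R distributions). The data are simulated with heavy-tailed errors, and the names d, A, B, C, D and y are hypothetical.

```r
library(MASS)  # assumption: MASS is available

set.seed(3)
n <- 3000
d <- data.frame(
  A = factor(sample(letters[1:3], n, replace = TRUE)),
  B = factor(sample(letters[1:3], n, replace = TRUE)),
  C = factor(sample(letters[1:4], n, replace = TRUE)),
  D = factor(sample(letters[1:3], n, replace = TRUE))
)
# Heavy-tailed (t with 3 df) errors; level "c" of A shifts the mean by 2
d$y <- 10 + 2 * (d$A == "c") + rt(n, df = 3)

# Robust M-estimation (Huber psi by default) next to ordinary least squares
fit_rob <- rlm(y ~ A + B + C + D, data = d)
fit_ols <- lm(y ~ A + B + C + D, data = d)

# Compare the two sets of coefficient estimates side by side
round(cbind(robust = coef(fit_rob), ols = coef(fit_ols)), 3)
```

Note that summary() on an rlm fit reports t values but no p-values, so inference needs a further step such as a bootstrap.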

--

David Winsemius, MD
West Hartford, CT





______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
