chipmaney wrote:
Typically, the apply family expects a vector to run a function over.
However, I have a function, kruskal.test, that requires two arguments:
kruskal.test(Herb.df$Score,Herb.df$Year)
This easily computes the Kruskal-Wallis (KW) statistic for any difference
across years.
However, my data has multiple sites, and KW needs to be run on each of them.
Here's the data:
Herb.df <- data.frame(
  Score = rep(c(2,4,6,6,6,5,7,8,6,9), 2),
  Year  = rep(c(rep(1,5), rep(2,5)), 2),
  Site  = c(rep(3,10), rep(4,10)))
However, if I try this:
tapply(Herb.df,Herb.df$Site,function(.data)
kruskal.test(.data$Indicator_Rating,.data$Year))
Error in tapply(Herb.df, Herb.df$ID, function(.data)
kruskal.test(.data$Indicator_Rating, :
arguments must have same length
How can I run kruskal.test() for every site using tapply() instead of
a loop?
Your example data makes little sense; you have precisely the
same data for both sites and you have only two sites (why do
kruskal.test on two sites?). Finally, you need to decide what
your response variable is: 'Score' or 'Indicator_Rating'.
So here's some made-up data and the use of by() to apply
the test to each site:
dat <- data.frame(y = rnorm(60), yr=gl(4,5,60), st=gl(3,20))
with(dat, by(dat, st, function(x) kruskal.test(y~yr, data=x)))
See the last example in ?by.
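
For what it's worth, the "arguments must have same length" error most likely arises because tapply() expects a vector as its first argument: a data frame is a list of columns, so its length is the number of columns, not the number of rows, and that does not match the length of the grouping factor. Below is a rough sketch of an alternative to by() using split() and lapply(). It reuses the made-up 'dat' layout from above; the seed and the names 'res' and 'd' are only assumptions for illustration.

```r
# Made-up data with the same layout as 'dat' above; the seed is arbitrary.
set.seed(1)
dat <- data.frame(y = rnorm(60), yr = gl(4, 5, 60), st = gl(3, 20))

# A data frame's length is its number of columns, not its number of rows,
# so it cannot be matched against a grouping factor with one entry per row.
length(dat)      # number of columns
length(dat$st)   # number of rows

# split() cuts the data frame into one data frame per site;
# lapply() then runs the test on each piece.
res <- lapply(split(dat, dat$st), function(d) kruskal.test(y ~ yr, data = d))

# Collect the p-values into a named vector, one per site.
sapply(res, function(r) r$p.value)
```

The result is a plain list of "htest" objects named by site, which can be easier to post-process than the object by() returns.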
-Peter Ehlers
--
Peter Ehlers
University of Calgary