Hi R users, 
I have been struggling to select an equal number of samples from each stratum. 
My data were collected in different years and in different regions, with 
different sample sizes. Basically, I have two conditions (year and region), 
and I want the same sample size for every combination of year and region. 
I found that the "strata.sampling" function (which uses the sampling package) 
works when I have one condition, but I have two. Is there a package that lets 
me specify two conditions, select rows at random 999 times, and report the 
mean value? 

Your help would be really appreciated. I am spending so much time...

Here is what I did with the example data:
raw=structure(list(watershed = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), 
    year = c(2001, 2001, 2002, 2002, 2002, 2002, 2002, 2001, 
    2001, 2001, 2002, 2002, 2002), sp1 = c(18.38, 29.1, 90.72, 
    16.12, 49.12, 20.81, 65.1, 1.87, 72.99, 93.45, 38.44, 67.13, 
    45.71), sp2 = c(46.46, 94, 86.87, 46.91, 21.41, 92.82, 87.75, 
    16.18, 18.16, 18.76, 19.26, 52.73, 49.09), sp3 = c(86.9, 
    62.82, 74.32, 75.49, 20.17, 58.84, 16.51, 44.14, 44.39, 32.36, 
    53.28, 67.42, 33.37)), .Names = c("watershed", "year", "sp1", 
"sp2", "sp3"), class = "data.frame", row.names = c(NA, -13L))

strata.sampling <- function(data, group, size, method = NULL) {
  require(sampling)
  if (is.null(method)) method <- "srswor"
  if (!method %in% c("srswor", "srswr")) 
    stop('method must be "srswor" or "srswr"')
  temp <- data[order(data[[group]]), ]
  ifelse(length(size) > 1,
         size <- size, 
         ifelse(size < 1,
                size <- round(table(temp[group]) * size),
                size <- rep(size, times=length(table(temp[group])))))
  strat = strata(temp, stratanames = names(temp[group]), 
                 size = size, method = method)
  getdata(temp, strat)
}

test1<-strata.sampling(raw, ("watershed"), 2)# select 2 rows by watershed

BUT I also wanted to use "year", i.e. ("watershed", "year"). When I added 
"year", it did not work:
> test1<-strata.sampling(raw, ("watershed", "year"), 2) # select 2 rows by watershed and year
Error: unexpected ',' in "test1<-strata.sampling(raw, ("watershed","
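
For what it is worth, the error above is a parse error: parentheses only group an expression, while c() is what combines several column names into one vector. The small sketch below shows this on a made-up data frame (toy and its values are hypothetical, not my real data). As far as I can tell, switching to c() would only remove the parse error; the function body above still indexes with data[[group]], which expects a single column name, so it would need further changes before it accepts two.

```r
# Parentheses do not build a vector, so the parser stops at the comma.
bad <- try(parse(text = '("watershed", "year")'), silent = TRUE)
inherits(bad, "try-error")

# c() combines the two column names into one character vector.
group <- c("watershed", "year")
length(group)

# Hypothetical toy data: one row per watershed-by-year combination.
toy <- data.frame(watershed = c("A", "A", "B", "B"),
                  year      = c(2001, 2002, 2001, 2002))

# interaction() turns the two columns into a single stratum label per row.
strata_id <- interaction(toy[group], drop = TRUE)
nlevels(strata_id)
```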

Here I want to select rows using two conditions ("watershed", "year"), repeat 
the random sampling 999 times, and report the mean values of sp1, sp2 and sp3 
across those 999 samples. Here is the layout of the output I want ("mean" 
marks where each mean value should go):
output<-structure(list(watershed = structure(c(1L, 1L, 2L, 2L), .Label = c("A", 
"B"), class = "factor"), year = c(2001L, 2002L, 2001L, 2002L), 
    sp1 = structure(c(1L, 1L, 1L, 1L), .Label = "mean", class = "factor"), 
    sp2 = structure(c(1L, 1L, 1L, 1L), .Label = "mean", class = "factor"), 
    sp3 = structure(c(1L, 1L, 1L, 1L), .Label = "mean", class = "factor")), 
.Names = c("watershed", 
"year", "sp1", "sp2", "sp3"), class = "data.frame", row.names = c(NA, 
-4L))
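
To make the procedure I have in mind concrete, here is a rough base R sketch that does not use the sampling package. It is only an illustration under my own assumptions: the names group, vars, n_per_stratum, n_rep and one_draw are made up for this example, rows are drawn without replacement, and 2 rows per stratum is used because the smallest watershed-by-year cell in the example data has only 2 rows.

```r
set.seed(1)  # only so the sketch is reproducible

# Example data from above, rebuilt with data.frame().
raw <- data.frame(
  watershed = factor(rep(c("A", "B"), times = c(7, 6))),
  year = c(2001, 2001, 2002, 2002, 2002, 2002, 2002,
           2001, 2001, 2001, 2002, 2002, 2002),
  sp1 = c(18.38, 29.1, 90.72, 16.12, 49.12, 20.81, 65.1,
          1.87, 72.99, 93.45, 38.44, 67.13, 45.71),
  sp2 = c(46.46, 94, 86.87, 46.91, 21.41, 92.82, 87.75,
          16.18, 18.16, 18.76, 19.26, 52.73, 49.09),
  sp3 = c(86.9, 62.82, 74.32, 75.49, 20.17, 58.84, 16.51,
          44.14, 44.39, 32.36, 53.28, 67.42, 33.37)
)

group <- c("watershed", "year")    # the two stratifying columns
vars  <- c("sp1", "sp2", "sp3")    # columns to average
n_per_stratum <- 2                 # assumed: size of the smallest stratum
n_rep <- 999                       # number of random draws

# Row numbers belonging to each watershed-by-year stratum.
idx <- split(seq_len(nrow(raw)), raw[group], drop = TRUE)

# One draw: pick n_per_stratum rows per stratum without replacement,
# then compute the mean of each variable within each stratum.
one_draw <- function() {
  pick <- unlist(lapply(idx, function(i) {
    i[sample.int(length(i), n_per_stratum)]
  }))
  aggregate(raw[pick, vars], by = raw[pick, group], FUN = mean)
}

# Repeat the draw, stack the results, and average over the draws.
draws <- do.call(rbind, replicate(n_rep, one_draw(), simplify = FALSE))
output <- aggregate(draws[vars], by = draws[group], FUN = mean)
output
```

Indexing with i[sample.int(length(i), ...)] rather than sample(i, ...) avoids the surprise that sample() treats a single number as the range 1:i. A stratum with exactly n_per_stratum rows (watershed A in 2001 here) gives the same mean in every draw.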

Any suggestions? 
Thanks for your help. 
KG

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.