Try this: x$survey1[x$count==0] <- 0 x$survey1[x$count %in% 1:11] <- 1 x$survey1[x$count > 11] <- 2 x.wide <- reshape(x, v.names="survey1", idvar="location", timevar="survey1", direction="wide") x.wide <- x.wide[order(x.wide$location),] x.wide$var1 <- tapply(x$var1, x$location, mean) x.wide
On 24/01/2008, Dave Hewitt <[EMAIL PROTECTED]> wrote: > I need to do a somewhat complex rearrangement of an existing dataframe that > involves an "on-the-fly" conditional summary as well. I've tried to make > the example as simple as possible. Code to produce the object being acted > on is at the end of the message. I hope this isn't too much for a single > posting. Any guidance will be much appreciated. > > The input example looks like this: > > survey year count location var1 var2 > 1 1 2 A1 21.2 1 > 2 1 0 A1 15.6 1 > 3 1 29 A1 12.1 1 > 1 1 11 B3 9 2 > 3 1 5 B3 4.8 2 > 1 2 7 A2 20.1 4 > 2 2 2 A2 19.2 4 > > I need to rearrange it to look like this (the order of the columns is not > important): > > location year survey1 survey2 survey3 var1summ var2 > A1 1 1 0 2 16.3 1 > B3 1 2 NA 1 6.9 2 > A2 2 1 1 NA 19.7 4 > > The operations/conditions that I need are: > > (A) The observations in "count" for each survey at each location in a given > year need to be reduced according to the following rules and then placed in > new columns "survey1", "survey2", "survey3" : > -- when count = 0, surveyX = 0 > -- when 0 < count < 11, surveyX = 1 > -- when 11 < count, surveyX = 2 > > (B) The observations in var1 need to be summarized (here, by taking the > mean) across all surveys for a given location in a given year and placed in > "var1summ" > > (C) Observations in var2 need to be summarized in a similar way, except > that they don't change across surveys for a given location in a given year > (so mean, median, etc. would work, but a more general solution would be > nice for variables that are categorical) > > (D) Surveys (rows) can be missing for a given location in a given year. I'd > like to retain a column for each survey in the new dataframe and denote > surveys that were missing as NA. E.g., survey2 is missing for location B3 > in year 1. > > > ** Object creation: > x <- structure(list(survey = c(1L, 2L, 3L, 1L, 3L, 1L, 2L), year = c(1L, > 1L, 1L, 1L, 1L, 2L, 2L), count = c(2L, 0L, 29L, 11L, 5L, 7L, 2L), location > = structure(c(1L, 1L, 1L, 3L, 3L, 2L, 2L), .Label = c("A1", "A2", "B3"), > class = "factor"), var1 = c(21.2, 15.6, 12.1, 9, 4.8, 20.1, 19.2), var2 = > c(1L, 1L, 1L, 2L, 2L, 4L, 4L)), .Names = c("survey", "year", "count", > "location", "var1", "var2"), class = "data.frame", row.names = c(NA, -7L)) > > Thanks, > Dave Hewitt > Fisheries Science > Virginia Institute of Marine Science, Gloucester Point, VA, USA > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40" S 49° 16' 22" O ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.