I have a dataset of about 10^6 rows, each consisting of a timestamp, several factors, a string, some integers, and some floats.
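
To make the question concrete, here is a small synthetic stand-in for the data. This is only a sketch: the column names (time, factor, label, count, x, day), the factor levels, and the generated values are all made up for illustration, and the real data has about 10^6 rows and more columns.

```r
# Synthetic stand-in for the real data.
# All column names, levels, and values here are hypothetical.
set.seed(1)
n <- 1000

data <- data.frame(
  # timestamps spread over one year, in increasing order
  time   = as.POSIXct("2008-01-01", tz = "UTC") + sort(runif(n, 0, 365 * 86400)),
  # one factor with 4 values
  factor = factor(sample(c("a", "b", "c", "d"), n, replace = TRUE)),
  # a string column
  label  = sample(c("foo", "bar", "baz"), n, replace = TRUE),
  # an integer column
  count  = rpois(n, 3),
  # the float column whose quartiles get plotted
  x      = runif(n),
  stringsAsFactors = FALSE
)

# Day of each event as a string such as "2008-10-07"
data$day <- format(data$time, "%Y-%m-%d")

str(data)
```

The function in this post is meant to be called on such a data frame as perc("a", data).
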
I'd like to graph this data in various ways, including straightforward ones (how many events per week over the past year for each of 4 values of some factor) and some less straightforward ones. I've managed to do this by brute force, but I'd like to learn how to do it in more elegant, more R-like code.

Consider for example the following, which graphs the 25th, 50th, and 75th percentile values of data$x per day:

    perc <- function(code, data) {
      # select the part of the data with the given factor value
      slice <- data[data$factor == code, ]

      # calc quartiles for each day
      quarts <- tapply(slice$x, slice$day,
                       function(x) quantile(x, c(.25, .50, .75)))
      # returns a tagged list of tagged vectors
      #   list("2008-10-07" = c("25%" = .05, "50%" = .47, ... ), ...)

      # convert to a data frame -- is there some mapping function to do this?
      fr <- data.frame(
        day   = as.Date(names(quarts)),              # strings back to dates (!)
        "25%" = sapply(quarts, function(x) x[[1]]),  # !!
        "50%" = sapply(quarts, function(x) x[[2]]),
        "75%" = sapply(quarts, function(x) x[[3]])
      )
      # columns are now labelled "X25." etc. (!)

      # overlay one line per quartile column (columns 2 to 4) on a shared y scale
      for (i in 2:4) {
        plot(fr$day, fr[[i]], type = "l",
             ylim = c(0, max(pmax(fr[[2]], fr[[3]], fr[[4]]))))
        par(new = TRUE)
      }
      par(new = FALSE)
    }

This works, but is pretty ugly in a variety of ways. What is the right way to do this?

Thanks,

-s

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.