Yes, that's exactly it! Many thanks, it all comes back to me now! It's the darn do.call that I can never remember somehow. I know I've reinvented this wheel several times -- you'd think I'd learn. Sigh.
Again, my thanks! Regards, Mike On Nov 7, 2007 2:35 PM, jim holtman <[EMAIL PROTECTED]> wrote: > Is this closer to what you would like? > > > x <- textConnection(" hostName user sys idle date time > + 10142 fred 0.4 0.5 98.0 2007-11-01 02:02:18 > + 16886 barney 0.5 0.2 94.6 2007-10-25 19:12:12 > + 8795 fred 0.0 0.1 99.8 2007-10-30 05:08:22 > + 5261 fred 0.1 0.2 99.7 2007-10-25 07:20:32 > + 12427 barney 0.1 0.2 93.2 2007-10-19 14:34:10 > + 18067 barney 0.1 0.2 99.4 2007-10-27 10:34:08 > + 973 fred 0.0 0.2 99.8 2007-10-19 08:24:22 > + 5426 fred 0.2 0.3 99.5 2007-10-25 12:50:33 > + 7067 fred 0.1 0.2 99.4 2007-10-27 19:32:27 > + 13159 barney 0.1 0.4 84.3 2007-10-20 14:58:11 > + 17481 barney 1.2 2.0 92.6 2007-10-26 15:02:11 > + 21632 barney 0.1 0.1 99.6 2007-11-01 09:24:09 > + 206 fred 19.4 4.8 53.7 2007-10-18 06:50:34 > + 18151 barney 0.1 0.2 94.9 2007-10-27 13:22:09 > + 10662 fred 0.1 0.2 99.6 2007-11-01 19:22:27 > + 10376 fred 0.0 0.2 99.7 2007-11-01 09:50:24 > + 3630 fred 43.7 7.0 33.0 2007-10-23 00:58:27 > + 1118 fred 0.6 0.4 98.9 2007-10-19 13:14:23 > + 5122 fred 0.1 0.2 99.6 2007-10-25 02:42:21 > + 22117 barney 0.0 0.2 99.4 2007-11-02 01:34:04") > > x.in <- read.table(x, header=TRUE, as.is=TRUE) > > x.in$hour <- sapply(strsplit(x.in$time, ":"), '[', 1) # pick off the > hour > > x.by <- by(x.in, list(x.in$hour, x.in$hostName), function(.host){ > + data.frame(hostName=.host$hostName[1], hour=.host$hour[1], > + user.mean=mean(.host$user), > + sys.mean=mean(.host$sys), > + idle.mean=mean(.host$idle), > + user.max=max(.host$user), > + sys.max=max(.host$sys), > + idle.max=max(.host$idle)) > + }) > > do.call('rbind', x.by) > hostName hour user.mean sys.mean idle.mean user.max sys.max idle.max > 1 barney 01 0.00 0.20 99.40 0.0 0.2 99.4 > 2 barney 09 0.10 0.10 99.60 0.1 0.1 99.6 > 3 barney 10 0.10 0.20 99.40 0.1 0.2 99.4 > 4 barney 13 0.10 0.20 94.90 0.1 0.2 94.9 > 5 barney 14 0.10 0.30 88.75 0.1 0.4 93.2 > 6 barney 15 1.20 2.00 92.60 1.2 2.0 92.6 > 7 barney 19 0.50 0.20 94.60 0.5 0.2 94.6 > 8 fred 00 43.70 7.00 33.00 43.7 7.0 33.0 > 9 fred 02 0.25 0.35 98.80 0.4 0.5 99.6 > 10 fred 05 0.00 0.10 99.80 0.0 0.1 99.8 > 11 fred 06 19.40 4.80 53.70 19.4 4.8 53.7 > 12 fred 07 0.10 0.20 99.70 0.1 0.2 99.7 > 13 fred 08 0.00 0.20 99.80 0.0 0.2 99.8 > 14 fred 09 0.00 0.20 99.70 0.0 0.2 99.7 > 15 fred 12 0.20 0.30 99.50 0.2 0.3 99.5 > 16 fred 13 0.60 0.40 98.90 0.6 0.4 98.9 > 17 fred 19 0.10 0.20 99.50 0.1 0.2 99.6 > > > > > On 11/7/07, Mike Nielsen <[EMAIL PROTECTED]> wrote: > > R-Helpers, > > > > I'm sorry to have to ask this -- I've not used R very much in the last > > 8 or 10 months, and I've gotten rusty. > > > > I have the following (ff2 is a subset of a much, much larger dataset): > > > > > ff2 > > hostName user sys idle obsTime > > 10142 fred 0.4 0.5 98.0 2007-11-01 02:02:18 > > 16886 barney 0.5 0.2 94.6 2007-10-25 19:12:12 > > 8795 fred 0.0 0.1 99.8 2007-10-30 05:08:22 > > 5261 fred 0.1 0.2 99.7 2007-10-25 07:20:32 > > 12427 barney 0.1 0.2 93.2 2007-10-19 14:34:10 > > 18067 barney 0.1 0.2 99.4 2007-10-27 10:34:08 > > 973 fred 0.0 0.2 99.8 2007-10-19 08:24:22 > > 5426 fred 0.2 0.3 99.5 2007-10-25 12:50:33 > > 7067 fred 0.1 0.2 99.4 2007-10-27 19:32:27 > > 13159 barney 0.1 0.4 84.3 2007-10-20 14:58:11 > > 17481 barney 1.2 2.0 92.6 2007-10-26 15:02:11 > > 21632 barney 0.1 0.1 99.6 2007-11-01 09:24:09 > > 206 fred 19.4 4.8 53.7 2007-10-18 06:50:34 > > 18151 barney 0.1 0.2 94.9 2007-10-27 13:22:09 > > 10662 fred 0.1 0.2 99.6 2007-11-01 19:22:27 > > 10376 fred 0.0 0.2 99.7 2007-11-01 09:50:24 > > 3630 fred 43.7 7.0 33.0 2007-10-23 00:58:27 > > 1118 fred 0.6 0.4 98.9 2007-10-19 13:14:23 > > 5122 fred 0.1 0.2 99.6 2007-10-25 02:42:21 > > 22117 barney 0.0 0.2 99.4 2007-11-02 01:34:04 > > > > > doit(ff2) > > hostName hour user.mean sys.mean idle.mean user.max sys.max idle.max > > 1 barney 01 0.00 0.20 99.40 0.0 0.2 99.4 > > 2 barney 09 0.10 0.10 99.60 0.1 0.1 99.6 > > 3 barney 10 0.10 0.20 99.40 0.1 0.2 99.4 > > 4 barney 13 0.10 0.20 94.90 0.1 0.2 94.9 > > 5 barney 14 0.10 0.30 88.75 0.1 0.4 93.2 > > 6 barney 15 1.20 2.00 92.60 1.2 2.0 92.6 > > 7 barney 19 0.50 0.20 94.60 0.5 0.2 94.6 > > 8 fred 00 43.70 7.00 33.00 43.7 7.0 33.0 > > 9 fred 02 0.25 0.35 98.80 0.4 0.5 99.6 > > 10 fred 05 0.00 0.10 99.80 0.0 0.1 99.8 > > 11 fred 06 19.40 4.80 53.70 19.4 4.8 53.7 > > 12 fred 07 0.10 0.20 99.70 0.1 0.2 99.7 > > 13 fred 08 0.00 0.20 99.80 0.0 0.2 99.8 > > 14 fred 09 0.00 0.20 99.70 0.0 0.2 99.7 > > 15 fred 12 0.20 0.30 99.50 0.2 0.3 99.5 > > 16 fred 13 0.60 0.40 98.90 0.6 0.4 98.9 > > 17 fred 19 0.10 0.20 99.50 0.1 0.2 99.6 > > > doit > > function(x){ > > x.mean<-aggregate(x[,c("user","sys","idle")], > > by=list(hostName=x$hostName, > > > > hour=strftime(as.POSIXlt(x$obsTime),"%H")), > > mean) > > > > x.max<-aggregate(x[,c("user","sys","idle")], > > by=list(hostName=x$hostName, > > > > hour=strftime(as.POSIXlt(x$obsTime),"%H")), > > max) > > > > t1<-merge(x.mean,x.max > ,by=c("hostName","hour"),suffixes=c(".mean",".max")) > > return(t1) > > } > > > > The point of the "doit" function is to make a new dataframe in which > > the columns are summary statistics of certain columns in the argument. > > > > Is there a function similar to: > > > > magic.function(ff2[,c("user","system","idle")], > > by=list(hostName=ff2$hostName,hour=strftime(as.POSIXlt > (ff2$obsTime),"%H")), > > function(x){c(mean.user=mean(x$user), > > mean.system=mean(x$system), > > mean.idle=mean(x$idle), > > max.user=max(x$user), > > max.system=max(x$system), > > max.idle=max(x$idle))}) > > > > ie. an "aggregate" that can cope with a non-scalar function and "do > > what I mean"? My doit function gets more and more ugly the more > > summary statistics I add, and I worry about the "merge" with hundreds > > of thousands of rows. > > > > I'm almost sure I've seen a solution to what I know is a simple > > problem, but I guess my search skills are as bad as my "R": I've > > rummaged around the r-help archives and came up with nothing to show > > for it. > > > > > > Pointers would be gratefully received. > > > > Many thanks. > > -- > > Regards, > > > > Mike Nielsen > > > > ______________________________________________ > > R-help@r-project.org mailing list > > https://stat.ethz.ch/mailman/listinfo/r-help > > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html<http://www.r-project.org/posting-guide.html> > > and provide commented, minimal, self-contained, reproducible code. > > > > > -- > Jim Holtman > Cincinnati, OH > +1 513 646 9390 > > What is the problem you are trying to solve? > -- Regards, Mike Nielsen [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.