A bugfix already :P The prior version fails when there is only one factor in Ind. This version might also be faster, as I avoid using aggregate() to create the dummy frame.
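In case it helps, here is a minimal sketch of what I believe went wrong (the toy data and the names dummy, dummy2 and cells are made up for illustration). With one factor, the dummy frame from aggregate() has only two columns, so removing the last one by index leaves a single column, which R drops to a plain vector by default.

```r
# Toy data, made up for illustration: one DV and a single grouping factor
set.seed(1)
z <- rnorm(20)
Ind <- list(A = rep(1:2, each = 10))

# What the prior version did to build its dummy frame
temp <- aggregate(z, Ind, length)            # data frame with columns A and x
dummy <- temp[, c(1:(length(temp) - 1))]     # only one column left: dropped to a vector
is.data.frame(dummy)                         # FALSE

# Keeping the data frame structure explicitly
dummy2 <- temp[, 1:(length(temp) - 1), drop = FALSE]
is.data.frame(dummy2)                        # TRUE

# Building the cells directly from the factor values, without aggregate()
cells <- expand.grid(lapply(Ind, unique))
is.data.frame(cells)                         # TRUE
```

Passing drop=FALSE would probably have rescued the old approach too, but expand.grid() skips the aggregate() call entirely.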

#modified aggregate command (bugfix version), allowing for multiple/named output values
agg=function(z,Ind,FUN,...){
  FUN.out=by(z,Ind,FUN,...)
  num.cells=length(FUN.out)
  num.values=length(FUN.out[[1]])

  #one row per cell; values are sorted so rows line up with the cell order used by by()
  for(i in 1:length(Ind)){
    Ind[[i]]=sort(unique(Ind[[i]]))
  }
  temp=expand.grid(Ind)

  for(i in 1:num.values){
    temp$new=NA #placeholder column for the i-th returned value
    n=names(FUN.out[[1]])[i]
    names(temp)[length(temp)]=ifelse(!is.null(n),n,ifelse(i==1,'x',paste('x',i,sep='')))
    for(j in 1:num.cells){
      temp[j,length(temp)]=FUN.out[[j]][i]
    }
  }
  return(temp)
}

On 13-Jul-07, at 1:29 PM, Mike Lawrence wrote:

> Hi all,
>
> This is my first post to the developers list. As I understand it,
> aggregate() currently repeats a function across cells in a
> dataframe but is only able to handle functions with single value
> returns. aggregate() also lacks the ability to retain the names
> given to the returned value. I've created an agg() function (pasted
> below) that is apparently backwards compatible (i.e. returns
> results identical to aggregate() if the function returns a single
> unnamed value), but is able to handle named and/or multiple return
> values. The code may be a little inefficient (there must be an
> easier way to set up the 'temp' data frame than to call aggregate
> and remove the final column), but I'm suggesting that something
> similar to this may be profitably used to replace aggregate entirely.
>
> #modified aggregate command, allowing for multiple/named output values
> agg=function(z,Ind,FUN,...){
>   FUN.out=by(z,Ind,FUN,...)
>   num.cells=length(FUN.out)
>   num.dv=length(FUN.out[[1]])
>
>   temp=aggregate(z,Ind,length) #dummy data frame
>   temp=temp[,c(1:(length(temp)-1))] #remove last column from dummy frame
>
>   for(i in 1:num.dv){
>     temp=cbind(temp,NA)
>     n=names(FUN.out[[1]])[i]
>     names(temp)[length(temp)]=ifelse(!is.null(n),n,ifelse(i==1,'x',paste('x',i,sep='')))
>     for(j in 1:num.cells){
>       temp[j,length(temp)]=FUN.out[[j]][i]
>     }
>   }
>   return(temp)
> }
>
> #create some factored data
> z=rnorm(100) # the DV
> A=rep(1:2,each=25,2) #one factor
> B=rep(1:2,each=50) #another factor
> Ind=list(A=A,B=B) #the factor list
>
> aggregate(z,Ind,mean) #show the means of each cell
> agg(z,Ind,mean) #should be identical to aggregate
>
> aggregate(z,Ind,summary) #returns an error
> agg(z,Ind,summary) #returns named columns
>
> #Make a function that returns multiple unnamed values
> summary2=function(x){
>   s=summary(x)
>   names(s)=NULL
>   return(s)
> }
> agg(z,Ind,summary2) #returns multiple columns, default names

--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

Website: http://memetic.ca

Public calendar: http://icalx.com/public/informavore/Public

"The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less."
  - Piet Hein

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel