Not tested but should work: sums = tapply(x, group, sum); sums.ext = sums[ match(group, names(sums))] normalized = x/sums.ext
It may be that the tapply is just as slow as your loop though, I'm not sure. HTH, Peter On Thu, Nov 29, 2012 at 10:55 AM, Noah Silverman <noahsilver...@ucla.edu> wrote: > Hi, > > I have a very large data set (aprox. 100,000 rows.) > > The data comes from around 10,000 "groups" with about 10 entered per group. > > The values are in one column, the group ID is an integer in the second column. > > I want to normalize the values by group: > > for(g in unique(groups){ > x[group==g] / sum(x[group==g]) > } > > This works find in a loop, but is slow. Is there a faster way to do this? > > Thanks! > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.