On Thu, 17 Sep 2009, William Revelle wrote:
At 2:40 PM +0900 9/17/09, Gundala Viswanath wrote:
I have a data frame (dat). What I want to do is, for each row,
divide the row by its row sum.
The number of rows can be large (> 1 million).
Is there a faster way than doing it this way?
datnorm;
for (rw in 1:length(dat)) {
tmp <- dat[rw,]/sum(dat[rw,])
datnorm <- rbind(datnorm, tmp);
}
- G.V.
datnorm <- dat/rowSums(dat)
This will be faster if dat is a matrix rather than a data.frame.
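The one-liner works because R recycles the vector of row sums down the columns, so element [i, j] is divided by the i-th row sum. A minimal sketch on made-up toy data (the values and the name rs are illustrative, not from the original post):

```r
# Toy data, invented for illustration: two rows, three numeric columns.
dat <- data.frame(a = c(1, 2), b = c(3, 6), c = c(4, 8))

rs <- rowSums(dat)       # one sum per row
datnorm <- dat / rs      # rs is recycled column by column, so row i is divided by rs[i]
print(datnorm)

# Every row of the result should now sum to 1.
stopifnot(isTRUE(all.equal(unname(rowSums(datnorm)), c(1, 1))))

# sweep() spells out the same operation explicitly: divide along margin 1 (rows).
swept <- sweep(as.matrix(dat), 1, rs, "/")
stopifnot(isTRUE(all.equal(as.matrix(datnorm), swept, check.attributes = FALSE)))
```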
Even if it's a data frame and he needs a data frame answer it might be faster
to do
mat<-as.matrix(dat)
matnorm<-mat/rowSums(mat)
datnorm<-as.data.frame(matnorm)
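Whether the detour through a matrix actually pays off depends on the data, so it may be worth timing both routes. A rough sketch, assuming simulated all-numeric data (the size n, the five columns, and the names norm_df and norm_via_mat are made up for illustration):

```r
# Simulated data: n rows, 5 numeric columns (sizes chosen arbitrarily).
set.seed(1)
n <- 1e5
dat <- as.data.frame(matrix(runif(n * 5), nrow = n))

# Route 1: operate on the data frame directly.
t_df <- system.time(norm_df <- dat / rowSums(dat))

# Route 2: convert to a matrix, normalise, convert back.
t_mat <- system.time({
  mat <- as.matrix(dat)
  matnorm <- mat / rowSums(mat)
  norm_via_mat <- as.data.frame(matnorm)
})

# Elapsed seconds for each route; the numbers will vary by machine.
print(rbind(data_frame = t_df, via_matrix = t_mat)[, "elapsed"])

# Both routes should give the same values.
stopifnot(isTRUE(all.equal(as.matrix(norm_df), as.matrix(norm_via_mat))))
```

Note that as.matrix() only keeps the values numeric when every column is numeric; with mixed column types it would produce a character matrix and the division would fail.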
The other advantage, apart from speed, of doing it with dat/rowSums(dat) rather
than the loop is he gets the right answer. The loop runs over 1:length(dat), and
length(dat) is the number of columns if dat is a data frame and the number of
entries if dat is a matrix, so the loop does not go from 1 to the number of rows.
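The difference between length() and nrow() is easy to check on a small example. The sketch below uses made-up data, and the corrected loop is only one possible repair (it preallocates the result rather than growing it with rbind); it is shown for comparison with the vectorised form, not as a recommendation:

```r
# Toy data, invented for illustration: 4 rows, 3 columns.
dat <- data.frame(a = c(1, 2, 3, 4), b = c(5, 6, 7, 8), c = c(9, 10, 11, 12))

length(dat)             # 3:  number of columns of a data frame
nrow(dat)               # 4:  number of rows
length(as.matrix(dat))  # 12: number of entries of a matrix

# A loop that really visits every row, writing into a preallocated copy.
loopnorm <- dat
for (rw in seq_len(nrow(dat))) {
  loopnorm[rw, ] <- dat[rw, ] / sum(dat[rw, ])
}

# It agrees with the vectorised version.
vecnorm <- dat / rowSums(dat)
stopifnot(isTRUE(all.equal(as.matrix(loopnorm), as.matrix(vecnorm))))
```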
-thomas
Thomas Lumley Assoc. Professor, Biostatistics
tlum...@u.washington.edu University of Washington, Seattle
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help