On Thu, 17 Sep 2009, William Revelle wrote:

At 2:40 PM +0900 9/17/09, Gundala Viswanath wrote:
I have a data frame (dat). For each row, I want to
divide the row by its row sum.

The number of rows can be large (> 1 million).
Is there a faster way than doing it this way?

datnorm;
for (rw in 1:length(dat)) {
    tmp <- dat[rw,]/sum(dat[rw,])
    datnorm <- rbind(datnorm, tmp);
}


- G.V.


datnorm <- dat/rowSums(dat)

This will be faster if dat is a matrix rather than a data.frame.
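
A minimal sketch of why the one-liner divides each row by its own sum, using a small made-up matrix (the name `m` and its values are illustrative, not from the original post). R stores a matrix column by column and recycles the shorter vector down each column, so element [i, j] ends up divided by the i-th row sum:

```r
# Toy data: hypothetical values for illustration only
m  <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)  # rows are (1, 3, 5) and (2, 4, 6)
rs <- rowSums(m)                             # one sum per row
mnorm <- m / rs                              # rs is recycled down each column

rowSums(mnorm)                               # each row of mnorm now sums to 1

# Two more explicit spellings of the same operation
mnorm_sweep <- sweep(m, 1, rs, "/")
mnorm_prop  <- prop.table(m, 1)
```

The recycling only lines up this way because the vector length equals the number of rows; dividing by column sums needs `sweep(m, 2, colSums(m), "/")` or a transpose instead.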


Even if it's a data frame and he needs a data frame answer, it might be faster
to do
  mat <- as.matrix(dat)
  matnorm <- mat / rowSums(mat)
  datnorm <- as.data.frame(matnorm)
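
A rough way to check the speed claim on your own machine, sketched with simulated data (the size, the seed, and the names `norm_df` and `norm_rt` are assumptions for illustration). It times the direct data frame division against the round trip through a matrix; no particular timings are implied:

```r
set.seed(1)                                          # reproducible toy data
dat <- as.data.frame(matrix(runif(3e5), ncol = 3))   # hypothetical: 1e5 rows x 3 columns

# Direct division on the data frame
t_df <- system.time(norm_df <- dat / rowSums(dat))

# Round trip: data frame -> matrix -> data frame
t_mat <- system.time({
  mat     <- as.matrix(dat)
  matnorm <- mat / rowSums(mat)
  norm_rt <- as.data.frame(matnorm)
})

print(t_df)    # timings vary by machine and R version
print(t_mat)
```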

The other advantage, apart from speed, of doing it with dat/rowSums(dat) rather
than the loop is that he gets the right answer. The loop runs over
1:length(dat), which goes from 1 to the number of columns if dat is a data
frame and from 1 to the number of entries if dat is a matrix, not from 1 to
the number of rows.
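
To make the loop-bound point concrete, here is a small sketch with a made-up 4 x 3 data frame (the names `df` and `m` are illustrative only):

```r
df <- data.frame(a = 1:4, b = 5:8, c = 9:12)  # 4 rows, 3 columns
m  <- as.matrix(df)

length(df)   # 3:  a data frame is a list of columns
length(m)    # 12: a matrix is a vector with dimensions
nrow(df)     # 4:  the number of rows in either case

# A row loop should therefore run over the rows explicitly
row_index <- seq_len(nrow(df))
```

Using `seq_len(nrow(df))` rather than `1:nrow(df)` also behaves sensibly when there are zero rows.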

     -thomas

Thomas Lumley                   Assoc. Professor, Biostatistics
tlum...@u.washington.edu        University of Washington, Seattle

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
