> I sent this question 2 days ago and got a response from Sarah. Thanks for
> that. But unfortunately, it did not really solve our problem. The main issue
> is that we want to use our own (manipulated) covariance matrix in the
> calculation of the Mahalanobis distance. Does anyone know how to vectorize
> the code below instead of using a loop (which slows it down)?
> I'd really appreciate any help on this, thank you all in advance!
> Cheers,
> Frank
>
> This is what I posted 2 days ago:
> We have a data frame x with n people as rows and k variables as columns.
> Now, for each person (i.e., each row) we want to calculate a distance
> between him/her and EACH other person in x. In other words, we want to
> create an n x n matrix of distances (with zeros on the diagonal).
> However, we do not want to calculate Euclidean distances. We want to
> calculate Mahalanobis distances, which take into account the covariance
> among variables.
> Below is the piece of code we wrote ("covmat" in the function below is the
> variance-covariance matrix among variables in Data that has to be fed into
> the mahalanobis function we are using).
>
> mahadist = function(x, covmat) {
>   dismat = matrix(0,ncol=nrow(x),nrow=nrow(x))
>   for (i in 1:nrow(x)) {
>     dismat[i,] = mahalanobis(as.matrix(x), as.matrix(x[i,]), covmat)^.5
>   }
>   return(dismat)
> }
>
> This piece of code works, but it is very slow. We were wondering if it's at
> all possible to somehow vectorize this function. Any help would be greatly
> appreciated.

You can save substantial time by calling as.matrix once, before the loop,
instead of twice on every iteration, e.g.

x <- data.frame(runif(1000), runif(1000), runif(1000))
covmat <- cov(x)

mahadist = function(x, covmat) #yours
{
  dismat = matrix(0,ncol=nrow(x),nrow=nrow(x))
  for (i in 1:nrow(x)) {
    dismat[i,] = mahalanobis(as.matrix(x), as.matrix(x[i,]), covmat)^.5
  }
  return(dismat)
}

mahadist2 <- function(x, covmat) #my modification
{
  n <- nrow(x)
  dismat <- matrix(0,ncol=n,nrow=n)
  matx <- as.matrix(x)   # convert once, outside the loop
  for (i in 1:n) {
    dismat[i,] <- mahalanobis(matx, matx[i,], covmat)^.5
  }
  dismat
}

system.time(mahadist(x, covmat))
#    user  system elapsed
#    2.82    0.06    2.95

system.time(mahadist2(x, covmat))
#    user  system elapsed
#    1.39    0.04    1.45

Regards,
Richie.

Mathematical Sciences Unit
HSL

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
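
As a possible way to drop the loop altogether, here is a minimal sketch, not part of the reply above. It assumes covmat is symmetric and positive definite, so that chol() succeeds; a heavily manipulated covariance matrix may not satisfy this. The name mahadist_vec is made up for illustration. The idea is to transform the rows so that plain Euclidean distance between the transformed rows equals the Mahalanobis distance between the original rows, then let dist() do all pairs at once.

```r
# Sketch only: assumes covmat is symmetric positive definite.
# mahadist_vec is a hypothetical name, not from the original thread.
mahadist_vec <- function(x, covmat) {
  matx <- as.matrix(x)
  # Cholesky factor: covmat = t(R) %*% R, with R upper triangular
  R <- chol(covmat)
  # Solve t(R) %*% z = x_i for every row x_i at once, so that
  # sum((z_i - z_j)^2) = t(x_i - x_j) %*% solve(covmat) %*% (x_i - x_j)
  z <- t(backsolve(R, t(matx), transpose = TRUE))
  # Euclidean distances between transformed rows, as a full n x n matrix
  d <- as.matrix(dist(z))
  dimnames(d) <- NULL
  d
}

# Small usage example with made-up data
set.seed(1)
x <- data.frame(a = runif(200), b = runif(200), c = runif(200))
covmat <- cov(x)
dismat <- mahadist_vec(x, covmat)
dim(dismat)
```

If chol() fails on your matrix, this approach does not apply as written and the loop version remains the safer choice.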