Michael,

You can modify the following code to suit. Also, avoid using dist as a variable name, since it is already a function in R (in the stats package). However, are you sure you want to do this? Sx is the variance computed from the sites in all the regions!

d1 <- apply(x,1, function(i){mahalanobis(x,i,Sx)})
is.na(d1) <- outer(id1, id1, "!=")  # NA for pairs of sites in different regions
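
If you would rather have the large value from your original question than NA, the same region mask can be used for the assignment. This is only a sketch on toy data: the region vector id1 is typed in by hand here, and big is just a name I made up for the penalty. The row and column names of the matrix still carry your ids.

```r
# Toy data mirroring the thread: one covariate, five sites, two regions
set.seed(1)
x   <- as.matrix(runif(5))
id1 <- c(1, 1, 2, 2, 1)            # region of each site (assumed known)
rownames(x) <- paste(id1, 1:5)     # "region observation" labels
Sx  <- var(x)                      # variance over all regions

# Pairwise Mahalanobis distances (squared); dimnames carry the ids
d1 <- apply(x, 1, function(i) mahalanobis(x, i, Sx))

# TRUE where the row site and the column site are in different regions
cross <- outer(id1, id1, "!=")
big   <- 999999                    # penalty value suggested in the question
d1[cross] <- big
```

Comparing the region vector with itself avoids any string matching on the row names, so an observation number that happens to contain a region digit cannot cause a false match.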

If, on the other hand, you want to use only the variance within a region, modify it like this (I am sure more optimal code can be written):

#not tested
# split the rows by region; a data frame keeps the row names and all columns
x.L <- split(as.data.frame(x), id1)
# pairwise Mahalanobis distances within one region, using that region's own variance
m3 <- function(k){
  k <- as.matrix(k)
  apply(k, 1, function(i){mahalanobis(k, i, var(k))})
}
d2 <- lapply(x.L, m3)
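
To get back to a single matrix that still has your ids, one option is to start from a matrix filled with the large value and overwrite one region block at a time, indexing by row name. The following is a rough sketch on made-up data (six sites, two covariates); D, big and r are just names I picked, and it assumes every region has more sites than covariates so that the within-region variance can be inverted.

```r
set.seed(1)
x   <- matrix(runif(12), ncol = 2)   # six sites, two covariates (made up)
id1 <- c(1, 1, 1, 2, 2, 2)           # region of each site
rownames(x) <- paste(id1, 1:6)       # "region observation" labels

big <- 999999                        # penalty for cross-region pairs
D <- matrix(big, nrow(x), nrow(x),
            dimnames = list(rownames(x), rownames(x)))

for (r in unique(id1)) {
  k <- x[id1 == r, , drop = FALSE]   # sites of region r, row names kept
  # within-region distances, using the variance of region r only
  D[rownames(k), rownames(k)] <-
    apply(k, 1, function(i) mahalanobis(k, i, var(k)))
}
```

D then has the same row order as x, with within-region distances in the blocks and 999999 everywhere else, so it can be passed on as one matrix.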



Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina

nikhil.l...@gmail.com

On Jul 19, 2010, at 11:37 AM, Michael Ralph M. Abrigo wrote:

Thanks for the tip, Nikhil. However, I need a single matrix as input
to another function that computes a non-bipartite matching minimizing
the pairwise distances between observations. As such, I need the
georeference (id) of the observations for subsequent processing. Below
is an illustration.


#generate data
x <- as.matrix(runif(5))
Sx <- var(x)

#generate id
set.seed(1)
id1 <- sample(1:2,5, replace=T)
id2 <- c(1:5)
rownames(x) <- paste(id1, id2)

#generate distance
dist <- as.matrix(
  apply(x,1,function(i){
    mahalanobis(x,i,Sx)
  })
)

#print matrices
x
         [,1]
1 1 0.2059746
1 2 0.1765568
2 3 0.6870228
2 4 0.3841037
1 5 0.7698414
dist
           1 1        1 2        2 3       2 4        1 5
1 1 0.00000000 0.01165534 3.11660015 0.4273402 4.28210082
1 2 0.01165534 0.00000000 3.50943798 0.5801450 4.74056406
2 3 3.11660015 3.50943798 0.00000000 1.2358255 0.09237602
2 4 0.42734018 0.58014499 1.23582554 0.0000000 2.00395492
1 5 4.28210082 4.74056406 0.09237602 2.0039549 0.00000000
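
One thing worth keeping in mind when reading these numbers: mahalanobis() returns the squared distance. With a single covariate that is just the squared difference divided by the variance, which can be checked by hand for the first two sites. This is a small sketch using the rounded values of x printed above; by_hand and by_fun are names made up for the check.

```r
# x as printed above (rounded to 7 digits)
x  <- as.matrix(c(0.2059746, 0.1765568, 0.6870228, 0.3841037, 0.7698414))
Sx <- var(x)

# squared difference over the variance, for sites "1 1" and "1 2"
by_hand <- (x[1] - x[2])^2 / as.numeric(Sx)

# the same pair through mahalanobis()
by_fun <- mahalanobis(x[1, , drop = FALSE], x[2, ], Sx)
```

Both values should agree with the 0.01165534 entry of the matrix above, up to the rounding of the printed x.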


The geo-id is composed of two references: the first digit for the
region and the next for the observation itself. What I'm thinking of
is for the pairwise distance between observations in different regions,
say site-11 and site-23 or site-24, to be replaced by a large number,
say 999999. I need the id for future processing, though.
Maybe I can stack the matrices generated using your tip to form a
block diagonal matrix, but then I would not have my ids? I'm really sorry.
I'm a bit lost.
Cheers,
Michael

On Mon, Jul 19, 2010 at 10:10 PM, Nikhil Kaza <nikhil.l...@gmail.com> wrote:

Replace dist with the Mahalanobis distance in the following example.

a <- cbind(runif(10), sample(1:3, 10, replace=TRUE))
a.L <- split(a[,1], a[,2])  # split the values (column 1) by group (column 2)
dist.L <- lapply(a.L, dist)




On Jul 19, 2010, at 9:24 AM, Michael Ralph M. Abrigo wrote:

Hi! I am trying to implement non-bipartite matching. I have around 500 sites, which can be clustered into 10 regions. I am able to calculate pairwise Mahalanobis distances between sites (thanks to another post in the forum). However, I want to constrain my matches to sites within the same region. Thus I want to replace elements of the distance matrix with a high value, say 999999, for sites not in the same region, so that such pairs will not be matched.
In the original data file I have information on which sites belong to which region. However, when I compute the pairwise Mahalanobis distances, I use only a subset of the file, which, naturally, does not include the georeference of the sites. How should I do this? Any hint will be most appreciated.
Btw, I am relatively new to R. I could export the matrix to another program and replace the elements there, but that is a very dirty and rough trick that I would rather avoid given better options.
Many thanks in advance.

Cheers,
Michael

--
"I am most anxious for liberties for our country... but I place as a prior condition the education of the people so that our country may have an individuality of its own and make itself worthy of liberties... " Jose
Rizal,1896


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



