Michael,
You can modify the following code to suit. Also, avoid using dist as a
variable name, since it is already a function in the stats package.
However, are you sure you want to do this? Sx is the variance computed
from sites in all the regions!
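d1 <- apply(x, 1, function(i) mahalanobis(x, i, Sx))
# Mask pairs whose regions differ. Comparing id1 directly avoids grepl(),
# which would also match the site number in a label such as "1 2".
is.na(d1) <- outer(id1, id1, "!=")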
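If a large number is preferred to NA, the same mask can be used for assignment. The following is a minimal sketch, not a tested solution for the real data: it assumes the region ids are available as a vector aligned with the rows of x (id1 in the illustration quoted below), and the names cross and big are made up for this example.

```r
# Illustrative data following the thread: 5 sites, one covariate, 2 regions.
set.seed(1)
x   <- as.matrix(runif(5))
id1 <- c(1, 1, 2, 2, 1)                  # region of each site (assumed known)
rownames(x) <- paste(id1, seq_len(nrow(x)))
Sx  <- var(x)                            # variance pooled over all regions

# Pairwise Mahalanobis distances; the dimnames carry the site ids.
d1 <- apply(x, 1, function(i) mahalanobis(x, i, Sx))

# TRUE where the row site and the column site are in different regions.
cross <- outer(id1, id1, "!=")

big <- 999999                            # penalty value suggested in the thread
d1[cross] <- big
d1
```

The result is still a single square matrix whose row and column names are the "region site" labels, so it can be passed on to the matching step with the ids intact.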
If, on the other hand, you want to use only the variance within a region,
modify it like this (I am sure more efficient code can be written):
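# not tested; assumes each region has enough sites for var() to be invertible
# split the row indices (not the matrix itself) so that row names are kept
x.L <- lapply(split(seq_len(nrow(x)), id1), function(r) x[r, , drop = FALSE])
# distances within one region, using that region's own variance
m3 <- function(k) apply(k, 1, function(i) mahalanobis(k, i, var(k)))
d2 <- lapply(x.L, m3)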
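The per-region results can also be written straight into one full-size matrix, which keeps the ids and gives the block structure in a single object. This is only a sketch under assumptions: the data are invented, every region is assumed to have at least two sites with a non-singular variance, and the names big and d2 are chosen for illustration.

```r
# Illustrative data: 6 sites, one covariate, 2 regions.
set.seed(1)
x   <- as.matrix(runif(6))
id1 <- c(1, 1, 2, 2, 1, 2)               # region of each site
rownames(x) <- paste(id1, seq_len(nrow(x)))

big <- 999999
# Start with the penalty everywhere, labelled with the site ids.
d2 <- matrix(big, nrow(x), nrow(x),
             dimnames = list(rownames(x), rownames(x)))

for (g in unique(id1)) {
  rows <- which(id1 == g)
  xg   <- x[rows, , drop = FALSE]
  Sg   <- var(xg)                        # variance within region g only
  # Fill the block for region g with its within-region distances.
  d2[rows, rows] <- apply(xg, 1, function(i) mahalanobis(xg, i, Sg))
}
d2
```

Because the blocks are filled by row index, the sites do not need to be sorted by region, and the original row order of x is preserved.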
Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
nikhil.l...@gmail.com
On Jul 19, 2010, at 11:37 AM, Michael Ralph M. Abrigo wrote:
Thanks for the tip, Nikhil. However, I need a single matrix as input to
another function, which computes a non-bipartite matching that minimizes
pairwise distances between observations. For that reason I need to keep
the georeference (id) of the observations for subsequent processing.
Below is an illustration.
#generate data
x <- as.matrix(runif(5))
Sx <- var(x)
#generate id
set.seed(1)
id1 <- sample(1:2,5, replace=T)
id2 <- c(1:5)
rownames(x) <- paste(id1, id2)
#generate distance
dist <- as.matrix(
  apply(x, 1, function(i) {
    mahalanobis(x, i, Sx)
  })
)
#print matrices
x
         [,1]
1 1 0.2059746
1 2 0.1765568
2 3 0.6870228
2 4 0.3841037
1 5 0.7698414
dist
           1 1        1 2        2 3       2 4        1 5
1 1 0.00000000 0.01165534 3.11660015 0.4273402 4.28210082
1 2 0.01165534 0.00000000 3.50943798 0.5801450 4.74056406
2 3 3.11660015 3.50943798 0.00000000 1.2358255 0.09237602
2 4 0.42734018 0.58014499 1.23582554 0.0000000 2.00395492
1 5 4.28210082 4.74056406 0.09237602 2.0039549 0.00000000
The geo-id is composed of two parts: the first number is the region and
the second is the observation itself. What I have in mind is for the
pairwise distance between observations in different regions, say
site "1 1" and site "2 3" or site "2 4", to be replaced by a large
number, say 999999. I need the ids for future processing, though.
Maybe I can stack the matrices generated using your tip to form a
block diagonal matrix, but then I would not have my ids? I'm really
sorry. I'm a bit lost.
Cheers,
Michael
On Mon, Jul 19, 2010 at 10:10 PM, Nikhil Kaza
<nikhil.l...@gmail.com> wrote:
Replace dist with the Mahalanobis distance in the following example.
a <- cbind(runif(10), sample(1:3, 10, replace=TRUE))
# split the values in column 1 by the group id in column 2
a.L <- split(a[,1], a[,2])
dist.L <- lapply(a.L, dist)
Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
nikhil.l...@gmail.com
On Jul 19, 2010, at 9:24 AM, Michael Ralph M. Abrigo wrote:
Hi! I am trying to implement non-bipartite matching. I have around 500
sites which can be clustered by 10 regions. I am able to calculate
pairwise Mahalanobis distances between sites (thanks to another post in
the forum). However, I want to constrain my match to sites within the
same region. Thus I want to replace elements of the distance matrix with
a high value, say 999999, for sites not of the same region so that the
pair will not be matched.
In the original data file I have information on which sites belong to
what region. However, when I compute the pairwise Mahalanobis distances,
I only use a subset of the file, which, naturally, does not include the
georeference of the sites. How should I do this? Any hint will be most
appreciated.
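One way to keep the georeference when only a subset of columns goes into the distance computation is to carry it in the row names. The sketch below is hedged: the data frame sites and its columns site, region, v1 and v2 are hypothetical stand-ins for the original data file, and the pooled variance var(x) is used only for illustration.

```r
# Hypothetical stand-in for the original data file.
sites <- data.frame(
  site   = 1:6,
  region = c(1, 1, 2, 2, 1, 2),
  v1     = c(0.21, 0.18, 0.69, 0.38, 0.77, 0.50),
  v2     = c(1.2, 0.7, 2.4, 1.9, 0.3, 1.1)
)

# Only the covariates enter the distance; label the rows "region site".
x <- as.matrix(sites[, c("v1", "v2")])
rownames(x) <- paste(sites$region, sites$site)

Sx <- var(x)
d  <- apply(x, 1, function(i) mahalanobis(x, i, Sx))

# Recover the region from the labels: everything before the first space.
region <- sub(" .*$", "", rownames(d))

# Penalize pairs of sites that lie in different regions.
d[outer(region, region, "!=")] <- 999999
d
```

Since the labels travel with the matrix, the region can be recovered at any later step without going back to the data file.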
Btw, I am relatively new in using R. I may export the matrix to another
program and replace the elements there, but that is a very very dirty
and rough trick that I would rather not do given better options.
Many thanks in advance.
Cheers,
Michael
--
"I am most anxious for liberties for our country... but I place as a
prior condition the education of the people so that our country may
have an individuality of its own and make itself worthy of
liberties... " Jose Rizal, 1896
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.