Thank you very much for your help, Nikhil! The code I'm using now is:

library(Matrix)  # bdiag() comes from the Matrix package

#generate data
set.seed(2)
x <- as.matrix(runif(5))
id1 <- sample(1:2,5,replace=T)
id2 <- c(1:5)
rownames(x) <- paste(id1, id2)

#create distance matrix if same id1
x.L <- split(x,id1)
n.L <- split(rownames(x), id1)
for(i in 1:length(x.L)){
  names(x.L[[i]]) <- n.L[[i]]
}
m2 <- function(i,j) { mahalanobis(j,i,var(j)) }
m3 <- function(k) { apply(as.matrix(k),1,m2,as.matrix(k)) }
dd <- lapply(x.L, m3)
df <- bdiag(dd)
#label in the same order as the blocks in dd; sort() orders "1 10" before "1 2"
ids <- unlist(n.L, use.names=FALSE)
rownames(df) <- ids
colnames(df) <- ids

x.L
df
dd

Cheers,
Michael

On Tue, Jul 20, 2010 at 2:27 AM, Nikhil Kaza <nikhil.l...@gmail.com> wrote:
> My mistake, instead of colnames(d1)
>
> use substr(colnames(d1),1,1) or similar
>
> On Jul 19, 2010, at 2:15 PM, Nikhil Kaza wrote:
>
>> Michael,
>>
>> You can modify the following code to suit. Also avoid using dist as a
>> variable name since it is a function in base. However, are you sure you want
>> to do this? Sx is the variance using sites in all the regions!
>>
>> d1 <- apply(x,1, function(i){mahalanobis(x,i,Sx)})
>> is.na(d1) <- !sapply(id1, grepl, colnames(d1), fixed=T)
>>
>> If on the other hand you want to use only variance within a region modify
>> like this (I am sure more optimal code can be written)
>>
>> #not tested
>> x.L <- split(x,id1)
>> n.L <- split(rownames(x), id1)
>> for (i in 1:length(x.L)){names(x.L[[i]]) <- n.L[[i]]}
>> m2 <- function(i,j){mahalanobis(j, i, var(j))}
>> m3 <- function(k){apply(as.matrix(k),1,m2,as.matrix(k))}
>> d2 <- lapply(x.L, m3)
>>
>> Nikhil Kaza
>> Asst. Professor,
>> City and Regional Planning
>> University of North Carolina
>>
>> nikhil.l...@gmail.com
>>
>> On Jul 19, 2010, at 11:37 AM, Michael Ralph M. Abrigo wrote:
>>
>>> Thanks for the tip, Nikhil. However, I need only one matrix as input
>>> for another to compute for non-bipartite matching which minimizes
>>> pairwise distances between observations. As such, I need the
>>> georeference (id) of the observations for subsequent processing. Below
>>> is an illustration.
>>>
>>> > #generate data
>>> > x <- as.matrix(runif(5))
>>> > Sx <- var(x)
>>> >
>>> > #generate id
>>> > set.seed(1)
>>> > id1 <- sample(1:2,5, replace=T)
>>> > id2 <- c(1:5)
>>> > rownames(x) <- paste(id1, id2)
>>> >
>>> > #generate distance
>>> > dist <- as.matrix(
>>> + apply(x,1,function(i){
>>> + mahalanobis(x,i,Sx)
>>> + }
>>> + )
>>> + )
>>> >
>>> > #print matrices
>>> > x
>>>          [,1]
>>> 1 1 0.2059746
>>> 1 2 0.1765568
>>> 2 3 0.6870228
>>> 2 4 0.3841037
>>> 1 5 0.7698414
>>> >
>>> > dist
>>>            1 1        1 2        2 3       2 4        1 5
>>> 1 1 0.00000000 0.01165534 3.11660015 0.4273402 4.28210082
>>> 1 2 0.01165534 0.00000000 3.50943798 0.5801450 4.74056406
>>> 2 3 3.11660015 3.50943798 0.00000000 1.2358255 0.09237602
>>> 2 4 0.42734018 0.58014499 1.23582554 0.0000000 2.00395492
>>> 1 5 4.28210082 4.74056406 0.09237602 2.0039549 0.00000000
>>>
>>> The geo-id is composed of two references, the first digit for the
>>> region and the next for the observation itself. What I'm thinking of
>>> is for pairwise distance between observations of different regions,
>>> say site-11 and site-23 or site-24, to be replaced by a large number,
>>> say 999999. I need the id for future processing, though.
>>> Maybe I can stack the matrices generated using your tip to form a
>>> block diagonal matrix, but then I do not have my ids? I'm really sorry.
>>> I'm a bit lost.
>>> Cheers,
>>> Michael
>>>
>>> On Mon, Jul 19, 2010 at 10:10 PM, Nikhil Kaza <nikhil.l...@gmail.com>
>>> wrote:
>>>>
>>>> replace dist with mahalanobis distance in the following example.
>>>>
>>>> a <- cbind(runif(10), sample(1:3, 10, replace=T))
>>>> a.L <- split(a[,1], a[,2])
>>>> dist.L <- lapply(a.L, dist)
>>>>
>>>> On Jul 19, 2010, at 9:24 AM, Michael Ralph M. Abrigo wrote:
>>>>
>>>>> Hi! I am trying to implement non-bipartite matching.
>>>>> I have around 500 sites which can be clustered by 10 regions. I am able
>>>>> to calculate pairwise Mahalanobis distances between sites (thanks to
>>>>> another post in the forum). However, I want to constrain my match to
>>>>> sites within the same region. Thus I want to replace elements of the
>>>>> distance matrix with a high value, say 999999, for sites not of the same
>>>>> region so that the pair will not be matched.
>>>>> In the original data file I have information on which sites belong to
>>>>> what region. However, when I compute pairwise Mahalanobis distances, I
>>>>> only use a subset of the file, which, naturally, does not include the
>>>>> georeference of the sites. How should I do this? Any hint will be most
>>>>> appreciated.
>>>>> Btw, I am relatively new to using R. I may export the matrix to another
>>>>> program and replace the elements there, but that is a very very dirty
>>>>> and rough trick that I would rather not do given better options.
>>>>> Many thanks in advance.
>>>>>
>>>>> Cheers,
>>>>> Michael
>>>>>
>>>>> --
>>>>> "I am most anxious for liberties for our country... but I place as a
>>>>> prior condition the education of the people so that our country may
>>>>> have an individuality of its own and make itself worthy of
>>>>> liberties... " Jose Rizal,1896
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
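
For what it's worth, the goal described in this thread (one matrix, within-region Mahalanobis distances, a large value such as 999999 for cross-region pairs, and the ids kept as row and column names) could be sketched in base R roughly as follows. This is only a sketch under assumptions: region_dist() and BIG are made-up names for illustration, the toy data mirror the example in the thread, and the variance is taken within each region, as in the untested suggestion quoted in the thread.

```r
# Sketch: within-region Mahalanobis distances, large constant elsewhere.
# region_dist() and BIG are illustrative names, not from the thread.
set.seed(2)
x   <- as.matrix(runif(5))
id1 <- sample(1:2, 5, replace = TRUE)   # region of each site
id2 <- 1:5                              # site number
rownames(x) <- paste(id1, id2)

BIG <- 999999                           # penalty for cross-region pairs

region_dist <- function(x, region, big = BIG) {
  n   <- nrow(x)
  # start with the penalty everywhere, labelled with the original ids
  out <- matrix(big, n, n, dimnames = list(rownames(x), rownames(x)))
  for (r in unique(region)) {
    idx <- which(region == r)
    if (length(idx) < 2) {              # a lone site has no within-region variance
      out[idx, idx] <- 0
      next
    }
    xr <- x[idx, , drop = FALSE]
    Sr <- var(xr)                       # covariance from this region only
    # column j holds the distances from every site in the region to site j
    out[idx, idx] <- apply(xr, 1, function(ctr) mahalanobis(xr, ctr, Sr))
  }
  out
}

d <- region_dist(x, id1)
d
```

Because the result is filled in place rather than stacked block by block, the rows and columns stay in the original order of x, so the ids cannot get out of step with the distances. With more than one variable per site, each region needs more sites than variables for var() to give an invertible matrix.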