On Sun, Apr 26, 2009 at 6:22 PM, <dirk...@gmx.de> wrote:
> Dear R users,
>
> I am trying to do exact matching with replacement on a large dataset
> (500,000 obs), with treatment and control groups of about equal size.
> At the moment I use the "Match" function of the "Matching" library. I match
> on 2 covariates, and every observation in the treatment group has at least
> one exact counterpart in the control group.
>
> Now I want to introduce observation weights. I set ties=FALSE, as I want
> exactly one-to-one matching. Is there a way to draw randomly from the
> individuals in the control group that have the same covariate values as the
> individual in the treatment group, with the probability of being drawn
> proportional to the weight of each individual in the control group?
>
> E.g. I have three individuals who all have the same covariate values as the
> one observation I want to find a partner for, and the first of the three
> has a very large weight. When drawing randomly among those three, I want
> the probability that the first one is drawn to be very large.
>
> I'd really appreciate any suggestions: the "weights" option does not do the
> job, as it seems to work only when setting "ties=TRUE".
>
> Thanks
> Dirk

Hi Dirk,

You don't give a sample dataset, and I've not used the Matching library, so
take my comments with a grain of salt.

Looking at the help page for Match, it seems as if the option "Weight.matrix"
is what you're looking for. Creating a "weight" column in the treatment group
with a constant, high value, including "weight" in the matching, and giving
that covariate a high importance might work, no?

/Gustaf

-------------------------
Quote:
"Weight.matrix
This matrix denotes the weights the matching algorithm uses when weighting
each of the covariates in X (see the Weight option). This square matrix should
have as many columns as the number of columns of the X matrix. This matrix is
usually provided by a call to the GenMatch function which finds the optimal
weight each variable should be given so as to achieve balance on the
covariates.

For most uses, this matrix has zeros in the off-diagonal cells. This matrix
can be used to weight some variables more than others. For example, if X
contains three variables and we want to match as best as we can on the first,
the following would work well:

    Weight.matrix <- diag(3)
    Weight.matrix[1,1] <- 1000/var(X[,1])
    Weight.matrix[2,2] <- 1/var(X[,2])
    Weight.matrix[3,3] <- 1/var(X[,3])

This code changes the weights implied by the inverse of the variances by
multiplying the first variable by a 1000 so that it is highly weighted. In
order to enforce exact matching see the exact and caliper options."

--
Gustaf Rydevik, M.Sci.
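
For what it's worth, the weighted draw Dirk describes can also be done by hand outside Match, since exact matching on two covariates amounts to sampling within covariate cells. Below is a minimal base-R sketch, not tested against the Matching package. It assumes a data frame with hypothetical columns treat, x1, x2 and w (none of these names come from the original post) and uses sample.int() with its prob argument:

```r
# Hypothetical toy data: 'treat', 'x1', 'x2' and 'w' are illustrative names,
# not taken from the original post.
set.seed(1)
n <- 200
dat <- data.frame(
  treat = rep(c(1, 0), each = n / 2),
  x1    = sample(1:3, n, replace = TRUE),
  x2    = sample(1:2, n, replace = TRUE),
  w     = runif(n, 0.1, 5)
)

# Row indices of treated and control units, and the covariate cell of each row.
treated  <- which(dat$treat == 1)
controls <- which(dat$treat == 0)
cell     <- interaction(dat$x1, dat$x2, drop = TRUE)

# Control row indices grouped by covariate cell.
ctrl_by_cell <- split(controls, cell[controls])

# For one treated row, draw a single control from the same cell with
# probability proportional to w. sample.int() on positions avoids the
# sample(x, 1) pitfall when only one candidate exists.
draw_one <- function(i) {
  cand <- ctrl_by_cell[[as.character(cell[i])]]
  if (length(cand) == 0) return(NA_integer_)  # no exact counterpart
  cand[sample.int(length(cand), 1, prob = dat$w[cand])]
}

# matched[k] is the control row chosen for treated[k] (with replacement).
matched <- vapply(treated, draw_one, integer(1))
```

Each treated unit gets exactly one partner, and the same control may be drawn more than once. With 500,000 rows the per-row loop may be slow; drawing all partners for a cell in one sample.int() call with replace = TRUE would be an obvious speedup.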