On Sun, Apr 01, 2012 at 06:00:43PM -0700, Burak Aydin wrote:
> Hello Greg,
> Sorry for the confusion.
> Let's say I have a population. I have 6 variables. They are correlated with
> each other. I can get you Pearson, tetrachoric or polychoric
> correlation coefficients.
> 2 of them are continuous, 2 binary, 2 categorical.
> Let's assume the following conditions:
> Co1 and Co2 are normally distributed continuous random variables. Co1 ~ N(0,1),
> Co2 ~ N(100,15)
> Ca1 and Ca2 are categorical variables. Ca1 probabilities
> = c(.02,.18,.28,.22,.30), Ca2 probs = c(.06,.18,.76)
> Bi1 and Bi2 are binary. Marginal probabilities: Bi1 p = 0.4, Bi2 p = 0.5.
> And, again, I have the correlations.
>
> When I try to simulate this population I fail. If I keep the means and
> probabilities the same, I lose the correct correlations. When I keep the
> correlations, I lose precision on the means and frequencies/probabilities.
Hi.

One idea, which occurred to me, is the following. Formulate a model of the joint distribution with some parameters, together with a criterion function that measures how much data generated from the model differ from the required marginal distributions and the required correlations. Then optimize the parameters to minimize that difference.

If you have enough data, the model can be a table of estimated probabilities for all 5*3*2*2 = 60 combinations of the discrete variables, together with, for each of these combinations, the parameters of the conditional distribution of the 2 continuous variables, which can be a bivariate normal distribution. However, you probably do not have enough data for this.

Another approach starts from the distribution of the continuous variables; the model for the discrete variables can then be a logistic model using the continuous variables as input.

Another type of model that may be suitable is a Bayesian network. For this, you need to choose only a subset of the most important dependencies, so that the selected dependencies can be represented by a directed acyclic graph.

Petr Savicky.
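A minimal sketch of the "model plus criterion plus optimization" idea, restricted to one continuous variable (Co1) and one binary variable (Bi1), and written in Python with only the standard library rather than R. Several things here are assumptions and not part of the original posts: the target correlation of 0.3 is hypothetical (the actual correlations were not given), the names simulate_bi1 and criterion are made up for illustration, and the coarse grid search merely stands in for whatever optimizer one would really use. The uniforms are drawn once and reused, so the criterion is a deterministic function of the parameters.

```python
import math
import random

TARGET_P = 0.4   # marginal probability of Bi1, taken from the question
TARGET_R = 0.3   # HYPOTHETICAL Pearson correlation between Co1 and Bi1

def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(1)
N = 2000
co1 = [rng.gauss(0.0, 1.0) for _ in range(N)]   # Co1 ~ N(0, 1)
unif = [rng.random() for _ in range(N)]         # fixed uniforms, reused for every (a, b)

def simulate_bi1(a, b):
    """Logistic model: Bi1 = 1 when u < 1 / (1 + exp(-(a + b * Co1)))."""
    return [1 if u < 1.0 / (1.0 + math.exp(-(a + b * x))) else 0
            for x, u in zip(co1, unif)]

def criterion(a, b):
    """Squared distance of the simulated margin and correlation from the targets."""
    bi1 = simulate_bi1(a, b)
    p = sum(bi1) / N
    if p == 0.0 or p == 1.0:          # correlation undefined for a constant variable
        return float("inf")
    r = pearson(co1, bi1)
    return (p - TARGET_P) ** 2 + (r - TARGET_R) ** 2

# Coarse grid search over intercept a in [-1.5, 0.5] and slope b in [0, 2].
grid_a = [-1.5 + i / 10 for i in range(21)]
grid_b = [i / 10 for i in range(21)]
best_crit, best_a, best_b = min(
    (criterion(a, b), a, b) for a in grid_a for b in grid_b
)

fitted = simulate_bi1(best_a, best_b)
fitted_p = sum(fitted) / N
fitted_r = pearson(co1, fitted)
print(best_a, best_b, fitted_p, fitted_r)
```

The same pattern would extend to the remaining variables by adding parameters (for example, a multinomial logistic model for Ca1 and Ca2) and adding one squared term to the criterion for each required margin and correlation. Whether all targets can be met at once depends on the chosen model family.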