[R] simulate data with binary outcome and correlated predictors

Delphine COURVOISIER Tue, 11 Nov 2008 06:29:18 -0800

Hi,

I would like to simulate data with a binary outcome and a set of correlated 
predictors. I want to be able to fix the number of events (Y=1) vs. 
non-events (Y=0), so I fix the outcome first and then simulate the predictors 
within each outcome group. I have 2 questions: 
1. When all predictors are continuous, I can use mvrnorm() from MASS. However, 
with a mix of continuous, ordinal and binary predictors, I'm not sure how to 
simulate the relationships between predictors accurately. 
2. To control the coefficients of the regression of Y on the predictors, I have 
to specify the predictor distributions separately for Y=1 and Y=0: I can vary 
the means and the variances/covariances of the predictors in each group. 
However, with this approach it is hard to determine the resulting coefficients 
precisely. Any help on how to be as precise as possible on the betas would be 
appreciated.
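
For question 1, one possible approach (a sketch under stated assumptions, not 
a definitive answer) is to draw correlated latent normal variables with 
mvrnorm() and cut some of them at chosen quantiles to obtain binary and 
ordinal predictors. The latent correlation of 0.5, the 30% prevalence of the 
binary predictor and the 25%/50%/25% split of the ordinal predictor are 
arbitrary illustration values, and the names latentCorr, xCont, xBin and xOrd 
are hypothetical.

```r
library(MASS)  # provides mvrnorm()
set.seed(1)

n <- 1000

# Correlation matrix of the latent (unobserved) normal variables.
# The value 0.5 is an arbitrary choice for illustration.
latentCorr <- matrix(0.5, 3, 3)
diag(latentCorr) <- 1

z <- mvrnorm(n, mu = rep(0, 3), Sigma = latentCorr)

# Continuous predictor: keep the latent variable as it is.
xCont <- z[, 1]

# Binary predictor: threshold so that P(xBin = 1) is about 0.3.
xBin <- as.integer(z[, 2] > qnorm(0.7))

# Ordinal predictor with 3 levels in proportions of about 25%, 50%, 25%.
xOrd <- cut(z[, 3], breaks = c(-Inf, qnorm(c(0.25, 0.75)), Inf),
            labels = FALSE)

predictors <- data.frame(xCont, xBin, xOrd)

# Observed correlations, to compare with the latent value of 0.5.
print(round(cor(predictors), 2))
```

The correlations among the discretized predictors are weaker than the latent 
ones, so the latent correlations may need to be set higher than the values 
wanted on the observed scale.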


Here is the code I wrote, where all predictors are continuous with variance = 1 
and the correlation between predictors differs between Y=1 and Y=0. 
_________
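library(MASS)   # provides mvrnorm()
N<-1000         # total sample size

nbX<-3          # number of predictors
propSick<-0.2   # proportion of events (Y=1)
corrSick<-.8    # correlation between predictors when Y=1
corrHealthy<-.9 # correlation between predictors when Y=0
# whole-number group sizes, to avoid fractional row counts
nSick<-round(N*propSick)
nHealthy<-N-nSick

# covariance matrices with unit variances, one per outcome group
sigma0<-matrix(corrHealthy,nbX,nbX)
diag(sigma0)<-1
sigma1<-matrix(corrSick,nbX,nbX)
diag(sigma1)<-1
# predictor means are 0 when Y=0 and 1 when Y=1
dataHealthy<-mvrnorm(nHealthy,rep(0,nbX),sigma0)
dataSick<-mvrnorm(nSick,rep(1,nbX),sigma1)

# assemble the data set: events in the first nSick rows, non-events after
dataS<-as.data.frame(matrix(0,ncol=nbX+1,nrow=N))
dimnames(dataS)[[2]]<-c(paste("IV",1:nbX,sep=""),"DV")
dataS$DV[1:nSick]<-1
dataS$DV<-factor(dataS$DV)
dataS[1:nSick,1:nbX]<-dataSick
dataS[(nSick+1):N,1:nbX]<-dataHealthy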
_____________
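
For question 2, a possible route (again a hedged sketch, not a definitive 
method) relies on a standard result: if the predictors are multivariate 
normal with the same covariance matrix Sigma in both groups, the log odds of 
Y=1 are exactly linear in the predictors with slopes 
beta = solve(Sigma) %*% (mu1 - mu0). Target slopes can then be obtained by 
setting mu1 = mu0 + Sigma %*% beta. In the code above the two covariance 
matrices differ (0.8 vs. 0.9), so the log odds are quadratic in the predictors 
and a linear logistic model is only an approximation. The target slopes, the 
shared correlation of 0.5 and the group sizes below are arbitrary 
illustration values, and the names beta, sigma, simData and betaHat are 
hypothetical.

```r
library(MASS)  # provides mvrnorm()
set.seed(2)

nbX      <- 3
nSick    <- 2000   # fixed number of events (Y=1)
nHealthy <- 8000   # fixed number of non-events (Y=0)

# Target logistic regression slopes (arbitrary illustration values).
beta <- c(0.5, -0.3, 0.8)

# Covariance matrix shared by both groups (assumption needed for linearity).
sigma <- matrix(0.5, nbX, nbX)
diag(sigma) <- 1

# Group means chosen so that solve(sigma) %*% (mu1 - mu0) equals beta.
mu0 <- rep(0, nbX)
mu1 <- mu0 + as.vector(sigma %*% beta)

xSick    <- mvrnorm(nSick, mu1, sigma)
xHealthy <- mvrnorm(nHealthy, mu0, sigma)

simData <- data.frame(rbind(xSick, xHealthy),
                      DV = rep(c(1, 0), c(nSick, nHealthy)))
names(simData)[1:nbX] <- paste("IV", 1:nbX, sep = "")

# Fit a logistic regression and compare estimated and target slopes.
fit <- glm(DV ~ ., data = simData, family = binomial)
betaHat <- coef(fit)[-1]
print(round(cbind(target = beta, estimate = betaHat), 2))
```

Only the slopes are controlled this way. The intercept depends on the fixed 
proportion of events as well as on the group means.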

thanks in advance for any suggestions,




************************************
Delphine Courvoisier
Clinical Epidemiology Division
University of Geneva Hospital
+4122 37 29029

