[R] simulation data with dichotomous variables

thanoon younis Mon, 04 Aug 2014 18:10:27 -0700

Dear R-users,
I need your help with a problem in the code below. I want to simulate
two different samples, R1 and R2, each with 10 variables and 1000
observations. The variables should be highly correlated within R1 and
within R2, but uncorrelated between R1 and R2. I also have a problem with
the correlation coefficient between two dichotomous variables: R seems to
support only the Pearson, Spearman, and Kendall coefficients.
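
On the coefficient question: for two 0/1 variables, the Pearson coefficient is the same number as the phi coefficient computed from the 2x2 table, so method = "pearson" may already be enough. Below is a minimal sketch of that check. The variables x and y, the seed, and the 80% agreement rate are made up purely for illustration.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# illustrative data: y copies x about 80% of the time, otherwise flips it
x <- rbinom(1000, 1, 0.5)
y <- ifelse(runif(1000) < 0.8, x, 1 - x)

# phi coefficient computed by hand from the 2x2 contingency table
tab <- table(factor(x, levels = 0:1), factor(y, levels = 0:1))
n00 <- tab[1, 1]; n01 <- tab[1, 2]
n10 <- tab[2, 1]; n11 <- tab[2, 2]
phi <- (n11 * n00 - n10 * n01) /
  sqrt(prod(rowSums(tab)) * prod(colSums(tab)))

# Pearson correlation on the raw 0/1 vectors
pearson <- cor(x, y, method = "pearson")

all.equal(phi, pearson)
```

If the 0/1 values are instead viewed as thresholded continuous variables, a tetrachoric correlation (available in add-on packages such as psych) would be the usual alternative.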


Thanks a lot in advance,

Thanoon


ords <- seq(0,1)
p <- 10
N <- 1000
percent_change <- 0.9

R1 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))
R2 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))
# for two 0/1 variables, Pearson correlation equals the phi coefficient
# correlation between the columns of R1 and the columns of R2
cor(R1, R2, method = "pearson")


# subset variable to have a stronger correlation


v1 <- R1[, 1, drop = FALSE]
v2 <- R2[, 1, drop = FALSE]
# randomly choose which rows to retain
keep <- sample(seq_len(N), size = round(percent_change * N))
change <- setdiff(seq_len(N), keep)

# randomly choose new values for the changed rows, separately per sample
new.change1 <- sample(ords, length(change), replace = TRUE)
new.change2 <- sample(ords, length(change), replace = TRUE)

# replace values in copies of the original columns
v1.samp <- v1
v1.samp[change, ] <- new.change1
v2.samp <- v2
v2.samp[change, ] <- new.change2

# closer correlation within each sample
cor(v1, v1.samp, method = "pearson")
cor(v2, v2.samp, method = "pearson")

# set the correlated column as the second column of its own sample
R1[, 2] <- v1.samp
R2[, 2] <- v2.samp
head(R1)
head(R2)
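
An alternative to editing individual columns is to draw every variable in a sample from one shared latent factor and then cut the result at 0, which gives high correlation among all 10 variables at once. The sketch below is only one possible approach, not an established recipe: sim_block is a hypothetical helper, and rho = 0.9 is an assumed latent correlation. After thresholding at 0, the correlation between the 0/1 variables is lower than rho, roughly (2/pi) * asin(rho).

```r
set.seed(42)  # arbitrary seed, only for reproducibility
p <- 10
N <- 1000
rho <- 0.9  # assumed latent correlation within a sample

# hypothetical helper: N x p data frame of 0/1 variables sharing one factor
sim_block <- function(N, p, rho) {
  f <- rnorm(N)                                # factor shared by all columns
  e <- matrix(rnorm(N * p), nrow = N)          # independent noise per column
  latent <- sqrt(rho) * f + sqrt(1 - rho) * e  # pairwise latent correlation rho
  as.data.frame((latent > 0) * 1L)             # threshold at 0 to get 0/1
}

R1 <- sim_block(N, p, rho)
R2 <- sim_block(N, p, rho)  # independent draw, so unrelated to R1

within_R1 <- cor(R1)        # correlations among the variables of R1
between <- cor(R1, R2)      # correlations between R1 and R2

round(within_R1[1:3, 1:3], 2)
round(range(between), 2)
```

Because R1 and R2 come from separate calls, they share no random inputs, so the between-sample correlations should only differ from zero by sampling noise.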


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
