Dear R-users,

I need your help with the code below. I want to simulate two different samples, R1 and R2, each with 10 variables and 1000 observations. The variables should be highly correlated within R1 and within R2, but there should be no correlation between R1 and R2. I also have a problem with the correlation coefficient between two dichotomous variables: as far as I can see, R supports only these types of correlation coefficient: Pearson, Spearman and Kendall.
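On the dichotomous point, as far as I understand it, Pearson's coefficient computed on two 0/1 variables is numerically the same as the phi coefficient, so cor() may already cover this case. Below is a small check of that, only a sketch: the vectors x and y and the 80% agreement rate are made-up illustration values, not part of my real data.

```r
# Hypothetical example data: two dichotomous (0/1) variables
set.seed(1)
x <- rbinom(200, 1, 0.5)
y <- ifelse(runif(200) < 0.8, x, 1 - x)  # y copies x about 80% of the time

# Phi coefficient computed by hand from the 2x2 contingency table
tab <- table(x, y)
m <- matrix(as.numeric(tab), nrow = 2, ncol = 2)  # counts as doubles
phi <- (m[1, 1] * m[2, 2] - m[1, 2] * m[2, 1]) /
  sqrt(prod(rowSums(m), colSums(m)))

# Pearson correlation on the same 0/1 vectors
r <- cor(x, y, method = "pearson")

# The two values should agree up to floating-point rounding
all.equal(phi, r)
```

If a coefficient on an assumed underlying continuous scale is wanted instead (for example a tetrachoric correlation), that is not available through cor() and would need an add-on package.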
Thanks a lot in advance,
Thanoon

ords <- c(0, 1)          # the two levels of each dichotomous variable
p <- 10                  # number of variables per sample
N <- 1000                # number of observations per sample
percent_change <- 0.9    # proportion of rows kept unchanged

R1 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))
R2 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))

# pearson is more appropriate for dichotomous data
cor(R1, R2, method = "pearson")

# subset one variable from each sample to build a stronger correlation
v1 <- R1[, 1, drop = FALSE]
v2 <- R2[, 1, drop = FALSE]

# randomly choose which rows to retain and which to change
keep <- sample(seq_len(N), size = round(percent_change * N))
change <- setdiff(seq_len(N), keep)

# replace values in copies of the original columns with new random values
v1.samp <- v1
v2.samp <- v2
v1.samp[change, 1] <- sample(ords, length(change), replace = TRUE)
v2.samp[change, 1] <- sample(ords, length(change), replace = TRUE)

# closer correlation
cor(v1, v1.samp, method = "pearson")
cor(v2, v2.samp, method = "pearson")

# set the correlated column as one of the other columns in the same sample
R1[, 2] <- v1.samp[, 1]
R2[, 2] <- v2.samp[, 1]

R1
R2
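The code above only ties column 2 to column 1 in each sample. One possible way to get all 10 variables of a sample correlated with each other, while keeping the two samples unrelated, might be to give each sample its own latent normal factor and cut the result at 0. This is only a sketch under assumptions: the helper name sim_block and the latent correlation rho = 0.9 are hypothetical choices for illustration, and other methods may suit better.

```r
set.seed(123)
p <- 10      # variables per sample
N <- 1000    # observations per sample
rho <- 0.9   # assumed latent correlation within a sample (illustrative value)

# Hypothetical helper: p dichotomous variables that share one latent factor
sim_block <- function(N, p, rho) {
  f <- rnorm(N)                                  # common factor, one value per row
  e <- matrix(rnorm(N * p), nrow = N, ncol = p)  # independent noise per variable
  z <- sqrt(rho) * f + sqrt(1 - rho) * e         # latent normals, pairwise correlation rho
  as.data.frame((z > 0) * 1)                     # dichotomise at 0 to get 0/1 values
}

# Separate calls use separate factors, so R1 and R2 are independent of each other
R1 <- sim_block(N, p, rho)
R2 <- sim_block(N, p, rho)

within1 <- cor(R1, method = "pearson")       # correlations inside R1
within2 <- cor(R2, method = "pearson")       # correlations inside R2
between <- cor(R1, R2, method = "pearson")   # correlations between R1 and R2

round(range(within1[upper.tri(within1)]), 2)
round(range(within2[upper.tri(within2)]), 2)
round(range(between), 2)
```

Note that the correlation of the 0/1 variables comes out lower than rho: for normal variables cut at their median, the expected Pearson (phi) value is (2/pi) * asin(rho), which is about 0.71 for rho = 0.9. A larger rho would be needed for a stronger observed correlation.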