Dear all,

I have a .csv file called df4 (15752 obs. of 264 variables). I applied the code below, but a warning message keeps coming up and I cannot continue with the other analyses. After this step, I want to determine the maximum and minimum similarity values, draw a heat map plot, run a cluster analysis, etc.
> require(SNPRelate)
> library(gdsfmt)
> myd <- read.csv(file = "df4.csv", header = TRUE)
> names(myd)[-1]
> myd[,1]
> myd[1:10, 1:10]
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod[1:10,1:10]
> genod <- as.matrix(genod)
> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion

I can provide more details if that would help me be more specific; please let me know. I would appreciate your help.

Thanks,
Meriam
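
A minimal sketch of how the entries behind the "NAs introduced by coercion" warning could be located. It assumes the warning comes from genotype cells that hold something other than a number (for example "-" or a letter code). The toy data frame, its column names, and its values are made up for illustration and are not taken from df4.csv.

```r
# Hypothetical stand-in for the genotype columns of df4.csv (myd[,-1]).
genod <- data.frame(s1 = c("0", "1", "-", "2"),
                    s2 = c("1", "H", "0", NA),
                    stringsAsFactors = FALSE)

# A data frame with any character column becomes a character matrix.
m <- as.matrix(genod)

# Convert to numeric, keeping the matrix shape; strings that are not
# numbers turn into NA here (this is what triggers the warning).
num <- suppressWarnings(matrix(as.numeric(m), nrow = nrow(m),
                               dimnames = dimnames(m)))

# Cells that were not missing before the conversion but are NA after it.
bad <- is.na(num) & !is.na(m)

table(m[bad])                # which codes could not be converted
which(bad, arr.ind = TRUE)   # their row and column positions
```

If the codes reported this way are meant to be missing genotypes, they could be recoded to 3 before the matrix is converted to numeric; whether that is appropriate depends on what those codes mean in the data.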