Yes, sorry. I attached the file once again. Well, still getting the same warning.
> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion
> class(genod)
[1] "matrix"

Then, I run the following code and it gives this:

> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+     sample.id = sample.id, snp.id = snp.id,
+     snp.chromosome = snp.chromosome,
+     snp.position = snp.position,
+     snp.allele = snp.allele, snpfirstdim=TRUE)
> # calculate similarity matrix
> # Open the GDS file
> (genofile <- snpgdsOpen(filn))
File: C:\Users\DELL\Documents\TEST\simTunesian.gds (1.4M)
+    [  ] *
|--+ sample.id   { Str8 363 ZIP_ra(42.5%), 755B }
|--+ snp.id   { Int32 15752 ZIP_ra(35.1%), 21.6K }
|--+ snp.position   { Int32 15752 ZIP_ra(34.7%), 21.3K }
|--+ snp.chromosome   { Float64 15752 ZIP_ra(0.18%), 230B }
|--+ snp.allele   { Str8 15752 ZIP_ra(0.16%), 108B }
\--+ genotype   { Bit2 15752x363, 1.4M } *
> ibs <- snpgdsIBS(genofile, remove.monosnp = FALSE, num.thread=1)
Identity-By-State (IBS) analysis on genotypes:
Excluding 0 SNP on non-autosomes
Working space: 363 samples, 15,752 SNPs
    using 1 (CPU) core
IBS:    the sum of all selected genotypes (0,1,2) = 3658952
Tue Jan 08 15:38:00 2019    (internal increment: 42880)
[==================================================] 100%, completed in 0s
Tue Jan 08 15:38:00 2019    Done.
> # maximum similarity value
> max(ibs$ibs)
[1] NaN
> # minimum similarity value
> min(ibs$ibs)
[1] NaN

As you can see, I can't continue my analysis (heat map plot, clustering
with hclust) because values are NaN.

On Tue, Jan 8, 2019 at 2:01 PM David L Carlson <dcarl...@tamu.edu> wrote:
>
> Your attached file is not a .csv file since the field are not separated by
> commas (just rename the mydata.csv to mydata.txt).
>
> The command "genod2 <- as.matrix(genod)" created a character matrix from the
> data frame genod. When you try to force genod2 to numeric, the marker column
> becomes NAs which is probably not what you want.
>
> The error message is because you passed genod (a data frame) to the
> snpgdsCreateGeno() function not genod2 (the matrix you created from genod).
>
> ------------------------------------
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of N Meriam
> Sent: Tuesday, January 8, 2019 1:38 PM
> To: Michael Dewey <li...@dewey.myzen.co.uk>
> Cc: r-help@r-project.org
> Subject: Re: [R] Warning message: NAs introduced by coercion
>
> Here's a portion of what my data looks like (text file format attached).
> When running in R, it gives me this:
>
> > df4 <- read.csv(file = "mydata.csv", header = TRUE)
> > require(SNPRelate)
> > library(gdsfmt)
> > myd <- df4
> > myd <- df4
> > names(myd)[-1]
> [1] "marker" "X88" "X9" "X17" "X25"
> > myd[,1]
> [1] 3 4 5 6 8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> > sample.id <- names(myd)[-1]
> > snp.id <- myd[,1]
> > snp.position <- 1:length(snp.id) # not needed for ibs
> > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> > genod <- myd[,-1]
> > genod[is.na(genod)] <- 3
> > genod[genod=="0"] <- 0
> > genod[genod=="1"] <- 2
> > genod2 <- as.matrix(genod)
> > head(genod2)
>      marker                        X88 X9  X17 X25
> [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
> [2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
> [3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
> [4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
> [5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
> [6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> > class(genod2) <- "numeric"
> Warning message:
> In class(genod2) <- "numeric" : NAs introduced by coercion
> > head(genod2)
>      marker X88 X9 X17 X25
> [1,]     NA   0  3   3   3
> [2,]     NA   2  0   3   0
> [3,]     NA   0  0   0   0
> [4,]     NA   0  0   3   0
> [5,]     NA   3  3   3   3
> [6,]     NA   0  0   0   0
> > class(genod2) <- "numeric"
> > class(genod2)
> [1] "matrix"
> # read data
> > filn <-"simTunesian.gds"
> > snpgdsCreateGeno(filn, genmat = genod,
> +     sample.id = sample.id, snp.id = snp.id,
> +     snp.chromosome = snp.chromosome,
> +     snp.position = snp.position,
> +     snp.allele = snp.allele, snpfirstdim=TRUE)
> Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, :
>   is.matrix(genmat) is not TRUE
>
> Can't find a solution to my problem...my guess is that the problem
> comes from converting the column 'marker' factor to numerical.
>
> Best,
> Meriam
>
> On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey <li...@dewey.myzen.co.uk> wrote:
> >
> > Dear Meriam
> >
> > Your csv file did not come through as attachments are stripped unless of
> > certain types and you post is very hard to read since you are posting in
> > HTML. Try renaming the file to ????.txt and set your mailer to send
> > plain text then people may be able to help you better.
> >
> > Michael
> >
> > On 08/01/2019 15:35, N Meriam wrote:
> > > I see...
> > > Here's a portion of what my data looks like (csv file attached).
> > > I run again and here are the results:
> > >
> > > df4 <- read.csv(file = "mydata.csv", header = TRUE)
> > >
> > > > require(SNPRelate)
> > > > library(gdsfmt)
> > > > myd <- df4
> > > > myd <- df4
> > > > names(myd)[-1]
> > > [1] "marker" "X88" "X9" "X17" "X25"
> > > > myd[,1]
> > > [1] 3 4 5 6 8 10
> > > > # the data must be 0,1,2 with 3 as missing so you have r
> > > > sample.id <- names(myd)[-1]
> > > > snp.id <- myd[,1]
> > > > snp.position <- 1:length(snp.id) # not needed for ibs
> > > > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > > > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> > > > # genotype data must have - in 3
> > > > genod <- myd[,-1]
> > > > genod[is.na(genod)] <- 3
> > > > genod[genod=="0"] <- 0
> > > > genod[genod=="1"] <- 2
> > > > genod2 <- as.matrix(genod)
> > > > head(genod2)
> > >      marker                        X88 X9  X17 X25
> > > [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
> > > [2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
> > > [3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
> > > [4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
> > > [5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
> > > [6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> > > > class(genod2) <- "numeric"
> > > Warning message:
> > > In class(genod2) <- "numeric" : NAs introduced by coercion
> > > > head(genod2)
> > >      marker X88 X9 X17 X25
> > > [1,]     NA   0  3   3   3
> > > [2,]     NA   2  0   3   0
> > > [3,]     NA   0  0   0   0
> > > [4,]     NA   0  0   3   0
> > > [5,]     NA   3  3   3   3
> > > [6,]     NA   0  0   0   0
> > > > class(genod2) <- "numeric"
> > > > class(genod2)
> > > [1] "matrix"
> > > > # read data
> > > > filn <-"simTunesian.gds"
> > > > snpgdsCreateGeno(filn, genmat = genod,
> > > +     sample.id = sample.id, snp.id = snp.id,
> > > +     snp.chromosome = snp.chromosome,
> > > +     snp.position = snp.position,
> > > +     snp.allele = snp.allele, snpfirstdim=TRUE)
> > > Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, :
> > >   is.matrix(genmat) is not TRUE
> > >
> > > Thanks,
> > > Meriam
> > >
> > > On Tue, Jan 8, 2019 at 9:02 AM PIKAL Petr
<petr.pi...@precheza.cz> wrote:
> > >
> > >> Hi
> > >>
> > >> see in line
> > >>
> > >>> -----Original Message-----
> > >>> From: R-help <r-help-boun...@r-project.org> On Behalf Of N Meriam
> > >>> Sent: Tuesday, January 8, 2019 3:08 PM
> > >>> To: r-help@r-project.org
> > >>> Subject: [R] Warning message: NAs introduced by coercion
> > >>>
> > >>> Dear all,
> > >>>
> > >>> I have a .csv file called df4. (15752 obs. of 264 variables).
> > >>> I apply this code but couldn't continue further other analyses, a
> > >>> warning message keeps coming up. Then, I want to determine max and
> > >>> min similarity values, heat map plot, cluster...etc
> > >>>
> > >>>> require(SNPRelate)
> > >>>> library(gdsfmt)
> > >>>> myd <- read.csv(file = "df4.csv", header = TRUE)
> > >>>> names(myd)[-1]
> > >>>> myd[,1]
> > >>>> myd[1:10, 1:10]
> > >>> # the data must be 0,1,2 with 3 as missing so you have r
> > >>>> sample.id <- names(myd)[-1]
> > >>>> snp.id <- myd[,1]
> > >>>> snp.position <- 1:length(snp.id) # not needed for ibs
> > >>>> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > >>>> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> > >>> # genotype data must have - in 3
> > >>>> genod <- myd[,-1]
> > >>>> genod[is.na(genod)] <- 3
> > >>>> genod[genod=="0"] <- 0
> > >>>> genod[genod=="1"] <- 2
> > >>>> genod[1:10,1:10]
> > >>>> genod <- as.matrix(genod)
> > >>
> > >> matrix can have only one type of data so you probaly changed it to
> > >> character by such construction.
> > >>
> > >>>> class(genod) <- "numeric"
> > >>
> > >> This tries to change all "numeric" values to numbers but if it cannot it
> > >> sets it to NA.
> > >>
> > >> something like
> > >>
> > >>> head(iris)
> > >>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> > >> 1          5.1         3.5          1.4         0.2  setosa
> > >> 2          4.9         3.0          1.4         0.2  setosa
> > >> 3          4.7         3.2          1.3         0.2  setosa
> > >> 4          4.6         3.1          1.5         0.2  setosa
> > >> 5          5.0         3.6          1.4         0.2  setosa
> > >> 6          5.4         3.9          1.7         0.4  setosa
> > >>> ir <-head(iris)
> > >>> irm <- as.matrix(ir)
> > >>> head(irm)
> > >>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> > >> 1 "5.1"        "3.5"       "1.4"        "0.2"       "setosa"
> > >> 2 "4.9"        "3.0"       "1.4"        "0.2"       "setosa"
> > >> 3 "4.7"        "3.2"       "1.3"        "0.2"       "setosa"
> > >> 4 "4.6"        "3.1"       "1.5"        "0.2"       "setosa"
> > >> 5 "5.0"        "3.6"       "1.4"        "0.2"       "setosa"
> > >> 6 "5.4"        "3.9"       "1.7"        "0.4"       "setosa"
> > >>> class(irm) <- "numeric"
> > >> Warning message:
> > >> In class(irm) <- "numeric" : NAs introduced by coercion
> > >>> head(irm)
> > >>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> > >> 1          5.1         3.5          1.4         0.2      NA
> > >> 2          4.9         3.0          1.4         0.2      NA
> > >> 3          4.7         3.2          1.3         0.2      NA
> > >> 4          4.6         3.1          1.5         0.2      NA
> > >> 5          5.0         3.6          1.4         0.2      NA
> > >> 6          5.4         3.9          1.7         0.4      NA
> > >>>
> > >>
> > >> Cheers
> > >> Petr
> > >>
> > >>>
> > >>> *Warning message:In class(genod) <- "numeric" : NAs introduced by coercion*
> > >>>
> > >>> Maybe I could illustrate more with details so I can be more specific?
> > >>> Please, let me know.
> > >>>
> > >>> I would appreciate your help.
> > >>> Thanks,
> > >>> Meriam
> > >>>
> > >>> [[alternative HTML version deleted]]
> > >>>
> > >>> ______________________________________________
> > >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > >>> and provide commented, minimal, self-contained, reproducible code.
> > >> Personal data: Information about processing and protection of business
> > >> partner's personal data are available on website:
> > >> https://www.precheza.cz/en/personal-data-protection-principles/
> > >> (Czech version: https://www.precheza.cz/zasady-ochrany-osobnich-udaju/)
> > >> Confidentiality: This email and any documents attached to it may be
> > >> confidential and are subject to the legally binding disclaimer:
> > >> https://www.precheza.cz/en/01-disclaimer/
> > >> (Czech version: https://www.precheza.cz/01-dovetek/)
> > >>
> > >
> >
> > --
> > Michael
> > http://www.dewey.myzen.co.uk/home.html
>
> --
> Meriam Nefzaoui
> MSc. in Plant Breeding and Genetics
> Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
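
The replies quoted above explain that as.matrix() on a data frame containing a text column produces a character matrix, and that forcing that matrix to numeric turns the text column into NA. A small check along the following lines shows which columns are responsible before any conversion is attempted. It is only a sketch: the data frame `toy` and its values are made up for illustration and are not taken from the real data set.

```r
# Hypothetical toy data frame mimicking the layout discussed in the thread:
# one text column (marker) followed by genotype columns.
toy <- data.frame(marker = c("snpA", "snpB", "snpC"),
                  X88 = c(0, 1, NA),
                  X9  = c(0, 0, 1),
                  stringsAsFactors = FALSE)

# Columns that are not numeric; these are the ones that would become NA
# under class(x) <- "numeric".
non_numeric <- names(toy)[!sapply(toy, is.numeric)]
print(non_numeric)

# as.matrix() on the whole data frame has to fall back to character ...
m_all <- as.matrix(toy)
# ... while the numeric columns on their own stay numeric.
m_num <- as.matrix(toy[, setdiff(names(toy), non_numeric), drop = FALSE])

print(typeof(m_all))
print(typeof(m_num))
```

If `non_numeric` is not empty, those columns need to be dropped (or kept separately as identifiers) before the matrix is built, rather than coerced afterwards.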
,marker,X88,X9,X17,X25
3,100023173|F|0-47:G>A-47:G>A,0,NA,NA,NA
4,1043336|F|0-7:A>G-7:A>G,1,0,NA,0
5,1212218|F|0-49:A>G-49:A>G,0,0,0,0
6,1019554|F|0-14:T>C-14:T>C,0,0,NA,0
8,100024550|F|0-16:G>A-16:G>A,NA,NA,NA,NA
10,1106702|F|0-8:C>A-8:C>A,0,0,0,0
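
The attached sample rows can be used to sketch one way of building the genotype matrix. This assumes (it is not confirmed in the thread) that the unnamed first column is only a row index, which read.csv() names `X`, so both it and `marker` are dropped before as.matrix() is called. The name `geno_cols` is introduced here for illustration, and using `marker` as the SNP id is a choice made for this sketch, not what the original code did.

```r
# Sample rows copied from the attachment; the first (unnamed) column is
# assumed to be a row index.
csv_text <- ',marker,X88,X9,X17,X25
3,100023173|F|0-47:G>A-47:G>A,0,NA,NA,NA
4,1043336|F|0-7:A>G-7:A>G,1,0,NA,0
5,1212218|F|0-49:A>G-49:A>G,0,0,0,0
6,1019554|F|0-14:T>C-14:T>C,0,0,NA,0
8,100024550|F|0-16:G>A-16:G>A,NA,NA,NA,NA
10,1106702|F|0-8:C>A-8:C>A,0,0,0,0'

myd <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Keep only the genotype columns: drop the row index and the marker text.
geno_cols <- setdiff(names(myd), c("X", "marker"))
genod <- as.matrix(myd[, geno_cols])  # all-numeric columns give a numeric matrix

# Recode as in the thread: 1 becomes 2, missing becomes 3.
genod[!is.na(genod) & genod == 1] <- 2
genod[is.na(genod)] <- 3

snp.id    <- myd$marker   # one id per row (SNP)
sample.id <- geno_cols    # one id per column (sample)

# The checks that failed or warned in the thread should now pass.
stopifnot(is.matrix(genod), is.numeric(genod), !anyNA(genod))
```

The resulting `genod` is a plain numeric matrix with SNPs in rows and no `marker` column, so `sample.id` no longer contains "marker". Whether this also removes the NaN values returned by snpgdsIBS() has not been checked here.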