Oh, it seems that I can just use your last line of code to solve my problem:

m2=tapply(mc$IID, list(FID=mc$FID, PLATE=mc$PLATE), mean)
m2=as.data.frame(m2)
library(data.table)
m3=setDT(m2, keep.rownames = TRUE)[]
colnames(m3)[1] <- "FID"
mt=merge(mc,m3,by="FID")
for(i in 4:ncol(mt)) mt[,i] <- 1 + (names(mt)[i]== mt$PLATE)
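
For reference, a more direct route that skips the tapply/merge step might look like the sketch below. It rebuilds the six rows shown by head(mc) as toy data and assumes PLATE is a character column, not a factor. The names `plates` and `ind` and the use of outer() are illustrative choices, not something taken from Bert's reply. It also avoids the 'names' length error in the quoted attempt, which came from keeping names(dat)[1:34] instead of names(dat)[1:3] (34 + 34 = 68 names for 37 columns).

```r
# Toy copy of the six rows shown by head(mc); the full data has 34 plates.
mc <- data.frame(
  FID   = c("fam0110", "fam0113", "fam0114", "fam0117", "fam0118", "fam0119"),
  IID   = c("G110", "G113", "G114", "G117", "G118", "G119"),
  PLATE = c("4RWG569", "cherry", "cherry", "4RWG569", "5XAV049", "cherry"),
  stringsAsFactors = FALSE
)

# Unique plate names, in order of first appearance (hypothetical helper name).
plates <- unique(mc$PLATE)

# Logical matrix: TRUE where a row's PLATE equals the column's plate name.
# Adding 1 turns TRUE/FALSE into 2/1.
ind <- 1 + outer(mc$PLATE, plates, "==")
colnames(ind) <- plates

# cbind() on a data frame keeps the matrix column names unchanged,
# including names that start with a digit.
mt <- cbind(mc, ind)
print(mt)
```

On the full data the same lines should give one indicator column per unique plate, without first creating placeholder columns and renaming them.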

Thanks!

On Tue, Sep 29, 2020 at 12:08 PM Ana Marija <sokovic.anamar...@gmail.com> wrote:
>
> Hi Bert,
>
> thank you for getting back to me.
> I tried this:
>
> > dat <- cbind(mc, matrix(0,ncol = 34))
> > head(dat)
>       FID  IID   PLATE 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
> 1 fam0110 G110 4RWG569 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 2 fam0113 G113  cherry 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 3 fam0114 G114  cherry 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 4 fam0117 G117 4RWG569 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 5 fam0118 G118 5XAV049 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 6 fam0119 G119  cherry 0 0 0 0 0 0 0 0 0  0  0  0  0  0  0  0  0  0  0  0  0  0
>   23 24 25 26 27 28 29 30 31 32 33 34
> 1  0  0  0  0  0  0  0  0  0  0  0  0
> 2  0  0  0  0  0  0  0  0  0  0  0  0
> 3  0  0  0  0  0  0  0  0  0  0  0  0
> 4  0  0  0  0  0  0  0  0  0  0  0  0
> 5  0  0  0  0  0  0  0  0  0  0  0  0
> 6  0  0  0  0  0  0  0  0  0  0  0  0
> > names(dat) <- c(names(dat)[1:34], unique(dat$PLATE))
> Error in names(dat) <- c(names(dat)[1:34], unique(dat$PLATE)) :
>   'names' attribute [68] must be the same length as the vector [37]
>
> so names should include FID, IID, PLATE plus unique dat$PLATE
> how do I fix that so the code works?
>
> Also I tried a bit on my own:
>
> > head(mc)
>       FID  IID   PLATE
> 1 fam0110 G110 4RWG569
> 2 fam0113 G113  cherry
> 3 fam0114 G114  cherry
> 4 fam0117 G117 4RWG569
> 5 fam0118 G118 5XAV049
> 6 fam0119 G119  cherry
> ...
>
> m2=tapply(mc$IID, list(FID=mc$FID, PLATE=mc$PLATE), mean)
> m2=as.data.frame(m2)
> library(data.table)
> m3=setDT(m2, keep.rownames = TRUE)[]
> colnames(m3)[1] <- "FID"
> mt=merge(mc,m3,by="FID")
>
> > head(mt)
>       FID  IID   PLATE 0VXC556 1CNF297 1CWO500 1DXJ626 1LTX827 1SHK635 1TNP840
> 1 fam0110 G110 4RWG569      NA      NA      NA      NA      NA      NA      NA
> 2 fam0113 G113  cherry      NA      NA      NA      NA      NA      NA      NA
> 3 fam0114 G114  cherry      NA      NA      NA      NA      NA      NA      NA
> 4 fam0117 G117 4RWG569      NA      NA      NA      NA      NA      NA      NA
> 5 fam0118 G118 5XAV049      NA      NA      NA      NA      NA      NA      NA
> 6 fam0119 G119  cherry      NA      NA      NA      NA      NA      NA      NA
>   1URP242 2BKX529 2PAG415 3DEF425 3ECO791 3FQM386 3KYJ479 3XHK903 4RWG569
> 1      NA      NA      NA      NA      NA      NA      NA      NA      NA
> 2      NA      NA      NA      NA      NA      NA      NA      NA      NA
> 3      NA      NA      NA      NA      NA      NA      NA      NA      NA
> 4      NA      NA      NA      NA      NA      NA      NA      NA      NA
> 5      NA      NA      NA      NA      NA      NA      NA      NA      NA
> 6      NA      NA      NA      NA      NA      NA      NA      NA      NA
> ...
>
> so this gives me the correct columns. Now the question is how to
> replace each NA with 2 if the column name matches the value in the
> PLATE column of that row, and with 1 otherwise.
>
> Cheers,
> Ana
>
> On Tue, Sep 29, 2020 at 11:46 AM Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > I am not sure reshape2 is appropriate for this task, but, assuming I
> > understand correctly, it's quite easy without it. The following is one way,
> > which probably can be done more elegantly and efficiently, but I think it
> > does what you want.
> >
> > "dat" is your example data frame, in which the columns were read in with
> > "stringsAsFactors" = FALSE (this is important!)
> >
> > dat <- cbind(dat, matrix(0,ncol = 3)) ## change 3 to 34 for your full data
> > names(dat) <- c(names(dat)[1:3], unique(dat$PLATE))
> > for(i in 4:ncol(dat)) dat[,i] <- 1 + (names(dat)[i]== dat$PLATE)
> > dat
> >
> > Result:
> >
> >       FID  IID   PLATE 4RWG569 cherry 5XAV049
> > 1 fam0110 G110 4RWG569       2      1       1
> > 2 fam0113 G113  cherry       1      2       1
> > 3 fam0114 G114  cherry       1      2       1
> > 4 fam0117 G117 4RWG569       2      1       1
> > 5 fam0118 G118 5XAV049       1      1       2
> > 6 fam0119 G119  cherry       1      2       1
> >
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along and
> > sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Tue, Sep 29, 2020 at 9:19 AM Ana Marija <sokovic.anamar...@gmail.com>
> > wrote:
> >>
> >> Hello,
> >>
> >> I have a data frame like this:
> >>
> >> > head(mc)
> >>       FID  IID   PLATE
> >> 1 fam0110 G110 4RWG569
> >> 2 fam0113 G113  cherry
> >> 3 fam0114 G114  cherry
> >> 4 fam0117 G117 4RWG569
> >> 5 fam0118 G118 5XAV049
> >> 6 fam0119 G119  cherry
> >> ...
> >> > dim(mc)
> >> [1] 1625 4
> >> > length(unique(mc$PLATE))
> >> [1] 34
> >>
> >> I am trying to make a new data frame which would look like this:
> >>       FID  IID   PLATE 4RWG569 cherry 5XAV049 ...
> >> 1 fam0110 G110 4RWG569       2      1       1
> >> 2 fam0113 G113  cherry       1      2       1
> >> 3 fam0114 G114  cherry       1      2       1
> >> 4 fam0117 G117 4RWG569       2      1       1
> >> 5 fam0118 G118 5XAV049       1      1       2
> >> 6 fam0119 G119  cherry       1      2       1
> >> ...
> >>
> >> so the new data frame would have an additional 34 columns (one for every
> >> unique mc$PLATE), and if the value in the PLATE column of a row is equal
> >> to that column name I would have 2, otherwise 1
> >>
> >> I tried to do this with:
> >>
> >> library(reshape2)
> >> > m2=dcast(mc, IID ~ PLATE)
> >> Using PLATE as value column: use value.var to override.
> >>
> >> Please advise,
> >> Ana
> >>
> >> ______________________________________________
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.