Hi,

Maybe this helps:

# Example with a single region
dat1 <- read.table(text="Sample Genotype Region
sample1 A Region1
sample1 B Region1
sample1 A Region1
sample2 A Region1
sample2 A Region1
sample3 A Region1
sample4 B Region1",sep="",header=TRUE,stringsAsFactors=FALSE)

library(plyr)

# Within each Sample, set Genotype to "E" if more than one distinct genotype
# is present; unique() then collapses the duplicated rows
unique(ddply(dat1,.(Sample),mutate,
  Genotype=if(length(unique(Genotype))>1) {"E"} else Genotype))

# Example with several regions
dat2 <- read.table(text="Sample Genotype Region
sample1 A Region1
sample1 B Region1
sample1 A Region1
sample2 A Region1
sample2 A Region1
sample3 A Region1
sample4 B Region1
sample1 A Region2
sample1 B Region2
sample1 A Region2
sample2 A Region2
sample2 A Region2",sep="",header=TRUE,stringsAsFactors=FALSE)

# Same idea, but grouped by Region and Sample
unique(ddply(dat2,.(Region,Sample),mutate,
  Genotype=if(length(unique(Genotype))>1) {"E"} else Genotype))

# or, in base R
aggregate(Genotype~.,data=dat2,
  function(x) if(length(unique(x))>1) "E" else unique(x))

A.K.


I would like to transform this data:

Sample   Genotype  Region
sample1  A         Region1
sample1  B         Region1
sample1  A         Region1
sample2  A         Region1
sample2  A         Region1
sample3  A         Region1
sample4  B         Region1

into the format below, tagging samples that have more than one genotype with "E" and collapsing samples that have the same genotype listed more than once into a single row:

Sample   Genotype  Region
sample1  E         Region1
sample2  A         Region1
sample3  A         Region1
sample4  B         Region1

I have one list with many regions (Region1 - Regionx). Is it possible to do this in R?

Thanks a lot.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.