You can do this with ave():

  gene <- c("a","b","c","d","c","d","c","f")
  ave(gene, gene, FUN=function(x)if(length(x)>1)paste(x,seq_along(x),sep="-") else x)
  # [1] "a" "b" "c-1" "d-1" "c-2" "d-2" "c-3" "f"

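To get from that vector to the row names asked for in the question, something like the following sketch might work. It assumes the data frame from the original post; the names df and new_names are made up for illustration, and stringsAsFactors = FALSE is set explicitly so that gene is a character vector on older versions of R as well.

```r
# Data frame from the original question. stringsAsFactors = FALSE keeps
# 'gene' as character on R versions before 4.0.
df <- data.frame(gene = c("a","b","c","d","c","d","c","f"),
                 value = c(20,300,48,55,9,2,100,200),
                 stringsAsFactors = FALSE)

# Same ave() call as above: number only the genes that occur more than once.
new_names <- ave(df$gene, df$gene, FUN = function(x) {
  if (length(x) > 1) paste(x, seq_along(x), sep = "-") else x
})

# Row names must be unique, so check before assigning.
stopifnot(!anyDuplicated(new_names))

rownames(df) <- new_names
df$gene <- NULL   # drop the column now that it lives in the row names
```

The uniqueness check would fail if the data already contained a gene literally named, say, "c-1" alongside repeated "c" entries, so it is worth keeping when the input is not under your control.
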
You can probably speed it up a bit by pulling the paste() out of FUN and
doing it later. It would be simpler if you put the '-N' after all genes,
not just the ones that were repeated.

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Wed, Feb 28, 2018 at 10:18 PM, Stephen HonKit Wong <stephe...@gmail.com> wrote:

> Dear All,
>
> Suppose I have a dataframe like this with many thousands rows all with
> different names:
> data.frame(gene=c("a","b","c","d","c","d","c","f"),value=c(20,300,48,55,9,2,100,200)),
>
> I want to set column "gene" as row.names, but there are duplicates (c, d),
> which I want to transform into this as row names: a, b, c-1, d-1, c-2, d-2,
> c-3, f
>
> Many thanks!
>
> Stephen
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.