Here is a way of determining where the dups are:

> vec <- scan(textConnection(' "STAT1"  "STAT1"  "STAT1"  "STAT1"  "GAPDH"  "GAPDH"  "GAPDH"
+  "ACTB"   "ACTB"   "ACTB"   "DDR1"   "RFC2"   "HSPA6"  "PAX8"
+  "GUCA1A" "UBE1L"  "THRA"   "PTPN21" "CCL5"   "CYP2E1" "STAT1"
+  "THRA"   "PAX8"'), what='')
Read 23 items
>
> # create a list of which ones are the same; if the length of the list is greater
> # than one, then it marks where the dups are
> dup <- split(seq(vec), vec)
>
> dup
$ACTB
[1]  8  9 10

$CCL5
[1] 19

$CYP2E1
[1] 20

$DDR1
[1] 11

$GAPDH
[1] 5 6 7

$GUCA1A
[1] 15

$HSPA6
[1] 13

$PAX8
[1] 14 23

$PTPN21
[1] 18

$RFC2
[1] 12

$STAT1
[1]  1  2  3  4 21

$THRA
[1] 17 22

$UBE1L
[1] 16

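If what is wanted is a vector of the same length with one number per gene symbol, the following is a minimal sketch using only base R. It is one possible approach rather than the only one; the names id, is_dup and dup_only are just illustrative, and vec is retyped here with c() so the snippet runs on its own.

```r
# Same 23 gene symbols as in the session above
vec <- c("STAT1", "STAT1", "STAT1", "STAT1", "GAPDH", "GAPDH", "GAPDH",
         "ACTB", "ACTB", "ACTB", "DDR1", "RFC2", "HSPA6", "PAX8",
         "GUCA1A", "UBE1L", "THRA", "PTPN21", "CCL5", "CYP2E1", "STAT1",
         "THRA", "PAX8")

# One integer per element; equal symbols share the same integer,
# numbered in order of first appearance (STAT1 = 1, GAPDH = 2, ...)
id <- match(vec, unique(vec))
id
# [1]  1  1  1  1  2  2  2  3  3  3  4  5  6  7  8  9 10 11 12 13  1 10  7

# TRUE for every element whose symbol occurs more than once
is_dup <- vec %in% vec[duplicated(vec)]

# Keep only the groups from split() that hold more than one position
dup <- split(seq_along(vec), vec)
dup_only <- dup[sapply(dup, length) > 1]
```

match() and duplicated() are vectorized, so this should avoid the nested for loops entirely. as.integer(factor(vec)) gives a similar numbering, but in sorted order of the symbols rather than order of first appearance.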
On Thu, Jun 18, 2009 at 10:28 AM, njhuang86 <njhuan...@yahoo.com> wrote:

>
> Hi all,
>
> Suppose I have a vector like this:
>
>  [1] "STAT1"  "STAT1"  "STAT1"  "STAT1"  "GAPDH"  "GAPDH"  "GAPDH"  "ACTB"   "ACTB"
> [10] "ACTB"   "DDR1"   "RFC2"   "HSPA6"  "PAX8"   "GUCA1A" "UBE1L"  "THRA"   "PTPN21"
> [19] "CCL5"   "CYP2E1" "STAT1"  "THRA"   "PAX8"
>
> I would like to produce a vector such that it has the same length as the one
> above but it tells me where the duplicates are. So essentially, if I could
> represent each gene symbol as a specific number, and have the duplicates be
> the same number, that would be ideal. Right now, I'm using the unique
> command along with two nested for loops to do the job... But it's really
> taking too long... Any suggestions would be greatly appreciated. Thank you!
> --
> View this message in context:
> http://www.nabble.com/Any-method-to-speed-up-this-problem--tp24094164p24094164.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?