Here is one way to determine where the duplicates are:

> vec <- scan(textConnection(' "STAT1"  "STAT1"  "STAT1"  "STAT1"  "GAPDH"  "GAPDH"  "GAPDH"
+  "ACTB"   "ACTB"   "ACTB"   "DDR1"   "RFC2"   "HSPA6"  "PAX8"
+  "GUCA1A" "UBE1L"  "THRA"   "PTPN21" "CCL5"   "CYP2E1" "STAT1"
+  "THRA"   "PAX8"'), what='')
Read 23 items
>
> # create a list of the positions at which each value occurs; any element
> # with length greater than one marks where the duplicates are
> dup <- split(seq_along(vec), vec)
>
> dup
$ACTB
[1]  8  9 10
$CCL5
[1] 19
$CYP2E1
[1] 20
$DDR1
[1] 11
$GAPDH
[1] 5 6 7
$GUCA1A
[1] 15
$HSPA6
[1] 13
$PAX8
[1] 14 23
$PTPN21
[1] 18
$RFC2
[1] 12
$STAT1
[1]  1  2  3  4 21
$THRA
[1] 17 22
$UBE1L
[1] 16
>
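
If what you are after is the numeric vector itself (same length as the input, with duplicates sharing a number), the following is a minimal sketch of one way to get it. It assumes that numbering the symbols by order of first appearance is acceptable, since the original post does not say which numbers to use. The names id and dup_only are made up for this illustration.

```r
# Same example vector as in the session above.
vec <- c("STAT1", "STAT1", "STAT1", "STAT1", "GAPDH", "GAPDH", "GAPDH",
         "ACTB", "ACTB", "ACTB", "DDR1", "RFC2", "HSPA6", "PAX8",
         "GUCA1A", "UBE1L", "THRA", "PTPN21", "CCL5", "CYP2E1",
         "STAT1", "THRA", "PAX8")

# One integer per element; equal symbols get the same number.
# Numbers follow order of first appearance (STAT1 = 1, GAPDH = 2, ...).
id <- match(vec, unique(vec))

# Positions of each symbol, keeping only symbols that occur more than once.
dup <- split(seq_along(vec), vec)
dup_only <- dup[sapply(dup, length) > 1]
```

Both steps are vectorized, so no explicit for loops are needed. as.integer(factor(vec)) should also give a usable id vector; it numbers the symbols by sorted order rather than by first appearance.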


On Thu, Jun 18, 2009 at 10:28 AM, njhuang86 <njhuan...@yahoo.com> wrote:

>
> Hi all,
>
> Suppose I have a vector like this:
>
>  [1] "STAT1"  "STAT1"  "STAT1"  "STAT1"  "GAPDH"  "GAPDH"  "GAPDH"  "ACTB"   "ACTB"
> [10] "ACTB"   "DDR1"   "RFC2"   "HSPA6"  "PAX8"   "GUCA1A" "UBE1L"  "THRA"   "PTPN21"
> [19] "CCL5"   "CYP2E1" "STAT1"  "THRA"   "PAX8"
>
> I would like to produce a vector of the same length as the one above that
> tells me where the duplicates are. Essentially, if I could represent each
> gene symbol as a specific number, with duplicates sharing the same number,
> that would be ideal. Right now I'm using the unique command along with two
> nested for loops to do the job, but it's really taking too long. Any
> suggestions would be greatly appreciated. Thank you!
> --
> View this message in context:
> http://www.nabble.com/Any-method-to-speed-up-this-problem--tp24094164p24094164.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
