Hi--

  This is a question with a trivial and obvious answer, I'm sure, but I can't 
seem to find it in the help files and books that I have handy.  I have a 
dataframe consisting of two columns, "Gene_Name," a list of gene symbols, and 
"Number," a numeric measure of how frequently a tag representing that gene 
showed up in a SAGE library.  Several of the genes are represented by multiple 
tags, and therefore are present more than once in the list, e.g.:

1167     Zcchc8      6
1168     Zcwpw1      5
1169     Zdhhc18     6
1170     Zdhhc20     5
1171     Zdhhc3      6
1172     Zdhhc3      5
1173     Zeb2        9
1174     Zeb2        6

  What I want is to collapse the list by gene name, such that duplicates are 
summed up and appear only once in the final version:



Zcchc8      6
Zcwpw1      5
Zdhhc18     6
Zdhhc20     5
Zdhhc3     11
Zeb2       15



  The only way I can figure out to do this is via rowsum:



> rowsum (Number,Gene_Name)



gives me exactly what I want, *except* that in the end, I am left with a matrix 
containing the Number values and with the Gene_Names used as row names (the 
output therefore looks exactly as printed above) -- what I want is a dataframe 
equivalent to the starting table, with numbered rows and separate, accessible 
columns containing the Gene_Name and Number values.
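
For concreteness, here is a minimal sketch that reproduces the situation using only the eight rows quoted above. The data frame name `genes` is an assumption made for illustration; the columns are the ones described in the post.

```r
# Hypothetical reconstruction of the excerpt above; the name `genes`
# is assumed for illustration only.
genes <- data.frame(
  Gene_Name = c("Zcchc8", "Zcwpw1", "Zdhhc18", "Zdhhc20",
                "Zdhhc3", "Zdhhc3", "Zeb2", "Zeb2"),
  Number    = c(6, 5, 6, 5, 6, 5, 9, 6)
)

# rowsum() adds up Number within each Gene_Name group.
sums <- rowsum(genes$Number, genes$Gene_Name)

is.matrix(sums)      # the result is a one-column matrix, not a data frame
rownames(sums)       # the gene names survive only as row names
sums["Zdhhc3", 1]    # the two Zdhhc3 tags (6 and 5) summed
```

The sums are right, but the gene names are no longer an ordinary column that can be accessed with `$`.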



  I was able to put such a dataframe together manually, by cobbling together 
the row names of the above list with the values:



> genes.unique <- data.frame(rownames(rowsum(Number, Gene_Name)), rowsum(Number, Gene_Name))



but then I have to manually replace the row names of the dataframe with 
numbers, to get back to what I wanted in the first place.
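
A sketch of that manual route, written out in full, might look like the following. It again assumes a data frame called `genes` built from the rows quoted earlier; the column names given to `data.frame()` are chosen here to match the original table. Setting the row names to NULL makes R renumber the rows automatically. The last lines show `aggregate()` from base R as one possible alternative that returns a data frame directly; it is offered as an option to try, not as the definitive answer.

```r
# Hypothetical reconstruction of the excerpt; `genes` is an assumed name.
genes <- data.frame(
  Gene_Name = c("Zcchc8", "Zcwpw1", "Zdhhc18", "Zdhhc20",
                "Zdhhc3", "Zdhhc3", "Zeb2", "Zeb2"),
  Number    = c(6, 5, 6, 5, 6, 5, 9, 6)
)

# Compute the group sums once and reuse the result.
sums <- rowsum(genes$Number, genes$Gene_Name)

# Rebuild a two-column data frame from the row names and the summed values.
genes.unique <- data.frame(Gene_Name = rownames(sums), Number = sums[, 1])

# Dropping the row names restores the default numbering 1, 2, 3, ...
rownames(genes.unique) <- NULL
genes.unique

# Possible alternative: aggregate() sums Number within each Gene_Name
# and returns a data frame with numbered rows.
genes.agg <- aggregate(Number ~ Gene_Name, data = genes, FUN = sum)
genes.agg
```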



  I hope this makes some sort of sense.  Is there an easier way to do this?  
Thanks in advance!



  Charlie Murtaugh







=====

L. Charles Murtaugh
Assistant Professor

University of Utah
Dept. of Human Genetics
15 N. 2030 E. Rm. 2100
Salt Lake City, UT 84112

tel 801-581-5958
fax 801-581-6463
email [EMAIL PROTECTED]

