If you have a large data set, you might consider using 'data.table', which performs better than 'plyr':

> x <- read.table(textConnection("Gene ProbeID Expression_Level
+ A 1 0.34
+ A 2 0.21
+ E 3 0.11
+ A 4 0.21
+ F 5 0.56
+ F 6 0.87"), header = TRUE)
> closeAllConnections()
> require(data.table)
> x <- data.table(x)
> x[,
+     list(nProbes = length(ProbeID)
+         , Mean_Level = mean(Expression_Level)
+         )
+     , by = Gene
+     ]
     Gene nProbes Mean_Level
[1,]    A       3  0.2533333
[2,]    E       1  0.1100000
[3,]    F       2  0.7150000
>

On Thu, Jun 30, 2011 at 3:28 AM, Max Mariasegaram
<max.mariasega...@qut.edu.au> wrote:
> Hi,
>
> I am interested in using the cast function in R to perform some aggregation.
> I did once manage to get it working, but have now forgotten how I did this.
> So here is my dilemma. I have several thousand probes (about 180,000)
> corresponding to each gene; what I'd like to do is obtain a frequency
> count of the various occurrences of each probe for each gene.
>
> The data would look something like this:
>
> Gene    ProbeID    Expression_Level
> A       1          0.34
> A       2          0.21
> E       3          0.11
> A       4          0.21
> F       5          0.56
> F       6          0.87
> .
> .
> .
> (180000 data points)
>
> In each case, the probeID is unique. The output I am looking for is something
> like this:
>
> Gene    No.ofprobes    Mean_expression
> A       3              0.25
>
> Is there an easy way to do this using "cast" or "melt"? Ideally, I would also
> like to see the unique probes corresponding to each gene in the wide format.
>
> Thanks in advance
> Max
>
> Maxy Mariasegaram| Research Fellow | Australian Prostate Cancer Research
> Centre| Level 1, Building 33 | Princess Alexandra Hospital | 199 Ipswich
> Road, Brisbane QLD 4102 Australia | t: 07 3176 3073| f: 07 3176 7440 | e:
> maria...@qut.edu.au
>
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?