Hi, I figured out a workaround to my problem, but if anyone has advice on how to express this as a function inside tapply to achieve the same outcome, that would be awesome, and I'd learn something about functions!
The workaround was

tapply((data$Variation.Type %in% c(2,3)), data$Patient, sum)

Thanks.

Min-Han

On Fri, Mar 26, 2010 at 12:40 PM, Min-Han Tan <minhan.scie...@gmail.com> wrote:

> Dear R-help members,
>
> Apologies for the trouble.
>
> I have a question:
>
> Essentially, I have a dataset which stores genetic variations for
> individual patients. Each individual patient can have more than one
> variation, and each new record corresponds to a new variation (thus, both
> individual patients and variations are non-unique).
>
> So the dataset looks something like this (letters = patients, numbers =
> variation type).
>
> Patient, Variation Type
> A, 1
> A, 2
> A, 3
> B, 1
> C, 2
> D, 2
> D, 3
> E, 2
> E, 4
> F, 4
>
> My final desired output is a data.frame or a vector containing patients,
> each corresponding to a desired subset of variations. For e.g., if I only
> was interested in variation type 2,3, my output would look like this.
>
> A, 2
> B, 0
> C, 1
> D, 2
> E, 1
> F, 0
>
> I am trying to figure out how to use tapply to do this.
>
> It would be something like tapply (Variation Type, Patient, ??? )
>
> I am not sure about the function syntax of ??? to subselect only 2,3, and
> have been looking at the r-help.
>
> Sorry! Essentially, I am trying to avoid awkward loops in this whole
> process.
>
> Thanks very much for your advice!
>
> Min-Han
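
For anyone reading this thread later, here is a minimal sketch of the function form asked about. It assumes the data frame is called `data` with columns `Patient` and `Variation.Type`, as in the workaround; the rows are copied from the example in the quoted message. The anonymous function receives one patient's vector of variation types at a time and counts how many of them are 2 or 3.

```r
# Example data copied from the quoted message; the column names are
# assumed to match those used in the workaround.
data <- data.frame(
  Patient = c("A", "A", "A", "B", "C", "D", "D", "E", "E", "F"),
  Variation.Type = c(1, 2, 3, 1, 2, 2, 3, 2, 4, 4)
)

# tapply splits Variation.Type by Patient and calls the anonymous
# function on each piece; x is one patient's variation types.
counts <- tapply(data$Variation.Type, data$Patient,
                 function(x) sum(x %in% c(2, 3)))

# The workaround from the post, kept for comparison.
workaround <- tapply(data$Variation.Type %in% c(2, 3), data$Patient, sum)

# Optional: turn the named result into a data.frame.
result <- data.frame(Patient = names(counts), Count = as.vector(counts))
print(result)
```

Both forms should give the same counts per patient. The workaround tests membership once over the whole column and then sums within groups, while the anonymous-function form does the test inside each group, which generalises more easily if the per-patient summary later needs to be something other than a count.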