Dear R-help members,

Apologies for the trouble.

I have a question:

Essentially, I have a dataset which stores genetic variations for individual
patients. Each individual patient can have more than one variation, and each
new record corresponds to a new variation (thus, both individual patients
and variations are non-unique).

The dataset looks something like this (letters = patients, numbers =
variation type):
Patient, Variation Type
A, 1
A, 2
A, 3
B, 1
C, 2
D, 2
D, 3
E, 2
E, 4
F, 4
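
In case it helps to have this in reproducible form, here is the same toy
data as an R data frame. The names `dat`, `Patient` and `Variation` are
just ones I picked for illustration; my real dataset is named differently.

```r
# Toy data copied from the listing above.
# `dat`, `Patient` and `Variation` are illustrative names only.
dat <- data.frame(
  Patient   = c("A", "A", "A", "B", "C", "D", "D", "E", "E", "F"),
  Variation = c(1, 2, 3, 1, 2, 2, 3, 2, 4, 4)
)

# One row per (patient, variation) record; patients repeat across rows.
str(dat)
```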

My desired output is a data.frame or a vector with one entry per patient,
giving the number of that patient's variations that fall in a chosen subset.
For example, if I were interested only in variation types 2 and 3, my output
would look like this:

A, 2
B, 0
C, 1
D, 2
E, 1
F, 0

I am trying to figure out how to use tapply to do this.

It would be something like tapply (Variation Type, Patient, ??? )

I am not sure what function to supply as ??? so that only types 2 and 3 are
counted, and I have been searching the R-help archives.
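
My best guess so far is sketched below: an anonymous function that counts
how many of a patient's values fall in the wanted set, using `%in%` and
`sum`. This is only a sketch on the toy data; `dat`, `wanted` and `counts`
are names I made up for the example, and there may well be a cleaner way.

```r
# Toy data from the listing above (illustrative names).
dat <- data.frame(
  Patient   = c("A", "A", "A", "B", "C", "D", "D", "E", "E", "F"),
  Variation = c(1, 2, 3, 1, 2, 2, 3, 2, 4, 4)
)

# Variation types of interest (an assumption for this example).
wanted <- c(2, 3)

# For each patient, count the records whose type is in `wanted`.
# `v %in% wanted` is a logical vector; sum() counts its TRUE values.
counts <- tapply(dat$Variation, dat$Patient, function(v) sum(v %in% wanted))
counts

# Optional: the same result as a two-column data frame.
result <- data.frame(Patient = names(counts), Count = as.vector(counts))
result
```

Patients with no matching variation (B and F here) still appear, with a
count of 0, because the counting is done within every patient group rather
than after filtering rows out.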

Sorry! Essentially, I am trying to avoid awkward loops in this whole
process.

Thanks very much for your advice!

Min-Han

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.