On Sat, Aug 2, 2014 at 1:11 PM, Adrian Johnson <oriolebaltim...@gmail.com> wrote:
> Hi:
>
> I am trying to identify mutually exclusive events from the following
> example:
>
> Cluster  Gene  Mutated  not-mutated
> 1        G1    1        0
> 1        G2    1        0
> 1        G3    0        1
> 1        G4    0        1
> 1        G5    1        0
> 2        G1    0        1
> 2        G2    1        0
> 2        G3    1        0
> 2        G4    0        0
> 2        G5    1        0
>
> In cluster 1 : G1, G2, G5 are mutated
>
> In cluster 2: G2, G3, G5 are mutated.
>
> I am interested in finding such G2-G5 event and G1-G3 events.
>
> In total I have a 8 clusters and 150 gene (1200 rows x 4 columns).
>
> What test could be appropriate to identify such pairs.
>
> In my naive understanding would a fishers-exact test give such
> combinations.
>
> Thanks a lot.
>
> -Adrian

I am having trouble visualizing your data. How about a sample? The easy way is to do something like: temp <- head(realData,10); dput(temp); Then cut'n'paste the output from the dput() into another email here.

But, assuming I have a bit of a grasp, you have four columns (example only shows 3). If you have a set of columns which are 0 & 1 or FALSE and TRUE, then you can create a "temp" column which encodes them simply by considering them to be binary digits in a number. I.e. tempColumn = 1 * column1 + 2 * column2 + 4 * column3 + 8 * column4. You can then "group" the data by this value. All rows with the same value are in the same "group". But I don't know what you want your output to look like.

As an aside, any value other than 0, 1, 2, 4, or 8 could be considered invalid because it means that more than one column is TRUE, which violates your constraint.

--
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! <><
John McKown
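
For what it is worth, here is a minimal sketch of the encoding idea above, using the ten rows from the quoted example. The data frame name realData comes from the dput() suggestion; the column name NotMutated (instead of "not-mutated") and the helper names flagCols, weights, and valid are assumptions made only for illustration. With just two flag columns, the weights are 1 and 2 rather than 1, 2, 4, 8.

```r
# Toy data copied from the quoted example (note: cluster 2 / G4 is 0 / 0 there).
realData <- data.frame(
  Cluster    = rep(1:2, each = 5),
  Gene       = rep(paste0("G", 1:5), times = 2),
  Mutated    = c(1, 1, 0, 0, 1, 0, 1, 1, 0, 1),
  NotMutated = c(0, 0, 1, 1, 0, 1, 0, 0, 0, 0)   # assumed name for "not-mutated"
)

# Share a reproducible sample: paste the dput() output into an email.
temp <- head(realData, 10)
dput(temp)

# Treat each 0/1 column as one binary digit of a single number.
flagCols <- c("Mutated", "NotMutated")        # assumed set of flag columns
weights  <- 2^(seq_along(flagCols) - 1)       # 1, 2 (then 4, 8, ... for more columns)
realData$tempColumn <- as.vector(as.matrix(realData[flagCols]) %*% weights)

# Rows with the same encoded value fall into the same group.
groups <- split(realData, realData$tempColumn)

# A value that is neither 0 nor a single weight means more than one flag is set.
valid <- realData$tempColumn %in% c(0, weights)

print(table(realData$tempColumn))
print(realData[!valid, ])   # rows that violate the "only one flag" constraint
```

This only groups rows by their flag pattern; it does not by itself test whether two genes are mutually exclusive across clusters.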