Hello there,

this is a cross-post of a Stack Overflow question that wasn't answered but is very important for my work. Apologies if this breaks any rules, but I do hope for some help from the list instead:

I have a huge matrix of pairwise similarity percentages between different samples. The samples belong to groups, which are determined by the suffix "_n" in the row and column names. As a first step, I wanted to create submatrices consisting of all pairs within single groups (e.g. all samples from "_1"). However, I realized that I need all pairwise submatrices, between all combinations of groups. So I want to create a list of vectors named "_n1 vs _n2" (or similar) for all combinations of n, as illustrated by the colored rectangles:

http://i.stack.imgur.com/XMkxj.png

Reproducible code, as provided by helpful Stack Overflow members, which handles only identical "_n"s (pairs within a single group):


df <- structure(list(
        HQ673618_1 = c(NA, 90.8, 89.8, 89.6, 89.8, 88.9, 87.8, 88.2, 88.3),
        HQ674317_1 = c(90.8, NA, 98.6, 97.7, 98.4, 97.4, 94.9, 96.2, 95.1),
        EU686630_1 = c(89.8, 98.6, NA, 98.4, 98.9, 97.7, 95.4, 96.4, 95.8),
        EU686593_2 = c(89.6, 97.7, 98.4, NA, 98.1, 96.8, 94.4, 95.6, 94.8),
        JN166322_2 = c(89.8, 98.4, 98.9, 98.1, NA, 97.5, 95.3, 96.5, 95.9),
        EU491340_2 = c(88.9, 97.4, 97.7, 96.8, 97.5, NA, 96.5, 97.7, 96),
        AB694259_3 = c(87.8, 94.9, 95.4, 94.4, 95.3, 96.5, NA, 98.3, 95.9),
        AB694258_3 = c(88.2, 96.2, 96.4, 95.6, 96.5, 97.7, 98.3, NA, 95.8),
        AB694462_3 = c(88.3, 95.1, 95.8, 94.8, 95.9, 96, 95.9, 95.8, NA)),
        .Names = c("HQ673618_1", "HQ674317_1", "EU686630_1", "EU686593_2",
        "JN166322_2", "EU491340_2", "AB694259_3", "AB694258_3", "AB694462_3"),
        class = "data.frame",
        row.names = c("HQ673618_1", "HQ674317_1", "EU686630_1", "EU686593_2",
        "JN166322_2", "EU491340_2", "AB694259_3", "AB694258_3", "AB694462_3"))


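        # Group label of each sample: everything after the last underscore
        indx <- gsub(".*_", "", names(df))
        # One square submatrix per group (rows and columns from the same group)
        sub.matrices <- lapply(unique(indx), function(x) {
          temp <- which(indx %in% x)
          df[temp, temp]
        })
        # Keep each within-group pair once (upper triangle, diagonal excluded)
        unique_values <- lapply(sub.matrices, function(x) x[upper.tri(x)])
        names(unique_values) <- unique(indx)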

This code needs to be extended so that it builds sub.matrices for every combination of the unique indices in indx, not only for identical ones.
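
To make the target more concrete, here is a rough sketch of the kind of result I am after. It is only one possible approach, not a tested solution for the real data: pairwise_group_values is a hypothetical helper name, the small toy data frame is made up for illustration, and it assumes a symmetric matrix with at least two groups. For two different groups the block is rectangular, so every cell is kept; for a group against itself only the upper triangle is kept, as in the code above.

```r
# Hypothetical helper: one vector of similarity values per pair of groups.
pairwise_group_values <- function(df) {
  # Group label of each sample: everything after the last underscore
  indx <- gsub(".*_", "", names(df))
  groups <- unique(indx)

  # Within-group pairs: upper triangle of each square block
  within <- lapply(groups, function(g) {
    block <- as.matrix(df[indx == g, indx == g, drop = FALSE])
    block[upper.tri(block)]
  })
  names(within) <- paste0("_", groups, " vs _", groups)

  # Between-group pairs: every unordered pair of distinct groups
  # (combn() needs at least two groups)
  pair_list <- combn(groups, 2, simplify = FALSE)
  between <- lapply(pair_list, function(p) {
    block <- as.matrix(df[indx == p[1], indx == p[2], drop = FALSE])
    as.vector(block)  # rectangular block, so all cells are distinct pairs
  })
  names(between) <- vapply(pair_list, function(p) {
    paste0("_", p[1], " vs _", p[2])
  }, character(1))

  c(within, between)
}

# Made-up toy data: two groups with two samples each
toy <- data.frame(a_1 = c(NA, 90, 80, 70),
                  b_1 = c(90, NA, 85, 75),
                  c_2 = c(80, 85, NA, 95),
                  d_2 = c(70, 75, 95, NA))
rownames(toy) <- names(toy)

res <- pairwise_group_values(toy)
```

Called on the df above, pairwise_group_values(df) should give one list element per rectangle in the picture, e.g. res[["_1 vs _2"]] for the block between groups 1 and 2.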


Thank you so much!




--
Tim Richter-Heitmann (M.Sc.)
PhD Candidate



International Max-Planck Research School for Marine Microbiology
University of Bremen
Microbial Ecophysiology Group (AG Friedrich)
FB02 - Biologie/Chemie
Leobener Straße (NW2 A2130)
D-28359 Bremen
Tel.: 0049(0)421 218-63062
Fax: 0049(0)421 218-63069
