Your expanded explanation helps clarify your intent. Herewith some comments. Of course, feel free to ignore and not respond. And, as always, my apologies if I have failed to comprehend your intent.
1. I would avoid any notion of "statistical significance" like the plague. This is a purely exploratory exercise.

2. My understanding is that you want to know the proportion of rows in a pair of columns/vectors in which exactly one value of the pair is 1, out of the number of rows in which at least one value is 1. In R syntax, this is simply sum(xor(x, y)) / sum(x | y), where x and y are two columns of 1's and 0's. Better yet might be to report both this *and* sum(x | y) to help you judge "meaningfulness". Here is a simple function that does this:

## first, define a function that does the above calculation:
assoc <- \(z) {
   x <- z[, 1]; y <- z[, 2]
   n <- sum(x | y)
   c(prop = sum(xor(x, y)) / n, N = n)
}

## now a function that uses it for the various column combinations:
somecor <- function(dat, func = assoc) {
   dat <- as.matrix(dat)
   indx <- seq_len(ncol(dat))
   rbind(w <- combn(indx, 2),
         combn(indx, 2, FUN = \(m) func(dat[, m]))) |>
      t() |>
      round(digits = 2) |>
      `dimnames<-`(list(rep.int('', ncol(w)), c("", "", "prop", "N")))
}

## now apply it to your example data:
somecor(dat)
## which gives
        prop  N
  1  2  0.67  6
  1  3  0.60  5
  1  4  0.57  7
  2  3  0.60  5
  2  4  0.33  6
  3  4  0.71  7

This seems more interpretable and directly useful to me. Bigger values of prop for bigger N are the more interesting ones, assuming I have interpreted you correctly.

Cheers,
Bert
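For a single pair of columns the calculation is easy to check by hand. A minimal sketch using the g1 and g2 columns of the example data (the 0.67 and 6 in the first row above):

x <- c(1, 0, 0, 1, 1, 1, 0, 0, 0)   # g1 from the example data
y <- c(0, 1, 0, 1, 0, 1, 1, 0, 0)   # g2 from the example data
n <- sum(x | y)                     # rows where at least one of the pair is 1: 6
prop <- sum(xor(x, y)) / n          # rows where exactly one of the pair is 1, as a share: 4/6
c(prop = prop, N = n)               # matches the 0.67 and 6 reported for the 1-2 pair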
On Sat, Jul 27, 2024 at 12:54 PM Yuan Chun Ding <ycd...@coh.org> wrote:

> Hi Richard,
>
> Nice to know you had a similar experience.
> Yes, your understanding is right: all correlations are negative after removing the double-zero rows.
> It is consistent with a heatmap we generated.
> 1 means a cancer patient carries a specific mutation; 0 means no mutation of that mutation type in a patient.
> A pair of mutation types (two different mutations) is exclusive for most patients in the heatmap or oncoplots.
> If we include all 1000 patients, 900 of whom have no mutation in either mutation type, then the correlation is not significant at all.
> But eyeballing the heatmap (oncoplot) of mutation (row) by patient (column), the mutations are exclusive for most patients, so I want to measure how strong the exclusivity is between two specific mutation types across those patients with at least one of the two mutation types.
> Then I want to put the mutation pairs with strong negative correlations in the top rows, ordered by the negative correlation values.
>
> Regarding a final application, there may be some use for my case. If one develops two drugs specific to two negatively correlated mutations, drug treatment for cancer patients is usually given only to those patients carrying the specific mutation, so it is informative to know how strong the negative correlation is when considering different combinations of treatment strategies.
>
> Ding
>
> From: R-help <r-help-boun...@r-project.org> On Behalf Of Richard O'Keefe
> Sent: Saturday, July 27, 2024 4:47 AM
> To: Bert Gunter <bgunter.4...@gmail.com>
> Cc: r-help@r-project.org
> Subject: Re: [R] please help generate a square correlation matrix
>
> Curses, my laptop is hallucinating again. Hope I can get through this.
>
> So we're talking about correlations between binary variables.
> Suppose we have two 0-1-valued variables, x and y.
> Let A <- sum(x * y)    # number of cases where x and y are both 1
> Let B <- sum(x) - A    # number of cases where x is 1 and y is 0
> Let C <- sum(y) - A    # number of cases where y is 1 and x is 0
> Let D <- sum(!x * !y)  # number of cases where x and y are both 0
> (also D = length(x) - A - B - C)
>
> All the information is summarised in the 2-by-2 contingency table.
>
> Some years ago, Nathan Rountree and I supervised Yun-Sing Koh's data-mining PhD. She surveyed the data-mining literature and found some 37 different "interestingness measures" for two-variable associations -- if I remember correctly; there were a lot of them. They fell into a much smaller number of qualitatively similar groups.
>
> At any rate, the Pearson correlation between x and y is
>
>    (A*D - B*C) / sqrt((A+B) * (C+D) * (A+C) * (B+D))
>
> So what happens when we delete the rows where x = 0 and y = 0? Right, it forces D to 0, leaving A, B and C unchanged. And looking at the numerator: with D = 0 it becomes A*0 - B*C = -B*C, which can never be positive. So if you delete the rows with x = 0 and y = 0, you MUST get a negative correlation (or zero, if B or C happens to be zero).
>
> Quite a modest "true" correlation (based on all the data) like -0.2 can masquerade as quite a strong "zero-suppressed" correlation like -0.6. Even +0.2 can turn into -0.4. (These figures are from a particular simulation run and may not apply in your case.)
>
> Now, one of the reasons why Yun-Sing Koh, Nathan Rountree, and I were interested in interestingness measures is perhaps coincidentally related to the file-drawer/under-reporting problem: it's quite common for rows where x = 0 and y = 0 never to have been reported to you, so we were hoping there were measures immune to that. I have argued for years that "till record analysis" for supermarkets &c is badly flawed by two facts: (a) it is hard to measure how much of a product people WOULD have bought if only you had offered it for sale (although you can make educated guesses), and (b) till records provide no evidence on what the people who walked out without buying anything wanted (was the price too high? could they not find it?). Problem (a) leads to a commercial variant of the Signor-Lipps effect: "when x and/or y were available for purchase" is not the same as "the period for which data were recorded", thus inflating D, perhaps massively. Methods developed for handling the Signor-Lipps effect in paleontology can be used to estimate when x and y were available, helping you to recover a more realistic N = A+B+C+D. I really should have published that.
>
> All of which is a long-winded way of saying that
> - Pearson correlations on binary columns can be computed very efficiently
> - the rows with x = 0 and y = 0 may be very informative, even essential for the analysis
> - delete them at your peril
> - really, delete them at your peril.
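The forced sign change is easy to see numerically. A minimal sketch with simulated sparse 0/1 columns (the seed, sample size and 10% rate are illustrative choices, not from the thread):

set.seed(1)                     # illustrative seed
n <- 1000
x <- rbinom(n, 1, 0.10)         # sparse 0/1 column, about 10% ones
y <- rbinom(n, 1, 0.10)         # drawn independently of x, so the true correlation is ~0
cor(x, y)                       # full-data correlation: expected to be close to 0 here
keep <- x == 1 | y == 1         # drop the double-zero rows, i.e. force D to 0
cor(x[keep], y[keep])           # cannot be positive: the numerator is now A*0 - B*C = -B*C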
> On Sat, 27 Jul 2024 at 23:07, Richard O'Keefe <rao...@gmail.com> wrote:
>
> Let's go back to the original posting.
>
> > in each column, less than 10% of the values are 1, most of them are 0;
> > so I want to remove the rows with a value of zero in both columns when calculating the correlation between two columns.
>
> So we're talking about correlations between binary variables.
> Suppose we have two 0-1-valued variables, x and y.
> Let A <- sum(x * y)    # number of cases where x and y are both 1
> Let B <- sum(x) - A    # number of cases where x is 1 and y is 0
> Let C <- sum(y) - A    # number of cases where y is 1 and x is 0
> Let D <- sum(!x * !y)  # number of cases where x and y are both 0
>
> N
>
> On Fri, 26 Jul 2024 at 12:07, Bert Gunter <bgunter.4...@gmail.com> wrote:
>
> If I have understood the request, I'm not sure that omitting the all-0 rows for each pair of columns makes much sense, but be that as it may, here's another way to do it, using the 'FUN' argument of combn to encapsulate any calculations that you do. I just use cor() as the calculation -- you can use anything you like that takes two vectors of 0's and 1's and produces fixed-length numeric results (or from which you can extract such).
>
> I encapsulated it all in a little function. Note that I first converted the data frame to a matrix. Because of their generality, data frames carry a lot of extra baggage that can slow purely numeric manipulations down.
>
> Anyway, here's the function, 'somecors' (I'm a bad name picker :( ! )
>
> somecors <- function(dat, func = cor) {
>    dat <- as.matrix(dat)
>    indx <- seq_len(ncol(dat))
>    combn(indx, 2, FUN = \(z) {
>       i <- z[1]; j <- z[2]
>       k <- dat[, i] | dat[, j]
>       c(z, func(dat[k, i], dat[k, j]))
>    })
> }
>
> Results come out as a matrix with choose(ncol(dat), 2) columns, the first 2 rows giving the pair of column numbers for each column, and then 1 or more rows (possibly extracted) from whatever func you use. Here are the results for your data, formatted to 2 decimal places:
>
> > round(somecors(dat), 2)
>      [,1]  [,2]  [,3]  [,4] [,5]  [,6]
> [1,]  1.0  1.00  1.00  2.00    2  3.00
> [2,]  2.0  3.00  4.00  3.00    4  4.00
> [3,] -0.5 -0.41 -0.35 -0.41   NA -0.47
> Warning message:
> In func(dat[k, i], dat[k, j]) : the standard deviation is zero
>
> The NA and the warning come from the (2, 4) pair of columns because, after removing the all-zero rows for that pair, dat[, 4] is all 1's, giving a zero in the denominator of the cor() calculation -- again, assuming I have correctly understood your request. If so, this might be something you need to worry about.
>
> Again, feel free to ignore if I have misinterpreted or this does not suit.
>
> Cheers,
> Bert
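If the warning itself is a nuisance, the function plugged in as 'func' can guard against a constant column and return NA quietly. A minimal sketch of one possible wrapper (the name safe_cor is illustrative only):

safe_cor <- function(x, y) {
   # after the zero-row removal a column can end up constant (all 1's);
   # return NA silently instead of triggering the zero-sd warning from cor()
   if (sd(x) == 0 || sd(y) == 0) return(NA_real_)
   cor(x, y)
}

round(somecors(dat, func = safe_cor), 2)   # same results as above, without the warning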
> On Thu, Jul 25, 2024 at 2:01 PM Rui Barradas <ruipbarra...@sapo.pt> wrote:
>
> On 25/07/2024 at 20:47, Yuan Chun Ding wrote:
>
> Hi Rui,
>
> You are always very helpful!! Thank you.
> I just modified your R code to remove the rows with zero values in both columns of a pair, as below, for my real data.
>
> Ding
>
> dat <- gene22mut.coded
> r <- P <- matrix(NA, nrow = 22L, ncol = 22L,
>                  dimnames = list(names(dat), names(dat)))
>
> for(i in 1:22) {
>    x <- dat[[i]]
>    for(j in 1:22) {
>       if(i == j) {
>          # there's nothing to test, assign correlation 1
>          r[i, j] <- 1
>       } else {
>          tmp <- cbind(x, dat[[j]])
>          row0 <- rowSums(tmp)
>          tmp2 <- tmp[row0 != 0, ]          # drop rows where both columns are 0
>          tmp3 <- cor.test(tmp2[, 1], tmp2[, 2])
>          r[i, j] <- tmp3$estimate
>          P[i, j] <- tmp3$p.value
>       }
>    }
> }
> r <- as.data.frame(r)
> P <- as.data.frame(P)
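Given the stated goal of ranking mutation pairs by how negative their correlation is, the square r and P results can be flattened to one row per pair and sorted. A minimal sketch, assuming the r and P data frames produced by the loop above:

rmat <- as.matrix(r)                      # back to matrices for upper.tri() indexing
Pmat <- as.matrix(P)
ut <- upper.tri(rmat)                     # each unordered pair once, diagonal excluded
pairs <- data.frame(
   mut1 = rownames(rmat)[row(rmat)[ut]],
   mut2 = colnames(rmat)[col(rmat)[ut]],
   cor  = rmat[ut],
   P    = Pmat[ut]
)
pairs[order(pairs$cor), ]                 # most strongly negative (most "exclusive") pairs first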
> From: R-help <r-help-boun...@r-project.org> On Behalf Of Yuan Chun Ding via R-help
> Sent: Thursday, July 25, 2024 11:26 AM
> To: Rui Barradas <ruipbarra...@sapo.pt>; r-help@r-project.org
> Subject: Re: [R] please help generate a square correlation matrix
>
> Hi Rui,
>
> Thank you for the help!
>
> You did not remove the rows with zero values in both columns of a pair, right?
>
> Ding
>
> From: Rui Barradas <ruipbarra...@sapo.pt>
> Sent: Thursday, July 25, 2024 11:15 AM
> To: Yuan Chun Ding <ycd...@coh.org>; r-help@r-project.org
> Subject: Re: [R] please help generate a square correlation matrix
>
> On 25/07/2024 at 17:39, Yuan Chun Ding via R-help wrote:
>
> Hi R users,
>
> I generated a square correlation matrix for the dat data frame below:
>
> dat <- data.frame(g1 = c(1,0,0,1,1,1,0,0,0),
>                   g2 = c(0,1,0,1,0,1,1,0,0),
>                   g3 = c(1,1,0,0,0,1,0,0,0),
>                   g4 = c(0,1,0,1,1,1,1,1,0))
> library("Hmisc")
> dat.rcorr = rcorr(as.matrix(dat))
> dat.r <- round(dat.rcorr$r, 2)
>
> However, I want to modify this correlation calculation.
> My dat has more than 1000 rows and 22 columns;
> in each column, less than 10% of the values are 1, most of them are 0;
> so I want to remove the rows with a value of zero in both columns when calculating the correlation between two columns.
> I just want to check whether those values of 1 are correlated between two columns.
> Please look at my code in the following:
>
> ## (assumes the dplyr and tidyr packages are loaded, for %>%, filter() and pivot_wider())
> cor.4gene <- matrix(0, nrow = 4*4, ncol = 4)
> for (i in 1:4) {
>    for (j in 1:4) {
>       d <- dat[, c(i, j)] %>%
>          filter(eval(as.symbol(colnames(dat)[i])) != 0 |
>                 eval(as.symbol(colnames(dat)[j])) != 0)
>       c <- cor.test(d[, 1], d[, 2])
>       cor.4gene[i*j, ] <- c(colnames(dat)[i], colnames(dat)[j],
>                             c$estimate, c$p.value)
>    }
> }
> cor.4gene <- as.data.frame(cor.4gene) %>% filter(V1 != 0)
> colnames(cor.4gene) <- c("gene1", "gene2", "cor", "P")
>
> Can you tell me what mistakes I made?
> First, why is cor NA for the correlation of g1 with g1? I thought it should be 1.
> cor.4gene$cor[is.na(cor.4gene$cor)] <- 1
> cor.4gene$cor[is.na(cor.4gene$P)] <- 0
> cor.4gene.sq <- pivot_wider(cor.4gene, names_from = gene1, values_from = cor)
>
> Second, this line of code did not generate a square matrix like the one the Hmisc library produced.
> How do I fix my code?
>
> Thank you,
>
> Ding
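As to the first question: after dropping the rows in which both columns are zero, the g1-vs-g1 comparison keeps only the rows where g1 is 1, so the remaining vector is constant, its standard deviation is zero, and cor()/cor.test() return NA rather than 1. (Note also that the row index i*j in the loop above is not unique -- for example i = 2, j = 2 and i = 1, j = 4 both write to row 4 -- so some pairs overwrite others, which also affects the later reshaping.) A minimal sketch with the example data:

g1 <- c(1, 0, 0, 1, 1, 1, 0, 0, 0)   # g1 from the example data
keep <- g1 != 0 | g1 != 0            # the g1-vs-g1 case: drop rows where both copies are 0
g1[keep]                             # 1 1 1 1 -- a constant vector
cor(g1[keep], g1[keep])              # NA, with a "the standard deviation is zero" warning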
> Hello,
>
> You are complicating the code; there's no need for as.symbol/eval, the column numbers do exactly the same thing.
>
> # create the two results matrices beforehand
> r <- P <- matrix(NA, nrow = 4L, ncol = 4L,
>                  dimnames = list(names(dat), names(dat)))
>
> for(i in 1:4) {
>    x <- dat[[i]]
>    for(j in 1:4) {
>       if(i == j) {
>          # there's nothing to test, assign correlation 1
>          r[i, j] <- 1
>       } else {
>          tmp <- cor.test(x, dat[[j]])
>          r[i, j] <- tmp$estimate
>          P[i, j] <- tmp$p.value
>       }
>    }
> }
>
> # these two results are equal up to floating-point precision
> dat.rcorr$r
> #>           g1        g2        g3        g4
> #> g1 1.0000000 0.1000000 0.3162278 0.1581139
> #> g2 0.1000000 1.0000000 0.3162278 0.6324555
> #> g3 0.3162278 0.3162278 1.0000000 0.0000000
> #> g4 0.1581139 0.6324555 0.0000000 1.0000000
>
> r
> #>           g1        g2           g3           g4
> #> g1 1.0000000 0.1000000 3.162278e-01 1.581139e-01
> #> g2 0.1000000 1.0000000 3.162278e-01 6.324555e-01
> #> g3 0.3162278 0.3162278 1.000000e+00 1.355253e-20
> #> g4 0.1581139 0.6324555 1.355253e-20 1.000000e+00
>
> # these two results are equal up to floating-point precision
> dat.rcorr$P
> #>           g1         g2        g3         g4
> #> g1        NA 0.79797170 0.4070838 0.68452834
> #> g2 0.7979717         NA 0.4070838 0.06758329
> #> g3 0.4070838 0.40708382        NA 1.00000000
> #> g4 0.6845283 0.06758329 1.0000000         NA
>
> P
> #>           g1         g2        g3         g4
> #> g1        NA 0.79797170 0.4070838 0.68452834
> #> g2 0.7979717         NA 0.4070838 0.06758329
> #> g3 0.4070838 0.40708382        NA 1.00000000
> #> g4 0.6845283 0.06758329 1.0000000         NA
>
> You can put these two results in a list, like Hmisc::rcorr does.
>
> lst_rcorr <- list(r = r, P = P)
>
> Hope this helps,
>
> Rui Barradas
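The "equal up to floating-point precision" claim can be checked directly. A minimal sketch, assuming dat.rcorr, r and P from the code above; check.attributes = FALSE ignores any dimname or class differences:

all.equal(dat.rcorr$r, r, check.attributes = FALSE)   # should be TRUE
all.equal(dat.rcorr$P, P, check.attributes = FALSE)   # should be TRUE; the NA diagonals match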
> Hello,
>
> Here are two other ways.
>
> The first is equivalent to your long-format attempt.
>
> library(tidyverse)
>
> # the data frame of all ordered column-name pairs, reused when binding the results
> tmp_df <- expand.grid(names(dat), names(dat))
>
> tmp_df %>%
>    apply(1L, \(x) {
>       tmp <- dat[rowSums(dat[x]) > 0, ]
>       tmp2 <- cor.test(tmp[[ x[1L] ]], tmp[[ x[2L] ]])
>       c(tmp2$estimate, P = tmp2$p.value)
>    }) %>%
>    t() %>%
>    as.data.frame() %>%
>    cbind(tmp_df, .) %>%
>    na.omit()
>
> The second is, in my opinion, the one that makes more sense. If you look at the results, cor is symmetric (as it should be), so the calculations are repeated. If you only run the cor.tests on the combinations of names(dat) taken 2 at a time, it saves a lot of work. But the output is a much smaller data frame.
>
> cbind(
>    combn(names(dat), 2L) %>%
>       t() %>%
>       as.data.frame(),
>    combn(dat, 2L, FUN = \(d) {
>       d2 <- d[rowSums(d) > 0, ]
>       tmp2 <- cor.test(d2[[1L]], d2[[2L]])
>       c(tmp2$estimate, P = tmp2$p.value)
>    }) %>% t()
> ) %>% na.omit()
>
> Hope this helps,
>
> Rui Barradas
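The columns of that smaller output come back with generic names; giving the result the column names used in the original post and ordering the rows by the correlation puts the most strongly negative (most mutually exclusive) pairs on top. A minimal sketch that saves the result of the combn() pipeline above under the illustrative name res:

res <- cbind(
   combn(names(dat), 2L) |> t() |> as.data.frame(),
   combn(dat, 2L, FUN = \(d) {
      d2 <- d[rowSums(d) > 0, ]
      tmp2 <- cor.test(d2[[1L]], d2[[2L]])
      c(tmp2$estimate, P = tmp2$p.value)
   }) |> t()
) |> na.omit()

names(res) <- c("gene1", "gene2", "cor", "P")   # the column names used in the original post
res[order(res$cor), ]                           # most strongly negative pairs first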
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.