Is it because I failed to add a column of ones for an intercept to the x matrix? That would be my bad.
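For what it is worth, the intercept explanation is easy to check on made-up data. The sketch below is only an illustration under stated assumptions: `purity`, `genes`, and the sizes are simulated stand-ins, not the poster's data, and nothing here confirms that this is what happened in the original run. It compares residuals from lm(), which adds an intercept by default, against lm.fit(), which uses the x matrix exactly as supplied, with and without an explicit column of ones.

```r
# Simulated stand-ins for the poster's data (all names and sizes are hypothetical)
set.seed(1)
n <- 50
purity <- runif(n)                                   # plays the role of purity2
genes <- matrix(rnorm(n * 3, mean = 5), nrow = n, ncol = 3,
                dimnames = list(NULL, c("g1", "g2", "g3")))
genes[, 1] <- genes[, 1] + 2 * purity                # give one gene a real dependence

# Reference: lm() adds the intercept itself; a matrix response fits every column at once
res_lm <- residuals(lm(genes ~ purity))

# lm.fit() with the regressor only: this is a regression through the origin
res_no_int <- residuals(lm.fit(x = matrix(purity, ncol = 1), y = genes))

# lm.fit() with an explicit column of ones for the intercept
X <- cbind(Intercept = 1, purity = purity)
res_int <- residuals(lm.fit(x = X, y = genes))

# Compare against lm(): the intercept version should agree up to rounding
all.equal(unname(res_lm), unname(res_int))

# Column-wise correlations between lm() and the no-intercept residuals
diag(cor(res_lm, res_no_int))
```

If the fit with the column of ones agrees with lm() while the one without it does not, then passing `cbind(1, purity2)` instead of `pur2` to lm.fit() in the code from this thread would presumably bring the two sets of residuals into line.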
-- Bert

On Sat, Aug 10, 2024 at 12:59 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
>
> Probably because you inadvertently ran different models. Without your code, I
> haven't a clue.
>
> On Sat, Aug 10, 2024, 12:29 Yuan Chun Ding <ycd...@coh.org> wrote:
>>
>> Hi Bert and Ben,
>>
>> Yes, running lm.fit using the matrix format is much faster. I read a couple
>> of online comments on why it is faster.
>>
>> However, the residual values for three tested variables or genes from the lm
>> function and the lm.fit function are different, with Pearson correlations of
>> 0.55, 0.89, and 0.99.
>>
>> I have not found the reason.
>>
>> Thanks,
>>
>> Ding
>>
>> From: Bert Gunter <bgunter.4...@gmail.com>
>> Sent: Friday, August 9, 2024 7:11 PM
>> To: Ben Bolker <bbol...@gmail.com>
>> Cc: Yuan Chun Ding <ycd...@coh.org>; r-help@r-project.org
>> Subject: Re: [R] a fast way to do my job
>>
>> Better idea, Ben!
>> It would work as you might expect, producing the same results as
>> the above:
>>
>> ## first make sure your regressor is a matrix:
>> pur2 <- matrix(purity2, ncol = 1)
>> ## convert the data frame variables into a matrix
>> dat <- as.matrix(gem751be.rpkm[ , 74:35164])
>> ## then
>> result <- residuals(lm.fit(x = pur2, y = dat))
>>
>> Cheers,
>> Bert
>>
>> On Fri, Aug 9, 2024 at 6:38 PM Ben Bolker <bbol...@gmail.com> wrote:
>> >
>> > You can also fit a linear model with a matrix-valued response
>> > variable, which should be even faster (not sure off the top of my head
>> > how to get the residuals and reshape them to the dimensions you want)
>> >
>> > On Fri, Aug 9, 2024 at 9:31 PM Bert Gunter <bgunter.4...@gmail.com> wrote:
>> > >
>> > > See ?lm.fit.
>> > > I must be missing something, because:
>> > >
>> > > results <- sapply(74:35164, \(i) residuals(lm.fit(purity2,
>> > > gem751be.rpkm[, i] )))
>> > >
>> > > would give you a 751 x 35091 matrix of the residuals from each of the
>> > > regressions.
>> > > I assume it will be considerably faster than all the overhead you are
>> > > carrying in your current code, but of course you'll have to try it and
>> > > see. ... Assuming that I have interpreted your request correctly.
>> > > Ignore if not.
>> > >
>> > > Cheers,
>> > > Bert
>> > >
>> > > On Fri, Aug 9, 2024 at 4:50 PM Yuan Chun Ding via R-help
>> > > <r-help@r-project.org> wrote:
>> > > >
>> > > > Dear R users,
>> > > >
>> > > > I am running the code below. gem751be.rpkm is a
>> > > > data frame with dimensions of 751 samples by 35164 variables: 73 phenotypic
>> > > > variables in the first to 73rd columns and 35091 genomic variables or
>> > > > genes in the 74th to 35164th columns.
>> > > > What I need to do is to calculate the residuals for each gene using the
>> > > > simple linear regression model of genelist[i] ~ purity2;
>> > > >
>> > > > The following code is running, but it takes a long time even though I have an
>> > > > expensive ThinkStation Windows computer.
>> > > > Can you provide a fast way to do it?
>> > > >
>> > > > Thank you,
>> > > >
>> > > > Ding
>> > > >
>> > > > ---------------------------------------------------------------------------------
>> > > >
>> > > > gem751be.rpkm <- merge(gem751be10, as.data.frame(t(rna849.fpkm2)),
>> > > >                        by.x = "id2", by.y = 0)
>> > > > row.names(gem751be.rpkm) <- gem751be.rpkm$id3
>> > > > colnames(gem751be.rpkm) <- gsub(colnames(gem751be.rpkm), pattern = "-", replacement = "_")
>> > > > genelist <- gem751be.rpkm %>% dplyr::select(74:35164)
>> > > > residuals <- NULL
>> > > > for (i in 1:length(genelist)) {
>> > > >   #i=1
>> > > >   formula <- reformulate("purity2", response = names(genelist)[i])
>> > > >   model <- lm(formula, data = gem751be.rpkm)
>> > > >   resi <- as.data.frame(residuals(model))
>> > > >   colnames(resi)[1] <- names(genelist)[i]
>> > > >   resi <- as.data.frame(t(resi))
>> > > >   residuals <- rbind(residuals, resi)
>> > > > }

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.