Re: [R] Extracting information from lm results (multiple model runs)

R. Michael Weylandt Mon, 15 Aug 2011 08:27:33 -0700

Two ways to do it, both of which come down to forcing R to recognize "(" as
a string element than an actual parenthesis:


x = data.frame(1:3, 4:6)
colnames(x) = c("(Intercept)","Slope")

x$"(Intercept)"

or x[,"(Intercept)"] # Works with or without the comma in slightly different
ways -- figure out which you need

Alternatively, you can change the name to just "Intercept": this isn't the
most efficient way, but it does avoid the problem of having to know which
column is "(Intercept)"

colnames(x) <- ifelse(colnames(x) == "(Intercept)","Intercept",colnames(x))

Michael

On Mon, Aug 15, 2011 at 10:15 AM, Johannes Radinger <jradin...@gmx.at>wrote:

> Hello,
>
> thank you...it works!
>
> > results.dataframe <- as.data.frame(t(as.data.frame(results.model_runs)))
> > is.data.frame(results.dataframe)
> [1] TRUE
>
> that works...
> just: how to call columns in the dataframe with names like (Intercept),
> rT()...there must be a special handling of names with brackets.
>
> /johannes
>
>
>
> -------- Original-Nachricht --------
> > Datum: Mon, 15 Aug 2011 09:39:47 -0400
> > Von: "R. Michael Weylandt" <michael.weyla...@gmail.com>
> > An: Johannes Radinger <jradin...@gmx.at>
> > CC: r-help@r-project.org
> > Betreff: Re: [R] Extracting information from lm results (multiple model
> runs)
>
> > I can't help with the questions on pooling regressions, but for the
> > question
> > about data-frame usage, I'll note two things.
> >
> > One,
> >
> > > x = data.frame(y = 1:3, z = 4:6)
> > > is.data.frame(x)
> > TRUE
> >
> > > is.data.frame(t(x))
> > FALSE
> >
> > > is.data.frame(as.data.frame(t(x)))
> > TRUE
> >
> > Two, you'll need to check that your row and column names for the
> > transposed
> > data frame are what you expect them to be.
> >
> > Once you do both,(   results2 = as.data.frame(t(results)) & names ) ,
> > results2$intercept should work fine.
> >
> > Hope this helps,
> >
> > Michael Weylandt
> >
> > On Mon, Aug 15, 2011 at 8:32 AM, Johannes Radinger <jradin...@gmx.at>
> > wrote:
> >
> > > Just to inform:
> > > I posted that before in R-sig-ecology but as it might be interesting
> > also
> > > for other useRs, I post it also to the general r-user list:
> > >
> > >
> > > Hello Alexandre,
> > >
> > > thank you very much. I also found another way to extract summarizing
> > > information from lm results over e.g. 1000 repeated model runs:
> > >
> > > results2 <- t(as.data.frame(results))
> > > summary(results2)
> > >
> > > Although some questions popped up in my analysis (procedure see below)
> :
> > >
> > > 1) it seems that I can't get an element with the usual
> > > command like results2$intercept ... that doesn't work here, why?
> > > That would be useful for plotting an element like R-squared etc.
> > > It can be done with e.g hist(results2[,1]) but then I've to know which
> > > column is which coefficient.
> > >
> > > 2) is there a correct way to get a "mean" p - value or any
> > > other mean of the regression coefficients? What is the statistically
> > proper
> > > way to calculate summaries for regression coefficients of many model
> > runs
> > > Or can I just use the mean from the summary? Is there any restriction
> > when
> > > using a log transformation like: lm(log(Y)=log(X1)+X2())?
> > >
> > > 3) How can I check for normality of the residuals in my overall model?
> > > I know how to run lillie.test(resid(lm.model)). But how to do that in
> my
> > > case with 1000 model runs? Perform the test for each run and get an
> > average
> > > or pool the residuals?
> > >
> > > 4)with single lm-runs it is possible to compare the results of
> > > two models with an anova...how can I compare the results of lets
> > > say two models with each 1000 model runs?
> > >
> > >
> > > Thank you very much
> > >
> > > /johannes
> > >
> > >
> > > -------- Original-Nachricht --------
> > > > Datum: Mon, 15 Aug 2011 08:33:52 -0300
> > > > Von: Alexandre Camargo Martensen <acmarten...@gmail.com>
> > > > An: Johannes Radinger <jradin...@gmx.at>
> > > > Betreff: Re: [R-sig-eco] Extracting information from lm results
> > >
> > > > Hi Johannes,
> > > >
> > > > try this
> > > >
> > > > results.df<-data.frame(results)
> > > > #results.df
> > > >
> > > > p<-t(results.df[4,])
> > > > r<-t(results.df[5,])
> > > >
> > > > mean(p)
> > > > mean(r)
> > > >
> > > > HTH,
> > > >
> > > > Alexandre
> > > >
> > > >
> > > > On Mon, Aug 15, 2011 at 8:07 AM, Johannes Radinger <jradin...@gmx.at
> >
> > > > wrote:
> > > >
> > > > > Hello Peter,
> > > > >
> > > > > I just found a mail list correspondence from you in R-sig-eco and
> > > > > found your information and code snipped about extracting
> information
> > > > from
> > > > > lm results very useful. I do have a similar challenge to perform
> > > several
> > > > > lm-regression and in the end I'd like to get e.g a mean value for
> > > > R-squared
> > > > > or p or any other of the regression coefficients.
> > > > >
> > > > > So far I managed to write following code example:
> > > > >
> > > > > #Defining variables
> > > > > Y=c(15,14,23,18,19,9,19,13)
> > > > > X1=c(0.2,0.6,0.45,0.27,0.6,0.14,0.1,0.52)
> > > > > X2a=c(17,22,21,18,19,25,8,19)
> > > > > X2b=c(22,22,29,34,19,26,17,22)
> > > > > X2 <- function()runif(length(X2a), X2a, X2b)
> > > > >
> > > > > #create empty list
> > > > > mod <- list()
> > > > > #perfrom n-model runs for lm
> > > > > for(i in 1:10) {
> > > > >        mod[[paste("run",i,sep="")]] <- lm(Y~X1+X2())
> > > > >        }
> > > > >
> > > > > #define coeficients and how to extract them from lm
> > > > > All_Model_runs <- function(lm){
> > > > >        out <- c(lm$coefficients[1],
> > > > >        lm$coefficients[2],
> > > > >        summary(lm)$coefficients[2,2],
> > > > >        pf(summary(lm)$fstatistic[1], summary(lm)$fstatistic[2],
> > > > >        summary(lm)$fstatistic[3], lower.tail = FALSE),
> > > > >        summary(lm)$r.squared)
> > > > >        names(out) <-
> > > > > c("intercept","slope","slope.SE","p.value","r.squared")
> > > > >        return(out)
> > > > >        }
> > > > >
> > > > > #create empty list
> > > > > results <- list()
> > > > > #get coeffiencts for all model runs into list: results
> > > > > for (i in 1:length(mod)) results[[names(mod)[i]]] <-
> > > > > All_Model_runs(mod[[i]])
> > > > >
> > > > >
> > > > > Now my questions you might probably be able to answer:
> > > > >
> > > > > 1) Is that so far good what I was doing (according to your code)?
> > > > > 2) How can I get a mean/average p value for example or an average
> > > > R-squared
> > > > > over all model runs?
> > > > > 3) Is there any special function to call a mean of these
> > coefficients?
> > > > >
> > > > > Thank you very much
> > > > > /Johannes
> > > > >
> > > > > --
> > > > >
> > > > > _______________________________________________
> > > > > R-sig-ecology mailing list
> > > > > r-sig-ecol...@r-project.org
> > > > > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> > > > >
> > >
> > > --
> > >
> > >
> > >
> > >
> > > --
> > >
> > > ______________________________________________
> > > R-help@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
>
> --
> NEU: FreePhone - 0ct/min Handyspartarif mit Geld-zurück-Garantie!
> Jetzt informieren: http://www.gmx.net/de/go/freephone
>

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Extracting information from lm results (multiple model runs)

Reply via email to