Hi all


I am hoping this is just a minor problem, I am trying to implement a best 
subsets regression procedure on some ecological datasets using the regsubsets 
function in the leaps package. The dataset contains 43 predictor variables plus 
the response (logcount) all in a dataframe called environment. I am 
implementing it as follows:

library(leaps)


subsets<-regsubsets(logcount~.,data=environment,nvmax=10,nbest=2,really.big=FALSE,method="exhaustive")

###the subset regression runs fine when i run it as above and i can get all the 
usual summaries

###The problem comes when i try and get it to output the variance convariance 
matric by adding the option vcov=TRUE
##When I do that i get the following:

subsets<-regsubsets(logcount~.,data=environment,nvmax=10,nbest=2,really.big=FALSE,method="exhaustive",vcov=TRUE)

Error in model.frame.default(data = environment, vcov = TRUE, formula = 
logcount~  : 
  variable lengths differ (found for '(vcov)')

> traceback()
6: model.frame.default(data = environment, vcov = TRUE, formula = logcount~ 
       .)
5: model.frame(data = environment, vcov = TRUE, formula = logcount~ .)
4: eval(expr, envir, enclos)
3: eval(mm, sys.frame(sys.parent()))
2: regsubsets.formula(logcount~ ., data = environment, nvmax = 10, really.big = 
TRUE, 
       method = "exhaustive", nbest = 5, vcov = TRUE)
1: regsubsets(obs ~ ., data = environment, nvmax = 10, really.big = TRUE, 
       method = "exhaustive", nbest = 5, vcov = TRUE)

I get the same error when i try and add the all.best=TRUE option (as ideally i 
would like it to report the fits of all the subsets).

All the predictor variables and the response are the same length (143) so not 
sure if I am misinterpreting the error or have misspecified the regsubsets?

I was wanting to get the variance - covariance matrix as I believe I need it to 
calculate the Variance Inflation Factors (VIFs) for each of the models reported 
by regsubsets. As I want to exclude any models that exhibit multicollinerarity 
from later analysis, I am hoping to select say the 'best' 10 models and 
bootstrap them to find out more about how they perform.

Or am I going about this all the wrong way? is there away to calulate vifs from 
regsubsets or pass it directly to something that calculates them e.g. VIF in 
the car package?

Any help will be most appreciated, many thanks in advance

Lara

lara.har...@bbsrc.ac.uk

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to