On 06/08/2010 05:29 AM, Mark Seeto wrote:
On 06/06/2010 10:49 PM, Mark Seeto wrote:
Hello,
I have a couple of questions about the ols function in Frank Harrell's rms package.
Is there any way to specify variables by their column number in the data
frame rather than by the variable name?
For example,
library(rms)
x1 <- rnorm(100, 0, 1)
x2 <- rnorm(100, 0, 1)
x3 <- rnorm(100, 0, 1)
y <- x2 + x3 + rnorm(100, 0, 5)
d <- data.frame(x1, x2, x3, y)
rm(x1, x2, x3, y)
lm(y ~ d[,2] + d[,3], data = d) # This works
ols(y ~ d[,2] + d[,3], data = d) # Gives error
Error in if (!length(fname) || !any(fname == zname)) { :
missing value where TRUE/FALSE needed
However, this works:
ols(y ~ x2 + d[,3], data = d)
The reason I want to do this is to program variable selection for
bootstrap model validation.
A related question: does ols allow "y ~ ." notation?
lm(y ~ ., data = d[, 2:4]) # This works
ols(y ~ ., data = d[, 2:4]) # Gives error
Error in terms.formula(formula) : '.' in formula and no 'data' argument
Thanks for any help you can give.
Regards,
Mark
Hi Mark,
It appears that you answered the questions yourself. rms wants real
variables or transformations of them; it makes certain assumptions
about the names of model terms. The y ~ . form should work, though;
I'll have a look at that sometime.
But these are small questions compared with what you really want. Why
do you need variable selection at all, i.e., what is wrong with keeping
insignificant variables in a model? If you do need variable selection,
see whether backwards stepdown works for you. It is built into the
rms bootstrap validation and calibration functions.
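For example (a minimal sketch; B = 200 is an arbitrary choice, and
x = TRUE, y = TRUE are needed so validate can resample):

library(rms)
fit <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)
validate(fit, method = "boot", B = 200, bw = TRUE)  # bootstrap validation with backward stepdown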
Frank
Thank you for your reply, Frank. I would have reached the conclusion
that rms only accepts real variables had this not worked:
ols(y ~ x2 + d[,3], data = d)
Hi Mark - that probably worked by accident.
The reason I want to program variable selection is so that I can use the
bootstrap to check the performance of a model-selection method. My
co-workers and I have used a variable selection method which combines
forward selection, backward elimination, and best subsets (the forward and
backward methods were run using different software).
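In R, that combination would look roughly like this (a sketch only; as
noted, we actually ran the forward and backward steps in other software):

# forward selection starting from the null model
null <- lm(y ~ 1, data = d)
fwd  <- step(null, scope = ~ x1 + x2 + x3, direction = "forward")
# backward elimination starting from the full model
full <- lm(y ~ x1 + x2 + x3, data = d)
bwd  <- step(full, direction = "backward")
# best subsets, using the leaps package
library(leaps)
best <- summary(regsubsets(y ~ x1 + x2 + x3, data = d))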
I want to do bootstrap validation to (1) check the over-optimism in R^2,
and (2) justify using a different approach, if R^2 turns out to be very
over-optimistic. The different approach would probably be data reduction
using variable clustering, as you describe in your book.
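Something along these lines, using varclus from the Hmisc package (an
untested sketch, with the toy data above standing in for ours):

library(Hmisc)
vc <- varclus(~ x1 + x2 + x3, data = d, similarity = "spearman")
plot(vc)  # inspect the cluster tree before choosing representatives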
The validate.ols function, which calls the predab.resample function,
may give you some code to start with. Note, however, that the
performance of the approach you are suggesting has already been shown
to be poor in many cases. You might run the following in parallel:
full model fits, and penalized least squares with the penalty selected
by AIC (using special arguments to ols along with the pentrace function).
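Roughly (a sketch; the penalty grid is arbitrary):

library(rms)
fit <- ols(y ~ x1 + x2 + x3, data = d, x = TRUE, y = TRUE)
p <- pentrace(fit, penalty = seq(0, 20, by = 0.5))  # choose the penalty by AIC
fit.pen <- update(fit, penalty = p$penalty)         # refit with the chosen penalty
fit.pen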
Frank
Regards,
Mark
--
Frank E Harrell Jr, Professor and Chairman
Department of Biostatistics, School of Medicine, Vanderbilt University