[R] Predictably puzzled.

Rolf Turner Fri, 19 Nov 2021 18:12:54 -0800


Consider the following toy example:


    set.seed(42)
    y <- rnorm(20)
    x <- rnorm(20)
    y[c(3,5,14,15)] <- NA
    fit <- lm(y~x)
    predict(fit)

For some reason that escapes me, this does not provide predicted
values for the cases where the response/dependent variable is missing,
even though there are no missing values in the predictor/independent
variable.
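
The behaviour described above can be checked directly. The sketch below reuses the toy data and assumes R's default handling of missing values, na.omit (passed explicitly here so that the result does not depend on options()). Under that setting lm() drops the incomplete rows before fitting, and predict(fit) returns values only for the rows that were kept. The names p and dropped are introduced purely for illustration.

```r
set.seed(42)
y <- rnorm(20)
x <- rnorm(20)
y[c(3, 5, 14, 15)] <- NA

# na.omit is the usual default; it is spelled out here as an assumption
fit <- lm(y ~ x, na.action = na.omit)

p <- predict(fit)
length(p)   # 16 rather than 20
names(p)    # labels of the retained cases only

# Row indices that lm() dropped because y was missing
dropped <- as.integer(na.action(fit))
dropped     # 3 5 14 15
```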

I can get predicted values for all values of x if I set

    ddd <- data.frame(y=y,x=x)

and execute

    predict(fit,newdata=ddd)

Note that y is (unnecessarily) included in ddd.  I thought that
predict() might omit any rows of ddd in which there are missing
values, but it does not.
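
A quick check of this workaround, as a sketch on the same toy data: when newdata is supplied, predict() for an lm fit evaluates only the right-hand side of the formula, so a data frame holding x alone should give the same 20 predictions. The names p_all and p_x are hypothetical and used only for this illustration.

```r
set.seed(42)
y <- rnorm(20)
x <- rnorm(20)
y[c(3, 5, 14, 15)] <- NA
fit <- lm(y ~ x)

# Workaround from the post: y is present in newdata but not needed
ddd <- data.frame(y = y, x = x)
p_all <- predict(fit, newdata = ddd)

# Same call with a data frame that contains only the predictor
p_x <- predict(fit, newdata = data.frame(x = x))

length(p_all)           # one prediction per value of x
anyNA(p_all)            # the missing y values do not propagate
all.equal(p_all, p_x)   # both calls agree
```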

OK.  I have a workaround which gives me the predicted values that I
want, but:

(a) Why does predict() behave in this way?  It makes no sense to me,
but I figure there *must* be a rationale.

(b) Is there a way of getting predict() to behave as I would like, by
specifying an appropriate value for na.action?  I could not find such
an appropriate value.
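
On (b), one candidate that can be tested is na.exclude, supplied to lm() rather than to predict(). As far as I can tell, this makes predict() return a vector of full length, but the positions where y is missing are padded with NA instead of being filled with predictions, so it does not appear to give the behaviour wanted here. The names fit_ex and p_ex are hypothetical.

```r
set.seed(42)
y <- rnorm(20)
x <- rnorm(20)
y[c(3, 5, 14, 15)] <- NA

# Fit with na.exclude instead of the default na.omit
fit_ex <- lm(y ~ x, na.action = na.exclude)

p_ex <- predict(fit_ex)
length(p_ex)                  # full length, 20
unname(which(is.na(p_ex)))    # padded with NA where y was missing
```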

Thanks for any enlightenment.

cheers,

Rolf Turner

-- 
Honorary Research Fellow
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276
