Re: [R] Tabulating Baseline Characteristics on specific observations

William Dunlap Tue, 20 Sep 2011 17:21:08 -0700

You could use the na.action function on the fitted
object to see which observations were omitted.  E.g.,
let's make a data.frame that we can actually do some
regressions with and try na.action():


  > d <- data.frame(V1=11:15, V2=log(c(1,NA,NA,4,5)), V3=sqrt((-1):3),
  +                 V4=sin(1:5))
  Warning message:
  In sqrt((-1):3) : NaNs produced
  > d
    V1       V2       V3         V4
  1 11 0.000000      NaN  0.8414710
  2 12       NA 0.000000  0.9092974
  3 13       NA 1.000000  0.1411200
  4 14 1.386294 1.414214 -0.7568025
  5 15 1.609438 1.732051 -0.9589243
  > fit12 <- lm(V1 ~ V2, data=d, na.action=na.omit)
  > if (length(na.action(fit12))>0) d[-na.action(fit12), ] else d
    V1       V2       V3         V4
  1 11 0.000000      NaN  0.8414710
  4 14 1.386294 1.414214 -0.7568025
  5 15 1.609438 1.732051 -0.9589243
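
Since the original question was about column means on the estimation
sample, here is a minimal sketch of that last step. The helper name
used_rows and the variable baseline_means are made up for illustration,
and the sketch assumes every column of the data.frame is numeric:

```r
# Rebuild the example data from the transcript above
# (the NaN warning from sqrt() is expected, so it is silenced here).
d <- suppressWarnings(data.frame(V1 = 11:15,
                                 V2 = log(c(1, NA, NA, 4, 5)),
                                 V3 = sqrt((-1):3),
                                 V4 = sin(1:5)))
fit12 <- lm(V1 ~ V2, data = d, na.action = na.omit)

# Hypothetical helper: the rows of `data` that the fit actually used.
used_rows <- function(fit, data) {
  omitted <- na.action(fit)   # NULL when no rows were dropped
  if (length(omitted) > 0) {
    data[-as.integer(omitted), , drop = FALSE]
  } else {
    data
  }
}

used <- used_rows(fit12, d)

# Column means over the estimation sample. na.rm = TRUE is needed because
# columns that were not in the model (V3 here) can still hold NA/NaN.
baseline_means <- colMeans(used, na.rm = TRUE)
```

Note that the means for non-model columns are then based on fewer rows
than the regression used, which may or may not be what you want.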

You can also call na.action on the output of na.omit (or
na.exclude) itself, but then you have to remember which
variables were in the model.
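
A rough sketch of that second route, without fitting anything. The name
model_vars is an assumption for illustration and has to be kept in sync
with the model formula by hand. The complete.cases() line is a different
technique that should select the same rows:

```r
# Same example data as above (NaN warning from sqrt() silenced).
d <- suppressWarnings(data.frame(V1 = 11:15,
                                 V2 = log(c(1, NA, NA, 4, 5)),
                                 V3 = sqrt((-1):3),
                                 V4 = sin(1:5)))

# Variables that appear in the model; maintained manually (assumption).
model_vars <- c("V1", "V2")

# na.omit() records the dropped row indices in an "na.action" attribute,
# which na.action() retrieves.
cc <- na.omit(d[, model_vars])
dropped <- na.action(cc)

# Keep all columns, but only the rows that survived na.omit().
used <- if (length(dropped) > 0) d[-as.integer(dropped), ] else d

# Alternative: build a logical row filter from the model columns only.
used_cc <- d[complete.cases(d[, model_vars]), ]
```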

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On 
> Behalf Of justin jarvis
> Sent: Tuesday, September 20, 2011 4:38 PM
> To: David Winsemius
> Cc: [email protected]
> Subject: Re: [R] Tabulating Baseline Characteristics on specific observations
> 
> That still discards the other data columns.  For example, in the data frame
> 
>   V1 V2 V3 V4
> 1  1  1 NA  1
> 2  1 NA  1  1
> 3  1 NA  1  1
> 4  1  1  1  1
> 5  1  1  1  1
> 
> Suppose I was running a regression using V1 and V2.  R will remove rows 2
> and 3 due to the "NA."  I would like a way to look at only the observations
> used for the regression, the data frame:
> 
>   V1 V2 V3 V4
> 1  1  1 NA  1
> 4  1  1  1  1
> 5  1  1  1  1
> 
> If I run na.omit(subset(dataframe, select= c(V1,V2))) it returns
> 
>   V1 V2
> 1  1  1
> 4  1  1
> 5  1  1
> 
> Sorry for being unclear the previous time.
> 
> Justin
> 
> On Tue, Sep 20, 2011 at 4:54 AM, David Winsemius 
> <[email protected]>wrote:
> 
> >
> > On Sep 19, 2011, at 8:49 PM, justin jarvis wrote:
> >
> >  I have a data set with many missing observations.  When I run a
> >> regression, R of course discards the observations (the whole row) that
> >> have "NA".  I want to tabulate some baseline characteristics (column
> >> means) but only for the observations that R used for the regression.
> >> I tried to recreate this data frame by using na.omit on the original
> >> data frame, but this will not work as this will discard an observation
> >> with an "NA" in any column, and not just in the covariates.
> >>
> >> In summary, I only want to remove observations that have an "NA" in
> >> the covariate columns.  Something like Stata's e(sample), as far as I
> >>
> >
> > na.omit(subset(dfrm, select= <covariate-vector> ))  # or equivalent
> >
> >  can tell.
> >>
> >> Justin Jarvis
> >> PhD student, University of California, Irvine
> >>
> >> ______________________________________________
> >> [email protected] mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
> > David Winsemius, MD
> > West Hartford, CT
> >
> >
> 

