It's difficult to help you if we don't know what the data looks like. Two more tips : - look at ?with instead of attach(). The latter causes a lot of trouble further on. - use require() instead of library(). See also the help files on this.
I also wonder what you're doing with the na.rm=TRUE. I don't see how it affects the function, as it is used nowhere. Does it really remove NA? This said: estfun takes the estimation function from your model object. Your model is essentially fit on all observations that are not NA, whereas I presume your cluster contains all observations, including NA. Now I don't know what kind of model fm represents, but normally there is information in the model object about the removed observations. I illustrate this using lm : > x <- rnorm(100) > y <- c(rnorm(90),NA,rnorm(9)) > test <- lm(x~y) > str(test) List of 13 ... (tons of information) $ model :'data.frame': 99 obs. of 2 variables: ... (more tons of information) ..- attr(*, "na.action")=Class 'omit' Named int 91 .. .. ..- attr(*, "names")= chr "91" - attr(*, "class")= chr "lm" Now we know that we can do : > not <-attr(test$model,"na.action") > y[-not] So try this ### NOT TESTED cl <- function(dat, na.rm = TRUE, fm, cluster){ require(sandwich) require(lmtest) not <- attr(fm$model,"na.action") cluster <- cluster[-not] with( dat ,{ M <- length(unique(cluster)) N <- length(cluster) K <- fm$rank dfc <- (M/(M-1))*((N-1)/(N-K)) uj <- data.frame(apply(estfun(fm),2, function(x) data.frame(tapply(x, cluster, sum)) ) ); vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N) coeftest(fm, vcovCL) } ) } Cheers Joris On Wed, Jun 9, 2010 at 12:06 AM, edmund jones <edmund.j.jo...@gmail.com> wrote: > Hi, > > I am relatively new to R; when creating functions, I run into problems with > missing values. I would like my functions to ignore rows with missing values > for arguments of my function) in the analysis (as for example is the case in > STATA). Note that I don't want my function to drop rows if there are missing > arguments elsewhere in a row, ie for variables that are not arguments of my > function. > > As an example: here is a clustering function I wrote: > > > cl <- function(dat, na.rm = TRUE, fm, cluster){ > > attach( dat , warn.conflicts = F) > > library(sandwich) > > library(lmtest) > > M <- length(unique(cluster)) > > N <- length(cluster) > > K <- fm$rank > > dfc <- (M/(M-1))*((N-1)/(N-K)) > > uj <- data.frame(apply(estfun(fm),2, function(x) data.frame(tapply(x, > cluster, sum)) ) ); > > vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N) > > coeftest(fm, vcovCL) > > } > > > When I run my function, I get the message: > > > Error in tapply(x, cluster, sum) : arguments must have same length > > > If I specify instead attach(na.omit(dat), warn.conflicts = F) and don't > have the "na.rm=TRUE" argument, then my function runs; but only for the rows > where there are no missing values AT ALL; however, I don't care if there are > missing values for variables on which I am not applying my function. > > > For example, I have information on children's size; if I want regress scores > on age and parents' education, clustering on class, I would like missing > values in size not to interfere (ie if I have scores, age, parents' > education, and class, but not size, I don't want to drop this observation). > > > I tried to look at the code of "lm" to see how the na.action part works, but > I couldn't figure it out... This is exactly how I would like to deal with > missing values. > > > I tried to write > > cl <- function(dat, fm, cluster, na.action){ > > attach( dat , warn.conflicts = F) > > library(sandwich) > > library(lmtest) > > M <- length(unique(cluster)) > > N <- length(cluster) > > K <- fm$rank > > dfc <- (M/(M-1))*((N-1)/(N-K)) > > uj <- data.frame(apply(estfun(fm),2, function(x) data.frame(tapply(x, > cluster, sum)) ) ); > > vcovCL <- dfc*sandwich(fm, meat=crossprod(uj)/N) > > coeftest(fm, vcovCL) > > } > > attr(cl,"na.action") <- na.exclude > > > but it still didn't work... > > > Any ideas of how to deal with this issue? > > Thank you for your answers! > > Edmund > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- Joris Meys Statistical consultant Ghent University Faculty of Bioscience Engineering Department of Applied mathematics, biometrics and process control tel : +32 9 264 59 87 joris.m...@ugent.be ------------------------------- Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.