Dear all
I have written a function to perform a very simple but useful task which I
do regularly. It is designed to show how many values are missing from each
variable in a data.frame. It works in its current form, but it is slow
because I use three separate loops over the columns for this simple task.
Can anyone see a more efficient way to get the same results? Or is there
an existing function which does this?
Thanks for your help
Tim
Function:
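miss <- function (data)
{
    # Tabulate missing vs. non-missing values for each variable
    miss.list <- vector("list", length(data))
    for (i in seq_along(data)) {
        miss.list[[i]] <- table(is.na(data[i]))
    }
    # Where both FALSE and TRUE occur, keep only the TRUE (missing) count
    for (i in seq_along(miss.list)) {
        if (length(miss.list[[i]]) == 2) {
            miss.list[[i]] <- miss.list[[i]][2]
        }
    }
    # Where only FALSE occurs, nothing is missing, so record zero
    for (i in seq_along(miss.list)) {
        if (names(miss.list[[i]]) == "FALSE") {
            miss.list[[i]] <- 0
        }
    }
    # One row per variable: its name and its count of missing values
    data.frame(names(data), as.numeric(miss.list))
}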
Example:
data(ToothGrowth)
data.m <- ToothGrowth
data.m$supp[sample(1:nrow(data.m), size=25)] <- NA
miss(data.m)
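
For comparison, here is a minimal sketch of a loop-free version. It relies on is.na() returning a logical matrix when given a data.frame, so that colSums() can count the TRUE values in each column. The name miss2 and the column names "variable" and "missing" are hypothetical, and set.seed() is only there to make the example repeatable.

```r
# Hypothetical loop-free variant of miss(); the name miss2 is an assumption.
miss2 <- function(data) {
    # is.na() on a data.frame gives a logical matrix;
    # colSums() then counts the TRUE (missing) entries per column
    n.missing <- colSums(is.na(data))
    data.frame(variable = names(data), missing = as.numeric(n.missing))
}

# Same example as above, with a fixed seed for repeatability
data.m <- ToothGrowth
set.seed(1)
data.m$supp[sample(seq_len(nrow(data.m)), size = 25)] <- NA
miss2(data.m)
```

This should give one row per variable, as miss() does, but without building and patching a list of tables first.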