Quick question: I tried to find a function in the available packages that finds NAs for an entire data set (or for single variables) and reports the rows of the missing values for each column. I searched the typical routes, the blogs and the help manuals, for 15 minutes. Rather than spend any more time searching, I wrote my own function to do this (probably in less time than it would have taken me to find an existing one).

Now I still have the same question: does this function (I call it NAhunter) already exist? If so, please direct me to it (because I'm sure its authors have written better, more efficient code). I highly doubt I'm the first person to want to find all the missing values in a data set, so I assume there is a function for it and I just didn't spend enough time looking.

If there is no existing function (big if here), is this something people feel would be worthwhile for me to put into a package of some sort?

Tyler

Here's the code:

NAhunter <- function(dataset) {
  # Summarise one variable and list the case numbers of its missing values.
  find.NA <- function(variable) {
    n <- length(variable)
    NAs <- is.na(variable)
    total.NA <- sum(NAs)
    percent.missing <- total.NA / n       # a proportion between 0 and 1
    missing.cases <- unname(which(NAs))   # case (row) numbers of the NAs
    if (is.numeric(variable)) {
      # Numeric variables also get mean, median and sd (NAs excluded).
      descriptives <- data.frame(n,
                                 mean = mean(variable, na.rm = TRUE),
                                 median = median(variable, na.rm = TRUE),
                                 sd = sd(variable, na.rm = TRUE),
                                 total.NA, percent.missing)
      rownames(descriptives) <- " "
      list("NUMERIC DATA", "DESCRIPTIVES" = t(descriptives),
           "CASE # OF MISSING VALUES" = missing.cases)
    } else {
      descriptives <- data.frame(n, total.NA, percent.missing)
      rownames(descriptives) <- " "
      list("CATEGORICAL DATA", "DESCRIPTIVES" = t(descriptives),
           "CASE # OF MISSING VALUES" = missing.cases)
    }
  }
  dataset <- data.frame(dataset)
  # Note: these change the print options for the whole session.
  options(scipen = 100)
  options(digits = 2)
  lapply(dataset, find.NA)
}
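
For comparison, here is a minimal sketch of the core lookups (count of NAs per column, and the rows they fall in) using only base R. The data frame `df` and its values are hypothetical, made up purely for illustration; `is.na`, `colSums` and `which(..., arr.ind = TRUE)` are base R functions. This is not meant as a replacement for the function above, just the smallest version of the same idea:

```r
# Hypothetical example data (not from any real data set)
df <- data.frame(
  age   = c(23, NA, 31, 45, NA),
  group = c("a", "b", NA, "a", "b")
)

# Number of missing values in each column
na.counts <- colSums(is.na(df))

# Row numbers of the missing values, one list element per column
na.rows <- lapply(df, function(x) which(is.na(x)))

# Row/column position of every NA, collected in a single matrix
na.positions <- which(is.na(df), arr.ind = TRUE)

print(na.counts)
print(na.rows)
print(na.positions)
```

The per-column list from `lapply` is the closest match to the "CASE # OF MISSING VALUES" element that NAhunter returns; the `arr.ind = TRUE` form may be handier when the positions are needed as one table.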
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.