Dear List,

I am trying to sub-sample some data by taking one data point every x minutes.
The data contain missing values, and I would like to take the sub-sample that
maximizes the number of valid points, i.e. the one that minimizes the number
of NAs in the sample.

For example, given the following:

da<-seq(Sys.time(),by=1,length.out=10)
x<-c(1,2,NA,4,NA,6,NA,8,9,10)
mydata<-data.frame(da,x)

If I wanted to take a subsample every 2 seconds, I would have the following two
possible answers, depending on which row I start from:

answer1: 2,4,6,8,10
answer2: 1,NA,NA,NA,9

I would like a function that would choose between these and obtain the one with 
the fewest missing values.
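
To make the request concrete, here is a rough sketch of the kind of function I 
have in mind. The name best_offset and its arguments are made up for 
illustration. It assumes the rows are evenly spaced with no gaps in the time 
stamps, and that step is no larger than the length of the vector:

```r
# Hypothetical helper: try every possible starting offset (1..step) and
# return the one whose subsample contains the fewest NAs.
# Assumes evenly spaced observations and step <= length(x).
best_offset <- function(x, step) {
  offsets <- seq_len(step)
  n_missing <- sapply(offsets, function(k) {
    sum(is.na(x[seq(k, length(x), by = step)]))
  })
  offsets[which.min(n_missing)]  # the first offset wins on ties
}

x <- c(1, 2, NA, 4, NA, 6, NA, 8, 9, 10)
k <- best_offset(x, step = 2)
sub <- x[seq(k, length(x), by = 2)]
```

This only looks at a single vector and tries every offset by brute force, which 
may or may not be the best approach.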

In my real dataset I have multiple variables collected every second and I would 
like to subsample it every 5, 10, and 15 minutes.
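
For the multi-variable case, something along these lines is what I am picturing. 
This is only a sketch: subsample_df, the column names x and y, and the simulated 
data are all invented for illustration, and it assumes exactly one row per 
second so that 5 minutes corresponds to a step of 5 * 60 rows:

```r
# Hypothetical helper: for each starting offset, count the NAs across all
# listed columns in the rows that would be kept, then return the subsample
# with the smallest count. Assumes one row per second with no gaps.
subsample_df <- function(df, vars, step) {
  row_na <- rowSums(is.na(df[, vars, drop = FALSE]))
  n_missing <- sapply(seq_len(step), function(k) {
    sum(row_na[seq(k, nrow(df), by = step)])
  })
  k <- which.min(n_missing)  # the first offset wins on ties
  df[seq(k, nrow(df), by = step), , drop = FALSE]
}

# Simulated one-hour example; the real data would replace this.
set.seed(1)
n <- 3600
big <- data.frame(da = seq(Sys.time(), by = 1, length.out = n),
                  x = rnorm(n), y = rnorm(n))
big$x[seq(1, n, by = 300)] <- NA  # make starting at row 1 a poor choice
sub5 <- subsample_df(big, c("x", "y"), step = 5 * 60)
```

The same call with step = 10 * 60 or step = 15 * 60 would cover the other two 
intervals. Counting NAs summed over all variables is just one possible 
criterion; counting rows with any NA would be another.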

I appreciate your help.

Tim

Tim Clark
Department of Zoology 
University of Hawaii
