On Wed, 2009-04-01 at 16:49 +0100, Jose Iparraguirre D'Elia wrote:
> Dear all,
>
> Say I have the following dataset:
>
> > DF
>       x   y   z
> [1]   1   1   1
> [2]   2   2   2
> [3]   3   3  NA
> [4]   4  NA   4
> [5]  NA   5   5
>
> And I want to omit all the rows which have NA, but only in columns X and Y,
> so that I get:
>
>  x  y  z
>  1  1  1
>  2  2  2
>  3  3  NA
>
> If I use na.omit(DF), I would delete the row for which z=NA, obtaining thus
>
>  x  y  z
>  1  1  1
>  2  2  2
>
> But this is not what I want, of course.
> If I use na.omit(DF[,1:2]), then I obtain
>
>  x  y
>  1  1
>  2  2
>  3  3
>
> which is OK for x and y columns, but I wouldn't get the corresponding values
> for z (ie 1 2 NA)
>
> Any suggestions about how to obtain the desired results efficiently (the
> actual dataset has millions of records and almost 50 columns, and I would
> apply the procedure on 12 of these columns)?
>
> Sincerely,
>
> Jose Luis
>
> Jose Luis Iparraguirre
> Senior Research Economist
> Economic Research Institute of Northern Ireland
>

Hi Jose Luis,

I think this script is sufficient for your problem:

    tab <- matrix(c(1,1,1,2,2,2,3,3,NA,4,NA,4,NA,5,5), ncol=3, byrow=TRUE)
    tab[!is.na(tab[,1]) & !is.na(tab[,2]), ]

-- 
Bernardo Rangel Tura, M.D,MPH,Ph.D
National Institute of Cardiology
Brazil

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
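
The reply builds a matrix and chains one is.na() test per column. With 12 columns to check on a data frame, that chain gets long, so one possible alternative is complete.cases() from base R, applied only to the columns of interest. The sketch below is not from the thread: the data frame is rebuilt from the question's example, and the names check_cols, keep and result are made up for illustration. It has not been timed on data with millions of rows.

```r
# Toy data frame mirroring the example in the question.
DF <- data.frame(
  x = c(1, 2, 3, 4, NA),
  y = c(1, 2, 3, NA, 5),
  z = c(1, 2, NA, 4, 5)
)

# Columns that must be non-missing (hypothetical name; the real data
# would list its 12 columns here).
check_cols <- c("x", "y")

# complete.cases() returns TRUE for each row with no NA in the columns
# passed to it; drop = FALSE keeps a data frame even for a single column.
keep <- complete.cases(DF[, check_cols, drop = FALSE])

# Subset the full data frame, so z is carried along even where it is NA.
result <- DF[keep, ]
print(result)
```

Rows 1 to 3 are kept, including the row where z is NA, since z is not among the checked columns. Extending the check to more columns only means adding their names to check_cols.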