Dear Bill and Phil,

Many thanks for your help; your solutions worked perfectly.

Bill: I did not specify whether the data was a matrix or a data.frame because it is in fact the expression data in an eset object (Biobase).

Thank you so much again!

Hind

On Sat, Feb 26, 2011 at 4:34 PM, William Dunlap <wdun...@tibco.com> wrote:
> You didn't say if your data set was a matrix or data.frame.
> Here are 2 functions that do the job on either and one that
> only works with data.frames, but is faster (a similar speedup
> is available for matrices as well).  They all compute the
> number of small values in each row, nSmall, and extract the
> rows for which nSmall is less than 2.
>
> f0 <- function (x) {
>     # row-by-row count of values with absolute value <= 1.58
>     nSmall <- apply(x, 1, function(row) sum(abs(row) <= 1.58))
>     x[nSmall < 2, , drop = FALSE]
> }
> f1 <- function (x) {
>     # vectorized count over the whole matrix or data.frame
>     nSmall <- rowSums(abs(x) <= 1.58)
>     x[nSmall < 2, , drop = FALSE]
> }
> f2 <- function (x) {
>     stopifnot(is.data.frame(x))
>     # accumulate the count one column at a time
>     nSmall <- 0
>     for (column in x) {
>         nSmall <- nSmall + (abs(column) <= 1.58)
>     }
>     x[nSmall < 2, , drop = FALSE]
> }
>
> For a 10^5 row by 50 column data.frame I got the
> following times:
>
> > system.time(r0 <- f0(z))
>    user  system elapsed
>    2.39    0.04    2.51
> > system.time(r1 <- f1(z))
>    user  system elapsed
>    0.42    0.08    0.51
> > system.time(r2 <- f2(z))
>    user  system elapsed
>    0.21    0.05    0.24
> > identical(r0, r1) && identical(r0, r2)
> [1] TRUE
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>> -----Original Message-----
>> From: r-help-boun...@r-project.org
>> [mailto:r-help-boun...@r-project.org] On Behalf Of hind lazrak
>> Sent: Saturday, February 26, 2011 3:37 PM
>> To: r-help@r-project.org
>> Subject: [R] how to remove rows in which 2 or more
>> observations are smaller than a given threshold?
>>
>> Hello
>>
>> The data set I am examining has 7425 observations (rows with unique
>> identifiers) and 46 samples (columns).
>>
>> I have been trying to generate a dataset that filters out observations
>> that are "negligible".
>> The definition of "negligible" is absolute value less than or
>> equal to 1.58.
>>
>> The rule that I would like to adopt to create a new data set is: drop rows
>> in which 2 or more observations have absolute values <= 1.58.
>>
>> Since I have a unique identifier per row, I have tried to reshape the
>> data so I could create a new variable using an ifelse statement that
>> would flag observations <= 1.58, but I am not getting anywhere with this
>> approach.
>>
>> I could not come up with an apply function that counts the number of
>> observations for which the absolute values are below the cutoff I've
>> specified.
>>
>> All observations are numerical and I don't have missing values.
>>
>>
>> Thank you in advance for the help,
>>
>> Hind
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
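
The rule discussed in this thread (drop rows in which 2 or more values have
absolute value <= 1.58) can be checked on a small toy matrix. What follows is
only a minimal sketch, not code from the thread: the helper name keep_rows and
its arguments cutoff and max_small are hypothetical, and the toy data are made
up for illustration. It assumes all values are numeric with no missing values,
as stated in the original question.

```r
# Hypothetical helper (not from the thread): keep the rows that have fewer
# than `max_small` entries whose absolute value is <= `cutoff`.
keep_rows <- function(x, cutoff = 1.58, max_small = 2) {
  m <- as.matrix(x)                      # accepts a matrix or a numeric data.frame
  n_small <- rowSums(abs(m) <= cutoff)   # number of "negligible" values per row
  x[n_small < max_small, , drop = FALSE]
}

# Made-up toy data: 4 observations (rows) by 3 samples (columns).
m <- matrix(c(2.0,  3.0, -4.0,    # no small values               -> kept
              0.5,  3.0,  2.0,    # one small value               -> kept
              0.5, -1.0,  2.0,    # two small values              -> dropped
              1.58, 1.58, 5.0),   # two values exactly at cutoff  -> dropped
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("id", 1:4), paste0("s", 1:3)))

kept <- keep_rows(m)
print(rownames(kept))   # "id1" "id2"

# Assumption: for a Biobase ExpressionSet named `eset` (not defined here),
# the same count could presumably be taken on exprs(eset), for example:
#   keep <- rowSums(abs(exprs(eset)) <= 1.58) < 2
#   eset_filtered <- eset[keep, ]
```

The last row of the toy data is there to show that values exactly equal to the
cutoff count as negligible, which is why the comparison uses <= rather than <.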