Hi Berend,

Thank you very much for pointing out the mistake and for your patience. I have corrected the script, which now works fine.

Regards,
Pradip Muhuri

________________________________________
From: Berend Hasselman [b...@xs4all.nl]
Sent: Thursday, November 22, 2012 12:42 PM
To: Muhuri, Pradip (SAMHSA/CBHSQ)
Cc: r-help@r-project.org
Subject: Re: [R] Data Extraction - benchmark()

On 22-11-2012, at 18:20, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

> Hi Berend,
>
> I see you are one of the contributors to the rbenchmark package.
>
> I am sorry that I am bothering you again. I have tried to run your code
> (slightly tweaked) involving the benchmark function, and I am getting the
> following error message. What am I doing wrong?
>
>
> Error in benchmark(d1 <- s1(df), d2 <- s2(df), d3 <- s3(df), d4 <- s4(df), :
>   could not find function "s1"
>

Because you haven't defined a function s1 (or s2, s3, s4 for that matter).
You did

s1 <- df[complete.cases(df),]

Berend

>>
>> identical (d1,d2), identical (d1,d3), identical (d1,d4), identical (d1,d5),
>> identical (d1,d6)
> Error: unexpected ',' in "identical (d1,d2),"
>
>> sessionInfo ()
> R version 2.15.1 (2012-06-22)
> Platform: i386-pc-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] rbenchmark_1.0.0
>
> loaded via a namespace (and not attached):
> [1] tools_2.15.1
>
>
>
> I would appreciate receiving your help if your time permits ..
>
>
> Thanks and regards,
>
> Pradip Muhuri
>
> ##### Berend's code extended
> N <- 100000
> set.seed(13)
> df<-data.frame(matrix(sample(c(1:10,NA),N, replace=TRUE),ncol=50))
> s1 <- df[complete.cases(df),]
> s2 <- na.omit(df)
> s3 <- df[apply(df, 1, function(x)all(!is.na(x))), ]
> s4 <- function(df) {df[apply(df, 1, function(x)all(!is.na(x))),][,1:ncol(df)]}
> s5 <- function(df) {df[!is.na(rowSums(df)),][1:ncol(df)]}
> s6 <- function(df) {df[complete.cases(df),][1:ncol(df)]}
>
> require(rbenchmark)
>
> benchmark( d1 <- s1(df), d2 <- s2(df), d3 <- s3(df), d4 <- s4(df), d5 <- s5(df), d6 <- s6(df),
>            columns=c("test","elapsed", "relative", "replications") )
>
> identical (d1,d2), identical (d1,d3), identical (d1,d4), identical (d1,d5),
> identical (d1,d6)
>
>
>
>
> ________________________________________
> From: Berend Hasselman [b...@xs4all.nl]
> Sent: Thursday, November 22, 2012 11:03 AM
> To: Muhuri, Pradip (SAMHSA/CBHSQ)
> Cc: r-help@r-project.org
> Subject: Re: [R] Data Extraction
>
> On 22-11-2012, at 16:50, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:
>
>> Hi Berend,
>>
>> You have compared all 3 ways. ... very nicely evaluated.
>>
>
> Bert's solution is indeed nice and simple.
> But Petr's solution is still the quickest:
>
>> N <- 100000
>> set.seed(13)
>> df <- data.frame(matrix(sample(c(1:10,NA),N,replace=TRUE),ncol=50))
>> library(rbenchmark)
>>
>> f1 <- function(df) {df[apply(df, 1, function(x)all(!is.na(x))),]}
>> f2 <- function(df) {df[!is.na(rowSums(df)),]}
>> f3 <- function(df) {df[complete.cases(df),]}
>> f4 <- function(df) {data.frame(na.omit(df))}
>> benchmark(d1 <- f1(df), d2 <- f2(df), d3 <- f3(df), d4 <- f4(df),
>>           columns=c("test","elapsed", "relative", "replications"))
>           test elapsed relative replications
> 1 d1 <- f1(df)   3.588   14.888          100
> 2 d2 <- f2(df)   0.403    1.672          100
> 3 d3 <- f3(df)   0.241    1.000          100
> 4 d4 <- f4(df)   0.557    2.311          100
>>
>> identical(d1,d2)
> [1] TRUE
>> identical(d1,d3)
> [1] TRUE
>> identical(d1,d4)
> [1] TRUE
>
> Berend
>
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
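
For anyone reading this thread in the archive: the corrected script was not posted, so the following is only a sketch of what a working version might look like, under two assumptions. First, every candidate is wrapped in function(df) so that benchmark() has something to call, which is the point Berend made about s1. Second, each identical() call stands alone rather than being chained with commas, which is what triggered the "unexpected ','" error. The names s1 to s4 follow the thread; the use of requireNamespace() as a guard and the choice of 10 replications are illustrative, not something from the original messages.

```r
# Sketch only: a possible corrected version of the script discussed above.
N <- 100000
set.seed(13)
df <- data.frame(matrix(sample(c(1:10, NA), N, replace = TRUE), ncol = 50))

# Each candidate must be a function; writing s1 <- df[complete.cases(df), ]
# stores a data frame, so s1(df) fails with 'could not find function "s1"'.
s1 <- function(df) df[complete.cases(df), ]
s2 <- function(df) data.frame(na.omit(df))   # wrapper as in Berend's f4
s3 <- function(df) df[apply(df, 1, function(x) all(!is.na(x))), ]
s4 <- function(df) df[!is.na(rowSums(df)), ]

# Compute the results once, outside the timing code.
d1 <- s1(df)
d2 <- s2(df)
d3 <- s3(df)
d4 <- s4(df)

# Comma-separated identical() calls are a syntax error at top level;
# collect the comparisons in one named logical vector instead.
same <- c(s2 = identical(d1, d2),
          s3 = identical(d1, d3),
          s4 = identical(d1, d4))
print(same)

# Time the candidates only if rbenchmark is installed.
if (requireNamespace("rbenchmark", quietly = TRUE)) {
  print(rbenchmark::benchmark(
    s1(df), s2(df), s3(df), s4(df),
    columns = c("test", "elapsed", "relative", "replications"),
    replications = 10
  ))
}
```

Timings will differ from the table above depending on the machine and R version, so the relative column is the figure to compare.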