You can study this yourself using the system.time() utility (lower-case "s"; R is case-sensitive): just wrap system.time() around any block of code and R will time it for you.
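As a rough sketch of what that looks like for your two examples: the data frame below is made-up random data standing in for one of your files, and the 200 repetitions are an arbitrary choice so the timings are not all zero.

```r
# Made-up data standing in for one file (assumption: numeric columns 2 and 3)
set.seed(1)
temp <- data.frame(id = 1:1000, a = rnorm(1000), b = rnorm(1000))

# Example1 style: intermediate variables colA and colB
timing1 <- system.time({
  for (iter in 1:200) {
    colA <- temp[, 2]
    colB <- temp[, 3]
    ttr <- t.test(colA, colB, var.equal = TRUE)
    p1 <- ttr$p.value
  }
})

# Example2 style (with $p.value added): everything in one call
timing2 <- system.time({
  for (iter in 1:200) {
    p2 <- t.test(temp[, 2], temp[, 3], var.equal = TRUE)$p.value
  }
})

# Each result shows user, system and elapsed time in seconds
print(timing1)
print(timing2)
```

The "elapsed" entry is the wall-clock time. Timings vary from run to run, so repeat the comparison a few times before reading anything into a small difference.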
Offhand, I'd guess Example2 may be ever so slightly quicker since it doesn't have to create colA and colB, but not to a degree that would be noticeable for reasonably sized data.

More importantly, you should notice that the two examples give different output: Example1 stores just the p.value of the t.test in tt_pvalue, while Example2 tries to store the entire t.test object. You probably meant

    Example2:
    tt_pvalue[i] <- t.test(temp[, j], temp[, k], var.equal=TRUE)$p.value

If you are a beginner, I'd strongly suggest you wait the extra 3.2 milliseconds and use code like Example1: it will be easier to debug.

In your second block of code, you wind up t-testing a column against itself many times (whenever j equals k), and because every result is written to tt_pvalue[i], each test overwrites the one before it, so only the last p.value computed for each file survives.

Is this actual code, or are you more interested in how something would be vectorized? If the former, write back and I'll talk to you about storing the results and doing the tests in a logical manner. If you are only interested from a coding-efficiency point of view, the first for loop over all the files is probably best replaced by

    L = lapply(files_to_test, read.table, header=TRUE, sep="\t")

This will create a list object L containing one data frame per file. List objects are basically R's way of sticking any combination of objects together in one big "super-object" that can contain anything. (I'm sure the code experts will want to correct me, but for a beginner I think that gives sufficient intuition.)

Once you have everything in R you have a wealth of opportunities depending on what you want to do: there's an open thread started by J. Bouldin on how to do things columnwise over different objects most efficiently in R, which will hopefully get some good answers. Let me know if there's a specific thing you want to wind up doing and I'll try to give you a hand; if it's just a theoretical interest, keep an eye on the other thread.
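To make the lapply idea concrete, here is a minimal sketch. It assumes you want each distinct pair of columns (from column 2 onward) tested exactly once per file, which may not be what you are after. The two files are made-up data written to temporary locations so the example runs on its own, and pairwise_pvalues is a hypothetical helper name, not an existing R function.

```r
# Write two small made-up tab-separated files so the sketch is self-contained
set.seed(42)
files_to_test <- character(2)
for (i in seq_along(files_to_test)) {
  files_to_test[i] <- tempfile(fileext = ".txt")
  fake <- data.frame(id = 1:20, x = rnorm(20), y = rnorm(20), z = rnorm(20))
  write.table(fake, files_to_test[i], sep = "\t", row.names = FALSE, quote = FALSE)
}

# Read every file into one list of data frames
L <- lapply(files_to_test, read.table, header = TRUE, sep = "\t")

# Hypothetical helper: t-test every distinct pair of columns from column 2 onward
pairwise_pvalues <- function(df) {
  stopifnot(ncol(df) >= 3)               # need at least two data columns to compare
  pairs <- combn(2:ncol(df), 2)          # each column of 'pairs' is one (j, k) with j < k
  pvals <- apply(pairs, 2, function(jk) {
    t.test(df[, jk[1]], df[, jk[2]], var.equal = TRUE)$p.value
  })
  # One row per comparison, so no p.value overwrites another
  data.frame(j = pairs[1, ], k = pairs[2, ], p.value = pvals)
}

# One table of p.values per file
results <- lapply(L, pairwise_pvalues)
print(results[[1]])
```

Using combn() for the column pairs avoids both problems in the nested j/k loops: a column is never compared with itself, and every p.value gets its own row instead of landing in the same tt_pvalue[i] slot.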
Hope this helps,

Michael Weylandt

On Thu, Aug 4, 2011 at 11:19 PM, Matt Curcio <matt.curcio...@gmail.com> wrote:

> Greetings all,
> I am curious to know if either of these two sets of code is more efficient?
>
> Example1:
> ## t-test ##
> colA <- temp [ , j ]
> colB <- temp [ , k ]
> ttr <- t.test ( colA, colB, var.equal=TRUE)
> tt_pvalue [ i ] <- ttr$p.value
>
> or
> Example2:
> tt_pvalue [ i ] <- t.test ( temp[ , j ], temp[ , k ], var.equal=TRUE)
> -------------
> I have three loops, i, j, k.
> One to test the all of <i> files in a directory. One to tease out
> column <j> and compare it by means of t-test to column <k> in each of
> the files.
> ---------------
> for ( i in 1:num_files ) {
>   temp <- read.table ( files_to_test [ i ], header=TRUE, sep="\t")
>   num_cols <- ncol ( temp )
>   ## Define Columns To Compare ##
>   for ( j in 2 : num_cols ) {
>     for ( k in 3 : num_cols ) {
>       ## t-test ##
>       colA <- temp [ , j ]
>       colB <- temp [ , k ]
>       ttr <- t.test ( colA, colB, var.equal=TRUE)
>       tt_pvalue [ i ] <- ttr$p.value
>     }
>   }
> }
> --------------------------------
> I am a novice writer of code and am interested to hear if there are
> any (dis)advantages to one way or the other.
> M
>
>
> Matt Curcio
> M: 401-316-5358
> E: matt.curcio...@gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.