You might add vapply() to your repertoire, as it is quicker than sapply()
but also does some error checking on your input data.  E.g., your f2
returns a matrix whose columns are the elements of the list l, and you
assume that each element of l contains 2 character strings.
   f2 <- function(l) matrix(unlist(l), nrow=2)
Here is a function based on vapply() that returns the same thing but
also verifies that each element of l is really a 2-long character vector.
   f2v <- function(l) vapply(l, function(x) x, FUN.VALUE = character(2))
and a function to generate datasets of various sizes
   makeL <- function(n) strsplit(paste(sample(LETTERS,n,replace=TRUE),sample(1:10,n,replace=TRUE),sep="+"),"+",fixed=TRUE)
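Before timing anything on a large list, it may help to check on a tiny input that the two functions agree and that the rows of the result are the two vectors originally asked for. The following is only a sketch: the three-string input and the names small, m, mv, V1 and V2 are made up for illustration and are not from the thread.

```r
# f2 and f2v as defined above
f2  <- function(l) matrix(unlist(l), nrow = 2)
f2v <- function(l) vapply(l, function(x) x, FUN.VALUE = character(2))

# hypothetical toy input, in the shape strsplit() returns
small <- strsplit(c("A+1", "B+2", "C+3"), "+", fixed = TRUE)

m  <- f2(small)    # 2 x 3 character matrix, one column per list element
mv <- f2v(small)   # same matrix, but each element was checked to be character(2)

stopifnot(identical(m, mv))

V1 <- mv[1, ]      # first string of each pair
V2 <- mv[2, ]      # second string of each pair
```

Row 1 of the matrix holds the first member of every pair and row 2 the second, so no further looping is needed to get the two vectors.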
Timing the functions on a million-long list I get
   > l <- makeL(n=10^6)
   > system.time( r2 <- f2(l) )
      user  system elapsed
     0.088   0.000   0.090
   > system.time( r2v <- f2v(l) )
      user  system elapsed
      0.92    0.00    0.92
   > identical(r2, r2v)
   [1] TRUE
vapply() is ten times slower than unlist() but three times faster
than sapply(x,function(x)x).  However, when you give it data that doesn't
meet your expectations, which is common when using strsplit(), f2v
tells you about the problem and f2 gives you an incorrect result:
   > l[[10]] <- c("a","b","c","d")
   > system.time( r2v <- f2v(l) )
   Error in vapply(l, function(x) x, FUN.VALUE = character(2)) :
     values must be length 2,
    but FUN(X[[10]]) result is length 4
   Timing stopped at: 0.004 0 0.002
   > system.time( rv <- f2(l) )
      user  system elapsed
     0.088   0.008   0.095
   > dim(rv) # you will have alignment problems later
   [1]       2 1000001

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf
> Of Bert Gunter
> Sent: Thursday, April 25, 2013 7:54 AM
> To: ted.hard...@wlandres.net
> Cc: R mailing list
> Subject: Re: [R] Decomposing a List
>
> Well, what you really want to do is convert the list to a matrix, and
> it can be done directly and considerably faster than with the
> (implicit) looping of sapply:
>
> f1 <- function(l)sapply(l,"[",1)
> f2 <- function(l)matrix(unlist(l),nr=2)
> l <- strsplit(paste(sample(LETTERS,1e6,rep=TRUE),sample(1:10,1e6,rep=TRUE),sep="+"),"+",fix=TRUE)
>
> ## Then you get these results:
>
> > system.time(x1 <- f1(l))
>    user  system elapsed
>    1.92    0.01    1.95
> > system.time(x2 <- f2(l))
>    user  system elapsed
>    0.06    0.02    0.08
> > system.time(x2 <- f2(l)[1,])
>    user  system elapsed
>     0.1     0.0     0.1
> > identical(x1,x2)
> [1] TRUE
>
> Cheers,
> Bert
>
> On Thu, Apr 25, 2013 at 3:32 AM, Ted Harding <ted.hard...@wlandres.net> wrote:
> > Thanks, Jorge, that seems to work beautifully!
> > (Now to try to understand why ... but that's for later).
> > Ted.
> >
> > On 25-Apr-2013 10:21:29 Jorge I Velez wrote:
> >> Dear Dr. Harding,
> >>
> >> Try
> >>
> >> sapply(L, "[", 1)
> >> sapply(L, "[", 2)
> >>
> >> HTH,
> >> Jorge.-
> >>
> >> On Thu, Apr 25, 2013 at 8:16 PM, Ted Harding
> >> <ted.hard...@wlandres.net> wrote:
> >>
> >>> Greetings!
> >>> For some reason I am not managing to work out how to do this
> >>> (in principle) simple task!
> >>>
> >>> As a result of applying strsplit() to a vector of character strings,
> >>> I have a long list L (N elements), where each element is a vector
> >>> of two character strings, like:
> >>>
> >>> L[1] = c("A1","B1")
> >>> L[2] = c("A2","B2")
> >>> L[3] = c("A3","B3")
> >>> [etc.]
> >>>
> >>> From L, I wish to obtain (as directly as possible, e.g. avoiding
> >>> a loop) two vectors each of length N where one contains the strings
> >>> that are first in the pair, and the other contains the strings
> >>> which are second, i.e. from L (as above) I would want to extract:
> >>>
> >>> V1 = c("A1","A2","A3",...)
> >>> V2 = c("B1","B2","B3",...)
> >>>
> >>> Suggestions?
> >>>
> >>> With thanks,
> >>> Ted.
> >>>
> >>> -------------------------------------------------
> >>> E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
> >>> Date: 25-Apr-2013  Time: 11:16:46
> >>> This message was sent by XFMail
> >>>
> >>> ______________________________________________
> >>> R-help@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>>
> >
> > -------------------------------------------------
> > E-Mail: (Ted Harding) <ted.hard...@wlandres.net>
> > Date: 25-Apr-2013  Time: 11:31:57
> > This message was sent by XFMail
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
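
Since strsplit() output often contains a few elements of the wrong length, one possible way to locate them before calling vapply() is to look at the per-element lengths. This is only a sketch under stated assumptions: the input vector x and the names n, bad, ok and m are made up for illustration, and lengths() requires R 3.2.0 or later.

```r
# hypothetical toy input; the third string splits into four pieces
x <- c("A+1", "B+2", "a+b+c+d", "C+3")
l <- strsplit(x, "+", fixed = TRUE)

n   <- lengths(l)       # number of pieces in each element (R >= 3.2.0)
bad <- which(n != 2L)   # indices that vapply(..., character(2)) would reject

# keep only the well-formed pairs, then build the checked matrix
ok <- l[n == 2L]
m  <- vapply(ok, function(x) x, FUN.VALUE = character(2))
```

Whether to drop, repair, or stop on the elements listed in bad depends on the data; the point is only that their positions are known before any matrix is built.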