Sorry, you **did** supply data and my solution **does** work (except that I left off one closing ")").

> sq.n <- seq_len(nrow(data.df))
> tapply(sq.n,data.df$seq,function(x)with(data.df[x,],
+     sort(unique(do.call(c,mapply(seq,from=startNo,length=len,SIMPLIFY=FALSE))))))
$`1`
[1]  3  4  5  6 10 11

$`2`
[1]  3  4  5  6  7 15 16 17

Cheers,
Bert

On Wed, Oct 10, 2012 at 10:59 PM, Bert Gunter <bgun...@gene.com> wrote:
> I am not sure you have expressed what you want to do correctly. See inline:
>
> On Wed, Oct 10, 2012 at 9:10 PM, andrewH <ahoer...@rprogress.org> wrote:
>> I have a couple of hundred American Community Survey Summary Files
>> containing rectangular arrays of data, mainly though not exclusively
>> numeric. Each file is referred to as a sequence (henceforth "seq").
>
> -- so 1 "seq" (terrible identifier -- see below for why) = 1 file
>
>> From these files I am trying to extract particular subsets (tables)
>> consisting of sets of columns. These tables are defined by three numbers
>> (now in columns in a data frame):
>> 1. a file identifier (seq)
>> 2. first column position numbers (startNo)
>> 3. length of table (len)
>
> So your data frame, call it yourframe, has columns named:
>
> seq startNo len
>
>> so the columns to select for one triple would consist of
>> startNo:(startNo+length-1). I am trying to create for each sequence a
>> vector of all the column numbers for tables in that sequence.
>
> So for each seq id you want to find all the column numbers, right?
>
> sq.n <- seq_len(nrow(yourframe)) ## Just to make it easier to read
> colms <- tapply(sq.n, yourframe$seq,function(x) with(yourframe[x,],
>     sort(unique(do.call(c, mapply(seq, from=startNo,
>     length=len,SIMPLIFY = FALSE)))))
>
> ## Comments
> In the mapply call, seq is the R function, ?seq. That's why using it
> as a name for a file id is terrible -- it causes confusion.
>
> In the absence of data, this is untested -- and probably not quite
> right. But it should be close, I hope. The key idea is the use of
> mapply to get the sequence of columns for each row in all the rows for
> each seq id. The SIMPLIFY = FALSE guarantees that this yields a list
> of vectors of column indices, which are then glopped together and
> cleaned up by the sort(unique(do.call( ... stuff.
>
> colms should then be a list giving the sorted column numbers to choose
> for each "seq" id.
>
> I do not know whether (once cleaned up) this is either more elegant
> or more efficient than what you proposed. And I wouldn't be surprised
> if someone like Bill Dunlap comes up with a much better way, either.
> But it is different -- and perhaps amusing.
>
> ... If I have properly understood what you wanted. If not, ignore all.
>
> Cheers,
> Bert
>
>>
>> Obviously I could do this with nested for loops, e.g.:
>>
>>> seq <- c(1,1,2,2)
>>> startNo <- c(3, 10, 3, 15)
>>> len <- c(4, 2, 5, 3)
>>> data.df <- data.frame(seq, startNo, len)
>>>
>>> seq.f <- factor(data.df$seq)
>>> data.l <- split(data.df, seq.f)
>>> selectColsList <- vector("list", length(levels(seq.f)))
>>> for (i in seq_along(levels(seq.f))){
>>     selectCols <- numeric()
>>     for (j in seq_along(data.l[[i]]$startNo)){
>>         selectCols <- c(selectCols,
>>             data.l[[i]]$startNo[j]:(data.l[[i]]$startNo[j] +
>>             data.l[[i]]$len[j]-1))
>>     }
>>     selectColsList[[i]] <- selectCols
>> }
>>> selectColsList
>> [[1]]
>> [1]  3  4  5  6 10 11
>> [[2]]
>> [1]  3  4  5  6  7 15 16 17
>>
>> But this code strikes me as inelegant and verbose. It seems to me that there
>> ought to be a way to make the outer loop (indexed with i) into a tapply
>> function (which is why I started with a split()), and the inner loop
>> (indexed with j) into some cute recursive function, but I was not able to do
>> so. If anyone could suggest some nicer (e.g. shorter, or faster, or just
>> more sophisticated) way to do this instead, I would be most grateful.
>>
>> Sincerely, andrewH
>>
>> --
>> View this message in context:
>> http://r.789695.n4.nabble.com/replacing-ugly-for-loops-tp4645821.html
>> Sent from the R help mailing list archive at Nabble.com.
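
For anyone reading this in the archive, here is a rough step-by-step sketch of the same idea, in case the nested one-liner is hard to follow. It is only an illustration under a few assumptions: the file id column is renamed to seq_id (a hypothetical name, chosen so it does not clash with the seq() function), length.out is spelled out in full, and split() plus lapply() stand in for the tapply() and do.call(c, ...) used above. The data are the four rows from the original post.

```r
# Example data from the original post. The file id column is called
# seq_id here (a hypothetical rename) to avoid confusion with seq().
data.df <- data.frame(seq_id  = c(1, 1, 2, 2),
                      startNo = c(3, 10, 3, 15),
                      len     = c(4, 2, 5, 3))

# Step 1: one vector of column numbers per row of data.df.
# SIMPLIFY = FALSE keeps the result a list even if all lengths match.
per_row <- mapply(seq, from = data.df$startNo, length.out = data.df$len,
                  SIMPLIFY = FALSE)

# Step 2: group the per-row vectors by file id.
per_file <- split(per_row, data.df$seq_id)

# Step 3: within each file, concatenate, drop duplicates and sort.
colms <- lapply(per_file, function(v) sort(unique(unlist(v))))

colms
```

With these data, colms should hold 3 4 5 6 10 11 under "1" and 3 4 5 6 7 15 16 17 under "2", the same column numbers as in the output at the top of this message. Whether this is any faster than the one-liner I have not checked.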

-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.