I know I am overwriting 'myid'. I did not think merge would solve it, because each version in mydata is assigned to more than one id, so I assumed I could not merge by version.

I am not sure how to answer the question about "the problem": I described the current state and the desired state. If possible, I'd like to get from the current state to the desired state faster than with a loop.
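A minimal sketch of two loop-free ways to get from the current state to the desired state, assuming the 'mydata' and 'myinfo' data frames from the original post quoted below. The names 'merged', 'rows_by_version', 'picked' and 'result2' are made up for illustration. As far as I can tell, merge() does not need 'version' to be unique in 'mydata': each id is simply paired with every 'myinfo' row of its version. The second variant avoids merge() altogether by indexing 'myinfo' directly.

```r
# Data as in the original post
set.seed(123)
mydata <- data.frame(myid = 1001:1100,
                     version = sample(1:20, 100, replace = TRUE))
set.seed(12)
myinfo <- data.frame(version = sort(rep(1:20, 30)),
                     a = rnorm(60), b = rnorm(60),
                     c = rnorm(60), d = rnorm(60))

# Variant 1: merge by 'version'. Several ids sharing a version is fine;
# each id gets all 30 'myinfo' rows that belong to its version.
merged <- merge(mydata, myinfo, by = "version")
merged <- merged[order(merged$myid),
                 c("myid", "version", "a", "b", "c", "d")]

# Variant 2: no merge. Look up the 'myinfo' row numbers per version once,
# then pick them for each id in the order the ids appear in 'mydata'.
rows_by_version <- split(seq_len(nrow(myinfo)), myinfo$version)
picked <- rows_by_version[as.character(mydata$version)]
result2 <- cbind(myid = rep(mydata$myid, lengths(picked)),
                 myinfo[unlist(picked, use.names = FALSE), ])

nrow(merged)   # 100 ids x 30 rows per version
nrow(result2)
```

Both variants return 3000 rows with 'myid' in the first column. The row order after merge() may differ from what the loop produces, which is why the sketch re-sorts by 'myid'; the indexing variant keeps the ids in the order of 'mydata'.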
On Tue, Dec 22, 2015 at 2:26 PM, jim holtman <jholt...@gmail.com> wrote:
> You seem to be saving 'myid' and then overwriting it with the last
> statement:
>
>     result[[i]] <- result[[i]][c(5, 1:4)]
>
> Why doesn't 'merge' work for you?  I tried it on your data, and seem to get
> back the same number of rows; may not be in the same order, but the content
> looks the same, and it does have 'myid' on it.
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Tue, Dec 22, 2015 at 12:27 PM, Dimitri Liakhovitski
> <dimitri.liakhovit...@gmail.com> wrote:
>>
>> Hello!
>> I have a solution for my task that is based on a loop. However, it's
>> too slow for my real-life problem that is much larger in scope.
>> However, I cannot use merge. Any advice on how to do it faster?
>> Thanks a lot for any hint on how to speed it up!
>>
>> # I have 'mydata' data frame:
>> set.seed(123)
>> mydata <- data.frame(myid = 1001:1100,
>>                      version = sample(1:20, 100, replace = T))
>> head(mydata)
>> table(mydata$version)
>>
>> # I have 'myinfo' data frame that contains information for each 'version':
>> set.seed(12)
>> myinfo <- data.frame(version = sort(rep(1:20, 30)),
>>                      a = rnorm(60), b = rnorm(60),
>>                      c = rnorm(60), d = rnorm(60))
>> head(myinfo, 40)
>>
>> ### MY SOLUTION WITH A LOOP:
>> ### Looping through each id of mydata and grabbing
>> ### all columns from 'myinfo' for the corresponding 'version':
>>
>> # 1. Creating placeholder list for the results:
>> result <- split(mydata[c("myid", "version")], f = list(mydata$myid))
>> length(result)
>> (result)[1:3]
>>
>>
>> # 2. Looping through each element of 'result':
>> for(i in 1:length(result)){
>>   id <- result[[i]]$myid
>>   result[[i]] <- myinfo[myinfo$version == result[[i]]$version, ]
>>   result[[i]]$myid <- id
>>   result[[i]] <- result[[i]][c(5, 1:4)]
>> }
>> result <- do.call(rbind, result)
>> head(result) # This is the desired result
>>
>> --
>> Dimitri Liakhovitski
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>

--
Dimitri Liakhovitski