Nice suggestion for the extra "Time" column, but I don't think I stated my problem clearly enough. My main problem is to find a way to "classify" the rrt's, so that we don't have to check each data frame ourselves.
So I need a function that fills in the extra "Time" column by looking at the rrt's (and maybe the results) and decides which rrt's are the same and which are new ones. As stated: rrt's never switch places, and results can't be concatenated or averaged within a Mnd.

I hope my question is a bit clearer now.

Thank you all for your suggestions

Bart

> Date: Thu, 24 Jan 2013 15:01:40 -0800
> Subject: Re: [R] sorting/grouping/classification problem?
> From: djmu...@gmail.com
> To: bartjoo...@hotmail.com
>
> Hi:
>
> Here's a potential workaround:
>
> # Add a time order variable
> dat$ord <- c(rep(2:4, 2), rep(1:4, 2))
>
> # Average rrt by ord
> dat$Time <- with(dat, ave(rrt, ord, FUN = mean))
> dat
>
> # Reshape the data
>
> library(reshape2)
>
> dcast(dat, Time ~ Mnd, value.var = "Result")
>     Time   0   3    6    9
> 1 0.3550  NA  NA 0.05 0.06
> 2 0.4475 0.1 0.2 0.40 0.60
> 3 0.4750 0.3 0.6 1.20 1.80
> 4 1.2225 0.5 0.4 0.45 0.50
>
> You could always round dat$Time to two decimal places in its
> definition before doing the cast if you so desired.
>
> Dennis
>
> On Thu, Jan 24, 2013 at 11:31 AM, Bart Joosen <bartjoo...@hotmail.com> wrote:
> >
> > Hi,
> >
> > I'm a database admin for a database which manages chromatographic results of
> > products during stability studies.
> > I use R for the reporting of the results in MS Word through R2wd.
> >
> > But now I think I need your help:
> > suppose we have the following data frame:
> >
> > ID  rrt  Mnd  Result
> > 1   0.45  0   0.10
> > 1   0.48  0   0.30
> > 1   1.24  0   0.50
> > 2   0.45  3   0.20
> > 2   0.48  3   0.60
> > 2   1.22  3   0.40
> > 3   0.35  6   0.05
> > 3   0.44  6   0.40
> > 3   0.46  6   1.20
> > 3   1.21  6   0.45
> > 4   0.36  9   0.06
> > 4   0.45  9   0.60
> > 4   0.48  9   1.80
> > 4   1.22  9   0.50
> >
> > ID is the database ID, rrt is an identifier for the result, Mnd is the
> > timepoint of analysis and Result is... the result of the test.
> > What I need is this data frame in a wide format (which I managed with
> > dat2 <- as.data.frame(tapply(dat$Result, list(rrt=dat$rrt, Mnd=dat$Mnd),
> >     function(x) paste(x[x!=""], collapse="/"))) )
> > But as you can see, rrt is not an exact identifier for the result.
> >
> > Sometimes rrt for 0 Mnd is 0.45, but at 6 Mnd the rrt is 0.44.
> > Now I need the results to align so that one can easily see how rrt x is
> > evolving within the Mnd time points.
> > I tried different rounding procedures (round to every 0.02, check that no
> > results are discarded this way, and check for alignment), but nothing seems
> > to make sense.
> > I also tried checking the highest results in each Mnd, aligning these,
> > determining correction factors for the rrt for all the other rrts, ...
> >
> > Some results will follow a trend (like rrt 0.45), some will remain more or
> > less stable.
> > But rrt's will NEVER switch places with each other!
> >
> > Ultimately I need to update the db, so I need a list/data frame with the
> > ID, the original rrt and the adjusted rrt (maybe the first occurring rrt, or
> > the mean of the rrts, doesn't matter).
> >
> > Any ideas about which algorithms can be used?
> > I searched on pubmed, but couldn't find anything.
> >
> > Thanks
> >
> > Bart
> >
> > PS: to get the data:
> >
> > dat <-
> > structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
> > 4L, 4L, 4L, 4L), rrt = c(0.45, 0.48, 1.24, 0.45, 0.48, 1.22,
> > 0.35, 0.44, 0.46, 1.21, 0.36, 0.45, 0.48, 1.22), Mnd = c(0L,
> > 0L, 0L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 9L, 9L, 9L, 9L), Result = c(0.1,
> > 0.3, 0.5, 0.2, 0.6, 0.4, 0.05, 0.4, 1.2, 0.45, 0.06, 0.6, 1.8,
> > 0.5)), .Names = c("ID", "rrt", "Mnd", "Result"), class = "data.frame",
> > row.names = c(NA, -14L))
> >
> > resulting dataframe:
> > dat3 <-
> > structure(list(Time = c(0.355, 0.45, 0.48, 1.22), `0` = c(NA,
> > 0.1, 0.3, 0.5), `3` = c(NA, 0.2, 0.6, 0.4), `6` = c(0.05, 0.4,
> > 1.2, 0.45), `9` = c(0.06, 0.6, 1.8, 0.5)), .Names = c("Time",
> > "0", "3", "6", "9"), class = "data.frame", row.names = c(NA, -4L))
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
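
A minimal sketch, in base R, of one way the classification asked about at the top of this message might be automated. Nobody in this thread proposed it. It is a simple greedy pass that leans on the stated rule that rrt's never switch places. The function name classify_rrt, the column name adj_rrt and the tolerance tol = 0.05 are assumptions made for illustration only, and the adjusted rrt is taken to be the first occurring rrt of each group.

```r
# Example data from the original post
dat <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4),
  rrt    = c(0.45, 0.48, 1.24, 0.45, 0.48, 1.22,
             0.35, 0.44, 0.46, 1.21, 0.36, 0.45, 0.48, 1.22),
  Mnd    = c(0, 0, 0, 3, 3, 3, 6, 6, 6, 6, 9, 9, 9, 9),
  Result = c(0.1, 0.3, 0.5, 0.2, 0.6, 0.4, 0.05, 0.4, 1.2, 0.45,
             0.06, 0.6, 1.8, 0.5)
)

# Hypothetical helper (not from the thread): walk through the time points
# in order and assign every rrt to a reference rrt.
# `tol` is an assumed maximum drift between a peak and its reference.
classify_rrt <- function(dat, tol = 0.05) {
  ref <- numeric(0)              # reference rrt of each group found so far
  dat$adj_rrt <- NA_real_
  for (m in sort(unique(dat$Mnd))) {
    rows <- which(dat$Mnd == m)
    rows <- rows[order(dat$rrt[rows])]   # peaks in increasing rrt order
    last <- -Inf                         # reference used by the previous peak
    for (r in rows) {
      x <- dat$rrt[r]
      # Only references beyond the previous match are allowed, so two peaks
      # of one time point never share a group and never swap order.
      cand <- ref[ref > last & abs(ref - x) <= tol]
      if (length(cand) > 0) {
        hit <- cand[which.min(abs(cand - x))]  # closest allowed reference
      } else {
        hit <- x                               # nothing close enough: new group
        ref <- c(ref, x)
      }
      dat$adj_rrt[r] <- hit
      last <- hit
    }
  }
  dat
}

out <- classify_rrt(dat)
out[, c("ID", "rrt", "adj_rrt")]   # ID, original rrt and adjusted rrt

# Wide layout: one row per adjusted rrt, one column per Mnd
wide <- tapply(out$Result, list(rrt = out$adj_rrt, Mnd = out$Mnd),
               function(x) x[1])
wide
```

With the example data this puts 0.35 and 0.36 in a new group and maps 0.44 and 0.46 at 6 Mnd onto 0.45 and 0.48. The pass never backtracks, so it can misassign peaks when the drift between time points is larger than the spacing between neighbouring peaks. tol would have to be chosen with that in mind.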