classification problem?

Bart Joosen Thu, 24 Jan 2013 23:26:48 -0800

Nice suggestion for the extra "Time" column.

But I think I didn't state my problem clearly enough.
My main problem is to find a way to "classify" the rrt's, so that we don't have 
to check each data frame ourselves.


So I need a function that fills in the extra "Time" column by looking at 
the rrt's (and maybe the results) and deciding which rrt's are the 
same and which are new ones.

As stated: rrt's never switch places, and results can't be concatenated or 
averaged within a Mnd.
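
To make the request concrete, here is a minimal sketch of what such a function could look like. It is only an illustration under assumptions that are not confirmed anywhere in this thread: the time point with the most peaks is assumed to contain every peak, and each other time point is matched to it in the order-preserving way that minimises the total rrt difference. The name align_rrt() and the columns peak and Time are hypothetical.

```r
# align_rrt() is a hypothetical helper, not a function from any package.
# Assumptions: the time point with the most peaks contains every peak,
# and peaks keep their relative order across time points.
align_rrt <- function(dat) {
  dat <- dat[order(dat$Mnd, dat$rrt), ]
  groups <- split(dat$rrt, dat$Mnd)
  # Reference: rrt's of the time point with the most peaks (first on ties)
  ref <- groups[[which.max(lengths(groups))]]
  dat$peak <- unlist(lapply(groups, function(x) {
    # Each column of cand is an increasing set of reference positions,
    # so every candidate assignment preserves the peak order
    cand <- combn(length(ref), length(x))
    cost <- colSums(abs(matrix(ref[cand], nrow = length(x)) - x))
    cand[, which.min(cost)]
  }), use.names = FALSE)
  # Adjusted rrt: mean of the original rrt's assigned to the same peak
  dat$Time <- ave(dat$rrt, dat$peak)
  dat
}

# Same values as the example data quoted below
dat <- data.frame(
  ID     = rep(1:4, c(3, 3, 4, 4)),
  rrt    = c(0.45, 0.48, 1.24, 0.45, 0.48, 1.22, 0.35, 0.44, 0.46, 1.21,
             0.36, 0.45, 0.48, 1.22),
  Mnd    = rep(c(0, 3, 6, 9), c(3, 3, 4, 4)),
  Result = c(0.1, 0.3, 0.5, 0.2, 0.6, 0.4, 0.05, 0.4, 1.2, 0.45, 0.06,
             0.6, 1.8, 0.5)
)
res <- align_rrt(dat)
res[, c("ID", "rrt", "peak", "Time")]
```

For the example data this reproduces the grouping 2:4, 2:4, 1:4, 1:4 that Dennis entered by hand as dat$ord. The sketch breaks down if no single time point contains all peaks, and the number of candidate assignments from combn() grows quickly with the number of peaks.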

I hope my question is a bit clearer now.

Thank you all for your suggestions

Bart

> Date: Thu, 24 Jan 2013 15:01:40 -0800
> Subject: Re: [R] sorting/grouping/classification problem?
> From: djmu...@gmail.com
> To: bartjoo...@hotmail.com
> 
> Hi:
> 
> Here's a potential workaround:
> 
> # Add a time order variable
> dat$ord <- c(rep(2:4, 2), rep(1:4, 2))
> 
> # Average rrt by ord
> dat$Time <- with(dat, ave(rrt, ord, FUN = mean))
> dat
> 
> # Reshape the data
> 
> library(reshape2)
> > dcast(dat, Time ~ Mnd, value.var = "Result")
>     Time   0   3    6    9
> 1 0.3550  NA  NA 0.05 0.06
> 2 0.4475 0.1 0.2 0.40 0.60
> 3 0.4750 0.3 0.6 1.20 1.80
> 4 1.2225 0.5 0.4 0.45 0.50
> 
> You could always round dat$Time to two decimal places in its
> definition before doing the cast if you so desired.
> 
> Dennis
> 
> On Thu, Jan 24, 2013 at 11:31 AM, Bart Joosen <bartjoo...@hotmail.com> wrote:
> >
> > Hi,
> >
> >
> > I'm a database admin for a database which manages chromatographic results of 
> > products during stability studies.
> > I use R for the reporting of the results in MS Word through R2wd.
> >
> >
> > But now I think I need your help:
> > suppose we have the following data frame:
> >
> >
> > ID  rrt Mnd Result
> >  1 0.45   0   0.10
> >  1 0.48   0   0.30
> >  1 1.24   0   0.50
> >  2 0.45   3   0.20
> >  2 0.48   3   0.60
> >  2 1.22   3   0.40
> >  3 0.35   6   0.05
> >  3 0.44   6   0.40
> >  3 0.46   6   1.20
> >  3 1.21   6   0.45
> >  4 0.36   9   0.06
> >  4 0.45   9   0.60
> >  4 0.48   9   1.80
> >  4 1.22   9   0.50
> >
> >
> >
> > ID is the database ID, rrt is an identifier for the result, Mnd is the 
> > timepoint of analysis and Result is... the result of the test.
> > What I need is this dataframe in a wide format, which I managed with:
> >
> > dat2 <- as.data.frame(tapply(dat$Result, list(rrt = dat$rrt, Mnd = dat$Mnd),
> >                              function(x) paste(x[x != ""], collapse = "/")))
> >
> > But as you can see, rrt is not an exact identifier for the result.
> >
> > Sometimes rrt for 0 Mnd is 0.45, but at 6 Mnd the rrt is 0.44.
> > Now I need the results to align so that one can easily see how rrt x is 
> > evolving within the Mnd time points.
> > I tried different rounding procedures (round to every 0.02, check that no 
> > results are discarded this way, and check for alignment), but nothing seems 
> > to make sense.
> > I also tried checking the highest results in each Mnd, aligning these, 
> > determining correction factors for the rrt for all the other rrts, ...
> >
> >
> > Some results will follow a trend (like rrt 0.45), some will remain more or 
> > less stable.
> > But rrts will NEVER switch places with each other!
> >
> >
> >
> >
> > Ultimately I need to update the db, so I need a list/dataframe with the 
> > ID, the original rrt and the adjusted rrt (maybe the first occurring rrt, or 
> > the mean of the rrts, doesn't matter).
> >
> >
> >
> >
> > Any ideas about which algorithms can be used? I searched on PubMed, but 
> > couldn't find anything.
> >
> >
> >
> >
> > Thanks
> >
> >
> > Bart
> >
> >
> > PS: to get the data:
> >
> >
> > dat <-
> > structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L, 3L,
> > 4L, 4L, 4L, 4L), rrt = c(0.45, 0.48, 1.24, 0.45, 0.48, 1.22,
> > 0.35, 0.44, 0.46, 1.21, 0.36, 0.45, 0.48, 1.22), Mnd = c(0L,
> > 0L, 0L, 3L, 3L, 3L, 6L, 6L, 6L, 6L, 9L, 9L, 9L, 9L), Result = c(0.1,
> > 0.3, 0.5, 0.2, 0.6, 0.4, 0.05, 0.4, 1.2, 0.45, 0.06, 0.6, 1.8,
> > 0.5)), .Names = c("ID", "rrt", "Mnd", "Result"), class = "data.frame", 
> > row.names = c(NA,
> > -14L))
> >
> >
> >
> > resulting dataframe:
> > dat3 <-
> > structure(list(Time = c(0.355, 0.45, 0.48, 1.22), `0` = c(NA,
> > 0.1, 0.3, 0.5), `3` = c(NA, 0.2, 0.6, 0.4), `6` = c(0.05, 0.4,
> > 1.2, 0.45), `9` = c(0.06, 0.6, 1.8, 0.5)), .Names = c("Time",
> > "0", "3", "6", "9"), class = "data.frame", row.names = c(NA,
> > -4L))
> >
> >
> >
> >
> >         [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
                                          
