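There are several ways to manipulate your data to get what you want, and the techniques are worth learning. Here, I think, is the set of operations you are after: split the data by the grouping factors, take the means of the numeric columns within each group, draw the boxplot, and attach the mean abs.error of each model to the data.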
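The grouped means and the model-by-step comparison asked for in the question could probably also be obtained more compactly with aggregate() and tapply(). What follows is only a sketch under assumptions: the data are rebuilt by hand as pt.knn (the name used in the question), the grouping columns are left as plain numeric/character vectors rather than factors, and grp.means and err.by.step are names invented here for illustration.

```r
# Hand-built copy of the posted data; pt.knn is the name used in the question.
pt.knn <- data.frame(
  k.idx      = rep(rep(c(200, 201, 202), each = 2), times = 2),
  step.forwd = rep(rep(c(0, 1, 2), each = 2), times = 2),
  pt.num     = rep(1:2, times = 6),
  model      = rep(c("lm", "rlm"), each = 6),
  prev       = c(9, 11, 10, 12, 12, 12, 10.1, 10.3, 11.6, 11.4, 11.8, 11.9),
  value      = rep(rep(c(10.5, 12, 12.1), each = 2), times = 2),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean of each numeric column within every k.idx / step.forwd / model group
# (illustrative name: grp.means). Only combinations present in the data appear.
grp.means <- aggregate(cbind(prev, value, abs.error) ~ k.idx + step.forwd + model,
                       data = pt.knn, FUN = mean)

# Mean abs.error laid out as a step-by-model table, so the two models can be
# compared side by side for each step (illustrative name: err.by.step).
err.by.step <- tapply(pt.knn$abs.error,
                      list(step = pt.knn$step.forwd, model = pt.knn$model),
                      mean)

print(grp.means)
print(err.by.step)
```

aggregate() returns a data frame with one row per group, which is often easier to pass on to plotting functions than the matrix produced by the split / lapply / do.call route; tapply() returns a matrix indexed by step and model.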

> x <- read.table(textConnection("row k.idx step.forwd pt.num model prev value abs.error
+ 1   200  0  1  lm   09    10.5  1.5
+ 2   200  0  2  lm   11    10.5  1.5
+ 3   201  1  1  lm   10    12    2.0
+ 4   201  1  2  lm   12    12    2.0
+ 5   202  2  1  lm   12    12.1  0.1
+ 6   202  2  2  lm   12    12.1  0.1
+ 7   200  0  1  rlm  10.1  10.5  0.4
+ 8   200  0  2  rlm  10.3  10.5  0.2
+ 9   201  1  1  rlm  11.6  12    0.4
+ 10  201  1  2  rlm  11.4  12    0.6
+ 11  202  2  1  rlm  11.8  12.1  0.1
+ 12  202  2  2  rlm  11.9  12.1  0.2"), header=TRUE)
> closeAllConnections()
>
> # split the data by the grouping factors
> x.split <- split(x, list(x$k.idx, x$step.forwd, x$model), drop=TRUE)
> x.split
$`200.0.lm`
  row k.idx step.forwd pt.num model prev value abs.error
1   1   200          0      1    lm    9  10.5       1.5
2   2   200          0      2    lm   11  10.5       1.5

$`201.1.lm`
  row k.idx step.forwd pt.num model prev value abs.error
3   3   201          1      1    lm   10    12         2
4   4   201          1      2    lm   12    12         2

$`202.2.lm`
  row k.idx step.forwd pt.num model prev value abs.error
5   5   202          2      1    lm   12  12.1       0.1
6   6   202          2      2    lm   12  12.1       0.1

$`200.0.rlm`
  row k.idx step.forwd pt.num model prev value abs.error
7   7   200          0      1   rlm 10.1  10.5       0.4
8   8   200          0      2   rlm 10.3  10.5       0.2

$`201.1.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
9    9   201          1      1   rlm 11.6    12       0.4
10  10   201          1      2   rlm 11.4    12       0.6

$`202.2.rlm`
   row k.idx step.forwd pt.num model prev value abs.error
11  11   202          2      1   rlm 11.8  12.1       0.1
12  12   202          2      2   rlm 11.9  12.1       0.2

>
> # now take the means of given columns
> x.mean <- lapply(x.split, function(.grp) colMeans(.grp[, c('prev', 'value', 'abs.error')]))
>
> # put back into a matrix
> (x.mean <- do.call(rbind, x.mean))
           prev value abs.error
200.0.lm  10.00  10.5      1.50
201.1.lm  11.00  12.0      2.00
202.2.lm  12.00  12.1      0.10
200.0.rlm 10.20  10.5      0.30
201.1.rlm 11.50  12.0      0.50
202.2.rlm 11.85  12.1      0.15
>
> # boxplot
> boxplot(abs.error ~ k.idx, data=x)
>
> # create a table with average of the abs.error for each 'model'
> cbind(x, abs.error.mean=ave(x$abs.error, x$model))
   row k.idx step.forwd pt.num model prev value abs.error abs.error.mean
1    1   200          0      1    lm  9.0  10.5       1.5      1.2000000
2    2   200          0      2    lm 11.0  10.5       1.5      1.2000000
3    3   201          1      1    lm 10.0  12.0       2.0      1.2000000
4    4   201          1      2    lm 12.0  12.0       2.0      1.2000000
5    5   202          2      1    lm 12.0  12.1       0.1      1.2000000
6    6   202          2      2    lm 12.0  12.1       0.1      1.2000000
7    7   200          0      1   rlm 10.1  10.5       0.4      0.3166667
8    8   200          0      2   rlm 10.3  10.5       0.2      0.3166667
9    9   201          1      1   rlm 11.6  12.0       0.4      0.3166667
10  10   201          1      2   rlm 11.4  12.0       0.6      0.3166667
11  11   202          2      1   rlm 11.8  12.1       0.1      0.3166667
12  12   202          2      2   rlm 11.9  12.1       0.2      0.3166667
>

On Jan 6, 2008 10:50 AM, Rense Nieuwenhuis <[EMAIL PROTECTED]> wrote:
> Hi,
>
> You may want to use the apply / tapply functions. Some find them a bit
> hard to grasp at first, but they will help you many times in many
> situations once you get the hang of them.
>
> Maybe you can get some information on my site:
> http://www.rensenieuwenhuis.nl/r-project/manual/basics/tables/
>
> Hope this helps,
>
> Rense Nieuwenhuis
>
>
> On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote:
>
> > Hi all,
> >
> > Could someone please explain how I can efficiently query a data frame
> > with several factors, as shown below:
> >
> > ----------------------------------------------------------------------
> > Data frame: pt.knn
> > ----------------------------------------------------------------------
> > row | k.idx | step.forwd | pt.num | model | prev | value | abs.error
> >   1     200        0          1      lm      09     10.5      1.5
> >   2     200        0          2      lm      11     10.5      1.5
> >   3     201        1          1      lm      10     12        2.0
> >   4     201        1          2      lm      12     12        2.0
> >   5     202        2          1      lm      12     12.1      0.1
> >   6     202        2          2      lm      12     12.1      0.1
> >   7     200        0          1      rlm     10.1   10.5      0.4
> >   8     200        0          2      rlm     10.3   10.5      0.2
> >   9     201        1          1      rlm     11.6   12        0.4
> >  10     201        1          2      rlm     11.4   12        0.6
> >  11     202        2          1      rlm     11.8   12.1      0.1
> >  12     202        2          2      rlm     11.9   12.1      0.2
> > ----------------------------------------------------------------------
> >
> > k.idx, step.forwd, pt.num and model columns are FACTORS.
> > prev, value, abs.error are numeric
> >
> > I need to take the mean value of the numeric columns (prev, value and
> > abs.error) for each k.idx and step.forwd and model. So: rows 1 and 2,
> > 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12 must be grouped
> > together.
> >
> > Next, I need to plot a boxplot of the mean(abs.error) of each model
> > for each k.idx.
> > I need to compare the abs.error of the two models for each step and
> > the mean overall abs.error of each model. And so on.
> >
> > I read the manuals, but the examples there are too simple. I know how
> > to do this manipulation in a "brute force" manner, but I wish to learn
> > how to work the right way with R.
> >
> > Could someone help me?
> > Thanks in advance.
> >
> > José Augusto
> > Undergraduate student
> > University of São Paulo
> > Business Administration Faculty
> >
> > ______________________________________________
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?