Hi all,

Could someone please explain how I can efficiently query a data frame with several factors, as shown below:
Data frame: pt.knn

row  k.idx  step.forwd  pt.num  model  prev  value  abs.error
  1    200           0       1  lm       09   10.5        1.5
  2    200           0       2  lm       11   10.5        1.5
  3    201           1       1  lm       10   12          2.0
  4    201           1       2  lm       12   12          2.0
  5    202           2       1  lm       12   12.1        0.1
  6    202           2       2  lm       12   12.1        0.1
  7    200           0       1  rlm    10.1   10.5        0.4
  8    200           0       2  rlm    10.3   10.5        0.2
  9    201           1       1  rlm    11.6   12          0.4
 10    201           1       2  rlm    11.4   12          0.6
 11    202           2       1  rlm    11.8   12.1        0.1
 12    202           2       2  rlm    11.9   12.1        0.2

The k.idx, step.forwd, pt.num and model columns are factors; prev, value and abs.error are numeric.

I need the mean of the numeric columns (prev, value and abs.error) for each combination of k.idx, step.forwd and model. So rows 1 and 2, 3 and 4, 5 and 6, 7 and 8, 9 and 10, and 11 and 12 must be grouped together.

Next, I need to draw a boxplot of the mean(abs.error) of each model for each k.idx. I also need to compare the abs.error of the two models at each step, as well as the overall mean abs.error of each model, and so on.

I have read the manuals, but the examples there are too simple. I know how to do this manipulation by brute force, but I would like to learn the right way to work with R.

Could someone help me?

Thanks in advance.

José Augusto
Undergraduate student
University of São Paulo
Business Administration Faculty

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
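
For reference, the grouping described in the post can be sketched in base R roughly as follows. This is only one possible approach, not necessarily the most efficient one. The data frame is rebuilt from the table in the post with the values copied as given, and the names group.means, by.step and overall.means are illustrative choices that do not appear in the original.

```r
# Rebuild the example data frame from the post (values copied as given).
pt.knn <- data.frame(
  k.idx      = factor(rep(rep(c(200, 201, 202), each = 2), times = 2)),
  step.forwd = factor(rep(rep(c(0, 1, 2), each = 2), times = 2)),
  pt.num     = factor(rep(c(1, 2), times = 6)),
  model      = factor(rep(c("lm", "rlm"), each = 6)),
  prev       = c(9, 11, 10, 12, 12, 12, 10.1, 10.3, 11.6, 11.4, 11.8, 11.9),
  value      = rep(rep(c(10.5, 12, 12.1), each = 2), times = 2),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean of the numeric columns for each k.idx / step.forwd / model combination.
# (group.means is an illustrative name.)
group.means <- aggregate(
  cbind(prev, value, abs.error) ~ k.idx + step.forwd + model,
  data = pt.knn,
  FUN  = mean
)

# Boxplot of the per-group mean abs.error, one box per model.
boxplot(abs.error ~ model, data = group.means,
        ylab = "mean(abs.error) per k.idx")

# Side-by-side comparison of the two models at each step:
# rows are step.forwd levels, columns are models.
by.step <- with(group.means,
                tapply(abs.error, list(step.forwd, model), mean))

# Overall mean abs.error of each model, taken over all original rows.
overall.means <- tapply(pt.knn$abs.error, pt.knn$model, mean)

print(group.means)
print(by.step)
print(overall.means)
```

The formula interface of aggregate() keeps the grouping factors as ordinary columns, so the result can be passed straight to boxplot() or tapply() without further reshaping.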