On 2010-03-05 18:00, Alastair wrote:
Hi,

I'm new to R and I've run into a problem that I'm not really sure how to express properly in the language. I've got a data table that I've read from a file containing some simple information about the performance of 4 algorithms. The columns are the name of the algorithm, the problem instance and the resulting score on that problem (if it wasn't solved I mark that with NA).

    solver instance result
    A      prob1    40
    B      prob1    NA
    C      prob1    39
    D      prob1    35
    A      prob2    100
    B      prob2    50
    C      prob2    NA
    D      prob2    NA
    A      prob3    75
    B      prob3    80
    C      prob3    60
    D      prob3    70
    A      prob4    80
    B      prob4    NA
    C      prob4    85
    D      prob4    75

I've managed to read in the data as follows:

    data <- table.read("./test.txt", header = TRUE,
                       colClasses = c("factor","factor","numeric"),
                       na.strings = c("NA"))

and I've got a nice barchart via lattice:

    library(lattice)
    barchart(result ~ instance, group = solver, data = data)

What I want to try and calculate (and plot somehow) is

a) What percentage of the instances each solver can solve, and
b) What percentage of the instances a solver returns a better score than solver A for that particular problem.

These don't seem like particularly ambitious requirements, but I still don't really know where to start. Any pointers would be most appreciated.
(I doubt that table.read(....) worked for you; the function is read.table().)

For (a), use tapply() to count the solved (non-NA) instances per solver:

    with(data, tapply(result, solver, function(x) sum(!is.na(x))))
    # A B C D
    # 4 2 3 3

For (b), you could either use reshape() to transform to wide format or generate an appropriate matrix; in either case, follow with apply():

    m <- with(data, tapply(result, list(instance, solver), function(x) x))
    m
    apply(m[,-1], 2, function(x) sum(x > m[, 1], na.rm=TRUE))
    # B C D
    # 1 1 0

-Peter Ehlers
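The calls above return counts, while the question asked for percentages. A minimal sketch of one way to get them follows. It makes three assumptions that are not stated in the thread: a higher score counts as "better", the denominator for (b) is the total number of instances, and a comparison involving NA counts as "not better". The names dat, pct_solved and pct_better are made up for this illustration, and the data are rebuilt inline with read.table(text = ...) so the example runs without test.txt.

```r
# Rebuild the example data inline so the sketch is self-contained.
dat <- read.table(text = "solver instance result
A prob1 40
B prob1 NA
C prob1 39
D prob1 35
A prob2 100
B prob2 50
C prob2 NA
D prob2 NA
A prob3 75
B prob3 80
C prob3 60
D prob3 70
A prob4 80
B prob4 NA
C prob4 85
D prob4 75", header = TRUE,
  colClasses = c("factor", "factor", "numeric"))

# (a) Percentage of instances each solver solved.
# The mean of a logical vector is the proportion of TRUE values.
pct_solved <- 100 * with(dat, tapply(result, solver,
                                     function(x) mean(!is.na(x))))
pct_solved
#   A   B   C   D
# 100  50  75  75

# (b) Percentage of instances where a solver beats solver A.
# Assumption: higher is better; NA comparisons are dropped by na.rm,
# so they count as "not better"; the denominator is all instances.
m <- with(dat, tapply(result, list(instance, solver), function(x) x))
pct_better <- 100 * apply(m[, -1], 2,
                          function(x) sum(x > m[, 1], na.rm = TRUE)) / nrow(m)
pct_better
#  B  C  D
# 25 25  0
```

Either result can be passed to barplot() for a quick plot. If lower scores are better, flip the comparison to x < m[, 1].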
Thanks, Alastair
--
Peter Ehlers
University of Calgary