Hi, I am not sure the output you wanted is correct:
" sample1 sample2 sample3 1 1.0 0 0.5 " because 0.2*colMeans(x[,-4]) sample1 sample2 sample3 # 28.40 24.08 21.36 This might help you: apply(x[-4],2,function(y) length(y[y <0.2*mean(y) & x$class=="a"])/length(x[x$class=="a"])) #sample1 sample2 sample3 # 0.0 0.0 0.5 A.K. ----- Original Message ----- From: Simon <simonzm...@gmail.com> To: r-help@r-project.org Cc: Sent: Tuesday, December 4, 2012 4:49 AM Subject: [R] computing marginal values based on multiple columns? Hello all, I have what feels like a simple problem, but I can't find an simple answer. Consider this data frame: > x <- data.frame(sample1=c(35,176,182,193,124), sample2=c(198,176,190,23,15), sample3=c(12,154,21,191,156), class=c('a','a','c','b','c')) > x sample1 sample2 sample3 class 1 35 198 12 a 2 176 176 154 a 3 182 190 21 c 4 193 23 191 b 5 124 15 156 c Now I wish to know: for each sample, for values < 20% of the sample mean, what percentage of those are class a? I want to end up with a table like: sample1 sample2 sample3 1 1.0 0 0.5 I can calculate this for an individual sample using this rather clumsy expression: length(which(x$sample1 < mean(x$sample1) & x$class=='a')) / length(which(x$sample1 < mean(x$sample1))) I'd normally propagate it across the data frame using apply, but I can't because it depends on more than one column. Any help much appreciated! Cheers, Simon [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.