Hello all,
I have what feels like a simple problem, but I can't find a simple
answer. Consider this data frame:
> x <- data.frame(sample1=c(35,176,182,193,124),
+                 sample2=c(198,176,190,23,15),
+                 sample3=c(12,154,21,191,156),
+                 class=c('a','a','c','b','c'))
> x
  sample1 sample2 sample3 class
1      35     198      12     a
2     176     176     154     a
3     182     190      21     c
4     193      23     191     b
5     124      15     156     c
Now I wish to know: for each sample, among the values below 20% of that
sample's mean, what proportion are class 'a'?
I want to end up with a table like:
  sample1 sample2 sample3
1     1.0       0     0.5
I can calculate this for an individual sample using this rather clumsy
expression:
length(which(x$sample1 < 0.2 * mean(x$sample1) & x$class=='a')) /
length(which(x$sample1 < 0.2 * mean(x$sample1)))
I'd normally propagate it across the data frame using apply, but I
can't because it depends on more than one column.
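For what it's worth, here is a rough sketch of one direction that might
work: loop over the sample columns with sapply and hand the class column
to the function as an extra argument. The helper name prop_below and its
target/frac arguments are my own invention, not anything built in. frac
defaults to 0.2 to match the "20% of the mean" wording; frac = 1 would
compare against the mean itself.

```r
# Example data from this post.
x <- data.frame(sample1 = c(35, 176, 182, 193, 124),
                sample2 = c(198, 176, 190, 23, 15),
                sample3 = c(12, 154, 21, 191, 156),
                class   = c('a', 'a', 'c', 'b', 'c'))

# Hypothetical helper (name and arguments are assumptions):
# proportion of class `target` among the values that fall below
# `frac` times the mean of the column.
prop_below <- function(values, classes, target = 'a', frac = 0.2) {
  below <- values < frac * mean(values)
  # The mean of a logical vector is the proportion of TRUE values;
  # it is NaN when no value is below the threshold.
  mean(classes[below] == target)
}

# Apply the helper to every column except `class`.
sample_cols <- setdiff(names(x), "class")
result <- sapply(x[sample_cols], prop_below, classes = x$class)
result
```

With the data above and frac = 0.2, no sample1 value lies below the
threshold (0.2 * 142 = 28.4), so that column comes out as NaN; sample2
gives 0 and sample3 gives 0.5. as.data.frame(t(result)) should turn the
named vector into a one-row table.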
Any help much appreciated!
Cheers,
Simon
______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.