Re: [R] computing marginal values based on multiple columns?

arun Tue, 04 Dec 2012 09:12:26 -0800

Hi,

I am not sure the output you wanted is correct:


"
sample1 sample2 sample3
1      1.0     0     0.5
"

because
0.2*colMeans(x[,-4])
# sample1 sample2 sample3 
#   28.40   24.08   21.36 
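
To see why, here is a small sketch that counts how many values in each sample fall below that threshold and which classes they belong to. It rebuilds x from your message so it runs on its own; the name `below` is only for illustration.

```r
# Data frame copied from the original message
x <- data.frame(sample1 = c(35, 176, 182, 193, 124),
                sample2 = c(198, 176, 190, 23, 15),
                sample3 = c(12, 154, 21, 191, 156),
                class   = c("a", "a", "c", "b", "c"))

# Logical matrix: TRUE where a value is below 20% of its own column mean
below <- sapply(x[-4], function(y) y < 0.2 * mean(y))

# Number of below-threshold values per sample
colSums(below)

# Classes of the below-threshold rows, per sample
lapply(as.data.frame(below), function(b) as.character(x$class[b]))
```

sample1 has no value below 28.40 (its smallest value is 35), so the ratio for sample1 would be 0/0 rather than 1.0.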


This might help you:
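# For each sample: count the class "a" values below 20% of the sample mean,
# then divide by the number of class "a" rows.
apply(x[-4], 2, function(y) sum(y < 0.2*mean(y) & x$class == "a") /
      sum(x$class == "a"))
# sample1 sample2 sample3 
#     0.0     0.0     0.5 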
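
If the share is instead meant to be taken among the values that fall below the threshold (which is one way to read the question), a sketch along these lines might be closer. It rebuilds x so it runs on its own, and `frac_a` is just an illustrative name, not something from the original post.

```r
# Data frame copied from the original message
x <- data.frame(sample1 = c(35, 176, 182, 193, 124),
                sample2 = c(198, 176, 190, 23, 15),
                sample3 = c(12, 154, 21, 191, 156),
                class   = c("a", "a", "c", "b", "c"))

# sapply() walks over the sample columns; the function can still see
# x$class, so depending on a second column is not a problem.
frac_a <- sapply(x[-4], function(y) {
  below <- y < 0.2 * mean(y)                # values under 20% of the sample mean
  sum(below & x$class == "a") / sum(below)  # share of those in class "a"
})
frac_a
```

Here sample1 comes out as NaN, because none of its values are below the threshold and the ratio is 0/0. Whether that should be reported as NaN, NA or 0 depends on what you need downstream.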
A.K.



----- Original Message -----
From: Simon <simonzm...@gmail.com>
To: r-help@r-project.org
Cc: 
Sent: Tuesday, December 4, 2012 4:49 AM
Subject: [R] computing marginal values based on multiple columns?

Hello all,

I have what feels like a simple problem, but I can't find a simple
answer. Consider this data frame:

> x <- data.frame(sample1=c(35,176,182,193,124),
sample2=c(198,176,190,23,15), sample3=c(12,154,21,191,156),
class=c('a','a','c','b','c'))

> x
  sample1 sample2 sample3 class
1      35     198      12     a
2     176     176     154     a
3     182     190      21     c
4     193      23     191     b
5     124      15     156     c

Now I wish to know: for each sample, for values < 20% of the sample mean,
what percentage of those are class a?

I want to end up with a table like:

   sample1 sample2 sample3
1      1.0     0     0.5

I can calculate this for an individual sample using this rather clumsy
expression:

length(which(x$sample1 < 0.2 * mean(x$sample1) & x$class=='a')) /
length(which(x$sample1 < 0.2 * mean(x$sample1)))

I'd normally propagate it across the data frame using apply, but I
can't because it depends on more than one column.

Any help much appreciated!

Cheers,

Simon


______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
