Yes, thanks Henrik. I neglected to mention that rowMedians could just
be plugged in instead of apply (..,1,...)
However, my main point is that that's probably not what matters, as
Benno points out. Maybe it's the data frames instead of the matrices,
but the process should execute in a few seco
On Wed, May 23, 2012 at 11:54 AM, peter dalgaard wrote:
On May 23, 2012, at 19:30 , Preeti wrote:
> Hmm.. that is interesting... I did this on our server machine which has
> about 200 cores. So memory is not an issue. Also, building the dataframe
> takes about a few minutes maximum for me. My code is similar to yours but
> for the fact that I create m
Just adding a few cents to this:
rowMedians(x) is roughly 4-10 times faster than apply(x, MARGIN=1,
FUN=median) - at least on my local Windows 7 64bit tests. You can do
these simple benchmark runs yourself via the
matrixStats/tests/rowMedians.R system test, cf. http://goo.gl/YCJed
[R-forge].
/He
Hmm.. that is interesting... I did this on our server machine which has
about 200 cores. So memory is not an issue. Also, building the dataframe
takes about a few minutes maximum for me. My code is similar to yours but
for the fact that I create my dataframe from read.delim("filename") and
then I d
I wonder how you do this (or maybe on what kind of machine you execute it).
I tried it out of curiosity and get
> df = as.data.frame(lapply(1:300,function(x)sample(200,25,T)))
> colnames(df) = sample(letters[1:20],300,T)
> system.time(dfmed<-lapply(unique(colnames(df)), function(x)
+ rowMedia
Assuming your original matrix IS a matrix, call it yourmat, and not a
data frame (whose columns **must** have unique names if you haven't
messed with the check.names default) then maybe:
UNTESTED!!! ###
thenames <- unique(dimnames(yourmat)[[2]])
ans <- lapply(thenames, function(nm) {
apply
Hello Everybody,
The code:
dfmed<-lapply(unique(colnames(df)), function(x)
rowMedians(as.matrix(df[,colnames(df) == x]),na.rm=TRUE))
takes a really long time to execute (hours). Is there a faster way to do
this?
Thanks!
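One possible way to restructure the one-liner above, offered only as a rough sketch: work on a matrix throughout, group the column indices by name once, and take row medians per group, so nothing is converted with as.matrix() inside the loop. The small matrix `m`, the helper `row_med`, and the names `groups` and `med` are made up for illustration; the fallback to apply() is only there so the sketch runs without matrixStats installed. Whether this actually removes the slowdown reported in this thread is an assumption, not something measured here.

```r
# Hypothetical small stand-in for the 250,000 x 300 data (all names illustrative).
set.seed(1)
m <- matrix(sample(200, 1000 * 30, replace = TRUE), nrow = 1000)
colnames(m) <- sample(letters[1:5], 30, replace = TRUE)

# Use matrixStats::rowMedians when the package is available,
# otherwise fall back to the slower base-R apply().
row_med <- if (requireNamespace("matrixStats", quietly = TRUE)) {
  function(x) matrixStats::rowMedians(x, na.rm = TRUE)
} else {
  function(x) apply(x, 1, median, na.rm = TRUE)
}

# Group the column indices by column name once.
groups <- split(seq_len(ncol(m)), colnames(m))

# One column of row medians per distinct column name;
# drop = FALSE keeps single-column groups as matrices.
med <- sapply(groups, function(idx) row_med(m[, idx, drop = FALSE]))
dim(med)
```

For a data frame read with read.delim(), a single as.matrix() call up front would give the matrix this sketch starts from, provided all columns are numeric.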
On Tue, May 22, 2012 at 3:46 PM, Preeti wrote:
Thanks Henrik! Here is the one-liner that I wrote:
dfmed<-lapply(unique(colnames(df)), function(x)
rowMedians(as.matrix(df[,colnames(df) == x]),na.rm=TRUE))
Thanks again!
On Tue, May 22, 2012 at 3:23 PM, Henrik Bengtsson wrote:
See rowMedians() of the matrixStats package for replacing apply(x,
MARGIN=1, FUN=median). /Henrik
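To illustrate the replacement suggested above, here is a minimal sketch that computes row medians both ways on a made-up matrix and checks that the results agree. The matrix `x` and its size are arbitrary assumptions, timings will differ by machine, and the rowMedians() branch only runs if matrixStats is installed.

```r
# Arbitrary test matrix; the dimensions are illustrative only.
set.seed(42)
x <- matrix(rnorm(2000 * 50), nrow = 2000)

# Base-R approach: one median() call per row.
t_apply <- system.time(a <- apply(x, MARGIN = 1, FUN = median))

if (requireNamespace("matrixStats", quietly = TRUE)) {
  # Same computation via matrixStats.
  t_rm <- system.time(b <- matrixStats::rowMedians(x))

  # Both approaches should give the same medians.
  stopifnot(isTRUE(all.equal(a, b)))

  # Elapsed seconds for each approach (values depend on the machine).
  print(rbind(apply = t_apply, rowMedians = t_rm)[, "elapsed"])
}
```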
On Tue, May 22, 2012 at 12:34 PM, Preeti wrote:
On Tue, May 22, 2012 at 01:34:45PM -0600, Preeti wrote:
Hi,
I have a 250,000 by 300 matrix. I am trying to calculate the median of
those columns (by row) with column names that are identical. I would like
this to be efficient since apply(x,1,median) where x is created by choosing
only those columns with same column name and looping on this is taking a