on 06/19/2008 09:59 AM Gundala Viswanath wrote:
Hi,
I have the following dataset (simplified for example).
__DATA__
300.35 200.25 104.30
22.00 31.12 89.99
444.50 22.10 43.00
22.10 200.55 66.77
Now from that I wish to do the following:
1. Compute variance of each row
2. Pick the top 2 rows with the highest variance
3. Store those selected rows for further processing
To achieve this, I tried to: a) read the table and compute the
variance for each row, b) append the variance to its original
row in a vector, c) store each vector in a multidimensional array (matrix),
d) sort that array. But I am stuck at step (b).
Can anybody suggest what's the best way to achieve
my aim above?
This is the sample code I have so far (not working).
__BEGIN__
#data <- read.table("testdata.txt")
# Is this a right way to initialize?
all.arr = NULL
for (gi in 1:nofrow) {
gex <- as.vector(data.matrix(data[gi,],rownames.force=FALSE))
#compute variance
gexvar <- var(gex)
# join variance with its original vector
nvec <- c(gexvar,gex)
# I'm stuck here.....This doesn't seem to work
all.arr <- data.frame(nvec)
}
print(all.arr)
__END__
--
If your data is contained in a data frame 'DF':
> DF
      V1     V2     V3
1 300.35 200.25 104.30
2  22.00  31.12  89.99
3 444.50  22.10  43.00
4  22.10 200.55  66.77
# Get row-wise variances and cbind() them to DF
> DF.var <- cbind(DF, var = apply(DF, 1, var, na.rm = TRUE))
> DF.var
      V1     V2     V3       var
1 300.35 200.25 104.30  9610.336
2  22.00  31.12  89.99  1361.915
3 444.50  22.10  43.00 56676.803
4  22.10 200.55  66.77  8622.817
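As a sanity check on the 'var' column, here is a small sketch that recomputes the variance of row 1 by hand. It assumes the usual sample variance (divisor n - 1), which is what var() uses; the names 'x' and 'manual_var' are only for illustration.

```r
# Row 1 of the example data
x <- c(300.35, 200.25, 104.30)

# Sample variance: sum of squared deviations from the mean, divided by n - 1
manual_var <- sum((x - mean(x))^2) / (length(x) - 1)

# Compare with the built-in var(); all.equal() allows for floating-point error
all.equal(manual_var, var(x))
```

The result should match the first entry of the 'var' column above, up to rounding in the printed output.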
# Sort DF by 'var' using order()
> DF.var[order(DF.var$var, decreasing = TRUE), ]
      V1     V2     V3       var
3 444.50  22.10  43.00 56676.803
1 300.35 200.25 104.30  9610.336
4  22.10 200.55  66.77  8622.817
2  22.00  31.12  89.99  1361.915
To get the top 2, you can take a couple of approaches:
> DF.var[order(DF.var$var, decreasing = TRUE)[1:2], ]
      V1     V2    V3       var
3 444.50  22.10  43.0 56676.803
1 300.35 200.25 104.3  9610.336
or
> head(DF.var[order(DF.var$var, decreasing = TRUE), ], 2)
      V1     V2    V3       var
3 444.50  22.10  43.0 56676.803
1 300.35 200.25 104.3  9610.336
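For step 3 of the question (storing the selected rows for further processing), the pieces above could be put together roughly as follows. This is only a sketch: the data frame is rebuilt inline rather than read from a file, and the object names 'top2' and 'top2.data' are made up for the example.

```r
# Rebuild the example data (V1..V3 are the default names read.table() assigns)
DF <- data.frame(V1 = c(300.35, 22.00, 444.50, 22.10),
                 V2 = c(200.25, 31.12, 22.10, 200.55),
                 V3 = c(104.30, 89.99, 43.00, 66.77))

# Row-wise variances, appended as a new column
DF.var <- cbind(DF, var = apply(DF, 1, var, na.rm = TRUE))

# Keep the two rows with the highest variance
top2 <- head(DF.var[order(DF.var$var, decreasing = TRUE), ], 2)

# Drop the 'var' column if only the original values are needed downstream
top2.data <- top2[, names(DF)]
```

Both 'top2' and 'top2.data' keep the original row names, so the selected rows can still be traced back to their positions in DF.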
See ?cbind, ?apply, ?order and ?head for more information.
HTH,
Marc Schwartz