# Let's say your expression data is in a matrix
# named expression in which the rows are genes
# and the columns are samples
# per-gene variance across samples (names are taken from the row names)
myvars <- apply(expression, 1, var, na.rm=TRUE)
# largest variances first
myvars <- sort(myvars, decreasing=TRUE)
myvars <- myvars[1:200]
# subset by gene name; this requires that expression has row names
expression <- expression[names(myvars), ]
dim(expression)

Also check out the genefilter package in Bioconductor. You may find that the Bioconductor mailing list is better for questions like this one.

On Tue, Jun 7, 2011 at 9:47 AM, GIS Visitor 33 <gis...@gis.a-star.edu.sg> wrote:
> Hi
>
> I have a problem for which I would like to know a solution. I have a gene
> expression data and I would like to choose only lets say top 200 genes that
> had the highest expression variance across patients.
>
> How do i do this in R?
>
> I tried x=apply(leukemiadata,1,var)
> x1=x[order(-1*x)]
>
> but the problem here is x and x1 are numeric data , If I choose the first
> 200 after sorting in descending, so I do not know how to choose the
> associated samples with just the numeric values.
>
> Kindly help!
>
>
> Regards
> Ap
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
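
P.S. In case it helps, here is a self-contained sketch of the same idea on simulated data. It uses order() to get row indices instead of sorted values, which I think is the piece missing from the original attempt, and it should also work when the matrix has no row names. The simulated matrix and the names gene_vars, n_top, top_idx and top_expression are made up for illustration; substitute your own data.

```r
# Simulated data (an assumption for illustration only):
# 1000 hypothetical genes (rows) x 20 samples (columns)
set.seed(1)
expression <- matrix(rnorm(1000 * 20), nrow = 1000,
                     dimnames = list(paste0("gene", 1:1000),
                                     paste0("sample", 1:20)))

# Variance of each gene across samples
gene_vars <- apply(expression, 1, var, na.rm = TRUE)

# order() returns the row positions sorted by decreasing variance,
# so the link back to the rows of the matrix is never lost
n_top <- min(200, nrow(expression))
top_idx <- order(gene_vars, decreasing = TRUE)[seq_len(n_top)]

# Keep all samples (columns) for the selected genes
top_expression <- expression[top_idx, , drop = FALSE]
dim(top_expression)
```

Genes whose variance is NA (for example, rows with fewer than two non-missing values) are placed last by order(), so they are only selected if fewer than n_top genes have a defined variance.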