On 04/03/2013 03:17 PM, Atul Kakrana wrote:
Hello All,
I need your help. I am analysing Affymetrix data and have to select the
probe-set that has the median expression among all the probe-sets for the
same gene. This way I want to remove the redundancy by keeping the analysis
at the level of a single entry per gene. I am fully aware that it is not a
nice thing to do but I just have to do it.
To do so, I came across the 'findLargest' function of the 'genefilter'
package, but it's not well documented and I do not know how to use it.
At this point I have:
esetRMA <- rma(mydata)
Could anybody guide me on how I can select the single probe-set with median
expression from the multiple probe-sets corresponding to a single gene and
discard the others? Is there any other way to achieve this, i.e. other than
using 'genefilter'?
Genefilter package:
http://www.bioconductor.org/packages/2.11/bioc/html/genefilter.html
Hi Atul -- it's a Bioconductor package, so you might as well ask on the
Bioconductor mailing list instead
http://bioconductor.org/help/mailing-list/
As a reproducible example, load the "ALL" sample ExpressionSet and the
Biobase and genefilter packages
library(Biobase)
library(ALL)
library(genefilter)
The three arguments to findLargest are the names of the probe sets
featureNames(ALL)
the test statistic
rowMedians(ALL)
and the chip on which the ExpressionSet is based
annotation(ALL)
So the variable
idx = findLargest(featureNames(ALL), rowMedians(ALL), annotation(ALL))
identifies the probes and
ALL1 = ALL[idx,]
gets you the data you're interested in.
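To make the selection step concrete, here is a minimal base-R sketch of the same idea on made-up data. It is not genefilter's implementation: the probe and gene names are hypothetical, and the probe-to-gene mapping is written out by hand, whereas findLargest takes it from the chip annotation. The second part is a variant, not something findLargest does: it keeps the probe set whose statistic ranks in the middle within its gene, which is closer to what the original question describes.

```r
# Toy data (hypothetical names): 6 probe sets mapped to 3 genes
probe <- c("p1", "p2", "p3", "p4", "p5", "p6")
gene  <- c("gA", "gA", "gA", "gB", "gB", "gC")
stat  <- c(5.1, 7.3, 6.0, 4.2, 3.9, 8.8)  # e.g. per-probe-set row medians

# Positions of the probe sets belonging to each gene
idx_by_gene <- split(seq_along(stat), gene)

# Per gene, keep the probe set with the largest statistic
largest <- vapply(idx_by_gene,
                  function(i) probe[i][which.max(stat[i])],
                  character(1))

# Variant: per gene, keep the probe set whose statistic is the middle one
# (the lower of the two middle values when the gene has an even count)
middle <- vapply(idx_by_gene,
                 function(i) {
                   o <- i[order(stat[i])]          # positions, ascending by statistic
                   probe[o[ceiling(length(o) / 2)]]
                 },
                 character(1))

print(largest)
print(middle)
```

Either result is a character vector of probe-set names, named by gene, so it can be used to subset an ExpressionSet by row in the same way as idx above.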
Again, follow-up questions should go to the Bioconductor mailing list.
Martin
Thanks
AK
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help