Hi Pan,

On Wed, May 14, 2014 at 9:14 PM, Panos Bolan <panbo...@hotmail.com> wrote:
> Dear list,
>
> Apologies for posting this to both Bioconductor and here. I recently read a
> Bioconductor post where the developer of the WGCNA suggested the use of the
> package for RNA-seq data analysis after implementing a variance
> stabilization normalization to the raw counts. I have read the tutorials and
> run the example dataset at
> http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html.
> I would like to apply WGCNA to my RNA-seq data consisting of 1000
> transcripts whose expression is measured for 50 triplicated cell types
> (approximately 150 samples) and derive networks.
>
> I would like to ask if WGCNA can be used successfully in this kind of
> heterogeneous dataset

My standard answer to this question is that the success of WGCNA (or
any other analysis, for that matter) depends on the design of your
experiment and on the question you want to answer. Are the inter-line
differences going to help you answer the question, or will they
confound it?

> where for most of the transcripts, the various cell
> types expression patterns might differ substantially (so that a variance
> stabilizing transformation will not give me approximately normal
> distribution for each transcript; it would rather be a mixture of normal
> distributions).

A normal distribution is not a prerequisite for WGCNA or, indeed, for
any other linear model-based analysis (correlation can be thought of
as one of the statistics arising from a linear model). Linear models
do not assume that the variables themselves are normally distributed,
only that the residuals are normally distributed and have the same
variance (this is where VST comes in).

HTH,

Peter

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
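
P.S. To make the point about residuals concrete, here is a minimal sketch in Python (standard library only). Everything in it is an assumption made up for illustration: the two hypothetical cell types, their means, the sample size, and the use of excess kurtosis as a crude check of normality. It simulates one transcript whose pooled values form a mixture of two normal distributions, then shows that the residuals around the per-cell-type means look normal even though the pooled values do not.

```python
import random
import statistics


def excess_kurtosis(values):
    """Excess kurtosis: near 0 for normal data, clearly negative for a
    well-separated two-component mixture."""
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    fourth = statistics.fmean([(v - mean) ** 4 for v in values])
    return fourth / var ** 2 - 3.0


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical per-cell-type mean expression (after a VST); both types
# share the same residual standard deviation of 1.
type_means = {"type_A": 2.0, "type_B": 10.0}
n_per_type = 2000

expression = {
    name: [rng.gauss(mu, 1.0) for _ in range(n_per_type)]
    for name, mu in type_means.items()
}

# Pooled values across cell types: a bimodal mixture, not normal.
pooled = [v for values in expression.values() for v in values]

# Residuals of the simplest linear model, value ~ cell type:
# each value minus the mean of its own cell type.
residuals = []
for values in expression.values():
    group_mean = statistics.fmean(values)
    residuals.extend(v - group_mean for v in values)

pooled_kurtosis = excess_kurtosis(pooled)
residual_kurtosis = excess_kurtosis(residuals)

print("excess kurtosis, pooled values:", round(pooled_kurtosis, 2))
print("excess kurtosis, residuals:    ", round(residual_kurtosis, 2))
```

The pooled values come out strongly non-normal while the residuals do not, which is the sense in which a mixture of normals across cell types need not violate the assumptions of a linear model.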