Hi Amy, You might repost this question to the Bioconductor mailing list: you'll get more specialized help there. But to answer your questions as best I can see inline:
On Fri, Mar 30, 2012 at 12:15 PM, Minyue Wang <mwa...@ncsu.edu> wrote: > Hi all, > My name is Amy, I am a masters student in Bioinformatics at North Carolina > State University. I am working on a project and I am trying to use the lumi > R package for microarray data analysis. I have shown the sample code here > and have questions about modifying the sample code for my own data. > > > lumi package in R, example.lumi, the sample data has 8000 features and 4 > samples > > I have highlighted the code I have questions on in red, Well, the mail server strips HTML so we can't see those, but I think I know where you are talking about. > my data has 4 > different types of samples, each repeated 6 times, so a total of 24 samples > and about 48,000 rows. how should I identify my sampleType in my case? also > what does colnames(design) <- c('100:0', '95:5-100:0') do, which columns > exactly does it take into consideration? Thanks! This will hit all the columns -- however, if you don't give enough names, it throws an error. For example, x <- matrix(1:6, nrow = 2); colnames(x) <- letters[1:3]; rownames(x) <-LETTERS[1:2] print(x) colnames(x) <- "A" # Problem! colnames(x) <- c("X","Y", "Z") # Good to go. > > > so the sample code i'm trying to follow is below: > > ################################################### > > ### code chunk number 30: filtering > > ################################################### > > presentCount <- detectionCall(example.lumi) > > selDataMatrix <- dataMatrix[presentCount > 0,] > > probeList <- rownames(selDataMatrix) > > > > > > ################################################### > > ### code chunk number 31: Identify differentially expressed genes > > ################################################### > > ## Specify the sample type > > sampleType <- c('100:0', '95:5', '100:0', '95:5') I'm not sure where you got this code, but if I were using it, I'd rewrite it as: sampleType <- rep(c("100:0", "95:5"), 2) and then you can just change those to specify sampleType (after making my other change below) > > if (require(limma)) { > > ## compare '95:5' and '100:0' > > design <- model.matrix(~ factor(sampleType)) > > colnames(design) <- c('100:0', '95:5-100:0') colnames(design) <- sampleType[1:2] > > fit <- lmFit(selDataMatrix, design) > > fit <- eBayes(fit) > > ## Add gene symbols to gene properties > > if (require(lumiHumanAll.db) & require(annotate)) { > > geneSymbol <- getSYMBOL(probeList, 'lumiHumanAll.db') > > geneName <- sapply(lookUp(probeList, 'lumiHumanAll.db', > 'GENENAME'), function(x) x[1]) > > fit$genes <- data.frame(ID= probeList, > geneSymbol=geneSymbol, geneName=geneName, stringsAsFactors=FALSE) > > } > > ## print the top 10 genes > > print(topTable(fit, coef=colnames(design), adjust='fdr', > number=10)) Change here too! Hope that helps, [But no promises it's perfect -- i'm not a geneticist!] Michael > > > > ## get significant gene list with FDR adjusted p.values > less than 0.01 > > p.adj <- > p.adjust(fit$p.value[,2]) > > sigGene.adj <- probeList[ p.adj < 0.01] > > ## without FDR adjustment > > sigGene <- probeList[ fit$p.value[,2] < 0.01] > > } > > > > -- > - Amy W. > > -- > Minyue Wang (Amy) > Graduate student, Bioinformatics > mwa...@ncsu.edu > 919-5210893 > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.