Hi,

Thanks in advance for any advice you can give me. I am very stumped on this problem.
I use R every day and consider myself a confident user, but this seems to be an elementary problem.

Outline of problem: I am analysing the results of a study on protein expression in cancer tissues. I have raw intensities from 2 different types of cancer and from normal tissue, which can be taken from several different parts of the cell, as well as patient information. Part of the analysis calls for a fold-change calculation. To do this I am subsetting the dataset by cancer type, merging each cancer dataset with the data from the Normal tissue, then calculating fold change for matching individuals and cell sections.

The problem is that I have been tracking one factor in particular ('branch', values 2 or 3), and once the final merge occurs, one of its two values (2) seems to disappear from the last dataset, even though it was present before. See code & output below:

> dim(tma)
> names(tma)
[1] "Code"       "marker"     "cell"       "tumourA"    "tumourEXP"  "int"        "stain"      "tumourPERC" "branch"
> levels(tma$tumourA)
[1] "DCIS"                       "LN Metastasis"              "Normal"                     "Primary Invasive Carcinoma"

#split into cancer and normal tissue
> tma1<-subset(tma, tumourA=="Primary Invasive Carcinoma")
> tma2<-subset(tma, tumourA=="LN Metastasis")
> tmaN<-subset(tma, tumourA=="Normal")

#size of datasets
> dim(tma1)
[1] 587   9
> dim(tma2)
[1] 323   9
> dim(tmaN)
[1] 142   9

#merge back with normal type
> tma1.1<-merge(tmaN, tma1, by="Code")
> tma2.1<-merge(tmaN, tma2, by="Code")

#new dimensions (seem excessively large)
> dim(tma1.1)
[1] 2439   17
> dim(tma2.1)
[1] 625  17

#progression of "branch" factor in datasets. Note the last one, where it disappears...
> table(tma$branch)

  2   3
450 613
> table(tma1$branch)

  2   3
314 273
> table(tma2$branch)

  2   3
 39 284
> table(tmaN$branch)

 2  3
91 51
> table(tma1.1$branch.x)

   2    3
1806  633
> table(tma2.1$branch.x)

  3
625

Please, can someone tell me what's going on?
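The merge pattern described above can be tried on a small invented dataset. This is only a sketch, not the study data: `tma_toy`, `m_code` and `m_both` are hypothetical names, and the toy rows assume that `Code` is not unique within each subset and that some patients with tumour tissue have no Normal sample. Under those assumptions, `merge()` by `Code` alone pairs every Normal row with every tumour row for the same patient, and `branch.x` (taken from the Normal data frame) only shows values for patients present in both.

```r
# Toy data, invented for illustration only (not the real 'tma' dataset).
tma_toy <- data.frame(
  Code    = c("P1", "P1", "P1", "P1", "P2", "P3", "P3"),
  cell    = c("nuc", "cyt", "nuc", "cyt", "nuc", "nuc", "cyt"),
  tumourA = c("Normal", "Normal", "LN Metastasis", "LN Metastasis",
              "Normal", "LN Metastasis", "LN Metastasis"),
  int     = c(1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5),
  branch  = c(3, 3, 3, 3, 2, 2, 2),
  stringsAsFactors = FALSE
)

tma2 <- subset(tma_toy, tumourA == "LN Metastasis")
tmaN <- subset(tma_toy, tumourA == "Normal")

# Both values of 'branch' are present before merging.
table(tma2$branch)
table(tmaN$branch)

# Merge on Code only: P1 has 2 Normal rows and 2 tumour rows,
# so the result holds 2 x 2 = 4 rows. P2 and P3 drop out because
# merge() keeps only Codes found in both data frames by default.
m_code <- merge(tmaN, tma2, by = "Code")
dim(m_code)
table(m_code$branch.x)

# Merge on Code and cell: one row per matching patient and cell section,
# which is the pairing a fold-change calculation would need.
m_both <- merge(tmaN, tma2, by = c("Code", "cell"))
m_both$fold <- m_both$int.y / m_both$int.x  # tumour intensity / Normal intensity
m_both[, c("Code", "cell", "int.x", "int.y", "fold")]
```

If the real data behave like this toy example, adding `cell` (and possibly `marker`) to `by`, or checking `all.y = TRUE` to see which tumour rows lack a Normal match, may be worth trying.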
Thank you very much,
Zoe van Havre

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.