Dear Daniel, Thank you very much for the help!
I tried the code and got the following: > John<-read.table(file="John_probe.txt",header=TRUE,row.names=NULL,fill=TRUE) > Susan<-read.table(file="Susan_probe.txt",header=TRUE,row.names=NULL,fill=TRUE) > dim(John) [1] 48701 2 > dim(Susan) [1] 46713 2 > dataToMerge=data.frame(John$PROBE_ID2, John$SEARCH_KEY2) > mergedData=merge(Susan,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY2,all.x=T,all.y=F) Error in fix.by(by.x, x) : object "SEARCH_KEY1" not found > I modified the last one by specifying the file name and got : > mergedData=merge(Susan,dataToMerge,by.x=Susan$SEARCH_KEY1,by.y=John$SEARCH_KEY2,all.x=T,all.y=F) Error in fix.by(by.x, x) : 'by' must specify valid column(s) > Have you got any clue about this? Thanks much, Allen On Wed, Oct 8, 2008 at 11:46 AM, Daniel Malter <[EMAIL PROTECTED]> wrote: > dataToMerge=data.frame(yourtablename2$PROBE_ID2, > yourtablename2$SEARCH_KEY2) > ##Puts the two columns of interest in dataset 2 in a separate data frame. > > > mergedData=merge(yourtablename1,dataToMerge,by.x=SEARCH_KEY1,by.y=SEARCH_KEY > 2,all.x=T,all.y=F) > ##merges the first table with the data frame just created looking for > matches between SEARCH_KEY1 and SEARCH_KEY2, all entries in dataset 1 are > retained (whether matched or not), entries in the dataToMerge dataframe > that > do not match any entries in dataset 1 are dropped. > > You then have an additional column in "mergedData" that contains the > PROBE_ID2 and you can just assign them to PROBE_ID1 (i.e. replace PROBE_ID1 > by the values in this column). > > Cheers, > Daniel > > ------------------------- > cuncta stricte discussurus > ------------------------- > > -----Ursprüngliche Nachricht----- > Von: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Im > Auftrag von ss > Gesendet: Wednesday, October 08, 2008 11:33 AM > An: R help > Betreff: [R] How to join the two tables based on one overlapped column > > Dear list, > > I need some clues on this. I have two excel files and I basically want to > map one to the other one. Can you give me some hints how to do it? > > The first excel file, named as "Susan_probe.xls", there are two columns, > "PROBE_ID1" and "SEARCH_KEY1" > > PROBE_ID1 SEARCH_KEY1 ILMN_30212 ILMN_30212 ILMN_1285 ILMN_1285 > ILMN_137964 ILMN_137964 ILMN_138109 ILMN_138109 ... > > The second excel file, named as "John_probe.xls", there are two columns as > well, "PROBE_ID2" and "SEARCH_KEY2". > > PROBE_ID2 SEARCH_KEY2 ILMN_1809034 ILMN_16367 ILMN_1660305 ILMN_16583 > ILMN_1792173 ILMN_19158 ... > > There are 46713 rows in the first excel file and 49702 rows in the second > file. > > Probes in the first columns of two excel files are different but they can > be > matched based on the second column "SEARCH_KEY". So what I want to do is to > substitute the "PROBE_ID1" in the "Susan_probe.xls" file with the > "PROBE_ID2" in the "John_probe.xls" based on their common "SEARCH_KEY". > > Thank you so much for your help. I really appreciate. > > All the best, > Allen > > [[alternative HTML version deleted]] > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]]
______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.