On Oct 28, 2011, at 9:49 AM, Ben Ganzfried wrote:

> Hey,
>
> I'm trying to match patient identifiers from two separate input files, and
> then add information from one of the input files to the corresponding output
> file. I'd greatly appreciate any help!
>
> More specifically,
> Input_File_1 has a column header "bcr_patient_barcode"
> Input_File_2 has a column header "Barcode" and a column header "Batch"
>
> I want my script to match the appropriate patient identifiers since
> "bcr_patient_barcode" and "Barcode" are not in the same order. Then I want
> to add the information from "Batch" to the corresponding patient.
>
> My (incorrect) code is below:
>
> #batch
> tmp <- Input_File_2$Barcode
> tmp1 <- Input_File_1$bcr_patient_barcode
>
> for i in tmp
> for item in tmp1
> if (tmp == tmp1) {
> curated$batch <- Input_File_2$Batch
> }
>
> Thanks!

See ?merge and then use something like:

  newDF <- merge(Input_File_2, Input_File_1, by.x = "Barcode", by.y = "bcr_patient_barcode")

Also, pay attention to the 'all', 'all.x' and 'all.y' arguments, which control whether only matching records are retained, or whether non-matching records from one or both datasets are kept as well. merge() performs an "SQL-like" join operation.

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
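
As a rough illustration of the merge() suggestion in the reply above, here is a minimal sketch on made-up data. The barcodes ("P01" etc.) and batch labels are hypothetical stand-ins, since the real input files are not shown in the thread; only the column names come from the original question. It contrasts the default join, which keeps matching records only, with all.x = TRUE, which keeps every row of the first data frame.

```r
# Hypothetical toy data standing in for the two input files;
# only the column names are taken from the original question.
Input_File_1 <- data.frame(
  bcr_patient_barcode = c("P03", "P01", "P02", "P04"),
  stringsAsFactors = FALSE
)
Input_File_2 <- data.frame(
  Barcode = c("P01", "P02", "P03"),
  Batch   = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# Default behaviour: only barcodes found in both data frames are kept.
inner <- merge(Input_File_2, Input_File_1,
               by.x = "Barcode", by.y = "bcr_patient_barcode")

# all.x = TRUE: every patient in Input_File_1 is kept, and Batch is NA
# for patients with no matching Barcode in Input_File_2.
left <- merge(Input_File_1, Input_File_2,
              by.x = "bcr_patient_barcode", by.y = "Barcode",
              all.x = TRUE)

print(inner)
print(left)
```

Note that merge() sorts the result on the key column by default (sort = TRUE), so the rows will generally not come back in the original order of either input.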