Re: [R] quick matching question

Marc Schwartz Fri, 28 Oct 2011 08:01:51 -0700

On Oct 28, 2011, at 9:49 AM, Ben Ganzfried wrote:

> Hey,
> 
> I'm trying to match patient identifiers from two separate input files, and
> then add information from one of the input files to the corresponding output
> file.  I'd greatly appreciate any help!
> 
> More specifically,
> Input_File_1 has a column header "bcr_patient_barcode"
> Input_File_2 has a column header "Barcode" and a column header "Batch"
> 
> I want my script to match the appropriate patient identifiers since
> "bcr_patient_barcode" and "Barcode" are not in the same order.  Then I want
> to add the information from "Batch" to the corresponding patient.
> 
> My (incorrect) code is below:
> 
> #batch
> tmp <- Input_File_2$Barcode
> tmp1 <- Input_File_1$bcr_patient_barcode
> 
> for i in tmp
> for item in tmp1
> if (tmp == tmp1) {
>  curated$batch <- Input_File_2$Batch
> }
> 
> Thanks!



See ?merge and then use something like:

  newDF <- merge(Input_File_2, Input_File_1,
                 by.x = "Barcode", by.y = "bcr_patient_barcode")

Also, pay attention to the 'all', 'all.x' and 'all.y' arguments, which control 
whether only matching records are retained (the default) or whether 
non-matching records from one or both datasets are retained as well. merge() 
performs an "SQL-like" join operation.
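
To make the effect of those arguments concrete, here is a small sketch on toy 
data. The barcodes, the 'age' column and the Batch values are made up for 
illustration; only the column names "bcr_patient_barcode", "Barcode" and 
"Batch" come from your description, so adjust it to your real files:

```r
# Toy stand-ins for the two input files (all values are hypothetical).
Input_File_1 <- data.frame(bcr_patient_barcode = c("P3", "P1", "P4"),
                           age = c(61, 54, 47),   # hypothetical extra column
                           stringsAsFactors = FALSE)
Input_File_2 <- data.frame(Barcode = c("P1", "P2", "P3"),
                           Batch   = c(10, 20, 30),
                           stringsAsFactors = FALSE)

# Default (all = FALSE): keep only barcodes present in both files.
inner <- merge(Input_File_2, Input_File_1,
               by.x = "Barcode", by.y = "bcr_patient_barcode")

# all.y = TRUE: keep every patient from Input_File_1 (the 'y' data frame);
# patients with no match in Input_File_2 get NA for Batch.
keepAll1 <- merge(Input_File_2, Input_File_1,
                  by.x = "Barcode", by.y = "bcr_patient_barcode",
                  all.y = TRUE)

# all = TRUE: keep unmatched rows from both files.
outer <- merge(Input_File_2, Input_File_1,
               by.x = "Barcode", by.y = "bcr_patient_barcode",
               all = TRUE)
```

If the goal is to keep every patient in Input_File_1 whether or not a Batch 
was recorded, the 'all.y = TRUE' form is probably the one you want here, 
since Input_File_1 is passed as the second ('y') data frame.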

HTH,

Marc Schwartz

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
