Re: [R] Merging data frames on two conditions

2010-04-06 Thread Gabor Grothendieck
Yes, indexing will typically make a large difference. On Tue, Apr 6, 2010 at 3:54 PM, Abhishek Pratap wrote: > Hi Guys > > I have two data frames which I would like to merge on two conditions. > > I am doing the following  (abstract form) > > new.data.frame <- merge(df1,df2, by=c("Col1","Col2"))

Re: [R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
You found the error: the two data frames use different naming conventions for chr. I should be able to fix that pretty easily. If the problem persists, I will contact the list. Thanks! -Abhi On Tue, Apr 6, 2010 at 5:01 PM, David Winsemius wrote: > OK, not the SNP's. So look at the "chr"'s. I will bet that you g

Re: [R] Merging data frames on two conditions

2010-04-06 Thread David Winsemius
OK, not the SNP's. So look at the "chr"'s. I will bet that you get 0 when you try : length(intersect(data_lane6_snps$chr, data_lane6_snps_rsid$chr)) ... since one is using a format of "chrNN" and the other is using just "NN". You need to get the chromosome naming convention straightened out
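A minimal sketch of the fix suggested above, using small made-up data frames (`snps` and `rsid` are hypothetical stand-ins for data_lane6_snps and data_lane6_snps_rsid, and the values are invented). It assumes the only difference is a leading "chr" prefix, which is what the message suggests but which should be checked against the real data:

```r
# Hypothetical toy data: one frame uses "chrNN", the other plain "NN".
snps <- data.frame(SNP = c(101L, 202L, 303L),
                   chr = c("chr1", "chr2", "chrX"),
                   depth = c(12L, 30L, 8L),
                   stringsAsFactors = FALSE)
rsid <- data.frame(SNP = c(101L, 202L, 404L),
                   chr = c("1", "2", "X"),
                   rsid = c("rs11", "rs22", "rs44"),
                   stringsAsFactors = FALSE)

# The check from the message: no chromosome labels in common.
chr_overlap <- length(intersect(snps$chr, rsid$chr))

# Merging on both keys therefore yields no rows.
before <- merge(snps, rsid, by = c("SNP", "chr"))

# Strip a leading "chr" from both sides (as.character() also covers factors).
snps$chr <- sub("^chr", "", as.character(snps$chr))
rsid$chr <- sub("^chr", "", as.character(rsid$chr))

# Now rows match only where both SNP and chr agree.
after <- merge(snps, rsid, by = c("SNP", "chr"))
```

Normalizing both columns, rather than only the one that looks wrong, keeps the step harmless if it is run again or if the prefix turns out to be mixed within a column.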

Re: [R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
Just so you know, length(intersect(data_lane6_snps$SNP, data_lane6_snps_rsid$SNP)) returns 796120. I just need to add the chr condition, which is where I am stuck. -Abhi On Tue, Apr 6, 2010 at 4:51 PM, Abhishek Pratap wrote: > Hi David > > I can understand looking the SNP data values it can be felt that t

Re: [R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
Hi David, I can see how, looking at the SNP values, it might seem that they differ and that this explains the empty merge. However, the two columns still have ~700K SNPs in common. What I am looking for is a merge where both SNP and chr match. If I match only the SNP column I get partially correct

Re: [R] Merging data frames on two conditions

2010-04-06 Thread David Winsemius
On Apr 6, 2010, at 4:03 PM, Abhishek Pratap wrote: Hi David Here it is. You can ignore the bio jargon if it sounds confusing. Sometimes it is essential to have domain details. The corresponding data type of column (SNP, chr) on which I am applying merge is same. merge(data_lane6_snps, d

Re: [R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
And I should also add that if I merge on only one column it works fine, but the result is not what I want. merge(data_lane6_snps, data_lane6_snps_rsid, by = c("SNP")) works as expected. Is the "chr" column being a factor causing the problem here? -A On Tue, Apr 6, 2010 at 4:03 PM, Abhishek Pratap
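On the factor question, a small sketch with invented data (`a` and `b` are hypothetical) suggests that factor keys are not a problem in themselves: as far as I know, merge() compares factor keys by their labels, so two factors with different level sets still match wherever the labels agree. What matters is whether the labels are the same strings.

```r
# Hypothetical toy data: chr is a factor in both frames, with different levels.
a <- data.frame(SNP = c(100L, 200L, 300L),
                chr = factor(c("1", "2", "X")),
                cov = c(10, 20, 30))
b <- data.frame(SNP = c(100L, 300L, 400L),
                chr = factor(c("1", "X", "Y")),
                rsid = c("rs1", "rs3", "rs4"),
                stringsAsFactors = FALSE)

# The level sets differ, but the labels "1" and "X" are shared.
shared_labels <- intersect(levels(a$chr), levels(b$chr))

# Rows match where both SNP and the chr label agree.
m <- merge(a, b, by = c("SNP", "chr"))
```

If in doubt, converting both key columns with as.character() before merging removes the question entirely.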

Re: [R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
Hi David, here it is. You can ignore the bio jargon if it sounds confusing. The columns I am merging on (SNP, chr) have the same data type in both data frames. merge(data_lane6_snps, data_lane6_snps_rsid, by = c("SNP", "chr")) str(data_lane6_snps) 'data.frame': 7724462 obs. of 10 variables:

Re: [R] Merging data frames on two conditions

2010-04-06 Thread David Winsemius
On Apr 6, 2010, at 3:54 PM, Abhishek Pratap wrote: Hi Guys I have two data frames which I would like to merge on two conditions. I am doing the following (abstract form) new.data.frame <- merge(df1,df2, by=c("Col1","Col2")) What does str(df1) ; str(df2) ... show? It is giving me a
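Alongside str(), one way to narrow down an empty two-key merge is to count the shared values in each key column separately. The helper below is a hypothetical sketch (`key_overlap` is not part of base R), shown on invented data that reuses the Col1/Col2 names from the abstract example:

```r
# Hypothetical helper: number of distinct values shared per key column.
key_overlap <- function(x, y, keys) {
  vapply(keys, function(k) {
    # Compare as character so factor and character columns line up.
    length(intersect(as.character(x[[k]]), as.character(y[[k]])))
  }, integer(1))
}

# Invented toy data: Col1 overlaps, Col2 uses two different conventions.
df1 <- data.frame(Col1 = c(1, 2, 3),
                  Col2 = c("chr1", "chr2", "chr3"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Col1 = c(2, 3, 4),
                  Col2 = c("2", "3", "4"),
                  stringsAsFactors = FALSE)

overlap <- key_overlap(df1, df2, c("Col1", "Col2"))
```

A zero for any single key column means the combined merge cannot return rows, which points at the column to inspect first.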

[R] Merging data frames on two conditions

2010-04-06 Thread Abhishek Pratap
Hi Guys I have two data frames which I would like to merge on two conditions. I am doing the following (abstract form) new.data.frame <- merge(df1,df2, by=c("Col1","Col2")) It is giving me a null result. Basically I need to apply two conditions. I also tried sqldf but it is running forever.