Re: [R] compare two data frames of different dimensions and only keep unique rows

Arnaud Gaboury Mon, 27 Feb 2012 10:12:03 -0800

No, but I tried your way too.

In fact, the only three unique rows are these ones:


 Product Price Nbr.Lots
   Cocoa  2440        5
   Cocoa  2450        1
   Cocoa  2440        6

Here is a dirty but working trick I found:

> df<-merge(exportfile,reported,all.y=TRUE)
> df1<-merge(exportfile,reported)
> dff<-do.call(paste,df)
> dff1<-do.call(paste,df1)
> df[!dff %in% dff1,]
  Product Price Nbr.Lots
3   Cocoa  2440        5
4   Cocoa  2450        1
 

I have two problems with it: first, I don't think it is very clean code; second, 
I won't know in advance which of my two data frames will have the larger 
dimension (I can add some lines to deal with that, but again, it seems very heavy).

I hoped I could find a better solution.
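
One candidate, offered only as a sketch: the same paste-the-row-into-a-key idea can be applied in both directions, so it no longer matters which data frame is larger. The helper names rows_not_in() and unique_rows() are made up for illustration, and the sketch assumes both data frames have the same columns in the same order and that no value contains the "\r" separator.

```r
# Hypothetical helper (not from the thread): rows of `a` that have no
# exact match in `b`. Each row is collapsed into one string key.
rows_not_in <- function(a, b) {
  key_a <- do.call(paste, c(a, sep = "\r"))
  key_b <- do.call(paste, c(b, sep = "\r"))
  a[!key_a %in% key_b, , drop = FALSE]
}

# Hypothetical helper: rows found in exactly one of the two data frames,
# whichever of them has more rows.
unique_rows <- function(a, b) {
  rbind(rows_not_in(a, b), rows_not_in(b, a))
}

# Intended usage with the objects from this thread:
# unique_rows(reported, exportfile)
```

With reported and exportfile this should return the three Cocoa rows listed at the top of this message. Because the comparison is done on pasted strings, a factor and a character column with the same labels are treated as equal.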


A2CT2 Ltd.


-----Original Message-----
From: jim holtman [mailto:jholt...@gmail.com] 
Sent: Monday, 27 February 2012 18:42
To: Arnaud Gaboury
Cc: r-help@r-project.org
Subject: Re: [R] compare two data frames of different dimensions and only keep 
unique rows

Is this what you want?

> v <- rbind(reported, exportfile)
> v[!duplicated(v), ]
       Product    Price Nbr.Lots
1        Cocoa  2331.00      -61
2        Cocoa  2356.00      -61
3        Cocoa  2440.00        5
4        Cocoa  2450.00        1
6     Coffee C   204.55       40
7     Coffee C   205.45       40
5           GC 17792.00       -1
10 Sugar No 11    24.81       -1
8           ZS  1273.50       -1
9           ZS  1276.25        1
13       Cocoa  2440.00        6
>
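
Note that !duplicated(v) keeps the first copy of every repeated row, so the shared rows are still in the result above. If the goal is to drop both copies, a possible variant (a sketch only) flags duplicates from both ends with the fromLast argument of duplicated(). The function name rows_in_one_only() is made up for illustration, and the sketch assumes neither data frame contains repeated rows of its own, since those would be dropped as well.

```r
# Hypothetical helper (not from the thread): stack both data frames and
# keep only the rows that occur exactly once in the stacked result.
rows_in_one_only <- function(a, b) {
  v <- rbind(a, b)
  # duplicated() marks the 2nd, 3rd, ... copies; fromLast = TRUE marks the
  # earlier copies instead. Together they mark every repeated row.
  dup <- duplicated(v) | duplicated(v, fromLast = TRUE)
  v[!dup, , drop = FALSE]
}

# Intended usage with the objects from this thread:
# rows_in_one_only(reported, exportfile)
```

This does not depend on which data frame has more rows, because the two are stacked before the comparison.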


On Mon, Feb 27, 2012 at 12:36 PM, Arnaud Gaboury <arnaud.gabo...@a2ct2.com> 
wrote:
> Dear list,
>
> I am still struggling with something that should be easy: I compare two data 
> frames with a lot of common rows and want to keep only the rows that are NOT 
> in both data frames, i.e. the unique ones.
>
> Here is an example of these data frames.
>
> reported <-
> structure(list(Product = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 3L, 4L, 
> 5L, 5L), .Label = c("Cocoa", "Coffee C", "GC", "Sugar No 11", "ZS"), 
> class = "factor"), Price = c(2331, 2356, 2440, 2450, 204.55, 205.45, 
> 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = c(-61L, -61L, 5L, 1L, 40L, 
> 40L, -1L, -1L, -1L, 1L)), .Names = c("Product", "Price", "Nbr.Lots"), 
> row.names = c(1L, 2L, 3L, 4L, 6L, 7L, 5L, 10L, 8L, 9L), class = 
> "data.frame")
>
> exportfile <-
> structure(list(Product = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", 
> "Coffee C", "GC", "Sugar No 11", "ZS", "ZS"), Price = c(2331, 2356, 
> 2440, 204.55, 205.45, 17792, 24.81, 1273.5, 1276.25), Nbr.Lots = 
> c(-61, -61, 6, 40, 40, -1, -1, -1, 1)), .Names = c("Product", "Price", 
> "Nbr.Lots"), row.names = c(NA, 9L), class = "data.frame")
>
> I can rbind() them, which gives one data frame with duplicated 
> rows, but I have no idea how to delete the duplicated rows. I have tried 
> playing with unique() and duplicated(), with no success.
>
> v<-rbind(exportfile,reported)
> v <-
> structure(list(Product = c("Cocoa", "Cocoa", "Cocoa", "Coffee C", 
> "Coffee C", "GC", "Sugar No 11", "ZS", "ZS", "Cocoa", "Cocoa", 
> "Cocoa", "Cocoa", "Coffee C", "Coffee C", "GC", "Sugar No 11", "ZS", 
> "ZS"), Price = c(2331, 2356, 2440, 204.55, 205.45, 17792, 24.81, 
> 1273.5, 1276.25, 2331, 2356, 2440, 2450, 204.55, 205.45, 17792, 24.81, 
> 1273.5, 1276.25), Nbr.Lots = c(-61, -61, 6, 40, 40, -1, -1, -1, 1, 
> -61, -61, 5, 1, 40, 40, -1, -1, -1, 1)), .Names = c("Product", 
> "Price", "Nbr.Lots"), row.names = c("1", "2", "3", "4", "5", "6", "7", 
> "8", "9", "11", "21", "31", "41", "61", "71", "51", "10", "81", "91"), 
> class = "data.frame")
>
>
> TY for your help
>
> Arnaud Gaboury
>
> A2CT2 Ltd.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

