Is this what you want?

> ds1 <- read.table(text = "1 A
+ 2 B
+ 3 X
+ 4 AA
+ 5 A
+ 6 D
+ 7 XA
+ 8 C", as.is = TRUE)
>
> ds2 <- read.table(text = "1 A
+ 2 X
+ 3 A", as.is = TRUE)
>
> # keep the rows of ds1 whose V2 has no exact match in ds2
> ds3 <- ds1[!(ds1$V2 %in% ds2$V2), ]
> ds3
  V1 V2
2  2  B
4  4 AA
6  6  D
7  7 XA
8  8  C
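
Since you describe the real data as a single column of paths, the same idea can be sketched on plain character vectors. This is only a minimal sketch under that assumption; the names big and small are hypothetical stand-ins for your two datasets (which you could read with readLines()).

```r
# Hypothetical one-column data standing in for the extracted paths
big   <- c("A", "B", "X", "AA", "A", "D", "XA", "C")
small <- c("A", "X", "A")

# Keep every element of big that has no exact match in small;
# duplicates in big that survive the filter are kept
remaining <- big[!(big %in% small)]

# setdiff() returns the same values here, but it also drops duplicates
remaining_unique <- setdiff(big, small)
```

Both %in% and setdiff() compare whole strings, so "AA" and "XA" are kept even though they contain "A" and "X". Use the %in% form if repeated paths in the larger dataset should survive, and setdiff() if you only need the distinct ones.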


On Mon, Feb 27, 2012 at 6:17 AM, Jonas Fransson <j...@iva.dk> wrote:
> Dear all,
>
> I want to delete the exact matches in a large dataset based on a smaller
> dataset. In other words, I want to subtract the smaller dataset from the
> larger one. The smaller dataset is a part of the larger one. The datasets
> contain hundreds of thousands of lines (1 column), and the content of each
> line differs in length. The data are paths extracted from web logs.
>
> On an abstract level I want to subtract dataset2 from dataset1 to get 
> dataset3:
>
> dataset1:
> 1 A
> 2 B
> 3 X
> 4 AA
> 5 A
> 6 D
> 7 XA
> 8 C
>
> dataset2:
> 1 A
> 2 X
> 3 A
>
> dataset3:
> 1 B
> 2 AA
> 3 D
> 4 XA
> 5 C
>
> The final order in dataset3 is not important.
>
> Thanks,
>
> Jonas Fransson
> Ph.D.stud.
>
> IVA / Det Informationsvidenskabelige Akademi
> Royal School of Library and Information Science
> Birketinget 6
> DK-2300 Copenhagen S
> T +45 32 58 60 66
> D +45 32 34 15 10
> www.iva.dk/jf
>
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

