Re: [Rd] Performing Merge and Duplicated on very large files

2007-04-18 Sean Davis
On Tuesday 17 April 2007 23:44, Eitan Rubin wrote:
> Hi,
>
> I am working with very large matrices (>1 million records), and need to
> 1. Join the files (can be achieved with Merge)
> 2. Find lines that have the same value in some field (after the join) and
> randomly sample 1 row.
>
> I am conce

[Rd] Performing Merge and Duplicated on very large files

2007-04-17 Eitan Rubin
Hi,

I am working with very large matrices (>1 million records), and need to
1. Join the files (can be achieved with Merge)
2. Find lines that have the same value in some field (after the join) and randomly sample 1 row.

I am concerned with the complexity of merge - how (in)efficient is it? I do