On Thu, Aug 21, 2008 at 5:56 AM, Kat <[EMAIL PROTECTED]> wrote:

> I have several input files where, in each file, every line has a
> space-separated pair of values. The files are essentially tables with two
> columns. There are no duplicates in the first column values within each file,
> but they overlap when all files are considered. I'd like to merge them into
> one file, keyed on the first column's values, with the second-column values
> from all files combined.
>
> My second idea is to convert each file into a dictionary (since the first 
> column's values are unique within each file), then I can create a combined 
> dictionary which allows multiple values to each key, then output that. Does 
> that sound reasonable?

Yes, as long as the order of entries doesn't matter - a dict does not
preserve order. Make a dict that maps a key to a list of values
(collections.defaultdict is useful for this). Read each file and add
its pairs to the dict. Then iterate the dict and write to a new file.
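
A minimal sketch of those steps, assuming each line holds exactly two
whitespace-separated fields. The function names merge_files and
write_merged and the choice to skip malformed lines are my own
illustration, not anything from the original question:

```python
from collections import defaultdict


def merge_files(paths):
    """Map each first-column key to the list of second-column values
    collected from every file in paths."""
    merged = defaultdict(list)
    for path in paths:
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) != 2:
                    # Assumption: blank or malformed lines are skipped.
                    continue
                key, value = parts
                merged[key].append(value)
    return merged


def write_merged(merged, out_path):
    """Write one line per key: the key followed by all of its values."""
    with open(out_path, "w") as out:
        for key, values in merged.items():
            out.write(key + " " + " ".join(values) + "\n")
```

Calling merge_files(["a.txt", "b.txt"]) and passing the result to
write_merged would produce one output line per distinct key. The file
names here are placeholders.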

If you do care about order, there are various implementations of
ordered dictionaries available.
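
One such option is collections.OrderedDict, which is in the standard
library from Python 2.7 and 3.1 onward. On Python 3.7 and later a plain
dict also keeps insertion order. A rough sketch, assuming the pairs have
already been read into lists of (key, value) tuples; the name
merge_pairs_ordered is hypothetical:

```python
from collections import OrderedDict


def merge_pairs_ordered(pair_lists):
    """Merge lists of (key, value) pairs, keeping keys in the order
    they are first seen."""
    merged = OrderedDict()
    for pairs in pair_lists:
        for key, value in pairs:
            # setdefault creates the list the first time a key appears.
            merged.setdefault(key, []).append(value)
    return merged
```

Keys then come out in first-seen order when you iterate the result, and
each key's values appear in the order the files were read.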

Kent
_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
