On Thu, Aug 21, 2008 at 5:56 AM, Kat <[EMAIL PROTECTED]> wrote:
> I have several input files where in each file, every line has a
> space-separated pair values. The files are essentially tables with two
> columns. There are no duplicates in the first column values within each file,
> but they overlap when all files are considered. I'd like to merge them into
> one file according to values of the first column of each file with values
> from the second column of all files combined
>
> My second idea is to convert each file into a dictionary (since the first
> column's values are unique within each file), then I can create a combined
> dictionary which allows multiple values to each key, then output that. Does
> that sound reasonable?

Yes, as long as the order of entries doesn't matter - a dict does not
preserve order. Make a dict that maps a key to a list of values
(collections.defaultdict is useful for this). Read each file and add its
pairs to the dict. Then iterate the dict and write to a new file.

If you do care about order, there are various implementations of ordered
dictionaries available.

Kent
_______________________________________________
Tutor maillist - Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
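
For illustration, here is a minimal sketch of the approach described above: one defaultdict(list), filled from each input in turn, then written out one key per line. The function names (merge_tables, format_merged, merge_files) and the output layout (key followed by all its values, space-separated) are assumptions made for this example, not something from the thread. It also assumes every non-blank line has exactly two whitespace-separated fields. Note that on Python 3.7 and later a plain dict (and therefore defaultdict) keeps insertion order, so keys come out in the order they were first seen.

```python
from collections import defaultdict


def merge_tables(sources):
    """Merge two-column tables into a mapping of key -> list of values.

    `sources` is an iterable of line iterables, such as open file
    objects or lists of strings. (Hypothetical helper for illustration.)
    """
    merged = defaultdict(list)
    for lines in sources:
        for line in lines:
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            key, value = parts  # assumes exactly two fields per line
            merged[key].append(value)
    return merged


def format_merged(merged):
    """Yield one output line per key: the key, then all of its values."""
    for key, values in merged.items():
        yield " ".join([key] + values)


def merge_files(in_paths, out_path):
    """Read each input file in turn and write the combined table."""

    def open_each():
        # Open one file at a time; each is closed before the next opens.
        for path in in_paths:
            with open(path) as f:
                yield f

    merged = merge_tables(open_each())
    with open(out_path, "w") as out:
        for line in format_merged(merged):
            out.write(line + "\n")
```

Something like merge_files(["a.txt", "b.txt"], "merged.txt") would then produce the combined file (the file names are placeholders). A key that is missing from one of the inputs simply ends up with fewer values; if the columns need to stay aligned per input file, the sketch would have to record a placeholder for the missing entries.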