Hi, I'm trying to compare two CSV files (and many more like these below). I tried many ways, using lists, dictreader and more but nothing gave me the output I require. I want to compare all those rows that have same !Sample_title and !Sample_geo_accession values (whose positions vary). I've been struggling with this for three days now and couldn't come to a solution. I highly appreciate any help.
CSV1: !Sample_title,!Sample_geo_accession,!Sample_status,!Sample_type,!Sample_source_name_ch1 body,GSM501443,Public on july 22 2010,ribonucleic acid,FB_50_12wk foreign,GSM501445,Public on july 22 2010,ribonucleic acid,FB_0_12wk HJCENV,GSM501446,Public on july 22 2010,ribonucleic acid,FB_50_12wk AsDW,GSM501444,Public on july 22 2010,ribonucleic acid,FB_0_12wk CSV2: !Sample_title,!Sample_type,!Sample_source_name_ch1,!Sample_geo_accession AsDW,ribonucleic acid,FB_0,GSM501444 foreign,ribonucleic acid,FB,GSM501449 HJCENV,RNA,12wk,GSM501446 Desired output (with respect to CSV2): Added: {!Sample_status:{HJCENV:Public on july 22 2010,AsDW:Public on july 22 2010}} #Added columns, not rows. Deleted: {} #Since nothing's deleted with respect to CSV2 Changed: {!Sample_title:AsDW,!Sample_source_name_ch1:(FB_0_12wk,FB_0),!Sample_geo_accession:GSM501444 !Sample_title:HJCENV,!Sample_type:(ribonucleic acid,RNA),!Sample_source_name_ch1:(FB_50_12wk,12wk),!Sample_geo_accession:GSM501446} #foreign,ribonucleic acid,FB,GSM501449 doesn't come here since the !Sample_geo_accession column value didn't match. -AN _______________________________________________ Tutor maillist - Tutor@python.org To unsubscribe or change subscription options: https://mail.python.org/mailman/listinfo/tutor