[issue1818] Add named tuple reader to CSV module
Rob Renaud added the comment:

I am totally new to Python dev. I reinvented a NamedTupleReader tonight, only to find out that one was created a year ago. My primary motivation is that DictReader reads headers nicely, but DictWriter totally sucks at handling them.

Consider doing some filtering on a csv file, like so.

    import csv

    sample_data = [
        'title,latitude,longitude',
        'OHO Ofner & Hammecke Reinigungsgesellschaft mbH,48.128265,11.610848',
        'Kitchen Kaboodle,45.544241,-122.715728',
        'Walgreens,28.339727,-81.596367',
        'Gurnigel Pass,46.731944,7.447778'
    ]

    def filter_with_dict_reader_writer():
        accepted_rows = []
        for row in csv.DictReader(sample_data):
            if float(row['latitude']) > 0.0 and float(row['longitude']) > 0.0:
                accepted_rows.append(row)

        # Re-read the first line just to recover the column order.
        field_names = csv.reader(sample_data).next()
        output_writer = csv.DictWriter(open('accepted_by_dict.csv', 'w'), field_names)
        # Write the header by hand as an identity mapping.
        output_writer.writerow(dict(zip(field_names, field_names)))
        output_writer.writerows(accepted_rows)

You have to work so hard to maintain the headers when you write the file with DictWriter. I understand this is a limitation of dicts throwing away the order information. But namedtuples don't have that problem.

NamedTupleReader and NamedTupleWriter should be inverses. This means that NamedTupleWriter needs to write headers. This should produce the same output as the dict writer example, but it's much cleaner.

    def filter_with_named_tuple_reader_writer():
        accepted_rows = []
        for row in csv.NamedTupleReader(sample_data):
            if float(row.latitude) > 0.0 and float(row.longitude) > 0.0:
                accepted_rows.append(row)

        output_writer = csv.NamedTupleWriter(
            open('accepted_by_named_tuple.csv', 'w'))
        output_writer.writerows(accepted_rows)

I patched on top of the existing NamedTupleWriter patch, adding support for writing headers. I don't know if that's bad style/etiquette, etc.
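For readers who want to try the reader/writer symmetry described above, here is a minimal sketch in Python 3. The helper names `namedtuple_reader` and `namedtuple_writer` are hypothetical and are not the classes from the attached patch; the sketch only assumes that the first row holds valid Python identifiers and that the writer takes the header from `_fields`.

```python
import csv
import io
from collections import namedtuple


def namedtuple_reader(lines, typename="Row"):
    """Yield one namedtuple per data row; the first row supplies the field names.

    Hypothetical helper, not the API from the patch on the tracker.
    """
    reader = csv.reader(lines)
    row_type = namedtuple(typename, next(reader))
    for values in reader:
        yield row_type._make(values)


def namedtuple_writer(fileobj, rows):
    """Write namedtuple rows, emitting the header once from the first row's _fields."""
    writer = csv.writer(fileobj)
    header_written = False
    for row in rows:
        if not header_written:
            writer.writerow(row._fields)
            header_written = True
        writer.writerow(row)


sample_data = [
    "title,latitude,longitude",
    "Kitchen Kaboodle,45.544241,-122.715728",
    "Gurnigel Pass,46.731944,7.447778",
]

# Same filter as in the message: keep rows with positive latitude and longitude.
accepted = [
    row for row in namedtuple_reader(sample_data)
    if float(row.latitude) > 0.0 and float(row.longitude) > 0.0
]

out = io.StringIO()
namedtuple_writer(out, accepted)
```

Because the field order lives on the tuple type, the filtering code never has to re-read the first line of the input to recover the header.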
--
nosy: +rrenaud
Added file: http://bugs.python.org/file13187/named_tuple_write_header.patch

___
Python tracker <http://bugs.python.org/issue1818>
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue1818] Add named tuple reader to CSV module
Rob Renaud added the comment:

My previous patch could write the header twice. But I am not sure how the writer should handle the fieldnames parameter on the one hand and namedtuple._fields on the other.

Added file: http://bugs.python.org/file13188/named_tuple_write_header2.patch
[issue1818] Add named tuple reader to CSV module
Changes by Rob Renaud:

Removed file: http://bugs.python.org/file13187/named_tuple_write_header.patch
[issue1818] Add named tuple reader to CSV module
Rob Renaud added the comment:

I want to make sure I understand. Am I correct in believing that Skip thinks writing headers should be optional, while Jervis believes we should leave the burden to the NamedTupleWriter client?

I agree that we should not unconditionally write headers, but I think that we should write headers by default, much like we read them by default. I believe the implicit header writing is very elegant, and the only reason that the DictWriter object doesn't write headers is the impedance mismatch between dicts and CSV. Namedtuples carry the field order information, so the impedance mismatch is gone and we should no longer be hindered. Implicitly reading headers but not implicitly writing them just seems wrong.

It also seems wrong to require the construction of "header" namedtuple objects. It's much less natural than dicts holding identity mappings.

    >>> Point._make(Point._fields)
    Point(x='x', y='y')

That just looks weird and non-obvious to me. That Point instance doesn't really fit in my mind as something that should be a Point.
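To make the objection concrete, the following Python 3 sketch compares the two ways a client could emit the header with a plain `csv.writer`: building a "header" `Point` whose values are its own field names, or writing `Point._fields` directly. The `Point` type and the sample values are illustrative assumptions, not taken from the patch.

```python
import csv
import io
from collections import namedtuple

Point = namedtuple("Point", ["x", "y"])
points = [Point(1, 2), Point(3, 4)]

# Approach 1: a "header" Point whose values are its own field names.
buf_header_tuple = io.StringIO()
w1 = csv.writer(buf_header_tuple)
w1.writerow(Point._make(Point._fields))
w1.writerows(points)

# Approach 2: write the field names directly, without a fake Point.
buf_fields = io.StringIO()
w2 = csv.writer(buf_fields)
w2.writerow(Point._fields)
w2.writerows(points)
```

Both buffers end up with identical contents, so the header tuple adds nothing beyond an odd-looking `Point(x='x', y='y')` instance.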
[issue1818] Add named tuple reader to CSV module
Rob Renaud added the comment:

I did a search on Google Code for the DictReader constructor and analyzed the first 3 pages of results. Discounting the unittest code built into Python, the fieldnames parameter was used in 14 of 27 cases and not used in 13 of 27. I suppose that means CSV files with headers are sufficiently rare that they shouldn't be created implicitly by default. I still don't like the lack of symmetry in supporting implicit header reads but not implicit header writes.

On Thu, Feb 26, 2009 at 8:00 PM, Skip Montanaro wrote:
>
> Skip Montanaro added the comment:
>
> More concretely, I don't think this is so onerous:
>
>    names = ["col1", "col2", "color"]
>    writer = csv.DictWriter(open("f.csv", "wb"), fieldnames=names, ...)
>    writer.writerow(dict(zip(names, names)))
>    ...
>
> or
>
>    f = open("f.csv", "rb")
>    names = csv.reader(f).next()
>    reader = csv.DictReader(f, fieldnames=names, ...)
>    ...
>
> Skip
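For reference, the two idioms quoted above can be sketched in Python 3 as follows, using in-memory buffers instead of files so the example is self-contained. The sample row is an illustrative assumption. The last lines use `DictWriter.writeheader()`, which was added to the csv module in later releases (Python 2.7 and 3.2) and did not exist when this thread was written.

```python
import csv
import io

names = ["col1", "col2", "color"]

# Writing: emit the header by hand as an identity mapping.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=names)
writer.writerow(dict(zip(names, names)))
writer.writerow({"col1": "1", "col2": "2", "color": "red"})

# Reading: pull the header off with a plain reader, then hand the
# remaining lines to DictReader with explicit fieldnames.
src = io.StringIO(out.getvalue())
header = next(csv.reader(src))
rows = list(csv.DictReader(src, fieldnames=header))

# Later Python versions offer a dedicated method for the header row.
out2 = io.StringIO()
csv.DictWriter(out2, fieldnames=names).writeheader()
```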
[issue1551113] random.choice(setinstance) fails
Rob Renaud added the comment:

I found this via Google search when I was disappointed that random.choice raised an exception rather than returning a random item from the set.

It's quite easy to implement random.choice for sets/dicts in O(1) expected time from the C implementation, as long as the set/dict implementation guarantees a minimal constant density. Simply generate random indices into the set object's table until one that holds an object is found. This takes an expected O(1/density) probes.

I suppose making random.choice work for sets/dicts isn't worth a C implementation (as happy as it would have made me a few hours ago...)?

--
nosy: +rrenaud

___
Python tracker <http://bugs.python.org/issue1551113>
___
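The probing idea can be sketched in pure Python against a stand-in for the hash table's backing array. This is only an illustration under stated assumptions: `slots` is a hypothetical list in which `None` marks an empty slot, and `used` is the number of occupied slots. CPython's real set internals are not reachable from Python code, so this is not how the interpreter stores sets.

```python
import random


def choice_by_probing(slots, used, rng=random):
    """Return a uniformly random occupied entry from a sparse slot table.

    slots: list modelling a hash table's backing array (None = empty slot).
    used:  number of occupied slots, tracked by the caller.
    Expected number of probes is len(slots) / used, i.e. 1 / density.
    """
    if used == 0:
        raise IndexError("cannot choose from an empty table")
    while True:
        candidate = slots[rng.randrange(len(slots))]
        if candidate is not None:
            return candidate


# A table with density 3/8: on average about 2.7 probes per call.
table = [None, "a", None, "b", None, None, "c", None]
rng = random.Random(0)
seen = {choice_by_probing(table, 3, rng) for _ in range(200)}
```

Every occupied slot is equally likely to be hit, so the result is uniform over the stored items, and the density guarantee is what keeps the expected probe count constant.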