On Tue, Sep 18, 2012 at 07:14:26AM -0700, Michiel de Hoon wrote:
> Dear all,
>
> Suppose I have a parser that parses information stored in e.g. an XML file.

You mean like the XML parsers that already come with Python?

http://docs.python.org/library/markup.html
http://eli.thegreenplace.net/2012/03/15/processing-xml-in-python-with-elementtree/

Or powerful third-party libraries that already exist?

http://lxml.de/index.html

Please don't waste your time re-inventing the wheel :)

> I would like to design a Python class to store the information
> contained in this XML file.
>
> One option is to create a class like this:
>
> class Record(object):
>     pass
>
> and store the information in the XML file as attributes of objects of
> this class

That is perfectly fine if you have a known set of attribute names, and
none of them clash with Python reserved words (like "class", "del",
etc.) or are otherwise illegal identifiers (e.g. "2or3").

In general, I prefer to use a record-like object if and only if I have a
pre-defined set of field names, in which case I prefer to use
namedtuple:

py> from collections import namedtuple as nt
py> Record = nt("Record", "north south east west")
py> x = Record(1, 2, 3, 4)
py> print(x)
Record(north=1, south=2, east=3, west=4)
py> x.east
3

> Alternatively I could subclass the dictionary class:
>
> class Record(dict):
>     pass

Why bother subclassing it? You don't add any functionality. Just return
a dict; it will be lighter-weight and faster.

> I can see some advantage to using a dictionary, because it allows me
> to use the same strings as keys in the dictionary as are used in the
> XML file itself. But are there some general guidelines for when to use
> a dictionary-like class,

Yes. You should prefer a dictionary when you have one or more of these:

- your field names could be illegal as identifiers (e.g. "field name",
  "foo!", etc.)

- you have an unknown and potentially unlimited number of field names

- each record could have a different set of field names

- or some fields may be missing

- you expect to be programmatically inspecting field names that aren't
  known until runtime, e.g.:

      name = get_name_of_field()
      value = record[name]  # is cleaner than getattr(record, name)

- you expect to iterate over all field names

You might prefer to use attributes of a class if you have one or more of
these:

- all field names are guaranteed to be legal identifiers

- you have a fixed set of field names, known ahead of time

- you value the convenience of writing record.field instead of
  record['field']

> and when to use attributes to store
> information? In particular, are there any situations where there is
> some advantage in using attributes?

Not so much. Attributes are convenient, because you save three
characters:

obj.spam
obj['spam']

but otherwise attributes are just a more limited version of dict keys.
Anything that can be done with attributes can be done with a dict, since
attributes are usually implemented with a dict.


-- 
Steven
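
A rough sketch of the guidelines above, assuming Python 3 and the
standard-library ElementTree parser. The sample XML, its tag names, and
the helpers parse_records and is_safe_attribute are all made up for
illustration; they are not from the original question. Each record goes
into a plain dict, and a small check shows which field names could not
be written as record.field:

```python
import keyword
import xml.etree.ElementTree as ET
from collections import namedtuple

# Hypothetical sample data: note the tags "class" and "field-name".
SAMPLE_XML = """<records>
  <record><north>1</north><south>2</south><class>A</class></record>
  <record><north>5</north><field-name>x</field-name></record>
</records>"""


def parse_records(text):
    """Return one plain dict per <record>, keyed by the child tag names."""
    root = ET.fromstring(text)
    return [{child.tag: child.text for child in rec}
            for rec in root.iter("record")]


def is_safe_attribute(name):
    """True if `name` could be used as obj.name (legal and not reserved)."""
    return name.isidentifier() and not keyword.iskeyword(name)


records = parse_records(SAMPLE_XML)
for rec in records:
    # Dict keys can hold any tag name; attribute access could not.
    unsafe = sorted(key for key in rec if not is_safe_attribute(key))
    print(rec, "-> not usable as attributes:", unsafe)

# When the field names are fixed and legal, a namedtuple is an option.
Compass = namedtuple("Compass", "north south east west")
c = Compass(north=1, south=2, east=3, west=4)
print(c.east)
```

The two records have different sets of fields, and "class" and
"field-name" cannot be attribute names, so a dict is the natural fit
there. The namedtuple only makes sense for the fixed, known set of
names.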