OrderedDict
Hi all,

I have a problem understanding the return values from xmltodict.

I have an XML file. Content:

With the code

    __f_name = ''
    with open(__f_name) as __fd:
        __doc = xmltodict.parse(__fd.read())
    __doc

I get

    OrderedDict([(u'profiles', OrderedDict([(u'profile',
    OrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
    (u'package', OrderedDict([(u'@package-id', u'0964-gpg4win')]))]))]))])

If I use

    __doc['profiles']['profile']['package'][0]['@package-id']

I get

    Traceback (most recent call last):
      File "", line 1, in
    KeyError: 0

If I change the XML file like this:

and run the code from above, the result is:

    OrderedDict([(u'profiles', OrderedDict([(u'profile',
    OrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
    (u'package', [OrderedDict([(u'@package-id', u'0964-gpg4win')]),
    OrderedDict([(u'@package-id', u'0965-gpg4win')])])]))]))])

Now

    __doc['profiles']['profile']['package'][0]['@package-id']

prints:

    u'0964-gpg4win'

Can anybody explain this?

Many thanks in advance
-- 
https://mail.python.org/mailman/listinfo/python-list
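The two results quoted above are consistent with how xmltodict maps elements: a child tag that occurs once becomes a single mapping, while a tag that is repeated becomes a list of mappings, so indexing with [0] only succeeds in the second case. One common workaround is to normalise the value to a list before indexing. The following is only a minimal Python 3 sketch of that idea: the helper names as_list and package_ids are hypothetical (they are not part of xmltodict), and the two literals are plain-dict stand-ins for the parse results shown in the post.

```python
def as_list(value):
    """Return [] for None, the value itself if it is a list, else [value]."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# Stand-ins for the two parse results quoted in the post. Plain dicts are
# used instead of OrderedDicts (an assumption, to keep this stdlib-only).
single = {'profiles': {'profile': {
    '@id': 'visio02', '@revision': '2015051501',
    'package': {'@package-id': '0964-gpg4win'}}}}
repeated = {'profiles': {'profile': {
    '@id': 'visio02', '@revision': '2015051501',
    'package': [{'@package-id': '0964-gpg4win'},
                {'@package-id': '0965-gpg4win'}]}}}


def package_ids(doc):
    """Collect every @package-id, whether 'package' is a dict or a list."""
    packages = as_list(doc['profiles']['profile'].get('package'))
    return [p['@package-id'] for p in packages]


print(package_ids(single))
print(package_ids(repeated))
```

With this, the calling code no longer needs to know how many package elements a file contains. xmltodict's documentation also describes a force_list argument to parse() for the same purpose; whether it is available depends on the installed version.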
Re: OrderedDict
On Wednesday, May 18, 2016 at 2:25:16 PM UTC+2, Peter Otten wrote:
> Chris Angelico wrote:
>
> > On Wed, May 18, 2016 at 7:28 PM, Peter Otten <[email protected]> wrote:
> >> I don't see an official way to pass a custom dict type to the library,
> >> but if you are not afraid to change its source code the following patch
> >> will allow you to access the value of dictionaries with a single entry as
> >> d[0]:
> >>
> >> $ diff -u py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> --- py2b_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> 2016-05-18 11:18:44.0 +0200
> >> +++ py2_xmltodict/local/lib/python2.7/site-packages/xmltodict.py
> >> 2016-05-18 11:11:13.417665697 +0200
> >> @@ -35,6 +35,13 @@
> >>  __version__ = '0.10.1'
> >>  __license__ = 'MIT'
> >>
> >> +_OrderedDict = OrderedDict
> >> +class OrderedDict(_OrderedDict):
> >> +    def __getitem__(self, key):
> >> +        if key == 0:
> >> +            [result] = self.values()
> >> +            return result
> >> +        return _OrderedDict.__getitem__(self, key)
> >>
> >>  class ParsingInterrupted(Exception):
> >>      pass
> >
> > Easier than patching might be monkeypatching.
> >
> > class OrderedDict(OrderedDict):
> >     ... getitem code as above ...
> > xmltodict.OrderedDict = OrderedDict
> >
> > Try it, see if it works.
>
> It turns out I was wrong on (at least) two accounts:
>
> - xmltodict does offer a way to specify the dict type
> - the proposed dict implementation will not solve the OP's problem
>
> Here is an improved fix which should work:
>
>
> $ cat sample.xml
>
>
>
>
>
> $ cat sample2.xml
>
>
>
>
>
>
> $ cat demo.py
> import collections
> import sys
> import xmltodict
>
>
> class MyOrderedDict(collections.OrderedDict):
>     def __getitem__(self, key):
>         if key == 0 and len(self) == 1:
>             return self
>         return super(MyOrderedDict, self).__getitem__(key)
>
>
> def main():
>     filename = sys.argv[1]
>     with open(filename) as f:
>         doc = xmltodict.parse(f.read(), dict_constructor=MyOrderedDict)
>
>     print "doc:\n{}\n".format(doc)
>     print "package-id: {}".format(
>         doc['profiles']['profile']['package'][0]['@package-id'])
>
>
> if __name__ == "__main__":
>     main()
> $ python demo.py sample.xml
> doc:
> MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> (u'package', MyOrderedDict([(u'@package-id', u'0964-gpg4win')]))]))]))])
>
> package-id: 0964-gpg4win
> $ python demo.py sample2.xml
> doc:
> MyOrderedDict([(u'profiles', MyOrderedDict([(u'profile',
> MyOrderedDict([(u'@id', u'visio02'), (u'@revision', u'2015051501'),
> (u'package', [MyOrderedDict([(u'@package-id', u'0964-gpg4win')]),
> MyOrderedDict([(u'@package-id', u'0965-gpg4win')])])]))]))])
>
> package-id: 0964-gpg4win

I have tested the first solution. It works nicely. Before, I used xml.etree to parse 2000 XML files.

Execution time decreased from more than 5 min to 20 sec. Great. On the weekend I will test the solution with the custom class.

Many thanks.
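To see what the MyOrderedDict class from demo.py does to a [0] lookup, it can be exercised on its own, without xmltodict. The following is only a Python 3 sketch of the same class (the thread's code is Python 2); the '@version' attribute and the variable names are made up for illustration.

```python
from collections import OrderedDict


class MyOrderedDict(OrderedDict):
    """OrderedDict that answers d[0] with itself when it holds exactly one entry."""

    def __getitem__(self, key):
        if key == 0 and len(self) == 1:
            return self
        return super().__getitem__(key)


# One entry: [0] hands back the mapping itself, so the chained lookup works.
one_attr = MyOrderedDict([('@package-id', '0964-gpg4win')])
print(one_attr[0]['@package-id'])

# Two entries ('@version' is a hypothetical attribute): the special case
# does not apply, so [0] falls through to the normal lookup and fails.
two_attrs = MyOrderedDict([('@package-id', '0964-gpg4win'),
                           ('@version', '1')])
try:
    two_attrs[0]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)
```

As the second case suggests, the len(self) == 1 condition means the trick covers elements that carry exactly one attribute or child, as in the sample files; an element with more entries would still raise KeyError for [0].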
Re: OrderedDict
On Friday, May 20, 2016 at 7:15:38 AM UTC+2, [email protected] wrote:
> I have tested the first solution. It works nicely. Before, I used xml.etree
> to parse 2000 XML files.
>
> Execution time decreased from more than 5 min to 20 sec. Great. On the
> weekend I will test the solution with the custom class.
>
> Many thanks.

Hi all,

The tests of the solution with the custom class were successful. Nice inspiration. I use this solution in my Django script.

Many thanks.
