[issue17902] Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
Aaron Oakley added the comment:

So sorry, I just found the emails from the bug tracker in my spam folder. Anyhow, I've now signed the CLA.

--
Python tracker <http://bugs.python.org/issue17902>

Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue17902] Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
Aaron Oakley added the comment:

From memory, the use case at the time was using a custom TreeBuilder subclass fed into a builtin XMLParser object. The code would construct a builder separately and keep a reference to it around. The builder would delegate calls to start(), data(), end(), and close() to super and save the completed tree when its close() was called.

    my_builder = CustomTreeBuilder()
    et_parser = ET.XMLParser(target=my_builder)
    for (evt, elem) in ET.iterparse("...", events, parser=et_parser):
        pass  # Do first processing
    tree = my_builder.root  # Saved tree

It was done like this initially so that some data (I can't recall exactly what) from the XML input could be processed first very conveniently using the parse events from iterparse, while allowing the whole tree to be retrieved afterwards.

That said, the project later moved to using lxml for various features not contained in xml.etree.ElementTree, and I don't think the process I described is still being used.
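The builder described above might look roughly like the following. This is only a sketch: the class body, the `root` attribute, and the sample XML string are assumptions for illustration, not the original project code. It drives the parser with feed() and close() directly rather than through iterparse, so it only shows the "save the tree on close()" part.

```python
import xml.etree.ElementTree as ET


class CustomTreeBuilder(ET.TreeBuilder):
    """Hypothetical builder that keeps the finished tree on itself."""

    def __init__(self):
        super().__init__()
        self.root = None  # filled in once close() has run

    def close(self):
        # Delegate to TreeBuilder, then remember the completed tree.
        self.root = super().close()
        return self.root


my_builder = CustomTreeBuilder()
et_parser = ET.XMLParser(target=my_builder)
et_parser.feed("<root><item>22</item></root>")  # made-up sample input
et_parser.close()  # calls my_builder.close()

tree = my_builder.root  # saved tree, available after parsing
```

Only close() is overridden here; start(), data(), and end() are inherited unchanged, which has the same effect as delegating each of them to super.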
[issue17901] _elementtree.TreeBuilder raises IndexError on end if constructed with element_factory=None
New submission from Aaron Oakley:

When the _elementtree module is in use, the TreeBuilder class will raise an IndexError in treebuilder_handle_end if __init__ was passed "None". I discovered this while writing a subclass of TreeBuilder with a modified __init__ method that delegated to TreeBuilder:

    class MyTreeBuilder(ET.TreeBuilder):
        def __init__(self, element_factory=None):
            super().__init__(element_factory)

Used as a target, this class (and also simply "TreeBuilder(None)") will cause the IndexError when "parser.feed(data)" is called.

    >>> import xml.etree.ElementTree as ET
    >>> parser = ET.XMLParser(target=ET.TreeBuilder(None))
    >>> parser.feed('22')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    IndexError: pop from empty stack

The error is raised from treebuilder_handle_end, but the cause appears to be in treebuilder_handle_start:

    if (self->element_factory) {
        node = PyObject_CallFunction(self->element_factory, "OO", tag, attrib);
    } else {
        node = create_new_element(tag, attrib);
    }

I included a patch adding a check against Py_None to the "if" test above, which seems to fix the issue. I also included a simple test case for it.

--
components: XML
files: _elementtree.c-340a0.patch
keywords: patch
messages: 188326
nosy: Aaron.Oakley
priority: normal
severity: normal
status: open
title: _elementtree.TreeBuilder raises IndexError on end if constructed with element_factory=None
type: behavior
versions: Python 3.2, Python 3.3, Python 3.4
Added file: http://bugs.python.org/file30117/_elementtree.c-340a0.patch

Python tracker <http://bugs.python.org/issue17901>
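Until a fix lands, one possible workaround on the Python side is to avoid forwarding None to the C constructor at all. The sketch below assumes that calling TreeBuilder.__init__ with no arguments selects the default element creation path; the sample XML is made up for illustration.

```python
import xml.etree.ElementTree as ET


class MyTreeBuilder(ET.TreeBuilder):
    def __init__(self, element_factory=None):
        if element_factory is None:
            # Do not pass None through; let TreeBuilder use its default.
            super().__init__()
        else:
            super().__init__(element_factory)


parser = ET.XMLParser(target=MyTreeBuilder())
parser.feed("<root><test>22</test></root>")  # hypothetical input
root = parser.close()  # returns whatever the target's close() returns
```

With this guard the subclass keeps its element_factory=None signature while the base class never sees an explicit None.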
[issue17902] Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
New submission from Aaron Oakley:

It would really help to document that the C API can only use the default xml.etree.ElementTree.TreeBuilder for targets with iterparse (and by extension, IncrementalParser). I got a nice surprise about that when I went from 3.2 to 3.3 and started getting "TypeError: event handling only supported for ElementTree.TreeBuilder targets".

I included a patch to add notes to iterparse and IncrementalParser, but I'm not sure what to refer to the C module as, since xml.etree.cElementTree is deprecated.

--
assignee: docs@python
components: Documentation, XML
files: elementtree.rst-340a0.patch
keywords: patch
messages: 188329
nosy: Aaron.Oakley, docs@python
priority: normal
severity: normal
status: open
title: Document that _elementtree C API cannot use custom TreeBuilder for iterparse or IncrementalParser
type: behavior
versions: Python 3.4
Added file: http://bugs.python.org/file30119/elementtree.rst-340a0.patch
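For code that only needs the parse events first and the whole tree afterwards, one way to stay within this limitation is to keep the default TreeBuilder and read the `root` attribute of the iterator that iterparse returns, which is set once the source has been fully read. A minimal sketch, with a made-up in-memory document standing in for a real file:

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical input; any file name or binary file object would do.
source = io.BytesIO(b"<root><item>1</item><item>2</item></root>")

seen = []
events = ET.iterparse(source, events=("end",))
for event, elem in events:
    if elem.tag == "item":
        seen.append(elem.text)  # first-pass processing on each event

# The iterator exposes the finished tree after it is exhausted.
tree_root = events.root
```

This does not give a custom builder any hooks, so it only covers the case where the default element construction is acceptable.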