Re: [Python-Dev] xml.etree.ElementTree.IncrementalParser

2013-08-09 Thread Stefan Behnel
Antoine Pitrou, 08.08.2013 10:20:
> On Thu, 08 Aug 2013 06:33:42 +0200,
> Stefan Behnel wrote:
>> Antoine Pitrou, 07.08.2013 08:04:
>>> http://docs.python.org/dev/library/xml.etree.elementtree.html#incremental-parsing
>>
>> I don't like the fact that it adds a second interface to iterparse()
>> that allows injecting arbitrary content into the parser. You can now
>> run iterparse() to read from a file, and at an arbitrary iteration
>> position, send it a byte string to parse from, before it goes reading
>> more data from the file. Or take out some events before iteration
>> continues.
>>
>> I think the implementation should be changed to make iterparse()
>> return something that wraps an IncrementalParser, not something that
>> is an IncrementalParser.
> 
> That sounds reasonable. Do you want to post a patch? :-)

I attached it to the ticket that seems to have been the source of this
addition.

http://bugs.python.org/issue17741
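
To illustrate the wrap-versus-inherit point quoted above, here is a minimal
sketch. It is not the patch attached to the ticket. It assumes
xml.etree.ElementTree.XMLPullParser from released Python 3.4+ as a stand-in
for the class under discussion (it offers feed(), close() and read_events());
the name WrappedIterParse and its chunk_size parameter are hypothetical.

```python
import io
import xml.etree.ElementTree as ET


class WrappedIterParse:
    """Hypothetical iterparse()-like object that *wraps* a pull parser.

    Callers can only iterate over events; the parser's feed interface is
    kept private, so nothing can be injected in the middle of iteration.
    """

    def __init__(self, source, events=("end",), chunk_size=16384):
        self._source = source
        self._parser = ET.XMLPullParser(events)  # stand-in, see lead-in
        self._chunk_size = chunk_size

    def __iter__(self):
        return self._generate()

    def _generate(self):
        while True:
            data = self._source.read(self._chunk_size)
            if not data:
                break
            self._parser.feed(data)
            # Hand out whatever events are available so far.
            yield from self._parser.read_events()
        self._parser.close()
        # Events may still be pending after the parser has been closed.
        yield from self._parser.read_events()


if __name__ == "__main__":
    doc = io.BytesIO(b"<root><a/><b/></root>")
    for event, elem in WrappedIterParse(doc, chunk_size=4):
        print(event, elem.tag)
```

With this shape, the object handed to the user has no feed() method at all,
which is the property the "is an IncrementalParser" design lacks.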

Please note that the tulip mailing list is not an appropriate place to
discuss additions to the XML libraries, and ElementTree in particular.

Is there a way to get automatic notification when the XML component is
assigned to a ticket? (Not that it would have helped in this case, as the
component was missing from the ticket.)


>> Also, IMO it should mimic the interface of the TreeBuilder, which
>> calls the data reception method "data()"

Oops, sorry. It's actually called feed().

>> and the termination method
>> "close()". There is no reason to add yet another set of method names
>> just to do what others do already.
> 
> Well, the difference here is that after calling eof_received() you can
> still (and should) call events() once to get the last events. I think
> it would be weird if you could still do something useful with the object
> after calling close().
> 
> Also, the method names are not invented, they mimic the PEP 3156
> stream protocols:
> http://www.python.org/dev/peps/pep-3156/#stream-protocols

I see your point about close(). I assume your reasoning was to make the
IncrementalParser an arbitrary stream end-point. However, it doesn't really
make all that much sense to connect an arbitrary data source to it, as the
source wouldn't know that, in addition to passing in data, it would also
have to ask for events from time to time. I mean, you could do it, but then
it would just fill up the memory with parser events and lose the actual
advantages of incremental parsing. So, in a way, the whole point of the
class is to *not* be an arbitrary stream end-point.
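
The memory concern can be sketched as follows: the code that feeds the data
is also the code that has to drain the events, otherwise they pile up inside
the parser. This is only an illustration under assumptions. It uses
XMLPullParser from released Python 3.4+ as a stand-in for the class discussed
here, and count_items_draining and the "item" tag are made-up names.

```python
import xml.etree.ElementTree as ET


def count_items_draining(chunks):
    """Feed chunks one by one and drain pending events after each feed."""
    parser = ET.XMLPullParser(events=("end",))  # stand-in, see lead-in
    count = 0

    def drain():
        nonlocal count
        for _event, elem in parser.read_events():
            if elem.tag == "item":
                count += 1
            # Reset the handled element (text, attributes, children).
            elem.clear()

    for chunk in chunks:
        parser.feed(chunk)
        drain()
    parser.close()
    drain()  # events can still be pending after close()
    return count


if __name__ == "__main__":
    chunks = [b"<root>"] + [b"<item>x</item>"] * 50 + [b"</root>"]
    print(count_items_draining(chunks))
```

A generic data source that only pushes bytes would never call drain(), which
is why the class does not behave like an arbitrary stream end-point.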

Anyway, given that there isn't really the One Obvious Way to do it, maybe
you should just add a docstring to the class (ahem), reference the stream
protocol as the base for its API, and then rename it to
IncrementalStreamParser. That would at least make it clear why it doesn't
really fit with the rest of the module API (which was designed about a decade
before PEP 3156) and instead uses its own naming scheme.
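
For readers comparing the two naming schemes, here is a rough sketch of how
the stream-protocol names map onto the feed()/close() names used elsewhere in
the module. It is an assumption-laden illustration, not the class from the
ticket: StreamStyleParser is a hypothetical adapter, and it delegates to
XMLPullParser from released Python 3.4+.

```python
import xml.etree.ElementTree as ET


class StreamStyleParser:
    """Hypothetical adapter exposing PEP 3156 style method names."""

    def __init__(self, events=None):
        self._parser = ET.XMLPullParser(events)  # stand-in, see lead-in

    def data_received(self, data):
        # Same role as feed() in the rest of the module API.
        self._parser.feed(data)

    def eof_received(self):
        # Same role as close(), except that events() stays usable afterwards.
        self._parser.close()

    def events(self):
        # Iterator over the (event, element) pairs not yet retrieved.
        return self._parser.read_events()


if __name__ == "__main__":
    p = StreamStyleParser(events=("start", "end"))
    p.data_received(b"<doc><x>1</x>")
    p.data_received(b"</doc>")
    p.eof_received()
    for event, elem in p.events():
        print(event, elem.tag)
```

The last loop shows the point made about close(): after signalling the end of
the input, one more call to events() is still needed to get the final events.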

Stefan


___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] xml.etree.ElementTree.IncrementalParser

2013-08-09 Thread Antoine Pitrou
On Fri, 09 Aug 2013 13:11:11 +0200,
Stefan Behnel wrote:
> 
> I attached it to the ticket that seems to have been the source of this
> addition.
> 
> http://bugs.python.org/issue17741
> 
> Please note that the tulip mailing list is not an appropriate place to
> discuss additions to the XML libraries, and ElementTree in particular.

Well, the bug tracker is the main point of discussion, except that few
people bothered discussing it.

> Is there a way to get automatic notification when the XML component is
> assigned to a ticket? (Not that it would have helped in this case, as
> the component was missing from the ticket.)

You could ask to get included in the "experts" index:
http://docs.python.org/devguide/experts.html
(I doubt anyone would object to that)

> Anyway, given that there isn't really the One Obvious Way to do it,
> maybe you should just add a docstring to the class (ahem), reference
> the stream protocol as the base for its API, and then rename it to
> IncrementalStreamParser.

I don't think there's any point in making the class name longer.
Parsing XML incrementally is pretty much what it does.
As for the docstring, uh, well, sure :-)

(IMHO, IncrementalParser is the One Obvious Way to do incremental XML
parsing in 3.4, but YMMV)

Regards

Antoine.




Re: [Python-Dev] xml.etree.ElementTree.IncrementalParser

2013-08-09 Thread Stefan Behnel
Antoine Pitrou, 09.08.2013 14:50:
> On Fri, 09 Aug 2013 13:11:11 +0200,
> Stefan Behnel wrote:
>> I attached it to the ticket that seems to have been the source of this
>> addition.
>>
>> http://bugs.python.org/issue17741
>>
>> Please note that the tulip mailing list is not an appropriate place to
>> discuss additions to the XML libraries, and ElementTree in particular.
> 
> Well, the bug tracker is the main point of discussion, except that few
> people bothered discussing it.

The bug tracker is usually not a very visible place to start discussing
changes. This change is a particularly good example, and I've certainly
seen others.


>> Is there a way to get automatic notification when the XML component is
>> assigned to a ticket? (Not that it would have helped in this case, as
>> the component was missing from the ticket.)
> 
> You could ask to get included in the "experts" index:
> http://docs.python.org/devguide/experts.html
> (I doubt anyone would object to that)

Ok, please add me for xml.etree then. I used to get added to the noisy list
for ET tickets during the 3.3 release cycle, but that seems to have stopped
a while back.

Since it's easier to erase my name from the noisy list than to add myself
to a bug I've never heard about, I'm ok with being added for anything that
relates to ET, basically, be it bug or feature.


>> Anyway, given that there isn't really the One Obvious Way to do it,
>> maybe you should just add a docstring to the class (ahem), reference
>> the stream protocol as the base for its API, and then rename it to
>> IncrementalStreamParser.
> 
> I don't think there's any point in making the class name longer.

Agreed. It's not the class name that should be modified but the method
names. I changed my mind and posted to the tracker. I also attached a new
patch that changes the implementation to what I think it should look like.

Stefan




[Python-Dev] Summary of Python tracker Issues

2013-08-09 Thread Python tracker

ACTIVITY SUMMARY (2013-08-02 - 2013-08-09)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue.
Do NOT respond to this message.

Issues counts and deltas:
  open    4148 (+20)
  closed 26321 (+47)
  total  30469 (+67)

Open issues with patches: 1874 


Issues opened (56)
==================

#4322: function with modified __name__ uses original name when there'
http://bugs.python.org/issue4322  reopened by benjamin.peterson

#17741: event-driven XML parser
http://bugs.python.org/issue17741  reopened by pitrou

#18630: mingw: exclude unix only modules
http://bugs.python.org/issue18630  opened by rpetrov

#18631: mingw: setup msvcrt and _winapi modules
http://bugs.python.org/issue18631  opened by rpetrov

#18632: mingw: build extensions with GCC
http://bugs.python.org/issue18632  opened by rpetrov

#18633: mingw: use Mingw32CCompiler as default compiler for mingw* bu
http://bugs.python.org/issue18633  opened by rpetrov

#18634: mingw find import library
http://bugs.python.org/issue18634  opened by rpetrov

#18636: mingw: setup _ssl module
http://bugs.python.org/issue18636  opened by rpetrov

#18637: mingw: export _PyNode_SizeOf as PyAPI for parser module
http://bugs.python.org/issue18637  opened by rpetrov

#18638: mingw: generalization of posix build in sysconfig.py
http://bugs.python.org/issue18638  opened by rpetrov

#18639: mingw: avoid circular dependency from time module during nativ
http://bugs.python.org/issue18639  opened by rpetrov

#18640: mingw: generalization of posix build in distutils/sysconfig.py
http://bugs.python.org/issue18640  opened by rpetrov

#18641: mingw: customize site
http://bugs.python.org/issue18641  opened by rpetrov

#18643: implement socketpair() on Windows
http://bugs.python.org/issue18643  opened by neologix

#18644: Got ResourceWarning: unclosed file when using test function fr
http://bugs.python.org/issue18644  opened by vajrasky

#18645: Add a configure option for performance guided optimization
http://bugs.python.org/issue18645  opened by rhettinger

#18646: Improve tutorial entry on 'Lambda Forms'.
http://bugs.python.org/issue18646  opened by terry.reedy

#18647: re.error: nothing to repeat
http://bugs.python.org/issue18647  opened by serhiy.storchaka

#18648: FP Howto and the PEP 8 lambda guildline
http://bugs.python.org/issue18648  opened by terry.reedy

#18650: intermittent test_pydoc failure on 3.4.0a1
http://bugs.python.org/issue18650  opened by ned.deily

#18651: test failures on KFreeBSD
http://bugs.python.org/issue18651  opened by doko

#18652: Add itertools.first_true (return first true item in iterable)
http://bugs.python.org/issue18652  opened by hynek

#18653: mingw-meta: build core modules
http://bugs.python.org/issue18653  opened by rpetrov

#18654: modernize mingw&cygwin compiler classes
http://bugs.python.org/issue18654  opened by rpetrov

#18655: GUI apps take long to launch on Windows
http://bugs.python.org/issue18655  opened by netrick

#18659: test_precision in test_format.py is not executed and has unuse
http://bugs.python.org/issue18659  opened by vajrasky

#18660: os.read behavior on Linux
http://bugs.python.org/issue18660  opened by dugres

#18663: In unittest.TestCase.assertAlmostEqual doc specify the delta d
http://bugs.python.org/issue18663  opened by py.user

#18664: occasional test_threading failure
http://bugs.python.org/issue18664  opened by pitrou

#18667: missing HAVE_FCHOWNAT
http://bugs.python.org/issue18667  opened by salinger

#18669: curses.chgat() moves cursor, documentation says it shouldn't
http://bugs.python.org/issue18669  opened by productivememberofsociety666

#18670: Using read_mime_types function from mimetypes module gives res
http://bugs.python.org/issue18670  opened by vajrasky

#18672: Fix format specifiers for debug output in _sre.c
http://bugs.python.org/issue18672  opened by serhiy.storchaka

#18673: Add and use O_TMPFILE for Linux 3.11
http://bugs.python.org/issue18673  opened by christian.heimes

#18674: Store weak references in modules_by_index
http://bugs.python.org/issue18674  opened by pitrou

#18675: Daemon Threads can seg fault
http://bugs.python.org/issue18675  opened by guettli

#18676: Queue: document that zero is accepted as timeout value
http://bugs.python.org/issue18676  opened by zyluo

#18677: Enhanced context managers with ContextManagerExit and None
http://bugs.python.org/issue18677  opened by kristjan.jonsson

#18678: Wrong struct members name for spwd module
http://bugs.python.org/issue18678  opened by vajrasky

#18679: include a codec to handle escaping only control characters but
http://bugs.python.org/issue18679  opened by underrun

#18680: JSONDecoder should document that it raises a ValueError for ma
http://bugs.python.org/issue18680  opened by corey

#18681: typo in imp.reload
http://bugs.python.org/issue18681  opened by felloak

#18682: [PATCH] remove bogus codepath from pprint._safe_repr
http://bugs.python.org/issue18682  open