[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str
New submission from Irmen de Jong : Pickling of bytearrays is quite inefficient right now, because bytearray's __reduce__ encodes the bytes of the bytearray into a str. A pickle made from a bytearray is quite a bit larger than necessary because of this, and it also takes a lot more processing to create it and to convert it back into the actual bytearray when unpickling (because it uses bytearray's str based initializer with encoding). I've attached a patch (both for the default 3.x branch and the 2.7 branch) that changes this to use the bytes type instead. A pickle made from a bytearray with this patch applied now utilizes the BINBYTES/BINSTRING pickle opcodes which are a lot more efficient than BINUNICODE that is used now. The reconstruction of the bytearray now uses bytearray's initializer that takes a bytes object. I don't think additional unit tests are needed because test_bytes already performs pickle tests on bytearrays. A bit more info can be found in my recent post on comp.lang.python about this, see http://mail.python.org/pipermail/python-list/2011-November/1283668.html -- components: Interpreter Core files: bytearray3x.patch keywords: easy, needs review, patch messages: 148627 nosy: alexandre.vassalotti, irmen, pitrou priority: normal severity: normal stage: patch review status: open title: improved efficiency of bytearray pickling by using bytes type instead of str type: performance versions: Python 2.7, Python 3.1, Python 3.2, Python 3.3, Python 3.4 Added file: http://bugs.python.org/file23812/bytearray3x.patch ___ Python tracker <http://bugs.python.org/issue13503> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
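The size difference described above can be reproduced by pickling the two argument forms directly. This is a minimal sketch, not the patch itself: it assumes the str-based reduction decodes the bytes as latin-1, and the names `payload`, `str_args` and `bytes_args` are illustrative only.

```python
import pickle
import pickletools

payload = bytearray(range(256)) * 4  # 1024 bytes, half of them >= 0x80

# Assumed shape of the str-based reduction arguments: a str plus an encoding name.
str_args = (payload.decode("latin-1"), "latin-1")
# Shape of the bytes-based reduction arguments: a single bytes object.
bytes_args = (bytes(payload),)

str_pickle = pickle.dumps(str_args, protocol=3)
bytes_pickle = pickle.dumps(bytes_args, protocol=3)

print("str-based args:  ", len(str_pickle), "bytes")
print("bytes-based args:", len(bytes_pickle), "bytes")

# Both forms rebuild the same bytearray.
assert bytearray(*pickle.loads(str_pickle)) == payload
assert bytearray(*pickle.loads(bytes_pickle)) == payload

# Collect the opcode names used by each pickle.
str_ops = {op.name for op, arg, pos in pickletools.genops(str_pickle)}
bytes_ops = {op.name for op, arg, pos in pickletools.genops(bytes_pickle)}
print(sorted(str_ops))
print(sorted(bytes_ops))
```

The str form is larger because the pickle stores str data as UTF-8, so every byte value of 0x80 or above takes two bytes.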
[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str
Changes by Irmen de Jong: Added file: http://bugs.python.org/file23813/bytearray27.patch
[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str
Irmen de Jong added the comment: btw I'm aware of PEP-3154 but I don't think this particular patch requires a pickle protocol bump.
[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str
Irmen de Jong added the comment: Added new patch that only does the new reduction when protocol is 3 or higher. -- Added file: http://bugs.python.org/file23852/bytearray3x_reduceex.patch
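The protocol check matters because the bytes opcodes only exist from pickle protocol 3 onwards. A rough Python sketch of that decision follows; `reduce_bytearray` is a hypothetical helper written for illustration, and the latin-1 fallback is an assumption about the older str-based form, not a copy of the patch.

```python
import pickle


def reduce_bytearray(obj, protocol):
    """Hypothetical helper mirroring a protocol-aware __reduce_ex__."""
    if protocol >= 3:
        # Protocol 3 and later can store bytes natively.
        return (bytearray, (bytes(obj),))
    # Older protocols: fall back to a str plus an encoding name (assumed form).
    return (bytearray, (obj.decode("latin-1"), "latin-1"))


data = bytearray(b"\x00\x7f\x80\xff")
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    factory, args = reduce_bytearray(data, protocol)
    # Round-trip only the arguments, then rebuild the object from them.
    restored = factory(*pickle.loads(pickle.dumps(args, protocol=protocol)))
    assert restored == data
```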
[issue13503] improved efficiency of bytearray pickling by using bytes type instead of str
Irmen de Jong added the comment: Alexandre: the existing test_bytes already performs byte array pickle tests.
[issue2871] store thread.get_ident() thread identifier inside threading.Thread objects
New submission from Irmen de Jong: I've run into a problem where it would be very nice to be able to tell the thread.get_ident() of a given threading.Thread object. Currently, when creating a new Thread object, there is no good way of getting that thread's get_ident() value. I propose adding the get_ident() value as a publicly accessible field of every threading.Thread object. -- components: Extension Modules messages: 66882 nosy: irmen severity: normal status: open title: store thread.get_ident() thread identifier inside threading.Thread objects type: feature request
[issue2871] store thread.get_ident() thread identifier inside threading.Thread objects
Irmen de Jong added the comment: Adding it in the run method would only work for threads that I create in my own code. The thing is: I need to be able to get the thread identification from threads created by third party code. So I cannot rely on that code putting it in the thread object themselves like that. Hence my wish of letting the standard library module take care of it. And using the id() of the current thread object has a rather obscure problem. I was using it as a matter of fact, until people reported problems in my code when used with certain atexit handling. (Sometimes the wrong id() is returned). Because of that I wanted to switch to the more low-level thread.get_ident() identification of different threads, because that is supposed to return a stable OS-level thread identification, right?
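For reference, the "record it in the run method" workaround discussed here might look like the sketch below. It uses the Python 3 spelling `threading.get_ident()` rather than `thread.get_ident()`, and the class name `IdentRecordingThread` and the attribute `recorded_ident` are made up for illustration. It also shows the limitation: only threads built from this subclass carry the attribute.

```python
import threading


class IdentRecordingThread(threading.Thread):
    """Hypothetical subclass that stores its own identifier when it starts running."""

    def run(self):
        # get_ident() must be called from inside the thread itself.
        self.recorded_ident = threading.get_ident()
        super().run()


results = []
worker = IdentRecordingThread(target=results.append, args=("done",))
worker.start()
worker.join()
print(worker.recorded_ident)
```

Current Python versions expose the same value on every thread object as `Thread.ident`, which covers threads created by third-party code as well.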
[issue2871] store thread.get_ident() thread identifier inside threading.Thread objects
Irmen de Jong added the comment: Thanks Gregory, for taking the time to make a patch.
[issue1103213] Adding the missing socket.recvall() method
Irmen de Jong added the comment: Sure, I'll give it another go. I've not done any C development for quite a while though, so I have to pick up the pieces and see how far I can get. Also, I don't have a compiler for Windows, so I may need someone else to validate the patch on Windows for me once I've got something together.
[issue1103213] Adding the missing socket.recvall() method
Irmen de Jong added the comment: OK, I've looked at it again and think I can build an acceptable patch this time. However there are 2 things that I'm not sure of: 1) how to return the partial data to the application if the recv() loop fails before completion. Because the method will probably raise an exception on failure, as usual, it seems to me that the best place to put the partial data is inside the exception object. I can't think of another easy and safe way for the application to retrieve it. But how is this achieved in code? I suppose I'll be using set_error() to return an error from my sock_recvall function. 2) the trunk is Python 2.7, should I make a separate patch for 3.x?
[issue1103213] Adding the missing socket.recvall() method
Irmen de Jong added the comment: OK, I think I've got the code and doc changes ready. I added a recvall and a recvall_into method to the socket module. Any partially received data in case of errors is returned to the application as part of the args for a new exception, socket.partialdataerror. I still need to work on some unit tests for these new methods.
[issue1103213] Adding the missing socket.recvall() method
Changes by Irmen de Jong: Removed file: http://bugs.python.org/file6439/patch.txt
[issue1103213] Adding the missing socket.recvall() method
Changes by Irmen de Jong: Added file: http://bugs.python.org/file16762/socketmodulepatch.txt
[issue1103213] Adding the missing socket.recvall() method
Changes by Irmen de Jong: Added file: http://bugs.python.org/file16763/libpatch.txt
[issue1103213] Adding the missing socket.recvall() method
Changes by Irmen de Jong: Added file: http://bugs.python.org/file16764/docpatch.txt
[issue1103213] Adding the missing socket.recvall() method
Irmen de Jong added the comment: Currently if MSG_WAITALL is defined, recvall() just calls recv() internally with the extra flag. Maybe that isn't the smartest thing to do because it duplicates recv's behavior on errors. Which is: release the data and raise an error. Would it be nicer to have recvall() release the data and raise an error, or to let it return the partial data? Either way, I think the behavior should be the same regardless of MSG_WAITALL being available. This is not yet the case. Why C: this started out by making the (very) old patch that I once wrote for socketmodule.c up to date with the current codebase, and taking Martin's comments into account. The old patch was small and straightforward. Unfortunately the new one turned out bigger and more complex than I thought. For instance I'm not particularly happy with the way recvall returns the partial data on fail. It uses a new exception for that but the code has some trickery to replace the socket.error exception that is initially raised. I'm not sure if my code is the right way to do this, it needs some review. I do think that putting it into the exception object is the only safe way of returning it to the application, unless the semantics on error are changed as mentioned above. Maybe it could be made simpler then. In any case, it probably is a good idea to see if a pure python solution (perhaps just some additions to Lib/socket.py?) would be better. Will put some effort into this.
[issue8320] docs on socket.recv_into doesn't mention the return value
New submission from Irmen de Jong: Doc/library/socket.rst doesn't mention the return value for recv_into. Adding a simple "Returns the number of bytes received." should fix this. (note that recvfrom_into does mention its return value) -- assignee: georg.brandl components: Documentation keywords: easy messages: 102393 nosy: georg.brandl, irmen severity: normal status: open title: docs on socket.recv_into doesn't mention the return value type: feature request versions: Python 2.7, Python 3.1
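A short illustration of why the return value needs documenting: the caller supplies a preallocated buffer, so the byte count is the only way to know how much of it was filled. The sketch uses socket.socketpair() purely to have two connected sockets at hand; the variable names are illustrative.

```python
import socket

sender, receiver = socket.socketpair()
try:
    sender.sendall(b"ping")
    buffer = bytearray(16)  # larger than the message on purpose
    received = receiver.recv_into(buffer)  # number of bytes written into buffer
    message = bytes(buffer[:received])  # only this prefix holds received data
finally:
    sender.close()
    receiver.close()

print(received, message)
```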
[issue31072] add filter to zipapp
New submission from Irmen de Jong: As briefly discussed on comp.lang.python, I propose to add an optional filter callback function to zipapp.create_archive. The function could perhaps work like the os.walk generator, or maybe just let you return a simple boolean for every folder/file that it wants to include in the zip. My use case is that I sometimes don't want to include every file in the root folder in the zip file (I want to be able to skip temporary or irrelevant folders such as .git/.svn, .tox, .tmp and sometimes want to avoid including *.pyc/*.pyo files). Right now, I first have to manually clean up the folder before I can use zipapp.create_archive. (Instead of providing a filter callback function, another approach may be to provide your own dir/file generator that fully replaces the internal file listing logic of zipapp.create_archive?) -- assignee: paul.moore components: Library (Lib) keywords: easy messages: 299409 nosy: irmen, paul.moore priority: normal severity: normal status: open title: add filter to zipapp type: enhancement
[issue31072] add filter to zipapp
Irmen de Jong added the comment: That sounds fine to me. I guess the paths passed to the function should be relative to the root folder being zipped?
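A boolean callback for the use case described in this issue could look like the sketch below. It assumes the `filter` argument of zipapp.create_archive as available in Python 3.7 and later, where the callback receives a pathlib.Path relative to the source folder; `include_path` and the two skip lists are illustrative names, not part of zipapp.

```python
import tempfile
import zipapp
import zipfile
from pathlib import Path

SKIPPED_DIRS = {".git", ".svn", ".tox", ".tmp"}
SKIPPED_SUFFIXES = {".pyc", ".pyo"}


def include_path(relative_path):
    """Return True for paths that should end up in the archive."""
    # Reject anything that is, or lives inside, a skipped folder.
    if SKIPPED_DIRS.intersection(relative_path.parts):
        return False
    return relative_path.suffix not in SKIPPED_SUFFIXES


with tempfile.TemporaryDirectory() as workdir:
    # Build a tiny throwaway source tree to archive.
    source = Path(workdir) / "app"
    (source / ".git").mkdir(parents=True)
    (source / ".git" / "config").write_text("[core]\n")
    (source / "__main__.py").write_text("print('hello')\n")
    (source / "stale.pyc").write_bytes(b"\x00")

    target = Path(workdir) / "app.pyz"
    zipapp.create_archive(source, target, filter=include_path)
    with zipfile.ZipFile(target) as archive:
        archived_names = archive.namelist()

print(archived_names)
```

Because the callback only sees relative paths, the same function works no matter where the source folder lives on disk.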
[issue28673] pyro4 with more than 15 threads often crashes 2.7.12
Irmen de Jong added the comment: Because there is no example code to reproduce the issue, and because I'm mildly interested in this bug since it mentions Pyro4 (I'm its author), I've tried to crash my system myself using Pyro4 and a simple torture test, but it trucked on just fine. Pyro4 is not doing any "strange" things as far as I am aware. It does have its own (simple) thread pool of regular Python threads that handle incoming proxy connections. (Had Pyro4 been doing weird things, I suppose Python itself should still never core dump on the user, but rather raise a regular exception if something was wrong.) -- nosy: +irmen
[issue28673] pyro4 with more than 15 threads often crashes 2.7.12
Irmen de Jong added the comment: The 28673-reproduce.py didn't crash on any of the systems I've tried it on. Are you sure it is complete? It looks like a part is missing.
[issue1103213] Adding the missing socket.recvall() method
Irmen de Jong added the comment: I created the patch about 5 years ago and in the meantime a few things have happened:
- I've not touched C for a very long time now
- I've learned that MSG_WAITALL may be unreliable on certain systems, so any implementation of recvall depending on MSG_WAITALL may inexplicably fail on such systems
- I've been using a python implementation of a custom recv loop in Pyro4 for years
- it is unclear that a C implementation will provide a measurable performance benefit because I think most of the time is spent in the network I/O anyway, and the GIL is released when doing a normal recv (I hope?)

In other words, I will never follow up on my original C-based patch from 5 years ago. I do still like the idea of having a reliable recvall in the stdlib instead of having to code a page long one in my own code.
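A pure-Python recv loop of the kind mentioned above can be fairly short. The following is only a sketch of the idea, not the Pyro4 code and not the proposed stdlib API: `recvall` and `PartialDataError` are hypothetical names, and carrying the partial data on the exception follows the approach discussed earlier in this issue.

```python
import socket


class PartialDataError(Exception):
    """Hypothetical exception that carries whatever was received before the failure."""

    def __init__(self, message, partial):
        super().__init__(message)
        self.partial = partial


def recvall(sock, size):
    """Receive exactly `size` bytes from `sock`, or raise PartialDataError."""
    chunks = []
    remaining = size
    while remaining > 0:
        try:
            chunk = sock.recv(remaining)
        except OSError as exc:
            raise PartialDataError(str(exc), b"".join(chunks)) from exc
        if not chunk:
            # An empty result means the peer closed the connection early.
            raise PartialDataError("connection closed", b"".join(chunks))
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


left, right = socket.socketpair()
left.sendall(b"hello world")
complete = recvall(right, 11)

left.sendall(b"abc")
left.close()
try:
    recvall(right, 10)
except PartialDataError as error:
    leftover = error.partial
right.close()
```

This loop does not rely on MSG_WAITALL, so it behaves the same on platforms where that flag is missing or unreliable.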
[issue11620] winsound.PlaySound() with SND_MEMORY should accept bytes instead of strings
Irmen de Jong added the comment: Ran into this today when trying to provide a fallback sound output on windows when the user hasn't got pyaudio installed. It seems that this module has been forgotten and didn't get fixed when the str/bytes change happened in Python 3.0? -- nosy: +irmen
[issue11620] winsound.PlaySound() with SND_MEMORY should accept bytes instead of strings
Changes by Irmen de Jong: -- versions: +Python 3.5