[Python-Dev] pydoc - revised
Hi all,

This is something of an update to the thread on pydoc started by Ron.

Since the last entry we took the bull by the horns (so to speak), and are seriously aiming at delivering something that can qualify as:
- a revision of the pydoc module
- something that will facilitate the inclusion of ideas explored in efforts like the nice epydoc
- something that is in general easily extensible to answer most needs (both regarding the output that is returned, and the capabilities for exploring the documentation).

We are probably setting the bar a little high, but we hope to get somewhere at least with the first point, thanks to a good deal of advice on this list, some of it from pydoc's or ipython's original authors (besides, the bar could not be elsewhere since I am not a very good limbo dancer ;-) ).

There is a sourceforge page for the project, since it lets us work with SVN without risking disturbing a larger project, but we are ready to have it moved to the sandbox in the python tree (although it would be neat to wait a little until there are more tests).

svn co https://pydoc-r.svn.sourceforge.net/svnroot/pydoc-r pydoc-r

Still, the thought process is not over, and we were thinking of having discussions recorded either as posts on python-dev or, say, on the mailing list on the sourceforge page. One example of the discussions we have had is whether we should already name the module pydoc or give it another name.
Posting on python-dev would give exposure to what is discussed, but at the same time could be perceived as off-topic until the module is in the sandbox.
Posting on the sourceforge page would have pretty much the opposite trade-off (no exposure, but no one would feel annoyed).

Any suggestions?

Thanks,

Laurent
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] buglet in long("123\0", 10)
Nick> With an explicit base, however, PyLong_FromString is called
Nick> directly. Since that API takes a char* string, it stops at the
Nick> first embedded NULL:

Nick>     long('123\0003', 10)
Nick>     123L
Nick>     long('123\00032', 10)
Nick>     123L

Nick> So 'long_from_string' in abstract.c already has this problem
Nick> solved - the helper function just needs to be moved into
Nick> longobject.c so it can be used for explicit bases as well.

That's a bug. \0 is not a numeric character in any base, so the docs imply that an exception should be raised:

    long([x[, radix]])

    Convert a string or number to a long integer. If the argument is a string, it must contain a possibly signed number of arbitrary size, possibly embedded in whitespace. The radix argument is interpreted in the same way as for int(), and may only be given when x is a string. Otherwise, the argument may be a plain or long integer or a floating point number, and a long integer with the same value is returned. Conversion of floating point numbers to integers truncates (towards zero). If no arguments are given, returns 0L.

The only nonnumeric characters which can occur are whitespace characters, and they can only occur at the start or end of the string. Unlike the similar C functions atoi and atof, conversion doesn't just continue until a nonnumeric character is encountered. There is no way to tell the caller that part of the string wasn't munched.

Skip
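The difference between the two behaviours discussed above (stopping silently at the first bad character, as atoi does, versus rejecting the whole string, as the docs describe) can be sketched in pure Python. This is only an illustration: parse_prefix and parse_strict are hypothetical names, not CPython functions, and the sketch assumes Python 3, where long has been folded into int.

```python
import string

# Digit characters for bases up to 36: 0-9 followed by a-z.
DIGITS = string.digits + string.ascii_lowercase


def parse_prefix(text, base=10):
    """atoi-style: consume digits until the first invalid character, then stop."""
    value = 0
    for ch in text:
        digit = DIGITS.find(ch.lower())
        if digit < 0 or digit >= base:
            break  # the caller never learns that part of the input was ignored
        value = value * base + digit
    return value


def parse_strict(text, base=10):
    """Documented behaviour: every character must belong to the number."""
    body = text.strip()  # whitespace is allowed only at the start or end
    sign = 1
    if body[:1] in ("+", "-"):
        sign = -1 if body[0] == "-" else 1
        body = body[1:]
    if not body:
        raise ValueError("invalid literal with base %d: %r" % (base, text))
    value = 0
    for ch in body:
        digit = DIGITS.find(ch.lower())
        if digit < 0 or digit >= base:
            raise ValueError("invalid literal with base %d: %r" % (base, text))
        value = value * base + digit
    return sign * value


print(parse_prefix("123\x003"))  # 123, the trailing "\x003" is dropped silently
try:
    parse_strict("123\x003")
except ValueError as exc:
    print("rejected:", exc)
```

The strict variant is the one that matches the quoted documentation: a NUL is not whitespace and not a digit in any base, so the only honest outcome is an exception.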
Re: [Python-Dev] pydoc - revised
On 1/14/07, Laurent Gautier <[EMAIL PROTECTED]> wrote:
> Still, the thought process is not over and we were thinking of having discussions
> recorded either in the form of posts on python-dev, or say using the mailing list
> on the sourceforge page. An example of discussions we had are about
> whether we should already name the module pydoc or an other name,
> Posting on python-dev would give exposure to what is discussed, but at
> the same could be perceived as off-topic until the module is in the sandbox.
> Posting on the sourceforge page would have pretty much opposite points
> (no exposure, but no one would feel annoyed)
>
> Any suggestion ?

Don't know how other list subscribers think, but if this is going to take time it would be best to either use the SF one or get one on mail.python.org until you are at a point that you are in the sandbox. If we allowed you guys to talk here then everyone who wants to write up a replacement for something in the stdlib would want to use python-dev and that would be too much.

-Brett
Re: [Python-Dev] Rewrite of import in Python source (sans docs) is complete
I am really looking to get into hacking on CPython and I'm keenly interested in your security work (my top reason for hoping I can make PyCon. Keeping fingers crossed!), so if you need help with this in order to focus on other things, I'd be delighted to try my hand at the task. Do you have some docs up anywhere on what direction you hope this will go in from here?

On 1/5/07, Brett Cannon <[EMAIL PROTECTED]> wrote:
> Finally, after a few months' worth of work, I have gotten far enough in my import rewrite that I am willing to stick my neck out and say it is semantically complete! You can find it in the sandbox under import_in_py.
>
> So, details of this implementation. I implemented PEP 302 importers/loaders for built-in, frozen, extension, .py, and .pyc files along with rewriting the steps __import__ goes through to do an import. I also developed an API for .py/.pyc file handling so that there is a generic filesystem importer/loader and a separate handler for .py/.pyc files. This should allow for (relatively) easy selective overriding of just how .py/.pyc files are stored (e.g., introducing a database backend) or how variants on .py/.pyc files are handled (e.g., Quixote's .ptl format).
>
> This code has extensive tests and so I am fairly confident that it does what is expected of an import rewrite. There are actually more lines in the test file than the implementation. There is also a mock implementation used for testing. It was interesting doing this in such a test-driven, XP style of only coding what I needed.
>
> I have run this code through the entire regression test suite and that is where you find out subtle differences between this implementation and the built-in import (you can see for yourself with the regrtest.sh shell script). First, test_pkg will fail because currently the new import adds a __loader__ attribute on all modules (that might change for security reasons) and test_pkg is an old, stdout-comparing test. Second, test_runpy fails because I have not implemented get_code on the filesystem loader, which is required by runpy. Both are shallow issues that can be dealt with.
>
> Third, and the hardest difference to deal with, is that you will get some warnings that print out that you normally don't see. This is because warnings.warn and its stacklevel argument don't have the effect people are used to when importing a deprecated module. Before, you could set stacklevel to 2 and it would look like the warning came from the import statement. But now, with import written in Python and thus on the call stack, compared to being in C and thus not showing up, two levels back is still in the import code. I really don't know how this should be dealt with short of saying that the rule of thumb of going 2 stack levels back for a warning does not work when done at the import level.
>
> It is not blazing fast at the moment. Some things, like the built-in and frozen importers/loaders, could be rewritten in C without huge issue. I am also sure I have made some stupid design decisions at various points in the code. But there is benchmarking code in the sandbox called importbench and it showed a 10x slowdown on a Linux box I was using in mid to late December when doing a fresh import of certain types (I don't remember exactly which kind off the top of my head).
>
> Because of this current slowness I don't know if people want to rush into trying to make this the default import implementation quite yet or if this is not too big of a thing since the common case of pulling out of sys.modules is not that much slower. I know I am currently not planning on devoting the time to bootstrap it in as I have my security work to finish first along with other Python stuff that seems more pressing to me. And since (I think) I don't need to bootstrap it in order to finish my security work I can't justify spending work time on it. But I can rearrange priorities if people really want to pursue this (especially if I can get some help with it).
>
> As for the module's name, it is currently named 'importer', but that is bad since it conflicts with the idea of importers from PEP 302. I was thinking importlib, but I wanted to wait and see what other people thought.
>
> Don't know if you guys are okay with me checking this in without having it vetted by the community first like we prefer all new modules to do. I have not done the LaTeX docs yet.
>
> I think that is all of the details that I can think of. I am still working towards implementing the security needed so that an application that embeds Python can execute arbitrary code securely. Giving a talk at PyCon on the topic for anyone interested.
>
> Special thanks need to go to Paul Moore who I talked to through most of the design of the code. Nick Coghlan also provided some handy feedback. And Neal Norwitz for bugging me about wanting something like this done. P
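The "database backend" idea in the announcement, where module source is stored somewhere other than .py files, can be sketched with a small importer. This is not the sandbox code: it assumes a current Python 3 with importlib in the standard library and uses the find_spec/exec_module protocol rather than the original PEP 302 find_module/load_module methods. SOURCE_STORE, StoreImporter and stored_greeting are hypothetical names, and a plain dict stands in for the database.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical stand-in for a database table mapping module names to source text.
SOURCE_STORE = {
    "stored_greeting": "def greet(name):\n    return 'hello, ' + name\n",
}


class StoreImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object; PEP 302 allows the two roles to be combined."""

    def __init__(self, store):
        self.store = store

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.store:
            return None  # let the other finders on sys.meta_path handle it
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        # Fetch the source from the store and run it in the module's namespace.
        source = self.store[module.__name__]
        code = compile(source, "<store:%s>" % module.__name__, "exec")
        exec(code, module.__dict__)


# Appended last, so the regular finders are consulted first.
sys.meta_path.append(StoreImporter(SOURCE_STORE))

import stored_greeting

print(stored_greeting.greet("python-dev"))  # hello, python-dev
```

Only the storage lookup in exec_module is specific to the backend, which is the kind of selective overriding the announcement describes. The imported module also ends up with a __loader__ attribute pointing at the importer, the same attribute the message mentions in connection with test_pkg.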
Re: [Python-Dev] The bytes type
Guido van Rossum wrote:
> On 1/12/07, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
>> [A.M. Kuchling]
>>> 2.6 wouldn't go changing existing APIs to begin requiring or returning
>>> the bytes type[*], of course, but extensions and new modules might use
>>> it.
>> The premise is dubious.
>>
>> If I am currently maintaining a module, why would I switch to a bytes type
>> and forgo compatibility with Py2.5 and prior? I might as well just convert
>> it to run on Py3.0 and leave my Py2.5 code as-is for people who want to
>> run 2.x.
>>
>> If I'm writing a new module, what's the point of twisting myself into knots
>> to get it to run on both Py2.6 and Py3.0? That just makes coding harder
>> (by limiting me to the intersection of the feature sets).
>>
>> I think we should draw a line in the sand and resolve not to garbage-up
>> Py2.6.
>> The whole Py3.0 project is about eliminating cruft and being free of the
>> bonds of backwards compatibility. Adding non-essential cruft to Py2.6
>> goes against that philosophy.
>
> I'm not so sure, since 2.6 will likely be out and stable long before
> 3.0 gains much of a foothold. I believe the experience with a similar
> approach in the Zope world for the 2->3 transition was overall a
> favorable one.
>
> However, I'd be loath to make any compromises in 3.0 in order to make
> life easier for 2.6. The burden must be on 2.6 (and 2.7-2.9), and if
> it's just impossible, that's too bad for them.
>
This may have been discussed on the Py3k list, and/or may be inappropriate for python-dev. I just wondered if you could say whether the developers of Jython, IronPython and PyPy have expressed any interest in, and/or commitment to, supporting Py3.0.

It's important that the development of 3.0 doesn't fragment the development community (not to mention the user community), and Jython is already aiming at a moving target in its attempts to catch up.

regards
 Steve
--
Steve Holden +44 150 684 7255 +1 800 494 3119
Holden Web LLC/Ltd http://www.holdenweb.com
Skype: holdenweb http://del.icio.us/steve.holden
Blog of Note: http://holdenweb.blogspot.com
Re: [Python-Dev] buglet in long("123\0", 10)
SVN rev 52305 resolved Bug #1545497: when given an explicit base, int() ignored NULs embedded in the string to convert.

However, the same fix wasn't applied for long().

n

On 1/13/07, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> What's wrong with this session? :-)
>
> Python 2.6a0 (trunk:53416, Jan 13 2007, 15:24:17)
> [GCC 4.0.1 (Apple Computer, Inc. build 5363)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> int('123\0')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: null byte in argument for int()
> >>> int('123\0', 10)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: invalid literal for int() with base 10: '123\x00'
> >>> long('123\0')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> ValueError: null byte in argument for long()
> >>> long('123\0', 10)
> 123L
> >>>
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Rewrite of import in Python source (sans docs) is complete
On 1/14/07, Calvin Spealman <[EMAIL PROTECTED]> wrote:
> I am really looking to get into hacking on CPython

And doing the python-dev Summaries will definitely help with that. =)

> and I'm keenly
> interested in your security work (my top reason for hoping I can make
> PyCon. Keeping fingers crossed!),

I guess I'd better not screw up on the presentation.

> so if you need help with this to
> focus on other things, I'd be delighted to try my hand at the task. Do
> you have some docs up anywhere of what direction you hope this to go in
> from here?

The progress of the work is being kept in a Google Doc at http://docs.google.com/View?docid=dg7fctr4_4d8tdbq . As for the overall design, you can read securing_python.txt in the bcannon-objcap branch off the trunk. It's a little outdated (mostly because it dates back to when I was trying to support multiple interpreters in a single process and thinking about how to expose all of it for Python source code apps), but otherwise it is still accurate in terms of how the security is being designed to work.

Overall the goal is to get it so that you can embed Python in your C app but have it be secure, so that if you have people use Python as a plug-in or DSL you don't have to worry about them ruining your machine. I have ideas on how to extend the security to be more general, and possibly to be used in Python source code apps, but it looks like that will be outside of my Ph.D. thesis and thus I won't be actively working on it (heck, this current work isn't either, but I have put so much time in at this point that my supervisor and I want to get a published paper out of it eventually).

-Brett
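The stack-level problem raised in the import rewrite announcement can be reproduced without any import machinery at all. The sketch below is an illustration under assumptions, not code from the sandbox: python_level_import_machinery is a hypothetical stand-in for import code written in Python, and the helper blamed_function works out which function a recorded warning was attributed to by comparing line numbers.

```python
import warnings


def deprecated_api():
    # stacklevel=2 attributes the warning to whichever Python frame called us.
    warnings.warn("deprecated_api is deprecated", DeprecationWarning, stacklevel=2)


def python_level_import_machinery():
    # Hypothetical stand-in for import written in Python: it adds one frame.
    deprecated_api()


def user_code_direct():
    deprecated_api()


def user_code_via_machinery():
    python_level_import_machinery()


CANDIDATES = [
    deprecated_api,
    python_level_import_machinery,
    user_code_direct,
    user_code_via_machinery,
]


def blamed_function(trigger, candidates=CANDIDATES):
    """Return the name of the function the warning was attributed to."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        trigger()
    lineno = caught[0].lineno
    # The blamed function is the last candidate that starts at or before that line.
    containing = [f for f in candidates if f.__code__.co_firstlineno <= lineno]
    return max(containing, key=lambda f: f.__code__.co_firstlineno).__name__


print(blamed_function(user_code_direct))         # user_code_direct
print(blamed_function(user_code_via_machinery))  # python_level_import_machinery
```

With the extra Python frame in between, the same stacklevel=2 call points at the machinery instead of the user's code, which is the effect described for deprecated modules imported through a Python-level import.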
Re: [Python-Dev] buglet in long("123\0", 10)
Is it a more general problem that null-terminated strings are used with data from strings we specifically allow to contain null bytes? Perhaps a migration of *FromString() to *FromStringAndSize() functions, or taking Python string object pointers, would be a more general solution to set as a goal, so this sort of thing can't crop up down the road, again.

I know I'm still very uninitiated in the internals of CPython, so anyone please correct me if my thoughts here are against any on-going policy or reasoning.

On 1/14/07, Neal Norwitz <[EMAIL PROTECTED]> wrote:
> SVN rev 52305 resolved Bug #1545497: when given an explicit base,
> int() did ignore NULs embedded in the string to convert.
>
> However, the same fix wasn't applied for long().
>
> n

--
Read my blog! I depend on your acceptance of my opinion! I am interesting!
http://ironfroggy-code.blogspot.com/
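The difference between a char*-only interface and one that also carries a length can be seen from Python through ctypes. This is a small illustration of the general point only, not of the CPython functions named above; the variable names are made up for the example.

```python
import ctypes

payload = b"123\x003"  # five bytes with a NUL in the middle

buf = ctypes.create_string_buffer(payload)

# .value reads the buffer the way a NUL-terminated char* API would:
# everything after the first NUL is invisible.
as_c_string = buf.value

# .raw plus the known length is what a *FromStringAndSize()-style API
# could work with: the NUL is just another byte.
with_length = buf.raw[:len(payload)]

print(as_c_string)   # b'123'
print(with_length)   # b'123\x003'
```

Any API that receives only the pointer has no way to tell the two payloads "123" and "123\x003" apart, which is why the check has to happen before the data is handed over or the length has to travel with it.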
Re: [Python-Dev] buglet in long("123\0", 10)
On 1/14/07, Calvin Spealman <[EMAIL PROTECTED]> wrote:
> Is it a more general problem that null-terminated strings are used
> with data from strings we specifically allow to contain null bytes?
> Perhaps a migration of *FromString() to *FromStringAndSize()
> functions, or taking Python string object pointers, would be a more
> general solution to set as a goal, so this sort of thing can't crop up
> down the road, again.

Most of the time this is taken care of by the argument type codes passed to PyArg_ParseTuple() and friends. If you use 's' then it assumes the string is for consumption of C code that uses null-termination, and it checks that there are no null bytes. Try open("foo\0").

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
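The open("foo\0") suggestion is easy to try. The sketch below assumes a current Python 3, where the rejection surfaces as ValueError; older releases are believed to have raised TypeError instead, so both are caught. rejects_embedded_nul is a hypothetical helper name.

```python
def rejects_embedded_nul(path):
    """Return True if open() refuses the path because of an embedded NUL."""
    try:
        handle = open(path)
    except (ValueError, TypeError):
        # The argument check fails before any file is looked up.
        return True
    except OSError:
        # The path reached the operating system, so the NUL was not caught.
        return False
    handle.close()
    return False


print(rejects_embedded_nul("foo\0"))

# The explicit-base integer conversion is checked the same way in Python 3.
try:
    int("123\0", 10)
except ValueError as exc:
    print("int() rejected it:", exc)
```

In both cases the string never reaches C code that would stop reading at the NUL, which is the protection the 's' type code is described as providing.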