Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
Hi Stefan,

yes, I'm working on this. In fact, I'm trying to recompile the json module (http://docs.python.org/library/json.html), adding some type definitions and Cython features to make the code faster.

I'm running into trouble with a few things too, so I'll enumerate them here so that you can give me some tips on how to solve them.

1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules just like they get generated by cython?

2 - Because I'm having trouble with issue #1, I'm running the tests manually: I go to %Python-dir%/Lib/tests/json_tests, take the files corresponding to the tests Python provides, and run them manually.

3 - To measure the performance of the module, I'm thinking about using the timeit function in the unit tests for the project. I think a good number of executions could be made, and it would be possible to compare the timings.

4 - I didn't create the .pxd files. Some problems are happening: it says methods are not defined, but they are defined. I will try to investigate this further.

The code is in this repository: https://github.com/arthursribeiro/JSON-module
Your feedback would be very important, so that I can improve my skills and be able to start working on the project sooner.

I think some things implemented in this rewriting process are going to be useful when doing this with C modules...

Thank you very much.

Best Regards.

[]s

Arthur

2011/4/12 Stefan Behnel
> Arthur de Souza Ribeiro, 08.04.2011 02:43:
>> 2011/4/7 Robert Bradshaw
>>> What I'd like to see is an implementation of a single simple but not entirely trivial (e.g. not math) module, passing regression tests with comparable if not better speed than the current C version (though I think it'd probably make sense to start out with the Python version and optimize that). E.g. http://docs.python.org/library/json.html looks like a good candidate. That should only take 8 hours or so, maybe two days at most, given your background. I'm not expecting anything before the application deadline, but if you could whip something like this out in the next week to point to that would help your application out immensely. In fact, one of the Python foundation's requirements is that students submit a patch before being accepted, and this would knock out that requirement and give you a chance to prove yourself. Create an account on https://github.com and commit your code into a new repository there.
>>
>> I will start the implementation of json module right now. I created my github account and as soon as I have code implemented I will send repository link.
>
> Any news on this? We're currently discussing which of the Cython-related projects to mentor. It's likely that not all of our projects will get accepted, so if you could get us a good initial idea about your work, we'd have a stronger incentive to value yours over the others.
>
> Stefan

___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Test runner
On Mon, Apr 11, 2011 at 3:56 AM, mark florisson wrote:
> On 11 April 2011 12:53, mark florisson wrote:
>> On 11 April 2011 12:45, Stefan Behnel wrote:
>>> mark florisson, 11.04.2011 12:26:
>>>> Can we select tests in the tests directory selectively? I see the -T or --ticket option, but it doesn't seem to find the test tagged with # ticket:. I can select unit tests using python runtests.py Cython.SubPackage.Tests.SomeTest, but I can't seem to do the same thing for tests in the tests directory. Running the entire suite takes rather long.
>>>
>>> You can still select them by name using a regex, e.g.
>>>
>>> runtests.py 'run\.empty_builtin_constructors'
>>>
>>> Stefan
>>
>> Great, thanks! I'll update the hackerguide wiki.
>
> I see now that it is briefly mentioned there, apologies.

I've added a note there about tags as well, and fixed the -T to look at the ticket tag. Note that "mode:run" is the default, so you don't need to explicitly tag mode except for compile/error tests.

- Robert
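For readers following along, here is a minimal sketch of why the dot is escaped in the selector 'run\.empty_builtin_constructors' above. It is plain Python and not the runtests.py implementation: the test names and the select_tests helper are made up for illustration, and it assumes the selector is applied with regular-expression search semantics.

```python
import re

# Hypothetical test names in "<directory>.<module>" form, for illustration only.
TEST_NAMES = [
    "run.empty_builtin_constructors",
    "run.generators",
    "compile.empty_builtin_constructors",
    "runXempty_builtin_constructors",
]


def select_tests(pattern, names):
    """Return the names in which the regular expression is found (re.search)."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]


if __name__ == "__main__":
    # Escaped dot: only a literal "." between "run" and the module name matches.
    print(select_tests(r"run\.empty_builtin_constructors", TEST_NAMES))
    # Unescaped dot: "." matches any character, so "runX..." is selected too.
    print(select_tests(r"run.empty_builtin_constructors", TEST_NAMES))
```

Quoting the pattern in the shell, as in the original command, keeps the backslash from being consumed before the script sees it.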
Re: [Cython] cython-docs repository
On Sat, Apr 9, 2011 at 10:13 AM, Jason Grout wrote:
> On 4/9/11 12:02 PM, Robert Bradshaw wrote:
>>
>> Yep, we did that during the workshop. I thought I had sent out an announcement, but I guess not.
>
> Is there a summary anywhere of the exciting things that happened in the workshop?

Not yet, but I'll post as soon as I have a writeup.

> For example, it seems that generators are finally in, if I read the commit logs correctly. Is that true? If so, fantastic!

Yep!

> Any idea of a timeline for that to make it into an official release?

There's still some fallout and more testing to do, but hopefully it won't be too long before a release. The biggest step back seems to be the disabling of inline generators; they're now full-fledged generators, which is an optimization regression but not a feature regression.

- Robert
Re: [Cython] "Cython's Users Guide"
On Mon, Apr 11, 2011 at 2:35 PM, Francesc Alted wrote:
> 2011/4/11 William Stein
>>
>> Hi,
>>
>> I'm teaching Cython in my Sage course yet again, and noticed that again there are some very confusing aspects of the Cython documentation organization, which could probably be improved by a few simple changes.
>>
>> 1. At http://cython.org/ there is a big link in the middle of the page labeled "Cython Users Guide" which goes to http://docs.cython.org/. However, http://docs.cython.org/ is *not* the users guide -- it is "Cython’s Documentation". In fact, the Users Guide is Chapter 3 of the documentation.
>>
>> 2. Looking at http://docs.cython.org, we see that Chapter 2 is "Tutorials". But then looking down to Chapter 3 we see that it is "Cython Users Guide". Of course, that's what one is after having just clicked a link called "Cython Users Guide". So we click on "Cython Users Guide" again.
>>
>> 3. We arrive at a page that again has "Tutorial" as Chapter 2. For some reason this makes me feel even more confused.
>>
>> Recommended changes:
>>
>> 1. Change the link on the main page from "Cython Users Guide" to "Documentation" or put a direct link into the Users Guide, or have two links.
>>
>> 2. At http://docs.cython.org/ rename the "Cython Users Guide" to "Users Guide", since it is obviously the Cython Users Guide at this point and "Cython documentation" is in the upper left of the page everywhere.
>>
>> 3. Possibly rename the tutorial in chapter 2 of the users guide to something like "First Steps" or "Basic Tutorial" or something.

Thanks for the suggestions. Done.

> Yeah, that's something that we discussed in the past workshop in Munich (BTW, many thanks for providing the means for making this happen!). The basic idea is to completely remove Chapter 3 (Cython Users Guide) by moving its parts either to Chapter 2 (Tutorials) or to Chapter 4 (Reference Guide). During the meeting we agreed that the doc repository should be moved (and it has indeed been moved) into the source repo, so that modifications that affect both code and docs can be put in the same commit/branch. Also, the wiki has a lot of information that can be better consolidated and integrated into the User's Guide.
>
> In fact, I already started some work in this direction and created a couple of pull requests during the workshop (which have already been integrated). I plan to continue this work, but unfortunately I'm pretty busy lately, so I don't think I can make much progress in the next few weeks. If anybody is interested in joining the effort to improve Cython's documentation, she will be very welcome indeed!

+1, and thanks, Francesc, for the work you've started in moving us in this direction.

- Robert
[Cython] Code examples missing in Cython User's Guide
My apologies for cross-posting this from cython-users, but I realized I should have sent this bug report to cython-devel. Several code examples are missing from the User's Guide due to the source code files being moved or deleted. See for example the Tutorial page on the User's Guide http://docs.cython.org/src/userguide/tutorial.html The code for both fib.pyx and primes.pyx (and their setup.py files) is absent from the document. I looked at the ReST files and they are trying to source the files from an "examples" directory. Looking through the git repository, I wasn't able to locate this directory. Has it been moved or deleted? ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
On 12.04.2011 14:59, Arthur de Souza Ribeiro wrote:
> 1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules just like they get generated by cython?

I'll propose these 10 guidelines:

1. The major concern is to replace the manual use of the Python C API with Cython. We aim to improve correctness and readability, not speed.

2. Replacing plain C with Cython for readability is less important, sometimes even discouraged. If you do, it's OK to leverage Python container types if it makes the code concise and readable, even if it sacrifices some speed.

3. Use exceptions instead of C-style error checks: it's better to ask forgiveness than permission.

4. Use exceptions correctly. All C resource allocation belongs in __cinit__. All C resource deallocation belongs in __dealloc__. Remember that exceptions can cause resource leaks if you don't. Wrap all resource allocation in an extension type. Never use functions like malloc or fopen directly in your Cython code, except in a __cinit__ method.

5. We should keep as much of the code in Python as we can. Replacing Python with Cython for speed is less important. Only the parts that will really benefit from static typing should be changed to Cython.

6. Leave the __init__.py file as it is. A Python package is allowed to contain a mix of Python source files and Cython extension libraries.

7. Be careful to release the GIL whenever appropriate, and never release it otherwise. Don't yield the GIL just because you can; it does not come for free, even with a single thread.

8. Use the Python and C standard libraries whenever you can. Don't re-invent the wheel. Don't use system-dependent APIs when the standard libraries declare a common interface. Callbacks to Python are OK.

9. Write code that will work correctly on 32- and 64-bit systems, big- or little-endian. Know your C: Py_intptr_t can contain a pointer. Py_ssize_t can represent the largest array size allowed. Py_intptr_t and Py_ssize_t can have different sizes. The native array offset can be different from Py_ssize_t, for which a common example is AMD64.

10. Don't clutter the namespace; use pxd includes. Short source files are preferred to long. Simple is better than complex. Keep the source nice and tidy.

Sturla
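As a small aid for guideline 9, the following sketch prints the sizes of a few C integer types on whatever platform runs it. It is plain Python using ctypes, not Cython, and it rests on assumptions: the integer_type_sizes helper is a name invented here, the size of void* stands in for Py_intptr_t, and ctypes.c_ssize_t stands in for Py_ssize_t. Because ctypes derives c_ssize_t from the pointer width, this sketch cannot itself exhibit a platform where the two differ; it mainly shows that int and long need not match the pointer width.

```python
import ctypes


def integer_type_sizes():
    """Return the sizes in bytes of a few C types on the running platform."""
    return {
        # Stand-in for Py_intptr_t, which must be able to hold a pointer.
        "void*": ctypes.sizeof(ctypes.c_void_p),
        # Stand-in for Py_ssize_t (assumption: it follows the platform ssize_t).
        "ssize_t": ctypes.sizeof(ctypes.c_ssize_t),
        # Plain C integers, whose width is not tied to the pointer width.
        "long": ctypes.sizeof(ctypes.c_long),
        "int": ctypes.sizeof(ctypes.c_int),
    }


if __name__ == "__main__":
    for name, size in sorted(integer_type_sizes().items()):
        print("%-8s %d bytes" % (name, size))
```

Comparing the output of a 32-bit and a 64-bit build is a quick way to see which of these types change width and which do not.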
Re: [Cython] Code examples missing in Cython User's Guide
Hey Chris,

the code for the primes and fib examples is in the 'Demos' directory of the repository...

Best Regards.

[]s

Arthur

2011/4/12 Chris Lasher
> My apologies for cross-posting this from cython-users, but I realized I should have sent this bug report to cython-devel.
>
> Several code examples are missing from the User's Guide due to the source code files being moved or deleted. See for example the Tutorial page on the User's Guide http://docs.cython.org/src/userguide/tutorial.html The code for both fib.pyx and primes.pyx (and their setup.py files) is absent from the document.
>
> I looked at the ReST files and they are trying to source the files from an "examples" directory. Looking through the git repository, I wasn't able to locate this directory. Has it been moved or deleted?
Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
Arthur de Souza Ribeiro, 12.04.2011 14:59:
> Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json module (http://docs.python.org/library/json.html), adding some type definitions and Cython features to make the code faster.

Cool.

> I'm running into trouble with a few things too, so I'll enumerate them here so that you can give me some tips on how to solve them.
>
> 1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules just like they get generated by cython?

The __init__.py doesn't really look performance critical. It's better to leave that module in plain Python; that improves readability by reducing surprises and simplifies reuse by other implementations.

That being said, you can compile each module separately: just use the "cython" command line tool for that, or write a little distutils script as in

http://docs.cython.org/src/quickstart/build.html#building-a-cython-module-using-distutils

Don't worry too much about a build integration for now.

> 2 - Because I'm having trouble with issue #1, I'm running the tests manually: I go to %Python-dir%/Lib/tests/json_tests, take the files corresponding to the tests Python provides, and run them manually.

That's fine.

> 3 - To measure the performance of the module, I'm thinking about using the timeit function in the unit tests for the project. I think a good number of executions could be made, and it would be possible to compare the timings.

That's ok for a start; artificial benchmarks are good to test specific functionality. However, unit tests tend to be short running with a lot of overhead, so later on, you will need to use real code to benchmark the modules. I would expect that there are benchmarks for JSON implementations around, and you can just generate a large JSON file and run loads and dumps on it.

> 4 - I didn't create the .pxd files. Some problems are happening: it says methods are not defined, but they are defined. I will try to investigate this further.

When reporting usage related problems (preferably on the cython-users mailing list), it's best to present the exact error messages and the relevant code snippets, so that others can quickly understand what's going on and/or reproduce the problem.

> The code is in this repository: https://github.com/arthursribeiro/JSON-module
> Your feedback would be very important, so that I can improve my skills and be able to start working on the project sooner.

I'd strongly suggest implementing this in pure Python (.py files instead of .pyx files), with externally provided static types for performance. A single code base is very advantageous for a large project like CPython, much more than the ultimate 5% better performance.

> I think some things implemented in this rewriting process are going to be useful when doing this with C modules...

Well, if you can get the existing Python implementation up to mostly comparable speed as the C implementation, then there is no need to care about the C module anymore. Even if you can get only 90% of a module to run at comparable speed, and need to keep 10% in plain C, that's already a huge improvement in terms of maintainability.

Stefan
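The benchmarking advice in this message (generate a large JSON document and run loads and dumps on it) could look roughly like the sketch below. It is only an illustration under stated assumptions: make_document and benchmark are names invented here, the document shape and sizes are arbitrary, and the standard library json module stands in for whichever implementation is being measured.

```python
import json
import timeit


def make_document(n_items=1000):
    """Build a moderately large, nested, JSON-serialisable document."""
    return {
        "items": [
            {"id": i, "name": "item-%d" % i, "tags": ["a", "b", "c"], "value": i * 0.5}
            for i in range(n_items)
        ]
    }


def benchmark(module, document, repeat=3, number=5):
    """Return the best (dumps, loads) times in seconds over `repeat` runs.

    `module` is any object exposing json-compatible dumps() and loads().
    """
    text = module.dumps(document)
    dumps_time = min(
        timeit.repeat(lambda: module.dumps(document), repeat=repeat, number=number)
    )
    loads_time = min(
        timeit.repeat(lambda: module.loads(text), repeat=repeat, number=number)
    )
    return dumps_time, loads_time


if __name__ == "__main__":
    doc = make_document()
    dumps_time, loads_time = benchmark(json, doc)
    print("dumps: %.4fs  loads: %.4fs" % (dumps_time, loads_time))
```

Passing a differently built module with the same dumps/loads interface to benchmark would give numbers that can be compared side by side; taking the minimum over several repeats reduces noise from other processes.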
Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
On Tue, Apr 12, 2011 at 11:22 AM, Stefan Behnel wrote:
> Arthur de Souza Ribeiro, 12.04.2011 14:59:
>>
>> Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json module (http://docs.python.org/library/json.html), adding some type definitions and Cython features to make the code faster.
>
> Cool.
>
>> I'm running into trouble with a few things too, so I'll enumerate them here so that you can give me some tips on how to solve them.
>>
>> 1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules just like they get generated by cython?
>
> The __init__.py doesn't really look performance critical. It's better to leave that module in plain Python; that improves readability by reducing surprises and simplifies reuse by other implementations.
>
> That being said, you can compile each module separately: just use the "cython" command line tool for that, or write a little distutils script as in
>
> http://docs.cython.org/src/quickstart/build.html#building-a-cython-module-using-distutils
>
> Don't worry too much about a build integration for now.
>
>> 2 - Because I'm having trouble with issue #1, I'm running the tests manually: I go to %Python-dir%/Lib/tests/json_tests, take the files corresponding to the tests Python provides, and run them manually.
>
> That's fine.
>
>> 3 - To measure the performance of the module, I'm thinking about using the timeit function in the unit tests for the project. I think a good number of executions could be made, and it would be possible to compare the timings.
>
> That's ok for a start; artificial benchmarks are good to test specific functionality. However, unit tests tend to be short running with a lot of overhead, so later on, you will need to use real code to benchmark the modules. I would expect that there are benchmarks for JSON implementations around, and you can just generate a large JSON file and run loads and dumps on it.
>
>> 4 - I didn't create the .pxd files. Some problems are happening: it says methods are not defined, but they are defined. I will try to investigate this further.
>
> When reporting usage related problems (preferably on the cython-users mailing list), it's best to present the exact error messages and the relevant code snippets, so that others can quickly understand what's going on and/or reproduce the problem.
>
>> The code is in this repository: https://github.com/arthursribeiro/JSON-module
>> Your feedback would be very important, so that I can improve my skills and be able to start working on the project sooner.
>
> I'd strongly suggest implementing this in pure Python (.py files instead of .pyx files), with externally provided static types for performance. A single code base is very advantageous for a large project like CPython, much more than the ultimate 5% better performance.

While this is advantageous for the final product, it may not be the easiest to get up and running with.

>> I think some things implemented in this rewriting process are going to be useful when doing this with C modules...
>
> Well, if you can get the existing Python implementation up to mostly comparable speed as the C implementation, then there is no need to care about the C module anymore. Even if you can get only 90% of a module to run at comparable speed, and need to keep 10% in plain C, that's already a huge improvement in terms of maintainability.
>
> Stefan
Re: [Cython] GSoC Proposal - Reimplement C modules in CPython's standard library in Cython.
On Tue, Apr 12, 2011 at 8:33 AM, Sturla Molden wrote:
> On 12.04.2011 14:59, Arthur de Souza Ribeiro wrote:
>>
>> 1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules just like they get generated by cython?
>
> I'll propose these 10 guidelines:
>
> 1. The major concern is to replace the manual use of the Python C API with Cython. We aim to improve correctness and readability, not speed.

Speed is a concern, otherwise many of these modules wouldn't have been written in C in the first place (at least, not the ones with a pure Python counterpart). Of course some of them are just wrapping C libraries where speed doesn't matter as much.

> 2. Replacing plain C with Cython for readability is less important, sometimes even discouraged.

Huh? I'd say this is a big point of the project. Maybe less so than manual dependence on the C API, but certainly not discouraged.

> If you do, it's OK to leverage Python container types if it makes the code concise and readable, even if it sacrifices some speed.

That's true.

> 3. Use exceptions instead of C-style error checks: it's better to ask forgiveness than permission.

Yep, this is natural in Cython.

> 4. Use exceptions correctly. All C resource allocation belongs in __cinit__. All C resource deallocation belongs in __dealloc__. Remember that exceptions can cause resource leaks if you don't. Wrap all resource allocation in an extension type. Never use functions like malloc or fopen directly in your Cython code, except in a __cinit__ method.

This can be useful advice, but is not strictly necessary. Try..finally can fill this need as well.

> 5. We should keep as much of the code in Python as we can. Replacing Python with Cython for speed is less important. Only the parts that will really benefit from static typing should be changed to Cython.

True. Of course, compiling the (unchanged) pure Python files with Cython could also yield interesting results, but that's not part of the project.

> 6. Leave the __init__.py file as it is. A Python package is allowed to contain a mix of Python source files and Cython extension libraries.
>
> 7. Be careful to release the GIL whenever appropriate, and never release it otherwise. Don't yield the GIL just because you can; it does not come for free, even with a single thread.
>
> 8. Use the Python and C standard libraries whenever you can. Don't re-invent the wheel. Don't use system-dependent APIs when the standard libraries declare a common interface. Callbacks to Python are OK.
>
> 9. Write code that will work correctly on 32- and 64-bit systems, big- or little-endian. Know your C: Py_intptr_t can contain a pointer. Py_ssize_t can represent the largest array size allowed. Py_intptr_t and Py_ssize_t can have different sizes. The native array offset can be different from Py_ssize_t, for which a common example is AMD64.

It's rare to have to do pointer arithmetic in Cython, and rarer still to have to store the pointer as an integer.

> 10. Don't clutter the namespace; use pxd includes. Short source files are preferred to long. Simple is better than complex. Keep the source nice and tidy.

Not sure what you mean by "pxd includes," but yes, you should use pxd files and cimport just as you would in Python to keep things manageable and modular.

- Robert
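The try..finally alternative mentioned in the reply to guideline 4 can be sketched as follows. This is plain Python rather than Cython, and it makes an assumption for the sake of a runnable example: a raw file descriptor from os.open plays the role of a C-level resource such as one obtained from malloc or fopen. The count_bytes function is a name invented here.

```python
import os
import tempfile


def count_bytes(path):
    """Count the bytes in a file, closing the descriptor even if reading fails."""
    fd = os.open(path, os.O_RDONLY)  # acquire the low-level resource
    try:
        total = 0
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:  # an empty read means end of file
                break
            total += len(chunk)
        return total
    finally:
        os.close(fd)  # runs on normal return and on any exception


if __name__ == "__main__":
    handle, name = tempfile.mkstemp()
    try:
        os.write(handle, b"0123456789")
    finally:
        os.close(handle)
    print(count_bytes(name))
    os.remove(name)
```

The acquisition sits just before the try block so that the finally clause only ever releases a resource that was actually obtained; wrapping the resource in a type with paired setup and teardown, as guideline 4 proposes, achieves the same pairing at the level of object lifetime instead.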