Arthur de Souza Ribeiro, 12.04.2011 14:59:
Hi Stefan, yes, I'm working on this. In fact, I'm trying to recompile the json module (http://docs.python.org/library/json.html), adding some type definitions and Cython things to get the code faster.
Cool.
I'm getting in trouble with some things too. I'm going to enumerate them here so that you could give me some tips about how to solve them.
1 - Compile package modules - the json module is inside a package (files: __init__.py, decoder.py, encoder.py, decoder.py). Is there a way to generate the Cython modules for the package, just as Cython generates one for a single file?
The __init__.py doesn't really look performance critical. It's better to leave that module in plain Python; that improves readability by reducing surprises and simplifies reuse by other implementations.
That being said, you can compile each module separately. Just use the "cython" command line tool for that, or write a little distutils script as in
http://docs.cython.org/src/quickstart/build.html#building-a-cython-module-using-distutils
Don't worry too much about a build integration for now.
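A minimal sketch of such a build script follows. It is only an illustration: the package directory name "fastjson" is hypothetical (a local directory called "json" would shadow the standard library module while the script runs), and it assumes a Cython version that provides cythonize() together with setuptools, rather than the older distutils build_ext setup described in the linked quickstart.

```python
# setup.py: a sketch, not the project's actual build script.
import os
import sys

PACKAGE = "fastjson"               # hypothetical package directory name
COMPILED = ["decoder", "encoder"]  # __init__.py is left as plain Python


def module_sources(package=PACKAGE, names=COMPILED):
    """Return the paths of the .py files that Cython should compile."""
    return [os.path.join(package, name + ".py") for name in names]


# Only start a build when a command such as "build_ext" was given.
if __name__ == "__main__" and len(sys.argv) > 1:
    # Imported here so the file can be inspected without Cython installed.
    from setuptools import setup
    from Cython.Build import cythonize

    setup(name=PACKAGE, ext_modules=cythonize(module_sources()))
```

Running "python setup.py build_ext --inplace" should then place the compiled extension modules next to their .py sources. For a single file, "cython fastjson/decoder.py" generates just the C source.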
2 - Because I'm having trouble with issue #1, I'm running the tests manually: I go to %Python-dir%/Lib/tests/json_tests, get the files for the tests that the Python make runs, and run them manually.
That's fine.
3 - To measure the performance of the module, I'm thinking about using the timeit function in the unit tests for the project. I think a good number of executions could be made, and it would be possible to compare the timings.
That's ok for a start, artificial benchmarks are good to test specific functionality. However, unit tests tend to be short running with a lot of overhead, so later on, you will need to use real code to benchmark the modules. I would expect that there are benchmarks for JSON implementations around, and you can just generate a large JSON file and run loads and dumps on it.
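A rough sketch of that kind of benchmark, using only the standard library. The document shape, the record count and the function names (make_document, bench) are made up for illustration; a real benchmark should use data that resembles actual workloads.

```python
import json
import random
import string
import timeit


def make_document(n_records=2000, seed=0):
    """Build a reproducible list of dicts to serve as a large JSON document."""
    rng = random.Random(seed)

    def word():
        length = rng.randint(3, 12)
        return "".join(rng.choice(string.ascii_letters) for _ in range(length))

    return [
        {
            "id": i,
            "name": word(),
            "score": rng.random(),
            "tags": [word() for _ in range(5)],
            "active": i % 2 == 0,
            "parent": None,
        }
        for i in range(n_records)
    ]


def bench(module=json, n_records=2000, repeat=5, number=3):
    """Time module.dumps and module.loads; return the best seconds per call."""
    doc = make_document(n_records)
    text = module.dumps(doc)
    dumps_s = min(timeit.repeat(lambda: module.dumps(doc),
                                repeat=repeat, number=number)) / number
    loads_s = min(timeit.repeat(lambda: module.loads(text),
                                repeat=repeat, number=number)) / number
    return {"bytes": len(text), "dumps_s": dumps_s, "loads_s": loads_s}


if __name__ == "__main__":
    print(bench())
```

Passing a different module object (for example the compiled package) as the first argument to bench() gives numbers for the same input that can be compared directly.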
4 - I didn't create the .pxd files. Some problems are happening: it says methods are not defined, but they are defined. I will try to investigate this better.
When reporting usage related problems (preferably on the cython-users mailing list), it's best to present the exact error messages and the relevant code snippets, so that others can quickly understand what's going on and/or reproduce the problem.
The code is in this repository: https://github.com/arthursribeiro/JSON-module
Your feedback would be very important, so that I can improve my skills and become able to work on the project sooner.
I'd strongly suggest implementing this in pure Python (.py files instead of .pyx files), with externally provided static types for performance. A single code base is very advantageous for a large project like CPython, much more than the ultimate 5% better performance.
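To illustrate what "externally provided static types" can look like, here is a small sketch. The function and the file names are hypothetical and not taken from the json module. The .py file stays ordinary Python, and the type declarations live in a .pxd file of the same name, shown here as a comment.

```python
# scanner_sketch.py (hypothetical file name): plain Python that runs
# unchanged in any Python implementation, with or without Cython.

WHITESPACE = " \t\n\r"


def skip_whitespace(s, idx):
    """Return the index of the first non-whitespace character at or after idx."""
    n = len(s)
    i = idx
    while i < n and s[i] in WHITESPACE:
        i += 1
    return i


# scanner_sketch.pxd (hypothetical), placed next to the .py file, could
# look roughly like this. Cython picks it up when compiling the .py file:
#
#     cimport cython
#
#     @cython.locals(n=cython.Py_ssize_t, i=cython.Py_ssize_t)
#     cpdef skip_whitespace(s, Py_ssize_t idx)
```

The point of the split is that the same .py source serves both as the readable reference implementation and as the input for the compiled module.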
I think some things implemented in this rewriting process are going to be useful when doing this with C modules...
Well, if you can get the existing Python implementation up to a speed that is mostly comparable to the C implementation, then there is no need to care about the C module anymore. Even if you can get only 90% of a module to run at comparable speed, and need to keep 10% in plain C, that's already a huge improvement in terms of maintainability.
Stefan