Arthur de Souza Ribeiro, 17.04.2011 20:07:
Hi Stefan, about your first comment: "And it's better to let Cython know
that this name refers to a function." in line 69 of the encoder.pyx file — I
didn't quite understand what that means. Could you explain this comment in
more detail?

Hmm, sorry, I think that was not so important. That code line is only used to override the Python implementation with the one from the external C accelerator module: it assigns either of the two functions to a single name. When that name is later called in the code, Cython cannot know that it actually refers to a function and has to fall back to generic Python calling, whereas a visible c(p)def function defined inside the same module could be called faster.
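For context, the pattern in question resembles the accelerator-override idiom that CPython's json package uses (a simplified sketch, not the actual encoder.pyx code; the names here are illustrative):

```python
# Sketch of the C-accelerator override pattern (names simplified).

def py_make_encoder(markers, default):
    # Pure-Python fallback implementation (hypothetical stand-in).
    return lambda obj: str(obj)

try:
    # CPython ships a C accelerator module for json.
    from _json import make_encoder as c_make_encoder
except ImportError:
    c_make_encoder = None

# The name is bound to whichever implementation is available. Because
# the binding is only decided at runtime, a compiler like Cython cannot
# assume the name refers to a locally defined function and must use
# generic Python calling when it is invoked.
make_encoder = c_make_encoder or py_make_encoder
```

The point above is that a direct call to a cdef/cpdef function defined in the same module avoids that generic calling overhead.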

I missed the fact that this name isn't really used inside of the module, so whether Cython knows that it's a function or not isn't really all that important.

I added another comment to this commit, though:

https://github.com/arthursribeiro/JSON-module/commit/e2d80e0aeab6d39ff2d9b847843423ebdb9c57b7#diff-4


About the other comments, I think I solved them all. If there are any
problems with them, or with other ones, please tell me and I'll try to fix
them.

It looks like you fixed a good deal of them.

I actually tried to work with your code, but I'm not sure how you are building it. Could you give me a hint on that?

Where did you actually take the original code from? Python 3.2? Or from Python's hg branch?


Note that it's not obvious from your initial commit what you actually
changed. It would have been better to import the original file first, rename
it to .pyx, and then commit your changes.

I created a directory named 'Diff files' where I put the files generated by
the 'diff' command that I ran on my computer. If you think it would still be
better to commit first and then change, that's no problem for me...

Diff only gives you the final outcome. Committing on top of the original files has the advantage of making the incremental changes visible separately. That makes it clearer what you tried, and a good commit comment will then make it clear why you did it.


I think it's more important to get some performance
numbers to see how your module behaves compared to the C accelerator module
(_json.c). I think the best approach to this project would actually be to
start with profiling the Python implementation to see where performance
problems occur (or to look through _json.c to see what the CPython
developers considered performance critical), and then put the focus on
trying to speed up only those parts of the Python implementation, by adding
static types and potentially even rewriting them in a way that Cython can
optimise them better.

I've profiled the module I created and the module that is in Python 3.2;
the result is that the Cython module spent about 73% less time than Python's.

That's a common mistake when profiling: the absolute time a program takes to run under the profiler is not meaningful. Depending on how far the two profiled programs differ, they may interact with the profiler in more or less intensive ways (as is clearly the case here), so the total time the programs take to run can differ quite heavily under profiling, even if the non-profiled programs run at exactly the same speed.

Also, I don't think you have enabled profiling for the Cython code. You can do that by passing the "profile=True" directive to the compiler, or by putting it at the top of the source files. That will add calls to functions inside the module to the profiling output. Note, however, that enabling profiling will slow down the execution, so disable it when you measure absolute run times.

http://docs.cython.org/src/tutorial/profiling_tutorial.html
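To illustrate: for the Cython sources, the directive goes at the very top of the .pyx file as a comment ("# cython: profile=True"); the profiling session itself can then be driven from plain Python with cProfile. A minimal sketch, here profiling the stdlib json module rather than your code:

```python
import cProfile
import io
import json
import pstats

# For Cython modules, first put
#     # cython: profile=True
# at the top of the .pyx file (or pass the directive to the compiler);
# otherwise the Cython-compiled functions will not appear in the
# profile at all. The driver below is plain cProfile either way.

data = {"nested": {"key": [1, 2, 3]}}

profiler = cProfile.Profile()
profiler.enable()
for _ in range(10000):
    json.dumps(data)
profiler.disable()

# Print the five most expensive entries by cumulative time.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```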


(blue for cython, red for python):

Colours tend to pass rather badly through mailing lists. Many people disable the HTML presentation of e-mails, and plain text does not have colours. But it was still obvious enough what you meant.


The behaviour of my module seems to be the same as Python's; I think
that's the way it should be.

JSONModule nested_dict
          10004 function calls in 0.268 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     10000    0.196    0.000    0.196    0.000 :0(dumps)

This is a pretty short list (I stripped the uninteresting parts). The profile right below shows a lot more entries in encoder.py. It would be good to see these calls in the Cython code as well.


json nested_dict
          60004 function calls in 1.016 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1    0.000    0.000    1.016    1.016 :0(exec)
     20000    0.136    0.000    0.136    0.000 :0(isinstance)
     10000    0.120    0.000    0.120    0.000 :0(join)
         1    0.000    0.000    0.000    0.000 :0(setprofile)
         1    0.088    0.088    1.016    1.016 <string>:1(<module>)
     10000    0.136    0.000    0.928    0.000 __init__.py:180(dumps)
     10000    0.308    0.000    0.792    0.000 encoder.py:172(encode)
     10000    0.228    0.000    0.228    0.000 encoder.py:193(iterencode)
[...]
JSONModule ustring
          10004 function calls in 0.140 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     10000    0.072    0.000    0.072    0.000 :0(dumps)
[...]

json ustring
          40004 function calls in 0.580 seconds

    Ordered by: standard name

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     10000    0.092    0.000    0.092    0.000 :0(encode_basestring_ascii)
         1    0.004    0.004    0.580    0.580 :0(exec)
     10000    0.060    0.000    0.060    0.000 :0(isinstance)
         1    0.000    0.000    0.000    0.000 :0(setprofile)
         1    0.100    0.100    0.576    0.576 <string>:1(<module>)
     10000    0.152    0.000    0.476    0.000 __init__.py:180(dumps)
     10000    0.172    0.000    0.324    0.000 encoder.py:172(encode)

The code is updated in the repository; please let me know any comments you
might have. Thank you very much for your feedback.

Thank you for the numbers. Could you add absolute timings using timeit? And maybe also try with larger input data?
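For the absolute timings, something along these lines would do (a sketch; the input data here is made up for illustration):

```python
# Hypothetical benchmark with timeit, measuring real run time
# independent of any profiler overhead.
import timeit

setup = "import json; data = {'nested': {'key': list(range(100))}}"
stmt = "json.dumps(data)"

# repeat() returns one total time per run; the minimum is usually the
# most reliable estimate, since the other runs only add noise from
# the rest of the system.
timings = timeit.repeat(stmt, setup=setup, repeat=3, number=10000)
best = min(timings)
print("best of 3: %.3f s for 10000 calls" % best)
```

The same statement can then be pointed at your compiled module instead of the stdlib json to compare the two.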

ISTM that a lot of the overhead comes from calls that Cython can easily optimise all by itself: isinstance() and (bytes|unicode).join(). That's the kind of observation that led me to suggest starting with benchmarking and profiling in the first place. Cython-compiled code has quite different performance characteristics from code executing in CPython's interpreter, so it's important to start by getting an idea of how the code behaves when compiled, and then optimise it in the places where it still needs to run faster.

Optimisation is an incremental process, and you will often end up reverting changes along the way when you see that they did not improve the performance, or maybe just made it so slightly faster that the speed improvement is not worth the code degradation of the optimisation change in question.

Could you try to come up with a short list of important code changes you made that let this module run faster, backed by some timings that show the difference with and without each change?

Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
