Vitja Makarov, 12.08.2011 08:49:
Recently I found one more Python-to-C compiler, which translates
Python bytecode into C source.
The author claims 100% Python compatibility.

That's clearly incorrect. From the code, it appears that at least the builtins are static, which means that it's *not* Python compatible (even less so than Cython currently is). I'm also not sure about the integer type handling - worth some investigation, but it looks somewhat sloppy.
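
To illustrate why static builtins break compatibility, here is a minimal sketch in plain Python. The function names and the replacement `len` are made up for the example and have nothing to do with 2c's code. CPython looks up `len` at call time, so rebinding it changes the result. A compiler that hard-wires the builtin would presumably return 5 in both cases.

```python
import builtins


def total_length(items):
    # CPython resolves the name `len` at call time:
    # module globals first, then the `builtins` module.
    return sum(len(item) for item in items)


def run_with_patched_len():
    original_len = builtins.len
    # Hypothetical override: any code is allowed to rebind a builtin.
    builtins.len = lambda obj: 1
    try:
        return total_length(["ab", "cde"])
    finally:
        builtins.len = original_len  # always restore the real builtin


normal = total_length(["ab", "cde"])   # uses the real len()
patched = run_with_patched_len()       # sees the replaced len()
```
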

I mean, it's well known that you can generate fast code by diverging from Python semantics. Shedskin excels in that. The 2c authors mostly seem to be using the same tricks that Cython uses as well, namely static builtins, as well as optimistic optimisations for likely types based on usage patterns. And a huge amount of special casing, maybe even more than Cython currently applies.

They also seem to make excessive use of CPython internals and internal APIs. Not sure if that's a good idea. Specifically, they didn't care a bit about Python 3 compatibility, it seems.


The project is a single 800KB Python file.

That's just plain sick.


http://code.google.com/p/2c-python/

I was surprised to find that 2c beats Cython in some benchmarks.
For instance, it's about 2 times faster than Cython in the pystone test.

PyStone is known to be a particularly bad benchmark.

The other benchmark results are somewhat surprising and (IMHO) hint mostly at a lack of Python compatibility. Again, it's well known that you can make specific benchmarks fast by diverging from Python semantics in general.

For example, Cython runs richards.py ~70% faster, whereas they claim ~90%. Cython is ~50% faster on slowpickle, they claim ~80%. Not really that much of a difference actually, and easily achieved by tuning the language semantics to the code.


I think we should investigate the performance differences and make Cython faster.

You could start by contacting the authors. From the project site, it appears that it's basically Russian(?)-only.

My guess is that they simply use more special casing and slightly better type inference than Cython currently does. Look at these, for example:

http://code.google.com/p/2c-python/source/browse/2c.py?r=23d5c350a56e21d5a3e12e153d1fbe91ae1f5d56#15583

http://code.google.com/p/2c-python/source/browse/2c.py?r=23d5c350a56e21d5a3e12e153d1fbe91ae1f5d56#15869

They infer some more return types of builtin methods and their compiler knows about some stdlib modules (such as math):

http://code.google.com/p/2c-python/source/browse/2c.py?r=23d5c350a56e21d5a3e12e153d1fbe91ae1f5d56#578

Overriding external modules statically means diverging from Python semantics. Cython would want to require user interaction for this, e.g. an explicit external .pxd file.
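
A small sketch of the semantics at stake, in plain Python (the helper names and the replacement function are hypothetical, chosen only for illustration). The attribute `math.sqrt` is looked up on the module object every time the call runs, so a compiler that statically substitutes the C function would presumably not see the replacement.

```python
import math


def hypotenuse(a, b):
    # `math.sqrt` is resolved by attribute lookup on the module object
    # each time this line runs, not when the function is compiled.
    return math.sqrt(a * a + b * b)


def call_with_replaced_sqrt():
    original_sqrt = math.sqrt
    # Hypothetical replacement, e.g. installed by a test or a tracing tool.
    math.sqrt = lambda x: -1.0
    try:
        return hypotenuse(3, 4)
    finally:
        math.sqrt = original_sqrt  # restore the real function


before = hypotenuse(3, 4)            # real math.sqrt
during = call_with_replaced_sqrt()   # replaced math.sqrt is picked up
after = hypotenuse(3, 4)             # real math.sqrt again
```
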

Basically, I think that Cython could do a lot better with control flow driven type inference. Another thing is that it would be nice to extend the type system so that it knows about data types in Python containers.
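
As a rough idea of what "control flow driven" means here, the following toy sketch infers types over a made-up mini IR. The tuple format, the `infer` function and the `UNBOUND` marker are all assumptions for this example. This is not how Cython's compiler is structured. An assignment replaces what is known about a variable, and at the merge point after an if/else the possible types from both branches are joined.

```python
# Marker for "variable not assigned on this path" (hypothetical).
UNBOUND = frozenset(["<unbound>"])


def infer(block, env):
    """Return a name -> set-of-type-names mapping valid after `block`."""
    env = dict(env)
    for stmt in block:
        if stmt[0] == "assign":
            _, name, type_name = stmt
            # An assignment replaces whatever was known before.
            env[name] = frozenset([type_name])
        elif stmt[0] == "if":
            _, then_block, else_block = stmt
            then_env = infer(then_block, env)
            else_env = infer(else_block, env)
            # After the branches merge, a variable may carry the type
            # from either path, so the sets are joined.
            env = {
                name: then_env.get(name, UNBOUND) | else_env.get(name, UNBOUND)
                for name in then_env.keys() | else_env.keys()
            }
        else:
            raise ValueError("unknown statement kind: %r" % (stmt[0],))
    return env


program = [
    ("assign", "x", "int"),
    ("if",
     [("assign", "x", "float"), ("assign", "y", "int")],  # then branch
     [("assign", "y", "int")]),                           # else branch
    ("assign", "z", "str"),
]

result = infer(program, {})
# Inside the then branch alone, x is known to be exactly a float.
then_only = infer(program[1][1], {"x": frozenset(["int"])})
```

After the merge, `x` can be an int or a float, while `y` is an int on both paths and could be typed as such. Inside the then branch, `x` is known to be a float, which is the kind of information a flow-insensitive analysis loses.
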

What we should definitely do is to use Mark's fused types for optimisations, e.g. when default arguments hint at a specific input type, or even just when we find a function call inside the module with a specific combination of input types.

Also, I would expect that eventually optimising the CyFunction type would give us another bit of performance.

Stefan
_______________________________________________
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel
