Python extension performance

2005-04-08 Thread David Jones
Hi,
I am trying to hunt down the difference in performance between some raw 
C++ code and calling the C++ code from Python.  My goal is to use Python 
to control a bunch of number crunching code, and I need to show that 
this will not incur a (big) performance hit.

This post includes a description of my problem, ideas I have for the 
cause, and some things I plan to try next week.  If anyone knows the 
real cause, or thinks any of my ideas are way off base, I would 
appreciate hearing about it.

My C++ function (testfunction) runs in 2.9 seconds when called from a 
C++ program, but runs in 4.3 seconds when called from Python. 
testfunction calculates its own running time with calls to clock(), and 
this is for only one iteration, so none of the time is in the SWIG code 
or Python.
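
One way to cross-check the clock() figure from the Python side is to time the single call with both a CPU-time clock and a wall-clock timer. This is only a minimal sketch, assuming a modern Python 3 (time.process_time and time.perf_counter do not exist in the Python 2.2 mentioned in this post). busy_work is a hypothetical stand-in for the wrapped testfunction, not the real code.

```python
import time

def time_call(func, *args):
    """Return (result, cpu_seconds, wall_seconds) for a single call of func."""
    cpu_start = time.process_time()    # CPU time used by this process
    wall_start = time.perf_counter()   # elapsed wall-clock time
    result = func(*args)
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return result, cpu, wall

def busy_work(n):
    # Hypothetical stand-in for the SWIG-wrapped testfunction.
    total = 0.0
    for i in range(n):
        total += i * 0.5
    return total

if __name__ == "__main__":
    result, cpu, wall = time_call(busy_work, 200000)
    print("cpu=%.3fs wall=%.3fs" % (cpu, wall))
```

If the CPU and wall figures for the one call roughly agree with what testfunction reports for itself, the extra time is probably inside the C++ code as loaded into Python and not in the wrapper.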

Both the C++ executable and python module were linked from the same 
object files, and linked with the same options.  The only difference is 
that the Python module is linked with -shared, and the C++ code is not.

The computer is an Itanium 2.  The code was compiled with the Intel 
Compiler, and uses the Intel Math Libraries.  Python is version 2.2 
(with little hope of being able to upgrade) from the Red Hat rpm 
install.  When I link the C++ exe, I get some warnings about "log2l not 
implemented" from libimf, but I do not see these when I link the Python .so.

Some potential causes of my problems:
- linking to a shared library instead of building a static exe.
- intel libraries are not being used when I think they are
- libpython.so was built with gcc, so I am getting some link issues
- can linking to python affect my memory allocation and deallocation in 
c++??
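
The second item (whether the Intel libraries are really the ones in use) could be checked from inside the running interpreter. What follows is a rough sketch and not a definitive method: it assumes Linux, where /proc/self/maps lists every file mapped into the process, and the function names are made up for illustration.

```python
import os

def loaded_libraries(maps_text):
    """Return the sorted shared-object paths named in /proc/<pid>/maps text."""
    libs = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        # The sixth field, when present, is the path of the mapped file.
        if len(fields) == 6 and ".so" in os.path.basename(fields[5]):
            libs.add(fields[5].strip())
    return sorted(libs)

def libraries_of_this_process():
    # Linux-only; returns an empty list where /proc/self/maps cannot be read.
    try:
        with open("/proc/self/maps") as handle:
            return loaded_libraries(handle.read())
    except OSError:
        return []

if __name__ == "__main__":
    # Import the extension module first, then look for libimf in the output.
    for path in libraries_of_this_process():
        print(path)
```

Running this after importing the extension and comparing the list with the output of ldd on the C++ executable would show whether both end up with the same math library.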

Some things I can try:
- recompile python with the intel compiler and try again
- compile my extension into a python interpreter, statically
- segregate the memory allocations from the numerical work and see 
how the C++ and Python versions compare

--end brain dump
Dave
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python extension performance

2005-04-09 Thread David Jones
Jack Diederich wrote:
> On Fri, Apr 08, 2005 at 10:14:52PM -0400, David Jones wrote:
> > I am trying to hunt down the difference in performance between some raw 
> > C++ code and calling the C++ code from Python.  My goal is to use Python 
> > to control a bunch of number crunching code, and I need to show that 
> > this will not incur a (big) performance hit.
> >
> > My C++ function (testfunction) runs in 2.9 seconds when called from a 
> > C++ program, but runs in 4.3 seconds when called from Python. 
> > testfunction calculates its own running time with calls to clock(), and 
> > this is for only one iteration, so none of the time is in the SWIG code 
> > or Python.
> >
> > Some potential causes of my problems:
> > - linking to a shared library instead of building a static exe.
> > - intel libraries are not being used when I think they are
> > - libpython.so was built with gcc, so I am getting some link issues
> > - can linking to python affect my memory allocation and deallocation in 
> > c++??
>
> The main overhead of calling C/C++ from python is the function call overhead
> (python creating the stack frame for the call, and then changing the python
> objects into regular ints, char *, etc).  You don't mention how many times
> you are calling the function.  If it is only once and the difference is 1.4
> seconds then something is really, really, messed up.  So I'll guess it is
> hundreds of thousands of times?  Let us know.

Sorry I was not clearer above;  the function is only called one time.  I 
have run out of obvious things I may have screwed up.  The part that 
bugs me most is that these are built from the same .o files except for 
the .o file that has the wrapper function for python.


> > Some things I can try:
> > - recompile python with the intel compiler and try again
> > - compile my extension into a python interpreter, statically
> > - segregate the memory allocations from the numerical work and see 
> > how the C++ and Python versions compare
>
> Recompiling with the Intel compiler might help, I hear it is faster than 
> GCC for all modern x86 platforms.  I think CPython is only tested on GCC
> and windows Visual-C-thingy so you might be SOL.  The other two ideas
> seem much harder to do and less likely to show an improvement.
>
> -jackdied
>
By the second option, I meant to compile my extension statically instead 
of using a shared library by unpacking the source rpm and putting my 
code in the Modules/ directory.  That is a pretty standard thing to do, 
isn't it?

Thanks for the comments.
Dave


Re: 2**2**2**2**2 wrong? Bug?

2007-07-11 Thread David Jones
On Jul 10, 12:47 am, "Jim Langston" <[EMAIL PROTECTED]> wrote:
> "Paul Rubin"  wrote in message
>
> news:[EMAIL PROTECTED]
>
> > "Jim Langston" <[EMAIL PROTECTED]> writes:
> >> In Python 2.5 on intel, the statement
> >> 2**2**2**2**2
> >> evaluates to
> >> >>> 2**2**2**2**2
>
> > I get the same number from hugs--why do you think it might be wrong?
>
> 2**2 = 4
> 4**2 = 16
> 16**2 = 256
> 256**2 = 65536
> 65536**2 = 4294967296
>
> In fact, if I put (2**2)**2**2**2
> it comes up with the correct answer, 4294967296

Actually, the "correct" answer (even by your own demonstration) is
65536. Assuming left-associativity, i.e., (((2**2)**2)**2)**2, python
returns 65536. The answer of 4294967296 is actually
((((2**2)**2)**2)**2)**2, which is one extra raise-to-the-power-of-two
instruction.

The statement (2**2)**2**2**2 is the same as 4**16, following
right-associativity rules, which just happens to be the same as
((((2**2)**2)**2)**2)**2.
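
The groupings discussed here can be checked directly in the interpreter. A short sketch, written for a current Python 3, where ** binds right to left exactly as in 2.5 (the variable names are only for illustration):

```python
# ** groups right to left, so the unparenthesized form is a power tower.
tower = 2**2**2**2**2
assert tower == 2**(2**(2**(2**2)))    # i.e. 2**65536
assert tower.bit_length() == 65537     # far larger than 4294967296

left_four = (((2**2)**2)**2)**2        # four explicit left-to-right squarings
left_five = ((((2**2)**2)**2)**2)**2   # one more squaring
mixed = (2**2)**2**2**2                # 4**(2**(2**2)) == 4**16

print(left_four)                       # 65536
print(left_five)                       # 4294967296
print(mixed == 4**16 == left_five)     # True
```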

David
