Dear Kevin,

On Fri, 13 May 2011 20:20:24 -0400, Kevin Daly <[email protected]> wrote:
> I have been testing the performance of two implementations of the same
> kernel function. One of them launches the kernel from Python using PyCUDA,
> while the other launches it from a c script. It appears that the PyCUDA
> implementation is systematically slower by 20% under a range of different
> conditions.
>
> My kernel takes in an array of length M, performs a calculation N times on
> each element, summing the N results for each element. It then stores the M
> sums in an output array. The 20% difference in speed persists across many
> different values of N, holding M fixed. If the difference merely
> corresponded to a longer initialization time, then I would expect the
> difference to shrink as N increases.
>
> This is how I am launching the kernel using PyCUDA:
>
> cube_file = open(cu_file_path)
> module = pycuda.compiler.SourceModule(cube_file.read(), no_extern_c=True)
> cube_file.close()
> kernel_func = module.get_function("my_kernel")
> kernel_func(drv.In(inp_array), numpy.int32(arg2), numpy.float32(arg3), ...,
> drv.Out(outp_array))
>
>
> This is how I compile the c script implementation:
>
> nvcc -ccbin /usr/bin -I. -I/usr/local/cuda/include -Xptxas -v -arch sm_20 -c
> test_kernel.cu -o test_kernel.cu.o
> g++ -fPIC -o test_kernel test_kernel.cu.o -L/usr/local/cuda/lib64 -lcudart
>
> In both cases I am launching the kernel with the same number of threads per
> block and blocks per grid.
>
> Is this the best way of compiling/launching the kernel from PyCUDA?

How long is the execution time of your kernel? If it is less than 0.1 ms
(measure using, e.g., events), then perhaps you are seeing launch overhead
added by PyCUDA. In that case, I'd recommend switching to the "prepared
invocation" scheme, which is faster, as the documentation recommends.

I would be very surprised if you saw this 20% slowdown independently of
kernel execution length--this would mean that Nvidia's driver API is slower
than the runtime, which I find quite unlikely.

HTH,
Andreas
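
The two suggestions in the reply (timing the kernel with events, and using
prepared invocation) might look roughly like the sketch below. It is not
Kevin's kernel: the `scale` kernel, the helpers `grid_for` and
`time_prepared_launches`, and the sizes are hypothetical stand-ins. The call
signatures assume a PyCUDA release in which `prepare()` takes only the
argument types and `prepared_call()` takes grid and block; older releases, as
far as I know, passed the block size to `prepare()` instead.

```python
# Hypothetical stand-in kernel, not the one from the original question.
KERNEL_SRC = """
__global__ void scale(float *out, const float *inp, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = factor * inp[i];
}
"""


def grid_for(n, block_size):
    """Number of blocks needed to cover n elements (ceiling division)."""
    return (n + block_size - 1) // block_size


def time_prepared_launches(n=1 << 20, block_size=256, repeats=100):
    """Return the mean time per launch in milliseconds, using CUDA events.

    Requires a CUDA device. Imports are local so that this module can still
    be loaded on a machine without PyCUDA or a GPU.
    """
    import numpy as np
    import pycuda.autoinit  # noqa: F401  (creates a context on import)
    import pycuda.driver as drv
    import pycuda.gpuarray as gpuarray
    from pycuda.compiler import SourceModule

    func = SourceModule(KERNEL_SRC).get_function("scale")

    # Declare the argument types once (struct-style codes):
    # two pointers, one float32, one int32.
    func.prepare("PPfi")

    # Keep the data on the device so that no host/device copies are timed.
    inp = gpuarray.to_gpu(np.arange(n, dtype=np.float32))
    out = gpuarray.empty_like(inp)

    grid = (grid_for(n, block_size), 1)
    block = (block_size, 1, 1)

    start, end = drv.Event(), drv.Event()
    start.record()
    for _ in range(repeats):
        func.prepared_call(grid, block, out.gpudata, inp.gpudata, 2.0, n)
    end.record()
    end.synchronize()  # wait until the GPU has reached the end event

    return start.time_till(end) / repeats  # time_till() reports milliseconds
```

On a machine with a CUDA device, `time_prepared_launches()` returns the mean
time per launch in milliseconds. If that figure is well below 0.1 ms, launch
overhead is a plausible contributor to the gap; if the kernel runs much longer
than that, a constant per-launch cost should not account for a steady 20%.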
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
