Dear David,

On Fri, 6 May 2011 08:31:08 -0500, David Mertens <[email protected]> 
wrote:
> I've managed to hack together some CUDA (runtime) bindings for Perl
> that I've been using for my numerical research for a few months.
> However, the interface is far from complete. It basically supports
> CUDA's malloc, memcpy, and a couple of other important functions
> necessary for handling errors. (Because I use the runtime, function
> compilation is handled with Inline::C, or by writing straight XS, in
> case you know what that means.) It does not support any device
> querying stuff and it does not allow for any memory allocations apart
> from those supported by cudaMalloc. While it is far from complete, it
> covers all my needs, and would be useful for somebody who has learned
> from Kirk and Hwu, since they only use cudaMalloc.
> 
> I am familiar with CUDA's runtime API. My bindings work for the
> runtime API and I had intended to continue down this path. However,
> you chose to wrap the driver API.

Wrapping the driver API makes sense for PyCUDA because it allows you to
use the clean driver-level CUDA kernel invocation interface. Calling
through the runtime brings in a mess of machinery to generate host code,
compile it, link it back into the interpreter, and expose it to the
language. It seems that Inline::C takes care of much of this for you, so
it might not be as much of a headache as it would be in Python, where
many wrapper generators are fairly heavyweight.
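To illustrate the contrast: with the driver API, grid and block dimensions and kernel arguments are passed as ordinary function parameters, so no nvcc-generated host stubs or <<<...>>> syntax are needed. Below is a minimal sketch (requires the CUDA toolkit with driver API version 4.0 or later for cuLaunchKernel; the PTX file name and kernel name "scale" are made up for the example):

```cpp
// Sketch: launching a kernel through the CUDA *driver* API.
// Assumes a precompiled PTX module "kernels.ptx" that contains
// a kernel  __global__ void scale(float *x, float factor).
#include <cuda.h>

int main()
{
    cuInit(0);
    CUdevice dev;   cuDeviceGet(&dev, 0);
    CUcontext ctx;  cuCtxCreate(&ctx, 0, dev);

    CUmodule mod;   cuModuleLoad(&mod, "kernels.ptx");
    CUfunction fn;  cuModuleGetFunction(&fn, mod, "scale");

    CUdeviceptr d_x;
    cuMemAlloc(&d_x, 256 * sizeof(float));

    float factor = 2.0f;
    void *args[] = { &d_x, &factor };

    // Launch configuration is plain data -- no host-code generation step.
    cuLaunchKernel(fn,
                   1, 1, 1,     // grid dimensions
                   256, 1, 1,   // block dimensions
                   0, 0,        // shared memory, stream
                   args, 0);
    cuCtxSynchronize();

    cuMemFree(d_x);
    cuCtxDestroy(ctx);
    return 0;
}
```

Everything the runtime would hide behind generated host code is explicit here, which is exactly what makes this interface easy to call from an interpreted language.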

Aside from that, the differences between runtime and driver are pretty
minuscule, although the driver side has a few features that are hard to
get to from the runtime.

> I am curious what other sorts of
> design decisions you have made, and whether they have worked out well
> or not for your implementation. Also, I'm curious if all of your
> bindings were written by-hand or if you found a way to automatically
> process the header files to generate bindings.

Hand-written. The point was to reveal the underlying object-oriented
nature of CUDA, i.e. to change the API slightly while exposing it. To
that end, I first made a (thin) C++ layer that makes the CUDA driver
interface object-oriented, and then exposed that using Boost.Python.
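To make the two-layer approach concrete, here is a hedged sketch of the pattern (the class name device_buffer and module name _driver_sketch are illustrative, not PyCUDA's actual identifiers; building it requires the CUDA toolkit and Boost.Python):

```cpp
// Sketch: a thin RAII C++ wrapper around a driver-API resource,
// then a Boost.Python binding that exposes it to Python.
#include <boost/python.hpp>
#include <cuda.h>
#include <stdexcept>

// Object-oriented view of cuMemAlloc/cuMemFree: allocation lifetime
// is tied to the lifetime of the C++ (and hence Python) object.
class device_buffer
{
    CUdeviceptr m_ptr;
    size_t m_size;

public:
    device_buffer(size_t size)
        : m_size(size)
    {
        if (cuMemAlloc(&m_ptr, size) != CUDA_SUCCESS)
            throw std::runtime_error("cuMemAlloc failed");
    }
    ~device_buffer() { cuMemFree(m_ptr); }

    size_t size() const { return m_size; }
    unsigned long long ptr() const { return m_ptr; }
};

BOOST_PYTHON_MODULE(_driver_sketch)
{
    using namespace boost::python;
    class_<device_buffer, boost::noncopyable>(
            "DeviceBuffer", init<size_t>())
        .def("size", &device_buffer::size)
        .def("ptr", &device_buffer::ptr);
}
```

The C++ layer decides the object model; Boost.Python then only has to mirror it, which keeps the binding code short.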

Btw--if you're starting from scratch, OpenCL is definitely worth a look.

Andreas


_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
