Hi all,

I have a quick question about how to handle different types in kernel calls from PyCUDA...

Currently, I have a Python package that uses PyCUDA to manage GPU arrays and call some custom CUDA kernels. Many of the functions in the package let the user choose whether the output is single- or double-precision. So far, this has required me to write two versions of each CUDA kernel, which pretty much boils down to one using the "float" type and one using the "double" type (and in some cases PyCUDA's double-precision texref support).
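To make the duplication concrete, here is a hypothetical minimal pair of the kind of kernels I mean (the `scale` kernel and its signature are just illustrative, not from my actual package); the two sources are identical except for the value type:

```python
# Two CUDA kernel sources that differ only in the value type.
# (Hypothetical minimal example; the real kernels are longer but
# follow the same pattern.)
KERNEL_FLOAT = """
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
"""

KERNEL_DOUBLE = """
__global__ void scale(double *x, double a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
"""
```

Each source would be compiled separately with `pycuda.compiler.SourceModule`, so every kernel in the package exists twice.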

Anyway, I was wondering if there is a better way to provide this functionality? In normal CUDA code, this could be done with C++ templates, but that doesn't seem to be an option with runtime-compiled kernel sources. I know string-based metaprogramming is a solution, but I'd like to avoid it, as it seems like an unnecessarily large solution to a tiny problem. (And I don't want to require additional dependencies like jinja2, mako, etc...)
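For what it's worth, the closest dependency-free workaround I've considered is substituting the type into a single kernel source with `string.Template` from the Python standard library (so no jinja2/mako); this is a rough sketch with a made-up `scale` kernel, not my actual code:

```python
from string import Template

# One kernel source parameterized on the value type. string.Template
# is in the Python standard library, so no extra dependency is needed.
KERNEL_TEMPLATE = Template("""
__global__ void scale(${real} *x, ${real} a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
""")

def render_kernel(dtype):
    """Specialize the kernel source for "float" or "double"."""
    return KERNEL_TEMPLATE.substitute(real=dtype)

# The rendered source would then be compiled as usual, e.g.:
# mod = pycuda.compiler.SourceModule(render_kernel("double"))
```

But this is still metaprogramming by another name, which is exactly what I was hoping to avoid.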

Am I simply stuck maintaining two kernels that are nearly identical?

Thanks for your help!

Best,

Irwin

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda