Hi all,
I have a quick question about how to handle different types in kernel
calls from PyCUDA...
Currently, I have a Python package that uses PyCUDA to manage GPU arrays
and call some custom CUDA kernels. For many of the functions in the
package, I allow the user to specify whether the output will be
single-precision or double-precision values. So far, this has required
me to write two versions of each CUDA kernel, which pretty much boils
down to one using the "float" type and one using the "double" type
(and sometimes PyCUDA's double texref).
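To make the duplication concrete, a stripped-down sketch of the current setup looks something like this (the "scale" kernel and its contents are just a made-up illustration, not my actual code):

```python
# Illustrative only: two nearly identical kernel sources, one per precision.
KERNEL_FLOAT = """
__global__ void scale(float *x, float a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
"""

KERNEL_DOUBLE = """
__global__ void scale(double *x, double a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
"""

# The right version is picked at compile time from the requested dtype, e.g.:
# from pycuda.compiler import SourceModule
# mod = SourceModule(KERNELS[dtype_name])
# scale = mod.get_function("scale")
KERNELS = {"float32": KERNEL_FLOAT, "float64": KERNEL_DOUBLE}
```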
Anyway, I was wondering whether there is a better way to provide this
functionality. In normal CUDA code, this could be done with templates,
but that doesn't seem to be an option here. I know metaprogramming is a
solution, but I'd like to avoid it, as it seems like an unnecessarily
heavyweight solution to a tiny problem. (And I don't want to require
additional dependencies like jinja2, mako, etc.)
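For reference, the smallest form of the metaprogramming I mean would be plain string substitution with only the standard library, no jinja2 or mako (again, "scale" is just an illustrative stand-in for my real kernels):

```python
from string import Template

# One source, both precisions generated by stdlib substitution (illustrative).
KERNEL_TMPL = Template("""
__global__ void scale(${real} *x, ${real} a, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        x[i] = a * x[i];
}
""")

kernel_float = KERNEL_TMPL.substitute(real="float")
kernel_double = KERNEL_TMPL.substitute(real="double")
```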
Am I simply stuck maintaining two kernels that are nearly identical?
Thanks for your help!
Best,
Irwin
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda