Hi Andreas, (Hi Bryan),

Last December I asked you about CodePy. Here is how far I got with your help: http://dev.math.canterbury.ac.nz/home/pub/17/
Note that there is no CUDA or thrust code in the CodePy example; there seemed to be no easy way to do it. I'll paste some excerpts from our emails of Dec 16 and 17.

I: "My next question, suppose MODULE_CODE contains some thrust code and would have to be compiled through nvcc (and g++). Simply using nvcc_toolchain,

    nvcc_toolchain = codepy.toolchain.guess_nvcc_toolchain()
    cmod = extension_from_string(nvcc_toolchain, "module", MODULE_CODE)

didn't work, of course. Do you have a similar function that takes a STRING, both host_toolchain and nvcc_toolchain, and compiles it? If not, what is the right way?"

B: "NVCC can't parse Boost well, so I have to segregate the host code which binds it to Python from the CUDA code compiled by NVCC. The way I do this is to create a codepy.bpl.BoostPythonModule which has the host entry point (and will be compiled by g++). Then I create a codepy.cuda.CudaModule which references the BoostPythonModule (making this link explicit lets codepy compile them together into a single binary). Then I call compile on the CudaModule, which should do the right thing. You can see code that does this here:
http://code.google.com/r/bryancatanzaro-copperhead/source/browse/copperhead/compiler/binarygenerator.py#84"

A: "I'd just like to add that I recently split out the code generation bits of codepy and called them cgen.
http://pypi.python.org/pypi/cgen
https://github.com/inducer/cgen
https://github.com/inducer/codepy
(but compatibility wrappers that wrap cgen into codepy will stay in place for a while)"

Has something changed since then?

ctypes works fine, and it has the advantage of not requiring Boost. The source is just unaltered C++/CUDA/thrust code. Invoking the system's nvcc was as easy as invoking gcc. As for the caching, I check the hash of the source string: if it has changed, I build and load a new (small!) .so module with the hash value attached to the name.
The pointers into the old .so get garbage collected and the old module is unloaded; if the .so files are stored in a tmp folder, they get deleted eventually.

I remember that you generally preferred Boost.Python to ctypes for its better performance, but if we call into the ctypes-loaded library only rarely, any small additional overhead doesn't matter.

A better programme would be to port all the algorithms and interfaces of Thrust to PyCUDA. The only reason I need Thrust, for example, is that it can find the _location_ of the extremum element, which I still don't know how to do in PyCUDA.

Cheers,
Igor

On Thu, May 24, 2012 at 11:58 AM, Andreas Kloeckner <[email protected]> wrote:
> Hi Igor,
>
> On Thu, 24 May 2012 10:51:55 +1200, Igor <[email protected]> wrote:
>> Andreas, thanks, but it currently implies Linux, I'll see if I can
>> make it work on Windows. Or maybe I'll submit and someone will try it
>> on Windows. I just need to extract it from Sage into a plain Python
>> script. Give me a couple of days.
>> http://dev.math.canterbury.ac.nz/home/pub/14/
>> http://dev.math.canterbury.ac.nz/home/pub/19/
>
> I would actually suggest you use the codepy machinery to let nvcc do the
> compilation--this has the advantage that a) there is code out there that
> makes this work on Windows (Bryan?) and b) you get compiler caching for
> free.
>
> All you'd need to do is build an analog of extension_from_string, say
> ctypes_dll_from_string. Just imitate this code here, where
> compile_from_string does all the hard work:
>
> https://github.com/inducer/codepy/blob/master/codepy/jit.py#L146
>
> In any case, even if you can make something that's Linux-only, it would
> likely help a big bunch of people. Windows support can always be added
> later.
>
> Andreas
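
For reference, the hash-keyed ctypes approach described in the message (hash the source string, build a small .so named after the hash, load it with ctypes) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code from the linked worksheets: the helper names `library_path`, `build_library` and `load_library`, the `argmax_float` wrapper inside `MODULE_CODE`, and the choice of SHA-1 are all hypothetical, and the nvcc flags assume a Linux host with gcc as the host compiler.

```python
import ctypes
import hashlib
import os
import subprocess
import tempfile

# Hypothetical example source: a C-callable wrapper around
# thrust::max_element that returns the index of the largest element.
MODULE_CODE = r"""
#include <thrust/device_ptr.h>
#include <thrust/extrema.h>

extern "C" int argmax_float(const float *d_data, int n)
{
    thrust::device_ptr<const float> first(d_data);
    return static_cast<int>(thrust::max_element(first, first + n) - first);
}
"""


def library_path(source, cache_dir):
    """Return the .so path for this source; the file name carries the hash."""
    digest = hashlib.sha1(source.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, "module_%s.so" % digest)


def build_library(source, cache_dir, compiler="nvcc"):
    """Compile source into a shared library unless a cached build exists."""
    so_path = library_path(source, cache_dir)
    if os.path.exists(so_path):
        return so_path  # same hash means unchanged source: reuse the build
    cu_path = so_path[:-3] + ".cu"
    with open(cu_path, "w") as f:
        f.write(source)
    # Assumed flags for Linux: build a shared object, pass -fPIC to gcc.
    subprocess.check_call(
        [compiler, "--shared", "-Xcompiler", "-fPIC", "-o", so_path, cu_path])
    return so_path


def load_library(source, cache_dir=None, compiler="nvcc"):
    """Build (if needed) and load the library through ctypes."""
    cache_dir = cache_dir or tempfile.gettempdir()
    return ctypes.CDLL(build_library(source, cache_dir, compiler))
```

With a working nvcc, one would call `lib = load_library(MODULE_CODE)`, set `lib.argmax_float.argtypes = [ctypes.c_void_p, ctypes.c_int]`, and pass the device pointer of a float32 PyCUDA array as an integer (for a `GPUArray` named `a_gpu`, `int(a_gpu.gpudata)` together with `a_gpu.size`). Because the hash is part of the file name, editing the source string produces a new .so instead of overwriting one that is still loaded.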
