Nope, nothing has changed since then. If ctypes works for you, that's great. However, I don't think using codepy is that difficult: if you look at the example I posted, you'll see it's not much work to compile things with nvcc using codepy.

http://wiki.tiker.net/PyCuda/Examples/ThrustInterop

Perhaps the Boost dependency scares people off, but if you're using PyCUDA, you are already using Boost. ;)

It might be useful to include more of Thrust as a precompiled binary, but in general it's not possible to compile all of Thrust, since it's a template library. Even when you restrict yourself to a basic set of fundamental types, if you allow tuples, the combinatorics are prohibitive.

- bryan

On Thu, May 24, 2012 at 3:46 AM, Igor <[email protected]> wrote:
> Hi Andreas, (Hi Bryan),
>
> Last December I was asking you about CodePy. See how far I went with
> it with your help: http://dev.math.canterbury.ac.nz/home/pub/17/
>
> Note, there is no CUDA or thrust code in the CodePy example. There
> seemed to be no easy way to do it. I'll paste some excerpts from our
> emails from Dec 16,17,
>
> I: "My next question, suppose MODULE_CODE contains some thrust code and
> would have to be compiled through nvcc (and g++). Simply using
> nvcc_toolchain,
>
> nvcc_toolchain = codepy.toolchain.guess_nvcc_toolchain()
> cmod = extension_from_string(nvcc_toolchain, "module", MODULE_CODE)
>
> Didn't work of course. Do you have a similar function that takes a
> STRING, both host_toolchain, and nvcc_toolchain, and compiles it? If
> not, what is the right way?"
>
> B: "NVCC can't parse Boost well, so I have to segregate the host code
> which binds it to Python from the CUDA code compiled by NVCC.
> The way I do this is to create a codepy.bpl.BoostPythonModule which
> has the host entry point (and will be compiled by g++). Then I create
> a codepy.cuda.CudaModule which references the BoostPythonModule
> (making this link explicit lets codepy compile them together into a
> single binary). Then I call compile on the CudaModule, which should
> do the right thing. You can see code that does this here:
> http://code.google.com/r/bryancatanzaro-copperhead/source/browse/copperhead/compiler/binarygenerator.py#84"
>
> A: "I'd just like to add that I recently split out the code generation bits
> of codepy and called them cgen.
>
> http://pypi.python.org/pypi/cgen
> https://github.com/inducer/cgen
> https://github.com/inducer/codepy
>
> (but compatibility wrappers that wrap cgen into codepy will stay in
> place for a while)"
>
> Has something changed since then?
>
> ctypes works fine and it has the advantage of not having to use boost.
> It's just unaltered C++/CUDA/thrust code. Invoking the system's nvcc
> was as easy as gcc. As for the caching, I check the hash of the source
> string: if it has changed, I build and load a new (small!) .so module
> with the hash value attached to the name. The pointers into the old
> .so get garbage collected and unloaded; if they are stored in a tmp
> folder -- the .so files get deleted eventually.
>
> I remember you preferred Boost::Python to ctypes in general for its
> better performance; but if we make calls to the ctypes library rarely,
> small additional overheads, if there were some, aren't important.
>
> A better programme would be to port all the algorithms and interfaces
> of Thrust to PyCUDA. The only reason I need thrust, for example, is
> that it can find me the extremum element's _location_ which I still
> don't know how to do in PyCUDA.
>
> Cheers,
> Igor
>
> On Thu, May 24, 2012 at 11:58 AM, Andreas Kloeckner
> <[email protected]> wrote:
>> Hi Igor,
>>
>> On Thu, 24 May 2012 10:51:55 +1200, Igor <[email protected]> wrote:
>>> Andreas, thanks, but it currently implies Linux, I'll see if I can
>>> make it work on Windows. Or maybe I'll submit and someone will try it
>>> on Windows. I just need to extract it from Sage into a plain Python
>>> script. Give me a couple of days.
>>> http://dev.math.canterbury.ac.nz/home/pub/14/
>>> http://dev.math.canterbury.ac.nz/home/pub/19/
>>
>> I would actually suggest you use the codepy machinery to let nvcc do the
>> compilation--this has the advantage that a) there is code out there that
>> makes this work on Windows (Bryan?) and b) you get compiler caching for
>> free.
>>
>> All you'd need to do is build an analog of extension_from_string, say
>> ctypes_dll_from_string. Just imitate this code here, where
>> compile_from_string does all the hard work:
>>
>> https://github.com/inducer/codepy/blob/master/codepy/jit.py#L146
>>
>> In any case, even if you can make something that's Linux-only, it would
>> likely help a big bunch of people. Windows support can always be added
>> later.
>>
>> Andreas
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://lists.tiker.net/listinfo/pycuda
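
For readers following the thread, the caching scheme Igor describes in the quoted message (hash the source string, rebuild only when it changes, attach the hash to the .so name) could look roughly like the sketch below. This is only an illustration under stated assumptions, not code from anyone in the thread: the name ctypes_dll_from_string is borrowed from Andreas's suggestion and is not an existing codepy function, the helpers cached_library_path and nvcc_build are hypothetical, and the sketch shells out to nvcc directly instead of going through codepy's compile_from_string.

```python
import ctypes
import hashlib
import os
import subprocess
import tempfile


def cached_library_path(source, cache_dir, prefix="module"):
    """Hypothetical helper: .so path whose name embeds a hash of the source."""
    digest = hashlib.sha256(source.encode("utf-8")).hexdigest()[:16]
    return os.path.join(cache_dir, "%s_%s.so" % (prefix, digest))


def nvcc_build(source_path, library_path):
    """Assumed build step: call the system's nvcc to make a shared library."""
    subprocess.check_call(
        ["nvcc", "-shared", "-Xcompiler", "-fPIC",
         "-o", library_path, source_path])


def ctypes_dll_from_string(source, cache_dir=None,
                           build=nvcc_build, load=ctypes.CDLL):
    """Build (only if the source hash is new) and load a shared library.

    The build and load steps are parameters so that another compiler,
    or a stub, can be substituted.
    """
    if cache_dir is None:
        # Assumption: a folder under the system temp directory is acceptable.
        cache_dir = os.path.join(tempfile.gettempdir(), "ctypes_nvcc_cache")
    os.makedirs(cache_dir, exist_ok=True)

    library_path = cached_library_path(source, cache_dir)
    if not os.path.exists(library_path):
        # New or changed source: write it next to the library and compile.
        source_path = library_path[:-len(".so")] + ".cu"
        with open(source_path, "w") as f:
            f.write(source)
        build(source_path, library_path)
    # Unchanged source: the existing .so is reused without recompiling.
    return load(library_path)
```

With this shape, calling ctypes_dll_from_string(MODULE_CODE) twice with the same string compiles once. Functions meant to be called through ctypes need to be declared extern "C" in the source so their names are not mangled.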
