On Fri, 10 Jun 2011 13:38:09 -0600, Ryan Marcus <[email protected]> wrote:
> Hello!
>
> I'm a bit of a pycuda newb, and today I dove into the source for the first
> time.
>
> Essentially, I was trying to implement a numpy-style argmax argument
> using a custom reduction kernel:
> http://docs.scipy.org/doc/numpy/reference/generated/numpy.argmax.html
>
> I thought this would be a pretty simple procedure, and the code I tried was:
>
> maxloc_reduction_k = ReductionKernel(
>     numpy.int32,
>     arguments="float *x",
>     neutral="0",
>     map_expr="i",
>     reduce_expr="(x[(int)a] > x[(int)b]) ? (int)a : (int)b")
>
> What I discovered was that pycuda really didn't appreciate me using
> the specified arguments (in this case, "float *x") in the reduce_expr
> part.
>
> Looking at reduction.py (in the latest release code, 0.94.2) , I
> couldn't really see a reason for why this shouldn't be possible. I
> added my own parameter to the reduction kernel's constructor ("hack")
> to allow my original parameters to be passed to both _stage1 and
> _stage2. This allowed me to do this:
>
> maxloc_reduction_k = ReductionKernel(
>     numpy.int32,
>     hack=True,
>     arguments="float *x",
>     neutral="0",
>     map_expr="i",
>     reduce_expr="(x[(int)a] > x[(int)b]) ? (int)a : (int)b")
>
> This way, my hacked changes would only affect some reduction kernels.
>
> I'm not totally clear on how to create a patch. Hopefully I didn't
> screw it up. I don't really know if I'm allowed to send attachments to
> a mailing group... so I used pastebin. Here's my changes:
> http://pastebin.com/R5i5JveM
>
> Just in case I messed up creating a patch, here's my full modified
> version of reduction.py: http://pastebin.com/WExgBTQ9
>
> Is there a better way to be doing this?

Now there is--all arguments are now also available in stage 2 by
default. Sorry this took so long.

> I considered using the current dev-build of pycuda to access float2,
> and store my array value in x and my index in y, but the reduction
> code hasn't been updated to properly handle float2's (you can't just
> assert a float to another volatile type -- you have to copy the .x and
> the .y).

There's now also this:

http://documen.tician.de/pycuda/array.html#pycuda.scan.pycuda.tools.register_dtype

which makes this here possible:

https://github.com/inducer/pycuda/blob/master/test/test_gpuarray.py#L764

Andreas
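
For readers following the thread: the "value in x, index in y" idea from the quoted text can be sketched on the host in plain Python. This is only an illustration of the pairing technique, not PyCUDA code and not the contents of the linked test; the names argmax_by_pairs and combine are made up for this example. The point it shows is that once each element carries its own index, the reduce step no longer needs to look back into the input array.

```python
from functools import reduce


def argmax_by_pairs(values):
    """Host-side emulation of an argmax reduction over (value, index) pairs.

    Hypothetical helper for illustration only; it is not part of PyCUDA.
    """
    if not values:
        raise ValueError("argmax of an empty sequence")

    # "map" step: attach its own index to every element.
    pairs = [(v, i) for i, v in enumerate(values)]

    # "reduce" step: keep the pair with the larger value. On ties keep the
    # lower index, so the result does not depend on the order in which
    # pairs are combined (a parallel reduction gives no ordering guarantee).
    def combine(a, b):
        if a[0] > b[0]:
            return a
        if b[0] > a[0]:
            return b
        return a if a[1] < b[1] else b

    return reduce(combine, pairs)[1]
```

Because combine is both associative and commutative, the same rule should be safe to express as a reduce_expr over a two-field struct type, assuming such a type has been registered as in the links above.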
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
