On Fri, 2011-05-06 at 08:30 -0400, Andreas Kloeckner wrote:
> On Fri, 06 May 2011 11:00:08 +0200, Tomasz Rybak wrote:
> > On Sun, 2011-04-24 at 17:50 +0200, Tomasz Rybak wrote:
> > > On Sun, 2011-04-24 at 01:22 -0400, Andreas Kloeckner wrote:
> > > > On Sun, 17 Apr 2011 17:10:30 +0200, Tomasz Rybak wrote:
> > > > > On 2011-04-16 00:48, Andreas Kloeckner wrote:
> > > > >
> > > > > >Hi Tomasz,
> > > > > >
> > > > > >any progress on the things below? Has maybe another email gone
> > > > > >missing?
> > > > > >I'd like to release PyCUDA 2011.1 soon.
> > > > >
> > > > > I have sent the patch to the mailing list on 2011-03-18 and
> > > > > 2011-03-24.
> > > > > Please check the archives.
> > > >
> > > > Weird. I'm really not sure what happened there--I could've sworn I never
> > > > saw these emails.
> > > >
> > > > In any case, I've merged your patch into the
> > > > curand-wrapper-v2-from-tomasz branch, which I've also brought up to
> > > > current master.
> > >
> > > Thanks!
> > >
> > > > Looks like we're almost done here--only initialization is missing.
> > >
> > > I hope to have it done by the end of the next week.
> >
> > I am having trouble transferring data between C and Python.
> > I want to:
> > 1. call curandGetDirectionVectors32, which returns a pointer
> >    to the 32 int32 values
> > 2. transfer this data to the device memory
> > 3. create a GPUArray from it
> > 4. call the prepare() kernel, which will pass the appropriate
> >    direction vectors to the curand_init() kernel
> >
> > I have tried two approaches (curand-hostptr.diff
> > and curand-memcpy.diff). The former, when compiled,
> > causes the _curand module to misbehave - e.g. get_curand_version()
> > returns None instead of (3, 2, 0) or (0, 0, 0).
> > The latter compiles, but the cuMemcpy call fails because of
> > a parameter type mismatch.
> >
> > Can someone give me some direction on how to transfer a raw
> > pointer from C to Python?
>
> What's wrong with creating a numpy array for the data?

Thanks for the tip. Here is the patch. It works, and generates
quasi-random numbers.

The only problem is with calling curandGetDirectionVectors32: it
provides only 20000 vectors, so generators will be repeated on
GPUs with more than 20000 cores ;-)

BTW, there is a difference in wrap_cudadrv.cpp between master and
the curand branch. I had to fix it but did not include this change
in the patch.

Please apply this patch. I believe we can then think about
merging the curand branch into master.

Best regards.

--
Tomasz Rybak
GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak

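For readers following the thread, the NumPy approach suggested in the quoted reply amounts to letting Python own the memory and having the C side copy into it. The sketch below imitates that with `ctypes` only: `c_table` is a hypothetical stand-in for a table that a C library would hand back by pointer, and no CUDA or PyCUDA call is involved.

```python
import ctypes

import numpy as np

VECTOR_LEN = 32  # one Sobol32 direction vector is 32 32-bit words

# Hypothetical stand-in for memory owned by a C library and exposed as a raw pointer.
c_table = (ctypes.c_uint32 * VECTOR_LEN)(*range(VECTOR_LEN))
raw_ptr = ctypes.cast(c_table, ctypes.POINTER(ctypes.c_uint32))

# Python allocates the destination; the "C side" only copies bytes into it.
host = np.empty(VECTOR_LEN, dtype=np.uint32)
ctypes.memmove(host.ctypes.data, raw_ptr, host.nbytes)
```

Once the data sits in a NumPy array, `pycuda.gpuarray.to_gpu(host)` can upload it, which is the step `generate_direction_vectors` performs in the patch.
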
diff --git a/doc/source/array.rst b/doc/source/array.rst
index 4dabe59..431aa37 100644
--- a/doc/source/array.rst
+++ b/doc/source/array.rst
@@ -399,6 +399,21 @@ algorithm designed to fill an n-dimensional space evenly.
Quasirandom numbers are more expensive to generate.
+.. function:: get_curand_version()
+
+ Obtain the version of CURAND against which PyCUDA was compiled. Returns a
+ 3-tuple of integers as *(major, minor, revision)*.
+
+.. function:: seed_getter_uniform(N)
+
+ Return a :class:`GPUArray` filled with one random `int32` repeated `N`
+ times, which can be used as a seed for the XORWOW generator.
+
+.. function:: seed_getter_unique(N)
+
+ Return a :class:`GPUArray` filled with `N` random `int32` values, which can
+ be used as seeds for the XORWOW generator.
+
.. class:: XORWOWRandomNumberGenerator(seed_getter=None, offset=0)
:arg seed_getter: a function that, given an integer count, will yield an
@@ -432,7 +447,18 @@ Quasirandom numbers are more expensive to generate.
Accepts array i of integer values, telling each generator how many
values to skip.
-.. class:: Sobol32RandomNumberGenerator(vector, offset)
+.. function:: generate_direction_vectors(count, direction=direction_vector_set.VECTOR_32)
+
+ Return a :class:`GPUArray` filled with `count` direction vectors
+ used to initialize Sobol32 generators.
+
+.. class:: Sobol32RandomNumberGenerator(dir_vector=None, offset=0)
+
+ :arg dir_vector: a :class:`GPUArray` of 32-element `int32` vectors which
+ are used to initialize the quasirandom generators; it must contain one
+ vector for each initialized generator
+ :arg offset: starting index into the Sobol32 sequence for the given
+ direction vector.
Provides quasirandom numbers. Generates
sequences with period of :math:`2^32`.
diff --git a/pycuda/curandom.py b/pycuda/curandom.py
index 73ca18d..5982b33 100644
--- a/pycuda/curandom.py
+++ b/pycuda/curandom.py
@@ -255,7 +255,7 @@ else:
if get_curand_version() >= (3, 2, 0):
direction_vector_set = _curand.direction_vector_set
- get_direction_vectors32 = _curand.get_direction_vectors32
+ _get_direction_vectors = _curand._get_direction_vectors
# {{{ Base class
@@ -526,9 +526,14 @@ if get_curand_version() >= (3, 2, 0):
# {{{ Sobol32 RNG
+def generate_direction_vectors(count, direction=direction_vector_set.VECTOR_32):
+ result = np.empty((count, 32), dtype=np.int32)
+ _get_direction_vectors(direction, result, count)
+ return pycuda.gpuarray.to_gpu(result)
+
sobol32_random_source = """
extern "C" {
-__global__ void prepare(curandStateSobol32 *s, const int n, unsigned int **v,
+__global__ void prepare(curandStateSobol32 *s, const int n, curandDirectionVectors32_t *v,
const unsigned int o)
{
const int id = blockIdx.x*blockDim.x+threadIdx.x;
@@ -578,10 +583,10 @@ if get_curand_version() >= (3, 2, 0):
dev = drv.Context.get_device()
if dev.compute_capability() >= (2, 0):
p.prepared_call((self.block_count, 1), self.state,
- self.block_count * self.generators_per_block, vector, offset)
+ self.block_count * self.generators_per_block, dir_vector.gpudata, offset)
else:
p.prepared_call((2 * self.block_count, 1), self.state,
- self.block_count * self.generators_per_block // 2, vector, offset)
+ self.block_count * self.generators_per_block // 2, dir_vector.gpudata, offset)
except drv.LaunchError:
raise ValueError("Initialisation failed. Decrease number of threads.")
diff --git a/src/cpp/curand.hpp b/src/cpp/curand.hpp
index e025b0c..9783239 100644
--- a/src/cpp/curand.hpp
+++ b/src/cpp/curand.hpp
@@ -3,7 +3,28 @@
#if CUDAPP_CUDA_VERSION >= 3020
-#include <curand.h>
+ #include <curand.h>
+
+ #ifdef CUDAPP_TRACE_CUDA
+ #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) \
+ if (CODE != CURAND_STATUS_SUCCESS) \
+ std::cerr << NAME << " failed with code " << CODE << std::endl;
+ #else
+ #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) /*nothing*/
+ #endif
+
+ #define CURAND_CALL_GUARDED(NAME, ARGLIST) \
+ { \
+ CUDAPP_PRINT_CALL_TRACE(#NAME); \
+ curandStatus_t cu_status_code; \
+ cu_status_code = NAME ARGLIST; \
+ CURAND_PRINT_ERROR_TRACE(#NAME, cu_status_code); \
+ if (cu_status_code != CURAND_STATUS_SUCCESS) \
+ throw pycuda::error(#NAME, CUDA_SUCCESS);\
+ }
+#else
+ #define CURAND_PRINT_ERROR_TRACE(NAME, CODE) /*nothing*/
+ #define CURAND_CALL_GUARDED(NAME, ARGLIST) /*nothing*/
#endif
@@ -22,10 +43,26 @@ namespace pycuda { namespace curandom {
}
#if CUDAPP_CUDA_VERSION >= 3020
- void py_curand_get_direction_vectors32(curandDirectionVectors32_t *vectors[],
- curandDirectionVectorSet_t set)
-// TODO: checking; cannot use CUDAPP_CALL_GUARDED because function returns CURAND enum
- { curandGetDirectionVectors32(vectors, set); }
+ void py_curand_get_direction_vectors(
+ curandDirectionVectorSet_t set, py::object dst, int count)
+ {
+ void *buf;
+ PYCUDA_BUFFER_SIZE_T len;
+ int n = 0;
+
+ if (PyObject_AsWriteBuffer(dst.ptr(), &buf, &len))
+ throw py::error_already_set();
+ if (CURAND_DIRECTION_VECTORS_32_JOEKUO6 == set) {
+ curandDirectionVectors32_t *vectors;
+ CURAND_CALL_GUARDED(curandGetDirectionVectors32, (&vectors, set));
+ while (count > 0) {
+ int size = ((count > 20000) ? 20000 : count)*sizeof(curandDirectionVectors32_t);
+ memcpy((int *)buf+n*20000*sizeof(curandDirectionVectors32_t)/sizeof(unsigned int), vectors, size);
+ count -= size/sizeof(curandDirectionVectors32_t);
+ n++;
+ }
+ }
+ }
#endif
} }
diff --git a/src/wrapper/wrap_curand.cpp b/src/wrapper/wrap_curand.cpp
index 71adf57..2c1f2e3 100644
--- a/src/wrapper/wrap_curand.cpp
+++ b/src/wrapper/wrap_curand.cpp
@@ -25,7 +25,8 @@ void pycuda_expose_curand()
py::def("get_curand_version", py_curand_version);
#if CUDAPP_CUDA_VERSION >= 3020
- py::def("get_direction_vectors32", py_curand_get_direction_vectors32, (arg("vectors"), arg("set")));
+ py::def("_get_direction_vectors", py_curand_get_direction_vectors,
+ (arg("set"), arg("dst"), arg("count")));
#endif
}
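
The copy loop in `py_curand_get_direction_vectors` reuses the 20000-entry table whenever more vectors are requested, which is why generators repeat past that count. Below is a NumPy-only sketch of that wrap-around, assuming a made-up table in place of the real CURAND data; the helper name `fill_direction_vectors` is hypothetical and not part of the patch.

```python
import numpy as np

TABLE_SIZE = 20000  # number of vectors available, per the message above
VECTOR_LEN = 32     # 32-bit words per Sobol32 direction vector


def fill_direction_vectors(table, count):
    """Copy `count` vectors from `table`, starting over when the table runs out."""
    out = np.empty((count, table.shape[1]), dtype=table.dtype)
    done = 0
    while done < count:
        chunk = min(len(table), count - done)  # at most one full table per pass
        out[done:done + chunk] = table[:chunk]
        done += chunk
    return out


# Made-up table contents; the real values come from curandGetDirectionVectors32.
table = np.arange(TABLE_SIZE * VECTOR_LEN, dtype=np.uint32).reshape(TABLE_SIZE, VECTOR_LEN)
vectors = fill_direction_vectors(table, 45000)
```

With 45000 requested vectors, entry 20000 is a copy of entry 0, so the corresponding generators would start from identical direction vectors.
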
_______________________________________________
PyCUDA mailing list
http://lists.tiker.net/listinfo/pycuda
