hi everyone.

Assume the following standard task:

gpu_foo = gpuarray.to_gpu(foo)
gpu_function(gpu_foo, block=...)

Now I measure the time the interpreter needs for each line. Would this
reflect the true time that PyCUDA needs to perform these tasks?

I'm currently working on a problem where these two lines are looped.
During the first iteration both lines take approximately a millisecond,
but every further iteration increases the runtime of the first line,
ending up at over 2 seconds per evaluation.

I thought this might somehow be connected with the cleanup process of
the gpuarrays. By adding

del gpu_foo

to the loop body, the runtime of the first two lines stays low, but the
del instruction takes ages.

On the other hand, I thought maybe these two seconds are simply the time
the GPU needs to perform the calculation, and some asynchrony leads
to the extended runtime of the subsequent instruction.

greetings
m

-- 

Martin Hammerschmied   |  student of Telematics
[email protected]  |  Graz University Of Technology

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
