Hi everyone. Assume the following standard task:
    gpu_foo = gpuarray.to_gpu(foo)
    gpu_function(gpu_foo, block=...)

Now I measure the time the interpreter needs for each line. Would this reflect the true time that PyCUDA needs to perform these tasks?

I'm currently working on a problem where these two lines are looped. During the first iteration both lines take approximately a millisecond, but every further iteration increases the runtime of the first line, ending up at over 2 seconds per evaluation.

I thought this might somehow be connected with the cleanup of the gpuarrays. By adding del gpu_foo to the loop body, the runtime of the first two lines stays low, but the del instruction takes ages. On the other hand, I thought maybe these two seconds are just the time the GPU needs to perform the calculation, and some asynchronism shifts the measured runtime onto the following instruction.

Greetings,
m

--
Martin Hammerschmied | student of Telematics
[email protected] | Graz University of Technology

_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
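The asynchronism hypothesis at the end of the mail can be illustrated without a GPU. The sketch below is not PyCUDA code but a pure-Python analogy (the FakeStream class, the sleep durations, and all names are invented for illustration): kernel launches return immediately and only queue work, while a later blocking call, such as a host-to-device copy, pays for everything queued so far. In real PyCUDA one would call pycuda.driver.Context.synchronize() before stopping the timer to attribute the time to the right line.

```python
import time

class FakeStream:
    """Toy stand-in for a CUDA stream: launches return immediately,
    and the queued work is only 'performed' when something forces a sync."""
    def __init__(self):
        self.pending = 0.0  # seconds of queued "GPU" work

    def launch(self, seconds):
        # Asynchronous: returns immediately, work is merely queued.
        self.pending += seconds

    def synchronize(self):
        # Blocking: pays for all queued work, like an implicit sync
        # on a host<->device copy.
        time.sleep(self.pending)
        self.pending = 0.0

stream = FakeStream()

def to_gpu():
    # Analogue of gpuarray.to_gpu(foo): blocks until earlier
    # "kernels" finish, so their cost gets billed to *this* line.
    stream.synchronize()

def timed(fn):
    t0 = time.perf_counter()
    fn()
    return time.perf_counter() - t0

copy_times, launch_times = [], []
for _ in range(3):
    copy_times.append(timed(to_gpu))                          # looks slow
    launch_times.append(timed(lambda: stream.launch(0.05)))   # looks instant

# The 50 ms of "kernel" work shows up in the copy's timing, not the launch's.
print(max(launch_times) < 0.01, copy_times[1] > 0.04)
```

If this is what happens in the loop above, the first line is not slow by itself: it is simply where the previous iteration's kernel time becomes visible.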
