Hello.

Recently I have been playing with the profiler and logger in CUDA. A detailed description is at http://wiki.tiker.net/ToolCheatSheet and http://wiki.tiker.net/ToolCheatSheet?action=AttachFile&do=view&target=compute-profiler-manual.txt

Basically, I set the environment variable COMPUTE_PROFILE to 1 and ran PyCUDA programs. I observed that the logger does not write every function call to the log file. I also observed that test cases (functions decorated with pycuda.tools.mark_cuda_test) generated full logs. The only difference I found was the call to context.detach() in mark_cuda_test. I then experimented a bit and saw that, indeed, when I did not use pycuda.autoinit but instead created the context manually and then popped _and detached_ it, the full log was generated.

I am attaching a patch that adds context.detach to the functions called at program exit in pycuda.autoinit. I have tested PyCUDA with this patch, and all programs from test/* run without problems.

I am also attaching two logs from examples/demo.py. One is the result of using autoinit with detach, the other without. As you can see, the latter misses some of the calls, such as (2*gpuarray).get() (the axpb kernel).

So Andreas, please apply this patch before finalising 2011.1.

Best regards.

--
Tomasz Rybak <[email protected]>
GPG/PGP key ID: 2AD5 9860
Fingerprint A481 824E 7DD3 9C0E C40A 488E C654 FB33 2AD5 9860
http://member.acm.org/~tomaszrybak

# CUDA_PROFILE_LOG_VERSION 2.0
# CUDA_DEVICE 0 GeForce GTX 460
# TIMESTAMPFACTOR fffff703cfe75b50
method,gputime,cputime,occupancy
method=[ memcpyHtoD ] gputime=[ 0.992 ] cputime=[ 29.000 ]
method=[ doublify ] gputime=[ 2.976 ] cputime=[ 38.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.248 ] cputime=[ 22.000 ]
method=[ memcpyHtoD ] gputime=[ 0.864 ] cputime=[ 8.000 ]
method=[ doublify ] gputime=[ 1.920 ] cputime=[ 12.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.056 ] cputime=[ 22.000 ]
method=[ memcpyHtoD ] gputime=[ 0.832 ] cputime=[ 3.000 ]
method=[ axpb ] gputime=[ 2.592 ] cputime=[ 19.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.280 ] cputime=[ 17.000 ]
method=[ memcpyDtoH ] gputime=[ 1.056 ] cputime=[ 14.000 ]

# CUDA_PROFILE_LOG_VERSION 2.0
# CUDA_DEVICE 0 GeForce GTX 460
# TIMESTAMPFACTOR fffff703d17a0930
method,gputime,cputime,occupancy
method=[ memcpyHtoD ] gputime=[ 1.024 ] cputime=[ 28.000 ]
method=[ doublify ] gputime=[ 3.008 ] cputime=[ 51.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.248 ] cputime=[ 25.000 ]
method=[ memcpyHtoD ] gputime=[ 0.832 ] cputime=[ 11.000 ]
method=[ doublify ] gputime=[ 1.920 ] cputime=[ 17.000 ] occupancy=[ 0.021 ]

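To make the difference between the two logs explicit, the records can be counted per method and subtracted. The following is only a minimal sketch: the helper names count_methods and missing_calls are hypothetical, and the assignment of the longer log to "with detach" follows the description in the message (the log without detach is the one missing calls).

```python
import re
from collections import Counter

# Matches the method name in records such as
# "method=[ axpb ] gputime=[ 2.592 ] cputime=[ 19.000 ] occupancy=[ 0.021 ]".
# The header line "method,gputime,cputime,occupancy" does not match.
METHOD_RE = re.compile(r"method=\[\s*(\S+)\s*\]")


def count_methods(log_text):
    """Count how many records each method has in a profiler log (hypothetical helper)."""
    return Counter(METHOD_RE.findall(log_text))


def missing_calls(full_log, partial_log):
    """Return the records present in full_log but absent from partial_log."""
    # Counter subtraction keeps only methods with a positive difference.
    return count_methods(full_log) - count_methods(partial_log)


# Records copied from the two attached logs (comment header lines omitted).
WITH_DETACH = """\
method=[ memcpyHtoD ] gputime=[ 0.992 ] cputime=[ 29.000 ]
method=[ doublify ] gputime=[ 2.976 ] cputime=[ 38.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.248 ] cputime=[ 22.000 ]
method=[ memcpyHtoD ] gputime=[ 0.864 ] cputime=[ 8.000 ]
method=[ doublify ] gputime=[ 1.920 ] cputime=[ 12.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.056 ] cputime=[ 22.000 ]
method=[ memcpyHtoD ] gputime=[ 0.832 ] cputime=[ 3.000 ]
method=[ axpb ] gputime=[ 2.592 ] cputime=[ 19.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.280 ] cputime=[ 17.000 ]
method=[ memcpyDtoH ] gputime=[ 1.056 ] cputime=[ 14.000 ]
"""

WITHOUT_DETACH = """\
method=[ memcpyHtoD ] gputime=[ 1.024 ] cputime=[ 28.000 ]
method=[ doublify ] gputime=[ 3.008 ] cputime=[ 51.000 ] occupancy=[ 0.021 ]
method=[ memcpyDtoH ] gputime=[ 1.248 ] cputime=[ 25.000 ]
method=[ memcpyHtoD ] gputime=[ 0.832 ] cputime=[ 11.000 ]
method=[ doublify ] gputime=[ 1.920 ] cputime=[ 17.000 ] occupancy=[ 0.021 ]
"""

if __name__ == "__main__":
    for method, count in sorted(missing_calls(WITH_DETACH, WITHOUT_DETACH).items()):
        print(f"{method}: {count} record(s) missing without detach")
```

For these two logs the difference is one axpb record, one memcpyHtoD record and three memcpyDtoH records; both doublify launches appear in either log.
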
diff --git a/pycuda/autoinit.py b/pycuda/autoinit.py
index ec1c7bd..9aadb90 100644
--- a/pycuda/autoinit.py
+++ b/pycuda/autoinit.py
@@ -8,4 +8,5 @@ context = make_default_context()
 device = context.get_device()
 
 import atexit
+atexit.register(context.detach)
 atexit.register(context.pop)

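A note on ordering: the patch registers detach before pop, and Python's atexit module runs handlers in reverse order of registration. So at exit the context is popped first and detached second, which matches the "popped and detached" sequence described in the message. The sketch below illustrates only this ordering; FakeContext is a hypothetical stand-in that records calls, not the real PyCUDA context, so it runs without a GPU.

```python
import atexit


class FakeContext:
    """Hypothetical stand-in for a PyCUDA context; it only records calls."""

    def __init__(self):
        self.calls = []

    def pop(self):
        self.calls.append("pop")

    def detach(self):
        self.calls.append("detach")


context = FakeContext()


def report():
    # Registered first, so it runs last, after both cleanup handlers.
    print("cleanup order:", context.calls)


atexit.register(report)
# Same registration order as in the patch: detach first, then pop.
atexit.register(context.detach)
atexit.register(context.pop)
```

Running this as a script prints the recorded order when the interpreter exits, with pop ahead of detach.
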
_______________________________________________
PyCUDA mailing list
[email protected]
http://lists.tiker.net/listinfo/pycuda
