yaoyaoding commented on issue #292:
URL: https://github.com/apache/tvm-ffi/issues/292#issuecomment-3597822530

   Another advantage of `bin2c` is that it might be more 
platform-agnostic, since it does not depend on link-stage tools that might 
be platform-specific. We can use `objcopy` on Linux, while on Windows we would 
need another tool (`EDITBIN.exe`, according to Gemini). 
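
To illustrate why this route needs only a C compiler, here is a minimal sketch of what a `bin2c`-style converter does. It is not the actual `bin2c` shipped with the CUDA toolkit, whose output format may differ; the function name `bin_to_c_array` and the default symbol name are hypothetical.

```python
def bin_to_c_array(data: bytes, symbol: str = "image_data", per_line: int = 12) -> str:
    """Render `data` as C source defining a byte array and its length.

    Hypothetical helper for illustration only; not the real bin2c output format.
    """
    if not data:
        # An empty initializer list is not valid in C before C23.
        raise ValueError("data must not be empty")
    lines = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        # Each byte becomes a hex literal followed by a comma.
        lines.append("    " + ", ".join(f"0x{b:02x}" for b in chunk) + ",")
    body = "\n".join(lines)
    return (
        f"const unsigned char {symbol}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned long long {symbol}_size = {len(data)}ULL;\n"
    )


if __name__ == "__main__":
    # Tiny demo input; a real cubin would be read from disk with open(path, "rb").
    print(bin_to_c_array(b"\x7fELF\x02\x01", symbol="demo_cubin"))
```

In this sketch every input byte expands to about six characters of source text, so a 10 MB cubin becomes roughly 60 MB of C that the compiler has to parse.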
   
   Some performance numbers on my workstation:
   ```
   yaoyaod@yaoyaod-ldt:~$ bash bench.sh
   $ dd if=/dev/urandom of=a.cubin bs=1M count=1
   1+0 records in
   1+0 records out
   1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00355077 s, 295 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   real 0m0.038s
   user 0m0.026s
   sys  0m0.012s
   
   $ time gcc -c -o a.o a.cc
   
   real 0m0.501s
   user 0m0.454s
   sys  0m0.047s
   ==========
   
   
   $ dd if=/dev/urandom of=a.cubin bs=5M count=1
   1+0 records in
   1+0 records out
   5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.0118941 s, 441 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   real 0m0.131s
   user 0m0.126s
   sys  0m0.005s
   
   $ time gcc -c -o a.o a.cc
   
   real 0m3.606s
   user 0m3.403s
   sys  0m0.199s
   ==========
   
   
   $ dd if=/dev/urandom of=a.cubin bs=10M count=1
   1+0 records in
   1+0 records out
   10485760 bytes (10 MB, 10 MiB) copied, 0.0250449 s, 419 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   real 0m0.247s
   user 0m0.219s
   sys  0m0.027s
   
   $ time gcc -c -o a.o a.cc
   ^C
   yaoyaod@yaoyaod-ldt:~$ vim bench.sh
   yaoyaod@yaoyaod-ldt:~$ bash bench.sh
   $ dd if=/dev/urandom of=a.cubin bs=1M count=1
   
   1+0 records in
   1+0 records out
   1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00406298 s, 258 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   
   real 0m0.038s
   user 0m0.031s
   sys  0m0.007s
   
   $ time gcc -c -o a.o a.cc
   
   
   real 0m0.489s
   user 0m0.440s
   sys  0m0.049s
   
   ==========
   
   
   $ dd if=/dev/urandom of=a.cubin bs=5M count=1
   
   1+0 records in
   1+0 records out
   5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.0118218 s, 443 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   
   real 0m0.130s
   user 0m0.117s
   sys  0m0.012s
   
   $ time gcc -c -o a.o a.cc
   
   
   real 0m3.614s
   user 0m3.409s
   sys  0m0.188s
   
   ==========
   
   
   $ dd if=/dev/urandom of=a.cubin bs=10M count=1
   
   1+0 records in
   1+0 records out
   10485760 bytes (10 MB, 10 MiB) copied, 0.0249933 s, 420 MB/s
   
   $ time bin2c a.cubin > a.cc
   
   
   real 0m0.244s
   user 0m0.215s
   sys  0m0.028s
   
   $ time gcc -c -o a.o a.cc
   
   
   real 0m8.549s
   user 0m8.108s
   sys  0m0.438s
   
   ==========
   ```
   
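   The compilation time grows roughly linearly with the cubin file size. If 
the cubin is x MB, `gcc` takes about `kx` seconds. In the runs above k is 
roughly 0.5 to 0.9 (0.49 s for 1 MB, 3.61 s for 5 MB, 8.55 s for 10 MB), so 
budgeting k between 1 and 2 is a conservative estimate.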
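
As a quick check of the per-MB cost, the following sketch computes k from the `gcc` wall-clock times of the second `bench.sh` run above. The function names are hypothetical, and three data points on one workstation are only a rough indication.

```python
# (cubin size in MB, gcc "real" time in seconds) taken from the second run above.
RUNS = [(1, 0.489), (5, 3.614), (10, 8.549)]


def seconds_per_mb(runs):
    """Return k = time / size for each measurement."""
    return [t / size for size, t in runs]


def slope_through_origin(runs):
    """Least-squares estimate of k for the model time = k * size."""
    numerator = sum(size * t for size, t in runs)
    denominator = sum(size * size for size, _ in runs)
    return numerator / denominator


if __name__ == "__main__":
    for (size, _), k in zip(RUNS, seconds_per_mb(RUNS)):
        print(f"{size:>2} MB: k = {k:.2f} s/MB")
    print(f"fitted k = {slope_through_origin(RUNS):.2f} s/MB")
```

The per-size values rise slightly with file size (about 0.49, 0.72, and 0.85 s/MB), so the growth is close to linear but not exactly so.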


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


