yaoyaoding commented on issue #292: URL: https://github.com/apache/tvm-ffi/issues/292#issuecomment-3597822530
Another advantage of `bin2c` is that it might be more platform-agnostic, since it does not depend on link-stage tools that may be platform-specific. We can use `objcopy` on Linux, but on Windows we would need a different tool (Gemini suggested `EDITBIN.exe`).

Some performance numbers on my workstation (the first run was interrupted during the 10 MiB compile, so the script was run again):

```
yaoyaod@yaoyaod-ldt:~$ bash bench.sh
$ dd if=/dev/urandom of=a.cubin bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00355077 s, 295 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.038s
user 0m0.026s
sys 0m0.012s
$ time gcc -c -o a.o a.cc
real 0m0.501s
user 0m0.454s
sys 0m0.047s
==========
$ dd if=/dev/urandom of=a.cubin bs=5M count=1
1+0 records in
1+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.0118941 s, 441 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.131s
user 0m0.126s
sys 0m0.005s
$ time gcc -c -o a.o a.cc
real 0m3.606s
user 0m3.403s
sys 0m0.199s
==========
$ dd if=/dev/urandom of=a.cubin bs=10M count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0250449 s, 419 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.247s
user 0m0.219s
sys 0m0.027s
$ time gcc -c -o a.o a.cc
^[[A^[[A^[[A^C
yaoyaod@yaoyaod-ldt:~$ vim bench.sh
yaoyaod@yaoyaod-ldt:~$ bash bench.sh
$ dd if=/dev/urandom of=a.cubin bs=1M count=1
1+0 records in
1+0 records out
1048576 bytes (1.0 MB, 1.0 MiB) copied, 0.00406298 s, 258 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.038s
user 0m0.031s
sys 0m0.007s
$ time gcc -c -o a.o a.cc
real 0m0.489s
user 0m0.440s
sys 0m0.049s
==========
$ dd if=/dev/urandom of=a.cubin bs=5M count=1
1+0 records in
1+0 records out
5242880 bytes (5.2 MB, 5.0 MiB) copied, 0.0118218 s, 443 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.130s
user 0m0.117s
sys 0m0.012s
$ time gcc -c -o a.o a.cc
real 0m3.614s
user 0m3.409s
sys 0m0.188s
==========
$ dd if=/dev/urandom of=a.cubin bs=10M count=1
1+0 records in
1+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0249933 s, 420 MB/s
$ time bin2c a.cubin > a.cc
real 0m0.244s
user 0m0.215s
sys 0m0.028s
$ time gcc -c -o a.o a.cc
real 0m8.549s
user 0m8.108s
sys 0m0.438s
==========
```

The compilation time grows approximately linearly with the cubin file size. If the cubin is x MB, compiling takes roughly `kx` seconds. In the runs above k is about 0.5 to 0.9 (0.49 s, 3.61 s, and 8.55 s for 1, 5, and 10 MiB), so k between 1 and 2 is a safe upper estimate.
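To make the scaling estimate concrete, here is a minimal sketch that derives k from the `gcc` wall-clock times in the second benchmark run. The helper names `seconds_per_mib` and `fit_through_origin` are made up for illustration, and the fit assumes a purely proportional model t = k * x with no constant overhead, which is only an approximation of what the log shows.

```python
# "real" times of `gcc -c -o a.o a.cc` from the second benchmark run,
# as (cubin size in MiB, seconds).
measurements = [(1, 0.489), (5, 3.614), (10, 8.549)]


def seconds_per_mib(size_mib, seconds):
    """Per-measurement estimate of k in t = k * x."""
    return seconds / size_mib


def fit_through_origin(points):
    """Least-squares slope k for the model t = k * x (no intercept)."""
    num = sum(x * t for x, t in points)
    den = sum(x * x for x, _ in points)
    return num / den


ratios = [seconds_per_mib(x, t) for x, t in measurements]
k = fit_through_origin(measurements)

for (size, t), r in zip(measurements, ratios):
    print(f"{size:>2} MiB: {t:.3f} s -> {r:.2f} s/MiB")
print(f"least-squares k = {k:.2f} s/MiB")
```

The per-size ratio rises slightly with file size (about 0.49, 0.72, and 0.85 s/MiB), so growth is mildly superlinear over this range, and the single fitted slope comes out near 0.83 s/MiB.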
