[PATCH] D116583: Change the default optimisation level of PTXAS from -O0 to -O3. This makes the optimisation levels of PTXAS and the ptxjitcompiler equal (ptxjitcompiler defaults to -O3).

Artem Belevich via Phabricator via cfe-commits Tue, 04 Jan 2022 10:44:10 -0800

tra added inline comments.


================
Comment at: clang/lib/Driver/ToolChains/Cuda.cpp:433
   } else {
-    // If no -O was passed, pass -O0 to ptxas -- no opt flag should correspond
-    // to no optimizations, but ptxas's default is -O3.
-    CmdArgs.push_back("-O0");
+    // If no -O was passed, pass -O3 to ptxas -- this makes ptxas's
+    // optimization level the same as the ptxjitcompiler.
----------------
I think this would be contrary to the expectation that lack of `-O` in clang 
means - `do not optimize` and it generally implies the whole compilation chain, 
including assembler. Matching whatever nvidia tools do is an insufficient 
reason for breaking this assumption, IMO. 

If you do want do run optimized ptxas on unoptimized PTX, you can use 
`-Xcuda-ptxas -O3`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116583/new/

https://reviews.llvm.org/D116583

_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[PATCH] D116583: Change the default optimisation level of PTXAS from -O0 to -O3. This makes the optimisation levels of PTXAS and the ptxjitcompiler equal (ptxjitcompiler defaults to -O3).

Reply via email to