https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118356
Bug ID: 118356
Summary: RISC-V: -falign-labels=0 should (probably) default to 4
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: cousteaulecommandant at gmail dot com
Target Milestone: ---

Some RISC-V implementations, including the CORE-V CVE4 family [1], allow instructions to be aligned to 2- or 4-byte boundaries, but incur an extra clock cycle of penalty if the target of a branch instruction is a 4-byte instruction that is not aligned to a 4-byte boundary (there is no penalty if the target instruction is aligned to a 4-byte boundary, or if it is a 2-byte instruction).

In those cases, forcing alignment of branch targets to 4 bytes (which can be achieved by passing `-falign-labels=4`) can provide a great improvement on certain programs. For example, a tight `for` loop may take 9 clock cycles to run if the branch target is aligned but 10 if it is not, resulting in a performance loss of roughly 10%. (What's worse, this performance loss kicks in unpredictably: it can appear or disappear when I change a completely different part of the code, which drove me crazy when I was trying to measure the performance of a function affected by this issue. Enabling `-falign-labels=4` also has the advantage of removing this uncertainty.)

Here, my expectation would be that enabling a certain optimization level (such as `-O2`) would enable this particular optimization. In fact, the documentation [2] states that `-O2` enables the `-falign-labels` flag, but without specifying an alignment. It later states that `-falign-labels` without a value or with `=0` will "use a machine-dependent default which is very likely to be ‘1’, meaning no alignment".
Now, I don't quite understand why `-O2` would enable an optimization option whose default behavior is to do nothing, but my guess is that it does so in order that specific targets where `-falign-labels=X` provides an advantage (as is the case with RISC-V) can use `X` as the default value rather than 1.

What do you think? Would it make sense for RISC-V targets to use 4 as the default alignment value when `-falign-labels` is given without an explicit value, so that this forced alignment happens when `-O2` or `-O3` is used?

[1]: https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/pipeline.html#cycle-counts-per-instruction-type
[2]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html