https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118356

            Bug ID: 118356
           Summary: RISC-V: -falign-labels=0 should (probably) default to
                    4
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: cousteaulecommandant at gmail dot com
  Target Milestone: ---

Some RISC-V implementations, including the CORE-V CVE4 family [1], allow
instructions to be aligned to 2- or 4-byte boundaries, but incur an extra clock
cycle if the target of a branch instruction is a 4-byte instruction that is not
aligned to a 4-byte boundary.  There is no penalty if the target instruction is
aligned to a 4-byte boundary, or if it's a 2-byte instruction.
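
To make the rule above concrete, here is a minimal Python sketch of it.  This
is only a model of the behavior as I described it, not something taken from
the core's RTL or manual; the function names `branch_target_penalty` and
`padding_for_alignment` are hypothetical.

```python
def branch_target_penalty(target_addr: int, insn_len: int) -> int:
    """Extra cycles for a taken branch to target_addr (assumed model)."""
    if insn_len not in (2, 4):
        raise ValueError("only 2- and 4-byte instructions are modeled")
    if target_addr % 2 != 0:
        raise ValueError("instructions must be at least 2-byte aligned")
    # Penalty only for a 4-byte instruction that is not 4-byte aligned.
    misaligned = target_addr % 4 != 0
    return 1 if (insn_len == 4 and misaligned) else 0


def padding_for_alignment(addr: int, alignment: int = 4) -> int:
    """Bytes of padding needed to move addr up to the next aligned address."""
    return (-addr) % alignment


for addr, length in [(0x100, 4), (0x102, 4), (0x102, 2)]:
    print(hex(addr), length, branch_target_penalty(addr, length))
```

Under this model, only the second case (a 4-byte instruction at `0x102`) pays
the extra cycle, and 2 bytes of padding before the label would remove it.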

In those cases, forcing alignment of branch targets to 4 bytes (which can be
achieved by providing `-falign-labels=4`) can provide a great improvement on
certain programs.  For example, a tight `for` loop may take 9 clock cycles to
run if the branch target is aligned but 10 if it's not, resulting in a 10%
performance loss.  (What's worse, this performance loss will only kick in
arbitrarily, and can appear or disappear even if I change a completely
different part of the code, which drove me crazy when I was trying to measure
the performance of a function affected by this issue; enabling
`-falign-labels=4` also has the advantage of removing this uncertainty.)
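
For reference, the arithmetic behind the 10% figure can be sketched as
follows.  The 9 and 10 cycle counts are the ones from my example above; the
variable names are just for illustration.

```python
from fractions import Fraction

aligned_cycles = 9      # cycles per iteration, branch target aligned
misaligned_cycles = 10  # cycles per iteration, branch target misaligned

# Iterations per cycle drop from 1/9 to 1/10.
throughput_loss = 1 - Fraction(aligned_cycles, misaligned_cycles)
# Equivalently, each iteration takes 10/9 as long.
runtime_increase = Fraction(misaligned_cycles, aligned_cycles) - 1

print(f"throughput loss: {float(throughput_loss):.1%}")    # 10.0%
print(f"runtime increase: {float(runtime_increase):.1%}")  # 11.1%
```

So the same single cycle is a 10% loss in throughput, or about 11% more run
time, depending on which way it is measured.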

Here, my expectation would be that enabling a certain optimization level (such
as `-O2`) would also enable this alignment.  In fact, the documentation [2]
states that `-O2` enables the `-falign-labels` flag, but without specifying an
alignment.  It later states that `-falign-labels` without a value or with `=0`
will "use a machine-dependent default which is very likely to be ‘1’, meaning
no alignment".

Now, I don't quite understand why `-O2` would want to enable an optimization
option whose default behavior is to do nothing, but my guess is that this is so
that specific targets where setting `-falign-labels=X` can provide an advantage
(as is the case with RISC-V) use `X` as the default value rather than 1.

What do you think?  Would it make sense for RISC-V targets to make
`-falign-labels=4` the default alignment value when `-falign-labels` is used
without providing an explicit value, so that this forced alignment will happen
when `-O2` or `-O3` are used?

[1]:
https://docs.openhwgroup.org/projects/cv32e40p-user-manual/en/latest/pipeline.html#cycle-counts-per-instruction-type
[2]: https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
