On Wed, Feb 24, 2021 at 6:18 AM Jiaxun Yang <jiaxun.y...@flygoat.com> wrote:

> I found it's very difficult for GCC to generate this kind of pcrel_lo
> expression, RTX label_ref can't be lower into such LOW_SUM expression.
>

Yes, it is difficult.  You need to generate a label, and put the label
number in an unspec in the auipc pattern, and then create a label_ref to
put in the addi.  The fact that we have an unspec and a label_ref means a
number of optimizations get disabled, like basic block duplication and loop
unrolling, because they can't make a copy of an instruction that uses a
label as data, as they have no way to know how to duplicate the label
itself.  At least, that is the case for RISC-V, which needs one label; you
would probably need to create two.

There is a far easier way to do this, which is to just emit an assembler
macro, and let the assembler generate the labels and relocs.  This is what
the RISC-V GCC port does by default.  This prevents some optimizations like
scheduling the two instructions, but enables some other optimizations like
loop unrolling.  So it is a tossup.  Sometimes we get better code with the
assembler macro, and sometimes we get better code by emitting the auipc and
addi separately.

The RISC-V gcc port can emit the auipc/addi with
-mexplicit-relocs -mcmodel=medany, but this is known to sometimes
fail.  The problem is that if you have an 8-byte variable with 8-byte
alignment, and try to load it with 2 4-byte loads, gcc knows that offset+4
must be safe from overflow because the data is 8-byte aligned.  However,
when you use a pc-relative offset, which is data address minus code address, the
offset is only as aligned as the code is.  RISC-V has 2-byte instruction
alignment with the C extension.  So if you have offset+4 and offset is only
2-byte aligned, it is possible that offset+4 may overflow the add immediate
field.  The same thing can happen with 16-byte data that is 16-byte
aligned, accessed with two 8-byte loads.  There is no easy software
solution.  We just emit a linker error in that case as we can't do anything
else.  I think this would work better if auipc cleared some low bits of the
result, in which case the pc-relative offset would have enough alignment to
prevent overflow when adding small offsets, but it is far too late to
change how the RISC-V auipc works.
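
To make the arithmetic concrete, here is a small C sketch of the overflow.
It is only an illustration, not code from GCC or binutils: the helper names
(lo12, fits_imm12, second_load_fits, worked_example) and the addresses are
made up, and it assumes the usual hi20/lo12 split where the low part is a
signed 12-bit immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Low part of a hi20/lo12 split: the signed 12-bit value left over once
   the upper part has been rounded to the nearest multiple of 4096.
   The result is in [-2048, 2047].  Hypothetical helper for illustration. */
static int32_t lo12(int64_t offset)
{
    return (int32_t)(((uint64_t)offset & 0xfff) ^ 0x800) - 0x800;
}

/* True if v fits in a signed 12-bit immediate field. */
static bool fits_imm12(int32_t v)
{
    return v >= -2048 && v <= 2047;
}

/* True if the second 4-byte load, at (low part + 4), still fits.
   base is the auipc address for pc-relative code, or 0 for absolute. */
static bool second_load_fits(int64_t data_addr, int64_t base)
{
    return fits_imm12(lo12(data_addr - base) + 4);
}

void worked_example(void)
{
    /* Absolute address, 8-byte aligned data: the low part is a multiple
       of 8, so it is at most 2040 and 2040 + 4 = 2044 still fits. */
    assert(lo12(0x107f8) == 2040);
    assert(second_load_fits(0x107f8, 0));

    /* pc-relative: the data at 0x11800 is 8-byte aligned, but the auipc
       at 0x10002 is only 2-byte aligned.  The offset is 0x17fe, the low
       part is 2046, and 2046 + 4 = 2050 no longer fits. */
    assert(lo12(0x11800 - 0x10002) == 2046);
    assert(!second_load_fits(0x11800, 0x10002));
}
```

Calling worked_example() from a test program should pass all four asserts.
The second pair is the case where we end up with the linker error.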

> If it looks infeasible for GCC side, another option would be adding
> RISC-V style %pcrel_{hi,lo} modifier at assembler side. We can add another
> pair of modifier like %pcrel_paired_{hi,lo} to implement the behavior.
> Would it be a good idea?
>

I wouldn't recommend following the RISC-V approach for the relocation.

Jim
