On Wed, Feb 24, 2021 at 6:18 AM Jiaxun Yang <[email protected]> wrote:
> I found it's very difficult for GCC to generate this kind of pcrel_lo
> expression,
> RTX label_ref can't be lower into such LOW_SUM expression.
>
Yes, it is difficult. You need to generate a label, put the label number
in an unspec in the auipc pattern, and then create a label_ref to put in
the addi. Having an unspec and a label_ref disables a number of
optimizations, like basic block duplication and loop unrolling, because
they can't copy an instruction that uses a label as data: they have no way
to know how to duplicate the label itself. At least, that is what RISC-V
needs: one label. You probably need to create two labels.
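
For reference, here is a small Python model of why the addi has to name
the auipc's label rather than the symbol: the low part is measured from the
auipc's pc, not from the addi's own pc. This is only a sketch of the
relocation arithmetic, not GCC or binutils code. The helper names and the
addresses are made up for illustration.

```python
# Sequence being modelled (RISC-V assembly):
#   .L0: auipc a0, %pcrel_hi(sym)
#        addi  a0, a0, %pcrel_lo(.L0)
# All function names and addresses below are hypothetical.

def pcrel_hi(sym_addr, auipc_pc):
    """Upper part of (sym - pc), rounded so the low part is signed 12-bit."""
    return (sym_addr - auipc_pc + 0x800) >> 12

def pcrel_lo(sym_addr, base_pc):
    """Low part of the offset, measured from base_pc."""
    return (sym_addr - base_pc) - (pcrel_hi(sym_addr, base_pc) << 12)

def run_pair(sym_addr, auipc_pc, lo_base_pc):
    """Model auipc+addi; lo_base_pc is the pc the low part is computed from."""
    a0 = auipc_pc + (pcrel_hi(sym_addr, auipc_pc) << 12)  # auipc
    return a0 + pcrel_lo(sym_addr, lo_base_pc)            # addi

sym = 0x20ABC
auipc_pc = 0x10000       # address of label .L0
addi_pc = auipc_pc + 4   # the addi sits right after the auipc

# Low part taken relative to the auipc (what %pcrel_lo(.L0) means).
correct = run_pair(sym, auipc_pc, lo_base_pc=auipc_pc)
# Low part taken relative to the addi itself: off by the distance between them.
wrong = run_pair(sym, auipc_pc, lo_base_pc=addi_pc)

print(hex(correct), hex(wrong))
```

The second result is off by exactly the distance between the two
instructions, which is why the addi carries a reference to the auipc's
label and why the compiler has to keep that label attached to the pair.
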
There is a far easier way to do this, which is to just emit an assembler
macro, and let the assembler generate the labels and relocs. This is what
the RISC-V GCC port does by default. This prevents some optimizations, like
scheduling the two instructions, but enables some other optimizations, like
loop unrolling. So it is a toss-up. Sometimes we get better code with the
assembler macro, and sometimes we get better code by emitting the auipc and
addi separately.

The RISC-V GCC port can emit the auipc/addi with
-mexplicit-relocs -mcmodel=medany, but this is known to sometimes
fail. The problem is that if you have an 8-byte variable with 8-byte
alignment, and try to load it with two 4-byte loads, GCC assumes that
offset+4 is safe from overflow because the data is 8-byte aligned. However,
when you use a pc-relative offset, which is data address minus code
address, the offset is only as aligned as the code is. RISC-V has 2-byte
instruction alignment with the C extension. So if you have offset+4 and
offset is only 2-byte aligned, it is possible that offset+4 overflows the
add immediate field. The same thing can happen with 16-byte data that is
16-byte aligned, accessed with two 8-byte loads. There is no easy software
solution. We just emit a linker error in that case, as we can't do anything
else. I think this would work better if auipc cleared some low bits of the
result, in which case the pc-relative offset would have enough alignment to
prevent overflow when adding small offsets, but it is far too late to
change how the RISC-V auipc works.
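
To make the overflow concrete, here is a rough Python sketch of the hi/lo
split. It is a model of the arithmetic only, not what GCC or the linker
actually runs. The helper names and the addresses are invented for the
example, and the "auipc clears low bits" variant at the end is purely
hypothetical.

```python
# Illustrative model of the RISC-V hi20/lo12 split; all names are made up.

def split_hi_lo(value):
    """Return (hi, lo) with value == (hi << 12) + lo and -2048 <= lo <= 2047."""
    hi = (value + 0x800) >> 12
    lo = value - (hi << 12)
    return hi, lo

def fits_imm12(x):
    """True if x fits the signed 12-bit immediate of addi or a load."""
    return -2048 <= x <= 2047

# Absolute addressing: an 8-byte-aligned address always yields a low part
# that is a multiple of 8, so it is at most 2040 and lo + 4 still fits.
abs_ok = all(fits_imm12(split_hi_lo(addr)[1] + 4)
             for addr in range(0, 1 << 16, 8))

# pc-relative addressing: the data is 8-byte aligned, but the auipc sits at
# a 2-byte-aligned pc (possible with the C extension).
data_addr = 0x20800
auipc_pc = 0x10002
hi, lo = split_hi_lo(data_addr - auipc_pc)
first_load_ok = fits_imm12(lo)        # lo is 2046, which fits
second_load_ok = fits_imm12(lo + 4)   # 2050 does not fit

# Hypothetical auipc that clears the low 3 bits of pc: the offset becomes
# 8-byte aligned again, so lo + 4 cannot overflow.
hi2, lo2 = split_hi_lo(data_addr - (auipc_pc & ~7))
hypothetical_ok = fits_imm12(lo2 + 4)

print(abs_ok, lo, first_load_ok, second_load_ok, lo2, hypothetical_ok)
```

The same reasoning applies to the 16-byte case, with lo + 8 in place of
lo + 4.
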
> If it looks infeasible for GCC side, another option would be adding
> RISC-V style
> %pcrel_{hi,lo} modifier at assembler side. We can add another pair of
> modifier
> like %pcrel_paired_{hi,lo} to implement the behavior. Would it be a good
> idea?
>
I wouldn't recommend following the RISC-V approach for the relocation.

Jim