On Wed, Feb 24, 2021 at 6:18 AM Jiaxun Yang <jiaxun.y...@flygoat.com> wrote:
> I found it's very difficult for GCC to generate this kind of pcrel_lo
> expression,
> RTX label_ref can't be lower into such LOW_SUM expression.

Yes, it is difficult. You need to generate a label, and put the label
number in an unspec in the auipc pattern, and then create a label_ref to
put in the addi. The fact that we have an unspec and a label_ref means a
number of optimizations get disabled, like basic block duplication and
loop unrolling, because they can't make a copy of an instruction that
uses a label as data, as they have no way to know how to duplicate the
label itself. Or at least RISC-V needs to create one label. You probably
need to create two labels.

There is a far easier way to do this, which is to just emit an assembler
macro, and let the assembler generate the labels and relocs. This is
what the RISC-V GCC port does by default. This prevents some
optimizations like scheduling the two instructions, but enables some
other optimizations like loop unrolling. So it is a tossup. Sometimes we
get better code with the assembler macro, and sometimes we get better
code by emitting the auipc and addi separately.

The RISC-V GCC port can emit the auipc/addi with -mexplicit-relocs
-mcmodel=medany, but this is known to sometimes fail. The problem is
that if you have an 8-byte variable with 8-byte alignment, and try to
load it with two 4-byte loads, GCC knows that offset+4 must be safe from
overflow because the data is 8-byte aligned. However, when you use a
pc-relative offset that is data address minus code address, the offset
is only as aligned as the code is. RISC-V has 2-byte instruction
alignment with the C extension. So if you have offset+4 and offset is
only 2-byte aligned, it is possible that offset+4 may overflow the add
immediate field. The same thing can happen with 16-byte data that is
16-byte aligned, accessed with two 8-byte loads. There is no easy
software solution. We just emit a linker error in that case as we can't
do anything else.

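The overflow described above can be sketched with a little arithmetic.
This is a simplified model, not GCC or binutils code: it assumes the
usual split of a pc-relative offset into a 20-bit high part (for auipc)
and a signed 12-bit low part (for addi or a load), and the function
names are hypothetical.

```python
def split_pcrel(offset):
    """Split a pc-relative byte offset into (hi20, lo12).

    hi20 is what auipc would add (in units of 4096 bytes); lo12 is the
    signed 12-bit remainder, always in the range [-2048, 2047].
    """
    hi = (offset + 0x800) >> 12
    lo = offset - (hi << 12)
    return hi, lo


def fits_simm12(value):
    """True if value fits a signed 12-bit immediate field."""
    return -2048 <= value <= 2047


def second_access_overflows(offset, extra=4):
    """True if lo12 + extra no longer fits the immediate field.

    'extra' models the second half of a split access (offset+4 for two
    4-byte loads, offset+8 for two 8-byte loads) reusing the same auipc.
    """
    _, lo = split_pcrel(offset)
    return not fits_simm12(lo + extra)


if __name__ == "__main__":
    # An absolute, 8-byte-aligned address keeps lo12 a multiple of 8,
    # so lo12 + 4 always fits. A pc-relative offset that is only 2-byte
    # aligned can have lo12 close enough to 2047 that adding 4 overflows.
    for offset in (2040, 2046):
        print(offset, split_pcrel(offset), second_access_overflows(offset))
```

In this model the failing cases are exactly those where lo12 lands
within `extra` bytes of the top of the immediate range, which 8-byte or
16-byte alignment of the offset would rule out but 2-byte alignment does
not.
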
I think this would work better if auipc cleared some low bits of the
result, in which case the pc-relative offset would have enough alignment
to prevent overflow when adding small offsets, but it is far too late to
change how the RISC-V auipc works.

> If it looks infeasible for GCC side, another option would be adding
> RISC-V style
> %pcrel_{hi,lo} modifier at assembler side. We can add another pair of
> modifier
> like %pcrel_paired_{hi,lo} to implement the behavior. Would it be a good
> idea?

I wouldn't recommend following the RISC-V approach for the relocation.

Jim