On Thu, 2017-01-26 at 13:00 +0000, Matthew Fortune wrote: > Matthew Fortune <matthew.fort...@imgtec.com> writes: > ... > > Pseudo 300 is assigned to memory and then LRA produces a simple > > DImode > > load from the assigned stack slot. The only instruction to set > > pseudo > > 300 is: > > > > (insn 247 212 389 3 (set (reg:SI 300) > > (ne:SI (subreg/s/u:SI (reg/v:DI 231 [ taken ]) 0) > > (const_int 0 [0]))) > > "/home/mfortune/gcc/gcc/predict.c":2904 > > 504 {*sne_zero_sisi} > > (nil)) > > > > Which leads to an SImode store to the stack slot: > > > > (insn 247 392 393 3 (set (reg:SI 4 $4 [300]) > > (ne:SI (reg:SI 20 $20 [orig:231 taken ] [231]) > > (const_int 0 [0]))) > > "/home/mfortune/gcc/gcc/predict.c":2904 > > 504 {*sne_zero_sisi} > > (nil)) > > (insn 393 247 389 3 (set (mem/c:SI (plus:DI (reg/f:DI 29 $sp) > > (const_int 16 [0x10])) [403 %sfp+16 S4 A64]) > > (reg:SI 4 $4 [300])) > > "/home/mfortune/gcc/gcc/predict.c":2904 312 > > {*movsi_internal} > > (nil)) > > ... > > > > (note 248 246 249 40 NOTE_INSN_DELETED) > > (note 249 248 256 40 NOTE_INSN_DELETED) > > (note 256 249 250 40 NOTE_INSN_DELETED) > > (insn 250 256 251 40 (set (reg:DI 6 $6) > > (mem/c:DI (plus:DI (reg/f:DI 29 $sp) > > (const_int 16 [0x10])) [403 %sfp+16 S8 A64])) > > "/home/mfortune/gcc/gcc/predict.c":2904 310 {*movdi_64bit} > > (nil)) > > > > My assumption is that LRA is again expected to deal with this case > > and > > for insn > > 250 should be recognising that it must load 32-bits and rely on > > implicit > > LOAD_EXTEND_OP behaviour producing an acceptable 64-bit value. In > > this > > case it does not matter whether it is sign or zero extension and my > > assumption is that this construct would never appear if a specific > > sign > > or zero extension was required. > > > > I haven't got to looking at where the issue is this time but it > > seems > > different as this is a subreg in a simple move instruction where we > > already support the load/ store directly so no new reload > > instruction is > > required. I don't know if this implies that simple move patterns > > should > > reject subregs but that doesn't sound right either. > > > > Resolving this fixes at least one bug and potentially all bugs in > > the > > MIPS bootstrap as manually modified the generated assembly code to > > use > > LW instead of LD for insn > > 250 and one of the buggy stage 3 objects is fixed. > > > > I'll keep thinking, any advice in the meantime is appreciated. > > All I have been able to determine on this is that there is > potentially > different behaviour for paradoxical subregs in LRA vs reload. There > is > this comment in reload.c:push_reload: > > If we have (SUBREG:M1 (MEM:M2 ...) ...) (or an inner REG that is > still > a pseudo and hence will become a MEM) with M1 wider than M2 and > the > register is a pseudo, also reload the inside expression. > > To me this makes perfect sense as I believe the RTL is only saying > that > there is an M2-mode object to access or at least only the M2-mode > sized > bits are valid. There are comments to say there will always be > sufficient > memory assigned for spill slots as they are sized to fit the largest > paradoxical subreg, I just don't know why that is useful/important. > > However in lra-constraints.c:simplify_operand_subreg it quite happily > performs a reload using the outer mode in this case and only drops > down to > the inner mode if the outer mode reload would be slower than the > inner. > > Presumably this is safe for non WORD_REGISTER_OPERATIONS targets as > the > junk upper bits in registers will be ignored; On > WORD_REGISTER_OPERATIONS > targets then the narrower-than-word mode load will take care of any > 'magic' needed to set the upper bits to a safe value in register. > > So my thinking is that at least WORD_REGISTER_OPERATIONS targets > should > always reload the inner mode for the case mentioned above much like > the same > is required for normal subregs. Does that seem reasonable? Have I > misunderstood the paradoxical subreg case entirely? > > I've only done superficial testing of a change to this code so far > but my > testcase starts working at least which is a start.
FWIW, the RTL "frontend" [1] is now in trunk (as of r244878), so it should now possible to write small fragments of RTL as testcases in DejaGnu. I don't know if it's helpful for this bug though. In case it is, I started some documentation for it here: https://gcc.gnu.org/ml/gcc-patches/2017-01/msg02065.html Dave [1] I put "frontend" in quotes as it's actually an extension to the C frontend, rather than a true frontend.