Richard, this sounds very good. Let me try this. So far, what I meant by 'reloading myself' was that, when generating the RTL above, I emit move instructions to/from memory if the arguments are in memory, and generate the RTL itself with hard registers only. I did not mention that the RTL expansion is done from a builtin function, which is where I get either memory or register arguments.
Your approach sounds better.

On Tue, Sep 17, 2013 at 10:33 AM, Richard Sandiford
<rdsandif...@googlemail.com> wrote:
> Hendrik Greving <hendrik.greving.in...@gmail.com> writes:
>> For a special mechanism we are generating jump_insn with a 'set' side
>> effect in our backend. RTL looks e.g. like this:
>>
>> (jump_insn 25 24 26 (nil) (parallel [
>>             (set (pc)
>>                 (if_then_else (ne (unspec_volatile [
>>                                 (const_string ("<myinsn> %0,[%1] =%2"))
>>                                 (const_int 0 [0x0])
>>                                 (reg:SI 348)
>>                             ] 21)
>>                         (const_int 0 [0x0]))
>>                     (label_ref:SI 43)
>>                     (pc)))
>>             (clobber (reg:SI 321 link))
>>             (set (reg/v:QI 346)
>>                 (unspec:QI [
>>                         (const_int 0 [0x0])
>>                     ] 0))
>>         ]) -1 (nil)
>>     (nil))
>>
>> After working out some issues initially, this all works fine and the
>> 'set' seems to be properly recognized by RTL optimization phases (e.g.
>> CSE). I am now running into an issue however with reload. The problem
>> seems to be that if the parallel 'set' from e.g. RTL above feeds into
>> a (mem (reg)). This happens when e.g. compiling in debug mode, most
>> variables are at memory locations of the stack. In this case, compiler
>> needs to reload the instruction above (I am actually not sure why, but
>> I guess this could happen all the time). The further problem seems to
>> be that there is a hard constraint in reload that jump_insn can't have
>> output operands / reloads.
>>
>>   if (GET_CODE (insn) == JUMP_INSN || GET_CODE (insn) == CALL_INSN)
>>     no_output_reloads = 1;
>>
>> so what I am doing is, when generating RTL above, I always 'reload'
>> myself in the backend, generating mov's to/from memory, and always
>> making sure only hard registers are put into RTL above. The problem is
>> that I need hard registers for that. I would strongly prefer using
>> pseudo's, but pseudo's also seem to require reload.
>
> This might be what you mean by doing reload yourself, but the usual
> way of handling this is to add a memory alternative to the pattern
> and split that alternative after reload. If the split requires
> a temporary register, you can allocate one by adding a
> (clobber (match_scratch ...)) to the insn pattern. The scratch can
> be "X" for the normal register case that doesn't need a temporary.
> See the *ctr<mode>_internal1 pattern in config/rs6000/rs6000.md and
> the doloop_si64 pattern in config/s390/s390.md for examples.
>
> Thanks,
> Richard
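
For reference, a rough sketch of what the suggested pattern shape might look like in a machine description. This is only an illustration under assumptions: the pattern name, UNSPECV_MYJUMP, UNSPEC_MYRESULT, LINK_REGNUM, the predicates, the constraints and the assembler template are all hypothetical placeholders, and the unspec_volatile operands are simplified compared to the RTL dump quoted above.

```
;; Hypothetical sketch, not taken from any real port.
(define_insn "*myjump_set_internal"
  [(set (pc)
        (if_then_else
          (ne (unspec_volatile [(match_operand:SI 1 "register_operand" "r,r")]
                               UNSPECV_MYJUMP)
              (const_int 0))
          (label_ref (match_operand 0 "" ""))
          (pc)))
   (clobber (reg:SI LINK_REGNUM))
   ;; Alternative 0: the result goes to a register.
   ;; Alternative 1: the result lives in memory, so the jump_insn
   ;; itself does not need an output reload.
   (set (match_operand:QI 2 "nonimmediate_operand" "=r,m")
        (unspec:QI [(const_int 0)] UNSPEC_MYRESULT))
   ;; Temporary for the split; "X" in the register alternative
   ;; means no scratch register is allocated there.
   (clobber (match_scratch:QI 3 "=X,&r"))]
  ""
  "@
   myinsn %2,[%1],%l0
   #")
```

The "#" alternative only works together with a matching define_split, guarded by reload_completed, that rewrites the memory case using the scratch in operand 3. What that split has to emit depends on the target, so it is left out here; the rs6000 and s390 patterns mentioned above show complete versions.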