Hi Richard,
Thanks for getting back to me, and for your insight. I've implemented a
TARGET_CANNOT_COPY_INSN_P hook that rejects volatile memory references, and
it appears to be doing the trick!
It will take some time to determine whether there are any other side
effects, but I can move forward with this. Much appreciated!
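A minimal sketch of what such a hook can look like, assuming GCC's volatile_refs_p is used to detect volatile references (the "xxx" target name and the exact condition are illustrative, not the committed change):

```c
/* Sketch only: a target hook that stops passes from duplicating any
   insn whose pattern contains a volatile reference.  "xxx" stands in
   for the real target name.  */

#undef TARGET_CANNOT_COPY_INSN_P
#define TARGET_CANNOT_COPY_INSN_P xxx_cannot_copy_insn_p

static bool
xxx_cannot_copy_insn_p (rtx_insn *insn)
{
  /* volatile_refs_p walks the pattern looking for volatile memory
     references (and other volatile constructs such as asm volatile).  */
  return volatile_refs_p (PATTERN (insn));
}
```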
-Tucker
On Tue, Oct 13, 2020 at 3:00 AM Richard Sandiford wrote:
> Tucker Kern via Gcc writes:
> > TL;DR
> >
> > In GCC 9.3, I believe modified_between_p should return 1 if the memory
> > reference is volatile. My layman's understanding of volatile memory is
> > that it could change at any time, and thus could be modified between
> > any two instructions.
>
> That's true, but in RTL, volatile accesses provide no timing guarantee,
> even in an abstract sense. E.g. suppose we traced the execution of a
> function call based on the instructions that exist immediately after
> the expand pass. If a volatile access occurs in the Nth instruction
> of this trace, there's no guarantee that the access will still occur
> in the Nth instruction of traces for later passes. Unrelated
> instructions can be added and removed at any time.
>
> For example, a volatile access should not stop us from swapping:
>
> (set (reg:SI R0) (const_int 0))
>
> and:
>
> (set (mem/v:SI (reg:SI R1)) (const_int 0))
>
> So…
>
> > Possible patch:
> > diff --git a/gcc/rtlanal.c b/gcc/rtlanal.c
> > index 01af063a222..a395df0627b 100644
> > --- a/gcc/rtlanal.c
> > +++ b/gcc/rtlanal.c
> > @@ -1308,6 +1308,8 @@ modified_between_p (const_rtx x, const rtx_insn *start, const rtx_insn *end)
> >        return 1;
> >
> >      case MEM:
> > +      if (MEM_VOLATILE_P (x))
> > +        return 1;
> >        if (modified_between_p (XEXP (x, 0), start, end))
> >          return 1;
> >        if (MEM_READONLY_P (x))
>
> …I think the current code handles volatile correctly. The issue
> is whether any instruction in (START, END) might modify the same
> memory as X. Most of the real work is done by the alias machinery,
> which understands (or is supposed to understand) the special rules
> around volatile memory and how it interacts with other memory accesses:
>
>   for (insn = NEXT_INSN (start); insn != end; insn = NEXT_INSN (insn))
>     if (memory_modified_in_insn_p (x, insn))
>       return 1;
>
> > Details
> > I'm writing a splitter to optimize bit field conditionals (e.g. testing
> > a bit of memory-mapped I/O). The non-optimized pattern consists of a
> > SET, AND, LSHIFTRT + CMP, JMP sequence, which I know can be simplified
> > to AND + JMP, as the AND operation will set the necessary CC bits. To
> > achieve this optimization in the combine pass, a custom general_operand
> > predicate has been defined that allows volatile memory.
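The custom predicate itself isn't shown in the thread; a hypothetical version (the name, accepted codes, and body are illustrative assumptions) might look like this in the target's predicates.md:

```
;; Sketch only: like general_operand, but also accepting volatile MEMs
;; with a valid address, for use in the combine-targeted patterns.
(define_predicate "volatile_general_operand"
  (match_code "reg,subreg,mem,const_int")
{
  if (MEM_P (op) && MEM_VOLATILE_P (op))
    return memory_address_p (mode, XEXP (op, 0));
  return general_operand (op, mode);
})
```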
> >
> > However, in certain situations I've found the combine pass generates
> > invalid patterns when it merges SET and IF_THEN_ELSE (CMP + JMP)
> > patterns. One such situation is when GCC decides to reuse the result
> > of the comparison for the return value.
> >
> > This issue seems intertwined with the combine + volatile_ok issue that
> > has been discussed a number of times in the past. The backend in
> > question is highly embedded, and significant performance gains are
> > made by enabling volatile references during combine.
> >
> > Optimized tree of the test program. In this program the bit field
> > value is saved in _1 to be reused in the return value. Why it makes
> > this choice is beyond me.
> >_1;
> > _Bool _3;
> >
> >[local count: 1044213930]:
> > _1 ={v} MEM[(volatile struct register_t *)5B].done;
> > if (_1 != 0)
> > goto ; [11.00%]
> > else
> > goto ; [89.00%]
> >
> >[local count: 1073741824]:
> > // Additional code trimmed for brevity...
> >
> >[local count: 114863532]:
> > # _3 = PHI <0(4), _1(5)>
> > return _3;
> >
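The C source of the test program isn't included in the thread; the following is a reconstructed guess at its shape (struct name from the dump; field layout, object name, and function name are assumptions, with the shift by 3 and mask by 1 suggesting .done occupies bit 3):

```c
#include <assert.h>
#include <stdbool.h>

/* Reconstructed guess at the shape of the test program (fail.c).  */
struct register_t
{
  unsigned char pad : 3;   /* assumed padding bits below .done */
  unsigned char done : 1;  /* bit 3, matching the LSHIFTRT by 3 */
};

/* The real test presumably maps this at address 5; an ordinary object
   is used here so the sketch compiles anywhere.  */
volatile struct register_t dev;

bool
poll_done (void)
{
  bool done = dev.done;   /* _1 ={v} MEM[(volatile struct register_t *)...].done; */
  if (done)
    return done;          /* _1 is reused for the return value */
  /* Additional code trimmed for brevity...  */
  return false;
}
```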
> > This produces the following pre-combine RTL.
> > (insn 9 6 10 3 (parallel [
> > (set (reg:QI 17 [ ])
> > (lshiftrt:QI (mem/v:QI (const_int 5 [0x5]) [1
> MEM[(volatile
> > struct register_t *)5B]+0 S1 A16])
> > (const_int 3 [0x3])))
> > (clobber (reg:CC 7 FLAG))
> > ]) "fail.c":22:26 19 {*lshrqi3_im}
> > (expr_list:REG_DEAD (reg/f:QI 18)
> > (expr_list:REG_UNUSED (reg:CC 7 FLAG)
> > (nil))))
> > (insn 10 9 12 3 (parallel [
> > (set (reg:QI 17 [ ])
> > (and:QI (reg:QI 17 [ ])
> > (const_int 1 [0x1])))
> > (clobber (reg:CC 7 FLAG))
> > ]) "fail.c":22:26 10 {andqi3}
> > (expr_list:REG_UNUSED (reg:CC 7 FLAG)
> > (nil)))
> > (jump_insn 12 10 20 3 (parallel [
> > (set (pc)
> > (if_then_else (eq (reg:QI 17 [ ])
> > (const_int 0 [0]))
> > (label_ref:QI 11)
> > (pc)))
> > (clobber (scratch:QI))
> > ]) "fail.c":22:9 41 {*cbranchqi4_im}
> > (int_list:REG_BR_PROB 955