> > > This code was added in 1997 (r14770).  In 2004 the documentation was
> > > changed to clarify how things really work (r88999):
> > >
> > > "Note that even a volatile @code{asm} instruction can be moved relative to
> > > other code, including across jump instructions."
> > >
> > > (followed by an example exactly about what this means for FPU control).
> >
> > Thanks for pointing to that changes!  Unfortunately, sched-deps.c was
> > more conservative this 15 years...
> > Let’s try to fix it.
>
> If it causes problems now, that would be a good idea yes :-)
>
> > > I mean have the modulo scheduler implement the correct asm semantics, not
> > > some more restrictive thing that gets it into conflicts with DF, etc.
> > >
> > > I don't think this will turn out to be a problem in any way.  Some invalid
> > > asm will break, sure.
> >
> > I have started with applying this without any SMS changes:
> >
> > diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
> > --- a/gcc/sched-deps.c
> > +++ b/gcc/sched-deps.c
> > @@ -2753,22 +2753,14 @@ sched_analyze_2 (struct deps_desc *deps, rtx x, rtx_insn *insn)
> >
> >      case UNSPEC_VOLATILE:
> >        flush_pending_lists (deps, insn, true, true);
> > +
> > +      if (!DEBUG_INSN_P (insn))
> > +        reg_pending_barrier = TRUE_BARRIER;
> >        /* FALLTHRU */
> >
> >      case ASM_OPERANDS:
> >      case ASM_INPUT:
> >        {
> > -        /* Traditional and volatile asm instructions must be considered to use
> > -           and clobber all hard registers, all pseudo-registers and all of
> > -           memory.  So must TRAP_IF and UNSPEC_VOLATILE operations.
> > -
> > -           Consider for instance a volatile asm that changes the fpu rounding
> > -           mode.  An insn should not be moved across this even if it only uses
> > -           pseudo-regs because it might give an incorrectly rounded result.  */
> > -        if ((code != ASM_OPERANDS || MEM_VOLATILE_P (x))
> > -            && !DEBUG_INSN_P (insn))
> > -          reg_pending_barrier = TRUE_BARRIER;
> > -
> >          /* For all ASM_OPERANDS, we must traverse the vector of input operands.
> >             We cannot just fall through here since then we would be confused
> >             by the ASM_INPUT rtx inside ASM_OPERANDS, which do not indicate
>
> UNSPEC_VOLATILE and volatile asm should have the same semantics, ideally.
> One step at a time I guess :-)

When the barrier for UNSPEC_VOLATILE is also dropped, there are more
failures on x86_64:

FAIL: gcc.dg/vect/pr62021.c -flto -ffat-lto-objects execution test
FAIL: gcc.dg/vect/pr62021.c execution test
FAIL: gcc.target/i386/avx2-vect-aggressive.c execution test

I haven’t looked at those vectorization tests yet.

> > Regstrapping it on x86-64 shows some failures. First is with ms-sysv
> > abi test and can be solved like this:
>
> [ snip ]
>
> > Here we have some asms which control stack pointer (sigh!). It
> > certainly may be broken at any moment, but right now “memory” clobber
> > fixes everything for me.
>
> Makes sense.
>
> > Another failure is:
> >
> > FAIL: gcc.dg/guality/pr58791-4.c -O2 -DPREVENT_OPTIMIZATION line
> > pr58791-4.c:32 i == 486
>
> [ snip ]
>
> > It is connected to debug-info, and I cannot solve it myself. I am not
> > sure how this should work when we try to print dead-code variable (i2)
> > while debugging -O2 (O3/Os) compiled executable. Jakub created that
> > test, he is in CC already.
>
> What does PREVENT_OPTIMIZATION do?  It probably needs to be made a bit
> stronger.

It seems that PREVENT_OPTIMIZATION affects only one test,
gcc.dg/guality/pr45882.c.  The other tests do not use the ATTRIBUTE_USED
macro.

> (It seems to just add some "used"?)
>
> > I will also look at aarch64 regstrap a bit later. But that job should
> > also be done for all other targets. Segher, could you test it on
> > rs6000?
>
> Do you have an account on the compile farm?  It has POWER7 (BE, 32/64),
> as well as POWER8 and POWER9 machines (both LE).
> https://cfarm.tetaneutral.net/

OK, I’ll use the farm.  I have already filled in the “new user” form.
The aarch64 run is still in progress; maybe next time I will use the farm
instead of qemu-system.

I forgot to mention in my previous email that with this patch we also
drop the dependencies between volatile asms and volatile mems.
We had an offline discussion with Alex, and it seems that since 2004 the
docs have never guaranteed such a dependency, so the user should add the
relevant constraints to the asm in all cases.  But somebody may have a
different opinion on this tricky point.

Roman
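
P.S. To make "add the relevant constraints" concrete, here is a minimal
sketch; it is not part of the patch.  The names fake_reg, store_then_asm
and asm_before_use are hypothetical, made up for illustration, and the
empty asm bodies stand in for real instructions.  The idea, as I
understand the docs, is that each asm names the value it depends on as an
operand, so the ordering no longer relies on the asm merely being
volatile.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a device register or any other volatile
   memory location.  */
static volatile uint32_t fake_reg;

/* Hypothetical helper: the asm lists fake_reg as a read-write memory
   operand ("+m"), so its dependency on the preceding volatile store is
   stated explicitly in the constraints.  */
static uint32_t
store_then_asm (uint32_t v)
{
  fake_reg = v;
  __asm__ volatile ("" : "+m" (fake_reg));
  return fake_reg;
}

/* Hypothetical helper: "+r" makes v both an input and an output of the
   asm, so the addition below uses the asm's output and cannot be moved
   above it.  */
static uint32_t
asm_before_use (uint32_t v)
{
  __asm__ volatile ("" : "+r" (v));
  return v + 1;
}
```

The empty asm bodies change nothing at run time, so store_then_asm (7)
returns 7 and asm_before_use (41) returns 42; only the dependencies seen
by the compiler differ.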