>>> On 23.10.14 at 15:42, <ja...@redhat.com> wrote:
> On Wed, Oct 22, 2014 at 04:28:52PM +0100, Jan Beulich wrote:
>> I noticed the issue with 4.9.1 (in that x86 Linux'es
>> this_cpu_read_stable() no longer does what the comment preceding
>> its definition promises), and the example below demonstrates this in
>> a simplified (but contrived) way. I just now verified that trunk has
>> the same issue; 4.8.3 still folds redundant ones as expected. Is this
>> known, or possibly even intended (in which case I'd be curious as to
>> what the reasons are, and how the functionality Linux wants can be
>> gained back)?
>
> This changed because of my http://gcc.gnu.org/PR60663 fix.
> In your testcase the inline asm doesn't have more than one output
> (which IMNSHO is very much desirable not to CSE), and doesn't have explicit
> clobbers either, but happens to have implicit clobbers (fprs and cc),
> so CSE still could generate invalid code out of that without the fix
> (if it decided to materialize the inline asm somewhere, instead of reusing
> existing inline asm).
> So, if we e.g. weakened the PR60663 fix so that it only bails out
> if the inline asm contains more than one output. we'd need to fix up CSE, so
> that it analyzes all the clobbers and doesn't consider asms as equivalent
> just based on the ASM_OPERANDS, it needs to have the same clobbers too,
> and either doesn't try to materialize it out without preexisting insn
> if it has any clobbers.

So why would clobbers in general matter? I can see that memory clobbers
need special care, but any others? If two asm()-s differ only in the
registers they clobber, surely this is (1) a programmer mistake and (2)
irrelevant which of the two forms gets picked.

I first thought hard register variables could matter here, but the (x86)
code generated (at -O2) for

int test1(int x) {
	register int y asm("edx");
	int z = y;

	asm("" ::: "edx");
	return z + y + x;
}

register int y asm("ebx");

int test2(int x) {
	int z = y;

	asm("" ::: "ebx");
	return z + y + x;
}

shows that the clobbers don't have the theoretically possible effect of
forcing y to be re-evaluated after the asm()-s (i.e. both cases get
translated as "return z * 2 + x").

Jan
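
For reference, the kind of asm() under discussion (not volatile, a single
output, no explicit clobbers) can be sketched as below. This is only an
illustration under stated assumptions, not the Linux definition of
this_cpu_read_stable(): read_stable(), read_twice() and check_read_twice()
are hypothetical names, and the empty template with a matching constraint
merely passes the input through so that the snippet builds on any target
supported by GCC or Clang.

```c
#include <assert.h>

/* Hypothetical stand-in for a "stable read" accessor: the asm is not
 * volatile, has exactly one output and no explicit clobbers, so its
 * result depends only on its input operand. */
static inline int read_stable(int in)
{
	int out;

	/* Empty template; the "0" constraint ties the input to the output
	 * register, so out == in. __asm__ is used instead of asm so the
	 * snippet also compiles with strict -std=c99/-std=c11. */
	__asm__("" : "=r" (out) : "0" (in));
	return out;
}

/* Two identical asm statements with the same input: candidates for
 * being folded into one by CSE. Whether that happens depends on the
 * compiler version, as discussed above. */
int read_twice(int x)
{
	return read_stable(x) + read_stable(x);
}

/* The computed value is the same whether or not the asm is folded. */
void check_read_twice(void)
{
	assert(read_twice(5) == 10);
	assert(read_twice(-3) == -6);
}
```

Compiling this with -O2 -S and counting how many times the (empty) asm is
emitted in read_twice() is one way to compare the behaviour of different
compiler versions.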