[PATCH] i8k: Tell gcc that *regs gets clobbered
More recent GCC caused the i8k driver to stop working, on Slackware compiler was upgraded from gcc-4.4.4 to gcc-4.5.1 after which it didn't work anymore, meaning the driver didn't load or gave total nonsensical output. As it turned out the asm(..) statement forgot to mention it modifies the *regs variable. Credits to Andi Kleen and Andreas Schwab for providing the fix. Signed-off-by: Jim Bos --- linux-2.6.36/drivers/char/i8k.c.ORIG2010-08-02 17:20:46.0 +0200 +++ linux-2.6.36/drivers/char/i8k.c 2010-11-13 11:35:11.0 +0100 @@ -141,7 +141,7 @@ "lahf\n\t" "shrl $8,%%eax\n\t" "andl $1,%%eax\n" - :"=a"(rc) + :"=a"(rc), "+m" (*regs) :"a"(regs) :"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory"); #else @@ -166,7 +166,8 @@ "movl %%edx,0(%%eax)\n\t" "lahf\n\t" "shrl $8,%%eax\n\t" - "andl $1,%%eax\n":"=a"(rc) + "andl $1,%%eax\n" + :"=a"(rc), "+m" (*regs) :"a"(regs) :"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory"); #endif
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 05:04 PM, Linus Torvalds wrote: > On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek wrote: >> >> I don't see any problems on the assembly level. i8k_smm is >> not inlined in this case and checks all 3 conditions. > > If it really is related to gcc not understanding that "*regs" has > changed due to the memory being an automatic variable, and passing in > "regs" itself as a pointer to that automatic variable together with > the "memory" clobber not being sufficient, than I think it's the lack > of inlining that will automatically hide the bug. > > (Side note: and I think this does show how much of a gcc bug it is not > to consider "memory" together with passing in a pointer to an asm to > always be a clobber). > > Because if it isn't inlined, then "regs" will be seen a a real pointer > to some external memory (the caller) rather than being optimized to > just be the auto structure on the stack. Because *mem is auto only > within the context of the caller. > > Which actually points to a possible simpler: > - remove the "+m" since it adds too much register pressure > - mark the i8k_smm() as "noinline" instead. > > Quite frankly, I'd hate to add even more crud to that inline asm (to > save/restore the registers manually). It's already not the prettiest > thing around. > > So does the attached patch work for everybody? > > Linus Hmm, that doesn't work. [ Not sure if you read to whole thread but initial workaround was to change the asm(..) to asm volatile(..) which did work. ] Jim.
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 06:44 PM, Jakub Jelinek wrote: > On Mon, Nov 15, 2010 at 06:36:06PM +0100, Jim Bos wrote: >> On 11/15/2010 12:37 PM, Andi Kleen wrote: >> See attached, note this is the vanilla 2.6.36 i8k.c (without any patch). >> And to be 100% sure, if I build this (make drivers/char/i8k.ko) it won't >> work. >> >> [ The i8k.i is rather big, even gzipped 80k, not sure if it'll bounce ] > > Please also say which exact gcc you are using. > > Note, I've compiled it with current 4.5 branch and made the function > always_inline and still didn't see any issues in the *.optimized dump, > regs.eax after the inline asm has always been compared to the constant > that has been stored into regs.eax before the inline asm. > > Jakub > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ # gcc -v Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper Target: i486-slackware-linux Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix --enable-checking=release --with-system-zlib --with-python-dir=/lib/python2.6/site-packages --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux --build=i486-slackware-linux --host=i486-slackware-linux Thread model: posix gcc version 4.5.1 (GCC) I'm re-reading this thread where I found the asm-> asm volatine suggestion: https://bbs.archlinux.org/viewtopic.php?pid=752099#p752099 but nobody there reported their gcc version (but apparently first people started complaining May 1st). _ Jim
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 07:08 PM, Linus Torvalds wrote: > On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos wrote: >> >> Hmm, that doesn't work. >> >> [ Not sure if you read to whole thread but initial workaround was to >> change the asm(..) to asm volatile(..) which did work. ] > > Since I have a different gcc than yours (and I'm not going to compile > my own), have you posted your broken .s file anywhere? In fact, with > the noinline (and the removal of the "+m" thing - iow just the patch > you tried), what does just the "i8k_smm" function assembly look like > for you after you've done a "make drivers/char/i8k.s"? > > If the asm just doesn't exist AT ALL, that's just odd. Because every > single call-site of i8k_smm() clearly looks at the return value. So > the volatile really shouldn't make any difference from that > standpoint. Odd. > >Linus > Attached version with plain 2.6.36 source and version with the committed patch, i.e with the '"+m" (*regs)' _ Jim .file "i8k.c" # GNU C (GCC) version 4.5.1 (i486-slackware-linux) # compiled by GNU C version 4.5.1, GMP version 5.0.1, MPFR version 2.4.2-p3, MPC version 0.8.2 # GGC heuristics: --param ggc-min-expand=81 --param ggc-min-heapsize=96817 # options passed: -nostdinc -I/usr/src/linux-2.6.36/arch/x86/include # -Iinclude -D__KERNEL__ -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 # -DCONFIG_AS_CFI_SECTIONS=1 -DMODULE -DKBUILD_STR(s)=#s # -DKBUILD_BASENAME=KBUILD_STR(i8k) -DKBUILD_MODNAME=KBUILD_STR(i8k) # -isystem /usr/lib/gcc/i486-slackware-linux/4.5.1/include -include # include/generated/autoconf.h -MD drivers/char/.i8k.s.d drivers/char/i8k.c # -m32 -msoft-float -mregparm=3 -mpreferred-stack-boundary=2 -march=i686 # -mtune=pentium3 -mtune=generic -mno-sse -mno-mmx -mno-sse2 -mno-3dnow # -auxbase-strip drivers/char/i8k.s -Os -Wall -Wundef -Wstrict-prototypes # -Wno-trigraphs -Werror-implicit-function-declaration -Wno-format-security # -Wno-sign-compare -Wframe-larger-than=1024 -Wdeclaration-after-statement # -Wno-pointer-sign -fno-strict-aliasing -fno-common # -fno-delete-null-pointer-checks -freg-struct-return -ffreestanding # -fno-asynchronous-unwind-tables -fno-stack-protector -fomit-frame-pointer # -fno-strict-overflow -fconserve-stack -fverbose-asm # options enabled: -falign-loops -fargument-alias -fauto-inc-dec # -fbranch-count-reg -fcaller-saves -fcprop-registers -fcrossjumping # -fcse-follow-jumps -fdefer-pop -fdwarf2-cfi-asm -fearly-inlining # -feliminate-unused-debug-types -fexpensive-optimizations # -fforward-propagate -ffunction-cse -fgcse -fgcse-lm # -fguess-branch-probability -fident -fif-conversion -fif-conversion2 # -findirect-inlining -finline -finline-functions # -finline-functions-called-once -finline-small-functions -fipa-cp # -fipa-pure-const -fipa-reference -fipa-sra -fira-share-save-slots # -fira-share-spill-slots -fivopts -fkeep-static-consts # -fleading-underscore -fmath-errno -fmerge-constants -fmerge-debug-strings # -fmove-loop-invariants -fomit-frame-pointer -foptimize-register-move # -foptimize-sibling-calls -fpeephole -fpeephole2 -freg-struct-return # -fregmove -freorder-blocks -freorder-functions -frerun-cse-after-loop # -fsched-critical-path-heuristic -fsched-dep-count-heuristic # -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic # -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic # -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fsigned-zeros # -fsplit-ivs-in-unroller -fsplit-wide-types -fthread-jumps # -ftoplevel-reorder -ftrapping-math -ftree-builtin-call-dce -ftree-ccp # -ftree-ch -ftree-copy-prop -ftree-copyrename -ftree-cselim -ftree-dce # -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre # -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize # -ftree-parallelize-loops= -ftree-phiprop -ftree-pre -ftree-pta # -ftree-reassoc -ftree-scev-cprop -ftree-sink -ftree-slp-vectorize # -ftree-sra -ftree-switch-conversion -ftree-ter -ftree-vect-loop-version # -ftree-vrp -funit-at-a-time -fvect-cost-model -fverbose-asm # -fzero-initialized-in-bss -m32 -m96bit-long-double -malign-stringops # -mfused-madd -mglibc -mieee-fp -mno-fancy-math-387 -mno-red-zone # -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs # Compiler executable checksum: 7ba2dc3c015559b9d16b297ee7f8d354 .text .type i8k_smm, @function i8k_smm: pushl %ebp# movl%eax, %ebp # regs, regs pushl %edi# pushl %esi# pushl %ebx# subl$8, %esp#, movl(%eax), %eax# regs_2(D)->eax, movl%eax, 4(%esp) #, %sfp movl%ebp, %eax # regs, #APP # 148 "drivers/char/i8k.c" 1 pushl %eax movl 0(%eax),%edx push %edx movl 4(%eax),%ebx
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 07:30 PM, Jim Bos wrote: > On 11/15/2010 07:08 PM, Linus Torvalds wrote: >> On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos wrote: >>> >>> Hmm, that doesn't work. >>> >>> [ Not sure if you read to whole thread but initial workaround was to >>> change the asm(..) to asm volatile(..) which did work. ] >> >> Since I have a different gcc than yours (and I'm not going to compile >> my own), have you posted your broken .s file anywhere? In fact, with >> the noinline (and the removal of the "+m" thing - iow just the patch >> you tried), what does just the "i8k_smm" function assembly look like >> for you after you've done a "make drivers/char/i8k.s"? >> >> If the asm just doesn't exist AT ALL, that's just odd. Because every >> single call-site of i8k_smm() clearly looks at the return value. So >> the volatile really shouldn't make any difference from that >> standpoint. Odd. >> >>Linus >> > > Attached version with plain 2.6.36 source and version with the committed > patch, i.e with the '"+m" (*regs)' > > > _ > Jim > > And I just tried with your noninline patch which results in exactly the same .s file as with plain 2.6.36 source, i.e. the noninline patch is not doing anything here. _ Jim
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 07:26 PM, Jakub Jelinek wrote: > On Mon, Nov 15, 2010 at 07:17:31PM +0100, Jim Bos wrote: >> # gcc -v >> Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs >> COLLECT_GCC=gcc >> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper >> Target: i486-slackware-linux >> Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib >> --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap >> --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix >> --enable-checking=release --with-system-zlib >> --with-python-dir=/lib/python2.6/site-packages >> --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp >> --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux >> --build=i486-slackware-linux --host=i486-slackware-linux >> Thread model: posix >> gcc version 4.5.1 (GCC) > > Does it have any patches applied? The gcc options look the same as what > I've been already trying earlier. > Thus, can you run gcc with those options on i8k.i and add -fverbose-asm > to make it easier to read and post i8k.s you get? > > Jakub > Slackware is typically not patching much (and I'm just using the pre-compiled binary). Here is the link to how it's built: http://slackware.osuosl.org/slackware-current/source/d/gcc/ there doesn't appear to be anything relevant changed. I already posted the .s files, plain 2.6.36 and the one with working patch, I =think= that's already using -fverbose-asm, at least that shows in the output. _ Jim
Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?
On 11/15/2010 08:51 PM, Jakub Jelinek wrote: > On Mon, Nov 15, 2010 at 11:21:30AM -0800, Linus Torvalds wrote: >> On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek wrote: >>> >>> Ah, the problem is that memory_identifier_string is only initialized in >>> ipa-reference.c's initialization, so it can be (and is in this case) NULL in >>> ipa-pure-const.c. >> >> Ok. And I guess you can verify that all versions of gcc do this >> correctly for "asm volatile"? > > Yes, reading 4.1/4.2/4.3/4.4/4.5/4.6 code ipa-pure-const.c handles > asm volatile correctly, in each case the function is no longer assumed to be > pure or const in the discovery (of course, user can still say the > function is const or pure). 4.0 and earlier didn't have ipa-pure-const.c. > > Using the simplified > > extern void abort (void); > > __attribute__((noinline)) int > foo (int *p) > { > int r; > asm ("movl $6, (%1)\n\txorl %0, %0" : "=r" (r) : "r" (p) : "memory"); > return r; > } > > int > main (void) > { > int p = 8; > if ((foo (&p) ? : p) != 6) > abort (); > return 0; > } > > testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using > -fno-ipa-reference, in 4.5 it is miscompiled always when optimizing > unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run > before ipa-reference) and in 4.6 this has been fixed by Honza when > doing ipa cleanups. > >> Because since we'll have to work around this problem in the kernel, I >> suspect the simplest solution is to remove the "+m" that causes >> register pressure problems, and then use "asm volatile" to work around >> the const-function bug. > > Yes. > > Jakub > Linus, In case you didn't already fixed this, here's the follow-up patch. --- The fix to work around the gcc miscompiling i8k.c to add "+m (*regs)" caused register pressure problems. Changing the 'asm' statement to 'asm volatile' instead should prevent that and works around the gcc bug as well. Signed-off-by: Jim Bos --- linux/drivers/char/i8k.c.ORIG 2010-11-15 21:04:19.0 +0100 +++ linux/drivers/char/i8k.c2010-11-15 21:02:32.0 +0100 @@ -119,7 +119,7 @@ int eax = regs->eax; #if defined(CONFIG_X86_64) - asm("pushq %%rax\n\t" + asm volatile("pushq %%rax\n\t" "movl 0(%%rax),%%edx\n\t" "pushq %%rdx\n\t" "movl 4(%%rax),%%ebx\n\t" @@ -141,11 +141,11 @@ "lahf\n\t" "shrl $8,%%eax\n\t" "andl $1,%%eax\n" - :"=a"(rc), "+m" (*regs) + :"=a"(rc) :"a"(regs) :"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory"); #else - asm("pushl %%eax\n\t" + asm volatile("pushl %%eax\n\t" "movl 0(%%eax),%%edx\n\t" "push %%edx\n\t" "movl 4(%%eax),%%ebx\n\t" @@ -167,7 +167,7 @@ "lahf\n\t" "shrl $8,%%eax\n\t" "andl $1,%%eax\n" - :"=a"(rc), "+m" (*regs) + :"=a"(rc) :"a"(regs) :"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory"); #endif