[PATCH] i8k: Tell gcc that *regs gets clobbered

2010-11-13 Thread Jim Bos


More recent GCC caused the i8k driver to stop working, on Slackware
compiler was upgraded from gcc-4.4.4 to gcc-4.5.1 after which it
didn't work anymore, meaning the driver didn't load or gave total
nonsensical output.
As it turned out the asm(..) statement forgot to mention it modifies
the *regs variable.
Credits to Andi Kleen  and Andreas Schwab
 for providing the fix.

Signed-off-by: Jim Bos 

--- linux-2.6.36/drivers/char/i8k.c.ORIG2010-08-02 17:20:46.0 
+0200
+++ linux-2.6.36/drivers/char/i8k.c 2010-11-13 11:35:11.0 +0100
@@ -141,7 +141,7 @@
"lahf\n\t"
"shrl $8,%%eax\n\t"
"andl $1,%%eax\n"
-   :"=a"(rc)
+   :"=a"(rc), "+m" (*regs)
:"a"(regs)
:"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
 #else
@@ -166,7 +166,8 @@
"movl %%edx,0(%%eax)\n\t"
"lahf\n\t"
"shrl $8,%%eax\n\t"
-   "andl $1,%%eax\n":"=a"(rc)
+   "andl $1,%%eax\n"
+   :"=a"(rc), "+m" (*regs)
:"a"(regs)
:"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
 #endif


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 05:04 PM, Linus Torvalds wrote:
> On Mon, Nov 15, 2010 at 3:16 AM, Jakub Jelinek  wrote:
>>
>> I don't see any problems on the assembly level.  i8k_smm is
>> not inlined in this case and checks all 3 conditions.
> 
> If it really is related to gcc not understanding that "*regs" has
> changed due to the memory being an automatic variable, and passing in
> "regs" itself as a pointer to that automatic variable together with
> the "memory" clobber not being sufficient, than I think it's the lack
> of inlining that will automatically hide the bug.
> 
> (Side note: and I think this does show how much of a gcc bug it is not
> to consider "memory" together with passing in a pointer to an asm to
> always be a clobber).
> 
> Because if it isn't inlined, then "regs" will be seen a a real pointer
> to some external memory (the caller) rather than being optimized to
> just be the auto structure on the stack. Because *mem is auto only
> within the context of the caller.
> 
> Which actually points to a possible simpler:
>  - remove the "+m" since it adds too much register pressure
>  - mark the i8k_smm() as "noinline" instead.
> 
> Quite frankly, I'd hate to add even more crud to that inline asm (to
> save/restore the registers manually). It's already not the prettiest
> thing around.
> 
> So does the attached patch work for everybody?
> 
>  Linus

Hmm, that doesn't work.

[ Not sure if you read to whole thread but initial workaround was to
change the asm(..) to asm volatile(..) which did work. ]

Jim.


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 06:44 PM, Jakub Jelinek wrote:
> On Mon, Nov 15, 2010 at 06:36:06PM +0100, Jim Bos wrote:
>> On 11/15/2010 12:37 PM, Andi Kleen wrote:
>> See attached, note this is the vanilla 2.6.36 i8k.c (without any patch).
>> And to be 100% sure, if I build this (make drivers/char/i8k.ko) it won't
>> work.
>>
>> [ The i8k.i is rather big, even gzipped 80k, not sure if it'll bounce ]
> 
> Please also say which exact gcc you are using.
> 
> Note, I've compiled it with current 4.5 branch and made the function
> always_inline and still didn't see any issues in the *.optimized dump,
> regs.eax after the inline asm has always been compared to the constant
> that has been stored into regs.eax before the inline asm.
> 
>   Jakub
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

 # gcc -v
Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper
Target: i486-slackware-linux
Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib
--mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap
--enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
--enable-checking=release --with-system-zlib
--with-python-dir=/lib/python2.6/site-packages
--disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp
--with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux
--build=i486-slackware-linux --host=i486-slackware-linux
Thread model: posix
gcc version 4.5.1 (GCC)

I'm re-reading this thread where I found the asm-> asm volatine suggestion:
  https://bbs.archlinux.org/viewtopic.php?pid=752099#p752099
but nobody there reported their gcc version (but apparently first
people started complaining May 1st).

_
Jim


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:08 PM, Linus Torvalds wrote:
> On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos  wrote:
>>
>> Hmm, that doesn't work.
>>
>> [ Not sure if you read to whole thread but initial workaround was to
>> change the asm(..) to asm volatile(..) which did work. ]
> 
> Since I have a different gcc than yours (and I'm not going to compile
> my own), have you posted your broken .s file anywhere? In fact, with
> the noinline (and the removal of the "+m" thing - iow just the patch
> you tried), what does just the "i8k_smm" function assembly look like
> for you after you've done a "make drivers/char/i8k.s"?
> 
> If the asm just doesn't exist AT ALL, that's just odd. Because every
> single call-site of i8k_smm() clearly looks at the return value. So
> the volatile really shouldn't make any difference from that
> standpoint. Odd.
> 
>Linus
> 

Attached version with plain 2.6.36 source and version with the committed
patch, i.e with the '"+m" (*regs)'


_
Jim


.file   "i8k.c"
# GNU C (GCC) version 4.5.1 (i486-slackware-linux)
#   compiled by GNU C version 4.5.1, GMP version 5.0.1, MPFR version 
2.4.2-p3, MPC version 0.8.2
# GGC heuristics: --param ggc-min-expand=81 --param ggc-min-heapsize=96817
# options passed:  -nostdinc -I/usr/src/linux-2.6.36/arch/x86/include
# -Iinclude -D__KERNEL__ -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
# -DCONFIG_AS_CFI_SECTIONS=1 -DMODULE -DKBUILD_STR(s)=#s
# -DKBUILD_BASENAME=KBUILD_STR(i8k) -DKBUILD_MODNAME=KBUILD_STR(i8k)
# -isystem /usr/lib/gcc/i486-slackware-linux/4.5.1/include -include
# include/generated/autoconf.h -MD drivers/char/.i8k.s.d drivers/char/i8k.c
# -m32 -msoft-float -mregparm=3 -mpreferred-stack-boundary=2 -march=i686
# -mtune=pentium3 -mtune=generic -mno-sse -mno-mmx -mno-sse2 -mno-3dnow
# -auxbase-strip drivers/char/i8k.s -Os -Wall -Wundef -Wstrict-prototypes
# -Wno-trigraphs -Werror-implicit-function-declaration -Wno-format-security
# -Wno-sign-compare -Wframe-larger-than=1024 -Wdeclaration-after-statement
# -Wno-pointer-sign -fno-strict-aliasing -fno-common
# -fno-delete-null-pointer-checks -freg-struct-return -ffreestanding
# -fno-asynchronous-unwind-tables -fno-stack-protector -fomit-frame-pointer
# -fno-strict-overflow -fconserve-stack -fverbose-asm
# options enabled:  -falign-loops -fargument-alias -fauto-inc-dec
# -fbranch-count-reg -fcaller-saves -fcprop-registers -fcrossjumping
# -fcse-follow-jumps -fdefer-pop -fdwarf2-cfi-asm -fearly-inlining
# -feliminate-unused-debug-types -fexpensive-optimizations
# -fforward-propagate -ffunction-cse -fgcse -fgcse-lm
# -fguess-branch-probability -fident -fif-conversion -fif-conversion2
# -findirect-inlining -finline -finline-functions
# -finline-functions-called-once -finline-small-functions -fipa-cp
# -fipa-pure-const -fipa-reference -fipa-sra -fira-share-save-slots
# -fira-share-spill-slots -fivopts -fkeep-static-consts
# -fleading-underscore -fmath-errno -fmerge-constants -fmerge-debug-strings
# -fmove-loop-invariants -fomit-frame-pointer -foptimize-register-move
# -foptimize-sibling-calls -fpeephole -fpeephole2 -freg-struct-return
# -fregmove -freorder-blocks -freorder-functions -frerun-cse-after-loop
# -fsched-critical-path-heuristic -fsched-dep-count-heuristic
# -fsched-group-heuristic -fsched-interblock -fsched-last-insn-heuristic
# -fsched-rank-heuristic -fsched-spec -fsched-spec-insn-heuristic
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column -fsigned-zeros
# -fsplit-ivs-in-unroller -fsplit-wide-types -fthread-jumps
# -ftoplevel-reorder -ftrapping-math -ftree-builtin-call-dce -ftree-ccp
# -ftree-ch -ftree-copy-prop -ftree-copyrename -ftree-cselim -ftree-dce
# -ftree-dominator-opts -ftree-dse -ftree-forwprop -ftree-fre
# -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize
# -ftree-parallelize-loops= -ftree-phiprop -ftree-pre -ftree-pta
# -ftree-reassoc -ftree-scev-cprop -ftree-sink -ftree-slp-vectorize
# -ftree-sra -ftree-switch-conversion -ftree-ter -ftree-vect-loop-version
# -ftree-vrp -funit-at-a-time -fvect-cost-model -fverbose-asm
# -fzero-initialized-in-bss -m32 -m96bit-long-double -malign-stringops
# -mfused-madd -mglibc -mieee-fp -mno-fancy-math-387 -mno-red-zone
# -mno-sse4 -mpush-args -msahf -mtls-direct-seg-refs

# Compiler executable checksum: 7ba2dc3c015559b9d16b297ee7f8d354

.text
.type   i8k_smm, @function
i8k_smm:
pushl   %ebp#
movl%eax, %ebp  # regs, regs
pushl   %edi#
pushl   %esi#
pushl   %ebx#
subl$8, %esp#,
movl(%eax), %eax# regs_2(D)->eax,
movl%eax, 4(%esp)   #, %sfp
movl%ebp, %eax  # regs,
#APP
# 148 "drivers/char/i8k.c" 1
pushl %eax
movl 0(%eax),%edx
push %edx
movl 4(%eax),%ebx

Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:30 PM, Jim Bos wrote:
> On 11/15/2010 07:08 PM, Linus Torvalds wrote:
>> On Mon, Nov 15, 2010 at 9:40 AM, Jim Bos  wrote:
>>>
>>> Hmm, that doesn't work.
>>>
>>> [ Not sure if you read to whole thread but initial workaround was to
>>> change the asm(..) to asm volatile(..) which did work. ]
>>
>> Since I have a different gcc than yours (and I'm not going to compile
>> my own), have you posted your broken .s file anywhere? In fact, with
>> the noinline (and the removal of the "+m" thing - iow just the patch
>> you tried), what does just the "i8k_smm" function assembly look like
>> for you after you've done a "make drivers/char/i8k.s"?
>>
>> If the asm just doesn't exist AT ALL, that's just odd. Because every
>> single call-site of i8k_smm() clearly looks at the return value. So
>> the volatile really shouldn't make any difference from that
>> standpoint. Odd.
>>
>>Linus
>>
> 
> Attached version with plain 2.6.36 source and version with the committed
> patch, i.e with the '"+m" (*regs)'
> 
> 
> _
> Jim
> 
> 

And I just tried with your noninline patch which results in exactly the
same .s file as with plain 2.6.36 source, i.e. the noninline patch is
not doing anything here.

_
Jim




Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 07:26 PM, Jakub Jelinek wrote:
> On Mon, Nov 15, 2010 at 07:17:31PM +0100, Jim Bos wrote:
>>  # gcc -v
>> Reading specs from /usr/lib/gcc/i486-slackware-linux/4.5.1/specs
>> COLLECT_GCC=gcc
>> COLLECT_LTO_WRAPPER=/usr/libexec/gcc/i486-slackware-linux/4.5.1/lto-wrapper
>> Target: i486-slackware-linux
>> Configured with: ../gcc-4.5.1/configure --prefix=/usr --libdir=/usr/lib
>> --mandir=/usr/man --infodir=/usr/info --enable-shared --enable-bootstrap
>> --enable-languages=ada,c,c++,fortran,java,objc --enable-threads=posix
>> --enable-checking=release --with-system-zlib
>> --with-python-dir=/lib/python2.6/site-packages
>> --disable-libunwind-exceptions --enable-__cxa_atexit --enable-libssp
>> --with-gnu-ld --verbose --with-arch=i486 --target=i486-slackware-linux
>> --build=i486-slackware-linux --host=i486-slackware-linux
>> Thread model: posix
>> gcc version 4.5.1 (GCC)
> 
> Does it have any patches applied?  The gcc options look the same as what
> I've been already trying earlier.
> Thus, can you run gcc with those options on i8k.i and add -fverbose-asm
> to make it easier to read and post i8k.s you get?
> 
>   Jakub
> 

Slackware is typically not patching much (and I'm just using the
pre-compiled binary).  Here is the link to how it's built:
 http://slackware.osuosl.org/slackware-current/source/d/gcc/
there doesn't appear to be anything relevant changed.

I already posted the .s files, plain 2.6.36 and the one with working
patch, I =think= that's already using -fverbose-asm, at least that shows
in the output.

_
Jim





Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-15 Thread Jim Bos
On 11/15/2010 08:51 PM, Jakub Jelinek wrote:
> On Mon, Nov 15, 2010 at 11:21:30AM -0800, Linus Torvalds wrote:
>> On Mon, Nov 15, 2010 at 11:12 AM, Jakub Jelinek  wrote:
>>>
>>> Ah, the problem is that memory_identifier_string is only initialized in
>>> ipa-reference.c's initialization, so it can be (and is in this case) NULL in
>>> ipa-pure-const.c.
>>
>> Ok. And I guess you can verify that all versions of gcc do this
>> correctly for "asm volatile"?
> 
> Yes, reading 4.1/4.2/4.3/4.4/4.5/4.6 code ipa-pure-const.c handles
> asm volatile correctly, in each case the function is no longer assumed to be
> pure or const in the discovery (of course, user can still say the
> function is const or pure). 4.0 and earlier didn't have ipa-pure-const.c.
> 
> Using the simplified
> 
> extern void abort (void);
> 
> __attribute__((noinline)) int
> foo (int *p)
> {
>   int r;
>   asm ("movl $6, (%1)\n\txorl %0, %0" : "=r" (r) : "r" (p) : "memory");
>   return r;
> }
> 
> int
> main (void)
> {
>   int p = 8;
>   if ((foo (&p) ? : p) != 6)
> abort ();
>   return 0;
> }
> 
> testcase shows that in 4.1/4.2/4.3/4.4 this is miscompiled only when using
> -fno-ipa-reference, in 4.5 it is miscompiled always when optimizing
> unless -fno-ipa-pure-const (as 4.5 added local-pure-const pass which is run
> before ipa-reference) and in 4.6 this has been fixed by Honza when
> doing ipa cleanups.
> 
>> Because since we'll have to work around this problem in the kernel, I
>> suspect the simplest solution is to remove the "+m" that causes
>> register pressure problems, and then use "asm volatile" to work around
>> the const-function bug.
> 
> Yes.
> 
>   Jakub
> 

Linus,

In case you didn't already fixed this, here's the follow-up patch.
---

The fix to work around the gcc miscompiling i8k.c to add "+m (*regs)"
caused register pressure problems. Changing the 'asm' statement to
'asm volatile' instead should prevent that and works around the gcc
bug as well.

Signed-off-by: Jim Bos 

--- linux/drivers/char/i8k.c.ORIG   2010-11-15 21:04:19.0 +0100
+++ linux/drivers/char/i8k.c2010-11-15 21:02:32.0 +0100
@@ -119,7 +119,7 @@
int eax = regs->eax;
 
 #if defined(CONFIG_X86_64)
-   asm("pushq %%rax\n\t"
+   asm volatile("pushq %%rax\n\t"
"movl 0(%%rax),%%edx\n\t"
"pushq %%rdx\n\t"
"movl 4(%%rax),%%ebx\n\t"
@@ -141,11 +141,11 @@
"lahf\n\t"
"shrl $8,%%eax\n\t"
"andl $1,%%eax\n"
-   :"=a"(rc), "+m" (*regs)
+   :"=a"(rc)
:"a"(regs)
:"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
 #else
-   asm("pushl %%eax\n\t"
+   asm volatile("pushl %%eax\n\t"
"movl 0(%%eax),%%edx\n\t"
"push %%edx\n\t"
"movl 4(%%eax),%%ebx\n\t"
@@ -167,7 +167,7 @@
"lahf\n\t"
"shrl $8,%%eax\n\t"
"andl $1,%%eax\n"
-   :"=a"(rc), "+m" (*regs)
+   :"=a"(rc)
:"a"(regs)
:"%ebx", "%ecx", "%edx", "%esi", "%edi", "memory");
 #endif