from:"jpoimboe at redhat dot com"

[Bug c/70604] New: switch statement optimization creates dead code

2016-04-08 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70604

Bug ID: 70604
   Summary: switch statement optimization creates dead code
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
  Target Milestone: ---

Created attachment 38226
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38226&action=edit
iscsi_target.i.gz

The linux kernel has a new tool named "objtool" which follows all possible code
paths for every .o file, looking for abnormalities.  In rare cases it has
stumbled on a gcc optimization issue related to switch statements, which leaves
some dead code (unreachable instructions) laying around, resulting in increased
code size and confusing assembler code.

Note: For some reason the likelihood of seeing this problem seems to have
diminished from gcc 5.3.1 to gcc 6.0.  We've seen the problem in three separate
object files in the kernel with gcc 5.3.1, but with gcc 6.0 there's only one
known occurrence of this issue.  Further, even in that one case, the size of
the dead code has decreased from several instructions in 5.3.1 to only a single
unreachable instruction in 6.0.

Here are the relevant assembler code excerpts, using gcc 6.0:

iscsit_handle_task_mgt_cmd:

[...snip...]

jmp *.L6212+64(%rip)#
.section.rodata
.align 8
.align 4
.L6212:
.quad   .L6211
.quad   .L6070
.quad   .L6214
.quad   .L6214
.quad   .L6214
.quad   .L6214
.quad   .L6082
.quad   .L6089
.quad   .L6096
.text

[...snip...]

jmp .L6196  #
.L6211:
movl$8, %esi#, _315
.L6213:
movq$.LC48, %rdi#,


There's an indirect jump instruction in .text, along with a jump table in
.rodata, which is a common pattern for switch statements.  But this one's a
little different than the normal pattern: the indirect jump is hard-coded to
use a single entry in the jump table.  The other jump table entries are unused.
 Further, the code referenced in one of the entries (.L6211) is dead code and
completely unreachable from anywhere in the function.

Note that the -fsanitize=unreachable option is enabled, but this seems
unrelated: __builtin_unreachable() doesn't appear to have been used for this
code path, and the unreachable code block doesn't have a call to
__ubsan_handle_builtin_unreachable.

So to summarize, the issues are:

1) dead code in .text (in this case, the function only has one unreachable
instruction)

2) mostly unused (and completely unnecessary) jump table in .rodata

3) unnecessary indirect jump (a direct jump could be used instead)


$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.0.0/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function
--with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.0.0 20160311 (Red Hat 6.0.0-0.17) (GCC) 

$ gcc -Wp,-MD,drivers/target/iscsi/.iscsi_target.o.d  -nostdinc
-I./arch/x86/include -Iarch/x86/include/generated/uapi
-Iarch/x86/include/generated  -Iinclude -I./arch/x86/include/uapi
-Iarch/x86/include/generated/uapi -I./include/uapi -Iinclude/generated/uapi
-include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -mno-sse
-mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1
-mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup
-mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time
-maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_SSSE3=1
-DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -DCONFIG_AS_SHA1_NI=1
-DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare -fno-asynchronous-unwind-tables
-fno-delete-null-pointer-checks -O2 --param=allow-store-data-races=0
-Wframe-larger-than=1024 -fno-stack-protector -Wno-unused-but-set-variable
-fomit-frame-pointer -fno-var-tracking-assignments
-Wdeclaration-after-statement -Wno-pointer-sign -fno-st

[Bug c/70604] switch statement optimization creates dead code

2016-04-08 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70604

--- Comment #1 from Josh Poimboeuf  ---
Created attachment 38227
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38227&action=edit
Linux .config

[Bug tree-optimization/70604] switch statement optimization creates dead code

2016-04-12 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70604

--- Comment #3 from Josh Poimboeuf  ---
Hi Richard, thanks for looking at it!

(In reply to Richard Biener from comment #2)
> Are the cases you still see indirect jumps to only one active case or
> is it just that if()s with multiple cases would have avoided the dead
> code?

I don't quite understand the second part of your question, but maybe this
information will answer it.

With GCC 6 I've only seen one occurrence of this issue, which is the one for
which I posted the assembler and .i file above.  It has an indirect jump which
is hard-coded to use only one entry in the jump table, as shown in the
assembler code above.

Note that the C code has two switch statements, which seem to correspond to the
two "normal" cases of indirect jump table patterns.  But then there's the third
unusual indirect jump, shown above, which also corresponds to one of the two
switch statements -- so there are two jump tables for a single switch
statement, where one of the tables appears to be optimized for a single case. 
Hopefully I'm making sense :-)

With GCC 5, the other occurrences of this issue were very similar, with switch
statements and indirect jumps to a single entry in the jump table.

[Bug c/70646] Corrupt truncated function

2016-04-13 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #2 from Josh Poimboeuf  ---
$ gcc -Wp,-MD,drivers/scsi/qla2xxx/.qla_attr.o.d  -nostdinc -isystem
/usr/lib/gcc/x86_64-redhat-linux/5.3.1/include -I./arch/x86/include
-Iarch/x86/include/generated/uapi -Iarch/x86/include/generated  -Iinclude
-I./arch/x86/include/uapi -Iarch/x86/include/generated/uapi -I./include/uapi
-Iinclude/generated/uapi -include ./include/linux/kconfig.h -D__KERNEL__ -Wall
-Wundef -Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -std=gnu89 -mno-sse
-mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m64 -falign-jumps=1 -falign-loops=1
-mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3 -mskip-rax-setup
-mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time
-maccumulate-outgoing-args -DCONFIG_X86_X32_ABI -DCONFIG_AS_CFI=1
-DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1
-DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1
-DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe -Wno-sign-compare
-fno-asynchronous-unwind-tables -fno-delete-null-pointer-checks -Os
-Wno-maybe-uninitialized --param=allow-store-data-races=0
-Wframe-larger-than=2048 -fno-stack-protector -Wno-unused-but-set-variable
-fno-omit-frame-pointer -fno-optimize-sibling-calls
-fno-var-tracking-assignments -fno-inline-functions-called-once
-Wdeclaration-after-statement -Wno-pointer-sign -fno-strict-overflow
-fconserve-stack -Werror=implicit-int -Werror=strict-prototypes
-Werror=date-time -Werror=incompatible-pointer-types -DCC_HAVE_ASM_GOTO   
-D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(qla_attr)" 
-D"KBUILD_MODNAME=KBUILD_STR(qla2xxx)" -c -o drivers/scsi/qla2xxx/qla_attr.o
drivers/scsi/qla2xxx/qla_attr.c

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/5.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--disable-libgcj --with-default-libstdcxx-abi=gcc4-compatible --with-isl
--enable-libmpx --enable-gnu-indirect-function --with-tune=generic
--with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 5.3.1 20151207 (Red Hat 5.3.1-2) (GCC)

[Bug tree-optimization/70604] switch statement optimization creates dead code

2016-04-13 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70604

Josh Poimboeuf  changed:

   What|Removed |Added

 CC||rguenth at gcc dot gnu.org

--- Comment #4 from Josh Poimboeuf  ---
Richard, just realized you weren't on CC for my response to your question.

[Bug c/70646] Corrupt truncated function

2016-04-13 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #1 from Josh Poimboeuf  ---
Created attachment 38256
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38256&action=edit
Linux kernel config

[Bug c/70646] New: Corrupt truncated function

2016-04-13 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

Bug ID: 70646
   Summary: Corrupt truncated function
   Product: gcc
   Version: 5.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
  Target Milestone: ---

Created attachment 38255
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38255&action=edit
qla_attr.i.gz

The linux kernel has a new tool named "objtool" which follows all possible code
paths for every .o file, looking for abnormalities.  In one rare case it has
discovered a corrupt truncated function.

From the disassembly of drivers/scsi/qla2xxx/qla_attr.o:

2f53 :
2f53:   55  push   %rbp
2f54:   48 89 e5mov%rsp,%rbp

2f57 :
2f57:   55  push   %rbp
2f58:   b9 e8 00 00 00  mov$0xe8,%ecx
2f5d:   48 89 e5mov%rsp,%rbp
...

Note that qla2x00_get_host_fabric_name() is inexplicably truncated after
setting up the frame pointer.  It falls through to the next function, which is
very bad.

I can recreate it with gcc 5.3.1 or gcc 6.0 on the upstream Linux kernel at tag
v4.6-rc3.

The call chain which appears to trigger the problem is:

qla2x00_get_host_fabric_name()
  wwn_to_u64()
get_unaligned_be64()
  be64_to_cpup()
__be64_to_cpup()

It occurs with the combination of the following two recent Linux commits:

- bc27fb68aaad ("include/uapi/linux/byteorder, swab: force inlining of some
byteswap operations")
- ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access")


The gzipped .i file is attached.  I'll also attach the kernel .config file.

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

2016-04-14 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #13 from Josh Poimboeuf  ---
So if I understand correctly, some reachable code is incorrectly getting marked
unreachable and then getting discarded.

Interestingly, the function's epilogue (frame pointer restore) and return
instruction are also getting discarded.  Can you tell if that will always be
the case when this bug triggers?

If so, that should make it possible for objtool to reliably detect the bug.

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

2016-04-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #17 from Josh Poimboeuf  ---
(In reply to Richard Biener from comment #14)
> (In reply to Josh Poimboeuf from comment #13)
> > Interestingly, the function's epilogue (frame pointer restore) and return
> > instruction are also getting discarded.  Can you tell if that will always be
> > the case when this bug triggers?
> 
> I don't think we can rely on that.  The path could simply fall thru to
> a random instruction which is still inside the function.  Say, if it was
> 
>   if (x)
> ...
>   else
> ...
>   foo ();
> 
> then the arm marked unreachable would simply disappear.

I tried doing that by putting the offending wwn_to_u64() call in an if
statement, but then the bug disappeared.

In fact, adding *any* code before the call seems to make the bug disappear.

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

2016-04-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #20 from Josh Poimboeuf  ---
Thanks very much to everyone who has looked into this so far.  It would be very
helpful to get answers to the following questions, so we can understand the
impact to the kernel:

1) Is there a reliable way to avoid the bug, either in code or with a gcc flag?

2) Is there a reliable way to detect the bug by looking at the object code?

[Bug ipa/70646] [4.9/5/6/7 Regression] Corrupt truncated function

2016-04-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646

--- Comment #24 from Josh Poimboeuf  ---
(In reply to Martin Jambor from comment #23)
> (In reply to Josh Poimboeuf from comment #20)
> > Thanks very much to everyone who has looked into this so far.  It would be
> > very helpful to get answers to the following questions, so we can understand
> > the impact to the kernel:
> > 
> > 1) Is there a reliable way to avoid the bug, either in code or with a gcc
> > flag?
> 
> If you mean in general, then unfortunately no, except for disabling
> inlining altogether and removing all always_inline's.  In addition to
> the main bug, I found out that I check --param ipa-max-agg-items only
> after incrementing it, so even setting that to zero does not help.
> I'll prepare a patch for that too.

Yes, I'm looking for a general way to either prevent or try to detect potential
other cases of the bug throughout the entire kernel.

Can it only occur with the use of __builtin_constant_p(exp) by an inline
function (where exp is a constant)?

[Bug c/81813] New: Inefficient stack pointer adjustment

2017-08-10 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81813

Bug ID: 81813
   Summary: Inefficient stack pointer adjustment
   Product: gcc
   Version: 7.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
  Target Milestone: ---

Created attachment 41968
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41968&action=edit
fs_pin.i

With GCC 7.1, the kernel's object code static analysis tool (objtool) found
this unusual method of adjusting the stack pointer in the kernel:

   2cc:   48 8d 4c 24 08  lea0x8(%rsp),%rcx
   2d1:   48 89 ccmov%rcx,%rsp

The value in %rcx was never used afterwards.  It would be faster and more
straightforward to just add 8 to %rsp.

To recreate:

  gcc -mno-sse -mpreferred-stack-boundary=3 -O2 -c -o fs_pin.o fs_pin.i

[Bug c/82221] New: internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

Bug ID: 82221
   Summary: internal compiler error: in print_reg, at
config/i386/i386.c:17656
   Product: gcc
   Version: 7.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
  Target Milestone: ---

When compiling a patch for the Linux kernel, GCC crashes with the following
error:

gcc -Wp,-MD,arch/x86/events/.core.o.d  -nostdinc -isystem
/usr/lib/gcc/x86_64-redhat-linux/7/include -I./arch/x86/include
-I./arch/x86/include/generated  -I./include -I./arch/x86/include/uapi
-I./arch/x86/include/generated/uapi -I./include/uapi -I./include/generated/uapi
-include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-fshort-wchar -Werror-implicit-function-declaration -Wno-format-security
-std=gnu89 -fno-PIE -mno-sse -mno-mmx -mno-sse2 -mno-3dnow -mno-avx -m32
-msoft-float -mregparm=3 -freg-struct-return -fno-pic
-mpreferred-stack-boundary=2 -march=i686 -Wa,-mtune=generic32 -ffreestanding
-DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -DCONFIG_AS_CFI_SECTIONS=1
-DCONFIG_AS_SSSE3=1 -DCONFIG_AS_CRC32=1 -DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1
-DCONFIG_AS_AVX512=1 -DCONFIG_AS_SHA1_NI=1 -DCONFIG_AS_SHA256_NI=1 -pipe
-Wno-sign-compare -fno-asynchronous-unwind-tables
-fno-delete-null-pointer-checks -Wno-frame-address -Wno-format-truncation
-Wno-format-overflow -Wno-int-in-bool-context -O2
--param=allow-store-data-races=0 -DCC_HAVE_ASM_GOTO -fno-reorder-blocks
-fno-ipa-cp-clone -fno-partial-inlining -Wframe-larger-than=1024
-fno-stack-protector -Wno-unused-but-set-variable -Wno-unused-const-variable
-fno-omit-frame-pointer -fno-optimize-sibling-calls
-fno-var-tracking-assignments -pg -mfentry -DCC_USING_FENTRY
-fno-inline-functions-called-once -Wdeclaration-after-statement
-Wno-pointer-sign -fno-strict-overflow -fconserve-stack -Werror=implicit-int
-Werror=strict-prototypes -Werror=date-time -Werror=incompatible-pointer-types
-Werror=designated-init-DKBUILD_BASENAME='"core"' 
-DKBUILD_MODNAME='"core"' -c -o arch/x86/events/core.o arch/x86/events/core.c
arch/x86/events/core.c: In function ‘x86_perf_event_update’:
arch/x86/events/core.c:108:1: internal compiler error: in print_reg, at
config/i386/i386.c:17656
 }
 ^
Please submit a full bug report,
with preprocessed source if appropriate.
See <http://bugzilla.redhat.com/bugzilla> for instructions.
{standard input}: Assembler messages:
{standard input}: Warning: end of file not at end of a line; newline inserted
{standard input}:1573: Error: unbalanced parenthesis in operand 2.
Preprocessed source stored into /tmp/ccydezQe.out file, please attach this to
your bugreport.

The code being compiled is at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jpoimboe/linux.git
clang-sp-clobber

I will also attach the kernel .config file which triggers this bug.

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #1 from Josh Poimboeuf  ---
Created attachment 42179
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42179&action=edit
preprocessed source

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #2 from Josh Poimboeuf  ---
Created attachment 42180
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42180&action=edit
kernel config file

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #3 from Josh Poimboeuf  ---
This has been seen on Fedora GCC 7.1.1 and Debian GCC 6.2.0-3.

This bug can be recreated with the above kernel git branch and the attached
.config by typing "make ARCH=i386 arch/x86/events/core.o".

I've also attached the preprocessed source which GCC produced when it reported
the error.

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #7 from Josh Poimboeuf  ---
Putting "sp" in the clobbers list is something that was suggested to me on the
GCC mailing list a while back.  And, other than this rare bug, it seems to do
exactly what we want, which is, force GCC to save the frame pointer before
inserting the inline asm.  We need that to happen when we put a "call"
instruction inside the inline asm, so that we can get a reliable stack trace
from the called function.

I know that putting "sp" in the clobbers list is an undocumented "feature", so
maybe it is user error.  However it would be nice to have something like this
as a real feature.  Either with "sp", or maybe a new clobbers keyword like
"frame". 

Would that be feasible?

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #9 from Josh Poimboeuf  ---
I might be misunderstanding, but I don't think we need DRAP for our use case. 
We just need assurance that the frame pointer (RBP) has been set up before the
inline asm is inserted.

If clobbering "sp" isn't the right approach, maybe a new syntax, hopefully a
documented one, would be better.  Like a special "frame" clobber, for example.

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #12 from Josh Poimboeuf  ---
While I know about DRAP, I'm otherwise not very familiar with GCC internals, so
I don't quite understand your points.

I would like to clarify that most of the time, when we use "sp" in the clobbers
list, the stack does *not* need to be realigned.  And in those cases, we would
*not* want DRAP, because it's not needed.  We would instead just want the
normal frame pointer setup.

For cases where the stack pointer *does* need to be realigned, DRAP is fine.

I can try out your patch next week to see what changes it makes to the
generated code.  Thanks!

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #14 from Josh Poimboeuf  ---
(In reply to H.J. Lu from comment #13)
> (In reply to Josh Poimboeuf from comment #12)
> > I would like to clarify that most of the time, when we use "sp" in the
> > clobbers list, the stack does *not* need to be realigned.  And in those
> 
> Yes, stack alignment may be needed due to:
> 
> typedef struct {
>  u64 __attribute__((aligned(8))) counter;
> } atomic64_t;
> 
> with -mpreferred-stack-boundary=2.

Yes, in the above example that is true.  However there are many more cases in
the kernel where we want to use the "sp" clobbers, where stack alignment is not
needed.

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-15 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #16 from Josh Poimboeuf  ---
Ok.  I thought you were saying that, with your new patch, putting "sp" in
clobbers would *always* force DRAP, even in cases where the function doesn't
need realignment.

If you're *not* saying that, then it sounds good :-)

Either way I'll try the patch out next week and see what it does.

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

2017-09-18 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82221

--- Comment #18 from Josh Poimboeuf  ---
(In reply to H.J. Lu from comment #17)
> (In reply to Josh Poimboeuf from comment #7)
> > Putting "sp" in the clobbers list is something that was suggested to me on
> > the GCC mailing list a while back.  And, other than this rare bug, it seems
> > to do exactly what we want, which is, force GCC to save the frame pointer
> > before inserting the inline asm.  We need that to happen when we put a
> > "call" instruction inside the inline asm, so that we can get a reliable
> > stack trace from the called function.
> > 
> > I know that putting "sp" in the clobbers list is an undocumented "feature",
> > so maybe it is user error.  However it would be nice to have something like
> > this as a real feature.  Either with "sp", or maybe a new clobbers keyword
> > like "frame". 
> > 
> > Would that be feasible?
> 
> Can you use something like
> 
> unsigned long paravirt_read_pmc(void)
> {
>   register char *frame __asm__("ebp");
>   unsigned long __eax;
>   asm volatile("# foo" : "=a" (__eax)
>: "r" (frame)
>: "memory", "cc");
>   return __eax;
> }
> 
> That is your asm statement needs frame pointer.

That works, with frame pointers enabled (-fno-omit-frame-pointer).  But with
-fomit-frame-pointer, it causes GCC to save rbp on the stack, which is not
ideal.

There's something similar, which does exactly what we need: make "rsp" an
input/output constraint:

  register void *__sp asm("rsp");

  static inline function foo(void)
  {
asm volatile("call bar" : "+r" (__sp));
  }

That forces a frame pointer when frame pointers are enabled, and does nothing
when frame pointers are disabled.  So I think we'll go with that solution for
now.

(Note that "__sp" is a global variable due to a compatibility issue with the
clang compiler.)

[Bug c/77966] New: Corrupt function with -fsanitize-coverage=trace-pc

2016-10-13 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77966

Bug ID: 77966
   Summary: Corrupt function with -fsanitize-coverage=trace-pc
   Product: gcc
   Version: 6.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
  Target Milestone: ---

In the Linux kernel, we found another case (other than bug 70646) where a
couple of functions are getting corrupted.  Arnd Bergmann reduced it down to a
simple test case, and I've reduced it slightly further:

$ cat test.c
typedef int spinlock_t;
extern unsigned int ioread32(void *);

struct vnic_wq_ctrl {
unsigned int error_status;
};

struct vnic_wq {
struct vnic_wq_ctrl *ctrl;
} mempool_t;

struct snic {
unsigned int wq_count;
__attribute__ ((__aligned__)) struct vnic_wq wq[1];
spinlock_t wq_lock[1];
};

unsigned int snic_log_q_error_err_status;

void snic_log_q_error(struct snic *snic)
{
unsigned int i;
for (i = 0; i < snic->wq_count; i++)
snic_log_q_error_err_status =
ioread32(&snic->wq[i].ctrl->error_status);
}

$ gcc -O2 -fno-reorder-blocks -fsanitize-coverage=trace-pc -c test.c -o test.o
$ objdump -dr test.o

test.o: file format elf64-x86-64


Disassembly of section .text:

 :
   0:   53  push   %rbx
   1:   48 89 fbmov%rdi,%rbx
   4:   e8 00 00 00 00  callq  9 
5: R_X86_64_PC32__sanitizer_cov_trace_pc-0x4
   9:   8b 03   mov(%rbx),%eax
   b:   85 c0   test   %eax,%eax
   d:   75 09   jne18 
   f:   5b  pop%rbx
  10:   e9 00 00 00 00  jmpq   15 
11: R_X86_64_PC32   __sanitizer_cov_trace_pc-0x4
  15:   0f 1f 00nopl   (%rax)
  18:   e8 00 00 00 00  callq  1d 
19: R_X86_64_PC32   __sanitizer_cov_trace_pc-0x4
  1d:   48 8b 7b 10 mov0x10(%rbx),%rdi
  21:   e8 00 00 00 00  callq  26 
22: R_X86_64_PC32   ioread32-0x4
  26:   83 3b 01cmpl   $0x1,(%rbx)
  29:   89 05 00 00 00 00   mov%eax,0x0(%rip)# 2f

2b: R_X86_64_PC32   snic_log_q_error_err_status-0x4
  2f:   76 de   jbef 
  31:   e8 00 00 00 00  callq  36 
32: R_X86_64_PC32   __sanitizer_cov_trace_pc-0x4

Notice how the function ends unexpectedly after the last call to
__sanitizer_cov_trace_pc().

$ gcc -v
Using built-in specs.
COLLECT_GCC=/usr/bin/gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.2.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap
--enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr
--mandir=/usr/share/man --infodir=/usr/share/info
--with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared
--enable-threads=posix --enable-checking=release --enable-multilib
--with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions
--enable-gnu-unique-object --enable-linker-build-id
--with-linker-hash-style=gnu --enable-plugin --enable-initfini-array
--disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function
--with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.2.1 20160916 (Red Hat 6.2.1-2) (GCC)

[Bug target/77966] Corrupt function with -fsanitize-coverage=trace-pc

2016-10-14 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77966

--- Comment #6 from Josh Poimboeuf  ---
(In reply to Arnd Bergmann from comment #5)
> I checked the test case using "-fsanitize=unreachable" and that avoids the
> problem.
> 
> Josh, should we set that whenever we enable objtool in the kernel?

In theory, adding -fsanitize=unreachable might be a workable option for
allowing objtool to detect such unreachable blocks.

However, in practice, that option doesn't seem to work as advertised.  It seems
to change the control flow unexpectedly.  When adding it to the test case, it
doesn't add a __ubsan_handle_builtin_unreachable() call to the unreachable
block.  Instead, it treats it as a normal loop, and removes the assumption that
the loop can only run one time.

Here's the same test case from comment #1, with -fsanitize-unreachable added:

 :
   0:   55  push   %rbp
   1:   53  push   %rbx
   2:   48 89 fdmov%rdi,%rbp
   5:   31 db   xor%ebx,%ebx
   7:   48 83 ec 08 sub$0x8,%rsp
   b:   e8 00 00 00 00  callq  10 
c: R_X86_64_PC32__sanitizer_cov_trace_pc-0x4
  10:   8b 45 00mov0x0(%rbp),%eax
  13:   85 c0   test   %eax,%eax
  15:   75 11   jne28 
  17:   48 83 c4 08 add$0x8,%rsp
  1b:   5b  pop%rbx
  1c:   5d  pop%rbp
  1d:   e9 00 00 00 00  jmpq   22 
1e: R_X86_64_PC32   __sanitizer_cov_trace_pc-0x4
  22:   66 0f 1f 44 00 00   nopw   0x0(%rax,%rax,1)
  28:   e8 00 00 00 00  callq  2d 
29: R_X86_64_PC32   __sanitizer_cov_trace_pc-0x4
  2d:   89 d8   mov%ebx,%eax
  2f:   83 c3 01add$0x1,%ebx
  32:   48 8b 7c c5 08  mov0x8(%rbp,%rax,8),%rdi
  37:   e8 00 00 00 00  callq  3c 
38: R_X86_64_PC32   ioread32-0x4
  3c:   39 5d 00cmp%ebx,0x0(%rbp)
  3f:   77 e7   ja 28 
  41:   eb d4   jmp17

[Bug c/61958] function arbitrarily placed in .text.unlikely section

2014-07-29 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61958

--- Comment #1 from Josh Poimboeuf  ---
Created attachment 33207
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33207&action=edit
gzipped .i file for the good case

[Bug c/61958] New: function arbitrarily placed in .text.unlikely section

2014-07-29 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61958

Bug ID: 61958
   Summary: function arbitrarily placed in .text.unlikely section
   Product: gcc
   Version: 4.9.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com

Created attachment 33206
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33206&action=edit
gzipped .i file for the bad case

When making changes to a certain source file in the Linux kernel, gcc is
unexpectedly moving two functions from the .text section to the .text.unlikely
section.

When compiling the original version of the source file, the two functions are
placed in the .text section:

$ cd linux
$ git describe
v3.16-rc7-7-g31dab71
$ make net/ipv4/fib_trie.o
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CALLscripts/checksyscalls.sh
  CC  net/ipv4/fib_trie.o
$ eu-readelf -s net/ipv4/fib_trie.o |grep 'insert_leaf_info\|leaf_info_new'
6: 136 FUNCLOCAL  DEFAULT1 insert_leaf_info
   37: 0dc0 88 FUNCLOCAL  DEFAULT1 leaf_info_new
$ eu-readelf -S net/ipv4/fib_trie.o |grep '\[ 1\]'
[ 1] .textPROGBITS  0040 38c0  0 AX
0   0 16


However, when I apply the following patch, the two functions get unexpectedly
moved to .text.unlikely:

$ cat /tmp/fib_trie.patch
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 5afeb5a..cc5e3d3 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1187,6 +1187,7 @@ int fib_table_insert(struct fib_table *tb, struct
fib_config *cfg)

 key = key & mask;

+printk("foo\n");
 fi = fib_create_info(cfg);
 if (IS_ERR(fi)) {
 err = PTR_ERR(fi);
$ patch -p1 < /tmp/fib_trie.patch
patching file net/ipv4/fib_trie.c
$ make net/ipv4/fib_trie.o
  CHK include/config/kernel.release
  CHK include/generated/uapi/linux/version.h
  CHK include/generated/utsrelease.h
  CALLscripts/checksyscalls.sh
  CC  net/ipv4/fib_trie.o
$ eu-readelf -s net/ipv4/fib_trie.o |grep 'insert_leaf_info\|leaf_info_new'
6: 111 FUNCLOCAL  DEFAULT5 insert_leaf_info
   28: 006f 82 FUNCLOCAL  DEFAULT5 leaf_info_new
$ eu-readelf -S net/ipv4/fib_trie.o |grep '\[ 5\]'
[ 5] .text.unlikely   PROGBITS  3770 00c1  0 AX
0   0  1


Both of the moved functions are called by fib_insert_node(), which is inlined
by fib_table_insert(), which was modified by the patch.  As far as I can tell,
it seems likely that the moved functions would be called, especially
insert_leaf_info() which is called in the main control path of the function. 
Also it seems odd that adding a call to printk would change the likelihood of a
function being called, since it doesn't change control flow.


Using gcc from Fedora rawhide:
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/4.9.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla
--enable-bootstrap --enable-shared --enable-threads=posix
--enable-checking=release --enable-multilib --with-system-zlib
--enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object
--enable-linker-build-id --with-linker-hash-style=gnu
--enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --enable-plugin
--enable-initfini-array --disable-libgcj
--with-isl=/builddir/build/BUILD/gcc-4.9.1-20140717/obj-x86_64-redhat-linux/isl-install
--with-cloog=/builddir/build/BUILD/gcc-4.9.1-20140717/obj-x86_64-redhat-linux/cloog-install
--enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686
--build=x86_64-redhat-linux
Thread model: posix
gcc version 4.9.1 20140717 (Red Hat 4.9.1-2) (GCC) 

Full gcc cmdline:
  gcc -Wp,-MD,net/ipv4/.fib_trie.o.d  -nostdinc -isystem
/usr/lib/gcc/x86_64-redhat-linux/4.9.1/include -I./arch/x86/include
-Iarch/x86/include/generated  -Iinclude -I./arch/x86/include/uapi
-Iarch/x86/include/generated/uapi -I./include/uapi -Iinclude/generated/uapi
-include ./include/linux/kconfig.h -D__KERNEL__ -Wall -Wundef
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common
-Werror-implicit-function-declaration -Wno-format-security -m64 -mno-mmx
-mno-sse -mno-80387 -mno-fp-ret-in-387 -mpreferred-stack-boundary=3
-mtune=generic -mno-red-zone -mcmodel=kernel -funit-at-a-time
-maccumulate-outgoing-args -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1
-DCONFIG_AS_CFI_SECTIONS=1 -DCONFIG_AS_FXSAVEQ=1 -DCONFIG_AS_CRC32=1
-DCONFIG_AS_AVX=1 -DCONFIG_AS_AVX2=1 -pipe -Wno-sign-compare
-

[Bug middle-end/61958] functions arbitrarily placed in .text.unlikely section

2014-07-29 Thread jpoimboe at redhat dot com

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61958

--- Comment #2 from Josh Poimboeuf  ---
I see a similar issue with another patch to a different kernel file:

diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index b10cd43a..40c275f 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -797,6 +797,7 @@ struct fib_info *fib_create_info(struct fib_config *cfg)
 #endif

err = -ENOBUFS;
+   printk("foo\n");
if (fib_info_cnt >= fib_info_hash_size) {
unsigned int new_size = fib_info_hash_size << 1;
struct hlist_head *new_info_hash;


This results in fib_info_hashfn() no longer being inlined (despite its static
inline directive), and being placed in .text.unlikely.  Other functions are
also moved to .text.unlikely: fib_info_hash_free(), kzalloc.constprop.19(), and
fib_info_hash_alloc()  I can also provide the data for this error if needed,
but it looks like the same issue.

[Bug c/119279] New: Specifying frame pointer dependency in inline asm

2025-03-13 Thread jpoimboe at redhat dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119279

Bug ID: 119279
   Summary: Specifying frame pointer dependency in inline asm
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jpoimboe at redhat dot com
CC: ak at gcc dot gnu.org, hjl.tools at gmail dot com, hpa at 
zytor dot com,
jakub at gcc dot gnu.org, peterz at infradead dot org,
pinskia at gcc dot gnu.org, rguenth at gcc dot gnu.org,
torva...@linux-foundation.org, ubizjak at gmail dot com
  Target Milestone: ---
Target: x86_64-*-*

Created attachment 60749
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=60749&action=edit
Reduced test case (gcc -O2 -fno-omit-frame-pointer -fno-optimize-sibling-calls
-c mmap.i -o mmap.o)

[ This was discussed previously in bug 117311, which was a documentation
request for existing (but unsupported) behavior.  This bug here is not for
documenting existing behavior, but rather a request to define a supported
solution for specifying a frame pointer dependency in inline asm, whatever that
looks like. ]

In the Linux kernel on x86 it's common for inline asm to have a "call" to a
thunk which saves registers before calling out to another function.  If the
inline asm is in a leaf function, or gets emitted before the function prologue,
the frame pointer doesn't get set up before the call.

The current unsupported workaround used throughout the kernel is to make the
stack pointer an in/out constraint:

  register unsigned long current_stack_pointer asm("%rsp");
  #define ASM_CALL_CONSTRAINT "+r" (current_stack_pointer)

We've also experimented with making __builtin_frame_address(0) an input
constraint, which also seems to work:

  #define ASM_CALL_CONSTRAINT "r" (__builtin_frame_address(0))

Unfortunately these are unsupported hacks which just happen to work.  It would
be much better to have a supported solution.

The attached test case results in the following:

 :
   0:   e8 00 00 00 00  call   51:
R_X86_64_PLT32   __SCT__preempt_schedule-0x4
   5:   48 83 3d 00 00 00 00 00 cmpq   $0x0,0x0(%rip)# d
   8: R_X86_64_PC32   
pfn_modify_allowed_pfn-0x5
   d:   75 01   jne10 
   f:   c3  ret
  10:   55  push   %rbp
  11:   31 c0   xor%eax,%eax
  13:   48 89 e5mov%rsp,%rbp
  16:   e8 00 00 00 00  call   1b  17:
R_X86_64_PLT32  capable-0x4
  1b:   5d  pop%rbp
  1c:   c3  ret

[Bug c/70604] New: switch statement optimization creates dead code

[Bug c/70604] switch statement optimization creates dead code

[Bug tree-optimization/70604] switch statement optimization creates dead code

[Bug c/70646] Corrupt truncated function

[Bug tree-optimization/70604] switch statement optimization creates dead code

[Bug c/70646] Corrupt truncated function

[Bug c/70646] New: Corrupt truncated function

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

[Bug ipa/70646] [4.9/5/6 Regression] Corrupt truncated function

[Bug ipa/70646] [4.9/5/6/7 Regression] Corrupt truncated function

[Bug c/81813] New: Inefficient stack pointer adjustment

[Bug c/82221] New: internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug c/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug target/82221] internal compiler error: in print_reg, at config/i386/i386.c:17656

[Bug c/77966] New: Corrupt function with -fsanitize-coverage=trace-pc

[Bug target/77966] Corrupt function with -fsanitize-coverage=trace-pc

[Bug c/61958] function arbitrarily placed in .text.unlikely section

[Bug c/61958] New: function arbitrarily placed in .text.unlikely section

[Bug middle-end/61958] functions arbitrarily placed in .text.unlikely section

[Bug c/119279] New: Specifying frame pointer dependency in inline asm

28 matches

Site Navigation

Mail list logo

Footer information