[Bug sanitizer/94076] New: libsanitizer fails with 64-bit time_t

2020-03-06 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94076

Bug ID: 94076
   Summary: libsanitizer fails with 64-bit time_t
   Product: gcc
   Version: 9.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

I tried bootstrapping a debian armhf system with an experimental glibc
configured with a 64-bit time_t/off_t/ino_t, and saw several failed assertions
on 'struct dirent' and 'struct timeb':

  339 | typedef char IMPL_PASTE(assertion_failed_##_,
line)[2*(int)(pred)-1]
  |   
^
../../../../src/libsanitizer/sanitizer_common/sanitizer_internal_defs.h:333:30:
note: in expansion of macro 'IMPL_COMPILER_ASSERT'
  333 | #define COMPILER_CHECK(pred) IMPL_COMPILER_ASSERT(pred, __LINE__)
  |  ^~~~
../../../../src/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.h:1495:3:
note: in expansion of macro 'COMPILER_CHECK'
 1495 |   COMPILER_CHECK(sizeof(__sanitizer_##TYPE) == sizeof(TYPE))
  |   ^~
../../../../src/libsanitizer/sanitizer_common/sanitizer_platform_limits_posix.cc:1213:1:
note: in expansion of macro 'CHECK_TYPE_SIZE'
 1213 | CHECK_TYPE_SIZE(timeb);
  | ^~~
../../../../src/libsanitizer/sanitizer_common/sanitizer_internal_defs.h:339:70:
warning: size of array 'assertion_failed__1213' is not an integral
constant-expression [-Wpedantic]

In particular:
CHECK_SIZE_AND_OFFSET(dirent, d_ino);
CHECK_SIZE_AND_OFFSET(dirent, d_off);
CHECK_SIZE_AND_OFFSET(dirent, d_reclen);
CHECK_TYPE_SIZE(timeb);
CHECK_SIZE_AND_OFFSET(timeb, time);
CHECK_SIZE_AND_OFFSET(timeb, millitm);
CHECK_SIZE_AND_OFFSET(timeb, timezone);
CHECK_SIZE_AND_OFFSET(timeb, dstflag);

The same thing will likely hit on architectures that always set them to 64-bit,
as riscv32 and arc, on musl-1.2+, and on upstream glibc once that the time64
interfaces get enabled there.

[Bug sanitizer/94076] libsanitizer fails with 64-bit time_t

2020-03-06 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94076

--- Comment #2 from Arnd Bergmann  ---
I'm not at the point of the bootstrap where I can attempt building llvm, but I
opened another report at https://bugs.llvm.org/show_bug.cgi?id=45138 anyway.

[Bug sanitizer/94881] New: incorrect Wstringop-overflow warning with thread sanitizer

2020-04-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881

Bug ID: 94881
   Summary: incorrect Wstringop-overflow warning with thread
sanitizer
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

A gcc-10 snapshot from earlier this month (dated 2020-04-13) produced a
-Wstringop-overflow warning that made no sense when building the Linux kernel.
I reduced it to a small test case

https://godbolt.org/z/NyjxvH

struct a {
  char b[50];
};
struct c {
  short action;
  struct a d;
};
struct f {
  short command;
  struct c e;
};
void i(struct f *f, int *g, unsigned h) {
  struct c *j = &f->e;
  j->action = 0;
  __builtin_memcpy(&j->d.b[h], g, 16);
}

$ x86_64-linux-gcc -O2 -Wall -fsanitize=thread test.c -c
test.c:In function 'i':
test.c:15:3: warning: writing 16 bytes into a region of size 0
[-Wstringop-overflow=]
   15 |   __builtin_memcpy(&j->d.b[h], g, 16);
  |   ^~~
test.c:5:9: note: at offset 0 to object 'action' with size 2 declared here
5 |   short action;
  | ^~

[Bug sanitizer/94881] [10 Regression] incorrect Wstringop-overflow warning with thread sanitizer since r10-5451-gef29b12cfbb4979a

2020-04-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881

--- Comment #2 from Arnd Bergmann  ---
I ran into another file that triggered this problem, reducing that one gave me
a slightly simpler test case:

struct a {
  char b[8];
};
struct e {
  unsigned c;
  struct a d[2];
};
void i(struct e *e, void *g) {
  struct e *f = e + 1;
  __builtin_memcpy(f->d[f->c].b, g, 1);
}

[Bug c/94986] New: missing diagnostic on ARM thumb2 compilation with -pg when using r7 in inline asm

2020-05-07 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94986

Bug ID: 94986
   Summary: missing diagnostic on ARM thumb2 compilation with -pg
when using r7 in inline asm
   Product: gcc
   Version: 10.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

I reported a bug against clang for a Linux kernel failure, but 
 it was suggested that the clang behavior is probably correct in this corner
case while gcc gets it wrong, see https://bugs.llvm.org/show_bug.cgi?id=45826

echo 'void f(void) { asm("mov r7, #0" ::: "r7"); }' | arm-linux-gnueabi-gcc
-march=armv7-a -O2  -mthumb -pg -S -xc -

silently accepts an inline asm statement that clobbers the frame pointer, but
gcc rejects the same code if any of '-O0', '-fomit-frame-pointer' or
'fno-omit-frame-pointer' are used:

: In function 'f':
:1:44: error: r7 cannot be used in 'asm' here

If using r7 in this case is indeed invalid, we need to ensure the kernel does
not do this, and having gcc reject it would be helpful.

[Bug target/95943] New: arc -mbig-endian "inappropriate arguments" error from assembler

2020-06-27 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95943

Bug ID: 95943
   Summary: arc -mbig-endian "inappropriate arguments" error from
assembler
   Product: gcc
   Version: 10.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Building an 'allmodconfig' linux kernel for ARC results in a failure to
assemble one file:

{standard input}: Assembler messages:
{standard input}:12347: Error: inappropriate arguments for opcode 'mpyd'
make[3]: *** [/tmp/mainline/scripts/Makefile.build:281: kernel/sched/core.o]
Error 1

With creduce, I reduced the problem to

void a(int b, int c, long long d) {
  long e = d;
  long long f = 0;
  if (e / 1000)
f = (long long)e * 1000;
  g(a, f);
}

$ arc-linux-gcc-O2 -mbig-endian  -mcpu=hs38 -c test.c
/tmp/ccfzWfgR.s: Assembler messages:
/tmp/ccfzWfgR.s:21: Error: inappropriate arguments for opcode 'mpyd'


$ arc-linux-gcc-O2 -mbig-endian  -mcpu=hs38 -S test.c -o-
.section.text
.align 4
.global a
.type   a, @function
a:
mov_s   r2,r3   ;4
add r2,r2,999
cmp r2,1998
mov.ls r2,0
mov.ls r3,0
mpyd.hi r2,r3,1000
mov_s   r1,r2   ;4
mov_s   r0,@a   ;13
b.d @g
mov_s   r2,r3   ;4
.size   a, .-a
.ident  "GCC: (GNU) 10.1.0"

This happens with at least gcc-8 through gcc-10, but not with gcc-7.5.

[Bug rtl-optimization/92657] High stack usage due ftree-ch

2020-01-05 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92657

Arnd Bergmann  changed:

   What|Removed |Added

 CC||arnd at linaro dot org

--- Comment #5 from Arnd Bergmann  ---
Submitted a workaround for the warning that triggered this bug report in the
linux kernel:

https://lore.kernel.org/lkml/20200104215156.689245-1-a...@arndb.de/

[Bug rtl-optimization/88879] [9 Regression] ICE in sel_target_adjust_priority, at sel-sched.c:3332

2020-02-11 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88879

Arnd Bergmann  changed:

   What|Removed |Added

 CC||arnd at linaro dot org

--- Comment #14 from Arnd Bergmann  ---
I got the same output while building the linux kernel defconfig today, reduced
to this minimal test case:

$ cat test.c
long a, b;
int fn1() {
  char *c = (void *)b;
  while (1) {
const long d = *c = d;
do
  a++;
while (a == 5);
  }
}
$ ia64-linux-gcc -O3 -c test.c
during RTL pass: mach
lz4_decompress.c:10:1: internal compiler error: in sel_target_adjust_priority,
at sel-sched.c:3334
   10 | }

Reproduced both with 9.2 and current HEAD 
$ ia64-linux-gcc --version
ia64-linux-gcc (GCC) 9.2.1 20200211

[Bug rtl-optimization/88879] [9 Regression] ICE in sel_target_adjust_priority, at sel-sched.c:3332

2020-02-11 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88879

--- Comment #16 from Arnd Bergmann  ---
Right, I was on the releases/gcc-9 branch, not HEAD. Sorry about the confusion.
I applied your fix and have a working 9.2 build that can build the kernel now.
Thanks!

[Bug sanitizer/84863] false-positive -Warray-bounds warning with -fsanitize-coverage=object-size

2018-12-16 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84863

--- Comment #3 from Arnd Bergmann  ---
The problem in the kernel then is that we then have to turn off the sanitizers
for the 'allmodconfig' build, since the recommended minimum regression testing
for kernel changes involves building a kernel with all options (including
UBSAN) enabled and checking that there are no compiler warnings.

This means we would fail to catch any build regressions in the kernel part of
UBSAN, as well as any legitimate warnings that are only seen when the
sanitizers are active.

[Bug tree-optimization/85301] New: bitfield check causes maybe-uninitialized warning

2018-04-09 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85301

Bug ID: 85301
   Summary: bitfield check causes maybe-uninitialized warning
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

A Linux kernel patch that changed a few flags from type 'int' to a single-bit
bitfield caused a false-positive warning. I reduced a test case to

struct tick_sched {
#ifdef USE_BITFIELD
  unsigned int tick_stopped : 1;
  unsigned int idle_active : 1;
#else
  int tick_stopped;
  int idle_active;
#endif
};
long ktime_get();
void __tick_nohz_idle_restart_tick(long);
struct tick_sched tick_nohz_idle_exit_ts;
void tick_nohz_idle_exit(void) {
  long now;
  if (tick_nohz_idle_exit_ts.idle_active ||
tick_nohz_idle_exit_ts.tick_stopped)
now = ktime_get();
  if (tick_nohz_idle_exit_ts.tick_stopped)
__tick_nohz_idle_restart_tick(now);
}

$ gcc  -c tick-sched.c -Wall -O2 -DUSE_BITFIELD
tick-sched.c: In function ‘tick_nohz_idle_exit’:
tick-sched.c:19:5: warning: ‘now’ may be used uninitialized in this function
[-Wmaybe-uninitialized]
 __tick_nohz_idle_restart_tick(now);
$ gcc  -c tick-sched.c -Wall -O2
# no warning

It's easy to work around the warning, e.g. by copying the flag into a temporary
variable, but it looks like this is something that gcc could handle better.

I looked through the list of false-positive Wmaybe-uninitialized bug reports,
but couldn't find one that looks related to this particular one.

[Bug sanitizer/84732] false-positive -Wstringop-truncation warning with -fsanitize-coverage=trace-pc

2018-04-09 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84732

--- Comment #9 from Arnd Bergmann  ---
One more instance got added to the kernel today:

In file included from /git/arm-soc/include/trace/perf.h:90,
 from /git/arm-soc/include/trace/define_trace.h:97,
 from /git/arm-soc/include/trace/events/fscache.h:537,
 from /git/arm-soc/fs/fscache/internal.h:32,
 from /git/arm-soc/fs/fscache/main.c:20:
/git/arm-soc/include/trace/events/fscache.h: In function
'perf_trace_fscache_netfs':
/git/arm-soc/include/trace/events/fscache.h:200:1261: warning: 'strncpy'
specified bound 8 equals destination size [-Wstringop-truncation]
 TRACE_EVENT(fscache_netfs,

Same as the one from comment #5, this one happen for -fsanitize=kernel-address
and is based on simple code that we don't warn for without sanitizer:

   strncpy(__entry->name, netfs->name, 8);
   __entry->name[7]= 0;

I can probably work around that by turning off -Wstringop-truncation whenever
sanitizers enabled in the kernel configuration (we already do that for
-Wmaybe-uninitialized), if this one is unlikely to get fixed before the gcc-8
release.

[Bug tree-optimization/85301] bitfield check causes maybe-uninitialized warning

2018-04-09 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85301

--- Comment #6 from Arnd Bergmann  ---
I found that older versions (gcc-5 and before) did not warn when the type gets
changed to bitfield of '_Bool' rather than 'unsigned int'. It seems that this
was only because they tested each bit separately in the _Bool case rather than
combining the first two comparisons into a &3 mask.

[Bug libgcc/85869] New: libgcc fails to build in canadian cross: cet.h not found

2018-05-22 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85869

Bug ID: 85869
   Summary: libgcc fails to build in canadian cross: cet.h not
found
   Product: gcc
   Version: 8.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libgcc
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

I tried cross-building (for host=ppc64le) a set of cross-toolchain on an x86_64
build system. This fails for the target=i386 compiler with this error:

In file included from /home/arnd/git/gcc/libgcc/config/i386/avx_savms64.S:2:0:
/home/arnd/git/gcc/libgcc/config/i386/savms64.h:26:10: fatal error: cet.h: No
such file or directory

The compiler was configured using

/home/arnd/git/gcc/configure --host=ppc64le-linux-gnu --target=i386-linux
--enable-targets=all
--prefix=/home/arnd/cross/ppc64le/gcc-8.1.0-nolibc/ppc64le-linux-gnu/i386-linux
--enable-languages=c --without-headers --disable-bootstrap --disable-nls
--disable-threads --disable-shared --disable-libmudflap --disable-libssp
--disable-libgomp --disable-decimal-float --disable-libquadmath
--disable-libatomic --disable-libcc1 --disable-libmpx --enable-checking=release

This was no problem in earlier releases, or on other target architectures (not
sure about target=x86_64, which fails differently for me). I have not tried
other host architectures but would assume that this is not ppc64le specific.

[Bug libgcc/85869] libgcc fails to build in canadian cross: cet.h not found

2018-05-22 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85869

Arnd Bergmann  changed:

   What|Removed |Added

   Keywords|build   |
 Target|x86_64-*-*, i?86-*-*|
   Host|powerpc64le |
  Build|x86_64-*-*  |

--- Comment #2 from Arnd Bergmann  ---
Ah, found a bug in my scripts: I had built and installed a i386 cross compiler
from these sources, but passed the wrong PATH variable, so it picked up a
native compiler of the wrong version instead. This works fine now after fixing
my script, sorry about the false-positive report.

A related problem still seems to happen for the
build=x86_64/host=ppc64le/target=x86_64 cross compiler when building the 32-bit
libgcc: I have the correct x86_64-linux-gcc binary in the PATH here, and this
gets used for the 64-bit libgcc, but when building the 32-bit libgcc, it uses
'cc -m32' instead, which comes from /usr/bin/.

After manually removing the '/usr/bin/cc -> gcc' symlink, that appears to work
fine as well, but I don't see why that symlink causes this behavior.

[Bug sanitizer/83356] New: [7 regression] excessive stack usage compiling with -O2 -fsanitize=bounds -fsanitize=object-size

2017-12-10 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

Bug ID: 83356
   Summary: [7 regression] excessive stack usage compiling with
-O2 -fsanitize=bounds -fsanitize=object-size
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 42826
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42826&action=edit
The AES crypto implementation from linux, preprocessed

After upgrading from gcc-7.1.1 (20170717) to 7.2.1 (20171130), I observed a
warning when building the kernel in configurations with UBSAN:

x86_64-linux-gcc-7.2.1 -Wframe-larger-than=1024 -fsanitize=bounds
-fsanitize=object-size -fno-strict-aliasing -O2 -Wall -c aes_generic.i 
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 3840 bytes
is larger than 1024 bytes [-Wframe-larger-than=]
/git/arm-soc/crypto/aes_generic.c: In function 'aes_decrypt':
/git/arm-soc/crypto/aes_generic.c:1441:1: warning: the frame size of 3840 bytes
is larger than 1024 bytes [-Wframe-larger-than=]

With gcc-7.1.1, the stack frames were 304 bytes each, gcc-8.0.0 uses 512 bytes,
and with "gcc-7.2.1 -Os -fsanitize=bounds -fsanitize=object-size", it's even
smaller at 56 bytes.

I have only noticed this regression with one file in the kernel so far,
crypto/aes_generic.c, all other files apparently stay below the stack frame
warning limit.

[Bug tree-optimization/83312] [8 regression] bogus -Warray-bounds warning

2017-12-13 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83312

--- Comment #8 from Arnd Bergmann  ---
(In reply to David Malcolm from comment #7)
> Candidate patch:
>   https://gcc.gnu.org/ml/gcc-patches/2017-12/msg00778.html

I confirmed this fixes the problem on both the original source file as well as
the reduced test case.

[Bug target/83408] New: microblaze: long compile times

2017-12-13 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83408

Bug ID: 83408
   Summary: microblaze: long compile times
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 42869
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42869&action=edit
drivers/net/ethernet/cisco/enic/enic_main.c, preprocessed and compressed

Building an 'allmodconfig' linux kernel for microblaze gets stuck in a couple
of files that take several hours to compile on a fast machine. 'perf'
shows the time is spent mostly in two functions:

  88.17%  cc1  cc1 [.] find_base_term
   9.47%  cc1  cc1 [.]
cselib_sp_based_value_p
   ...


reproduce with "microblaze-linux-gcc-7.2.1 -O2 -c enic_main.i". An older
compiler version I happened to have (gcc-4.9.3) behaves similarly. I did not
wait for the compilation to finish but saw in older build reports that it
eventually continued or crashed.

[Bug target/83409] New: arc: "internal compiler error: in extract_constrain_insn" with -O3

2017-12-13 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83409

Bug ID: 83409
   Summary: arc: "internal compiler error: in
extract_constrain_insn" with -O3
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 42870
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42870&action=edit
linux/lib/scatterlist.c, preprocessed and compressed

I get several internal compiler errors building linux 'allmodconfig' with
arc-elf-gcc-7.2.1:

/home/arnd/cross-gcc/bin/arc-elf-gcc-7.2.1 -Wall -O3 -c lib/scatterlist.i
-fno-strict-aliasing

/git/arm-soc/lib/scatterlist.c: In function '__sg_alloc_table_from_pages':
/git/arm-soc/lib/scatterlist.c:445:1: error: insn does not satisfy its
constraints:
 }
 ^
(insn 735 249 620 23 (set (reg:SI 60 lp_count [477])
(plus:SI (mult:SI (reg:SI 60 lp_count [375])
(const_int 4 [0x4]))
(const_int -20 [0xffec])))
"/git/arm-soc/lib/scatterlist.c":139 89 {*add_n}
 (nil))
/git/arm-soc/lib/scatterlist.c:445:1: internal compiler error: in
extract_constrain_insn, at recog.c:2213
0x95a673 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/arnd/git/gcc/gcc/rtl-error.c:108
0x95a69f _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/arnd/git/gcc/gcc/rtl-error.c:119
0x927aad extract_constrain_insn(rtx_insn*)
/home/arnd/git/gcc/gcc/recog.c:2213
0x92adf4 copyprop_hardreg_forward_1
/home/arnd/git/gcc/gcc/regcprop.c:801
0x92bad7 execute
/home/arnd/git/gcc/gcc/regcprop.c:1308
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

[Bug target/83408] microblaze: long compile times

2017-12-13 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83408

--- Comment #2 from Arnd Bergmann  ---
I've tried removing the architecture specific bits from the preprocessed file.
The issue is still unchanged on microblaze but I could not trigger it on x86,
the same file builds fine here.

[Bug middle-end/82365] stack locations are consolidated if noreturn function is on the path

2017-12-13 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365

--- Comment #10 from Arnd Bergmann  ---
I have now observed the problem in another file, this one time that is more
commonly used, but not as drastic, as the stack usage is only around 1000
bytes.

The original source code is in
https://elixir.free-electrons.com/linux/v4.14/source/include/net/ip_vs.h#L1243
This reduces to

int a, b, c, d, e;
inline char *ip_vs_dbg_addr(char *p1, int *p2) {
  __builtin_snprintf(&p1[*p2], a, "[%pI6c]", &b);
  if ((a)) {
__builtin_trap();
}
  return &p1[1];
}
void ip_vs_control_add() {
  do {
char f[160];
ip_vs_dbg_addr(f, &e);
  } while (0);
  {
int g = d;
char h[160];
ip_vs_dbg_addr(h, &c);
{
  char i[160];
  ip_vs_dbg_addr(i, &g);
}
{
  char j[160];
  ip_vs_dbg_addr(j, &e);
}
  }
  char k[160];
  ip_vs_dbg_addr(k, &e);
}

which uses 800 bytes of stack space. Adding an 'asm volatile("");' statement
before __builtin_trap() again solves the problem.

[Bug middle-end/82365] stack locations are consolidated if noreturn function is on the path

2017-12-15 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365

--- Comment #11 from Arnd Bergmann  ---
More testing reveals that a handful of files in the kernel are affected by this
bug in the BUG() definition on architectures that do not use an inline assembly
statement to trap during an assertion, around half the supported architectures.
This kernel patch

diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h
index 963b755d19b0..23c6a2a6a3d6 100644
--- a/include/asm-generic/bug.h
+++ b/include/asm-generic/bug.h
@@ -52,6 +52,7 @@ struct bug_entry {
 #ifndef HAVE_ARCH_BUG
 #define BUG() do { \
printk("BUG: failure at %s:%d/%s()!\n", __FILE__, __LINE__, __func__);
\
+   barrier(); \
panic("BUG!"); \
 } while (0)
 #endif

can work around the following set of overly large stack frames:

fs/ext4/inode.c:82:1: warning: the frame size of 1672 bytes is larger than 800
bytes [-Wframe-larger-than=]
fs/ext4/namei.c:434:1: warning: the frame size of 904 bytes is larger than 800
bytes [-Wframe-larger-than=]
fs/ext4/super.c:2279:1: warning: the frame size of 1160 bytes is larger than
800 bytes [-Wframe-larger-than=]
fs/ext4/xattr.c:146:1: warning: the frame size of 1168 bytes is larger than 800
bytes [-Wframe-larger-than=]
fs/f2fs/inode.c:152:1: warning: the frame size of 1424 bytes is larger than 800
bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:1195:1: warning: the frame size of 1068 bytes
is larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_core.c:395:1: warning: the frame size of 1084 bytes is
larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:298:1: warning: the frame size of 928 bytes is
larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_ftp.c:418:1: warning: the frame size of 908 bytes is
larger than 800 bytes [-Wframe-larger-than=]
net/netfilter/ipvs/ip_vs_lblcr.c:718:1: warning: the frame size of 960 bytes is
larger than 800 bytes [-Wframe-larger-than=]
drivers/net/xen-netback/netback.c:1500:1: warning: the frame size of 1088 bytes
is larger than 800 bytes [-Wframe-larger-than=]

and similar patches can be created for architectures not using the generic
implementation. For reference, the above was tested on all architectures that
are supported by mainline versions of both linux and gcc using an
'allmodconfig' build, and the same symptoms were visible on all architectures
using the generic BUG(). I only looked at files that had any functions with
frame sizes over 800 bytes (1000 bytes for 64-bit architectures), 89 files in
total out of 31841 source files that were built, down to 78 with my workaround.

In a single build, around 100 files had functions that get a (mostly minor)
reduction in frame size with my patch, in a few cases the frame sizes appear to
get slightly larger due to different inlining decisions, and in some other
cases including the ones listed above there is a drastic reduction in frame
size of factor two to five.

I have submitted a workaround for the kernel for the original case (involving
strncpy()) and plan to submit another workaround for BUG() now. However, I'd
still like to see this addressed in gcc as well, since that will cover those
instances in other code. I would hope that a simple workaround such as the
patch for PR81715 is possible. This seems to be a related issue with very
similar symptoms.

[Bug sanitizer/83356] [7 Regression] excessive stack usage compiling with -O2 -fsanitize=bounds -fsanitize=object-size

2017-12-18 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

--- Comment #5 from Arnd Bergmann  ---
(In reply to Richard Biener from comment #3)

> That said, can't see an easy workaround but to change the source and/or
> not use -fsanitize= and expect decent code quality.

I don't see a good way to modify the source code without risking performance
regressions in other configurations (arch, compiler version or flags), and the
AES cipher is a rather performance critical part of the kernel.

I have experimentally shown that passing either -fno-tree-sra and/or
-fno-tree-pre improves the stack usage in this file significantly (312 bytes
with ubsan, 192 bytes without), though it's still around twice as much as with
gcc-7.1.1, both with and without sanitizers.

I have not yet tried to analyze or measure the resulting object code though.

[Bug target/83485] New: cris: ICE in extract_insn, at recog.c:2311

2017-12-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83485

Bug ID: 83485
   Summary: cris: ICE in extract_insn, at recog.c:2311
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 42918
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42918&action=edit
linux/drivers/tty/serial/8250/8250_core.c, preprocessed

I ran into one ICE while cross-compiling an allmodconfig linux kernel for cris,
this happened with both gcc-4.9.3 and gcc-7.2.1, but not with gcc-4.3.6. I
don't care about this bug, but it seems better to report it. PR69040 seems to
be another bug that also results in an ICE, but I could not reproduce that one
with current linux sources.

$ cris-linux-gcc-7.2.1 -Wall -O2 8250_core.i -c
In file included from /git/arm-soc/include/linux/srcu.h:34:0,
 from /git/arm-soc/include/linux/notifier.h:16,
 from /git/arm-soc/include/linux/memory_hotplug.h:7,
 from /git/arm-soc/include/linux/mmzone.h:775,
 from /git/arm-soc/include/linux/gfp.h:6,
 from /git/arm-soc/include/linux/umh.h:4,
 from /git/arm-soc/include/linux/kmod.h:22,
 from /git/arm-soc/include/linux/module.h:13,
 from /git/arm-soc/drivers/tty/serial/8250/8250_core.c:17:
/git/arm-soc/include/linux/workqueue.h: In function 'work_static':
/git/arm-soc/include/linux/workqueue.h:198:2: warning: dereferencing
type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]
  return *work_data_bits(work) & WORK_STRUCT_STATIC;
  ^~
/git/arm-soc/drivers/tty/serial/8250/8250_core.c: In function
'serial8250_unregister_port':
/git/arm-soc/drivers/tty/serial/8250/8250_core.c:1098:1: error: unrecognizable
insn:
 }
 ^
(insn 121 120 92 5 (set (reg:QI 9 r9 [95])
(ior:QI (mem:QI (reg/f:SI 10 r10 [87]) [94 MEM[(struct uart_8250_port
*)&serial8250_ports][line_9(D)].port.quirks+0 S1 A8])
(reg:QI 9 r9 [89])))
"/git/arm-soc/drivers/tty/serial/8250/8250_core.c":496 -1
 (nil))
/git/arm-soc/drivers/tty/serial/8250/8250_core.c:1098:1: internal compiler
error: in extract_insn, at recog.c:2311
0x9424f3 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/arnd/git/gcc/gcc/rtl-error.c:108
0x942529 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/arnd/git/gcc/gcc/rtl-error.c:116
0x910039 extract_insn(rtx_insn*)
/home/arnd/git/gcc/gcc/recog.c:2311
0x9100a1 extract_insn_cached(rtx_insn*)
/home/arnd/git/gcc/gcc/recog.c:2201
0x727449 cleanup_subreg_operands(rtx_insn*)
/home/arnd/git/gcc/gcc/final.c:3152
0x90e58c split_insn
/home/arnd/git/gcc/gcc/recog.c:2925
0x91230c split_all_insns()
/home/arnd/git/gcc/gcc/recog.c:2978
0x9123b8 execute
/home/arnd/git/gcc/gcc/recog.c:3983
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ /home/arnd/cross-gcc/bin/cris-linux-gcc-4.9.3 -Wall -O2 8250_core.i -c
/git/arm-soc/drivers/tty/serial/8250/8250_core.c: In function
'serial8250_unregister_port':
/git/arm-soc/drivers/tty/serial/8250/8250_core.c:1098:1: error: unrecognizable
insn:
 }
 ^
(insn 122 121 92 5 (set (reg:QI 13 r13 [orig:93 D.32632 ] [93])
(ior:QI (mem/j:QI (reg/f:SI 9 r9 [85]) [0 MEM[(struct uart_8250_port
*)&serial8250_ports][line_3(D)].port.quirks+0 S1 A8])
(reg:QI 13 r13 [87])))
/git/arm-soc/drivers/tty/serial/8250/8250_core.c:496 -1
 (nil))
/git/arm-soc/drivers/tty/serial/8250/8250_core.c:1098:1: internal compiler
error: in extract_insn, at recog.c:2202
0x7e5295 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/home/arnd/git/gcc/gcc/rtl-error.c:109
0x7e52c9 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/home/arnd/git/gcc/gcc/rtl-error.c:117
0x7b6fa4 extract_insn(rtx_def*)
/home/arnd/git/gcc/gcc/recog.c:2202
0x7b7024 extract_insn_cached(rtx_def*)
/home/arnd/git/gcc/gcc/recog.c:2105
0x664e8d cleanup_subreg_operands(rtx_def*)
/home/arnd/git/gcc/gcc/final.c:3063
0x7b4b9c split_insn
/home/arnd/git/gcc/gcc/recog.c:2920
0x7b9254 split_all_insns()
/home/arnd/git/gcc/gcc/recog.c:2974
0x7b92f4 rest_of_handle_split_after_reload
/home/arnd/git/gcc/gcc/recog.c:3925
0x7b92f4 execute
/home/arnd/git/gcc/gcc/recog.c:3954

[Bug middle-end/82365] stack locations are consolidated if noreturn function is on the path

2017-12-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82365

--- Comment #12 from Arnd Bergmann  ---
The first partial workaround for strncpy() got merged into Linux and stable
backports: 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=146734b091430

Submitted a second partial workaround for BUG():
https://patchwork.kernel.org/patch/10123109/

[Bug target/83485] cris: ICE in extract_insn, at recog.c:2311

2017-12-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83485

--- Comment #1 from Arnd Bergmann  ---
Reduced test case:

struct uart_port {
  char quirks;
};
struct uart_8250_port {
  struct uart_port port;
  int em485;
} b[1];
int a, c;
void fn1(void) {
  struct uart_8250_port *d = &b[c];
  d->port.quirks |= a ? 1 : 0;
}

[Bug target/83409] arc: "internal compiler error: in extract_constrain_insn" with -O3

2017-12-20 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83409

Arnd Bergmann  changed:

   What|Removed |Added

   Keywords|needs-reduction |

--- Comment #1 from Arnd Bergmann  ---
Reduced to:

$ arc-elf-gcc-7.2.1  -O2 -c scatterlist.i

struct scatterlist {
  long sg_magic;
  long page_link;
  int offset;
};
int a, b;
int fn2(void);
struct scatterlist g;
void fn1(void) {
  do {
struct scatterlist *c = &g;
a = 0;
for (; a < b; a++)
  c[a].sg_magic = 1;
if (fn2())
  while (10)
;
  } while (1);
}

[Bug sanitizer/83356] [7 Regression] excessive stack usage compiling with -O2 -fsanitize=bounds -fsanitize=object-size

2017-12-20 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

--- Comment #10 from Arnd Bergmann  ---
I sent a (rather crude) workaround to the kernel mailing list now, mainly to
get the crypto maintainers involved in this as well. I also did further testing
and found that in the entire kernel, this is the only place that caused this
particular problem:

- comparing gcc-7.1.1 and gcc-7.2.1, many of the 72326 functions with >64 byte
stack usage use less stack in gcc-7.2, only the AES code uses much more

- In 30 architectures I tested, the stack frames are more than twice as big as
the second-largest function, around 2KB for 32-bit architectures and 4KB for
64-bit architectures. The only exception is aarch64, which uses only 500 bytes
here.

- There are a handful of other functions that show a very substantial stack
size increase with UBSAN (enic_rq_service() 2240 bytes,
fnic_rq_cmpl_handler_cont() 2240 bytes, dsp3780I_EnableDSP() 1016 bytes), but
all of them are the same way in older compilers as well, those are not
regressions. 

- From looking at the assembler output for the AES cipher, it appears that
gcc-7.1 and gcc-8 are both worse than gcc-6, with or without UBSAN, but only
gcc-7.2 with UBSAN is drastically worse. As mentioned in my patch description,
this needs to be tested better. There is a test module in the kernel to measure
the throughput of the AES code, but I have not tried it myself.

https://patchwork.kernel.org/patch/10126567/

[Bug tree-optimization/83651] New: [7.2 regression] 20% slowdown of linux kernel AES cipher

2018-01-02 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

Bug ID: 83651
   Summary: [7.2 regression] 20% slowdown of linux kernel AES
cipher
   Product: gcc
   Version: 7.2.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Following the discussion on PR83356, I did some more performance analysis of
the AES code with various compiler versions, by running the in-kernel crypto
selftest (kvm -kernel linux/arch/x86/boot/bzImage -append "tcrypt.mode=200
tcrypt.sec=1 console=ttyS0"  -nographic -serial mon:stdio), which showed a very
clear slowdown at gcc-7.2 (dated 20171130) compared to 7.1, all numbers are in
cycles/byte for AES256+CBC on a 3.1GHz AMD Threadripper, lower numbers are
better:

default  ubsan patchedpatched+ubsan
gcc-4.3.6 -O214.9   14.9 
gcc-4.6.4 -O215.0   15.8 
gcc-4.9.4 -O215.520.7   15.9 20.9
gcc-5.5.0 -O215.647.3   86.4 48.8
gcc-6.3.1 -O214.649.4   94.3 50.9
gcc-7.1.1 -O213.554.6   15.2 52.0
gcc-7.2.1 -O216.8   124.7   92.0 52.2
gcc-8.0.0 -O214.656.6   15.3 53.5
gcc-7.1.1 -O114.653.8
gcc-7.2.1 -O115.555.9
gcc-8.0.0 -O115.050.7
clang-5 -O1  21.758.3
clang-5 -O2  15.549.1
handwritten asm  16.4

The 'patched' columns are with '-ftree-pre and -ftree-sra' disabled in the
sources, which happened to help on gcc-7.2.1 for performance and to work around
PR83356 but made things worse for most other cases.

For better reproducibility, I tried doing the same with the libressl
implementation of the same cipher, which also has interesting but unfortunately
very different results:

gcc-5.5.0 -O249.0
gcc-6.3.1 -O248.8
gcc-7.1.1 -O259.7
gcc-7.2.1 -O260.3
gcc-8.0.0 -O259.6

gcc-5.5.0 -O159.5
gcc-6.3.1 -O148.5
gcc-7.1.1 -O151.6
gcc-7.2.1 -O151.6
gcc-8.0.0 -O151.6

The source code is apparently derived from a common source, but has evolved in
different ways, and the version from the kernel appears to be much faster
overall. In both cases, we see a ~20% degradation between gcc-6.3.1 and
gcc-7.2.1, but gcc-7.1.1 happens to produce the best results for the kernel
version and very bad results for the libressl sources. The stack consumption
problem from PR83356 does not appear with the libressl sources. I have not
managed to run a ubsan-enabled libressl binary for testing.

To put this in context, both libressl and Linux come with architecture-specific
versions using SIMD registers for most architectures, and those tend to be much
faster, but the C version is used on old x86 CPUs and minor architectures that
lack SIMD registers or an AES implementation for them.

If there is enough interest in addressing the slowdown, it should be possible
to create a version of the kernel AES implementation that can be run in user
space, as the current method of reproducing the results is fairly tedious.

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-05 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #1 from Arnd Bergmann  ---
Before posting a new workaround for PR83356 (the workaround is to use -Os
instead of O2 for this file), I retested the performance numbers as well, and
got slightly different numbers this time. I don't know what caused that
difference, but now this is what I see is slightly different:


  -O2 -Os
  gcc-6.3.1   14.915.1
  gcc-7.0.1   14.715.3
  gcc-7.1.1   15.314.7
  gcc-7.2.1   16.815.9
  gcc-8.0.0   15.515.6

In particular, the gcc-7.1.1 results are a bit worse than they were, leading to
a less significant regression from 7.1.1 to 7.2.2, and the numbers are now
closer to what I saw with libressl. In both cases, we still have a 5% to 9%
regression between gcc-7.1.1 (20170717) and gcc-7.2.1 (20180102), and a 14% to
23% regression between 6.3.1 and 7.2.1.

I also found my mistake in the libressl numbers I showed in comment #1, they
are listed exactly factor 3 higher than they should have been, and the actual
results are close to the kernel implementation. I've measure these again now as
well and come to the following results, using identical compilers as above:

  -O2 -Os
  gcc-6.3.1   16.716.7
  gcc-7.0.1   17.516.0
  gcc-7.1.1   17.516.0
  gcc-7.2.1   17.616.0
  gcc-8.0.0   16.815.5

To reproduce with libressl, one could use the following steps:

$ git clone https://github.com/libressl-portable/portable.git
$ cd portable
$ ./autogen.sh
$ sed -i 's/undef FULL_UNROLL/define FULL_UNROLL/' crypto/aes/aes_locl.h
$ CC=x86_64-linux-gcc-7.2.1 ./configure --disable-asm
$ make -sj8
$ ./apps/openssl/openssl speed aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
aes-256 cbc 168004.61k   174024.74k   174855.76k   176270.13k   176608.14k
$ CC=x86_64-linux-gcc-6.3.1 ./configure --disable-asm
$ touch crypto/aes/aes_core.c 
$ make -sj8
$ ./apps/openssl/openssl speed aes-256-cbc
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
aes-256 cbc 175366.81k   182261.29k   183131.80k   184369.21k   184611.37k

[Bug middle-end/81897] [6/7 Regression] spurious -Wmaybe-uninitialized warning

2018-01-08 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81897

--- Comment #16 from Arnd Bergmann  ---
Created attachment 43056
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43056&action=edit
linux/net/ipv6/route.c, preprocessed and compressed

To test the patch, I reverted the workaround that was added to the kernel when
I originally reported this. Unfortunately the warning is still there, only the
reduced version is fixed. I attached the preprocessed source now, test with

$ x86_64-linux-gcc-8.0.0 -fno-strict-aliasing -Wall -O2 -Wno-pointer-sign -c
route-1.i
/git/arm-soc/net/ipv6/route.c: In function 'inet6_rtm_getroute':
/git/arm-soc/net/ipv6/route.c:4350:9: warning: 'dst' may be used uninitialized
in this function [-Wmaybe-uninitialized]
   goto errout;
 ^~
  }

Reducing this with the latest gcc-8.0.0 snapshot gave me

enum { true } fn1();
int inet6_rtm_getroute_iif, inet6_rtm_getroute_rt, inet6_rtm_getroute_rtm_0;
int *inet6_rtm_getroute___trans_tmp_8;
int fn2();
void fn3() {
  int *dst;
  _Bool fibmatch = inet6_rtm_getroute_rtm_0 & 2;
  if (inet6_rtm_getroute_iif) {
if (!fibmatch)
  dst = inet6_rtm_getroute___trans_tmp_8;
static _Bool __warned;
fn2() && __warned &&fn1();
__warned = true;
  } else if (!fibmatch)
dst = 0;
  if (fibmatch)
dst = 0;
  inet6_rtm_getroute_rt = *dst;
}

[Bug middle-end/81897] [6/7 Regression] spurious -Wmaybe-uninitialized warning

2018-01-08 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81897

--- Comment #17 from Arnd Bergmann  ---
Created attachment 43057
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43057&action=edit
linux/drivers/scsi/lpfc/lpfc_bsg.c, preprocessed and compressed

A possibly related warning I just saw this week, with and without the gcc
patch, and back to gcc-4.5+:

x86_64-linux-gcc-8.0.0 -fno-strict-aliasing -Wall -O2 -Wno-pointer-sign -m32 -c
-xc lpfc_bsg.i
/git/arm-soc/drivers/scsi/lpfc/lpfc_bsg.c: In function
'lpfc_bsg_rport_els_cmp':
/git/arm-soc/drivers/scsi/lpfc/lpfc_bsg.c:632:22: warning: 'bsg_reply' may be
used uninitialized in this function [-Wmaybe-uninitialized]

reduced to
8<
long current_stack_pointer, lpfc_bsg_rport_els_cmp_rsp_0;
long pv_irq_ops_1_0, pv_irq_ops_0_0;
struct bsg_job {
  void *reply;
};
struct fc_bsg_reply {
  int reply_payload_rcv_len;
  struct bsg_job set_job;
} * lpfc_bsg_rport_els_cmp_dd_data;
static void check(long v)
{
  if (v) {
asm("");
__builtin_unreachable();
  }
}
void lpfc_bsg_rport_els_cmp(void) {
  struct bsg_job *job;
  struct fc_bsg_reply *bsg_reply;
  job = &lpfc_bsg_rport_els_cmp_dd_data->set_job;
  if (job)
bsg_reply = job->reply;
  check(pv_irq_ops_0_0);
  check(pv_irq_ops_1_0);
  asm("" : "+r"(current_stack_pointer));
  check(pv_irq_ops_0_0);
  check(pv_irq_ops_1_0);
  if (job) {
if (lpfc_bsg_rport_els_cmp_rsp_0 == 0)
  bsg_reply->reply_payload_rcv_len = 0;
else if (lpfc_bsg_rport_els_cmp_rsp_0 == 1)
  bsg_reply->reply_payload_rcv_len = 0;
  }
}
>8
This looked similar to the originally reported symptom in the source code. The
reduced version looks quite a bit different, so no idea if this is something
else entirely or not.

[Bug sanitizer/83356] [7 Regression] excessive stack usage compiling with -O2 -fsanitize=bounds -fsanitize=object-size

2018-01-12 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

--- Comment #11 from Arnd Bergmann  ---
The second version of my workaround (build with 'gcc -Os' on gcc-7.1+) was
merged into mainline linux: https://patchwork.kernel.org/patch/10143607/

[Bug sanitizer/83356] [7 Regression] excessive stack usage compiling with -O2 -fsanitize=bounds -fsanitize=object-size

2018-01-14 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83356

--- Comment #12 from Arnd Bergmann  ---
Unfortunately that patch caused a regression (nothing to do with the compiler
really, just the way powerpc linux uses some libgcc functions), and I've done
some more investigation. The new finding is that selectively turning off
'-fcode-hoisting' on gcc-7.2.1 restores the behavior we had on gcc-7.2.1,
solving both the stack consumption problem with UBSAN and the performance
regression (with and without UBSAN) as described in pr83651, with performance
better than my previous workarounds that replaced -O2 with -Os, -O1, or -O2
-fno-tree-sra -fno-tree-pre.

I have to do more performance tests to send an updated kernel patch, but maybe
this already helps analyze the problem further.

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-15 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #2 from Arnd Bergmann  ---
My kernel patch to use -Os got merged, but caused a regression, so I kept
experimenting with the libressl implementation. Apparently, turning off
-fcode-hoisting is a better way address PR83356, and the performance is the
same as with -Os: New numbers with libressl, same method as before:

  -O2 -Os-O2 -fno-code-hoisting
  gcc-6.3.1   16.716.7-
  gcc-7.0.1   17.516.016.0
  gcc-7.1.1   17.516.016.0
  gcc-7.2.1   17.516.016.0
  gcc-8.0.0   16.815.515.5

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-17 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #4 from Arnd Bergmann  ---
(In reply to Aldy Hernandez from comment #3)
> (In reply to Arnd Bergmann from comment #0)
> 
> > If there is enough interest in addressing the slowdown, it should be
> > possible to create a version of the kernel AES implementation that can be
> > run in user space, as the current method of reproducing the results is
> > fairly tedious.
> 
> I would say that a 20% slowdown is significant enough that we should
> definitely look into this.  A user space version would help immensely here.

The 20% number I got was from 7.1.1 to 7.2.1, but I can't reproduce the
7.1.1 performance any more, so it's possible that this was supposed to be
15.3 cycles instead of 13.5 cycles, but we'd still have a 13% regression
using the kernel implementation, and a 9% regression with libressl, which is
probably still significant.

> > The source code is apparently derived from a common source, but has evolved
> > in different ways, and the version from the kernel appears to be much faster
> > overall. 
> 
> It looks like you have various benchmarks based on different code bases. 
> This is not good for reproduceability and diagnosing the problem.  Could we
> settle on one, and ideally a (simple) user space version?  This will
> drastically increase the likelihood of finding a solution :).

I'd suggest sticking with the libressl test case from comment 1, and ignoring
the kernel version until the libressl one is fully understood. It seems very
likely that fixing one will also address the other.

Are you able to start with the test procedure from comment 1, or do you need
something that can be scripted better, e.g. in a single C file?

> Also, is this a GCC 8 regression?  It looks like in most of the benchmarks
> you post, GCC 8 performs pretty close to 4.x.  Again, settling on one
> benchmark, preferably in user space, would really help.

I had originally classified it as "7.2 regression", Richard changed it to "7/8
regression", which I think is correct: The problem is almost certainly the
"-fcode-hoisting" optimization step, and both gcc-7 and gcc-8 show about a 10%
difference between the normal "-O2" and "-O2 -fno-code-hoisting", it's just
that gcc-8 is faster overall.

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-18 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #6 from Arnd Bergmann  ---
Created attachment 43177
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43177&action=edit
Single-file version of aes benchmark

I've managed to condense the 'openssl speed aes-256-cbc' test into a single
file now:

$ x86_64-linux-gcc-7.2.1 -Wall -O2 aes_test.c -o aes_test
$ time ./aes_test
real0m4.499s
user0m4.498s
sys 0m0.000s

$ x86_64-linux-gcc-7.2.1 -Wall -O2 aes_test.c -o aes_test -fno-code-hoisting
$ time ./aes_test
real0m4.135s
user0m4.134s
sys 0m0.000s

The test is hardcoded to do 10 runs of of 8192 bytes, so on my 3.1GHz CPU
that translates cyles:
$ echo $[4999 * 31000  / 81920]
1891 # 18.91 cycles
$ echo $[4135 * 31000  / 81920]
1564 # 15.6 cycles

Similar results with gcc-8.0.0:

$ x86_64-linux-gcc-8.0.0 -Wall -O2 aes_test.c -o aes_test
$ time ./aes_test
real0m4.471s
user0m4.470s
sys 0m0.000s

$ x86_64-linux-gcc-8.0.0 -Wall -O2 aes_test.c -o aes_test -fno-code-hoisting
$ time ./aes_test
real0m4.052s
user0m4.052s
sys 0m0.000s

Hope that helps

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-18 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

Arnd Bergmann  changed:

   What|Removed |Added

  Attachment #43177|0   |1
is obsolete||

--- Comment #7 from Arnd Bergmann  ---
Created attachment 43178
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43178&action=edit
Single-file version of aes benchmark (shorter)

I decided to strip the test case a bit more down by removing the decryption
side that was unused, and checked that nothing else has changed.

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #11 from Arnd Bergmann  ---
Trying out the patch from comment 10 on the original preprocessed source as
attached to pr83356 also shows very noticeable improvements with stack spilling
there:

x86_64-linux-gcc-6.3.1 -Wall -O2 -S ./aes_generic.i  -Wframe-larger-than=10
-fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp
aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 48 bytes
is larger than 10 bytes [-Wframe-larger-than=]
4075

x86_64-linux-gcc-7.1.1 -Wall -O2 -S aes_generic.i  -Wframe-larger-than=10
-fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp
aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 304 bytes
is larger than 10 bytes [-Wframe-larger-than=]
 }
4141

x86_64-linux-gcc-7.2.1 -Wall -O2 -S aes_generic.i  -Wframe-larger-than=10
-fsanitize=bounds -fsanitize=object-size -fno-strict-aliasing ; grep rsp
aes_generic.s | wc -l
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 3840 bytes
is larger than 10 bytes [-Wframe-larger-than=]
10351

# same as x86_64-linux-gcc-7.2.1 but with patch from comment 10:
./xgcc -Wall -O2 -S ./aes_generic.i  -Wframe-larger-than=10 -fsanitize=bounds
-fsanitize=object-size -fno-strict-aliasing ; grep rsp aes_generic.s | wc -l 
/git/arm-soc/crypto/aes_generic.c: In function 'aes_encrypt':
/git/arm-soc/crypto/aes_generic.c:1371:1: warning: the frame size of 272 bytes
is larger than 10 bytes [-Wframe-larger-than=]
4739

My interpretation is that there are two distinct issues: both AES
implementations (libressl and linux-kernel) suffer from a 5% to 10% regression
that is triggered by the combination of -ftree-pre and -fcode-hoisting, but
only the kernel implementation suffers from a second issue that Martin Liška
traced back to r251376. This results in another few percents of slowdown in
gcc-7.2.1  and an factor 2.3x slowdown (and corresponding increase in stack
accesses) when -fsanitize=bounds -fsanitize=object-size gets enabled.

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #13 from Arnd Bergmann  ---
Created attachment 43185
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43185&action=edit
Linux kernel version of AES algorithm, ported to standalone executable

I've had another look at extracting a test case from the Linux kernel copy of
this code. This now also shows the gcc-7.2.1 specific problem:

$ x86_64-linux-gcc-7.1.1 -Wall -O2 -fsanitize=bounds -fsanitize=object-size
aes_generic.c -o aes_generic; time ./aes_generic
real0m9.406s

$ x86_64-linux-gcc-7.1.1 -Wall -O2 -fsanitize=bounds -fsanitize=object-size
aes_generic.c -o aes_generic -fno-code-hoisting; time ./aes_generic
real0m8.318s

$ x86_64-linux-gcc-7.2.1 -Wall -O2 -fsanitize=bounds -fsanitize=object-size
aes_generic.c -o aes_generic; time ./aes_generic
real0m22.151s

$ x86_64-linux-gcc-7.2.1 -Wall -O2 -fsanitize=bounds -fsanitize=object-size
aes_generic.c -o aes_generic -fno-code-hoisting; time ./aes_generic
real0m8.439s

$ x86_64-linux-gcc-7.1.1 -Wall -O2 aes_generic.c -o aes_generic ; time
./aes_generic
real0m3.031s

$ x86_64-linux-gcc-7.1.1 -Wall -O2 aes_generic.c -o aes_generic
-fno-code-hoisting ; time ./aes_generic
real0m2.894s

$ x86_64-linux-gcc-7.2.1 -Wall -O2 aes_generic.c -o aes_generic  ; time
./aes_generic
real0m3.307s

$ x86_64-linux-gcc-7.2.1 -Wall -O2 aes_generic.c -o aes_generic
-fno-code-hoisting ; time ./aes_generic
real0m2.875s

[Bug tree-optimization/83651] [7/8 regression] 20% slowdown of linux kernel AES cipher

2018-01-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83651

--- Comment #15 from Arnd Bergmann  ---
(In reply to rguent...@suse.de from comment #14)

> Would be nice if somebody can bisect it.  It doesn't look like a PRE
> specific issue because there's no relevant PRE changes in the rev. range.
> I can't reproduce the slowdown when comparing 7.1.0 against 7.2.0
> btw, so the regression must occur somewhere between 7.2.0 and now
> (or 7.1.1 got faster for a few revs).

I've checked r251376 (the one I mentioned in comment #11), and confirmed that
this caused the difference between my old 7.1.1 and the current 7.2.1.

[Bug target/84038] New: powerpc-linux-gcc gets stuck building linux kernel

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84038

Bug ID: 84038
   Summary: powerpc-linux-gcc gets stuck building linux kernel
   Product: gcc
   Version: 7.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 43240
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43240&action=edit
linux/kernel/cpu.c, preprocessed and compressed

I tried building all powerpc 'defconfig' kernels with gcc-7.3.0, and got stuck
in mpc866_ads_defconfig while compiling linux/kernel/cpu.c

Build with 'powerpc-linux-gcc-7.3.0 -O2 -c cpu.i'. Passing '-O2
-finline-functions' avoids the problem. 'perf top' during the gcc run shows

  11.30%  cc1 [.] fast_dce
   8.63%  cc1 [.] df_worklist_dataflow
   7.07%  cc1 [.] volatile_refs_p
   6.23%  cc1 [.] prescan_insns_for_dce
   5.94%  cc1 [.] deletable_insn_p
   5.31%  cc1 [.] bitmap_set_bit

I started manually reducing the file. The only other compiler versions I tried
were a 7.2.1 snapshot (same result) and an old 5.4.0 compiler (no problems).

[Bug rtl-optimization/83985] [8 Regression] Compile time hog for 32-bit BE powerpc targets

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83985

Arnd Bergmann  changed:

   What|Removed |Added

 CC||arnd at linaro dot org

--- Comment #2 from Arnd Bergmann  ---
Reproduced with gcc-7.3.0 as well.

[Bug rtl-optimization/83985] [8 Regression] Compile time hog for 32-bit BE powerpc targets

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83985

Arnd Bergmann  changed:

   What|Removed |Added

 CC||segher at kernel dot 
crashing.org

--- Comment #3 from Arnd Bergmann  ---
Bisected to "rs6000: Separate shrink-wrapping" (r241065), starting from 7.3.
Adding Segher to Cc. This is not the same commit that caused pr84038 though,
that one started later.

[Bug rtl-optimization/84038] [7/8 Regression] powerpc-linux-gcc gets stuck building linux kernel

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84038

Arnd Bergmann  changed:

   What|Removed |Added

   Keywords|needs-bisection |

--- Comment #2 from Arnd Bergmann  ---
bisected this one to r244207, and bisected pr83985 to something earlier.

[Bug rtl-optimization/83985] [8 Regression] Compile time hog for 32-bit BE powerpc targets

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83985

--- Comment #7 from Arnd Bergmann  ---
*** Bug 84038 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/84038] [7/8 Regression] powerpc-linux-gcc gets stuck building linux kernel

2018-01-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84038

Arnd Bergmann  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #3 from Arnd Bergmann  ---
(In reply to Arnd Bergmann from comment #2)
> bisected this one to r244207, and bisected pr83985 to something earlier.

attachment 43241 that jakub created for pr83985 addresses this one as well,
closing as duplicate.

*** This bug has been marked as a duplicate of bug 83985 ***

[Bug middle-end/84095] New: false-positive -Wrestrict warnings for memcpy within array

2018-01-28 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

Bug ID: 84095
   Summary: false-positive -Wrestrict warnings for memcpy within
array
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

I see multiple new warnings for correct code in the Linux kernel for code that
copies one array member into other members of the same array, reduced to:

$ cat > test.c << EOF
struct { int i; } a[8];

void f(void)
{
int i;

for (i=1; i <8; i++)
__builtin_memcpy(&a[i], &a[0], sizeof(a[0]));
}
EOF

$ x86_64-linux-gcc-8.0.1 -c -Wall test.c 
test4.c: In function 'f':
test4.c:8:3: warning: '__builtin_memcpy' accessing 4 bytes at offsets 0 and 0
overlaps 4 bytes at offset 0 [-Wrestrict]
   __builtin_memcpy(&a[i], &a[0], sizeof(a[0]));
   ^~~~

[Bug lto/84105] New: [8 regression] Segmentation fault in pp_tree_identifier() during LTO

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84105

Bug ID: 84105
   Summary: [8 regression] Segmentation fault in
pp_tree_identifier() during LTO
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: marxin at gcc dot gnu.org
  Target Milestone: ---

I got an ICE while building the linux kernel module net/sctp/sctp.ko with
i386-linux-gcc-8.0.1, currently using r257114. A slightly older gcc-8.0.0
(dated 20180107, exact revision unknown) doesn't have this problem.

  /bin/bash /git/arm-soc/scripts/gcc-ld -fuse-linker-plugin -flto=jobserver
-flto  -fno-strict-aliasing -fno-fat-lto-objects -Wno-attribute-alias
-fwhole-program  -fno-strict-aliasing -fdump-ipa-cgraph
-fdump-ipa-inline-details -fipa-cp-clone -r -m elf_i386 -T
/git/arm-soc/scripts/module-common.lds --build-id  -o net/sctp/sctp.ko
net/sctp/sctp.o net/sctp/sctp.mod.o ;  true
during IPA pass: inline
dump file: net/sctp/sctp.ko.ltrans0.079i.inline
/git/arm-soc/net/sctp/sm_sideeffect.c: In function 'sctp_do_sm':
/git/arm-soc/net/sctp/sm_sideeffect.c:1155:5: internal compiler error:
Segmentation fault
 int sctp_do_sm(struct net *net, enum sctp_event event_type,
 ^
0xa42b7f crash_signal
/home/arnd/git/gcc/gcc/toplev.c:325
0xaf0659 pp_tree_identifier(pretty_printer*, tree_node*)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:4006
0xaf0966 dump_decl_name
/home/arnd/git/gcc/gcc/tree-pretty-print.c:261
0xaf42ea dump_generic_node(pretty_printer*, tree_node*, int, unsigned long,
bool)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:1826
0xaf769a print_declaration(pretty_printer*, tree_node*, int, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:
0xaf7997 print_generic_decl(_IO_FILE*, tree_node*, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:122
0xb4603a dump_scope_block
/home/arnd/git/gcc/gcc/tree-ssa-live.c:647
0xb471b9 dump_scope_blocks(_IO_FILE*, unsigned long)
/home/arnd/git/gcc/gcc/tree-ssa-live.c:678
0xb471b9 remove_unused_locals()
/home/arnd/git/gcc/gcc/tree-ssa-live.c:870
0x97af44 execute_function_todo
/home/arnd/git/gcc/gcc/passes.c:1972
0x97b8b9 execute_todo
/home/arnd/git/gcc/gcc/passes.c:2048
0x97dac5 execute_one_ipa_transform_pass
/home/arnd/git/gcc/gcc/passes.c:2245
0x97dac5 execute_all_ipa_transforms()
/home/arnd/git/gcc/gcc/passes.c:2281
0x6d681c cgraph_node::expand()
/home/arnd/git/gcc/gcc/cgraphunit.c:2132
0x6d7b38 expand_all_functions
/home/arnd/git/gcc/gcc/cgraphunit.c:2275
0x6d7b38 symbol_table::compile()
/home/arnd/git/gcc/gcc/cgraphunit.c:2624
0x656c51 lto_main()
/home/arnd/git/gcc/gcc/lto/lto.c:3349
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

I have not been able to create a simple test case for it, but can provide steps
for reproducing, or help test patches. If necessary, I can do a bisection, but
maybe someone can see from the backtrace what is happening, or has a duplicate
bugreport.

>From what I can tell, the ICE is caused by a typedef inside of a function,
moving the typedef outside of the function avoids the problem. See the source
code at:

https://elixir.free-electrons.com/linux/v4.15/source/net/sctp/sm_sideeffect.c#L1172

[Bug c/84108] New: incorrect -Wattributes warning for packed/aligned conflict

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84108

Bug ID: 84108
   Summary: incorrect -Wattributes warning for packed/aligned
conflict
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Marking a struct member as both 'packed' and 'aligned()' triggers a warning in
gcc-8:

struct s {
char x;
int y __attribute__((packed, aligned(2)));
int z __attribute__((aligned(2), packed));
} s;

x86_64-linux-gcc-8.0.1 -c -Wall test.c  -O2 -c
test5.c:3:2: warning: ignoring attribute 'aligned' because it conflicts with
attribute 'packed' [-Wattributes]
  int y __attribute__((packed, aligned(2)));
  ^~~
test5.c:4:2: warning: ignoring attribute 'packed' because it conflicts with
attribute 'aligned' [-Wattributes]
  int z __attribute__((aligned(2), packed));
  ^~~

The warning appears to be a mistake, since neither attribute is in fact ignored
here, and both 'y' and 'z' are aligned to two bytes, which matches the gcc
documentation and the behavior of older versions.

[Bug c/84108] [8 Regression] incorrect -Wattributes warning for packed/aligned conflict on struct members

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84108

--- Comment #3 from Arnd Bergmann  ---
(In reply to Jakub Jelinek from comment #1)
> I vaguely remember the behavior of packed + aligned(N) kept changing in the
> past, some versions of GCC treated it just like packed, others as aligned.
> Is this in the kernel?

Yes. The example I showed corresponds to include/linux/sysv_fs.h, which
probably has very few users, if any. If this was inconsistent in the past, it
may have gone unnoticed. OTOH, there is a compile-time assertion in the code to
ensure that the structure has the right size:

#define __packed2__ __attribute__((packed,aligned(2)))
#define XENIX_NICINOD   100 /* number of inode cache entries */
#define XENIX_NICFREE   100 /* number of free block list chunk entries */
struct xenix_super_block {
__fs16  s_isize; /* index of first data zone */
__fs32  s_fsize __packed2__; /* total number of zones of this
fs */
/* the start of the free block list: */
__fs16  s_nfree;/* number of free blocks in s_free, <=
XENIX_NICFREE */
sysv_zone_t s_free[XENIX_NICFREE]; /* first free block list chunk
*/
/* the cache of free inodes: */
__fs16  s_ninode; /* number of free inodes in s_inode, <=
XENIX_NICINOD */
sysv_ino_t  s_inode[XENIX_NICINOD]; /* some free inodes */
/* locks, not used by Linux: */
chars_flock;/* lock during free block list
manipulation */
chars_ilock;/* lock during inode cache manipulation
*/
chars_fmod; /* super-block modified flag */
chars_ronly;/* flag whether fs is mounted read-only
*/
__fs32  s_time __packed2__; /* time of last super block update
*/
__fs32  s_tfree __packed2__; /* total number of free zones */
__fs16  s_tinode;   /* total number of free inodes */
__fs16  s_dinfo[4]; /* device information ?? */
chars_fname[6]; /* file system volume name */
chars_fpack[6]; /* file system pack name */
chars_clean;/* set to 0x46 when filesystem is
properly unmounted */
chars_fill[371];
s32 s_magic;/* version of file system */
__fs32  s_type; /* type of file system: 1 for 512 byte
blocks
2 for 1024 byte
blocks
3 for 2048 byte
blocks */

};
BUILD_BUG_ON(1024 != sizeof (struct xenix_super_block));

Two other variations of this warning that I ran into are slightly different:
fs/ubifs/ubifs-media.h and include/scsi/libsas.h define a structure with
__attribute__((packed)) and use that structure as a member in another
structure, with that member being marked __attribute__((aligned)). I would hope
that this behavior has never changed.

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #3 from Arnd Bergmann  ---
(In reply to Martin Sebor from comment #2)
> (In reply to Arnd Bergmann from comment #0)
> 
> Let me work on this.
> 
> I tested the warning with the kernel but don't recall coming across this
> false positive.  While I retry with the latest, how many of these do you see?

They are fairly rare, I have seen four or five now with randconfig builds over
the last two days. It's possible that there is none in the default config.

[Bug lto/84105] [8 regression] Segmentation fault in pp_tree_identifier() during LTO

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84105

--- Comment #2 from Arnd Bergmann  ---
Created attachment 43281
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43281&action=edit
preprocessed simplified sm_sideeffect.c, compressed

I managed to get a standalone testcase now, manually reduced the original
source and then preprocessed. Further automated reduction should be easy now if
necessary.

x86_64-linux-gcc-8.0.1 -c -m32 -o lto-test.o lto-test.i -O2 -flto
-Wno-pointer-sign -fno-strict-aliasing
x86_64-linux-gcc-8.0.1  -fdump-ipa-inline-details   -m32 -r-o sctp.ko
lto-test.o

The part that goes wrong is apparently the '-fdump-ipa-inline-details'.

/git/arm-soc/net/sctp/lto-test.c: In function 'sctp_do_sm':
/git/arm-soc/net/sctp/lto-test.c:120:5: internal compiler error: Segmentation
fault
 int sctp_do_sm(struct net *net, enum sctp_event event_type,
 ^
0xa42b7f crash_signal
/home/arnd/git/gcc/gcc/toplev.c:325
0xaf0659 pp_tree_identifier(pretty_printer*, tree_node*)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:4006
0xaf0966 dump_decl_name
/home/arnd/git/gcc/gcc/tree-pretty-print.c:261
0xaf42ea dump_generic_node(pretty_printer*, tree_node*, int, unsigned long,
bool)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:1826
0xaf769a print_declaration(pretty_printer*, tree_node*, int, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:
0xaf7997 print_generic_decl(_IO_FILE*, tree_node*, unsigned long)
/home/arnd/git/gcc/gcc/tree-pretty-print.c:122
0xb4603a dump_scope_block
/home/arnd/git/gcc/gcc/tree-ssa-live.c:647
0xb471b9 dump_scope_blocks(_IO_FILE*, unsigned long)
/home/arnd/git/gcc/gcc/tree-ssa-live.c:678
0xb471b9 remove_unused_locals()
/home/arnd/git/gcc/gcc/tree-ssa-live.c:870
0x97af44 execute_function_todo
/home/arnd/git/gcc/gcc/passes.c:1972
0x97b8b9 execute_todo
/home/arnd/git/gcc/gcc/passes.c:2048
0x97dac5 execute_one_ipa_transform_pass
/home/arnd/git/gcc/gcc/passes.c:2245
0x97dac5 execute_all_ipa_transforms()
/home/arnd/git/gcc/gcc/passes.c:2281
0x6d681c cgraph_node::expand()
/home/arnd/git/gcc/gcc/cgraphunit.c:2132
0x6d7b38 expand_all_functions
/home/arnd/git/gcc/gcc/cgraphunit.c:2275
0x6d7b38 symbol_table::compile()
/home/arnd/git/gcc/gcc/cgraphunit.c:2624
0x656c51 lto_main()
/home/arnd/git/gcc/gcc/lto/lto.c:3349
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
lto-wrapper: fatal error: /home/arnd/cross-gcc/bin/x86_64-linux-gcc-8.0.1
returned 1 exit status
compilation terminated.
/home/arnd/cross-gcc/lib/gcc/x86_64-linux/8.0.1/../../../../x86_64-linux/bin/ld:
error: lto-wrapper failed

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #5 from Arnd Bergmann  ---
Here are some additional instances in the kernel. I'm currently trying to get a
reliable build first and haven't got a log of all the messages, but there are a
number of changes I did that are related, shutting up the -Wrestrict warning
(some may be -Warray-bounds).

Here are some false-positives:

--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -282,9 +282,7 @@ static void mux_clone(int cpu)
if (!has_mux())
return;

-   memcpy(per_cpu(cpu_msrs, cpu).multiplex,
-  per_cpu(cpu_msrs, 0).multiplex,
-  sizeof(struct op_msr) * model->num_virt_counters);
+   per_cpu(cpu_msrs, cpu).multiplex = per_cpu(cpu_msrs, 0).multiplex;
 }

 #else
@@ -463,6 +461,7 @@ static int nmi_setup(void)
if (!cpu)
continue;

+#ifdef CONFIG_SMP
memcpy(per_cpu(cpu_msrs, cpu).counters,
   per_cpu(cpu_msrs, 0).counters,
   sizeof(struct op_msr) * model->num_counters);
@@ -470,7 +469,7 @@ static int nmi_setup(void)
memcpy(per_cpu(cpu_msrs, cpu).controls,
   per_cpu(cpu_msrs, 0).controls,
   sizeof(struct op_msr) * model->num_controls);
-
+#endif
mux_clone(cpu);
}
--- a/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
+++ b/drivers/staging/rtl8723bs/os_dep/ioctl_cfg80211.c
@@ -1643,7 +1643,7 @@ static int cfg80211_rtw_scan(struct wiphy *wiphy
spin_lock_bh(&pmlmepriv->lock);
if (request->n_channels == 1) {
for (i = 1;in_channels <= 4) {
for (j =request->n_channels-1;j>= 0;j--)


This one was weird, I suspect my change is incorrect:

diff --git a/drivers/fmc/fmc-fakedev.c b/drivers/fmc/fmc-fakedev.c
index 941d0930969a..0d322380d952 100644
--- a/drivers/fmc/fmc-fakedev.c
+++ b/drivers/fmc/fmc-fakedev.c
@@ -305,7 +305,7 @@ static int ff_init(void)

/* Replicate the default eeprom for the max number of mezzanines */
for (i = 1; i < FF_MAX_MEZZANINES; i++)
-   memcpy(ff_eeimg[i], ff_eeimg[0], sizeof(ff_eeimg[0]));
+   memcpy(&ff_eeimg[i][0], &ff_eeimg[0][0], sizeof(ff_eeimg[0]));

if (ff_nr_eeprom > ff_nr_dev)
ff_nr_dev = ff_nr_eeprom;

This one seems to be a kernel bug:

--- a/kernel/debug/kdb/kdb_support.c
+++ b/kernel/debug/kdb/kdb_support.c
@@ -129,13 +129,13 @@ int kdbnearsym(unsigned long addr, kdb_symtab_t *symtab)
}
if (i >= ARRAY_SIZE(kdb_name_table)) {
debug_kfree(kdb_name_table[0]);
-   memcpy(kdb_name_table, kdb_name_table+1,
+   memmove(kdb_name_table, kdb_name_table+1,
   sizeof(kdb_name_table[0]) *
   (ARRAY_SIZE(kdb_name_table)-1));
} else {
debug_kfree(knt1);
knt1 = kdb_name_table[i];
-   memcpy(kdb_name_table+i, kdb_name_table+i+1,
+   memmove(kdb_name_table+i, kdb_name_table+i+1,
   sizeof(kdb_name_table[0]) *
   (ARRAY_SIZE(kdb_name_table)-i-1));
}

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #6 from Arnd Bergmann  ---
I got one file that produces a rather cryptic warning related to this:

In file included from /git/arm-soc/arch/x86/include/asm/page_32.h:35,
 from /git/arm-soc/arch/x86/include/asm/page.h:14,
 from /git/arm-soc/arch/x86/include/asm/thread_info.h:12,
 from /git/arm-soc/include/linux/thread_info.h:38,
 from /git/arm-soc/arch/x86/include/asm/preempt.h:7,
 from /git/arm-soc/include/linux/preempt.h:81,
 from /git/arm-soc/include/linux/spinlock.h:51,
 from /git/arm-soc/include/linux/seqlock.h:36,
 from /git/arm-soc/include/linux/time.h:6,
 from /git/arm-soc/include/linux/stat.h:22,
 from /git/arm-soc/include/linux/module.h:10,
 from /git/arm-soc/drivers/isdn/isdnloop/isdnloop.c:12:
In function 'strcpy',
inlined from 'isdnloop_parse_cmd' at
/git/arm-soc/drivers/isdn/isdnloop/isdnloop.c:900:3:
/git/arm-soc/include/linux/string.h:437:10: error: '__builtin_strcpy' accessing
0 or more bytes at offsets [36, 25] and 446 may overlap up to 0 bytes at offset
[9223372036854775807, -9223372036854775808] [-Werror=restrict]
   return __builtin_strcpy(p, q);

Not sure if gcc should try to avoid that warning or print something more
helpful in that case. The isdnloop code itself is cryptic enough that I'm not
surprised to see gcc get confused as well, and using strncpy() or strncpy()
instead of strcpy() would avoid the warning and improve the source code.

Possibly gcc should not warn about anything involving 'up to 0 bytes' though
;-)

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

Arnd Bergmann  changed:

   What|Removed |Added

 CC||arnd at linaro dot org

--- Comment #14 from Arnd Bergmann  ---
It looks like r255468 broke compilation of a couple of files in the Linux
kernel,
which use a top-level statement like

linux/arch/arm/kvm/hyp/banked-sr.c:
__asm__(".arch_extension virt");

linux/arch/arm/kernel/xscale-cp0.c
asm("   .arch armv5te\n");

to allow compilation for a target other than the one specified by with -march=
to the compiler.

I tried using

#if GCC_VERSION >= 80
#pragma GCC target("arch=armv5te")
#else
asm(".arch armv5te\n");
#endif

but that results in a build failure:
/git/arm-soc/arch/arm/kernel/xscale-cp0.c:21: warning: "__ARM_ARCH" redefined
: note: this is the location of the previous definition
/git/arm-soc/arch/arm/kernel/xscale-cp0.c:21: warning: "__ARM_FEATURE_COPROC"
redefined

and presumably would lead to the while file being built for armv5te, possibly
generating instructions that may be invalid for armv4 or armv4t outside of the
inline assembly that is known to be safe.

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

--- Comment #16 from Arnd Bergmann  ---
Here is a simplified version of the file in question, to try as standalone:

typedef unsigned int u32;
asm(".arch armv5te\n");
extern int cpuid;
static _Bool cpu_is_xscale_family()
{
/* this code must be compiled to execute on other CPUs, so
   we cannot just use -march=armv5te */
switch (cpuid & 0xe000) {
case 0x69052000: /* Intel XScale 1 */
case 0x69054000: /* Intel XScale 2 */
case 0x69056000: /* Intel XScale 3 */
case 0x56056000: /* Marvell XScale 3 */
case 0x56158000: /* Marvell Mohawk */
return 1;
}
return 0;
}
static int cpu_has_iwmmxt(void)
{
u32 lo;
u32 hi;

/*
 * This sequence is interpreted by the DSP coprocessor as:
 *  mar acc0, %2, %3
 *  mra %0, %1, acc0
 *
 * And by the iWMMXt coprocessor as:
 *  tmcrr   wR0, %2, %3
 *  tmrrc   %0, %1, wR0
 */
__asm__ __volatile__ (
"mcrr   p0, 0, %2, %3, c0\n"
"mrrc   p0, 0, %0, %1, c0\n"
: "=r" (lo), "=r" (hi)
: "r" (0), "r" (0x100));

return !!hi;
}
int xscale_cp0_init(void)
{
/* do not attempt to probe iwmmxt on non-xscale family CPUs */
if (!cpu_is_xscale_family())
return 0;

if (!cpu_has_iwmmxt())
return 0;

/* ... start using iwmmxt */

return 0;
}

$ arm-linux-gnueabi-gcc-7.2.1 -Wall -O2 -c test.c -march=armv4t
# no output

$ arm-linux-gnueabi-gcc-8.0.1 -Wall -O2 -c test.c -march=armv4t
/tmp/ccobFwz5.s: Assembler messages:
/tmp/ccobFwz5.s:34: Error: selected processor does not support `mcrr
p0,0,r0,r3,c0' in ARM mode
/tmp/ccobFwz5.s:35: Error: selected processor does not support `mrrc
p0,0,r3,r2,c0' in ARM mode

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

--- Comment #19 from Arnd Bergmann  ---
(In reply to Richard Earnshaw from comment #18)
> 
> So you're changing the targetted architecture behind the compilers back.  Ie
> you're lying to it.  Frankly, you deserve to get burnt if you do things like
> that.

Du you have a suggestion on what to do instead?

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #8 from Arnd Bergmann  ---
Created attachment 43295
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43295&action=edit
linux/drivers/isdn/isdnloop/isdnloop.c, preprocessed, compressed

This is the preprocessed file that showed the funky -Wrestrict warning message:

$ x86_64-linux-gcc-8.0.1 -Wall -O2 isdnloop.i -c -m32 -Wno-pointer-sign
-Wno-unused
In file included from /git/arm-soc/arch/x86/include/asm/page_32.h:35,
 from /git/arm-soc/arch/x86/include/asm/page.h:14,
 from /git/arm-soc/arch/x86/include/asm/thread_info.h:12,
 from /git/arm-soc/include/linux/thread_info.h:38,
 from /git/arm-soc/arch/x86/include/asm/preempt.h:7,
 from /git/arm-soc/include/linux/preempt.h:81,
 from /git/arm-soc/include/linux/spinlock.h:51,
 from /git/arm-soc/include/linux/seqlock.h:36,
 from /git/arm-soc/include/linux/time.h:6,
 from /git/arm-soc/include/linux/stat.h:22,
 from /git/arm-soc/include/linux/module.h:10,
 from /git/arm-soc/drivers/isdn/isdnloop/isdnloop.c:12:
In function 'strcpy',
inlined from 'isdnloop_parse_cmd' at
/git/arm-soc/drivers/isdn/isdnloop/isdnloop.c:900:3,
inlined from 'isdnloop_writecmd' at
/git/arm-soc/drivers/isdn/isdnloop/isdnloop.c:989:5:
/git/arm-soc/include/linux/string.h:437:10: warning: '__builtin_strcpy'
accessing 0 or more bytes at offsets [36, 25] and 446 may overlap up to 0 bytes
at offset [9223372036854775807, -9223372036854775808] [-Wrestrict]
   return __builtin_strcpy(p, q);
  ^~

I did not try to reduce the test case.

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #9 from Arnd Bergmann  ---
I found another false-positive -Wrestrict warning, did a manual reduction. Let
me know if I should better open separate bugs for each test case, or you prefer
to have them all here.

$ aarch64-linux-gcc-8.0.1 -Wall -O2 -c sit.i 
sit.i: In function 'sit_init_net':
sit.i:29:2: warning: 'memcpy' source argument is the same as destination
[-Wrestrict]
  memcpy(&t->ip6rd, &t0->ip6rd, sizeof(t->ip6rd));
  ^~~

void *memcpy(void *, const void *, unsigned long );
struct ip_tunnel_6rd_parm {
int relay_prefix;
int prefixlen;
int relay_prefixlen;
};
struct netdevice {
void *priv;
};
struct ip_tunnel {
struct netdevice *dev;
struct ip_tunnel_6rd_parm ip6rd;
};
struct sit_net {
struct netdevice *fb_tunnel_dev;
};
void ipip6_tunnel_clone_6rd(struct netdevice *dev, struct sit_net *sitn)
{
struct ip_tunnel *t = dev->priv;
if (t->dev == sitn->fb_tunnel_dev)
return;
struct ip_tunnel *t0 = sitn->fb_tunnel_dev->priv;
memcpy(&t->ip6rd, &t0->ip6rd, sizeof(t->ip6rd));
}
int sit_init_net(struct sit_net *sitn, struct netdevice *fb_tunnel_dev) 
{
sitn->fb_tunnel_dev = fb_tunnel_dev;
ipip6_tunnel_clone_6rd(sitn->fb_tunnel_dev, sitn);
return 0;
}

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-30 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

--- Comment #23 from Arnd Bergmann  ---
I've done some more testing with '#pragma GCC target("arch=armv5te")' in place,
but ran into another problem:

: note: this is the location of the previous definition
In file included from /git/arm-soc/include/linux/thread_info.h:38,
 from /git/arm-soc/include/asm-generic/current.h:5,
 from ./arch/arm/include/generated/asm/current.h:1,
 from /git/arm-soc/include/linux/sched.h:12,
 from /git/arm-soc/arch/arm/kernel/xscale-cp0.c:14:
/git/arm-soc/arch/arm/kernel/xscale-cp0.c: In function 'dsp_do':
/git/arm-soc/arch/arm/include/asm/thread_info.h:88:35: error: inlining failed
in call to always_inline 'current_thread_info': target specific option mismatch
 static inline struct thread_info *current_thread_info(void)
   ^~~
/git/arm-soc/arch/arm/kernel/xscale-cp0.c:48:18: note: called from here
   dsp_save_state(current_thread_info()->cpu_context.extra);
  ^

I've worked around that now by separating the parts that use inline assembly
into standalone functions with GCC push_options/pop_options around them, so
they are not mixed with normal code that might call an inline function, but
this is getting increasingly ugly.

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-31 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

--- Comment #25 from Arnd Bergmann  ---
(In reply to Tamar Christina from comment #24)
> Do you have a repro for this one? compiling the kernel with
> `CFLAGS="march=-armv4t"` doesn't seem to reproduce the original issue.

It needs to be a kernel configuration that enables both an ARMv4-based target
platform (e.g. ARCH_MOXART) and another target platform with ARMv5TE+IWMMXT
(e.g. ARCH_MMP).

> But the scenario should be working without needing to separate out the
> functions, as long as you're in-lining the right direction.

Ah, interesting.

> what would generate the error you're getting is if you're in-lining the
> armv5te code into armv4t which is an actual error
> 
> __attribute__((always_inline, target("arch=armv5te")))
> static inline int do_this (int x)
> {
>   return x*x;
> }
> 
> #pragma GCC target("arch=armv4t")   
> 
> 
> int do_that (int x, int y)
> {
>   return do_this (x - y);
> }
> 
> The compiler only rejects the inlining if you've told it to always inline
> and when the function to be inline's feature bits are not a strict subset of
> the function in which it is to inline

I can't reproduce it here myself now, no idea what I did earlier.

Anyway, since neither the #pragma nor the attribute work on existing
compilers, and making the hack version dependent would be worse,
I don't think we can use that anyway.

The best workaround I see so far is to either move all the affected
inline assembly statements into an external .S file to sidestep the
problem, or to apply more force and add the ".arch" to each inline
asm individually.

[Bug target/82641] Unable to enable crc32 for a certain function with target attribute on ARM (aarch32)

2018-01-31 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82641

--- Comment #27 from Arnd Bergmann  ---
(In reply to Richard Earnshaw from comment #26)
> (In reply to Arnd Bergmann from comment #25)
> 
> > or to apply more force and add the ".arch" to each inline
> > asm individually.
> 
> No, that would not be guaranteed to be supported: and you'd be lying to the
> compiler again.  At the end of each asm block the compiler *could* emit new
> .arch directive to forcibly reset the architecture to what IT thinks it
> should be.

That's fine though: we won't have any invalid instructions outside of the
inline asm, the whole point of setting .arch to a higher arch revision is to
make the inline asm work and avoid the build error from the assembler.

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-02-06 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #14 from Arnd Bergmann  ---
I applied the patches and seem to still get a warning for this:

$ x86_64-linux-gcc-8.0.1 -Wall -O2 -c nmi_int.c
nmi_int.c: In function 'nmi_setup':
nmi_int.c:43:3: warning: 'memcpy' source argument is the same as destination
[-Wrestrict]
   memcpy(per_cpu(cpu_msrs, cpu).counters,
   ^~~
  per_cpu(cpu_msrs, 0).counters,
  ~~
  sizeof(int) * model->num_counters);
  ~~


typedef unsigned long __kernel_size_t;
extern void * memcpy(void *,const void *,__kernel_size_t);
struct op_msrs {
int *counters;
};
#define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
#define for_each_cpu(cpu, mask)  for ((cpu) = 0; (cpu) < 1; (cpu)++,(void)mask)
extern struct cpumask __cpu_possible_mask;
#define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
#define DEFINE_PER_CPU(type, name) __typeof__(type) name
#define per_cpu_ptr(ptr, cpu)   ({ (void)(cpu); ptr; })
#define raw_cpu_ptr(ptr)per_cpu_ptr(ptr, 0)
#define per_cpu(var, cpu)   (*per_cpu_ptr(&(var), cpu))
extern void *pcpu_base_addr;
extern const unsigned long *pcpu_unit_offsets;
struct op_x86_model_spec {
unsigned intnum_counters;
};
static struct op_x86_model_spec *model;
static DEFINE_PER_CPU(struct op_msrs, cpu_msrs);
int nmi_setup(void)
{
int err = 0;
int cpu;
for_each_possible_cpu(cpu) {
if (!cpu)
continue;
memcpy(per_cpu(cpu_msrs, cpu).counters,
   per_cpu(cpu_msrs, 0).counters,
   sizeof(int) * model->num_counters);
}
return err;
}

In this code, we do copy from a variable onto itself, but only in a dead
branch, here because the for_each_possible_cpu() and per_cpu() macros degrade
to trivial wrappers on an non-SMP build.

[Bug middle-end/84095] [8 Regression] false-positive -Wrestrict warnings for memcpy within array

2018-02-06 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84095

--- Comment #15 from Arnd Bergmann  ---
(In reply to Arnd Bergmann from comment #14)
> I applied the patches and seem to still get a warning for this

I also just got the one from comment #9 again and found that the reduced test
case is still affected (and not claimed to be fixed by any of the patches, so
that's my fault for not checking).

[Bug target/86673] New: inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

Bug ID: 86673
   Summary: inline asm sometimes ignores 'register asm("reg")'
declarations
   Product: gcc
   Version: 8.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
      Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 44438
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44438&action=edit
linux/net/core/scm.o, preprocessed

Building older linux kernels for ARM with a gcc-8.1 compiler has triggered a
check for broken compiler versions, which compares the register number that is
used in an inline assembly statement with the expected value, for an argument
that was declared with the 'register foo asm ("reg")' syntax described in the
gcc manual under "Specifying Registers for Local Variables".

The diagnostic from the assembler is

$ arm-linux-gnueabi-gcc -Wall -O2 scm.i -c -Wno-pointer-sign
-fno-strict-aliasing
/tmp/ccCGMQmS.s:648: Error: .err encountered
/tmp/ccCGMQmS.s:679: Error: .err encountered

Unfortunately, a change made to the kernel a few years ago had made this go
unnoticed as everyone was testing gcc-8.1 only on more recent kernels that did
not run into the particular check, but may have run into the bug without
triggering the check. Architectures other than arm may also be affected, but
nothing else has this check.

I tested gcc-8.1.0 and today's gcc-8.1.1 (r262956), both with the same result.
I attached one of the files that showed the problem, and reduced this using
creduce to:

int a, c, d, e;
long b;
void fn1() {
  int f = ({
({
  long g = b, j = g;
  register const typeof(c) h asm("r2") = 1, i = d;
  __asm__(".ifnc %2,r2; .err; .endif\n\t"
"bl __put_user_4"
  : "=&r"(e)
  : ""(i), ""(h), ""(j));
  e;
});
  });
  a = f;
}

[Bug target/86673] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

--- Comment #1 from Arnd Bergmann  ---
Further inspection shows that this happens for the cases where the input
argument to the inline asm is a compile-time constant, but not for those that
are variables.

[Bug target/86673] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

Arnd Bergmann  changed:

   What|Removed |Added

 CC||mkuvyrkov at gcc dot gnu.org

--- Comment #2 from Arnd Bergmann  ---
Forcing constant inputs for put_user to be read from a volatile variable avoids
this problem and lets me cleanly build all files that showed it.

diff --git a/arch/arm/include/asm/uaccess.h b/arch/arm/include/asm/uaccess.h
index 35c9db857ebe..23e92a9a5ef4 100644
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -251,7 +251,8 @@ extern int __put_user_8(void *, unsigned long long);
({  \
unsigned long __limit = current_thread_info()->addr_limit - 1;
\
const typeof(*(p)) __user *__tmp_p = (p);   \
-   register const typeof(*(p)) __r2 asm("r2") = (x);   \
+   const typeof(*(p)) __x = (x);   \
+   register const typeof(*(p)) __r2 asm("r2") = READ_ONCE(__x);   
\
register const typeof(*(p)) __user *__p asm("r0") = __tmp_p; \
register unsigned long __l asm("r1") = __limit; \
register int __e asm("r0"); \

This confirms that constant inputs are what leads to the problem.

[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

--- Comment #5 from Arnd Bergmann  ---
(In reply to Andreas Schwab from comment #4)
> Why are you using empty constraints when a register is required?

creduce did that, it had no effect on the result. The original source looks
like:

#define __get_user_x_64t(__r2, __p, __e, __l, __s)  \
   __asm__ __volatile__ (   \
__asmeq("%0", "r0") __asmeq("%1", "r2") \
__asmeq("%3", "r1") \
"bl __get_user_64t_" #__s   \
: "=&r" (__e), "=r" (__r2)  \
: "0" (__p), "r" (__l)  \
: __GUP_CLOBBER_##__s)

[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

--- Comment #9 from Arnd Bergmann  ---
Reproduced on arm64 and x86 as well, see x86 version:

void fn1() {
   register const int h asm("edx") = 1;
__asm__(".ifnc %0,edx; .err; .endif" :: "r"(h));
}

[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

--- Comment #11 from Arnd Bergmann  ---
I have checked all instances of 'register const' or 'const register' in the
current linux kernel (4.18-rc), and we never assign a constant expression to
any of them, so I guess none of them are affected:

arch/arm/include/asm/uaccess.h: register const void __user *__p
asm("r0") = __ptr;  \
arch/h8300/kernel/sim-console.c:register const char *_ptr
__asm__("er1") = s;
arch/h8300/kernel/sim-console.c:register const unsigned _len
__asm__("er2") = n;
arch/mips/include/asm/uaccess.h:register const void __user *__cu_from_r
__asm__("$5");  \
arch/mips/include/asm/uaccess.h:register const void *__cu_from_r
__asm__("$5"); \
arch/riscv/kernel/process.c:const register unsigned long gp __asm__
("gp");
arch/riscv/kernel/stacktrace.c: const register unsigned long current_sp
__asm__ ("sp");
arch/riscv/kernel/stacktrace.c: const register unsigned long current_sp
__asm__ ("sp");

Should we drop the 'const' for all of them as a rule? If there is no use case
for ever using a 'const register' variable and it can lead to bugs, should gcc
warn about it in the future?

Should this issue be mentioned in the documentation in
https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html?

I also checked all instances in linux-4.4, and the ARM put_user() helper is the
only one I see that gets a constant expression input, so I suppose that is all
that needs to be fixed in backports, unless someone thinks we should get rid of
all them in backports as well.

[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations

2018-07-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673

--- Comment #13 from Arnd Bergmann  ---
(In reply to Andreas Schwab from comment #12)
> arch/h8300/kernel/sim-console.c:  register const int fd __asm__("er0") = 
> 1;

I found that too, and then noticed it is already fixed in linux-next:

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=14cf9451be78f8a

Ard points out that most of the other ones are pointers to const data, which
are not a problem. This leaves the arm put_user bug as the only definite
problem that needs to be addressed in older kernels.

The three arch/riscv instances of 'const register unsigned long gp __asm__
("gp")' are different because they are never passed into an inline assembly as
far as I can tell. This seems to be unsupported for local register variables
according to the gcc documentation, but if that's a problem, it's unrelated to
this bug.

[Bug sanitizer/81715] asan-stack=1 redzone allocation is too inflexible

2018-09-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

--- Comment #30 from Arnd Bergmann  ---
(In reply to Martin Liška from comment #29)
> I'm got a patch candidate for which I did testing of allmodconfig
> configuration.
> Sorting all violations against 2KB of stack memory:
> 
> Before:
> TOTAL warnings: 185
>  
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5653:1: 23624
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:4518:1: 14144
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:3882:1: 11504
> lib/atomic64_test.c:250:1: 11192
> lib/atomic64_test.c:148:1: 10352

This is with -fsanitize-address-use-after-scope, right?

> after:
> 
> TOTAL warnings: 43
>  
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5653:1: 11880
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:4518:1:  7264
> drivers/net/wireless/ralink/rt2x00/rt2800lib.c:3882:1:  5840
> lib/atomic64_test.c:250:1:  5656
> lib/atomic64_test.c:148:1:  5232
>
> Which is very promising improvement I would say.

Agreed, this looks great. With most of the warnings against the
2048 byte limit gone, we can probably work around the remaining
ones by doing local code changes in the kernel. I had patches for
some of these in the past, which I can dig up then.

[Bug sanitizer/81715] asan-stack=1 redzone allocation is too inflexible

2018-09-25 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

--- Comment #32 from Arnd Bergmann  ---
(In reply to Martin Liška from comment #31)
> (In reply to Arnd Bergmann from comment #30)
> > (In reply to Martin Liška from comment #29)
> > > Which is very promising improvement I would say.
> > 
> > Agreed, this looks great. With most of the warnings against the
> > 2048 byte limit gone, we can probably work around the remaining
> > ones by doing local code changes in the kernel. I had patches for
> > some of these in the past, which I can dig up then.
> 
> Just out of curiosity. Am I right that you're using KASAN build for
> syzkaller or an other fuzzer? If so, I bet you can't hit most of the
> stack overflows in drivers as you very probably don't have the
> real hardware?

No, I don't do any fuzzing myself. The side project that I'm
interested in here is to build the kernel in all random
configurations without compile-time warnings that may indicate
bugs. I tend to build several hundred such kernels per day to
catch new bugs in both the (linux-next) kernel and in the
toolchain (clang and gcc).

[Bug sanitizer/81715] asan-stack=1 redzone allocation is too inflexible

2018-02-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

--- Comment #22 from Arnd Bergmann  ---
(In reply to Jakub Jelinek from comment #20)
> I haven't heard any answer to #c16 whether it actually helped the kernel or
> not.

Sorry about that. Yes, it definitely helped the kernel a lot. At this point, we
also have localized workarounds to the same effect (using local variables
instead of accessing inline function arguments) in all functions that exceeded
the arbitrary 2048 byte stack size limit, and backported into the 4.4 kernel
and later, but with the newer gcc releases, we also get a lower stack
consumption for lots of other functions that were high but below that limit.

I had hoped that we could also do this on gcc-7 branch without KASAN, as high
stack consumption is always problematic for the kernel, and the same functions
that got bumped over the warning limit with KASAN still suffer from wasted
stack space on older compilers without KASAN. Since you consider that too
invasive for the stable releases, my current workaround has to suffice.

One side issue that is not solved at all by the patch is
-fsanitize-address-use-after-scope, since that still leads to extreme stack
usage in the kernel. The problem here is that it forces many local variables
into separate stack slots even when they could get reused without
-fsanitize-address-use-after-scope, making it still actively dangerous to run
kernels built with this option.
My workaround in the kernel is now to have that option disabled by default and
only enabled when users explicitly turn it on. I still think it would be nice
to address that in the way I originally suggested, by copying the behavior that
LLVM uses with its variably sized redzone area.

[Bug sanitizer/81715] asan-stack=1 redzone allocation is too inflexible

2018-02-20 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

--- Comment #24 from Arnd Bergmann  ---
(In reply to Martin Liška from comment #23)

> That's definitely possible for GCC 9. Question is whether such change will
> be sufficient for you. Do you expect it will reduce stack usage in the
> desired way?

I've recreated my original finding, comparing a clang-5 release against a
recent gcc-8 snapshot. Building an x86 allmodconfig kernel with clang, I get 78
-fsanitize-address-use-after-scope warnings against a 2048 byte limit, the
largest ones are:

drivers/usb/misc/sisusbvga/sisusb.c:1880:12: warning: stack frame size of 6776
bytes in function 'sisusb_init_gfxcore' [-Wframe-larger-than=]
drivers/gpu/drm/nouveau/nvkm/subdev/fb/ramgk104.c:1521:1: warning: stack frame
size of 5176 bytes in function 'gk104_ram_new_' [-Wframe-larger-than=]
drivers/usb/misc/sisusbvga/sisusb.c:1750:12: warning: stack frame size of 5112
bytes in function 'sisusb_set_default_mode' [-Wframe-larger-than=]
drivers/net/wireless/atmel/atmel.c:1307:5: warning: stack frame size of 5016
bytes in function 'atmel_open' [-Wframe-larger-than=]
net/core/ethtool.c:2549:5: warning: stack frame size of 4568 bytes in function
'dev_ethtool' [-Wframe-larger-than=]
drivers/net/wireless/broadcom/brcm80211/brcmsmac/phy/phy_n.c:19216:6: warning:
stack frame size of 4312 bytes in function 'wlc_phy_init_nphy'
[-Wframe-larger-than=]
drivers/media/usb/em28xx/em28xx-dvb.c:1129:12: warning: stack frame size of
3992 bytes in function 'em28xx_dvb_init' [-Wframe-larger-than=]
drivers/staging/media/atomisp/pci/atomisp2/css2400/sh_css.c:6802:24: warning:
stack frame size of 3960 bytes in function 'load_capture_binaries'
[-Wframe-larger-than=]
drivers/staging/wlan-ng/cfg80211.c:454:12: warning: stack frame size of 3864
bytes in function 'prism2_connect' [-Wframe-larger-than=]
drivers/staging/wilc1000/host_interface.c:2480:13: warning: stack frame size of
3704 bytes in function 'host_if_work' [-Wframe-larger-than=]


With gcc-8, the same configuration has 179 warnings, including:

drivers/net/wireless/ralink/rt2x00/rt2800lib.c:5650:1: warning: the frame size
of 23632 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:4515:1: warning: the frame size
of 14056 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:3879:1: warning: the frame size
of 11504 bytes is larger than 2048 bytes [-Wframe-larger-than=]
lib/atomic64_test.c:250:1: warning: the frame size of 11192 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
lib/atomic64_test.c:148:1: warning: the frame size of 10360 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt73usb.c:1294:1: warning: the frame size of
8680 bytes is larger than 2048 bytes [-Wframe-larger-than=]
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than
2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:8655:1: warning: the frame size
of 6456 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/media/dvb-frontends/stv090x.c:3090:1: warning: the frame size of 5872
bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt61pci.c:1647:1: warning: the frame size of
5792 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/media/dvb-frontends/stv090x.c:1595:1: warning: the frame size of 5304
bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/scsi/fnic/fnic_trace.c:451:1: warning: the frame size of 5000 bytes is
larger than 2048 bytes [-Wframe-larger-than=]
drivers/net/wireless/ralink/rt2x00/rt2800lib.c:2417:1: warning: the frame size
of 4912 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/media/dvb-frontends/stv090x.c:4265:1: warning: the frame size of 4840
bytes is larger than 2048 bytes [-Wframe-larger-than=]

Comparing against a 3072 byte limit, I get 18 warnings for clang vs 54 for
gcc-8. The detailed analysis of some of those warnings last year had shown that
the difference can be traced almost entirely to simple scalar variables that
use 64 bytes redzone with gcc but only 16 bytes with clang.

[Bug sanitizer/81715] asan-stack=1 redzone allocation is too inflexible

2018-02-20 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715

--- Comment #26 from Arnd Bergmann  ---
(In reply to Martin Liška from comment #25)
> (In reply to Arnd Bergmann from comment #24)
> 
> Ok, I don't have problem to implement the similar behavior in GCC 9. Looks
> most
> of warnings are in drivers. That should not be problem as I guess KASAN
> build is
> mainly used in a qemu machine (with syzkaller)? Thus exotic drivers should
> not
> be needed?

I actually have no idea in what other ways it may be used, though I didn't
think that running syzkaller was the only use case. It always feels like most
bugs in the kernel are in obscure drivers, but then most of the kernel code
consists of obscure drivers ;-)

Here are some warnings in code that is actually being run. For the full output
I see on linux-next, have a look at https://pastebin.com/CMJiUsuR. There
are a couple of other warnings mixed in there as well that I'm working on
addressing, but it's mainly the stack overflow once I turn on
CONFIG_KASAN_EXTRA.

arch/x86/kernel/cpu/mshyperv.c:261:1: warning: the frame size of 2704 bytes is
larger than 2048 bytes [-Wframe-larger-than=]
arch/x86/kvm/emulate.c:2552:1: warning: the frame size of 2128 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
drivers/acpi/nfit/core.c:3168:1: warning: the frame size of 3952 bytes is
larger than 2048 bytes [-Wframe-larger-than=]
drivers/firmware/efi/test/efi_test.c:688:1: warning: the frame size of 2400
bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/gpu/drm/amd/amdgpu/../display/dc/bios/command_table.c:83:1: warning:
the frame size of 3760 bytes is larger than 2048 bytes [-Wframe-larger-than=]
drivers/md/md.c:8561:1: warning: the frame size of 2544 bytes is larger than
2048 bytes [-Wframe-larger-than=]
drivers/net/bonding/bond_netlink.c:677:1: warning: the frame size of 2096 bytes
is larger than 2048 bytes [-Wframe-larger-than=]
fs/btrfs/relocation.c:1202:1: warning: the frame size of 4272 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
fs/fscache/stats.c:287:1: warning: the frame size of 6536 bytes is larger than
2048 bytes [-Wframe-larger-than=]
fs/jbd2/commit.c:1128:1: warning: the frame size of 3728 bytes is larger than
2048 bytes [-Wframe-larger-than=]
fs/nfs/pnfs.c:1892:1: warning: the frame size of 2672 bytes is larger than 2048
bytes [-Wframe-larger-than=]
fs/ntfs/mft.c:2756:1: warning: the frame size of 2352 bytes is larger than 2048
bytes [-Wframe-larger-than=]
fs/userfaultfd.c:1824:1: warning: the frame size of 2256 bytes is larger than
2048 bytes [-Wframe-larger-than=]
fs/xfs/libxfs/xfs_rmap.c:1334:1: warning: the frame size of 2384 bytes is
larger than 2048 bytes [-Wframe-larger-than=]
kernel/rcu/tree.c:2282:1: warning: the frame size of 3160 bytes is larger than
2048 bytes [-Wframe-larger-than=]
lib/rbtree.c:481:1: warning: the frame size of 2520 bytes is larger than 2048
bytes [-Wframe-larger-than=]
mm/khugepaged.c:1560:1: warning: the frame size of 2976 bytes is larger than
2048 bytes [-Wframe-larger-than=]
mm/migrate.c:2129:1: warning: the frame size of 2104 bytes is larger than 2048
bytes [-Wframe-larger-than=]
mm/page_alloc.c:3247:1: warning: the frame size of 4584 bytes is larger than
2048 bytes [-Wframe-larger-than=]
mm/vmscan.c:1350:1: warning: the frame size of 5072 bytes is larger than 2048
bytes [-Wframe-larger-than=]
net/bridge/br_netlink.c:1446:1: warning: the frame size of 2592 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
net/core/ethtool.c:2832:1: warning: the frame size of 3376 bytes is larger than
2048 bytes [-Wframe-larger-than=]
net/core/rtnetlink.c:1631:1: warning: the frame size of 2272 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
net/mac80211/util.c:2188:1: warning: the frame size of 2464 bytes is larger
than 2048 bytes [-Wframe-larger-than=]
net/rxrpc/recvmsg.c:603:1: warning: the frame size of 2424 bytes is larger than
2048 bytes [-Wframe-larger-than=]
net/sctp/socket.c:7271:1: warning: the frame size of 2704 bytes is larger than
2048 bytes [-Wframe-larger-than=]
net/wireless/nl80211.c:1938:1: warning: the frame size of 4248 bytes is larger
than 2048 bytes [-Wframe-larger-than=]

> The middle red zone is only 32B. So I don't understand why 'Size' not used
> for red zone
> calculation?

No idea.

[Bug sanitizer/84732] New: false-positive -Wstringop-truncation warning with -fsanitize-coverage=trace-pc

2018-03-06 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84732

Bug ID: 84732
   Summary: false-positive -Wstringop-truncation warning with
-fsanitize-coverage=trace-pc
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 43576
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43576&action=edit
linux/drivers/staging/lustre/lnet/lnet/lib-socket.c, preprocessed, not reduced

I ran into this warning for what looks like correct code in the linux kernel
that we should not warn about:

$ aarch64-linux-gcc-8.0.1 -fno-strict-aliasing -Wno-pointer-sign
-fsanitize-coverage=trace-pc  -Wall -O2 -c lib-socket.i
In file included from /git/arm-soc/arch/arm64/include/asm/processor.h:37,
 from /git/arm-soc/arch/arm64/include/asm/spinlock.h:21,
 from /git/arm-soc/include/linux/spinlock.h:88,
 from /git/arm-soc/include/linux/wait.h:9,
 from /git/arm-soc/include/linux/net.h:23,
 from
/git/arm-soc/drivers/staging/lustre/lnet/lnet/lib-socket.c:37:
/git/arm-soc/drivers/staging/lustre/lnet/lnet/lib-socket.c: In function
'lnet_ipif_query':
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 16 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 16 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 16 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^

See
https://elixir.bootlin.com/linux/v4.15/source/drivers/staging/lustre/lnet/lnet/lib-socket.c#L99
for the original source code. Without -fsanitize-coverage=trace-pc, the
strlen() comparison is sufficient to avoid that warning, with
fsanitize=coverage=trace-pc, that logic fails:

if (strlen(name) > sizeof(ifr.ifr_name) - 1)
return -E2BIG;
strncpy(ifr.ifr_name, name, sizeof(ifr.ifr_name));


I can probably create a reduced test case if that helps.

[Bug sanitizer/84732] false-positive -Wstringop-truncation warning with -fsanitize-coverage=trace-pc

2018-03-07 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84732

--- Comment #5 from Arnd Bergmann  ---
Created attachment 43586
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43586&action=edit
drivers/gpu/drm/drm_property.c, preprocessed

I found another case that appears to be related but not the same, attaching
another (non-reduced) file for reference. The code that triggered a warning
this time is:

strncpy(property->name, name, DRM_PROP_NAME_LEN);
property->name[DRM_PROP_NAME_LEN-1] = '\0';

but unlike the first one, this only happens with -fsanitize=kernel-address but
not with -fsanitize-coverage=trace-pc:

$ x86_64-linux-gcc-8.0.1 -fno-strict-aliasing -O2 -Wall -S drm_property.i  
-fsanitize=kernel-address

/git/arm-soc/drivers/gpu/drm/drm_property.c: In function 'drm_property_create':
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 32 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^
/git/arm-soc/drivers/gpu/drm/drm_property.c: In function
'drm_property_add_enum':
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 32 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^
/git/arm-soc/include/linux/string.h:254:9: warning: '__builtin_strncpy'
specified bound 32 equals destination size [-Wstringop-truncation]
  return __builtin_strncpy(p, q, size);
 ^

[Bug sanitizer/84863] New: false-positive -Warray-bounds warning with -fsanitize-coverage=object-size

2018-03-14 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84863

Bug ID: 84863
   Summary: false-positive -Warray-bounds warning with
-fsanitize-coverage=object-size
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 43655
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43655&action=edit
linux/net/xfrm/xfrm_output.c, preprocessed, not reduced.

Among the linux kernel build warnings I see from enabling sanitizers
(CONFIG_UBSAN_SANITIZE_ALL), this is one that seems interesting and not yet
reported:

In file included from /git/arm-soc/include/linux/kernel.h:10,
 from /git/arm-soc/include/linux/list.h:9,
 from /git/arm-soc/include/linux/module.h:9,
 from /git/arm-soc/net/xfrm/xfrm_output.c:13:
/git/arm-soc/net/xfrm/xfrm_output.c: In function 'xfrm_output_resume':
/git/arm-soc/include/linux/compiler.h:251:20: error: array subscript 4 is above
array bounds of 'struct nf_hook_entries *[3]' [-Werror=array-bounds]
   __read_once_size(&(x), __u.__c, sizeof(x));  \
^~~~
/git/arm-soc/include/linux/compiler.h:257:22: note: in expansion of macro
'__READ_ONCE'
 #define READ_ONCE(x) __READ_ONCE(x, 1)
  ^~~
/git/arm-soc/include/linux/rcupdate.h:351:48: note: in expansion of macro
'READ_ONCE'
  typeof(*p) *p1 = (typeof(*p) *__force)READ_ONCE(p); \
^
/git/arm-soc/include/linux/rcupdate.h:488:2: note: in expansion of macro
'__rcu_dereference_check'
  __rcu_dereference_check((p), (c) || rcu_read_lock_held(), __rcu)
  ^~~
/git/arm-soc/include/linux/rcupdate.h:546:28: note: in expansion of macro
'rcu_dereference_check'
 #define rcu_dereference(p) rcu_dereference_check(p, 0)
^
/git/arm-soc/include/linux/netfilter.h:219:15: note: in expansion of macro
'rcu_dereference'
   hook_head = rcu_dereference(net->nf.hooks_arp[hook]);
   ^~~

The original function looks like

static inline int nf_hook(u_int8_t pf, unsigned int hook, struct net *net,
  struct sock *sk, struct sk_buff *skb,
  struct net_device *indev, struct net_device *outdev,
  int (*okfn)(struct net *, struct sock *, struct
sk_buff *))
{
struct nf_hook_entries *hook_head = NULL;
int ret = 1;

rcu_read_lock();
switch (pf) {
case NFPROTO_IPV4:
hook_head = rcu_dereference(net->nf.hooks_ipv4[hook]);
break;
case NFPROTO_IPV6:
hook_head = rcu_dereference(net->nf.hooks_ipv6[hook]);
break;
case NFPROTO_ARP:
#ifdef CONFIG_NETFILTER_FAMILY_ARP
hook_head = rcu_dereference(net->nf.hooks_arp[hook]);
#endif
break;
case NFPROTO_BRIDGE:
#ifdef CONFIG_NETFILTER_FAMILY_BRIDGE
hook_head = rcu_dereference(net->nf.hooks_bridge[hook]);
#endif
break;
#if IS_ENABLED(CONFIG_DECNET)
case NFPROTO_DECNET:
hook_head = rcu_dereference(net->nf.hooks_decnet[hook]);
break;
#endif
default:
WARN_ON_ONCE(1);
break;
}

where the net->nf.hooks_* fields all have different sizes. The function is
called with constant arguments for 'pf' and 'hook', and for this caller, the
latter is out of range for the net->nf.hooks_arp[] array, but in a line that
is never reached. Reproduced with all versions that support the object-size
sanitizer (gcc-5 through 8).

With the attached preprocessed file, reproduce with

$ arm-linux-gnueabi-gcc-8.0.1 -Wall -O2 -c net/xfrm/xfrm_output.i -Werror  
-fsanitize=object-size  -fno-strict-aliasing

The warning also shows up with an x86 compiler, but that causes other problems.
I can produce a reduced version that works on x86 if needed.

[Bug tree-optimization/85175] New: [8 regression] false-positive -Wformat-overflow= warning with gcc-8 -Os

2018-04-03 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85175

Bug ID: 85175
   Summary: [8 regression] false-positive -Wformat-overflow=
warning with gcc-8 -Os
   Product: gcc
   Version: 8.0.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

This snippet from the Linux kernel produces a bogus warning when built with gcc
-Os, using a recent snapshot (20180402) A two months older version did not
produce the warning, and building with -O2 is also fine. Reduced test case:

$ gcc-8.0.1 -Os -Wall -c test.c

int of_property_read_u32(int *out_value);
int imx_ldb_bind(void)
{
int i;
for (i = 0; i < 4; i++) {
char clkname[16];
__builtin_sprintf(clkname, "di%d_sel", i);
}
return of_property_read_u32(&i);
}

imx-ldb.i: In function 'imx_ldb_bind':
imx-ldb.i:7:35: warning: '_sel' directive writing 4 bytes into a region of size
between 3 and 13 [-Wformat-overflow=]
   __builtin_sprintf(clkname, "di%d_sel", i);

[Bug tree-optimization/85175] [8 regression] false-positive -Wformat-overflow= warning with gcc-8 -Os

2018-04-03 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85175

--- Comment #3 from Arnd Bergmann  ---
(In reply to Martin Sebor from comment #2)

> So with the change above GCC is doing a better but imperfect job of
> determining the range.  Changing the variable to unsigned constrains the
> lower bound to zero and eliminates the warning even at -Os.

Ok, so this is actually the same thing I saw with seven other files in the
kernel (out of several hundred randconfig kernel builds on three
architectures). In each case, we now have a signed integer range that appears
to be constrained on one side after r257857, e.g. [-2147483647, 68] or [1,
2147483647] based on a partial range check, but the warning is still a false
positive in the end.

The one such warning we gained in the kernel that makes sense was this one (not
sent yet):

commit c18e88d296264d76f1a242ae95a43681cf198078
Author: Arnd Bergmann 
Date:   Tue Apr 3 09:45:35 2018 +0200

bluetooth: fix hci name overflow

gcc-8 warns that the index of the hci device could overflow the eight
character array:

net/bluetooth/hci_core.c: In function 'hci_register_dev':
net/bluetooth/hci_core.c:3093:26: error: '%d' directive writing between 1
and 10 bytes into a region of size 5 [-Werror=format-overflow=]
  sprintf(hdev->name, "hci%d", id);
  ^~
net/bluetooth/hci_core.c:3093:22: note: directive argument in the range [0,
2147483647]
  sprintf(hdev->name, "hci%d", id);
  ^~~
net/bluetooth/hci_core.c:3093:2: note: 'sprintf' output between 5 and 14
bytes into a destination of size 8
  sprintf(hdev->name, "hci%d", id);

This uses snprintf() to enforce a valid string, and limits the range of
the integer to 0... In practice this should not matter as we would
not be able connect more than  bluetooth hci's simultaneously.

Signed-off-by: Arnd Bergmann 

diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 40d260f2bea5..9e2ad444d799 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3075,13 +3075,14 @@ int hci_register_dev(struct hci_dev *hdev)

/* Do not allow HCI_AMP devices to register at index 0,
 * so the index can be used as the AMP controller ID.
+* Ensure the name fits into eight characters id < 1.
 */
switch (hdev->dev_type) {
case HCI_PRIMARY:
-   id = ida_simple_get(&hci_index_ida, 0, 0, GFP_KERNEL);
+   id = ida_simple_get(&hci_index_ida, 0, 1, GFP_KERNEL);
break;
case HCI_AMP:
-   id = ida_simple_get(&hci_index_ida, 1, 0, GFP_KERNEL);
+   id = ida_simple_get(&hci_index_ida, 1, 1, GFP_KERNEL);
break;
default:
return -EINVAL;
@@ -3090,7 +3091,7 @@ int hci_register_dev(struct hci_dev *hdev)
if (id < 0)
return id;

-   sprintf(hdev->name, "hci%d", id);
+   snprintf(hdev->name, sizeof(hdev->name), "hci%d", id);
hdev->id = id;

BT_DBG("%p name %s bus %d", hdev, hdev->name, hdev->bus);

The other new warnings are either the same kind as this one (the compiler
should be able to figure it out), or the sort where the compiler is technically
right about the string overflow based on the types, but we can easily prove
that the range is more limited like in the ida_simple_get() case with correct
limits (this is an extern function that returns an unused number within a
strict range).

Would it be sensible to only warn with -Wformat-overflow=1 on a signed integer
if the overflow happens with an actual known limit, but not if the limit is
euqal to the minimum/maximum representable numbers? The documentation for
-Wformat-overflow=2 isn't completely clear on what behavior was intended here
for the =1 and =2 case if the range is only bounded on one side.

[Bug tree-optimization/85175] [8 regression] false-positive -Wformat-overflow= warning with gcc-8 -Os

2018-04-04 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85175

--- Comment #5 from Arnd Bergmann  ---
Improving the optimizer will definitely help this one, but not the other
instances I found. Here's a list of the remaining warnings that got introduced
in the linux kernel by r257857 for reference:

https://elixir.bootlin.com/linux/v4.16/source/drivers/acpi/acpi_processor.c#L330
invalid_logical_cpuid() has checked the 'id' variable to be a positive number
(i.e. not an error value), so it's assumed to be in the range of [0,
2147483647] by the compiler, while we know it it's in the range of [0,
CONFIG_NR_CPUS] when the id is positive. I'd work around that by adding a '<
CONFIG_NR_CPUS' check in invalid_logical_cpuid.

https://elixir.bootlin.com/linux/v4.16/source/drivers/gpu/drm/imx/imx-ldb.c#L634
The one referenced reported here, ideally handled better by gcc

https://elixir.bootlin.com/linux/v4.16/source/drivers/usb/gadget/function/rndis.c#L900
We check the 'id' for a positive number, negative values are error codes. 
rndis_get_nr() otherwise returns the first available number, so we can assume
it's a low number (you won't have billions of USB network interfaces), but
making the buffer larger is a safer fix.

https://elixir.bootlin.com/linux/v4.16/source/drivers/usb/gadget/udc/fsl_udc_core.c#L2497
We know that max_ep is a small number, but gcc cannot know this, so the loop
index has a lower bound but no upper bound. Would work around this by
increasing the buffer size from 14 to 16 bytes.

https://elixir.bootlin.com/linux/v4.16/source/sound/pci/cmipci.c#L3157
We check for an upper bound but not a lower bound on a signed integer that we
know contains a positive number. The warning seems reasonable here, and I would
make the variable unsigned.

https://elixir.bootlin.com/linux/v4.16/source/drivers/scsi/ch.c#L937
ch->minor is known to be less than 128 CH_MAX_DEVS here, but gcc cannot see
this. Again, idr_alloc() returns a negative code in case of an error, so gcc
sees that the variable has a lower bound, but not the upper bound. Would work
around this using a %hhi modifier.

https://elixir.bootlin.com/linux/v4.16/source/drivers/power/supply/sbs-battery.c#L559
sbs_read_word_data() returns either a negative error code or a positive 16-bit
number. Would work around that using the %hx modifier.

https://elixir.bootlin.com/linux/v4.16/source/net/bluetooth/hci_core.c#L3093
idr_alloc() once more returns a negative number on error, or a positive number
that may have an upper bound (the third argument to idr_alloc). Here we should
specify the upper bound (1), but gcc won't see it and still warn, so we
also need a way to tell it that 'id' is a four-digit number. Using the %hi
format won't work as that is still five digits.

To summarize: For this particular project (linux kernel), these added tend to
be slightly annoying false-positives, but we can work around that with a
handful of simple patches. In five of the eight cases, gcc sees a limited range
of [0, 2147483647] because of an explicit check for an error code. I don't know
how common that code pattern is in other projects, but for us it would be more
logical to treat this like the unbounded range in -Wformat-overflow=2 rather
than a range with a known bound for -Wformat-overflow=1.

[Bug c/69702] New: excessive stack usage with -fprofile-arcs

2016-02-05 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702

Bug ID: 69702
   Summary: excessive stack usage with -fprofile-arcs
   Product: gcc
   Version: 5.3.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 37604
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37604&action=edit
standalone test case extracted from Linux kernel

With gcc versions 4.9 or higher, the stack usage of some functions in the Linux
kernel has grown to the point where we risk a stack overflow, with 8kb or 16kb
of stack being available per thread.

When building an ARM kernel, I get at least these warnings in some
configurations when using "gcc -fprofile-arcs -Wframe-larger-than=1024", and
don't get them without -fprofile-arcs:

drivers/isdn/isdnhdlc.c:629:1: error: the frame size of 1152 bytes is larger
than 1024 bytes 
drivers/media/common/saa7146/saa7146_hlp.c:464:1: error: the frame size of 1040
bytes is larger than 1024 bytes 
drivers/mtd/chips/cfi_cmdset_0020.c:651:1: error: the frame size of 1040 bytes
is larger than 1024 bytes 
drivers/net/wireless/ath/ath6kl/main.c:495:1: error: the frame size of 1200
bytes is larger than 1024 bytes 
drivers/net/wireless/ath/ath9k/ar9003_aic.c:434:1: error: the frame size of
1208 bytes is larger than 1024 bytes 
drivers/video/fbdev/riva/riva_hw.c:426:1: error: the frame size of 1248 bytes
is larger than 1024 bytes 
lib/lz4/lz4hc_compress.c:514:1: error: the frame size of 2464 bytes is larger
than 1024 bytes 

The lz4hc_compress.c file is a good example, as it has the worst stack usage
and is usable as a working test case outside of the kernel. I have reduced this
file to a standalone .c file that can optionally compile into an executable
program (lz4 compression from stdin to stdout). The code is orginally from
www.lz4.org, but has been adapted for use in Linux.

Compile with:
gcc -O2 -Wall -Wno-pointer-sign -Wframe-larger-than=200 -fprofile-arcs  -c
lz4hc_compress.c 

The same problem happens on all architectures, e.g. gcc-4.9.3:

Target  -fprofile-arcs  normal
aarch64-linux-gcc 1136  112 
alpha-linux-gcc   1008  304 
am33_2.0-linux-gcc1280   84 
arm-linux-gnueabi-gcc 1080  112 
cris-linux-gcc828   100 
frv-linux-gcc 904   104 
hppa64-linux-gcc  944   248 
hppa-linux-gcc82492 
i386-linux-gcc824   108 
m32r-linux-gcc908   136 
microblaze-linux-gcc  83288 
mips64-linux-gcc  864   192 
mips-linux-gcc792   120 
powerpc64-linux-gcc   80096 
powerpc-linux-gcc 80856 
s390-linux-gcc832   112 
sh3-linux-gcc 824   128 
sparc64-linux-gcc 896   192 
sparc-linux-gcc   824   104 
x86_64-linux-gcc  912   192 
xtensa-linux-gcc  816   128 

With gcc-4.8.1, the numbers are much lower:

arm-linux-gnueabi-gcc 184   104
x86_64-linux-gcc  224   192

The size of the binary object has also grown noticeably, from around 3000 bytes
 without -fprofile-arcs (on any version) to 10300 bytes with gcc-5.3.1 but only
6941 bytes with gcc-4.8. Runtime speed does not appear to be affected much
(less than 20% overhead for -fprofile-arcs, which seems reasonable).

I have tested ARM cross-compilers version 4.9.3 through 5.3.1, which all show
similar problematic behavior, while version 4.6 through 4.8.3 are ok.

[Bug tree-optimization/69702] [4.9/5/6 Regression] excessive stack usage with -fprofile-arcs, LIM store motion lacks a register pressure aware cost model

2016-02-10 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702

--- Comment #2 from Arnd Bergmann  ---
Thanks, I have now added -fno-tree-loop-im to the kernel gcov cflags, so files
we profile will be built with that. I can confirm that it fixes all stack size
warnings that show up with -fprofile-arcs, I found around a dozen of them in
various parts of the kernel.

Are there any downsides to doing this for all compiler versions? I don't think
we care much about a missed optimization when CGOV is used, and nobody ships
that on production systems. I guess for older gcc releases it's not necessary
to disable tree-loop-im, but it's not easy to make this conditional on the gcc
version without major surgery in the kernel build system.

I'm also adding -Wno-maybe-uninitialized now, because -fprofile-arcs also
introduces countless "error: '...' may be used uninitialized in this function
[-Werror=maybe-uninitialized]" warnings that I don't see without
-fprofile-arcs, and those are all false positives. Let me know if I should open
another bug report for those (I assume there isn't really much to do, based on
my reading of https://gcc.gnu.org/wiki/Better_Uninitialized_Warnings).

I also found two other files in the kernel that doen't build correctly once
-fprofile-arcs is enabled (with or without tree-loop-im), I'll report those as
separate bugs.

[Bug c/70232] New: excessive stack usage with -O2

2016-03-14 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

Bug ID: 70232
   Summary: excessive stack usage with -O2
   Product: gcc
   Version: 6.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

Created attachment 37964
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37964&action=edit
partially reduced test case

Using gcc-6.0 for building the Linux kernel, I see one file (out of many
hundred randconfig test builds so far) that occasionally hits the maximum
per-function stack size limit. I have only tested on 32-bit ARM, and I have
manually reduced the test case:

$ arm-linux-gnueabi-gcc -o lpfc_scsi.o -O2 -Wframe-larger-than=1024
lpfc_scsi.c: In function 'lpfc_find_next_oas_lun':
lpfc_scsi.c:117:1: warning: the frame size of 1152 bytes is larger than 1024
bytes [-Wframe-larger-than=]
$ arm-linux-gnueabi-gcc --version   
arm-linux-gnueabi-gcc (GCC) 6.0.0 20160313 (experimental)

I see this with all options as long as -O2 is set, but not with -O3 or -O1 or
-Os. Normal stack usage for this function is usually 128 bytes.

[Bug target/70232] [6 regression] excessive stack usage with -O2

2016-03-18 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

--- Comment #4 from Arnd Bergmann  ---
I've tried out a few more things as well, to see if the alignment of the struct
lpfc_name type or the builtin memcpy makes a different. Replacing the array of
eight bytes with a single uint64_t and scalar operations instead of string
functions makes very little difference, so it seems to be neither of them.

However, I think the wwn_to_uint64_t() function is what causes the problem.
This is supposed to be turned into a direct load or a byte reversing load
depending on endianess, but this apparently does not happen.

Adding -mbig-endian to the compiler flags brings the stack usage down, so
presumably the optimization step that identifies byteswap patters is what
causes the stack growth.

[Bug target/70232] [6 regression] excessive stack usage with -O2

2016-03-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

--- Comment #11 from Arnd Bergmann  ---
For reference, I have sent a patch to the kernel to replace the open-coded
byteswap with a __builtin_bswap64. This produces much better object code with
both gcc-5 and gcc-6 than the original version, so it's certainly a good thing
to do regardless of the resolution in gcc-6:

http://thread.gmane.org/gmane.linux.kernel/2178301

For reference, these are the sizes of stack usage and function length:

gcc-5.3.1, linux-4.5:

[Bug target/70232] [6 regression] excessive stack usage with -O2

2016-03-20 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

--- Comment #12 from Arnd Bergmann  ---
Created attachment 37991
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37991&action=edit
simpler test case without manual byte swap

For reference, I have sent a patch to the kernel to replace the open-coded
byteswap with a __builtin_bswap64. This produces much better object code with
both gcc-5 and gcc-6 than the original version and uses the native swap
instructions, so it's certainly a good thing to do regardless of the resolution
in gcc-6:

http://thread.gmane.org/gmane.linux.kernel/2178301

For reference, these are the sizes of stack usage and function length using the
actual source code from the kernel:

 stack length
gcc-5.3.1, linux-4.5: 156  1146
gcc-5.3.1, patched:28   804
gcc-6.0.0, linux-4.5:1192  5144
gcc-6.0.0, patched:76  1612

I have adapted the test case now to no longer use unaligned data, memcpy or
manual byte swaps. The object code generated by gcc-6 nowhere as bad as with
the original example, but still considerably worse than what I get with  gcc-5.

[Bug target/70232] [6 regression] excessive stack usage with -O2

2016-03-23 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232

--- Comment #19 from Arnd Bergmann  ---
I've confirmed that the current gcc trunk addresses the original problem in the
Linux kernel build, aside from the reduced test case. Thanks a lot for working
on this!

[Bug target/62254] [4.9/5/6 Regression] gcc-4.9 ICEs on linux kernel zlib for armv3

2016-03-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62254

Arnd Bergmann  changed:

   What|Removed |Added

 CC||arnd at linaro dot org

--- Comment #9 from Arnd Bergmann  ---
Created attachment 38116
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=38116&action=edit
preprocessed source for another test, not minimized

I'm still getting what looks like the same or very similar problem with
arm-linux-gnueabi-gcc (GCC) 6.0.0 20160323:

arm-linux-gnueabi-gcc -march=armv3 -Os -fno-omit-frame-pointer
-fstack-protector-strong -Wno-pointer-sign -fprofile-arcs  -fno-tree-loop-im 
-c comm.i


/git/arm-soc/drivers/block/paride/comm.c: In function 'comm_write_block':
/git/arm-soc/drivers/block/paride/comm.c:177:1: internal compiler error: in
arm_reload_out_hi, at config/arm/arm.c:15604
 }
 ^
0xbba4e2 arm_reload_out_hi(rtx_def**)
/home/arnd/git/gcc/gcc/config/arm/arm.c:15604
0xcda654 gen_reload_outhi(rtx_def*, rtx_def*, rtx_def*)
/home/arnd/git/gcc/gcc/config/arm/arm.md:6408
0x831c71 insn_gen_fn::operator()(rtx_def*, rtx_def*, rtx_def*) const
/home/arnd/git/gcc/gcc/recog.h:302
0x831c71 check_and_process_move
/home/arnd/git/gcc/gcc/lra-constraints.c:1217
0x831c71 curr_insn_transform
/home/arnd/git/gcc/gcc/lra-constraints.c:3512
0x8328de lra_constraints(bool)
/home/arnd/git/gcc/gcc/lra-constraints.c:4494
0x8234e4 lra(_IO_FILE*)
/home/arnd/git/gcc/gcc/lra.c:2290
0x7e3ec1 do_reload
/home/arnd/git/gcc/gcc/ira.c:5425
0x7e3ec1 execute
/home/arnd/git/gcc/gcc/ira.c:5609
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <http://gcc.gnu.org/bugs.html> for instructions.

I've see similar ICEs in around two dozen files in the kernel now, but haven't
checked  which ones remain broken with current gcc-6. Please see if you can
spot something in the test case above, or let me know if you need me to check
the other files or reduce the test case further. Note that the command line
arguments above (-march=armv3 -Os -fno-omit-frame-pointer
-fstack-protector-strong -fprofile-arcs  -fno-tree-loop-im) all seem to be
significant here.

[Bug target/62254] [4.9/5/6 Regression] gcc-4.9 ICEs on linux kernel zlib for armv3

2016-03-29 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62254

--- Comment #11 from Arnd Bergmann  ---
(In reply to Nick Clifton from comment #10)
> Created attachment 38118 [details]

> This patch fixes your particular test case, but I am not sure if it will 
> handle all of the ICEs in the kernel.  Please could you give it a try out and 
> let me know the results ?

I've done some randconfig builds for RiscPC now and found a couple of remaining
failures with the patch from comment #7, so far all in the same three files.
Adding the patch from comment #10 fixes all of them. I'll let you know if
anything else shows up later.

>   The patch is a bit of hack.  (Well actually the both patches are hacks). 
> It seems likely that pre ARM v4t architecture support will be dropped in the
> future.  So I have to ask, is ARM v3 support really still of interest to
> kernel developers ?

We've had a discussion on the arm-kernel list about that. We have five ARMv4
platforms based on StrongARM or FA526 that are in active use (strongarm1100,
ebsa110, footbridge/netwinder, Gemini and Moxa ART), and it looks like we can
keep using them with future gcc versions by building for ARMv4T and linking
with ld --fix-v4bx, I've proposed a patch for that already.

The only reason we are still building with -march=armv3 is Russell King's
RiscPC, which also has a StrongARM 110 but cannot do 16-bit memory accesses. It
would be nice to get at least gcc-6 to work in that configuration before the
support is getting removed, so gcc-4.9 is not the last working version.
Ideally, we would have a way for a normal -march=armv4t build with another flag
to avoid 16-bit load/store operations, but I'm guessing that is exactly the
code that you want to kill off in the armv3 deprecation (aside from the
non-interworking branches).

[Bug middle-end/81483] New: spurious -Wformat-overflow warning for limited types

2017-07-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81483

Bug ID: 81483
   Summary: spurious -Wformat-overflow warning for limited types
   Product: gcc
   Version: 7.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

After looking in more detail at -Wformat-overflow warnings in the linux
kernels, I managed to reduce one warning to this test case:

#include 
void alienware_zone_init(unsigned char num_zones)
{
unsigned char zone;
char buffer[10];

for (zone = 0; zone < num_zones; zone++)
sprintf(buffer, "zone%02X", zone);
}

I get this warning:

$ x86_64-linux-gcc-7.1.1 -Wall -S /tmp/test4.c -O2 
/tmp/test4.c: In function 'alienware_zone_init':
/tmp/test4.c:9:24: warning: '%02X' directive writing between 2 and 8 bytes into
a region of size 6 [-Wformat-overflow=]
   sprintf(buffer, "zone%02X", zone);
^~~~
/tmp/test4.c:9:19: note: directive argument in the range [0, 2147483647]
   sprintf(buffer, "zone%02X", zone);
   ^~
/tmp/test4.c:9:3: note: 'sprintf' output between 7 and 13 bytes into a
destination of size 10
   sprintf(buffer, "zone%02X", zone);
   ^

which is surprising because the range "[0, 2147483647]" is larger
than an 'unsigned char' can hold. In similar test cases that use
an 'unsigned char' without the loop, we do not get the warning.

[Bug c/81484] New: incorrect -Wint-in-bool-context warning

2017-07-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484

Bug ID: 81484
   Summary: incorrect -Wint-in-bool-context warning
   Product: gcc
   Version: 7.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

I came across a warning in the linux kernel with gcc-7.1.1:

arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add':
arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in
boolean context [-Werror=int-in-bool-context]

After analyzing it further, I reduced it to this test program, please compile
with "gcc-7.1.1 -Wall -O2":

#define signbyte(a) (((unsigned char *)(a))[9])
#define getsign(a)  (signbyte(a) & 0x80)
#define setsign(a,b)do { if (b) signbyte(a) |= 0x80; else signbyte(a) &=
0x7f; } while (0)
#define CW_RC   0x0C00
#define RC_DOWN 0x0400
#define SIGN_POS0u
#define SIGN_NEG0x80u
int FPU_add(void *dest, int control_w)
{
/* a) correct code, bogus warning */
setsign(dest, ((control_w & CW_RC) != RC_DOWN) ? SIGN_POS : SIGN_NEG);

/* b) false-positive warning */
setsign(dest, (control_w & CW_RC) != RC_DOWN ? SIGN_POS : SIGN_NEG);

/* c) wrong code, no warning */
setsign(dest, (control_w & CW_RC) != (RC_DOWN ? SIGN_POS : SIGN_NEG));

return 0;
}

The warning we get is about case a), which as far as I can tell is written
correctly and unambiguous, and should not trigger a warning.

It appears that the compiler mistakes this for code that is written in a
somewhat ambiguous way like the (still-correct) variant b) when the author
actually meant to write it like c). The result is that we also get a
false-positive warning for b) that makes a lot of sense, while the incorrect
case c) is a false-negative here, as we don't get a warning.

[Bug c/81484] incorrect -Wint-in-bool-context warning

2017-07-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484

--- Comment #3 from Arnd Bergmann  ---
It seems I got a little confused when I only looked at the initial patch that
was proposed, which was supposed to cover specifically the comparison followed
by ?: as in https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00115.html. I see now
that the patch that made it into the release also covers the more general case
and that seems intentional, it just causes false-positive warnings.

The part that still seems odd here is that the warning does not take into
account the macro: The caller just passes an expression as the integer argument
into the setsign() macro, while the macro tests it for being nonzero.

In the opposite case, I found a piece of kernel code that does not produce a
warning unless we preprocess it first (as ccache does for instance):

#define EINVAL 22
#define v4l2_subdev_call(sd, f) ((sd)->f ? (sd)->f(sd) : -EINVAL)
struct v4l2_subdev {
int (*g_sliced_fmt)(struct v4l2_subdev *sd);
};
int f(struct v4l2_subdev *sd)
{
if (v4l2_subdev_call(sd, g_sliced_fmt))
return -EINVAL;

return 0;
}

Here the macro contains the ?: operator while the caller contains the 'if()',
and something in gcc seems to decide not to warn about this, but that logic
does not identify the first example as an false-positive.

[Bug c/81484] incorrect -Wint-in-bool-context warning

2017-07-19 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484

--- Comment #5 from Arnd Bergmann  ---
(In reply to Marek Polacek from comment #4)
> (In reply to Arnd Bergmann from comment #3)
> > It seems I got a little confused when I only looked at the initial patch
> > that was proposed, which was supposed to cover specifically the comparison
> > followed by ?: as in
> > https://gcc.gnu.org/ml/gcc-patches/2016-09/msg00115.html. I see now that the
> > patch that made it into the release also covers the more general case and
> > that seems intentional, it just causes false-positive warnings.
> 
> I agree it's just a style warning here (to warn about if (c ? 0 : 2) I mean,
> as that's correct code), but the warning allows you to simplify the code, so
> that seems to be a good thing.

Right, in the particular case I ran into, the only thing that prevents
me from rewriting

 setsign(dest, ((control_w & CW_RC) != RC_DOWN) ? SIGN_POS : SIGN_NEG);

into

 setsign(dest, (control_w & CW_RC) == RC_DOWN);

is the local policy of always passing either SIGN_POS or SIGN_NEG
into setsign(), rather than a boolean. Similar code might rely
on those calling conventions, but this file is the only such instance
I found in Linux, so that is not a reason to change the compiler.

> > The part that still seems odd here is that the warning does not take into
> > account the macro: The caller just passes an expression as the integer
> > argument into the setsign() macro, while the macro tests it for being
> > nonzero.
> > 
> > In the opposite case, I found a piece of kernel code that does not produce a
> > warning unless we preprocess it first (as ccache does for instance):
> > 
> > #define EINVAL 22
> > #define v4l2_subdev_call(sd, f) ((sd)->f ? (sd)->f(sd) : -EINVAL)
> > struct v4l2_subdev {
> > int (*g_sliced_fmt)(struct v4l2_subdev *sd);
> > };
> > int f(struct v4l2_subdev *sd)
> > {
> > if (v4l2_subdev_call(sd, g_sliced_fmt))
> > return -EINVAL;
> > 
> > return 0;
> > }
> > 
> > Here the macro contains the ?: operator while the caller contains the
> > 'if()', and something in gcc seems to decide not to warn about this, but
> > that logic does not identify the first example as an false-positive.
> 
> The warning checks whether the ?: expression comes from a macro.  If it does
> (as in Comment 3), it doesn't warn.  But in the original testcase the ?:
> itself is not coming from a macro so it warns.

Ok, thanks for the explanation and for your time.

[Bug rtl-optimization/81592] New: spurious -Wformat-overflow warning with -fsanitize=signed-integer-overflow

2017-07-27 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81592

Bug ID: 81592
   Summary: spurious -Wformat-overflow warning with
-fsanitize=signed-integer-overflow
   Product: gcc
   Version: 7.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
  Target Milestone: ---

While testing the linux kernel with -fsanitize=signed-integer-overflow, I ran
into a bogus warning for sprintf format overflow for
security/keys/proc.c:proc_keys_show.

I have reduced the source file to this code snippet:

/* aarch64-linux-gcc-7.1.1 -O2 -c test.c -Wall -fno-strict-overflow 
-Wstrict-overflow=2 -fsanitize=signed-integer-overflow */
#include 

int proc_keys_show(long expiry, long now)
{
unsigned long timo;
char xbuf[4];

if (now < expiry) {
timo = expiry - now;
if (timo < 60)
sprintf(xbuf, "%lus", timo);
}

return 0;
}

When building with either -fwrapv or -fsanitize=signed-integer-overflow, I get
this warning:

test.c: In function '_proc_keys_show':
test.c:12:19: warning: '%lu' directive writing between 1 and 20 bytes into a
region of size 4 [-Wformat-overflow=]
sprintf(xbuf, "%lus", timo);
   ^~~
test.c:12:18: note: directive argument in the range [1, 18446744073709551615]
sprintf(xbuf, "%lus", timo);
  ^~
test.c:12:4: note: 'sprintf' output between 3 and 22 bytes into a destination
of size 4
sprintf(xbuf, "%lus", timo);
^~~


Building with just "-O2 -Wstrict-overflow=2" produces a different warning that
is also surprising:

test.c: In function '_proc_keys_show':
test.c:11:6: warning: assuming signed overflow does not occur when simplifying
conditional [-Wstrict-overflow]
   if (timo < 60)


As Jim Wilson mentioned on IRC, the first comparison here clearly determines
that no signed overflow can happen, and apparently that leads to the second
comparison being done using signed arithmetic, but then the compiler decides to
use a signed comparison and incorrectly assumes that it might have overflown.

This seems to be related to pr 79257 and pr 78969, but both of them are
sufficiently different that I decided to open a separate report here.

[Bug sanitizer/81601] New: incorrect Warray-bounds warning with -fsanitize

2017-07-28 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601

Bug ID: 81601
   Summary: incorrect Warray-bounds warning with -fsanitize
   Product: gcc
   Version: 7.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: sanitizer
  Assignee: unassigned at gcc dot gnu.org
  Reporter: arnd at linaro dot org
CC: dodji at gcc dot gnu.org, dvyukov at gcc dot gnu.org,
jakub at gcc dot gnu.org, kcc at gcc dot gnu.org, marxin at 
gcc dot gnu.org
  Target Milestone: ---

Created attachment 41856
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=41856&action=edit
reduced version of linux/net/ipv4/tcp_output.c

Compiling the Linux kernel with gcc-7.1.1 and ubsan, I get this warning:

net/ipv4/tcp_output.c: In function 'tcp_connect':
net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds
[-Werror=array-bounds]
   tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
^~
net/ipv4/tcp_output.c:2207:40: error: array subscript is below array bounds
[-Werror=array-bounds]
   tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;
   ~^

I have manually reduced the file to the attached version (this can be reduced
further, I decided to leave a little more context for clarity).

The warning is an array dereference after a range check:

if (tp->chrono_type > TCP_CHRONO_UNSPEC)
tp->chrono_stat[tp->chrono_type - 1] += now - tp->chrono_start;

so it clearly cannot be below the bounds.

In the original version, this happens specifically when at least one of
-fsanitize=object-size, -fsanitize=alignment, or -fsanitize=null is set in
addition to "-O2 -Wall", but not when all three are disabled. In the reduced
version, I can also reproduce it with "-Os -Wall" (without ubsan).

I also see the problem with gcc-7.0.1 on all architectures I tried (arm, arm64
and x86), but not with gcc-6.3.1.

[Bug sanitizer/81601] [7/8 Regression] incorrect Warray-bounds warning with -fsanitize

2017-07-28 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81601

--- Comment #4 from Arnd Bergmann  ---
Thanks for the analysis. I have now submitted a local workaround for the
kernel, adding a temporary variable to avoid accessing the bitfield twice, see
https://patchwork.kernel.org/patch/9869037/

[Bug tree-optimization/81592] spurious -Wformat-overflow warning with -fsanitize=signed-integer-overflow

2017-07-31 Thread arnd at linaro dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81592

--- Comment #2 from Arnd Bergmann  ---
I have scanned the linux kernel sources for related bogus warnings and found
six others like this one that do not show up using gcc-7.1.1 without UBSAN but
do show up with UBSAN added in:

security/keys/proc.c: In function 'proc_keys_show':
security/keys/proc.c:232:19: error: '%lu' directive writing between 1 and 20
bytes into a region of size 16 [-Werror=format-overflow=]
sound/soc/fsl/fsl_sai.c: In function 'fsl_sai_probe':
sound/soc/fsl/fsl_sai.c:837:21: error: '%d' directive writing between 1 and 10
bytes into a region of size 4 [-Werror=format-overflow=]
drivers/block/cciss.c: In function 'cciss_init_one':
drivers/block/cciss.c:5039:28: error: '%d' directive writing between 1 and 10
bytes into a region of size 3 [-Werror=format-overflow=]
sound/isa/gus/gus_mem_proc.c: In function 'snd_gf1_mem_proc_init':
sound/isa/gus/gus_mem_proc.c:72:27: error: '%i' directive writing between 1 and
10 bytes into a region of size 8 [-Werror=format-overflow=]
sound/isa/gus/gus_mem_proc.c:72:18: note: directive argument in the range [0,
2147483646]
drivers/net/ethernet/marvell/mvneta_bm.c: In function 'mvneta_bm_probe':
drivers/net/ethernet/marvell/mvneta_bm.c:306:17: error: ',capacity' directive
writing 9 bytes into a region of size between 1 and 10
[-Werror=format-overflow=]
drivers/net/ethernet/marvell/mvneta_bm.c:306:3: note: 'sprintf' output between
15 and 24 bytes into a destination of size 15
drivers/pnp/pnpbios/proc.c: In function 'pnpbios_proc_exit':
drivers/pnp/pnpbios/proc.c:338:18: error: '%02x' directive writing between 2
and 8 bytes into a region of size 3 [-Werror=format-overflow=]
drivers/pnp/pnpbios/proc.c:338:17: note: directive argument in the range [0,
2147483646]
drivers/mfd/max77620.c: In function 'max77620_probe':
drivers/mfd/max77620.c:282:25: error: '%d' directive writing between 1 and 10
bytes into a region of size 7 [-Werror=format-overflow=]

These are all similarly bogus to the original warning, in all cases it seems
that gcc normally has no problems identifying the range. I can provide
preprocessed source files for all instances if that helps.

  1   2   3   >