[Bug rtl-optimization/78963] New: Missed optimization opportunity in copies of small unaligned data
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78963 Bug ID: 78963 Summary: Missed optimization opportunity in copies of small unaligned data Product: gcc Version: 6.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: eyalroz at technion dot ac.il Target Milestone: --- Preliminary notes: * This bug report stems from a StackOverflow question I asked: http://stackoverflow.com/q/41407257/1593077 * This bug regards the x86_64 architecture, but may apply elsewhere. * This bug regards -O3 optimizations * Everything described here is about the same for GCC 6.3 and 7 - whatever version of it GodBolt uses. * The entire bug is demonstrated here: https://godbolt.org/g/lDJSRm plus here https://godbolt.org/g/9Y2ebd Consider the task of copying 3-byte values from one place to another. If both those places are in memory, it seems reasonable to do four moves, and indeed GCC compiles this: #include typedef struct { unsigned char data[3]; } uint24_t; void f(uint24_t* __restrict__ dest, uint24_t* __restrict__ src) { memcpy(dest,src,3); } into this (clipping the instructions for the return value): f(uint24_t*, uint24_t*): movzx eax, WORD PTR [rsi] mov WORD PTR [rdi], ax movzx eax, BYTE PTR [rsi+2] mov BYTE PTR [rdi+2], al If the source or the destination is a register, two mov's should suffice - either the first two or the second two of the above. However, if I write this (perhaps contrived, but likely demonstrative of what could happen with larger programs, especially with multi-translation units, or when the OS gives you a pointer to work with etc): #include typedef struct { unsigned char data[3]; } uint24_t; void f(uint24_t* __restrict__ dest, uint24_t* __restrict__ src) { memcpy(dest,src,3); } int main() { uint24_t* p = (uint24_t*) 48; unsigned x; f((uint24_t*) &x,p); x += 1; f(p,(uint24_t*) &x); return 0; } The 3-byte value is "constructed" on the stack rather than in a register (first four mov's), and then one cannot avoid using four more mov's to copy it to the destination: movzx eax, WORD PTR ds:48 mov WORD PTR [rsp-4], ax movzx eax, BYTE PTR ds:50 mov BYTE PTR [rsp-2], al add DWORD PTR [rsp-4], 1 movzx eax, WORD PTR [rsp-4] mov WORD PTR ds:48, ax movzx eax, BYTE PTR [rsp-2] mov BYTE PTR ds:50, al If we do this with 4-byte values, i.e. replace uint24_t with uint32_t, it's a single mov both ways, and in fact it gets further optimized, so that this: #include #include void f(uint32_t* __restrict__ dest, uint32_t* __restrict__ src) { memcpy(dest,src,4); } int main() { uint32_t* p = (uint32_t*) 48; uint32_t x; f(&x,p); x += 1; f(p,&x); return 0; } is compiled into just this add DWORD PTR ds:48, 1 Now obviously you can't expect to optimize-out _that_ much with a 3-byte value, but 2 mov's in and 2 mov's out should be enough. Indeed, clang (since at least 3.4.1 or so) emits this for the uint24_t code: movzx eax, byte ptr [50] shl eax, 16 movzx ecx, word ptr [48] lea eax, [rcx + rax + 1] mov word ptr [48], ax shr eax, 16 mov byte ptr [50], al which has just four mov's.
[Bug driver/78957] ICE: SIGSEGV with -fno-sso-struct=web
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78957 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2017-01-01 Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jakub Jelinek --- Created attachment 40434 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40434&action=edit gcc7-pr78957.patch Untested fix.
[Bug middle-end/77484] [6/7 Regression] Static branch predictor causes ~6-8% regression of SPEC2000 GAP
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77484 --- Comment #14 from Jan Hubicka --- Author: hubicka Date: Sun Jan 1 15:40:29 2017 New Revision: 243995 URL: https://gcc.gnu.org/viewcvs?rev=243995&root=gcc&view=rev Log: PR middle-end/77484 * predict.def (PRED_CALL): Update hitrate. (PRED_INDIR_CALL, PRED_POLYMORPHIC_CALL): New predictors. * predict.c (tree_estimate_probability_bb): Split CALL predictor into direct/indirect/polymorphic variants. Modified: trunk/gcc/ChangeLog trunk/gcc/predict.c trunk/gcc/predict.def
[Bug target/78938] [7 Regression] ICE in expand_vec_cond_expr, at optabs.c:5636 w/ -mavx512bw -ftree-loop-vectorize -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78938 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P4 CC||law at redhat dot com
[Bug tree-optimization/78961] [7 Regression] FAIL: gcc.dg/tree-ssa/scev-3.c scan-tree-dump-times optimized "&a" 1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78961 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P4 CC||law at redhat dot com
[Bug middle-end/78879] -fprofile-generate causes undefined reference to `____ilog2_NaN'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78879 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #12 from Jeffrey A. Law --- b_c_p with two distinct constant values returns 0. You're missing the fundamental point. Changing the order is just working around the core problem in that there are paths where b_c_p may return 0 and thus you must issue the call to ilog_NaN, which is not provided. This is a linux kernel bug, not a bug in GCC.
[Bug tree-optimization/78913] Probably misleading error reported by -Wformat-length
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78913 --- Comment #4 from Martin Sebor --- I'm not sure I do understand exactly what you mean. The warning in this specific case is a false positive. There is no easy way for GCC to avoid it without compromising the checker's efficacy in general. (There is, however, an easy way to avoid the warning by changing the code.) Let me explain the rationale for the warning (in the general case). Unintended truncation by snprintf has been linked to a number of vulnerabilities (see [1] and [2]). To avoid truncation, the snprintf destination buffer must be big enough to hold the longest output of the function for the given format string and the possible values of the arguments of the directives within it. Ignoring the strcpy call for the moment, in the test case in comment #0, the buffer is big enough for input strings at most 508 bytes long, but the "design" allows the input to be as long as 511 bytes. If a.path were 509 or more bytes long the result would be truncated, missing some or all of the characters in the "/_\n" suffix. If the constructed path were to be used to create a sensitive file in a privileged directory, the missing suffix could result in the file being created in a directory accessible by unprivileged users. (This is the prototypical example of why -Wformat-length warns for snprintf and not just sprintf.) If a.path can never be longer than 508 characters then there are a couple of ways to express that constraint: 1) use the %.508s directive instead of %s, or 2) verify the snprintf return value is less than 512. As I explained, due to the limitation of the implementation, the length of the string created by the strcpy call is not available to the checker and so it or similar solutions (such as guarding the snprintf call by 'if (srrlen(a.path) < 508)) don't suppress the warning and will likely require extensive changes to implement. Of the two alternatives, (2) is the expected/recommended way to avoid both the truncation and the warning. (3) is a possible enhancement that could also be used to suppress the warning (along with changing the code). Does one of these alternatives resolve your concern? [1] CWE-22: Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal') [2] The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities, chapter 8, Truncation
[Bug tree-optimization/78913] Probably misleading error reported by -Wformat-length
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78913 --- Comment #5 from Martin Sebor --- (In reply to Martin Sebor from comment #4) > 1) use the %.508s directive instead of %s, or > 2) verify the snprintf return value is less than 512. Whoops. An off-by-one error. I meant to follow that by: > Of the two alternatives, (1) is the expected/recommended way to avoid both > the truncation and the warning. (2) is a possible enhancement that could > also be used to suppress the warning (along with changing the code).
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 Martin Sebor changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2017-01-01 Ever confirmed|0 |1 --- Comment #1 from Martin Sebor --- The test output "4105" => 4 indicates that on HP-UX the %hhd directive doesn't convert the int argument (4105) to unsigned char to end up with 9 as the test expects. I think that's a bug in the HP-UX sprintf (do you have a way of reporting it and/or finding out if HP will fix it?), although there is/was some uncertainty about whether passing a value outside the range of char to %hhd is valid (https://groups.google.com/forum/#!topic/comp.std.c/evMVFdOufys). One way to way to deal with this in GCC might be to add a configure test to detect the bug and adjust the byte count in the gimple-ssa-sprintf pass based on its result. Another might be to add a target hook describing the behavior and making the pass conditional on it. Or, if the code isn't considered valid the test could simply be marked XFAIL on HP-UX.
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 --- Comment #2 from John David Anglin --- I think this is a c99 feature and we should add the following to test: /* { dg-require-effective-target c99_runtime } */ It looks as if 11.31 supports the %hhd directive but not earlier versions.
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 --- Comment #3 from Martin Sebor --- Okay, thanks. I can add that but shouldn't the pass still correctly handle the pre-C99 HP-UX behavior (i.e., when HAVE_C99_RUNTIME is not defined)? (In case it's not apparent from the test, it fails because GCC optimizes calls like "sprintf(0, 0, "%hhd", 4105)" to 1 when on pre-C99 HP-UX the function returns 4.
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 --- Comment #4 from Martin Sebor --- Or were you saying that prior to HP-UX 11.31 printf didn't have %hhd at all and treated it as an ordinary string (or undefined behavior)? If it's the latter the test change alone would probably be sufficient.
[Bug target/78938] [7 Regression] ICE in expand_vec_cond_expr, at optabs.c:5636 w/ -mavx512bw -ftree-loop-vectorize -O1
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78938 Jakub Jelinek changed: What|Removed |Added Priority|P4 |P1 CC||jakub at gcc dot gnu.org
[Bug middle-end/78960] FAIL: gcc.dg/tree-ssa/builtin-sprintf.c execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78960 Martin Sebor changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2017-01-01 CC||msebor at gcc dot gnu.org See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=78959, ||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=78133 Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Martin Sebor --- Based on the test output it looks like the failures are caused by the same underlying problem as bug 78959 (HP-UX printf lacking full support for the hh and h length modifiers), and by bug 78133 (lack of support for %a). As noted in the former bug it should be possible to resolve both by adding " dg-require-effective-target c99_runtime" to the tests, although it would be better to break out the failing tests into their own file so that the others that do not require full C99 support can still be exercised even on C89 implementations. The failure in test_f_long_double:530 look new.
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 Martin Sebor changed: What|Removed |Added Status|WAITING |ASSIGNED Assignee|unassigned at gcc dot gnu.org |msebor at gcc dot gnu.org
[Bug fortran/71880] pointer to allocatable character
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71880 --- Comment #5 from Harald Anlauf --- Maybe the dump-tree gives a hint. Looking at the example program t character(:), dimension(:), allocatable, target :: c character(:), dimension(:), pointer :: p allocate(c(10),source='X') p=>c write(*,*) size(c),len(c) !,' c=<',c(1),'>' write(*,*) size(p),len(p) !,' p=<',p(1),'>' end program t I find: t () { integer(kind=4) .c; struct array1_unknown c; integer(kind=4) .p; struct array1_unknown p; [...] .c = 1; [...] but .p seems to never get set. (.c, .p are the quantities that are printed as the len(c),len(p)).
[Bug ipa/77674] [7 Regression] ICE in binds_to_current_def_p with -fkeep-inline-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77674 --- Comment #8 from Jan Hubicka --- Author: hubicka Date: Sun Jan 1 23:31:53 2017 New Revision: 243997 URL: https://gcc.gnu.org/viewcvs?rev=243997&root=gcc&view=rev Log: PR middle-end/77674 * symtab.c (symtab_node::binds_to_current_def_p): Fix handling of transparent aliases. PR middle-end/77674 * g++.dg/torture/pr77674.C: New testcase. Added: trunk/gcc/testsuite/g++.dg/torture/pr77674.C Modified: trunk/gcc/ChangeLog trunk/gcc/symtab.c trunk/gcc/testsuite/ChangeLog
[Bug ipa/77674] [7 Regression] ICE in binds_to_current_def_p with -fkeep-inline-functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77674 Jan Hubicka changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Jan Hubicka --- Fixed.
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 --- Comment #5 from dave.anglin at bell dot net --- On 2017-01-01, at 4:12 PM, msebor at gcc dot gnu.org wrote: > Or were you saying that prior to HP-UX 11.31 printf didn't have %hhd at all > and > treated it as an ordinary string (or undefined behavior)? If it's the latter > the test change alone would probably be sufficient. HP-UX 11.31 has %hhd support. -- John David Anglin dave.ang...@bell.net
[Bug middle-end/78959] FAIL: gcc.c-torture/execute/pr78622.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78959 --- Comment #6 from John David Anglin --- (In reply to Martin Sebor from comment #1) > The test output > > "4105" => 4 > > indicates that on HP-UX the %hhd directive doesn't convert the int argument > (4105) to unsigned char to end up with 9 as the test expects. I think > that's a bug in the HP-UX sprintf (do you have a way of reporting it and/or > finding out if HP will fix it?), although there is/was some uncertainty > about whether passing a value outside the range of char to %hhd is valid > (https://groups.google.com/forum/#!topic/comp.std.c/evMVFdOufys). snprintf returns a count of 4 and buf contains "4105". So, it looks like on HP-UX 11.11 %hhd is being interpreted as %hd (a short integer conversion). The %hhd form is not described in the 11.11 man page.
[Bug c++/78964] New: gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 Bug ID: 78964 Summary: gcc fails to detect pointless increment Product: gcc Version: 7.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: dcb314 at hotmail dot com Target Milestone: --- Given the following C / C++ code: extern void g( int); void f( int n) { int m = 0; for (int i = 0; i < n; ++i) { g(i); ++m; } } gcc trunk dated 20161231 fails to detect that m can be safely removed: $ ~/gcc/results/bin/gcc -c -g -O2 -Wall -Wextra jan02.cc $ I think fixing this bug report will find about 100 bugs in the linux kernel.
[Bug c++/78964] gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 Markus Trippelsdorf changed: What|Removed |Added CC||trippels at gcc dot gnu.org --- Comment #1 from Markus Trippelsdorf --- (In reply to David Binderman from comment #0) > Given the following C / C++ code: > > extern void g( int); > > void f( int n) > { > int m = 0; > > for (int i = 0; i < n; ++i) > { > g(i); > ++m; > } > } > > gcc trunk dated 20161231 fails to detect that m can be safely removed: > > $ ~/gcc/results/bin/gcc -c -g -O2 -Wall -Wextra jan02.cc > $ > > I think fixing this bug report will find about 100 bugs > in the linux kernel. I can see no bug. The compiler will of course eliminate the pointless ++m. Why should it warn?
[Bug c++/78964] gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 --- Comment #2 from David Binderman --- >Why should it warn? For all the same reasons as warning -Wunused-but-set-variable. See the following code: extern void g( int); void f( int n) { int m = 0; int m2 = 2; for (int i = 0; i < n; ++i) { g(i); ++m; } m2 = 3; } $ ~/gcc/results/bin/gcc -c -O2 -Wall jan02.cc jan02.cc: In function ‘void f(int)’: jan02.cc:7:6: warning: variable ‘m2’ set but not used [-Wunused-but-set-variable] int m2 = 2; ^~ $ The compiler warns about m2, but not m. Anomaly.
[Bug c++/78964] gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- ++m is no different from m = m + 1;, both read the variable content before writing it again, and with the current infrastructure it would be terribly hard to figure out that the var is read solely to write its value again. Even trying to explain this in the documentation would be hard.
[Bug c++/78964] gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 --- Comment #4 from Markus Trippelsdorf --- And the Linux kernel would not see these warnings anyway: Makefile: 707 # These warnings generated too much noise in a regular build. 708 # Use make W=1 to enable them (see scripts/Makefile.build) 709 KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable) 710 KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
[Bug c++/78964] gcc fails to detect pointless increment
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78964 Markus Trippelsdorf changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |WONTFIX --- Comment #5 from Markus Trippelsdorf --- Lets close it.