[Bug fortran/99345] [11 Regression] ICE in doloop_contained_procedure_code, at fortran/frontend-passes.c:2464 since r11-2578-g27eac9ee6137a6b5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99345 Thomas Koenig changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |tkoenig at gcc dot gnu.org --- Comment #11 from Thomas Koenig --- Harald, thanks for reducing it!
[Bug fortran/99345] [11 Regression] ICE in doloop_contained_procedure_code, at fortran/frontend-passes.c:2464 since r11-2578-g27eac9ee6137a6b5
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99345 Thomas Koenig changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #12 from Thomas Koenig --- Fixed with https://gcc.gnu.org/g:52654036a544389fb66855bf3972f2a8013bec59 . Thanks for the bug report!
[Bug web/99598] New: Commits are not transferred to bugzilla
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99598 Bug ID: 99598 Summary: Commits are not transferred to bugzilla Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: web Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- See https://gcc.gnu.org/pipermail/gcc-cvs/2021-March/343081.html , which is not distributed to bugzilla and the gcc-bugs mailing list, despite the ChangeLog entry reading Handle EXEC_IOLENGTH in doloop_contained_procedure_code. This rather obvious patch fixes an ICE on valid which came about because I did not handle EXEC_IOLENGTH as start of an I/O statement when checking for the DO loop variable. This is an 11 regression. gcc/fortran/ChangeLog: PR fortran/99345 * frontend-passes.c (doloop_contained_procedure_code): Properly handle EXEC_IOLENGTH. gcc/testsuite/ChangeLog: PR fortran/99345 * gfortran.dg/do_check_16.f90: New test. * gfortran.dg/do_check_17.f90: New test.
[Bug target/100045] New: Precomputing division
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100045 Bug ID: 100045 Summary: Precomputing division Product: gcc Version: unknown Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- Created attachment 50567 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50567&action=edit Test case We use the method given in "Division by Invariant Integers using Multiplication" by Granlund and Montgomery for optimizing division by divisors known to be constant at compile time. There can also be an advantage if many numbers are divided by the same number; in this case, the invariant inverse can be moved out of the loop. This is target-dependent. The attached test case performs 1000 unsigned divisions of uint32_t values read in randomly by a constant randomly chosen to be 12345678 - using the method from figure 4.1 from the publication cited above (timing in seconds given as pre_divide) - using a simple loop with divisions (timing in seconds given as divide). On an AMD Ryzen 7 1700X, the timings are pre_divide: t = 0.013330 s divide: t = 0.052511 s OTOH, on POWER (gcc135), the difference is so small that it is very probably not worth the bother: pre_divide: t = 0.015183 s divide: t = 0.017454 s
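As a rough illustration of moving the invariant inverse out of the loop, here is a minimal sketch in C. It does not reproduce figure 4.1 of Granlund and Montgomery and it is not the attached test case; it uses the simpler 64-bit reciprocal variant described by Lemire, Kaser and Kurz, which holds for 32-bit operands and divisors of at least 2. The names `inv_t`, `inv_init`, `inv_div` and `divide_all` are invented for this example, and `__uint128_t` is assumed to be available (GCC or Clang on a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: the precomputed reciprocal of a divisor. */
typedef struct { uint64_t m; } inv_t;

/* Precompute m = ceil(2^64 / d); requires d >= 2. */
inv_t inv_init(uint32_t d)
{
    assert(d >= 2);
    inv_t inv = { UINT64_MAX / d + 1 };
    return inv;
}

/* n / d, taken as the high 64 bits of the 128-bit product m * n. */
uint32_t inv_div(uint32_t n, inv_t inv)
{
    return (uint32_t)(((__uint128_t)inv.m * n) >> 64);
}

/* Divide a whole array by the same divisor; the inverse is computed once. */
void divide_all(uint32_t *v, size_t len, uint32_t d)
{
    const inv_t inv = inv_init(d);
    for (size_t i = 0; i < len; i++)
        v[i] = inv_div(v[i], inv);
}
```

Whether this beats the hardware divider is target-dependent, which is the point of the timings quoted in the report.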
[Bug fortran/94978] [8/9/10/11 Regression] Bogus warning "Array reference at (1) out of bounds in loop beginning at (2)"
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94978 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug libfortran/98076] Increase speed of integer I/O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/82215] Feature request to better support two pass compiling with gfortran
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82215 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/92913] Add argument-mismatch check for INTERFACE for non-module procedures in the same file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92913 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/97345] FE passes do_subscript leaks gmp memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97345 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug fortran/93114] Use span passing components of derived types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93114 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug fortran/96216] Gap in interface checking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96216 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/30609] Calculating masks twice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30609 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/97454] Decls for Fortran library procedures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug fortran/83927] Type-Bound Procedure on element of Derived Type PARAMETER Array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83927 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/67202] Fortran FE should load scalar pass-by-reference intent-in arguments at the beginning of a function
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67202 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug fortran/90536] Spurious (?) warning when using -Wconversion with -fno-range-check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90536 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/93956] Wrong array creation with p => array_dt(1:n)%component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93956 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|REOPENED|NEW
[Bug libfortran/95101] Optimize libgfortran library handling of arrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95101 Thomas Koenig changed: What|Removed |Added Status|ASSIGNED|NEW Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org
[Bug fortran/40976] Merge DECL of procedure call with DECL of gfc_get_function_type
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=40976 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/68289] Missing diagnostic pragmas
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68289 Thomas Koenig changed: What|Removed |Added Assignee|tkoenig at gcc dot gnu.org |unassigned at gcc dot gnu.org Status|ASSIGNED|NEW
[Bug fortran/92422] [9 Regression] Warning with character and optimisation flags
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92422 Thomas Koenig changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING |RESOLVED --- Comment #4 from Thomas Koenig --- Well, it's not fixed on gcc9, but I don't think it makes sense to try to find out what fixed this. Hence, closing as FIXED.
[Bug fortran/97454] New: Decls for Fortran library procedures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454 Bug ID: 97454 Summary: Decls for Fortran library procedures Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- Currently, the decls for Fortran library procedures are inconsistent, which causes, among other things, segfaults on Darwin for ARM (PR96168). We should fix them all. For maxval, findloc and friends, I am working on a patch (see https://gcc.gnu.org/pipermail/fortran/2020-October/055170.html ). For cshift etc, we have to be more general, because we use the same routines for different types.
[Bug fortran/97454] Decls for Fortran library procedures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97454 Thomas Koenig changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=96168 Last reconfirmed||2020-10-16 Ever confirmed|0 |1 Assignee|unassigned at gcc dot gnu.org |tkoenig at gcc dot gnu.org
[Bug rtl-optimization/97459] New: __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 Bug ID: 97459 Summary: __uint128_t remainder for division by 3 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- The following two functions are equivalent: unsigned r3_128u_v1 (__uint128_t n) { unsigned long a; a = (n >> 64) + (n & 0x); return a % 3; } unsigned r3_128u_v2 (__uint128_t n) { return (unsigned) (n%3); } and the first one is definitely faster. (The approach is due to Hacker's Delight, 2nd edition, "Remainder by Summing Digits". There are also other interesting approaches there.)
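The idea rests on 2^64 ≡ 1 (mod 3), so a 128-bit value is congruent to the sum of its two 64-bit halves. A minimal sketch in C follows, assuming GCC or Clang for `__uint128_t` and `__builtin_add_overflow`; the function names are invented here. Unlike a plain 64-bit addition, it accounts for the carry out of the sum.

```c
#include <assert.h>
#include <stdint.h>

/* n mod 3 for a 128-bit n, using 2^64 ≡ 1 (mod 3). */
unsigned rem3_u128(__uint128_t n)
{
    uint64_t hi = (uint64_t)(n >> 64);
    uint64_t lo = (uint64_t)n;
    uint64_t sum;

    /* A carry out of hi + lo is worth 2^64 ≡ 1 (mod 3), so add it back.
       After a wrap, sum <= 2^64 - 2, so the increment cannot overflow. */
    if (__builtin_add_overflow(hi, lo, &sum))
        sum++;
    return (unsigned)(sum % 3);
}

/* Cross-check against the plain 128-bit modulo. */
void rem3_check(__uint128_t n)
{
    assert(rem3_u128(n) == (unsigned)(n % 3));
}
```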
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #4 from Thomas Koenig --- Here's a complete program for benchmarks on x86_64, using Jakub's functions (so they are indeed correct): #include #include #include #include #include #include unsigned r3_128u_v2 (__uint128_t n) { return (unsigned) (n%3); } unsigned r3_128u_v3 (__uint128_t n) { unsigned long a; a = (n >> 88); a += (n >> 44) & 0xfffULL; a += (n & 0xfffULL); return a % 3; } unsigned r3_128u_v4 (__uint128_t n) { unsigned long a; a = (n >> 96); a += (n >> 64) & 0xULL; a += (n >> 32) & 0xULL; a += (n & 0xULL); return a % 3; } #define N 100 int main() { __uint128_t *a; unsigned int s; unsigned long t1, t2; int fd; int i; a = malloc (sizeof (*a) * N); fd = open ("/dev/random", O_RDONLY); read (fd, a, sizeof (*a) * N); s = 0; t1 = __rdtsc(); for (i=0; i
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #5 from Thomas Koenig --- OK, so here is a benchmark with its function names corrected. It also includes one version (_v5) which is a bit faster. (Note I increased the number of iterations to get more accuracy out of the cycle count, which leads to numbers not being comparable to the previous benchmark.) #include #include #include #include #include #include unsigned r3_128u_v2 (__uint128_t n) { return (unsigned) (n%3); } unsigned r3_128u_v3 (__uint128_t n) { unsigned long a; a = (n >> 88); a += (n >> 44) & 0xfffULL; a += (n & 0xfffULL); return a % 3; } unsigned r3_128u_v4 (__uint128_t n) { unsigned long a; a = (n >> 96); a += (n >> 64) & 0xULL; a += (n >> 32) & 0xULL; a += (n & 0xULL); return a % 3; } unsigned r3_128u_v5 (__uint128_t n) { unsigned long a, b, c; b = n >> 64; c = n; if (__builtin_add_overflow (b, c, &a)) a++; return a%3; } #define N 1 int main() { __uint128_t *a; unsigned int s; unsigned long t1, t2; int fd; int i; a = malloc (sizeof (*a) * N); fd = open ("/dev/random", O_RDONLY); read (fd, a, sizeof (*a) * N); s = 0; t1 = __rdtsc(); for (i=0; i
[Bug fortran/95119] [9/10 Regression] CLOSE hangs when -fopenmp is specified in compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95119 --- Comment #13 from Thomas Koenig --- (In reply to Bill Long from comment #12) > Original submitter asking which GCC version(s) have / will have the fix. 10.2 already has been released with the fix. 9.4 and 11.1 will have it in when they are released.
[Bug libfortran/95104] [9/10 Regression] Segfault on a legal WAIT statement
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95104 Thomas Koenig changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org --- Comment #19 from Thomas Koenig --- Fixed for 10.2. 9.4 and 11.1 will have the fix in.
[Bug fortran/95037] gfortran fails to compile a simple subroutine, issues an opaque message
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95037 --- Comment #5 from Thomas Koenig --- Fixed in 10.2, 9.4 and 11.1 will have it.
[Bug fortran/97491] New: Wrong restriction for VALUE arguments of pure procedures
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97491 Bug ID: 97491 Summary: Wrong restriction for VALUE arguments of pure procedures Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- $ cat pure.f90 pure function foo(x) result (ret) integer :: ret integer, value :: x x = x / 2 ret = x end function foo $ gfortran pure.f90 pure.f90:4:2: 4 | x = x / 2 | 1 Error: Variable 'x' cannot appear in a variable definition context (assignment) at (1) in PURE procedure
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #9 from Thomas Koenig --- (In reply to Jakub Jelinek from comment #7) > So, can we use this for anything but modulo 3, or 5, or 17, or 257 (all of > those have 2^32 mod N == 2^64 mod N == 2^128 mod N == 1) I think so, too. > probably also > keyed on the target having corresponding uaddv4_optab handler, normal > expansion not being able to handle it and emitting a libcall? Again, yes. This can also be used as a building block for handling division and remainder base 10. Here's a benchmark for this (it uses the sum of digits base 10 instead). qsum1 uses the standard method, which you can find (for example) in libgfortran. div_rem5_v2 first calculates the remainder of the division by 5 using this method, then does an exact division by multiplying with its modular inverse for 2^128. div_rem10_v2 then uses div_rem5_v2 to calculate the value and remainder of the division by 10, and qsum_v2 uses that to calculate the sum of digits. 
The timings are about a factor of 2 faster than the straightforward libcall version: s = 360398898 qsum_v1: 1.091621 s s = 360398898 qsum_v2: 0.485509 s #include #include #include #include #include #include #include #define ONE ((__uint128_t) 1) #define TWO_64 (ONE << 64) typedef __uint128_t mytype; double this_time () { struct timeval tv; gettimeofday (&tv, NULL); return tv.tv_sec + tv.tv_usec * 1e-6; } unsigned qsum_v1 (mytype n) { unsigned ret; ret = 0; while (n > 0) { ret += n % 10; n = n / 10; } return ret; } static void inline __attribute__((always_inline)) div_rem_5_v2 (mytype n, mytype *div, unsigned *rem) { unsigned long a, b, c; /* The modular inverse to 5 modulo 2^128 */ const mytype magic = (0x * TWO_64 + 0xCCCD * ONE); b = n >> 64; c = n; if (__builtin_add_overflow (b, c, &a)) a++; *rem = a % 5; *div = (n-*rem) * magic; } static void inline __attribute__((always_inline)) div_rem_10_v2 (mytype n, mytype *div, unsigned *rem) { mytype n5; unsigned rem5; div_rem_5_v2 (n, &n5, &rem5); *rem = rem5 + (n5 % 2) * 5; *div = n5/2; } unsigned qsum_v2 (mytype n) { unsigned ret; unsigned rem; mytype n_new; ret = 0; while (n > 0) { div_rem_10_v2 (n, &n_new, &rem); ret += rem; n = n_new; } return ret; } #define N 1000 int main() { mytype *a; unsigned long int s; double t1, t2; int fd; long int i; a = malloc (sizeof (*a) * N); fd = open ("/dev/urandom", O_RDONLY); read (fd, a, sizeof (*a) * N); s = 0; t1 = this_time(); for (i=0; i
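The building blocks described in this comment can be sketched compactly as follows. This is an independent illustration rather than the benchmark itself, which is cut off above; the names are chosen for this example. The constant is the inverse of 5 modulo 2^128, which can be checked from 5 * 0xCCCC...CCCD = 4 * 2^128 + 1.

```c
#include <assert.h>
#include <stdint.h>

typedef __uint128_t u128;

/* Quotient and remainder of n / 5 without a 128-bit division. */
void div_rem_5(u128 n, u128 *div, unsigned *rem)
{
    /* Inverse of 5 modulo 2^128. */
    const u128 inv5 = ((u128)0xCCCCCCCCCCCCCCCCULL << 64) | 0xCCCCCCCCCCCCCCCDULL;
    uint64_t hi = (uint64_t)(n >> 64), lo = (uint64_t)n, sum;

    /* 2^64 ≡ 1 (mod 5), so n ≡ hi + lo, plus 1 for a carry. */
    if (__builtin_add_overflow(hi, lo, &sum))
        sum++;
    *rem = (unsigned)(sum % 5);
    /* n - rem is a multiple of 5, so multiplying by inv5 divides exactly. */
    *div = (n - *rem) * inv5;
}

/* n = 5 * n5 + rem5, hence n / 10 = n5 / 2 and n % 10 = rem5 + 5 * (n5 % 2). */
void div_rem_10(u128 n, u128 *div, unsigned *rem)
{
    u128 n5;
    unsigned rem5;
    div_rem_5(n, &n5, &rem5);
    *rem = rem5 + (unsigned)(n5 & 1) * 5;
    *div = n5 >> 1;
}

/* Sum of decimal digits using the routines above. */
unsigned digit_sum(u128 n)
{
    unsigned ret = 0, rem;
    while (n > 0) {
        div_rem_10(n, &n, &rem);
        ret += rem;
    }
    return ret;
}

/* Reference version using the ordinary 128-bit division. */
unsigned digit_sum_naive(u128 n)
{
    unsigned ret = 0;
    for (; n > 0; n /= 10)
        ret += (unsigned)(n % 10);
    return ret;
}
```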
[Bug bootstrap/97527] New: OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 Bug ID: 97527 Summary: OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: bootstrap Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- Created attachment 49418 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49418&action=edit config.log from the gcc subdirectory Bootstrap on OpenBSD fails with a strange error: /home/tkoenig/trunk-bin/./prev-gcc/xg++ -B/home/tkoenig/trunk-bin/./prev-gcc/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/ -nostdinc++ -B/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/src/.libs -B/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/libsupc++/.libs -I/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/include/x86_64-unknown-openbsd6.8 -I/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/include -I/home/tkoenig/trunk/libstdc++-v3/libsupc++ -L/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/src/.libs -L/home/tkoenig/trunk-bin/prev-x86_64-unknown-openbsd6.8/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -O2 -fchecking=1 -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -Wno-unused -DHAVE_CONFIG_H -I. -I. -I../../trunk/gcc -I../../trunk/gcc/. 
-I../../trunk/gcc/../include -I../../trunk/gcc/../libcpp/include -I/home/tkoenig/trunk-bin/./gmp -I/home/tkoenig/trunk/gmp -I/home/tkoenig/trunk-bin/./mpfr/src -I/home/tkoenig/trunk/mpfr/src -I/home/tkoenig/trunk/mpc/src -I../../trunk/gcc/../libdecnumber -I../../trunk/gcc/../libdecnumber/dpd -I../libdecnumber -I../../trunk/gcc/../libbacktrace -I/home/tkoenig/trunk-bin/./isl/include -I/home/tkoenig/trunk/isl/include -o gimple-match.o -MT gimple-match.o -MMD -MP -MF ./.deps/gimple-match.TPo gimple-match.c /home/tkoenig/x86_64-unknown-openbsd6.8/bin/as: out of memory allocating 8 bytes after a total of 0 bytes
[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #1 from Thomas Koenig --- Created attachment 49419 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49419&action=edit Preprocessed source of gimple-match.ii (compressed)
[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #2 from Thomas Koenig --- Created attachment 49420 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49420&action=edit Resulting assembler file (which is incomplete)
[Bug bootstrap/97527] OpenBSD bootstrap fails with error: C++ preprocessor "/lib/cpp" fails sanity check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #3 from Thomas Koenig --- Created attachment 49421 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49421&action=edit config.log from the main build directory
[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 Thomas Koenig changed: What|Removed |Added Target||x86_64-unknown-openbsd6.8 --- Comment #4 from Thomas Koenig --- Bootstrapping compiler is gcc220$ egcc -v Using built-in specs. COLLECT_GCC=egcc COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-openbsd6.8/8.4.0/lto-wrapper Target: x86_64-unknown-openbsd6.8 Configured with: /usr/obj/ports/gcc-8.4.0/gcc-8.4.0/configure --with-stage1-ldflags=-L/usr/obj/ports/gcc-8.4.0/bootstrap/lib --verbose --program-transform-name='s,^,e,' --disable-nls --with-system-zlib --disable-libmudflap --disable-libgomp --disable-libssp --disable-tls --with-gnu-ld --with-gnu-as --enable-threads=posix --enable-wchar_t --with-gmp=/usr/local --enable-languages=c,c++,fortran,objc,ada --disable-libstdcxx-pch --enable-default-ssp --enable-default-pie --without-isl --enable-cpp --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man --infodir=/usr/local/info --localstatedir=/var --disable-silent-rules --disable-gtk-doc Thread model: posix gcc version 8.4.0 (GCC) gcc220$ eg++ -v Using built-in specs.
COLLECT_GCC=eg++ COLLECT_LTO_WRAPPER=/usr/local/libexec/gcc/x86_64-unknown-openbsd6.8/8.4.0/lto-wrapper Target: x86_64-unknown-openbsd6.8 Configured with: /usr/obj/ports/gcc-8.4.0/gcc-8.4.0/configure --with-stage1-ldflags=-L/usr/obj/ports/gcc-8.4.0/bootstrap/lib --verbose --program-transform-name='s,^,e,' --disable-nls --with-system-zlib --disable-libmudflap --disable-libgomp --disable-libssp --disable-tls --with-gnu-ld --with-gnu-as --enable-threads=posix --enable-wchar_t --with-gmp=/usr/local --enable-languages=c,c++,fortran,objc,ada --disable-libstdcxx-pch --enable-default-ssp --enable-default-pie --without-isl --enable-cpp --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man --infodir=/usr/local/info --localstatedir=/var --disable-silent-rules --disable-gtk-doc Thread model: posix gcc version 8.4.0 (GCC) Assembler is gcc220$ as --version GNU assembler (GNU Binutils) 2.35 Copyright (C) 2020 Free Software Foundation, Inc. This program is free software; you may redistribute it under the terms of the GNU General Public License version 3 or later. This program has absolutely no warranty. This assembler was configured for a target of `x86_64-unknown-openbsd6.8'. (self-compiled)
[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #5 from Thomas Koenig --- Created attachment 49422 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49422&action=edit Generated gimple-match.c All the temporary files were generated by manually adding -save-temps to the Makefile in the gcc subdirectory, then re-compiling. The error is repeatable. Here's the output of vmstat while the machine is not compiling: gcc220$ vmstat procs memory page disk traps cpu r s avm fre flt re pi po fr sr sd0 int sys cs us sy id 1 60 35M 2616M 6785 0 0 0 0 0 2 87 5296 432 8 2 90
[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #6 from Thomas Koenig --- The machine is gcc220.fsffrance.org ; if anybody has an account there and wants to peek into /home/tkoenig to look into more details, be my guest.
[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #8 from Thomas Koenig --- The *.s file generated with -save-temps is attached, but it is truncated for a reason that I do not understand. The binutils is indeed self-compiled from source (because the LLVM linker cannot handle gcc compilation), using the system compiler, clang. I'll recompile this with gcc 8.4 (which is installed in /usr/local/bin as egcc) and see what happens then.
[Bug bootstrap/97527] OpenBSD bootstrap fails with out of memory allocating 8 bytes after a total of 0 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97527 --- Comment #9 from Thomas Koenig --- Created attachment 49423 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49423&action=edit config.log from libgomp using binutils compiled with gcc 8.4.0 Using the binutils compiled with gcc 8.4 now leads to an error in libgomp configure, apparently because of some collision with LTO symbols (???) gmake[4]: Leaving directory '/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgcc' gmake[3]: Leaving directory '/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgcc' Checking multilib configuration for libgomp... Configuring stage 1 in x86_64-unknown-openbsd6.8/libgomp configure: loading cache ./config.cache checking for --enable-version-specific-runtime-libs... no checking for --enable-generated-files-in-srcdir... no checking build system type... x86_64-unknown-openbsd6.8 checking host system type... x86_64-unknown-openbsd6.8 checking target system type... x86_64-unknown-openbsd6.8 checking for a BSD-compatible install... /usr/bin/install -c checking whether build environment is sane... yes checking for a thread-safe mkdir -p... /usr/local/bin/gmkdir -p checking for gawk... gawk checking whether gmake sets $(MAKE)... yes checking whether gmake supports nested variables... yes checking for x86_64-unknown-openbsd6.8-gcc... /home/tkoenig/trunk-bin/./gcc/xgcc -B/home/tkoenig/trunk-bin/./gcc/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/lib/ -isystem /home/tkoenig/x86_64-unknown-openbsd6.8/include -isystem /home/tkoenig/x86_64-unknown-openbsd6.8/sys-include -fno-checking checking whether the C compiler works... 
no configure: error: in `/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgomp': configure: error: C compiler cannot create executables See `config.log' for more details gmake[2]: *** [Makefile:24794: configure-stage1-target-libgomp] Error 77 gmake[2]: Leaving directory '/home/tkoenig/trunk-bin' gmake[1]: *** [Makefile:27002: stage1-bubble] Error 2 gmake[1]: Leaving directory '/home/tkoenig/trunk-bin' gmake: *** [Makefile:1004: all] Error 2 The suspicious part is configure:3910: checking whether the C compiler works configure:3932: /home/tkoenig/trunk-bin/./gcc/xgcc -B/home/tkoenig/trunk-bin/./gcc/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/bin/ -B/home/tkoenig/x86_64-unknown-openbsd6.8/lib/ -isystem /home/tkoenig/x86_64-unknown-openbsd6.8/include -isystem /home/tkoenig/x86_64-unknown-openbsd6.8/sys-include -fno-checking -g -O2 conftest.c >&5 Wrong dl symbols! /home/tkoenig/x86_64-unknown-openbsd6.8/bin/ld: /home/tkoenig/trunk-bin/./gcc/liblto_plugin.so: error loading plugin: Wrong dl symbols! collect2: error: ld returned 1 exit status configure:3936: $? = 1 configure:3974: result: no configure: failed program was: | /* confdefs.h */ | #define PACKAGE_NAME "GNU Offloading and Multi Processing Runtime Library" | #define PACKAGE_TARNAME "libgomp" | #define PACKAGE_VERSION "1.0" | #define PACKAGE_STRING "GNU Offloading and Multi Processing Runtime Library 1.0" | #define PACKAGE_BUGREPORT "" | #define PACKAGE_URL "http://www.gnu.org/software/libgomp/"; | #define PACKAGE "libgomp" | #define VERSION "1.0" | /* end confdefs.h. */ | | int | main () | { | | ; | return 0; | } configure:3979: error: in `/home/tkoenig/trunk-bin/x86_64-unknown-openbsd6.8/libgomp': configure:3981: error: C compiler cannot create executables See `config.log' for more details
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #10 from Thomas Koenig --- There are a couple more constants for which this could be tried. Base 7: static unsigned rem_7_v2 (mytype n) { unsigned long a, b, c, d; a = n & MASK_48; b = (n >> 48) & MASK_48; c = n >> 96; return (a+b+c) % 7; } gives the remainder with respect to 7. The reason is that 2^48-1 = 3*3*5*7*13*17*97*241*257*673, so a shift of 48 bits works for any combination of these factors. However, for 15, I would have to check if it would be better to use the 64-bit shift. For 19, it's a shift of 54 that would work (2 has order 18 modulo 19, so the shift must be a multiple of 18). I think I'd better make a table.
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #11 from Thomas Koenig --- Created attachment 49438 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49438&action=edit Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several solutions exist Here is the promised table of divisors for which this optimization is possible. For example, the line 9 60 means that the remainder of a division by 9 can be calculated by #define MASK60 ((1ul << 60) - 1) unsigned rem_9 (__uint128_t n) { __uint64_t a, b, c; a = n & MASK60; b = (n >> 60) & MASK60; c = (n >> 120); return (a+b+c) % 9; } The number of terms varies; for b=64, it is two terms; for 63 >= b >= 43, it is three terms, and for the rest, it is four terms. 67 is the first odd divisor for which there is no such shortcut, the next one is 83. Those are the only gaps below 100. Of course, if a is in the list, then a*2^n can be treated by shifting (like it was shown for a=10). Now the interesting question is what to make of it.
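A table of this kind can be regenerated, and the summing step generalized, along the following lines. This is a sketch with invented names, not the code that produced the attachment. To keep the 64-bit accumulator from overflowing without extra carry handling, `rem_by_chunks` restricts itself to b <= 62; that limit is a simplification made here, and the two-term b=64 case needs the explicit carry treatment instead.

```c
#include <assert.h>
#include <stdint.h>

/* 2^b mod a, by repeated doubling (a >= 2). */
unsigned pow2_mod(unsigned b, unsigned a)
{
    uint64_t r = 1 % a;
    for (unsigned i = 0; i < b; i++)
        r = (r * 2) % a;
    return (unsigned)r;
}

/* Largest b <= 64 with 2^b ≡ 1 (mod a), or 0 if there is none. */
unsigned largest_shift(unsigned a)
{
    for (unsigned b = 64; b >= 1; b--)
        if (pow2_mod(b, a) == 1)
            return b;
    return 0;
}

/* n mod a by summing the b-bit "digits" of n; needs 2^b ≡ 1 (mod a).
   With b <= 62 the sum of all digits stays below 2^64. */
unsigned rem_by_chunks(__uint128_t n, unsigned a, unsigned b)
{
    assert(b >= 1 && b <= 62 && pow2_mod(b, a) == 1);
    const uint64_t mask = ((uint64_t)1 << b) - 1;
    uint64_t sum = 0;
    while (n != 0) {
        sum += (uint64_t)n & mask;
        n >>= b;
    }
    return (unsigned)(sum % a);
}
```

For instance, `largest_shift(9)` yields 60, matching the table line quoted above, and `largest_shift(67)` yields 0, matching the first gap.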
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #12 from Thomas Koenig --- (In reply to Thomas Koenig from comment #11) > Created attachment 49438 [details] > Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several > solutions exist > A quick check that all numbers are correct is awk ' { print 2 "^" $2 "%" $1 } ' divisiontable.dat | bc which shows 1 as output only.
[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530 Thomas Koenig changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2020-10-24 Ever confirmed|0 |1 --- Comment #2 from Thomas Koenig --- Here's a reduced version of the reduced version. module types type local_model_state real, allocatable :: ps(:,:) ! Surface pressure real, allocatable :: t(:,:,:) ! Temperature end type local_model_state contains function int_mult(ms, ifactor) type(local_model_state) :: int_mult type(local_model_state), intent(in) :: ms integer, intent(in) :: ifactor int_mult % ps = ms % ps * ifactor end function int_mult end module types
[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530 --- Comment #3 from Thomas Koenig --- A little bit more reduced. module types type local_model_state real, allocatable :: ps(:,:) ! Surface pressure end type local_model_state contains function int_mult(ms, ifactor) type(local_model_state) :: int_mult type(local_model_state), intent(in) :: ms integer, intent(in) :: ifactor int_mult % ps = ms % ps * ifactor end function int_mult end module types
[Bug fortran/97530] Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single})
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530 Thomas Koenig changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Thomas Koenig --- Fixed by https://gcc.gnu.org/g:9dca1f29608df4bda70b33be735373ac18b8714b Thanks a lot for the bug report! I think we need a way to run the whole testsuite with -fcoarray=shared to spot any other issues like this.
[Bug fortran/88076] Shared Memory implementation for Coarrays
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88076 Bug 88076 depends on bug 97530, which changed state. Bug 97530 Summary: Segmentation fault compiling coarray program with option -fcoarray=shared (not with -fcoarray={lib,single}) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97530 What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 Thomas Koenig changed: What|Removed |Added Attachment #49438|divisiontable.dat |divisiontable.txt filename|| --- Comment #13 from Thomas Koenig --- Comment on attachment 49438 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49438 Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several solutions exist Seems the bugzilla system decided this was an MPEG file. Well, it is not, hopefully renaming it as txt will help.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #1 from Thomas Koenig --- Seems fixed by https://gcc.gnu.org/g:23856d2f29fd87edf724ade48ee30c869a3b1ea3 . Thanks for the report!
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|RESOLVED|REOPENED Last reconfirmed||2020-10-29 CC||koenigni at gcc dot gnu.org Ever confirmed|0 |1 Resolution|FIXED |--- --- Comment #2 from Thomas Koenig --- Correction - you were referring to a runtime error, so this is not yet fixed.
[Bug middle-end/97656] New: Specify that there is no address arithmetic on a pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97656 Bug ID: 97656 Summary: Specify that there is no address arithmetic on a pointer Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- This involves Fortran, but possibly also other languages. Consider
  $ cat alias.f90
  program main
    interface
      subroutine foo(a)
        integer, intent(inout) :: a
      end subroutine foo
    end interface
    integer, dimension(2) :: x
    x(1) = 42
    x(2) = 42
    call foo(x(1))
    if (x(2) /= 42) stop "Error!"
  end program main
This program specifies that foo has a scalar argument, passed by reference. Fortran's rules forbid foo from accessing the element x(2); therefore the call to stop could be optimized away, but it isn't:
  $ gfortran -O3 -S alias.f90
  $ grep _gfortran_stop alias.s
  call _gfortran_stop_string
  $
It would be good if there was a TREE_NO_POINTER_ARITHMETIC (or similar) flag that could be set by the Fortran front end.
[Bug bootstrap/96735] --enable-maintainer-mode broken
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96735 Thomas Koenig changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING |RESOLVED --- Comment #4 from Thomas Koenig --- Didn't happen again, so the best bet is that make was indeed run from the gcc subdirectory.
[Bug fortran/97320] False positive "Array reference out of bounds in loop" in a protecting if block
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97320 Thomas Koenig changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org Status|RESOLVED|NEW Resolution|DUPLICATE |--- Severity|normal |enhancement Depends on||90302 --- Comment #6 from Thomas Koenig --- It's not an exact duplicate of PR 94978; that bug is about a false positive without -Wdo-subscript, whereas this one is about a false positive with -Wdo-subscript.
The reason why this is rather difficult to resolve is one of translation phases. In the gfortran front end, we create a syntax tree from the Fortran source code. On the basis of that syntax tree (where we still know a lot about the language) we issue that warning. The next step is conversion to an intermediate language, which gets handed to the main part of gcc for further processing (known as the "middle end"). It is the middle end which does most of the optimizations, and which has the tools to do so.
In this particular instance, we would need "range propagation" (where the compiler can infer the range of variables from previous statements). We don't do that in the front end, because a) it would be a major piece of work, and b) it would duplicate a lot of what the middle end already does.
The most elegant solution would be support from the middle and back end to put in a pseudo statement, like a __builtin_warning "function". Code like
  integer :: a(12)
  do i=1,10
    a(i-1) = 1
could then be annotated like
  do i=1,10
    if (0 < lbound(a)) call __builtin_warning ("index out of bounds")
    if (9 > ubound(a)) call __builtin_warning ("index out of bounds")
    a(i-1) = 1
and if the compiler could not prove that these statements get removed by dead code elimination, it would issue the warning in the final phase of translation. This would pretty much eliminate false positives, and would be far superior to what we currently do. Unfortunately, this is a part of a compiler with which I am almost totally unfamiliar, so I cannot help there.
Some preliminary work has been done (see PR 90302), but I don't know how far it has progressed in the meantime. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90302 [Bug 90302] Implement __builtin_warning
[Bug rtl-optimization/97738] New: Optimizing division by value & - value for HAKMEM 175
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738 Bug ID: 97738 Summary: Optimizing division by value & - value for HAKMEM 175 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- A straightforward implementation of HAKMEM 175 (returning the next number with the same number of bits) is
  unsigned int next_same_bit (unsigned int value)
  {
    unsigned int lowest_bit;
    unsigned int left_bits;
    unsigned int changed_bits;
    unsigned int right_bits;
    lowest_bit = value & - value;
    left_bits = value + lowest_bit;
    changed_bits = value ^ left_bits;
    right_bits = (changed_bits / lowest_bit) >> 2;
    return left_bits | right_bits;
  }
In two's complement, this can be replaced by
  unsigned int next_s_bit (unsigned int value)
  {
    unsigned int lowest_bit;
    unsigned int ctz;
    unsigned int left_bits;
    unsigned int changed_bits;
    unsigned int right_bits;
    ctz = __builtin_ctz (value);
    lowest_bit = 1u << ctz;
    left_bits = value + lowest_bit;
    changed_bits = value ^ left_bits;
    right_bits = changed_bits >> (ctz + 2);
    return left_bits | right_bits;
  }
to replace the expensive division by what is known to be a power of two by a shift. That transformation is counter-productive (and might be done the other way) if there is no division by lowest_bit.
[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738 --- Comment #2 from Thomas Koenig --- Created attachment 49516 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49516&action=edit Small benchmark
Here's a small benchmark for counting all 32-bit numbers with 16 bits set according to the HAKMEM source. Timing is (first float is elapsed time in seconds for version with division, second float is for the shift):
  2.319526 601080391
  1.147284 601080391
with -O3 -march=native on an AMD Ryzen 7 1700X,
  4.539288 601080391
  2.700514 601080391
on POWER9.
[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738 --- Comment #3 from Thomas Koenig --- Even faster code:
  ctz = __builtin_ctz (value);
  lowest_bit = value & - value;
  left_bits = value + lowest_bit;
  changed_bits = value ^ left_bits;
  right_bits = changed_bits >> (ctz + 2);
  return left_bits | right_bits;
The first two instructions get compiled directly (with -march=native) to
  blsi %edi, %edx
  tzcntl %edi, %eax
[Bug middle-end/97738] Optimizing division by value & - value for HAKMEM 175
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97738 --- Comment #5 from Thomas Koenig --- (In reply to Jakub Jelinek from comment #4)
> What about a version that still sets lowest_bit to value & -value; rather
> than 1 << ctz?
I think this would be ideal, or close to it.
> Also, I'm not sure you can safely do the (changed_bits >> ctz) >> 2 to
> changed_bits >> (ctz + 2) transformation, while because of the division one
> can count on value not being 0 (otherwise UB), value & -value can still be
> e.g. 1U << 31 and then ctz 31 too, and changed_bits >> (31 + 2) being UB,
> while
> (changed_bits >> 31) >> 2 well defined returning 0.
OK.
> So, I think we could e.g. during expansion (or isel) based on target cost
> optimize
> x / (y & -y) to x >> __builtin_ctz (y) (also assuming the optab for ctz
> exists), but anything else looks complicated.
I think this would solve the issue for the original code (which is what people will find on the web if they google for HAKMEM 175).
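The point about undefined behaviour can be illustrated with a small sketch. It is only an illustration under the assumption of a 32-bit unsigned int; the function names next_same_bit_div and next_same_bit_shift are invented here and appear in neither the PR nor GCC. The shift variant keeps the two shifts separate, so the shift count never exceeds 31 even for value == 1U << 31.

```c
#include <assert.h>

/* Reference version from the report: divides by the lowest set bit.  */
unsigned int next_same_bit_div (unsigned int value)
{
  assert (value != 0);                 /* otherwise division by zero */
  unsigned int lowest_bit = value & -value;
  unsigned int left_bits = value + lowest_bit;
  unsigned int changed_bits = value ^ left_bits;
  unsigned int right_bits = (changed_bits / lowest_bit) >> 2;
  return left_bits | right_bits;
}

/* Shift version with the two shifts kept apart, as discussed above:
   (changed_bits >> ctz) >> 2 is well defined for every ctz in 0..31,
   whereas changed_bits >> (ctz + 2) would shift by 33 for ctz == 31.  */
unsigned int next_same_bit_shift (unsigned int value)
{
  assert (value != 0);                 /* __builtin_ctz (0) is undefined */
  unsigned int ctz = (unsigned int) __builtin_ctz (value);
  unsigned int lowest_bit = value & -value;
  unsigned int left_bits = value + lowest_bit;
  unsigned int changed_bits = value ^ left_bits;
  unsigned int right_bits = (changed_bits >> ctz) >> 2;
  return left_bits | right_bits;
}
```

Both functions should agree for every nonzero input, including the wrap-around case where only the top bit is set.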
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #14 from Thomas Koenig --- Created attachment 49520 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49520&action=edit Numbers a, b so that 2^b ≡ 1 mod a up to b=64, larger b taken if several solutions exist, plus the multiplicative inverse for 2^128 I've added the multiplicative inverse to the table, calculated with maxima by inv_mod(x,2^128). Output is in hex, to make it easier to break down into two numbers. Is there any more info that I could provide?
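The table entries were calculated with maxima; as a cross-check, the same inverses can be computed in C. The following is a sketch only, using Newton's iteration instead of maxima's inv_mod; the name inverse_mod_2_128 and the typedef u128 are assumptions made for this example.

```c
#include <assert.h>

typedef unsigned __int128 u128;   /* GCC/Clang extension */

/* Multiplicative inverse of an odd number a modulo 2^128.
   If a * x == 1 modulo 2^k, then x * (2 - a * x) is an inverse modulo
   2^(2k), so every step doubles the number of correct low bits.  */
u128 inverse_mod_2_128 (u128 a)
{
  assert ((a & 1) == 1);          /* even numbers have no inverse */
  u128 x = a;                     /* a * a == 1 (mod 8): 3 bits correct */
  for (int i = 0; i < 6; i++)     /* 6, 12, 24, 48, 96, 192 bits */
    x *= 2 - a * x;               /* unsigned arithmetic wraps mod 2^128 */
  return x;
}
```

For a = 13 this should reproduce the constant 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5 that is used for division by 13 later in this PR.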
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #3 from Thomas Koenig --- Simplified test case:
  program main
    type foo
      real, allocatable, dimension(:) :: a[:]
    end type foo
    type (foo) :: x
    sync all
    allocate (x%a(10)[*])
  end program main
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #16 from Thomas Koenig --- (In reply to Jakub Jelinek from comment #15)
> I plan to work on this early in stage3.
> And we really shouldn't use any tables, GCC should figure it all out.
> So, for double-word modulo by constant that would be expanded using a
> libcall, go for x from the word bitsize to double-word bitsize and check if
> (1max << x) % cst
> is 1
It's probably better to search from high to low, to reduce the number of necessary shifts for division by constants like 9 or 13.
> (and prefer what we've agreed on for 3), and fall back to
> multiplications (see #c8) if there aren't any other options and the costs
> don't say it is too costly.
I think for variants where the constants aren't power of two,
  #define ONE ((__uint128_t) 1)
  #define TWO_64 (ONE << 64)
  #define MASK60 ((1ul << 60) - 1)
  void div_rem_13 (mytype n, mytype *div, unsigned int *rem)
  {
    const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u * ONE;
    /* 0xC4EC4EC4EC4EC4EC4EC4EC4EC4EC4EC5 */
    __uint64_t a, b, c;
    unsigned int r;
    a = n & MASK60;
    b = (n >> 60);
    b = b & MASK60;
    c = (n >> 120);
    r = (a+b+c) % 13;
    n = n - r;
    *div = n * magic;
    *rem = r;
  }
should be pretty efficient; there is only one shift which spans two words. (The assembly generated from the function looks weird because of quite a few move instructions, but that should not be an issue for code generated inline).
Regarding the approach in comment #8, I think I'll run some benchmarks to see how well that works for other constants which don't fit the pattern of being divisors for 2^n-1.
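The search for a suitable shift count could look roughly like the sketch below. It is not GCC code: find_shift, its parameters and the choice to return 0 when nothing is found are all assumptions for illustration. It returns the largest x in the given range with 2^x ≡ 1 mod cst, which is what a search from high to low would find first.

```c
#include <assert.h>
#include <stdint.h>

/* Return the largest x with lo <= x <= hi such that 2^x == 1 (mod cst),
   or 0 if there is no such x (for example for every even cst).  */
unsigned int find_shift (uint64_t cst, unsigned int lo, unsigned int hi)
{
  assert (cst > 1 && cst < (UINT64_C (1) << 63));  /* keeps p * 2 in range */
  unsigned int best = 0;
  uint64_t p = 1;                  /* 2^0 mod cst */
  for (unsigned int x = 1; x <= hi; x++)
    {
      p = (p * 2) % cst;           /* now p == 2^x mod cst */
      if (x >= lo && p == 1)
        best = x;                  /* remember the largest hit so far */
    }
  return best;
}
```

With the range 1 to 64 this gives 60 for both 9 and 13, which matches the 60-bit chunks used in div_rem_13; the range and the fallback policy in a real implementation would depend on the target's word size and costs.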
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #17 from Thomas Koenig --- To be compilable, my previous code lacks typedef __uint128_t mytype; > #define ONE ((__uint128_t) 1)
[Bug rtl-optimization/97756] New: Inefficient handling of 128-bit arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756 Bug ID: 97756 Summary: Inefficient handling of 128-bit arguments Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: enhancement Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: tkoenig at gcc dot gnu.org Target Milestone: --- This is an offshoot from PR 97459. The code
  #define ONE ((__uint128_t) 1)
  #define TWO_64 (ONE << 64)
  #define MASK60 ((1ul << 60) - 1)
  typedef __uint128_t mytype;
  void div_rem_13_v2 (mytype n, mytype *div, unsigned int *rem)
  {
    const mytype magic = TWO_64 * 14189803133622732012u + 5675921253449092805u * ONE;
    unsigned long a, b, c;
    unsigned int r;
    a = n & MASK60;
    b = (n >> 60);
    b = b & MASK60;
    c = (n >> 120);
    r = (a+b+c) % 13;
    n = n - r;
    *div = n * magic;
    *rem = r;
  }
when compiled on x86_64 on Zen with -O3 -march=native has quite some register shuffling at the beginning:
   0: 49 89 f0              mov    %rsi,%r8
   3: 48 89 fe              mov    %rdi,%rsi
   6: 49 89 d1              mov    %rdx,%r9
   9: 48 ba ff ff ff ff ff  movabs $0xfff,%rdx
  10: ff ff 0f
  13: 4c 89 c7              mov    %r8,%rdi
  16: 48 89 f0              mov    %rsi,%rax
  19: 49 89 c8              mov    %rcx,%r8
  1c: 48 89 f1              mov    %rsi,%rcx
  1f: 49 89 fa              mov    %rdi,%r10
  22: 48 0f ac f8 3c        shrd   $0x3c,%rdi,%rax
  27: 48 21 d1              and    %rdx,%rcx
  2a: 41 56                 push   %r14
  2c: 49 c1 ea 38           shr    $0x38,%r10
  30: 48 21 d0              and    %rdx,%rax
  33: 53                    push   %rbx
  34: 48 bb c5 4e ec c4 4e  movabs $0x4ec4ec4ec4ec4ec5,%rbx
  3b: ec c4 4e
  3e: 4c 01 d1              add    %r10,%rcx
  41: 45 31 db              xor    %r11d,%r11d
  44: 48 01 c1              add    %rax,%rcx
  47: 48 89 c8              mov    %rcx,%rax
  4a: 48 f7 e3              mul    %rbx
  4d: 48 c1 ea 02           shr    $0x2,%rdx
  51: 48 8d 04 52           lea    (%rdx,%rdx,2),%rax
  55: 48 8d 04 82           lea    (%rdx,%rax,4),%rax
  59: 48 89 ca              mov    %rcx,%rdx
  5c: 48 b9 ec c4 4e ec c4  movabs $0xc4ec4ec4ec4ec4ec,%rcx
  63: 4e ec c4
  66: 48 29 c2              sub    %rax,%rdx
  69: 48 29 d6              sub    %rdx,%rsi
  6c: 49 89 d6              mov    %rdx,%r14
  6f: 4c 19 df              sbb    %r11,%rdi
  72: 48 0f af ce           imul   %rsi,%rcx
  76: 48 89 f2              mov    %rsi,%rdx
  79: 48 89 f8              mov    %rdi,%rax
  7c: c4 e2 cb f6 fb        mulx   %rbx,%rsi,%rdi
  81: 48 0f af c3           imul   %rbx,%rax
  85: 49 89 31              mov    %rsi,(%r9)
  88: 48 01 c8              add    %rcx,%rax
  8b: 48 01 c7              add    %rax,%rdi
  8e: 49 89 79 08           mov    %rdi,0x8(%r9)
  92: 45 89 30              mov    %r14d,(%r8)
  95: 5b                    pop    %rbx
  96: 41 5e                 pop    %r14
  98: c3                    retq
[Bug libstdc++/97759] Could std::has_single_bit be faster?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759 Thomas Koenig changed: What|Removed |Added Keywords||missed-optimization Severity|normal |enhancement CC||tkoenig at gcc dot gnu.org --- Comment #1 from Thomas Koenig --- Could you post the benchmark and the exact architecture where the arithmetic version is faster?
[Bug rtl-optimization/97756] Inefficient handling of 128-bit arguments
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756 --- Comment #1 from Thomas Koenig --- Actually, it was on a Ryzen 1700 (for the -march=native). I'm at odds with architecture names...
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #4 from Thomas Koenig --- (In reply to Thomas Koenig from comment #3)
> Simplified test case:
>
> program main
> type foo
> real, allocatable, dimension(:) :: a[:]
> end type foo
> type (foo) :: x
> sync all
> allocate (x%a(10)[*])
> end program main
Correction: That does not always segfault.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|REOPENED|NEW --- Comment #5 from Thomas Koenig --- Really reduced test case, no derived types are needed to expose the bug.
  program random_weather
    implicit none
    real, allocatable :: my_ps(:,:) [:,:]
    integer :: i, npx, nxlocal, nylocal, nzglobal
    nxlocal = 23
    nylocal = 23
    nzglobal = 30
    npx = 4
    allocate (my_ps(0:nxlocal-1, 0:nylocal-1) [0:npx-1,0:*])
  end program random_weather
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|NEW |WAITING --- Comment #6 from Thomas Koenig --- After https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/336787.html the simplified test case seems to be fixed. What I now get with the original test case is Decomposition information on image 4 : there are 2 * 4 slabs; the slab on this image has 45 * 21 grid cells. Decomposition information on image 6 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Decomposition information on image 2 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Decomposition information on image 8 : there are 2 * 4 slabs; the slab on this image has 45 * 21 grid cells. Decomposition information on image 5 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Decomposition information on image 1 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Decomposition information on image 7 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Decomposition information on image 3 : there are 2 * 4 slabs; the slab on this image has 45 * 23 grid cells. Size mismatch for coarray allocation id 0x608460: found = 30240 != size = 33120 where the error message seems to be correct; I think all coarrays on different images must have the same size. Toon, could you maybe comment?
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #8 from Thomas Koenig --- So, this one is fixed then. Thanks a lot for the bug report - handling coarrays as components was actually broken in more than one way. If you have anything else, don't hesitate to throw it at the branch :-)
[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 Thomas Koenig changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org --- Comment #7 from Thomas Koenig --- Some literature: https://arxiv.org/pdf/1611.07612
[Bug tree-optimization/21046] move memory allocation out of a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=21046 Thomas Koenig changed: What|Removed |Added Last reconfirmed|2014-12-25 00:00:00 |2020-11-11 --- Comment #6 from Thomas Koenig --- Just thought to see if this has been fixed in the meantime; it's not optimized with current trunk.
[Bug tree-optimization/30398] memmove for string operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30398 Thomas Koenig changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #4 from Thomas Koenig --- This is what we generate now for
  program main
    character(len=1) :: s
    character(len=2) :: c
    s = 'a'
    c = repeat(s,2)
    call foo(c)
  end program main
:
  ;; Function main (main, funcdef_no=1, decl_uid=3926, cgraph_uid=2, symbol_order=1) (executed once)
  __attribute__((externally_visible))
  main (integer(kind=4) argc, character(kind=1) * * argv)
  {
    character(kind=1) c[1:2];
    static integer(kind=4) options.3[7] = {2116, 4095, 0, 1, 1, 0, 31};
    [local count: 1073741825]:
    _gfortran_set_args (argc_2(D), argv_3(D));
    _gfortran_set_options (7, &options.3[0]);
    MEM [(c_char * {ref-all})&c] = 24929;
    foo (&c, 2);
    c ={v} {CLOBBER};
    return 0;
  }
So, everything that should be optimized is now optimized. Fixed.
[Bug tree-optimization/38592] Optimize memmove / memcmp combination
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=38592 Thomas Koenig changed: What|Removed |Added Last reconfirmed|2017-08-15 00:00:00 |2020-11-11 --- Comment #10 from Thomas Koenig --- For the C test case, we now get
  yes:
  .LFB0:
          .cfi_startproc
          movl    $25977, %eax
          movb    $115, -1(%rsp)
          movw    %ax, -3(%rsp)
          movzbl  -2(%rsp), %eax
          subl    $101, %eax
          jne     .L1
          movzbl  -1(%rsp), %eax
          subl    $115, %eax
  .L1:
          ret
So, optimized further, but not folded. clang 7 folds this completely:
  yes:                    # @yes
          .cfi_startproc
  # %bb.0:
          xorl    %eax, %eax
          retq
  .Lfunc_end0:
[Bug fortran/97799] Passing CHARACTER*(*) var(*) through ENTRY causes segfaults
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799 Thomas Koenig changed: What|Removed |Added CC||tkoenig at gcc dot gnu.org --- Comment #8 from Thomas Koenig --- Comment on attachment 49548 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49548 bugtest.f -- program evincing bug So, commit the test case to guard against regressions (since it is not immediately obvious if this is already covered). I'll do so in a short while.
[Bug fortran/97799] Passing CHARACTER*(*) var(*) through ENTRY causes segfaults
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799 --- Comment #9 from Thomas Koenig --- (In reply to Thomas Koenig from comment #8) > Comment on attachment 49548 [details] > bugtest.f -- program evincing bug > > So, commit the test case to guard against regressions > (since it is not immediately obvious if this is already > covered). > > I'll do so in a short while. Or as soon as bootstrap works again.
[Bug fortran/97799] Passing CHARACTER*(*) var(*) through ENTRY causes segfaults
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799 Thomas Koenig changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING |RESOLVED --- Comment #10 from Thomas Koenig --- Test case committed to master as https://gcc.gnu.org/g:3c3beb1a8137460bc485f9fbe3be8b21ee7f91a2 and to gcc 10 as https://gcc.gnu.org/g:910250c360291074d0908feb111403e6bb3b32ee . Thanks for the report!
[Bug fortran/97799] [10/11 Regression] Passing CHARACTER*(*) var(*) through ENTRY causes segfaults
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97799 Thomas Koenig changed: What|Removed |Added Target Milestone|--- |10.3 Summary|Passing CHARACTER*(*) |[10/11 Regression] Passing |var(*) through ENTRY causes |CHARACTER*(*) var(*) |segfaults |through ENTRY causes ||segfaults Status|VERIFIED|RESOLVED --- Comment #12 from Thomas Koenig --- (In reply to George Hockney from comment #11)
> We've verified a large-scale legacy build against
>
> GNU Fortran (gcc8.2) 11.0.0 2020 (experimental)
>
> and
>
> GNU Fortran (GCC) 10.2.1 20201017
>
> All our regressions pass these compilers.
Thanks for letting us know.
> Therefore, I'm changing the status to verified (this is per our bugzilla
> workflow; if it's not your workflow please fix)
It's not usually done, so I'll just change this back (there are a few search masks which don't have VERIFIED in).
> Unfortunately, 10.2.0 was released with this bug.
Yep. I have looked over the changes to the gfortran front end in gcc10 since the 10.2 release and haven't found anything obvious that could have fixed this; I don't think we need to do a bisection, having the test case should be enough.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|REOPENED|WAITING --- Comment #10 from Thomas Koenig --- (In reply to Toon Moene from comment #9) > Unfortunately, I now get the following error on the original code in the > attachment: > > (export > LD_LIBRARY_PATH=/home/toon/compilers/install/coarray_native/lib/gcc/x86_64- > pc-linux-gnu/11.0.0; export GFORTRAN_NUM_IMAGES=1; echo ' &config / ' | > ./a.out) > Decomposition information on image 1 : there are 1 * 1 slabs; the slab > on this image has 90 * 90 grid cells. > Fortran runtime error: Integer overflow when calculating the amount of > memory to allocate That should be fixed with https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=649754c5b4a888c2c69c1a9cbeb1c356899934c1 (which just removed the overflow checks). Anything else? :-)
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|WAITING |NEW --- Comment #12 from Thomas Koenig --- Reduced test case:
  program main
    type global_model_state
      real, allocatable :: ps(:) [:]
    end type global_model_state
    type (global_model_state) :: ms_full
    allocate (ms_full % ps(100) [*])
    ms_full %ps = 42.
  end program main
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #13 from Thomas Koenig --- (In reply to Thomas Koenig from comment #12)
> Reduced test case:
>
> program main
> type global_model_state
> real, allocatable :: ps(:) [:]
> end type global_model_state
> type (global_model_state) :: ms_full
> allocate (ms_full % ps(100) [*])
> ms_full %ps = 42.
> end program main
That one is now fixed, but the original test case still segfaults.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #14 from Thomas Koenig --- (In reply to Thomas Koenig from comment #13) > (In reply to Thomas Koenig from comment #12) > > Reduced test case: > > > > program main > > type global_model_state > > real, allocatable :: ps(:) [:] > > end type global_model_state > > type (global_model_state) :: ms_full > > allocate (ms_full % ps(100) [*]) > > ms_full %ps = 42. > > end program main > > That one is now fixed, but the original test case still segfaults. ... with 16 images: Decomposition information on image 1 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 13 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 15 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 16 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 6 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 14 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 8 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 2 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 3 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 4 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 11 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 7 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 10 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 5 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. 
Decomposition information on image 9 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Decomposition information on image 12 : there are 8 * 2 slabs; the slabs are 9 * 35 grid cells in size. Program received signal SIGSEGV: Segmentation fault - invalid memory reference. Backtrace for this error: #0 0x7fb21745659f in ??? at /usr/src/debug/glibc-2.26-lp151.19.19.1.x86_64/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0 #1 0x41195b in __types_MOD_global_init at /home/ig25/Krempel/Nico/random_weather.f90:154 #2 0x4148e7 in random_weather at /home/ig25/Krempel/Nico/random_weather.f90:494 #3 0x41576d in image_main_wrapper at ../../../coarray_native/libgfortran/caf_shared/coarraynative.c:183 #4 0x4153d2 in main at /home/ig25/Krempel/Nico/random_weather.f90:413 ERROR: Image 16(0x5a1d) failed
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #16 from Thomas Koenig ---
  program random_weather
    implicit none
    type global_model_state
      real, allocatable :: ps(:,:) [:,:]
    end type global_model_state
    integer :: nxslab, nyslab
    type(global_model_state) :: ms_full
    integer :: i, time, np1, np2, npx, npy, npxy
    real, parameter :: PS = 10.0, T = 300.0,U = 0.0,V = 0.0,W = 0.0, Q = 0.002! Mean value
    npxy = num_images()
    nxslab = 72
    nyslab = 70
    npx = 1
    allocate( ms_full % ps(0:nxslab-1, 0:nyslab-1)[0:npx-1, 0:*] )
    ms_full % ps = PS
  end program random_weather
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #15 from Thomas Koenig --- Next reduced test-case:
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #17 from Thomas Koenig --- A bit more reduced (no derived types necessary):
  program random_weather
    implicit none
    real, allocatable :: ps(:,:) [:,:]
    integer :: nxslab, nyslab
    integer :: npx
    integer :: i, j
    real, parameter :: PS1 = 10.0
    nxslab = 72
    nyslab = 70
    npx = 1
    allocate( ps(nxslab, nyslab)[npx, *] )
    ps(1,1) = PS1
  end program random_weather
So, it appears that the offset for multi-co-dimensional allocated coarrays is miscalculated.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|NEW |WAITING --- Comment #18 from Thomas Koenig --- Hi Toon, with https://gcc.gnu.org/pipermail/gcc-cvs/2020-November/337586.html , your program seems to work (at least the values look reasonable): Decomposition information on image 2 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Decomposition information on image 5 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Decomposition information on image 3 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Decomposition information on image 1 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Decomposition information on image 4 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Decomposition information on image 6 : there are 3 * 2 slabs; the slabs are 24 * 35 grid cells in size. Time 0 Image 5 PS= 99978.4531 T=300.364166 U=19.3067131 V=15.9685030 W= 0.138491884 Q= 2.17480748E-03 Time 0 Image 1 PS= 99985.0938 T=300.027161 U= -9.06420994 V=5.92245483 W= 0.137841657 Q= 2.10389541E-03 Time 0 Image 3 PS= 9.3828 T=300.014618 U= -4.48150349 V= -1.37469864 W= -8.73371959E-02 Q= 1.81287562E-03 Time 0 Image 2 PS= 99986.4141 T=300.200836 U= -3.47342205 V=16.5930214 W= 0.205771178 Q= 1.97321200E-03 Time 0 Image 6 PS= 99980.4141 T=300.424133 U=12.8092175 V=11.5236654 W=6.01452552E-02 Q= 1.87643641E-03 Time 0 Image 4 PS= 100010.516 T=300.005005 U=11.4250631 V=3.44926071 W= -0.227272436 Q= 2.07653991E-03 Time 240 Image 6 PS= 0.5781 T=300.666931 U=22.8395500 V= -11.9721365 W=3.66642363E-02 Q= 1.70292379E-03 Time 240 Image 2 PS= 99980.1484 T=300.538757 U=19.1216316 V=34.7150421 W=3.16514075E-03 Q= 2.09417334E-03 Time 240 Image 1 PS= 99969.6641 T=300.400970 U=3.65581894 V=16.8670387 W=2.10290849E-02 Q= 2.06003617E-03 Time 240 Image 3 PS= 5.2734 T=300.354370 U=4.84142876 V=4.59838200 W=1.12933442E-02 Q= 1.67453510E-03 Time 240 Image 5 PS= 99959.9141 
T=300.308228 U=35.2094879 V=26.3194275 W=6.13999888E-02 Q= 2.24495190E-03 Time 240 Image 4 PS= 100024.211 T=300.642700 U= -21.4838848 V= -5.71874714 W= 0.123860441 Q= 1.77718676E-03 Time 480 Image 1 PS= 99988.9688 T=300.262726 U= -1.2006 V=13.3446560 W= -1.83758438E-02 Q= 1.98666588E-03 Time 480 Image 5 PS= 100030.500 T=300.034546 U=8.11599827 V=49.5809326 W= -1.16332620E-02 Q= 2.18066899E-03 Time 480 Image 3 PS= 99974.3828 T=300.171265 U= -12.1284695 V=13.2599001 W= -0.132261544 Q= 1.64680334E-03 Time 480 Image 6 PS= 99983.6328 T=299.253204 U= -16.0964108 V= -7.74500656 W= -0.392248750 Q= 1.88040221E-03 Time 480 Image 2 PS= 99969.3672 T=299.095215 U= -5.18578625 V=36.8412170 W= -0.231359661 Q= 1.97951938E-03 Time 480 Image 4 PS= 100016.453 T=300.540619 U= -1.72649384 V=38.7740860 W= -0.185899958 Q= 2.31738412E-03 Thanks again for the test case, it certainly showed up a lot of bugs :-)
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 Thomas Koenig changed: What|Removed |Added Status|WAITING |NEW --- Comment #21 from Thomas Koenig --- Hi Toon, yes, I can replicate this.
[Bug fortran/98016] Host association problem
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98016 Thomas Koenig changed: What|Removed |Added Last reconfirmed||2020-11-26 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING CC||tkoenig at gcc dot gnu.org --- Comment #1 from Thomas Koenig --- Seems to be fixed on current trunk:
  $ cat bug.f90
  program p
    real :: y(3)
    n=3
    y = func(0.)
    stop
  contains
    function func(x) result (y)
      real y(n)
      y=x
    end function func
  end program p
  $ gfortran bug.f90
  $ gfortran -v
  Using built-in specs.
  COLLECT_GCC=gfortran
  COLLECT_LTO_WRAPPER=/home/ig25/lib/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
  Target: x86_64-pc-linux-gnu
  Configured with: ../trunk/configure --prefix=/home/ig25 --enable-languages=c,c++,fortran
  Thread model: posix
  Supported LTO compression algorithms: zlib
  gcc version 11.0.0 20201112 (experimental) [master revision d33bc98f5bc:79fa060941e:87b7d45e358e4df93b6a93b2e7a55b123ea76f5d] (GCC)
Can you confirm that? If so, we can commit a test case and close.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #22 from Thomas Koenig --- Hi Toon, it took some time, but we finally figured out that it is actually a bug in your program that is causing problems. It has (shortened)
  nxglobal = 72;
This sets the coarray nxglobal to 72 on every image, including image 2.
  if (this_image() == 1) then
    read(*,config)
This optionally reads in nxglobal[1].
    ! Why won't this work as nxglobal[:] = nx ?
    do i = 2, num_images()
      nxglobal[i] = nxglobal;
This sets nxglobal[2] on image 2 from image 1. This is a race condition: It is not clear which store comes first. To fix this, you would need a "sync all" before the if (this_image() == 1) statement, or you could set the default value on image 1 only.
(Incidentally, looking at this code led to finding a bug in namelist handling, where the implied this_image() was not honored (namelist I/O only worked on image 1), so the time looking at this bug was not wasted :-)
By the way, there is also a segfault with GFORTRAN_NUM_IMAGES=64, which will need to be investigated.
[Bug fortran/97589] Segementation fault when allocating coarrays.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97589 --- Comment #26 from Thomas Koenig --- After 00c2e5d1c15c67fc2c9d9ed86bfa1f5aa13848cc , the segfault for too many images is now also fixed, and your program runs as expected. I'd say an important milestone has been reached :-)
[Bug testsuite/26183] setting environment variables in test cases
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26183

Thomas Koenig changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #2 from Thomas Koenig ---
This is now possible using dg-set-target-env-var.
[Bug fortran/98053] New: Add Fortran tests for behavior from environment variables
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98053

            Bug ID: 98053
           Summary: Add Fortran tests for behavior from environment variables
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

There is dg-set-target-env-var, which we could use to check that the runtime behavior which depends on environment variables is indeed OK.
[Bug libfortran/98076] New: Increase speed of integer I/O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076

            Bug ID: 98076
           Summary: Increase speed of integer I/O
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libfortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

Currently, we use GFC_INTEGER_LARGEST for a lot of integer I/O. One place where things could be improved is gfc_itoa:

const char *
gfc_itoa (GFC_INTEGER_LARGEST n, char *buffer, size_t len)
{
  ...
  GFC_UINTEGER_LARGEST t;
  ...
  t = n;
  if (n < 0)
    {
      negative = 1;
      t = -n; /*must use unsigned to protect from overflow*/
    }
  ...
  while (t != 0)
    {
      *--p = '0' + (t % 10);
      t /= 10;
    }

Currently, the quotient / remainder calculation is expanded into a libcall. This could be sped up by using the improved remainder calculation from PR97459 together with calculating the quotient by multiplicative inverse, and by switching to a smaller datatype once the value fits in there. Both can and should be combined, of course.
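The "switch to a smaller datatype once the value fits" idea from the report above could look roughly like the following. This is only a sketch under stated assumptions, not the libgfortran code: sketch_utoa is a hypothetical name, unsigned __int128 (a GCC/Clang extension) stands in for GFC_UINTEGER_LARGEST, and sign handling is left out. The wide division is used only while the value does not fit into 64 bits; the remaining digits are produced with plain 64-bit arithmetic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT gfc_itoa from libgfortran.
   Writes the decimal digits of t right-aligned into buffer (NUL-terminated)
   and returns a pointer to the first digit.  Assumes the compiler supports
   unsigned __int128 and that len >= 40 (39 digits plus the NUL).  */
static const char *
sketch_utoa (unsigned __int128 t, char *buffer, size_t len)
{
  char *p;
  uint64_t s;

  assert (len >= 40);
  p = buffer + len - 1;
  *p = '\0';

  if (t == 0)
    {
      *--p = '0';
      return p;
    }

  /* Wide phase: 128-bit division, only while the value needs it.  */
  while (t > UINT64_MAX)
    {
      *--p = '0' + (char) (t % 10);
      t /= 10;
    }

  /* Narrow phase: the value now fits into 64 bits, so the cheaper
     64-bit quotient / remainder is enough for the remaining digits.  */
  s = (uint64_t) t;
  while (s != 0)
    {
      *--p = '0' + (char) (s % 10);
      s /= 10;
    }

  return p;
}
```

For a value below 2**64 the wide loop is never entered, so the common case pays nothing for the large type; for larger values at most about 20 digits are produced by the wide loop.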
[Bug libfortran/98076] Increase speed of integer I/O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98076

Thomas Koenig changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
            Version|unknown                       |11.0
     Ever confirmed|0                             |1
           Assignee|unassigned at gcc dot gnu.org |tkoenig at gcc dot gnu.org
           Severity|normal                        |enhancement
   Target Milestone|---                           |11.0
   Last reconfirmed|                              |2020-12-01
             Status|UNCONFIRMED                   |ASSIGNED
           Keywords|                              |missed-optimization
[Bug libfortran/95293] Fortran not passing array by reference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95293

--- Comment #11 from Thomas Koenig ---
(In reply to Dominique d'Humieres from comment #10)
> Could this PR be closed as INVALID?

Yes, I think so.
[Bug rtl-optimization/97459] __uint128_t remainder for division by 3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97459 --- Comment #26 from Thomas Koenig --- Yep, it's implemented and works great. For a simple "sum of digits" program in base ten, it's an acceleration by more than a factor of two. Thanks!
[Bug libfortran/98129] New: Failure on reading big chunk of /dev/urandom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

            Bug ID: 98129
           Summary: Failure on reading big chunk of /dev/urandom
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libfortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

The following program

program main
  implicit none
  integer, parameter :: n = 10**7
  integer :: u,v
  integer, dimension(:), allocatable :: a
  open (newunit=u,file="/dev/urandom",form="unformatted",access="stream")
  open (newunit=v,file="/dev/null",form="unformatted",access="stream")
  allocate (a(n))
  read (u) a
  write (v) a
end program main

fails on Linux with

At line 9 of file read.f90
Fortran runtime error: End of file

Error termination. Backtrace:
#0 0x7fa9c7be372f in read_block_direct at ../../../trunk/libgfortran/io/transfer.c:664
#1 0x7fa9c7be372f in unformatted_read at ../../../trunk/libgfortran/io/transfer.c:1127
#2 0x400bf3 in ???
#3 0x400c9c in ???
#4 0x7fa9c6e64349 in __libc_start_main at ../csu/libc-start.c:308
#5 0x400909 in ??? at ../sysdeps/x86_64/start.S:120
#6 0x in ???
[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #2 from Thomas Koenig ---
The problem seems to be related to an early return from the read system call:

strace -e trace=open,read,close ./a.out
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\v\2\0\0\0\0\0"..., 832) = 832
close(3) = 0
close(3) = 0
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0P\217\0\0\0\0\0\0"..., 832) = 832
close(3) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\260/\0\0\0\0\0\0"..., 832) = 832
close(3) = 0
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@*\0\0\0\0\0\0"..., 832) = 832
close(3) = 0
read(3, "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0`D\2\0\0\0\0\0"..., 832) = 832
close(3) = 0
read(3, ":\312\275\302\373|\204[c`\20\275\230\205\326@\255Mj\263-qd\336\30\300\2G\326\215\333J"..., 4000) = 33554431
At line 9 of file read.f90
Fortran runtime error: End of file

Error termination. Backtrace:
close(5) = 0
close(5) = 0
close(6) = 0
close(5) = 0
close(5) = 0
close(5) = 0
close(6) = 0
close(5) = 0
close(6) = 0
close(5) = 0
#0 0x7fd9847c572f in read_block_direct at ../../../trunk/libgfortran/io/transfer.c:664
#1 0x7fd9847c572f in unformatted_read at ../../../trunk/libgfortran/io/transfer.c:1127
#2 0x400bf3 in ???
#3 0x400c9c in ???
#4 0x7fd983a46349 in __libc_start_main at ../csu/libc-start.c:308
#5 0x400909 in ??? at ../sysdeps/x86_64/start.S:120
#6 0x in ???
close(4) = 0
close(3) = 0
+++ exited with 2 +++
[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #3 from Thomas Koenig ---
The problem seems to be that we assume that a short read is always an EOF, in read_block_direct:

      if (unlikely ((ssize_t) nbytes != have_read_record))
        {
          /* Short read, e.g. if we hit EOF.  For stream files,
             we have to set the end-of-file condition.  */
          hit_eof (dtp);
        }
      return;
    }
[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

Thomas Koenig changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |x86_64-pc-linux-gnu

--- Comment #5 from Thomas Koenig ---
Question... on your respective systems, could you strace or truss it and find out whether there is a short read?

On Linux, there seems to be a limit on how many bytes a single read from /dev/urandom returns, and we treat the resulting short read as an end of file. However, this is not correct - we can only safely assume EOF if read() returns zero bytes.
[Bug libfortran/98129] Failure on reading big chunk of /dev/urandom
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98129

--- Comment #10 from Thomas Koenig ---
(In reply to anlauf from comment #9)
> The patch seems to regtest ok, but certainly needs some wider testing.

Actually, I think the bug is in io/unix.c:raw_read. That should take care of repeating the reads as needed. It seems that, if nbyte <= MAX_CHUNK, we do not account for the possibility of a short read.
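For illustration, "repeating the reads as needed" could be sketched as below. This is not the actual raw_read from io/unix.c and not the patch discussed in this PR; read_fully is a hypothetical helper that only shows the general POSIX pattern: keep calling read() until the request is satisfied, and treat only a return value of zero as end of file.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT raw_read from libgfortran's io/unix.c.
   Tries to read exactly nbyte bytes from fd into buf.
   Returns the number of bytes actually read (smaller than nbyte only
   if EOF was reached), or -1 on error with errno set by read().  */
static ssize_t
read_fully (int fd, void *buf, size_t nbyte)
{
  char *p = buf;
  size_t left = nbyte;

  while (left > 0)
    {
      ssize_t got = read (fd, p, left);

      if (got < 0)
        {
          if (errno == EINTR)
            continue;		/* Interrupted by a signal, just retry.  */
          return -1;
        }
      if (got == 0)
        break;			/* Only a zero-byte read means EOF.  */

      /* Short read: advance and ask for the rest.  */
      p += got;
      left -= (size_t) got;
    }

  return (ssize_t) (nbyte - left);
}
```

With a loop like this, a caller such as read_block_direct would only see fewer bytes than requested when the file has really ended, so its "short read means EOF" check would hold again.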
[Bug testsuite/98156] New: [Coarray] alloc_comp_1.f90 tests for wrong condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98156

            Bug ID: 98156
           Summary: [Coarray] alloc_comp_1.f90 tests for wrong condition
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
          Assignee: unassigned at gcc dot gnu.org
          Reporter: tkoenig at gcc dot gnu.org
  Target Milestone: ---

alloc_comp_1.f90 has

! { dg-do run }
!
! Allocatable scalar corrays were mishandled (ICE)
!
type t
  integer, allocatable :: caf[:]
end type t
type(t) :: a
allocate (a%caf[3:*])
a%caf = 7
if (a%caf /= 7) STOP 1
print *,ucobound(a%caf,dim=1)
if (any (lcobound (a%caf) /= [ 3 ]) &
    .or. ucobound (a%caf, dim=1) /= this_image ()+2) &
  STOP 2
deallocate (a%caf)
end

The second test, the one on ucobound, is clearly bogus: with a lower cobound of 3, the upper cobound is num_images()+2, so the condition should use num_images() instead of this_image().