[Bug d/91628] libdruntime uses glibc internal symbol on s390
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91628 --- Comment #19 from stli at linux dot ibm.com --- Fixed with gcc commit "S/390: Fix PR91628" https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=88e508f9f112acd07d0c49c53589160db8c85fcd If somebody is backporting this fix, please also backport gcc commit "S/390: Fix layout of struct sigaction_t" https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=434fe1a4092e12e5b518ef0716dc5b315e06118d Otherwise you'll still see tls testsuite FAILs.
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

stli at linux dot ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #14 from stli at linux dot ibm.com ---
I've tested this patch with the help of the glibc testsuite. To do so, I disabled the current workaround:
/sysdeps/s390/fpu/fix-fp-int-compare-invalid.h:
#define FIX_COMPARE_INVALID 0

All tests passed.

For information: without this patch there were failures like:
math/test-ldouble-iseqsig.out:
testing long double (without inline functions)
Failure: iseqsig (-0, qNaN): Exception "Invalid operation" not set
Failure: iseqsig (-0, -qNaN): Exception "Invalid operation" not set
...

As soon as gcc 10 is released, I will post a glibc patch which conditionally disables the current workaround.

Thanks.
[Bug target/77918] S390: Floating point comparisons don't raise invalid for unordered operands.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77918

--- Comment #16 from stli at linux dot ibm.com ---
For information, this glibc commit will first be available in the glibc 2.31 release:
"S390: Fp comparison are now raising FE_INVALID with gcc 10."
https://sourceware.org/git/?p=glibc.git;a=commit;h=64bca76f42a82e6a9ea2b0166deab7aa2b7efbea
[Bug target/80080] S390: Issues with emitted cs-instructions for __atomic builtins.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080

--- Comment #11 from stli at linux dot ibm.com ---
Hi,

I've retested the samples with gcc 7, 8 and head from 2018-07-20, but there are still issues:
The examples foo1 and foo2 are okay.

The issue in example foo3 is still present (see description of the bug-report):
00a0 :
  a0: a7 18 00 05        lhi   %r1,5
  a4: c4 2d 00 00 00 00  lrl   %r2,a4
      a6: R_390_PC32DBL foo3_mem+0x2
  aa: c0 30 00 00 00 00  larl  %r3,aa
      ac: R_390_PC32DBL foo3_mem+0x2
  b0: ba 21 30 00        cs    %r2,%r1,0(%r3)
  b4: a7 74 ff fb        jne   aa
The address of the global variable is still reloaded within the loop. If the value was not swapped with cs, the jne can jump directly to the cs instruction instead of the larl-instruction.
  b8: b9 14 00 22        lgfr  %r2,%r2
  bc: 07 fe              br    %r14
  be: 07 07              nopr  %r7

I've found a further issue which is observable with the following two examples. See the questions in the disassembly:

void foo4(int *mem)
{
  int oldval = 0;
  if (!__atomic_compare_exchange_n (mem, (void *) &oldval, 1, 1,
                                    __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    {
      bar (mem);
    }
  /*
    :
     0: e3 10 20 00 00 12  lt    %r1,0(%r2)
     6: a7 74 00 06        jne   12
    Why do we need to jump to 0x12 first instead of directly jumping to 0x18?
     a: a7 38 00 01        lhi   %r3,1
     e: ba 13 20 00        cs    %r1,%r3,0(%r2)
    12: a7 74 00 03        jne   18
    16: 07 fe              br    %r14
    18: c0 f4 00 00 00 00  jg    18
        1a: R_390_PC32DBL bar+0x2
    1e: 07 07              nopr  %r7
  */
}

void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1, 1,
                               __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
    bar (mem);
  /*
    0040 :
    40: e3 10 20 00 00 12  lt    %r1,0(%r2)
    46: a7 74 00 06        jne   52
    This is similar to foo4, but the variable oldval is compared against zero instead of using the return value of __atomic_compare_exchange_n. Can't we jump directly to 0x5a instead of 0x52?
    4a: a7 38 00 01        lhi   %r3,1
    4e: ba 13 20 00        cs    %r1,%r3,0(%r2)
    52: 12 11              ltr   %r1,%r1
    54: a7 74 00 03        jne   5a
    58: 07 fe              br    %r14
    5a: c0 f4 00 00 00 00  jg    5a
        5c: R_390_PC32DBL bar+0x2
  */
}
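For readers less familiar with the builtin used in foo4 and foo5, here is a minimal sketch (not part of the original comment) of why the return value and the updated oldval carry the same information. The helper name try_lock_word is hypothetical. Unlike the report, which passes weak=1, the sketch uses a strong compare-and-swap so that its result is deterministic on every target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not from the bug report: try to change *mem
   from 0 to 1.  Returns true if the swap happened.  In either case
   *observed receives the value that was found in *mem before the
   attempt.  */
static bool try_lock_word(int *mem, int *observed)
{
    assert(mem != NULL && observed != NULL);

    int expected = 0;
    /* Strong CAS (weak = false) is an assumption made for this sketch;
       foo4/foo5 in the report use the weak variant.  */
    bool swapped = __atomic_compare_exchange_n(mem, &expected, 1,
                                               false,
                                               __ATOMIC_ACQUIRE,
                                               __ATOMIC_RELAXED);

    /* On failure the builtin has written the current contents of *mem
       into expected; on success expected still holds 0.  */
    *observed = expected;
    return swapped;
}
```

With a word that starts at 0, the first call succeeds and reports 0; a second call fails and reports 1. That is the equivalence foo5 relies on when it tests oldval instead of the return value.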
[Bug c/98070] New: errno is not re-evaluated after clearing errno and calling realloc(ptr, SIZE_MAX)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98070

            Bug ID: 98070
           Summary: errno is not re-evaluated after clearing errno and calling realloc(ptr, SIZE_MAX)
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 49652
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49652&action=edit
Testcase reproducing the issue with gcc-head

Hi,

After setting errno=0 and calling realloc with a too large size, which sets errno to ENOMEM, a subsequent "if (errno == ENOMEM)" is not evaluated as true. Instead gcc assumes that errno has not changed and is directly executing the else-path without testing errno again.

This happens in the glibc-testcase: /malloc/tst-malloc-too-large.c test (see https://sourceware.org/git/?p=glibc.git;a=blob;f=malloc/tst-malloc-too-large.c;h=b5ad7eb7e7bf764fe57ceff5a810e3c211ca05e0;hb=refs/heads/master) on at least x86_64 and s390x with gcc-head.

The attached small reproducer fails with gcc-head, but not with gcc 10, 9 (before):
/*
Output with gcc 11:
$ ./tst-errno-realloc (build with >= -O1)
47: errno == 0 (Cannot allocate memory). We are in the else-part of 'if (errno == ENOMEM)'. Does errno correspond to %m or the line below or to '(gdb) p errno'?!
dump_errno(48, compare to line above!): errno == 12 (Cannot allocate memory) vs main_errno=0

On s390x:
$ gcc -v
Using built-in specs.
COLLECT_GCC=./install-s390x-head/bin/gcc
COLLECT_LTO_WRAPPER=/home/stli/gccDir/install-s390x-head/libexec/gcc/s390x-ibm-linux-gnu/11.0.0/lto-wrapper
Target: s390x-ibm-linux-gnu
Configured with: /home/stli/gccDir/gcc-head/configure --prefix=/home/stli/gccDir/install-s390x-head/ --enable-shared --with-system-zlib --enable-threads=posix --enable-__cxa_atexit --enable-checking --enable-gnu-indirect-function --enable-languages=c,c++ --with-arch=zEC12 --with-tune=z13 --disable-bootstrap --with-long-double-128 --enable-decimal-float
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 11.0.0 20201127 (experimental) (GCC)
$ git log --oneline
5e9f814d754 (HEAD -> master, origin/master, origin/HEAD) rs6000: Change rs6000_expand_vector_set param

Also on x86_64:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/home/stli/gccDir/install-x86_64-head/bin/gcc
COLLECT_LTO_WRAPPER=/home/stli/gccDir/install-x86_64-head/libexec/gcc/x86_64-pc-linux-gnu/11.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /home/stli/gccDir/gcc-head/configure --prefix=/home/stli/gccDir/install-x86_64-head/ --enable-shared --with-system-zlib --enable-threads=posix --enable-__cxa_atexit --enable-checking --enable-gnu-indirect-function --enable-languages=c,c++ --with-tune=generic --with-arch_32=x86-64 --disable-bootstrap --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.0.0 20201130 (experimental) (GCC)
$ git log --oneline
a5ad5d5c478 (HEAD -> master, origin/master, origin/HEAD) RISC-V: Always define MULTILIB_DEFAULTS
*/
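The pattern described in the report can be sketched as follows. This is not the attached reproducer (attachment 49652), only a minimal illustration under stated assumptions: the helper name errno_after_realloc is hypothetical, and the sketch assumes a POSIX-style C library such as glibc where a failing realloc sets errno to ENOMEM (ISO C alone does not require that).

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the attached testcase: returns
   the errno value observed right after a realloc request of `size`
   bytes fails, or 0 if the request succeeded.  */
static int errno_after_realloc(size_t size)
{
    void *ptr = malloc(16);
    assert(ptr != NULL);

    /* Reading the size through a volatile object keeps the compiler
       from reasoning about a constant request at compile time.  */
    volatile size_t request = size;

    errno = 0;
    void *grown = realloc(ptr, request);
    /* errno has to be read again here; realloc may have changed it.  */
    int seen = errno;

    if (grown != NULL) {
        free(grown);
        return 0;
    }

    /* On failure the original block is still valid and must be freed.  */
    free(ptr);
    return seen;
}
```

On a system matching the assumption above, errno_after_realloc(SIZE_MAX) is expected to yield ENOMEM, which is the value the miscompiled code in the report failed to observe.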
[Bug middle-end/98070] [11 Regression] errno is not re-evaluated after clearing errno and calling realloc(ptr, SIZE_MAX)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98070

--- Comment #5 from stli at linux dot ibm.com ---
I've just built and run the attached test on s390x/x86_64 with your fix. Now errno is re-evaluated after realloc.

I've also rebuilt glibc on s390x and the original glibc-test /malloc/tst-malloc-too-large.c is now also passing.

Many thanks.
[Bug c/98269] New: gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

            Bug ID: 98269
           Summary: gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 49756
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49756&action=edit
Build this tst-gcc-addoverflow.c with gcc 6.5.0 to see the ERROR

If built on s390x (I had no chance to test it on other architectures) with gcc 6.5.0, the attached testcase with small uint32_t input values for __builtin_add_overflow() detects an overflow and fails:

  else if (__builtin_add_overflow (previous->offset,
                                   previous->length + 1,
                                   &current->offset))
    {
      printf ("ERROR: __builtin_add_overflow() OVERFLOWED: "
              "previous->offset=%" PRIu32 " + "
              "(previous->length=%" PRIu32 " + 1)"
              " => current->offset=%" PRIu32 "\n",
              previous->offset, previous->length,
              current->offset);
      return EXIT_FAILURE;
    }
=> ERROR: __builtin_add_overflow() OVERFLOWED: previous->offset=7 + (previous->length=3 + 1) => current->offset=11

I have not observed this issue with gcc 7.1 and later.

The original issue was found when glibc is built with gcc 6.5.0:
__builtin_add_overflow is used in /elf/stringtable.c:stringtable_finalize() (https://sourceware.org/git/?p=glibc.git;a=blob;f=elf/stringtable.c;h=099347d73ee70b8ffa4b4a91c493e0bba147ffa2;hb=HEAD#l185) which leads to ldconfig failing with "String table is too large".

This is also visible in the following glibc-tests:
FAIL: elf/tst-glibc-hwcaps-cache
FAIL: elf/tst-glibc-hwcaps-prepend-cache
FAIL: elf/tst-ldconfig-X
FAIL: elf/tst-ldconfig-bad-aux-cache
FAIL: elf/tst-ldconfig-ld_so_conf-update
FAIL: elf/tst-stringtable

Please also have a look at the attached tst-gcc-addoverflow.c for some more details from my gdb session showing the add and jump instruction.
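As a point of comparison, the expected behaviour of the builtin for the values quoted above can be sketched like this. The helper name next_offset is hypothetical and the code is not taken from the attached testcase or from glibc's stringtable.c; it only mirrors the shape of the check shown in the report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper modeled on the check quoted in the report:
   stores offset + (length + 1) in *next and returns true if that
   addition does not fit into uint32_t.  Note that length + 1 is plain
   unsigned arithmetic and would itself wrap silently for
   length == UINT32_MAX; only the outer addition is checked.  */
static bool next_offset(uint32_t offset, uint32_t length, uint32_t *next)
{
    assert(next != NULL);
    return __builtin_add_overflow(offset, length + 1, next);
}
```

For offset=7 and length=3 a correct compiler returns false and stores 11, so the ERROR branch in the report should not be taken. Only a sum that exceeds UINT32_MAX, for example offset=UINT32_MAX with length=0, should report an overflow.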
[Bug c/98269] gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

stli at linux dot ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |s390x
      Known to work|                            |10.1.0, 5.4.0, 5.5.0,
                   |                            |7.1.0, 8.1.0, 9.1.0
                 CC|                            |stli at linux dot ibm.com
      Known to fail|                            |6.3.0, 6.4.0, 6.5.0

--- Comment #2 from stli at linux dot ibm.com ---
That's okay for me. But I wanted to document it.

Currently glibc requires gcc 6.2 as a minimum. For s390x, I will post a patch which requires gcc 7.1 as a minimum.
[Bug c/98269] gcc 6.5.0 __builtin_add_overflow() with small uint32_t values incorrectly detects overflow
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98269

--- Comment #5 from stli at linux dot ibm.com ---
For information, I've just committed this glibc patch:
"s390x: Require GCC 7.1 or later to build glibc."
https://sourceware.org/git/?p=glibc.git;a=commit;h=844b4d8b4b937fe6943d2c0c80ce7d871cdb1eb5
[Bug c/99134] New: S390x: pfpo instructions are not used for dfp[128|64|32] to/from long double conversions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99134

            Bug ID: 99134
           Summary: S390x: pfpo instructions are not used for dfp[128|64|32] to/from long double conversions
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stli at linux dot ibm.com
  Target Milestone: ---

Created attachment 50212
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50212&action=edit
Test which runs the dfpXYZ <-> long double conversions which are not performed via pfpo instruction, but by calling __dpd_[trunc|extend] functions.

See libdfp-issue "s390x: 3 test failures on Fedora Rawhide #160"
https://github.com/libdfp/libdfp/issues/160
(Notice that Rawhide is using GCC 11 now.)

Reproduced the issues with gcc commit 78a6d0e30d7950216dc0c5be5d65d0cbed13924c
You have to configure gcc with --enable-decimal-float

All decimal-floating-point[128|64|32] <-> binary-floating-point[128|64|32] conversions should emit the pfpo (PERFORM FLOATING-POINT OPERATION) instruction as used in previous GCC versions.

GCC 11 is not using the pfpo instruction if bfp128 (long double) is involved in the conversion. In the libdfp implementation of the dpd_extend/trunc functions, this leads to a recursive call to itself, which segfaults as it runs out of stack:
- bfp128 -> dfp128 (do__dpd_extendtftd(): brasl %r14,<__dpd_extendtftd>)
- bfp128 -> dfp64 (do_bfp128_to_dfp64(): brasl %r14,<__dpd_trunctfdd>)
- bfp128 -> dfp32 (do_bfp128_to_dfp32(): brasl %r14,<__dpd_trunctfsd>)
- dfp128 -> bfp128 (do__dpd_trunctdtf(): brasl %r14,<__dpd_trunctdtf>)
- dfp64 -> bfp128 (do_dfp64_to_bfp128(): brasl %r14,<__dpd_extendddtf>)
- dfp32 -> bfp128 (do_dfp32_to_bfp128(): brasl %r14,<__dpd_extendsdtf>)
[Bug c/99134] S390x: pfpo instructions are not used for dfp[128|64|32] to/from long double conversions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99134

stli at linux dot ibm.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from stli at linux dot ibm.com ---
I've just retested libdfp with gcc-head:
$ git log --oneline
60b99ee3bc0 (HEAD -> master, origin/master, origin/HEAD) Daily bump.
...
b6e446cb581 IBM Z: Fix long double <-> DFP conversions
a974b8a592e IBM Z: Improve FPRX2 <-> TF conversions

Now all the long double <-> _Decimal data-type conversions are using the pfpo instruction.

Thanks.
[Bug c/104011] New: s390: r12 is not setup for _mcount call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104011

            Bug ID: 104011
           Summary: s390: r12 is not setup for _mcount call
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: stli at linux dot ibm.com
  Target Milestone: ---

On 31bit, as r12 is not set up before brasl _mcount@plt, we jump to a different function. Note that the PIE plt-slot is using r12.
In the debugging-case, e.g. __libc_calloc is called. In a different glibc-testcase "gmon/tst-gmon-pie" we jump to another function, which leads to a segfault.

This happens with, e.g.:
- gcc version 9.2.1 20190827 (Red Hat 9.2.1-1) (GCC)
- gcc 11.2.0

Steps to reproduce:
$ cat tst-pie-mcount.c
#include <stdio.h>
#include <stdlib.h>

int
main (void)
{
  puts ("Hello world");
  return EXIT_SUCCESS;
}

$ gcc -o tst-pie-mcount -g -m31 -fpie -pg -pie tst-pie-mcount.c
$ objdump -d tst-pie-mcount
...
05c8 <_mcount@plt>:
 5c8: 58 10 c0 20        l     %r1,32(%r12)
 5cc: 07 f1              br    %r1
 5ce: 00 00 00 00        .long 0x
 5d2: 00 00 0d 10        .long 0x0d10
 5d6: 58 10 10 0e        l     %r1,14(%r1)
 5da: a7 f4 ff 97        j     508 <.plt>
 ...
 5e6: 00 3c              .short 0x003c
...
0860 :
 860: 50 e0 f0 04        st    %r14,4(%r15)
 864: c0 10 00 00 0b f2  larl  %r1,2048 <__data_start+0x4>
We jump to the plt-slot, which uses r12, which is loaded later.
 86a: c0 e5 ff ff fe af  brasl %r14,5c8 <_mcount@plt>
 870: 58 e0 f0 04        l     %r14,4(%r15)
 874: 90 bf f0 2c        stm   %r11,%r15,44(%r15)
 878: a7 fa ff a0        ahi   %r15,-96
 87c: 18 bf              lr    %r11,%r15
GOT-Pointer is loaded here for puts:
 87e: c0 c0 00 00 0b c1  larl  %r12,2000 <_GLOBAL_OFFSET_TABLE_>
 884: c0 20 00 00 00 6c  larl  %r2,95c <_IO_stdin_used+0x4>
 88a: c0 e5 ff ff fe 7f  brasl %r14,588
 890: a7 18 00 00        lhi   %r1,0
 894: 18 21              lr    %r2,%r1
 896: 98 bf b0 8c        lm    %r11,%r15,140(%r11)
 89a: 07 fe              br    %r14
 89c: 07 07              nopr  %r7
 89e: 07 07              nopr  %r7
*/