Re: Towards GNU11
On Wed, Oct 15, 2014 at 12:08 PM, Marek Polacek wrote:
> On Tue, Oct 14, 2014 at 09:23:29AM +0200, Marek Polacek wrote:
>> The consensus seems to be to go forward with this change.  I will
>> commit the patch in 24 hours unless I hear objections.
>
> I made the change.  Please report any fallout to me.

Most of the graphite tests don't compile with -std=c11 for me.

FAIL: gcc.dg/graphite/id-1.c (test for excess errors)
FAIL: gcc.dg/graphite/id-13.c (test for excess errors)
FAIL: gcc.dg/graphite/id-17.c (test for excess errors)
FAIL: gcc.dg/graphite/id-2.c (test for excess errors)
FAIL: gcc.dg/graphite/id-23.c (test for excess errors)
FAIL: gcc.dg/graphite/id-26.c (test for excess errors)
FAIL: gcc.dg/graphite/id-4.c (test for excess errors)
FAIL: gcc.dg/graphite/id-8.c (test for excess errors)
FAIL: gcc.dg/graphite/id-pr43464-1.c (test for excess errors)
FAIL: gcc.dg/graphite/id-pr43464.c (test for excess errors)
FAIL: gcc.dg/graphite/id-pr45230-1.c (test for excess errors)
FAIL: gcc.dg/graphite/id-pr45230.c (test for excess errors)
FAIL: gcc.dg/graphite/id-pr45231.c (test for excess errors)
FAIL: gcc.dg/graphite/pr37485.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38073.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38125.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38409.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38413.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38500.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38510.c (test for excess errors)
FAIL: gcc.dg/graphite/pr38786.c (test for excess errors)
FAIL: gcc.dg/graphite/pr39260.c (test for excess errors)
FAIL: gcc.dg/graphite/pr42284.c (test for excess errors)
FAIL: gcc.dg/graphite/pr42914.c (test for excess errors)
FAIL: gcc.dg/graphite/pr46404-1.c (test for excess errors)
FAIL: gcc.dg/graphite/pr60979.c (test for excess errors)
FAIL: gcc.dg/graphite/scop-19.c (test for excess errors)

Richard.

> Enjoy.
>
> Marek
Re: Towards GNU11
On Wed, Oct 15, 2014 at 09:28:09PM +0200, Uros Bizjak wrote:
> Hello!
>
> >> The consensus seems to be to go forward with this change.  I will
> >> commit the patch in 24 hours unless I hear objections.
> >
> > I made the change.  Please report any fallout to me.
>
> i686-linux-gnu testsuite trivially regressed [1]:
>
> FAIL: gcc.dg/20020122-2.c (test for excess errors)
> FAIL: gcc.dg/builtin-apply4.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-1.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-2.c (test for excess errors)
> FAIL: gcc.dg/ia64-sync-3.c (test for excess errors)
> FAIL: gcc.dg/pr32176.c (test for excess errors)
> FAIL: gcc.dg/sync-2.c (test for excess errors)
> FAIL: gcc.dg/sync-3.c (test for excess errors)
> FAIL: gcc.target/i386/20060125-1.c (test for excess errors)
> FAIL: gcc.target/i386/20060125-2.c (test for excess errors)
> FAIL: gcc.target/i386/980312-1.c (test for excess errors)
> FAIL: gcc.target/i386/980313-1.c (test for excess errors)
> FAIL: gcc.target/i386/990524-1.c (test for excess errors)
> FAIL: gcc.target/i386/avx512f-pr57233.c (test for excess errors)
> FAIL: gcc.target/i386/avx512f-typecast-1.c (test for excess errors)
> FAIL: gcc.target/i386/builtin-apply-mmx.c (test for excess errors)
> FAIL: gcc.target/i386/crc32-2.c (test for excess errors)
> FAIL: gcc.target/i386/crc32-3.c (test for excess errors)
> FAIL: gcc.target/i386/intrinsics_3.c (test for excess errors)
> FAIL: gcc.target/i386/loop-1.c (test for excess errors)
> FAIL: gcc.target/i386/memcpy-1.c (test for excess errors)
> FAIL: gcc.target/i386/pr26826.c (test for excess errors)
> FAIL: gcc.target/i386/pr37184.c (test for excess errors)
> FAIL: gcc.target/i386/pr40934.c (test for excess errors)
> FAIL: gcc.target/i386/pr44948-2a.c (test for excess errors)
> FAIL: gcc.target/i386/pr47564.c (test for excess errors)
> FAIL: gcc.target/i386/pr50712.c (test for excess errors)
> FAIL: gcc.target/i386/sse-5.c (test for excess errors)
> FAIL: gcc.target/i386/stackalign/asm-1.c -mno-stackrealign (test for excess errors)
> FAIL: gcc.target/i386/stackalign/asm-1.c -mstackrealign (test for excess errors)
> FAIL: gcc.target/i386/stackalign/return-2.c -mno-stackrealign (test for excess errors)
> FAIL: gcc.target/i386/stackalign/return-2.c -mstackrealign (test for excess errors)
> FAIL: gcc.target/i386/vectorize4.c (test for excess errors)

Sorry about these, should be fixed now.

Marek
Recent bootstrap failure on CentOS 5.11, /usr/bin/ld: Dwarf Error: found dwarf version '4' ...
Hello!

Recent change caused bootstrap failure on CentOS 5.11:

/usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only handles version 2 information.
unwind-dw2-fde-dip_s.o: In function `__pthread_cleanup_routine':
unwind-dw2-fde-dip.c:(.text+0x1590): multiple definition of `__pthread_cleanup_routine'
/usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only handles version 2 information.
unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
/usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only handles version 2 information.
unwind-sjlj_s.o: In function `__pthread_cleanup_routine':
unwind-sjlj.c:(.text+0x0): multiple definition of `__pthread_cleanup_routine'
unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
/usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only handles version 2 information.
emutls_s.o: In function `__pthread_cleanup_routine':
emutls.c:(.text+0x170): multiple definition of `__pthread_cleanup_routine'
unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
collect2: error: ld returned 1 exit status
gmake[5]: *** [libgcc_s.so] Error 1

$ ld --version
GNU ld version 2.17.50.0.6-26.el5 20061020

Uros.
Re: oacc kernels directive -- reductions
On Tue, 14 Oct 2014, Tom de Vries wrote:

> Hi,
>
> in this email I'm trying to explain in detail what problem I'm running into
> with reductions in oacc kernels region, and how I think it could be solved.
>
> Any advice is welcome.
>
>
> OVERALL PROBLEM
>
> The overall problem I'm trying to solve is to implement the oacc kernels
> directive in gcc, reusing pass_parallelize_loops.
>
>
> OACC KERNELS
>
> The oacc kernels region is a region with a series of loop nests, which are
> intended to run on the accelerator. The compiler needs to offload each loop
> nest to the accelerator, in the way most optimal for the accelerator.
>
>
> PASS_PARALLELIZE_LOOPS
>
> The pass analyzes loops. If the loop iterations are independent, and it looks
> beneficial to parallelize the loop, the loop is transformed.
>
> A copy of the loop is made, that deals with:
> - small loop iterations for which the overhead of starting several threads
>   will be too big, or
> - fixup loop iterations that are left in case the number of iterations is not
>   divisible by the parallelization factor.
>
> The original loop is transformed:
> - References of local variables are replaced with dereferences of new
>   variables, which are initialized at loop entry with the addresses of the
>   original variables (eliminate_local_variables)
> - copy loop-non-local variables to a structure, and replace references with
>   loads from a pointer to another (similar) structure
>   (separate_decls_in_region)
> - The loop is replaced with a GIMPLE_OMP_FOR (with an empty body) and
>   GIMPLE_OMP_CONTINUE
> - The loop region is enveloped with GIMPLE_OMP_PARALLEL and GIMPLE_OMP_RETURN
> - the loop region is omp-expanded using omp_expand_local
>
>
> STATUS
>
> I've created an initial implementation in vries/oacc-kernels, on top of the
> gomp-4_0-branch.
>
>
> GOMP-4_0-BRANCH
>
> In the gomp-4_0-branch, the kernels directive is translated as a copy of the
> oacc parallel directive. So, the following stages are done:
> - pass_lower_omp/scan_omp:
>   - scan directive body for variables.
>   - build up omp_context datastructures.
>   - declare struct with fields corresponding to scanned variables.
>   - declare function with pointer to struct
> - pass_lower_omp/lower_omp:
>   - declare struct
>   - assign values to struct fields
>   - declare pointer to struct
>   - rewrite body in terms of struct fields using pointer to struct.
> - omp_expand:
>   - build up omp_region data-structures
>   - split off region in separate function
>   - replace region with call to oacc runtime function while passing function
>     pointer to split off function
>
>
> VRIES/OACC-KERNELS
>
> The current mechanism of offloading (compiling a function for a different
> architecture) is using the lto-streaming. The parloops pass is located after
> the lto-streaming point which is too late. OTOH, the parloops pass needs alias
> info, which is only available after pass_build_ealias. So a copy of the
> parloops pass specialized for oacc kernels has been added after
> pass_build_ealias (plus a couple of passes to compensate for moving the pass
> up in the pass list).
>
> The new pass does not use the lowering (first 2 steps of loop transform) of
> parloops. The lowering is already done by pass_lower_omp.
>
> The omp-expansion of the oacc-kernels region (done in gomp-4_0-branch) is
> skipped, to allow first the alias analysis to work on the scope of the intact
> function, and the new pass to do the omp-expansion.
>
> So, the new pass:
> - analyses the loop for dependences
> - if independent, transforms the loop:
>   - The loop is replaced with a GIMPLE_OMP_FOR (kind_oacc_loop, with an empty
>     body) and GIMPLE_OMP_CONTINUE
>   - The GIMPLE_OACC_KERNELS is replaced with GIMPLE_OACC_PARALLEL
>   - the loop region is omp-expanded using omp_expand_local
>
> The gotchas of the implementation are:
> - no support for reductions, nested loops, more than one loop nest in
>   kernels region
> - the fixup/low-it-count loop copy is still generated _inside_ the split off
>   function
>
>
> PROBLEM WITH REDUCTIONS
>
> In the vries/oacc-kernels implementation, the lowering of oacc kernels (in
> pass_lower_omp) is done before any loop analysis. For reductions, that's not
> possible anymore, since that would mean that detection of reductions comes
> after handling of reductions.
>
> The problem we're running into here, is that:
> - on one hand, the oacc lowering is done on high gimple (scopes still intact
>   because GIMPLE_BINDs are still present, no bbs and cfgs, eh not expanded, no
>   ssa),
> - otoh, loop analysis is done on low ssa gimple (bbs, cfgs, ssa, no scopes, eh
>   expanded)
>
> The parloops pass is confronted with a similar problem.
>
> AFAIU, ideal pass reuse for parloops would go something like this: on ssa, you
> do loop analysis. You then insert omp pragmas that indicate what
> transformations you want. Then you go back from ssa gimple to high gimple
Re: Recent bootstrap failure on CentOS 5.11, /usr/bin/ld: Dwarf Error: found dwarf version '4' ...
On Thu, Oct 16, 2014 at 11:25 AM, Uros Bizjak wrote:

> Recent change caused bootstrap failure on CentOS 5.11:
>
> /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> handles version 2 information.
> unwind-dw2-fde-dip_s.o: In function `__pthread_cleanup_routine':
> unwind-dw2-fde-dip.c:(.text+0x1590): multiple definition of
> `__pthread_cleanup_routine'
> /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> handles version 2 information.
> unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> handles version 2 information.
> unwind-sjlj_s.o: In function `__pthread_cleanup_routine':
> unwind-sjlj.c:(.text+0x0): multiple definition of `__pthread_cleanup_routine'
> unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> handles version 2 information.
> emutls_s.o: In function `__pthread_cleanup_routine':
> emutls.c:(.text+0x170): multiple definition of `__pthread_cleanup_routine'
> unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> collect2: error: ld returned 1 exit status
> gmake[5]: *** [libgcc_s.so] Error 1
>
> $ ld --version
> GNU ld version 2.17.50.0.6-26.el5 20061020

It looks like a switch-to-c11 fallout. Older glibc versions have
issues with c99 (and c11) conformance [1].

Changing "extern __inline void __pthread_cleanup_routine (...)" in
system /usr/include/pthread.h to

#if __STDC_VERSION__ < 199901L
extern
#endif
__inline__ void __pthread_cleanup_routine (...)

fixes this issue and allows bootstrap to proceed.

However, fixincludes is not yet built in stage1 bootstrap. Is there a
way to fix this issue without changing system headers?

[1] https://gcc.gnu.org/ml/gcc-patches/2006-11/msg01030.html

Uros.
Re: Recent bootstrap failure on CentOS 5.11, /usr/bin/ld: Dwarf Error: found dwarf version '4' ...
On Thu, Oct 16, 2014 at 01:57:51PM +0200, Uros Bizjak wrote:
> On Thu, Oct 16, 2014 at 11:25 AM, Uros Bizjak wrote:
>
> > Recent change caused bootstrap failure on CentOS 5.11:
> >
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-dw2-fde-dip_s.o: In function `__pthread_cleanup_routine':
> > unwind-dw2-fde-dip.c:(.text+0x1590): multiple definition of
> > `__pthread_cleanup_routine'
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > unwind-sjlj_s.o: In function `__pthread_cleanup_routine':
> > unwind-sjlj.c:(.text+0x0): multiple definition of
> > `__pthread_cleanup_routine'
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > /usr/bin/ld: Dwarf Error: found dwarf version '4', this reader only
> > handles version 2 information.
> > emutls_s.o: In function `__pthread_cleanup_routine':
> > emutls.c:(.text+0x170): multiple definition of `__pthread_cleanup_routine'
> > unwind-dw2_s.o:unwind-dw2.c:(.text+0x270): first defined here
> > collect2: error: ld returned 1 exit status
> > gmake[5]: *** [libgcc_s.so] Error 1
> >
> > $ ld --version
> > GNU ld version 2.17.50.0.6-26.el5 20061020
>
> It looks like a switch-to-c11 fallout. Older glibc versions have
> issues with c99 (and c11) conformance [1].
>
> Changing "extern __inline void __pthread_cleanup_routine (...)" in
> system /usr/include/pthread.h to
>
> #if __STDC_VERSION__ < 199901L
> extern
> #endif
> __inline__ void __pthread_cleanup_routine (...)
>
> fixes this issue and allows bootstrap to proceed.
>
> However, fixincludes is not yet built in stage1 bootstrap. Is there a
> way to fix this issue without changing system headers?
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2006-11/msg01030.html

Yeah, old glibcs are totally incompatible with -fno-gnu89-inline.
Not sure if it is easily fixincludable, if yes, then -fgnu89-inline should
be used for code like libgcc which is built with the newly built compiler
before it is fixincluded.
Or we need -fgnu89-inline by default for old glibcs (that is pretty much
what we do e.g. in Developer Toolset for RHEL5).

Jakub
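[Archive note] The "multiple definition" errors follow from the two inline models discussed above: under the GNU89 rules an "extern inline" definition in a header is used for inlining only, while under the C99 rules every file that includes it emits an external definition. The sketch below is not the glibc or libgcc fix. It only shows how code can detect the model in effect through the macros GCC predefines (__GNUC_GNU_INLINE__ and __GNUC_STDC_INLINE__). The names INLINE_MODEL, inline_model, demo_cleanup_flag and check_demo are hypothetical, and demo_cleanup_flag merely stands in for a header helper such as __pthread_cleanup_routine.

```c
#include <assert.h>

/* GCC predefines __GNUC_GNU_INLINE__ when the GNU89 inline rules apply
   (-std=gnu89 or -fgnu89-inline) and __GNUC_STDC_INLINE__ when the C99
   rules apply (-std=gnu99, -std=gnu11, ...).  */
#if defined(__GNUC_GNU_INLINE__)
# define INLINE_MODEL "gnu89"
#elif defined(__GNUC_STDC_INLINE__)
# define INLINE_MODEL "c99"
#else
# define INLINE_MODEL "unknown"  /* compiler does not define either macro */
#endif

/* Hypothetical header helper.  "static __inline__" behaves the same under
   both models: each file gets a private copy, so including the header in
   many files cannot produce duplicate external symbols.  */
static __inline__ int demo_cleanup_flag(int do_it)
{
    return do_it != 0;
}

/* Reports which inline model this file was compiled under.  */
const char *inline_model(void)
{
    return INLINE_MODEL;
}

/* Small self-check of the helper.  */
void check_demo(void)
{
    assert(demo_cleanup_flag(5) == 1);
    assert(demo_cleanup_flag(0) == 0);
}
```

Building the same file with -fgnu89-inline and with -std=gnu11 should make inline_model() report the two different models, which is the distinction Jakub's -fgnu89-inline suggestion relies on.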
Re: Towards GNU11
On Wed, Oct 15, 2014 at 3:08 AM, Marek Polacek wrote:
> On Tue, Oct 14, 2014 at 09:23:29AM +0200, Marek Polacek wrote:
>> The consensus seems to be to go forward with this change.  I will
>> commit the patch in 24 hours unless I hear objections.
>
> I made the change.  Please report any fallout to me.

Yes the Linux kernel fails to compile for aarch64 after this change:

In file included from include/linux/mutex.h:15:0,
                 from include/linux/kvm_host.h:12,
                 from arch/arm64/kvm/../../../virt/kvm/kvm_main.c:21:
include/linux/spinlock_types.h:82:2: error: initializer element is not constant
  (spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
  ^
include/linux/spinlock_types.h:84:43: note: in expansion of macro ‘__SPIN_LOCK_UNLOCKED’
 #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
                                           ^
arch/arm64/kvm/../../../virt/kvm/kvm_main.c:75:1: note: in expansion of macro ‘DEFINE_SPINLOCK’
 DEFINE_SPINLOCK(kvm_lock);
 ^
include/linux/spinlock_types.h:60:2: error: initializer element is not constant
  (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
  ^
include/linux/spinlock_types.h:62:51: note: in expansion of macro ‘__RAW_SPIN_LOCK_UNLOCKED’
 #define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
                                                   ^
arch/arm64/kvm/../../../virt/kvm/kvm_main.c:76:8: note: in expansion of macro ‘DEFINE_RAW_SPINLOCK’
 static DEFINE_RAW_SPINLOCK(kvm_count_lock);
        ^

>
> Enjoy.
>
> Marek
gcc-4.8-20141016 is now available
Snapshot gcc-4.8-20141016 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.8-20141016/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.8 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_8-branch revision 216352

You'll find:

 gcc-4.8-20141016.tar.bz2             Complete GCC

  MD5=904b270e67e0460fc5556734b7929526
  SHA1=3eba44fecdcd93256c7003bb083d9dfcefad4a38

Diffs from 4.8-20141009 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.8
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.
Re: Towards GNU11
On Thu, Oct 16, 2014 at 3:35 PM, Andrew Pinski wrote:
> On Wed, Oct 15, 2014 at 3:08 AM, Marek Polacek wrote:
>> On Tue, Oct 14, 2014 at 09:23:29AM +0200, Marek Polacek wrote:
>>> The consensus seems to be to go forward with this change.  I will
>>> commit the patch in 24 hours unless I hear objections.
>>
>> I made the change.  Please report any fallout to me.
>
> Yes the Linux kernel fails to compile for aarch64 after this change:
> In file included from include/linux/mutex.h:15:0,
>                  from include/linux/kvm_host.h:12,
>                  from arch/arm64/kvm/../../../virt/kvm/kvm_main.c:21:
> include/linux/spinlock_types.h:82:2: error: initializer element is not
> constant
>   (spinlock_t ) __SPIN_LOCK_INITIALIZER(lockname)
>   ^
> include/linux/spinlock_types.h:84:43: note: in expansion of macro
> ‘__SPIN_LOCK_UNLOCKED’
>  #define DEFINE_SPINLOCK(x) spinlock_t x = __SPIN_LOCK_UNLOCKED(x)
>                                            ^
> arch/arm64/kvm/../../../virt/kvm/kvm_main.c:75:1: note: in expansion
> of macro ‘DEFINE_SPINLOCK’
>  DEFINE_SPINLOCK(kvm_lock);
>  ^
> include/linux/spinlock_types.h:60:2: error: initializer element is not
> constant
>   (raw_spinlock_t) __RAW_SPIN_LOCK_INITIALIZER(lockname)
>   ^
> include/linux/spinlock_types.h:62:51: note: in expansion of macro
> ‘__RAW_SPIN_LOCK_UNLOCKED’
>  #define DEFINE_RAW_SPINLOCK(x) raw_spinlock_t x = __RAW_SPIN_LOCK_UNLOCKED(x)
>                                                    ^
> arch/arm64/kvm/../../../virt/kvm/kvm_main.c:76:8: note: in expansion
> of macro ‘DEFINE_RAW_SPINLOCK’
>  static DEFINE_RAW_SPINLOCK(kvm_count_lock);
>         ^

Here is a short testcase which shows the behavior difference between
GNU89 and GNU11:

typedef struct {
  volatile unsigned int lock;
} arch_rwlock_t;

typedef struct {
  arch_rwlock_t raw_lock;
} rwlock_t;

static rwlock_t step_hook_lock = (rwlock_t) { .raw_lock = { 0 }, };

Thanks,
Andrew

>
>>
>> Enjoy.
>>
>> Marek
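[Archive note] In ISO C a compound literal such as "(rwlock_t) { ... }" is not a constant expression, so using one to initialize an object with static storage duration depends on a GNU extension. That is consistent with the "initializer element is not constant" errors reported above once the default dialect changed. The sketch below shows one portable spelling, assuming the cast can simply be dropped. It is not presented as the change the kernel or GCC adopted. The type names come from the testcase; step_hook_lock_value and check_step_hook_lock are hypothetical helpers added only so the result can be inspected.

```c
#include <assert.h>

typedef struct {
    volatile unsigned int lock;
} arch_rwlock_t;

typedef struct {
    arch_rwlock_t raw_lock;
} rwlock_t;

/* No "(rwlock_t)" cast: this is a plain brace initializer with a
   designator, which is a valid constant initializer for a static
   object in C99 and later, with or without GNU extensions.  */
static rwlock_t step_hook_lock = { .raw_lock = { 0 } };

/* Hypothetical accessor so the initial value can be checked.  */
unsigned int step_hook_lock_value(void)
{
    return step_hook_lock.raw_lock.lock;
}

/* Small self-check of the initial state.  */
void check_step_hook_lock(void)
{
    assert(step_hook_lock_value() == 0);
}
```

The macro-based kernel initializers are harder to change this way, because the same macro is also used in assignments where the cast is required, so this form only illustrates the language rule.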