Re: [patch,libgomp] Make libgomp Fortran modules multilib-aware
ping

> On 3 May 2016 at 23:25, FX wrote:
>
> The attached patch allows libgomp to install its Fortran modules in the
> correct multilib-aware directories, just like libgfortran does.
> Without it, multilib Fortran OpenMP code using the modules fails to compile
> because the modules are not found:
>
> $ gfortran -fopenmp a.f90
> $ gfortran -fopenmp a.f90 -m32
> a.f90:1:6:
>
>  use omp_lib
>      1
> Fatal Error: Can't open module file ‘omp_lib.mod’ for reading at (1): No such
> file or directory
> compilation terminated.
>
> Bootstrapped and tested on x86_64-apple-darwin15. OK to commit?
>
> FX
>
>
> 2016-05-03  Francois-Xavier Coudert
>
>     PR libgomp/60670
>     * Makefile.am: Make fincludedir multilib-aware.
>     * Makefile.in: Regenerate.

Index: libgomp/Makefile.am
===
--- libgomp/Makefile.am (revision 235843)
+++ libgomp/Makefile.am (working copy)
@@ -10,7 +10,7 @@ config_path = @config_path@
 search_path = $(addprefix $(top_srcdir)/config/, $(config_path)) $(top_srcdir) \
               $(top_srcdir)/../include
 
-fincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/finclude
+fincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/finclude
 libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 
 vpath % $(strip $(search_path))
Re: increase alignment of global structs in increase_alignment pass
On 6 May 2016 at 17:20, Richard Biener wrote:
> On Wed, 4 May 2016, Prathamesh Kulkarni wrote:
>
>> On 23 February 2016 at 21:49, Prathamesh Kulkarni wrote:
>> > On 23 February 2016 at 17:31, Richard Biener wrote:
>> >> On Tue, 23 Feb 2016, Prathamesh Kulkarni wrote:
>> >>
>> >>> On 22 February 2016 at 17:36, Richard Biener wrote:
>> >>> > On Mon, 22 Feb 2016, Prathamesh Kulkarni wrote:
>> >>> >
>> >>> >> Hi Richard,
>> >>> >> As discussed in private mail, this version of the patch attempts to
>> >>> >> increase the alignment of a global struct decl if it contains
>> >>> >> array field(s) and the array's offset is a multiple of the alignment
>> >>> >> of the vector type corresponding to its scalar type. It recursively
>> >>> >> checks nested structs.
>> >>> >> eg:
>> >>> >> static struct
>> >>> >> {
>> >>> >>   int a, b, c, d;
>> >>> >>   int k[4];
>> >>> >>   float f[10];
>> >>> >> };
>> >>> >> k is a candidate array since its offset is 16 and alignment of
>> >>> >> "vector (4) int" is 8.
>> >>> >> Similarly for f.
>> >>> >>
>> >>> >> I haven't been able to create a test-case where there are
>> >>> >> multiple candidate arrays and the vector alignments of the arrays
>> >>> >> are different.
>> >>> >> I suppose in this case we will have to increase alignment
>> >>> >> of the struct by the max alignment ?
>> >>> >> eg:
>> >>> >> static struct
>> >>> >> {
>> >>> >>
>> >>> >>   T1 k[S1]
>> >>> >>
>> >>> >>   T2 f[S2]
>> >>> >>
>> >>> >> };
>> >>> >>
>> >>> >> if V1 is the vector type corresponding to T1, and V2 the vector
>> >>> >> type corresponding to T2,
>> >>> >> offset (k) % align(V1) == 0 and offset (f) % align(V2) == 0
>> >>> >> and align (V1) > align(V2) then we will increase alignment of struct
>> >>> >> by align(V1).
>> >>> >>
>> >>> >> Testing showed FAIL for g++.dg/torture/pr31863.C due to program
>> >>> >> timeout.
>> >>> >> Initially it appeared to me that it went into an infinite loop.
>> >>> >> However, on second thoughts I think it's probably not an infinite
>> >>> >> loop, but rather taking an (extraordinarily) large amount of time
>> >>> >> to compile the test-case with the patch.
>> >>> >> The test-case builds quickly for only 2 instantiations of ClassSpec
>> >>> >> (ClassSpec,
>> >>> >> ClassSpec)
>> >>> >> Building with 22 instantiations (up to ClassSpec)
>> >>> >> takes up
>> >>> >> to ~1m to compile.
>> >>> >> with:
>> >>> >> 23 instantiations: ~2m
>> >>> >> 24 instantiations: ~5m
>> >>> >> For 30 instantiations I terminated cc1plus after 13m (by SIGKILL).
>> >>> >>
>> >>> >> I guess it shouldn't go into an infinite loop because:
>> >>> >> a) structs cannot have circular references.
>> >>> >> b) it works for a lower number of instantiations
>> >>> >> However I have no sound evidence that it cannot be in an infinite loop.
>> >>> >> I don't understand why a decl node is getting visited more than once
>> >>> >> for that test-case.
>> >>> >>
>> >>> >> Using a hash_map to store alignments of decls, so that a decl node
>> >>> >> gets visited only once, prevents the issue.
>> >>> >
>> >>> > Maybe aliases.  Try not walking vnode->alias == true vars.
>> >>> Hi,
>> >>> I have a hypothesis why a decl node gets visited multiple times.
>> >>>
>> >>> Consider the test-case:
>> >>> template <typename T, unsigned N>
>> >>> struct X
>> >>> {
>> >>>   T a;
>> >>>   virtual int foo() { return N; }
>> >>> };
>> >>>
>> >>> typedef struct X<int, 1> x_1;
>> >>> typedef struct X<int, 2> x_2;
>> >>>
>> >>> static x_1 obj1 __attribute__((used));
>> >>> static x_2 obj2 __attribute__((used));
>> >>>
>> >>> Two additional structs are created by the C++ FE; c++filt shows:
>> >>> _ZTI1XIiLj1EE -> typeinfo for X<int, 1u>
>> >>> _ZTI1XIiLj2EE -> typeinfo for X<int, 2u>
>> >>>
>> >>> Both of these structs have only one field D.2991 and it appears it's
>> >>> *shared* between them:
>> >>> struct D.2991;
>> >>>   const void * D.2980;
>> >>>   const char * D.2981;
>> >>>
>> >>> Hence the decl node D.2991 and its fields (D.2980, D.2981) get visited
>> >>> twice:
>> >>> once when walking _ZTI1XIiLj1EE and a 2nd time when walking _ZTI1XIiLj2EE
>> >>>
>> >>> Dump of walking over the global structs for the above test-case:
>> >>> http://pastebin.com/R5SABW0c
>> >>>
>> >>> So it appears to me that a DAG (interior node == struct decl, leaf ==
>> >>> non-struct field,
>> >>> edge from node1 -> node2 if node2 is a field of node1) is getting
>> >>> created when the struct decl
>> >>> is a type-info object.
>> >>>
>> >>> I am not really clear on how we should proceed:
>> >>> a) Keep using hash_map to store alignments so that every decl gets
>> >>> visited only once.
>> >>> b) Skip walking artificial record decls.
>> >>> I am not sure if skipping all artificial struct decls would be a good
>> >>> idea, but I don't
>> >>> think it's possible to identify if a struct-decl is a typeinfo struct
>> >>> in the middle-end ?
>> >>
>> >> You shouldn't end up walking those when walking the type of
>> >> global decls.  That is, don't walk typeinfo decls - yes, practically
>> >> that means just not walking DECL_ARTIFICIAL things.
>> > Hi,
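
The hash_map idea from the thread (option a) can be illustrated outside GCC. The following is a minimal C++ sketch, not GCC code: `Node`, `max_align_naive`, `max_align_memo` and `make_shared_chain` are hypothetical names, and `std::unordered_map` stands in for GCC's hash_map. It assumes a DAG in which the same child is reachable through several parents, which is one way repeated visits could compound into very long compile times.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a struct decl; this is not a GCC data structure.
struct Node {
  unsigned align = 1;                // alignment this node would ask for
  std::vector<const Node *> fields;  // nested struct fields (DAG edges)
};

// Naive recursive walk: a node reachable through k paths is walked k times.
unsigned max_align_naive(const Node *n, std::size_t &visits) {
  ++visits;
  unsigned best = n->align;
  for (const Node *f : n->fields)
    best = std::max(best, max_align_naive(f, visits));
  return best;
}

// Memoized walk: the cache guarantees each node is expanded only once.
unsigned max_align_memo(const Node *n,
                        std::unordered_map<const Node *, unsigned> &cache,
                        std::size_t &visits) {
  auto it = cache.find(n);
  if (it != cache.end())
    return it->second;  // already computed, do not walk again
  ++visits;
  unsigned best = n->align;
  for (const Node *f : n->fields)
    best = std::max(best, max_align_memo(f, cache, visits));
  cache.emplace(n, best);
  return best;
}

// Builds depth+1 nodes where every level refers to the next level twice,
// so the deepest node is shared by exponentially many paths.
std::vector<Node> make_shared_chain(std::size_t depth) {
  std::vector<Node> nodes(depth + 1);
  for (std::size_t i = 0; i < depth; ++i)
    nodes[i].fields = {&nodes[i + 1], &nodes[i + 1]};
  nodes[depth].align = 16;
  return nodes;
}
```

With a chain of depth 10, the naive walk expands 2047 nodes while the memoized walk expands 11, and both report the same maximum alignment. Skipping artificial decls (option b) avoids the shared typeinfo nodes altogether, so it would not need the cache.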
Option handling (support) of -fsanitize=use-after-scope
Hello.

I've been working on enabling the use-after-scope sanitizer in the GCC
compiler ([1]), and as I've read in the following review request ([2]), the
LLVM compiler has started to use the following option:

-mllvm -asan-use-after-scope=1

My initial attempt was to introduce a new option value for the -fsanitize
option (which would make the LLVM and GCC options compatible). Following the
current behavior of LLVM, I would have to add a new --param, which would lead
to a divergence. Is the suggested approach alterable for the LLVM community?

I would also suggest the following default behavior:
- If -fsanitize=address or -fsanitize=kernel-address is enabled, the
  use-after-scope sanitization should be enabled
- Similarly, providing -fuse-after-scope should enable address sanitization
  (either user-space or kernel-space)

Thank you for feedback,
Martin

[1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00468.html
[2] http://reviews.llvm.org/D19347
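
The two proposed defaults can be written down as a small piece of option-normalization logic. This is only a sketch in C++ under stated assumptions: `SanitizerOptions` and `apply_proposed_defaults` are hypothetical names, not GCC or LLVM code, and the choice of user-space address sanitization when only use-after-scope is requested is an assumption, since the proposal leaves open which of the two variants is picked.

```cpp
#include <cassert>

// Hypothetical flag set; it does not mirror GCC's internal option storage.
struct SanitizerOptions {
  bool address = false;          // -fsanitize=address
  bool kernel_address = false;   // -fsanitize=kernel-address
  bool use_after_scope = false;  // the proposed use-after-scope option
};

// Applies the two defaults suggested in the message above.
SanitizerOptions apply_proposed_defaults(SanitizerOptions opts) {
  if (opts.address || opts.kernel_address) {
    // Either address sanitizer implies use-after-scope checking.
    opts.use_after_scope = true;
  } else if (opts.use_after_scope) {
    // Assumption: fall back to user-space ASan; the proposal allows either.
    opts.address = true;
  }
  return opts;
}
```

A build with no sanitizer options is left unchanged by this function, so the defaults only take effect once one of the three options is given.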
Re: Option handling (support) of -fsanitize=use-after-scope
On 05/11/2016 04:18 PM, Martin Liška wrote:
> Hello.
>
> I've been working on enabling the use-after-scope sanitizer in the GCC
> compiler ([1]), and as I've read in the following review request ([2]), the
> LLVM compiler has started to use the following option:
>
> -mllvm -asan-use-after-scope=1
>
> My initial attempt was to introduce a new option value for the -fsanitize
> option (which would make the LLVM and GCC options compatible). Following the
> current behavior of LLVM, I would have to add a new --param, which would lead
> to a divergence. Is the suggested approach alterable for the LLVM community?
>
> I would also suggest the following default behavior:
> - If -fsanitize=address or -fsanitize=kernel-address is enabled, the
>   use-after-scope sanitization should be enabled
> - Similarly, providing -fuse-after-scope should enable address sanitization
>   (either user-space or kernel-space)
>
> Thank you for feedback,
> Martin
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00468.html
> [2] http://reviews.llvm.org/D19347

Cc-ed Google folks.
Re: [PATCH] clean up insn-automata.c
On 05/11/2016 01:39 AM, Alexander Monakov wrote:
> On Wed, 30 Mar 2016, Bernd Schmidt wrote:
>> On 03/25/2016 04:43 AM, Aldy Hernandez wrote:
>>> If Bernd is fine with this, I'm happy to retract my patch and any
>>> possible followups.  I'm just interested in having no path causing a
>>> possible out of bounds access.  If your patch will do that, I'm cool.
>>
>> I'll need to see that patch first to comment :-)
>
> Here's the proposed patch.  I've found that there's only one user of the
> current fancy logic in output_internal_insn_code_evaluation: handling of
> NULL_RTX and const0_rtx is only useful for 'state_transition' (generated by
> output_trans_func), so it's possible to inline the extended handling there,
> and handle only plain non-null rtx_insns in
> output_internal_insn_code_evaluation.
>
> This change allows removing extra checks and casting in
> output_internal_insn_latency_func, as done by the patch.
>
> As a nice bonus, it constrains prototypes of three automata functions to
> accept 'rtx_insn *' rather than just 'rtx'.
>
> Bootstrapped and regtested on x86_64, OK?

Yes, it is ok for the trunk.

Thank you for solving this issue, Alexander.

>     * genattr.c (main): Change 'rtx' to 'rtx_insn *' in prototypes of
>     'insn_latency', 'maximal_insn_latency', 'min_insn_conflict_delay'.
>     * genautomata.c (output_internal_insn_code_evaluation): Simplify.
>     Move handling of non-insn arguments inline into the sole user:
>     (output_trans_func): ...here.
>     (output_min_insn_conflict_delay_func): Change 'rtx' to 'rtx_insn *'
>     in emitted function prototype.
>     (output_internal_insn_latency_func): Ditto.  Simplify.
>     (output_internal_maximal_insn_latency_func): Ditto.  Delete
>     always-unused argument.
>     (output_insn_latency_func): Ditto.
>     (output_maximal_insn_latency_func): Ditto.
Ann: MELT plugin 1.3 release candidate 2 for GCC 5 or GCC 6
Dear All,

It is my pleasure to announce the MELT plugin 1.3 release candidate 2
for GCC 5 & GCC 6 (hosted and usable on Linux preferably).

MELT - see http://gcc-melt.org/ for more (or
http://starynkevitch.net/Basile/gcc-melt/ which points to the same web pages
and resources) - is a domain specific language and meta-plugin (free software,
GPLv3+ licensed, FSF copyrighted) to easily extend and customize the GCC
compiler.

Please download the bzip2-compressed source tar archive from
http://gcc-melt.org/melt-1.3-rc2-plugin-for-gcc-5-or-6.tar.bz2
It is a file of 4013849 bytes (3.9Mbytes) with md5sum
eb4df214b293caabec07be4a672eda4e

NEWS for 1.3 MELT plugin for GCC 5 & GCC 6
[[may XX, 2016]]

Bug fixes
=========

Rare garbage collection bug fixed (noticed with GCC 5).

Language features
=================

No significant new language feature.

Runtime features
================

We did keep compatibility with GCC 5 & GCC 6. Since gengtype does not
admit conditionals (see messages following
https://gcc.gnu.org/ml/gcc/2016-02/msg00156.html ...)
we had to hack our build system. The MELT plugin now uses a
melt-runtypes.h symlink to a version-specific file, which has typedefs like
   typedef gimple* melt_gimpleptr_t; // gimple is now a struct

Added plugin options:
  -fplugin-melt-arg-verbose-full-gc: if set to 1 or Y, a message is
   output to stderr on MELT full garbage collections.
  -fplugin-melt-arg-mmap-reserve: don't use it, except to debug the
   MELT runtime. See comment in melt-runtime.cc

The MELT runtime (that is, the MELT plugin melt.so) could be built with
-DMELT_HAVE_RUNTIME_DEBUG=1 to enable MELT runtime debugging. This is
rarely useful for MELT users.

W.r.t. MELT plugin 1.3 rc1 I have made a few bug fixes (including perhaps some
annoying GC bug that I cannot reproduce anymore).

Please try to build & use that release candidate 2 and report bugs to
gcc-m...@googlegroups.com

Regards.
-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***
Re: (R5900) Implementing Vector Support
On 05/11/2016 04:54 AM, Woon yung Liu wrote:
> I saw that the EE has the PMFHL.LH instruction, which loads the HI/LO register
> pairs (containing the multiplication result) into a single destination (i.e.
> truncates the multiplication result in the process), with the right order too.
> I suppose that it would be suitable for implementing the mulm3 operation. But
> if I implement mulm3, is there still a need to implement the
> vec_widen_smult_hi_m and vec_widen_smult_lo_m patterns?

Of course.  They're used for different things.  E.g.

  int out[100];
  short in1[100], in2[100];

  for (i = 0; i < 100; ++i)
    out[i] = in1[i] * in2[i];

will use the vec_widen_smult* patterns.

> I tried to implement the two patterns (vec_widen_smult_hi_m and
> vec_widen_smult_lo_m), but GCC wouldn't compile due to both patterns having
> the same operands. Must they be expands? If so, what sort of patterns should
> the pcpyld and pcpyud instructions be? If I don't declare them differently,
> I'll have the same compilation error again (due to them having the same
> operands).

Yes, I would think they should be expands.  I would expect something like

;; ??? Could describe the result in %3, if we ever find it useful.
(define_insn "pmulth_ee"
  [(set (match_operand:V8SI 0 "register_operand" "=x")
        (vec_select:V8SI
          (mult:V8SI
            (sign_extend:V8SI (match_operand:V8HI 1 "register_operand" "d"))
            (sign_extend:V8SI (match_operand:V8HI 2 "register_operand" "d")))
          (parallel [(const_int 0) (const_int 1) (const_int 4) (const_int 5)
                     (const_int 2) (const_int 3) (const_int 6) (const_int 7)])))
   (clobber (match_scratch:V4SI 3 "=d"))]
  "..."
  "pmulth\t%3,%1,%2"
)

(define_insn "pmfhl_lh_ee_v8hi"
  [(set (match_operand:V8HI 0 "register_operand" "=d")
        (vec_select:V8HI
          (match_operand:V16HI 1 "register_operand" "x")
          (parallel [(const_int 0) (const_int 2) (const_int 8) (const_int 10)
                     (const_int 4) (const_int 6) (const_int 12) (const_int 14)])))]
  "..."
  "pmfhl.lh\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pmfhi_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (match_operand:V4DI 1 "register_operand" "x")
          (parallel [(const_int 2) (const_int 3)])))]
  "..."
  "pmfhi\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pmflo_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (match_operand:V4DI 1 "register_operand" "x")
          (parallel [(const_int 0) (const_int 1)])))]
  "..."
  "pmflo\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pcpyld_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (vec_concat:V4DI
            (match_operand:V2DI 1 "register_operand" "d")
            (match_operand:V2DI 2 "register_operand" "d"))
          (parallel [(const_int 0) (const_int 2)])))]
  "..."
  "pcpyld\t%0,%2,%1"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pcpyud_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (vec_concat:V4DI
            (match_operand:V2DI 1 "register_operand" "d")
            (match_operand:V2DI 2 "register_operand" "d"))
          (parallel [(const_int 1) (const_int 3)])))]
  "..."
  "pcpyud\t%0,%1,%2"
)

(define_expand "mulv8hi3"
  [(match_operand:V8HI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V16HImode, hilo);
  emit_insn (gen_pmfhl_lh_ee_v8hi (operands[0], hilo));
  DONE;
})

(define_expand "vec_widen_smult_lo_v8hi"
  [(match_operand:V4SI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  rtx hi = gen_reg_rtx (V2DImode);
  rtx lo = gen_reg_rtx (V2DImode);

  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V4DImode, hilo);
  emit_insn (gen_pmfhi_ee_v2di (hi, hilo));
  emit_insn (gen_pmflo_ee_v2di (lo, hilo));
  emit_insn (gen_pcpyld_ee_v2di (gen_lowpart (V2DImode, operands[0]), lo, hi));
  DONE;
})

(define_expand "vec_widen_smult_hi_v8hi"
  [(match_operand:V4SI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  rtx hi = gen_reg_rtx (V2DImode);
  rtx lo = gen_reg_rtx (V2DImode);

  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V4DImode, hilo);
  emit_insn (gen_pmfhi_ee_v2di (hi, hilo));
  emit_insn (gen_pmflo_ee_v2di (lo, hilo));
  emit_insn (gen_pcpyud_ee_v2di (gen_lowpart (V2DImode, operands[0]), lo, hi));
  DONE;
})
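
To make the difference between the truncating multiply and the widening multiplies concrete, here is a scalar reference model in C++. It is a sketch only: the type aliases and function names are hypothetical, it says nothing about how the R5900 lays results out in HI/LO, and it assumes little-endian lane numbering, so that "lo" means elements 0..3 and "hi" means elements 4..7.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using V8HI = std::array<std::int16_t, 8>;  // eight 16-bit lanes
using V4SI = std::array<std::int32_t, 4>;  // four 32-bit lanes

// Widening multiply of four adjacent lanes starting at `first`:
// each 16-bit pair is sign-extended and the full 32-bit product is kept.
V4SI widen_smult_half(const V8HI &a, const V8HI &b, std::size_t first) {
  V4SI r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = std::int32_t(a[first + i]) * std::int32_t(b[first + i]);
  return r;
}

// Assumption: with little-endian lane numbering, "lo" is lanes 0..3.
V4SI widen_smult_lo(const V8HI &a, const V8HI &b) {
  return widen_smult_half(a, b, 0);
}

// Assumption: with little-endian lane numbering, "hi" is lanes 4..7.
V4SI widen_smult_hi(const V8HI &a, const V8HI &b) {
  return widen_smult_half(a, b, 4);
}

// Same-width multiply: only the low 16 bits of each product are kept.
V8HI mul_truncating(const V8HI &a, const V8HI &b) {
  V8HI r{};
  for (std::size_t i = 0; i < 8; ++i) {
    std::int32_t wide = std::int32_t(a[i]) * std::int32_t(b[i]);
    r[i] = static_cast<std::int16_t>(static_cast<std::uint32_t>(wide) & 0xFFFFu);
  }
  return r;
}
```

For lanes holding 300 and 300, the widening form yields 90000 while the truncating form yields 24464 (90000 modulo 65536). That is why the loop with short inputs and int outputs needs the widening patterns in addition to the same-width multiply.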
gcc-4.9-20160511 is now available
Snapshot gcc-4.9-20160511 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160511/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 236147

You'll find:

 gcc-4.9-20160511.tar.bz2             Complete GCC

  MD5=13b96c87abf36b7c87d6bbf1c577a198
  SHA1=0639f501f4f78061c23951c53810ea31a7f868fd

Diffs from 4.9-20160504 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.