Re: [patch,libgomp] Make libgomp Fortran modules multilib-aware

2016-05-11 Thread FX
ping


> On 3 May 2016 at 23:25, FX wrote:
> 
> The attached patch allows libgomp to install its Fortran modules in the 
> correct multilib-aware directories, just like libgfortran does.
> Without it, multilib Fortran OpenMP code using the modules fails to compile 
> because the modules are not found:
> 
> $ gfortran -fopenmp a.f90 
> $ gfortran -fopenmp a.f90 -m32
> a.f90:1:6:
> 
>   use omp_lib
>  1
> Fatal Error: Can't open module file ‘omp_lib.mod’ for reading at (1): No such file or directory
> compilation terminated.
> 
> 
> 
> Bootstrapped and tested on x86_64-apple-darwin15. OK to commit?
> 
> FX
> 
> 
> 
> 
> 
> 
> 2016-05-03  Francois-Xavier Coudert  
> 
>   PR libgomp/60670
>   * Makefile.am: Make fincludedir multilib-aware.
>   * Makefile.in: Regenerate.
> 

Index: libgomp/Makefile.am
===
--- libgomp/Makefile.am (revision 235843)
+++ libgomp/Makefile.am (working copy)
@@ -10,7 +10,7 @@ config_path = @config_path@
 search_path = $(addprefix $(top_srcdir)/config/, $(config_path)) $(top_srcdir) \
  $(top_srcdir)/../include
 
-fincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/finclude
+fincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/finclude
 libsubincludedir = $(libdir)/gcc/$(target_alias)/$(gcc_version)/include
 
 vpath % $(strip $(search_path))
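
The effect of the one-line change above can be sketched in C. This is only an illustration of how the installed path is composed, not part of the patch. It assumes MULTISUBDIR is empty for the default multilib and "/<dir>" for any other; the helper name finclude_dir and the sample values below (including the "/i386" subdirectory) are hypothetical.

```c
#include <stdio.h>

/* Compose the Fortran module directory the way the patched Makefile.am
   line does: $(libdir)/gcc/$(target_alias)/$(gcc_version)$(MULTISUBDIR)/finclude.
   'multisubdir' is assumed to be "" for the default multilib and
   "/<dir>" otherwise.  Returns what snprintf returns.  */
int finclude_dir(char *buf, size_t size, const char *libdir,
                 const char *target_alias, const char *gcc_version,
                 const char *multisubdir)
{
    return snprintf(buf, size, "%s/gcc/%s/%s%s/finclude",
                    libdir, target_alias, gcc_version, multisubdir);
}
```

With an empty multisubdir the result matches the pre-patch location, so the default multilib keeps its modules where they were; only the non-default multilibs get a separate directory.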


Re: increase alignment of global structs in increase_alignment pass

2016-05-11 Thread Prathamesh Kulkarni
On 6 May 2016 at 17:20, Richard Biener  wrote:
> On Wed, 4 May 2016, Prathamesh Kulkarni wrote:
>
>> On 23 February 2016 at 21:49, Prathamesh Kulkarni
>>  wrote:
>> > On 23 February 2016 at 17:31, Richard Biener  wrote:
>> >> On Tue, 23 Feb 2016, Prathamesh Kulkarni wrote:
>> >>
>> >>> On 22 February 2016 at 17:36, Richard Biener  wrote:
>> >>> > On Mon, 22 Feb 2016, Prathamesh Kulkarni wrote:
>> >>> >
>> >>> >> Hi Richard,
>> >>> >> As discussed in private mail, this version of patch attempts to
>> >>> >> increase alignment
>> >>> >> of global struct decl if it contains an array field(s) and array's
>> >>> >> offset is a multiple of the alignment of vector type corresponding to
>> >>> >> its scalar type and recursively checks for nested structs.
>> >>> >> eg:
>> >>> >> static struct
>> >>> >> {
>> >>> >>   int a, b, c, d;
>> >>> >>   int k[4];
>> >>> >>   float f[10];
>> >>> >> };
>> >>> >> k is a candidate array since its offset is 16 and alignment of
>> >>> >> "vector (4) int" is 8.
>> >>> >> Similarly for f.
>> >>> >>
>> >>> >> I haven't been able to create a test-case where there are
>> >>> >> multiple candidate arrays and vector alignment of arrays are 
>> >>> >> different.
>> >>> >> I suppose in this case we will have to increase alignment
>> >>> >> of the struct by the max alignment ?
>> >>> >> eg:
>> >>> >> static struct
>> >>> >> {
>> >>> >>   
>> >>> >>   T1 k[S1]
>> >>> >>   
>> >>> >>   T2 f[S2]
>> >>> >>   
>> >>> >> };
>> >>> >>
>> >>> >> if V1 is vector type corresponding to T1, and V2 corresponding vector
>> >>> >> type to T2,
>> >>> >> offset (k) % align(V1) == 0 and offset (f) % align(V2) == 0
>> >>> >> and align (V1) > align(V2) then we will increase alignment of struct
>> >>> >> by align(V1).
>> >>> >>
>> >>> >> Testing showed FAIL for g++.dg/torture/pr31863.C due to program timeout.
>> >>> >> Initially it appeared to me that it went into an infinite loop. However,
>> >>> >> on second thought I think it's probably not an infinite loop, but rather
>> >>> >> takes an (extraordinarily) large amount of time
>> >>> >> to compile the test-case with the patch.
>> >>> >> The test-case  builds quickly for only 2 instantiations of ClassSpec
>> >>> >> (ClassSpec,
>> >>> >>  ClassSpec)
>> >>> >> Building with 22 instantiations (upto ClassSpec) 
>> >>> >> takes up
>> >>> >> to ~1m to compile.
>> >>> >> with:
>> >>> >> 23  instantiations: ~2m
>> >>> >> 24 instantiations: ~5m
>> >>> >> For 30 instantiations I terminated cc1plus after 13m (by SIGKILL).
>> >>> >>
>> >>> >> I guess it shouldn't go into an infinite loop because:
>> >>> >> a) structs cannot have circular references.
>> >>> >> b) it works for a lower number of instantiations.
>> >>> >> However, I have no sound evidence that it cannot be in an infinite loop.
>> >>> >> I don't understand why a decl node is getting visited more than once
>> >>> >> for that test-case.
>> >>> >>
>> >>> >> Using a hash_map to store alignments of decls, so that each decl node
>> >>> >> gets visited only once, prevents the issue.
>> >>> >
>> >>> > Maybe aliases.  Try not walking vnode->alias == true vars.
>> >>> Hi,
>> >>> I have a hypothesis why decl node gets visited multiple times.
>> >>>
>> >>> Consider the test-case:
>> >>> template <typename T, unsigned N>
>> >>> struct X
>> >>> {
>> >>>   T a;
>> >>>   virtual int foo() { return N; }
>> >>> };
>> >>>
>> >>> typedef struct X<int, 1> x_1;
>> >>> typedef struct X<int, 2> x_2;
>> >>>
>> >>> static x_1 obj1 __attribute__((used));
>> >>> static x_2 obj2 __attribute__((used));
>> >>>
>> >>> Two additional structs are created by the C++ FE, c++filt shows:
>> >>> _ZTI1XIiLj1EE -> typeinfo for X<int, 1u>
>> >>> _ZTI1XIiLj2EE -> typeinfo for X<int, 2u>
>> >>>
>> >>> Both of these structs have only one field D.2991 and it appears it's
>> >>> *shared* between them:
>> >>>  struct  D.2991;
>> >>> const void * D.2980;
>> >>> const char * D.2981;
>> >>>
>> >>> Hence the decl node D.2991 and its fields (D.2980, D.2981) get visited twice:
>> >>> once when walking _ZTI1XIiLj1EE and a second time when walking _ZTI1XIiLj2EE
>> >>>
>> >>> Dump of walking over the global structs for above test-case:
>> >>> http://pastebin.com/R5SABW0c
>> >>>
>> >>> So it appears to me that a DAG (interior node == struct decl,
>> >>> leaf == non-struct field, edge from node1 -> node2 if node2 is a
>> >>> field of node1) is getting created when the struct decl is a
>> >>> type-info object.
>> >>>
>> >>> I am not really clear on how we should proceed:
>> >>> a) Keep using hash_map to store alignments so that every decl gets
>> >>> visited only once.
>> >>> b) Skip walking artificial record decls.
>> >>> I am not sure if skipping all artificial struct decls would be a good
>> >>> idea, but I don't
>> >>> think it's possible to identify if a struct-decl is typeinfo struct at
>> >>> middle-end ?
>> >>
>> >> You shouldn't end up walking those when walking the type of
>> >> global decls.  That is, don't walk typeinfo decls - yes, practically
>> >> that means just not walking DECL_ARTIFICIAL things.
>> > Hi,
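
The slowdown discussed in this thread is what one would expect from walking a DAG with shared sub-nodes as if it were a tree. The following is a minimal sketch under stated assumptions, not the code from the patch: struct node is a hypothetical stand-in for a struct decl, and a flag stored in the node replaces the hash_map mentioned in the thread.

```c
#include <stddef.h>

#define MAX_FIELDS 2

/* Hypothetical stand-in for a struct decl whose fields may themselves
   be struct decls (possibly shared between several parents).  */
struct node {
    struct node *fields[MAX_FIELDS];
    size_t n_fields;
    unsigned align;    /* alignment wanted by this node itself */
    int visited;       /* memoization flag (stand-in for a hash_map entry) */
    unsigned cached;   /* memoized result of the walk */
};

/* Naive walk: a shared field is re-walked once per path leading to it.  */
unsigned walk_naive(const struct node *n, unsigned long *visits)
{
    unsigned best = n->align;
    ++*visits;
    for (size_t i = 0; i < n->n_fields; i++) {
        unsigned a = walk_naive(n->fields[i], visits);
        if (a > best)
            best = a;
    }
    return best;
}

/* Memoized walk: each node is processed once, later hits reuse the cache.  */
unsigned walk_memo(struct node *n, unsigned long *visits)
{
    if (n->visited)
        return n->cached;
    ++*visits;
    unsigned best = n->align;
    for (size_t i = 0; i < n->n_fields; i++) {
        unsigned a = walk_memo(n->fields[i], visits);
        if (a > best)
            best = a;
    }
    n->visited = 1;
    n->cached = best;
    return best;
}
```

If each of d levels has two fields that both point at the same node on the next level, the naive walk makes 2^(d+1) - 1 visits while the memoized walk makes d + 1, which is consistent with compile time roughly doubling per added instantiation.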

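The candidate rule described earlier in this thread (an array field qualifies when its offset is a multiple of the alignment of the matching vector type, and the struct takes the largest such alignment) can be sketched as follows. The struct array_field type and the function name are hypothetical, and the vector alignment is passed in as plain data because it is target-specific.

```c
#include <stddef.h>

/* Hypothetical description of one array field of a global struct.  */
struct array_field {
    size_t offset;        /* byte offset of the array within the struct */
    size_t vector_align;  /* alignment in bytes of the vector type that
                             corresponds to the array's element type */
};

/* Return the largest vector alignment among the candidate arrays, or 0
   when no array field qualifies.  An array is a candidate when its
   offset is a multiple of its vector alignment.  */
size_t wanted_struct_align(const struct array_field *fields, size_t n)
{
    size_t best = 0;
    for (size_t i = 0; i < n; i++) {
        size_t va = fields[i].vector_align;
        if (va != 0 && fields[i].offset % va == 0 && va > best)
            best = va;
    }
    return best;
}
```

For the struct quoted in the thread, k sits at offset 16 and f at offset 32 (assuming 4-byte int and float), so with a vector alignment of 8 both arrays qualify and the function returns 8.
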
Option handling (support) of -fsanitize=use-after-scope

2016-05-11 Thread Martin Liška
Hello.

I've been working on enabling the use-after-scope sanitizer in GCC ([1]),
and from the following review request ([2]) I see that LLVM has started to
use the following option:
-mllvm -asan-use-after-scope=1

My initial attempt was to introduce a new value for the -fsanitize option
(which would keep the LLVM and GCC options compatible). Following LLVM's
current behavior, I would have to add a new --param, which would lead to a
divergence. Is the suggested approach acceptable to the LLVM community?

I would also suggest the following default behavior:
- If -fsanitize=address or -fsanitize=kernel-address is enabled,
use-after-scope sanitization should be enabled.
- Similarly, providing -fsanitize=use-after-scope should enable address
sanitization (either user-space or kernel-space).

Thank you for feedback,
Martin

[1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00468.html
[2] http://reviews.llvm.org/D19347
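
For readers unfamiliar with the bug class, here is a minimal C sketch of what a use-after-scope check is meant to catch. The function names are hypothetical. The first function is shown only for its shape and should not be called, since reading through the pointer after the block ends is undefined behavior.

```c
/* Pattern a use-after-scope checker is meant to report.  Do not call:
   'p' still points at 'local' after the lifetime of 'local' has ended.  */
int read_after_scope(void)
{
    int *p;
    {
        int local = 42;
        p = &local;
    }               /* lifetime of 'local' ends here */
    return *p;      /* BUG: use after scope */
}

/* Corrected variant: the value is read while 'local' is still alive.  */
int read_in_scope(void)
{
    int result;
    {
        int local = 42;
        int *p = &local;
        result = *p;
    }
    return result;
}
```

Without instrumentation the buggy variant often appears to work, because the stack slot has not been reused yet; that is why a dedicated check is useful.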


Re: Option handling (support) of -fsanitize=use-after-scope

2016-05-11 Thread Yury Gribov

On 05/11/2016 04:18 PM, Martin Liška wrote:

> Hello.
>
> I've been working on enabling the use-after-scope sanitizer in GCC ([1]),
> and from the following review request ([2]) I see that LLVM has started to
> use the following option:
> -mllvm -asan-use-after-scope=1
>
> My initial attempt was to introduce a new value for the -fsanitize option
> (which would keep the LLVM and GCC options compatible). Following LLVM's
> current behavior, I would have to add a new --param, which would lead to a
> divergence. Is the suggested approach acceptable to the LLVM community?
>
> I would also suggest the following default behavior:
> - If -fsanitize=address or -fsanitize=kernel-address is enabled,
> use-after-scope sanitization should be enabled.
> - Similarly, providing -fsanitize=use-after-scope should enable address
> sanitization (either user-space or kernel-space).
>
> Thank you for feedback,
> Martin
>
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00468.html
> [2] http://reviews.llvm.org/D19347


Cc-ed Google folks.



Re: [PATCH] clean up insn-automata.c

2016-05-11 Thread Vladimir Makarov

On 05/11/2016 01:39 AM, Alexander Monakov wrote:

> On Wed, 30 Mar 2016, Bernd Schmidt wrote:
>> On 03/25/2016 04:43 AM, Aldy Hernandez wrote:
>>> If Bernd is fine with this, I'm happy to retract my patch and any
>>> possible followups.  I'm just interested in having no path causing a
>>> possible out of bounds access.  If your patch will do that, I'm cool.
>>
>> I'll need to see that patch first to comment :-)
>
> Here's the proposed patch.  I've found that there's only one user of the
> current fancy logic in output_internal_insn_code_evaluation: handling of
> NULL_RTX and const0_rtx is only useful for 'state_transition' (generated
> by output_trans_func), so it's possible to inline the extended handling
> there, and handle only plain non-null rtx_insns in
> output_internal_insn_code_evaluation.
>
> This change allows removing extra checks and casting in
> output_internal_insn_latency_func, as done by the patch.
>
> As a nice bonus, it constrains prototypes of three automata functions to
> accept 'rtx_insn *' rather than just 'rtx'.
>
> Bootstrapped and regtested on x86_64, OK?

Yes, it is ok for the trunk.  Thank you for solving this issue, Alexander.

> * genattr.c (main): Change 'rtx' to 'rtx_insn *' in prototypes of
> 'insn_latency', 'maximal_insn_latency', 'min_insn_conflict_delay'.
> * genautomata.c (output_internal_insn_code_evaluation): Simplify.
> Move handling of non-insn arguments inline into the sole user:
> (output_trans_func): ...here.
> (output_min_insn_conflict_delay_func): Change 'rtx' to 'rtx_insn *' in
> emitted function prototype.
> (output_internal_insn_latency_func): Ditto.  Simplify.
> (output_internal_maximal_insn_latency_func): Ditto.  Delete
> always-unused argument.
> (output_insn_latency_func): Ditto.
> (output_maximal_insn_latency_func): Ditto.





Ann: MELT plugin 1.3 release candidate 2 for GCC 5 or GCC 6

2016-05-11 Thread Basile Starynkevitch

Dear All,


It is my pleasure to announce the MELT plugin 1.3 release candidate 2
for GCC 5 & GCC 6 (hosted and usable on Linux preferably).

MELT - see http://gcc-melt.org/ for more (or
http://starynkevitch.net/Basile/gcc-melt/ which points to the same web
pages and resources) - is a domain specific language and meta-plugin
(free software GPLv3+ licensed, FSF copyrighted) to easily extend and
customize the GCC compiler.

Please download the bzip2-compressed source tar archive from
  http://gcc-melt.org/melt-1.3-rc2-plugin-for-gcc-5-or-6.tar.bz2
It is a file of 4013849 bytes (3.9Mbytes) with md5sum
eb4df214b293caabec07be4a672eda4e




NEWS for 1.3 MELT plugin for GCC 5 & GCC 6
[[may XX, 2016]]

   Bug fixes
   =========

Rare garbage collection bug fixed (noticed with GCC 5).

   Language features
   =================

No significant new language feature.

   Runtime features
   ================

We did keep compatibility with GCC 5 & GCC 6.

Since gengtype does not admit conditionals (see messages following
https://gcc.gnu.org/ml/gcc/2016-02/msg00156.html ...) we had to hack
our build system. The MELT plugin now uses a melt-runtypes.h
symlink to a version-specific file, which has typedefs like
    typedef gimple* melt_gimpleptr_t; // gimple is now a struct

Added plugin options:

   -fplugin-melt-arg-verbose-full-gc: if set to 1 or Y, a message is
   output to stderr on MELT full garbage collections.

   -fplugin-melt-arg-mmap-reserve: don't use it, except to debug the
   MELT runtime. See comment in melt-runtime.cc

The MELT runtime (that is, the MELT plugin melt.so) could be built with
-DMELT_HAVE_RUNTIME_DEBUG=1 to enable MELT runtime debugging. This is
rarely useful for MELT users.



Compared with MELT plugin 1.3 rc1, I have made a few bug fixes (including
perhaps an annoying GC bug that I can no longer reproduce).


Please try to build & use that release candidate 2 and report bugs to 
gcc-m...@googlegroups.com


Regards.

--
Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***



Re: (R5900) Implementing Vector Support

2016-05-11 Thread Richard Henderson

On 05/11/2016 04:54 AM, Woon yung Liu wrote:

> I saw that the EE has the PMFHL.LH instruction, which loads the HI/LO
> register pairs (containing the multiplication result) into a single destination
> (i.e. truncates the multiplication result in the process), with the right order
> too.  I suppose that it would be suitable for implementing the mulm3 operation.
> But if I implement mulm3, is there still a need to implement the
> vec_widen_smult_hi_m and vec_widen_smult_lo_m patterns?


Of course.  They're used for different things.  E.g.

  int out[100];
  short in1[100], in2[100];

  for (i = 0; i < 100; ++i)
    out[i] = in1[i] * in2[i];

will use the vec_widen_smult* patterns.
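
To make the "different things" concrete, the two loop shapes can be written side by side. This is a sketch with hypothetical function names. Whether a given compiler actually selects a mul<m>3 or vec_widen_smult_* pattern for these loops depends on the target and the optimization flags, so nothing here is a guarantee about code generation.

```c
#include <stddef.h>

/* Same-width multiply: short inputs, short result.  The product is
   truncated back to the element width; this is the shape a vector
   mul<m>3 pattern serves.  */
void mul_same_width(short *out, const short *a, const short *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (short)(a[i] * b[i]);
}

/* Widening multiply: short inputs, int result.  The full product is
   kept; this is the shape the vec_widen_smult_{lo,hi} patterns serve.  */
void mul_widening(int *out, const short *a, const short *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}
```

For example, 300 * 300 fits the widening form as 90000 but does not fit in a 16-bit short, which is why the two operations cannot stand in for each other.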


> I tried to implement the two patterns (vec_widen_smult_hi_m and
> vec_widen_smult_lo_m), but GCC wouldn't compile due to both patterns having
> the same operands. Must they be expands? If so, what sort of patterns should
> the pcpyld and pcpyud instructions be? If I don't declare them differently,
> I'll have the same compilation error again (due to them having the same
> operands).


Yes I would think they should be expands.  I would expect something like

;; ??? Could describe the result in %3, if we ever find it useful.
(define_insn "pmulth_ee"
  [(set (match_operand:V8SI 0 "register_operand" "=x")
        (vec_select:V8SI
          (mult:V8SI
            (sign_extend:V8SI (match_operand:V8HI 1 "register_operand" "d"))
            (sign_extend:V8SI (match_operand:V8HI 2 "register_operand" "d")))
          (parallel
            [(const_int 0) (const_int 1) (const_int 4) (const_int 5)
             (const_int 2) (const_int 3) (const_int 6) (const_int 7)])))
   (clobber (match_scratch:V4SI 3 "=d"))]
  "..."
  "pmulth\t%3,%1,%2"
)

(define_insn "pmfhl_lh_ee_v8hi"
  [(set (match_operand:V8HI 0 "register_operand" "=d")
        (vec_select:V8HI
          (match_operand:V16HI 1 "register_operand" "x")
          (parallel
            [(const_int 0) (const_int 2)
             (const_int 8) (const_int 10)
             (const_int 4) (const_int 6)
             (const_int 12) (const_int 14)])))]
  "..."
  "pmfhl.lh\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pmfhi_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (match_operand:V4DI 1 "register_operand" "x")
          (parallel [(const_int 2) (const_int 3)])))]
  "..."
  "pmfhi\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pmflo_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (match_operand:V4DI 1 "register_operand" "x")
          (parallel [(const_int 0) (const_int 1)])))]
  "..."
  "pmflo\t%0"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pcpyld_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (vec_concat:V4DI
            (match_operand:V2DI 1 "register_operand" "d")
            (match_operand:V2DI 2 "register_operand" "d"))
          (parallel [(const_int 0) (const_int 2)])))]
  "..."
  "pcpyld\t%0,%2,%1"
)

;; ??? Maybe provide V4SI and V8HI versions too.
(define_insn "pcpyud_ee_v2di"
  [(set (match_operand:V2DI 0 "register_operand" "=d")
        (vec_select:V2DI
          (vec_concat:V4DI
            (match_operand:V2DI 1 "register_operand" "d")
            (match_operand:V2DI 2 "register_operand" "d"))
          (parallel [(const_int 1) (const_int 3)])))]
  "..."
  "pcpyud\t%0,%1,%2"
)

(define_expand "mulv8hi3"
  [(match_operand:V8HI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V16HImode, hilo);
  emit_insn (gen_pmfhl_lh_ee_v8hi (operands[0], hilo));
  DONE;
})

(define_expand "vec_widen_smult_lo_v8hi"
  [(match_operand:V4SI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  rtx hi = gen_reg_rtx (V2DImode);
  rtx lo = gen_reg_rtx (V2DImode);

  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V4DImode, hilo);
  emit_insn (gen_pmfhi_ee_v2di (hi, hilo));
  emit_insn (gen_pmflo_ee_v2di (lo, hilo));
  emit_insn (gen_pcpyld_ee_v2di (gen_lowpart (V2DImode, operands[0]), lo, hi));
  DONE;
})

(define_expand "vec_widen_smult_hi_v8hi"
  [(match_operand:V4SI 0 "register_operand")
   (match_operand:V8HI 1 "register_operand")
   (match_operand:V8HI 2 "register_operand")]
  "..."
{
  rtx hilo = gen_reg_rtx (V8SImode);
  rtx hi = gen_reg_rtx (V2DImode);
  rtx lo = gen_reg_rtx (V2DImode);

  emit_insn (gen_pmulth_ee (hilo, operands[1], operands[2]));
  hilo = gen_lowpart (V4DImode, hilo);
  emit_insn (gen_pmfhi_ee_v2di (hi, hilo));
  emit_insn (gen_pmflo_ee_v2di (lo, hilo));
  emit_insn (gen_pcpyud_ee_v2di (gen_lowpart (V2DImode, operands[0]), lo, hi));
  DONE;
})
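
The lane bookkeeping in the two widening expanders can be checked with a small C model. This models only the RTL as written above (the vec_select orderings), not the hardware itself. It assumes that viewing the V8SI product register as V4DI pairs SI lanes 2k and 2k+1 into DI lane k; the function names are hypothetical.

```c
#include <stdint.h>

/* Model of the "pmulth_ee" RTL: eight sign-extended products stored in
   the order given by its (parallel [0 1 4 5 2 3 6 7]).  */
void pmulth_model(int32_t hilo[8], const int16_t a[8], const int16_t b[8])
{
    static const int order[8] = {0, 1, 4, 5, 2, 3, 6, 7};
    for (int i = 0; i < 8; i++)
        hilo[i] = (int32_t)a[order[i]] * (int32_t)b[order[i]];
}

/* Model of the "lo" expander: pmflo yields SI lanes 0..3 of hilo, pmfhi
   yields SI lanes 4..7, and pcpyld keeps DI lane 0 of each.  */
void widen_smult_lo_model(int32_t out[4], const int16_t a[8],
                          const int16_t b[8])
{
    int32_t hilo[8];
    pmulth_model(hilo, a, b);
    const int32_t *lo = &hilo[0];
    const int32_t *hi = &hilo[4];
    out[0] = lo[0]; out[1] = lo[1];   /* DI lane 0 of lo */
    out[2] = hi[0]; out[3] = hi[1];   /* DI lane 0 of hi */
}

/* Model of the "hi" expander: same as above, but pcpyud keeps DI lane 1
   of each half.  */
void widen_smult_hi_model(int32_t out[4], const int16_t a[8],
                          const int16_t b[8])
{
    int32_t hilo[8];
    pmulth_model(hilo, a, b);
    const int32_t *lo = &hilo[0];
    const int32_t *hi = &hilo[4];
    out[0] = lo[2]; out[1] = lo[3];   /* DI lane 1 of lo */
    out[2] = hi[2]; out[3] = hi[3];   /* DI lane 1 of hi */
}
```

Under these assumptions the "lo" model returns the products of elements 0..3 in order and the "hi" model those of elements 4..7, which is what the standard pattern names ask for on a little-endian lane numbering.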

gcc-4.9-20160511 is now available

2016-05-11 Thread gccadmin
Snapshot gcc-4.9-20160511 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/4.9-20160511/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 4.9 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/branches/gcc-4_9-branch revision 236147

You'll find:

 gcc-4.9-20160511.tar.bz2 Complete GCC

  MD5=13b96c87abf36b7c87d6bbf1c577a198
  SHA1=0639f501f4f78061c23951c53810ea31a7f868fd

Diffs from 4.9-20160504 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-4.9
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.