[PING] [PATCH 1/n] Add conditional compare support

2013-11-20 Thread Zhenqiang Chen
Ping?

Thanks!
-Zhenqiang

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> ow...@gcc.gnu.org] On Behalf Of Zhenqiang Chen
> Sent: Wednesday, November 06, 2013 3:39 PM
> To: 'Richard Henderson'
> Cc: Richard Earnshaw; 'Richard Biener'; GCC Patches
> Subject: RE: [PATCH 1/n] Add conditional compare support
> 
> 
> > -Original Message-
> > From: Richard Henderson [mailto:r...@redhat.com]
> > Sent: Tuesday, November 05, 2013 4:39 AM
> > To: Zhenqiang Chen
> > Cc: Richard Earnshaw; 'Richard Biener'; GCC Patches
> > Subject: Re: [PATCH 1/n] Add conditional compare support
> >
> > On 11/04/2013 08:00 PM, Zhenqiang Chen wrote:
> > > Thanks. I add a new hook. The default function will return -1 if the
> > > target does not care about the order.
> > >
> > > +DEFHOOK
> > > +(select_ccmp_cmp_order,
> > > + "For some targets (like ARM), the order of two compares is sensitive for\n\
> > > +conditional compare.  cmp0-cmp1 might be an invalid combination.  But when\n\
> > > +swapping the order, cmp1-cmp0 is valid.  The function will return\n\
> > > +  -1: if @code{code1} and @code{code2} are a valid combination.\n\
> > > +   1: if @code{code2} and @code{code1} are a valid combination.\n\
> > > +   0: both are invalid.",
> > > + int, (int code1, int code2),
> > > + default_select_ccmp_cmp_order)
> >
> > Fair enough.  I'd originally been thinking that returning a tri-state
> > value akin to the comparison callback to qsort would allow easy
> > sorting of a whole list of comparisons.  But probably just as easy to
> > open-code while checking for invalid combinations.
> >
> > Checking for invalid combinations while sorting means that we can then
> > disallow returning NULL from the other two hooks, because the backend
> > has already had a chance to indicate failure.
> 
> The check is only done for the first two compares; the following compares
> are not checked. In addition, the backend might check more things (e.g.
> arm_select_dominance_cc_mode) to generate a valid compare instruction.
> 
> > > For gen_ccmp_next, I add another parameter CC_P to indicate the
> > > result is used as CC or not. If CC_P is false, the gen_ccmp_next
> > > will return a general register. This is for code like
> > >
> > > int test (int a, int b)
> > > {
> > >   return a > 0 && b > 0;
> > > }
> > > During expansion, there might be no branch at all, so gen_ccmp_next
> > > cannot return a CC for "a > 0 && b > 0".
> >
> > Uh, no, this is a terrible idea.  There's no need for gen_ccmp_next to
> > re-do the work of cstore_optab.
> >
> > I believe you can use emit_store_flag as a high-level interface here,
> > since there are technically vagaries due to STORE_FLAG_VALUE.  If that
> > turns out to crash or fail in some way, we can talk about using
> > cstore_optab directly given some restrictions.
> 
> emit_store_flag does too many checks. I use cstore_optab to emit the insn.
> 
> +  icode = optab_handler (cstore_optab, CCmode);
> +  if (icode != CODE_FOR_nothing)
> + {
> +   rtx target = gen_reg_rtx (word_mode);
> +   tmp = emit_cstore (target, icode, NE, CCmode, CCmode,
> +  0, tmp, const0_rtx, 1, word_mode);
> +   if (tmp)
> + return tmp;
> + }
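To illustrate what the cstore emission above computes (a hypothetical scalar sketch, not the actual RTL the hook emits): the CC result is materialized as a 0/1 value in a general register, so the earlier test() function evaluates branch-free.

```c
#include <assert.h>

/* Branch-free scalar equivalent of `return a > 0 && b > 0;`:
   each compare yields 0/1 in a general register, and the results
   are combined with AND -- the effect of materializing the CC
   result via cstore instead of branching.  */
int
test_branchless (int a, int b)
{
  int t0 = a > 0;   /* first compare -> 0/1 */
  int t1 = b > 0;   /* second (conditional) compare -> 0/1 */
  return t0 & t1;   /* combined result, no branches */
}
```

test_branchless is a hypothetical name used only for this sketch.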
> 
> > It also means that you shouldn't need all of and_scc_scc, ior_scc_scc,
> > ccmp_and_scc_scc, ccmp_ior_scc_scc.
> 
> Yes. We only need ccmp_and and ccmp_ior now.
> 
> I will check that the existing and_scc_scc, ior_scc_scc,
> and_scc_scc_cmp and ior_scc_scc_cmp can be removed once conditional
> compare is enabled.
> 
> > Although I don't see cstorecc4 defined for ARM, so there is something
> > missing.
> 
> cstorecc4 is added.
> 
> > > +static int
> > > +arm_select_ccmp_cmp_order (int cond1, int cond2)
> > > +{
> > > +  if (cond1 == cond2)
> > > +    return -1;
> > > +  if (comparison_dominates_p ((enum rtx_code) cond1,
> > > +                              (enum rtx_code) cond2))
> > > +    return 1;
> > > +  if (comparison_dominates_p ((enum rtx_code) cond2,
> > > +                              (enum rtx_code) cond1))
> > > +    return -1;
> > > +  return 0;
> > > +}
> >
> > This sort does not look stable.  In particular,
> >
> >   if (cond1 == cond2)
> > return 1;
> >
> > would seem to better preserve the original order of the comparisons.
> 
> -1 is to keep the original order. Anyway, I changed the function to:
> 
> +/* COND1 and COND2 should be enum rtx_code, which represent two compares.
> +   The order is sensitive for conditional compare.  It returns
> +  1: Keep current order.
> + -1: Swap the two compares.
> +  0: Invalid combination.  */
> +
> +static int
> +arm_select_ccmp_cmp_order (int cond1, int cond2)
> +{
> +  /* THUMB1 does not support conditional compare.  */
> +  if (TARGET_THUMB1)
> +    return 0;
> +
> +  if (cond1 == cond2)
> +    return 1;
> +  if (comparison_dominates_p ((enum rtx_code) cond1, (enum rtx_code) cond2))
> +    return -1;
> +  if (comparison_dominates_p ((enum rtx_code) cond2, (enum rtx_code) cond1))
> +    return 1;
> +

Re: [PATCH] Support -fsanitize=leak

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 11:40:14AM +0400, Yury Gribov wrote:
> > Jakub already wrote the invoke.texi part:
> > http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02061.html
> 
> Ah, thanks, I somehow missed it. When will this end up in trunk? I
> have a patch depending on it.

When it is reviewed.  I have various other pending asan patches:
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02070.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02061.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01874.html
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01791.html
plus the libbacktrace symbolization in libsanitizer (but that depends
on upstream acceptance and because of the non-trivial changes in upstream
will need to wait for acceptance there plus another merge back).

Dodji, do you think you could review these soon?

Jakub


[PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Jakub Jelinek
Hi!

I've noticed we generate terrible code for the testcase below.
E.g. with -mavx2 it is:
leal    6(%rdi), %edx
leal    12(%rdi), %ecx
leal    18(%rdi), %esi
leal    3(%rdi), %eax
movl    %edx, -20(%rsp)
movl    %ecx, -24(%rsp)
leal    9(%rdi), %edx
movl    %esi, -28(%rsp)
vmovd   -20(%rsp), %xmm6
leal    15(%rdi), %ecx
movl    %edi, -20(%rsp)
leal    21(%rdi), %esi
vmovd   -28(%rsp), %xmm4
vmovd   -24(%rsp), %xmm5
vpinsrd $1, %edx, %xmm6, %xmm3
vmovd   -20(%rsp), %xmm7
vpinsrd $1, %esi, %xmm4, %xmm2
vpinsrd $1, %ecx, %xmm5, %xmm1
vpinsrd $1, %eax, %xmm7, %xmm0
vpunpcklqdq %xmm2, %xmm1, %xmm1
vpunpcklqdq %xmm3, %xmm0, %xmm0
vinserti128 $0x1, %xmm1, %ymm0, %ymm0
to load the { x, x + 3, x + 6, x + 9, x + 12, x + 15, x + 18, x + 21 }
CONSTRUCTOR into a vector register.  This patch expands it as:
movl    %edi, -20(%rsp)
vbroadcastss    -20(%rsp), %ymm0
vpaddd  .LC0(%rip), %ymm0, %ymm0
instead.  Similarly for SSE2:
leal    3(%rdi), %eax
movl    %eax, -20(%rsp)
leal    6(%rdi), %eax
movd    -20(%rsp), %xmm4
movl    %eax, -16(%rsp)
leal    9(%rdi), %eax
movd    -16(%rsp), %xmm1
movl    %edi, -16(%rsp)
movl    %eax, -12(%rsp)
movd    -16(%rsp), %xmm0
movd    -12(%rsp), %xmm3
punpckldq   %xmm4, %xmm0
punpckldq   %xmm3, %xmm1
punpcklqdq  %xmm1, %xmm0
instead of:
movl    %edi, -12(%rsp)
movd    -12(%rsp), %xmm3
pshufd  $0, %xmm3, %xmm0
paddd   .LC0(%rip), %xmm0
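The broadcast-plus-constant-add idea can be sketched with SSE2 intrinsics (an illustrative analogue of the expansion, not output of the patch; make_vec4 is a hypothetical helper):

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Illustrative sketch of expanding { x, x + 3, x + 6, x + 9 }:
   one broadcast (pshufd $0) plus one paddd against a constant
   vector, instead of four scalar leal/movd element inserts.  */
__m128i
make_vec4 (int x)
{
  __m128i vx = _mm_set1_epi32 (x);          /* broadcast x to all 4 lanes */
  __m128i c  = _mm_set_epi32 (9, 6, 3, 0);  /* lanes 0..3 = 0, 3, 6, 9 */
  return _mm_add_epi32 (vx, c);             /* single vector addition */
}
```

Note the argument order of `_mm_set_epi32`: the first argument is the highest lane.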

The patch leaves that decision to the target (in its
*_expand_vector_init).

Ok?

2013-11-20  Jakub Jelinek  

* expr.c (store_constructor): If all CONSTRUCTOR values are
the same SSA_NAME plus optional constant, pass plus_constant results
down to vec_init.
* config/i386/i386.c (ix86_expand_vector_init): Optimize it using
broadcast followed by addition of a constant vector.

* gcc.dg/vect/vect-124.c: New test.

--- gcc/expr.c.jj   2013-11-19 21:56:27.0 +0100
+++ gcc/expr.c  2013-11-20 09:24:14.435860130 +0100
@@ -6296,6 +6296,8 @@ store_constructor (tree exp, rtx target,
rtvec vector = NULL;
unsigned n_elts;
alias_set_type alias;
+   tree base = NULL_TREE;
+   rtx base_rtx = NULL_RTX;
 
gcc_assert (eltmode != BLKmode);
 
@@ -6334,6 +6336,31 @@ store_constructor (tree exp, rtx target,
TYPE_SIZE (TREE_TYPE (value)),
TYPE_SIZE (elttype)));
 
+   /* Try to detect fairly common pattern of
+  { a_5, a_5 + 1, a_5 + 2, a_5 + 3 }.  */
+   if (vector
+   && (count == 0 || base)
+   && TREE_CODE (value) == SSA_NAME)
+ {
+   gimple g = get_gimple_for_ssa_name (value);
+   tree this_base = value;
+   if (g && is_gimple_assign (g))
+ switch (gimple_assign_rhs_code (g))
+   {
+   case PLUS_EXPR:
+   case POINTER_PLUS_EXPR:
+ if (TREE_CODE (gimple_assign_rhs1 (g)) == SSA_NAME
+ && tree_fits_shwi_p (gimple_assign_rhs2 (g)))
+   this_base = gimple_assign_rhs1 (g);
+ break;
+   default:
+ break;
+   }
+   if (count == 0)
+ base = this_base;
+   else if (base != this_base)
+ base = NULL_TREE;
+ }
count += n_elts_here;
if (mostly_zeros_p (value))
  zero_count += n_elts_here;
@@ -6342,6 +6369,9 @@ store_constructor (tree exp, rtx target,
/* Clear the entire vector first if there are any missing elements,
   or if the incidence of zero elements is >= 75%.  */
need_to_clear = (count < n_elts || 4 * zero_count >= 3 * count);
+
+   if (count < n_elts)
+ base = NULL_TREE;
  }
 
if (need_to_clear && size > 0 && !vector)
@@ -6362,6 +6392,9 @@ store_constructor (tree exp, rtx target,
else
  alias = get_alias_set (elttype);
 
+   if (base)
+ base_rtx = expand_normal (base);
+
 /* Store each element of the constructor into the corresponding
   element of TARGET, determined by counting the elements.  */
for (idx = 0, i = 0;
@@ -6385,8 +6418,21 @@ store_constructor (tree exp, rtx target,
/* Vector CONSTRUCTORs should only be built from smaller
   vectors in the case of BLKmode vectors.  */
   

Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Tue, Nov 19, 2013 at 9:43 PM, Ilya Enkovich  wrote:

>> > Here is a patch to add size relocation and an instruction to obtain an
>> > object's size on the i386 target.
>>
>> +(define_insn "move_size_reloc_"
>> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>> +(match_operand: 1 "size_relocation" "Z"))]
>> +  ""
>> +{
>> +  return "mov{}\t{%1, %0|%0, %1}";
>>
>> Please don't change x86_64_immediate_operand just to use the "Z"
>> constraint. The predicate is used in a couple of other places that for
>> sure don't accept your change.
>>
>> Better write this insn in an explicit way (see for example
>> tls_initial_exec_64_sun). Something like:
>>
>> (define_insn "move_size_reloc_"
>>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>> (unspec:SWI48
>>  [(match_operand 1 "symbolic_operand" "..." )]
>>  UNSPEC_SIZEOF))]
>>   ""
>>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
>>
>> You will probably need to define new operand 1 predicate and constraint.
>>
>> Uros.
>
> Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
> suggestion.  Does it look better?

You actually don't need any operand modifiers in the insn template. Simply use:

"mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"

and you will automatically get

"movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".

Since your pattern allows only symbolic_operand, there is no reload,
so you can avoid the constraint altogether.

BTW: Did you consider the various -mcmodel=... options? For DImode moves,
you should check the x86_64_zext_immediate_operand predicate and output
either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with a
64-bit immediate. Please see the movdi pattern.

Uros.


Re: [AArch64] AArch64 SIMD Builtins Better Type Correctness.

2013-11-20 Thread Marcus Shawcroft
On 18 November 2013 09:10, James Greenhalgh  wrote:

> 2013-11-18  James Greenhalgh  
>
> * gcc/config/aarch64/aarch64-builtins.c
> (aarch64_simd_itype): Remove.
> (aarch64_simd_builtin_datum): Remove itype, add
> qualifiers pointer.
> (VAR1): Use qualifiers.
> (aarch64_build_scalar_type): New.
> (aarch64_build_vector_type): Likewise.
> (aarch64_build_type): Likewise.
> (aarch64_init_simd_builtins): Refactor, remove special cases,
> consolidate main loop.
> (aarch64_simd_expand_args): Likewise.

OK
/Marcus


[buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Jan-Benedict Glaw
Hi!

One usage of ENTRY_BLOCK_PTR was missed:

g++ -c   -g -O2 -DIN_GCC  -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wwrite-strings -Wcast-qual 
-Wmissing-format-attribute -pedantic -Wno-long-long -Wno-variadic-macros 
-Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. -I. 
-I/home/jbglaw/repos/gcc/gcc -I/home/jbglaw/repos/gcc/gcc/. 
-I/home/jbglaw/repos/gcc/gcc/../include 
-I/home/jbglaw/repos/gcc/gcc/../libcpp/include  
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
-I/home/jbglaw/repos/gcc/gcc/../libbacktrace -o mips.o -MT mips.o -MMD -MP 
-MF ./.deps/mips.TPo /home/jbglaw/repos/gcc/gcc/config/mips/mips.c
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘void 
mips_insert_attributes(tree_node*, tree_node**)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1403: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1408: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘tree_node* 
mips_merge_decl_attributes(tree_node*, tree_node*)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1437: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: unknown conversion 
type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: format ‘%qs’ 
expects type ‘char*’, but argument 2 has type ‘tree_node*’
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:1443: warning: too many arguments 
for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘rtx_def* 
mips_expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14458: warning: unknown 
conversion type character ‘E’ in format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14458: warning: too many 
arguments for format
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c: In function ‘rtx_def* 
r10k_simplify_address(rtx_def*, rtx_def*)’:
/home/jbglaw/repos/gcc/gcc/config/mips/mips.c:14845: error: ‘ENTRY_BLOCK_PTR’ 
was not declared in this scope
make[1]: *** [mips.o] Error 1

(http://toolchain.lug-owl.de/buildbot/show_build_details.php?id=36303)

This should fix it:

2013-11-20  Jan-Benedict Glaw  

* config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 82ca719..d06d574 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -14842,7 +14842,7 @@ r10k_simplify_address (rtx x, rtx insn)
  /* Replace the incoming value of $sp with
 virtual_incoming_args_rtx.  */
  if (x == stack_pointer_rtx
- && DF_REF_BB (def) == ENTRY_BLOCK_PTR)
+ && DF_REF_BB (def) == ENTRY_BLOCK_PTR_FOR_FN (cfun))
newx = virtual_incoming_args_rtx;
}
  else if (dominated_by_p (CDI_DOMINATORS, DF_REF_BB (use),


Ok?

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481




Re: [PATCH] Support addsub/subadd as non-isomorphic operations for SLP vectorizer.

2013-11-20 Thread Richard Biener
On Tue, 19 Nov 2013, Cong Hou wrote:

> On Tue, Nov 19, 2013 at 1:45 AM, Richard Biener  wrote:
> >
> > On Mon, 18 Nov 2013, Cong Hou wrote:
> >
> > > I tried your method and it works well for doubles. But for float,
> > > there is an issue. For the following gimple code:
> > >
> > >c1 = a - b;
> > >c2 = a + b;
> > >c = VEC_PERM 
> > >
> > > It needs two instructions to implement the VEC_PERM operation in
> > > SSE2-4, one of which should be using shufps which is represented by
> > > the following pattern in rtl:
> > >
> > >
> > > (define_insn "sse_shufps_"
> > >   [(set (match_operand:VI4F_128 0 "register_operand" "=x,x")
> > > (vec_select:VI4F_128
> > >  (vec_concat:
> > >(match_operand:VI4F_128 1 "register_operand" "0,x")
> > >(match_operand:VI4F_128 2 "nonimmediate_operand" "xm,xm"))
> > >  (parallel [(match_operand 3 "const_0_to_3_operand")
> > > (match_operand 4 "const_0_to_3_operand")
> > > (match_operand 5 "const_4_to_7_operand")
> > > (match_operand 6 "const_4_to_7_operand")])))]
> > > ...)
> > >
> > > Note that it contains two rtl instructions.
> >
> > It's a single instruction as far as combine is concerned (RTL
> > instructions have arbitrary complexity).
> 
> 
> Even if it is one instruction, we will end up with four rtl statements,
> which still cannot be combined as there are restrictions on combining
> four instructions (loads of constants or binary operations involving a
> constant). Note that vec_select instead of vec_merge is used here
> because currently vec_merge is emitted only if SSE4 is enabled (thus
> blend instructions can be used. If you look at
> ix86_expand_vec_perm_const_1() in i386.c, you can find that vec_merge
> is generated in expand_vec_perm_1() with SSE4.). Without SSE4 support,
> in most cases a vec_merge statement could not be translated by one SSE
> instruction.
> 
> >
> >
> > > Together with minus, plus,
> > > and one more shuffling instruction, we have at least five instructions
> > > for addsub pattern. I think during the combine pass, only four
> > > instructions are considered to be combined, right? So unless we
> > > compress those five instructions into four or less, we could not use
> > > this method for float values.
> >
> > At the moment addsubv4sf looks like
> >
> > (define_insn "sse3_addsubv4sf3"
> >   [(set (match_operand:V4SF 0 "register_operand" "=x,x")
> > (vec_merge:V4SF
> >   (plus:V4SF
> > (match_operand:V4SF 1 "register_operand" "0,x")
> > (match_operand:V4SF 2 "nonimmediate_operand" "xm,xm"))
> >   (minus:V4SF (match_dup 1) (match_dup 2))
> >   (const_int 10)))]
> >
> > to match this it's best to have the VEC_SHUFFLE retained as
> > vec_merge and thus support arbitrary(?) vec_merge for the aid
> > of combining until reload(?) after which we can split it.
> >
> 
> 
> You mean VEC_PERM (this is generated in gimple from your patch)? Note
> as I mentioned above, without SSE4, it is difficult to translate
> VEC_PERM into vec_merge. Even if we can do it, we still need to define a
> split to convert one vec_merge into two or more other statements
> later. ADDSUB instructions are provided by SSE3, and I think we should
> not rely on SSE4 to perform this transformation, right?

Sure not, we just keep vec_merge in the early RTL IL to give
combine an easier job.  We split it later to expose the target details
(see below).

> To sum up, if we use vec_select instead of vec_merge, we may have four
> rtl statements for float types, in which case they cannot be combined.
> If we use vec_merge, we need to define the split for it without SSE4
> support, and we need also to change the behavior of
> ix86_expand_vec_perm_const_1().

Yes.  But that would be best IMHO.  Otherwise we're not going to
have PR56766 fixed for the plain C generic vector testcase

typedef double v2df __attribute__((vector_size(16)));
typedef long long v2di __attribute__((vector_size(16)));
v2df foo (v2df x, v2df y)
{
  v2df tem1 = x - y;
  v2df tem2 = x + y;
  return __builtin_shuffle (tem1, tem2, (v2di) { 0, 3 });
}
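For reference, the mask { 0, 3 } selects element 0 of tem1 (the subtraction) and element 1 of tem2 (the addition), which is exactly the addsubpd semantics. A scalar restatement (foo_ref is a hypothetical helper, not part of any posted patch):

```c
/* Scalar reference for foo(): __builtin_shuffle with mask { 0, 3 }
   takes element 0 from tem1 = x - y and element 1 (index 3 - 2)
   from tem2 = x + y -- the SSE3 addsubpd operation.  */
void
foo_ref (const double x[2], const double y[2], double out[2])
{
  out[0] = x[0] - y[0];  /* lane 0: from the subtraction */
  out[1] = x[1] + y[1];  /* lane 1: from the addition */
}
```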

> > > What do you think?
> >
> > Besides of addsub are there other instructions that can be expressed
> > similarly?  Thus, how far should the combiner pattern go?
> >
> 
> 
> I think your method is quite flexible. Besides blending add/sub, we
> could blend other combinations of two operations, and even one
> operation and a no-op. For example, consider vectorizing complex
> conjugate operation:
> 
> for (int i = 0; i < N; i+=2) {
>   a[i] = b[i];
>   a[i+1] = -b[i+1];
> }
> 
> This loop is better vectorized by hybrid SLP. The second
> statement has a unary minus operation but there is no operation in the
> first one. We can improve our SLP grouping algorithm to let GCC SLP
> vectorize it.
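One plausible lowering of such a mismatched pair, sketched with SSE2 intrinsics purely as an illustration (conj_pair is a hypothetical helper, not from any posted patch): negate the odd lane by XORing in the IEEE sign bit while leaving the even lane untouched.

```c
#include <emmintrin.h>

/* Hypothetical lowering of { a[i] = b[i]; a[i+1] = -b[i+1]; }:
   XOR the IEEE-754 sign bit into lane 1 only; lane 0 is XORed
   with +0.0, i.e. left unchanged.  */
void
conj_pair (const double b[2], double a[2])
{
  __m128d vb   = _mm_loadu_pd (b);
  __m128d sign = _mm_set_pd (-0.0, 0.0);  /* sign bit set in lane 1 only */
  _mm_storeu_pd (a, _mm_xor_pd (vb, sign));
}
```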

Yes indeed - I also liked the flexibility in the end.  If cost
considerations allow it we can handle mismatched ops
quite well (albeit with a cost overhead).  Note that the
VEC_PERM permutations generated a

Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Richard Sandiford
Jan-Benedict Glaw  writes:
> 2013-11-20  Jan-Benedict Glaw  
>
>   * config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.

OK.  And thanks for the catch.

Richard


Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Steven Bosscher
On Wed, Nov 20, 2013 at 10:04 AM, Jan-Benedict Glaw wrote:
> 2013-11-20  Jan-Benedict Glaw  <...>
>
> * config/mips/mips.c (r10k_simplify_address): Eliminate macro usage.
>
> diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
> index 82ca719..d06d574 100644
> --- a/gcc/config/mips/mips.c
> +++ b/gcc/config/mips/mips.c
> @@ -14842,7 +14842,7 @@ r10k_simplify_address (rtx x, rtx insn)
>   /* Replace the incoming value of $sp with
>  virtual_incoming_args_rtx.  */
>   if (x == stack_pointer_rtx
> - && DF_REF_BB (def) == ENTRY_BLOCK_PTR)
> + && DF_REF_BB (def) == ENTRY_BLOCK_PTR_FOR_FN (cfun))
> newx = virtual_incoming_args_rtx;
> }
>   else if (dominated_by_p (CDI_DOMINATORS, DF_REF_BB (use),
>
>
> Ok?
>
> MfG, JBG

This patch is obvious and it fixes breakage. Please go ahead and commit it.

I wonder if there are any more cases like this missed... Could you
please check that? Something like:

egrep -w "ENTRY_BLOCK_PTR|EXIT_BLOCK_PTR" gcc/*.[ch] gcc/config/*.[ch]
gcc/config/*/*.{c,h,md}

should do the trick. (I'd do it myself if I had access to a Linux box
right now...)

Ciao!
Steven


Re: RFA: patch to fix PR58785 (an ARM LRA crash)

2013-11-20 Thread Yvan Roux
Hi,

as Richard said, only a subset of rclass is allowed to be returned by
preferred_reload_class.  I've tested the attached patch in Thumb
mode, on ARMv5, A9 and A9hf and on cross A15 without regression.

Yvan

2013-11-20  Yvan Roux  

PR target/58785
* config/arm/arm.c (arm_preferred_reload_class): Only return LO_REGS
when rclass is GENERAL_REGS.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5c53440..63f10bd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -6882,10 +6882,7 @@ arm_preferred_reload_class (rtx x ATTRIBUTE_UNUSED, 
reg_class_t rclass)
 return rclass;
   else
 {
-  if (rclass == GENERAL_REGS
- || rclass == HI_REGS
- || rclass == NO_REGS
- || rclass == STACK_REG)
+  if (rclass == GENERAL_REGS)
return LO_REGS;
   else
return rclass;


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Jakub Jelinek wrote:

> Hi!
> 
> I've noticed we generate terrible code for the testcase below.
> E.g. with -mavx2 it is:
> leal    6(%rdi), %edx
> leal    12(%rdi), %ecx
> leal    18(%rdi), %esi
> leal    3(%rdi), %eax
> movl    %edx, -20(%rsp)
> movl    %ecx, -24(%rsp)
> leal    9(%rdi), %edx
> movl    %esi, -28(%rsp)
> vmovd   -20(%rsp), %xmm6
> leal    15(%rdi), %ecx
> movl    %edi, -20(%rsp)
> leal    21(%rdi), %esi
> vmovd   -28(%rsp), %xmm4
> vmovd   -24(%rsp), %xmm5
> vpinsrd $1, %edx, %xmm6, %xmm3
> vmovd   -20(%rsp), %xmm7
> vpinsrd $1, %esi, %xmm4, %xmm2
> vpinsrd $1, %ecx, %xmm5, %xmm1
> vpinsrd $1, %eax, %xmm7, %xmm0
> vpunpcklqdq %xmm2, %xmm1, %xmm1
> vpunpcklqdq %xmm3, %xmm0, %xmm0
> vinserti128 $0x1, %xmm1, %ymm0, %ymm0
> to load the { x, x + 3, x + 6, x + 9, x + 12, x + 15, x + 18, x + 21 }
> CONSTRUCTOR into a vector register.  This patch expands it as:
> movl    %edi, -20(%rsp)
> vbroadcastss    -20(%rsp), %ymm0
> vpaddd  .LC0(%rip), %ymm0, %ymm0
> instead.  Similarly for SSE2:
> leal    3(%rdi), %eax
> movl    %eax, -20(%rsp)
> leal    6(%rdi), %eax
> movd    -20(%rsp), %xmm4
> movl    %eax, -16(%rsp)
> leal    9(%rdi), %eax
> movd    -16(%rsp), %xmm1
> movl    %edi, -16(%rsp)
> movl    %eax, -12(%rsp)
> movd    -16(%rsp), %xmm0
> movd    -12(%rsp), %xmm3
> punpckldq   %xmm4, %xmm0
> punpckldq   %xmm3, %xmm1
> punpcklqdq  %xmm1, %xmm0
> instead of:
> movl    %edi, -12(%rsp)
> movd    -12(%rsp), %xmm3
> pshufd  $0, %xmm3, %xmm0
> paddd   .LC0(%rip), %xmm0
> 
> The patch leaves that decision to the target (in its
> *_expand_vector_init).
> 
> Ok?

Aww ;)  Nice improvement.  Generally when I see this I always wonder
whether we want to do this kind of stuff pre RTL expansion.
1st to not rely on being able to TER, 2nd to finally eventually
get rid of TER.

These patches are unfortunately a step backward for #2.

As for the patch, do we have a way to query whether the target
can efficiently broadcast?  If so this IMHO belongs in generic
code, not in ix86_expand_vector_init - passing down plus_constant
results may pessimize other targets that don't understand this
trick, no?  (OTOH, which vector ISA does _not_ know how to
broadcast ...?)

Thanks,
Richard.

> 2013-11-20  Jakub Jelinek  
> 
>   * expr.c (store_constructor): If all CONSTRUCTOR values are
>   the same SSA_NAME plus optional constant, pass plus_constant results
>   down to vec_init.
>   * config/i386/i386.c (ix86_expand_vector_init): Optimize it using
>   broadcast followed by addition of a constant vector.
> 
>   * gcc.dg/vect/vect-124.c: New test.
> 
> --- gcc/expr.c.jj 2013-11-19 21:56:27.0 +0100
> +++ gcc/expr.c2013-11-20 09:24:14.435860130 +0100
> @@ -6296,6 +6296,8 @@ store_constructor (tree exp, rtx target,
>   rtvec vector = NULL;
>   unsigned n_elts;
>   alias_set_type alias;
> + tree base = NULL_TREE;
> + rtx base_rtx = NULL_RTX;
>  
>   gcc_assert (eltmode != BLKmode);
>  
> @@ -6334,6 +6336,31 @@ store_constructor (tree exp, rtx target,
>   TYPE_SIZE (TREE_TYPE (value)),
>   TYPE_SIZE (elttype)));
>  
> + /* Try to detect fairly common pattern of
> +{ a_5, a_5 + 1, a_5 + 2, a_5 + 3 }.  */
> + if (vector
> + && (count == 0 || base)
> + && TREE_CODE (value) == SSA_NAME)
> +   {
> + gimple g = get_gimple_for_ssa_name (value);
> + tree this_base = value;
> + if (g && is_gimple_assign (g))
> +   switch (gimple_assign_rhs_code (g))
> + {
> + case PLUS_EXPR:
> + case POINTER_PLUS_EXPR:
> +   if (TREE_CODE (gimple_assign_rhs1 (g)) == SSA_NAME
> +   && tree_fits_shwi_p (gimple_assign_rhs2 (g)))
> + this_base = gimple_assign_rhs1 (g);
> +   break;
> + default:
> +   break;
> + }
> + if (count == 0)
> +   base = this_base;
> + else if (base != this_base)
> +   base = NULL_TREE;
> +   }
>   count += n_elts_here;
>   if (mostly_zeros_p (value))
> zero_count += n_elts_here;
> @@ -6342,6 +6369,9 @@ store_constructor (tree exp, rtx target,
>   /* Clear the entire vector first if there are any missing elements,
>  or if the incidence of zero e

Re: Ping Re: [gomp4] Dumping gimple for offload.

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 10:58 AM, Ilya Tocar  wrote:
> On 14 Nov 11:27, Richard Biener wrote:
>> > +  /* Set when symbol needs to be dumped for lto/offloading.  */
>> > +  unsigned need_dump : 1;
>> > +
>>
>> That's very non-descriptive.  What's "offloading"?  But yes, something
>> like this is what I was asking for.
>
> I've changed it into:
> Set when symbol needs to be dumped into LTO bytecode for LTO,
> or in pragma omp target case, for separate compilation targeting
> a different architecture.
>
> Ok for gomp4 branch now?

Works for me.  I'll let branch maintainers decide if it follows whatever
is done there (I haven't found time to follow stuff here).

Richard.

> 2013-11-19 Ilya Tocar  
>
> * cgraph.h (symtab_node): Add need_dump.
> * cgraphunit.c (ipa_passes): Run ipa_write_summaries for omp.
> (compile): Initialize streamer for omp.
> * ipa-inline-analysis.c (inline_generate_summary): Add flag_openmp.
> * lto-cgraph.c (lto_set_symtab_encoder_in_partition): Respect
> need_dump flag.
> (select_what_to_dump): New.
> * lto-streamer.c (section_name_prefix): New.
> (lto_get_section_name): Use section_name_prefix.
> (lto_streamer_init): Add flag_openmp.
> * lto-streamer.h (OMP_SECTION_NAME_PREFIX): New.
> (section_name_prefix): Ditto.
> (select_what_to_dump): Ditto.
> * lto/lto-partition.c (add_symbol_to_partition_1): Set need_dump.
> (lto_promote_cross_file_statics): Dump everything.
> * passes.c (ipa_write_summaries): Add parameter,
> call select_what_to_dump.
> * tree-pass.h (void ipa_write_summaries): Add parameter.
>
>
> ---
>  gcc/cgraph.h  |  5 +
>  gcc/cgraphunit.c  | 15 +--
>  gcc/ipa-inline-analysis.c |  2 +-
>  gcc/lto-cgraph.c  | 14 ++
>  gcc/lto-streamer.c|  5 +++--
>  gcc/lto-streamer.h|  6 ++
>  gcc/lto/lto-partition.c   |  3 +++
>  gcc/passes.c  |  6 --
>  gcc/tree-pass.h   |  2 +-
>  9 files changed, 50 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index fb0fe93..9f799f4 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -105,6 +105,11 @@ public:
>/* Set when symbol has address taken. */
>unsigned address_taken : 1;
>
> +  /* Set when symbol needs to be dumped into LTO bytecode for LTO,
> + or in pragma omp target case, for separate compilation targeting
> + a different architecture.  */
> +  unsigned need_dump : 1;
> +
>
>/* Ordering of all symtab entries.  */
>int order;
> diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
> index c3a8967..53cd250 100644
> --- a/gcc/cgraphunit.c
> +++ b/gcc/cgraphunit.c
> @@ -2019,7 +2019,18 @@ ipa_passes (void)
>   passes->all_lto_gen_passes);
>
>if (!in_lto_p)
> -ipa_write_summaries ();
> +{
> +  if (flag_openmp)
> +   {
> + section_name_prefix = OMP_SECTION_NAME_PREFIX;
> + ipa_write_summaries (true);
> +   }
> +  if (flag_lto)
> +   {
> + section_name_prefix = LTO_SECTION_NAME_PREFIX;
> + ipa_write_summaries (false);
> +   }
> +}
>
>if (flag_generate_lto)
>  targetm.asm_out.lto_end ();
> @@ -2110,7 +2121,7 @@ compile (void)
>cgraph_state = CGRAPH_STATE_IPA;
>
>/* If LTO is enabled, initialize the streamer hooks needed by GIMPLE.  */
> -  if (flag_lto)
> +  if (flag_lto || flag_openmp)
>  lto_streamer_hooks_init ();
>
>/* Don't run the IPA passes if there was any error or sorry messages.  */
> diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
> index 4458723..62faa52 100644
> --- a/gcc/ipa-inline-analysis.c
> +++ b/gcc/ipa-inline-analysis.c
> @@ -3813,7 +3813,7 @@ inline_generate_summary (void)
>
>/* When not optimizing, do not bother to analyze.  Inlining is still done
>   because edge redirection needs to happen there.  */
> -  if (!optimize && !flag_lto && !flag_wpa)
> +  if (!optimize && !flag_lto && !flag_wpa && !flag_openmp)
>  return;
>
>function_insertion_hook_holder =
> diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
> index 6a52da8..697c069 100644
> --- a/gcc/lto-cgraph.c
> +++ b/gcc/lto-cgraph.c
> @@ -238,6 +238,9 @@ void
>  lto_set_symtab_encoder_in_partition (lto_symtab_encoder_t encoder,
>  symtab_node *node)
>  {
> +  /* Ignore not needed nodes.  */
> +  if (!node->need_dump)
> +return;
>int index = lto_symtab_encoder_encode (encoder, node);
>encoder->nodes[index].in_partition = true;
>  }
> @@ -751,6 +754,17 @@ add_references (lto_symtab_encoder_t encoder,
>lto_symtab_encoder_encode (encoder, ref->referred);
>  }
>
> +/* Select what needs to be dumped.  In the LTO case dump everything.
> +   In the OMP target case dump only nodes marked with the attribute.  */
> +void
> +select_what_to_dump (bool is_omp)
> +{
> +  struct symtab_node *snode;
>

Re: Ping Re: [gomp4] Dumping gimple for offload.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 10:34:30AM +0100, Richard Biener wrote:
> On Tue, Nov 19, 2013 at 10:58 AM, Ilya Tocar  wrote:
> > On 14 Nov 11:27, Richard Biener wrote:
> >> > +  /* Set when symbol needs to be dumped for lto/offloading.  */
> >> > +  unsigned need_dump : 1;
> >> > +
> >>
> >> That's very non-descriptive.  What's "offloading"?  But yes, something
> >> like this is what I was asking for.
> >
> > I've changed it into:
> > Set when symbol needs to be dumped into LTO bytecode for LTO,
> > or in pragma omp target case, for separate compilation targeting
> > a different architecture.
> >
> > Ok for gomp4 branch now?
> 
> Works for me.  I'll let branch maintainers decide if it follows whatever
> is done there (I haven't found time to follow stuff here).

Ok then.

Jakub


Re: [RFA/RFC patch]: Follow-up on type-demotion pass ...

2013-11-20 Thread Richard Biener
On Mon, Nov 18, 2013 at 8:46 PM, Kai Tietz  wrote:
> Hello,
>
> this is the second approach of the patch I sent some months ago.  Since 
> no other approach has been shown, I will continue on this one ...
> The idea of this pass is to introduce a statement "normalization" and 
> "denormalization".  The "normalization" happens in forward propagation, and 
> "denormalization" is done by this type-demotion pass.  It is therefore 
> recommended that the type-demotion pass run just before the forwprop pass, 
> as otherwise we might end up with a "denormalized" representation, which in 
> some cases performs worse.
>
> The current pass supports demoting addition/subtraction to unsigned-integer 
> arithmetic to prevent overflow issues.  By this we transform operations like:
> int a,b; ... (unsigned int) (a + b) ... into ((unsigned int) a) + ((unsigned 
> int) b).
> This transformation needs to be done after the first overflow analysis, as we 
> would otherwise defeat overflow detection by this rewrite.  We could do the 
> same for MULT_EXPR if there were not the late analysis of its "possible" 
> overflow; therefore I disabled the transformation for it.
>
> Then it supports shift-operation type-sinking/type-raising.  This seems to be 
> the most interesting part of this patch - besides the constant-propagation 
> optimization seen in the ts-add-3.c testcase - as this kind of optimization is 
> new to gcc.  The part in forward propagation is required to define the 
> "normal" notation of such shift expressions.  As a nice side effect of these 
> transformations, type over-widening no longer occurs on some operations in 
> vectorized code.
>
> Additionally, this pass catches useless type conversions and simplifies 
> them.  Therefore I kept the type-cast reduction code within this pass.  Of 
> course we could move it out to some other place, like tree-ssa-gimple, or 
> even gimple might be a place.  But for now I would like to keep all those 
> pieces together.

This is not a review of the pass either.

1) Can you please split out the forwprop pieces and motivate them
separately?
2) How did you decide on pass placement?  It looks like you do
it twice (eh?) and before forwprop passes.  What's the difference
between the passes?
3) You don't add a flag to disable this pass; please add one.
4) You compute post-dominators but apparently only to specify the
basic-block walk order - instead use one of the basic-block
ordering functions.

Richard.

>
> ChangeLog gcc
>
> * Makefile.in: Add tree-ssa-te.c to build.
> * passes.def: Add typedemote passes.
> * tree-pass.h (make_pass_demote1): Add prototype.
> (make_pass_demote2): Likewise.
> * tree-ssa-forwprop.c (simplify_shift):  New function.
> (ssa_forward_propagate_and_combine): Use it.
> *  tree-ssa-te.c: New pass.
>
> Changelog gcc/testsuite:
>
> * gcc.dg/tree-ssa/scev-cast.c: Adjust test.
> * gcc.dg/tree-ssa/ssa-fre-2.c: Likewise.
> * gcc.dg/tree-ssa/ssa-fre-3.c: Likewise.
> * gcc.dg/tree-ssa/ssa-fre-4.c: Likewise.
> * gcc.dg/tree-ssa/ssa-fre-5.c: Likewise.
> * gcc.dg/vect/vect-over-widen-1-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-1.c: Likewise.
> * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-3.c: Likewise.
> * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-4.c: Likewise.
> * gcc.dg/tree-ssa/ts-add-1.c: New test.
> * gcc.dg/tree-ssa/ts-add-2.c: New test.
> * gcc.dg/tree-ssa/ts-add-3.c: New test.
> * gcc.dg/tree-ssa/ts-shift-1.c: New test.
>
> Test for x86_64-unknown-linux-gnu, and i686-pc-cygwin, and x86_64-pc-cygwin.
>
> Ok for apply?
>
> Regards,
> Kai
>
>
> Index: gcc-trunk/gcc/Makefile.in
> ===
> --- gcc-trunk.orig/gcc/Makefile.in
> +++ gcc-trunk/gcc/Makefile.in
> @@ -1430,6 +1430,7 @@ OBJS = \
> tree-ssa-pre.o \
> tree-ssa-propagate.o \
> tree-ssa-reassoc.o \
> +   tree-ssa-te.o \
> tree-ssa-sccvn.o \
> tree-ssa-sink.o \
> tree-ssa-strlen.o \
> Index: gcc-trunk/gcc/passes.def
> ===
> --- gcc-trunk.orig/gcc/passes.def
> +++ gcc-trunk/gcc/passes.def
> @@ -64,6 +64,7 @@ along with GCC; see the file COPYING3.
>   NEXT_PASS (pass_remove_cgraph_callee_edges);
>   NEXT_PASS (pass_rename_ssa_copies);
>   NEXT_PASS (pass_ccp);
> + NEXT_PASS (pass_demote1);
>   /* After CCP we rewrite no longer addressed locals into SSA
>  form if possible.  */
>   NEXT_PASS (pass_forwprop);
> @@ -141,6 +142,7 @@ along with GCC; see the file COPYING3.
>/* After CCP we rewrite no longer addressed locals into SSA

Re: [1/4] Use tree_to_uhwi with an inlined tree_fits_uhwi_p test

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 12:54 PM, Richard Sandiford
 wrote:
> check_function_arguments_recurse has an assert that is equivalent
> to tree_fits_uhwi_p.  The extraction can then use tree_to_uhwi.
>
> Asserting here makes the intent obvious, but tree_to_uhwi also asserts
> for the same thing, so an alternative would be to use tree_to_uhwi on
> its own.

Please do that.  Ok with this change.

Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/c-family/
> 2013-11-19  Kenneth Zadeck  
>
> * c-common.c (check_function_arguments_recurse): Use tree_fits_uhwi_p
> and tree_to_uhwi.
>
> Index: gcc/c-family/c-common.c
> ===
> --- gcc/c-family/c-common.c 2013-11-19 10:53:54.965643984 +
> +++ gcc/c-family/c-common.c 2013-11-19 11:08:41.797920627 +
> @@ -9209,10 +9209,9 @@ check_function_arguments_recurse (void (
>to be valid.  */
> format_num_expr = TREE_VALUE (TREE_VALUE (attrs));
>
> -   gcc_assert (TREE_CODE (format_num_expr) == INTEGER_CST
> -   && !TREE_INT_CST_HIGH (format_num_expr));
> +   gcc_assert (tree_fits_uhwi_p (format_num_expr));
>
> -   format_num = TREE_INT_CST_LOW (format_num_expr);
> +   format_num = tree_to_uhwi (format_num_expr);
>
> for (inner_arg = first_call_expr_arg (param, &iter), i = 1;
>  inner_arg != 0;


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
> Aww ;)  Nice improvement.  Generally when I see this I always wonder
> whether we want to do this kind of stuff pre RTL expansion.
> 1st to not rely on being able to TER, 2nd to finally eventually
> get rid of TER.
> 
> These patches are unfortunately a step backward for #2.
> 
> As of the patch, do we have a way to query whether the target
> can efficiently broadcast?  If so this IMHO belongs in generic

We don't.  Perhaps if we'd add optab for vec_dup and mentioned
clearly in the documentation that it should be used only if it is reasonably
efficient.  But still, even with optab, it would probably better to do it
in the veclower* passes than in the vectorizer itself.

Jakub


Re: [3/4] Avoid undefined operation in overflow check

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 1:06 PM, Richard Sandiford
 wrote:
> This is a case where tree_to_shwi can be used instead of TREE_INT_CST_LOW.
> I separated it out because it was using a signed "x * y / y == x" to check
> whether "x * y" overflows a HWI, which relies on undefined behaviour.

Please CSE tree_to_shwi (size)  (yeah, you may use an ugly

  && (hsz = tree_to_shwi (size)) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT

Ok with that change.
Thanks,
Richard.

> Thanks,
> Richard
>
>
> gcc/
> * tree-ssa-alias.c (ao_ref_init_from_ptr_and_size): Avoid signed
> overflow.  Use tree_to_shwi.
>
> Index: gcc/tree-ssa-alias.c
> ===
> --- gcc/tree-ssa-alias.c2013-11-19 10:53:54.965643984 +
> +++ gcc/tree-ssa-alias.c2013-11-19 11:08:51.882992035 +
> @@ -615,9 +615,8 @@ ao_ref_init_from_ptr_and_size (ao_ref *r
>ref->offset += extra_offset;
>if (size
>&& tree_fits_shwi_p (size)
> -  && TREE_INT_CST_LOW (size) * BITS_PER_UNIT / BITS_PER_UNIT
> -== TREE_INT_CST_LOW (size))
> -ref->max_size = ref->size = TREE_INT_CST_LOW (size) * BITS_PER_UNIT;
> +  && tree_to_shwi (size) <= HOST_WIDE_INT_MAX / BITS_PER_UNIT)
> +ref->max_size = ref->size = tree_to_shwi (size) * BITS_PER_UNIT;
>else
>  ref->max_size = ref->size = -1;
>ref->ref_alias_set = 0;


Re: [wwwdocs] Broken links

2013-11-20 Thread Patrick Marlier
+ Specify [docs] in the subject header.

Also [wwwdocs] instead of [docs].

--
Patrick Marlier


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich  wrote:
> 2013/11/19 Jeff Law :
>> On 11/19/13 05:20, Ilya Enkovich wrote:
>>>
>>> 2013/11/19 Richard Biener :

 On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich 
 wrote:
>
> 2013/11/18 Jeff Law :
>>
>> On 11/18/13 11:27, Ilya Enkovich wrote:
>>>
>>>
>>>
>>> How does a pointer passed to a regular function differ from a pointer passed
>>> to a split function? How do I know then which pointer is to be passed
>>> with bounds and which one is not? Moreover, the current ABI does not allow
>>> passing bounds with no pointer, or passing bounds for only some of the
>>> pointers in the call.
>>
>>
>> But I don't see any case in function splitting where we're going to
>> want to
>> pass the pointer without the bounds.  If you want the former, you're
>> going
>> to want the latter.
>
>
> There are at least cases when checks are eliminated or when lots of
> pointer usages are accompanied with few checks performed earlier (e.g.
> we are working with array). In such cases splitted part may easily get
> no bounds.
>
>>
>> I really don't see why you need to do anything special here.  At the
>> most an
>> assert in the splitting code to ensure that you don't have a situation
>> where
>> there's mixed pointers with bounds and pointers without bounds should
>> be all
>> you need or that you passed a bounds with no associated pointer :-)
>
>
> It would also require generation of proper bind_bounds calls in the
> original function and arg_bounds calls in a separated part. So,
> special support is required.


 Well, only to keep proper instrumentation.  I hope code still works
 (doesn't trap) when optimizations "wreck" the bounds?  Thus all
 these patches are improving bounds propagation but are not required
 for correctness?  If so please postpone all of them until after the
 initial support is merged.  If not, please make sure BND instrumentation
 works conservatively when optimizations wreck it.
>>>
>>>
>>> All patches I sent for optimization passes are required to avoid ICEs
>>> when compiling instrumented code.
>>
>> Then I think we're going to need to understand them in more detail. That's
>> going to mean testcases, probably dumps and some commentary about what went
>> wrong.
>>
>> I can't speak for Richi, but when optimizations get disabled, I tend to want
>> to really understand why and make sure we're not papering over a larger
>> problem.
>>
>> The tail recursion elimination one we're discussing now is a great example.
>> At this point I understand the problem you're running into, but I'm still
>> trying to wrap my head around the implications of the funny semantics of
>> __builtin_arg_bounds and how they may cause other problems.
>
> The root of all problems is the implicit data flow hidden in arg_bounds and
> bind_bounds.  Calls consume bounds and the compiler does not know it.  And
> input bounds are always expressed via arg_bounds calls and never
> expressed via formal args.  Obviously optimizers have to be taught
> about these data dependencies to work correctly.
>
> I agree semantics of arg_bounds call creates many issues for
> optimizers but currently I do not see a better replacement for it.

But it looks incredibly fragile if you ICE once something you don't like
happens.  You should be able to easily detect the case and "punt",
that is, drop to non-instrumented aka invalidating bounds.

Thus, I really really don't like these patches.  They hint at some
deeper problem with the overall design (or the HW feature or the
accompanying ABI).

Richard.

> Ilya
>
>>
>>
>> jeff
>>


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 9:29 PM, Ilya Enkovich  wrote:
> On 19 Nov 12:33, Jeff Law wrote:
>> On 11/19/13 05:13, Ilya Enkovich wrote:
>> >On 19 Nov 13:00, Richard Biener wrote:
>> >>I'd say not in the gimplifier either but in varpool (symbol table) code
>> >>where the symbols are ultimatively registered with?
>> >
>> >Something like that?
>> >
>> >--- a/gcc/varpool.c
>> >+++ b/gcc/varpool.c
>> >@@ -151,6 +151,10 @@ varpool_node_for_decl (tree decl)
>> >node = varpool_create_empty_node ();
>> >node->decl = decl;
>> >symtab_register_node (node);
>> >+
>> >+  if (DECL_INITIAL (decl))
>> >+chkp_register_var_initializer (decl);
>> >+
>> >return node;
>> >  }
>> Yea, I think that's what Richi is suggesting.
>> jeff
>>
>
> Great!  Here is a full version of the patch.  Bootstrap, make check and MPX 
> tests are OK with the change.

Please don't do this in varpool_node_for_decl but in
varpool_finalize_decl.

Richard.

> Thanks,
> Ilya
> --
> 2013-11-19  Ilya Enkovich  
>
> * varpool.c: Include tree-chkp.h.
> (varpool_node_for_decl): Register statically
> initialized decls in Pointer Bounds Checker.
>
>
> diff --git a/gcc/varpool.c b/gcc/varpool.c
> index 471db82..8487b6e 100644
> --- a/gcc/varpool.c
> +++ b/gcc/varpool.c
> @@ -34,6 +34,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "output.h"
>  #include "gimple.h"
>  #include "flags.h"
> +#include "tree-chkp.h"
>
>  /* List of hooks triggered on varpool_node events.  */
>  struct varpool_node_hook_list {
> @@ -151,6 +152,10 @@ varpool_node_for_decl (tree decl)
>node = varpool_create_empty_node ();
>node->decl = decl;
>symtab_register_node (node);
> +
> +  if (DECL_INITIAL (decl))
> +chkp_register_var_initializer (decl);
> +
>return node;
>  }
>


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 10:57 AM, Richard Biener
 wrote:
> On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich  wrote:
>> 2013/11/19 Jeff Law :
>>> On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener :
>
> On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich 
> wrote:
>>
>> 2013/11/18 Jeff Law :
>>>
>>> On 11/18/13 11:27, Ilya Enkovich wrote:



 How does a pointer passed to a regular function differ from a pointer passed
 to a split function? How do I know then which pointer is to be passed
 with bounds and which one is not? Moreover, the current ABI does not allow
 passing bounds with no pointer, or passing bounds for only some of the
 pointers in the call.
>>>
>>>
>>> But I don't see any case in function splitting where we're going to
>>> want to
>>> pass the pointer without the bounds.  If you want the former, you're
>>> going
>>> to want the latter.
>>
>>
>> There are at least cases when checks are eliminated or when lots of
>> pointer usages are accompanied with few checks performed earlier (e.g.
>> we are working with array). In such cases splitted part may easily get
>> no bounds.
>>
>>>
>>> I really don't see why you need to do anything special here.  At the
>>> most an
>>> assert in the splitting code to ensure that you don't have a situation
>>> where
>>> there's mixed pointers with bounds and pointers without bounds should
>>> be all
>>> you need or that you passed a bounds with no associated pointer :-)
>>
>>
>> It would also require generation of proper bind_bounds calls in the
>> original function and arg_bounds calls in a separated part. So,
>> special support is required.
>
>
> Well, only to keep proper instrumentation.  I hope code still works
> (doesn't trap) when optimizations "wreck" the bounds?  Thus all
> these patches are improving bounds propagation but are not required
> for correctness?  If so please postpone all of them until after the
> initial support is merged.  If not, please make sure BND instrumentation
> works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.
>>>
>>> Then I think we're going to need to understand them in more detail. That's
>>> going to mean testcases, probably dumps and some commentary about what went
>>> wrong.
>>>
>>> I can't speak for Richi, but when optimizations get disabled, I tend to want
>>> to really understand why and make sure we're not papering over a larger
>>> problem.
>>>
>>> The tail recursion elimination one we're discussing now is a great example.
>>> At this point I understand the problem you're running into, but I'm still
>>> trying to wrap my head around the implications of the funny semantics of
>>> __builtin_arg_bounds and how they may cause other problems.
>>
>> The root of all problems is the implicit data flow hidden in arg_bounds and
>> bind_bounds.  Calls consume bounds and the compiler does not know it.  And
>> input bounds are always expressed via arg_bounds calls and never
>> expressed via formal args.  Obviously optimizers have to be taught
>> about these data dependencies to work correctly.
>>
>> I agree semantics of arg_bounds call creates many issues for
>> optimizers but currently I do not see a better replacement for it.
>
> But it looks incredibly fragile if you ICE once something you don't like
> happens.  You should be able to easily detect the case and "punt",
> that is, drop to non-instrumented aka invalidating bounds.
>
> Thus, I really really don't like these patches.  They hint at some
> deeper problem with the overall design (or the HW feature or the
> accompanying ABI).

Note that this, the intrusiveness of the feature and the questionable
gain makes me question whether GCC should have support for this
feature (and whether we really should rush this in this late).

Thus, I hereby formally ask to push back this feature to 4.10.

Thanks,
Richard.

> Richard.
>
>> Ilya
>>
>>>
>>>
>>> jeff
>>>


Re: [buildrobot] [PATCH] mips: Really remove ENTRY_BLOCK_PTR

2013-11-20 Thread Jan-Benedict Glaw
On Wed, 2013-11-20 10:08:45 +0100, Steven Bosscher  
wrote:
[...]
> I wonder if there are any more cases like this missed... Could you
> please check that? Something like:
> 
> egrep -w "ENTRY_BLOCK_PTR|EXIT_BLOCK_PTR" gcc/*.[ch] gcc/config/*.[ch]
> gcc/config/*/*.{c,h,md}

No more uses, but 21 comment lines contain references to the macros.

MfG, JBG

-- 
  Jan-Benedict Glaw  jbg...@lug-owl.de  +49-172-7608481
Signature of: Friends are relatives you make for yourself.
the second  :


signature.asc
Description: Digital signature


Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law  wrote:
> On 11/18/13 14:03, Ilya Enkovich wrote:
>>
>> 2013/11/19 Jeff Law :
>>>
>>> On 11/18/13 12:16, Ilya Enkovich wrote:


 With current recursion elimination we will have:

 test (int *param1)
 {
 :

 :
 _7 = PHI
 bounds2 = __builtin_arg_bounds (_7) -- WRONG
>>>
>>>
>>> I wouldn't say it's wrong.  It's conservatively correct if properly
>>> implemented.   Why precisely do you consider this wrong?  If your code
>>> can't
>>> handle it, it really should be fixed.
>>
>>
>> It is wrong because __builtin_arg_bounds is used to get bounds of
>> input parameter and PHI node here is not an input parameter.
>
> OK, now we're getting somewhere.  So we've got this odd little function
> which only works on parameters.   I can't help but feel this is a bit of
> mis-design coming back to haunt us and I wonder if we're going to see other
> instances of this kind of problem.

Right.

> There's something just wrong with the semantics of __builtin_arg_bounds, but
> I'm having trouble putting my finger on it.

Well, the issue is that it accepts any pointer argument as input.
I originally thought it may just get an integer constant argument - the
argument position.  It really seems to me that we have
__builtin_arg_boundsN (), but to avoid having N builtins we specify N
via an argument to the function.  But as it may not change we should
use something that cannot change - like a constant.  A constant
that identifies a parameter is one that gives its position for example.

Note that any restrictions you impose on what is "valid" for GIMPLE
should be verified in tree-cfg.c:verify_gimple_* - in the
__builtin_arg_bounds call case in verify_gimple_call.

Richard.

>
>
>>
>> Correct handling of __builtin_arg_bounds in this 'wrong' example requires
>> giving it PHI-node semantics based on its argument.  To me that seems
>> more complex than generating a regular PHI node for bounds, and it makes
>> GIMPLE less readable.
>
> But presumably you have code to merge bound information already, right?  You
> need that for PHI nodes.  Can't you use that to walk up the use-def chains
> and build the bounds information?
>
> Or if you want to fix the tailcall stuff, you can start down that direction.
> I don't think that'll fix the nagging concerns about the overall semantics
> of builtin_arg_bounds, but it might be enough for you to go forward.
>
> jeff


Re: [PATCH, 4.8] Backport patches to fix parallel build fail PR 57683

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:30 AM, Zhenqiang Chen
 wrote:
> Hi,
>
> The issue files in PR 57683 include lra-constraints.o,
> lra-eliminations.o and tree-switch-conversion.o.
>
> Jeff had fixed them in trunk (r197467 and r198999):
>
> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00957.html
> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg00211.html
>
> Can we backport the two patches to fix the parallel build issues in 4.8?

Sure, please go ahead.

Richard.

> Thanks!
> -Zhenqiang


Re: [PATCH] Improve { x, x + 3, x + 6, x + 9 } expansion

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Jakub Jelinek wrote:

> On Wed, Nov 20, 2013 at 10:31:38AM +0100, Richard Biener wrote:
> > Aww ;)  Nice improvement.  Generally when I see this I always wonder
> > whether we want to do this kind of stuff pre RTL expansion.
> > 1st to not rely on being able to TER, 2nd to finally eventually
> > get rid of TER.
> > 
> > These patches are unfortunately a step backward for #2.
> > 
> > As of the patch, do we have a way to query whether the target
> > can efficiently broadcast?  If so this IMHO belongs in generic
> 
> We don't.  Perhaps if we'd add optab for vec_dup and mentioned
> clearly in the documentation that it should be used only if it is reasonably
> efficient.  But still, even with optab, it would probably better to do it
> in the veclower* passes than in the vectorizer itself.

Yes, veclower would be a good fit.

I was of course thinking of a new "apply TER" pass right before cfgexpand
(for the cases where TER ends up doing some kind of "folding").

Richard.


[PATCH] Fix PR59035

2013-11-20 Thread Richard Biener

The following makes sure the languages default of -ffp-contract
is properly transfered to the LTO stage (which defaults to "fast").
Merging different options from different TUs is done conservatively.

LTO bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2013-11-20  Richard Biener  

PR lto/59035
* lto-opts.c (lto_write_options): Write defaults only if
they were not explicitly specified.  Also write
-ffp-contract default.
* lto-wrapper.c (merge_and_complain): Merge -ffp-contract
conservatively.
(run_gcc): Pass through -ffp-contract.

Index: gcc/lto-opts.c
===
*** gcc/lto-opts.c  (revision 205009)
--- gcc/lto-opts.c  (working copy)
*** lto_write_options (void)
*** 85,98 
   function rather than per compilation unit.  */
/* -fexceptions causes the EH machinery to be initialized, enabling
   generation of unwind data so that explicit throw() calls work.  */
!   if (global_options.x_flag_exceptions)
  append_to_collect_gcc_options (&temporary_obstack, &first_p,
   "-fexceptions");
/* -fnon-call-exceptions changes the generation of exception
regions.  It is enabled implicitly by the Go frontend.  */
!   if (global_options.x_flag_non_call_exceptions)
  append_to_collect_gcc_options (&temporary_obstack, &first_p,
   "-fnon-call-exceptions");
  
/* Output explicitly passed options.  */
for (i = 1; i < save_decoded_options_count; ++i)
--- 85,119 
   function rather than per compilation unit.  */
/* -fexceptions causes the EH machinery to be initialized, enabling
   generation of unwind data so that explicit throw() calls work.  */
!   if (!global_options_set.x_flag_exceptions
!   && global_options.x_flag_exceptions)
  append_to_collect_gcc_options (&temporary_obstack, &first_p,
   "-fexceptions");
/* -fnon-call-exceptions changes the generation of exception
regions.  It is enabled implicitly by the Go frontend.  */
!   if (!global_options_set.x_flag_non_call_exceptions
!   && global_options.x_flag_non_call_exceptions)
  append_to_collect_gcc_options (&temporary_obstack, &first_p,
   "-fnon-call-exceptions");
+   /* The default -ffp-contract changes depending on the language
+  standard.  Pass thru conservative standard settings.  */
+   if (!global_options_set.x_flag_fp_contract_mode)
+ switch (global_options.x_flag_fp_contract_mode)
+   {
+   case FP_CONTRACT_OFF:
+   append_to_collect_gcc_options (&temporary_obstack, &first_p,
+  "-ffp-contract=off");
+   break;
+   case FP_CONTRACT_ON:
+   append_to_collect_gcc_options (&temporary_obstack, &first_p,
+  "-ffp-contract=on");
+   break;
+   case FP_CONTRACT_FAST:
+   /* Nothing.  That merges conservatively and is the default for LTO.  */
+   break;
+   default:
+   gcc_unreachable ();
+   }
  
/* Output explicitly passed options.  */
for (i = 1; i < save_decoded_options_count; ++i)
Index: gcc/lto-wrapper.c
===
*** gcc/lto-wrapper.c   (revision 205009)
--- gcc/lto-wrapper.c   (working copy)
*** merge_and_complain (struct cl_decoded_op
*** 422,427 
--- 422,439 
append_option (decoded_options, decoded_options_count, foption);
  break;
  
+   case OPT_ffp_contract_:
+ /* For selected options we can merge conservatively.  */
+ for (j = 0; j < *decoded_options_count; ++j)
+   if ((*decoded_options)[j].opt_index == foption->opt_index)
+ break;
+ if (j == *decoded_options_count)
+   append_option (decoded_options, decoded_options_count, foption);
+ /* FP_CONTRACT_OFF < FP_CONTRACT_ON < FP_CONTRACT_FAST.  */
+ else if (foption->value < (*decoded_options)[j].value)
+   (*decoded_options)[j] = *foption;
+ break;
+ 
case OPT_freg_struct_return:
case OPT_fpcc_struct_return:
  for (j = 0; j < *decoded_options_count; ++j)
*** run_gcc (unsigned argc, char *argv[])
*** 578,583 
--- 590,596 
case OPT_fgnu_tm:
case OPT_freg_struct_return:
case OPT_fpcc_struct_return:
+   case OPT_ffp_contract_:
  break;
  
default:


Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener :
> On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law  wrote:
>> On 11/18/13 14:03, Ilya Enkovich wrote:
>>>
>>> 2013/11/19 Jeff Law :

 On 11/18/13 12:16, Ilya Enkovich wrote:
>
>
> With current recursion elimination we will have:
>
> test (int *param1)
> {
> :
>
> :
> _7 = PHI
> bounds2 = __builtin_arg_bounds (_7) -- WRONG


 I wouldn't say it's wrong.  It's conservatively correct if properly
 implemented.   Why precisely do you consider this wrong?  If your code
 can't
 handle it, it really should be fixed.
>>>
>>>
>>> It is wrong because __builtin_arg_bounds is used to get bounds of
>>> input parameter and PHI node here is not an input parameter.
>>
>> OK, now we're getting somewhere.  So we've got this odd little function
>> which only works on parameters.   I can't help but feel this is a bit of
>> mis-design coming back to haunt us and I wonder if we're going to see other
>> instances of this kind of problem.
>
> Right.
>
>> There's something just wrong with the semantics of __builtin_arg_bounds, but
>> I'm having trouble putting my finger on it.
>
> Well, the issue is that it accepts any pointer argument as input.
> I originally thought it may just get an integer constant argument - the
> argument position.  It really seems to me that we have
> __builtin_arg_boundsN (), but to avoid having N builtins we specify N
> via an argument to the function.  But as it may not change we should
> use something that cannot change - like a constant.  A constant
> that identifies a parameter is one that gives its position for example.

I have tried passing a constant to arg_bounds.  It still requires
additional modification of optimization passes (including tail
recursion, parameter propagation, inlining etc.).  But with a constant as
the argument you really cannot detect errors.  E.g. if you inline and do
not fix up the inlined arg_bounds calls correctly, you may not get an ICE
but a wrong program which uses wrong bounds and produces false bounds
violations.

>
> Note that any restrictions you impose on what is "valid" for GIMPLE
> should be verified in tree-cfg.c:verify_gimple_* - in the
> __builtin_arg_bounds call case in verify_gimple_call.

Will add it.

Thanks,
Ilya

>
> Richard.
>
>>
>>
>>>
>>> Correct handling of __builtin_arg_bounds in this 'wrong' example requires
>>> giving it PHI-node semantics based on its argument.  To me that seems
>>> more complex than generating a regular PHI node for bounds, and it makes
>>> GIMPLE less readable.
>>
>> But presumably you have code to merge bound information already, right?  You
>> need that for PHI nodes.  Can't you use that to walk up the use-def chains
>> and build the bounds information?
>>
>> Or if you want to fix the tailcall stuff, you can start down that direction.
>> I don't think that'll fix the nagging concerns about the overall semantics
>> of builtin_arg_bounds, but it might be enough for you to go forward.
>>
>> jeff


[PATCH] Strict volatile bit-fields clean-up

2013-11-20 Thread Bernd Edlinger
Hello Richard,

as a follow-up patch to the bit-fields patch(es), I wanted to remove the 
dependencies on
the variable flag_strict_volatile_bitfields from expand_assignment and 
expand_expr_real_1.
Additionally I want the access mode of the field to be selected in the memory 
context,
instead of the structure's mode.

Boot-strapped and regression-tested on x86_64-linux-gnu.

OK for trunk?

Thanks
Bernd.


FYI - these are my in-flight patches, which would be nice to go into the GCC 
4.9.0 release:


http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02046.html :[PATCH] reimplement 
-fstrict-volatile-bitfields v4, part 1/2
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02025.html :[PATCH] Fix C++0x 
memory model for -fno-strict-volatile-bitfields on ARM
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02291.html :
[PATCH, PR 57748] Check for out of bounds access, Part 2
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00581.html :[PATCH, PR 57748] 
Check for out of bounds access, Part 3
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00133.html : [PATCH] Fix PR58115

2013-11-20  Bernd Edlinger  

* expr.c (expand_assignment): Remove dependency on
flag_strict_volatile_bitfields. Always set the memory 
access mode.
(expand_expr_real_1): Likewise.



patch-bitfields-cleanup.diff
Description: Binary data


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener :
> On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich  wrote:
>> 2013/11/19 Jeff Law :
>>> On 11/19/13 05:20, Ilya Enkovich wrote:

 2013/11/19 Richard Biener :
>
> On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich 
> wrote:
>>
>> 2013/11/18 Jeff Law :
>>>
>>> On 11/18/13 11:27, Ilya Enkovich wrote:



 How does a pointer passed to a regular function differ from a pointer
 passed to a split function? How do I know then which pointer is to be
 passed with bounds and which one is not? Moreover, the current ABI does
 not allow passing bounds with no pointer, or passing bounds for only
 some pointers in the call.
>>>
>>>
>>> But I don't see any case in function splitting where we're going to
>>> want to
>>> pass the pointer without the bounds.  If you want the former, you're
>>> going
>>> to want the latter.
>>
>>
>> There are at least cases when checks are eliminated or when lots of
>> pointer usages are accompanied by few checks performed earlier (e.g.
>> we are working with an array). In such cases the split part may easily
>> get no bounds.
>>
>>>
>>> I really don't see why you need to do anything special here.  At the
>>> most an
>>> assert in the splitting code to ensure that you don't have a situation
>>> where
>>> there's mixed pointers with bounds and pointers without bounds should
>>> be all
>>> you need or that you passed a bounds with no associated pointer :-)
>>
>>
>> It would also require generation of proper bind_bounds calls in the
>> original function and arg_bounds calls in a separated part. So,
>> special support is required.
>
>
> Well, only to keep proper instrumentation.  I hope code still works
> (doesn't trap) when optimizations "wreck" the bounds?  Thus all
> these patches are improving bounds propagation but are not required
> for correctness?  If so please postpone all of them until after the
> initial support is merged.  If not, please make sure BND instrumentation
> works conservatively when optimizations wreck it.


 All patches I sent for optimization passes are required to avoid ICEs
 when compiling instrumented code.
>>>
>>> Then I think we're going to need to understand them in more detail. That's
>>> going to mean testcases, probably dumps and some commentary about what went
>>> wrong.
>>>
>>> I can't speak for Richi, but when optimizations get disabled, I tend to want
>>> to really understand why and make sure we're not papering over a larger
>>> problem.
>>>
>>> The tail recursion elimination one we're discussing now is a great example.
>>> At this point I understand the problem you're running into, but I'm still
>>> trying to wrap my head around the implications of the funny semantics of
>>> __builtin_arg_bounds and how they may cause other problems.
>>
>> The root of all problems is the implicit data flow hidden in arg_bounds
>> and bind_bounds.  Calls consume bounds and the compiler does not know it.
>> And input bounds are always expressed via arg_bounds calls and never
>> expressed via formal args.  Obviously optimizers have to be taught
>> about these data dependencies to work correctly.
>>
>> I agree the semantics of the arg_bounds call create many issues for
>> optimizers, but currently I do not see a better replacement for it.
>
> But it looks incredibly fragile if you ICE once something you don't like
> happens.  You should be able to easily detect the case and "punt",
> that is, drop to non-instrumented aka invalidating bounds.

Actually, it is easy to detect such cases and invalidate bounds each time
some optimization breaks the bounds data flow.  With such a feature, 5-6
of the sent patches would be unnecessary to successfully build
instrumented code at -O2, but without them the quality of checks would be
much lower (mostly due to inlining).

Ilya

>
> Thus, I really really don't like these patches.  They hint at some
> deeper problem with the overall design (or the HW feature or the
> accompaning ABI).
>
> Richard.
>
>> Ilya
>>
>>>
>>>
>>> jeff
>>>


Re: [PATCH, MPX, 2/X] Pointers Checker [16/25] Tail recursion

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 11:41 AM, Ilya Enkovich  wrote:
> 2013/11/20 Richard Biener :
>> On Tue, Nov 19, 2013 at 8:02 PM, Jeff Law  wrote:
>>> On 11/18/13 14:03, Ilya Enkovich wrote:

 2013/11/19 Jeff Law :
>
> On 11/18/13 12:16, Ilya Enkovich wrote:
>>
>>
>> With current recursion elimination we will have:
>>
>> test (int *param1)
>> {
>> :
>>
>> :
>> _7 = PHI
>> bounds2 = __builtin_arg_bounds (_7) -- WRONG
>
>
> I wouldn't say it's wrong.  It's conservatively correct if properly
> implemented.   Why precisely do you consider this wrong?  If your code
> can't
> handle it, it really should be fixed.


 It is wrong because __builtin_arg_bounds is used to get the bounds of an
 input parameter, and the PHI node here is not an input parameter.
>>>
>>> OK, now we're getting somewhere.  So we've got this odd little function
>>> which only works on parameters.   I can't help but feel this is a bit of
>>> mis-design coming back to haunt us and I wonder if we're going to see other
>>> instances of this kind of problem.
>>
>> Right.
>>
>>> There's something just wrong with the semantics of __builtin_arg_bounds, but
>>> I'm having trouble putting my finger on it.
>>
>> Well, the issue is that it accepts any pointer argument as input.
>> I originally thought it may just get an integer constant argument - the
>> argument position.  It really seems to me that we have
>> __builtin_arg_boundsN (), but to avoid having N builtins we specify N
>> via an argument to the function.  But as it may not change we should
>> use something that cannot change - like a constant.  A constant
>> that identifies a parameter is one that gives its position for example.
>
> I have tried to pass a constant for arg_bounds.  It still requires
> additional modification of optimization passes (including tail
> recursion, param propagation, inline etc.)  But with a constant as the
> argument you really cannot detect errors.  E.g. if you inline and do
> not fix the inlined arg_bounds calls correctly, you may not get an ICE
> but instead get a wrong program which uses wrong bounds and produces
> false bounds violations.

Yeah, that's why I refrained from suggesting this earlier ;)  You lose
the connection between "bounds for argument X" and "function entry
value for argument X".

Richard.

>>
>> Note that any restrictions you impose on what is "valid" for GIMPLE
>> should be verified in tree-cfg.c:verify_gimple_* - in the
>> __builtin_arg_bounds call case in verify_gimple_call.
>
> Will add it.
>
> Thanks,
> Ilya
>
>>
>> Richard.
>>
>>>
>>>
 Correct handling of __builtin_arg_bounds in this 'wrong' example requires
 adding PHI-node semantics to it based on its arg.  For me it seems
 more complex than generating a regular PHI node for bounds and makes
 GIMPLE less readable.
>>>
>>> But presumably you have code to merge bound information already, right?  You
>>> need that for PHI nodes.  Can't you use that to walk up the use-def chains
>>> and build the bounds information?
>>>
>>> Or if you want to fix the tailcall stuff, you can start down that direction.
>>> I don't think that'll fix the nagging concerns about the overall semantics
>>> of builtin_arg_bounds, but it might be enough for you to go forward.
>>>
>>> jeff
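
The problem discussed above can be seen in plain C.  A minimal sketch (the function names and code are illustrative, not taken from the patch): when tail recursion is eliminated, the pointer parameter stops being the incoming argument and becomes a loop-carried value — a PHI node in GIMPLE — so asking for "argument bounds" of it no longer makes sense.

```c
#include <assert.h>

/* Tail-recursive form: p is always the incoming argument, so
   __builtin_arg_bounds (p) would be well-defined on entry.  */
static int
sum (int *p, int n, int acc)
{
  if (n == 0)
    return acc;
  return sum (p + 1, n - 1, acc + *p);
}

/* After tail-recursion elimination: p is now a loop-carried value
   merging the entry value and p + 1 -- in GIMPLE, a PHI node.  */
static int
sum_loop (int *p, int n, int acc)
{
  for (; n != 0; --n, ++p)
    acc += *p;
  return acc;
}
```

Both functions compute the same result; only the second exhibits the PHI-fed pointer that breaks the "bounds of an input parameter" assumption.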


Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 11:55 AM, Ilya Enkovich  wrote:
> 2013/11/20 Richard Biener :
>> On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich  
>> wrote:
>>> 2013/11/19 Jeff Law :
 On 11/19/13 05:20, Ilya Enkovich wrote:
>
> 2013/11/19 Richard Biener :
>>
>> On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich 
>> wrote:
>>>
>>> 2013/11/18 Jeff Law :

 On 11/18/13 11:27, Ilya Enkovich wrote:
>
>
>
> How does a pointer passed to a regular function differ from a pointer
> passed to a split function? How do I know then which pointer is to be
> passed with bounds and which one is not? Moreover, the current ABI does
> not allow passing bounds with no pointer, or passing bounds for only
> some pointers in the call.


 But I don't see any case in function splitting where we're going to
 want to
 pass the pointer without the bounds.  If you want the former, you're
 going
 to want the latter.
>>>
>>>
>>> There are at least cases when checks are eliminated or when lots of
>>> pointer usages are accompanied by few checks performed earlier (e.g.
>>> we are working with an array). In such cases the split part may easily
>>> get no bounds.
>>>

 I really don't see why you need to do anything special here.  At the
 most an
 assert in the splitting code to ensure that you don't have a situation
 where
 there's mixed pointers with bounds and pointers without bounds should
 be all
 you need or that you passed a bounds with no associated pointer :-)
>>>
>>>
>>> It would also require generation of proper bind_bounds calls in the
>>> original function and arg_bounds calls in a separated part. So,
>>> special support is required.
>>
>>
>> Well, only to keep proper instrumentation.  I hope code still works
>> (doesn't trap) when optimizations "wreck" the bounds?  Thus all
>> these patches are improving bounds propagation but are not required
>> for correctness?  If so please postpone all of them until after the
>> initial support is merged.  If not, please make sure BND instrumentation
>> works conservatively when optimizations wreck it.
>
>
> All patches I sent for optimization passes are required to avoid ICEs
> when compiling instrumented code.

 Then I think we're going to need to understand them in more detail. That's
 going to mean testcases, probably dumps and some commentary about what went
 wrong.

 I can't speak for Richi, but when optimizations get disabled, I tend to 
 want
 to really understand why and make sure we're not papering over a larger
 problem.

 The tail recursion elimination one we're discussing now is a great example.
 At this point I understand the problem you're running into, but I'm still
 trying to wrap my head around the implications of the funny semantics of
 __builtin_arg_bounds and how they may cause other problems.
>>>
>>> The root of all problems is the implicit data flow hidden in arg_bounds
>>> and bind_bounds.  Calls consume bounds and the compiler does not know it.
>>> And input bounds are always expressed via arg_bounds calls and never
>>> expressed via formal args.  Obviously optimizers have to be taught
>>> about these data dependencies to work correctly.
>>>
>>> I agree the semantics of the arg_bounds call create many issues for
>>> optimizers, but currently I do not see a better replacement for it.
>>
>> But it looks incredibly fragile if you ICE once something you don't like
>> happens.  You should be able to easily detect the case and "punt",
>> that is, drop to non-instrumented aka invalidating bounds.
>
> Actually, it is easy to detect such cases and invalidate bounds each time
> some optimization breaks the bounds data flow.  With such a feature, 5-6
> of the sent patches would be unnecessary to successfully build
> instrumented code at -O2, but without them the quality of checks would be
> much lower (mostly due to inlining).

Sure, but you want to have this feature for a production compiler, as
you are surely going to miss some odd sequence of transforms that
breaks your assumptions.

Richard.

> Ilya
>
>>
>> Thus, I really really don't like these patches.  They hint at some
>> deeper problem with the overall design (or the HW feature or the
>> accompaning ABI).
>>
>> Richard.
>>
>>> Ilya
>>>


 jeff



Re: [AArch64] Remove "mode", "mode2" attributes

2013-11-20 Thread Richard Earnshaw
On 19/11/13 18:12, James Greenhalgh wrote:
> 
> There are no consumers for these attributes, nor should there
> ever be. Remove them.
> 
> Regression tested on aarch64-none-elf with no issues.
> 
> OK?
> 
> Thanks,
> James
> 
> ---
> 2013-11-19  James Greenhalgh  
> 
>   * config/aarch64/aarch64.md: Remove "mode" and "mode2" attributes
>   from all insns.
> 
> 

OK.

R.




Re: [PATCH, MPX, 2/X] Pointers Checker [14/25] Function splitting

2013-11-20 Thread Ilya Enkovich
2013/11/20 Richard Biener :
> On Wed, Nov 20, 2013 at 11:55 AM, Ilya Enkovich  
> wrote:
>> 2013/11/20 Richard Biener :
>>> On Tue, Nov 19, 2013 at 9:18 PM, Ilya Enkovich  
>>> wrote:
 2013/11/19 Jeff Law :
> On 11/19/13 05:20, Ilya Enkovich wrote:
>>
>> 2013/11/19 Richard Biener :
>>>
>>> On Mon, Nov 18, 2013 at 8:12 PM, Ilya Enkovich 
>>> wrote:

 2013/11/18 Jeff Law :
>
> On 11/18/13 11:27, Ilya Enkovich wrote:
>>
>>
>>
>> How does a pointer passed to a regular function differ from a pointer
>> passed to a split function? How do I know then which pointer is to be
>> passed with bounds and which one is not? Moreover, the current ABI does
>> not allow passing bounds with no pointer, or passing bounds for only
>> some pointers in the call.
>
>
> But I don't see any case in function splitting where we're going to
> want to
> pass the pointer without the bounds.  If you want the former, you're
> going
> to want the latter.


 There are at least cases when checks are eliminated or when lots of
 pointer usages are accompanied by few checks performed earlier (e.g.
 we are working with an array). In such cases the split part may easily
 get no bounds.

>
> I really don't see why you need to do anything special here.  At the
> most an
> assert in the splitting code to ensure that you don't have a situation
> where
> there's mixed pointers with bounds and pointers without bounds should
> be all
> you need or that you passed a bounds with no associated pointer :-)


 It would also require generation of proper bind_bounds calls in the
 original function and arg_bounds calls in a separated part. So,
 special support is required.
>>>
>>>
>>> Well, only to keep proper instrumentation.  I hope code still works
>>> (doesn't trap) when optimizations "wreck" the bounds?  Thus all
>>> these patches are improving bounds propagation but are not required
>>> for correctness?  If so please postpone all of them until after the
>>> initial support is merged.  If not, please make sure BND instrumentation
>>> works conservatively when optimizations wreck it.
>>
>>
>> All patches I sent for optimization passes are required to avoid ICEs
>> when compiling instrumented code.
>
> Then I think we're going to need to understand them in more detail. That's
> going to mean testcases, probably dumps and some commentary about what 
> went
> wrong.
>
> I can't speak for Richi, but when optimizations get disabled, I tend to 
> want
> to really understand why and make sure we're not papering over a larger
> problem.
>
> The tail recursion elimination one we're discussing now is a great 
> example.
> At this point I understand the problem you're running into, but I'm still
> trying to wrap my head around the implications of the funny semantics of
> __builtin_arg_bounds and how they may cause other problems.

 The root of all problems is the implicit data flow hidden in arg_bounds
 and bind_bounds.  Calls consume bounds and the compiler does not know it.
 And input bounds are always expressed via arg_bounds calls and never
 expressed via formal args.  Obviously optimizers have to be taught
 about these data dependencies to work correctly.

 I agree the semantics of the arg_bounds call create many issues for
 optimizers, but currently I do not see a better replacement for it.
>>>
>>> But it looks incredibly fragile if you ICE once something you don't like
>>> happens.  You should be able to easily detect the case and "punt",
>>> that is, drop to non-instrumented aka invalidating bounds.
>>
>> Actually, it is easy to detect such cases and invalidate bounds each time
>> some optimization breaks the bounds data flow.  With such a feature, 5-6
>> of the sent patches would be unnecessary to successfully build
>> instrumented code at -O2, but without them the quality of checks would be
>> much lower (mostly due to inlining).
>
> Sure, but you want to have this feature for a production compiler as
> for sure you are going to miss an odd sequence of transforms that
> breaks your assumptions.

There still should be a way to detect such cases.  If all unsupported
cases are silently ignored, then the user never knows about it and never
reports the problem.

I may fix the expand changes to make them less vulnerable to different
code modifications and get no ICEs without fixes in the optimizers (well,
except the fix in tree-ssa-ccp.c, where instrumentation breaks the
assumption about data flow between the stack save/restore builtins). But
even with stable expand I would still want to have all those changes in
the optimizers to ge

Re: [PATCH i386 8/8] [AVX-512] Add SHA support.

2013-11-20 Thread Kirill Yukhin
On 19 Nov 15:36, Uros Bizjak wrote:
> Please also add new command options to g++.dg/other/sse-2.C and
> g++.dg/other/sse-3.C
> 
> OK with the small nit below and with above testsute addition.
> 
> No need to document negative option here.

Thanks! I'll address your inputs and check in after the rest is reviewed.

--
K


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Eric Botcazou
> Thanks. Committed as rev 205023.

Nice work, but why did you antedate the entries in the various ChangeLog 
files?  That's rather confusing when you use them to track things in specific 
directories (yes, we all know your opinion about ChangeLog files ;-)

-- 
Eric Botcazou


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Andreas Schwab
Jan Hubicka  writes:

>   * config/bootstrap-lto.mk: Use -ffat-lto-objects.
>   * common.opt (ffat-lto-objects): Disable by default.
>   * doc/invoke.texi (fat-lto-objects): Update documentation.
>   * opts.c: Enable fat-lto-objects on lto plugin disable setups.

This is breaking all tests with -flto -fno-use-linker-plugin.

$ gcc/xgcc -B gcc/ ../gcc/testsuite/gcc.c-torture/execute/900409-1.c -w -flto 
-fno-use-linker-plugin 
/daten/cross/m68k-linux/m68k-linux/sys-root/usr/lib/crt1.o: In function 
`_start':
/daten/m68k/build/glibc-2.18/csu/../ports/sysdeps/m68k/start.S:85: undefined 
reference to `main'
collect2: error: ld returned 1 exit status

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
> Jan Hubicka  writes:
> 
> > * config/bootstrap-lto.mk: Use -ffat-lto-objects.
> > * common.opt (ffat-lto-objects): Disable by default.
> > * doc/invoke.texi (fat-lto-objects): Update documentation.
> > * opts.c: Enable fat-lto-objects on lto plugin disable setups.
> 
> This is breaking all tests with -flto -fno-use-linker-plugin.
> 
> $ gcc/xgcc -B gcc/ ../gcc/testsuite/gcc.c-torture/execute/900409-1.c -w -flto 
> -fno-use-linker-plugin 
> /daten/cross/m68k-linux/m68k-linux/sys-root/usr/lib/crt1.o: In function 
> `_start':
> /daten/m68k/build/glibc-2.18/csu/../ports/sysdeps/m68k/start.S:85: undefined 
> reference to `main'
> collect2: error: ld returned 1 exit status

Hmm, I see.  Seems I profile-bootstrapped but later regtested only on a
plugin-disabled gccfarm machine.  I wonder whether we shall require users
to use -ffat-lto-objects in combination with -fno-use-linker-plugin (and
thus document it and fix the testsuite) or find a way to make the driver
pass the option transparently?

Honza
> 
> Andreas.
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
Hi,
actually the flag is being passed to the front ends instead of being
consumed by the driver, which makes life easier.  I am testing the
attached patch and will commit it as obvious if it passes (to unbreak
the testing).

Index: opts.c
===
--- opts.c  (revision 205108)
+++ opts.c  (working copy)
@@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
 #else
   error_at (loc, "LTO support has not been enabled in this configuration");
 #endif
-  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
+  if (!opts->x_flag_fat_lto_objects
+ && (!HAVE_LTO_PLUGIN
+ || (opts_set->x_flag_use_linker_plugin
+ && !opts->x_flag_use_linker_plugin)))
{
  if (opts_set->x_flag_fat_lto_objects)
-error_at (loc, "-fno-fat-lto-objects are supported only with 
linker plugin.");
+error_at (loc, "-fno-fat-lto-objects are supported only with 
linker plugin");
  opts->x_flag_fat_lto_objects = 1;
}
 }


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
On 20 Nov 09:49, Uros Bizjak wrote:
> On Tue, Nov 19, 2013 at 9:43 PM, Ilya Enkovich  wrote:
> 
> >> > Here is a patch to add size relocation and instruction to obtain 
> >> > object's size in i386 target.
> >>
> >> +(define_insn "move_size_reloc_"
> >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
> >> +(match_operand: 1 "size_relocation" "Z"))]
> >> +  ""
> >> +{
> >> +  return "mov{}\t{%1, %0|%0, %1}";
> >>
> >> Please don't change x86_64_immediate_operand just to use "Z"
> >> constraint The predicate is used in a couple of other places that for
> >> sure don't accept your change.
> >>
> >> Better write this insn in an explicit way (see for example
> >> tls_initial_exec_64_sun). Something like:
> >>
> >> (define_insn "move_size_reloc_"
> >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
> >> (unspec:SWI48
> >>  [(match_operand 1 "symbolic_operand" "..." )]
> >>  UNSPEC_SIZEOF))]
> >>   ""
> >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
> >>
> >> You will probably need to define new operand 1 predicate and constraint.
> >>
> >> Uros.
> >
> > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
> > suggestion.  Does it look better?
> 
> You actually don't need any operand modifiers in the insn template. Simply 
> use:
> 
> "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"
> 
> and you will automatically get
> 
> "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".
> 
> Since your pattern allows only symbolic_operand, there is no reload,
> so you can avoid the constraint altogether.
> 
> BTW: Did you consider various -mcmodel=... options? For DImode moves,
> you should check x86_64_zext_immediate_operand predicate and output
> either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
> 64bit immediate. Please see movdi pattern.
> 
> Uros.

Yep, for large objects it may work incorrectly.  Does anyone use static
objects >4GB? :)

A large address does not mean a large object, but it seems we have to be
conservative here.  I added the x86_64_zext_immediate_operand check with
an additional CM_KERNEL check, because in this model the object size
should always fit in 32 bits.

Thanks,
Ilya
--

gcc/

2013-11-19  Ilya Enkovich  

* config/i386/i386.md (UNSPEC_SIZEOF): New.
(move_size_reloc_): New.
* config/i386/predicates.md (symbol_operand): New.

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 7289ae4..b29fadb 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -241,6 +241,7 @@
 ;;   s - address with no segment register
 ;;   i - address with no index and no rip
 ;;   b - address with no base and no rip
+;;   S - symbol reference
 
 (define_address_constraint "Tv"
   "VSIB address operand"
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index e23b3b6..772c09d 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -79,6 +79,7 @@
   UNSPEC_PLTOFF
   UNSPEC_MACHOPIC_OFFSET
   UNSPEC_PCREL
+  UNSPEC_SIZEOF
 
   ;; Prologue support
   UNSPEC_STACK_ALLOC
@@ -18446,6 +18447,22 @@
   "bndstx\t{%2, %3|%3, %2}"
   [(set_attr "type" "mpxst")])
 
+(define_insn "move_size_reloc_"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+   (unspec:
+[(match_operand: 1 "symbol_operand")]
+UNSPEC_SIZEOF))]
+  ""
+{
+  if (GET_MODE (operands[0]) == SImode)
+return "mov{l}\t{%1@SIZE, %k0|%k0, %1@SIZE}";
+  else if (x86_64_zext_immediate_operand (operands[1], VOIDmode)
+  || ix86_cmodel == CM_KERNEL)
+return "mov{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+  else
+return "movabs{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+})
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 66ac52f..5c758ab 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -119,6 +119,10 @@
(match_test "TARGET_64BIT")
(match_test "REGNO (op) > BX_REG")))
 
+;; Return true if VALUE is symbol reference
+(define_predicate "symbol_operand"
+  (match_code "symbol_ref"))
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")
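
The mode/cmodel dispatch in the move_size_reloc pattern above can be modeled as a small C helper — a sketch with our own names, not GCC code, and with "fits in 32 bits" standing in for the real x86_64_zext_immediate_operand predicate: SImode always gets movl; for DImode a movq with a 32-bit immediate is used when the value is known to fit (including the CM_KERNEL case), and movabsq otherwise, since there is no movq form with a 64-bit immediate.

```c
#include <stdint.h>

/* Sketch of the output choice in move_size_reloc_<mode>:
   which mov form can carry the @SIZE immediate.  */
static const char *
size_reloc_mov (int si_mode, uint64_t size, int cmodel_kernel)
{
  if (si_mode)
    return "movl";     /* 32-bit destination register */
  if (size <= UINT32_MAX || cmodel_kernel)
    return "movq";     /* immediate fits in 32 bits */
  return "movabsq";    /* only movabs takes a full 64-bit immediate */
}
```

The real pattern inspects the symbolic operand rather than a numeric size, but the three-way branch is the same.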


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Sergey Ostanevich
Thanks for the comments; I hope I got all of them.
Note: I used LOOP_VINFO_LOC (loop_vinfo) to print the loop location,
but it appears to be 0, so the output is somewhat lousy.  The global
vect_location points somewhere inside the loop, which is not much better.
Shall we address this separately?

Sergos

* common.opt: Added new option -fsimd-cost-model.
* tree-vectorizer.h (unlimited_cost_model): Interface update
to rely on particular loop info.
* tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
unlimited_cost_model call according to new interface.
(vect_peeling_hash_choose_best_peeling): Ditto.
(vect_enhance_data_refs_alignment): Ditto.
* tree-vect-slp.c: Ditto.
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
plus issue a warning in case cost model overrides users' directive.
* c-family/c.opt: add openmp-simd warning.
* fortran/lang.opt: Ditto.
* doc/invoke.texi: Added new openmp-simd warning.
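
For reference, the kind of loop the new option affects looks like this (an illustrative kernel, not from the patch): with -fsimd-cost-model=unlimited the vectorizer takes the simd directive at face value, while a stricter cost model could override it and trigger the new -Wopenmp-simd warning.

```c
/* A user-marked SIMD loop; whether the vectorizer may still reject it
   on cost grounds is what -fsimd-cost-model controls.  The pragma is
   ignored (with a warning) when OpenMP support is not enabled.  */
static float
dot (const float *a, const float *b, int n)
{
  float s = 0.0f;
#pragma omp simd reduction (+:s)
  for (int i = 0; i < n; ++i)
    s += a[i] * b[i];
  return s;
}
```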



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..a85a8ad 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(openmp_simd) Warning EnabledBy(Wall)
+Warn when the vectorizer cost model overrides a simd directive
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified
by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..2b0e9e6 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,6 +2296,10 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces
-Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides a simd directive from the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
   *new_slot = slot;
 }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+  && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
res.peel_info.dr = NULL;
res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
res.inside_cost = INT_MAX;
res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
  vectorization factor.
  We do this automtically for cost model, since we
calculate cost
  for every peeling option.  */
-  if 

Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
Hi,
this is version I commited - we need to intrdocue var for -fuse-linker-plugin
to be able to check it.

I apologize for the breakage.
Honza

* opts.c (finish_options): Imply -ffat-lto-objects with 
-fno-use-linker-plugin.
* common.opt (fuse-linker-plugin): Add var.
Index: opts.c
===
--- opts.c  (revision 205108)
+++ opts.c  (working copy)
@@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
 #else
   error_at (loc, "LTO support has not been enabled in this configuration");
 #endif
-  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
+  if (!opts->x_flag_fat_lto_objects
+ && (!HAVE_LTO_PLUGIN
+ || (opts_set->x_flag_use_linker_plugin
+ && !opts->x_flag_use_linker_plugin)))
{
  if (opts_set->x_flag_fat_lto_objects)
-error_at (loc, "-fno-fat-lto-objects are supported only with 
linker plugin.");
+error_at (loc, "-fno-fat-lto-objects are supported only with 
linker plugin");
  opts->x_flag_fat_lto_objects = 1;
}
 }
Index: common.opt
===
--- common.opt  (revision 205108)
+++ common.opt  (working copy)
@@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
 Use the gold linker instead of the default linker
 
 fuse-linker-plugin
-Common Undocumented
+Common Undocumented Var(flag_use_linker_plugin)
 
 ; Positive if we should track variables, negative if we should run
 ; the var-tracking pass only to discard debug annotations, zero if


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread H.J. Lu
On Thu, Nov 14, 2013 at 12:28 PM, Diego Novillo  wrote:
> This patch applies the rule that functions defined in FOO.c must be
> declared in FOO.h. One of the worst offenders in the code base is
> tree.h, unsurprisingly.
>
> The first patch contains the actual moves from tree.h into the
> various .h files.  The second patch contains the mechanical
> side-effects of the first one.
>
> This patch creates several new headers: attribs.h calls.h
> fold-const.h gcc-symtab.h print-rtl.h print-tree.h stmt.h
> stor-layout.h stringpool.h tree-nested.h tree-object-size.h
> varasm.h.
>
> Functions in each corresponding .c file got moved to those
> headers and others that already existed. I wanted to make this
> patch as mechanical as possible, so I made no attempt to fix
> problems like having build_addr defined in tree-inline.c. I left
> that for later.
>
> There were some declarations that I could not move out of tree.h
> because of header poisoning. We forbid the inclusion of things
> like expr.h from FE files. While that's a reasonable idea, the FE
> files *still* manage to get at expr.c functionality because the
> declarations they want to use were defined in tree.h.
>
> If that functionality is allowed to be accessed from the FEs,
> then I will later move those functions out of expr.c into tree.c.
> I have moved these declarations to the bottom of tree.h so they
> are easy to identify later.
>
> There is a namespace collision with libcpp. The file gcc/symtab.c
> cannot use gcc/symtab.h because the #include command picks up
> libcpp/include/symtab.h first. So I named this file gcc-symtab.h
> for now.
>
> This patch should offer some minimal incremental build advantages
> by reducing the size of tree.h. Changes that would otherwise
> have affected tree.h will now go to other headers, which are less
> frequently included.
>
> The cleanup is not complete, but this is already bulky:
>
> - There are some declarations that are still in tree.h that
>   should be moved. Andrew is also modifying those files, so I
>   chose to leave them in tree.h for now.
>
> - Some header files always need another header file. I chose to
>   #include that header in the file. At this stage we want to do
>   the opposite, but this would've added even more bulk to the
>   change, so I left a FIXME marker for the next pass.
>
> I've tested these changes with all the languages and all the
> targets in contrib/config-list.mk. However, I would not be
> surprised if something slipped. Please CC me on any fallout, so I
> can fix it.
>
> I'm doing a final round of testing and will commit once
> everything finishes.
>
>
> 2013-11-14  Diego Novillo  
>
> * tree.h: Include fold-const.h.
> (aggregate_value_p): Moved to function.h.
> (alloca_call_p): Moved to calls.h.
> (allocate_struct_function): Moved to function.h.
> (apply_tm_attr): Moved to attribs.h.
> (array_at_struct_end_p): Moved to expr.h.
> (array_ref_element_size): Moved to tree-dfa.h.
> (array_ref_low_bound): Moved to tree-dfa.h.
> (array_ref_up_bound): Moved to tree.h.
> (assemble_alias): Moved to cgraph.h.
> (avoid_folding_inline_builtin): Moved to builtins.h.
> (bit_from_pos): Moved to stor-layout.h.
> (build_addr): Moved to tree-nested.h.
> (build_call_expr): Moved to builtins.h.
> (build_call_expr_loc): Moved to builtins.h.
> (build_call_expr_loc_array): Moved to builtins.h.
> (build_call_expr_loc_vec): Moved to builtins.h.
> (build_duplicate_type): Moved to tree-inline.h.
> (build_fold_addr_expr): Moved to fold-const.h.
> (build_fold_addr_expr_with_type): Moved to fold-const.h.
> (build_fold_addr_expr_with_type_loc): Moved to fold-const.h.
> (build_fold_indirect_ref): Moved to fold-const.h.
> (build_fold_indirect_ref_loc): Moved to fold-const.h.
> (build_personality_function): Moved to tree.h.
> (build_range_check): Moved to fold-const.h.
> (build_simple_mem_ref): Moved to fold-const.h.
> (build_simple_mem_ref_loc): Moved to fold-const.h.
> (build_string_literal): Moved to builtins.h.
> (build_tm_abort_call): Moved to trans-mem.h.
> (builtin_mathfn_code): Moved to builtins.h.
> (builtin_memset_read_str): Moved to builtins.h.
> (byte_from_pos): Moved to stor-layout.h.
> (c_strlen): Moved to builtins.h.
> (call_expr_flags): Moved to calls.h.
> (can_move_by_pieces): Moved to expr.h.
> (categorize_ctor_elements): Moved to expr.h.
> (change_decl_assembler_name): Moved to gcc-symtab.h.
> (combine_comparisons): Moved to fold-const.h.
> (complete_ctor_at_level_p): Moved to tree.h.
> (component_ref_field_offset): Moved to tree-dfa.h.
> (compute_builtin_object_size): Moved to tree-object-size.h.
> (compute_record_mode): Moved to stor-layout.h.
> (constant_boolean_nod

[patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

This has been bugging me since I moved it out of gimple.h to gimplify.h.

The gimplify context structure was exposed in the header file to allow a 
few other files to push and pop contexts off the gimplification stack. 
Unfortunately, the struct contains a hash-table template, which means it 
also required hash-table.h in order to even parse the header.  No file 
other than gimplify.c actually uses what is in the structure, so 
exposing it seems wrong.


This patch primarily changes push_gimplify_context () and 
pop_gimplify_context () to grab a struct gimplify_ctx off a local stack 
pool rather than have each file's local function know about the 
structure and allocate it on their stack.  I didn't want to use 
malloc/free or there would be a lot of churn.  It's open to debate what 
the best approach is, but I decided to simply allocate a bunch in a 
static array local to gimplify.c.  If a request goes past the end of the 
local array, malloc the required structure.  It's pretty simplistic, but 
functional, and doesn't require any GTY stuff or other machinery.


I also hacked up the compiler to report what the 'top' of the stack was 
for a compilation unit. I then ran it through a bootstrap and full 
testsuite run of all languages, and looked for the maximum number of 
context structs in use at one time.  Normally only 1 or 2 were ever in 
use, but it topped out at 7 during some of the openmp tests.  So my 
current limit of 30 will likely never be reached.


Only two fields were ever set by anyone outside of gimplify.c, the 
'into_ssa' and 'allow_rhs_cond_expr' fields, and they were always set
immediately after pushing the context onto the stack... So I added them 
as parameters to the push and gave them default values.


And that's it... this patch moves the hash function and structure 
completely into gimplify.c so no one needs to know anything about it, 
and removes the hash-table.h include prerequisite for parsing.


Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for 
mainline?


Andrew




	* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
	Move to gimplify.c.
	* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
	(struct gimplify_ctx): Relocate here and add malloced field.
	(gimplify_ctxp): Make static.
	(ctx_alloc, ctx_free): Manage a pool of struct gimplify_ctx.
	(push_gimplify_context)): Add default parameters and allocate a 
	struct from the pool.
	(pop_gimplify_context): Free a struct back to the pool.
	(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
	gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
	use a local struct gimplify_ctx.
	* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
	Likewise.
	* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
	lower_omp_ordered, lower_omp_critical, lower_omp_for,
	create_task_copyfn, lower_omp_taskreg, lower_omp_target,
	lower_omp_teams, execute_lower_omp): Likewise.
	* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
	* tree-inline.c (optimize_inline_calls): Likewise.

Index: gimplify.h
===
*** gimplify.h	(revision 205035)
--- gimplify.h	(working copy)
*** enum gimplify_status {
*** 48,86 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
- /* Gimplify hashtable helper.  */
  
! struct gimplify_hasher : typed_free_remove <elt_t>
! {
!   typedef elt_t value_type;
!   typedef elt_t compare_type;
!   static inline hashval_t hash (const value_type *);
!   static inline bool equal (const value_type *, const compare_type *);
! };
! 
! struct gimplify_ctx
! {
!   struct gimplify_ctx *prev_context;
! 
!   vec<gimple> bind_expr_stack;
!   tree temps;
!   gimple_seq conditional_cleanups;
!   tree exit_label;
!   tree return_temp;
! 
!   vec<tree> case_labels;
!   /* The formal temporary table.  Should this be persistent?  */
!   hash_table <gimplify_hasher> temp_htab;
! 
!   int conditions;
!   bool save_stack;
!   bool into_ssa;
!   bool allow_rhs_cond_expr;
!   bool in_cleanup_point_expr;
! };
! 
! extern struct gimplify_ctx *gimplify_ctxp;
! extern void push_gimplify_context (struct gimplify_ctx *);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
--- 48,56 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
  
! extern void push_gimplify_context (bool in_ssa = false,
!    bool rhs_cond_ok = false);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
Index: gimplify.c
===
*** gimplify.c	(revision 205035)
--- gimplify.c	(working copy)
*** enum omp_region_type
*** 89,94 
--- 89,12

[wide-int, committed] Remove remaining "wide_int (x)"s

2013-11-20 Thread Richard Sandiford
One simple switch to wi::add and one simplification that became possible
after Kenny's patch to remove the other real_from_integer.

Tested on x86_64-linux-gnu and applied as obvious (I hope).

Thanks,
Richard


Index: gcc/objc/objc-act.c
===
--- gcc/objc/objc-act.c 2013-11-20 12:04:28.937491368 +
+++ gcc/objc/objc-act.c 2013-11-20 13:35:05.663472662 +
@@ -4885,7 +4885,8 @@ objc_decl_method_attributes (tree *node,
  number = TREE_VALUE (second_argument);
  if (number && TREE_CODE (number) == INTEGER_CST)
TREE_VALUE (second_argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
 
  /* This is the third argument, the "first-to-check",
 which specifies the index of the first argument to
@@ -4895,7 +4896,8 @@ objc_decl_method_attributes (tree *node,
  number = TREE_VALUE (third_argument);
  if (number && TREE_CODE (number) == INTEGER_CST)
TREE_VALUE (third_argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
}
  filtered_attributes = chainon (filtered_attributes,
 new_attribute);
@@ -4930,7 +4932,8 @@ objc_decl_method_attributes (tree *node,
  if (number && TREE_CODE (number) == INTEGER_CST
  && !wi::eq_p (number, 0))
TREE_VALUE (argument)
- = wide_int_to_tree (TREE_TYPE (number), wide_int (number) + 2);
+ = wide_int_to_tree (TREE_TYPE (number),
+ wi::add (number, 2));
  argument = TREE_CHAIN (argument);
}
 
Index: gcc/tree.c
===
--- gcc/tree.c  2013-11-20 12:55:04.109302505 +
+++ gcc/tree.c  2013-11-20 13:35:05.677472558 +
@@ -1711,8 +1711,8 @@ real_value_from_int_cst (const_tree type
  bitwise comparisons to see if two values are the same.  */
   memset (&d, 0, sizeof d);
 
-  real_from_integer (&d, type ? TYPE_MODE (type) : VOIDmode,
-wide_int (i), TYPE_SIGN (TREE_TYPE (i)));
+  real_from_integer (&d, type ? TYPE_MODE (type) : VOIDmode, i,
+TYPE_SIGN (TREE_TYPE (i)));
   return d;
 }
 



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 08:35 AM, Andrew MacLeod wrote:



Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for 
mainline?


Andrew


And I've already fixed the typo in the changelog :-)  I had it open in a 
window but hadn't saved it when I created the patch.


Andrew

* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
Move to gimplify.c.
* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
(struct gimplify_ctx): Relocate here and add 'malloced' field.
(gimplify_ctxp): Make static.
(ctx_alloc, ctx_free): Manage a pool of struct gimplify_ctx.
(push_gimplify_context): Add default parameters and allocate a 
struct from the pool.
(pop_gimplify_context): Free a struct back to the pool.
(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
use a local 'struct gimplify_ctx'.
* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
Likewise.
* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
lower_omp_ordered, lower_omp_critical, lower_omp_for,
create_task_copyfn, lower_omp_taskreg, lower_omp_target,
lower_omp_teams, execute_lower_omp): Likewise.
* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
* tree-inline.c (optimize_inline_calls): Likewise.



Re: Add slim-lto support to gcc's build machinery

2013-11-20 Thread Paolo Bonzini
Il 20/11/2013 08:23, Markus Trippelsdorf ha scritto:
> Hi,
> 
> now that slim-lto objects are enabled by default, it would be nice to
> use them when building gcc with bootstrap-lto. The following patch
> implements the handling of these objects in gcc's build machinery.
> (Once -fuse-linker-plugin is made the default, -ffat-lto-objects in
> config/bootstrap-lto.mk could be dropped.)
> 
> LTO-Bootstrapped on x86_64-linux (with default -fuse-linker-plugin).
> The patch was already approved by Paolo Bonzini.
> 
> Please apply, because I don't have access.

Note that you need to regenerate all users of libtool.m4.  Please post a
patch _with_ the regeneration so that whoever applies it won't screw up.

Paolo

> Thanks.
> 
> 2013-11-20  Markus Trippelsdorf  
> 
>   * libtool.m4 : Handle slim-lto objects.
>   * ltmain.sh: Handle lto options.
> 
> diff --git a/libtool.m4 b/libtool.m4
> index 797468f02a5a..c55b6eba7a94 100644
> --- a/libtool.m4
> +++ b/libtool.m4
> @@ -3440,6 +3440,7 @@ for ac_symprfx in "" "_"; do
>else
>  lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[[  
> ]]\($symcode$symcode*\)[[   ]][[
> ]]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
>fi
> +  lt_cv_sys_global_symbol_pipe="$lt_cv_sys_global_symbol_pipe | sed '/ 
> __gnu_lto/d'"
>  
># Check to see that the pipe works correctly.
>pipe_works=no
> @@ -4459,7 +4460,7 @@ _LT_EOF
>if $LD --help 2>&1 | $EGREP ': supported targets:.* elf' > /dev/null \
>&& test "$tmp_diet" = no
>then
> - tmp_addflag=
> + tmp_addflag=' $pic_flag'
>   tmp_sharedflag='-shared'
>   case $cc_basename,$host_cpu in
>  pgcc*)   # Portland Group C compiler
> @@ -5525,8 +5526,8 @@ if test "$_lt_caught_CXX_error" != yes; then
># Check if GNU C++ uses GNU ld as the underlying linker, since the
># archiving commands below assume that GNU ld is being used.
>if test "$with_gnu_ld" = yes; then
> -_LT_TAGVAR(archive_cmds, $1)='$CC -shared -nostdlib $predep_objects 
> $libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o 
> $lib'
> -_LT_TAGVAR(archive_expsym_cmds, $1)='$CC -shared -nostdlib 
> $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
> ${wl}-soname $wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
> +_LT_TAGVAR(archive_cmds, $1)='$CC $pic_flag -shared -nostdlib 
> $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
> ${wl}-soname $wl$soname -o $lib'
> +_LT_TAGVAR(archive_expsym_cmds, $1)='$CC $pic_flag -shared -nostdlib 
> $predep_objects $libobjs $deplibs $postdep_objects $compiler_flags 
> ${wl}-soname $wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
>  
>  _LT_TAGVAR(hardcode_libdir_flag_spec, $1)='${wl}-rpath ${wl}$libdir'
>  _LT_TAGVAR(export_dynamic_flag_spec, $1)='${wl}--export-dynamic'
> @@ -6503,6 +6504,13 @@ public class foo {
>  };
>  _LT_EOF
>  ])
> +
> +_lt_libdeps_save_CFLAGS=$CFLAGS
> +case "$CC $CFLAGS " in #(
> +*\ -flto*\ *) CFLAGS="$CFLAGS -fno-lto" ;;
> +*\ -fwhopr*\ *) CFLAGS="$CFLAGS -fno-whopr" ;;
> +esac
> +
>  dnl Parse the compiler output and extract the necessary
>  dnl objects, libraries and library flags.
>  if AC_TRY_EVAL(ac_compile); then
> @@ -6551,6 +6559,7 @@ if AC_TRY_EVAL(ac_compile); then
> fi
> ;;
>  
> +*.lto.$objext) ;; # Ignore GCC LTO objects
>  *.$objext)
> # This assumes that the test object file only shows up
> # once in the compiler output.
> @@ -6586,6 +6595,7 @@ else
>  fi
>  
>  $RM -f confest.$objext
> +CFLAGS=$_lt_libdeps_save_CFLAGS
>  
>  # PORTME: override above test on systems where it is broken
>  m4_if([$1], [CXX],
> diff --git a/ltmain.sh b/ltmain.sh
> index a03433f17894..2e0910194f24 100644
> --- a/ltmain.sh
> +++ b/ltmain.sh
> @@ -4980,7 +4980,8 @@ func_mode_link ()
># @file GCC response files
># -tp=* Portland pgcc target processor selection
>-64|-mips[0-9]|-r[0-9][0-9]*|-xarch=*|-xtarget=*|+DA*|+DD*|-q*|-m*| \
> -  -t[45]*|-txscale*|-p|-pg|--coverage|-fprofile-*|-F*|@*|-tp=*)
> +  -t[45]*|-txscale*|-p|-pg|--coverage|-fprofile-*|-F*|@*|-tp=*| \
> +  -O*|-flto*|-fwhopr|-fuse-linker-plugin)
>  func_quote_for_eval "$arg"
>   arg="$func_quote_for_eval_result"
>  func_append compile_command " $arg"
> 



[v3 patch] add missing 'template' keyword

2013-11-20 Thread Jonathan Wakely
2013-11-20  Jonathan Wakely  

PR c++/59173
* include/ext/pointer.h (pointer_traits<>::rebind<>): Add template
keyword in nested name.

Tested x86_64-linux, committed to trunk.
commit 7359ca380f9e4ec5d1193e1efcc3aec5af95ec93
Author: Jonathan Wakely 
Date:   Wed Nov 20 13:19:28 2013 +

PR c++/59173
* include/ext/pointer.h (pointer_traits<>::rebind<>): Add template
keyword in nested name.

diff --git a/libstdc++-v3/include/ext/pointer.h 
b/libstdc++-v3/include/ext/pointer.h
index 12bc749..d1730be 100644
--- a/libstdc++-v3/include/ext/pointer.h
+++ b/libstdc++-v3/include/ext/pointer.h
@@ -580,7 +580,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 using rebind = typename __gnu_cxx::_Pointer_adapter<
-   typename pointer_traits<_Storage_policy>::rebind<_Up>>;
+   typename pointer_traits<_Storage_policy>::template rebind<_Up>>;
 
   static pointer pointer_to(typename pointer::reference __r) noexcept
   { return pointer(std::addressof(__r)); }


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:
> I also hacked up the compiler to report what the 'top' of the stack
> was for a compilation unit. I then ran it through a bootstrap and
> full testsuite run of all languages, and looked for the maximum
> number of context structs in use at one time.  Normally only 1 or 2
> were ever in use, but it topped out at 7 during some of the openmp
> tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).

Say:

/* { dg-do compile } */
/* { dg-options "-fopenmp" } */
#define P1 _Pragma ("omp parallel")
#define P10 P1 P1 P1 P1 P1 P1 P1 P1 P1 P1
#define P100 P10 P10 P10 P10 P10 P10 P10 P10 P10 P10
void bar (void);
void
foo (void)
{
  P100
bar ();
}

Jakub


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou  wrote:
>> Thanks. Committed as rev 205023.
>
> Nice work, but why did you antedate the entries in the various ChangeLog

Oh, that's because of local commits and holding on to the patch for a
few days. That date is the date of the original local commit. Further
pulls from upstream pushed the entries down in the file. I just forgot
to move the entries forward when I committed the patch.

> directories (yes, we all know your opinion about ChangeLog files ;-)

What gave you that idea? ;)


Diego.


Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-20 Thread Ilya Enkovich
On 20 Nov 10:59, Richard Biener wrote:
> On Tue, Nov 19, 2013 at 9:29 PM, Ilya Enkovich  wrote:
> > On 19 Nov 12:33, Jeff Law wrote:
> >> On 11/19/13 05:13, Ilya Enkovich wrote:
> >> >On 19 Nov 13:00, Richard Biener wrote:
> >> >>I'd say not in the gimplifier either but in varpool (symbol table) code
> >> >>where the symbols are ultimatively registered with?
> >> >
> >> >Something like that?
> >> >
> >> >--- a/gcc/varpool.c
> >> >+++ b/gcc/varpool.c
> >> >@@ -151,6 +151,10 @@ varpool_node_for_decl (tree decl)
> >> >node = varpool_create_empty_node ();
> >> >node->decl = decl;
> >> >symtab_register_node (node);
> >> >+
> >> >+  if (DECL_INITIAL (decl))
> >> >+chkp_register_var_initializer (decl);
> >> >+
> >> >return node;
> >> >  }
> >> Yea, I think that's what Richi is suggesting.
> >> jeff
> >>
> >
> > Great!  Here is a full version of the patch.  Bootstrap, make check and MPX 
> > tests are OK with the change.
> 
> Please don't do this in varpool_node_for_decl but in
> varpool_finalize_decl.
> 
> Richard.
> 

Here it is.

Thanks,
Ilya
--

gcc/

2013-11-20  Ilya Enkovich  

* cgraphunit.c: Include tree-chkp.h.
(varpool_finalize_decl): Register statically
initialized decls in Pointer Bounds Checker.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index 8ab274b..6b6ec55 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -201,6 +201,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "regset.h" /* FIXME: For reg_obstack.  */
 #include "context.h"
 #include "pass_manager.h"
+#include "tree-chkp.h"
 
 /* Queue of cgraph nodes scheduled to be added into cgraph.  This is a
secondary queue used during optimization to accommodate passes that
@@ -830,6 +831,9 @@ varpool_finalize_decl (tree decl)
  finished.  */
   if (cgraph_state == CGRAPH_STATE_FINISHED)
 varpool_assemble_decl (node);
+
+  if (DECL_INITIAL (decl))
+chkp_register_var_initializer (decl);
 }
 
 /* EDGE is an polymorphic call.  Mark all possible targets as reachable


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 8:32 AM, H.J. Lu  wrote:

> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59212

Thanks.  Fixed.


PR 59212
* g++.dg/plugin/selfassign.c: Include stringpool.h

diff --git a/gcc/testsuite/g++.dg/plugin/selfassign.c
b/gcc/testsuite/g++.dg/plugin/selfassign.c
index 2498153..cdab74a 100644
--- a/gcc/testsuite/g++.dg/plugin/selfassign.c
+++ b/gcc/testsuite/g++.dg/plugin/selfassign.c
@@ -8,6 +8,7 @@
 #include "coretypes.h"
 #include "tm.h"
 #include "tree.h"
+#include "stringpool.h"
 #include "toplev.h"
 #include "basic-block.h"
 #include "gimple.h"


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 08:44 AM, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp 
tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).


Not something we're likely to see every day, but excellent!  A testcase 
is good :-) I'll add this one to the testsuite.


Do you think the current limit of 30 is reasonable?

Andrew


Re: [patch] regcprop fix for PR rtl-optimization/54300

2013-11-20 Thread Richard Earnshaw
On 19/11/13 17:48, Jeff Law wrote:
> On 11/19/13 10:32, Steven Bosscher wrote:
>>
>> Yes. In the GCC3 days it was important for sincos on i386, and on m68k
>> it used to be important for some of the funnier patterns. Not sure if
>> it's still useful today, though. Might be worth looking into, just to
>> avoid the confusion in the future.
> I doubt it's changed all that much :-)
> 
>>
>> There's been confusion about this before, where people assumed
>> single_set really means "just one SET in this pattern". (ISTR fixing
>> gcse.c's hash_scan_rtx for this at some point...?). But that's not the
>> semantics of single_set.
> Yes.  And I'd expect confusion to continue :(  Not sure if renaming it 
> to capture the actual semantics would help here.
> 
>>
>> The proper test for "just one SET" is (!multiple_sets && single_set).
>> At least, that's how I've always coded it...
> Seems reasonable for those cases where you have to ensure there really 
> is just one set.
> 
> 
> jeff
> 

Provided we correctly note the other values that are killed, we can
handle multiple sets safely.  The one restriction we have to watch is
where the dead set operations kill input values to the live set operation.

I've committed my patch to trunk.

I'll leave it to gestate a couple of days, but this is also needed on
the active release branches as well.

R.



[wide-int] Undo some differences with trunk

2013-11-20 Thread Richard Sandiford
I've committed a patch upstream to convert TREE_INT_CST_LOW to tree_to_[su]hwi
if there is an obvious tree_fits_[su]hwi_p guard.  There were some other
changes from TREE_INT_CST_LOW to tree_to_[su]hwi that weren't as obvious
and I think we should deal with them separately.  As before, these changes
aren't really related to the wide-int conversion, since the code that uses
them is operating on HWIs rather than double_ints.  The patch is still here,
so it's not like the work is lost.

There were also some changes from TREE_INT_CST_LOW (X) to
TREE_INT_CST_ELT (X, 0), which are no longer needed now that we define
TREE_INT_CST_LOW too.

There was one case where the branch switched a tree_fits_uhwi_p to
a tree_fits_shwi_p, but with a !!! about the original choice.
The two other instances of the !!! sequence kept the original tests,
so I think this might have been unintentional.  If not, then again
it should go on trunk first.

The patch also undoes a couple of other ordering and whitespace differences.

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/c-family/cilk.c
===
--- gcc/c-family/cilk.c 2013-11-20 13:43:30.896150594 +
+++ gcc/c-family/cilk.c 2013-11-20 13:48:38.491410510 +
@@ -1208,7 +1208,7 @@ extract_free_variables (tree t, struct w
int ii = 0;
if (TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
  {
-   len = tree_to_uhwi (TREE_OPERAND (t, 0));
+   len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
 
for (ii = 0; ii < len; ii++)
  extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
Index: gcc/c/Make-lang.in
===
--- gcc/c/Make-lang.in  2013-11-20 13:43:30.896150594 +
+++ gcc/c/Make-lang.in  2013-11-20 13:48:38.474410604 +
@@ -137,3 +137,4 @@ c.stageprofile: stageprofile-start
-mv c/*$(objext) stageprofile/c
 c.stagefeedback: stagefeedback-start
-mv c/*$(objext) stagefeedback/c
+
Index: gcc/config/avr/avr.c
===
--- gcc/config/avr/avr.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/avr/avr.c2013-11-20 13:48:38.523410332 +
@@ -12233,7 +12233,7 @@ avr_fold_builtin (tree fndecl, int n_arg
   }
 
 tmap = wide_int_to_tree (map_type, arg[0]);
-map = tree_to_uhwi (tmap);
+map = TREE_INT_CST_LOW (tmap);
 
 if (TREE_CODE (tval) != INTEGER_CST
 && 0 == avr_map_metric (map, MAP_MASK_PREIMAGE_F))
Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/nds32/nds32.c2013-11-20 13:48:38.492410505 +
@@ -1276,7 +1276,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = tree_to_uhwi (id) + vector_number_offset;
+ vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
 
  /* Enable corresponding vector and set function name.  */
  nds32_isr_vectors[vector_id].category = (intr)
@@ -1318,7 +1318,7 @@ nds32_construct_isr_vectors_information
 
   /* The total vectors = interrupt + exception numbers + reset.
  There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
   strcpy (nds32_isr_vectors[0].func_name, func_name);
 
   /* Retrieve nmi and warm function.  */
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/parser.c 2013-11-20 13:48:38.503410443 +
@@ -3804,7 +3804,7 @@ make_string_pack (tree value)
   tree charvec;
   tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
   const char *str = TREE_STRING_POINTER (value);
-  int sz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value))));
   int len = TREE_STRING_LENGTH (value) / sz - 1;
   tree argvec = make_tree_vec (2);
 
Index: gcc/cp/vtable-class-hierarchy.c
===
--- gcc/cp/vtable-class-hierarchy.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/vtable-class-hierarchy.c 2013-11-20 13:48:38.504410438 +
@@ -453,7 +453,7 @@ check_and_record_registered_pairs (tree
 vptr_address = TREE_OPERAND (vptr_address, 0);
 
   if (TREE_OPERAND_LENGTH (vptr_address) > 1)
-offset = tree_to_uhwi (TREE_OPERAND (vptr_address, 1));
+offset = TREE_INT_CST_LOW (TREE_OPERAND (vptr_address, 1));
   else
 offset = 0;
 
@@ -876,

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Richard Biener
On Wed, 20 Nov 2013, Sergey Ostanevich wrote:

> Thanks for comments, hope I got all of em.
> Note: I used a LOOP_VINFO_LOC (loop_vinfo) to print the loop location
> but it appears to be 0, so the output is somewhat lousy. The global
> vect_location points somewhere inside the loop, which is not that better.
> Shall we address this separately?

Use vect_location instead.

Note that c-family/ and fortran/ have their own ChangeLog file
and files there don't have a prefix.

Ok with the change to use vect_location.

Thanks,
Richard.

> Sergos
> 
> * common.opt: Added new option -fsimd-cost-model.
> * tree-vectorizer.h (unlimited_cost_model): Interface update
> to rely on particular loop info.
> * tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
> unlimited_cost_model call according to new interface.
> (vect_peeling_hash_choose_best_peeling): Ditto.
> (vect_enhance_data_refs_alignment): Ditto.
> * tree-vect-slp.c: Ditto.
> * tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
> plus issue a warning in case cost model overrides users' directive.
> * c-family/c.opt: add openmp-simd warning.
> * fortran/lang.opt: Ditto.
> * doc/invoke.texi: Added new openmp-simd warning.
> 
> 
> 
> diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
> index 0026683..a85a8ad 100644
> --- a/gcc/c-family/c.opt
> +++ b/gcc/c-family/c.opt
> @@ -592,6 +592,10 @@ Wold-style-definition
>  C ObjC Var(warn_old_style_definition) Warning
>  Warn if an old-style parameter definition is used
> 
> +Wopenmp-simd
> +C C++ Var(openmp_simd) Warning EnabledBy(Wall)
> +Warn when a simd directive is overridden by the vectorizer cost model
> +
>  Woverlength-strings
>  C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
> LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
>  Warn if a string is longer than the maximum portable length specified
> by the standard
> diff --git a/gcc/common.opt b/gcc/common.opt
> index d5971df..2b0e9e6 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -2296,6 +2296,10 @@ fvect-cost-model=
>  Common Joined RejectNegative Enum(vect_cost_model)
> Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
>  Specifies the cost model for vectorization
> 
> +fsimd-cost-model=
> +Common Joined RejectNegative Enum(vect_cost_model)
> Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> +Specifies the vectorization cost model for code marked with simd directive
> +
>  Enum
>  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
> vectorizer cost model %qs)
> 
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index c250385..050bd44 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
>  -Wlogical-op -Wlong-long @gol
>  -Wmain -Wmaybe-uninitialized -Wmissing-braces
> -Wmissing-field-initializers @gol
>  -Wmissing-include-dirs @gol
> --Wno-multichar  -Wnonnull  -Wno-overflow @gol
> +-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
>  -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
>  -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
>  -Wpointer-arith  -Wno-pointer-to-int-cast @gol
> @@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
> Dialect Options}.
>  -Wmaybe-uninitialized @gol
>  -Wmissing-braces @r{(only for C/ObjC)} @gol
>  -Wnonnull  @gol
> +-Wopenmp-simd @gol
>  -Wparentheses  @gol
>  -Wpointer-sign  @gol
>  -Wreorder   @gol
> @@ -4804,6 +4805,10 @@ attribute.
>  @opindex Woverflow
>  Do not warn about compile-time overflow in constant expressions.
> 
> +@item -Wopenmp-simd
> +@opindex Wopenmp-simd
> +Warn if the vectorizer cost model overrides the simd directive from the user.
> +
>  @item -Woverride-init @r{(C and Objective-C only)}
>  @opindex Woverride-init
>  @opindex Wno-override-init
> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
> index 5e09cbd..b43c48c 100644
> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
> @@ -257,6 +257,10 @@ Wintrinsics-std
>  Fortran Warning
>  Warn on intrinsics not part of the selected standard
> 
> +Wopenmp-simd
> +Fortran Warning
> +; Documented in C
> +
>  Wreal-q-constant
>  Fortran Warning
>  Warn about real-literal-constants with 'q' exponent-letter
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index 83d1f45..977db43 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
> loop_vinfo, struct data_reference *dr,
>*new_slot = slot;
>  }
> 
> -  if (!supportable_dr_alignment && unlimited_cost_model ())
> +  if (!supportable_dr_alignment
> +  && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
>  slot->count += VECT_MAX_COST;
>  }
> 
> @@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
> (loop_vec_info loop_vinfo,
> res.peel_info.dr = NULL;
> res

[wide-int] Tweak an rs6000 test

2013-11-20 Thread Richard Sandiford
This test changes from TREE_INT_CST_LOW to TREE_INT_CST_ELT.  I was going
to change it back as part of the previous patch, but using wi:: seemed
more robust.

Only compile-tested so far because of problems with gcc110.  OK for wide-int?

Richard


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2013-11-20 11:26:05.996351099 +
+++ gcc/config/rs6000/rs6000.c  2013-11-20 11:43:37.896818266 +
@@ -12131,16 +12131,14 @@ rs6000_expand_ternop_builtin (enum insn_
   /* Check whether the 2nd and 3rd arguments are integer constants and in
 range and prepare arguments.  */
   STRIP_NOPS (arg1);
-  if (TREE_CODE (arg1) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg1, 0), 0, 1))
+  if (TREE_CODE (arg1) != INTEGER_CST || wi::geu_p (arg1, 2))
{
  error ("argument 2 must be 0 or 1");
  return const0_rtx;
}
 
   STRIP_NOPS (arg2);
-  if (TREE_CODE (arg2) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg2, 0), 0, 15))
> +  if (TREE_CODE (arg2) != INTEGER_CST || wi::geu_p (arg2, 16))
{
  error ("argument 3 must be in the range 0..15");
  return const0_rtx;



Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:12 PM, Jan Hubicka  wrote:
> Hi,
> this is the version I committed - we need to introduce a var for -fuse-linker-plugin
> to be able to check it.
>
> I apologize for the breakage.
> Honza
>
> * opts.c (finish_options): Imply -ffat-lto-objects with 
> -fno-use-linker-plugin.
> * common.opt (fuse-linker-plugin): Add var.
> Index: opts.c

Hmm, but that won't help for

gcc -c t.c -flto
gcc t.o -flto -fno-use-linker-plugin

Isn't that the style in which we invoke LTO tests?  (OK, we put in a useless
-fno-use-linker-plugin at the compile stage, too)

So in the end it'll be too fragile for users IMHO.  I think we should
disallow -fno-use-linker-plugin (if we decided to enable the plugin by default,
then it won't fall back to no linker plugin but will error out if we replace ld
with something too old or remove the plugin, right?).

Richard.

>===
> --- opts.c  (revision 205108)
> +++ opts.c  (working copy)
> @@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
>  #else
>error_at (loc, "LTO support has not been enabled in this 
> configuration");
>  #endif
> -  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
> +  if (!opts->x_flag_fat_lto_objects
> + && (!HAVE_LTO_PLUGIN
> + || (opts_set->x_flag_use_linker_plugin
> + && !opts->x_flag_use_linker_plugin)))
> {
>   if (opts_set->x_flag_fat_lto_objects)
> -error_at (loc, "-fno-fat-lto-objects are supported only with 
> linker plugin.");
> +error_at (loc, "-fno-fat-lto-objects are supported only with 
> linker plugin");
>   opts->x_flag_fat_lto_objects = 1;
> }
>  }
> Index: common.opt
> ===
> --- common.opt  (revision 205108)
> +++ common.opt  (working copy)
> @@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
>  Use the gold linker instead of the default linker
>
>  fuse-linker-plugin
> -Common Undocumented
> +Common Undocumented Var(flag_use_linker_plugin)
>
>  ; Positive if we should track variables, negative if we should run
>  ; the var-tracking pass only to discard debug annotations, zero if


Re: [wide-int] Undo some differences with trunk

2013-11-20 Thread Kenneth Zadeck
I would leave out the last frag of tree.c.  It actually gets rid of what
I consider a latent bug.  It does not show up with the current
implementation of wide-int, but if that implementation changed, it
would.  The problem is that you really should not expect to be able to
set the min or max value for a type without first setting the precision
of that type.


Aside from that, this looks fine.

kenny

On 11/20/2013 08:58 AM, Richard Sandiford wrote:

I've committed a patch upstream to convert TREE_INT_CST_LOW to tree_to_[su]hwi
if there is an obvious tree_fits_[su]hwi_p guard.  There were some other
changes from TREE_INT_CST_LOW to tree_to_[su]hwi that weren't as obvious
and I think we should deal with them separately.  As before, these changes
aren't really related to the wide-int conversion, since the code that uses
them is operating on HWIs rather than double_ints.  The patch is still here,
so it's not like the work is lost.

There were also some changes from TREE_INT_CST_LOW (X) to
TREE_INT_CST_ELT (X, 0), which are no longer needed now that we define
TREE_INT_CST_LOW too.

There was one case where the branch switched a tree_fits_uhwi_p to
a tree_fits_shwi_p, but with a !!! about the original choice.
The two other instances of the !!! sequence kept the original tests,
so I think this might have been unintentional.  If not, then again
it should go on trunk first.

The patch also undoes a couple of other ordering and whitespace differences.

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/c-family/cilk.c
===
--- gcc/c-family/cilk.c 2013-11-20 13:43:30.896150594 +
+++ gcc/c-family/cilk.c 2013-11-20 13:48:38.491410510 +
@@ -1208,7 +1208,7 @@ extract_free_variables (tree t, struct w
int ii = 0;
if (TREE_CODE (TREE_OPERAND (t, 0)) == INTEGER_CST)
  {
-   len = tree_to_uhwi (TREE_OPERAND (t, 0));
+   len = TREE_INT_CST_LOW (TREE_OPERAND (t, 0));
  
  	for (ii = 0; ii < len; ii++)

  extract_free_variables (TREE_OPERAND (t, ii), wd, ADD_READ);
Index: gcc/c/Make-lang.in
===
--- gcc/c/Make-lang.in  2013-11-20 13:43:30.896150594 +
+++ gcc/c/Make-lang.in  2013-11-20 13:48:38.474410604 +
@@ -137,3 +137,4 @@ c.stageprofile: stageprofile-start
-mv c/*$(objext) stageprofile/c
  c.stagefeedback: stagefeedback-start
-mv c/*$(objext) stagefeedback/c
+
Index: gcc/config/avr/avr.c
===
--- gcc/config/avr/avr.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/avr/avr.c2013-11-20 13:48:38.523410332 +
@@ -12233,7 +12233,7 @@ avr_fold_builtin (tree fndecl, int n_arg
}
  
  tmap = wide_int_to_tree (map_type, arg[0]);

-map = tree_to_uhwi (tmap);
+map = TREE_INT_CST_LOW (tmap);
  
  if (TREE_CODE (tval) != INTEGER_CST

  && 0 == avr_map_metric (map, MAP_MASK_PREIMAGE_F))
Index: gcc/config/nds32/nds32.c
===
--- gcc/config/nds32/nds32.c2013-11-20 13:43:30.896150594 +
+++ gcc/config/nds32/nds32.c2013-11-20 13:48:38.492410505 +
@@ -1276,7 +1276,7 @@ nds32_construct_isr_vectors_information
  /* Pick up each vector id value.  */
  id = TREE_VALUE (id_list);
  /* Add vector_number_offset to get actual vector number.  */
- vector_id = tree_to_uhwi (id) + vector_number_offset;
+ vector_id = TREE_INT_CST_LOW (id) + vector_number_offset;
  
  	  /* Enable corresponding vector and set function name.  */

  nds32_isr_vectors[vector_id].category = (intr)
@@ -1318,7 +1318,7 @@ nds32_construct_isr_vectors_information
  
/* The total vectors = interrupt + exception numbers + reset.

   There are 8 exception and 1 reset in nds32 architecture.  */
-  nds32_isr_vectors[0].total_n_vectors = tree_to_uhwi (id) + 8 + 1;
+  nds32_isr_vectors[0].total_n_vectors = TREE_INT_CST_LOW (id) + 8 + 1;
strcpy (nds32_isr_vectors[0].func_name, func_name);
  
/* Retrieve nmi and warm function.  */

Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c 2013-11-20 13:43:30.896150594 +
+++ gcc/cp/parser.c 2013-11-20 13:48:38.503410443 +
@@ -3804,7 +3804,7 @@ make_string_pack (tree value)
tree charvec;
tree argpack = make_node (NONTYPE_ARGUMENT_PACK);
const char *str = TREE_STRING_POINTER (value);
-  int sz = tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value;
+  int sz = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (TREE_TYPE (value;
int len = TREE_STRING_LENGTH (value) / sz - 1;
tree argvec = make_tree_vec (2);
  
Index: gcc/cp/vtable-class-hierarchy.c

=

Re: [wide-int] Tweak an rs6000 test

2013-11-20 Thread Kenneth Zadeck

looks fine to me.

kenny

On 11/20/2013 09:00 AM, Richard Sandiford wrote:

This test changes from TREE_INT_CST_LOW to TREE_INT_CST_ELT.  I was going
to change it back as part of the previous patch, but using wi:: seemed
more robust.

Only compile-tested so far because of problems with gcc110.  OK for wide-int?

Richard


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  2013-11-20 11:26:05.996351099 +
+++ gcc/config/rs6000/rs6000.c  2013-11-20 11:43:37.896818266 +
@@ -12131,16 +12131,14 @@ rs6000_expand_ternop_builtin (enum insn_
/* Check whether the 2nd and 3rd arguments are integer constants and in
 range and prepare arguments.  */
STRIP_NOPS (arg1);
-  if (TREE_CODE (arg1) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg1, 0), 0, 1))
+  if (TREE_CODE (arg1) != INTEGER_CST || wi::geu_p (arg1, 2))
{
  error ("argument 2 must be 0 or 1");
  return const0_rtx;
}
  
STRIP_NOPS (arg2);

-  if (TREE_CODE (arg2) != INTEGER_CST
- || !IN_RANGE (TREE_INT_CST_ELT (arg2, 0), 0, 15))
+  if (TREE_CODE (arg2) != INTEGER_CST || wi::geu_p (arg2, 16))
{
  error ("argument 3 must be in the range 0..15");
  return const0_rtx;





Re: [PATCH, PR 57363] IBM long double: adding qNaN and number raises inexact exception

2013-11-20 Thread Adhemerval Zanella
Hi,

This is a reformatted patch after Ulrich review:

---

libgcc/ChangeLog:

2013-11-20  Adhemerval Zanella  

* config/rs6000/ibm-ldouble.c (__gcc_qadd): Fix add
of normal number and qNaN to not raise an inexact exception.

gcc/testsuite/ChangeLog:

2013-11-20  Adhemerval Zanella  

* gcc.target/powerpc/pr57363.c: New test.

--

diff --git a/gcc/testsuite/gcc.target/powerpc/pr57363.c 
b/gcc/testsuite/gcc.target/powerpc/pr57363.c
new file mode 100644
index 000..45ea3f3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr57363.c
@@ -0,0 +1,19 @@
+/* { dg-do run { target { powerpc*-*-linux* } } } */
+/* { dg-options "-mlong-double-128" } */
+
+/* Check that adding a qNaN and a normal long double does not generate an
+   inexact exception.  */
+
+#define _GNU_SOURCE
+#include 
+
+int main(void)
+{
+  double x = __builtin_nan ("");
+  long double y = 1.1L;
+
+  feenableexcept (FE_INEXACT);
+  feclearexcept (FE_ALL_EXCEPT);
+  x = x + y;
+  return fetestexcept(FE_INEXACT);
+}
diff --git a/libgcc/config/rs6000/ibm-ldouble.c 
b/libgcc/config/rs6000/ibm-ldouble.c
index 28e02e9..7ca900c 100644
--- a/libgcc/config/rs6000/ibm-ldouble.c
+++ b/libgcc/config/rs6000/ibm-ldouble.c
@@ -104,6 +104,8 @@ __gcc_qadd (double a, double aa, double c, double cc)
 
   if (nonfinite (z))
 {
+  if (fabs (z) != inf())
+   return z;
   z = cc + aa + c + a;
   if (nonfinite (z))
return z;



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod  wrote:
> This has been bugging me since I moved it out of gimple.h to gimplify.h.
>
> The gimplify context structure was exposed in the header file to allow a few
> other files to push and pop contexts off the gimplification stack.
> Unfortunately, the struct contains a hash-table template, which means it
> also required hash-table.h in order to even parse the header.  No file other
> than gimplify.c actually uses what is in the structure, so exposing it seems
> wrong.
>
> This patch primarily changes push_gimplify_context () and
> pop_gimplify_context () to grab a struct gimplify_ctx off a local stack pool
> rather than have each file's local function know about the structure and
> allocate it on their stack.  I didn't want to use malloc/free or there would
> be a lot of churn.  It's open to debate what the best approach is, but I
> decided to simply allocate a bunch on a static array local to gimplify.c.
> If it goes past the end of the local array, malloc the required structure.
> It's pretty simplistic, but functional and doesn't require any GTY stuff or
> other machinery.
>
> I also hacked up the compiler to report what the 'top' of the stack was for
> a compilation unit. I then ran it through a bootstrap and full testsuite run
> of all languages, and looked for the maximum number of context structs in
> use at one time.  Normally only 1 or 2 were ever in use, but it topped out
> at 7 during some of the openmp tests.  So my current limit of 30 will likely
> never be reached.
>
> only 2 fields were ever set by anyone outside of gimplify.c, the 'into_ssa'
> and 'allow_rhs_cond_expr' fields, and they were always set
> immediately after pushing the context onto the stack... So I added them as
> parameters to the push and gave them default values.
>
> And that's it... this patch moves the hash function and structure completely
> within gimplify.c so no one needs to know anything about it, and removes the
> hash-table.h include prerequisite for parsing.
>
> Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
> mainline?

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Richard.



> Andrew
>
>


Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 02:59:21PM +0100, Richard Biener wrote:
> > --- a/gcc/c-family/c.opt
> > +++ b/gcc/c-family/c.opt
> > @@ -592,6 +592,10 @@ Wold-style-definition
> >  C ObjC Var(warn_old_style_definition) Warning
> >  Warn if an old-style parameter definition is used
> > 
> > +Wopenmp-simd
> > +C C++ Var(openmp_simd) Warning EnabledBy(Wall)

Please use Var(warn_openmp_simd) here.

> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2296,6 +2296,10 @@ fvect-cost-model=
> >  Common Joined RejectNegative Enum(vect_cost_model)
> > Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
> >  Specifies the cost model for vectorization
> > 
> > +fsimd-cost-model=
> > +Common Joined RejectNegative Enum(vect_cost_model)
> > Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
> > +Specifies the vectorization cost model for code marked with simd directive
> > +
> >  Enum
> >  Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
> > vectorizer cost model %qs)

I'd say you want to add
EnumValue
Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
here.

> > @@ -2929,6 +2929,13 @@ vect_estimate_min_profitable_iters
> > (loop_vec_info loop_vinfo,
> >/* vector version will never be profitable.  */
> >else
> >  {
> > +  if (LOOP_VINFO_LOOP (loop_vinfo)->force_vect)
> > +{
> > +  warning_at (LOOP_VINFO_LOC (loop_vinfo), OPT_Wopenmp_simd,
> > +  "Vectorization did not happen for "
> > +  "the loop labeled as simd.");

No {} around single stmt then body.  Also, diagnostic messages
don't start with a capital letter and don't end with dot.
So
"vectorization did not happen for "
"a simd loop"
or so.

> >  /* Return true if the vect cost model is unlimited.  */
> >  static inline bool
> > -unlimited_cost_model ()
> > +unlimited_cost_model (loop_p loop)
> >  {
> > -  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
> > +  return (flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED
> > +  || (loop != NULL
> > +  && loop->force_vect
> > +  && flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED));
> >  }

IMNSHO this should instead do:
  if (loop != NULL && loop->force_vect
  && flag_simd_cost_model != VECT_COST_MODEL_DEFAULT)
return flag_simd_cost_model == VECT_COST_MODEL_UNLIMITED;
  return flag_vect_cost_model == VECT_COST_MODEL_UNLIMITED;
so, if the user said -fsimd-cost-model=default, then it should honor
-fvect-cost-model.  And, IMHO that should be the default, but I don't
feel strongly about that.

Jakub


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jakub Jelinek
On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
> The limit looks reasonable, but you could have used a simple linked
> list (and never free).  Also being able to pop a random context
> looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Can't we use stack_vec for that?  Though that would
mean a global var constructor and destructor, so alternatively just use
a normal vec and .create(30) it somewhere during initialization?

Jakub


Re: patch PLUGIN_HEADER_FILE event for tracing of header inclusions.

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 10:57 PM, Basile Starynkevitch
 wrote:
> On Tue, 2013-11-19 at 20:33 +0100, Dominique Dhumieres wrote:
>> > Thanks: Committed revision 205038.
>>
>> This seems to break several g++ tests: see 
>> http://gcc.gnu.org/ml/gcc-testresults/2013-11/msg01482.html
>> On x86_64-apple-darwin13, the ICE is
>>
>> cc1: error: cannot load plugin ./one_time_plugin.so
>
> Actually, when we add hard coded plugin events we should never forget,
> like I just did, to edit invoke_plugin_callbacks_full.
>
> Diego and all, can I add the following patch to the trunk. One is just a
> comment inside plugin.def to recall that invoke_plugin_callbacks_full
> should be edited, and another is the tiny fix on that function in
> plugin.c
>
> ### gcc/ChangeLog entry
> 2013-11-19  Basile Starynkevitch  
>
> * plugin.def: Add comment about register_callback and
>   invoke_plugin_callbacks_full.
>
> * plugin.c (register_callback, invoke_plugin_callbacks_full):
>   Handle PLUGIN_INCLUDE_FILE event.
> ###
>
> Apologies for my mistake. Hope this will be approved quickly before 4.9
> stage 3! (I will later submit a documentation patch)

Ok.

Thanks,
Richard.

>
> BTW, I am not very happy about some events in plugin.def not being real
> events.  IMHO I would prefer that PLUGIN_PASS_MANAGER_SETUP,
> PLUGIN_REGISTER_GGC_ROOTS, PLUGIN_REGISTER_GGC_CACHES be removed,
> because they are not real events (and they should have their own public
> functions in plugin.h, e.g. add functions like
> plugin_pass_manager_setup, plugin_register_ggc_roots,
> plugin_register_ggc_caches in gcc/gcc-plugin.h). Unfortunately, we don't
> have time for such an improvement for 4.9
>
> Regards.
> --
> Basile STARYNKEVITCH http://starynkevitch.net/Basile/
> email: basilestarynkevitchnet mobile: +33 6 8501 2359
> 8, rue de la Faiencerie, 92340 Bourg La Reine, France
> *** opinions {are only mine, sont seulement les miennes} ***
>


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek  wrote:
> On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
>> The limit looks reasonable, but you could have used a simple linked
>> list (and never free).  Also being able to pop a random context
>> looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.
>
> Can't we use stack_vec for that?  Though that would
> mean a global var constructor and destructor, so alternatively just use
> a normal vec and .create(30) it somewhere during initialization?

only with gimplify_context *, otherwise things will break during re-allocation.

Richard.

> Jakub


Re: Enable -fno-fat-lto-objects by default

2013-11-20 Thread Jan Hubicka
> On Wed, Nov 20, 2013 at 2:12 PM, Jan Hubicka  wrote:
> > Hi,
> > this is the version I committed - we need to introduce a var for 
> > -fuse-linker-plugin
> > to be able to check it.
> >
> > I apologize for the breakage.
> > Honza
> >
> > * opts.c (finish_options): Imply -ffat-lto-objects with 
> > -fno-use-linker-plugin.
> > * common.opt (fuse-linker-plugin): Add var.
> > Index: opts.c
> 
> Hmm, but that won't help for
> 
> gcc -c t.c -flto
> gcc t.o -flto -fno-use-linker-plugin
> 
> Isn't that the style in which we invoke LTO tests?  (OK, we put in a useless
> -fno-use-linker-plugin at the compile stage, too)

We put -fno-use-linker-plugin everywhere, and I think that is what people
would do, too - just drop in -flto -fno-use-linker-plugin as new
optimization options.

> 
> So in the end it'll be too fragile for users IMHO.  I think we should
> disallow -fno-use-linker-plugin (if we decided to enable the plugin by default,
> then it won't fall back to no linker plugin but will error out if we replace ld
> with something too old or remove the plugin, right?).

On plugin-enabled setups, I think the only value of -fno-use-linker-plugin
is to allow us to test the other code path.
Perhaps we can declare it undocumented and for internal use only?

Honza
> 
> Richard.
> 
> >===
> > --- opts.c  (revision 205108)
> > +++ opts.c  (working copy)
> > @@ -809,10 +809,13 @@ finish_options (struct gcc_options *opts
> >  #else
> >error_at (loc, "LTO support has not been enabled in this 
> > configuration");
> >  #endif
> > -  if (!opts->x_flag_fat_lto_objects && !HAVE_LTO_PLUGIN)
> > +  if (!opts->x_flag_fat_lto_objects
> > + && (!HAVE_LTO_PLUGIN
> > + || (opts_set->x_flag_use_linker_plugin
> > + && !opts->x_flag_use_linker_plugin)))
> > {
> >   if (opts_set->x_flag_fat_lto_objects)
> > -error_at (loc, "-fno-fat-lto-objects are supported only with 
> > linker plugin.");
> > +error_at (loc, "-fno-fat-lto-objects are supported only with 
> > linker plugin");
> >   opts->x_flag_fat_lto_objects = 1;
> > }
> >  }
> > Index: common.opt
> > ===
> > --- common.opt  (revision 205108)
> > +++ common.opt  (working copy)
> > @@ -2247,7 +2247,7 @@ Common Negative(fuse-ld=bfd)
> >  Use the gold linker instead of the default linker
> >
> >  fuse-linker-plugin
> > -Common Undocumented
> > +Common Undocumented Var(flag_use_linker_plugin)
> >
> >  ; Positive if we should track variables, negative if we should run
> >  ; the var-tracking pass only to discard debug annotations, zero if


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 09:12 AM, Richard Biener wrote:

On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod  wrote:

This has been bugging me since I moved it out of gimple.h to gimplify.h.

The gimplify context structure was exposed in the header file to allow a few
other files to push and pop contexts off the gimplification stack.
Unfortunately, the struct contains a hash-table template, which means it
also required hash-table.h in order to even parse the header.  No file other
than gimplify.c actually uses what is in the structure, so exposing it seems
wrong.

This patch primarily changes push_gimplify_context () and
pop_gimplify_context () to grab a struct gimplify_ctx off a local stack pool
rather than have each file's local function know about the structure and
allocate it on their stack.  I didn't want to use malloc/free or there would
be a lot of churn.  It's open to debate what the best approach is, but I
decided to simply allocate a bunch on a static array local to gimplify.c.
If it goes past the end of the local array, malloc the required structure.
It's pretty simplistic, but functional and doesn't require any GTY stuff or
other machinery.

I also hacked up the compiler to report what the 'top' of the stack was for
a compilation unit. I then ran it through a bootstrap and full testsuite run
of all languages, and looked for the maximum number of context structs in
use at one time.  Normally only 1 or 2 were ever in use, but it topped out
at 7 during some of the openmp tests.  So my current limit of 30 will likely
never be reached.

only 2 fields were ever set by anyone outside of gimplify.c, the 'into_ssa'
and 'allow_rhs_cond_expr' fields, and they were always set
immediately after pushing the context onto the stack... So I added them as
parameters to the push and gave them default values.

And that's it... this patch moves the hash function and structure completely
within gimplify.c so no one needs to know anything about it, and removes the
hash-table.h include prerequisite for parsing.

Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
mainline?

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.


The argument to pop_gimplify_context isn't a context pointer; it's a
gimple statement, and if present, declare_vars is called on the temps in
the body.  The routine always pops the current context, so that doesn't
seem too fragile?


I had considered just a linked list and never freeing, but thought that
policy might be frowned upon and trigger false positives in memory
allocation analysis tools  :-)   I'm happy to do that though; it's quite
trivial.



Andrew




Re: [Patch, ARM] New feature to minimize the literal load for armv7-m target

2013-11-20 Thread Richard Earnshaw
On 06/11/13 06:10, Terry Guo wrote:
> Hi,
> 
> This patch intends to minimize the use of literal pools for some armv7-m
> targets that are slower at loading data from flash than at fetching
> instructions from flash. The normal literal load instruction is now replaced
> by MOVW/MOVT instructions. A new option -mslow-flash-data is created for
> this purpose. So far this feature doesn't support PIC code or targets that
> aren't based on armv7-m.
> 
> Tested with GCC regression test on QEMU for cortex-m3. No new regressions.
> Is it OK for trunk?
> 
> BR,
> Terry
> 
> 2013-11-06  Terry Guo  
> 
>  * doc/invoke.texi (-mslow-flash-data): Document new option.
>  * config/arm/arm.opt (mslow-flash-data): New option.
>  * config/arm/arm-protos.h
> (arm_max_const_double_inline_cost): Declare it.
>  * config/arm/arm.h (TARGET_USE_MOVT): Always true when
> disable literal pools.
literal pools are disabled.

>  (arm_disable_literal_pool): Declare it.
>  * config/arm/arm.c (arm_disable_literal_pool): New
> variable.
>  (arm_option_override): Handle new option.
>  (thumb2_legitimate_address_p): Invalid certain address
> format.

Invalidate.  What address formats?

>  (arm_max_const_double_inline_cost): New function.
>  * config/arm/arm.md (types.md): Include it a little
> earlier.

Include it before ...

>  (use_literal_pool): New attribute.
>  (enabled): Use new attribute.
>  (split pattern): Replace symbol+offset with MOVW/MOVT.
> 
> 

Comments inline.

> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index 1781b75..25927a1 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -554,6 +556,9 @@ extern int arm_arch_thumb_hwdiv;
> than core registers.  */
>  extern int prefer_neon_for_64bits;
>  
> +/* Nonzero if shouldn't use literal pool in generated code.  */
'if we shouldn't use literal pools'

> +extern int arm_disable_literal_pool;

This should be a bool, values stored in it should be true/false not 1/0.

> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index 78554e8..de2a9c0 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -864,6 +864,9 @@ int arm_arch_thumb_hwdiv;
> than core registers.  */
>  int prefer_neon_for_64bits = 0;
>  
> +/* Nonzero if shouldn't use literal pool in generated code.  */
> +int arm_disable_literal_pool = 0;

Similar comments to above.

> @@ -6348,6 +6361,25 @@ thumb2_legitimate_address_p (enum machine_mode mode, 
> rtx x, int strict_p)
> && thumb2_legitimate_index_p (mode, xop0, strict_p)));
>  }
>  
> +  /* Normally we can assign constant values to its target register without
'to target registers'

> + the help of constant pool.  But there are cases we have to use constant
> + pool like:
> + 1) assign a label to register.
> + 2) sign-extend a 8bit value to 32bit and then assign to register.
> +
> + Constant pool access in format:
> + (set (reg r0) (mem (symbol_ref (".LC0"
> + will cause the use of literal pool (later in function arm_reorg).
> + So here we mark such format as an invalid format, then compiler
'then the compiler'

> @@ -16114,6 +16146,18 @@ push_minipool_fix (rtx insn, HOST_WIDE_INT address, 
> rtx *loc,
>minipool_fix_tail = fix;
>  }
>  
> +/* Return maximum allowed cost of synthesizing a 64-bit constant VAL inline.
> +   Returns 99 if we always want to synthesize the value.  */

Needs to mention that the cost is in terms of 'insns' (see the function
below it).

> +int
> +arm_max_const_double_inline_cost ()
> +{
> +  /* Let the value get synthesized to avoid the use of literal pools.  */
> +  if (arm_disable_literal_pool)
> +return 99;
> +
> +  return ((optimize_size || arm_ld_sched) ? 3 : 4);
> +}
> +
>  /* Return the cost of synthesizing a 64-bit constant VAL inline.
> Returns the number of insns needed, or 99 if we don't know how to
> do it.  */

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index adbc45b..a5991cb 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -534,6 +534,7 @@ Objective-C and Objective-C++ Dialects}.
>  -mfix-cortex-m3-ldrd @gol
>  -munaligned-access @gol
>  -mneon-for-64bits @gol
> +-mslow-flash-data @gol
>  -mrestrict-it}
>  
>  @emph{AVR Options}
> @@ -12295,6 +12296,12 @@ Enables using Neon to handle scalar 64-bits 
> operations. This is
>  disabled by default since the cost of moving data from core registers
>  to Neon is high.
>  
> +@item -mslow-flash-data
> +@opindex mslow-flash-data
> +Assume loading data from flash is slower than fetching instruction.
> +Therefore literal load is minimized for better performance.
> +This option is off by default.
> +

Needs to mention the limitation on which processors can support this, ie
only v7 m-profile.


R.



[Patchv2] Add slim-lto support to gcc's build machinery

2013-11-20 Thread Markus Trippelsdorf
On 2013.11.20 at 14:41 +0100, Paolo Bonzini wrote:
> On 20/11/2013 08:23, Markus Trippelsdorf wrote:
> > Hi,
> > 
> > now that slim-lto objects are enabled by default, it would be nice to
> > use them when building gcc with bootstrap-lto. The following patch
> > implements the handling of these objects in gcc's build machinery.
> > (Once -fuse-linker-plugin is made the default, -ffat-lto-objects in
> > config/bootstrap-lto.mk could be dropped.)
> > 
> > LTO-Bootstrapped on x86_64-linux (with default -fuse-linker-plugin).
> > The patch was already approved by Paolo Bonzini.
> > 
> > Please apply, because I don't have access.
> 
> Note that you need to regenerate all users of libtool.m4.  Please post a
> patch _with_ the regeneration so that whoever applies it won't screw up.

Thanks. Let's hope this one is OK.

2013-11-20  Markus Trippelsdorf  

* boehm-gc/configure: Regenerate.
* gcc/configure: Likewise.
* libatomic/configure: Likewise.
* libbacktrace/configure: Likewise.
* libcilkrts/configure: Likewise.
* libffi/configure: Likewise.
* libgfortran/configure: Likewise.
* libgomp/configure: Likewise.
* libitm/configure: Likewise.
* libjava/classpath/configure: Likewise.
* libjava/configure: Likewise.
* libobjc/configure: Likewise.
* libquadmath/configure: Likewise.
* libsanitizer/configure: Likewise.
* libssp/configure: Likewise.
* libstdc++-v3/configure: Likewise.
* libtool.m4: Handle slim-lto objects.
* libvtv/configure: Regenerate.
* ltmain.sh: Handle lto options.
* lto-plugin/configure: Regenerate.
* zlib/configure: Likewise.


diff --git a/boehm-gc/configure b/boehm-gc/configure
index 025003cac135..3c57697e0854 100755
--- a/boehm-gc/configure
+++ b/boehm-gc/configure
@@ -6555,6 +6555,7 @@ for ac_symprfx in "" "_"; do
   else
 lt_cv_sys_global_symbol_pipe="sed -n -e 's/^.*[ 
]\($symcode$symcode*\)[ ][  
]*$ac_symprfx$sympat$opt_cr$/$symxfrm/p'"
   fi
+  lt_cv_sys_global_symbol_pipe="$lt_cv_sys_global_symbol_pipe | sed '/ 
__gnu_lto/d'"
 
   # Check to see that the pipe works correctly.
   pipe_works=no
@@ -9081,7 +9082,7 @@ _LT_EOF
   if $LD --help 2>&1 | $EGREP ': supported targets:.* elf' > /dev/null \
 && test "$tmp_diet" = no
   then
-   tmp_addflag=
+   tmp_addflag=' $pic_flag'
tmp_sharedflag='-shared'
case $cc_basename,$host_cpu in
 pgcc*) # Portland Group C compiler
@@ -11318,7 +11319,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11321 "configure"
+#line 11322 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -11424,7 +11425,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 11427 "configure"
+#line 11428 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -12019,8 +12020,8 @@ with_gnu_ld=$lt_cv_prog_gnu_ld
   # Check if GNU C++ uses GNU ld as the underlying linker, since the
   # archiving commands below assume that GNU ld is being used.
   if test "$with_gnu_ld" = yes; then
-archive_cmds_CXX='$CC -shared -nostdlib $predep_objects $libobjs 
$deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o $lib'
-archive_expsym_cmds_CXX='$CC -shared -nostdlib $predep_objects 
$libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname 
${wl}-retain-symbols-file $wl$export_symbols -o $lib'
+archive_cmds_CXX='$CC $pic_flag -shared -nostdlib $predep_objects 
$libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname $wl$soname -o 
$lib'
+archive_expsym_cmds_CXX='$CC $pic_flag -shared -nostdlib 
$predep_objects $libobjs $deplibs $postdep_objects $compiler_flags ${wl}-soname 
$wl$soname ${wl}-retain-symbols-file $wl$export_symbols -o $lib'
 
 hardcode_libdir_flag_spec_CXX='${wl}-rpath ${wl}$libdir'
 export_dynamic_flag_spec_CXX='${wl}--export-dynamic'
@@ -13027,6 +13028,13 @@ private:
 };
 _LT_EOF
 
+
+_lt_libdeps_save_CFLAGS=$CFLAGS
+case "$CC $CFLAGS " in #(
+*\ -flto*\ *) CFLAGS="$CFLAGS -fno-lto" ;;
+*\ -fwhopr*\ *) CFLAGS="$CFLAGS -fno-whopr" ;;
+esac
+
 if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_compile\""; } >&5
   (eval $ac_compile) 2>&5
   ac_status=$?
@@ -13077,6 +13085,7 @@ if { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: 
\"$ac_compile\""; } >&5
fi
;;
 
+*.lto.$objext) ;; # Ignore GCC LTO objects
 *.$objext)
# This assumes that the test object file only shows up
# once in the compiler output.
@@ -13112,6 +13121,7 @@ else
 fi
 
 $RM -f confest.$objext
+CFLAGS=$_lt_libdeps_save_CFLAGS
 
 # PORTME: override above test on systems where it is broken
 case $host_os in
diff --git a/gcc/configure b/gcc/configure
index c9bbd653

Re: [gomp4 simd, RFC] Simple fix to override vectorization cost estimation.

2013-11-20 Thread Sergey Ostanevich
Updated as per Richard's and Jakub's feedback - the simd-cost-model is now
unlimited by default.
Richard - were you OK with it?

Sergos

* common.opt: Add new option -fsimd-cost-model.
* tree-vectorizer.h (unlimited_cost_model): Interface update
to rely on particular loop info.
* tree-vect-data-refs.c (vect_peeling_hash_insert): Update to
unlimited_cost_model call according to new interface.
(vect_peeling_hash_choose_best_peeling): Ditto.
(vect_enhance_data_refs_alignment): Ditto.
* tree-vect-slp.c: Ditto.
* tree-vect-loop.c (vect_estimate_min_profitable_iters): Ditto,
plus issue a warning in case cost model overrides users' directive.
* c.opt: Add openmp-simd warning.
* lang.opt: Ditto.
* doc/invoke.texi: Document new openmp-simd warning.



diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 0026683..6173013 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -592,6 +592,10 @@ Wold-style-definition
 C ObjC Var(warn_old_style_definition) Warning
 Warn if an old-style parameter definition is used

+Wopenmp-simd
+C C++ Var(warn_openmp_simd) Warning EnabledBy(Wall)
+Warn if the simd directive is overridden by the vectorizer cost model
+
 Woverlength-strings
 C ObjC C++ ObjC++ Var(warn_overlength_strings) Warning
LangEnabledBy(C ObjC C++ ObjC++,Wpedantic)
 Warn if a string is longer than the maximum portable length specified
by the standard
diff --git a/gcc/common.opt b/gcc/common.opt
index d5971df..6a40a5d 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2296,10 +2296,17 @@ fvect-cost-model=
 Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_vect_cost_model) Init(VECT_COST_MODEL_DEFAULT)
 Specifies the cost model for vectorization

+fsimd-cost-model=
+Common Joined RejectNegative Enum(vect_cost_model)
Var(flag_simd_cost_model) Init(VECT_COST_MODEL_UNLIMITED)
+Specifies the vectorization cost model for code marked with simd directive
+
 Enum
 Name(vect_cost_model) Type(enum vect_cost_model) UnknownError(unknown
vectorizer cost model %qs)

 EnumValue
+Enum(vect_cost_model) String(default) Value(VECT_COST_MODEL_DEFAULT)
+
+EnumValue
 Enum(vect_cost_model) String(unlimited) Value(VECT_COST_MODEL_UNLIMITED)

 EnumValue
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index c250385..050bd44 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -256,7 +256,7 @@ Objective-C and Objective-C++ Dialects}.
 -Wlogical-op -Wlong-long @gol
 -Wmain -Wmaybe-uninitialized -Wmissing-braces
-Wmissing-field-initializers @gol
 -Wmissing-include-dirs @gol
--Wno-multichar  -Wnonnull  -Wno-overflow @gol
+-Wno-multichar  -Wnonnull  -Wno-overflow -Wopenmp-simd @gol
 -Woverlength-strings  -Wpacked  -Wpacked-bitfield-compat  -Wpadded @gol
 -Wparentheses  -Wpedantic-ms-format -Wno-pedantic-ms-format @gol
 -Wpointer-arith  -Wno-pointer-to-int-cast @gol
@@ -3318,6 +3318,7 @@ Options} and @ref{Objective-C and Objective-C++
Dialect Options}.
 -Wmaybe-uninitialized @gol
 -Wmissing-braces @r{(only for C/ObjC)} @gol
 -Wnonnull  @gol
+-Wopenmp-simd @gol
 -Wparentheses  @gol
 -Wpointer-sign  @gol
 -Wreorder   @gol
@@ -4804,6 +4805,10 @@ attribute.
 @opindex Woverflow
 Do not warn about compile-time overflow in constant expressions.

+@item -Wopenmp-simd
+@opindex Wopenmp-simd
+Warn if the vectorizer cost model overrides the simd directive from the user.
+
 @item -Woverride-init @r{(C and Objective-C only)}
 @opindex Woverride-init
 @opindex Wno-override-init
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5e09cbd..b43c48c 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -257,6 +257,10 @@ Wintrinsics-std
 Fortran Warning
 Warn on intrinsics not part of the selected standard

+Wopenmp-simd
+Fortran Warning
+; Documented in C
+
 Wreal-q-constant
 Fortran Warning
 Warn about real-literal-constants with 'q' exponent-letter
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 83d1f45..977db43 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -1090,7 +1090,8 @@ vect_peeling_hash_insert (loop_vec_info
loop_vinfo, struct data_reference *dr,
   *new_slot = slot;
 }

-  if (!supportable_dr_alignment && unlimited_cost_model ())
+  if (!supportable_dr_alignment
+  && unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
 slot->count += VECT_MAX_COST;
 }

@@ -1200,7 +1201,7 @@ vect_peeling_hash_choose_best_peeling
(loop_vec_info loop_vinfo,
res.peel_info.dr = NULL;
res.body_cost_vec = stmt_vector_for_cost ();

-   if (!unlimited_cost_model ())
+   if (!unlimited_cost_model (LOOP_VINFO_LOOP (loop_vinfo)))
  {
res.inside_cost = INT_MAX;
res.outside_cost = INT_MAX;
@@ -1429,7 +1430,7 @@ vect_enhance_data_refs_alignment (loop_vec_info
loop_vinfo)
  vectorization factor.
  We do this automtically for cost model, since we
calculate cost
  for every peeling option.  */

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Trevor Saunders
On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
> On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek  wrote:
> > On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
> >> The limit looks reasonable, but you could have used a simple linked
> >> list (and never free).  Also being able to pop a random context
> >> looks fragile ...  that is, pop_gimplify_context shouldn't have an 
> >> argument.
> >
> > Can't we use stack_vec for that?  Though that would
> > mean a global var constructor and destructor, so alternatively just use
> > a normal vec and .create(30) it somewhere during initialization?
> 
> only with gimplify_context *, otherwise things will break during 
> re-allocation.

Hm? It seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer, which presumably could go away if you have a
vec of all the contexts.

Trev

> 
> Richard.
> 
> > Jakub


Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Trevor Saunders
On Wed, Nov 20, 2013 at 08:43:57AM -0500, Diego Novillo wrote:
> On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou  wrote:
> >> Thanks. Committed as rev 205023.
> >
> > Nice work, but why did you antedate the entries in the various ChangeLog
> 
> Oh, that's because of local commits and holding on to the patch for a
> few days. That date is the date of the original local commit. Further
> pulls from upstream pushed the entries down in the file. I just forgot
> to move the entries forward when I committed the patch.
> 
> > directories (yes, we all know your opinion about ChangeLog files ;-)
> 
> What gave you that idea? ;)

 This even seems like a good argument for that view: git log (and I
 assume svn log) could give you the same data without the possibility of
 this sort of issue ;)

 Trev

> 
> 
> Diego.


Re: [RFC] [PATCH, AARCH64] Machine descriptions to support stack smashing protection

2013-11-20 Thread Venkataramanan Kumar
Hi Joseph,

On 19 November 2013 21:53, Joseph S. Myers  wrote:
> On Tue, 19 Nov 2013, Jakub Jelinek wrote:
>
>> On Tue, Nov 19, 2013 at 04:30:21PM +0530, Venkataramanan Kumar wrote:
>> > This is RFC patch that adds machine descriptions to support stack
>> > smashing protection in AArch64.
>>
>> Most of the testsuite changes look wrong.  The fact that aarch64
>> gets stack protector support doesn't mean all other targets do as well.
>> So please leave all the changes that remove native or stack_protector
>> requirements out.
>
> The tests for "target native" look wrong for me ("native" conditionals
> only make sense for special cases such as guality tests that expect to
> exec another tool on the host) - so I think they should be removed, but
> that removal needs submitting as a separate patch.
>

Yes, apologies for that; I will send another patch.

> I would like to see a clear description of what happens with all eight
> combinations of (glibc version does or doesn't support this, GCC building
> glibc does or doesn't support this, GCC building user program does or
> doesn't support this).  Which of the (GCC building glibc, glibc)
> combinations will successfully build glibc?  Will all such glibcs be
> binary-compatible?  Will both old and new GCC work for building user
> programs with both old and new glibc?

Can you please clarify why we need to consider "the GCC compiler that
builds glibc" in the combinations you want me to describe? I am
not able to understand that.

I tested the three GCC compiler versions below for building user programs.

1) GCC trunk (without my patch) generates code for "stack protection
set/test" when -fstack-protector-all is enabled.
This is based on access to the global variable "__stack_chk_guard", which
glibc supports from version 2.4.

-snip-
 adrpx0, __stack_chk_guard
 add x0, x0, :lo12:__stack_chk_guard
 ldr x0, [x0]
 str x0, [x29,24]
-snip-

GCC emits generic code for stack protection, enabled for targets where the
frame grows downwards. The patch making the AArch64 frame grow downwards
was recently committed to trunk.

2) Now, with the patch, GCC generates TLS-based access.

-snip-
   mrs x0, tpidr_el0
   ldr x1, [x0,-8]
   str x1, [x29,24]
-snip-

I need to check whether the target glibc for AArch64 has TLS-based SSP
support. Currently some targets check for glibc >= 2.4 and generate
TLS-based access.

3) The GCC Linaro compiler packaged with the OpenEmbedded image, which I
boot on the ARMv8 Foundation Model, as I don't have access to AArch64
hardware yet.

It does not emit any stack protection code:
test.c:1:0: warning: -fstack-protector not supported for this target
[enabled by default]

regards,
Venkat.


Re: Committed: gdbhooks.py: reorganization to support regex-matching and typedefs

2013-11-20 Thread Tom Tromey
> "David" == David Malcolm  writes:

David> I've committed the following to trunk as r205085, as a mixture of
David> enabling work towards gdb being able to print vec<>, and to fix issues
David> where the string-matching logic got confused by typedefs (done in a
David> rather crude way, by supporting multiple strings in the match, but it
David> works).

FWIW for the libstdc++ printers what we do is strip typedefs and
qualifiers to find the type whose name we want to match:

# Get the unqualified type, stripped of typedefs.
type = type.unqualified ().strip_typedefs ()

Tom


Re: [wide-int] Undo some differences with trunk

2013-11-20 Thread Richard Sandiford
Kenneth Zadeck  writes:
> I would leave out the last frag of tree.c.  It actually gets rid of what 
> i consider, a latent bug. it does not show up with the current 
> implementation of wide-int, but if that implementation changed, it 
> would.   The problem is that you really should not expect that you can 
> set the min or max value for a type without first setting the precision 
> of that type.

Ah, OK.  I should have realised there was more to it.

> Aside from that, this looks fine.

Thanks, applied without that hunk.

Richard



[wide-int] Remove cst_fits_*

2013-11-20 Thread Richard Sandiford
FWIW this is the last change I had planned.  I thought it would be easier
to do once the host_integerp replacement was in mainline, but that means
it's a bit closer to the deadline than it should have been.

The branch adds two new functions, cst_fits_uhwi_p and cst_fits_shwi_p,
alongside the tree_fits_* variants.  AFAICT, all cst_fits_uhwi_p tests
replace trunk tests of the form:

TREE_CODE (x) == INTEGER_CST && TREE_INT_CST_HIGH (x) == 0

But this is the same as host_integerp (x, 1), now tree_fits_uhwi_p,
so I think we should just use that instead.  FWIW host_integerp was:

int
host_integerp (const_tree t, int pos)
{
  if (t == NULL_TREE)
return 0;

  return (TREE_CODE (t) == INTEGER_CST
  && ((TREE_INT_CST_HIGH (t) == 0
   && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) >= 0)
  || (! pos && TREE_INT_CST_HIGH (t) == -1
  && (HOST_WIDE_INT) TREE_INT_CST_LOW (t) < 0
  && !TYPE_UNSIGNED (TREE_TYPE (t)))
  || (pos && TREE_INT_CST_HIGH (t) == 0)));
}

which if you fold in "pos = 1" reduces to current trunk's:

bool
tree_fits_uhwi_p (const_tree t)
{
  return (t != NULL_TREE
  && TREE_CODE (t) == INTEGER_CST
  && TREE_INT_CST_HIGH (t) == 0);
}

cst_fits_shwi_p replaces cst_and_fits_in_hwi, but if cst_fits_uhwi_p
goes away then I think we might as well stick with the original name.
(We're already keeping the associated extraction function, int_cst_value.)

I did a bit of wide-intification of write_integer_cst, which showed
up a missing conversion in the abs routine (my fault).

Tested on x86_64-linux-gnu.  OK for wide-int?

Thanks,
Richard


Index: gcc/ada/gcc-interface/utils.c
===
--- gcc/ada/gcc-interface/utils.c   2013-11-20 14:00:54.151599452 +
+++ gcc/ada/gcc-interface/utils.c   2013-11-20 15:29:17.483262547 +
@@ -6065,7 +6065,7 @@ handle_novops_attribute (tree *node, tre
 get_nonnull_operand (tree arg_num_expr, unsigned HOST_WIDE_INT *valp)
 {
   /* Verify the arg number is a constant.  */
-  if (!cst_fits_uhwi_p (arg_num_expr))
+  if (!tree_fits_uhwi_p (arg_num_expr))
 return false;
 
   *valp = TREE_INT_CST_LOW (arg_num_expr);
Index: gcc/c-family/c-common.c
===
--- gcc/c-family/c-common.c 2013-11-20 14:00:54.151599452 +
+++ gcc/c-family/c-common.c 2013-11-20 15:29:17.486262535 +
@@ -8768,7 +8768,7 @@ check_nonnull_arg (void * ARG_UNUSED (ct
 get_nonnull_operand (tree arg_num_expr, unsigned HOST_WIDE_INT *valp)
 {
   /* Verify the arg number is a small constant.  */
-  if (cst_fits_uhwi_p (arg_num_expr))
+  if (tree_fits_uhwi_p (arg_num_expr))
 {
   *valp = TREE_INT_CST_LOW (arg_num_expr);
   return true;
Index: gcc/c-family/c-format.c
===
--- gcc/c-family/c-format.c 2013-11-20 14:00:54.151599452 +
+++ gcc/c-family/c-format.c 2013-11-20 15:29:17.487262531 +
@@ -227,7 +227,7 @@ check_format_string (tree fntype, unsign
 static bool
 get_constant (tree expr, unsigned HOST_WIDE_INT *value, int validated_p)
 {
-  if (!cst_fits_uhwi_p (expr))
+  if (!tree_fits_uhwi_p (expr))
 {
   gcc_assert (!validated_p);
   return false;
Index: gcc/config/sol2-c.c
===
--- gcc/config/sol2-c.c 2013-11-20 14:00:54.151599452 +
+++ gcc/config/sol2-c.c 2013-11-20 15:29:17.433262746 +
@@ -96,7 +96,7 @@ solaris_pragma_align (cpp_reader *pfile
 }
 
   low = TREE_INT_CST_LOW (x);
-  if (!cst_fits_uhwi_p (x)
+  if (!tree_fits_uhwi_p (x)
   || (low != 1 && low != 2 && low != 4 && low != 8 && low != 16
  && low != 32 && low != 64 && low != 128))
 {
Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c 2013-11-20 14:00:54.151599452 +
+++ gcc/cp/mangle.c 2013-11-20 15:29:17.434262742 +
@@ -1505,8 +1505,8 @@ write_number (unsigned HOST_WIDE_INT num
 write_integer_cst (const tree cst)
 {
   int sign = tree_int_cst_sgn (cst);
-
-  if (!cst_fits_shwi_p (cst))
+  widest_int abs_value = wi::abs (wi::to_widest (cst));
+  if (!wi::fits_uhwi_p (abs_value))
 {
   /* A bignum. We do this in chunks, each of which fits in a
 HOST_WIDE_INT.  */
@@ -1559,14 +1559,9 @@ write_integer_cst (const tree cst)
   else
 {
   /* A small num.  */
-  unsigned HOST_WIDE_INT low = TREE_INT_CST_LOW (cst);
-
   if (sign < 0)
-   {
- write_char ('n');
- low = -low;
-   }
-  write_unsigned_number (low);
+   write_char ('n');
+  write_unsigned_number (abs_value.to_uhwi ());
 }
 }
 
Index: gcc/dwarf2out.c
===
--- gcc/dwarf2out.c 2013-11-20 15:27:30.843688771 +
+++ gcc/dwarf2out.c 2013-11-20 15:29:17.438262726 +000

[PATCH, ia64] [PR target/52731] internal compiler error: in ia64_st_address_bypass_p, at config/ia64/ia64.c:9357

2013-11-20 Thread Kirill Yukhin
Hello,
Patch in the bottom fixes PR52731.

The essence of the problem is that `ia64_single_set'
allows a double set as an exception for `prologue_allocate_stack'
and `epilogue_deallocate_stack'.

However, it does not apply this exception to the predicated
versions of the patterns. I introduce explicit predicated
versions of the insns and use their codes in the routine.

The test passes with the patch applied and fails without it.
Bootstrap passes (for all languages).

ChangeLog entry:
2013-11-20  Kirill Yukhin  

* config/ia64/ia64.md (prologue_allocate_stack): Block auto-
generation of predicated version.
(epilogue_deallocate_stack): Ditto.
(prologue_allocate_stack_pr): Add explicit predicated version.
(epilogue_deallocate_stack_pr): Ditto.

testsuite/ChangeLog entry:
2013-11-20  Kirill Yukhin  

* gcc.target/ia64/pr52731.c: New. 

Is it ok for trunk?

--
Thanks, K

 gcc/config/ia64/ia64.c  |2 +
 gcc/config/ia64/ia64.md |   41 +-
 gcc/testsuite/gcc.target/ia64/pr52731.c |   19 ++
 3 files changed, 60 insertions(+), 2 deletions(-)

diff --git a/gcc/config/ia64/ia64.c b/gcc/config/ia64/ia64.c
index 4fde7aa..6ce7154 100644
--- a/gcc/config/ia64/ia64.c
+++ b/gcc/config/ia64/ia64.c
@@ -7146,7 +7146,9 @@ ia64_single_set (rtx insn)
   switch (recog_memoized (insn))
 {
 case CODE_FOR_prologue_allocate_stack:
+case CODE_FOR_prologue_allocate_stack_pr:
 case CODE_FOR_epilogue_deallocate_stack:
+case CODE_FOR_epilogue_deallocate_stack_pr:
   ret = XVECEXP (x, 0, 0);
   break;
 
diff --git a/gcc/config/ia64/ia64.md b/gcc/config/ia64/ia64.md
index 4d9d4e0..bc4e8cb 100644
--- a/gcc/config/ia64/ia64.md
+++ b/gcc/config/ia64/ia64.md
@@ -4652,6 +4652,8 @@
 
 ;; This prevents the scheduler from moving the SP decrement past FP-relative
 ;; stack accesses.  This is the same as adddi3 plus the extra set.
+;; An explicit predicated version of the insn is needed for the CODE_FOR_ check
+;; in ia64_single_set, where despite having 2 sets this define_insn is OK.
 
 (define_insn "prologue_allocate_stack"
   [(set (match_operand:DI 0 "register_operand" "=r,r,r")
@@ -4664,10 +4666,31 @@
add %0 = %1, %2
adds %0 = %2, %1
addl %0 = %2, %1"
-  [(set_attr "itanium_class" "ialu")])
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
+
+(define_insn "prologue_allocate_stack_pr"
+  [(cond_exec (match_operator 0 ("predicate_operator")
+[(match_operand:BI 1 ("register_operand") ("c,c,c"))
+ (const_int 0)])
+ (parallel
+[(set (match_operand:DI 2 "register_operand" "=r,r,r")
+  (plus:DI (match_operand:DI 3 "register_operand" "%r,r,a")
+   (match_operand:DI 4 "gr_reg_or_22bit_operand" 
"r,I,J")))
+ (set (match_operand:DI 5 "register_operand" "+r,r,r")
+  (match_dup 5))]))]
+  ""
+  "@
+   (%J0) add %2 = %3, %4
+   (%J0) adds %2 = %3, %4
+   (%J0) addl %2 = %3, %4"
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
 
 ;; This prevents the scheduler from moving the SP restore past FP-relative
 ;; stack accesses.  This is similar to movdi plus the extra set.
+;; An explicit predicated version of the insn is needed for the CODE_FOR_ check
+;; in ia64_single_set, where despite having 2 sets this define_insn is OK.
 
 (define_insn "epilogue_deallocate_stack"
   [(set (match_operand:DI 0 "register_operand" "=r")
@@ -4675,7 +4698,21 @@
(set (match_dup 1) (match_dup 1))]
   ""
   "mov %0 = %1"
-  [(set_attr "itanium_class" "ialu")])
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
+
+(define_insn "epilogue_deallocate_stack_pr"
+  [(cond_exec (match_operator 0 ("predicate_operator")
+[(match_operand:BI 1 ("register_operand") ("c"))
+ (const_int 0)])
+ (parallel
+[(set (match_operand:DI 2 "register_operand" "=r")
+  (match_operand:DI 3 "register_operand" "+r"))
+ (set (match_dup 3) (match_dup 3))]))]
+  ""
+  "(%J0) mov %2 = %3"
+  [(set_attr "itanium_class" "ialu")
+   (set_attr "predicable" "no")])
 
 ;; As USE insns aren't meaningful after reload, this is used instead
 ;; to prevent deleting instructions setting registers for EH handling
diff --git a/gcc/testsuite/gcc.target/ia64/pr52731.c 
b/gcc/testsuite/gcc.target/ia64/pr52731.c
new file mode 100644
index 000..50ef1d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/ia64/pr52731.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target ia64-*-* } } */
+/* { dg-options "-O2" } */
+
+char* area;
+long int area_size;
+char* base;
+
+void fun(unsigned long int addr)
+{
+  unsigned long int size32 = (addr + 4096 - 1) & ~(4096 - 1);
+  unsigned long int size = size32 * sizeof(unsigned int);
+
+  if (size > 0) {
+size = (size + 1) & ~(1);
+  }
+
+  area_size = size;
+  area = base + size;
+}

Re: Factor unrelated declarations out of tree.h (1/2)

2013-11-20 Thread Diego Novillo
On Wed, Nov 20, 2013 at 10:10 AM, Trevor Saunders  wrote:
> On Wed, Nov 20, 2013 at 08:43:57AM -0500, Diego Novillo wrote:
>> On Wed, Nov 20, 2013 at 6:44 AM, Eric Botcazou  wrote:
>> >> Thanks. Committed as rev 205023.
>> >
>> > Nice work, but why did you antedate the entries in the various ChangeLog
>>
>> Oh, that's because of local commits and holding on to the patch for a
>> few days. That date is the date of the original local commit. Further
>> pulls from upstream pushed the entries down in the file. I just forgot
>> to move the entries forward when I committed the patch.
>>
>> > directories (yes, we all know your opinion about ChangeLog files ;-)
>>
>> What gave you that idea? ;)
>
>  This even seems like a good argument for that view, git log and I
>  assume svn log could give you the same data without the possibility of
>  this sort of issue ;)

This is the main reason why I think ChangeLogs are absolutely
worthless. But I don't want to hijack this thread to discuss ChangeLog
politics.


Diego.


Re: [PATCH, MPX, 2/X] Pointers Checker. Add flag to varpool_node

2013-11-20 Thread Richard Biener
On Tue, Nov 19, 2013 at 8:09 PM, Jeff Law  wrote:
> On 11/19/13 07:06, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> Here is a patch to add flag for marking symbols as requiring static
>> initialization of bounds.  Used by Pointer Bounds Checker to handle
>> statically initialized pointers and static bounds vars.
>>
>> Thanks,
>> Ilya
>> --
>> 2013-11-19  Ilya Enkovich  
>>
>> * cgraph.h (varpool_node): Add need_bounds_init field.
>> * lto-cgraph.c (lto_output_varpool_node): Output
>> need_bounds_init value.
>> (input_varpool_node): Read need_bounds_init value.
>> * varpool.c (dump_varpool_node): Dump need_bounds_init field.
>
> OK.  This is fine to check in as long as any prerequisites are already in.

Please refrain from cluttering the tree with dead code.  Commit all patches
in one go once all patches are approved.

Thanks,
Richard.

> jeff
>


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 3:34 PM, Andrew MacLeod  wrote:
> On 11/20/2013 09:12 AM, Richard Biener wrote:
>>
>> On Wed, Nov 20, 2013 at 2:35 PM, Andrew MacLeod 
>> wrote:
>>>
>>> This has been bugging me since I moved it out of gimple.h to gimplify.h.
>>>
>>> The gimplify context structure was exposed in the header file to allow a
>>> few
>>> other files to push and pop contexts off the gimplification stack.
>>> Unfortunately, the struct contains a hash-table template, which means it
>>> also required hash-table.h in order to even parse the header.  No file
>>> other
>>> than gimplify.c actually uses what is in the structure, so exposing it
>>> seems
>>> wrong.
>>>
>>> This patch primarily changes push_gimplify_context () and
>>> pop_gimplify_context () to grab a struct gimplify_ctx off a local stack
>>> pool
>>> rather than have each file's local function know about the structure and
>>> allocate it on their stack.  I didn't want to use malloc/free or there
>>> would
>>> be a lot of churn.  Its open to debate what the best approach is, but I
>>> decided to simply allocate a bunch on a static array local to gimplify.c.
>>> If it goes past the end of the local array, malloc the required
>>> structure.
>>> Its pretty simplistic, but functional and doesn't required any GTY stuff
>>> or
>>> other machinery.
>>>
>>> I also hacked up the compiler to report what the 'top' of the stack was
>>> for
>>> a compilation unit. I then ran it through a bootstrap and full testsuite
>>> run
>>> of all languages, and looked for the maximum number of context structs in
>>> use at one time.  Normally only 1 or 2 were ever in use, but its topped
>>> out
>>> at 7 during some of the openmp tests.  So my current limit of 30 will
>>> likely
>>> never been reached.
>>>
>>> only 2 fields were ever set by anyone outside of gimplify.c, the
>>> 'into_ssa'
>>> and 'allow_rhs_cond_expr' fields, and they were always set
>>> immediately after pushing the context onto the stack... So I added them
>>> as
>>> parameters to the push and gave them default values.
>>>
>>> And thats it... this patch moves the hash function and structure
>>> completely
>>> within gimplify.c so no one needs to know anything about it, and removes
>>> the
>>> hash-table.h include prerequisite for parsing.
>>>
>>> Bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK for
>>> mainline?
>>
>> The limit looks reasonable, but you could have used a simple linked
>> list (and never free).  Also being able to pop a random context
>> looks fragile ...  that is, pop_gimplify_context shouldn't have an
>> argument.
>>
>>
> The argument to pop_gimplify_context isn't a context pointer, it's a gimple
> statement, and if present it calls declare_vars on the temps in the body... the
> routine always pops the current context.  So that doesn't seem too fragile?

Oh... I see.

> I had considered just a linked list and never freeing, but thought that
> policy might be frowned upon and trigger false positives on memory
> allocation analysis tools  :-)   I'm happy to do that though; it's quite
> trivial.

;)

Richard.

>
> Andrew
>
>


Re: [wwwdocs] Broken links

2013-11-20 Thread Philippe Baril Lecavalier


Embarrassing typos, my apologies. I was told to specify that I don't
have commit access, but since this distinction is irrelevant, I modified my
suggested notice in about.html to reflect that. See attached.

Index: projects/beginner.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/beginner.html,v
retrieving revision 1.57
diff -u -r1.57 beginner.html
--- projects/beginner.html	29 Dec 2012 00:24:56 -	1.57
+++ projects/beginner.html	20 Nov 2013 15:13:58 -
@@ -32,7 +32,7 @@
 http://gcc.gnu.org/onlinedocs/gccint/RTL.html";>RTL.
 It may help to understand the higher-level tree structure as
 well.  Unfortunately, for this we only have an http://gcc.gnu.org/onlinedocs/gccint/Trees.html";>incomplete, C/C++ specific manual.
+href="http://gcc.gnu.org/onlinedocs/gccint/GENERIC.html";>incomplete, C/C++ specific manual.
 
 Remember to keep other developers
 informed of any substantial projects you intend to work on.
Index: projects/documentation.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/documentation.html,v
retrieving revision 1.5
diff -u -r1.5 documentation.html
--- projects/documentation.html	19 Sep 2007 22:40:29 -	1.5
+++ projects/documentation.html	20 Nov 2013 15:13:58 -
@@ -56,7 +56,7 @@
 
 We've got quite a bit of this but it is scattered all over the
 place.  It belongs in the official manual.  There is a http://gcc.gnu.org/onlinedocs/gccint/Trees.html";>C/C++ specific manual,
+href="http://gcc.gnu.org/onlinedocs/gccint/GENERIC.html";>C/C++ specific manual,
 which is incomplete, though.  The file
 gcc/LANGUAGES contains incomplete and outdated information
 about changes made in not so recent years to the tree
Index: readings.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.233
diff -u -r1.233 readings.html
--- readings.html	31 Oct 2013 18:03:28 -	1.233
+++ readings.html	20 Nov 2013 15:13:58 -
@@ -401,7 +401,7 @@
 Objective-C Information
 
 
-  http://developer.apple.com/documentation/Cocoa/Conceptual/ObjectiveC/";>
+https://developer.apple.com/library/mac/documentation/cocoa/conceptual/ProgrammingWithObjectiveC/Introduction/Introduction.html";>
   Objective-C Language Description (Apple Computer)
 
 
Index: about.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/about.html,v
retrieving revision 1.19
diff -u -r1.19 about.html
--- about.html	18 Aug 2013 20:38:58 -	1.19
+++ about.html	20 Nov 2013 15:13:58 -
@@ -69,6 +69,14 @@
 list.
 
 
+Whether or not you have commit access, you must send an email to gcc-patches@gcc.gnu.org
+and CC ger...@pfeifer.com,
+with your patch as an attachment. Specify [wwwdocs] in the subject
+header and follow the instructions on http://gcc.gnu.org/contribute.html#patches";>patch
+submission, whenever applicable.
+
 
 The host system
 


Re: [PATCH] [AArch64] Fix over length lines around aarch64_save_or_restore_fprs

2013-11-20 Thread Marcus Shawcroft

On 19/11/13 16:50, Kyrill Tkachov wrote:

On 19/11/13 16:39, Marcus Shawcroft wrote:

Committed.

/Marcus

2013-11-19  Marcus Shawcroft  

* config/aarch64/aarch64.c (aarch64_save_or_restore_fprs): Fix over
length lines.



Minor nit... but ENOPATCH ;)

Kyrill



Oops, sorry.

Attached.
/Marcus

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cfda95e..4faa7cf 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1796,7 +1796,8 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
   unsigned regno;
   unsigned regno2;
   rtx insn;
-  rtx (*gen_mem_ref)(enum machine_mode, rtx) = (frame_pointer_needed)? gen_frame_mem : gen_rtx_MEM;
+  rtx (*gen_mem_ref)(enum machine_mode, rtx)
+= (frame_pointer_needed)? gen_frame_mem : gen_rtx_MEM;
 
 
   for (regno = V0_REGNUM; regno <= V31_REGNUM; regno++)
@@ -1839,16 +1840,17 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 		( gen_load_pairdf (gen_rtx_REG (DFmode, regno), mem,
    gen_rtx_REG (DFmode, regno2), mem2));
 
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DFmode, regno));
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DFmode, regno2));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DFmode, regno));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DFmode, regno2));
 		}
 
 		  /* The first part of a frame-related parallel insn
 		 is always assumed to be relevant to the frame
 		 calculations; subsequent parts, are only
 		 frame-related if explicitly marked.  */
-	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0,
-	1)) = 1;
+	  RTX_FRAME_RELATED_P (XVECEXP (PATTERN (insn), 0, 1)) = 1;
 	  regno = regno2;
 	  start_offset += increment * 2;
 	}
@@ -1859,7 +1861,8 @@ aarch64_save_or_restore_fprs (int start_offset, int increment,
 	  else
 		{
 		  insn = emit_move_insn (gen_rtx_REG (DFmode, regno), mem);
-		  add_reg_note (insn, REG_CFA_RESTORE, gen_rtx_REG (DImode, regno));
+		  add_reg_note (insn, REG_CFA_RESTORE,
+gen_rtx_REG (DImode, regno));
 		}
 	  start_offset += increment;
 	}
-- 
1.7.9.5


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Richard Biener
On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders  wrote:
> On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
>> On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek  wrote:
>> > On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:
>> >> The limit looks reasonable, but you could have used a simple linked
>> >> list (and never free).  Also being able to pop a random context
>> >> looks fragile ...  that is, pop_gimplify_context shouldn't have an 
>> >> argument.
>> >
>> > Can't we use stack_vec for that?  Though that would
>> > mean a global var constructor and destructor, so alternatively just use
>> > a normal vec and .create(30) it somewhere during initialization?
>>
>> only with gimplify_context *, otherwise things will break during 
>> re-allocation.
>
> hm? it seems like the only member of gimplify_ctx that can't just be
> memcpy'd is the prev pointer which presumably could go away if you have a
> vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.

Richard.

> Trev
>
>>
>> Richard.
>>
>> > Jakub


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich  wrote:

>> >> > Here is a patch to add size relocation and instruction to obtain 
>> >> > object's size in i386 target.
>> >>
>> >> +(define_insn "move_size_reloc_"
>> >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>> >> +(match_operand: 1 "size_relocation" "Z"))]
>> >> +  ""
>> >> +{
>> >> +  return "mov{}\t{%1, %0|%0, %1}";
>> >>
>> >> Please don't change x86_64_immediate_operand just to use "Z"
>> >> constraint The predicate is used in a couple of other places that for
>> >> sure don't accept your change.
>> >>
>> >> Better write this insn in an explicit way (see for example
>> >> tls_initial_exec_64_sun). Something like:
>> >>
>> >> (define_insn "move_size_reloc_"
>> >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>> >> (unspec:SWI48
>> >>  [(match_operand 1 "symbolic_operand" "..." )]
>> >>  UNSPEC_SIZEOF))]
>> >>   ""
>> >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
>> >>
>> >> You will probably need to define new operand 1 predicate and constraint.
>> >>
>> >> Uros.
>> >
>> > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
>> > suggestion.  Does it look better?
>>
>> You actually don't need any operand modifiers in the insn template. Simply 
>> use:
>>
>> "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"
>>
>> and you will automatically get
>>
>> "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".
>>
>> Since your pattern allows only symbolic_operand, there is no reload,
>> so you can avoid the constraint altogether.
>>
>> BTW: Did you consider various -mcmodel=... options? For DImode moves,
>> you should check x86_64_zext_immediate_operand predicate and output
>> either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
>> 64bit immediate. Please see movdi pattern.
>
> Yep, for large objects it may work wrongly. Does anyone use static objects 
> >4Gb? :)
>
> Large address does not mean large object but seems we have to be conservative 
> here. I added  x86_64_zext_immediate_operand check with additional CM_KERNEL 
> check because in this model object size should always fit 32 bits.

IMO, we should list code models that support object sizes > 31bits for
64bit target. The object size in small models will never be > 31bits
(and never negative), so we can use movl unconditionally.

Please try attached patch.

Uros.
Index: i386.md
===
--- i386.md (revision 205051)
+++ i386.md (working copy)
@@ -79,6 +79,7 @@
   UNSPEC_PLTOFF
   UNSPEC_MACHOPIC_OFFSET
   UNSPEC_PCREL
+  UNSPEC_SIZEOF
 
   ;; Prologue support
   UNSPEC_STACK_ALLOC
@@ -18458,6 +18459,30 @@
   "bndstx\t{%2, %3|%3, %2}"
   [(set_attr "type" "mpxst")])
 
+(define_insn "move_size_reloc_si"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+   (unspec:SI
+[(match_operand:SI 1 "symbol_operand")]
+UNSPEC_SIZEOF))]
+  "TARGET_MPX"
+  "mov{l}\t{%1@SIZE, %0|%0, %1@SIZE}"
+  [(set_attr "type" "imov")])
+
+(define_insn "move_size_reloc_di"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+   (unspec:DI
+[(match_operand:DI 1 "symbol_operand")]
+UNSPEC_SIZEOF))]
+  "TARGET_64BIT && TARGET_MPX"
+{
+  if (ix86_cmodel == CM_MEDIUM || ix86_cmodel == CM_LARGE
+  || ix86_cmodel == CM_MEDIUM_PIC || ix86_cmodel == CM_LARGE_PIC)
+return "movabs{q}\t{%1@SIZE, %0|%0, %1@SIZE}";
+  else
+return "mov{l}\t{%1@SIZE, %k0|%k0, %1@SIZE}";
+}
+  [(set_attr "type" "imov")])
+
 (include "mmx.md")
 (include "sse.md")
 (include "sync.md")
Index: predicates.md
===
--- predicates.md   (revision 205051)
+++ predicates.md   (working copy)
@@ -119,6 +119,10 @@
(match_test "TARGET_64BIT")
(match_test "REGNO (op) > BX_REG")))
 
+;; Return true if VALUE is a symbol reference
+(define_predicate "symbol_operand"
+  (match_code "symbol_ref"))
+
 ;; Return true if VALUE can be stored in a sign extended immediate field.
 (define_predicate "x86_64_immediate_operand"
   (match_code "const_int,symbol_ref,label_ref,const")


Re: [PATCH][AArch64] vneg testcase made big-endian safe

2013-11-20 Thread Marcus Shawcroft

On 19/11/13 17:29, Alex Velenko wrote:


gcc/testsuite/

2013-11-19  Alex Velenko  

* gcc.target/aarch64/vneg_s.c (test_vneg_s8): Fixed to not use
vector indexing.
(test_vneg_s16): Likewise.
(test_vneg_s32): Likewise.
(test_vneg_s64): Likewise.
(test_vnegq_s8): Likewise.
(test_vnegq_s16): Likewise.
(test_vnegq_s32): Likewise.
(test_vnegq_s64): Likewise.



Hi, Couple of nits:

-#define RUN_TEST(test_set, answ_set, reg_len, data_len, n, a, b)   \
-  {\

+#define RUN_TEST(test_set, answ_set, reg_len,  \
+ data_len, n, a, b, _a, _b)\
+{  \

This part of the patch changes the indentation for the body of RUN_TEST 
unnecessarily, please re-indent.


-  int16x4_t a;
-  int16x4_t b;
+  int16x4_t a, b;

Unnecessary layout change, please remove.

The ChangeLog should provide a detailed explanation of what changed. 
For example:


* gcc.target/aarch64/vneg_s.c (LANE_POSTFIX): New.
(INDEX64_8, INDEX64_16, INDEX64_32, INDEX64_64, INDEX128_8)
(INDEX128_16, INDEX128_32, INDEX128_64, INDEX): Removed.
(RUN_TEST): New formal arguments _a and _b...
(test_vneg_s8): Define _a and _b.  Pass to RUN_TEST.
...
etc

Cheers
/Marcus



Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Uros Bizjak :
> On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich  wrote:
>
>>> >> > Here is a patch to add size relocation and instruction to obtain 
>>> >> > object's size in i386 target.
>>> >>
>>> >> +(define_insn "move_size_reloc_"
>>> >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>>> >> +(match_operand: 1 "size_relocation" "Z"))]
>>> >> +  ""
>>> >> +{
>>> >> +  return "mov{}\t{%1, %0|%0, %1}";
>>> >>
>>> >> Please don't change x86_64_immediate_operand just to use "Z"
>>> >> constraint The predicate is used in a couple of other places that for
>>> >> sure don't accept your change.
>>> >>
>>> >> Better write this insn in an explicit way (see for example
>>> >> tls_initial_exec_64_sun). Something like:
>>> >>
>>> >> (define_insn "move_size_reloc_"
>>> >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>>> >> (unspec:SWI48
>>> >>  [(match_operand 1 "symbolic_operand" "..." )]
>>> >>  UNSPEC_SIZEOF))]
>>> >>   ""
>>> >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
>>> >>
>>> >> You will probably need to define new operand 1 predicate and constraint.
>>> >>
>>> >> Uros.
>>> >
>>> > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow your 
>>> > suggestion.  Does it look better?
>>>
>>> You actually don't need any operand modifiers in the insn template. Simply 
>>> use:
>>>
>>> "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"
>>>
>>> and you will automatically get
>>>
>>> "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".
>>>
>>> Since your pattern allows only symbolic_operand, there is no reload,
>>> so you can avoid the constraint altogether.
>>>
>>> BTW: Did you consider various -mcmodel=... options? For DImode moves,
>>> you should check x86_64_zext_immediate_operand predicate and output
>>> either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
>>> 64bit immediate. Please see movdi pattern.
>>
>> Yep, for large objects it may work wrongly. Does anyone use static objects 
>> >4Gb? :)
>>
>> Large address does not mean large object but seems we have to be 
>> conservative here. I added  x86_64_zext_immediate_operand check with 
>> additional CM_KERNEL check because in this model object size should always 
>> fit 32 bits.
>
> IMO, we should list code models that support object sizes > 31bits for
> 64bit target. The object size in small models will never be > 31bits
> (and never negative), so we can use movl unconditionally.

For medium models x86_64_zext_immediate_operand returns true when the
object is known to go to the lower 2Gb space.  It should allow us to use
movl.  Why do you always emit movabs for the medium model?

Ilya

>
> Please try attached patch.
>
> Uros.


Re: [PATCH, rs6000] Fix order of TDmode in FPRs in little-endian mode

2013-11-20 Thread David Edelsohn
On Mon, Nov 18, 2013 at 3:07 PM, Ulrich Weigand  wrote:


> Also note that this patch does not change how TDmode values are loaded
> into GPRs: on little-endian, this means we do get the usual LE subreg
> order there (least significant word in lowest-numbered register).  This
> does still seem the right thing to do, because there are no instructions
> that operate on DFP values in GPRs, and because this allows us to keep
> using subregs on GPRs in that case, and it allows moving TDmode between
> memory and GPRs (e.g. for vararg arguments) in the usual way.

Does this require special support in GDB so that it understands how to
interpret this type in different classes of registers?  However, the
interpretation always will be consistent for a particular class of
register.

> Tested on powerpc64le-linux, fixes all remaining DFP-related test suite
> failures.
>
> OK for mainline?
>
> Bye,
> Ulrich
>
>
> ChangeLog:
>
> * config/rs6000/rs6000.c (rs6000_cannot_change_mode_class): Do not
> allow subregs of TDmode in FPRs of smaller size in little-endian.
> (rs6000_split_multireg_move): When splitting an access to TDmode
> in FPRs, do not use simplify_gen_subreg.

Okay.

Thanks, David


Re: [PATCH, PR 57363] IBM long double: adding qNaN and number raises inexact exception

2013-11-20 Thread David Edelsohn
On Wed, Nov 20, 2013 at 9:11 AM, Adhemerval Zanella
 wrote:

> libgcc/ChangeLog:
>
> 2013-11-20  Adhemerval Zanella  
>
> * config/rs6000/ibm-ldouble.c (__gcc_qadd): Fix add
> of normal number and qNaN to not raise an inexact exception.
>
> gcc/testsuite/ChangeLog:
>
> 2013-11-20  Adhemerval Zanella  
>
> * gcc.target/powerpc/pr57363.c: New test.

This is okay.  And the patch is small enough that it does not need a
copyright assignment for GCC.

Do you need someone to commit the patch for you?

Thanks, David


[PATCH][ARM]Use of vcvt for float to fixed point conversions.

2013-11-20 Thread Renlin Li

Hi all,

This patch will make the arm back-end use vcvt for float to fixed point 
conversions when applicable.


Test on arm-none-linux-gnueabi has been done on the model.
Okay for trunk?


Kind regards,
Renlin Li


gcc/ChangeLog:

2013-11-20  Renlin Li  

* config/arm/arm-protos.h (vfp_const_double_for_bits): Declare.
* config/arm/constraints.md (Dp): Define new constraint.
* config/arm/predicates.md (const_double_vcvt_power_of_two): Define
new predicate.
* config/arm/arm.c (arm_print_operand): Add print for new function.
(vfp3_const_double_for_bits): New function.
* config/arm/vfp.md (combine_vcvtf2i): Define new instruction.

gcc/testsuite/ChangeLog:

2013-11-20  Renlin Li  

* gcc.target/arm/fixed_float_conversion.c: New test case.diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 944cf10..f2f8272 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -275,6 +275,8 @@ struct tune_params
 
 extern const struct tune_params *current_tune;
 extern int vfp3_const_double_for_fract_bits (rtx);
+/* return power of two from operand, otherwise 0.  */
+extern int vfp3_const_double_for_bits (rtx);
 
 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx,
 	   rtx);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 78554e8..72c4204 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -21175,7 +21175,11 @@ arm_print_operand (FILE *stream, rtx x, int code)
 
 case 'v':
 	gcc_assert (CONST_DOUBLE_P (x));
-	fprintf (stream, "#%d", vfp3_const_double_for_fract_bits (x));
+	int result;
+	result = vfp3_const_double_for_fract_bits (x);
+	if (result == 0)
+	  result = vfp3_const_double_for_bits (x);
+	fprintf (stream, "#%d", result);
 	return;
 
 /* Register specifier for vld1.16/vst1.16.  Translate the S register
@@ -28958,6 +28973,26 @@ vfp3_const_double_for_fract_bits (rtx operand)
 }
   return 0;
 }
+
+int
+vfp3_const_double_for_bits (rtx operand)
+{
+  REAL_VALUE_TYPE r0;
+
+  if (!CONST_DOUBLE_P (operand))
+return 0;
+
+  REAL_VALUE_FROM_CONST_DOUBLE (r0, operand);
+  if (exact_real_truncate (DFmode, &r0))
+{
+  HOST_WIDE_INT value = real_to_integer (&r0);
+  value = value & 0x;
+  if ((value != 0) && ( (value & (value - 1)) == 0))
+	return int_log2 (value);
+}
+
+  return 0;
+}
 
 /* Emit a memory barrier around an atomic sequence according to MODEL.  */
 
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index e2a3099..59ca4b6 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -31,7 +31,7 @@
 ;; 'H' was previously used for FPA.
 
 ;; The following multi-letter normal constraints have been used:
-;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dz
+;; in ARM/Thumb-2 state: Da, Db, Dc, Dd, Dn, Dl, DL, Do, Dv, Dy, Di, Dt, Dp, Dz
 ;; in Thumb-1 state: Pa, Pb, Pc, Pd, Pe
 ;; in Thumb-2 state: Pj, PJ, Ps, Pt, Pu, Pv, Pw, Px, Py
 
@@ -328,12 +328,18 @@
  (and (match_code "const_double")
   (match_test "TARGET_32BIT && TARGET_VFP_DOUBLE && vfp3_const_double_rtx (op)")))
 
-(define_constraint "Dt" 
+(define_constraint "Dt"
  "@internal
   In ARM/ Thumb2 a const_double which can be used with a vcvt.f32.s32 with fract bits operation"
   (and (match_code "const_double")
(match_test "TARGET_32BIT && TARGET_VFP && vfp3_const_double_for_fract_bits (op)")))
 
+(define_constraint "Dp"
+ "@internal
+  In ARM/ Thumb2 a const_double which can be used with a vcvt.s32.f32 with bits operation"
+  (and (match_code "const_double")
+   (match_test "TARGET_32BIT && TARGET_VFP && vfp3_const_double_for_bits (op)")))
+
 (define_register_constraint "Ts" "(arm_restrict_it) ? LO_REGS : GENERAL_REGS"
  "For arm_restrict_it the core registers @code{r0}-@code{r7}.  GENERAL_REGS otherwise.")
 
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 29e1e5c..2dac581 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -639,8 +639,13 @@
 
 (define_predicate "const_double_vcvt_power_of_two_reciprocal"
   (and (match_code "const_double")
-   (match_test "TARGET_32BIT && TARGET_VFP 
-   		   && vfp3_const_double_for_fract_bits (op)")))
+   (match_test "TARGET_32BIT && TARGET_VFP
+   && vfp3_const_double_for_fract_bits (op)")))
+
+(define_predicate "const_double_vcvt_power_of_two"
+  (and (match_code "const_double")
+   (match_test "TARGET_32BIT && TARGET_VFP
+   && vfp3_const_double_for_bits (op)")))
 
 (define_predicate "neon_struct_operand"
   (and (match_code "mem")
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index 22b6325..aec17f9 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -1253,6 +1253,19 @@
(set_attr "length" "8")]
 )
 
+ (define_insn "*combine_vcvtf2i"
+   [(set (match_operand:SI 0 "s_register_operand" "=r")
+	(fix:SI (fix:SF (mult:SF (match_op

Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Uros Bizjak
On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich  wrote:
> 2013/11/20 Uros Bizjak :
>> On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich  
>> wrote:
>>
 >> > Here is a patch to add size relocation and instruction to obtain 
 >> > object's size in i386 target.
 >>
 >> +(define_insn "move_size_reloc_"
 >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
 >> +(match_operand: 1 "size_relocation" "Z"))]
 >> +  ""
 >> +{
 >> +  return "mov{}\t{%1, %0|%0, %1}";
 >>
 >> Please don't change x86_64_immediate_operand just to use "Z"
 >> constraint The predicate is used in a couple of other places that for
 >> sure don't accept your change.
 >>
 >> Better write this insn in an explicit way (see for example
 >> tls_initial_exec_64_sun). Something like:
 >>
 >> (define_insn "move_size_reloc_"
 >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
 >> (unspec:SWI48
 >>  [(match_operand 1 "symbolic_operand" "..." )]
 >>  UNSPEC_SIZEOF))]
 >>   ""
 >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
 >>
 >> You will probably need to define new operand 1 predicate and constraint.
 >>
 >> Uros.
 >
 > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
 > your suggestion.  Does it look better?

 You actually don't need any operand modifiers in the insn template. Simply 
 use:

 "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"

 and you will automatically get

 "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".

 Since your pattern allows only symbolic_operand, there is no reload,
 so you can avoid the constraint altogether.

 BTW: Did you consider various -mcmodel=... options? For DImode moves,
 you should check x86_64_zext_immediate_operand predicate and output
 either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
 64bit immediate. Please see movdi pattern.
>>>
>>> Yep, for large objects it may work wrongly. Does anyone use static objects 
>>> >4Gb? :)
>>>
>>> Large address does not mean large object but seems we have to be 
>>> conservative here. I added  x86_64_zext_immediate_operand check with 
>>> additional CM_KERNEL check because in this model object size should always 
>>> fit 32 bits.
>>
>> IMO, we should list code models that support object sizes > 31bits for
>> 64bit target. The object size in small models will never be > 31bits
>> (and never negative), so we can use movl unconditionally.
>
> For medium models x86_64_zext_immediate_operand returns true for
> object is known to go to lower 2Gb space.  It should allow us to use
> movl.  Why do you always emit movabs for medium model?

CM_MEDIUM has unlimited data size.

i386-opts.h:  CM_MEDIUM,/* Assumes code fits in the low 31
bits; data unlimited.  */

The x86_64_zext_immediate_operand allows _address_ to be loaded by
movl. The @SIZE reloc returns object size, which is unlimited and has
no connection to its address. For CM_MEDIUM,
x86_64_zext_immediate_operand allows:

  return (ix86_cmodel == CM_SMALL
  || (ix86_cmodel == CM_MEDIUM
  && !SYMBOL_REF_FAR_ADDR_P (op)));

Uros.


Re: [PATCH, i386, MPX, 2/X] Pointers Checker [22/25] Target builtins

2013-11-20 Thread Uros Bizjak
Hello!

> Here is a patch introducing i386 target versions of Pointer Bounds Checker 
> builtins.
>
> 2013-11-15  Ilya Enkovich  
>
> * config/i386/i386-builtin-types.def (BND): New.
> (ULONG): New.
> (BND_FTYPE_PCVOID_ULONG): New.
> (VOID_FTYPE_BND_PCVOID): New.
>(VOID_FTYPE_PCVOID_PCVOID_BND): New.
> (BND_FTYPE_PCVOID_PCVOID): New.
> (BND_FTYPE_PCVOID): New.
> (BND_FTYPE_BND_BND): New.
> (PVOID_FTYPE_PVOID_PVOID_ULONG): New.
> (PVOID_FTYPE_PCVOID_BND_ULONG): New.
> (ULONG_FTYPE_VOID): New.
> (PVOID_FTYPE_BND): New.
> * config/i386/i386.c: Include tree-chkp.h, rtl-chkp.h.
> (ix86_builtins): Add
> IX86_BUILTIN_BNDMK, IX86_BUILTIN_BNDSTX,
> IX86_BUILTIN_BNDLDX, IX86_BUILTIN_BNDCL,
> IX86_BUILTIN_BNDCU, IX86_BUILTIN_BNDRET,
> IX86_BUILTIN_BNDSET, IX86_BUILTIN_BNDNARROW,
> IX86_BUILTIN_BNDINT, IX86_BUILTIN_ARG_BND,
> IX86_BUILTIN_SIZEOF, IX86_BUILTIN_BNDLOWER,
> IX86_BUILTIN_BNDUPPER.
> (builtin_isa): Add leaf_p and nothrow_p fields.
> (def_builtin): Initialize leaf_p and nothrow_p.
> (ix86_add_new_builtins): Handle leaf_p and nothrow_p
> flags.
> (bdesc_mpx): New.
> (bdesc_mpx_const): New.
> (ix86_init_mpx_builtins): New.
> (ix86_init_builtins): Call ix86_init_mpx_builtins.
> (ix86_expand_builtin): expand IX86_BUILTIN_BNDMK,
> IX86_BUILTIN_BNDSTX, IX86_BUILTIN_BNDLDX,
> IX86_BUILTIN_BNDCL, IX86_BUILTIN_BNDCU,
> IX86_BUILTIN_BNDRET, IX86_BUILTIN_BNDSET,
> IX86_BUILTIN_BNDNARROW, IX86_BUILTIN_BNDINT,
> IX86_BUILTIN_ARG_BND, IX86_BUILTIN_SIZEOF,
> IX86_BUILTIN_BNDLOWER, IX86_BUILTIN_BNDUPPER.

+  /* Avoid registers which cannot be used as index.  */
+  if (REGNO (op1) == VIRTUAL_INCOMING_ARGS_REGNUM
+  || REGNO (op1) == VIRTUAL_STACK_VARS_REGNUM
+  || REGNO (op1) == VIRTUAL_OUTGOING_ARGS_REGNUM)

You can use index_register_operand predicate here.

Uros.


Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 10:51 AM, Richard Biener wrote:

On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders  wrote:

On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:

On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek  wrote:

On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context
looks fragile ...  that is, pop_gimplify_context shouldn't have an argument.

Can't we use stack_vec for that?  Though that would
mean a global var constructor and destructor, so alternatively just use
a normal vec and .create(30) it somewhere during initialization?

only with gimplify_context *, otherwise things will break during re-allocation.

hm? it seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer which presumably could go away if you have a
vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.




No one except gimplify.c can have a pointer to the gimplify_context, so 
it's contained within gimplify.c.  Pretty much everything is based off 
the current context pointer (gimplify_ctxp).  There are places where the 
address of a field within that context structure is passed to another 
routine.  If that routine then eventually triggered another 
push/pop_context call, the address underneath could be changed... and 
chaos ensues.


I don't know if that does happen, but it is a possibility and I don't see 
the need to find out.  So a simple allocation scheme has the minimal 
impact on the code, and my preference is to leave it as it is, or 
otherwise do the simple linked list, malloc-ing only as necessary.


Want me to change it, or leave it as is?

Andrew







Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Uros Bizjak :
> On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich  wrote:
>> 2013/11/20 Uros Bizjak :
>>> On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich  
>>> wrote:
>>>
> >> > Here is a patch to add size relocation and instruction to obtain 
> >> > object's size in i386 target.
> >>
> >> +(define_insn "move_size_reloc_"
> >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
> >> +(match_operand: 1 "size_relocation" "Z"))]
> >> +  ""
> >> +{
> >> +  return "mov{}\t{%1, %0|%0, %1}";
> >>
> >> Please don't change x86_64_immediate_operand just to use "Z"
> >> constraint The predicate is used in a couple of other places that for
> >> sure don't accept your change.
> >>
> >> Better write this insn in an explicit way (see for example
> >> tls_initial_exec_64_sun). Something like:
> >>
> >> (define_insn "move_size_reloc_"
> >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
> >> (unspec:SWI48
> >>  [(match_operand 1 "symbolic_operand" "..." )]
> >>  UNSPEC_SIZEOF))]
> >>   ""
> >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
> >>
> >> You will probably need to define new operand 1 predicate and 
> >> constraint.
> >>
> >> Uros.
> >
> > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
> > your suggestion.  Does it look better?
>
> You actually don't need any operand modifiers in the insn template. 
> Simply use:
>
> "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"
>
> and you will automatically get
>
> "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".
>
> Since your pattern allows only symbolic_operand, there is no reload,
> so you can avoid the constraint altogether.
>
> BTW: Did you consider various -mcmodel=... options? For DImode moves,
> you should check x86_64_zext_immediate_operand predicate and output
> either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
> 64bit immediate. Please see movdi pattern.

 Yep, for large objects it may work wrongly. Does anyone use static objects 
 >4Gb? :)

 Large address does not mean large object but seems we have to be 
 conservative here. I added  x86_64_zext_immediate_operand check with 
 additional CM_KERNEL check because in this model object size should always 
 fit 32 bits.
>>>
>>> IMO, we should list code models that support object sizes > 31bits for
>>> 64bit target. The object size in small models will never be > 31bits
>>> (and never negative), so we can use movl unconditionally.
>>
>> For medium models x86_64_zext_immediate_operand returns true when the
>> object is known to go to the lower 2Gb space.  It should allow us to use
>> movl.  Why do you always emit movabs for the medium model?
>
> CM_MEDIUM has unlimited data size.
>
> i386-opts.h:  CM_MEDIUM,/* Assumes code fits in the low 31
> bits; data unlimited.  */
>
> The x86_64_zext_immediate_operand allows _address_ to be loaded by
> movl. The @SIZE reloc returns object size, which is unlimited and has
> no connection to its address. For CM_MEDIUM,
> x86_64_zext_immediate_operand allows:
>
>   return (ix86_cmodel == CM_SMALL
>   || (ix86_cmodel == CM_MEDIUM
>   && !SYMBOL_REF_FAR_ADDR_P (op)));

Yes, but large symbols never have SYMBOL_FLAG_FAR_ADDR set.

Ilya

>
> Uros.


Re: [PATCH, rs6000] Fix order of TDmode in FPRs in little-endian mode

2013-11-20 Thread Ulrich Weigand
David Edelsohn wrote:
> On Mon, Nov 18, 2013 at 3:07 PM, Ulrich Weigand  wrote:
> > Also note that this patch does not change how TDmode values are loaded
> > into GPRs: on little-endian, this means we do get the usual LE subreg
> > order there (least significant word in lowest-numbered register).  This
> > does still seem the right thing to do, because there are no instructions
> > that operate on DFP values in GPRs, and because this allows us to keep
> > using subregs on GPRs in that case, and it allows moving TDmode between
> > memory and GPRs (e.g. for vararg arguments) in the usual way.
> 
> Does this require special support in GDB so that it understands how to
> interpret this type in different classes of registers?  However, the
> interpretation always will be consistent for a particular class of
> register.

Ah, good point.  Yes, GDB will need to be changed, but this looks
to be straightforward.  GDB already uses "pseudo" registers to
represent _Decimal128 in FPR pairs, they simply need to be glued
together the right way on little-endian.  I'll fix this.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH, i386, MPX, 2/X] Pointers Checker [21/25] Size relocation

2013-11-20 Thread Ilya Enkovich
2013/11/20 Ilya Enkovich :
> 2013/11/20 Uros Bizjak :
>> On Wed, Nov 20, 2013 at 5:11 PM, Ilya Enkovich  
>> wrote:
>>> 2013/11/20 Uros Bizjak :
 On Wed, Nov 20, 2013 at 1:32 PM, Ilya Enkovich  
 wrote:

>> >> > Here is a patch to add size relocation and instruction to obtain 
>> >> > object's size in i386 target.
>> >>
>> >> +(define_insn "move_size_reloc_"
>> >> +  [(set (match_operand:SWI48 0 "register_operand" "=r")
>> >> +(match_operand: 1 "size_relocation" "Z"))]
>> >> +  ""
>> >> +{
>> >> +  return "mov{}\t{%1, %0|%0, %1}";
>> >>
>> >> Please don't change x86_64_immediate_operand just to use "Z"
>> >> constraint The predicate is used in a couple of other places that for
>> >> sure don't accept your change.
>> >>
>> >> Better write this insn in an explicit way (see for example
>> >> tls_initial_exec_64_sun). Something like:
>> >>
>> >> (define_insn "move_size_reloc_"
>> >>   [(set (match_operand:SWI48 0 "register_operand" "=r")
>> >> (unspec:SWI48
>> >>  [(match_operand 1 "symbolic_operand" "..." )]
>> >>  UNSPEC_SIZEOF))]
>> >>   ""
>> >>   "mov{}\t{%a1@SIZE, %0|%0, %a1@SIZE}")
>> >>
>> >> You will probably need to define new operand 1 predicate and 
>> >> constraint.
>> >>
>> >> Uros.
>> >
>> > Hi, Uros!  Thanks for comments!  Here is what I got trying to follow 
>> > your suggestion.  Does it look better?
>>
>> You actually don't need any operand modifiers in the insn template. 
>> Simply use:
>>
>> "mov{}\t{%1@SIZE, %0|%0, %1@SIZE}"
>>
>> and you will automatically get
>>
>> "movl $zzz, %eax" or "mov %eax, OFFSET FLAT: zzz".
>>
>> Since your pattern allows only symbolic_operand, there is no reload,
>> so you can avoid the constraint altogether.
>>
>> BTW: Did you consider various -mcmodel=... options? For DImode moves,
>> you should check x86_64_zext_immediate_operand predicate and output
>> either "movl $zzz, %eax" or "movabs $zzz, %rax". There is no movq with
>> 64bit immediate. Please see movdi pattern.
>
> Yep, for large objects it may work wrongly. Does anyone use static 
> objects >4Gb? :)
>
> Large address does not mean large object but seems we have to be 
> conservative here. I added  x86_64_zext_immediate_operand check with 
> additional CM_KERNEL check because in this model object size should 
> always fit 32 bits.

 IMO, we should list code models that support object sizes > 31bits for
 64bit target. The object size in small models will never be > 31bits
 (and never negative), so we can use movl unconditionally.
>>>
>>> For medium models x86_64_zext_immediate_operand returns true when the
>>> object is known to go to the lower 2Gb space.  It should allow us to use
>>> movl.  Why do you always emit movabs for the medium model?
>>
>> CM_MEDIUM has unlimited data size.
>>
>> i386-opts.h:  CM_MEDIUM,/* Assumes code fits in the low 31
>> bits; data unlimited.  */
>>
>> The x86_64_zext_immediate_operand allows _address_ to be loaded by
>> movl. The @SIZE reloc returns object size, which is unlimited and has
>> no connection to its address. For CM_MEDIUM,
>> x86_64_zext_immediate_operand allows:
>>
>>   return (ix86_cmodel == CM_SMALL
>>   || (ix86_cmodel == CM_MEDIUM
>>   && !SYMBOL_REF_FAR_ADDR_P (op)));
>
> Yes, but large symbols never have SYMBOL_FLAG_FAR_ADDR set.

Sorry, I meant: all large symbols have this flag set.

>
> Ilya
>
>>
>> Uros.


Re: [PATCH][1/3] Re-submission of Altera Nios II port, gcc parts

2013-11-20 Thread Chung-Lin Tang
On 13/11/20 1:34 AM, Richard Sandiford wrote:
> Chung-Lin Tang writes:
 +;;  Integer logical Operations
 +
 +(define_code_iterator LOGICAL [and ior xor])
 +(define_code_attr logical_asm [(and "and") (ior "or") (xor "xor")])
 +
 +(define_insn "<code>si3"
 +  [(set (match_operand:SI 0 "register_operand" "=r,r,r")
 +(LOGICAL:SI (match_operand:SI 1 "register_operand" "%r,r,r")
 +(match_operand:SI 2 "logical_operand"  "rM,J,K")))]
 +  ""
 +  "@
 +<logical_asm>\\t%0, %1, %z2
 +<logical_asm>%i2\\t%0, %1, %2
 +<logical_asm>h%i2\\t%0, %1, %U2"
 +  [(set_attr "type" "alu")])
 +
 +(define_insn "*norsi3"
 +  [(set (match_operand:SI 0 "register_operand"  "=r")
 +(and:SI (not:SI (match_operand:SI 1 "register_operand"  "%r"))
 +(not:SI (match_operand:SI 2 "reg_or_0_operand"  "rM"))))]
 +  ""
 +  "nor\\t%0, %1, %z2"
 +  [(set_attr "type" "alu")])
>>>
>>> M constraints (for const0_rtx) and reg_or_0 seem unnecessary, no such
>>> RTL should make it to this point.
>>
>> Such RTL does appear under -O0. Removing the 'M' will also
>> require a bit of re-working the operand printing mechanics; not a lot of
>> work, but I'd rather keep it as is. The behavior of using the zero
>> register for a 0-value is also more expected in nios2, I think.

Will any RTL-level pass, for example, propagate a zero constant into
the pattern without also simplifying it to a NOT? (That's a genuine
question; I don't really know.)

Personally, I haven't really seen many cases optimizable to NOR, but I
think keeping the 'M' there is pretty harmless.

> Hmm, but if we get (not (const_int 0)) then that sounds like a bug,
> since (and (not X) (not (const_int 0))) should reduce to (not X).
> IMO target-independent code shouldn't try to create the nor-with-0
> form and backends shouldn't match it.
> 
> Why would removing 'M' affect the printing mechanism?  Naively I'd
> have expected:

The *norsi3 here is of course straightforward. I was referring to other
cases, like the AND/IOR/XOR pattern above, where I wanted to combine
them into a single alternative. That needs a bit more work to reorganize
the nios2_print_operand() cases.

> (define_insn "*norsi3"
>   [(set (match_operand:SI 0 "register_operand"  "=r")
> (and:SI (not:SI (match_operand:SI 1 "register_operand"  "r"))
> (not:SI (match_operand:SI 2 "register_operand"  "r"))))]
>   ""
>   "nor\\t%0, %1, %2"
>   [(set_attr "type" "alu")])
> 
> to just work.

That will definitely work, though I don't think the zero case does any harm.
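
(As a side note on where the registers-only form comes from: combine canonicalizes ~(a | b) into (and (not a) (not b)), which is exactly the value a nios2 nor computes. A tiny C check of the identity; this is just De Morgan's law, not nios2-specific code:)

```c
#include <assert.h>
#include <stdint.h>

/* De Morgan: ~(a | b) == ~a & ~b.  The right-hand side is the RTL
   shape (and (not a) (not b)) that the *norsi3 pattern matches.  */
static uint32_t
nor_u32 (uint32_t a, uint32_t b)
{
  return ~a & ~b;
}
```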

 +;; Integer comparisons
 +
 +(define_code_iterator EQNE [eq ne])
 +(define_insn "nios2_cmp"
 +  [(set (match_operand:SI 0 "register_operand"   "=r")
 +(EQNE:SI (match_operand:SI 1 "reg_or_0_operand" "%rM")
 + (match_operand:SI 2 "arith_operand" "rI")))]
 +  ""
 +  "cmp%i2\\t%0, %z1, %z2"
 +  [(set_attr "type" "alu")])
>>>
>>> Once again, using reg_or_0 and "M" seems pointless.
>>
>> The compares don't support all operand combinations; only the second
>> operand can take an immediate. Using 'rM' should theoretically allow
>> more commutative swapping.
> 
> But rtl-wise, we should never generate an EQ or NE with two constants.
> And if one operand is constant then it's supposed to be the second.
> 
> The "%" should give commutativity on its own, without the "M".

For EQ/NE I guess that's the case, for the other comparisons I'm not so
sure; I'm not familiar enough with the details of the expander machinery
to claim anything.

If this doesn't carry over to other comparisons, I intend to keep it in
line with the other cmp patterns. I experimented a bit with the
generated code, and nothing is affected.

 +  emit_insn
 +  (gen_rtx_SET (Pmode, tmp,
 +gen_int_mode (cfun->machine->save_regs_offset,
 +  Pmode)));
>>>
>>> Shouldn't have a mode on the SET, but really should just call
>>> emit_move_insn. Similar cases elsewhere, search for "gen_rtx_SET (Pmode".
>>
>> I've removed the modes on SET, though I prefer the more bare generation
>> of the insns in some contexts; emit_move_insn() seems to have a lot
>> under the hood.
> 
> There shouldn't be anything to be afraid of though.  Target-independent
> code would use emit_move_insn for this though, so it needs to just work.

It will work, and I did use it in some places, though I did not
exhaustively search-and-replace.

For (subtle) performance reasons, emit_move_insn() really does too much
as a backend utility. Backend code is usually already very precise about
what it wants to generate.

 +  HOST_WIDE_INT var_size;   /* # of var. bytes allocated.  */
>>>
>>> Not the first time they occur in this file, but I suppose I should
>>> mention that these in-line comments are probably just outside our coding
>>> guidelines.
>>
>> Dele

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Andrew MacLeod

On 11/20/2013 11:30 AM, Andrew MacLeod wrote:

On 11/20/2013 10:51 AM, Richard Biener wrote:
On Wed, Nov 20, 2013 at 4:06 PM, Trevor Saunders wrote:

On Wed, Nov 20, 2013 at 03:18:07PM +0100, Richard Biener wrote:
On Wed, Nov 20, 2013 at 3:16 PM, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 03:12:34PM +0100, Richard Biener wrote:

The limit looks reasonable, but you could have used a simple linked
list (and never free).  Also being able to pop a random context looks
fragile ... that is, pop_gimplify_context shouldn't have an argument.

Can't we use stack_vec for that?  Though that would mean a global var
constructor and destructor, so alternatively just use a normal vec and
.create(30) it somewhere during initialization?

Only with gimplify_context *, otherwise things will break during
re-allocation.

hm? It seems like the only member of gimplify_ctx that can't just be
memcpy'd is the prev pointer, which presumably could go away if you
have a vec of all the contexts.

Callers have a pointer to gimplify_context AFAIK.




No one except gimplify.c can have a pointer to the gimplify_context, 
so it's contained within gimplify.c.  Pretty much everything is 
based off the current context pointer (gimplify_ctxp).  There are 
places where the address of a field within that context structure is 
passed to another routine.  If that routine then eventually triggered 
another push/pop_context call, the address underneath could be 
changed... and chaos ensues.
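
(The reallocation hazard described above, in miniature, with hypothetical names rather than the real gimplify code: a pointer taken to a field inside a growable array is invalidated when a later push reallocates the storage.)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* A growable stack of contexts, like a vec of gimplify_ctx would be.  */
struct mini_ctx { int conditions; };
struct ctx_stack { struct mini_ctx *data; size_t len, cap; };

/* Push a zeroed context.  realloc may move the whole block, so any
   pointer previously taken into data[] (e.g. &data[i].conditions)
   becomes dangling -- the "chaos" referred to above.  */
static struct mini_ctx *
push_ctx (struct ctx_stack *s)
{
  if (s->len == s->cap)
    {
      s->cap = s->cap ? s->cap * 2 : 1;
      s->data = (struct mini_ctx *) realloc (s->data,
                                             s->cap * sizeof *s->data);
    }
  memset (&s->data[s->len], 0, sizeof *s->data);
  return &s->data[s->len++];
}
```

Storing each context behind a stable malloc'd pointer (the linked-list scheme) sidesteps the problem entirely.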


I don't know if that does happen, but it is a possibility and I don't 
see the need to find out.  So a simple allocation scheme has the 
minimal impact on the code, and my preference is to leave it as it is, 
or otherwise do the simple linked list, malloc-ing only as necessary.


Want me to change it, or leave it as is?
In fact, I like the code better for the linked list... it's even simpler.
I've attached that patch.  It's currently bootstrapping, and I'll run the
tests.  It wouldn't need a special testcase since there are no edge
cases.
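
(The linked-list version can be sketched in a few lines. Hypothetical struct and field names; the patch's own functions are ctx_alloc/ctx_free managing ctx_pool. Freed contexts are threaded onto a free list through their prev_context field and reused, so nothing is ever returned to the system.)

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct mini_gimplify_ctx
{
  struct mini_gimplify_ctx *prev_context;  /* also the free-list link */
  int conditions;
  /* ... other fields ... */
};

static struct mini_gimplify_ctx *ctx_pool;

/* Pop a context off the free list, or malloc a fresh one; the result
   is always zero-cleared.  */
static struct mini_gimplify_ctx *
ctx_alloc (void)
{
  struct mini_gimplify_ctx *c = ctx_pool;
  if (c)
    ctx_pool = c->prev_context;
  else
    c = (struct mini_gimplify_ctx *) malloc (sizeof *c);
  memset (c, 0, sizeof *c);
  return c;
}

/* Push a context back onto the free list; never actually freed, so
   pointers handed out stay stable for the life of the process.  */
static void
ctx_free (struct mini_gimplify_ctx *c)
{
  c->prev_context = ctx_pool;
  ctx_pool = c;
}
```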


Which is your preference?

Andrew


	* gimplify.h (gimplify_hasher : typed_free_remove, struct gimplify_ctx):
	Move to gimplify.c.
	* gimplify.c (gimplify_hasher:typed_free_remove): Relocate here.
	(struct gimplify_ctx): Relocate here and add 'malloced' field.
	(gimplify_ctxp): Make static.
	(ctx_pool, ctx_alloc, ctx_free): Manage a list of struct gimplify_ctx.
	(push_gimplify_context): Add default parameters and allocate a 
	struct from the pool.
	(pop_gimplify_context): Free a struct back to the pool.
	(gimplify_scan_omp_clauses, gimplify_omp_parallel, gimplify_omp_task,
	gimplify_omp_workshare, gimplify_transaction, gimplify_body): Don't
	use a local 'struct gimplify_ctx'.
	* gimplify-me.c (force_gimple_operand_1, gimple_regimplify_operands):
	Likewise.
	* omp-low.c (lower_omp_sections, lower_omp_single, lower_omp_master,
	lower_omp_ordered, lower_omp_critical, lower_omp_for,
	create_task_copyfn, lower_omp_taskreg, lower_omp_target,
	lower_omp_teams, execute_lower_omp): Likewise.
	* gimple-fold.c (gimplify_and_update_call_from_tree): Likewise.
	* tree-inline.c (optimize_inline_calls): Likewise.

Index: gimplify.h
===
*** gimplify.h	(revision 205035)
--- gimplify.h	(working copy)
*** enum gimplify_status {
*** 48,86 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
- /* Gimplify hashtable helper.  */
  
! struct gimplify_hasher : typed_free_remove <elt_t>
! {
!   typedef elt_t value_type;
!   typedef elt_t compare_type;
!   static inline hashval_t hash (const value_type *);
!   static inline bool equal (const value_type *, const compare_type *);
! };
! 
! struct gimplify_ctx
! {
!   struct gimplify_ctx *prev_context;
! 
!   vec<gimple> bind_expr_stack;
!   tree temps;
!   gimple_seq conditional_cleanups;
!   tree exit_label;
!   tree return_temp;
! 
!   vec<tree> case_labels;
!   /* The formal temporary table.  Should this be persistent?  */
!   hash_table <gimplify_hasher> temp_htab;
! 
!   int conditions;
!   bool save_stack;
!   bool into_ssa;
!   bool allow_rhs_cond_expr;
!   bool in_cleanup_point_expr;
! };
! 
! extern struct gimplify_ctx *gimplify_ctxp;
! extern void push_gimplify_context (struct gimplify_ctx *);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
--- 48,56 
GS_OK		= 0,	/* We did something, maybe more to do.  */
GS_ALL_DONE	= 1	/* The expression is fully gimplified.  */
  };
  
! extern void push_gimplify_context (bool in_ssa = false,
!    bool rhs_cond_ok = false);
  extern void pop_gimplify_context (gimple);
  extern gimple gimple_current_bind_expr (void);
  extern vec<gimple> gimple_bind_expr_stack (void);
Index: gimplify.c
===
*** gimplify.c	(revision 205035)
--- gimplify.c	(working copy)
*** e

Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jeff Law

On 11/20/13 06:44, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp
tests.  So my current limit of 30 will likely never be reached.


???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).
I was going to suggest hacking things so that it only had 2, then doing 
a bootstrap/comparison to test the fallback.  Obviously that would be 
for testing purposes only.


However, a test we can put into the suite is even better :-)
jeff



Re: [patch] Privatize gimplify_ctx structure.

2013-11-20 Thread Jeff Law

On 11/20/13 06:56, Andrew MacLeod wrote:

On 11/20/2013 08:44 AM, Jakub Jelinek wrote:

On Wed, Nov 20, 2013 at 08:35:28AM -0500, Andrew MacLeod wrote:

I also hacked up the compiler to report what the 'top' of the stack
was for a compilation unit. I then ran it through a bootstrap and
full testsuite run of all languages, and looked for the maximum
number of context structs in use at one time.  Normally only 1 or 2
were ever in use, but it topped out at 7 during some of the openmp
tests.  So my current limit of 30 will likely never be reached.

???  You can trivially create a testcase where it will be reached
(and you certainly should, so that you actually test the fallback).


Not something we're likely to see every day, but excellent!  A testcase
is good :-) I'll add this as a testcase.

Do you think the current limit of 30 is reasonable?
Yes, if you've verified the fallback works ;-)  You could probably even 
drop it to 10ish.

jeff


Fix PR target/59207

2013-11-20 Thread Eric Botcazou
This fixes an uninitialized variable introduced by the double_int rewrite.

Tested on SPARC/Solaris, applied on the mainline and 4.8 branch.


2013-11-20  Eric Botcazou 

PR target/59207
* config/sparc/sparc.c (sparc_fold_builtin) :
Make sure neg2_ovf is set before being used.


-- 
Eric Botcazou

Index: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 205090)
+++ config/sparc/sparc.c	(working copy)
@@ -10658,7 +10658,8 @@ sparc_fold_builtin (tree fndecl, int n_a
 	  tmp = e0.add_with_sign (tmp, false, &add1_ovf);
 	  if (tmp.is_negative ())
 		tmp = tmp.neg_with_overflow (&neg2_ovf);
-
+	  else
+		neg2_ovf = false;
 	  result = result.add_with_sign (tmp, false, &add2_ovf);
 	  overflow |= neg1_ovf | neg2_ovf | add1_ovf | add2_ovf;
 	}
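
(The fix above belongs to a common bug class: an output flag assigned on only one branch and then read unconditionally. A reduced sketch with a hypothetical function, not the actual sparc builtin folding code:)

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Absolute value with an overflow flag.  The shape of the PR 59207 bug
   was the missing 'else' arm: without it, *ovf is only written on the
   negative path, and callers read an uninitialized flag otherwise.  */
static long long
abs_with_overflow (long long v, bool *ovf)
{
  if (v < 0)
    {
      *ovf = (v == LLONG_MIN);   /* -LLONG_MIN does not fit */
      return *ovf ? v : -v;
    }
  else
    *ovf = false;                /* the branch the patch adds */
  return v;
}
```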

