Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA

2013-08-26 Thread Ilya Enkovich

Ping

2013/8/19 Ilya Enkovich :
> Ping
>
> 2013/8/12 Ilya Enkovich :
>> 2013/8/10 Joseph S. Myers :
>>> On Mon, 29 Jul 2013, Ilya Enkovich wrote:
>>>
>>>> Hi,
>>>>
>>>> Here is updated version of the patch. I removed redundant
>>>> mode_for_bound, added comments to BOUND_TYPE and added -mmpx option.
>>>> I also fixed bndmk/bndldx/bndstx constraints to avoid incorrect
>>>> register allocation (created two new constraints for that).
>>>
>>> I think the -mmpx option should be documented in invoke.texi, and the new
>>> machine modes / mode class should be documented in rtl.texi where other
>>> machine modes / mode classes are documented.  Beyond that, I have no
>>> comments on this patch revision.
>>>
>>> --
>>> Joseph S. Myers
>>> jos...@codesourcery.com
>>
>> Thanks! Here is a new revision with -mmpx and new machine modes /
>> class documented.
>> Is it good to install to trunk?
>>
>> Thanks,
>> Ilya
>> ---
>> 2013-08-12  Ilya Enkovich  
>>
>> * mode-classes.def (MODE_BOUND): New.
>> * tree.def (BOUND_TYPE): New.
>> * genmodes.c (complete_mode): Support MODE_BOUND.
>> (BOUND_MODE): New.
>> (make_bound_mode): New.
>> * machmode.h (BOUND_MODE_P): New.
>> * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
>> (layout_type): Support BOUND_TYPE.
>> * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
>> * tree.c (build_int_cst_wide): Support BOUND_TYPE.
>> (type_contains_placeholder_1): Likewise.
>> * tree.h (BOUND_TYPE_P): New.
>> * varasm.c (output_constant): Support BOUND_TYPE.
>> * config/i386/constraints.md (B): New.
>> (Ti): New.
>> (Tb): New.
>> * config/i386/i386-modes.def (BND32): New.
>> (BND64): New.
>> * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
>> * config/i386/i386.c (isa_opts): Add mmpx.
>> (regclass_map): Add bound registers.
>> (dbx_register_map): Likewise.
>> (dbx64_register_map): Likewise.
>> (svr4_dbx_register_map): Likewise.
>> (PTA_MPX): New.
>> (ix86_option_override_internal) Support MPX ISA.
>> (ix86_code_end): Add MPX bnd prefix.
>> (output_set_got): Likewise.
>> (ix86_output_call_insn): Likewise.
>> (get_some_local_dynamic_name): Add '!' (MPX bnd) print prefix 
>> support.
>> (ix86_print_operand_punct_valid_p): Likewise.
>> (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
>> UNSPEC_BNDMK_ADDR.
>> (ix86_class_likely_spilled_p): Add bound regs support.
>> (ix86_hard_regno_mode_ok): Likewise.
>> (x86_order_regs_for_local_alloc): Likewise.
>> (ix86_bnd_prefixed_insn_p): New.
>> * config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
>> (FIXED_REGISTERS): Add bound registers.
>> (CALL_USED_REGISTERS): Likewise.
>> (REG_ALLOC_ORDER): Likewise.
>> (HARD_REGNO_NREGS): Likewise.
>> (TARGET_MPX): New.
>> (VALID_BND_REG_MODE): New.
>> (FIRST_BND_REG): New.
>> (LAST_BND_REG): New.
>> (reg_class): Add BND_REGS.
>> (REG_CLASS_NAMES): Likewise.
>> (REG_CLASS_CONTENTS): Likewise.
>> (BND_REGNO_P): New.
>> (ANY_BND_REG_P): New.
>> (BNDmode): New.
>> (HI_REGISTER_NAMES): Add bound registers.
>> * config/i386/i386.md (UNSPEC_BNDMK): New.
>> (UNSPEC_BNDMK_ADDR): New.
>> (UNSPEC_BNDSTX): New.
>> (UNSPEC_BNDLDX): New.
>> (UNSPEC_BNDLDX_ADDR): New.
>> (UNSPEC_BNDCL): New.
>> (UNSPEC_BNDCU): New.
>> (UNSPEC_BNDCN): New.
>> (UNSPEC_MPX_FENCE): New.
>> (BND0_REG): New.
>> (BND1_REG): New.
>> (type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
>> (length_immediate): Likewise.
>> (prefix_0f): Likewise.
>> (memory): Likewise.
>> (prefix_rep): Check for bnd prefix.
>> (BND): New.
>> (bnd_ptr): New.
>> (BNDCHECK): New.
>> (bndcheck): New.
>> (*jcc_1): Add MPX bnd prefix and fix length.
>> (*jcc_2): Likewise.
>> (jump): Likewise.
>> (simple_return_inter

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA

2013-09-10 Thread Ilya Enkovich

Ping^4

Could someone please look at this patch? It is mostly i386 target-specific
and is the basis for further MPX-based features.

Thanks,
Ilya

2013/9/2 Ilya Enkovich :
> Ping^3
>
> Attached is the same patch but against the current trunk.
>
> 2013/8/26 Ilya Enkovich :
>> Ping
>>
>> 2013/8/19 Ilya Enkovich :
>>> Ping
>>>
>>> 2013/8/12 Ilya Enkovich :
>>>> 2013/8/10 Joseph S. Myers :
>>>>> On Mon, 29 Jul 2013, Ilya Enkovich wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Here is updated version of the patch. I removed redundant
>>>>>> mode_for_bound, added comments to BOUND_TYPE and added -mmpx option.
>>>>>> I also fixed bndmk/bndldx/bndstx constraints to avoid incorrect
>>>>>> register allocation (created two new constraints for that).
>>>>>
>>>>> I think the -mmpx option should be documented in invoke.texi, and the new
>>>>> machine modes / mode class should be documented in rtl.texi where other
>>>>> machine modes / mode classes are documented.  Beyond that, I have no
>>>>> comments on this patch revision.
>>>>>
>>>>> --
>>>>> Joseph S. Myers
>>>>> jos...@codesourcery.com
>>>>
>>>> Thanks! Here is a new revision with -mmpx and new machine modes /
>>>> class documented.
>>>> Is it good to install to trunk?
>>>>
>>>> Thanks,
>>>> Ilya
>>>> ---
>>>> 2013-08-12  Ilya Enkovich  
>>>>
>>>> * mode-classes.def (MODE_BOUND): New.
>>>> * tree.def (BOUND_TYPE): New.
>>>> * genmodes.c (complete_mode): Support MODE_BOUND.
>>>> (BOUND_MODE): New.
>>>> (make_bound_mode): New.
>>>> * machmode.h (BOUND_MODE_P): New.
>>>> * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
>>>> (layout_type): Support BOUND_TYPE.
>>>> * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
>>>> * tree.c (build_int_cst_wide): Support BOUND_TYPE.
>>>> (type_contains_placeholder_1): Likewise.
>>>> * tree.h (BOUND_TYPE_P): New.
>>>> * varasm.c (output_constant): Support BOUND_TYPE.
>>>> * config/i386/constraints.md (B): New.
>>>> (Ti): New.
>>>> (Tb): New.
>>>> * config/i386/i386-modes.def (BND32): New.
>>>> (BND64): New.
>>>> * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
>>>> * config/i386/i386.c (isa_opts): Add mmpx.
>>>> (regclass_map): Add bound registers.
>>>> (dbx_register_map): Likewise.
>>>> (dbx64_register_map): Likewise.
>>>> (svr4_dbx_register_map): Likewise.
>>>> (PTA_MPX): New.
>>>> (ix86_option_override_internal) Support MPX ISA.
>>>> (ix86_code_end): Add MPX bnd prefix.
>>>> (output_set_got): Likewise.
>>>> (ix86_output_call_insn): Likewise.
>>>> (get_some_local_dynamic_name): Add '!' (MPX bnd) print prefix 
>>>> support.
>>>> (ix86_print_operand_punct_valid_p): Likewise.
>>>> (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
>>>> UNSPEC_BNDMK_ADDR.
>>>> (ix86_class_likely_spilled_p): Add bound regs support.
>>>> (ix86_hard_regno_mode_ok): Likewise.
>>>> (x86_order_regs_for_local_alloc): Likewise.
>>>> (ix86_bnd_prefixed_insn_p): New.
>>>> * config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
>>>> (FIXED_REGISTERS): Add bound registers.
>>>> (CALL_USED_REGISTERS): Likewise.
>>>> (REG_ALLOC_ORDER): Likewise.
>>>> (HARD_REGNO_NREGS): Likewise.
>>>> (TARGET_MPX): New.
>>>> (VALID_BND_REG_MODE): New.
>>>> (FIRST_BND_REG): New.
>>>> (LAST_BND_REG): New.
>>>> (reg_class): Add BND_REGS.
>>>> (REG_CLASS_NAMES): Likewise.
>>>> (REG_CLASS_CONTENTS): Likewise.
>>>> (BND_REGNO_P): New.
>>>> (ANY_BND_REG_P): New.
>>>> (BNDmode): New.
>>>> (HI_REGISTER_NAMES): Add bo

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA

2013-09-13 Thread Ilya Enkovich

2013/9/11 Uros Bizjak :
> On Tue, Sep 10, 2013 at 1:38 PM, Ilya Enkovich  wrote:
>> Ping^4
>>
>> Could please someone look at this patch? It is mostly i386 target
>> specific and is basic for further MPX based features.
>>
>> Thanks,
>> Ilya
>>
>> 2013/9/2 Ilya Enkovich :
>>> Ping^3
>>>
>>> Attached is the same patch but against the current trunk.
>>>
>>> 2013/8/26 Ilya Enkovich :
>>>> Ping
>>>>
>>>> 2013/8/19 Ilya Enkovich :
>>>>> Ping
>>>>>
>>>>> 2013/8/12 Ilya Enkovich :
>>>>>> 2013/8/10 Joseph S. Myers :
>>>>>>> On Mon, 29 Jul 2013, Ilya Enkovich wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Here is updated version of the patch. I removed redundant
>>>>>>>> mode_for_bound, added comments to BOUND_TYPE and added -mmpx option.
>>>>>>>> I also fixed bndmk/bndldx/bndstx constraints to avoid incorrect
>>>>>>>> register allocation (created two new constraints for that).
>>>>>>>
>>>>>>> I think the -mmpx option should be documented in invoke.texi, and the 
>>>>>>> new
>>>>>>> machine modes / mode class should be documented in rtl.texi where other
>>>>>>> machine modes / mode classes are documented.  Beyond that, I have no
>>>>>>> comments on this patch revision.
>>>>>>>
>>>>>>> --
>>>>>>> Joseph S. Myers
>>>>>>> jos...@codesourcery.com
>>>>>>
>>>>>> Thanks! Here is a new revision with -mmpx and new machine modes /
>>>>>> class documented.
>>>>>> Is it good to install to trunk?
>>>>>>
>>>>>> Thanks,
>>>>>> Ilya
>>>>>> ---
>>>>>> 2013-08-12  Ilya Enkovich  
>>>>>>
>>>>>> * mode-classes.def (MODE_BOUND): New.
>>>>>> * tree.def (BOUND_TYPE): New.
>>>>>> * genmodes.c (complete_mode): Support MODE_BOUND.
>>>>>> (BOUND_MODE): New.
>>>>>> (make_bound_mode): New.
>>>>>> * machmode.h (BOUND_MODE_P): New.
>>>>>> * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
>>>>>> (layout_type): Support BOUND_TYPE.
>>>>>> * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
>>>>>> * tree.c (build_int_cst_wide): Support BOUND_TYPE.
>>>>>> (type_contains_placeholder_1): Likewise.
>>>>>> * tree.h (BOUND_TYPE_P): New.
>>>>>> * varasm.c (output_constant): Support BOUND_TYPE.
>>>>>> * config/i386/constraints.md (B): New.
>>>>>> (Ti): New.
>>>>>> (Tb): New.
>>>>>> * config/i386/i386-modes.def (BND32): New.
>>>>>> (BND64): New.
>>>>>> * config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
>>>>>> * config/i386/i386.c (isa_opts): Add mmpx.
>>>>>> (regclass_map): Add bound registers.
>>>>>> (dbx_register_map): Likewise.
>>>>>> (dbx64_register_map): Likewise.
>>>>>> (svr4_dbx_register_map): Likewise.
>>>>>> (PTA_MPX): New.
>>>>>> (ix86_option_override_internal) Support MPX ISA.
>>>>>> (ix86_code_end): Add MPX bnd prefix.
>>>>>> (output_set_got): Likewise.
>>>>>> (ix86_output_call_insn): Likewise.
>>>>>> (get_some_local_dynamic_name): Add '!' (MPX bnd) print prefix 
>>>>>> support.
>>>>>> (ix86_print_operand_punct_valid_p): Likewise.
>>>>>> (ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
>>>>>> UNSPEC_BNDMK_ADDR.
>>>>>> (ix86_class_likely_spilled_p): Add bound regs support.
>>>>>> (ix86_hard_regno_mode_ok): Likewise.
>>>>>> (x86_order_regs_for_local_alloc): Likewise.
>>>>>> (ix86_bnd_prefixed_insn_p): New.
>>>>>> * config/i386/i386.h (FIRST_PSEUDO_REG

[PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-09-17 Thread Ilya Enkovich

Hi,

Here is a patch introducing a new type and mode for bounds. It is part of the MPX 
ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).

Bootstrapped and tested on linux-x86_64. Is it OK for trunk?

Thanks,
Ilya
--

gcc/

2013-09-16  Ilya Enkovich  

* mode-classes.def (MODE_BOUND): New.
* tree.def (BOUND_TYPE): New.
* genmodes.c (complete_mode): Support MODE_BOUND.
(BOUND_MODE): New.
(make_bound_mode): New.
* machmode.h (BOUND_MODE_P): New.
* stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
(layout_type): Support BOUND_TYPE.
* tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
* tree.c (build_int_cst_wide): Support BOUND_TYPE.
(type_contains_placeholder_1): Likewise.
* tree.h (BOUND_TYPE_P): New.
* varasm.c (output_constant): Support BOUND_TYPE.
* doc/rtl.texi (MODE_BOUND): New.
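
For readers unfamiliar with mode classes, the entries above can be modelled with a small standalone sketch. It is not GCC code: the `toy_` names, the reduced mode list and the byte sizes (8 for BND32, 16 for BND64) are assumptions made for illustration. It only shows the idea of a class predicate such as BOUND_MODE_P and of mapping a bound mode to an integer mode of the same size, as the int_mode_for_mode change does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy subset of the mode classes; MODE_BOUND is the class the patch adds.  */
enum toy_mode_class { MODE_RANDOM, MODE_INT, MODE_BOUND };

/* Toy subset of machine modes (hypothetical, for illustration only).  */
enum toy_mode { VOIDmode, DImode, TImode, BND32mode, BND64mode, NUM_TOY_MODES };

struct toy_mode_info
{
  enum toy_mode_class cl;   /* class of the mode */
  unsigned bytesize;        /* size in bytes; assumed values */
};

static const struct toy_mode_info toy_modes[NUM_TOY_MODES] = {
  [VOIDmode]  = { MODE_RANDOM, 0 },
  [DImode]    = { MODE_INT, 8 },
  [TImode]    = { MODE_INT, 16 },
  [BND32mode] = { MODE_BOUND, 8 },
  [BND64mode] = { MODE_BOUND, 16 },
};

#define GET_MODE_CLASS(MODE) (toy_modes[MODE].cl)
#define BOUND_MODE_P(MODE) (GET_MODE_CLASS (MODE) == MODE_BOUND)

/* Return the integer mode with the same size as MODE, or VOIDmode if
   the toy table has none.  */
static enum toy_mode
toy_int_mode_for_mode (enum toy_mode mode)
{
  assert (mode < NUM_TOY_MODES);
  if (GET_MODE_CLASS (mode) == MODE_INT)
    return mode;
  for (int m = 0; m < NUM_TOY_MODES; m++)
    if (toy_modes[m].cl == MODE_INT
        && toy_modes[m].bytesize == toy_modes[mode].bytesize)
      return (enum toy_mode) m;
  return VOIDmode;
}
```

In this model BOUND_MODE_P (BND64mode) holds, and toy_int_mode_for_mode (BND32mode) yields DImode because both are 8 bytes wide in the toy table.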

diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1d62223..02b1214 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the @file{@var{machine}-modes.def}.
 @xref{Jump Patterns},
 also see @ref{Condition Code}.
 
+@findex MODE_BOUND
+@item MODE_BOUND
+Bound modes class.  Used to represent values of pointer bounds.
+
 @findex MODE_RANDOM
 @item MODE_RANDOM
 This is a catchall mode class for modes which don't fit into the above
diff --git a/gcc/genmodes.c b/gcc/genmodes.c
index dc38483..89174ec 100644
--- a/gcc/genmodes.c
+++ b/gcc/genmodes.c
@@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
   break;
 
 case MODE_INT:
+case MODE_BOUND:
 case MODE_FLOAT:
 case MODE_DECIMAL_FLOAT:
 case MODE_FRACT:
@@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name,
   new_mode (cl, name, file, line);
 }
 
+#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__)
+
+static void ATTRIBUTE_UNUSED
+make_bound_mode (const char *name,
+   unsigned int bytesize,
+   const char *file, unsigned int line)
+{
+  struct mode_data *m = new_mode (MODE_BOUND, name, file, line);
+  m->bytesize = bytesize;
+}
+
+
 #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
 #define FRACTIONAL_INT_MODE(N, B, Y) \
   make_int_mode (#N, B, Y, __FILE__, __LINE__)
diff --git a/gcc/machmode.h b/gcc/machmode.h
index 981ee92..d4a20b2 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
|| CLASS == MODE_ACCUM  \
|| CLASS == MODE_UACCUM)
 
+#define BOUND_MODE_P(MODE)  \
+  (GET_MODE_CLASS (MODE) == MODE_BOUND)
+
 /* Get the size in bytes and bits of an object of mode MODE.  */
 
 extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index 7207ef7..c5ea215 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
   DEF_MODE_CLASS (MODE_RANDOM),        /* other */                        \
   DEF_MODE_CLASS (MODE_CC),            /* condition code in a register */ \
   DEF_MODE_CLASS (MODE_INT),           /* integer */                      \
+  DEF_MODE_CLASS (MODE_BOUND),         /* bounds */                       \
   DEF_MODE_CLASS (MODE_PARTIAL_INT),   /* integer with padding bits */    \
   DEF_MODE_CLASS (MODE_FRACT),         /* signed fractional number */     \
   DEF_MODE_CLASS (MODE_UFRACT),        /* unsigned fractional number */   \
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 6f6b310..82611c7 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode)
 case MODE_VECTOR_ACCUM:
 case MODE_VECTOR_UFRACT:
 case MODE_VECTOR_UACCUM:
+case MODE_BOUND:
   mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
   break;
 
@@ -2135,6 +2136,13 @@ layout_type (tree type)
   SET_TYPE_MODE (type, VOIDmode);
   break;
 
+case BOUND_TYPE:
+  SET_TYPE_MODE (type,
+ mode_for_size (TYPE_PRECISION (type), MODE_BOUND, 0));
+  TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type)));
+  TYPE_SIZE_UNIT (type) = size_int (GET_MODE_SIZE (TYPE_MODE (type)));
+  break;
+
 case OFFSET_TYPE:
   TYPE_SIZE (type) = bitsize_int (POINTER_SIZE);
   TYPE_SIZE_UNIT (type) = size_int (POINTER_SIZE / BITS_PER_UNIT);
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 69e4006..8b0825c 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -697,6 +697,7 @@ dump_generic_node (pretty_printer *buffer, tree node, int spc, int flags,
   break;
 
 case VOID_TYPE:
+case BOUND_TYPE:
 case INTEGER_TYPE:
 case REAL_TYPE:
 case FIXED_POINT_TYPE:
diff --git a/gcc/tree.c b/gcc/tree.c
index b469b97..bbbe16e 100644
--- a/gc

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-09-17 Thread Ilya Enkovich

On 16 Sep 11:24, Uros Bizjak wrote:
> On Fri, Sep 13, 2013 at 12:18 PM, Ilya Enkovich  
> wrote:
> > 2013/9/11 Uros Bizjak :
> >>
> >
> > Hi Uros,
> >
> > Thanks a lot for the review!
> >
> >> The x86 part looks mostly OK (I have a couple of comments below), but
> >> please first get target-independent changes reviewed and committed.
> >
> > Do you mean I should move bound type and mode declaration into a separate 
> > patch?
> 
> Yes, target-independent part (middle end) has to go through the
> separate review to check if this part is OK. The target-dependent part
> uses the infrastructure from the middle end, so it can go into the
> code base only after target-independent parts are committed.

I sent a separate patch for the bound type and mode class 
(http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is the target part 
of the patch with the fixes you mentioned. Does it look OK?

Bootstrapped and checked on linux-x86_64. It still shows the incorrect length 
attribute computation (described here: 
http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html).

Thanks,
Ilya
--

2013-09-16  Ilya Enkovich  

* config/i386/constraints.md (B): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c (isa_opts): Add mmpx.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(print_reg): Likewise.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(ix86_output_call_insn): Likewise.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(TARGET_MPX): New.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Likewise.
(prefix_0f): Likewise.
(memory): Likewise.
(prefix_rep): Check for bnd prefix.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndcheck): New.
(*jcc_1): Add MPX bnd prefix and fix length.
(*jcc_2): Likewise.
(jump): Likewise.
(simple_return_internal): Likewise.
(simple_return_pop_internal): Likewise.
(*indirect_jump): Add MPX bnd prefix.
(*tablejump_1): Likewise.
(simple_return_internal_long): Likewise.
(simple_return_indirect_internal): Likewise.
(_mk): New.
(*_mk): New.
(mov): New.
(*mov_internal_mpx): New.
(_): New.
(*_): New.
(_ldx): New.
(*_ldx): New.
(_stx): New.
(*_stx): New.
* config/i386/predicates.md (lea_address_operand): Rename to...
(address_no_seg_operand): ... this.
(address_mpx_no_base_operand): New.
(address_mpx_no_index_operand): New.
(bnd_mem_operator): New.
* config/i386/i386.opt (mmpx): New.
* doc/invoke.texi: Add documentation for the flags -mmpx, -mno-mpx.
* doc/rtl.texi: Add documentation for BND32mode and BND64mode.
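
The '!' print prefix mentioned in the ChangeLog can be illustrated with a standalone sketch. This is not the i386.c implementation: `toy_expand_template` is a hypothetical helper, and it assumes that "%!" in an output template expands to "bnd " when the instruction must carry the bnd prefix and to nothing otherwise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Expand a toy output template into OUT.  "%!" becomes "bnd " when
   BND_PREFIXED is true and is dropped otherwise; every other character
   is copied unchanged.  Returns the number of characters written
   (excluding the terminating NUL), or -1 if OUT is too small.  */
static int
toy_expand_template (const char *templ, bool bnd_prefixed,
                     char *out, size_t out_size)
{
  size_t n = 0;

  assert (templ != NULL && out != NULL);
  for (const char *p = templ; *p != '\0'; p++)
    {
      if (p[0] == '%' && p[1] == '!')
        {
          if (bnd_prefixed)
            {
              /* Need room for "bnd " plus the final NUL.  */
              if (n + 4 >= out_size)
                return -1;
              memcpy (out + n, "bnd ", 4);
              n += 4;
            }
          p++;  /* Skip the '!' as well.  */
          continue;
        }
      if (n + 1 >= out_size)
        return -1;
      out[n++] = *p;
    }
  if (n >= out_size)
    return -1;
  out[n] = '\0';
  return (int) n;
}
```

With the template "%!ret", the sketch produces "bnd ret" for a bnd-prefixed instruction and plain "ret" otherwise, so a single template serves both cases.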
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 28e626f..79d02f7 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -18,7 +18,7 @@
 ;; <http://www.gnu.org/licenses/>.
 
 ;;; Unused letters

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-10-01 Thread Ilya Enkovich

On 26 Sep 23:12, Uros Bizjak wrote:
> On Tue, Sep 17, 2013 at 10:41 AM, Ilya Enkovich  
> wrote:
> 
> >> >> The x86 part looks mostly OK (I have a couple of comments below), but
> >> >> please first get target-independent changes reviewed and committed.
> >> >
> >> > Do you mean I should move bound type and mode declaration into a 
> >> > separate patch?
> >>
> >> Yes, target-independent part (middle end) has to go through the
> >> separate review to check if this part is OK. The target-dependent part
> >> uses the infrastructure from the middle end, so it can go into the
> >> code base only after target-independent parts are committed.
> >
> > I sent a separate patch for bound type and mode class 
> > (http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01268.html). Here is target 
> > part of the patch with fixes you mentioned. Does it look OK?
> >
> > Bootstrapped and checked on linux-x86_64. Still shows incorrect length 
> > attribute computation (described here 
> > http://gcc.gnu.org/ml/gcc/2013-07/msg00311.html).
> 
> Please look at the attached patch that solves length computation
> problem. The patch also implements length calculation in a generic
> way, as proposed earlier.
> 
> The idea is to calculate total insn length via generic "length"
> attribute calculation from "length_nobnd" attribute, but iff
> length_attribute is non-null. This way, we are able to decorate
> bnd-prefixed instructions by "length_nobnd" attribute, and generic
> part will automatically call ix86_bnd_prefixed_insn_p predicate with
> current insn pattern. I also believe that this approach is most
> flexible to decorate future patterns.
> 
> The patch adds new attribute to a couple of patterns to illustrate its usage.
> 
> Please test this approach. Modulo length calculations, improved by the
> patch in this message, I have no further comments, but please repost
> complete (target part) of your patch.

Hi Uros,

Thanks for your reply! I applied the approach you proposed for the length 
attribute. It works well. Make check is clean now.
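
As a rough standalone illustration of that length scheme (not the actual i386.md attribute code): the `toy_insn` structure and its fields are hypothetical stand-ins, and the sketch assumes the bnd prefix costs one extra byte and that a zero length_nobnd means the attribute was not set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-insn data the attributes see.  */
struct toy_insn
{
  int length_nobnd;    /* length without the bnd prefix; 0 = not set */
  int default_length;  /* length computed the ordinary way */
  bool bnd_prefixed;   /* what ix86_bnd_prefixed_insn_p would report */
};

/* Total length: if length_nobnd is set, add one byte when the insn
   carries the bnd prefix; otherwise fall back to the ordinary length.  */
static int
toy_insn_length (const struct toy_insn *insn)
{
  assert (insn != NULL);
  if (insn->length_nobnd != 0)
    return insn->length_nobnd + (insn->bnd_prefixed ? 1 : 0);
  return insn->default_length;
}
```

A pattern that sets length_nobnd therefore gets its prefix accounted for automatically, while patterns that leave it unset keep their existing length computation.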

I also adjusted the bound registers to account for the recently added mask 
registers. Attached is a new patch.

Thanks,
Ilya

--

2013-09-30  Ilya Enkovich  

* config/i386/constraints.md (B): New.
(Ti): New.
(Tb): New.
* config/i386/i386-c.c (ix86_target_macros_internal): Add __MPX__.
* config/i386/i386-modes.def (BND32): New.
(BND64): New.
* config/i386/i386-protos.h (ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.c (isa_opts): Add mmpx.
(regclass_map): Add bound registers.
(dbx_register_map): Likewise.
(dbx64_register_map): Likewise.
(svr4_dbx_register_map): Likewise.
(PTA_MPX): New.
(ix86_option_override_internal): Support MPX ISA.
(ix86_conditional_register_usage): Support bound registers.
(print_reg): Likewise.
(ix86_code_end): Add MPX bnd prefix.
(output_set_got): Likewise.
(ix86_output_call_insn): Likewise.
(ix86_print_operand): Add '!' (MPX bnd) print prefix support.
(ix86_print_operand_punct_valid_p): Likewise.
(ix86_print_operand_address): Support UNSPEC_BNDMK_ADDR and
UNSPEC_BNDLDX_ADDR.
(ix86_class_likely_spilled_p): Add bound regs support.
(ix86_hard_regno_mode_ok): Likewise.
(x86_order_regs_for_local_alloc): Likewise.
(ix86_bnd_prefixed_insn_p): New.
* config/i386/i386.h (FIRST_PSEUDO_REGISTER): Fix to new value.
(FIXED_REGISTERS): Add bound registers.
(CALL_USED_REGISTERS): Likewise.
(REG_ALLOC_ORDER): Likewise.
(HARD_REGNO_NREGS): Likewise.
(TARGET_MPX): New.
(VALID_BND_REG_MODE): New.
(FIRST_BND_REG): New.
(LAST_BND_REG): New.
(reg_class): Add BND_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
(BND_REGNO_P): New.
(ANY_BND_REG_P): New.
(BNDmode): New.
(HI_REGISTER_NAMES): Add bound registers.
* config/i386/i386.md (UNSPEC_BNDMK): New.
(UNSPEC_BNDMK_ADDR): New.
(UNSPEC_BNDSTX): New.
(UNSPEC_BNDLDX): New.
(UNSPEC_BNDLDX_ADDR): New.
(UNSPEC_BNDCL): New.
(UNSPEC_BNDCU): New.
(UNSPEC_BNDCN): New.
(UNSPEC_MPX_FENCE): New.
(BND0_REG): New.
(BND1_REG): New.
(type): Add mpxmov, mpxmk, mpxchk, mpxld, mpxst.
(length_immediate): Likewise.
(prefix_0f): Likewise.
(memory): Likewise.
(prefix_rep): Check for bnd prefix.
(length_nobnd): New.
(length): Use length_nobnd if specified.
(BND): New.
(bnd_ptr): New.
(BNDCHECK): New.
(bndche

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-02 Thread Ilya Enkovich

Ping

2013/9/17 Ilya Enkovich :
> Hi,
>
> Here is a patch introducing new type and mode for bounds. It is a part of MPX 
> ISA support patch (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).
>
> Bootstrapped and tested on linux-x86_64. Is it OK for trunk?
>
> Thanks,
> Ilya
> --
>
> gcc/
>
> 2013-09-16  Ilya Enkovich  
>
> * mode-classes.def (MODE_BOUND): New.
> * tree.def (BOUND_TYPE): New.
> * genmodes.c (complete_mode): Support MODE_BOUND.
> (BOUND_MODE): New.
> (make_bound_mode): New.
> * machmode.h (BOUND_MODE_P): New.
> * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
> (layout_type): Support BOUND_TYPE.
> * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
> * tree.c (build_int_cst_wide): Support BOUND_TYPE.
> (type_contains_placeholder_1): Likewise.
> * tree.h (BOUND_TYPE_P): New.
> * varasm.c (output_constant): Support BOUND_TYPE.
> * doc/rtl.texi (MODE_BOUND): New.
>
> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> index 1d62223..02b1214 100644
> --- a/gcc/doc/rtl.texi
> +++ b/gcc/doc/rtl.texi
> @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the 
> @file{@var{machine}-modes.def}.
>  @xref{Jump Patterns},
>  also see @ref{Condition Code}.
>
> +@findex MODE_BOUND
> +@item MODE_BOUND
> +Bound modes class.  Used to represent values of pointer bounds.
> +
>  @findex MODE_RANDOM
>  @item MODE_RANDOM
>  This is a catchall mode class for modes which don't fit into the above
> diff --git a/gcc/genmodes.c b/gcc/genmodes.c
> index dc38483..89174ec 100644
> --- a/gcc/genmodes.c
> +++ b/gcc/genmodes.c
> @@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
>break;
>
>  case MODE_INT:
> +case MODE_BOUND:
>  case MODE_FLOAT:
>  case MODE_DECIMAL_FLOAT:
>  case MODE_FRACT:
> @@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name,
>new_mode (cl, name, file, line);
>  }
>
> +#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__)
> +
> +static void ATTRIBUTE_UNUSED
> +make_bound_mode (const char *name,
> +   unsigned int bytesize,
> +   const char *file, unsigned int line)
> +{
> +  struct mode_data *m = new_mode (MODE_BOUND, name, file, line);
> +  m->bytesize = bytesize;
> +}
> +
> +
>  #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
>  #define FRACTIONAL_INT_MODE(N, B, Y) \
>make_int_mode (#N, B, Y, __FILE__, __LINE__)
> diff --git a/gcc/machmode.h b/gcc/machmode.h
> index 981ee92..d4a20b2 100644
> --- a/gcc/machmode.h
> +++ b/gcc/machmode.h
> @@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
> || CLASS == MODE_ACCUM  \
> || CLASS == MODE_UACCUM)
>
> +#define BOUND_MODE_P(MODE)  \
> +  (GET_MODE_CLASS (MODE) == MODE_BOUND)
> +
>  /* Get the size in bytes and bits of an object of mode MODE.  */
>
>  extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
> diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
> index 7207ef7..c5ea215 100644
> --- a/gcc/mode-classes.def
> +++ b/gcc/mode-classes.def
> @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
>DEF_MODE_CLASS (MODE_RANDOM),/* other */   
>  \
>DEF_MODE_CLASS (MODE_CC),/* condition code in a register */ \
>DEF_MODE_CLASS (MODE_INT),   /* integer */  \
> +  DEF_MODE_CLASS (MODE_BOUND),/* bounds */ \
>DEF_MODE_CLASS (MODE_PARTIAL_INT),   /* integer with padding bits */\
>DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \
>DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional number 
> */   \
> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
> index 6f6b310..82611c7 100644
> --- a/gcc/stor-layout.c
> +++ b/gcc/stor-layout.c
> @@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode)
>  case MODE_VECTOR_ACCUM:
>  case MODE_VECTOR_UFRACT:
>  case MODE_VECTOR_UACCUM:
> +case MODE_BOUND:
>mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
>break;
>
> @@ -2135,6 +2136,13 @@ layout_type (tree type)
>SET_TYPE_MODE (type, VOIDmode);
>break;
>
> +case BOUND_TYPE:
> +  SET_TYPE_MODE (type,
> + mode_for_size (TYPE_PRECISION (type), MODE_BOUND, 0));
> +  TYPE_SIZE (type) = bitsize_int (GET_MODE_BITSIZE (TYPE_MODE (type)));
> +  TYPE_SIZE_UNIT (type) = size_int (GET

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-15 Thread Ilya Enkovich

Hey guys,

could someone please look at this small patch? It blocks the approved MPX
ISA support on the i386 target.

Thanks,
Ilya

2013/10/2 Ilya Enkovich :
> Ping
>
> 2013/9/17 Ilya Enkovich :
>> Hi,
>>
>> Here is a patch introducing new type and mode for bounds. It is a part of 
>> MPX ISA support patch 
>> (http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).
>>
>> Bootstrapped and tested on linux-x86_64. Is it OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>>
>> gcc/
>>
>> 2013-09-16  Ilya Enkovich  
>>
>> * mode-classes.def (MODE_BOUND): New.
>> * tree.def (BOUND_TYPE): New.
>> * genmodes.c (complete_mode): Support MODE_BOUND.
>> (BOUND_MODE): New.
>> (make_bound_mode): New.
>> * machmode.h (BOUND_MODE_P): New.
>> * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
>> (layout_type): Support BOUND_TYPE.
>> * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
>> * tree.c (build_int_cst_wide): Support BOUND_TYPE.
>> (type_contains_placeholder_1): Likewise.
>> * tree.h (BOUND_TYPE_P): New.
>> * varasm.c (output_constant): Support BOUND_TYPE.
>> * doc/rtl.texi (MODE_BOUND): New.
>>
>> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
>> index 1d62223..02b1214 100644
>> --- a/gcc/doc/rtl.texi
>> +++ b/gcc/doc/rtl.texi
>> @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the 
>> @file{@var{machine}-modes.def}.
>>  @xref{Jump Patterns},
>>  also see @ref{Condition Code}.
>>
>> +@findex MODE_BOUND
>> +@item MODE_BOUND
>> +Bound modes class.  Used to represent values of pointer bounds.
>> +
>>  @findex MODE_RANDOM
>>  @item MODE_RANDOM
>>  This is a catchall mode class for modes which don't fit into the above
>> diff --git a/gcc/genmodes.c b/gcc/genmodes.c
>> index dc38483..89174ec 100644
>> --- a/gcc/genmodes.c
>> +++ b/gcc/genmodes.c
>> @@ -333,6 +333,7 @@ complete_mode (struct mode_data *m)
>>break;
>>
>>  case MODE_INT:
>> +case MODE_BOUND:
>>  case MODE_FLOAT:
>>  case MODE_DECIMAL_FLOAT:
>>  case MODE_FRACT:
>> @@ -533,6 +534,18 @@ make_special_mode (enum mode_class cl, const char *name,
>>new_mode (cl, name, file, line);
>>  }
>>
>> +#define BOUND_MODE(N, Y) make_bound_mode (#N, Y, __FILE__, __LINE__)
>> +
>> +static void ATTRIBUTE_UNUSED
>> +make_bound_mode (const char *name,
>> +   unsigned int bytesize,
>> +   const char *file, unsigned int line)
>> +{
>> +  struct mode_data *m = new_mode (MODE_BOUND, name, file, line);
>> +  m->bytesize = bytesize;
>> +}
>> +
>> +
>>  #define INT_MODE(N, Y) FRACTIONAL_INT_MODE (N, -1U, Y)
>>  #define FRACTIONAL_INT_MODE(N, B, Y) \
>>make_int_mode (#N, B, Y, __FILE__, __LINE__)
>> diff --git a/gcc/machmode.h b/gcc/machmode.h
>> index 981ee92..d4a20b2 100644
>> --- a/gcc/machmode.h
>> +++ b/gcc/machmode.h
>> @@ -174,6 +174,9 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
>> || CLASS == MODE_ACCUM  \
>> || CLASS == MODE_UACCUM)
>>
>> +#define BOUND_MODE_P(MODE)  \
>> +  (GET_MODE_CLASS (MODE) == MODE_BOUND)
>> +
>>  /* Get the size in bytes and bits of an object of mode MODE.  */
>>
>>  extern CONST_MODE_SIZE unsigned char mode_size[NUM_MACHINE_MODES];
>> diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
>> index 7207ef7..c5ea215 100644
>> --- a/gcc/mode-classes.def
>> +++ b/gcc/mode-classes.def
>> @@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
>>DEF_MODE_CLASS (MODE_RANDOM),/* other */  
>>   \
>>DEF_MODE_CLASS (MODE_CC),/* condition code in a register */ \
>>DEF_MODE_CLASS (MODE_INT),   /* integer */  \
>> +  DEF_MODE_CLASS (MODE_BOUND),/* bounds */ \
>>DEF_MODE_CLASS (MODE_PARTIAL_INT),   /* integer with padding bits */\
>>DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */ \
>>DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional 
>> number */   \
>> diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
>> index 6f6b310..82611c7 100644
>> --- a/gcc/stor-layout.c
>> +++ b/gcc/stor-layout.c
>> @@ -383,6 +383,7 @@ int_mode_for_mode (enum machine_mode mode)
>>

[PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-21 Thread Ilya Enkovich

Hi,

This patch starts the series which introduces Pointers Checker and its support 
in i386 via Intel MPX. Pointers Checker is described on the Wiki page: 
http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler. This 
series replaces the previously sent patch 
(http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01167.html), which was too 
inconvenient to review.

The first patch in the series introduces new target and language hooks used by 
Pointers Checker.
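
For readers unfamiliar with how a target hook with a default behaves, here is a 
minimal standalone sketch, not taken from the patch. It models only the 
chkp_bound_mode hook and the "is the checker supported" test that is quoted 
later in this thread. The enum, the struct layout and the names 
mpx64_chkp_bound_mode and checker_supported_p are assumptions made for 
illustration; only default_chkp_bound_mode, VOIDmode and the BND64 mode name 
come from the series.

```c
#include <assert.h>

/* Hypothetical stand-in for GCC's enum machine_mode.  */
enum mode { VOIDmode, BND32mode, BND64mode };

/* Tiny model of the target hook table: one function pointer per hook.  */
struct target_hooks
{
  enum mode (*chkp_bound_mode) (void);
};

/* Default hook: the target provides no mode for bounds.  */
enum mode
default_chkp_bound_mode (void)
{
  return VOIDmode;
}

/* What a 64-bit MPX-like target might return instead (assumption).  */
enum mode
mpx64_chkp_bound_mode (void)
{
  return BND64mode;
}

/* Model of the option check: nonzero if the checker can be enabled,
   i.e. the target reports a real bound mode.  */
int
checker_supported_p (const struct target_hooks *targetm)
{
  assert (targetm->chkp_bound_mode != 0);
  return targetm->chkp_bound_mode () != VOIDmode;
}
```

A back end that leaves the hook at its default is reported as unsupported; one 
that installs its own function is accepted.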

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
* doc/tm.texi: Regenerated.
* langhooks.h (lang_hooks): Add chkp_supported field.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): New.
(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_CHKP_SUPPORTED.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..01462a2 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,11 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for va_list object or integer_zero_node if
+it does not have any (e.g. is scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5156,16 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx, @var{rtx}, 
@var{rtx})
+This hook is used to emit insn to load arg's bounds
+in case bounds are not passed on register.  Return loaded bounds
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx, @var{rtx}, 
@var{rtx}, @var{rtx})
+This hook is used to emit insn to store arg's bounds
+in case bounds are not passed on register.
+@end deftypefn
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -10907,6 +10922,18 @@ ignored.  This function should return the result of 
the call to the
 built-in function.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned 
@var{fcode})
+Pointers checker instrumentation pass uses this hook to obtain
+target-specific functions which implement specified generic checker
+builtins.
+@end deftypefn
+@deftypefn {Target Hook} tree TARGET_CHKP_BOUND_TYPE (void)
+Return type to be used for bounds
+@end deftypefn
+@deftypefn {Target Hook} {enum machine_mode} TARGET_CHKP_BOUND_MODE (void)
+Return mode to be used for bounds.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_RESOLVE_OVERLOADED_BUILTIN (unsigned int 
@var{loc}, tree @var{fndecl}, void *@var{arglist})
 Select a replacement for a machine specific built-in function that
 was set up by @samp{TARGET_INIT_BUILTINS}.  This is done
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 863e843a..2828361 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3694,6 +3694,8 @@ stack.
 
 @hook TARGET_FN_ABI_VA_LIST
 
+@hook TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE
+
 @hook TARGET_CANONICAL_VA_LIST_TYPE
 
 @hook TARGET_GIMPLIFY_VA_ARG_EXPR
@@ -4064,6 +4066,10 @@ These machine description macros help implement varargs:
 
 @hook TARGET_PRETEND_OUTGOING_VARARGS_NAMED
 
+@hook TARGET_LOAD_BOUNDS_FOR_ARG
+
+@hook TARGET_STORE_BOUNDS_FOR_ARG
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -8184,6 +8190,10 @@ to by @var{ce_info}.
 
 @hook TARGET_EXPAND_BUILTIN
 
+@hook TARGET_BUILTIN_CHKP_FUNCTION
+@hook TARGET_CHKP_BOUND_TYPE
+@hook TARGET_CHKP_BOUND_MOD

[PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-21 Thread Ilya Enkovich

Hi,

This patch introduces built-in functions used by Pointers Checker and a flag to 
enable Pointers Checker. Builtins available to the user are expanded in 
expand_builtin. All other builtins are not allowed in expand until a generic 
version of Pointers Checker is implemented.
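
As a rough illustration of what the bounds builtins compute, here is a small 
software model, not the patch's implementation. It assumes a bounds value is an 
inclusive [lb, ub] address range; struct bounds and the functions bnd_make, 
bnd_check and bnd_intersect are hypothetical names that only mirror the roles 
of the BNDMK, BNDCL/BNDCU and INTERSECT builtins listed in the ChangeLog.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a bounds value: an inclusive [lb, ub] range.  */
struct bounds
{
  uintptr_t lb;
  uintptr_t ub;
};

/* Like BNDMK: bounds covering SIZE bytes starting at P.  SIZE must be > 0.  */
struct bounds
bnd_make (const void *p, size_t size)
{
  assert (size > 0);
  struct bounds b = { (uintptr_t) p, (uintptr_t) p + size - 1 };
  return b;
}

/* Like BNDCL and BNDCU together: nonzero if the SIZE bytes starting
   at P all lie inside B.  */
int
bnd_check (struct bounds b, const void *p, size_t size)
{
  uintptr_t lo = (uintptr_t) p;
  return size > 0 && lo >= b.lb && lo <= b.ub && size - 1 <= b.ub - lo;
}

/* Like the intersect builtin: the overlap of A and B.  */
struct bounds
bnd_intersect (struct bounds a, struct bounds b)
{
  struct bounds r = { a.lb > b.lb ? a.lb : b.lb,
                      a.ub < b.ub ? a.ub : b.ub };
  return r;
}
```

Narrowing a pointer to a sub-object is then just an intersection of the 
object's bounds with the bounds of the sub-object.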

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-04  Ilya Enkovich  

* builtin-types.def (BT_FN_VOID_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
* common.opt (fcheck-pointers): New.
* toplev.c (process_options): Check Pointers Checker is supported.
* doc/extend.texi: Document Pointers Checker built-in functions.


diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 2634ecc..c6c5e5c 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -224,12 +224,15 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT64_DFLOAT64, 
BT_DFLOAT64, BT_DFLOAT64)
 DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFLOAT128, BT_DFLOAT128, BT_DFLOAT128)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_VOID_CONST_PTR, BT_VOID, BT_CONST_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, BT_UINT16)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -341,6 +344,10 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, BT_SIZE,
 BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRPTR_CONST_PTR, BT_VOID, BT_PTR_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_CONST_PTR_SIZE, BT_VOID, BT_CONST_PTR, BT_SIZE)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
 
@@ -425,6 +432,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_I2, BT
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, 
BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR, BT_SIZE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3b16d59..b8dec3f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5861,7 +5861,18 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
enum machine_mode mode,
   && fcode != BUILT_IN_EXECVE
   && fcode != BUILT_IN_ALLOCA
   && fcode != BUILT_IN_ALLOCA_WITH_ALIGN
-  && fcode != BUILT_IN_FREE)
+  && fcode != BUILT_IN_FREE
+  && fcode != BUILT_IN_CHKP_SET_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_INIT_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_NULL_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_COPY_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_NARROW_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_STORE_PTR_BOUNDS
+  && fcode != BU

[PATCH, MPX, 2/X] Pointers Checker [3/25] Attributes

2013-10-21 Thread Ilya Enkovich

Hi,

This patch adds attributes 'bnd_variable_size' and 'bnd_legacy' used by 
Pointers Checker.
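
Both handlers in the diff follow the same pattern: accept the attribute on one 
kind of declaration and otherwise warn and drop it. A minimal standalone model 
of that pattern is sketched here. The enum and the function 
handle_decl_attribute are hypothetical stand-ins, not GCC's real tree API, and 
the warning is reduced to a return value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a few GCC tree codes.  */
enum tree_code { FIELD_DECL, FUNCTION_DECL, VAR_DECL };

/* Model of an attribute handler: accept the attribute only on a
   declaration of kind EXPECTED.  Otherwise set *NO_ADD_ATTRS so the
   caller drops the attribute.  Returns true when a warning would be
   issued.  */
bool
handle_decl_attribute (enum tree_code code, enum tree_code expected,
                       bool *no_add_attrs)
{
  assert (no_add_attrs != 0);
  if (code != expected)
    {
      *no_add_attrs = true;
      return true;
    }
  return false;
}
```

In this model bnd_variable_size corresponds to EXPECTED being FIELD_DECL and 
bnd_legacy to EXPECTED being FUNCTION_DECL.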

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  

* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(c_common_attribute_table): Add bnd_variable_size and bnd_legacy.
* doc/extend.texi: Document bnd_variable_size and bnd_legacy
attributes.


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 8ecb70c..3902909 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -371,6 +371,8 @@ static tree ignore_attribute (tree *, tree, tree, int, bool 
*);
 static tree handle_no_split_stack_attribute (tree *, tree, tree, int, bool *);
 static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *);
 static tree handle_warn_unused_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool 
*);
+static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 
 static void check_function_nonnull (tree, int, tree *);
 static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT);
@@ -745,8 +747,12 @@ const struct attribute_spec c_common_attribute_table[] =
  The name contains space to prevent its usage in source code.  */
   { "fn spec",   1, 1, false, true, true,
  handle_fnspec_attribute, false },
-  { "warn_unused",0, 0, false, false, false,
+  { "warn_unused",   0, 0, false, false, false,
  handle_warn_unused_attribute, false },
+  { "bnd_variable_size",  0, 0, true,  false, false,
+ handle_bnd_variable_size_attribute, false },
+  { "bnd_legacy", 0, 0, true, false, false,
+ handle_bnd_legacy, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -8000,6 +8006,38 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, 
tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+/* Handle a "bnd_variable_size" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_variable_size_attribute (tree *node, tree name, tree ARG_UNUSED 
(args),
+   int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FIELD_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
+/* Handle a "bnd_legacy" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_legacy (tree *node, tree name, tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Handle a "warn_unused" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ef422ad..7701d60 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2137,7 +2137,7 @@ attributes are currently defined for functions on all 
targets:
 @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
-@code{no_sanitize_undefined},
+@code{no_sanitize_undefined}, @code{bnd_legacy},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3508,6 +3508,12 @@ The @code{no_sanitize_undefined} attribute on functions 
is used
 to inform the compiler that it should not check for undefined behavior
 in the function when compiling with the @option{-fsanitize=undefined} option.
 
+@item bnd_legacy
+@cindex @code{bnd_legacy} function attribute
+The @code{bnd_legacy} attribute on functions is used to inform
+compiler that function should not be instrumented when compiled
+with @option{-fcheck-pointers} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
@@ -5280,12 +5286,12 @@ placed in either the @code{.bss_below100} section or the
 The keyword @code{__attribute__} allows you to specify special
 attributes of @code{struct} and @code{union} types when you define
 such types.  This keyword is followed by an attribute specification
-inside double parentheses.  Seven attributes are currently defined for
+inside double parentheses.  Eight attributes are currently defined for
 types: @code{aligned}, @code{packed}, @code{transparent_union},
-@code{unused}, @c

[PATCH, MPX, 2/X] Pointers Checker [4/25] Constructors

2013-10-21 Thread Ilya Enkovich

Hi,

This patch introduces two new constructor types supported by 
cgraph_build_static_cdtor.

The 'B' type is used to initialize static objects (bounds) created by Pointers 
Checker. It differs from a regular constructor in that a 'B' constructor is 
never instrumented by Pointers Checker.

The 'P' type is used by Pointers Checker to generate constructors that 
initialize bounds of statically initialized pointers. Pointers Checker removes 
all stores from such constructors after instrumentation.

Since 'P' type constructors are created for statically initialized objects, we 
need to avoid creating such objects during their gimplification. A new 
restriction was added to gimplify_init_constructor.
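
The dispatch on the constructor kind can be summarized by the following 
standalone sketch. It is a model only: struct cdtor_info and classify_cdtor are 
hypothetical names, while the kind letters and the attribute strings 
"chkp ctor" and "bnd_legacy" are the ones used in the ipa.c hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical summary of what is recorded for each kind.  */
struct cdtor_info
{
  int is_ctor;        /* Marked as a static constructor.  */
  int is_dtor;        /* Marked as a static destructor.  */
  const char *attr;   /* Attribute attached to the decl, or NULL.  */
};

/* Map the WHICH letter to the properties set on the generated decl.  */
struct cdtor_info
classify_cdtor (char which)
{
  struct cdtor_info info = { 0, 0, NULL };
  switch (which)
    {
    case 'I':
      info.is_ctor = 1;
      break;
    case 'P':
      info.is_ctor = 1;
      info.attr = "chkp ctor";
      break;
    case 'B':
      info.is_ctor = 1;
      info.attr = "bnd_legacy";
      break;
    case 'D':
      info.is_dtor = 1;
      break;
    default:
      assert (!"unknown cdtor kind");
      break;
    }
  return info;
}
```

The "chkp ctor" attribute is what gimplify_init_constructor later looks up on 
current_function_decl to skip the block-copy transformation.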

Bootstrapped and checked on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  

* ipa.c (cgraph_build_static_cdtor_1): Support constructors
with "chkp ctor" and "bnd_legacy" attributes.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 86bda77..7c350fd 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4033,10 +4033,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
   individual element initialization.  Also don't do this for small
   all-zero initializers (which aren't big enough to merit
   clearing), and don't try to make bitwise copies of
-  TREE_ADDRESSABLE types.  */
+  TREE_ADDRESSABLE types.
+
+  We cannot apply such transformation when compiling chkp static
+  initializer because creation of initializer image in the memory
+  will require static initialization of bounds for it.  It should
+  result in another gimplification of similar initializer and we
+  may fall into infinite loop.  */
if (valid_const_initializer
&& !(cleared || num_nonzero_elements == 0)
-   && !TREE_ADDRESSABLE (type))
+   && !TREE_ADDRESSABLE (type)
+   && (!current_function_decl
+   || !lookup_attribute ("chkp ctor",
+ DECL_ATTRIBUTES (current_function_decl
  {
HOST_WIDE_INT size = int_size_in_bytes (type);
unsigned int align;
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 92343fb2..36cc621 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -1259,9 +1259,11 @@ make_pass_ipa_whole_program_visibility (gcc::context 
*ctxt)
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor. 
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.
 
FINAL specify whether the externally visible name for collect2 should
be produced. */
@@ -1320,6 +1322,20 @@ cgraph_build_static_cdtor_1 (char which, tree body, int 
priority, bool final)
   DECL_STATIC_CONSTRUCTOR (decl) = 1;
   decl_init_priority_insert (decl, priority);
   break;
+case 'P':
+  DECL_STATIC_CONSTRUCTOR (decl) = 1;
+  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("chkp ctor"),
+ NULL,
+ NULL_TREE);
+  decl_init_priority_insert (decl, priority);
+  break;
+case 'B':
+  DECL_STATIC_CONSTRUCTOR (decl) = 1;
+  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier ("bnd_legacy"),
+ NULL,
+ NULL_TREE);
+  decl_init_priority_insert (decl, priority);
+  break;
 case 'D':
   DECL_STATIC_DESTRUCTOR (decl) = 1;
   decl_fini_priority_insert (decl, priority);
@@ -1337,9 +1353,11 @@ cgraph_build_static_cdtor_1 (char which, tree body, int 
priority, bool final)
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.  */
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.  */
 
 void
 cgraph_build_static_cdtor (char which, tree body, int priority)

[PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-21 Thread Ilya Enkovich

Hi,

This patch adds bounds-related interface macros and functions for tree and gimple.
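
The index mapping done by gimple_call_get_nobnd_arg_index and 
gimple_call_num_nobnd_args can be tried out in isolation with the sketch below. 
It keeps the loop structure of the patch but works on a plain array; struct arg 
and its is_bound flag are hypothetical stand-ins for a call argument and 
BOUND_P, and abort stands in for gcc_unreachable.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a call argument; IS_BOUND plays the
   role of BOUND_P.  */
struct arg
{
  int is_bound;
  int value;
};

/* Count the arguments that are not bounds.  */
unsigned
num_nobnd_args (const struct arg *args, unsigned num_args)
{
  unsigned res = num_args;
  for (unsigned n = 0; n < num_args; n++)
    if (args[n].is_bound)
      res--;
  return res;
}

/* Return the position in ARGS of the INDEX-th non-bound argument.  */
unsigned
nobnd_arg_index (const struct arg *args, unsigned num_args, unsigned index)
{
  assert (args != NULL || num_args == 0);
  for (unsigned n = 0; n < num_args; n++)
    {
      if (args[n].is_bound)
        continue;
      else if (index)
        index--;
      else
        return n;
    }

  /* INDEX was out of range.  */
  abort ();
}
```

For an argument list of the form (p, bounds, q, r, bounds), the non-bound 
indexes 0, 1 and 2 map to positions 0, 2 and 3.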

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  

* tree-core.h (tree_index): Add TI_BOUND_TYPE.
* tree.h (BOUND_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(bound_type_node): New.
* tree.c (build_common_tree_nodes): Initialize bound_type_node.
* gimple.h (gimple_call_get_nobnd_arg_index): New.
(gimple_call_num_nobnd_args): New.
(gimple_call_nobnd_arg): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New.
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_call_get_nobnd_arg_index): New.
* gimple-pretty-print.c (dump_gimple_return): Print second op.


diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 1e985e0..fddcee0 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -535,11 +535,12 @@ dump_gimple_assign (pretty_printer *buffer, gimple gs, 
int spc, int flags)
 static void
 dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
-  tree t;
+  tree t, t2;
 
   t = gimple_return_retval (gs);
+  t2 = gimple_return_retbnd (gs);
   if (flags & TDF_RAW)
-dump_gimple_fmt (buffer, spc, flags, "%G <%T>", gs, t);
+dump_gimple_fmt (buffer, spc, flags, "%G <%T %T>", gs, t, t2);
   else
 {
   pp_string (buffer, "return");
@@ -548,6 +549,11 @@ dump_gimple_return (pretty_printer *buffer, gimple gs, int 
spc, int flags)
  pp_space (buffer);
  dump_generic_node (buffer, t, spc, flags, false);
}
+  if (t2)
+   {
+ pp_string (buffer, ", ");
+ dump_generic_node (buffer, t2, spc, flags, false);
+   }
   pp_semicolon (buffer);
 }
 }
diff --git a/gcc/gimple.c b/gcc/gimple.c
index e12f7d9..eb2d365 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -179,7 +179,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode,
 gimple
 gimple_build_return (tree retval)
 {
-  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1);
+  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2);
   if (retval)
 gimple_return_set_retval (s, retval);
   return s;
@@ -371,6 +371,26 @@ gimple_build_call_from_tree (tree t)
 }
 
 
+/* Return the index of the INDEX-th non-bound argument of the call GS.  */
+
+unsigned
+gimple_call_get_nobnd_arg_index (const_gimple gs, unsigned index)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  for (unsigned n = 0; n < num_args; n++)
+{
+  if (BOUND_P (gimple_call_arg (gs, n)))
+   continue;
+  else if (index)
+   index--;
+  else
+   return n;
+}
+
+  gcc_unreachable ();
+}
+
+
 /* Extract the operands and code for expression EXPR into *SUBCODE_P,
*OP1_P, *OP2_P and *OP3_P respectively.  */
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 376fda2..4376408 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -910,6 +910,7 @@ extern tree get_initialized_tmp_var (tree, gimple_seq *, 
gimple_seq *);
 extern tree get_formal_tmp_var (tree, gimple_seq *);
 extern void declare_vars (tree, gimple, bool);
 extern void annotate_all_with_location (gimple_seq, location_t);
+extern unsigned gimple_call_get_nobnd_arg_index (const_gimple, unsigned);
 
 /* Validation of GIMPLE expressions.  Note that these predicates only check
the basic form of the expression, they don't recurse to make sure that
@@ -2379,6 +2380,31 @@ gimple_call_arg (const_gimple gs, unsigned index)
 }
 
 
+/* Return the number of non-bound arguments used by call statement GS.  */
+
+static inline unsigned
+gimple_call_num_nobnd_args (const_gimple gs)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  unsigned res = num_args;
+  for (unsigned n = 0; n < num_args; n++)
+if (BOUND_P (gimple_call_arg (gs, n)))
+  res--;
+  return res;
+}
+
+
+/* Return INDEX's call argument ignoring bound ones.  */
+static inline tree
+gimple_call_nobnd_arg (const_gimple gs, unsigned index)
+{
+  /* No bound args may exist if pointers checker is off.  */
+  if (!flag_check_pointers)
+return gimple_call_arg (gs, index);
+  return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
+}
+
+
 /* Return a pointer to the argument at position INDEX for call
statement GS.  */
 
@@ -4886,6 +4912,26 @@ gimple_return_set_retval (gimple gs, tree retval)
 }
 
 
+/* Return the return bounds for GIMPLE_RETURN GS.  */
+
+static inline tree
+gimple_return_retbnd (const_gimple gs)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  return gimple_op (gs, 1);
+}
+
+
+/* Set RETVAL to be the return bounds for GIMPLE_RETURN GS.  */
+
+static inline void
+gimple_return_set_retbnd (gimple gs, tree retval)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  gimple_set_op (gs, 1, retval);
+}
+
+
 /* Returns t

Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-21 Thread Ilya Enkovich

On 21 Oct 11:44, Joseph S. Myers wrote:
> On Mon, 21 Oct 2013, Ilya Enkovich wrote:
> 
> > +DEFHOOK
> > +(builtin_chkp_function,
> > + "Pointers checker instrumentation pass uses this hook to obtain\n\
> > +target-specific functions which implement specified generic checker\n\
> > +builtins.",
> > + tree, (unsigned fcode),
> > + default_builtin_chkp_function)
> 
> I don't think that's enough detail.  The audience for this hook 
> description is back-end maintainers wanting to implement such hooks for 
> their back ends, and the hook description should give sufficient 
> information to do so.  This description says nothing at all about the 
> semantics of the hook argument or return value.
> 
> If it seems difficult to describe things sufficiently in the context of 
> individual hook descriptions, maybe an overview of the feature and 
> implementation approach is needed as a new section in the internals 
> manual, with hook descriptions then referring to that section, or going in 
> appropriate places within that section (if the section is in tm.texi.in).
> 
> > +DEFHOOK
> > +(fn_abi_va_list_bounds_size,
> > + "This hook returns size for va_list object or integer_zero_node if\n\
> > +it does not have any (e.g. is scalar pointer to the stack).",
> > + tree, (tree fndecl),
> > + default_fn_abi_va_list_bounds_size)
> 
> @code{va_list}, @code{integer_zero_node}, specify semantics of fndecl 
> argument.
> 
> >  DEFHOOK
> > +(load_bounds_for_arg,
> > + "This hook is used to emit insn to load arg's bounds\n\
> > +in case bounds are not passed on register.  Return loaded bounds",
> > + rtx, (rtx, rtx, rtx),
> > + default_load_bounds_for_arg)
> 
> You need to name all the arguments and explain their semantics by name in 
> the documentation (which should end with ".").
> 
> > +DEFHOOK
> > +(store_bounds_for_arg,
> > + "This hook is used to emit insn to store arg's bounds\n\
> > +in case bounds are not passed on register.",
> > + void, (rtx, rtx, rtx, rtx),
> > + default_store_bounds_for_arg)
> 
> Likewise.
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com

Hello Joseph,

Thanks for your comments! I attach a new patch version with changed hooks 
documentation. Hope it is more informative now.

Thanks,
Ilya
--

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..79bd0f9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,13 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for @code{va_list} object in function specified
+by @var{fndecl}.  This hook is used by Pointers Checker to build bounds for
+@code{va_list} object.  Return @code{integer_zero_node} if no bounds should
+be used (e.g. @code{va_list} is a scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5158,19 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx 
@var{arg}, rtx @var{slot_no})
+This hook is used to emit insn to load bounds of @var{arg} passed
+in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
+constant holding number of the special slot we should get bounds from.
+Return loaded bounds.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx @var{arg}, rtx 
@var{slot}, rtx @var{bounds}, rtx @var{slot_no})
+This hook is used to emit insn to store bounds of @var{arg} passed
+in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
+constant holding number of the special slot we should store bounds to.
+@end deftypefn
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -10907,6 +10927,27 @@ ignored.  This function should return the result of 
the call to the
 built-in function.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned 
@var{fcode})
+This hook allows target to redefine built-in functions used by
+Pointers Checker for code instrumentation.  Hook should return
+fndecl of function implementing generic builtin whose code is
+passed in @var{fcode}.  Currently following built-in functions are
+obtained using this hook:
+@code{BUILT_IN_CHKP_BNDMK}, @co

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-21 Thread Ilya Enkovich

On 21 Oct 12:08, Joseph S. Myers wrote:
> On Mon, 21 Oct 2013, Ilya Enkovich wrote:
> 
> > +  if (flag_check_pointers)
> > +{
> > +  if (flag_lto)
> > +   sorry ("Pointers checker is not yet fully supported for link-time 
> > optimization");
> 
> That sounds wrong.  It suggests some bug somewhere in your patch series 
> failing to allow for LTO, which should be fixed.  At least give a more 
> detailed explanation in a comment of what would need to change to remove 
> this call to sorry ().

It has been a while since I saw the problem with LTO (some functions appeared to 
miss instrumentation; it seemed the compilation flags for the checker were not 
handled correctly for some reason).  When I made this branch public I put this 
guard in to prevent users from running into this problem.  Actually, I do not 
know the current status of that issue.  A lot of things have changed, including 
the checker flag definitions.  I'll try the mpx testsuite with LTO and come back 
with the result.

> 
> > +  if (targetm.chkp_bound_mode () == VOIDmode)
> > +   error ("-fcheck-pointers is not supported for this target.");
> 
> Also note GNU Coding Standards for diagnostics.  They should not start 
> with uppercase letters or end with ".".

Thanks for the input! I'll fix it.

Thanks,
Ilya

> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-22 Thread Ilya Enkovich

2013/10/21 Jeff Law :
> On 10/15/13 07:31, Ilya Enkovich wrote:
>>
>> Hey guys,
>>
>> could someone please look at this small patch? It blocks the approved MPX
>> ISA support on the i386 target.
>
>
>>>>
>>>> diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
>>>> index 1d62223..02b1214 100644
>>>> --- a/gcc/doc/rtl.texi
>>>> +++ b/gcc/doc/rtl.texi
>>>> @@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the
>>>> @file{@var{machine}-modes.def}.
>>>>   @xref{Jump Patterns},
>>>>   also see @ref{Condition Code}.
>>>>
>>>> +@findex MODE_BOUND
>>>> +@item MODE_BOUND
>>>> +Bound modes class.  Used to represent values of pointer bounds.
>>>> +
>>>>   @findex MODE_RANDOM
>>>>   @item MODE_RANDOM
>>>>   This is a catchall mode class for modes which don't fit into the above
>
> So why are bounds distinct modes?  Is there some inherent reason why
> bounds are something other than an integer mode (MODE_INT)?
>
> Similarly what's the rationale behind having new types for bounds?  Is there
> some reason why they couldn't be implemented with one of the existing types?
>
> ISTM the entire patch is gated on being able to answer those two questions.
>

Hello Jeff,

Before introducing a new type and mode we tried to implement everything
using existing ones. We tried integers, pointers, complex with a pointer
type as base, and also a structure of two pointers. The problem is that
the semantics of bounds differ from everything we have for base
types. None of the operators (exprs) we have for existing types are
applicable to bounds. We could probably use some existing type/mode, but
it would still require an additional flag to mark bounds. And almost
everywhere the chosen base type is handled, we would have to
check whether we are working with bounds. I do not think many GCC
developers (at least in the near future) will care about
instrumented code while writing their patches. It means that many
developers may break instrumented code by adding any sort of
manipulation of values of the type/mode we choose as the base for bounds.
I'm sure having a proper type is much more convenient and natural.

In addition to everything said about the bound type, a bound mode may also
have a different binary format. On the i386 target bounds have a special
binary format, which is not the same as a pair of pointers. In many places
(ABI, insn templates etc.) we need to know whether we are working with
bounds. E.g. passing 'long long' and bounds in register(s) is different even
if the size is the same.

In short: why use the same base type/mode for totally different things?
I do not know whether it is possible to implement everything using
existing types and modes. It probably is, but to me it does not seem
the right way to go.
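
To illustrate the argument, here is a minimal sketch, not GCC code: every
name below (toy_mode, toy_ti, toy_bnd64, toy_arith_ok_p) is a hypothetical
stand-in. It only shows that two modes of the same size can be told apart
by class, so code that works on integers does not need an extra flag to
exclude bounds.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for illustration; not GCC's real definitions.  */
enum toy_mode_class { TOY_MODE_INT, TOY_MODE_POINTER_BOUNDS };

struct toy_mode
{
  enum toy_mode_class mclass;
  unsigned size_bytes;
};

/* Two modes of equal size but different class.  */
const struct toy_mode toy_ti    = { TOY_MODE_INT, 16 };
const struct toy_mode toy_bnd64 = { TOY_MODE_POINTER_BOUNDS, 16 };

/* Return true if M holds pointer bounds.  */
static bool
toy_pointer_bounds_mode_p (struct toy_mode m)
{
  return m.mclass == TOY_MODE_POINTER_BOUNDS;
}

/* Return true if integer arithmetic is meaningful for values of mode M.
   Bounds are rejected by class, regardless of their size.  */
static bool
toy_arith_ok_p (struct toy_mode m)
{
  assert (m.size_bytes > 0);
  return !toy_pointer_bounds_mode_p (m);
}
```

With a shared integer mode the same check would need a side flag carried
next to every value, which is the maintenance burden described above.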

Thanks,
Ilya

>
> jeff
>

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-23 Thread Ilya Enkovich

On 22 Oct 22:55, Jeff Law wrote:
> On 09/17/13 02:18, Ilya Enkovich wrote:
> >Hi,
> >
> >Here is a patch introducing new type and mode for bounds. It is a part of 
> >MPX ISA support patch 
> >(http://gcc.gnu.org/ml/gcc-patches/2013-07/msg01094.html).
> >
> >Bootstrapped and tested on linux-x86_64. Is it OK for trunk?
> >
> >Thanks,
> >Ilya
> >--
> >
> >gcc/
> >
> >2013-09-16  Ilya Enkovich  
> >
> > * mode-classes.def (MODE_BOUND): New.
> > * tree.def (BOUND_TYPE): New.
> > * genmodes.c (complete_mode): Support MODE_BOUND.
> > (BOUND_MODE): New.
> > (make_bound_mode): New.
> > * machmode.h (BOUND_MODE_P): New.
> > * stor-layout.c (int_mode_for_mode): Support MODE_BOUND.
> > (layout_type): Support BOUND_TYPE.
> > * tree-pretty-print.c (dump_generic_node): Support BOUND_TYPE.
> > * tree.c (build_int_cst_wide): Support BOUND_TYPE.
> > (type_contains_placeholder_1): Likewise.
> > * tree.h (BOUND_TYPE_P): New.
> > * varasm.c (output_constant): Support BOUND_TYPE.
> > * doc/rtl.texi (MODE_BOUND): New.
> Mostly OK.  Just a few minor things that should be fixed or at least
> clarified.
> 
> 
> 
> 
> >
> >diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
> >index 1d62223..02b1214 100644
> >--- a/gcc/doc/rtl.texi
> >+++ b/gcc/doc/rtl.texi
> >@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the 
> >@file{@var{machine}-modes.def}.
> >  @xref{Jump Patterns},
> >  also see @ref{Condition Code}.
> >
> >+@findex MODE_BOUND
> >+@item MODE_BOUND
> >+Bound modes class.  Used to represent values of pointer bounds.
> I can't help but feel more is needed here -- without going into the
> details of the MPX implementation we ought to say something about
> how these differ from the more normal integer modes.  Drawing from
> the brief discussion between Richard & myself earlier today should
> give some ideas on how to improve this.
> 
> 
> 
> I'd probably use MODE_POINTER_BOUNDS which is a bit more
> descriptive. We wouldn't want someone to (for example) think this
> stuff relates to array bounds.  Obviously a change to
> MODE_POINTER_BOUNDS would propagate into other places where you use
> "BOUND" without a "POINTER" qualification, such as "BOUND_MODE_P"
> which we'd change to POINTER_BOUNDS_MODE_P.
> 
> >diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
> >index 7207ef7..c5ea215 100644
> >--- a/gcc/mode-classes.def
> >+++ b/gcc/mode-classes.def
> >@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
> >DEF_MODE_CLASS (MODE_RANDOM),/* other */ 
> >\
> >DEF_MODE_CLASS (MODE_CC),/* condition code in a register 
> > */ \
> >DEF_MODE_CLASS (MODE_INT),   /* integer */   
> >\
> >+  DEF_MODE_CLASS (MODE_BOUND),/* bounds */ \
> >DEF_MODE_CLASS (MODE_PARTIAL_INT),   /* integer with padding bits */ 
> >\
> >DEF_MODE_CLASS (MODE_FRACT), /* signed fractional number */  
> >\
> >DEF_MODE_CLASS (MODE_UFRACT),/* unsigned fractional number 
> > */   \
> Does genmodes do the right thing WRT MAX_INT_MODE and MIN_INT_MODE?
> 
> I'd be more comfortable if MODE_POINTER_BOUNDS wasn't sitting
> between MODE_INT and MODE_PARTIAL_INT.  I'm not aware of code that
> iterates over these things that would get confused, but ISTM putting
> MODE_POINTER_BOUNDS after MODE_PARTIAL_INT is marginally safer.
> 
> 
> 
> >diff --git a/gcc/tree.c b/gcc/tree.c
> >index b469b97..bbbe16e 100644
> >--- a/gcc/tree.c
> >+++ b/gcc/tree.c
> >@@ -1197,6 +1197,7 @@ build_int_cst_wide (tree type, unsigned HOST_WIDE_INT 
> >low, HOST_WIDE_INT hi)
> >
> >  case INTEGER_TYPE:
> >  case OFFSET_TYPE:
> >+case BOUND_TYPE:
> >if (TYPE_UNSIGNED (type))
> > {
> >   /* Cache 0..N */
> So here you're effectively treating POINTER_BOUNDS_TYPE like an
> integer.  I'm guessing there's a number of flags that may not be
> relevant for your type and which you might want to repurpose (again,
> I haven't looked at the entire patchset).  If so, you want to be
> real careful here since you'll be looking at (for example)
> TYPE_UNSIGNED which may not have any real meaning for
> POINTER_BOUNDS_TYPE.
> 
> Overall, it seems fairly reasonable -- the biggest conc

Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-24 Thread Ilya Enkovich

2013/10/24 Jeff Law :
> On 10/21/13 08:20, Ilya Enkovich wrote:
>
>> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
>> index 8d220f3..79bd0f9 100644
>> --- a/gcc/doc/tm.texi
>> +++ b/gcc/doc/tm.texi
>> +@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot},
>> rtx @var{arg}, rtx @var{slot_no})
>> +This hook is used to emit insn to load bounds of @var{arg} passed
>> +in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
>> +constant holding number of the special slot we should get bounds from.
>> +Return loaded bounds.
>> +@end deftypefn
>> +
>> +@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx @var{arg},
>> rtx @var{slot}, rtx @var{bounds}, rtx @var{slot_no})
>> +This hook is used to emit insn to store bounds of @var{arg} passed
>> +in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
>> +constant holding number of the special slot we should store bounds to.
>> +@end deftypefn
>> +
>
> Almost there. What I think is missing is more information about the case
> where SLOT is not a memory (presumably it's a reg) and how that relates to
> SLOT_NO.
>
> Isn't this just providing a mapping from the input argument registers to
> some set of bounds pointer registers?  Presumably you aren't really exposing
> the bound registers as argument registers hence this hack?
>
> Not asking you to change anything, just trying to understand the rationale.
>
>

These two hooks are used by the expand pass to pass/receive bounds for
args. When bounds are passed in a register, expand does not need these
hooks and uses a regular move insn. If we are out of bound registers or
the platform does not have them at all, the hook is called. If the bounded
arg is passed in memory, the hook is supposed to use the regular way of
storing the associated bounds. That is why no slot_no value is required
in that case; only arg and its place (slot) are used. If the bounded arg
is passed in a register (e.g. this happens on i386 with MPX when more than
4 pointers are passed in registers), then some special slot has to be used
for the bounds. slot_no here holds the identifier of this special slot.
E.g. if we call a function with 6 pointers on i386, we call this hook
passing R8 and R9 as slot with const1_rtx and const2_rtx as slot_no.
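
The numbering described above can be sketched as a toy model. This is only
an illustration under stated assumptions: four bound registers, as in the
i386 example, and a hypothetical helper toy_bounds_slot_no that is not part
of the patch.

```c
#include <assert.h>

/* Assumption for illustration: four bound registers are available,
   as in the i386/MPX example.  */
#define TOY_NUM_BOUND_REGS 4

/* For the PTR_ARG_INDEX-th (0-based) pointer passed in a register,
   return 0 if its bounds travel in a bound register (no hook needed),
   otherwise the 1-based number of the special slot the hook receives
   as slot_no.  */
static int
toy_bounds_slot_no (int ptr_arg_index)
{
  assert (ptr_arg_index >= 0);
  if (ptr_arg_index < TOY_NUM_BOUND_REGS)
    return 0;
  return ptr_arg_index - TOY_NUM_BOUND_REGS + 1;
}
```

For a call with 6 pointers, the fifth and sixth pointers (in R8 and R9)
map to slots 1 and 2, matching const1_rtx and const2_rtx.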

>
>
>
>>   @node Trampolines
>>   @section Trampolines for Nested Functions
>>   @cindex trampolines for nested functions
>> @@ -10907,6 +10927,27 @@ ignored.  This function should return the result
>> of the call to the
>>   built-in function.
>>   @end deftypefn
>>
>> +@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned
>> @var{fcode})
>> +This hook allows target to redefine built-in functions used by
>> +Pointers Checker for code instrumentation.  Hook should return
>> +fndecl of function implementing generic builtin whose code is
>> +passed in @var{fcode}.  Currently following built-in functions are
>> +obtained using this hook:
>> +@code{BUILT_IN_CHKP_BNDMK}, @code{BUILT_IN_CHKP_BNDSTX},
>> +@code{BUILT_IN_CHKP_BNDLDX}, @code{BUILT_IN_CHKP_BNDCL},
>> +@code{BUILT_IN_CHKP_BNDCU}, @code{BUILT_IN_CHKP_BNDRET},
>> +@code{BUILT_IN_CHKP_INTERSECT}, @code{BUILT_IN_CHKP_SET_PTR_BOUNDS},
>> +@code{BUILT_IN_CHKP_NARROW}, @code{BUILT_IN_CHKP_ARG_BND},
>> +@code{BUILT_IN_CHKP_SIZEOF}, @code{BUILT_IN_CHKP_EXTRACT_LOWER},
>> +@code{BUILT_IN_CHKP_EXTRACT_UPPER}.
>> +@end deftypefn
>> +@deftypefn {Target Hook} tree TARGET_CHKP_BOUND_TYPE (void)
>> +Return type to be used for bounds
>> +@end deftypefn
>> +@deftypefn {Target Hook} {enum machine_mode} TARGET_CHKP_BOUND_MODE
>> (void)
>> +Return mode to be used for bounds.
>> +@end deftypefn
>
> So how am I (as a GCC developer) supposed to know what BNDMK, BNDLDX, BNDCU,
> etc mean?  The names aren't particularly descriptive.  Are these documented
> elsewhere in a follow-up patch?  If not, it seems to me we need to document
> them here.

Actually the next patch introduces them and is a good place for
documentation. But currently this patch has documentation for
user-visible built-ins only. For now the built-ins used for
instrumentation are described only on the Wiki
(http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler#Builtins_used_for_instrumentation).
Where should I move it? Does it also go to extend.texi?

Thanks,
Ilya

>
> Jeff
>

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-24 Thread Ilya Enkovich

2013/10/24 Richard Henderson :
> On 10/23/2013 02:41 PM, Jeff Law wrote:
>> Out of curiosity, did you consider and/or discuss with Richard whether or not
>> to make these target-dependent or target-independent builtins?  I realize 
>> it's
>> a bit problematic with Richard being involved during the NDA portion and
>> someone else during the review/integration portion, but that's unfortunately
>> where we are.
>
> I suggested that they be target independent.
>
> I suggested that there was nothing in MPX that couldn't be
> done generically, if slower, on non-MPX hardware.
>
> E.g. there's no reason why bounds couldn't be packed into a
> double-word integer, and the checking builtins completely
> outlined into a runtime library.
>
> I suggested that the optimization done on the bound type
> would help a generic mudflap replacement.

Right. The design implies generic support of Pointers Checker without
MPX support in hardware. This series does not include generic support.
We are currently examining the priority of this task. Generic
support of Pointers Checker might replace Mudflap. I do not know yet
whether Pointers Checker can borrow anything from Mudflap.

Ilya

>
>
> r~

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-24 Thread Ilya Enkovich

2013/10/24 Jeff Law :
> On 10/23/13 04:57, Ilya Enkovich wrote:
>>
>>
>> 2013-10-23  Ilya Enkovich  
>>
>> * mode-classes.def (MODE_POINTER_BOUNDS): New.
>> * tree.def (POINTER_BOUNDS_TYPE): New.
>> * genmodes.c (complete_mode): Support MODE_POINTER_BOUNDS.
>> (POINTER_BOUNDS_MODE): New.
>> (make_pointer_bounds_mode): New.
>> * machmode.h (POINTER_BOUNDS_MODE_P): New.
>> * stor-layout.c (int_mode_for_mode): Support MODE_POINTER_BOUNDS.
>> (layout_type): Support POINTER_BOUNDS_TYPE.
>> * tree-pretty-print.c (dump_generic_node): Support
>> POINTER_BOUNDS_TYPE.
>> * tree.c (build_int_cst_wide): Support POINTER_BOUNDS_TYPE.
>> (type_contains_placeholder_1): Likewise.
>> * tree.h (POINTER_BOUNDS_TYPE_P): New.
>> * varasm.c (output_constant): Support POINTER_BOUNDS_TYPE.
>> * doc/rtl.texi (MODE_POINTER_BOUNDS): New.
>
> OK for the trunk.  IIRC, there was a backend patch with conditional approval
> that should be good to go now (conditional upon the acceptance of the
> types/modes patch).
>
> Note that since I asked for a couple things to be renamed that backend patch
> might need tweaking.  If so, make the obvious changes, post the patch (so
> it's recorded into the archives) and go ahead and check it into the trunk.
>
> jeff
>

Thanks!

Ilya

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-10-24 Thread Ilya Enkovich

On 01 Oct 20:00, Uros Bizjak wrote:
> 
> This is OK for mainline, on the condition that target independent part
> is approved and committed first.
> 
> Thanks,
> Uros.

Thanks for review!

Attached is a version to be committed.  The only difference from the previous 
one is BOUND_MODE renamed to POINTER_BOUNDS_MODE due to changes in target 
independent part.  ChangeLog was not modified.

Ilya
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 92e0c05..ddfd402 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -18,7 +18,7 @@
 ;; .
 
 ;;; Unused letters:
-;;; B H   T
+;;;   H
 ;;;   h j
 
 ;; Integer register constraints.
@@ -91,6 +91,9 @@
 (define_register_constraint "x" "TARGET_SSE ? SSE_REGS : NO_REGS"
  "Any SSE register.")
 
+(define_register_constraint "B" "TARGET_MPX ? BND_REGS : NO_REGS"
+ "@internal Any bound register.")
+
 ;; We use the Y prefix to denote any number of conditional register sets:
 ;;  z  First SSE register.
 ;;  i  SSE2 inter-unit moves to SSE register enabled
@@ -232,3 +235,15 @@
to fit that range (for immediate operands in zero-extending x86-64
instructions)."
   (match_operand 0 "x86_64_zext_immediate_operand"))
+
+;; T prefix is used for different address constraints
+;;   i - address with no index and no rip
+;;   b - address with no base and no rip
+
+(define_address_constraint "Ti"
+  "MPX address operand without index"
+  (match_operand 0 "address_mpx_no_index_operand"))
+
+(define_address_constraint "Tb"
+  "MPX address operand without base"
+  (match_operand 0 "address_mpx_no_base_operand"))
diff --git a/gcc/config/i386/i386-c.c b/gcc/config/i386/i386-c.c
index 2e764e7..da16240 100644
--- a/gcc/config/i386/i386-c.c
+++ b/gcc/config/i386/i386-c.c
@@ -358,6 +358,8 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_flag,
 def_or_undef (parse_in, "__SSE_MATH__");
   if ((fpmath & FPMATH_SSE) && (isa_flag & OPTION_MASK_ISA_SSE2))
 def_or_undef (parse_in, "__SSE2_MATH__");
+  if (isa_flag & OPTION_MASK_ISA_MPX)
+def_or_undef (parse_in, "__MPX__");
 }
 
 
diff --git a/gcc/config/i386/i386-modes.def b/gcc/config/i386/i386-modes.def
index e0b8fc8..a73730e 100644
--- a/gcc/config/i386/i386-modes.def
+++ b/gcc/config/i386/i386-modes.def
@@ -87,6 +87,9 @@ VECTOR_MODE (INT, DI, 1); /*   V1DI */
 VECTOR_MODE (INT, SI, 1); /*   V1SI */
 VECTOR_MODE (INT, QI, 2); /*   V2QI */
 
+POINTER_BOUNDS_MODE (BND32, 8);
+POINTER_BOUNDS_MODE (BND64, 16);
+
 INT_MODE (OI, 32);
 INT_MODE (XI, 64);
 
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3ab2f3a..fdd98fc 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -239,6 +239,8 @@ extern void ix86_expand_mul_widen_hilo (rtx, rtx, rtx, 
bool, bool);
 extern void ix86_expand_sse2_mulv4si3 (rtx, rtx, rtx);
 extern void ix86_expand_sse2_mulvxdi3 (rtx, rtx, rtx);
 
+extern bool ix86_bnd_prefixed_insn_p (rtx);
+
 /* In i386-c.c  */
 extern void ix86_target_macros (void);
 extern void ix86_register_pragmas (void);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 21fc531..f973f84 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -1956,6 +1956,8 @@ enum reg_class const regclass_map[FIRST_PSEUDO_REGISTER] =
   /* Mask registers.  */
   MASK_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS,
   MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS, MASK_EVEX_REGS,
+  /* MPX bound registers */
+  BND_REGS, BND_REGS, BND_REGS, BND_REGS,
 };
 
 /* The "default" register map used in 32bit mode.  */
@@ -1972,6 +1974,7 @@ int const dbx_register_map[FIRST_PSEUDO_REGISTER] =
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 16-23*/
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 24-31*/
   93, 94, 95, 96, 97, 98, 99, 100,  /* Mask registers */
+  101, 102, 103, 104,  /* bound registers */
 };
 
 /* The "default" register map used in 64bit mode.  */
@@ -1988,6 +1991,7 @@ int const dbx64_register_map[FIRST_PSEUDO_REGISTER] =
   67, 68, 69, 70, 71, 72, 73, 74,   /* AVX-512 registers 16-23 */
   75, 76, 77, 78, 79, 80, 81, 82,   /* AVX-512 registers 24-31 */
   118, 119, 120, 121, 122, 123, 124, 125, /* Mask registers */
+  126, 127, 128, 129,  /* bound registers */
 };
 
 /* Define the register numbers to be used in Dwarf debugging information.
@@ -2056,6 +2060,7 @@ int const svr4_dbx_register_map[FIRST_PSEUDO_REGISTER] =
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 16-23*/
   -1, -1, -1, -1, -1, -1, -1, -1,   /* AVX-512 registers 24-31*/
   93, 94, 95, 96, 97, 98, 99, 100,  /* Mask registers */
+  -1, -1, -1, -1,   /* bound registers */
 };
 
 /* Define parameter passing and return registers.  */
@@ -2478,6 +2483,7 @@ ix86_target_string (HOST_WIDE_INT isa, int fl

Re: [PATCH, i386, MPX 1/X] Support of Intel MPX ISA. 2/2 New registers and instructions

2013-10-24 Thread Ilya Enkovich

2013/10/24 Uros Bizjak :
> On Thu, Oct 24, 2013 at 12:06 PM, Ilya Enkovich  
> wrote:
>> On 01 Oct 20:00, Uros Bizjak wrote:
>>>
>>> This is OK for mainline, on the condition that target independent part
>>> is approved and committed first.
>>>
>>> Thanks,
>>> Uros.
>>
>> Thanks for review!
>>
>> Attached is a version to be committed.  The only difference from the 
>> previous one is BOUND_MODE renamed to POINTER_BOUNDS_MODE due to changes in 
>> target independent part.  ChangeLog was not modified.
>
> I think you missed a couple of length -> length_nobnd updates:
>
> @@ -11635,7 +11680,12 @@
>[(simple_return)
> (unspec [(const_int 0)] UNSPEC_REP)]
>"reload_completed"
> -  "rep%; ret"
> +{
> +  if (ix86_bnd_prefixed_insn_p (insn))
> +return "%!ret";
> +
> +  return "rep%; ret";
> +}
>[(set_attr "length" "2")
> (set_attr "atom_unit" "jeu")
> (set_attr "length_immediate" "0")
>

There is no reason for length_nobnd because the instruction length is
always 2. The only difference is the prefix used for MPX and non-MPX code.
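
As a small illustration of why the length does not change, the two forms
can be written out as bytes. The array names are hypothetical; the byte
values are the usual x86 encodings (F3 for the rep prefix, F2 for the bnd
prefix, C3 for near ret).

```c
#include <assert.h>

/* "rep ret": rep prefix followed by near return.  */
const unsigned char toy_rep_ret[] = { 0xF3, 0xC3 };

/* "bnd ret": bnd prefix followed by near return.  */
const unsigned char toy_bnd_ret[] = { 0xF2, 0xC3 };

/* Length in bytes of the two-byte sequences above; both forms are one
   prefix byte plus the one-byte ret opcode.  */
enum { TOY_RET_LENGTH = sizeof toy_rep_ret };
```

Both sequences are two bytes long, so a fixed "length" of 2 covers either
prefix.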

> and possibly here:
>
> @@ -11186,7 +11231,7 @@
>  (define_insn "*indirect_jump"
>[(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))]
>""
> -  "jmp\t%A0"
> +  "%!jmp\t%A0"
>[(set_attr "type" "ibr")
> (set_attr "length_immediate" "0")])
>
> @@ -11235,7 +11280,7 @@
>[(set (pc) (match_operand:W 0 "indirect_branch_operand" "rw"))
> (use (label_ref (match_operand 1)))]
>""
> -  "jmp\t%A0"
> +  "%!jmp\t%A0"
>[(set_attr "type" "ibr")
> (set_attr "length_immediate" "0")])

For these cases the 'prefix_rep' attribute does the work because of the
'ibr' type. The generic length should work fine.

Thanks,
Ilya

>
> Uros.

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-24 Thread Ilya Enkovich

On 21 Oct 12:08, Joseph S. Myers wrote:
> On Mon, 21 Oct 2013, Ilya Enkovich wrote:
> 
> > +  if (flag_check_pointers)
> > +{
> > +  if (flag_lto)
> > +   sorry ("Pointers checker is not yet fully supported for link-time 
> > optimization");
> 
> That sounds wrong.  It suggests some bug somewhere in your patch series 
> failing to allow for LTO, which should be fixed.  At least give a more 
> detailed explanation in a comment of what would need to change to remove 
> this call to sorry ().

I ran some Pointers Checker tests with LTO.  Currently instrumented code
causes an ICE in the LTO streamer.  Since it happens at an early stage, I do
not know whether other significant problems exist.  I added a comment for now
and will investigate further.  I also fixed the error messages.
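
The diagnostic convention raised in the review (messages should not start
with an uppercase letter or end with ".") can be expressed as a tiny
check. This is only a sketch; toy_diag_style_ok is a hypothetical helper,
not something GCC provides.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Return true if MSG follows the two rules quoted in the review:
   it does not start with an uppercase letter and does not end
   with a period.  Empty messages are rejected.  */
static bool
toy_diag_style_ok (const char *msg)
{
  size_t len = strlen (msg);
  if (len == 0)
    return false;
  if (isupper ((unsigned char) msg[0]))
    return false;
  if (msg[len - 1] == '.')
    return false;
  return true;
}
```

Both original messages from the patch fail this check, one for the leading
capital and one for the trailing period.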

Thanks,
Ilya

> 
> > +  if (targetm.chkp_bound_mode () == VOIDmode)
> > +   error ("-fcheck-pointers is not supported for this target.");
> 
> Also note GNU Coding Standards for diagnostics.  They should not start 
> with uppercase letters or end with ".".
> 
> -- 
> Joseph S. Myers
> joseph@codesourcery.

--

gcc/

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 2634ecc..c6c5e5c 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -224,12 +224,15 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT64_DFLOAT64, 
BT_DFLOAT64, BT_DFLOAT64)
 DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFLOAT128, BT_DFLOAT128, BT_DFLOAT128)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_VOID_CONST_PTR, BT_VOID, BT_CONST_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, BT_UINT16)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -341,6 +344,10 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, BT_SIZE,
 BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRPTR_CONST_PTR, BT_VOID, BT_PTR_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_CONST_PTR_SIZE, BT_VOID, BT_CONST_PTR, BT_SIZE)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
 
@@ -425,6 +432,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_I2, BT
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, 
BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR, BT_SIZE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3b16d59..b8dec3f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5861,7 +5861,18 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
enum machine_mode mode,
   && fcode != BUILT_IN_EXECVE
   && fcode != BUILT_IN_ALLOCA
   && fcode != BUILT_IN_ALLOCA_WITH_ALIGN
-  && fcode != BUILT_IN_FREE)
+  && fcode != BUILT_IN_FREE
+  && fcode != BUILT_IN_CHKP_SET_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_INIT_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_NULL_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_COPY_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_NARROW_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_STORE_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_CHECK_PTR_LBOUNDS
+  && fcode != BUILT_IN_CHKP_CHECK_PTR_UBOUNDS
+  && fcode != BUILT_IN_CHKP_CHECK_PTR_BOUNDS
+  && fcode != BUILT_IN_CHKP_GET_PTR_LBOUND
+  && fcode != BUILT_IN_CHKP_GET_PTR_UBOUND)
 return expand_call (exp, target, ignore);
 
   /* The built-in function expanders test for target == const0_rtx
@@ -6905,6 +6916,50 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
enum machine_mode mode,
   expand_builtin_set_thread_pointer (exp);
   return

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-24 Thread Ilya Enkovich

Here is an updated version with changes due to renames in basic bounds type 
patch.

Thanks,
Ilya
--

gcc/

2013-10-23  Ilya Enkovich  

* tree-core.h (tree_index): Add TI_POINTER_BOUNDS_TYPE.
* tree.h (POINTER_BOUNDS_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(pointer_bounds_type_node): New.
* tree.c (build_common_tree_nodes): Initialize
pointer_bounds_type_node.
* gimple.h (gimple_call_get_nobnd_arg_index): New.
(gimple_call_num_nobnd_args): New.
(gimple_call_nobnd_arg): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New.
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_call_get_nobnd_arg_index): New.
* gimple-pretty-print.c (dump_gimple_return): Print second op.


diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 1e985e0..fddcee0 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -535,11 +535,12 @@ dump_gimple_assign (pretty_printer *buffer, gimple gs, 
int spc, int flags)
 static void
 dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
-  tree t;
+  tree t, t2;
 
   t = gimple_return_retval (gs);
+  t2 = gimple_return_retbnd (gs);
   if (flags & TDF_RAW)
-dump_gimple_fmt (buffer, spc, flags, "%G <%T>", gs, t);
+dump_gimple_fmt (buffer, spc, flags, "%G <%T %T>", gs, t, t2);
   else
 {
   pp_string (buffer, "return");
@@ -548,6 +549,11 @@ dump_gimple_return (pretty_printer *buffer, gimple gs, int 
spc, int flags)
  pp_space (buffer);
  dump_generic_node (buffer, t, spc, flags, false);
}
+  if (t2)
+   {
+ pp_string (buffer, ", ");
+ dump_generic_node (buffer, t2, spc, flags, false);
+   }
   pp_semicolon (buffer);
 }
 }
diff --git a/gcc/gimple.c b/gcc/gimple.c
index e12f7d9..59ca78a 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -179,7 +179,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode,
 gimple
 gimple_build_return (tree retval)
 {
-  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1);
+  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2);
   if (retval)
 gimple_return_set_retval (s, retval);
   return s;
@@ -371,6 +371,26 @@ gimple_build_call_from_tree (tree t)
 }
 
 
+/* Return index of INDEX's non bound argument of the call.  */
+
+unsigned
+gimple_call_get_nobnd_arg_index (const_gimple gs, unsigned index)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  for (unsigned n = 0; n < num_args; n++)
+{
+  if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
+   continue;
+  else if (index)
+   index--;
+  else
+   return n;
+}
+
+  gcc_unreachable ();
+}
+
+
 /* Extract the operands and code for expression EXPR into *SUBCODE_P,
*OP1_P, *OP2_P and *OP3_P respectively.  */
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 376fda2..484b467 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -910,6 +910,7 @@ extern tree get_initialized_tmp_var (tree, gimple_seq *, 
gimple_seq *);
 extern tree get_formal_tmp_var (tree, gimple_seq *);
 extern void declare_vars (tree, gimple, bool);
 extern void annotate_all_with_location (gimple_seq, location_t);
+extern unsigned gimple_call_get_nobnd_arg_index (const_gimple, unsigned);
 
 /* Validation of GIMPLE expressions.  Note that these predicates only check
the basic form of the expression, they don't recurse to make sure that
@@ -2379,6 +2380,31 @@ gimple_call_arg (const_gimple gs, unsigned index)
 }
 
 
+/* Return the number of arguments used by call statement GS.  */
+
+static inline unsigned
+gimple_call_num_nobnd_args (const_gimple gs)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  unsigned res = num_args;
+  for (unsigned n = 0; n < num_args; n++)
+if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
+  res--;
+  return res;
+}
+
+
+/* Return INDEX's call argument ignoring bound ones.  */
+static inline tree
+gimple_call_nobnd_arg (const_gimple gs, unsigned index)
+{
+  /* No bound args may exist if pointers checker is off.  */
+  if (!flag_check_pointers)
+return gimple_call_arg (gs, index);
+  return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
+}
+
+
 /* Return a pointer to the argument at position INDEX for call
statement GS.  */
 
@@ -4886,6 +4912,26 @@ gimple_return_set_retval (gimple gs, tree retval)
 }
 
 
+/* Return the return bounds for GIMPLE_RETURN GS.  */
+
+static inline tree
+gimple_return_retbnd (const_gimple gs)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  return gimple_op (gs, 1);
+}
+
+
+/* Set RETVAL to be the return bounds for GIMPLE_RETURN GS.  */
+
+static inline void
+gimple_return_set_retbnd (gimple gs, tree retval)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  gimple_set_op (gs, 1, retval);
+}
+
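
The index-skipping logic of gimple_call_get_nobnd_arg_index and
gimple_call_num_nobnd_args in the patch above can be modelled outside GCC.
In this sketch the argument list is reduced to an array of flags
(true = bounds argument); the names toy_nobnd_arg_index,
toy_num_nobnd_args and toy_call_args are hypothetical, and -1 stands in
for gcc_unreachable ().

```c
#include <assert.h>
#include <stdbool.h>

/* Sample call: (ptr, bounds, int, ptr, bounds).  True marks a bounds
   argument.  */
const bool toy_call_args[] = { false, true, false, false, true };
enum { TOY_NUM_ARGS = sizeof toy_call_args / sizeof toy_call_args[0] };

/* Return the position of the INDEX-th non-bounds argument, or -1 if
   there is none (the patch calls gcc_unreachable () in that case).  */
static int
toy_nobnd_arg_index (const bool *is_bound, unsigned num_args, unsigned index)
{
  for (unsigned n = 0; n < num_args; n++)
    {
      if (is_bound[n])
        continue;
      else if (index)
        index--;
      else
        return (int) n;
    }
  return -1;
}

/* Return the number of arguments that are not bounds.  */
static unsigned
toy_num_nobnd_args (const bool *is_bound, unsigned num_args)
{
  unsigned res = num_args;
  for (unsigned n = 0; n < num_args; n++)
    if (is_bound[n])
      res--;
  return res;
}
```

For the sample call, non-bounds arguments 0, 1 and 2 sit at positions 0, 2
and 3, and the non-bounds count is 3.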

Re: [PATCH, i386]: Unbreak bootstrap

2013-10-25 Thread Ilya Enkovich

2013/10/25 Uros Bizjak :
> Hello!
>
> 2013-10-25  Uros Bizjak  
>
> * config/i386/i386.h (TARGET_MPX): New define.
> (TARGET_MPX_P): Ditto.
>
> Tested on x86_64-pc-linux-gnu, committed.

Thanks for fixing it!

>
> Uros.

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-25 Thread Ilya Enkovich

2013/10/25 Jeff Law :
> On 10/21/13 05:49, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> This patch introduces built-in functions used by Pointers Checker and flag
>> to enable Pointers Checker. Builtins available for user are expanded in
>> expand_builtin. All other builtins are not allowed in expand until generic
>> version of Pointers Checker is implemented.
>>
>> Bootstrapped and tested on linux-x86_64.
>>
>> Thanks,
>> Ilya
>> --
>>
>> gcc/
>>
>> 2013-10-04  Ilya Enkovich  
>>
>> * builtin-types.def (BT_FN_VOID_CONST_PTR): New.
>> (BT_FN_PTR_CONST_PTR): New.
>> (BT_FN_CONST_PTR_CONST_PTR): New.
>> (BT_FN_PTR_CONST_PTR_SIZE): New.
>> (BT_FN_PTR_CONST_PTR_CONST_PTR): New.
>> (BT_FN_VOID_PTRPTR_CONST_PTR): New.
>> (BT_FN_VOID_CONST_PTR_SIZE): New.
>> (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
>> * chkp-builtins.def: New.
>> * builtins.def: include chkp-builtins.def.
>> (DEF_CHKP_BUILTIN): New.
>> * builtins.c (expand_builtin): Support
>> BUILT_IN_CHKP_INIT_PTR_BOUNDS,
>> BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
>> BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
>> BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
>> BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
>> BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
>> BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
>> BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
>> BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND,
>> BUILT_IN_CHKP_NARROW,
>> BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
>> * common.opt (fcheck-pointers): New.
>> * toplev.c (process_options): Check Pointers Checker is supported.
>> * doc/extend.texi: Document Pointers Checker built-in functions.
>
> Just a few minor comments.
>
>
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -874,6 +874,11 @@ fbounds-check
>>   Common Report Var(flag_bounds_check)
>>   Generate code to check bounds before indexing arrays
>>
>> +fcheck-pointers
>> +Common Report Var(flag_check_pointers)
>> +Add pointers checker instrumentation.  fchkp-* flags are used to
>> +control instrumentation.  Currently available for C, C++ and ObjC.
>> +
>
> I'd probably use "pointer bounds checking" rather than "pointers checker".
> It's a nit, but most folks have heard the term "pointer bounds checking",
> but few probably use "pointers checker".
>
> I think you make several references to "pointers checker" that are probably
> best reworded slightly to use "pointer bounds checker"
>
>
>
>> diff --git a/gcc/toplev.c b/gcc/toplev.c
>> index feba051..285b36d 100644
>> --- a/gcc/toplev.c
>> +++ b/gcc/toplev.c
>> @@ -1290,6 +1290,18 @@ process_options (void)
>> if (flag_mudflap && flag_lto)
>>   sorry ("mudflap cannot be used together with link-time
>> optimization");
>>
>> +  if (flag_check_pointers)
>> +{
>> +  if (flag_lto)
>> +   sorry ("Pointers checker is not yet fully supported for link-time
>> optimization");
>
> What was the final resolution of this?  Like jsm, this seems to me to be
> papering over a problem elsewhere.
>
> I'll pre-approve this patch with the terminology change and the flag_lto
> hack removed.
>
> jeff
>

There are currently two known issues with LTO. The first one is an ICE in
the LTO streamer when it reads instrumented code. The second one is an
uninitialized flag_check_pointers when code is compiled by lto1 (do you
know why that may happen, BTW?). It also causes an ICE because instrumented
code is encountered when not expected. Of course, I'll fix these problems
anyway, but I was going to allow '-fcheck-pointers -flto' only when the
checker testsuite has a 100% pass rate with LTO.

Ilya

Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-28 Thread Ilya Enkovich

On 24 Oct 23:21, Jeff Law wrote:
> On 10/24/13 02:24, Ilya Enkovich wrote:
> >These two hooks are used by the expand pass to pass/receive bounds for
> >args. When bounds are passed in a register, expand does not need this
> >hook and uses a regular move insn. If we are out of bound registers or
> >the platform does not have them at all, this hook is called. If a bounded
> >arg is passed in memory, the regular way to store associated bounds is
> >supposed to be used in the hook. That is why no slot_no value is required;
> >only arg and its place (slot) are used. If a bounded arg is passed in a
> >register (e.g. it happens on i386 with MPX when more than 4 pointers
> >are passed in registers), then some special slot has to be used for
> >bounds. slot_no here holds the identifier of this special slot. E.g. if we
> >call a function with 6 pointers on i386, we call this hook passing R8 and
> >R9 as slot with const1_rtx and const2_rtx as slot_no.
> So can we find a concise way to describe this and include that in
> the docs for the hooks.  Otherwise I can't see how a developer is
> going to know how to use this stuff.
> 
> >>So how am I (as a GCC developer) supposed to know what BNDMK, BNDLDX, BNDCU,
> >>etc mean?  The names aren't particularly descriptive.  Are these documented
> >>elsewhere in a follow-up patch?  If not, it seems to me we need to document
> >>them here.
> >
> >Actually the next patch introduces them and is a good place for
> >documentation. But currently this patch has documentation for user
> >visible built-ins only. For now built-ins used for instrumentation are
> >described on Wiki only
> >(http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler#Builtins_used_for_instrumentation).
> >Where should I move it? Does it also go to extend.texi?
> If they're strictly for developers, then there's not a real good
> place for them.  They're not really extensions we would expect the
> user to use, so extend.texi seems inappropriate.
> 
> Perhaps a section in tm.texi?
> 
> Jeff
> 

I fixed documentation for bounds load/store hooks.  Also added documentation 
for list of builtins returned by builtin_chkp_function.

Thanks,
Ilya
--

gcc/

2013-10-28  Ilya Enkovich  

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
* doc/tm.texi: Regenerated.
* langhooks.h (lang_hooks): Add chkp_supported field.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): New.
(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_CHKP_SUPPORTED.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..c60ebef 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,13 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for @code{va_list} object in function specified
+by @var{fndecl}.  This hook is used by Pointer Bounds Checker to build bounds
+for @code{va_list} object.  Return @code{integer_zero_node} if no bounds
+should be used (e.g. @code{va_list} is a scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5158,26 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx 
@var{arg}, rtx @var{slot_no})
+This hook is used by expand pass to emit insn to load bounds of
+@var{arg} passed in @var{slot}.  E

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-29 Thread Ilya Enkovich

On 29 Oct 06:59, Jeff Law wrote:
> On 10/29/13 04:17, Richard Biener wrote:
> >On Mon, Oct 28, 2013 at 10:21 PM, Jeff Law  wrote:
> >>On 10/25/13 11:57, Ilya Enkovich wrote:
> >>
> >>>
> >>>There are currently two known issues with LTO. The first one is an ICE in
> >>>the LTO streamer when it reads instrumented code. The second one is an
> >>>uninitialized flag_check_pointers when code is compiled by lto1 (do you
> >>>know why it may happen, BTW?). It also causes an ICE because instrumented
> >>>code is encountered when not expected. Of course, I'll fix these problems
> >>>anyway, but I was going to allow '-fcheck-pointers -flto' only when the
> >>>checker testsuite has a 100% pass rate with LTO.
> >>
> >>No idea about why LTO would be failing in this way -- my experience with LTO
> >>is virtually nil.
> >>
> >>I'm OK with the restriction as a temporary measure.  But we really dislike
> >>this kind of stuff, so I need a commitment from you to dive into these
> >>problems and sort out what's really going on.
> >
> >I'm not ok with adding such arbitrary restrictions.  LTO support should
> >be trivial.
> In that case, Ilya, you've got to resolve the LTO issues before
> this can move forward.
> 
> jeff
> 

Yeah.  I'm working on it right now.  I've fixed known issues and now I'm 
looking for others.  Meanwhile here is a new patch version with required 
renames and without LTO restriction.

Thanks,
Ilya
--

gcc/

2013-10-29  Ilya Enkovich  

* builtin-types.def (BT_FN_VOID_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
* common.opt (fcheck-pointer-bounds): New.
* toplev.c (process_options): Check Pointer Bounds Checker is supported.
* doc/extend.texi: Document Pointer Bounds Checker built-in functions.

diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 3deedba..1f9ae4e 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -226,6 +226,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT64_DFLOAT64, BT_DFLOAT64, 
BT_DFLOAT64)
 DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFLOAT128, BT_DFLOAT128, BT_DFLOAT128)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_VOID_CONST_PTR, BT_VOID, BT_CONST_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
@@ -233,6 +234,8 @@ DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, 
BT_UINT16)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
 DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_INT, BT_BOOL, BT_INT)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)

 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)

@@ -346,6 +349,10 @@ DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, 
BT_SIZE,
 BT_CONST_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_INT_BOOL, BT_BOOL, BT_INT, BT_BOOL)
 DEF_FUNCTION_TYPE_2 (BT_FN_VOID_UINT_UINT, BT_VOID, BT_UINT, BT_UINT)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRPTR_CONST_PTR, BT_VOID, BT_PTR_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_CONST_PTR_SIZE, BT_VOID, BT_CONST_PTR, BT_SIZE)

 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)

@@ -428,6 +435,7 @@ DEF_FUNCT

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-29 Thread Ilya Enkovich

2013/10/29 Jeff Law :
> On 10/29/13 07:52, Ilya Enkovich wrote:
>>
>>
>> Yeah.  I'm working on it right now.  I've fixed known issues and now
>> I'm looking for others.  Meanwhile here is a new patch version with
>> required renames and without LTO restriction.
>
> I can't help but be curious, what turned out to be the root cause of those
> LTO problems?

There were three different problems fixed.

The first one was an SSA_NAME in DECL_INITIAL of a local var.
Instrumentation used it to initialize the var with the input arg value
(the default SSA_NAME of the PARM_DECL was used). LTO cannot handle it
because when it reads symbols, it does not have SSA_NAMEs. It caused an ICE.

Another problem was in the LTO front-end. I did not realize it has its own
langhooks. It caused a reset of flag_check_pointer_bounds in
process_options by my own code.

And the last one was in initialization of checker structures. Some
structures were initialized during the checker pass and then used in other
passes (e.g. expand). With LTO, the checker pass is not executed after the
LTO front-end, so following passes could work with uninitialized checker
structures.
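
The third problem is a general hazard: state that is set up as a side
effect of one pass and then read by later passes. One common guard is to
initialize lazily on first use. The sketch below shows that pattern with a
hypothetical toy_checker_state; none of these names exist in GCC, and this
is not claimed to be the fix that was actually applied.

```c
#include <stdbool.h>

/* Hypothetical pass-local state; not a GCC structure.  */
struct toy_checker_state
{
  bool initialized;
  int num_tracked;
};

static struct toy_checker_state toy_state;

/* Every user goes through this accessor, so the state is set up on
   first use regardless of which pass touches it first.  */
static struct toy_checker_state *
toy_get_checker_state (void)
{
  if (!toy_state.initialized)
    {
      toy_state.num_tracked = 0;
      toy_state.initialized = true;
    }
  return &toy_state;
}

/* Example user: works even if no earlier pass ran.  */
static void
toy_track (void)
{
  toy_get_checker_state ()->num_tracked++;
}
```

With this shape, a later pass calling toy_track () works whether or not
an earlier pass ever ran.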

>
>
>
> 2013-10-29  Ilya Enkovich  
>
> * builtin-types.def (BT_FN_VOID_CONST_PTR): New.
> (BT_FN_PTR_CONST_PTR): New.
> (BT_FN_CONST_PTR_CONST_PTR): New.
> (BT_FN_PTR_CONST_PTR_SIZE): New.
> (BT_FN_PTR_CONST_PTR_CONST_PTR): New.
> (BT_FN_VOID_PTRPTR_CONST_PTR): New.
> (BT_FN_VOID_CONST_PTR_SIZE): New.
> (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
> * chkp-builtins.def: New.
> * builtins.def: include chkp-builtins.def.
> (DEF_CHKP_BUILTIN): New.
> * builtins.c (expand_builtin): Support
> BUILT_IN_CHKP_INIT_PTR_BOUNDS,
> BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
> BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
> BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
> BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
> BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
> BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
> BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
> BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND,
> BUILT_IN_CHKP_NARROW,
> BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
> * common.opt (fcheck-pointer-bounds): New.
> * toplev.c (process_options): Check Pointer Bounds Checker is
> supported.
> * doc/extend.texi: Document Pointer Bounds Checker built-in
> functions.
>
> This is fine.  Please install.

Thanks!

Ilya
>
> Thanks,
> jeff

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-29 Thread Ilya Enkovich

On 29 Oct 13:39, Jeff Law wrote:
> On 10/24/13 08:43, Ilya Enkovich wrote:
> >
> >+/* Return the number of arguments used by call statement GS.  */
> >+
> >+static inline unsigned
> >+gimple_call_num_nobnd_args (const_gimple gs)
> >+{
> >+  unsigned num_args = gimple_call_num_args (gs);
> >+  unsigned res = num_args;
> >+  for (unsigned n = 0; n < num_args; n++)
> >+if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
> >+  res--;
> >+  return res;
> >+}
> "number of arguments" seems wrong.  Aren't you counting the number
> of arguments without bounds?

Damned copy-paste.
> 
> 
> >+
> >+/* Nonzero if this type supposes bounds existence.  */
> >+#define BOUNDED_TYPE_P(type) \
> >+  (TREE_CODE (type) == POINTER_TYPE \
> >+|| TREE_CODE (type) == REFERENCE_TYPE)
> So how is this different than POINTER_TYPE_P?
> 
> 
> If you really want BOUNDED_TYPE_P, perhaps define it in terms of
> POINTER_TYPE_P?
> 
> With that and the comment fix, this is fine.
> 
> jeff

I'd like to keep this macro.  Currently it is equivalent to POINTER_TYPE_P,
but its semantics are a little different.
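
Jeff's suggestion of defining the macro in terms of POINTER_TYPE_P could
look roughly like the sketch below. The TOY_ names are stand-ins so the
fragment compiles on its own; they are not GCC's tree codes or macros.

```c
/* Toy model: stand-ins for GCC's tree codes and tree nodes.  */
enum toy_tree_code { TOY_INTEGER_TYPE, TOY_POINTER_TYPE, TOY_REFERENCE_TYPE };

struct toy_type { enum toy_tree_code code; };

#define TOY_TREE_CODE(type) ((type)->code)

/* Nonzero if TYPE is a pointer or reference type.  */
#define TOY_POINTER_TYPE_P(type) \
  (TOY_TREE_CODE (type) == TOY_POINTER_TYPE \
   || TOY_TREE_CODE (type) == TOY_REFERENCE_TYPE)

/* Nonzero if values of TYPE carry bounds.  Today this is the same set
   of types as TOY_POINTER_TYPE_P, but the separate name lets the two
   diverge later without touching every user.  */
#define TOY_BOUNDED_TYPE_P(type) TOY_POINTER_TYPE_P (type)
```

Keeping the second name costs nothing now and records the different
intent at each use site.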

Below is a fixed version to be committed.

Thanks!
Ilya
--

diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index e4b0f81..248dfea 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -539,11 +539,12 @@ dump_gimple_assign (pretty_printer *buffer, gimple gs, 
int spc, int flags)
 static void
 dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
-  tree t;
+  tree t, t2;
 
   t = gimple_return_retval (gs);
+  t2 = gimple_return_retbnd (gs);
   if (flags & TDF_RAW)
-dump_gimple_fmt (buffer, spc, flags, "%G <%T>", gs, t);
+dump_gimple_fmt (buffer, spc, flags, "%G <%T %T>", gs, t, t2);
   else
 {
   pp_string (buffer, "return");
@@ -552,6 +553,11 @@ dump_gimple_return (pretty_printer *buffer, gimple gs, int 
spc, int flags)
  pp_space (buffer);
  dump_generic_node (buffer, t, spc, flags, false);
}
+  if (t2)
+   {
+ pp_string (buffer, ", ");
+ dump_generic_node (buffer, t2, spc, flags, false);
+   }
   pp_semicolon (buffer);
 }
 }
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 3ddceb9..20f6010 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -174,7 +174,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode,
 gimple
 gimple_build_return (tree retval)
 {
-  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1);
+  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2);
   if (retval)
 gimple_return_set_retval (s, retval);
   return s;
@@ -366,6 +366,26 @@ gimple_build_call_from_tree (tree t)
 }
 
 
+/* Return index of INDEX's non bound argument of the call.  */
+
+unsigned
+gimple_call_get_nobnd_arg_index (const_gimple gs, unsigned index)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  for (unsigned n = 0; n < num_args; n++)
+{
+  if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
+   continue;
+  else if (index)
+   index--;
+  else
+   return n;
+}
+
+  gcc_unreachable ();
+}
+
+
 /* Extract the operands and code for expression EXPR into *SUBCODE_P,
*OP1_P, *OP2_P and *OP3_P respectively.  */
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index fef64cd..c7ce394 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -919,6 +919,7 @@ extern tree get_initialized_tmp_var (tree, gimple_seq *, 
gimple_seq *);
 extern tree get_formal_tmp_var (tree, gimple_seq *);
 extern void declare_vars (tree, gimple, bool);
 extern void annotate_all_with_location (gimple_seq, location_t);
+extern unsigned gimple_call_get_nobnd_arg_index (const_gimple, unsigned);
 
 /* Validation of GIMPLE expressions.  Note that these predicates only check
the basic form of the expression, they don't recurse to make sure that
@@ -2414,6 +2415,32 @@ gimple_call_arg (const_gimple gs, unsigned index)
 }
 
 
+/* Return the number of arguments used by call statement GS
+   ignoring bound ones.  */
+
+static inline unsigned
+gimple_call_num_nobnd_args (const_gimple gs)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  unsigned res = num_args;
+  for (unsigned n = 0; n < num_args; n++)
+if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
+  res--;
+  return res;
+}
+
+
+/* Return INDEX's call argument ignoring bound ones.  */
+static inline tree
+gimple_call_nobnd_arg (const_gimple gs, unsigned index)
+{
+  /* No bound args may exist if pointers checker is off.  */
+  if (!flag_check_pointer_bounds)
+return gimple_call_arg (gs, index);
+  return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
+}
+
+
 /* Return a pointer to the argument at position INDEX for call
statement GS.  */
 
@@ -5220,6 +5

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-30 Thread Ilya Enkovich

2013/10/30 Richard Biener :
> On Tue, Oct 29, 2013 at 8:48 PM, Ilya Enkovich  wrote:
>> 2013/10/29 Jeff Law :
>>> On 10/29/13 07:52, Ilya Enkovich wrote:
>>>>
>>>>
>>>> Yeah.  I'm working on it right now.  I've fixed known issues and now
>>>> I'm looking for others.  Meanwhile here is a new patch version with
>>>> required renames and without LTO restriction.
>>>
>>> I can't help but be curious, what turned out to be the root cause of those
>>> LTO problems?
>>
>> There were three different problems fixed.
>>
>> The first one was SSA_NAME in DECL_INITIAL of local var.
>> Instrumentation used it to initialize var with input arg value
>> (default SSA_NAME of PARM_DECL was used). LTO cannot handle it because
>> when it reads symbols, it does not have SSA_NAMEs. It caused ICE.
>
> Obviously putting things in trees is bad.

Yes, it is fixed.
>
>> Another problem was in LTO front-end. I did not realize it has own
>> langhooks. It caused reset of flag_check_pointer_bounds in
>> process_options by my own code.
>>
>> And the last one was in initialization of checker structures. Some
>> structures were initialized during checker pass and then used in other
>> passes (e.g. expand). With LTO checker pass is not executed after LTO
>> front-end and following passes could work with uninitialized checker
>> structures.
>
> Looks badly designed then - any function related information should
> be hooked off struct function and streamed by LTO.  Or the info
> should be present in the IL.

Info is local to the pass and is not required to be streamed by LTO. It is
just stored using checker interfaces in its structures.

Ilya

>
> Richard.
>
>>>
>>>
>>>
>>> 2013-10-29  Ilya Enkovich  
>>>
>>> * builtin-types.def (BT_FN_VOID_CONST_PTR): New.
>>> (BT_FN_PTR_CONST_PTR): New.
>>> (BT_FN_CONST_PTR_CONST_PTR): New.
>>> (BT_FN_PTR_CONST_PTR_SIZE): New.
>>> (BT_FN_PTR_CONST_PTR_CONST_PTR): New.
>>> (BT_FN_VOID_PTRPTR_CONST_PTR): New.
>>> (BT_FN_VOID_CONST_PTR_SIZE): New.
>>> (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
>>> * chkp-builtins.def: New.
>>> * builtins.def: include chkp-builtins.def.
>>> (DEF_CHKP_BUILTIN): New.
>>> * builtins.c (expand_builtin): Support
>>> BUILT_IN_CHKP_INIT_PTR_BOUNDS,
>>> BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
>>> BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
>>> BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
>>> BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
>>> BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
>>> BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
>>> BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
>>> BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND,
>>> BUILT_IN_CHKP_NARROW,
>>> BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
>>> * common.opt (fcheck-pointer-bounds): New.
>>> * toplev.c (process_options): Check Pointer Bounds Checker is
>>> supported.
>>> * doc/extend.texi: Document Pointer Bounds Checker built-in
>>> functions.
>>>
>>> This is fine.  Please install.
>>
>> Thanks!
>>
>> Ilya
>>>
>>> Thanks,
>>> jeff

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich

On 30 Oct 10:26, Richard Biener wrote:
> 
> Ick - you enlarge all return statements?  But you don't set the actual value?
> So why allocate it with 2 ops in the first place??

When a return does not return bounds, it has an operand with zero value,
similar to the case when it does not return a value. What is the difference then?
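
A toy model of that layout, assuming a return statement with two operand
slots where an unused slot is simply NULL. The struct and function names
are hypothetical and only mirror gimple_return_retval and
gimple_return_retbnd.

```c
#include <stddef.h>

/* Toy stand-in for a GIMPLE_RETURN with two operand slots.  */
struct toy_return
{
  const char *retval;   /* Operand 0: NULL when nothing is returned.  */
  const char *retbnd;   /* Operand 1: NULL when no bounds are returned.  */
};

/* Build a return with both slots present; only RETVAL is filled in
   here, as in the patched gimple_build_return.  */
static struct toy_return
toy_build_return (const char *retval)
{
  struct toy_return s = { NULL, NULL };
  if (retval)
    s.retval = retval;
  return s;
}

/* Fill the bounds slot later, as the instrumentation pass would.  */
static void
toy_return_set_retbnd (struct toy_return *s, const char *retbnd)
{
  s->retbnd = retbnd;
}
```

The cost Richard objects to is visible here: every return carries the
second slot, whether or not it is ever filled.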

> 
> [Seems I completely missed that MPX changes "gimple" and the design
> document that was posted somewhere??]
> 

The design is on the GCC Wiki and the link was posted a few times:
http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler. Here
is a quote about return statements:

Returns instrumentation. We add new operand to return statement to hold 
returned bounds and instrumentation pass is responsible to fill this operand 
with correct bounds

> Bah.
> 
> Where is the update to gimple.texi and tree.texi?
> 
> Richard.
> 

Unfortunately, the patch has already been installed.  Should we uninstall it?
If not, then here is a patch for the documentation.

Thanks,
Ilya
--

gcc/

2013-10-30  Ilya Enkovich  

* doc/gimple.texi (gimple_call_num_nobnd_args): New.
(gimple_call_nobnd_arg): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New.
(gimple_call_get_nobnd_arg_index): New.


diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
index 7bd9fd5..be74170 100644
--- a/gcc/doc/gimple.texi
+++ b/gcc/doc/gimple.texi
@@ -1240,11 +1240,25 @@ Set @code{CHAIN} to be the static chain for call 
statement @code{G}.
 Return the number of arguments used by call statement @code{G}.
 @end deftypefn
 
+@deftypefn {GIMPLE function} unsigned gimple_call_num_nobnd_args (gimple g)
+Return the number of arguments used by call statement @code{G}
+ignoring bound ones.
+@end deftypefn
+
 @deftypefn {GIMPLE function} tree gimple_call_arg (gimple g, unsigned index)
 Return the argument at position @code{INDEX} for call statement @code{G}.  The
 first argument is 0.
 @end deftypefn
 
+@deftypefn {GIMPLE function} tree gimple_call_nobnd_arg (gimple g, unsigned 
index)
+Return the argument at position @code{INDEX} for call statement @code{G}
+ignoring bound ones.  The first argument is 0.
+@end deftypefn
+
+@deftypefn {GIMPLE function} unsigned gimple_call_get_nobnd_arg_index (gimple 
g, unsigned index)
+Return index of @code{INDEX}'s non bound argument of the call statement 
@code{G}
+@end deftypefn
+
 @deftypefn {GIMPLE function} {tree *} gimple_call_arg_ptr (gimple g, unsigned 
index)
 Return a pointer to the argument at position @code{INDEX} for call
 statement @code{G}.
@@ -2029,6 +2043,15 @@ Return the return value for @code{GIMPLE_RETURN} 
@code{G}.
 Set @code{RETVAL} to be the return value for @code{GIMPLE_RETURN} @code{G}.
 @end deftypefn
 
+@deftypefn {GIMPLE function} tree gimple_return_retbnd (gimple g)
+Return the bounds of return value for @code{GIMPLE_RETURN} @code{G}.
+@end deftypefn
+
+@deftypefn {GIMPLE function} void gimple_return_set_retbnd (gimple g, tree 
retbnd)
+Set @code{RETBND} to be the bounds of return value for @code{GIMPLE_RETURN}
+@code{G}.
+@end deftypefn
+
 @node @code{GIMPLE_SWITCH}
 @subsection @code{GIMPLE_SWITCH}
 @cindex @code{GIMPLE_SWITCH}

Re: [PATCH, MPX, 2/X] Pointers Checker [3/25] Attributes

2013-10-30 Thread Ilya Enkovich

On 24 Oct 23:34, Jeff Law wrote:
> On 10/21/13 05:59, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch adds attributes 'bnd_variable_size' and 'bnd_legacy' used by 
> >Pointers Checker.
> >
> >Bootstrapped and tested on linux-x86_64.
> >
> >Thanks,
> >Ilya
> >--
> >
> >gcc/
> >
> >2013-10-21  Ilya Enkovich  
> >
> > * c-family/c-common.c (handle_bnd_variable_size_attribute): New.
> > (handle_bnd_legacy): New.
> > (c_common_attribute_table): Add bnd_variable_size and bnd_legacy.
> > * doc/extend.texi: Document bnd_variable_size and bnd_legacy
> > attributes.
> OK with the same terminology changes from the 2/25 Builtins patch.
> 
> jeff
> 
Thanks! Below is the installed version.

Ilya
--

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index b20fdd6..f519489 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -375,6 +375,8 @@ static tree handle_omp_declare_simd_attribute (tree *, 
tree, tree, int,
   bool *);
 static tree handle_omp_declare_target_attribute (tree *, tree, tree, int,
 bool *);
+static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool 
*);
+static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 
 static void check_function_nonnull (tree, int, tree *);
 static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT);
@@ -757,6 +759,10 @@ const struct attribute_spec c_common_attribute_table[] =
  handle_omp_declare_simd_attribute, false },
   { "omp declare target", 0, 0, true, false, false,
  handle_omp_declare_target_attribute, false },
+  { "bnd_variable_size",  0, 0, true,  false, false,
+ handle_bnd_variable_size_attribute, false },
+  { "bnd_legacy", 0, 0, true, false, false,
+ handle_bnd_legacy, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -8013,6 +8019,38 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, 
tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+/* Handle a "bnd_variable_size" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_variable_size_attribute (tree *node, tree name, tree ARG_UNUSED 
(args),
+   int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FIELD_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
+/* Handle a "bnd_legacy" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_legacy (tree *node, tree name, tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+  warning (OPT_Wattributes, "%qE attribute ignored", name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Handle a "warn_unused" attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 8ca3137..1d52e42 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2138,7 +2138,7 @@ attributes are currently defined for functions on all 
targets:
 @code{returns_nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
-@code{no_sanitize_undefined},
+@code{no_sanitize_undefined}, @code{bnd_legacy},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3549,6 +3549,12 @@ The @code{no_sanitize_undefined} attribute on functions 
is used
 to inform the compiler that it should not check for undefined behavior
 in the function when compiling with the @option{-fsanitize=undefined} option.
 
+@item bnd_legacy
+@cindex @code{bnd_legacy} function attribute
+The @code{bnd_legacy} attribute on functions is used to inform
+compiler that function should not be instrumented when compiled
+with @option{-fcheck-pointers} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
@@ -5321,12 +5327,12 @@ placed in either the @code{.bss_below100} section or the
 The keyword @code{__attribute__} allows you to specify special
 attributes of @code{struct} and @code{union} types when you define
 such types.  This keyword is followed by an attribute specification
-inside double parentheses.  Seven attributes are currently defined for
+inside doub
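
A short usage sketch for the two attributes handled above. The handlers
accept bnd_variable_size only on structure fields and bnd_legacy only on
functions. The trailing-array use of bnd_variable_size is an assumption
about its intended purpose, since the documentation hunk is cut off here.
The TOY_ macros and toy_ names are hypothetical; the macros expand to
nothing on compilers that do not know the attributes.

```c
#include <stddef.h>

/* Use the attributes only where the compiler reports them.  */
#if defined (__has_attribute)
# if __has_attribute (bnd_legacy)
#  define TOY_BND_LEGACY __attribute__ ((bnd_legacy))
# endif
# if __has_attribute (bnd_variable_size)
#  define TOY_BND_VARIABLE_SIZE __attribute__ ((bnd_variable_size))
# endif
#endif
#ifndef TOY_BND_LEGACY
# define TOY_BND_LEGACY
#endif
#ifndef TOY_BND_VARIABLE_SIZE
# define TOY_BND_VARIABLE_SIZE
#endif

struct toy_packet
{
  size_t len;
  /* Field attribute: anything other than a FIELD_DECL is rejected
     with a -Wattributes warning by the handler.  */
  char payload[1] TOY_BND_VARIABLE_SIZE;
};

/* Function attribute: anything other than a FUNCTION_DECL is rejected
   with a -Wattributes warning by the handler.  */
TOY_BND_LEGACY size_t
toy_payload_len (const struct toy_packet *p)
{
  return p->len;
}
```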

Re: [PATCH, MPX, 2/X] Pointers Checker [4/25] Constructors

2013-10-30 Thread Ilya Enkovich

On 24 Oct 23:55, Jeff Law wrote:
> On 10/21/13 06:10, Ilya Enkovich wrote:
> >Hi,
> >
> >This patch introduces two new constructor types supported by
> >cgraph_build_static_cdtor.
> >
> >'B' type is used to initialize static objects (bounds) created by Pointers 
> >Checker. The difference of this type from the regular constructor is that 
> >'B' constructor is never instrumented by Pointers Checker.
> >
> >'P' type is used by Pointers Checker to generate constructors to initialize 
> >bounds of statically initialized pointers. Pointers Checker removes all
> >stores from such constructors after instrumentation.
> >
> >Since 'P' type constructors are created for statically initialized objects, 
> >we need to avoid creation of such objects during its gimplification. New 
> >restriction was added to gimplify_init_constructor.
> >
> >Bootstrapped and checked on linux-x86_64.
> >
> >Thanks,
> >Ilya
> >--
> >
> >gcc/
> >
> >2013-10-21  Ilya Enkovich  
> >
> > * ipa.c (cgraph_build_static_cdtor_1): Support constructors
> > with "chkp ctor" and "bnd_legacy" attributes.
> > * gimplify.c (gimplify_init_constructor): Avoid infinite
> > loop during gimplification of bounds initializer.
> This is OK.
> 
> As a side note, it seems awfully strange to be passing in the type
> of ctor/dtor in a char variable.  I'd look favorably upon changing
> that to an enum where the enum values describe the cases they
> handle.  The existing code seems so, umm, 80s/90s style.  Obviously
> not something that's required of you to move this patch forward.

Thanks! Installed to trunk.

Ilya
> 
> jeff
>
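
Jeff's side note about replacing the char with an enum could look roughly
like this. The enum and helper are hypothetical. 'I' and 'D' are the
existing constructor and destructor codes, and 'B' and 'P' are the two
added by this patch.

```c
/* Hypothetical enum; cgraph_build_static_cdtor itself takes a char.  */
enum toy_cdtor_kind
{
  TOY_CDTOR_CTOR,         /* 'I': regular static constructor.  */
  TOY_CDTOR_DTOR,         /* 'D': regular static destructor.  */
  TOY_CDTOR_BOUNDS_INIT,  /* 'B': initializes static bounds objects and
                             is never instrumented.  */
  TOY_CDTOR_PTR_BOUNDS    /* 'P': initializes bounds of statically
                             initialized pointers.  */
};

/* Map the character codes onto the enum.  Returns -1 for an unknown
   code so callers can diagnose it.  */
static int
toy_cdtor_kind_from_char (char which)
{
  switch (which)
    {
    case 'I': return TOY_CDTOR_CTOR;
    case 'D': return TOY_CDTOR_DTOR;
    case 'B': return TOY_CDTOR_BOUNDS_INIT;
    case 'P': return TOY_CDTOR_PTR_BOUNDS;
    default:  return -1;
    }
}
```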

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich

2013/10/30 Richard Biener :
> On Wed, Oct 30, 2013 at 11:34 AM, Ilya Enkovich  
> wrote:
>> On 30 Oct 10:26, Richard Biener wrote:
>>>
>>> Ick - you enlarge all return statements?  But you don't set the actual 
>>> value?
>>> So why allocate it with 2 ops in the first place??
>>
>> When return does not return bounds it has operand with zero value similar to 
>> case when it does not return value. What is the difference then?
>>
>>>
>>> [Seems I completely missed that MPX changes "gimple" and the design
>>> document that was posted somewhere??]
>>>
>>
>> Design is on GCC Wiki and link was posted few times: 
>> http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler. 
>> Here is quote about return statements:
>>
>> Returns instrumentation. We add new operand to return statement to hold 
>> returned bounds and instrumentation pass is responsible to fill this operand 
>> with correct bounds
>
> foo (int * p, unsigned int size)
> {
>__bound_tmp.0;
>   long unsigned int D.2239;
>   long unsigned int _2;
>   sizetype _6;
>   int * _7;
>
>   :
>   __bound_tmp.0_4 = __builtin_ia32_arg_bnd (p_3(D));
>
>   :
>   _2 = (long unsigned int) size_1(D);
>   __builtin_ia32_bndcl (__bound_tmp.0_4, p_3(D));
>   _6 = _2 + 18446744073709551615;
>   _7 = p_3(D) + _6;
>   __builtin_ia32_bndcu (__bound_tmp.0_4, _7);
>   access_and_store (p_3(D), __bound_tmp.0_4, size_1(D));
>
> so it seems there is now a mismatch between DECL_ARGUMENTS
> and the GIMPLE call stmt arguments.  How (if) did you amend
> the GIMPLE stmt verifier for this?

The verifier just ignores bound arguments while iterating through the call arguments.
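
A minimal sketch of that skipping, on a toy model where each argument is
reduced to a kind tag. This is not the GIMPLE verifier's code, and the
names are hypothetical. Variadic calls are ignored for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

enum toy_kind { TOY_INT, TOY_PTR, TOY_BOUNDS };

/* Return true if the non-bounds entries of ARGS match PARMS one for
   one.  Bounds entries are skipped wherever they appear.  */
static bool
toy_verify_call_args (const enum toy_kind *args, size_t num_args,
                      const enum toy_kind *parms, size_t num_parms)
{
  size_t p = 0;
  for (size_t a = 0; a < num_args; a++)
    {
      if (args[a] == TOY_BOUNDS)
        continue;
      if (p == num_parms || args[a] != parms[p])
        return false;
      p++;
    }
  return p == num_parms;
}
```

For a call shaped like access_and_store (p, bounds, size) checked
against parameters (pointer, integer), the bounds entry is skipped and
the remaining two arguments line up with the two parameters.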

>
> How does regular code deal with this which may expect matching
> to DECL_ARGUMENTS?  In fact interleaving the additional
> arguments sounds very error-prone for existing code - I'd have
> appended all bound args at the end.  Also you unconditionally
> claim all pointer arguments have a bound - that looks like bad
> design as well.  Why didn't you add a flag to the relevant
> PARM_DECL (and then, what do you do for indirect calls?).

I'll consider using another layout for bound args. But why should we
have any PARM_DECL or other pointer not having bounds?
>
> /* Return the number of arguments used by call statement GS
>ignoring bound ones.  */
>
> static inline unsigned
> gimple_call_num_nobnd_args (const_gimple gs)
> {
>   unsigned num_args = gimple_call_num_args (gs);
>   unsigned res = num_args;
>   for (unsigned n = 0; n < num_args; n++)
> if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
>   res--;
>   return res;
> }
>
> the choice means that gimple_call_num_nobnd_args is not O(1).

Yes. And even having all bound args at the end would not fix it. We
would have to additionally keep the number of bounds if we want to fix it.
But I do not see a strong reason for that. Currently there are just three
calls to this function in the whole of GCC.
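
The trade-off can be sketched on a toy call statement: the linear scan
needs no extra state, while a constant-time answer needs a count that is
kept in sync on every argument update. All names below are hypothetical;
nothing here is GCC's data layout.

```c
#include <stddef.h>

/* Toy model of a call statement: each argument records only whether
   it is a bounds value.  Callers keep num_args at or below 8.  */
struct toy_call
{
  size_t num_args;
  size_t num_bound_args;   /* Cached count, kept by toy_call_set_arg.  */
  unsigned char is_bound[8];
};

/* Linear version, mirroring the loop in gimple_call_num_nobnd_args.  */
static size_t
toy_num_nobnd_args_scan (const struct toy_call *c)
{
  size_t res = c->num_args;
  for (size_t n = 0; n < c->num_args; n++)
    if (c->is_bound[n])
      res--;
  return res;
}

/* Constant-time version, relying on the cached count.  */
static size_t
toy_num_nobnd_args_cached (const struct toy_call *c)
{
  return c->num_args - c->num_bound_args;
}

/* Set argument N (N < num_args), keeping the cached count in sync.  */
static void
toy_call_set_arg (struct toy_call *c, size_t n, int is_bound)
{
  c->num_bound_args -= c->is_bound[n];
  c->is_bound[n] = is_bound ? 1 : 0;
  c->num_bound_args += c->is_bound[n];
}
```

The cached variant only pays off if the count is queried often, which
matches the point above about there being just three callers.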

>
> /* Return INDEX's call argument ignoring bound ones.  */
> static inline tree
> gimple_call_nobnd_arg (const_gimple gs, unsigned index)
> {
>   /* No bound args may exist if pointers checker is off.  */
>   if (!flag_check_pointer_bounds)
> return gimple_call_arg (gs, index);
>   return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
> }
>
> GIMPLE layout depending on flag_check_pointer_bounds sounds
> like a recipe for disaster if you consider TUs compiled with and
> TUs compiled without and LTO.  Or if you consider using
> optimized attribute with that flag.

Yep. Marking instrumented calls and functions would be useful in the LTO case.

>
> I wish I had seen all this before.
>
>>> Bah.
>>>
>>> Where is the update to gimple.texi and tree.texi?
>>>
>>> Richard.
>>>
>>
>> Unfortunately patch has been already installed.  Should we uninstall it?  If 
>> not, then here is patch for documentation.
>
> I hope the reviewers that approved the patch will work with you to
> address the above issues.  I can't be everywhere.

Thanks for valuable input!

Ilya

>
> Richard.
>
>> Thanks,
>> Ilya
>> --
>>
>> gcc/
>>
>> 2013-10-30  Ilya Enkovich  
>>
>> * doc/gimple.texi (gimple_call_num_nobnd_args): New.
>> (gimple_call_nobnd_arg): New.
>> (gimple_return_retbnd): New.
>> (gimple_return_set_retbnd): New.
>> (gimple_call_get_nobnd_arg_index): New.
>>
>>
>> diff --git a/gcc/doc/gimple.texi b/gcc/doc/gimple.texi
>> index 7bd9fd5..be74170 100644

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich

On 30 Oct 11:40, Jeff Law wrote:
> On 10/30/13 04:48, Richard Biener wrote:
> >foo (int * p, unsigned int size)
> >{
> >__bound_tmp.0;
> >   long unsigned int D.2239;
> >   long unsigned int _2;
> >   sizetype _6;
> >   int * _7;
> >
> >   :
> >   __bound_tmp.0_4 = __builtin_ia32_arg_bnd (p_3(D));
> >
> >   :
> >   _2 = (long unsigned int) size_1(D);
> >   __builtin_ia32_bndcl (__bound_tmp.0_4, p_3(D));
> >   _6 = _2 + 18446744073709551615;
> >   _7 = p_3(D) + _6;
> >   __builtin_ia32_bndcu (__bound_tmp.0_4, _7);
> >   access_and_store (p_3(D), __bound_tmp.0_4, size_1(D));
> >
> >so it seems there is now a mismatch between DECL_ARGUMENTS
> >and the GIMPLE call stmt arguments.  How (if) did you amend
> >the GIMPLE stmt verifier for this?
> Effectively the bounds are passed "on the side".
> 
> >
> >How does regular code deal with this which may expect matching
> >to DECL_ARGUMENTS?  In fact interleaving the additional
> >arguments sounds very error-prone for existing code - I'd have
> >appended all bound args at the end.  Also you unconditionally
> >claim all pointer arguments have a bound - that looks like bad
> >design as well.  Why didn't you add a flag to the relevant
> >PARM_DECL (and then, what do you do for indirect calls?).
> You can't actually interleave them -- that results in MPX and normal
> code not being able to interact.   Passing the bound at the end
> doesn't really work either -- varargs and the desire to pass some of
> the bounds around in bound registers.
> 
> 
> >
> >/* Return the number of arguments used by call statement GS
> >ignoring bound ones.  */
> >
> >static inline unsigned
> >gimple_call_num_nobnd_args (const_gimple gs)
> >{
> >   unsigned num_args = gimple_call_num_args (gs);
> >   unsigned res = num_args;
> >   for (unsigned n = 0; n < num_args; n++)
> > if (POINTER_BOUNDS_P (gimple_call_arg (gs, n)))
> >   res--;
> >   return res;
> >}
> >
> >the choice means that gimple_call_num_nobnd_args is not O(1).
> Yes, but I don't see that's terribly problematical.
> 
> 
> >
> >/* Return INDEX's call argument ignoring bound ones.  */
> >static inline tree
> >gimple_call_nobnd_arg (const_gimple gs, unsigned index)
> >{
> >   /* No bound args may exist if pointers checker is off.  */
> >   if (!flag_check_pointer_bounds)
> > return gimple_call_arg (gs, index);
> >   return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
> >}
> >
> >GIMPLE layout depending on flag_check_pointer_bounds sounds
> >like a recipe for disaster if you consider TUs compiled with and
> >TUs compiled without and LTO.  Or if you consider using
> >optimized attribute with that flag.
> Sorry, I don't follow.  Can you elaborate please.

I suppose the possible problem here is when we run the LTO compiler without
-fcheck-pointer-bounds and give it instrumented code as input.
gimple_call_nobnd_arg would then work incorrectly for instrumented code.
Actually there are other places in subsequent patches which assume that
flag_check_pointer_bounds is 1 if we have instrumented code.
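
A minimal sketch of that failure mode, assuming a toy model rather than real GIMPLE: toy_flag_check_pointer_bounds stands in for the global option, the instrumented field is a hypothetical per-statement mark, and all toy_* names are invented for this example.

```c
#include <assert.h>

#define TOY_MAX_ARGS 16

/* Hypothetical model of the global option flag_check_pointer_bounds.  */
static int toy_flag_check_pointer_bounds;

struct toy_arg
{
  int is_bounds;
  int value;
};

struct toy_call
{
  struct toy_arg args[TOY_MAX_ARGS];
  unsigned num_args;
  int instrumented;		/* Hypothetical per-statement mark.  */
};

/* Map INDEX, counted over non-bound args only, to a raw index in
   CALL->args.  Return CALL->num_args if there is no such arg.  */
static unsigned
toy_nobnd_arg_index (const struct toy_call *call, unsigned index)
{
  for (unsigned n = 0; n < call->num_args; n++)
    {
      if (call->args[n].is_bounds)
	continue;
      if (index == 0)
	return n;
      index--;
    }
  return call->num_args;
}

/* Accessor keyed on the global flag, like the quoted code.  */
static int
toy_nobnd_arg_by_flag (const struct toy_call *call, unsigned index)
{
  unsigned raw = (toy_flag_check_pointer_bounds
		  ? toy_nobnd_arg_index (call, index) : index);
  assert (raw < call->num_args);
  return call->args[raw].value;
}

/* Accessor keyed on a mark stored in the statement itself.  */
static int
toy_nobnd_arg_by_mark (const struct toy_call *call, unsigned index)
{
  unsigned raw = (call->instrumented
		  ? toy_nobnd_arg_index (call, index) : index);
  assert (raw < call->num_args);
  return call->args[raw].value;
}
```

For an instrumented call (ptr, bounds, size) read with the flag off, the flag-based accessor returns the bounds value for index 1, while the mark-based one returns size.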

Ilya

> 
> >I hope the reviewers that approved the patch will work with you to
> >address the above issues.  I can't be everywhere.
> Obviously I will.
> 
> jeff
>

Re: [PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-30 Thread Ilya Enkovich

2013/10/30 Jeff Law :
> On 10/30/13 04:34, Ilya Enkovich wrote:
>>
>> On 30 Oct 10:26, Richard Biener wrote:
>>>
>>>
>>> Ick - you enlarge all return statements?  But you don't set the
>>> actual value? So why allocate it with 2 ops in the first place??
>>
>>
>> When return does not return bounds it has operand with zero value
>> similar to case when it does not return value. What is the difference
>> then?
>
> In general, when someone proposes a change in the size of tree, rtl or
> gimple nodes, it's a "yellow flag" that something may need further
> investigation.
>
> In this specific instance, I could trivially predict how that additional
> field would be used and a GIMPLE_RETURN isn't terribly important from a size
> standpoint, so I didn't call it out.
>
>
>
>> Returns instrumentation. We add new operand to return statement to
>> hold returned bounds and instrumentation pass is responsible to fill
>> this operand with correct bounds
>
> Exactly what I expected.
>
>
>>
>> Unfortunately patch has been already installed.  Should we uninstall
>> it?  If not, then here is patch for documentation.
>
> I think we're OK for now.  If Richi wants it out, he'll say so explicitly.
>
>
>
>>
>> Thanks, Ilya --
>>
>> gcc/
>>
>> 2013-10-30  Ilya Enkovich  
>>
>> * doc/gimple.texi (gimple_call_num_nobnd_args): New.
>> (gimple_call_nobnd_arg): New. (gimple_return_retbnd): New.
>> (gimple_return_set_retbnd): New. (gimple_call_get_nobnd_arg_index):
>> New.
>
> Can you also fixup the GIMPLE_RETURN documentation in gimple.texi.  It needs
> a minor update after these changes.

I could not find anything but accessors for GIMPLE_RETURN in
gimple.texi. And new accessors are in my doc patch already.

Ilya
>
> jeff
>

[PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-10-31 Thread Ilya Enkovich

Hi,

Here is a patch which handles the problem with optimization of
BUILT_IN_CHKP_ARG_BND calls.  Pointer Bounds Checker expects that the argument of
this call is the default SSA_NAME of the PARM_DECL whose bounds we want to get.
The problem is in optimizations which may replace the arg with its copy or a known
value.  This patch suppresses such modifications.
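
The idea can be sketched on a toy copy-propagation step. This is only an illustration under simplifying assumptions: each statement has a single use, names are plain integers, and enum toy_kind, struct toy_stmt and toy_cprop_into_stmt are hypothetical names, not GCC interfaces.

```c
#include <assert.h>

enum toy_kind
{
  TOY_PLAIN_USE,		/* Any ordinary statement.  */
  TOY_ARG_BND_CALL		/* Models a BUILT_IN_CHKP_ARG_BND call.  */
};

struct toy_stmt
{
  enum toy_kind kind;
  int use;			/* ID of the single name this statement uses.  */
};

/* COPY_OF[N] is the name that N is a known copy of, or N itself.
   Replace the use in STMT with its copy source, except for arg_bnd
   calls, whose argument has to stay the parameter's default name.  */
static void
toy_cprop_into_stmt (struct toy_stmt *stmt, const int *copy_of)
{
  if (stmt->kind == TOY_ARG_BND_CALL)
    return;
  stmt->use = copy_of[stmt->use];
}
```

With name 1 recorded as a copy of name 0, an ordinary statement using 1 is rewritten to use 0, while an arg_bnd call using 1 is left alone.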

Thanks,
Ilya
--

gcc/

2013-10-28  Ilya Enkovich  

* tree-into-ssa.c: Include "target.h"
(rewrite_update_stmt): Skip BUILT_IN_CHKP_ARG_BND calls.
* tree-ssa-dom.c: Include "target.h"
(cprop_into_stmt): Skip BUILT_IN_CHKP_ARG_BND calls.


diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
index 981e9f4..8d48f6d 100644
--- a/gcc/tree-into-ssa.c
+++ b/gcc/tree-into-ssa.c
@@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "diagnostic-core.h"
 #include "tree-into-ssa.h"
+#include "target.h"
 
 
 /* This file builds the SSA form for a function as described in:
@@ -1921,8 +1922,14 @@ rewrite_update_stmt (gimple stmt, gimple_stmt_iterator gsi)
 }
 
   /* Rewrite USES included in OLD_SSA_NAMES and USES whose underlying
- symbol is marked for renaming.  */
-  if (rewrite_uses_p (stmt))
+ symbol is marked for renaming.
+ Skip calls to BUILT_IN_CHKP_ARG_BND whose arg should never be
+ renamed.  */
+  if (rewrite_uses_p (stmt)
+  && !(flag_check_pointer_bounds
+  && (gimple_code (stmt) == GIMPLE_CALL)
+  && gimple_call_fndecl (stmt)
+  == targetm.builtin_chkp_function (BUILT_IN_CHKP_ARG_BND)))
 {
   if (is_gimple_debug (stmt))
{
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 211bfcf..445278a 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "params.h"
 #include "tree-ssa-threadedge.h"
 #include "tree-ssa-dom.h"
+#include "target.h"
 
 /* This file implements optimizations on the dominator tree.  */
 
@@ -2266,6 +2267,16 @@ cprop_into_stmt (gimple stmt)
   use_operand_p op_p;
   ssa_op_iter iter;
 
+  /* Call used to obtain bounds of input arg by Pointer Bounds Checker
+ should not be optimized.  Argument of the call is a default
+ SSA_NAME of PARM_DECL.  It should never be replaced by value.  */
+  if (flag_check_pointer_bounds && gimple_code (stmt) == GIMPLE_CALL)
+{
+  tree fndecl = gimple_call_fndecl (stmt);
+  if (fndecl == targetm.builtin_chkp_function (BUILT_IN_CHKP_ARG_BND))
+   return;
+}
+
   FOR_EACH_SSA_USE_OPERAND (op_p, stmt, iter, SSA_OP_USE)
 cprop_operand (stmt, op_p);
 }

[PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-10-31 Thread Ilya Enkovich

Hi,

This patch adds Pointer Bounds Checker support to the c-family and LTO
front ends.  The main purpose of the front-end changes is to register all
statically initialized objects with the checker.  We also need to register such
objects when they are created by the compiler.

Thanks,
Ilya
--

gcc/

2013-10-29  Ilya Enkovich  

* c/c-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
* cp/cp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
* objc/objc-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
* objcp/objcp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
* lto/lto-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
* c/c-parser.c (c_parser_declaration_or_fndef): Register statically
initialized decls in Pointer Bounds Checker.
* cp/decl.c (cp_finish_decl): Likewise.
* gimplify.c (gimplify_init_constructor): Likewise.


diff --git a/gcc/c/c-lang.c b/gcc/c/c-lang.c
index 614c46d..a32bc6b 100644
--- a/gcc/c/c-lang.c
+++ b/gcc/c/c-lang.c
@@ -43,6 +43,8 @@ enum c_language_kind c_language = clk_c;
 #define LANG_HOOKS_INIT c_objc_common_init
 #undef LANG_HOOKS_INIT_TS
 #define LANG_HOOKS_INIT_TS c_common_init_ts
+#undef LANG_HOOKS_CHKP_SUPPORTED
+#define LANG_HOOKS_CHKP_SUPPORTED true
 
 /* Each front end provides its own lang hook initializer.  */
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9ccae3b..65d83c8 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1682,6 +1682,12 @@ c_parser_declaration_or_fndef (c_parser *parser, bool fndef_ok,
  maybe_warn_string_init (TREE_TYPE (d), init);
  finish_decl (d, init_loc, init.value,
   init.original_type, asm_name);
+
+ /* Register all decls with initializers in Pointer
+Bounds Checker to generate required static bounds
+initializers.  */
+ if (DECL_INITIAL (d) != error_mark_node)
+   chkp_register_var_initializer (d);
}
}
  else
diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
index a7fa8e4..6d138bd 100644
--- a/gcc/cp/cp-lang.c
+++ b/gcc/cp/cp-lang.c
@@ -81,6 +81,8 @@ static tree get_template_argument_pack_elems_folded (const_tree);
 #define LANG_HOOKS_EH_PERSONALITY cp_eh_personality
 #undef LANG_HOOKS_EH_RUNTIME_TYPE
 #define LANG_HOOKS_EH_RUNTIME_TYPE build_eh_type_type
+#undef LANG_HOOKS_CHKP_SUPPORTED
+#define LANG_HOOKS_CHKP_SUPPORTED true
 
 /* Each front end provides its own lang hook initializer.  */
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 1e92f2a..db40e75 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6379,6 +6379,12 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
 the class specifier.  */
  if (!DECL_EXTERNAL (decl))
var_definition_p = true;
+
+ /* If var has initializer then we need to register it in
+Pointer Bounds Checker to generate static bounds initializer
+if required.  */
+ if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
+   chkp_register_var_initializer (decl);
}
   /* If the variable has an array type, lay out the type, even if
 there is no initializer.  It is valid to index through the
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 4f52c27..503450f 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4111,6 +4111,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 
walk_tree (&ctor, force_labels_r, NULL, NULL);
ctor = tree_output_constant_def (ctor);
+
+   /* We need to register created constant object to
+  initialize bounds for pointers in it.  */
+   chkp_register_var_initializer (ctor);
+
if (!useless_type_conversion_p (type, TREE_TYPE (ctor)))
  ctor = build1 (VIEW_CONVERT_EXPR, type, ctor);
TREE_OPERAND (*expr_p, 1) = ctor;
diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index 0fa0fc9..b6073d9 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -1278,6 +1278,8 @@ static void lto_init_ts (void)
 
 #undef LANG_HOOKS_INIT_TS
 #define LANG_HOOKS_INIT_TS lto_init_ts
+#undef LANG_HOOKS_CHKP_SUPPORTED
+#define LANG_HOOKS_CHKP_SUPPORTED true
 
 struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
 
diff --git a/gcc/objc/objc-lang.c b/gcc/objc/objc-lang.c
index bc0008b..5e7e43b 100644
--- a/gcc/objc/objc-lang.c
+++ b/gcc/objc/objc-lang.c
@@ -49,6 +49,8 @@ enum c_language_kind c_language = clk_objc;
 #define LANG_HOOKS_GIMPLIFY_EXPR objc_gimplify_expr
 #undef LANG_HOOKS_INIT_TS
 #define LANG_HOOKS_INIT_TS objc_common_init_ts
+#undef LANG_HOOKS_CHKP_SUPPORTED
+#define LANG_HOOKS_CHKP_SUPPORTED true
 
 /* Each front end provides its own lang hook initializer.  */
 struct lang_hook

[PATCH, MPX, 2/X] Pointers Checker [9/25] Bound constants

2013-10-31 Thread Ilya Enkovich

Hi,

Here is a patch which adds support for bound constants to be used as
DECL_INITIAL for constant static bounds generated by the compiler.

Thanks,
Ilya
--

gcc/

2013-10-23  Ilya Enkovich  

* emit-rtl.c (immed_double_const): Support MODE_POINTER_BOUNDS.
* explow.c (trunc_int_for_mode): Likewise.
* varpool.c (ctor_for_folding): Do not fold constant
bounds vars.


diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index b0fc846..5d13b69 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -538,7 +538,8 @@ immed_double_const (HOST_WIDE_INT i0, HOST_WIDE_INT i1, enum machine_mode mode)
  || GET_MODE_CLASS (mode) == MODE_PARTIAL_INT
  /* We can get a 0 for an error mark.  */
  || GET_MODE_CLASS (mode) == MODE_VECTOR_INT
- || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT);
+ || GET_MODE_CLASS (mode) == MODE_VECTOR_FLOAT
+ || GET_MODE_CLASS (mode) == MODE_POINTER_BOUNDS);
 
   if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT)
return gen_int_mode (i0, mode);
diff --git a/gcc/explow.c b/gcc/explow.c
index f278e29..095434f 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -52,7 +52,8 @@ trunc_int_for_mode (HOST_WIDE_INT c, enum machine_mode mode)
   int width = GET_MODE_PRECISION (mode);
 
   /* You want to truncate to a _what_?  */
-  gcc_assert (SCALAR_INT_MODE_P (mode));
+  gcc_assert (SCALAR_INT_MODE_P (mode)
+ || POINTER_BOUNDS_MODE_P (mode));
 
   /* Canonicalize BImode to 0 and STORE_FLAG_VALUE.  */
   if (mode == BImode)
diff --git a/gcc/varpool.c b/gcc/varpool.c
index 2eb1fc1..d9c08c1 100644
--- a/gcc/varpool.c
+++ b/gcc/varpool.c
@@ -254,6 +254,12 @@ ctor_for_folding (tree decl)
   && TREE_CODE (decl) != CONST_DECL)
 return error_mark_node;
 
+  /* Static constant bounds are created to be
+ used instead of constants, so do not
+ allow folding them.  */
+  if (POINTER_BOUNDS_P (decl))
+return error_mark_node;
+
   if (TREE_CODE (decl) == CONST_DECL
   || DECL_IN_CONSTANT_POOL (decl))
 return DECL_INITIAL (decl);

[PATCH, MPX, 2/X] Pointers Checker [10/25] Calls copy and verification

2013-10-31 Thread Ilya Enkovich

Hi,

Here is a patch to support instrumented code in the call verifiers and in
copying calls with skipped args.
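
A minimal sketch of the index handling in the copy with skipped args, assuming that a bounds arg always directly follows the pointer it describes and shares that pointer's bit in the skip set. The skip set is modelled as a plain unsigned mask instead of a bitmap, and struct toy_arg and toy_copy_skip_args are hypothetical names for this example only.

```c
#include <assert.h>

#define TOY_MAX_ARGS 16

struct toy_arg
{
  int is_bounds;
  int value;
};

/* Copy the NARGS args in IN to OUT, dropping every arg whose bit is
   set in SKIP_MASK.  Bits are numbered over non-bound args only; a
   bounds arg reuses the bit of the arg right before it, so a pointer
   and its bounds are kept or dropped together.  Return the number of
   args copied.  */
static unsigned
toy_copy_skip_args (const struct toy_arg *in, unsigned nargs,
		    unsigned skip_mask, struct toy_arg *out)
{
  unsigned i, bit, nout = 0;

  assert (nargs <= TOY_MAX_ARGS);
  for (i = 0, bit = 0; i < nargs; i++, bit++)
    {
      if (in[i].is_bounds)
	{
	  /* Assumption: bounds never come first, so there is always a
	     preceding arg whose bit can be reused.  */
	  assert (bit > 0);
	  --bit;
	}
      if (!(skip_mask & (1u << bit)))
	out[nout++] = in[i];
    }
  return nout;
}
```

For args (ptr, bounds, size), mask 1 drops both ptr and its bounds, while mask 2 drops only size.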

Thanks,
Ilya
--

gcc/

2013-10-29  Ilya Enkovich  

* cgraph.c (gimple_check_call_args): Handle bound args.
* gimple.c (gimple_call_copy_skip_args): Likewise.
(validate_call): Likewise.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 52d9ab0..9d7ae85 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3030,40 +3030,54 @@ gimple_check_call_args (gimple stmt, tree fndecl, bool args_count_match)
 {
   for (i = 0, p = DECL_ARGUMENTS (fndecl);
   i < nargs;
-  i++, p = DECL_CHAIN (p))
+  i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* We cannot distinguish a varargs function from the case
 of excess parameters, still deferring the inlining decision
 to the callee is possible.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (p == error_mark_node
  || arg == error_mark_node
  || (!types_compatible_p (DECL_ARG_TYPE (p), TREE_TYPE (arg))
  && !fold_convertible_p (DECL_ARG_TYPE (p), arg)))
 return false;
+
+ p = DECL_CHAIN (p);
}
   if (args_count_match && p)
return false;
 }
   else if (parms)
 {
-  for (i = 0, p = parms; i < nargs; i++, p = TREE_CHAIN (p))
+  for (i = 0, p = parms; i < nargs; i++)
{
- tree arg;
+ tree arg = gimple_call_arg (stmt, i);
+
+ /* Skip bound args inserted by Pointer Bounds Checker.  */
+ if (POINTER_BOUNDS_P (arg))
+   continue;
+
  /* If this is a varargs function defer inlining decision
 to callee.  */
  if (!p)
break;
- arg = gimple_call_arg (stmt, i);
+
  if (TREE_VALUE (p) == error_mark_node
  || arg == error_mark_node
  || TREE_CODE (TREE_VALUE (p)) == VOID_TYPE
  || (!types_compatible_p (TREE_VALUE (p), TREE_TYPE (arg))
  && !fold_convertible_p (TREE_VALUE (p), arg)))
 return false;
+
+ p = TREE_CHAIN (p);
}
 }
   else
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 20f6010..dc85bf8 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -3048,15 +3048,20 @@ canonicalize_cond_expr_cond (tree t)
 gimple
 gimple_call_copy_skip_args (gimple stmt, bitmap args_to_skip)
 {
-  int i;
+  int i, bit;
   int nargs = gimple_call_num_args (stmt);
   vec<tree> vargs;
   vargs.create (nargs);
   gimple new_stmt;
 
-  for (i = 0; i < nargs; i++)
-if (!bitmap_bit_p (args_to_skip, i))
-  vargs.quick_push (gimple_call_arg (stmt, i));
+  for (i = 0, bit = 0; i < nargs; i++, bit++)
+  if (POINTER_BOUNDS_P (gimple_call_arg (stmt, i)))
+   {
+ if (!bitmap_bit_p (args_to_skip, --bit))
+   vargs.quick_push (gimple_call_arg (stmt, i));
+   }
+  else if (!bitmap_bit_p (args_to_skip, bit))
+ vargs.quick_push (gimple_call_arg (stmt, i));
 
   if (gimple_call_internal_p (stmt))
 new_stmt = gimple_build_call_internal_vec (gimple_call_internal_fn (stmt),
@@ -3702,6 +3707,9 @@ validate_call (gimple stmt, tree fndecl)
   if (!targs)
return true;
   tree arg = gimple_call_arg (stmt, i);
+  /* Skip bounds.  */
+  if (flag_check_pointer_bounds && POINTER_BOUNDS_P (arg))
+   continue;
   if (INTEGRAL_TYPE_P (TREE_TYPE (arg))
  && INTEGRAL_TYPE_P (TREE_VALUE (targs)))
;

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-05 Thread Ilya Enkovich

2013/11/4 Richard Biener :
> Richard Biener  wrote:
>>On Thu, Oct 31, 2013 at 10:02 AM, Ilya Enkovich
>> wrote:
>>> Hi,
>>>
>>> Here is a patch which handles the problem with optimization of
>>> BUILT_IN_CHKP_ARG_BND calls.  Pointer Bounds Checker expects that the
>>> argument of this call is the default SSA_NAME of the PARM_DECL whose
>>> bounds we want to get.  The problem is in optimizations which may
>>> replace the arg with its copy or a known value.  This patch suppresses
>>> such modifications.
>>
>>This doesn't seem like a good fix.  I suppose you require the same on
>>RTL, that is, have the incoming arg reg coalesced with the use?
>>In that case better set SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
>
> Btw, I would have chosen
>
> P_2 = __builtin_xyz (p_1, bound)
> Call (p_2)
>
> Thus make the builtins a transparent pass-through which effectively binds
> each parameter to its bound, removing the need for the artificial arguments and
> making propagation a non-issue.
>
> Also, how does the feature interact with other extensions such as nested 
> functions or optimizations like inlining?

In RTL all incoming bounds are materialized into the slot where bounds are
passed, and the arg_bnd call is expanded into this slot.  Thus in RTL a bound
arg looks more like a regular arg.

If I just set the abnormal phi flag for the SSA_NAME, the SSA verifier should fail
because the name is not used in an abnormal phi, shouldn't it?  Also, it would
prevent all optimizations for these SSA_NAMEs, right?  Instrumentation
is performed at an earlier stage, right after we build SSA.  I think
using the abnormal phi flag and binding pointers with bounds via calls
would prevent some useful optimizations.

Many interprocedural optimizations require some support when working with
instrumented calls.  Inlining support includes:
  - replacement of arg_bnd calls with actual bounds passed to the inlined call
  - replacement of retbnd call with bounds, returned by inlined function

Not all IPA passes are fully enabled right now.  E.g. I restrict
bounded value propagation in ipa-prop and bounded args in functions
generated by ipa-split.  Such features will be enabled later.

For nested functions I do not see much difference from the checker's point
of view.  A nested function just has an additional static chain param.  Probably I
am missing something here.  I did just a few tests with nested functions.

Ilya

>
> Richard.
>
>
>>Richard.
>>
>>> Thanks,
>>> Ilya
>>> --
>>>
>>> gcc/
>>>
>>> 2013-10-28  Ilya Enkovich  
>>>
>>> * tree-into-ssa.c: Include "target.h"
>>> (rewrite_update_stmt): Skip BUILT_IN_CHKP_ARG_BND calls.
>>> * tree-ssa-dom.c: Include "target.h"
>>> (cprop_into_stmt): Skip BUILT_IN_CHKP_ARG_BND calls.
>>>
>>>
>>> diff --git a/gcc/tree-into-ssa.c b/gcc/tree-into-ssa.c
>>> index 981e9f4..8d48f6d 100644
>>> --- a/gcc/tree-into-ssa.c
>>> +++ b/gcc/tree-into-ssa.c
>>> @@ -46,6 +46,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "params.h"
>>>  #include "diagnostic-core.h"
>>>  #include "tree-into-ssa.h"
>>> +#include "target.h"
>>>
>>>
>>>  /* This file builds the SSA form for a function as described in:
>>> @@ -1921,8 +1922,14 @@ rewrite_update_stmt (gimple stmt,
>>gimple_stmt_iterator gsi)
>>>  }
>>>
>>>/* Rewrite USES included in OLD_SSA_NAMES and USES whose
>>underlying
>>> - symbol is marked for renaming.  */
>>> -  if (rewrite_uses_p (stmt))
>>> + symbol is marked for renaming.
>>> + Skip calls to BUILT_IN_CHKP_ARG_BND whose arg should never be
>>> + renamed.  */
>>> +  if (rewrite_uses_p (stmt)
>>> +  && !(flag_check_pointer_bounds
>>> +  && (gimple_code (stmt) == GIMPLE_CALL)
>>> +  && gimple_call_fndecl (stmt)
>>> +  == targetm.builtin_chkp_function (BUILT_IN_CHKP_ARG_BND)))
>>>  {
>>>if (is_gimple_debug (stmt))
>>> {
>>> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
>>> index 211bfcf..445278a 100644
>>> --- a/gcc/tree-ssa-dom.c
>>> +++ b/gcc/tree-ssa-dom.c
>>> @@ -45,6 +45,7 @@ along with GCC; see the file COPYING3.  If not see
>>>  #include "params.h"
>>>  #include "tree-ssa-threadedge.h"
>>>  #include "tree-ssa-dom.h"
>>> +#include "target.h"
>>>
>>>  /* This file implements optimizations on the dominator tree.  */
>>>
>>> @@ -2266,6 +2267,16 @@ cprop_into_stmt (gimple stmt)
>>>use_operand_p op_p;
>>>ssa_op_iter iter;
>>>
>>> +  /* Call used to obtain bounds of input arg by Pointer Bounds
>>Checker
>>> + should not be optimized.  Argument of the call is a default
>>> + SSA_NAME of PARM_DECL.  It should never be replaced by value.
>>*/
>>> +  if (flag_check_pointer_bounds && gimple_code (stmt) ==
>>GIMPLE_CALL)
>>> +{
>>> +  tree fndecl = gimple_call_fndecl (stmt);
>>> +  if (fndecl == targetm.builtin_chkp_function
>>(BUILT_IN_CHKP_ARG_BND))
>>> +   return;
>>> +}
>>> +
>>>FOR_EACH_SSA_USE_OPERAND (op_p, stmt, iter, SSA_OP_USE)
>>>  cprop_operand (stmt, op_p);
>>>  }
>
>

Re: [PATCH, MPX, 2/X] Pointers Checker [7/25] Suppress BUILT_IN_CHKP_ARG_BND optimizations.

2013-11-05 Thread Ilya Enkovich

2013/11/5 Richard Biener :
> On Tue, Nov 5, 2013 at 1:02 PM, Ilya Enkovich  wrote:
>> 2013/11/4 Richard Biener :
>>> Richard Biener  wrote:
>>>>On Thu, Oct 31, 2013 at 10:02 AM, Ilya Enkovich
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> Here is a patch which handles the problem with optimization of
>>>>> BUILT_IN_CHKP_ARG_BND calls.  Pointer Bounds Checker expects that the
>>>>> argument of this call is the default SSA_NAME of the PARM_DECL whose
>>>>> bounds we want to get.  The problem is in optimizations which may
>>>>> replace the arg with its copy or a known value.  This patch suppresses
>>>>> such modifications.
>>>>
>>>>This doesn't seem like a good fix.  I suppose you require the same on
>>>>RTL, that is, have the incoming arg reg coalesced with the use?
>>>>In that case better set SSA_NAME_OCCURS_IN_ABNORMAL_PHI.
>>>
>>> Btw, I would have chosen
>>>
>>> P_2 = __builtin_xyz (p_1, bound)
>>> Call (p_2)
>>>
>>> Thus make the builtins a transparent pass-through which effectively binds
>>> each parameter to its bound, removing the need for the artificial arguments
>>> and making propagation a non-issue.
>>>
>>> Also, how does the feature interact with other extensions such as nested 
>>> functions or optimizations like inlining?
>>
>> In RTL all incoming bounds are materialized into slot where bounds are
>> passed and arg_bnd call is expanded into this slot.  Thus in RTL bound
>> arg looks more like a regular arg.
>
> I don't care so much for RTL, but this description hints at that the
> suggestion above would work (and it would eliminate all my concerns about
> the representation on the GIMPLE level - you'd not even need this
> strange POINTER_BOUNDS_TYPE as far as I can see.
>
>> If I just set abnormal phi flag for SSA_NAME, SSA verifier should fail
>> because it does not used in abnormal phi, shouldn't it?  Also it would
>> prevent all optimizations for these SSA_NAMEs right?  Instrumentation
>> is performed on the earlier stage, right after we build SSA. I think
>> using abnormal phi flag and binding pointers with bounds via calls
>> would prevent some useful optimizations.
>
> Well, what are the constraints that you need to avoid propagation in
> the first place?

For input parameter P I need to have
  BOUNDS = __builtin_arg_bnd (P)
to somehow refer to the bounds of P in GIMPLE.  Optimizations may modify
__builtin_arg_bnd (P), replacing P with its copy or some value.  That
makes the call useless because it removes the information about which
parameter's bounds we refer to.  I want such optimizations to ignore
__builtin_arg_bnd calls and always leave the default SSA_NAME of the
PARM_DECL there as the arg.

>
>> Many interprocedural optimizations require some support when work with
>> instrumented calls.  Inlining support includes:
>>   - replacement of arg_bnd calls with actual bounds passed to the inlined 
>> call
>>   - replacement of retbnd call with bounds, returned by inlined function
>>
>> Not all IPA passes are fully enabled right now.  E.g. I restrict
>> bounded value propagation in ipa-prop and bounded args in functions
>> generated by ipa-split.  Such features will be enabled later.
>>
>> For nested functions I do not see much difference from checker point
>> of view.  It just has an additional static chain param. Probably I
>> miss here something.  I did just few tests with nested functions.
>
> I was thinking of
>
> int foo (char *s)
> {
>int bar (void)
>{
>   ... use bound of 's' of the containing function ...
>   foo (q, and pass it along here for q)
>}
> }
>
> that is references to the containing functions parameters and their

All of foo's locals referenced by the nested bar here are placed into a special
structure, and all bounds should be stored in the appropriate entries of the
Bound Table.  The nested function may refer to them via this table.

Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Richard.
>>>
>>>
>>>>Richard.
>>>>
>>>>> Thanks,
>>>>> Ilya
>>>>> --
>>>>>
>>>>> gcc/
>>>>>
>>>>> 2013-10-28  Ilya Enkovich  
>>>>>
>>>>> * tree-into-ssa.c: Include "target.h"
>>>>> (rewrite_update_stmt): Skip BUILT_IN_CHKP_ARG_BND calls.
>>>>> * tree-ssa-dom.c: Include "target.h"
>>>>> (cprop_into_stmt): Ski

Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-05 Thread Ilya Enkovich

2013/11/2 Joseph S. Myers :
> On Thu, 31 Oct 2013, Ilya Enkovich wrote:
>
>> This patch adds Pointer Bounds Checker support to the c-family and LTO
>> front ends.  The main purpose of the front-end changes is to register all
>> statically initialized objects with the checker.  We also need to register
>> such objects when they are created by the compiler.
>
> What happens with statically initialized TLS objects?  It would seem that
> you need something like how the C++ front end handles static constructors
> for C++11 thread-local objects, but I don't see anything obvious for that
> here.

This patch takes care of pointers initialized by the linker.  TLS objects
are dynamically allocated for each thread and should have some
constructor to perform initialization.  And if there is such a
constructor, then it should be instrumented by Pointer Bounds Checker.
Therefore it should not require additional changes in the front end,
right?

The only problem I now see for TLS objects is that my constructors
initialize bounds for an actual TLS object, not for the object in .tdata.
How can I refer to .tdata objects in GIMPLE to get correct bounds
initialization for the .tdata section?

Thanks,
Ilya

>
> --
> Joseph S. Myers
> jos...@codesourcery.com

Re: [PATCH, MPX, 2/X] Pointers Checker [8/25] Languages support

2013-11-05 Thread Ilya Enkovich

2013/11/4 Richard Biener :
> On Thu, Oct 31, 2013 at 10:11 AM, Ilya Enkovich  
> wrote:
>> Hi,
>>
>> This patch adds Pointer Bounds Checker support to the c-family and LTO
>> front ends.  The main purpose of the front-end changes is to register all
>> statically initialized objects with the checker.  We also need to register such
>> objects when they are created by the compiler.

LTO is quite a specific front end.  For LTO, the Pointer Bounds Checker
support macro just means it accepts instrumented code as input, because
all instrumentation is performed before the code is streamed out for LTO.

Ilya

>
> You define CHKP as supported in LTO.  That means it has to be supported
> for all languages and thus you should drop the langhook.
>
> Richard.
>
>> Thanks,
>> Ilya
>> --
>>
>> gcc/
>>
>> 2013-10-29  Ilya Enkovich  
>>
>> * c/c-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
>> * cp/cp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
>> * objc/objc-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
>> * objcp/objcp-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
>> * lto/lto-lang.c (LANG_HOOKS_CHKP_SUPPORTED): New.
>> * c/c-parser.c (c_parser_declaration_or_fndef): Register statically
>> initialized decls in Pointer Bounds Checker.
>> * cp/decl.c (cp_finish_decl): Likewise.
>> * gimplify.c (gimplify_init_constructor): Likewise.
>>
>>
>> diff --git a/gcc/c/c-lang.c b/gcc/c/c-lang.c
>> index 614c46d..a32bc6b 100644
>> --- a/gcc/c/c-lang.c
>> +++ b/gcc/c/c-lang.c
>> @@ -43,6 +43,8 @@ enum c_language_kind c_language = clk_c;
>>  #define LANG_HOOKS_INIT c_objc_common_init
>>  #undef LANG_HOOKS_INIT_TS
>>  #define LANG_HOOKS_INIT_TS c_common_init_ts
>> +#undef LANG_HOOKS_CHKP_SUPPORTED
>> +#define LANG_HOOKS_CHKP_SUPPORTED true
>>
>>  /* Each front end provides its own lang hook initializer.  */
>>  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
>> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
>> index 9ccae3b..65d83c8 100644
>> --- a/gcc/c/c-parser.c
>> +++ b/gcc/c/c-parser.c
>> @@ -1682,6 +1682,12 @@ c_parser_declaration_or_fndef (c_parser *parser, bool 
>> fndef_ok,
>>   maybe_warn_string_init (TREE_TYPE (d), init);
>>   finish_decl (d, init_loc, init.value,
>>init.original_type, asm_name);
>> +
>> + /* Register all decls with initializers in Pointer
>> +Bounds Checker to generate required static bounds
>> +initializers.  */
>> + if (DECL_INITIAL (d) != error_mark_node)
>> +   chkp_register_var_initializer (d);
>> }
>> }
>>   else
>> diff --git a/gcc/cp/cp-lang.c b/gcc/cp/cp-lang.c
>> index a7fa8e4..6d138bd 100644
>> --- a/gcc/cp/cp-lang.c
>> +++ b/gcc/cp/cp-lang.c
>> @@ -81,6 +81,8 @@ static tree get_template_argument_pack_elems_folded 
>> (const_tree);
>>  #define LANG_HOOKS_EH_PERSONALITY cp_eh_personality
>>  #undef LANG_HOOKS_EH_RUNTIME_TYPE
>>  #define LANG_HOOKS_EH_RUNTIME_TYPE build_eh_type_type
>> +#undef LANG_HOOKS_CHKP_SUPPORTED
>> +#define LANG_HOOKS_CHKP_SUPPORTED true
>>
>>  /* Each front end provides its own lang hook initializer.  */
>>  struct lang_hooks lang_hooks = LANG_HOOKS_INITIALIZER;
>> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
>> index 1e92f2a..db40e75 100644
>> --- a/gcc/cp/decl.c
>> +++ b/gcc/cp/decl.c
>> @@ -6379,6 +6379,12 @@ cp_finish_decl (tree decl, tree init, bool 
>> init_const_expr_p,
>>  the class specifier.  */
>>   if (!DECL_EXTERNAL (decl))
>> var_definition_p = true;
>> +
>> + /* If var has initilizer then we need to register it in
>> +Pointer Bounds Checker to generate static bounds initilizer
>> +if required.  */
>> + if (DECL_INITIAL (decl) && DECL_INITIAL (decl) != error_mark_node)
>> +   chkp_register_var_initializer (decl);
>> }
>>/* If the variable has an array type, lay out the type, even if
>>  there is no initializer.  It is valid to index through the
>> diff --git a/gcc/gimplify.c b/gcc/gimplify.c
>> index 4f52c27..503450f 100644
>> --- a/gcc/gimplify.c
>> +++ b/gcc/gimplify.c
>> @@ -4111,6 +4111,11 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
>> *pre_p, gimple_seq *post_p,
>>
>> walk_tree (&ctor, force_labels_r, NULL, NULL);

Re: -z bndplt documentation in GCC manual

2016-01-20 Thread Ilya Enkovich

2016-01-20 3:42 GMT+03:00 Sandra Loosemore :
> On 01/19/2016 03:24 AM, Ilya Enkovich wrote:
>>
>> 2016-01-19 5:25 GMT+03:00 Sandra Loosemore :
>>>
>>> I think the documentation relating to '-z bndplt' in the GCC manual
>>> description of -fcheck-pointer-bounds is incorrect.  It looks like, as of
>>> r225862, the GCC driver is supposed to emit an error message if GCC was
>>> configured with a linker that doesn't support this option and you pass
>>> -mmpx
>>> without -static.  Is that right?  I'll fix the documentation once I'm
>>> clear
>>> on what the actual behavior is.
>>
>>
>> The compiler just emits a note warning the user that the GCC configuration
>> may lead to decreased instrumentation coverage.
>
>
> OK.  Is the attached patch accurate?  The existing text has several
> markup/grammatical/spelling errors and I'd like to simplify it to make it
> less repetitive and more direct and user-friendly.

I think your text accurately describes the situation. Thanks!

Ilya

>
> (BTW, part of the problem I had parsing the code is that the manual doesn't
> document the %n spec file syntax, or several other % escapes.  I opened
> PR69367 for that since I have too many other things in my pile to get to it
> any time soon.)
>
> -Sandra

Re: [PATCH] Require non-x32 target for compile-time MPX tests

2016-01-20 Thread Ilya Enkovich

2016-01-20 8:29 GMT+03:00 H.J. Lu :
> Compile-time MPX tests don't need the MPX run-time library.  They
> should pass for non-x32 target.
>
> OK for trunk and backport to GCC 5 branch?

This patch is OK.

Thanks,
Ilya

>
> H.J.
> ---
> Compile-time MPX tests don't need the MPX run-time library.  They
> should pass for non-x32 target.
>
> PR testsuite/69369
> * g++.dg/pr63995-1.C: Require non-x32 target, instead of,
> the MPX run-time library, for compile-time MPX test.
> * gcc.target/i386/chkp-always_inline.c: Likewise.
> * gcc.target/i386/chkp-bndret.c: Likewise.
> * gcc.target/i386/chkp-builtins-1.c: Likewise.
> * gcc.target/i386/chkp-builtins-2.c: Likewise.
> * gcc.target/i386/chkp-builtins-3.c: Likewise.
> * gcc.target/i386/chkp-builtins-4.c: Likewise.
> * gcc.target/i386/chkp-const-check-1.c: Likewise.
> * gcc.target/i386/chkp-const-check-2.c: Likewise.
> * gcc.target/i386/chkp-hidden-def.c: Likewise.
> * gcc.target/i386/chkp-label-address.c: Likewise.
> * gcc.target/i386/chkp-lifetime-1.c: Likewise.
> * gcc.target/i386/chkp-narrow-bounds.c: Likewise.
> * gcc.target/i386/chkp-pr69044.c: Likewise.
> * gcc.target/i386/chkp-remove-bndint-1.c: Likewise.
> * gcc.target/i386/chkp-remove-bndint-2.c: Likewise.
> * gcc.target/i386/chkp-strchr.c: Likewise.
> * gcc.target/i386/chkp-strlen-1.c: Likewise.
> * gcc.target/i386/chkp-strlen-2.c: Likewise.
> * gcc.target/i386/chkp-strlen-3.c: Likewise.
> * gcc.target/i386/chkp-strlen-4.c: Likewise.
> * gcc.target/i386/chkp-strlen-5.c: Likewise.
> * gcc.target/i386/chkp-stropt-1.c: Likewise.
> * gcc.target/i386/chkp-stropt-10.c: Likewise.
> * gcc.target/i386/chkp-stropt-11.c: Likewise.
> * gcc.target/i386/chkp-stropt-12.c: Likewise.
> * gcc.target/i386/chkp-stropt-13.c: Likewise.
> * gcc.target/i386/chkp-stropt-14.c: Likewise.
> * gcc.target/i386/chkp-stropt-15.c: Likewise.
> * gcc.target/i386/chkp-stropt-16.c: Likewise.
> * gcc.target/i386/chkp-stropt-2.c: Likewise.
> * gcc.target/i386/chkp-stropt-3.c: Likewise.
> * gcc.target/i386/chkp-stropt-4.c: Likewise.
> * gcc.target/i386/chkp-stropt-5.c: Likewise.
> * gcc.target/i386/chkp-stropt-6.c: Likewise.
> * gcc.target/i386/chkp-stropt-7.c: Likewise.
> * gcc.target/i386/chkp-stropt-8.c: Likewise.
> * gcc.target/i386/chkp-stropt-9.c: Likewise.
> * gcc.target/i386/pr63995-2.c: Likewise.
> * gcc.target/i386/pr64805.c: Likewise.
> * gcc.target/i386/pr65044.c: Likewise.
> * gcc.target/i386/pr65167.c: Likewise.
> * gcc.target/i386/pr65183.c: Likewise.
> * gcc.target/i386/pr65184.c: Likewise.
> * gcc.target/i386/thunk-retbnd.c: Likewise.
> ---
>  gcc/testsuite/g++.dg/pr63995-1.C | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-always_inline.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-bndret.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-builtins-1.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-builtins-2.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-builtins-3.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-builtins-4.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-const-check-1.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-const-check-2.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-hidden-def.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-label-address.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-lifetime-1.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-narrow-bounds.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-pr69044.c | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-remove-bndint-1.c | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-remove-bndint-2.c | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strchr.c  | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strlen-1.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strlen-2.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strlen-3.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strlen-4.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-strlen-5.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-1.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-10.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-11.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-12.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-13.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-14.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-15.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-16.c   | 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-2.c| 3 +--
>  gcc/testsuite/gcc.target/i386/chkp-stropt-3.c| 3 +--
>  gcc/te

Re: [PATCH] New version of libmpx with new memmove wrapper

2016-01-20 Thread Ilya Enkovich

2016-01-20 16:20 GMT+03:00 Matthias Klose :
> On 11.12.2015 15:34, Ilya Enkovich wrote:
>>
>> I fixed it, bootstrapped, regtested and applied to trunk.  Here is
>> committed version.
>
>
> this left libmpx/libtool-version, which now is unused and outdated. Ok to
> remove?

OK if bootstrap passes.

Thanks,
Ilya

>
> Matthias
>

[libmpx, committed] Fix verbosity for error messages

2016-01-25 Thread Ilya Enkovich

Hi,

This is an obvious patch fixing the verbosity level used for part of the
error messages.  Bootstrapped on x86_64-pc-linux-gnu.  Applied to trunk and
gcc-5-branch.
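
For illustration only (not part of the patch): a minimal sketch in C of why
the level passed for the trailing "\n" matters.  It assumes the runtime
prints a string only when its level is enabled at the current verbosity and
that VERB_BR is more verbose than VERB_ERROR; model_write, report and the
buffer are hypothetical stand-ins, not the libmpx implementation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical verbosity levels, least to most verbose.  The names mirror
   those in the patch; the ordering is an assumption of this sketch.  */
enum verb_level { VERB_ERROR, VERB_INFO, VERB_BR, VERB_DEBUG };

static char out[256];                          /* stands in for the output stream */
static enum verb_level verbosity = VERB_ERROR; /* only errors are enabled */

/* Append STR to OUT only if LEVEL is enabled at the current verbosity.  */
static void
model_write (enum verb_level level, const char *str)
{
  if (level <= verbosity)
    strncat (out, str, sizeof out - strlen (out) - 1);
}

/* Emit one error line; NEWLINE_LEVEL is the level used for the "\n".  */
static const char *
report (enum verb_level newline_level)
{
  out[0] = '\0';
  model_write (VERB_ERROR, "trap at ip = 0x1234");
  model_write (newline_level, "\n");
  return out;
}

/* Usage: the newline is lost when it is written at a more verbose level.  */
void
model_verbosity_example (void)
{
  assert (strchr (report (VERB_BR), '\n') == NULL);
  assert (strchr (report (VERB_ERROR), '\n') != NULL);
}
```

With the message body at VERB_ERROR and the newline at VERB_BR, an
error-only verbosity setting would print the message without its line
terminator, which is what the one-line change avoids.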

Thanks,
Ilya
--
libmpx/

2016-01-20  Ilya Enkovich  

* mpxrt/mpxrt.c (handler): Fix verbosity for
error message.


diff --git a/libmpx/mpxrt/mpxrt.c b/libmpx/mpxrt/mpxrt.c
index bcdd3a6..b52906b 100644
--- a/libmpx/mpxrt/mpxrt.c
+++ b/libmpx/mpxrt/mpxrt.c
@@ -268,7 +268,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write_uint (VERB_ERROR, trapno, 10);
   __mpxrt_write (VERB_ERROR, ", ip = 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
-  __mpxrt_write (VERB_BR, "\n");
+  __mpxrt_write (VERB_ERROR, "\n");
   exit (255);
 }
   else
@@ -277,7 +277,7 @@ handler (int sig __attribute__ ((unused)),
   __mpxrt_write_uint (VERB_ERROR, trapno, 10);
   __mpxrt_write (VERB_ERROR, "! at 0x");
   __mpxrt_write_uint (VERB_ERROR, ip, 16);
-  __mpxrt_write (VERB_BR, "\n");
+  __mpxrt_write (VERB_ERROR, "\n");
   exit (255);
 }
 }

[PATCH, PR69421] Check vector types of COND_EXPR operands are compatible when vectorizing it

2016-01-25 Thread Ilya Enkovich

Hi,

This patch covers one more case when boolean operands get different
vectypes and we don't detect it.
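
For illustration only: a rough sketch in C of the kind of check the patch
adds, under the assumption that an operand may have no recorded vector type
(modeled as NULL) and that compatibility can be approximated by structural
equality.  struct model_vectype and both functions are hypothetical and are
not GCC's tree types or useless_type_conversion_p.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vector type: lane count and lane width.  */
struct model_vectype
{
  unsigned nunits;
  unsigned elem_bits;
};

/* Stand-in for a type-compatibility test; here plain structural equality.  */
static bool
model_compatible_p (const struct model_vectype *a,
                    const struct model_vectype *b)
{
  return a->nunits == b->nunits && a->elem_bits == b->elem_bits;
}

/* Accept the statement only if every operand that has a vector type
   (non-NULL) agrees with the statement's own vector type.  */
static bool
model_operands_ok (const struct model_vectype *stmt_type,
                   const struct model_vectype *then_type,
                   const struct model_vectype *else_type)
{
  if (then_type && !model_compatible_p (stmt_type, then_type))
    return false;
  if (else_type && !model_compatible_p (stmt_type, else_type))
    return false;
  return true;
}

/* Usage: a mismatching operand type makes the statement non-vectorizable.  */
void
model_operands_example (void)
{
  struct model_vectype mask8 = { 8, 8 }, mask2 = { 2, 64 };
  assert (model_operands_ok (&mask8, &mask8, NULL));
  assert (!model_operands_ok (&mask8, &mask2, &mask8));
}
```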

Bootstrapped and regtested on x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-01-25  Ilya Enkovich  

PR target/69421
* tree-vect-stmts.c (vectorizable_condition): Check vectype
of operands is compatible with a statement vectype.

gcc/testsuite/

2016-01-25  Ilya Enkovich  

PR target/69421
* gcc.dg/pr69421.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr69421.c b/gcc/testsuite/gcc.dg/pr69421.c
new file mode 100644
index 000..252e22c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69421.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+struct A { double a; };
+double a;
+
+void
+foo (_Bool *x)
+{
+  long i;
+  for (i = 0; i < 64; i++)
+{
+  struct A c;
+  x[i] = c.a || a;
+}
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1d2246d..ed2ce07 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7528,6 +7528,7 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
 
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
   int nunits = TYPE_VECTOR_SUBPARTS (vectype);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
 
   if (slp_node || PURE_SLP_STMT (stmt_info))
 ncopies = 1;
@@ -7547,9 +7548,17 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
 return false;
 
   gimple *def_stmt;
-  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt))
+  if (!vect_is_simple_use (then_clause, stmt_info->vinfo, &def_stmt, &dt,
+  &vectype1))
+return false;
+  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt,
+  &vectype2))
 return false;
-  if (!vect_is_simple_use (else_clause, stmt_info->vinfo, &def_stmt, &dt))
+
+  if (vectype1 && !useless_type_conversion_p (vectype, vectype1))
+return false;
+
+  if (vectype2 && !useless_type_conversion_p (vectype, vectype2))
 return false;
 
   masked = !COMPARISON_CLASS_P (cond_expr);

[PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich

Hi,

Currently the STV pass may require a stack realignment if any
transformation occurs, to enable SSE register spills/fills.
It appears it's invalid to increase stack alignment requirements
at this point.  Thus we have to either assume we need the stack to be
aligned if we are going to run the STV pass, or disable STV if the stack is
not properly aligned.  I suppose we shouldn't ignore an explicitly
requested stack alignment without being sure we really optimize
anything (and STV is not an optimization that is frequently applied).
So I think we may disable TARGET_STV for such cases, as Jakub
suggested.  This patch was bootstrapped and regtested on
x86_64-pc-linux-gnu.  OK for trunk?
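
For illustration only: a minimal sketch in C of the option handling
described above, assuming the preferred stack boundary is tracked in bits
and that -mpreferred-stack-boundary=N means 2^N bytes.  MODEL_MASK_STV,
boundary_bits and model_override_stv are hypothetical names; the real logic
lives in ix86_option_override_internal and is shown in the diff.

```c
#include <assert.h>

#define MODEL_MASK_STV 0x1u   /* hypothetical bit, not GCC's real value */

/* Convert an -mpreferred-stack-boundary=N argument to bits (2^N bytes).  */
static unsigned
boundary_bits (unsigned n)
{
  return (1u << n) * 8u;
}

/* FLAGS: current target flags.  EXPLICIT_MASK: bits the user set
   explicitly.  PREFERRED: preferred stack boundary in bits.  */
static unsigned
model_override_stv (unsigned flags, unsigned explicit_mask, unsigned preferred)
{
  /* Enable STV by default unless the user chose a setting explicitly.  */
  if (!(explicit_mask & MODEL_MASK_STV))
    flags |= MODEL_MASK_STV;
  /* A boundary below 64 bits cannot hold aligned 64-bit slots, so STV
     is switched off rather than the stack being realigned.  */
  if (preferred < 64)
    flags &= ~MODEL_MASK_STV;
  return flags;
}

/* Usage: boundary=2 (32 bits) disables STV, boundary=4 keeps it on.  */
void
model_override_example (void)
{
  assert (model_override_stv (0, 0, boundary_bits (2)) == 0);
  assert (model_override_stv (0, 0, boundary_bits (4)) == MODEL_MASK_STV);
}
```

In this model an explicitly requested STV is also cleared when the boundary
is too small, since the second test runs unconditionally.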

Thanks,
Ilya
--
gcc/

2016-01-27  Jakub Jelinek  
    Ilya Enkovich  

PR target/69454
* config/i386/i386.c (convert_scalars_to_vector): Remove
stack alignment fixes.
(ix86_option_override_internal): Disable TARGET_STV if stack
is not properly aligned.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

PR target/69454
* gcc.target/i386/pr69454-1.c: New test.
* gcc.target/i386/pr69454-2.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 34b57a4..9fb8db8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-   crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
+  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
+ stack realignment will be extra cost the pass doesn't take into
+ account and the pass can't realign the stack.  */
+  if (ix86_preferred_stack_boundary < 64)
+opts->x_target_flags &= ~MASK_STV;
   if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
   && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c 
b/gcc/testsuite/gcc.target/i386/pr69454-1.c
new file mode 100644
index 000..12ecfd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c 
b/gcc/testsuite/gcc.target/i386/pr69454-2.c
new file mode 100644
index 000..28bab93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */
+
+extern void fn2 ();
+long long a, b;
+
+void fn1 ()
+{
+  long long c = a;
+  a = b ^ a;
+  fn2 ();
+  a = c;
+}

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich

2016-01-27 18:43 GMT+03:00 H.J. Lu :
> On Wed, Jan 27, 2016 at 7:34 AM, Ilya Enkovich  wrote:
>> Hi,
>>
>> Currently the STV pass may require a stack realignment if any
>> transformation occurs, to enable SSE register spills/fills.
>> It appears it's invalid to increase stack alignment requirements
>> at this point.  Thus we have to either assume we need the stack to be
>> aligned if we are going to run the STV pass, or disable STV if the stack is
>> not properly aligned.  I suppose we shouldn't ignore an explicitly
>> requested stack alignment without being sure we really optimize
>> anything (and STV is not an optimization that is frequently applied).
>> So I think we may disable TARGET_STV for such cases, as Jakub
>> suggested.  This patch was bootstrapped and regtested on
>> x86_64-pc-linux-gnu.  OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2016-01-27  Jakub Jelinek  
>> Ilya Enkovich  
>>
>> PR target/69454
>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>> stack alignment fixes.
>> (ix86_option_override_internal): Disable TARGET_STV if stack
>> is not properly aligned.
>>
>> gcc/testsuite/
>>
>> 2016-01-27  Ilya Enkovich  
>>
>> PR target/69454
>> * gcc.target/i386/pr69454-1.c: New test.
>> * gcc.target/i386/pr69454-2.c: New test.
>>
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 34b57a4..9fb8db8 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>bitmap_obstack_release (NULL);
>>df_process_deferred_rescans ();
>>
>> -  /* Conversion means we may have 128bit register spills/fills
>> - which require aligned stack.  */
>> -  if (converted_insns)
>> -{
>> -  if (crtl->stack_alignment_needed < 128)
>> -   crtl->stack_alignment_needed = 128;
>> -  if (crtl->stack_alignment_estimated < 128)
>> -   crtl->stack_alignment_estimated = 128;
>> -}
>> -
>>return 0;
>>  }
>>
>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>if (!(opts_set->x_target_flags & MASK_STV))
>>  opts->x_target_flags |= MASK_STV;
>> +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>> + stack realignment will be extra cost the pass doesn't take into
>> + account and the pass can't realign the stack.  */
>> +  if (ix86_preferred_stack_boundary < 64)
>> +opts->x_target_flags &= ~MASK_STV;
>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>
> The right fix is
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index a03a515..62af55a 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>bitmap_obstack_release (NULL);
>df_process_deferred_rescans ();
>
> -  /* Conversion means we may have 128bit register spills/fills
> - which require aligned stack.  */
> -  if (converted_insns)
> -{
> -  if (crtl->stack_alignment_needed < 128)
> - crtl->stack_alignment_needed = 128;
> -  if (crtl->stack_alignment_estimated < 128)
> - crtl->stack_alignment_estimated = 128;
> -}
> -
>return 0;
>  }
>
> @@ -29300,8 +29290,10 @@ ix86_minimum_alignment (tree exp, machine_mode mode,
>  return align;
>
>/* Don't do dynamic stack realignment for long long objects with
> - -mpreferred-stack-boundary=2.  */
> -  if ((mode == DImode || (type && TYPE_MODE (type) == DImode))
> + -mpreferred-stack-boundary=2.  The STV pass uses SSE2 instructions
> + on DImode which needs 64-bit alignment for DImode.  */
> +  if (!(TARGET_STV && TARGET_SSE2 && optimize > 1)
> +  && (mode == DImode || (type && TYPE_MODE (type) == DImode))
>&& (!type || !TYPE_USER_ALIGN (type))
>&& (!decl || !DECL_USER_ALIGN (decl)))
>  return 32;
>

A DImode object doesn't mean STV will be applied, so you might just
ignore the preferred stack alignment for no reason.  'Right' here depends
on what is more important in such a case.

Thanks,
Ilya

>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich

On 27 Jan 16:44, Jakub Jelinek wrote:
> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
> >  opts->x_target_flags |= MASK_VZEROUPPER;
> >if (!(opts_set->x_target_flags & MASK_STV))
> >  opts->x_target_flags |= MASK_STV;
> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
> 
> The comment doesn't match the code, you disable STV only for
> -mpreferred-stack-boundary=2.

Thanks, here is an updated version.

Ilya
--
gcc/

2016-01-27  Jakub Jelinek  
Ilya Enkovich  

PR target/69454
* config/i386/i386.c (convert_scalars_to_vector): Remove
stack alignment fixes.
(ix86_option_override_internal): Disable TARGET_STV if stack
    is not properly aligned.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

PR target/69454
* gcc.target/i386/pr69454-1.c: New test.
* gcc.target/i386/pr69454-2.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 34b57a4..9fb8db8 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
   bitmap_obstack_release (NULL);
   df_process_deferred_rescans ();
 
-  /* Conversion means we may have 128bit register spills/fills
- which require aligned stack.  */
-  if (converted_insns)
-{
-  if (crtl->stack_alignment_needed < 128)
-   crtl->stack_alignment_needed = 128;
-  if (crtl->stack_alignment_estimated < 128)
-   crtl->stack_alignment_estimated = 128;
-}
-
   return 0;
 }
 
@@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
 opts->x_target_flags |= MASK_VZEROUPPER;
   if (!(opts_set->x_target_flags & MASK_STV))
 opts->x_target_flags |= MASK_STV;
+  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
+ stack realignment will be extra cost the pass doesn't take into
+ account and the pass can't realign the stack.  */
+  if (ix86_preferred_stack_boundary < 64)
+opts->x_target_flags &= ~MASK_STV;
   if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
   && !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
 opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-1.c 
b/gcc/testsuite/gcc.target/i386/pr69454-1.c
new file mode 100644
index 000..12ecfd3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-1.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -msse2 -mno-accumulate-outgoing-args 
-mpreferred-stack-boundary=2" } */
+
+typedef struct { long long w64[2]; } V128;
+extern V128* fn2(void);
+long long a;
+V128 b;
+void fn1() {
+  V128 *c = fn2();
+  c->w64[0] = a ^ b.w64[0];
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr69454-2.c 
b/gcc/testsuite/gcc.target/i386/pr69454-2.c
new file mode 100644
index 000..28bab93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr69454-2.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-O2 -mpreferred-stack-boundary=2" } */
+
+extern void fn2 ();
+long long a, b;
+
+void fn1 ()
+{
+  long long c = a;
+  a = b ^ a;
+  fn2 ();
+  a = c;
+}

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-27 Thread Ilya Enkovich

2016-01-27 19:18 GMT+03:00 H.J. Lu :
> On Wed, Jan 27, 2016 at 8:11 AM, Ilya Enkovich  wrote:
>> On 27 Jan 16:44, Jakub Jelinek wrote:
>>> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
>>> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>> >  opts->x_target_flags |= MASK_VZEROUPPER;
>>> >if (!(opts_set->x_target_flags & MASK_STV))
>>> >  opts->x_target_flags |= MASK_STV;
>>> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>>>
>>> The comment doesn't match the code, you disable STV only for
>>> -mpreferred-stack-boundary=2.
>>
>> Thanks, here is an updated version.
>>
>> Ilya
>> --
>> gcc/
>>
>> 2016-01-27  Jakub Jelinek  
>> Ilya Enkovich  
>>
>> PR target/69454
>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>> stack alignment fixes.
>> (ix86_option_override_internal): Disable TARGET_STV if stack
>> is not properly aligned.
>>
>> gcc/testsuite/
>>
>> 2016-01-27  Ilya Enkovich  
>>
>> PR target/69454
>> * gcc.target/i386/pr69454-1.c: New test.
>> * gcc.target/i386/pr69454-2.c: New test.
>>
>>
>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>> index 34b57a4..9fb8db8 100644
>> --- a/gcc/config/i386/i386.c
>> +++ b/gcc/config/i386/i386.c
>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>bitmap_obstack_release (NULL);
>>df_process_deferred_rescans ();
>>
>> -  /* Conversion means we may have 128bit register spills/fills
>> - which require aligned stack.  */
>> -  if (converted_insns)
>> -{
>> -  if (crtl->stack_alignment_needed < 128)
>> -   crtl->stack_alignment_needed = 128;
>> -  if (crtl->stack_alignment_estimated < 128)
>> -   crtl->stack_alignment_estimated = 128;
>> -}
>> -
>>return 0;
>>  }
>>
>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>if (!(opts_set->x_target_flags & MASK_STV))
>>  opts->x_target_flags |= MASK_STV;
>> +  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
>> + stack realignment will be extra cost the pass doesn't take into
>> + account and the pass can't realign the stack.  */
>> +  if (ix86_preferred_stack_boundary < 64)
>> +opts->x_target_flags &= ~MASK_STV;
>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>
> MINIMUM_ALIGNMENT keeps track stack alignment.  It is OK
> to disable STV for -mpreferred-stack-boundary=2.  But you should
> also update ix86_minimum_alignment to make sure that STV is
> disabled before returning 32 for DImode.

If -mpreferred-stack-boundary=2, then STV is disabled; if STV is enabled, then
-mpreferred-stack-boundary>=3 and this condition in
ix86_minimum_alignment applies:

  if (TARGET_64BIT || align != 64 || ix86_preferred_stack_boundary >= 64)
return align;
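
For illustration only: a small C model of the argument above.  It assumes
that STV is enabled exactly when the preferred boundary is at least 64 bits
and that the DImode special case returning 32 sits after the quoted early
return; all names are hypothetical.  It models only the argument as stated
here and does not take -mincoming-stack-boundary or target attributes into
account.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption of this sketch: STV stays enabled only for boundaries >= 64.  */
static bool
model_stv_enabled (unsigned preferred)
{
  return preferred >= 64;
}

/* Simplified model of the alignment decision for one object.  */
static unsigned
model_minimum_alignment (bool target_64bit, unsigned align,
                         unsigned preferred, bool is_dimode)
{
  /* The quoted early return.  */
  if (target_64bit || align != 64 || preferred >= 64)
    return align;
  /* Reached only on 32-bit targets with a preferred boundary below 64.  */
  if (is_dimode)
    return 32;
  return align;
}

/* Usage: whenever STV is enabled, a DImode object keeps 64-bit alignment.  */
void
model_alignment_example (void)
{
  for (unsigned preferred = 32; preferred <= 256; preferred *= 2)
    if (model_stv_enabled (preferred))
      assert (model_minimum_alignment (false, 64, preferred, true) == 64);
}
```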

Thanks,
Ilya

>
>
> --
> H.J.

[PATCH] Add NULL check for vectype in vectorizable_comparison

2016-01-27 Thread Ilya Enkovich

Hi,

This is a trivial patch which adds a NULL check for vectype.
We may have a NULL vectype in the case of a void call.  I will commit
it as obvious to trunk after testing.
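
For illustration only: a minimal C sketch of the guard, assuming the
predicate dereferences its argument and therefore must not see NULL.
struct model_type and both functions are hypothetical stand-ins for the
tree type and VECTOR_BOOLEAN_TYPE_P.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a type node.  */
struct model_type
{
  bool is_vector_boolean;
};

/* Stand-in for a predicate that reads a field of its argument, so
   calling it with NULL would be invalid.  */
static bool
model_vector_boolean_p (const struct model_type *type)
{
  return type->is_vector_boolean;
}

/* The NULL test comes first, so || short-circuits before the predicate
   can dereference a missing type.  */
static bool
model_vectorizable_comparison (const struct model_type *vectype)
{
  if (!vectype || !model_vector_boolean_p (vectype))
    return false;
  return true;
}

/* Usage: a statement without a vector type is simply rejected.  */
void
model_comparison_example (void)
{
  struct model_type mask = { true };
  assert (!model_vectorizable_comparison (NULL));
  assert (model_vectorizable_comparison (&mask));
}
```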

Thanks,
Ilya
--
gcc/

2016-01-27  Ilya Enkovich  

* tree-vect-stmts.c (vectorizable_comparison): Add
NULL check for vectype.

gcc/testsuite/

2016-01-27  Ilya Enkovich  

* gcc.dg/declare-simd.c: New test.


diff --git a/gcc/testsuite/gcc.dg/declare-simd.c 
b/gcc/testsuite/gcc.dg/declare-simd.c
new file mode 100644
index 000..1c71b60
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/declare-simd.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fopenmp-simd" } */
+
+#pragma omp declare simd linear (p2, p3)
+extern void fn2 (float p1, float *p2, float *p3);
+
+float *a, *b;
+void fn1 (float *p1)
+{
+  int i;
+#pragma omp simd
+  for (i = 0; i < 1000; i++)
+fn2 (p1[i], a + i, b + i);
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 1dcd129..fa4a364 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7764,7 +7764,7 @@ vectorizable_comparison (gimple *stmt, 
gimple_stmt_iterator *gsi,
   if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
 return false;
 
-  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+  if (!vectype || !VECTOR_BOOLEAN_TYPE_P (vectype))
 return false;
 
   mask_type = vectype;

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-01-28 Thread Ilya Enkovich

2016-01-28 9:00 GMT+03:00 H.J. Lu :
> On Wed, Jan 27, 2016 at 8:36 AM, H.J. Lu  wrote:
>> On Wed, Jan 27, 2016 at 8:29 AM, Ilya Enkovich  
>> wrote:
>>> 2016-01-27 19:18 GMT+03:00 H.J. Lu :
>>>> On Wed, Jan 27, 2016 at 8:11 AM, Ilya Enkovich  
>>>> wrote:
>>>>> On 27 Jan 16:44, Jakub Jelinek wrote:
>>>>>> On Wed, Jan 27, 2016 at 06:34:41PM +0300, Ilya Enkovich wrote:
>>>>>> > @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>>>>> >  opts->x_target_flags |= MASK_VZEROUPPER;
>>>>>> >if (!(opts_set->x_target_flags & MASK_STV))
>>>>>> >  opts->x_target_flags |= MASK_STV;
>>>>>> > +  /* Disable STV if -mpreferred-stack-boundary={2,3} - the needed
>>>>>>
>>>>>> The comment doesn't match the code, you disable STV only for
>>>>>> -mpreferred-stack-boundary=2.
>>>>>
>>>>> Thanks, here is an updated version.
>>>>>
>>>>> Ilya
>>>>> --
>>>>> gcc/
>>>>>
>>>>> 2016-01-27  Jakub Jelinek  
>>>>> Ilya Enkovich  
>>>>>
>>>>> PR target/69454
>>>>> * config/i386/i386.c (convert_scalars_to_vector): Remove
>>>>> stack alignment fixes.
>>>>> (ix86_option_override_internal): Disable TARGET_STV if stack
>>>>> is not properly aligned.
>>>>>
>>>>> gcc/testsuite/
>>>>>
>>>>> 2016-01-27  Ilya Enkovich  
>>>>>
>>>>> PR target/69454
>>>>> * gcc.target/i386/pr69454-1.c: New test.
>>>>> * gcc.target/i386/pr69454-2.c: New test.
>>>>>
>>>>>
>>>>> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
>>>>> index 34b57a4..9fb8db8 100644
>>>>> --- a/gcc/config/i386/i386.c
>>>>> +++ b/gcc/config/i386/i386.c
>>>>> @@ -3588,16 +3588,6 @@ convert_scalars_to_vector ()
>>>>>bitmap_obstack_release (NULL);
>>>>>df_process_deferred_rescans ();
>>>>>
>>>>> -  /* Conversion means we may have 128bit register spills/fills
>>>>> - which require aligned stack.  */
>>>>> -  if (converted_insns)
>>>>> -{
>>>>> -  if (crtl->stack_alignment_needed < 128)
>>>>> -   crtl->stack_alignment_needed = 128;
>>>>> -  if (crtl->stack_alignment_estimated < 128)
>>>>> -   crtl->stack_alignment_estimated = 128;
>>>>> -}
>>>>> -
>>>>>return 0;
>>>>>  }
>>>>>
>>>>> @@ -5453,6 +5443,11 @@ ix86_option_override_internal (bool main_args_p,
>>>>>  opts->x_target_flags |= MASK_VZEROUPPER;
>>>>>if (!(opts_set->x_target_flags & MASK_STV))
>>>>>  opts->x_target_flags |= MASK_STV;
>>>>> +  /* Disable STV if -mpreferred-stack-boundary=2 - the needed
>>>>> + stack realignment will be extra cost the pass doesn't take into
>>>>> + account and the pass can't realign the stack.  */
>>>>> +  if (ix86_preferred_stack_boundary < 64)
>>>>> +opts->x_target_flags &= ~MASK_STV;
>
> This won't work for 32-bit incoming stack boundary and 64-bit preferred
> stack boundary.  In this case, STV won't be off.  When LRA needs 64-bit
> aligned stack slot, stack must be realigned.  But for leaf function, we may
> not realign stack if ix86_minimum_alignment returns 32 for DImode.   You
> must either add assert (!TARGET_STV) before returning 32 for DImode or
> return 64 for DImode if STV is on in ix86_minimum_alignment.

TARGET_STV doesn't mean the STV pass will run.  We can check alignment in the
STV pass gate, and then this assert would be wrong.  If we decide to make STV
dependent on stack alignment, then we shouldn't make alignment dependent on
STV.  I can add an assert to convert_scalars_to_vector to check that
crtl->stack_alignment_estimated >= 64
by that point.

Thanks,
Ilya

>
>>>>>if (!ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD_OPTIMAL]
>>>>>&& !(opts_set->x_target_flags & MASK_AVX256_SPLIT_UNALIGNED_LOAD))
>>>>>  opts->x_target_flags |= MASK_AVX256_SPLIT_UNALIGNED_LOAD;
>>>>
>>>> MINIMUM_ALIGNMENT keeps track stack alignment.  It is OK
>>>> to disable STV for -mpreferred-stack-boundary=2.  But you should
>>>> also update ix86_minimum_alignment to make sure that STV is
>>>> disabled before returning 32 for DImode.
>>>
>>> If -mpreferred-stack-boundary=2 then STV is disabled, if STV is enabled then
>>> -mpreferred-stack-boundary>=3 and this condition in
>>> ix86_minimum_alignment works:
>>>
>>>   if (TARGET_64BIT || align != 64 || ix86_preferred_stack_boundary >= 64)
>>> return align;
>>>
>>
>> No, you shouldn't make assumptions in ix86_minimum_alignment. You
>> should check explicitly that STV is disabled in ix86_minimum_alignment.
>>
>>
>> --
>> H.J.
>
>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-02 Thread Ilya Enkovich

2016-02-02 15:46 GMT+03:00 H.J. Lu :
> On Tue, Feb 2, 2016 at 4:30 AM, H.J. Lu  wrote:
>> On Tue, Feb 2, 2016 at 4:29 AM, Jakub Jelinek  wrote:
>>> On Tue, Feb 02, 2016 at 01:24:26PM +0100, Uros Bizjak wrote:
>>>> On Tue, Feb 2, 2016 at 12:53 PM, Jakub Jelinek  wrote:
>>>>
>>>> >> The bottom line is  ix86_minimum_alignment must return the correct
>>>> >> number for DImode or you can just turn off STV.   My suggestion is
>>>> >> to use my patch.
>>>> >
>>>> > Uros, any preferences here?  I mean, it is possible to use
>>>> > e.g. the ix86_option_override_internal and have H.J's 
>>>> > ix86_minimum_alignment
>>>> > change as a safety net, in the usual case for 
>>>> > -mpreferred-stack-boundary=2
>>>> > we'll just disable TARGET_STV and ix86_minimum_alignment change won't do
>>>> > anything, as TARGET_STV will be false, and if for whatever case it gets
>>>> > through (target attribute, -mincoming-stack-boundary=, ...)
>>>> > ix86_minimum_alignment will be there to ensure enough stack alignment.
>>>> > Most of the smaller -mpreferred-stack-boundary= uses are -mno-sse anyway,
>>>> > and that is something we don't want to affect.
>>>>
>>>> IMO, we should disable STV when -mpreferred-stack-boundary < 3, as STV
>>>> is only an optimization. Perhaps we can also emit a "sorry" for
>>>> explicit -mstv in case stack boundary requirement is not satisfied.
>>>> *If* there is a need for -mstv with smaller stack boundary, we can
>>>> revisit this decision for later gcc versions.
>>>>
>>>> I think disabling STV is less surprising option than increasing stack
>>>> boundary behind the user's back.
>>>
>>> So, is http://gcc.gnu.org/ml/gcc-patches/2016-01/msg02129.html
>>> ok for trunk then (alone or with additional sorry, incremental or not?)?
>>> I believe it does just that.
>>
>> This patch is WRONG.
>>
>> --
>> H.J.
>
> You will run into the same ICE with
>
> -mincoming-stack-boundary=2 -msse2 -O2 -m32
>
> in a leaf function which needs DImode spill/fill.

Why would we need DImode spill/fill having no DImode registers?

Thanks,
Ilya

>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-02 Thread Ilya Enkovich

2016-02-02 16:06 GMT+03:00 H.J. Lu :
> On Tue, Feb 2, 2016 at 5:03 AM, Ilya Enkovich  wrote:
>> 2016-02-02 15:46 GMT+03:00 H.J. Lu :
>>> On Tue, Feb 2, 2016 at 4:30 AM, H.J. Lu  wrote:
>>>> On Tue, Feb 2, 2016 at 4:29 AM, Jakub Jelinek  wrote:
>>>>> On Tue, Feb 02, 2016 at 01:24:26PM +0100, Uros Bizjak wrote:
>>>>>> On Tue, Feb 2, 2016 at 12:53 PM, Jakub Jelinek  wrote:
>>>>>>
>>>>>> >> The bottom line is  ix86_minimum_alignment must return the correct
>>>>>> >> number for DImode or you can just turn off STV.   My suggestion is
>>>>>> >> to use my patch.
>>>>>> >
>>>>>> > Uros, any preferences here?  I mean, it is possible to use
>>>>>> > e.g. the ix86_option_override_internal and have H.J's 
>>>>>> > ix86_minimum_alignment
>>>>>> > change as a safety net, in the usual case for 
>>>>>> > -mpreferred-stack-boundary=2
>>>>>> > we'll just disable TARGET_STV and ix86_minimum_alignment change won't 
>>>>>> > do
>>>>>> > anything, as TARGET_STV will be false, and if for whatever case it gets
>>>>>> > through (target attribute, -mincoming-stack-boundary=, ...)
>>>>>> > ix86_minimum_alignment will be there to ensure enough stack alignment.
>>>>>> > Most of the smaller -mpreferred-stack-boundary= uses are -mno-sse 
>>>>>> > anyway,
>>>>>> > and that is something we don't want to affect.
>>>>>>
>>>>>> IMO, we should disable STV when -mpreferred-stack-boundary < 3, as STV
>>>>>> is only an optimization. Perhaps we can also emit a "sorry" for
>>>>>> explicit -mstv in case stack boundary requirement is not satisfied.
>>>>>> *If* there is a need for -mstv with smaller stack boundary, we can
>>>>>> revisit this decision for later gcc versions.
>>>>>>
>>>>>> I think disabling STV is less surprising option than increasing stack
>>>>>> boundary behind the user's back.
>>>>>
>>>>> So, is http://gcc.gnu.org/ml/gcc-patches/2016-01/msg02129.html
>>>>> ok for trunk then (alone or with additional sorry, incremental or not?)?
>>>>> I believe it does just that.
>>>>
>>>> This patch is WRONG.
>>>>
>>>> --
>>>> H.J.
>>>
>>> You will run into the same ICE with
>>>
>>> -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>
>>> in a leaf function which needs DImode spill/fill.
>>
>> Why would we need DImode spill/fill having no DImode registers?
>>
>
> Because STV is enabled with
>
>  -mincoming-stack-boundary=2 -msse2 -O2 -m32

I misread it as -mpreferred-... So why would we fail having a proper
preferred stack alignment? AFAIK leaf function doesn't affect
alignment until we finalize it after RA.

>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-02 Thread Ilya Enkovich

2016-02-02 16:14 GMT+03:00 H.J. Lu :
> On Tue, Feb 2, 2016 at 5:11 AM, Ilya Enkovich  wrote:
>> 2016-02-02 16:06 GMT+03:00 H.J. Lu :
>>> On Tue, Feb 2, 2016 at 5:03 AM, Ilya Enkovich  
>>> wrote:
>>>> 2016-02-02 15:46 GMT+03:00 H.J. Lu :
>>>>> On Tue, Feb 2, 2016 at 4:30 AM, H.J. Lu  wrote:
>>>>>> On Tue, Feb 2, 2016 at 4:29 AM, Jakub Jelinek  wrote:
>>>>>>> On Tue, Feb 02, 2016 at 01:24:26PM +0100, Uros Bizjak wrote:
>>>>>>>> On Tue, Feb 2, 2016 at 12:53 PM, Jakub Jelinek  
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> >> The bottom line is  ix86_minimum_alignment must return the correct
>>>>>>>> >> number for DImode or you can just turn off STV.   My suggestion is
>>>>>>>> >> to use my patch.
>>>>>>>> >
>>>>>>>> > Uros, any preferences here?  I mean, it is possible to use
>>>>>>>> > e.g. the ix86_option_override_internal and have H.J's 
>>>>>>>> > ix86_minimum_alignment
>>>>>>>> > change as a safety net, in the usual case for 
>>>>>>>> > -mpreferred-stack-boundary=2
>>>>>>>> > we'll just disable TARGET_STV and ix86_minimum_alignment change 
>>>>>>>> > won't do
>>>>>>>> > anything, as TARGET_STV will be false, and if for whatever case it 
>>>>>>>> > gets
>>>>>>>> > through (target attribute, -mincoming-stack-boundary=, ...)
>>>>>>>> > ix86_minimum_alignment will be there to ensure enough stack 
>>>>>>>> > alignment.
>>>>>>>> > Most of the smaller -mpreferred-stack-boundary= uses are -mno-sse 
>>>>>>>> > anyway,
>>>>>>>> > and that is something we don't want to affect.
>>>>>>>>
>>>>>>>> IMO, we should disable STV when -mpreferred-stack-boundary < 3, as STV
>>>>>>>> is only an optimization. Perhaps we can also emit a "sorry" for
>>>>>>>> explicit -mstv in case stack boundary requirement is not satisfied.
>>>>>>>> *If* there is a need for -mstv with smaller stack boundary, we can
>>>>>>>> revisit this decision for later gcc versions.
>>>>>>>>
>>>>>>>> I think disabling STV is less surprising option than increasing stack
>>>>>>>> boundary behind the user's back.
>>>>>>>
>>>>>>> So, is http://gcc.gnu.org/ml/gcc-patches/2016-01/msg02129.html
>>>>>>> ok for trunk then (alone or with additional sorry, incremental or not?)?
>>>>>>> I believe it does just that.
>>>>>>
>>>>>> This patch is WRONG.
>>>>>>
>>>>>> --
>>>>>> H.J.
>>>>>
>>>>> You will run into the same ICE with
>>>>>
>>>>> -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>>>
>>>>> in a leaf function which needs DImode spill/fill.
>>>>
>>>> Why would we need DImode spill/fill having no DImode registers?
>>>>
>>>
>>> Because STV is enabled with
>>>
>>>  -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>
>> I misread it as -mpreferred-... So why would we fail having a proper
>> preferred stack alignment? AFAIK leaf function doesn't affect
>> alignment until we finalize it after RA.
>>
>
> /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
>to be generated in correct form.  */
> static void
> ix86_finalize_stack_realign_flags (void)
> {
>   /* Check if stack realign is really needed after reload, and
>  stores result in cfun */
>   unsigned int incoming_stack_boundary
> = (crtl->parm_stack_boundary > ix86_incoming_stack_boundary
>? crtl->parm_stack_boundary : ix86_incoming_stack_boundary);
>   unsigned int stack_realign
> = (incoming_stack_boundary
>< (crtl->is_leaf && !ix86_current_function_calls_tls_descriptor
>   ? crtl->max_used_stack_slot_alignment
> ^^

We call it after RA when all spill slots are allocated and check if we
may relax stack alignment. Don't see any problem here.

Thanks,
Ilya

>
> For leaf function, we check max_used_stack_slot_alignment.
> Since ix86_minimum_alignment returns 32 for DImode.
> We won't realign stack for DImode spill/fill.
>
>   : crtl->stack_alignment_needed));
>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-02 Thread Ilya Enkovich

2016-02-02 16:25 GMT+03:00 H.J. Lu :
> On Tue, Feb 2, 2016 at 5:21 AM, Ilya Enkovich  wrote:
>> 2016-02-02 16:14 GMT+03:00 H.J. Lu :
>>> On Tue, Feb 2, 2016 at 5:11 AM, Ilya Enkovich  
>>> wrote:
>>>> 2016-02-02 16:06 GMT+03:00 H.J. Lu :
>>>>> On Tue, Feb 2, 2016 at 5:03 AM, Ilya Enkovich  
>>>>> wrote:
>>>>>> 2016-02-02 15:46 GMT+03:00 H.J. Lu :
>>>>>>> On Tue, Feb 2, 2016 at 4:30 AM, H.J. Lu  wrote:
>>>>>>>> On Tue, Feb 2, 2016 at 4:29 AM, Jakub Jelinek  wrote:
>>>>>>>>> On Tue, Feb 02, 2016 at 01:24:26PM +0100, Uros Bizjak wrote:
>>>>>>>>>> On Tue, Feb 2, 2016 at 12:53 PM, Jakub Jelinek  
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> >> The bottom line is  ix86_minimum_alignment must return the correct
>>>>>>>>>> >> number for DImode or you can just turn off STV.   My suggestion is
>>>>>>>>>> >> to use my patch.
>>>>>>>>>> >
>>>>>>>>>> > Uros, any preferences here?  I mean, it is possible to use
>>>>>>>>>> > e.g. the ix86_option_override_internal and have H.J's 
>>>>>>>>>> > ix86_minimum_alignment
>>>>>>>>>> > change as a safety net, in the usual case for 
>>>>>>>>>> > -mpreferred-stack-boundary=2
>>>>>>>>>> > we'll just disable TARGET_STV and ix86_minimum_alignment change 
>>>>>>>>>> > won't do
>>>>>>>>>> > anything, as TARGET_STV will be false, and if for whatever case it 
>>>>>>>>>> > gets
>>>>>>>>>> > through (target attribute, -mincoming-stack-boundary=, ...)
>>>>>>>>>> > ix86_minimum_alignment will be there to ensure enough stack 
>>>>>>>>>> > alignment.
>>>>>>>>>> > Most of the smaller -mpreferred-stack-boundary= uses are -mno-sse 
>>>>>>>>>> > anyway,
>>>>>>>>>> > and that is something we don't want to affect.
>>>>>>>>>>
>>>>>>>>>> IMO, we should disable STV when -mpreferred-stack-boundary < 3, as 
>>>>>>>>>> STV
>>>>>>>>>> is only an optimization. Perhaps we can also emit a "sorry" for
>>>>>>>>>> explicit -mstv in case stack boundary requirement is not satisfied.
>>>>>>>>>> *If* there is a need for -mstv with smaller stack boundary, we can
>>>>>>>>>> revisit this decision for later gcc versions.
>>>>>>>>>>
>>>>>>>>>> I think disabling STV is less surprising option than increasing stack
>>>>>>>>>> boundary behind the user's back.
>>>>>>>>>
>>>>>>>>> So, is http://gcc.gnu.org/ml/gcc-patches/2016-01/msg02129.html
>>>>>>>>> ok for trunk then (alone or with additional sorry, incremental or 
>>>>>>>>> not?)?
>>>>>>>>> I believe it does just that.
>>>>>>>>
>>>>>>>> This patch is WRONG.
>>>>>>>>
>>>>>>>> --
>>>>>>>> H.J.
>>>>>>>
>>>>>>> You will run into the same ICE with
>>>>>>>
>>>>>>> -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>>>>>
>>>>>>> in a leaf function which needs DImode spill/fill.
>>>>>>
>>>>>> Why would we need DImode spill/fill having no DImode registers?
>>>>>>
>>>>>
>>>>> Because STV is enabled with
>>>>>
>>>>>  -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>>
>>>> I misread it as -mpreferred-... So why would we fail having a proper
>>>> preferred stack alignment? AFAIK leaf function doesn't affect
>>>> alignment until we finalize it after RA.
>>>>
>>>
>>> /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
>>>to be generated in correct form.  */
>>> static void
>>> ix86_finalize_stack_realign_flags (void)
>>> {
>>>   /* Check if stack realign is really needed after reload, and
>>>  stores result in cfun */
>>>   unsigned int incoming_stack_boundary
>>> = (crtl->parm_stack_boundary > ix86_incoming_stack_boundary
>>>? crtl->parm_stack_boundary : ix86_incoming_stack_boundary);
>>>   unsigned int stack_realign
>>> = (incoming_stack_boundary
>>>< (crtl->is_leaf && !ix86_current_function_calls_tls_descriptor
>>>   ? crtl->max_used_stack_slot_alignment
>>> ^^
>>
>> We call it after RA when all spill slots are allocated and check if we
>> may relax stack alignment. Don't see any problem here.
>
> Please see
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69454#c26
>
> Why did LRA crash then?

Because it tries a patch [1] which doesn't fix stack alignment and STV
enabling and therefore doesn't resolve the problem when
-mpreferred-stack-boundary=2 is used.

Thanks,
Ilya
--
[1] https://gcc.gnu.org/bugzilla/attachment.cgi?id=37468&action=diff

>
>
> --
> H.J.

Re: [PATCH, PR target/69454] Disable TARGET_STV when stack is not properly aligned

2016-02-02 Thread Ilya Enkovich

2016-02-02 17:03 GMT+03:00 H.J. Lu :
> On Tue, Feb 2, 2016 at 5:55 AM, Ilya Enkovich  wrote:
>> 2016-02-02 16:25 GMT+03:00 H.J. Lu :
>>> On Tue, Feb 2, 2016 at 5:21 AM, Ilya Enkovich  
>>> wrote:
>>>> 2016-02-02 16:14 GMT+03:00 H.J. Lu :
>>>>> On Tue, Feb 2, 2016 at 5:11 AM, Ilya Enkovich  
>>>>> wrote:
>>>>>> 2016-02-02 16:06 GMT+03:00 H.J. Lu :
>>>>>>> On Tue, Feb 2, 2016 at 5:03 AM, Ilya Enkovich  
>>>>>>> wrote:
>>>>>>>> 2016-02-02 15:46 GMT+03:00 H.J. Lu :
>>>>>>>>> On Tue, Feb 2, 2016 at 4:30 AM, H.J. Lu  wrote:
>>>>>>>>>> On Tue, Feb 2, 2016 at 4:29 AM, Jakub Jelinek  
>>>>>>>>>> wrote:
>>>>>>>>>>> On Tue, Feb 02, 2016 at 01:24:26PM +0100, Uros Bizjak wrote:
>>>>>>>>>>>> On Tue, Feb 2, 2016 at 12:53 PM, Jakub Jelinek  
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> >> The bottom line is  ix86_minimum_alignment must return the 
>>>>>>>>>>>> >> correct
>>>>>>>>>>>> >> number for DImode or you can just turn off STV.   My suggestion 
>>>>>>>>>>>> >> is
>>>>>>>>>>>> >> to use my patch.
>>>>>>>>>>>> >
>>>>>>>>>>>> > Uros, any preferences here?  I mean, it is possible to use
>>>>>>>>>>>> > e.g. the ix86_option_override_internal and have H.J's 
>>>>>>>>>>>> > ix86_minimum_alignment
>>>>>>>>>>>> > change as a safety net, in the usual case for 
>>>>>>>>>>>> > -mpreferred-stack-boundary=2
>>>>>>>>>>>> > we'll just disable TARGET_STV and ix86_minimum_alignment change 
>>>>>>>>>>>> > won't do
>>>>>>>>>>>> > anything, as TARGET_STV will be false, and if for whatever case 
>>>>>>>>>>>> > it gets
>>>>>>>>>>>> > through (target attribute, -mincoming-stack-boundary=, ...)
>>>>>>>>>>>> > ix86_minimum_alignment will be there to ensure enough stack 
>>>>>>>>>>>> > alignment.
>>>>>>>>>>>> > Most of the smaller -mpreferred-stack-boundary= uses are 
>>>>>>>>>>>> > -mno-sse anyway,
>>>>>>>>>>>> > and that is something we don't want to affect.
>>>>>>>>>>>>
>>>>>>>>>>>> IMO, we should disable STV when -mpreferred-stack-boundary < 3, as 
>>>>>>>>>>>> STV
>>>>>>>>>>>> is only an optimization. Perhaps we can also emit a "sorry" for
>>>>>>>>>>>> explicit -mstv in case stack boundary requirement is not satisfied.
>>>>>>>>>>>> *If* there is a need for -mstv with smaller stack boundary, we can
>>>>>>>>>>>> revisit this decision for later gcc versions.
>>>>>>>>>>>>
>>>>>>>>>>>> I think disabling STV is less surprising option than increasing 
>>>>>>>>>>>> stack
>>>>>>>>>>>> boundary behind the user's back.
>>>>>>>>>>>
>>>>>>>>>>> So, is http://gcc.gnu.org/ml/gcc-patches/2016-01/msg02129.html
>>>>>>>>>>> ok for trunk then (alone or with additional sorry, incremental or 
>>>>>>>>>>> not?)?
>>>>>>>>>>> I believe it does just that.
>>>>>>>>>>
>>>>>>>>>> This patch is WRONG.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> H.J.
>>>>>>>>>
>>>>>>>>> You will run into the same ICE with
>>>>>>>>>
>>>>>>>>> -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>>>>>>>
>>>>>>>>> in a leaf function which needs DImode spill/fill.
>>>>>>>>
>>>>>>>> Why would we need DImode spill/fill having no DImode registers?
>>>>>>>>
>>>>>>>
>>>>>>> Because STV is enabled with
>>>>>>>
>>>>>>>  -mincoming-stack-boundary=2 -msse2 -O2 -m32
>>>>>>
>>>>>> I misread it as -mpreferred-... So why would we fail having a proper
>>>>>> preferred stack alignment? AFAIK leaf function doesn't affect
>>>>>> alignment until we finalize it after RA.
>>>>>>
>>>>>
>>>>> /* Finalize stack_realign_needed flag, which will guide prologue/epilogue
>>>>>to be generated in correct form.  */
>>>>> static void
>>>>> ix86_finalize_stack_realign_flags (void)
>>>>> {
>>>>>   /* Check if stack realign is really needed after reload, and
>>>>>  stores result in cfun */
>>>>>   unsigned int incoming_stack_boundary
>>>>> = (crtl->parm_stack_boundary > ix86_incoming_stack_boundary
>>>>>? crtl->parm_stack_boundary : ix86_incoming_stack_boundary);
>>>>>   unsigned int stack_realign
>>>>> = (incoming_stack_boundary
>>>>>< (crtl->is_leaf && !ix86_current_function_calls_tls_descriptor
>>>>>   ? crtl->max_used_stack_slot_alignment
>>>>> ^^
>>>>
>>>> We call it after RA when all spill slots are allocated and check if we
>>>> may relax stack alignment. Don't see any problem here.
>>>
>>> Please see
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69454#c26
>>>
>>> Why did LRA crash then?
>>
>> Because it tries a patch [1] which doesn't fix stack alignment and STV
>> enabling and therefore doesn't resolve the problem when
>> -mpreferred-stack-boundary=2 is used.
>>
>
> No, it is because RA doesn't increase stack alignment.  You have to
> have the correct stack alignment requirement before entering RA.

And it's too late to do that after the STV pass, which is why we disable
STV when the stack is not properly aligned.  I think this argument is
going in circles.
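
To make the two positions in this thread concrete, here is a minimal
standalone sketch in C.  It is a toy model and not GCC code: every name in
it (boundary_bits, stv_allowed, min_alignment_bits) is hypothetical, and it
only mirrors the condition quoted earlier from ix86_minimum_alignment plus
the explicit STV check that was requested.  Boundaries are in bits, as in
that quoted condition.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not GCC code: all names here are hypothetical.
   -mpreferred-stack-boundary=N means 2^N bytes, i.e. 8 << N bits
   (N=2 -> 32 bits, N=3 -> 64 bits, N=4 -> 128 bits).  */
static unsigned
boundary_bits (unsigned log2_bytes)
{
  assert (log2_bytes < 16);
  return 8u << log2_bytes;
}

/* Option-override side: on a 32-bit target keep STV only when the
   preferred boundary already gives 64-bit aligned stack slots.  */
static bool
stv_allowed (bool target_64bit, bool stv_requested,
             unsigned preferred_boundary_bits)
{
  if (target_64bit)
    return stv_requested;
  return stv_requested && preferred_boundary_bits >= 64;
}

/* Safety-net side: same shape as the quoted condition, but a 64-bit
   request is lowered to 32 only when STV is known to be disabled.  */
static unsigned
min_alignment_bits (unsigned align, bool target_64bit, bool stv_enabled,
                    unsigned preferred_boundary_bits)
{
  if (target_64bit || align != 64 || preferred_boundary_bits >= 64)
    return align;
  return stv_enabled ? 64 : 32;
}
```

With -mpreferred-stack-boundary=2 the first function already turns STV
off, so the second one returns 32 exactly as before; the explicit check
matters only if STV stays enabled by some other route.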

Thanks,
Ilya

>
>
> --
> H.J.

[PATCH, committed] Revert r232560 to fix PR target/69369

2016-02-05 Thread Ilya Enkovich

Hi,

This patch reverts r232560 which caused multiple failures
for Pointer Bounds Checker.  Patch was bootstrapped and
regtested on x86_64-pc-linux-gnu.  Applied to trunk.

Thanks,
Ilya
--
2016-02-05  Ilya Enkovich  

PR target/69369
Revert r232560:
2016-01-19  Jan Hubicka  

* cgraphunit.c (cgraph_node::reset): Clear thunk info and
instrumented_version.


diff --git a/gcc/cgraphunit.c b/gcc/cgraphunit.c
index b95c172..2c49d7b 100644
--- a/gcc/cgraphunit.c
+++ b/gcc/cgraphunit.c
@@ -366,14 +366,12 @@ cgraph_node::reset (void)
   memset (&local, 0, sizeof (local));
   memset (&global, 0, sizeof (global));
   memset (&rtl, 0, sizeof (rtl));
-  memset (&thunk, 0, sizeof (cgraph_thunk_info));
   analyzed = false;
   definition = false;
   alias = false;
   transparent_alias = false;
   weakref = false;
   cpp_implicit_alias = false;
-  instrumented_version = NULL;
 
   remove_callees ();
   remove_all_references ();

Re: [RFC] Combine vectorized loops with its scalar remainder.

2016-02-09 Thread Ilya Enkovich

2015-12-15 19:41 GMT+03:00 Yuri Rumyantsev :
> Hi Richard,
>
> I re-designed the patch to determine ability of loop masking on fly of
> vectorization analysis and invoke it after loop transformation.
> Test-case is also provided.
>
> what is your opinion?
>
> Thanks.
> Yuri.
>

Hi,

I'm going to start working on extending this patch to handle mixed mask
sizes, support vectorization of the peeled loop tail, and fix the
profitability estimation so that it chooses the proper loop tail
processing.  Here is a short list of the planned changes:

1. Don't put any restriction on the mask type when checking whether a
statement can be masked.  Instead, just store all required masks in
LOOP_VINFO_REQUIRED_MASKS.  After all statements are checked, we
additionally check that all required masks can be produced (we have
proper comparison, widening and narrowing support).

2. In vect_estimate_min_profitable_iters, compute the overhead of mask
creation, decide what we should do with the loop tail (nothing, vectorize
it, or combine it with the loop body), and additionally return the number
of tail iterations required for the chosen tail processing to be
profitable.

3. In vect_transform_loop, depending on the chosen strategy, either mask
the whole loop or produce a vectorized tail.  For now it's not fully clear
to me what the best way to get a vectorized tail is.

The first option is to just peel one iteration after the loop is
vectorized.  But our masking functions use the LOOP_VINFO and STMT_VINFO
structures, which we lose during peeling.

Another option is to peel the scalar loop and then just run the vectorizer
one more time to vectorize and mask it.

We may also peel the vectorized loop and use the original version (with
all STMT_VINFO still available) as the tail and the peeled version as the
main loop.

Currently I think the best option is to peel the scalar loop and run the
vectorizer one more time on it.  This option is simpler and can also be
used to vectorize the loop tail with a smaller vector size when the target
doesn't support masking or masking is not profitable.
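
For readers unfamiliar with the idea, a masked tail can be sketched in
plain C as below.  This is only an illustration under assumptions: the
vectorization factor VF, the helper names and the byte-per-lane mask are
all made up for the example, and the "vector" step is emulated lane by
lane rather than with real vector instructions.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4  /* hypothetical vectorization factor */

/* Emulates one masked vector iteration: lanes whose mask entry is 0
   are neither loaded nor stored.  */
static void
masked_add_step (int *dst, const int *a, const int *b,
                 const unsigned char mask[VF])
{
  for (size_t l = 0; l < VF; l++)
    if (mask[l])
      dst[l] = a[l] + b[l];
}

/* dst[i] = a[i] + b[i] for i < n.  The remainder is handled by one
   masked vector iteration instead of a scalar epilogue loop.  */
static void
add_with_masked_tail (int *dst, const int *a, const int *b, size_t n)
{
  assert (dst && a && b);
  const unsigned char all[VF] = { 1, 1, 1, 1 };
  size_t i = 0;

  /* Main loop: full vectors only.  */
  for (; i + VF <= n; i += VF)
    masked_add_step (dst + i, a + i, b + i, all);

  /* Tail: enable only the lanes that are still inside the arrays.  */
  if (i < n)
    {
      unsigned char mask[VF];
      for (size_t l = 0; l < VF; l++)
        mask[l] = (i + l < n);
      masked_add_step (dst + i, a + i, b + i, mask);
    }
}
```

Masking the whole loop would correspond to computing the mask in every
iteration rather than only in the last one, which is where the mask
creation overhead mentioned in item 2 comes from.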

Any comments?

Thanks,
Ilya

[PATCH, MPX, committed] Fix warning in MPX effective target test

2016-02-12 Thread Ilya Enkovich

Hi,

This patch fixes a warning in the test used for the effective MPX target
check.  The fix allows the test to be used with g++.  Bootstrapped and
tested on x86_64-pc-linux-gnu.  Applied to trunk.

Thanks,
Ilya
--
gcc/testsuite/

2016-02-11  Ilya Enkovich  

* lib/mpx-dg.exp: Fix warning in check_effective_target_mpx
test.


diff --git a/gcc/testsuite/lib/mpx-dg.exp b/gcc/testsuite/lib/mpx-dg.exp
index fa2faaa..b245c5f 100644
--- a/gcc/testsuite/lib/mpx-dg.exp
+++ b/gcc/testsuite/lib/mpx-dg.exp
@@ -22,7 +22,7 @@ proc check_effective_target_mpx {} {
int *foo (int *arg) { return arg; }
int main (void)
{
-   int *p = __builtin_malloc (sizeof (int));
+   int *p = (int *)__builtin_malloc (sizeof (int));
int res = foo (p) == 0;
__builtin_free (p);
return res;

[PATCH, CHKP, committed] Fix PR middle-end/69729

2016-02-12 Thread Ilya Enkovich

Hi,

This patch fixes the instrumentation thunk recognition condition in
lto_output.  This avoids missing required thunks in ltrans modules.
Bootstrapped and regtested on x86_64-pc-linux-gnu.  Committed to trunk.

Thanks,
Ilya
--
gcc/

2016-02-12  Ilya Enkovich  

PR target/69729
* lto-streamer-out.c (lto_output): Use thunk.add_pointer_bounds_args
to correctly determine instrumentation thunks.

gcc/testsuite/

2016-02-12  Ilya Enkovich  

* g++.dg/lto/lto.exp: Include and init mpx.
* g++.dg/lto/pr69729_0.C: New test.


diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index 6bb76cc..997a28b 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -2321,7 +2321,7 @@ lto_output (void)
{
  if (lto_symtab_encoder_encode_body_p (encoder, node)
  && !node->alias
- && (!node->thunk.thunk_p || !node->instrumented_version))
+ && (!node->thunk.thunk_p || !node->thunk.add_pointer_bounds_args))
{
  if (flag_checking)
{
diff --git a/gcc/testsuite/g++.dg/lto/lto.exp b/gcc/testsuite/g++.dg/lto/lto.exp
index 8d99418..48f0947 100644
--- a/gcc/testsuite/g++.dg/lto/lto.exp
+++ b/gcc/testsuite/g++.dg/lto/lto.exp
@@ -31,6 +31,7 @@ if $tracelevel then {
 load_lib standard.exp
 load_lib g++.exp
 load_lib target-libpath.exp
+load_lib mpx-dg.exp
 
 # Load the language-independent compabibility support procedures.
 load_lib lto.exp
@@ -42,6 +43,7 @@ if { ![check_effective_target_lto] } {
 
 g++_init
 lto_init no-mathlib
+mpx_init
 
 # Define an identifier for use with this suite to avoid name conflicts
 # with other lto tests running at the same time.
@@ -57,4 +59,5 @@ foreach src [lsort [find $srcdir/$subdir *_0.\[cC\]]] {
 lto-execute $src $sid
 }
 
+mpx_finish
 lto_finish
diff --git a/gcc/testsuite/g++.dg/lto/pr69729_0.C b/gcc/testsuite/g++.dg/lto/pr69729_0.C
new file mode 100644
index 000..b736406
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lto/pr69729_0.C
@@ -0,0 +1,35 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target mpx } */
+/* { dg-lto-options {{-fcheck-pointer-bounds -mmpx -flto -flto-partition=max}} } */
+
+class cl1
+{
+ public:
+  virtual ~cl1 () { };
+};
+
+class cl2
+{
+ public:
+  virtual ~cl2 () { };
+};
+
+class cl3 : cl1, cl2
+{
+};
+
+class cl4 : cl3
+{
+ public:
+  ~cl4 ();
+};
+
+cl4::~cl4 ()
+{
+}
+
+int main (int argc, char **argv)
+{
+  cl4 c;
+  return 0;
+}

Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2016-02-20 Thread Ilya Enkovich

2016-02-19 20:36 GMT+03:00 Alan Lawrence :
> On 17/11/15 11:49, Ilya Enkovich wrote:
>>
>> Hi,
>>
>> Default hook for get_mask_mode is supposed to return integer vector modes.
>> This means it should reject scalar modes returned by mode_for_vector.
>> Bootstrapped and regtested on x86_64-unknown-linux-gnu, regtested on
>> aarch64-unknown-linux-gnu.  OK for trunk?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-11-17  Ilya Enkovich  
>>
>> PR middle-end/68134
>> * targhooks.c (default_get_mask_mode): Filter out
>> scalar modes returned by mode_for_vector.
>
>
> I've been playing around with a patch that expands arbitrary VEC_COND_EXPRs
> using the vec_cmp and vcond_mask optabs - allowing platforms to drop the
> ugly vcond pattern (for which a cross-product of modes is
> usually required) where vcond = vec_cmp + vcond_mask. (E.g. ARM and
> AArch64.)
>
> Mostly this is fairly straightforward, relatively little midend code is
> required, and the backend cleans up quite a bit. However, I get stuck on the
> case of singleton vectors (64x1). No surprises there, then...
>
> The PR/68134 fix, makes the 'mask mode' for comparing 64x1 vectors, into
> BLKmode, so that we get past this in expand_vector_operations:
>
> /* A scalar operation pretending to be a vector one.  */
>   if (VECTOR_BOOLEAN_TYPE_P (type)
>   && !VECTOR_MODE_P (TYPE_MODE (type))
>   && TYPE_MODE (type) != BLKmode)
> return;
>
> and we do the operation piecewise. (Which is what we want; there is only one
> piece!)
>
> However, with my vec_cmp + vcond_mask changes, dropping vconddidi, this
> means we look for a vcond_maskdiblk and vec_cmpdiblk. Which doesn't really
> feel right - it feels like the 64x1 mask should be a DImode, just like other
> 64x1 vectors.

The problem here is to distinguish vector mask of one DI element and
DI scalar mask.  We don't want to lower scalar mask manipulations
because they are simple integer operations, not vector ones. Probably
vector of a single DI should have V1DI mode and not pretend to be a
scalar?  This would make things easier.

>
> So, I'm left thinking - is there a better way to look for "scalar operations
> pretending to be vector ones", such that we can get a DImode vec
> through the above? What exactly do we want that check to do? In my AArch64
> testing, I was able to take it out altogether and get all tests passing...?
> (I don't have AVX512 hardware)

You were able to do that because the check only matters for scalar masks,
which for now are supported only by AVX512.

Thanks,
Ilya

>
> Thanks, Alan
>

Re: [PATCH, PR middle-end/68134] Reject scalar modes in default get_mask_mode hook

2016-02-24 Thread Ilya Enkovich

2016-02-22 14:50 GMT+03:00 Alan Lawrence :
> On 20/02/16 09:29, Ilya Enkovich wrote:
>>
>> 2016-02-19 20:36 GMT+03:00 Alan Lawrence :
>>>
>>> On 17/11/15 11:49, Ilya Enkovich wrote:
>>>>
>>>>
>>>> Hi,
>>>>
>>>> Default hook for get_mask_mode is supposed to return integer vector
>>>> modes.
>>>> This means it should reject scalar modes returned by mode_for_vector.
>>>> Bootstrapped and regtested on x86_64-unknown-linux-gnu, regtested on
>>>> aarch64-unknown-linux-gnu.  OK for trunk?
>>>>
>>>> Thanks,
>>>> Ilya
>>>> --
>>>> gcc/
>>>>
>>>> 2015-11-17  Ilya Enkovich  
>>>>
>>>>  PR middle-end/68134
>>>>  * targhooks.c (default_get_mask_mode): Filter out
>>>>  scalar modes returned by mode_for_vector.
>>>
>>>
>>>
>>> I've been playing around with a patch that expands arbitrary
>>> VEC_COND_EXPRs
>>> using the vec_cmp and vcond_mask optabs - allowing platforms to drop the
>>> ugly vcond pattern (for which a cross-product of modes is
>>> usually required) where vcond = vec_cmp + vcond_mask. (E.g. ARM and
>>> AArch64.)
>>>
>>> Mostly this is fairly straightforward, relatively little midend code is
>>> required, and the backend cleans up quite a bit. However, I get stuck on
>>> the
>>> case of singleton vectors (64x1). No surprises there, then...
>>>
>>> The PR/68134 fix, makes the 'mask mode' for comparing 64x1 vectors, into
>>> BLKmode, so that we get past this in expand_vector_operations:
>>>
>>> /* A scalar operation pretending to be a vector one.  */
>>>if (VECTOR_BOOLEAN_TYPE_P (type)
>>>&& !VECTOR_MODE_P (TYPE_MODE (type))
>>>&& TYPE_MODE (type) != BLKmode)
>>>  return;
>>>
>>> and we do the operation piecewise. (Which is what we want; there is only
>>> one
>>> piece!)
>>>
>>> However, with my vec_cmp + vcond_mask changes, dropping vconddidi, this
>>> means we look for a vcond_maskdiblk and vec_cmpdiblk. Which doesn't
>>> really
>>> feel right - it feels like the 64x1 mask should be a DImode, just like
>>> other
>>> 64x1 vectors.
>>
>>
>> The problem here is to distinguish vector mask of one DI element and
>> DI scalar mask.  We don't want to lower scalar mask manipulations
>> because they are simple integer operations, not vector ones. Probably
>> vector of a single DI should have V1DI mode and not pretend to be a
>> scalar?  This would make things easier.
>
>
> Thanks for the quick reply, Ilya.
>
> What's the difference between, as you say, a "simple integer operation" and
> a "vector" operation of just one element?

The difference is at least in how the operation is expanded.  You
would use different optabs for the scalar and vector cases.  Also note
that default_get_mask_mode uses BLKmode for scalar modes not just because
of single-element vectors.  mode_for_vector may return DImode for
V2SI and V4HI vectors if the target doesn't define such vector modes.
To distinguish true scalar masks, I avoid using a scalar mode for
non-scalar masks.  A one-element vector might be an exception here.  You
may try to define TARGET_VECTORIZE_GET_MASK_MODE for your target and
keep the scalar mode when you want it.
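
As a rough illustration of that filtering, here is a self-contained toy
model in C.  None of these names exist in GCC: the toy_mode enum and the
toy_* functions are hypothetical stand-ins, and the only logic taken from
this thread is the quoted "scalar operation pretending to be a vector one"
check and the rule that a scalar mode returned for a vector mask becomes
BLKmode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature mode set for the example only.  */
enum toy_mode { TOY_BLK, TOY_DI, TOY_V2SI, TOY_V4HI };

static bool
toy_vector_mode_p (enum toy_mode m)
{
  return m == TOY_V2SI || m == TOY_V4HI;
}

/* Stand-in for the default hook's filtering step: keep real vector
   modes and turn anything scalar into BLK.  */
static enum toy_mode
toy_default_mask_mode (enum toy_mode from_mode_for_vector)
{
  return toy_vector_mode_p (from_mode_for_vector)
         ? from_mode_for_vector : TOY_BLK;
}

/* Mirrors the quoted check: a boolean vector type whose mode is a
   scalar other than BLK is treated as a true scalar mask and is not
   lowered piecewise.  */
static bool
toy_skips_lowering (bool vector_boolean_type, enum toy_mode m)
{
  return vector_boolean_type && !toy_vector_mode_p (m) && m != TOY_BLK;
}

/* Usage: a V2SI mask on a target without V2SI comes back as DI from the
   mode lookup, is filtered to BLK, and is therefore still lowered.  */
static void
toy_self_check (void)
{
  enum toy_mode mask = toy_default_mask_mode (TOY_DI);
  assert (mask == TOY_BLK);
  assert (!toy_skips_lowering (true, mask));
  assert (toy_skips_lowering (true, TOY_DI));
}
```

A target hook that returns the scalar mode for a one-element vector would
correspond to bypassing toy_default_mask_mode for that single case.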

>
> This is why we do *not* have V1DImode in the AArch64 (or ARM) backends, but
> instead treat 64x1 vectors as DImode - the operations are the same; so
> keeping them as the same mode, enables CSE and lots of other optimizations,
> plus we don't have to have two near-identical copies (DI + V1DI) for many
> patterns, etc...

Well, you don't have to keep V1DI mode after expand.  You may also
simply not add any new patterns for the vector optabs and thereby get
these vector operations lowered into 'true' scalar operations with
both scalar type and mode.

>
> If the operations were on a "DI scalar mask", when would the first part of
> that test, VECTOR_BOOLEAN_TYPE_P, hold?

Any vector comparison produces a value of a boolean vector type, even
if these vectors and the mask have a scalar mode.

Thanks,
Ilya

>
> Thanks, Alan

[PATCH, PR69956] Fix multi-step conversion of boolean vectors

2016-02-26 Thread Ilya Enkovich

Hi,

Currently multi-step vector conversion tries to compute the
intermediate type from its mode, but that doesn't work for
boolean vectors.  This patch introduces a computation
of intermediate vector masks.  Bootstrapped and tested
on x86_64-pc-linux-gnu.  OK for trunk?
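
The intermediate masks follow a simple pattern: each widening step halves
the number of mask elements and each narrowing step doubles it.  The
sketch below shows just that arithmetic in standalone C.  It is not the
patch itself; toy_intermediate_masks and TOY_MAX_STEPS are hypothetical
names, and the step cap is an illustrative value standing in for
MAX_INTERM_CVT_STEPS.

```c
#include <assert.h>

#define TOY_MAX_STEPS 3  /* illustrative cap, value is an assumption */

/* Record the element count of every intermediate mask needed to go
   from FROM_LANES to TO_LANES by repeated halving (widening) or
   doubling (narrowing).  Returns the number of intermediate masks,
   or -1 if TO_LANES cannot be reached within the cap.  */
static int
toy_intermediate_masks (unsigned from_lanes, unsigned to_lanes,
                        unsigned counts[TOY_MAX_STEPS])
{
  assert (from_lanes > 0 && to_lanes > 0);
  unsigned cur = from_lanes;
  int n = 0;

  while (cur != to_lanes)
    {
      if (to_lanes < cur)
        {
          if (cur % 2 != 0)
            return -1;          /* cannot halve an odd lane count */
          cur /= 2;
        }
      else
        cur *= 2;

      if (cur == to_lanes)
        break;                  /* final type, not an intermediate one */
      if (n == TOY_MAX_STEPS)
        return -1;
      counts[n++] = cur;
    }
  return n;
}
```

For example, going from a 64-element mask to a 16-element one needs a
single intermediate mask of 32 elements, and the same holds in the other
direction.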

Thanks,
Ilya
--
gcc/

2016-02-26  Ilya Enkovich  

PR tree-optimization/69956
* tree-vect-stmts.c (supportable_widening_operation): Support
multi-step conversion of boolean vectors.
(supportable_narrowing_operation): Likewise.

gcc/testsuite/

2016-02-26  Ilya Enkovich  

PR tree-optimization/69956
* gcc.dg/pr69956.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr69956.c b/gcc/testsuite/gcc.dg/pr69956.c
new file mode 100644
index 000..37d24d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr69956.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize" } */
+/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* x86_64-*-* } } } */
+
+void
+fn1 (char *b, char *d, int *c, int i)
+{
+  for (; i; i++, d++)
+if (b[i])
+  *d = c[i];
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 9678d7c..182b277 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -9000,9 +9000,19 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
   for (i = 0; i < MAX_INTERM_CVT_STEPS; i++)
 {
   intermediate_mode = insn_data[icode1].operand[0].mode;
-  intermediate_type
-   = lang_hooks.types.type_for_mode (intermediate_mode,
- TYPE_UNSIGNED (prev_type));
+  if (VECTOR_BOOLEAN_TYPE_P (prev_type))
+   {
+ intermediate_type
+   = build_truth_vector_type (TYPE_VECTOR_SUBPARTS (prev_type) / 2,
+  current_vector_size);
+ if (intermediate_mode != TYPE_MODE (intermediate_type))
+   return false;
+   }
+  else
+   intermediate_type
+ = lang_hooks.types.type_for_mode (intermediate_mode,
+   TYPE_UNSIGNED (prev_type));
+
   optab3 = optab_for_tree_code (c1, intermediate_type, optab_default);
   optab4 = optab_for_tree_code (c2, intermediate_type, optab_default);
 
@@ -9065,7 +9075,7 @@ supportable_narrowing_operation (enum tree_code code,
   tree vectype = vectype_in;
   tree narrow_vectype = vectype_out;
   enum tree_code c1;
-  tree intermediate_type;
+  tree intermediate_type, prev_type;
   machine_mode intermediate_mode, prev_mode;
   int i;
   bool uns;
@@ -9111,6 +9121,7 @@ supportable_narrowing_operation (enum tree_code code,
   /* Check if it's a multi-step conversion that can be done using intermediate
  types.  */
   prev_mode = vec_mode;
+  prev_type = vectype;
   if (code == FIX_TRUNC_EXPR)
 uns = TYPE_UNSIGNED (vectype_out);
   else
@@ -9145,8 +9156,17 @@ supportable_narrowing_operation (enum tree_code code,
   for (i = 0; i < MAX_INTERM_CVT_STEPS; i++)
 {
   intermediate_mode = insn_data[icode1].operand[0].mode;
-  intermediate_type
-   = lang_hooks.types.type_for_mode (intermediate_mode, uns);
+  if (VECTOR_BOOLEAN_TYPE_P (prev_type))
+   {
+ intermediate_type
+   = build_truth_vector_type (TYPE_VECTOR_SUBPARTS (prev_type) * 2,
+  current_vector_size);
+ if (intermediate_mode != TYPE_MODE (intermediate_type))
+ return false;
+   }
+  else
+   intermediate_type
+ = lang_hooks.types.type_for_mode (intermediate_mode, uns);
   interm_optab
= optab_for_tree_code (VEC_PACK_TRUNC_EXPR, intermediate_type,
   optab_default);
@@ -9164,6 +9184,7 @@ supportable_narrowing_operation (enum tree_code code,
return true;
 
   prev_mode = intermediate_mode;
+  prev_type = intermediate_type;
   optab1 = interm_optab;
 }

[PATCH, PR70026] Fix boolean comparison processing in masks conversion patterns

2016-03-02 Thread Ilya Enkovich

Hi,

This patch fixes mask type determination for comparisons of boolean values, 
which avoids incorrect application of the mask conversion pattern.  Bootstrapped 
and tested on x86_64-pc-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-02  Ilya Enkovich  

* tree-vect-patterns.c (search_type_for_mask): Handle
comparison of booleans.


gcc/testsuite/

2016-03-02  Ilya Enkovich  

* gcc.dg/pr70026.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr70026.c b/gcc/testsuite/gcc.dg/pr70026.c
new file mode 100644
index 000..32f59e2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70026.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+unsigned int a[64], b[64], c[64], d[64], e[64];
+
+void
+foo ()
+{
+  int i;
+  for (i = 0; i < 64; i++)
+{
+  d[i] = a[i];
+  e[i] = ((b[i] < e[i]) != !c[i]) && !a[i];
+}
+}
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 95ce38d..7037e04 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -3219,6 +3219,15 @@ search_type_for_mask (tree var, vec_info *vinfo)
{
  tree comp_vectype, mask_type;
 
+ if (TREE_CODE (TREE_TYPE (rhs1)) == BOOLEAN_TYPE)
+   {
+ res = search_type_for_mask (rhs1, vinfo);
+ res2 = search_type_for_mask (gimple_assign_rhs2 (def_stmt), vinfo);
+ if (!res || (res2 && TYPE_PRECISION (res) > TYPE_PRECISION (res2)))
+   res = res2;
+ break;
+   }
+
  comp_vectype = get_vectype_for_scalar_type (TREE_TYPE (rhs1));
  if (comp_vectype == NULL_TREE)
return NULL_TREE;

[RFC] Using function clones for Pointer Bounds Checker

2014-01-14 Thread Ilya Enkovich

Hi,

I've been working for some time on the prototype of the Pointer Bounds
Checker which uses function clones for instrumentation
(http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03327.html). After
several experiments with this approach I want to share my results and
ask for some feedback to make a decision about the future steps.

First, let me recall the reasons for digging in this direction. In
the original approach, bounds of call arguments and input parameters
are associated with arguments via special built-in calls. This creates
an implicit data flow the compiler is not aware of, which confuses some
optimizations, resulting in mis-optimization, and breaks the bounds data
flow. Thus optimizations have to be fixed to get better pointer
protection.

The clones approach does not use special built-in function calls to
associate bounds with call arguments and input parameters. Each
function which should be instrumented gets an additional version, and
only this special version is instrumented. This new version gets
additional bound arguments to express input bounds. When a function call
is instrumented, it is redirected to the instrumented version and all
bounds are passed as explicit call arguments. Thus we have an explicit
pointer bounds flow similar to regular function parameters. It should
avoid changes in optimizations, avoid mis-optimizations, and
allow existing IPA optimizations to work with bound args (e.g.
propagate constant bounds values and remove checks in the called function).
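
To illustrate the idea of an instrumented clone with explicit bound
arguments, here is a minimal conceptual sketch in plain C. It is not what
the compiler emits: struct bounds, load_chkp and caller are hypothetical
names, the real checker keeps bounds in MPX bound registers, and the -1
return on a violation exists only to keep the sketch testable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds descriptor (an assumption for this sketch).  */
struct bounds
{
  uintptr_t lower;  /* address of the first valid byte */
  uintptr_t upper;  /* address of the last valid byte */
};

/* Original function: stays uninstrumented.  */
int
load (const int *p, size_t i)
{
  return p[i];
}

/* Hypothetical instrumented clone: same body plus an explicit bounds
   argument and a check before the access.  */
int
load_chkp (const int *p, struct bounds b, size_t i)
{
  uintptr_t first = (uintptr_t) p + i * sizeof (int);
  uintptr_t last = first + sizeof (int) - 1;

  if (first < b.lower || last > b.upper)
    return -1;  /* bounds violation (sketch only) */
  return p[i];
}

/* An instrumented caller is redirected to the clone and passes the
   bounds as ordinary call arguments.  */
int
caller (size_t i)
{
  static const int data[4] = { 10, 20, 30, 40 };
  struct bounds b = { (uintptr_t) data,
                      (uintptr_t) data + sizeof (data) - 1 };

  return load_chkp (data, b, i);
}
```

Because the bounds travel as a regular parameter, constant propagation
into load_chkp could in principle fold the check away for a constant
in-range index, which is the kind of IPA benefit described above.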

I made a prototype implementation of this approach in the following way:

- Add new IPA pass before early local passes to produce versions for
all functions to be instrumented.
- Put instrumentation pass after SSA pass.
- Add new pass after IPA passes to remove bodies of functions which
have instrumented versions. Function nodes may still be required for
calls in not instrumented code. But we do not emit this code and
therefore function bodies are not needed.

Positive changes are:

- IPA optimizations are not confused by bound parameters
- bounds are now more like regular arguments; it makes their
processing in expand easier
- functions with bounds not attached to any pointer are allowed

On simple code this approach worked well, but on bigger tests some
issues were revealed.

1. Node reachability. The instrumented version is actually always
reachable when the original function is reachable, because it is always
emitted instead of the original. Thus I had to fix reachability
analysis to achieve this. Another similar problem is the check whether a
node can be removed after inlining when inlining an instrumented function.
Not hard to fix, but probably other similar problems exist.

2. Function processing order. Function processing order is determined
before early local passes. But during function instrumentation the call
graph is modified significantly and the topological order in use becomes
outdated. That causes some trouble. E.g. a function marked as 'always
inline' cannot be inlined because it is not in SSA form yet. Surely the
inlining problem may be solved by just putting instrumentation after
early inline, but similar problems may exist in other passes too. To
resolve this problem I tried to split early local passes into three
parts. The first one builds SSA, the second one performs
instrumentation, the last one does the rest. Each part is performed on
all functions before the next one starts. Thus I get all functions in
SSA form and all instrumentation performed before starting early
optimizations. Unfortunately such pass ordering leads to invalid SSA
because the local_pure_const optimization affects the correctness of
callers (in case the caller's SSA was built before the optimization
revealed the 'pure' or 'const' flag).

In general I feel that having a special function version for
instrumentation has better potential and should lead to less intrusive
changes in the compiler and better code quality.

But before continuing this implementation I would like to get some
feedback and probably some advice on how to order passes to get the
best result. Currently I am inclined to have all functions instrumented
before any local optimizations and to solve the pure_const problem by
modifying all callers when the attribute is computed for some function.

Thanks,
Ilya

Re: [RFC] Using function clones for Pointer Bounds Checker

2014-01-14 Thread Ilya Enkovich

2014/1/14 Richard Biener :
> On Tue, Jan 14, 2014 at 10:15 AM, Ilya Enkovich  
> wrote:
>> Hi,
>>
>> I've been working for some time on the prototype of the Pointer Bounds
>> Checker which uses function clones for instrumentation
>> (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03327.html). After
>> several experiments with this approach I want to share my results and
>> ask for some feedback to make a decision about the future steps.
>>
>> Firstly I want to remind the reasons for digging in this direction. In
>> the original approach bounds of call arguments and input parameters
>> are associated with arguments via special built-in calls. It creates
>> implicit data flow compiler is not aware about which confuses some
>> optimizations resulting in miss-optimization and breaks bounds data
>> flow. Thus optimizations have to be fixed to get better pointers
>> protection.
>>
>> Clones approach does not use special built-in function calls to
>> associate bounds with call arguments and input parameters. Each
>> function which should be instrumented gets an additional version and
>> only this special version will be instrumented.This new version gets
>> additional bound arguments to express input bounds. When function call
>> is instrumented, it is redirected to instrumented version and all
>> bounds are passed as explicit call arguments. Thus we have explicit
>> pointer bounds flow similar to regular function parameters. It should
>> allow to avoid changes in optimization, avoid miss-optimizations,
>> allow existing IPA optimizations to work with bound args (e.g.
>> propagate constant bounds value and remove checks in called function).
>>
>> I made a prototype implementation of this approach in the following way:
>>
>> - Add new IPA pass before early local passes to produce versions for
>> all functions to be instrumented.
>> - Put instrumentation pass after SSA pass.
>> - Add new pass after IPA passes to remove bodies of functions which
>> have instrumented versions. Function nodes may still be required for
>> calls in not instrumented code. But we do not emit this code and
>> therefore function bodies are not needed.
>>
>> Positive changes are:
>>
>> - IPA optimizations are not confused by bound parameters
>> - bounds are now more like regular arguments; it makes their
>> processing in expand easier
>> - functions with bounds not attached to any pointer are allowed
>
> First of all thanks for trying to work in this direction.  Comments on the
> issues you encountered below (also CCed Honza as he should be more
> familiar with reachability and clone issues).
>
>> On simple codes this approach worked well but on a bigger tests some
>> issues were revealed.
>>
>> 1. Nodes reachability. Instrumented version is actually always
>> reachable when original function is reachable because it is always
>> emitted instead of the original. Thus I had to fix reachability
>> analysis to achieve it. Another similar problem is check whether node
>> can be removed after inline when inlining instrumented function. Not
>> hard to fix but probably other similar problems exist.
>
> I suppose you do not update the callgraph / the call stmts when
> creating the clones?  Btw, is it desirable to inline the uninstrumented
> function and then instrument the result (thus run cloning and
> instrumentation after early inlining?)?  Especially handling always_inlines
> before cloning/instrumentation looks very sensible.

Right. Created clones have the same code as the original function and
therefore the same cgraph edges. I suppose instrumentation after early
inlining is OK and may be preferred because inlining shouldn't lead to
any loss of bounds information. I tried a variant where instrumentation
runs right after early inlining but with cloning still before early
local passes. In general it looked OK.

>
>> 2. Function processing order. Function processing order is determined
>> before early local passes. But during function instrumentation call
>> graph is modified significantly and used topological order becomes
>> outdated. That causes some troubles. E.g. function marked as 'always
>> inline' cannot be inlined because it is not in SSA form yet. Surely
>> inlining problem may be solved by just putting instrumentation after
>> early inline, but similar problem may exist in other passes too. To
>> resolve this problem I tried to split early local passes into three
>> parts. The first one builds SSA, the second one performs
>> instrumentation, the last one does the rest. Each part is performed on
>> al

Re: [RFC] Using function clones for Pointer Bounds Checker

2014-01-14 Thread Ilya Enkovich

2014/1/14 Richard Biener :
> On Tue, Jan 14, 2014 at 1:47 PM, Ilya Enkovich  wrote:
>> 2014/1/14 Richard Biener :
>>> On Tue, Jan 14, 2014 at 10:15 AM, Ilya Enkovich  
>>> wrote:
>>>> Hi,
>>>>
>>>> I've been working for some time on the prototype of the Pointer Bounds
>>>> Checker which uses function clones for instrumentation
>>>> (http://gcc.gnu.org/ml/gcc-patches/2013-11/msg03327.html). After
>>>> several experiments with this approach I want to share my results and
>>>> ask for some feedback to make a decision about the future steps.
>>>>
>>>> Firstly I want to remind the reasons for digging in this direction. In
>>>> the original approach bounds of call arguments and input parameters
>>>> are associated with arguments via special built-in calls. It creates
>>>> implicit data flow compiler is not aware about which confuses some
>>>> optimizations resulting in miss-optimization and breaks bounds data
>>>> flow. Thus optimizations have to be fixed to get better pointers
>>>> protection.
>>>>
>>>> Clones approach does not use special built-in function calls to
>>>> associate bounds with call arguments and input parameters. Each
>>>> function which should be instrumented gets an additional version and
>>>> only this special version will be instrumented.This new version gets
>>>> additional bound arguments to express input bounds. When function call
>>>> is instrumented, it is redirected to instrumented version and all
>>>> bounds are passed as explicit call arguments. Thus we have explicit
>>>> pointer bounds flow similar to regular function parameters. It should
>>>> allow to avoid changes in optimization, avoid miss-optimizations,
>>>> allow existing IPA optimizations to work with bound args (e.g.
>>>> propagate constant bounds value and remove checks in called function).
>>>>
>>>> I made a prototype implementation of this approach in the following way:
>>>>
>>>> - Add new IPA pass before early local passes to produce versions for
>>>> all functions to be instrumented.
>>>> - Put instrumentation pass after SSA pass.
>>>> - Add new pass after IPA passes to remove bodies of functions which
>>>> have instrumented versions. Function nodes may still be required for
>>>> calls in not instrumented code. But we do not emit this code and
>>>> therefore function bodies are not needed.
>>>>
>>>> Positive changes are:
>>>>
>>>> - IPA optimizations are not confused by bound parameters
>>>> - bounds are now more like regular arguments; it makes their
>>>> processing in expand easier
>>>> - functions with bounds not attached to any pointer are allowed
>>>
>>> First of all thanks for trying to work in this direction.  Comments on the
>>> issues you encountered below (also CCed Honza as he should be more
>>> familiar with reachability and clone issues).
>>>
>>>> On simple codes this approach worked well but on a bigger tests some
>>>> issues were revealed.
>>>>
>>>> 1. Nodes reachability. Instrumented version is actually always
>>>> reachable when original function is reachable because it is always
>>>> emitted instead of the original. Thus I had to fix reachability
>>>> analysis to achieve it. Another similar problem is check whether node
>>>> can be removed after inline when inlining instrumented function. Not
>>>> hard to fix but probably other similar problems exist.
>>>
>>> I suppose you do not update the callgraph / the call stmts when
>>> creating the clones?  Btw, is it desirable to inline the uninstrumented
>>> function and then instrument the result (thus run cloning and
>>> instrumentation after early inlining?)?  Especially handling always_inlines
>>> before cloning/instrumentation looks very sensible.
>>
>> Right. Created clones have the same code as the original function and
>> therefore same cgraph edges. I suppose instrumentation after early
>> inlining is OK and may be preferred because inline shouldn't lead to
>> any losses of bounds information. I tried variant when instrumentation
>> works right after early inlining but with cloning still before early
>> local passes. In general it looked OK.
>>
>>>
>>>> 2. Function processing order. Function processing order is determined

Re: [PATCH, PR ipa/66566] Fix ICE in early_inliner: internal compiler error: in operator[]

2015-07-13 Thread Ilya Enkovich

Ping

2015-06-18 12:54 GMT+03:00 Ilya Enkovich :
> Hi,
>
> In early_inliner we recompute inline summaries for edges after 
> optimize_inline_calls, checking that the summary exists in case new edges 
> appear.  But then it calls inline_update_overall_summary, which also goes 
> through the edges' inline summaries but with no check this time, causing a 
> segfault.  This patch fixes it.  Bootstrapped and regtested for 
> x86_64-unknown-linux-gnu.  Is it OK for trunk and gcc-5-branch?
>
> Thanks,
> Ilya
> --
> gcc/
>
> 2015-06-18  Ilya Enkovich  
>
> PR ipa/66566
> * ipa-inline-analysis.c (estimate_calls_size_and_time): Check
> edge summary is available.
>
> gcc/testsuite/
>
> 2015-06-18  Ilya Enkovich  
>
> PR ipa/66566
> * gcc.target/i386/mpx/pr66566.c: New test.
>
>
> diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
> index bbde855..e910ac5 100644
> --- a/gcc/ipa-inline-analysis.c
> +++ b/gcc/ipa-inline-analysis.c
> @@ -3122,6 +3122,9 @@ estimate_calls_size_and_time (struct cgraph_node *node, 
> int *size,
>struct cgraph_edge *e;
>for (e = node->callees; e; e = e->next_callee)
>  {
> +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
> +   continue;
> +
>struct inline_edge_summary *es = inline_edge_summary (e);
>
>/* Do not care about zero sized builtins.  */
> @@ -3153,6 +3156,9 @@ estimate_calls_size_and_time (struct cgraph_node *node, 
> int *size,
>  }
>for (e = node->indirect_calls; e; e = e->next_callee)
>  {
> +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
> +   continue;
> +
>struct inline_edge_summary *es = inline_edge_summary (e);
>if (!es->predicate
>   || evaluate_predicate (es->predicate, possible_truths))
> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr66566.c 
> b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
> new file mode 100644
> index 000..a405c20
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
> +
> +union jsval_layout
> +{
> +  void *asPtr;
> +};
> +union jsval_layout a;
> +union jsval_layout b;
> +union jsval_layout __inline__ fn1() { return b; }
> +
> +void fn2() { a = fn1(); }

Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target

2015-07-14 Thread Ilya Enkovich

Hi,

Any comments on this?

Thanks,
Ilya

2015-06-19 16:21 GMT+03:00 Ilya Enkovich :
> Hi,
>
> This patch tries to improve 64bit integer computations on 32bit target.  This 
> is achieved by an additional i386 target pass which searches for all 
> conversion candidates and tries to transform them into vector mode when 
> profitable.
>
> Initial problem discussion had several assumptions that this optimization 
> should be done in RA.  But implementation of this in RA seems really complex. 
> I don't believe it can be done in a reasonable time.  And taking into 
> account the quite narrow performance impact, I believe a separate conversion 
> pass is a better solution.
>
> Here is a short list of changes:
>
> 1. Add insn templates for 64bit and/ior/xor/zext for 32bit target to avoid 
> split on expand
> 2. Add new pass to convert scalar computation into vector.  The flow of the 
> pass is as follows:
> a. Find all instructions we may convert
> b. Split them into chains of dependent instructions
> c. Estimate if chain conversion is profitable
> d. Convert chain if profitable
> 3. Add splits for non-converted insns
>
> The current cost model uses the processor_costs table to estimate how much 
> gain comes from using a single instruction vs. a pair of instructions, plus 
> the estimated cost of scalar->vector and back conversions.  Cost estimation 
> doesn't actually use the CFG and has (a lot of) room for improvement.  The 
> problem here is a lack of workloads to be used for tuning.
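
As a rough illustration of the "single instruction vs pair of
instructions" trade-off, the sketch below writes one 64-bit and/xor chain
twice: once as the source sees it, and once split by hand into 32-bit
halves, which mirrors what a 32-bit target does with general registers.
The function names are made up for this example, and the sketch says
nothing about the actual cost numbers the pass uses.

```c
#include <stdint.h>

/* A 64-bit chain as written in the source.  Kept in a vector register,
   each operation is a single instruction.  */
uint64_t
and_xor64 (uint64_t a, uint64_t b, uint64_t c)
{
  return (a & b) ^ c;
}

/* The same chain split into 32-bit halves: every operation becomes a
   pair, one for the low word and one for the high word.  */
uint64_t
and_xor64_split (uint64_t a, uint64_t b, uint64_t c)
{
  uint32_t lo = ((uint32_t) a & (uint32_t) b) ^ (uint32_t) c;
  uint32_t hi = ((uint32_t) (a >> 32) & (uint32_t) (b >> 32))
                ^ (uint32_t) (c >> 32);

  return ((uint64_t) hi << 32) | lo;
}
```

The longer such a chain is, the more pairs are saved against the fixed
cost of moving values between scalar and vector registers at its ends.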
>
> The added DI insns and splits for 32bit target delay insn splitting until 
> reload_completed.  It is a potential degradation for cases when conversion 
> doesn't happen.  The pass probably may be moved before the split1 pass to 
> allow early splitting of non-converted insns.  Or the new pass itself may 
> split non-converted chains.
>
> I also had to modify the register constraint of movdi for the sse->mem 
> alternative.  I understand we don't like this alternative for 64bit target but 
> for 32bit it is more useful.  E.g. I see mem->mem copies go through xmm 
> instead of a GPR pair with this change.  May we have separate xmm register 
> alternatives for 32bit and 64bit targets in movdi?
>
> Any comments/advice on this approach?
>
> I ran it for SPEC2000 and EEMBC1.1/2.0 on an Avoton server.  The only test 
> where conversion was applied is 186.crafty.  I got +6% on -O2 and +10% on 
> -Ofast + -flto.
>
> Thanks,
> Ilya
> --
> 2015-06-19  Ilya Enkovich  
>
> * config/i386/i386.c: Include dbgcnt.h.
> (has_non_address_hard_reg): New.
> (convertible_comparison_p): New.
> (scalar_to_vector_candidate_p): New.
> (remove_non_convertible_regs): New.
> (scalar_chain): New.
> (scalar_chain::scalar_chain): New.
> (scalar_chain::~scalar_chain): New.
> (scalar_chain::add_to_queue): New.
> (scalar_chain::mark_dual_mode_def): New.
> (scalar_chain::analyze_register_chain): New.
> (scalar_chain::add_insn): New.
> (scalar_chain::build): New.
> (scalar_chain::compute_convert_gain): New.
> (scalar_chain::replace_with_subreg): New.
> (scalar_chain::replace_with_subreg_in_insn): New.
> (scalar_chain::make_vector_copies): New.
> (scalar_chain::convert_reg): New.
> (scalar_chain::convert_op): New.
> (scalar_chain::convert_insn): New.
> (scalar_chain::convert): New.
> (convert_scalars_to_vector): New.
> (pass_data_stv): New.
> (pass_stv): New.
> (make_pass_stv): New.
> (ix86_option_override): Create and register stv pass.
> * config/i386/i386.md (SWIM1248x): New.
> (*movdi_internal): Remove '*' modifier for xmm to mem alternative.
> (and3): Use SWIM1248x iterator instead of SWIM.
> (*anddi3_doubleword): New.
> (*zext_doubleword): New.
> (*zextqi_doubleword): New.
> (3): Use SWIM1248x iterator instead of SWIM.
> (*di3_doubleword): New.
> * dbgcnt.def (stv_conversion): New.
>
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index 153dd85..6995afd 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -110,6 +110,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "tree-iterator.h"
>  #include "tree-chkp.h"
>  #include "rtl-chkp.h"
> +#include "dbgcnt.h"
>
>  static rtx legitimize_dllimport_symbol (rtx, bool);
>  static rtx legitimize_pe_coff_extern_decl (rtx, bool);
> @@ -2549,6 +2550,868 @@ rest_of_handle_insert_vzeroupper (void)
>return 0;
>  }
>
> +/*

[PATCH, PR testsuite/66734] Support MPX tests in lto.exp

2015-07-14 Thread Ilya Enkovich

Hi,

This patch initializes MPX paths in lto.exp to give check_effective_target_mpx 
a chance.  Is it OK for trunk?

Thanks,
Ilya
--
gcc/testsuite/

2015-07-14  Ilya Enkovich  

PR testsuite/66734
* gcc.dg/lto/lto.exp: Initialize MPX.


diff --git a/gcc/testsuite/gcc.dg/lto/lto.exp b/gcc/testsuite/gcc.dg/lto/lto.exp
index c1d7c4c..7b919c2 100644
--- a/gcc/testsuite/gcc.dg/lto/lto.exp
+++ b/gcc/testsuite/gcc.dg/lto/lto.exp
@@ -42,6 +42,7 @@ if { ![check_effective_target_lto] } {
 
 gcc_init
 lto_init no-mathlib
+mpx_init
 
 # Define an identifier for use with this suite to avoid name conflicts
 # with other lto tests running at the same time.

Re: [CHKP, GCC 5] Port a set of stability chkp patches to gcc-5-branch

2015-07-20 Thread Ilya Enkovich

Ping

2015-06-19 17:10 GMT+03:00 Ilya Enkovich :
> Hi,
>
> There was a set of stability fixes (mostly different ICEs) for Pointer Bounds 
> Checker done in GCC 6.  But only a few of them were approved to be ported to 
> GCC 5.  Will it be OK to port the other chkp-specific stability fixes to GCC 5?  
> Here is a list of patches:
>
>  https://gcc.gnu.org/ml/gcc-patches/2015-03/msg00995.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01067.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01065.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-05/msg01386.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01248.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01252.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01253.html
>  https://gcc.gnu.org/ml/gcc-patches/2015-06/msg01319.html
>
> Thanks,
> Ilya

Re: [PATCH, PR ipa/66566] Fix ICE in early_inliner: internal compiler error: in operator[]

2015-07-20 Thread Ilya Enkovich

Ping

2015-07-13 11:47 GMT+03:00 Ilya Enkovich :
> Ping
>
> 2015-06-18 12:54 GMT+03:00 Ilya Enkovich :
>> Hi,
>>
>> In early_inliner we recompute inline summaries for edges after 
>> optimize_inline_calls, checking that the summary exists in case new edges 
>> appear.  But then it calls inline_update_overall_summary, which also goes 
>> through the edges' inline summaries but with no check this time, causing a 
>> segfault.  This patch fixes it.  Bootstrapped and regtested for 
>> x86_64-unknown-linux-gnu.  Is it OK for trunk and gcc-5-branch?
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2015-06-18  Ilya Enkovich  
>>
>> PR ipa/66566
>> * ipa-inline-analysis.c (estimate_calls_size_and_time): Check
>> edge summary is available.
>>
>> gcc/testsuite/
>>
>> 2015-06-18  Ilya Enkovich  
>>
>> PR ipa/66566
>> * gcc.target/i386/mpx/pr66566.c: New test.
>>
>>
>> diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c
>> index bbde855..e910ac5 100644
>> --- a/gcc/ipa-inline-analysis.c
>> +++ b/gcc/ipa-inline-analysis.c
>> @@ -3122,6 +3122,9 @@ estimate_calls_size_and_time (struct cgraph_node 
>> *node, int *size,
>>struct cgraph_edge *e;
>>for (e = node->callees; e; e = e->next_callee)
>>  {
>> +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
>> +   continue;
>> +
>>struct inline_edge_summary *es = inline_edge_summary (e);
>>
>>/* Do not care about zero sized builtins.  */
>> @@ -3153,6 +3156,9 @@ estimate_calls_size_and_time (struct cgraph_node 
>> *node, int *size,
>>  }
>>for (e = node->indirect_calls; e; e = e->next_callee)
>>  {
>> +  if (inline_edge_summary_vec.length () <= (unsigned) e->uid)
>> +   continue;
>> +
>>struct inline_edge_summary *es = inline_edge_summary (e);
>>if (!es->predicate
>>   || evaluate_predicate (es->predicate, possible_truths))
>> diff --git a/gcc/testsuite/gcc.target/i386/mpx/pr66566.c 
>> b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
>> new file mode 100644
>> index 000..a405c20
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/i386/mpx/pr66566.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fcheck-pointer-bounds -mmpx" } */
>> +
>> +union jsval_layout
>> +{
>> +  void *asPtr;
>> +};
>> +union jsval_layout a;
>> +union jsval_layout b;
>> +union jsval_layout __inline__ fn1() { return b; }
>> +
>> +void fn2() { a = fn1(); }

[PATCH, i386, PR driver/66737] Don't pass '-z bndplt' to linker for 32bit target

2015-07-20 Thread Ilya Enkovich

Hi,

This patch adds a target filter for the '-z bndplt' linker option.  Bootstrapped 
and regtested for x86_64-unknown-linux-gnu.  MPX tests in lto.exp are no longer 
marked as unsupported for 32bit.  Going to commit it to trunk in a few 
days if no objections appear.

Thanks,
Ilya
--
gcc/

2015-07-20  Ilya Enkovich  

PR target/66737
* config/i386/linux-common.h (MPX_SPEC): Use linker option
for 64bit target only.


diff --git a/gcc/config/i386/linux-common.h b/gcc/config/i386/linux-common.h
index 63dd8d8..da09d3d 100644
--- a/gcc/config/i386/linux-common.h
+++ b/gcc/config/i386/linux-common.h
@@ -72,7 +72,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #ifndef MPX_SPEC
 #define MPX_SPEC "\
- %{mmpx:%{fcheck-pointer-bounds:%{!static:" LINK_MPX "}}}"
+ %{mmpx:%{fcheck-pointer-bounds:%{!static:%{" SPEC_64 ":" LINK_MPX "}}}}"
 #endif
 
 #ifndef LIBMPX_SPEC

Re: [PATCH][RTL-ifcvt] Improve conditional select ops on immediates

2015-08-03 Thread Ilya Enkovich

2015-08-03 17:04 GMT+03:00 Uros Bizjak :
> On Mon, Aug 3, 2015 at 3:02 PM, Kyrill Tkachov  wrote:
>>
>> On 03/08/15 13:33, Uros Bizjak wrote:
>>>
>>> Hello!
>>>
 2015-07-30  Kyrylo Tkachov  

  * ifcvt.c (noce_try_store_flag_constants): Make logic of the case
  when diff == STORE_FLAG_VALUE or diff == -STORE_FLAG_VALUE more
  explicit.  Prefer to add the flag whenever possible.
  (noce_process_if_block): Try noce_try_store_flag_constants before
  noce_try_cmove.

 2015-07-30  Kyrylo Tkachov  

  * gcc.target/aarch64/csel_bfx_1.c: New test.
  * gcc.target/aarch64/csel_imms_inc_1.c: Likewise.
>>>
>>> This patch regressed following tests on x86_64:
>>>
>>> FAIL: gcc.target/i386/cmov2.c scan-assembler sbb
>>> FAIL: gcc.target/i386/cmov3.c scan-assembler cmov[^3]
>
> The difference for cmov3.c on x86_64 is:
>
>    cmpl    %esi, %edi
>    movl    $-5, %edx
>    movl    $5, %eax
>    cmovg   %edx, %eax
>    ret
>
> vs. new code:
>
>    xorl    %eax, %eax
>    cmpl    %esi, %edi
>    setle   %al
>    negl    %eax
>    andl    $10, %eax
>    subl    $5, %eax
>    ret
>ret
>
> I'm not sure old code is really better than new. HJ, do you have any
> better insight?
>
> Uros.

The original code looks better: its dependency tree height is just 2, so it
can be executed in 2 cycles. The new code has more dependencies and its tree
height becomes 5. It is always hard to say for all x86 targets, but as
generic code the original version is better.
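
For reference, the two sequences correspond to the following C
formulations. This is a sketch reconstructed from the assembly quoted
above, not the actual cmov3.c source, and the function names are made up.

```c
/* Select form: two independent constant loads feed one conditional
   move, so the dependency chain is compare -> cmov.  */
int
select_cmov (int a, int b)
{
  return a > b ? -5 : 5;
}

/* Arithmetic form: each step consumes the previous result, so the
   chain is compare -> setle -> neg -> and -> sub.  */
int
select_arith (int a, int b)
{
  int flag = a <= b;       /* setle: 1 or 0 */
  int mask = -flag;        /* negl: -1 or 0 */

  return (mask & 10) - 5;  /* andl, subl: 5 or -5 */
}
```

Both functions return the same value for every input; they differ only
in the length of the dependency chain.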

Thanks,
Ilya

Re: [RFC] Masking vectorized loops with bound not aligned to VF.

2015-09-17 Thread Ilya Enkovich

2015-09-16 15:30 GMT+03:00 Richard Biener :
> On Mon, 14 Sep 2015, Kirill Yukhin wrote:
>
>> Hello,
>> I'd like to initiate discussion on vectorization of loops which
>> boundaries are not aligned to VF. Main target for this optimization
>> right now is x86's AVX-512, which features per-element embedded masking
>> for all instructions. The main goal for this mail is to agree on overall
>> design of the feature.
>>
>> This approach was presented @ GNU Cauldron 2015 by Ilya Enkovich [1].
>>
>> Here's a sketch of the algorithm:
>>   1. Add check on basic stmts for masking: possibility to introduce index 
>> vector and
>>  corresponding mask
>>   2. At the check if statements are vectorizable we additionally check if 
>> stmts
>>  need and can be masked and compute masking cost. Result is stored in 
>> `stmt_vinfo`.
>>  We are going  to mask only mem. accesses, reductions and modify mask 
>> for already
>>  masked stmts (mask load, mask store and vect. condition)
>
> I think you also need to mask divisions (for integer divide by zero) and
> want to mask FP ops which may result in NaNs or denormals (because that's
> generally to slow down execution a lot in my experience).
>
> Why not simply mask all stmts?

Hi,

Statement masking may not be free, especially if we need to transform
the mask somehow to do it. It also may be unsupported on a platform (e.g.
for AVX-512 not all instructions support masking), which still need not
prevent masking a loop. BTW for AVX-512 masking doesn't boost
performance even if we have some special cases like NaNs. We don't
consider exceptions in vector code (and it seems to be the case now?),
otherwise we would need to mask them also.

>
>>   3. Make a decision about masking: take computed costs and est. iterations 
>> count
>>  into consideration
>>   4. Modify prologue/epilogue generation according decision made at 
>> analysis. Three
>>  options available:
>> a. Use scalar remainder
>> b. Use masked remainder. Won't be supported in first version
>> c. Mask main loop
>>   5.Support vectorized loop masking:
>> - Create stmts for mask generation
>> - Support generation of masked vector code (create generic vector code 
>> then
>>   patch it w/ masks)
>>   -  Mask loads/stores/vconds/reductions only
>>
>>  In first version (targeted v6) we're not going to support 4.b and loop
>> mask pack/unpack. No `pack/unpack` means that masking will be supported
>> only for types w/ the same size as index variable
>
> This means that if ncopies for any stmt is > 1 masking won't be supported,
> right?  (you'd need two or more different masks)

We don't think it is a very important feature to have in the initial
version. It can be added later and shouldn't affect the overall
implementation design much. BTW currently masked loads and stores
don't support masks of other sizes and don't do mask pack/unpack.
>
>>
>> [1] - 
>> https://gcc.gnu.org/wiki/cauldron2015?action=AttachFile&do=view&target=Vectorization+for+Intel+AVX-512.pdf
>>
>> What do you think?
>
> There was the idea some time ago to use single-iteration vector
> variants for prologues/epilogues by simply overlapping them with
> the vector loop (and either making sure to mask out the overlap
> area or make sure the result stays the same).  This kind-of is
> similar to 4b and thus IMHO it's better to get 4b implemented
> rather than trying 4c.  So for example
>
>  int a[];
>  for (i=0; i < 13; ++i)
>    a[i] = i;
>
> would be vectorized (with v4si) as
>
>  for (i=0; i < 13 / 4; ++i)
>    ((v4si *)a)[i] = { ... };
>  *(v4si *)(&a[9]) = { ... };
>
> where the epilogue store of course would be unaligned.  The masked
> variant can avoid the data pointer adjustment and instead use a masked
> store.
>
> OTOH it might be that the unaligned scheme is as efficient as the
> masked version.  Only the masked version is more trivially correct,
> data dependences can make the above idea not work without masking
> out stores like for
>
>  for (i=0; i < 13; ++i)
>    a[i] = a[i+1];
>
> obviously the overlapping iterations in the epilogue would
> compute bogus values.  To avoid this we can merge the result
> with the previously stored values (using properly computed masks)
> before storing it.
>
> Basically both 4b and the above idea need to peel a vector
> iteration and "modify" it.  The same trick can be applied to
> prologue loops of course.
>
> Any chance you can try working on 4b instead?  It als
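
[Editor's sketch, not part of the original mail.] The overlapping-epilogue
idea quoted above can be emulated in plain C, with an inner loop over VF lanes
standing in for one v4si operation. The function names, the VF constant and
the "store only lanes at index >= i" mask are assumptions made for
illustration; this is not what the vectorizer emits.

```c
#include <assert.h>
#include <stddef.h>

enum { VF = 4 };  /* lanes per "vector", as with v4si */

/* a[i] = i for i in [0, n), written only in whole VF-wide chunks.
   The last chunk is moved back so that it ends exactly at n and
   overlaps the previous one; recomputing those lanes is harmless
   because the stored value depends only on the index.  */
static void
fill_iota_overlapped (int *a, size_t n)
{
  size_t i, lane;

  assert (n >= VF);
  for (i = 0; i + VF <= n; i += VF)
    for (lane = 0; lane < VF; ++lane)
      a[i + lane] = (int) (i + lane);

  if (i < n)
    {
      size_t start = n - VF;
      for (lane = 0; lane < VF; ++lane)
        a[start + lane] = (int) (start + lane);
    }
}

/* a[i] = a[i + 1] for i in [0, n); a must hold n + 1 elements.
   Here the overlapped lanes were already updated by the main loop,
   so the epilogue stores only lanes at index >= i (the "mask").  */
static void
shift_left_masked (int *a, size_t n)
{
  size_t i, lane;
  int tmp[VF];

  assert (n >= VF);
  for (i = 0; i + VF <= n; i += VF)
    {
      for (lane = 0; lane < VF; ++lane)   /* vector load */
        tmp[lane] = a[i + lane + 1];
      for (lane = 0; lane < VF; ++lane)   /* vector store */
        a[i + lane] = tmp[lane];
    }

  if (i < n)
    {
      size_t start = n - VF;
      for (lane = 0; lane < VF; ++lane)   /* overlapping vector load */
        tmp[lane] = a[start + lane + 1];
      for (lane = 0; lane < VF; ++lane)
        if (start + lane >= i)            /* masked store */
          a[start + lane] = tmp[lane];
    }
}
```

With n = 13 the epilogue chunk starts at index 9, matching the &a[9] store in
the quoted example. Dropping the mask test in shift_left_masked would rewrite
a[9]..a[11] from values the main loop has already shifted, which is the bogus
result the quoted text warns about.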

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-18 Thread Ilya Enkovich

2015-09-17 20:35 GMT+03:00 Richard Henderson :
> On 09/15/2015 06:52 AM, Ilya Enkovich wrote:
>> I made a step forward forcing vector comparisons have a mask (vec<bool>)
>> result and disabling bool patterns in case vector comparison is supported
>> by target.  Several issues were met.
>>
>>  - c/c++ front-ends generate vector comparison with integer vector result.
>>    I had to make some modifications to use vec_cond instead.  Don't know if
>>    there are other front-ends producing vector comparisons.
>>  - vector lowering fails to expand vector masks due to mismatch of type and
>>    mode sizes.  I fixed vector type size computation to match mode size and
>>    added a special handling of mask expand.
>>  - I disabled canonical type creation for vector mask because we can't
>>    layout it with VOID mode. I don't know why we may need a canonical type
>>    here.  But get_mask_mode call may be moved into type layout to get it.
>>  - Expand of vec<bool> constants/constructors requires special handling.
>>    Common case should require target hooks/optabs to expand vector into
>>    required mode.  But I suppose we want to have a generic code to handle
>>    vector of int mode case to avoid modification of multiple targets which
>>    use default vec<bool> modes.
>>
>> Currently 'make check' shows two types of regression.
>>   - missed vector expression pattern recognition (MIN, MAX, ABS, VEC_COND).
>>     This must be due to my front-end changes.  Hope it will be easy to fix.
>>   - missed vectorization. All of them appear due to bool patterns disabling.
>>     I didn't look into all of them but it seems the main problem is in mixed
>>     type sizes.  With bool patterns and integer vector masks we just put
>>     int->(other sized int) conversion for masks and it gives us required mask
>>     transformation.  With boolean mask we don't have proper scalar statements
>>     to do that.  I think mask widening/narrowing may be directly supported in
>>     masked statements vectorization.  Going to look into it.
>>
>> I attach what I currently have for a prototype.  It grows bigger so I split
>> into several parts.
>
> The general approach looks good.
>

Great!

>
>> +/* By defaults a vector of integers is used as a mask.  */
>> +
>> +machine_mode
>> +default_get_mask_mode (unsigned nunits, unsigned vector_size)
>> +{
>> +  unsigned elem_size = vector_size / nunits;
>> +  machine_mode elem_mode
>> += smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
>
> Why these arguments as opposed to passing elem_size?  It seems that every hook
> is going to have to do this division...

Every target would have nunits = vector_size / elem_size because
nunits is used to create a vector mode. Thus no difference.

>
>> +#define VECTOR_MASK_TYPE_P(TYPE) \
>> +  (TREE_CODE (TYPE) == VECTOR_TYPE   \
>> +   && TREE_CODE (TREE_TYPE (TYPE)) == BOOLEAN_TYPE)
>
> Perhaps better as VECTOR_BOOLEAN_TYPE_P, since that's exactly what's being 
> tested?

OK

>
>> @@ -3464,10 +3464,10 @@ verify_gimple_comparison (tree type, tree op0, tree 
>> op1)
>>return true;
>>  }
>>  }
>> -  /* Or an integer vector type with the same size and element count
>> +  /* Or a boolean vector type with the same element count
>>   as the comparison operand types.  */
>>else if (TREE_CODE (type) == VECTOR_TYPE
>> -&& TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
>> +&& TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)
>
> VECTOR_BOOLEAN_TYPE_P.
>
>> @@ -122,7 +122,19 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
>> tree t, tree bitsize, tree bitpos)
>>  {
>>if (bitpos)
>> -return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
>> +{
>> +  if (TREE_CODE (type) == BOOLEAN_TYPE)
>> + {
>> +   tree itype
>> + = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0);
>> +   tree field = gimplify_build3 (gsi, BIT_FIELD_REF, itype, t,
>> + bitsize, bitpos);
>> +   return gimplify_build2 (gsi, NE_EXPR, type, field,
>> +   build_zero_cst (itype));
>> + }
>> +  else
>> + return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
>> +}
>>else
>>  return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t);
>>  }
>
> So... this is us lowering vector operations on a

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-18 Thread Ilya Enkovich

2015-09-18 15:22 GMT+03:00 Richard Biener :
> On Thu, Sep 3, 2015 at 3:57 PM, Ilya Enkovich  wrote:
>> 2015-09-03 15:11 GMT+03:00 Richard Biener :
>>> On Thu, Sep 3, 2015 at 2:03 PM, Ilya Enkovich  
>>> wrote:
>>>> Adding CCs.
>>>>
>>>> 2015-09-03 15:03 GMT+03:00 Ilya Enkovich :
>>>>> 2015-09-01 17:25 GMT+03:00 Richard Biener :
>>>>>
>>>>> Totally disabling old style vector comparison and bool pattern is a
>>>>> goal but doing that would mean a lot of regressions for many targets.
>>>>> Do you want it to be tried to estimate amount of changes required
>>>>> and reveal possible issues? What would be integration plan for these
>>>>> changes? Do you want to just introduce new vector<bool> in GIMPLE
>>>>> disabling bool patterns and then resolving vectorization regression on
>>>>> all targets or allow them live together with following target switch
>>>>> one by one from bool patterns with finally removing them? Not all
>>>>> targets are likely to be adopted fast I suppose.
>>>
>>> Well, the frontends already create vec_cond exprs I believe.  So for
>>> bool patterns the vectorizer would have to do the same, but the
>>> comparison result in there would still use vec.  Thus the scalar
>>>
>>>  _Bool a = b < c;
>>>  _Bool c = a || d;
>>>  if (c)
>>>
>>> would become
>>>
>>>  vec a = VEC_COND ;
>>>  vec c = a | d;
>>
>> This should be identical to
>>
>> vec<_Bool> a = a < b;
>> vec<_Bool> c = a | d;
>>
>> where vec<_Bool> has VxSI mode. And we should prefer it in case target
>> supports vector comparison into vec, right?
>>
>>>
>>> when the target does not have vecs directly and otherwise
>>> vec directly (dropping the VEC_COND).
>>>
>>> Just the vector comparison inside the VEC_COND would always
>>> have vec type.
>>
>> I don't really understand what you mean by 'doesn't have vecs
>> directly' here. Currently I have a hook to ask for a vec mode
>> and assume target doesn't support it in case it returns VOIDmode. But
>> in such case I have no mode to use for vec inside VEC_COND
>> either.
>
> I was thinking about targets not supporting generating vec
> (of whatever mode) from a comparison directly but only via
> a COND_EXPR.

Where may these direct comparisons come from? The vectorizer never
generates unsupported statements. Does it mean we get them from the
gimplifier? So touch optabs in the gimplifier to avoid direct comparisons?
Actually vect lowering checks whether we are able to make the comparison,
and expand also uses vec_cond to expand vector comparisons, so we can
probably live with them.

>
>> In default implementation of the new target hook I always return
>> integer vector mode (to have default behavior similar to the current
>> one). It should allow me to use vec for conditions in all
>> vec_cond. But we'd need some other trigger for bool patterns to apply.
>> Probably check vec_cmp optab in check_bool_pattern and don't convert
>> in case comparison is supported by target? Or control it via
>> additional hook.
>
> Not sure if we are always talking about the same thing for
> "bool patterns".  I'd remove bool patterns completely, IMHO
> they are not necessary at all.

I refer to transformations made by vect_recog_bool_pattern. Don't see
how to remove them completely for targets not supporting comparison
vectorization.

>
>>>
>>> And the "bool patterns" I am talking about are those in
>>> tree-vect-patterns.c, not any targets instruction patterns.
>>
>> I refer to them also. BTW bool patterns also pull comparison into
>> vec_cond. Thus we cannot have SSA_NAME in vec_cond as a condition. I
>> think with vector comparisons in place we should allow SSA_NAME as
>> conditions in VEC_COND for better CSE. That should require new vcond
>> optabs though.
>
> I think we do allow this, just the vectorizer doesn't expect it.  In the long
> run I want to get rid of the GENERIC exprs in both COND_EXPR and
> VEC_COND_EXPR.  Just didn't have the time to do this...

That would be nice. As a first step I'd like to support optabs for
VEC_COND_EXPR directly using vec.

Thanks,
Ilya

>
> Richard.
>
>> Ilya
>>
>>>
>>> Richard.
>>>
>>>>>
>>>>> Ilya

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-18 Thread Ilya Enkovich

2015-09-18 15:29 GMT+03:00 Richard Biener :
> On Tue, Sep 15, 2015 at 3:52 PM, Ilya Enkovich  wrote:
>> On 08 Sep 15:37, Ilya Enkovich wrote:
>>> 2015-09-04 23:42 GMT+03:00 Jeff Law :
>>> >
>>> > So do we have enough confidence in this representation that we want to go
>>> > ahead and commit to it?
>>>
>>> I think new representation fits nice mostly. There are some places
>>> where I have to make some exceptions for vector of bools to make it
>>> work. This is mostly to avoid target modifications. I'd like to avoid
>>> necessity to change all targets currently supporting vec_cond. It
>>> makes me add some special handling of vec in GIMPLE, e.g. I add
>>> a special code in vect_init_vector to build vec invariants with
>>> proper casting to int. Otherwise I'd need to do it on a target side.
>>>
>>> I made several fixes and current patch (still allowing integer vector
>>> result for vector comparison and applying bool patterns) passes
>>> bootstrap and regression testing on x86_64. Now I'll try to fully
>>> switch to vec and see how it goes.
>>>
>>> Thanks,
>>> Ilya
>>>
>>
>> Hi,
>>
>> I made a step forward forcing vector comparisons have a mask (vec<bool>)
>> result and disabling bool patterns in case vector comparison is supported
>> by target.  Several issues were met.
>>
>>  - c/c++ front-ends generate vector comparison with integer vector result.
>>    I had to make some modifications to use vec_cond instead.  Don't know if
>>    there are other front-ends producing vector comparisons.
>>  - vector lowering fails to expand vector masks due to mismatch of type and
>>    mode sizes.  I fixed vector type size computation to match mode size and
>>    added a special handling of mask expand.
>>  - I disabled canonical type creation for vector mask because we can't
>>    layout it with VOID mode. I don't know why we may need a canonical type
>>    here.  But get_mask_mode call may be moved into type layout to get it.
>>  - Expand of vec<bool> constants/constructors requires special handling.
>>    Common case should require target hooks/optabs to expand vector into
>>    required mode.  But I suppose we want to have a generic code to handle
>>    vector of int mode case to avoid modification of multiple targets which
>>    use default vec<bool> modes.
>
> One complication you might run into currently is that at the moment we
> require the comparison result to be
> of the same size as the comparison operands.  This means that
> vector with, say, 4 elements has
> to support different modes for v4si < v4si vs. v4df < v4df (if you
> think of x86 with its multiple vector sizes).
> That's for the "fallback" non-mask vector only of course.  Does
> that mean we have to use different
> bool types with different modes here?

I thought about boolean types with different sizes/modes. I still avoid
them but it causes some ugliness. E.g. sizeof(innertype)*nelems !=
sizeof(vectortype) for vec<bool>. It causes some special handling in
type layout and problems in lowering because BIT_FIELD_REF uses more
bits than the resulting type has. I use an additional comparison to handle
it. Richard also proposed to extract one bit only for bools. Don't
know if differently sized boolean types may help to resolve this issue
or create more problems.
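
[Editor's sketch, not part of the original mail.] The size mismatch and the
"additional comparison" can be pictured with a 4-lane mask stored in a 16-byte
integer-vector mode: the element type is a 1-byte bool, yet each lane occupies
4 bytes. The constants and the mask_lane helper below are assumptions chosen
for illustration; the sketch only mirrors the shape of the tree_vec_extract
change quoted earlier in the thread (read the whole field as an integer, then
compare it with zero).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { NELEMS = 4, MASK_BYTES = 16 };  /* 4 lanes kept in a V4SI-sized mask */

/* Extract one boolean lane from a mask whose lanes are 32 bits wide.
   Reading the field directly as a bool would use more bits than the
   bool type has, so read it as a 32-bit integer and compare with zero.  */
static bool
mask_lane (const unsigned char mask[MASK_BYTES], unsigned lane)
{
  uint32_t field;

  assert (lane < NELEMS);
  memcpy (&field, mask + lane * sizeof field, sizeof field);
  return field != 0;
}
```

Here NELEMS * sizeof (bool) is far smaller than MASK_BYTES on typical targets,
which is the inequality mentioned in the paragraph above.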

>
> So the other possibility is to never expose the fallback vector
> anywhere but make sure to lower to
> vector via VEC_COND_EXPRs.  After all it's only the vectorizer
> that should create stmts with
> vector LHS and the vectorizer is already careful to only
> generate code supported by the target.

In case vec<bool> has an integer vector mode, comparison should be
handled similarly to VEC_COND_EXPR by vect lowering and expand, which
should be enough to have it properly handled on targets with no
vec<bool> support.
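
[Editor's sketch, not part of the original mail.] When the mask has an integer
vector mode, a comparison amounts to a lane-wise "cond ? -1 : 0", and the
resulting mask can drive a select with plain bitwise operations. The scalar
loops below emulate that per lane; toy_vec_lt and toy_vec_select are
hypothetical names, not GCC functions.

```c
#include <assert.h>
#include <stddef.h>

/* Integer-mask comparison: each lane becomes all-ones (-1) or zero,
   i.e. the lane-wise equivalent of VEC_COND <a < b, -1, 0>.  */
static void
toy_vec_lt (const int *a, const int *b, int *mask, size_t n)
{
  size_t i;

  for (i = 0; i < n; ++i)
    mask[i] = a[i] < b[i] ? -1 : 0;
}

/* Lane-wise select driven by such a mask: x where the mask is
   all-ones, y where it is zero.  */
static void
toy_vec_select (const int *mask, const int *x, const int *y,
                int *out, size_t n)
{
  size_t i;

  for (i = 0; i < n; ++i)
    {
      assert (mask[i] == 0 || mask[i] == -1);
      out[i] = (x[i] & mask[i]) | (y[i] & ~mask[i]);
    }
}
```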

Thanks,
Ilya

>
>> Currently 'make check' shows two types of regression.
>>   - missed vector expression pattern recognition (MIN, MAX, ABS, VEC_COND).
>>     This must be due to my front-end changes.  Hope it will be easy to fix.
>>   - missed vectorization. All of them appear due to bool patterns disabling.
>>     I didn't look into all of them but it seems the main problem is in mixed
>>     type sizes.  With bool patterns and integer vector masks we just put
>>     int->(other sized int) conversion for masks and it gives us required mask
>>     transformation.  With boolean mask we don't have proper scalar statements
>>     to do that.  I think mask widening/narrowing may be directly supported in
>>     masked statements vectorization.  Going to look into it.
>>
>> I attach what I currently have for a prototype.  It grows bigger so I split
>> into several parts.
>>
>> Thanks,
>> Ilya

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-21 Thread Ilya Enkovich

2015-09-18 19:50 GMT+03:00 Richard Henderson :
> On 09/18/2015 06:21 AM, Ilya Enkovich wrote:
>>>> +machine_mode
>>>> +default_get_mask_mode (unsigned nunits, unsigned vector_size)
>>>> +{
>>>> +  unsigned elem_size = vector_size / nunits;
>>>> +  machine_mode elem_mode
>>>> += smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
>>>
>>> Why these arguments as opposed to passing elem_size?  It seems that every 
>>> hook
>>> is going to have to do this division...
>>
>> Every target would have nunits = vector_size / elem_size because
>> nunits is used to create a vector mode. Thus no difference.
>
> I meant passing nunits and elem_size, but not vector_size.  Thus no division
> required.  If the target does require the vector size, it could be obtained by
> multiplication, which is cheaper.  But in cases like this we'd not require
> either mult or div.

OK
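
[Editor's sketch, not part of the original mail.] The two hook signatures
under discussion can be compared with a toy stand-in for a mask mode. The
struct, both function names and the fixed 8 bits per unit are assumptions for
illustration only; the real hook returns a machine_mode.

```c
#include <assert.h>

/* Toy stand-in for a mask mode: element width in bits plus lane count.
   This is not GCC's machine_mode.  */
struct toy_mask_mode
{
  unsigned elem_bits;
  unsigned nunits;
};

/* Signature from the posted patch: (nunits, vector_size in bytes).
   The element size has to be recovered by a division.  */
static struct toy_mask_mode
toy_mask_mode_from_vector_size (unsigned nunits, unsigned vector_size)
{
  struct toy_mask_mode m;

  assert (nunits != 0 && vector_size % nunits == 0);
  m.elem_bits = (vector_size / nunits) * 8;  /* assumes 8 bits per unit */
  m.nunits = nunits;
  return m;
}

/* Suggested signature: (nunits, elem_size in bytes).  No division is
   needed; a target that wants the vector size can multiply.  */
static struct toy_mask_mode
toy_mask_mode_from_elem_size (unsigned nunits, unsigned elem_size)
{
  struct toy_mask_mode m;

  m.elem_bits = elem_size * 8;               /* assumes 8 bits per unit */
  m.nunits = nunits;
  return m;
}
```

Both variants describe the same mode for a 16-byte vector of four elements;
they differ only in which quantity the caller has to supply.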

>
>>>> @@ -1885,7 +1885,9 @@ expand_MASK_LOAD (gcall *stmt)
>>>>create_output_operand (&ops[0], target, TYPE_MODE (type));
>>>>create_fixed_operand (&ops[1], mem);
>>>>create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
>>>> -  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
>>>> +  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
>>>> +   TYPE_MODE (TREE_TYPE (maskt))),
>>>> +3, ops);
>>>
>>> Why do we now need a conversion here?
>>
>> Mask mode was implicit for masked loads and stores. Now it becomes
>> explicit because we may load the same value using different masks.
>> E.g. for i386 we may load 256bit vector using both vector and scalar
>> masks.
>
> Ok, sure, the mask mode is needed, I get that.  But that doesn't answer the
> question regarding conversion.  Why would convert_optab_handler be needed to
> *change* the mode of the mask.  I assume that's not actually possible, with 
> the
> target hook already having chosen the proper mode for the mask.

There is no conversion here. maskload_optab is a convert_optab
because it uses two modes, one for the value and the other for the mask.
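
[Editor's sketch, not part of the original mail.] A handler table keyed by two
modes can be pictured as below. The enum values, the table contents and
toy_maskload_handler are hypothetical; they only illustrate that the same
value mode may pair with either a vector mask or a scalar mask, which a
single-mode key cannot express.

```c
#include <assert.h>
#include <stddef.h>

/* Toy modes; the names only echo GCC mode names.  */
enum toy_mode { TOY_V8SI, TOY_V4SI, TOY_QI };

enum { TOY_NO_HANDLER = -1 };

struct toy_entry
{
  enum toy_mode value_mode;
  enum toy_mode mask_mode;
  int handler;               /* stand-in for an insn code */
};

/* The same 256-bit value mode appears twice, once per mask flavour.  */
static const struct toy_entry toy_maskload_table[] = {
  { TOY_V8SI, TOY_V8SI, 1 },  /* vector mask */
  { TOY_V8SI, TOY_QI, 2 },    /* scalar (bit) mask */
};

/* Look up a handler by (value mode, mask mode).  */
static int
toy_maskload_handler (enum toy_mode value_mode, enum toy_mode mask_mode)
{
  size_t i;
  size_t n = sizeof toy_maskload_table / sizeof toy_maskload_table[0];

  for (i = 0; i < n; ++i)
    if (toy_maskload_table[i].value_mode == value_mode
        && toy_maskload_table[i].mask_mode == mask_mode)
      return toy_maskload_table[i].handler;
  return TOY_NO_HANDLER;
}
```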

Ilya

>
>
> r~

Re: [RFC, PR target/65105] Use vector instructions for scalar 64bit computations on 32bit target

2015-09-23 Thread Ilya Enkovich

On 14 Sep 17:50, Uros Bizjak wrote:
> 
> +(define_insn_and_split "*zext_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI (match_operand:SWI24 1 "nonimmediate_operand" "rm")))]
> +  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
> +  "#"
> +  "&& reload_completed && GENERAL_REG_P (operands[0])"
> +  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
> +   (set (match_dup 2) (const_int 0))]
> +  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
> +
> +(define_insn_and_split "*zextqi_doubleword"
> +  [(set (match_operand:DI 0 "register_operand" "=r")
> + (zero_extend:DI (match_operand:QI 1 "nonimmediate_operand" "qm")))]
> +  "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
> +  "#"
> +  "&& reload_completed && GENERAL_REG_P (operands[0])"
> +  [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
> +   (set (match_dup 2) (const_int 0))]
> +  "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")
> +
> 
> Please put the above patterns together with other zero_extend
> patterns. You can also merge these two patterns using SWI124 mode
> iterator with  mode attribute as a register constraint. Also, no
> need to check for GENERAL_REG_P after reload, when "r" constraint is
> in effect:
> 
> (define_insn_and_split "*zext_doubleword"
>   [(set (match_operand:DI 0 "register_operand" "=r")
>  (zero_extend:DI (match_operand:SWI124 1 "nonimmediate_operand" "m")))]
>   "!TARGET_64BIT && TARGET_STV && TARGET_SSE2"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 0) (zero_extend:SI (match_dup 1)))
>(set (match_dup 2) (const_int 0))]
>   "split_double_mode (DImode, &operands[0], 1, &operands[0], &operands[2]);")

The register constraint doesn't affect the split, and I need GENERAL_REG_P
to filter out the cases where other registers are used.

I merged the QI and HI cases of zext but made a separate pattern for the SI
case because it doesn't need zero_extend in the resulting code.  Bootstrapped
and regtested for x86_64-unknown-linux-gnu.

Thanks,
Ilya
--
gcc/

2015-09-23  Ilya Enkovich  

* config/i386/i386.c: Include dbgcnt.h.
(has_non_address_hard_reg): New.
(convertible_comparison_p): New.
(scalar_to_vector_candidate_p): New.
(remove_non_convertible_regs): New.
(scalar_chain): New.
(scalar_chain::scalar_chain): New.
(scalar_chain::~scalar_chain): New.
(scalar_chain::add_to_queue): New.
(scalar_chain::mark_dual_mode_def): New.
(scalar_chain::analyze_register_chain): New.
(scalar_chain::add_insn): New.
(scalar_chain::build): New.
(scalar_chain::compute_convert_gain): New.
(scalar_chain::replace_with_subreg): New.
(scalar_chain::replace_with_subreg_in_insn): New.
(scalar_chain::emit_conversion_insns): New.
(scalar_chain::make_vector_copies): New.
(scalar_chain::convert_reg): New.
(scalar_chain::convert_op): New.
(scalar_chain::convert_insn): New.
(scalar_chain::convert): New.
(convert_scalars_to_vector): New.
(pass_data_stv): New.
(pass_stv): New.
(make_pass_stv): New.
(ix86_option_override): Create and register stv pass.
(flag_opts): Add -mstv.
(ix86_option_override_internal): Likewise.
* config/i386/i386.md (SWIM1248x): New.
(*movdi_internal): Add xmm to mem alternative for TARGET_STV.
(and3): Use SWIM1248x iterator instead of SWIM.
(*anddi3_doubleword): New.
(*zext_doubleword): New.
(*zextsi_doubleword): New.
(3): Use SWIM1248x iterator instead of SWIM.
(*di3_doubleword): New.
* config/i386/i386.opt (mstv): New.
* dbgcnt.def (stv_conversion): New.

gcc/testsuite/

2015-09-23  Ilya Enkovich  

* gcc.target/i386/pr65105-1.c: New.
* gcc.target/i386/pr65105-2.c: New.
* gcc.target/i386/pr65105-3.c: New.
* gcc.target/i386/pr65105-4.C: New.
* gcc.dg/lower-subreg-1.c: Add -mno-stv options for ia32.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d547cfd..2663f85 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -87,6 +87,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-iterator.h"
 #include "tree-chkp.h"
 #include "rtl-chkp.h"
+#include "dbgcnt.h"
 
 /* This file should be included last.  */
 #include "target-def.

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-23 Thread Ilya Enkovich

2015-09-18 16:40 GMT+03:00 Ilya Enkovich :
> 2015-09-18 15:22 GMT+03:00 Richard Biener :
>>
>> I was thinking about targets not supporting generating vec
>> (of whatever mode) from a comparison directly but only via
>> a COND_EXPR.
>
> Where may these direct comparisons come from? Vectorizer never
> generates unsupported statements. It means we get them from
> gimplifier? So touch optabs in gimplifier to avoid direct comparisons?
> Actually vect lowering checks if we are able to make comparison and
> expand also uses vec_cond to expand vector comparison, so probably we
> may live with them.
>
>>
>> Not sure if we are always talking about the same thing for
>> "bool patterns".  I'd remove bool patterns completely, IMHO
>> they are not necessary at all.
>
> I refer to transformations made by vect_recog_bool_pattern. Don't see
> how to remove them completely for targets not supporting comparison
> vectorization.
>
>>
>> I think we do allow this, just the vectorizer doesn't expect it.  In the long
>> run I want to get rid of the GENERIC exprs in both COND_EXPR and
>> VEC_COND_EXPR.  Just didn't have the time to do this...
>
> That would be nice. As a first step I'd like to support optabs for
> VEC_COND_EXPR directly using vec.
>
> Thanks,
> Ilya
>
>>
>> Richard.

Hi Richard,

Do you think we have enough confidence that the approach is working and that
we may start integrating it into trunk? What would the integration plan be then?

Thanks,
Ilya

Re: [PATCH, PR67405, committed] Avoid NULL pointer dereference

2015-09-24 Thread Ilya Enkovich

2015-09-15 14:01 GMT+03:00 Ilya Enkovich :
> 2015-09-15 13:32 GMT+03:00 Richard Biener :
>> On Tue, Sep 15, 2015 at 11:28 AM, Ilya Enkovich  
>> wrote:
>>
>> I see.  I wonder why we even call chkp_find_bound_slots if seen_errors().
>
> Even with errors we still gimplify function. Function parameters
> gimplification checks where parameters are passed to possibly copy
> some of them. It triggers ix86_function_arg_advance which uses
> chkp_find_bound_slots to skip required amount of bounds registers.
>
>> I suppose only recursing for COMPLETE_TYPE_P () would work?
>
> Yep, it should work. I'll rework my fix.

It turned out to be wrong. For this test

struct S
{
  S f;
};

void fn1 (S p1) {}

Structure S is considered as complete (has size 8 for some reason) at
fn1 gimplification. Thus even with complete type check I still hit
this field with error_node instead of a type and NULL at
DECL_FIELD_BIT_OFFSET. Should my current fix be OK then?

Thanks,
Ilya

Re: [RFC] Try vector<bool> as a new representation for vector masks

2015-09-25 Thread Ilya Enkovich

2015-09-23 16:53 GMT+03:00 Richard Biener :
> On Wed, Sep 23, 2015 at 3:41 PM, Ilya Enkovich  wrote:
>> 2015-09-18 16:40 GMT+03:00 Ilya Enkovich :
>>> 2015-09-18 15:22 GMT+03:00 Richard Biener :
>>>>
>>>> I was thinking about targets not supporting generating vec
>>>> (of whatever mode) from a comparison directly but only via
>>>> a COND_EXPR.
>>>
>>> Where may these direct comparisons come from? Vectorizer never
>>> generates unsupported statements. It means we get them from
>>> gimplifier? So touch optabs in gimplifier to avoid direct comparisons?
>>> Actually vect lowering checks if we are able to make comparison and
>>> expand also uses vec_cond to expand vector comparison, so probably we
>>> may live with them.
>>>
>>>>
>>>> Not sure if we are always talking about the same thing for
>>>> "bool patterns".  I'd remove bool patterns completely, IMHO
>>>> they are not necessary at all.
>>>
>>> I refer to transformations made by vect_recog_bool_pattern. Don't see
>>> how to remove them completely for targets not supporting comparison
>>> vectorization.
>>>
>>>>
>>>> I think we do allow this, just the vectorizer doesn't expect it.  In the 
>>>> long
>>>> run I want to get rid of the GENERIC exprs in both COND_EXPR and
>>>> VEC_COND_EXPR.  Just didn't have the time to do this...
>>>
>>> That would be nice. As a first step I'd like to support optabs for
>>> VEC_COND_EXPR directly using vec.
>>>
>>> Thanks,
>>> Ilya
>>>
>>>>
>>>> Richard.
>>
>> Hi Richard,
>>
>> Do you think we have enough confidence that the approach is working and that
>> we may start integrating it into trunk? What would the integration plan be then?
>
> I'm still worried about the vec vector size vs. element size
> issue (well, somewhat).

Yeah, I hit another problem related to element size in vec lowering.
It uses inner type sizes in expand_vector_piecewise, so bool vector
expansion goes the wrong way. There were also other places with similar
problems, and therefore I want to try to use bools of different sizes
and see how it goes. Also, having differently sized bools may be useful
to represent mask pack/unpack in scalar code.

>
> Otherwise the integration plan would be
>
>  1) put in the vector GIMPLE type support and change the vector
> comparison type IL requirement to be vector,
> fixing all fallout
>
>  2) get support for directly expanding vector comparisons to
> vector and make use of that from the x86 backend
>
>  3) make the vectorizer generate the above if supported
>
> I think independent improvements are
>
>  1) remove (most) of the bool patterns from the vectorizer
>
>  2) make VEC_COND_EXPR not have a GENERIC comparison embedded

Sounds great!

Ilya

>
> (same for COND_EXPR?)
>
> Richard.
>
>> Thanks,
>> Ilya

[PATCH, PR target/67761] Fix i686-- bootstrap comparison failure

2015-09-29 Thread Ilya Enkovich

Hi,

My recently introduced STV pass doesn't skip debug instructions, which makes
the transformation (mostly cost computation) depend on debug info.  This causes
a bootstrap comparison failure.  This patch fixes it.  Bootstrapped for
i686-linux.  Testing for x86_64-unknown-linux-gnu{,m32} is in progress.  OK for
trunk if it passes?
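
[Editor's sketch, not part of the original mail.] The principle behind the fix
is that a cost decision must come out the same with and without -g. A toy
model: each insn carries a gain, debug insns are flagged, and the gain
computation skips them. struct toy_insn and toy_chain_gain are hypothetical
and far simpler than scalar_chain; how debug insns actually skewed the cost
in the STV pass is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy insn: a debug flag plus the gain it would add to a chain.  */
struct toy_insn
{
  bool is_debug;
  int gain;
};

/* Sum the gain over a chain while ignoring debug insns, so that the
   result cannot depend on whether debug info was requested.  */
static int
toy_chain_gain (const struct toy_insn *insns, size_t n)
{
  size_t i;
  int gain = 0;

  for (i = 0; i < n; ++i)
    {
      if (insns[i].is_debug)
        continue;
      gain += insns[i].gain;
    }
  return gain;
}
```

Feeding the same chain with and without interleaved debug entries should give
the same gain, which is the property a bootstrap comparison relies on.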

Thanks,
Ilya
--
gcc/

2015-09-29  Ilya Enkovich  

* config/i386/i386.c (scalar_chain::analyze_register_chain): Ignore
debug insns.
(scalar_chain::convert_reg): Likewise.

gcc/testsuite/

2015-09-29  Ilya Enkovich  

* gcc.target/i386/pr67761.c: New test.


diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 6f2380f..7b3ffb0 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2919,6 +2919,10 @@ scalar_chain::analyze_register_chain (bitmap candidates, 
df_ref ref)
   for (chain = DF_REF_CHAIN (ref); chain; chain = chain->next)
 {
   unsigned uid = DF_REF_INSN_UID (chain->ref);
+
+  if (!NONDEBUG_INSN_P (DF_REF_INSN (chain->ref)))
+   continue;
+
   if (!DF_REF_REG_MEM_P (chain->ref))
{
  if (bitmap_bit_p (insns, uid))
@@ -3279,7 +3283,7 @@ scalar_chain::convert_reg (unsigned regno)
bitmap_clear_bit (conv, DF_REF_INSN_UID (ref));
  }
   }
-else
+else if (NONDEBUG_INSN_P (DF_REF_INSN (ref)))
   {
replace_rtx (DF_REF_INSN (ref), reg, scopy);
df_insn_rescan (DF_REF_INSN (ref));
diff --git a/gcc/testsuite/gcc.target/i386/pr67761.c 
b/gcc/testsuite/gcc.target/i386/pr67761.c
new file mode 100644
index 000..9b13d58
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr67761.c
@@ -0,0 +1,13 @@
+/* PR target/pr67761 */
+/* { dg-do run { target { ia32 } } } */
+/* { dg-options "-O2 -march=slm -g" } */
+/* { dg-final { scan-assembler "paddq" } } */
+
+void
+test (long long *values, long long val, long long delta)
+{
+  unsigned i;
+
+  for (i = 0; i < 128; i++, val += delta)
+values[i] = val;
+}

Re: [PATCH, PR target/67761] Fix i686-- bootstrap comparison failure

2015-09-30 Thread Ilya Enkovich

2015-09-30 9:06 GMT+03:00 Uros Bizjak :
> Hello!
>
>> My recently introduced STV pass doesn't skip debug instructions and it
>> causes transformation (mostly cost computation) depending on debug info.
>> It causes bootstrap comparison failure.  This patch fixes.  Bootstrapped
>> for i686-linux.  Testing for x86_64-unknown-linux-gnu{,m32} is in
>> progress.  OK for trunk if pass?
>
> IMO, it would be also beneficial to bootstrap with slm default
> architecture, so new code paths get some coverage via bootstrap.

I bootstrapped with --with-cpu=slm also.

>
>> gcc/
>>
>> 2015-09-29  Ilya Enkovich  
>>
>> * config/i386/i386.c (scalar_chain::analyze_register_chain): Ignore
>> debug insns.
>> (scalar_chain::convert_reg): Likewise.
>>
>> gcc/testsuite/
>>
>> 2015-09-29  Ilya Enkovich  
>>
>> * gcc.target/i386/pr67761.c: New test.
>
> OK.

Thanks!

Ilya

>
> Thanks,
> Uros.

Re: [PATCH, testsuite]: Fix gcc.target/i386/pr65105-1.c test

2015-10-01 Thread Ilya Enkovich

2015-10-01 13:12 GMT+03:00 Uros Bizjak :
> Hello!
>
> Attached patch fixes gcc.target/i386/pr65105-1.c:

Thanks!
Ilya

>
> a) As a runtime SSE2 test, we have to check for target SSE2 support
> and use proper test infrastructure.
>
> b) A runtime test can't check output assembly without -save-temps.
>
> The patch also removes another misuse of -save-temps in gcc.target/i386 directory.
>
> The patch solves:
>
> UNRESOLVED: gcc.target/i386/pr65105-1.c scan-assembler por
> UNRESOLVED: gcc.target/i386/pr65105-1.c scan-assembler pand
>
> 2015-10-01  Uros Bizjak  
>
> * gcc.target/i386/pr65105-1.c: Require sse2 effective target.
> (main): Rename to sse2_test.  Abort if count != 5.
> (dg-options): Add -save-temps.  Use "-msse2 -mtune=slm" instead
> of -march=slm.
> * gcc.target/i386/pr46865-2.c (dg-options): Remove -save-temps.
>
> Tested on x86_64-linux-gnu {,-m32} and committed to mainline SVN.
>
> Uros.

[Boolean Vector, patch 1/5] Introduce boolean vector to be used as a vector comparison type

2015-10-02 Thread Ilya Enkovich

Hi,

This patch starts the first series to introduce vec<bool> as a vector
comparison type.  This series introduces the new vec<bool> type and forces its
usage for all vector comparisons.  This series doesn't introduce any new
vectorization features.  I split it into five small patches but will commit
them in a single chunk.  The patch series was bootstrapped and tested on
x86_64-unknown-linux-gnu.

The first patch introduces a target hook and functions to produce the new
vector type.

Thanks,
Ilya
--
2015-10-02  Ilya Enkovich  

* doc/tm.texi: Regenerated.
* doc/tm.texi.in (TARGET_VECTORIZE_GET_MASK_MODE): New.
* stor-layout.c (layout_type): Use mode to get vector mask size.
* target.def (get_mask_mode): New.
* targhooks.c (default_get_mask_mode): New.
* targhooks.h (default_get_mask_mode): New.
* gcc/tree-vect-stmts.c (get_same_sized_vectype): Add special case
for boolean vector.
* tree.c (MAX_BOOL_CACHED_PREC): New.
(nonstandard_boolean_type_cache): New.
(build_nonstandard_boolean_type): New.
(make_vector_type): Vector mask has no canonical type.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(truth_type_for): Support vector masks.
* tree.h (VECTOR_BOOLEAN_TYPE_P): New.
(build_truth_vector_type): New.
(build_same_sized_truth_vector_type): New.
(build_nonstandard_boolean_type): New.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index eb495a8..098213e 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5688,6 +5688,11 @@ mode returned by 
@code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
 The default is zero which means to not iterate over other vector sizes.
 @end deftypefn
 
+@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_GET_MASK_MODE (unsigned 
@var{nunits}, unsigned @var{length})
+This hook returns mode to be used for a mask to be used for a vector
+of specified @var{length} with @var{nunits} elements.
+@end deftypefn
+
 @deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (struct loop 
*@var{loop_info})
 This hook should initialize target-specific data structures in preparation for 
modeling the costs of vectorizing a loop or basic block.  The default allocates 
three unsigned integers for accumulating costs for the prologue, body, and 
epilogue of the loop or basic block.  If @var{loop_info} is non-NULL, it 
identifies the loop being vectorized; otherwise a single block is being 
vectorized.
 @end deftypefn
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 92835c1..92cfa1d 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4225,6 +4225,8 @@ address;  but often a machine-dependent strategy can 
generate better code.
 
 @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
 
+@hook TARGET_VECTORIZE_GET_MASK_MODE
+
 @hook TARGET_VECTORIZE_INIT_COST
 
 @hook TARGET_VECTORIZE_ADD_STMT_COST
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 938e54b..58ecd7b 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2184,10 +2184,16 @@ layout_type (tree type)
 
TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
 TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
-   TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
-TYPE_SIZE_UNIT (innertype),
-size_int (nunits));
-   TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
+   /* Several boolean vector elements may fit in a single unit.  */
+   if (VECTOR_BOOLEAN_TYPE_P (type))
+ TYPE_SIZE_UNIT (type)
+   = size_int (GET_MODE_SIZE (type->type_common.mode));
+   else
+ TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
+  TYPE_SIZE_UNIT (innertype),
+  size_int (nunits));
+   TYPE_SIZE (type) = int_const_binop (MULT_EXPR,
+   TYPE_SIZE (innertype),
bitsize_int (nunits));
 
/* For vector types, we do not default to the mode's alignment.
diff --git a/gcc/target.def b/gcc/target.def
index f330709..b96fd51 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1789,6 +1789,15 @@ The default is zero which means to not iterate over 
other vector sizes.",
  (void),
  default_autovectorize_vector_sizes)
 
+/* Function to get a target mode for a vector mask.  */
+DEFHOOK
+(get_mask_mode,
+ "This hook returns mode to be used for a mask to be used for a vector\n\
+of specified @var{length} with @var{nunits} elements.",
+ machine_mode,
+ (unsigned nunits, unsigned length),
+ default_get_mask_mode)
+
 /* Target builtin that implements vector gather operation.  */
 DEFHOOK
 (builtin_gather,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 7238c8f..ac01d57 100644
---

[Boolean Vector, patch 2/5] Change vector comparison IL requirement

2015-10-02 Thread Ilya Enkovich

Hi,

This patch changes vector comparison to require a boolean vector result type.

Thanks,
Ilya
--
gcc/

2015-10-02  Ilya Enkovich  

* tree-cfg.c (verify_gimple_comparison): Require boolean
vector type for vector comparison.
(verify_gimple_assign_ternary): Likewise.


diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 807d96f..c3dcced 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3464,10 +3464,10 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
   return true;
 }
 }
-  /* Or an integer vector type with the same size and element count
+  /* Or a boolean vector type with the same element count
  as the comparison operand types.  */
   else if (TREE_CODE (type) == VECTOR_TYPE
-  && TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
+  && TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)
 {
   if (TREE_CODE (op0_type) != VECTOR_TYPE
  || TREE_CODE (op1_type) != VECTOR_TYPE)
@@ -3478,12 +3478,7 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
   return true;
 }
 
-  if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
- || (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type)))
- != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op0_type))))
- /* The result of a vector comparison is of signed
-integral type.  */
- || TYPE_UNSIGNED (TREE_TYPE (type)))
+  if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type))
 {
   error ("invalid vector comparison resulting type");
   debug_generic_expr (type);
@@ -3970,15 +3965,13 @@ verify_gimple_assign_ternary (gassign *stmt)
   break;
 
 case VEC_COND_EXPR:
-  if (!VECTOR_INTEGER_TYPE_P (rhs1_type)
- || TYPE_SIGN (rhs1_type) != SIGNED
- || TYPE_SIZE (rhs1_type) != TYPE_SIZE (lhs_type)
+  if (!VECTOR_BOOLEAN_TYPE_P (rhs1_type)
  || TYPE_VECTOR_SUBPARTS (rhs1_type)
 != TYPE_VECTOR_SUBPARTS (lhs_type))
{
- error ("the first argument of a VEC_COND_EXPR must be of a signed "
-"integral vector type of the same size and number of "
-"elements as the result");
+ error ("the first argument of a VEC_COND_EXPR must be of a "
+"boolean vector type of the same number of elements "
+"as the result");
  debug_generic_expr (lhs_type);
  debug_generic_expr (rhs1_type);
  return true;

[Boolean Vector, patch 3/5] Use boolean vector in C/C++ FE

2015-10-02 Thread Ilya Enkovich

Hi,

This patch makes the C/C++ front ends use a boolean vector as the result type 
of a vector comparison.  As a result, a vector comparison in source code is now 
parsed into a VEC_COND_EXPR, which required a testcase fix-up.
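
For readers less familiar with the source-level semantics, the following is a small sketch using the GNU C vector extension. It is an illustration, not part of the patch: less_than and less_than_select are hypothetical names, and the second function only spells out per lane the "comparison ? -1 : 0" shape that the message describes.

```c
#include <assert.h>

/* GNU C vector extension: four 32-bit signed integers.  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise a < b; each lane of the result is -1 (true) or 0 (false).  */
v4si
less_than (v4si a, v4si b)
{
  return a < b;
}

/* The same value written as an explicit per-lane select between
   -1 and 0, controlled by the comparison.  */
v4si
less_than_select (v4si a, v4si b)
{
  v4si r;
  for (int i = 0; i < 4; i++)
    r[i] = a[i] < b[i] ? -1 : 0;
  return r;
}
```

Both functions return the same lanes for any input, so user-visible behaviour is unchanged; only the internal representation of the comparison differs.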

Thanks,
Ilya
--
gcc/c

2015-10-02  Ilya Enkovich  

* c-typeck.c (build_conditional_expr): Use boolean vector
type for vector comparison.
(build_vec_cmp): New.
(build_binary_op): Use build_vec_cmp for comparison.

gcc/cp

2015-10-02  Ilya Enkovich  

* call.c (build_conditional_expr_1): Use boolean vector
type for vector comparison.
* typeck.c (build_vec_cmp): New.
(cp_build_binary_op): Use build_vec_cmp for comparison.

gcc/testsuite/

2015-10-02  Ilya Enkovich  

* g++.dg/ext/vector22.C: Allow VEC_COND_EXPR.


diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index 3b26231..3f64d76 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -4771,6 +4771,18 @@ build_conditional_expr (location_t colon_loc, tree 
ifexp, bool ifexp_bcp,
   && TREE_CODE (orig_op2) == INTEGER_CST
   && !TREE_OVERFLOW (orig_op2)));
 }
+
+  /* Need to convert condition operand into a vector mask.  */
+  if (VECTOR_TYPE_P (TREE_TYPE (ifexp)))
+{
+  tree vectype = TREE_TYPE (ifexp);
+  tree elem_type = TREE_TYPE (vectype);
+  tree zero = build_int_cst (elem_type, 0);
+  tree zero_vec = build_vector_from_val (vectype, zero);
+  tree cmp_type = build_same_sized_truth_vector_type (vectype);
+  ifexp = build2 (NE_EXPR, cmp_type, ifexp, zero_vec);
+}
+
   if (int_const || (ifexp_bcp && TREE_CODE (ifexp) == INTEGER_CST))
 ret = fold_build3_loc (colon_loc, COND_EXPR, result_type, ifexp, op1, op2);
   else
@@ -10220,6 +10232,19 @@ push_cleanup (tree decl, tree cleanup, bool eh_only)
   STATEMENT_LIST_STMT_EXPR (list) = stmt_expr;
 }
 
+/* Build a vector comparison using VEC_COND_EXPR.  */
+
+static tree
+build_vec_cmp (tree_code code, tree type,
+  tree arg0, tree arg1)
+{
+  tree zero_vec = build_zero_cst (type);
+  tree minus_one_vec = build_minus_one_cst (type);
+  tree cmp_type = build_same_sized_truth_vector_type (type);
+  tree cmp = build2 (code, cmp_type, arg0, arg1);
+  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
+}
+
 /* Build a binary-operation expression without default conversions.
CODE is the kind of expression to build.
LOCATION is the operator's location.
@@ -10786,7 +10811,8 @@ build_binary_op (location_t location, enum tree_code 
code,
   result_type = build_opaque_vector_type (intt,
  TYPE_VECTOR_SUBPARTS (type0));
   converted = 1;
-  break;
+ ret = build_vec_cmp (resultcode, result_type, op0, op1);
+  goto return_build_binary_op;
 }
   if (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1))
warning_at (location,
@@ -10938,7 +10964,8 @@ build_binary_op (location_t location, enum tree_code 
code,
   result_type = build_opaque_vector_type (intt,
  TYPE_VECTOR_SUBPARTS (type0));
   converted = 1;
-  break;
+ ret = build_vec_cmp (resultcode, result_type, op0, op1);
+  goto return_build_binary_op;
 }
   build_type = integer_type_node;
   if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 367d42b..0488b82 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4615,6 +4615,15 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
tree arg2, tree arg3,
 
   if (VECTOR_INTEGER_TYPE_P (TREE_TYPE (arg1)))
 {
+  /* If arg1 is another cond_expr choosing between -1 and 0,
+then we can use its comparison.  It may help to avoid
+additional comparison, produce more accurate diagnostics
+and enables folding.  */
+  if (TREE_CODE (arg1) == VEC_COND_EXPR
+ && integer_minus_onep (TREE_OPERAND (arg1, 1))
+ && integer_zerop (TREE_OPERAND (arg1, 2)))
+   arg1 = TREE_OPERAND (arg1, 0);
+
   arg1 = force_rvalue (arg1, complain);
   arg2 = force_rvalue (arg2, complain);
   arg3 = force_rvalue (arg3, complain);
@@ -4727,8 +4736,10 @@ build_conditional_expr_1 (location_t loc, tree arg1, 
tree arg2, tree arg3,
}
 
   if (!COMPARISON_CLASS_P (arg1))
-   arg1 = cp_build_binary_op (loc, NE_EXPR, arg1,
-  build_zero_cst (arg1_type), complain);
+   {
+ tree cmp_type = build_same_sized_truth_vector_type (arg1_type);
+ arg1 = build2 (NE_EXPR, cmp_type, arg1, build_zero_cst (arg1_type));
+   }
   return fold_build3 (VEC_COND_EXPR, arg2_type, arg1, arg2, arg3);
 }
 
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 482e42c..96b1683 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -3900,6 +3900,19 @@ build_binary_op (lo

[Boolean Vector, patch 4/5] Use boolean vectors in VEC_COND_EXPR

2015-10-02 Thread Ilya Enkovich

Hi,

This patch forces the use of boolean vectors in VEC_COND_EXPRs generated by 
the vectorizer.  Expansion of VEC_COND_EXPR is adjusted accordingly.

Thanks,
Ilya
--
gcc/

2015-10-02  Ilya Enkovich  

* optabs.c (expand_vec_cond_expr): Accept boolean vector as
condition operand.
* tree-vect-stmts.c (vectorizable_condition): Use boolean
vector type for vector comparison.


diff --git a/gcc/optabs.c b/gcc/optabs.c
index c49d66b..8d9d742 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5365,16 +5365,17 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, 
tree op1, tree op2,
   op0a = TREE_OPERAND (op0, 0);
   op0b = TREE_OPERAND (op0, 1);
   tcode = TREE_CODE (op0);
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
 }
   else
 {
   /* Fake op0 < 0.  */
-  gcc_assert (!TYPE_UNSIGNED (TREE_TYPE (op0)));
+  gcc_assert (VECTOR_BOOLEAN_TYPE_P (TREE_TYPE (op0)));
   op0a = op0;
   op0b = build_zero_cst (TREE_TYPE (op0));
   tcode = LT_EXPR;
+  unsignedp = false;
 }
-  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
 
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 2ff2827..e93f5ef 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7384,10 +7384,7 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   && TREE_CODE (else_clause) != FIXED_CST)
 return false;
 
-  unsigned int prec = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (vectype)));
-  /* The result of a vector comparison should be signed type.  */
-  tree cmp_type = build_nonstandard_integer_type (prec, 0);
-  vec_cmp_type = get_same_sized_vectype (cmp_type, vectype);
+  vec_cmp_type = build_same_sized_truth_vector_type (comp_vectype);
   if (vec_cmp_type == NULL_TREE)
 return false;

[Boolean Vector, patch 5/5] Support boolean vectors in vector lowering

2015-10-02 Thread Ilya Enkovich

Hi,

This patch supports boolean vectors in vector lowering.  The main change is to 
lower a vector comparison into scalar comparisons, not cond_exprs.
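
The difference between the two lowering shapes can be sketched outside the compiler as follows. This is a simplified model over plain arrays, not the tree-vect-generic.c code; lower_lt_as_cond and lower_lt_as_cmp are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Old shape: each lane is a select producing an integer -1 or 0.  */
void
lower_lt_as_cond (const int *a, const int *b, int *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (a[i] < b[i]) ? -1 : 0;
}

/* New shape: each lane is a plain comparison producing a boolean.  */
void
lower_lt_as_cmp (const int *a, const int *b, bool *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = a[i] < b[i];
}
```

The two results carry the same information per lane; the second simply keeps it in a boolean element instead of an integer the same width as the operands.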

Thanks,
Ilya
--
2015-10-02  Ilya Enkovich  

* tree-vect-generic.c (elem_op_func): Add new operand to hold
vector type.
(do_unop): Adjust to modified function type.
(do_binop): Likewise.
(do_plus_minus): Likewise.
(do_negate): Likewise.
(expand_vector_piecewise): Likewise.
(do_cond): Likewise.
(do_compare): Use comparison instead of condition.
(expand_vector_divmod): Use boolean vector type for comparison.
(expand_vector_operations_1): Skip scalar mask operations.


diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index dad38a2..a20b9af 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -105,14 +105,27 @@ build_word_mode_vector_type (int nunits)
 }
 
 typedef tree (*elem_op_func) (gimple_stmt_iterator *,
- tree, tree, tree, tree, tree, enum tree_code);
+ tree, tree, tree, tree, tree, enum tree_code,
+ tree);
 
 static inline tree
 tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
  tree t, tree bitsize, tree bitpos)
 {
   if (bitpos)
-return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+{
+  if (TREE_CODE (type) == BOOLEAN_TYPE)
+   {
+ tree itype
+   = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0);
+ tree field = gimplify_build3 (gsi, BIT_FIELD_REF, itype, t,
+   bitsize, bitpos);
+ return gimplify_build2 (gsi, NE_EXPR, type, field,
+ build_zero_cst (itype));
+   }
+  else
+   return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+}
   else
 return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t);
 }
@@ -120,7 +133,7 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
 static tree
 do_unop (gimple_stmt_iterator *gsi, tree inner_type, tree a,
 tree b ATTRIBUTE_UNUSED, tree bitpos, tree bitsize,
-enum tree_code code)
+enum tree_code code, tree type ATTRIBUTE_UNUSED)
 {
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   return gimplify_build1 (gsi, code, inner_type, a);
@@ -128,7 +141,8 @@ do_unop (gimple_stmt_iterator *gsi, tree inner_type, tree a,
 
 static tree
 do_binop (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
- tree bitpos, tree bitsize, enum tree_code code)
+ tree bitpos, tree bitsize, enum tree_code code,
+ tree type ATTRIBUTE_UNUSED)
 {
   if (TREE_CODE (TREE_TYPE (a)) == VECTOR_TYPE)
 a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
@@ -145,20 +159,12 @@ do_binop (gimple_stmt_iterator *gsi, tree inner_type, 
tree a, tree b,
size equal to the size of INNER_TYPE.  */
 static tree
 do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
- tree bitpos, tree bitsize, enum tree_code code)
+   tree bitpos, tree bitsize, enum tree_code code, tree type)
 {
-  tree comp_type;
-
   a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
   b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
 
-  comp_type = build_nonstandard_integer_type
- (GET_MODE_BITSIZE (TYPE_MODE (inner_type)), 0);
-
-  return gimplify_build3 (gsi, COND_EXPR, comp_type,
- fold_build2 (code, boolean_type_node, a, b),
- build_int_cst (comp_type, -1),
- build_int_cst (comp_type, 0));
+  return gimplify_build2 (gsi, code, TREE_TYPE (type), a, b);
 }
 
 /* Expand vector addition to scalars.  This does bit twiddling
@@ -177,7 +183,7 @@ do_compare (gimple_stmt_iterator *gsi, tree inner_type, 
tree a, tree b,
 static tree
 do_plus_minus (gimple_stmt_iterator *gsi, tree word_type, tree a, tree b,
   tree bitpos ATTRIBUTE_UNUSED, tree bitsize ATTRIBUTE_UNUSED,
-  enum tree_code code)
+  enum tree_code code, tree type ATTRIBUTE_UNUSED)
 {
   tree inner_type = TREE_TYPE (TREE_TYPE (a));
   unsigned HOST_WIDE_INT max;
@@ -209,7 +215,8 @@ static tree
 do_negate (gimple_stmt_iterator *gsi, tree word_type, tree b,
   tree unused ATTRIBUTE_UNUSED, tree bitpos ATTRIBUTE_UNUSED,
   tree bitsize ATTRIBUTE_UNUSED,
-  enum tree_code code ATTRIBUTE_UNUSED)
+  enum tree_code code ATTRIBUTE_UNUSED,
+  tree type ATTRIBUTE_UNUSED)
 {
   tree inner_type = TREE_TYPE (TREE_TYPE (b));
   HOST_WIDE_INT max;
@@ -255,7 +262,7 @@ expand_vector_piecewise (gimple_stmt_iterator *gsi, 
elem_op_func f,
   for (i = 0; i < nunits;
i += delta, index = int_const_binop (PLUS_EXPR, index, part_width))
 {
-  tree result = f (gsi, inner_type, a, b, index, part_width, code);
+  tree result = f (

[PATCH] Fix ICE for SIMD clones usage in LTO

2015-10-05 Thread Ilya Enkovich

Hi,

When a SIMD clone is created, the original function may be defined in another 
partition.  In this case the SIMD clone must also have the in_other_partition 
flag set.  Currently it doesn't, and we get an ICE.  This patch fixes it.  
Bootstrapped and regtested for x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2015-10-05  Ilya Enkovich  

* omp-low.c (simd_clone_create): Set in_other_partition
for created clones.

gcc/testsuite/

2015-10-05  Ilya Enkovich  

* gcc.dg/lto/simd-function_0.c: New test.


diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index cdcf9d6..8d25784 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -12948,6 +12948,8 @@ simd_clone_create (struct cgraph_node *old_node)
   DECL_STATIC_CONSTRUCTOR (new_decl) = 0;
   DECL_STATIC_DESTRUCTOR (new_decl) = 0;
   new_node = old_node->create_version_clone (new_decl, vNULL, NULL);
+  if (old_node->in_other_partition)
+   new_node->in_other_partition = 1;
   symtab->call_cgraph_insertion_hooks (new_node);
 }
   if (new_node == NULL)
diff --git a/gcc/testsuite/gcc.dg/lto/simd-function_0.c 
b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
new file mode 100755
index 000..cda31aa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/simd-function_0.c
@@ -0,0 +1,34 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target avx2 } */
+/* { dg-lto-options { { -fopenmp-simd -O3 -ffast-math -mavx2 -flto 
-flto-partition=max } } } */
+
+#define SIZE 4096
+float x[SIZE];
+
+
+#pragma omp declare simd
+float
+__attribute__ ((noinline))
+my_mul (float x, float y) {
+  return x * y;
+}
+
+__attribute__ ((noinline))
+int foo ()
+{
+  int i = 0;
+#pragma omp simd safelen (16)
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float)i, 9932.3323);
+  return (int)x[0];
+}
+
+int main ()
+{
+  int i = 0;
+  for (i = 0; i < SIZE; i++)
+x[i] = my_mul ((float) i, 9932.3323);
+  foo ();
+  return (int)x[0];
+}
+

[vec-cmp, patch 1/6] Add optabs for vector comparison

2015-10-08 Thread Ilya Enkovich

Hi,

This series introduces autogeneration of vector comparisons and support for 
them on the i386 target.  It lets comparison statements be vectorized into 
vector comparisons instead of VEC_COND_EXPRs.  This avoids some restrictions 
implied by boolean patterns.  This series applies on top of the boolean vectors 
series [1].

This patch introduces optabs for vector comparison.

[1] https://gcc.gnu.org/ml/gcc-patches/2015-10/msg00215.html

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  

* expr.c (do_store_flag): Use expand_vec_cmp_expr for mask results.
* optabs-query.h (get_vec_cmp_icode): New.
* optabs-tree.c (expand_vec_cmp_expr_p): New.
* optabs-tree.h (expand_vec_cmp_expr_p): New.
* optabs.c (vector_compare_rtx): Add OPNO arg.
(expand_vec_cond_expr): Adjust to vector_compare_rtx change.
(expand_vec_cmp_expr): New.
* optabs.def (vec_cmp_optab): New.
(vec_cmpu_optab): New.
* optabs.h (expand_vec_cmp_expr): New.
* tree-vect-generic.c (expand_vector_comparison): Add vector
comparison optabs check.


diff --git a/gcc/expr.c b/gcc/expr.c
index 0bbfccd..88da8cb 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11025,9 +11025,15 @@ do_store_flag (sepops ops, rtx target, machine_mode 
mode)
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
 {
   tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
-  tree if_true = constant_boolean_node (true, ops->type);
-  tree if_false = constant_boolean_node (false, ops->type);
-  return expand_vec_cond_expr (ops->type, ifexp, if_true, if_false, 
target);
+  if (VECTOR_BOOLEAN_TYPE_P (ops->type))
+   return expand_vec_cmp_expr (ops->type, ifexp, target);
+  else
+   {
+ tree if_true = constant_boolean_node (true, ops->type);
+ tree if_false = constant_boolean_node (false, ops->type);
+ return expand_vec_cond_expr (ops->type, ifexp, if_true,
+  if_false, target);
+   }
 }
 
   /* Get the rtx comparison code to use.  We know that EXP is a comparison
diff --git a/gcc/optabs-query.h b/gcc/optabs-query.h
index 73f2729..81ac362 100644
--- a/gcc/optabs-query.h
+++ b/gcc/optabs-query.h
@@ -74,6 +74,16 @@ trapv_binoptab_p (optab binoptab)
  || binoptab == smulv_optab);
 }
 
+/* Return insn code for a comparison operator with VMODE
+   resulting in MASK_MODE, unsigned if UNS is true.  */
+
+static inline enum insn_code
+get_vec_cmp_icode (machine_mode vmode, machine_mode mask_mode, bool uns)
+{
+  optab tab = uns ? vec_cmpu_optab : vec_cmp_optab;
+  return convert_optab_handler (tab, vmode, mask_mode);
+}
+
 /* Return insn code for a conditional operator with a comparison in
mode CMODE, unsigned if UNS is true, resulting in a value of mode VMODE.  */
 
diff --git a/gcc/optabs-tree.c b/gcc/optabs-tree.c
index 3b03338..aa863cf 100644
--- a/gcc/optabs-tree.c
+++ b/gcc/optabs-tree.c
@@ -320,6 +320,19 @@ supportable_convert_operation (enum tree_code code,
   return false;
 }
 
+/* Return TRUE if appropriate vector insn is available
+   for vector comparison expr with vector type VALUE_TYPE
+   and resulting mask with MASK_TYPE.  */
+
+bool
+expand_vec_cmp_expr_p (tree value_type, tree mask_type)
+{
+  enum insn_code icode = get_vec_cmp_icode (TYPE_MODE (value_type),
+   TYPE_MODE (mask_type),
+   TYPE_UNSIGNED (value_type));
+  return (icode != CODE_FOR_nothing);
+}
+
 /* Return TRUE iff, appropriate vector insns are available
for vector cond expr with vector type VALUE_TYPE and a comparison
with operand vector types in CMP_OP_TYPE.  */
diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
index bf6c9e3..5b966ca 100644
--- a/gcc/optabs-tree.h
+++ b/gcc/optabs-tree.h
@@ -39,6 +39,7 @@ optab optab_for_tree_code (enum tree_code, const_tree, enum 
optab_subtype);
 optab scalar_reduc_to_vector (optab, const_tree);
 bool supportable_convert_operation (enum tree_code, tree, tree, tree *,
enum tree_code *);
+bool expand_vec_cmp_expr_p (tree, tree);
 bool expand_vec_cond_expr_p (tree, tree);
 void init_tree_optimization_optabs (tree);
 
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 8d9d742..ca1a6e7 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5100,11 +5100,13 @@ get_rtx_code (enum tree_code tcode, bool unsignedp)
 }
 
 /* Return comparison rtx for COND. Use UNSIGNEDP to select signed or
-   unsigned operators. Do not generate compare instruction.  */
+   unsigned operators.  OPNO holds an index of the first comparison
+   operand in insn with code ICODE.  Do not generate compare instruction.  */
 
 static rtx
 vector_compare_rtx (enum tree_code tcode, tree t_op0, tree t_op1,
-   bool unsignedp, enum insn_code icode)
+   bool unsignedp, enum insn_code icode,
+   unsigned int opno)
 {
   str

[vec-cmp, patch 2/6] Vectorization factor computation

2015-10-08 Thread Ilya Enkovich

Hi,

This patch handles statements with a boolean result in vectorization factor 
computation.  For a comparison, the type of its operands is used instead of the 
result type to compute the VF.  Other boolean statements are ignored for VF.

The vectype for a comparison is computed using the type of the compared values.  
The computed type is propagated into other boolean operations.
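
The rule can be sketched with a simplified model. This is not the vectorizer's code: struct stmt_desc and vectorization_factor are hypothetical, and the model assumes a single vector size per loop, with the factor being the lane count of the smallest participating scalar type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified statement record; not a GCC data structure.  */
struct stmt_desc
{
  bool bool_result;       /* Statement produces a boolean value.  */
  bool is_comparison;     /* Statement compares non-boolean values.  */
  unsigned result_bytes;  /* Size of the result's scalar type.  */
  unsigned operand_bytes; /* Size of the compared operands' scalar type.  */
};

/* Return the vectorization factor for vectors of VECTOR_BYTES bytes,
   or 0 if no statement participates.  */
unsigned
vectorization_factor (const struct stmt_desc *stmts, size_t n,
                      unsigned vector_bytes)
{
  unsigned vf = 0;
  for (size_t i = 0; i < n; i++)
    {
      unsigned bytes;
      if (!stmts[i].bool_result)
        bytes = stmts[i].result_bytes;
      else if (stmts[i].is_comparison)
        bytes = stmts[i].operand_bytes; /* Use the compared type.  */
      else
        continue; /* Other boolean statements are ignored.  */
      assert (bytes != 0 && vector_bytes % bytes == 0);
      unsigned nunits = vector_bytes / bytes;
      if (nunits > vf)
        vf = nunits;
    }
  return vf;
}
```

With 16-byte vectors, an int addition, a comparison of two shorts and a boolean AND give a factor of 8: the comparison contributes through its 2-byte operands and the AND is skipped.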

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  

* tree-vect-loop.c (vect_determine_vectorization_factor):  Ignore mask
operations for VF.  Add mask type computation.
* tree-vect-stmts.c (get_mask_type_for_scalar_type): New.
* tree-vectorizer.h (get_mask_type_for_scalar_type): New.


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 63e29aa..c7e8067 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -183,19 +183,21 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
+  unsigned nbbs = loop->num_nodes;
   unsigned int vectorization_factor = 0;
   tree scalar_type;
   gphi *phi;
   tree vectype;
   unsigned int nunits;
   stmt_vec_info stmt_info;
-  int i;
+  unsigned i;
   HOST_WIDE_INT dummy;
   gimple *stmt, *pattern_stmt = NULL;
   gimple_seq pattern_def_seq = NULL;
   gimple_stmt_iterator pattern_def_si = gsi_none ();
   bool analyze_pattern_stmt = false;
+  bool bool_result;
+  auto_vec<stmt_vec_info> mask_producers;
 
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location,
@@ -414,6 +416,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  return false;
}
 
+ bool_result = false;
+
  if (STMT_VINFO_VECTYPE (stmt_info))
{
  /* The only case when a vectype had been already set is for stmts
@@ -434,6 +438,32 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
  else
scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
+
+ /* Bool ops don't participate in vectorization factor
+computation.  For comparison use compared types to
+compute a factor.  */
+ if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
+   {
+ mask_producers.safe_push (stmt_info);
+ bool_result = true;
+
+ if (gimple_code (stmt) == GIMPLE_ASSIGN
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+== tcc_comparison
+ && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+!= BOOLEAN_TYPE)
+   scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
+ else
+   {
+ if (!analyze_pattern_stmt && gsi_end_p (pattern_def_si))
+   {
+ pattern_def_seq = NULL;
+ gsi_next (&si);
+   }
+ continue;
+   }
+   }
+
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -456,7 +486,8 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  return false;
}
 
- STMT_VINFO_VECTYPE (stmt_info) = vectype;
+ if (!bool_result)
+   STMT_VINFO_VECTYPE (stmt_info) = vectype;
 
  if (dump_enabled_p ())
{
@@ -469,8 +500,9 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
  /* The vectorization factor is according to the smallest
 scalar type (or the largest vector size, but we only
 support one vector size per loop).  */
- scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
-  &dummy);
+ if (!bool_result)
+   scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
+&dummy);
  if (dump_enabled_p ())
{
  dump_printf_loc (MSG_NOTE, vect_location,
@@ -545,6 +577,100 @@ vect_determine_vectorization_factor (loop_vec_info 
loop_vinfo)
 }
   LOOP_VINFO_VECT_FACTOR (loop_vinfo) = vectorization_factor;
 
+  for (i = 0; i < mask_producers.length (); i++)
+{
+  tree mask_type = NULL;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (mask_producers[i]);
+
+  stmt = STMT_VINFO_STMT (mask_producers[i]);
+
+  if (gimple_code (stmt) == GIMPLE_ASSIGN
+ && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison
+ && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt))) != BOOLEAN_TYPE)
+   {
+ scalar_type = TREE_TYPE (gimple_assign_rhs1

[vec-cmp, patch 3/6] Vectorize comparison

2015-10-08 Thread Ilya Enkovich

Hi,

This patch supports vectorization of comparison statements based on the newly 
introduced optabs.

Thanks,
Ilya
--
gcc/

2015-10-08  Ilya Enkovich  

* tree-vect-data-refs.c (vect_get_new_vect_var): Support vect_mask_var.
(vect_create_destination_var): Likewise.
* tree-vect-stmts.c (vectorizable_comparison): New.
(vect_analyze_stmt): Add vectorizable_comparison.
(vect_transform_stmt): Likewise.
* tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
(enum stmt_vec_info_type): Add comparison_vec_info_type.
(vectorizable_comparison): New.


diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index 3befa38..9edc663 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3849,6 +3849,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind 
var_kind, const char *name)
   case vect_scalar_var:
 prefix = "stmp";
 break;
+  case vect_mask_var:
+prefix = "mask";
+break;
   case vect_pointer_var:
 prefix = "vectp";
 break;
@@ -4403,7 +4406,11 @@ vect_create_destination_var (tree scalar_dest, tree 
vectype)
   tree type;
   enum vect_var_kind kind;
 
-  kind = vectype ? vect_simple_var : vect_scalar_var;
+  kind = vectype
+? VECTOR_BOOLEAN_TYPE_P (vectype)
+? vect_mask_var
+: vect_simple_var
+: vect_scalar_var;
   type = vectype ? vectype : TREE_TYPE (scalar_dest);
 
   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 8eda8e9..6949c71 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -7525,6 +7525,211 @@ vectorizable_condition (gimple *stmt, 
gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* vectorizable_comparison.
+
+   Check if STMT is comparison expression that can be vectorized.
+   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
+   comparison, put it in VEC_STMT, and insert it at GSI.
+
+   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
+
+bool
+vectorizable_comparison (gimple *stmt, gimple_stmt_iterator *gsi,
+gimple **vec_stmt, tree reduc_def,
+slp_tree slp_node)
+{
+  tree lhs, rhs1, rhs2;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype1 = NULL_TREE, vectype2 = NULL_TREE;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
+  tree vec_compare;
+  tree new_temp;
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  tree def;
+  enum vect_def_type dt, dts[4];
+  unsigned nunits;
+  int ncopies;
+  enum tree_code code;
+  stmt_vec_info prev_stmt_info = NULL;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  vec<tree> vec_oprnds0 = vNULL;
+  vec<tree> vec_oprnds1 = vNULL;
+  tree mask_type;
+  tree mask;
+
+  if (!VECTOR_BOOLEAN_TYPE_P (vectype))
+return false;
+
+  mask_type = vectype;
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+ncopies = 1;
+  else
+ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+
+  gcc_assert (ncopies >= 1);
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
+  && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
+  && reduc_def))
+return false;
+
+  if (STMT_VINFO_LIVE_P (stmt_info))
+{
+  if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"value used after loop.\n");
+  return false;
+}
+
+  if (!is_gimple_assign (stmt))
+return false;
+
+  code = gimple_assign_rhs_code (stmt);
+
+  if (TREE_CODE_CLASS (code) != tcc_comparison)
+return false;
+
+  rhs1 = gimple_assign_rhs1 (stmt);
+  rhs2 = gimple_assign_rhs2 (stmt);
+
+  if (TREE_CODE (rhs1) == SSA_NAME)
+{
+  gimple *rhs1_def_stmt = SSA_NAME_DEF_STMT (rhs1);
+  if (!vect_is_simple_use_1 (rhs1, stmt, loop_vinfo, bb_vinfo,
+&rhs1_def_stmt, &def, &dt, &vectype1))
+   return false;
+}
+  else if (TREE_CODE (rhs1) != INTEGER_CST && TREE_CODE (rhs1) != REAL_CST
+  && TREE_CODE (rhs1) != FIXED_CST)
+return false;
+
+  if (TREE_CODE (rhs2) == SSA_NAME)
+{
+  gimple *rhs2_def_stmt = SSA_NAME_DEF_STMT (rhs2);
+  if (!vect_is_simple_use_1 (rhs2, stmt, loop_vinfo, bb_vinfo,
+&rhs2_def_stmt, &def, &dt, &vectype2))
+   return false;
+}
+  else if (TREE_CODE (rhs2) != INTEGER_CST && TREE_CODE (rhs2) != REAL_CST
+  && TREE_CODE (rhs2) != FIXED_CST)
+return false;
+
+  if (vectype1 && vectype2
+  && TYPE_VECTOR_SUBPARTS (vectype1) != TYPE_VECTOR_SUBPARTS (vectype2))
+return false;
+
+  vectype = vectype1 ? vectype1 : vectype
