2014-10-03 23:59 GMT+04:00 Jeff Law <l...@redhat.com>:
> On 10/03/14 02:50, Ilya Enkovich wrote:
>>
>> Attached is an updated version of the patch.  It has disabled
>> instrumenttation for builtin calls.
>>
>> Thanks,
>> Ilya
>> --
>> gcc/
>>
>> 2014-10-02  Ilya Enkovich<ilya.enkov...@intel.com>
>>
>>         * tree-chkp.c: New.
>>         * tree-chkp.h: New.
>>         * rtl-chkp.c: New.
>>         * rtl-chkp.h: New.
>>         * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
>>         (GTFILES): Add tree-chkp.c.
>>         * c-family/c.opt (fchkp-check-incomplete-type): New.
>>         (fchkp-zero-input-bounds-for-main): New.
>>         (fchkp-first-field-has-own-bounds): New.
>>         (fchkp-narrow-bounds): New.
>>         (fchkp-narrow-to-innermost-array): New.
>>         (fchkp-optimize): New.
>>         (fchkp-use-fast-string-functions): New.
>>         (fchkp-use-nochk-string-functions): New.
>>         (fchkp-use-static-bounds): New.
>>         (fchkp-use-static-const-bounds): New.
>>         (fchkp-treat-zero-dynamic-size-as-infinite): New.
>>         (fchkp-check-read): New.
>>         (fchkp-check-write): New.
>>         (fchkp-store-bounds): New.
>>         (fchkp-instrument-calls): New.
>>         (fchkp-instrument-marked-only): New.
>>         * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
>>         __CHKP__ macro when Pointer Bounds Checker is on.
>>         * passes.def (pass_ipa_chkp_versioning): New.
>>         (pass_early_local_passes): Removed.
>>         (pass_build_ssa_passes): New.
>>         (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
>>         (pass_chkp_instrumentation_passes): New.
>>         (pass_ipa_chkp_produce_thunks): New.
>>         (pass_local_optimization_passes): New.
>>         (pass_chkp_opt): New.
>>         * toplev.c: include tree-chkp.h.
>>         (compile_file): Add chkp_finish_file call.
>>         * tree-pass.h (make_pass_ipa_chkp_versioning): New.
>>         (make_pass_ipa_chkp_produce_thunks): New.
>>         (make_pass_chkp): New.
>>         (make_pass_chkp_opt): New.
>>         (make_pass_early_local_passes): Removed.
>>         (make_pass_build_ssa_passes): New.
>>         (make_pass_chkp_instrumentation_passes): New.
>>         (make_pass_local_optimization_passes): New.
>>         * tree.h (called_as_built_in): New.
>>         * builtins.c (called_as_built_in): Not static anymore.
>>         * passes.c (pass_manager::execute_early_local_passes): Execute
>>         early passes in three steps.
>>         (execute_all_early_local_passes): Removed.
>>         (pass_data_early_local_passes): Removed.
>>         (pass_early_local_passes): Removed.
>>         (execute_build_ssa_passes): New.
>>         (pass_data_build_ssa_passes): New.
>>         (pass_build_ssa_passes): New.
>>         (pass_data_chkp_instrumentation_passes): New.
>>         (pass_chkp_instrumentation_passes): New.
>>         (pass_data_local_optimization_passes): New.
>>         (pass_local_optimization_passes): New.
>>         (make_pass_early_local_passes): Removed.
>>         (make_pass_build_ssa_passes): New.
>>         (make_pass_chkp_instrumentation_passes): New.
>>         (make_pass_local_optimization_passes): New.
>>
>> gcc/testsuite
>>
>> 2014-10-02  Ilya Enkovich<ilya.enkov...@intel.com>
>>
>>         * gcc.dg/pr37858.c: Replace early_local_cleanups pass name
>>         with build_ssa_passes.
>
> General question.  At the RTL level you represent the bounds with an RTX
> which is perfectly reasonable.  What are the structure sharing assumptions
> of those values?  Do they follow the existing RTL structure sharing
> assumptions?
>
> Minor nit 2014 in the copyright year for all these files ;-)
>
> So, for example if there are two references to the same bounds in RTL, are
> they distinct RTXs with the same underlying values?  Or is it a single rtx
> object that is shared?  It looks like you generally create new RTXs, but I'm
> a bit concerned that you might shove those things into a hash table and
> return them and embed a single reference into multiple hunks of parent RTL.

For expander bounds are quite regular vars and SSA names which are
expanded as all other values and therefore I believe regular sharing
assumptions are followed.

Hash tables are used just to link pointer values returned by call with
returned bounds.  It is required to expand retbnd calls.  Similarly
returned bounds are associated with DECL_RESULT using
SET_DECL_BOUNDS_RTL.

>
>
>
>
>
>
>>
>>
>> mpx-9-pass.patch
>>
>>
>> diff --git a/gcc/builtins.c b/gcc/builtins.c
>> index 17754e5..78ac91f 100644
>> --- a/gcc/builtins.c
>> +++ b/gcc/builtins.c
>> @@ -255,7 +255,7 @@ is_builtin_fn (tree decl)
>>      of the optimization level.  This means whenever a function is invoked
>> with
>>      its "internal" name, which normally contains the prefix "__builtin".
>> */
>>
>> -static bool
>> +bool
>>   called_as_built_in (tree node)
>>   {
>>     /* Note that we must use DECL_NAME, not DECL_ASSEMBLER_NAME_SET_P
>> since
>
> Is there some reason you put the new prototype in tree.h rather than
> builtins.h?

It was made at time when everything was in tree.h :)
Since current version doesn't work with builtins, there is no need in
this change at all.  I remove it for now.

>
>> +void
>> +chkp_emit_bounds_store (rtx bounds, rtx value, rtx mem)
>> +{
>
> [ ... ]
>>
>> +         else
>> +           ptr = gen_rtx_SUBREG (Pmode, value, INTVAL (offs));
>
> I'm a bit concerned about the SUBREG use here.  Are you really trying to
> look at a different part of a REG expression here?   ISTM at the least you
> want to use one of the simplify routines to collapse this down in cases
> where it's useful to do so.

This is for case when a small structure is represented as a long
register and SUBREG here accesses structure field.

>
> I see a fair amount of "tree" things in rtl-chkp.c.  We're trying to
> untangle trees from the rest of the compiler.  Can you look at see if any of
> that stuff could be rewritten to avoid a tree interface?
> chkp_copy_bounds_for_stack_parm comes to mind here.
>

Expand codes inevitably use both rtx and tree interfaces.  Don't see
it can be separated further.

> chkp_expand_bounds_for_reset_mem really looks like it belongs elsewhere.  It
> operates entirely on trees.  tree-chkp.c perhaps?

OK

>
>
>
>
>
>> diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
>
> [ ... ]
>
>> +
>> +    Instrumentation clones have pointer bounds arguments added rigth
>> after
>
> s/rigth/right/
>
>
>> +
>> +    d) Calls.
>> +
>> +    For each call in the code we should add additional arguments to pass
>
> s/should//
>
>
>> +
>> +    3. Bounds computation.
>> +
>> +    Compiler is fully responsible for computing bounds to be used for
>> each
>> +    memory access.  The first step for bounds computation is to find the
>> +    origin of pointer dereferenced for memory access.  Basing on pointer
>> +    origin we define a way to compute its bounds.  There are just few
>> +    possible cases:
>> +
>> +    a) Pointer is returned by call.
>> +
>> +    In this case we use corresponding checker builtin method to obtain
>> returned
>> +    bounds.
>> +
>> +    Example:
>> +
>> +      buf_1 = malloc (size_2);
>> +      foo (buf_1);
>> +
>> +      is translated into:
>> +
>> +      buf_1 = malloc (size_2);
>> +      __bound_tmp.1_3 = __builtin___chkp_bndret (buf_1);
>> +      foo (buf_1, __bound_tmp.1_3);
>
> Q. Have you checked that nested return values work correctly?  ie

What is a nested return value and when does it appear?

>
>
>> +
>> +    b) Pointer is an address of an object.
>> +
>> +    In this case compiler tries to compute objects size and create
>> corresponding
>> +    bounds.  If object has incomplete type then special checker builtin
>> is used to
>> +    obtain its size at runtime.
>
> So what happens if we have a pointer outside the object, but in the memory
> reference we add a value so that the final effective address is inside the
> object?
>
> I continue to try and stamp out that kind of pointer manipulation because it
> doesn't work on some architectures.  However, I believe it still occurs.

For bounds it matters only how pointer is produced and what value it
has before de-reference.  Pointer may have any intermediate values.
So it would be OK to have: p1 = &buf + 100000; return p1[-100000];

>
>
>
>> +
>> +
>> +    d) Pointer is the result of pointer arithmetic or type cast.
>> +
>> +    In this case bounds of the base pointer are used.
>
> And you can always get to the base?!?

We can always follow DF searching for a valid base.  This is the
reason we instrument code before optimization which makes base search
impossible.  Default bounds are used in case we cannot find valid
pointer source.

>
>
>> +
>> +static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL;
>> +
>> +static GTY (()) tree chkp_uintptr_type;
>> +
>> +static GTY (()) tree chkp_zero_bounds_var;
>> +static GTY (()) tree chkp_none_bounds_var;
>> +
>> +static GTY (()) basic_block entry_block;
>> +static GTY (()) tree zero_bounds;
>> +static GTY (()) tree none_bounds;
>> +static GTY (()) tree incomplete_bounds;
>> +static GTY (()) tree tmp_var;
>> +static GTY (()) tree size_tmp_var;
>> +static GTY (()) bitmap chkp_abnormal_copies;
>> +
>> +struct hash_set<tree> *chkp_invalid_bounds;
>> +struct hash_set<tree> *chkp_completed_bounds_set;
>> +struct hash_map<tree, tree> *chkp_reg_bounds;
>> +struct hash_map<tree, tree> *chkp_bound_vars;
>> +struct hash_map<tree, tree> *chkp_reg_addr_bounds;
>> +struct hash_map<tree, tree> *chkp_incomplete_bounds_map;
>> +struct hash_map<tree, tree> *chkp_bounds_map;
>> +struct hash_map<tree, tree> *chkp_static_var_bounds;
>> +
>> +static bool in_chkp_pass;
>
> That's a whole lot of static variables.  Do some of those belong elsewhere?
> Perhaps in cfun?

This data is pass local and therefore shouldn't belong anything like
function structure.

>
>
>> +
>> +/* Static checker constructors may become very large and their
>> +   compilation with optimization may take too much time.
>> +   Therefore we put a limit to number of statements in one
>> +   construcor.  Tests with 100 000 statically initialized
>> +   pointers showed following compilation times on Sandy Bridge
>> +   server (used -O2):
>> +   limit    100 => ~18 sec.
>> +   limit    300 => ~22 sec.
>> +   limit   1000 => ~30 sec.
>> +   limit   3000 => ~49 sec.
>> +   limit   5000 => ~55 sec.
>> +   limit  10000 => ~76 sec.
>> +   limit 100000 => ~532 sec.  */
>> +#define MAX_STMTS_IN_STATIC_CHKP_CTOR (optimize > 0 ? 5000 : 100000)
>
> Well, it seems to be growing at a reasonable rate.  ie, you've got 1000
> times more statements in the last case when compared to the first.  But it's
> only ~30 times slower.  So I'm not sure this is really necessary or helpful.
> If you feel it needs to be kept, then use a --param rather than magic
> numbers.

These numbers are for the same test and thus the same amount of
statements in total.  The difference is in how statements are split
into separate functions.

>
>
>> +
>> +
>> +/* Fix operands of attribute from ATTRS list named ATTR_NAME.
>> +   Integer operands are replaced with values according to
>> +   INDEXES map having LEN elements.  For operands out of len
>> +   we just add DELTA.  */
>> +
>> +static void
>> +chkp_map_attr_arg_indexes (tree attrs, const char *attr_name,
>> +                          unsigned *indexes, int len, int delta)
>> +{
>> +  tree attr = lookup_attribute (attr_name, attrs);
>> +  tree op;
>> +
>> +  if (!attr)
>> +    return;
>> +
>> +  TREE_VALUE (attr) = copy_list (TREE_VALUE (attr));
>> +  for (op = TREE_VALUE (attr); op; op = TREE_CHAIN (op))
>> +    {
>> +      int idx;
>> +
>> +      if (TREE_CODE (TREE_VALUE (op)) != INTEGER_CST)
>> +       continue;
>> +
>> +      idx = TREE_INT_CST_LOW (TREE_VALUE (op));
>> +
>> +      /* If idx exceeds indexes length then we just
>> +        keep it at the same distance from the last
>> +        known arg.  */
>> +      if (idx > len)
>> +       idx += delta;
>> +      else
>> +       idx = indexes[idx - 1] + 1;
>> +      TREE_VALUE (op) = build_int_cst (TREE_TYPE (TREE_VALUE (op)), idx);
>> +    }
>> +}
>> +
>
>
>> +
>> +/* For given function NODE add bounds arguments to arguments
>> +   list.  */
>> +static void
>> +chkp_add_bounds_params_to_function (tree fndecl)
>> +{
>> +  tree arg;
>> +
>> +  for (arg = DECL_ARGUMENTS (fndecl); arg; arg = DECL_CHAIN (arg))
>> +    if (BOUNDED_P (arg))
>> +      {
>> +       std::string new_name = CHKP_BOUNDS_OF_SYMBOL_PREFIX;
>> +       if (DECL_NAME (arg))
>> +         new_name += IDENTIFIER_POINTER (DECL_NAME (arg));
>> +       else
>> +         {
>> +           char uid[10];
>> +           snprintf (uid, 10, "D.%u", DECL_UID (arg));
>> +           new_name += uid;
>> +         }
>
> 10 digits really enough here?  Why not go ahead and bullet-proof this code
> so that we don't ever run the chance  of a buffer overflow.

snprintf will never fail. Will change it to 25 to have enough for MAX_ULONG.

>
>> +
>> +  /* Mark instrumentation clones created fo aliases and thunks
>
> s/fo/of/
>
> It feels like you've got a lot of stuff in tree-chkp.  Please consider
> separating it a bit.  There's some basic helpers, some expansion related
> stuff, argument twiddling and much more.  It feels like it's where you
> dumped everything remotely related to checking that touched trees rather
> than thinking about where each function naturally fit.  The size also makes
> review hard.

I think it would be reasonable to put IPA passes into a separate file.
Don't see much reason to split tree stuff (and actually was asked last
year to put it into a single file).  I'll try to split it into few
patches (within a single file) to make it simpler.

>
> Can you discuss all the PHI node handling for incomplete bounds in this
> file?   What problem are you trying to solve with this code?

Incomplete bounds are used to walk through cyclic DF dependencies.  I
added more comments for pointers arithmetic handling.

>
> Without looking at it closely, if we need to keep it, you may find it works
> better as a worklist rather than traversing the entire map.  Or does this
> code remove objects from the map when complete (I scanned, but didn't
> immediately see them getting removed).  It also feels like this code should
> probably break out of this file and be put elsewhere.

This is a part of a bounds computation algorithm and shouldn't be put
apart.  We keep phi bounds incomplete until all its args are computed.
Map fits better because for given bounds we should have an ability to
check if bounds are completed.  Also a pointer is associated with each
incomplete bounds to perform bounds re-computation.

>
> In fact, looking at things, I'm less than 5% through this patchkit. Clearly
> this patchkit needs to be broken down into more manageable hunks for review
> purposes.
>
> Please address the issues noted above, then find logical ways to partition
> this patch into smaller pieces.

I fixed issues you mentioned and will send a result as a new series.

Thanks,
Ilya

>
> Jeff
>
>

Reply via email to