On 10/03/14 02:50, Ilya Enkovich wrote:
Attached is an updated version of the patch.  It has disabled instrumenttation 
for builtin calls.

Thanks,
Ilya
--
gcc/

2014-10-02  Ilya Enkovich<ilya.enkov...@intel.com>

        * tree-chkp.c: New.
        * tree-chkp.h: New.
        * rtl-chkp.c: New.
        * rtl-chkp.h: New.
        * Makefile.in (OBJS): Add tree-chkp.o, rtl-chkp.o.
        (GTFILES): Add tree-chkp.c.
        * c-family/c.opt (fchkp-check-incomplete-type): New.
        (fchkp-zero-input-bounds-for-main): New.
        (fchkp-first-field-has-own-bounds): New.
        (fchkp-narrow-bounds): New.
        (fchkp-narrow-to-innermost-array): New.
        (fchkp-optimize): New.
        (fchkp-use-fast-string-functions): New.
        (fchkp-use-nochk-string-functions): New.
        (fchkp-use-static-bounds): New.
        (fchkp-use-static-const-bounds): New.
        (fchkp-treat-zero-dynamic-size-as-infinite): New.
        (fchkp-check-read): New.
        (fchkp-check-write): New.
        (fchkp-store-bounds): New.
        (fchkp-instrument-calls): New.
        (fchkp-instrument-marked-only): New.
        * cppbuiltin.c (define_builtin_macros_for_compilation_flags): Add
        __CHKP__ macro when Pointer Bounds Checker is on.
        * passes.def (pass_ipa_chkp_versioning): New.
        (pass_early_local_passes): Removed.
        (pass_build_ssa_passes): New.
        (pass_fixup_cfg): Moved to pass_chkp_instrumentation_passes.
        (pass_chkp_instrumentation_passes): New.
        (pass_ipa_chkp_produce_thunks): New.
        (pass_local_optimization_passes): New.
        (pass_chkp_opt): New.
        * toplev.c: include tree-chkp.h.
        (compile_file): Add chkp_finish_file call.
        * tree-pass.h (make_pass_ipa_chkp_versioning): New.
        (make_pass_ipa_chkp_produce_thunks): New.
        (make_pass_chkp): New.
        (make_pass_chkp_opt): New.
        (make_pass_early_local_passes): Removed.
        (make_pass_build_ssa_passes): New.
        (make_pass_chkp_instrumentation_passes): New.
        (make_pass_local_optimization_passes): New.
        * tree.h (called_as_built_in): New.
        * builtins.c (called_as_built_in): Not static anymore.
        * passes.c (pass_manager::execute_early_local_passes): Execute
        early passes in three steps.
        (execute_all_early_local_passes): Removed.
        (pass_data_early_local_passes): Removed.
        (pass_early_local_passes): Removed.
        (execute_build_ssa_passes): New.
        (pass_data_build_ssa_passes): New.
        (pass_build_ssa_passes): New.
        (pass_data_chkp_instrumentation_passes): New.
        (pass_chkp_instrumentation_passes): New.
        (pass_data_local_optimization_passes): New.
        (pass_local_optimization_passes): New.
        (make_pass_early_local_passes): Removed.
        (make_pass_build_ssa_passes): New.
        (make_pass_chkp_instrumentation_passes): New.
        (make_pass_local_optimization_passes): New.

gcc/testsuite

2014-10-02  Ilya Enkovich<ilya.enkov...@intel.com>

        * gcc.dg/pr37858.c: Replace early_local_cleanups pass name
        with build_ssa_passes.
General question. At the RTL level you represent the bounds with an RTX which is perfectly reasonable. What are the structure sharing assumptions of those values? Do they follow the existing RTL structure sharing assumptions?

Minor nit 2014 in the copyright year for all these files ;-)

So, for example if there are two references to the same bounds in RTL, are they distinct RTXs with the same underlying values? Or is it a single rtx object that is shared? It looks like you generally create new RTXs, but I'm a bit concerned that you might shove those things into a hash table and return them and embed a single reference into multiple hunks of parent RTL.








mpx-9-pass.patch


diff --git a/gcc/builtins.c b/gcc/builtins.c
index 17754e5..78ac91f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -255,7 +255,7 @@ is_builtin_fn (tree decl)
     of the optimization level.  This means whenever a function is invoked with
     its "internal" name, which normally contains the prefix "__builtin".  */

-static bool
+bool
  called_as_built_in (tree node)
  {
    /* Note that we must use DECL_NAME, not DECL_ASSEMBLER_NAME_SET_P since
Is there some reason you put the new prototype in tree.h rather than builtins.h?

+void
+chkp_emit_bounds_store (rtx bounds, rtx value, rtx mem)
+{
[ ... ]
+         else
+           ptr = gen_rtx_SUBREG (Pmode, value, INTVAL (offs));
I'm a bit concerned about the SUBREG use here. Are you really trying to look at a different part of a REG expression here? ISTM at the least you want to use one of the simplify routines to collapse this down in cases where it's useful to do so.

I see a fair amount of "tree" things in rtl-chkp.c. We're trying to untangle trees from the rest of the compiler. Can you look at see if any of that stuff could be rewritten to avoid a tree interface? chkp_copy_bounds_for_stack_parm comes to mind here.

chkp_expand_bounds_for_reset_mem really looks like it belongs elsewhere. It operates entirely on trees. tree-chkp.c perhaps?





diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
[ ... ]

+
+    Instrumentation clones have pointer bounds arguments added rigth after
s/rigth/right/


+
+    d) Calls.
+
+    For each call in the code we should add additional arguments to pass
s/should//


+
+    3. Bounds computation.
+
+    Compiler is fully responsible for computing bounds to be used for each
+    memory access.  The first step for bounds computation is to find the
+    origin of pointer dereferenced for memory access.  Basing on pointer
+    origin we define a way to compute its bounds.  There are just few
+    possible cases:
+
+    a) Pointer is returned by call.
+
+    In this case we use corresponding checker builtin method to obtain returned
+    bounds.
+
+    Example:
+
+      buf_1 = malloc (size_2);
+      foo (buf_1);
+
+      is translated into:
+
+      buf_1 = malloc (size_2);
+      __bound_tmp.1_3 = __builtin___chkp_bndret (buf_1);
+      foo (buf_1, __bound_tmp.1_3);
Q. Have you checked that nested return values work correctly?  ie


+
+    b) Pointer is an address of an object.
+
+    In this case compiler tries to compute objects size and create 
corresponding
+    bounds.  If object has incomplete type then special checker builtin is 
used to
+    obtain its size at runtime.
So what happens if we have a pointer outside the object, but in the memory reference we add a value so that the final effective address is inside the object?

I continue to try and stamp out that kind of pointer manipulation because it doesn't work on some architectures. However, I believe it still occurs.



+
+
+    d) Pointer is the result of pointer arithmetic or type cast.
+
+    In this case bounds of the base pointer are used.
And you can always get to the base?!?


+
+static vec<struct bb_checks, va_heap, vl_ptr> check_infos = vNULL;
+
+static GTY (()) tree chkp_uintptr_type;
+
+static GTY (()) tree chkp_zero_bounds_var;
+static GTY (()) tree chkp_none_bounds_var;
+
+static GTY (()) basic_block entry_block;
+static GTY (()) tree zero_bounds;
+static GTY (()) tree none_bounds;
+static GTY (()) tree incomplete_bounds;
+static GTY (()) tree tmp_var;
+static GTY (()) tree size_tmp_var;
+static GTY (()) bitmap chkp_abnormal_copies;
+
+struct hash_set<tree> *chkp_invalid_bounds;
+struct hash_set<tree> *chkp_completed_bounds_set;
+struct hash_map<tree, tree> *chkp_reg_bounds;
+struct hash_map<tree, tree> *chkp_bound_vars;
+struct hash_map<tree, tree> *chkp_reg_addr_bounds;
+struct hash_map<tree, tree> *chkp_incomplete_bounds_map;
+struct hash_map<tree, tree> *chkp_bounds_map;
+struct hash_map<tree, tree> *chkp_static_var_bounds;
+
+static bool in_chkp_pass;
That's a whole lot of static variables. Do some of those belong elsewhere? Perhaps in cfun?


+
+/* Static checker constructors may become very large and their
+   compilation with optimization may take too much time.
+   Therefore we put a limit to number of statements in one
+   construcor.  Tests with 100 000 statically initialized
+   pointers showed following compilation times on Sandy Bridge
+   server (used -O2):
+   limit    100 => ~18 sec.
+   limit    300 => ~22 sec.
+   limit   1000 => ~30 sec.
+   limit   3000 => ~49 sec.
+   limit   5000 => ~55 sec.
+   limit  10000 => ~76 sec.
+   limit 100000 => ~532 sec.  */
+#define MAX_STMTS_IN_STATIC_CHKP_CTOR (optimize > 0 ? 5000 : 100000)
Well, it seems to be growing at a reasonable rate. ie, you've got 1000 times more statements in the last case when compared to the first. But it's only ~30 times slower. So I'm not sure this is really necessary or helpful. If you feel it needs to be kept, then use a --param rather than magic numbers.


+
+
+/* Fix operands of attribute from ATTRS list named ATTR_NAME.
+   Integer operands are replaced with values according to
+   INDEXES map having LEN elements.  For operands out of len
+   we just add DELTA.  */
+
+static void
+chkp_map_attr_arg_indexes (tree attrs, const char *attr_name,
+                          unsigned *indexes, int len, int delta)
+{
+  tree attr = lookup_attribute (attr_name, attrs);
+  tree op;
+
+  if (!attr)
+    return;
+
+  TREE_VALUE (attr) = copy_list (TREE_VALUE (attr));
+  for (op = TREE_VALUE (attr); op; op = TREE_CHAIN (op))
+    {
+      int idx;
+
+      if (TREE_CODE (TREE_VALUE (op)) != INTEGER_CST)
+       continue;
+
+      idx = TREE_INT_CST_LOW (TREE_VALUE (op));
+
+      /* If idx exceeds indexes length then we just
+        keep it at the same distance from the last
+        known arg.  */
+      if (idx > len)
+       idx += delta;
+      else
+       idx = indexes[idx - 1] + 1;
+      TREE_VALUE (op) = build_int_cst (TREE_TYPE (TREE_VALUE (op)), idx);
+    }
+}
+

+
+/* For given function NODE add bounds arguments to arguments
+   list.  */
+static void
+chkp_add_bounds_params_to_function (tree fndecl)
+{
+  tree arg;
+
+  for (arg = DECL_ARGUMENTS (fndecl); arg; arg = DECL_CHAIN (arg))
+    if (BOUNDED_P (arg))
+      {
+       std::string new_name = CHKP_BOUNDS_OF_SYMBOL_PREFIX;
+       if (DECL_NAME (arg))
+         new_name += IDENTIFIER_POINTER (DECL_NAME (arg));
+       else
+         {
+           char uid[10];
+           snprintf (uid, 10, "D.%u", DECL_UID (arg));
+           new_name += uid;
+         }
10 digits really enough here? Why not go ahead and bullet-proof this code so that we don't ever run the chance of a buffer overflow.

+
+  /* Mark instrumentation clones created fo aliases and thunks
s/fo/of/

It feels like you've got a lot of stuff in tree-chkp. Please consider separating it a bit. There's some basic helpers, some expansion related stuff, argument twiddling and much more. It feels like it's where you dumped everything remotely related to checking that touched trees rather than thinking about where each function naturally fit. The size also makes review hard.

Can you discuss all the PHI node handling for incomplete bounds in this file? What problem are you trying to solve with this code?

Without looking at it closely, if we need to keep it, you may find it works better as a worklist rather than traversing the entire map. Or does this code remove objects from the map when complete (I scanned, but didn't immediately see them getting removed). It also feels like this code should probably break out of this file and be put elsewhere.

In fact, looking at things, I'm less than 5% through this patchkit. Clearly this patchkit needs to be broken down into more manageable hunks for review purposes.

Please address the issues noted above, then find logical ways to partition this patch into smaller pieces.

Jeff


Reply via email to