> On Wed, Jun 14, 2017 at 1:14 PM, Prachi Godbole
> wrote:
> > I'm developing a solution to optimize away intermediate stores (loads) for
> static local variables which are assigned to before referenced on every path
> through a function.
> >
> > Currently GCC eliminates all loads/stores in a straight line code but not in
> control structures. AFAIU, passes like sccvn, lim (store motion in particular)
> sink etc. try to propagate the assignment exprs through SSA vars instead of
> through memory thereby eliminating almost all loads and some stores but
> still not all.
> >
> > For example:
> >
> > void some_function()
> > {
> > static some_type sum;
> > ...
> > for( i = 0 ; i < n ; i++ )
> > {
> > sum = matrix [i][n] ;
> >
> > for( j = 0 ; j < i ; j++ )
> > {
> > sum -= matrix [i][j] * matrix [j][n] ;
> > }
> >
> > matrix [i][n] = sum ;
> > }
> > ...
> > }
> >
> > Resulting SSA:
> >
> > ...
> >[2.25%]:
> > _10 = matrix[0][n_13(D)];
> > sum = _10;
> > goto ; [100.00%]
> > ...
> >[12.75%]:
> > _1 = matrix[i_16][n_13(D)];
> > if (i_16 > 0)
> > goto ; [100.00%]
> > else
> > goto ; [0.00%]
> > ...
> >[85.00%]:
> > # j_24 = PHI
> > # sum_lsm.4_27 = PHI <_6(6), _1(4)>
> > _3 = matrix[i_16][j_24];
> > _4 = matrix[j_24][n_13(D)];
> > _5 = _3 * _4;
> > _6 = sum_lsm.4_27 - _5;
> > j_18 = j_24 + 1;
> > if (i_16 > j_18)
> > goto ; [85.00%]
> > else
> > goto ; [15.00%]
> > ...
> >[12.75%]:
> > # _40 = PHI <_6(6)>
> > goto ; [100.00%]
> >
> >[12.75%]:
> > # sum_lsm.4_19 = PHI <_40(7), _1(4)>
> >
> >[15.00%]:
> > # i_20 = PHI
> > # sum_lsm.4_8 = PHI
> > # sum_lsm.5_22 = PHI <1(9), 0(14)>
> > matrix[i_20][n_13(D)] = sum_lsm.4_8;
> > i_16 = i_20 + 1;
> > if (n_13(D) > i_16)
> > goto ; [85.00%]
> > else
> > goto ; [15.00%]
> >
> >[2.25%]:
> > # sum_lsm.4_39 = PHI
> > # sum_lsm.5_38 = PHI
> > if (sum_lsm.5_38 != 0)
> > goto ; [0.00%]
> > else
> > goto ; [100.00%]
> >
> >[0.00%]:
> > sum = sum_lsm.4_39;
> > ...
> >
> > Although all intermediate loads are eliminated here, store to sum before
> and after the 'j' loop (see bb 14 and bb 12) still remain which can be
> eliminated.
> >
> > My solution would reuse a lot of data structures and routines from the sink
> pass. So my question is, can I add this optimization into the sink pass
> itself or
> would it require a new pass and if yes, what would be the ideal place for it.
> > Any other pointers are more than welcome.
>
> So the idea is that for any store to such variable that is dominated by a
> kill the
> function exit doesn't count as a use, right?
> (unless the variable escapes the scope of the function, for example via a
> nested function -- which might be the case that is very difficult to detect).
>
Yes that is the general idea. I had thought of initially being conservative and
not optimizing at all when the variable could potentially escape the scope.
> I'm not sure what part of the sink pass is suitable for this but as sinking
> walks
> post dominators backwards which also DSE does it would fit DSE as well, no?
> You'd pre-compute blocks containing kills (or do two backwards passes) and
> then special case GIMPLE_RETURN in dse_classify_store where previously
> ref_maybe_used_by_stmt_p covers those.
>
I was looking for a one time solution rather than doing it as and when the
opportunities present themselves.
But doing it in DSE is more advantageous, I can see that. So I'll go ahead with
DSE.
> nested fn case:
>
> void foo (void)
> {
> static int i = 0;
> int __attribute__((noinline)) bar () { return i; }
> int j = bar ();
> i = 1;
> }
>
> I think we don't mark 'i' TREE_ADDRESSABLE (escaped) here so we simply
> treat 'i' as a global variable accessible from other scopes.
>
> Richard.
>
> > Thanks,
> > Prachi