On Mon, Jul 22, 2019 at 10:54 AM Akshat Garg <xks...@gmail.com> wrote:

> On Mon, Jul 22, 2019 at 2:11 PM Richard Biener <richard.guent...@gmail.com>
> wrote:
>
>> On Mon, Jul 22, 2019 at 2:27 AM Akshat Garg <xks...@gmail.com> wrote:
>>
>>> Hi all,
>>> Consider part of an example(figure 20) from doc P0190R4(
>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
>>> shown below:
>>>
>>> 1.  void thread1 (void)
>>> 2.  {
>>> 3.    int * volatile p;
>>> 4.    p = rcu_dereference(gip);
>>> 5.    if (p)
>>> 6.    assert(*(p+p[0]) == 42);
>>> 7.  }
>>> The .gimple code produced is :
>>>
>>> 1.  thread1 ()
>>> 2.  {
>>> 3.   atomic int * D.1992;
>>> 4.    int * volatile p;
>>> 5.  {
>>> 6.    atomic int * * __atomic_load_ptr;
>>> 7.   atomic int * __atomic_load_tmp;
>>> 8.    try
>>> 9.     {
>>> 10.        __atomic_load_ptr = &gip;
>>> 11.        _1 = __atomic_load_8 (__atomic_load_ptr, 1);
>>> 12.        _2 = (atomic int *) _1;
>>> 13.        __atomic_load_tmp = _2;
>>> 14.        D.1992 = __atomic_load_tmp;
>>> 15.     }
>>> 16.    finally
>>> 17.      {
>>> 18.        __atomic_load_tmp = {CLOBBER};
>>> 19.      }
>>> 20.  }
>>> 21.   p = D.1992;
>>> 22.   p.2_3 = p;
>>> 23.  if (p.2_3 != 0B) goto <D.1994>; else goto <D.1995>;
>>> 24.  <D.1994>:
>>> 25.   p.3_4 = p;
>>> 26.  p.4_5 = p;
>>> 27.   _6 = *p.4_5;
>>> 28.  _7 = (long unsigned int) _6;
>>> 29.  _8 = _7 * 4;
>>> 30.  _9 = p.3_4 + _8;
>>> 31.  _10 = *_9;
>>> 32.  _11 = _10 == 42;
>>> 33.  _12 = (int) _11;
>>> 34.  assert (_12);
>>> 35.  <D.1995>:
>>> 36. }
>>>
>>> assert at line 34 in .gimple code still breaks the dependency given by
>>> the user. I believe, there should be some ssa defined variable of p or p
>>> itself in assert. This is happening when I am considering pointer as
>>> volatile qualified. If I consider it as _Dependent_ptr qualified then it
>>> surely produces the broken dependency code. Let me know, if I wrong
>>> somewhere.
>>>
>>>
>> p appears as memory here which we load its value to p.3_4 which we then
>> offset by _8 and load from the
>> computed address into _10 which then appears in the assert condition.  I
>> think that's as good as it can
>> get ...
>>
>> Richard.
>>
>
> Thank you for your reply. For, the same example above, consider this (
> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L402)
> instruction at rtl level changed form this (
> https://github.com/AKG001/rtl_opt/blob/master/p0190r4_fig20.c.239r.cse1#L231)
> during the cse pass. The variable p.2_3 gets replaced by a temporary _1 but
> _1 is not any dependent pointer where, p.2_3 was. Is this also not breaking
> any dependencies
>

I'm not sure.  In general CSE can break dependences.  If the dependent
pointer chain needs to conver multiple levels of
indirections from the original atomic operation you need to make sure to
not expose atomics as CSEable.  Thus on
RTL have them all UNSPECs.

Richard.


> -Akshat
>>>
>>>
>>>
>>>
>>> On Wed, Jul 17, 2019 at 4:23 PM Akshat Garg <xks...@gmail.com> wrote:
>>>
>>>> On Tue, Jul 2, 2019 at 9:06 PM Jason Merrill <ja...@redhat.com> wrote:
>>>>
>>>>> On Mon, Jul 1, 2019 at 8:59 PM Paul E. McKenney <paul...@linux.ibm.com>
>>>>> wrote:
>>>>> >
>>>>> > On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>>>>> > > On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg <xks...@gmail.com>
>>>>> wrote:
>>>>> > >
>>>>> > > > On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan <
>>>>> > > > ramana....@googlemail.com> wrote:
>>>>> > > >
>>>>> > > >> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg <xks...@gmail.com>
>>>>> wrote:
>>>>> > > >> >
>>>>> > > >> > As we have some working front-end code for _Dependent_ptr,
>>>>> What should
>>>>> > > >> we do next? What I understand, we can start adding the library
>>>>> for
>>>>> > > >> dependent_ptr and its functions for C corresponding to the ones
>>>>> we created
>>>>> > > >> as C++ template library. Then, after that, we can move on to
>>>>> generating the
>>>>> > > >> assembly code part.
>>>>> > > >> >
>>>>> > > >>
>>>>> > > >>
>>>>> > > >> I think the next step is figuring out how to model the Dependent
>>>>> > > >> pointer information in the IR and figuring out what
>>>>> optimizations to
>>>>> > > >> allow or not with that information. At this point , I suspect
>>>>> we need
>>>>> > > >> a plan on record and have the conversation upstream on the
>>>>> lists.
>>>>> > > >>
>>>>> > > >> I think we need to put down a plan on record.
>>>>> > > >>
>>>>> > > >> Ramana
>>>>> > > >
>>>>> > > > [CCing gcc mailing list]
>>>>> > > >
>>>>> > > > So, shall I start looking over the pointer optimizations only
>>>>> and see what
>>>>> > > > information we may be needed on the same examples in the IR
>>>>> itself?
>>>>> > > >
>>>>> > > > - Akshat
>>>>> > > >
>>>>> > > I have coded an example where equality comparison kills dependency
>>>>> from the
>>>>> > > document P0190R4 as shown below :
>>>>> > >
>>>>> > > 1. struct rcutest rt = {1, 2, 3};
>>>>> > > 2. void thread0 ()
>>>>> > > 3. {
>>>>> > > 4.     rt.a = -42;
>>>>> > > 5.     rt.b = -43;
>>>>> > > 6.     rt.c = -44;
>>>>> > > 7.     rcu_assign_pointer(gp, &rt);
>>>>> > > 8. }
>>>>> > > 9.
>>>>> > > 10. void thread1 ()
>>>>> > > 11. {
>>>>> > > 12.    int i = -1;
>>>>> > > 13.    int j = -1;
>>>>> > > 14.    _Dependent_ptr struct rcutest *p;
>>>>> > > 15.
>>>>> > > 16.    p = rcu_dereference(gp);
>>>>> > > 17.    j = p->a;
>>>>> > > 18.   if (p == &rt)
>>>>> > > 19.        i = p->b;  /*Dependency breaking point*/
>>>>> > > 20.   else if(p)
>>>>> > > 21.       i = p->c;
>>>>> > > 22.   assert(i<0);
>>>>> > > 23.   assert(j<0);
>>>>> > > 24. }
>>>>> > > The gimple unoptimized code produced for lines 17-24 is shown below
>>>>> > >
>>>>> > > 1. if (p_16 == &rt)
>>>>> > > 2.     goto <bb 3>; [INV]
>>>>> > > 3.   else
>>>>> > > 4.    goto <bb 4>; [INV]
>>>>> > > 5.
>>>>> > > 6.  <bb 3> :
>>>>> > > 7.  i_19 = p_16->b;
>>>>> > > 8.  goto <bb 6>; [INV]
>>>>> > > 9.
>>>>> > > 10.  <bb 4> :
>>>>> > > 11.  if (p_16 != 0B)
>>>>> > > 12.    goto <bb 5>; [INV]
>>>>> > > 13.  else
>>>>> > > 14.    goto <bb 6>; [INV]
>>>>> > > 15.
>>>>> > > 16.  <bb 5> :
>>>>> > > 17.  i_18 = p_16->c;
>>>>> > > 18.
>>>>> > > 19.  <bb 6> :
>>>>> > > 20.  # i_7 = PHI <i_19(3), i_8(4), i_18(5)>
>>>>> > > 21.  _3 = i_7 < 0;
>>>>> > > 22.  _4 = (int) _3;
>>>>> > > 23.  assert (_4);
>>>>> > > 24.  _5 = j_17 < 0;
>>>>> > > 25.  _6 = (int) _5;
>>>>> > > 26.  assert (_6);
>>>>> > > 27.  return;
>>>>> > >
>>>>> > > The optimized code after -O1 is applied for the same lines is hown
>>>>> below :
>>>>> > >
>>>>> > > 1. if (_2 == &rt)
>>>>> > > 2.    goto <bb 3>; [30.00%]
>>>>> > > 3. else
>>>>> > > 4.    goto <bb 4>; [70.00%]
>>>>> > > 5.
>>>>> > > 6.  <bb 3> [local count: 322122547]:
>>>>> > > 7.   i_12 = rt.b;
>>>>> > > 8.   goto <bb 6>; [100.00%]
>>>>> > > 9.
>>>>> > > 10.  <bb 4> [local count: 751619277]:
>>>>> > > 11.   if (_1 != 0)
>>>>> > > 12.   goto <bb 5>; [50.00%]
>>>>> > > 13.   else
>>>>> > > 14.    goto <bb 6>; [50.00%]
>>>>> > > 15.
>>>>> > > 16.  <bb 5> [local count: 375809638]:
>>>>> > > 17.   i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
>>>>> > > 18.
>>>>> > > 19.   <bb 6> [local count: 1073741824]:
>>>>> > > 20.  # i_7 = PHI <i_12(3), i_11(5), -1(4)>
>>>>> > > 21.   _3 = i_7 < 0;
>>>>> > > 22.   _4 = (int) _3;
>>>>> > > 23.   assert (_4);
>>>>> > > 24.  _5 = j_10 < 0;
>>>>> > > 25.  _6 = (int) _5;
>>>>> > > 26.   assert (_6);
>>>>> > > 27.   return;
>>>>> >
>>>>> > Good show on tracing this through!
>>>>> >
>>>>> > > Statement 19 in the program gets converted from  i_19 = p_16->b;
>>>>> in line 7
>>>>> > > in unoptimized code to i_12 = rt.b; in line 7 in optimized code
>>>>> which
>>>>> > > breaks the dependency chain. We need to figure out the pass that
>>>>> does that
>>>>> > > and put some handling code in there for the _dependent_ptr
>>>>> qualified
>>>>> > > pointers. Passing simply -fipa-pure-const,
>>>>> -fguess-branch-probability or
>>>>> > > any other option alone does not produce the optimized code that
>>>>> breaks the
>>>>> > > dependency. But applying -O1, i.e., allowing all the optimizations
>>>>> does so.
>>>>> > > As passes are applied in a certain order, we need to figure out up
>>>>> to what
>>>>> > > passes, the code remains same and after what pass the dependency
>>>>> does not
>>>>> > > holds. So, we need to check the translated code after every pass.
>>>>> > >
>>>>> > > Does this sounds like a workable plan for ? Let me know your
>>>>> thoughts. If
>>>>> > > this sounds good then, we can do this for all the optimizations
>>>>> that may
>>>>> > > kill the dependencies at somepoint.
>>>>> >
>>>>> > I don't know of a better plan.
>>>>> >
>>>>> > My usual question...  Is there some way to script the checking of the
>>>>> > translated code at the end of each pass?
>>>>>
>>>>> The usual way to check the output of an optimization pass is by
>>>>> dumping the intermediate code at that point and matching the dump
>>>>> against a regexp, as in the tree-ssa directories in the testsuite.
>>>>> -fdump-tree-all will dump after all the gimple optimization passes,
>>>>> and you can look through them until you find the breakage.
>>>>>
>>>>> Jason
>>>>>
>>>> Thanks Jason for the advice, I followed it.
>>>>
>>>> I coded an example shown in figure 26 from here (
>>>> http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf).
>>>> The example:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c
>>>> The .objsz1 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.027t.objsz1
>>>> The .ccp1 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.028t.ccp1
>>>> The .sra code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.114t.sra
>>>> The .dom2 code:
>>>> https://github.com/AKG001/test/blob/master/dependency_breaking.c.116t.dom2
>>>>
>>>> Comparing the .objsz1 and .ccp1 file, the "struct p" gets removed after
>>>> constant copy propagation (ccp) pass(
>>>> https://github.com/gcc-mirror/gcc/blob/master/gcc/passes.def#L77). In
>>>> .objsz1 file at line 53, the dependencies start with
>>>> p_16 = _14;
>>>> .......
>>>> i_19 = p_16->b;
>>>> ......
>>>> as user specified and after that all the references are made through
>>>> struct p or its variants. But in .ccp1 file at line 41, the "struct p" is
>>>> replaced with variable _2, now, variable _2 starts the dependencies as we
>>>> can see.
>>>> _2 = (atomic struct rcutest *) _1;
>>>> ...........
>>>> i_19 = MEM[(struct rcutest *)_2].b;
>>>> ..........
>>>> I don't know whether it is okay to say at this point, the dependencies
>>>> are still maintained or not. Or we have to pass the dependent pointer
>>>> qualification to new variables that replaces them. Hoping someone could
>>>> clarify.
>>>>
>>>> Also, In .sra file at line 43 in thread1(), the statement is:
>>>>     i_12 = MEM[(struct rcutest *)_2].b;
>>>> In .dom2 file at line 61, the above statement gets converted to:
>>>>     i_12 = rt.b;
>>>> So, I believe that sure does breaks the dependency specified by the
>>>> user. For this, we need to do some modifications in tree-ssa-dom.c file.
>>>> Hope this makes sense.
>>>>
>>>> Thanks,
>>>> Akshat
>>>>
>>>

Reply via email to