Re: Doubts regarding the _Dependent_ptr keyword

2019-07-07 Thread Akshat Garg
On Sun, Jul 7, 2019 at 9:48 PM Akshat Garg  wrote:

> On Sun, Jul 7, 2019 at 7:49 PM Paul E. McKenney 
> wrote:
>
>> On Sat, Jul 06, 2019 at 12:39:45PM +0530, Akshat Garg wrote:
>> > On Sat, Jul 6, 2019 at 1:09 AM Akshat Garg  wrote:
>> >
>> > > On Fri, Jul 5, 2019 at 7:18 AM Paul E. McKenney <
>> paul...@linux.ibm.com>
>> > > wrote:
>> > >
>> > >> [CCing Ramana]
>> > >>
>> > >> My guess is that Jakub is saying that you should start by
>> > >> replacing all instances of TREE_VOLATILE() with (TREE_VOLATILE()
>> > >> || TREE_DEPENDENT_PTR()), give or take my getting the names wrong.
>> > >> Then selectively remove instances of TREE_DEPENDENT_PTR() where it
>> can
>> > >> be shown to be safe to do so.
>> > >>
>> > >
>> > >> The advantage of Jakub's approach over selectively inserting
>> > >> TREE_DEPENDENT_PTR() where it is needed is less chance of bugs due
>> > >> to missing TREE_DEPENDENT_PTR() instances.
>> > >>
>> > > I have checked that replacing TREE_VOLATILE() with (TREE_VOLATILE() ||
>> > > TREE_DEPENDENT_PTR()) is not sufficient to make the dependent ptr act
>> > > similar to volatile ptrs. I am looking over the source and will let
>> you
>> > > know if I am able to do it. If any of like, I can share the patch.
>> > >
>> > > Akshat
>> > >
>> > I have made some changes to the previous commit for _Dependent_ptr
>> > qualifier. We need to use this (
>> >
>> https://github.com/AKG001/gcc/blob/fb4187bc3872a50880159232cf336f0a03505fa8/gcc/tree-core.h#L943
>> )
>> > side-effect flag also with the dependent_ptr flag therefore, I have
>> pulled
>> > out the dependent_ptr from the union. Also, I have introduced a new
>> macro
>> > for dependent ptr TREE_THIS_DEPENDENT_PTR. Similar macro
>> > (TREE_THIS_VOLATILE) is used to check whether anything is volatile or
>> not.
>> > The initial plan was to change all occurrences of checks for volatile
>> with
>> > volatile or dependent_ptr, i.e., TREE_THIS_VOLATILE(NODE) with
>> > (TREE_THIS_DEPENDENT_PTR(NODE) || TREE_THIS_VOLATILE(NODE)). But, I
>> found
>> > 250 places to do this. Would that be okay, if I still change over all
>> the
>> > places?
>>
>> There is only one way to find out!  And in any case, this sounds like
>> one reason why your previous changes didn't fully mimic the volatile
>> keyword.  ;-)
>>
>> So I recommend making the changes.
>>
>> Probably not quite large enough a change to make it worthwhile learning
>> Coccinelle, but that is nevertheless an option.  (This tool allows you
>> to specify update patterns in C code, which it will automatically apply.)
>> http://coccinelle.lip6.fr/ if you want to try it out.
>>
>> Thanx, Paul
>
>
 I have made the changes shown here
https://github.com/AKG001/gcc/commit/e4ffd77f62ace986c124a70b90a662c083c570ba
The changes are not much tested on the edge cases. The test it was breaking
earlier shows here
https://github.com/AKG001/test/blob/master/dependency_breaking.c I am
checking for other test cases.
The optimized code with -O3 after applying the patch is here
https://github.com/AKG001/test/blob/master/dependency_breaking.c.231t.optimized
Please, I need your opinion and further directions on this.

Thanks,
Akshat

> > >> Jakub also seems to be pointing out that optimizations in the backend
> > >> also need to be handled, which I believe is what he is getting at
> with his
> > >> mention of RTL.  Of course, there is a very large number of backends,
> so
> > >> there needs to be a way of only doing some of them.  One way of doing
> that
> > >> is to handle only the multicore weakly ordered CPUs (ARM, ARMv8,
> PowerPC,
> > >> MIPS, maybe RISC-V), and on other CPUs continue the current behavior
> > >> of transforming the memory_order_consume load to memory_order_acquire.
> > >> For example, memory_order_acquire is almost free on x86, s390, and
> > >> the like.
> > >>
> > >> Does that help?
> > >>
> > >> Thanx, Paul
> > >>
> > >> On Fri, Jul 05, 2019 at 04:38:03AM +0530, Akshat Garg wrote:
> > >> > On Thu, Jul 4, 2019 at 11:39 PM Jakub Jelinek 
> wrote:
> > >> >
> > >> > > On Thu, Jul 04, 2019 at 10:40:15AM -0700, Paul E. McKenney wrote:
> > >> > > > > I think fully guaranteeing this is hard (besides when you use
> > >> > > > > volatile), we have the very same issue when dealing with
> > >> > > > > pointer provenance rules, known for years and not fixed
> > >> > > > > (and I don't see a good way to fix these issues without
> > >> > > > > sacrifying performance everywhere).
> > >> > > > >
> > >> > > > > Good luck ;)
> > >> > > >
> > >> > > > Thank you, we will need it!  ;-)
> > >> > >
> > >> > > Well, luck probably isn't all you need, you'll need some way to
> > >> represent
> > >> > > those dependencies in the IL (both GIMPLE and RTL) that would
> ensure
> > >> that
> > >> > > they aren't optimized away, because just hoping that optimization
> > >> don't
> > >> > > mess
> > >> > > that up or just patching a 

gcc-10-20190707 is now available

2019-07-07 Thread gccadmin
Snapshot gcc-10-20190707 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/10-20190707/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 10 SVN branch
with the following options: svn://gcc.gnu.org/svn/gcc/trunk revision 273184

You'll find:

 gcc-10-20190707.tar.xz   Complete GCC

  SHA256=1cbe33692a54cc1d0aacf60689de245928e028e8f51ab2ba7d8a56e040f7d2e0
  SHA1=1da5c6811e109a8ab2149f10bdaeffab37c94acc

Diffs from 10-20190630 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-10
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: ARM peephole2 from 2003 never merged, still valid

2019-07-07 Thread Fredrik Hederstierna
> From: Segher Boessenkool 
>  Sent: Thursday, June 6, 2019 6:49 PM
>  To: Richard Earnshaw (lists)
>  Cc: Fredrik Hederstierna; Jeff Law; gcc@gcc.gnu.org
>  Subject: Re: ARM peephole2 from 2003 never merged, still valid 
>    
>  
> On Thu, Jun 06, 2019 at 05:06:35PM +0100, Richard Earnshaw (lists) wrote:
>  > The reason combine doesn't catch this is because at the time it runs the
>  > MOV is in a different basic block.  Later on it is sunk into the same
>  > basic block, but it's then too late to do the merge.
>  
>  Or you could say the MOV didn't even exist yet: the insn that is merged
>  by the peephole is created by the prologue code, eventually.
>  
>  This isn't really a target problem, it is very much generic, but I don't
>  see a better solution than a peephole either.
>
>  Segher
 
So what is the conclusion, should be re-open PR-9663 and try to fix the missing 
peephole?
I think there will always be cases where the code generated target specific 
register optimizations are possible, that not necessarily have any direct 
connection of the actual basic blocks but more by chance which instructions are 
finally generated after each other, where maybe eg peephole-alike optimizers 
are the only solution to do the job?
/Fredrik