Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions

2019-05-15 Thread Martin Liška
On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> Hello Martin and community,

Hi Shubham.

Thank you for starting working on the project. There are some recommendations
before we fully being:

1) I've recently received 5 emails that have very similar subject and it's
   unclear why that happened? When you e.g. forget to add something into
   an email, it's easier to reply to your email and just mention what
   was forgotten.

2) Please reply to this email thread and for each new functionality, please
   adjust email subject so that it's easier to orient.

3) I would recommend to make a separate branch for each functionality and
   send a branch name here. I can easily pull and don't have to apply a patch.

4) Can we directly use pull requests into 
https://github.com/csmith-project/csmith ?
   I can comment the changes there if you want.

5) I'm quite new to csmith, so be patient with me ;)

6) Are you familiar with coding style of CSmith? I can't verify changes in that 
area.

Thanks for working on that.
Martin


Re: [GSOC-2019] - Implement Csmith fuzzer leveraging GCC C Extensions.

2019-05-15 Thread Martin Liška
On 5/15/19 10:12 AM, Shubham Narlawar wrote:
> Hello Martin and community,
> 
> Below is attached patch for Function Attribute Aligned -
> 
> The command line option to generate aligned attribute is -
> --func-attr-aligned which can be found in ./csmith --help
> 
> Please review the patch and suggest changes if any.
> 
> The process of adding any extension is same -
> 1. Implement an extension
> 2. Send patch for review to community.
> 3. Test GCC compiler and report bugs if found any.
> 4. Push patch into github fork.
> Continue.
> 
> I have a query regarding C extension "function attribute alias". Is it 
> correct way to add aliasing for functions in csmith program as shown below -
> 
> e.g. In csmith program, output aliasing below function declaration looks like 
> -
> 
> //function---declarations--
> void func_1();
> void func_3();
> .
> 
> void foo1() __attribute__ ((weak, alias("func_1")));
> void foo3() __attribute__ ((weak, alias("func_3")));
> ..
> 
> Looking forward to do this project with GCC community.
> 
> Thanks
> Shubham
> 
> func_attr_aligned.patch
> 
> diff --git a/src/CGOptions.cpp b/src/CGOptions.cpp
> index b74375a..98c629b 100644
> --- a/src/CGOptions.cpp
> +++ b/src/CGOptions.cpp
> @@ -202,6 +202,7 @@ DEFINE_GETTER_SETTER_BOOL(fast_execution);
>  
>  //GCC C Extensions
>  DEFINE_GETTER_SETTER_BOOL(func_attr_inline);
> +DEFINE_GETTER_SETTER_BOOL(func_attr_aligned);
>  
>  void
>  CGOptions::set_default_builtin_kinds()
> diff --git a/src/CGOptions.h b/src/CGOptions.h
> index 76887b8..a1aa389 100644
> --- a/src/CGOptions.h
> +++ b/src/CGOptions.h
> @@ -474,6 +474,9 @@ public:
>   static bool func_attr_inline(void);
>  static bool func_attr_inline(bool p);
>  
> + static bool func_attr_aligned(void);
> +static bool func_attr_aligned(bool p);
> +

Hi.

There you have a different indentation.

>  private:
>   static bool enabled_builtin_kind(const string &kind);
>  
> @@ -623,6 +626,7 @@ private:
>  
>   //GCC C Extensions
>   static bool func_attr_inline_;
> + static bool func_attr_aligned_;
>  private:
>   CGOptions(void);
>   CGOptions(CGOptions &cgo);
> diff --git a/src/Function.cpp b/src/Function.cpp
> index 9935552..1a769db 100644
> --- a/src/Function.cpp
> +++ b/src/Function.cpp
> @@ -396,6 +396,7 @@ Function::Function(const string &name, const Type 
> *return_type)
>  {
>   FuncList.push_back(this);   // Add to global list 
> of functions.
>   func_attr_inline = false;
> + func_attr_aligned = false;
>  }
>  
>  Function::Function(const string &name, const Type *return_type, bool builtin)
> @@ -411,6 +412,7 @@ Function::Function(const string &name, const Type 
> *return_type, bool builtin)
>  {
>   FuncList.push_back(this);   // Add to global list 
> of functions.
>   func_attr_inline = false;
> + func_attr_aligned = false;
>  }
>  
>  Function *
> @@ -434,8 +436,11 @@ Function::make_random_signature(const CGContext& 
> cg_context, const Type* type, c
>   if (CGOptions::inline_function() && rnd_flipcoin(InlineFunctionProb))
>   f->is_inlined = true;
>  
> - if(CGOptions::func_attr_inline() && rnd_flipcoin(InlineFunctionProb))
> + if(CGOptions::func_attr_inline() && rnd_flipcoin(FuncAttrInline))
>   f->func_attr_inline = true;

Are you sure here? Why are you removing InlineFunctionProb of an unrelated 
option (inline)?

> +
> + if(CGOptions::func_attr_aligned() && rnd_flipcoin(FuncAttrAligned))
> +f->func_attr_aligned = true;
>   return f;
>  }>  
> @@ -487,8 +492,11 @@ Function::make_first(void)
>   // collect info about global dangling pointers
>   fm->find_dangling_global_ptrs(f);
>  
> - if(CGOptions::func_attr_inline() && rnd_flipcoin(InlineFunctionProb))
> + if(CGOptions::func_attr_inline() && rnd_flipcoin(FuncAttrInline))
>  f->func_attr_inline = true;

Same comment here.

> +
> + if(CGOptions::func_attr_aligned() && rnd_flipcoin(FuncAttrAligned))
> +f->func_attr_aligned = true;
>   return f;
>  }
>  
> @@ -562,6 +570,43 @@ Function::OutputForwardDecl(std::ostream &out)
>   OutputHeader(out);
>   if(func_attr_inline)
>   out << " __attribute__((always_inline))";
> + if(func_attr_aligned){
> + int probability__BIGGEST_ALIGNMENT__ = 38;
> + bool use__BIGGEST_ALIGNMENT__ = 
> rnd_flipcoin(probability__BIGGEST_ALIGNMENT__);
> + if (use__BIGGEST_ALIGNMENT__)
> + out << " 
> __attribute__((aligned(__BIGGEST_ALIGNMENT__)))";
> + else{
> + int value = 0;
> + int power = rnd_upto(8);
> + if (power == 0)
> + power++;
> + switch (power){
> + case 1:
> + value = 2;
> +   

Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions

2019-05-15 Thread Martin Liška
On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> void func_1();
> void func_3();
> .
> 
> void foo1() __attribute__ ((weak, alias("func_1")));
> void foo3() __attribute__ ((weak, alias("func_3")));

Hi.

In all emails that you sent there's no patch that will implement 'alias' 
attribute.

Thanks,
Martin


Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions

2019-05-15 Thread Shubham Narlawar
On Wed, May 15, 2019 at 2:13 PM Martin Liška  wrote:

> On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> > Hello Martin and community,
>
> Hi Shubham.
>
> Thank you for starting working on the project. There are some
> recommendations
> before we fully being:
>
> 1) I've recently received 5 emails that have very similar subject and it's
>unclear why that happened? When you e.g. forget to add something into
>an email, it's easier to reply to your email and just mention what
>was forgotten.
>

I tried to send you Aligned patch and one csmith program but it bounces
back. I don't know why it happened. Sorry for sending mail 5 times. Sure, I
will reply to single thread.


> 2) Please reply to this email thread and for each new functionality, please
>adjust email subject so that it's easier to orient.
>

> 3) I would recommend to make a separate branch for each functionality and
>send a branch name here. I can easily pull and don't have to apply a
> patch.
>
Sure.

>
> 4) Can we directly use pull requests into
> https://github.com/csmith-project/csmith ?
>I can comment the changes there if you want.
>

I have created a fork github.com/shubhamnarlawar77/csmith/tree/csmith-gcc
I will make branch of csmith-gcc and send patches to it so that you can
pull and review. After review, I will push patch into csmith-gcc fork. Does
this work for you?


> 5) I'm quite new to csmith, so be patient with me ;)
>
> 6) Are you familiar with coding style of CSmith? I can't verify changes in
> that area.
>
Yes. I am quite familiar with it.

>
> Thanks for working on that.
> Martin
>


Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions

2019-05-15 Thread Shubham Narlawar
On Wed, May 15, 2019 at 2:23 PM Martin Liška  wrote:

> On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> > void func_1();
> > void func_3();
> > .
> >
> > void foo1() __attribute__ ((weak, alias("func_1")));
> > void foo3() __attribute__ ((weak, alias("func_3")));
>
> Hi.
>
> In all emails that you sent there's no patch that will implement 'alias'
> attribute.
>
I have not implemented alias attribute yet. I will send it to github fork
for review after aligned attribute.

>
> Thanks,
> Martin
>


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Richard Biener
On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey  wrote:
>
> GCC does not currently do inline expansion of overlapping memmove, nor does it
> have an expansion pattern to allow for non-overlapping memcpy, so I plan to 
> add
> patterns and support to implement this in gcc 10 timeframe.
>
> At present memcpy and memmove are kind of entangled. Here's the current state 
> of
> play:
>
> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
>
> There are several problems currently. If the memmove() arguments are in fact
> overlapping, then the expansion is actually not used which makes no sense and
> costs performance of calling a library function instead of inline expanding
> memmove() of small blocks.
>
> There is currently no way to have a separate memcpy pattern. I know from
> experience with expansion of memcmp on power that lengths on the order of
> hundreds of bytes are needed before the function call overhead is overcome by
> optimized glibc code. But we need the memcpy guarantee of non-overlapping
> arguments to make that happen, as we don't want to do a runtime overlap test.
>
> There is some analysis that happens in gimple_fold_builtin_memory_op() that
> determines when memmove calls cannot have an overlap between the arguments and
> converts them into memcpy() which is nice.
>
> However in builtins.c expand_builtin_memmove() does not actually do the
> expansion using the memmove pattern. This is why a memmove() call that cannot 
> be
> converted to memcpy() by gimple_fold_builtin_memory_op() is not expanded and 
> we
> call glibc memmove(). Only expand_builtin_memcpy() actually uses the memmove
> pattern.
>
> So here's my proposed set of fixes:
>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>cases.
>  * The movmem optab will continue to be treated exactly as it is today so
>that ports that might have a broken movmem pattern that doesn't actually
>handle the overlap cases will continue to work.
>  * expand_builtin_memmove() needs to actually do the memmove() expansion.
>  * expand_builtin_memcpy() needs to use cpymem. Currently this happens down in
>emit_block_move_via_movmem() so some functions might need to be renamed.
>  * ports can then add the new overlapping move and nonoverlapping copy 
> expanders
>and will get better expansion of both memmove and memcpy functions.
>
> I'd be interested in any comments about pieces of this machinery that need to
> work a certain way, or other related issues that should be addressed in
> between expand_builtin_memcpy() and emit_block_move_via_movmem().

I wonder if introducing a __builtin_memmove_with_hints specifying whether
src < dst or dst > src or unknown and/or a safe block size where that
doesn't matter
would help?  I can then be safely expanded to memmove() or to specific
inline code.

Richard.

> Thanks!
>Aaron
>
> --
> Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
> 050-2/C113  (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Michael Matz
Hi,

On Tue, 14 May 2019, Aaron Sawdey wrote:

> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
...
> However in builtins.c expand_builtin_memmove() does not actually do the 
> expansion using the memmove pattern.

Because it can't: the movmem pattern is not defined to require handling 
overlaps, and hence can't be used for any possibly overlapping 
memmove.  (So, in a way the pattern is misnamed and should probably have 
been called cpymem from the beginning, alas there we are).

> So here's my proposed set of fixes:
>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>cases.

Wouldn't it be nicer to rename the current movmem pattern to cpymem 
wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the 
whole tree) and then introduce a new optional movmem pattern with 
overlapping semantics?


Ciao,
Michael.


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 8:10 AM, Michael Matz wrote:> On Tue, 14 May 2019, Aaron Sawdey 
wrote:
> 
>> memcpy -> expand with movmem pattern
>> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
>> memmove (overlap) -> remains memmove -> glibc call
> ...
>> However in builtins.c expand_builtin_memmove() does not actually do the 
>> expansion using the memmove pattern.
> 
> Because it can't: the movmem pattern is not defined to require handling 
> overlaps, and hence can't be used for any possibly overlapping 
> memmove.  (So, in a way the pattern is misnamed and should probably have 
> been called cpymem from the beginning, alas there we are).
> 
>> So here's my proposed set of fixes:
>>  * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>>cases.
> 
> Wouldn't it be nicer to rename the current movmem pattern to cpymem 
> wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the 
> whole tree) and then introduce a new optional movmem pattern with 
> overlapping semantics?

Yeah that makes a lot of sense. I was unaware of that history, and was led
astray by the fact that the powerpc implementation of movemem works by
doing a bunch of loads into registers followed by a bunch of stores and
so (I think) would actually work for the overlap case.

Thanks,
   Aaron




-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 7:22 AM, Richard Biener wrote:
> On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey  wrote:
>> I'd be interested in any comments about pieces of this machinery that need to
>> work a certain way, or other related issues that should be addressed in
>> between expand_builtin_memcpy() and emit_block_move_via_movmem().
> 
> I wonder if introducing a __builtin_memmove_with_hints specifying whether
> src < dst or dst > src or unknown and/or a safe block size where that
> doesn't matter
> would help?  I can then be safely expanded to memmove() or to specific
> inline code.

Yes this would be a nice thing to get to, a single move/copy underlying
builtin, to which we communicate what the compiler's analysis tells us
about whether the operands overlap and by how much.

Next question would be how do we move from the existing movmem pattern
(which Michael Matz tells us should be renamed cpymem anyway) to this
new thing. Are you proposing that we still have both movmem and cpymem
optab entries underneath to call the patterns but introduce this
new memmove_with_hints() to be used by things called by expand_builtin_memmove()
and expand_builtin_memcpy()?

Thanks!
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Michael Matz
Hi,

On Wed, 15 May 2019, Aaron Sawdey wrote:

> Yes this would be a nice thing to get to, a single move/copy underlying 
> builtin, to which we communicate what the compiler's analysis tells us 
> about whether the operands overlap and by how much.
> 
> Next question would be how do we move from the existing movmem pattern 
> (which Michael Matz tells us should be renamed cpymem anyway) to this 
> new thing. Are you proposing that we still have both movmem and cpymem 
> optab entries underneath to call the patterns but introduce this new 
> memmove_with_hints() to be used by things called by 
> expand_builtin_memmove() and expand_builtin_memcpy()?

I'd say so.  There are multiple levels at play:
a) exposal to user: probably a new __builtint_memmove, or a new combined 
   builtin with a hint param to differentiate (but we can't get rid of 
   __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
   route in the middleend)
b) getting it through the gimple pipeline, probably just a new builtin 
   code, trivial
c) expanding the new builtin, with the help of next items
d) RTL block moves: they are defined as non-overlapping and I don't think 
   we should change this (essentially they're the reflection of struct 
   copies in C)
e) how any of the above (builtins and RTL block moves) are implemented: 
   currently non-overlapping only, using movmem pattern when possible; 
   ultimately all sitting in the emit_block_move_hints() routine.

So, I'd add a new method to emit_block_move_hints indicating possible 
overlap, disabling the use of move_by_pieces.  Then in 
emit_block_move_via_movmem (alse getting an indication of overlap), do the 
equivalent of:

  finished = 0;
  if (overlap_possible) {
if (optab[movmem])
  finished = emit(movmem)
  } else {
if (optab[cpymem])
  finished = emit(cpymem);
if (!finished && optab[movmem])  // can use movmem also for overlap
  finished = emit(movmem);
  }

The overlap_possible method would only ever be used from the builtin 
expansion, and never from the RTL block move expand.  Additionally a 
target may optionally only define the movmem pattern if it's just as good 
as the cpymem pattern (e.g. because it only handles fixed small sizes and 
uses a load-all then store-all sequence).


Ciao,
Michael.


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Jakub Jelinek
On Wed, May 15, 2019 at 02:02:32PM +, Michael Matz wrote:
> > Yes this would be a nice thing to get to, a single move/copy underlying 
> > builtin, to which we communicate what the compiler's analysis tells us 
> > about whether the operands overlap and by how much.
> > 
> > Next question would be how do we move from the existing movmem pattern 
> > (which Michael Matz tells us should be renamed cpymem anyway) to this 
> > new thing. Are you proposing that we still have both movmem and cpymem 
> > optab entries underneath to call the patterns but introduce this new 
> > memmove_with_hints() to be used by things called by 
> > expand_builtin_memmove() and expand_builtin_memcpy()?
> 
> I'd say so.  There are multiple levels at play:
> a) exposal to user: probably a new __builtint_memmove, or a new combined 
>builtin with a hint param to differentiate (but we can't get rid of 
>__builtin_memcpy/mempcpy/strcpy, which all can go through the same 
>route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin 
>code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think 
>we should change this (essentially they're the reflection of struct 
>copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented: 
>currently non-overlapping only, using movmem pattern when possible; 
>ultimately all sitting in the emit_block_move_hints() routine.

Just one thing to note, our "memcpy" expectation is that either there is no
overlap, or there is 100% overlap (src == dest), both all the current movmem
== future cpymem expanders and all the supported library implementations do
support that, though the latter just de-facto, it isn't a written guarantee.

Jakub


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Michael Matz
Hi,

On Wed, 15 May 2019, Jakub Jelinek wrote:

> Just one thing to note, our "memcpy" expectation is that either there is 
> no overlap, or there is 100% overlap (src == dest), both all the current 
> movmem == future cpymem expanders and all the supported library 
> implementations do support that, though the latter just de-facto, it 
> isn't a written guarantee.

Yes, I should have been more precise, complete overlap is always de-facto 
supported as well.


Ciao,
Michael.


[PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Umesh Kalappa
Hi All,

We have the situation ,where the R12 is pointing to Thunk GEP ,not the
current function  like

   .size   _ZN12Intermediate1vEv,.-_ZN12Intermediate1vEv
.set.LTHUNK0,_ZN12Intermediate1vEv
.align 2
.globl _ZThn8_N12Intermediate1vEv
.type   _ZThn8_N12Intermediate1vEv, @function
_ZThn8_N12Intermediate1vEv:
.LFB27:
.file 2 "/home/vkumar1/tmp/64_bit/qsp_ppc/gnu/dkmcxx/lib.h"
.loc 2 13 16
.cfi_startproc
.LCF2:
0:  addis 2,12,.TOC.-.LCF2@ha
addi 2,2,.TOC.-.LCF2@l
.localentry _ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
addi 3,3,-8
b .LTHUNK0
.cfi_endproc
.LFE27:
.size   _ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
.section".toc","aw"
.set .LC1,.LC0

.section".text"
.align 2
.globl _ZN12Intermediate1vEv
.type   _ZN12Intermediate1vEv, @function
_ZN12Intermediate1vEv:
.LFB25:
.loc 1 7 23
.cfi_startproc
.LCF1:
0:  addis 2,12,.TOC.-.LCF1@ha
addi 2,2,.TOC.-.LCF1@l
.localentry _ZN12Intermediate1vEv,.-_ZN12Intermediate1vEv
mflr 0
std 0,16(1)
std 31,-8(1)
stdu 1,-64(1)

like above the control  from "_ZThn8_N12Intermediate1vEv" (support
function for this pointer update)  is transferred
"_ZN12Intermediate1vEv" by b  inst (where its not updating the r12)
and in the beginning  of   "_ZN12Intermediate1vEv" we are loading the
toc base from r12 (which is incorrect ) ,we are investigating the
issue and one way to fix the issue is that make THUNK to update the
r12 ,the cal like bctrl  or  load the r12 with the function address in
the _ZN12Intermediate1vEv prologue code .

But before we go ahead ,please share your thoughts or shed some lights
on the same .

Thank you
~Umesh


Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Eric Botcazou
> like above the control  from "_ZThn8_N12Intermediate1vEv" (support
> function for this pointer update)  is transferred
> "_ZN12Intermediate1vEv" by b  inst (where its not updating the r12)
> and in the beginning  of   "_ZN12Intermediate1vEv" we are loading the
> toc base from r12 (which is incorrect ) ,we are investigating the
> issue and one way to fix the issue is that make THUNK to update the
> r12 ,the cal like bctrl  or  load the r12 with the function address in
> the _ZN12Intermediate1vEv prologue code .

Is that on VxWorks in kernel mode?  If so, the loader doesn't abide by the 
ELFv2 ABI so the simple way out is to disable asm thunks altogether:

#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
#define TARGET_ASM_CAN_OUTPUT_MI_THUNK rs6000_can_output_mi_thunk

/* Return true if rs6000_output_mi_thunk would be able to output the
   assembler code for the thunk function specified by the arguments
   it is passed, and false otherwise.  */

static bool
rs6000_can_output_mi_thunk (const_tree, HOST_WIDE_INT, HOST_WIDE_INT,
const_tree)
{
  /* The only possible issue is for VxWorks in kernel mode.  */
  if (!TARGET_VXWORKS || TARGET_VXWORKS_RTP)
return true;

  /* The loader neither creates the glue code sequence that loads r12 nor uses
 the local entry point for the sibcall's target in the ELFv2 ABI.  */
  return DEFAULT_ABI != ABI_ELFv2;
}

-- 
Eric Botcazou


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern 
>> (which Michael Matz tells us should be renamed cpymem anyway) to this 
>> new thing. Are you proposing that we still have both movmem and cpymem 
>> optab entries underneath to call the patterns but introduce this new 
>> memmove_with_hints() to be used by things called by 
>> expand_builtin_memmove() and expand_builtin_memcpy()?
> 
> I'd say so.  There are multiple levels at play:
> a) exposal to user: probably a new __builtint_memmove, or a new combined 
>builtin with a hint param to differentiate (but we can't get rid of 
>__builtin_memcpy/mempcpy/strcpy, which all can go through the same 
>route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin 
>code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think 
>we should change this (essentially they're the reflection of struct 
>copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented: 
>currently non-overlapping only, using movmem pattern when possible; 
>ultimately all sitting in the emit_block_move_hints() routine.
> 
> So, I'd add a new method to emit_block_move_hints indicating possible 
> overlap, disabling the use of move_by_pieces.  Then in 
> emit_block_move_via_movmem (alse getting an indication of overlap), do the 
> equivalent of:
> 
>   finished = 0;
>   if (overlap_possible) {
> if (optab[movmem])
>   finished = emit(movmem)
>   } else {
> if (optab[cpymem])
>   finished = emit(cpymem);
> if (!finished && optab[movmem])  // can use movmem also for overlap
>   finished = emit(movmem);
>   }
> 
> The overlap_possible method would only ever be used from the builtin 
> expansion, and never from the RTL block move expand.  Additionally a 
> target may optionally only define the movmem pattern if it's just as good 
> as the cpymem pattern (e.g. because it only handles fixed small sizes and 
> uses a load-all then store-all sequence).

We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converging __builtin_memmove() to __builtin_memcpy(). Would
you forsee looking for converting __builtin_memmove() with overlap into
a call to __builtin_memmove_hint if it is a case where we can define the
overlap precisely enough to provide the hint? My guess is that this wouldn't
be a very common case.

My goals for this are:
 * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
 * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
   on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.

Thanks,
Aaron


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Jakub Jelinek
On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> We currently have gimple_fold_builtin_memory_op() figuring out where there
> is no overlap and converging __builtin_memmove() to __builtin_memcpy(). Would
> you forsee looking for converting __builtin_memmove() with overlap into
> a call to __builtin_memmove_hint if it is a case where we can define the

Please do not introduce user visible builtins that you are not intending to
support for user use.  Thus, I think you want internal function
.MEMMOVE_HINT as opposed to __builtin_memmove_hint.

> overlap precisely enough to provide the hint? My guess is that this wouldn't
> be a very common case.
> 
> My goals for this are:
>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
>on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

Except for the becomes part (the memcpy call is the same thing as
__builtin_memcpy in the middle-end, all you care about if it is
BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
have already?

Jakub


Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Umesh Kalappa
Thank you Eric for the suggestion and say that we support in the  loader
part ,can you please point on elfv2 reference that says implementation for
this specific case.

~Umesh

On Wed, May 15, 2019, 21:35 Eric Botcazou  wrote:

> > like above the control  from "_ZThn8_N12Intermediate1vEv" (support
> > function for this pointer update)  is transferred
> > "_ZN12Intermediate1vEv" by b  inst (where its not updating the r12)
> > and in the beginning  of   "_ZN12Intermediate1vEv" we are loading the
> > toc base from r12 (which is incorrect ) ,we are investigating the
> > issue and one way to fix the issue is that make THUNK to update the
> > r12 ,the cal like bctrl  or  load the r12 with the function address in
> > the _ZN12Intermediate1vEv prologue code .
>
> Is that on VxWorks in kernel mode?  If so, the loader doesn't abide by the
> ELFv2 ABI so the simple way out is to disable asm thunks altogether:
>
> #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
> #define TARGET_ASM_CAN_OUTPUT_MI_THUNK rs6000_can_output_mi_thunk
>
> /* Return true if rs6000_output_mi_thunk would be able to output the
>assembler code for the thunk function specified by the arguments
>it is passed, and false otherwise.  */
>
> static bool
> rs6000_can_output_mi_thunk (const_tree, HOST_WIDE_INT, HOST_WIDE_INT,
> const_tree)
> {
>   /* The only possible issue is for VxWorks in kernel mode.  */
>   if (!TARGET_VXWORKS || TARGET_VXWORKS_RTP)
> return true;
>
>   /* The loader neither creates the glue code sequence that loads r12 nor
> uses
>  the local entry point for the sibcall's target in the ELFv2 ABI.  */
>   return DEFAULT_ABI != ABI_ELFv2;
> }
>
> --
> Eric Botcazou
>


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
>> My goals for this are:
>>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
>>on the gimple analysis) and goes through optab[movmem] or optab[cpymem]
> 
> Except for the becomes part (the memcpy call is the same thing as
> __builtin_memcpy in the middle-end, all you care about if it is
> BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
> missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
> have already?

Yes. I was just trying to state what I wanted it to become, some of which
is already present. So I think I will start working on two patches:

1) rename optab movmem and the underlying patterns to cpymem.
2) add a new optab movmem that is really memmove() and add support for
having __builtin_memmove() use it.

Handling of the partial overlap case can be a separate piece of work.

Thanks,
   Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Jakub Jelinek
On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
> On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> > On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> >> My goals for this are:
> >>  * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
> >>  * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
> >>on the gimple analysis) and goes through optab[movmem] or optab[cpymem]
> > 
> > Except for the becomes part (the memcpy call is the same thing as
> > __builtin_memcpy in the middle-end, all you care about if it is
> > BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
> > missing optab[movmem] part and movmem->cpymem renaming, isn't that what we
> > have already?
> 
> Yes. I was just trying to state what I wanted it to become, some of which
> is already present. So I think I will start working on two patches:
> 
> 1) rename optab movmem and the underlying patterns to cpymem.
> 2) add a new optab movmem that is really memmove() and add support for
> having __builtin_memmove() use it.
> 
> Handling of the partial overlap case can be a separate piece of work.

That 1) and 2) can be also separate pieces of work ;).

Jakub


Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

2019-05-15 Thread Aaron Sawdey
On 5/15/19 1:01 PM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
>> 1) rename optab movmem and the underlying patterns to cpymem.
>> 2) add a new optab movmem that is really memmove() and add support for
>> having __builtin_memmove() use it.
>>
>> Handling of the partial overlap case can be a separate piece of work.
> 
> That 1) and 2) can be also separate pieces of work ;).
Exactly -- make things as easy as possible when I go begging for reviewers :-)

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain



Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Eric Botcazou
> Thank you Eric for the suggestion and say that we support in the  loader
> part ,can you please point on elfv2 reference that says implementation for
> this specific case.

I don't think there is a reference to this specific case in the ABI, only the 
general stuff about local and global entry points.  We have had this patch in 
our tree for some time and it works well, so let me submit it for inclusion in 
the official tree.

-- 
Eric Botcazou


Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Segher Boessenkool
Hi Umesh,

Please spell out "global entry point", almost everything and everyone does.

On Wed, May 15, 2019 at 08:57:29PM +0530, Umesh Kalappa wrote:

> .set.LTHUNK0,_ZN12Intermediate1vEv

> _ZThn8_N12Intermediate1vEv:
> .LCF2:
> 0:  addis 2,12,.TOC.-.LCF2@ha
> addi 2,2,.TOC.-.LCF2@l
> .localentry 
> _ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
> b .LTHUNK0

> _ZN12Intermediate1vEv:
> .LCF1:
> 0:  addis 2,12,.TOC.-.LCF1@ha
> addi 2,2,.TOC.-.LCF1@l

> like above the control  from "_ZThn8_N12Intermediate1vEv" (support
> function for this pointer update)  is transferred
> "_ZN12Intermediate1vEv" by b  inst (where its not updating the r12)
> and in the beginning  of   "_ZN12Intermediate1vEv" we are loading the
> toc base from r12 (which is incorrect ) ,we are investigating the
> issue and one way to fix the issue is that make THUNK to update the
> r12 ,the cal like bctrl  or  load the r12 with the function address in
> the _ZN12Intermediate1vEv prologue code .

The thunk should point to the local entry point instead, I guess?

Please open a PR, with compilable testcase, and other relevant info (if
you need some specific -mcpu= for example, or other flags, do mention
that!)  See https://gcc.gnu.org/bugs.html for more info.  Thanks!


Segher


Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Segher Boessenkool
On Wed, May 15, 2019 at 08:31:27PM +0200, Eric Botcazou wrote:
> > Thank you Eric for the suggestion and say that we support in the  loader
> > part ,can you please point on elfv2 reference that says implementation for
> > this specific case.
> 
> I don't think there is a reference to this specific case in the ABI, only the 
> general stuff about local and global entry points.

Yes, it is just a normal jump to a local function.

> We have had this patch in 
> our tree for some time and it works well, so let me submit it for inclusion 
> in 
> the official tree.

Can't you get the loader fixed, instead?


Segher


regarding https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89924

2019-05-15 Thread kamlesh kumar
Hi devs,
I am looking into subjected issue.
consider below testcase.
$test.cpp
class A {
public:
virtual int value() { return 1; }
};

class  B final : public A {
public:
int value(){ return 2; }
};
int test(A* b) {
return b->value() + 11;
}
void foo_test(B*b)
{
test(b);
}
Here gcc is unable to devirtualize the call to value into the test, because
type information is not being propogated from foo_test to test.
So this is what i am trying to do to make devirtualization happen:

First we can inline the function test into foo_test.
And then update the type in the OBJ_TYPE_REF.
so that devirtualization can be applied.
Like to know your thought before going ahead.

./Kamlesh


Re: [PowerPC 64]r12 is not updated to GEP when control transferred from virtual thunk function .

2019-05-15 Thread Umesh Kalappa
>>Can't you get the loader fixed, instead?
Yes we are thinking the same ,question what should be loader semantics here
(update the prologue code to update r12 to Global Entry Point or Update R2
with toc base (that don't relay on the R12). )

~Umesh
On Thu, May 16, 2019, 05:22 Segher Boessenkool 
wrote:

> On Wed, May 15, 2019 at 08:31:27PM +0200, Eric Botcazou wrote:
> > > Thank you Eric for the suggestion and say that we support in the
> loader
> > > part ,can you please point on elfv2 reference that says implementation
> for
> > > this specific case.
> >
> > I don't think there is a reference to this specific case in the ABI,
> only the
> > general stuff about local and global entry points.
>
> Yes, it is just a normal jump to a local function.
>
> > We have had this patch in
> > our tree for some time and it works well, so let me submit it for
> inclusion in
> > the official tree.
>
> Can't you get the loader fixed, instead?
>
>
> Segher
>