Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions
On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> Hello Martin and community,

Hi Shubham.

Thank you for starting work on the project. There are some recommendations
before we fully begin:

1) I've recently received 5 emails that have very similar subjects and it's
   unclear why that happened. When you e.g. forget to add something to an
   email, it's easier to reply to your own email and just mention what was
   forgotten.

2) Please reply to this email thread and, for each new functionality, adjust
   the email subject so that it's easier to orient.

3) I would recommend making a separate branch for each functionality and
   sending the branch name here. Then I can easily pull and don't have to
   apply a patch.

4) Can we directly use pull requests into
   https://github.com/csmith-project/csmith ? I can comment on the changes
   there if you want.

5) I'm quite new to csmith, so be patient with me ;)

6) Are you familiar with the coding style of Csmith? I can't verify changes
   in that area.

Thanks for working on that.
Martin
Re: [GSOC-2019] - Implement Csmith fuzzer leveraging GCC C Extensions.
On 5/15/19 10:12 AM, Shubham Narlawar wrote:
> Hello Martin and community,
>
> Below is the attached patch for the function attribute aligned.
>
> The command line option to generate the aligned attribute is
> --func-attr-aligned, which can be found in ./csmith --help.
>
> Please review the patch and suggest changes if any.
>
> The process of adding any extension is the same:
> 1. Implement an extension.
> 2. Send patch for review to community.
> 3. Test GCC compiler and report bugs if found any.
> 4. Push patch into github fork.
> Continue.
>
> I have a query regarding the C extension "function attribute alias". Is it
> the correct way to add aliasing for functions in a csmith program as shown
> below?
>
> e.g. In a csmith program, output aliasing below the function declarations
> looks like:
>
> //function---declarations--
> void func_1();
> void func_3();
> .
>
> void foo1() __attribute__ ((weak, alias("func_1")));
> void foo3() __attribute__ ((weak, alias("func_3")));
> ..
>
> Looking forward to doing this project with the GCC community.
>
> Thanks
> Shubham
>
> func_attr_aligned.patch
>
> diff --git a/src/CGOptions.cpp b/src/CGOptions.cpp
> index b74375a..98c629b 100644
> --- a/src/CGOptions.cpp
> +++ b/src/CGOptions.cpp
> @@ -202,6 +202,7 @@ DEFINE_GETTER_SETTER_BOOL(fast_execution);
>
>  //GCC C Extensions
>  DEFINE_GETTER_SETTER_BOOL(func_attr_inline);
> +DEFINE_GETTER_SETTER_BOOL(func_attr_aligned);
>
>  void
>  CGOptions::set_default_builtin_kinds()
> diff --git a/src/CGOptions.h b/src/CGOptions.h
> index 76887b8..a1aa389 100644
> --- a/src/CGOptions.h
> +++ b/src/CGOptions.h
> @@ -474,6 +474,9 @@ public:
>      static bool func_attr_inline(void);
>      static bool func_attr_inline(bool p);
>
> +    static bool func_attr_aligned(void);
> +static bool func_attr_aligned(bool p);
> +

Hi. There you have a different indentation.

>  private:
>      static bool enabled_builtin_kind(const string &kind);
>
> @@ -623,6 +626,7 @@ private:
>
>      //GCC C Extensions
>      static bool func_attr_inline_;
> +    static bool func_attr_aligned_;
>  private:
>      CGOptions(void);
>      CGOptions(CGOptions &cgo);
> diff --git a/src/Function.cpp b/src/Function.cpp
> index 9935552..1a769db 100644
> --- a/src/Function.cpp
> +++ b/src/Function.cpp
> @@ -396,6 +396,7 @@ Function::Function(const string &name, const Type *return_type)
>  {
>      FuncList.push_back(this);   // Add to global list of functions.
>      func_attr_inline = false;
> +    func_attr_aligned = false;
>  }
>
>  Function::Function(const string &name, const Type *return_type, bool builtin)
> @@ -411,6 +412,7 @@ Function::Function(const string &name, const Type *return_type, bool builtin)
>  {
>      FuncList.push_back(this);   // Add to global list of functions.
>      func_attr_inline = false;
> +    func_attr_aligned = false;
>  }
>
>  Function *
> @@ -434,8 +436,11 @@ Function::make_random_signature(const CGContext& cg_context, const Type* type, c
>      if (CGOptions::inline_function() && rnd_flipcoin(InlineFunctionProb))
>          f->is_inlined = true;
>
> -    if(CGOptions::func_attr_inline() && rnd_flipcoin(InlineFunctionProb))
> +    if(CGOptions::func_attr_inline() && rnd_flipcoin(FuncAttrInline))
>          f->func_attr_inline = true;

Are you sure here? Why are you removing InlineFunctionProb of an unrelated
option (inline)?

> +
> +    if(CGOptions::func_attr_aligned() && rnd_flipcoin(FuncAttrAligned))
> +f->func_attr_aligned = true;
>      return f;
>  }
>
> @@ -487,8 +492,11 @@ Function::make_first(void)
>      // collect info about global dangling pointers
>      fm->find_dangling_global_ptrs(f);
>
> -    if(CGOptions::func_attr_inline() && rnd_flipcoin(InlineFunctionProb))
> +    if(CGOptions::func_attr_inline() && rnd_flipcoin(FuncAttrInline))
>          f->func_attr_inline = true;

Same comment here.

> +
> +    if(CGOptions::func_attr_aligned() && rnd_flipcoin(FuncAttrAligned))
> +f->func_attr_aligned = true;
>      return f;
>  }
>
> @@ -562,6 +570,43 @@ Function::OutputForwardDecl(std::ostream &out)
>      OutputHeader(out);
>      if(func_attr_inline)
>          out << " __attribute__((always_inline))";
> +    if(func_attr_aligned){
> +        int probability__BIGGEST_ALIGNMENT__ = 38;
> +        bool use__BIGGEST_ALIGNMENT__ = rnd_flipcoin(probability__BIGGEST_ALIGNMENT__);
> +        if (use__BIGGEST_ALIGNMENT__)
> +            out << " __attribute__((aligned(__BIGGEST_ALIGNMENT__)))";
> +        else{
> +            int value = 0;
> +            int power = rnd_upto(8);
> +            if (power == 0)
> +                power++;
> +            switch (power){
> +            case 1:
> +                value = 2;
> +
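For context, the forward declarations csmith would emit with
--func-attr-aligned under this patch look roughly like the sketch below.
The function names and signatures are invented; only the attribute placement
(a power-of-two constant or __BIGGEST_ALIGNMENT__) comes from the patch.

    #include <stdint.h>

    /* Hypothetical csmith-style output; names and signatures are made up. */
    static int32_t func_1(uint32_t p_2) __attribute__((aligned(32)));
    static int16_t func_2(void) __attribute__((aligned(__BIGGEST_ALIGNMENT__)));

    static int32_t func_1(uint32_t p_2) { return (int32_t) p_2; }
    static int16_t func_2(void) { return 0; }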
Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions
On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> void func_1();
> void func_3();
> .
>
> void foo1() __attribute__ ((weak, alias("func_1")));
> void foo3() __attribute__ ((weak, alias("func_3")));

Hi.

In all the emails that you sent there's no patch that will implement the
'alias' attribute.

Thanks,
Martin
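As background on what such a patch has to respect: GCC's alias attribute
requires the alias target to be defined in the same translation unit, so the
generated aliases must refer to functions csmith actually emits in the same
file. A minimal standalone example on an ELF target (function names are
hypothetical, not from any patch):

    /* The target (func_1) must be defined in this translation unit. */
    int func_1(void) { return 1; }

    /* foo1 becomes a weak alias for func_1's symbol. */
    int foo1(void) __attribute__((weak, alias("func_1")));

    int main(void) { return foo1(); /* resolves to func_1 */ }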
Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions
On Wed, May 15, 2019 at 2:13 PM Martin Liška wrote:
> On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> > Hello Martin and community,
>
> Hi Shubham.
>
> Thank you for starting work on the project. There are some recommendations
> before we fully begin:
>
> 1) I've recently received 5 emails that have very similar subjects and it's
>    unclear why that happened. When you e.g. forget to add something to an
>    email, it's easier to reply to your own email and just mention what was
>    forgotten.

I tried to send you the aligned patch and one csmith program, but it bounced
back. I don't know why that happened. Sorry for sending the mail 5 times.
Sure, I will reply to a single thread.

> 2) Please reply to this email thread and, for each new functionality, adjust
>    the email subject so that it's easier to orient.
>
> 3) I would recommend making a separate branch for each functionality and
>    sending the branch name here. Then I can easily pull and don't have to
>    apply a patch.

Sure.

> 4) Can we directly use pull requests into
>    https://github.com/csmith-project/csmith ?
>    I can comment on the changes there if you want.

I have created a fork: github.com/shubhamnarlawar77/csmith/tree/csmith-gcc
I will make a branch off csmith-gcc and send patches to it so that you can
pull and review. After review, I will push the patch into the csmith-gcc
fork. Does this work for you?

> 5) I'm quite new to csmith, so be patient with me ;)
>
> 6) Are you familiar with the coding style of Csmith? I can't verify changes
>    in that area.

Yes, I am quite familiar with it.

> Thanks for working on that.
> Martin
Re: [GSOC-2019] - Csmith fuzzer leveraging GCC C Extensions
On Wed, May 15, 2019 at 2:23 PM Martin Liška wrote:
> On 5/15/19 9:49 AM, Shubham Narlawar wrote:
> > void func_1();
> > void func_3();
> > .
> >
> > void foo1() __attribute__ ((weak, alias("func_1")));
> > void foo3() __attribute__ ((weak, alias("func_3")));
>
> Hi.
>
> In all the emails that you sent there's no patch that will implement the
> 'alias' attribute.

I have not implemented the alias attribute yet. I will send it to the github
fork for review after the aligned attribute.

> Thanks,
> Martin
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey wrote:
>
> GCC does not currently do inline expansion of overlapping memmove, nor does
> it have an expansion pattern to allow for non-overlapping memcpy, so I plan
> to add patterns and support to implement this in the gcc 10 timeframe.
>
> At present memcpy and memmove are kind of entangled. Here's the current
> state of play:
>
> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
>
> There are several problems currently. If the memmove() arguments are in
> fact overlapping, then the expansion is actually not used, which makes no
> sense and costs performance: we call a library function instead of inline
> expanding memmove() of small blocks.
>
> There is currently no way to have a separate memcpy pattern. I know from
> experience with expansion of memcmp on power that lengths on the order of
> hundreds of bytes are needed before the function call overhead is overcome
> by optimized glibc code. But we need the memcpy guarantee of
> non-overlapping arguments to make that happen, as we don't want to do a
> runtime overlap test.
>
> There is some analysis that happens in gimple_fold_builtin_memory_op() that
> determines when memmove calls cannot have an overlap between the arguments
> and converts them into memcpy(), which is nice.
>
> However, in builtins.c expand_builtin_memmove() does not actually do the
> expansion using the memmove pattern. This is why a memmove() call that
> cannot be converted to memcpy() by gimple_fold_builtin_memory_op() is not
> expanded and we call glibc memmove(). Only expand_builtin_memcpy() actually
> uses the memmove pattern.
>
> So here's my proposed set of fixes:
> * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>   cases.
> * The movmem optab will continue to be treated exactly as it is today so
>   that ports that might have a broken movmem pattern that doesn't actually
>   handle the overlap cases will continue to work.
> * expand_builtin_memmove() needs to actually do the memmove() expansion.
> * expand_builtin_memcpy() needs to use cpymem. Currently this happens down
>   in emit_block_move_via_movmem(), so some functions might need to be
>   renamed.
> * Ports can then add the new overlapping move and non-overlapping copy
>   expanders and will get better expansion of both memmove and memcpy
>   functions.
>
> I'd be interested in any comments about pieces of this machinery that need
> to work a certain way, or other related issues that should be addressed
> between expand_builtin_memcpy() and emit_block_move_via_movmem().

I wonder if introducing a __builtin_memmove_with_hints specifying whether
src < dst or dst > src or unknown, and/or a safe block size where that
doesn't matter, would help? It can then be safely expanded to memmove() or
to specific inline code.

Richard.

> Thanks!
>    Aaron
>
> --
> Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
> 050-2/C113 (507) 253-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
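The semantic split driving the proposal is the usual C one: memcpy may assume
the regions are disjoint, while memmove must handle overlap. A small sketch
(hypothetical functions, illustrating the discussion above):

    #include <string.h>

    char buf[16] = "abcdefgh";

    void shift_right(void)
    {
      /* Source and destination overlap, so only memmove is defined here;
         memcpy would be undefined behavior. */
      memmove(buf + 1, buf, 8);
    }

    void copy_disjoint(char *restrict dst, const char *restrict src)
    {
      /* No overlap (restrict-qualified), so memcpy is valid and can be
         expanded inline via the port's block-copy pattern. */
      memcpy(dst, src, 8);
    }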
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
Hi,

On Tue, 14 May 2019, Aaron Sawdey wrote:

> memcpy -> expand with movmem pattern
> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
> memmove (overlap) -> remains memmove -> glibc call
...
> However, in builtins.c expand_builtin_memmove() does not actually do the
> expansion using the memmove pattern.

Because it can't: the movmem pattern is not defined to require handling
overlaps, and hence can't be used for any possibly overlapping memmove.
(So, in a way the pattern is misnamed and should probably have been called
cpymem from the beginning, alas there we are.)

> So here's my proposed set of fixes:
> * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>   cases.

Wouldn't it be nicer to rename the current movmem pattern to cpymem
wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the whole
tree) and then introduce a new optional movmem pattern with overlapping
semantics?

Ciao,
Michael.
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On 5/15/19 8:10 AM, Michael Matz wrote:
> On Tue, 14 May 2019, Aaron Sawdey wrote:
>
>> memcpy -> expand with movmem pattern
>> memmove (no overlap) -> transform to memcpy -> expand with movmem pattern
>> memmove (overlap) -> remains memmove -> glibc call
> ...
>> However, in builtins.c expand_builtin_memmove() does not actually do the
>> expansion using the memmove pattern.
>
> Because it can't: the movmem pattern is not defined to require handling
> overlaps, and hence can't be used for any possibly overlapping memmove.
> (So, in a way the pattern is misnamed and should probably have been called
> cpymem from the beginning, alas there we are.)
>
>> So here's my proposed set of fixes:
>> * Add new optab entries for nonoverlapping_memcpy and overlapping_memmove
>>   cases.
>
> Wouldn't it be nicer to rename the current movmem pattern to cpymem
> wholesale for all ports (i.e. roughly a big s/movmem/cpymem/ over the
> whole tree) and then introduce a new optional movmem pattern with
> overlapping semantics?

Yeah, that makes a lot of sense. I was unaware of that history, and was led
astray by the fact that the powerpc implementation of movmem works by doing
a bunch of loads into registers followed by a bunch of stores, and so (I
think) would actually work for the overlap case.

Thanks,
Aaron

--
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On 5/15/19 7:22 AM, Richard Biener wrote:
> On Tue, May 14, 2019 at 9:21 PM Aaron Sawdey wrote:
>> I'd be interested in any comments about pieces of this machinery that need
>> to work a certain way, or other related issues that should be addressed
>> between expand_builtin_memcpy() and emit_block_move_via_movmem().
>
> I wonder if introducing a __builtin_memmove_with_hints specifying whether
> src < dst or dst > src or unknown, and/or a safe block size where that
> doesn't matter, would help? It can then be safely expanded to memmove() or
> to specific inline code.

Yes, this would be a nice thing to get to: a single move/copy underlying
builtin, to which we communicate what the compiler's analysis tells us about
whether the operands overlap and by how much.

The next question would be how we move from the existing movmem pattern
(which Michael Matz tells us should be renamed cpymem anyway) to this new
thing. Are you proposing that we still have both movmem and cpymem optab
entries underneath to call the patterns, but introduce this new
memmove_with_hints() to be used by things called by expand_builtin_memmove()
and expand_builtin_memcpy()?

Thanks!
Aaron

--
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
Hi,

On Wed, 15 May 2019, Aaron Sawdey wrote:

> Yes, this would be a nice thing to get to: a single move/copy underlying
> builtin, to which we communicate what the compiler's analysis tells us
> about whether the operands overlap and by how much.
>
> The next question would be how we move from the existing movmem pattern
> (which Michael Matz tells us should be renamed cpymem anyway) to this
> new thing. Are you proposing that we still have both movmem and cpymem
> optab entries underneath to call the patterns, but introduce this new
> memmove_with_hints() to be used by things called by
> expand_builtin_memmove() and expand_builtin_memcpy()?

I'd say so. There are multiple levels at play:
a) exposure to the user: probably a new __builtin_memmove, or a new combined
   builtin with a hint param to differentiate (but we can't get rid of
   __builtin_memcpy/mempcpy/strcpy, which all can go through the same route
   in the middle end)
b) getting it through the gimple pipeline: probably just a new builtin code,
   trivial
c) expanding the new builtin, with the help of the next items
d) RTL block moves: they are defined as non-overlapping and I don't think we
   should change this (essentially they're the reflection of struct copies
   in C)
e) how any of the above (builtins and RTL block moves) are implemented:
   currently non-overlapping only, using the movmem pattern when possible;
   ultimately all sitting in the emit_block_move_hints() routine.

So, I'd add a new method to emit_block_move_hints indicating possible
overlap, disabling the use of move_by_pieces. Then in
emit_block_move_via_movmem (also getting an indication of overlap), do the
equivalent of:

  finished = 0;
  if (overlap_possible) {
    if (optab[movmem])
      finished = emit(movmem);
  } else {
    if (optab[cpymem])
      finished = emit(cpymem);
    if (!finished && optab[movmem])  // can use movmem also for overlap
      finished = emit(movmem);
  }

The overlap_possible method would only ever be used from the builtin
expansion, and never from the RTL block move expand. Additionally, a target
may optionally only define the movmem pattern if it's just as good as the
cpymem pattern (e.g. because it only handles fixed small sizes and uses a
load-all then store-all sequence).

Ciao,
Michael.
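Written out as a self-contained C sketch of the same dispatch (the
function-pointer parameters are hypothetical stand-ins for "try to expand
via this optab", not existing GCC interfaces):

    #include <stdbool.h>

    /* Sketch only: models the proposed dispatch, not actual expr.c code. */
    static bool
    expand_block_move_sketch (bool overlap_possible,
                              bool (*try_cpymem) (void),
                              bool (*try_movmem) (void))
    {
      bool finished = false;

      if (overlap_possible)
        {
          /* Only a pattern with overlap semantics may be used.  */
          if (try_movmem)
            finished = try_movmem ();
        }
      else
        {
          /* Prefer the non-overlapping copy pattern, but an overlap-safe
             movmem pattern is also correct for disjoint operands.  */
          if (try_cpymem)
            finished = try_cpymem ();
          if (!finished && try_movmem)
            finished = try_movmem ();
        }

      return finished;
    }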
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On Wed, May 15, 2019 at 02:02:32PM +0000, Michael Matz wrote:
> > Yes, this would be a nice thing to get to: a single move/copy underlying
> > builtin, to which we communicate what the compiler's analysis tells us
> > about whether the operands overlap and by how much.
> >
> > The next question would be how we move from the existing movmem pattern
> > (which Michael Matz tells us should be renamed cpymem anyway) to this
> > new thing. Are you proposing that we still have both movmem and cpymem
> > optab entries underneath to call the patterns, but introduce this new
> > memmove_with_hints() to be used by things called by
> > expand_builtin_memmove() and expand_builtin_memcpy()?
>
> I'd say so. There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new
>    combined builtin with a hint param to differentiate (but we can't get
>    rid of __builtin_memcpy/mempcpy/strcpy, which all can go through the
>    same route in the middle end)
> b) getting it through the gimple pipeline: probably just a new builtin
>    code, trivial
> c) expanding the new builtin, with the help of the next items
> d) RTL block moves: they are defined as non-overlapping and I don't think
>    we should change this (essentially they're the reflection of struct
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented:
>    currently non-overlapping only, using the movmem pattern when possible;
>    ultimately all sitting in the emit_block_move_hints() routine.

Just one thing to note: our "memcpy" expectation is that either there is no
overlap, or there is 100% overlap (src == dest). Both all the current
movmem == future cpymem expanders and all the supported library
implementations do support that, though the latter just de facto, it isn't
a written guarantee.

	Jakub
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
Hi,

On Wed, 15 May 2019, Jakub Jelinek wrote:

> Just one thing to note: our "memcpy" expectation is that either there is
> no overlap, or there is 100% overlap (src == dest). Both all the current
> movmem == future cpymem expanders and all the supported library
> implementations do support that, though the latter just de facto, it
> isn't a written guarantee.

Yes, I should have been more precise: complete overlap is always de facto
supported as well.

Ciao,
Michael.
[PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
Hi All,

We have a situation where R12 is pointing to the thunk's global entry point,
not the current function's, like:

	.size	_ZN12Intermediate1vEv,.-_ZN12Intermediate1vEv
	.set	.LTHUNK0,_ZN12Intermediate1vEv
	.align 2
	.globl _ZThn8_N12Intermediate1vEv
	.type	_ZThn8_N12Intermediate1vEv, @function
_ZThn8_N12Intermediate1vEv:
.LFB27:
	.file 2 "/home/vkumar1/tmp/64_bit/qsp_ppc/gnu/dkmcxx/lib.h"
	.loc 2 13 16
	.cfi_startproc
.LCF2:
0:	addis 2,12,.TOC.-.LCF2@ha
	addi 2,2,.TOC.-.LCF2@l
	.localentry	_ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
	addi 3,3,-8
	b .LTHUNK0
	.cfi_endproc
.LFE27:
	.size	_ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
	.section	".toc","aw"
	.set	.LC1,.LC0
	.section	".text"
	.align 2
	.globl _ZN12Intermediate1vEv
	.type	_ZN12Intermediate1vEv, @function
_ZN12Intermediate1vEv:
.LFB25:
	.loc 1 7 23
	.cfi_startproc
.LCF1:
0:	addis 2,12,.TOC.-.LCF1@ha
	addi 2,2,.TOC.-.LCF1@l
	.localentry	_ZN12Intermediate1vEv,.-_ZN12Intermediate1vEv
	mflr 0
	std 0,16(1)
	std 31,-8(1)
	stdu 1,-64(1)

As above, control is transferred from "_ZThn8_N12Intermediate1vEv" (the
support function for the this pointer update) to "_ZN12Intermediate1vEv" by
a b instruction, which does not update r12. At the beginning of
"_ZN12Intermediate1vEv" we load the TOC base from r12, which is then
incorrect. We are investigating the issue, and one way to fix it is to make
the thunk update r12 (a call like bctrl, or load r12 with the function
address) before branching, or to load r12 with the function address in the
_ZN12Intermediate1vEv prologue code.

But before we go ahead, please share your thoughts or shed some light on
this.

Thank you,
~Umesh
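For reference, the mangled names correspond to a secondary-base override:
_ZThn8_N12Intermediate1vEv is the non-virtual thunk that adjusts "this" by
-8 before jumping to Intermediate::v() at _ZN12Intermediate1vEv. A
hypothetical C++ reduction that produces this kind of thunk on a 64-bit
target (the actual hierarchy in lib.h is not shown in this mail):

    // Hypothetical reduction; with two polymorphic bases, the override
    // reached through the second base goes via a "this"-adjusting thunk
    // (offset -8 on LP64, since Base2 sits at offset 8 in Intermediate).
    struct Base1 { virtual int v() = 0; };
    struct Base2 { virtual int v() = 0; };

    struct Intermediate : Base1, Base2 {
      int v() override { return 42; }   // overrides v() in both bases
    };

    int call_via_base2(Base2 *p) {
      // Goes through Base2's vtable slot, i.e. through the thunk, which
      // adjusts "this" before branching to Intermediate::v().
      return p->v();
    }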
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
> As above, control is transferred from "_ZThn8_N12Intermediate1vEv" (the
> support function for the this pointer update) to "_ZN12Intermediate1vEv"
> by a b instruction, which does not update r12. At the beginning of
> "_ZN12Intermediate1vEv" we load the TOC base from r12, which is then
> incorrect. We are investigating the issue, and one way to fix it is to
> make the thunk update r12 (a call like bctrl, or load r12 with the
> function address) before branching, or to load r12 with the function
> address in the _ZN12Intermediate1vEv prologue code.

Is that on VxWorks in kernel mode? If so, the loader doesn't abide by the
ELFv2 ABI, so the simple way out is to disable asm thunks altogether:

#undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
#define TARGET_ASM_CAN_OUTPUT_MI_THUNK rs6000_can_output_mi_thunk

/* Return true if rs6000_output_mi_thunk would be able to output the
   assembler code for the thunk function specified by the arguments
   it is passed, and false otherwise.  */

static bool
rs6000_can_output_mi_thunk (const_tree, HOST_WIDE_INT, HOST_WIDE_INT,
			    const_tree)
{
  /* The only possible issue is for VxWorks in kernel mode.  */
  if (!TARGET_VXWORKS || TARGET_VXWORKS_RTP)
    return true;

  /* The loader neither creates the glue code sequence that loads r12 nor
     uses the local entry point for the sibcall's target in the ELFv2
     ABI.  */
  return DEFAULT_ABI != ABI_ELFv2;
}

-- 
Eric Botcazou
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> The next question would be how we move from the existing movmem pattern
>> (which Michael Matz tells us should be renamed cpymem anyway) to this
>> new thing. Are you proposing that we still have both movmem and cpymem
>> optab entries underneath to call the patterns, but introduce this new
>> memmove_with_hints() to be used by things called by
>> expand_builtin_memmove() and expand_builtin_memcpy()?
>
> I'd say so. There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new
>    combined builtin with a hint param to differentiate (but we can't get
>    rid of __builtin_memcpy/mempcpy/strcpy, which all can go through the
>    same route in the middle end)
> b) getting it through the gimple pipeline: probably just a new builtin
>    code, trivial
> c) expanding the new builtin, with the help of the next items
> d) RTL block moves: they are defined as non-overlapping and I don't think
>    we should change this (essentially they're the reflection of struct
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented:
>    currently non-overlapping only, using the movmem pattern when possible;
>    ultimately all sitting in the emit_block_move_hints() routine.
>
> So, I'd add a new method to emit_block_move_hints indicating possible
> overlap, disabling the use of move_by_pieces. Then in
> emit_block_move_via_movmem (also getting an indication of overlap), do the
> equivalent of:
>
>   finished = 0;
>   if (overlap_possible) {
>     if (optab[movmem])
>       finished = emit(movmem);
>   } else {
>     if (optab[cpymem])
>       finished = emit(cpymem);
>     if (!finished && optab[movmem])  // can use movmem also for overlap
>       finished = emit(movmem);
>   }
>
> The overlap_possible method would only ever be used from the builtin
> expansion, and never from the RTL block move expand. Additionally, a
> target may optionally only define the movmem pattern if it's just as good
> as the cpymem pattern (e.g. because it only handles fixed small sizes and
> uses a load-all then store-all sequence).

We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converting __builtin_memmove() to __builtin_memcpy().
Would you foresee converting __builtin_memmove() with overlap into a call to
__builtin_memmove_hint if it is a case where we can define the overlap
precisely enough to provide the hint? My guess is that this wouldn't be a
very common case.

My goals for this are:
* memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
* memmove() call becomes __builtin_memmove (or __builtin_memcpy based on
  the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.

Thanks,
Aaron

--
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> We currently have gimple_fold_builtin_memory_op() figuring out where there
> is no overlap and converting __builtin_memmove() to __builtin_memcpy().
> Would you foresee converting __builtin_memmove() with overlap into a call
> to __builtin_memmove_hint if it is a case where we can define the

Please do not introduce user-visible builtins that you are not intending to
support for user use. Thus, I think you want an internal function
.MEMMOVE_HINT as opposed to __builtin_memmove_hint.

> overlap precisely enough to provide the hint? My guess is that this
> wouldn't be a very common case.
>
> My goals for this are:
> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
> * memmove() call becomes __builtin_memmove (or __builtin_memcpy based on
>   the gimple analysis) and goes through optab[movmem] or optab[cpymem]

Except for the "becomes" part (the memcpy call is the same thing as
__builtin_memcpy in the middle end; all you care about is whether it is
BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
missing optab[movmem] part and the movmem->cpymem renaming, isn't that what
we have already?

	Jakub
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
Thank you Eric for the suggestion. Assuming we support this in the loader
part, can you please point to the ELFv2 reference that describes the
implementation for this specific case?

~Umesh

On Wed, May 15, 2019, 21:35 Eric Botcazou wrote:
> > As above, control is transferred from "_ZThn8_N12Intermediate1vEv" (the
> > support function for the this pointer update) to "_ZN12Intermediate1vEv"
> > by a b instruction, which does not update r12. At the beginning of
> > "_ZN12Intermediate1vEv" we load the TOC base from r12, which is then
> > incorrect. We are investigating the issue, and one way to fix it is to
> > make the thunk update r12 (a call like bctrl, or load r12 with the
> > function address) before branching, or to load r12 with the function
> > address in the _ZN12Intermediate1vEv prologue code.
>
> Is that on VxWorks in kernel mode? If so, the loader doesn't abide by the
> ELFv2 ABI, so the simple way out is to disable asm thunks altogether:
>
> #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK
> #define TARGET_ASM_CAN_OUTPUT_MI_THUNK rs6000_can_output_mi_thunk
>
> /* Return true if rs6000_output_mi_thunk would be able to output the
>    assembler code for the thunk function specified by the arguments
>    it is passed, and false otherwise.  */
>
> static bool
> rs6000_can_output_mi_thunk (const_tree, HOST_WIDE_INT, HOST_WIDE_INT,
>                             const_tree)
> {
>   /* The only possible issue is for VxWorks in kernel mode.  */
>   if (!TARGET_VXWORKS || TARGET_VXWORKS_RTP)
>     return true;
>
>   /* The loader neither creates the glue code sequence that loads r12 nor
>      uses the local entry point for the sibcall's target in the ELFv2
>      ABI.  */
>   return DEFAULT_ABI != ABI_ELFv2;
> }
>
> --
> Eric Botcazou
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
>> My goals for this are:
>> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
>> * memmove() call becomes __builtin_memmove (or __builtin_memcpy based on
>>   the gimple analysis) and goes through optab[movmem] or optab[cpymem]
>
> Except for the "becomes" part (the memcpy call is the same thing as
> __builtin_memcpy in the middle end; all you care about is whether it is
> BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for the
> missing optab[movmem] part and the movmem->cpymem renaming, isn't that
> what we have already?

Yes. I was just trying to state what I wanted it to become, some of which is
already present. So I think I will start working on two patches:

1) Rename the optab movmem and the underlying patterns to cpymem.
2) Add a new optab movmem that is really memmove() and add support for
   having __builtin_memmove() use it.

Handling of the partial overlap case can be a separate piece of work.

Thanks,
Aaron

--
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
> On 5/15/19 11:31 AM, Jakub Jelinek wrote:
> > On Wed, May 15, 2019 at 11:23:54AM -0500, Aaron Sawdey wrote:
> >> My goals for this are:
> >> * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
> >> * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
> >>   on the gimple analysis) and goes through optab[movmem] or
> >>   optab[cpymem]
> >
> > Except for the "becomes" part (the memcpy call is the same thing as
> > __builtin_memcpy in the middle end; all you care about is whether it is
> > BUILT_IN_MEMCPY etc. and whether it has compatible arguments), and for
> > the missing optab[movmem] part and the movmem->cpymem renaming, isn't
> > that what we have already?
>
> Yes. I was just trying to state what I wanted it to become, some of which
> is already present. So I think I will start working on two patches:
>
> 1) Rename the optab movmem and the underlying patterns to cpymem.
> 2) Add a new optab movmem that is really memmove() and add support for
>    having __builtin_memmove() use it.
>
> Handling of the partial overlap case can be a separate piece of work.

That 1) and 2) can also be separate pieces of work ;).

	Jakub
Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy
On 5/15/19 1:01 PM, Jakub Jelinek wrote:
> On Wed, May 15, 2019 at 12:59:01PM -0500, Aaron Sawdey wrote:
>> 1) Rename the optab movmem and the underlying patterns to cpymem.
>> 2) Add a new optab movmem that is really memmove() and add support for
>>    having __builtin_memmove() use it.
>>
>> Handling of the partial overlap case can be a separate piece of work.
>
> That 1) and 2) can also be separate pieces of work ;).

Exactly -- make things as easy as possible when I go begging for
reviewers :-)

--
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113 (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
> Thank you Eric for the suggestion. Assuming we support this in the loader
> part, can you please point to the ELFv2 reference that describes the
> implementation for this specific case?

I don't think there is a reference to this specific case in the ABI, only
the general stuff about local and global entry points. We have had this
patch in our tree for some time and it works well, so let me submit it for
inclusion in the official tree.

-- 
Eric Botcazou
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
Hi Umesh,

Please spell out "global entry point"; almost everything and everyone does.

On Wed, May 15, 2019 at 08:57:29PM +0530, Umesh Kalappa wrote:
> 	.set	.LTHUNK0,_ZN12Intermediate1vEv
> _ZThn8_N12Intermediate1vEv:
> .LCF2:
> 0:	addis 2,12,.TOC.-.LCF2@ha
> 	addi 2,2,.TOC.-.LCF2@l
> 	.localentry	_ZThn8_N12Intermediate1vEv,.-_ZThn8_N12Intermediate1vEv
> 	b .LTHUNK0

> _ZN12Intermediate1vEv:
> .LCF1:
> 0:	addis 2,12,.TOC.-.LCF1@ha
> 	addi 2,2,.TOC.-.LCF1@l

> As above, control is transferred from "_ZThn8_N12Intermediate1vEv" (the
> support function for the this pointer update) to "_ZN12Intermediate1vEv"
> by a b instruction, which does not update r12. At the beginning of
> "_ZN12Intermediate1vEv" we load the TOC base from r12, which is then
> incorrect. We are investigating the issue, and one way to fix it is to
> make the thunk update r12 (a call like bctrl, or load r12 with the
> function address) before branching, or to load r12 with the function
> address in the _ZN12Intermediate1vEv prologue code.

The thunk should point to the local entry point instead, I guess?

Please open a PR, with a compilable testcase and other relevant info (if you
need a specific -mcpu= for example, or other flags, do mention that!).
See https://gcc.gnu.org/bugs.html for more info.

Thanks!


Segher
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
On Wed, May 15, 2019 at 08:31:27PM +0200, Eric Botcazou wrote:
> > Thank you Eric for the suggestion. Assuming we support this in the
> > loader part, can you please point to the ELFv2 reference that describes
> > the implementation for this specific case?
>
> I don't think there is a reference to this specific case in the ABI, only
> the general stuff about local and global entry points.

Yes, it is just a normal jump to a local function.

> We have had this patch in our tree for some time and it works well, so
> let me submit it for inclusion in the official tree.

Can't you get the loader fixed, instead?


Segher
regarding https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89924
Hi devs,

I am looking into the subject issue. Consider the testcase below.

$test.cpp
class A {
public:
  virtual int value() { return 1; }
};

class B final : public A {
public:
  int value() { return 2; }
};

int test(A* b) {
  return b->value() + 11;
}

void foo_test(B* b) {
  test(b);
}

Here gcc is unable to devirtualize the call to value() inside test, because
type information is not being propagated from foo_test to test.

So this is what I am trying to do to make devirtualization happen: first we
can inline the function test into foo_test, and then update the type in the
OBJ_TYPE_REF so that devirtualization can be applied.

I'd like to know your thoughts before going ahead.

./Kamlesh
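For clarity, the intended end state is that after inlining test() into
foo_test() the OBJ_TYPE_REF sees the static type B, which is final, so the
virtual call can be resolved directly. A hand-written illustration reusing
the classes from the testcase above (not actual compiler output, and the
function name here is hypothetical):

    void foo_test_after_inlining(B* b) {
      (void) b;
      // was: test(b), i.e. b->value() + 11.  With the static type known to
      // be B (a final class), the call devirtualizes to B::value(), folds
      // to 2 + 11, and the unused result is eliminated.
    }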
Re: [PowerPC 64] r12 is not updated to GEP when control transferred from virtual thunk function.
>> Can't you get the loader fixed, instead?

Yes, we are thinking the same. The question is what the loader semantics
should be here: update the prologue code so that r12 is set to the global
entry point, or update r2 with the TOC base directly (so that it doesn't
rely on r12)?

~Umesh

On Thu, May 16, 2019, 05:22 Segher Boessenkool wrote:
> On Wed, May 15, 2019 at 08:31:27PM +0200, Eric Botcazou wrote:
> > > Thank you Eric for the suggestion. Assuming we support this in the
> > > loader part, can you please point to the ELFv2 reference that
> > > describes the implementation for this specific case?
> >
> > I don't think there is a reference to this specific case in the ABI,
> > only the general stuff about local and global entry points.
>
> Yes, it is just a normal jump to a local function.
>
> > We have had this patch in our tree for some time and it works well, so
> > let me submit it for inclusion in the official tree.
>
> Can't you get the loader fixed, instead?
>
>
> Segher