Re: Fixing inline expansion of overlapping memmove and non-overlapping memcpy

Aaron Sawdey Wed, 15 May 2019 09:24:43 -0700

On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern 
>> (which Michael Matz tells us should be renamed cpymem anyway) to this 
>> new thing. Are you proposing that we still have both movmem and cpymem 
>> optab entries underneath to call the patterns but introduce this new 
>> memmove_with_hints() to be used by things called by 
>> expand_builtin_memmove() and expand_builtin_memcpy()?
> 
> I'd say so.  There are multiple levels at play:
> a) exposure to the user: probably a new __builtin_memmove, or a new combined 
>    builtin with a hint param to differentiate (but we can't get rid of 
>    __builtin_memcpy/mempcpy/strcpy, which all can go through the same 
>    route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin 
>    code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think 
>    we should change this (essentially they're the reflection of struct 
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented: 
>    currently non-overlapping only, using movmem pattern when possible; 
>    ultimately all sitting in the emit_block_move_hints() routine.
> 
> So, I'd add a new method to emit_block_move_hints indicating possible 
> overlap, disabling the use of move_by_pieces.  Then in 
> emit_block_move_via_movmem (also getting an indication of overlap), do the 
> equivalent of:
> 
>   finished = 0;
>   if (overlap_possible) {
>     if (optab[movmem])
>       finished = emit(movmem);
>   } else {
>     if (optab[cpymem])
>       finished = emit(cpymem);
>     if (!finished && optab[movmem])  // can use movmem also for overlap
>       finished = emit(movmem);
>   }
> 
> The overlap_possible method would only ever be used from the builtin 
> expansion, and never from the RTL block move expand.  Additionally a 
> target may optionally only define the movmem pattern if it's just as good 
> as the cpymem pattern (e.g. because it only handles fixed small sizes and 
> uses a load-all then store-all sequence).


We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converting __builtin_memmove() to __builtin_memcpy(). Would
you foresee also converting __builtin_memmove() with overlap into a call to
__builtin_memmove_hint in cases where we can define the overlap precisely
enough to provide the hint? My guess is that this wouldn't be a very common
case.
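
To illustrate the two situations, here is a small standalone C sketch (the
struct and function names are made up for this example, not taken from GCC).
In the first function the operands are distinct members, so a compiler can
prove there is no overlap and could treat the memmove() as a memcpy(). In the
second the overlap is known exactly (the destination is one byte past the
source), which is the kind of case a hint would have to describe:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type for illustration only.  */
struct pair
{
  char a[8];
  char b[8];
};

/* a and b are distinct members of the same object, so copying
   sizeof a bytes from a to b cannot overlap.  A memmove here is
   semantically equivalent to a memcpy.  */
static void
copy_disjoint (struct pair *p)
{
  memmove (p->b, p->a, sizeof p->a);
}

/* Destination starts one byte after the source, so for n > 1 the
   regions overlap and memmove semantics are required.  The caller
   must guarantee that buf has room for n + 1 bytes.  */
static void
shift_right_by_one (char *buf, size_t n)
{
  memmove (buf + 1, buf, n);
}
```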

My goals for this are:
 * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
 * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
   on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.
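
As a sanity check on my reading of it, here is a rough standalone model of
that selection logic in C. Nothing below is GCC's actual API: the enum, the
struct and the function name are placeholders, and each flag stands for "the
target has the pattern and it expands successfully", which glosses over the
case where the pattern exists but the expansion fails.

```c
#include <stdbool.h>

/* Placeholder for which pattern ends up being used.  */
enum pattern
{
  PAT_NONE,    /* no pattern usable; fall back to a library call */
  PAT_CPYMEM,  /* non-overlapping copy pattern */
  PAT_MOVMEM   /* overlap-safe move pattern */
};

/* Placeholder for the target's optab entries.  */
struct target
{
  bool has_cpymem;
  bool has_movmem;
};

/* Pick a pattern following the pseudocode quoted above: with possible
   overlap only movmem is acceptable; without overlap prefer cpymem and
   fall back to movmem, which is also valid for disjoint operands.  */
static enum pattern
choose_block_move_pattern (const struct target *t, bool overlap_possible)
{
  if (overlap_possible)
    return t->has_movmem ? PAT_MOVMEM : PAT_NONE;

  if (t->has_cpymem)
    return PAT_CPYMEM;
  if (t->has_movmem)
    return PAT_MOVMEM;
  return PAT_NONE;
}
```

So a memcpy() call always asks with overlap_possible == false, and a
memmove() call asks with overlap_possible == true unless the gimple analysis
has already turned it into a memcpy.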

Thanks,
    Aaron


-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain
