On 5/15/19 9:02 AM, Michael Matz wrote:
> On Wed, 15 May 2019, Aaron Sawdey wrote:
>> Next question would be how do we move from the existing movmem pattern
>> (which Michael Matz tells us should be renamed cpymem anyway) to this
>> new thing. Are you proposing that we still have both movmem and cpymem
>> optab entries underneath to call the patterns but introduce this new
>> memmove_with_hints() to be used by things called by
>> expand_builtin_memmove() and expand_builtin_memcpy()?
>
> I'd say so. There are multiple levels at play:
> a) exposal to user: probably a new __builtin_memmove, or a new combined
>    builtin with a hint param to differentiate (but we can't get rid of
>    __builtin_memcpy/mempcpy/strcpy, which all can go through the same
>    route in the middleend)
> b) getting it through the gimple pipeline, probably just a new builtin
>    code, trivial
> c) expanding the new builtin, with the help of next items
> d) RTL block moves: they are defined as non-overlapping and I don't think
>    we should change this (essentially they're the reflection of struct
>    copies in C)
> e) how any of the above (builtins and RTL block moves) are implemented:
>    currently non-overlapping only, using movmem pattern when possible;
>    ultimately all sitting in the emit_block_move_hints() routine.
>
> So, I'd add a new method to emit_block_move_hints indicating possible
> overlap, disabling the use of move_by_pieces. Then in
> emit_block_move_via_movmem (also getting an indication of overlap), do the
> equivalent of:
>
>   finished = 0;
>   if (overlap_possible) {
>     if (optab[movmem])
>       finished = emit(movmem);
>   } else {
>     if (optab[cpymem])
>       finished = emit(cpymem);
>     if (!finished && optab[movmem])  // can use movmem also for overlap
>       finished = emit(movmem);
>   }
>
> The overlap_possible method would only ever be used from the builtin
> expansion, and never from the RTL block move expand. Additionally a
> target may optionally only define the movmem pattern if it's just as good
> as the cpymem pattern (e.g. because it only handles fixed small sizes and
> uses a load-all then store-all sequence).

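To make sure I read the quoted selection order correctly, here is a small self-contained C model of it. This is only a sketch, not GCC code: `struct target`, `enum pattern` and `choose_pattern()` are hypothetical names invented for illustration, and the model assumes that emitting a pattern always succeeds when the target defines it (the real expanders can fail and fall through).

```c
#include <stdbool.h>

/* Hypothetical stand-ins for illustration only; none of these names
   exist in GCC.  */
enum pattern { PAT_NONE, PAT_CPYMEM, PAT_MOVMEM };

struct target
{
  bool has_cpymem;  /* target defines a cpymem pattern (no overlap allowed) */
  bool has_movmem;  /* target defines a movmem pattern (overlap tolerated) */
};

/* Pick the pattern in the order of the quoted pseudocode.  PAT_NONE
   means no pattern applies and the caller would fall back to something
   else, such as a library call.  Assumption: a defined pattern always
   emits successfully.  */
static enum pattern
choose_pattern (const struct target *t, bool overlap_possible)
{
  if (overlap_possible)
    /* Only the overlap-safe pattern is acceptable here.  */
    return t->has_movmem ? PAT_MOVMEM : PAT_NONE;

  /* No overlap: prefer cpymem, but movmem is also valid.  */
  if (t->has_cpymem)
    return PAT_CPYMEM;
  if (t->has_movmem)
    return PAT_MOVMEM;
  return PAT_NONE;
}
```

In this model a target that defines only movmem still gets a pattern for the non-overlapping case, which matches the remark that a target may choose to define movmem alone.
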
We currently have gimple_fold_builtin_memory_op() figuring out where there
is no overlap and converting __builtin_memmove() to __builtin_memcpy(). Would
you foresee also converting __builtin_memmove() with overlap into a call to
__builtin_memmove_hint if it is a case where we can define the overlap
precisely enough to provide the hint? My guess is that this wouldn't be a
very common case.

My goals for this are:
 * memcpy() call becomes __builtin_memcpy and goes to optab[cpymem]
 * memmove() call becomes __builtin_memmove (or __builtin_memcpy based
   on the gimple analysis) and goes through optab[movmem] or optab[cpymem]

I think what you've described meets these goals and cleans things up.

Thanks,
    Aaron

-- 
Aaron Sawdey, Ph.D.  acsaw...@linux.vnet.ibm.com
050-2/C113  (507) 253-7520 home: 507/263-0782
IBM Linux Technology Center - PPC Toolchain