On Wed, Sep 12, 2012 at 8:37 AM, Eric Botcazou <ebotca...@adacore.com> wrote:
> This is the PR about the useless spilling to memory of structures that are
> returned in registers.  It was essentially addressed last year by Easwaran
> with an enhancement of the RTL DSE pass, but Easwaran also noted that we
> still spill to memory in the simplest cases, e.g. gcc.dg/pr44194-1.c,
> because expand_call creates a temporary on the stack to store the value
> returned in registers...
>
> The attached patch solves this problem by copying the value into pseudos
> instead by means of emit_group_move_into_temps.  This is sufficient to get
> rid of the remaining memory accesses for gcc.dg/pr44194-1.c on x86-64 for
> example, but not on strict-alignment platforms like SPARC64.
>
> The problem is that, on strict-alignment platforms, emit_group_store will use
> bitfield techniques (store_bit_field) to store the returned value, and the
> bitfield routines (store_bit_field and extract_bit_field) have these lines:
>
>   /* We may be accessing data outside the field, which means
>      we can alias adjacent data.  */
>   if (MEM_P (op0))
>     {
>       op0 = shallow_copy_rtx (op0);
>       set_mem_alias_set (op0, 0);
>       set_mem_expr (op0, 0);
>     }
>
> Now the enhancement implemented in the RTL DSE pass by Easwaran is precisely
> based on the MEM_EXPR of MEM objects.
>
> The patch solves this problem by implementing a variant of adjust_address
> along the lines of the comment at the end of adjust_address_1:
>
>   /* At some point, we should validate that this offset is within the object,
>      if all the appropriate values are known.  */
>   return new_rtx;
>
> i.e. adjust_bitfield_address will drop the underlying object of the MEM if it
> cannot prove that the adjusted memory access is still within its bounds.
> The bitfield manipulation routines in expmed.c are then changed to invoke
> adjust_bitfield_address instead of adjust_address and the above special lines
> in store_bit_field and extract_bit_field are eliminated.
>
> While I was at it, I also fixed a probable oversight in extract_bit_field_1
> that has bothered me for a while: in the multi-word case, extract_bit_field_1
> recurses on extract_bit_field instead of itself (unlike store_bit_field_1),
> which short-circuits the FALLBACK_P parameter.
>
> Tested on x86-64/Linux and SPARC64/Solaris.  Comments?
>
>
> 2012-09-12  Eric Botcazou  <ebotca...@adacore.com>
>
>         PR rtl-optimization/44194
>         * calls.c (expand_call): In the PARALLEL case, copy the return value
>         into pseudos instead of spilling it onto the stack.
>         * emit-rtl.c (adjust_address_1): Rename ADJUST into ADJUST_ADDRESS and
>         add new ADJUST_OBJECT parameter.
>         If ADJUST_OBJECT is set, drop the underlying object if it cannot be
>         proved that the adjusted memory access is still within its bounds.
>         (adjust_automodify_address_1): Adjust call to adjust_address_1.
>         (widen_memory_access): Likewise.
>         * expmed.c (store_bit_field_1): Call adjust_bitfield_address instead
>         of adjust_address.  Do not drop the underlying object of a MEM.
>         (store_fixed_bit_field): Likewise.
>         (extract_bit_field_1): Likewise.  Fix oversight in recursion.
>         (extract_fixed_bit_field): Likewise.
>         * expr.h (adjust_address_1): Adjust prototype.
>         (adjust_address): Adjust call to adjust_address_1.
>         (adjust_address_nv): Likewise.
>         (adjust_bitfield_address): New macro.
>         (adjust_bitfield_address_nv): Likewise.
>         * expr.c (expand_assignment): Handle a PARALLEL in more cases.
>         (store_expr): Likewise.
>         (store_field): Likewise.
>
>         * dse.c: Fix typos in the head comment.

Will it help

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54315
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=28831

Thanks.

--
H.J.
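
For readers following the thread, the rule the quoted message gives for adjust_bitfield_address (keep the underlying object of the MEM only when the adjusted access is provably within its bounds, otherwise drop it) can be sketched in isolation.  This is a standalone simplification and not GCC code: struct mem_ref, its fields and adjust_keep_object_if_in_bounds are hypothetical names that only loosely mirror the MEM attributes mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the attributes of a MEM; not GCC's real
   data structure.  */
struct mem_ref
{
  const char *expr;  /* Underlying object, NULL if unknown.  */
  long offset;       /* Byte offset of the access within the object.  */
  long size;         /* Size in bytes of the access.  */
  long obj_size;     /* Size in bytes of the object, -1 if unknown.  */
};

/* Return a copy of REF moved by DELTA bytes and accessing NEW_SIZE bytes.
   The underlying object is kept only if the new access can be proved to
   lie entirely within it; otherwise it is dropped.  */
static struct mem_ref
adjust_keep_object_if_in_bounds (struct mem_ref ref, long delta,
                                 long new_size)
{
  assert (new_size >= 0);

  struct mem_ref out = ref;
  out.offset = ref.offset + delta;
  out.size = new_size;

  /* Nothing can be proved unless both the object and its size are known.  */
  bool known = ref.expr != NULL && ref.obj_size >= 0;
  bool in_bounds = known
                   && out.offset >= 0
                   && out.offset <= ref.obj_size - new_size;

  if (!in_bounds)
    {
      /* Be conservative: the access may touch adjacent data.  */
      out.expr = NULL;
      out.obj_size = -1;
    }

  return out;
}
```

With a 16-byte object, an 8-byte access at offset 8 keeps the object in this sketch, whereas an 8-byte access at offset 12 or any access at a negative offset drops it, which is the conservative outcome the original unconditional set_mem_expr (op0, 0) always produced.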