Hi! On Mon, Jun 08, 2020 at 02:22:23PM +0800, luoxhu wrote: > On 2020/6/3 04:32, Segher Boessenkool wrote: > > On Tue, Jun 02, 2020 at 04:41:50AM -0500, Xionghu Luo wrote: > >> + if (TREE_CODE (type) == RECORD_TYPE > >> + && rs6000_discover_homogeneous_aggregate (TYPE_MODE (type), type, > >> NULL, > >> + NULL)) > >> + { > >> + tree field_type = TREE_TYPE (first_field (type)); > >> + if (field_type && TREE_CODE (field_type) == ARRAY_TYPE > >> + && TREE_CODE (TREE_TYPE (field_type)) == REAL_TYPE) > >> + elt_mode = TYPE_MODE (TREE_TYPE (field_type)); > >> + } > > > > Homogeneous aggregates only exist in the ELFv2 ABI, while the problem > > here is the SP float things. You also noticed (elsewhere) that if the > > struct contains (say) SI, SF, SI, SF, then this does not help. > > > > Is there some better condition this could use, and maybe an expansion > > that works in more cases as well? > > > > And, it would be lovely if generic code could expand to something better > > already (not expand to a block move at all, certainly not for something > > as tiny as this). > > This pr65421 is array in structure, the assignment is just struct to struct > and won't be split by SROA to element assignment like struct contains <SF, SF> > or <SI, SF>.
Yes, but it is a very small, fixed-length array (4 SP floats in PR65421). So what we do currently is not very good. > For pr65421, the arguments and return value are accessed by BLKmode after > gimplify, > since there is no IPA pass, it is never changed from pass gimple to expand. And that is a problem (not only on Power, but everywhere that likes registers more than it likes memory... basically everywhere). > In expander, the type conversion only happens on expand_assignment of "D.2909 > = a;" > (arguments assigned to local variable, stack to stack, generated by > expand_block_move, > insn #13~#20 as followed), the expand_function_start(insn #2~#9) load each > element type > to be DF already, DF to DI conversion in insn #13~#20 cause the later RTL > passes fail to > do forward propagation in 246r.fwprop1. So my patch tries to use the actual > type for > array in structure here. If rs6000_discover_homogeneous_aggregate is not > allowed to be > used here, how about expose and call rs6000_aggregate_candidate directly? I think it would be best if generic code did not force this into memory at all. > Not clear why > "Homogeneous aggregates only exist in the ELFv2 ABI" since double array in > structure is a > common usage? Homogeneous arguments are a concept from the ELFv2 ABI. This macro return false if not (TARGET_HARD_FLOAT && DEFAULT_ABI == ABI_ELFv2 && type && AGGREGATE_TYPE_P (type)) On other ABIs we have the exact same problem, so a good solution would fix it for all such ABIs (but even for all targets if possible). > To bypass block move requires very generic code change, Yes. > the BLK mode is determined very early > in gimple, remove BLKmode seems huge project in > stor-layout.c\function.c\expr.c etc. and not sure > other targets like it, the ARM64 use OImode register to avoid BLKmode/stack > operations, while > X86 expand to two TImode register assignment and pointer result return. To get good code they will have to change it to the proper access mode in some later pass. > Or do you mean some workaround that don't call emit_block_move to fall in > expand_block_move in > rs6000-string.c when expand_assignment of "D.2909 = a;" below? > rtx > store_expr (tree exp, rtx target, int call_param_p, > bool nontemporal, bool reverse) > { > ... > else if (GET_CODE (temp) == PARALLEL) > emit_group_store (target, temp, TREE_TYPE (exp), > int_size_in_bytes (TREE_TYPE (exp))); > else if (GET_MODE (temp) == BLKmode) > emit_block_move (target, temp, expr_size (exp), > (call_param_p > ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL)) > ... > } We can cetainly do workarounds in the target code if the generic code cannot be fixed, or other targets are against this for some reason -- but let's try to do the proper thing first? Thanks, Segher