Richard Biener <rguent...@suse.de> writes:
> On Tue, 24 Jun 2025, Richard Sandiford wrote:
>
>> Richard Biener <rguent...@suse.de> writes:
>> > On Tue, 24 Jun 2025, Richard Sandiford wrote:
>> >
>> >> Tamar Christina <tamar.christ...@arm.com> writes:
>> >> > store_bit_field_1 has an optimization where if a target is not a memory 
>> >> > operand
>> >> > and the entire value is being set from something larger we can just 
>> >> > wrap a
>> >> > subreg around the source and emit a move.
>> >> >
>> >> > For vector constructors this is however problematic because the subreg 
>> >> > means
>> >> > that the expansion of the constructor won't happen through vec_init 
>> >> > anymore.
>> >> >
>> >> > Complicated constructors which aren't natively supported by targets 
>> >> > then ICE as
>> >> > they wouldn't have been expanded so recog fails.
>> >> >
>> >> > This patch blocks the optimization on non-constant vector constructors. 
>> >> > Or non-uniform
>> >> > non-constant vectors. I allowed constant vectors because if I read the 
>> >> > code right
>> >> > simplify-rtx should be able to perform the simplification of pulling 
>> >> > out the element
>> >> > or merging the constant values.  There are several testcases in 
>> >> > aarch64-sve-pcs.exp
>> >> > that test this as well. I allowed uniform non-constant vectors because 
>> >> > they
>> >> > would be folded into a vec_select later on.
>> >> >
>> >> > Note that codegen is quite horrible, for what should only be an lsr.  
>> >> > But I'll
>> >> > address that separately so that this patch is backportable.
>> >> >
>> >> > Bootstrapped Regtested on aarch64-none-linux-gnu,
>> >> > arm-none-linux-gnueabihf, x86_64-pc-linux-gnu
>> >> > -m32, -m64 and no issues.
>> >> >
>> >> > Ok for master? and GCC 15, 14, 13?
>> >> 
>> >> I was discussing this Alex off-list last week, and the fix we talked
>> >> about there was:
>> >> 
>> >> diff --git a/gcc/explow.cc b/gcc/explow.cc
>> >> index 7799a98053b..8b138f54f75 100644
>> >> --- a/gcc/explow.cc
>> >> +++ b/gcc/explow.cc
>> >> @@ -753,7 +753,7 @@ force_subreg (machine_mode outermode, rtx op,
>> >>         machine_mode innermode, poly_uint64 byte)
>> >>  {
>> >>    rtx x = simplify_gen_subreg (outermode, op, innermode, byte);
>> >> -  if (x)
>> >> +  if (x && (!SUBREG_P (x) || REG_P (SUBREG_REG (x))))
>> >>      return x;
>> >>  
>> >>    auto *start = get_last_insn ();
>> >> 
>> >> The justification is that force_subreg is somewhat like a "subreg
>> >> version of force_operand", and so should try to avoid returning
>> >> subregs that force_operand would have replaced.  The force_operand
>> >> code I mean is:
>> >
>> > Yeah, in particular CONSTANT_P isn't sth documented as valid as
>> > subreg operands, only registers (and memory) are.  But isn't this
>> > then a bug in simplify_gen_subreg itself, that it creates a SUBREG
>> > of a non-REG/MEM?
>> 
>> I don't think the documentation is correct/up-to-date.  subreg is
>> de facto used as a general operation, and for example there are
>> patterns like:
>> 
>> (define_insn ""
>>   [(set (match_operand:QI 0 "general_operand_dst" 
>> "=rm,Za,Zb,Zc,Zd,Ze,Zf,Zh,Zg")
>>         (subreg:QI (lshiftrt:SI (match_operand:SI 1 "register_operand" 
>> "r,Z0,Z1,Z2,Z3,Z4,Z5,Z6,Z7")
>>                                 (const_int 16)) 3))
>>    (clobber (match_scratch:SI 2 "=&r,&r,&r,&r,&r,&r,&r,&r,&r"))
>>    (clobber (reg:CC CC_REG))]
>>   ""
>>   "mov.w\\t%e1,%f2\;mov.b\\t%w2,%R0"
>>   [(set_attr "length" "10")])
>
> I see.  Is the subreg for such define_insn generated by the middle-end
> though?

I assume it was written to match something that combine could generate.
Whether it still does in another question.

>> (from h8300).  This is also why simplify_gen_subreg has:
>> 
>>   if (GET_CODE (op) == SUBREG
>>       || GET_CODE (op) == CONCAT
>>       || GET_MODE (op) == VOIDmode)
>>     return NULL_RTX;
>> 
>>   if (MODE_COMPOSITE_P (outermode)
>>       && (CONST_SCALAR_INT_P (op)
>>        || CONST_DOUBLE_AS_FLOAT_P (op)
>>        || CONST_FIXED_P (op)
>>        || GET_CODE (op) == CONST_VECTOR))
>>     return NULL_RTX;
>> 
>> rather than the !REG_P (op) && !MEM_P (op) that the documentation
>> would imply.
>
> So maybe we can drop the MODE_COMPOSITE_P check here, as said on IRC
> we don't seem to ever legitmize constants wrapped in a SUBREG, so
> we shouldn't generate a SUBREG of a constant (in the middle-end)?

Hmm, yeah, maybe.  I'd originally rejected that because I assumed
the MODE_COMPOSITE_P was there for a reason.  But looking at the
history, the check came from c0f772894b6b3cd8ed5c5dd09d0c7917f51cf70f,
where the reason given was:

    As for the simplify_gen_subreg change, I think it would be desirable
    to just avoid creating SUBREGs of constants on all targets and for all
    constants, if simplify_immed_subreg simplified, fine, otherwise punt,
    but as we are late in GCC11 development, the patch instead guards this
    behavior on MODE_COMPOSITE_P (outermode) - i.e. only conversions to
    powerpc{,64,64le} double double long double - and only for the cases where
    simplify_immed_subreg was called.

Richard

Reply via email to