"Robin Dapp" <rdapp....@gmail.com> writes:
>> > IMO, what ought to happen here is that the RA should spill
>> > the inner register to memory and load the V4SI back from there.
>> > (Or vice versa, for an lvalue.)  Obviously that's not very efficient,
>> > and so a patch like the above might be useful as an optimisation.[*]
>> > But it shouldn't be needed for correctness.  The target-independent
>> > code should already have the information it needs to realise that
>> > it can't predict the register index at compile time (at least for SVE).
>>
>> Or actually, for that case:
>>
>>   /* For pseudo registers, we want most of the same checks.  Namely:
>>
>>      Assume that the pseudo register will be allocated to hard registers
>>      that can hold REGSIZE bytes each.  If OSIZE is not a multiple of REGSIZE,
>>      the remainder must correspond to the lowpart of the containing hard
>>      register.  If BYTES_BIG_ENDIAN, the lowpart is at the highest offset,
>>      otherwise it is at the lowest offset.
>>
>>      Given that we've already checked the mode and offset alignment,
>>      we only have to check subblock subregs here.  */
>>   if (maybe_lt (osize, regsize)
>>       && ! (lra_in_progress && (FLOAT_MODE_P (imode) || FLOAT_MODE_P (omode))))
>>     {
>>       /* It is invalid for the target to pick a register size for a mode
>>       that isn't ordered wrt to the size of that mode.  */
>>       poly_uint64 block_size = ordered_min (isize, regsize);
>>       unsigned int start_reg;
>>       poly_uint64 offset_within_reg;
>>       if (!can_div_trunc_p (offset, block_size, &start_reg, &offset_within_reg)
>>           ...
>>
>
> Like aarch64 we set REGMODE_NATURAL_SIZE for fixed-size modes to
> UNITS_PER_WORD.  Isn't that part of the problem?
>
> In extract_bit_field_as_subreg we check lowpart_bit_field_p (= true because
> 128 is a multiple of UNITS_PER_WORD).  This leads to the subreg expression.
>
> If I have REGMODE_NATURAL_SIZE return a VLA number this fails and we extract
> via memory - but that of course breaks almost everything else :)
>
> When you say the target-independent code should already have all
> information it needs, what are you referring to?  Something else than
> REGMODE_NATURAL_SIZE?

In the aarch64 example I mentioned, the REGMODE_NATURAL_SIZE of the inner
mode is variable.  (The REGMODE_NATURAL_SIZE of the outer mode is constant,
but it's the inner mode that matters here.)
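
As a rough illustration of why a variable natural size matters, here is a simplified model of the division in the quoted check. It is only a sketch under stated assumptions: `Poly` and `can_div_trunc` are hypothetical stand-ins, not GCC's poly_int or can_div_trunc_p, and a size is modelled as c0 + c1 * x for some runtime x >= 0. With a fixed 16-byte block, byte offset 16 always starts register 1. With a 16 + 16x byte block, the same offset starts register 1 when x == 0 but lies inside register 0 when x == 1, so no compile-time register index exists.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly value:
// it represents c0 + c1 * x for some runtime x >= 0.
struct Poly {
  int64_t c0;
  int64_t c1;
};

// True if c0 + c1 * x >= 0 for every x >= 0.
static bool known_ge_zero(Poly p) { return p.c0 >= 0 && p.c1 >= 0; }

// True if c0 + c1 * x > 0 for every x >= 0.
static bool known_gt_zero(Poly p) { return p.c0 > 0 && p.c1 >= 0; }

// Simplified model of a truncating division with a constant quotient:
// succeed only if a single q satisfies q * b <= a < (q + 1) * b for
// every x >= 0.  The x == 0 case fixes the only possible candidate.
static bool can_div_trunc(Poly a, Poly b, int64_t *quotient, Poly *remainder) {
  if (a.c0 < 0 || b.c0 <= 0 || b.c1 < 0)
    return false;
  int64_t q = a.c0 / b.c0;
  Poly rem = {a.c0 - q * b.c0, a.c1 - q * b.c1};  // a - q * b
  Poly slack = {b.c0 - rem.c0, b.c1 - rem.c1};    // (q + 1) * b - a
  if (!known_ge_zero(rem) || !known_gt_zero(slack))
    return false;
  *quotient = q;
  *remainder = rem;
  return true;
}

// Usage: compare a fixed-size block with a variable-size one.
static void check_examples() {
  int64_t start_reg = -1;
  Poly offset_within_reg = {0, 0};
  Poly fixed_block = {16, 0};      // always 16 bytes
  Poly variable_block = {16, 16};  // 16 + 16x bytes (assumed SVE-like)

  // Fixed block: byte offset 16 is always the start of register 1.
  assert(can_div_trunc({16, 0}, fixed_block, &start_reg, &offset_within_reg));
  assert(start_reg == 1 && offset_within_reg.c0 == 0);

  // Variable block: offset 0 is still predictable...
  assert(can_div_trunc({0, 0}, variable_block, &start_reg, &offset_within_reg));
  assert(start_reg == 0);

  // ...but constant offset 16 is not, since the answer depends on x.
  assert(!can_div_trunc({16, 0}, variable_block, &start_reg, &offset_within_reg));
}
```

In this model the failing case is the one where a spill through memory would be the fallback, since the register index can't be chosen at compile time.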

Thanks,
Richard
