Vladimir Makarov <[email protected]> writes:
> On 1/11/25 1:15 PM, Denis Chertykov wrote:
>> The fix for PR117868.
>>
>> In brief:
>> this is an LRA bug derived from reuse spilling slots after frame
>> pointer spilling.
>> The slot was created for QImode (1 byte) and it was reused after
>> spilling of the
>> frame pointer for TImode register (16 bytes long) and it overlaps
>> other slots.
>>
> Denis, thank you for working on the PR, finding the reason for wrong
> code generation, and proposing the patch.
>>
>> Ok for trunk ?
>>
>>
> I'd prefer something like the patch in the attachment.
>
> It is simpler and even removing more LRA code than adding one.
>
> But most important, it generates smaller reserved stack space as QI
> and TI pseudos will share the same slot (of TImode) vs 2 slots QI and
> TI in your fix.
Your patch seems wrong to me.
The original code:
--------------------------part of lra-spill.cc -----------------------
slots_num = 0;
1 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
1 for (i = 0; i < n; i++)
1 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
1 assign_mem_slot (pseudo_regnos[i]);
- if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
{
/* Assign stack slots to spilled pseudos assigned to fp. */
2 assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
2 for (i = 0; i < n2; i++)
2 if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
2 assign_mem_slot (pseudo_regnos[i]);
}
----------------------------------------------------------------------
The original code consists of two passes of the stack allocation process
separated
by call `lra_update_fp2sp_elimination()'.
It's right because real allocation is made by `assign_mem_slot()' and
the AVR port used frame size as a trigger for frame pointer elimination.
(If we have a frame then we can't eliminate frame pointer)
The first pass allocate stack for spilled registers
after that LRA asks target about possibility of fp2sp elimination
(call `lra_update_fp2sp_elimination()')
and after that LRA allocates a memory slots for registers spilled in the
process of
spilling hard frame pointer.
In your patch call of `lra_update_fp2sp_elimination()' happened before
real stack allocation that will be done by `assign_mem_slot()' and information
about fp2sp elimination will be wrong.
Denis.
>
> I'll test my patch and submit it on the next week.
>
>>
>> PR rtl-optimization/117868
>> gcc/
>> * lra-spills.cc (assign_stack_slot_num_and_sort_pseudos): Reuse slots
>> only without allocated memory or only with equal or smaller registers
>> with equal or smaller alignment.
>> (lra_spill): Print slot size as width.
>>
>>
>> diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
>> index db78dcd28a3..93a0c92db9f 100644
>> --- a/gcc/lra-spills.cc
>> +++ b/gcc/lra-spills.cc
>> @@ -386,7 +386,18 @@ assign_stack_slot_num_and_sort_pseudos (int
>> *pseudo_regnos, int n)
>> && ! (lra_intersected_live_ranges_p
>> (slots[j].live_ranges,
>> lra_reg_info[regno].live_ranges)))
>> - break;
>> + {
>> + /* A slot without allocated memory can be shared. */
>> + if (slots[j].mem == NULL_RTX)
>> + break;
>> +
>> + /* A slot with allocated memory can be shared only with equal
>> + or smaller register with equal or smaller alignment. */
>> + if (slots[j].align >= spill_slot_alignment (mode)
>> + && compare_sizes_for_sort (slots[j].size,
>> + GET_MODE_SIZE (mode)) != -1)
>> + break;
>> + }
>> }
>> if (j >= slots_num)
>> {
>> @@ -656,8 +667,7 @@ lra_spill (void)
>> for (i = 0; i < slots_num; i++)
>> {
>> fprintf (lra_dump_file, " Slot %d regnos (width = ", i);
>> - print_dec (GET_MODE_SIZE (GET_MODE (slots[i].mem)),
>> - lra_dump_file, SIGNED);
>> + print_dec (slots[i].size, lra_dump_file, SIGNED);
>> fprintf (lra_dump_file, "):");
>> for (curr_regno = slots[i].regno;;
>> curr_regno = pseudo_slots[curr_regno].next - pseudo_slots)
>>
>
> diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
> index db78dcd28a3..e0d88e86c18 100644
> --- a/gcc/lra-spills.cc
> +++ b/gcc/lra-spills.cc
> @@ -635,17 +635,12 @@ lra_spill (void)
> n = assign_spill_hard_regs (pseudo_regnos, n);
> slots_num = 0;
> assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n);
> - for (i = 0; i < n; i++)
> + if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos + n)) > 0)
> + /* Assign stack slots to spilled pseudos assigned to fp. */
> + assign_stack_slot_num_and_sort_pseudos (pseudo_regnos + n, n2);
> + for (i = 0; i < n + n2; i++)
> if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
> assign_mem_slot (pseudo_regnos[i]);
> - if ((n2 = lra_update_fp2sp_elimination (pseudo_regnos)) > 0)
> - {
> - /* Assign stack slots to spilled pseudos assigned to fp. */
> - assign_stack_slot_num_and_sort_pseudos (pseudo_regnos, n2);
> - for (i = 0; i < n2; i++)
> - if (pseudo_slots[pseudo_regnos[i]].mem == NULL_RTX)
> - assign_mem_slot (pseudo_regnos[i]);
> - }
> if (n + n2 > 0 && crtl->stack_alignment_needed)
> /* If we have a stack frame, we must align it now. The stack size
> may be a part of the offset computation for register