https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91131

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Per Dalgas Jakobsen from comment #9)
> (In reply to Richard Biener from comment #8)
> > Fixed on trunk so far.
> > 
> > Note the non-optimal code-gen probably was a side-effect of us making
> > three volatile accesses out of one.  On x86 I now see
> > 
> > main:
> > .LFB0:
> >         .cfi_startproc
> >         movl    $0, Reg_A(%rip)
> >         xorl    %eax, %eax
> >         movl    $8, Reg_B(%rip)
> >         movl    $255, Reg_C(%rip)
> >         movb    $0, Reg_D(%rip)
> >         movb    $-1, Reg_E(%rip)
> >         ret
> 
> That's as efficient as it gets :)
> 
> On AVR architecture I now get this:
> avr-gcc -O0:
>       andi    r25, 0xF8       ; 248
>       andi    r25, 0xF7       ; 247
>       andi    r25, 0x0F       ; 15
>       sts     0x0064, r25     ; 0x800064 <Reg_A>
>       andi    r24, 0xF8       ; 248
>       ori     r24, 0x08       ; 8
>       andi    r24, 0x0F       ; 15
>       sts     0x0065, r24     ; 0x800065 <Reg_B>
>       lds     r24, 0x0060     ; 0x800060 <__data_start>
>       sts     0x0063, r24     ; 0x800063 <Reg_C>
>       sts     0x0066, r1      ; 0x800066 <Reg_D>
>       ldi     r24, 0xFF       ; 255
>       sts     0x0062, r24     ; 0x800062 <__data_end>
> 
> avr-gcc -O3 (and -Os, -O1, -O2):
>       sts     0x0064, r1      ; 0x800064 <Reg_A>
>       ldi     r24, 0x08       ; 8
>       sts     0x0065, r24     ; 0x800065 <Reg_B>
>       lds     r24, 0x0060     ; 0x800060 <__data_start>
>       sts     0x0063, r24     ; 0x800063 <Reg_C>
>       sts     0x0066, r1      ; 0x800066 <Reg_D>
>       ldi     r24, 0xFF       ; 255
>       sts     0x0062, r24     ; 0x800062 <__data_end>
> 
> Nice improvement, but Reg_C is still loading from memory. Is it possible to
> get that into an immediate as well?

No idea - that's a question for the architecture maintainers.  It might
be that not every constant can be encoded as an immediate.  Note that RTL
needs to get rid of the temporary, since we expand from

  struct Reg_T D.1954;
  struct Reg_T D.1953;
  struct Reg_T D.1952;

  <bb 2> [local count: 1073741824]:
  MEM <unsigned char> [(struct Reg_T *)&D.1952] = 0;
  Reg_A ={v} D.1952;
  MEM <unsigned char> [(struct Reg_T *)&D.1953] = 8;
  Reg_B ={v} D.1953;
  MEM <unsigned char> [(struct Reg_T *)&D.1954] = 255;
  Reg_C ={v} D.1954;
  Reg_D ={v} 0;
  Reg_E ={v} 255;
  return 0;

Here we're too conservative in optimizing the volatile
aggregate copies (only their LHS is volatile).  But we only have
the quite late store-merging pass to merge the individual bitfield
accesses (though I can see VN doing that as well).
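
For reference, a minimal sketch of the kind of source that could lead to
a dump like the one above.  The original test case is not quoted in this
comment, so the field names and widths (3 + 1 + 4 bits, guessed from the
AVR -O0 masks 0xF8, 0xF7 and 0x0F) and the helpers write_regs() and
reg_byte() are assumptions, not the reporter's code.  Bit-field layout is
implementation-defined; the byte values noted below assume GCC's usual
allocation from bit 0 on a little-endian target.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical register layout: 3 + 1 + 4 bits packed into one byte.
   unsigned char bit-fields are a GCC-supported extension. */
struct Reg_T {
    unsigned char low  : 3;
    unsigned char flag : 1;
    unsigned char high : 4;
};

volatile struct Reg_T Reg_A, Reg_B, Reg_C;
volatile unsigned char Reg_D, Reg_E;

/* Each aggregate assignment builds a non-volatile temporary (the D.19xx
   objects in the dump) and copies it to the volatile register, so only
   the left-hand side of the copy is volatile. */
void write_regs(void)
{
    Reg_A = (struct Reg_T){ .low = 0, .flag = 0, .high = 0 };
    Reg_B = (struct Reg_T){ .low = 0, .flag = 1, .high = 0 };
    Reg_C = (struct Reg_T){ .low = 7, .flag = 1, .high = 15 };
    Reg_D = 0;
    Reg_E = 255;
}

/* Return the raw byte behind a register by copying it to a
   non-volatile temporary first. */
unsigned char reg_byte(volatile struct Reg_T *reg)
{
    struct Reg_T copy = *reg;      /* one volatile aggregate read */
    unsigned char byte;

    assert(sizeof copy == 1);      /* the sketch assumes a one-byte struct */
    memcpy(&byte, &copy, 1);
    return byte;
}
```

With that assumed layout, reading back Reg_A, Reg_B and Reg_C after
write_regs() would give the bytes 0, 8 and 255, matching the constants
stored to the temporaries in the dump.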

> I'm slightly surprised that -O0 shows the setting of individual fields. It's
> certainly not a bug, perhaps not even an issue, and absolutely something I
> can live with :)

Yeah, most importantly the actual Reg_* are each stored to just once, so
correctness-wise -O0 is fine.
