https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96401

            Bug ID: 96401
           Summary: [nvptx] Take advantage of subword ld/st/cvt
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vries at gcc dot gnu.org
  Target Milestone: ---

Consider test-case test.c:
...
void
foo (void)
{
  volatile unsigned int v;
  volatile unsigned short v2;
  v2 = v;
}
...

With the current compiler, we have:
...
$ gcc test.c -S -o- -O2
  ...
        .reg.u32 %r22;
        .reg.u16 %r24;
                ld.u32  %r22, [%frame];
                cvt.u16.u32     %r24, %r22;
                st.u16  [%frame+4], %r24;
}
...

As it happens, the PTX ISA manual states in section 5.2.2, "Restricted Use of
Sub-Word Sizes":
...
For convenience, ld, st, and cvt instructions permit source and destination
data operands to be wider than the instruction-type size, so that narrow values
may be loaded, stored, and converted using regular-width registers. For
example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit
registers when being loaded, stored, or converted to other types and sizes.
...

In other words, the cvt and its extra .u16 register are not needed, and we
could emit instead:
...
        .reg.u32 %r22;
                ld.u32  %r22, [%frame];
                st.u16  [%frame+4], %r22;
...
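
As a sanity check on why the shorter sequence should be safe, here is a minimal
C sketch that models both sequences on the host. It rests on one assumption
that the quoted paragraph does not spell out: a st.u16 with a 32-bit source
register stores only the low 16 bits of that register. The function names
(store_via_cvt, store_direct, check_sample_values) are made up for this
illustration and are not part of GCC or PTX.

```c
#include <assert.h>
#include <stdint.h>

/* Models the current output: cvt.u16.u32 into a 16-bit register,
   followed by st.u16 from that register.  */
uint16_t
store_via_cvt (uint32_t r22)
{
  uint16_t r24 = (uint16_t) r22;  /* cvt.u16.u32 %r24, %r22 */
  return r24;                     /* st.u16 [%frame+4], %r24 */
}

/* Models the proposed output: st.u16 directly from the 32-bit register.
   ASSUMPTION: the store keeps only the low 16 bits of the source.  */
uint16_t
store_direct (uint32_t r22)
{
  return (uint16_t) (r22 & 0xffffu);  /* st.u16 [%frame+4], %r22 */
}

/* Compares both models on a few boundary values.  */
void
check_sample_values (void)
{
  static const uint32_t samples[] = {
    0u, 1u, 0xffffu, 0x10000u, 0x12345678u, 0xffffffffu
  };
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    assert (store_via_cvt (samples[i]) == store_direct (samples[i]));
}
```

Under that assumption the two models agree for every input, since the unsigned
conversion to 16 bits is itself a reduction modulo 2^16.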
