https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96401
Bug ID: 96401
Summary: [nvptx] Take advantage of subword ld/st/cvt
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

Consider test-case test.c:
...
void
foo (void)
{
  volatile unsigned int v;
  volatile unsigned short v2;
  v2 = v;
}
...

With the current compiler, we have:
...
$ gcc test.c -S -o- -O2
...
        .reg.u32 %r22;
        .reg.u16 %r24;
                ld.u32  %r22, [%frame];
                cvt.u16.u32     %r24, %r22;
                st.u16  [%frame+4], %r24;
}
...

As it happens, the nvptx manual states at 5.2.2 "Restricted Use of Sub-Word
Sizes":
...
For convenience, ld, st, and cvt instructions permit source and destination
data operands to be wider than the instruction-type size, so that narrow values
may be loaded, stored, and converted using regular-width registers. For
example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit
registers when being loaded, stored, or converted to other types and sizes.
...

In other words, we may emit instead:
...
        .reg.u32 %r22;
                ld.u32  %r22, [%frame];
                st.u16  [%frame+4], %r22;
...
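
To illustrate why dropping the cvt should be safe, here is a minimal host-side C sketch. It assumes that st.u16 with a 32-bit source register stores only the low 16 bits of that register, which is how I read the manual passage quoted above. The helpers cvt_u16_u32, st_u16, st_u16_from_u32, same_result and self_check are hypothetical models written for this note; they are not GCC or PTX interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of cvt.u16.u32: truncate a 32-bit register to 16 bits. */
static uint16_t cvt_u16_u32(uint32_t reg)
{
    return (uint16_t)reg;
}

/* Hypothetical model of st.u16 with a 16-bit source register. */
static void st_u16(unsigned char *addr, uint16_t reg)
{
    memcpy(addr, &reg, sizeof reg);
}

/* Hypothetical model of st.u16 with a 32-bit source register.
   Assumption: only the low 16 bits of the register reach memory. */
static void st_u16_from_u32(unsigned char *addr, uint32_t reg)
{
    uint16_t low = (uint16_t)(reg & 0xffffu);
    memcpy(addr, &low, sizeof low);
}

/* Returns 1 if "cvt then st" and "st directly from the wide register"
   leave the same two bytes in memory, 0 otherwise. */
static int same_result(uint32_t v)
{
    unsigned char with_cvt[2];
    unsigned char without_cvt[2];

    st_u16(with_cvt, cvt_u16_u32(v));   /* current code: cvt.u16.u32 + st.u16 */
    st_u16_from_u32(without_cvt, v);    /* proposed code: st.u16 from %r22 */

    return memcmp(with_cvt, without_cvt, sizeof with_cvt) == 0;
}

/* Small self-check over a few sample register values. */
static void self_check(void)
{
    assert(same_result(0x00000000u));
    assert(same_result(0x0000ffffu));
    assert(same_result(0x12345678u));
    assert(same_result(0xffffffffu));
}
```

Under that assumption the two sequences are equivalent for every input, since both reduce to storing v & 0xffff. This only models the store side of the test-case; it says nothing about how the nvptx backend would need to change to emit the shorter sequence.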