https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96403
Bug ID: 96403
Summary: [nvptx] Less optimal code in v2si-cvt.c after setting TARGET_TRULY_NOOP_TRUNCATION to false
Product: gcc
Version: 11.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: vries at gcc dot gnu.org
Target Milestone: ---

[ I've rewritten the v2si-cvt.c source to something more minimal:
...
__v2si __attribute__((unused))
vector_cvt (__v2si arg)
{
  unsigned short *p = (unsigned short*)&arg;

  volatile unsigned short s = p[0];

  return arg;
}

__v2si __attribute__((unused))
vector_cvt_2 (__v2si arg)
{
  unsigned char *p = (unsigned char*)&arg;

  volatile unsigned char s = p[0];

  return arg;
}
...
]

When changing TARGET_TRULY_NOOP_TRUNCATION to false, we have a regression in v2si-cvt.c, this for vector_cvt:
...
-       cvt.u16.u32     %r27, %r25.x;
+       mov.b64 %r26, %r25;
+       cvt.u16.u64     %r27, %r26;
...
and this for vector_cvt_2:
...
-       cvt.u32.u32     %r27, %r25.x;
-       st.u8   [%frame], %r27;
+       mov.b64 %r26, %r25;
+       cvt.u32.u64     %r27, %r26;
+       cvt.u16.u8      %r32, %r27;
+       mov.u16 %r29, %r32;
+       cvt.u32.u16     %r30, %r29;
+       st.u8   [%frame], %r30;
...
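
For reference, a minimal host-side sketch of the two ways the vector_cvt diff computes the same value, as I read it: the old code truncates element 0 of the vector directly (%r25.x), the new code first moves the whole vector into a 64-bit register and truncates that. The typedef is an assumption (the minimal source above does not show it), the names v2si, low16_via_element and low16_via_u64 are made up for illustration, and the equivalence relies on a little-endian layout.
...
```c
#include <stdint.h>
#include <string.h>

/* Assumption: a vector of two 32-bit ints; the report does not show
   the actual __v2si typedef.  */
typedef int v2si __attribute__ ((vector_size (8)));

/* Truncate element 0 directly, in the spirit of the old
   "cvt.u16.u32 %r27, %r25.x".  */
static unsigned short
low16_via_element (v2si arg)
{
  return (unsigned short) arg[0];
}

/* Copy the whole vector into a 64-bit integer and truncate that, in the
   spirit of the new "mov.b64" + "cvt.u16.u64" pair.  memcpy is used
   instead of a pointer cast to stay clear of aliasing issues.  */
static unsigned short
low16_via_u64 (v2si arg)
{
  uint64_t bits;
  memcpy (&bits, &arg, sizeof bits);
  return (unsigned short) bits;
}
```
...
On a little-endian host both functions return the low 16 bits of element 0, so the extra 64-bit move in the new code does not change the result, only the instruction count.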