https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101723

            Bug ID: 101723
           Summary: arm: incorrect order of .fpu and .arch_extension
                    directives leads to unsupported instructions
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rearnsha at gcc dot gnu.org
  Target Milestone: ---

This bug was originally reported against GNU binutils
(https://sourceware.org/bugzilla/show_bug.cgi?id=28078), but is really a
problem with the way GCC emits the directives .fpu and .arch_extension.

Alok Parlikar 2021-07-12 04:50:04 UTC

I was trying to build tensorflow-lite v2.5 with a custom toolchain that was
using binutils 2.36.1. The build failed when building the xnnpack project with
an error:

/tmp/ccMSJOfk.s:380: Error: selected processor does not support `vsdot.s8 q12,q9,d11[0]' in ARM mode

Some of my notes about this issue are here:
https://github.com/google/XNNPACK/issues/1465#issuecomment-877910701

Following is a minimal example to reproduce this:

// file: test.c
#include <arm_neon.h>

int32x2_t test(int32x2_t a, int8x8_t b, int8x8_t c)
{
    return vdot_lane_s32(a, b, c, 1);
}
// EOF
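
For readers unfamiliar with the intrinsic in the reproducer, the following is a portable scalar sketch of what vdot_lane_s32 is understood to compute: each 32-bit result lane accumulates the dot product of four signed bytes from b with the four signed bytes of c selected by the lane index. The helper names dot_lane_s32_ref and dot_lane_s32_ref_check are hypothetical and are not part of arm_neon.h. The sketch works on plain arrays, does not model integer wraparound, and says nothing about the directive ordering problem itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of vdot_lane_s32(acc, b, c, lane).
   acc has two 32-bit lanes; b and c each hold eight signed bytes,
   viewed as two groups of four. 'lane' (0 or 1) picks the group of c
   that is reused for both result lanes. */
static void dot_lane_s32_ref(int32_t acc[2], const int8_t b[8],
                             const int8_t c[8], int lane)
{
    for (int i = 0; i < 2; i++) {
        int32_t sum = 0;
        for (int j = 0; j < 4; j++)
            sum += (int32_t)b[4 * i + j] * (int32_t)c[4 * lane + j];
        acc[i] += sum; /* assumption: inputs small enough not to overflow */
    }
}

/* Self-check with values worked out by hand for lane = 1, as in test.c. */
static void dot_lane_s32_ref_check(void)
{
    int32_t acc[2] = {10, 20};
    const int8_t b[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    const int8_t c[8] = {1, 1, 1, 1, 2, 2, 2, 2};

    dot_lane_s32_ref(acc, b, c, 1);
    assert(acc[0] == 30); /* 10 + 2*(1+2+3+4) */
    assert(acc[1] == 72); /* 20 + 2*(5+6+7+8) */
}
```

Calling dot_lane_s32_ref_check() from a test program exercises the lane = 1 case that the reproducer uses.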