https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108869
            Bug ID: 108869
           Summary: compiling an intrinsic wrapper : gives internal compiler
                    error: in dwarf2out_register_main_translation_unit
           Product: gcc
           Version: 7.5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jaydesh9 at gmail dot com
  Target Milestone: ---

Hi,

I have a generically built binary that needs to include a lookup routine. The routine should be compiled into vectorized instructions or not, depending on whether the CPU supports AVX/AVX2.

The lookup routine is the same as the one explained here:
https://stackoverflow.com/questions/54897297/check-all-bytes-of-a-m128i-for-a-match-of-a-single-byte-using-sse-avx-avx2

Here the intrinsic set (_mm_set1_epi8, _mm_cmpeq_epi8, _mm_movemask_epi8) will compile into vectorized instructions if AVX/AVX2 is supported by the CPU, or into plain SSE-based instructions otherwise.

In an oversimplified main.c, compiled without -mavx/-mavx2 and with -msse3 -msse4 -o 3:

    #define __SSE2__
    #define SSE_Lookup() \
        _mm_set1_epi8; \
        __mm_cmpeq_epi8; \
        match_bitmap=_mm_movemask_epi8
    #endif

    static inline __attribute__((always_inline)) uint64_t foo()
    {
        unsigned int a=1,b,c,d;
        uint64_t match_bitmap;
        __cpuid(1,a,b,c,d);
        if(c & bit_AVX) {
            match_bitmap= avx_lookup();
        }else {
    #if __SSE__
            SSE_Lookup();
    #endif
        }
    }

foo_avx.c:

    #include <emmintrin.h>
    //mimicing an intrinsic wrapper
    //don't want to create any new stack frames so keeping it inline
    extern __inline int __attribute__((__gnu_inline__, __always_inline__, __artificial__))
    __avx_lookup (char h, __m128i h)
    {
        __m128i k = _mm_set1_epi8(h);
        __m128i r = _mm_cmpeq_epi8(k,h);
        return _mm_movemask_epi8(r);
    }

GCC Bug report : Bugzilla : x86_64_gcc-7.5.0_glibc/bin/x86_64-openwrt-linux-gnu-gcc

Compiled as follows:

    /build/openwrt/toolchain-x86_64_gcc-7.5.0_glibc/bin/x86_64-openwrt-linux-gnu-gcc
      -fsigned-char -fmerge-constants -fPIC -pipe -MD -MP -Werror -Wall -Wextra
      -Wpointer-arith -Wreturn-type -Wwrite-strings -Wno-unused-parameter
      -Wunused-variable -Wno-unused-result -Wformat -Wformat-security -Winit-self
      -Wno-implicit-fallthrough -Wno-format-truncation -Wno-deprecated-declarations
      -Wno-error=deprecated-declarations -mtune=generic -msse3 -msse4 -mpclmul
      -maes -mcx16 -g -O0 -DBUILD_UT=1 -std=gnu99 -Wno-missing-field-initializers
      -Wmissing-prototypes -mavx -o 2 -o /build/x86_64/common/foo_avx.o foo_avx.c

    foo_avx.c:19:1: internal compiler error: in dwarf2out_register_main_translation_unit, at dwarf2out.c:26628
     }
     ^
    Please submit a full bug report,
    with preprocessed source if appropriate.
    See <http://bugs.openwrt.org/> for instructions.
    Makefile:72: recipe for target '/build/x86_64/common/foo_avx.o' failed
    make[3]: *** [/build/x86_64/common/foo_avx.o] Error 1

So the questions are:

1. Is this a genuine bug?
2. If not, is the approach to defining the intrinsic wrapper correct?
3. Is there a better way of doing this? The goal is to have a single executable with SSE, AVX, AVX2 and AVX512 code paths embedded, one of which is invoked at run time based on what the CPU supports.

Thanks in advance.
-J
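
For reference, the run-time dispatch goal in question 3 can be sketched with GCC's per-function target attribute and __builtin_cpu_supports, rather than a hand-written __cpuid check and a separately compiled -mavx file. This is only a minimal sketch under assumptions: the names lookup_sse2, lookup_avx and lookup are hypothetical and not from the report, the 16-byte block is passed by pointer for simplicity, and it does not address whether the ICE above is a genuine bug.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper (not from the report): 128-bit byte match,
 * compiled for baseline SSE2 regardless of command-line -m flags. */
__attribute__((target("sse2")))
static int lookup_sse2(char needle, const uint8_t *block16)
{
    __m128i hay = _mm_loadu_si128((const __m128i *)block16);
    __m128i key = _mm_set1_epi8(needle);
    /* Bit i of the result is set when byte i equals needle. */
    return _mm_movemask_epi8(_mm_cmpeq_epi8(key, hay));
}

/* Hypothetical helper: same intrinsics, but this one function is
 * compiled with AVX enabled, so the whole file does not need -mavx. */
__attribute__((target("avx")))
static int lookup_avx(char needle, const uint8_t *block16)
{
    __m128i hay = _mm_loadu_si128((const __m128i *)block16);
    __m128i key = _mm_set1_epi8(needle);
    return _mm_movemask_epi8(_mm_cmpeq_epi8(key, hay));
}

/* Hypothetical dispatcher: picks a variant at run time. */
int lookup(char needle, const uint8_t *block16)
{
    __builtin_cpu_init();               /* harmless if already initialized */
    if (__builtin_cpu_supports("avx"))
        return lookup_avx(needle, block16);
    return lookup_sse2(needle, block16);
}
```

With a 16-byte buffer that holds 'x' at offsets 3 and 15, lookup('x', buf) would return a bitmap with bits 3 and 15 set. The same pattern would extend to "avx2" or "avx512f" variants, assuming the compiler in use accepts those target strings.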