https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62642
            Bug ID: 62642
           Summary: x86 rdtsc is moved through barrier
           Product: gcc
           Version: 4.8.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: M8R-ynb11d at mailinator dot com

given:

unsigned long long measure(void (*func)(void))
{
    unsigned long long before = __builtin_ia32_rdtsc();
    asm volatile("" ::: "memory");
    func();
    asm volatile("" ::: "memory");
    unsigned long long after = __builtin_ia32_rdtsc();
    return after - before;
}

On x86 Linux with -O2, this results in the obviously useless code below, where
both rdtsc instructions are issued after the call, so the callee is not timed
at all:

measure:
        push    edi
        push    esi
        push    ebx
        call    [DWORD PTR [esp+16]]
        rdtsc
        mov     esi, eax
        mov     edi, edx
        rdtsc
        pop     ebx
        sub     esi, eax
        sbb     edi, edx
        mov     eax, esi
        mov     edx, edi
        pop     esi
        pop     edi
        ret

I can reproduce the problem on 32-bit x86 on Linux and MinGW with 4.8.2 and
4.9.1. (I guess 4.8.0 also exhibits the problem, but I don't have that
available for testing, so I set the version to 4.8.2 in the report.) 4.7.x and
4.6.x work correctly, as does 64-bit on both platforms.

If this is not the proper, sanctioned way to write this function, I'm all ears
for a better way.

I've also tried adding calls to __builtin_ia32_mfence(), which as I understand
it should not be necessary, and it gets even more comical:

        ...
        mfence
        call    [DWORD PTR [esp+16]]
        mfence
        rdtsc
        mov     esi, eax
        mov     edi, edx
        rdtsc
        ...
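
On the question of a better way to write this function: one commonly used alternative is to issue rdtsc from a volatile inline asm that carries its own "memory" clobber, so the counter read and the compiler barrier are a single statement rather than a builtin followed by a separate empty asm. The sketch below only illustrates that idea under stated assumptions. The names read_tsc, measure_asm and sample_work are hypothetical, it requires an x86 target and GCC-style inline asm, and it has not been checked against the affected 4.8.2/4.9.1 builds, so it should not be read as a confirmed workaround.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC builtin): read the time-stamp counter from
   a volatile asm that also clobbers "memory", so the read itself acts as a
   compiler-level barrier. rdtsc returns the low half in eax, the high half
   in edx. */
static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi) : : "memory");
    return ((uint64_t)hi << 32) | lo;
}

/* Same shape as measure() in the report, with the builtin plus empty asm
   replaced by the helper above. */
uint64_t measure_asm(void (*func)(void))
{
    uint64_t before = read_tsc();
    func();
    uint64_t after = read_tsc();
    return after - before;
}

/* Hypothetical callee used only to exercise measure_asm(). */
static volatile int work_calls;
static void sample_work(void)
{
    work_calls++;
}
```

Calling measure_asm(sample_work) returns the difference between the two counter reads. Note that this constrains only the compiler: rdtsc is not a serializing instruction, so the CPU may still execute it out of order relative to neighbouring instructions.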