https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119280
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
    res = a[0] * a[0];
    __asm volatile("rdcycle %0" : "=r"(end_cycle) : "r"(res) : "memory");

will cause the second rdcycle to stay after the mult, because the asm now takes res as an input operand and so cannot be placed before res is computed. Otherwise you could just write the full thing in assembly.

Anyway, this is documented and has been for a long time.
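
For illustration, here is a minimal sketch of how that suggestion might look in a complete function. The names read_counter, read_counter_after, and timed_square are hypothetical and do not come from the bug report. The non-RISC-V branch is only an assumed fallback, using clock() from <time.h>, so that the sketch compiles on other targets; it is not a cycle-accurate substitute for rdcycle.

```c
#include <time.h>

/* Read the counter at the start of the measured region.
 * The "memory" clobber keeps later loads (such as a[0]) after this point. */
static inline unsigned long read_counter(void)
{
    unsigned long now;
#if defined(__riscv)
    __asm__ volatile("rdcycle %0" : "=r"(now) : : "memory");
#else
    /* Assumed fallback for non-RISC-V hosts: coarse CPU time, not cycles. */
    now = (unsigned long)clock();
#endif
    return now;
}

/* Read the counter only after `dep` has been computed.
 * Listing `dep` as an input operand creates a data dependency, so the
 * compiler cannot schedule the asm before the computation of `dep`. */
static inline unsigned long read_counter_after(unsigned long dep)
{
    unsigned long now;
#if defined(__riscv)
    __asm__ volatile("rdcycle %0" : "=r"(now) : "r"(dep) : "memory");
#else
    /* Assumed fallback: an empty asm still consumes `dep` as an input. */
    __asm__ volatile("" : : "r"(dep) : "memory");
    now = (unsigned long)clock();
#endif
    return now;
}

/* Time a single multiplication; returns the product and stores the
 * counter difference in *elapsed. */
unsigned long timed_square(const unsigned long *a, unsigned long *elapsed)
{
    unsigned long start = read_counter();
    unsigned long res = a[0] * a[0];
    unsigned long end = read_counter_after(res);

    *elapsed = end - start;
    return res;
}
```

A caller would pass a pointer to the data and a location for the elapsed count, for example timed_square(a, &elapsed). The input operand only constrains where the compiler places the instruction; it says nothing about how the hardware orders or retires the multiply relative to the counter read.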