(I don't know whether this kind of question is allowed here; if not, please let me know.)

I know Intel has implemented several static branch prediction mechanisms over the years:

* 80486 era: always not-taken
* Pentium 4 era: backward taken / forward not-taken
* Pentium M, Core 2: no static prediction; the prediction is effectively random, depending on what happens to be in the corresponding BTB entry, according to Agner's optimization guide ¹.
* Newer CPUs like Ivy Bridge and Haswell have become increasingly hard to pin down, according to Matt G's experiment ².

And Intel doesn't seem to want to talk about it any more: the latest material I found in Intel's documentation was written about ten years ago.

I know static branch prediction is (far?) less important than dynamic prediction, but in quite a few situations the CPU will be completely lost, and the programmer (with the compiler) is usually the best guide. Of course, these situations are usually not performance bottlenecks, because once a branch is executed frequently, the dynamic predictor will capture it.

Since Intel no longer clearly documents its branch prediction mechanism, GCC's `__builtin_expect()` can do nothing more than move the unlikely branch off the hot path, or conversely keep the likely branch on it.

I am not familiar with CPU design and I don't know exactly what mechanism Intel uses nowadays for its static predictor. I just feel the best static mechanism for Intel would be to clearly document, for each CPU, "where I plan to go when the dynamic predictor fails, forward or backward", because the programmer is usually the best guide at that point.

APPENDIX:

¹ Agner's optimization guide: https://www.agner.org/optimize/microarchitecture.pdf , section 3.5.

² Matt G's experiment: https://xania.org/201602/bpu-part-two