Dear Richard,

Thank you so much for your reply. I submitted the patch for the third case to LLVM before I received your reply, and they said the same thing: the pattern would probably be used outside of loops as well, where it would incur a branch misprediction, so it should be implemented only at the level of loop code generation (because the branch predictor can handle it inside loops).

I didn't know that the second pattern would cause disassociation from division. Surprisingly, LLVM has that pattern in its match.pd equivalent, but what you've said makes more sense.

For the first pattern, I verified that trunk GCC optimizes away the "%=" in the following code, as you said:

    void d(unsigned x)
    {
        if (x >= 5)
            __builtin_unreachable();
        x %= 5;
        g(x);
    }

However, the following code gets optimized only with the patch, which is surprising:

    void a(void)
    {
        unsigned m = 0;
        for (int i = 0; i < 300; i++) {
            m++;
            m %= 600;
            g(m);
        }
    }

Thanks,
MCCCS
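
P.S. To spell out why the "%=" in the loop example looks redundant: below is a minimal sketch, assuming the external g() is replaced by a hypothetical summing stub so that the two variants can be compared. Since m only takes the values 1 through 300, it never reaches 600, so the loop with the modulo and the loop without it should behave identically.

```c
#include <assert.h>

/* Hypothetical stand-in for the external g() used in the examples:
   it accumulates its arguments so the two loops can be compared. */
static unsigned long sink;
static void g(unsigned x) { sink += x; }

/* The loop as written in the example, with the modulo. */
static unsigned long with_mod(void)
{
    unsigned m = 0;
    sink = 0;
    for (int i = 0; i < 300; i++) {
        m++;
        m %= 600;
        g(m);
    }
    return sink;
}

/* The same loop with the modulo removed. The assert documents the
   range fact that makes the removal valid: m stays below 600. */
static unsigned long without_mod(void)
{
    unsigned m = 0;
    sink = 0;
    for (int i = 0; i < 300; i++) {
        m++;
        assert(m < 600);
        g(m);
    }
    return sink;
}
```

Calling with_mod() and without_mod() and comparing the results checks the equivalence for this particular loop only; it says nothing about how either compiler derives the range of m.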