http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53100
--- Comment #3 from Marc Glisse <marc.glisse at normalesup dot org> 2012-05-01 12:47:03 UTC ---
(In reply to comment #2)
> and not to introduce them just before an optimization that removes them.

Usually, doing (long)num1*(__int128)(long)num2 does the right thing. In the
example here, I tried replacing the plain __int128 multiplications with:

inline bool g1(__int128 x){
  //return(x<=LONG_MAX)&&(x>=LONG_MIN);
  //on 2 lines because of PR30318, unless you apply the patch I posted there
  bool b1 = x<=LONG_MAX;
  bool b2 = x>=LONG_MIN;
  return b1&&b2;
}
inline __int128 mul(__int128 a,__int128 b){
  bool B=g1(a)&&g1(b);
  if(__builtin_constant_p(B)&&B)
    return (long)a*(__int128)(long)b;
  return a*b;
}

__builtin_constant_p does detect that we are in the right case. However, because
of bad timing between the various optimizations, the double cast
(__int128)(long)(u-x) is simplified to just (u-x) before it gets a chance to
help. I need to replace the subtraction instead of (or in addition to) the
multiplication:

inline __int128 sub(__int128 a,__int128 b){
  bool B=g1(a)&&g1(b)&& g1(a-b);
  if(__builtin_constant_p(B)&&B)
    return (long)a-(long)b;
  return a-b;
}

But it would fit better inside the compiler than as a fragile use of
__builtin_constant_p.
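
For anyone who wants to try this, here is a rough, compilable sketch of the two helpers combined. It assumes GCC or Clang on a target that provides __int128 (for example x86_64 with a 64-bit long). fits_long is just g1 renamed, and diff_product is a hypothetical caller made up for illustration, not the testcase from this PR. Whether the narrow branch is actually taken depends on the optimization level and pass ordering; the result is the same either way.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumption: __int128 is strictly wider than long (true on LP64 targets). */
static_assert(sizeof(__int128) > sizeof(long), "__int128 must be wider than long");

/* True when x is representable as a long. Written as two statements,
   mirroring the PR30318 workaround from the comment above. */
static inline bool fits_long(__int128 x) {
    bool not_above = x <= LONG_MAX;
    bool not_below = x >= LONG_MIN;
    return not_above && not_below;
}

/* Multiply two __int128 values. If the compiler can prove at compile time
   that both operands fit in long, spell the product as a long x long
   widening multiply; otherwise fall back to the full 128-bit multiply. */
static inline __int128 mul(__int128 a, __int128 b) {
    bool narrow = fits_long(a) && fits_long(b);
    if (__builtin_constant_p(narrow) && narrow)
        return (long)a * (__int128)(long)b;
    return a * b;
}

/* Subtract two __int128 values. The narrow form is only used when the
   operands and the difference are all known to fit in long, so the
   long subtraction cannot overflow. */
static inline __int128 sub(__int128 a, __int128 b) {
    bool narrow = fits_long(a) && fits_long(b) && fits_long(a - b);
    if (__builtin_constant_p(narrow) && narrow)
        return (long)a - (long)b;
    return a - b;
}

/* Hypothetical caller: (u - x) * (v - y) computed without overflow. */
__int128 diff_product(long x, long y, long u, long v) {
    return mul(sub(u, x), sub(v, y));
}
```

Both branches of mul and sub compute the same value, so the helpers are safe to drop in; the only question is whether the optimizer sees the narrow form early enough to benefit, which is the fragility mentioned above.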