https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863
--- Comment #9 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Li Pan from comment #8)
> Thanks Richard.
> Yes, the .SAT_TRUNC doesn't pay any attention the other possible use of
> MIN_EXPR.
>
> As your suggestion, we may need one additional check here (like
> gimple_unsigned_sat_trunc() && no_other_MIN_EXPR_use_after_sat_trunc_p ())
> before we build the SAT_TRUNC call.
> Sorry I didn't get the point here why we need to do this, could you please
> help to explain a bit more about it? Like wrong code or something else in
> above sample code.

The wrong-code bug is now fixed (it was an x86 target-specific oversight in
the expander), but while fixing the original bug, I noticed that the addition
of the ustrunc{m}{n}2 optab regressed the compress2 loop performance-wise.

Without ustrunc{m}{n}2, the loop in compress2 looks like:

  <bb 5> [local count: 536870912]:
  _18 = MIN_EXPR <left_8, 4294967295>;
  iftmp.0_11 = (unsigned int) _18;
  stream.avail_out = iftmp.0_11;
  left_37 = left_8 - _18;

and when ustrunc{m}{n}2 is present in i386.md:

  <bb 5> [local count: 536870912]:
  _45 = MIN_EXPR <left_8, 4294967295>;
  iftmp.0_11 = .SAT_TRUNC (left_8);
  stream.avail_out = iftmp.0_11;
  left_37 = left_8 - _45;

In the first case, iftmp.0_11 is calculated with a simple truncation from
unsigned long to unsigned int (i.e. "mov %eax, %edx" on x86). In the second
case, it uses the .SAT_TRUNC optab, which on x86 expands to a complex sequence
of instructions. Because MIN_EXPR is still needed for the subtraction, the
saturation is effectively computed twice.

Performance-wise, it is universally better to have a "normal" truncation after
MIN_EXPR than a saturating .SAT_TRUNC truncation.
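
For illustration, the two GIMPLE forms above correspond roughly to the C below. This is a minimal sketch, not the actual compress2 source: `struct stream`, `step_min_then_trunc`, `step_sat_trunc`, `sat_trunc_u64_u32` and `check_same` are hypothetical names, and `uint64_t`/`uint32_t` stand in for unsigned long/unsigned int on an LP64 target. Both steps store the same value; the second merely recomputes the clamp that the MIN already provides.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the stream object updated in the loop. */
struct stream { uint32_t avail_out; };

/* First form: one MIN, then a plain truncation of its result. */
static uint64_t step_min_then_trunc(struct stream *s, uint64_t left)
{
    uint64_t m = left < UINT32_MAX ? left : UINT32_MAX; /* MIN_EXPR <left, 4294967295> */
    s->avail_out = (uint32_t) m;                        /* plain truncation, m already fits */
    return left - m;                                    /* remaining bytes */
}

/* Saturating truncation written out by hand (models .SAT_TRUNC). */
static uint32_t sat_trunc_u64_u32(uint64_t x)
{
    return x > UINT32_MAX ? UINT32_MAX : (uint32_t) x;
}

/* Second form: the MIN is still needed for the subtraction, and the
   saturating truncation is computed separately from the original value. */
static uint64_t step_sat_trunc(struct stream *s, uint64_t left)
{
    uint64_t m = left < UINT32_MAX ? left : UINT32_MAX; /* MIN_EXPR kept alive */
    s->avail_out = sat_trunc_u64_u32(left);             /* .SAT_TRUNC (left) */
    return left - m;
}

/* Both forms must agree on the stored value and on the remainder. */
static void check_same(uint64_t left)
{
    struct stream a = { 0 }, b = { 0 };
    uint64_t ra = step_min_then_trunc(&a, left);
    uint64_t rb = step_sat_trunc(&b, left);
    assert(ra == rb);
    assert(a.avail_out == b.avail_out);
}
```

Since the results are identical, the only difference is cost: the first form reuses the MIN result and needs just a register move, while the second pays for the clamp again.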