https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96929
--- Comment #3 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <ja...@gcc.gnu.org>:

https://gcc.gnu.org/g:4866b2f5db117f9e89f82c44ffed57178c09cc49

commit r11-5271-g4866b2f5db117f9e89f82c44ffed57178c09cc49
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Tue Nov 24 09:03:17 2020 +0100

    middle-end, c++: Treat shifts by negative as undefined [PR96929]

    The PR38359 change made the -1 >> x to -1 optimization less useful by
    requiring that the x must be non-negative.
    Shifts by negative amount are UB, but we for historic reasons had in some
    (but not all) places some hack to treat shifts by negative value as the
    other direction shifts by the negated amount.

    The following patch just removes that special handling, instead we punt on
    optimizing those (and ideally path isolation should catch that up and turn
    those into __builtin_unreachable, perhaps with __builtin_warning next to
    it).  Folding the shifts in some places as if they were rotates and in other
    as if they were saturating just leads to inconsistencies.

    For C++ constexpr diagnostics and -fpermissive, I've added code to pretend
    fold-const.c has not changed, without -fpermissive it will be an error
    anyway and I think it is better not to change all the diagnostics.

    During x86_64-linux and i686-linux bootstrap/regtest, my statistics
    gathering patch noted 185 unique -m32/-m64 x TU x function_name x shift_kind
    x fold-const/tree-ssa-ccp cases.
    I have investigated the
    64 ../../gcc/config/i386/i386.c x86_output_aligned_bss LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/config/i386/i386-expand.c emit_memmov LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/config/i386/i386-expand.c ix86_expand_carry_flag_compare LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/expmed.c expand_divmod LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/lra-lives.c process_bb_lives LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/rtlanal.c nonzero_bits1 LSHIFT_EXPR wide_int_bitop
    64 ../../gcc/varasm.c optimize_constant_pool.isra LSHIFT_EXPR wide_int_bitop
    cases and all of them are either during jump threading (dom) or during PRE.
    For jump threading, the most common case is 1 << floor_log2 (whatever)
    where floor_log2 is
      return HOST_BITS_PER_WIDE_INT - 1 - clz_hwi (x);
    and clz_hwi is
      if (x == 0)
        return HOST_BITS_PER_WIDE_INT;
      return __builtin_clz* (x);
    and so has range [-1, 63] and a comparison against == 0 which makes the
    threader think it might be nice to jump thread the case leading to 1 << -1.
    I think it is better to keep the 1 << -1 s in the IL for this and let path
    isolation turn that into __builtin_unreachable () if the user wishes so.

    2020-11-24  Jakub Jelinek  <ja...@redhat.com>

            PR tree-optimization/96929
            * fold-const.c (wide_int_binop) <case LSHIFT_EXPR, case RSHIFT_EXPR>:
            Return false on negative second argument rather than trying to handle
            it as shift in the other direction.
            * tree-ssa-ccp.c (bit_value_binop) <case LSHIFT_EXPR,
            case RSHIFT_EXPR>: Punt on negative shift count rather than trying
            to handle it as shift in the other direction.
            * match.pd (-1 >> x to -1): Remove tree_expr_nonnegative_p check.

            * constexpr.c (cxx_eval_binary_expression): For shifts by constant
            with MSB set, emulate older wide_int_binop behavior to preserve
            diagnostics and -fpermissive behavior.

            * gcc.dg/tree-ssa/pr96929.c: New test.
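
The floor_log2 / clz_hwi range argument from the commit message can be sketched in standalone C. This is only an illustration, not GCC's own code: the names sketch_clz, sketch_floor_log2 and sketch_top_bit and the SKETCH_BITS constant are hypothetical stand-ins, and a 64-bit HOST_WIDE_INT is assumed. It shows why the shift count has range [-1, 63] and why only the x == 0 path reaches a shift by -1.

```c
#include <assert.h>

/* Assumption for this sketch: a 64-bit stand-in for HOST_BITS_PER_WIDE_INT. */
#define SKETCH_BITS 64

/* Same shape as the clz_hwi quoted above: zero maps to the full width. */
static int sketch_clz(unsigned long long x)
{
    if (x == 0)
        return SKETCH_BITS;
    return __builtin_clzll(x);   /* GCC/Clang builtin; undefined for 0 */
}

/* Same shape as the quoted floor_log2: the result lies in [-1, 63]. */
static int sketch_floor_log2(unsigned long long x)
{
    return SKETCH_BITS - 1 - sketch_clz(x);
}

/* Hypothetical caller computing 1 << floor_log2 (x).  The shift is only
   evaluated on the path where the count is a valid shift amount.  */
static unsigned long long sketch_top_bit(unsigned long long x)
{
    int n = sketch_floor_log2(x);
    if (n < 0)
        return 0;                /* x == 0: 1 << -1 would be undefined */
    assert(n < SKETCH_BITS);
    return 1ULL << n;
}
```

Without the n < 0 guard, a jump-threaded copy of the x == 0 path would contain the constant expression 1 << -1. After this commit the folders leave that expression alone rather than rewriting it as a shift in the other direction.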