https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66610
Bug ID: 66610
Summary: Compound assignments prevent value-numbering optimization
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: dmalcolm at gcc dot gnu.org
Target Milestone: ---

Created attachment 35818
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=35818&action=edit
Minimal testcase demonstrating the issue

libgccjit showed poor performance relative to an LLVM backend within an
experimental JIT-compiler for Lua (https://github.com/dibyendumajumdar/ravi).

On investigation it appears to be due to unions and structs stopping
value-numbering from working.

I'm attaching a minimal reproducer for the issue.

If the code copies the struct and the union within it field-wise, pass_fre
(tree-ssa-pre.c) uses value numbering to eliminate the copy of the loop index
through *arr, and turns loop_using_union_field_assignment into:

    loop_using_union_field_assignment (int num_iters, struct s * arr)
    {
      int i;

      <bb 2>:
      goto <bb 4>;

      <bb 3>:
      arr_6(D)->union_field.int_field = i_1;
      MEM[(struct s *)arr_6(D) + 4B].union_field.int_field = i_1;
      i_11 = i_1 + 1;

      <bb 4>:
      # i_1 = PHI <0(2), i_11(3)>
      if (i_1 < num_iters_5(D))
        goto <bb 3>;
      else
        goto <bb 5>;

      <bb 5>:
      return;

    }

and the loop is eliminated altogether by cddce2.

However, if the code does a compound copy, pass_fre doesn't eliminate the copy
of the loop index, and the loop can't be eliminated (with a big performance
loss); in the example, functions "loop_using_struct_assignment" and
"loop_using_union_assignment" fail to have their loops optimized away at -O3.

Hence the libgccjit user has patched things at their end to copy the fields
directly (fwiw their workaround was this commit:
https://github.com/dibyendumajumdar/ravi/commit/a5b192cd4f4213cd544e31b08b02eb9082142b20
).

Should compound assignments be optimizable via value-numbering?
Would it make sense to split out the compound assignments field-wise internally
before doing value-numbering?

This is all with gcc trunk (r224625 aka
e3a904dbdc78cb45b98e8b109e0e49e759315b7c) on x86_64 at -O3.
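
For readers without the attachment to hand, here is a rough sketch of what the three copy styles might look like in C. It is a reconstruction from the function names and the dump quoted in this report, not the attached testcase itself: the union tag "u", its float member and the loop bodies are assumptions; only struct s, union_field, int_field and the three function names come from the report.

```c
/* Hypothetical types: the float member and the tag "u" are assumptions,
   chosen so that struct s is 4 bytes as the "+ 4B" in the dump suggests.  */
union u
{
  int int_field;
  float float_field;
};

struct s
{
  union u union_field;
};

/* Field-wise copy: the innermost scalar is copied explicitly.  This is the
   form the report says pass_fre handles via value numbering.  */
void
loop_using_union_field_assignment (int num_iters, struct s *arr)
{
  for (int i = 0; i < num_iters; i++)
    {
      arr[0].union_field.int_field = i;
      arr[1].union_field.int_field = arr[0].union_field.int_field;
    }
}

/* Compound copy of the union as a whole.  */
void
loop_using_union_assignment (int num_iters, struct s *arr)
{
  for (int i = 0; i < num_iters; i++)
    {
      arr[0].union_field.int_field = i;
      arr[1].union_field = arr[0].union_field;
    }
}

/* Compound copy of the enclosing struct as a whole.  */
void
loop_using_struct_assignment (int num_iters, struct s *arr)
{
  for (int i = 0; i < num_iters; i++)
    {
      arr[0].union_field.int_field = i;
      arr[1] = arr[0];
    }
}
```

All three functions leave the same values in arr[0] and arr[1]; they differ only in the granularity of the copy. Compiling a file like this at -O3 with -fdump-tree-all should let the dumps for the three functions be compared side by side.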