https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117559

            Bug ID: 117559
           Summary: Hybrid analysis confused by store pattern for uniform
                    mask .MASK_STORE
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rguenth at gcc dot gnu.org
  Target Milestone: ---

gcc.target/aarch64/sve/mask_struct_store_1.c ends up using hybrid SLP:

gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   Processing hybrid
candidate : patt_35 = (<signed-boolean:1>) _36;
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   Found loop_vect
sink: patt_35 = (<signed-boolean:1>) _36;
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   marking hybrid: _36
= _9 != 0;
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   marking hybrid: _9 =
*_8;

which is because the mask for the masked store-lane node

gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   op template:
.MASK_STORE (_10, 8B, patt_51, value_20);
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:           stmt 0
.MASK_STORE (_10, 8B, patt_51, value_20);
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:           stmt 1
.MASK_STORE (_12, 8B, patt_35, value_20);
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:           children
0x4a2c1f8 0x4a2c4f0 0x4a2c1f8 (store-lanes)

only uses a single lane (for the whole group):

gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   node 0x4a2c4f0
(max_nunits=16, refcnt=3) vector([16,16]) <signed-boolean:1>
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:   op template: patt_51
= (<signed-boolean:1>) _36;
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:           stmt 0
patt_51 = (<signed-boolean:1>) _36;
gcc.target/aarch64/sve/mask_struct_store_1.c:43:1: note:           children
0x4a2c588

This leaves the other (redundant) pattern stmts no longer covered by SLP.
The cause is the mask_conversion pattern code, which fails to preserve
"same" mask args:

vect_recog_mask_conversion_pattern: detected: .MASK_STORE (_10, 8B, _36,
value_20);
mask_conversion pattern recognized: .MASK_STORE (_10, 8B, patt_51, value_20);
vect_recog_mask_conversion_pattern: detected: .MASK_STORE (_12, 8B, _36,
value_20);
mask_conversion pattern recognized: .MASK_STORE (_12, 8B, patt_35, value_20);
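
For orientation, a loop of roughly the following shape yields two masked
stores guarded by one condition after if-conversion. This is a hypothetical
reduction, not the testsuite source: the function name and element types are
assumptions, chosen only so that the condition type differs from the stored
type and a mask conversion is needed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduction (name and types are assumptions, not taken from
   mask_struct_store_1.c).  Both stores depend on the same condition, so
   after if-conversion they become two .MASK_STORE calls sharing one mask
   definition, which plays the role of _36 in the dumps.  */
void
store_pair_if (int32_t *restrict dest, const int32_t *restrict src,
               const int16_t *restrict cond, int32_t bias, size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      int32_t value = src[i] + bias;
      if (cond[i])
        {
          dest[i * 2] = value;      /* first store of the group */
          dest[i * 2 + 1] = value;  /* second store, same mask, same value */
        }
    }
}
```

The point of the sketch is only that a single source-level condition feeds
both stores, so any per-store rewrite of the mask produces two names for
what is logically one operand.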

It might be possible to recover after the fact, either by marking the other
lane as not relevant or by recording "alternate" representations of the
same lane in the mask SLP node and covering SLP marking that way.  Still,
it's quite ugly that pattern matching messes this up.
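
The "alternate representations" idea can be illustrated with a toy model.
Nothing below is GCC code: the struct and both functions are hypothetical,
and the ids merely echo patt_51, patt_35 and _36 from the dumps.  It only
shows why a coverage check keyed on the pattern stmt misses the second
conversion while one keyed on the original mask def would not.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in GCC.  */
struct mask_conv
{
  int pattern_id;   /* per-use conversion result, e.g. 51 or 35 */
  int orig_def_id;  /* mask definition being converted, e.g. 36 */
};

/* A stmt counts as covered only if it is the lane's representative.  */
bool
covered_exact (const struct mask_conv *lane_rep, const struct mask_conv *stmt)
{
  return lane_rep->pattern_id == stmt->pattern_id;
}

/* A stmt also counts as covered if it converts the same original mask
   def as the lane's representative (an "alternate" of that lane).  */
bool
covered_with_alternates (const struct mask_conv *lane_rep,
                         const struct mask_conv *stmt)
{
  return lane_rep->orig_def_id == stmt->orig_def_id;
}
```

With a representative { 51, 36 } and a second conversion { 35, 36 }, the
exact check rejects the second conversion and the alternate-aware check
accepts it.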

Note that pattern matching happens before DR group analysis, which figures
out that the masks are the same because it analyzes the original DRs and
not the patterns.

Since the patterns are associated with the .MASK_STORE and not the original
mask def we can't really use the same pattern stmt...
