http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48927

           Summary: Issues with "enable" attribute and IRA register
                    preferences
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ubiz...@gmail.com


Trying to merge the *vec_concatv4si_1_avx and *vec_concatv4si_1 patterns using
the "enable" attribute causes gcc.target/i386/pr36246.c (scan-assembler-not for
the movq insn) and gcc.target/i386/pr36222-1.c (scan-assembler-not for movdqa)
to fail with:

FAIL: gcc.target/i386/pr36222-1.c scan-assembler-not movdqa
FAIL: gcc.target/i386/pr36246.c scan-assembler-not movq


The two patterns follow: the original (first) and the merged pattern
(second). The separate AVX pattern is not relevant to this discussion.

(define_insn "*vec_concatv4si_old"
 [(set (match_operand:V4SI 0 "register_operand"       "=Y2,x,x")
       (vec_concat:V4SI
         (match_operand:V2SI 1 "register_operand"     " 0 ,0,0")
         (match_operand:V2SI 2 "nonimmediate_operand" " Y2,x,m")))]
 "0"
 "@
  punpcklqdq\t{%2, %0|%0, %2}
  movlhps\t{%2, %0|%0, %2}
  movhps\t{%2, %0|%0, %2}"
 [(set_attr "type" "sselog,ssemov,ssemov")
  (set_attr "mode" "TI,V4SF,V2SF")])

(define_insn "*vec_concatv4si"
 [(set (match_operand:V4SI 0 "register_operand"       "=Y2,x,x,x,x")
       (vec_concat:V4SI
         (match_operand:V2SI 1 "register_operand"     " 0 ,x,0,0,x")
         (match_operand:V2SI 2 "nonimmediate_operand" " Y2,x,x,m,m")))]
 "TARGET_SSE"
 "@
  punpcklqdq\t{%2, %0|%0, %2}
  vpunpcklqdq\t{%2, %1, %0|%0, %1, %2}
  movlhps\t{%2, %0|%0, %2}
  movhps\t{%2, %0|%0, %2}
  vmovhps\t{%2, %1, %0|%0, %1, %2}"
 [(set_attr "isa" "noavx,avx,noavx,noavx,avx")
  (set_attr "type" "sselog,sselog,ssemov,ssemov,ssemov")
  (set_attr "prefix" "orig,vex,orig,orig,vex")
  (set_attr "mode" "TI,TI,V4SF,V2SF,V2SF")])

The problem is that, for a non-AVX target, the merged pattern somehow
changes register allocation preferences. All new alternatives are
disabled for a non-AVX target, so in theory nothing should differ.
However, IRA behaves differently; the diff between the patched
(pr36246_1.c) and non-patched (pr36246.c) IRA dump files shows:

--- pr36246_1.c.190r.ira        2011-05-05 22:06:46.252582018 +0200
+++ pr36246.c.190r.ira  2011-05-05 21:50:07.831975984 +0200
@@ -100,10 +100,9 @@
  cp1:a1(r68)<->a5(r62)@125:shuffle
  cp2:a2(r69)<->a4(r65)@125:shuffle
  cp3:a2(r69)<->a3(r64)@125:shuffle
-  cp4:a0(r67)<->a2(r69)@125:shuffle
-  cp5:a0(r67)<->a1(r68)@125:shuffle
+  cp4:a0(r67)<->a2(r69)@1000:constraint
  regions=1, blocks=3, points=8
-    allocnos=7 (big 0), copies=6, conflicts=0, ranges=7
+    allocnos=7 (big 0), copies=5, conflicts=0, ranges=7

 **** Allocnos coloring:

@@ -140,11 +139,11 @@
      Popping a6(r63,l0)  -- assign reg 4
      Popping a5(r62,l0)  -- assign reg 5
      Popping a0(r67,l0)  -- assign reg 21
-      Popping a1(r68,l0)  -- assign reg 21
-      Popping a2(r69,l0)  -- assign reg 22
+      Popping a1(r68,l0)  -- assign reg 22
+      Popping a2(r69,l0)  -- assign reg 21
 Disposition:
    5:r62  l0     5    6:r63  l0     4    3:r64  l0     1    4:r65  l0     2
-    0:r67  l0    21    1:r68  l0    21    2:r69  l0    22
+    0:r67  l0    21    1:r68  l0    22    2:r69  l0    21

This results in different register assignments, and the diff between
the assembly files shows:

--- pr36246_1.s 2011-05-05 22:06:46.255582628 +0200
+++ pr36246.s   2011-05-05 21:50:07.833976438 +0200
@@ -1,4 +1,4 @@
-       .file   "pr36246_1.c"
+       .file   "pr36246.c"
       .text
       .p2align 4,,15
       .globl  _mm_set_epi32
@@ -7,19 +7,17 @@
 .LFB0:
       .cfi_startproc
       movl    %esi, -12(%rsp) # 23    *movsi_internal/2       [length = 4]
-       movd    -12(%rsp), %xmm0        # 24    *movsi_internal/12      [length = 6]
+       movd    -12(%rsp), %xmm1        # 24    *movsi_internal/12      [length = 6]
       movl    %edi, -12(%rsp) # 25    *movsi_internal/2       [length = 4]
-       movd    -12(%rsp), %xmm1        # 26    *movsi_internal/12      [length = 6]
+       movd    -12(%rsp), %xmm0        # 26    *movsi_internal/12      [length = 6]
       movl    %ecx, -12(%rsp) # 27    *movsi_internal/2       [length = 4]
-       punpckldq       %xmm1, %xmm0    # 9     *vec_concatv2si_sse2/1  [length = 4]
-       movd    -12(%rsp), %xmm1        # 28    *movsi_internal/12      [length = 6]
+       punpckldq       %xmm0, %xmm1    # 9     *vec_concatv2si_sse2/1  [length = 4]
+       movd    -12(%rsp), %xmm0        # 28    *movsi_internal/12      [length = 6]
       movl    %edx, -12(%rsp) # 29    *movsi_internal/2       [length = 4]
       movd    -12(%rsp), %xmm2        # 30    *movsi_internal/12      [length = 6]
-       punpckldq       %xmm2, %xmm1    # 10    *vec_concatv2si_sse2/1  [length = 4]
-       movq    %xmm1, %xmm2    # 31    *movv2si_internal_rex64/10      [length = 4]
-       punpcklqdq      %xmm0, %xmm2    # 11    *vec_concatv4si/1       [length = 4]
-       movdqa  %xmm2, %xmm0    # 32    *movv4si_internal/2     [length = 4]
-       ret     # 35    return_internal [length = 1]
+       punpckldq       %xmm2, %xmm0    # 10    *vec_concatv2si_sse2/1  [length = 4]
+       punpcklqdq      %xmm1, %xmm0    # 11    *vec_concatv4si_1/1     [length = 4]
+       ret     # 33    return_internal [length = 1]
       .cfi_endproc
 .LFE0:
       .size   _mm_set_epi32, .-_mm_set_epi32

This triggers the scan-assembler-not failures and points to an
interaction between the "enable" attribute and IRA. I believe the
intent of the "enable" attribute is that a merged pattern behaves
consistently with the equivalent separate patterns in all stages of
compilation.
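
The expectation that the merged pattern reduces to the original one on a
non-AVX target can be checked with a small model. The sketch below is not
GCC code: the tuples are transcribed from the two define_insn patterns
above (constraint modifiers such as "=" dropped), while the names OLD,
MERGED, enabled and visible are hypothetical. The "base" tag for the old
pattern, and the rule that "noavx" alternatives are active only without
AVX and "avx" alternatives only with it, are assumptions about how the
"isa" attribute is meant to behave.

```python
# Toy model of per-alternative enabling; illustrative only, not GCC internals.
# Each alternative: (isa, operand0, operand1, operand2, mnemonic).

OLD = [
    ("base", "Y2", "0", "Y2", "punpcklqdq"),
    ("base", "x", "0", "x", "movlhps"),
    ("base", "x", "0", "m", "movhps"),
]

MERGED = [
    ("noavx", "Y2", "0", "Y2", "punpcklqdq"),
    ("avx", "x", "x", "x", "vpunpcklqdq"),
    ("noavx", "x", "0", "x", "movlhps"),
    ("noavx", "x", "0", "m", "movhps"),
    ("avx", "x", "x", "m", "vmovhps"),
]


def enabled(isa, target_avx):
    """Assumed rule: 'base' is always on, 'noavx'/'avx' follow the target."""
    if isa == "noavx":
        return not target_avx
    if isa == "avx":
        return target_avx
    return True


def visible(alternatives, target_avx):
    """Return the alternatives left after filtering, without the isa tag."""
    return [alt[1:] for alt in alternatives if enabled(alt[0], target_avx)]


if __name__ == "__main__":
    # On a non-AVX target both patterns expose the same three alternatives.
    print(visible(MERGED, False) == visible(OLD, False))
```

Under these assumptions the enabled alternatives of both patterns are
identical on a non-AVX target, in the same order, so any difference in
the IRA dump would have to come from how disabled alternatives are
handled rather than from the constraints themselves.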
