Re: Patch: PR target/55142: [4.8 Regression] internal compiler error: in plus_constant, at explow.c:88

H.J. Lu Mon, 19 Nov 2012 11:19:31 -0800

On Tue, Nov 13, 2012 at 10:59 PM, Uros Bizjak <ubiz...@gmail.com> wrote:
> On Tue, Nov 13, 2012 at 7:23 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
>
>>>>> >>>> Since x32 runs in 64-bit mode, for address -0x40000300(%rax), 
>>>>> >>>> hardware
>>>>> >>>> sign-extends displacement from 32-bits to 64-bits and adds it to 
>>>>> >>>> %rax.
>>>>> >>>> But x32 wants 32-bit -0x40000300, not 64-bit -0x40000300.  This patch
>>>>> >>>> uses 32-bit registers instead of 64-bit registers when displacement
>>>>> >>>> < -16*1024*1024.  -16*1024*1024 is used instead of 0 so that we will
>>>>> >>>> still generate -16(%rsp) instead of -16(%esp).
>>>>> >>>>
>>>>> >>>> Tested it on Linux/x32.  OK to install?
>>>>> >>>
>>>>> >>> This problem uncovers a bug in the middle-end, so I guess it would be
>>>>> >>> better to fix it there.
>>>>> >>>
>>>>> >>> Uros.
>>>>> >>
>>>>> >> The problem is it isn't safe to transform
>>>>> >>
>>>>> >> (zero_extend:DI (plus:SI (FOO:SI) (const_int Y)))
>>>>> >>
>>>>> >> to
>>>>> >>
>>>>> >> (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>>>>> >>
>>>>> >> when Y is negative and its absolute value is greater than FOO.  
>>>>> >> However,
>>>>> >> we have to do this transformation since other parts of GCC depend on
>>>>> >> it.  If we revert the fix for
>>>>> >>
>>>>> >> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49721
>>>>> >>
>>>>> >> we will get
>>>>> >>
>>>>> >> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (internal compiler 
>>>>> >> error)
>>>>> >> FAIL: gcc.c-torture/compile/990523-1.c  -O3 -g  (test for excess 
>>>>> >> errors)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer 
>>>>> >> -funroll-all-loo
>>>>> >> ps -finline-functions  (internal compiler error)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer 
>>>>> >> -funroll-all-loo
>>>>> >> ps -finline-functions  (test for excess errors)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer 
>>>>> >> -funroll-loops
>>>>> >> (internal compiler error)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer 
>>>>> >> -funroll-loops
>>>>> >> (test for excess errors)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  
>>>>> >> (internal compi
>>>>> >> ler error)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -fomit-frame-pointer  (test 
>>>>> >> for exces
>>>>> >> s errors)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (internal compiler 
>>>>> >> error)
>>>>> >> FAIL: gcc.c-torture/compile/pr41634.c  -O3 -g  (test for excess errors)
>>>>> >> FAIL: gcc.dg/Warray-bounds.c (internal compiler error)
>>>>> >> FAIL: gcc.dg/Warray-bounds.c (test for excess errors)
>>>>> >>
>>>>> >> since we generate pseudo registers to convert SImode to DImode
>>>>> >> after reload.  Fixing it requires significant changes.
>>>>> >>
>>>>> >> This is only a problem for 64-bit register address since the symbolic
>>>>> >> address is 32-bit.  Using 32-bit base/index registers will work around
>>>>> >> this issue.
>>>>> >
>>>>> > This address
>>>>> >
>>>>> > (plus:DI (zero_extend:DI (FOO:SI)) (const_int Y))
>>>>> >
>>>>> > is OK for x32 as long as Y, which is encoded as 32-bit immediate,
>>>>> > is zero-extend from 32-bit to 64-bit.  SImode address does it.
>>>>> > My patch optimizes it a little bit by using SImode address only
>>>>> > for Y < -16*1024*1024.
>>>>>
>>>>> I was wondering, why we operate with constant -16*1024*1024? Should we
>>>>> use 0x7FFFFFF instead? Since the MSB is always zero, we are safe.
>>>>>
>>>>
>>>> We can check 0x7FFFFFF, i.e., disp < 0.  I use -16*1024*1024, which
>>>> is also used to check legitimate address displacements for PIC, to
>>>> reduce code sizes for small negative displacements.  Or we can always
>>>> encode negative displacements with zero-extension, including -1(%rsp).
>>>>
>>>>> Please add a fat ??? comment, why we paper-over this issue and repost
>>>>> the latest patch. I got lost in all the versions :(
>>>>>
>>>>
>>>> Here is the updated patch.
>>>>
>>>> gcc/
>>>>
>>>> 2012-11-13  Eric Botcazou  <ebotca...@adacore.com>
>>>>             H.J. Lu  <hongjiu...@intel.com>
>>>>
>>>>         PR target/55142
>>>>         * config/i386/i386.c (legitimize_pic_address): Properly handle
>>>>         REG + CONST.
>>>>         (ix86_print_operand_address): For x32, zero-extend negative
>>>>         displacement if it < -16*1024*1024.
>>>>
>>>> gcc/testsuite/
>>>>
>>>> 2012-11-13  H.J. Lu  <hongjiu...@intel.com>
>>>>
>>>>         PR target/55142
>>>>         * gcc.target/i386/pr55142-1.c: New file.
>>>>         * gcc.target/i386/pr55142-2.c: Likewise.
>>>
>>> OK, for mainline SVN (with the reservation that middle-end fix was not
>>> found to be a viable solution, so target fix is papering-over real
>>> issue. Let's wait for the next victim... ).
>>
>> That is true.
>>
>>> Oh, and please fix a typo of mine in the one line above the patch
>>> hunk; the modifier for SI addresses should be 'k', not 'l'.
>>
>> Will do.
>>
>>> BTW: Do we need this patch also for 4.7? x32 address-mode is long by
>>> default there, but the problem doesn't trigger.
>>>
>>
>> This regression is triggered by revision 188008:
>>
>> http://gcc.gnu.org/ml/gcc-cvs/2012-05/msg00038.html
>>
>> aka the un-sign-extension of sizetype constants.  We can
>> backport it to 4.7 branch if we want to be on the safer side.
>
> Yes, please backport the patch to 4.7 after it lives a couple of days
> in mainline without problems.
>
> Thanks,
> Uros.


I checked this patch into 4.7 branch after testing.

Thanks.

-- 
H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c7fce0b..b33aeea 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2012-11-19  H.J. Lu  <hongjiu...@intel.com>
+
+       Backported from mainline
+       2012-11-13  Eric Botcazou  <ebotca...@adacore.com>
+                   H.J. Lu  <hongjiu...@intel.com>
+
+       PR middle-end/55142
+       * config/i386/i386.c (legitimize_pic_address): Properly handle
+       REG + CONST.
+       (ix86_print_operand_address): Set code to 'k' when forcing
+       addr32 prefix.  For x32, zero-extend negative displacement if
+       it < -16*1024*1024.
+
 2012-11-12  Eric Botcazou  <ebotca...@adacore.com>

        * common.opt (fvar-tracking-uninit): Document.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 45be11b..b805af6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -12330,7 +12330,6 @@ legitimize_pic_address (rtx orig, rtx reg)
 {
   rtx addr = orig;
   rtx new_rtx = orig;
-  rtx base;

 #if TARGET_MACHO
   if (TARGET_MACHO && !TARGET_64BIT)
@@ -12535,20 +12534,33 @@ legitimize_pic_address (rtx orig, rtx reg)
            }
          else
            {
-             base = legitimize_pic_address (XEXP (addr, 0), reg);
-             new_rtx  = legitimize_pic_address (XEXP (addr, 1),
-                                                base == reg ? NULL_RTX : reg);
+             rtx base = legitimize_pic_address (op0, reg);
+             enum machine_mode mode = GET_MODE (base);
+             new_rtx
+               = legitimize_pic_address (op1, base == reg ? NULL_RTX : reg);

              if (CONST_INT_P (new_rtx))
-               new_rtx = plus_constant (base, INTVAL (new_rtx));
+               {
+                 if (INTVAL (new_rtx) < -16*1024*1024
+                     || INTVAL (new_rtx) >= 16*1024*1024)
+                   {
+                     if (!x86_64_immediate_operand (new_rtx, mode))
+                       new_rtx = force_reg (mode, new_rtx);
+                     new_rtx
+                       = gen_rtx_PLUS (mode, force_reg (mode, base), new_rtx);
+                   }
+                 else
+                   new_rtx = plus_constant (base, INTVAL (new_rtx));
+               }
              else
                {
-                 if (GET_CODE (new_rtx) == PLUS && CONSTANT_P (XEXP (new_rtx, 
1)))
+                 if (GET_CODE (new_rtx) == PLUS
+                     && CONSTANT_P (XEXP (new_rtx, 1)))
                    {
-                     base = gen_rtx_PLUS (Pmode, base, XEXP (new_rtx, 0));
+                     base = gen_rtx_PLUS (mode, base, XEXP (new_rtx, 0));
                      new_rtx = XEXP (new_rtx, 1);
                    }
-                 new_rtx = gen_rtx_PLUS (Pmode, base, new_rtx);
+                 new_rtx = gen_rtx_PLUS (mode, base, new_rtx);
                }
            }
        }
@@ -14642,7 +14654,30 @@ ix86_print_operand_address (FILE *file, rtx addr)
            }
 #endif
          gcc_assert (!code);
-         code = 'l';
+         code = 'k';
+       }
+      else if (code == 0
+              && TARGET_X32
+              && disp
+              && CONST_INT_P (disp)
+              && INTVAL (disp) < -16*1024*1024)
+       {
+         /* X32 runs in 64-bit mode, where displacement, DISP, in
+            address DISP(%r64), is encoded as 32-bit immediate sign-
+            extended from 32-bit to 64-bit.  For -0x40000300(%r64),
+            address is %r64 + 0xffffffffbffffd00.  When %r64 <
+            0x40000300, like 0x37ffe064, address is 0xfffffffff7ffdd64,
+            which is invalid for x32.  The correct address is %r64
+            - 0x40000300 == 0xf7ffdd64.  To properly encode
+            -0x40000300(%r64) for x32, we zero-extend negative
+            displacement by forcing addr32 prefix which truncates
+            0xfffffffff7ffdd64 to 0xf7ffdd64.  In theory, we should
+            zero-extend all negative displacements, including -1(%rsp).
+            However, for small negative displacements, sign-extension
+            won't cause overflow.  We only zero-extend negative
+            displacements if they < -16*1024*1024, which is also used
+            to check legitimate address displacements for PIC.  */
+         code = 'k';
        }

       if (ASSEMBLER_DIALECT == ASM_ATT)
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 7f402b1..a040c81 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,12 @@
+2012-11-19  H.J. Lu  <hongjiu...@intel.com>
+
+       Backported from mainline
+       2012-11-13  H.J. Lu  <hongjiu...@intel.com>
+
+       PR middle-end/55142
+       * gcc.target/i386/pr55142-1.c: New file.
+       * gcc.target/i386/pr55142-2.c: Likewise.
+
 2012-11-09  Eric Botcazou  <ebotca...@adacore.com>

        * gnat.dg/stack_check3.ad[sb]: New test.
diff --git a/gcc/testsuite/gcc.target/i386/pr55142-1.c
b/gcc/testsuite/gcc.target/i386/pr55142-1.c
new file mode 100644
index 0000000..28375b5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr55142-1.c
@@ -0,0 +1,34 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-O2 -mx32 -fpic" } */
+
+typedef int int32_t;
+typedef unsigned int uint32_t;
+typedef int32_t Elf32_Sword;
+typedef struct
+{
+  Elf32_Sword d_tag;
+} Elf32_Dyn;
+struct link_map
+{
+  Elf32_Dyn *l_ld;
+  Elf32_Dyn *l_info[34];
+};
+extern struct link_map _dl_rtld_map __attribute__ ((visibility ("hidden")));
+static void elf_get_dynamic_info (struct link_map *l)
+{
+  Elf32_Dyn *dyn = l->l_ld;
+  Elf32_Dyn **info;
+  info = l->l_info;
+  while (dyn->d_tag != 0)
+    {
+      if ((uint32_t) (0x6ffffeff - dyn->d_tag) < 11)
+       info[0x6ffffeff - dyn->d_tag + 12] = dyn;
+      ++dyn;
+    }
+}
+void
+foo (void)
+{
+  elf_get_dynamic_info (&_dl_rtld_map);
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr55142-2.c
b/gcc/testsuite/gcc.target/i386/pr55142-2.c
new file mode 100644
index 0000000..9daae9d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr55142-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile { target { ! { ia32 } } } } */
+/* { dg-require-effective-target fpic } */
+/* { dg-options "-O3 -mx32 -fpic" } */
+/* { dg-final { scan-assembler-not "movl\[\\t \]*%.*,\[\\t
\]*-1073742592\\(%r(.x|.i|.p|\[1-9\]*)\\)" } } */
+
+typedef int int32_t;
+typedef unsigned int uint32_t;
+typedef uint32_t Elf32_Word;
+typedef int32_t Elf32_Sword;
+typedef uint32_t Elf32_Addr;
+typedef struct {
+  Elf32_Sword d_tag;
+  union {
+    Elf32_Word d_val;
+    Elf32_Addr d_ptr;
+  } d_un;
+} Elf32_Dyn;
+struct link_map {
+  Elf32_Dyn *l_ld;
+  Elf32_Dyn *l_info[34 + 16 + 3 + 12 + 11];
+};
+void
+elf_get_dynamic_info (struct link_map *l)
+{
+  Elf32_Dyn *dyn = l->l_ld;
+  Elf32_Dyn **info = l->l_info;
+  typedef Elf32_Word d_tag_utype;
+  while (dyn->d_tag != 0) {
+    if ((d_tag_utype) (0x6ffffeff - dyn->d_tag) < 11)
+      info[(0x6ffffeff - dyn->d_tag) + 34 + 16 + 3 + 12] = dyn;
+    ++dyn;
+  }
+}
-- 
1.7.11.7

Re: Patch: PR target/55142: [4.8 Regression] internal compiler error: in plus_constant, at explow.c:88

Reply via email to