------- Comment #8 from steven at gcc dot gnu dot org  2006-01-10 17:50 -------
The new reassociation pass, or the removal of DOM's reassociation bits, fixed
this on the trunk.  GCC 4.1 generates poorer initial RTL and never manages to
fix it up:

The .final_cleanup from GCC 4.1 and GCC 4.0:
;; Function foo (foo)

foo (v)
{
<bb 0>:
  return v & -v;

}


And the .final_cleanup from GCC 4.2:
;; Function foo (foo)

foo (v)
{
<bb 2>:
  return -v & v;

}
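
For reference, a C function of roughly the following shape would give the
trees shown above.  This is only a sketch: the actual test case lives in the
bug report, the `long` type is an assumption inferred from the DImode
registers in the RTL below, and `foo_swapped` is a hypothetical name added
here to mirror the operand order that GCC 4.2 prints.

```c
/* Assumed reconstruction of the test case; the parameter type is a guess
   based on the DImode (64-bit) registers seen in the RTL dumps.  */
long foo(long v)
{
    /* Operand order as printed by GCC 4.0 and 4.1: isolates the lowest
       set bit of v.  */
    return v & -v;
}

/* Hypothetical twin with the operand order GCC 4.2 prints.  Bitwise AND
   is commutative, so both functions compute the same value; only the
   order of the operands handed to the RTL expander differs.  */
long foo_swapped(long v)
{
    return -v & v;
}
```

Both functions return 4 for an input of 12 (binary 1100), so the two dumps
are semantically identical and only the operand order changes.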


The initial RTL differs accordingly ("-" is GCC 4.1, "+" is GCC 4.2):
 (insn 12 11 13 (parallel [
             (set (reg:DI 60)
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
         ]) -1 (nil)
     (nil))

So this regression is not caused by the register allocator, but it does play a
role:

In the .combine and .ce2 RTL dumps, the difference is still there:
(insn 12 11 16 (parallel [
             (set (reg:DI 60)
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
-    (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
-        (expr_list:REG_DEAD (reg:DI 61)
+    (expr_list:REG_DEAD (reg:DI 61)
+        (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
             (expr_list:REG_UNUSED (reg:CC 17 flags)
                 (nil)))))

Then in the .regmove RTL dump something changes:
(insn:HI 12 11 16 (parallel [
-            (set (reg/v:DI 59 [ v ])
-                (and:DI (reg/v:DI 59 [ v ])
-                    (reg:DI 61)))
+            (set (reg:DI 61)
+                (and:DI (reg:DI 61)
+                    (reg/v:DI 59 [ v ])))
             (clobber (reg:CC 17 flags))
         ]) 297 {*anddi_1_rex64} (insn_list:REG_DEP_TRUE 11 (nil))
-    (expr_list:REG_DEAD (reg:DI 61)
+    (expr_list:REG_DEAD (reg/v:DI 59 [ v ])
         (expr_list:REG_UNUSED (reg:CC 17 flags)
             (nil))))

This small difference eventually leads to a different register allocation.
The choice that GCC 4.2 makes is superior: the AND already writes its result
to ax, so the move into the result register becomes redundant and is deleted.
The .greg RTL dump shows this:

-(insn:HI 12 11 16 0 (parallel [
-            (set (reg/v:DI 5 di [orig:59 v ] [59])
-                (and:DI (reg/v:DI 5 di [orig:59 v ] [59])
-                    (reg:DI 0 ax [61])))
+(insn:HI 12 11 16 2 (parallel [
+            (set (reg:DI 0 ax [61])
+                (and:DI (reg:DI 0 ax [61])
+                    (reg/v:DI 5 di [orig:59 v ] [59])))
             (clobber (reg:CC 17 flags))
         ]) 297 {*anddi_1_rex64} (insn_list:REG_DEP_TRUE 11 (nil))
     (nil))

-(insn:HI 19 16 25 0 (set (reg/i:DI 0 ax [ <result> ])
-        (reg/v:DI 5 di [orig:59 v ] [59])) 81 {*movdi_1_rex64} 
-    (insn_list:REG_DEP_TRUE 12 (nil))
-    (nil))


-- 

steven at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  BugsThisDependsOn|18427                       |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21715

