> Given that x86 memset/memcpy is still broken, I think we should revert
> it for now.

Well, looking into the code, the SSE alignment issue needs work - the
alignment test merely checks whether some alignment is known, not
whether 16-byte alignment is known; that is the cause of the failures
in the 32-bit bootstrap.  I originally convinced myself that this was
safe since we shoot for unaligned loads/stores anyway.
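
For illustration, a minimal sketch of the distinction described above
(hypothetical helper names, not the actual i386.c predicates):

  #include <stdbool.h>

  /* What the current test effectively asks: is *any* alignment
     known?  This passes even for a mere 4-byte alignment.  */
  static bool
  some_alignment_known (unsigned int align)
  {
    return align != 0;
  }

  /* What the aligned-SSE path would actually need: a known 16-byte
     (or better) alignment for its 16-byte loads/stores.  */
  static bool
  sse16_alignment_known (unsigned int align)
  {
    return align >= 16;
  }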


I've committed the following patch that disables SSE codegen and
unbreaks the Atom bootstrap.  This seems more sensible to me given
that the patch accumulated some good improvements on the non-SSE path
as well, and we can return to the SSE alignment issues incrementally.
There is still a failure in the Fortran testcase that I am convinced
is a previously latent issue.

I will be offline tomorrow.  If there are further serious problems,
feel free to revert the changes and we can look into them for the
next stage1.

Honza

        * i386.c (atom_cost): Disable SSE loop until alignment issues are fixed.
Index: i386.c
===================================================================
--- i386.c      (revision 181479)
+++ i386.c      (working copy)
@@ -1783,18 +1783,18 @@ struct processor_costs atom_cost = {
   /* stringop_algs for memcpy.  
      SSE loops works best on Atom, but fall back into non-SSE unrolled loop variant
      if that fails.  */
-  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
-    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
-   {{libcall, {{2048, sse_loop}, {2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
-    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
+  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
+    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
+   {{libcall, {{2048, unrolled_loop}, {-1, libcall}}}, /* Unknown alignment.  */
+    {libcall, {{2048, unrolled_loop},
               {-1, libcall}}}}},
 
   /* stringop_algs for memset.  */
-  {{{libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
-    {libcall, {{4096, sse_loop}, {4096, unrolled_loop}, {-1, libcall}}}},
-   {{libcall, {{1024, sse_loop}, {1024, unrolled_loop},         /* Unknown alignment.  */
+  {{{libcall, {{4096, unrolled_loop}, {-1, libcall}}}, /* Known alignment.  */
+    {libcall, {{4096, unrolled_loop}, {-1, libcall}}}},
+   {{libcall, {{1024, unrolled_loop},   /* Unknown alignment.  */
               {-1, libcall}}},
-    {libcall, {{2048, sse_loop}, {2048, unrolled_loop},
+    {libcall, {{2048, unrolled_loop},
               {-1, libcall}}}}},
   1,                                   /* scalar_stmt_cost.  */
   1,                                   /* scalar load_cost.  */
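
For readers not familiar with these tables, here is a simplified
sketch of how the {max_size, algorithm} pairs above are consulted
(hypothetical types and helper, not the actual selection logic in
i386.c):

  enum alg { alg_libcall, alg_unrolled_loop };

  struct pair { long max; enum alg a; };

  /* Walk the {max, alg} pairs: the first entry whose max covers SIZE
     wins; a max of -1 marks the catch-all (here, libcall).  */
  static enum alg
  pick_stringop_alg (const struct pair *p, long size)
  {
    for (;; p++)
      if (p->max == -1 || size <= p->max)
        return p->a;
  }

  /* Mirroring the new memcpy row for unknown alignment:
     pick_stringop_alg (row, 100)  selects alg_unrolled_loop, while
     pick_stringop_alg (row, 4096) falls through to alg_libcall.  */
  static const struct pair row[] =
    { { 2048, alg_unrolled_loop }, { -1, alg_libcall } };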
