Problem with stores and loads from unions and proposed fix

2010-02-05 Thread Martin Chaney
This problem showed up in a PDP10 C version of GCC I'm responsible for 
and took a good while to track down.  The fix is in generic gcc code so 
even though my PDP10 compiler is not an official gcc version and I 
haven't been successful at creating a failing program on the Intel 
compiler it seems like it should cause problems elsewhere so I figured I 
should pass it on.


Here's a union that allows referencing bits in a word in different ways 
(the PDP10 has a 36 bit word, but that doesn't seem to be an issue here)


   union {
   int word;
   struct {
   unsigned long w0 : 32;
   unsigned long pad : 4;
   } i32;
   struct {
   unsigned long s0 : 16;
   unsigned long s1 : 16;
   unsigned long pad : 4;
   } i16;
   struct {
   unsigned long b0 : 8;
   unsigned long b1 : 8;
   unsigned long b2 : 8;
   unsigned long b3 : 8;
   unsigned long pad : 4;
   } i8;
   } u32;


u32.word =  ;

/* in a subsequent different basic block which is guaranteed to be 
reached with u32 unchanged */

u32.i8.b1 = ... ;
... = u32.word ;

CSE detects that the same subexpression is used in two places and 
substitutes a reaching register for the reference to u32.word without 
noticing that the memory has been modified by the bit field reference.  
Adding a call to invalidate_any_buried_refs(dest) flags the memory 
reference in such a way that it is not ignored and the erroneous CSE 
optimization is not done.



--- gcse.c  (revision 156482)
+++ gcse.c  (working copy)
@@ -4783,7 +4783,12 @@ compute_ld_motion_mems (void)
 else
   ptr->invalid = 1;
   }
-   }
+  else
+{
+  /* Make sure there isn't a buried store somewhere.  */
+  invalidate_any_buried_refs (dest);
+}
+   }
 else
   invalidate_any_buried_refs (PATTERN (insn));
   }


Thanks to anyone who can help determine whether this is a problem for 
other gcc versions and getting a fix into the gcc source.


Martin Chaney
XKL, LLC








Re: Problem with stores and loads from unions and proposed fix

2010-02-08 Thread Martin Chaney

Richard Guenther wrote:

On Sat, Feb 6, 2010 at 3:44 AM, Martin Chaney  wrote:
  

This problem showed up in a PDP10 C version of GCC I'm responsible for and
took a good while to track down.  The fix is in generic gcc code so even
though my PDP10 compiler is not an official gcc version and I haven't been
successful at creating a failing program on the Intel compiler it seems like
it should cause problems elsewhere so I figured I should pass it on.

Here's a union that allows referencing bits in a word in different ways (the
PDP10 has a 36 bit word, but that doesn't seem to be an issue here)

  union {
  int word;
  struct {
  unsigned long w0 : 32;
  unsigned long pad : 4;
  } i32;
  struct {
  unsigned long s0 : 16;
  unsigned long s1 : 16;
  unsigned long pad : 4;
  } i16;
  struct {
  unsigned long b0 : 8;
  unsigned long b1 : 8;
  unsigned long b2 : 8;
  unsigned long b3 : 8;
  unsigned long pad : 4;
  } i8;
  } u32;


u32.word =  ;

/* in a subsequent different basic block which is guaranteed to be reached
with u32 unchanged */
u32.i8.b1 = ... ;
... = u32.word ;

CSE detects that the same subexpression is used in two places and
substitutes a reaching register for the reference to u32.word without
noticing that the memory has been modified by the bit field reference.
 Adding a call to invalidate_any_buried_refs(dest) flags the memory
reference in such a way that it is not ignored and the erroneous CSE
optimization is not done.


--- gcse.c  (revision 156482)
+++ gcse.c  (working copy)
@@ -4783,7 +4783,12 @@ compute_ld_motion_mems (void)
else
  ptr->invalid = 1;
  }
-   }
+  else
+{
+  /* Make sure there isn't a buried store somewhere.  */
+  invalidate_any_buried_refs (dest);
+}
+   }
else
  invalidate_any_buried_refs (PATTERN (insn));
  }


Thanks to anyone who can help determine whether this is a problem for other
gcc versions and getting a fix into the gcc source.



Please specify the GCC version this happens with and show the RTL
before gcse.

Richard.

  

Martin Chaney
XKL, LLC








My PDP10 compiler is based on GCC 4.3.0.  (I'm still trying to port the 
4.4 gimplification changes.)  Pasted below is the RTL from the pass 
prior to GCSE.  Without the fix, GCSE will replace the use of REG 48 in 
insn 70 with a reaching register which contains the value (mem:SI 
(reg:SI 16 )) and delete insn 69.  With the fix, this section of 
code is not modified by GCSE.


(note 66 65 67 15 [bb 15] NOTE_INSN_BASIC_BLOCK)

(insn 67 66 68 15 testunion.c:52 (set (reg:SI 46)
   (const_int 0 [0x0])) 66 {*movsi} (nil))

(insn 68 67 69 15 testunion.c:52 (set (zero_extract:SI (mem/s/j/c:SI 
(reg/f:SI 16 ) [0+0 S4 A36])

   (const_int 8 [0x8])
   (const_int 24 [0x18]))
   (reg:SI 46)) 48 {*insv_memsi} (expr_list:REG_EQUAL (const_int 0 
[0x0])

   (nil)))

(insn 69 68 70 15 testunion.c:53 (set (reg:SI 48)
   (mem/s/c:SI (reg/f:SI 16 ) [0+0 S4 A36])) 66 {*movsi} (nil))

(insn 70 69 71 15 testunion.c:53 (set (reg:SI 48)
   (lshiftrt:SI (reg:SI 48)
   (neg:SI (const_int -4 [0xfffc] 155 
{LSH_right} (nil))


(insn 71 70 73 15 testunion.c:53 (set (reg:SI 48)
   (and:SI (reg:SI 48)
   (const_int 65535 [0x]))) 170 {*andsi3} (nil))

(insn 73 71 76 15 testunion.c:53 (set (reg/v:SI 32 [ sum ])
   (plus:SI (reg/v:SI 32 [ sum ])
   (reg:SI 48))) 98 {*addsi3} (nil))
;; End of basic block 15 -> ( 8)


gcc compiler for pdp10

2008-04-18 Thread Martin Chaney

Hi,

I'm am the proprietor of a gcc compiler for the PDP10 architecture.

(This is a compiler previously worked on by Lars Brinkhoff who left XKL 
some while before I joined XKL.  It's possible some of you may have been 
familiar with him or the compiler from that time.)


The compiler is currently in a state where it is synched with the both 
the 4.3 and 4.4 branches, and it passes the testsuite tests (with the 
exception of some I've flagged as expected failures for the pdp10).


My employer is happy to release my work on the gcc compiler back to the 
gcc community and I've sent in a request for the necessary forms.


The PDP10 architecture is unusual in various ways that distinguish it 
from the mainstream architectures supported by the gcc compiler and this 
has made the development of this compiler a significant task.  
Undoubtedly I've made customizations in inappropriate ways.  I'm seeking 
contacts with people who might be able to advise me on how to cleanup my 
implementation to reduce the amount of #ifdef __PDP10_H__ I've sprinkled 
liberally throughout the source.  Also, if its possible to get simple 
changes made to prevent breaking my PDP10 version and that are otherwise 
innocuous that would be wonderful.  For example, the PDP10 word size is 
36 bits;  Fairly recently people have taken to writing code that assumes 
word size is a power of 2 even when it's straightforward to write in a 
manner that doesn't make that assumption.


Considering the large number of files customized to get the PDP10 
compiler working, I'm not sure whether it's possible to get it to build 
directly from the gcc trunk, but it would be nice to work toward that goal.


Some other things which distinguish the PDP10 architecture from 
assumptions in the gcc code base include: its variety of formats of 
pointers only one of which can be viewed as an integer and that one is 
capable of referencing only word aligned data, a functional difference 
between signed and unsigned integers, and peculiarities to the use of 
PDP10 byte arrays which are very difficult to describe.


Any help or advise would be appreciated.

Martin Chaney
XKL, LLC