Re: strict aliasing

2007-11-06 Thread Ian Lance Taylor
skaller <[EMAIL PROTECTED]> writes:

> On Mon, 2007-11-05 at 14:30 -0500, Ross Ridge wrote:
> 
> > One example of where it hurts on just about any platform is something
> > like this:
> > 
> > void allocate(int **p, unsigned len);
> > 
> > int *foo(unsigned len) {
> > int *p;
> > unsigned i;
> > allocate(&p, len);
> > for (i = 0; i < len; i++) 
> > p[i] = 1;
> > return p;
> > }
> > 
> > Without strict aliasing being enabled, the compiler can't assume
> > that the assignment "p[i] = 1" won't change "p".  This results in the value
> > of p being loaded on every loop iteration, instead of just once at the
> > start of the loop.  It also prevents GCC from vectorizing the loop.
> 
> 
> Now I'm a bit confused.. Ian wrote previously:
> 
> " Strict
> aliasing refers to one component of that analysis, really a final
> tie-breaker: using type information to distinguish loads and stores."
> 
> and 
> 
> "Strict aliasing only refers to loads and stores using pointers."
> 
> but the assignment here is of an int.

The assignment is indeed of an int, but it uses a pointer.  Strict
aliasing only refers to loads and stores which use pointers.  The
type-based alias analysis is done on the types to which those pointers
point.

Given two pointers, "T1* p1" and "T2* p2", type based alias analysis
(aka strict aliasing) lets us conclude that p1 and p2 point to
different memory if T1 and T2 are incompatible types with respect to
aliasing.  The rules for alias compatibility come straight from the C
standard.
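
As a rough sketch of what that conclusion buys the optimizer (the
function names below are made up for illustration and are not taken
from GCC), compare a pair of pointers to incompatible types with a
pair of pointers to the same type:

```c
/* Hypothetical illustration; these function names are not from GCC. */

/* T1 = int, T2 = float: incompatible for aliasing purposes, so with
   -fstrict-aliasing the compiler may assume the store through fp
   does not change *ip and may return 1 without reloading *ip. */
int store_int_then_float(int *ip, float *fp)
{
    *ip = 1;
    *fp = 2.0f;
    return *ip;
}

/* T1 = T2 = int: the two pointers may refer to the same object,
   so *a has to be reloaded after the store through b. */
int store_int_then_int(int *a, int *b)
{
    *a = 1;
    *b = 2;
    return *a;
}
```

Calling the second function with the same address for both arguments
is legal and returns 2, which is why the reload cannot be dropped
there.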

Ian


Help with debug info for fixed-point data types

2007-11-06 Thread Mohamed Shafi
Hello all,

I have implemented fixed-point data types in gcc-4.1.1 by
incorporating the patches sent out on the gcc-patches mailing list. To
debug fixed-point types, as per the DWARF manual we may need the
following info:
DW_AT_binary_scale : binary scale factor for fixed-point type.
DW_AT_digit_count : digit count for packed decimal or numeric string type.

For a fixed-point data type of format s7.15, how should the format be
encoded in the following attributes?

Format : s7.15
DW_AT_binary_scale :
DW_AT_digit_count :


Format : s.31
DW_AT_binary_scale :
DW_AT_digit_count :


Any help would be appreciated.

Regards,
Shafi


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 00:15 -0800, Ian Lance Taylor wrote:
> skaller <[EMAIL PROTECTED]> writes:
> 
> > On Mon, 2007-11-05 at 14:30 -0500, Ross Ridge wrote:
> > 
> > > One example of where it hurts on just about any platform is something
> > > like this:
> > > 
> > >   void allocate(int **p, unsigned len);
> > > 
> > >   int *foo(unsigned len) {
> > >   int *p;
> > >   unsigned i;
> > >   allocate(&p, len);
> > >   for (i = 0; i < len; i++) 
> > >   p[i] = 1;
> > >   return p;
> > >   }

> The assignment is indeed of an int, but it uses a pointer.  Strict
> aliasing only refers to loads and stores which use pointers.  The
> type-based alias analysis is done on the types to which those pointers
> point.
> 
> Given two pointers, "T1* p1" and "T2* p2", type based alias analysis
> (aka strict aliasing) lets us conclude that p1 and p2 point to
> different memory if T1 and T2 are incompatible types with respect to
> aliasing.  The rules for alias compatibility come straight from the C
> standard.

Yes but I still don't understand. The claim was that the assignment
might modify p. This is contra-indicated since p is a pointer
to an int, whereas the value being assigned is an int.

So IF we assume an int cannot alias a pointer, p cannot be modified:
the only store is of a type which cannot modify p.

This assumes we apply aliasing rules EXCEPT between pointers.

In particular, let me try to build a model: partition all the
types into classes which can alias each other, in two ways:
with strict aliasing (S), and with rough aliasing (R).

Then in S, T1* and T2* can alias IFF T1 is T2 (except for 'void'
of course).

Whereas in R, T1* and T2* can alias each other independently
of T1 and T2.

However in R, int cannot alias T1*. So the assignment of an int
cannot modify p above in EITHER partition S or R.

[Yes, I know it probably isn't a partition, just a mental model]

Now of course if we allow int to alias a pointer, THEN certainly
the assignment may modify p (at least without data flow analysis
to prove otherwise we have to assume that).

However this is a different circumstance. My understanding
was that gcc with strict aliasing turned off would optimise
the code above the same way as if it were on. On amd64
an int cannot alias a pointer (int is 32 bits, pointer is
64 bits).

So I remain confused as to the difference between strict
and non-strict aliasing with respect to what optimisations
are permitted (and/or actually done).


-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Reload using a live register to reload into

2007-11-06 Thread Pranav Bhandarkar
Hi,
Working on a private port I am seeing a problem with reload clobbering
a live register and thus causing havoc.

Consider the following snippet of the code dump.
(note:HI 85 84 86 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

(note:HI 86 85 89 5 NOTE_INSN_DELETED)

(insn:HI 89 86 87 5 cor_h.c:129 (set (reg:SI 3 $c3 [ ivtmp.103 ])
(sign_extend:SI (subreg:HI (reg:SI 206 [ ivtmp.103 ]) 0))) 86
{extendhisi2} (nil))

(insn:HI 87 89 88 5 cor_h.c:129 (set (reg:SI 1 $c1 [ ivtmp.101 ])
(reg:SI 208 [ ivtmp.101 ])) 45 {*movsi} (nil))

(insn:HI 88 87 270 5 cor_h.c:129 (set (reg:SI 2 $c2 [ h ])
(reg/v/f:SI 236 [ h ])) 45 {*movsi} (nil))

(insn:HI 270 88 91 5 cor_h.c:129 (set (reg:SI 4 $c4)
(const_int 0 [0x0])) 45 {*movsi} (nil))

(call_insn:HI 91 270 92 5 cor_h.c:129 (parallel [
(set (reg:SI 1 $c1)
(call (mem:SI (symbol_ref:SI
("DotProductWithoutShift") [flags 0x41] ) [0 S4 A32])
(const_int 0 [0x0])))
(use (const_int 0 [0x0]))
(clobber (reg:SI 31 $link))
]) 42 {*call_value_direct} (expr_list:REG_DEAD (reg:SI 4 $c4)
(expr_list:REG_DEAD (reg:SI 3 $c3 [ ivtmp.103 ])
(expr_list:REG_DEAD (reg:SI 2 $c2 [ h ])
(nil
(expr_list:REG_DEP_TRUE (use (reg:SI 4 $c4))
(expr_list:REG_DEP_TRUE (use (reg:SI 3 $c3 [ ivtmp.103 ]))
(expr_list:REG_DEP_TRUE (use (reg:SI 2 $c2 [ h ]))
(expr_list:REG_DEP_TRUE (use (reg:SI 1 $c1 [ ivtmp.101 ]))
(nil))

(insn:HI 92 91 285 5 cor_h.c:129 (set (reg/v:SI 230 [ s ])
(reg:SI 1 $c1)) 45 {*movsi} (expr_list:REG_DEAD (reg:SI 1 $c1)
(nil)))

(jump_insn:HI 285 92 286 5 (set (pc)
(label_ref 118)) 8 {jump} (nil))
;; End of basic block 5 -> ( 10)

The register $c1 is used to pass the first argument to the function
DotProductWithoutShift.
On encountering the call_insn (insn no 91), global.c inserts a store
to save the register $c16 (which contains a variable 'tot'; $c16
is a caller-save register).
Hence the following insn is inserted just before the call to
DotProductWithoutShift.

(insn 309 270 91 5 cor_h.c:129 (set (mem/c:SI (plus:SI (reg/f:SI 29 $sp)
(const_int 176 [0xb0])) [11 S4 A32])
(reg:SI 16 $c16)) 45 {*movsi} (nil))

However the index 176 is too large and
 (plus:SI (reg/f:SI 29 $sp)
(const_int 176 [0xb0])) needs to be reloaded.

$c1 gets chosen for this reload and now the dump snippet looks like

(note:HI 85 84 86 5 [bb 5] NOTE_INSN_BASIC_BLOCK)

(note:HI 86 85 89 5 NOTE_INSN_DELETED)

(insn:HI 89 86 87 5 cor_h.c:129 (set (reg:SI 3 $c3 [ ivtmp.103 ])
(sign_extend:SI (reg:HI 14 $c14 [orig:206 ivtmp.103 ] [206])))
86 {extendhisi2} (nil))

(insn:HI 87 89 88 5 cor_h.c:129 (set (reg:SI 1 $c1 [ ivtmp.101 ])
(reg:SI 8 $c8 [orig:208 ivtmp.101 ] [208])) 45 {*movsi} (nil))

(insn:HI 88 87 270 5 cor_h.c:129 (set (reg:SI 2 $c2 [ h ])
(reg/v/f:SI 22 $c22 [orig:236 h ] [236])) 45 {*movsi} (nil))

(insn:HI 270 88 329 5 cor_h.c:129 (set (reg:SI 4 $c4)
(const_int 0 [0x0])) 45 {*movsi} (nil))

(insn 329 270 330 5 cor_h.c:129 (set (reg:SI 1 $c1)
(const_int 176 [0xb0])) 45 {*movsi} (nil))

(insn 330 329 309 5 cor_h.c:129 (set (reg:SI 1 $c1)
(plus:SI (reg:SI 1 $c1)
(reg/f:SI 29 $sp))) 65 {*addsi3} (expr_list:REG_EQUIV
(plus:SI (reg/f:SI 29 $sp)
(const_int 176 [0xb0]))
(nil)))

(insn 309 330 91 5 cor_h.c:129 (set (mem/c:SI (reg:SI 1 $c1) [11 S4 A32])
(reg:SI 16 $c16)) 45 {*movsi} (nil))

(call_insn:HI 91 309 332 5 cor_h.c:129 (parallel [
(set (reg:SI 1 $c1)
(call (mem:SI (symbol_ref:SI
("DotProductWithoutShift") [flags 0x41] ) [0 S4 A32])
(const_int 0 [0x0])))
(use (const_int 0 [0x0]))
(clobber (reg:SI 31 $link))
]) 42 {*call_value_direct} (nil)
(expr_list:REG_DEP_TRUE (use (reg:SI 4 $c4))
(expr_list:REG_DEP_TRUE (use (reg:SI 3 $c3 [ ivtmp.103 ]))
(expr_list:REG_DEP_TRUE (use (reg:SI 2 $c2 [ h ]))
(expr_list:REG_DEP_TRUE (use (reg:SI 1 $c1 [ ivtmp.101 ]))
(nil))

(insn 332 91 333 5 (set (reg:SI 4 $c4)
(const_int 176 [0xb0])) 45 {*movsi} (nil))

(insn 333 332 310 5 (set (reg:SI 4 $c4)
(plus:SI (reg:SI 4 $c4)
(reg/f:SI 29 $sp))) 65 {*addsi3} (expr_list:REG_EQUIV
(plus:SI (reg/f:SI 29 $sp)
(const_int 176 [0xb0]))
(nil)))

(insn 310 333 285 5 (set (reg:SI 16 $c16)
(mem/c:SI (reg:SI 4 $c4) [11 S4 A32])) 45 {*movsi} (nil))

(jump_insn:HI 285 310 286 5 (set (pc)
(label_ref 118)) 8 {jump} (nil))
;; End of basic block 5 -> ( 10)

Clearly the register $c1, which after insn 87 holds the first argument of
the function DotProductWithoutShift, is overwritten.

The file.c.176r.greg for insn 309 says

"Spilling for insn 309.
Using reg 6 for reload 0"

and indeed rld[0].regno is 6 and rld[0].in is

(plus:SI 

PR target/30961 (was: Re: GCC 4.3.0 Status Report (2007-11-04))

2007-11-06 Thread Ulrich Weigand
Mark Mitchell wrote:
> H.J. Lu wrote:
> 
> > http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01865.html
> > 
> > which involves reload.
> 
> I'm not going to try to wade into reload.  Ulrich, Eric, Ian -- would
> one of you please review this patch?

@@ -1821,6 +1835,18 @@ find_reg (struct insn_chain *chain, int 
this_cost--;
  if (rl->out && REG_P (rl->out) && REGNO (rl->out) == regno)
this_cost--;
+#ifdef SECONDARY_MEMORY_NEEDED
+ /* If a memory location is needed for rl->in and dest_reg
+is usable, we will favor it.  */
+ else if (dest_reg == regno
+  && rl->in
+  && REG_P (rl->in)
+  && REGNO (rl->in) < FIRST_PSEUDO_REGISTER
+  && SECONDARY_MEMORY_NEEDED (REGNO_REG_CLASS (REGNO (rl->in)),
+  rl->class,
+  rl->mode))
+   this_cost = 0;
+#endif


Hmm, this isn't really related to secondary memory.  In general,
if we have a simple move with input reload, the destination of 
the move should be the preferred reload register.  In fact, there
already is code in find_reloads that is supposed to address this
problem:

  /* Special case a simple move with an input reload and a
 destination of a hard reg, if the hard reg is ok, use it.  */
  for (i = 0; i < n_reloads; i++)
if (rld[i].when_needed == RELOAD_FOR_INPUT
&& GET_CODE (PATTERN (insn)) == SET
&& REG_P (SET_DEST (PATTERN (insn)))
&& SET_SRC (PATTERN (insn)) == rld[i].in
&& !elimination_target_reg_p (SET_DEST (PATTERN (insn))))
  {

This does not trigger in the given test case because the SUBREG
interferes:

Reloads for insn # 6
Reload 0: reload_in (DF) = (reg:DF 5 di)
SSE_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
reload_in_reg: (subreg:DF (reg/v:DI 5 di [orig:59 in ] [59]) 0)

(insn:HI 6 3 10 2 xxx.i:4
(set (reg:DF 21 xmm0 [orig:58  ] [58])
(subreg:DF (reg/v:DI 5 di [orig:59 in ] [59]) 0)) 102
 {*movdf_integer_rex64} (expr_list:REG_DEAD (reg/v:DI 5 di [orig:59 in ] [59])
(nil)))

Note how reload_in is not equal to the SET_SRC, but reload_in_reg is.
In that case, the same special case should apply.

The following patch fixes the test case for me:

Index: gcc/reload.c
===
--- gcc/reload.c (revision 129925)
+++ gcc/reload.c (working copy)
@@ -4462,7 +4462,8 @@
 if (rld[i].when_needed == RELOAD_FOR_INPUT
&& GET_CODE (PATTERN (insn)) == SET
&& REG_P (SET_DEST (PATTERN (insn)))
-   && SET_SRC (PATTERN (insn)) == rld[i].in
+   && (SET_SRC (PATTERN (insn)) == rld[i].in
+   || SET_SRC (PATTERN (insn)) == rld[i].in_reg)
&& !elimination_target_reg_p (SET_DEST (PATTERN (insn))))
   {
rtx dest = SET_DEST (PATTERN (insn));
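
To make the difference between the two comparisons concrete, here is
a toy model of the test before and after the change.  It is only a
sketch: toy_rtx, struct toy_reload, matches_old and matches_new are
invented stand-ins, not GCC's real rtx or struct reload, and all it
keeps from the real code is that operands are compared by pointer
identity.

```c
#include <stdbool.h>

/* Toy stand-ins, NOT GCC's real data structures: an "rtx" here is
   just an opaque pointer that is compared by identity. */
typedef const void *toy_rtx;

struct toy_reload {
    toy_rtx in;      /* value actually reloaded, e.g. (reg:DF 5 di) */
    toy_rtx in_reg;  /* operand as written in the insn, e.g. the SUBREG */
};

/* Old test: only the reloaded value itself is compared. */
bool matches_old(const struct toy_reload *r, toy_rtx set_src)
{
    return set_src == r->in;
}

/* Patched test: the operand as written in the insn is accepted too. */
bool matches_new(const struct toy_reload *r, toy_rtx set_src)
{
    return set_src == r->in || set_src == r->in_reg;
}
```

With a SET_SRC that is the SUBREG, the old test fails and the new one
succeeds, which mirrors the situation in the reload dump above.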


H.J., could you verify that this solves your problem?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  [EMAIL PROTECTED]


Re: strict aliasing

2007-11-06 Thread Ross Ridge
skaller writes:
> Yes but I still don't understand. The claim was that the assignment
> might modify p. This is contra-indicated since p is a pointer
> to an int, whereas the value being assigned is an int.

The claim was, in the context of the example given, "Without strict
aliasing being enabled, the compiler can't assume that the assignment
'p[i] = 1' won't change 'p'".  If you do enable GCC's strict type-based
aliasing analysis, then GCC will assume that the assignment can't change
"p" and generate better code.
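
Written out by hand, the two shapes of the loop look roughly like
this.  This is a sketch of the effect, not of GCC's actual output, and
fill_hoisted and fill_reloading are names made up for the comparison:

```c
/* Hand-written equivalent of what the optimizer may do for the loop
   in foo() once it knows the int stores cannot modify the pointer.
   Both function names are invented for this sketch. */
void fill_hoisted(int **pp, unsigned len)
{
    int *base = *pp;          /* load the pointer once */
    unsigned i;
    for (i = 0; i < len; i++)
        base[i] = 1;          /* no reload of *pp inside the loop */
}

/* What has to happen otherwise: the pointer is read again on every
   iteration, in case the previous store changed it. */
void fill_reloading(int **pp, unsigned len)
{
    unsigned i;
    for (i = 0; i < len; i++)
        (*pp)[i] = 1;
}
```

Both produce the same result for well-behaved callers; the first form
is also the one a vectorizer can work with.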

Another way to put the "strict alias" rule is that two memory references
can't refer to an overlapping region of memory if the types being
referenced are different, with the exception of certain combinations of
types or in certain cases in a union.  For cases where the "strict alias"
rule doesn't apply, either because of the types involved or because it's
disabled, "non-strict" alias analysis can find other cases where two
memory references can't overlap.  For example, it's obvious that the
assignment of one variable to another (e.g. "a = b") can't overlap because
all variables are allocated unique non-overlapping regions of memory.
A reference to an automatic variable can't overlap any reference to
memory through a pointer, if that automatic variable hasn't had its
address taken.  (Note how in my example, "p" has its address taken and
passed to another function so this rule doesn't apply.)
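
A small sketch of that last rule, with invented names (fill_private,
fill_escaped and remember are not from any real code base):

```c
/* All names here are invented for this sketch. */

/* "count" never has its address taken, so no store through "out"
   can touch it; it can stay in a register for the whole loop. */
unsigned fill_private(int *out, unsigned len)
{
    unsigned count = 0;
    unsigned i;
    for (i = 0; i < len; i++) {
        out[i] = 1;
        count++;
    }
    return count;
}

/* Stand-in for an external function that receives &count. */
static unsigned *escaped;
void remember(unsigned *p) { escaped = p; }

/* Here "count" has its address passed to remember(), so the rule no
   longer applies: a store through some pointer might reach it. */
unsigned fill_escaped(unsigned *out, unsigned len)
{
    unsigned count = 0;
    unsigned i;
    remember(&count);
    for (i = 0; i < len; i++) {
        out[i] = 1;
        count++;
    }
    return count;
}
```

In the second function the element type of "out" is deliberately the
same as the type of "count", so type-based analysis cannot help
either.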

Note the use of the word "reference" in the above paragraph means
any operation that causes memory to be accessed whether by reading or
writing it.  It doesn't mean only the use of C++ reference type.

Ross Ridge



Re: strict aliasing

2007-11-06 Thread Ian Lance Taylor
skaller <[EMAIL PROTECTED]> writes:

> On Tue, 2007-11-06 at 00:15 -0800, Ian Lance Taylor wrote:
> > skaller <[EMAIL PROTECTED]> writes:
> > 
> > > On Mon, 2007-11-05 at 14:30 -0500, Ross Ridge wrote:
> > > 
> > > > One example of where it hurts on just about any platform is something
> > > > like this:
> > > > 
> > > > void allocate(int **p, unsigned len);
> > > > 
> > > > int *foo(unsigned len) {
> > > > int *p;
> > > > unsigned i;
> > > > allocate(&p, len);
> > > > for (i = 0; i < len; i++) 
> > > > p[i] = 1;
> > > > return p;
> > > > }
> 
> > The assignment is indeed of an int, but it uses a pointer.  Strict
> > aliasing only refers to loads and stores which use pointers.  The
> > type-based alias analysis is done on the types to which those pointers
> > point.
> > 
> > Given two pointers, "T1* p1" and "T2* p2", type based alias analysis
> > (aka strict aliasing) lets us conclude that p1 and p2 point to
> > different memory if T1 and T2 are incompatible types with respect to
> > aliasing.  The rules for alias compatibility come straight from the C
> > standard.
> 
> Yes but I still don't understand. The claim was that the assignment
> might modify p. This is contra-indicated since p is a pointer
> to an int, whereas the value being assigned is an int.

Right.  That is type based aliasing that tells you that.


> In particular, let me try to build a model: partition all the
> types into classes which can alias each other, in two ways:
> with strict aliasing (S), and with rough aliasing (R).

I recommend that you just read the standard and see the real aliasing
rules.


> However this is a different circumstance. My understanding
> was that gcc with strict aliasing turned off would optimise
> the code above the same way as if it were on. On amd64
> an int cannot alias a pointer (int is 32 bits, pointer is
> 64 bits).

What, other than strict aliasing, tells you that the two types can not
be aliased?  It is perfectly possible to do a 32-bit write to half of
a 64-bit value.
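
As a sketch of what such a write looks like (overwrite_first_half is a
made-up name, and memcpy is used only so that the sketch itself is
well defined):

```c
#include <stdint.h>
#include <string.h>

/* Overwrite the first four bytes of a 64-bit object with a 32-bit
   value.  Which half of the 64-bit value changes depends on the
   target's byte order. */
uint64_t overwrite_first_half(uint64_t wide, uint32_t narrow)
{
    memcpy(&wide, &narrow, sizeof narrow);
    return wide;
}
```

Nothing about the sizes prevents this; on a 64-bit target the same
thing can happen to the bytes of a pointer.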

Ian


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 06:29 -0800, Ian Lance Taylor wrote:
> skaller <[EMAIL PROTECTED]> writes:

> I recommend that you just read the standard and see the real aliasing
> rules.

I don't care about that, I'm trying to discover what
-fno-strict-aliasing actually does.

You hinted that it weakens the rules, without entirely
disabling type based optimisations.

> What, other than strict aliasing, tells you that the two types can not
> be aliased?  It is perfectly possible to do a 32-bit write to half of
> a 64-bit value.

What I'm mainly interested in is how "valid" aliases where
you write one and read the other will be affected by the
-fno-strict-aliasing switch.

The code does what it has to, so it doesn't matter what
the Standard says is allowed and what it says is not:
what matters is whether I have to use -fno-strict-aliasing
to ensure my code will work as I expect, and what that
will cost in terms of optimising unrelated code.

As a contrived example:

void f() {
struct X { int x; X(int a) : x(a) {} };
X w(1);
int *px = (int*)(void*)&w;
assert( (void*)px == (void*)&w);
assert( (void*)px == (void*)(&w.x));
cout << *px << endl;
}

I expect this to print 1 every time, despite the fact that
px and &w are pointers to the same store seen as different
types. I believe the compiler is entitled under strict aliasing
rules to completely elide the constructor application.
However the assertions cannot fail.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 09:37 -0500, Ross Ridge wrote:
> skaller writes:
> > Yes but I still don't understand. The claim was that the assignment
> > might modify p. This is contra-indicated since p is a pointer
> > to an int, whereas the value being assigned is an int.
> 
> The claim was, in the context of the example given, "Without strict
> aliasing being enabled, the compiler can't assume that the assignment
> 'p[i] = 1' won't change 'p'". 

In this case I think it can. More precisely, IF the assignment 
changes p, another rule is broken (1 isn't a pointer), and
all bets are off, so  the compiler can go ahead and assume
p can't be modified -- on an amd64 anyhow (since int isn't
intptr_t).

However if the example is changed so we have a T* being assigned
to an element of an array of U*, the argument holds (assuming
pointers all have the same representation).

> Another way to put the "strict alias" rule

I do know what the rule means: what I don't understand is
exactly what specifying -fno-strict-aliasing does, i.e. what
would be allowed and what would not.

As the example above shows, aliasing rules aren't the only
way to determine that stores cannot overlap.

In particular there is a difference between:

int x;
float *px = (float*)(void*)&x;

and 

struct X { int a; };
struct Y { int b; };
X x;
Y *px = (Y*)(void*)&x;

In the first case, the compiler can assume no overlap,
and generate optimised code which will not do what you 
think because floats and ints aren't layout compatible.

But in the second case X and Y are layout compatible,
guaranteed. So in this case, the compiler can ALSO
generate optimised code that doesn't do what you want
unless -fno-strict-aliasing is specified.

The point is the switch has no effect on the first 
case: it can be optimised anyhow, because any code
which can tell the difference necessarily breaks
another rule.

Hope this makes sense. Clearly I want the second
case to work as expected, whereas I want the
first case to be optimised no matter what
the -fno-strict-aliasing switch is.
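
For what it's worth, in plain C the second case can be written with a
union, and the GCC manual says that reading a member other than the
one last written is allowed even with -fstrict-aliasing, provided the
access goes through the union type.  A sketch, with union XY and
write_x_read_y as names made up here:

```c
struct X { int a; };
struct Y { int b; };

/* Both views of the same storage, declared up front. */
union XY {
    struct X x;
    struct Y y;
};

int write_x_read_y(int value)
{
    union XY u;
    u.x.a = value;
    return u.y.b;   /* the access goes through the union type */
}
```

This is the form that is awkward in C++ once the members have
constructors.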

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: strict aliasing

2007-11-06 Thread Joe Buck
On Wed, Nov 07, 2007 at 02:30:44AM +1100, skaller wrote:
> > The claim was, in the context of the example given, "Without strict
> > aliasing being enabled, the compiler can't assume that the assignment
> > 'p[i] = 1' won't change 'p'". 
> 
> In this case I think it can. More precisely, IF the assignment 
> changes p, another rule is broken (1 isn't a pointer), and
> all bets are off, so  the compiler can go ahead and assume
> p can't be modified -- on an amd64 anyhow (since int isn't
> intptr_t).

You are implicitly assuming strict aliasing.  Strict aliasing means
that you reject a possible aliasing based on the type, as you are
doing here (1 isn't a pointer).  If you turn it off, the compiler
cannot make that assumption: the user might be doing something
strange with casts.

Now it appears that you want to make some kind of intermediate assumption
(semi-strict aliasing?), where pointers of different types are allowed to
alias while ints can't alias with pointers.  But that's a rule you're
making up yourself, without any support in the standard or in the GCC
implementation.



Re: strict aliasing

2007-11-06 Thread Joe Buck
On Tue, Nov 06, 2007 at 12:15:17AM -0800, Ian Lance Taylor wrote:
> The assignment is indeed of an int, but it uses a pointer.  Strict
> aliasing only refers to loads and stores which use pointers.  The
> type-based alias analysis is done on the types to which those pointers
> point.

Minor nit: here "pointers" includes C++ references, as well as
pointer-valued expressions, as in

   long lv;
   *((int *)&lv) = 1; /* strict aliasing violation */


Re: strict aliasing

2007-11-06 Thread Ian Lance Taylor
skaller <[EMAIL PROTECTED]> writes:

> On Tue, 2007-11-06 at 06:29 -0800, Ian Lance Taylor wrote:
> > skaller <[EMAIL PROTECTED]> writes:
> 
> > I recommend that you just read the standard and see the real aliasing
> > rules.
> 
> I don't care about that, I'm trying to discover what
> -fno-strict-aliasing actually does.
> 
> You hinted that it weakens the rules, without entirely
> disabling type based optimisations.

I have no idea what I hinted.  What -fno-strict-aliasing does is turn
off type-based alias analysis.  The documentation is accurate.


> > What, other than strict aliasing, tells you that the two types can not
> > be aliased?  It is perfectly possible to do a 32-bit write to half of
> > a 64-bit value.
> 
> What I'm mainly interested in is how "valid" aliases where
> you write one and read the other will be affected by the
> -fno-strict-aliasing switch.
> 
> The code does what it has to, so it doesn't matter what
> the Standard says is allowed and what it says is not:
> what matters is whether I have to use -fno-strict-aliasing
> to ensure my code will work as I expect, and what that
> will cost in terms of optimising unrelated code.
> 
> As a contrived example:
> 
>   void f() {
>   struct X { int x; X(int a) : x(a) {} };
>   X w(1);
>   int *px = (int*)(void*)&w;
>   assert( (void*)px == (void*)&w);
>   assert( (void*)px == (void*)(&w.x));
>   cout << *px << endl;
>   }
> 
> I expect this to print 1 every time, despite the fact that
> px and &w are pointers to the same store seen as different
> types. I believe the compiler is entitled under strict aliasing
> rules to completely elide the constructor application.
> However the assertions cannot fail.

In this particular case the compiler is likely to act as you expect.
But, yes, in general, you need -fno-strict-aliasing if you want these
kinds of pointer type casts to work as you expect.

Ian


Re: strict aliasing

2007-11-06 Thread Ian Lance Taylor
Joe Buck <[EMAIL PROTECTED]> writes:

> On Wed, Nov 07, 2007 at 02:30:44AM +1100, skaller wrote:
> > > The claim was, in the context of the example given, "Without strict
> > > aliasing being enabled, the compiler can't assume that the assignment
> > > 'p[i] = 1' won't change 'p'". 
> > 
> > In this case I think it can. More precisely, IF the assignment 
> > > changes p, another rule is broken (1 isn't a pointer), and
> > all bets are off, so  the compiler can go ahead and assume
> > p can't be modified -- on an amd64 anyhow (since int isn't
> > intptr_t).
> 
> You are implicitly assuming strict aliasing.  Strict aliasing means
> that you reject a possible aliasing based on the type, as you are
> doing here (1 isn't a pointer).  If you turn it off, the compiler
> cannot make that assumption: the user might be doing something
> strange with casts.

In particular people really do write code like
int addr = 0x12345678;
char* p = (char*)&addr;
*p = 1;
and
int f[2] = { 0x8000, 0 };
double d = *(double*)f;
This is not valid C/C++ code.  But since there is quite a lot of it
out there, the -fno-strict-aliasing option makes it work correctly.
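
For the second pattern, one way to get the same bits without the
pointer cast is to copy the object representation with memcpy, which
compilers typically reduce to a plain move when optimizing.  A minimal
sketch, assuming sizeof(double) == 2 * sizeof(int); ints_to_double is
a name made up here:

```c
#include <string.h>

/* Compile-time check of the size assumption stated above. */
typedef char ints_fill_a_double[(sizeof(double) == 2 * sizeof(int)) ? 1 : -1];

/* Build a double from the bytes of two ints without dereferencing
   a cast pointer. */
double ints_to_double(const int f[2])
{
    double d;
    memcpy(&d, f, sizeof d);
    return d;
}
```

Written this way the code does not depend on the aliasing switch at
all.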

Ian


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 07:49 -0800, Joe Buck wrote:
> On Wed, Nov 07, 2007 at 02:30:44AM +1100, skaller wrote:

> Now it appears that you want to make some kind of intermediate assumption
> (semi-strict aliasing?), where pointers of different types are allowed to
> alias while ints can't alias with pointers.  

Yes. I want layout compatible types to be allowed to alias but
not others. In other words, where the access would be valid
provided it isn't optimised, don't optimise it. But where
the access would not be valid, optimise away.

Roughly speaking I want structural typing instead of nominal
typing. That is, when -fno-strict-aliasing is specified
I still want the type based optimisations to be applied,
but I want them based on the underlying structural types, 
not nominal types.

So for example

struct X { int x; };
struct Y { int z; };

are distinct nominal types, but they're structurally equivalent.

This is what I had hoped the 'strict' in the switch meant:
it turned off *strictness* by relaxing the notion of type
from nominal to structural.. but still did the analysis
and optimisations.

In C++ this is essential because constructable types cannot
be aliased in a union. This problem doesn't arise like that
in C.

You may note that even C already has structural typing
and aliasing rules already apply it of necessity: 
technically a T* and a T const* are distinct types.
But clearly they can alias.
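
A tiny C illustration of that, with read_after_write as an invented
name:

```c
/* A store through "int *" has to be assumed to affect a later load
   through "const int *": the qualifier does not separate the types
   for aliasing purposes. */
int read_after_write(const int *ro, int *rw)
{
    int before = *ro;
    *rw = before + 1;
    return *ro;   /* must be reloaded: ro and rw may alias */
}
```

Passing the same address for both parameters is legal, and the
function then returns the incremented value.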

BTW: yes I understand I ask for something gcc may not be
doing, I'm not asking for a change, just to understand
what it actually does. I guess that the optimisations
are defeated by, for example, subroutine calls across
translation unit boundaries. This provides part of
what I want: local code is still optimised, but aliases
across unit boundaries aren't, so whilst that model
applies I don't actually need the switch.

Unfortunately there are other cases (in C++) where I have
to break the rules, but I still don't want optimisations
disabled in other code. I want my cake and to eat it too .. :)

Without delving into the Standard, nor gcc implementation,
there are clearly isolated places where the optimisation
must be disabled. The most obvious is C++ reinterpret_cast<>
which *always* breaks the strict aliasing rules .. that
is indeed its purpose. 


-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 07:58 -0800, Ian Lance Taylor wrote:
> Joe Buck <[EMAIL PROTECTED]> writes:

> In particular people really do write code like
> int addr = 0x12345678;
> char* p = (char*)&addr;
> *p = 1;
> and
> int f[2] = { 0x8000, 0 };
> double d = *(double*)f;
> This is not valid C/C++ code.  

What you mean is that it is not strictly conforming C code.
[You cannot say that for C++, there is no notion of strictly
conforming C++ code]

> But since there is quite a lot of it
> out there, the -fno-strict-aliasing option makes it work correctly.

Yes. This is overkill for me: making that work would disable
too many optimisations for my taste.

BTW: gcc handles these rules very cleverly indeed. I have
played with some code and things like union-ing an unsigned
char array correctly defeat the optimisations. I'm quite
surprised, this is very hard to get right.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: PR target/30961 (was: Re: GCC 4.3.0 Status Report (2007-11-04))

2007-11-06 Thread H.J. Lu
On Tue, Nov 06, 2007 at 03:30:04PM +0100, Ulrich Weigand wrote:
> Mark Mitchell wrote:
> > H.J. Lu wrote:
> > 
> > > http://gcc.gnu.org/ml/gcc-patches/2007-08/msg01865.html
> > > 
> > > which involves reload.
> > 
> > I'm not going to try to wade into reload.  Ulrich, Eric, Ian -- would
> > one of you please review this patch?
> 
> @@ -1821,6 +1835,18 @@ find_reg (struct insn_chain *chain, int 
>   this_cost--;
> if (rl->out && REG_P (rl->out) && REGNO (rl->out) == regno)
>   this_cost--;
> +#ifdef SECONDARY_MEMORY_NEEDED
> +   /* If a memory location is needed for rl->in and dest_reg
> +  is usable, we will favor it.  */
> +   else if (dest_reg == regno
> +&& rl->in
> +&& REG_P (rl->in)
> +&& REGNO (rl->in) < FIRST_PSEUDO_REGISTER
> +&& SECONDARY_MEMORY_NEEDED (REGNO_REG_CLASS (REGNO (rl->in)),
> +rl->class,
> +rl->mode))
> + this_cost = 0;
> +#endif
> 
> 
> Hmm, this isn't really related to secondary memory.  In general,
> if we have a simple move with input reload, the destination of 
> the move should be the preferred reload register.  In fact, there
> already is code in find_reloads that is supposed to address this
> problem:
> 
>   /* Special case a simple move with an input reload and a
>  destination of a hard reg, if the hard reg is ok, use it.  */
>   for (i = 0; i < n_reloads; i++)
> if (rld[i].when_needed == RELOAD_FOR_INPUT
> && GET_CODE (PATTERN (insn)) == SET
> && REG_P (SET_DEST (PATTERN (insn)))
> && SET_SRC (PATTERN (insn)) == rld[i].in
> && !elimination_target_reg_p (SET_DEST (PATTERN (insn))))
>   {
> 
> This does not trigger in the given test case because the SUBREG
> interferes:
> 
> Reloads for insn # 6
> Reload 0: reload_in (DF) = (reg:DF 5 di)
> SSE_REGS, RELOAD_FOR_INPUT (opnum = 1), can't combine
> reload_in_reg: (subreg:DF (reg/v:DI 5 di [orig:59 in ] [59]) 0)
> 
> (insn:HI 6 3 10 2 xxx.i:4
> (set (reg:DF 21 xmm0 [orig:58  ] [58])
> (subreg:DF (reg/v:DI 5 di [orig:59 in ] [59]) 0)) 102
>  {*movdf_integer_rex64} (expr_list:REG_DEAD (reg/v:DI 5 di [orig:59 in ] [59])
> (nil)))
> 
> Note how reload_in is not equal to the SET_SRC, but reload_in_reg is.
> In that case, the same special case should apply.
> 
> The following patch fixes the test case for me:
> 
> Index: gcc/reload.c
> ===
> --- gcc/reload.c  (revision 129925)
> +++ gcc/reload.c  (working copy)
> @@ -4462,7 +4462,8 @@
>  if (rld[i].when_needed == RELOAD_FOR_INPUT
>   && GET_CODE (PATTERN (insn)) == SET
>   && REG_P (SET_DEST (PATTERN (insn)))
> - && SET_SRC (PATTERN (insn)) == rld[i].in
> + && (SET_SRC (PATTERN (insn)) == rld[i].in
> + || SET_SRC (PATTERN (insn)) == rld[i].in_reg)
>   && !elimination_target_reg_p (SET_DEST (PATTERN (insn))))
>{
>   rtx dest = SET_DEST (PATTERN (insn));
> 
> 
> H.J., could you verify that this solves your problem?
> 

Yes, it works for me. I tested it on Linux/ia32, Linux/intel64
and linux/ia64. There are no regressions.

Thanks.


H.J.

gcc/

2007-11-06  Ulrich Weigand  <[EMAIL PROTECTED]>

PR target/30961
* reload.c (find_reloads): Also check in_reg when handling a
simple move with an input reload and a destination of a hard
register.

gcc/testsuite/

2007-11-06  H.J. Lu  <[EMAIL PROTECTED]>

PR target/30961
* gcc.target/i386/pr30961-1.c: New.

--- gcc/reload.c.second 2007-09-08 09:50:55.0 -0700
+++ gcc/reload.c 2007-11-06 07:43:52.0 -0800
@@ -4464,7 +4464,8 @@ find_reloads (rtx insn, int replace, int
 if (rld[i].when_needed == RELOAD_FOR_INPUT
&& GET_CODE (PATTERN (insn)) == SET
&& REG_P (SET_DEST (PATTERN (insn)))
-   && SET_SRC (PATTERN (insn)) == rld[i].in)
+   && (SET_SRC (PATTERN (insn)) == rld[i].in
+   || SET_SRC (PATTERN (insn)) == rld[i].in_reg))
   {
rtx dest = SET_DEST (PATTERN (insn));
unsigned int regno = REGNO (dest);
--- gcc/testsuite/gcc.target/i386/pr30961-1.c.second 2007-11-06 07:41:26.0 -0800
+++ gcc/testsuite/gcc.target/i386/pr30961-1.c 2007-11-06 07:41:26.0 -0800
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2" } */
+
+double
+convert (long long in)
+{
+  double f;
+  __builtin_memcpy( &f, &in, sizeof( in ) );
+  return f;
+}
+
+/* { dg-final { scan-assembler-not "movapd" } } */


Re: strict aliasing

2007-11-06 Thread Andrew Haley
Joe Buck writes:
 > On Wed, Nov 07, 2007 at 04:06:11AM +1100, skaller wrote:

 > > I understand I ask for something gcc may not be doing, I'm not
 > > asking for a change, just to understand what it actually does.
 > 
 > You are misusing C++, I'm afraid, and there are no promises that
 > some day a new optimization won't break your code.  I suggest
 > consulting a C++ experts' forum, like comp.lang.c++.moderated,
 > for ideas on how to do what you want to do in standard C++.

I agree.  This is way off-topic for [EMAIL PROTECTED]  We have been
very patient already, and it is time for it to end.

Andrew.


Re: strict aliasing

2007-11-06 Thread Joe Buck
On Wed, Nov 07, 2007 at 04:06:11AM +1100, skaller wrote:
> On Tue, 2007-11-06 at 07:49 -0800, Joe Buck wrote:
> > Now it appears that you want to make some kind of intermediate assumption
> > (semi-strict aliasing?), where pointers of different types are allowed to
> > alias while ints can't alias with pointers.  
> 
> Yes. I want layout compatible types to be allowed to alias but
> not others. In other words, where the access would be valid
> provided it isn't optimised, don't optimise it. But where
> the access would not be valid, optimise away.

The problem is that this doesn't appear to be what anyone else wants.
Your rule would still break some existing code that needs
-fno-strict-aliasing, but allows some aliasing that the C standard
does not allow.  It seems to be a rule that is tailored to your
personal programming style.

> In C++ this is essential because constructable types cannot
> be aliased in a union. This problem doesn't arise like that
> in C.

One way to do this in C++ is to derive the different representations that
might appear in your "union" from a common base class, and use placement
new to lay them out.  There are probably other ways as well.  Your
hairy casts are, IMHO, quite risky.

> BTW: yes I understand I ask for something gcc may not be
> doing, I'm not asking for a change, just to understand
> what it actually does.

You are misusing C++, I'm afraid, and there are no promises that
some day a new optimization won't break your code.  I suggest
consulting a C++ experts' forum, like comp.lang.c++.moderated,
for ideas on how to do what you want to do in standard C++.


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 09:32 -0800, Joe Buck wrote:
> On Wed, Nov 07, 2007 at 04:06:11AM +1100, skaller wrote:
> > On Tue, 2007-11-06 at 07:49 -0800, Joe Buck wrote:
> > > Now it appears that you want to make some kind of intermediate assumption
> > > (semi-strict aliasing?), where pointers of different types are allowed to
> > > alias while ints can't alias with pointers.  
> > 
> > Yes. I want layout compatible types to be allowed to alias but
> > not others. In other words, where the access would be valid
> > provided it isn't optimised, don't optimise it. But where
> > the access would not be valid, optimise away.
> 
> The problem is that this doesn't appear to be what anyone else wants.

Yes, but that may be because they have no idea what they
need because at the moment the optimisations are defeated
in other ways, such as by crossing subroutine or
translation unit boundaries.

> Your rule would still break some existing code that needs
> -fno-strict-aliasing, but allows some aliasing that the C standard
> does not allow. 

Yes, so perhaps a different switch would avoid that.
[Note again this is not a feature request, just a discussion
where I am trying to learn what gcc does]

>  It seems to be a rule that is tailored to your
> personal programming style.

It seems that way, but you may be surprised how much
code 'legitimately' breaks the rules.

> > In C++ this is essential because constructable types cannot
> > be aliased in a union. This problem doesn't arise like that
> > in C.
> 
> One way to do this in C++ is to derive the different representations that
> might appear in your "union" from a common base class, and use placement
> new to lay them out. 

I don't understand. You cannot put ANY constructable types
in a union. So for example:

struct X { string x; };

cannot go in a union, even though no constructor is written,
because 'string' has a constructor, so X has one generated,
and thus X is also constructable.

Using a cast instead of a union is another way to solve this,
but then as we know the strict aliasing rules might get
in the way. Using 'placement new' still requires a cast,
and it doesn't solve the problem: I need an *expression*
which is an initialised first class array.

That's what the casts are for: making first class
initialised array expressions. It works, even with
strict aliasing on.. at the moment.

A fast conforming solution isn't possible AFAIK.

[BTW: this is only one of the aliasing hacks I use]

>  There are probably other ways as well.  Your
> hairy casts are, IMHO, quite risky.

I know. I don't like it, but this is the best alternative.
C++ is broken in a number of places. This is one of them.

> You are misusing C++, I'm afraid, and there are no promises that
> some day a new optimization won't break your code. 

I know, but as I said, C++ is broken and so there is
no help for it. In this case the safe alternative is 
going to be much slower and may not work:

struct X {
  T data[10];
  X(T d1, T d2, ... )
  { data[0]=d1; data[1]=d2; ... }
};

This will not work unless T has a default constructor.
It is also slow, because the default constructors are
all applied first, then assignments done. Whereas my
hacked up code initialises the array directly by 
remodelling it as a set of discrete variables,
then aliasing them as an array.

C++ 2010 may solve this by finally providing the inline
array class I required ~15 years ago.
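
For comparison, here is a minimal sketch of direct member-array
initialisation, assuming a compiler that implements the brace
initialisation of the later C++11 standard. The element type T, the
three-element size and the helpers at() and sum() are hypothetical and
only serve the illustration; this is not the generated code discussed
above.

```cpp
#include <cassert>

// Hypothetical element type with no default constructor.
struct T {
    int v;
    explicit T(int x) : v(x) {}
};

struct X {
    T data[3];

    // List-initialisation of the array member (C++11 and later): each
    // element is copy-initialised straight from an argument, so T's
    // missing default constructor is never needed and there is no
    // construct-then-assign step.
    X(T d1, T d2, T d3) : data{d1, d2, d3} {}

    // Bounds-checked element access.
    T at(int i) const {
        assert(i >= 0 && i < 3);
        return data[i];
    }
};

// Sums the stored values, to show the array is usable as an array.
inline int sum(const X& x) {
    int total = 0;
    for (const T& t : x.data) total += t.v;
    return total;
}
```

With this, X x(T(1), T(2), T(3)) builds the array without any casts,
and sum(x) walks it like any other array.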

>  I suggest
> consulting a C++ experts' forum, like comp.lang.c++.moderated,
> for ideas on how to do what you want to do in standard C++.

Thanks, but I AM a C++ expert, and I have already asked.

[Boost's aligned_storage solves one of my problems,
but it isn't portable, and it is also very ugly to
use .. I also need the C++ code my compiler generates
to be readable]


-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: strict aliasing

2007-11-06 Thread skaller

On Tue, 2007-11-06 at 17:39 +, Andrew Haley wrote:
> Joe Buck writes:

> I agree.  This is way off-topic for [EMAIL PROTECTED]  We have been
> very patient already, and it is time for it to end.

Thanks for your patience. I don't mean to go off topic,
just to discover exactly what gcc does with -fno-strict-aliasing.
I think that is itself on-topic, and I don't see any other place
I could find out (other than reading all the code myself,
which is a rather daunting task :)

Anyhow I'm satisfied with the answers, thanks again everyone
who commented.

Just for context: my current interest is learning about
exactly what optimisations gcc/g++ does and doesn't
do in what circumstances, because I have a compiler
which generates C++: to get my compiler to generate
fast code, I have to know what kind of code g++ likes
and what it doesn't handle so well.

Again, thanks for your patience.

-- 
John Skaller 
Felix, successor to C++: http://felix.sf.net


Re: PR target/30961 (was: Re: GCC 4.3.0 Status Report (2007-11-04))

2007-11-06 Thread Ulrich Weigand
H.J. Lu wrote:

> Yes, it works for me. I tested it on Linux/ia32, Linux/intel64
> and linux/ia64. There are no regressions.

Thanks for testing!

> gcc/
> 
> 2007-11-06  Ulrich Weigand  <[EMAIL PROTECTED]>
> 
>   PR target/30961
>   * reload.c (find_reloads): Also check in_reg when handling a
>   simple move with an input reload and a destination of a hard
>   register.
> 
> gcc/testsuite/
> 
> 2007-11-06  H.J. Lu  <[EMAIL PROTECTED]>
> 
>   PR target/30961
>   * gcc.target/i386/pr30961-1.c: New.

This is OK, please check it in.

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU Toolchain for Linux on System z and Cell BE
  [EMAIL PROTECTED]


Re: PR target/30961 (was: Re: GCC 4.3.0 Status Report (2007-11-04))

2007-11-06 Thread H.J. Lu
On Tue, Nov 06, 2007 at 07:40:00PM +0100, Ulrich Weigand wrote:
> H.J. Lu wrote:
> 
> > Yes, it works for me. I tested it on Linux/ia32, Linux/intel64
> > and linux/ia64. There are no regressions.
> 
> Thanks for testing!
> 
> > gcc/
> > 
> > 2007-11-06  Ulrich Weigand  <[EMAIL PROTECTED]>
> > 
> > PR target/30961
> > * reload.c (find_reloads): Also check in_reg when handling a
> > simple move with an input reload and a destination of a hard
> > register.
> > 
> > gcc/testsuite/
> > 
> > 2007-11-06  H.J. Lu  <[EMAIL PROTECTED]>
> > 
> > PR target/30961
> > * gcc.target/i386/pr30961-1.c: New.
> 
> This is OK, please check it in.
> 

There was a typo in the patch. This is the one I checked in.

Thanks.


H.J.
---
gcc/

2007-11-06  Ulrich Weigand  <[EMAIL PROTECTED]>

PR target/30961
* reload.c (find_reloads): Also check in_reg when handling a
simple move with an input reload and a destination of a hard
register.

gcc/testsuite/

2007-11-06  H.J. Lu  <[EMAIL PROTECTED]>

PR target/30961
* gcc.target/i386/pr30961-1.c: New.

--- gcc/reload.c.second 2007-10-03 06:23:52.0 -0700
+++ gcc/reload.c2007-11-06 07:38:33.0 -0800
@@ -4462,7 +4462,8 @@ find_reloads (rtx insn, int replace, int
 if (rld[i].when_needed == RELOAD_FOR_INPUT
&& GET_CODE (PATTERN (insn)) == SET
&& REG_P (SET_DEST (PATTERN (insn)))
-   && SET_SRC (PATTERN (insn)) == rld[i].in
+   && (SET_SRC (PATTERN (insn)) == rld[i].in
+   || SET_SRC (PATTERN (insn)) == rld[i].in_reg)
&& !elimination_target_reg_p (SET_DEST (PATTERN (insn))))
   {
rtx dest = SET_DEST (PATTERN (insn));
--- gcc/testsuite/gcc.target/i386/pr30961-1.c.second2007-11-06 
07:38:33.0 -0800
+++ gcc/testsuite/gcc.target/i386/pr30961-1.c   2007-11-06 07:38:33.0 
-0800
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-O2" } */
+
+double
+convert (long long in)
+{
+  double f;
+  __builtin_memcpy( &f, &in, sizeof( in ) );
+  return f;
+}
+
+/* { dg-final { scan-assembler-not "movapd" } } */


Re: strict aliasing

2007-11-06 Thread Joe Buck
On Wed, Nov 07, 2007 at 05:22:01AM +1100, skaller wrote:
> > One way to do this in C++ is to derive the different representations that
> > might appear in your "union" from a common base class, and use placement
> > new to lay them out. 
> 
> I don't understand. You cannot put ANY constructable types
> in a union.

That's why I said "union", with the quotes.  We're not talking
about a union keyword, we're talking about how to get the
effect you want, legally, as opposed to the illegal way that
you are doing it.

If you have a pointer or reference to Base, the object might
really be any class derived from Base.  With placement new,
you make a buffer big enough to hold the object; you can
then construct an object of the right type there.  This kind
of code could be auto-generated by your compiler and would
be quick and type-safe.  The actual type of the storage
is an array of char, which by the rules of the language may
alias to any type.
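
A minimal sketch of that approach, assuming C++11 or later for alignas,
override and nullptr. The class names Base, IntRep, TextRep and Either
are hypothetical and chosen only for illustration; they are not taken
from any code in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Hypothetical common base class for the alternative representations.
struct Base {
    virtual ~Base() {}
    virtual std::string describe() const = 0;
};

struct IntRep : Base {
    int value;
    explicit IntRep(int v) : value(v) {}
    std::string describe() const override {
        return "int:" + std::to_string(value);
    }
};

struct TextRep : Base {
    std::string text;  // has a constructor, so no plain union member
    explicit TextRep(const std::string& t) : text(t) {}
    std::string describe() const override { return "text:" + text; }
};

// A hand-rolled "union": raw character storage that is large enough and
// suitably aligned for either representation.
class Either {
    static constexpr std::size_t kSize =
        sizeof(IntRep) > sizeof(TextRep) ? sizeof(IntRep) : sizeof(TextRep);
    alignas(IntRep) alignas(TextRep) unsigned char storage_[kSize];
    Base* current_;  // pointer returned by placement new, or nullptr

public:
    Either() : current_(nullptr) {}
    ~Either() { reset(); }
    Either(const Either&) = delete;
    Either& operator=(const Either&) = delete;

    // Construct the requested representation inside the buffer.
    void set_int(int v) { reset(); current_ = new (storage_) IntRep(v); }
    void set_text(const std::string& t) {
        reset();
        current_ = new (storage_) TextRep(t);
    }

    std::string describe() const {
        assert(current_ != nullptr);
        return current_->describe();
    }

    // Destroy whatever object currently lives in the buffer.
    void reset() {
        if (current_ != nullptr) {
            current_->~Base();
            current_ = nullptr;
        }
    }
};
```

All access goes through the pointer that placement new returned, so the
buffer is never read through a pointer of an unrelated type.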

But this is off-topic.



Using crlibm as the default math library in GCC sources

2007-11-06 Thread Uros Bizjak

Hello!

A little background:

Some time ago, Richard Guenther proposed that GCC import the libm math
library from the libc sources, but this proposal went nowhere, mostly due to
non-technical reasons. At the time, the idea was that with a local
math library we could implement a high-performance math library that would
use SSE instructions with an x86_64-like register passing convention, even
on i686-class 32bit targets. Since then, x86_64 has gained soft-float 128bit
support, but there is no 128bit libm available to fully exploit the
potential of this high-precision infrastructure.


The proposal:

Forwarded is a short discussion with Mr. de Dinechin (forwarded with
permission), in which the possibility of importing crlibm as the default gcc
math library is discussed. I would like to ask members of the GCC SC for
their opinion on the idea of adopting crlibm as the default math library.
This way, the library can be compiled using all the knowledge that gcc has
about the target processor. As hinted in the attached message,
autovectorization and perhaps the recently introduced fixed-point
infrastructure can be used effectively in the implementation of this
library, as well as AMD's new FMA instructions.


The conclusion:

I think that including an actively maintained high-accuracy math library in
GCC would be an important step toward using GCC as a high-performance compiler
in FP intensive applications.


Uros.

---forwarded message---

Hello,

We are very interested. Actually we are more and more working on 
automatic generation of libm functions, as opposed to writing the 
functions themselves. See

http://perso.ens-lyon.fr/christoph.lauter/intelportland.pdf
for a recent overview of what we have.
So our research direction is clearly towards compilers, not libraries.

However the development workforce is now reduced to one third-year PhD 
student (Christoph) and one lecturer (myself), both of whom work 
part-time on crlibm-related research. It is a small workforce for such a 
wide project. Do you think there could be some way to involve more 
people, or fund a post-doc or an engineer?


This was the political answer. More technical comments and issues below.

Uros Bizjak wrote:

> Hello!
>
> The GCC (GNU Compiler Collection) community is looking to include a libm
> library as part of the gcc support library in future releases of gcc.
> This way, we plan to introduce certain optimizations and enhancements
> for FP intensive code that can be achieved only by tightly coupling
> the compiler and a high-performance libm implementation.


We totally agree. For the kind of optimizations related to scheduling,
such as loop pipelining etc., we have little competence; however, we are
very interested in more arithmetic optimizations, such as expression
fusing, compile-time generation of optimized polynomial approximations,
sharing argument reduction for sin and cos of the same value, etc.
Actually we would like to know what static information gcc would
currently be able to provide to a floating-point optimizer.




> Some time ago, there was an offer from your side to gradually
> include crlibm in glibc [1]. We think that a more appropriate place for
> this library could be directly in the compiler infrastructure, so all
> supported targets can automatically inherit the capabilities of crlibm
> (newlib and uClinux targets, for example).


Yes.



> In addition to target-dependent optimizations that can be achieved by
> coupling crlibm and the compiler, such as:
>
> a) usage of target processor specific instructions (cmove, FMA) and
>    instruction sets (SSEn)
> b) optimal scheduling of the instructions for a certain processor
> c) various target microoptimizations


Yes to all.

On one hand, CRLibm can be retargeted relatively easily. It already
uses macros for the FMA which accelerate it on PowerPC and IA64 for
instance, and it was written with sufficient data parallelism to enable
SSE usage, although current gcc doesn't seem to be able to exploit it
(hint).

However it will never be efficient on targets without an FPU, for instance.

On the other hand, we are prototyping the automatic implementation of 
libm functions. It is based on the Sollya tool

http://sollya.gforge.inria.fr/
Although Sollya was never intended to be included in a compiler, it can 
be used to generate target-specific code. It still needs quite a lot of 
development, though.


> gcc currently provides the full infrastructure for soft-FP 128bit long
> double values on a mainstream processor (x86_64). The current libm
> implementation doesn't use this infrastructure, and there are no plans
> to do so. Unfortunately, this locks out an important segment of users
> (fluid dynamics) that would benefit greatly from increased precision,
> both from a correct libm implementation and from increased bit widths.


There are two issues here. The first is using 128-bit FP to provide 
64-bit FP correctly rounded elementary functions. This is possible, but 
probably less efficient than the current approach in CRLibm (using 

Re: Old UTF16 patch

2007-11-06 Thread Lawrence Crowl
On 11/1/07, Joseph S. Myers <[EMAIL PROTECTED]> wrote:
> I haven't followed any developments relating to TR19769 in WG14
> after its publication in detail; has WG14 yet given an answer
> on what should be done with u'C' where C represents a single
> character that requires a surrogate pair to represent in UTF-16
> (to name one noted place where the TR underspecifies things)?

Pending such an answer, I think gcc should make such characters
ill-formed.  The text in the C TR is "The corresponding character
constant is denoted by u'c-char-sequence' and has the type char16_t."
Given that surrogate pairs are unrepresentable in that type, I
conclude that the intent was to make character literals requiring
surrogates ill-formed.  The C++ standard also makes such characters
ill-formed.  Furthermore, making them ill-formed will be upward
compatible should the C committee choose some other interpretation.
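
To make the representability argument concrete, here is a small sketch
of the standard UTF-16 encoding rule, written against C++11's char16_t.
The names needs_surrogate_pair, Utf16Units and encode_utf16 are
hypothetical helpers for illustration, not anything from gcc or from
the TR. Any code point above U+FFFF comes out as two code units and so
cannot fit in a single char16_t character constant.

```cpp
#include <cassert>
#include <cstdint>

// True if the code point lies outside the Basic Multilingual Plane and
// therefore needs two UTF-16 code units (a surrogate pair).
constexpr bool needs_surrogate_pair(std::uint32_t cp) {
    return cp > 0xFFFF && cp <= 0x10FFFF;
}

// Result of encoding one code point: one or two 16-bit code units.
struct Utf16Units {
    char16_t unit[2];
    int count;
};

// Encodes a single Unicode scalar value as UTF-16.
Utf16Units encode_utf16(std::uint32_t cp) {
    // Surrogate code points themselves and values past U+10FFFF are
    // not valid scalar values.
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));

    Utf16Units out = {{0, 0}, 0};
    if (!needs_surrogate_pair(cp)) {
        out.unit[0] = static_cast<char16_t>(cp);
        out.count = 1;
    } else {
        const std::uint32_t v = cp - 0x10000;  // 20 significant bits
        out.unit[0] = static_cast<char16_t>(0xD800 + (v >> 10));    // high
        out.unit[1] = static_cast<char16_t>(0xDC00 + (v & 0x3FF));  // low
        out.count = 2;
    }
    return out;
}
```

A front end taking the ill-formed route would diagnose any u'...'
constant whose single character has count == 2 here.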

> A TR is not a standard, so for C this must be disabled in all strict
> conformance modes (note that it affects the rules for lexing and so
> changes the semantics of conforming programs); likewise for C++98.
> The C++0x draft includes the notation from TR19769, so the feature
> should be enabled by default in C++0x (and so far as the C TR is
> compatible with C++0x, both should be followed in both C and C++
> when the feature is enabled).

Note that char16_t and char32_t are typedefs in C but primitive types
in C++, just like wchar_t.

-- 
Lawrence Crowl


Re: Question regarding GCC fdump option

2007-11-06 Thread Tom Tromey
> "Johan" == Johan Bohlin <[EMAIL PROTECTED]> writes:

Johan> Hi I have a question regarding gcc or g++
Johan> -fdump-tree-all-raw-details (.tu file). I want to dump the
Johan> entire C (not C++) AST tree the only way to do this, without
Johan> losing any information, is if I use g++ and
Johan> -fdump-tree-all-raw-details and i have some kind of error in
Johan> the file. Can I in some way change a flag or edit a method in
Johan> the source files so i can use this option without having an
Johan> error in my c file but get all the information. Even better
Johan> would be if i could use gcc.

I don't think there is a way to do this without modifying gcc.  FWIW,
I think that would be a worthwhile project; IMO gcc should move a bit
more into the analysis space.

Note that the C front end does lose information relatively early.  For
instance, it constant folds as it builds expression trees.  So, even
internally the AST won't show you some things that you might want to
see.  This is also fixable of course.

Tom


Re: Wrong ChangeLog entry in gcc/ChangeLog

2007-11-06 Thread H.J. Lu
On Mon, Nov 05, 2007 at 03:40:52PM -0800, H.J. Lu wrote:
> Hi Paul,
> 
> Did you check the wrong ChangeLog entry in gcc/ChangeLog with
> 
> http://gcc.gnu.org/viewcvs?view=rev&revision=129904
> 

I checked in the following patch as an obvious fix.


H.J.

Index: ChangeLog
===
--- ChangeLog   (revision 129944)
+++ ChangeLog   (working copy)
@@ -112,23 +112,19 @@
 
* config/xtensa/xtensa.c (xtensa_expand_nonlocal_goto): Do not
replace references to virtual_stack_vars_rtx in goto_handler.
-   
+
 2007-11-05  Paul Brook  <[EMAIL PROTECTED]>
 
-   * Makefile.target: Add ssd0303.o, pl022.o and ssd0323.o.
-   * vl.c (register_machines): Add lm3s6965evb_machine.
-   * vl.h (armv7m_init): Add.
-   (lm3s6965evb_machine): Declare.
-   (pl022_init): New prototype.
-   (ssd0323_xfer_ssi, ssd0323_init): New prototype.
-   * hw/ssd0323.c: New file.
-   * hw/armv7m.c (armv7m_init): Remove board init code.
-   (lm3s811evb_machine): Remove.
-   * hw/osram_oled.c: Rename...
-   * hw/ssd0303.c: ... to this.
-   * hw/pl022.c: New file.
-   * hw/stellaris.c: Define and use stellaris_boards.
-   (lm3s811evb_machine, lm3s6965evb_machine): New.
+   * config.gcc (arm*-*-*): Set c_target_objs and cxx_target_objs.
+   * config/arm/arm.c (arm_lang_output_object_attributes_hook): New.
+   (arm_file_start): Don't set Tag_ABI_PCS_wchar_t.  Call
+   arm_lang_output_object_attributes_hook.
+   * config/arm/arm.h (arm_lang_output_object_attributes_hook): Declare.
+   (REGISTER_TARGET_PRAGMAS): Call arm_lang_object_attributes_init.
+   * config/arm/arm-protos.h (arm_lang_object_attributes_init): Add
+   prototype.
+   * config/arm/t-arm.c (arm.o): New rule.
+   * config/arm/arm-c.c: New file.
 
 2007-11-05  Nick Clifton  <[EMAIL PROTECTED]>
Sebastian Pop  <[EMAIL PROTECTED]>


About VLIW backend

2007-11-06 Thread Li Wang
Hi,
I wonder if any efforts have been made to retarget GCC to a VLIW
backend. Is there any project trying to do that? Is it included in the
GCC mainstream? Thanks.

Regards,
Li Wang


Fw: error: array type has incomplete element type ??

2007-11-06 Thread onkar . mahajan



This is part of the code:
--
extern struct dummy temp[];
error: array type has incomplete element type
--

which I compiled without any error on:

$gcc -v
Reading specs from /usr/bin/../lib/gcc-lib/powerpc-ibm-aix5.1.0.0/2.9-aix51-020209/specs
gcc version 2.9-aix51-020209

but the same code doesn't compile on:
$gcc -v
Using built-in specs.
Target: powerpc-ibm-aix5.3.0.0
Configured with: ../configure --with-as=/usr/bin/as --with-ld=/usr/bin/ld
--disable-nls --enable-languages=c,c++ --prefix=/opt/freeware
--enable-threads --enable-version-specific-runtime-libs
--host=powerpc-ibm-aix5.3.0.0
Thread model: aix
gcc version 4.0.0

Please suggest the arguments I must give to gcc 4.0 to get the above
code to compile.



Onkar


PS :


I have already gone through the article :
.

The code I am compiling is large and it is difficult to change it now.
Please suggest something that GCC 4.0 offers for backward compatibility.






" Save Paper - Do you really need to print this e-mail? "

This e-Mail may contain proprietary and confidential information and is
sent for the intended recipient(s) only.  If by an addressing or
transmission error this mail has been misdirected to you, you are requested
to delete this mail immediately. You are also hereby notified that any use,
any form of reproduction, dissemination, copying, disclosure, modification,
distribution and/or publication of this e-mail message, contents or its
attachment other than by its intended recipient/s is strictly prohibited.

Visit us at http://www.polaris.co.in



Re: About VLIW backend

2007-11-06 Thread Robert Dewar

Li Wang wrote:

> Hi,
> I wonder if any efforts have been made to retarget GCC to VLIW
> backend.Is there any project trying to do that? Is it included in the
> GCC mainstream? Thanks.

the ia64 is a VLIW architecture!

> Regards,
> Li Wang





Re: About VLIW backend

2007-11-06 Thread Li Wang

Hi,
I know that. But I am talking about a more _pure_ VLIW architecture,
one which relies entirely on static scheduling, rather than an EPIC
architecture. Thanks.

> Li Wang wrote:
>
> > Hi,
> > I wonder if any efforts have been made to retarget GCC to VLIW
> > backend.Is there any project trying to do that? Is it included in the
> > GCC mainstream? Thanks.
>
> the ia64 is a VLIW architecture!
>
> > Regards,
> > Li Wang








Re: About VLIW backend

2007-11-06 Thread Ian Lance Taylor
Li Wang <[EMAIL PROTECTED]> writes:

> I wonder if any efforts have been made to retarget GCC to VLIW
> backend.Is there any project trying to do that? Is it included in the
> GCC mainstream? Thanks.

The FRV is an example of a currently supported VLIW backend.

Ian


Target specific attributes to variables

2007-11-06 Thread Naveen H.S.
Hi,

We are implementing attributes for variables. The attribute of the
operand is checked, and the respective instructions are emitted based
on it. We have added the attribute check for one addressing mode,
in which the operand is absolute memory (SYMBOL_REF). This was
implemented by checking the tree value using the macro SYMBOL_REF_DECL.

The other 2 addressing modes, "disp with register" and "register
indirect", are implemented, but the attributes could not be checked
in them. The tree structure in these addressing modes could not be
checked for attributes using the RTX of the operand. We were unable
to get any information from other target specific attributes.

Any help/suggestions in solving the problem will be highly appreciated.

Regards,
Naveen.H.S.
KPIT Cummins Infosystems Ltd,
Pune (INDIA) 
~~  
Free download of GNU based tool-chains for Renesas' SH, H8, R8C, M16C   
and M32C Series. The following site also offers free technical support  
to its users. Visit http://www.kpitgnutools.com for details.
Latest versions of KPIT GNU tools were released on October 1, 2007. 
~~  


Designs for better debug info in GCC (was: Re: [vta] don't let debug insns get in the way of simple vect reduction)

2007-11-06 Thread Alexandre Oliva
On Nov  5, 2007, "Richard Guenther" <[EMAIL PROTECTED]> wrote:

> On 11/5/07, Alexandre Oliva <[EMAIL PROTECTED]> wrote:
>> libgfortran had some vectorization cases that wouldn't be applied in
>> the presence of debug stmts referencing the same variables.  Fixed
>> with the patch below, to be installed shortly.

> (I'm just picking a random patch of this kind for this mail)

> I see you have to touch lots of places to teach them about debug
> insns.

Yes.  There's no escaping that.  There are two options:

- keep them separate, and modify the code that manipulates the IL so
  as to update them as needed, or

- keep them in the IL, and modify the code to disregard them as
  needed.

I've pondered both alternatives, and decided that the latter was the
only testable path.  If we had a reliable debug information tester, we
could proceed incrementally with the first alternative; it might be
viable, but I don't really see that it would make things any simpler.
If anything, you'd need to introduce a lot of new code to manipulate
the separate representation, unless this separate representation was
very similar in structure to the existing representation, and in any
case you'd have to add code all over the place to keep it up to date.

With the approach I've taken, there's something that's testable: as
long as there are codegen changes, something needs to be fixed.
Besides, the information is encoded in a form that is automatically
handled by most compilation passes, so updates for pretty much all
transformations are already in place, without any additional code.

The only additional code is what's needed to detect missing updates
and to ensure the debug notes don't interfere with code generation.
I've managed to implement these such that they don't take any
additional memory unless you actually request the additional debug
information, and such that they almost never bring any compile-time
performance hit.  That's one of the reasons that guided the placement
of DEBUG_INSN just next to the other INSNs: such that INSN_P is
optimized to a range test (as it was before, but now with a different
boundary), and INSN_P && !DEBUG_INSN_P is optimized to the original
range test.  In most other places, it's just yet another entry in a
switch table, so again it's zero-cost in terms of performance.  And at
points where it would be more costly, there's a test guarding the
complex processing to tell whether the feature is enabled that
requires that additional processing.  Hard to beat that.
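
The range-test point can be sketched as follows. The enumerators and
predicates below are hypothetical stand-ins, not GCC's actual rtx codes
or macros. The only assumption carried over from the text is that the
debug code sits directly next to the contiguous block of real insn
codes.

```cpp
#include <cassert>

// Hypothetical instruction codes. The debug code is placed directly
// before the block of ordinary insn codes so both tests stay range tests.
enum Code {
    CODE_NOTE,
    CODE_DEBUG_INSN,
    CODE_INSN,
    CODE_JUMP_INSN,
    CODE_CALL_INSN,
    CODE_BARRIER,
    CODE_COUNT
};

// Any insn, debug or not: one range test with a widened lower bound.
inline bool is_insn(Code c) {
    return c >= CODE_DEBUG_INSN && c <= CODE_CALL_INSN;
}

inline bool is_debug_insn(Code c) { return c == CODE_DEBUG_INSN; }

// The original, narrower range test.
inline bool is_nondebug_insn(Code c) {
    return c >= CODE_INSN && c <= CODE_CALL_INSN;
}

// Confirms that "insn and not debug insn" is the same predicate as the
// original range test for every code in this toy enumeration.
inline void check_equivalence() {
    for (int i = 0; i < CODE_COUNT; ++i) {
        const Code c = static_cast<Code>(i);
        assert(is_nondebug_insn(c) == (is_insn(c) && !is_debug_insn(c)));
    }
}
```

Because the two ranges differ only in their lower bound, a compiler can
fold the combined test back into a single comparison pair.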

> I believe in the long long thread earlier this year people suggested
> to use a on-the-side representation for the extra information.

Yes.  And I thought I'd already made it clear why this on-the-side
representation won't get you as far as I needed to go.  Basically, it
leads to a situation in which you can't possibly represent correct
debug information, or you end up adding annotations to the instruction
flow anyway, which means you have to deal with them or give up correct
debug information.

Since one of the requirements I was given was that debug information
be correct (as in, if I don't know where a variable is, debug
information must say so, rather than say the variable is somewhere it
really isn't), going without additional annotations just wouldn't
work.  Therefore, I figured I'd have to bite the bullet and take the
longer path, even though I don't dispute that it is possible to
achieve many improvements with the simpler approach.

However, eventually the simpler approach runs into a wall, and I
couldn't afford to get to that point and then backtrack to the
complete approach, because the wall couldn't be surpassed.

> With the different approach I and Matz started (and to which we
> didn't yet spend enough time to get debug information actually
> output - but I hope we'll get there soon), on the tree level the
> extra information is stored in a bitmap per SSA_NAME (where
> necessary).

This will fail on a very fundamental level.  Consider code such as:

f(int x, int y) {
  int c;
  /* other vars */

  c = x;
  do_something_with(c, ...); // doesn't touch x or y

  c = y;
  do_something_else_with(c, ...); // doesn't touch x or y
}

where do_something_*with are actually complex computations, be that
explicit code, be it macros or inlined functions.

This can (and should) be trivially optimized to:

f(int x, int y) {
  /* other vars */

  do_something_with(x, ...); // doesn't touch x or y

  do_something_else_with(y, ...); // doesn't touch x or y
}

But now, if I 'print c' in a debugger in the middle of one of the
do_something_*with expansions, what do I get?

With the approach I'm implementing, you should get x and y at the
appropriate points, even though variable c doesn't really exist any
more.

With your approach, what will you get?

There isn't any assignment to x or y you could hook your notes to.

Even if you were to set up side representations to model the
additional variables that end up mapped to the incoming arguments,
y

Re: About VLIW backend

2007-11-06 Thread Pranav Bhandarkar
On 06 Nov 2007 21:50:09 -0800, Ian Lance Taylor <[EMAIL PROTECTED]> wrote:
> Li Wang <[EMAIL PROTECTED]> writes:
>
> > I wonder if any efforts have been made to retarget GCC to VLIW
> > backend.Is there any project trying to do that? Is it included in the
> > GCC mainstream? Thanks.

Dr. Baumgartl, Jan Parthey and folks at the Chemnitz University of Technology
had substantial success with a port for the TMS320C6x series of
VLIW processors.

http://archiv.tu-chemnitz.de/pub/2004/0107/data/index.html

We (a few friends and I, college students then) added some
improvements to this port as part of an undergraduate university
coursework project.


cheers!
Pranav