Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Richard Guenther
On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen  wrote:
> Andreas Schwab  writes:
>>
>> The asm fails to mention that it modifies *regs.
>
> It has a memory clobber, that should be enough, no?

No.  A memory clobber does not cover automatic storage.

Btw, I can't see a testcase anywhere so I just assume Andreas got
it right as usual.

Richard.

> Besides in any case it cannot be eliminated because it has
> valid non dead inputs and outputs.
>
> -Andi
> --
> a...@linux.intel.com -- Speaking for myself only.
>
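
A minimal sketch of the point made above, assuming GCC-style extended asm: an asm that writes through a pointer can name the object it modifies with a "+m" operand instead of relying on a "memory" clobber alone. The struct, the function name fake_smm and the stored value 42 are invented for illustration and are not the i8k.c code; targets other than x86 take a plain C fallback.

```c
/* Hypothetical register block; NOT the real i8k.c layout. */
struct fake_regs {
    unsigned int eax;
    unsigned int ebx;
};

/* Stores 42 into regs->eax from inside an asm and returns 0. */
static int fake_smm(struct fake_regs *regs)
{
    int rc;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ ("movl $42, (%2)\n\t"   /* write through the pointer (eax is at offset 0) */
             "xorl %0, %0"          /* rc = 0 */
             : "=&r" (rc),          /* early-clobber: must not share a register with %2 */
               "+m" (*regs)         /* tells the compiler that *regs is read and written */
             : "r" (regs)
             : "cc");
#else
    /* Portable fallback with the same observable effect. */
    regs->eax = 42;
    rc = 0;
#endif
    return rc;
}
```

With the "+m" (*regs) operand in place, a caller that reads regs->eax after the call has to see the value stored by the asm, whether or not the struct lives in automatic storage.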


Re: integral overflow and integral conversions

2010-11-08 Thread Richard Guenther
On Sun, Nov 7, 2010 at 11:04 PM, Jason Merrill  wrote:
> Currently, the middle end seems to use the same rules for handling constant
> overflow of integer arithmetic and conversion between integer types: set
> TREE_OVERFLOW on the INTEGER_CST if the type is signed and the value doesn't
> fit in the target type.  But this doesn't seem to match the C/C++ standards.
>
> C99 says,
>
> 6.3.1.3 Signed and unsigned integers
>
> 1 When a value with integer type is converted to another integer type other
> than _Bool, if the value can be represented by the new type, it is
> unchanged.
> 2 Otherwise, if the new type is unsigned, the value is converted by
> repeatedly adding or subtracting one more than the maximum value that can be
> represented in the new type until the value is in the range of the new
> type.49)
> 3 Otherwise, the new type is signed and the value cannot be represented in
> it; either the result is implementation-defined or an implementation-defined
> signal is raised.
>
> 6.5 Expressions
>
> 5 If an exceptional condition occurs during the evaluation of an expression
> (that is, if the result is not mathematically defined or not in the range of
> representable values for its type), the behavior is undefined.
>
> Note the difference.  When converting an integer value to another integer
> type that it doesn't fit into, the behavior is either well-defined or
> implementation-defined.  When arithmetic produces a value that doesn't fit
> into the type in which the arithmetic is done, the behavior is undefined
> even if the type is unsigned.
>
> So we're setting TREE_OVERFLOW inappropriately for conversion to signed
> integer types (though the front ends unset it again in cast context), and,
> more problematically, failing to set it for unsigned arithmetic overflow:
>
> #include <limits.h>
>
> enum E {
>  A = (unsigned char)-1,        // OK
>  B = (signed char)UCHAR_MAX,   // implementation-defined
>  C = INT_MAX+1,                // undefined (C)/ill-formed (C++)
>  D = UINT_MAX+1                // undefined (C)/ill-formed (C++)
> };
>
> Am I missing something?
>
> This is more of a problem for C++, which says that arithmetic overflow in a
> context that requires a constant expression is ill-formed; in C it's merely
> undefined.

The use of TREE_OVERFLOW is largely historical and of not much use
for the middle-end (but it's used extensively by the frontend(s) and so
can't be changed easily).

The middle-end would have use for detecting overflow for both signed
and unsigned arithmetic (in the type that is provided, thus without
implicit promotions).  But this should be signalled with a separate
return value, not with flags on INTEGER_CSTs (which eventually
should get deprecated).

I've tried to move us in that direction repeatedly (with some VRP work
and also on the no-undefined-overflow branch) but always run into
interesting issues in the C frontend and stor-layout code.  Hum.

Richard.

> Jason
>
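
A small sketch of the "separate return value" idea mentioned above, seen from the user level rather than the middle end. It assumes a compiler that provides __builtin_add_overflow (GCC 5 or later, or a recent Clang), which postdates this thread; the helper names are hypothetical. The last helper shows the conversion case from C99 6.3.1.3p2, which is well-defined.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b does not fit in int; *sum receives the wrapped result. */
static bool add_int_overflows(int a, int b, int *sum)
{
    return __builtin_add_overflow(a, b, sum);
}

/* Same check for unsigned int: true when the mathematical sum exceeds UINT_MAX. */
static bool add_uint_wraps(unsigned a, unsigned b, unsigned *sum)
{
    return __builtin_add_overflow(a, b, sum);
}

/* Conversion to an unsigned type reduces the value modulo UCHAR_MAX + 1. */
static unsigned char to_uchar(int v)
{
    return (unsigned char) v;
}
```

The overflow condition is reported through the return value, so nothing has to be recorded on the constant itself.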


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Andi Kleen
Richard Guenther  writes:

> On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen  wrote:
>> Andreas Schwab  writes:
>>>
>>> The asm fails to mention that it modifies *regs.
>>
>> It has a memory clobber, that should be enough, no?
>
> No.  A memory clobber does not cover automatic storage.

That's a separate problem.

> Btw, I can't see a testcase anywhere so I just assume Andreas got
> it right as usual.

An asm with live inputs and outputs should never be optimized
away. If 4.5.1 started doing that it's seriously broken.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only.


Dedicated logical instructions

2010-11-08 Thread Radu Hobincu
Hello again,

I have another, quick question: I have dedicated logical instructions in
my RISC machine (lt - less than, gt - greater than, ult - unsigned less
than, etc.). I'm also working on adding instructions for logical OR, AND,
NOT, XOR. While reading GCC internals, I've stumbled on this:

"Except when they appear in the condition operand of a COND_EXPR, logical
‘and’ and ‘or’ operators are simplified as follows: a = b && c becomes
T1 = (bool)b;
if (T1)
  T1 = (bool)c;
a = T1;"

I really, really don't want this. Is there any way I can define the
instructions in the .md file so the compiler generates code for computing
a boolean expression without using branches (using these dedicated insns)?

Regards,
Radu





Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Richard Guenther
On Mon, Nov 8, 2010 at 12:20 PM, Andi Kleen  wrote:
> Richard Guenther  writes:
>
>> On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen  wrote:
>>> Andreas Schwab  writes:

>>>> The asm fails to mention that it modifies *regs.
>>>
>>> It has a memory clobber, that should be enough, no?
>>
>> No.  A memory clobber does not cover automatic storage.
>
> That's a separate problem.
>
>> Btw, I can't see a testcase anywhere so I just assume Andreas got
>> it right as usual.
>
> An asm with live inputs and outputs should never be optimized
> away. If 4.5.1 started doing that it's seriously broken.

Please provide a testcase, such asms can be optimized if the
outputs are dead.

Richard.

> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only.
>


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Paul Koning

On Nov 8, 2010, at 6:20 AM, Richard Guenther wrote:

> On Mon, Nov 8, 2010 at 12:20 PM, Andi Kleen  wrote:
>> Richard Guenther  writes:
>> 
>>> On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen  wrote:
>>>> Andreas Schwab  writes:
>>>>>
>>>>> The asm fails to mention that it modifies *regs.
>>>>
>>>> It has a memory clobber, that should be enough, no?
>>> 
>>> No.  A memory clobber does not cover automatic storage.
>> 
>> That's a separate problem.
>> 
>>> Btw, I can't see a testcase anywhere so I just assume Andreas got
>>> it right as usual.
>> 
>> An asm with live inputs and outputs should never be optimized
>> away. If 4.5.1 started doing that it's seriously broken.
> 
> Please provide a testcase, such asms can be optimized if the
> outputs are dead.

I don't know about 4.5, but I noticed that with 4.6 (trunk), testcases like
gcc.c-torture/compile/2804-1.c optimize away the asm and all the operand 
generation except for -O0.

paul



Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Jakub Jelinek
On Mon, Nov 08, 2010 at 06:47:59AM -0500, Paul Koning wrote:
> I don't know about 4.5, but I noticed that with 4.6 (trunk), testcases
> like gcc.c-torture/compile/2804-1.c optimize away the asm and all the
> operand generation except for -O0.

That's fine, the asm isn't volatile and the output is not used.

Jakub


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Michael Matz
Hi,

On Mon, 8 Nov 2010, Andi Kleen wrote:

> Richard Guenther  writes:
> 
> > On Mon, Nov 8, 2010 at 12:03 AM, Andi Kleen  wrote:
> >> Andreas Schwab  writes:
> >>>
> >>> The asm fails to mention that it modifies *regs.
> >>
> >> It has a memory clobber, that should be enough, no?
> >
> > No.  A memory clobber does not cover automatic storage.
> 
> That's a separate problem.
> 
> > Btw, I can't see a testcase anywhere so I just assume Andreas got
> > it right as usual.
> 
> An asm with live inputs and outputs should never be optimized
> away. If 4.5.1 started doing that it's seriously broken.

You know the drill: testcase -> gcc.gnu.org/bugzilla/

(In particular, up to now it's only speculation in some forum that the asm
really is optimized away, which I agree would be a bug. It may instead
merely be that regs->eax isn't reloaded after the asm(), which would be
caused by the problem Andreas mentioned.)


Ciao,
Michael.


Re: Why is -fstrict-aliasing excluded from function "optimize" attribute?

2010-11-08 Thread Paolo Bonzini

On 11/04/2010 11:28 AM, Bingfeng Mei wrote:

> I think of adding a warning too. Should I submit a patch?


That's always a good idea. :)

Paolo


Re: I propose Ralf Wildenhues for build machinery maintainer

2010-11-08 Thread Paolo Bonzini

On 11/05/2010 07:00 PM, Ian Lance Taylor wrote:

> To the steering committee: I propose Ralf Wildenhues as a new maintainer
> for the build machinery.
>
> Ralf often has useful comments for proposed patches to the configure
> scripts.  He has done good work in upgrading to new versions of autoconf
> and libtool.  As an autoconf maintainer himself he has experience with
> acting as a maintainer.


Current build maintainers agreed unanimously off-list on this proposal, 
so I can say we would gladly welcome Ralf in our ranks. :)


Paolo


Re: asm_fprintf inefficiency?

2010-11-08 Thread Paolo Bonzini

On 11/05/2010 08:10 AM, Jay K wrote:

> the checking for puts_locked...
> the fact that asm_fprintf calls putc one character at a time,
> which probably benefits from _unlocked.


Honest question: is asm_fprintf in the profile at all, even at -O0?

Paolo


Re: integral overflow and integral conversions

2010-11-08 Thread Joseph S. Myers
I am confident in the correctness of the tests I wrote for overflow for C 
(gcc.dg/overflow-warn-[1234].c).  This doesn't necessarily mean the 
present mixture of ways of indicating that an expression of integer 
constants is not an integer constant expression (for whatever combination 
of reasons involving overflow, division by zero, out-of-range shifts, 
etc.) is the best implementation approach.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: %pc relative addressing of string literals/const data

2010-11-08 Thread Joakim Tjernlund
Dave Korn  wrote on 2010/10/27 13:59:00:
>
> On 27/10/2010 07:47, Joakim Tjernlund wrote:
> > Alan Modra  wrote on 2010/10/27 04:01:50:
> >> On Wed, Oct 27, 2010 at 12:53:00AM +0100, Dave Korn wrote:
> >>> On 26/10/2010 23:37, Joakim Tjernlund wrote:
> >>>
> >>>> Everything went dead quiet the minute I started to send patches, what did
> >>>> I do wrong?
> >>>   Nothing, you just ran into the lack-of-manpower problem.  Sorry!  And I
> >>> can't even help, I'm not a ppc maintainer.
> >> I also cannot approve gcc patches.
> >
> > Sent it to gcc-patches too. I already sent another gcc patch there but that
> > didn't trigger any response either.
> > Perhaps you can notify whoever that can approve patches?
>
>   We have a convention on the patches list; if a patch hasn't gotten an answer
> after ten to fourteen days or so, send a reply to the original post, adding
> "[PING]" to the beginning of the subject line.  (Sometimes it can take two or
> three pings, unfortunately that's just a consequence of our limited
> resources.)
>
>   I see your first patch was posted on the 19th.  Give it another few days,
> then ping it.  When you do so, you could also mention your other patch at the
> same time.

One ping and a few days later and nothing. Very frustrating. I don't
believe all PPC devs are so "busy" that none has the time to look
at a simple one liner. What is up?

  Jocke



named address spaces: addr_space_convert never called

2010-11-08 Thread Georg Lay
Hi, I just started playing around with named address spaces for avr.
Besides general space (ram), I introduced a second one, __pgm, which
shall address program memory where also constants may live. avr is a
Harvard architecture, and both program memory and ram start at address 0.

From this and the internals on TARGET_ADDR_SPACE_CONVERT I understand
that pointer casting will not work as expected, because that hook will
only get called if the respective address spaces are subsets. However,
neither is space-1 a subset of space-0 nor vice versa (or am I misled
by internals?)

Is there a way to make it work in the case where the address spaces
are disjoint? Started this morning and everything went smooth until I
started messing around with pointer casts:


char cast_3 (char in_pgm, void * p)
{
return in_pgm ? (*((char __pgm *) p)) : (*((char *) p));
}

The cast looks fine from the trees perspective (excerpt from .expand):

The first cast nullifies the pointer.
;; Function cast_3 (cast_3)

cast_3 (char in_pgm, void * p)
{
   char * D.1934;
  char D.1930;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  if (in_pgm_2(D) != 0)
goto ;
  else
goto ;
  # SUCC: 3 [39.0%]  (true,exec) 4 [61.0%]  (false,exec)

  # BLOCK 3 freq:3900
  # PRED: 2 [39.0%]  (true,exec)
  D.1934_4 = ( char *) p_3(D);
  D.1930_5 = *D.1934_4;
  goto ;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:6100
  # PRED: 2 [61.0%]  (false,exec)
  D.1930_6 = MEM[(char *)p_3(D)];
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # D.1930_1 = PHI 
  return D.1930_1;
  # SUCC: EXIT [100.0%]

}

This produces the following RTL:

;; Generating RTL for gimple basic block 3

;; D.1930_5 = *D.1934_4;

(insn 10 9 11 (set (reg/f:PHI 47)
(const_int 0 [0])) pgm.c:97 -1
 (nil))

(insn 11 10 0 (set (reg:QI 42 [ D.1930 ])
(mem:QI (reg/f:PHI 47) [0 *D.1934_4+0 S1 A8 AS1])) pgm.c:97 -1
 (nil))

;; Generating RTL for gimple basic block 4

So as of internals doc, named address spaces are not intended to
implement this kind of memory?

Georg


Re: named address spaces: addr_space_convert never called

2010-11-08 Thread Georg Lay
Georg Lay schrieb:

FYI, the code is as expected when I define the addrspaces to be
subsets of each other.

Georg


Re: named address spaces: addr_space_convert never called

2010-11-08 Thread Richard Guenther
On Mon, Nov 8, 2010 at 3:39 PM, Georg Lay  wrote:
> Hi, I just started playing around with named address spaces for avr.
> Besides general space (ram), I introduced a second one, __pgm, which
> shall address program memory where also constants may live. avr is a
> Harvard architecture, and both program memory and ram start at address 0.
>
> From this and the internals on TARGET_ADDR_SPACE_CONVERT I understand
> that pointer casting will not work as expected, because that hook will
> only get called if the respective address spaces are subsets. However,
> neither is space-1 a subset of space-0 nor vice versa (or am I misled
> by internals?)
>
> Is there a way to make it work in the case where the address spaces
> are disjoint? Started this morning and everything went smooth until I
> started messing around with pointer casts:
>
>
> char cast_3 (char in_pgm, void * p)
> {
>    return in_pgm ? (*((char __pgm *) p)) : (*((char *) p));
> }
>
> The cast looks fine from the trees perspective (excerpt from .expand):
>
> The first cast nullifies the pointer.
> ;; Function cast_3 (cast_3)
>
> cast_3 (char in_pgm, void * p)
> {
>   char * D.1934;
>  char D.1930;
>
>  # BLOCK 2 freq:1
>  # PRED: ENTRY [100.0%]  (fallthru,exec)
>  if (in_pgm_2(D) != 0)
>    goto ;
>  else
>    goto ;
>  # SUCC: 3 [39.0%]  (true,exec) 4 [61.0%]  (false,exec)
>
>  # BLOCK 3 freq:3900
>  # PRED: 2 [39.0%]  (true,exec)
>  D.1934_4 = ( char *) p_3(D);
>  D.1930_5 = *D.1934_4;
>  goto ;
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 4 freq:6100
>  # PRED: 2 [61.0%]  (false,exec)
>  D.1930_6 = MEM[(char *)p_3(D)];
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 5 freq:1
>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>  # D.1930_1 = PHI 
>  return D.1930_1;
>  # SUCC: EXIT [100.0%]
>
> }
>
> This produces the following RTL:
>
> ;; Generating RTL for gimple basic block 3
>
> ;; D.1930_5 = *D.1934_4;
>
> (insn 10 9 11 (set (reg/f:PHI 47)
>        (const_int 0 [0])) pgm.c:97 -1
>     (nil))
>
> (insn 11 10 0 (set (reg:QI 42 [ D.1930 ])
>        (mem:QI (reg/f:PHI 47) [0 *D.1934_4+0 S1 A8 AS1])) pgm.c:97 -1
>     (nil))
>
> ;; Generating RTL for gimple basic block 4
>
> So as of internals doc, named address spaces are not intended to
> implement this kind of memory?

If they are not subsets of each other how'd you convert a pointer
pointing into one to point into the other address-space?  I think
the frontend should diagnose this as invalid.

Richard.

> Georg
>


Re: named address spaces: addr_space_convert never called

2010-11-08 Thread Georg Lay
Richard Guenther schrieb:
> On Mon, Nov 8, 2010 at 3:39 PM, Georg Lay  wrote:
>> Hi, I just started playing around with named address spaces for avr.
>> Besides general space (ram), I introduced a second one, __pgm, which
>> shall address program memory where also constants may live. avr is a
>> Harvard architecture, and both program memory and ram start at address 0.
>>
>> From this and the internals on TARGET_ADDR_SPACE_CONVERT I understand
>> that pointer casting will not work as expected, because that hook will
>> only get called if the respective address spaces are subsets. However,
>> neither is space-1 a subset of space-0 nor vice versa (or am I misled
>> by internals?)
>>
>> Is there a way to make it work in the case where the address spaces
>> are disjoint? Started this morning and everything went smooth until I
>> started messing around with pointer casts:
>>
>>
>> char cast_3 (char in_pgm, void * p)
>> {
>>return in_pgm ? (*((char __pgm *) p)) : (*((char *) p));
>> }
>>
>> The cast looks fine from the trees perspective (excerpt from .expand):
>>
>> The first cast nullifies the pointer.
>> ;; Function cast_3 (cast_3)
>>
>> cast_3 (char in_pgm, void * p)
>> {
>>   char * D.1934;
>>  char D.1930;
>>
>>  # BLOCK 2 freq:1
>>  # PRED: ENTRY [100.0%]  (fallthru,exec)
>>  if (in_pgm_2(D) != 0)
>>goto ;
>>  else
>>goto ;
>>  # SUCC: 3 [39.0%]  (true,exec) 4 [61.0%]  (false,exec)
>>
>>  # BLOCK 3 freq:3900
>>  # PRED: 2 [39.0%]  (true,exec)
>>  D.1934_4 = ( char *) p_3(D);
>>  D.1930_5 = *D.1934_4;
>>  goto ;
>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>
>>  # BLOCK 4 freq:6100
>>  # PRED: 2 [61.0%]  (false,exec)
>>  D.1930_6 = MEM[(char *)p_3(D)];
>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>
>>  # BLOCK 5 freq:1
>>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>>  # D.1930_1 = PHI 
>>  return D.1930_1;
>>  # SUCC: EXIT [100.0%]
>>
>> }
>>
>> This produces the following RTL:
>>
>> ;; Generating RTL for gimple basic block 3
>>
>> ;; D.1930_5 = *D.1934_4;
>>
>> (insn 10 9 11 (set (reg/f:PHI 47)
>>(const_int 0 [0])) pgm.c:97 -1
>> (nil))
>>
>> (insn 11 10 0 (set (reg:QI 42 [ D.1930 ])
>>(mem:QI (reg/f:PHI 47) [0 *D.1934_4+0 S1 A8 AS1])) pgm.c:97 -1
>> (nil))
>>
>> ;; Generating RTL for gimple basic block 4
>>
>> So as of internals doc, named address spaces are not intended to
>> implement this kind of memory?
> 
> If they are not subsets of each other how'd you convert a pointer
> pointing into one to point into the other address-space?  I think
> the frontend should diagnose this as invalid.

The front end emits a warning. However, explicit casts should yield a
warning only if explicitly requested.

With the subset relation returning true, the code looks like this (I
changed the test case to add 5, so there is an additional *addphi).

;; Generating RTL for gimple basic block 3

;; D.1987_6 = MEM[( char *)D.1991_4 + 5B];

(insn 10 9 11 (set (reg:PHI 47)
(subreg:PHI (reg/v/f:HI 46 [ p ]) 0)) pgm.c:98 -1
 (nil))

(insn 11 10 12 (set (reg:PHI 48)
(reg:PHI 47)) pgm.c:98 -1
 (nil))

(insn 12 11 13 (set (reg/f:PHI 49)
(plus:PHI (reg:PHI 47)
(const_int 5 [0x5]))) pgm.c:98 -1
 (nil))

(insn 13 12 0 (set (reg:QI 42 [ D.1987 ])
(mem:QI (reg/f:PHI 49) [0 MEM[( char *)D.1991_4 + 5B]+0 S1 A8 AS1])) pgm.c:98 -1
 (nil))

;; Generating RTL for gimple basic block 4

This is fine.

However, I am still confused:

"A is a subset of B iff every member of A is also a member of B".

But in this case, an element of ram is not an element of flash nor is
an element of flash an element of ram. Written down as numbers, these
numbers are the same, yes, so that information gets encoded in the
machine mode to know what addresses are legal and what instruction
must be used.

Memory is not linearized because that would require taking the
decision at run time.


Georg


Re: Dedicated logical instructions

2010-11-08 Thread Ian Lance Taylor
"Radu Hobincu"  writes:

> I have another, quick question: I have dedicated logical instructions in
> my RISC machine (lt - less than, gt - greater than, ult - unsigned less
> than, etc.). I'm also working on adding instructions for logical OR, AND,
> NOT, XOR. While reading GCC internals, I've stumbled on this:
>
> "Except when they appear in the condition operand of a COND_EXPR, logical
> ‘and’ and ‘or’ operators are simplified as follows: a = b && c becomes
> T1 = (bool)b;
> if (T1)
>   T1 = (bool)c;
> a = T1;"
>
> I really, really don't want this. Is there any way I can define the
> instructions in the .md file so the compiler generates code for computing
> a boolean expression without using branches (using these dedicated insns)?

That is the only correct way to implement && and || in C, C++, and other
similar languages.  The question you should be asking is whether gcc
will be able to put simple cases without side effects back together
again.  The answer is that, yes, it should be able to do that.

You should not worry about this level of things when it comes to writing
your backend port.  Language level details like this are handled by the
frontend, not the backend.  When your port is working, come back to this
and make sure that you get the kind of code you want.

Ian
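
A short sketch of the distinction drawn above, with hypothetical function names. The first function depends on short-circuit evaluation, because its right operand may only be evaluated when the left one is true. The other two compute the same truth value for plain int operands without side effects, which is the kind of simple case a compiler can turn into branch-free code.

```c
#include <stddef.h>

/* Needs short-circuit evaluation: *p is read only when p is non-null. */
static int nonnull_and_positive(const int *p)
{
    return p != NULL && *p > 0;
}

/* Language-level form; semantically "evaluate c only if b is non-zero". */
static int both_true_short_circuit(int b, int c)
{
    return b && c;
}

/* Operands have no side effects, so evaluating both is harmless and
   the result can be formed with a bitwise AND of two 0/1 values. */
static int both_true_branchless(int b, int c)
{
    return (b != 0) & (c != 0);
}
```

Whether a given port actually emits the dedicated logical instructions for the last two functions is something to check in the generated code once the backend works.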


Re: gcc 4.5.1 / as 2.20.51.0.11 miscompiling drivers/char/i8k.c ?

2010-11-08 Thread Dave Korn
On 08/11/2010 11:20, Andi Kleen wrote:

> An asm with live inputs and outputs should never be optimized
> away. If 4.5.1 started doing that it's seriously broken.

  I don't see that.  Consider:

void foo (void)
{
  int x, y, z;
  x = 23;
  y = x + 1;
  z = y + 1;
}

  So far, you'd agree the compiler may optimise the entire function away?  So
why not this:

void foo (void)
{
  int x, y, z;
  x = 23;
  asm ("do something" : "=r" (y) : "r" (x) );
  z = y + 1;
}

  ?

cheers,
  DaveK
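
To make the three cases in this thread concrete, here is a hedged sketch using an empty asm template so that it does not depend on any particular instruction set. It assumes GCC-style extended asm; the function names are made up. The "0" input constraint ties the input to the output register, so the empty template leaves y equal to x.

```c
/* The output feeds the return value, so the asm has a live output
   and cannot simply be dropped. */
static int asm_output_used(int x)
{
    int y;
    __asm__ ("" : "=r" (y) : "0" (x));
    return y + 1;
}

/* The output is dead, but volatile asks the compiler to keep the asm. */
static void asm_kept_by_volatile(int x)
{
    int y;
    __asm__ volatile ("" : "=r" (y) : "0" (x));
    (void) y;
}

/* The output is dead and the asm is not volatile, so the compiler
   is free to delete it together with the code computing its input. */
static void asm_may_be_removed(int x)
{
    int y;
    __asm__ ("" : "=r" (y) : "0" (x));
    (void) y;
}
```

Comparing the assembler output of the last two functions at -O2 is one way to see the difference on a given compiler version.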


Re: define_split

2010-11-08 Thread Michael Meissner
On Thu, Oct 28, 2010 at 09:11:44AM +0200, roy rosen wrote:
> Hi all,
> 
> I am trying to use define_split, but it seems to me that I don't
> understand how it is used.
> It says in the gccint.pdf (which I use as my tutorial (is there
> anything better or more up to date?)) that the combiner only uses the
> define_split if it doesn't find any define_insn to match. This is not
> clear to me: If there is no define_insn to match a pattern in the
> combine stage, then how is this pattern there in the first place? The
> only way I can think for it to happen is that such a pattern was
> created by the combiner itself (but it seems unreasonable to me that
> we now want to split what the combiner just combined). Can someone
> please explain this to me?

Basically combine.c works by trying to build a combined insn, and then it sees
if it matches an insn.  The canonical example is fused multiply/add (avoiding
all of the issues with FMA's for the moment):

(define_insn "addsf3"
  [(set (match_operand:SF 0 "f_operand" "=f")
        (plus:SF (match_operand:SF 1 "f_operand" "f")
                 (match_operand:SF 2 "f_operand" "f")))]
  ""
  "fadd %0,%1,%2")

(define_insn "mulsf3"
  [(set (match_operand:SF 0 "f_operand" "=f")
        (mult:SF (match_operand:SF 1 "f_operand" "f")
                 (match_operand:SF 2 "f_operand" "f")))]
  ""
  "fmul %0,%1,%2")

(define_insn "fmasf3"
  [(set (match_operand:SF 0 "f_operand" "=f")
        (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
                          (match_operand:SF 2 "f_operand" "f"))
                 (match_operand:SF 3 "f_operand" "f")))]
  ""
  "fma %0,%1,%2,%3")

What the documentation is trying to say is that instead of the fmasf3
define_insn, you could have a define_split.  I personally would do a
define_insn_and_split for the combined insn.  Consider a weird machine where
the FMA must be done in a special fixed register, and the multiply and add
must be issued as separate instructions:

(define_split
  [(set (match_operand:SF 0 "f_operand" "=f")
        (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
                          (match_operand:SF 2 "f_operand" "f"))
                 (match_operand:SF 3 "f_operand" "f")))]
  ""
  [(set (match_dup 4)
        (mult:SF (match_dup 1)
                 (match_dup 2)))
   (set (match_dup 4)
        (plus:SF (match_dup 4)
                 (match_dup 3)))
   (set (match_dup 0)
        (match_dup 4))]
  "operands[4] = gen_rtx_REG (SFmode, ACC_REGISTER);")

I would probably write it as:

(define_insn_and_split "fmasf3"
  [(set (match_operand:SF 0 "f_operand" "=f")
        (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
                          (match_operand:SF 2 "f_operand" "f"))
                 (match_operand:SF 3 "f_operand" "f")))
   (clobber (reg:SF ACC_REGISTER))]
  ""
  "#"
  ""
  [(set (match_dup 4)
        (mult:SF (match_dup 1)
                 (match_dup 2)))
   (set (match_dup 4)
        (plus:SF (match_dup 4)
                 (match_dup 3)))
   (set (match_dup 0)
        (match_dup 4))]
  "operands[4] = gen_rtx_REG (SFmode, ACC_REGISTER);")

In the old days, define_split was only processed before each of the two
scheduling passes if they were run and at the very end if final finds an insn
string "#".

Now, it is always run several times.  Looking at passes.c, we see it is run:

Before the 2nd lower subreg pass (which is before the 1st scheduling pass)
After gcse2 and reload
Just before the 2nd scheduling pass
Before regstack elimination on x86
Before the exception handling/short branch passes

I talked about this in part of my tutorial on using modern features in your MD
file that I gave at this year's GCC summit:
http://gcc.gnu.org/wiki/summit2010

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com


Re: named address spaces: addr_space_convert never called

2010-11-08 Thread David Brown

On 08/11/10 16:59, Georg Lay wrote:

Richard Guenther schrieb:

On Mon, Nov 8, 2010 at 3:39 PM, Georg Lay  wrote:

Hi, I just started playing around with named address spaces for avr.
Besides general space (ram), I introduced a second one, __pgm, which
shall address program memory where also constants may live. avr is a
Harvard architecture, and both program memory and ram start at address 0.

From this and the internals on TARGET_ADDR_SPACE_CONVERT I understand
that pointer casting will not work as expected, because that hook will
only get called if the respective address spaces are subsets. However,
neither is space-1 a subset of space-0 nor vice versa (or am I misled
by internals?)

Is there a way to make it work in the case where the address spaces
are disjoint? Started this morning and everything went smooth until I
started messing around with pointer casts:


char cast_3 (char in_pgm, void * p)
{
return in_pgm ? (*((char __pgm *) p)) : (*((char *) p));
}

The cast looks fine from the trees perspective (excerpt from .expand):

The first cast nullifies the pointer.
;; Function cast_3 (cast_3)

cast_3 (char in_pgm, void * p)
{
char * D.1934;
  char D.1930;

  # BLOCK 2 freq:1
  # PRED: ENTRY [100.0%]  (fallthru,exec)
  if (in_pgm_2(D) != 0)
goto;
  else
goto;
  # SUCC: 3 [39.0%]  (true,exec) 4 [61.0%]  (false,exec)

  # BLOCK 3 freq:3900
  # PRED: 2 [39.0%]  (true,exec)
  D.1934_4 = (  char *) p_3(D);
  D.1930_5 = *D.1934_4;
  goto;
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 4 freq:6100
  # PRED: 2 [61.0%]  (false,exec)
  D.1930_6 = MEM[(char *)p_3(D)];
  # SUCC: 5 [100.0%]  (fallthru,exec)

  # BLOCK 5 freq:1
  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
  # D.1930_1 = PHI
  return D.1930_1;
  # SUCC: EXIT [100.0%]

}

This produces the following RTL:

;; Generating RTL for gimple basic block 3

;; D.1930_5 = *D.1934_4;

(insn 10 9 11 (set (reg/f:PHI 47)
(const_int 0 [0])) pgm.c:97 -1
 (nil))

(insn 11 10 0 (set (reg:QI 42 [ D.1930 ])
(mem:QI (reg/f:PHI 47) [0 *D.1934_4+0 S1 A8 AS1])) pgm.c:97 -1
 (nil))

;; Generating RTL for gimple basic block 4

So as of internals doc, named address spaces are not intended to
implement this kind of memory?


If they are not subsets of each other how'd you convert a pointer
pointing into one to point into the other address-space?  I think
the frontend should diagnose this as invalid.


The front end emits a warning. However, explicit casts should yield a
warning only if explicitly requested.

With the subset relation returning true, the code looks like this (I
changed the test case to add 5, so there is an additional *addphi).

;; Generating RTL for gimple basic block 3

;; D.1987_6 = MEM[(  char *)D.1991_4 + 5B];

(insn 10 9 11 (set (reg:PHI 47)
 (subreg:PHI (reg/v/f:HI 46 [ p ]) 0)) pgm.c:98 -1
  (nil))

(insn 11 10 12 (set (reg:PHI 48)
 (reg:PHI 47)) pgm.c:98 -1
  (nil))

(insn 12 11 13 (set (reg/f:PHI 49)
 (plus:PHI (reg:PHI 47)
 (const_int 5 [0x5]))) pgm.c:98 -1
  (nil))

(insn 13 12 0 (set (reg:QI 42 [ D.1987 ])
 (mem:QI (reg/f:PHI 49) [0 MEM[(  char *)D.1991_4 + 5B]+0 S1 A8 AS1])) pgm.c:98 -1
  (nil))

;; Generating RTL for gimple basic block 4

This is fine.

However, I am still confused:

"A is a subset of B iff every member of A is also a member of B".

But in this case, an element of ram is not an element of flash nor is
an element of flash an element of ram. Written down as numbers, these
numbers are the same, yes, so that information gets encoded in the
machine mode to know what addresses are legal and what instruction
must be used.

Memory is not linearized because that would require taking the
decision at run time.


Georg




Would it be possible to define a third memory space as "global" memory,
of which both the ram and the flash are subsets?  It is important to
keep ram as the default memory space, but perhaps an artificial global
memory space would let you do conversions like this safely and without
warnings.


Even better would be if the global memory space could have 24-bit (or
32-bit if necessary) pointers, so that it would actually encompass all
memory, with ram pointers at 0x80 to match the addresses used by the
linker.  It would also make it a lot easier to use the full flash space
on AVRs with more than 64K flash.


May I say I think it's great that you are looking into this?  Program 
space access on the AVR was the first thing I thought of when I heard of 
the concept of named address spaces in C.


mvh.,

David
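
A toy model of the "global" address space idea, written as portable C so it can be run on a host machine. Everything here is an assumption made for illustration: the names, the two small arrays standing in for flash and ram, and the tag value 0x800000 (bit 23) that marks a generic pointer as referring to ram. It is not how a GCC address-space hook is written; it only shows that a wider generic pointer turns the flash/ram choice into a run-time decision.

```c
#include <stdint.h>

/* Assumption for this sketch: bit 23 of a generic pointer selects ram. */
#define GENERIC_RAM_TAG 0x800000u

/* Toy memories; on a real device both would start at address 0. */
static const uint8_t fake_flash[4] = { 0x10, 0x11, 0x12, 0x13 };
static uint8_t fake_ram[4] = { 0x20, 0x21, 0x22, 0x23 };

/* Stands in for a 24-bit pointer that covers both spaces. */
typedef uint32_t generic_ptr;

/* Widen a 16-bit flash address: the tag bit stays clear. */
static generic_ptr from_flash(uint16_t addr)
{
    return addr;
}

/* Widen a 16-bit ram address: set the tag bit. */
static generic_ptr from_ram(uint16_t addr)
{
    return GENERIC_RAM_TAG | addr;
}

/* The space is chosen at run time by testing the tag bit. */
static uint8_t generic_read(generic_ptr p)
{
    uint16_t off = (uint16_t) (p & 0xFFFFu);
    off &= 3u;  /* keep the index inside the 4-byte toy arrays */
    return (p & GENERIC_RAM_TAG) ? fake_ram[off] : fake_flash[off];
}
```

In this model both 16-bit spaces convert losslessly into the generic one, which is the subset relation the conversion hook asks for, at the cost of the run-time test in generic_read.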



Re: peephole2: dead regs not marked as dead

2010-11-08 Thread Michael Meissner
On Tue, Nov 02, 2010 at 10:41:49AM +0100, Georg Lay wrote:
> This solution works:
> 
> Generating a named insn in andsi3-expander as
> 
> (define_insn_and_split  "..."
>   [(set (match_operand:SI 0 "register_operand"   "")
> (and:SI (match_operand:SI 1 "register_operand"   "")
> (match_operand:SI 2 "const_int_operand"  "")))
>(clobber (match_operand:SI 3 "register_operand"   ""))]
>   "...
>&& !reload_completed
>&& !reload_in_progress"
>   {
> gcc_unreachable();
>   }
>   "&& 1"
>   [(set (match_dup 3)
> (and:SI (match_dup 1)
> (match_dup 4)))
>(set (match_dup 0)
> (xor:SI (match_dup 3)
> (match_dup 1)))]
>   {
> ...
>   })
> 
> The insn passes combine unchanged and gets split in split1 as expected :-)
> 
> What I do not understand is *why* this works.
> The internals "16.16 How to Split Instructions" mention two flavours of insn
> splitting: One after reload for the scheduler and one during combine stage, the
> latter just doing single_set --> 2 * single_set splits for insns that do *not*
> match during combine stage.

As I just wrote for another reply:

In the old days, define_split was only processed before each of the two
scheduling passes if they were run and at the very end if final finds an insn
string "#".

Now, it is always run several times.  Looking at passes.c, we see it is run:

   Before the 2nd lower subreg pass (which is before the 1st scheduling pass)
   After gcse2 and reload
   Just before the 2nd scheduling pass
   Before regstack elimination on x86
   Before the exception handling/short branch passes

I talked about this in part of my tutorial on using modern features in your MD
file that I gave at this year's GCC summit:
http://gcc.gnu.org/wiki/summit2010

In particular, go to pages 5-7 of my tutorial (chapter 6) where I talk about
scratch registers and allocating a new pseudo in split1 which is now always
run.

> However, this insn matches. So the combiner doesn't split in accordance with
> internals doc. But the opportunity to use split1 is neither mentioned nor
> described in the internals.
> 
> I observed that the insn gets split in split1 even if the splitter produces
> more than two insns and also if optimization is off.
> 
> That's nice but can I rely on that behaviour? (As far as -O0 is concerned, I
> will emit the special insn just if optimization is turned on, but nevertheless
> it would be interesting to know why this works smoothly with -O0 as I expected
> to run into an unrecognizable insn or something like that during reload).

In gcc/passes.c the split passes are always run and do not depend on the
optimization flags, so yes, you can now rely on it with -O0.
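
As a side note on the pattern quoted at the top: the arithmetic behind that
split can be checked in plain C.  This is only a sketch.  The original post
elides how operand 4 is computed, so the code below assumes it is the
bitwise complement of the constant in operand 2, which is the value that
makes the two-insn sequence equal to the original AND.  The function names
are made up for illustration.

```c
#include <stdint.h>

/* What the unsplit insn computes: operand 1 AND the constant operand 2. */
uint32_t and_direct(uint32_t x, uint32_t c)
{
    return x & c;
}

/* The two insns produced by the split.
   Assumption (not stated in the original post): operand 4 is ~c. */
uint32_t and_via_split(uint32_t x, uint32_t c)
{
    uint32_t scratch = x & ~c;  /* (set (match_dup 3) (and (match_dup 1) (match_dup 4))) */
    return scratch ^ x;         /* (set (match_dup 0) (xor (match_dup 3) (match_dup 1))) */
}
```

Bits where c is 1 are cleared in the scratch value and then restored from x
by the XOR; bits where c is 0 are kept in the scratch value and then
cancelled by the XOR.  Either way the result is x & c.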

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com


Re: how much is the effort required to retarget gcc?

2010-11-08 Thread Michael Meissner
On Mon, Nov 01, 2010 at 09:21:14PM +0800, Hui Yan Cheah wrote:
> Hi,
> 
> We are working on a new project which requires retargeting a
> compiler to a small cpu on FPGA.
> The cpu is hand-coded and it supports only a limited number of instructions.
> 
> My questions are:
> 
> 1. Since I have very limited experience with compilers (this is my
> first compiler project), is it wise to begin with gcc? I have
> googled-up smaller compilers like pcc, lcc and small-c and they seem
> like very good candidates. However, I would like to listen to the
> opinions of programmers who have worked with gcc or retargetted gcc.
> 
> 2. How much is the effort in retargetting compilers? I heard it took
> months but it all depends on the level of experience of the
> programmer. So if an experienced programmer took 2 months, it might
> have taken a beginner 6 months or so.

It really depends on the complexity of the machine and how much you know about
the RTL interface to the compiler.  It also depends on whether you also have to
do the assembler, linker, library and debugger, or if it is just the compiler
work.

Back in the day when I was doing new ports, I tended to estimate 3 months for
an initial port of a normal machine generating reasonable code that is passing
all of the test suites, etc.  However, generally within 1 month, I would be
generating code, but not all of the features targetted.  That assumes somebody
else did the assembler/linker/debugger/library.  However, note that I have been
working on GCC for quite some time, and have done at least 5 ports from
scratch, so you probably don't want to use my time estimates :-)

The more irregular/limited the machine is, the more it takes to get reasonable
code generation.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com


Re: UNITS_PER_SIMD_WORD

2010-11-08 Thread Michael Meissner
On Mon, Nov 01, 2010 at 04:52:28PM +0200, roy rosen wrote:
> Hi All,
> 
> Is it possible to define UNITS_PER_SIMD_WORD as a global variable and
> to set this variable using a pragma (even once for a compilation) and
> that way to be able to compile one file with UNITS_PER_SIMD_WORD = 8
> and another file with UNITS_PER_SIMD_WORD = 16?

The general way to do this would be to have an appropriate -m option to control
this.  Typically it is based on the instructions available to the back end.

If you have an -m option, you could presumably define target attributes or
pragmas to change this for particular functions.  At the moment, the i386 port
is the only one to support target attributes/pragmas, though I have just
submitted patches for the rs6000 port to add them there, so you can see what
types of changes are needed (note, I need to do another patch after the first
is approved for builtin functions).
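
From the user's side, the i386 target attribute looks roughly like the
sketch below.  It assumes GCC (or a compatible compiler) on x86; the macro
USE_SSE2 and both function names are made up for illustration, and on other
targets the macro expands to nothing so the file still builds.  How far the
vectorizer's word size follows such a per-function switch depends on the
port.

```c
#include <stddef.h>

/* Assumption: GCC-compatible compiler targeting x86.  Elsewhere the
   attribute is dropped and both functions are compiled identically. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define USE_SSE2 __attribute__((target("sse2")))
#else
#define USE_SSE2
#endif

/* Compiled with whatever -m options were given on the command line. */
int sum_default(const int *v, size_t n)
{
    int s = 0;
    size_t i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}

/* Same loop, but this one function is compiled as if -msse2 had been
   given, so the back end may use 16-byte vectors here. */
USE_SSE2 int sum_sse2(const int *v, size_t n)
{
    int s = 0;
    size_t i;
    for (i = 0; i < n; i++)
        s += v[i];
    return s;
}
```

The pragma form, #pragma GCC target ("sse2"), applies the same setting to
the functions defined after it rather than to a single function.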

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com


Re: %pc relative addressing of string literals/const data

2010-11-08 Thread Dave Korn
On 08/11/2010 13:44, Joakim Tjernlund wrote:

> One ping and a few days later and nothing. Very frustrating. I don't
> believe all PPC devs are so "busy" that none has the time to look
> at a simple one liner. What is up?

  There's only the one of him.  He probably is that busy.  He's a very nice
bloke and wouldn't be snubbing you just to be nasty, but he does have a day
job as well as volunteering for GCC.

cheers,
  DaveK


Merging gdc (Gnu D Compiler) into gcc

2010-11-08 Thread Walter Bright
Who do I need to talk to in order to resolve the various licensing 
issues so this becomes possible?



Walter Bright
Digital Mars
http://www.digitalmars.com
free C, C++, D programming language compilers


Re: Merging gdc (Gnu D Compiler) into gcc

2010-11-08 Thread Joseph S. Myers
On Mon, 8 Nov 2010, Walter Bright wrote:

> Who do I need to talk to in order to resolve the various licensing issues so
> this becomes possible?

The FSF, via the Steering Committee, via this list.  The standard 
assignment and licensing policies are as described in the Mission 
Statement .  Any special arrangement 
like that for the Go front end (where part providing the GCC interface is 
assigned to the FSF and maintained in the GCC tree and part that could be 
used with other back ends is maintained externally with third-party 
copyright) needs specific approval.  (Note that the Go front end does not 
yet achieve the level of separation achieved by Ada, for example; there 
are plenty of uses of GCC's tree interfaces in the gofrontend/ directory 
that mean portability to other back ends is more theory than reality.)

In general I'd like to encourage maintainers of separate front ends - not 
limited to D - to work towards merging them into FSF GCC and maintaining 
them there; additional front ends help improve the quality of the core 
language-independent code, and no doubt GNU/Linux distributors would be 
glad to avoid the complexities of patching out-of-tree front ends into 
their GCC packages.  Front ends do of course need to pass technical review 
and do things in the ways that are considered current good practice for 
front ends instead of being gratuitously different, but when maintainers 
are ready to follow current good technical and licensing practice I think 
having them in FSF GCC benefits both the front ends and GCC.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Merging gdc (Gnu D Compiler) into gcc

2010-11-08 Thread Walter Bright
Joseph S. Myers wrote:
> On Mon, 8 Nov 2010, Walter Bright wrote:
>
>> Who do I need to talk to in order to resolve the various licensing issues so
>> this becomes possible?
>
> The FSF, via the Steering Committee, via this list.  The standard
> assignment and licensing policies are as described in the Mission
> Statement .  Any special arrangement
> like that for the Go front end (where part providing the GCC interface is
> assigned to the FSF and maintained in the GCC tree and part that could be
> used with other back ends is maintained externally with third-party
> copyright) needs specific approval.  (Note that the Go front end does not
> yet achieve the level of separation achieved by Ada, for example; there
> are plenty of uses of GCC's tree interfaces in the gofrontend/ directory
> that mean portability to other back ends is more theory than reality.)

The D specific part of gdc is already GPL, it's just copyrighted by
Digital Mars. I understand the copyright must be reassigned to the FSF.
Is it possible to fork the code, and assign copyright of one fork to the
FSF and leave the other copyrighted by Digital Mars?

> In general I'd like to encourage maintainers of separate front ends - not
> limited to D - to work towards merging them into FSF GCC and maintaining
> them there; additional front ends help improve the quality of the core
> language-independent code, and no doubt GNU/Linux distributors would be
> glad to avoid the complexities of patching out-of-tree front ends into
> their GCC packages.  Front ends do of course need to pass technical review
> and do things in the ways that are considered current good practice for
> front ends instead of being gratuitously different, but when maintainers
> are ready to follow current good technical and licensing practice I think
> having them in FSF GCC benefits both the front ends and GCC.

Sounds sensible.



Re: %pc relative addressing of string literals/const data

2010-11-08 Thread Peter Bergner
On Mon, 2010-11-08 at 21:13 +0000, Dave Korn wrote:
> On 08/11/2010 13:44, Joakim Tjernlund wrote:
> > One ping and a few days later and nothing. Very frustrating. I don't
> > believe all PPC devs are so "busy" that none has the time to look
> > at a simple one liner. What is up?
> 
>   There's only the one of him.  He probably is that busy.  He's a very nice
> bloke and wouldn't be snubbing you just to be nasty, but he does have a day
> job as well as volunteering for GCC.

Not to mention he was at the recent GCC Summit and probably has a large
backlog of email to catch up with.

Hälsningar,

Peter





Re: UNITS_PER_SIMD_WORD

2010-11-08 Thread roy rosen
This is what I did.
It works well.
Thanks to everybody.

2010/11/8 Michael Meissner :
> On Mon, Nov 01, 2010 at 04:52:28PM +0200, roy rosen wrote:
>> Hi All,
>>
>> Is it possible to define UNITS_PER_SIMD_WORD as a global variable and
>> to set this variable using a pragma (even once for a compilation) and
>> that way to be able to compile one file with UNITS_PER_SIMD_WORD = 8
>> and another file with UNITS_PER_SIMD_WORD = 16?
>
> The general way to do this would be to have an appropriate -m option to
> control this.  Typically it is based on the instructions available to the
> back end.
>
> If you have an -m option, you could presumably define target attributes or
> pragmas to change this for particular functions.  At the moment, the i386 port
> is the only one to support target attributes/pragmas, though I have just
> submitted patches for the rs6000 port to add them there, so you can see what
> types of changes are needed (note, I need to do another patch after the first
> is approved for builtin functions).
>
> --
> Michael Meissner, IBM
> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
> meiss...@linux.vnet.ibm.com
>


Re: define_split

2010-11-08 Thread roy rosen
2010/11/8 Michael Meissner :
> On Thu, Oct 28, 2010 at 09:11:44AM +0200, roy rosen wrote:
>> Hi all,
>>
>> I am trying to use define_split, but it seems to me that I don't
>> understand how it is used.
>> It says in the gccint.pdf (which I use as my tutorial (is there
>> anything better or more up to date?)) that the combiner only uses the
>> define_split if it doesn't find any define_insn to match. This is not
>> clear to me: If there is no define_insn to match a pattern in the
>> combine stage, then how is this pattern there in the first place? The
>> only way I can think for it to happen is that such a pattern was
>> created by the combiner itself (but it seems unreasonable to me that
>> we now want to split what the combiner just combined). Can someone
>> please explain this to me?
>
> Basically combine.c works by trying to build a combined insn, and then it sees
> if it matches an insn.  The canonical example is fused multiply/add (avoiding
> all of the issues with FMA's for the moment):
>
>        (define_insn "addsf3"
>          [(set (match_operand:SF 0 "f_operand" "=f")
>                (plus:SF (match_operand:SF 1 "f_operand" "f")
>                         (match_operand:SF 2 "f_operand" "f")))]
>          ""
>          "fadd %0,%1,%2")
>
>        (define_insn "mulsf3"
>          [(set (match_operand:SF 0 "f_operand" "=f")
>                (mult:SF (match_operand:SF 1 "f_operand" "f")
>                         (match_operand:SF 2 "f_operand" "f")))]
>          ""
>          "fmul %0,%1,%2")
>
>        (define_insn "fmasf3"
>         [(set (match_operand:SF 0 "f_operand" "=f")
>               (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
>                                 (match_operand:SF 2 "f_operand" "f"))
>                        (match_operand:SF 3 "f_operand" "f")))]
>         ""
>         "fma %0,%1,%2,%3")
>
> What the documentation is trying to say is that instead of the fmasf3
> define_insn, you could have a define_split.  I personally would do a
> define_insn_and_split for the combined insn.  Consider a weird machine that
> you need to do FMA in a special fixed register, but you need to do the
> multiply and add as separate instructions:
>
>        (define_split
>         [(set (match_operand:SF 0 "f_operand" "=f")
>               (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
>                                 (match_operand:SF 2 "f_operand" "f"))
>                        (match_operand:SF 3 "f_operand" "f")))]
>         ""
>         [(set (match_dup 4)
>               (mult:SF (match_dup 1)
>                        (match_dup 2)))
>          (set (match_dup 4)
>               (plus:SF (match_dup 4)
>                        (match_dup 3)))
>          (set (match_dup 0)
>               (match_dup 4))]
>         "operands[4] = gen_rtx_REG (SFmode, ACC_REGISTER);")
>
> I would probably write it as:
>
>        (define_insn_and_split "fmasf3"
>         [(set (match_operand:SF 0 "f_operand" "=f")
>               (plus:SF (mult:SF (match_operand:SF 1 "f_operand" "f")
>                                 (match_operand:SF 2 "f_operand" "f"))
>                        (match_operand:SF 3 "f_operand" "f")))
>          (clobber (reg:SF ACC_REGISTER))]
>         ""
>         "#"
>         ""
>         [(set (match_dup 4)
>               (mult:SF (match_dup 1)
>                        (match_dup 2)))
>          (set (match_dup 4)
>               (plus:SF (match_dup 4)
>                        (match_dup 3)))
>          (set (match_dup 0)
>               (match_dup 4))]
>         "operands[4] = gen_rtx_REG (SFmode, ACC_REGISTER);")
>
> In the old days, define_split was only processed before each of the two
> scheduling passes if they were run and at the very end if final finds an insn
> string "#".
>
> Now, it is always run several times.  Looking at passes.c, we see it is run:
>
>    Before the 2nd lower subreg pass (which is before the 1st scheduling pass)
>    After gcse2 and reload
>    Just before the 2nd scheduling pass
>    Before regstack elimination on x86
>    Before the exception handling/short branch passes
>
> I talked about this in part of my tutorial on using modern features in your MD
> file that I gave at this year's GCC summit:
> http://gcc.gnu.org/wiki/summit2010
>
> --
> Michael Meissner, IBM
> 5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
> meiss...@linux.vnet.ibm.com
>

I still don't understand the difference between your two examples:
If you write a define_split, then whenever combine builds a pattern that
matches the define_split, it splits it.

What is the difference when writing define_insn_and_split?
From what I understood from the docs, if there is such an insn then the
split does not occur, so it would simply match as an insn without
splitting and at the end would print the "#"?
Can you please elaborate?

Thanks, Roy.