Re: gcc.c-torture/execute/stdarg-2.c: long vs int

2005-08-23 Thread Jakub Jelinek
On Mon, Aug 22, 2005 at 08:38:01PM -0400, DJ Delorie wrote:
> 
> This test assumes that integer constants passed as varargs are
> promoted to a type at least as big as "long", which is not valid on 16
> bit hosts.  For example:
> 
> void
> f1 (int i, ...)
> {
>   va_start (gap, i);
>   x = va_arg (gap, long);
> 
> 
> int
> main (void)
> {
>   f1 (1, 79);
>   if (x != 79)
>     abort ();
> 
> 
> Shouldn't those constants be 79L, not just 79?  That change fixes one
> m32c failure, but given that it's a test case I'm not going to make
> any assumptions about it.

This certainly wasn't my intention, please change it to 79L.

Jakub
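
A minimal sketch of the point above, reduced from the quoted test. The helper name pass_long is an assumption for illustration and is not part of stdarg-2.c. An unsuffixed 79 has type int, and the default argument promotions do not widen int to long, so reading it back with va_arg (gap, long) is undefined in ISO C and visibly breaks on targets with a 16-bit int, such as the m32c case reported above. Writing 79L makes the argument type match what va_arg reads:

```c
#include <assert.h>
#include <stdarg.h>

static long x;

/* Reads one variadic argument as long, like f1 in the quoted test. */
static void f1(int i, ...)
{
    va_list gap;
    va_start(gap, i);
    x = va_arg(gap, long);   /* the caller must really pass a long */
    va_end(gap);
}

/* Hypothetical helper: passes a long constant and returns what f1 read. */
static long pass_long(void)
{
    f1(1, 79L);              /* 79L, not 79: the type matches va_arg above */
    assert(x == 79L);
    return x;
}
```

With plain 79 the call would still compile, which is why the mismatch went unnoticed on hosts where int and long are passed the same way.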


Re: Warning Behavior

2005-08-23 Thread Andreas Schwab
Ivan Novick <[EMAIL PROTECTED]> writes:

> How come the following code would not be considered a Warning?

Try -Wextra.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: please update the gcj main page

2005-08-23 Thread Florian Weimer
* Gerald Pfeifer:

> On Sun, 31 Jul 2005, Daniel Berlin wrote:
>> For code.
>> I have never seen such claims made for documentation, since it's much
>> easier to remove and deal with infringing docs than code.
>
> I have seen such statements, by RMS himself.

The official position might have changed (e.g. copyright assignments
and documentation).


Re: Memory usage reduction in loop.c ?

2005-08-23 Thread Giovanni Bajo
Christophe Jaillet <[EMAIL PROTECTED]> wrote:

> I think that the structure 'struct loop_info' in loop.c could be
> shrunk a bit if all the 'int has_XXX' fields were turned into a
> bitfield just as in 'struct iv_class' or 'struct induction' in the
> same file.
>
> I don't know if it is worth it (in terms of memory usage reduction),
> nor what the impact on performance would be.
>
> If anyone is interested, I can try it and do a bootstrap, but I don't
> have the tools to perform benchmarks (memory usage or speed of the
> compiler).

loop.c is a dead man walking. It will probably be removed in GCC 4.2, so I
wouldn't spend time on it. If you want to improve the RTL loop optimizers, look
into the new RTL loop optimizer (loop-*.c).

Giovanni Bajo
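
For reference, the kind of change Christophe describes can be sketched as follows. The field names are illustrative assumptions and are not copied from loop.c. Turning several int flags into one-bit bitfields shrinks the structure, at the cost of a mask or shift on each access:

```c
#include <assert.h>

/* One int per flag, in the style of the 'int has_XXX' fields. */
struct flags_as_ints {
    int has_call;
    int has_volatile;
    int has_tablejump;
    int has_indirect_jump;
};

/* The same flags packed as one-bit bitfields. */
struct flags_as_bits {
    unsigned int has_call : 1;
    unsigned int has_volatile : 1;
    unsigned int has_tablejump : 1;
    unsigned int has_indirect_jump : 1;
};

/* Bytes saved per structure instance by the bitfield layout. */
static long bytes_saved(void)
{
    assert(sizeof(struct flags_as_bits) <= sizeof(struct flags_as_ints));
    return (long)sizeof(struct flags_as_ints)
         - (long)sizeof(struct flags_as_bits);
}
```

Whether the saving matters depends on how many instances are live at once, which is the measurement the original poster said he could not make.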



Successful build of gcc-3.4.4 on Mac OS X 10.2.8

2005-08-23 Thread Xavier Dectot
Not that it comes as a big surprise, but I successfully compiled
gcc-3.4.4 on darwin 6.8 (as specified by uname -a)


config.guess reports
powerpc-apple-darwin.6.8

gcc -v reports
Configured with ../gcc-3.4.4/configure --program-suffix=-3.4.4 
--enable-languages=c,c++,f77,java,objc

Thread model: posix
gcc version: 3.4.4

compiled for c, c++, fortran and java

bootstrapped from apple's gcc-3.3

No problems to signal, clean compile.

Xavier



Question about an rtx expression.

2005-08-23 Thread Leehod Baruch
Hello,

Is it true that in a SET, a search for a _use_ of a register
in the LHS should be done only inside a memory address?

Like in this SET:

(set (mem:SI (plus:DI (reg:DI 159)
   (reg/v/f:DI 150 )))
 (subreg/s:SI (reg/v:DI 142 [ j ]) 4)) -1 (nil)

Registers 142, 159 and 150 are used and no register is defined.


Thanks,
Leehod.


Re: Question about an rtx expression.

2005-08-23 Thread Paolo Bonzini

Leehod Baruch wrote:


Hello,

Is it true that in a SET, a search for a _use_ of a register
in the LHS should be done only inside a memory address?

Also within the second and third arguments of a ZERO_EXTRACT.  And its 
first argument may be a MEM, in which case you should look into it.  
Look at df_uses_record in df.c for more information.


But can't you simply use the data flow info you compute, and just avoid
uses that have the DF_REF_READ_WRITE flag set (because they occur in the
LHS, or within an autoincrement/autodecrement)?


Paolo
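
The rule under discussion can be illustrated with a toy expression tree. This is a sketch under stated assumptions: struct expr, count_regs and count_set_uses are invented here and are not GCC's rtx or df.c interfaces, and the ZERO_EXTRACT and autoincrement cases Paolo mentions are left out. In the destination of a SET only registers inside a MEM address are counted as uses, while every register in the source is a use:

```c
#include <assert.h>
#include <stddef.h>

enum code { REG, MEM, PLUS, SUBREG, SET };

/* Toy stand-in for an RTL expression; not GCC's rtx. */
struct expr {
    enum code code;
    int regno;                /* meaningful for REG only */
    const struct expr *op0;   /* first operand or NULL */
    const struct expr *op1;   /* second operand or NULL */
};

/* Count every register that appears anywhere below x. */
static int count_regs(const struct expr *x)
{
    if (x == NULL)
        return 0;
    if (x->code == REG)
        return 1;
    return count_regs(x->op0) + count_regs(x->op1);
}

/* Count register uses in a SET: all of the source, plus the
   address registers when the destination is a MEM. */
static int count_set_uses(const struct expr *set)
{
    assert(set != NULL && set->code == SET);
    int uses = count_regs(set->op1);
    if (set->op0->code == MEM)
        uses += count_regs(set->op0->op0);
    return uses;
}

/* The SET from the question: registers 159 and 150 form the address,
   register 142 is read through a SUBREG. */
static int example_uses(void)
{
    static const struct expr r159 = { REG, 159, NULL, NULL };
    static const struct expr r150 = { REG, 150, NULL, NULL };
    static const struct expr r142 = { REG, 142, NULL, NULL };
    static const struct expr addr = { PLUS, 0, &r159, &r150 };
    static const struct expr mem  = { MEM, 0, &addr, NULL };
    static const struct expr sub  = { SUBREG, 0, &r142, NULL };
    static const struct expr set  = { SET, 0, &mem, &sub };
    return count_set_uses(&set);
}
```

For the quoted insn this reports three uses and no definition, which matches Leehod's reading.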


[GCC 4.x][AMD64 ABI] variadic function

2005-08-23 Thread Matteo Emanuele
Hi to everyone,
 I cannot figure out how variadic functions are
implemented in practice. In the called (variadic)
function, after a few 'push's, %rsp is suddenly
decremented by N bytes: does the red area start 128 bytes
below the NEW %rsp, or above %rsp-N?
Is it possible to find the register save area and the
overflowing arguments within the called function
without using %ebp (that means with
-fomit-frame-pointer set) and knowing nothing of the
caller?
Is the so-called spill reg area placed at a fixed
address?
Thanks in advance,
 Matteo
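
As a small, hedged illustration of what the callee sees from C: under the System V AMD64 calling convention the first six integer arguments travel in registers and any further ones go on the stack, so a call with enough arguments exercises both the register save area and the overflow area. The function name sum_ints is an assumption made for this sketch. va_arg hides which of the two areas each value comes from:

```c
#include <assert.h>
#include <stdarg.h>

/* Sum 'count' int arguments.  With count in the first register,
   five variadic ints can still arrive in registers; the rest are
   fetched from the caller's stack area by va_arg. */
static long sum_ints(int count, ...)
{
    va_list ap;
    long total = 0;

    assert(count >= 0);
    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

Compiling a call such as sum_ints (8, 1, 2, 3, 4, 5, 6, 7, 8) with -S -fomit-frame-pointer shows where the compiler saves the argument registers relative to %rsp in the prologue.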






m64

2005-08-23 Thread ji an
Hello,

can anyone tell me how to use option -m64 in 
g++ (GCC) 3.4.3 20050227 (Red Hat 3.4.3-22.1)?

when I input the command line:
>g++ -m64 -o test test.cc

error message was output:

/tmp/ccyjpGIh.o(.text+0x900): In function `main':
: relocation truncated to fit: R_X86_64_32
.
.
.

best regards

Jian







Re: Warning Behavior

2005-08-23 Thread jlh
Andreas Schwab wrote:

> Try -Wextra.

Ah thanks!  I have already lost time several times due to this
almost invisible mistake and I didn't know -Wextra would catch it.
However, it seems to only work for the C compiler, not for C++.

(Using GCC 3.4.4)

(Oops, sorry Andreas, I actually meant to only send the message
to the list)

jlh





Re: [GCC 4.x][AMD64 ABI] variadic function

2005-08-23 Thread Florian Weimer
* Matteo Emanuele:

> Is it possible to  find the register save area and the
> overflowing arguments within the called function
> without using %ebp (that means with
> -fomit-frame-pointer set) and knowing nothing of the
> caller?

You mean, if the caller called the function as if it were a non-variadic
function?


Re: [RFA] Nonfunctioning split in rs6000 back-end

2005-08-23 Thread Paolo Bonzini

David Edelsohn wrote:


Paolo Bonzini writes:
   



Paolo> I'm testing a patch that does this replacement, and I can post it 
Paolo> tomorrow morning.  It has triggered only a dozen times so far (half in 
Paolo> libgcc, half in the compiler), but it may be worth keeping it.


It would be nice to keep this type of optimization if the
re-engineered version works.

 


Here it is, bootstrapped and regtested on powerpc-apple-darwin8.1.0.

Ok for mainline?

Paolo
2005-08-22  Paolo Bonzini  <[EMAIL PROTECTED]>

* config/rs6000/predicates.md (equality_operator): New.
* config/rs6000/rs6000.md: Rewrite as a peephole2 the split for
comparison with a large constant.

Index: config/rs6000/predicates.md
===
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/predicates.md,v
retrieving revision 1.23
diff -p -u -r1.23 predicates.md
--- config/rs6000/predicates.md 11 Aug 2005 21:18:11 -  1.23
+++ config/rs6000/predicates.md 22 Aug 2005 20:44:32 -
@@ -710,6 +710,10 @@
 (define_predicate "boolean_or_operator"
   (match_code "ior,xor"))
 
+;; Return true if operand is an equality operator.
+(define_special_predicate "equality_operator"
+  (match_code "eq,ne"))
+
 ;; Return true if operand is MIN or MAX operator.
 (define_predicate "min_max_operator"
   (match_code "smin,smax,umin,umax"))
Index: rs6000.md
===
RCS file: /cvs/gcc/gcc/gcc/config/rs6000/rs6000.md,v
retrieving revision 1.400
diff -p -u -r1.400 rs6000.md
--- rs6000.md   20 Aug 2005 04:17:17 -  1.400
+++ rs6000.md   22 Aug 2005 20:41:44 -
@@ -10727,32 +10727,43 @@
   [(set_attr "type" "cmp")])
 
 ;; If we are comparing a register for equality with a large constant,
-;; we can do this with an XOR followed by a compare.  But we need a scratch
-;; register for the result of the XOR.
-
-(define_split
-  [(set (match_operand:CC 0 "cc_reg_operand" "")
-   (compare:CC (match_operand:SI 1 "gpc_reg_operand" "")
-   (match_operand:SI 2 "non_short_cint_operand" "")))
-   (clobber (match_operand:SI 3 "gpc_reg_operand" ""))]
-  "find_single_use (operands[0], insn, 0)
-   && (GET_CODE (*find_single_use (operands[0], insn, 0)) == EQ
-   || GET_CODE (*find_single_use (operands[0], insn, 0)) == NE)"
-  [(set (match_dup 3) (xor:SI (match_dup 1) (match_dup 4)))
-   (set (match_dup 0) (compare:CC (match_dup 3) (match_dup 5)))]
-  "
-{
-  /* Get the constant we are comparing against, C,  and see what it looks like
- sign-extended to 16 bits.  Then see what constant could be XOR'ed
- with C to get the sign-extended value.  */
-
-  HOST_WIDE_INT c = INTVAL (operands[2]);
+;; we can do this with an XOR followed by a compare.  But this is profitable
+;; only if the large constant is only used for the comparison (and in this
+;; case we already have a register to reuse as scratch).
+
+(define_peephole2
+  [(set (match_operand:GPR 0 "register_operand")
+(match_operand:GPR 1 "logical_operand" ""))
+   (set (match_dup 0) (match_operator:GPR 3 "boolean_or_operator"
+  [(match_dup 0)
+   (match_operand:GPR 2 "logical_operand" "")]))
+   (set (match_operand:CC 4 "cc_reg_operand" "")
+(compare:CC (match_operand:GPR 5 "gpc_reg_operand" "")
+(match_dup 0)))
+   (set (pc)
+(if_then_else (match_operator 6 "equality_operator"
+   [(match_dup 4) (const_int 0)])
+  (match_operand 7 "" "")
+  (match_operand 8 "" "")))]
+  "peep2_reg_dead_p (3, operands[0])"
+ [(set (match_dup 0) (xor:GPR (match_dup 5) (match_dup 9)))
+  (set (match_dup 4) (compare:CC (match_dup 0) (match_dup 10)))
+  (set (pc) (if_then_else (match_dup 6) (match_dup 7) (match_dup 8)))]
+ 
+{
+  /* Get the constant we are comparing against, and see what it looks like
+ when sign-extended from 16 to 32 bits.  Then see what constant we could
+ XOR with SEXTC to get the sign-extended value.  */
+  rtx cnst = simplify_const_binary_operation (GET_CODE (operands[3]),
+ GET_MODE (operands[3]),
+ operands[1], operands[2]);
+  HOST_WIDE_INT c = INTVAL (cnst);
   HOST_WIDE_INT sextc = ((c & 0xffff) ^ 0x8000) - 0x8000;
   HOST_WIDE_INT xorv = c ^ sextc;
 
-  operands[4] = GEN_INT (xorv);
-  operands[5] = GEN_INT (sextc);
-}")
+  operands[9] = GEN_INT (xorv);
+  operands[10] = GEN_INT (sextc);
+})
 
 (define_insn "*cmpsi_internal2"
   [(set (match_operand:CCUNS 0 "cc_reg_operand" "=y")
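
The arithmetic in the preparation statement can be checked in isolation. The sketch below assumes the mask is the 16-bit one (0xffff) and uses int64_t in place of HOST_WIDE_INT; the function names are made up for illustration. It shows why reg == c can be tested as (reg ^ xorv) == sextc: xorv has its low 16 bits clear, and sextc fits in a signed 16-bit immediate.

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of c, sign-extended (the "sextc" of the patch). */
static int64_t sext16(int64_t c)
{
    return ((c & 0xffff) ^ 0x8000) - 0x8000;
}

/* The constant to XOR with (the "xorv" of the patch). */
static int64_t xor_part(int64_t c)
{
    return c ^ sext16(c);
}

/* Equality with a large constant via XOR followed by a compare
   against a small constant. */
static int eq_via_xor(int64_t reg, int64_t c)
{
    int64_t sextc = sext16(c);
    int64_t xorv = xor_part(c);

    assert((xorv & 0xffff) == 0);      /* only high bits are flipped */
    assert(sextc >= -0x8000 && sextc <= 0x7fff);
    return (reg ^ xorv) == sextc;
}
```

Since xorv is c ^ sextc, the test (reg ^ xorv) == sextc holds exactly when reg == c.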


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Andreas Krebbel
Hello,

sorry for the late answer.

> Vlad promised to update it to use df.c once it wasn't "1% slower", which
> would make it easily reusable elsewhere, but never did.
> Of course, you could reuse it without that, but then someone will
> invariably come along and mess with it.

Ok I understand that implementing the special liveness analyzer in global alloc
using the df.c framework would ease reusing it somewhere else. But my question
was more basic.
So do you agree that using one liveness analyzer for checking what
an optimizer step has done based on a second liveness analyzer's output
is wrong? If so what is the way to fix this? Going back to the normal analyzer
to be used in global alloc would make global alloc create worse code. But on the
other hand using the global alloc liveness analyzer everywhere else would be a
change which nobody would agree with in the current development stage.

Because this is a regression from 4.0 to 4.1 this should be fixed as soon as
possible.

Bye,

-Andreas-


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Bernd Schmidt

Andreas Krebbel wrote:


Ok I understand that implementing the special liveness analyzer in global alloc
using the df.c framework would ease reusing it somewhere else. But my question
was more basic.
So do you agree that using one liveness analyzer for checking what
an optimizer step has done based on a second liveness analyzer's output
is wrong? If so what is the way to fix this? Going back to the normal analyzer
to be used in global alloc would make global alloc create worse code. But on the
other hand using the global alloc liveness analyzer everywhere else would be a
change which nobody would agree with in the current development stage.


Jim Wilson once suggested we should just emit insns to make sure every 
register is initialized and be done with it - problem solved.  I had 
started to work on that, if people think it's a good idea I can dig that 
stuff out again.



Bernd


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Daniel Berlin
On Tue, 2005-08-23 at 16:44 +0200, Bernd Schmidt wrote:
> Andreas Krebbel wrote:
> 
> > Ok I understand that implementing the special liveness analyzer in global
> > alloc using the df.c framework would ease reusing it somewhere else. But my
> > question was more basic.
> > So do you agree that using one liveness analyzer for checking what
> > an optimizer step has done based on a second liveness analyzer's output
> > is wrong? If so what is the way to fix this? Going back to the normal
> > analyzer to be used in global alloc would make global alloc create worse
> > code. But on the other hand using the global alloc liveness analyzer
> > everywhere else would be a change which nobody would agree with in the
> > current development stage.
> 
> Jim Wilson once suggested we should just emit insns to make sure every 
> register is initialized and be done with it - problem solved.  

But doesn't this actually make the information you get worse?

Partial liveness gives you an answer, which is "It's not really live
here, because it's not defined"

If you make them all defined, then it's going to be live where it wasn't
before, even though it's not really *used* over those paths.


> I had 
> started to work on that, if people think it's a good idea I can dig that 
> stuff out again.
> 
> 
> Bernd



Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Bernd Schmidt

Daniel Berlin wrote:


If you make them all defined, then it's going to be live where it wasn't
before, even though it's not really *used* over those paths.


The idea is to put the initialization insns only on the paths where the 
register will be uninitialized.



Bernd


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Steven Bosscher
On Tuesday 23 August 2005 17:06, Bernd Schmidt wrote:
> The idea is to put the initialization insns only on the paths where the
> register will be uninitialized.

int foo (int n)
{
  int a;

  while (--n)
    a = n;

  return a;
}

Not knowing n, how can you be sure whether "a" is uninitialized for
the "return" statement or not?

Gr.
Steven



Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Daniel Berlin
On Tue, 2005-08-23 at 17:06 +0200, Bernd Schmidt wrote:
> Daniel Berlin wrote:
> 
> > If you make them all defined, then it's going to be live where it wasn't
> > before, even though it's not really *used* over those paths.
> 
> The idea is to put the initialization insns only on the paths where the 
> register will be uninitialized.

Again, that will just make the register live over those paths, when it
wasn't before, which makes your information about liveness worse.

IE if you had

int foo(void)
{
int a;

if (blah)
  a = 5;


}

and you transform this to:
int foo(void)
{
int a;

if (blah)
  a = 5;
else
  a = 0;


}

a is now considered live over both paths of the branch, whereas with the
partial availability liveness, it will only be considered live over the
path where it is actually initialized before use, which is the if branch.

Conservative initialization will also lead to sets you can't
eliminate, and will generate real code, even if unreachable in practice.

Consider:
int argc;

int foo(void) {
int a;

while (argc--)
  a = 


}
Because you don't know the value of argc, dataflow will tell you it may
be uninitialized here.
To make it initialized, you'd have to conservatively transform this to:
int a;

a = 0;
while (argc--)
  a = 



Because you still don't know the value of argc, you won't be able to
remove the a = 0.

Besides not being able to remove them, you have to worry about placement
when it comes to loops.
Consider the  simple nested loops

for i = 1 to 10
{
  while (argv--)
 {
a = 
 }
}


If you just "stupidly" place the initializations (IE don't do LCM like
dataflow to determine where they can go), you will transform this into:

for i = 1 to 10
{
  a = 0;
  while (argv--)
 {
a = 
 }
}




You could avoid all but the "worse information" problem by tracking
which sets you added, and thus, know you can remove the sets if things
get bad,  since they don't affect the original program.

However, this probably ends up being just as ugly as the partial
liveness stuff.
--Dan



Re: Question about an rtx expression.

2005-08-23 Thread Ian Lance Taylor
Leehod Baruch <[EMAIL PROTECTED]> writes:

> Is it true that in a SET, a search for a _use_ of a register
> in the LHS should be done only inside a memory address?

See refers_to_regno_p for an example of a function which looks for
all uses of a register.

Ian


Re: Bug in builtin_floor optimization

2005-08-23 Thread Roger Sayle

On Mon, 22 Aug 2005, Dale Johannesen wrote:
> There is some clever code in convert_to_real that converts
>
> double d;
> (float)floor(d)
>
> to
>
> floorf((float)d)
> ...
>
> Comments? Should I preserve the buggy behavior with -ffast-math?

Good catch.  This is indeed a -ffast-math (or more precisely a
flag_unsafe_math_optimizations) transformation.  I'd prefer to
keep these transformations with -ffast-math, as Jan described them
as significantly helping SPEC's mesa when they were added.

My one comment is that we should try to make sure that we continue
to optimize the common safe case (even without -ffast-math):

float x, y;
x = floor(y);

i.e. that (float)floor((double)y) is the same as floorf(y).
Hmm, it might be good to know the relative merits of the safe vs.
unsafe variants.  If the majority of the benefit is from the "safe"
form, I wouldn't be opposed to removing the "unsafe" form completely,
if people think its an optimization "too far".

Thanks for investigating this.

Roger
--



Re: please update the gcj main page

2005-08-23 Thread John M. Gabriele

--- Florian Weimer <[EMAIL PROTECTED]> wrote:

> * Gerald Pfeifer:
> 
> > On Sun, 31 Jul 2005, Daniel Berlin wrote:
> >> For code.
> >> I have never seen such claims made for documentation, since it's much
> >> easier to remove and deal with infringing docs than code.
> >
> > I have seen such statements, by RMS himself.
> 
> The official position might have changed (e.g. copyright assignments
> and documentation).
> 

I had one thing I'd like to add to this thread:

I spend some amount of time updating various GNU/Linux-related
docs on the web. Before wiki's became popular (or, at least, before
I knew about them), updating a project's docs meant figuring out
how to get the site's source via cvs, learning LinuxDoc/DocBook,
and sending patches or getting commit access. I never got involved
with that.

Now that many projects are using wiki's, I can log in, make
corrections/additions, and log out. Not to mention how simple
most wiki formatting rules are. It's a piece of cake. The only
thing that bugs me is that sometimes the wiki police trample
over some nicely crafted bit of work I've done, but that's not
too often.

Devs on these mailing lists have repeatedly mentioned how receptive
they are to having more newb-friendly docs contributed, but it's
just *so* *darn* *easy* to work with a wiki that I'm spoiled rotten,
and I'm quickly getting too lazy to start doing it the old way.

(It occurs to me to wonder if tldp is beginning to see fewer
updates to their docs because folks are preferring to use wiki's.)

IMO, it's best to keep wiki's editable only by folks/accounts
that've been approved somehow. It shouldn't be too much trouble
for a wiki maintainer to enable/disable users as-needed. (Though
some folks have mentioned that they monitor the wiki continuously
and are emailed notifications every time a change is made, so
maybe it's not necessary to only allow approved contributors.)

Anyhow, that's my opinion FWIW, coming from someone who writes
pretty good newb-friendly docs, on various wiki's, every now and
again. IMO, if there's some issue with licensing/copyright and
wiki's for GNU projects, it should be straightened out so everyone
can easily start contributing to the docs, wiki-style. That seems
to be the future of web docs AFAICT.

---John







Re: Bug in builtin_floor optimization

2005-08-23 Thread Richard Henderson
On Tue, Aug 23, 2005 at 09:28:50AM -0600, Roger Sayle wrote:
> Good catch.  This is indeed a -ffast-math (or more precisely a
> flag_unsafe_math_optimizations) transformation.  I'd prefer to
> keep these transformations with -ffast-math, as Jan described them
> as significantly helping SPEC's mesa when they were added.

Are you sure it was "(float)floor(d)"->"floorf((float)d)" that
helped mesa and not "(float)floor((double)f)"->"floorf(f)" ?

It wouldn't bother me if the first transformation went away
even for -ffast-math.  It seems egregiously wrong.


r~


Assembling pending decls before writing their debug info

2005-08-23 Thread Nick Clifton
Hi Guys,

  There is a problem with unit-at-a-time compilation and DWARF debug
  info generation.  Consider this small test case which has been
  derived from GDB's observer.c source file:

int observer_test_first_observer = 0;
int observer_test_second_observer = 0;
int observer_test_third_observer = 0;

void observer_test_first_notification_function (void)
{
  observer_test_first_observer++;
}

void observer_test_second_notification_function (void)
{
  observer_test_second_observer++;
}

void observer_test_third_notification_function (void)
{
  observer_test_third_observer++;
}

  When compiled with the current mainline gcc sources for an x86
  native target and with "-g -O2 -dA" on the command line the
  following debug info is produced:

[snip]
.long   .LASF0  # DW_AT_name: "observer_test_first_observer"
.byte   0x1     # DW_AT_decl_file
.byte   0x1     # DW_AT_decl_line
.long   0x37    # DW_AT_type
.byte   0x1     # DW_AT_external
.byte   0x5     # DW_AT_location
.byte   0x3     # DW_OP_addr
.long   observer_test_first_observer
.uleb128 0x3    # (DIE (0x37) DW_TAG_base_type)
.ascii "int\0"  # DW_AT_name
.byte   0x4     # DW_AT_byte_size
.byte   0x5     # DW_AT_encoding
.uleb128 0x4    # (DIE (0x3e) DW_TAG_variable)
.long   .LASF1  # DW_AT_name: "observer_test_second_observer"
.byte   0x1     # DW_AT_decl_file
.byte   0x2     # DW_AT_decl_line
.long   0x37    # DW_AT_type
.byte   0x1     # DW_AT_external
.byte   0x0     # DW_AT_const_value
[snip]

  Note how observer_test_first_observer is correctly defined as having
  a DW_AT_location and a DW_OP_addr whereas
  observer_test_second_observer is incorrectly defined as having a
  DW_AT_const_value.  ie the debug info is saying that it is a
  variable without a location in memory.

  The reason for this behaviour is that the debug information is being
  written out before the variables have been fully resolved.  In
  particular DECL_SET() for the second and third observer functions is
  NULL when the debug info is generated, which is why they are being
  given the DW_AT_const_value attribute.
  
  In trying to solve this I found that switching the order of the
  calls to lang_hooks.decls.final_write_globals() and 
  cgraph_varpool_assemble_pending_decls() in compile_file() worked,
  and this seemed to be intuitively correct.  But when I reran the gcc
  testsuite I found that the change introduced a regression:
  gcc.dg/varpool-1.c now had the variable
  "unnecessary_static_initialized_variable" still defined at the end
  of compilation.

  I have investigated some more but not gotten much further, so I am
  asking for help.  Can anyone suggest where the conflict between
  generating the debug info and deciding if the variable is going to
  be emitted should really be resolved ?

Cheers
  Nick
  



Re: Assembling pending decls before writing their debug info

2005-08-23 Thread Andrew Pinski
> 
> Hi Guys,
> 
>   There is a problem with unit-at-a-time compilation and DWARF debug
>   info generation.  Consider this small test case which has been
>   derived from GDB's observer.c source file:

There were even more issues with uninitialized variables a month ago.
This was all caused by Mark's patch to fix PR 18556.
This is a regression from 3.4.x.

Thanks,
Andrew Pinski


Re: Bug in builtin_floor optimization

2005-08-23 Thread Dale Johannesen


On Aug 23, 2005, at 9:53 AM, Richard Henderson wrote:


On Tue, Aug 23, 2005 at 09:28:50AM -0600, Roger Sayle wrote:

Good catch.  This is indeed a -ffast-math (or more precisely a
flag_unsafe_math_optimizations) transformation.  I'd prefer to
keep these transformations with -ffast-math, as Jan described them
as significantly helping SPEC's mesa when they were added.


Are you sure it was "(float)floor(d)"->"floorf((float)d)" that
helped mesa and not "(float)floor((double)f)"->"floorf(f)" ?


All the floor calls in mesa seem to be of the form (int)floor((double)f)
or (f - floor((double)f)).  (the casts to double are implicit, 
actually.)



It wouldn't bother me if the first transformation went away
even for -ffast-math.  It seems egregiously wrong.


I think I'd prefer this, given that it is not useful in mesa.  Will put
together a patch.



Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread James E Wilson
On Tue, 2005-08-23 at 07:44, Bernd Schmidt wrote:
> Jim Wilson once suggested we should just emit insns to make sure every 
> register is initialized and be done with it - problem solved.  I had 
> started to work on that, if people think it's a good idea I can dig that 
> stuff out again.

I'd like this because of an IA-64 specific problem.

IA-64 has Not-a-Thing (NaT) bits, which are used for speculation.  If a
speculative load fails, the NaT bit is set, which indicates that we must
refetch the value before using it.  NaT bits propagate through most
operations, allowing us to speculate a series of instructions instead of
just loads.  However, they will generate an illegal instruction
exception if used in an operation with side-effects, like a store.

So the problem here is that any use of an uninitialized register may
generate an exception, if the instruction has side-effects, and the
uninitialized register just happens to have the NaT bit set.

Mostly we get by because gcc doesn't have speculation support yet, but
it is only a matter of time before someone writes it.  Meanwhile, there
are some hand-written glibc routines that do use speculation, and could
potentially trigger this problem.  This is a disaster waiting to happen
for anyone using gcc on IA-64 machines.

I created PR 2 for this problem, and it contains an artificial
testcase that demonstrates the problem using bitfield assignments.
-- 
Jim Wilson, GNU Tools Support, http://www.specifix.com



Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread Kelley Cook
--- Tom Tromey <[EMAIL PROTECTED]> wrote:
> > "KC" == Kelley Cook <[EMAIL PROTECTED]> writes:
> 
> KC> 2005-08-19  Kelley Cook  <[EMAIL PROTECTED]>
> KC>   * Makefile.am (ACLOCAL_AMFLAGS): Also include "..".
> KC>   * acinclude.m4: Delete.  Extract CHECK_FOR_BROKEN_MINGW_LD to
> ...
> KC>   * mingwld.m4: ... this new file.
> KC>   * aclocal.m4, Makefile.in, gcj/Makefile.in: Regenerate. 
> KC>   * include/Makefile.in, testsuite/Makfile.in: Regenerate.
> 
> You used automake 1.9.4 to build Makefile.in.

Yes, Andrew had used 1.9.4 in his patch
(http://gcc.gnu.org/ml/gcc-cvs/2005-08/msg00618.html) from a few days
before, so I did also.  I actually had to download that version first.

> AIUI, with the exception of libgfortran, the tree is currently
> standardized on automake 1.9.3.  I wouldn't mind an update, but it
> ought to be done globally and install.texi ought to be updated.
> Meanwhile, having folks using different versions causes cvs churn...

Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating
throughout the tree.  I propose standardizing the entire tree on 1.9.6,
as it is the current release; moreover the 1.9 branch has only had a
few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might
be stable for a while.

> 
> Probably we should have a script in contrib/ that downloads and
> builds all the currently-required tool versions.

This would be very cool.


Re: Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread Benjamin Kosnik

Thanks Tom for pointing this out. We have to all keep these autotools
versions synced: it bugs everybody to have extraneous differences in
trees due to version mis-match.

>Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating
>throughout the tree.  

How did this happen?

>I propose standardizing the entire tree on 1.9.6,
>as it is the current release; moreover the 1.9 branch has only had a
>few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might
>be stable for a while.

I am in support of this. The sooner the better.

>> Probably we should have a script in contrib/ that downloads and
>> builds all the currently-required tool versions.

>This would be very cool.

That seems like the only solution to end this continual issue. I'm strongly
in favor of it.

-benjamin


Re: gcc.c-torture/execute/stdarg-2.c: long vs int

2005-08-23 Thread DJ Delorie

> This certainly wasn't my intention, please change it to 79L.

How's this?  It passes both m32c and x86-64.

2005-08-23  DJ Delorie  <[EMAIL PROTECTED]>

* gcc.c-torture/execute/stdarg-2.c (main): Make sure long
constants have the L suffix.

Index: gcc.c-torture/execute/stdarg-2.c
===
RCS file: /cvs/gcc/gcc/gcc/testsuite/gcc.c-torture/execute/stdarg-2.c,v
retrieving revision 1.2
diff -p -U3 -r1.2 stdarg-2.c
--- gcc.c-torture/execute/stdarg-2.c    3 Nov 2004 21:53:39 -   1.2
+++ gcc.c-torture/execute/stdarg-2.c    23 Aug 2005 18:27:57 -
@@ -143,8 +143,8 @@ f12 (int i, ...)
 int
 main (void)
 {
-  f1 (1, 79);
-  if (x != 79)
+  f1 (1, 79L);
+  if (x != 79L)
 abort ();
   f2 (0x4002, 13, -14.0);
   if (bar_arg != 0x4002)


Re: Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread DJ Delorie

> Thanks Tom for pointing this out. We have to all keep these
> autotools versions synced: it bugs everybody to have extraneous
> differences in trees due to version mis-match.

Could we modify the CVS commit filters to *require* the right
versions?  If it detects a commit with the wrong version (at least,
assuming the old rev had the right version), it can just reject it.


pushl vs movl + movl on x86

2005-08-23 Thread Dan Nicolaescu

For this code (from PR23525):

extern int waiting_for_initial_map;
extern int cp_pipe[2];
extern int pc_pipe[2];
extern int close (int __fd);

void
first_map_occurred(void)
{
close(cp_pipe[0]);
close(pc_pipe[1]);
waiting_for_initial_map = 0;
}

gcc -march=i686 -O2 generates: 

movl    cp_pipe, %eax
movl    %eax, (%esp)
call    close
movl    pc_pipe+4, %eax
movl    %eax, (%esp)
call    close

The Intel compiler with the same flags generates:

pushl cp_pipe   #9.11
call  close #9.5
pushl 4+pc_pipe #10.11
call  close #10.5
 

gcc -march=i686 -Os generates similar code to the Intel compiler.

Is there a performance difference between the movl + movl and pushl
code sequences? If not maybe then gcc should generate pushl for -O2
too because it is smaller code.

Thanks



Re: Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread Benjamin Kosnik

> Could we modify the CVS commit filters to *require* the right
> versions?  If it detects a commit with the wrong version (at least,
> assuming the old rev had the right version), it can just reject it.

Dunno if this is possible, but this would be great. It would be nice if
there was a way to set different versions per branch. For instance, the
gcc-4_0-branch, gcc-3_4-branch, and mainline might have different
autotools requirements.

-benjamin


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Bernd Schmidt

Steven Bosscher wrote:
> On Tuesday 23 August 2005 17:06, Bernd Schmidt wrote:
>
>> The idea is to put the initialization insns only on the paths where the
>> register will be uninitialized.
>
> int foo (int n)
> {
>   int a;
>
>   while (--n)
>     a = n;
>
>   return a;
> }
>
> Not knowing n, how can you be sure whether "a" is uninitialized for
> the "return" statement or not?

In this case, assuming nothing interesting happens to the loop, you'll 
have to conservatively initialize "a" near the top of the function.  In 
many cases you can do better and either initialize just before the use, 
or initialize on an edge on which the register is uninitialized.  For 
register allocation purposes however, this should be as good as using 
Vlad's new liveness analysis.


As Jim points out, we may have to do that for IA64 anyway, so we could 
consider doing it on all targets.  Dan is correct that this can 
introduce new code that won't be eliminated.  One question is how often 
this is going to occur in practice.
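
A minimal sketch of the conservative case for Steven's example, assuming the initialization is inserted by hand at the top of the function. The name foo_conservative and the value 0 are illustrative assumptions, not what GCC would emit.

```c
/* Steven's example with a conservative initialization of "a" added at
   the top of the function.  The value 0 is an arbitrary placeholder.  */
int
foo_conservative (int n)
{
  int a = 0;   /* hypothetical inserted initialization */

  while (--n)
    a = n;

  return a;
}
```

With n == 1 the loop body never runs and the inserted value is what gets returned; for larger positive n the loop overwrites it, so on those paths the extra store is pure overhead.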



Bernd


Re: [RFA] Nonfunctioning split in rs6000 back-end

2005-08-23 Thread Giovanni Bajo
Paolo Bonzini <[EMAIL PROTECTED]> wrote:

> While researching who is really using flow's computed LOG_LINKS, I
> found 
> a define_split in the rs6000 back-end that uses them through
> find_single_use.  It turns out the only users are combine, this split,
> and a function in regmove.


See also:
http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02371.html

Giovanni Bajo



Re: pushl vs movl + movl on x86

2005-08-23 Thread Richard Henderson
On Tue, Aug 23, 2005 at 11:40:16AM -0700, Dan Nicolaescu wrote:
> Is there a performance difference between the movl + movl and pushl
> code sequences?

In this case, no.

> If not maybe then gcc should generate pushl for -O2
> too because it is smaller code.

It's not quite as simple as you make out.  You can get pushes out
of gcc with -mno-accumulate-outgoing-args, but then we have to add
other compensation code elsewhere.

IIRC, it was fairly well explored that we get equal or better
performance by not using pushes on P2 class machines and later.


r~


RE: pushl vs movl + movl on x86

2005-08-23 Thread Menezes, Evandro
Dan, 

> Is there a performance difference between the movl + movl and 
> pushl code sequences? 

Not in this example, but movl is faster in some circumstances than pushl.  A 
sequence of pushl has an implicit dependency chain on %esp, as it changes after 
each pushl, whereas a sequence of movl could enjoy better ILP.  However, the 
movl encoding is considerably longer than pushl, as you pointed out, which may 
affect cache efficiency.  

Therefore, the sweet spot is somewhere in the middle.  It's more important to 
use movl wisely in prologs and epilogs than when passing arguments though.  
For, as RTH mentioned, -maccumulate-outgoing-args is desirable to avoid 
frequent stack maintenance.

That being said, it depends largely on the underlying architecture 
implementation.

HTH


-- 
___
Evandro Menezes    AMD    Austin, TX



Re: Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread Tom Tromey
> "KC" == Kelley Cook <[EMAIL PROTECTED]> writes:

KC> Unfortunately, we have automake 1.9.3, 1.9.4 and 1.9.5 floating
KC> throughout the tree.  I propose standardizing the entire tree on 1.9.6,
KC> as it is the current release; moreover the 1.9 branch has only had a
KC> few minor patches since 1.9.6 was released 6 weeks ago so 1.9.6 might
KC> be stable for a while.

This sounds great to me.

>> Probably we should have a script in contrib/ that downloads and
>> builds all the currently-required tool versions.

KC> This would be very cool.

I submitted one.

Tom


SSE builtins for ia32

2005-08-23 Thread Paul Koning
Two things I'm wondering about:

1. Why do __builtin_ia32_paddusb and similar functions take signed
   vector arguments, when the hardware primitive is defined to operate
   on unsigned vectors?

2. Why are there no sse equivalents of those functions, ones that
   operate on 128 bit values (i.e., paddusb for v16qi vectors)?

  paul



Re: gcc.c-torture/execute/stdarg-2.c: long vs int

2005-08-23 Thread Mark Mitchell

DJ Delorie wrote:

>> This certainly wasn't my intention, please change it to 79L.
>
> How's this?  It passes both m32c and x86-64.
>
> 2005-08-23  DJ Delorie  <[EMAIL PROTECTED]>
>
> 	* gcc.c-torture/execute/stdarg-2.c (main): Make sure long
> 	constants have the L suffix.

OK.
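
For reference, a reduced sketch of the pattern the patch fixes, modeled on the f1 fragment quoted earlier in the thread. The variable name ap and the va_end call are additions for illustration and are not copied from stdarg-2.c.

```c
#include <stdarg.h>

long x;

/* Reads one variadic argument as a long, the way f1 in the test does.  */
void
f1 (int i, ...)
{
  va_list ap;

  va_start (ap, i);
  x = va_arg (ap, long);   /* the caller must really have passed a long */
  va_end (ap);
}
```

Calling f1 (1, 79L) leaves x equal to 79.  A plain 79 is passed as an int, which does not match va_arg (ap, long) and breaks on targets such as m32c where int is narrower than long.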

--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: Searching for a branch for the see optimization.

2005-08-23 Thread Mark Mitchell

Steven Bosscher wrote:

On Monday 22 August 2005 14:46, Leehod Baruch wrote:


Hello,

I would like to know if someone knows a suitable branch for the sign
extension optimization pass.



Why not just maintain it in a local tree and post refined
versions every now and then, until stage 1 for GCC 4.2 opens?
Branches are for major work and a new pass is not that major.


It's also fine to create a new branch for this work.  That lets other 
people see what you're working on.


--
Mark Mitchell
CodeSourcery, LLC
[EMAIL PROTECTED]
(916) 791-8304


Re: Automake versions (was: Patch to make libgcj work with autoreconf again)

2005-08-23 Thread Ian Lance Taylor
Benjamin Kosnik <[EMAIL PROTECTED]> writes:

> > Could we modify the CVS commit filters to *require* the right
> > versions?  If it detects a commit with the wrong version (at least,
> > assuming the old rev had the right version), it can just reject it.
> 
> Dunno if this is possible, but this would be great.

This is possible--the file to modify is CVSROOT/commitinfo, to run
some script for a specific set of files.

> It would be nice if
> there was a way to set different versions per branch. For instance, the
> gcc-4_0-branch, gcc-3_4-branch, and mainline might have different
> autotools requirements.

I'm not sure this is available.  It might be possible to look in
CVS/Tag to find the branch tag for the file.  I don't know whether
that file is certain to exist when commitinfo is run, but it seems
that it might.
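
A rough sketch of the check such a filter could run on each generated file, assuming the first line of Makefile.in has the usual form "# Makefile.in generated by automake 1.9.6 from Makefile.am."  The helper name header_has_automake_version is hypothetical, and how the commitinfo script would get at the file contents is left open.

```c
#include <string.h>

/* Returns 1 if LINE looks like an automake-generated header naming exactly
   the version REQUIRED, 0 otherwise.  Hypothetical helper, not part of CVS.  */
int
header_has_automake_version (const char *line, const char *required)
{
  static const char marker[] = "generated by automake ";
  const char *p = strstr (line, marker);
  size_t len = strlen (required);

  if (p == NULL)
    return 0;
  p += sizeof marker - 1;
  if (strncmp (p, required, len) != 0)
    return 0;
  /* Reject longer versions such as 1.9.60 or 1.9.6.1.  */
  return p[len] == ' ' || p[len] == '\0' || p[len] == '\n';
}
```

A per-branch policy would then only need a different REQUIRED string for each branch tag, if the tag turns out to be available when the filter runs.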

Ian


Re: SSE builtins for ia32

2005-08-23 Thread Richard Henderson
On Tue, Aug 23, 2005 at 04:32:42PM -0400, Paul Koning wrote:
> 1. Why do __builtin_ia32_paddusb and similar functions take signed
>    vector arguments, when the hardware primitive is defined to operate
>    on unsigned vectors?

Because the interface you're actually supposed to be using
is _mm_adds_pu8, which uses an opaque type.  The underlying
builtins all use signed vectors because it was simple to 
make them all the same.

> 2. Why are there no sse equivalents of those functions, ones that
>    operate on 128 bit values (i.e., paddusb for v16qi vectors)?

There are.  See _mm_adds_epu8 in emmintrin.h.
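
A small sketch of the 128-bit form, assuming an x86 target with SSE2 enabled.  The wrapper name add_saturate_u8 is made up for the example; only the intrinsics come from emmintrin.h.

```c
#include <emmintrin.h>

/* Adds 16 unsigned bytes from A and B with saturation, storing into OUT.  */
void
add_saturate_u8 (const unsigned char *a, const unsigned char *b,
                 unsigned char *out)
{
  /* Unaligned loads, so the caller's arrays need no special alignment.  */
  __m128i va = _mm_loadu_si128 ((const __m128i *) a);
  __m128i vb = _mm_loadu_si128 ((const __m128i *) b);

  _mm_storeu_si128 ((__m128i *) out, _mm_adds_epu8 (va, vb));
}
```

The caller only ever sees unsigned char data and the opaque __m128i type, so the signedness of the underlying builtin's vector arguments never shows up.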


r~


gcc-3.4-20050823 is now available

2005-08-23 Thread gccadmin
Snapshot gcc-3.4-20050823 is now available on
  ftp://gcc.gnu.org/pub/gcc/snapshots/3.4-20050823/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 3.4 CVS branch
with the following options: -rgcc-ss-3_4-20050823 

You'll find:

gcc-3.4-20050823.tar.bz2  Complete GCC (includes all of below)

gcc-core-3.4-20050823.tar.bz2 C front end and core compiler

gcc-ada-3.4-20050823.tar.bz2  Ada front end and runtime

gcc-g++-3.4-20050823.tar.bz2  C++ front end and runtime

gcc-g77-3.4-20050823.tar.bz2  Fortran 77 front end and runtime

gcc-java-3.4-20050823.tar.bz2 Java front end and runtime

gcc-objc-3.4-20050823.tar.bz2 Objective-C front end and runtime

gcc-testsuite-3.4-20050823.tar.bz2    The GCC testsuite

Diffs from 3.4-20050816 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-3.4
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Peter Bergner
On Tue, 2005-08-23 at 21:26 +0200, Bernd Schmidt wrote:
> As Jim points out, we may have to do that for IA64 anyway, so we could 
> consider doing it on all targets.  Dan is correct that this can 
> introduce new code that won't be eliminated.  One question is how often 
> this is going to occur in practice.

The IBM iSeries (aka AS/400) compiler actually inserts definitions
on edges where a pseudo/register is undefined.  However, unlike the
discussion here, our "pseudo" definitions never lead to generated
code.  Our pseudo definitions were added to simplify some analysis
phases in the compiler (eg, liveness can be simplified down to LIVE
rather than LIVE & AVAIL).  Note that we needed to handle these pseudo
definitions specially in some cases so they don't reduce optimization
opportunities.  If I remember correctly (it's been a while since I
left the team):

1) All pseudo defs get the value of  so rematerialization,
   etc. are not pessimized.
2) Pseudo definitions are ignored during the interference graph
   construction (ie, they never cause edges to be added to the
   interference graph).
3) More things I can't think of at the moment.

This was a win for the iSeries compiler since a fair number of
applications were/are written in RPG which is essentially a one
procedure application, so the number of basic blocks and live
ranges/webs can be quite high.  I recall one program we ran into
that had about 150K basic blocks and about 1.5M live ranges.

I know we used to have a white paper describing the internals of the
iSeries compiler (titled "The AS/400 Optimizing Translator"), but all
of the links I can find are stale.  However, I did come across their
patent (5,761,514) describing the idea: "Register allocation method
and apparatus for truncating runaway lifetimes of program variables
in a computer system".  I have no idea whether this was one of the
patents made available by IBM for use by the OSS community or not.

Peter

--
Peter Bergner
Linux on Power Toolchain
IBM Linux Technology Center




Re: Problem with the special live analyzer in global alloc

2005-08-23 Thread Daniel Berlin
On Tue, 2005-08-23 at 22:10 -0500, Peter Bergner wrote:
> On Tue, 2005-08-23 at 21:26 +0200, Bernd Schmidt wrote:
> > As Jim points out, we may have to do that for IA64 anyway, so we could 
> > consider doing it on all targets.  Dan is correct that this can 
> > introduce new code that won't be eliminated.  One question is how often 
> > this is going to occur in practice.
> 
> The IBM iSeries (aka AS/400) compiler actually inserts definitions
> on edges where a pseudo/register is undefined.  However, unlike the
> discussion here, our "pseudo" definitions never lead to generated
> code
I listed that as a possible option, the problem is that you have to know
that they are pseudo definitions, and teach other things this too.
This is the part i alluded to being probably uglier than partial
liveness analysis itself.
> .  Our pseudo definitions were added to simplify some analysis
> phases in the compiler (eg, liveness can be simplified down to LIVE
> rather than LIVE & AVAIL).  Note that we needed to handle these pseudo
> definitions specially in some cases so they don't reduce optimization
> opportunities. 
Like this :)

Is LIVE & AVAIL really that much slower these days for most programs?

I imagine if you have 300k bb's or 1.5 million live pseudos to consider,
it probably makes a real difference, but that's not *too* common in our
supported languages (30k bb's/150k pseudos is probably the practical
upper limit of what we see, though i'm sure someone is going to say
they've seen larger :P)


> I know we used to have a white paper describing the internals of the
> iSeries compiler (titled "The AS/400 Optimizing Translator"), but all
> of the links I can find are stale.  However, I did come across their
> patent (5,761,514) describing the idea: "Register allocation method
> and apparatus for truncating runaway lifetimes of program variables
> in a computer system".  I have no idea whether this was one of the
> patents made available by IBM for use by the OSS community or not.

Just FYI, I've read this patent, and regardless of whether you think
this is something that should have been patented, etc, the claims are
broad enough to cover any approach, as long as you are doing liveness
analysis and then inserting something into the instruction stream to
truncate the ranges (real, fake, whatever).

However, if this is what you guys want to do, please don't let that stop
you.  Let me know if you want to go this route, and we'll work on
getting IBM to release it.

--Dan