Re: __builtin_cpow((0,0),(0,0))

2005-03-17 Thread Ronny Peine

Dave Korn wrote:
> Original Message
>> From: Ronny Peine
>> Sent: 16 March 2005 17:34
>
>> See for example:
>> http://mathworld.wolfram.com/ExponentLaws.html
>
>   Ok, I did.
>
>> Even though, gcc returns 1 for pow(0.0,0.0) in version 3.4.3 like many
>> other C compilers do. The same behaviour would be expected from cpow.
>
>   No, you're wrong (that the same behaviour would be expected from cpow).
> See for example:
> http://mathworld.wolfram.com/ExponentLaws.html
> " Note that these rules apply in general only to real quantities, and can
> give manifestly wrong results if they are blindly applied to complex
> quantities. "

Well yes, in the general case it's not applicable, but x^0 is 1 in the
complex case, too. And if 0^0 is converted from the real to the complex
domain (it's even a part of the complex domain) then the same behaviour
would be expected, otherwise the definition wouldn't be very good.

Has anyone found a hint in the IEEE 754 standard as to whether there is
something about it in there? I don't have a copy here right now, since it's
not free of charge. Otherwise this discussion won't end.

> cheers,
>   DaveK

cu, Ronny


Re: Question about how to compile multiple files with g++

2005-03-17 Thread Mike Stump
On Mar 16, 2005, at 11:05 PM, Yen wrote:
> I have a problem to compile multiple files together, so please
> everybody give me a help, thanks!

Wrong list, try gcc-help instead.



Compiler chokes on a simple template - why?

2005-03-17 Thread Topi Maenpaa
Hi,

Here is a snippet that does not compile with gcc 3.4.1 (on Mandrake 10.1).

---
template <class T> class A
{
public:
  template <class U> void test(T value) {}
};

template <class T> void test2(A<T>& a, T val)
{
  a.test<int>(val);
}

int main()
{
  A<int> a;
  a.test<int>(1); //works fine
}
---

$ g++ -o test test.cc
test.cc: In function `void test2(A<T>&, T)':
test.cc:9: error: expected primary-expression before "int"
test.cc:9: error: expected `;' before "int"

The funny thing is that if I change the name of the "test2" function to 
"test", everything is OK. The compiler complains only if the functions have 
different names. Why does the name matter?

The code compiles if "test2" is not a template function. Furthermore, calling 
A::test directly from main rather than through the template function works 
fine.

I don't know if this is really a compiler thing, but it's hard to imagine the 
standard would impose such behavior.

Please cc your thoughts to me, I'm not a subscriber.

Thanks.
-Topi-


Re: Bootstrap failure in varasm.c at assemble_alias

2005-03-17 Thread Andreas Schwab
Benjamin Redelings I <[EMAIL PROTECTED]> writes:

> Hi guys,
>   Just wanted to note that I'm getting a bootstrap failure in varasm.c.
>
> gcc -c   -g -O2 -DIN_GCC   -W -Wall -Wwrite-strings -Wstrict-prototypes 
> -Wmissing-prototypes -fno-common   -DHAVE_CONFIG_H -I. -I. 
> -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include 
> -I../../gcc/gcc/../libcpp/include  ../../gcc/gcc/varasm.c -o varasm.o
> ../../gcc/gcc/varasm.c: In function `const_rtx_hash_1':
> ../../gcc/gcc/varasm.c:2854: warning: right shift count >= width of type
> ../../gcc/gcc/varasm.c: In function `assemble_alias':
> ../../gcc/gcc/varasm.c:4524: error: parse error before '<<' token

Remove the conflict markers.

Andreas.

-- 
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: Questions about trampolines

2005-03-17 Thread Clifford Wolf
hi,

On Wed, Mar 16, 2005 at 02:48:56PM -0500, Robert Dewar wrote:
> Yes, but that avoids the difficulty, that's obvious so far.
> 
> The problem is to know exactly when to pop the stack, and that is
> not trivial (longjmp, exceptions, non local gotos).

Hmm, what about doing it gc-like? Instead of a stack there is simply a
'pool' of trampolines from which trampolines are allocated, and a pointer to
the trampoline is pushed on the stack.

When the last trampoline from the pool has been allocated, a 'garbage
collector' scans the memory between the stack pointer and the stack start
address, looking for pointers to trampolines. Every trampoline for which no
possible reference is found is added to a free-list from which new
trampolines are allocated.

When no trampoline can be allocated, abort() is called (with or without
crossjumping ;-).

This may be a dirty hack, and guessing a good size for the trampoline pool
is still an issue - but it could be implemented easily and would work...
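
To make the scheme above concrete, here is a minimal sketch of such a pool. It is only an illustration under stated assumptions: Trampoline, TrampolinePool, allocate() and free_count() are hypothetical names, the "stack" is passed in as an explicit range of words instead of being read from the stack pointer, and a real trampoline would hold target-specific code rather than two plain pointers.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for one trampoline; a real one holds target code.
struct Trampoline {
    void (*target)();    // bare function address
    void *static_chain;  // frame of the enclosing function
};

template <std::size_t N>
class TrampolinePool {
public:
    TrampolinePool() {
        for (std::size_t i = 0; i < N; ++i) free_.push_back(&slots_[i]);
    }

    // Hand out a slot. When the pool is exhausted, scan the stack range
    // [lo, hi) conservatively and recycle every slot no word points at.
    Trampoline *allocate(const std::uintptr_t *lo, const std::uintptr_t *hi) {
        if (free_.empty()) collect(lo, hi);
        if (free_.empty()) std::abort();  // nothing could be reclaimed
        Trampoline *t = free_.back();
        free_.pop_back();
        return t;
    }

    std::size_t free_count() const { return free_.size(); }

private:
    // Only called when every slot is in use, so each slot is either kept
    // live or pushed onto the free list exactly once.
    void collect(const std::uintptr_t *lo, const std::uintptr_t *hi) {
        std::array<bool, N> live{};  // all false
        const auto base = reinterpret_cast<std::uintptr_t>(slots_.data());
        const auto end = base + N * sizeof(Trampoline);
        for (const std::uintptr_t *p = lo; p != hi; ++p) {
            // Any word that looks like an address inside the pool keeps
            // the slot it points into alive.
            if (*p >= base && *p < end)
                live[(*p - base) / sizeof(Trampoline)] = true;
        }
        for (std::size_t i = 0; i < N; ++i)
            if (!live[i]) free_.push_back(&slots_[i]);
    }

    std::array<Trampoline, N> slots_{};
    std::vector<Trampoline *> free_;
};
```

The scan is conservative: an integer on the stack that merely looks like a pool address keeps a slot alive, which wastes a slot but never frees one that is still in use.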

Instead of adding the trampoline pool to libgcc (as suggested earlier in
this thread), I would suggest that gcc generate a trampoline pool in a
linkonce section every time a source file which requires trampolines is
compiled. That way there wouldn't be any trampoline pool in an
executable which doesn't need one, and a compiler option such as
-ftrampoline-pool-size=32 could be used to specify the size of the
trampoline pool on the command line.

The only issue I see with that is that the trampoline pool will actually
consist of two sections: one for the code and one for the data. AFAIR there
is a bug with linkonce sections connected to data sections (triggered e.g.
when a big switch statement in a function in a C++ template is compiled
using a jump table; this may lead to code references in .data to dropped
linkonce code sections). Might this also become an issue here? Or is the
bug fixed already?

yours,
 - clifford

--
ocaml graphics.cma <( echo 'open Graphics;;open_graph " 640x480"let
complex_mul(a,b)(c,d)=(a*.c-.b*.d,a*.d+.b*.c)let complex_add(a,b)(c
,d)=(a+.c,b+.d);;let rec mandel c n=if n>0 then let z=mandel c(n-1)
in complex_add(complex_mul z z)c else (0.0,0.0);; for x=0 to 640 do
for y=0 to 480 do let c=((float_of_int(x-450))/.200.0,(float_of_int
(y-240))/.200.0) in let cabs2(a,b)=(a*.a)+.(b*.b)in if cabs2(mandel
c 50)<4.0 then plot x y done done;;read_line()' )
 
M$ is not the answer. M$ is the question. No is the answer!
 




Re: Compiler chokes on a simple template - why?

2005-03-17 Thread Giovanni Bajo
Topi Maenpaa <[EMAIL PROTECTED]> wrote:

> ---
> template <class T> class A
> {
> public:
>   template <class U> void test(T value) {}
> };
>
> template <class T> void test2(A<T>& a, T val)
> {
>   a.test<int>(val);
> }
>
> int main()
> {
>   A<int> a;
>   a.test<int>(1); //works fine
> }
> ---

This is ill-formed. You need to write:

a.template test<int>(val);

because 'a' is a dependent name.


> The funny thing is that if I change the name of the "test2" function
> to "test", everything is OK. The compiler complains only if the
> functions have different names. Why does the name matter?

This is surely a bug. Would you please file a bug report about this?

> The code compiles if "test2" is not a template function. Furthermore,
> calling A::test directly from main rather than through the
> template function works fine.

This is correct, because if "test2" is not a template function name anymore,
then 'a' is not a dependent name, and the 'template' keyword is not needed to
disambiguate the parser.

Giovanni Bajo
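
For readers following along, a small self-contained sketch of the rule described above may help. The names Box, convert, through_template and direct are made up for illustration and are not taken from the original test case.

```cpp
// Hypothetical class with a member template, mirroring the shape of the
// posted example.
template <class T>
class Box {
public:
    template <class U>
    U convert(T value) const { return static_cast<U>(value); }
};

template <class T>
int through_template(const Box<T>& b, T val)
{
    // The type of 'b' depends on T, so while parsing this template the
    // compiler cannot know that 'convert' names a member template.
    // Without the 'template' keyword the '<' would be parsed as less-than.
    return b.template convert<int>(val);
}

inline int direct()
{
    Box<double> b;  // not dependent: the compiler already knows Box<double>
    return b.convert<int>(2.9);  // so the keyword is not needed here
}
```

Calling through_template(Box<double>(), 3.7) and direct() both compile; dropping the keyword in through_template is what produces the "expected primary-expression" diagnostic.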



Re: Compiler chokes on a simple template - why?

2005-03-17 Thread Jonathan Wakely
On Thu, Mar 17, 2005 at 10:33:54AM +0200, Topi Maenpaa wrote:

> Hi,
> 
> Here is a snippet that does not compile with gcc 3.4.1 (on Mandrake 10.1).
> 
> ---
> template <class T> class A
> {
> public:
>   template <class U> void test(T value) {}
> };
> 
> template <class T> void test2(A<T>& a, T val)
> {
>   a.test<int>(val);

This needs to be:

   a.template test<int>(val);

> }
> 
> int main()
> {
>   A<int> a;
>   a.test<int>(1); //works fine
> }
> ---

> 
> $ g++ -o test test.cc
> test.cc: In function `void test2(A<T>&, T)':
> test.cc:9: error: expected primary-expression before "int"
> test.cc:9: error: expected `;' before "int"

Because test2 is a template it doesn't know what A is (in the general
case) so you need to tell the compiler that a.test is a function template,
otherwise it is parsed as a member variable, giving "a.test less-than int",
which doesn't make sense.

> The funny thing is that if I change the name of the "test2" function to 
> "test", everything is OK. The compiler complains only if the functions have 
> different names. Why does the name matter?

That I'm not sure about ...
I would have expected it to fail with the same error when the function
is called "test" - but I'd be wrong apparently.

> The code compiles if "test2" is not a template function.

Because in that case the compiler knows the full definition of A and
knows that a.test refers to a function template, not a member variable
(for instance).

>  Furthermore, calling 
> A::test directly from main rather than through the template function works 
> fine.

Again, in that context the compiler knows that a.test is a function
template.

> I don't know if this is really a compiler thing, but it's hard to imagine the 
> standard would impose such behavior.

Yes, it's ugly. No, it's not a bug. It's required by the standard  :-(

jon

-- 
"I find television very educating. Every time somebody turns on the set, 
 I go into the other room and read a book."
- Groucho Marx


Re: __builtin_cpow((0,0),(0,0))

2005-03-17 Thread Gabriel Dos Reis
Ronny Peine <[EMAIL PROTECTED]> writes:

| Dave Korn wrote:
| > Original Message
| >
| >>From: Ronny Peine
| >>Sent: 16 March 2005 17:34
| >
| >>See for example:
| >>http://mathworld.wolfram.com/ExponentLaws.html
| >>
| >   Ok, I did.
| >
| >> Even though, gcc returns 1 for pow(0.0,0.0) in version 3.4.3 like
| >> many other c-compiler do. The same behaviour would be expected from
| >> cpow.
| >   No, you're wrong (that the same behaviour would be expected from
| > cpow).
| > See for example:
| > http://mathworld.wolfram.com/ExponentLaws.html
| > " Note that these rules apply in general only to real quantities,
| > and can
| > give manifestly wrong results if they are blindly applied to complex
| > quantities. "
| > 
| 
| Well yes in the general case it's not applicable, but x^0 is 1 in the
| complex case, too.

Just repeating it does not make it a reality.

| And if 0^0 is converted from the real to the
| complex domain (it's even a part of the complex domain) then the same
| behaviour would be expected, otherwise the definition wouldn't be very
| good.

The point is that real or complex exponentiation is not the same as integer
exponentiation.  The latter has less freedom than the former.

| Has anyone found a hint in the ieee754 standard if there is something
| about it in there? I haven't one here right now, well it's not
| free of charge. Otherwise this discussion won't end.

there are several standards, among which IEEE-754 and the ISO standard
LIA (designed to correct the IEEE-754 shot).  IEEE-754 does not
concern itself with complex arithmetic (though C99 made some
interesting and innovative extensions).  I already quoted part 2 of
LIA. Part 3 of LIA, concerning complex arithmetic, is being developed
and is in its second stage.  It is consistent with LIA-2.

-- Gaby
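
For anyone who wants to see what their own library does, here is a small sketch. It assumes a C++ library whose real pow follows C99 Annex F, where pow(x, +-0) is 1 for every x. The helper names are hypothetical, and the complex result is deliberately returned unchecked, since what it should be is exactly what this thread disputes.

```cpp
#include <cmath>
#include <complex>

// Real case: with an Annex F conforming libm this returns 1.
inline double real_zero_pow_zero()
{
    return std::pow(0.0, 0.0);
}

// Complex case: returns whatever the library computes for (0,0)^(0,0).
// No particular value is assumed here.
inline std::complex<double> complex_zero_pow_zero()
{
    const std::complex<double> zero(0.0, 0.0);
    return std::pow(zero, zero);
}
```

Printing the real and imaginary parts of complex_zero_pow_zero() on a given system shows which convention that library picked.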


Re: libgcc-std.ver question

2005-03-17 Thread Richard Henderson
On Wed, Mar 16, 2005 at 05:43:32PM -0800, Mike Stump wrote:
> I have a question about libgcc export for shared libraries...  libgcc
> exports (via libgcc-std.ver):
> 
>   __ffsdi2
> 
> but not:
> 
>   __ffssi2

I suppose it would be ok, but it would only be relevant for 
embedded targets where "int" < SImode.  Otherwise we use the
plain "ffs" symbol in libc.
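
As a reminder of what these routines compute, here is a portable reference version. ffs_reference is a hypothetical name used only for this sketch; it is meant to show the semantics of ffs (1-based index of the least significant set bit, 0 for a zero argument), not how libgcc or libc implement it.

```cpp
// Returns the 1-based position of the lowest set bit, or 0 if x is zero.
inline int ffs_reference(unsigned int x)
{
    if (x == 0) return 0;
    int index = 1;
    while ((x & 1u) == 0) {  // shift until the lowest bit is set
        x >>= 1;
        ++index;
    }
    return index;
}
```

With GCC, the result can be compared against __builtin_ffs for the same argument.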


r~


problems compiling gcc-3.3.1

2005-03-17 Thread Amit Thakar
Hello,
The following is the error I am getting while compiling gcc-3.3.1. I am using
the headers of my system. How do I get rid of this?
 

In file included from tconfig.h:23,
 from ../../../gcc-3.3.1/gcc/libgcc2.c:36:
../../../gcc-3.3.1/gcc/config/i386/linux.h:232:20: signal.h: No such file or 
directory
../../../gcc-3.3.1/gcc/config/i386/linux.h:233:26: sys/ucontext.h: No such file 
or directory
make[2]: *** [libgcc/./_muldi3.o] Error 1
make[2]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make[1]: *** [libgcc.a] Error 2
make[1]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make: *** [all-gcc] Error 2


Regards
Amit


Re: Compiler chokes on a simple template - why?

2005-03-17 Thread Jonathan Wakely
On Thu, Mar 17, 2005 at 11:03:53AM +0100, Giovanni Bajo wrote:

> Topi Maenpaa <[EMAIL PROTECTED]> wrote:
> 
> > The funny thing is that if I change the name of the "test2" function
> > to "test", everything is OK. The compiler complains only if the
> > functions have different names. Why does the name matter?
> 
> This is surely a bug. Would you please file a bug report about this?

That's what I thought - comeau seems to have the same bug btw.

jon

-- 
"You can lead a horticulture but you can't make her think."
- Dorothy Parker


Re: Questions about trampolines

2005-03-17 Thread Joern RENNECKE
Clifford Wolf wrote:

> hmm.. what's about doing it gc-like. Instead of a stack there simply is a
> 'pool' of trampolines from which trampolines are allocated and a pointer to
> the trampoline is pushed on the stack.
> When the last trampoline from the pool is allocated, a 'garbage collector'
> is running over it and looking for pointers to trampolines between the
> stack pointer and the stack start address. Every trampoline which isn't
> possibly referenced is added to a free-list from which new trampolines are
> allocated.

If you have only one processor stack (i.e. single-threaded execution), you
can handle the trampolines as a stack too.  You don't need to deallocate
till you allocate again, and then you adjust the trampoline stack so none of
its static chain pointers points to a deallocated frame, or to the current
frame (since you are only about to set up the trampolines for the current
frame then).

If you have multiple processor stacks, you have to register and later search
them all in order to make the garbage-collection scheme work.
that it doesn't point at any deallocated frames.

> Instead of adding the trampoline pool to libgcc (as suggested earlier in
> this thread) I would suggest that gcc generates a trampoline pool in a
> linkonce section every time a source file is compiled which requires
> trampolines. That way there wouldn't be any trampoline pool in an
> executable which doesn't need one

You don't need a linkonce section for this.  The function that needs a
trampoline calls allocation / deallocation functions, or if it inlines the
code, it will reference the pool start addresses - either way, it will
reference some symbols.  By putting the .o file that provides these symbols
along with the code and data parts of the trampoline pool into a static
library - libgcc.a or otherwise - you make sure that the object is only
linked in when needed.

> and a compiler option such as
> -ftrampoline-pool-size=32 could be used to specify the size of the
> trampoline pool on the command line.

This is messy; say you have two libraries that are compiled with
-ftrampoline-pool-size=32; they will then share a trampoline pool of 32
entries.  If you compile one with -ftrampoline-pool-size=16 instead, you
will have them using different pools, or maybe even get some multiply
defined symbols.
It is much saner to make this a link time option.  By selecting a specific
library for the trampoline pool, you can adjust the size on a per-program
(or per-DSO, if you don't export) basis, and you might even choose an
alternate allocation strategy.  I.e. you could have libgcc provide one with
a size that works most of the time and uses destructors for portability and
robustness, have a specialized lightweight one you can specifically use for
single-threaded programs, and have a 64 bit linux specific one that ties
into the threading code (or is part of a threads package) and mmaps
trampoline code pages for every processor stack allocated, sufficiently
large and at a fixed offset to the stack so that you can put the data part
on the return stack in any suitably aligned position, and have a matching
trampoline.

I.e. the bare function address and the static chain pointer are 8 bytes each,
so that a trampoline data part is 16 bytes.  You require them to be 16-byte
aligned on any processor stack.
The mmapped trampoline can be an absolute function call to some helper code
that does the real work, using the return address to figure out which
trampoline is executed.  This call should fit into 16 bytes too, so in the
trampoline page to be mmapped, every 16 bytes there is such an absolute call
insn.  You can get a 1:1 correspondence between trampolines and processor
stacks by allocating the stacks all in one specific memory area, and have an
equally-sized area where trampolines are mapped.  Thus, you can have
differently-sized stacks, yet the trampoline code can add a constant offset
to the return address to find the data part of the trampoline.
  




Re: Questions about trampolines

2005-03-17 Thread Clifford Wolf
Hi,

On Thu, Mar 17, 2005 at 01:35:29PM +, Joern RENNECKE wrote:
> I.e. you could have libgcc provide one with a size that works most of the
> time

Some applications have recursions which go into a depth of 1000 and more.
Some architectures have only a few k ram. Which "a size that works most of
the time" would you suggest?

It's ugly to have a static pool size. But it's intolerable to not allow the
user to change that pool size easily using an option.

> The mmapped trampoline can be an absolute function call to some helper 
> code that does the

I am pretty sure that all processor architectures with such a strict Harvard
design that it is impossible to generate dynamic code are MMU-less.

yours,
 - clifford

-- 
bash -c "gcc -o mysdldemo -Wall -O2 -lSDL -lm -pthread -x c <( echo -e '
#include \n#include \nint main(){SDL_Surface*s;SDL_Event
e;int x,y,n;SDL_Init(SDL_INIT_VIDEO);s=SDL_SetVideoMode(640,480,32,0);for(x=0;
x<640;x++)for(y=0;y<480;y++){float _Complex z=0, c=((x-400)/200.0) + ((y-240)/
200.0)*1.0fi;for(n=1;n<64;n++){z=z*z+c;if(cabsf(z)>2){((Uint32*)s->pixels)[x+y
*640]=n<<3;n=99;}}}SDL_UpdateRect(s,0,0,s->w,s->h);do SDL_WaitEvent(&e); while
(e.type!=SDL_QUIT&&e.type!=SDL_KEYDOWN);SDL_Quit();return 0;}' ); ./mysdldemo"
 
M$ is not the answer. M$ is the question. No is the answer!
 




Re: Questions about trampolines

2005-03-17 Thread Joern RENNECKE
Clifford Wolf wrote:

> Some applications have recursions which go into a depth of 1000 and more.
> Some architectures have only a few k ram. Which "a size that works most of
> the time" would you suggest?
> It's ugly to have a static pool size. But it's intolerable to not allow the
> user to change that pool size easily using an option.

Of course the user can change the size, by using a library with a different
size.  But there should be a sensible default.  The size of that default can
vary from target to target.

> > The mmapped trampoline can be an absolute function call to some helper
> > code that does the
>
> I am pretty sure that all processor architectures with such a strict Harvard
> design that it is impossible to generate dynamic code are MMU-less.

The application of the MMU-based scheme is more to accelerate trampolines by
avoiding cache coherency issues, without making allocation / deallocation
more expensive.
In fact, since the code is already there, the initialization is cheaper than
for classic stack-based trampolines on pure von Neumann architectures.

FWIW, for processor-stack based trampolines, if we could guarantee that
trampolines are the only code that can be executed on the stack, we could
avoid the memory-Icache coherency issue altogether by allocating entire
cache lines for trampolines on the stack, and filling them up with
trampolines (at least the code part), with a code part that does not change
for any given stack location.  I.e. after writing the code, we'd have to
flush it to memory, but wouldn't need to invalidate the Icache, since the
only old code that could be there would be identical to the code just
written.
  



Re: Questions about trampolines

2005-03-17 Thread Robert Dewar
Joern RENNECKE wrote:
> Of course the user can change the size, by using a library with a
> different size.

This is not an acceptable approach in a production environment,
where switching libraries can force revalidation and retesting.


Re: Questions about trampolines

2005-03-17 Thread Joern RENNECKE
Robert Dewar wrote:
> Joern RENNECKE wrote:
>> Of course the user can change the size, by using a library with a
>> different size.
>
> This is not an acceptable approach in a production environment,
> where switching libraries can force revalidation and retesting.

This sounds more like a problem with your process than a genuine technical
problem.
Why should an option that selects a different library be less safe than an
option that changes code generation?
But if you really want to, you can of course select a different module out
of the same library, by playing with --defsym.



short int and conversions

2005-03-17 Thread Andrea
Hi,
I'm trying to port gcc 4.1 to an architecture that has the following
memory layout: BITS_PER_UNIT=32 and UNITS_PER_WORD=1.
It has support (16-bit registers and operators) for 16-bit signed
arithmetic, used mainly for addressing. There are also operators for 32-bit
integer and floating point support.
I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).
 
I reserved QImode and QFmode for 32-bit integer/floating point operations,
and I defined a new fractional int mode, FRACTIONAL_INT_MODE (PQ, 16, 1), for
pointers and short int operations.
When I try to compile a very simple program that uses short int, xgcc
segfaults with a stack overflow because it recursively calls
#32 0x0806dd6d in convert (type=0xb7c7b288, expr=0xb7c88090) at
../../gcc/c-convert.c:95
#33 0x08160626 in convert_to_integer (type=0xb7c7b288,
expr=0xb7c88090) at ../../gcc/convert.c:442

I presume it tries to convert a small-precision mode into something
bigger, but I cannot understand why.
This is the first time I have tried to port gcc, so I don't know whether my
assumptions are reasonable or not.
Could someone help me?
thanks
andrea.


Re: __builtin_cpow((0,0),(0,0))

2005-03-17 Thread Joe Buck

Ronny Peine <[EMAIL PROTECTED]> writes:
> | Well yes in the general case it's not applicable, but x^0 is 1 in the
> | complex case, too.

On Thu, Mar 17, 2005 at 01:08:58PM +0100, Gabriel Dos Reis wrote:
> Just repeating it does not make it a reality.

However, repeating it does annoy the readership of this list, and
arguing with it just seems to cause people on the other side to repeat
their arguments once again.  Can we please stop this discussion?


Re: short int and conversions

2005-03-17 Thread Joseph S. Myers
On Thu, 17 Mar 2005, Andrea wrote:

> I'm trying to port gcc 4.1 for an architecture that has the following
> memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.

Support for systems with bytes wider than 8 bits is somewhat bitrotten at 
present, as it seems little has been done on the c4x port lately and it is 
the only such port we currently have; various PRs indicate it simply 
doesn't work (won't build libgcc) at present.  I have however CC:ed the 
maintainer of the c4x port in case he should wish to improve the state of 
this port and the general support for such ports.

> It has support (16bit registers and operators) for 16bit signed
> arithmetic used mainly for addressing. There are also operators for 32
> bit integer and floating point support.
> I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).

short needs to have at least the precision of char in C.  (C99 made 
explicit various aspects of the ordering rules for type precision which 
C90 was insufficiently complete about.)

However, types narrower than char do work in the compiler - we have them 
for bit-fields.  As required by the C standard, types narrower than int 
are promoted to int in arithmetic.  Bit-field types don't have their own 
modes, but in principle you should be able to have a special type with its 
own mode narrower than char: however, you may need to implement 
optimizations which convert operations on promoted types to operations on 
narrow types for targets with such types.
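
The promotion rule mentioned above can be checked with a few lines of code. This is only a sketch: sum_of_shorts is a hypothetical helper, the check is written in C++ (where the integer promotions work the same way as in C), and the second comment assumes a host where int is wider than 16 bits.

```cpp
#include <type_traits>

// Both operands are promoted to int before the addition, so the expression
// has type int even though each operand is a short.
static_assert(std::is_same<decltype(short(1) + short(1)), int>::value,
              "short + short is computed in int");

// Because the addition happens in int, the sum does not wrap at 16 bits
// (assuming int is wider than 16 bits on the host).
inline int sum_of_shorts(short a, short b)
{
    return a + b;
}
```

A target with a genuinely narrow arithmetic mode therefore only benefits if the optimizers can prove that the narrow operation gives the same result as the promoted one.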

-- 
Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)


GCC3 to GCC4 performance regression. Bug?

2005-03-17 Thread Steve Ellcey

I have been looking at a significant performance regression in the hmmer
application between GCC 3.4 and GCC 4.0.  I have a small cutdown test
case (attached) that demonstrates the problem and which runs more than
10% slower on IA64 (HP-UX or Linux) when compiled with GCC 4.0 than when
compiled with GCC 3.4.  At first I thought this was just due to 'better'
alias analysis in the P7Viterbi routine and that it was the right thing
to do even if it was slower.  It looked like GCC 3.4 does not believe
that hmm->tsc could alias mmx but GCC 4.0 thinks they could and thus GCC
4.0 does more loads inside the inner loop of P7Viterbi.  But then I
noticed something weird, if I remove the field M (which is unused in my
example) from the plan_s structure.  GCC 4.0 runs as fast as GCC 3.4.  I
don't understand why this would affect things.

Any optimization experts care to take a look at this test case and help
me understand what is going on and if this change from 3.4 to 4.0 is
intentional or not?

Steve Ellcey
[EMAIL PROTECTED]


--- Test Case ---

#define L_CONST 500

void *malloc(long size);

struct plan7_s {
  int M;
  int **tsc;   /* transition scores [0.6][1.M-1]*/
};

struct dpmatrix_s {
  int **mmx;
};
struct dpmatrix_s *mx;



void
AllocPlan7Body(struct plan7_s *hmm, int M) 
{
  int i;

  hmm->tsc= malloc (7 * sizeof(int *));
  hmm->tsc[0] = malloc ((M+16) * sizeof(int));
  mx->mmx = (int **) malloc(sizeof(int *) * (L_CONST+1));
  for (i = 0; i <= L_CONST; i++) {
    mx->mmx[i] = malloc (M+2+16);
  }
  return;
}  

void
P7Viterbi(int L, int M, struct plan7_s *hmm, int **mmx)
{
  int   i,k;
  
  for (i = 1; i <= L; i++) {
    for (k = 1; k <= M; k++) {
      mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
    }
  }
}

main ()
{
  struct plan7_s *hmm;
  char dsq[L_CONST];
  int i;

  hmm = (struct plan7_s *) malloc (sizeof (struct plan7_s));
  mx = (struct dpmatrix_s *) malloc (sizeof (struct dpmatrix_s));
  AllocPlan7Body(hmm, 10);
  for (i = 0; i < 60; i++) {
    P7Viterbi(500, 10, hmm, mx->mmx);
  }
}


help with mudflap testsuite result analysis

2005-03-17 Thread Mike Stump
So, I've been working on mudflap for darwin8, and these are the results I
get...  I know what you're thinking, it's impossible to get it working
because it doesn't have --wrap and friends..  well, I pulled some magic
pixie dust out and sprinkled it around and it's starting to work...

The question is, how decent are the results and can you spot any systematic
wrongs that appear and/or can you identify any non-portableness to darwin of
mudflap?  I started from 89 passes... :-)

I fixed most all the obvious issues that appeared due to darwin from looking
at the build result and looking at the libmudflap.log file.  If someone
would like to help track down the issues, I'd be interested in hearing from
you.

Thanks.


libmudflap.log.bz
Description: Binary data
 

For those who want to automatically generate predicates.md...

2005-03-17 Thread Kazu Hirata
Hi,

I created a set of scripts that generates predicates.md based on
PREDICATE_CODES in tm.h.  The generated file looks like this:

;; Predicate definitions for FIXME FIXME.
;; Copyright (C) 2005 Free Software Foundation, Inc.
;;
;; This file is part of GCC.
;;
;; :
;; : Usual copyright notice
;; :

;; Return true if OP is a valid source operand for an integer move instruction.

(define_predicate "general_operand_src"
  (match_code "const_int,const_double,const,symbol_ref,label_ref,subreg,reg,mem")
{
  if (GET_MODE (op) == mode
      && GET_CODE (op) == MEM
      && GET_CODE (XEXP (op, 0)) == POST_INC)
    return 1;
  return general_operand (op, mode);
})

  :
  : More predicates follow.
  :

1. A copyright notice is inserted automatically, except for the port name.

2. A comment for each function is taken from tm.c.

3. The name of a predicate along with codes it accepts are
   automatically taken from PREDICATE_CODES.

4. The C code for a predicate is automatically taken from tm.c.

My scripts only generate predicates.md.  They do not remove
PREDICATE_CODES from tm.h, predicates from tm.c, or prototypes from
tm-protos.h.  All these are left for your code cleanup pleasure. :-)

Another thing that my scripts won't do is to convert a C-style
predicate to a LISP-style predicate.  My scripts are only meant to
alleviate the mechanical part of the conversion.

Anyway, untar the attachment and run

  predicatecodes.sh h8300

under config/h8300 to generate predicates.md.  Of course, you can
replace h8300 with any port with PREDICATE_CODES.  My scripts are not
robust, so don't blame me if they eat your files.

I might actually start posting patches to convert to predicates.md.

Kazu Hirata


conv_predicate_codes.tar.gz
Description: Binary data


Re: Newlib _ctype_ alias kludge now invalid due to PR middle-end/15700 fix.

2005-03-17 Thread Jeff Johnston
Giovanni Bajo wrote:
> Hans-Peter Nilsson <[EMAIL PROTECTED]> wrote:
>
>> So, the previously-questionable newlib alias-to-offset-in-table
>> kludge is finally judged invalid.  This is a heads-up for newlib
>> users.  IMHO it's not a GCC bug, though there's surely going to
>> be some commotion.  Maybe a NEWS item is called for, I dunno.
>
> It will be in NEWS, since RTH already updated
> http://gcc.gnu.org/gcc-4.0/changes.html. I hope newlib will be promptly fixed.
>
> Giovanni Bajo

I have just checked in a patch to newlib that changes the ctype macros to use
__ctype_ptr instead of _ctype_.  In addition, a configuration check is made to
see whether the array aliasing trick can be used or not.

The code allows for backward compatibility, except in the case where the old
code is using negative offsets and the current version of newlib is built
with a compiler that does not support the array aliasing trick.

Corinna, if this causes any Cygwin issues, please let me know.

-- Jeff J.


false spam positive from gcc-patches

2005-03-17 Thread Thomas Koenig
Hi,

any reason why the message

http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html

was rejected as spam from gcc-patches, yet accepted on the fortran
list?


Re: GCC3 to GCC4 performance regression. Bug?

2005-03-17 Thread Stefan Strasser
Steve Ellcey wrote:
> --- Test Case ---

I think this is the same bug (which was not considered one back then) that
Benjamin Redelings described in the thread "C++ math optimization
problem...".

There are again unnecessary memory accesses, as if the memory were
volatile, which could be moved out of the inner loop.
gcc 3.4 does move them in this case (there are other cases where both fail;
see the math optimization thread).
And once again, changes which shouldn't have any effect enormously affect
the inner loop.
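
To show what "moved out of the inner loop" means here, the following sketch places the posted loop next to a hand-hoisted variant. The function names are made up for this illustration, and the hoisted version is only equivalent under the assumption that no store to mmx[i][k] overwrites hmm->tsc, hmm->tsc[0] or the mmx row pointers, which is precisely what the compiler would have to prove by itself.

```cpp
struct plan7_s {
    int M;
    int **tsc;
};

// As posted: hmm->tsc[0] is re-read on every iteration unless the compiler
// can prove that the stores to mmx[i][k] cannot modify it.
inline void viterbi_as_posted(int L, int M, plan7_s *hmm, int **mmx)
{
    for (int i = 1; i <= L; i++)
        for (int k = 1; k <= M; k++)
            mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
}

// Hand-hoisted: the loop-invariant loads are done once per row or once per
// call. Assumes the stores do not alias the pointers that were loaded.
inline void viterbi_hoisted(int L, int M, plan7_s *hmm, int **mmx)
{
    const int *tsc0 = hmm->tsc[0];
    for (int i = 1; i <= L; i++) {
        int *cur = mmx[i];
        const int *prev = mmx[i-1];
        for (int k = 1; k <= M; k++)
            cur[k] = prev[k-1] + tsc0[k-1];
    }
}
```

Comparing the assembly of the two functions under both compiler versions is one way to see which loads are left inside the inner loop.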


> #define L_CONST 500
> void *malloc(long size);
> struct plan7_s {
>   int M;
>   int **tsc;   /* transition scores [0.6][1.M-1]*/
> };
> struct dpmatrix_s {
>   int **mmx;
> };
> struct dpmatrix_s *mx;
>
> void
> AllocPlan7Body(struct plan7_s *hmm, int M)
> {
>   int i;
>
>   hmm->tsc= malloc (7 * sizeof(int *));
>   hmm->tsc[0] = malloc ((M+16) * sizeof(int));
>   mx->mmx = (int **) malloc(sizeof(int *) * (L_CONST+1));
>   for (i = 0; i <= L_CONST; i++) {
>     mx->mmx[i] = malloc (M+2+16);
>   }
>   return;
> }
>
> void
> P7Viterbi(int L, int M, struct plan7_s *hmm, int **mmx)
> {
>   int   i,k;
>
>   for (i = 1; i <= L; i++) {
>     for (k = 1; k <= M; k++) {
>       mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
>     }
>   }
> }
>
> main ()
> {
>   struct plan7_s *hmm;
>   char dsq[L_CONST];
>   int i;
>   hmm = (struct plan7_s *) malloc (sizeof (struct plan7_s));
>   mx = (struct dpmatrix_s *) malloc (sizeof (struct dpmatrix_s));
>   AllocPlan7Body(hmm, 10);
>   for (i = 0; i < 60; i++) {
>     P7Viterbi(500, 10, hmm, mx->mmx);
>   }
> }


--
Stefan Strasser


Re: libgcc-std.ver question

2005-03-17 Thread Mike Stump
On Mar 17, 2005, at 4:27 AM, Richard Henderson wrote:
I suppose it would be ok, but it would only be relevant for
embedded targets where "int" < SImode.  Otherwise we use the
plain "ffs" symbol in libc.
Ah, ok, that falls into the don't care bin for me...  For them, they  
probably don't use shared libraries, preferring static versions...

Thanks.


re: problems compiling gcc-3.3.1

2005-03-17 Thread Daniel Kegel
Amit Thakar wrote:
Following is the error I am getting while compiling gcc-3.3.1. I am using 
the headers of my system. How do I get rid of this?
In file included from tconfig.h:23,
 from ../../../gcc-3.3.1/gcc/libgcc2.c:36:
../../../gcc-3.3.1/gcc/config/i386/linux.h:232:20: signal.h: No such file or 
directory
../../../gcc-3.3.1/gcc/config/i386/linux.h:233:26: sys/ucontext.h: No such file 
or directory
make[2]: *** [libgcc/./_muldi3.o] Error 1
make[2]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make[1]: *** [libgcc.a] Error 2
make[1]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make: *** [all-gcc] Error 2
A little google searching turns up several hits, e.g.
http://www.embeddedtux.org/pipermail/etux/2004-December/000925.html
which say "use glibc headers".  Is this a cross-compiler,
or a native compiler?  What's the target OS?
Perhaps you should ask these questions on the crossgcc mailing list,
which is a more comfortable place to talk about problems
building old versions of gcc.
- Dan



Re: Merging calls to `abort'

2005-03-17 Thread Richard Stallman
When they see abort: core dumped, they just curse Emacs for losing their
work and switch to vi.

I am dubious of that speculation, because Emacs is very good at not
losing your work.

  It's true
that they don't complain about it on the Emacs developer list, where you
participate, because end-user complaints usually go to the GNU/Linux
distributions first.

They have not passed these complaints on to me, at least not in recent
years.

Anyway, this is a separate issue from the question of what GCC should
do.  GCC should treat multiple abort calls in whatever way is most
useful for programs that have multiple abort calls.


coverage mismatch

2005-03-17 Thread Rajkishore Barik
Hi,
I have been trying to use "-fprofile-generate" and "-fprofile-use" for 
some small bitwise C benchmarks (developed at MIT). I have a checkout of 
an October 2004 build of GCC 4.0. It gives me a "coverage mismatch" error 
for "arcs", saying the number of counters is "6" instead of "5". How do I 
go about fixing these problems? In fact, 8 out of 15 of these benchmarks 
give me the same problem.

Most of these benchmarks have only one module "main.c".
I compile the following way 
"gcc -O2 -fprofile-generate main.c"
"gcc -O2 -fprofile-use main.c" -- here it throws error.

Thanks for your help,
regards,
Raj


Re: Questions about trampolines

2005-03-17 Thread Robert Dewar
Joern RENNECKE wrote:
You need to be able to set the value of a parameter over a widely
varying range, what makes you think you can pick two values that
will cover all cases, or 4 or 6 for that matter.


Re: coverage mismatch

2005-03-17 Thread Mike Stump
On Mar 17, 2005, at 3:17 PM, Rajkishore Barik wrote:
I have been trying to use "-fprofile-generate" and "-fprofile-use" for
some small
bitwise C benchmarks (developed at MIT). I have a check-out of October
2004 GCC build of 4.0
version.
Try a checkout from today and let us know if the problem remains 
unfixed.  If it does, please file a PR on our web site, thanks.




Re: short int and conversions

2005-03-17 Thread Andrea
Thank you for your explanations.
Looking in detail at what happens in my case (I would like to have
modes that have fewer bits of precision than BITS_PER_UNIT), I cannot
tell whether there is a bug at convert.c:440 or whether it is a feature that
prevents me from using a FRACTIONAL_INT as a small precision ( wrote:
> On Thu, 17 Mar 2005, Andrea wrote:
> 
> > I'm trying to port gcc 4.1 for an architecture that has the following
> > memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.
> 
> Support for systems with bytes wider than 8 bits is somewhat bitrotten at
> present, as it seems little has been done on the c4x port lately and it is
> the only such port we currently have; various PRs indicate it simply
> doesn't work (won't build libgcc) at present.  I have however CC:ed the
> maintainer of the c4x port in case he should wish to improve the state of
> this port and the general support for such ports.
> 
> > It has support (16bit registers and operators) for 16bit signed
> > arithmetic used mainly for addressing. There are also operators for 32
> > bit integer and floating point support.
> > I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).
> 
> short needs to have at least the precision of char in C.  (C99 made
> explicit various aspects of the ordering rules for type precision which
> C90 was insufficiently complete about.)
> 
> However, types narrower than char do work in the compiler - we have them
> for bit-fields.  As required by the C standard, types narrower than int
> are promoted to int in arithmetic.  Bit-field types don't have their own
> modes, but in principle you should be able to have a special type with its
> own mode narrower than char: however, you may need to implement
> optimizations which convert operations on promoted types to operations on
> narrow types for targets with such types.
> 
> --
> Joseph S. Myers   http://www.srcf.ucam.org/~jsm28/gcc/
> [EMAIL PROTECTED] (personal mail)
> [EMAIL PROTECTED] (CodeSourcery mail)
> [EMAIL PROTECTED] (Bugzilla assignments and CCs)
>
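
To illustrate the promotion rule quoted above, here is a small sketch in
plain C. It assumes an ordinary hosted compiler, not the 32-bit-unit target
under discussion, and the struct and function names are invented for the
example.

```c
/* A type narrower than char exists in C only as a bit-field. */
struct narrow {
  unsigned int flag3 : 3;   /* 3-bit field, holds 0..7 */
};

/* short + short is evaluated in int: returns 1 when the result
   expression has the size of int rather than the size of short. */
int
short_sum_has_int_size(void)
{
  short a = 1, b = 2;
  return sizeof(a + b) == sizeof(int);
}

/* The 3-bit field is promoted to int before the addition, so the
   sum does not wrap inside the expression. */
int
promoted_sum(void)
{
  struct narrow n = { 7 };
  return n.flag3 + 1;
}

/* Storing the int result back into the 3-bit field reduces it
   modulo 8, which is where the narrow precision shows up again. */
unsigned int
wrapped_store(void)
{
  struct narrow n = { 7 };
  n.flag3 = n.flag3 + 1;
  return n.flag3;
}
```

The point for a port is that the narrow operation only becomes visible
at the store; an optimization that wants to use a narrow hardware add has
to prove that the promoted and the narrow computation agree at that point.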


Re: RFC: Changes in the representation of call clobbers

2005-03-17 Thread Andrew MacLeod

What if we try a variation on this?  I'm not even sure how I feel about
it since it's even wonkier than what you suggest.

first, create a unique GV for each type, and implement a gatherer
definition.  Instead of individual VMAYDEFS for 3 variables, we have a
gatherer which assigns them all to one global var.  something like:


> # GV_14 = V_KILL_ALL 
> bar ()

then instead of using what would have been the new version of x, y or z,
we use GV_14 for each of x, y, or z until it is redefined via another
VMAYDEF. 

I know that's not very clear, so let me try to explain it more
graphically with the virtual operands on the RHS of this listing:


foo()
{   maps to:
# X_2 = V_MUST_DEF # X_2 = V_MUST_DEF 
X = 3

# Y_4 = V_MUST_DEF# Y_4 = V_MUST_DEF 
Y = 1

# Z_6 = V_MUST_DEF# Z_6 = V_MUST_DEF 
# VUSE 
Z = X + 1

# X_3 = V_MAY_DEF  # GV_13 = V_KILL_ALL 
# Y_5 = V_MAY_DEF 
# Z_7 = V_MAY_DEF 
bar ()

# VUSE # VUSE 
# Y_8 = V_MUST_DEF # Y_8 = V_MUST_DEF 
Y = X + 2

# VUSE # VUSE 
# VUSE # VUSE 
return Y + Z;
}


In effect, what we are doing is saying that at the call site every
variable in the alias set for GV_13 has been MAYDEF'd.  This means we
don't know its value until it's physically defined again, as Y is. Until
then, we simply use the current GV variable instead of the individual
variables.  In into-ssa I guess this means the "current-def" would be
set to the alias variable at these points.
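
For readers who want the source-level picture behind the listing, here is a
rough C sketch of the same function. The body of bar() is purely
hypothetical (the message never shows it); it is chosen only to show why
every global has to be treated as possibly redefined at the call.

```c
int X, Y, Z;      /* globals, so bar() can reach all of them */

/* Hypothetical callee: it happens to overwrite X, but the caller
   cannot know that without looking inside. */
void
bar(void)
{
  X = 10;
}

int
foo(void)
{
  X = 3;
  Y = 1;
  Z = X + 1;      /* X is known to be 3 here, so Z is 4 */

  bar();          /* may define X, Y and Z: values unknown afterwards */

  Y = X + 2;      /* X must be re-read; Y is known again after this */
  return Y + Z;   /* Z must be re-read as well */
}
```

With this particular bar(), foo() returns 16. Propagating X = 3 across the
call would give 9 instead, which is the mistake the may-def (or the
proposed V_KILL_ALL) on the call is there to prevent.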

As I said, this looks pretty wonky, but I believe it accurately
represents reality.  Other than the collector V_KILL_ALL, I don't think
anything would change... would it? 

It looks a bit tricky to sort out bugs, especially in a large program
with lots of variables. We might have to modify the lister to add the
variable names to the RHS when there is a reference to the GV to help,
i.e.:

# VUSE # VUSE 
# Y_8 = V_MUST_DEF # Y_8 = V_MUST_DEF 
Y = X + 2

# VUSE # VUSE 
# VUSE # VUSE 
return Y + Z;

It will be even more cryptic than this in reality. The VUSE of GV_13 is
redundant, as it is mentioned in the RHS of the V_MUST_DEF, leaving us
with:

# Y_8 = V_MUST_DEF 
Y = X + 2

which looks even wonkier.  It's precise and efficient, however.

And will it cause an issue to have GV_13 in the RHS of a V_MUST_DEF and
then be used later in a VUSE as in the return stmt? In *theory* it
shouldn't, but I don't know if anyone has written code which assumes the
RHS of a VMAYDEF is dead.

I think all the PHI node issues work themselves out too, but you spend
way more time thinking about PHI nodes than I do. Maybe you see an
issue.

Alternatively, if this is either too out there, or I'm otherwise off my
rocker for some reason I've missed, we can visit a solution to the only
real problem with the proposal:

> 
> - If there are no uses of X, Y and Z after the call to bar, DCE will 
> think that those stores are dead.  We would have to hack DCE to somehow 
> see the call to bar() as a user for those stores.


This is really the only issue I see, and I'm not sure of a decent way to
deal with it. I'll think about it.
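
A small C sketch of the DCE concern quoted above, under the assumption that
the callee reads the globals. The variable seen and the helper last_seen()
are hypothetical, added only so the effect of the stores can be observed
from outside.

```c
int X, Y, Z;
static int seen;   /* hypothetical sink that records what bar() read */

/* The callee is the only reader of the three stores made in foo(). */
void
bar(void)
{
  seen = X + Y + Z;
}

int
foo(void)
{
  X = 3;           /* no use of X, Y or Z follows the call in foo() ... */
  Y = 1;
  Z = X + 1;
  bar();           /* ... but the call itself depends on all three */
  return 0;
}

int
last_seen(void)
{
  return seen;
}
```

If the stores were deleted as dead because nothing in foo() reads them after
the call, bar() would see stale values, so the call has to count as a use of
everything it may read.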

Andrew





Re: short int and conversions

2005-03-17 Thread Paul Schlie
> I'm trying to port gcc 4.1 for an architecture that has the following
> memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.
> It has support (16bit registers and operators) for 16bit signed
> arithmetic used mainly for addressing. There are also operators for 32
> bit integer and floating point support.
> I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).
>  
> I reserved QImode and QFmode for 32 bit integer/floating point operations.
> And I defined a new fractional int mode FRACTIONAL_INT_MODE (PQ, 16, 1) for
> pointers and short int operations.
> When I try to compile a very simple program with short int xgcc
> segments for stack overflow because it calls recursively
> #32 0x0806dd6d in convert (type=0xb7c7b288, expr=0xb7c88090) at
> ../../gcc/c-convert.c:95
> #33 0x08160626 in convert_to_integer (type=0xb7c7b288,
> expr=0xb7c88090) at ../../gcc/convert.c:442
> 
> I presume it tries to convert a small precision mode in something
> bigger but I cannot understand why.
> This is the first time I try to port gcc, so I don't know if my
> assumptions are reasonable or not.

With the caveat that I've never boot-strapped a port myself:

- "unit" tends to be a synonym for char, as does QI mode for < 16-bit chars.
- "word" tends to be a synonym for int/void*, typically represented as HI
  (16-bit) or SI (32-bit) mode operands, and typically the natural size of
  a memory access, although not necessarily.
- correspondingly, 32-bit float operands tend to be represented as SF mode
  operands.

(Q = quarter, H = half, S = single, D = Double) (I = integer, F = float)

So in rough summary, the following may be reasonable choices (given your
machine's apparent support of 16-bit and possibly lesser sized operations):

  bits  mode  ~type
  ----  ----  -----
     8  QI    char/short (which can be emulated if necessary)
    16  HI    char/short/int/void*
    16  HF    (target-specific-float)
    32  SI    int/void*/long
    32  SF    float
    64  DI    long/void*/long-long/
    64  DF    double
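
As a quick way to see these mode names from the user side, GCC (and
compatible compilers) accept a mode attribute on typedefs. The sketch below
assumes a conventional host with 8-bit units, so the sizes will not match
the 32-bit-unit machine being ported; the typedef names and the helper
bits_of() are invented for the example.

```c
#include <limits.h>

/* GNU C extension: pick a type by machine mode name. */
typedef int   qi_t __attribute__((mode(QI)));   /* quarter integer */
typedef int   hi_t __attribute__((mode(HI)));   /* half integer    */
typedef int   si_t __attribute__((mode(SI)));   /* single integer  */
typedef int   di_t __attribute__((mode(DI)));   /* double integer  */
typedef float sf_t __attribute__((mode(SF)));   /* single float    */
typedef float df_t __attribute__((mode(DF)));   /* double float    */

/* Convert a sizeof result (in units, i.e. chars) to bits. */
static unsigned int
bits_of(unsigned long size_in_units)
{
  return (unsigned int) (size_in_units * CHAR_BIT);
}
```

On such a host, bits_of(sizeof(qi_t)) is 8 and bits_of(sizeof(si_t)) is 32,
matching the bits column above.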

Also as a generalization, it's likely wise not to try modeling a port after
the c4x, as its implementation seems at best very odd. (Alternatively, a
better model may be one of the supported 16/32 bit targets, depending on
your machine's architecture.)

best of luck.




Re: false spam positive from gcc-patches

2005-03-17 Thread James E Wilson
Thomas Koenig wrote:
any reason why the message
http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html
was rejected as spam from gcc-patches, yet accepted on the fortran
list?
See
http://www.sourceware.org/lists.html#rbl-sucks
which has a discussion of how the spam filters work, and how to get 
around them.

Possible reasons
1) The address you posted from is subscribed to one mailing list but not 
the other (perhaps an alternate address is subscribed on the other list).
2) The address you posted from is on the allow list for one mailing list 
but not allow list of the other.

By the way, I think it is a word of all caps in the subject line which 
triggers the spam filter, and "PR" is such a word, which is unfortunate. 
 This is just a guess though.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: false spam positive from gcc-patches

2005-03-17 Thread Steve Kargl
On Thu, Mar 17, 2005 at 09:20:45PM -0800, James E Wilson wrote:
> Thomas Koenig wrote:
> >any reason why the message
> >http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html
> >was rejected as spam from gcc-patches, yet accepted on the fortran
> >list?
> 
> By the way, I think it is a word of all caps in the subject line which 
> triggers the spam filter, and "PR" is such a word, which is unfortunate. 
>  This is just a guess though.

Yeah, and this is really stupid for filtering Fortran related
emails.  Traditionally, all upper case words are used to
distinguish between text and Fortran code/syntax/keywords.

-- 
Steve


Re: supporting --with-cpu=default32 option for x86_64

2005-03-17 Thread James E Wilson
Nitin Gupta wrote:
following lines were added in config.gcc in order to recognise
--with-cpu=default32. But I don't understand how it was actually made
to default to 32-bit.
The trick is to look at the default64 code, and note what default32 
doesn't do that default64 does do.

The code you quoted is only clearing with_cpu when default32/default64 
are given, because these are valid options to the actual gcc code. 
These just mean "use whatever the default with_cpu value is".
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: Suggestion for a fix to Bug middle-end/20177

2005-03-17 Thread James E Wilson
Mostafa Hagog wrote:
The question is: what is the correct fix for the longer term ?
is it enough to mark the SMSed block dirty? or do we need
also to keep the REG_DEAD correct in each basic-block
separately?
You either have to keep all REG_NOTES up to date, or call code that will 
recompute them.  You can recompute REG_DEAD/REG_UNUSED notes by calling 
back into flow.  This is presumably what happens when you mark the block 
dirty, so that would be a sufficient solution for REG_DEAD/REG_UNUSED.

See for instance code in combine.c that updates REG_NOTES after 
combination.  This is in distribute_notes.

By the way, REG_UNUSED means that this instruction sets a register, and 
this value dies here.  There are no uses of this register before the 
next set or the end of the function.  Thus it holds register life info 
that is complementary to REG_DEAD.
--
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: RTL?

2005-03-17 Thread James E Wilson
하태준 wrote:
> where can I get information about code, log_links, reg_notes?

See the internals documentation, in the file gcc/doc/rtl.texi, or on the
web at
http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns
See also the sources for more info, as the docs may not be fully up to
date, in particular the file rtl.h.
-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com


Re: reload question

2005-03-17 Thread Miles Bader
Bernd Schmidt <[EMAIL PROTECTED]> writes:
> Reload insns aren't themselves reloaded.  You should look at the 
> SECONDARY_*_RELOAD_CLASS; they'll probably let you do what you want.

Ah, thank you!

I've defined SECONDARY_*_RELOAD_CLASS (and PREFERRED_* to try to help
things along), and am now running into more understandable reload
problems:  "unable to find a register to spill in class"  :-/

The problem, as I understand, is that reload doesn't deal with conflicts
between secondary and primary reloads -- which are common with my arch
because it's an accumulator architecture.

For instance, slightly modifying my previous example:

   Say I've got a mov instruction that only works via an accumulator A,
   and a two-operand add instruction.  "r" regclass includes regs A,X,Y,
   and "a" regclass only includes reg A.

   mov has constraints like: 0 = "g,a"   1 = "a,gi"
   and add3 has constraints: 0 = "a" 1 = "0"2 = "ri" (say)

So if before reload you've got an instruction like:

   add temp, [sp + 4], [sp + 6]

and v2 and v3 are in memory, it will have to generate something like:

   mov A, [sp + 4]; primary reload 1 in X, with secondary reload 0 A
   mov X, A   ;   ""
   mov A, [sp + 6]; primary reload 2 in A, with no secondary reload
   add A, X
   mov temp, A

There's really only _one_ register that can be used for many reloads, A.
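
To make the register pressure in that sequence explicit, here is a toy C
model of the five instructions. It is only an illustration: the struct
machine, its stack array and the function name are invented, and nothing
here corresponds to actual reload data structures.

```c
/* Toy machine state: accumulator A, one other register X, and a few
   stack slots, where stack[n] stands for [sp + n]. */
struct machine {
  int A;
  int X;
  int stack[8];
};

/* Executes the sequence from the message and returns temp. */
int
add_via_accumulator(struct machine *m)
{
  int temp;

  m->A = m->stack[4];    /* mov A, [sp + 4]   secondary reload, uses A  */
  m->X = m->A;           /* mov X, A          primary reload 1 ends in X */
  m->A = m->stack[6];    /* mov A, [sp + 6]   primary reload 2 lives in A */
  m->A = m->A + m->X;    /* add A, X */
  temp = m->A;           /* mov temp, A */

  return temp;
}
```

A is written three times here, once as a short-lived secondary reload and
once as a primary reload. The sequence only works because the secondary use
is finished before the primary reload of the other operand is loaded, which
is the ordering constraint discussed below.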

The problem is that reload doesn't seem to be able to produce this kind of
output:  if it chooses A as a primary reload (common, as most insns use A as
a first operand), reload will think it conflicts with secondary reloads that
also use A (when it really needn't, as the secondary reloads only use A
"temporarily").  This is particularly bad with RELOAD_OTHER reloads, as

I kludged around this to some degree by changing `reloads_conflict'
(reload1.c) to always think secondary reloads _don't_ conflict [see patch1].

As that will fail in the case where a primary reload is loaded before a
secondary reload using the same register, I _also_ modified
`emit_reload_insns' to sort the order in which operand reloads are output so
that an operand whose secondary reload interferes with another operand's
primary reload is always loaded first.

However I think this is not guaranteed to always work -- certainly merely
disregarding conflicts with secondary reloads will fail for architectures
which are slightly less anemic, say with _two_ accumulators... :_)

Does anybody have a hint for a way to solve this problem?
Reload is very confusing...

Thanks,

-Miles


= patch1 =

--- gcc-3.4.3/gcc/reload1.c 2004-05-02 21:37:17.0 +0900
+++ gcc-3.4.3-supk0-20050317/gcc/reload1.c  2005-03-17 19:49:35.935534000 +0900
@@ -1680,7 +1688,7 @@   find_reg (struct insn_chain *chain, int order)
 {
   int other = reload_order[k];
 
-  if (rld[other].regno >= 0 && reloads_conflict (other, rnum))
+  if (rld[other].regno >= 0 && reloads_conflict (other, rnum, 0))
for (j = 0; j < rld[other].nregs; j++)
  SET_HARD_REG_BIT (used_by_other_reload, rld[other].regno + j);
 }
@@ -4601,18 +4609,25 @@
 }
 
 /* Return 1 if the reloads denoted by R1 and R2 cannot share a register.
-   Return 0 otherwise.
+   Return 0 otherwise.  If SECONDARIES_CAN_CONFLICT is zero, secondary
+   reloads are considered never to conflict; otherwise they are treated
+   normally.
 
This function uses the same algorithm as reload_reg_free_p above.  */
 
 int
-reloads_conflict (int r1, int r2)
+reloads_conflict (int r1, int r2, int secondaries_can_conflict)
 {
   enum reload_type r1_type = rld[r1].when_needed;
   enum reload_type r2_type = rld[r2].when_needed;
   int r1_opnum = rld[r1].opnum;
   int r2_opnum = rld[r2].opnum;
 
+  /* Secondary reloads need not conflict with anything.  */
+  if (!secondaries_can_conflict
+      && (rld[r1].secondary_p || rld[r2].secondary_p))
+    return 0;
+
   /* RELOAD_OTHER conflicts with everything.  */
   if (r2_type == RELOAD_OTHER)
 return 1;



= patch2 =

--- gcc-3.4.3/gcc/reload1.c 2004-05-02 21:37:17.0 +0900
+++ gcc-3.4.3-supk0-20050317/gcc/reload1.c  2005-03-17 19:49:35.935534000 +0900
@@ -6951,6 +6966,51 @@  emit_reload_insns (struct insn_chain *chain)
   do_output_reload (chain, rld + j, j);
 }
 
+#ifdef SECONDARY_INPUT_RELOAD_CLASS
+  for (j = 0; j < reload_n_operands; j++)
+opnum_emit_pos[j] = emit_pos_opnum[j] = j;
+
+  /*  Order the operands to avoid conflicts between the primary reload of
+  one operand and a secondary reload in another operand (which we
+  ignored before).  XXX this only works for input reloads!!  */
+  for (j = 0; j < n_reloads; j++)
+if (rld[j].secondary_p)
+  /* This is a secon