Re: __builtin_cpow((0,0),(0,0))
Dave Korn wrote:
> Original Message
>> From: Ronny Peine
>> Sent: 16 March 2005 17:34
>
>> See for example:
>> http://mathworld.wolfram.com/ExponentLaws.html
>
> Ok, I did.
>
>> Even though, gcc returns 1 for pow(0.0,0.0) in version 3.4.3 like
>> many other c-compiler do. The same behaviour would be expected from
>> cpow.
>
> No, you're wrong (that the same behaviour would be expected from cpow).
> See for example:
> http://mathworld.wolfram.com/ExponentLaws.html
> " Note that these rules apply in general only to real quantities, and can
> give manifestly wrong results if they are blindly applied to complex
> quantities. "

Well yes, in the general case it's not applicable, but x^0 is 1 in the complex case, too. And if 0^0 is converted from the real to the complex domain (it's even a part of the complex domain), then the same behaviour would be expected; otherwise the definition wouldn't be a very good one.

Has anyone found a hint in the IEEE 754 standard as to whether it says anything about this? I don't have a copy here right now; well, it's not free of charge. Otherwise this discussion won't end.

> cheers,
> DaveK

cu, Ronny
Re: Question about how to compile multiple files with g++
On Mar 16, 2005, at 11:05 PM, Yen wrote: I have a problem to compile multiple files together, so please everybody give me a help, thanks! Wrong list, try gcc-help instead.
Compiler chokes on a simple template - why?
Hi,

Here is a snippet that does not compile with gcc 3.4.1 (on Mandrake 10.1).

---
template class A
{
public:
  template void test(T value) {}
};

template void test2(A& a, T val)
{
  a.test(val);
}

int main()
{
  A a;
  a.test(1); //works fine
}
---

$ g++ -o test test.cc
test.cc: In function `void test2(A&, T)':
test.cc:9: error: expected primary-expression before "int"
test.cc:9: error: expected `;' before "int"

The funny thing is that if I change the name of the "test2" function to "test", everything is OK. The compiler complains only if the functions have different names. Why does the name matter?

The code compiles if "test2" is not a template function. Furthermore, calling A::test directly from main rather than through the template function works fine.

I don't know if this is really a compiler thing, but it's hard to imagine the standard would impose such behavior.

Please cc your thoughts to me, I'm not a subscriber.

Thanks.

-Topi-
Re: Bootstrap failure in varasm.c at assemble_alias
Benjamin Redelings I <[EMAIL PROTECTED]> writes:

> Hi guys,
> Just wanted to note that I'm getting a bootstrap failure in varasm.c.
>
> gcc -c -g -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes
> -Wmissing-prototypes -fno-common -DHAVE_CONFIG_H -I. -I.
> -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include
> -I../../gcc/gcc/../libcpp/include ../../gcc/gcc/varasm.c -o varasm.o
> ../../gcc/gcc/varasm.c: In function `const_rtx_hash_1':
> ../../gcc/gcc/varasm.c:2854: warning: right shift count >= width of type
> ../../gcc/gcc/varasm.c: In function `assemble_alias':
> ../../gcc/gcc/varasm.c:4524: error: parse error before '<<' token

Remove the conflict markers.

Andreas.

--
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: Questions about trampolines
hi,

On Wed, Mar 16, 2005 at 02:48:56PM -0500, Robert Dewar wrote:
> Yes, but that avoids the difficulty, that's obvious so far.
>
> The problem is to know exactly when to pop the stack, and that is
> not trivial (longjmp, exceptions, non local gotos).

hmm.. what about doing it gc-like? Instead of a stack there simply is a 'pool' of trampolines from which trampolines are allocated, and a pointer to the trampoline is pushed on the stack. When the last trampoline from the pool is allocated, a 'garbage collector' runs and looks for pointers to trampolines between the stack pointer and the stack start address. Every trampoline which isn't possibly referenced is added to a free-list from which new trampolines are allocated. When no trampoline can be allocated, abort() is called (with or without crossjumping ;-).

This may be a dirty hack, and guessing a good size for the trampoline pool is still an issue, but it could be implemented easily and would work...

Instead of adding the trampoline pool to libgcc (as suggested earlier in this thread) I would suggest that gcc generates a trampoline pool in a linkonce section every time a source file is compiled which requires trampolines. That way there wouldn't be any trampoline pool in an executable which doesn't need one, and a compiler option such as -ftrampoline-pool-size=32 could be used to specify the size of the trampoline pool on the command line.

The only issue I see with that is that the trampoline pool will actually consist of two sections: one for the code and one for the data. AFAIR there is a bug with linkonce sections connected to data sections (triggered e.g. when a big switch statement in a function in a C++ template is compiled using a jump table; this may lead to code references in .data to dropped linkonce code sections). May this also become an issue here? Or is the bug fixed already?
yours, - clifford -- ocaml graphics.cma <( echo 'open Graphics;;open_graph " 640x480"let complex_mul(a,b)(c,d)=(a*.c-.b*.d,a*.d+.b*.c)let complex_add(a,b)(c ,d)=(a+.c,b+.d);;let rec mandel c n=if n>0 then let z=mandel c(n-1) in complex_add(complex_mul z z)c else (0.0,0.0);; for x=0 to 640 do for y=0 to 480 do let c=((float_of_int(x-450))/.200.0,(float_of_int (y-240))/.200.0) in let cabs2(a,b)=(a*.a)+.(b*.b)in if cabs2(mandel c 50)<4.0 then plot x y done done;;read_line()' ) M$ is not the answer. M$ is the question. No is the answer!
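The pool-plus-conservative-scan idea in the message above can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from GCC or libgcc: the names Slot and TrampolinePool are hypothetical, a slot holds only the data part of a trampoline, and the "stack" is modelled as a vector of words rather than the real processor stack.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the data part of one trampoline.
struct Slot {
    void (*target)();    // bare function address
    void *static_chain;  // pointer to the enclosing frame
};

// Hypothetical fixed-size pool with a free list, as described in the mail.
template <std::size_t N>
class TrampolinePool {
public:
    TrampolinePool() {
        for (std::size_t i = 0; i < N; ++i) free_.push_back(i);
    }

    // `stack` models the words between the stack pointer and the stack start.
    // Returns nullptr when nothing can be reclaimed (the mail suggests abort()).
    Slot *allocate(const std::vector<std::uintptr_t> &stack) {
        if (free_.empty()) collect(stack);
        if (free_.empty()) return nullptr;
        std::size_t i = free_.back();
        free_.pop_back();
        return &slots_[i];
    }

    std::size_t free_count() const { return free_.size(); }

private:
    // Conservative scan: any word that looks like a pointer into a slot
    // keeps that slot alive; every other slot goes back on the free list.
    // Only called when the free list is empty, so no slot is added twice.
    void collect(const std::vector<std::uintptr_t> &stack) {
        std::array<bool, N> live{};
        const auto lo = reinterpret_cast<std::uintptr_t>(slots_.data());
        const auto hi = lo + N * sizeof(Slot);
        for (std::uintptr_t word : stack)
            if (word >= lo && word < hi) live[(word - lo) / sizeof(Slot)] = true;
        for (std::size_t i = 0; i < N; ++i)
            if (!live[i]) free_.push_back(i);
    }

    std::array<Slot, N> slots_{};
    std::vector<std::size_t> free_;
};
```

The scan is conservative in the usual sense: an integer on the stack that happens to look like a slot address keeps that slot alive, which wastes a slot but never frees one that is still referenced from the scanned range.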
Re: Compiler chokes on a simple template - why?
Topi Maenpaa <[EMAIL PROTECTED]> wrote:

> ---
> template class A
> {
> public:
>   template void test(T value) {}
> };
>
> template void test2(A& a, T val)
> {
>   a.test(val);
> }
>
> int main()
> {
>   A a;
>   a.test(1); //works fine
> }
> ---

This is ill-formed. You need to write:

  a.template test(val);

because 'a' is a dependent name.

> The funny thing is that if I change the name of the "test2" function
> to "test", everything is OK. The compiler complains only if the
> functions have different names. Why does the name matter?

This is surely a bug. Would you please file a bug report about this?

> The code compiles if "test2" is not a template function. Furthermore,
> calling A::test directly from main rather than through the
> template function works fine.

This is correct, because if "test2" is not a template function name anymore, then 'a' is not a dependent name, and the 'template' keyword is not needed to disambiguate the parser.

Giovanni Bajo
Re: Compiler chokes on a simple template - why?
On Thu, Mar 17, 2005 at 10:33:54AM +0200, Topi Maenpaa wrote:

> Hi,
>
> Here is a snippet that does not compile with gcc 3.4.1 (on Mandrake 10.1).
>
> ---
> template class A
> {
> public:
>   template void test(T value) {}
> };
>
> template void test2(A& a, T val)
> {
>   a.test(val);

This needs to be:

  a.template test(val);

> }
>
> int main()
> {
>   A a;
>   a.test(1); //works fine
> }
> ---
>
> $ g++ -o test test.cc
> test.cc: In function `void test2(A&, T)':
> test.cc:9: error: expected primary-expression before "int"
> test.cc:9: error: expected `;' before "int"

Because test2 is a template it doesn't know what A is (in the general case), so you need to tell the compiler that a.test is a function template; otherwise it is parsed as a member variable, giving "a.test less-than int", which doesn't make sense.

> The funny thing is that if I change the name of the "test2" function to
> "test", everything is OK. The compiler complains only if the functions have
> different names. Why does the name matter?

That I'm not sure about ... I would have expected it to fail with the same error when the function is called "test", but I'd be wrong apparently.

> The code compiles if "test2" is not a template function.

Because in that case the compiler knows the full definition of A and knows that a.test refers to a function template, not a member variable (for instance).

> Furthermore, calling
> A::test directly from main rather than through the template function works
> fine.

Again, in that context the compiler knows that a.test is a function template.

> I don't know if this is really a compiler thing, but it's hard to imagine the
> standard would impose such behavior.

Yes, it's ugly. No, it's not a bug. It's required by the standard :-(

jon

--
"I find television very educating. Every time somebody turns on the set, I go into the other room and read a book."
 - Groucho Marx
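For readers following along, here is a minimal sketch of the rule explained in the two replies above. It assumes the template argument lists that the archive appears to have stripped from the posted snippet; the parameter names T and U, the int return value (added so the result can be checked), and the helper call_directly are illustrative assumptions, not taken from the original post.

```cpp
#include <cassert>

template <class T>
class A {
public:
    // U is not deducible from the argument, so callers must name it
    // explicitly, as in test<int>(...).
    template <class U>
    U test(T value) { return static_cast<U>(value); }
};

// Inside a template, the type of `a` depends on T. Without the `template`
// keyword the `<` after `a.test` would be parsed as a less-than operator.
template <class T>
int test2(A<T> &a, T val) {
    return a.template test<int>(val);
}

// With a concrete type the compiler already knows that A<int>::test is a
// member template, so the plain call syntax is accepted.
inline int call_directly() {
    A<int> a;
    return a.test<int>(1);
}
```

Writing `a.template test<int>(val)` is also accepted where it is not strictly required, so adding it is harmless when in doubt about whether the object expression is dependent.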
Re: __builtin_cpow((0,0),(0,0))
Ronny Peine <[EMAIL PROTECTED]> writes:

| Dave Korn wrote:
| > Original Message
| >
| >>From: Ronny Peine
| >>Sent: 16 March 2005 17:34
| >
| >>See for example:
| >>http://mathworld.wolfram.com/ExponentLaws.html
| >>
| > Ok, I did.
| >
| >> Even though, gcc returns 1 for pow(0.0,0.0) in version 3.4.3 like
| >> many other c-compiler do. The same behaviour would be expected from
| >> cpow.
| > No, you're wrong (that the same behaviour would be expected from
| > cpow).
| > See for example:
| > http://mathworld.wolfram.com/ExponentLaws.html
| > " Note that these rules apply in general only to real quantities,
| > and can
| > give manifestly wrong results if they are blindly applied to complex
| > quantities. "
| >
|
| Well yes in the general case it's not applicable, but x^0 is 1 in the
| complex case, too.

Just repeating it does not make it a reality.

| And if 0^0 is converted from the real to the
| complex domain (it's even a part of the complex domain) than the same
| behaviour would be expected, otherwise the definition wouldn't be very
| well.

The point is that real or complex exponentiation is not the same as integer exponentiation. The latter has less freedom than the former.

| Has anyone found a hint in the ieee754 standard if there is something
| about it in there? I haven't one here right now, well it's not
| prizeless. Otherwise these discussion won't end.

There are several standards, among which IEEE-754 and the ISO standard LIA (designed to correct the IEEE-754 shot). IEEE-754 does not concern itself with complex arithmetic (though C99 made some interesting and innovative extensions). I already quoted part 2 of LIA. Part 3 of LIA, concerning complex arithmetic, is being developed and is in its second stage. It is consistent with LIA-2.

-- Gaby
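To make the disagreement concrete, the sketch below contrasts the real and the complex case. It assumes an IEC 60559 (IEEE 754) floating-point environment without -ffast-math; the function names are hypothetical. For the real function, C99 Annex F specifies pow(x, ±0) = 1 for every x. For the complex case, the usual principal-value definition is

    z^w = exp(w * log z)

and at z = w = 0 the exponent is 0 * log(0) = 0 * (-infinity), which is NaN in IEEE arithmetic. What a given library actually returns for the complex call is deliberately not asserted here, since that is exactly what this thread is arguing about.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Real case: expected to be exactly 1.0 on Annex F conforming platforms.
inline double real_zero_pow_zero() { return std::pow(0.0, 0.0); }

// The exponent w * log(z) of the principal-value formula at z = w = 0,
// restricted to the real line for simplicity.
inline double exponent_term_at_zero() {
    const double log_zero = std::log(0.0);  // -infinity under IEC 60559
    return 0.0 * log_zero;                  // 0 * -infinity gives NaN
}

// Complex case: the result depends on the library implementation, so
// callers should inspect it rather than rely on a particular value.
inline std::complex<double> complex_zero_pow_zero() {
    const std::complex<double> zero(0.0, 0.0);
    return std::pow(zero, zero);
}
```

Printing the real and imaginary parts of complex_zero_pow_zero() on a few different toolchains is a quick way to see how much implementations differ on this point.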
Re: libgcc-std.ver question
On Wed, Mar 16, 2005 at 05:43:32PM -0800, Mike Stump wrote:
> I have a question about libgcc export for shared libraries... libgcc
> exports (via libgcc-std.ver):
>
> __ffsdi2
>
> but not:
>
> __ffssi2

I suppose it would be ok, but it would only be relevant for embedded targets where "int" < SImode. Otherwise we use the plain "ffs" symbol in libc.

r~
problems compiling gcc-3.3.1
hello,

Following is the error I'm getting while compiling gcc-3.3.1. I am using the headers of my system. How do I get rid of this?

In file included from tconfig.h:23,
                 from ../../../gcc-3.3.1/gcc/libgcc2.c:36:
../../../gcc-3.3.1/gcc/config/i386/linux.h:232:20: signal.h: No such file or directory
../../../gcc-3.3.1/gcc/config/i386/linux.h:233:26: sys/ucontext.h: No such file or directory
make[2]: *** [libgcc/./_muldi3.o] Error 1
make[2]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make[1]: *** [libgcc.a] Error 2
make[1]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
make: *** [all-gcc] Error 2

Regards
Amit
Re: Compiler chokes on a simple template - why?
On Thu, Mar 17, 2005 at 11:03:53AM +0100, Giovanni Bajo wrote:

> Topi Maenpaa <[EMAIL PROTECTED]> wrote:
>
> > The funny thing is that if I change the name of the "test2" function
> > to "test", everything is OK. The compiler complains only if the
> > functions have different names. Why does the name matter?
>
> This is surely a bug. Would you please file a bug report about this?

That's what I thought. Comeau seems to have the same bug, btw.

jon

--
"You can lead a horticulture but you can't make her think."
 - Dorothy Parker
Re: Questions about trampolines
Clifford Wolf wrote:
> hmm.. what's about doing it gc-like. Instead of a stack there simply is a
> 'pool' of trampolines from which trampolines are allocated and a pointer to
> the trampoline is pushed on the stack. When the last trampoline from the
> pool is allocated, a 'garbage collector' is running over it and looking for
> pointers to trampolines between the stack pointer and the stack start
> address. Every trampoline which isn't possibly referenced is added to a
> free-list from which new trampolines are allocated.

If you have only one processor stack (i.e. single-threaded execution), you can handle the trampolines as a stack too. You don't need to deallocate till you allocate again, and then you adjust the trampoline stack so none of its static chain pointers points to a deallocated frame, or to the current frame (since you are only about to set up the trampolines for the current frame then).

If you have multiple processor stacks, you have to register and later search them all in order to make the garbage-collection scheme work.

> Instead of adding the trampoline pool to libgcc (as suggested earlier in
> this thread) I would suggest that gcc generates a trampoline pool in a
> linkonce section every time a source file is compiled which requires
> trampolines. That way there wouldn't be any trampoline pool in an
> executable which doesn't need one

You don't need a linkonce section for this. The function that needs a trampoline calls allocation / deallocation functions, or if it inlines the code, it will reference the pool start addresses; either way, it will reference some symbols. By putting the .o file that provides these symbols along with the code and data parts of the trampoline pool into a static library (libgcc.a or otherwise) you make sure that the object is only linked in when needed.

> and a compiler option such as -ftrampoline-pool-size=32 could be used the
> specify the size of the trampoline pool on the command line.
This is messy; say you have two libraries that are compiled with -ftrampoline-pool-size=32; they will then share a trampoline pool of 32 entries. If you compile one with -ftrampoline-pool-size=16 instead, you will have them using different pools, or maybe even get some multiply defined symbols.

It is much saner to make this a link-time option. By selecting a specific library for the trampoline pool, you can adjust the size on a per-program (or per-DSO, if you don't export) basis, and you might even choose an alternate allocation strategy. I.e. you could have libgcc provide one with a size that works most of the time and uses destructors for portability and robustness, have a specialized lightweight one you can specifically use for single-threaded programs, and have a 64-bit Linux-specific one that ties into the threading code (or is part of a threads package) and mmaps trampoline code pages for every processor stack allocated, sufficiently large and at a fixed offset to the stack so that you can put the data part on the return stack in any suitably aligned position, and have a matching trampoline.

I.e. the bare function address and the static chain pointer are 8 bytes each, so that a trampoline data part is 16 bytes. You require them to be 16-byte aligned on any processor stack. The mmapped trampoline can be an absolute function call to some helper code that does the real work, using the return address to figure out which trampoline is executed. This call should fit into 16 bytes too, so in the trampoline page to be mmapped, every 16 bytes there is such an absolute call insn.

You can get a 1:1 correspondence between trampolines and processor stacks by allocating the stacks all in one specific memory area, and have an equally-sized area where trampolines are mapped. Thus, you can have differently-sized stacks, yet the trampoline code can add a constant offset to the return address to find the data part of the trampoline.
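The address arithmetic behind that last scheme can be sketched as below. Everything concrete here is an assumption made for illustration: the base addresses of the two areas, the length kCallLen of the call instruction, and the function names. Only the 16-byte slot size and the idea of a constant offset between the stack area and the trampoline area come from the message above.

```cpp
#include <cassert>
#include <cstdint>

using Addr = std::uint64_t;

constexpr Addr kSlotSize      = 16;              // function address + static chain, 8 bytes each
constexpr Addr kCallLen       = 12;              // hypothetical length of the absolute call insn (<= 16)
constexpr Addr kStackAreaBase = 0x100000000ULL;  // hypothetical area holding all processor stacks
constexpr Addr kTrampAreaBase = 0x200000000ULL;  // hypothetical equally-sized area of call stubs
constexpr Addr kAreaOffset    = kTrampAreaBase - kStackAreaBase;

// Address of the pre-mapped call stub that corresponds to a 16-byte aligned
// data part placed at data_addr on some processor stack.
inline Addr trampoline_for(Addr data_addr) {
    assert(data_addr % kSlotSize == 0);
    return data_addr + kAreaOffset;
}

// What the helper would compute from the return address pushed by the stub:
// step back over the call insn, then apply the constant area offset.
inline Addr data_for_return_address(Addr return_addr) {
    return return_addr - kCallLen - kAreaOffset;
}
```

Because both steps are constant offsets, the helper needs no lookup table and no per-thread state to find the function address and static chain, which is the point of mapping the two areas in parallel.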
Re: Questions about trampolines
Hi,

On Thu, Mar 17, 2005 at 01:35:29PM, Joern RENNECKE wrote:
> I.e. you could have libgcc provide one with a size that works most of the
> time

Some applications have recursions which go to a depth of 1000 and more. Some architectures have only a few k of RAM. Which "size that works most of the time" would you suggest?

It's ugly to have a static pool size. But it's intolerable not to allow the user to change that pool size easily using an option.

> The mmapped trampoline can be an absolute function call to some helper
> code that does the

I am pretty sure that all processor architectures with such a strict Harvard design that it is impossible to generate dynamic code are MMU-less.

yours, - clifford -- bash -c "gcc -o mysdldemo -Wall -O2 -lSDL -lm -pthread -x c <( echo -e ' #include \n#include \nint main(){SDL_Surface*s;SDL_Event e;int x,y,n;SDL_Init(SDL_INIT_VIDEO);s=SDL_SetVideoMode(640,480,32,0);for(x=0; x<640;x++)for(y=0;y<480;y++){float _Complex z=0, c=((x-400)/200.0) + ((y-240)/ 200.0)*1.0fi;for(n=1;n<64;n++){z=z*z+c;if(cabsf(z)>2){((Uint32*)s->pixels)[x+y *640]=n<<3;n=99;}}}SDL_UpdateRect(s,0,0,s->w,s->h);do SDL_WaitEvent(&e); while (e.type!=SDL_QUIT&&e.type!=SDL_KEYDOWN);SDL_Quit();return 0;}' ); ./mysdldemo" M$ is not the answer. M$ is the question. No is the answer!
Re: Questions about trampolines
Clifford Wolf wrote:
> Some applications have recursions which go into a depth of 1000 and more.
> Some architectures have only a few k ram. Which "a size that works most of
> the time" would you suggest? It's ugly to have a static pool size. But it's
> intolerable to not allow the user to change that pool size easily using an
> option.

Of course the user can change the size, by using a library with a different size. But there should be a sensible default. The size of that default can vary from target to target.

>> The mmapped trampoline can be an absolute function call to some helper
>> code that does the
>
> I am pretty sure that all processor architectures with such a strict
> Harvard design that it is impossible to generate dynamic code are MMU-less.

The application of the MMU-based scheme is more to accelerate trampolines by avoiding cache coherency issues, without making allocation / deallocation more expensive. In fact, since the code is already there, the initialization is cheaper than for classic stack-based trampolines on pure von Neumann architectures.

FWIW, for processor-stack based trampolines, if we could guarantee that trampolines are the only code that can be executed on the stack, we could avoid the memory-Icache coherency issue altogether by allocating entire cache lines for trampolines on the stack, and filling them up with trampolines (at least the code part), with a code part that does not change for any given stack location. I.e. after writing the code, we'd have to flush it to memory, but wouldn't need to invalidate the Icache, since the only old code that could be there would be identical to the code just written.
Re: Questions about trampolines
Joern RENNECKE wrote: Of course the user can change the size, by using a library with a different size. This is not an acceptable approach in a production environment, where switching libraries can force revalidation and retesting.
Re: Questions about trampolines
Robert Dewar wrote:
> Joern RENNECKE wrote:
>
>> Of course the user can change the size, by using a library with a
>> different size.
>
> This is not an acceptable approach in a production environment, where
> switching libraries can force revalidation and retesting.

This sounds more like a problem with your process than a genuine technical problem. Why should an option that selects a different library be less safe than an option that changes code generation?

But if you really want to, you can of course select a different module out of the same library, by playing with --defsym.
short int and conversions
Hi,

I'm trying to port gcc 4.1 to an architecture that has the following memory layout: BITS_PER_UNIT=32 and UNITS_PER_WORD=1.

It has support (16-bit registers and operators) for 16-bit signed arithmetic, used mainly for addressing. There are also operators for 32-bit integer and floating point support.

I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2). I reserved QImode and QFmode for 32-bit integer/floating point operations, and I defined a new fractional int mode, FRACTIONAL_INT_MODE (PQ, 16, 1), for pointers and short int operations.

When I try to compile a very simple program that uses short int, xgcc segfaults with a stack overflow because it calls recursively

#32 0x0806dd6d in convert (type=0xb7c7b288, expr=0xb7c88090) at ../../gcc/c-convert.c:95
#33 0x08160626 in convert_to_integer (type=0xb7c7b288, expr=0xb7c88090) at ../../gcc/convert.c:442

I presume it tries to convert a small-precision mode into something bigger, but I cannot understand why. This is the first time I have tried to port gcc, so I don't know if my assumptions are reasonable or not.

Could someone help me?

thanks
andrea.
Re: __builtin_cpow((0,0),(0,0))
Ronny Peine <[EMAIL PROTECTED]> writes:
> | Well yes in the general case it's not applicable, but x^0 is 1 in the
> | complex case, too.

On Thu, Mar 17, 2005 at 01:08:58PM +0100, Gabriel Dos Reis wrote:
> Just repeating it does not make it a reality.

However, repeating it does annoy the readership of this list, and arguing with it just seems to cause people on the other side to repeat their arguments once again. Can we please stop this discussion?
Re: short int and conversions
On Thu, 17 Mar 2005, Andrea wrote:

> I'm trying to port gcc 4.1 for an architecture that has the following
> memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.

Support for systems with bytes wider than 8 bits is somewhat bitrotten at present, as it seems little has been done on the c4x port lately and it is the only such port we currently have; various PRs indicate it simply doesn't work (won't build libgcc) at present. I have however CC:ed the maintainer of the c4x port in case he should wish to improve the state of this port and the general support for such ports.

> It has support (16bit registers and operators) for 16bit signed
> arithmetic used mainly for addressing. There are also operators for 32
> bit integer and floating point support.
> I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).

short needs to have at least the precision of char in C. (C99 made explicit various aspects of the ordering rules for type precision which C90 was insufficiently complete about.)

However, types narrower than char do work in the compiler: we have them for bit-fields. As required by the C standard, types narrower than int are promoted to int in arithmetic. Bit-field types don't have their own modes, but in principle you should be able to have a special type with its own mode narrower than char; however, you may need to implement optimizations which convert operations on promoted types to operations on narrow types for targets with such types.

--
Joseph S. Myers http://www.srcf.ucam.org/~jsm28/gcc/
[EMAIL PROTECTED] (personal mail)
[EMAIL PROTECTED] (CodeSourcery mail)
[EMAIL PROTECTED] (Bugzilla assignments and CCs)
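The promotion rule mentioned above (types narrower than int are promoted to int in arithmetic) can be observed from the language side with a short sketch. It is written in C++ so that the resulting type can be checked at compile time; C has the same promotion rule. The struct Flags and the function add_shorts are hypothetical names, and the run-time check assumes a host where short is 16 bits and int is at least 32 bits.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// A bit-field narrower than char, similar to the case described in the mail.
struct Flags {
    unsigned small : 3;
};

// short + short is computed in int, not in short.
static_assert(std::is_same<decltype(short(1) + short(1)), int>::value,
              "operands of type short are promoted to int");

// A 3-bit unsigned bit-field also promotes to int, because int can
// represent all of its values.
static_assert(std::is_same<decltype(+std::declval<Flags &>().small), int>::value,
              "narrow bit-fields are promoted to int");

// Because the addition happens in int, the sum does not wrap at 16 bits
// (assuming int is at least 32 bits wide on the host).
inline int add_shorts(short a, short b) { return a + b; }
```

This is why a back end with genuinely narrow arithmetic needs the extra optimizations the reply mentions: the front end hands it operations on promoted int values, and narrowing them back is the target's job.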
GCC3 to GCC4 performance regression. Bug?
I have been looking at a significant performance regression in the hmmer application between GCC 3.4 and GCC 4.0. I have a small cutdown test case (attached) that demonstrates the problem and which runs more than 10% slower on IA64 (HP-UX or Linux) when compiled with GCC 4.0 than when compiled with GCC 3.4.

At first I thought this was just due to 'better' alias analysis in the P7Viterbi routine and that it was the right thing to do even if it was slower. It looked like GCC 3.4 does not believe that hmm->tsc could alias mmx but GCC 4.0 thinks they could, and thus GCC 4.0 does more loads inside the inner loop of P7Viterbi. But then I noticed something weird: if I remove the field M (which is unused in my example) from the plan7_s structure, GCC 4.0 runs as fast as GCC 3.4. I don't understand why this would affect things.

Any optimization experts care to take a look at this test case and help me understand what is going on and if this change from 3.4 to 4.0 is intentional or not?

Steve Ellcey
[EMAIL PROTECTED]

Test Case
---

#define L_CONST 500

void *malloc(long size);

struct plan7_s {
  int M;
  int **tsc; /* transition scores [0.6][1.M-1]*/
};

struct dpmatrix_s {
  int **mmx;
};

struct dpmatrix_s *mx;

void AllocPlan7Body(struct plan7_s *hmm, int M)
{
  int i;

  hmm->tsc = malloc (7 * sizeof(int *));
  hmm->tsc[0] = malloc ((M+16) * sizeof(int));
  mx->mmx = (int **) malloc(sizeof(int *) * (L_CONST+1));
  for (i = 0; i <= L_CONST; i++) {
    mx->mmx[i] = malloc (M+2+16);
  }
  return;
}

void P7Viterbi(int L, int M, struct plan7_s *hmm, int **mmx)
{
  int i,k;

  for (i = 1; i <= L; i++) {
    for (k = 1; k <= M; k++) {
      mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
    }
  }
}

main ()
{
  struct plan7_s *hmm;
  char dsq[L_CONST];
  int i;

  hmm = (struct plan7_s *) malloc (sizeof (struct plan7_s));
  mx = (struct dpmatrix_s *) malloc (sizeof (struct dpmatrix_s));
  AllocPlan7Body(hmm, 10);
  for (i = 0; i < 60; i++) {
    P7Viterbi(500, 10, hmm, mx->mmx);
  }
}
help with mudflap testsuite result analysis
So, I've been working on mudflap for darwin8, and these are the results I get... I know what you're thinking: it's impossible to get it working because it doesn't have --wrap and friends. Well, I pulled some magic pixie dust out and sprinkled it around and it's starting to work...

The question is, how decent are the results, can you spot any systematic errors in them, and can you identify anything in mudflap that is not portable to darwin?

I started from 89 passes... :-) I fixed almost all the obvious darwin-related issues by looking at the build result and at the libmudflap.log file. If someone would like to help track down the remaining issues, I'd be interested in hearing from you. Thanks.

Attachment: libmudflap.log.bz (binary data)
For those who want to automatically generate predicates.md...
Hi,

I created a set of scripts that generates predicates.md based on PREDICATE_CODES in tm.h. The generated file looks like this:

;; Predicate definitions for FIXME FIXME.
;; Copyright (C) 2005 Free Software Foundation, Inc.
;;
;; This file is part of GCC.
;;
;; :
;; : Usual copyright notice
;; :

;; Return true if OP is a valid source operand for an integer move instruction.
(define_predicate "general_operand_src"
  (match_code "const_int,const_double,const,symbol_ref,label_ref,subreg,reg,mem")
{
  if (GET_MODE (op) == mode
      && GET_CODE (op) == MEM
      && GET_CODE (XEXP (op, 0)) == POST_INC)
    return 1;
  return general_operand (op, mode);
})

:
: More predicates follow.
:

1. A copyright notice is automatically inserted, except for the port name.
2. The comment for each function is taken from tm.c.
3. The name of a predicate, along with the codes it accepts, is automatically taken from PREDICATE_CODES.
4. The C code for a predicate is automatically taken from tm.c.

My scripts will only generate predicates.md. They do not remove PREDICATE_CODES from tm.h, predicates from tm.c, or prototypes from tm-protos.h. All these are left for your code cleanup pleasure. :-)

Another thing that my scripts won't do is convert a C-style predicate to a LISP-style predicate. My scripts are only meant to alleviate the mechanical part of the conversion.

Anyway, untar the attachment and run

  predicatecodes.sh h8300

under config/h8300 to generate predicates.md. Of course, you can replace h8300 with any port with PREDICATE_CODES.

My scripts are not robust, so don't blame me if they eat your files. I might actually start posting patches to convert to predicates.md.

Kazu Hirata

Attachment: conv_predicate_codes.tar.gz (binary data)
Re: Newlib _ctype_ alias kludge now invalid due to PR middle-end/15700 fix.
Giovanni Bajo wrote:
> Hans-Peter Nilsson <[EMAIL PROTECTED]> wrote:
>
>> So, the previously-questionable newlib alias-to-offset-in-table
>> kludge is finally judged invalid. This is a heads-up for newlib
>> users. IMHO it's not a GCC bug, though there's surely going to
>> be some commotion. Maybe a NEWS item is called for, I dunno.
>
> It will be in NEWS, since RTH already updated
> http://gcc.gnu.org/gcc-4.0/changes.html. I hope newlib will be promptly
> fixed.
>
> Giovanni Bajo

I have just checked in a patch to newlib that changes the ctype macros to use __ctype_ptr instead of _ctype_. In addition, a configuration check is made to see whether the array aliasing trick can be used or not. The code allows for backward compatibility except in the case where the old code is using negative offsets and the current version of newlib is built with a compiler that does not support the array aliasing trick.

Corinna, if this causes any Cygwin issues, please let me know.

-- Jeff J.
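As a rough illustration of the general technique (an ordinary pointer into the middle of a classification table, instead of a symbol aliased to an offset inside that table), consider the sketch below. It is not newlib's source: the names ClassTable, class_ptr and is_digit, the flag value and the table layout are all hypothetical and chosen only to show why negative indices can be supported this way.

```cpp
#include <cassert>

constexpr unsigned char kDigit = 0x04;  // hypothetical classification flag

// 128 leading entries cover negative indices (EOF, negative signed chars);
// the following 256 entries cover the character values 0..255.
struct ClassTable {
    unsigned char entries[128 + 256];
};

inline ClassTable make_table() {
    ClassTable t{};
    for (int c = '0'; c <= '9'; ++c) t.entries[128 + c] |= kDigit;
    return t;
}

static const ClassTable table = make_table();

// A plain pointer to entry 128 of the table. Indexing it with values in
// [-128, 255] stays inside the same array, so no symbol aliasing is needed.
static const unsigned char *const class_ptr = table.entries + 128;

inline bool is_digit(int c) {
    assert(c >= -128 && c <= 255);
    return (class_ptr[c] & kDigit) != 0;
}
```

The cost compared with an aliased symbol is one extra load of the pointer per lookup, in exchange for code that does not depend on how the compiler treats accesses through an alias.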
false spam positive from gcc-patches
Hi, any reason why the message http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html was rejected as spam from gcc-patches, yet accepted on the fortran list?
Re: GCC3 to GCC4 performance regression. Bug?
Steve Ellcey wrote:
> Test Case
> ---

I think this is the same bug (which was not considered one back then) as Benjamin Redelings described in the thread "C++ math optimization problem...". There are again unnecessary memory accesses, as if the memory were volatile, which could be moved out of the inner loop. GCC 3.4 does that in this case (there are other cases where both fail, see the math optimization thread), and once again changes which shouldn't have any effect enormously affect the inner loop.

#define L_CONST 500

void *malloc(long size);

struct plan7_s {
  int M;
  int **tsc; /* transition scores [0.6][1.M-1]*/
};

struct dpmatrix_s {
  int **mmx;
};

struct dpmatrix_s *mx;

void AllocPlan7Body(struct plan7_s *hmm, int M)
{
  int i;

  hmm->tsc = malloc (7 * sizeof(int *));
  hmm->tsc[0] = malloc ((M+16) * sizeof(int));
  mx->mmx = (int **) malloc(sizeof(int *) * (L_CONST+1));
  for (i = 0; i <= L_CONST; i++) {
    mx->mmx[i] = malloc (M+2+16);
  }
  return;
}

void P7Viterbi(int L, int M, struct plan7_s *hmm, int **mmx)
{
  int i,k;

  for (i = 1; i <= L; i++) {
    for (k = 1; k <= M; k++) {
      mmx[i][k] = mmx[i-1][k-1] + hmm->tsc[0][k-1];
    }
  }
}

main ()
{
  struct plan7_s *hmm;
  char dsq[L_CONST];
  int i;

  hmm = (struct plan7_s *) malloc (sizeof (struct plan7_s));
  mx = (struct dpmatrix_s *) malloc (sizeof (struct dpmatrix_s));
  AllocPlan7Body(hmm, 10);
  for (i = 0; i < 60; i++) {
    P7Viterbi(500, 10, hmm, mx->mmx);
  }
}

-- Stefan Strasser
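What "moved out of the inner loop" means here can be sketched by hoisting the loads by hand. This is an illustration of the transformation under discussion, not a statement about what either GCC version actually emits. The struct name Plan7 and both function names are hypothetical, and the hoisted version is only equivalent if the stores to mmx[i][k] never overwrite hmm->tsc, tsc[0] or the row pointers in mmx, which is exactly the aliasing question the compiler has to answer.

```cpp
#include <cassert>

struct Plan7 {
    int M;
    int **tsc;
};

// As in the test case: hmm->tsc[0], mmx[i] and mmx[i-1] are written so that
// they may be re-read on every iteration unless the compiler can prove the
// store to mmx[i][k] does not change them.
inline void viterbi_plain(int L, int M, Plan7 *hmm, int **mmx) {
    assert(hmm != nullptr && mmx != nullptr);
    for (int i = 1; i <= L; i++)
        for (int k = 1; k <= M; k++)
            mmx[i][k] = mmx[i - 1][k - 1] + hmm->tsc[0][k - 1];
}

// Loads hoisted by hand into locals, which is valid only under the
// no-aliasing assumption stated above.
inline void viterbi_hoisted(int L, int M, Plan7 *hmm, int **mmx) {
    assert(hmm != nullptr && mmx != nullptr);
    const int *tsc0 = hmm->tsc[0];  // loaded once for the whole call
    for (int i = 1; i <= L; i++) {
        int *cur = mmx[i];          // loaded once per row
        const int *prev = mmx[i - 1];
        for (int k = 1; k <= M; k++)
            cur[k] = prev[k - 1] + tsc0[k - 1];
    }
}
```

Comparing the generated assembly of the two functions with a given compiler version is one way to see whether the compiler performed the hoisting on its own.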
Re: libgcc-std.ver question
On Mar 17, 2005, at 4:27 AM, Richard Henderson wrote:
> I suppose it would be ok, but it would only be relevant for embedded
> targets where "int" < SImode. Otherwise we use the plain "ffs" symbol
> in libc.

Ah, ok, that falls into the don't care bin for me... For them, they probably don't use shared libraries, preferring static versions... Thanks.
re: problems compiling gcc-3.3.1
Amit Thakar wrote:
> Following is the error i'am getting while compiling gcc-3.3.1.I am using
> headers of my system.How do i get rid of this.
>
> In file included from tconfig.h:23,
>                  from ../../../gcc-3.3.1/gcc/libgcc2.c:36:
> ../../../gcc-3.3.1/gcc/config/i386/linux.h:232:20: signal.h: No such file or directory
> ../../../gcc-3.3.1/gcc/config/i386/linux.h:233:26: sys/ucontext.h: No such file or directory
> make[2]: *** [libgcc/./_muldi3.o] Error 1
> make[2]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
> make[1]: *** [libgcc.a] Error 2
> make[1]: Leaving directory `/home/test2/platdev/mysw/build-tools/gcc'
> make: *** [all-gcc] Error 2

A little google searching turns up several hits, e.g.
http://www.embeddedtux.org/pipermail/etux/2004-December/000925.html
which say "use glibc headers".

Is this a cross-compiler, or a native compiler? What's the target OS?

Perhaps you should ask these questions on the crossgcc mailing list, which is a more comfortable place to talk about problems building old versions of gcc.

- Dan
Re: Merging calls to `abort'
When they see abort: core dumped, they just curse Emacs for losing their work and switch to vi. I am dubious of that speculation, because Emacs is very good at not losing your work. It's true that they don't complain about it on the Emacs developer list, where you participate, because end-user complaints usually go to the GNU/Linux distributions first. They have not passed these complaints on to me, at least not in recent years. Anyway, this is a separate issue from the question of what GCC should do. GCC should treat multiple abort calls in whatever way is most useful for programs that have multiple abort calls.
coverage mismatch
Hi, I have been trying to use "-fprofile-generate" and "-fprofile-use" for some small bitwise C benchmarks (developed at MIT). I have a checkout of an October 2004 GCC build of version 4.0. It gives me a coverage mismatch error for "arcs", saying the number of counters is "6" instead of "5". How do I go about fixing these problems? In fact, 8 out of 15 of these benchmarks give me the same problem. Most of these benchmarks have only one module, "main.c". I compile the following way:

"gcc -O2 -fprofile-generate main.c"
"gcc -O2 -fprofile-use main.c" -- here it throws the error.

Thanks for your help,
regards,
Raj
Re: Questions about trampolines
Joern RENNECKE wrote: You need to be able to set the value of a parameter over a widely varying range; what makes you think you can pick two values that will cover all cases, or 4 or 6 for that matter?
Re: coverage mismatch
On Mar 17, 2005, at 3:17 PM, Rajkishore Barik wrote:
> I have been trying to use "-fprofile-generate" and "-fprofile-use" for
> some small bitwise C benchmarks (developed at MIT). I have a check-out
> of October 2004 GCC build of 4.0 version.

Try a checkout from today and let us know if the problem remains unfixed. If it does, please file a PR on our web site, thanks.
Re: short int and conversions
Thank you for your explanations. Looking in "detail" at what happens in my case (I would like to have modes that have fewer bits of precision than BITS_PER_UNIT), I cannot tell whether there is a bug at convert.c:440 or whether it is a feature that prevents me from using a FRACTIONAL_INT as a small precision (

[...] wrote:
> On Thu, 17 Mar 2005, Andrea wrote:
>
> > I'm trying to port gcc 4.1 for an architecture that has the following
> > memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.
>
> Support for systems with bytes wider than 8 bits is somewhat bitrotten at
> present, as it seems little has been done on the c4x port lately and it is
> the only such port we currently have; various PRs indicate it simply
> doesn't work (won't build libgcc) at present. I have however CC:ed the
> maintainer of the c4x port in case he should wish to improve the state of
> this port and the general support for such ports.
>
> > It has support (16bit registers and operators) for 16bit signed
> > arithmetic used mainly for addressing. There are also operators for 32
> > bit integer and floating point support.
> > I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).
>
> short needs to have at least the precision of char in C. (C99 made
> explicit various aspects of the ordering rules for type precision which
> C90 was insufficiently complete about.)
>
> However, types narrower than char do work in the compiler - we have them
> for bit-fields. As required by the C standard, types narrower than int
> are promoted to int in arithmetic. Bit-field types don't have their own
> modes, but in principle you should be able to have a special type with its
> own mode narrower than char: however, you may need to implement
> optimizations which convert operations on promoted types to operations on
> narrow types for targets with such types.
>
> --
> Joseph S. Myers http://www.srcf.ucam.org/~jsm28/gcc/
> [EMAIL PROTECTED] (personal mail)
> [EMAIL PROTECTED] (CodeSourcery mail)
> [EMAIL PROTECTED] (Bugzilla assignments and CCs)
>
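The promotion rule mentioned in the quoted message can be seen in a small example. This is only a sketch; the function names are invented for illustration. The arithmetic on two unsigned char operands is carried out in the promoted type, so the sum does not wrap until it is explicitly converted back to the narrow type.

```c
#include <assert.h>
#include <limits.h>

/* Operands narrower than int are promoted before the addition,
   so the result is computed at (at least) int width. */
int promoted_sum(unsigned char a, unsigned char b)
{
    return a + b;
}

/* Converting back to the narrow type reduces the value
   modulo UCHAR_MAX + 1. */
unsigned char narrowed_sum(unsigned char a, unsigned char b)
{
    return (unsigned char)(a + b);
}

/* Small self-check using values that overflow an 8-bit char. */
void check_promotion(void)
{
    unsigned char a = 200, b = 100;

    assert(promoted_sum(a, b) == 300);
    assert(narrowed_sum(a, b) == 300 % (UCHAR_MAX + 1));
    /* The expression a + b has the promoted type, not unsigned char. */
    assert(sizeof(a + b) == sizeof(int));
}
```

A target with a genuinely narrow arithmetic unit would want the second form to be done without widening first, which is the kind of optimization the quoted message says a port may need to implement.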
Re: RFC: Changes in the representation of call clobbers
What if we try a variation on this? I'm not even sure how I feel about it, since it's even wonkier than what you suggest.

First, create a unique GV for each type, and implement a gatherer definition. Instead of individual VMAYDEFs for 3 variables, we have a gatherer which assigns them all to one global var, something like:

> # GV_14 = V_KILL_ALL
> bar ()

Then, instead of using what would have been the new version of x, y or z, we use GV_14 for each of x, y, or z until it is redefined via another VMAYDEF.

I know that's not very clear, so let me try to explain it more graphically, with the virtual operands on the RHS of this listing:

  foo()
  {                          maps to:
    # X_2 = V_MUST_DEF       # X_2 = V_MUST_DEF
    X = 3
    # Y_4 = V_MUST_DEF       # Y_4 = V_MUST_DEF
    Y = 1
    # Z_6 = V_MUST_DEF       # Z_6 = V_MUST_DEF
    # VUSE
    Z = X + 1
    # X_3 = V_MAY_DEF        # GV_13 = V_KILL_ALL
    # Y_5 = V_MAY_DEF
    # Z_7 = V_MAY_DEF
    bar ()
    # VUSE                   # VUSE
    # Y_8 = V_MUST_DEF       # Y_8 = V_MUST_DEF
    Y = X + 2
    # VUSE                   # VUSE
    # VUSE                   # VUSE
    return Y + Z;
  }

In effect, what we are doing is saying that at the call site every variable in the alias set for GV_13 has been MAYDEF'd. This means we don't know its value until it's physically defined again, as Y is. Until then, we simply use the current GV variable instead of the individual variables.

In into-ssa I guess this means the "current-def" would be set to the alias variable at these points.

As I said, this looks pretty wonky, but I believe it accurately represents reality. Other than the collector V_KILL_ALL, I don't think anything would change... would it?

It looks a bit tricky to sort out bugs, especially in a large program with lots of variables. We might have to modify the lister to add the variable names to the RHS when there is a reference to the GV to help, i.e.:

    # VUSE                   # VUSE
    # Y_8 = V_MUST_DEF       # Y_8 = V_MUST_DEF
    Y = X + 2
    # VUSE                   # VUSE
    # VUSE                   # VUSE
    return Y + Z;

It will be even more cryptic than this in reality. The VUSE of GV_13 is redundant, as it is mentioned in the RHS of the V_MUST_DEF, leaving us with:

    # Y_8 = V_MUST_DEF
    Y = X + 2

which looks even wonkier. It's precise and efficient, however.

And will it cause an issue to have GV_13 in the RHS of a V_MUST_DEF and then be used later in a VUSE, as in the return stmt? In *theory* it shouldn't, but I don't know if anyone has written code which assumes the RHS of a VMAYDEF is dead.

I think all the PHI node issues work themselves out too, but you spend way more time thinking about PHI nodes than I do. Maybe you see an issue.

Alternatively, if this is either too out there, or I'm otherwise off my rocker for some reason I've missed, we can visit a solution to the only real problem with the proposal:

>
> - If there are no uses of X, Y and Z after the call to bar, DCE will
> think that those stores are dead. We would have to hack DCE to somehow
> seeing the call to bar() as a user for those stores.

This is really the only issue I see, and I'm not sure of a decent way to deal with it. I'll think about it.

Andrew
Re: short int and conversions
> I'm trying to port gcc 4.1 for an architecture that has the following
> memory layout BITS_PER_UNIT=32 and UNITS_PER_WORD=1.
> It has support (16bit registers and operators) for 16bit signed
> arithmetic used mainly for addressing. There are also operators for 32
> bit integer and floating point support.
> I set SHORT_TYPE_SIZE=POINTER_SIZE=(BITS_PER_WORD/2).
>
> I reserved QImode and QFmode for 32 bit integer/floating point operations.
> And I defined a new fractional int mode FRACTIONAL_INT_MODE (PQ, 16, 1) for
> pointers and short int operations.
> When I try to compile a very simple program with short int, xgcc
> segfaults with a stack overflow because it calls recursively
> #32 0x0806dd6d in convert (type=0xb7c7b288, expr=0xb7c88090) at
> ../../gcc/c-convert.c:95
> #33 0x08160626 in convert_to_integer (type=0xb7c7b288,
> expr=0xb7c88090) at ../../gcc/convert.c:442
>
> I presume it tries to convert a small precision mode in something
> bigger but I cannot understand why.
> This is the first time I try to port gcc, so I don't know if my
> assumptions are reasonable or not.

With the caveat that I've never boot-strapped a port myself:

- "unit" tends to be a synonym for char, as is QI mode for chars narrower than 16 bits.

- "word" tends to be a synonym for int/void*, typically represented as HI (16-bit) or SI (32-bit) mode operands, and typically the natural size of a memory access, although not necessarily.

- correspondingly, 32-bit float operands tend to be represented as SF mode operands.

(Q = quarter, H = half, S = single, D = Double)
(I = integer, F = float)

So in rough summary, the following may be reasonable choices (given your machine's apparent support of 16-bit and possibly lesser sized operations):

  bits  mode  ~type
  ----  ----  -----
     8  QI    char/short (which can be emulated if necessary)
    16  HI    char/short/int/void*
    16  HF    (target-specific-float)
    32  SI    int/void*/long
    32  SF    float
    64  DI    long/void*/long-long
    64  DF    double

Also as a generalization, it's likely wise not to try modeling a port after the c4x, as its implementation seems at best very odd. (Alternatively, a better model may be one of the supported 16/32 bit targets, depending on your machine's architecture.)

Best of luck.
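As a rough illustration of the naming convention above, the size letter of a mode name gives its size in storage units (1, 2, 4 or 8), and the width in bits then follows from BITS_PER_UNIT. The helpers below are hypothetical and are not GCC functions; they only handle the Q/H/S/D letters listed above and report -1 for anything else, including partial modes such as PQ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a GCC API): number of storage units
   implied by the size letter of a mode name such as "QI" or "SF". */
int mode_units(const char *mode)
{
    assert(mode != NULL);
    switch (mode[0]) {
    case 'Q': return 1;   /* quarter */
    case 'H': return 2;   /* half    */
    case 'S': return 4;   /* single  */
    case 'D': return 8;   /* double  */
    default:  return -1;  /* not covered by this sketch */
    }
}

/* Width in bits for a given unit size, or -1 if the mode letter
   is not one of Q/H/S/D. */
int mode_bits(const char *mode, int bits_per_unit)
{
    int units = mode_units(mode);

    return units < 0 ? -1 : units * bits_per_unit;
}
```

With 8-bit units this reproduces the table above (QI is 8 bits, HI 16, SI and SF 32, DI and DF 64). With BITS_PER_UNIT=32, as in the original question, mode_bits("QI", 32) gives 32, which matches the poster's use of QImode for 32-bit operations.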
Re: false spam positive from gcc-patches
Thomas Koenig wrote:
> any reason why the message
> http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html
> was rejected as spam from gcc-patches, yet accepted on the fortran
> list?

See http://www.sourceware.org/lists.html#rbl-sucks which has a discussion of how the spam filters work, and how to get around them.

Possible reasons:
1) The address you posted from is subscribed to one mailing list but not the other (perhaps an alternate address is subscribed on the other list).
2) The address you posted from is on the allow list for one mailing list but not on the allow list of the other.

By the way, I think it is a word of all caps in the subject line which triggers the spam filter, and "PR" is such a word, which is unfortunate. This is just a guess though.

-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: false spam positive from gcc-patches
On Thu, Mar 17, 2005 at 09:20:45PM -0800, James E Wilson wrote: > Thomas Koenig wrote: > >any reason why the message > >http://gcc.gnu.org/ml/fortran/2005-03/msg00282.html > >was rejected as spam from gcc-patches, yet accepted on the fortran > >list? > > By the way, I think it is a word of all caps in the subject line which > triggers the spam filter, and "PR" is such a word, which is unfortunate. > This is just a guess though. Yeah, and this is really stupid for filtering Fortran related emails. Traditionally, all upper case words are used to distinguish between text and Fortran code/syntax/keywords. -- Steve
Re: supporting --with-cpu=default32 option for x86_64
Nitin Gupta wrote:
> The following lines were added in config.gcc in order to recognise
> --with-cpu=default32. But I don't understand how it was actually made
> to default to 32-bit.

The trick is to look at the default64 code, and note what default32 doesn't do that default64 does do. The code you quoted is only clearing with_cpu when default32/default64 are given, because these are valid options to the actual gcc code. These just mean "use whatever the default with_cpu value is".

-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: Suggestion for a fix to Bug middle-end/20177
Mostafa Hagog wrote:
> The question is: what is the correct fix for the longer term? Is it
> enough to mark the SMSed block dirty? Or do we need also to keep the
> REG_DEAD correct in each basic-block separately?

You either have to keep all REG_NOTES up to date, or call code that will recompute them. You can recompute REG_DEAD/REG_UNUSED notes by calling back into flow. This is presumably what happens when you mark the block dirty, so that would be a sufficient solution for REG_DEAD/REG_UNUSED.

See for instance the code in combine.c that updates REG_NOTES after combination. This is in distribute_notes.

By the way, REG_UNUSED means that this instruction sets a register, and this value dies here. There are no uses of this register before the next set or the end of the function. Thus it holds register life info that is complementary to REG_DEAD.

-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: RTL?
하태준 wrote:
> Where can I get information about code, log_links, reg_notes?

See the internals documentation, in the file gcc/doc/rtl.texi, or on the web at http://gcc.gnu.org/onlinedocs/gccint/Insns.html#Insns

See also the sources for more info, as the docs may not be fully up to date, in particular the file rtl.h.

-- 
Jim Wilson, GNU Tools Support, http://www.SpecifixInc.com
Re: reload question
Bernd Schmidt <[EMAIL PROTECTED]> writes:
> Reload insns aren't themselves reloaded. You should look at the
> SECONDARY_*_RELOAD_CLASS; they'll probably let you do what you want.

Ah, thank you!

I've defined SECONDARY_*_RELOAD_CLASS (and PREFERRED_* to try to help things along), and am now running into more understandable reload problems: "unable to find a register to spill in class" :-/

The problem, as I understand it, is that reload doesn't deal with conflicts between secondary and primary reloads -- which are common with my arch because it's an accumulator architecture.

For instance, slightly modifying my previous example: say I've got a mov instruction that only works via an accumulator A, and a two-operand add instruction. The "r" regclass includes regs A,X,Y, and the "a" regclass only includes reg A.

mov has constraints like:

   0 = "g,a"
   1 = "a,gi"

and add3 has constraints:

   0 = "a"
   1 = "0"
   2 = "ri" (say)

So if before reload you've got an instruction like:

   add temp, [sp + 4], [sp + 6]

and v2 and v3 are in memory, reload will have to generate something like:

   mov A, [sp + 4]   ; primary reload 1 in X, with secondary reload 0 A
   mov X, A          ;   ""
   mov A, [sp + 6]   ; primary reload 2 in A, with no secondary reload
   add A, X
   mov temp, A

There's really only _one_ register that can be used for many reloads, A.

The problem is that reload doesn't seem to be able to produce this kind of output: if it chooses A as a primary reload (common, as most insns use A as a first operand), reload will think it conflicts with secondary reloads that also use A (when it really needn't, as the secondary reloads only use A "temporarily").

This is particularly bad with RELOAD_OTHER reloads. I kludged around this to some degree by changing `reloads_conflict' (reload1.c) to always think secondary reloads _don't_ conflict [see patch1].

As that will fail in the case where a primary reload is loaded before a secondary reload using the same register, I _also_ modified `emit_reload_insns' to sort the order in which operand reloads are output, so that an operand whose secondary reload interferes with another operand's primary reload is always loaded first [see patch2].

However, I think this is not guaranteed to always work -- certainly merely disregarding conflicts with secondary reloads will fail for architectures which are slightly less anemic, say with _two_ accumulators... :_)

Does anybody have a hint for a way to solve this problem? Reload is very confusing...

Thanks,

-Miles

= patch1 =

--- gcc-3.4.3/gcc/reload1.c 2004-05-02 21:37:17.0 +0900
+++ gcc-3.4.3-supk0-20050317/gcc/reload1.c 2005-03-17 19:49:35.935534000 +0900
@@ -1680,7 +1688,7 @@ find_reg (struct insn_chain *chain, int order)
     {
       int other = reload_order[k];
 
-      if (rld[other].regno >= 0 && reloads_conflict (other, rnum))
+      if (rld[other].regno >= 0 && reloads_conflict (other, rnum, 0))
 	for (j = 0; j < rld[other].nregs; j++)
 	  SET_HARD_REG_BIT (used_by_other_reload, rld[other].regno + j);
     }
@@ -4601,18 +4609,25 @@
 }
 
 /* Return 1 if the reloads denoted by R1 and R2 cannot share a register.
-   Return 0 otherwise.
+   Return 0 otherwise.  If SECONDARIES_CAN_CONFLICT is zero, secondary
+   reloads are considered never to conflict; otherwise they are treated
+   normally.
 
    This function uses the same algorithm as reload_reg_free_p above.  */
 
 int
-reloads_conflict (int r1, int r2)
+reloads_conflict (int r1, int r2, int secondaries_can_conflict)
 {
   enum reload_type r1_type = rld[r1].when_needed;
   enum reload_type r2_type = rld[r2].when_needed;
   int r1_opnum = rld[r1].opnum;
   int r2_opnum = rld[r2].opnum;
 
+  /* Secondary reloads need not conflict with anything.  */
+  if (!secondaries_can_conflict
+      && (rld[r1].secondary_p || rld[r2].secondary_p))
+    return 0;
+
   /* RELOAD_OTHER conflicts with everything.  */
   if (r2_type == RELOAD_OTHER)
     return 1;

= patch2 =

--- gcc-3.4.3/gcc/reload1.c 2004-05-02 21:37:17.0 +0900
+++ gcc-3.4.3-supk0-20050317/gcc/reload1.c 2005-03-17 19:49:35.935534000 +0900
@@ -6951,6 +6966,51 @@ emit_reload_insns (struct insn_chain *chain)
       do_output_reload (chain, rld + j, j);
     }
 
+#ifdef SECONDARY_INPUT_RELOAD_CLASS
+  for (j = 0; j < reload_n_operands; j++)
+    opnum_emit_pos[j] = emit_pos_opnum[j] = j;
+
+  /* Order the operands to avoid conflicts between the primary reload of
+     one operand and a secondary reload in another operand (which we
+     ignored before).  XXX this only works for input reloads!!  */
+  for (j = 0; j < n_reloads; j++)
+    if (rld[j].secondary_p)
+      /* This is a secon