Re: bootstrap with -ftree-parallelize-loops
Sent from my iPhone
On Nov 20, 2008, at 4:28 AM, Razya Ladelsky <[EMAIL PROTECTED]> wrote:
"Andrew Pinski" <[EMAIL PROTECTED]> wrote on 19/11/2008 20:54:19:
On Wed, Nov 19, 2008 at 10:48 AM, David Edelsohn <[EMAIL PROTECTED]> wrote:
On Wed, Nov 19, 2008 at 1:47 PM, Andrew Pinski <[EMAIL PROTECTED]> wrote:
On Wed, Nov 19, 2008 at 10:40 AM, David Edelsohn <[EMAIL PROTECTED]> wrote:
You can test -ftree-parallelize-loops building GCC with an installed version of GCC, but not as a three-stage bootstrap.
Except you can change libgomp into a target library that gets bootstrapped though. Just like libgcc is always built in each stage. Just add bootstrap=true to libgomp's target_modules and that should work. I have not tested it though.
It's not enough. libgomp was not built. Any idea what else is missing?
Did you rebuild Makefile.in? It is a generated file after all.
Thanks,
Andrew Pinski
That will build libgomp but the spec file remains uninstalled.
Hmm, how about adding -B../prev-libgomp to the command line then? That should fix the problem there also.
Thanks,
Andrew Pinski
Re: gcc4.1.2 compilation errors
Sent from my iPhone
On Nov 29, 2008, at 6:34 PM, yx <[EMAIL PROTECTED]> wrote:
Hi, I am trying to compile the SPEC2000 benchmarks with gcc 4.1.2 (gfortran and g++ were built successfully for gcc 4.1.2). However, I got compilation errors for 4 benchmarks in SPEC2000. The error messages are listed as follows:
176.gcc:
reorg.c: In function 'find_end_label'
reorg.c:831: error: invalid lvalue in increment
reorg.c: In function 'delete_from_delay_slot'
reorg.c:1033: error: invalid lvalue in increment
reorg.c: In function 'make_return_insns'
reorg.c:4157: error: invalid lvalue in increment
reorg.c: In function 'dbr_schedule'
reorg.c:4237: error: invalid lvalue in increment
reorg.c:4312: warning: incompatible implicit declaration of built-in function 'memset'
specmake: *** [reorg.o] Error 1
specmake options 2> options.err | tee options.out
You need the alt sources which remove the cast as lvalue.
252.eon:
ggFrame2.cc: In function 'std::istream& operator>>(std::istream&, ggFrame2&)'
ggFrame2.cc:64: error: no match for 'operator>>' in 'is >> "("'
253.perlbmk:
perlio.c: In function 'PerlIO_setpos'
perlio.c:472: error: incompatible type for argument 2 of 'fseek'
perlio.c: In function 'PerlIO_getpos'
perlio.c:490: error: incompatible types in assignment
specmake: *** [perlio.o] Error 1
178.galgel:
local412/bin/gfortran -c -o modules.o -malign-double -O3 modules.f90
In file modules.f90:1
Try with -ffixed-form; the source is written in fixed form Fortran 90/95. The .f90 extension causes gfortran to default to free form.
Thanks,
Andrew Pinski
C Maximal sizes
1
Error: Unclassifiable statement at (1)
In file modules.f90:16
C *** Parameters of the problem ***
1
Error: Unclassifiable statement at (1)
In file modules.f90:36
C *** Coefficients for boundary conditions ***
1
Error: Unclassifiable statement at (1)
In file modules.f90:41
Real*8, Dimension(0:mm) ::
1
Error: Syntax error in data declaration at (1)
In file modules.f90:42
*F1, F2, F3, F4, G1, G2, G3, G4, A, B
1
Error: Unclassifiable statement at (1)
In file modules.f90:51
C * Inner products of polynomials **
1
Error: Unclassifiable statement at (1)
In file modules.f90:66
C * Inner products of functions **
1
Error: Unclassifiable statement at (1)
In file modules.f90:71
Real*8, Dimension(mm,mm) ::
1
Error: Syntax error in data declaration at (1)
In file modules.f90:72
* VXX, VXY, VYX, VYY, VXX2, VXY2, VYX2, VYY2
1
Error: Unclassifiable statement at (1)
In file modules.f90:84
Real*8, Dimension(mm,mm,mm) ::
1
Error: Syntax error in data declaration at (1)
In file modules.f90:85
* VXXX, VXXY, VXYX, VXYY, VYXX, VYXY, VYYX, VYYY
1
Error: Unclassifiable statement at (1)
In file modules.f90:97
Real*8, Dimension(mm,mm) :: TXX, TYY, TXX2, TYY2,
1
Error: Syntax error in data declaration at (1)
In file modules.f90:98
* PXX, PYY, PXX2, PYY2
1
Error: Unclassifiable statement at (1)
In file modules.f90:104
Real*8, Dimension(mm,mm,mm) ::
1
Error: Syntax error in data declaration at (1)
In file modules.f90:105
* WXTX, WXTY, WYTX, WYTY, WXPX, WXPY, WYPX, WYPY
1
Error: Unclassifiable statement at (1)
In file modules.f90:108
C *** Matrices of the Galerkin system ***
1
Error: Unclassifiable statement at (1)
In file modules.f90:140
Real*8, Dimension(N11,N11) ::
1
Error: Syntax error in data declaration at (1)
In file modules.f90:141
*POP, Poj1, Poj2, Poj3, Poj4
1
Error: Unclassifiable statement at (1)
Any help will be greatly appreciated!
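On the "invalid lvalue in increment" errors from reorg.c in this thread: GCC 4.0 removed the old cast-as-lvalue extension, so code that increments the result of a cast no longer compiles. The sketch below shows the general shape of the portable rewrite. It is only an illustration; skip_bytes and sum_ints are hypothetical names and nothing here is taken from 176.gcc or from the SPEC alt sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from reorg.c.  Older GCC accepted
       ((char *) p)++;
   as an extension.  GCC 4.x rejects it, because the result of a cast
   is not an lvalue in ISO C.  The portable form computes the new
   address and hands it back so the caller can assign it.  */
static void *
skip_bytes (void *p, size_t n)
{
  assert (p != NULL);
  return (char *) p + n;
}

/* Walk an int array through a void * cursor, one element at a time.  */
static int
sum_ints (void *cursor, size_t count)
{
  int total = 0;
  size_t i;

  for (i = 0; i < count; i++)
    {
      total += *(int *) cursor;
      /* Old style would have been: ((int *) cursor)++;  */
      cursor = skip_bytes (cursor, sizeof (int));
    }
  return total;
}
```

The same pattern applies when the cast is hidden inside a macro: assign the computed value to the underlying variable instead of incrementing the cast expression.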
Re: [ARM] Implement __builtin_bswap32() via ARMv6 "rev" instruction
Sent from my iPhone
On Dec 8, 2008, at 9:37 AM, "Alexandre Pereira Nunes" <[EMAIL PROTECTED]> wrote:
2008/12/8 Paul Brook <[EMAIL PROTECTED]>:
On Monday 08 December 2008, Alexandre Pereira Nunes wrote:
A patch follows. I didn't take care of the scheduling case the correct way, though (aliased to clz class).
Please read http://gcc.gnu.org/contribute.html In particular you need a copyright assignment, ChangeLog entry, and testing.
I can provide these, though as for the copyright assignment, the document mentions I can declare the changes in public domain, and since I already published something (which may or may not be used by someone in the future), I hereby do so.
You should also be able to implement bswap16, and while we're here it probably makes sense to implement an optimized bswap sequence for pre-v6 cores.
ARM has rev constructs for 16-bit packed integers; however, AFAIK gcc has no builtin for these yet, and without this, it won't internally have any use for this instruction pattern, correct? I only saw mentions of bswap32 and bswap64 in the documentation.
That is because bswap16 is the same as a rotate in 16-bit mode by 8.
Thanks,
Andrew Pinski
I'll take a look at the optimized bswap32 sequences for previous cores and get these expanded when not optimizing for size; IIRC I saw similar patterns in some other architecture's machine description.
Once all that is fixed, patches should be sent to [EMAIL PROTECTED], not this list.
Roger that; Thanks.
-- Alexandre
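The remark that bswap16 is just a 16-bit rotate by 8 can be checked exhaustively in plain C. The sketch below does that, and also shows a generic shift-and-mask 32-bit swap. All function names are made up for illustration, and swap32 is ordinary portable C, not the tuned pre-v6 ARM sequence discussed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by n bits, for 0 < n < 16.  */
static uint16_t
rotl16 (uint16_t x, unsigned n)
{
  assert (n > 0 && n < 16);
  return (uint16_t) ((x << n) | (x >> (16 - n)));
}

/* Byte swap of a 16-bit value written out explicitly.  */
static uint16_t
swap16 (uint16_t x)
{
  return (uint16_t) (((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
}

/* Generic shift-and-mask 32-bit byte swap.  */
static uint32_t
swap32 (uint32_t x)
{
  return ((x & 0x000000ffu) << 24)
         | ((x & 0x0000ff00u) << 8)
         | ((x & 0x00ff0000u) >> 8)
         | ((x & 0xff000000u) >> 24);
}

/* Count the 16-bit inputs for which rotate-by-8 and swap16 disagree.  */
static unsigned
count_mismatches (void)
{
  unsigned bad = 0;
  uint32_t v;

  for (v = 0; v <= 0xffffu; v++)
    if (rotl16 ((uint16_t) v, 8) != swap16 ((uint16_t) v))
      bad++;
  return bad;
}
```

Since a rotate by 8 in a 16-bit mode exchanges the two bytes, a compiler that already has a 16-bit rotate pattern needs no separate bswap16 builtin to make use of it.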
Re: no conversion from char[] to char* on function calls under circumstances [was: A bug?]
C++98 is not C99 :) there is no rvalue to lvalue conversion for rvalue arrays in C++98. Also this code is still undefined C99 but will most likely become valid C1x.
Sent from my iPhone
On Dec 16, 2008, at 8:45 AM, Jan Engelhardt wrote:
On Tuesday 2008-12-16 17:05, Michel Van den Bergh wrote:
Hi, The following program segfaults when compiled with gcc but runs fine when compiled with g++ or icc (the Intel C compiler)
#include <stdio.h>
struct Hello {
    char world[20];
};
struct Hello s(){
    struct Hello r;
    r.world[0]='H';
    r.world[1]='\0';
    return r;
}
int main(){
    printf("%s\n",s().world);
}
Assigning s() to a variable and then using the variable avoids the segfault.
Had you compiled with -Wall, you would have noticed:
e.c:13: warning: format ‘%s’ expects type ‘char *’, but argument 2 has type ‘char[20]’
And when there is a type mismatch, a crash is pretty likely. Not that I can say why gcc does not convert it to char* but g++ does. Now what happens? The following augmented snippet shows it:
---<8---
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
struct Hello {
    char world[20];
};
struct Hello s(void)
{
    struct Hello r;
    strcpy(r.world, "Hello");
    return r;
}
static void dump(const char *fmt, ...)
{
    va_list argp;
    va_start(argp, fmt);
    char *p = va_arg(argp, char *);
    printf("%p\n", p);
    va_end(argp);
}
int main(void)
{
    dump("", s().world);
    return 0;
}
--->8---
I get 0x6c6c6548, which is obviously part of the string Hello. So passing a char[20] into a varargs function seems not to convert it to char* when done through a non-visible temporary (the result of s() is hidden on the stack of main).
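The workaround mentioned in the thread, assigning the result of s() to a variable before using the array member, can be sketched as follows. The names make_hello, print_hello and hello_equals are invented for this example. The point is only that the array then lives in a named object, so taking a pointer to it is well defined in every C dialect.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct Hello
{
  char world[20];
};

static struct Hello
make_hello (void)
{
  struct Hello r;
  strcpy (r.world, "Hello");
  return r;
}

/* Copy the returned struct into a named object first.  tmp is an
   lvalue, so tmp.world decays to char * in the usual way.  */
static void
print_hello (void)
{
  struct Hello tmp = make_hello ();
  printf ("%s\n", tmp.world);
}

/* Same idea, but returning a result instead of printing.  */
static int
hello_equals (const char *expected)
{
  struct Hello tmp = make_hello ();
  assert (expected != NULL);
  return strcmp (tmp.world, expected) == 0;
}
```

This costs one struct copy, which the compiler is usually able to elide, and avoids depending on how a particular language revision treats arrays inside function return values.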
Re: How to implement "unsigned long long __rdtsc ()" for x86?
Sent from my iPhone
On Jan 16, 2009, at 9:23 AM, "H.J. Lu" wrote:
Hi, I am trying to implement unsigned long long __rdtsc (void); for RDTSC as an intrinsic. It is easy to do with an asm statement, but I am having a hard time implementing it as a gcc builtin. The main problem is that there is no input, so it is impossible to write proper RTL for it. Any suggestions?
unspec_volatile? I don't see any issues with a no-input unspec_volatile; the rs6000 backend uses that already.
Thanks,
Andrew Pinski
I am thinking about adding some fake readonly registers. Will it work? Thanks.
-- H.J.
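For reference, the "easy" asm-statement version mentioned above might look roughly like this. It is a sketch that assumes GCC-style extended asm on x86 or x86-64; the names my_rdtsc and tsc_delta are invented here (the thread's __rdtsc is deliberately not redefined), and the non-x86 branch is only a placeholder so the file still compiles elsewhere.

```c
#include <assert.h>
#include <stdint.h>

/* Read the time stamp counter.  Hypothetical name, chosen to avoid
   clashing with any compiler-provided __rdtsc.  */
static uint64_t
my_rdtsc (void)
{
#if defined (__GNUC__) && (defined (__i386__) || defined (__x86_64__))
  uint32_t lo, hi;
  /* RDTSC has no inputs; it writes the low half of the counter to EAX
     ("=a") and the high half to EDX ("=d").  The volatile qualifier
     keeps the compiler from merging or deleting repeated reads.  */
  __asm__ __volatile__ ("rdtsc" : "=a" (lo), "=d" (hi));
  return ((uint64_t) hi << 32) | lo;
#else
  return 0;  /* placeholder: no RDTSC on this target */
#endif
}

/* Cycles elapsed between two readings taken in order.  */
static uint64_t
tsc_delta (uint64_t start, uint64_t end)
{
  assert (end >= start);
  return end - start;
}
```

The volatile asm plays the same role at source level that unspec_volatile plays in RTL: it marks an operation with no inputs as having an effect the optimizer must not remove or duplicate.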
Re: bitwise dataflow
Sent from my iPhone
On Mar 6, 2009, at 7:00 PM, Silvius Rus wrote:
I'm thinking about adding bitwise dataflow analysis support to RTL. Is this a good idea? Bad idea? Already done? Please review if interested.
There is already some bitwise dataflow implemented in combine. And I think df supports bytewise already but it is turned off because of problems with some targets and modes. I also think Steven B. had an implementation on the tree level or at least ideas on how to implement there.
Thank you, Silvius
Motivation
==========
int foo(int x)
{
  int i = 100;
  do
    {
      if (x > 0)
        x = x & 1;  /* After this insn, all bits except 1 are 0. */
      else
        x = x & 2;  /* After this insn, all bits except 2 are 0. */
      i--;
    }
  while (i);
  return x & 4;  /* Before this insn, only bits 1 and 2 may be nonzero. */
}
"foo" should simply return 0. This optimization is currently missed at -O2 and -O3 on x86_64. (Cases with simpler control are solved by the "combine" pass.)
Proposal
========
1. Dataflow framework to propagate bitwise register properties. (Integrated with the current dataflow framework.)
2. Forward bitwise dataflow analysis: constant bit propagation.
3. Backward bitwise dataflow analysis: dead bit propagation.
4. Target applications: improve dce and see. (Others?)
Preliminary Design
==================
1. I don't have enough understanding of GCC to decide whether it should be done at RTL level or tree level. After some brainstorming with Ian Taylor, Diego Novillo and others, we decided to go with RTL.
2. This problem could be solved using iterative dataflow with bitmaps. However, memory size would increase significantly compared to scalar analysis, as would processing time. For constant bit propagation, we need to keep 2 bits of state for each register bit. For 64 bit registers, that's a factor of 128x over the scalar reaching definition problem. Instead, I propose a sparse bitwise dataflow framework. We would still use the existing RTL dataflow framework to build scalar DU/UD chains. Once they are available, bitwise information is propagated only over these chains, analogous to the sparse constant propagation described by Wegman & Zadeck TOPLAS 1991.
3. This might be too much detail at this point, but just in case, here is a brief description of a bit constant propagation algorithm.
For each instruction I in the function body
  For each register R in instruction I
    def_constant_bits(I, R) = collect constants from AND/OR/... operations.
Iterate until the def_constant_bits don't change:
  For each instruction I in the function body
    For each register R used at I
      use_constant_bits(I, R) = merge (def_constant_bits(D, R)) across all definitions D of R that reach I
    For each register R defined at I
      def_constant_bits(I, R) = transfer (use_constant_bits(I, RU)) for all register uses RU, based on opcodes.
The data structures and routines "collect", "merge" and "transfer" depend on the problem solved.
4. Complexity considerations. The solver visits every DU edge once for each iteration of the fixed point convergence loop. The maximum number of iterations is given by the height of the state lattice multiplied by the number of bits. Although this can be as high as 128 for constant bit propagation on x86_64, in practice we expect much lower times. Also, lower complexity guarantees can be given if less accurate information is allowed, e.g., byte level rather than bit level. For byte constants, the upper bound constant factor drops from 128 to 16.
Some of these ideas came from discussions with Preston Briggs, Sriraman Tallam and others.
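To make the "merge" and "transfer" routines from the proposal concrete, here is a toy version for a single register, applied by hand to the control flow of foo. It is a dense, hard-coded sketch and not the sparse DU-chain framework being proposed; struct bits, transfer_and_const, merge and analyze_foo are all names invented for this example, and a 32-bit register is assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Lattice value for one register: bits known to be 0 and bits known
   to be 1.  A bit set in neither mask is unknown.  */
struct bits
{
  uint32_t known_zero;
  uint32_t known_one;
};

/* Transfer function for "r = r & c" with constant c.  */
static struct bits
transfer_and_const (struct bits in, uint32_t c)
{
  struct bits out;
  out.known_zero = in.known_zero | ~c;  /* cleared bits of c force 0 */
  out.known_one = in.known_one & c;
  assert ((out.known_zero & out.known_one) == 0);
  return out;
}

/* Merge at a control-flow join: keep only what both paths agree on.  */
static struct bits
merge (struct bits a, struct bits b)
{
  struct bits out;
  out.known_zero = a.known_zero & b.known_zero;
  out.known_one = a.known_one & b.known_one;
  return out;
}

/* Hand-coded fixed point for foo: the loop body ANDs x with 1 on one
   path and with 2 on the other, then the function returns x & 4.  */
static struct bits
analyze_foo (void)
{
  struct bits entry = { 0, 0 };  /* nothing known about x on entry */
  struct bits loop_in = entry;
  struct bits body_out, next_in;

  for (;;)
    {
      body_out = merge (transfer_and_const (loop_in, 1),
                        transfer_and_const (loop_in, 2));
      next_in = merge (entry, body_out);  /* entry edge + back edge */
      if (next_in.known_zero == loop_in.known_zero
          && next_in.known_one == loop_in.known_one)
        break;
      loop_in = next_in;
    }
  return transfer_and_const (body_out, 4);
}
```

After the loop body every bit except the low two is known to be zero, so the final AND with 4 leaves every bit known zero, which is the "foo should simply return 0" result from the Motivation section.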
Re: Setting -frounding-math by default
Sent from my iPhone
On Mar 8, 2009, at 3:26 PM, Sylvain Pion wrote:
Joseph S. Myers wrote:
On Sun, 8 Mar 2009, Sylvain Pion wrote:
http://gcc.gnu.org/ml/gcc-patches/2003-09/msg00104.html introduced the -frounding-math option, and changed the default behavior of GCC to optimize "unsafely".
That is a misleading description. The cautionary text added by that patch is still present:
This option is experimental and does not currently guarantee to disable all GCC optimizations that are affected by rounding mode.
This is still true. GCC did not before the patch, did not after the patch and does not now fully support disabling optimizations that are unsafe in the presence of rounding mode changes; a few affected optimizations are disabled, but no one has seriously attempted to cover them all. GCC was "unsafe" before the patch and remains so whether or not you use the option.
You are playing with words. Please step back and look at the facts for a moment. The fact is that Roger's patch introduced a regression (this word should be clear enough here), in that some users now have their old code broken, and they are forced to add the -frounding-math option (after having lost some time finding out about this non-trivial issue). This is a long term hindrance.
Actually, before Roger's patch the default was the same. There was just no way to turn it off.
Even if -frounding-math is not 100% correct, it makes things work (more precisely, lack of it breaks code), and this is the only thing that matters here.
I am tired of waiting for a solution in GCC,
http://gcc.gnu.org/faq.html#support There are plenty of bug reports for floating-point issues, and a lack of volunteers fixing them, so perhaps you should not expect filing more bugs to help, but "waiting for a solution" isn't listed there at all.
Thanks for this first-level hotline support reply. You should probably learn to whom you are replying. In particular, I did send patches to GCC several times, look at the changelogs. I know what this is all about. And I know that it is beyond my time availability to learn enough to contribute patches for big things like supporting this pragma. I nevertheless try to find grants for funding people to implement some related things in GCC. And I also contribute time to help in the guidance of GCC with my expertise in this particular area, even if it requires a lot of time to convince people. So, please... I'm willing to contribute a patch that changes the default, if this pleases you. It will take me 30 minutes, and would take only 5 for a professional GCC developer since I have to set up the environment. The hard part, the most time consuming part here, is the discussion on the rationale, which is why I ended up with the decision of sending this mail rather than a patch directly. Moreover, when reading Roger's mail I referred to, it explicitly said that support for the pragma was "intended to come next". So "waiting for a fix" was what I was supposed to do, once this patch, which introduced a regression, went into GCC.
Somehow, I also think that this -fno-rounding-math default is inconsistent with the general policy of defaults in GCC which is to aim at safety and correctness first.
The default is in accordance with standard requirements, and more conservative than other compilers which tend to enable their -ffast-math equivalents by default rather than just -fno-rounding-math.
At least these others have chosen to do it more consistently. They don't claim their default is anything sane at all for serious FP computations, which at least is clear. Moreover, standards are not perfect. In particular, in this area, C++ doesn't say much (I don't care about C, which lacks way too many necessary features for serious general scientific programming anyway). C++0x currently adopts C99's fesetround() without the pragma, so it's in an inconsistent state. I nevertheless try to have this fixed properly before C++0x is out (I was at Summit last week to defend my N2811 paper around precisely this issue, for example). Moreover, standards are far from saying anything about the best default for -frounding-math; it's a GCC matter. And GCC maintainers should be able to listen to expert users to help them make good choices.
-- Sylvain Pion
INRIA Sophia-Antipolis Geometrica Project-Team
CGAL, http://cgal.org/
Re: shouldn't every middle-end pass be uniquely named?
Sent from my iPhone
On Jul 31, 2008, at 1:11, Basile STARYNKEVITCH <[EMAIL PROTECTED]> wrote:
Hello All, Some middle-end passes (those declared in tree-passes.h) are still unnamed. I tend to believe that it would be helpful (mostly for gcc debugging purposes) that every struct opt_pass (without exception) should be uniquely named (and that this should be enforced, e.g. in ENABLE_CHECKING mode, essentially by registering each pass in a hash table in function next_pass_1 of gcc/passes.c). What do people think about that? Except as a habit (which I think is a bad one), is there any reason to have anonymous passes (those with a null pass->name), or (I don't know if such a beast exists) homonym passes (two different passes with equal pass->name)?
Yes. To prevent a dump file. One such example is freeing the internal data structures. That should not have a dump.
Regards.
-- Basile STARYNKEVITCH http://starynkevitch.net/Basile/
email: basilestarynkevitchnet mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mines, sont seulement les miennes} ***
Re: obvious race condition in darwin/netbsd __enable_execute_stack due to caching pagesize/mask
Sent from my iPhone
On Aug 27, 2008, at 0:27, Jay <[EMAIL PROTECTED]> wrote:
gcc 4.3.1 config/darwin.h:
#define ENABLE_EXECUTE_STACK \
extern void __enable_execute_stack (void *); \
void \
__enable_execute_stack (void *addr) \
{ \
  extern int mprotect (void *, size_t, int); \
  extern int getpagesize (void); \
  static int size; \
  static long mask; \
  \
  char *page, *end; \
  \
  if (size == 0) \
    { \
      size = getpagesize(); \
      mask = ~((long) size - 1); \
Or even better, store size after the store to mask. That is:
int tmp = getpagesize();
*(volatile long *)&mask = ~((long)tmp - 1);
*(volatile int *)&size = tmp;
Thanks,
Andrew Pinski
    } \
  \
  page = (char *) (((long) addr) & mask); \
  end = (char *) ((((long) (addr + (TARGET_64BIT ? 48 : 40))) & mask) + size); \
  \
  /* 7 == PROT_READ | PROT_WRITE | PROT_EXEC */ \
  (void) mprotect (page, end - page, 7); \
}
contains an obvious race condition between the storing of size and the storing of mask. One thread might store size, making it nonzero, and another thread could then decide size is stored (it is), and that mask is stored (it isn't), and go ahead and use the uninitialized zero value of mask.
Easy fix: don't cache mask. Calling mprotect is going to be expensive anyway. Even getpagesize might be very cheap. Might not be worth caching. Like so (diff written by hand, and lots of extra whitespace so mail programs maybe don't ruin the rest of the formatting):
  extern int getpagesize (void); \
  static int size; \
< static long mask; \
> long mask; \
  \
  char *page, *end; \
  \
  if (size == 0) \
    { \
      size = getpagesize(); \
<     mask = ~((long) size - 1); \
    } \
> mask = ~((long) size - 1); \
Or maybe just make it a compile-time constant.
Same problem in netbsd.h.
No problem in openbsd.h, sol2.h, osf.h -- no cache.
No problem in freebsd.h, no use of page size.
And yes, I know Apple doesn't even support this and that the whole thing is maybe controversial, but that is independent. The race condition shouldn't be there. You might also fix it by checking if mask is zero, but that wouldn't suffice without a memory barrier and/or volatile to force mask to be stored after size or such. You don't really want volatile, or, if you do put in volatile, you want to have locals to cache the globals, to avoid unnecessary extra fetches. (Yes, I'm micro-optimizing and de-micro-optimizing, I realize.) Really, just not static caching mask and possibly not caching pagesize is a good, simple, reliable, efficient solution. Caching the values in locals would be goodness imho though, like, again just composed here, not compiled:
extern int mprotect (void *, size_t, int); \
extern int getpagesize (void); \
static size_t static_size; \
size_t mask; \
size_t size; \
\
char *page, *end;
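The "don't cache anything in statics" fix proposed above can be sketched outside the macro as an ordinary function. This is an illustration only: trampoline_pages and struct page_range are hypothetical names, sysconf(_SC_PAGESIZE) is used in place of getpagesize, and the mprotect call itself is left to the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Page-aligned range covering [addr, addr + tramp_size).  */
struct page_range
{
  uintptr_t start;
  uintptr_t end;
};

/* Hypothetical helper.  The page size is queried on every call and
   the mask is a local, so there is no shared state for two threads to
   observe half-initialized.  */
static struct page_range
trampoline_pages (const void *addr, size_t tramp_size)
{
  long size = sysconf (_SC_PAGESIZE);
  uintptr_t mask;
  struct page_range r;

  assert (size > 0);
  mask = ~((uintptr_t) size - 1);
  r.start = (uintptr_t) addr & mask;
  r.end = (((uintptr_t) addr + tramp_size) & mask) + (uintptr_t) size;
  /* A caller would then do something like:
       mprotect ((void *) r.start, r.end - r.start,
                 PROT_READ | PROT_WRITE | PROT_EXEC);  */
  return r;
}
```

Compared with the cached version this costs one extra library call per trampoline, which is small next to the mprotect system call that follows.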
Re: Instrument-functions: possible also for function exit via exception?
Sent from my iPhone
On Sep 1, 2008, at 4:22, Tim München <[EMAIL PROTECTED] wuppertal.de> wrote:
On Friday 29 August 2008 23:04:15 you wrote:
On Fri, Aug 29, 2008 at 2:36 AM, Tim München <[EMAIL PROTECTED]> wrote:
Is it somehow possible to also be notified if a function/method is left with a 'throw'? Or, would it be possible to patch gcc like that? I had a quick look into function.c but it seems not to be as straightforward as the entry/exit instrumentation (which is probably the reason it isn't implemented yet). Or is it even impossible for some reason?
In 4.0 and above, -finstrument-functions actually emits the exit function call for exceptions too; it is wrapped in a "finally" clause.
Thanks,
Andrew Pinski
Hi, sorry, I accidentally didn't reply to the list the last time; so once again: is it possible to backport this new behaviour to gcc 3.4.x? (I am currently using gcc 3.4.6.) I'd have a look at the differences between gcc 3.4 and 4.x, but maybe you know already that this is difficult/improbable/already done...
This code was rewritten for tree-ssa, so it is not possible to merge it back without the whole tree-ssa code merged back. If I were you, I would move forward, as 3.4 is no longer maintained. Likewise for 4.0 and 4.1.
Thanks, Tim
Re: C/C++ FEs: Do we really need three char_type_nodes?
Yes. char, unsigned char and signed char are three distinct types, unlike the other integer types in C/C++. That is, they are incompatible types.
Thanks,
Andrew Pinski
Sent from my iPhone
On Sep 19, 2008, at 9:36 AM, "Diego Novillo" <[EMAIL PROTECTED]> wrote:
When we instantiate char_type_node in tree.c:build_common_tree_nodes we very explicitly create a char_type_node that is signed or unsigned based on the value of -funsigned-char, but instead of making char_type_node point to signed_char_type_node or unsigned_char_type_node, we explicitly instantiate a different type.
This is causing trouble in lto1 because when we stream out gimple, assignments between char and unsigned char go through without a cast. However, the LTO front end has no knowledge of the C/C++ FE flags, so it creates a signed char_type_node. This causes a SEGV in PRE because the assignment between char and unsigned char confuses it.
This is trivially fixable by making char_type_node be 'unsigned char', so that when we stream out the types for the variables the types are explicitly signed or unsigned, instead of taking their sign implicitly from an FE flag.
@@ -7674,12 +7713,8 @@ build_common_tree_nodes (bool signed_cha
   unsigned_char_type_node = make_unsigned_type (CHAR_TYPE_SIZE);
   TYPE_STRING_FLAG (unsigned_char_type_node) = 1;

-  /* Define `char', which is like either `signed char' or `unsigned char'
-     but not the same as either.  */
-  char_type_node
-    = (signed_char
-       ? make_signed_type (CHAR_TYPE_SIZE)
-       : make_unsigned_type (CHAR_TYPE_SIZE));
+  /* Define `char', which is like either `signed char' or `unsigned char'.  */
+  char_type_node = signed_char ? signed_char_type_node : unsigned_char_type_node;
   TYPE_STRING_FLAG (char_type_node) = 1;

   short_integer_type_node = make_signed_type (SHORT_TYPE_SIZE);
However, this has other side effects in the FE, as we now can't build libstdc++ (a PCH failure). So, do we really need a third, distinct, char_type_node? Is there some language or FE rule that causes us to do this? The comment in build_common_tree_nodes is not very illuminating.
I could fix this in a slower way by crawling through the whole callgraph and changing 'char' to 'unsigned char' everywhere once the whole compilation unit is in GIMPLE form, but I'd like to consider doing that only if it's really necessary.
Thanks. Diego.
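The language rule behind the three char_type_nodes can be observed directly in C11: a _Generic selection may not list two compatible types, so the macro below only compiles because char, signed char and unsigned char are mutually incompatible. CHAR_KIND, the enum and plain_char_is_signed are names made up for this sketch.

```c
#include <assert.h>
#include <limits.h>

enum char_kind { PLAIN_CHAR = 1, SIGNED_CHAR = 2, UNSIGNED_CHAR = 3 };

/* Would be rejected by the compiler if plain char were the same type
   as either signed char or unsigned char.  */
#define CHAR_KIND(x) _Generic ((x),      \
    char: PLAIN_CHAR,                    \
    signed char: SIGNED_CHAR,            \
    unsigned char: UNSIGNED_CHAR)

/* Nonzero if plain char has the range of signed char on this target
   (the property that -funsigned-char changes).  */
static int
plain_char_is_signed (void)
{
  char c = 0;
  /* Whatever its signedness, the type itself is still "char".  */
  assert (CHAR_KIND (c) == PLAIN_CHAR);
  return CHAR_MIN < 0;
}
```

So plain char shares its representation with one of the other two types but never its identity, which is why the front ends keep a separate type node for it.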