CFP (deadline extension): GROW'10 (2nd Workshop on GCC Research Opportunities)
The submission deadline is extended until the 22nd of November, 2009.
Apologies if you receive multiple copies of this call.

                         CALL FOR PAPERS
        2nd Workshop on GCC Research Opportunities (GROW'10)
                http://ctuning.org/workshop-grow10

        January 23, 2010, Pisa, Italy
        (co-located with the HiPEAC 2010 Conference)

The GROW workshop focuses on current challenges in research and development
of compiler analyses and optimizations based on the free GNU Compiler
Collection (GCC). The goal of this workshop is to bring together people from
industry and academia who are interested in conducting research based on GCC
and in enhancing this compiler suite for research needs. The workshop will
promote and disseminate compiler research (recent, ongoing or planned) with
GCC as a robust, industrial-strength vehicle that supports free and
collaborative research. The program will include an invited talk and a
discussion panel on future research and development directions of GCC.

Topics of interest

Any issue related to innovative program analysis, optimization and run-time
adaptation with GCC, including but not limited to:

 * Classical compiler analyses, transformations and optimizations
 * Power-aware analyses and optimizations
 * Language/Compiler/HW cooperation
 * Optimizing compilation tools for heterogeneous/reconfigurable/multicore systems
 * Tools to improve compiler configurability and retargetability
 * Profiling, program instrumentation and dynamic analysis
 * Iterative and collective feedback-directed optimization
 * Case studies and performance evaluations
 * Techniques and tools to improve usability and quality of GCC
 * Plugins to enhance research capabilities of GCC

Paper Submission Guidelines

Submitted papers should be original and not published or submitted for
publication elsewhere; papers similar to published or submitted work must
include an explicit explanation. Papers should use the LNCS format and
should be 12 pages maximum. Please submit via the EasyChair system linked
from the GROW'10 website. Papers will be refereed by the Program Committee
and, if accepted and if the authors wish, will be made available on the
workshop web site. Authors of the best papers from the workshop may be
invited to revise their submission for the journal "Transactions on
HiPEAC", if the work is in sufficiently mature form.

Important Dates

 Deadline for submission: November 22, 2009
 Decision notification:   December 14, 2009
 Workshop:                January 23, 2010 (half day)

Organizers

 Grigori Fursin, INRIA, France
 Dorit Nuzman, IBM, Israel

Program Committee

 Arutyun I. Avetisyan, ISP RAS, Russia
 Zbigniew Chamski, Infrasoft IT Solutions, Poland
 Albert Cohen, INRIA, France
 David Edelsohn, IBM, USA
 Bjorn Franke, University of Edinburgh, UK
 Grigori Fursin, INRIA, France
 Benedict Gaster, AMD, USA
 Jan Hubicka, SUSE
 Paul H.J. Kelly, Imperial College London, UK
 Ondrej Lhotak, University of Waterloo, Canada
 Hans-Peter Nilsson, Axis Communications, Sweden
 Diego Novillo, Google, Canada
 Dorit Nuzman, IBM, Israel
 Sebastian Pop, AMD, USA
 Ian Lance Taylor, Google, USA
 Chengyong Wu, ICT, China
 Kenneth Zadeck, NaturalBridge, USA
 Ayal Zaks, IBM, Israel

Keynote talk

 Diego Novillo, Google, Canada
 "Using GCC as a toolbox for research: GCC plugins and whole-program compilation"

Previous Workshops

 GROW'09:  http://www.doc.ic.ac.uk/~phjk/GROW09
 GREPS'07: http://sysrun.haifa.il.ibm.com/hrl/greps2007
Re: i370 port - constructing compile script
> Well, the configure process should result in the variable LIBOBJS
> in the generated libiberty Makefile to be set to a list of objects
> containing implementations of replacement system routines.
>
> So if you do not have HAVE_STRCASECMP in config.h, you should
> have been getting strcasecmp.o in LIBOBJS ...

And indeed, I sort of am.

LIBOBJS includes a strcasecmp.s$U.s

That suffix is certainly strange-looking though. I checked in config.log
and I can see that it automatically detected that my "object code" has a
".s" extension, which is basically correct given that I forced the "-S"
option. All of the LIBOBJS entries are like that.

In addition, there's another problem - it has included strncmp in the
list. I had a look and it appears that it attempts to actually run a test
program to see if strncmp works. That's not going to work in a
cross-compile environment though, so maybe it assumes the worst.

I've taken a look at the Makefile to try to find out what is happening.
It seems that there are REQUIRED_OFILES, which include things like
safe-ctype, and those have to have a ".o" extension. Given that those are
hardcoded and forced to ".o", why isn't LIBOBJS done the same way?

Anyway, I decided to change this:

    else
      LIBOBJS="$LIBOBJS $ac_func.$ac_objext"
    fi

code you showed earlier to be hardcoded to .o. And then I changed
ac_libobjs to stop putting that $U in there as well, and I finally got my
strcasecmp.

Note that I also seem to be getting strerror. It's on the list of
"required files", even though it isn't required or wanted - configure
correctly detected that I already had strerror. I manually excluded that
from my list of files and now things are looking good again - including
strcasecmp being automatically selected in the build process. :-)

Hopefully it won't be too much longer before I have the stage 1 JCL being
automatically generated so that I can verify that the new files to be
compiled actually work on MVS. :-)

BFN. Paul.
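For anyone retracing this, here is a minimal sketch of how to inspect what
configure decided; the build-directory path is a guess, not taken from
Paul's setup:

  #!/bin/sh
  # Hypothetical libiberty build directory; adjust to your tree.
  BUILD=./libiberty

  # The object-file extension configure detected; a compiler script that
  # forces -S makes this come out as "s" instead of "o".
  grep 'ac_cv_objext' "$BUILD/config.log"

  # The replacement routines configure scheduled for building.
  grep '^LIBOBJS' "$BUILD/Makefile"

  # Whether strcasecmp was detected in the target C library.
  grep 'HAVE_STRCASECMP' "$BUILD/config.h"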
Re: i370 port - constructing compile script
* Paul Edwards wrote on Sat, Nov 14, 2009 at 09:51:39AM CET:
> > Well, the configure process should result in the variable LIBOBJS
> > in the generated libiberty Makefile to be set to a list of objects
> > containing implementations of replacement system routines.
> >
> > So if you do not have HAVE_STRCASECMP in config.h, you should
> > have been getting strcasecmp.o in LIBOBJS ...
>
> And indeed, I sort of am.
>
> LIBOBJS includes a strcasecmp.s$U.s
>
> That suffix is certainly strange-looking though. I checked in
> config.log and I can see that it automatically detected that
> my "object code" has a ".s" extension, which is basically
> correct given that I forced the "-S" option.

Why do you pass -S in the compiler script? configure sort of expects
that neither -S nor -c is passed automatically.

> In addition, there's another problem - it has included strncmp
> in the list. I had a look and it appears that it attempts to
> actually run the program to see if strncmp works. That's
> not going to work in a cross-compile environment though.
> So maybe it assumes the worst.

Yes. The macro that does this is libiberty_AC_FUNC_STRNCMP in
libiberty/aclocal.m4. In a cross-compile situation, the macro assumes
that strncmp does not work. It uses the cache variable
ac_cv_func_strncmp_works, which you can set if you need to override the
decision, e.g.:

  ac_cv_func_strncmp_works=yes
  export ac_cv_func_strncmp_works
  ../gcc/configure ...

A more permanent solution would be to set this correctly based upon
$host in libiberty/configure.ac and regenerate libiberty/configure with
autoconf.

> And then I changed ac_libobjs to stop putting that $U in there as
> well, and I finally got my strcasecmp.

Why does that $U hurt you? It should get expanded to nothing later on.
(It is a leftover from an old ansi2knr scheme.)

> Note that I also seem to be getting strerror. It's on the list
> of "required files", even though it isn't required or wanted.
> configure correctly detected that I already had strerror.
> I manually excluded that from my list of files and now things
> are looking good again - including strcasecmp being
> automatically selected in the build process. :-)

Again, rather than hacking the generated configure script, I think you
should start modifying the input files, configure.ac in this case, for
permanent solutions to your build issues. Even if you're only changing a
few lines, doing it each time you want to build a different GCC version
is an unnecessary burden.

Thanks,
Ralf
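As a concrete sketch of that more permanent route (the i370/MVS host
triplet pattern is a placeholder, and the fragment would have to appear in
libiberty/configure.ac before libiberty_AC_FUNC_STRNCMP is expanded):

  # Pre-seed the cache variable for hosts where the run-time test cannot
  # be executed (hypothetical triplet pattern).
  case "${host}" in
    i370-*-mvs*)
      ac_cv_func_strncmp_works=${ac_cv_func_strncmp_works-yes}
      ;;
  esac

After editing configure.ac, configure is regenerated with the autoconf
version noted at the top of the existing script.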
Re: i370 port - constructing compile script
> > LIBOBJS includes a strcasecmp.s$U.s
> >
> > That suffix is certainly strange-looking though. I checked in
> > config.log and I can see that it automatically detected that
> > my "object code" has a ".s" extension, which is basically
> > correct given that I forced the "-S" option.
>
> Why do you pass -S in the compiler script? configure sort of expects
> that neither -S nor -c is passed automatically.

The only thing the compiler is capable of doing is generating assembler
code. Just getting that to work has been a 20-year effort. So what I have
done is get the compiler to fail on any missing prototype. I think
perhaps we need to have a generic gcc or autoconf option called "config
by prototype". MVS is just one instance where you might wish to do it
this way. Other ports in their infancy may not have working
cross-assemblers and linkers either. It worked out quite well.

> In a cross-compile situation, the macro assumes that strncmp does not
> work. It uses the cache variable ac_cv_func_strncmp_works, which you
> can set if you need to override the decision, e.g.:
>
>   ac_cv_func_strncmp_works=yes
>   export ac_cv_func_strncmp_works

Ok, thanks, I've added that, and can confirm that it did the trick.

> A more permanent solution would be to set this correctly based upon
> $host in libiberty/configure.ac and regenerate libiberty/configure
> with autoconf.

Ok, that's what a lot of this exercise is about - finding out what needs
to be changed in the long term in GCC 4 if MVS is to be supported.

> > And then I changed ac_libobjs to stop putting that $U in there as
> > well, and I finally got my strcasecmp.
>
> Why does that $U hurt you? It should get expanded to nothing later on.
> (It is a leftover from an old ansi2knr scheme.)

Ok, I put it back in, and indeed, it does work. I must have been confused
by an unrelated failure.

> > Note that I also seem to be getting strerror. It's on the list
> > of "required files", even though it isn't required or wanted.
> > configure correctly detected that I already had strerror.
> > I manually excluded that from my list of files and now things
> > are looking good again - including strcasecmp being
> > automatically selected in the build process. :-)
>
> Again, rather than hacking the generated configure script, I think you
> should start modifying the input files, configure.ac in this case, for
> permanent solutions to your build issues.

As above, that is certainly on the cards. However, I'm trying to flesh
out the issues that exist before seeing if we can get agreement for
changes in GCC 4. E.g. what do you think of the generic "configure by
prototype rather than link" facility? Personally I'd like a "configure by
standard" option, where autoconf knows what to do based on me just
telling it that the compiler is C90 (or C99, as another option)
compliant, so that I don't even need to provide headers. But I think the
header-file option is also useful, so both should be selectable.

> Even if you're only changing a few lines, doing it each time you want
> to build a different GCC version is an unnecessary burden.

Man, I really wish that were even 1% of the issues that need to be sorted
out going from GCC 3.4.6 to GCC 4.x. :-) I'd be happy to do it for the
rest of my life. :-) While the amount of intrusive code is relatively
small, it's still quite widespread, i.e. more than 80 files. And that's
just the intrusive code. There's all the separate port files that need to
be taken care of. :-) There's a good reason it took 20 years to get to
this point. :-)

BFN. Paul.
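There is no such generic facility today, but one possible reading of
"configure by prototype" maps onto autoconf's compile-only declaration
checks, which never need a linker (and, with a driver that stops at -S as
in this port, can get by without an assembler). A sketch, not something
proposed in the thread:

  # configure.ac sketch: detect strcasecmp from its prototype alone.
  # AC_CHECK_DECLS uses a compile-only test; the header to probe is an
  # assumption and may need adjusting for the target.
  AC_CHECK_DECLS([strcasecmp], [], [],
                 [[#include <strings.h>]])

  # configure then defines HAVE_DECL_STRCASECMP to 1 or 0 in config.h,
  # which the code would test instead of the link-based HAVE_STRCASECMP.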
RE: [plugins-ici-cloning-instrumentation] new GCC plugin developments
Hi all,

Just a small update: after some discussions with Joern, we think that,
given our time constraints and the current state of GCC, instead of
trying to push the full ICI into GCC we should start from the opposite
approach. We take all our plugins (support for pass selection and
reordering from MILEPOST; generic function cloning and fine-grain
optimizations from GSOC'09) and try to see which low-level GCC
functionality is missing to support them. Then we provide a few hooks and
a few small updates to GCC, and rewrite our plugins to use the low-level
plugin system. Joern will continue communicating about the few extensions
to the plugin system we need to make this happen.

This is a pragmatic step: it should require minimal changes in GCC and
will let us use the current plugin system for our work right away.
However, I think there is still a benefit in ICI's separation of GCC and
plugins when using internal data structures: currently the referencing of
data structures in GCC is hardwired in plugins, so if one day those data
structures change, we will need to rewrite all plugins. The referencing
mechanism in ICI (data is used in plugins indirectly, through parameter
registering) allows us to ensure plugin compatibility, at the cost of
some performance degradation. We can discuss that after the GCC 4.5
release, once we have more experience from users about plugins ...

By the way, because of this, I think that besides documenting all the
data structures we should perhaps also start recording whether they are
used in some plugins. That may help with cleaning up the internals of the
compiler and will discourage careless changes to data structures in GCC,
keeping plugins compatible.

Anyway, Joern will continue communicating about the progress and
extensions to the plugin system ...

Take care and have a good weekend,
Grigori
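For reference, this is how a plugin built against the low-level GCC 4.5
interface is compiled and loaded from the command line; the plugin file
name and argument key below are made-up examples, not the actual
MILEPOST/GSOC'09 plugins:

  # Build the plugin against the installed compiler's plugin headers.
  PLUGIN_INC=`gcc -print-file-name=plugin`/include
  gcc -fPIC -shared -I"$PLUGIN_INC" -o myplugin.so myplugin.c

  # Load it and pass it an argument through the generic mechanism.
  gcc -O2 -fplugin=./myplugin.so \
      -fplugin-arg-myplugin-passlist=pass-list.txt \
      -c test.c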
Re: Whole program optimization and functions-only-called-once.
Jan Hubicka wrote:

> -fno-ipa-cp should work around your problem for time being.

Indeed it did. Some figures:

hlprog (the main forecast program):

 link time optimization time: 3:20 minutes
 top memory usage:            920 Mbyte

Inliner report:

 Inlined 764 calls, eliminated 226 functions, size 260368 turned to 126882 size.

hirvda (the observation usage program):

 link time optimization time: 10:05 minutes
 top memory usage:            2.3 Gbyte

Inliner report:

 Inlined 2518 calls, eliminated 608 functions, size 1187204 turned to 705838 size.

Of course, there still is:

 Considering invlo6 size 1996.
  Called once from lowpass 530 insns.
  Inlined into lowpass which now has 2293 size for a net change of -2229 size.

 Considering invlo4 size 1462.
  Called once from lowpass 2293 insns.
  Not inlined because --param large-function-growth limit reached.

 Considering invlo2 size 933.
  Called once from lowpass 2293 insns.
  Not inlined because --param large-function-growth limit reached.

where the largest callee *does* get inlined, while two smaller ones don't
(I agree with Jan that this would have been solved by training the
inliner with profiling data, because only invlo4 gets called).

However, my endeavour is to boldly go where no inliner has gone before,
and implement -falways-inline-functions-only-called-once, along the
following lines:

$ svn diff ipa-inline.c
Index: ipa-inline.c
===================================================================
--- ipa-inline.c        (revision 153776)
+++ ipa-inline.c        (working copy)
@@ -1246,7 +1246,7 @@
                          node->callers->caller->global.size);
         }

-      if (cgraph_check_inline_limits (node->callers->caller, node,
+      if (1 || cgraph_check_inline_limits (node->callers->caller, node,
                                       &reason, false))
         {
           cgraph_mark_inline (node->callers);

(Sugg. b. Rich. G.), because inlining functions that are only called once
is always profitable (in number of instructions saved).

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html
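Not equivalent to the patch, but for experiments it may be easier to first
push the existing limits up from the command line; the flags are standard
GCC 4.5 options, while the numeric values and file names are arbitrary
examples:

  # Probe the same behaviour without touching ipa-inline.c: keep IPA-CP
  # disabled as suggested and raise the growth limits that blocked
  # invlo4/invlo2 (values are arbitrary, not tuned).
  gfortran -O2 -flto -fwhole-program -fno-ipa-cp \
           --param large-function-growth=400 \
           --param large-function-insns=10000 \
           -o hlprog *.f90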
Re: howto graphically view .cfg file produced by -fdump-tree-cfg
On Sun, Nov 8, 2009 at 19:10, Larry Evans wrote:
> Does someone know of a way to view this in a graphical way,
> somewhat like what xvcg does for its cfg's?

When I've needed to visualize a CFG, I just used a very simplistic script
to paw through the dump file to produce graphviz output (attached). It
probably needs tweaking as it's been a long time since I last used it.

Diego.

[Attachment: dump2dot]
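The attached script itself is not reproduced in the archive, so the
following stand-alone sketch only shows the general idea of turning a
-fdump-tree-cfg dump into graphviz input. It assumes the GCC 4.x dump
format with "<bb N>:" labels and explicit "goto <bb M>;" lines, so
fall-through edges are not captured:

  #!/bin/sh
  # cfg2dot.sh (hypothetical name): crude graphviz output from a tree-cfg dump.
  # Usage: ./cfg2dot.sh file.c.NNNt.cfg | dot -Tpng -o cfg.png
  awk '
  BEGIN { print "digraph cfg {" }
  /^<bb [0-9]+>/ {
    # "<bb 4>:" starts a new basic block.
    gsub(/[<>:]/, ""); cur = $2
    printf "  bb%s [shape=box];\n", cur
  }
  /goto <bb [0-9]+>/ {
    # "goto <bb 7>;" records an explicit edge out of the current block.
    match($0, /goto <bb [0-9]+>/)
    tgt = substr($0, RSTART, RLENGTH)
    gsub(/[^0-9]/, "", tgt)
    printf "  bb%s -> bb%s;\n", cur, tgt
  }
  END { print "}" }
  ' "$1"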
Re: Whole program optimization and functions-only-called-once.
2009/11/14 Toon Moene :
> [...]
>
> However, my endeavour is to boldly go where no inliner has gone before,
> and implement -falways-inline-functions-only-called-once, along the
> following lines:
>
> [...]
>
> (Sugg. b. Rich. G.), because inlining functions that are only called
> once is always profitable (in number of instructions saved).

;)

Note that some optimizers (for example value-numbering) contain cut-offs
so that they are turned off for large functions, as otherwise
compile-time issues appear because the algorithms are non-linear in the
size of the function.

So it might even not be profitable in the end, for size and speed
reasons.

Richard.
Re: Whole program optimization and functions-only-called-once.
On Sat, Nov 14, 2009 at 8:51 PM, Richard Guenther wrote:
> Note that some optimizers (for example value-numbering) contain
> cut-offs so that they are turned off for large functions, as otherwise
> compile-time issues appear because the algorithms are non-linear in the
> size of the function.
>
> So it might even not be profitable in the end, for size and speed
> reasons.

...where one should keep in mind that this is one of those areas where
GCC is still at least a decade behind the best compilers in the industry.
The optimizations that cut themselves off would work just fine on regions
instead of whole functions. Another thing that might be helpful is
partial inlining (e.g.
http://www.csc.villanova.edu/~tway/publications/wayPDPTA02.pdf, although
I suspect that for the code from Toon only whole-function inlining is
useful...?).

Zadeck had code for structural analysis a couple of years ago. I don't
think anyone has seriously worked with that to experiment with
region-based compilation. But I guess it will be the Next Big Challenge
for GCC, after LTO.

Ciao!
Steven
Re: Whole program optimization and functions-only-called-once.
Richard Guenther wrote:

> 2009/11/14 Toon Moene :
> > However, my endeavour is to boldly go where no inliner has gone
> > before, and implement -falways-inline-functions-only-called-once,
> > along the following lines: ...
> > (Sugg. b. Rich. G.), because inlining functions that are only called
> > once is always profitable (in number of instructions saved).
>
> ;)
>
> Note that some optimizers (for example value-numbering) contain
> cut-offs so that they are turned off for large functions, as otherwise
> compile-time issues appear because the algorithms are non-linear in the
> size of the function.

As you correctly note, this is a tongue-in-cheek remark - anyway, we
(meaning, I) first have to find out why an executable, thus constructed,
gets execution times for a time step (the "unit-of-work") between 61 and
94 seconds, something that should be close to the same on every time
step.

--
Toon Moene - e-mail: t...@moene.org - phone: +31 346 214290
Saturnushof 14, 3738 XG Maartensdijk, The Netherlands
At home: http://moene.org/~toon/
Progress of GNU Fortran: http://gcc.gnu.org/gcc-4.5/changes.html
Re: Whole program optimization and functions-only-called-once.
On Sat, Nov 14, 2009 at 2:13 PM, Steven Bosscher wrote:
> On Sat, Nov 14, 2009 at 8:51 PM, Richard Guenther wrote:
> > Note that some optimizers (for example value-numbering) contain
> > cut-offs so that they are turned off for large functions, as
> > otherwise compile-time issues appear because the algorithms are
> > non-linear in the size of the function.
> >
> > So it might even not be profitable in the end, for size and speed
> > reasons.
>
> ...where one should keep in mind that this is one of those areas where
> GCC is still at least a decade behind the best compilers in the
> industry. The optimizations that cut themselves off would work just
> fine on regions instead of whole functions. Another thing that might be
> helpful is partial inlining (e.g.
> http://www.csc.villanova.edu/~tway/publications/wayPDPTA02.pdf,
> although I suspect that for the code from Toon only whole-function
> inlining is useful...?).

Indeed. For Toon it shouldn't really matter whether the functions are
inlined or not - aliasing shouldn't be an issue here due to Fortran
semantics. Maybe it's alignment ... With IPA-PTA, aliasing shouldn't be
an issue for C or C++ either; the alignment issue remains though.

> Zadeck had code for structural analysis a couple of years ago. I don't
> think anyone has seriously worked with that to experiment with
> region-based compilation. But I guess it will be the Next Big Challenge
> for GCC, after LTO.

Yeah, I have some patches for the SSA propagators, but those are not the
problematic ones with respect to compile time. Value-numbering cuts
itself off at a certain SCC size, which I suspect cannot be easily fixed
with regions (regions probably can't really cross SCCs). I don't even
remember which other passes have this kind of cut-off ...

Richard.

> Ciao!
> Steven
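As a footnote on the value-numbering cut-off: it is exposed as a --param,
so the threshold can at least be experimented with from the command line
(double-check the parameter name against your GCC version; the value and
file name here are arbitrary examples):

  # Raise the SCC value-numbering size limit, trading compile time for
  # more FRE/PRE opportunities in very large functions.
  gcc -O2 --param sccvn-max-scc-size=100000 -c big_function.c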