Re: Long paths with ../../../../ throughout
Hi Ian,

Thank you for your reply.

Ian Lance Taylor wrote:
> Jon Grant writes:
>> I see that some of the files are located in the -L library directory
>> specified, crtbegin.o, crtend.o in which case, perhaps they both do
>> not need their full long path specified.
>
> Most linkers do not use the -L path to search for file names on the
> command line.

OK.

>> Also I notice lots of duplicate parameters:
>>
>> Is this directory really needed twice?
>> -L/usr/lib/gcc/i486-linux-gnu/4.3.3 -L/usr/lib/gcc/i486-linux-gnu/4.3.3
>
> No. I would encourage you to investigate why it is happening.

I tried:

gcc -o t -Wl,-debug test.c

I see collect2 gets the duplicates passed to it, and then it passes them on to ld.

>> I would have thought that if collect2 was compiled with define
>> LINK_ELIMINATE_DUPLICATE_LDIRECTORIES it would strip out the
>> duplicate parameters before calling ld. It does not appear to be
>> switched on in this Ubuntu package I am using though. Is it on by
>> default?
>
> No. It was introduced only to avoid an error in the linker in some
> version of SGI Irix. Generally the duplicate -L option does no harm.
>
> I was actually thinking along the lines of eliminating it earlier in
> the process. Why does the directory get in there twice in the first
> place?

OK, yes I agree, the earlier the better. However, I don't (yet) know enough about GCC, and I don't have time to scratch this itch currently.

>>> To see what collect2 is doing, use -Wl,-debug
>>
>> If I add this to my existing command line I see there is not any
>> output:
>>
>> $ gcc -### -o t -Wl,-debug test.c
>>
>> If I change to not have -### I see it does work, not sure why.
>
> -### controls the gcc driver, not the collect2 program.

Ok, I realised -### means the commands are not executed, which explains why collect2 output was not visible.

>> So I understand that this passes -debug to collect2. As collect2
>> only has -v mode to display version. Would a patch to add --help to
>> it be supported? Also could describe something about collect2's
>> purpose at the top of that --help output.
>
> I think that ordinary uses of -Wl,--help will expect to see the
> --help option for the linker, not for collect2. That said, I think it
> would be OK to add a --help option for collect2 which issued some
> output and then went on to invoke the linker.

OK, I'll prepare a patch for this change. Also I'd like to add a --version alias of the current -v too.

>> 1) collect.c:scan_libraries may not find ldd, in which case it
>> displays message on output, and returns as normal. Should it not be
>> fatal if ldd is required?
>
> It seems to me that it gives an error message, which should cause
> collect2 to exit with a non-zero status. Does that not happen for
> you? Note that ldd is only required on HP/UX.

Just checked again, you are correct.

>> 2) in collect2.c:main "-debug" is checked, and variable debug set to
>> 1 (perhaps that should be "true" to match the style of other flags)
>
> Yes, and debug should be changed from int to bool.

Ok, I'll reply with a patch soon.

Something else: as there isn't a man page for collect2, could one be created? I don't know if the -Wl,-debug option is documented somewhere else currently. This is the only page I found:

http://gcc.gnu.org/onlinedocs/gccint/Collect2.html

Is there a way to get collect2 to save the temporary .c file it generates to have a look at it? I believe it may be the __main() function. With the -debug option it gives the attached gplusplus_collect2_log.txt; looking at the [/tmp/ccyBAI9V.c] file though, it is empty. Any ideas?

I'm trying to build GCC trunk, with the line below. I've installed the mpfr, gmp and mpc dev packages for Ubuntu 9.10, so I'm not sure where to investigate next.

$ ./configure --with-mpfr=/usr --with-gmp=/usr -with-mpc=/usr
[snip]
checking for correct version of mpfr.h... yes
checking for the correct version of mpc.h... no
configure: error: Building GCC requires GMP 4.2+, MPFR 2.3.2+ and MPC 0.8.0+.
Try the --with-gmp, --with-mpfr and/or --with-mpc options to specify their locations.
Source code for these libraries can be found at their respective hosting sites as well as at ftp://gcc.gnu.org/pub/gcc/infrastructure/. See also http://gcc.gnu.org/install/prerequisites.html

Thank you for your help so far. Please include my address in replies.

Cheers, Jon

j...@netbook:~/dev$ cat test.cpp
#include <stdio.h>
#include <string>

std::string hello("Hello world!");

int main(void)
{
  printf("%s\n", hello.c_str());
  return 0;
}
j...@netbook:~/dev$ g++ -Wl,-debug -o t test.cpp
Convert string '/usr/lib/gcc/i486-linux-gnu/4.4.1/:/usr/lib/gcc/i486-linux-gnu/4.4.1/:/usr/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.4.1/:/usr/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.4.1/:/usr/lib/gcc/i486-linux-gnu/' into prefixes, separator = ':'
 - add prefix: /usr/lib/gcc/i486-linux-gnu/4.4.1/
 - add prefix: /usr/lib/gcc/i486-linux-gnu/4.4.1/
 - add prefix: /us
Re: Long paths with ../../../../ throughout
Hello Ian

Thank you for your reply.

Ian Lance Taylor wrote:
> Jon writes:
>> Is there a way to get collect2 to save the temporary .c file it
>> generates to have a look at it? I believe it may be the __main()
>> function, with the -debug option it gives the attached
>> gplusplus_collect2_log.txt, looking at the [/tmp/ccyBAI9V.c] file
>> though it is empty, any ideas?
>
> Using -debug will direct collect2 to save the temporary .c file when
> it creates one. However, in ordinary use on GNU/Linux, collect2 will
> never generate a temporary .c file.

Is there any information about how GCC start-up constructors for C/C++ are generated and called before main that you could point me to, please? I'd like to understand how it works.

[.]

> Take a look at the config.log file to see the test that failed.

Thanks, I had libgmp3-dev, libmpfr-dev, libmpc-dev missing. The latter is 0.7-1 in current Ubuntu, so I took it from April's pre-release package (and deps). Not sure if this has been discussed, but my feedback would be for the gcc build not to depend on packages until they are in the mainstream distros.

I've attached the collect2 patch. Let me know what you think of it. If you are happy with the patch, I'll prepare another which changes all the int 0/1 flags to be bool and true/false.

Please include my address in any replies.

Best regards, Jon

2010-02-03  Jon Grant

	* collect2.c: Handle --version as well as -v. Likewise handle
	--help as well as -h. Display Usage when --help given on command
	line. vflag, debug (and additional helpflag) use bool instead of
	int.

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,17 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "collect2 is a GCC utility to arrange and call ");
+      fprintf (stderr, "various initialization functions at start time.\n");
+      fprintf (stderr, "Wrapping the linker and generating an additional ");
+      fprintf (stderr, "temporary `.c' of constructor fnctions if needed.\n");
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " -v, --version Display version\n");
+      fprintf (stderr, " -debug Enable debug output. `gcc -Wl,-debug'\n");
+    }
+
   if (debug)
     {
       const char *ptr;
Re: Long paths with ../../../../ throughout
Updated patch attached which includes collect2.h change to bool.

Please include my address in any replies.

Best regards, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,17 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "collect2 is a GCC utility to arrange and call ");
+      fprintf (stderr, "various initialization functions at start time.\n");
+      fprintf (stderr, "Wrapping the linker and generating an additional ");
+      fprintf (stderr, "temporary `.c' of constructor fnctions if needed.\n");
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " -v, --version Display version\n");
+      fprintf (stderr, " -debug Enable debug output. `gcc -Wl,-debug'\n");
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Long paths with ../../../../ throughout
Hello Ian

Ian Lance Taylor wrote:
[.]
>> I've attached collect2 patch. Let me know what you think of it.
>
> There is actually a GNU standard for --help output, and collect2
> might as well follow it.
>
> http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html

Ok, looks good, I've updated the changes, please find attached revised patch.

> Do you have a copyright assignment/disclaimer with the FSF?

I asked FSF this week, I'm just waiting for the snail mail to arrive. Will post it back as soon as it does.

Cheers, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: http://gcc.gnu.org/\n");
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Long paths with ../../../../ throughout
Hi Ian

Ian Lance Taylor wrote, On 04/02/10 00:48:
> Jon writes:
[.]
>> I've attached collect2 patch. Let me know what you think of it.
>
> There is actually a GNU standard for --help output, and collect2
> might as well follow it.
>
> http://www.gnu.org/prep/standards/html_node/_002d_002dhelp.html

That's good. Please find updated patch attached.

> Do you have a copyright assignment/disclaimer with the FSF?

Just got email notification from FSF that they received my GCC copyright assignment.

Cheers, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: http://gcc.gnu.org/\n");
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Long paths with ../../../../ throughout
Joseph S. Myers wrote, On 20/02/10 11:36:
> On Sat, 20 Feb 2010, Jon wrote:
>> + fprintf (stderr, "Report bugs: http://gcc.gnu.org/\n");
>
> You should use bug_report_url from version.c, which can be controlled
> with --with-bugurl so that distributors only need to use one
> configure option to cause all bug reports for their distributions to
> be directed to themselves.

Good point. Updated patch attached for review.

Cheers, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: %s\n", bug_report_url);
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Long paths with ../../../../ throughout
Hello Ian

Ian Lance Taylor wrote, On 22/02/10 03:26:
> Jon writes:
>> Good point. Updated patch attached for review.
>
> I suppose this counts as a functionality change, and as such should
> not be committed until after the release branch is made.
>
> This is OK when we are back in stage 1, with a ChangeLog entry,
> assuming it passes bootstrap (you didn't say).

collect2_feb_21_help.patch attached again to go with ChangeLog:

2010-03-13  Jon Grant  <0...@jguk.org>

	* collect2.c: debug changed to bool so true/false can be used. bool helpflag added.
	* collect2.c: --version now sets vflag true. --help now sets helpflag true.
	* collect2.c: when --help passed, standard help information is output on stderr
	* collect2.h: vflag changed to bool so true/false can be used.

I think it passes bootstrap. My understanding of what is required:

./configure
make
make bootstrap

I'm new to gcc, so if there are some extra steps to follow, please let me know if there is an FAQ or document describing them.

How long is it until we are back in the stage 1 development phase?

Thanks for reviewing so far.

Cheers, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: %s\n", bug_report_url);
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
Re: Long paths with ../../../../ throughout
Ian Lance Taylor wrote, On 15/03/10 03:12:
> Jon writes:
>> How long is it until back in stage 1 development phase?
>
> Reasonably soon, I hope, but there is no specific schedule.

Hi Ian,

Just wanted to ask if it had been possible to integrate the patch. Would it be useful for me to create a bugzilla ticket and add the patch there?

Cheers, Jon
Re: Long paths with ../../../../ throughout
Hi Manuel

Manuel López-Ibáñez wrote, On 25/04/10 22:00:
[.]
> Jon, would you mind writing a proper Changelog?

I've attached the Changelog I wrote before. I can change it if needed; let me know what info I should add.

> I will test that the patch still passes the regression test and
> commit it for you. OK?

That would be great, thank you.

BTW, I returned the copyright assignment to FSF @ 17 Feb 2010, I think copyright-cl...@fsf.org will be able to confirm this.

Best regards, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: %s\n", bug_report_url);
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;

2010-03-13  Jon Grant  <0...@jguk.org>

	* collect2.c: debug changed to bool so true/false can be used. bool helpflag added.
	* collect2.c: --version now sets vflag true. --help now sets helpflag true.
	* collect2.c: when --help passed, standard help information is output on stderr
	* collect2.h: vflag changed to bool so true/false can be used.
Re: Long paths with ../../../../ throughout
Hi Manuel

Manuel López-Ibáñez wrote, On 25/04/10 22:37:
[.]
> http://gcc.gnu.org/wiki/ChangeLog
>
> Basically, in your case, do not repeat the filename and mention which
> function is affected (if any).

2010-03-13  Jon Grant  <0...@jguk.org>

	* collect2.h: vflag extern changed to bool so true/false can be used.
	* collect2.c: "debug" global variable changed to bool so true/false can be used.
	* "helpflag" bool global variable added.
	* (main) sets "debug" to true instead of 1 when -debug is passed in argv
	* --help now sets "helpflag" to true instead of 1
	* --version now sets "vflag" global bool true instead of 1
	* if "helpflag" is true, standard help information is output on stderr

I have reworked it into that format.

I've not created a PR. As approved, is this ok to go in without a PR?

Cheers, Jon

Index: collect2.c
===
--- collect2.c (revision 156482)
+++ collect2.c (working copy)
@@ -174,7 +174,7 @@
   int number;
 };
 
-int vflag;                      /* true if -v */
+bool vflag;                     /* true if -v or --version */
 static int rflag;               /* true if -r */
 static int strip_flag;          /* true if -s */
 static const char *demangle_flag;
@@ -193,7 +193,8 @@
 /* Current LTO mode.  */
 static enum lto_mode_d lto_mode = LTO_MODE_NONE;
 
-int debug;                      /* true if -debug */
+bool debug;                     /* true if -debug */
+bool helpflag;                  /* true if --help */
 
 static int shared_obj;          /* true if -shared */
 
@@ -1228,7 +1229,7 @@
   for (i = 1; argv[i] != NULL; i ++)
     {
       if (! strcmp (argv[i], "-debug"))
-        debug = 1;
+        debug = true;
       else if (! strcmp (argv[i], "-flto") && ! use_plugin)
         {
           use_verbose = true;
@@ -1458,7 +1459,7 @@
           if (use_verbose && *q == '-' && q[1] == 'v' && q[2] == 0)
             {
               /* Turn on trace in collect2 if needed.  */
-              vflag = 1;
+              vflag = true;
             }
         }
       obstack_free (&temporary_obstack, temporary_firstobj);
@@ -1588,7 +1589,7 @@
 
             case 'v':
               if (arg[2] == '\0')
-                vflag = 1;
+                vflag = true;
               break;
 
             case '-':
@@ -1619,6 +1620,10 @@
                 }
               else if (strncmp (arg, "--sysroot=", 10) == 0)
                 target_system_root = arg + 10;
+              else if (strncmp (arg, "--version", 9) == 0)
+                vflag = true;
+              else if (strncmp (arg, "--help", 9) == 0)
+                helpflag = true;
               break;
             }
         }
@@ -1720,6 +1725,20 @@
       fprintf (stderr, "\n");
     }
 
+  if (helpflag)
+    {
+      fprintf (stderr, "Usage: collect2 [options]\n");
+      fprintf (stderr, " Wrap linker and generate constructor code if needed.\n");
+      fprintf (stderr, " Options:\n");
+      fprintf (stderr, " -debug Enable debug output\n");
+      fprintf (stderr, " --help Display this information\n");
+      fprintf (stderr, " -v, --version Display this program's version number\n");
+      fprintf (stderr, "Overview: http://gcc.gnu.org/onlinedocs/gccint/Collect2.html\n");
+      fprintf (stderr, "Report bugs: %s\n", bug_report_url);
+
+      collect_exit (0);
+    }
+
   if (debug)
     {
       const char *ptr;
Index: collect2.h
===
--- collect2.h (revision 156482)
+++ collect2.h (working copy)
@@ -38,7 +38,7 @@
 extern const char *c_file_name;
 extern struct obstack temporary_obstack;
 extern char *temporary_firstobj;
-extern int vflag, debug;
+extern bool vflag, debug;
 
 extern void error (const char *, ...) ATTRIBUTE_PRINTF_1;
 extern void notice (const char *, ...) ATTRIBUTE_PRINTF_1;
RFA; DFP and REAL_TYPE?
So I've been looking at using REAL_TYPE to represent decimal floating point values internally (to implement the C extensions for decimal floating point). I believe David and yourself had some discussions on this a short time back.

Anyway, I've now had a chance to play with this a bit, but I'm not quite sure how well I like the way it's coming out (though the alternative of introducing a new type seems worse, imo).

Warning: My thinking is likely clouded by a goal to wire in the decNumber routines to implement the algorithms/encodings for decimal floats (still working through permissions for this to happen though).

First, I think we need to avoid going into the GCC REAL internal binary float representation for decimal floats. I'm guessing that going into the binary representation, performing various arithmetic operations, and then eventually dropping back out to decimal float will introduce exactly the errors that decimal float is meant to avoid in the first place.

I'm looking for advice on going forward. I've already hacked up real_value.sig to hold a decimal128 encoded value. This is fugly, and obviously all sorts of things in real.c would break if I started using the various functions for real. But before I put any significant work into the REAL_TYPE path, I thought it best to get guidance.

1) Stick with REAL_TYPE, or is it hopeless and I should create DFLOAT_TYPE?

2) If the recommendation is to stick with REAL_TYPE, is it ok to have some other internal representation?

3) Is there a preferred way to override real_value functions? I'm assuming that even if I use the real_value->sig field to hold the coefficient rather than the ugly hack of holding a decimal128, I'll need to override various functions in real.c to 'do the right thing' for radix 10 reals. I could add a field to real_value to point to a function table that, if present, is called through. Or simply add various "if (r->b == 10)" checks throughout real.c. Or other.

Thoughts/concerns/questions/advice?
Best Regards, Jon Grimm IBM Linux Technology Center.
Re: RFA; DFP and REAL_TYPE?
Mark Mitchell wrote: Robert Dewar wrote: Mark Mitchell wrote: I would expect that some decimal floating point values are not precisely representable in the binary format. OK, I agree that decimal floating-point needs its own format. But still you can store the decimal mantissa and decimal exponent in binary format without any problem, and that's probably what you want to do on a machine that does not have native decimal format support. I would think that, as elsewhere in real.c, you would probably want to use the same exact bit representation that will be used on the target. This is useful so that you can easily emit assembly literals by simply printing the bytes in hex, for example. Of course, you could do as you suggest (storing the various fields of the decimal number in binary formats), and, yes, on many host machines that would in more efficient internal computations. But, I'm not confident that the savings you would get out of that would outweigh the appeal of having bit-for-bit consistency between the host and target. In full disclosure, the 754r encoding is pretty ugly: high order bits from exponent and coefficient are packed into one field. The remaining bits of the exponent are in a second field, and the remaining bits of the coeffecient are compressed decimal encodings. So you pretty much have to come out of that encoding to do much useful with it. Honestly, I've been most just used the real decimal128 encoding as it was easiest to integrate with decNumber. decNumber has yet another architecture neutral format it does its computations in, but that didn't fit in the real_type, so I just used decimal128 as it 'fit' and decNumber could internally deal with that format already. In any case, this is rather a detail; the key decision Jon is trying to make is whether or not he has to introduce a new format in real.c, together with new routines to perform oeprations on that format, to which I think we agree the answer is in the affirmative. Yes. Thanks! 
This is exactly the discussion I was interested in (and to validate that my thinking was not totally off kilter). The specific internal representation can change once real.c is safe for some different representation. -- Jon Grimm <[EMAIL PROTECTED]>
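The two points in this thread (binary floating point cannot represent every decimal value exactly, while a decimal coefficient and exponent stored in ordinary binary integers can) can be sketched as follows. This is only an illustration: the `Decimal` struct and the helpers `add`, `equal` and `binary_sum_is_exact` are hypothetical names, and the code is not taken from real.c, decNumber, or the 754r encoding.

```cpp
#include <cstdint>

// Hypothetical host-side decimal value: value = coefficient * 10^exponent.
// Both fields are plain binary integers, as suggested in the thread.
struct Decimal {
    std::int64_t coefficient;
    int exponent;
};

// Add two decimals by scaling both to the smaller exponent.
// Sketch only: no overflow or rounding handling.
Decimal add(Decimal a, Decimal b) {
    while (a.exponent > b.exponent) { a.coefficient *= 10; --a.exponent; }
    while (b.exponent > a.exponent) { b.coefficient *= 10; --b.exponent; }
    return {a.coefficient + b.coefficient, a.exponent};
}

// Compare two decimals numerically after aligning their exponents.
bool equal(Decimal a, Decimal b) {
    while (a.exponent > b.exponent) { a.coefficient *= 10; --a.exponent; }
    while (b.exponent > a.exponent) { b.coefficient *= 10; --b.exponent; }
    return a.coefficient == b.coefficient;
}

// The same sum in binary double precision: 0.1, 0.2 and 0.3 are all
// rounded on conversion, so the comparison is not exact.
bool binary_sum_is_exact() {
    return 0.1 + 0.2 == 0.3;
}
```

With this representation, 0.1 is held as coefficient 1 and exponent -1, so adding 0.1 and 0.2 gives exactly 0.3, which the binary `double` computation does not.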
Re: libstdc++ link failures on ppc64
Diego Novillo wrote: I see no changes in libstdc++ since the previous run and nothing in the C++ FE, so I'm not sure whether it may be something broken in my box. Anybody else seeing this failure? Yep. I see this here on the PPC64 nightly autotester. Br, Jon
Register allocation in GCC 4
Hi, I'm updating a GCC port to 4.0.0. I am seeing a problem whereby registers that are set to 1 in fixed_regs are being used. The problem is occurring quite early on in the compiler, as the registers appear in the 00.expand dump. The problem seems to occur for a DCmode value that is being allocated to several registers. The first 4 of these registers are not in fixed_regs, but the last 4 are (regs are 16-bit). I have made sure HARD_REGNO_MODE_OK for all these registers returns 0, but that hasn't had an effect. Can anyone suggest where I need to be looking to track this down? Cheers, Jon
RE: Register allocation in GCC 4
Hi Nathan, > I guess > it must be to do with function calling Good call. I screwed up the conversion from FUNCTION_ARG_PARTIAL_NREGS to TARGET_ARG_PARTIAL_BYTES. Cheers, Jon
Store scheduling with DFA scheduler
Hi, I'm trying to get the DFA scheduler in GCC 4.0.0 to schedule loads and stores, but I can only get it to work for loads. I have an automaton defined as follows:

    (define_automaton "cpu")
    (define_cpu_unit "x" "cpu")
    (define_cpu_unit "m" "cpu")
    (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x")
    (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m")
    (define_insn_reservation "stores" 3 (eq_attr "type" "store") "x,m*2")

All instructions take one cycle in "x". Loads then take one "m" cycle, while stores take two "m" cycles. Basically stores aren't fully pipelined. If I compile the following code:

    int x, y, z, w;
    void main() { x = x + 1; y = y + 1; z = z + 1; w = w + 1; }

I get the following output:

    lhu r4, [x]
    lhu r5, [y]
    lhu r6, [z]
    lhu r7, [w]
    add r4, 1
    add r5, 1
    add r6, 1
    add r7, 1
    sh [x], r4
    sh [y], r5
    sh [z], r6
    sh [w], r7

This therefore seems to be scheduling loads correctly, as before I added the automaton I was getting adds immediately following loads, but doesn't seem to be scheduling the stores correctly, as they are scheduled in consecutive slots. I would expect the optimal schedule to be something along the lines of:

    lhu r4, [x]
    lhu r5, [y]
    lhu r6, [z]
    lhu r7, [w]
    add r4, 1
    sh [x], r4
    add r5, 1
    sh [y], r5
    add r6, 1
    sh [z], r6
    add r7, 1
    sh [w], r7

I'd be grateful for any suggestions as to what the problem might be. Cheers, Jon
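To make the expected difference between the two schedules concrete, the resource reservations above can be modelled with a small in-order issue simulation. This is a sketch under stated assumptions, not GCC's scheduler: it models only the "x" and "m" unit reservations ("x", "x,m", "x,m*2"), ignores data dependences and result latencies, and the names `Kind`, `reservation` and `issue_cycles` are hypothetical.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

enum class Kind { Arith, Load, Store };

// One character per cycle after issue: which unit is reserved in that cycle.
// "x,m*2" from the machine description becomes "xmm".
std::string reservation(Kind kind) {
    switch (kind) {
        case Kind::Arith: return "x";
        case Kind::Load:  return "xm";
        case Kind::Store: return "xmm";
    }
    return "";
}

// Issue the instructions in the given order, each at the earliest cycle in
// which none of its reserved units is already busy. Returns the issue cycles.
std::vector<int> issue_cycles(const std::vector<Kind>& sequence) {
    std::set<std::pair<int, char>> busy;  // (cycle, unit) pairs already taken
    std::vector<int> cycles;
    int t = 0;
    for (Kind kind : sequence) {
        const std::string units = reservation(kind);
        auto conflicts = [&](int start) {
            for (std::size_t i = 0; i < units.size(); ++i)
                if (busy.count({start + static_cast<int>(i), units[i]}))
                    return true;
            return false;
        };
        while (conflicts(t)) ++t;  // stall until every unit is free
        for (std::size_t i = 0; i < units.size(); ++i)
            busy.insert({t + static_cast<int>(i), units[i]});
        cycles.push_back(t);
    }
    return cycles;
}
```

Under this model, the grouped order (four loads, four adds, four stores) issues its last store in cycle 14, because each store has to wait for the previous one to release "m". The interleaved order (four loads, then add/store pairs) issues its last store in cycle 11.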
Side-effect latency in DFA scheduler
Hi, How is the latency of instructions that have side effects modeled in the DFA scheduler. For example, define_insn_reservation only has one latency value, yet instructions such as loads with post increment addressing have two outputs, possibly with different latencies. Do both outputs get the same latency? Is there an option similar to -dp that outputs what latency the compiler has used for each instruction? Cheers, Jon
RE: Store scheduling with DFA scheduler
> Jon, > > (define_insn_reservation "arith" 1 (eq_attr "type" "arith") "x") > > (define_insn_reservation "loads" 2 (eq_attr "type" "load") "x,m") > > (define_insn_reservation "stores" 3 (eq_attr "type" > "store") "x,m*2") > > Stores don't really have a 'result', why have you set the > cycle count to 3? Shouldn't it be '1'? (then you won't need > store bypasses for autoincrements) Primilary because that's how it appears to be coded in the ARM port (e.g store_wbuf in arm-generic.md). I had tried both ways though, and for this particular problem, changing this value appears to have no effect. I can see that it would for autoinc though. Cheers, Jon
RE: Store scheduling with DFA scheduler
Hi Vlad, > There is not enough information to say what is wrong. It > would be better if you send gcc output when > -fsched-verbose=10 is used. Cheers, Jon ;; == ;; -- basic block 0 from 18 to 32 -- before reload ;; == ;; --- forward dependences: ;; --- Region Dependences --- b 0 bb 0 ;; insn codebb dep prio cost reservation ;; -- --- --- ;; 1810 0 0 6 2 x,m : 20 19 ;; 1992 0 1 4 1 x : 20 ;; 2010 0 2 3 3 x,m*2 : ;; 2210 0 0 6 2 x,m : 24 23 ;; 2392 0 1 4 1 x : 24 ;; 2410 0 2 3 3 x,m*2 : ;; 2610 0 0 6 2 x,m : 28 27 ;; 2792 0 1 4 1 x : 28 ;; 2810 0 2 3 3 x,m*2 : ;; 3010 0 0 6 2 x,m : 32 31 ;; 3192 0 1 4 1 x : 32 ;; 3210 0 2 3 3 x,m*2 : ;; Ready list after queue_to_ready:30 26 22 18 ;; Ready list after ready_sort:30 26 22 18 ;; Ready list (t = 0):30 26 22 18 ;;0--> 18 r41=[`x'] :x,m ;; dependences resolved: insn 19 into queue with cost=2 ;; Ready-->Q: insn 19: queued for 2 cycles. ;; Ready list (t = 0):30 26 22 ;; Ready list after queue_to_ready:30 26 22 ;; Ready list after ready_sort:30 26 22 ;; Ready list (t = 1):30 26 22 ;;1--> 22 r43=[`y'] :x,m ;; dependences resolved: insn 23 into queue with cost=2 ;; Ready-->Q: insn 23: queued for 2 cycles. ;; Ready list (t = 1):30 26 ;; Q-->Ready: insn 19: moving to ready without stalls ;; Ready list after queue_to_ready:19 30 26 ;; Ready list after ready_sort:19 30 26 ;; Ready list (t = 2):19 30 26 ;;2--> 26 r45=[`z'] :x,m ;; dependences resolved: insn 27 into queue with cost=2 ;; Ready-->Q: insn 27: queued for 2 cycles. ;; Ready list (t = 2):19 30 ;; Q-->Ready: insn 23: moving to ready without stalls ;; Ready list after queue_to_ready:23 19 30 ;; Ready list after ready_sort:23 19 30 ;; Ready list (t = 3):23 19 30 ;;3--> 30 r47=[`w'] :x,m ;; dependences resolved: insn 31 into queue with cost=2 ;; Ready-->Q: insn 31: queued for 2 cycles. 
;; Ready list (t = 3):23 19 ;; Q-->Ready: insn 27: moving to ready without stalls ;; Ready list after queue_to_ready:27 23 19 ;; Ready list after ready_sort:27 23 19 ;; Ready list (t = 4):27 23 19 ;;4--> 19 {r41=r41+0x1;clobber System, CC.A;}:x ;; dependences resolved: insn 20 into queue with cost=1 ;; Ready-->Q: insn 20: queued for 1 cycles. ;; Ready list (t = 4):27 23 ;; Q-->Ready: insn 20: moving to ready without stalls ;; Q-->Ready: insn 31: moving to ready without stalls ;; Ready list after queue_to_ready:31 20 27 23 ;; Ready list after ready_sort:20 31 27 23 ;; Ready list (t = 5):20 31 27 23 ;;5--> 23 {r43=r43+0x1;clobber System, CC.A;}:x ;; dependences resolved: insn 24 into queue with cost=1 ;; Ready-->Q: insn 24: queued for 1 cycles. ;; Ready list (t = 5):20 31 27 ;; Q-->Ready: insn 24: moving to ready without stalls ;; Ready list after queue_to_ready:24 20 31 27 ;; Ready list after ready_sort:24 20 31 27 ;; Ready list (t = 6):24 20 31 27 ;;6--> 27 {r45=r45+0x1;clobber System, CC.A;}:x ;; dependences resolved: insn 28 into queue with cost=1 ;; Ready-->Q: insn 28: queued for 1 cycles. ;; Ready list (t = 6):24 20 31 ;; Q-->Ready: insn 28: moving to ready without stalls ;; Ready list after queue_to_ready:28 24 20 31 ;; Ready list after ready_sort:28 24 20 31 ;; Ready list (t = 7):28 24 20 31 ;;7--> 31 {r47=r47+0x1;clobber System, CC.A;}:x ;; dependences resolved: insn 32 into queue with cost=1 ;; Ready-->Q: insn 32: queued for 1 cycles. ;; Ready list (t = 7):28 24 20 ;; Q-->Ready: insn
Selective Mudflap
Hi, I'm trying to debug a large C application that (amongst other things) starts a JVM and uses Java's JDBC to connect to databases via JNI. If I use the sourceforge bounds checking patch I get a sensible list of errors (none from the JVM). I'd also like to use Mudflap however running the program with mudflap generates huge numbers of errors caused by the (uninstrumented) libjvm.so. I don't much care about errors in the JVM, whether they are real or imagined - I'm not going to be altering that code, is there any way to ask mudflap to suppress errors from all uninstrumented code or from a certain library or use a valgrind style suppressions file etc. Any advice appreciated. Jon.
Re: Selective Mudflap
"Frank Ch. Eigler" <[EMAIL PROTECTED]> writes: > "Jon Levell" <[EMAIL PROTECTED]> writes: > > I'm trying to debug a large C application that (amongst other > > things) starts a JVM and uses Java's JDBC to connect to > > databases via JNI. [..] > > of errors (none from the JVM). I'd also like to use Mudflap however > > running the program with mudflap generates huge numbers of errors > > caused by the (uninstrumented) libjvm.so. [...] > > Do these errors arise from malloc-type operations performed by the > JVM? Or from your code's use of JVM-provided pointers? Sadly, there The errors stem from inside the JVM. I presume when it is using pointers that the C application has provided because it was't compiled with mudflap itself. (I'm new to mudflap but the violations claim to be of type "register"). > is no valgrind-style exclusion facility around. However, if the JVM > interface is used predominantly in one direction (C code calling into > the JVM), it may be possible to programatically turn off mudflap > enforcement when your code is about to jump into the jvm. Maybe There is quite a lot of interaction so for now I'll use a script to post-process the Mudflap report. Because Mudflap is OSS, if someone else doesn't do it first, I might at some point add some simple way to exclude violations but that won't be any time soon - things are hectic here at the moment. Thank you very much for your prompt response and for Mudflap, it seems to be a very clever piece of software. Jon.
[URGENT] GCC 4.0 Nomination
Folks, GCC 4.0 has been shortlisted in this year's Linux Awards: * http://www.linuxawards.co.uk/content/view/14/40/ * Best Linux/Open Source Developer Tool. BUT...none of the folks in the office can apparently contact anyone about being available to attend the dinner next week in London, UK. Can someone suitably involved with the project urgently contact either myself or (preferably) Maggie Meer <[EMAIL PROTECTED]> about this? Cheers! Jon.
Re: [URGENT] GCC 4.0 Nomination
On Sun, Oct 02, 2005 at 04:50:41PM +0100, Andrew Haley wrote: jcm>> GCC 4.0 has been shortlisted in this year's Linux Awards: jcm>> * http://www.linuxawards.co.uk/content/view/14/40/ > Going from the mailing lists there are about ten of us heavily > involved in gcc here in the UK. I'm not sure how you'd choose > someone, given that gcc is a collective effort. Absolutely. This is the situation and we're very much aware that none of the projects which have been nominated for Wednesday's Awards are one man bands - you'll see we have others like Eclipse, Mozilla, etc. etc. all in there and all of them are multinational efforts. But we need to find someone who is comfortable to accept the award on behalf of the team - otherwise GCC will be the only project which is not represented at the event. We understand the value of GCC (do a Google search on my name for example) and just want to give recognition. Jon.
Re: [URGENT] GCC 4.0 Nomination
On Tue, Oct 04, 2005 at 01:00:53PM +0100, Joern RENNECKE wrote: > I could make it there, but I'd have to leave shortly after 11 p.m., > since the last > train from paddington to bristol goes at half past eleven. That would be fine. I've spoken to the organisers and have had you added to the list of people attending the dinner. They suggest that you go to the stand of our magazine - Linux User & Developer - during the afternoon and pick up your pass or just turn up at 19:00 for the ceremony. It's a black tie event and apparently that means we have to wear a suit or something (I've got to go and dig one out myself tonight). Jon.
Installing GCC 4.1-20051223 on FreeBSD 6 failed
I'm trying to install the GCC 4.1 snapshot from Dec 23, 2005 on my FreeBSD box. I'm trying to try out gcj. The installation fails, complaining about not enough virtual memory. I just added another 2GB swap file on this box. I now have 1GB of physical RAM and 4GBs of swap. And that's not enough?! What do I need to do to get gcj and libjava to install? Is this a problem with GNU make or GCC? Thanks! Jon Brisbin Webmaster NPC International, Inc.
Re: Installing GCC 4.1-20051223 on FreeBSD 6 failed
Update: Just tarred everything up and stuck it on one of my servers, which has 4GBs of physical RAM and 2GBs of swap. Same problem: "virtual memory exhausted". If 6GBs isn't enough, then I'm out of ideas. I tried patching make with a patch I found on the make ML archives. No dice. Checked out make from CVS but the build is horribly broken (missing .po files and other such garbage). What do I do now? Thanks! Jon Brisbin Webmaster NPC International, Inc. Jon Brisbin wrote: I'm trying to install the GCC 4.1 snapshot from Dec 23, 2005 on my FreeBSD box. I'm trying to try out gcj. The installation fails, complaining about not enough virtual memory. I just added another 2GB swap file on this box. I now have 1GB of physical RAM and 4GBs of swap. And that's not enough?! What do I need to do to get gcj and libjava to install? Is this a problem with GNU make or GCC? Thanks! Jon Brisbin Webmaster NPC International, Inc.
Re: Installing GCC 4.1-20051223 on FreeBSD 6 failed
What parameter do I put into loader.conf to do that? I did some googling and the kern.maxdsiz parameter I found a reference to didn't work. Where do I find that information? If I were going to compile it with the Doug Lea malloc, would I need to recompile GCC? Thanks! Jon Brisbin Webmaster NPC International, Inc. H. J. Lu wrote: On Fri, Dec 30, 2005 at 10:53:43AM -0600, Jon Brisbin wrote: Update: Just tarred everything up and stuck it on one of my servers, which has 4GBs of physical RAM and 2GBs of swap. Same problem: "virtual memory exhausted". If 6GBs isn't enough, then I'm out of ideas. I tried patching make with a patch I found on the make ML archives. No dice. Checked out make from CVS but the build is horribly broken I am assuming that you have applied my make patch. Even with my make patch applied, make still uses lots of memory. Please make sure your memory limit is big enough. I have bash-3.00$ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited file size (blocks, -f) unlimited pending signals (-i) 16372 max locked memory (kbytes, -l) 32 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size(512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 16372 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited H.J.
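The "data seg size" and "virtual memory" rows in the `ulimit -a` output above correspond to per-process resource limits that a program can also query for itself. A minimal sketch, assuming a POSIX system that provides `getrlimit` with `RLIMIT_DATA` (and `RLIMIT_AS` where available); the helper name `describe_limit` is hypothetical.

```cpp
#include <sys/resource.h>

#include <string>

// Return the current (soft) limit for a resource in the same style as
// `ulimit`: "unlimited" or a size in kbytes.
std::string describe_limit(int resource) {
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0) {
        return "unknown";  // the query itself failed
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return "unlimited";
    }
    return std::to_string(static_cast<unsigned long long>(rl.rlim_cur) / 1024) +
           " kbytes";
}
```

Calling `describe_limit(RLIMIT_DATA)` from the shell that runs the build shows whether the data segment limit, rather than the amount of RAM and swap, is what is being exhausted.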
Scoping bug in gcc 4.0.1
Hi, Can somebody tell me whether there is a known bug in g++ 4.0.1 wrt scoping of members of a template base class. The following contrived test case generates a compiler error on 4.0.1, complaining that 'a' is not in the scope of D::f() template class T { protected: bool a; }; template class D : public T { public: void f(void) { a = true; } }; int main() { D i; i.f(); return 0; } s.cxx: In member function 'void D::f()': s.cxx:12: error: 'a' was not declared in this scope This code was accepted just fine on gcc 3.2.3, so could be a new bug, or a deliberate change to the scoping rules ? Thanks, Jon Bloomfield Architect 3Dlabs UK Ltd
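The test case above has lost its template parameter lists in archiving, so the following is a reconstruction under assumptions: the parameter name `X`, the in-class initializer, and the accessor `value()` are not from the original post. It sketches the usual explanation for this error: under standard two-phase name lookup, which GCC has enforced since the 3.4 series, an unqualified name such as `a` is not looked up in a base class that depends on a template parameter. Qualifying the name with `this->` defers the lookup until instantiation.

```cpp
template <typename X>
class T {
protected:
    bool a = false;
};

template <typename X>
class D : public T<X> {
public:
    // Writing plain `a = true;` here is rejected, because T<X> is a
    // dependent base class and unqualified lookup does not search it.
    // `this->a` makes the name dependent, so it is found at instantiation.
    void f() { this->a = true; }

    bool value() const { return this->a; }
};
```

Writing `T<X>::a`, or adding `using T<X>::a;` to `D`, are alternative ways to achieve the same effect.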
LRA - insn does not satisfy its constraints
Hi, I've been looking at updating some of the targets to use LRA. On some targets that have a one register & immediate instruction format (i.e. dest register and source register are the same), I see errors such as: error: insn does not satisfy its constraints: (insn 2 7 5 2 (set (reg/f:SI 12 r12 [39]) (plus:SI (reg/f:SI 15 sp) (const_int 4 [0x4]))) file.c 6 {addsi3} (nil)) internal compiler error: in reload_cse_simplify_operands, at postreload.c:411 Where the instruction pattern has a match operand constraint ("0"): (define_insn "addsi3" [(set (match_operand:SI 0 "register_operand" "=r,r,r,r,r") (plus:SI (match_operand:SI 1 "register_operand" "%0,0,0,0,0") (match_operand:SI 2 "reg_si_int_operand" "r,M,N,O,i")))] "" "addd\t%2, %0" ) This seems to work fine for a number of targets without LRA, but fails with it. Any pointers as to what needs to be changed to get this to work? Does the instruction now need additional alternatives to handle the case where the two registers are different, or are there some target hooks other than TARGET_LRA_P that need to be implemented? Thanks, Jon
pre_modify/post_modify with scaled register
Hi, The gccint docs for pre_modify/post_modify say that the address modifier must be one of three forms: (plus:m x z), (minus:m x z), or (plus:m x i), where z is an index register and i is a constant. Why isn't (plus:m x (mult:m z i)) supported, for architectures that support scaling of the index register (e.g. ARM)? Compiling: int *f(int *p, int x, int z) { p[z] = x; return p + z; } for ARM results in: str r1, [r0, r2, asl #2] add r0, r0, r2, asl #2 rather than just: str r1, [r0, r2, asl #2]! Should this be improved by expanding what pre/post_modify supports, as above, or perhaps a peephole optimisation? Cheers, Jon
GAS GCC FAQ query
Hello Just looking at this page: http://gcc.gnu.org/faq.html#gas I saw this text "(the GNU loader)". Is this really an alternative name for gas? I've not seen it called GNU loader elsewhere. I was wondering if the text could just be removed. Please keep my email address in any replies. Best regards, Jon
gcc detect multiple -o passed on one command line
Hello Is it expected that more than one -o option should be allowed by GCC on command line? The later -o option overriding earlier. I had expected the parameter checking to detect this duplication of options. gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3 $ gcc -W -Wall -o main main.c -omup.o $ ls main.c mup.o I can create a bug ticket if needed. Let me know. Please keep my email address included in any replies. Best regards, Jon
Re: GAS GCC FAQ query
Hello. Thank you for your reply. Jonathan Wakely wrote, On 05/05/11 22:47: On 5 May 2011 22:30, Jon Grant wrote: Hello Just looking at this page: http://gcc.gnu.org/faq.html#gas I saw this text "(the GNU loader)". Is this really an alternative name for gas? I've not seen it called GNU loader elsewhere. I was wondering if the text could just be removed. It refers to the linker, ld, not to gas, and shouldn't be removed. The parenthesized text represents an alternation, the paragraph should be read as referring to the GNU assembler or to the GNU linker (aka loader). I have not read "ld" called a "GNU loader" in binutils documentation, the common name I have seen is "GNU linker". (I do recall "ld" is an abbreviation for "load" though.) I would propose to clarify as: "To ensure that GCC finds the GNU assembler (or the GNU linker)," Best regards, Jon
Re: GAS GCC FAQ query
Gerald Pfeifer wrote, On 08/05/11 14:02: On Fri, 6 May 2011, Jonathan Wakely wrote: I would propose to clarify as: "To ensure that GCC finds the GNU assembler (or the GNU linker)," I see no harm in that change, Gerald, what do you think? Agreed. Things would have been different twenty years ago, but these days using linker is a lot more natural and common (as a grep in gcc/doc confirms, too). I went ahead and applied the patch below. Thanks for suggesting this! Great the change has been made Gerald, Jonathan, thank you. Best regards, Jon
Re: gcc detect multiple -o passed on one command line
Dave Korn wrote, On 07/05/11 16:01: On 06/05/2011 09:00, Andreas Schwab wrote: Ian Lance Taylor writes: The difference is that with -E the -o option is passed to cc1, whereas without it the -o option is passed to the assembler or the linker. The GNU assembler and linker both have the usual Unix behaviour of only using the last -o option. Nevertheless it might be a good idea to file a bug for binutils. Consistency is probably more important, and it helps in case of typos. In this case, I don't think consistency should win over maintaining long-established behaviour. I'm more inclined to say that cc1 should change to follow long-established *nix tradition. (I have absolutely found it useful on at least one occasion to be able to add a -o option into CFLAGS and know it would come last on the command-line and win.) Hello Would it be useful to have an option to enable warning if there are duplicates? From my point of view, I feel that not warning duplicates may let mistakes in the way gcc is invoked slip through, e.g. assist tracking down these issues in makefiles. Best regards, Jon
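The duplicate check proposed in this thread could also live outside the compiler, for example in a build wrapper script or tool. The following is a minimal sketch of such a check and not the GCC driver's option parser: the function name `output_names` is hypothetical, and it assumes that `-o` is the only option beginning with `-o`, in either the separate (`-o file`) or joined (`-ofile`) spelling.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collect every output name given via "-o name" or "-oname", in order.
// More than one entry in the result means the command line repeats -o,
// and with "last one wins" semantics only the final entry takes effect.
std::vector<std::string> output_names(const std::vector<std::string>& args) {
    std::vector<std::string> names;
    for (std::size_t i = 0; i < args.size(); ++i) {
        const std::string& arg = args[i];
        if (arg == "-o") {
            if (i + 1 < args.size()) {
                names.push_back(args[++i]);  // separate form: -o name
            }
        } else if (arg.size() > 2 && arg.compare(0, 2, "-o") == 0) {
            names.push_back(arg.substr(2));  // joined form: -oname
        }
    }
    return names;
}
```

For the command line reported earlier, `gcc -W -Wall -o main main.c -omup.o`, this returns `main` and `mup.o`, which a wrapper could report as a likely typo while still letting the last name win.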
Re: Long paths with ../../../../ throughout
On 2 February 2010 22:47, Ian Lance Taylor wrote: > Jon writes: > >> Is there a way to get collect2 to save the temporary .c file it >> generates to have a look at it? I believe it may be the __main() >> function, with the -debug option it gives the attached >> gplusplus_collect2_log.txt, looking at the [/tmp/ccyBAI9V.c] file >> though it is empty, any ideas? > > Using -debug will direct collect2 to save the temporary .c file when > it creates one. However, in ordinary use on GNU/Linux, collect2 will > never generate a temporary .c file. Hello Ian, Another reply for this old thread. I wondered, if collect2 is possibly not needed in normal use on GNU/Linux, could GCC be configured to call ld directly in those cases to save launching another binary. Best regards, Jon
Re: Long paths with ../../../../ throughout
Ian Lance Taylor wrote, On 03/07/11 05:27: Jon Grant writes: [.] Another reply for this old thread. I wondered, if collect2 is possibly not needed in normal use on GNU/Linux, could GCC be configured to call ld directly in those cases to save launching another binary. collect2 is needed if you use -frepo or -flto. Hi Ian. Not sure how easy this is, but could those options simply be checked to determine if the linker could be called directly? Would save launching collect2 then, to speed up builds a bit! Best regards, Jon
onlinedocs formatted text too small to read
Hello I'm using latest Firefox looking at the onlinedocs with a default Firefox install, default font sizes, no change in zoom level. http://gcc.gnu.org/onlinedocs/gcc/Variable-Attributes.html The monospace text is tiny, e.g.: struct foo { int x[2] __attribute__ ((aligned (8))); }; The fix is pretty easy, just change the embedded CSS that is generated: pre.smallexample { font-size:smaller } I propose this be changed to: pre.smallexample { font-size:normal } Note: There are some other CSS tags with "smaller", but they are unused on that page, perhaps worth checking if the others are used in other files and can be changed to "normal" as well. I'm not on the mailing list, so please keep my email address in any replies. Thanks, Jon
gcc bitfield order control
Hello I have a build with a lot of structures in a big-endian style layout. [I recognise that bit-fields are not portable due to their ordering not being locally MSB first like the regular bit shift operation << is, i.e. (1<<2) == 4.] typedef struct reg32 { union { _uint32 REG32; struct { _uint32 BIT31:1; _uint32 BIT30:1; _uint32 BITS:30; } B; } R; } reg32_t; On a little-endian ARM build, using arm-none-eabi-gcc (Sourcery G++ Lite 2010.09-51) 4.5.1, writing reg.R.REG32 = 1 results in BIT31 containing 1, rather than the "1" being in the "BITS" field. My thought is I'll need to swap every structure to be in little-endian style ordering. Does anyone have any other ideas how to handle this with gcc? I was thinking to write a little program to make the changes to the header file. Please include my email address in any replies. Best regards, Jon
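One way to avoid depending on the compiler's bit-field allocation order at all is to keep the register as a plain 32-bit value and access the fields with shifts and masks, which have the same meaning on every target. A minimal sketch, assuming the layout implied by the field names (BIT31 is the most significant bit, BIT30 the next, BITS the low 30 bits); the mask constants and helper names are hypothetical.

```cpp
#include <cstdint>

// Field positions expressed as masks on the 32-bit register value.
constexpr std::uint32_t kBit31Mask = 1u << 31;
constexpr std::uint32_t kBit30Mask = 1u << 30;
constexpr std::uint32_t kBitsMask  = (1u << 30) - 1;  // low 30 bits

inline bool bit31(std::uint32_t reg) { return (reg & kBit31Mask) != 0; }
inline bool bit30(std::uint32_t reg) { return (reg & kBit30Mask) != 0; }
inline std::uint32_t bits(std::uint32_t reg) { return reg & kBitsMask; }

// Return a copy of reg with the BITS field replaced by value.
inline std::uint32_t set_bits(std::uint32_t reg, std::uint32_t value) {
    return (reg & ~kBitsMask) | (value & kBitsMask);
}

// Return a copy of reg with BIT31 set or cleared.
inline std::uint32_t set_bit31(std::uint32_t reg, bool on) {
    return on ? (reg | kBit31Mask) : (reg & ~kBit31Mask);
}
```

With these helpers a register value of 1 reads back as `bits(reg) == 1` with `bit31(reg)` false, regardless of how the compiler would have ordered the bit-fields.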
Re: onlinedocs formatted text too small to read
Hello Georg-Johann Lay wrote, On 08/07/11 19:08: [.] I can confirm that it's hardly readable on some systems. I use Opera and several FF versions, some worse, some a bit less worse. IMO it's definitely too small, I already thought about complaining, too. Johann Could I ask, what would be the best way to progress this request? e.g. Should I create a bugzilla ticket? Best regards, Jon
Re: ARM summit at Plumbers 2011
On Tue, 2011-08-23 at 17:11 +0100, Steve McIntyre wrote: > UPDATE: we've not had many people confirm interest in this event yet, > which is a shame. If you would like to join us for this session, > please reply and let me know. If we don't get enough interest by the > end of Sunday (28th August), then we'll have to cancel the meeting. I'm obviously confirming, but I'll repeat that for the record. My interests here include helping to lead up Fedora's ARMv7 efforts, but also wider ARM platform standardization (boot, device enumeration, multi-arch, ABI, kernel consolidation, and many other things). If there's at least representation from a few of the distros (as it seems is the case at this point) then I think it's worthwhile having the formal slots. Nothing is lost in so doing. In any case, many discussions will take place if we have the opportunity to do so. Jon.
cc1.exe: warnings being treated as errors
Hello I noticed that when compiling C files with GCC and using the -Werror option, I see this additional output: cc1.exe: warnings being treated as errors ./src/main.c: In function 'main': ./src/main.c:41:15: error: unused variable 'hello' Is the "cc1" line output needed? Just wondering if it could be removed. Appears superfluous. If compiling with g++ it is : cc1plus: warnings being treated as errors I saw this in two slightly old builds of GCC: arm-none-eabi-gcc-4.5.1.exe (Sourcery G++ Lite 2010.09-51) 4.5.1 gcc (Ubuntu 4.4.3-4ubuntu5) 4.4.3 Please keep my email address in any replies as I'm not on the mailing list. Best regards, Jon
Re: cc1.exe: warnings being treated as errors
Jonathan Wakely wrote, On 19/09/11 19:40: On 19 September 2011 18:59, Jon Grant wrote: Hello I noticed that when compiling C files with GCC and using the -Werror option, I see this additional output: cc1.exe: warnings being treated as errors ./src/main.c: In function 'main': ./src/main.c:41:15: error: unused variable 'hello' Is the "cc1" line output needed? Just wondering if it could be removed. Appears superfluous. It's not superfluous, it says that the error following might have been a warning, except that -Werror was used. If you don't want it you can either fix the warning or not use -Werror. It's kind of re-iterating the command line options, that the user will choose to be aware of already. I don't recall seeing that text output before about ~1 year ago. I'd thought because the previous line of output said "gcc -Werror -Wall -o main main.c", the options are clear. If it's really valuable, that output could be turned on by an option itself! -Wdisplay-warning-upgrade. Leaving it off by default. Best regards, Jon
No pointer conversion warning for "bool" in C/C++
Hello Currently gcc, and g++ don't give a warning when a pointer was converted to a bool, in the same way it is for other types. Could I ask for opinion on this, and if I should create a bug ticket. Please find below output from compilation, and attachments showing the two tests. gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2 $ gcc -Wconversion -Wall -o t bool_conversion.c bool_conversion.c: In function ‘main’: bool_conversion.c:14:8: warning: assignment makes integer from pointer without a cast bool_conversion.c:15:9: warning: assignment makes integer from pointer without a cast ^ I expected to see a warning on line 13. $ g++ -Wconversion -o t bool_conversion.cpp bool_conversion.cpp: In constructor ‘A::A()’: bool_conversion.cpp:16:41: warning: converting to non-pointer type ‘int’ from NULL bool_conversion.cpp:16:41: warning: converting to non-pointer type ‘unsigned int’ from NULL ^ I expected to see a bool warning on line 16. I tested assigning NULL in these tests (Note, I also confirmed that assigning a pointer variable produced the same lack of warning output.) Please include my email address in any replies Best regards, Jon // g++ -Wconversion -o t main.cpp // Should this not give a warning for the bool conversion // include to get definition of NULL #include void * g_glob = NULL; class A { public: A(); bool m_bool; int m_int; unsigned int m_uint; }; A::A() : m_bool(g_glob), m_int(NULL), m_uint(NULL) { } int main() { return 0; } // gcc -Wconversion -o t bool_conversion.c // Should this not give a warning for the bool conversion // include to get definition of NULL #include #include int main(void) { bool m_bool; int m_int; unsigned int m_uint; m_bool = NULL; m_int = NULL; m_uint = NULL; return 0; }
Trying to find a gcc warning to detect different parameter names
Hello I am looking for a gcc option to give a warning when parameter names don't match between the prototype in C, and the definition. Could someone point me to the option if there is one please. Example provided below, where "offset" is misspelt "offest". (I found -Wstrict-prototypes, but that only warns if types are not specified.) Would be quite handy to have this ability to check parameter names are consistent. Please include my email address in any replies. Best regards, Jon gcc (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2 // gcc -Wall -o main main.c #include void test_func(int offest); void test_func(int offset) { printf("%d\n", offset); } int main(void) { test_func(1); return 0; }
Re: No pointer conversion warning for "bool" in C/C++
Hello Jonathan Wakely wrote, On 26/09/11 08:10: On 26 September 2011 05:29, Ian Lance Taylor wrote: Jon Grant writes: Currently gcc, and g++ don't give a warning when a pointer was converted to a bool, in the same way it is for other types. At least in C++, it's not really true to say "in the same way it is for other types" because you cannot convert from a pointer to any integer type except bool. Your test uses NULL for the other integer types, which is an integral constant expression with value zero, so it's ok to convert that to an integer type. That's not true for general pointer values: if your test used m_int(g_glob) then it wouldn't compile. Good point. My test should have used g_glob due to NULL being a macro of 0 in C++. There is a lot of code which uses if (p) where p is a pointer to test whether p is not NULL. I don't think we could reasonably have gcc warn about such a case. We might be able to separate out conversion to bool on assignment from conversion to bool in a test, though. That would still break this: Base* p = getObj(); bool is_derived = dynamic_cast(p); What problem is the warning supposed to solve? A programmer assigning a bool with a pointer, there's an implicit evaluation there isn't there? rather than: bool invalid = (NULL == p); I expect this depends on what the standard allows then. Regards, Jon
Re: cc1.exe: warnings being treated as errors
Hi Jonathan Jonathan Wakely wrote, On 24/09/11 15:55: On 24 September 2011 15:40, Jon Grant wrote: It's kind of re-iterating the command line options, that the user will choose to be aware of already. I don't recall seeing that text output before about ~1 year ago. It was there in GCC 4.1, maybe earlier, I didn't check. However, coming back to my query: Is there a need to remind the user that warnings on the build are being treated as errors? Is this a special case because it would cause the build to stop? For example: -Wall means I see "control reaches end of non-void function" messages, but doesn't output "cc1.exe: all warnings turned on" I'd thought because the previous line of output said "gcc -Werror -Wall -o main main.c", the options are clear. Not if you run "make" and it doesn't echo the compiler command, or run the compiler from an IDE, or anything else which shows the errors but not the command. I would have thought that it's not GCC's responsibility to echo the options passed to it. Like the IDE example, the IDE can inform the user of what compiler options are in use; I don't see why GCC can't keep quiet about -Werror. Best regards, Jon
Re: No pointer conversion warning for "bool" in C/C++
Jonathan Wakely wrote, On 26/09/11 09:53: On 26 September 2011 09:32, Jon Grant wrote: [...] bool invalid = (NULL == p); Why is that preferable? It would be clearer IMHO what was happening. I expect this depends on what the standard allows then. 4.12 Boolean conversions [conv.bool] 1 A prvalue of arithmetic, unscoped enumeration, pointer, or pointer to member type can be converted to a prvalue of type bool. A zero value, null pointer value, or null member pointer value is converted to false; any other value is converted to true. A prvalue of type std::nullptr_t can be converted to a prvalue of type bool; the resulting value is false. I stand corrected. No reason to change anything if it is in the standard. Thank you for discussing this point. Best regards, Jon
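[Editorial sketch] The conversions listed in the quoted [conv.bool] paragraph can be tried directly. The struct S and the function names below are made up for this illustration; the behaviour shown (zero, null pointer and null member pointer convert to false, anything else to true) is what the quoted text specifies.

```cpp
#include <cassert>

struct S { int member; };  // hypothetical type for the member-pointer case

// Each function relies on the implicit boolean conversion of its argument.
bool from_pointer(const int* p)        { return p; }   // null pointer -> false
bool from_member_pointer(int S::* pm)  { return pm; }  // null member pointer -> false
bool from_arithmetic(double x)         { return x; }   // zero -> false

// The explicit spelling suggested in the thread, for comparison.
bool is_invalid(const int* p) { return p == nullptr; }

void self_check() {
    int value = 0;
    assert(from_pointer(&value));          // points at something, even though value is 0
    assert(!from_pointer(nullptr));
    assert(is_invalid(nullptr) == !from_pointer(nullptr));
}
```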
Re: onlinedocs formated text too small to read
Georg-Johann Lay wrote, On 01/08/11 09:40: Jon Grant wrote: [...] http://gcc.gnu.org/ml/gcc/2011-07/msg00106.html CCed Gerald, I think he cares for that kind of thing. If he does not answer (it's vacation time) file a PR so that it won't be forgotten. Johann Thank you. I have filed a PR now: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50642 Best regards, Jon
Re: cc1.exe: warnings being treated as errors
Jonathan Wakely wrote, On 26/09/11 09:57: [...] Feel free to request a new option in Bugzilla to suppress the note, that's the right place for this discussion. Good point. I've created a ticket: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50643 Regards, Jon
__sync_sychronize on ARM
Greetings, I have been trying to help diagnose a failure to build in the icu package for Fedora on ARM systems, with gcc 4.7. I should very much like to know the answer to a few questions, so that I can help fix this. I would like to say at the outset that I believe I am a reasonably competent programmer, but I am not (yet perhaps) a gcc hacker, though I certainly enjoy any such opportunity to get to know its internals more. I understand more of the theory than the implementation of compilers :) The __sync_synchronize "legacy" sync function is intended to be used to perform an expensive data memory barrier operation. It is defined within libgcc in such a way that, I *believe*, on most architectures it is replaced with inline assembly code that performs a sync operation. On ARM, and some other architectures with mixed ISAs wherein there may not be a sync instruction nor one way to do this, this function (__sync_synchronize) can be a real function call. In that case, it might cause inline assembly generation, or e.g. call a kernel VDSO helper. The icu package contains a direct call to __sync_synchronize, especially in the iotest test cases. I believe that this compiles fine on x86 because there is no function call. However, on ARM, the code fails to link because the __sync_synchronize function is HIDDEN and not exported (or so goes my understanding - is that correct?). I am drawing a blank, though, on how this differs from earlier versions of gcc such as 4.6 (aside from a slight difference in the macro used to make it available), and whether it is indeed the case that this function should be made available within libgcc for direct linking. In other words, I do not know whether there is a bug here in gcc or whether icu needs changing. It seems that there are newer, less expensive sync operations that perhaps ought to be used instead, but I don't have the full context. Please forgive my lack of understanding of gcc intrinsics, and atomics. 
I would very much like to learn more about the internal implementation and I look forward to whatever information you can share with me. If you would like more information, I can happily provide it in the morning. It is very late here, but I wanted to start this thread asap. We are trying to fix icu so that we can continue to build Fedora 17 for ARM systems. Thanks very much, Jon.
Re: __sync_sychronize on ARM
Hi Ramana, Thanks very much for getting back to me! On Mon, 2012-01-30 at 08:50 +, Ramana Radhakrishnan wrote: > On Mon, Jan 30, 2012 at 6:56 AM, Jon Masters wrote: > > The __sync_synchronize "legacy" sync function is intended to be used to > > perform an expensive data memory barrier operation. It is defined within > > libgcc in such a way that I *believe* means that, on most architectures, > > it is replaced with an inline assembly code emitted that performs a sync > > operation. On ARM, and some other architectures with mixed ISAs wherein > > there may not be a sync function nor one way to do this, this function > > (__sync_synchronize) can be a real function call. In that case, it might > > cause inline assembly generation, or e.g. call a kernel VDSO helper. > > On ARM we don't have a kernel VDSO You're right of course! I was confusing the VDSO-like user mode mapped helpers (Documentation/arm/kernel_user_helpers.txt) with full VDSO. I apologize for my mistake. Nonetheless, I believe for the purposes of this thread, we can consider the behavior I described roughly consistent with reality, because a kernel helper will be called in a VDSO-like way. > sync_synchronize for older versions of > the architecture ( anything prior to armv6k) should result in a call > to sync_synchronize > in libgcc.a which should take care of calling the kernel helper function. This is what's confusing me :) Is one supposed (from some random source) to be calling __sync_synchronize or sync_synchronize? Convention suggests the latter, but I was sufficiently confused by the aliasing of the names in the source vs. the documentation, so I'd like to ask you :) > Therefore I'm assuming this is a breakage you face when building for > armv5te It is indeed. Thanks for noting that. > > The icu package contains a direct call to __sync_sychronize, especially > > in the iotest test cases. I believe that this compiles fine on x86 > > because there is no function call. 
However, on ARM, the code fails to > > link because the __sync_synchronize function is HIDDEN and not exported > > (or so goes my understanding - is that correct?). > > No, the HIDDEN shouldn't cause a link failure in this case - you > should be able to pull this > in when you link against the static libgcc where this should be defined. > > I don't know what your linker command line is so maybe that's a place > to start investigating from. Thanks! You're the second person to suggest that, so I'll look some more. Could you let me know about the correct function name, above? Appreciate the help. Jon.
RESOLVED - Re: __sync_sychronize on ARM
Hello everyone, Just a quick followup. This problem is now resolved. There is no breakage in gcc, just a problem in the Fedora icu package. That package contains some sed scripts in the "SPEC" (build description meta) file that intentionally were munging the Makefiles used to build ICU such that "nostdlib" was being given to gcc and it was never using libgcc. The intention of the person who made this change apparently was to prevent linking the standard math library into icu, but unfortunately a rather unusual solution was chosen. On some systems, one can almost get away with this because __sync_synchronize happens to be implemented in such a fashion that it is optimized into inline emitted assembly. On ARM, that isn't the case. In addition, it is likely that telling GCC not to link in core libraries like libgcc will lead to other breakage later. I have requested the package be fixed to remove the sed scripts and have temporarily (just to solve our problem in the Fedora ARM community) had "-lgcc" added to the linker flags as a very hackish solution for today. Thanks for the replies, and I apologize for the noise. I have learned a great deal about gcc atomics over the past few days. I have also learned that debugging packages requires that you build the package *exactly* as it is in the spec file, not just by running configure/make as therein ;) Jon.
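[Editorial sketch] For readers unfamiliar with the builtin discussed in this thread, the following shows a direct use of __sync_synchronize (a GCC-specific builtin that issues a full memory barrier) next to the standard C++11 fence. The variable and function names are hypothetical and this is not the icu code. On targets where the builtin expands inline no library is needed; where it becomes a real call, as described above for older ARM, the symbol has to come from libgcc, which is why linking with -nostdlib broke the build.

```cpp
#include <atomic>
#include <cassert>

int shared_data = 0;          // payload written before the flag
std::atomic<int> ready(0);    // flag read by a consumer (not shown)

// Legacy GCC builtin: full barrier between the payload store and the flag store.
void publish_legacy(int value) {
    shared_data = value;
    __sync_synchronize();
    ready.store(1, std::memory_order_relaxed);
}

// Portable C++11 spelling of a full fence.
void publish_cxx11(int value) {
    shared_data = value;
    std::atomic_thread_fence(std::memory_order_seq_cst);
    ready.store(1, std::memory_order_relaxed);
}

void self_check() {
    publish_legacy(1);
    assert(shared_data == 1 && ready.load() == 1);
}
```

Whether either form compiles to an inline instruction or to a helper call depends on the target and the architecture options in use.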
RE: Lattice Mico32 port
Hi Richard, >> Index: gcc/config/lm32/sfp-machine.h >> Index: gcc/config/lm32/crti.S >> Index: gcc/config/lm32/lib1funcs.S >> Index: gcc/config/lm32/crtn.S >> Index: gcc/config/lm32/arithmetic.c >> Index: gcc/config/lm32/t-fprules-softfp >> Index: gcc/config/lm32/t-lm32 > >Can you move these to libgcc? The rules in libgcc/Makefile.in use $(gcc_srcdir) (E.g. for targets lib1asmfuncs-o). How would you suggest I do this? Cheers, Jon
RE: VOIDmode in ZERO_EXTEND crashed
> PS: Does gcc have a function which could dump the specified rtx? > I wanna dump the rtx when the crash happening. debug_rtx(x); You can also call this from within GDB, by typing: call debug_rtx(x) Cheers, Jon
RE: Lattice Mico32 port
> The port is ok to check in. Great - so can I apply it, or does someone else need to? Cheers, Jon
Long paths with ../../../../ throughout
Hello

gcc -o t -### test.c

Any easy way to evaluate and reduce command lines? Consider this:

/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crt1.o

Is actually the same as: /usr/lib/crt1.o -- which is much clearer!

I'm using Ubuntu 9.04.

Cheers, Jon

$ gcc -o t -### test.c
Using built-in specs.
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 4.3.3-5ubuntu4' --with-bugurl=file:///usr/share/doc/gcc-4.3/README.Bugs --enable-languages=c,c++,fortran,objc,obj-c++ --prefix=/usr --enable-shared --with-system-zlib --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --enable-nls --with-gxx-include-dir=/usr/include/c++/4.3 --program-suffix=-4.3 --enable-clocale=gnu --enable-libstdcxx-debug --enable-objc-gc --enable-mpfr --enable-targets=all --with-tune=generic --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu --target=i486-linux-gnu
Thread model: posix
gcc version 4.3.3 (Ubuntu 4.3.3-5ubuntu4)
COLLECT_GCC_OPTIONS='-o' 't' '-mtune=generic'
 "/usr/lib/gcc/i486-linux-gnu/4.3.3/cc1" "-quiet" "test.c" "-D_FORTIFY_SOURCE=2" "-quiet" "-dumpbase" "test.c" "-mtune=generic" "-auxbase" "test" "-fstack-protector" "-o" "/tmp/ccoCNitV.s"
COLLECT_GCC_OPTIONS='-o' 't' '-mtune=generic'
 "as" "-Qy" "-o" "/tmp/ccKSwMpH.o" "/tmp/ccoCNitV.s"
COMPILER_PATH=/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/:/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/
LIBRARY_PATH=/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/4.3.3/:/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/:/lib/../lib/:/usr/lib/../lib/:/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../:/lib/:/usr/lib/
COLLECT_GCC_OPTIONS='-o' 't' '-mtune=generic'
 "/usr/lib/gcc/i486-linux-gnu/4.3.3/collect2" "--eh-frame-hdr" "-m" "elf_i386" "--hash-style=both" "-dynamic-linker" "/lib/ld-linux.so.2" "-o" "t" "-z" "relro" "/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crt1.o" "/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crti.o" "/usr/lib/gcc/i486-linux-gnu/4.3.3/crtbegin.o" "-L/usr/lib/gcc/i486-linux-gnu/4.3.3" "-L/usr/lib/gcc/i486-linux-gnu/4.3.3" "-L/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib" "-L/lib/../lib" "-L/usr/lib/../lib" "-L/usr/lib/gcc/i486-linux-gnu/4.3.3/../../.." "/tmp/ccKSwMpH.o" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/i486-linux-gnu/4.3.3/crtend.o" "/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crtn.o"
Re: Long paths with ../../../../ throughout
I should add, I'm not on this mailing list, so please include my email address in any replies. Cheers, Jon
Re: Long paths with ../../../../ throughout
2010/1/19 Jon Grant : > I should add, I'm not on this mailing list, so please include my email > address in any replies. Also I notice lots of duplicate parameters: Is this directory really needed twice? -L/usr/lib/gcc/i486-linux-gnu/4.3.3 -L/usr/lib/gcc/i486-linux-gnu/4.3.3 Also, -lgcc_s is mentioned twice, as is -lgcc. Finally, could collect2 output command lines when in -verbose mode? Currently I can't see what parameters it is calling "ld" with when ld fails. I'm not on this mailing list, so please include my email address in any replies. Cheers, Jon
Re: Long paths with ../../../../ throughout
Hello Ian Thank you for the quick reply with explanations. 2010/1/19 Ian Lance Taylor : > Jon Grant writes: > >> Any easy way to evaluate and reduce command lines? Consider this: >> >> /usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crt1.o >> >> Is actually the same as: /usr/lib/crt1.o -- which is much clearer! > > Using this form of path makes it easy to move an installed gcc tree to > a new location and have it continue to work correctly. Since normal > users never see these paths, the goal is correctness rather than > clarity. Ok I understand. The reason to build it up from a root and a target /lib/crt*.o file. I thought it would be possible to resolve the back to a direct pathname though to use for the parameters. I see that some of the files are located in the -L library directory specified, crtbegin.o, crtend.o in which case, perhaps they both do not need their full long path specified. >> Also I notice lots of duplicate parameters: >> >> Is this directory really needed twice? >> -L/usr/lib/gcc/i486-linux-gnu/4.3.3 -L/usr/lib/gcc/i486-linux-gnu/4.3.3 > > No. I would encourage you to investigate why it is happening. i tried: gcc -o t -Wl,-debug test.c, I see collect2 gets the duplicates passed to it, and then it passes it on to ld. I would have thought that if collect2 was compiled with define LINK_ELIMINATE_DUPLICATE_LDIRECTORIES it would strip out the duplicate parameters before calling ld. It does not appear to be switched on in this Ubuntu package I am using though. Is it on by default? >> also -lgcc_s is mentioned twice, as is -gcc > > This is because on some systems there is a circular dependency between > -lgcc and -lc. Some of the functions in -lgcc require functions in > -lc. If -lc was compiled with gcc, then on some systems some of the > functions in -lc will require -lgcc. Fortunately the functions which > -lc requires in -lgcc will never themselves require -lc. 
So > mentioning -lgcc twice, once before -lc and once after, suffices on > all systems. > >> Finally, could collect2 output command lines when in -verbose mode? >> Currently I can't see what parameters it is calling "ld" with.. when >> ld fails. > > To see what collect2 is doing, use -Wl,-debug. Is this documented? If I add this to my existing command line I see there is not any output: $ gcc -### -o t -Wl,-debug test.c If I change it to not have -### I see it does work, not sure why. So I understand that this passes -debug to collect2. As collect2 only has a -v mode to display the version, would a patch to add --help to it be supported? It could also describe something about collect2's purpose at the top of that --help output. Additional queries: 1) collect2.c:scan_libraries may not find ldd, in which case it displays a message on output, and returns as normal. Should it not be fatal if ldd is required? 2) In collect2.c:main "-debug" is checked, and the variable debug is set to 1 (perhaps that should be "true" to match the style of other flags). Please keep my email address in any reply. Cheers, Jon
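[Editorial sketch] The two clean-ups asked about in this thread, collapsing the ../ components and dropping repeated -L directories, can be illustrated with a small stand-alone routine. This is not how gcc or collect2 does it (the function names are invented here, and it is not the LINK_ELIMINATE_DUPLICATE_LDIRECTORIES code); it is a purely textual pass over an argument list. Note that collapsing ".." textually ignores symbolic links, so the result can differ from what the filesystem would resolve, and that repeated -lgcc options are deliberately left alone for the reason Ian gives above.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Lexically collapse "." and ".." in an absolute POSIX path.
// Purely textual: symbolic links are not consulted.
std::string normalize_path(const std::string& path) {
    std::vector<std::string> parts;
    std::string::size_type i = 0;
    while (i < path.size()) {
        std::string::size_type j = path.find('/', i);
        if (j == std::string::npos) j = path.size();
        std::string part = path.substr(i, j - i);
        if (part == "..") {
            if (!parts.empty()) parts.pop_back();
        } else if (!part.empty() && part != ".") {
            parts.push_back(part);
        }
        i = j + 1;
    }
    std::string out;
    for (const std::string& p : parts) out += "/" + p;
    return out.empty() ? "/" : out;
}

// Normalize every "-L/abs/dir" argument and keep only the first occurrence
// of each directory. All other arguments pass through unchanged, in order.
std::vector<std::string> tidy_link_args(const std::vector<std::string>& args) {
    std::set<std::string> seen_dirs;
    std::vector<std::string> out;
    for (const std::string& arg : args) {
        bool is_abs_L = arg.size() > 2 && arg.compare(0, 2, "-L") == 0 && arg[2] == '/';
        if (!is_abs_L) {
            out.push_back(arg);
            continue;
        }
        std::string dir = normalize_path(arg.substr(2));
        if (seen_dirs.insert(dir).second) out.push_back("-L" + dir);
    }
    return out;
}

void self_check() {
    assert(normalize_path("/usr/lib/gcc/i486-linux-gnu/4.3.3/../../../../lib/crt1.o")
           == "/usr/lib/crt1.o");
}
```

Applied to the -L options in the collect2 command line shown earlier in the thread, this leaves three distinct directories: the gcc version directory, /usr/lib and /lib.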
Gprof can account for less than 1/3 of execution time?!?!
I have recently encountered a gross inaccuracy in gprof that I can't explain. Yes, I know gprof uses a sampling technique so I should not expect a high level of precision, but the results I am getting clearly reflect a more fundamental issue. The program in question has been compiled with -pg for all source code files. The time command reports 20 seconds of user time (which is consistent with personal observation) but the gprof output accounts for only about 6 seconds of the execution time. I have eliminated all IO from the program, and the results remain consistent. Gprof is sampling the program every 10 ms, so in the observed 20 seconds of execution time, it should be taking 2000 samples, which should be enough to avoid any gross inconsistencies. Any ideas would be appreciated. Jon
Re: Gprof can account for less than 1/3 of execution time?!?!
Maucci, Cyrille wrote: Hello Jon, I'm used to gprof on HPUX and can tell you that on HPUX when we gprof an executable, its only works on all the objects present in the executable but not the shared libs. Actually on HPUX, either you choose to gprof the exe or the libs but not both. When you want both you go to more advanced tools like Caliper. So I don't know which platform you were running on there and if gcc's gprof works as HPUX's gprof, but if there's the same limitation as with HPUX's gprof, maybe this is what you've hit? > This was run on an AMD Opteron running Linux. ++Cyrille PS: how can you claim you have eliminated all I/Os? Not sure what you are asking. I have deleted all input and output statements from the program. I replace the original input by a subroutine that generates the test data internally. In this case, I am running the program just to get the gprof data. If gprof were missing 10% of the execution time, I would shrug and say no big deal. But it's missing 70% of the execution time, which seems to imply something fundamentally wrong with either gprof or the way I am using it. Jon -Original Message- From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Jon Turner Sent: Sunday, February 21, 2010 7:27 PM To: gcc@gcc.gnu.org Subject: Gprof can account for less than 1/3 of execution time?!?! I have recently encountered a gross inaccuracy in gprof that I can't explain. Yes, I know gprof uses a sampling technique so I should not expect a high level of precision, but the results I am getting clearly reflect a more fundamental issue. The program in question has been compiled with -pg for all source code files. The time command reports 20 seconds of user time (which is consistent with personal observation) but the gprof output accounts for only about 6 seconds of the execution time. I have eliminated all IO from the program, and the results remain consistent. 
Gprof is sampling the program every 10 ms, so in the observed 20 seconds of execution time, it should be taking 2000 samples, which should be enough to avoid any gross inconsistencies. Any ideas would be appreciated. Jon
Re: Gprof can account for less than 1/3 of execution time?!?!
Yes, it is statically linked. In any case, there is very little usage of external libraries here. Jon Alan Modra wrote: On Sun, Feb 21, 2010 at 12:27:04PM -0600, Jon Turner wrote: The program in question has been compiled with -pg for all source code files. Linked statically too? If not, the missing time is probably spent in libc.so or other shared libraries.
Re: Gprof can account for less than 1/3 of execution time?!?!
You're not listening. I am using -pg and the program is statically linked. The concern I am raising is not about the function counting, but the reported running times, which are determined by sampling (read the gprof manual, if this is news to you). In this case, the mcount overhead cannot account for the discrepancy, since that would cause gprof to OVER-estimate the run time, while in this case it is UNDER-estimating. It's missing about 70% of the actual running time in the program. It's conceivable that I am doing something wrong. I hope so, since once I know what it is, I can fix it. But at the moment, it's hard to avoid the suspicion that something about the gprof implementation is deeply flawed. Jon Joern Rennecke wrote: Quoting Michael Matz : Hi, On Sun, 21 Feb 2010, Jon Turner wrote: I have recently encountered a gross inaccuracy in gprof that I can't explain. Yes, I know gprof uses a sampling technique This is incorrect. Code compiled with -pg will call mcount on each function entry. If there are many calls (compared to other computations) the mcount overhead might become fairly large. The mcount overhead actually depends on the machine description, although most ports have standardized on a very runtime profligate scheme.
Re: Gprof can account for less than 1/3 of execution time?!?!
                                                      graph::graph(int, int)
  0.00      6.20     0.00        1     0.00     0.00  augPath::augPath(flograph&, int&)
  0.00      6.20     0.00        1     0.00     0.00  augPath::~augPath()
  0.00      6.20     0.00        1     0.00     0.00  digraph::makeSpace()
  0.00      6.20     0.00        1     0.00     0.00  digraph::digraph(int, int)
  0.00      6.20     0.00        1     0.00     0.00  flograph::makeSpace()
  0.00      6.20     0.00        1     0.00     0.00  flograph::flograph(int, int, int, int)
  0.00      6.20     0.00        1     0.00     6.15  shortPath::shortPath(flograph&, int&)
  0.00      6.20     0.00        1     0.00     0.00  shortPath::~shortPath()

I cut this off after gprof displayed the flat profile. The important thing to note is that the cumulative seconds reported by gprof never exceeds 6.2 seconds. Here's some basic cpu info.

% more /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 2
model name      : Intel(R) Pentium(R) 4 CPU 2.40GHz
stepping        : 9
cpu MHz         : 2394.152
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe up cid xtpr
bogomips        : 4790.27
clflush size    : 64

Any insight or advice will be much appreciated. Jon
Re: Gprof can account for less than 1/3 of execution time?!?!
Doh! Thanks, Nathan. I think you put your finger on it. I was well aware of the overhead that gprof can introduce, but did not recognize that this overhead was not being counted by gprof. Jon

Nathan Froyd wrote: On Mon, Feb 22, 2010 at 03:23:52PM -0600, Jon Turner wrote: In it, you will find a directory with all the source code needed to observe the problem for yourself. The top level directory contains a linux executable called maxFlo, which you should be able to run on a linux box as is. But if you want/need to compile things yourself, type "make clean" and "make all" in the top level directory and you should get a fresh copy of maxFlo.

So, compiling maxFlo with no -pg option:

@nightcrawler:~/src/gprof-trouble-case$ time ./maxFlo
real 0m3.465s
user 0m3.460s
sys 0m0.000s

Compiling maxFlo with -pg option:

@nightcrawler:~/src/gprof-trouble-case$ time ./maxFlo
real 0m9.780s
user 0m9.760s
sys 0m0.010s

Notice that ~60% of the running time with gprof enabled is simply overhead from call counting and the like. That time isn't recorded by gprof. That alone accounts for your report about gprof ignoring 2/3 of the execution time.

Checking to see whether maxFlo is a dynamic executable (since you claimed earlier that you were statically linking your program):

@nightcrawler:~/src/gprof-trouble-case$ ldd ./maxFlo
linux-vdso.so.1 => (0x7fff2977f000)
libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x7fb422c21000)
libm.so.6 => /lib/libm.so.6 (0x7fb42299d000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x7fb422786000)
libc.so.6 => /lib/libc.so.6 (0x7fb422417000)
/lib64/ld-linux-x86-64.so.2 (0x7fb422f31000)

So calls to shared library functions (such as functions in libm) will not be caught by gprof. Those calls could account for a significant amount of the running time of your program and gprof can't tell you about them.

Inspecting the gmon.out file:

@nightcrawler:~/src/gprof-trouble-case$ gprof maxFlo gmon.out
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 16.09      0.37     0.37    27649     0.00     0.00  shortPath::findPath()
 12.61      0.66     0.29 55889952     0.00     0.00  graph::next(int,int) const
 11.96      0.94     0.28 61391904     0.00     0.00  graph::mate(int,int) const
 10.87      1.19     0.25 58654752     0.00     0.00  flograph::res(int,int) const
 10.44      1.43     0.24                             _fini
  6.96      1.59     0.16 65055289     0.00     0.00  graph::term(int) const
  6.96      1.75     0.16 61391904     0.00     0.00  digraph::tail(int) const
[...lots of stuff elided...]
  0.00      2.30     0.00        1     0.00     0.00  graph

gprof is telling you about 2.3 seconds of your execution time. With the factors above accounted for, that doesn't seem unreasonable. -Nathan
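[Editorial sketch] The arithmetic behind the two figures in this thread (the expected sample count from the first message, and the instrumentation overhead Nathan measured) can be written down as two small helpers. The function names are invented for illustration; the input numbers are the ones quoted in the messages.

```cpp
#include <cassert>

// Number of profiling samples expected for a run, with both times in milliseconds.
long expected_samples(long run_ms, long interval_ms) {
    return run_ms / interval_ms;
}

// Fraction of the profiled run that is instrumentation overhead, estimated
// from the wall time without -pg and the wall time with -pg (both in seconds).
double overhead_fraction(double plain_s, double profiled_s) {
    return (profiled_s - plain_s) / profiled_s;
}

void self_check() {
    // 20 seconds of execution sampled every 10 ms.
    assert(expected_samples(20000, 10) == 2000);
}
```

With Nathan's timings, overhead_fraction(3.465, 9.780) comes out at about 0.65, in line with the "~60%" he quotes. The samples are still taken throughout the run; the point made in the reply is that the time spent in the call-counting code is not reported in the profile.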
Updating multilib support after a compiler is built
Hi, Is it possible to update the multilib combinations supported by GCC after it has been built? (I would like to build some libraries optimised for different CPU variants that aren't built by default.) I tried doing this via a specs file, but something like the following fails:

%rename multilib_matches old_multilib_matches

*multilib_matches:
mcpu=xyz mcpu=xyz;%(old_multilib_matches);

with:

multilib spec 'mcpu=xyz mcpu=xyz;%(old_multilib_matches);' is invalid

So it looks like GCC isn't performing substitutions for the %s in multilib_matches specs. (The same seems to be true for the other multilib specs.) Perhaps there's a simpler way? Cheers, Jon
RE: Updating multilib support after a compiler is built
Thanks for the suggestions. > If you only want to optimize some libraries but not others, GCC doesn't > effectively support different multilibs having different sets of libraries either. > My proposal <http://gcc.gnu.org/ml/gcc/2010-01/msg00063.html> > would have the effect of making it much easier to have different sets of > libraries for each multilib. Sounds like a good proposal. For now I've just hacked in a -multilib option. Cheers, Jon
Re: Bug in expand_builtin_setjmp_receiver ?
Hi Nathan, > lm32 has a gdb simulator available, so it should be fairly easy to write > a board file for it if one doesn't already exist. > > Unfortunately, building lm32-elf is broken in several different ways > right now. What problems do you have building lm32-elf? If you let me know, I can try to look in to them. Cheers, Jon
RE: Bug in expand_builtin_setjmp_receiver ?
Hi Fred, > If you have access to a lm32 toolchain, can you test if gcc.c- > torture/execute/built-in-setjmp.c passes at different optimization levels? For a SVN snapshot from yesterday, patched so it fixes the problem Nathan mentioned:

FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer -funroll-loops
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O3 -g
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -Os
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2 -flto
FAIL: gcc.c-torture/execute/built-in-setjmp.c execution, -O2 -fwhopr

Cheers, Jon
optimizing calling conventions for function returns
Looking at assembly listings of the Linux kernel I see thousands of places where function returns are checked to be non-zero to indicate errors. For example something like this:

        mov bx, 0
.L1     call foo
        test ax,ax
        jnz .Lerror
        inc bx
        cmp bx, 10
        jne .L1
.Lerror process error

A new calling convention could push two return addresses for functions that return their status in EAX. On EAX=0 you take the first return, EAX != 0 you take the second. So the above code becomes:

        push .Lerror
        mov bx, 0
.L1     call foo
        inc bx
        cmp bx, 10
        jne .L1
        add sp, 2
.Lerror process error

The called function then does 'ret' or 'ret 4' depending on the status of EAX != 0. Of course there are many further optimizations that can be done, but this illustrates the concept. Has work been done to evaluate a calling convention that takes error checks like this into account? Are there size/performance wins? Or am I just reinventing a variation on exception handling? -- Jon Smirl [EMAIL PROTECTED]
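[Editorial sketch] At the source level, the call/test/branch pattern described above and the exception-handling variant the message alludes to look roughly like this. The function names are hypothetical and the error condition (failing on the eighth call) is invented so the two loops can be compared. With the table-driven exception handling used by typical GCC targets, the caller's normal path in the second loop contains no explicit status test, at the price of a much more expensive error path; this is an illustration of the trade-off, not a measurement.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical callee, status-code style: 0 on success, non-zero on error.
int foo_status(int i) { return i == 7 ? -1 : 0; }

// Hypothetical callee, exception style: throws on error.
void foo_throwing(int i) {
    if (i == 7) throw std::runtime_error("foo failed");
}

// Every call is followed by a test and a conditional branch in the caller.
int loop_with_status() {
    int done = 0;
    for (int i = 0; i < 10; ++i) {
        if (foo_status(i) != 0) break;   // the call / test / jnz sequence
        ++done;
    }
    return done;
}

// No per-call test in the caller; the error path goes through the handler.
int loop_with_exceptions() {
    int done = 0;
    try {
        for (int i = 0; i < 10; ++i) {
            foo_throwing(i);
            ++done;
        }
    } catch (const std::runtime_error&) {
        // error path: corresponds to the "process error" label above
    }
    return done;
}

void self_check() {
    assert(loop_with_status() == loop_with_exceptions());
}
```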
Re: optimizing calling conventions for function returns
On 5/23/06, Paul Brook <[EMAIL PROTECTED]> wrote: > Has work been done to evaluate a calling convention that takes error > checks like this into account? Are there size/performance wins? Or am > I just reinventing a variation on exception handling? This introduces an extra stack push and will confuse a call-stack branch predictor. If both the call stack and the test are normally predicted correctly I'd guess this would be a performance loss on modern cpus. Note that the error return is above the normal return and not placed there by a call; it should look like data to the predictor. The normal return is placed on the stack by a call, which should continue to be correctly predicted. I would expect the error return path to be mispredicted, but it is supposed to be the unlikely case. Is the callstack branch correctly predicted if the routine being called is complex? This does eliminate the test/jmp after every function call. Further branches could be eliminated by having multiple returns from the called function at the expense of increasing code size. Paul -- Jon Smirl [EMAIL PROTECTED]
Re: optimizing calling conventions for function returns
On 5/23/06, Florian Weimer <[EMAIL PROTECTED]> wrote:
> Yes, but the test/jump now happens in the callee, and you need to
> maintain an additional stack slot. I wouldn't be surprised if the
> change isn't a win.

The callee already had to implement the test/jmp in order to decide to return the error. So this shouldn't introduce another one.

> Some form of exception handling for truly exceptional
> situations would probably be better (and might have helped to avoid
> quite a few of the last CVEs 8-).

-- Jon Smirl [EMAIL PROTECTED]
Re: optimizing calling conventions for function returns
On 5/23/06, Gabriel Paubert <[EMAIL PROTECTED]> wrote: On Tue, May 23, 2006 at 11:21:46AM -0400, Jon Smirl wrote: > Has work been done to evaluate a calling convention that takes error > checks like this into account? Are there size/performance wins? Or am > I just reinventing a variation on exception handling? It's fairly close to Fortran alternate return labels, which were standard in Fortran 77 but have been declared obsolescent in later revisions of the standard. I like this method since it can be implemented transparently in C code. That means the Linux kernel could use it without rewriting everything. Regards, Gabriel -- Jon Smirl [EMAIL PROTECTED]
Re: optimizing calling conventions for function returns
On 5/23/06, Paul Brook <[EMAIL PROTECTED]> wrote: > Has work been done to evaluate a calling convention that takes error > checks like this into account? Are there size/performance wins? Or am > I just reinventing a variation on exception handling? This introduces an extra stack push and will confuse a call-stack branch predictor. If both the call stack and the test are normally predicted correctly I'd guess this would be a performance loss on modern cpus.

I just finished writing a bunch of test cases to explore the idea. My conclusion is that if the error returns are very infrequent (<<1%) then this is a win. But if there are a significant number of error returns this is a major loss. These two instructions on the error return path are the killer:

        addl $4, %esp
        ret             /* Return to error return */

Apparently the CPU has zero expectation that the address being jumped to is code. In the calling routine I pushed the error return as data.

        pushl $.L11     /* push return address */

So for the non-error path there is a win by removing the error test/jmp on the function return. But taking the error path is very expensive. I'm experimenting with 50 line assembly programs on a P4. I do wonder if these micro results would apply in a macro program. My test is losing because the return destination had been predicted and the introduction of the addl messed up the prediction. But in a large program with many levels of calls would the return always be predicted on the error path? -- Jon Smirl [EMAIL PROTECTED]
Re: optimizing calling conventions for function returns
On 5/25/06, Geert Bosch <[EMAIL PROTECTED]> wrote: On May 23, 2006, at 11:21, Jon Smirl wrote: > A new calling convention could push two return addresses for functions > that return their status in EAX. On EAX=0 you take the first return, > EAX != 0 you take the second. This seems the same as passing an extra function pointer argument and calling that instead of doing a regular return. Tail-call optimization should turn the call into a jump. Why do you think a custom ABI is necessary? The new ABI may not be necessary but adding an extra parameter would require changing source everywhere. The ABI scheme is source transparent and lets the compiler locate the places where it would be a win. The ABI scheme would also let the alternative return be pushed on the stack once no matter how many calls were made; a parameter has to be pushed each time. I ran into another snag: taking the alternative return on a P4 has really bad performance impacts since it messes up prefetch. This sequence is the killer.

        addl $4, %esp
        ret             /* Return to error return */

I can try coding this as a parameter and see how the compiler generates code differently. The sequence of call, test, jne (or slight variations) occurs in 1000's of places; if a better alternative can be found there could be significant performance gains. I haven't found a good solution yet, any help would be appreciated. -Geert -- Jon Smirl [EMAIL PROTECTED]
Re: optimizing calling conventions for function returns
On 5/25/06, Jon Smirl <[EMAIL PROTECTED]> wrote:
> I ran into another snag that taking the alternative return on a P4 has
> really bad performance impacts since it messes up prefetch. This
> sequence is the killer.
>
>     addl $4, %esp
>     ret              /* Return to error return */
>
> I can try coding this as a parameter and see how the compiler
> generates code differently.

    jmp *4(%esp)

This is slightly faster than addl, ret. But my micro scale benchmarks are extremely influenced by changes in branch prediction. I still wonder how this would perform in large programs.

It seems that the sequence

    ret
    test
    jne

is very fast compared to

    jmp *4(%esp)

even when they both end up at the same place. It looks to me like the call stack predictor is controlling everything. The only way to make this work would be to figure out some way to get the alternative return address into the call stack predictor.

--
Jon Smirl
[EMAIL PROTECTED]
Inline memcpy in GCC 4.1.1
Hi,

I'm updating a port from 3.4.6 to 4.1.1. In 3.4.6, I hadn't implemented movmemsi patterns, but the compiler could still inline memcpys (and also strcpys where the source string is a const) by itself. After updating to 4.1.1, calls to memcpy are always generated.

I've had a bash at implementing movmemsi, but in a test case that does a strcpy (dest, "const"), it appears the 4th parameter (alignment) is always 1, and doing a MEM_ALIGN on the source operand results in 8. This is despite the fact that I have implemented the CONSTANT_ALIGNMENT and DATA_ALIGNMENT macros to ensure that STRING_CSTs and QImode ARRAY_TYPEs get aligned on a BITS_PER_WORD boundary. (If I look at the assembler output, it shows the string being aligned on a word boundary, as expected.)

So, two questions: any idea why 4.1.1 is no longer able to automatically inline memcpys, and why is the source operand for movmemsi not known to be as widely aligned as it actually is?

Cheers,
Jon
RE: Inline memcpy in GCC 4.1.1
>
> In http://gcc.gnu.org/ml/gcc/2006-06/msg00185.html, your wrote:
>
> > So, two questions: any idea why 4.1.1 is no longer able to
> > automatically inline memcpys and why is the source operand for
> > movmemsi not know to be as widely aligned as it actually is?
>
> See PR middle-end/27226
>

Thanks a lot, that patch fixed it.

Cheers,
Jon
Re: gets is not too dangerous
On Thu, 2006-08-31 at 17:52 -0400, Miguel Angel Champin Catalan wrote:
> We are students of computer sciences in the Santa Maria University,
> Chile. We just want to know if the function "gets" it's too dangerous
> for a warning. The fact is that our teacher's assistant give us a
> homework, and one restriction was to use gcc to compile our code,
> without warnings.

As others said, it's not GCC directly giving you the warning. But nonetheless, it's good to understand where your logic is flawed.

> We ask you for a simple explanation (if it's possible) about our
> warning, telling that "gets" is not too dangerous, because in our case,
> works perfectly, under some restrictions obviously.

The man page states it plainly:

    No check for buffer overrun is performed (see BUGS below).

Hopefully, you know what a buffer overrun/overflow is and understand why it is therefore a very bad idea to be using gets even in academic work.

Cheers!
Jon.
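For readers wondering what to use instead, the usual bounded replacement is the standard `fgets`, which takes the buffer size and so cannot write past the end. A minimal sketch follows; the helper name `read_line` is made up for this illustration and is not something the thread proposes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most size-1 characters of one line into buf
   and strip the trailing newline if one was read.
   Returns buf on success, NULL on end-of-file or read error. */
char *read_line(char *buf, size_t size, FILE *stream)
{
    assert(buf != NULL && size > 0 && stream != NULL);
    if (fgets(buf, (int)size, stream) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if present */
    return buf;
}
```

Unlike gets, an over-long input line is truncated to fit the buffer, and the remainder stays in the stream for the next read.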
Integer promotion for register based arguments
Hi,

I've tried compiling the following program targeting MIPS, LM32 and ARM.

    long a, b;

    void func(short p)
    {
      b = (long)p;
    }

    int main()
    {
      if(a < 2)
        func((short)a);
      return 0;
    }

For MIPS and LM32, truncation is performed in the calling function and sign extension in the called function. One of these operations seems redundant. For ARM, truncation is performed in the caller, but sign-extension isn't performed in the callee, which seems more efficient. Why might this be?

- PROMOTE_MODE is defined for all targets such that HImode should be promoted.
- TARGET_PROMOTE_FUNCTION_MODE is also defined for all targets such that function arguments should be promoted.

Are there other target macros that control this?

Thanks,
Jon
RE: Integer promotion for register based arguments
Hi Andrew, > On 07/25/2012 12:15 PM, Jon Beniston wrote: > > For MIPS and LM32, truncation is performed in the calling function and > > sign extension in the called function. One of these operations seems > > redundant. For ARM, truncation is performed in the caller, but > > sign-extension isn't performed in the callee, which seems more > > efficient. Why might this be? > > This is defined by the system ABI, which specifies when zero- or sign- > extension get done. The ARM ABI explicitly requires a caller to extend types > appropriately before they are passed, and a callee can depend on that. We > in GCC have to follow the rules, and we can take advantage of them. > > I suspect the answer to your question will be found in the ABIs of the MIPS > and LM32, but I'm not familiar with either of those. In the LM32 case, this is something that was overlooked, so it isn't that way because that's how it is required. I guess my question is what would I need to change to make it work like the ARM port? I can't see how this is being controlled. Thanks, Jon
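The redundancy being discussed can be modelled in portable C. This is only a sketch of the arithmetic, not of what any particular GCC port emits: `truncate_to_16` and `sign_extend_16` are hypothetical names standing in for the caller-side and callee-side operations on a 32-bit register value.

```c
#include <assert.h>
#include <stdint.h>

/* Caller-side step (model): keep only the low 16 bits of a register value. */
uint32_t truncate_to_16(uint32_t reg)
{
    return reg & 0xFFFFu;
}

/* Sign extension (model): interpret the low 16 bits of reg as a signed
   short and widen it to 32 bits. Flipping the sign bit and subtracting
   0x8000 avoids any implementation-defined narrowing conversion. */
int32_t sign_extend_16(uint32_t reg)
{
    uint32_t low = reg & 0xFFFFu;
    return (int32_t)(low ^ 0x8000u) - 0x8000;
}
```

If the ABI guarantees that the caller already passes a properly extended value, applying `sign_extend_16` again in the callee changes nothing, which is why a port following such an ABI can omit it there.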
RE: Integer promotion for register based arguments
Hi Eric, > > I guess my question is what would I need to change to make it work > > like the ARM port? I can't see how this is being controlled. > > Try TARGET_PROMOTE_PROTOTYPES. For all 3 targets I believe this returns true (Both MIPS and LM32 use hook_bool_const_tree_true), so I presume it must be something else. Regards, Jon
RE: Identifying Compiler Options to Minimize Energy Consumption by Embedded Programs
Hi James, > - Which set of benchmarks are suitable for embedded applications and representative of possible applications? Have a look at CoreMark: http://www.coremark.org/ EEMBC also have EnergyBench: http://www.eembc.org/benchmark/power_sl.php although I think that might be commercial, but it may give you some ideas. Regards, Jon
RE: Integer promotion for register based arguments
Hi Eric,

> > I guess my question is what would I need to change to make it work
> > like the ARM port? I can't see how this is being controlled.
>
> Try TARGET_PROMOTE_PROTOTYPES.

Thanks, actually it does turn out to be this, but I was confused by the documentation. If this returns true, I see sign extension performed in the callee; if false, no sign extension is performed in the callee.

The documentation for this comes under the "Passing Function Arguments on the Stack" section, which says:

"This target hook returns true if an argument declared in a prototype as an integral type smaller than int should actually be passed as an int. In addition to avoiding errors in certain cases of mismatch, it also makes for better code on certain machines."

I would have thought that if args smaller than an int are actually passed as an int, the promotion had already taken place and so wasn't needed in the callee. It could also be said that it makes worse code on other machines :)

Thanks,
Jon
Double word left shift optimisation
Hi,

I'd like to try to optimise double word left shifts of sign/zero extended operands if a widening multiply instruction is available. For the following code:

    long long f(long a, long b)
    {
      return (long long)a << b;
    }

ARM, MIPS etc expand to a fairly long sequence like:

    nor   $3,$0,$5
    sra   $2,$4,31
    srl   $7,$4,1
    srl   $7,$7,$3
    sll   $2,$2,$5
    andi  $6,$5,0x20
    sll   $3,$4,$5
    or    $2,$7,$2
    movn  $2,$3,$6
    movn  $3,$0,$6

I'd like to optimise this to something like:

    (long long) a * (1 << b)

Which should just be 3 or so instructions. I don't think this can be sensibly done in the target backend, as the generated pattern is too complicated to match, and I am not familiar with the middle end. Any suggestions as to where and how this should be best implemented?

Thanks,
Jon
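The identity behind the proposed rewrite can be checked in portable C. This sketch assumes a 32-bit operand and a shift count in the range 0 to 31; counts of 32 and above are not covered by the multiply form shown here. The function names are illustrative only, and the reference version relies on the usual two's-complement wrap when converting back to a signed 64-bit value (implementation-defined in ISO C, but what GCC does).

```c
#include <assert.h>
#include <stdint.h>

/* Proposed form: widen a, then multiply by 2^b.
   1u << b is computed unsigned so that b == 31 does not overflow. */
int64_t shift_via_widening_multiply(int32_t a, unsigned b)
{
    assert(b < 32);
    return (int64_t)a * (int64_t)(UINT32_C(1) << b);
}

/* Reference form: the double word shift, done on the unsigned bit pattern
   so that negative values of a do not trigger undefined behaviour. */
int64_t shift_reference(int32_t a, unsigned b)
{
    assert(b < 32);
    return (int64_t)((uint64_t)(int64_t)a << b);
}
```

The largest product, INT32_MIN times 2^31, is -2^62 and still fits in 64 bits, so the multiply form cannot overflow within this range.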
RE: Double word left shift optimisation
> This is interesting. I've quickly tried it out on the SH port. It can be > accomplished with the combine pass, although there are a few things that > should be taken care of: > - an "extendsidi2" pattern is required (so that the extension is not > performed before expand) > ... > One potential pitfall might be the handling of a real "reg:DI << reg:DI" > if there are no patterns already there that handle it (as it is the case for > the > SH port). If I observed correctly, the "ashldi3" expander must not FAIL for a > "reg:DI << reg:DI" (to do a lib call), or else combine would not arrive at the > pattern above. > > Hope this helps. Thanks Oleg, these were the bits I hadn't figured out. Cheers, Jon
Array alignment difference on stack
Hi,

With a port of GCC 4.2.1 I'm working on, get_pointer_alignment() (via DECL_ALIGN) returns different values for a char array depending upon whether it is on the stack or not.

For example, if the array is a global, get_pointer_alignment() always returns 32, provided there are more than 4 elements in the array. This is good.

However, when the array is a local, get_pointer_alignment() returns either 8, 16, or 32, depending upon the size of the array. For example, if the array has 11 elements, get_pointer_alignment() will return 8; if it has 16 elements, it will return 32.

This has the knock-on consequence that inline memcpys aren't as efficient as they could be (i.e. I end up with byte-by-byte copies instead of a word at a time) for arrays with an odd number of elements.

How can I ensure alignment when allocating on the stack? The DATA_ALIGNMENT macro doesn't seem to be having the desired effect in this case. (I don't even see varasm.c:align_variable being called, so it's not being used.)

Cheers,
Jon
Re: Git and GCC
On 12/6/07, Daniel Berlin <[EMAIL PROTECTED]> wrote:
> > While you won't get the git svn metadata if you clone the infradead
> > repo, it can be recreated on the fly by git svn if you want to start
> > commiting directly to gcc svn.
> >
> I will give this a try :)

Back when I was working on the Mozilla repository we were able to convert the full 4GB CVS repository, complete with all history, into a 450MB pack file. That work is where the git-fast-import tool came from. But it took a month of messing with the import tools to achieve this, and Mozilla still chose another VCS (mainly because of poor Windows support in git).

Like Linus says, this type of command will yield the smallest pack file:

    git repack -a -d --depth=250 --window=250

I do agree that importing multi-gigabyte repositories is not a daily occurrence nor a turn-key operation. There are significant issues when translating from one VCS to another. The lack of global branch tracking in CVS causes extreme problems on import. Hand editing of CVS files also caused endless trouble.

The key to converting repositories of this size is RAM. 4GB minimum, more would be better. git-repack is not multi-threaded. There were a few attempts at making it multi-threaded but none were too successful. If I remember right, with loads of RAM, a repack on a 450MB repository was taking about five hours on a 2.8GHz Core2. But this is something you only have to do once for the import. Later repacks will reuse the original deltas.

--
Jon Smirl
[EMAIL PROTECTED]
Re: Git and GCC
On 12/6/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 6 Dec 2007, Jeff King wrote:
> >
> > What is really disappointing is that we saved only about 20% of the
> > time. I didn't sit around watching the stages, but my guess is that we
> > spent a long time in the single threaded "writing objects" stage with a
> > thrashing delta cache.
>
> I don't think you spent all that much time writing the objects. That part
> isn't very intensive, it's mostly about the IO.
>
> I suspect you may simply be dominated by memory-throughput issues. The
> delta matching doesn't cache all that well, and using two or more cores
> isn't going to help all that much if they are largely waiting for memory
> (and quite possibly also perhaps fighting each other for a shared cache?
> Is this a Core 2 with the shared L2?)

When I last looked at the code, the problem was in evenly dividing the work. I was using a four core machine and most of the time one core would end up with 3-5x the work of the lightest loaded core. Setting pack.threads up to 20 fixed the problem. With a high number of threads I was able to get a 4hr pack to finish in something like 1:15.

A scheme where each core could work for a minute without communicating with the other cores would be best. It would also be more efficient if the cores could avoid having sync points between them.

--
Jon Smirl
[EMAIL PROTECTED]
Re: Git and GCC
On Thu, 2007-12-06 at 00:09, Linus Torvalds wrote:
> Git also does delta-chains, but it does them a lot more "loosely". There
> is no fixed entity. Delta's are generated against any random other version
> that git deems to be a good delta candidate (with various fairly
> successful heursitics), and there are absolutely no hard grouping rules.

I'd like to learn more about that. Can someone point me to more documentation on it or, in the absence of that, to the source code that implements it?

One question I'd pose: would it be more accurate to think of this as a "delta net" in a weighted graph rather than a "delta chain"?

Thanks,
jdl
Re: Git and GCC
On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote:
> > When I lasted looked at the code, the problem was in evenly dividing
> > the work. I was using a four core machine and most of the time one
> > core would end up with 3-5x the work of the lightest loaded core.
> > Setting pack.threads up to 20 fixed the problem. With a high number of
> > threads I was able to get a 4hr pack to finished in something like
> > 1:15.
>
> But as far as I know you didn't try my latest incarnation which has been
> available in Git's master branch for a few months already.

I've deleted all my giant packs. Using the kernel pack:
4GB Q6600

Using the current thread pack code I get these results.

The interesting case is the last one. I set it to 15 threads and monitored with 'top'. For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and 74-100% was 100% CPU. It never used all four cores. The only other things running were top and my desktop. This is the same load balancing problem I observed earlier. Much more clock time was spent in the 2/1 core phases than the 3 core one.

Threaded, threads = 5

[EMAIL PROTECTED]:/home/linux$ time git repack -a -d -f
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528994), reused 0 (delta 0)

real    1m31.395s
user    2m59.239s
sys     0m3.048s
[EMAIL PROTECTED]:/home/linux$

12 seconds counting
53 seconds compressing
38 seconds writing

Without threads,

[EMAIL PROTECTED]:/home/linux$ time git repack -a -d -f
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 528999), reused 0 (delta 0)

real    2m54.849s
user    2m51.267s
sys     0m1.412s
[EMAIL PROTECTED]:/home/linux$

Threaded, threads = 5

[EMAIL PROTECTED]:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    9m18.032s
user    19m7.484s
sys     0m3.880s
[EMAIL PROTECTED]:/home/linux$
[EMAIL PROTECTED]:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl  15561848 2007-12-06 16:15 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 16:15 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
[EMAIL PROTECTED]:/home/linux/.git/objects/pack$

Non-threaded:

[EMAIL PROTECTED]:/home/linux$ time git repack -a -d -f --depth=250 --window=250
warning: no threads support, ignoring pack.threads
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    18m51.183s
user    18m46.538s
sys     0m1.604s
[EMAIL PROTECTED]:/home/linux$
[EMAIL PROTECTED]:/home/linux/.git/objects/pack$ ls -l
total 182156
-r--r--r-- 1 jonsmirl jonsmirl  15561848 2007-12-06 15:33 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.idx
-r--r--r-- 1 jonsmirl jonsmirl 170768761 2007-12-06 15:33 pack-f1f8637d2c68eb1c964ec7c1877196c0c7513412.pack
[EMAIL PROTECTED]:/home/linux/.git/objects/pack$

Threaded, threads = 15

[EMAIL PROTECTED]:/home/linux$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539080), reused 0 (delta 0)

real    9m18.325s
user    19m14.340s
sys     0m3.996s
[EMAIL PROTECTED]:/home/linux$

--
Jon Smirl
[EMAIL PROTECTED]
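The timings in this message can be turned into speedup figures with a little arithmetic. The helper names below are made up for this illustration; the inputs are the real/user values printed by time in the runs above. By this calculation the threaded runs are roughly 1.9x (default window) and 2.0x (window 250) faster in wall-clock time, and the window-250 threaded run keeps only about two cores busy on average, which fits the load-balancing observation.

```c
#include <assert.h>

/* Convert a "XmY.YYYs" reading from time(1) into seconds. */
double to_seconds(int minutes, double seconds)
{
    assert(minutes >= 0 && seconds >= 0.0);
    return minutes * 60.0 + seconds;
}

/* Wall-clock speedup of the threaded run over the unthreaded one. */
double speedup(double unthreaded_real_s, double threaded_real_s)
{
    assert(threaded_real_s > 0.0);
    return unthreaded_real_s / threaded_real_s;
}

/* Average number of busy cores: user CPU time over wall-clock time. */
double avg_busy_cores(double user_s, double real_s)
{
    assert(real_s > 0.0);
    return user_s / real_s;
}
```

For example, `speedup(to_seconds(18, 51.183), to_seconds(9, 18.032))` gives the window-250 figure.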
Re: Git and GCC
On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote: > On Thu, 6 Dec 2007, Jon Smirl wrote: > > > On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote: > > > > When I lasted looked at the code, the problem was in evenly dividing > > > > the work. I was using a four core machine and most of the time one > > > > core would end up with 3-5x the work of the lightest loaded core. > > > > Setting pack.threads up to 20 fixed the problem. With a high number of > > > > threads I was able to get a 4hr pack to finished in something like > > > > 1:15. > > > > > > But as far as I know you didn't try my latest incarnation which has been > > > available in Git's master branch for a few months already. > > > > I've deleted all my giant packs. Using the kernel pack: > > 4GB Q6600 > > > > Using the current thread pack code I get these results. > > > > The interesting case is the last one. I set it to 15 threads and > > monitored with 'top'. > > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and > > 74-100% was 100% CPU. It never used all for cores. The only other > > things running were top and my desktop. This is the same load > > balancing problem I observed earlier. > > Well, that's possible with a window 25 times larger than the default. Why did it never use more than three cores? > > The load balancing is solved with a master thread serving relatively > small object list segments to any work thread that finished with its > previous segment. But the size for those segments is currently fixed to > window * 1000 which is way too large when window == 250. > > I have to find a way to auto-tune that segment size somehow. > > But with the default window size there should not be any such noticeable > load balancing problem. > > Note that threading only happens in the compression phase. The count > and write phase are hardly paralleled. > > > Nicolas > -- Jon Smirl [EMAIL PROTECTED]
Re: Git and GCC
On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote:
> On Thu, 6 Dec 2007, Jon Smirl wrote:
>
> > On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote:
> > > > When I lasted looked at the code, the problem was in evenly dividing
> > > > the work. I was using a four core machine and most of the time one
> > > > core would end up with 3-5x the work of the lightest loaded core.
> > > > Setting pack.threads up to 20 fixed the problem. With a high number of
> > > > threads I was able to get a 4hr pack to finished in something like
> > > > 1:15.
> > >
> > > But as far as I know you didn't try my latest incarnation which has been
> > > available in Git's master branch for a few months already.
> >
> > I've deleted all my giant packs. Using the kernel pack:
> > 4GB Q6600
> >
> > Using the current thread pack code I get these results.
> >
> > The interesting case is the last one. I set it to 15 threads and
> > monitored with 'top'.
> > For 0-60% compression I was at 300% CPU, 60-74% was 200% CPU and
> > 74-100% was 100% CPU. It never used all for cores. The only other
> > things running were top and my desktop. This is the same load
> > balancing problem I observed earlier.
>
> Well, that's possible with a window 25 times larger than the default.
>
> The load balancing is solved with a master thread serving relatively
> small object list segments to any work thread that finished with its
> previous segment. But the size for those segments is currently fixed to
> window * 1000 which is way too large when window == 250.
>
> I have to find a way to auto-tune that segment size somehow.

That would be nice. Threading is most important on the giant pack/window combinations. The normal case is fast enough that I don't really notice it. These giant pack/window combos can run 8-10 hours.

>
> But with the default window size there should not be any such noticeable
> load balancing problem.

I only spend 30 seconds in the compression phase without making the window larger. It's not long enough to really see what is going on.

>
> Note that threading only happens in the compression phase. The count
> and write phase are hardly paralleled.
>
>
> Nicolas
>

--
Jon Smirl
[EMAIL PROTECTED]
Re: Git and GCC
On 12/6/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote:
> > > Well, that's possible with a window 25 times larger than the default.
> >
> > Why did it never use more than three cores?
>
> You have 648366 objects total, and only 647457 of them are subject to
> delta compression.
>
> With a window size of 250 and a default thread segment of window * 1000
> that means only 3 segments will be distributed to threads, hence only 3
> threads with work to do.

One little tweak and the clock time drops from 9.5 to 6 minutes. The tweak makes all four cores work.

[EMAIL PROTECTED]:/home/apps/git$ git diff
diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index 4f44658..e0dd12e 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -1645,7 +1645,7 @@ static void ll_find_deltas(struct object_entry **list, unsigned list_size,
 	}

 	/* this should be auto-tuned somehow */
-	chunk_size = window * 1000;
+	chunk_size = window * 50;

 	do {
 		unsigned sublist_size = chunk_size;

[EMAIL PROTECTED]:/home/linux/.git$ time git repack -a -d -f --depth=250 --window=250
Counting objects: 648366, done.
Compressing objects: 100% (647457/647457), done.
Writing objects: 100% (648366/648366), done.
Total 648366 (delta 539043), reused 0 (delta 0)

real    6m2.109s
user    20m0.491s
sys     0m4.608s
[EMAIL PROTECTED]:/home/linux/.git$

>
>
> Nicolas
>

--
Jon Smirl
[EMAIL PROTECTED]
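The segment arithmetic in Nicolas's explanation can be reproduced with a ceiling division. This is a simplified model, not the actual git code: it assumes the object list is cut into fixed-size chunks of `window * factor` entries, and the function name `count_segments` is hypothetical.

```c
#include <assert.h>

/* Number of work segments handed out to threads when a list of `objects`
   entries is cut into chunks of window * factor entries (simplified model). */
unsigned count_segments(unsigned objects, unsigned window, unsigned factor)
{
    unsigned chunk_size = window * factor;
    assert(chunk_size > 0);
    return (objects + chunk_size - 1) / chunk_size;   /* ceiling division */
}
```

With 647457 objects and window 250, a factor of 1000 yields 3 segments, so at most three threads ever have work, while the factor of 50 from the patch yields 52 segments, enough to keep four cores supplied.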
Re: Git and GCC
On 12/7/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 6 Dec 2007, Jon Smirl wrote:
> > >
> > > time git blame -C gcc/regclass.c > /dev/null
> >
> > [EMAIL PROTECTED]:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null
> >
> > real    1m21.967s
> > user    1m21.329s
>
> Well, I was also hoping for a "compared to not-so-aggressive packing"
> number on the same machine.. IOW, what I was wondering is whether there is
> a visible performance downside to the deeper delta chains in the 300MB
> pack vs the (less aggressive) 500MB pack.

Same machine with a default pack

[EMAIL PROTECTED]:/video/gcc/.git/objects/pack$ ls -l
total 2145716
-r--r--r-- 1 jonsmirl jonsmirl   23667932 2007-12-07 02:03 pack-bd163555ea9240a7fdd07d2708a293872665f48b.idx
-r--r--r-- 1 jonsmirl jonsmirl 2171385413 2007-12-07 02:03 pack-bd163555ea9240a7fdd07d2708a293872665f48b.pack
[EMAIL PROTECTED]:/video/gcc/.git/objects/pack$

Delta lengths have virtually no impact. The bigger pack file causes more IO which offsets the increased delta processing time.

One of my rules is smaller is almost always better. Smaller eliminates IO and helps with the CPU cache. It's like the kernel being optimized for size instead of speed ending up being faster.

time git blame -C gcc/regclass.c > /dev/null

real    1m19.289s
user    1m17.853s
sys     0m0.952s

>
> Linus
>

--
Jon Smirl
[EMAIL PROTECTED]
Re: Git and GCC
On 12/7/07, Jeff King <[EMAIL PROTECTED]> wrote:
> On Thu, Dec 06, 2007 at 07:31:21PM -0800, David Miller wrote:
>
> > > So it is about 5% bigger. What is really disappointing is that we saved
> > > only about 20% of the time. I didn't sit around watching the stages, but
> > > my guess is that we spent a long time in the single threaded "writing
> > > objects" stage with a thrashing delta cache.
> >
> > If someone can give me a good way to run this test case I can
> > have my 64-cpu Niagara-2 box crunch on this and see how fast
> > it goes and how much larger the resulting pack file is.
>
> That would be fun to see. The procedure I am using is this:
>
> # compile recent git master with threaded delta
> cd git
> echo THREADED_DELTA_SEARCH = 1 >>config.mak
> make install
>
> # get the gcc pack
> mkdir gcc && cd gcc
> git --bare init
> git config remote.gcc.url git://git.infradead.org/gcc.git
> git config remote.gcc.fetch \
>   '+refs/remotes/gcc.gnu.org/*:refs/remotes/gcc.gnu.org/*'
> git remote update
>
> # make a copy, so we can run further tests from a known point
> cd ..
> cp -a gcc test
>
> # and test multithreaded large depth/window repacking
> cd test
> git config pack.threads 4

Use 64 threads with 64 CPUs; if they are multicore you want even more. You also need to adjust chunk_size as mentioned in the other mail.

> time git repack -a -d -f --window=250 --depth=250
>
> -Peff
>

--
Jon Smirl
[EMAIL PROTECTED]
Re: Git and GCC
On 12/6/07, Linus Torvalds <[EMAIL PROTECTED]> wrote:
>
>
> On Thu, 6 Dec 2007, Harvey Harrison wrote:
> >
> > I've updated the public mirror repo with the very-packed version.
>
> Side note: it might be interesting to compare timings for
> history-intensive stuff with and without this kind of very-packed
> situation.
>
> The very density of a smaller pack-file might be enough to overcome the
> downsides (more CPU time to apply longer delta-chains), but regardless,
> real numbers talks, bullshit walks. So wouldn't it be nice to have real
> numbers?
>
> One easy way to get real numbers for history would be to just time some
> reasonably costly operation that uses lots of history. Ie just do a
>
>     time git blame -C gcc/regclass.c > /dev/null
>
> and see if the deeper delta chains are very expensive.

[EMAIL PROTECTED]:/video/gcc$ time git blame -C gcc/regclass.c > /dev/null

real    1m21.967s
user    1m21.329s
sys     0m0.640s

The Mozilla repo is at least 50% larger than the gcc one. It took me 23 minutes to repack the gcc one on my $800 Dell. The trick to this is lots of RAM and 64-bit. There is little disk IO during the compression phase, everything is cached.

I have a 4.8GB git process with 4GB of physical memory. Everything started slowing down a lot when the process got that big. Does git really need 4.8GB to repack? I could only keep 3.4GB resident. Luckily this happened at 95% completion. With 8GB of memory you should be able to do this repack in under 20 minutes.

[EMAIL PROTECTED]:/video/gcc$ time git repack -a -d -f --depth=250 --window=250

real    22m54.380s
user    69m18.948s
sys     0m23.773s

> (Yeah, the above is pretty much designed to be the worst possible case for
> this kind of aggressive history packing, but I don't know if that choice
> of file to try to annotate is a good choice or not. I suspect that "git
> blame -C" with a CVS import is just horrid, because CVS commits tend to be
> pretty big and nasty and not as localized as we've tried to make things in
> the kernel, so doing the code copy detection is probably horrendously
> expensive)
>
> Linus
> -
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to [EMAIL PROTECTED]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

--
Jon Smirl
[EMAIL PROTECTED]
Re: Something is broken in repack
I added the gcc people to the CC, it's their repository. Maybe they can help us sort this out.

On 12/11/07, Jon Smirl <[EMAIL PROTECTED]> wrote:
> On 12/10/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote:
> > On Mon, 10 Dec 2007, Jon Smirl wrote:
> >
> > > New run using same configuration. With the addition of the more
> > > efficient load balancing patches and delta cache accounting.
> > >
> > > Seconds are wall clock time. They are lower since the patch made
> > > threading better at using all four cores. I am stuck at 380-390% CPU
> > > utilization for the git process.
> > >
> > > complete seconds RAM
> > > 10%   60    900M (includes counting)
> > > 20%   15    900M
> > > 30%   15    900M
> > > 40%   50    1.2G
> > > 50%   80    1.3G
> > > 60%   70    1.7G
> > > 70%   140   1.8G
> > > 80%   180   2.0G
> > > 90%   280   2.2G
> > > 95%   530   2.8G - 1,420 total to here, previous was 1,983
> > > 100%  1390  2.85G
> > > During the writing phase RAM fell to 1.6G
> > > What is being freed in the writing phase??
> >
> > The cached delta results, but you put a cap of 256MB for them.
> >
> > Could you try again with that cache disabled entirely, with
> > pack.deltacachesize = 1 (don't use 0 as that means unbounded).
> >
> > And then, while still keeping the delta cache disabled, could you try
> > with pack.threads = 2, and pack.threads = 1 ?
> >
> > I'm sorry to ask you to do this but I don't have enough ram to even
> > complete a repack with threads=2 so I'm reattempting single threaded at
> > the moment. But I really wonder if the threading has such an effect on
> > memory usage.
>
> I already have a threads = 1 running with this config. Binary and
> config were same from threads=4 run.
>
> 10%   28min   950M
> 40%   135min  950M
> 50%   157min  900M
> 60%   160min  830M
> 100%  170min  830M
>
> Something is hurting bad with threads. 170 CPU minutes with one
> thread, versus 195 CPU minutes with four threads.
>
> Is there a different memory allocator that can be used when
> multithreaded on gcc? This whole problem may be coming from the memory
> allocation function. git is hardly interacting at all on the thread
> level so it's likely a problem in the C run-time.
>
> [core]
>         repositoryformatversion = 0
>         filemode = true
>         bare = false
>         logallrefupdates = true
> [pack]
>         threads = 1
>         deltacachesize = 256M
>         windowmemory = 256M
>         deltacachelimit = 0
> [remote "origin"]
>         url = git://git.infradead.org/gcc.git
>         fetch = +refs/heads/*:refs/remotes/origin/*
> [branch "trunk"]
>         remote = origin
>         merge = refs/heads/trunk
>
> > > I have no explanation for the change in RAM usage. Two guesses come to
> > > mind. Memory fragmentation. Or the change in the way the work was
> > > split up altered RAM usage.
> > >
> > > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > > efficient. During the compress phase all four cores were active until
> > > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > > bound on one core.
> > >
> > > New pack file is: 270,594,853
> > > Old one was: 344,543,752
> > > It still has 828,660 objects
> >
> > You mean the pack for the gcc repo is now less than 300MB? Wow.
> >
> >
> > Nicolas
> >
>
>
> --
> Jon Smirl
> [EMAIL PROTECTED]
>

--
Jon Smirl
[EMAIL PROTECTED]
Re: Something is broken in repack
Switching to the Google perftools malloc http://goog-perftools.sourceforge.net/ 10% 30 828M 20% 15 831M 30% 10 834M 40% 50 1014M 50% 80 1086M 60% 80 1500M 70% 200 1.53G 80% 200 1.85G 90% 260 1.87G 95% 520 1.97G 100% 1335 2.24G Google allocator knocked 600MB off from memory use. Memory consumption did not fall during the write out phase like it did with gcc. Since all of this is with the same code except for changing the threading split, those runs where memory consumption went to 4.5GB with the gcc allocator must have triggered an extreme problem with fragmentation. Total CPU time 196 CPU minutes vs 190 for gcc. Google's claims of being faster are not true. So why does our threaded code take 20 CPU minutes longer (12%) to run than the same code with a single thread? Clock time is obviously faster. Are the threads working too close to each other in memory and bouncing cache lines between the cores? Q6600 is just two E6600s in the same package, the caches are not shared. Why does the threaded code need 2.24GB (google allocator, 2.85GB gcc) with 4 threads? But only need 950MB with one thread? Where's the extra gigabyte going? Is there another allocator to try? One that combines Google's efficiency with gcc's speed? On 12/11/07, Jon Smirl <[EMAIL PROTECTED]> wrote: > I added the gcc people to the CC, it's their repository. Maybe they > can help up sort this out. > > On 12/11/07, Jon Smirl <[EMAIL PROTECTED]> wrote: > > On 12/10/07, Nicolas Pitre <[EMAIL PROTECTED]> wrote: > > > On Mon, 10 Dec 2007, Jon Smirl wrote: > > > > > > > New run using same configuration. With the addition of the more > > > > efficient load balancing patches and delta cache accounting. > > > > > > > > Seconds are wall clock time. They are lower since the patch made > > > > threading better at using all four cores. I am stuck at 380-390% CPU > > > > utilization for the git process. 
> > > >
> > > > complete seconds RAM
> > > > 10%       60     900M (includes counting)
> > > > 20%       15     900M
> > > > 30%       15     900M
> > > > 40%       50     1.2G
> > > > 50%       80     1.3G
> > > > 60%       70     1.7G
> > > > 70%      140     1.8G
> > > > 80%      180     2.0G
> > > > 90%      280     2.2G
> > > > 95%      530     2.8G - 1,420 total to here, previous was 1,983
> > > > 100%    1390     2.85G
> > > > During the writing phase RAM fell to 1.6G
> > > > What is being freed in the writing phase??
> > >
> > > The cached delta results, but you put a cap of 256MB for them.
> > >
> > > Could you try again with that cache disabled entirely, with
> > > pack.deltacachesize = 1 (don't use 0 as that means unbounded).
> > >
> > > And then, while still keeping the delta cache disabled, could you try
> > > with pack.threads = 2, and pack.threads = 1 ?
> > >
> > > I'm sorry to ask you to do this but I don't have enough ram to even
> > > complete a repack with threads=2 so I'm reattempting single threaded at
> > > the moment. But I really wonder if the threading has such an effect on
> > > memory usage.
> >
> > I already have a threads = 1 running with this config. Binary and
> > config were same from threads=4 run.
> >
> > 10%   28min  950M
> > 40%  135min  950M
> > 50%  157min  900M
> > 60%  160min  830M
> > 100% 170min  830M
> >
> > Something is hurting bad with threads. 170 CPU minutes with one
> > thread, versus 195 CPU minutes with four threads.
> >
> > Is there a different memory allocator that can be used when
> > multithreaded on gcc? This whole problem may be coming from the memory
> > allocation function. git is hardly interacting at all on the thread
> > level so it's likely a problem in the C run-time.
> >
> > [core]
> > repositoryformatversion = 0
> > filemode = true
> > bare = false
> > logallrefupdates = true
> > [pack]
> > threads = 1
> > deltacachesize = 256M
> > windowmemory = 256M
> > deltacachelimit = 0
> > [remote "origin"]
> > url = git://git.infradead.org/gcc.git
> > fetch = +refs/heads/*:refs/remotes/origin/*
> > [branch "trunk"]
> > remote = origin
> > merge = refs/heads/trunk
> >
> > > > I have no explanation for the change in RAM usage. Two guesses come to
> > > > mind. Memory fragmentation. Or the change in the way the work was
> > > > split up altered RAM usage.
> > > >
> > > > Total CPU time was 195 minutes in 70 minutes clock time. About 70%
> > > > efficient. During the compress phase all four cores were active until
> > > > the last 90 seconds. Writing the objects took over 23 minutes CPU
> > > > bound on one core.
> > > >
> > > > New pack file is: 270,594,853
> > > > Old one was: 344,543,752
> > > > It still has 828,660 objects
> > >
> > > You mean the pack for the gcc repo is now less than 300MB? Wow.
> > >
> > >
> > > Nicolas
> > >
> >
> > --
> > Jon Smirl
> > [EMAIL PROTECTED]
> >
>
> --
> Jon Smirl
> [EMAIL PROTECTED]
>

--
Jon Smirl
[EMAIL PROTECTED]
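
The percentages quoted in this thread can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, and the inputs are the figures reported in the messages above (170 and 190 CPU minutes, 195 CPU minutes in 70 minutes of wall clock on four cores, and 2.85G versus 2.24G peak RAM, with 950M for the single-threaded run).

```python
# Figures as reported in the thread; all names here are hypothetical.
SINGLE_THREAD_CPU_MIN = 170     # threads = 1 run
FOUR_THREAD_CPU_MIN = 190       # threads = 4 figure cited in the reply
REPORTED_CPU_MIN = 195          # CPU minutes from the earlier 4-thread run
REPORTED_WALL_MIN = 70          # wall-clock minutes for that same run
CORES = 4

PEAK_MB_GCC_MALLOC = 2850       # 2.85G with the default allocator
PEAK_MB_GOOGLE_MALLOC = 2240    # 2.24G with the Google perftools malloc
PEAK_MB_SINGLE_THREAD = 950     # 950M with threads = 1


def overhead_pct(threaded_cpu_min, single_cpu_min):
    """Extra CPU time of the threaded run, as a percentage of the single-threaded run."""
    return 100.0 * (threaded_cpu_min - single_cpu_min) / single_cpu_min


def core_utilisation_pct(cpu_min, wall_min, cores):
    """Share of the available core-minutes that were actually spent on CPU."""
    return 100.0 * cpu_min / (wall_min * cores)


if __name__ == "__main__":
    print("threading overhead %:",
          round(overhead_pct(FOUR_THREAD_CPU_MIN, SINGLE_THREAD_CPU_MIN), 1))
    print("core utilisation %:",
          round(core_utilisation_pct(REPORTED_CPU_MIN, REPORTED_WALL_MIN, CORES), 1))
    print("allocator saving (MB):",
          PEAK_MB_GCC_MALLOC - PEAK_MB_GOOGLE_MALLOC)
    print("extra RAM with 4 threads (MB):",
          PEAK_MB_GOOGLE_MALLOC - PEAK_MB_SINGLE_THREAD)
```

These match the rounded values in the thread: roughly 12% extra CPU time, about 70% utilisation of four cores, about 600MB saved by the allocator switch, and a bit over a gigabyte of extra RAM when running four threads.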