Re: Install gcj with gcc5 on Ubuntu
Hello,

I have an Ubuntu system. I installed gcc 5 to be able to install gcj (because gcj is no longer distributed with the new versions of gcc). When I run apt-cache search gcj I get this:

gcj-5-jdk - GCJ and Classpath development tools for Java(TM)

I then tried to install it by running apt-get install gcj-5-jdk, and I got these errors (translated from French; they mean that some of the packages listed below could not be installed):

sudo apt-get install gcj-5-jdk gcj-5-jdk:i386 (5.3.1-14ubuntu2)
You may run "apt --fix-broken install" to correct these problems.
The following packages have unmet dependencies:
 gcj-5-jdk:i386 :
  Depends: gcc-5-base:i386 (= 5.3.1-14ubuntu2) but it is not going to be installed
  Depends: gcc-5:i386 (>= 5.2.1-23) but it is not going to be installed
  Depends: libc6-dev:i386 (>= 2.13-0ubuntu6) but it is not going to be installed
  Depends: gcj-5:i386 (= 5.3.1-14ubuntu2) but it is not installable
  Depends: gcj-5-jre:i386 (= 5.3.1-14ubuntu2) but it is not installable
  Depends: libgcj16-dev:i386 (>= 5.3.1-14ubuntu2) but it is not installable
  Depends: fastjar:i386
  Depends: libgcj-bc:i386 but it is not installable
  Depends: libantlr-java:i386 but it is not installable
  Depends: libc6:i386 (>= 2.4) but it is not going to be installed
  Depends: libgcj16:i386 (>= 5) but it is not installable
  Depends: zlib1g:i386 (>= 1:1.1.4) but it is not going to be installed
  Recommends: libecj-java-gcj:i386 but it is not installable

On Monday, July 1, 2019 at 10:21:28 AM UTC+2, Anastasios Lisgaras via gcc-help wrote:

Hello Charfi Asma,

What operating system (even - distribution of GNU/Linux) do you work with? Generally, however, on GNU/Linux run:

apt-cache search gcj

or

apt search gcj

to find the available packages for your distribution. Try it. I hope I helped you.

On 7/1/19 7:20 AM, charfi asma via gcc-help wrote:
> Hello,
> Can you give me installation instructions to install gcj after installing
> gcc5 or gcc6? Apt-get install gcj did not work. Thank you very much
> Sent from Yahoo Mail for Android

--
Kind regards,
Anastasios Lisgaras
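[Editor's note: a possible direction for the unmet i386 dependencies shown above — a hedged sketch, not from the thread. It assumes a 64-bit host where the i386 foreign architecture is not enabled or the package index is stale; verify against your own system before running.]

```shell
# The failing dependencies are all :i386 (32-bit) packages. On a 64-bit
# Ubuntu, apt can only resolve them if the i386 foreign architecture is
# enabled and the index is current.
sudo dpkg --add-architecture i386
sudo apt-get update

# Repair the partially-broken state apt complained about, then retry.
sudo apt --fix-broken install
sudo apt-get install gcj-5-jdk
```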
Re: [PATCH] Add .gnu.lto_.lto section.
Hi.

Ok, so there's a version with added ChangeLog that survives regression tests.

Ready to be installed?
Thanks,
Martin

>From e6745583dc4b7f5543878c0a25498e818531f73e Mon Sep 17 00:00:00 2001
From: Martin Liska
Date: Fri, 21 Jun 2019 12:14:04 +0200
Subject: [PATCH 1/2] Add .gnu.lto_.lto section.

gcc/ChangeLog:

2019-07-01  Martin Liska

	* lto-section-in.c (lto_get_section_data): Add "lto" section.
	* lto-section-out.c (lto_destroy_simple_output_block): Never
	compress LTO_section_lto section.
	* lto-streamer-out.c (produce_asm): Do not set major_version
	and minor_version.
	(lto_output_toplevel_asms): Likewise.
	(produce_lto_section): New function.
	(lto_output): Call produce_lto_section.
	(lto_write_mode_table): Do not set major_version and
	minor_version.
	(produce_asm_for_decls): Likewise.
	* lto-streamer.h (enum lto_section_type): Add LTO_section_lto type.
	(struct lto_header): Remove.
	(struct lto_section): New struct.
	(struct lto_simple_header): Do not inherit from lto_header.
	(struct lto_file_decl_data): Add lto_section_header field.

gcc/lto/ChangeLog:

2019-07-01  Martin Liska

	* lto-common.c: Read LTO section and verify header.
---
 gcc/lto-section-in.c   |  9 +++--
 gcc/lto-section-out.c  |  2 --
 gcc/lto-streamer-out.c | 40 +---
 gcc/lto-streamer.h     | 25 +
 gcc/lto/lto-common.c   | 15 +++
 5 files changed, 64 insertions(+), 27 deletions(-)

diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index 4cfc0cad4be..4e7d1181f23 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -52,10 +52,10 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "icf",
   "offload_table",
   "mode_table",
-  "hsa"
+  "hsa",
+  "lto"
 };

-
 /* Hooks so that the ipa passes can call into the lto front end to get
    sections.  */

@@ -146,7 +146,7 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
   /* WPA->ltrans streams are not compressed with exception of function
      bodies and variable initializers that has been verbatim copied from
      earlier compilations.  */
-  if (!flag_ltrans || decompress)
+  if ((!flag_ltrans || decompress) && section_type != LTO_section_lto)
     {
       /* Create a mapping header containing the underlying data and length,
	 and prepend this to the uncompression buffer.  The uncompressed data
@@ -167,9 +167,6 @@ lto_get_section_data (struct lto_file_decl_data *file_data,
       data = buffer.data + header_length;
     }

-  lto_check_version (((const lto_header *)data)->major_version,
-		     ((const lto_header *)data)->minor_version,
-		     file_data->file_name);
   return data;
 }

diff --git a/gcc/lto-section-out.c b/gcc/lto-section-out.c
index c91e58f0465..7ae102164ef 100644
--- a/gcc/lto-section-out.c
+++ b/gcc/lto-section-out.c
@@ -285,8 +285,6 @@ lto_destroy_simple_output_block (struct lto_simple_output_block *ob)
   /* Write the header which says how to decode the pieces of the
      t.  */
   memset (&header, 0, sizeof (struct lto_simple_header));
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
   header.main_size = ob->main_stream->total_size;

   lto_write_data (&header, sizeof header);

diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
index dc68429303c..7dee770aa11 100644
--- a/gcc/lto-streamer-out.c
+++ b/gcc/lto-streamer-out.c
@@ -1974,10 +1974,6 @@ produce_asm (struct output_block *ob, tree fn)
   /* The entire header is stream computed here.  */
   memset (&header, 0, sizeof (struct lto_function_header));

-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   if (section_type == LTO_section_function_body)
     header.cfg_size = ob->cfg_stream->total_size;
   header.main_size = ob->main_stream->total_size;
@@ -2270,10 +2266,6 @@ lto_output_toplevel_asms (void)
   /* The entire header stream is computed here.  */
   memset (&header, 0, sizeof (header));

-  /* Write the header.  */
-  header.major_version = LTO_major_version;
-  header.minor_version = LTO_minor_version;
-
   header.main_size = ob->main_stream->total_size;
   header.string_size = ob->string_stream->total_size;
   lto_write_data (&header, sizeof header);
@@ -2390,6 +2382,29 @@ prune_offload_funcs (void)
     DECL_PRESERVE_P (fn_decl) = 1;
 }

+/* Produce LTO section that contains global information
+   about LTO bytecode.  */
+
+static void
+produce_lto_section ()
+{
+  /* Stream LTO meta section.  */
+  output_block *ob = create_output_block (LTO_section_lto);
+
+  char *section_name = lto_get_section_name (LTO_section_lto, NULL, NULL);
+  lto_begin_section (section_name, false);
+  free (section_name);
+
+  lto_compression compression = ZLIB;
+
+  bool slim_object = flag_generate_lto && !flag_fat_lto_objects;
+  lto_section s
+    = { LTO_major_version, LTO_minor_version, slim_object, compression, 0 };
+  lto_write_data (&s, sizeof s);
+  lto_end_se
[PATCH 2/2] Add zstd support for LTO bytecode compression.
Hi.

This is updated version of the zstd patch that should handle all what Joseph pointed out. Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

>From 5d2006c9c4d481f4083d5a591327ee64847b0bf7 Mon Sep 17 00:00:00 2001
From: Martin Liska
Date: Thu, 20 Jun 2019 10:08:17 +0200
Subject: [PATCH 2/2] Add zstd support for LTO bytecode compression.

gcc/ChangeLog:

2019-07-01  Martin Liska

	* Makefile.in: Define ZSTD_LIB.
	* common.opt: Adjust compression level to support also zstd levels.
	* config.in: Regenerate.
	* configure: Likewise.
	* configure.ac: Add --with-zstd and --with-zstd-include options
	and detect ZSTD.
	* doc/install.texi: Mention zstd dependency.
	* gcc.c: Print supported LTO compression algorithms.
	* lto-compress.c (lto_normalized_zstd_level): Likewise.
	(lto_compression_zstd): Likewise.
	(lto_uncompression_zstd): Likewise.
	(lto_end_compression): Dispatch in between zlib and zstd.
	(lto_compression_zlib): Mark with ATTRIBUTE_UNUSED.
	(lto_uncompression_zlib): Make it static.
	* lto-compress.h (lto_end_uncompression): Fix GNU coding style.
	* lto-section-in.c (lto_get_section_data): Pass info
	about used compression.
	* lto-streamer-out.c: By default use zstd when possible.
	* timevar.def (TV_IPA_LTO_DECOMPRESS): Rename to decompression
	(TV_IPA_LTO_COMPRESS): Likewise for compression.
---
 gcc/Makefile.in        |   4 +-
 gcc/common.opt         |   4 +-
 gcc/config.in          |   6 ++
 gcc/configure          | 163 -
 gcc/configure.ac       |  66 +
 gcc/doc/install.texi   |   6 ++
 gcc/gcc.c              |   5 ++
 gcc/lto-compress.c     | 141 +--
 gcc/lto-compress.h     |   3 +-
 gcc/lto-section-in.c   |   2 +-
 gcc/lto-streamer-out.c |   4 +
 gcc/timevar.def        |   4 +-
 12 files changed, 378 insertions(+), 30 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index d9e0885b96b..597dc01328b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1065,7 +1065,7 @@ BUILD_LIBDEPS= $(BUILD_LIBIBERTY)
 LIBS = @LIBS@ libcommon.a $(CPPLIB) $(LIBINTL) $(LIBICONV) $(LIBBACKTRACE) \
	$(LIBIBERTY) $(LIBDECNUMBER) $(HOST_LIBS)
 BACKENDLIBS = $(ISLLIBS) $(GMPLIBS) $(PLUGINLIBS) $(HOST_LIBS) \
-	$(ZLIB)
+	$(ZLIB) $(ZSTD_LIB)
 # Any system libraries needed just for GNAT.
 SYSLIBS = @GNAT_LIBEXC@

@@ -1076,6 +1076,8 @@ GNATMAKE = @GNATMAKE@
 # Libs needed (at present) just for jcf-dump.
 LDEXP_LIB = @LDEXP_LIB@

+ZSTD_LIB = @ZSTD_LIB@
+
 # Likewise, for use in the tools that must run on this machine
 # even if we are cross-building GCC.
 BUILD_LIBS = $(BUILD_LIBIBERTY)

diff --git a/gcc/common.opt b/gcc/common.opt
index a1544d06824..3b71a36552b 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1888,8 +1888,8 @@ Specify the algorithm to partition symbols and vars at linktime.
 ; The initial value of -1 comes from Z_DEFAULT_COMPRESSION in zlib.h.
 flto-compression-level=
-Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 9)
--flto-compression-level=	Use zlib compression level for IL.
+Common Joined RejectNegative UInteger Var(flag_lto_compression_level) Init(-1) IntegerRange(0, 19)
+-flto-compression-level=	Use zlib/zstd compression level for IL.

 flto-odr-type-merging
 Common Ignore

diff --git a/gcc/config.in b/gcc/config.in
index a718ceaf3da..13fd7959dd7 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1926,6 +1926,12 @@
 #endif

+/* Define if you have a working <zstd.h> header file.  */
+#ifndef USED_FOR_TARGET
+#undef HAVE_ZSTD_H
+#endif
+
+
 /* Define if isl is in use.  */
 #ifndef USED_FOR_TARGET
 #undef HAVE_isl

diff --git a/gcc/configure b/gcc/configure
index 955e9ccc09b..8c9f7742ac7 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -782,6 +782,8 @@ manext
 LIBICONV_DEP
 LTLIBICONV
 LIBICONV
+ZSTD_LIB
+ZSTD_INCLUDE
 DL_LIB
 LDEXP_LIB
 EXTRA_GCC_LIBS
@@ -959,6 +961,9 @@ with_pkgversion
 with_bugurl
 enable_languages
 with_multilib_list
+with_zstd
+with_zstd_include
+with_zstd_lib
 enable_rpath
 with_libiconv_prefix
 enable_sjlj_exceptions
@@ -1783,6 +1788,12 @@ Optional Packages:
  --with-pkgversion=PKG   Use PKG in the version string in place of "GCC"
  --with-bugurl=URL       Direct users to URL to report a bug
  --with-multilib-list    select multilibs (AArch64, SH and x86-64 only)
+  --with-zstd=PATH        specify prefix directory for installed zstd library.
+                          Equivalent to --with-zstd-include=PATH/include plus
+                          --with-zstd-lib=PATH/lib
+  --with-zstd-include=PATH
+                          specify directory for installed zstd include files
+  --with-zstd-lib=PATH    specify directory for the installed zstd library
  --with-gnu-ld           assume the C compiler uses GNU ld default=no
  --with-libiconv-prefix[=DIR]  search for libiconv in DIR/include and DIR/lib
  --without-libiconv-prefix     don't search for libiconv in includ
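[Editor's note: to illustrate the knob this patch widens — -flto-compression-level now accepts 0-19 so one option can drive either zlib levels (0-9) or zstd levels (0-19). Below is a small stdlib-only sketch of what a compression level trades off, using Python's zlib as a stand-in; it is not GCC code.]

```python
import zlib

# Stand-in for redundant LTO bytecode: repeated, compressible text.
data = b"struct lto_file_decl_data *file_data;\n" * 1000

# Higher levels spend more compression time for (usually) smaller output.
sizes = {level: len(zlib.compress(data, level)) for level in (1, 6, 9)}

# On redundant input, the top level should not lose to the bottom one,
# and both should beat the uncompressed size by a wide margin.
assert sizes[9] <= sizes[1]
assert sizes[1] < len(data)
print(sizes[1], sizes[6], sizes[9])
```

zstd follows the same level-vs-size pattern over its wider 1-19 range, which is why a single widened option can cover both algorithms.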
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
[Adding gcc-patches] Richard, do you have any further comments or is the revised patch good to commit? Martin On 6/25/19 2:30 PM, Martin Sebor wrote: On 6/25/19 3:53 AM, Jonathan Wakely wrote: On 24/06/19 19:42 +0200, Richard Biener wrote: On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: On 6/24/19 6:11 AM, Richard Biener wrote: > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: >> >> On 6/21/19 6:06 AM, Richard Biener wrote: >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. >>> >>> Hum. I am worried about the difference of assignment vs. construction >>> in ::put() >>> >>> + bool ins = hash_entry::is_empty (*e); >>> + if (ins) >>> + { >>> + e->m_key = k; >>> + new ((void *) &e->m_value) Value (v); >>> + } >>> + else >>> + e->m_value = v; >>> >>> why not invoke the dtor on the old value and then the ctor again? >> >> It wouldn't work for self-assignment: >> >> Value &y = m.get_or_insert (key); >> m.put (key, y); >> >> The usual rule of thumb for writes into containers is to use >> construction when creating a new element and assignment when >> replacing the value of an existing element. >> >> Which reminds me that the hash containers, despite being copy- >> constructible (at least for POD types, they aren't for user- >> defined types), also aren't safe for assignment even for PODs. >> I opened bug 90959 for this. 
Until the assignment is fixed >> I made it inaccessible in the patch (I have fixed the copy >> ctor to DTRT for non-PODs). >> >>> How is an empty hash_entry constructed? >> >> In hash_table::find_slot_with_hash simply by finding an empty >> slot and returning a pointer to it. The memory for the slot >> is marked "empty" by calling the Traits::mark_empty() function. >> >> The full type of hash_map is actually >> >> hash_map<Key, Value, >> simple_hashmap_traits<default_hash_traits<Key>, >> Value>> >> >> and simple_hashmap_traits delegates it to default_hash_traits >> whose mark_empty() just clears the void*, leaving the Value >> part uninitialized. That makes sense because we don't want >> to call ctors for empty entries. I think the questions one >> might ask if one were to extend the design are: a) what class >> should invoke the ctor/assignment and b) should it do it >> directly or via the traits? >> >>> ::remove() doesn't >>> seem to invoke the dtor either, instead it relies on the >>> traits::remove function? >> >> Yes. There is no Traits::construct or assign or copy. We >> could add them but I'm not sure I see to what end (there could >> be use cases, I just don't know enough about these classes to >> think of any). >> >> Attached is an updated patch with the additional minor fixes >> mentioned above. >> >> Martin >> >> PS I think we would get much better results by adopting >> the properly designed and tested standard library containers >> than by spending time trying to improve the design of these >> legacy classes. For simple uses that don't need to integrate >> with the GC machinery the standard containers should be fine >> (plus, it'd provide us with greater motivation to improve >> them and the code GCC emits for their uses). Unfortunately, >> to be able to use the hash-based containers we would need to >> upgrade to C++ 11. Isn't it time yet? > > I don't think so. I'm also not sure if C++11 on its own is desirable > or if it should be C++14 or later at that point.
SLES 12 has GCC 4.8 > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. > SLES 11 already struggles currently (GCC 4.3) but I'd no longer > consider that important enough. > > Note any such change to libstdc++ containers should be complete > and come with both code-size and compile-time and memory-usage > measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + { + e->m_key = k; + new ((void *)&e->m_value) Value (); + } this now requires a default constructor and I always forget about diff
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor wrote:> > [Adding gcc-patches] > > Richard, do you have any further comments or is the revised patch > good to commit? No further comments from my side - it's good to commit. Richard. > Martin > > On 6/25/19 2:30 PM, Martin Sebor wrote: > > On 6/25/19 3:53 AM, Jonathan Wakely wrote: > >> On 24/06/19 19:42 +0200, Richard Biener wrote: > >>> On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: > > On 6/24/19 6:11 AM, Richard Biener wrote: > > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor > wrote: > >> > >> On 6/21/19 6:06 AM, Richard Biener wrote: > >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor > wrote: > > Bug 90923 shows that even though GCC hash-table based containers > like hash_map can be instantiated on types with user-defined ctors > and dtors they invoke the dtors of such types without invoking > the corresponding ctors. > > It was thanks to this bug that I spent a day debugging > "interesting" > miscompilations during GCC bootstrap (in fairness, it was that and > bug 90904 about auto_vec copy assignment/construction also being > hosed even for POD types). > > The attached patch corrects the hash_map and hash_set templates > to invoke the ctors of the elements they insert and makes them > (hopefully) safe to use with non-trivial user-defined types. > >>> > >>> Hum. I am worried about the difference of assignment vs. > construction > >>> in ::put() > >>> > >>> + bool ins = hash_entry::is_empty (*e); > >>> + if (ins) > >>> + { > >>> + e->m_key = k; > >>> + new ((void *) &e->m_value) Value (v); > >>> + } > >>> + else > >>> + e->m_value = v; > >>> > >>> why not invoke the dtor on the old value and then the ctor again? > >> > >> It wouldn't work for self-assignment: > >> > >> Value &y = m.get_or_insert (key); > >> m.put (key, y); > >> > >> The usual rule of thumb for writes into containers is to use > >> construction when creating a new element and assignment when > >> replacing the value of an existing element. 
> >> > >> Which reminds me that the hash containers, despite being copy- > >> constructible (at least for POD types, they aren't for user- > >> defined types), also aren't safe for assignment even for PODs. > >> I opened bug 90959 for this. Until the assignment is fixed > >> I made it inaccessible in the patch (I have fixed the copy > >> ctor to DTRT for non-PODs). > >> > >>> How is an empty hash_entry constructed? > >> > >> In hash_table::find_slot_with_hash simply by finding an empty > >> slot and returning a pointer to it. The memory for the slot > >> is marked "empty" by calling the Traits::mark_empty() function. > >> > >> The full type of hash_map is actually > >> > >> hash_map<Key, > >> Value, > >> simple_hashmap_traits<default_hash_traits<Key>, > >> Value>> > >> > >> and simple_hashmap_traits delegates it to default_hash_traits > >> whose mark_empty() just clears the void*, leaving the Value > >> part uninitialized. That makes sense because we don't want > >> to call ctors for empty entries. I think the questions one > >> might ask if one were to extend the design are: a) what class > >> should invoke the ctor/assignment and b) should it do it > >> directly or via the traits? > >> > >>> ::remove() doesn't > >>> seem to invoke the dtor either, instead it relies on the > >>> traits::remove function? > >> > >> Yes. There is no Traits::construct or assign or copy. We > >> could add them but I'm not sure I see to what end (there could > >> be use cases, I just don't know enough about these classes to > >> think of any). > >> > >> Attached is an updated patch with the additional minor fixes > >> mentioned above. > >> > >> Martin > >> > >> PS I think we would get much better results by adopting > >> the properly designed and tested standard library containers > >> than by spending time trying to improve the design of these > >> legacy classes.
For simple uses that don't need to integrate > >> with the GC machinery the standard containers should be fine > >> (plus, it'd provide us with greater motivation to improve > >> them and the code GCC emits for their uses). Unfortunately, > >> to be able to use the hash-based containers we would need to > >> upgrade to C++ 11. Isn't it time yet? > > > > I don't think so. I'm also not sure if C++11 on its own is desirable > > or if it should
Re: A better __builtin_constant_p
> Am I totally on the wrong track here?

That depends on what you want your assumptions to do. This definitely doesn't solve the problems I'm having implementing C++ contracts, especially axioms, which can involve declarations of undecidable functions. For example, is_reachable(p, q) for a pair of pointers can't be implemented for normal systems (maybe it could with one sanitizer or another), but it could be used for static analyzers (or optimizers maybe?) that understood the inherent meaning of a "call" to that function.

Our problem is that we need to communicate an assumption syntactically through to GIMPLE, but without the usual ODR requirements of C++. Specifically, we need to allow assumptions on conditions that use functions that are declared but not defined anywhere.

Now, I'm not an optimizer expert at all, so I could be way wrong, but my sense is that depending on the constantness of a function will not help you get better assumptions, because you aren't necessarily interested in the value of the expression, only the facts introduced by it (e.g., for some x, x == 5).

> Should we have __builtin_assume?

I'd like it. The "if (predicate) unreachable" pattern doesn't work for C++ contracts, but __builtin_assume works just fine for C++ contracts in LLVM. There's another solution to the problem: define undefinable functions as "= undecidable".

> Should we instead fix our code so forgetting a "const" attribute won't
> hurt performance if it can be inferred?

Probably? If you have an assertion that you want to promote to an assumption via __builtin_assume (using LLVM), you'll get a warning. Throw in -Werror, and your program doesn't compile. That seems unfortunate.

This has come up in WG21 discussions around contracts. There's some momentum to relax the requirement that asserted contracts don't have side effects because it's sometimes useful to call predicates that log something, throw, terminate, etc. If we allow "promotion" of assertions to assumptions, then that requirement is simply going to break a lot of code. Also, most functions aren't declared const, and requiring that broadly would be a non-starter for adding contracts to a program.
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On 7/1/19 10:33 AM, Richard Biener wrote: On Mon, Jul 1, 2019 at 4:55 PM Martin Sebor wrote: > [Adding gcc-patches] Richard, do you have any further comments or is the revised patch good to commit? No further comments from my side - it's good to commit. After running a full bootstrap with the patch with the static_assert I found that I didn't fully understand the KeyId type/compare_type requirements. Like value_type, this type too can be a non-POD type. It just needs a suitable Traits (AKA Descriptor) class. I've updated the comments to reflect that and removed the static_assert and checked in the original version of the change with better comments in r272893. Sorry about that hiccup. Martin Richard. Martin On 6/25/19 2:30 PM, Martin Sebor wrote: On 6/25/19 3:53 AM, Jonathan Wakely wrote: On 24/06/19 19:42 +0200, Richard Biener wrote: On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: On 6/24/19 6:11 AM, Richard Biener wrote: On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: On 6/21/19 6:06 AM, Richard Biener wrote: On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. Hum. I am worried about the difference of assignment vs. construction in ::put() + bool ins = hash_entry::is_empty (*e); + if (ins) + { + e->m_key = k; + new ((void *) &e->m_value) Value (v); + } + else + e->m_value = v; why not invoke the dtor on the old value and then the ctor again? It wouldn't work for self-assignment: Value &y = m.get_or_insert (key); m.put (key, y); The usual rule of thumb for writes into containers is to use construction when creating a new element and assignment when replacing the value of an existing element. Which reminds me that the hash containers, despite being copy- constructible (at least for POD types, they aren't for user- defined types), also aren't safe for assignment even for PODs. I opened bug 90959 for this. Until the assignment is fixed I made it inaccessible in the patch (I have fixed the copy ctor to DTRT for non-PODs). How is an empty hash_entry constructed? In hash_table::find_slot_with_hash simply by finding an empty slot and returning a pointer to it. The memory for the slot is marked "empty" by calling the Traits::mark_empty() function. The full type of hash_map is actually hash_map<Key, Value, simple_hashmap_traits<default_hash_traits<Key>, Value>> and simple_hashmap_traits delegates it to default_hash_traits whose mark_empty() just clears the void*, leaving the Value part uninitialized. That makes sense because we don't want to call ctors for empty entries. I think the questions one might ask if one were to extend the design are: a) what class should invoke the ctor/assignment and b) should it do it directly or via the traits? ::remove() doesn't seem to invoke the dtor either, instead it relies on the traits::remove function? Yes. There is no Traits::construct or assign or copy. We could add them but I'm not sure I see to what end (there could be use cases, I just don't know enough about these classes to think of any). Attached is an updated patch with the additional minor fixes mentioned above. Martin PS I think we would get much better results by adopting the properly designed and tested standard library containers than by spending time trying to improve the design of these legacy classes. For simple uses that don't need to integrate with the GC machinery the standard containers should be fine (plus, it'd provide us with greater motivation to improve them and the code GCC emits for their uses). Unfortunately, to be able to use the hash-based containers we would need to upgrade to C++ 11. Isn't it time yet? I don't think so. I'm also not sure if C++11 on its own is desirable or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. SLES 11 already struggles currently (GCC 4.3) but I'd no longer consider that important enough. Note any such change to libstdc++ containers should be complete and come with both code-size and compile-time and memory-usage measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + {
RFC on a new optimization
I've been looking at trying to optimize the performance of code for programs that use functions like qsort, where a function is passed the name of a function and some constant parameter(s).

The function qsort itself is an excellent example of what I want to do, except for being in a library, so please ignore that while I proceed assuming that qsort is not in a library. In qsort the user passes in the size of the array elements and a comparison function name in addition to the location of the array to be sorted. I noticed that for a given call site the first two are always the same, so why not create a specialized version of qsort that eliminates them, internally uses a constant value for the size parameter, and does a direct call instead of an indirect call? The latter lets the comparison function code be inlined.

This seems to me to be a very useful optimization where heavy use is made of this programming idiom. I saw a 30%+ overall improvement when I specialized a function like this by hand in an application.

My question is: does anything inside gcc do something similar? I don't want to reinvent the wheel, and I want to do something that plays nicely with the rest of gcc so it makes it into the real world. Note, I should mention that I'm an experienced compiler developer and I'm planning on adding this optimization unless it's obvious from the ensuing discussion that either it's a bad idea or that it's a matter of simply tweaking gcc a bit to get this optimization to occur.

Thanks,

Gary Oblock
Re: RFC on a new optimization
On 7/1/19 3:58 PM, Gary Oblock wrote: > I've been looking at trying to optimize the performance of code for > programs that use functions like qsort where a function is passed the > name of a function and some constant parameter(s). > > The function qsort itself is an excellent example of what I'm trying to show > what I want to do, except for being in a library, so please ignore > that while I proceed assuming that that qsort is not in a library. In > qsort the user passes in a size of the array elements and comparison > function name in addition to the location of the array to be sorted. I > noticed that for a given call site that the first two are always the > same so why not create a specialized version of qsort that eliminates > them and internally uses a constant value for the size parameter and > does a direct call instead of an indirect call. The later lets the > comparison function code be inlined. > > This seems to me to be a very useful optimization where heavy use is > made of this programming idiom. I saw a 30%+ overall improvement when > I specialized a function like this by hand in an application. > > My question is does anything inside gcc do something similar? I don't > want to reinvent the wheel and I want to do something that plays > nicely with the rest of gcc so it makes it into real world. Note, I > should mention that I'm an experienced compiler developed and I'm > planning on adding this optimization unless it's obvious from the > ensuing discussion that either it's a bad idea or that it's a matter > of simply tweaking gcc a bit to get this optimization to occur. Jan is the expert in this space, but yes, GCC has devirtualization and function specialization. See ipa-devirt.c and ipa-cp.c. You can use the -fdump-ipa-all-details option to produce debugging dumps for the IPA passes. That might help guide you a bit. jeff
Re: [EXT] Re: RFC on a new optimization
On 7/1/19 3:08 PM, Jeff Law wrote: > External Email > > -- > On 7/1/19 3:58 PM, Gary Oblock wrote: >> I've been looking at trying to optimize the performance of code for >> programs that use functions like qsort where a function is passed the >> name of a function and some constant parameter(s). >> >> The function qsort itself is an excellent example of what I'm trying to show >> what I want to do, except for being in a library, so please ignore >> that while I proceed assuming that that qsort is not in a library. In >> qsort the user passes in a size of the array elements and comparison >> function name in addition to the location of the array to be sorted. I >> noticed that for a given call site that the first two are always the >> same so why not create a specialized version of qsort that eliminates >> them and internally uses a constant value for the size parameter and >> does a direct call instead of an indirect call. The later lets the >> comparison function code be inlined. >> >> This seems to me to be a very useful optimization where heavy use is >> made of this programming idiom. I saw a 30%+ overall improvement when >> I specialized a function like this by hand in an application. >> >> My question is does anything inside gcc do something similar? I don't >> want to reinvent the wheel and I want to do something that plays >> nicely with the rest of gcc so it makes it into real world. Note, I >> should mention that I'm an experienced compiler developed and I'm >> planning on adding this optimization unless it's obvious from the >> ensuing discussion that either it's a bad idea or that it's a matter >> of simply tweaking gcc a bit to get this optimization to occur. > Jan is the expert in this space, but yes, GCC has devirtualization and > function specialization. See ipa-devirt.c and ipa-cp.c You can use the > -fdump-ipa-all-details option to produce debugging dumps for the IPA > passes. THat might help guide you a bit. 
> jeff

Jeff,

I assume you mean Jan Hubicka? I'll certainly have a look at the code dumps you mention. I do have a high-level design in mind already, but I'm always up for making my life easier.

Thanks,
Gary
Re: [EXT] Re: RFC on a new optimization
On 7/1/19 5:01 PM, Gary Oblock wrote:
> On 7/1/19 3:08 PM, Jeff Law wrote:
>> [earlier exchange snipped]
>>
>> Jan is the expert in this space, but yes, GCC has devirtualization and
>> function specialization. See ipa-devirt.c and ipa-cp.c. You can use the
>> -fdump-ipa-all-details option to produce debugging dumps for the IPA
>> passes. That might help guide you a bit.
>>
>> jeff
>
> Jeff,
>
> I assume you mean Jan Hubicka?

Yes.

Jeff
Re: Doubts regarding the _Dependent_ptr keyword
On Tue, Jun 25, 2019 at 9:49 PM Akshat Garg wrote:
> On Tue, Jun 25, 2019 at 4:04 PM Ramana Radhakrishnan
> <ramana@googlemail.com> wrote:
>> On Tue, Jun 25, 2019 at 11:03 AM Akshat Garg wrote:
>>> As we have some working front-end code for _Dependent_ptr, what should
>>> we do next? As I understand it, we can start adding the library for
>>> dependent_ptr and its functions for C, corresponding to the ones we
>>> created as a C++ template library. Then, after that, we can move on to
>>> generating the assembly code part.
>>
>> I think the next step is figuring out how to model the dependent
>> pointer information in the IR and figuring out what optimizations to
>> allow or not with that information. At this point, I suspect we need a
>> plan on record and to have the conversation upstream on the lists.
>>
>> I think we need to put down a plan on record.
>>
>> Ramana
>
> [CCing gcc mailing list]
>
> So, shall I start looking over the pointer optimizations only and see
> what information we may need on the same examples in the IR itself?
>
> - Akshat

I have coded an example, from the document P0190R4, where an equality comparison kills the dependency, as shown below:

1.  struct rcutest rt = {1, 2, 3};
2.  void thread0 ()
3.  {
4.      rt.a = -42;
5.      rt.b = -43;
6.      rt.c = -44;
7.      rcu_assign_pointer(gp, &rt);
8.  }
9.
10. void thread1 ()
11. {
12.     int i = -1;
13.     int j = -1;
14.     _Dependent_ptr struct rcutest *p;
15.
16.     p = rcu_dereference(gp);
17.     j = p->a;
18.     if (p == &rt)
19.         i = p->b;      /* Dependency-breaking point */
20.     else if (p)
21.         i = p->c;
22.     assert(i < 0);
23.     assert(j < 0);
24. }

The unoptimized GIMPLE produced for lines 17-24 is shown below:

1.  if (p_16 == &rt)
2.      goto ; [INV]
3.  else
4.      goto ; [INV]
5.
6.  :
7.  i_19 = p_16->b;
8.  goto ; [INV]
9.
10. :
11. if (p_16 != 0B)
12.     goto ; [INV]
13. else
14.     goto ; [INV]
15.
16. :
17. i_18 = p_16->c;
18.
19. :
20. # i_7 = PHI
21. _3 = i_7 < 0;
22. _4 = (int) _3;
23. assert (_4);
24. _5 = j_17 < 0;
25. _6 = (int) _5;
26. assert (_6);
27. return;

The optimized code after -O1 is applied for the same lines is shown below:

1.  if (_2 == &rt)
2.      goto ; [30.00%]
3.  else
4.      goto ; [70.00%]
5.
6.  [local count: 322122547]:
7.  i_12 = rt.b;
8.  goto ; [100.00%]
9.
10. [local count: 751619277]:
11. if (_1 != 0)
12.     goto ; [50.00%]
13. else
14.     goto ; [50.00%]
15.
16. [local count: 375809638]:
17. i_11 = MEM[(dependent_ptr struct rcutest *)_2].c;
18.
19. [local count: 1073741824]:
20. # i_7 = PHI
21. _3 = i_7 < 0;
22. _4 = (int) _3;
23. assert (_4);
24. _5 = j_10 < 0;
25. _6 = (int) _5;
26. assert (_6);
27. return;

Statement 19 in the program gets converted from i_19 = p_16->b; (line 7 in the unoptimized code) to i_12 = rt.b; (line 7 in the optimized code), which breaks the dependency chain. We need to figure out which pass does this and put some handling code in there for _Dependent_ptr qualified pointers. Passing simply -fipa-pure-const, -fguess-branch-probability, or any other single option does not produce the optimized code that breaks the dependency, but applying -O1, i.e., enabling all of those optimizations, does. As passes are applied in a certain order, we need to figure out up to which pass the code remains the same and after which pass the dependency no longer holds. So, we need to check the translated code after every pass.

Does this sound like a workable plan? Let me know your thoughts. If this sounds good, then we can do this for all the optimizations that may kill the dependencies at some point.

-Akshat
Re: Doubts regarding the _Dependent_ptr keyword
On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
> [example and GIMPLE dumps snipped]

Good show on tracing this through!

> Statement 19 in the program gets converted from i_19 = p_16->b; (line 7
> in the unoptimized code) to i_12 = rt.b; (line 7 in the optimized code),
> which breaks the dependency chain. We need to figure out which pass does
> this and put some handling code in there for _Dependent_ptr qualified
> pointers. Passing simply -fipa-pure-const, -fguess-branch-probability,
> or any other single option does not produce the optimized code that
> breaks the dependency, but applying -O1, i.e., enabling all of those
> optimizations, does. As passes are applied in a certain order, we need
> to figure out up to which pass the code remains the same and after which
> pass the dependency no longer holds. So, we need to check the translated
> code after every pass.
>
> Does this sound like a workable plan? Let me know your thoughts. If this
> sounds good, then we can do this for all the optimizations that may kill
> the dependencies at some point.

I don't know of a better plan.

My usual question... Is there some way to script the checking of the translated code at the end of each pass?

Thanx, Paul
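One way to script Paul's check uses GCC's per-pass dump machinery (a sketch; the file name test.c and the grep pattern rt\.b are assumptions tied to the example above): compile once with -fdump-tree-all, which writes one dump file per tree pass, numbered in execution order, then scan the dumps for the first one containing the dependency-breaking direct load.

```shell
# Compile once, emitting a GIMPLE dump after every tree pass.
# Dumps are named test.c.<NNN>t.<passname>, with <NNN> increasing
# in pass-execution order (run where test.c lives):
#   gcc -O1 -fdump-tree-all -c test.c

# Print the first dump file (i.e. the earliest pass) whose GIMPLE
# already contains the dependency-breaking load "rt.b".
find_breaking_pass() {
  for f in $(ls "$1".*t.* 2>/dev/null | sort); do
    if grep -q 'rt\.b' "$f"; then
      echo "$f"
      return 0
    fi
  done
  return 1
}

find_breaking_pass test.c || echo "no dump contains rt.b (compile first)"
```

Diffing the offending dump against the one before it (plain `diff`) then shows exactly what that pass rewrote, which narrows down where handling for _Dependent_ptr would have to go.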
Re: Doubts regarding the _Dependent_ptr keyword
On 2019-07-01 8:59 p.m., Paul E. McKenney wrote:
> On Tue, Jul 02, 2019 at 05:58:48AM +0530, Akshat Garg wrote:
>> [example, GIMPLE dumps, and per-pass plan snipped]
>
> I don't know of a better plan.
>
> My usual question... Is there some way to script the checking of the
> translated code at the end of each pass?
>
> Thanx, Paul

I don't know off the top of my head where the documentation is, but writing a gcc tool to inspect passes, if one doesn't already exist, is the best way forward. A gcc tool would be exposed to those internals, but I'm not sure it's easy to do in the time frame, given the effort required from you or Akshat.

Cheers,
Nick