[PATCH] Make function clone name numbering independent.
Hi, This patch is a small part of the work I'm doing to make function codegen/assembly independent from one another as mentioned in: https://gcc.gnu.org/ml/gcc/2018-07/msg00210.html. It deals with clone_fn_id_num rather than object UIDs and I figured it's better to make my first submission with a smaller, simpler and self-contained patch. This changes clone_function_name_1 such that clone names are based on a per-function rather than a global counter so that the number of clones of one function doesn't affect the numbering of clone names of other functions. This should have minimal impact as the only user of the clone names that I found (https://gcc.gnu.org/ml/gcc/2013-03/msg00268.html) doesn't actually care about the specific numeric values. Thanks - Michael gcc: 2018-07-16 Michael Ploujnikov Make function clone name numbering independent. * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids. (clone_function_name_1): Use it. testsuite: 2018-07-16 Michael Ploujnikov Clone id counters should be completely independent from one another. * gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test. --- gcc/cgraphclones.c| 11 ++-- gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++ 2 files changed, 47 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 69572b9..c5a40bd 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -528,7 +528,7 @@ cgraph_node::create_clone (tree new_decl, gcov_type gcov_count, int freq, return new_node; } -static GTY(()) unsigned int clone_fn_id_num; +static GTY(()) hash_map *clone_fn_ids; /* Return a new assembler name for a clone with SUFFIX of a decl named NAME. */ @@ -543,7 +543,14 @@ clone_function_name_1 (const char *name, const char *suffix) memcpy (prefix, name, len); strcpy (prefix + len + 1, suffix); prefix[len] = symbol_table::symbol_suffix_separator (); - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); + unsigned int *suffix_counter; + if (!clone_fn_ids) { +/* Initialize the per-function counter hash table if this is the first call */ +clone_fn_ids = hash_map::create_ggc (1000); + } + suffix_counter = &clone_fn_ids->get_or_insert(name); + ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, *suffix_counter); + *suffix_counter = *suffix_counter + 1; return get_identifier (tmp_name); } diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c new file mode 100644 index 000..d723e20 --- /dev/null +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp" } */ + +extern int printf (const char *, ...); + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 7 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg; +} + +int +baz (int arg) +{ + printf("%d\n", bar (3)); + printf("%d\n", bar (4)); + printf("%d\n", foo (5)); + printf("%d\n", foo (6)); + /* adding or removing the following call should not affect foo + function's clone numbering */ + printf("%d\n", bar (7)); + return foo (8); +} + +/* { dg-final { scan-ipa-dump "Function bar.constprop.0" "cp" } } */ +/* { dg-final { scan-ipa-dump "Function bar.constprop.1" "cp" } } */ +/* { dg-final { scan-ipa-dump "Function bar.constprop.3" "cp" } } */ +/* { dg-final { scan-ipa-dump "Function foo.constprop.0" "cp" } } */ +/* { dg-final { scan-ipa-dump "Function foo.constprop.1" "cp" } } */ +/* { dg-final { scan-ipa-dump "Function foo.constprop.2" "cp" } } */ +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.3" "cp" } } */ +/* { dg-final { scan-ipa-dump-not "Function foo.constprop.4" "cp" } } */ -- 2.7.4 signature.asc Description: OpenPGP digital signature
[PATCH] Make function clone name numbering independent.
On 2018-07-17 06:02 AM, Richard Biener wrote: > On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer > wrote: >> >> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov >> wrote: >>> Hi, >>> >> >>> +clone_fn_ids = hash_map::create_ggc >>> (1000); >> >> Isn't 1000 a bit excessive? What about 64 or thereabouts? > > I'm not sure we should throw memory at this "problem" in this way. > What specific issue > does this address? This goes along with the general theme of preventing changes to one function affecting codegen of others. What I'm seeing in this case is when a function bar is modified codegen decides to create more clones of it (eg: during the constprop pass). These extra clones cause the global counter to increment so the clones of the unchanged function foo are named differently only because of a source change to bar. I was hoping that the testcase would be a good illustration, but perhaps not; is there anything else I can do to make this clearer? > > Iff then I belive forgoing the automatic counter addidion is the way > to go and hand > control of that to the callers (for example the caller in > lto/lto-partition.c doesn't > even seem to need that counter. How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm assuming it has a good reason to call clone_function_name_1 rather than appending ".lto_priv" itself. > You also assume the string you key on persists - luckily the > lto-partition.c caller > leaks it but this makes your approach quite fragile in my eye (put the logic > into clone_function_name instead, where you at least know you are dealing > with a string from an indentifier which are never collected). > > Richard. > Is this what you had in mind?: diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..f000420 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, return new_node; } -static GTY(()) unsigned int clone_fn_id_num; +static GTY(()) hash_map *clone_fn_ids; /* Return a new assembler name for a clone with SUFFIX of a decl named NAME. */ @@ -521,14 +521,13 @@ tree clone_function_name_1 (const char *name, const char *suffix) { size_t len = strlen (name); - char *tmp_name, *prefix; + char *prefix; prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); memcpy (prefix, name, len); strcpy (prefix + len + 1, suffix); prefix[len] = symbol_table::symbol_suffix_separator (); - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); - return get_identifier (tmp_name); + return get_identifier (prefix); } /* Return a new assembler name for a clone of DECL with SUFFIX. */ @@ -537,7 +536,17 @@ tree clone_function_name (tree decl, const char *suffix) { tree name = DECL_ASSEMBLER_NAME (decl); - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); + char *decl_name = IDENTIFIER_POINTER (name); + char *numbered_name; + unsigned int *suffix_counter; + if (!clone_fn_ids) { +/* Initialize the per-function counter hash table if this is the first call */ +clone_fn_ids = hash_map::create_ggc (64); + } + suffix_counter = &clone_fn_ids->get_or_insert(name); + ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter); + *suffix_counter = *suffix_counter + 1; + return clone_function_name_1 (numbered_name, suffix); } - Michael signature.asc Description: OpenPGP digital signature
Re: [PATCH] Make function clone name numbering independent.
On 2018-07-17 04:25 PM, Michael Ploujnikov wrote: > On 2018-07-17 06:02 AM, Richard Biener wrote: >> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer >> wrote: >>> >>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov >>> wrote: >>>> Hi, >>>> >>> >>>> +clone_fn_ids = hash_map::create_ggc >>>> (1000); >>> >>> Isn't 1000 a bit excessive? What about 64 or thereabouts? >> >> I'm not sure we should throw memory at this "problem" in this way. >> What specific issue >> does this address? > > This goes along with the general theme of preventing changes to one function > affecting codegen of others. What I'm seeing in this case is when a function > bar is modified codegen decides to create more clones of it (eg: during the > constprop pass). These extra clones cause the global counter to increment so > the clones of the unchanged function foo are named differently only because > of a source change to bar. I was hoping that the testcase would be a good > illustration, but perhaps not; is there anything else I can do to make this > clearer? > > >> >> Iff then I belive forgoing the automatic counter addidion is the way >> to go and hand >> control of that to the callers (for example the caller in >> lto/lto-partition.c doesn't >> even seem to need that counter. > > How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm > assuming it has a good reason to call clone_function_name_1 rather than > appending ".lto_priv" itself. > >> You also assume the string you key on persists - luckily the >> lto-partition.c caller >> leaks it but this makes your approach quite fragile in my eye (put the logic >> into clone_function_name instead, where you at least know you are dealing >> with a string from an indentifier which are never collected). >> >> Richard. >> > > Is this what you had in mind?: > > diff --git gcc/cgraphclones.c gcc/cgraphclones.c > index 6e84a31..f000420 100644 > --- gcc/cgraphclones.c > +++ gcc/cgraphclones.c > @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count > prof_count, >return new_node; > } > > -static GTY(()) unsigned int clone_fn_id_num; > +static GTY(()) hash_map *clone_fn_ids; > > /* Return a new assembler name for a clone with SUFFIX of a decl named > NAME. */ > @@ -521,14 +521,13 @@ tree > clone_function_name_1 (const char *name, const char *suffix) > { >size_t len = strlen (name); > - char *tmp_name, *prefix; > + char *prefix; > >prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >memcpy (prefix, name, len); >strcpy (prefix + len + 1, suffix); >prefix[len] = symbol_table::symbol_suffix_separator (); > - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); > - return get_identifier (tmp_name); > + return get_identifier (prefix); > } > > /* Return a new assembler name for a clone of DECL with SUFFIX. */ > @@ -537,7 +536,17 @@ tree > clone_function_name (tree decl, const char *suffix) > { >tree name = DECL_ASSEMBLER_NAME (decl); > - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); > + char *decl_name = IDENTIFIER_POINTER (name); > + char *numbered_name; > + unsigned int *suffix_counter; > + if (!clone_fn_ids) { > +/* Initialize the per-function counter hash table if this is the first > call */ > +clone_fn_ids = hash_map::create_ggc (64); > + } > + suffix_counter = &clone_fn_ids->get_or_insert(name); > + ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter); > + *suffix_counter = *suffix_counter + 1; > + return clone_function_name_1 (numbered_name, suffix); > } > > - Michael > > > Ping, and below is the updated version of the full patch with changelogs: gcc: 2018-07-16 Michael Ploujnikov Make function clone name numbering independent. * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids. (clone_function_name_1): Move suffixing to clone_function_name and change it to use per-function clone_fn_ids. testsuite: 2018-07-16 Michael Ploujnikov Clone id counters should be completely independent from one another. * gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test. --- gcc/cgraphclones.c| 19 ++ gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++ 2 files changed, 52 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c diff --git gcc/cg
Re: [PATCH] Make function clone name numbering independent.
On 2018-07-20 06:05 AM, Richard Biener wrote: > On Fri, Jul 20, 2018 at 4:48 AM Michael Ploujnikov > wrote: >> >> On 2018-07-17 04:25 PM, Michael Ploujnikov wrote: >>> On 2018-07-17 06:02 AM, Richard Biener wrote: >>>> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer >>>> wrote: >>>>> >>>>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov >>>>> wrote: >>>>>> Hi, >>>>>> >>>>> >>>>>> +clone_fn_ids = hash_map::create_ggc >>>>>> (1000); >>>>> >>>>> Isn't 1000 a bit excessive? What about 64 or thereabouts? >>>> >>>> I'm not sure we should throw memory at this "problem" in this way. >>>> What specific issue >>>> does this address? >>> >>> This goes along with the general theme of preventing changes to one >>> function affecting codegen of others. What I'm seeing in this case is when >>> a function bar is modified codegen decides to create more clones of it (eg: >>> during the constprop pass). These extra clones cause the global counter to >>> increment so the clones of the unchanged function foo are named differently >>> only because of a source change to bar. I was hoping that the testcase >>> would be a good illustration, but perhaps not; is there anything else I can >>> do to make this clearer? >>> >>> >>>> >>>> Iff then I belive forgoing the automatic counter addidion is the way >>>> to go and hand >>>> control of that to the callers (for example the caller in >>>> lto/lto-partition.c doesn't >>>> even seem to need that counter. >>> >>> How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm >>> assuming it has a good reason to call clone_function_name_1 rather than >>> appending ".lto_priv" itself. >>> >>>> You also assume the string you key on persists - luckily the >>>> lto-partition.c caller >>>> leaks it but this makes your approach quite fragile in my eye (put the >>>> logic >>>> into clone_function_name instead, where you at least know you are dealing >>>> with a string from an indentifier which are never collected). >>>> >>>> Richard. >>>> >>> >>> Is this what you had in mind?: >>> >>> diff --git gcc/cgraphclones.c gcc/cgraphclones.c >>> index 6e84a31..f000420 100644 >>> --- gcc/cgraphclones.c >>> +++ gcc/cgraphclones.c >>> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count >>> prof_count, >>>return new_node; >>> } >>> >>> -static GTY(()) unsigned int clone_fn_id_num; >>> +static GTY(()) hash_map *clone_fn_ids; >>> >>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>> NAME. */ >>> @@ -521,14 +521,13 @@ tree >>> clone_function_name_1 (const char *name, const char *suffix) >>> { >>>size_t len = strlen (name); >>> - char *tmp_name, *prefix; >>> + char *prefix; >>> >>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >>>memcpy (prefix, name, len); >>>strcpy (prefix + len + 1, suffix); >>>prefix[len] = symbol_table::symbol_suffix_separator (); >>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>> - return get_identifier (tmp_name); >>> + return get_identifier (prefix); >>> } >>> >>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>> @@ -537,7 +536,17 @@ tree >>> clone_function_name (tree decl, const char *suffix) >>> { >>>tree name = DECL_ASSEMBLER_NAME (decl); >>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>> + char *decl_name = IDENTIFIER_POINTER (name); >>> + char *numbered_name; >>> + unsigned int *suffix_counter; >>> + if (!clone_fn_ids) { >>> +/* Initialize the per-function counter hash table if this is the first >>> call */ >>> +clone_fn_ids = hash_map::create_ggc (64); >>> + } >>> + suffix_counter = &clone_fn_ids->get_or_insert(name); >>> + ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter); >>> + *suffix_counter = *suffix_counter + 1; >>> + return clone_funct
Re: [PATCH] Make function clone name numbering independent.
On 2018-07-24 09:57 AM, Michael Ploujnikov wrote: > On 2018-07-20 06:05 AM, Richard Biener wrote: >>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>> NAME. */ >>> @@ -521,14 +521,13 @@ tree >>> clone_function_name_1 (const char *name, const char *suffix) >> >> pass this function the counter to use >> >>> { >>>size_t len = strlen (name); >>> - char *tmp_name, *prefix; >>> + char *prefix; >>> >>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >>>memcpy (prefix, name, len); >>>strcpy (prefix + len + 1, suffix); >>>prefix[len] = symbol_table::symbol_suffix_separator (); >>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >> >> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change >> the lto/lto-partition.c caller (just use zero as counter). >> >>> - return get_identifier (tmp_name); >>> + return get_identifier (prefix); >>> } >>> >>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>> @@ -537,7 +536,17 @@ tree >>> clone_function_name (tree decl, const char *suffix) >>> { >>>tree name = DECL_ASSEMBLER_NAME (decl); >>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>> + const char *decl_name = IDENTIFIER_POINTER (name); >>> + char *numbered_name; >>> + unsigned int *suffix_counter; >>> + if (!clone_fn_ids) { >>> +/* Initialize the per-function counter hash table if this is the first >>> call */ >>> +clone_fn_ids = hash_map::create_ggc (64); >>> + } >> >> I still do not like throwing memory at the problem in this way for the >> little benefit >> this change provides :/ >> >> So no approval from me at this point... >> >> Richard. > > Can you give me an idea of the memory constraints that are involved? > > The highest memory usage increase that I could find was when compiling > a source file (from Linux) with roughly 10,000 functions. It showed a 2kB > increase over the before-patch use of 6936kB which is barely 0.03%. > > Using a single counter can result in more confusing namespacing when > you have .bar.clone.4 despite there only being 3 clones of .bar. > > From a practical point of view this change is helpful to anyone > diffing binary output such as forensic analysts, Debian Reproducible > Builds or even someone validating compiler output (before and after an input > source patch). The extra changes that this patch alleviates are a > distraction and could even be misleading. For example, applying a > source patch to the same Linux source produces the following binary > diff before my change: > > --- /tmp/output.o.objdump > +++ /tmp/patched-output.o.objdump > @@ -1,5 +1,5 @@ > > -/tmp/uverbs_cmd/output.o: file format elf32-i386 > +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386 > > > Disassembly of section .text.get_order: > @@ -265,12 +265,12 @@ > 3:e9 fc ff ff ff jmp4 > 4: R_386_PC32 .text.put_uobj_read > > -Disassembly of section .text.trace_kmalloc.constprop.3: > +Disassembly of section .text.trace_kmalloc.constprop.4: > > - : > + : > 0:83 3d 04 00 00 00 00cmpl $0x0,0x4 > 2: R_386_32 __tracepoint_kmalloc > - 7:74 34 je 3d > > + 7:74 34 je 3d > > 9:55 push %ebp > a:89 cd mov%ecx,%ebp > c:57 push %edi > @@ -281,7 +281,7 @@ >13:8b 1d 10 00 00 00 mov0x10,%ebx > 15: R_386_32__tracepoint_kmalloc >19:85 db test %ebx,%ebx > - 1b:74 1b je 38 > > + 1b:74 1b je 38 > >1d:68 d0 00 00 00 push $0xd0 >22:89 fa mov%edi,%edx >24:89 f0 mov%esi,%eax > @@ -292,7 +292,7 @@ >31:58 pop%eax >32:83 3b 00cmpl $0x0,(%ebx) >35:5a pop%edx > - 36:eb e3 jmp1b > > + 36:eb e3 jmp1b > >38:5b pop%ebx >39:5e pop%esi >3a:
Re: [PATCH] Make function clone name numbering independent.
On 2018-07-26 01:27 PM, Michael Ploujnikov wrote: > On 2018-07-24 09:57 AM, Michael Ploujnikov wrote: >> On 2018-07-20 06:05 AM, Richard Biener wrote: >>>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>>> NAME. */ >>>> @@ -521,14 +521,13 @@ tree >>>> clone_function_name_1 (const char *name, const char *suffix) >>> >>> pass this function the counter to use >>> >>>> { >>>>size_t len = strlen (name); >>>> - char *tmp_name, *prefix; >>>> + char *prefix; >>>> >>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >>>>memcpy (prefix, name, len); >>>>strcpy (prefix + len + 1, suffix); >>>>prefix[len] = symbol_table::symbol_suffix_separator (); >>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>> >>> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change >>> the lto/lto-partition.c caller (just use zero as counter). >>> >>>> - return get_identifier (tmp_name); >>>> + return get_identifier (prefix); >>>> } >>>> >>>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>>> @@ -537,7 +536,17 @@ tree >>>> clone_function_name (tree decl, const char *suffix) >>>> { >>>>tree name = DECL_ASSEMBLER_NAME (decl); >>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>>> + const char *decl_name = IDENTIFIER_POINTER (name); >>>> + char *numbered_name; >>>> + unsigned int *suffix_counter; >>>> + if (!clone_fn_ids) { >>>> +/* Initialize the per-function counter hash table if this is the >>>> first call */ >>>> +clone_fn_ids = hash_map::create_ggc (64); >>>> + } >>> >>> I still do not like throwing memory at the problem in this way for the >>> little benefit >>> this change provides :/ >>> >>> So no approval from me at this point... >>> >>> Richard. >> >> Can you give me an idea of the memory constraints that are involved? >> >> The highest memory usage increase that I could find was when compiling >> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB >> increase over the before-patch use of 6936kB which is barely 0.03%. >> >> Using a single counter can result in more confusing namespacing when >> you have .bar.clone.4 despite there only being 3 clones of .bar. >> >> From a practical point of view this change is helpful to anyone >> diffing binary output such as forensic analysts, Debian Reproducible >> Builds or even someone validating compiler output (before and after an input >> source patch). The extra changes that this patch alleviates are a >> distraction and could even be misleading. For example, applying a >> source patch to the same Linux source produces the following binary >> diff before my change: >> >> --- /tmp/output.o.objdump >> +++ /tmp/patched-output.o.objdump >> @@ -1,5 +1,5 @@ >> >> -/tmp/uverbs_cmd/output.o: file format elf32-i386 >> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386 >> >> >> Disassembly of section .text.get_order: >> @@ -265,12 +265,12 @@ >> 3: e9 fc ff ff ff jmp4 >> 4: R_386_PC32 .text.put_uobj_read >> >> -Disassembly of section .text.trace_kmalloc.constprop.3: >> +Disassembly of section .text.trace_kmalloc.constprop.4: >> >> - : >> + : >> 0: 83 3d 04 00 00 00 00cmpl $0x0,0x4 >> 2: R_386_32 __tracepoint_kmalloc >> - 7: 74 34 je 3d >> >> + 7: 74 34 je 3d >> >> 9: 55 push %ebp >> a: 89 cd mov%ecx,%ebp >> c: 57 push %edi >> @@ -281,7 +281,7 @@ >>13: 8b 1d 10 00 00 00 mov0x10,%ebx >> 15: R_386_32__tracepoint_kmalloc >>19: 85 db test %ebx,%ebx >> - 1b: 74 1b je 38 >> >> + 1b: 74 1b je 38 >> >>1d: 68 d0 00 00 00 push $0xd0 >>22: 89 fa mov%edi,%edx >>24: 89 f0 mov%esi,%ea
Re: [PATCH] Make function clone name numbering independent.
On 2018-08-01 06:37 AM, Richard Biener wrote: > On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov > wrote: >> >> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote: >>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote: >>>> On 2018-07-20 06:05 AM, Richard Biener wrote: >>>>>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>>>>> NAME. */ >>>>>> @@ -521,14 +521,13 @@ tree >>>>>> clone_function_name_1 (const char *name, const char *suffix) >>>>> >>>>> pass this function the counter to use >>>>> >>>>>> { >>>>>>size_t len = strlen (name); >>>>>> - char *tmp_name, *prefix; >>>>>> + char *prefix; >>>>>> >>>>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >>>>>>memcpy (prefix, name, len); >>>>>>strcpy (prefix + len + 1, suffix); >>>>>>prefix[len] = symbol_table::symbol_suffix_separator (); >>>>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>>>> >>>>> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change >>>>> the lto/lto-partition.c caller (just use zero as counter). >>>>> >>>>>> - return get_identifier (tmp_name); >>>>>> + return get_identifier (prefix); >>>>>> } >>>>>> >>>>>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>>>>> @@ -537,7 +536,17 @@ tree >>>>>> clone_function_name (tree decl, const char *suffix) >>>>>> { >>>>>>tree name = DECL_ASSEMBLER_NAME (decl); >>>>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>>>>> + const char *decl_name = IDENTIFIER_POINTER (name); >>>>>> + char *numbered_name; >>>>>> + unsigned int *suffix_counter; >>>>>> + if (!clone_fn_ids) { >>>>>> +/* Initialize the per-function counter hash table if this is the >>>>>> first call */ >>>>>> +clone_fn_ids = hash_map::create_ggc (64); >>>>>> + } >>>>> >>>>> I still do not like throwing memory at the problem in this way for the >>>>> little benefit >>>>> this change provides :/ >>>>> >>>>> So no approval from me at this point... >>>>> >>>>> Richard. >>>> >>>> Can you give me an idea of the memory constraints that are involved? >>>> >>>> The highest memory usage increase that I could find was when compiling >>>> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB >>>> increase over the before-patch use of 6936kB which is barely 0.03%. >>>> >>>> Using a single counter can result in more confusing namespacing when >>>> you have .bar.clone.4 despite there only being 3 clones of .bar. >>>> >>>> From a practical point of view this change is helpful to anyone >>>> diffing binary output such as forensic analysts, Debian Reproducible >>>> Builds or even someone validating compiler output (before and after an >>>> input >>>> source patch). The extra changes that this patch alleviates are a >>>> distraction and could even be misleading. For example, applying a >>>> source patch to the same Linux source produces the following binary >>>> diff before my change: >>>> >>>> --- /tmp/output.o.objdump >>>> +++ /tmp/patched-output.o.objdump >>>> @@ -1,5 +1,5 @@ >>>> >>>> -/tmp/uverbs_cmd/output.o: file format elf32-i386 >>>> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386 >>>> >>>> >>>> Disassembly of section .text.get_order: >>>> @@ -265,12 +265,12 @@ >>>> 3: e9 fc ff ff ff jmp4 >>>> 4: R_386_PC32 .text.put_uobj_read >>>> >>>> -Disassembly of section .text.trace_kmalloc.constprop.3: >>>> +Disassembly of section .text.trace_kmalloc.constprop.4: >>>> >>>> - : >>>> + : >>>> 0: 83 3d 04 00 00 00 00cmpl $0x0,0x4 >>>> 2: R_386_32 __tracepoint_kmalloc >&
[PING][PATCH] Make function clone name numbering independent.
Ping and I've updated the patch since last time as follows: - unittest scans assembly rather than the constprop dump because its forward changed - unittests should handle different hosts where any of NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may be defined - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in cgraph_node::create_virtual_clone, but I've attempted to reduce some code duplication - lto-partition.c: privatize_symbol_name_1 *does* need numbered names - but cold sections don't - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids unreliable string pointer use as pointed out in the first review - renamed clone_function_name_1 and clone_function_name to numbered_clone_function_name_1 and numbered_clone_function_name to clarify purpose and discourage future unintended uses Also bootstrapped and regtested. - Michael On 2018-08-02 03:05 PM, Michael Ploujnikov wrote: > On 2018-08-01 06:37 AM, Richard Biener wrote: >> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov >> wrote: >>> >>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote: >>>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote: >>>>> On 2018-07-20 06:05 AM, Richard Biener wrote: >>>>>>> /* Return a new assembler name for a clone with SUFFIX of a decl named >>>>>>> NAME. */ >>>>>>> @@ -521,14 +521,13 @@ tree >>>>>>> clone_function_name_1 (const char *name, const char *suffix) >>>>>> >>>>>> pass this function the counter to use >>>>>> >>>>>>> { >>>>>>>size_t len = strlen (name); >>>>>>> - char *tmp_name, *prefix; >>>>>>> + char *prefix; >>>>>>> >>>>>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); >>>>>>>memcpy (prefix, name, len); >>>>>>>strcpy (prefix + len + 1, suffix); >>>>>>>prefix[len] = symbol_table::symbol_suffix_separator (); >>>>>>> - ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++); >>>>>> >>>>>> and keep using ASM_FORMAT_PRIVATE_NAME here. You need to change >>>>>> the lto/lto-partition.c caller (just use zero as counter). >>>>>> >>>>>>> - return get_identifier (tmp_name); >>>>>>> + return get_identifier (prefix); >>>>>>> } >>>>>>> >>>>>>> /* Return a new assembler name for a clone of DECL with SUFFIX. */ >>>>>>> @@ -537,7 +536,17 @@ tree >>>>>>> clone_function_name (tree decl, const char *suffix) >>>>>>> { >>>>>>>tree name = DECL_ASSEMBLER_NAME (decl); >>>>>>> - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); >>>>>>> + const char *decl_name = IDENTIFIER_POINTER (name); >>>>>>> + char *numbered_name; >>>>>>> + unsigned int *suffix_counter; >>>>>>> + if (!clone_fn_ids) { >>>>>>> +/* Initialize the per-function counter hash table if this is the >>>>>>> first call */ >>>>>>> +clone_fn_ids = hash_map::create_ggc (64); >>>>>>> + } >>>>>> >>>>>> I still do not like throwing memory at the problem in this way for the >>>>>> little benefit >>>>>> this change provides :/ >>>>>> >>>>>> So no approval from me at this point... >>>>>> >>>>>> Richard. >>>>> >>>>> Can you give me an idea of the memory constraints that are involved? >>>>> >>>>> The highest memory usage increase that I could find was when compiling >>>>> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB >>>>> increase over the before-patch use of 6936kB which is barely 0.03%. >>>>> >>>>> Using a single counter can result in more confusing namespacing when >>>>> you have .bar.clone.4 despite there only being 3 clones of .bar. >>>>> >>>>> From a practical point of view this change is helpful to anyone >>>>> diffing binary output such as forensic analysts, Debian Reproducible >>>>> Builds or even someone validating compiler output (before and after an >>>
Re: [PING v2][PATCH] Make function clone name numbering independent.
On 2018-08-13 07:58 PM, Michael Ploujnikov wrote: > Ping and I've updated the patch since last time as follows: > > - unittest scans assembly rather than the constprop dump because its > forward changed > - unittests should handle different hosts where any of > NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may > be defined > - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in > cgraph_node::create_virtual_clone, but I've attempted to reduce > some code duplication > - lto-partition.c: privatize_symbol_name_1 *does* need numbered > names > - but cold sections don't > - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids > unreliable string pointer use as pointed out in the first review > - renamed clone_function_name_1 and clone_function_name to > numbered_clone_function_name_1 and numbered_clone_function_name to > clarify purpose and discourage future unintended uses > > Also bootstrapped and regtested. Ping. I've done some more digging into the current uses of numbered_clone_function_name and checked if any tests fail if I change it to suffixed_function_name: - gcc/cgraphclones.c: DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, "artificial_thunk"); - no new tests fail, inconclusive - and despite the comment on redirect_callee_duplicating_thunks about "one or more" duplicates it doesn't seem like duplicate_thunk_for_node would be called more than once for each node, assuming each node is named uniquely, but I'm far from an expert in this area - gcc/cgraphclones.c: SET_DECL_ASSEMBLER_NAME (new_decl, numbered_clone_function_name (old_decl, suffix)); - called by cgraph_node::create_virtual_clone, definitely needs it - this is where clones like .constprop come from - gcc/cgraphclones.c: DECL_NAME (new_decl) = numbered_clone_function_name (old_decl, suffix); - more tests fail - this is where clones like .simdclone come from - called by cgraph_node::create_version_clone_with_body, most likely needs it - gcc/config/rs6000/rs6000.c: tree decl_name = numbered_clone_function_name (default_decl, "resolver"); - doesn't look like this needs the numbering as there should only be one resolver per multi-version function, but need someone with an rs/6000 to confirm - gcc/lto/lto-partition.c: numbered_clone_function_name_1 (identifier, - definitely needed for disambiguation, shown by unit tests failing - gcc/multiple_target.c: numbered_clone_function_name (node->decl, - create_dispatcher_calls says it only creates the dispatcher once - no new tests fail, inconclusive - gcc/multiple_target.c: numbered_clone_function_name (node->decl, - I have a feeling this isn't necessary - no new tests fail, inconclusive - gcc/omp-expand.c: DECL_NAME (kern_fndecl) = numbered_clone_function_name (kern_fndecl, "kernel"); - no new tests fail, inconclusive - I didn't see (and couldn't figure out a way to get) any of the existing omp/acc tests actually exercise this codeptah - gcc/omp-low.c: return numbered_clone_function_name (current_function_decl, - definitely needed based on gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c and others - gcc/omp-simd-clone.c: DECL_NAME (new_decl) = numbered_clone_function_name (old_decl, "simdclone"); - no tests fail, inconclusive - can definitely have more than one simdclone per function as above, but maybe not these external types - some tests, like declare-simd.c actually exercise this codepath, but I couldn't find the resulting .simdclone decl the the simdclone pass dump nor in any of the other dumps to verify - gcc/symtab.c: DECL_NAME (new_decl) = numbered_clone_function_name (node->decl, "localalias"); - no tests fail, inconclusive - my understanding of function_and_variable_visibility says that there can only be one per function so maybe this isn't; hubicka? I'll add patches to switch these once someone with expertise in these areas can confirm that the numbering isn't needed in the respective decl names. - Michael From adde0632266d3f1b0540e09b9db931df4302d2bc Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 16 Jul 2018 12:55:49 -0400 Subject: [PATCH 1/4] Make function assembly more independent. This changes clone_function_name such that clone names are based on a per-function counter rather than a global one. gcc: 2018-08-02 Michael Ploujnikov Make function clone name numbering independent. * cgraphclones.c: Replace clone_fn_id_num with clone_fn_i
Re: [PING v2][PATCH] Make function clone name numbering independent.
Hi Martin, Richard, Thanks for your responses. On 2018-09-03 09:15 AM, Martin Jambor wrote: > Hi, > > On Mon, Sep 03 2018, Richard Biener wrote: >> On Mon, Sep 3, 2018 at 12:02 PM Martin Jambor wrote: >>> >>> Hi, >>> >>> On Fri, Aug 31 2018, Michael Ploujnikov wrote: >>>> I've done some more digging into the current uses of >>>> numbered_clone_function_name and checked if any tests fail if I change >>>> it to suffixed_function_name: >>>> >>>> - gcc/cgraphclones.c: DECL_NAME (new_decl) = >>>> numbered_clone_function_name (thunk->decl, "artificial_thunk"); >>>> - no new tests fail, inconclusive >>>> - and despite the comment on redirect_callee_duplicating_thunks >>>> about "one or more" duplicates it doesn't seem like >>>> duplicate_thunk_for_node would be called more than once for each >>>> node, assuming each node is named uniquely, but I'm far from an >>>> expert in this area >>> >>> The comment means that if there is a chain of thunks, the method clones >>> all of them. Nevertheless, you need name numbering here for the same >>> reason why you need them for constprop. >> >> The remaining question I had with the patch was if maybe all callers >> can handle assigning >> the numbering themselves, thus do sth like >> >>for-each-clone-of (i, fn) >> DECL_NAME (...) = numbered_clone_function_name (..., >> "artificial_thunk", i); >> >> which would make the map of name -> number unnecessary. Please see my comment below. >> > > That is what I would prefer too but it also has downsides. Callers > would have to do their book keeping and the various cloning interfaces > would get another parameter when already they have quite a few > (introducing some cloning context classes seems like an overkill to me > :-). I'm fine with doing it the way you guys are suggesting, but just to explain my original thinking: The way I see it is that there will always be some kind of name -> number mapping, even if it's managed by each individual user and even if it's actually cgraph_node -> number. Needless to say I'm far from an expert in all of the involved areas (this is my first contribution to GCC) so the most effective approach I could come up is doing it all in one place. Now with your expert advice and as I get a better understanding of how things like sra and other clones are created, I'm starting to see that a more tailored approach is possible and probably the way it should have been done in the first place. > > If we want to get rid of .constprop discrepancies, Could you please clarify to me what exactly you mean by the ".constprop discrepancies"? > something like the > following (only very mildly tested) patch should be enough (note that > IPA-CP currently does not really need the new has because it creates What do you mean "it doesn't need the new hash"? My independent-cloneids-1.c test shows that a function can have more than one .constprop clone, but I'm probably just not understanding something. > clones in only one pass through the call graph, but that is something > likely to change). We could then adjust other cloners that we care > about. > > However, if clones of thunks are one of them, then the optional > parameter will also proliferate to cgraph_node::create_clone(), > cgraph_edge::redirect_callee_duplicating_thunks() and > duplicate_thunk_for_node(). create_version_clone_with_body would almost > certainly need it because of IPA-SRA too (IPA-SRA can of course always > pass zero). I'll try to do this and to convert the other users in the next version of the patch, but I might need some help with understanding how some of those passes work! > > Martin > > > > diff --git a/gcc/cgraph.h b/gcc/cgraph.h > index 2b00f0165fa..703c3cfea7b 100644 > --- a/gcc/cgraph.h > +++ b/gcc/cgraph.h > @@ -964,11 +964,15 @@ public: >cgraph_node *new_inlined_to, >bitmap args_to_skip, const char *suffix = NULL); > > - /* Create callgraph node clone with new declaration. The actual body will > - be copied later at compilation stage. */ > + /* Create callgraph node clone with new declaration. The actual body will > be > + copied later at compilation stage. The name of the new clone will be > + constructed from the name of the original name, SUFFIX and a number > which > + can either be NUM_SUFFIX if non-negative or a unique incremented integer > + ot
Re: [PING v2][PATCH] Make function clone name numbering independent.
Hi Martin, On 2018-09-03 06:01 AM, Martin Jambor wrote: > Hi, > > On Fri, Aug 31 2018, Michael Ploujnikov wrote: >> I've done some more digging into the current uses of >> numbered_clone_function_name and checked if any tests fail if I change >> it to suffixed_function_name: >> >> - gcc/cgraphclones.c: DECL_NAME (new_decl) = numbered_clone_function_name >> (thunk->decl, "artificial_thunk"); >> - no new tests fail, inconclusive >> - and despite the comment on redirect_callee_duplicating_thunks >> about "one or more" duplicates it doesn't seem like >> duplicate_thunk_for_node would be called more than once for each >> node, assuming each node is named uniquely, but I'm far from an >> expert in this area > > The comment means that if there is a chain of thunks, the method clones > all of them. Nevertheless, you need name numbering here for the same > reason why you need them for constprop. > > >> - gcc/omp-expand.c: DECL_NAME (kern_fndecl) = >> numbered_clone_function_name (kern_fndecl, "kernel"); >> - no new tests fail, inconclusive >> - I didn't see (and couldn't figure out a way to get) any of the >> existing omp/acc tests actually exercise this codeptah > > I guess this one should not need it. Build with > --enable-offload-targets=hsa and run gomp.exp to try yourself. I can > run run-time HSA tests for you if you want. > > Martin > I've tried building with numbered_clone_function_name replaced by suffixed_function_name and with --enable-offload-targets=hsa and didn't see any errors in gomp.exp. I don't have a readily available HSA setup so if you could do a quick test, I would really appreciate it! - Michael signature.asc Description: OpenPGP digital signature
Re: [PATCH v3] Make function clone name numbering independent.
On 2018-12-01 11:29 a.m., H.J. Lu wrote: > This caused: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297 > Sorry about that. Looks like I should have been testing with --with-build-config=bootstrap-lto rather than just --enable-bootstrap. The quick fix would be to undo the patch to create_virtual_clone or to just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME (node->decl) instead of node pointers. Any preferences? The harder fix would be to figure out why some nodes share the same names and fix that, but maybe that's just inevitable with LTO? - Michael signature.asc Description: OpenPGP digital signature
Re: [PATCH v3] Make function clone name numbering independent.
On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote: > On 2018-12-01 11:29 a.m., H.J. Lu wrote: >> This caused: >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297 >> > > Sorry about that. Looks like I should have been testing with > --with-build-config=bootstrap-lto rather than just --enable-bootstrap. > > The quick fix would be to undo the patch to create_virtual_clone or to > just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME > (node->decl) instead of node pointers. Any preferences? > > The harder fix would be to figure out why some nodes share the same > names and fix that, but maybe that's just inevitable with LTO? > > - Michael > Here's a quick fix while the issue is being investigated. Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64. Ok for trunk? From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 3 Dec 2018 18:19:18 -0500 Subject: [PATCH] PR ipa/88297 gcc/ChangeLog: 2018-12-03 Michael Ploujnikov PR ipa/88297 * ipa-cp.c (create_specialized_node): Store clone counters by node assembler names. (ipcp_driver): Change type of clone_num_suffixes key type to const char*. --- gcc/ipa-cp.c | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git gcc/ipa-cp.c gcc/ipa-cp.c index e0cd1bc45b..f82473e37c 100644 --- gcc/ipa-cp.c +++ gcc/ipa-cp.c @@ -376,8 +376,8 @@ static profile_count max_count; static long overall_size, max_new_size; -/* Node to unique clone suffix number map. */ -static hash_map *clone_num_suffixes; +/* Node name to unique clone suffix number map. */ +static hash_map *clone_num_suffixes; /* Return the param lattices structure corresponding to the Ith formal parameter of the function described by INFO. */ @@ -3831,7 +3831,9 @@ create_specialized_node (struct cgraph_node *node, } } - unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node); + unsigned &suffix_counter = clone_num_suffixes->get_or_insert ( + IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME ( + node->decl))); new_node = node->create_virtual_clone (callers, replace_trees, args_to_skip, "constprop", suffix_counter); @@ -5049,7 +5051,7 @@ ipcp_driver (void) ipa_check_create_node_params (); ipa_check_create_edge_args (); - clone_num_suffixes = new hash_map; + clone_num_suffixes = new hash_map; if (dump_file) { -- 2.19.1 signature.asc Description: OpenPGP digital signature
Re: [PING v2][PATCH] Make function clone name numbering independent.
On 2018-12-04 7:48 a.m., Martin Jambor wrote: > Hi, > > On Tue, Sep 04 2018, Michael Ploujnikov wrote: >> >> I've tried building with numbered_clone_function_name replaced by >> suffixed_function_name and with --enable-offload-targets=hsa and >> didn't see any errors in gomp.exp. I don't have a readily available >> HSA setup so if you could do a quick test, I would really appreciate >> it! > > Sorry it took so long, I have run regular tests on trunk today and your > patch does not appear to have caused any issues. > > Martin > Thank you for checking. - Michael signature.asc Description: OpenPGP digital signature
[PATCH] Testcase for PR 88297 and minor fixes
Thanks to Martin we now have a test that exercises (cp) cloning machinery during the WPA stage of LTO. Also, during debugging I found that print_all_lattices would trigger an assert if I tried to call it inside decide_whether_version_node. Finally I've attached some comment spelling fixes as a bonus. Bootstrapping (--with-build-config=bootstrap-lto) and regression testing on x86_64. Ok for trunk after tests pass? - Michael From f8f59d44141726e688cde077aabb5f2ce0bf53e0 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 6 Dec 2018 13:36:51 -0500 Subject: [PATCH 1/3] Skip constprop clones because we don't make lattices for them. gcc/ChangeLog: 2018-12-06 Michael Ploujnikov * ipa-cp.c (print_all_lattices): Skip cp clones. --- gcc/ipa-cp.c | 3 +++ 1 file changed, 3 insertions(+) diff --git gcc/ipa-cp.c gcc/ipa-cp.c index c7c462ab81..c4e879bbc6 100644 --- gcc/ipa-cp.c +++ gcc/ipa-cp.c @@ -542,6 +542,9 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits) struct ipa_node_params *info; info = IPA_NODE_REF (node); + /* Skip constprop clones since we don't make lattices for them. */ + if (info->ipcp_orig_node) + continue; fprintf (f, " Node: %s:\n", node->dump_name ()); count = ipa_get_param_count (info); for (i = 0; i < count; i++) -- 2.19.1 From 1ec489bbb30f410144c0bc84f0c160a8fbed0be6 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 6 Dec 2018 13:47:00 -0500 Subject: [PATCH 2/3] Testcase for cloning during LTO WPA stage. Written with Martin Jambor's help: https://gcc.gnu.org/ml/gcc/2018-12/msg00043.html gcc/testsuite/ChangeLog: 2018-12-06 Michael Ploujnikov * gcc.dg/lto/pr88297_0.c: New test. * gcc.dg/lto/pr88297_1.c: New test. --- gcc/testsuite/gcc.dg/lto/pr88297_0.c | 57 gcc/testsuite/gcc.dg/lto/pr88297_1.c | 25 2 files changed, 82 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/lto/pr88297_0.c create mode 100644 gcc/testsuite/gcc.dg/lto/pr88297_1.c diff --git gcc/testsuite/gcc.dg/lto/pr88297_0.c gcc/testsuite/gcc.dg/lto/pr88297_0.c new file mode 100644 index 00..d415015166 --- /dev/null +++ gcc/testsuite/gcc.dg/lto/pr88297_0.c @@ -0,0 +1,57 @@ +/* { dg-require-effective-target lto } */ +/* { dg-lto-options {{-flto -O3 -fipa-cp -fipa-cp-clone}} } */ +/* { dg-lto-do run } */ + +/* In order to trigger IPA-CP cloning we have to: + + 1. Put the calls in main into a loop; otherwise everything is + coldand we would not clone. + + 2. Make different foos and bars actually semantically different; + otherwise IPA-ICF unified them (as it should). + +*/ + +volatile int g; + +void __attribute__ ((noipa)) +use (int v) +{ + g = v; +} + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 7 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg; +} + +extern int __attribute__ ((noinline)) +entry2 (void); + +int __attribute__ ((noipa)) +get_opaque_number (void) +{ + return 1; +} + +int main (void) +{ + int i; + for (i = 0; i < get_opaque_number (); i++) +{ + use (bar (3)); + use (bar (4)); + use (foo (5)); + use (foo (6)); + + entry2 (); +} + return 0; +} diff --git gcc/testsuite/gcc.dg/lto/pr88297_1.c gcc/testsuite/gcc.dg/lto/pr88297_1.c new file mode 100644 index 00..65c5321cde --- /dev/null +++ gcc/testsuite/gcc.dg/lto/pr88297_1.c @@ -0,0 +1,25 @@ +extern void __attribute__ ((noipa)) +use (int v); + + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 8 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg + 3; +} + +int __attribute__ ((noinline)) +entry2 (void) +{ + use (bar (3)); + use (bar (4)); + use (foo (5)); + use (foo (6)); + return 0; +} -- 2.19.1 From c2db1a6aa7d6787685a28df1e677a66bac6cb9b5 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 6 Dec 2018 14:06:55 -0500 Subject: [PATCH 3/3] Spelling fixes in comments. gcc/ChangeLog: 2018-12-06 Michael Ploujnikov * ipa-cp.c (ipa_get_indirect_edge_target_1): Fix spelling in comments. --- gcc/ipa-cp.c | 16 1 file changed, 8 insertions(+), 8 deletions(-) diff --git gcc/ipa-cp.c gcc/ipa-cp.c index c4e879bbc6..8c419e1f53 100644 --- gcc/ipa-cp.c +++ gcc/ipa-cp.c @@ -191,7 +191,7 @@ public: /* Depth first search number and low link for topological sorting of values. */ int dfs, low_link; - /* True if this valye is currently on the topo-sort stack. */ + /* True if this value is currently on the topo-sort stack. */ bool on_stack; ipcp_value() @@ -883,7 +883,7 @@ ipcp_lattice::set_contains_variable () return ret; } -/* Set all aggegate lattices in PLATS to bottom and return true if they were +/* Set all aggregate lattices in PLATS to bottom and return true if they were not previously set as such. */ static inline bool
[PING 1] Testcase for PR 88297 and minor fixes
On 2018-12-06 3:13 p.m., Michael Ploujnikov wrote: > Thanks to Martin we now have a test that exercises (cp) cloning > machinery during the WPA stage of LTO. > > Also, during debugging I found that print_all_lattices would trigger > an assert if I tried to call it inside decide_whether_version_node. > > Finally I've attached some comment spelling fixes as a bonus. > > > Bootstrapping (--with-build-config=bootstrap-lto) and regression testing on > x86_64. > > Ok for trunk after tests pass? > > > - Michael > Ping? signature.asc Description: OpenPGP digital signature
Avoid unnecessarily numbered clone symbols
While working on https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html I've accumulated a few easy patches. The first one renames the functions in question to hopefully encourage proper future usage. The other ones use the unnumbered version of the clone name function where I've verified the numbers are not needed. I've verified these by doing a full bootstrap and a regression test, by instrumenting the code and by understanding and following the surrounding code to convince myself that the numbering is indeed not needed. For the cold functions I've also confirmed with Sriraman Tallam that they don't need to be numbered. Regards, - Michael From 0bbcf3b8c20498f4d861e088ff7ab38e2a43800b Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Tue, 7 Aug 2018 20:36:53 -0400 Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to clarify usage. gcc: 2018-10-19 Michael Ploujnikov * gcc/cgraph.h: Rename clone_function_name_1 to numbered_clone_function_name_1. Rename clone_function_name to numbered_clone_function_name. * cgraphclones.c: Ditto. * config/rs6000/rs6000.c: Ditto. * lto/lto-partition.c: Ditto. * multiple_target.c: Ditto. * omp-expand.c: Ditto. * omp-low.c: Ditto. * omp-simd-clone.c: Ditto. * symtab.c: Ditto. --- gcc/cgraph.h | 4 ++-- gcc/cgraphclones.c | 20 +++- gcc/config/rs6000/rs6000.c | 2 +- gcc/lto/lto-partition.c| 4 ++-- gcc/multiple_target.c | 8 gcc/omp-expand.c | 2 +- gcc/omp-low.c | 4 ++-- gcc/omp-simd-clone.c | 2 +- gcc/symtab.c | 2 +- 9 files changed, 25 insertions(+), 23 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index a8b1b4c..3583f7e 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree); /* In cgraphclones.c */ -tree clone_function_name_1 (const char *, const char *); -tree clone_function_name (tree decl, const char *); +tree numbered_clone_function_name_1 (const char *, const char *); +tree numbered_clone_function_name (tree decl, const char *); void tree_function_versioning (tree, tree, vec *, bool, bitmap, bool, bitmap, basic_block); diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..cdb183d 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -316,7 +316,7 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node) gcc_checking_assert (!DECL_RESULT (new_decl)); gcc_checking_assert (!DECL_RTL_SET_P (new_decl)); - DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk"); + DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, "artificial_thunk"); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); new_thunk = cgraph_node::create (new_decl); @@ -514,11 +514,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, static GTY(()) unsigned int clone_fn_id_num; -/* Return a new assembler name for a clone with SUFFIX of a decl named - NAME. */ +/* Return NAME appended with string SUFFIX and a unique unspecified + number. */ tree -clone_function_name_1 (const char *name, const char *suffix) +numbered_clone_function_name_1 (const char *name, const char *suffix) { size_t len = strlen (name); char *tmp_name, *prefix; @@ -531,13 +531,15 @@ clone_function_name_1 (const char *name, const char *suffix) return get_identifier (tmp_name); } -/* Return a new assembler name for a clone of DECL with SUFFIX. */ +/* Return a new assembler name for a clone of DECL. Apart from the + string SUFFIX, the new name will end with a unique unspecified + number. */ tree -clone_function_name (tree decl, const char *suffix) +numbered_clone_function_name (tree decl, const char *suffix) { tree name = DECL_ASSEMBLER_NAME (decl); - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); + return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); } @@ -585,7 +587,7 @@ cgraph_node::create_virtual_clone (vec redirect_callers, strcpy (name + len + 1, suffix); name[len] = '.'; DECL_NAME (new_decl) = get_identifier (name); - SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix)); + SET_DECL_ASSEMBLER_NAME (new_decl, numbered_clone_function_name (old_decl, suffix)); SET_DECL_RTL (new_decl, NULL); new_node = create_clone (new_decl, count, false, @@ -964,7 +966,7 @@ cgraph_node::create_version_clone_with_body = build_function_decl_skip_args (old_decl, args_to_skip, skip_return); /* Generate a new name for the new version. */ - DECL_NAME (new_decl) = clone_function_name (old_decl, suffix); + DECL_NAME (new_decl) = numbered_clone_function_
Re: Avoid unnecessarily numbered clone symbols
On 2018-10-20 07:39 AM, Bernhard Reutner-Fischer wrote: > On 20 October 2018 00:26:15 CEST, Michael Ploujnikov > wrote: >> While working on >> https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html I've >> accumulated a few easy patches. > > > +/* Return decl name IDENTIFIER with string SUFFIX appended. */ > + > +tree > +suffixed_function_name (tree identifier, const char *suffix) > +{ > + const char *name = IDENTIFIER_POINTER (identifier); > + size_t len = strlen (name); > + char *prefix; > + > + prefix = XALLOCAVEC (char, len + strlen (suffix) + 2); > + memcpy (prefix, name, len); > + prefix[len] = symbol_table::symbol_suffix_separator (); > + strcpy (prefix + len + 1, suffix); > + return get_identifier (prefix); > +} > + > > FWIW I think I would phrase this as > > char *str = concat ( > IDENTIFIER_POINTER (identifier), > symbol_table::symbol_suffix_separator (), > suffix); > tree ret = get_identifier (str); > free (str); > return ret; > Thanks for the suggestion Bernhard. I also found that the last argument to concat has to be NULL and I can use ACONCAT to avoid the explicit free and since symbol_table::symbol_suffix_separator returns just one char I need to first put it into a string. I also looked into re-writing numbered_clone_function_name in a similar way, but before I got too far I found a small issue with my suffixed_function_name: If I'm going to write an exact replacement for numbered_clone_function_name then I need to also copy the double-underscore prefixing behaviour done by ASM_PN_FORMAT (right?) which is used by ASM_FORMAT_PRIVATE_NAME and write it as: char *separator = XALLOCAVEC (char, 2); separator[0] = symbol_table::symbol_suffix_separator (); separator[1] = 0; return get_identifier (ACONCAT (( #if defined (NO_DOT_IN_LABEL) && defined (NO_DOLLAR_IN_LABEL) "__", #endif IDENTIFIER_POINTER (identifier), separator, suffix, NULL))); (I'm not sure if the formatting is correct) However, then it's not exactly the same as the code that I'm also trying to replace in cgraph_node::create_virtual_clone because it doesn't add a double underscore if neither NO_DOT_IN_LABEL nor NO_DOLLAR_IN_LABEL is defined. Is this just an omission that I should fix by with my new function or was it indended that way and shouldn't be changed? Suggestions anyone? - Michael signature.asc Description: OpenPGP digital signature
[PATCH v2] Avoid unnecessarily numbered clone symbols
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01258.html Fixed up the code after the change to concat suggested by Bernhard Reutner. Outstanding question still remains: To write an exact replacement for numbered_clone_function_name (apart from the numbering) I also need to copy the double underscore prefixing behaviour done by ASM_PN_FORMAT (right?) which is used by ASM_FORMAT_PRIVATE_NAME. Does that mean that I can't use my suffixed_function_name to replace the very similar looking code in cgraph_node::create_virtual_clone? Or is it just missing the double underscore prefix by mistake? - Michael From 74435e1d8c5984eaee766d7940eeffbe565fcc2e Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Tue, 7 Aug 2018 20:36:53 -0400 Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to clarify usage. gcc: 2018-10-19 Michael Ploujnikov * gcc/cgraph.h: Rename clone_function_name_1 to numbered_clone_function_name_1. Rename clone_function_name to numbered_clone_function_name. * cgraphclones.c: Ditto. * config/rs6000/rs6000.c: Ditto. * lto/lto-partition.c: Ditto. * multiple_target.c: Ditto. * omp-expand.c: Ditto. * omp-low.c: Ditto. * omp-simd-clone.c: Ditto. * symtab.c: Ditto. --- gcc/cgraph.h | 4 ++-- gcc/cgraphclones.c | 22 +- gcc/config/rs6000/rs6000.c | 2 +- gcc/lto/lto-partition.c| 4 ++-- gcc/multiple_target.c | 8 gcc/omp-expand.c | 2 +- gcc/omp-low.c | 4 ++-- gcc/omp-simd-clone.c | 2 +- gcc/symtab.c | 3 ++- 9 files changed, 28 insertions(+), 23 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index a8b1b4c..3583f7e 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree); /* In cgraphclones.c */ -tree clone_function_name_1 (const char *, const char *); -tree clone_function_name (tree decl, const char *); +tree numbered_clone_function_name_1 (const char *, const char *); +tree numbered_clone_function_name (tree decl, const char *); void tree_function_versioning (tree, tree, vec *, bool, bitmap, bool, bitmap, basic_block); diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..bc59dc2 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -316,7 +316,8 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node) gcc_checking_assert (!DECL_RESULT (new_decl)); gcc_checking_assert (!DECL_RTL_SET_P (new_decl)); - DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk"); + DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, + "artificial_thunk"); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); new_thunk = cgraph_node::create (new_decl); @@ -514,11 +515,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, static GTY(()) unsigned int clone_fn_id_num; -/* Return a new assembler name for a clone with SUFFIX of a decl named - NAME. */ +/* Return NAME appended with string SUFFIX and a unique unspecified + number. */ tree -clone_function_name_1 (const char *name, const char *suffix) +numbered_clone_function_name_1 (const char *name, const char *suffix) { size_t len = strlen (name); char *tmp_name, *prefix; @@ -531,13 +532,15 @@ clone_function_name_1 (const char *name, const char *suffix) return get_identifier (tmp_name); } -/* Return a new assembler name for a clone of DECL with SUFFIX. */ +/* Return a new assembler name for a clone of DECL. Apart from the + string SUFFIX, the new name will end with a unique unspecified + number. */ tree -clone_function_name (tree decl, const char *suffix) +numbered_clone_function_name (tree decl, const char *suffix) { tree name = DECL_ASSEMBLER_NAME (decl); - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); + return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); } @@ -585,7 +588,8 @@ cgraph_node::create_virtual_clone (vec redirect_callers, strcpy (name + len + 1, suffix); name[len] = '.'; DECL_NAME (new_decl) = get_identifier (name); - SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix)); + SET_DECL_ASSEMBLER_NAME (new_decl, + numbered_clone_function_name (old_decl,suffix)); SET_DECL_RTL (new_decl, NULL); new_node = create_clone (new_decl, count, false, @@ -964,7 +968,7 @@ cgraph_node::create_version_clone_with_body = build_function_decl_skip_args (old_decl, args_to_skip, skip_return); /* Generate a new name for the new version. */ - DECL_NAME (new_decl) = clone_function_name (old_decl, suffix); + DECL_NAME (new_decl) = numbered_clone_function_name (old_decl, suffix); SET
[PATCH v3] Avoid unnecessarily numbered clone symbols
On 2018-10-21 09:14 PM, Michael Ploujnikov wrote: > Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01258.html > > Fixed up the code after the change to concat suggested by Bernhard > Reutner. > > Outstanding question still remains: > > To write an exact replacement for numbered_clone_function_name (apart > from the numbering) I also need to copy the double underscore > prefixing behaviour done by ASM_PN_FORMAT (right?) which is used by > ASM_FORMAT_PRIVATE_NAME. Does that mean that I can't use my > suffixed_function_name to replace the very similar looking code in > cgraph_node::create_virtual_clone? Or is it just missing the double > underscore prefix by mistake? I found https://gcc.gnu.org/ml/gcc-patches/2013-08/msg01028.html which answered my question so now I'm just simplifying cgraph_node::create_virtual_clone with ACONCAT. - Michael From 383c64faa956c8b06e680808ef275acb6a746158 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Tue, 7 Aug 2018 20:36:53 -0400 Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to clarify usage. gcc: 2018-10-23 Michael Ploujnikov * gcc/cgraph.h: Rename clone_function_name_1 to numbered_clone_function_name_1. Rename clone_function_name to numbered_clone_function_name. * cgraphclones.c: Ditto. * config/rs6000/rs6000.c: Ditto. * lto/lto-partition.c: Ditto. * multiple_target.c: Ditto. * omp-expand.c: Ditto. * omp-low.c: Ditto. * omp-simd-clone.c: Ditto. * symtab.c: Ditto. --- gcc/cgraph.h | 4 ++-- gcc/cgraphclones.c | 22 +- gcc/config/rs6000/rs6000.c | 2 +- gcc/lto/lto-partition.c| 4 ++-- gcc/multiple_target.c | 8 gcc/omp-expand.c | 2 +- gcc/omp-low.c | 4 ++-- gcc/omp-simd-clone.c | 2 +- gcc/symtab.c | 3 ++- 9 files changed, 28 insertions(+), 23 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index a8b1b4c..3583f7e 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree); /* In cgraphclones.c */ -tree clone_function_name_1 (const char *, const char *); -tree clone_function_name (tree decl, const char *); +tree numbered_clone_function_name_1 (const char *, const char *); +tree numbered_clone_function_name (tree decl, const char *); void tree_function_versioning (tree, tree, vec *, bool, bitmap, bool, bitmap, basic_block); diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..4395806 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -316,7 +316,8 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node) gcc_checking_assert (!DECL_RESULT (new_decl)); gcc_checking_assert (!DECL_RTL_SET_P (new_decl)); - DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk"); + DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, + "artificial_thunk"); SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl)); new_thunk = cgraph_node::create (new_decl); @@ -514,11 +515,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, static GTY(()) unsigned int clone_fn_id_num; -/* Return a new assembler name for a clone with SUFFIX of a decl named - NAME. */ +/* Return NAME appended with string SUFFIX and a unique unspecified + number. */ tree -clone_function_name_1 (const char *name, const char *suffix) +numbered_clone_function_name_1 (const char *name, const char *suffix) { size_t len = strlen (name); char *tmp_name, *prefix; @@ -531,13 +532,15 @@ clone_function_name_1 (const char *name, const char *suffix) return get_identifier (tmp_name); } -/* Return a new assembler name for a clone of DECL with SUFFIX. */ +/* Return a new assembler name for a clone of DECL. Apart from the + string SUFFIX, the new name will end with a unique unspecified + number. */ tree -clone_function_name (tree decl, const char *suffix) +numbered_clone_function_name (tree decl, const char *suffix) { tree name = DECL_ASSEMBLER_NAME (decl); - return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); + return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix); } @@ -585,7 +588,8 @@ cgraph_node::create_virtual_clone (vec redirect_callers, strcpy (name + len + 1, suffix); name[len] = '.'; DECL_NAME (new_decl) = get_identifier (name); - SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix)); + SET_DECL_ASSEMBLER_NAME (new_decl, + numbered_clone_function_name (old_decl, suffix)); SET_DECL_RTL (new_decl, NULL); new_node = create_clone (new_decl, count, false, @@ -964,7 +968,7 @@ cgraph_node::creat
[PATCH v4] Avoid unnecessarily numbering cloned symbols.
I've taken the advice from a discussion on IRC and re-wrote the patch with more uniform function names and using overloading. I think this function accomplished the following goals: - remove clone numbering where it's not needed: final.c:final_scan_insn_1 and symtab.c:simd_symtab_node::noninterposable_alias. - name and document the clone naming API such that future users won't accidentally use the numbering when it's not necessary; if you need numbering then you need to explicitly ask for it with the right function - provide a new function that allows users to specify a clone number explicitly as an argument My thoughts for future improvements: - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1 has to break the decl abstraction and pass in a string that it created into what I would consider the implementation-detail function. The best way I can think of to make it uniform with the rest of the users is to have it create a new empty decl with DECL_ASSEMBLER_NAME set to the new string - It's unfortunate that I have to duplicate the separator concatenation in the numberless clone_function_name, but I think it has to be like that unless ASM_FORMAT_PRIVATE_NAME making the number optional. From 0df79d0ac6a9891b344f988c7157a54cebbc1cb8 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 25 Oct 2018 13:16:36 -0400 Subject: [PATCH] Avoid unnecessarily numbering cloned symbols. gcc/ChangeLog: 2018-10-25 Michael Ploujnikov * cgraph.h (clone_function_name_1): Replaced by new clone_function_name_numbered that takes name as string; for privatize_symbol_name_1 use only. (clone_function_name): Renamed to clone_function_name_numbered to be explicit about numbering. (clone_function_name): New two-argument function that does not number its output. (clone_function_name): New three-argument function that takes a number to append to its output. * cgraphclones.c (duplicate_thunk_for_node): (clone_function_name_1): Renamed. (clone_function_name_numbered): Two new functions. (clone_function_name): Improved documentation. (cgraph_node::create_virtual_clone): Use clone_function_name_numbered. * config/rs6000/rs6000.c (make_resolver_func): Ditto. * final.c (final_scan_insn_1): Use the new clone_function_name without numbering. * multiple_target.c (create_dispatcher_calls): Ditto. (create_target_clone): Ditto. * omp-expand.c (grid_expand_target_grid_body): Ditto. * omp-low.c (create_omp_child_function_name): Ditto. * omp-simd-clone.c (simd_clone_create): Ditto. * symtab.c (simd_symtab_node::noninterposable_alias): Use the new clone_function_name without numbering. gcc/lto/ChangeLog: 2018-10-25 Michael Ploujnikov * lto-partition.c (privatize_symbol_name_1): Use clone_function_name_numbered. gcc/testsuite/ChangeLog: 2018-10-25 Michael Ploujnikov * gcc.dg/tree-prof/cold_partition_label.c: Update for cold section names without numbers. * gcc.dg/tree-prof/section-attr-1.c: Ditto. * gcc.dg/tree-prof/section-attr-2.c: Ditto. * gcc.dg/tree-prof/section-attr-3.c: Ditto. --- gcc/cgraph.h | 7 ++- gcc/cgraphclones.c | 66 ++ gcc/config/rs6000/rs6000.c | 2 +- gcc/lto/lto-partition.c| 4 +- gcc/multiple_target.c | 8 +-- gcc/omp-expand.c | 3 +- gcc/omp-low.c | 4 +- gcc/omp-simd-clone.c | 3 +- .../gcc.dg/tree-prof/cold_partition_label.c| 4 +- gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c| 2 +- gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c| 2 +- gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c| 2 +- 12 files changed, 79 insertions(+), 28 deletions(-) diff --git gcc/cgraph.h gcc/cgraph.h index a8b1b4c..971065d 100644 --- gcc/cgraph.h +++ gcc/cgraph.h @@ -2368,8 +2368,11 @@ basic_block init_lowered_empty_function (tree, bool, profile_count); tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree); /* In cgraphclones.c */ -tree clone_function_name_1 (const char *, const char *); -tree clone_function_name (tree decl, const char *); +tree clone_function_name_numbered (const char *name, const char *suffix); +tree clone_function_name_numbered (tree decl, const char *suffix); +tree clone_function_name (const char *name, const char *suffix, + unsigned long number); +tree clone_function_name (tree decl, const char *suffix); void tree_function_versioning (tree, tree, vec *, bool, bitmap, bool, bitmap, basic_block); diff --git gcc/cgraphclones.c gcc/cgraphclones.c index 6e84a31..0e496d9 100644 --- gcc/cgraphclones.c +++ gcc/c
Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.
Hi Martin, Thanks for the review. On 2018-10-26 03:51 AM, Martin Liška wrote: > On 10/26/18 12:59 AM, Michael Ploujnikov wrote: >> I've taken the advice from a discussion on IRC and re-wrote the patch >> with more uniform function names and using overloading. >> >> I think this function accomplished the following goals: >> - remove clone numbering where it's not needed: >>final.c:final_scan_insn_1 and >>symtab.c:simd_symtab_node::noninterposable_alias. >> - name and document the clone naming API such that future users won't >>accidentally use the numbering when it's not necessary; if you need >>numbering then you need to explicitly ask for it with the right >>function >> - provide a new function that allows users to specify a clone number >>explicitly as an argument > > Hello. > > Thanks for reworking that. > >> >> My thoughts for future improvements: >> - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1 >>has to break the decl abstraction and pass in a string that it >>created into what I would consider the implementation-detail >>function. The best way I can think of to make it uniform with the >>rest of the users is to have it create a new empty decl with >>DECL_ASSEMBLER_NAME set to the new string > > That's not nice to create artificial declaration. Having string variant > is fine for me. Ok. > >> - It's unfortunate that I have to duplicate the separator >>concatenation in the numberless clone_function_name, but I think it >>has to be like that unless ASM_FORMAT_PRIVATE_NAME making the >>number optional. >> > > That's also fine for me. I'm attaching small nits that I found. > And please reformat following chunk in ChangeLog entry: > > * cgraph.h (clone_function_name_1): Replaced by new > clone_function_name_numbered that takes name as string; for > privatize_symbol_name_1 use only. (clone_function_name): > Renamed to clone_function_name_numbered to be explicit about > numbering. (clone_function_name): New two-argument function > that does not number its output. (clone_function_name): New > three-argument function that takes a number to append to its > output. > > into: > > * cgraph.h (clone_function_name_1): Replaced by new > clone_function_name_numbered that takes name as string; for > privatize_symbol_name_1 use only. > (clone_function_name): Renamed to clone_function_name_numbered > to be explicit about... Fixed, assuming you wanted me to start each function on a new line. > suffix, > -(char*)0)); > +NULL)); >return get_identifier (result); > } I've actually been told that NULL isn't always the same on some targets and that I should use (char*)0 instead. Note that libiberty/concat.c itself uses (char*)0. > > I'm adding Honza to CC, hope he can review it quickly. > > Thanks, > Martin > Thanks again, Michael From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 25 Oct 2018 13:16:36 -0400 Subject: [PATCH] Avoid unnecessarily numbering cloned symbols. gcc/ChangeLog: 2018-10-26 Michael Ploujnikov * cgraph.h (clone_function_name_1): Replaced by new clone_function_name_numbered that takes name as string; for privatize_symbol_name_1 use only. (clone_function_name): Renamed to clone_function_name_numbered to be explicit about numbering. (clone_function_name): New two-argument function that does not number its output. (clone_function_name): New three-argument function that takes a number to append to its output. * cgraphclones.c (duplicate_thunk_for_node): (clone_function_name_1): Renamed. (clone_function_name_numbered): Two new functions. (clone_function_name): Improved documentation. (cgraph_node::create_virtual_clone): Use clone_function_name_numbered. * config/rs6000/rs6000.c (make_resolver_func): Ditto. * final.c (final_scan_insn_1): Use the new clone_function_name without numbering. * multiple_target.c (create_dispatcher_calls): Ditto. (create_target_clone): Ditto. * omp-expand.c (grid_expand_target_grid_body): Ditto. * omp-low.c (create_omp_child_function_name): Ditto. * omp-simd-clone.c (simd_clone_create): Ditto. * symtab.c (simd_symtab_node::noninterposable_alias): Use the new clone_function_name without numbering. gcc/lto/ChangeLog: 2018-10-26 Michael Ploujnikov * lto-partition.c (privatize_symbol_name_1): Use
Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.
Hi, On 2018-10-26 10:25 a.m., Jan Hubicka wrote: >> From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001 >> From: Michael Ploujnikov >> Date: Thu, 25 Oct 2018 13:16:36 -0400 >> Subject: [PATCH] Avoid unnecessarily numbering cloned symbols. >> >> gcc/ChangeLog: >> >> 2018-10-26 Michael Ploujnikov >> >> * cgraph.h (clone_function_name_1): Replaced by new >>clone_function_name_numbered that takes name as string; for >>privatize_symbol_name_1 use only. >>(clone_function_name): Renamed to >>clone_function_name_numbered to be explicit about numbering. >>(clone_function_name): New two-argument function that does >>not number its output. >>(clone_function_name): New three-argument function that >>takes a number to append to its output. >> * cgraphclones.c (duplicate_thunk_for_node): >>(clone_function_name_1): Renamed. >>(clone_function_name_numbered): Two new functions. >>(clone_function_name): Improved documentation. >>(cgraph_node::create_virtual_clone): Use clone_function_name_numbered. >> * config/rs6000/rs6000.c (make_resolver_func): Ditto. >> * final.c (final_scan_insn_1): Use the new clone_function_name >>without numbering. >> * multiple_target.c (create_dispatcher_calls): Ditto. >>(create_target_clone): Ditto. >> * omp-expand.c (grid_expand_target_grid_body): Ditto. >> * omp-low.c (create_omp_child_function_name): Ditto. >> * omp-simd-clone.c (simd_clone_create): Ditto. >> * symtab.c (simd_symtab_node::noninterposable_alias): Use the >>new clone_function_name without numbering. >> >> gcc/lto/ChangeLog: >> >> 2018-10-26 Michael Ploujnikov >> >> * lto-partition.c (privatize_symbol_name_1): Use >>clone_function_name_numbered. >> >> gcc/testsuite/ChangeLog: >> >> 2018-10-26 Michael Ploujnikov >> >> * gcc.dg/tree-prof/cold_partition_label.c: Update for cold >>section names without numbers. >> * gcc.dg/tree-prof/section-attr-1.c: Ditto. >> * gcc.dg/tree-prof/section-attr-2.c: Ditto. >> * gcc.dg/tree-prof/section-attr-3.c: Ditto. > > OK, > thanks! > Honza > Thanks again for the review. This is my first patch and I don't have commit access. What should I do? - Michael signature.asc Description: OpenPGP digital signature
Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.
On 2018-10-29 6:49 a.m., Martin Liška wrote: > On 10/29/18 9:40 AM, Martin Liška wrote: >> On 10/27/18 6:15 PM, Michael Ploujnikov wrote: >>> Hi, >>> >>> On 2018-10-26 10:25 a.m., Jan Hubicka wrote: >>>>> From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001 >>>>> From: Michael Ploujnikov >>>>> Date: Thu, 25 Oct 2018 13:16:36 -0400 >>>>> Subject: [PATCH] Avoid unnecessarily numbering cloned symbols. >>>>> >>>>> gcc/ChangeLog: >>>>> >>>>> 2018-10-26 Michael Ploujnikov >>>>> >>>>> * cgraph.h (clone_function_name_1): Replaced by new >>>>> clone_function_name_numbered that takes name as string; for >>>>> privatize_symbol_name_1 use only. >>>>> (clone_function_name): Renamed to >>>>> clone_function_name_numbered to be explicit about numbering. >>>>> (clone_function_name): New two-argument function that does >>>>> not number its output. >>>>> (clone_function_name): New three-argument function that >>>>> takes a number to append to its output. >>>>> * cgraphclones.c (duplicate_thunk_for_node): >>>>> (clone_function_name_1): Renamed. >>>>> (clone_function_name_numbered): Two new functions. >>>>> (clone_function_name): Improved documentation. >>>>> (cgraph_node::create_virtual_clone): Use clone_function_name_numbered. >>>>> * config/rs6000/rs6000.c (make_resolver_func): Ditto. >>>>> * final.c (final_scan_insn_1): Use the new clone_function_name >>>>> without numbering. >>>>> * multiple_target.c (create_dispatcher_calls): Ditto. >>>>> (create_target_clone): Ditto. >>>>> * omp-expand.c (grid_expand_target_grid_body): Ditto. >>>>> * omp-low.c (create_omp_child_function_name): Ditto. >>>>> * omp-simd-clone.c (simd_clone_create): Ditto. >>>>> * symtab.c (simd_symtab_node::noninterposable_alias): Use the >>>>> new clone_function_name without numbering. >>>>> >>>>> gcc/lto/ChangeLog: >>>>> >>>>> 2018-10-26 Michael Ploujnikov >>>>> >>>>> * lto-partition.c (privatize_symbol_name_1): Use >>>>> clone_function_name_numbered. >>>>> >>>>> gcc/testsuite/ChangeLog: >>>>> >>>>> 2018-10-26 Michael Ploujnikov >>>>> >>>>> * gcc.dg/tree-prof/cold_partition_label.c: Update for cold >>>>> section names without numbers. >>>>> * gcc.dg/tree-prof/section-attr-1.c: Ditto. >>>>> * gcc.dg/tree-prof/section-attr-2.c: Ditto. >>>>> * gcc.dg/tree-prof/section-attr-3.c: Ditto. >>>> >>>> OK, >>>> thanks! >>>> Honza >>>> >>> >>> Thanks again for the review. This is my first patch and I don't have >>> commit access. What should I do? >> >> I'm going to install the patch on your behalf. For write access you should >> follow these intructions: >> https://www.gnu.org/software/gcc/svnwrite.html#policies >> >> Martin >> >>> >>> >>> - Michael >>> >> > > But first I see some failures when I tried to apply the patch: > > $ patch -p0 --dry-run < > ~/Downloads/0001-Avoid-unnecessarily-numbering-cloned-symbols.patch > checking file gcc/cgraph.h > Hunk #1 succeeded at 2382 with fuzz 1 (offset 14 lines). > checking file gcc/cgraphclones.c > Hunk #1 succeeded at 317 (offset 1 line). > checking file gcc/config/rs6000/rs6000.c > Hunk #1 succeeded at 36997 (offset 485 lines). > checking file gcc/lto/lto-partition.c > checking file gcc/multiple_target.c > checking file gcc/omp-expand.c > checking file gcc/omp-low.c > checking file gcc/omp-simd-clone.c > checking file gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c > checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c > Hunk #1 FAILED at 42. > 1 out of 1 hunk FAILED > checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c > Hunk #1 FAILED at 41. > 1 out of 1 hunk FAILED > checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c > Hunk #1 FAILED at 42. > 1 out of 1 hunk FAILED > > Can you please rebase that on top of current trunk? > Thanks, > Martin > Sorry about that. Attached is the fixed patch. Note that I'm cu
[doc PATCH] Fix weakref description.
I came across this typo and also added a similar ld invocation for illustration purposes as mentioned by Jakub on irc. From 2df4903f04fbe68e9e6a1ae0eea460e7592a8512 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Fri, 2 Nov 2018 13:40:50 -0400 Subject: [PATCH] Fix weakref description. gcc/ChangeLog: 2018-11-02 Michael Ploujnikov * doc/extend.texi: Fix typo in the weakref description. --- gcc/doc/extend.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git gcc/doc/extend.texi gcc/doc/extend.texi index 4dbb2da39e..027e0f75a1 100644 --- gcc/doc/extend.texi +++ gcc/doc/extend.texi @@ -3603,7 +3603,7 @@ symbol, not necessarily in the same translation unit. The effect is equivalent to moving all references to the alias to a separate translation unit, renaming the alias to the aliased symbol, declaring it as weak, compiling the two separate translation units and -performing a reloadable link on them. +performing a incremental link (like @code{ld -r}) on them. At present, a declaration to which @code{weakref} is attached can only be @code{static}. -- 2.19.1 signature.asc Description: OpenPGP digital signature
Re: [doc PATCH] Fix weakref description.
On 2018-11-02 1:59 p.m., Michael Ploujnikov wrote: > I came across this typo and also added a similar ld invocation for > illustration purposes as mentioned by Jakub on irc. > After talking to Jakub about it, I went with different terminology. - Michael From f14d7315e0dc9c4b6aff6137fd90e4d2595ef9f5 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 12 Nov 2018 12:42:37 -0500 Subject: [PATCH] Fix weakref description. gcc/ChangeLog: 2018-11-12 Michael Ploujnikov * doc/extend.texi: Fix typo in the weakref description. --- gcc/doc/extend.texi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git gcc/doc/extend.texi gcc/doc/extend.texi index e2b9ee11a54..fc507afe600 100644 --- gcc/doc/extend.texi +++ gcc/doc/extend.texi @@ -3619,7 +3619,7 @@ symbol, not necessarily in the same translation unit. The effect is equivalent to moving all references to the alias to a separate translation unit, renaming the alias to the aliased symbol, declaring it as weak, compiling the two separate translation units and -performing a reloadable link on them. +performing a link with relocatable output (ie: @code{ld -r}) on them. At present, a declaration to which @code{weakref} is attached can only be @code{static}. -- 2.19.1 signature.asc Description: OpenPGP digital signature
Re: [doc PATCH] Fix weakref description.
On 2018-11-12 12:50 p.m., Michael Ploujnikov wrote: > On 2018-11-02 1:59 p.m., Michael Ploujnikov wrote: >> I came across this typo and also added a similar ld invocation for >> illustration purposes as mentioned by Jakub on irc. >> > > After talking to Jakub about it, I went with different terminology. > > > - Michael > Installed as obvious. - Michael signature.asc Description: OpenPGP digital signature
[PATCH v3] Make function clone name numbering independent.
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html It took longer than expected, but I've finally rebased this on top of the new clone_function_name* API and included the requested optimizations. There are a few remaining spots where we could probably apply similar optimizations: - gcc/multiple_target.c:create_target_clone - gcc/multiple_target.c:create_dispatcher_calls - gcc/omp-expand.c:grid_expand_target_grid_body But I've yet to figure out how these work in detail and with the new API these shouldn't block the main change from being merged. I've also included a small change to rs6000 which I'm pretty sure is safe, but I have no way of testing. I'm not sure what's the consensus on what's more appropriate, but I thought that it would be a good idea to keep these changes as separate patches since only the first one really changes behaviour, while the rest are optimizations. It's conceivable that someone who is backporting this to an older release might want to just backport the core bits of the change. I can re-submit it as one patch if that's more appropriate. Everything in these patches was bootstrapped and regression tested on x86_64. Ok for trunk? - Michael From 40cb5c888522d69bc42791f0c884dcb9e29eff37 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 1 Nov 2018 12:57:30 -0400 Subject: [PATCH 1/4] Make function assembly more independent. This is achieved by having clone_function_name assign unique clone numbers for each function independently. gcc: 2018-11-28 Michael Ploujnikov * cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids; hash map. (clone_function_name_numbered): Use clone_fn_ids. gcc/testsuite: 2018-11-28 Michael Ploujnikov * gcc.dg/independent-cloneids-1.c: New test. --- gcc/cgraphclones.c| 10 - gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++ 2 files changed, 46 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c diff --git gcc/cgraphclones.c gcc/cgraphclones.c index e17959c0ca..fdd84b60d3 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, return new_node; } -static GTY(()) unsigned int clone_fn_id_num; +static GTY(()) hash_map *clone_fn_ids; /* Return a new assembler name for a clone of decl named NAME. Apart from the string SUFFIX, the new name will end with a unique (for @@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num; tree clone_function_name_numbered (const char *name, const char *suffix) { - return clone_function_name (name, suffix, clone_fn_id_num++); + /* Initialize the function->counter mapping the first time it's + needed. */ + if (!clone_fn_ids) +clone_fn_ids = hash_map::create_ggc (64); + unsigned int &suffix_counter = clone_fn_ids->get_or_insert ( + IDENTIFIER_POINTER (get_identifier (name))); + return clone_function_name (name, suffix, suffix_counter++); } /* Return a new assembler name for a clone of DECL. Apart from string diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c new file mode 100644 index 00..3203895492 --- /dev/null +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone" } */ + +extern int printf (const char *, ...); + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 7 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg; +} + +int +baz (int arg) +{ + printf("%d\n", bar (3)); + printf("%d\n", bar (4)); + printf("%d\n", foo (5)); + printf("%d\n", foo (6)); + /* adding or removing the following call should not affect foo + function's clone numbering */ + printf("%d\n", bar (7)); + return foo (8); +} + +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */ +/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */ +/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */ -- 2.19.1 From 80b07ddf059415c896cecb9862517899db3993e9 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Mon, 17 Sep 2018 16:02:27 -0400 Subject: [PATCH 2/4] Minimize clone counter memory usage in create_virtual_clone. Based on Martin Jambour's suggestion: https://g
Re: [PATCH v3] Make function clone name numbering independent.
On 2018-11-28 5:49 p.m., Segher Boessenkool wrote: > Hi! > > On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote: >> I've also included a small change to rs6000 which I'm pretty sure is >> safe, but I have no way of testing. > > Do you have an account on the GCC Compile Farm? > https://cfarm.tetaneutral.net/ > There are some nice Power machines in there! > > Does this patch dependent on the rest of the series? > > If it tests okay, it is okay for trunk of course. Thanks! > > One comment about your changelog: > >> 2018-11-28 Michael Ploujnikov >> >> * config/rs6000/rs6000.c (make_resolver_func): Generate >>resolver symbol with clone_function_name instead of >>clone_function_name_numbered. > > Those last two lines should not start with the spaces. It should be: > > 2018-11-28 Michael Ploujnikov > > * config/rs6000/rs6000.c (make_resolver_func): Generate > resolver symbol with clone_function_name instead of > clone_function_name_numbered. > > > Segher > Thanks for the review and suggestion to use the cfarm! I've successfully regtested this on power9 and committed the stand-alone change to rs6000.c after fixing the changelog. - Michael signature.asc Description: OpenPGP digital signature
Re: [PATCH v3] Make function clone name numbering independent.
Hi, On 2018-11-30 2:25 a.m., Richard Biener wrote: > + unsigned &clone_number = lto_clone_numbers->get_or_insert ( > + IDENTIFIER_POINTER (DECL_NAME (decl))); >name = maybe_rewrite_identifier (name); >symtab->change_decl_assembler_name (decl, > - clone_function_name_numbered ( > - name, "lto_priv")); > + clone_function_name ( > + name, "lto_priv", clone_number)); > > since we use 'name' from maybe_rewrite_identifier in the end wouldn't it > make more sense to use that as a key for lto_clone_numbers? Yup, that looks much better. Fixed it. > For the ipa-hsa.c changes since we iterate over all defined functions > we at most create a single clone per cgraph_node. That means your > hash-map keyed on cgraph_node is useless and you could use > an unnumbered clone (or a static zero number). Good catch. I was thinking of doing this, but it somehow fell through the cracks during the rebase. > > - SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl, > - suffix)); > + SET_DECL_ASSEMBLER_NAME (new_decl, > + clone_function_name ( > + IDENTIFIER_POINTER ( > + DECL_ASSEMBLER_NAME (old_decl)), > + suffix, > + num_suffix)); > > can you please hide the implementation detail of accessing the assembler name > by adding an overload to clone_function_name_numbered with an explicit > number? Done. > > OK with those changes. > > Thanks, > Richard. Thank you for the review. I will commit as soon as my last test run finishes. - Michael From ac1f1579d37804c97833d460ec6cd5b87d6184c7 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 1 Nov 2018 12:57:30 -0400 Subject: [PATCH 1/3] Make function assembly more independent. This is achieved by having clone_function_name assign unique clone numbers for each function independently. gcc: 2018-11-30 Michael Ploujnikov * cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids; hash map. (clone_function_name_numbered): Use clone_fn_ids. gcc/testsuite: 2018-11-30 Michael Ploujnikov * gcc.dg/independent-cloneids-1.c: New test. --- gcc/cgraphclones.c| 10 - gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++ 2 files changed, 46 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c diff --git gcc/cgraphclones.c gcc/cgraphclones.c index e17959c0ca..fdd84b60d3 100644 --- gcc/cgraphclones.c +++ gcc/cgraphclones.c @@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, return new_node; } -static GTY(()) unsigned int clone_fn_id_num; +static GTY(()) hash_map *clone_fn_ids; /* Return a new assembler name for a clone of decl named NAME. Apart from the string SUFFIX, the new name will end with a unique (for @@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num; tree clone_function_name_numbered (const char *name, const char *suffix) { - return clone_function_name (name, suffix, clone_fn_id_num++); + /* Initialize the function->counter mapping the first time it's + needed. */ + if (!clone_fn_ids) +clone_fn_ids = hash_map::create_ggc (64); + unsigned int &suffix_counter = clone_fn_ids->get_or_insert ( + IDENTIFIER_POINTER (get_identifier (name))); + return clone_function_name (name, suffix, suffix_counter++); } /* Return a new assembler name for a clone of DECL. Apart from string diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c new file mode 100644 index 00..3203895492 --- /dev/null +++ gcc/testsuite/gcc.dg/independent-cloneids-1.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone" } */ + +extern int printf (const char *, ...); + +static int __attribute__ ((noinline)) +foo (int arg) +{ + return 7 * arg; +} + +static int __attribute__ ((noinline)) +bar (int arg) +{ + return arg * arg; +} + +int +baz (int arg) +{ + printf("%d\n", bar (3)); + printf("%d\n", bar (4)); + printf("%d\n", foo (5)); + printf("%d\n", foo (6)); + /* adding or removing the following call should not affect foo + function's clone numbering */ + printf("%d\n", bar (7)); + return foo (8); +} + +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */ +/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */ +/* { dg-final { scan-assembler-not {(?
[PATCH] String contents hash map key example
I thought it would be useful to others who are new to the GCC codebase to have an example of how to hash keys based on string contents rather than pointer addresses (the fact that some hash maps key based on semi-reliable pointers (due to ggc_mark_stringpool) into the symtab gives a false sense of security). - Michael From 3433efe4ac558de05410a9b185f4ff0a01e7e5df Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Fri, 11 Jan 2019 09:22:14 -0500 Subject: [PATCH] Document how to hash based on key string contents. gcc: 2019-01-18 Michael Ploujnikov * hash-map-tests.c (test_map_of_strings_to_int): Document how to hash based on key string contents. --- gcc/hash-map-tests.c | 20 1 file changed, 20 insertions(+) diff --git gcc/hash-map-tests.c gcc/hash-map-tests.c index 98b5830497..61da8233c4 100644 --- gcc/hash-map-tests.c +++ gcc/hash-map-tests.c @@ -77,6 +77,26 @@ test_map_of_strings_to_int () m.remove (eric); ASSERT_EQ (5, m.elements ()); ASSERT_EQ (NULL, m.get (eric)); + + /* A plain char * key is hashed based on its value (address), rather + than the string it points to. */ + char *another_ant = static_cast (xcalloc (4, 1)); + another_ant[0] = 'a'; + another_ant[1] = 'n'; + another_ant[2] = 't'; + another_ant[3] = 0; + ASSERT_NE (ant, another_ant); + unsigned prev_size = m.elements (); + ASSERT_EQ (false, m.put (another_ant, 7)); + ASSERT_EQ (prev_size + 1, m.elements ()); + + /* Need to use string_hash or nofree_string_hash key types to hash + based on the string contents. */ + hash_map string_map; + ASSERT_EQ (false, string_map.put (ant, 1)); + ASSERT_EQ (1, string_map.elements ()); + ASSERT_EQ (true, string_map.put (another_ant, 5)); + ASSERT_EQ (1, string_map.elements ()); } /* Run all of the selftests within this file. */ -- 2.19.1 signature.asc Description: OpenPGP digital signature
Re: [PATCH] Fix PR89150, GC of tree-form bitmaps
On 2019-02-04 11:39 a.m., Richard Biener wrote: > On February 4, 2019 5:07:00 PM GMT+01:00, Jeff Law wrote: >> On 2/4/19 6:15 AM, Richard Biener wrote: >>> >>> When I introduced tree-form bitmaps I forgot to think about GC. >>> The following drops the chain_prev annotation to make the marker >>> work for trees. I've also maked the obstack member GTY skip >>> (and prevent bitmap_obstack from gengtype processing) because >>> the obstack isn't used for GC allocated bitmaps. >>> >>> Bootstrap & regtest running on x86_64-unknown-linux-gnu. >>> >>> Richard. >>> >>> 2019-02-04 Richard Biener >>> >>> PR middle-end/89150 >>> * bitmap.h (struct bitmap_obstack): Do not mark GTY. >>> (struct bitmap_element): Drop chain_prev so we properly recurse on >>> the prev member, supporting tree views. >>> (struct bitmap_head): GTY skip the obstack member. >> Was there a particular failure mode you observed or was this discovered >> by inspection. > > It was discovered via an out of tree patch, the issue is only latent on trunk > (together with another unfixed pch issue of bitmaps). My apologies for not noticing this thread/PR earlier. I originally discovered this issue while working on an experimental patch, but since then I've also come up with a test case for this specific issue. Is it worth adding it to trunk? - Michael From 9f1049a1e24cd1aa9852c5dddd1342dce0226416 Mon Sep 17 00:00:00 2001 From: Michael Ploujnikov Date: Thu, 7 Feb 2019 14:43:40 -0500 Subject: [PATCH] Add a test for GC marking bitmaps in tree view mode gcc/ChangeLog: 2019-02-07 Michael Ploujnikov PR middle-end/89150 * bitmap.c (test_bitmap_tree_marking): New test. (NOT_NULL_OR_GARBAGE): For shortening test_bitmap_tree_marking. (bitmap_c_tests): Add test_bitmap_tree_marking. --- gcc/bitmap.c | 82 1 file changed, 82 insertions(+) diff --git gcc/bitmap.c gcc/bitmap.c index 5a8236de75..7e802a1600 100644 --- gcc/bitmap.c +++ gcc/bitmap.c @@ -2626,6 +2626,11 @@ bitmap_head::dump () #if CHECKING_P +static GTY(()) bitmap selftest_test_bitmap; + +/* From ggc-internal.h */ +extern bool ggc_force_collect; + namespace selftest { /* Selftests for bitmaps. */ @@ -2722,6 +2727,82 @@ test_bitmap_single_bit_set_p () ASSERT_EQ (1066, bitmap_first_set_bit (b)); } +#define NOT_NULL_OR_GARBAGE(NODE) \ + (NODE != NULL && (unsigned long)(NODE) != 0xa5a5a5a5UL) + +static void +test_bitmap_tree_marking () +{ + selftest_test_bitmap = BITMAP_GGC_ALLOC (); + bitmap_tree_view (selftest_test_bitmap); + + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 1*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 5*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 10*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 15*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 30*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 25*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 3*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 4*BITMAP_ELEMENT_ALL_BITS)); + ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 26*BITMAP_ELEMENT_ALL_BITS)); + /* At this point it should look like: + 26 + / \ +25 30 + / + 5 + / \ +4 15 + / / + 3 10 + / +1 + + */ + ggc_force_collect = true; + ggc_collect (); + ggc_force_collect = false; + + ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first)); + ASSERT_TRUE (selftest_test_bitmap->first->indx == 26); + + ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->next)); + ASSERT_TRUE (selftest_test_bitmap->first->next->indx == 30); + ASSERT_TRUE (selftest_test_bitmap->first->next->next == NULL); + ASSERT_TRUE (selftest_test_bitmap->first->next->prev == NULL); + + ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev)); + ASSERT_TRUE (selftest_test_bitmap->first->prev->indx == 25); + ASSERT_TRUE (selftest_test_bitmap->first->prev->next == NULL); + + ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev->prev)); + ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->indx == 5); + + ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev->prev->next)); + ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->next->indx == 15); + ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->next->next == NULL); + + ASSERT_TRUE (NOT_NULL_OR_GAR
Re: [PATCH] Fix PR89150, GC of tree-form bitmaps
On 2019-02-07 3:09 p.m., Jakub Jelinek wrote: > On Thu, Feb 07, 2019 at 03:04:21PM -0500, Michael Ploujnikov wrote: >> 2019-02-07 Michael Ploujnikov >> >> PR middle-end/89150 >> * bitmap.c (test_bitmap_tree_marking): New test. >> (NOT_NULL_OR_GARBAGE): For shortening >> test_bitmap_tree_marking. >> (bitmap_c_tests): Add test_bitmap_tree_marking. > > Could you do that instead in a plugin in the testsuite? > I mean, the patch is adding garbage collection roots, so it will not affect > just -fself-tests run, but also any time the compiler will do GC. > > Jakub > I'm not sure what I would need to do to get gengtype to process a test plugin source file and I can't find examples of this. Instead, shouldn't I just do something like what's at the bottom of gcc/ggc-tests.c and not worry about the extra root added to gt-bitmap.h? - Michael signature.asc Description: OpenPGP digital signature