[PATCH] Make function clone name numbering independent.

2018-07-16 Thread Michael Ploujnikov
Hi,

This patch is a small part of the work I'm doing to make function 
codegen/assembly independent from one another as mentioned in: 
https://gcc.gnu.org/ml/gcc/2018-07/msg00210.html. It deals with clone_fn_id_num 
rather than object UIDs and I figured it's better to make my first submission 
with a smaller, simpler and self-contained patch.

This changes clone_function_name_1 such that clone names are based on a 
per-function rather than a global counter so that the number of clones of one 
function doesn't affect the numbering of clone names of other functions.

This should have minimal impact as the only user of the clone names that I 
found (https://gcc.gnu.org/ml/gcc/2013-03/msg00268.html) doesn't actually care 
about the specific numeric values.


Thanks
- Michael

gcc:
2018-07-16  Michael Ploujnikov  

   Make function clone name numbering independent.
   * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids.
   (clone_function_name_1): Use it.

testsuite:
2018-07-16  Michael Ploujnikov  

Clone id counters should be completely independent from one another.
* gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test.


---
 gcc/cgraphclones.c| 11 ++--
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 47 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 69572b9..c5a40bd 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -528,7 +528,7 @@ cgraph_node::create_clone (tree new_decl, gcov_type 
gcov_count, int freq,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone with SUFFIX of a decl named
NAME.  */
@@ -543,7 +543,14 @@ clone_function_name_1 (const char *name, const char 
*suffix)
   memcpy (prefix, name, len);
   strcpy (prefix + len + 1, suffix);
   prefix[len] = symbol_table::symbol_suffix_separator ();
-  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
+  unsigned int *suffix_counter;
+  if (!clone_fn_ids) {
+/* Initialize the per-function counter hash table if this is the first 
call */
+clone_fn_ids = hash_map::create_ggc (1000);
+  }
+  suffix_counter = &clone_fn_ids->get_or_insert(name);
+  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, *suffix_counter);
+  *suffix_counter = *suffix_counter + 1;
   return get_identifier (tmp_name);
 }
 
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c 
gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 000..d723e20
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+ function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-ipa-dump "Function bar.constprop.0" "cp" } } */
+/* { dg-final { scan-ipa-dump "Function bar.constprop.1" "cp" } } */
+/* { dg-final { scan-ipa-dump "Function bar.constprop.3" "cp" } } */
+/* { dg-final { scan-ipa-dump "Function foo.constprop.0" "cp" } } */
+/* { dg-final { scan-ipa-dump "Function foo.constprop.1" "cp" } } */
+/* { dg-final { scan-ipa-dump "Function foo.constprop.2" "cp" } } */
+/* { dg-final { scan-ipa-dump-not "Function foo.constprop.3" "cp" } } */
+/* { dg-final { scan-ipa-dump-not "Function foo.constprop.4" "cp" } } */
-- 
2.7.4



signature.asc
Description: OpenPGP digital signature


[PATCH] Make function clone name numbering independent.

2018-07-17 Thread Michael Ploujnikov
On 2018-07-17 06:02 AM, Richard Biener wrote:
> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer
>  wrote:
>>
>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov 
>>  wrote:
>>> Hi,
>>>
>>
>>> +clone_fn_ids = hash_map::create_ggc
>>> (1000);
>>
>> Isn't 1000 a bit excessive? What about 64 or thereabouts?
> 
> I'm not sure we should throw memory at this "problem" in this way.
> What specific issue
> does this address?

This goes along with the general theme of preventing changes to one function 
affecting codegen of others. What I'm seeing in this case is when a function 
bar is modified codegen decides to create more clones of it (eg: during the 
constprop pass). These extra clones cause the global counter to increment so 
the clones of the unchanged function foo are named differently only because of 
a source change to bar. I was hoping that the testcase would be a good 
illustration, but perhaps not; is there anything else I can do to make this 
clearer?


> 
> Iff then I belive forgoing the automatic counter addidion is the way
> to go and hand
> control of that to the callers (for example the caller in
> lto/lto-partition.c doesn't
> even seem to need that counter.

How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm 
assuming it has a good reason to call clone_function_name_1 rather than 
appending ".lto_priv" itself.

> You also assume the string you key on persists - luckily the
> lto-partition.c caller
> leaks it but this makes your approach quite fragile in my eye (put the logic
> into clone_function_name instead, where you at least know you are dealing
> with a string from an indentifier which are never collected).
> 
> Richard.
> 

Is this what you had in mind?:

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..f000420 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone with SUFFIX of a decl named
NAME.  */
@@ -521,14 +521,13 @@ tree
 clone_function_name_1 (const char *name, const char *suffix)
 {
   size_t len = strlen (name);
-  char *tmp_name, *prefix;
+  char *prefix;
 
   prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
   memcpy (prefix, name, len);
   strcpy (prefix + len + 1, suffix);
   prefix[len] = symbol_table::symbol_suffix_separator ();
-  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
-  return get_identifier (tmp_name);
+  return get_identifier (prefix);
 }
 
 /* Return a new assembler name for a clone of DECL with SUFFIX.  */
@@ -537,7 +536,17 @@ tree
 clone_function_name (tree decl, const char *suffix)
 {
   tree name = DECL_ASSEMBLER_NAME (decl);
-  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
+  char *decl_name = IDENTIFIER_POINTER (name);
+  char *numbered_name;
+  unsigned int *suffix_counter;
+  if (!clone_fn_ids) {
+/* Initialize the per-function counter hash table if this is the first 
call */
+clone_fn_ids = hash_map::create_ggc (64);
+  }
+  suffix_counter = &clone_fn_ids->get_or_insert(name);
+  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
+  *suffix_counter = *suffix_counter + 1;
+  return clone_function_name_1 (numbered_name, suffix);
 }
 
- Michael





signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Make function clone name numbering independent.

2018-07-19 Thread Michael Ploujnikov
On 2018-07-17 04:25 PM, Michael Ploujnikov wrote:
> On 2018-07-17 06:02 AM, Richard Biener wrote:
>> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer
>>  wrote:
>>>
>>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov 
>>>  wrote:
>>>> Hi,
>>>>
>>>
>>>> +clone_fn_ids = hash_map::create_ggc
>>>> (1000);
>>>
>>> Isn't 1000 a bit excessive? What about 64 or thereabouts?
>>
>> I'm not sure we should throw memory at this "problem" in this way.
>> What specific issue
>> does this address?
> 
> This goes along with the general theme of preventing changes to one function 
> affecting codegen of others. What I'm seeing in this case is when a function 
> bar is modified codegen decides to create more clones of it (eg: during the 
> constprop pass). These extra clones cause the global counter to increment so 
> the clones of the unchanged function foo are named differently only because 
> of a source change to bar. I was hoping that the testcase would be a good 
> illustration, but perhaps not; is there anything else I can do to make this 
> clearer?
> 
> 
>>
>> Iff then I belive forgoing the automatic counter addidion is the way
>> to go and hand
>> control of that to the callers (for example the caller in
>> lto/lto-partition.c doesn't
>> even seem to need that counter.
> 
> How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm 
> assuming it has a good reason to call clone_function_name_1 rather than 
> appending ".lto_priv" itself.
> 
>> You also assume the string you key on persists - luckily the
>> lto-partition.c caller
>> leaks it but this makes your approach quite fragile in my eye (put the logic
>> into clone_function_name instead, where you at least know you are dealing
>> with a string from an indentifier which are never collected).
>>
>> Richard.
>>
> 
> Is this what you had in mind?:
> 
> diff --git gcc/cgraphclones.c gcc/cgraphclones.c
> index 6e84a31..f000420 100644
> --- gcc/cgraphclones.c
> +++ gcc/cgraphclones.c
> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
> prof_count,
>return new_node;
>  }
>  
> -static GTY(()) unsigned int clone_fn_id_num;
> +static GTY(()) hash_map *clone_fn_ids;
>  
>  /* Return a new assembler name for a clone with SUFFIX of a decl named
> NAME.  */
> @@ -521,14 +521,13 @@ tree
>  clone_function_name_1 (const char *name, const char *suffix)
>  {
>size_t len = strlen (name);
> -  char *tmp_name, *prefix;
> +  char *prefix;
>  
>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>memcpy (prefix, name, len);
>strcpy (prefix + len + 1, suffix);
>prefix[len] = symbol_table::symbol_suffix_separator ();
> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
> -  return get_identifier (tmp_name);
> +  return get_identifier (prefix);
>  }
>  
>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
> @@ -537,7 +536,17 @@ tree
>  clone_function_name (tree decl, const char *suffix)
>  {
>tree name = DECL_ASSEMBLER_NAME (decl);
> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
> +  char *decl_name = IDENTIFIER_POINTER (name);
> +  char *numbered_name;
> +  unsigned int *suffix_counter;
> +  if (!clone_fn_ids) {
> +/* Initialize the per-function counter hash table if this is the first 
> call */
> +clone_fn_ids = hash_map::create_ggc (64);
> +  }
> +  suffix_counter = &clone_fn_ids->get_or_insert(name);
> +  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
> +  *suffix_counter = *suffix_counter + 1;
> +  return clone_function_name_1 (numbered_name, suffix);
>  }
>  
> - Michael
> 
> 
> 

Ping, and below is the updated version of the full patch with changelogs:


gcc:
2018-07-16  Michael Ploujnikov  

   Make function clone name numbering independent.
   * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids.
   (clone_function_name_1): Move suffixing to clone_function_name
   and change it to use per-function clone_fn_ids.

testsuite:
2018-07-16  Michael Ploujnikov  

Clone id counters should be completely independent from one another.
* gcc/testsuite/gcc.dg/independent-cloneids-1.c: New test.

---
 gcc/cgraphclones.c| 19 ++
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 52 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cg

Re: [PATCH] Make function clone name numbering independent.

2018-07-24 Thread Michael Ploujnikov
On 2018-07-20 06:05 AM, Richard Biener wrote:
> On Fri, Jul 20, 2018 at 4:48 AM Michael Ploujnikov
>  wrote:
>>
>> On 2018-07-17 04:25 PM, Michael Ploujnikov wrote:
>>> On 2018-07-17 06:02 AM, Richard Biener wrote:
>>>> On Tue, Jul 17, 2018 at 8:10 AM Bernhard Reutner-Fischer
>>>>  wrote:
>>>>>
>>>>> On 16 July 2018 21:38:36 CEST, Michael Ploujnikov 
>>>>>  wrote:
>>>>>> Hi,
>>>>>>
>>>>>
>>>>>> +clone_fn_ids = hash_map::create_ggc
>>>>>> (1000);
>>>>>
>>>>> Isn't 1000 a bit excessive? What about 64 or thereabouts?
>>>>
>>>> I'm not sure we should throw memory at this "problem" in this way.
>>>> What specific issue
>>>> does this address?
>>>
>>> This goes along with the general theme of preventing changes to one 
>>> function affecting codegen of others. What I'm seeing in this case is when 
>>> a function bar is modified codegen decides to create more clones of it (eg: 
>>> during the constprop pass). These extra clones cause the global counter to 
>>> increment so the clones of the unchanged function foo are named differently 
>>> only because of a source change to bar. I was hoping that the testcase 
>>> would be a good illustration, but perhaps not; is there anything else I can 
>>> do to make this clearer?
>>>
>>>
>>>>
>>>> Iff then I belive forgoing the automatic counter addidion is the way
>>>> to go and hand
>>>> control of that to the callers (for example the caller in
>>>> lto/lto-partition.c doesn't
>>>> even seem to need that counter.
>>>
>>> How can you tell that privatize_symbol_name_1 doesn't need the counter? I'm 
>>> assuming it has a good reason to call clone_function_name_1 rather than 
>>> appending ".lto_priv" itself.
>>>
>>>> You also assume the string you key on persists - luckily the
>>>> lto-partition.c caller
>>>> leaks it but this makes your approach quite fragile in my eye (put the 
>>>> logic
>>>> into clone_function_name instead, where you at least know you are dealing
>>>> with a string from an indentifier which are never collected).
>>>>
>>>> Richard.
>>>>
>>>
>>> Is this what you had in mind?:
>>>
>>> diff --git gcc/cgraphclones.c gcc/cgraphclones.c
>>> index 6e84a31..f000420 100644
>>> --- gcc/cgraphclones.c
>>> +++ gcc/cgraphclones.c
>>> @@ -512,7 +512,7 @@ cgraph_node::create_clone (tree new_decl, profile_count 
>>> prof_count,
>>>return new_node;
>>>  }
>>>
>>> -static GTY(()) unsigned int clone_fn_id_num;
>>> +static GTY(()) hash_map *clone_fn_ids;
>>>
>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>> NAME.  */
>>> @@ -521,14 +521,13 @@ tree
>>>  clone_function_name_1 (const char *name, const char *suffix)
>>>  {
>>>size_t len = strlen (name);
>>> -  char *tmp_name, *prefix;
>>> +  char *prefix;
>>>
>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>memcpy (prefix, name, len);
>>>strcpy (prefix + len + 1, suffix);
>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>> -  return get_identifier (tmp_name);
>>> +  return get_identifier (prefix);
>>>  }
>>>
>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>> @@ -537,7 +536,17 @@ tree
>>>  clone_function_name (tree decl, const char *suffix)
>>>  {
>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>> +  char *decl_name = IDENTIFIER_POINTER (name);
>>> +  char *numbered_name;
>>> +  unsigned int *suffix_counter;
>>> +  if (!clone_fn_ids) {
>>> +/* Initialize the per-function counter hash table if this is the first 
>>> call */
>>> +clone_fn_ids = hash_map::create_ggc (64);
>>> +  }
>>> +  suffix_counter = &clone_fn_ids->get_or_insert(name);
>>> +  ASM_FORMAT_PRIVATE_NAME (numbered_name, decl_name, *suffix_counter);
>>> +  *suffix_counter = *suffix_counter + 1;
>>> +  return clone_funct

Re: [PATCH] Make function clone name numbering independent.

2018-07-26 Thread Michael Ploujnikov
On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>> NAME.  */
>>> @@ -521,14 +521,13 @@ tree
>>>  clone_function_name_1 (const char *name, const char *suffix)
>>
>> pass this function the counter to use
>>
>>>  {
>>>size_t len = strlen (name);
>>> -  char *tmp_name, *prefix;
>>> +  char *prefix;
>>>
>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>memcpy (prefix, name, len);
>>>strcpy (prefix + len + 1, suffix);
>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>
>> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
>> the lto/lto-partition.c caller (just use zero as counter).
>>
>>> -  return get_identifier (tmp_name);
>>> +  return get_identifier (prefix);
>>>  }
>>>
>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>> @@ -537,7 +536,17 @@ tree
>>>  clone_function_name (tree decl, const char *suffix)
>>>  {
>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>>> +  char *numbered_name;
>>> +  unsigned int *suffix_counter;
>>> +  if (!clone_fn_ids) {
>>> +/* Initialize the per-function counter hash table if this is the first 
>>> call */
>>> +clone_fn_ids = hash_map::create_ggc (64);
>>> +  }
>>
>> I still do not like throwing memory at the problem in this way for the
>> little benefit
>> this change provides :/
>>
>> So no approval from me at this point...
>>
>> Richard.
> 
> Can you give me an idea of the memory constraints that are involved?
> 
> The highest memory usage increase that I could find was when compiling
> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
> increase over the before-patch use of 6936kB which is barely 0.03%.
> 
> Using a single counter can result in more confusing namespacing when
> you have .bar.clone.4 despite there only being 3 clones of .bar.
> 
> From a practical point of view this change is helpful to anyone
> diffing binary output such as forensic analysts, Debian Reproducible
> Builds or even someone validating compiler output (before and after an input
> source patch). The extra changes that this patch alleviates are a
> distraction and could even be misleading. For example, applying a
> source patch to the same Linux source produces the following binary
> diff before my change:
> 
> --- /tmp/output.o.objdump
> +++ /tmp/patched-output.o.objdump
> @@ -1,5 +1,5 @@
> 
> -/tmp/uverbs_cmd/output.o: file format elf32-i386
> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386
> 
> 
>  Disassembly of section .text.get_order:
> @@ -265,12 +265,12 @@
> 3:e9 fc ff ff ff  jmp4 
>   4: R_386_PC32   .text.put_uobj_read
> 
> -Disassembly of section .text.trace_kmalloc.constprop.3:
> +Disassembly of section .text.trace_kmalloc.constprop.4:
> 
> - :
> + :
> 0:83 3d 04 00 00 00 00cmpl   $0x0,0x4
>   2: R_386_32 __tracepoint_kmalloc
> -   7:74 34   je 3d 
> 
> +   7:74 34   je 3d 
> 
> 9:55  push   %ebp
> a:89 cd   mov%ecx,%ebp
> c:57  push   %edi
> @@ -281,7 +281,7 @@
>13:8b 1d 10 00 00 00   mov0x10,%ebx
>   15: R_386_32__tracepoint_kmalloc
>19:85 db   test   %ebx,%ebx
> -  1b:74 1b   je 38 
> 
> +  1b:74 1b   je 38 
> 
>1d:68 d0 00 00 00  push   $0xd0
>22:89 fa   mov%edi,%edx
>24:89 f0   mov%esi,%eax
> @@ -292,7 +292,7 @@
>31:58  pop%eax
>32:83 3b 00cmpl   $0x0,(%ebx)
>35:5a  pop%edx
> -  36:eb e3   jmp1b 
> 
> +  36:eb e3   jmp1b 
> 
>38:5b  pop%ebx
>39:5e  pop%esi
>3a:  

Re: [PATCH] Make function clone name numbering independent.

2018-07-31 Thread Michael Ploujnikov
On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
>> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>>> NAME.  */
>>>> @@ -521,14 +521,13 @@ tree
>>>>  clone_function_name_1 (const char *name, const char *suffix)
>>>
>>> pass this function the counter to use
>>>
>>>>  {
>>>>size_t len = strlen (name);
>>>> -  char *tmp_name, *prefix;
>>>> +  char *prefix;
>>>>
>>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>>memcpy (prefix, name, len);
>>>>strcpy (prefix + len + 1, suffix);
>>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>>
>>> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
>>> the lto/lto-partition.c caller (just use zero as counter).
>>>
>>>> -  return get_identifier (tmp_name);
>>>> +  return get_identifier (prefix);
>>>>  }
>>>>
>>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>>> @@ -537,7 +536,17 @@ tree
>>>>  clone_function_name (tree decl, const char *suffix)
>>>>  {
>>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>>>> +  char *numbered_name;
>>>> +  unsigned int *suffix_counter;
>>>> +  if (!clone_fn_ids) {
>>>> +/* Initialize the per-function counter hash table if this is the 
>>>> first call */
>>>> +clone_fn_ids = hash_map::create_ggc (64);
>>>> +  }
>>>
>>> I still do not like throwing memory at the problem in this way for the
>>> little benefit
>>> this change provides :/
>>>
>>> So no approval from me at this point...
>>>
>>> Richard.
>>
>> Can you give me an idea of the memory constraints that are involved?
>>
>> The highest memory usage increase that I could find was when compiling
>> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
>> increase over the before-patch use of 6936kB which is barely 0.03%.
>>
>> Using a single counter can result in more confusing namespacing when
>> you have .bar.clone.4 despite there only being 3 clones of .bar.
>>
>> From a practical point of view this change is helpful to anyone
>> diffing binary output such as forensic analysts, Debian Reproducible
>> Builds or even someone validating compiler output (before and after an input
>> source patch). The extra changes that this patch alleviates are a
>> distraction and could even be misleading. For example, applying a
>> source patch to the same Linux source produces the following binary
>> diff before my change:
>>
>> --- /tmp/output.o.objdump
>> +++ /tmp/patched-output.o.objdump
>> @@ -1,5 +1,5 @@
>>
>> -/tmp/uverbs_cmd/output.o: file format elf32-i386
>> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386
>>
>>
>>  Disassembly of section .text.get_order:
>> @@ -265,12 +265,12 @@
>> 3:   e9 fc ff ff ff  jmp4 
>>  4: R_386_PC32   .text.put_uobj_read
>>
>> -Disassembly of section .text.trace_kmalloc.constprop.3:
>> +Disassembly of section .text.trace_kmalloc.constprop.4:
>>
>> - :
>> + :
>> 0:   83 3d 04 00 00 00 00cmpl   $0x0,0x4
>>  2: R_386_32 __tracepoint_kmalloc
>> -   7:   74 34   je 3d 
>> 
>> +   7:   74 34   je 3d 
>> 
>> 9:   55  push   %ebp
>> a:   89 cd   mov%ecx,%ebp
>> c:   57  push   %edi
>> @@ -281,7 +281,7 @@
>>13:   8b 1d 10 00 00 00   mov0x10,%ebx
>>  15: R_386_32__tracepoint_kmalloc
>>19:   85 db   test   %ebx,%ebx
>> -  1b:   74 1b   je 38 
>> 
>> +  1b:   74 1b   je 38 
>> 
>>1d:   68 d0 00 00 00  push   $0xd0
>>22:   89 fa   mov%edi,%edx
>>24:   89 f0   mov%esi,%ea

Re: [PATCH] Make function clone name numbering independent.

2018-08-02 Thread Michael Ploujnikov
On 2018-08-01 06:37 AM, Richard Biener wrote:
> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov
>  wrote:
>>
>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
>>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
>>>> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>>>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>>>>> NAME.  */
>>>>>> @@ -521,14 +521,13 @@ tree
>>>>>>  clone_function_name_1 (const char *name, const char *suffix)
>>>>>
>>>>> pass this function the counter to use
>>>>>
>>>>>>  {
>>>>>>size_t len = strlen (name);
>>>>>> -  char *tmp_name, *prefix;
>>>>>> +  char *prefix;
>>>>>>
>>>>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>>>>memcpy (prefix, name, len);
>>>>>>strcpy (prefix + len + 1, suffix);
>>>>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>>>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>>>>
>>>>> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
>>>>> the lto/lto-partition.c caller (just use zero as counter).
>>>>>
>>>>>> -  return get_identifier (tmp_name);
>>>>>> +  return get_identifier (prefix);
>>>>>>  }
>>>>>>
>>>>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>>>>> @@ -537,7 +536,17 @@ tree
>>>>>>  clone_function_name (tree decl, const char *suffix)
>>>>>>  {
>>>>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>>>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>>>>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>>>>>> +  char *numbered_name;
>>>>>> +  unsigned int *suffix_counter;
>>>>>> +  if (!clone_fn_ids) {
>>>>>> +/* Initialize the per-function counter hash table if this is the 
>>>>>> first call */
>>>>>> +clone_fn_ids = hash_map::create_ggc (64);
>>>>>> +  }
>>>>>
>>>>> I still do not like throwing memory at the problem in this way for the
>>>>> little benefit
>>>>> this change provides :/
>>>>>
>>>>> So no approval from me at this point...
>>>>>
>>>>> Richard.
>>>>
>>>> Can you give me an idea of the memory constraints that are involved?
>>>>
>>>> The highest memory usage increase that I could find was when compiling
>>>> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
>>>> increase over the before-patch use of 6936kB which is barely 0.03%.
>>>>
>>>> Using a single counter can result in more confusing namespacing when
>>>> you have .bar.clone.4 despite there only being 3 clones of .bar.
>>>>
>>>> From a practical point of view this change is helpful to anyone
>>>> diffing binary output such as forensic analysts, Debian Reproducible
>>>> Builds or even someone validating compiler output (before and after an 
>>>> input
>>>> source patch). The extra changes that this patch alleviates are a
>>>> distraction and could even be misleading. For example, applying a
>>>> source patch to the same Linux source produces the following binary
>>>> diff before my change:
>>>>
>>>> --- /tmp/output.o.objdump
>>>> +++ /tmp/patched-output.o.objdump
>>>> @@ -1,5 +1,5 @@
>>>>
>>>> -/tmp/uverbs_cmd/output.o: file format elf32-i386
>>>> +/tmp/uverbs_cmd/patched-output.o: file format elf32-i386
>>>>
>>>>
>>>>  Disassembly of section .text.get_order:
>>>> @@ -265,12 +265,12 @@
>>>> 3:   e9 fc ff ff ff  jmp4 
>>>>  4: R_386_PC32   .text.put_uobj_read
>>>>
>>>> -Disassembly of section .text.trace_kmalloc.constprop.3:
>>>> +Disassembly of section .text.trace_kmalloc.constprop.4:
>>>>
>>>> - :
>>>> + :
>>>> 0:   83 3d 04 00 00 00 00cmpl   $0x0,0x4
>>>>  2: R_386_32 __tracepoint_kmalloc
>&

[PING][PATCH] Make function clone name numbering independent.

2018-08-13 Thread Michael Ploujnikov
Ping and I've updated the patch since last time as follows:

  - unittest scans assembly rather than the constprop dump because its
forward changed
  - unittests should handle different hosts where any of
NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
be defined
  - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
cgraph_node::create_virtual_clone, but I've attempted to reduce
some code duplication
  - lto-partition.c: privatize_symbol_name_1 *does* need numbered
names
  - but cold sections don't
  - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
unreliable string pointer use as pointed out in the first review
  - renamed clone_function_name_1 and clone_function_name to
numbered_clone_function_name_1 and numbered_clone_function_name to
clarify purpose and discourage future unintended uses

Also bootstrapped and regtested.

- Michael

On 2018-08-02 03:05 PM, Michael Ploujnikov wrote:
> On 2018-08-01 06:37 AM, Richard Biener wrote:
>> On Tue, Jul 31, 2018 at 7:40 PM Michael Ploujnikov
>>  wrote:
>>>
>>> On 2018-07-26 01:27 PM, Michael Ploujnikov wrote:
>>>> On 2018-07-24 09:57 AM, Michael Ploujnikov wrote:
>>>>> On 2018-07-20 06:05 AM, Richard Biener wrote:
>>>>>>>  /* Return a new assembler name for a clone with SUFFIX of a decl named
>>>>>>> NAME.  */
>>>>>>> @@ -521,14 +521,13 @@ tree
>>>>>>>  clone_function_name_1 (const char *name, const char *suffix)
>>>>>>
>>>>>> pass this function the counter to use
>>>>>>
>>>>>>>  {
>>>>>>>size_t len = strlen (name);
>>>>>>> -  char *tmp_name, *prefix;
>>>>>>> +  char *prefix;
>>>>>>>
>>>>>>>prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
>>>>>>>memcpy (prefix, name, len);
>>>>>>>strcpy (prefix + len + 1, suffix);
>>>>>>>prefix[len] = symbol_table::symbol_suffix_separator ();
>>>>>>> -  ASM_FORMAT_PRIVATE_NAME (tmp_name, prefix, clone_fn_id_num++);
>>>>>>
>>>>>> and keep using ASM_FORMAT_PRIVATE_NAME here.  You need to change
>>>>>> the lto/lto-partition.c caller (just use zero as counter).
>>>>>>
>>>>>>> -  return get_identifier (tmp_name);
>>>>>>> +  return get_identifier (prefix);
>>>>>>>  }
>>>>>>>
>>>>>>>  /* Return a new assembler name for a clone of DECL with SUFFIX.  */
>>>>>>> @@ -537,7 +536,17 @@ tree
>>>>>>>  clone_function_name (tree decl, const char *suffix)
>>>>>>>  {
>>>>>>>tree name = DECL_ASSEMBLER_NAME (decl);
>>>>>>> -  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
>>>>>>> +  const char *decl_name = IDENTIFIER_POINTER (name);
>>>>>>> +  char *numbered_name;
>>>>>>> +  unsigned int *suffix_counter;
>>>>>>> +  if (!clone_fn_ids) {
>>>>>>> +/* Initialize the per-function counter hash table if this is the 
>>>>>>> first call */
>>>>>>> +clone_fn_ids = hash_map::create_ggc (64);
>>>>>>> +  }
>>>>>>
>>>>>> I still do not like throwing memory at the problem in this way for the
>>>>>> little benefit
>>>>>> this change provides :/
>>>>>>
>>>>>> So no approval from me at this point...
>>>>>>
>>>>>> Richard.
>>>>>
>>>>> Can you give me an idea of the memory constraints that are involved?
>>>>>
>>>>> The highest memory usage increase that I could find was when compiling
>>>>> a source file (from Linux) with roughly 10,000 functions. It showed a 2kB
>>>>> increase over the before-patch use of 6936kB which is barely 0.03%.
>>>>>
>>>>> Using a single counter can result in more confusing namespacing when
>>>>> you have .bar.clone.4 despite there only being 3 clones of .bar.
>>>>>
>>>>> From a practical point of view this change is helpful to anyone
>>>>> diffing binary output such as forensic analysts, Debian Reproducible
>>>>> Builds or even someone validating compiler output (before and after an 
>>>

Re: [PING v2][PATCH] Make function clone name numbering independent.

2018-08-31 Thread Michael Ploujnikov
On 2018-08-13 07:58 PM, Michael Ploujnikov wrote:
> Ping and I've updated the patch since last time as follows:
> 
>   - unittest scans assembly rather than the constprop dump because its
> forward changed
>   - unittests should handle different hosts where any of
> NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
> be defined
>   - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
> cgraph_node::create_virtual_clone, but I've attempted to reduce
> some code duplication
>   - lto-partition.c: privatize_symbol_name_1 *does* need numbered
> names
>   - but cold sections don't
>   - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
> unreliable string pointer use as pointed out in the first review
>   - renamed clone_function_name_1 and clone_function_name to
> numbered_clone_function_name_1 and numbered_clone_function_name to
> clarify purpose and discourage future unintended uses
> 
> Also bootstrapped and regtested.

Ping.

I've done some more digging into the current uses of
numbered_clone_function_name and checked if any tests fail if I change
it to suffixed_function_name:

  - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(thunk->decl, "artificial_thunk");
- no new tests fail, inconclusive
- and despite the comment on redirect_callee_duplicating_thunks
  about "one or more" duplicates it doesn't seem like
  duplicate_thunk_for_node would be called more than once for each
  node, assuming each node is named uniquely, but I'm far from an
  expert in this area
  - gcc/cgraphclones.c:  SET_DECL_ASSEMBLER_NAME (new_decl, 
numbered_clone_function_name (old_decl, suffix));
- called by cgraph_node::create_virtual_clone, definitely needs it
- this is where clones like .constprop come from
  - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(old_decl, suffix);
- more tests fail
- this is where clones like .simdclone come from
- called by cgraph_node::create_version_clone_with_body, most likely needs 
it
  - gcc/config/rs6000/rs6000.c:  tree decl_name = numbered_clone_function_name 
(default_decl, "resolver");
- doesn't look like this needs the numbering as there should only
  be one resolver per multi-version function, but need someone
  with an rs/6000 to confirm
  - gcc/lto/lto-partition.c:  
numbered_clone_function_name_1 (identifier,
- definitely needed for disambiguation, shown by unit tests failing
  - gcc/multiple_target.c:
numbered_clone_function_name (node->decl,
- create_dispatcher_calls says it only creates the dispatcher once
- no new tests fail, inconclusive
  - gcc/multiple_target.c:
numbered_clone_function_name (node->decl,
- I have a feeling this isn't necessary
- no new tests fail, inconclusive
  - gcc/omp-expand.c:  DECL_NAME (kern_fndecl) = numbered_clone_function_name 
(kern_fndecl, "kernel");
- no new tests fail, inconclusive
- I didn't see (and couldn't figure out a way to get) any of the
  existing omp/acc tests actually exercise this codeptah
  - gcc/omp-low.c:  return numbered_clone_function_name (current_function_decl,
- definitely needed based on
  gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c and others
  - gcc/omp-simd-clone.c:  DECL_NAME (new_decl) = 
numbered_clone_function_name (old_decl, "simdclone");
- no tests fail, inconclusive
- can definitely have more than one simdclone per function as above, but
  maybe not these external types
- some tests, like declare-simd.c actually exercise this codepath,
  but I couldn't find the resulting .simdclone decl the the
  simdclone pass dump nor in any of the other dumps to verify
  - gcc/symtab.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(node->decl, "localalias");
- no tests fail, inconclusive
- my understanding of function_and_variable_visibility says that
  there can only be one per function so maybe this isn't; hubicka?

I'll add patches to switch these once someone with expertise in these
areas can confirm that the numbering isn't needed in the respective
decl names.


- Michael

From adde0632266d3f1b0540e09b9db931df4302d2bc Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 16 Jul 2018 12:55:49 -0400
Subject: [PATCH 1/4] Make function assembly more independent.

This changes clone_function_name such that clone names are based on a
per-function counter rather than a global one.

gcc:
2018-08-02  Michael Ploujnikov  

   Make function clone name numbering independent.
   * cgraphclones.c: Replace clone_fn_id_num with clone_fn_i

Re: [PING v2][PATCH] Make function clone name numbering independent.

2018-09-04 Thread Michael Ploujnikov
Hi Martin, Richard,

Thanks for your responses.

On 2018-09-03 09:15 AM, Martin Jambor wrote:
> Hi,
> 
> On Mon, Sep 03 2018, Richard Biener wrote:
>> On Mon, Sep 3, 2018 at 12:02 PM Martin Jambor  wrote:
>>>
>>> Hi,
>>>
>>> On Fri, Aug 31 2018, Michael Ploujnikov wrote:
>>>> I've done some more digging into the current uses of
>>>> numbered_clone_function_name and checked if any tests fail if I change
>>>> it to suffixed_function_name:
>>>>
>>>>   - gcc/cgraphclones.c:  DECL_NAME (new_decl) = 
>>>> numbered_clone_function_name (thunk->decl, "artificial_thunk");
>>>> - no new tests fail, inconclusive
>>>> - and despite the comment on redirect_callee_duplicating_thunks
>>>>   about "one or more" duplicates it doesn't seem like
>>>>   duplicate_thunk_for_node would be called more than once for each
>>>>   node, assuming each node is named uniquely, but I'm far from an
>>>>   expert in this area
>>>
>>> The comment means that if there is a chain of thunks, the method clones
>>> all of them.  Nevertheless, you need name numbering here for the same
>>> reason why you need them for constprop.
>>
>> The remaining question I had with the patch was if maybe all callers
>> can handle assigning
>> the numbering themselves, thus do sth like
>>
>>for-each-clone-of (i, fn)
>>   DECL_NAME (...) = numbered_clone_function_name (...,
>> "artificial_thunk", i);
>>
>> which would make the map of name -> number unnecessary.

Please see my comment below.

>>
> 
> That is what I would prefer too but it also has downsides.  Callers
> would have to do their book keeping and the various cloning interfaces
> would get another parameter when already they have quite a few
> (introducing some cloning context classes seems like an overkill to me
> :-).

I'm fine with doing it the way you guys are suggesting, but just to
explain my original thinking: The way I see it is that there will
always be some kind of name -> number mapping, even if it's managed by
each individual user and even if it's actually cgraph_node ->
number. Needless to say I'm far from an expert in all of the involved
areas (this is my first contribution to GCC) so the most effective
approach I could come up is doing it all in one place. Now with your
expert advice and as I get a better understanding of how things like
sra and other clones are created, I'm starting to see that a more
tailored approach is possible and probably the way it should have been
done in the first place.

> 
> If we want to get rid of .constprop discrepancies,

Could you please clarify to me what exactly you mean by the
".constprop discrepancies"?

> something like the
> following (only very mildly tested) patch should be enough (note that
> IPA-CP currently does not really need the new has because it creates

What do you mean "it doesn't need the new hash"? My
independent-cloneids-1.c test shows that a function can have more than
one .constprop clone, but I'm probably just not understanding
something.

> clones in only one pass through the call graph, but that is something
> likely to change).  We could then adjust other cloners that we care
> about.
> 
> However, if clones of thunks are one of them, then the optional
> parameter will also proliferate to cgraph_node::create_clone(),
> cgraph_edge::redirect_callee_duplicating_thunks() and
> duplicate_thunk_for_node().  create_version_clone_with_body would almost
> certainly need it because of IPA-SRA too (IPA-SRA can of course always
> pass zero).

I'll try to do this and to convert the other users in the next version
of the patch, but I might need some help with understanding how some
of those passes work!

> 
> Martin
> 
> 
> 
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 2b00f0165fa..703c3cfea7b 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -964,11 +964,15 @@ public:
>cgraph_node *new_inlined_to,
>bitmap args_to_skip, const char *suffix = NULL);
>  
> -  /* Create callgraph node clone with new declaration.  The actual body will
> - be copied later at compilation stage.  */
> +  /* Create callgraph node clone with new declaration.  The actual body will 
> be
> + copied later at compilation stage.  The name of the new clone will be
> + constructed from the name of the original name, SUFFIX and a number 
> which
> + can either be NUM_SUFFIX if non-negative or a unique incremented integer
> + ot

Re: [PING v2][PATCH] Make function clone name numbering independent.

2018-09-04 Thread Michael Ploujnikov
Hi Martin,

On 2018-09-03 06:01 AM, Martin Jambor wrote:
> Hi,
> 
> On Fri, Aug 31 2018, Michael Ploujnikov wrote:
>> I've done some more digging into the current uses of
>> numbered_clone_function_name and checked if any tests fail if I change
>> it to suffixed_function_name:
>>
>>   - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
>> (thunk->decl, "artificial_thunk");
>> - no new tests fail, inconclusive
>> - and despite the comment on redirect_callee_duplicating_thunks
>>   about "one or more" duplicates it doesn't seem like
>>   duplicate_thunk_for_node would be called more than once for each
>>   node, assuming each node is named uniquely, but I'm far from an
>>   expert in this area
> 
> The comment means that if there is a chain of thunks, the method clones
> all of them.  Nevertheless, you need name numbering here for the same
> reason why you need them for constprop.
> 
> 
>>   - gcc/omp-expand.c:  DECL_NAME (kern_fndecl) = 
>> numbered_clone_function_name (kern_fndecl, "kernel");
>> - no new tests fail, inconclusive
>> - I didn't see (and couldn't figure out a way to get) any of the
>>   existing omp/acc tests actually exercise this codeptah
> 
> I guess this one should not need it.  Build with
> --enable-offload-targets=hsa and run gomp.exp to try yourself.  I can
> run run-time HSA tests for you if you want.
> 
> Martin
> 

I've tried building with numbered_clone_function_name replaced by
suffixed_function_name and with --enable-offload-targets=hsa and
didn't see any errors in gomp.exp. I don't have a readily available
HSA setup so if you could do a quick test, I would really appreciate
it!

- Michael



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3] Make function clone name numbering independent.

2018-12-03 Thread Michael Ploujnikov
On 2018-12-01 11:29 a.m., H.J. Lu wrote:
> This caused:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
> 

Sorry about that. Looks like I should have been testing with
--with-build-config=bootstrap-lto rather than just --enable-bootstrap.

The quick fix would be to undo the patch to create_virtual_clone or to
just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
(node->decl) instead of node pointers. Any preferences?

The harder fix would be to figure out why some nodes share the same
names and fix that, but maybe that's just inevitable with LTO?

- Michael



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3] Make function clone name numbering independent.

2018-12-03 Thread Michael Ploujnikov
On 2018-12-03 12:00 p.m., Michael Ploujnikov wrote:
> On 2018-12-01 11:29 a.m., H.J. Lu wrote:
>> This caused:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88297
>>
> 
> Sorry about that. Looks like I should have been testing with
> --with-build-config=bootstrap-lto rather than just --enable-bootstrap.
> 
> The quick fix would be to undo the patch to create_virtual_clone or to
> just change clone_num_suffixes to key off of DECL_ASSEMBLER_NAME
> (node->decl) instead of node pointers. Any preferences?
> 
> The harder fix would be to figure out why some nodes share the same
> names and fix that, but maybe that's just inevitable with LTO?
> 
> - Michael
> 

Here's a quick fix while the issue is being investigated.

Bootstrapped (--with-build-config=bootstrap-lto) and regtested on x86_64.

Ok for trunk?
From f5e2500f30ad337e85e0b53eaa15c724657966a2 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 3 Dec 2018 18:19:18 -0500
Subject: [PATCH] PR ipa/88297

gcc/ChangeLog:

2018-12-03  Michael Ploujnikov  

	PR ipa/88297
	* ipa-cp.c (create_specialized_node): Store clone counters by
	node assembler names.
	(ipcp_driver): Change type of clone_num_suffixes key type to
	const char*.
---
 gcc/ipa-cp.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index e0cd1bc45b..f82473e37c 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -376,8 +376,8 @@ static profile_count max_count;
 
 static long overall_size, max_new_size;
 
-/* Node to unique clone suffix number map.  */
-static hash_map *clone_num_suffixes;
+/* Node name to unique clone suffix number map.  */
+static hash_map *clone_num_suffixes;
 
 /* Return the param lattices structure corresponding to the Ith formal
parameter of the function described by INFO.  */
@@ -3831,7 +3831,9 @@ create_specialized_node (struct cgraph_node *node,
 	}
 }
 
-  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (node);
+  unsigned &suffix_counter = clone_num_suffixes->get_or_insert (
+			   IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (
+ node->decl)));
   new_node = node->create_virtual_clone (callers, replace_trees,
 	 args_to_skip, "constprop",
 	 suffix_counter);
@@ -5049,7 +5051,7 @@ ipcp_driver (void)
 
   ipa_check_create_node_params ();
   ipa_check_create_edge_args ();
-  clone_num_suffixes = new hash_map;
+  clone_num_suffixes = new hash_map;
 
   if (dump_file)
 {
-- 
2.19.1



signature.asc
Description: OpenPGP digital signature


Re: [PING v2][PATCH] Make function clone name numbering independent.

2018-12-04 Thread Michael Ploujnikov
On 2018-12-04 7:48 a.m., Martin Jambor wrote:
> Hi,
> 
> On Tue, Sep 04 2018, Michael Ploujnikov wrote:
>>
>> I've tried building with numbered_clone_function_name replaced by
>> suffixed_function_name and with --enable-offload-targets=hsa and
>> didn't see any errors in gomp.exp. I don't have a readily available
>> HSA setup so if you could do a quick test, I would really appreciate
>> it!
> 
> Sorry it took so long, I have run regular tests on trunk today and your
> patch does not appear to have caused any issues.
> 
> Martin
> 

Thank you for checking.

- Michael



signature.asc
Description: OpenPGP digital signature


[PATCH] Testcase for PR 88297 and minor fixes

2018-12-06 Thread Michael Ploujnikov
Thanks to Martin we now have a test that exercises (cp) cloning
machinery during the WPA stage of LTO.

Also, during debugging I found that print_all_lattices would trigger
an assert if I tried to call it inside decide_whether_version_node.

Finally I've attached some comment spelling fixes as a bonus.


Bootstrapping (--with-build-config=bootstrap-lto) and regression testing on 
x86_64.

Ok for trunk after tests pass?


- Michael
From f8f59d44141726e688cde077aabb5f2ce0bf53e0 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 6 Dec 2018 13:36:51 -0500
Subject: [PATCH 1/3] Skip constprop clones because we don't make lattices for
 them.

gcc/ChangeLog:

2018-12-06  Michael Ploujnikov  

	* ipa-cp.c (print_all_lattices): Skip cp clones.
---
 gcc/ipa-cp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index c7c462ab81..c4e879bbc6 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -542,6 +542,9 @@ print_all_lattices (FILE * f, bool dump_sources, bool dump_benefits)
   struct ipa_node_params *info;
 
   info = IPA_NODE_REF (node);
+  /* Skip constprop clones since we don't make lattices for them.  */
+  if (info->ipcp_orig_node)
+	continue;
   fprintf (f, "  Node: %s:\n", node->dump_name ());
   count = ipa_get_param_count (info);
   for (i = 0; i < count; i++)
-- 
2.19.1

From 1ec489bbb30f410144c0bc84f0c160a8fbed0be6 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 6 Dec 2018 13:47:00 -0500
Subject: [PATCH 2/3] Testcase for cloning during LTO WPA stage.

Written with Martin Jambor's help:
https://gcc.gnu.org/ml/gcc/2018-12/msg00043.html

gcc/testsuite/ChangeLog:

2018-12-06  Michael Ploujnikov  

	* gcc.dg/lto/pr88297_0.c: New test.
	* gcc.dg/lto/pr88297_1.c: New test.
---
 gcc/testsuite/gcc.dg/lto/pr88297_0.c | 57 
 gcc/testsuite/gcc.dg/lto/pr88297_1.c | 25 
 2 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr88297_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr88297_1.c

diff --git gcc/testsuite/gcc.dg/lto/pr88297_0.c gcc/testsuite/gcc.dg/lto/pr88297_0.c
new file mode 100644
index 00..d415015166
--- /dev/null
+++ gcc/testsuite/gcc.dg/lto/pr88297_0.c
@@ -0,0 +1,57 @@
+/* { dg-require-effective-target lto } */
+/* { dg-lto-options {{-flto -O3 -fipa-cp -fipa-cp-clone}}  } */
+/* { dg-lto-do run } */
+
+/* In order to trigger IPA-CP cloning we have to:
+
+  1. Put the calls in main into a loop; otherwise everything is
+  coldand we would not clone.
+
+  2. Make different foos and bars actually semantically different;
+  otherwise IPA-ICF unified them (as it should).
+
+*/
+
+volatile int g;
+
+void __attribute__ ((noipa))
+use (int v)
+{
+  g = v;
+}
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+extern int __attribute__ ((noinline))
+entry2 (void);
+
+int  __attribute__ ((noipa))
+get_opaque_number (void)
+{
+  return 1;
+}
+
+int main (void)
+{
+  int i;
+  for (i = 0; i < get_opaque_number (); i++)
+{
+  use (bar (3));
+  use (bar (4));
+  use (foo (5));
+  use (foo (6));
+
+  entry2 ();
+}
+  return 0;
+}
diff --git gcc/testsuite/gcc.dg/lto/pr88297_1.c gcc/testsuite/gcc.dg/lto/pr88297_1.c
new file mode 100644
index 00..65c5321cde
--- /dev/null
+++ gcc/testsuite/gcc.dg/lto/pr88297_1.c
@@ -0,0 +1,25 @@
+extern void __attribute__ ((noipa))
+use (int v);
+
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 8 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg + 3;
+}
+
+int __attribute__ ((noinline))
+entry2 (void)
+{
+  use (bar (3));
+  use (bar (4));
+  use (foo (5));
+  use (foo (6));
+  return 0;
+}
-- 
2.19.1

From c2db1a6aa7d6787685a28df1e677a66bac6cb9b5 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 6 Dec 2018 14:06:55 -0500
Subject: [PATCH 3/3] Spelling fixes in comments.

gcc/ChangeLog:

2018-12-06  Michael Ploujnikov  

	* ipa-cp.c (ipa_get_indirect_edge_target_1): Fix spelling in
	comments.
---
 gcc/ipa-cp.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git gcc/ipa-cp.c gcc/ipa-cp.c
index c4e879bbc6..8c419e1f53 100644
--- gcc/ipa-cp.c
+++ gcc/ipa-cp.c
@@ -191,7 +191,7 @@ public:
   /* Depth first search number and low link for topological sorting of
  values.  */
   int dfs, low_link;
-  /* True if this valye is currently on the topo-sort stack.  */
+  /* True if this value is currently on the topo-sort stack.  */
   bool on_stack;
 
   ipcp_value()
@@ -883,7 +883,7 @@ ipcp_lattice::set_contains_variable ()
   return ret;
 }
 
-/* Set all aggegate lattices in PLATS to bottom and return true if they were
+/* Set all aggregate lattices in PLATS to bottom and return true if they were
not previously set as such.  */
 
 static inline bool

[PING 1] Testcase for PR 88297 and minor fixes

2018-12-12 Thread Michael Ploujnikov
On 2018-12-06 3:13 p.m., Michael Ploujnikov wrote:
> Thanks to Martin we now have a test that exercises (cp) cloning
> machinery during the WPA stage of LTO.
> 
> Also, during debugging I found that print_all_lattices would trigger
> an assert if I tried to call it inside decide_whether_version_node.
> 
> Finally I've attached some comment spelling fixes as a bonus.
> 
> 
> Bootstrapping (--with-build-config=bootstrap-lto) and regression testing on 
> x86_64.
> 
> Ok for trunk after tests pass?
> 
> 
> - Michael
> 

Ping?



signature.asc
Description: OpenPGP digital signature


Avoid unnecessarily numbered clone symbols

2018-10-19 Thread Michael Ploujnikov
While working on
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html I've
accumulated a few easy patches.

The first one renames the functions in question to hopefully encourage
proper future usage. The other ones use the unnumbered version of the
clone name function where I've verified the numbers are not
needed. I've verified these by doing a full bootstrap and a regression
test, by instrumenting the code and by understanding and following the
surrounding code to convince myself that the numbering is indeed not
needed. For the cold functions I've also confirmed with Sriraman
Tallam that they don't need to be numbered.



Regards,
- Michael
From 0bbcf3b8c20498f4d861e088ff7ab38e2a43800b Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Tue, 7 Aug 2018 20:36:53 -0400
Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to
 clarify usage.

gcc:
2018-10-19  Michael Ploujnikov  

   * gcc/cgraph.h: Rename clone_function_name_1 to
 numbered_clone_function_name_1. Rename clone_function_name to
 numbered_clone_function_name.
   * cgraphclones.c: Ditto.
   * config/rs6000/rs6000.c: Ditto.
   * lto/lto-partition.c: Ditto.
   * multiple_target.c: Ditto.
   * omp-expand.c: Ditto.
   * omp-low.c: Ditto.
   * omp-simd-clone.c: Ditto.
   * symtab.c: Ditto.
---
 gcc/cgraph.h   |  4 ++--
 gcc/cgraphclones.c | 20 +++-
 gcc/config/rs6000/rs6000.c |  2 +-
 gcc/lto/lto-partition.c|  4 ++--
 gcc/multiple_target.c  |  8 
 gcc/omp-expand.c   |  2 +-
 gcc/omp-low.c  |  4 ++--
 gcc/omp-simd-clone.c   |  2 +-
 gcc/symtab.c   |  2 +-
 9 files changed, 25 insertions(+), 23 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index a8b1b4c..3583f7e 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count);
 tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree);
 /* In cgraphclones.c  */
 
-tree clone_function_name_1 (const char *, const char *);
-tree clone_function_name (tree decl, const char *);
+tree numbered_clone_function_name_1 (const char *, const char *);
+tree numbered_clone_function_name (tree decl, const char *);
 
 void tree_function_versioning (tree, tree, vec *,
 			   bool, bitmap, bool, bitmap, basic_block);
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..cdb183d 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -316,7 +316,7 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node)
   gcc_checking_assert (!DECL_RESULT (new_decl));
   gcc_checking_assert (!DECL_RTL_SET_P (new_decl));
 
-  DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk");
+  DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl, "artificial_thunk");
   SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
 
   new_thunk = cgraph_node::create (new_decl);
@@ -514,11 +514,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
 
 static GTY(()) unsigned int clone_fn_id_num;
 
-/* Return a new assembler name for a clone with SUFFIX of a decl named
-   NAME.  */
+/* Return NAME appended with string SUFFIX and a unique unspecified
+   number.  */
 
 tree
-clone_function_name_1 (const char *name, const char *suffix)
+numbered_clone_function_name_1 (const char *name, const char *suffix)
 {
   size_t len = strlen (name);
   char *tmp_name, *prefix;
@@ -531,13 +531,15 @@ clone_function_name_1 (const char *name, const char *suffix)
   return get_identifier (tmp_name);
 }
 
-/* Return a new assembler name for a clone of DECL with SUFFIX.  */
+/* Return a new assembler name for a clone of DECL. Apart from the
+   string SUFFIX, the new name will end with a unique unspecified
+   number.  */
 
 tree
-clone_function_name (tree decl, const char *suffix)
+numbered_clone_function_name (tree decl, const char *suffix)
 {
   tree name = DECL_ASSEMBLER_NAME (decl);
-  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
+  return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
 }
 
 
@@ -585,7 +587,7 @@ cgraph_node::create_virtual_clone (vec redirect_callers,
   strcpy (name + len + 1, suffix);
   name[len] = '.';
   DECL_NAME (new_decl) = get_identifier (name);
-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl, numbered_clone_function_name (old_decl, suffix));
   SET_DECL_RTL (new_decl, NULL);
 
   new_node = create_clone (new_decl, count, false,
@@ -964,7 +966,7 @@ cgraph_node::create_version_clone_with_body
   = build_function_decl_skip_args (old_decl, args_to_skip, skip_return);
 
   /* Generate a new name for the new version. */
-  DECL_NAME (new_decl) = clone_function_name (old_decl, suffix);
+  DECL_NAME (new_decl) = numbered_clone_function_

Re: Avoid unnecessarily numbered clone symbols

2018-10-20 Thread Michael Ploujnikov
On 2018-10-20 07:39 AM, Bernhard Reutner-Fischer wrote:
> On 20 October 2018 00:26:15 CEST, Michael Ploujnikov 
>  wrote:
>> While working on
>> https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html I've
>> accumulated a few easy patches.
> 
>  
> +/* Return decl name IDENTIFIER with string SUFFIX appended.  */
> +
> +tree
> +suffixed_function_name (tree identifier, const char *suffix)
> +{
> +  const char *name = IDENTIFIER_POINTER (identifier);
> +  size_t len = strlen (name);
> +  char *prefix;
> +
> +  prefix = XALLOCAVEC (char, len + strlen (suffix) + 2);
> +  memcpy (prefix, name, len);
> +  prefix[len] = symbol_table::symbol_suffix_separator ();
> +  strcpy (prefix + len + 1, suffix);
> +  return get_identifier (prefix);
> +}
> +
> 
> FWIW I think I would phrase this as
> 
> char *str = concat (
>   IDENTIFIER_POINTER (identifier),
>   symbol_table::symbol_suffix_separator (),
>   suffix);
>   tree ret = get_identifier (str);
>   free (str);
>   return ret;
> 

Thanks for the suggestion Bernhard. I also found that the last
argument to concat has to be NULL and I can use ACONCAT to avoid the
explicit free and since symbol_table::symbol_suffix_separator returns
just one char I need to first put it into a string.

I also looked into re-writing numbered_clone_function_name in a
similar way, but before I got too far I found a small issue with my
suffixed_function_name: If I'm going to write an exact replacement for
numbered_clone_function_name then I need to also copy the
double-underscore prefixing behaviour done by ASM_PN_FORMAT (right?)
which is used by ASM_FORMAT_PRIVATE_NAME and write it as:

  char *separator = XALLOCAVEC (char, 2);
  separator[0] = symbol_table::symbol_suffix_separator ();
  separator[1] = 0;
  return get_identifier (ACONCAT ((
#if defined (NO_DOT_IN_LABEL) && defined (NO_DOLLAR_IN_LABEL)
"__",
#endif
IDENTIFIER_POINTER (identifier),
separator,
suffix,
NULL)));

(I'm not sure if the formatting is correct)

However, then it's not exactly the same as the code that I'm also
trying to replace in cgraph_node::create_virtual_clone because it
doesn't add a double underscore if neither NO_DOT_IN_LABEL nor
NO_DOLLAR_IN_LABEL is defined. Is this just an omission that I should
fix by with my new function or was it indended that way and shouldn't
be changed?

Suggestions anyone?


- Michael



signature.asc
Description: OpenPGP digital signature


[PATCH v2] Avoid unnecessarily numbered clone symbols

2018-10-21 Thread Michael Ploujnikov
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01258.html

Fixed up the code after the change to concat suggested by Bernhard
Reutner.

Outstanding question still remains:

To write an exact replacement for numbered_clone_function_name (apart
from the numbering) I also need to copy the double underscore
prefixing behaviour done by ASM_PN_FORMAT (right?)  which is used by
ASM_FORMAT_PRIVATE_NAME. Does that mean that I can't use my
suffixed_function_name to replace the very similar looking code in
cgraph_node::create_virtual_clone? Or is it just missing the double
underscore prefix by mistake?


- Michael
From 74435e1d8c5984eaee766d7940eeffbe565fcc2e Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Tue, 7 Aug 2018 20:36:53 -0400
Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to
 clarify usage.

gcc:
2018-10-19  Michael Ploujnikov  

   * gcc/cgraph.h: Rename clone_function_name_1 to
 numbered_clone_function_name_1. Rename clone_function_name to
 numbered_clone_function_name.
   * cgraphclones.c: Ditto.
   * config/rs6000/rs6000.c: Ditto.
   * lto/lto-partition.c: Ditto.
   * multiple_target.c: Ditto.
   * omp-expand.c: Ditto.
   * omp-low.c: Ditto.
   * omp-simd-clone.c: Ditto.
   * symtab.c: Ditto.
---
 gcc/cgraph.h   |  4 ++--
 gcc/cgraphclones.c | 22 +-
 gcc/config/rs6000/rs6000.c |  2 +-
 gcc/lto/lto-partition.c|  4 ++--
 gcc/multiple_target.c  |  8 
 gcc/omp-expand.c   |  2 +-
 gcc/omp-low.c  |  4 ++--
 gcc/omp-simd-clone.c   |  2 +-
 gcc/symtab.c   |  3 ++-
 9 files changed, 28 insertions(+), 23 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index a8b1b4c..3583f7e 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count);
 tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree);
 /* In cgraphclones.c  */
 
-tree clone_function_name_1 (const char *, const char *);
-tree clone_function_name (tree decl, const char *);
+tree numbered_clone_function_name_1 (const char *, const char *);
+tree numbered_clone_function_name (tree decl, const char *);
 
 void tree_function_versioning (tree, tree, vec *,
 			   bool, bitmap, bool, bitmap, basic_block);
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..bc59dc2 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -316,7 +316,8 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node)
   gcc_checking_assert (!DECL_RESULT (new_decl));
   gcc_checking_assert (!DECL_RTL_SET_P (new_decl));
 
-  DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk");
+  DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl,
+		   "artificial_thunk");
   SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
 
   new_thunk = cgraph_node::create (new_decl);
@@ -514,11 +515,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
 
 static GTY(()) unsigned int clone_fn_id_num;
 
-/* Return a new assembler name for a clone with SUFFIX of a decl named
-   NAME.  */
+/* Return NAME appended with string SUFFIX and a unique unspecified
+   number.  */
 
 tree
-clone_function_name_1 (const char *name, const char *suffix)
+numbered_clone_function_name_1 (const char *name, const char *suffix)
 {
   size_t len = strlen (name);
   char *tmp_name, *prefix;
@@ -531,13 +532,15 @@ clone_function_name_1 (const char *name, const char *suffix)
   return get_identifier (tmp_name);
 }
 
-/* Return a new assembler name for a clone of DECL with SUFFIX.  */
+/* Return a new assembler name for a clone of DECL.  Apart from the
+   string SUFFIX, the new name will end with a unique unspecified
+   number.  */
 
 tree
-clone_function_name (tree decl, const char *suffix)
+numbered_clone_function_name (tree decl, const char *suffix)
 {
   tree name = DECL_ASSEMBLER_NAME (decl);
-  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
+  return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
 }
 
 
@@ -585,7 +588,8 @@ cgraph_node::create_virtual_clone (vec redirect_callers,
   strcpy (name + len + 1, suffix);
   name[len] = '.';
   DECL_NAME (new_decl) = get_identifier (name);
-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+			   numbered_clone_function_name (old_decl,suffix));
   SET_DECL_RTL (new_decl, NULL);
 
   new_node = create_clone (new_decl, count, false,
@@ -964,7 +968,7 @@ cgraph_node::create_version_clone_with_body
   = build_function_decl_skip_args (old_decl, args_to_skip, skip_return);
 
   /* Generate a new name for the new version. */
-  DECL_NAME (new_decl) = clone_function_name (old_decl, suffix);
+  DECL_NAME (new_decl) = numbered_clone_function_name (old_decl, suffix);
   SET

[PATCH v3] Avoid unnecessarily numbered clone symbols

2018-10-23 Thread Michael Ploujnikov
On 2018-10-21 09:14 PM, Michael Ploujnikov wrote:
> Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-10/msg01258.html
> 
> Fixed up the code after the change to concat suggested by Bernhard
> Reutner.
> 
> Outstanding question still remains:
> 
> To write an exact replacement for numbered_clone_function_name (apart
> from the numbering) I also need to copy the double underscore
> prefixing behaviour done by ASM_PN_FORMAT (right?)  which is used by
> ASM_FORMAT_PRIVATE_NAME. Does that mean that I can't use my
> suffixed_function_name to replace the very similar looking code in
> cgraph_node::create_virtual_clone? Or is it just missing the double
> underscore prefix by mistake?

I found https://gcc.gnu.org/ml/gcc-patches/2013-08/msg01028.html which
answered my question so now I'm just simplifying
cgraph_node::create_virtual_clone with ACONCAT.


- Michael
From 383c64faa956c8b06e680808ef275acb6a746158 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Tue, 7 Aug 2018 20:36:53 -0400
Subject: [PATCH 1/4] Rename clone_function_name_1 and clone_function_name to
 clarify usage.

gcc:
2018-10-23  Michael Ploujnikov  

   * gcc/cgraph.h: Rename clone_function_name_1 to
 numbered_clone_function_name_1. Rename clone_function_name to
 numbered_clone_function_name.
   * cgraphclones.c: Ditto.
   * config/rs6000/rs6000.c: Ditto.
   * lto/lto-partition.c: Ditto.
   * multiple_target.c: Ditto.
   * omp-expand.c: Ditto.
   * omp-low.c: Ditto.
   * omp-simd-clone.c: Ditto.
   * symtab.c: Ditto.
---
 gcc/cgraph.h   |  4 ++--
 gcc/cgraphclones.c | 22 +-
 gcc/config/rs6000/rs6000.c |  2 +-
 gcc/lto/lto-partition.c|  4 ++--
 gcc/multiple_target.c  |  8 
 gcc/omp-expand.c   |  2 +-
 gcc/omp-low.c  |  4 ++--
 gcc/omp-simd-clone.c   |  2 +-
 gcc/symtab.c   |  3 ++-
 9 files changed, 28 insertions(+), 23 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index a8b1b4c..3583f7e 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2368,8 +2368,8 @@ basic_block init_lowered_empty_function (tree, bool, profile_count);
 tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree);
 /* In cgraphclones.c  */
 
-tree clone_function_name_1 (const char *, const char *);
-tree clone_function_name (tree decl, const char *);
+tree numbered_clone_function_name_1 (const char *, const char *);
+tree numbered_clone_function_name (tree decl, const char *);
 
 void tree_function_versioning (tree, tree, vec *,
 			   bool, bitmap, bool, bitmap, basic_block);
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..4395806 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -316,7 +316,8 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node)
   gcc_checking_assert (!DECL_RESULT (new_decl));
   gcc_checking_assert (!DECL_RTL_SET_P (new_decl));
 
-  DECL_NAME (new_decl) = clone_function_name (thunk->decl, "artificial_thunk");
+  DECL_NAME (new_decl) = numbered_clone_function_name (thunk->decl,
+		   "artificial_thunk");
   SET_DECL_ASSEMBLER_NAME (new_decl, DECL_NAME (new_decl));
 
   new_thunk = cgraph_node::create (new_decl);
@@ -514,11 +515,11 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
 
 static GTY(()) unsigned int clone_fn_id_num;
 
-/* Return a new assembler name for a clone with SUFFIX of a decl named
-   NAME.  */
+/* Return NAME appended with string SUFFIX and a unique unspecified
+   number.  */
 
 tree
-clone_function_name_1 (const char *name, const char *suffix)
+numbered_clone_function_name_1 (const char *name, const char *suffix)
 {
   size_t len = strlen (name);
   char *tmp_name, *prefix;
@@ -531,13 +532,15 @@ clone_function_name_1 (const char *name, const char *suffix)
   return get_identifier (tmp_name);
 }
 
-/* Return a new assembler name for a clone of DECL with SUFFIX.  */
+/* Return a new assembler name for a clone of DECL.  Apart from the
+   string SUFFIX, the new name will end with a unique unspecified
+   number.  */
 
 tree
-clone_function_name (tree decl, const char *suffix)
+numbered_clone_function_name (tree decl, const char *suffix)
 {
   tree name = DECL_ASSEMBLER_NAME (decl);
-  return clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
+  return numbered_clone_function_name_1 (IDENTIFIER_POINTER (name), suffix);
 }
 
 
@@ -585,7 +588,8 @@ cgraph_node::create_virtual_clone (vec redirect_callers,
   strcpy (name + len + 1, suffix);
   name[len] = '.';
   DECL_NAME (new_decl) = get_identifier (name);
-  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name (old_decl, suffix));
+  SET_DECL_ASSEMBLER_NAME (new_decl,
+			   numbered_clone_function_name (old_decl, suffix));
   SET_DECL_RTL (new_decl, NULL);
 
   new_node = create_clone (new_decl, count, false,
@@ -964,7 +968,7 @@ cgraph_node::creat

[PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-25 Thread Michael Ploujnikov
I've taken the advice from a discussion on IRC and re-wrote the patch
with more uniform function names and using overloading.

I think this function accomplished the following goals:
 - remove clone numbering where it's not needed:
   final.c:final_scan_insn_1 and
   symtab.c:simd_symtab_node::noninterposable_alias.
 - name and document the clone naming API such that future users won't
   accidentally use the numbering when it's not necessary; if you need
   numbering then you need to explicitly ask for it with the right
   function
 - provide a new function that allows users to specify a clone number
   explicitly as an argument

My thoughts for future improvements:
 - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1
   has to break the decl abstraction and pass in a string that it
   created into what I would consider the implementation-detail
   function. The best way I can think of to make it uniform with the
   rest of the users is to have it create a new empty decl with
   DECL_ASSEMBLER_NAME set to the new string
 - It's unfortunate that I have to duplicate the separator
   concatenation in the numberless clone_function_name, but I think it
   has to be like that unless ASM_FORMAT_PRIVATE_NAME making the
   number optional.
From 0df79d0ac6a9891b344f988c7157a54cebbc1cb8 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 25 Oct 2018 13:16:36 -0400
Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.

gcc/ChangeLog:

2018-10-25  Michael Ploujnikov  

	* cgraph.h (clone_function_name_1): Replaced by new
  clone_function_name_numbered that takes name as string; for
  privatize_symbol_name_1 use only.  (clone_function_name):
  Renamed to clone_function_name_numbered to be explicit about
  numbering.  (clone_function_name): New two-argument function
  that does not number its output.  (clone_function_name): New
  three-argument function that takes a number to append to its
  output.
* cgraphclones.c (duplicate_thunk_for_node):
	(clone_function_name_1): Renamed.
	(clone_function_name_numbered): Two new functions.
	(clone_function_name): Improved documentation.
	(cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
	* config/rs6000/rs6000.c (make_resolver_func): Ditto.
* final.c (final_scan_insn_1): Use the new clone_function_name
  without numbering.
	* multiple_target.c (create_dispatcher_calls): Ditto.
	(create_target_clone): Ditto.
	* omp-expand.c (grid_expand_target_grid_body): Ditto.
	* omp-low.c (create_omp_child_function_name): Ditto.
	* omp-simd-clone.c (simd_clone_create): Ditto.
   	* symtab.c (simd_symtab_node::noninterposable_alias): Use the
  new clone_function_name without numbering.

gcc/lto/ChangeLog:

2018-10-25  Michael Ploujnikov  

	* lto-partition.c (privatize_symbol_name_1): Use
  clone_function_name_numbered.

gcc/testsuite/ChangeLog:

2018-10-25  Michael Ploujnikov  

	* gcc.dg/tree-prof/cold_partition_label.c: Update for cold
  section names without numbers.
* gcc.dg/tree-prof/section-attr-1.c: Ditto.
	* gcc.dg/tree-prof/section-attr-2.c: Ditto.
	* gcc.dg/tree-prof/section-attr-3.c: Ditto.
---
 gcc/cgraph.h   |  7 ++-
 gcc/cgraphclones.c | 66 ++
 gcc/config/rs6000/rs6000.c |  2 +-
 gcc/lto/lto-partition.c|  4 +-
 gcc/multiple_target.c  |  8 +--
 gcc/omp-expand.c   |  3 +-
 gcc/omp-low.c  |  4 +-
 gcc/omp-simd-clone.c   |  3 +-
 .../gcc.dg/tree-prof/cold_partition_label.c|  4 +-
 gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c|  2 +-
 gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c|  2 +-
 gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c|  2 +-
 12 files changed, 79 insertions(+), 28 deletions(-)

diff --git gcc/cgraph.h gcc/cgraph.h
index a8b1b4c..971065d 100644
--- gcc/cgraph.h
+++ gcc/cgraph.h
@@ -2368,8 +2368,11 @@ basic_block init_lowered_empty_function (tree, bool, profile_count);
 tree thunk_adjust (gimple_stmt_iterator *, tree, bool, HOST_WIDE_INT, tree);
 /* In cgraphclones.c  */
 
-tree clone_function_name_1 (const char *, const char *);
-tree clone_function_name (tree decl, const char *);
+tree clone_function_name_numbered (const char *name, const char *suffix);
+tree clone_function_name_numbered (tree decl, const char *suffix);
+tree clone_function_name (const char *name, const char *suffix,
+			  unsigned long number);
+tree clone_function_name (tree decl, const char *suffix);
 
 void tree_function_versioning (tree, tree, vec *,
 			   bool, bitmap, bool, bitmap, basic_block);
diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index 6e84a31..0e496d9 100644
--- gcc/cgraphclones.c
+++ gcc/c

Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-26 Thread Michael Ploujnikov
Hi Martin,

Thanks for the review.

On 2018-10-26 03:51 AM, Martin Liška wrote:
> On 10/26/18 12:59 AM, Michael Ploujnikov wrote:
>> I've taken the advice from a discussion on IRC and re-wrote the patch
>> with more uniform function names and using overloading.
>>
>> I think this function accomplished the following goals:
>>  - remove clone numbering where it's not needed:
>>final.c:final_scan_insn_1 and
>>symtab.c:simd_symtab_node::noninterposable_alias.
>>  - name and document the clone naming API such that future users won't
>>accidentally use the numbering when it's not necessary; if you need
>>numbering then you need to explicitly ask for it with the right
>>function
>>  - provide a new function that allows users to specify a clone number
>>explicitly as an argument
> 
> Hello.
> 
> Thanks for reworking that.
> 
>>
>> My thoughts for future improvements:
>>  - It's a bit unfortunate that lto-partition.c:privatize_symbol_name_1
>>has to break the decl abstraction and pass in a string that it
>>created into what I would consider the implementation-detail
>>function. The best way I can think of to make it uniform with the
>>rest of the users is to have it create a new empty decl with
>>DECL_ASSEMBLER_NAME set to the new string
> 
> That's not nice to create artificial declaration. Having string variant
> is fine for me.

Ok.

> 
>>  - It's unfortunate that I have to duplicate the separator
>>concatenation in the numberless clone_function_name, but I think it
>>has to be like that unless ASM_FORMAT_PRIVATE_NAME making the
>>number optional.
>>
> 
> That's also fine for me. I'm attaching small nits that I found.
> And please reformat following chunk in ChangeLog entry:
> 
>   * cgraph.h (clone_function_name_1): Replaced by new
>   clone_function_name_numbered that takes name as string; for
>   privatize_symbol_name_1 use only.  (clone_function_name):
>   Renamed to clone_function_name_numbered to be explicit about
>   numbering.  (clone_function_name): New two-argument function
>   that does not number its output.  (clone_function_name): New
>   three-argument function that takes a number to append to its
>   output.
> 
> into:
> 
>   * cgraph.h (clone_function_name_1): Replaced by new
>   clone_function_name_numbered that takes name as string; for
>   privatize_symbol_name_1 use only.
> (clone_function_name): Renamed to clone_function_name_numbered
>   to be explicit about...

Fixed, assuming you wanted me to start each function on a new line.

>  suffix,
> -(char*)0));
> +NULL));
>return get_identifier (result);
>  }

I've actually been told that NULL isn't always the same on some
targets and that I should use (char*)0 instead. Note that
libiberty/concat.c itself uses (char*)0.

> 
> I'm adding Honza to CC, hope he can review it quickly.
> 
> Thanks,
> Martin
> 

Thanks again,
Michael
From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 25 Oct 2018 13:16:36 -0400
Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.

gcc/ChangeLog:

2018-10-26  Michael Ploujnikov  

	* cgraph.h (clone_function_name_1): Replaced by new
	  clone_function_name_numbered that takes name as string; for
	  privatize_symbol_name_1 use only.
	  (clone_function_name): Renamed to
	  clone_function_name_numbered to be explicit about numbering.
	  (clone_function_name): New two-argument function that does
	  not number its output.
	  (clone_function_name): New three-argument function that
	  takes a number to append to its output.
	* cgraphclones.c (duplicate_thunk_for_node):
	  (clone_function_name_1): Renamed.
	  (clone_function_name_numbered): Two new functions.
	  (clone_function_name): Improved documentation.
	  (cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
	* config/rs6000/rs6000.c (make_resolver_func): Ditto.
	* final.c (final_scan_insn_1): Use the new clone_function_name
	  without numbering.
	* multiple_target.c (create_dispatcher_calls): Ditto.
	  (create_target_clone): Ditto.
	* omp-expand.c (grid_expand_target_grid_body): Ditto.
	* omp-low.c (create_omp_child_function_name): Ditto.
	* omp-simd-clone.c (simd_clone_create): Ditto.
	* symtab.c (simd_symtab_node::noninterposable_alias): Use the
	  new clone_function_name without numbering.

gcc/lto/ChangeLog:

2018-10-26  Michael Ploujnikov  

	* lto-partition.c (privatize_symbol_name_1): Use
	

Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-27 Thread Michael Ploujnikov
Hi,

On 2018-10-26 10:25 a.m., Jan Hubicka wrote:
>> From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001
>> From: Michael Ploujnikov 
>> Date: Thu, 25 Oct 2018 13:16:36 -0400
>> Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.
>>
>> gcc/ChangeLog:
>>
>> 2018-10-26  Michael Ploujnikov  
>>
>>  * cgraph.h (clone_function_name_1): Replaced by new
>>clone_function_name_numbered that takes name as string; for
>>privatize_symbol_name_1 use only.
>>(clone_function_name): Renamed to
>>clone_function_name_numbered to be explicit about numbering.
>>(clone_function_name): New two-argument function that does
>>not number its output.
>>(clone_function_name): New three-argument function that
>>takes a number to append to its output.
>>  * cgraphclones.c (duplicate_thunk_for_node):
>>(clone_function_name_1): Renamed.
>>(clone_function_name_numbered): Two new functions.
>>(clone_function_name): Improved documentation.
>>(cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
>>  * config/rs6000/rs6000.c (make_resolver_func): Ditto.
>>  * final.c (final_scan_insn_1): Use the new clone_function_name
>>without numbering.
>>  * multiple_target.c (create_dispatcher_calls): Ditto.
>>(create_target_clone): Ditto.
>>  * omp-expand.c (grid_expand_target_grid_body): Ditto.
>>  * omp-low.c (create_omp_child_function_name): Ditto.
>>  * omp-simd-clone.c (simd_clone_create): Ditto.
>>  * symtab.c (simd_symtab_node::noninterposable_alias): Use the
>>new clone_function_name without numbering.
>>
>> gcc/lto/ChangeLog:
>>
>> 2018-10-26  Michael Ploujnikov  
>>
>>  * lto-partition.c (privatize_symbol_name_1): Use
>>clone_function_name_numbered.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-10-26  Michael Ploujnikov  
>>
>>  * gcc.dg/tree-prof/cold_partition_label.c: Update for cold
>>section names without numbers.
>>  * gcc.dg/tree-prof/section-attr-1.c: Ditto.
>>  * gcc.dg/tree-prof/section-attr-2.c: Ditto.
>>  * gcc.dg/tree-prof/section-attr-3.c: Ditto.
> 
> OK,
> thanks!
> Honza
> 

Thanks again for the review. This is my first patch and I don't have
commit access. What should I do?


- Michael



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v4] Avoid unnecessarily numbering cloned symbols.

2018-10-29 Thread Michael Ploujnikov

On 2018-10-29 6:49 a.m., Martin Liška wrote:
> On 10/29/18 9:40 AM, Martin Liška wrote:
>> On 10/27/18 6:15 PM, Michael Ploujnikov wrote:
>>> Hi,
>>>
>>> On 2018-10-26 10:25 a.m., Jan Hubicka wrote:
>>>>> From aea94273e7a477a03d1ee10a5d9043d6d13b8e8d Mon Sep 17 00:00:00 2001
>>>>> From: Michael Ploujnikov 
>>>>> Date: Thu, 25 Oct 2018 13:16:36 -0400
>>>>> Subject: [PATCH] Avoid unnecessarily numbering cloned symbols.
>>>>>
>>>>> gcc/ChangeLog:
>>>>>
>>>>> 2018-10-26  Michael Ploujnikov  
>>>>>
>>>>>   * cgraph.h (clone_function_name_1): Replaced by new
>>>>> clone_function_name_numbered that takes name as string; for
>>>>> privatize_symbol_name_1 use only.
>>>>> (clone_function_name): Renamed to
>>>>> clone_function_name_numbered to be explicit about numbering.
>>>>> (clone_function_name): New two-argument function that does
>>>>> not number its output.
>>>>> (clone_function_name): New three-argument function that
>>>>> takes a number to append to its output.
>>>>>   * cgraphclones.c (duplicate_thunk_for_node):
>>>>> (clone_function_name_1): Renamed.
>>>>> (clone_function_name_numbered): Two new functions.
>>>>> (clone_function_name): Improved documentation.
>>>>> (cgraph_node::create_virtual_clone): Use clone_function_name_numbered.
>>>>>   * config/rs6000/rs6000.c (make_resolver_func): Ditto.
>>>>>   * final.c (final_scan_insn_1): Use the new clone_function_name
>>>>> without numbering.
>>>>>   * multiple_target.c (create_dispatcher_calls): Ditto.
>>>>> (create_target_clone): Ditto.
>>>>>   * omp-expand.c (grid_expand_target_grid_body): Ditto.
>>>>>   * omp-low.c (create_omp_child_function_name): Ditto.
>>>>>   * omp-simd-clone.c (simd_clone_create): Ditto.
>>>>>   * symtab.c (simd_symtab_node::noninterposable_alias): Use the
>>>>> new clone_function_name without numbering.
>>>>>
>>>>> gcc/lto/ChangeLog:
>>>>>
>>>>> 2018-10-26  Michael Ploujnikov  
>>>>>
>>>>>   * lto-partition.c (privatize_symbol_name_1): Use
>>>>> clone_function_name_numbered.
>>>>>
>>>>> gcc/testsuite/ChangeLog:
>>>>>
>>>>> 2018-10-26  Michael Ploujnikov  
>>>>>
>>>>>   * gcc.dg/tree-prof/cold_partition_label.c: Update for cold
>>>>> section names without numbers.
>>>>>   * gcc.dg/tree-prof/section-attr-1.c: Ditto.
>>>>>   * gcc.dg/tree-prof/section-attr-2.c: Ditto.
>>>>>   * gcc.dg/tree-prof/section-attr-3.c: Ditto.
>>>>
>>>> OK,
>>>> thanks!
>>>> Honza
>>>>
>>>
>>> Thanks again for the review. This is my first patch and I don't have
>>> commit access. What should I do?
>>
>> I'm going to install the patch on your behalf. For write access you should
>> follow these intructions: 
>> https://www.gnu.org/software/gcc/svnwrite.html#policies
>>
>> Martin
>>
>>>
>>>
>>> - Michael
>>>
>>
> 
> But first I see some failures when I tried to apply the patch:
> 
> $ patch -p0 --dry-run < 
> ~/Downloads/0001-Avoid-unnecessarily-numbering-cloned-symbols.patch
> checking file gcc/cgraph.h
> Hunk #1 succeeded at 2382 with fuzz 1 (offset 14 lines).
> checking file gcc/cgraphclones.c
> Hunk #1 succeeded at 317 (offset 1 line).
> checking file gcc/config/rs6000/rs6000.c
> Hunk #1 succeeded at 36997 (offset 485 lines).
> checking file gcc/lto/lto-partition.c
> checking file gcc/multiple_target.c
> checking file gcc/omp-expand.c
> checking file gcc/omp-low.c
> checking file gcc/omp-simd-clone.c
> checking file gcc/testsuite/gcc.dg/tree-prof/cold_partition_label.c
> checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-1.c
> Hunk #1 FAILED at 42.
> 1 out of 1 hunk FAILED
> checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-2.c
> Hunk #1 FAILED at 41.
> 1 out of 1 hunk FAILED
> checking file gcc/testsuite/gcc.dg/tree-prof/section-attr-3.c
> Hunk #1 FAILED at 42.
> 1 out of 1 hunk FAILED
> 
> Can you please rebase that on top of current trunk?
> Thanks,
> Martin
> 

Sorry about that. Attached is the fixed patch. Note that I'm cu

[doc PATCH] Fix weakref description.

2018-11-02 Thread Michael Ploujnikov
I came across this typo and also added a similar ld invocation for
illustration purposes as mentioned by Jakub on irc.
From 2df4903f04fbe68e9e6a1ae0eea460e7592a8512 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Fri, 2 Nov 2018 13:40:50 -0400
Subject: [PATCH] Fix weakref description.

gcc/ChangeLog:

2018-11-02  Michael Ploujnikov  

	* doc/extend.texi: Fix typo in the weakref description.
---
 gcc/doc/extend.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/doc/extend.texi gcc/doc/extend.texi
index 4dbb2da39e..027e0f75a1 100644
--- gcc/doc/extend.texi
+++ gcc/doc/extend.texi
@@ -3603,7 +3603,7 @@ symbol, not necessarily in the same translation unit.
 The effect is equivalent to moving all references to the alias to a
 separate translation unit, renaming the alias to the aliased symbol,
 declaring it as weak, compiling the two separate translation units and
-performing a reloadable link on them.
+performing a incremental link (like @code{ld -r}) on them.
 
 At present, a declaration to which @code{weakref} is attached can
 only be @code{static}.
-- 
2.19.1



signature.asc
Description: OpenPGP digital signature


Re: [doc PATCH] Fix weakref description.

2018-11-12 Thread Michael Ploujnikov
On 2018-11-02 1:59 p.m., Michael Ploujnikov wrote:
> I came across this typo and also added a similar ld invocation for
> illustration purposes as mentioned by Jakub on irc.
> 

After talking to Jakub about it, I went with different terminology.


- Michael
From f14d7315e0dc9c4b6aff6137fd90e4d2595ef9f5 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 12 Nov 2018 12:42:37 -0500
Subject: [PATCH] Fix weakref description.

gcc/ChangeLog:

2018-11-12  Michael Ploujnikov  

	* doc/extend.texi: Fix typo in the weakref description.
---
 gcc/doc/extend.texi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git gcc/doc/extend.texi gcc/doc/extend.texi
index e2b9ee11a54..fc507afe600 100644
--- gcc/doc/extend.texi
+++ gcc/doc/extend.texi
@@ -3619,7 +3619,7 @@ symbol, not necessarily in the same translation unit.
 The effect is equivalent to moving all references to the alias to a
 separate translation unit, renaming the alias to the aliased symbol,
 declaring it as weak, compiling the two separate translation units and
-performing a reloadable link on them.
+performing a link with relocatable output (ie: @code{ld -r}) on them.
 
 At present, a declaration to which @code{weakref} is attached can
 only be @code{static}.
-- 
2.19.1



signature.asc
Description: OpenPGP digital signature


Re: [doc PATCH] Fix weakref description.

2018-11-13 Thread Michael Ploujnikov
On 2018-11-12 12:50 p.m., Michael Ploujnikov wrote:
> On 2018-11-02 1:59 p.m., Michael Ploujnikov wrote:
>> I came across this typo and also added a similar ld invocation for
>> illustration purposes as mentioned by Jakub on irc.
>>
> 
> After talking to Jakub about it, I went with different terminology.
> 
> 
> - Michael
> 

Installed as obvious.

- Michael



signature.asc
Description: OpenPGP digital signature


[PATCH v3] Make function clone name numbering independent.

2018-11-28 Thread Michael Ploujnikov
Continuing from https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00228.html

It took longer than expected, but I've finally rebased this on top of
the new clone_function_name* API and included the requested
optimizations.

There are a few remaining spots where we could probably apply similar
optimizations:

- gcc/multiple_target.c:create_target_clone
- gcc/multiple_target.c:create_dispatcher_calls
- gcc/omp-expand.c:grid_expand_target_grid_body

But I've yet to figure out how these work in detail and with the new
API these shouldn't block the main change from being merged.

I've also included a small change to rs6000 which I'm pretty sure is
safe, but I have no way of testing.

I'm not sure what's the consensus on what's more appropriate, but I
thought that it would be a good idea to keep these changes as separate
patches since only the first one really changes behaviour, while the
rest are optimizations. It's conceivable that someone who is
backporting this to an older release might want to just backport the
core bits of the change. I can re-submit it as one patch if that's
more appropriate.

Everything in these patches was bootstrapped and regression tested on
x86_64.

Ok for trunk?

- Michael
From 40cb5c888522d69bc42791f0c884dcb9e29eff37 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/4] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-28  Michael Ploujnikov  

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	  hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-28  Michael Ploujnikov  

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c| 10 -
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+ needed.  */
+  if (!clone_fn_ids)
+clone_fn_ids = hash_map::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 00..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+ function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
-- 
2.19.1

From 80b07ddf059415c896cecb9862517899db3993e9 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 17 Sep 2018 16:02:27 -0400
Subject: [PATCH 2/4] Minimize clone counter memory usage in
 create_virtual_clone.

Based on Martin Jambour's suggestion:
https://g

Re: [PATCH v3] Make function clone name numbering independent.

2018-11-29 Thread Michael Ploujnikov
On 2018-11-28 5:49 p.m., Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Nov 28, 2018 at 04:09:14PM -0500, Michael Ploujnikov wrote:
>> I've also included a small change to rs6000 which I'm pretty sure is
>> safe, but I have no way of testing.
> 
> Do you have an account on the GCC Compile Farm?
> https://cfarm.tetaneutral.net/
> There are some nice Power machines in there!
> 
> Does this patch dependent on the rest of the series?
> 
> If it tests okay, it is okay for trunk of course.  Thanks!
> 
> One comment about your changelog:
> 
>> 2018-11-28  Michael Ploujnikov  
>>
>>  * config/rs6000/rs6000.c (make_resolver_func): Generate
>>resolver symbol with clone_function_name instead of
>>clone_function_name_numbered.
> 
> Those last two lines should not start with the spaces.  It should be:
> 
> 2018-11-28  Michael Ploujnikov  
> 
>   * config/rs6000/rs6000.c (make_resolver_func): Generate
>   resolver symbol with clone_function_name instead of
>   clone_function_name_numbered.
> 
> 
> Segher
> 

Thanks for the review and suggestion to use the cfarm! I've
successfully regtested this on power9 and committed the stand-alone
change to rs6000.c after fixing the changelog.

- Michael



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v3] Make function clone name numbering independent.

2018-11-30 Thread Michael Ploujnikov
Hi,

On 2018-11-30 2:25 a.m., Richard Biener wrote:
> +  unsigned &clone_number = lto_clone_numbers->get_or_insert (
> + IDENTIFIER_POINTER (DECL_NAME (decl)));
>name = maybe_rewrite_identifier (name);
>symtab->change_decl_assembler_name (decl,
> -  clone_function_name_numbered (
> -  name, "lto_priv"));
> +  clone_function_name (
> +  name, "lto_priv", clone_number));
> 
> since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
> make more sense to use that as a key for lto_clone_numbers?

Yup, that looks much better. Fixed it.

> For the ipa-hsa.c changes since we iterate over all defined functions
> we at most create a single clone per cgraph_node.  That means your
> hash-map keyed on cgraph_node is useless and you could use
> an unnumbered clone (or a static zero number).

Good catch. I was thinking of doing this, but it somehow fell through the 
cracks during the rebase.

> 
> -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
> -   suffix));
> +  SET_DECL_ASSEMBLER_NAME (new_decl,
> +   clone_function_name (
> + IDENTIFIER_POINTER (
> +   DECL_ASSEMBLER_NAME (old_decl)),
> + suffix,
> + num_suffix));
> 
> can you please hide the implementation detail of accessing the assembler name
> by adding an overload to clone_function_name_numbered with an explicit
> number?

Done.


> 
> OK with those changes.
> 
> Thanks,
> Richard.

Thank you for the review. I will commit as soon as my last test run finishes.

- Michael
From ac1f1579d37804c97833d460ec6cd5b87d6184c7 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/3] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-30  Michael Ploujnikov  

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-30  Michael Ploujnikov  

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c| 10 -
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+ needed.  */
+  if (!clone_fn_ids)
+clone_fn_ids = hash_map::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 00..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+ function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?

[PATCH] String contents hash map key example

2019-01-18 Thread Michael Ploujnikov
I thought it would be useful to others who are new to the GCC codebase
to have an example of how to hash keys based on string contents rather
than pointer addresses (the fact that some hash maps key based on
semi-reliable pointers (due to ggc_mark_stringpool) into the symtab
gives a false sense of security).


- Michael
From 3433efe4ac558de05410a9b185f4ff0a01e7e5df Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Fri, 11 Jan 2019 09:22:14 -0500
Subject: [PATCH] Document how to hash based on key string contents.

gcc:

2019-01-18  Michael Ploujnikov  

	* hash-map-tests.c (test_map_of_strings_to_int): Document how
	to hash based on key string contents.
---
 gcc/hash-map-tests.c | 20 
 1 file changed, 20 insertions(+)

diff --git gcc/hash-map-tests.c gcc/hash-map-tests.c
index 98b5830497..61da8233c4 100644
--- gcc/hash-map-tests.c
+++ gcc/hash-map-tests.c
@@ -77,6 +77,26 @@ test_map_of_strings_to_int ()
   m.remove (eric);
   ASSERT_EQ (5, m.elements ());
   ASSERT_EQ (NULL, m.get (eric));
+
+  /* A plain char * key is hashed based on its value (address), rather
+ than the string it points to.  */
+  char *another_ant = static_cast  (xcalloc (4, 1));
+  another_ant[0] = 'a';
+  another_ant[1] = 'n';
+  another_ant[2] = 't';
+  another_ant[3] = 0;
+  ASSERT_NE (ant, another_ant);
+  unsigned prev_size = m.elements ();
+  ASSERT_EQ (false, m.put (another_ant, 7));
+  ASSERT_EQ (prev_size + 1, m.elements ());
+
+  /* Need to use string_hash or nofree_string_hash key types to hash
+ based on the string contents.  */
+  hash_map  string_map;
+  ASSERT_EQ (false, string_map.put (ant, 1));
+  ASSERT_EQ (1, string_map.elements ());
+  ASSERT_EQ (true, string_map.put (another_ant, 5));
+  ASSERT_EQ (1, string_map.elements ());
 }
 
 /* Run all of the selftests within this file.  */
-- 
2.19.1



signature.asc
Description: OpenPGP digital signature


Re: [PATCH] Fix PR89150, GC of tree-form bitmaps

2019-02-07 Thread Michael Ploujnikov
On 2019-02-04 11:39 a.m., Richard Biener wrote:
> On February 4, 2019 5:07:00 PM GMT+01:00, Jeff Law  wrote:
>> On 2/4/19 6:15 AM, Richard Biener wrote:
>>>
>>> When I introduced tree-form bitmaps I forgot to think about GC.
>>> The following drops the chain_prev annotation to make the marker
>>> work for trees.  I've also maked the obstack member GTY skip
>>> (and prevent bitmap_obstack from gengtype processing) because
>>> the obstack isn't used for GC allocated bitmaps.
>>>
>>> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
>>>
>>> Richard.
>>>
>>> 2019-02-04  Richard Biener  
>>>
>>> PR middle-end/89150
>>> * bitmap.h (struct bitmap_obstack): Do not mark GTY.
>>> (struct bitmap_element): Drop chain_prev so we properly recurse on
>>> the prev member, supporting tree views.
>>> (struct bitmap_head): GTY skip the obstack member.
>> Was there a particular failure mode you observed or was this discovered
>> by inspection.
> 
> It was discovered via an out of tree patch, the issue is only latent on trunk 
> (together with another unfixed pch issue of bitmaps). 

My apologies for not noticing this thread/PR earlier. I originally discovered 
this issue while working on an experimental patch, but since then I've also 
come up with a test case for this specific issue. Is it worth adding it to 
trunk?

- Michael
From 9f1049a1e24cd1aa9852c5dddd1342dce0226416 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 7 Feb 2019 14:43:40 -0500
Subject: [PATCH] Add a test for GC marking bitmaps in tree view mode

gcc/ChangeLog:

2019-02-07  Michael Ploujnikov  

	PR middle-end/89150
	* bitmap.c (test_bitmap_tree_marking): New test.
	(NOT_NULL_OR_GARBAGE): For shortening
	test_bitmap_tree_marking.
	(bitmap_c_tests): Add test_bitmap_tree_marking.
---
 gcc/bitmap.c | 82 
 1 file changed, 82 insertions(+)

diff --git gcc/bitmap.c gcc/bitmap.c
index 5a8236de75..7e802a1600 100644
--- gcc/bitmap.c
+++ gcc/bitmap.c
@@ -2626,6 +2626,11 @@ bitmap_head::dump ()
 
 #if CHECKING_P
 
+static GTY(()) bitmap selftest_test_bitmap;
+
+/* From ggc-internal.h */
+extern bool ggc_force_collect;
+
 namespace selftest {
 
 /* Selftests for bitmaps.  */
@@ -2722,6 +2727,82 @@ test_bitmap_single_bit_set_p ()
   ASSERT_EQ (1066, bitmap_first_set_bit (b));
 }
 
+#define NOT_NULL_OR_GARBAGE(NODE) \
+  (NODE != NULL && (unsigned long)(NODE) != 0xa5a5a5a5UL)
+
+static void
+test_bitmap_tree_marking ()
+{
+  selftest_test_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap_tree_view (selftest_test_bitmap);
+
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 1*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 5*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 10*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 15*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 30*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 25*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 3*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 4*BITMAP_ELEMENT_ALL_BITS));
+  ASSERT_TRUE (bitmap_set_bit (selftest_test_bitmap, 26*BITMAP_ELEMENT_ALL_BITS));
+  /* At this point it should look like:
+  26
+ /  \
+25  30
+   /
+  5
+ / \
+4  15
+   /  /
+  3  10
+ /
+1
+
+  */
+  ggc_force_collect = true;
+  ggc_collect ();
+  ggc_force_collect = false;
+
+  ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first));
+  ASSERT_TRUE (selftest_test_bitmap->first->indx == 26);
+
+  ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->next));
+  ASSERT_TRUE (selftest_test_bitmap->first->next->indx == 30);
+  ASSERT_TRUE (selftest_test_bitmap->first->next->next == NULL);
+  ASSERT_TRUE (selftest_test_bitmap->first->next->prev == NULL);
+
+  ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev));
+  ASSERT_TRUE (selftest_test_bitmap->first->prev->indx == 25);
+  ASSERT_TRUE (selftest_test_bitmap->first->prev->next == NULL);
+
+  ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev->prev));
+  ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->indx == 5);
+
+  ASSERT_TRUE (NOT_NULL_OR_GARBAGE (selftest_test_bitmap->first->prev->prev->next));
+  ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->next->indx == 15);
+  ASSERT_TRUE (selftest_test_bitmap->first->prev->prev->next->next == NULL);
+
+  ASSERT_TRUE (NOT_NULL_OR_GAR

Re: [PATCH] Fix PR89150, GC of tree-form bitmaps

2019-02-08 Thread Michael Ploujnikov
On 2019-02-07 3:09 p.m., Jakub Jelinek wrote:
> On Thu, Feb 07, 2019 at 03:04:21PM -0500, Michael Ploujnikov wrote:
>> 2019-02-07  Michael Ploujnikov  
>>
>>  PR middle-end/89150
>>  * bitmap.c (test_bitmap_tree_marking): New test.
>>  (NOT_NULL_OR_GARBAGE): For shortening
>>  test_bitmap_tree_marking.
>>  (bitmap_c_tests): Add test_bitmap_tree_marking.
> 
> Could you do that instead in a plugin in the testsuite?
> I mean, the patch is adding garbage collection roots, so it will not affect
> just -fself-tests run, but also any time the compiler will do GC.
> 
>   Jakub
> 

I'm not sure what I would need to do to get gengtype to process a test
plugin source file and I can't find examples of this.

Instead, shouldn't I just do something like what's at the bottom of
gcc/ggc-tests.c and not worry about the extra root added to
gt-bitmap.h?


- Michael



signature.asc
Description: OpenPGP digital signature