Re: [testsuite] What's the expected behaviour of dg-require-effective-target shared?
On Fri, 21 Jun 2019 at 16:28, Iain Sandoe wrote:
>
> Hi Christophe,
>
> we’ve been looking at some cases where Darwin tests fail or pass unexpectedly
> depending on options. It came as a surprise to see it failing a test for
> shared support (since it’s always supported shared libs).
>
> -
>
> It’s a long time ago, but in r216117 you added this to target-supports.
>
> # Return 1 if -shared is supported, as in no warnings or errors
> # emitted, 0 otherwise.
>
> proc check_effective_target_shared { } {
>     # Note that M68K has a multilib that supports -fpic but not
>     # -fPIC, so we need to check both.  We test with a program that
>     # requires GOT references.
>     return [check_no_compiler_messages shared executable {
>         extern int foo (void); extern int bar;
>         int baz (void) { return foo () + bar; }
>     } "-shared -fpic"]
> }
>
> The thing is that this is testing two things:
>
> 1) if the target consumes -shared -fpic without warning
>
> 2) assuming that those cause a shared lib to be made it also tests that
> the target will allow a link of that to complete with undefined symbols.
>
> So Darwin *does* support “-shared -fpic” and is very happy to make
> shared libraries. However, it doesn’t (by default) allow undefined symbols
> in the link.
>
> So my question is really about the intent of the test:
>
> if the intent is to see if the target supports shared libs, then we should
> arrange for Darwin to pass - either by hardwiring it (since all Darwin
> versions do support shared) or by adding suitable options to suppress
> the error.
>
> if the intent is to check that the target supports linking a shared lib with
> undefined external symbols, then perhaps we need a different test for the
> “just supports shared libs”

===

The patch was posted in
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00596.html
so it's really a matter of testing whether shared libs are supported.

> (note, also the comment doesn’t match what’s actually done, but that’s prob
> a cut & pasto).

Indeed looks like I cut & pasted too much from check_effective_target_fpic

Thanks,

Christophe

> thanks
> Iain
Re: [PATCH] Add .gnu.lto_.meta section.
On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote:
>
> On 6/21/19 2:57 PM, Jan Hubicka wrote:
> > This looks like good step (and please stream it in host independent
> > way). I suppose all these issues can be done one-by-one.
>
> So there's a working patch for that. However one will see following errors
> when using an older compiler or older LTO bytecode:
>
> $ gcc main9.o -flto
> lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO
> version -25480.4493 instead of the expected 9.0
>
> $ gcc main.o
> lto1: internal compiler error: compressed stream: data error

This is because of your change to bitfields or because with the old
scheme the header with the version is compressed (is it?). I'd simply
avoid any layout changes in the version check range.

> To be honest, I would prefer the new .gnu.lto_.meta section.
> Richi why is that so ugly?

Because it's a change in the wrong direction and doesn't solve the
issue we already have (cannot determine if a section is compressed or
not). ELF section overhead is quite big if you have lots of small
functions.

Richard.

>
> Martin
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote:
>
> On 6/21/19 6:06 AM, Richard Biener wrote:
> > On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote:
> >>
> >> Bug 90923 shows that even though GCC hash-table based containers
> >> like hash_map can be instantiated on types with user-defined ctors
> >> and dtors they invoke the dtors of such types without invoking
> >> the corresponding ctors.
> >>
> >> It was thanks to this bug that I spent a day debugging "interesting"
> >> miscompilations during GCC bootstrap (in fairness, it was that and
> >> bug 90904 about auto_vec copy assignment/construction also being
> >> hosed even for POD types).
> >>
> >> The attached patch corrects the hash_map and hash_set templates
> >> to invoke the ctors of the elements they insert and makes them
> >> (hopefully) safe to use with non-trivial user-defined types.
> >
> > Hum. I am worried about the difference of assignment vs. construction
> > in ::put()
> >
> > +  bool ins = hash_entry::is_empty (*e);
> > +  if (ins)
> > +    {
> > +      e->m_key = k;
> > +      new ((void *) &e->m_value) Value (v);
> > +    }
> > +  else
> > +    e->m_value = v;
> >
> > why not invoke the dtor on the old value and then the ctor again?
>
> It wouldn't work for self-assignment:
>
>    Value &y = m.get_or_insert (key);
>    m.put (key, y);
>
> The usual rule of thumb for writes into containers is to use
> construction when creating a new element and assignment when
> replacing the value of an existing element.
>
> Which reminds me that the hash containers, despite being copy-
> constructible (at least for POD types, they aren't for user-
> defined types), also aren't safe for assignment even for PODs.
> I opened bug 90959 for this.  Until the assignment is fixed
> I made it inaccessible in the patch (I have fixed the copy
> ctor to DTRT for non-PODs).
>
> > How is an empty hash_entry constructed?
>
> In hash_table::find_slot_with_hash simply by finding an empty
> slot and returning a pointer to it.  The memory for the slot
> is marked "empty" by calling the Traits::mark_empty() function.
>
> The full type of hash_map is actually
>
>    hash_map<Key, Value,
>             simple_hashmap_traits<default_hash_traits<Key>, Value>>
>
> and simple_hashmap_traits delegates it to default_hash_traits
> whose mark_empty() just clears the void*, leaving the Value
> part uninitialized.  That makes sense because we don't want
> to call ctors for empty entries.  I think the questions one
> might ask if one were to extend the design are: a) what class
> should invoke the ctor/assignment and b) should it do it
> directly or via the traits?
>
> > ::remove() doesn't
> > seem to invoke the dtor either, instead it relies on the
> > traits::remove function?
>
> Yes.  There is no Traits::construct or assign or copy.  We
> could add them but I'm not sure I see to what end (there could
> be use cases, I just don't know enough about these classes to
> think of any).
>
> Attached is an updated patch with the additional minor fixes
> mentioned above.
>
> Martin
>
> PS I think we would get much better results by adopting
> the properly designed and tested standard library containers
> than by spending time trying to improve the design of these
> legacy classes.  For simple uses that don't need to integrate
> with the GC machinery the standard containers should be fine
> (plus, it'd provide us with greater motivation to improve
> them and the code GCC emits for their uses).  Unfortunately,
> to be able to use the hash-based containers we would need to
> upgrade to C++ 11.  Isn't it time yet?

I don't think so.
I'm also not sure if C++11 on its own is desirable or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. SLES 11 already struggles currently (GCC 4.3) but I'd no longer consider that important enough. Note any such change to libstdc++ containers should be complete and come with both code-size and compile-time and memory-usage measurements (both of GCC and other apps of course). Richard.
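The construct-versus-assign rule discussed above can be illustrated with a standalone sketch (not GCC's actual hash_map code; the slot layout and names here are stand-ins for a hash table slot and its Traits::is_empty check):

```cpp
#include <new>

template <typename Value>
struct slot
{
  alignas (Value) unsigned char storage[sizeof (Value)];
  bool occupied = false;

  Value &value () { return *reinterpret_cast<Value *> (storage); }

  void put (const Value &v)
  {
    if (!occupied)
      {
        // Fresh slot: the memory is raw, so run the copy ctor in place.
        new (static_cast<void *> (storage)) Value (v);
        occupied = true;
      }
    else
      // Existing entry: plain assignment.  A dtor-then-ctor sequence
      // here would read a destroyed object when v aliases the slot,
      // as in m.put (key, m.get_or_insert (key)).
      value () = v;
  }
};
```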
Re: [PATCH] Add .gnu.lto_.meta section.
On 6/24/19 2:02 PM, Richard Biener wrote:
> On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote:
>>
>> On 6/21/19 2:57 PM, Jan Hubicka wrote:
>>> This looks like good step (and please stream it in host independent
>>> way). I suppose all these issues can be done one-by-one.
>>
>> So there's a working patch for that. However one will see following errors
>> when using an older compiler or older LTO bytecode:
>>
>> $ gcc main9.o -flto
>> lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO
>> version -25480.4493 instead of the expected 9.0
>>
>> $ gcc main.o
>> lto1: internal compiler error: compressed stream: data error
>
> This is because of your change to bitfields or because with the old
> scheme the header with the
> version is compressed (is it?).

Because currently also the header is compressed.

> I'd simply avoid any layout changes
> in the version check range.

Well, then we have to find out how to distinguish between compression
algorithms.

>
>> To be honest, I would prefer the new .gnu.lto_.meta section.
>> Richi why is that so ugly?
>
> Because it's a change in the wrong direction and doesn't solve the
> issue we already
> have (cannot determine if a section is compressed or not).

That's not true, the .gnu.lto_.meta section will be always uncompressed
and we can also backport changes to older compiler that can read it and
print a proper error message about LTO bytecode version mismatch.

> ELF section overhead
> is quite big if you have lots of small functions.

My patch is actually shrinking space as I'm suggesting to add _one_
extra ELF section and remove the section header from all other LTO
sections. That will save space for all function sections.

Martin

>
> Richard.
>
>>
>> Martin
Re: [PATCH] Add .gnu.lto_.meta section.
On Mon, Jun 24, 2019 at 2:12 PM Martin Liška wrote:
>
> On 6/24/19 2:02 PM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote:
> >>
> >> On 6/21/19 2:57 PM, Jan Hubicka wrote:
> >>> This looks like good step (and please stream it in host independent
> >>> way). I suppose all these issues can be done one-by-one.
> >>
> >> So there's a working patch for that. However one will see following errors
> >> when using an older compiler or older LTO bytecode:
> >>
> >> $ gcc main9.o -flto
> >> lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO
> >> version -25480.4493 instead of the expected 9.0
> >>
> >> $ gcc main.o
> >> lto1: internal compiler error: compressed stream: data error
> >
> > This is because of your change to bitfields or because with the old
> > scheme the header with the
> > version is compressed (is it?).
>
> Because currently also the header is compressed.

That was it, yeah :/ Stupid decisions in the past.

I guess we have to bite the bullet and do this kind of incompatible
change, accepting the odd error message above.

> > I'd simply avoid any layout changes
> > in the version check range.
>
> Well, then we have to find out how to distinguish between compression
> algorithms.
>
> >
> >> To be honest, I would prefer the new .gnu.lto_.meta section.
> >> Richi why is that so ugly?
> >
> > Because it's a change in the wrong direction and doesn't solve the
> > issue we already
> > have (cannot determine if a section is compressed or not).
>
> That's not true, the .gnu.lto_.meta section will be always uncompressed and
> we can
> also backport changes to older compiler that can read it and print a proper
> error
> message about LTO bytecode version mismatch.

We can always backport changes, yes, but I don't see why we have to.

> > ELF section overhead
> > is quite big if you have lots of small functions.
>
> My patch is actually shrinking space as I'm suggesting to add _one_ extra ELF
> section
> and remove the section header from all other LTO sections. That will save
> space
> for all function sections.

But we want the header there to at least say if the section is
compressed or not. The fact that we have so many ELF sections means we
have the redundant version info everywhere.

We should have a single .gnu.lto_ section (and also get rid of those
__gnu_lto_v1 and __gnu_lto_slim COMMON symbols - checking for
existence of a symbol is more expensive compared to existence
of a section).

Richard.

> Martin
>
> >
> > Richard.
> >
> >>
> >> Martin
>
Parallelize GCC with Threads -- First Evaluation
Hi,

Parallelize GCC with Threads -- First Evaluation

Hi everyone,

I am attaching the first evaluation report here publicly for gathering
feedback. The file is in markdown format and it can easily be converted
to PDF for better visualization.

I am also open to suggestions and ideas in order to improve the current
project :-)

My branch can be seen here:
https://gitlab.com/flusp/gcc/tree/giulianob_parallel

Giuliano

# Parallelize GCC with Threads: First Evaluation Report

## Tasks accomplished until the First Evaluation:

These are the tasks which I accomplished until now:

- Refactor `cgraph_node::expand()` and `expand_all_functions`, splitting IPA, GIMPLE and RTL passes
- Some documentation about the global states of the GIMPLE passes.

This accomplishes all items planned for the first evaluation. Here I explain some technical challenges I was put through and architectural decisions that I've made.

## Chapter 1 -- Splitting `node::expand`

This chapter discusses the technical issues found when splitting `node::expand` into `node::expand_ipa`, `node::expand`, and `node::expand_rtl`.

The first step was to split `all_passes` into `all_passes` and `all_rtl_passes`. This was straightforward: I had to modify the `passes.def` file, add a new pass list there, and call the constructor in `passes.c`. Unfortunately I could not reuse `pass_rest_of_compilation`, as suggested by Richi, therefore I've created a new pass.

The next step was to split `node::expand` into `node::expand_ipa`, `node::expand` and `node::expand_rtl`. These names follow this distinction:

- `expand_ipa`: passes related to interprocedural analysis (IPA).
- `expand`: intraprocedural GIMPLE passes (`all_passes`).
- `expand_rtl`: intraprocedural RTL passes (`all_rtl_passes`).

This was somewhat straightforward, since I just had to store the `location_t input_location` in the `struct function`. Unfortunately, executing `node::expand_ipa` on every function before `node::expand`, and `node::expand_rtl` on every function after `node::expand`, was way harder, since I had to modify lots of things, such as:

1. The `default_obstack`: There is a pass called `pass_fixup_cfg` which is called in both `pass_local_optimization_passes` and `all_passes`. This pass uses a global shared object called `default_obstack`, which is modified every time it runs into a new function. The solution was to create a local obstack for each function, storing it in the `struct function`. Unfortunately, this broke several tests related to IPA, which made me define a new compiler state called `EXPAND_IPA`, then select the default obstack when the compiler is in this state, or the local obstack when it is in `EXPAND` state, fixing the issue.

2. Adjust the cfun stack to push the cfun before executing the passes, and pop it when it finishes. I've done this for `node::expand_ipa`, `node::expand`, and `node::expand_rtl`.

3. Set the `gimple_register_cfg_hooks` when `node::expand` is called.

4. Store the `input_location` in the `struct function`.

This allowed me to refactor the loop in `expand_all_functions`:

```
FOR_EACH_CGRAPH_NODE(node)
  {
    node->expand_ipa();
    node->expand();
    node->expand_rtl();
  }
```

into:

```
FOR_EACH_CGRAPH_NODE(node)
  node->expand_ipa();
FOR_EACH_CGRAPH_NODE(node)
  node->expand();
FOR_EACH_CGRAPH_NODE(node)
  node->expand_rtl();
```

enabling me to proceed with the parallelization.

## Chapter 2 -- Parallelization Decisions

After the work done in Chapter 1, I started working on how I will parallelize the compiler.
Richi suggested a pipelined architecture: split the passes into threads and pass the cfun across them. When one thread finishes applying its pass to one cfun, it sends the cfun to the next thread, which holds the next pass, and so on until all passes are done. This has the advantage of not requiring a per-pass analysis of global states, but it has the disadvantage of being slower (just imagine some passes being slower than all the others).

What I propose is to compile functions in parallel. Since at this point we are executing intraprocedural optimizations, this is theoretically possible. The issue is that this is way harder to implement, since it requires looking into the global states of every local pass.

With this in mind, I created a `worker_set` structure that implements a producer-consumer threadsafe queue: one thread (the master) inserts all cfuns into a queue, and the other threads (the workers) are responsible for removing the cfuns from the queue and processing them. When a thread finishes its work, it will automatically remove the next function from the queue until it finds a NULL function, which indicates that there is no more work in the queue. Regarding implementation details, I am using pthreads and unix semaphores.

After this structure was finished, I m
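The producer-consumer queue described in the report could look roughly like this, using pthreads and POSIX semaphores as stated there (a sketch only; the field and function names are assumptions, not the actual worker_set code in the branch):

```cpp
#include <pthread.h>
#include <semaphore.h>
#include <queue>

struct function;  // stand-in forward declaration for GCC's struct function

struct worker_set
{
  pthread_mutex_t lock;
  sem_t avail;                     // counts queued cfuns and sentinels
  std::queue<function *> q;

  worker_set ()
  {
    pthread_mutex_init (&lock, NULL);
    sem_init (&avail, 0, 0);
  }

  // Master side: enqueue one cfun; a NULL entry tells one worker to stop.
  void push (function *fn)
  {
    pthread_mutex_lock (&lock);
    q.push (fn);
    pthread_mutex_unlock (&lock);
    sem_post (&avail);
  }

  // Worker side: block until something is queued, then take it.
  function *pop ()
  {
    sem_wait (&avail);
    pthread_mutex_lock (&lock);
    function *fn = q.front ();
    q.pop ();
    pthread_mutex_unlock (&lock);
    return fn;                     // NULL means no more work
  }
};
```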
Re: [PATCH] Add .gnu.lto_.meta section.
On 6/24/19 2:44 PM, Richard Biener wrote: > On Mon, Jun 24, 2019 at 2:12 PM Martin Liška wrote: >> >> On 6/24/19 2:02 PM, Richard Biener wrote: >>> On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote: On 6/21/19 2:57 PM, Jan Hubicka wrote: > This looks like good step (and please stream it in host independent > way). I suppose all these issues can be done one-by-one. So there's a working patch for that. However one will see following errors when using an older compiler or older LTO bytecode: $ gcc main9.o -flto lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO version -25480.4493 instead of the expected 9.0 $ gcc main.o lto1: internal compiler error: compressed stream: data error >>> >>> This is because of your change to bitfields or because with the old >>> scheme the header with the >>> version is compressed (is it?). >> >> Because currently also the header is compressed. > > That was it, yeah :/ Stupid decisions in the past. > > I guess we have to bite the bullet and do this kind of incompatible > change, accepting > the odd error message above. > >>> I'd simply avoid any layout changes >>> in the version check range. >> >> Well, then we have to find out how to distinguish between compression >> algorithms. >> >>> To be honest, I would prefer the new .gnu.lto_.meta section. Richi why is that so ugly? >>> >>> Because it's a change in the wrong direction and doesn't solve the >>> issue we already >>> have (cannot determine if a section is compressed or not). >> >> That's not true, the .gnu.lto_.meta section will be always uncompressed and >> we can >> also backport changes to older compiler that can read it and print a proper >> error >> message about LTO bytecode version mismatch. > > We can always backport changes, yes, but I don't see why we have to. I'm fine with the backward compatibility break. But we should also consider lto-plugin.c that is parsing following 2 sections: 91 #define LTO_SECTION_PREFIX ".gnu.lto_.symtab" 92 #define LTO_SECTION_PREFIX_LEN (sizeof (LTO_SECTION_PREFIX) - 1) 93 #define OFFLOAD_SECTION ".gnu.offload_lto_.opts" 94 #define OFFLOAD_SECTION_LEN (sizeof (OFFLOAD_SECTION) - 1) > >>> ELF section overhead >>> is quite big if you have lots of small functions. >> >> My patch is actually shrinking space as I'm suggesting to add _one_ extra >> ELF section >> and remove the section header from all other LTO sections. That will save >> space >> for all function sections. > > But we want the header there to at least say if the section is > compressed or not. > The fact that we have so many ELF section means we have the redundant version > info everywhere. > > We should have a single .gnu.lto_ section (and also get rid of those > __gnu_lto_v1 and __gnu_lto_slim COMMON symbols - checking for > existence of a symbol is more expensive compared to existence > of a section). I like removal of the 2 aforementioned sections. To be honest I would recommend to add a new .gnu.lto_.meta section. We can use it instead of __gnu_lto_v1 and we can have a flag there instead of __gnu_lto_slim. As a second step, I'm willing to concatenate all LTO_section_function_body, LTO_section_static_initializer sections into a single one. That will require an index that will have to be created. I can discuss that with Honza as he suggested using something smarter than function names. Thoughts? Martin > > Richard. > >> Martin >> >>> >>> Richard. >>> Martin >>
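For concreteness, the always-uncompressed metadata section being discussed could carry something like the following (a hypothetical sketch of the idea only, not a committed format; all field names and widths here are assumptions):

```c
#include <stdint.h>

/* Hypothetical layout for an uncompressed .gnu.lto_.meta section.
   The version fields can be checked before anything is decompressed,
   the "slim" flag would replace the __gnu_lto_slim symbol, and the
   compression field records which algorithm the payload uses.  */
struct lto_meta_section
{
  uint32_t magic;          /* identifies GCC LTO bytecode */
  uint16_t major_version;  /* readable by old and new compilers alike */
  uint16_t minor_version;
  uint8_t slim;            /* 1 for objects without fat native code */
  uint8_t compression;     /* 0 = none, 1 = zlib, ... */
};
```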
Re: [PATCH] Add .gnu.lto_.meta section.
> On 24 Jun 2019, at 14:31, Martin Liška wrote: > > On 6/24/19 2:44 PM, Richard Biener wrote: >> On Mon, Jun 24, 2019 at 2:12 PM Martin Liška wrote: >>> >>> On 6/24/19 2:02 PM, Richard Biener wrote: On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote: > > On 6/21/19 2:57 PM, Jan Hubicka wrote: >> This looks like good step (and please stream it in host independent >> way). I suppose all these issues can be done one-by-one. > > So there's a working patch for that. However one will see following errors > when using an older compiler or older LTO bytecode: > > $ gcc main9.o -flto > lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO > version -25480.4493 instead of the expected 9.0 > > $ gcc main.o > lto1: internal compiler error: compressed stream: data error This is because of your change to bitfields or because with the old scheme the header with the version is compressed (is it?). >>> >>> Because currently also the header is compressed. >> >> That was it, yeah :/ Stupid decisions in the past. >> >> I guess we have to bite the bullet and do this kind of incompatible >> change, accepting >> the odd error message above. >> I'd simply avoid any layout changes in the version check range. >>> >>> Well, then we have to find out how to distinguish between compression >>> algorithms. >>> > To be honest, I would prefer the new .gnu.lto_.meta section. > Richi why is that so ugly? Because it's a change in the wrong direction and doesn't solve the issue we already have (cannot determine if a section is compressed or not). >>> >>> That's not true, the .gnu.lto_.meta section will be always uncompressed and >>> we can >>> also backport changes to older compiler that can read it and print a proper >>> error >>> message about LTO bytecode version mismatch. >> >> We can always backport changes, yes, but I don't see why we have to. > > I'm fine with the backward compatibility break. But we should also consider > lto-plugin.c > that is parsing following 2 sections: > >91 #define LTO_SECTION_PREFIX ".gnu.lto_.symtab" >92 #define LTO_SECTION_PREFIX_LEN (sizeof (LTO_SECTION_PREFIX) - 1) >93 #define OFFLOAD_SECTION ".gnu.offload_lto_.opts" >94 #define OFFLOAD_SECTION_LEN (sizeof (OFFLOAD_SECTION) - 1) > >> ELF section overhead is quite big if you have lots of small functions. >>> >>> My patch is actually shrinking space as I'm suggesting to add _one_ extra >>> ELF section >>> and remove the section header from all other LTO sections. That will save >>> space >>> for all function sections. >> >> But we want the header there to at least say if the section is >> compressed or not. >> The fact that we have so many ELF section means we have the redundant version >> info everywhere. >> >> We should have a single .gnu.lto_ section (and also get rid of those >> __gnu_lto_v1 and __gnu_lto_slim COMMON symbols - checking for >> existence of a symbol is more expensive compared to existence >> of a section). > > I like removal of the 2 aforementioned sections. To be honest I would > recommend to > add a new .gnu.lto_.meta section. We can use it instead of __gnu_lto_v1 and > we can > have a flag there instead of __gnu_lto_slim. As a second step, I'm willing to > concatenate all > > LTO_section_function_body, > LTO_section_static_initializer > > sections into a single one. That will require an index that will have to be > created. I can discuss > that with Honza as he suggested using something smarter than function names. 
I already implemented a scheme (using three sections: INDEX, NAMES, PAYLOAD) for Mach-O - since it doesn’t have unlimited section count - it works - and hardly rocket science ;) - if one were to import the tabular portion of that at the start of a section and then the variable portion as a trailer … it could all be a single section. iain > Martin > >> >> Richard. >> >>> Martin >>> Richard. > > Martin
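A per-function index of the kind described (INDEX entries pointing into a NAMES blob and a single PAYLOAD) could be as small as one record per function; the sketch below is an illustration with invented field names, not the actual Mach-O scheme:

```c
#include <stdint.h>

/* One entry of a hypothetical function index for a concatenated LTO
   payload section.  */
struct lto_index_entry
{
  uint32_t name_offset;   /* offset of the symbol name in the NAMES blob */
  uint32_t body_offset;   /* offset of the function body in the PAYLOAD */
  uint32_t body_size;     /* bytes of (possibly compressed) bytecode */
};
```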
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On 6/24/19 6:11 AM, Richard Biener wrote: On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: On 6/21/19 6:06 AM, Richard Biener wrote: On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. Hum. I am worried about the difference of assignment vs. construction in ::put() + bool ins = hash_entry::is_empty (*e); + if (ins) + { + e->m_key = k; + new ((void *) &e->m_value) Value (v); + } + else + e->m_value = v; why not invoke the dtor on the old value and then the ctor again? It wouldn't work for self-assignment: Value &y = m.get_or_insert (key); m.put (key, y); The usual rule of thumb for writes into containers is to use construction when creating a new element and assignment when replacing the value of an existing element. Which reminds me that the hash containers, despite being copy- constructible (at least for POD types, they aren't for user- defined types), also aren't safe for assignment even for PODs. I opened bug 90959 for this. Until the assignment is fixed I made it inaccessibe in the patch (I have fixed the copy ctor to DTRT for non-PODs). How is an empty hash_entry constructed? In hash_table::find_slot_with_hash simply by finding an empty slot and returning a pointer to it. The memory for the slot is marked "empty" by calling the Traits::mark_empty() function. The full type of hash_map is actually hash_map, Value> and simple_hashmap_traits delegates it to default_hash_traits whose mark_empty() just clears the void*, leaving the Value part uninitialized. That makes sense because we don't want to call ctors for empty entries. I think the questions one might ask if one were to extend the design are: a) what class should invoke the ctor/assignment and b) should it do it directly or via the traits? ::remove() doesn't seem to invoke the dtor either, instead it relies on the traits::remove function? Yes. There is no Traits::construct or assign or copy. We could add them but I'm not sure I see to what end (there could be use cases, I just don't know enough about these classes to think of any). Attached is an updated patch with the additional minor fixes mentioned above. Martin PS I think we would get much better results by adopting the properly designed and tested standard library containers than by spending time trying to improve the design of these legacy classes. For simple uses that don't need to integrate with the GC machinery the standard containers should be fine (plus, it'd provide us with greater motivation to improve them and the code GCC emits for their uses). Unfortunately, to be able to use the hash-based containers we would need to upgrade to C++ 11. Isn't it time yet? I don't think so. I'm also not sure if C++11 on its own is desirable or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. 
SLES 11 already struggles currently (GCC 4.3) but I'd no longer consider that important enough. Note any such change to libstdc++ containers should be complete and come with both code-size and compile-time and memory-usage measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? Martin
RE: gcc: -ftest-coverage and -auxbase
> From: Martin Liška
> Sent: Thursday, June 20, 2019 9:49 AM
> I see. What about the following patch:
>
> $ ./gcc/xgcc -Bgcc /tmp/main.c --coverage -fprofile-note=/tmp/main.gcno
> $ ls -l /tmp/main.gcno
> -rw-r--r-- 1 marxin users 428 Jun 20 15:48 /tmp/main.gcno

Thanks. That did the trick.

Next, I will be looking at modifying GCC to do one or both of two things --

. put the coverage variables into a section named on the command line
  instead of .data. The .data section would remain the default.

. when requested on the command line, initialize the instrumentation
  variables at run time instead of at compile time. Compile time would
  remain the default.

I haven't looked into the implementation of either, but the former feels
easy. The latter is much more useful for us, but I expect it is also
significantly harder. My thinking is that I will implement the former and
will look into how hard the latter is. Depending on the outcome / other
demands on my time, I might implement it as well.

Background (or why do I want this):

Our use is embedded. We disallow .data. Specialized data, for example,
read-only data or data that goes into a different named section, is
allowed. (Part of why we disallow .data is that we support warm boot, not
just cold boot.)

If neither was implemented, then for files currently without .data, we
could alter the linker script. This would mean that files with .data
would be uninstrumentable. Not ideal.

If the former was implemented, then any file which is not time critical
nor containing code called by the instrumentation code would be
instrumentable.

If the latter was implemented, then that would facilitate clearing the
data on demand. Since one of the goals is not just to collect coverage
data but to also learn what coverage is provided by each test or
collection of tests, that is greatly facilitated by being able to clear
the accumulated results on demand. A cold boot would take 20-30 minutes
and you might need to 'set up' the tests...
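To make the two proposals concrete, this is roughly their effect written by hand at the C level (an illustration only; the compiler would emit this itself, and the section name ".gcov_ctrs" is invented for the example):

```c
#include <string.h>

/* First proposal: coverage counters placed in a named section
   instead of .data.  */
static unsigned long long counters[8]
  __attribute__ ((section (".gcov_ctrs")));

/* Second proposal: (re)initialize the instrumentation data at run
   time, e.g. to clear accumulated results between test runs without
   a 20-30 minute cold boot.  */
void
coverage_reset (void)
{
  memset (counters, 0, sizeof counters);
}
```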
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: > > On 6/24/19 6:11 AM, Richard Biener wrote: > > On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: > >> > >> On 6/21/19 6:06 AM, Richard Biener wrote: > >>> On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: > > Bug 90923 shows that even though GCC hash-table based containers > like hash_map can be instantiated on types with user-defined ctors > and dtors they invoke the dtors of such types without invoking > the corresponding ctors. > > It was thanks to this bug that I spent a day debugging "interesting" > miscompilations during GCC bootstrap (in fairness, it was that and > bug 90904 about auto_vec copy assignment/construction also being > hosed even for POD types). > > The attached patch corrects the hash_map and hash_set templates > to invoke the ctors of the elements they insert and makes them > (hopefully) safe to use with non-trivial user-defined types. > >>> > >>> Hum. I am worried about the difference of assignment vs. construction > >>> in ::put() > >>> > >>> + bool ins = hash_entry::is_empty (*e); > >>> + if (ins) > >>> + { > >>> + e->m_key = k; > >>> + new ((void *) &e->m_value) Value (v); > >>> + } > >>> + else > >>> + e->m_value = v; > >>> > >>> why not invoke the dtor on the old value and then the ctor again? > >> > >> It wouldn't work for self-assignment: > >> > >> Value &y = m.get_or_insert (key); > >> m.put (key, y); > >> > >> The usual rule of thumb for writes into containers is to use > >> construction when creating a new element and assignment when > >> replacing the value of an existing element. > >> > >> Which reminds me that the hash containers, despite being copy- > >> constructible (at least for POD types, they aren't for user- > >> defined types), also aren't safe for assignment even for PODs. > >> I opened bug 90959 for this. Until the assignment is fixed > >> I made it inaccessibe in the patch (I have fixed the copy > >> ctor to DTRT for non-PODs). > >> > >>> How is an empty hash_entry constructed? > >> > >> In hash_table::find_slot_with_hash simply by finding an empty > >> slot and returning a pointer to it. The memory for the slot > >> is marked "empty" by calling the Traits::mark_empty() function. > >> > >> The full type of hash_map is actually > >> > >> hash_map >> Value, > >> simple_hashmap_traits, > >>Value> > >> > >> and simple_hashmap_traits delegates it to default_hash_traits > >> whose mark_empty() just clears the void*, leaving the Value > >> part uninitialized. That makes sense because we don't want > >> to call ctors for empty entries. I think the questions one > >> might ask if one were to extend the design are: a) what class > >> should invoke the ctor/assignment and b) should it do it > >> directly or via the traits? > >> > >>> ::remove() doesn't > >>> seem to invoke the dtor either, instead it relies on the > >>> traits::remove function? > >> > >> Yes. There is no Traits::construct or assign or copy. We > >> could add them but I'm not sure I see to what end (there could > >> be use cases, I just don't know enough about these classes to > >> think of any). > >> > >> Attached is an updated patch with the additional minor fixes > >> mentioned above. > >> > >> Martin > >> > >> PS I think we would get much better results by adopting > >> the properly designed and tested standard library containers > >> than by spending time trying to improve the design of these > >> legacy classes. 
For simple uses that don't need to integrate > >> with the GC machinery the standard containers should be fine > >> (plus, it'd provide us with greater motivation to improve > >> them and the code GCC emits for their uses). Unfortunately, > >> to be able to use the hash-based containers we would need to > >> upgrade to C++ 11. Isn't it time yet? > > > > I don't think so. I'm also not sure if C++11 on its own is desirable > > or if it should be C++14 or later at that point. SLES 12 has GCC 4.8 > > as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. > > SLES 11 already struggles currently (GCC 4.3) but I'd no longer > > consider that important enough. > > > > Note any such change to libstdc++ containers should be complete > > and come with both code-size and compile-time and memory-usage > > measurements (both of GCC and other apps of course). > > Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + { + e->m_key = k; + new ((void *)&e->m_value) Value (); + } this now requires a default constructor and I always forget about differences between the different form of init
GSoC Project: Make C/C++ not automatically promote memory_order_consume to memory_order_acquire
Hello everyone,

We are working on the project to stop memory_order_consume from being
automatically promoted to memory_order_acquire. In implementing
memory_order_consume, the problem comes when tracing dependencies at the
C/C++ source level. We are first targeting tracing the dependencies only
for pointer types at the C source level. To mark the pointers at the C
source level, we thought of introducing a new qualifier, _Dependent_ptr,
for pointer type variables only. We believe adding this qualifier will
allow us to trace dependencies and not break any of the pointer
optimizations.

For a start, we have implemented the _Dependent_ptr qualifier in the GCC
C front-end. The patch is given in this link:
https://github.com/AKG001/gcc/commit/fb4187bc3872a50880159232cf336f0a03505fa8
and given below also.

We have also prepared some testcases for checking whether _Dependent_ptr
parses correctly or not. A simple test case is present on this
https://github.com/AKG001/gcc/commit/193a040df649d773fab5cb5e2698e9729dfd2b97
Some litmus test cases from the document
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0190r4.pdf)
are present in this commit
https://github.com/AKG001/gcc/commit/b4c3c55b11597bde8ea5b297343ab349caedb6a4
For litmus test cases, the file name convention is: _.c
The full set of testcases is present in this repository
https://github.com/AKG001/test/. The .original files are also present
for each test case. Command used to generate the .original file:
./cc1 -fdump-tree-original .c

We request the community to give us their valuable feedback. Also, let
us know if something is wrong or can be done in a better way.

Thanks,
Akshat

---

commit fb4187bc3872a50880159232cf336f0a03505fa8
Author: Akshat Garg
Date:   Sat Jun 8 15:06:23 2019 +0530

    Add _Dependent_ptr qualifier

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 4057be3..6d37851 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -345,6 +345,7 @@ const struct c_common_resword c_common_reswords[] =
   { "_Alignas",         RID_ALIGNAS,   D_CONLY },
   { "_Alignof",         RID_ALIGNOF,   D_CONLY },
   { "_Atomic",          RID_ATOMIC,    D_CONLY },
+  { "_Dependent_ptr",   RID_DEPENDENT_PTR, 0 },
   { "_Bool",            RID_BOOL,      D_CONLY },
   { "_Complex",         RID_COMPLEX,   0 },
   { "_Imaginary",       RID_IMAGINARY, D_CONLY },
@@ -7845,6 +7846,7 @@ keyword_is_type_qualifier (enum rid keyword)
     case RID_VOLATILE:
     case RID_RESTRICT:
     case RID_ATOMIC:
+    case RID_DEPENDENT_PTR:
       return true;
     default:
       return false;
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 1cf2cae..a4c7e03 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -68,7 +68,7 @@ enum rid
   RID_UNSIGNED, RID_LONG,    RID_CONST, RID_EXTERN,
   RID_REGISTER, RID_TYPEDEF, RID_SHORT, RID_INLINE,
   RID_VOLATILE, RID_SIGNED,  RID_AUTO,  RID_RESTRICT,
-  RID_NORETURN, RID_ATOMIC,
+  RID_NORETURN, RID_ATOMIC, RID_DEPENDENT_PTR,

   /* C extensions */
   RID_COMPLEX, RID_THREAD, RID_SAT,
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index b10599f..3ac3a21 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -3189,6 +3189,7 @@ check_format_types (const substring_loc &fmt_loc,
           && (TYPE_READONLY (cur_type)
               || TYPE_VOLATILE (cur_type)
               || TYPE_ATOMIC (cur_type)
+              || TYPE_DEPENDENT_PTR (cur_type)
               || TYPE_RESTRICT (cur_type)))
         warning (OPT_Wformat_, "extra type qualifiers in format "
                  "argument (argument %d)",
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 3e25624..e034a8f 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -181,6 +181,8 @@ pp_c_cv_qualifiers (c_pretty_printer *pp, int qualifiers, bool func_type)

   if (qualifiers & TYPE_QUAL_ATOMIC)
     pp_c_ws_string (pp, "_Atomic");
+  if (qualifiers & TYPE_QUAL_DEPENDENT_PTR)
+    pp_c_ws_string (pp, "_Dependent_ptr");
   if (qualifiers & TYPE_QUAL_CONST)
     pp_c_ws_string (pp, func_type ? "__attribute__((const))" : "const");
   if (qualifiers & TYPE_QUAL_VOLATILE)
diff --git a/gcc/c/c-aux-info.c b/gcc/c/c-aux-info.c
index 96bb2e2..6632447 100644
--- a/gcc/c/c-aux-info.c
+++ b/gcc/c/c-aux-info.c
@@ -284,6 +284,8 @@ gen_type (const char *ret_val, tree t, formals_style style)
     case POINTER_TYPE:
       if (TYPE_ATOMIC (t))
         ret_val = concat ("_Atomic ", ret_val, NULL);
+      if (TYPE_DEPENDENT_PTR (t))
+        ret_val = concat ("_Dependent_ptr ", ret_val, NULL);
       if (TYPE_READONLY (t))
         ret_val = concat ("const ", ret_val, NULL);
       if (TYPE_VOLATILE (t))
@@ -427,6 +429,8 @@ gen_type (const char *ret_val, tree t, formals_style style)
     }
   if (TYPE_ATOMIC (t))
     ret_val = concat ("_Atomic ", ret_val, NULL);
+  if (TYPE_DEPENDENT_PTR (t))
+    ret_val = concat ("_Dependent_ptr ", ret_
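For context, the intended usage of the proposed qualifier, modeled on the message-passing litmus tests in P0190R4, looks roughly like the following (a sketch; the exact binding and placement rules for the qualifier are part of what the patch under review defines, and the code only compiles with the patched front-end):

```c
#include <stdatomic.h>

struct msg { int data; };
static _Atomic (struct msg *) head;

extern void consume_data (int);

void
reader (void)
{
  /* The qualifier marks p as carrying a dependency, so the compiler
     must not break the data dependency from the consume load to the
     dereference; on ARM/PowerPC no acquire barrier is then needed.  */
  struct msg *_Dependent_ptr p
    = atomic_load_explicit (&head, memory_order_consume);
  if (p)
    consume_data (p->data);
}
```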
Re: [PATCH] Add .gnu.lto_.meta section.
On Mon, Jun 24, 2019 at 3:31 PM Martin Liška wrote: > > On 6/24/19 2:44 PM, Richard Biener wrote: > > On Mon, Jun 24, 2019 at 2:12 PM Martin Liška wrote: > >> > >> On 6/24/19 2:02 PM, Richard Biener wrote: > >>> On Fri, Jun 21, 2019 at 4:01 PM Martin Liška wrote: > > On 6/21/19 2:57 PM, Jan Hubicka wrote: > > This looks like good step (and please stream it in host independent > > way). I suppose all these issues can be done one-by-one. > > So there's a working patch for that. However one will see following > errors > when using an older compiler or older LTO bytecode: > > $ gcc main9.o -flto > lto1: fatal error: bytecode stream in file ‘main9.o’ generated with LTO > version -25480.4493 instead of the expected 9.0 > > $ gcc main.o > lto1: internal compiler error: compressed stream: data error > >>> > >>> This is because of your change to bitfields or because with the old > >>> scheme the header with the > >>> version is compressed (is it?). > >> > >> Because currently also the header is compressed. > > > > That was it, yeah :/ Stupid decisions in the past. > > > > I guess we have to bite the bullet and do this kind of incompatible > > change, accepting > > the odd error message above. > > > >>> I'd simply avoid any layout changes > >>> in the version check range. > >> > >> Well, then we have to find out how to distinguish between compression > >> algorithms. > >> > >>> > To be honest, I would prefer the new .gnu.lto_.meta section. > Richi why is that so ugly? > >>> > >>> Because it's a change in the wrong direction and doesn't solve the > >>> issue we already > >>> have (cannot determine if a section is compressed or not). > >> > >> That's not true, the .gnu.lto_.meta section will be always uncompressed > >> and we can > >> also backport changes to older compiler that can read it and print a > >> proper error > >> message about LTO bytecode version mismatch. > > > > We can always backport changes, yes, but I don't see why we have to. > > I'm fine with the backward compatibility break. But we should also consider > lto-plugin.c > that is parsing following 2 sections: > > 91 #define LTO_SECTION_PREFIX ".gnu.lto_.symtab" > 92 #define LTO_SECTION_PREFIX_LEN (sizeof (LTO_SECTION_PREFIX) - 1) > 93 #define OFFLOAD_SECTION ".gnu.offload_lto_.opts" > 94 #define OFFLOAD_SECTION_LEN (sizeof (OFFLOAD_SECTION) - 1) Yeah, I know. And BFD and gold hard-coded those __gnu_lto_{v1,slim} symbols... > > > >>> ELF section overhead > >>> is quite big if you have lots of small functions. > >> > >> My patch is actually shrinking space as I'm suggesting to add _one_ extra > >> ELF section > >> and remove the section header from all other LTO sections. That will save > >> space > >> for all function sections. > > > > But we want the header there to at least say if the section is > > compressed or not. > > The fact that we have so many ELF section means we have the redundant > > version > > info everywhere. > > > > We should have a single .gnu.lto_ section (and also get rid of those > > __gnu_lto_v1 and __gnu_lto_slim COMMON symbols - checking for > > existence of a symbol is more expensive compared to existence > > of a section). > > I like removal of the 2 aforementioned sections. To be honest I would > recommend to > add a new .gnu.lto_.meta section. Why .meta? Why not just .gnu.lto_? > We can use it instead of __gnu_lto_v1 and we can > have a flag there instead of __gnu_lto_slim. 
As a second step, I'm willing to > concatenate all > > LTO_section_function_body, > LTO_section_static_initializer > > sections into a single one. That will require an index that will have to be > created. I can discuss > that with Honza as he suggested using something smarter than function names. I think the index belongs to symtab? Let's properly do it if we want to change it. Removing of __gnu_lto_v1/slim is going to be the most intrusive change btw. and orthogonal to the section changes. Richard. > > Thoughts? > Martin > > > > > Richard. > > > >> Martin > >> > >>> > >>> Richard. > >>> > > Martin > >> >
Re: Parallelize GCC with Threads -- First Evaluation
On 2019-06-24 8:59 a.m., Giuliano Belinassi wrote:
> Hi,
>
> Parallelize GCC with Threads -- First Evaluation
>
> Hi everyone,
>
> I am attaching the first evaluation report here publicly for gathering
> feedback. The file is in markdown format and it can easily be converted
> to PDF for better visualization.
>
> I am also open to suggestions and ideas in order to improve the current
> project :-)
>
> My branch can be seen here:
> https://gitlab.com/flusp/gcc/tree/giulianob_parallel
>
> Giuliano

Giuliano,

Three things. First, your original proposal was just for
expand_all_functions, so I don't know if it's extended out now, but there
are other parts in my research, so the title was a little confusing.

I'm assuming this is outside the scope of the current project, but does
your all_rtl_passes function help out with architecture-specific tuning
flags? My research indicates that's one area of shared state.

In addition, for memory (this may be really hard to do), can you have a
signaler that tells each phase what data to pass on, thereby notifying
what needs to be passed on to the next pass? So if expand_all_functions
needs to pass x, the signaler will notify the pass and just swap the
values into the pass locklessly if possible, or just kill it off if not.
This would mean writing a GENERIC-to-RTL final-passes signaler, which may
take too long considering the scope of the project.

Again, that's just off the top of my head, so it may be a really bad
idea,

Nick

P.S. Good luck though.
Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
Hi,

I'm not very familiar with the gomp plugin system. However, looking at
'GOMP_PLUGIN_target_task_completion', it seems like tasks have to go in
and out of the runtime. In that case, is it right that the tasks have to
know which queue they came from? I think I'll have to add the id of the
corresponding queue of each task to the gomp_task structure.

Ray Kim
Re: Re: [GSoC'19, libgomp work-stealing] Task parallelism runtime
On Tue, Jun 25, 2019 at 04:55:17AM +0900, 김규래 wrote:
> I'm not very familiar with the gomp plugin system.
> However, looking at 'GOMP_PLUGIN_target_task_completion', it seems like
> tasks have to go in and out of the runtime.
> In that case, is it right that the tasks have to know which queue they
> came from?
> I think I'll have to add the id of the corresponding queue of each task
> to the gomp_task structure.

While libgomp has a plugin system, the only supported plugins are those
in the tree, i.e. libgomp/plugin/plugin-{hsa,nvptx}.c and
liboffloadmic/plugin/*. The nvptx plugin doesn't have async support ATM,
so it is just hsa and xeonphi offloading that can call it when an
asynchronous target region execution is over.

No matter in which task queue the task is, gomp_target_task_completion
needs to ensure that if something already waits on it (taskwait,
taskgroup end, barrier, dependency wait), it is awakened.

And, like for other parts of task.c, there needs to be a design for which
lock is used to protect any code that needs to be guarded. The current
code as you know uses team->task_lock as a big lock. I think with the
separate workqueues + work stealing you need a per-implicit-task lock
plus the per-team lock (team->task_lock); design the locking such that
there is no ABBA deadlock possibility and that you use the team task lock
only when really necessary (not sure, but perhaps one example where I
really don't see much way to avoid the per-team lock is task
dependencies, the hash table for that etc.).

Jakub
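A minimal sketch of the queue-local locking this suggests, where only one queue lock is ever held at a time so queue locks cannot form a cycle, and shared team state would take team->task_lock instead (placeholder types and names, not actual libgomp code):

```c
#include <pthread.h>

/* Placeholder task type; the real gomp_task is far richer.  */
struct gomp_task
{
  struct gomp_task *next;
};

/* Per-implicit-task work queue with its own lock.  Anything touching
   genuinely shared team state (e.g. the dependency hash table) would
   take the per-team team->task_lock instead, and never while a queue
   lock is held, so no ABBA ordering can arise.  */
struct task_queue
{
  pthread_mutex_t lock;
  struct gomp_task *head;
};

struct gomp_task *
steal_task (struct task_queue *victim)
{
  pthread_mutex_lock (&victim->lock);
  struct gomp_task *t = victim->head;
  if (t)
    victim->head = t->next;
  pthread_mutex_unlock (&victim->lock);
  return t;
}
```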
Re: [PATCH] let hash-based containers work with non-trivial types (PR 90923)
On 6/24/19 11:42 AM, Richard Biener wrote: On Mon, Jun 24, 2019 at 4:35 PM Martin Sebor wrote: On 6/24/19 6:11 AM, Richard Biener wrote: On Fri, Jun 21, 2019 at 7:17 PM Martin Sebor wrote: On 6/21/19 6:06 AM, Richard Biener wrote: On Wed, Jun 19, 2019 at 5:15 AM Martin Sebor wrote: Bug 90923 shows that even though GCC hash-table based containers like hash_map can be instantiated on types with user-defined ctors and dtors they invoke the dtors of such types without invoking the corresponding ctors. It was thanks to this bug that I spent a day debugging "interesting" miscompilations during GCC bootstrap (in fairness, it was that and bug 90904 about auto_vec copy assignment/construction also being hosed even for POD types). The attached patch corrects the hash_map and hash_set templates to invoke the ctors of the elements they insert and makes them (hopefully) safe to use with non-trivial user-defined types. Hum. I am worried about the difference of assignment vs. construction in ::put() + bool ins = hash_entry::is_empty (*e); + if (ins) + { + e->m_key = k; + new ((void *) &e->m_value) Value (v); + } + else + e->m_value = v; why not invoke the dtor on the old value and then the ctor again? It wouldn't work for self-assignment: Value &y = m.get_or_insert (key); m.put (key, y); The usual rule of thumb for writes into containers is to use construction when creating a new element and assignment when replacing the value of an existing element. Which reminds me that the hash containers, despite being copy- constructible (at least for POD types, they aren't for user- defined types), also aren't safe for assignment even for PODs. I opened bug 90959 for this. Until the assignment is fixed I made it inaccessibe in the patch (I have fixed the copy ctor to DTRT for non-PODs). How is an empty hash_entry constructed? In hash_table::find_slot_with_hash simply by finding an empty slot and returning a pointer to it. The memory for the slot is marked "empty" by calling the Traits::mark_empty() function. The full type of hash_map is actually hash_map, Value> and simple_hashmap_traits delegates it to default_hash_traits whose mark_empty() just clears the void*, leaving the Value part uninitialized. That makes sense because we don't want to call ctors for empty entries. I think the questions one might ask if one were to extend the design are: a) what class should invoke the ctor/assignment and b) should it do it directly or via the traits? ::remove() doesn't seem to invoke the dtor either, instead it relies on the traits::remove function? Yes. There is no Traits::construct or assign or copy. We could add them but I'm not sure I see to what end (there could be use cases, I just don't know enough about these classes to think of any). Attached is an updated patch with the additional minor fixes mentioned above. Martin PS I think we would get much better results by adopting the properly designed and tested standard library containers than by spending time trying to improve the design of these legacy classes. For simple uses that don't need to integrate with the GC machinery the standard containers should be fine (plus, it'd provide us with greater motivation to improve them and the code GCC emits for their uses). Unfortunately, to be able to use the hash-based containers we would need to upgrade to C++ 11. Isn't it time yet? I don't think so. I'm also not sure if C++11 on its own is desirable or if it should be C++14 or later at that point. 
SLES 12 has GCC 4.8 as host compiler (but also GCC 7+ optionally), SLES 15 has GCC 7. SLES 11 already struggles currently (GCC 4.3) but I'd no longer consider that important enough. Note any such change to libstdc++ containers should be complete and come with both code-size and compile-time and memory-usage measurements (both of GCC and other apps of course). Can I go ahead and commit the patch? I think we need to document the requirements on Value classes better. @@ -177,7 +185,10 @@ public: INSERT); bool ins = Traits::is_empty (*e); if (ins) - e->m_key = k; + { + e->m_key = k; + new ((void *)&e->m_value) Value (); + } this now requires a default constructor and I always forget about differences between the different form of initializations -- for a POD, does this zero the entry? Yes, the /value initialization/ above (via Value ()) zeroes out PODs, and it doesn't need a ctor. The placement new syntax should not require any changes to existing code. Does this updated comment for hash_map make it clearer? /* KeyId must be a trivial (POD) type. Value may be non-trivial (non-POD). Inserted elements are value-initialized either to zero for POD types or by invoking their default ctor. Removed elements are destroyed by inv
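The value-initialization behavior relied on above can be checked in isolation: with the trailing parentheses, placement new value-initializes, which zeroes a POD without needing a ctor (a standalone example, not GCC code):

```cpp
#include <new>
#include <cassert>

struct pod { int i; char c; };   // trivial: no user-defined ctor

int
main ()
{
  alignas (pod) unsigned char buf[sizeof (pod)];
  // "pod ()" value-initializes: a POD is zeroed, a non-POD would have
  // its default ctor run.  Plain "new (buf) pod" would leave the
  // members indeterminate.
  pod *p = new (static_cast<void *> (buf)) pod ();
  assert (p->i == 0 && p->c == 0);
  return 0;
}
```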