Re: 'hash_map>'

2021-08-07 Thread Thomas Schwinge
Hi!

On 2021-08-06T19:37:58+0100, Jonathan Wakely  wrote:
> On Fri, 6 Aug 2021, 17:58 Thomas Schwinge,  wrote:
>> So I'm trying to do some C++...  ;-)
>>
>> Given:
>>
>> /* A map from SSA names or var decls to record fields.  */
>> typedef hash_map field_map_t;
>>
>> /* For each propagation record type, this is a map from SSA names or
>> var decls
>>to propagate, to the field in the record type that should be used
>> for
>>transmission and reception.  */
>> typedef hash_map record_field_map_t;
>>
>> Thus, that's a 'hash_map>'.  (I may do that,
>> right?)  Looking through GCC implementation files, very most of all uses
>> of 'hash_map' boil down to pointer key ('tree', for example) and
>> pointer/integer value.
>>
>> Then:
>>
>> record_field_map_t field_map ([...]); // see below
>> for ([...])
>>   {
>> tree record_type = [...];
>> [...]
>> bool existed;
>> field_map_t &fields
>>   = field_map.get_or_insert (record_type, &existed);
>> gcc_checking_assert (!existed);
>> [...]
>> for ([...])
>>   fields.put ([...], [...]);
>> [...]
>>   }
>> [stuff that looks up elements from 'field_map']
>> field_map.empty ();
>>
>> This generally works.
>>
>> If I instantiate 'record_field_map_t field_map (40);', Valgrind is happy.
>> If however I instantiate 'record_field_map_t field_map (13);' (where '13'
>> would be the default for 'hash_map'), Valgrind complains:
>>
>> 2,080 bytes in 10 blocks are definitely lost in loss record 828 of 876
>>at 0x483DD99: calloc (vg_replace_malloc.c:762)
>>by 0x175F010: xcalloc (xmalloc.c:162)
>>by 0xAF4A2C: hash_table> simple_hashmap_traits, tree_node*> 
>> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool, 
>> bool, mem_alloc_origin) (hash-table.h:275)
>>by 0x15E0120: hash_map> simple_hashmap_traits, tree_node*> 
>> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
>>by 0x15DEE87: hash_map> simple_hashmap_traits, tree_node*> >, 
>> simple_hashmap_traits, hash_map> tree_node*, simple_hashmap_traits, 
>> tree_node*> > > >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
>>by 0x15DD52C: execute_omp_oacc_neuter_broadcast() 
>> (omp-oacc-neuter-broadcast.cc:1371)
>>[...]
>>
>> (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
>> file.)
>>
>> My suspicion was that it is due to the 'field_map' getting resized as it
>> incrementally grows (and '40' being big enough for that to never happen),
>> and somehow the non-POD (?) value objects not being properly handled
>> during that.  Working my way a bit through 'gcc/hash-map.*' and
>> 'gcc/hash-table.*' (but not claiming that I understand all that, off
>> hand), it seems as if my theory is right: I'm able to plug this memory
>> leak as follows:
>>
>> --- gcc/hash-table.h
>> +++ gcc/hash-table.h
>> @@ -820,6 +820,8 @@ hash_table::expand ()
>>  {
>>value_type *q = find_empty_slot_for_expand (Descriptor::hash 
>> (x));
>>   new ((void*) q) value_type (std::move (x));
>> + //BAD Descriptor::remove (x); // (doesn't make sense and) a ton of 
>> "Invalid read [...] inside a block of size [...] free'd"
>> + x.~value_type (); //GOOD This seems to work!  -- but does it make 
>> sense?
>>  }
>>
>>p++;
>>
>> However, that doesn't exactly look like a correct fix, does it?  I'd
>> expect such a manual destructor call in combination with placement new
>> (that is being used here, obviously) -- but this is after 'std::move'?
>> However, this also survives a smoke-test-like run of parts of the GCC
>> testsuite, bootstrap and complete run now ongoing.

That testing came back without any issues.

> Does GCC's hash_map assume you only use it to store POD (plain old data)
> types

Don't you disappoint me, C++!

> which don't need to be destroyed, because they don't have any
> dynamically allocated memory or other resources?
>
> A hash_map is not a POD, because it does have dynamically allocated memory.

ACK, that's what I tried to say above in my "layman's terms".  ;-)

> If my guess is right, then hash_map should really use a static_assert to
> enforce that requirement, instead of letting you use it in a way that will
> leak.

Eh, yes, at the very least!

Or, of course, make it work?  I mean GCC surely isn't the first software
project to desire implementing a 'hash_map' storing non-POD objects?
Don't you disappoint me, C++!

Alternative to that manual destructor call (per my patch/hack above) --
is maybe something wrong in the 'value_type' constructor implementation
or any other bits related to the 'std::move'?  (Is that where the non-POD
source data ought to be destructed; via "move" instead of "copy"
semantics?)

"Learning C++ by actual need."  ;-D


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH;

Re: 'hash_map>'

2021-08-07 Thread Jonathan Wakely via Gcc
On Sat, 7 Aug 2021, 09:08 Thomas Schwinge,  wrote:

> Hi!
>
> On 2021-08-06T19:37:58+0100, Jonathan Wakely 
> wrote:
> > On Fri, 6 Aug 2021, 17:58 Thomas Schwinge, 
> wrote:
> >> So I'm trying to do some C++...  ;-)
> >>
> >> Given:
> >>
> >> /* A map from SSA names or var decls to record fields.  */
> >> typedef hash_map field_map_t;
> >>
> >> /* For each propagation record type, this is a map from SSA names or
> >> var decls
> >>to propagate, to the field in the record type that should be used
> >> for
> >>transmission and reception.  */
> >> typedef hash_map record_field_map_t;
> >>
> >> Thus, that's a 'hash_map>'.  (I may do that,
> >> right?)  Looking through GCC implementation files, very most of all uses
> >> of 'hash_map' boil down to pointer key ('tree', for example) and
> >> pointer/integer value.
> >>
> >> Then:
> >>
> >> record_field_map_t field_map ([...]); // see below
> >> for ([...])
> >>   {
> >> tree record_type = [...];
> >> [...]
> >> bool existed;
> >> field_map_t &fields
> >>   = field_map.get_or_insert (record_type, &existed);
> >> gcc_checking_assert (!existed);
> >> [...]
> >> for ([...])
> >>   fields.put ([...], [...]);
> >> [...]
> >>   }
> >> [stuff that looks up elements from 'field_map']
> >> field_map.empty ();
> >>
> >> This generally works.
> >>
> >> If I instantiate 'record_field_map_t field_map (40);', Valgrind is
> happy.
> >> If however I instantiate 'record_field_map_t field_map (13);' (where
> '13'
> >> would be the default for 'hash_map'), Valgrind complains:
> >>
> >> 2,080 bytes in 10 blocks are definitely lost in loss record 828 of
> 876
> >>at 0x483DD99: calloc (vg_replace_malloc.c:762)
> >>by 0x175F010: xcalloc (xmalloc.c:162)
> >>by 0xAF4A2C: hash_table simple_hashmap_traits, tree_node*>
> >::hash_entry, false, xcallocator>::hash_table(unsigned long, bool, bool,
> bool, mem_alloc_origin) (hash-table.h:275)
> >>by 0x15E0120: hash_map simple_hashmap_traits, tree_node*>
> >::hash_map(unsigned long, bool, bool, bool) (hash-map.h:143)
> >>by 0x15DEE87: hash_map tree_node*, simple_hashmap_traits,
> tree_node*> >, simple_hashmap_traits,
> hash_map simple_hashmap_traits, tree_node*> > >
> >::get_or_insert(tree_node* const&, bool*) (hash-map.h:205)
> >>by 0x15DD52C: execute_omp_oacc_neuter_broadcast()
> (omp-oacc-neuter-broadcast.cc:1371)
> >>[...]
> >>
> >> (That's with '#pragma GCC optimize "O0"' at the top of the 'gcc/*.cc'
> >> file.)
> >>
> >> My suspicion was that it is due to the 'field_map' getting resized as it
> >> incrementally grows (and '40' being big enough for that to never
> happen),
> >> and somehow the non-POD (?) value objects not being properly handled
> >> during that.  Working my way a bit through 'gcc/hash-map.*' and
> >> 'gcc/hash-table.*' (but not claiming that I understand all that, off
> >> hand), it seems as if my theory is right: I'm able to plug this memory
> >> leak as follows:
> >>
> >> --- gcc/hash-table.h
> >> +++ gcc/hash-table.h
> >> @@ -820,6 +820,8 @@ hash_table::expand
> ()
> >>  {
> >>value_type *q = find_empty_slot_for_expand
> (Descriptor::hash (x));
> >>   new ((void*) q) value_type (std::move (x));
> >> + //BAD Descriptor::remove (x); // (doesn't make sense and) a
> ton of "Invalid read [...] inside a block of size [...] free'd"
> >> + x.~value_type (); //GOOD This seems to work!  -- but does it
> make sense?
> >>  }
> >>
> >>p++;
> >>
> >> However, that doesn't exactly look like a correct fix, does it?  I'd
> >> expect such a manual destructor call in combination with placement new
> >> (that is being used here, obviously) -- but this is after 'std::move'?
> >> However, this also survives a smoke-test-like run of parts of the GCC
> >> testsuite, bootstrap and complete run now ongoing.
>
> That testing came back without any issues.
>
> > Does GCC's hash_map assume you only use it to store POD (plain old data)
> > types
>
> Don't you disappoint me, C++!
>

It's not a limitation of C++, just this data structure.



> > which don't need to be destroyed, because they don't have any
> > dynamically allocated memory or other resources?
> >
> > A hash_map is not a POD, because it does have dynamically allocated
> memory.
>
> ACK, that's what I tried to say above in my "layman's terms".  ;-)
>
> > If my guess is right, then hash_map should really use a static_assert to
> > enforce that requirement, instead of letting you use it in a way that
> will
> > leak.
>
> Eh, yes, at the very least!
>
> Or, of course, make it work?  I mean GCC surely isn't the first software
> project to desire implementing a 'hash_map' storing non-POD objects?
> Don't you disappoint me, C++!
>


Of course it's possible.


> Alternative to that manual destructor call (per my patch/

Optimizer failure

2021-08-07 Thread Stefan Kanthak
Hi,

for the function (really: ternary expressions)

int dummy(int x)
{
#ifdef VARIANT
x < 0 ? --x : x > 0 ? ++x : 0;
#else
x < 0 ? --x : x > 0 ? ++x : x;
#endif
}

GCC 10.2.0 generates the following code targeting AMD64:

testl   %edi, %edi
js  .L0
leal1(%rdi), %eax
movl$0, %edi  # SUPERFLUOUS:
cmove   %edi, %eax#  cmove is only executed if edi was already 0
ret
.L0:
leal-1(%rdi), %eax
ret

Targeting i386, GCC 10.2.0 generates it the other way 'round:

movl4(%esp), %eax
testl   %eax, %eax
js  L0
leal1(%eax), %edx
movl$0, %eax  # SUPERFLUOUS:
cmovne  %edx, %eax#  cmovne is only executed if eax was not 0
ret
L0:
subl$1, %eax
ret

regards
Stefan


Re: Suboptimal code generated for __buitlin_trunc on AMD64 without SS4_4.1

2021-08-07 Thread Stefan Kanthak
Joseph Myers  wrote:


> On Fri, 6 Aug 2021, Stefan Kanthak wrote:

PLEASE DON'T STRIP ATTRIBUTION LINES: I did not write the following paragraph!

>> > I don't know what the standard says about NaNs in this case, I seem to
>> > remember that arithmetic instructions typically produce QNaN when one of
>> > the inputs is a NaN, whether signaling or not. 
>> 
>> 
>> and its cousins as well as the C standard say
>> 
>> | If x is NaN, a NaN shall be returned.
>> 
>> That's why I mentioned that the code GCC generates also doesn't quiet SNaNs.
> 
> You should be looking at TS 18661-3 / C2x Annex F for sNaN handling;

I'll do so as soon as GCC drops support for all C dialects before C2x!

Unless you use a time machine and fix the POSIX and ISO C standards
written in the past you CAN'T neglect all software written before C2x
modified sNaN handling that relies on the documented behaviour at the
time it was written.

> the POSIX attempts to deal with signaling NaNs aren't well thought out.

[...]

> Though in C2x mode, these SSE2 code sequences won't be used by default
> anyway, except for rint (C2x implies -fno-fp-int-builtin-inexact).

regards
Stefan


C++ member functions vs C++ free-functions, which compiles faster???

2021-08-07 Thread unlvsur unlvsur via Gcc


struct foo
{
void write()
{
//dosomething
}
};

Vs

struct foo
{
};
Inline void write(foo&)
{
//dosomething
}

Which one compiles faster?
My guess is that the first one compiles faster since it is more context free 
compared to 2nd one.

Sent from Mail for Windows 10



gcc-11-20210807 is now available

2021-08-07 Thread GCC Administrator via Gcc
Snapshot gcc-11-20210807 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20210807/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision 41ddd56e5d4bb3df096d714e7438b836323b813a

You'll find:

 gcc-11-20210807.tar.xz   Complete GCC

  SHA256=fc75b36c6c2f816dfc3c4688d79b8ff857be4ba3b9ed8864943b88e9abf006e6
  SHA1=5d475e4eebd38910f1fc7e1493ac52f5c02b8ee5

Diffs from 11-20210731 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.