On 12/10/19 3:07 PM, David Malcolm wrote:
On Tue, 2019-12-03 at 15:41 -0700, Martin Sebor wrote:
After allocating a new chunk of memory hash_table::expand() copy-
assigns elements from the current array over the newly allocated
elements without constructing them.

Similarly, after destroying existing elements, hash_table::
empty_slow() assigns a new value to them.  This bug was
introduced in r249234 when trying to deal with -Wclass-memaccess
instances when the warning was first added.

Neither of these is a problem for PODs but both cause trouble when
the hash_table contains elements of a type with a user-defined copy
assignment operator.  There are at least two such instances in GCC
already and a third is under review.

The attached patch avoids this by using placement new to construct
new elements in uninitialized storage and restoring the original
call to memset in hash_table::empty_slow(), analogously to what
was done in r272893 for PR90923,

Longer term, to make these templates safely and efficiently usable
with non-POD types with user-defined copy ctors and copy assignment
operators I think these classes will probably need to be enhanced
to make use of "assign" and "move" traits functions to efficiently
assign and move objects.

Martin

diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index ba5d64fb7a3..26bac624a08 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -818,8 +818,7 @@ hash_table<Descriptor, Lazy, Allocator>::expand ()
        if (!is_empty (x) && !is_deleted (x))
          {
            value_type *q = find_empty_slot_for_expand (Descriptor::hash (x));
-
-          *q = x;
+         new ((void*) q) value_type (x);
          }
p++;
@@ -869,14 +868,8 @@ hash_table<Descriptor, Lazy, Allocator>::empty_slow ()
        m_size_prime_index = nindex;
      }
    else
-    {
-#ifndef BROKEN_VALUE_INITIALIZATION
-      for ( ; size; ++entries, --size)
-       *entries = value_type ();
-#else
-      memset (entries, 0, size * sizeof (value_type));
-#endif
-    }
+    memset ((void *) entries, 0, size * sizeof (value_type));
+
    m_n_deleted = 0;
    m_n_elements = 0;
  }

On attempting to rebase my analyzer branch I found that this patch
uncovered a bug in it, but also possibly a bug in hash-table.h.

In the analyzer branch I have a hash_map with a key where the "empty"
value isn't all-zero-bits.

There's a test case for it in comment #3 in 92762.


Specifically, in gcc/analyzer/program-state.h: sm_state_map has a
hash_map <svalue_id, entry_t> map_t, where svalue_id, the key type, has
hash traits:

template <>
inline void
pod_hash_traits<svalue_id>::mark_empty (value_type &v)
{
   v = svalue_id::null ();
}

which is a -1 int (all ones).

memset to zero bits populates the "empty" slots with a key with a non-
empty value for this key type, effectively corrupting the data
structure (luckily a selftest caught this).

Looking at the above patch, it looks like I was unwittingly relying on
two things:
(a) #ifndef BROKEN_VALUE_INITIALIZATION, and
(b) that the default ctor for my key type was the "empty" value.

However, shouldn't empty_slow be calling the Descriptor::mark_empty
function?  Doesn't this memset code assume that the "empty" value of
the hash_table entry is all-zeroes (and thus imposing the same
assumption on all hash_maps' key and value types?) - which isn't the
case for my code.  I don't remember seeing that assumption documented.

Or am I misreading this?

IIRC, I had tried the below but it caused problems during bootstrap
that I didn't spend enough time investigating.

  for (size_t i = size - 1; i < size; i--)
    if (!is_empty (entries[i]) && !is_deleted (entries[i]))
      {
        Descriptor::remove (entries[i]);
        mark_empty (entries[i]);
      }

(I think it also caused compilation error in one of the IPA passes
because of a missing mark_empty function in its traits class; maybe
ipa_bit_ggc_hash_traits in ipa-prop.c).

To be honest, I'm not sure I understand why the memset is even there.
It seems like a leak to me (I can reproduce leaks with a user-defined
type).  I was going to get back to it at some point.

Martin

Reply via email to