On Tue, Aug 9, 2011 at 8:39 PM, Aldy Hernandez <al...@redhat.com> wrote:
>
>> ok, so now you do this only for the first field in a bitfield group. But
>> you do it for _all_ bitfield groups in a struct, not only for the
>> interesting one.
>>
>> May I suggest to split the loop into two, first searching the first field
>> in the bitfield group that contains fld and then in a separate loop
>> computing the bitwidth?
>
> Excellent idea. Done! Now there are at most two calls to
> get_inner_reference, and in many cases, only one.
>
>> Backing up, considering one of my earlier questions. What is *offset
>> supposed to be relative to? The docs say sth like "relative to
>> INNERDECL", but the code doesn't contain a reference to INNERDECL anymore.
>
> Sorry, I see your confusion. The comments at the top were completely out
> of date. I have simplified and rewritten them accordingly. I am attaching
> get_bit_range() with these and other changes you suggested. See if it
> makes sense now.
>
>> Now we come to that padding thing. What's the C++ memory model
>> semantic for re-used tail padding? Consider
>
> Andrew addressed this elsewhere.
>
>> There is too much get_inner_reference and tree folding stuff in this
>> patch (which makes it expensive given that the algorithm is still
>> inherently quadratic). You can rely on the bitfield group advancing
>> by integer-cst bits (but the start offset may be non-constant, so
>> may the size of the underlying record).
>
> Now there are only two tree folding calls (apart from get_inner_reference),
> and the common case has very simple arithmetic tuples. I see no clear way
> of removing the last call to get_inner_reference(), as the padding after
> the field can only be calculated by calling get_inner_reference() on the
> subsequent field.
>
>> Now seeing all this - and considering that this is purely C++ frontend
>> semantics. Why can't the C++ frontend itself constrain accesses
>> according to the required semantics? It could simply create
>> BIT_FIELD_REF<MEM_REF<&containing_record,
>> byte-offset-to-start-of-group>, bit-size, bit-offset> for all bitfield
>> references (with a proper type for the MEM_REF, specifying the size of
>> the group). That would also avoid issues during tree optimization and
>> would at least allow optimizing the bitfield accesses according to the
>> desired C++ semantics.
>
> Andrew addressed this as well. Could you respond to his email if you think
> it is unsatisfactory?
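As a purely illustrative aside, the kind of tree the quoted suggestion
describes could be built along these lines. This is a hypothetical helper,
not part of the patch or of any frontend; group_type, group_byte_offset,
bit_size and bit_offset are assumed to have been computed elsewhere:

  /* Hypothetical sketch only: build
       BIT_FIELD_REF <MEM_REF <&containing_record, group_byte_offset>,
                      bit_size, bit_offset>
     where GROUP_TYPE is a type whose size covers the bitfield group.  */
  static tree
  build_group_bitfield_ref (tree containing_record, tree group_type,
                            HOST_WIDE_INT group_byte_offset,
                            tree field_type,
                            HOST_WIDE_INT bit_size, HOST_WIDE_INT bit_offset)
  {
    tree base = build_fold_addr_expr (containing_record);
    /* The MEM_REF's second operand is a pointer-typed constant holding
       the byte offset to the start of the group.  */
    tree mem = build2 (MEM_REF, group_type, base,
                       build_int_cst (build_pointer_type (group_type),
                                      group_byte_offset));
    /* Select BIT_SIZE bits at BIT_OFFSET within that group.  */
    return build3 (BIT_FIELD_REF, field_type, mem,
                   bitsize_int (bit_size), bitsize_int (bit_offset));
  }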
Some comments.

  /* If we have a bit-field with a bitsize > 0... */
  if (DECL_BIT_FIELD_TYPE (fld)
      && tree_low_cst (DECL_SIZE (fld), 1) > 0)

I think we can check bitsize != 0, thus

      && !integer_zerop (DECL_SIZE (fld))

instead.

You don't break groups here with MAX_FIXED_MODE_SIZE, so I don't think
it's ok to do that in the second loop:

      /* Short-circuit out if we have the max bits allowed.  */
      if (cumulative_bitsize >= MAX_FIXED_MODE_SIZE)
        {
          *maxbits = MAX_FIXED_MODE_SIZE;
          /* Calculate byte offset to the beginning of the bit region.  */
          gcc_assert (start_bitpos % BITS_PER_UNIT == 0);
          *offset = fold_build2 (PLUS_EXPR, TREE_TYPE (start_offset),
                                 start_offset,
                                 build_int_cst (integer_type_node,
                                                start_bitpos / BITS_PER_UNIT));
          return;

Apart from the *offset calculation being redundant, *offset + maxbits may
not include the referenced field. How do you plan to find an "optimal"
window for such an access? (*)

  /* Count the bitsize of the bitregion containing the field in question.  */
  found = false;
  cumulative_bitsize = 0;
  for (fld = bitregion_start; fld; fld = DECL_CHAIN (fld))
    {
      if (TREE_CODE (fld) != FIELD_DECL)
        continue;

      if (fld == field)
        found = true;

      if (DECL_BIT_FIELD_TYPE (fld)
          && tree_low_cst (DECL_SIZE (fld), 1) > 0)
        {
          ...
        }
      else if (found)
        break;

should probably be

      if (!DECL_BIT_FIELD_TYPE (fld)
          || integer_zerop (DECL_SIZE (fld)))
        break;

since we know that we'll eventually find field.

  /* If we found the end of the bit field sequence, include the
     padding up to the next field...  */
  if (fld)
    {

fld could be a non-FIELD_DECL here; you have to skip those first.

      /* Calculate bitpos and offset of the next field.  */
      get_inner_reference (build3 (COMPONENT_REF,
                                   TREE_TYPE (exp),
                                   TREE_OPERAND (exp, 0),
                                   fld, NULL_TREE),
                           &tbitsize, &end_bitpos, &end_offset,
                           &tmode, &tunsignedp, &tvolatilep, true);
      gcc_assert (end_bitpos % BITS_PER_UNIT == 0);

      if (end_offset)
        {
          tree type = TREE_TYPE (end_offset), end;

          /* Calculate byte offset to the end of the bit region.  */
          end = fold_build2 (PLUS_EXPR, type,
                             end_offset,
                             build_int_cst (type,
                                            end_bitpos / BITS_PER_UNIT));
          maxbits_tree = fold_build2 (MINUS_EXPR, type, end, *offset);
        }
      else
        maxbits_tree = build_int_cst (integer_type_node,
                                      end_bitpos - start_bitpos);

      /* ?? Can we get a variable-lengthened offset here ?? */
      gcc_assert (host_integerp (maxbits_tree, 1));
      *maxbits = TREE_INT_CST_LOW (maxbits_tree);

I think you may end up enlarging maxbits to more than MAX_FIXED_MODE_SIZE
here. What you should do instead (I think) is something along the lines of

      *maxbits = MIN (MAX_FIXED_MODE_SIZE,
                      *maxbits
                      + (operand_equal_p (DECL_FIELD_OFFSET (fld),
                                          DECL_FIELD_OFFSET (field), 0)
                         ? DECL_FIELD_BIT_OFFSET (fld)
                           - DECL_FIELD_BIT_OFFSET (field)
                         : DECL_OFFSET_ALIGN (field)
                           - DECL_FIELD_BIT_OFFSET (field)));

Note that another complication comes to my mind now - the offset field of
a COMPONENT_REF is used to specify a variable offset and has to be used,
if present, instead of DECL_FIELD_OFFSET. Thus your building of
COMPONENT_REFs to then pass them to get_inner_reference is broken. As you
are in generic code and not in the C++ frontend I believe you have to
properly handle this case (may I suggest to, at the start of the function,
simply return a minimum byte-aligned blob for the case that there is a
variable offset to the bitfield?).
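A minimal sketch of what such an early-out might look like, purely for
illustration and not part of the patch. It reuses the exp/field/maxbits
names from the snippets above and assumes *offset has already been set to
the byte offset of FIELD by get_inner_reference:

      /* Hypothetical early-out: if the COMPONENT_REF carries a variable
         offset (operand 2) or the field's DECL_FIELD_OFFSET is not
         constant, do not try to compute a bit region and instead return
         a minimal byte-aligned region covering just the referenced
         field.  */
      if (TREE_OPERAND (exp, 2) != NULL_TREE
          || !host_integerp (DECL_FIELD_OFFSET (field), 1))
        {
          /* DECL_SIZE of a bitfield is its width in bits; round it up to
             whole bytes.  A real implementation would also account for
             the bit position within the first byte.  */
          *maxbits = ((tree_low_cst (DECL_SIZE (field), 1)
                       + BITS_PER_UNIT - 1)
                      & ~(BITS_PER_UNIT - 1));
          return;
        }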
  /* ...otherwise, this is the last element in the structure.  */
  else
    {
      /* Include the padding at the end of structure.  */
      *maxbits = TREE_INT_CST_LOW (TYPE_SIZE (record_type))
                 - TREE_INT_CST_LOW (start_bitpos_direct_parent);
      if (*maxbits > MAX_FIXED_MODE_SIZE)
        *maxbits = MAX_FIXED_MODE_SIZE;
    }

With Andrew's answer this is invalid. You can (and should) at most do

  else
    *maxbits = (*maxbits + BITS_PER_UNIT - 1) & ~(BITS_PER_UNIT - 1);

thus, round *maxbits up to the next byte.

There is still the general issue of packed bitfields, which will probably
make the issue of the computed group not covering all of field more
prominent (especially if you limit to MAX_FIXED_MODE_SIZE). Consider

  struct __attribute__((packed)) {
    long long : 1;
    long long a : 64;
    char c;
  };

where a does not fit in a DImode mem but crosses it.

Why constrain *maxbits to MAX_FIXED_MODE_SIZE at all? Shouldn't the
*offset, *maxbits pair just constrain what the caller does, not force it
to actually use an access covering that full range (does it?)?

Richard.

(*) For bitfield lowering we discussed this a bit and the solution would
be to mirror what place_field does: fill groups until the space for the
mode of the so-far largest field is filled (this doesn't work for packed
bitfields, of course).
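A rough sketch of that grouping idea, purely illustrative and not taken
from place_field or from the patch; it only walks the fields and notes
where a real implementation would record group boundaries:

  /* Hypothetical sketch: walk the FIELD_DECLs of a RECORD_TYPE and start
     a new bitfield group whenever the accumulated bits would no longer
     fit in the mode of the largest underlying field type seen so far.
     Packed bitfields are not handled, as noted above.  */
  static void
  compute_bitfield_groups (tree record_type)
  {
    tree fld;
    unsigned HOST_WIDE_INT group_bits = 0, window = 0;

    for (fld = TYPE_FIELDS (record_type); fld; fld = DECL_CHAIN (fld))
      {
        unsigned HOST_WIDE_INT fsize, msize;

        if (TREE_CODE (fld) != FIELD_DECL)
          continue;

        if (!DECL_BIT_FIELD_TYPE (fld)
            || integer_zerop (DECL_SIZE (fld)))
          {
            /* A non-bitfield or zero-sized bitfield ends the group.  */
            group_bits = window = 0;
            continue;
          }

        fsize = tree_low_cst (DECL_SIZE (fld), 1);
        msize = GET_MODE_BITSIZE (TYPE_MODE (DECL_BIT_FIELD_TYPE (fld)));
        window = MAX (window, msize);

        if (group_bits + fsize > window)
          {
            /* The current window is full; a real implementation would
               record FLD as the start of a new group here.  */
            group_bits = 0;
            window = msize;
          }
        group_bits += fsize;
      }
  }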