https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99728

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at gcc dot gnu.org

--- Comment #17 from Richard Biener <rguenth at gcc dot gnu.org> ---
So we have

  val$v_62 = *d_28(D).lam1D.32701.vD.32579;
  *d_28(D).lam1D.32701 = *d_28(D).lam2D.32702;
  *d_28(D).lam2D.32702.vD.32579 = _34;

I believe that the SRA pass has the best analysis capabilities to eventually
decompose aggregate copies into register pieces (with cost considerations).
In particular it knows (but without flow info) what kind of types
sub-accesses use, and we want the aggregate copy replaced with pieces
that match the rest of the accesses (here because of LIM's restrictions).

In particular we'd like to use 'vector double' typed accesses here, something
the middle-end usually avoids for block copies, which is what aggregate copies
are to the middle-end.

That said, it would be _much_ easier if the frontend with its language specific
semantic knowledge could avoid doing block-copies for such simple wrappers
and instead perform (recursively) memberwise copy (for single member
aggregates).

Of course the simple fix in source is to add

  Tvsimple &operator=(const Tvsimple &other) { v = other.v; return *this;}

producing optimal code.  Jason - would you consider this premature
"optimization" in the C++ frontend?  It doesn't seem that there's
an operator= synthesized, instead we directly emit

   <<cleanup_point <<< Unknown tree: expr_stmt
  (void) (d->lam1 = *(const struct Tvsimple &) &d->lam2) >>>>>;

from

    d.lam1 = d.lam2;

from build_over_call, which has a series of optimizations at

  else if (DECL_ASSIGNMENT_OPERATOR_P (fn)
           && DECL_OVERLOADED_OPERATOR_IS (fn, NOP_EXPR)
           && trivial_fn_p (fn))
    {
...
      if (is_really_empty_class (type, /*ignore_vptr*/true))
        {
          /* Avoid copying empty classes.  */
          val = build2 (COMPOUND_EXPR, type, arg, to);
          suppress_warning (val, OPT_Wunused);
        }
      else if (tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (as_base)))
        {
          if (is_std_init_list (type)
              && conv_binds_ref_to_prvalue (convs[1]))
            warning_at (loc, OPT_Winit_list_lifetime,
                        "assignment from temporary %<initializer_list%> does "
                        "not extend the lifetime of the underlying array");
          arg = cp_build_fold_indirect_ref (arg);
          val = build2 (MODIFY_EXPR, TREE_TYPE (to), to, arg);

So we handle empty classes; maybe we can also handle single data-member
classes (not sure how exactly to test for this; walking TYPE_FIELDS
repeatedly for each considered assignment would be slow, I guess).
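
For reference, the source-level workaround mentioned above could look roughly
like the sketch below.  Only Tvsimple, its member v, lam1/lam2 and the
operator= itself come from this report; the Tv typedef (a 2 x double
GCC/Clang vector-extension type), the Data struct, rotate() and example() are
assumptions made up for illustration, not the reporter's actual testcase.

```cpp
#include <cassert>

// Assumption: a 2 x double SIMD type via the GCC/Clang vector extension.
typedef double Tv __attribute__((vector_size(16)));

// Single-member wrapper.  The user-provided operator= copies the member
// directly, so the assignment is written as a 'vector double' typed access
// rather than being left to the trivial (block-copy) assignment.
struct Tvsimple {
  Tv v;
  Tvsimple &operator=(const Tvsimple &other) { v = other.v; return *this; }
};

// Hypothetical container mirroring the d->lam1 / d->lam2 accesses.
struct Data {
  Tvsimple lam1, lam2;
};

// Same access pattern as the three GIMPLE statements quoted at the top:
// load lam1.v, copy lam2 into lam1, store a new value into lam2.v.
Tv rotate(Data *d, Tv next) {
  Tv old = d->lam1.v;
  d->lam1 = d->lam2;  // calls the memberwise operator= above
  d->lam2.v = next;
  return old;
}

// Small semantic check; it says nothing about the generated code.
void example() {
  Tv a = {1.0, 2.0};
  Tv b = {3.0, 4.0};
  Tv c = {5.0, 6.0};
  Data d;
  d.lam1.v = a;
  d.lam2.v = b;
  Tv old = rotate(&d, c);
  assert(old[0] == 1.0 && old[1] == 2.0);
  assert(d.lam1.v[0] == 3.0 && d.lam1.v[1] == 4.0);
  assert(d.lam2.v[0] == 5.0 && d.lam2.v[1] == 6.0);
}
```

The sketch only demonstrates that the memberwise operator= preserves the
semantics of the assignment; whether it changes the emitted code depends on
the compiler version and flags.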
