On Tue, Sep 3, 2019 at 3:09 PM Ulrich Weigand <uweig...@de.ibm.com> wrote:
>
> Richard Biener wrote:
> > On Tue, Sep 3, 2019 at 1:56 PM Ulrich Weigand <uweig...@de.ibm.com> wrote:
> > > combined with the fact that get_object_alignment_2 actually itself
> > > uses type alignment if we have an actual memory object:
> > >       /* When EXP is an actual memory reference then we can use
> > >          TYPE_ALIGN of a pointer indirection to derive alignment.
> > >          Do so only if get_pointer_alignment_1 did not reveal absolute
> > >          alignment knowledge and if using that alignment would
> > >          improve the situation.  */
> > >
> > > Is this not correct either, or is the situation there somehow different?
> >
> > We're looking at an address, the above applies to places where we actually
> > dereference it and there we take the type from the type of the access
> > and not from the type of the pointer (or the pointed-to type) since that 
> > would
> > be equally wrong.
>
> I guess I'm confused again, sorry :-)   What is "the type of the access",
> and where do we get it from?  In your example below, if somebody were
> to access s.i here, what would the "type of the access" be?

The type of the access is 'int' I guess (packed is outside of the C standard).

> > >  But in that case I'd assume that
> > > every possible value of "ptr" at runtime must be 16 byte aligned, or
> > > else it wouldn't be a valid value of a variable of that type, right?
> > > So in that case assuming a 16-byte alignment would be correct?
> >
> > If you read the C standards fine-print then yes.  But people (or
> > even the C frontend!) hardly get that correct since for example
> > for
> >
> > struct __attribute__((packed)) { int i; } s;
> >
> > void foo ()
> > {
> >   __builtin_printf ("%p", &s.i);
> > }
> >
> > the C fronted actually creates a int * typed pointer for the ADDR_EXPR
> > (and not an int * variant with 1-byte alignment).  And people use
> > int * to pass such pointers everywhere.
>
> Well ... I'd say if you cast to int * and then use an atomic operation,
> it's your own fault that it fails :-/   If the frontend itself uses
> the wrong type, that is of course a problem.

Yes.  The C standard says that if you cast something to a pointer type
the pointer has to be aligned according to the pointed-to type, otherwise
undefined.  But we have no chance to use this because of this kind of
issues (and of course developer laziness...).

> > > > The proper way to carry the information from source to RTL expansion
> > > > (or earlier GIMPLE) is to add another argument to the builtins
> > > > specifying alignment which the FEs can derive from the argument type
> > > > given appropriate semantics of the builtin.
> > >
> > > I guess I can have a look to do it this way, if necessary.
> > > Is there already some example of a builtin that does that?
> >
> > I think the atomic_* builtins avoid (or should avoid) the issue by
> > only "expanding"
> > to the fixed size vairants if the memory is appropriately aligned
> > and otherwise fall back to _n which doesn't assume any alignment(?)
>
> Well, this is exactly what we're trying to do!  If the compiler can
> prove correct alignment, we want to emit the atomic instruction.
> Otherwise, we want to emit the library call.  (The library function
> will do a run-time alignment check, and if the object is aligned
> after all, it will also use the atomic instruction; otherwise it
> will use a lock.)
>
> The problem is that the decision whether to emit the instruction or
> the library call is currently done by the back-end: if the insn
> expander fails, the middle-end will fall back to the libcall.
>
> So the back-end needs to base this decision on the alignment, but
> the only input it has is the MEM (and its alignment setting).  But
> that MEM is **generated** by get_builtin_sync_mem using the alignment
> computed there, which is exactly the place I'm trying to fix ...
>
> Are you saying that this very decision ought to be made earlier
> in the front-end?  That would of course also be OK with me.

I'm not sure how it is done now but IIRC the users
use __atomic_load (ptr) and then the frontend changes that
to one of BUILT_IN_ATOMIC_LOAD_{N,1,2,4,8,16} based on
some criteria (size mainly).  I'm saying we should factor in
alignment here, _not_ using say BUILT_IN_ATOMIC_LOAD_16
if according to the C standard the data isn't aligned.  Maybe we can
ask the target whether the alignment according to C matches the
requirement for _16 expansion.  And if things are not fine
the FE should instead use BUILT_IN_ATOMIC_LOAD_N
which we later if the compiler can prove bigger alignment and N
is constant could expand as one of the others.

But safety first.

> > > We definitely must handle the misaligned case (since the **default**
> > > alignment of __int128 on s390x is misaligned for atomic access).
> >
> > But is it fast (and atomic)?  Would going the _n way "work" or do
> > you need a special 8-byte-aligned__int128 path?
>
> See above.  We want the libcall fallback in the unaligned case.  It
> is sort-of fast (except for the function call and run-time check
> overhead) if the object actually is aligned, less fast if it isn't.

OK.

So yes, I'd say try tackling the issue in the frontend which is the
last to know the alignment in the most optimistic way (according
to the C standard).  At RTL expansion time we have to be (very)
conservative.

Btw, the alternative is to add an extra alignment parameter
to all of the atomic builtins where the FE could stick the alignment
for us to use during RTL expansion.  But given we have the _{1,2,4,8,16,N}
variants it sounds easier to use those...

Note we only document __atomic_load and __atomic_load_n so
the _{1,2,4,8,16} variants seem to be implementation detail.

Richard.

> Bye,
> Ulrich
>
> --
>   Dr. Ulrich Weigand
>   GNU/Linux compilers and toolchain
>   ulrich.weig...@de.ibm.com
>

Reply via email to