Richard Biener wrote: > On Tue, Sep 3, 2019 at 1:56 PM Ulrich Weigand <uweig...@de.ibm.com> wrote: > > combined with the fact that get_object_alignment_2 actually itself > > uses type alignment if we have an actual memory object: > > /* When EXP is an actual memory reference then we can use > > TYPE_ALIGN of a pointer indirection to derive alignment. > > Do so only if get_pointer_alignment_1 did not reveal absolute > > alignment knowledge and if using that alignment would > > improve the situation. */ > > > > Is this not correct either, or is the situation there somehow different? > > We're looking at an address, the above applies to places where we actually > dereference it and there we take the type from the type of the access > and not from the type of the pointer (or the pointed-to type) since that would > be equally wrong.
I guess I'm confused again, sorry :-) What is "the type of the access", and where do we get it from? In your example below, if somebody were to access s.i here, what would the "type of the access" be? > > But in that case I'd assume that > > every possible value of "ptr" at runtime must be 16 byte aligned, or > > else it wouldn't be a valid value of a variable of that type, right? > > So in that case assuming a 16-byte alignment would be correct? > > If you read the C standards fine-print then yes. But people (or > even the C frontend!) hardly get that correct since for example > for > > struct __attribute__((packed)) { int i; } s; > > void foo () > { > __builtin_printf ("%p", &s.i); > } > > the C fronted actually creates a int * typed pointer for the ADDR_EXPR > (and not an int * variant with 1-byte alignment). And people use > int * to pass such pointers everywhere. Well ... I'd say if you cast to int * and then use an atomic operation, it's your own fault that it fails :-/ If the frontend itself uses the wrong type, that is of course a problem. > > > The proper way to carry the information from source to RTL expansion > > > (or earlier GIMPLE) is to add another argument to the builtins > > > specifying alignment which the FEs can derive from the argument type > > > given appropriate semantics of the builtin. > > > > I guess I can have a look to do it this way, if necessary. > > Is there already some example of a builtin that does that? > > I think the atomic_* builtins avoid (or should avoid) the issue by > only "expanding" > to the fixed size vairants if the memory is appropriately aligned > and otherwise fall back to _n which doesn't assume any alignment(?) Well, this is exactly what we're trying to do! If the compiler can prove correct alignment, we want to emit the atomic instruction. Otherwise, we want to emit the library call. (The library function will do a run-time alignment check, and if the object is aligned after all, it will also use the atomic instruction; otherwise it will use a lock.) The problem is that the decision whether to emit the instruction or the library call is currently done by the back-end: if the insn expander fails, the middle-end will fall back to the libcall. So the back-end needs to base this decision on the alignment, but the only input it has is the MEM (and its alignment setting). But that MEM is **generated** by get_builtin_sync_mem using the alignment computed there, which is exactly the place I'm trying to fix ... Are you saying that this very decision ought to be made earlier in the front-end? That would of course also be OK with me. > > We definitely must handle the misaligned case (since the **default** > > alignment of __int128 on s390x is misaligned for atomic access). > > But is it fast (and atomic)? Would going the _n way "work" or do > you need a special 8-byte-aligned__int128 path? See above. We want the libcall fallback in the unaligned case. It is sort-of fast (except for the function call and run-time check overhead) if the object actually is aligned, less fast if it isn't. Bye, Ulrich -- Dr. Ulrich Weigand GNU/Linux compilers and toolchain ulrich.weig...@de.ibm.com