Re: Found one (the?) ARM bug: (Pointer) aliasing issues

Iain Buclaw Sat, 02 Nov 2013 19:11:26 -0700

On 2 November 2013 20:07, Johannes Pfau <nos...@example.com> wrote:

> I think I finally found the root cause for one of the remaining ARM
> bugs (the codegen bug which only appears on -O2 or higher).
>
> What happened in the test case was that the gcc backend illegally moved
> a read to a memory location before the write. It turns out that GCC
> assumed that the read and write locations were in different alias sets
> and therefore could not possibly reference the same memory location. One
> alias set was for 'ubyte[]' and one for 'char[]'.
>
> The code that triggers this issue is in std.algorithm.find:
> --------
> R1 find(...)(R1 haystack, R2 needle) if (/* are strings*/)
> {
>     return cast(char[]) .find!(ubyte[], ubyte[])
>         (cast(ubyte[]) haystack, cast(ubyte[])needle);
> }
> --------
> (Real code:
>
> https://github.com/D-Programming-GDC/GDC/blob/master/libphobos/src/std/algorithm.d#L3555
> )
>
> The generic emitted by gdc:
> --------
> return <retval> = *(struct  *) &find (*(struct  *) &haystack, *(struct
> *) &needle);
> --------
>
> Or in the raw form:
> @44 = ubyte[]
> @11 = string
> --------
> @11     record_type      name: @17      size: @18      algn: 32
>                          tag : struct   flds: @19
> @31     pointer_type     size: @15      algn: 32       ptd : @11
> @40     pointer_type     size: @15      algn: 32       ptd : @44
> @41     call_expr        type: @44      fn  : @45      0   : @46
>                          1   : @47
> @44     record_type      name: @49      size: @18      algn:
>                          tag : struct   flds: @50
> @46     indirect_ref     type: @44      op 0: @53
> @47     indirect_ref     type: @44      op 0: @54
> @53     nop_expr         type: @40      op 0: @58
> @54     nop_expr         type: @40      op 0: @59
> @58     addr_expr        type: @31      op 0: @30
> @59     addr_expr        type: @31      op 0: @61
> --------
>
> Here it's easy to see that we essentially generate this code:
> *(cast(ubyte[]*)(&haystack))
> and this is AFAIK a violation of the aliasing rules.
>
> This problem is not observed at -O1 as the function call generates a
> new stackframe and we therefore have two distinct memory locations. But
> with -O2 inlining removes this copy and we now have variables
> referencing the same memory location with type ubyte[] and char[].
>
> I can not provide a reduced testcase as the smallest changes in
> compiler codegen (gcc version, gdc commit) or the test case can hide
> this issue. It's always reproducible with test15.d in the test suite.
> (In older gdc versions the test case does not segfault but it produces
> wrong output at -O2. This can be a very subtle difference)
>
> But here's a working code snippet which illustrates the generic
> generated by GDC:
>
> -----------------
> void main()
> {
>         char[] in1 = "Test".dup;
>         char[] in2 = "Test2".dup;
>
>         char[] result = cast(char[])find(cast(ubyte[])in1,
>             cast(ubyte[])in2);
> }
>
> ubyte[] find(ubyte[] a, ubyte[] b)
> {
>     return a;
> }
> -----------------
>
>
> So the important question here: Is this a bug in GDC codegen or is the
> code in std.algorithm invalid? According to
> http://dlang.org/expression.html
> "The cast is done as a type paint" so this could indeed be interpreted
> as a user mistake. But OTOH that page also talks about a runtime check
> of the array .lengths which is clearly missing here.
>
> I'm also wondering if that runtime check can actually fix this
> aliasing issue or if it can come up again if the runtime check itself
> is inlined?
>



I guess D does not have any aliasing rules.  Question: does this occur when
you pass -fno-strict-aliasing?

By default, GDC defines -fstrict-aliasing (which is also enabled by default
in -O2 in GCC), which is nice to have, but if it's proving to be
non-trivial, we can always disallow the optimisation being done in the
middle-end.  See the LANG_HOOKS_GET_ALIAS_SET langhook - last time I
checked, returning 0 disables aliasing rules from taking effect.

Regards.
-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';

Re: Found one (the?) ARM bug: (Pointer) aliasing issues

Reply via email to