[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:

> I'm not sure this is the right way to do this; I suspect we're lumping 
> together a bunch of different bugs:
>
> 1. vector types need to have tbaa which makes them alias with their element 
> types [to be clear, as vector types are an implementation extension, this is 
> our choice; I believe most users expect this to be true, but I'm certainly 
> open to leaving this as-is (i.e. the vector types and scalar types as 
> independent/non-aliasing)].
> 2. tbaa can't be used for write <-> write queries (only read <-> write 
> queries) because the writes can change the effective type
> 3. our 'struct' path TBAA for unions is broken (and to fix this we need to 
> invert the tree structure, etc. as discussed on the list)
>
>   (1) and (3) are independent bugs. (1) if desired, should just be fixed (the 
> vector type should be a child of the scalar element type in the current 
> representation). (2) needs to be fixed too, is not strictly related to 
> unions, and needs to be fixed in the backend. Doing this does not seem hard 
> (we just need to record a Write Boolean in MemoryLocation and then check them 
> in TypeBasedAAResult (TBAA can't be used if both locations are writes). (3) 
> is what this should address. What we should do here is only generate the 
> scalar TBAA for the union accesses. As I recall, the only reason that the 
> scalar TBAA for union accesses is a problem is because of (2), but that's 
> easy to fix in the backend (and we need to do that anyway).
>
>   @dberlin , do you agree?


I pretty strongly agree that these are different bugs, and at the least, 
deserving of different patches as they have different tradeoffs/solutions.
Ii think what should probably happen here is that we fix #1 and #3 in separate 
patches, and discuss #2 a bit until we come to consensus about what should be 
done.

I'm not sure about the solution to #2, because i thought there were very 
specific points in time at which the effective type could change.
It seems like it should be possible to generate a correct tree that represents 
this rather than try to hack up the backend, since the frontend knows when this 
has happened.
(IE if the effective type has changed from double to int, it should have double 
and int parents to make sure it conflicts with both.  Though i guess maybe 
that's impossible in the current rep until we fix it).
I generally think it should be the responsibility of the frontend to generate 
metadata that is correct for all types of queries, and if we have to extend 
that, i'd extend it, rather than try to special case it in the backend, 
*especially* by changing a core structure like MemoryLocation.

In fact, i mostly believe tbaa doesn't belong in memory location, and that it 
should be looked up separately by the TBAA AA impl since that's the thing that 
should care about it.  As we've repeatedly said, memory locations and pointers 
don't really have tbaa info, instructions do.  We're really just trying to pass 
that along.
This would require changing it from including the tbaa tag to including the 
instruction the location came from, which may actually be useful for other 
reasons anyway (we'd be able to look up instruction-specific aa info, in the aa 
impls that have that info, instead of having to shove it all in memory location 
as we do now, and the impl tries to recreate apples from applesauce).  The cost 
of doing this is that if we include the instruction in operator==, we'll end up 
with more queries in things like memoryssa, uses memorylocation as a 
discriminator.IE it expects everything with the same memorylocation to have 
the same aliasing results, so it cuts down on queries by using it as a key.  
This will still work, it just won't do anything useful if the instruction is 
the discriminator.This seems solvable by saying two memorylocations are 
equal if the pointers and sizes are equal, and  instructions are the same kind 
(read/write/kill) and have equal metadata (IE not just tbaa).  Which is really 
likely what we want anyway, rather than "pointer, size, tbaa info implies alias 
equality".  Maybe i'm wrong.
We'd definitely have to benchmark.
(and yes, i realize this is more involved than we may want to do right now, i'd 
actually be willing to sign up to do it if we come to consensus and the numbers 
look good).


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727352, @hfinkel wrote:

> > I'm not sure about the solution to #2, because i thought there were very 
> > specific points in time at which the effective type could change.
>
> I think this is a key point. I'm not sure that there are specific points that 
> the frontend can deduce:
>
>   union U {
> int i;
> float f;
>   };
>   
>   void bar1(int *i) {
> *i = 0; // we just reset the type here
>   }
>   
>   void bar2(float *f) {
> *f = 0.0f; // we just reset the type here too
>   }
>   
>   void foo(U *u) {
> bar1(&u->i);
> bar2(&u->f);
>   }
>   
>
> Even if the union has structs instead of scalar types, I'm not sure that 
> changes the situation. There certainly are situation where you can't silently 
> change the types of objects in C++ just by starting to write a to 
> differently-typed object at the same location, but I think that using this 
> property relies on having some lifetime information for the objects in 
> question, and so AA would need to be able to use this lifetime information to 
> do more. This seems like an orthogonal issue (i.e. we can always add TBAA 
> write <-> write ability in the presence of such lifetime information as an 
> additional feature). Maybe I'm missing something...


Union type accesses must explicitly be made through a union if you want the 
effective type to change.
This is now actually codified in either c11 or c++14 (i can't remember which), 
but even before that, it's the only thing gcc/llvm have *ever*  guaranteed.

If you don't make it an explicitly visible union access at each point, you will 
get wrong code, and this is pretty much unsolvable in general if you want tbaa 
to work at all.

Because otherwise foo, bar1, and bar2 could all be in different translation 
units, in which case, you just destroyed all usefulness of tbaa because it has 
to assume all pointers can conflict with all pointers :)


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727370, @dberlin wrote:

> In https://reviews.llvm.org/D31885#727352, @hfinkel wrote:
>
> > > I'm not sure about the solution to #2, because i thought there were very 
> > > specific points in time at which the effective type could change.
> >
> > I think this is a key point. I'm not sure that there are specific points 
> > that the frontend can deduce:
> >
> >   union U {
> > int i;
> > float f;
> >   };
> >   
> >   void bar1(int *i) {
> > *i = 0; // we just reset the type here
> >   }
> >   
> >   void bar2(float *f) {
> > *f = 0.0f; // we just reset the type here too
> >   }
> >   
> >   void foo(U *u) {
> > bar1(&u->i);
> > bar2(&u->f);
> >   }
> >   
> >
> > Even if the union has structs instead of scalar types, I'm not sure that 
> > changes the situation. There certainly are situation where you can't 
> > silently change the types of objects in C++ just by starting to write a to 
> > differently-typed object at the same location, but I think that using this 
> > property relies on having some lifetime information for the objects in 
> > question, and so AA would need to be able to use this lifetime information 
> > to do more. This seems like an orthogonal issue (i.e. we can always add 
> > TBAA write <-> write ability in the presence of such lifetime information 
> > as an additional feature). Maybe I'm missing something...
>
>
> Union type accesses must explicitly be made through a union if you want the 
> effective type to change.
>  This is now actually codified in either c11 or c++14 (i can't remember 
> which), but even before that, it's the only thing gcc/llvm have *ever*  
> guaranteed.
>
> If you don't make it an explicitly visible union access at each point, you 
> will get wrong code, and this is pretty much unsolvable in general if you 
> want tbaa to work at all.
>
> Because otherwise foo, bar1, and bar2 could all be in different translation 
> units, in which case, you just destroyed all usefulness of tbaa because it 
> has to assume all pointers can conflict with all pointers :)


To expand a bit:
It's trivial to make the fact that they are in a union invisible.

In the above example, assuming you make the union accesses explicit, the union 
accesses themselves have to *always* conflict with all types in the union , and 
now you can do that if you know it's a union access., but you couldn't 
otherwise.
IE this is the precise reason that gcc/llvm have required the union accesses be 
explicit - to associate them with the right tbaa type.

Past that, you aren't going to be able to usefully differentiate between the 
current effective type of a union.

There are a few other points where the effective type may change, but i believe 
the frontend does know about them (IE placement new, etc).


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727499, @hfinkel wrote:

> In https://reviews.llvm.org/D31885#727371, @efriedma wrote:
>
> > In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:
> >
> > > I'm not sure this is the right way to do this; I suspect we're lumping 
> > > together a bunch of different bugs:
> > >
> > > 1. vector types need to have tbaa which makes them alias with their 
> > > element types [to be clear, as vector types are an implementation 
> > > extension, this is our choice; I believe most users expect this to be 
> > > true, but I'm certainly open to leaving this as-is (i.e. the vector types 
> > > and scalar types as independent/non-aliasing)].
> > > 2. tbaa can't be used for write <-> write queries (only read <-> write 
> > > queries) because the writes can change the effective type
> > > 3. our 'struct' path TBAA for unions is broken (and to fix this we need 
> > > to invert the tree structure, etc. as discussed on the list)
> >
> >
> > See https://bugs.llvm.org/show_bug.cgi?id=28189 for a testcase for (2) for 
> > this which doesn't involve unions.
>
>
> Yes, this is what I had in mind. However, we may just want to not handle this 
> at all. The demonstration you provide:
>
>   #include 
>   #include 
>   #include 
>   int f(int* x, int *y) {
> *x = 10;
> int z = *y;
> *(float*)x = 1.0;
> return z;
>   }
>   int (*ff)(int*,int*) = f;
>   int main() {
> void* x = malloc(4);
> printf("%d\n", ff(x, x));
>   }
>   
>
> shows that the problem is more than I implied. To support this, we not only 
> need to ignore the TBAA between the two writes (*x and *(float*)x), but also 
> between the float write and the preceding int read. I wonder how much of TBAA 
> we could keep at all and still support this. Thoughts?


The standard is just kind of broken here, 
It assumes that you can assign effective types at object creation points, and 
track them for all time.
But you can't.  f() could be in a different translation unit, but it still 
needs to be allowed to assume that the int *and the float *can't possible 
conflict.  Otherwise, tbaa is useless.  This is even now codified for unions 
after many years.
What is being demonstrated is just another way to achieve the same problem was 
fixed by requiring union accesses to be explicit, and so i'd say it should have 
the same resolution:
Such an effective type change must be more explicit than "i allocated typeless 
memory, and so i can do what i want with it".
Because we can't *ever* make that work.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727519, @dberlin wrote:

> In https://reviews.llvm.org/D31885#727499, @hfinkel wrote:
>
> > In https://reviews.llvm.org/D31885#727371, @efriedma wrote:
> >
> > > In https://reviews.llvm.org/D31885#727167, @hfinkel wrote:
> > >
> > > > I'm not sure this is the right way to do this; I suspect we're lumping 
> > > > together a bunch of different bugs:
> > > >
> > > > 1. vector types need to have tbaa which makes them alias with their 
> > > > element types [to be clear, as vector types are an implementation 
> > > > extension, this is our choice; I believe most users expect this to be 
> > > > true, but I'm certainly open to leaving this as-is (i.e. the vector 
> > > > types and scalar types as independent/non-aliasing)].
> > > > 2. tbaa can't be used for write <-> write queries (only read <-> write 
> > > > queries) because the writes can change the effective type
> > > > 3. our 'struct' path TBAA for unions is broken (and to fix this we need 
> > > > to invert the tree structure, etc. as discussed on the list)
> > >
> > >
> > > See https://bugs.llvm.org/show_bug.cgi?id=28189 for a testcase for (2) 
> > > for this which doesn't involve unions.
> >
> >
> > Yes, this is what I had in mind. However, we may just want to not handle 
> > this at all. The demonstration you provide:
> >
> >   #include 
> >   #include 
> >   #include 
> >   int f(int* x, int *y) {
> > *x = 10;
> > int z = *y;
> > *(float*)x = 1.0;
> > return z;
> >   }
> >   int (*ff)(int*,int*) = f;
> >   int main() {
> > void* x = malloc(4);
> > printf("%d\n", ff(x, x));
> >   }
> >   
> >
> > shows that the problem is more than I implied. To support this, we not only 
> > need to ignore the TBAA between the two writes (*x and *(float*)x), but 
> > also between the float write and the preceding int read. I wonder how much 
> > of TBAA we could keep at all and still support this. Thoughts?
>
>
> The standard is just kind of broken here, 
>  It assumes that you can assign effective types at object creation points, 
> and track them for all time.
>  But you can't.  f() could be in a different translation unit, but it still 
> needs to be allowed to assume that the int *and the float *can't possible 
> conflict.  Otherwise, tbaa is useless.  This is even now codified for unions 
> after many years.
>  What is being demonstrated is just another way to achieve the same problem 
> was fixed by requiring union accesses to be explicit, and so i'd say it 
> should have the same resolution:
>  Such an effective type change must be more explicit than "i allocated 
> typeless memory, and so i can do what i want with it".
>  Because we can't *ever* make that work.


In particular, i could have:

foo.c:

  #include 
  #include 
  #include 
  int f(int* x, int *y) {
*x = 10;
int z = *y;
*(float*)x = 1.0;
return z;
  }

bar.c:

  extern int f(int *, int *)
  
  int (*ff)(int*,int*) = f;
  int main() {
void* x = malloc(4);
printf("%d\n", ff(x, x));
  }

In foo.c, there is no information you could ever use to tell you the *(float*) 
is legal or illegal.  That is just a normal function :)

Thus, GCC, et all have taken the position that if you change the effective 
type, it must be completely visible in all cases.
Anything else would have rules that change when you inline, do LTO, etc
This where the "must make union accesses visible" came from.  I'm not sure what 
we would do for malloc+memcpy'd memory, but i assume something similar: It must 
be completely visible that it came from typeless memory, in all cases, and then 
we could tag it properly.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-14 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#727529, @efriedma wrote:

> > Such an effective type change must be more explicit than "i allocated 
> > typeless memory, and so i can do what i want with it".
>
> How can you change the effective type of malloc'ed memory in C, if storing a 
> value of a new type doesn't have any effect?  memset?  A new C language 
> feature?  Something else?


memcpy is the traditional way, but in some sense it doesn't matter.
What you've pointed out is a clear standards bug in the sense that there *is no 
way* to implement it without removing TBAA from the standard.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D30538: Add documentation for -fno-strict-aliasing

2017-03-02 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D30538#690699, @hans wrote:

> +1 for documenting this, but I have to leave it to the language lawyers for 
> how to fomulate it.
>
> > Enables/disables the strict aliasing assumption, which assumes that objects 
> > of different types do not share the same location in memory.
>
> I think it needs to say incompatible types at least (and `char` and `unsigned 
> char` are also special). And isn't it really about pointers -- the compiler 
> assumes that when dereferncing two pointers of incompatible types, those 
> pointers do not refer to the same memory?
>
> > clang does not allow "type-punning" by writing and reading from different 
> > union members
>
> I thought clang does allow type-punning through unions, as long as it's in a 
> single function, but that it fails when things get more complicated.


The limitation is really "union accesses must be visibly through a union, and 
if you try to trick the compiler, you will lose".
That is what we meant to allow, it just is still broken *anyway*

> +dannyb who enjoys this stuff ;)

You are so mean :)


https://reviews.llvm.org/D30538



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#730072, @rjmccall wrote:

> Thanks for CC'ing me.  There are two problems here.
>
> The second is that our alias-analysis philosophy says that the fact that two 
> accesses "sufficiently obviously" can alias should always override TBAA.


I'm not sure i agree.
We have made explicit exceptions for parts of the standard that are nonsensical 
(and they've been slowly updated in newer standards to be less nonsensical).  
Outside of that, i don't believe that is our philosophy, nor would i believe it 
should be.
There is no "do what i mean" here we can define that makes sense.

>   I have a hard time seeing what about the test cases I'm seeing makes it 
> insufficiently obvious that the accesses alias.  The  accesses are ultimately 
> derived from the same pointer; either the analysis is capable of counting to 
> 32, in which case it should see that they definitely alias, or it is not, in 
> which can it needs to assume that they might.  Perhaps the existing AA 
> interface is just inadequate?  It seems to me that it can't express that 
> middle-ground of "I have a concrete reason to think that these pointers are 
> related but cannot answer for certain whether they overlap".  Because if it 
> could just answer that rather than having to downgrade all the way to 
> MayAlias, it seems clear to me that (1) most memory transforms would treat it 
> exactly the same as MayAlias but (2) it should be considered strong enough to 
> suppress TBAA.

Just to note: Even assuming this wasn't crazytown, we certainly can't do it in  
the backend :)

The answer in such a case is to disable TBAA on those accesses and not emit it.
That is the direction we are moving, because trying to implement any rule about 
what "sufficiently obvious" means in the backend is a losing game.
It has no idea what it is doing, nor can it.
TBAA in the backend is a misnomer.
It is just walking a tree of hierarchical sets.   It has literally no idea what 
any of those sets mean, nor should it.

> John.




Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#730766, @rjmccall wrote:

> In https://reviews.llvm.org/D31885#730539, @dberlin wrote:
>
> > In https://reviews.llvm.org/D31885#730072, @rjmccall wrote:
> >
> > > Thanks for CC'ing me.  There are two problems here.
> > >
> > > The second is that our alias-analysis philosophy says that the fact that 
> > > two accesses "sufficiently obviously" can alias should always override 
> > > TBAA.
> >
> >
> > I'm not sure i agree.
> >  We have made explicit exceptions for parts of the standard that are 
> > nonsensical (and they've been slowly updated in newer standards to be less 
> > nonsensical).  Outside of that, i don't believe that is our philosophy, nor 
> > would i believe it should be.
> >  There is no "do what i mean" here we can define that makes sense.
>
>
> There was a deliberate decision to make TBAA conservative about type punning 
> in LLVM because, in practice, the strict form of TBAA you think we should 
> follow has proven to be a failed optimization paradigm: it just leads users 
> to globally disable TBAA, which is obviously a huge loss.  This was never 
> motivated purely by the nonsensicality of the standardization of unions.


I think you have completely missed my point.  i have not said we should be 
super strict, i'm saying "there is no DWIM" and "we can't do it in the 
backend". work.

> (And the nonsensicality of the standard very much continues.  The code that 
> we've all agreed is obviously being miscompiled here is still a strict 
> violation of the "improved" union rules in C++, and if we decide to treat it 
> as ill-formed we're going to break a massive amount of code (and vector code 
> in particular heavily uses unions), because trust me, nobody is reliably 
> placement-newing union fields when they're non-trivial.  C++'s new rules are 
> unworkable in part because they rely on C++-specific features that cannot be 
> easily adopted in headers which need to be language-neutral.)
> 
>>>   I have a hard time seeing what about the test cases I'm seeing makes it 
>>> insufficiently obvious that the accesses alias.  The  accesses are 
>>> ultimately derived from the same pointer; either the analysis is capable of 
>>> counting to 32, in which case it should see that they definitely alias, or 
>>> it is not, in which can it needs to assume that they might.  Perhaps the 
>>> existing AA interface is just inadequate?  It seems to me that it can't 
>>> express that middle-ground of "I have a concrete reason to think that these 
>>> pointers are related but cannot answer for certain whether they overlap".  
>>> Because if it could just answer that rather than having to downgrade all 
>>> the way to MayAlias, it seems clear to me that (1) most memory transforms 
>>> would treat it exactly the same as MayAlias but (2) it should be considered 
>>> strong enough to suppress TBAA.
>> 
>> Just to note: Even assuming this wasn't crazytown, we certainly can't do it 
>> in  the backend :)
> 
> Well, thanks for calling me crazy, I guess.

???
I didn't call you crazy, and apologies if you took it that way.
I meant the approach of trying to do it in the backend is crazy and doesn't 
work.

> John.




Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.



> Your proposal to we simply drop TBAA from all accesses related to unions is 
> extremely conservative.  It means that an access through a union has to be 
> treated as potentially aliasing every other visible access, at least in terms 
> of their types.  That level of conservatism is not necessary in this case if 
> we hew to the original understanding of LLVM's TBAA that punning is only 
> allowed when it is sufficiently obvious.  If you are interested in making a 
> strict definition of "sufficiently obvious", and then arguing that it cannot 
> possibly work, we can have that conversation; but right now I'm not seeing 
> anything that says that our approach is basically broken, other you 
> repeatedly asserting it because you think TBAA should be capable of 
> completely standing alone.

Did you look at the large number of bugs that prompted this patch and 
discussions on the mailing list?

> My proposal is that (1) we should correctly encode the aliasability of vector 
> types in TBAA metadata and, separately, (2) we should also ensure that TBAA 
> is correctly overridden by BasicAA in this case, which may require 
> introducing an intermediate level of response between MayAlias and MustAlias.

How do you see #2 working?
RIght now, it is overridden if TBAA says no-alias and basicaa says must-alias.

It's quite literally not possible for basicaa to detect all cases it should 
override.  If you want paper references on the strict breakdown of what can be 
decided statically, i'm happy to provide them.
Otherwise, there are plenty of bugs filed with real-world examples already.
So it doesn't now, it can't in the future.  This is the source of a number of 
those bugs.
If BasicAA could prove no-alias, it wouldn't' need to override, so you can't 
just override any may-alias cases.  It can't detect all must-aliasing, or even 
anything approaching them.  
It doesn't work in probabilities, and llvm's type system is unrelated to the C 
level one, so it can't even flag those things that might be "bad", so there is 
nothing it can answer in between.
This opposed to doing it in the front end, where it knows *all* of this, and 
could simply stop telling the backend two things don't alias when you want it 
to do that.

So i'd like to understand how you see it "correctly overriding in BasicAA". 
There is *nothing*, in any of these accesses or examples, at the LLVM IR level, 
that tells us we should override it.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#730909, @rjmccall wrote:

> In https://reviews.llvm.org/D31885#730853, @hfinkel wrote:
>
> > > There was a deliberate decision to make TBAA conservative about type 
> > > punning in LLVM because, in practice, the strict form of TBAA you think 
> > > we should follow has proven to be a failed optimization paradigm: it just 
> > > leads users to globally disable TBAA, which is obviously a huge loss. 
> > > This was never motivated purely by the nonsensicality of the 
> > > standardization of unions.
> >
> > Okay, so the problem right now is that we're **not** conservative regarding 
> > union type punning, and while BasicAA sometimes fixes this for us, it 
> > doesn't always, and can't always, and so now we're miscompiling code.
>
>
> I agree that BasicAA "can't always", but I'm not sure it "can't often 
> enough".  Again, it seems to me that the low-level problem is just that the 
> AA interface isn't designed for what we're trying to do with TBAA.  
> "MayAlias" is the default answer for everything that can't be proven one way 
> or the other, and "MustAlias" demands that the memory be actually known to 
> overlap.  If there were an intermediate "LikelyAlias" for pointers that are 
> known to be related but BasicAA just doesn't know for certain whether the 
> accesses overlap, then TBAA would turn itself off in those cases as long as 
> at least a basic value-propagation pass has been run.  That would put us on 
> much firmer ground to say "it's reasonable for the compiler to assume that 
> these things don't alias, and if you're going to use type-punning like this, 
> you just need to rewrite your code to make it more obvious that the pointers 
> are related".  When you're given a language rule as weak as this, especially 
> one that's so frequently violated, that's the only kind of argument which is 
> going to have any traction with users.
>
> John.




In https://reviews.llvm.org/D31885#730909, @rjmccall wrote:

> In https://reviews.llvm.org/D31885#730853, @hfinkel wrote:
>
> > > There was a deliberate decision to make TBAA conservative about type 
> > > punning in LLVM because, in practice, the strict form of TBAA you think 
> > > we should follow has proven to be a failed optimization paradigm: it just 
> > > leads users to globally disable TBAA, which is obviously a huge loss. 
> > > This was never motivated purely by the nonsensicality of the 
> > > standardization of unions.
> >
> > Okay, so the problem right now is that we're **not** conservative regarding 
> > union type punning, and while BasicAA sometimes fixes this for us, it 
> > doesn't always, and can't always, and so now we're miscompiling code.
>
>
> I agree that BasicAA "can't always", but I'm not sure it "can't often 
> enough".  Again, it seems to me that the low-level problem is just that the 
> AA interface isn't designed for what we're trying to do with TBAA.  
> "MayAlias" is the default answer for everything that can't be proven one way 
> or the other, and "MustAlias" demands that the memory be actually known to 
> overlap.  If there were an intermediate "LikelyAlias" for pointers that are 
> known to be related but BasicAA just doesn't know for certain whether the 
> accesses overlap, then TBAA would turn itself off in those cases as long as 
> at least a basic value-propagation pass has been run.  That would put us on 
> much firmer ground to say "it's reasonable for the compiler to assume that 
> these things don't alias, and if you're going to use type-punning like this, 
> you just need to rewrite your code to make it more obvious that the pointers 
> are related".  When you're given a language rule as weak as this, especially 
> one that's so frequently violated, that's the only kind of argument which is 
> going to have any traction with users.
>
> John.


Just so i understand: Ignoring everything else (we can't actually make 
likelyalias work, i think the code in the bugs makes that very clear), you also 
believe we should effectively pessimize every other language that generates 
correct TBAA info for LLVM  and will now no longer get optimized well because 
we've decided not to have clang emit TBAA metadata only in the cases where  it 
actually wants TBAA applied?


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D31885: Remove TBAA information from LValues representing union members

2017-04-19 Thread Daniel Berlin via Phabricator via cfe-commits
dberlin added a comment.

In https://reviews.llvm.org/D31885#730936, @rjmccall wrote:

> In https://reviews.llvm.org/D31885#730920, @dberlin wrote:
>
> > Just so i understand: Ignoring everything else (we can't actually make 
> > likelyalias work, i think the code in the bugs makes that very clear),
>
>
> None of the code in the bugs I've seen makes that very clear, but perhaps I 
> just missed the compelling example.
>
> > you also believe we should effectively pessimize every other language that 
> > generates correct TBAA info for LLVM  and will now no longer get optimized 
> > well because we've decided not to have clang emit TBAA metadata only in the 
> > cases where  it actually wants TBAA applied?
>
> Are you under the impression that I'm proposing something new and that TBAA 
> does not currently defer to BasicAA?


Well, actually, chandler changed how it works, so it doesn't actually work 
quite the way it used to but it has the same effect ;)
(It used to be explicit deferral, it is not explicit, neither depends on the 
other any more.  It just happens to be how the default AA pipeline is ordered 
right now)
I'm quite aware of the machinations of AA in LLVM :)

I'm saying "this approach does not work well now" (ie see bugs) , and i don't 
think continuing to try to make the contract even *more* incestuous is better 
than trying to make it *less*.

I see literally no advantage to doing so, and plenty of disadvantage 
(pessimization of other languages, continuation of a pipeline ordering that 
requires overrides, etc)

IE this is a thing we should be fixing, over time, to be a cleaner and better 
design, that does not make clang special here, and fixes these bugs.
Others generate tbaa metadata where they want it, and avoid it where they want 
basicaa to make a decision.

I have trouble seeing, why, if folks are willing to do that work, one would be 
opposed to it.
You seem to claim better optimization opportunity.
I ran numbers.
Across all things in llvm's benchmark suite, disabling unions entirely for tbaa 
causes no regressions on x86 (I could try more platforms if you like).

I also  ran it across 50 third party packages and spec2006 that use strict 
aliasing (These are just those things i could quickly test that had perf 
benchmarks)

If it really caused any real perf loss, we could always define metadata to say 
these accesses are noalias if we can't prove they are mustalias.
We could even use that metadata to explicitly try to have basicaa try much 
harder than it does now to prove must-aliasing (for example, ignore depth 
limits), since it would likely be too expensive to try it for everything.

as an aside, I do not believe you could reasonably apply this to unions  
because it's not possible to tell people what to do differently when it's *llvm 
ir* that has the issue.

IE i can't reasonably tell users "please stop doing things that don't generate 
explicit enough IR".
Especially when those answers change with inlining decisions, etc.

If the frontend did it, i would likely have something reasonably to tell them.


Repository:
  rL LLVM

https://reviews.llvm.org/D31885



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits