On Tue, Nov 8, 2011 at 10:45 AM, Richard Guenther
<[email protected]> wrote:
> On Tue, Nov 8, 2011 at 1:29 AM, Jeff Law <[email protected]> wrote:
>> On 11/07/11 15:53, Richard Guenther wrote:
>>> On Mon, Nov 7, 2011 at 10:25 PM, Jakub Jelinek <[email protected]>
>>> wrote:
>>>> Hi!
>>>>
>>>> This patch attempts to optimize VEC_BASE if we know that offsetof
>>>> of base is 0 (unless the compiler is doing something strange, it
>>>> is true). It doesn't have a clear code size effect, some .text
>>>> sections grew, supposedly because of more inlining, some .text
>>>> sections shrunk.
>>>>
>>>> Bootstrapped/regtested on x86_64-linux and i686-linux.
>>>
>>> I wonder why the compiler doesn't optimize this ... certainly it
>>> looks backward to, in
>>>
>>> <bb 2>:
>>>   if (c_2(D) != 0B) goto <bb 3>; else goto <bb 4>;
>>>
>>> <bb 3>:
>>>   D.2948_3 = &c_2(D)->fld;
>>>   goto <bb 5>;
>>>
>>> <bb 4>:
>>>   D.2948_4 = 0B;
>>>
>>> <bb 5>:
>>>   # D.2948_1 = PHI <D.2948_3(3), 0B(4)>
>>>   return D.2948_1;
>>>
>>> see that D.2948_4 is equal to D.2948_3 for c_2 == 0, so I'm not
>>> sure which pass would be able to detect this (but the optimization
>>> opportunity would be on the PHI node, so maybe it should be done in
>>> phiopt).
>> ?!? When c2 == 0 the return value is supposed to be zero, that's one
>> of the fundamental problems with the way we've defined VEC_BASE.
>>
>> In fact cases where we immediately dereference VEC_BASE are precisely
>> what got me looking at the executable path optimization.
>>
>> Assuming this gets inlined and the result is used in a memory
>> dereference, the new pass will do exactly what we want. Namely it'll
>> determine that BB4 can never be executed at runtime and it's control
>> dependent on the edge 2->4. It zaps the edge 2->4, cleaning up the
>> conditional in the process. That makes BB4 unreachable and BB2, BB3
>> and BB5 mergeable and everything collapses into one simple assignment.
>
> But there is no dereference in the code above - &c->base is an
> address computation. But we can still optimize
>
> if (c)
>   return &c->base;
> return NULL;
>
> to
>
> return &c->base;
>
> if &c->base == NULL iff c == NULL.
>
> So I think this is orthogonal to any undefinedness of dereferencing.
>
> The above pattern occurs frequently so that the computed address
> is either a valid dereferenceable address or NULL.
Thus, a similar testcase would be

int f(int i)
{
  if (i)
    return i;
  return 0;
}

phiopt optimizes that, but it fails to optimize

struct C { int i; };
int *g(struct C *p)
{
  if (p)
    return &p->i;
  return (int *)0;
}

but that's also because we do not optimize &p->i to just p.
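
To make that last point concrete, here is a rough sketch of the fold done by hand. It is only valid because offsetof (struct C, i) is 0. The names g_checked and g_folded are made up for illustration and are not anything in GCC; the static assertion just documents the assumption the fold relies on.

```c
#include <assert.h>
#include <stddef.h>

struct C { int i; };

/* The fold below is only valid if the field sits at offset 0.  */
_Static_assert (offsetof (struct C, i) == 0, "i must be the first member");

/* The form from the testcase: explicit null check before taking
   the address of the field.  */
int *
g_checked (struct C *p)
{
  if (p)
    return &p->i;
  return (int *) 0;
}

/* Hypothetical hand-folded form: with the field at offset 0, the
   field address is null exactly when p is null, so both branches
   yield the same pointer value as p itself.  */
int *
g_folded (struct C *p)
{
  return (int *) p;
}
```

Both functions should return the same pointer for any p, null or not, which is the equivalence the PHI-level optimization would have to prove.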
Richard.
> Richard.
>