Re: [RFC] Return Value Propagation in IPA-CP

2024-09-09 Thread Dhruv Chawla via Gcc

Ping (https://gcc.gnu.org/pipermail/gcc/2024-August/244625.html).

On 22/08/24 11:32, Dhruv Chawla via Gcc wrote:

External email: Use caution opening links or attachments


* Table Of Contents *

- Introduction
- Motivating Test Cases
- Proposed Solution
- Other Options
- Existing Solutions
- Primary Optimizations To Be Improved
- Front-End Attribute For Recording Return-Value Information
- Future Work
- References
- Questions

* Introduction *

Return jump functions offer a way of modeling return values of a function in
terms of its formal parameters, similar to how regular jump functions can be
used to model the actual parameters of a callee at a callsite in terms of the
formal parameters of the caller.

By discovering this information and propagating it across the program callgraph
in LTO mode, various transformations like value-range propagation, sibling-call
and chaining-call optimization can be augmented to take advantage of it and
potentially perform better optimizations.

Currently, ipa-cp models parameters using jump functions, propagates the values
of those jump functions by iterating to a fixed-point using lattices, and
clones functions by taking decisions from those lattice values.

We propose extending ipa-cp with the analysis and propagation of return jump
functions, starting with the basic case of a direct return of the form
"return x;" where "x" is any formal parameter of the function.

* Motivating Test Cases *

* Optimizing tail calls:

- PR92867

char buf[128] = { 1 };

char *foo (__SIZE_TYPE__ n)
{
   return __builtin_memset (buf, ' ', n);
}

char *bar (__SIZE_TYPE__ n)
{
   __builtin_memset (buf, ' ', n);
   return buf;
}

Here, the call to __builtin_memset in foo gets successfully converted to a
tail-call, however the call in bar cannot be optimized to a tail-call. This is
an optimization that LLVM is able to perform.

Link to compiler explorer: https://godbolt.org/z/E81axqWTb

- PR67797

#include 

void *my_func(void *s, size_t n)
{
     memset(s, 0, n);
     return s;
}

This is a similar case to the above example. LLVM is able to optimize the call
to memset to be a tail-call as well.

Link to compiler explorer: https://godbolt.org/z/YzjGc59f8

* Optimizing chaining calls:

#include 

void f (int x, int y)
{
   std::cout << x;
   std::cout << y;
}

This function can be optimized to the following:

#include 

void f (int x, int y)
{
   std::cout << x << y;
}

LLVM is able to perform this optimization as well:
https://godbolt.org/z/MW75on1o5

* Proposed Solution *

* Extending IPA-CP:

1. Add return jump function data structures
   - This involves updating ipa_node_params to contain information regarding the
     return statements of the function, namely the lattice and the jump function
     describing the return value, where both use existing data structures.
   - The ipa_node_params class is reused to avoid introducing a new class and a
     corresponding function_summary type, though it is trivial to add if deemed
     a better solution. The ipa_return_value_summary class may also be a good
     place for this information, however using it would involve moving it from
     ipa-prop.cc to ipa-prop.h.
   - Additionally, ipa_edge_args is updated to track whether or not it is a
     callgraph edge originating from a return statement. This enables the
     propagation of information in the WPA phase.

2. LGEN
   - Set up return jump functions for each function similar to the parameter
     jump function. When it cannot be conclusively determined that one formal
     parameter is always returned (such as conditionally returning either of
     two), the jump function is marked as invalid.
   - This involves looking through phi nodes when return statements of
     conditional branches are merged into a single exit block. Note that
     returning a call expression does not count towards invalidating the
     information for that function.

3. WPA
   - Implement return value information propagation. It is not possible to do
     this in the existing propagate_constants_topo function, because it iterates
     in reverse postorder i.e. from caller to callee. However return value
     propagation needs to be done from callee to caller, thus there is a need to
     iterate in a postorder fashion.
   - The lattice values are initialized using the jump functions computed in the
     LGEN phase. These values are then propagated over the callgraph.
   - The existing lattices are reused, with three possible values like usual.
     The possible values are:

   return_lattice -> { TOP, param_decl, BOTTOM }

     where TOP and BOTTOM are defined as usual. param_decl refers to the tree of
     either the parameter declaration or its type, whichever is available. The
     meet operator is defined as usual for TOP and BOTTOM. When both are
     param_decl, the following meet operation is defined:

   meet(x, y) = x if x == y, BOTTOM otherwise

   - Finally, nodes to which no information could be

Re: Proposed CHOST change for the 64bit time_t transition

2024-09-09 Thread Arsen Arsenović via Gcc
Jacob Bachmeyer  writes:

>> At that point, we should bump SONAME of libc and simply remove 32-bit
>> time support.  This would probably be okay generally.
>
> This is probably the best solution to this problem at hand, especially since
> the old ABI has a definite expiration date about 14 years from now.  Bump the
> libc SONAME major and hope that we can get rid of the last dependencies on the
> old SONAME before the deadline.  We will have 14 years to do it, if that arch
> is even still used then.

Indeed.  I believe the current thinking is that the existing software
for the old ABI could benefit from libc updates, hence not breaking it,
but.. it practically is somewhat broken already (hence the troubles that
lead to this thread).

>> [...]
>> But, in the case we don't do a bump, why not update the tuple?  This'd
>> allow easy communication of whether we have 32 bit time to all
>> components of the system, and, in lieu of a better detection mechanism,
>> it'd allow anyone at a glance to look at a hosts tuple and see whether
>> it is compatible with something based on the tuple it was built on.
>>   
>
> This is closely related to the idea I floated a year ago of redefining
> configuration tuples as lists of tags (with a canonical order) progressively
> narrowing a broad architecture.  Start with CPU architecture and work
> "narrower" from there.  In that system, adding "-t32" and "-t64" to indicate
> time_t width would be the simple solution.  In context then, it was to handle
> different libc choices on Windows.

Yes, that's about what I imagined as the effect of the suffix.
-- 
Arsen Arsenović


signature.asc
Description: PGP signature