On 3/11/20 7:34 AM, Nidal Faour via Gcc-patches wrote:
> This patch is a code density oriented and attempt to remove redundant 
> sign/zero extension from assignment statement.
> The approach taken is to use VRP data while expanding the assignment to RTL 
> to determine whether a sign/zero extension is necessary.
> Thought the motivation of the patch is code density but it also good for 
> speed.
>
> for example:
> extern unsigned int func ();
>
>   unsigned char
>   foo (unsigned int arg)
>   {
>     if (arg == 2)
>       return 0;
>
>     return (func() == arg || arg == 7);
>   }
>
> the result of the comparison in the return will yield a False or True, which 
> will be converted to integer and then casting to unsigned char.
> this casting from integer to unsigned char is redundant because the value is 
> either 0 or 1.
>
> this patch is targeting the RISCV-32bit only.
> This patch has been tested on a real embedded project and saved about 0.2% of 
> code size.
>
> P.S. I have an FSF license under Western Digital incorporation.
> P.S. I've attached the patch as a file in case my email client corrupted the 
> patch itself
So at a high level, this really shouldn't be a target dependent
transformation AFAICT.  So I think all the hook related stuff should be
removed.  Furthermore, REE does a fair amount of this kind of stuff.  I
realize it doesn't handle the testcase you've included, but if you're
looking to improve GCC's ability to identify and remove redundant
extensions, you might want to get familiar with REE.  It's most glaring
deficiencies are that it doesn't know about SUBREGS which are a form of
extension and that it does not handle cases where a copy is needed and
there are multiple definitions that reach the extension we're trying to
eliminate.

Anyway, the goal of this patch appears to be to remove extensions that
result from expanding a NOP_EXPR (and the other conversion tree codes,
though NOP_EXPR will be the most common case we'll see).





> Attaching the correct patch. 
>
> diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
> index 9864e4344d2..c3739b2f331 100644
> --- a/gcc/cfgexpand.c
> +++ b/gcc/cfgexpand.c
> @@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public 
> License
>  along with GCC; see the file COPYING3.  If not see
>  <http://www.gnu.org/licenses/>.  */
>  
> +#include <algorithm>

Why did you need <algorithm>?  That seems odd at best.


> @@ -3640,6 +3643,107 @@ expand_clobber (tree lhs)
>      }
>  }
>  
> +/* Print the ranges of the operand and save it to the dump file
> +   used for debug purposes only.  */
> +
> +void
> +print_range (tree operand, wide_int min, wide_int max)
> +{
> +  pretty_printer buffer;
> +  pp_needs_newline (&buffer) = false;
> +  buffer.buffer->stream = dump_file;
> +  fprintf (dump_file, "Range for lhs: [");
> +  pp_wide_int (&buffer, min, TYPE_SIGN (TREE_TYPE (operand)));
> +  pp_printf (&buffer, ", ");
> +  pp_wide_int (&buffer, max, TYPE_SIGN (TREE_TYPE (operand)));
> +  pp_printf (&buffer, "]");
> +  pp_flush (&buffer);
> +  fprintf (dump_file, "\n");
> +}

I'm surprised we don't already have a range dumper you could use.  Also
note that you're querying global ranges.  You might consider using the
new ranger APIs which allow you to get context sensitive range
information which in turn may allow you to remove more useless extensions.


> +
> +
> +
> +/* Check that REG and SUBREG modes match between src and dest,
> +   and that there is a cast to be removed.  */
> +
> +bool
> +cast_to_remove_p (rtx src, rtx dest, machine_mode from
> +                             , machine_mode to)
> +{
> +  if (GET_CODE (src) != SUBREG
> +      || GET_CODE (dest) != SUBREG
> +      || GET_CODE (SUBREG_REG (src)) != REG
> +      || GET_CODE (SUBREG_REG (dest)) != REG
> +      || GET_MODE (src) != GET_MODE (dest)
> +      || GET_MODE (SUBREG_REG (src)) != GET_MODE (SUBREG_REG (dest))
> +      || GET_MODE (src) != to
> +      || GET_MODE (SUBREG_REG (src)) != from)
> +    return false;
> +  return true;

This seems highly restrictive.  I don't see why the DEST would need to
be a SUBREG expression.  ISTM you'd want to remove something like this,
though your patch won't:


(set (reg:SI) (zero_extend:SI (subreg:QI (reg:SI))))


You may also need to check SUBREG_BYTE here.  I'm pretty sure you need
to be more careful for big endian targets as well.



> +}
> +
> +bool
> +remove_cast_p (rtx temp, rtx target, gassign *assign_stmt)
> +{
> +  enum gimple_code code;
> +  value_range_kind lhs_range, rhs_range;
> +  unsigned int ops_num;
> +  wide_int lhs_min, lhs_max, rhs_min, rhs_max;
> +  if (gimple_assign_cast_p (assign_stmt))
> +  {
> +    code = gimple_code (assign_stmt);
> +    ops_num = gimple_num_ops (assign_stmt);
> +    if (code == GIMPLE_ASSIGN && ops_num < 3)

Is there any case where we get more than 2 ops here?  If it's a cast,
then I would think not.



> +    {
> +      tree lhs = gimple_assign_lhs (assign_stmt);
> +      if (POINTER_TYPE_P (TREE_TYPE (lhs)))
> +       return false;
> +      lhs_range = get_range_info (lhs, &lhs_min, &lhs_max);
> +      if (dump_file && lhs_range == VR_RANGE)
> +       print_range (lhs, lhs_min, lhs_max);

I'm not sure why you need the range of the LHS.  What you really care
about is the size of the destination object, not the ranges it may have.



> +
> +      tree rhs = gimple_assign_rhs1 (assign_stmt);
> +      if (POINTER_TYPE_P (TREE_TYPE (rhs)))
> +       return false;
> +      rhs_range = get_range_info (rhs, &rhs_min, &rhs_max);
> +      if (rhs_range == VR_RANGE)
> +      {
> +       unsigned int rhs_precision
> +      = std::max (wi::min_precision (rhs_max, TYPE_SIGN (TREE_TYPE (rhs))),
> +             wi::min_precision (rhs_min, TYPE_SIGN (TREE_TYPE (rhs))));
> +       unsigned int lhs_precision = TYPE_PRECISION (TREE_TYPE (lhs));
> +       if (lhs_precision > rhs_precision)
> +       {
> +
> +      if (dump_file)
> +      {
> +        print_range (rhs, rhs_min, rhs_max);
> +        if (lhs_precision > rhs_precision)
> +             fprintf (dump_file,
> +                 "EXPAND-CAST: This casting can be optimized\n");
> +      }
> +
> +      return cast_to_remove_p (temp,
> +                      target,
> +                      TYPE_MODE (TREE_TYPE (rhs)),
> +                      TYPE_MODE (TREE_TYPE (lhs)));
> +       }

It appears to me that you always remove the extension when the LHS
precision is more than the RHS precision.   That should work for zero
extension, but I'm pretty sure its wrong for sign extension, but I don't
see where you filter out any sign extending casts.


For the sign extension case, you want to check the range of the source
object to see if its sign bit of the narrowing subreg's mode is on in
the source range. 


You might also look at Kugan Vivekanandara's patches from 2015.  I think
they stalled out as they weren't ready for that release and I don't
recall Kugan re-engaging on them once we got past that year's
release.    But there may be useful nuggets in those patches.

Jeff

Reply via email to