On 06/18/2018 09:37 AM, Qing Zhao wrote:
> Hi,
> 
> this is the 3rd (and the last) patch for PR78809:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the third part of change of PR78809.
> 
> C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
>    if the result is NOT used to do simple equality test against zero, one of
> "s1" or "s2" is a small constant string, n is a constant, and the Min value of
> the length of the constant string and "n" is smaller than a predefined
> threshold T,
>    inline the call by a byte-to-byte comparision sequence to avoid calling
> overhead.
> 
> adding test case strcmpopt_5.c into gcc.dg for part C of PR78809.
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> 
> I have done some experiments to check the run-time performance impact 
> and code-size impact from such inlining with different threshold for the
> length of the constant string to inline, the data were recorded in the 
> attachments of 
> PR78809:
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809.
> 
> and finally decide that the default value of the threshold is 3. 
> this value of the threshold can be adjusted by the new option:
> 
> --param builtin-string-cmp-inline-length=
> 
> The following is the patch.
> 
> thanks.
> 
> Qing
> 
> gcc/ChangeLog:
> 
> +2018-06-18  Qing Zhao  <qing.z...@oracle.com>
> +
> +       PR middle-end/78809
> +       * builtins.c (expand_builtin_memcmp): Inline the calls first
> +       when result_eq is false.
> +       (expand_builtin_strcmp): Inline the calls first.
> +       (expand_builtin_strncmp): Likewise.
> +       (inline_string_cmp): New routine. Expand a string compare 
> +       call by using a sequence of char comparison.
> +       (inline_expand_builtin_string_cmp): New routine. Inline expansion
> +       a call to str(n)cmp/memcmp.
> +       * doc/invoke.texi (--param builtin-string-cmp-inline-length): New 
> option.
> +       * params.def (BUILTIN_STRING_CMP_INLINE_LENGTH): New.
> +
> 
> gcc/testsuite/ChangeLog:
> 
> +2018-06-18  Qing Zhao  <qing.z...@oracle.com>
> +
> +       PR middle-end/78809
> +       * gcc.dg/strcmpopt_5.c: New test.
So I still need to dig into this patch.  But I wanted to raise an
potential issue and get yours and Martin's thoughts on it.

Martin (and others) have been working hard to improve GCC's ability to
give good diagnostics for various problems with calls to mem* and str*
functions (buffer overflows, restrict issues, etc).

One of the problems Martin has identified is early conversion of these
calls into inlined direct operations.  If those conversions happen prior
to the analysis for warnings we obviously can't issue any relevant warnings.


> +
> 
> 
> 
> 0001-3nd-Patch-for-PR78009.patch
> 
> 
> From fcf6ced44e6e3e4a14894cc8f43ac39bc17aafee Mon Sep 17 00:00:00 2001
> From: qing zhao <qing.z...@oracle.com>
> Date: Thu, 14 Jun 2018 14:32:46 -0700
> Subject: [PATCH] 3nd Patch for PR78009
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78809
> Inline strcmp with small constant strings
> 
> The design doc for PR78809 is at:
> https://www.mail-archive.com/gcc@gcc.gnu.org/msg83822.html
> 
> this patch is for the third part of change of PR78809.
> 
> C. for strcmp (s1, s2), strncmp (s1, s2, n), and memcmp (s1, s2, n)
>    if the result is NOT used to do simple equality test against zero, one of
> "s1" or "s2" is a small constant string, n is a constant, and the Min value of
> the length of the constant string and "n" is smaller than a predefined
> threshold T,
>    inline the call by a byte-to-byte comparision sequence to avoid calling
> overhead.
> 
> adding test case strcmpopt_5.c into gcc.dg for part C of PR78809.
> 
> bootstraped and tested on both X86 and Aarch64. no regression.
> ---
>  gcc/builtins.c                     | 172 
> +++++++++++++++++++++++++++++++++++--
>  gcc/doc/invoke.texi                |   5 ++
>  gcc/params.def                     |   4 +
>  gcc/testsuite/gcc.dg/strcmpopt_5.c |  80 +++++++++++++++++
>  4 files changed, 253 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/strcmpopt_5.c
> 
> diff --git a/gcc/builtins.c b/gcc/builtins.c
> index 6b3e6b2..cf50d05 100644
> --- a/gcc/builtins.c
> +++ b/gcc/builtins.c
> @@ -4358,6 +4360,17 @@ expand_builtin_memcmp (tree exp, rtx target, bool 
> result_eq)
>                        POINTER_TYPE, POINTER_TYPE, INTEGER_TYPE, VOID_TYPE))
>      return NULL_RTX;
>  
> +  /* due to the performance benefit, always inline the calls first 
> +     when result_eq is false.  */
Please capitalize the first word in sentences like this.  This nit
appears in most of your comments.


> +  rtx result = NULL_RTX;
> +   
> +  if (!result_eq) 
> +    {
> +      result = inline_expand_builtin_string_cmp (exp, target, true);
> +      if (result)
> +     return result;
> +    }
> +
So I believe you do inline expansion here prior to the checks for
Wstringop_overflow.  I think you can safely move this code to after the
warn_stringop_overflow checks.  Though you may want to make this code
conditional on both calls to check_access returning true and avoiding
your transformation if either or both calls return false.

Alternately you'd need to verify that inline_expand_builtin_string_cmp
always returns false for cases which are going to generate a warning.
But that seems a bit tougher to maintain over time if we were to add
more warnings to this code.



> @@ -4448,6 +4461,12 @@ expand_builtin_strcmp (tree exp, ATTRIBUTE_UNUSED rtx 
> target)
>    if (!validate_arglist (exp, POINTER_TYPE, POINTER_TYPE, VOID_TYPE))
>      return NULL_RTX;
>  
> +  /* due to the performance benefit, always inline the calls first.  */
> +  rtx result = NULL_RTX;
> +  result = inline_expand_builtin_string_cmp (exp, target, false);
> +  if (result)
> +    return result;
Similarly.



> @@ -4562,6 +4580,12 @@ expand_builtin_strncmp (tree exp, ATTRIBUTE_UNUSED rtx 
> target,
>                        POINTER_TYPE, POINTER_TYPE, INTEGER_TYPE, VOID_TYPE))
>      return NULL_RTX;
>  
> +  /* due to the performance benefit, always inline the calls first.  */
> +  rtx result = NULL_RTX;
> +  result = inline_expand_builtin_string_cmp (exp, target, false);
> +  if (result)
> +    return result;
Similarly here.

> @@ -6640,6 +6664,138 @@ expand_builtin_goacc_parlevel_id_size (tree exp, rtx 
> target, int ignore)
>    return target;
>  }
>  
> +/* Expand a string compare operation using a sequence of char comparison 
> +   to get rid of the calling overhead.
> +
> +   var_str is the variable string source;
> +   const_str is the constant string source;
> +   length is the number of chars to compare;
> +   const_str_n indicates which source string is the constant string;
> +   is_memcmp indicates whether it's a memcmp or strcmp.
When referring to arguments in comments, please capitalize them.  ie
VAR_STR, CONST_STR, etc.


> +   
> +   to: (assume const_str_n is 2, i.e., arg2 is a constant string)
> +
> +   target = var_str[0] - const_str[0];
> +   if (target != 0)
> +     goto ne_label;
> +     ...
> +   target = var_str[length - 2] - const_str[length - 2];
> +   if (target != 0)
> +     goto ne_label;
> +   target = var_str[length - 1] - const_str[length - 1];
> +   ne_label:
> +  */
But leave them as lower case in code fragments like this.


So this generally looks pretty good.  THe biggest technical concern is
making sure we're doing the right thing WRT issuing warnings.  You can
tackle that problem by deferring inlining to a later point after
warnings have been issued or by verifying that your routines do not
inline in cases where warnings will be issued.  It may be worth adding
testcases for these issues.

There's a large number of comments that need capitalization fixes.

Given there was no measured runtime performance impact, but slight
improvements on codesize for values <= 3, let's go ahead with that as
the default.

Can you address the issues above and repost for final review?

Thanks,
jeff

Reply via email to