https://gcc.gnu.org/g:bf85c4df80923c1afb4c2420aac252617fabb67c

commit r13-9569-gbf85c4df80923c1afb4c2420aac252617fabb67c
Author: Jakub Jelinek <ja...@redhat.com>
Date:   Wed Nov 6 10:21:09 2024 +0100

    store-merging: Don't use sub_byte_op_p mode for empty_ctor_p unless necessary [PR117439]
    
    encode_tree_to_bitpos uses the more expensive sub_byte_op_p mode in which
    it has to allocate a buffer and do various extra work like shifting the bits
    etc. if bitlen or bitpos aren't multiples of BITS_PER_UNIT, or if bitlen
    doesn't have a corresponding integer mode.
    The last case is explained later in the comments:
      /* The native_encode_expr machinery uses TYPE_MODE to determine how many
         bytes to write.  This means it can write more than
         ROUND_UP (bitlen, BITS_PER_UNIT) / BITS_PER_UNIT bytes (for example
         write 8 bytes for a bitlen of 40).  Skip the bytes that are not within
         bitlen and zero out the bits that are not relevant as well (that may
         contain a sign bit due to sign-extension).  */
    Now, we've later added empty_ctor_p support, either {} CONSTRUCTOR
    or {CLOBBER}, which doesn't use native_encode_expr at all, just memset,
    so that case doesn't need those fancy games unless bitlen or bitpos
    aren't multiples of BITS_PER_UNIT (unlikely, but let's pretend it is
    possible).
    
    The following patch makes us use the fast path even for empty_ctor_p
    stores which occupy full bytes; we can just memset those in the provided
    buffer and don't need to XALLOCAVEC another buffer.
    
    This patch in itself fixes the testcase from the PR (which was about using
    a huge XALLOCAVEC), but I want to do some other changes, to be posted in a
    follow-up patch.
    
    2024-11-06  Jakub Jelinek  <ja...@redhat.com>
    
            PR tree-optimization/117439
            * gimple-ssa-store-merging.cc (encode_tree_to_bitpos): For
            empty_ctor_p use !sub_byte_op_p even if bitlen doesn't have an
            integral mode.
    
    (cherry picked from commit aab572240a0752da74029ed9f8918e0b1628e8ba)

Diff:
---
 gcc/gimple-ssa-store-merging.cc | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple-ssa-store-merging.cc b/gcc/gimple-ssa-store-merging.cc
index 7243f1d289a5..4898f48ec523 100644
--- a/gcc/gimple-ssa-store-merging.cc
+++ b/gcc/gimple-ssa-store-merging.cc
@@ -1849,14 +1849,15 @@ encode_tree_to_bitpos (tree expr, unsigned char *ptr, int bitlen, int bitpos,
                       unsigned int total_bytes)
 {
   unsigned int first_byte = bitpos / BITS_PER_UNIT;
-  bool sub_byte_op_p = ((bitlen % BITS_PER_UNIT)
-                       || (bitpos % BITS_PER_UNIT)
-                       || !int_mode_for_size (bitlen, 0).exists ());
   bool empty_ctor_p
     = (TREE_CODE (expr) == CONSTRUCTOR
        && CONSTRUCTOR_NELTS (expr) == 0
        && TYPE_SIZE_UNIT (TREE_TYPE (expr))
-                      && tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (expr))));
+       && tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (expr))));
+  bool sub_byte_op_p = ((bitlen % BITS_PER_UNIT)
+                       || (bitpos % BITS_PER_UNIT)
+                       || (!int_mode_for_size (bitlen, 0).exists ()
+                           && !empty_ctor_p));
 
   if (!sub_byte_op_p)
     {
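
The effect of the reordered condition can be illustrated outside of GCC with a small standalone model. This is only a sketch under stated assumptions: kBitsPerUnit stands in for BITS_PER_UNIT on an 8-bit-byte target, and has_int_mode is a hypothetical replacement for int_mode_for_size (bitlen, 0).exists () that pretends the target has 8, 16, 32, 64 and 128-bit integer modes. None of these names come from the GCC sources.

```cpp
// Standalone model of the sub_byte_op_p predicate before and after the patch.
// Assumption: 8-bit bytes, standing in for GCC's BITS_PER_UNIT.
constexpr int kBitsPerUnit = 8;

// Hypothetical stand-in for int_mode_for_size (bitlen, 0).exists ():
// pretend the target provides integer modes of these widths only.
bool has_int_mode(int bitlen) {
  return bitlen == 8 || bitlen == 16 || bitlen == 32 || bitlen == 64
         || bitlen == 128;
}

// Old condition: a missing integer mode always forces the slow path.
bool sub_byte_op_before(int bitlen, int bitpos, bool /*empty_ctor*/) {
  return (bitlen % kBitsPerUnit) != 0
         || (bitpos % kBitsPerUnit) != 0
         || !has_int_mode(bitlen);
}

// New condition: a missing integer mode only matters when the value is
// not an empty constructor, since that case is handled with memset.
bool sub_byte_op_after(int bitlen, int bitpos, bool empty_ctor) {
  return (bitlen % kBitsPerUnit) != 0
         || (bitpos % kBitsPerUnit) != 0
         || (!has_int_mode(bitlen) && !empty_ctor);
}
```

In this model, a byte-aligned 40-bit empty constructor took the slow path under the old condition and takes the fast path under the new one, while unaligned stores and non-empty values without an integer mode still take the slow path.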
