On Mon, 10 Nov 2025, Victor Do Nascimento wrote:

> This patch series extends the GCC vectorizer so that it can vectorize
> uncounted loops, as in the following example:
> 
> while (str[i] != 0)
>   str[i++] ^= 0x20;
> 
> Though this implementation has been demonstrated not to cause any
> regressions, either in the GCC testsuite or in performance, the scope
> of this patch-series is limited.  It lays the foundational groundwork
> for the vectorization of such loops but leaves further features,
> namely peeling for alignment and alias analysis, to be enabled in
> separate patches.
> 
> This submission provides only limited unit tests and is made with the
> primary purpose of getting feedback on the design choices made;
> further tests used in development will be added in the run-up to the
> stage 1 deadline.
> 
> As such, this patch series is split into a large number of patches.
> The intuition behind this is to be able to have each commit message
> explain the rationale behind each change and allow for easier
> feedback for any of these choices.
> 
> The work borrows heavily from and builds upon the previous early-break
> vectorization work, whereby an early-break loop is one with both a
> counting IV exit and one or more non-IV-counting exits.
> 
> By having all exits behave as early break exits, we are able to extend
> the types of loops which get vectorized while keeping changes to the
> code base fairly minimal.
> 
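
As a rough illustration of the idea (a hand-written sketch, not what the
vectorizer emits): the loop is processed a chunk at a time, every chunk is
first tested for the exit condition, and any hit is treated as an early
break that hands over to scalar code.  The names `flip_scalar',
`flip_chunked', `CHUNK' and the `cap' parameter are hypothetical.  The
sketch assumes the buffer is readable for `cap' bytes and holds a NUL
within them, which sidesteps the alignment and aliasing questions the
series leaves for later.

```c
#include <stddef.h>

enum { CHUNK = 16 };  /* hypothetical vector width in bytes */

/* Scalar reference: the uncounted loop, written with an explicit
   index increment.  Returns the number of bytes processed.  */
size_t
flip_scalar (char *str)
{
  size_t i = 0;
  while (str[i] != 0)
    {
      str[i] ^= 0x20;
      i++;
    }
  return i;
}

/* Chunked variant.  Assumes STR is readable for CAP bytes and
   contains a NUL within them.  No exit is driven by a counting IV
   over the data; the only data-dependent exit is the "found a NUL"
   test, handled as an early break.  */
size_t
flip_chunked (char *str, size_t cap)
{
  size_t i = 0;
  while (i + CHUNK <= cap)
    {
      int hit = 0;
      /* Stand-in for a vector compare plus "any lane set" reduction.  */
      for (size_t l = 0; l < CHUNK; l++)
        hit |= (str[i + l] == 0);
      if (hit)
        break;  /* Early-break exit: finish in scalar code.  */
      /* Stand-in for the vector XOR and store.  */
      for (size_t l = 0; l < CHUNK; l++)
        str[i + l] ^= 0x20;
      i += CHUNK;
    }
  /* Scalar tail locates the exact exit position.  */
  while (str[i] != 0)
    {
      str[i] ^= 0x20;
      i++;
    }
  return i;
}
```

Both functions should leave the buffer in the same state and return the
same count; the chunked form only changes how many elements are examined
per step.
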
> The changes can be broadly described as follows:
> 
>   - Relax the constraint that loops must have a known iteration count
>   in order to be considered for vectorization.
>   - Implement a way of retrieving whether loop_vinfo refers to an
>   uncounted loop.  This is done with the LOOP_VINFO_NITERS_UNCOUNTED_P
>   accessor macro for loop_vinfo.
>   - Categorize uncounted loops as satisfying the criterion given in
>   LOOP_VINFO_EARLY_BREAKS_VECT_PEELED.  This ensures that whatever
>   exit is assigned to the "main" exit is given equal treatment to
>   early-break exits.
>   - Make all exit conditions early breaks for uncounted loops: Given
>   the absence of any IV-counting exits, it makes no sense for any exit
>   to be associated with LOOP_VINFO_LOOP_IV_COND.  Consequently, all
>   exit conditions are assigned to LOOP_VINFO_LOOP_CONDS.
>   - In choosing the loop's "main" exit, we choose the last exit in the
>   loop.  This choice is made as it facilitates the job of implementing
>   peeling for alignment, wherein it is required that the effective
>   latch be empty.
>   - Ensure that we don't segfault from functions which attempt to
>   derive useful information from the niter count.  For such functions, we
>   return some value such as `NULL_TREE'.  This allows for the calling
>   function to choose how to deal with the unknown.  Where types would be
>   derived from `TREE_TYPE (niters)', we fall back to the
>   `size_type_node', given the association of `size_t' with the maximum
>   size of a theoretically possible object of any type.  This should
>   thus be able to accommodate any induction variable count as well as
>   the type of `niters' would.

For the middle-end it makes more sense to use 'sizetype' which
is __SIZE_TYPE__.  I'll comment on this more when looking at individual
patches.

>   - Disable niter-based profitability checking.  At runtime, this
>   would require knowledge of the maximum number of iterations that
>   will be executed so as to ascertain whether or not it will be
>   beneficial to performance to run the vectorized loop.

Reasonable.  Note there might still be an upper bound on the number of
iterations, derived from array sizes or similar.  Those static checks
should be preserved.
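
To illustrate the kind of loop meant here (a made-up example, not taken
from the series or its tests): the exit below depends only on the data,
so there is no counting IV, yet every access to `buf' must stay within
the array, so the number of iterations is bounded by its size.  The names
`buf', `BUF_LEN' and `flip_until_nul' are hypothetical.

```c
#include <stddef.h>

enum { BUF_LEN = 64 };  /* hypothetical size, chosen for illustration */
static char buf[BUF_LEN];

/* Uncounted loop over a fixed-size array.  The exit condition is
   data-dependent, but an access past buf[BUF_LEN - 1] would be
   undefined, so the trip count cannot exceed the array size.
   Returns the number of bytes processed.  */
size_t
flip_until_nul (void)
{
  size_t i = 0;
  while (buf[i] != 0)
    {
      buf[i] ^= 0x20;
      i++;
    }
  return i;
}
```

A static bound like this one is known at compile time, unlike the
runtime iteration count that the niter-based check would need.
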

From the description it sounds like I might approve parts of the
series that can stand on their own - feel free to push approved
parts to shrink the series size for re-spins (assuming independent
testing succeeds, of course).

Thanks for working on this.
Richard.

> Victor Do Nascimento (13):
>   vect: Relax known iteration number constraint
>   vect: Make all exit conditions early breaks for uncounted loops
>   vect: Correct analysis of nested loops
>   vect: Extend `vec_init_loop_exit_info' to handle uncounted loops
>   vect: Add default types & retvals for uncounted loops
>   vect: guard niters manipulation with `LOOP_VINFO_NITERS_UNCOUNTED_P'
>   vect: Disable niters-based skipping of uncounted vectorized loops
>   vect: Reclassify early break fold left reductions as simple reductions
>   vect: Fix uncounted PHI handling of
>     `slpeel_tree_duplicate_loop_to_edge_cfg'
>   vect: Correct resetting of live out values on epilog loop entry
>   vect: Disable use of partial vectors for uncounted loops
>   vect: Reject uncounted loop vectorization where alias analysis may
>     fail
>   vect: Add uncounted loop unit tests
> 
>  .../gcc.dg/vect/vect-early-break_40.c         |   3 +-
>  gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c  |  18 +++
>  gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c  |  24 ++++
>  gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c  |  16 +++
>  .../gcc.dg/vect/vect-uncounted-run-1.c        |  33 ++++++
>  gcc/tree-vect-data-refs.cc                    |  11 +-
>  gcc/tree-vect-loop-manip.cc                   | 103 +++++++++++-------
>  gcc/tree-vect-loop.cc                         |  88 ++++++++++-----
>  gcc/tree-vect-stmts.cc                        |   3 +-
>  gcc/tree-vectorizer.h                         |   8 +-
>  10 files changed, 235 insertions(+), 72 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-uncounted-run-1.c
> 
> 

-- 
Richard Biener <[email protected]>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Jochen Jaser, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
