https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113860

--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Pengxuan Zheng <pzh...@gcc.gnu.org>:

https://gcc.gnu.org/g:e4b8db26de35239bd621aad9c0361f25d957122b

commit r15-2659-ge4b8db26de35239bd621aad9c0361f25d957122b
Author: Pengxuan Zheng <quic_pzh...@quicinc.com>
Date:   Wed Jul 31 17:00:01 2024 -0700

    aarch64: Improve Advanced SIMD popcount expansion by using SVE [PR113860]

    This patch improves the Advanced SIMD popcount expansion by using SVE if
    available.

    For example, GCC currently generates the following code sequence for V2DI:
      cnt     v31.16b, v31.16b
      uaddlp  v31.8h, v31.16b
      uaddlp  v31.4s, v31.8h
      uaddlp  v31.2d, v31.4s

    However, by using SVE, we can generate the following sequence instead:
      ptrue   p7.b, all
      cnt     z31.d, p7/m, z31.d

    Similar improvements can be made for V4HI, V8HI, V2SI and V4SI too.

    The scalar popcount expansion can also be improved similarly by using SVE and
    those changes will be included in a separate patch.

            PR target/113860

    gcc/ChangeLog:

            * config/aarch64/aarch64-simd.md (popcount<mode>2): Add TARGET_SVE
            support.
            * config/aarch64/aarch64-sve.md (@aarch64_pred_<optab><mode>): Use new
            iterator SVE_VDQ_I.
            * config/aarch64/iterators.md (SVE_VDQ_I): New mode iterator.
            (VPRED): Add V8QI, V16QI, V4HI, V8HI and V2SI.

    gcc/testsuite/ChangeLog:

            * gcc.target/aarch64/popcnt-sve.c: New test.

    Signed-off-by: Pengxuan Zheng <quic_pzh...@quicinc.com>
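
For readers who want to look at the V2DI and V4SI cases described in the commit
message, the following is a minimal sketch of C source that could exercise the
popcount expansion. It is not the contents of gcc.target/aarch64/popcnt-sve.c;
the function names popcount_v2di and popcount_v4si are hypothetical, and whether
GCC emits exactly the sequences quoted above depends on the compiler version,
the options used, and the vectorizer's decisions.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: per-element popcount
   over two 64-bit lanes, which corresponds to the V2DI case if the
   loop is vectorized.  */
void popcount_v2di(uint64_t *restrict dst, const uint64_t *restrict src)
{
    for (int i = 0; i < 2; i++)
        dst[i] = (uint64_t) __builtin_popcountll(src[i]);
}

/* Hypothetical helper, not taken from the patch: per-element popcount
   over four 32-bit lanes, which corresponds to the V4SI case if the
   loop is vectorized.  */
void popcount_v4si(uint32_t *restrict dst, const uint32_t *restrict src)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (uint32_t) __builtin_popcount(src[i]);
}
```

One way to inspect the result, assuming an AArch64 GCC that contains this
commit, is to compile with something like -O2 -march=armv8.2-a+sve -S and
check whether the output uses the predicated SVE cnt form instead of the
cnt/uaddlp chain.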
