Sorry for the slow reply.

"yangyang (ET)" <yangyang...@huawei.com> writes:
> Hi,
>
> This is a simple fix for PR96195.
>
> For the test case, GCC generates the following gimple statement in
> pass_vect:
>
>     vect__21.16_58 = zp.simdclone.2 (vect_vec_iv_.15_56);
>
> The mode of vect__21.16_58 is VNx2SI while the mode of zp.simdclone.2
> (vect_vec_iv_.15_56) is V4SI, resulting in the crash.
>
> In vectorizable_simd_clone_call, type compatibility is handled based on
> the number of elements and the type compatibility of elements, which is not
> enough.
> This patch add VIEW_CONVERT_EXPRs if the arguments types and return
> type of simd clone function are distinct with the vectype of stmt.
>
> Added one testcase for this. Bootstrap and tested on both aarch64 and
> x86 Linux platform, no new regression witnessed.

I agree this looks correct as far as the target-independent interface
goes.  However, the underlying problem is that we haven't yet added
support for SVE omp simd functions.  What should happen for the testcase
is that we assume both SVE and Advanced SIMD versions of zp exist and
call the SVE version instead of the Advanced SIMD version.

There again, for little-endian -msve-vector-bits=128 there should be no
overhead with using the Advanced SIMD version, and big-endian
-msve-vector-bits=128 is equivalent to -msve-vector-bits=scalable.

Things would get more interesting for:

  #pragma omp declare simd simdlen(8)
  int zp (int);

and -msve-vector-bits=256, but again, we don't yet support simdlen(8)
for Advanced SIMD.

So all in all, I agree this is the right fix.  Pushed to master with
a minor whitespace fixup for:

> + gassign *new_stmt
> + = gimple_build_assign (make_ssa_name (atype),
> + vec_oprnd0);

…the indentation on this line.

Thanks,
Richard
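
P.S. For anyone reading this in the archives, here is a rough sketch in
plain C of the idea behind the fix: when the clone's vector type and the
statement's vector type differ only in type (same size, same elements),
convert the argument, call the clone, and convert the result back.  This
is only an illustration, not the patch itself.  The names clone_vec,
stmt_vec, zp_clone, view_as_* and call_zp are all hypothetical, the body
of zp_clone is made up, and memcpy merely stands in for the
bit-preserving reinterpretation that VIEW_CONVERT_EXPR expresses in
gimple.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for two distinct 128-bit vector types with the
   same element type and element count.  These are NOT GCC's types.  */
typedef struct { int lane[4]; } clone_vec;  /* type used by the simd clone */
typedef struct { int lane[4]; } stmt_vec;   /* vectype of the calling stmt */

/* A bit-for-bit reinterpretation only makes sense for equal sizes.  */
static_assert (sizeof (clone_vec) == sizeof (stmt_vec),
               "both vector types must have the same size");

/* Hypothetical simd clone of zp; the body (add one per lane) is an
   assumption made purely for illustration.  */
static clone_vec
zp_clone (clone_vec x)
{
  for (int i = 0; i < 4; ++i)
    x.lane[i] += 1;
  return x;
}

/* Reinterpret the bits of V as a clone_vec without changing them.  */
static clone_vec
view_as_clone_vec (stmt_vec v)
{
  clone_vec r;
  memcpy (&r, &v, sizeof r);
  return r;
}

/* Reinterpret the bits of V as a stmt_vec without changing them.  */
static stmt_vec
view_as_stmt_vec (clone_vec v)
{
  stmt_vec r;
  memcpy (&r, &v, sizeof r);
  return r;
}

/* Convert the argument to the clone's type, call the clone, and convert
   the result back to the type the statement expects.  */
static stmt_vec
call_zp (stmt_vec arg)
{
  return view_as_stmt_vec (zp_clone (view_as_clone_vec (arg)));
}
```

Calling call_zp on lanes { 0, 1, 2, 3 } goes through both conversions
and yields each lane incremented by one; since the conversions only
relabel the bits, they should cost nothing when both types share the
same layout.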