> ### Background
> On AArch64, the minimum vector length supported is 64-bit for basic types,
> except for `byte` and `boolean` (32-bit and 16-bit respectively to match
> special Vector API features). This limitation prevents intrinsification of
> vector type conversions between `short` and wider types (e.g. `long/double`)
> in Vector API when the entire vector length is within 128 bits, resulting in
> degraded performance for such conversions.
>
> For example, type conversions between `ShortVector.SPECIES_128` and
> `LongVector.SPECIES_128` are not supported on AArch64 NEON and SVE
> architectures with 128-bit max vector size. This occurs because the compiler
> would need to generate a vector with 2 short elements, resulting in a 32-bit
> vector size.
>
> To intrinsify such type conversion APIs, we need to relax the min vector
> length constraint from 64-bit to 32-bit for short vectors.
>
> ### Impact Analysis
> #### 1. Vector types
> Vectors only with `short` element types will be affected, as we just
> supported 32-bit `short` vectors in this change.
>
> #### 2. Vector API
> No impact on Vector API or the vector-specific nodes. The minimum vector
> shape at API level remains 64-bit. It's not possible to generate a final
> vector IR with 32-bit vector size. Type conversions may generate intermediate
> 32-bit vectors, but they will be resized or cast to vectors with at least
> 64-bit length.
>
> #### 3. Auto-vectorization
> Enables vectorization of cases containing only 2 `short` lanes, with
> significant performance improvements. Since we have supported 32-bit vectors
> for `byte` type for a long time, extending this to `short` did not introduce
> additional risks.
>
> #### 4. Codegen of vector nodes
> NEON doesn't support 32-bit SIMD instructions, so we use 64-bit instructions
> instead. For lanewise operations, this is safe because the higher half bits
> can be ignored.
>
> Details:
> - Lanewise vector operations are unaffected as explained above.
> - NEON supports vector load/store instructions with 32-bit vector size,
>   which we already use in relevant IRs (shared by SVE).
> - Cross-lane operations like reduction may be affected, potentially causing
>   incorrect results for `min/max/mul/and` reductions. The min vector size for
>   such operations should remain 64-bit. We've added assertions in match rules.
>   Since it's currently not possible to generate such reductions (Vector API
>   minimum is 64-bit, and SLP doesn't support subword type reductions), we
>   maintain the status quo. However, adding an explicit vector size check in
>   `match_rule_s...
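
To illustrate the lanewise versus cross-lane distinction drawn in the quoted description, here is a minimal scalar sketch in Java. It is not HotSpot code and does not use the Vector API: it only models a 64-bit NEON register as four `short` lanes, of which the low two belong to the 32-bit vector. The class name, method names, and the garbage values placed in the upper lanes are all hypothetical and chosen purely for illustration.

```java
// Hypothetical scalar model of a 32-bit short vector held in a 64-bit register.
// Lanes 0..1 are the real vector; lanes 2..3 hold whatever happened to be there.
class PaddedLaneReductionSketch {
    static final int REGISTER_LANES = 4; // 64 bits / 16-bit short
    static final int VALID_LANES = 2;    // 32 bits / 16-bit short

    // Lanewise add over the whole register, as a 64-bit instruction would do.
    // Each result lane depends only on the same lane of the inputs.
    static short[] addLanewise(short[] a, short[] b) {
        short[] r = new short[REGISTER_LANES];
        for (int i = 0; i < REGISTER_LANES; i++) {
            r[i] = (short) (a[i] + b[i]);
        }
        return r;
    }

    // Cross-lane min reduction over the first `lanes` lanes of the register.
    static short minReduce(short[] reg, int lanes) {
        short m = Short.MAX_VALUE;
        for (int i = 0; i < lanes; i++) {
            m = (short) Math.min(m, reg[i]);
        }
        return m;
    }

    public static void main(String[] args) {
        // Upper two lanes contain made-up garbage values.
        short[] a = {5, 7, -100, 1234};
        short[] b = {1, 2, -200, 4321};

        short[] sum = addLanewise(a, b);
        // The low lanes are correct no matter what the upper lanes contain.
        System.out.println("lane 0 = " + sum[0] + ", lane 1 = " + sum[1]);

        // Reducing only the valid lanes gives the intended result.
        System.out.println("min over valid lanes:    " + minReduce(sum, VALID_LANES));
        // Reducing the whole 64-bit register lets garbage leak into the result.
        System.out.println("min over whole register: " + minReduce(sum, REGISTER_LANES));
    }
}
```

In this model the lanewise add yields 6 and 9 in the low lanes regardless of the upper half, while the min reduction returns 6 over the valid lanes and -300 over the full register. That mismatch is the kind of incorrect result the description warns about for reductions on 32-bit vectors executed with 64-bit instructions.
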

Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:

  Refine comments based on review suggestion

-------------

Changes:
  - all: https://git.openjdk.org/jdk/pull/26057/files
  - new: https://git.openjdk.org/jdk/pull/26057/files/5af5bd49..4e15e588

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk&pr=26057&range=01
 - incr: https://webrevs.openjdk.org/?repo=jdk&pr=26057&range=00-01

  Stats: 9 lines in 3 files changed: 0 ins; 0 del; 9 mod
  Patch: https://git.openjdk.org/jdk/pull/26057.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26057/head:pull/26057

PR: https://git.openjdk.org/jdk/pull/26057