Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors

Xiaohong Gong Tue, 01 Jul 2025 20:17:19 -0700

On Tue, 1 Jul 2025 05:59:15 GMT, Xiaohong Gong <xg...@openjdk.org> wrote:


> ### Background
> On AArch64, the minimum vector length supported is 64-bit for basic types, 
> except for `byte` and `boolean` (32-bit and 16-bit respectively to match 
> special Vector API features). This limitation prevents intrinsification of 
> vector type conversions between `short` and wider types (e.g. `long/double`) 
> in Vector API when the entire vector length is within 128 bits, resulting in 
> degraded performance for such conversions.
> 
> For example, type conversions between `ShortVector.SPECIES_128` and 
> `LongVector.SPECIES_128` are not supported on AArch64 NEON and SVE 
> architectures with 128-bit max vector size. This occurs because the compiler 
> would need to generate a vector with 2 short elements, resulting in a 32-bit 
> vector size.
> 
> To intrinsify such type conversion APIs, we need to relax the min vector 
> length constraint from 64-bit to 32-bit for short vectors.
> 
> ### Impact Analysis
> #### 1. Vector types
> Only vectors with `short` element types are affected, because this change 
> adds 32-bit vector support for `short` only.
> 
> #### 2. Vector API
> No impact on Vector API or the vector-specific nodes. The minimum vector 
> shape at API level remains 64-bit. It's not possible to generate a final 
> vector IR with 32-bit vector size. Type conversions may generate intermediate 
> 32-bit vectors, but they will be resized or cast to vectors with at least 
> 64-bit length.
> 
> #### 3. Auto-vectorization
> Enables vectorization of cases containing only 2 `short` lanes, with 
> significant performance improvements. Since 32-bit vectors have long been 
> supported for the `byte` type, extending this to `short` does not introduce 
> additional risks.
> 
> #### 4. Codegen of vector nodes
> NEON doesn't support 32-bit SIMD instructions, so we use 64-bit instructions 
> instead. For lanewise operations, this is safe because the upper 32 bits of 
> the 64-bit register can be ignored.
> 
> Details:
>  - Lanewise vector operations are unaffected as explained above.
>  - NEON supports vector load/store instructions with 32-bit vector size, 
> which we already use in relevant IRs (shared by SVE).
>  - Cross-lane operations like reduction may be affected, potentially causing 
> incorrect results for `min/max/mul/and` reductions. The min vector size for 
> such operations should remain 64-bit. We've added assertions in match rules. 
> Since it's currently not possible to generate such reductions (Vector API 
> minimum is 64-bit, and SLP doesn't support subword type reductions), we 
> maintain the status quo. However, adding an explicit vector size check in 
> `match_rule_s...
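
The lane arithmetic behind the quoted example can be sketched in plain Java. This is only an illustration, not code from the patch: the class name `ShortToLongSketch` and the helpers `lanes` and `widen` are hypothetical, and the loop is a scalar reference for the lanewise `short` to `long` conversion, not a claim about what C2 emits for it.

```java
import java.util.Arrays;

public class ShortToLongSketch {
    // Number of lanes that fit in a vector of the given bit width (hypothetical helper).
    public static int lanes(int vectorBits, int elementBits) {
        return vectorBits / elementBits;
    }

    // Scalar reference for a lanewise short -> long conversion (hypothetical helper).
    public static long[] widen(short[] src) {
        long[] dst = new long[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i]; // implicit widening conversion, sign-extends
        }
        return dst;
    }

    public static void main(String[] args) {
        // A 128-bit long vector holds 128 / 64 = 2 lanes.
        int longLanes = lanes(128, Long.SIZE);
        // The matching short vector has the same 2 lanes: 2 * 16 = 32 bits.
        int shortVectorBits = longLanes * Short.SIZE;
        System.out.println(longLanes + " lanes, " + shortVectorBits + "-bit short vector");
        System.out.println(Arrays.toString(widen(new short[] {-1, 300})));
    }
}
```

With a 128-bit maximum vector size, the `long` side has only 2 lanes, so the `short` side of the conversion is a 2-lane, 32-bit vector, which is below the previous 64-bit minimum described in the quoted text.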

Hi @theRealAph, I've updated the patch to fix the comment issues. Could you 
please take another look? Thanks a lot!

-------------

PR Comment: https://git.openjdk.org/jdk/pull/26057#issuecomment-3026147575
