Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-18 Thread Xiaohong Gong
On Wed, 9 Jul 2025 01:23:43 GMT, Xiaohong Gong wrote: >> ### Background >> On AArch64, the minimum vector length supported is 64-bit for basic types, >> except for `byte` and `boolean` (32-bit and 16-bit respectively to match >> special Vector API features).

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-15 Thread Xiaohong Gong
On Wed, 9 Jul 2025 01:23:43 GMT, Xiaohong Gong wrote: >> ### Background >> On AArch64, the minimum vector length supported is 64-bit for basic types, >> except for `byte` and `boolean` (32-bit and 16-bit respectively to match >> special Vector API features).

Re: RFR: 8358768: [vectorapi] Make VectorOperators.SUADD an Associative [v2]

2025-07-10 Thread Xiaohong Gong
On Wed, 9 Jul 2025 22:52:58 GMT, Ian Graves wrote: >> Adding SUADD an associative operation in the Vector API. Saturated addition >> on fixed-width unsigned integers is provably associative. > > Ian Graves has updated the pull request incrementally with one additional > commit since the last re
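
The associativity claim for saturating unsigned addition can be checked exhaustively on a small width. The sketch below is a scalar model, not the Vector API implementation: it assumes an 8-bit unsigned lane, and `suadd8` is a hypothetical helper that clamps the sum at 255 instead of wrapping.

```java
class SaturatingAddCheck {
    // Hypothetical scalar model of saturating unsigned addition on 8-bit lanes:
    // inputs are expected in [0, 255]; the sum is clamped at 255 instead of wrapping.
    static int suadd8(int a, int b) {
        return Math.min(a + b, 255);
    }

    // Exhaustively compare (a + b) + c against a + (b + c) for all 8-bit values.
    static boolean isAssociative() {
        for (int a = 0; a <= 255; a++) {
            for (int b = 0; b <= 255; b++) {
                for (int c = 0; c <= 255; c++) {
                    if (suadd8(suadd8(a, b), c) != suadd8(a, suadd8(b, c))) {
                        return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("associative on 8-bit lanes: " + isAssociative());
    }
}
```

Both groupings reduce to min(a + b + c, 255), which is why the grouping order does not change the result.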

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-10 Thread Xiaohong Gong
On Fri, 4 Jul 2025 09:11:40 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Refine the comment in ad file > > This looks good. Thanks. Hi @theRealAph , would you m

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-10 Thread Xiaohong Gong
On Thu, 10 Jul 2025 01:40:06 GMT, Xiaohong Gong wrote: >> Thanks for making the changes. Looks good to me. > >> Thanks for making the changes. Looks good to me. > > Thanks a lot for your review! > @XiaohongGong The code changes look sane, although, for the record, I

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-09 Thread Xiaohong Gong
On Wed, 9 Jul 2025 09:17:13 GMT, Xiaohong Gong wrote: >> Hi @eme64 , could you please help take a look at this patch especially the >> test part since most of the tests are SLP related? It will be helpful if you >> could also help trigger a testing for it. Th

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-09 Thread Xiaohong Gong
On Wed, 9 Jul 2025 10:43:07 GMT, Bhavana Kilambi wrote: > Thanks for making the changes. Looks good to me. Thanks a lot for your review! - PR Comment: https://git.openjdk.org/jdk/pull/26057#issuecomment-3054908101

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-09 Thread Xiaohong Gong
On Wed, 9 Jul 2025 09:06:44 GMT, Xiaohong Gong wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Disable auto-vectorization of double to short conversion for NEON and >> update t

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-09 Thread Xiaohong Gong
On Wed, 9 Jul 2025 01:23:43 GMT, Xiaohong Gong wrote: >> ### Background >> On AArch64, the minimum vector length supported is 64-bit for basic types, >> except for `byte` and `boolean` (32-bit and 16-bit respectively to match >> special Vector API features).

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-08 Thread Xiaohong Gong
On Tue, 8 Jul 2025 10:33:50 GMT, Fei Gao wrote: >>> > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 >>> > > > >>> > > > >>> > > > Actually I didn't change the min vector siz

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-08 Thread Xiaohong Gong
On Tue, 8 Jul 2025 09:07:00 GMT, Xiaohong Gong wrote: >>> > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 >>> > > > >>> &

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v4]

2025-07-08 Thread Xiaohong Gong
`min/max/mul/and` reductions. The min vector size for > such operations should remain 64-bit. We've added assertions in match rules. > Since it's currently not possible to generate such reductions (Vector API > minimum is 64-bit, and SLP doesn't support subword type red

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-08 Thread Xiaohong Gong
On Tue, 8 Jul 2025 10:33:50 GMT, Fei Gao wrote: > > > > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 > > > > > > > > > > > > > > > > > > Actually I didn't change the min ve

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-08 Thread Xiaohong Gong
On Tue, 8 Jul 2025 09:00:53 GMT, Xiaohong Gong wrote: >>> > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 >>> > > >>> > > >

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-08 Thread Xiaohong Gong
On Tue, 8 Jul 2025 08:18:57 GMT, Fei Gao wrote: > > > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 > > > > > > > > > > > > Actually I didn't change the min vector size for `c

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-07 Thread Xiaohong Gong
On Mon, 7 Jul 2025 13:23:15 GMT, Fei Gao wrote: > > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/hotspot/jtreg/compiler/vectorization/runner/ArrayTypeConvertTest.java#L388-L392 > > > > > > Actually I didn't change the min vector size for `char` vectors in

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-07 Thread Xiaohong Gong
On Mon, 7 Jul 2025 06:59:20 GMT, Xiaohong Gong wrote: >> Have you measured the performance of this micro-benchmark on NEON machine? >> https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/micro/org/openjdk/bench/vm/compiler/TypeVectorOperations.ja

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-07 Thread Xiaohong Gong
On Sat, 5 Jul 2025 15:08:35 GMT, Fei Gao wrote: > Have you measured the performance of this micro-benchmark on NEON machine? > > https://github.com/openjdk/jdk/blob/f2d2eef988c57cc9f6194a8fd5b2b422035ee68f/test/micro/org/openjdk/bench/vm/compiler/TypeVectorOperations.java#L251-L256 > > We added

Integrated: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-07-06 Thread Xiaohong Gong
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs > for X86 platforms [1]. However, the current implementation is not optimal for > AArch64 SVE platform, which natively supports vector instructions fo

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-06 Thread Xiaohong Gong
On Mon, 7 Jul 2025 02:05:06 GMT, Xiaohong Gong wrote: >> @XiaohongGong I quickly scanned the patch, it looks good to me too. I'm >> submitting some internal testing now, to make sure our extended testing does >> not break on integration. Should take about 24h. > >

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-06 Thread Xiaohong Gong
On Wed, 2 Jul 2025 08:24:22 GMT, Emanuel Peter wrote: >>> Agree with Paul, these are minor regressions. Let us proceed with this >>> patch. >> >> Thanks so much for your review @sviswa7 ! > > @XiaohongGong I quickly scanned the patch, it looks good to me too. I'm > submitting some internal tes

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-04 Thread Xiaohong Gong
On Fri, 4 Jul 2025 09:11:40 GMT, Andrew Haley wrote: > This looks good. Thanks. Thanks so much for your review! - PR Comment: https://git.openjdk.org/jdk/pull/26057#issuecomment-3035115512

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-03 Thread Xiaohong Gong
On Thu, 3 Jul 2025 06:10:28 GMT, Xiaohong Gong wrote: >> ### Background >> On AArch64, the minimum vector length supported is 64-bit for basic types, >> except for `byte` and `boolean` (32-bit and 16-bit respectively to match >> special Vector API features).

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v3]

2025-07-02 Thread Xiaohong Gong
`min/max/mul/and` reductions. The min vector size for > such operations should remain 64-bit. We've added assertions in match rules. > Since it's currently not possible to generate such reductions (Vector API > minimum is 64-bit, and SLP doesn't support subword type red

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v2]

2025-07-02 Thread Xiaohong Gong
On Wed, 2 Jul 2025 08:15:34 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Refine comments based on review suggestion > > src/hotspot/cpu/aarch64/aa

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-02 Thread Xiaohong Gong
On Wed, 2 Jul 2025 01:52:19 GMT, Xiaohong Gong wrote: >> Agree with Paul, these are minor regressions. Let us proceed with this >> patch. > >> Agree with Paul, these are minor regressions. Let us proceed with this patch. > > Thanks so much for your review @sviswa7 !

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 05:59:15 GMT, Xiaohong Gong wrote: > ### Background > On AArch64, the minimum vector length supported is 64-bit for basic types, > except for `byte` and `boolean` (32-bit and 16-bit respectively to match > special Vector API features). This limitat

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors [v2]

2025-07-01 Thread Xiaohong Gong
`min/max/mul/and` reductions. The min vector size for > such operations should remain 64-bit. We've added assertions in match rules. > Since it's currently not possible to generate such reductions (Vector API > minimum is 64-bit, and SLP doesn't support subword type red

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 21:30:20 GMT, Sandhya Viswanathan wrote: > Agree with Paul, these are minor regressions. Let us proceed with this patch. Thanks so much for your review @sviswa7 ! - PR Comment: https://git.openjdk.org/jdk/pull/25138#issuecomment-3026080679

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 18:03:33 GMT, Paul Sandoz wrote: > This is a nice simplification, Java changes look good. I'll let the Intel > folks sign-off related to regressions. IMO minor regressions like this are > acceptable if the generated code quality is good, and if the benchmark > reports higher

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Xiaohong Gong
On Wed, 25 Jun 2025 09:16:48 GMT, Xiaohong Gong wrote: >> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs >> for X86 platforms [1]. However, the current implementation is not optimal >> for AArch64 SVE platform, which natively supports vecto

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 06:41:32 GMT, Xiaohong Gong wrote: >> Ping again! Thanks in advance! > >> @XiaohongGong I'm a little busy at the moment, and soon going on a summer >> vacation, so I cannot promise a full review soon. Feel free to ask someone >> else to have

Re: RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 08:10:16 GMT, Andrew Haley wrote: >> ### Background >> On AArch64, the minimum vector length supported is 64-bit for basic types, >> except for `byte` and `boolean` (32-bit and 16-bit respectively to match >> special Vector API features). This limitation prevents intrinsifica

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Xiaohong Gong
On Tue, 1 Jul 2025 06:07:03 GMT, Xiaohong Gong wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains three commits: >> >> - Address review comments >> - Merge 'jdk:mas

RFR: 8359419: AArch64: Relax min vector length to 32-bit for short vectors

2025-06-30 Thread Xiaohong Gong
### Background On AArch64, the minimum vector length supported is 64-bit for basic types, except for `byte` and `boolean` (32-bit and 16-bit respectively to match special Vector API features). This limitation prevents intrinsification of vector type conversions between `short` and wider types (e

Re: RFR: 8354242: VectorAPI: combine vector not operation with compare [v9]

2025-06-26 Thread Xiaohong Gong
On Wed, 25 Jun 2025 10:08:23 GMT, erifan wrote: >> This patch optimizes the following patterns: >> For integer types: >> >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMa
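
The integer-type rewrite quoted above (a NOT applied to a mask compare becomes a compare with the negated condition) can be illustrated with a scalar model. This is only a sketch under stated assumptions: the `Cond` enum, `compare`, `not`, and `rewriteHolds` are hypothetical names, boolean arrays stand in for vector masks, and none of this is the C2 implementation.

```java
import java.util.Arrays;

class NotOfCompare {
    // Comparison conditions for integer lanes, each paired with its logical negation.
    enum Cond {
        EQ, NE, LT, GE, GT, LE;

        boolean test(int a, int b) {
            switch (this) {
                case EQ: return a == b;
                case NE: return a != b;
                case LT: return a < b;
                case GE: return a >= b;
                case GT: return a > b;
                default: return a <= b; // LE
            }
        }

        Cond negate() {
            switch (this) {
                case EQ: return NE;
                case NE: return EQ;
                case LT: return GE;
                case GE: return LT;
                case GT: return LE;
                default: return GT; // LE
            }
        }
    }

    // Lane-wise compare, standing in for a vector mask compare.
    static boolean[] compare(int[] src1, int[] src2, Cond cond) {
        boolean[] mask = new boolean[src1.length];
        for (int i = 0; i < src1.length; i++) {
            mask[i] = cond.test(src1[i], src2[i]);
        }
        return mask;
    }

    // Lane-wise NOT, standing in for XOR of the mask with all-ones.
    static boolean[] not(boolean[] mask) {
        boolean[] out = new boolean[mask.length];
        for (int i = 0; i < mask.length; i++) {
            out[i] = !mask[i];
        }
        return out;
    }

    // True when NOT(compare(cond)) matches compare(negated cond) for every condition.
    static boolean rewriteHolds(int[] src1, int[] src2) {
        for (Cond c : Cond.values()) {
            boolean[] negatedMask = not(compare(src1, src2, c));
            boolean[] negatedCond = compare(src1, src2, c.negate());
            if (!Arrays.equals(negatedMask, negatedCond)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] a = {1, 5, -3, 7, Integer.MIN_VALUE, 0};
        int[] b = {1, 2, 4, 7, Integer.MAX_VALUE, -1};
        System.out.println("rewrite holds: " + rewriteHolds(a, b));
    }
}
```

The model uses `int` lanes because the identity relies on integer comparisons being total. With floating-point NaN values, negating `<` does not yield `>=`, so the same table of negations would not apply unchanged.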

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-06-25 Thread Xiaohong Gong
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs > for X86 platforms [1]. However, the current implementation is not optimal for > AArch64 SVE platform, which natively supports vector instructions fo

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-06-25 Thread Xiaohong Gong
.325 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms > 256 14484.252 14255.156 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms > 1024 3664.900 3595.615 0.98 > GatherOperationsBenchmark.microByteGather128 thrpt 30 ops/ms >

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-06-25 Thread Xiaohong Gong
On Mon, 2 Jun 2025 10:48:25 GMT, Emanuel Peter wrote: >>> > @XiaohongGong Thanks for splitting this one out, and for investigating >>> > the regressions here. >>> > Putting the permalink here, fixed to the current change (the link you >>> > pasted will always refer to the newest, which may late

Re: RFR: 8354242: VectorAPI: combine vector not operation with compare [v8]

2025-06-11 Thread Xiaohong Gong
On Fri, 6 Jun 2025 10:38:11 GMT, erifan wrote: >> This patch optimizes the following patterns: >> For integer types: >> >> (XorV (VectorMaskCmp src1 src2 cond) (Replicate -1)) >> => (VectorMaskCmp src1 src2 ncond) >> (XorVMask (VectorMaskCmp src1 src2 cond) (MaskAll m1)) >> => (VectorMas

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-06-02 Thread Xiaohong Gong
On Tue, 3 Jun 2025 01:45:57 GMT, Xiaohong Gong wrote: >>> > @XiaohongGong Thanks for splitting this one out, and for investigating >>> > the regressions here. >>> > Putting the permalink here, fixed to the current change (the link you >>> > p

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-06-02 Thread Xiaohong Gong
On Fri, 30 May 2025 08:15:22 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Thanks for splitting this one out, and for investigating the >>> regressions here. >>> >>> Putting the permalink here, fixed to the current change (the link you >>> pasted wil

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-30 Thread Xiaohong Gong
On Tue, 20 May 2025 05:40:04 GMT, Xiaohong Gong wrote: > > @XiaohongGong Thanks for splitting this one out, and for investigating the > > regressions here. > > Putting the permalink here, fixed to the current change (the link you > > pasted will always refer to the ne

Re: RFR: 8354242: VectorAPI: combine vector not operation with compare [v6]

2025-05-28 Thread Xiaohong Gong
On Wed, 28 May 2025 12:26:31 GMT, Emanuel Peter wrote: >> erifan has updated the pull request with a new target base due to a merge or >> a rebase. The incremental webrev excludes the unrelated changes brought in >> by the merge/rebase. The pull request contains 10 additional commits since >>

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-26 Thread Xiaohong Gong
On Mon, 26 May 2025 06:51:12 GMT, Emanuel Peter wrote: >> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs >> for X86 platforms [1]. However, the current implementation is not optimal >> for AArch64 SVE platform, which natively supports vector instructions for >> sub

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-19 Thread Xiaohong Gong
On Tue, 20 May 2025 02:22:13 GMT, Xiaohong Gong wrote: >> Ping again~ could any one please take a look at this PR? Thanks a lot! > >> Hi @XiaohongGong , Very nice work!, Looks good to me, will do some testing >> and get back. >> >> Do you have an

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-19 Thread Xiaohong Gong
On Mon, 19 May 2025 03:10:46 GMT, Xiaohong Gong wrote: >> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs >> for X86 platforms [1]. However, the current implementation is not optimal >> for AArch64 SVE platform, which natively supports vecto

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-18 Thread Xiaohong Gong
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs > for X86 platforms [1]. However, the current implementation is not optimal for > AArch64 SVE platform, which natively supports vector instructions fo

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-13 Thread Xiaohong Gong
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs > for X86 platforms [1]. However, the current implementation is not optimal for > AArch64 SVE platform, which natively supports vector instructions fo

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-09 Thread Xiaohong Gong
On Fri, 9 May 2025 07:35:41 GMT, Xiaohong Gong wrote: > JDK-8318650 introduced hotspot intrinsification of subword gather load APIs > for X86 platforms [1]. However, the current implementation is not optimal for > AArch64 SVE platform, which natively supports vector instructions fo

RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API

2025-05-09 Thread Xiaohong Gong
JDK-8318650 introduced hotspot intrinsification of subword gather load APIs for X86 platforms [1]. However, the current implementation is not optimal for AArch64 SVE platform, which natively supports vector instructions for subword gather load operations using an int vector for indices (see [2][

Re: RFR: 8354242: VectorAPI: combine vector not operation with compare [v5]

2025-05-07 Thread Xiaohong Gong
On Wed, 7 May 2025 11:02:43 GMT, Jatin Bhateja wrote: >> Hi @jatin-bhateja It is feasible. But I was thinking about whether another >> solution would be better, which is to turn `VectorMask.fromLong(SPECIES, >> -1L)` into `MaskAll(true)` in the mid-end. In this way, we don't need to >> check

Withdrawn: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-24 Thread Xiaohong Gong
On Wed, 16 Apr 2025 08:58:34 GMT, Xiaohong Gong wrote: > ### Summary: > [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the > hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. This > patch aims at implementing the equivalent funct

Re: RFR: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-24 Thread Xiaohong Gong
On Wed, 16 Apr 2025 08:58:34 GMT, Xiaohong Gong wrote: > ### Summary: > [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the > hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. This > patch aims at implementing the equivalent funct

Re: RFR: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-23 Thread Xiaohong Gong
On Thu, 17 Apr 2025 01:42:22 GMT, Xiaohong Gong wrote: >> ### Summary: >> [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the >> hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. >> This patch aims at implementing the equi

Re: RFR: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-21 Thread Xiaohong Gong
On Sun, 20 Apr 2025 03:28:48 GMT, SendaoYan wrote: >> ### Summary: >> [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the >> hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. >> This patch aims at implementing the equivalent functionality for AArch64

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v10]

2025-04-17 Thread Xiaohong Gong
On Thu, 17 Apr 2025 18:03:47 GMT, Vladimir Ivanov wrote: >> Migrate Vector API math library (SVML and SLEEF) linkage from native code >> (in JVM) to Java FFM API. >> >> Since FFM API doesn't support vector calling conventions yet, migration >> affects only symbol lookup for now. But it still e

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-17 Thread Xiaohong Gong
On Thu, 17 Apr 2025 18:08:21 GMT, Vladimir Ivanov wrote: >> Please see the `addr` definition code in >> https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/vectorIntrinsics.cpp#L1877 >> . If queried `addr` returns `nullptr` for 256-bit vectors, and the arch >> supports scalable v

Re: RFR: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-16 Thread Xiaohong Gong
On Wed, 16 Apr 2025 08:58:34 GMT, Xiaohong Gong wrote: > ### Summary: > [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the > hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. This > patch aims at implementing the equivalent funct

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-16 Thread Xiaohong Gong
On Wed, 16 Apr 2025 18:26:18 GMT, Vladimir Ivanov wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java >> line 240: >> >>> 238: if (isAARCH64() && vspecies.vectorBitSize() > 128) { >>> 239: return false; // FIXME: SVE s

RFR: 8351623: VectorAPI: Refactor subword gather load and add SVE implementation

2025-04-16 Thread Xiaohong Gong
### Summary: [JDK-8318650](http://java-service.client.nvidia.com/?q=8318650) added the hotspot intrinsifying of subword gather load APIs for X86 platforms [1]. This patch aims at implementing the equivalent functionality for AArch64 SVE platform. In addition to the AArch64 backend support, this

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-15 Thread Xiaohong Gong
On Fri, 11 Apr 2025 21:23:52 GMT, Vladimir Ivanov wrote: >> Migrate Vector API math library (SVML and SLEEF) linkage from native code >> (in JVM) to Java FFM API. >> >> Since FFM API doesn't support vector calling conventions yet, migration >> affects only symbol lookup for now. But it still e

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-15 Thread Xiaohong Gong
On Wed, 16 Apr 2025 00:20:07 GMT, Paul Sandoz wrote: >> Vladimir Ivanov has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contains 19 additional >> commits

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-15 Thread Xiaohong Gong
On Tue, 15 Apr 2025 17:43:52 GMT, Vladimir Ivanov wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMathLibrary.java >> line 198: >> >>> 196: if (vspecies.vectorBitSize() < 128) { >>> 197: return false; // 64-bit vectors are not supported >>

Re: RFR: 8353786: Migrate Vector API math library support to FFM API [v5]

2025-04-15 Thread Xiaohong Gong
On Fri, 11 Apr 2025 21:23:52 GMT, Vladimir Ivanov wrote: >> Migrate Vector API math library (SVML and SLEEF) linkage from native code >> (in JVM) to Java FFM API. >> >> Since FFM API doesn't support vector calling conventions yet, migration >> affects only symbol lookup for now. But it still e

Integrated: 8350748: VectorAPI: Method "checkMaskFromIndexSize" should be force inlined

2025-03-02 Thread Xiaohong Gong
On Thu, 27 Feb 2025 06:43:19 GMT, Xiaohong Gong wrote: > Method `checkMaskFromIndexSize` is called by some vector masked APIs like > `fromArray/intoArray/fromMemorySegment/...`. It is used to check whether the > index of any active lanes in a mask will reach out of the boundary of the

Re: RFR: 8350748: VectorAPI: Method "checkMaskFromIndexSize" should be force inlined

2025-03-02 Thread Xiaohong Gong
On Thu, 27 Feb 2025 23:30:29 GMT, Paul Sandoz wrote: >> Method `checkMaskFromIndexSize` is called by some vector masked APIs like >> `fromArray/intoArray/fromMemorySegment/...`. It is used to check whether the >> index of any active lanes in a mask will reach out of the boundary of the >> give

RFR: 8350748: VectorAPI: Method "checkMaskFromIndexSize" should be force inlined

2025-02-26 Thread Xiaohong Gong
Method `checkMaskFromIndexSize` is called by some vector masked APIs like `fromArray/intoArray/fromMemorySegment/...`. It is used to check whether the index of any active lanes in a mask will reach out of the boundary of the given Array/MemorySegment. This function should be force inlined, or a
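
The kind of check described here (does any active lane of a mask reach outside the array or segment) can be sketched in scalar form. This is not the JDK's `checkMaskFromIndexSize`: the class and method names below are hypothetical, a `boolean[]` stands in for the vector mask, and the sketch carries no inlining annotation.

```java
class MaskBoundsCheck {
    // Hypothetical scalar model: throws if any active lane would access an index
    // outside [0, length). Inactive lanes may fall out of range without error.
    static void checkActiveLanes(int offset, boolean[] mask, int length) {
        for (int lane = 0; lane < mask.length; lane++) {
            int index = offset + lane;
            if (mask[lane] && (index < 0 || index >= length)) {
                throw new IndexOutOfBoundsException(
                    "Active lane " + lane + " maps to index " + index
                    + " for length " + length);
            }
        }
    }

    // Convenience wrapper for demonstration: reports whether the check passes.
    static boolean inBounds(int offset, boolean[] mask, int length) {
        try {
            checkActiveLanes(offset, mask, length);
            return true;
        } catch (IndexOutOfBoundsException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean[] tailMask = {true, true, false, false};
        // Lanes 0 and 1 map to indices 6 and 7 of an 8-element array.
        System.out.println(inBounds(6, tailMask, 8));
        boolean[] tooWide = {true, true, true, false};
        // Lane 2 is active and maps to index 8, which is out of range.
        System.out.println(inBounds(6, tooWide, 8));
    }
}
```

A check like this sits on the path of every masked load or store near an array boundary, which is consistent with the request to have it inlined into its callers.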

Re: RFR: 8346954: [JMH] jdk.incubator.vector.MaskedLogicOpts fails due to IndexOutOfBoundsException [v2]

2025-02-26 Thread Xiaohong Gong
On Wed, 26 Feb 2025 07:04:58 GMT, Nicole Xu wrote: >> Suite `MaskedLogicOpts.maskedLogicOperationsLong512()` failed on both x86 >> and AArch64 with the following error: >> >> >> java.lang.IndexOutOfBoundsException: Index 252 out of bounds for length 249 >> >> >> The variable `long256_arr_idx

Re: RFR: 8350682 [JMH] vector.IndexInRangeBenchmark failed with IOOBE

2025-02-26 Thread Xiaohong Gong
On Wed, 26 Feb 2025 16:25:46 GMT, Vladimir Ivanov wrote: > Yes, exceptions reported for runs with size=1024. The test support max > size=512 and have no checks for passed params. The change makes sense to me. Thanks for your fixing! - PR Comment: https://git.openjdk.org/jdk/pull/2

Re: RFR: 8350682 [JMH] vector.IndexInRangeBenchmark failed with IOOBE

2025-02-26 Thread Xiaohong Gong
On Tue, 25 Feb 2025 19:06:05 GMT, Vladimir Ivanov wrote: > Array initialization by parameter was added. Extra constant was used to align > cycle step with used arrays. Marked as reviewed by xgong (Committer). - PR Review: https://git.openjdk.org/jdk/pull/23783#pullrequestreview-26

Re: RFR: 8350682 [JMH] vector.IndexInRangeBenchmark failed with IOOBE

2025-02-25 Thread Xiaohong Gong
On Wed, 26 Feb 2025 06:59:27 GMT, Xiaohong Gong wrote: > Hi @IvaVladimir , thank you for fixing this benchmark. Could you please tell > more information about the IOOBE crash (e.g. `size`, `benchmark name`, `arch > info`, etc.)? I still cannot figure out why it can fail with IOOB

Re: RFR: 8350682 [JMH] vector.IndexInRangeBenchmark failed with IOOBE

2025-02-25 Thread Xiaohong Gong
On Tue, 25 Feb 2025 19:06:05 GMT, Vladimir Ivanov wrote: > Array initialization by parameter was added. Extra constant was used to align > cycle step with used arrays. Hi @IvaVladimir , thank you for fixing this benchmark. Could you please tell more information about the IOOBE crash (e.g. `siz

Re: RFR: 8346954: [JMH] jdk.incubator.vector.MaskedLogicOpts fails due to IndexOutOfBoundsException

2025-02-10 Thread Xiaohong Gong
On Wed, 8 Jan 2025 09:04:47 GMT, Nicole Xu wrote: > Suite MaskedLogicOpts.maskedLogicOperationsLong512() failed on both x86 and > AArch64 with the following error: > > > java.lang.IndexOutOfBoundsException: Index 252 out of bounds for length 249 > > > The variable `long256_arr_idx` is misuse

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9]

2024-02-06 Thread Xiaohong Gong
On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> opt

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9]

2023-12-07 Thread Xiaohong Gong
On Thu, 7 Dec 2023 09:30:01 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> opt

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v9]

2023-12-07 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally with on

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7]

2023-12-07 Thread Xiaohong Gong
On Wed, 6 Dec 2023 11:46:03 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Add "--with-libsleef-lib" and "--with-libsleef-include" options &

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v8]

2023-12-06 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request with a new target b

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v7]

2023-12-06 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally with one

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-06 Thread Xiaohong Gong
On Tue, 5 Dec 2023 13:03:22 GMT, Magnus Ihse Bursie wrote: >> Thanks for the suggestion @magicus ! >> >> The check in current `lib-sleef.m4` is very common: >> >> - If user has specified libsleef root by '--with-libsleef', we assume it is >> the manually built sleef lib. So only `lib/` and `i

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-06 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:26:02 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request conta

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-05 Thread Xiaohong Gong
On Tue, 5 Dec 2023 13:00:04 GMT, Magnus Ihse Bursie wrote: > So you need to check both the flag and the header file? Oh well, then this is > probably as good as it gets. Yes, we have to check both the flag and the header file. - PR Comment: https://git.openjdk.org/jdk/pull/16234#i

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-12-04 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:36:18 GMT, Magnus Ihse Bursie wrote: >> You need to expand this logic to cover more instances. See e.g. lib-ffi.m4 >> for inspiration. >> >> Basic flow: >> * if user has specified libsleef root with argument, check both lib/ and >> lib64/ under that root. >> * if user has

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Xiaohong Gong
On Mon, 4 Dec 2023 08:31:17 GMT, Xiaohong Gong wrote: >> The final thing we need to resolve properly is the SVE compiler test. >> >> @theRealAph says: >>> arm_sve.h is part of GCC. It was added to GCC in 2019. >> >> A more relevant question is what v

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-04 Thread Xiaohong Gong
On Fri, 1 Dec 2023 16:45:49 GMT, Magnus Ihse Bursie wrote: > The final thing we need to resolve properly is the SVE compiler test. > > @theRealAph says: > > > arm_sve.h is part of GCC. It was added to GCC in 2019. > > A more relevant question is what version of gcc it was added, and if that >

Re: RFR: 8319872: AArch64: [vectorapi] Implementation of unsigned (zero extended) casts [v4]

2023-12-03 Thread Xiaohong Gong
On Wed, 22 Nov 2023 07:05:21 GMT, Eric Liu wrote: >> Vector API defines zero-extend operations [1], which are going to be >> intrinsified and generated to `VectorUCastNode` by C2. This patch adds >> backend implementation for `VectorUCastNode` on AArch64. >> >> The micro benchmark shows signif

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v6]

2023-12-01 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request with a new target b

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Xiaohong Gong
On Thu, 30 Nov 2023 06:39:43 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> opt

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Xiaohong Gong
On Thu, 30 Nov 2023 20:13:06 GMT, Magnus Ihse Bursie wrote: > Not having a build time dependency on libsleef means you cannot really verify > that the functions you want to call are correct, but maybe you feel secure > that they will never change? I'm not sure. The main reason that we add such

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-30 Thread Xiaohong Gong
On Thu, 30 Nov 2023 11:13:14 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Rename vmath to sleef in configure > > make/autoconf/lib-sleef.m4 line 56: > >

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-30 Thread Xiaohong Gong
On Wed, 22 Nov 2023 09:05:31 GMT, Andrew Haley wrote: >>> Have you considered the possibility of copying the sleef source to the >>> OpenJDK repository and thereby it becomes part of the build process? I >>> don't know how straightforward that is technically and IANAL but I think >>> it's wort

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-29 Thread Xiaohong Gong
On Thu, 23 Nov 2023 14:05:51 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Address review comments in build system > > make/autoconf/lib-vmath.m4 line 70

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v5]

2023-11-29 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally wit

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-27 Thread Xiaohong Gong
On Mon, 27 Nov 2023 16:43:09 GMT, Andrew Haley wrote: >> Apparently the situation is this: If your build machine happens to have SVE, >> then you will get SVE support in the vmath library. The SVE support will be >> used during runtime if the machine you run on has SVE support. >> >> If your b

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-26 Thread Xiaohong Gong
On Thu, 23 Nov 2023 15:43:34 GMT, Andrew Haley wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Address review comments in build system > > make/autoconf/lib-vmath.m4 line 94: >

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-26 Thread Xiaohong Gong
On Thu, 23 Nov 2023 14:01:48 GMT, Magnus Ihse Bursie wrote: >> OK, I see. It makes sense that the suffix name should be chosen mainly >> based on the real module name that is searched/checked in configure. > > This still needs fixing. Yes, I will fix this together with removing the SVE cflags

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-26 Thread Xiaohong Gong
On Thu, 23 Nov 2023 14:10:02 GMT, Magnus Ihse Bursie wrote: >> Xiaohong Gong has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Address review comments in build system > > make/autoconf/lib-vmath.m4 line

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v4]

2023-11-23 Thread Xiaohong Gong
/pull/3638 > [2] https://sleef.org/ > [3] https://packages.fedoraproject.org/pkgs/sleef/sleef/ > [4] https://packages.debian.org/bookworm/libsleef3 > [5] https://mail.openjdk.org/pipermail/panama-dev/2022-December/018172.html Xiaohong Gong has updated the pull request incrementally with one

Re: RFR: 8312425: [vectorapi] AArch64: Optimize vector math operations with SLEEF [v3]

2023-11-23 Thread Xiaohong Gong
On Wed, 15 Nov 2023 01:32:00 GMT, Xiaohong Gong wrote: >> Currently the vector floating-point math APIs like >> `VectorOperators.SIN/COS/TAN...` are not intrinsified on AArch64 platform, >> which causes large performance gap on AArch64. Note that those APIs are >> opt
