from:"Sandhya Viswanathan"

Re: RFR: 8355563: VectorAPI: Refactor current implementation of subword gather load API [v2]

2025-07-01 Thread Sandhya Viswanathan

On Wed, 25 Jun 2025 09:16:48 GMT, Xiaohong Gong wrote: >> JDK-8318650 introduced hotspot intrinsification of subword gather load APIs >> for X86 platforms [1]. However, the current implementation is not optimal >> for AArch64 SVE platform, which natively supports vector instructions for >> sub

Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v6]

2025-05-30 Thread Sandhya Viswanathan

On Fri, 30 May 2025 19:34:16 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is >> included to check the performance of specific input value ranges to help >> prevent regression

Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v5]

2025-05-30 Thread Sandhya Viswanathan

On Thu, 29 May 2025 18:56:11 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is >> included to check the performance of specific input value ranges to help >> prevent regression

Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v4]

2025-05-28 Thread Sandhya Viswanathan

On Wed, 28 May 2025 18:36:38 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is >> included to check the performance of specific input value ranges to help >> prevent regression

Re: RFR: 8353686: Optimize Math.cbrt for x86 64 bit platforms [v3]

2025-05-27 Thread Sandhya Viswanathan

On Tue, 6 May 2025 21:45:34 GMT, Mohamed Issa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.cbrt() using libm. A new set of micro-benchmarks is >> included to check the performance of specific input value ranges to help >> prevent regressions

Re: RFR: 8350682: [JMH] vector.IndexInRangeBenchmark failed with IndexOutOfBoundsException for size=1024

2025-03-03 Thread Sandhya Viswanathan

On Tue, 25 Feb 2025 19:06:05 GMT, Vladimir Ivanov wrote: > Array initialization by parameter was added. Extra constant was used to align > cycle step with used arrays. Looks good to me. - Marked as reviewed by sviswanathan (Reviewer). PR Review: https://git.openjdk.org/jdk/pull/2

Re: RFR: 8350682: [JMH] vector.IndexInRangeBenchmark failed with IndexOutOfBoundsException for size=1024

2025-02-28 Thread Sandhya Viswanathan

On Tue, 25 Feb 2025 19:06:05 GMT, Vladimir Ivanov wrote: > Array initialization by parameter was added. Extra constant was used to align > cycle step with used arrays. test/micro/org/openjdk/bench/jdk/incubator/vector/IndexInRangeBenchmark.java line 51: > 49: @Setup(Level.Trial) > 50:

Re: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v18]

2025-02-19 Thread Sandhya Viswanathan

On Tue, 18 Feb 2025 02:36:13 GMT, Julian Waters wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Review comments resolutions > > Is anyone else getting compile failures after this was integrated? This > weirdly se

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v6]

2025-02-11 Thread Sandhya Viswanathan

On Tue, 11 Feb 2025 21:47:31 GMT, Volodymyr Paprotski wrote: >> (Also see `8319429: Resetting MXCSR flags degrades ecore`) >> >> This PR fixes two issues: >> - the original issue is a crash caused by `__ warn` corrupting the stack on >> Windows only >> - This issue also uncovered that -Xcheck:

Re: RFR: 8344802: Crash in StubRoutines::verify_mxcsr with -XX:+EnableX86ECoreOpts and -Xcheck:jni [v5]

2025-02-07 Thread Sandhya Viswanathan

On Mon, 3 Feb 2025 21:43:56 GMT, Volodymyr Paprotski wrote: >> (Also see `8319429: Resetting MXCSR flags degrades ecore`) >> >> This PR fixes two issues: >> - the original issue is a crash caused by `__ warn` corrupting the stack on >> Windows only >> - This issue also uncovered that -Xcheck:j

Re: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v15]

2025-01-29 Thread Sandhya Viswanathan

On Wed, 29 Jan 2025 06:26:41 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection

Re: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v14]

2025-01-28 Thread Sandhya Viswanathan

On Tue, 28 Jan 2025 06:26:11 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection

Re: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v13]

2025-01-27 Thread Sandhya Viswanathan

On Mon, 27 Jan 2025 08:35:44 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection

Re: RFR: 8342103: C2 compiler support for Float16 type and associated scalar operations [v13]

2025-01-27 Thread Sandhya Viswanathan

On Mon, 27 Jan 2025 08:35:44 GMT, Jatin Bhateja wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes included with this patch:- >> >> 1. Detection

Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations

2024-11-19 Thread Sandhya Viswanathan

On Mon, 14 Oct 2024 11:40:01 GMT, Jatin Bhateja wrote: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by > [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of vario

Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations

2024-11-19 Thread Sandhya Viswanathan

On Mon, 14 Oct 2024 11:40:01 GMT, Jatin Bhateja wrote: > Hi All, > > This patch adds C2 compiler support for various Float16 operations added by > [PR#22128](https://github.com/openjdk/jdk/pull/22128) > > Following is the summary of changes included with this patch:- > > 1. Detection of vario

Re: RFR: 8341137: Optimize long vector multiplication using x86 VPMUL[U]DQ instruction [v5]

2024-11-19 Thread Sandhya Viswanathan

On Thu, 14 Nov 2024 18:24:59 GMT, Jatin Bhateja wrote: >> This patch optimizes LongVector multiplication by inferring VPMUL[U]DQ >> instruction for following IR pallets. >> >> >>MulVL ( AndV SRC1, 0x) ( AndV SRC2, 0x) >>MulVL (URShiftVL SRC1 , 32) (

Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations

2024-11-19 Thread Sandhya Viswanathan

On Tue, 19 Nov 2024 00:29:42 GMT, Sandhya Viswanathan wrote: >> Hi All, >> >> This patch adds C2 compiler support for various Float16 operations added by >> [PR#22128](https://github.com/openjdk/jdk/pull/22128) >> >> Following is the summary of changes

Re: RFR: 8342103: C2 compiler support for Float16 type and associated operations

2024-11-19 Thread Sandhya Viswanathan

On Tue, 19 Nov 2024 08:43:06 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/x86.ad line 11015: >> >>> 11013: ins_encode %{ >>> 11014: int vlen_enc = vector_length_encoding(this); >>> 11015: __ evfmadd132ph($dst$$XMMRegister, $src2$$XMMRegister, >>> $src1$$XMMRegister, vlen_enc); >>

Re: RFR: 8341137: Optimize long vector multiplication using x86 VPMUL[U]DQ instruction [v4]

2024-11-13 Thread Sandhya Viswanathan

On Wed, 13 Nov 2024 02:43:12 GMT, Jatin Bhateja wrote: >> This patch optimizes LongVector multiplication by inferring VPMUL[U]DQ >> instruction for following IR pallets. >> >> >>MulVL ( AndV SRC1, 0x) ( AndV SRC2, 0x) >>MulVL (URShiftVL SRC1 , 32) (

Re: RFR: 8341137: Optimize long vector multiplication using x86 VPMUL[U]DQ instruction [v2]

2024-11-12 Thread Sandhya Viswanathan

On Sun, 10 Nov 2024 07:36:55 GMT, Jatin Bhateja wrote: >> Yes, this should ensure 0x. > > We land here only after checking if inputs are uints, didn't want redundant > match, it's just a convenience routine for forwarding inputs. I will create a > lambda for this. uint check only ensures v

Re: RFR: 8341137: Optimize long vector multiplication using x86 VPMUL[U]DQ instruction [v2]

2024-11-08 Thread Sandhya Viswanathan

On Fri, 8 Nov 2024 20:25:10 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Creating specialized IR to shield pattern from subsequent transforms in >> optimization pipeline > > src/hotspot/sh

Re: RFR: 8310691: [REDO] [vectorapi] Refactor VectorShuffle implementation [v2]

2024-11-08 Thread Sandhya Viswanathan

On Sun, 6 Oct 2024 10:24:53 GMT, Quan Anh Mai wrote: >> Quan Anh Mai has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains one commit: >> >> [vectorapi] Refactor VectorShuffle implementation > > I have adapted the patch in accordance

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v30]

2024-10-25 Thread Sandhya Viswanathan

On Tue, 22 Oct 2024 15:56:18 GMT, Paul Sandoz wrote: >> Hey @eme64 , >> >>> Wow this is really a very moving target - quite frustrating to review - it >>> takes up way too much of the reviewers bandwidth. You really need to split >>> up your PRs as much as possible so that review is easier and

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v32]

2024-10-25 Thread Sandhya Viswanathan

On Thu, 24 Oct 2024 13:36:50 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v25]

2024-10-17 Thread Sandhya Viswanathan

On Thu, 17 Oct 2024 15:41:58 GMT, Jatin Bhateja wrote: > > Rather than adding more IR test functionality to this PR that requires > > additional review my recommendation would be to follow up in another PR or > > before hand rethink our approach. > > Agree, I am thinking of developing an autom

Re: RFR: 8338023: Support two vector selectFrom API [v17]

2024-10-16 Thread Sandhya Viswanathan

On Sun, 13 Oct 2024 11:18:01 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v16]

2024-10-03 Thread Sandhya Viswanathan

On Thu, 3 Oct 2024 19:05:14 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-10-03 Thread Sandhya Viswanathan

On Thu, 3 Oct 2024 05:04:35 GMT, Jatin Bhateja wrote: >> I see the problem with float/double vectors. Let us do the rearrange form >> only for Integral (byte, short, int, long) vectors then. For float/double >> vector we could keep the code that you have currently. > > You will also need additi

Re: RFR: 8338023: Support two vector selectFrom API [v15]

2024-10-03 Thread Sandhya Viswanathan

On Thu, 3 Oct 2024 05:09:22 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-10-03 Thread Sandhya Viswanathan

On Thu, 3 Oct 2024 05:04:35 GMT, Jatin Bhateja wrote: >> I see the problem with float/double vectors. Let us do the rearrange form >> only for Integral (byte, short, int, long) vectors then. For float/double >> vector we could keep the code that you have currently. > > You will also need additi

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v19]

2024-10-03 Thread Sandhya Viswanathan

On Tue, 1 Oct 2024 05:09:25 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Integrated: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-10-01 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 21:47:23 GMT, Sandhya Viswanathan wrote: > Currently the rearrange and selectFrom APIs check shuffle indices and throw > IndexOutOfBoundsException if there is any exceptional source index in the > shuffle. This causes the generated code to be less optimal. This PR

Re: RFR: 8338023: Support two vector selectFrom API [v14]

2024-10-01 Thread Sandhya Viswanathan

On Tue, 1 Oct 2024 09:51:27 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-10-01 Thread Sandhya Viswanathan

On Tue, 1 Oct 2024 09:53:02 GMT, Jatin Bhateja wrote: >> Thanks for the example. Yes, you have a point there. So we would need to do: >>src1.rearrange(this.lanewise(VectorOperators.AND, 2 * VLENGTH - >> 1).toShuffle(), src2); > >> This could instead be: src1.rearrange(this.lanewise(VectorOp

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-09-30 Thread Sandhya Viswanathan

On Mon, 30 Sep 2024 22:51:57 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Handling NPOT vector length for AArch64 SVE with vector sizes varying b/w >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-09-30 Thread Sandhya Viswanathan

On Tue, 24 Sep 2024 07:10:24 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-09-30 Thread Sandhya Viswanathan

On Mon, 30 Sep 2024 21:28:22 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java >> line 551: >> >>> 549: return ((ByteVector)src1).vectorFactory(res); >>> 550: } >>> 551: >> >> This could instead be: >>src1.rearrange(this.lan

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-09-30 Thread Sandhya Viswanathan

On Tue, 24 Sep 2024 07:10:24 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v13]

2024-09-30 Thread Sandhya Viswanathan

On Tue, 24 Sep 2024 07:10:24 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v12]

2024-09-23 Thread Sandhya Viswanathan

On Wed, 18 Sep 2024 07:21:52 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v12]

2024-09-19 Thread Sandhya Viswanathan

On Thu, 19 Sep 2024 21:15:11 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.tanh() using libm >> >> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup >> -- | -- | -- | -- >> MathBench.tanhDouble | 70900 | 95618 | 1.35x

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v2]

2024-09-19 Thread Sandhya Viswanathan

On Wed, 18 Sep 2024 12:23:48 GMT, Emanuel Peter wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Address review comments > > I'm a bit confused b

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v3]

2024-09-19 Thread Sandhya Viswanathan

On Thu, 19 Sep 2024 07:29:11 GMT, Jatin Bhateja wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Change method name > > Hi @sviswa7 , some comments, overall patch looks good to

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v4]

2024-09-19 Thread Sandhya Viswanathan

3,1) > 0x7f40d022750c: vmovdqu 0x40(%rsi,%r13,1),%xmm1 > 0x7f40d0227513: vpshufb %xmm2,%xmm1,%xmm1 > 0x7f40d0227518: vmovdqu %xmm1,0x40(%rax,%r13,1) > 0x7f40d022751f: add $0x40,%ebx > 0x7f40d0227522: cmp %r8d,%ebx > 0x7f40d022752

Re: RFR: 8310691: [REDO] [vectorapi] Refactor VectorShuffle implementation

2024-09-18 Thread Sandhya Viswanathan

On Wed, 18 Sep 2024 17:18:42 GMT, Paul Sandoz wrote: > Will this have any direct impact on the changes proposed by #20508 and #20634? I think we should first get the 20508 and 20634 integrated before this one. - PR Comment: https://git.openjdk.org/jdk/pull/21042#issuecomment-235902

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v3]

2024-09-18 Thread Sandhya Viswanathan

3,1) > 0x7f40d022750c: vmovdqu 0x40(%rsi,%r13,1),%xmm1 > 0x7f40d0227513: vpshufb %xmm2,%xmm1,%xmm1 > 0x7f40d0227518: vmovdqu %xmm1,0x40(%rax,%r13,1) > 0x7f40d022751f: add $0x40,%ebx > 0x7f40d0227522: cmp %r8d,%ebx > 0x7f40d022752

Re: RFR: 8310691: [REDO] [vectorapi] Refactor VectorShuffle implementation

2024-09-18 Thread Sandhya Viswanathan

On Tue, 17 Sep 2024 22:29:01 GMT, Paul Sandoz wrote: > > @PaulSandoz What do you think regarding x86-32? > > I don't see anything obvious in the changes of this PR that would affect > x86-32, but i ain't a HotSpot expert. Perhaps this just exacerbates some > existing bug? @sviswa7 what do you t

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v2]

2024-09-18 Thread Sandhya Viswanathan

On Wed, 18 Sep 2024 12:23:48 GMT, Emanuel Peter wrote: > I'm a bit confused by the name `shuffleWrapIndexes` and > `inline_vector_shuffle_wrap_indexes`. > > Are you **shuffling wrap-indexes**? I don't know what that would even mean. I > think you should name it `wrapShuffleIndexes`. Or is ther

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-17 Thread Sandhya Viswanathan

On Tue, 17 Sep 2024 18:21:43 GMT, Paul Sandoz wrote: > > Adding link to UTF-8 decoding use case for convenience and reminder: > > https://github.com/AugustNagro/utf8.java/blob/master/src/main/java/com/augustnagro/utf8/Utf8.java. > > Another related link to base 64 decoding > https://github.com

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-16 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 19:45:11 GMT, Paul Sandoz wrote: >>> Given `rearrange` with 1 vector gets wrapping indices semantics. I think we >>> should stop normalizing indices when converting a `Vector` into a >>> `VectorShuffle` (currently we wrap all out-of-bound elements to `[-VLEN, >>> 0)`). Then

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v2]

2024-09-13 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 22:30:25 GMT, Srinivas Vamsi Parasa wrote: >> So far, this will be the only intrinsic implementation of tanh. Therefore, >> at the moment it is just checking the consistency of the intrinsic >> implementation with StrictMath/FDLIBM tanh. If the intrinsic has a ~1 ulp >> ac

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-13 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 19:45:11 GMT, Paul Sandoz wrote: >>> Given `rearrange` with 1 vector gets wrapping indices semantics. I think we >>> should stop normalizing indices when converting a `Vector` into a >>> `VectorShuffle` (currently we wrap all out-of-bound elements to `[-VLEN, >>> 0)`). Then

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes [v2]

2024-09-13 Thread Sandhya Viswanathan

3,1) > 0x7f40d022750c: vmovdqu 0x40(%rsi,%r13,1),%xmm1 > 0x7f40d0227513: vpshufb %xmm2,%xmm1,%xmm1 > 0x7f40d0227518: vmovdqu %xmm1,0x40(%rax,%r13,1) > 0x7f40d022751f: add $0x40,%ebx > 0x7f40d0227522: cmp %r8d,%ebx > 0x7f40d022752

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-13 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 19:04:12 GMT, Jatin Bhateja wrote: >> @jatin-bhateja If you could expand on this comment with specific cases it >> will be helpful. The loadShuffle generation is needed for platform specific >> handling of shuffles and cannot be optimized out here. > > Hi @sviswa7, I was sug

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-13 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 05:30:36 GMT, Jatin Bhateja wrote: >> Currently the rearrange and selectFrom APIs check shuffle indices and throw >> IndexOutOfBoundsException if there is any exceptional source index in the >> shuffle. This causes the generated code to be less optimal. This PR modifies >>

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-13 Thread Sandhya Viswanathan

On Fri, 13 Sep 2024 17:20:40 GMT, Quan Anh Mai wrote: > Given `rearrange` with 1 vector gets wrapping indices semantics. I think we > should stop normalizing indices when converting a `Vector` into a > `VectorShuffle` (currently we wrap all out-of-bound elements to `[-VLEN, > 0)`). Then the re

Re: RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-12 Thread Sandhya Viswanathan

On Thu, 22 Aug 2024 18:21:50 GMT, Paul Sandoz wrote: > API shapes are good! > > I see you intrinsified `selectFrom` which, IIUC, optimally generates C2 nodes > that are functionally equivalent to the Java expression > `v.rearrange(this.toShuffle())`. That way we can better generate an optimal

RFR: 8340079: Modify rearrange/selectFrom Vector API methods to perform wrapIndexes instead of checkIndexes

2024-09-12 Thread Sandhya Viswanathan

Currently the rearrange and selectFrom APIs check shuffle indices and throw IndexOutOfBoundsException if there is any exceptional source index in the shuffle. This causes the generated code to be less optimal. This PR modifies the rearrange/selectFrom Vector API methods to perform wrapIndexes in

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v2]

2024-09-11 Thread Sandhya Viswanathan

On Wed, 11 Sep 2024 01:59:54 GMT, Joe Darcy wrote: >>> If the test is going to use randomness, then its jtreg tags should include >>> >>> `@key randomness` >>> >>> and it is preferable to use jdk.test.lib.RandomFactory to get a Random >>> object since that handles printing out a key so the r

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v3]

2024-09-10 Thread Sandhya Viswanathan

On Thu, 5 Sep 2024 19:10:34 GMT, Srinivas Vamsi Parasa wrote: >> The goal of this PR is to implement an x86_64 intrinsic for >> java.lang.Math.tanh() using libm >> >> Benchmark (ops/ms) | Stock JDK | Tanh intrinsic | Speedup >> -- | -- | -- | -- >> MathBench.tanhDouble | 70900 | 95618 | 1.35x >

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v8]

2024-09-06 Thread Sandhya Viswanathan

On Fri, 6 Sep 2024 06:43:31 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v3]

2024-09-06 Thread Sandhya Viswanathan

On Fri, 6 Sep 2024 21:15:07 GMT, Sandhya Viswanathan wrote: >> @vamsi-parasa don't hesitate in adding as much and explicit information >> about the original source from where the algorithm has been picked up, even >> though the PR explicitly mentions libm. Ad

Re: RFR: 8338694: x86_64 intrinsic for tanh using libm [v3]

2024-09-06 Thread Sandhya Viswanathan

On Wed, 4 Sep 2024 01:57:42 GMT, Jatin Bhateja wrote: >> @theRealAph, this implementation is based on Intel libm math library and >> meets the accuracy requirements. The algorithm is provided in the comments. > > @vamsi-parasa don't hesitate in adding as much and explicit information > about t

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v8]

2024-09-06 Thread Sandhya Viswanathan

On Fri, 6 Sep 2024 18:39:08 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Review suggestions > > src/jdk.incubator.vector/share/classes/jdk/incubat

Re: RFR: 8338021: Support new unsigned and saturating vector operators in VectorAPI [v8]

2024-09-06 Thread Sandhya Viswanathan

On Fri, 6 Sep 2024 06:43:31 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v2]

2024-09-03 Thread Sandhya Viswanathan

On Mon, 2 Sep 2024 12:15:10 GMT, Jatin Bhateja wrote: >> If the aim is to reduce the number of nodes, we could merge the >> Op_SaturatingAddVB, Op_SaturatingAddVS, Op_SaturatingAddVI, and >> Op_SaturatingAddVL into one Op_SaturatingAddV. Likewise for unsigned >> saturating add into Op_Saturati

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v5]

2024-09-03 Thread Sandhya Viswanathan

On Mon, 2 Sep 2024 12:20:59 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-09-03 Thread Sandhya Viswanathan

On Mon, 2 Sep 2024 12:17:08 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/x86.ad line 10656: >> >>> 10654: match(Set dst (SaturatingSubVI src1 src2)); >>> 10655: match(Set dst (SaturatingSubVL src1 src2)); >>> 10656: effect(TEMP ktmp); >> >> This needs TEMP dst as well. > > There is no

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v5]

2024-09-03 Thread Sandhya Viswanathan

On Mon, 2 Sep 2024 12:20:59 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-08-29 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v2]

2024-08-28 Thread Sandhya Viswanathan

On Wed, 28 Aug 2024 00:12:26 GMT, Sandhya Viswanathan wrote: >> Hey @jaskarth , Central idea behind introducing VectorReinterpretNode after >> unsigned vector IR is to facilitate unboxing-boxing optimization, this >> explicit reinterpretation ensures type compatibility b

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v2]

2024-08-27 Thread Sandhya Viswanathan

On Thu, 15 Aug 2024 06:59:53 GMT, Jatin Bhateja wrote: >>> its usage in existing patch is limited to [type >>> comparison.](https://github.com/openjdk/jdk/pull/20507/files#diff-3559dcf23b719805be5fd06fd5c1851dbd8f53e47afe6d99cba13a3de0ebc6b2R1542) >> >> Ah, that makes sense to me. I took a clos

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-08-27 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-08-27 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-08-26 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338021: Support saturating vector operators in VectorAPI [v4]

2024-08-23 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:19:30 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> following new vector operators. >> >> >> . SUADD : Saturating unsigned addition. >> . SADD: Saturating signed addition. >>

Re: RFR: 8338023: Support two vector selectFrom API [v3]

2024-08-21 Thread Sandhya Viswanathan

On Wed, 21 Aug 2024 16:42:44 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8338023: Support two vector selectFrom API [v3]

2024-08-21 Thread Sandhya Viswanathan

On Wed, 21 Aug 2024 18:27:09 GMT, Paul Sandoz wrote: > Is it possible for the intrinsic to be responsible for wrapping, if needed? > I was looking at > [`vpermi2b`](https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=vpermi2b&ig_expand=4917,4982,5004,5010,5014&techs=A

Re: RFR: 8338023: Support two vector selectFrom API [v3]

2024-08-21 Thread Sandhya Viswanathan

On Wed, 21 Aug 2024 16:49:40 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Pass explicit wrap argument to selectFrom API with default value set to >> true. > > Hi @rose00 , @sviswa7 , @PaulSa

Re: RFR: 8338023: Support two vector selectFrom API [v2]

2024-08-19 Thread Sandhya Viswanathan

On Mon, 19 Aug 2024 07:36:15 GMT, Jatin Bhateja wrote: >> Hi All, >> >> As per the discussion on panama-dev mailing list[1], patch adds the support >> for following new two vector permutation APIs. >> >> >> Declaration:- >> Vector.selectFrom(Vector v1, Vector v2) >> >> >> Semantics:- >>

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v47]

2024-05-28 Thread Sandhya Viswanathan

On Tue, 28 May 2024 23:52:27 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v43]

2024-05-28 Thread Sandhya Viswanathan

On Tue, 28 May 2024 18:11:13 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 1333: >> >>> 1331: >>> 1332: __ cmpq(nMinusK, 32); >>> 1333: __ jae_b(L_greaterThan32); >> >> Should this check be (n-k+1) >= 32? And so accordingly (n-k) >= 31 >> __ cmpq

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v43]

2024-05-28 Thread Sandhya Viswanathan

On Tue, 28 May 2024 17:30:24 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 278: >> >>> 276: __ bind(L_nextCheck); >>> 277: __ testq(haystack_len_p, haystack_len_p); >>> 278: __ je(L_zeroCheckFailed); >> >> This check could be removed as the next

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v43]

2024-05-28 Thread Sandhya Viswanathan

On Tue, 28 May 2024 17:59:49 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/c2_stubGenerator_x86_64_string.cpp line 578: >> >>> 576: // helper jumps to L_checkRangeAndReturn with a (-1) return value. >>> 577: big_case_loop_helper(false, 0, L_checkRangeAndReturn, L_loopTop, >>> mask, h

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v43]

2024-05-28 Thread Sandhya Viswanathan

On Sat, 25 May 2024 22:19:41 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v35]

2024-05-24 Thread Sandhya Viswanathan

On Thu, 23 May 2024 23:12:42 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v41]

2024-05-24 Thread Sandhya Viswanathan

On Fri, 24 May 2024 23:15:26 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v20]

2024-05-24 Thread Sandhya Viswanathan

On Fri, 17 May 2024 23:47:45 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v27]

2024-05-24 Thread Sandhya Viswanathan

On Wed, 22 May 2024 18:52:27 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v25]

2024-05-24 Thread Sandhya Viswanathan

On Wed, 22 May 2024 17:40:24 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v40]

2024-05-24 Thread Sandhya Viswanathan

On Fri, 24 May 2024 20:47:23 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v35]

2024-05-24 Thread Sandhya Viswanathan

On Thu, 23 May 2024 23:12:42 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v20]

2024-05-21 Thread Sandhya Viswanathan

On Fri, 17 May 2024 23:47:45 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-17 Thread Sandhya Viswanathan

On Thu, 16 May 2024 17:08:21 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 238: >> >>> 236: const Register needle = rdx; >>> 237: const Register needle_len = rcx; >>> 238: >> >> This is the calling convention on Linux. How is windows plat

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-17 Thread Sandhya Viswanathan

On Thu, 16 May 2024 20:22:40 GMT, Scott Gibbons wrote: >> src/hotspot/cpu/x86/stubGenerator_x86_64_string.cpp line 1510: >> >>> 1508: compare_big_haystack_to_needle(sizeKnown, size, >>> NUMBER_OF_NEEDLE_BYTES_TO_COMPARE, loop_top, hsPtrRet, hsLength, >>> 1509:

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v11]

2024-05-17 Thread Sandhya Viswanathan

On Fri, 17 May 2024 21:16:47 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark(algorithm) (dataSize) (keyLength) >> (provider) Mode Cnt ScoreError Units >> SignatureBench.ECDSA.signSHA256withECDSA1024 256

Re: RFR: 8329538: Accelerate P256 on x86_64 using Montgomery intrinsic [v9]

2024-05-16 Thread Sandhya Viswanathan

On Fri, 10 May 2024 00:19:32 GMT, Volodymyr Paprotski wrote: >> Performance. Before: >> >> Benchmark(algorithm) (dataSize) (keyLength) >> (provider) Mode Cnt ScoreError Units >> SignatureBench.ECDSA.signSHA256withECDSA1024 256

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-15 Thread Sandhya Viswanathan

On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark

Re: RFR: 8320448: Accelerate IndexOf using AVX2 [v19]

2024-05-14 Thread Sandhya Viswanathan

On Sat, 4 May 2024 19:35:21 GMT, Scott Gibbons wrote: >> Re-write the IndexOf code without the use of the pcmpestri instruction, only >> using AVX2 instructions. This change accelerates String.IndexOf on average >> 1.3x for AVX2. The benchmark numbers: >> >> >> Benchmark
