On Tue, 7 Jan 2025 10:39:18 GMT, Shaojin Wen wrote:
> In PR #22928, UUID introduced long-based vectorized hexadecimal to string
> conversion, which can also be used in Integer::toHexString and
> Long::toHexString to eliminate table lookups. The benefit of eliminating
> table lookups is that th
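A minimal sketch of what a table-free, long-based hexadecimal conversion can look like, assuming a SWAR-style approach (eight nibbles processed at once inside one `long`). The names `HexSketch` and `toHex8` are hypothetical, and this is not the code from PR #22928; unlike `Integer::toHexString`, the sketch always emits eight zero-padded digits.

```java
final class HexSketch {
    // Hypothetical helper: formats an int as 8 lowercase hex digits
    // without a lookup table.
    static String toHex8(int value) {
        // Spread the 8 nibbles of the int so each one sits in its own byte lane.
        long x = value & 0xFFFFFFFFL;
        x = (x | (x << 16)) & 0x0000FFFF0000FFFFL;
        x = (x | (x << 8))  & 0x00FF00FF00FF00FFL;
        x = (x | (x << 4))  & 0x0F0F0F0F0F0F0F0FL;

        // Lanes holding 10..15 get bit 4 set after adding 6.
        long m = (x + 0x0606060606060606L) & 0x1010101010101010L;

        // Per lane: '0' + nibble, plus 39 ('a' - '0' - 10) where the nibble >= 10.
        // 39 is built as 0x20 + 0x08 - 0x01 from the marker bit.
        long r = ((m << 1) + (m >> 1) - (m >> 4)) + 0x3030303030303030L + x;

        // The most significant nibble ended up in the highest byte lane.
        char[] out = new char[8];
        for (int k = 0; k < 8; k++) {
            out[k] = (char) ((r >>> (56 - 8 * k)) & 0xFF);
        }
        return new String(out);
    }
}
```

For instance, `HexSketch.toHex8(0x9a)` is expected to agree with `String.format("%08x", 0x9a)`.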
On Fri, 15 Nov 2024 22:17:10 GMT, Chen Liang wrote:
> When core reflection was migrated to be implemented by Method Handles,
> somehow, the method handles are not used for native methods, which are
> generally linkable by method handles. This causes significant performance
> regressions when
On Fri, 15 Nov 2024 10:05:29 GMT, Aleksey Shipilev wrote:
>> See the bug for more discussion and reproducer. This PR replaces the ad-hoc
>> linked list with a segmented list of arrays. Arrays are easy targets for GC. There
>> are possible improvements here, most glaring is parallelism that is
>> curr
On Tue, 12 Nov 2024 10:03:50 GMT, Emanuel Peter wrote:
>>> Thanks @minborg for this :) Please remember to add the misprediction count
>>> if you can and avoid the bulk methods by having a `nextMemorySegment()`
>>> benchmark method which makes a single fill call site to observe the
>>> different
On Mon, 11 Nov 2024 17:23:56 GMT, Maurizio Cimadamore
wrote:
>> Per Minborg has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contains six additional
>> c
On Mon, 11 Nov 2024 14:50:57 GMT, Per Minborg wrote:
>> This PR proposes to add a new `MemorySegment::fill` benchmark where the byte
>> size of the segments varies.
>
> Per Minborg has updated the pull request with a new target base due to a
> merge or a rebase. The incremental webrev excludes
On Mon, 11 Nov 2024 11:55:27 GMT, Per Minborg wrote:
> This PR proposes to add a new `MemorySegment::fill` benchmark where the byte
> size of the segments varies.
Thanks @minborg for this :) Please remember to add the misprediction count if
you can - or just avoid the bulk methods instead i.e.
On Fri, 27 Sep 2024 14:21:57 GMT, Galder Zamarreño wrote:
>> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in
>> order to help improve vectorization performance.
>>
>> Currently vectorization does not kick in for loops containing either of
>> these calls because of
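For illustration, these are the kinds of loops the snippet above refers to: a reduction and an element-wise loop over `long` arrays that call `Math.max(long, long)` or `Math.min(long, long)`. The class and method names are hypothetical, and whether a given JVM actually vectorizes them depends on the JDK build and the hardware; this is only a sketch of the loop shape.

```java
final class MinMaxLoopSketch {
    // Reduction loop: running maximum over a long array.
    static long maxOf(long[] values) {
        long max = Long.MIN_VALUE;
        for (int i = 0; i < values.length; i++) {
            max = Math.max(max, values[i]);
        }
        return max;
    }

    // Element-wise loop: out[i] = min(a[i], b[i]).
    // Assumes all three arrays have the same length.
    static void pairwiseMin(long[] a, long[] b, long[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = Math.min(a[i], b[i]);
        }
    }
}
```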
On Thu, 12 Sep 2024 11:14:32 GMT, Per Minborg wrote:
>>> Wdyt about the benchmark I have added at [#20829
>>> (comment)](https://github.com/openjdk/jdk/pull/20829#issuecomment-2326404582)?
>>> Sadly I didn't yet run against this PR
>>
>> Let's see what we can do about this later.
>
>> Wdyt abo
On Tue, 3 Sep 2024 07:52:44 GMT, Per Minborg wrote:
> This PR proposes to handle smaller FFM copy operations with Java code rather
> than transitioning to native code. This will improve performance. In this PR,
> copy operations involving zero to 63 bytes will be handled by Java code.
>
> Here
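A rough sketch of the general idea described above, written against plain `byte[]` arrays rather than the FFM internals: copies below a size threshold are done with ordinary Java loads and stores, and larger ones are delegated to the bulk routine. Everything here (`SmallCopySketch`, `JAVA_COPY_LIMIT`, the use of `byteArrayViewVarHandle`) is an assumption made for illustration and not the code in the PR. The sketch also assumes the source and destination ranges do not overlap on the small-copy path.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteOrder;
import java.util.Objects;

final class SmallCopySketch {
    // Hypothetical threshold: copies of 0..63 bytes stay in Java code.
    static final int JAVA_COPY_LIMIT = 64;

    private static final VarHandle LONG_VIEW =
            MethodHandles.byteArrayViewVarHandle(long[].class, ByteOrder.nativeOrder());
    private static final VarHandle INT_VIEW =
            MethodHandles.byteArrayViewVarHandle(int[].class, ByteOrder.nativeOrder());
    private static final VarHandle SHORT_VIEW =
            MethodHandles.byteArrayViewVarHandle(short[].class, ByteOrder.nativeOrder());

    static void copy(byte[] src, int srcOff, byte[] dst, int dstOff, int len) {
        Objects.checkFromIndexSize(srcOff, len, src.length);
        Objects.checkFromIndexSize(dstOff, len, dst.length);
        if (len >= JAVA_COPY_LIMIT) {
            // Large copy: use the bulk routine.
            System.arraycopy(src, srcOff, dst, dstOff, len);
            return;
        }
        int i = 0;
        // Copy 8 bytes at a time while at least 8 remain.
        for (; len - i >= 8; i += 8) {
            LONG_VIEW.set(dst, dstOff + i, (long) LONG_VIEW.get(src, srcOff + i));
        }
        // Tail: explicit "remaining >= n" checks for the last 0..7 bytes.
        int remaining = len - i;
        if (remaining >= 4) {
            INT_VIEW.set(dst, dstOff + i, (int) INT_VIEW.get(src, srcOff + i));
            i += 4;
            remaining -= 4;
        }
        if (remaining >= 2) {
            SHORT_VIEW.set(dst, dstOff + i, (short) SHORT_VIEW.get(src, srcOff + i));
            i += 2;
            remaining -= 2;
        }
        if (remaining >= 1) {
            dst[dstOff + i] = src[srcOff + i];
        }
    }
}
```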
On Tue, 3 Sep 2024 15:44:34 GMT, Maurizio Cimadamore
wrote:
>> src/java.base/share/classes/jdk/internal/foreign/AbstractMemorySegmentImpl.java
>> line 642:
>>
>>> 640: // 0...0X00
>>> 641: if (remaining >= 4) {
>>> 642: final int v =
>>> SCOPED_MEMORY_A
On Tue, 3 Sep 2024 08:39:02 GMT, Per Minborg wrote:
>> I found similar small improvements to be had (I wrote about them offline)
>> when replacing the bitwise-based tests (e.g. `(foo & 4) != 0`) with a more
>> explicit check for `remainingBytes >= 4`. Seems like bitwise operations are
>> not as o
On Mon, 2 Sep 2024 09:36:40 GMT, Maurizio Cimadamore
wrote:
> This looks good. Again, the goal of this PR is not to squeeze every
> nanosecond out - but, rather, to achieve a performance model that is
> "sensible"
fully agree and yep - this looks pretty good already, well done @minborg !
tha
On Tue, 27 Aug 2024 17:09:18 GMT, Galder Zamarreño wrote:
>> This patch intrinsifies `Math.max(long, long)` and `Math.min(long, long)` in
>> order to help improve vectorization performance.
>>
>> Currently vectorization does not kick in for loops containing either of
>> these calls because of
On Fri, 30 Aug 2024 10:51:59 GMT, Per Minborg wrote:
>> The performance of the `MemorySegment::fill` can be improved by replacing the
>> `checkAccess()` method call with calling `checkReadOnly()` instead (as the
>> bounds of the segment itself do not need to be checked).
>>
>> Also, smaller seg
On Fri, 30 Aug 2024 15:21:52 GMT, Maurizio Cimadamore
wrote:
> in this case, we can't optimize as well, because we have different branches
> which get taken or not in a less predictable fashion.
Exactly - It has been designed to show the case when the conditions materialize
(because are take
On Fri, 30 Aug 2024 12:15:36 GMT, Per Minborg wrote:
>> @minborg Hi! I didn't check the numbers with the benchmark I've written at
>> https://github.com/openjdk/jdk/pull/20712#discussion_r1732802685 which is
>> meant to stress the branch predictor (without enough `samples` i.e. past
>> 128K
On Fri, 30 Aug 2024 09:09:57 GMT, Per Minborg wrote:
>> The performance of the `MemorySegment::fill` can be improved by replacing the
>> `checkAccess()` method call with calling `checkReadOnly()` instead (as the
>> bounds of the segment itself do not need to be checked).
>>
>> Also, smaller seg
On Wed, 28 Aug 2024 09:06:48 GMT, Per Minborg wrote:
>> How fast do we need to be here given we are measuring in a few nanoseconds
>> per operation?
>>
>> What if the goal is not to regress from say explicitly filling in a small
>> sized segment or a comparable array (e.g., < 8 bytes) then ma
On Tue, 27 Aug 2024 09:47:20 GMT, Per Minborg wrote:
>> As discussed offline, can't we use a stable array of functions or something
>> like that which can be populated lazily? That way you can access the
>> function you want in a single array access, and we could put all these
>> helper method
On Mon, 26 Aug 2024 12:11:29 GMT, Per Minborg wrote:
>> The performance of the `MemorySegment::fill` can be improved by replacing the
>> `checkAccess()` method call with calling `checkReadOnly()` instead (as the
>> bounds of the segment itself do not need to be checked).
>>
>> Also, smaller seg
On Mon, 26 Aug 2024 12:07:02 GMT, Per Minborg wrote:
>> Per Minborg has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> Add a comment about the old switch type
>
> Here is what it looks like for Windows x64:
>
>  and only about 1
On Sat, 23 Mar 2024 02:16:57 GMT, Scott Gibbons wrote:
>> Re-write the IndexOf code without the use of the pcmpestri instruction, only
>> using AVX2 instructions. This change accelerates String.IndexOf on average
>> 1.3x for AVX2. The benchmark numbers:
>>
>>
>> Benchmark
On Sun, 17 Sep 2023 16:01:33 GMT, Shaojin Wen wrote:
> @cl4es made performance optimizations for the simple specifiers of
> String.format in PR https://github.com/openjdk/jdk/pull/2830. Based on the
> same idea, I continued to make improvements. I made patterns like %2d %02d
> also be optimize
On Wed, 27 Sep 2023 09:35:47 GMT, 温绍锦 wrote:
>> @cl4es made performance optimizations for the simple specifiers of
>> String.format in PR https://github.com/openjdk/jdk/pull/2830. Based on the
>> same idea, I continued to make improvements. I made patterns like %2d %02d
>> also be optimized.
>
On Tue, 30 May 2023 09:32:02 GMT, Jan Lahoda wrote:
>> @forax
>>
>> Hi! Sorry for this sudden message, but this one captured my attention
>>
>>> and subtype checks are usually fast.
>>
>> And I hope this PR to be the right place to raise this.
>>
>> I was looking at this PR to better understand
On Tue, 2 May 2023 13:57:37 GMT, Rémi Forax wrote:
>> Jan Lahoda has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contains four additional
>> commits sinc
On Wed, 15 Mar 2023 14:56:28 GMT, Eirik Bjorsnos wrote:
>> Eirik Bjorsnos has updated the pull request incrementally with one
>> additional commit since the last revision:
>>
>> Update StringLatin1.canEncode to sync with same test in CharacterData.of
>
> Just for fun, I tried with a benchmark
On Wed, 15 Mar 2023 13:42:22 GMT, Eirik Bjorsnos wrote:
>> Can you check what happens when adding many more inputs to the dataset,
>> including non-Latin chars as well, and use `-prof perfnorm` to check what
>> `perf` reports re branches/branch-misses?
>>
>> You can use `SplittableRandom` to pre-popula
On Wed, 15 Mar 2023 12:28:05 GMT, Eirik Bjorsnos wrote:
>>> `if ((ch & 0xFF00) == 0) {`
>>
>> This seems to perform similar to baseline:
>>
>>
>> Benchmark (codePoint) Mode Cnt Score Error Units
>> Characters.isDigit 48 avgt 15 0.890 ± 0.025 ns/op
>> Character
On Wed, 8 Feb 2023 00:07:14 GMT, Claes Redestad wrote:
>> This patch adds special-cases to `Arrays.copyOf` and `Arrays.copyOfRange` to
>> clone arrays when `newLength` or range inputs span the input array. This
>> helps eliminate range checks and has been verified to help various String
>> ope
On Tue, 7 Feb 2023 22:43:15 GMT, Claes Redestad wrote:
>> This adds a local, specialized `copyBytes` method to `String` that avoids
>> certain redundant range checks and clamping that JIT has issues removing
>> fully.
>>
>> This has a small but statistically significant effect on `String`
>>
On Tue, 7 Feb 2023 20:32:11 GMT, Claes Redestad wrote:
>> src/java.base/share/classes/java/lang/String.java line 698:
>>
>>> 696: }
>>> 697:
>>> 698: static byte[] copyBytes(byte[] bytes, int offset, int length) {
>>
>> Given that the stub generated for array copy seems highly dependen
On Tue, 7 Feb 2023 15:25:05 GMT, Claes Redestad wrote:
>> This adds a local, specialized `copyBytes` method to `String` that avoids
>> certain redundant range checks and clamping that JIT has issues removing
>> fully.
>>
>> This has a small but statistically significant effect on `String`
>>