Use `JavaLangAccess::uncheckedNewStringNoRepl` in `MemorySegment::getString` to 
avoid byte[] allocation in the String constructor.

Fall back to the old code in the case of malformed input to get replacement 
characters as per Javadoc API specification. The existing tests in 
`TestStringEncoding` seem sufficient to me.

I ran the tier1 test suite and it passes.

For performance analysis I ran.

    make test TEST="micro:org.openjdk.bench.java.lang.foreign.ToJavaStringTest" 
MICRO="OPTIONS=-prof gc"

on an AMD Ryzen 7 PRO 5750GE.

These are the formatted results, the current master is the line on top, this 
feature branch is the line below. We can see an improvement in throughput 
driven by a reduction in allocation.


Benchmark                                              (size)  Mode  Cnt     
Score    Error   Units

ToJavaStringTest.panama_readString                          5  avgt   30    
18.996 ±  0.044   ns/op
ToJavaStringTest.panama_readString                          5  avgt   30    
13.851 ±  0.028   ns/op

ToJavaStringTest.panama_readString                         20  avgt   30    
23.570 ±  0.050   ns/op
ToJavaStringTest.panama_readString                         20  avgt   30    
18.401 ±  0.069   ns/op

ToJavaStringTest.panama_readString                        100  avgt   30    
32.094 ±  0.207   ns/op
ToJavaStringTest.panama_readString                        100  avgt   30    
24.427 ±  0.112   ns/op

ToJavaStringTest.panama_readString                        200  avgt   30    
43.029 ±  0.185   ns/op
ToJavaStringTest.panama_readString                        200  avgt   30    
31.914 ±  0.064   ns/op

ToJavaStringTest.panama_readString                        451  avgt   30    
81.145 ±  0.403   ns/op
ToJavaStringTest.panama_readString                        451  avgt   30    
58.975 ±  0.233   ns/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm       5  avgt   30    
72.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm       5  avgt   30    
48.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm      20  avgt   30   
104.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm      20  avgt   30    
64.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     100  avgt   30   
264.000 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     100  avgt   30   
144.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     200  avgt   30   
456.001 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     200  avgt   30   
240.000 ±  0.001    B/op

ToJavaStringTest.panama_readString:gc.alloc.rate.norm     451  avgt   30   
968.001 ±  0.001    B/op
ToJavaStringTest.panama_readString:gc.alloc.rate.norm     451  avgt   30   
496.001 ±  0.001    B/op


I looked into whether there are inlining issues with the current or proposed 
code. For this I ran


var segment = MemorySegment.ofArray(new byte[] {'c', 'o', 'f', 'f', 'e', 'e', ' 
', 'b', 'a', 'b', 'e', 0});
for (int i = 0; i < 200_000_000; i++) {
  var string = segment.getString(0);
  if (System.identityHashCode(string) == 1) {
    System.out.println("x");
   }
}


with `-XX:+PrintCompilation -XX:+PrintInlining -XX:+UnlockDiagnosticVMOptions `

for the current master the simplified output is like this


@ 75   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)   
force inline by annotation   callee changed to  help.GetString::main (107 
bytes)    -> TypeProfile (95646/95646 counts) = 
jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
  @ 5   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)   
force inline by annotation
    @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by 
annotation
    @ 8   jdk.internal.foreign.StringSupport::read (67 bytes)   force inline by 
annotation
      @ 1   jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)   
inline (hot)
      @ 4   java.lang.Enum::ordinal (5 bytes)   accessor
      @ 45   jdk.internal.foreign.StringSupport::readByte (41 bytes)   force 
inline by annotation
        @ 3   jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5 
bytes)   accessor
        @ 6   jdk.internal.foreign.StringSupport::strlenByte (232 bytes)   
force inline by annotation
          <snip>
        @ 27   java.lang.foreign.MemorySegment::copy (29 bytes)   force inline 
by annotation
          <snip>
        @ 37   java.lang.String::<init> (16 bytes)   inline (hot)
          @ 2   java.util.Objects::requireNonNull (14 bytes)   force inline by 
annotation
          @ 12   java.lang.String::<init> (86 bytes)   inline (hot)
            @ 23   java.lang.String::utf8 (271 bytes)   inline (hot)
              @ 9   java.lang.StringCoding::countPositives (33 bytes)   
(intrinsic)
              @ 27   java.util.Arrays::copyOfRange (82 bytes)   inline (hot)
                @ 11   java.lang.Object::clone (0 bytes)   failed to inline: 
native method   (intrinsic)
              @ 31   java.lang.String::<init> (15 bytes)   inline (hot)
                @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
            @ 82   java.lang.String::<init> (37 bytes)   inline (hot)
              @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)


As we can see everything is inlined but since `String::utf8` explicitly asks 
for a copy of the `byte[]` 
https://github.com/openjdk/jdk/blob/3263361a28c7e8c02734cb94bc9576e9f3ba5b50/src/java.base/share/classes/java/lang/String.java#L575
 this can not be optimized away.

This is the compliation log with the proposed changes


@ 75   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (9 bytes)   
force inline by annotation   callee changed to  help.GetString::main (107 
bytes)    -> TypeProfile (121249/121249 counts) = 
jdk/internal/foreign/HeapMemorySegmentImpl$OfByte
  @ 5   jdk.internal.foreign.AbstractMemorySegmentImpl::getString (12 bytes)   
force inline by annotation
    @ 1   java.util.Objects::requireNonNull (14 bytes)   force inline by 
annotation
    @ 8   jdk.internal.foreign.StringSupport::read (67 bytes)   force inline by 
annotation
      @ 1   jdk.internal.foreign.StringSupport$CharsetKind::of (102 bytes)   
inline (hot)
      @ 4   java.lang.Enum::ordinal (5 bytes)   accessor
      @ 45   jdk.internal.foreign.StringSupport::readByte (55 bytes)   force 
inline by annotation
        @ 3   jdk.internal.foreign.AbstractMemorySegmentImpl::byteSize (5 
bytes)   accessor
        @ 6   jdk.internal.foreign.StringSupport::strlenByte (232 bytes)   
force inline by annotation
          <snip>
        @ 27   java.lang.foreign.MemorySegment::copy (29 bytes)   force inline 
by annotation
          <snip>
        @ 36   java.lang.System$1::uncheckedNewStringNoRepl (6 bytes)   inline 
(hot)
          @ 2   java.lang.String::newStringNoRepl (33 bytes)   inline (hot)
            @ 2   java.lang.String::newStringNoRepl1 (284 bytes)   inline (hot)
              @ 22   java.lang.String::newStringUTF8NoRepl (322 bytes)   inline 
(hot)
                @ 4   java.lang.String::checkBoundsOffCount (10 bytes)   inline 
(hot)
                  @ 6   jdk.internal.util.Preconditions::checkFromIndexSize (25 
bytes)   inline (hot)
                @ 24   java.lang.StringCoding::countPositives (33 bytes)   
(intrinsic)
                @ 73   java.lang.String::<init> (15 bytes)   inline (hot)
                  @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)


everything still inlines even though some methods are larger.

-------------

Commit messages:
 - 8362893: Improve performance for MemorySegment::getString

Changes: https://git.openjdk.org/jdk/pull/26493/files
  Webrev: https://webrevs.openjdk.org/?repo=jdk&pr=26493&range=00
  Issue: https://bugs.openjdk.org/browse/JDK-8362893
  Stats: 19 lines in 1 file changed: 16 ins; 0 del; 3 mod
  Patch: https://git.openjdk.org/jdk/pull/26493.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/26493/head:pull/26493

PR: https://git.openjdk.org/jdk/pull/26493

Reply via email to