Hi Brett,

I'd suggest separate initialization and test methods for the two cases to get more reliable numbers.

By using @Setup(Level.Trial) and a common field for the test data, I think you have handicapped C2. The warmup runs JMH does to train C2 are 'seeing' two different types for the value of sequence. Making the test runs independent will remove doubt about interactions due to the test setup.
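
A minimal sketch of that separation, written as plain Java so it compiles without JMH on the classpath. The class and method names (SeparateCharAtCases, buildData, sumString, sumStringBuilder) are hypothetical; in the real benchmark each sum method would be its own @Benchmark method reading its own concretely typed field, so that neither call site is shared between String and StringBuilder.

```java
class SeparateCharAtCases {

    // Hypothetical helper: builds the same 3152-char payload as the original setup().
    public static StringBuilder buildData(boolean ascii) {
        StringBuilder sb = new StringBuilder(3152);
        for (int i = 0; i < 3152; ++i) {
            char c = (char) i;
            if (ascii) {
                c = (char) (i & 0x7f);
            }
            sb.append(c);
        }
        return sb;
    }

    // Case 1: the receiver is statically a String, so charAt is never
    // dispatched through the CharSequence interface at this call site.
    public static int sumString(String s) {
        int sum = 0;
        for (int i = 0, j = s.length(); i < j; ++i) {
            sum += s.charAt(i);
        }
        return sum;
    }

    // Case 2: the receiver is statically a StringBuilder.
    public static int sumStringBuilder(StringBuilder sb) {
        int sum = 0;
        for (int i = 0, j = sb.length(); i < j; ++i) {
            sum += sb.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        StringBuilder sb = buildData(true);
        String s = sb.toString();
        // Both paths walk the same characters, so the sums must agree.
        System.out.println(sumString(s) == sumStringBuilder(sb));
    }
}
```

This only shows the shape of the split. The timing itself would still have to come from JMH.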

Roger

On 7/21/25 1:43 PM, Brett Okken wrote:
> output labeled as StringBuffer but the jmh creates StringBuilder.

Ugh - sorry about that. But yes - this is about StringBuilder vs String.

> I would not be surprised that C2 has more optimizations for String than for StringBuilder.

If that were true, it would not surprise me. However, these tests show the opposite. String is /slower/ than StringBuilder.

On Mon, Jul 21, 2025 at 12:34 PM Roger Riggs <roger.ri...@oracle.com> wrote:

    Hi Brett,

    The labeling of the output is confusing: the test output is labeled
    as StringBuffer but the jmh creates StringBuilder.
    (StringBuffer methods are all synchronized, which could explain why
    they are slower.)

    Also, I would not be surprised that C2 has more optimizations for
    String than for StringBuilder.

    Regards, Roger

    On 7/19/25 6:09 PM, Brett Okken wrote:
    Making sequence a local variable does improve things (especially
    for ascii), but a substantial difference remains. It appears that
    the performance difference for ascii goes all the way back to jdk
    11. The difference for non-ascii showed up in jdk 21. I wonder if
    this is related to the index checks?

    jdk 11

    Benchmark  (data)      (source)  Mode  Cnt     Score    Error  Units
    test        ascii        String  avgt    3  1137.348 ±   12.835  ns/op
    test        ascii  StringBuffer  avgt    3   712.874 ±  509.320  ns/op
    test    non-ascii        String  avgt    3   668.657 ±  246.550  ns/op
    test    non-ascii  StringBuffer  avgt    3   897.344 ± 4353.414  ns/op


    jdk 17
    Benchmark  (data)      (source)  Mode  Cnt     Score    Error  Units
    test        ascii        String  avgt    3  1321.497 ± 2107.466  ns/op
    test        ascii  StringBuffer  avgt    3   715.936 ±  412.189  ns/op
    test    non-ascii        String  avgt    3   722.986 ±  443.389  ns/op
    test    non-ascii  StringBuffer  avgt    3   722.787 ±  771.816  ns/op


    jdk 21
    Benchmark  (data)      (source)  Mode  Cnt     Score     Error  Units
    test        ascii        String  avgt    3  1150.301 ±   918.549  ns/op
    test        ascii  StringBuffer  avgt    3   713.183 ±   543.850  ns/op
    test    non-ascii        String  avgt    3  4642.667 ± 11481.029  ns/op
    test    non-ascii  StringBuffer  avgt    3   728.027 ±   936.521  ns/op


    jdk 25
    Benchmark  (data)      (source)  Mode  Cnt     Score    Error  Units
    test        ascii        String  avgt    3  1184.513 ± 2057.498  ns/op
    test        ascii  StringBuffer  avgt    3   786.611 ±  411.657  ns/op
    test    non-ascii        String  avgt    3  4197.585 ± 2761.388  ns/op
    test    non-ascii  StringBuffer  avgt    3   716.375 ±  815.349  ns/op


    jdk 26
    Benchmark  (data)      (source)  Mode  Cnt     Score   Error  Units
    test        ascii        String  avgt    3  1107.207 ± 423.072  ns/op
    test        ascii  StringBuffer  avgt    3   742.780 ± 178.890  ns/op
    test    non-ascii        String  avgt    3  4043.914 ± 498.439  ns/op
    test    non-ascii  StringBuffer  avgt    3   712.535 ± 583.255  ns/op


    On Sat, Jul 19, 2025 at 4:17 PM Chen Liang
    <liangchenb...@gmail.com> wrote:

        Without looking at C2 IRs, I think there are a few potential
        culprits we can look into:
        1. JDK-8351000 and JDK-8351443 updated StringBuilder
        2. Sequence field is read in the loop; I wonder if making it
        an explicit immutable local variable changes anything here.
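
The second suggestion can be sketched as follows. The class name LocalSequenceSum and its two methods are hypothetical; sumViaField mirrors the loop in the original test(), and sumViaLocal copies the field into a final local once before the loop. Whether this changes the code C2 generates is exactly what would need measuring; the sketch only shows the source-level change.

```java
class LocalSequenceSum {

    // Non-final field, as in the original benchmark class.
    private CharSequence sequence;

    LocalSequenceSum(CharSequence sequence) {
        this.sequence = sequence;
    }

    // Reads the field inside the loop on every iteration.
    public int sumViaField() {
        int sum = 0;
        for (int i = 0, j = sequence.length(); i < j; ++i) {
            sum += sequence.charAt(i);
        }
        return sum;
    }

    // Reads the field once into an immutable local and loops over that.
    public int sumViaLocal() {
        final CharSequence seq = sequence;
        int sum = 0;
        for (int i = 0, j = seq.length(); i < j; ++i) {
            sum += seq.charAt(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        LocalSequenceSum s = new LocalSequenceSum(new StringBuilder("abc"));
        // Both variants visit the same characters and return the same sum.
        System.out.println(s.sumViaField() == s.sumViaLocal());
    }
}
```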

        On Sat, Jul 19, 2025 at 2:34 PM Brett Okken
        <brett.okken...@gmail.com> wrote:

            I was looking at the performance of StringCharBuffer for
            various backing CharSequence types and was surprised to see
            a significant performance difference between String and
            StringBuffer. I wrote a small jmh which shows that the
            String implementation of charAt is significantly slower
            than StringBuilder. Is this expected?

            Benchmark                            (data)      (source)  Mode  Cnt     Score       Error  Units
            CharSequenceCharAtBenchmark.test      ascii        String  avgt    3  2537.311 ±  8952.197  ns/op
            CharSequenceCharAtBenchmark.test      ascii  StringBuffer  avgt    3   852.004 ±  2532.958  ns/op
            CharSequenceCharAtBenchmark.test  non-ascii        String  avgt    3  5115.381 ± 13822.592  ns/op
            CharSequenceCharAtBenchmark.test  non-ascii  StringBuffer  avgt    3   836.230 ±  1154.191  ns/op



            import java.util.concurrent.TimeUnit;
            import org.openjdk.jmh.annotations.*;

            @Measurement(iterations = 3, time = 5, timeUnit = TimeUnit.SECONDS)
            @Warmup(iterations = 2, time = 7, timeUnit = TimeUnit.SECONDS)
            @BenchmarkMode(Mode.AverageTime)
            @OutputTimeUnit(TimeUnit.NANOSECONDS)
            @State(Scope.Benchmark)
            @Fork(value = 1, jvmArgsPrepend = {"-Xms512M", "-Xmx512M"})
            public class CharSequenceCharAtBenchmark {

                @Param(value = {"ascii", "non-ascii"})
                public String data;

                @Param(value = {"String", "StringBuffer"})
                public String source;

                private CharSequence sequence;

                @Setup(Level.Trial)
                public void setup() throws Exception {
                    StringBuilder sb = new StringBuilder(3152);
                    for (int i=0; i<3152; ++i) {
                        char c = (char) i;
                        if ("ascii".equals(data)) {
                            c = (char) (i & 0x7f);
                        }
                        sb.append(c);
                    }

                    switch(source) {
                        case "String":
                            sequence = sb.toString();
                            break;
                        case "StringBuffer":
                            sequence = sb;
                            break;
                        default:
                            throw new IllegalArgumentException(source);
                    }
                }

                @Benchmark
                public int test() {
                    int sum = 0;
                    for (int i=0, j=sequence.length(); i<j; ++i) {
                        sum += sequence.charAt(i);
                    }
                    return sum;
                }
            }

