All,

Please ignore the fact that my benchmark is all oriented around toUpperCase instead of toLowerCase :)

-chris

On 9/8/23 13:25, Christopher Schultz wrote:
All,

There are many cases in Tomcat where we change the letter-case of a String value so it's easier to compare when case doesn't matter. In particular, HTTP header names and many spec-defined values are supposed to be case-insensitive and so all comparisons involving them must be done without regard to letter-case.

The idiom in Tomcat source code for that is[1]:

     collection.add(element.toLowerCase(Locale.ENGLISH));

Locale.ENGLISH is used because all of these values are supposed to be in ASCII encoding and Locale.ENGLISH is as good as any equivalent Locale that (nominally) uses (mostly) ASCII semantics.
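To illustrate why the explicit Locale matters, here is a minimal sketch (not Tomcat code; the class and method names are made up for the example). It relies on the well-known Turkish case mapping, where 'I' lower-cases to the dotless 'ı' (U+0131) instead of ASCII 'i':

```java
import java.util.Locale;

// Hypothetical demo class, not part of Tomcat.
final class LocaleLowerCaseDemo {

    // Lower-case using the Locale from the Tomcat idiom above.
    static String lowerEnglish(String s) {
        return s.toLowerCase(Locale.ENGLISH);
    }

    // Lower-case using Turkish rules, where 'I' maps to U+0131 (dotless i).
    static String lowerTurkish(String s) {
        return s.toLowerCase(Locale.forLanguageTag("tr"));
    }

    public static void main(String[] args) {
        String value = "TITLE";
        // The English result is plain ASCII; the Turkish result is not.
        System.out.println(lowerEnglish(value).equals("title"));
        System.out.println(lowerTurkish(value).equals("title"));
    }
}
```

Calling toLowerCase() with no argument uses the JVM's default Locale, so the result could differ from one machine to another.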

It turns out that String.toLowerCase (and its mirror, String.toUpperCase) has a ton of code in it to manage the many complexities of Locales in which we are not interested.

Implementing an ASCII-only version of toLowerCase appears to be roughly 2x faster for some simple cases. I have a sample microbenchmark below, along with the jmh output on Java 17.

Given the frequency of calls to toLowerCase (many times per request), I think it may be a worthwhile performance improvement to implement our own version of toLowerCase and use it when only ASCII is expected.

It may even be possible to write a more complicated version of toLowerCase than I have below that performs even faster (e.g. for String values that end up not having any upper-case characters at all).
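As a rough sketch of what such a version might look like (untested for performance, and the class and method names here are only placeholders, not an existing Tomcat API): scan for the first upper-case ASCII letter, return the original String untouched if there is none, and only allocate a copy when something actually has to change.

```java
// Hypothetical helper, not an existing Tomcat class.
final class AsciiCase {

    private AsciiCase() {
        // Utility class, no instances.
    }

    // Lower-cases only the ASCII letters 'A'..'Z'; every other char is left as-is.
    static String toLowerCaseAscii(String s) {
        int len = s.length();

        // Fast path: find the first character that needs changing.
        int first = 0;
        while (first < len) {
            char c = s.charAt(first);
            if (c >= 'A' && c <= 'Z') {
                break;
            }
            first++;
        }
        if (first == len) {
            // Nothing to convert, so avoid allocating a new String.
            return s;
        }

        // Slow path: copy once, then convert from the first upper-case char onward.
        char[] result = s.toCharArray();
        for (int i = first; i < len; i++) {
            char c = result[i];
            if (c >= 'A' && c <= 'Z') {
                result[i] = (char) (c + ('a' - 'A'));
            }
        }
        return new String(result);
    }
}
```

With this, AsciiCase.toLowerCaseAscii("X-Frame-Options") gives "x-frame-options", and a value that is already lower-case comes back as the same String instance. Whether the extra scan actually pays off would need to be measured with jmh.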

WDYT?

-chris

[1] https://github.com/apache/tomcat/blob/feb77a15849389001ebcfdd623df86a42a62019e/java/org/apache/tomcat/util/http/parser/TokenList.java#L95

Benchmark                                Mode  Cnt         Score         Error  Units
MyBenchmark.testStringToUpperCase       thrpt    5  28130795.259 ± 1297495.570  ops/s
MyBenchmark.testStringToUpperCaseASCII  thrpt    5  52221288.421 ± 5112349.492  ops/s

Source:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Warmup;

@Warmup(iterations=5, time=5, timeUnit=TimeUnit.SECONDS)
@Measurement(iterations=5, time=5, timeUnit=TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
@Fork(1)
public class MyBenchmark {

     private static final String SOURCE = "X-Frame-Options";

     @Benchmark
     public String testStringToUpperCase() {
         return SOURCE.toUpperCase();
     }

     @Benchmark
     public String testStringToUpperCaseASCII() {
         return toUpperCaseASCII(SOURCE);
     }

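     // ASCII-only upper-casing: only 'a'..'z' are changed; all other
     // chars (including non-ASCII letters) are copied through as-is.
     public String toUpperCaseASCII(String s) {
         int len = s.length();
         char[] result = new char[len];
         for(int i=0; i<len; i++) {
             char c = s.charAt(i);

             if(c >= 'a' && c <= 'z') {
                 // 32 is the distance between 'a' and 'A' in ASCII
                 c -= 32;
             }

             result[i] = c;
         }

         return new String(result);
     }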
}

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org
