On Thu, 12 Mar 2026 11:09:38 GMT, Eirik Bjørsnøs <[email protected]> wrote:
>> This PR optimizes string concatenation in `URL::toExternalForm`. This method
>> is called by `URL:toString` and executes often enough to be noticed during
>> startup / classloading profiling. It was previously optimized in JDK 10 via
>> JDK-8193034, but we can do more.
>>
>> Key observations:
>>
>> * The current method concatenates concatenated strings. We can get away with
>> less by flattening and concatenating once per call.
>> * The current method does not take advantage of the fact that most URL
>> components are optional and several are commonly not present. We can
>> specialize for those cheaper common concatenations.
>>
>> This PR reduces runtime with 60% or more, depending on present URL
>> components. (Benchmark results in first comment). I also ran the benchmark
>> with compilation excluded and found no regressions for the interpreter case.
>>
>> Working on this PR revealed that current tests miss coverage for the case
>> where a query or ref component is the empty string. Some new test cases are
>> added to `Constructor.java`. Otherwise, `URL::toString` seems well tested by
>> current tests.
>>
>> Performance refactoring, `noreg-perf`.
>
> Eirik Bjørsnøs has updated the pull request incrementally with seven
> additional commits since the last revision:
>
> - Add comment explaining why method is fast
> - Give up on vertical alignment
> - Shorten fast paths by using a ternary operator
> - Boolean locals for optionality allows tighter code
> - Fast paths can be moved to top of method
> - Add test cases for URLs with empty (blank, but non null) query and ref
> components
> - Fix regression caused by treating blank query or ref as not present
Tier 1-2 is clear on d19db35a4bb.
I'm able to reproduce the following improvements on Linux x64:
| `master` (5e588085e9c) | PR (d19db35a4bb) | Improvement |
|------------------------:|-----------------:|------------:|
| 26.413 | 11.633 | 2.271 |
| 44.212 | 15.910 | 2.779 |
| 44.562 | 16.353 | 2.725 |
| 53.224 | 21.499 | 2.476 |
| 33.721 | 15.291 | 2.205 |
| 55.265 | 19.915 | 2.775 |
| 55.399 | 20.504 | 2.702 |
| 64.569 | 24.313 | 2.656 |
Dropped some minor remarks, but I see no showstoppers.
4a11d5ed25f yields the following improvement on my local `linux-x64`:
(auth) (query) (ref) Mode Cnt Score Error Units PR-Score PR-Score%
false false false avgt 15 26.413 ± 0.418 ns/op 13.662 1.933
false false true avgt 15 44.212 ± 0.994 ns/op 15.706 2.815
false true false avgt 15 44.562 ± 1.009 ns/op 16.760 2.659
false true true avgt 15 53.224 ± 0.896 ns/op 19.778 2.691
true false false avgt 15 33.721 ± 1.173 ns/op 15.567 2.166
true false true avgt 15 55.265 ± 1.275 ns/op 19.479 2.837
true true false avgt 15 55.399 ± 1.102 ns/op 19.418 2.853
true true true avgt 15 64.569 ± 1.295 ns/op 23.059 2.800
@eirbjo, thanks so much for the contribution. AFAICT, you can remove the
`Draft` label. In the meantime, I'm running Tier 1-2 against 4a11d5ed25f.
src/java.base/share/classes/java/net/URLStreamHandler.java line 483:
> 481: * @return a string representation of the {@code URL} argument.
> 482: */
> 483: protected String toExternalForm(URL u) {
I've tried to outsmart @eirbjo, and failed beautifully:
1. In the original method (5e588085e9c), replaced the conditional string
concatenation with an explicit `StringBuilder` that is populated using `if`
blocks:
var sb = new StringBuilder(u.getProtocol()).append(':');
var authority = u.getAuthority();
if (authority != null && !authority.isEmpty) {
sb.append("//").append(authority);
}
// ...
This hardly yielded any benefit. I concluded that fast path is indeed
helping.
2. In the most recent commit (4a11d5ed25f), in the slow path (i.e., not
specialized), replaced locals with `StringBuilder`+`if` stunt I've tried to
pull above. This resulted in performance regression. I concluded that indified
string concatenation, where lengths of the parts to be concatenated are known
in advance, does deliver.
I suggest adding a comment block to the beginning of this method explaining why
it is fast. Inspirational material:
// This method exercises following optimizations:
//
// 1. Fast paths are introduced for common cases
//
// 2. All string concatenations are intentionally performed without any
// conditionals to allow indified string concatenation to kick in.
In particular, I find the 2nd warning important. IMHO, it is very likely
someone can accidentally update an existing `foo + bar` with `foo + (x ? "a" :
"b") + bar`, and we're back to square one.
src/java.base/share/classes/java/net/URLStreamHandler.java line 486:
> 484: // Optionality, subtly different for authority
> 485: var emptyAuth = u.getAuthority() == null ||
> u.getAuthority().isEmpty();
> 486: var emptyPath = u.getPath() == null;
*Nit:* Vertical alignment is difficult to get right, even when you trying to
get it right, like this one. I don't think it is even sustainable in the long
term — see the hidden crying kitten patterns in `vmIntrinsics.hpp`. I kindly
urge you to reconsider your decision. That said, I will respect your conclusion
regardless.
src/java.base/share/classes/java/net/URLStreamHandler.java line 488:
> 486:
> 487: // Local variables used by concatenation
> 488: String protocol = u.getProtocol(), authSeparator, authority,
> path, querySeparator, query, refSeparator, ref;
*Nit:* You may consider declaring the variables the first time they're
assigned. That would decrease the cognitive load on specialized concatenations
where only a subset of variables are needed. That said, current form is okay
too.
src/java.base/share/classes/java/net/URLStreamHandler.java line 494:
> 492: String p = u.getPath();
> 493: String q = u.getQuery();
> 494: String r = u.getRef();
Suggestion:
String a = u.getAuthority();
String p = u.getPath();
String q = u.getQuery();
String r = u.getRef();
src/java.base/share/classes/java/net/URLStreamHandler.java line 496:
> 494: } else {
> 495: return u.getProtocol() + "://" + u.getAuthority() + path;
> 496: }
You can further shorten this snippet:
Suggestion:
return emptyAuth
? (u.getProtocol() + ":" + path)
: (u.getProtocol() + "://" + u.getAuthority() + path);
-------------
PR Review: https://git.openjdk.org/jdk/pull/30151#pullrequestreview-3927627147
PR Review: https://git.openjdk.org/jdk/pull/30151#pullrequestreview-3934936911
PR Review Comment: https://git.openjdk.org/jdk/pull/30151#discussion_r2923336961
PR Review Comment: https://git.openjdk.org/jdk/pull/30151#discussion_r2923217419
PR Review Comment: https://git.openjdk.org/jdk/pull/30151#discussion_r2916602279
PR Review Comment: https://git.openjdk.org/jdk/pull/30151#discussion_r2916583233
PR Review Comment: https://git.openjdk.org/jdk/pull/30151#discussion_r2923189924