No problem. Good luck to you Bram! Looks like a fun mystery.

On Tue, Mar 24, 2026 at 11:34 AM Bram Luyten <[email protected]> wrote:
>
> Hi David,
>
> Thank you for the detailed response. I owe you an apology: after
> re-examining our data based on your feedback, the wall-clock profiling
> led us to an incorrect attribution. Sorry for the noise.
>
> You were right to question the wall-clock numbers for background
> threads. When we re-checked with CPU profiling (async-profiler -e cpu),
> AdaptiveExecutionStrategy.produce() shows exactly 0 CPU samples on the
> Solr 10 branch. The selector thread is idle, not busy-polling.
> Wall-clock profiling inflated it because it samples all threads
> regardless of state. Total CPU samples are nearly identical between
> branches (17,519 vs 17,119), with the same distribution.
>
> To answer your question: we create exactly one CoreContainer for the
> entire test suite, held as a static singleton with 6 cores. Between
> tests we clear data via deleteByQuery + commit, but the container stays
> alive for the full JVM lifetime. So the "lots of CoreContainers"
> scenario does not apply here.
>
> Given identical CPU profiles and zero Jetty CPU samples, the Solr path
> is almost certainly not our bottleneck. We will look elsewhere. I don't
> think the SolrStartup benchmark would be productive at this point.
>
> Again, apologies for the false alarm, and thank you for steering us in
> the right direction.
>
> Best regards,
> Bram Luyten
>
> On Tue, Mar 24, 2026 at 2:43 PM David Smiley <[email protected]> wrote:
> >
> > Hello Bram,
> >
> > Some of what you are sharing confuses me. I don't think sharing the
> > wall-clock time is pertinent for background threads -- and I assume
> > those Jetty HttpClients are in the background doing nothing. Yes,
> > CoreContainer creates a Jetty HttpClient that is unused in embedded
> > mode. Curious: are you creating lots of CoreContainers (perhaps
> > indirectly via creating EmbeddedSolrServer)?
> > Maybe we have a regression there. I suspect a test environment would
> > be doing this, creating a CoreContainer for each test, basically.
> > Solr's tests do this too! And a slowdown as big as you show sounds
> > like something we'd notice... most likely. On the other hand, if your
> > CI/tests create very few CoreContainers and there's all this slowdown
> > you report, then CoreContainer startup is mostly irrelevant.
> >
> > We do have a benchmark that should capture a slowdown in this area --
> > https://github.com/apache/solr/blob/9c911e7337cd1026accc1a825e26906039982328/solr/benchmark/src/java/org/apache/solr/bench/lifecycle/SolrStartup.java
> > (scope is a bit larger but good enough) -- but we don't have
> > continuous benchmarking over releases to make relative comparisons.
> > We've been talking about that, but the recent discussions are unlikely
> > to support a way to do this for embedded Solr. I've been working on
> > this benchmark code lately as well. *Anyway*, I recommend that you try
> > this benchmark, starting with its great README, which mostly documents
> > JMH itself. If you do that and find some curious/suspicious things,
> > I'd love to hear more!
> >
> > On Tue, Mar 24, 2026 at 3:51 AM Bram Luyten <[email protected]> wrote:
> > >
> > > Hi all,
> > >
> > > Disclaimer: I am a DSpace developer, not a Solr/Jetty internals
> > > expert. Much of the profiling and analysis below was done with heavy
> > > assistance from Claude. I'm sharing this because the data seems
> > > significant, but I may be misinterpreting some of it. Corrections
> > > and guidance are very welcome.
> > >
> > >
> > > CONTEXT
> > > ---------------
> > >
> > > We are upgrading DSpace (open-source repository software) from
> > > Spring Boot 3 / Solr 8 to Spring Boot 4 / Solr 10.
> > > Our integration test suite uses embedded Solr via solr-core as a
> > > test dependency (EmbeddedSolrServer style, no HTTP traffic --
> > > everything is in-process in a single JVM).
> > >
> > > After the upgrade, our IT suite went from ~31 minutes to ~2 hours
> > > in CI. We spent considerable time profiling and eliminating other
> > > causes (Hibernate 7, Spring 7, H2 database, GC, lock contention,
> > > caching). Wall-clock profiling with async-profiler ultimately
> > > pointed to embedded Solr as the primary bottleneck.
> > >
> > > Note: we previously reported the Solr 10 POM issue with missing
> > > Jackson 2 dependency versions (solr-core, solr-solrj, solr-api).
> > > We have the workaround in place (explicit dependency declarations),
> > > so embedded Solr 10 has a complete classpath.
> > >
> > >
> > > THE PROBLEM
> > > ----------------------
> > >
> > > Wall-clock profiling (async-profiler -e wall) of the same test class
> > > (DiscoveryRestControllerIT, 83 tests) on both branches shows:
> > >
> > >   Component     Main (Solr 8)   SB4 (Solr 10)   Difference
> > >   ----------------------------------------------------------------
> > >   Solr total    3.6s            11.5s           +7.9s
> > >   Hibernate     0.2s            0.2s            0.0s
> > >   H2 Database   0.1s            0.1s            0.0s
> > >   Spring        0.1s            0.1s            0.0s
> > >   Test total    68.4s           84.3s           +15.9s
> > >
> > > Solr accounts for 50% of the total wall-clock difference (7.9s out
> > > of 15.9s). Hibernate, H2, and Spring are essentially unchanged.
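[A note on the wall-clock table above: wall mode samples every thread on every tick, whether it is running or blocked. This pure-JDK sketch (not from the DSpace codebase; the thread name is a placeholder) shows the distinction with ThreadMXBean, which is what later made the attribution suspect.]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class WallVsCpu {
    public static void main(String[] args) throws Exception {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();

        // Stand-in for an idle NIO selector: blocked, consuming no CPU.
        Thread idler = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // expected on shutdown
            }
        }, "idle-selector-stand-in");
        idler.setDaemon(true);
        idler.start();

        Thread.sleep(500); // give it time to block

        // The thread has existed for ~500 ms of wall-clock time, but its
        // CPU time is near zero. A wall-mode profiler samples it on every
        // tick anyway; a CPU-mode profiler records almost nothing for it.
        long cpuNanos = mx.getThreadCpuTime(idler.getId());
        System.out.println("idle thread CPU ns: " + cpuNanos);

        idler.interrupt();
    }
}
```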
> > >
> > >
> > > THE ROOT CAUSE
> > > ---------------------------
> > >
> > > Breaking down the Solr wall-clock time by operation:
> > >
> > >   Operation                                   Main         SB4
> > >   ---------------------------------------------------------------
> > >   Jetty EatWhatYouKill.produce()              2558 (58%)   --
> > >   Jetty AdaptiveExecutionStrategy.produce()   --           12786 (91%)
> > >   DirectUpdateHandler2.commit()               522 (12%)    707 (5%)
> > >   SpellChecker.newSearcher()                  119 (3%)     261 (2%)
> > >
> > > (Numbers are async-profiler wall-clock samples)
> > >
> > > The dominant operation is Jetty's NIO selector execution strategy:
> > >
> > > - Solr 8 / Jetty 9: EatWhatYouKill.produce(): 2558 samples (58%)
> > > - Solr 10 / Jetty 12: AdaptiveExecutionStrategy.produce(): 12786
> > >   samples (91%)
> > > - That is a 5x increase in wall-clock samples
> > >
> > > The full stack trace shows:
> > >
> > >   ThreadPoolExecutor
> > >     -> MDCAwareThreadPoolExecutor
> > >       -> ManagedSelector (Jetty NIO selector)
> > >         -> AdaptiveExecutionStrategy.produce()
> > >           -> AdaptiveExecutionStrategy.tryProduce()
> > >             -> AdaptiveExecutionStrategy.produceTask()
> > >               -> ... -> KQueue.poll (macOS NIO)
> > >
> > > This is the Jetty HTTP client's NIO event loop. Even though we use
> > > EmbeddedSolrServer (no HTTP traffic), Solr 10's CoreContainer
> > > appears to create an internal Jetty HTTP client (likely for
> > > inter-shard communication via HttpJettySolrClient). In embedded
> > > single-node mode, this client has no work to do, but its NIO
> > > selector thread still runs, and AdaptiveExecutionStrategy.produce()
> > > idles much less efficiently than Jetty 9's EatWhatYouKill did.
> > >
> > > On macOS this manifests as busy-polling in KQueue.poll. The impact
> > > may differ on Linux (epoll).
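[A profiler-free way to cross-check the busy-polling hypothesis above is a plain thread-dump scan. The sketch below is illustrative only (the class-name substrings come from the stack trace in this email, the class name is hypothetical); it would need to run inside the JVM under test, where the Jetty client threads exist.]

```java
import java.util.Map;

public class SelectorStateDump {
    public static void main(String[] args) {
        // Scan all live threads for frames from the Jetty selector classes
        // named in the stack trace and print each match's state and top frame.
        // Caveat: a thread blocked in the OS poll call (KQueue.poll on macOS,
        // EPoll.wait on Linux) still reports RUNNABLE, so compare repeated
        // dumps: an idle selector keeps the same poll top frame, while a
        // busy-polling one churns through the produce()/task frames.
        Map<Thread, StackTraceElement[]> stacks = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> entry : stacks.entrySet()) {
            StackTraceElement[] frames = entry.getValue();
            for (StackTraceElement frame : frames) {
                String cls = frame.getClassName();
                if (cls.contains("ManagedSelector")
                        || cls.contains("AdaptiveExecutionStrategy")) {
                    System.out.printf("%s [%s] top=%s%n",
                            entry.getKey().getName(),
                            entry.getKey().getState(),
                            frames.length > 0 ? frames[0] : "n/a");
                    break; // one line per matching thread is enough
                }
            }
        }
    }
}
```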
> > >
> > >
> > > PROFILING METHODOLOGY
> > > -----------------------------------------
> > >
> > > - Tool: async-profiler 4.3 (wall-clock mode, safepoint-free)
> > > - JDK: OpenJDK 21.0.9
> > > - Both branches use the same H2 2.4.240 test database
> > > - Both branches use the same test code and Solr schema/config
> > > - The only Solr-related difference is the Solr version
> > >   (8.11.4 vs 10.0.0)
> > > - Profiling was done on macOS (Apple Silicon), but the CI slowdown
> > >   (GitHub Actions, Ubuntu) shows the same pattern at larger scale
> > >
> > >
> > > WHAT WE RULED OUT
> > > ---------------------------------
> > >
> > > Before identifying the Solr/Jetty issue, we investigated and ruled
> > > out many other causes:
> > >
> > > - Hibernate 7 overhead: SQL query count is similar (fewer on SB4),
> > >   query execution time is <40ms total for 1400+ queries
> > > - H2 database: same version (2.4.240) on both branches, negligible
> > >   wall-clock difference
> > > - GC pauses: only +0.7s extra on SB4 (1.4% of total difference)
> > > - Lock contention: main actually has MORE lock contention than SB4
> > > - Hibernate session.clear(): tested with/without, no effect
> > > - JaCoCo coverage: tested with/without, no effect
> > > - Hibernate caching (L2, query cache): disabled both, no effect
> > > - Hibernate batch fetch size: tested, no effect
> > >
> > >
> > > QUESTIONS FOR THE SOLR TEAM
> > > --------------------------------------------------
> > >
> > > 1. Does embedded mode (EmbeddedSolrServer / CoreContainer without
> > >    an HTTP listener) need to create a Jetty HTTP client at all?
> > >    If the client is only for shard-to-shard communication, it
> > >    seems unnecessary in single-node embedded testing.
> > >
> > > 2. If the HTTP client is required, can its NIO selector / thread
> > >    pool be configured with minimal resources for embedded mode?
> > >    (e.g., fewer selector threads, smaller thread pool, or an
> > >    idle-friendly execution strategy)
> > >
> > > 3.
> > >    Is there a Solr configuration (solr.xml property, system
> > >    property, or CoreContainer API) that we can use from the
> > >    consuming application to reduce this overhead?
> > >
> > > 4. Is this specific to macOS (KQueue) or does it also affect
> > >    Linux (epoll)? Our CI runs on Ubuntu and shows a larger
> > >    slowdown (3.8x) than local macOS (1.28x), which could be
> > >    related.
> > >
> > >
> > > ENVIRONMENT
> > > -----------------------
> > >
> > > Solr: 10.0.0 (solr-core as test dependency for embedded server)
> > > Jetty: 12.0.x (pulled in transitively by Solr 10)
> > > JDK: 21
> > > OS: macOS (profiled), Ubuntu (CI where the 4x slowdown manifests)
> > > Project: DSpace (https://github.com/DSpace/DSpace)
> > > PR: https://github.com/DSpace/DSpace/pull/11810
> > >
> > > Happy to provide the full async-profiler flame graph files or
> > > additional profiling data if useful.
> > >
> > > Thanks,
> > > Bram Luyten, Atmire
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
