gnodet commented on code in PR #1937: URL: https://github.com/apache/maven-resolver/pull/1937#discussion_r3484313557
########## maven-resolver-util/src/main/java/org/eclipse/aether/util/concurrency/ConcurrentWeakCache.java: ########## @@ -0,0 +1,246 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. + */ +package org.eclipse.aether.util.concurrency; + +import java.lang.ref.ReferenceQueue; +import java.lang.ref.WeakReference; +import java.util.concurrent.ConcurrentHashMap; + +/** + * A concurrent cache with weak keys and weak values, inspired by maven-impl's Cache class + * but stripped down and optimized for hot-path performance. + * <p> + * Design compared to the full Cache: + * <ul> + * <li><b>Zero allocation on get()</b> — uses a ThreadLocal reusable lookup key + * instead of allocating a new wrapper on every read</li> + * <li><b>O(1) cleanup per stale entry</b> — uses identity-based removal from + * the ReferenceQueue instead of a full {@code entrySet().removeIf()} scan</li> + * <li><b>Lock-free reads</b> — {@link ConcurrentHashMap#get} is a volatile read + * with no lock acquisition</li> + * <li><b>Lock-striped writes</b> — ConcurrentHashMap uses fine-grained locking</li> + * </ul> + * <p> + * Keys are held via {@link WeakReference}: when a key is no longer strongly referenced + * elsewhere, its entry becomes eligible for garbage collection. Values are also held + * via WeakReference. Stale entries are cleaned up lazily on {@link #put}. + * + * @param <K> the type of keys + * @param <V> the type of values + * @since 2.0.19 + */ +public class ConcurrentWeakCache<K, V> { + + private final ConcurrentHashMap<Object, WeakReference<V>> map; + private final ReferenceQueue<Object> queue = new ReferenceQueue<>(); + + @SuppressWarnings("rawtypes") + private static final ThreadLocal<LookupKey> LOOKUP_KEY = new ThreadLocal<LookupKey>() { + @Override + protected LookupKey initialValue() { + return new LookupKey(); + } + }; + + /** + * Creates a new cache with default initial capacity. + */ + public ConcurrentWeakCache() { + this.map = new ConcurrentHashMap<>(); + } + + /** + * Creates a new cache with the given initial capacity. + * + * @param initialCapacity the initial capacity + */ + public ConcurrentWeakCache(int initialCapacity) { + this.map = new ConcurrentHashMap<>(initialCapacity); + } + + /** + * Returns the value for the given key, or {@code null} if the key is not present + * or the value has been garbage collected. + * <p> + * This method is lock-free and allocates no objects. + * + * @param key the key to look up + * @return the cached value, or {@code null} + */ + @SuppressWarnings("unchecked") + public V get(K key) { + LookupKey<K> lookupKey = LOOKUP_KEY.get(); + lookupKey.set(key); + try { + WeakReference<V> ref = map.get(lookupKey); + return ref != null ? ref.get() : null; + } finally { + lookupKey.clear(); + } + } + + /** + * Stores a key-value pair in the cache. Both key and value are held via weak references. + * <p> + * Also performs lazy cleanup of entries whose keys have been garbage collected. + * + * @param key the key + * @param value the value + */ + public void put(K key, V value) { + expungeStaleEntries(); + map.put(new WeakKey<>(key, queue), new WeakReference<>(value)); + } + + /** + * If the key is not already present (or its value has been GC'd), stores the key-value pair + * and returns the given value. If the key is already present with a live value, returns the + * existing value without storing. This ensures concurrent callers for the same key always + * receive the same instance. + * <p> + * Also performs lazy cleanup of entries whose keys have been garbage collected. + * + * @param key the key + * @param value the value to store if absent + * @return the existing value if present, or the given value if newly stored + */ + public V putIfAbsent(K key, V value) { + // Fast path: lock-free lookup, zero allocation (ThreadLocal LookupKey) + V existing = get(key); + if (existing != null) { + return existing; + } + // Slow path: allocate WeakKey + WeakReference and insert + expungeStaleEntries(); + WeakReference<V> newRef = new WeakReference<>(value); + WeakReference<V> existingRef = map.putIfAbsent(new WeakKey<>(key, queue), newRef); + if (existingRef != null) { + V existingValue = existingRef.get(); + if (existingValue != null) { + return existingValue; + } + // Existing entry's value was GC'd — overwrite with new value + map.put(new WeakKey<>(key, queue), newRef); + } + return value; + } Review Comment: Good catch. Fixed in 7afa928 — replaced the non-atomic `putIfAbsent()` + `put()` two-step with a single `ConcurrentHashMap.merge()` call. The merge lambda atomically decides to keep the existing live value or replace a GC'd one, so concurrent callers for the same key are now guaranteed to receive the same instance. Benchmark confirms no perf regression (55.2M ops/sec vs 51.6M before — slightly faster since it's one CHM operation instead of two). ########## maven-resolver-util/src/main/java/org/eclipse/aether/util/version/GenericVersionScheme.java: ########## @@ -54,102 +50,18 @@ */ public class GenericVersionScheme extends VersionSchemeSupport { - // Using WeakHashMap wrapped in synchronizedMap for thread safety and memory-sensitive caching - private final Map<String, GenericVersion> versionCache = Collections.synchronizedMap(new WeakHashMap<>()); - - // Cache statistics - private final AtomicLong cacheHits = new AtomicLong(0); - private final AtomicLong cacheMisses = new AtomicLong(0); - private final AtomicLong totalRequests = new AtomicLong(0); - - // Static statistics across all instances - private static final AtomicLong GLOBAL_CACHE_HITS = new AtomicLong(0); - private static final AtomicLong GLOBAL_CACHE_MISSES = new AtomicLong(0); - private static final AtomicLong GLOBAL_TOTAL_REQUESTS = new AtomicLong(0); - private static final AtomicLong INSTANCE_COUNT = new AtomicLong(0); - - static { - // Register shutdown hook to print statistics if enabled - if (isStatisticsEnabled()) { - Runtime.getRuntime().addShutdownHook(new Thread(GenericVersionScheme::printGlobalStatistics)); - } - } - - public GenericVersionScheme() { - INSTANCE_COUNT.incrementAndGet(); - } - - /** - * Checks if version scheme cache statistics should be printed. - * This checks both the system property and the configuration property. - */ - private static boolean isStatisticsEnabled() { - // Check system property first (for backwards compatibility and ease of use) - String sysProp = System.getProperty(ConfigurationProperties.VERSION_SCHEME_CACHE_DEBUG); - if (sysProp != null) { - return Boolean.parseBoolean(sysProp); - } - - // Default to false if not configured - return ConfigurationProperties.DEFAULT_VERSION_SCHEME_CACHE_DEBUG; - } + // Concurrent cache with weak keys and weak values: lock-free reads (volatile read, no lock + // acquisition via ConcurrentHashMap), lock-striped writes, zero allocation on get() via + // ThreadLocal lookup key, weak references allow GC under memory pressure. + private final ConcurrentWeakCache<String, GenericVersion> versionCache = new ConcurrentWeakCache<>(); Review Comment: Intentional. The cache statistics (`AtomicLong` increments on every `get()`, `getCacheStatistics()` method, `VERSION_SCHEME_CACHE_DEBUG` flag) were part of the synchronized implementation that caused the 2× regression. This PR's goal is to eliminate contention on these hot paths — adding atomic counter overhead back would partially undo that. The method lived on the concrete `GenericVersionScheme` class (not on the `VersionScheme` interface), and was added in 2.0.10 as a debugging aid, not as a user-facing API. ########## maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/transformer/PathConflictResolver.java: ########## @@ -635,9 +642,12 @@ private List<Path> addChildren(List<DependencyNode> children) throws RepositoryE ArrayList<Path> added = new ArrayList<>(children.size()); for (DependencyNode child : children) { String childConflictId = this.state.conflictIds.get(child); - Set<String> conflictIdsSinceRoot = new HashSet<>(this.conflictIdsSinceRoot); - boolean cycle = !conflictIdsSinceRoot.add(childConflictId); - Path c = new Path(this.state, child, childConflictId, conflictIdsSinceRoot, this); + // Detect cycles by walking up the parent chain instead of copying a HashSet per child. + // This trades O(depth) per check for eliminating the dominant memory consumer: + // previously, each Path copied its parent's entire conflict-ID set, creating millions + // of HashMap.Node objects for large graphs. + boolean cycle = hasConflictIdOnPathToRoot(childConflictId); Review Comment: Fixed in b69da10 — updated the Javadoc to describe the parent-chain walk via `hasConflictIdOnPathToRoot()`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
