gnodet commented on code in PR #1937:
URL: https://github.com/apache/maven-resolver/pull/1937#discussion_r3484312810


##########
maven-resolver-util/src/main/java/org/eclipse/aether/util/concurrency/ConcurrentWeakCache.java:
##########
@@ -0,0 +1,246 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.eclipse.aether.util.concurrency;
+
+import java.lang.ref.ReferenceQueue;
+import java.lang.ref.WeakReference;
+import java.util.concurrent.ConcurrentHashMap;
+
+/**
+ * A concurrent cache with weak keys and weak values, inspired by maven-impl's 
Cache class
+ * but stripped down and optimized for hot-path performance.
+ * <p>
+ * Design compared to the full Cache:
+ * <ul>
+ *   <li><b>Zero allocation on get()</b> — uses a ThreadLocal reusable lookup 
key
+ *       instead of allocating a new wrapper on every read</li>
+ *   <li><b>O(1) cleanup per stale entry</b> — uses identity-based removal from
+ *       the ReferenceQueue instead of a full {@code entrySet().removeIf()} 
scan</li>
+ *   <li><b>Lock-free reads</b> — {@link ConcurrentHashMap#get} is a volatile 
read
+ *       with no lock acquisition</li>
+ *   <li><b>Lock-striped writes</b> — ConcurrentHashMap uses fine-grained 
locking</li>
+ * </ul>
+ * <p>
+ * Keys are held via {@link WeakReference}: when a key is no longer strongly 
referenced
+ * elsewhere, its entry becomes eligible for garbage collection. Values are 
also held
+ * via WeakReference. Stale entries are cleaned up lazily on {@link #put}.
+ *
+ * @param <K> the type of keys
+ * @param <V> the type of values
+ * @since 2.0.19
+ */
+public class ConcurrentWeakCache<K, V> {
+
+    private final ConcurrentHashMap<Object, WeakReference<V>> map;
+    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
+
+    @SuppressWarnings("rawtypes")
+    private static final ThreadLocal<LookupKey> LOOKUP_KEY = new 
ThreadLocal<LookupKey>() {
+        @Override
+        protected LookupKey initialValue() {
+            return new LookupKey();
+        }
+    };
+
+    /**
+     * Creates a new cache with default initial capacity.
+     */
+    public ConcurrentWeakCache() {
+        this.map = new ConcurrentHashMap<>();
+    }
+
+    /**
+     * Creates a new cache with the given initial capacity.
+     *
+     * @param initialCapacity the initial capacity
+     */
+    public ConcurrentWeakCache(int initialCapacity) {
+        this.map = new ConcurrentHashMap<>(initialCapacity);
+    }
+
+    /**
+     * Returns the value for the given key, or {@code null} if the key is not 
present
+     * or the value has been garbage collected.
+     * <p>
+     * This method is lock-free and allocates no objects.
+     *
+     * @param key the key to look up
+     * @return the cached value, or {@code null}
+     */
+    @SuppressWarnings("unchecked")
+    public V get(K key) {
+        LookupKey<K> lookupKey = LOOKUP_KEY.get();
+        lookupKey.set(key);
+        try {
+            WeakReference<V> ref = map.get(lookupKey);
+            return ref != null ? ref.get() : null;
+        } finally {
+            lookupKey.clear();
+        }
+    }
+
+    /**
+     * Stores a key-value pair in the cache. Both key and value are held via 
weak references.
+     * <p>
+     * Also performs lazy cleanup of entries whose keys have been garbage 
collected.
+     *
+     * @param key   the key
+     * @param value the value
+     */
+    public void put(K key, V value) {
+        expungeStaleEntries();
+        map.put(new WeakKey<>(key, queue), new WeakReference<>(value));
+    }
+
+    /**
+     * If the key is not already present (or its value has been GC'd), stores 
the key-value pair
+     * and returns the given value. If the key is already present with a live 
value, returns the
+     * existing value without storing. This ensures concurrent callers for the 
same key always
+     * receive the same instance.
+     * <p>
+     * Also performs lazy cleanup of entries whose keys have been garbage 
collected.
+     *
+     * @param key   the key
+     * @param value the value to store if absent
+     * @return the existing value if present, or the given value if newly 
stored
+     */
+    public V putIfAbsent(K key, V value) {
+        // Fast path: lock-free lookup, zero allocation (ThreadLocal LookupKey)
+        V existing = get(key);
+        if (existing != null) {
+            return existing;
+        }
+        // Slow path: allocate WeakKey + WeakReference and insert
+        expungeStaleEntries();
+        WeakReference<V> newRef = new WeakReference<>(value);
+        WeakReference<V> existingRef = map.putIfAbsent(new WeakKey<>(key, 
queue), newRef);
+        if (existingRef != null) {
+            V existingValue = existingRef.get();
+            if (existingValue != null) {
+                return existingValue;
+            }
+            // Existing entry's value was GC'd — overwrite with new value
+            map.put(new WeakKey<>(key, queue), newRef);
+        }
+        return value;
+    }

Review Comment:
   Good catch. Fixed in 7afa928 — replaced the non-atomic `putIfAbsent()` + 
`put()` two-step with a single `ConcurrentHashMap.merge()` call. The merge 
lambda atomically decides to keep the existing live value or replace a GC'd 
one, so concurrent callers for the same key are now guaranteed to receive the 
same instance. Benchmark confirms no perf regression (55.2M ops/sec vs 51.6M 
before — slightly faster since it's one CHM operation instead of two).



##########
maven-resolver-util/src/main/java/org/eclipse/aether/util/version/GenericVersionScheme.java:
##########
@@ -54,102 +50,18 @@
  */
 public class GenericVersionScheme extends VersionSchemeSupport {
 
-    // Using WeakHashMap wrapped in synchronizedMap for thread safety and 
memory-sensitive caching
-    private final Map<String, GenericVersion> versionCache = 
Collections.synchronizedMap(new WeakHashMap<>());
-
-    // Cache statistics
-    private final AtomicLong cacheHits = new AtomicLong(0);
-    private final AtomicLong cacheMisses = new AtomicLong(0);
-    private final AtomicLong totalRequests = new AtomicLong(0);
-
-    // Static statistics across all instances
-    private static final AtomicLong GLOBAL_CACHE_HITS = new AtomicLong(0);
-    private static final AtomicLong GLOBAL_CACHE_MISSES = new AtomicLong(0);
-    private static final AtomicLong GLOBAL_TOTAL_REQUESTS = new AtomicLong(0);
-    private static final AtomicLong INSTANCE_COUNT = new AtomicLong(0);
-
-    static {
-        // Register shutdown hook to print statistics if enabled
-        if (isStatisticsEnabled()) {
-            Runtime.getRuntime().addShutdownHook(new 
Thread(GenericVersionScheme::printGlobalStatistics));
-        }
-    }
-
-    public GenericVersionScheme() {
-        INSTANCE_COUNT.incrementAndGet();
-    }
-
-    /**
-     * Checks if version scheme cache statistics should be printed.
-     * This checks both the system property and the configuration property.
-     */
-    private static boolean isStatisticsEnabled() {
-        // Check system property first (for backwards compatibility and ease 
of use)
-        String sysProp = 
System.getProperty(ConfigurationProperties.VERSION_SCHEME_CACHE_DEBUG);
-        if (sysProp != null) {
-            return Boolean.parseBoolean(sysProp);
-        }
-
-        // Default to false if not configured
-        return ConfigurationProperties.DEFAULT_VERSION_SCHEME_CACHE_DEBUG;
-    }
+    // Concurrent cache with weak keys and weak values: lock-free reads 
(volatile read, no lock
+    // acquisition via ConcurrentHashMap), lock-striped writes, zero 
allocation on get() via
+    // ThreadLocal lookup key, weak references allow GC under memory pressure.
+    private final ConcurrentWeakCache<String, GenericVersion> versionCache = 
new ConcurrentWeakCache<>();

Review Comment:
   Intentional. The cache statistics (`AtomicLong` increments on every `get()`, 
`getCacheStatistics()` method, `VERSION_SCHEME_CACHE_DEBUG` flag) were part of 
the synchronized implementation that caused the 2× regression. This PR's goal 
is to eliminate contention on these hot paths — adding atomic counter overhead 
back would partially undo that. The method lived on the concrete 
`GenericVersionScheme` class (not on the `VersionScheme` interface), and was 
added in 2.0.10 as a debugging aid, not as a user-facing API.



##########
maven-resolver-util/src/main/java/org/eclipse/aether/util/graph/transformer/PathConflictResolver.java:
##########
@@ -635,9 +642,12 @@ private List<Path> addChildren(List<DependencyNode> 
children) throws RepositoryE
             ArrayList<Path> added = new ArrayList<>(children.size());
             for (DependencyNode child : children) {
                 String childConflictId = this.state.conflictIds.get(child);
-                Set<String> conflictIdsSinceRoot = new 
HashSet<>(this.conflictIdsSinceRoot);
-                boolean cycle = !conflictIdsSinceRoot.add(childConflictId);
-                Path c = new Path(this.state, child, childConflictId, 
conflictIdsSinceRoot, this);
+                // Detect cycles by walking up the parent chain instead of 
copying a HashSet per child.
+                // This trades O(depth) per check for eliminating the dominant 
memory consumer:
+                // previously, each Path copied its parent's entire 
conflict-ID set, creating millions
+                // of HashMap.Node objects for large graphs.
+                boolean cycle = hasConflictIdOnPathToRoot(childConflictId);

Review Comment:
   Fixed in b69da10 — updated the Javadoc to describe the parent-chain walk via 
`hasConflictIdOnPathToRoot()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to