Jinwoo Hwang created GEODE-10520:
------------------------------------

             Summary: Security: Eliminate DirectBuffer Access to sun.nio.ch 
Internal Package
                 Key: GEODE-10520
                 URL: https://issues.apache.org/jira/browse/GEODE-10520
             Project: Geode
          Issue Type: Improvement
            Reporter: Jinwoo Hwang
            Assignee: Jinwoo Hwang
             Fix For: 2.1.0


h2. Summary

UnsafeThreadLocal currently uses reflection to access private internals of 
{{java.lang.ThreadLocal}} and {{{}java.lang.ThreadLocalMap{}}}, requiring the 
JVM flag {{{}--add-opens=java.base/java.lang=ALL-UNNAMED{}}}. This creates 
maintenance burden, security concerns, and compatibility risks with future Java 
versions.
h2. Description
h3. Current Implementation

The {{UnsafeThreadLocal}} class (located in 
{{{}geode-core/src/main/java/org/apache/geode/distributed/internal/deadlock/UnsafeThreadLocal.java{}}})
 extends {{ThreadLocal<T>}} and provides a {{get(Thread)}} method that allows 
reading ThreadLocal values from arbitrary threads - a capability needed for 
deadlock detection.

The current implementation uses reflection to:
 # Access the private {{getMap(Thread)}} method of ThreadLocal
 # Access the private {{getEntry(ThreadLocal)}} method of ThreadLocalMap
 # Access the private {{value}} field of ThreadLocalMap.Entry

*Code excerpt showing reflection usage:*
{code:java}
private static Object get(ThreadLocal threadLocal, Thread thread) {
  try {
    Object threadLocalMap = invokePrivate(threadLocal, "getMap", 
        new Class[] {Thread.class}, new Object[] {thread});
    
    if (threadLocalMap != null) {
      Object entry = invokePrivate(threadLocalMap, "getEntry", 
          new Class[] {ThreadLocal.class}, new Object[] {threadLocal});
      if (entry != null) {
        return getPrivate(entry, "value");
      }
    }
    return null;
  } catch (Exception e) {
    throw new RuntimeException("Unable to extract thread local", e);
  }
}
{code}
h3. Required JVM Flags

To enable this reflection access, Geode requires the following JVM flag 
(defined in {{{}MemberJvmOptions.java{}}}):
{code:java}
private static final String JAVA_LANG_OPEN = 
    "--add-opens=java.base/java.lang=ALL-UNNAMED";
{code}
This flag must be passed to every Geode member JVM, adding operational 
complexity.
h3. Usage Context

UnsafeThreadLocal is used in three key areas for deadlock detection:
 # *DLockService* ({{{}geode-core/.../locks/DLockService.java{}}})
 ** Tracks what distributed lock a thread is blocked on
 ** Used by: {{private final UnsafeThreadLocal<Object> blockedOn}}
 # *MessageDependencyMonitor* 
({{{}geode-core/.../deadlock/MessageDependencyMonitor.java{}}})
 ** Tracks ReplyProcessor21 instances threads are waiting for
 ** Tracks MessageWithReply instances being processed
 ** Used by: {{private final UnsafeThreadLocal<ReplyProcessor21> 
waitingProcessors}}
 ** Used by: {{private final UnsafeThreadLocal<MessageWithReply> 
processingMessages}}
 # *DLockDependencyMonitor* 
({{{}geode-core/.../deadlock/DLockDependencyMonitor.java{}}})
 ** Queries blockedOn status from DLockService for dependency graph construction

h2. Problems with Current Approach
h3. 1. Java Module System Violations

*Severity: High*

The reflection-based approach directly violates the Java Platform Module System 
(JPMS) encapsulation:
 * Accesses internal implementation details of {{java.lang}} package
 * Bypasses the module system's access controls
 * Requires explicit {{--add-opens}} directive to override module boundaries

{*}Risk{*}: Future Java versions may further restrict or remove the ability to 
open internal packages, even with command-line flags.
h3. 2. Maintenance and Fragility

*Severity: High*

The implementation is brittle and depends on:
 * Internal class names remaining unchanged (ThreadLocalMap, Entry)
 * Internal method signatures staying stable (getMap, getEntry)
 * Internal field names not changing (value)
 * Class hierarchy not being refactored

Any changes to ThreadLocal internals in future JDK releases could break Geode 
without warning, as these are non-public APIs with no stability guarantees.
h3. 3. Security Concerns

*Severity: Medium*

Using {{{}--add-opens=java.base/java.lang=ALL-UNNAMED{}}}:
 * Opens the entire {{java.lang}} package to all unnamed modules
 * Grants broader access than necessary (only need ThreadLocal access)
 * Weakens Java's security model and encapsulation guarantees
 * May trigger security audit flags in enterprise environments

h3. 4. Operational Complexity

*Severity: Medium*

Current deployment requirements:
 * JVM flags must be correctly configured on all Geode servers
 * Different flags needed for Java 8 vs. Java 11+ deployments
 * Documentation and deployment scripts must maintain flag lists
 * Missing flags cause runtime failures that are difficult to diagnose

>From {{{}MemberJvmOptions.java{}}}:
{code:java}
static final List<String> JAVA_11_OPTIONS = Arrays.asList(
    COM_SUN_JMX_REMOTE_SECURITY_EXPORT,
    SUN_NIO_CH_EXPORT,
    COM_SUN_MANAGEMENT_INTERNAL_OPEN,
    JAVA_LANG_OPEN,  // ← Required only for UnsafeThreadLocal
    JAVA_NIO_OPEN);
{code}
h3. 5. Performance Overhead

*Severity: Low*

Reflection calls have performance costs:
 * Method lookup via reflection on each invocation
 * Access permission checks
 * Potential JIT optimization barriers
 * Exception handling overhead

While not critical for deadlock detection (which is infrequent), it's still 
unnecessary overhead.
h3. 6. Testing and CI/CD Complexity

*Severity: Low*

Test infrastructure must:
 * Configure JVM flags correctly for all test executions
 * Maintain separate flag lists for different Java versions
 * Handle flag propagation in parallel test runs
 * Document requirements for external contributors

h2. Business Impact
h3. Reduced Operational Risk

Eliminating the need for {{--add-opens}} flags:
 * Simplifies deployment procedures
 * Reduces configuration errors in production
 * Removes a potential point of failure during Java upgrades
 * Improves compatibility with container orchestration platforms

h3. Better Java Version Compatibility

A non-reflection implementation:
 * Works seamlessly across all Java versions (8, 11, 17, 21+)
 * Doesn't depend on JDK internal implementation details
 * Reduces risk during major Java version upgrades
 * Aligns with Java's "code to interfaces, not implementations" principle

h3. Enhanced Security Posture

Removing reflection-based access:
 * Eliminates the need to weaken module system protections
 * Reduces attack surface by not opening internal packages
 * Simplifies security audits and compliance reviews
 * Follows principle of least privilege

h3. Improved Developer Experience

A cleaner implementation:
 * Easier to understand and maintain
 * No special JVM configuration needed for development
 * Simpler IDE setup and debugging
 * Reduces onboarding friction for new contributors

h3. Long-Term Maintainability

Moving away from internal APIs:
 * Protects against breaking changes in future JDK releases
 * Reduces technical debt
 * Makes the codebase more sustainable
 * Demonstrates modern Java best practices

h2. Affected Components
h3. Files Modified in This Refactoring
 * 
{{geode-core/src/main/java/org/apache/geode/distributed/internal/deadlock/UnsafeThreadLocal.java}}
 * 
{{geode-gfsh/src/main/java/org/apache/geode/management/internal/cli/commands/MemberJvmOptions.java}}
 

h3. Files Using UnsafeThreadLocal (Not Modified)

These files use {{UnsafeThreadLocal}} but require no changes as the public API 
remains unchanged:
 * 
{{geode-core/src/main/java/org/apache/geode/distributed/internal/locks/DLockService.java}}
 - Uses {{UnsafeThreadLocal<Object> blockedOn}} for deadlock detection
 * 
{{geode-core/src/main/java/org/apache/geode/distributed/internal/deadlock/MessageDependencyMonitor.java}}
 - Uses two UnsafeThreadLocal instances for tracking message dependencies
 * 
{{geode-core/src/main/java/org/apache/geode/distributed/internal/deadlock/DLockDependencyMonitor.java}}
 - Queries UnsafeThreadLocal values for dependency graph construction

h3. Test Files (Executed, Not Modified)
 * 
{{geode-core/src/test/java/org/apache/geode/distributed/internal/deadlock/UnsafeThreadLocalJUnitTest.java}}
 - All tests pass with new implementation

h2. Scope
 * Refactoring UnsafeThreadLocal implementation to eliminate reflection
 * Updating MemberJvmOptions to remove JAVA_LANG_OPEN flag
 * Ensuring all existing tests pass with new implementation
 * Maintaining exact same public API and behavior
 * Supporting all currently supported Java versions (17, 21)

h2. Testing Requirements
h3. Functional Testing
 * All existing UnsafeThreadLocal unit tests must pass
 * Deadlock detection integration tests must pass
 * DLock-related distributed tests must pass
 * Message processing tests must pass

h3. Cross-Version Testing
 * Verify functionality on Java 17, and 21
 * Ensure no regression in deadlock detection accuracy
 * Validate thread safety under concurrent access

h3. Performance Testing
 * Verify no performance regression in deadlock detection
 * Measure overhead reduction from removing reflection
 * Test under high concurrency scenarios

h3. Negative Testing
 * Verify proper cleanup when threads terminate
 * Test behavior with many ThreadLocal instances
 * Validate memory leak prevention

h2. Compatibility
h3. Backward Compatibility
 * Public API of UnsafeThreadLocal must remain unchanged
 * Behavior must be functionally equivalent to current implementation
 * No changes to deadlock detection output or timing

h3. Forward Compatibility
 * Implementation should work with future Java versions
 * Should not depend on version-specific features
 * Should gracefully handle JVM updates

h2. Priority Assessment

*Recommended Priority: Medium-High*
h3. Factors Supporting High Priority
 * Eliminates security concerns with module system violations
 * Improves long-term maintainability significantly
 * Reduces operational complexity
 * Removes technical debt that will only grow over time

h3. Factors Supporting Medium Priority
 * Current implementation works (with JVM flags)
 * Not causing immediate production issues
 * Other reflection usage in Geode also needs addressing
 * Requires careful testing of deadlock detection

h2. Acceptance Criteria
 # UnsafeThreadLocal no longer uses reflection to access ThreadLocal internals
 # No {{--add-opens=java.base/java.lang=ALL-UNNAMED}} flag required
 # All existing tests pass without modification
 # Deadlock detection functionality unchanged
 # Works on Java 11, 17, and 21 without special configuration
 # Performance is equal or better than current implementation
 # No memory leaks when threads are terminated
 # Code is well-documented explaining the new approach

h2. Related Issues
 * Related to any other tickets addressing reflection usage in Geode
 * May be part of larger "Java Module System Compliance" initiative



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to