[ 
https://issues.apache.org/jira/browse/CASSANDRA-21156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18065771#comment-18065771
 ] 

Dipankar Achinta edited comment on CASSANDRA-21156 at 3/13/26 9:04 PM:
-----------------------------------------------------------------------

I investigated the reported behavior; it looks like a class 
initialization ordering issue.

—

Below are my observations:
 * During {*}_DatabaseDescriptor.loadConfig()_{*}, the deprecated key 
*_table_count_warn_threshold_* in `{_}cassandra.yaml`{_} triggers a config 
converter ({_}*TABLE_COUNT_THRESHOLD_TO_GUARDRAIL*{_}).
 * That converter calls 
{_}*SchemaConstants.getLocalAndReplicatedSystemTableNames()*{_}, which reads 
*_SystemKeyspace.TABLE_NAMES_*, a `{+}_static final ImmutableSet_{+}` and 
therefore not a compile-time constant.
 * This read triggers _*SystemKeyspace's*_ static initializer 
({_}*<clinit>*{_}) before _*DatabaseDescriptor*_ has finished loading. The 
initializer sees partially initialized state and fails with a 
_*NullPointerException*_, which the JVM wraps in an 
{_}*ExceptionInInitializerError*{_}.

 
+*According to 
[JLS|https://docs.oracle.com/javase/specs/jls/se7/html/jls-12.html]:*+
 
 * +_Section 12.4.1_+: A class is initialized on first use, for example when a 
static field that is not a constant variable is read. Reading a `static final` 
field that is a _*compile-time constant*_ (a primitive or `String` initialized 
with a constant expression) does +*not*+ trigger `<clinit>`, because the 
compiler inlines the value at the use site. Reading any other `static final` 
field (e.g. an `ImmutableSet`) *will* trigger `<clinit>`.
 * +_Section 12.4.2, Step 3_+: If the class is already being initialized by the 
same thread (a recursive request), the initialization request returns 
immediately and the thread continues with the partially initialized class. 
Static fields not yet assigned at that point read as their default value 
(`null` for references).
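
The 12.4.1 distinction can be checked in isolation. The following is a minimal sketch, not Cassandra code: the class names `ConstantAccessDemo` and `Names` and the table names are made up for illustration, and it assumes Java 9+ for `Set.of`. It records the order of events to show that reading a `String` constant leaves the declaring class uninitialized, while reading a `Set` field runs its static initializer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class ConstantAccessDemo {
    // Shared event log; filled in the order things actually happen.
    static final List<String> LOG = new ArrayList<>();

    // Hypothetical stand-in for a class such as SystemKeyspace.
    static class Names {
        // Constant variable: javac inlines "batches" at every use site.
        static final String BATCHES = "batches";
        // Not a constant variable: reading it initializes Names first.
        static final Set<String> TABLE_NAMES = Set.of("batches", "local");

        static {
            LOG.add("Names.<clinit>");
        }
    }

    // Reads the constant first, then the non-constant field.
    static List<String> trace() {
        String constant = Names.BATCHES;      // no initialization of Names
        LOG.add("read BATCHES");
        int size = Names.TABLE_NAMES.size();  // triggers Names.<clinit>
        LOG.add("read TABLE_NAMES");
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```

In the log, the `Names.<clinit>` entry appears after `read BATCHES` and before `read TABLE_NAMES`, which is the behavior the quick patch below relies on.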

 
+*Code Path:*+
 
DatabaseDescriptor.<clinit> *(loadConfig running)*
└─→ Converters.TABLE_COUNT_THRESHOLD_TO_GUARDRAIL
    └─→ SchemaConstants.getLocalAndReplicatedSystemTableNames()
        └─→ SystemKeyspace.TABLE_NAMES *← non-constant field access*
            └─→ SystemKeyspace.<clinit> *← triggered too early*
                └─→ DatabaseDescriptor.getPartitioner()
                    └─→ partitioner == null *← partially initialized*
                        └─→ +NPE+ / +ExceptionInInitializerError+
 
+*Reproducer*+: A sample program that confirms the partial-initialization 
problem.
{code:java}
public class EarlyInitDemo {

    static class DatabaseDescriptor {
        static String partitioner = null;
        static {
            // Triggers the chain while this class is still initializing.
            String names = SchemaConstants.getSystemTableNames();
            // Set AFTER the YAML convert/load step.
            partitioner = "Murmur3Partitioner";
        }
        static String getPartitioner() { return partitioner; }
    }

    static class SchemaConstants {
        static String getSystemTableNames() {
            // Non-constant field: reading it triggers SystemKeyspace.<clinit>.
            return SystemKeyspace.TABLE_NAMES.toString();
        }
    }

    static class SystemKeyspace {
        static final java.util.Set<String> TABLE_NAMES;
        static {
            // Same thread, recursive request: sees the default value null.
            String p = DatabaseDescriptor.getPartitioner();
            if (p == null) throw new NullPointerException("DD not initialized yet");
            TABLE_NAMES = java.util.Set.of("local", "peers");
        }
    }

    public static void main(String[] args) {
        try {
            DatabaseDescriptor.getPartitioner();
        } catch (ExceptionInInitializerError e) {
            e.printStackTrace(); // Caused by: NPE in SystemKeyspace.<clinit>
        }
    }
}{code}
 

To work around this, I tried a quick-and-dirty local patch that avoids 
triggering the early class initialization.
 * Replaced `{_}*.addAll(SystemKeyspace.TABLE_NAMES)*{_}` with individual 
`{_}*.add(SystemKeyspace.BATCHES)*{_}` calls, one per table.
 * `{+}_static final String_{+}` fields initialized to literals are 
compile-time constants, so reading them never triggers `{+}_<clinit>_{+}`.
 * *{+}Downside{+}:* verbose, and requires manual sync whenever system tables 
are added.

—
 
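+*Alternate Fix*+: declare the size as a `{+}_static final int_{+}`. The 
converter only calls `{_}*.size()*{_}`; it never iterates the set.
 
A `{*}_static final int_{*}` initialized with a constant expression (such as a 
sum of integer literals) is a compile-time constant, so reading it does not 
initialize the declaring class.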
{code:java}
// SchemaConstants.java
public static final int LOCAL_AND_REPLICATED_SYSTEM_TABLE_COUNT =
          25   // SystemKeyspace
        + 11   // SchemaKeyspace
        +  2   // TraceKeyspace
        +  5   // AuthKeyspace
        +  7   // SystemDistributedKeyspace
        +  2;  // AccordKeyspace{code}
*{+}Downside{+}:* This constant must also be kept in sync whenever the system 
table count changes.
 
{code:java}
// Converters.java
TABLE_COUNT_THRESHOLD_TO_GUARDRAIL(int.class, int.class,
    i -> i - SchemaConstants.LOCAL_AND_REPLICATED_SYSTEM_TABLE_COUNT,
    o -> o == null ? null : o + SchemaConstants.LOCAL_AND_REPLICATED_SYSTEM_TABLE_COUNT);{code}
`{*}_getLocalAndReplicatedSystemTableNames()_{*}` has no callers other than 
the _*Converters.TABLE_COUNT_THRESHOLD_TO_GUARDRAIL*_ enum constant.
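
One way to soften the keep-in-sync downside is a check that compares the hand-maintained count against the real sets once all classes are safely initialized, for example from a unit test. The sketch below is illustrative only: `TableCountGuard`, `SystemTables`, `TraceTables`, and the table names are hypothetical stand-ins, not Cassandra classes, and the numbers are not the real table counts.

```java
import java.util.Set;

public class TableCountGuard {
    // Hypothetical stand-ins for the per-keyspace table-name sets.
    static class SystemTables {
        static final Set<String> TABLE_NAMES = Set.of("local", "peers", "batches");
    }

    static class TraceTables {
        static final Set<String> TABLE_NAMES = Set.of("sessions", "events");
    }

    // Hand-maintained compile-time constant, mirroring the proposed
    // LOCAL_AND_REPLICATED_SYSTEM_TABLE_COUNT. Reading it initializes nothing.
    static final int SYSTEM_TABLE_COUNT =
              3   // SystemTables
            + 2;  // TraceTables

    // Derives the count from the real sets; this initializes both classes,
    // so it belongs in a test, not on the config-loading path.
    static int actualCount() {
        return SystemTables.TABLE_NAMES.size() + TraceTables.TABLE_NAMES.size();
    }

    // Fails loudly if the constant has drifted from the sets.
    static void checkInSync() {
        int actual = actualCount();
        if (SYSTEM_TABLE_COUNT != actual) {
            throw new AssertionError("SYSTEM_TABLE_COUNT is " + SYSTEM_TABLE_COUNT
                    + " but the table sets contain " + actual + " names");
        }
    }

    public static void main(String[] args) {
        checkInSync();
        System.out.println("constant matches the table sets: " + SYSTEM_TABLE_COUNT);
    }
}
```

The constant stays safe to read during config loading, while the check catches a forgotten update as soon as the test suite runs.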



> Static init race between paxos v2 and table count guardrail causes NPE on 
> startup
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21156
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21156
>             Project: Apache Cassandra
>          Issue Type: Bug
>          Components: Local/Startup and Shutdown
>            Reporter: Blake Eggleston
>            Priority: Normal
>             Fix For: 5.0.x
>
>
> Set the following config values:
> table_count_warn_threshold: 400
> causes this exception on startup:
> {{ERROR [main] 2026-02-02 15:52:22,389 CassandraDaemon.java:887 - Exception 
> encountered during startup
> java.lang.ExceptionInInitializerError: null
>        at 
> org.apache.cassandra.db.SystemKeyspace.<clinit>(SystemKeyspace.java:239)
>        at 
> org.apache.cassandra.schema.SchemaConstants.getLocalAndReplicatedSystemTableNames(SchemaConstants.java:184)
>        at 
> org.apache.cassandra.config.Converters.lambda$static$32(Converters.java:128)
>        at org.apache.cassandra.config.Converters.convert(Converters.java:174)
>        at org.apache.cassandra.config.Replacement$1.set(Replacement.java:76)
>        at 
> org.apache.cassandra.config.YamlConfigurationLoader$PropertiesChecker$1.set(YamlConfigurationLoader.java:376)
>        at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:276)
>        at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:169)
>        at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:320)
>        at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructObjectNoCheck(BaseConstructor.java:264)
>        at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:247)
>        at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:201)
>        at 
> org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:185)
>        at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:493)
>        at org.yaml.snakeyaml.Yaml.loadAs(Yaml.java:486)
>        at 
> org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:310)
>        at 
> org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:141)
>        at 
> org.apache.cassandra.config.YamlConfigurationLoader.loadConfig(YamlConfigurationLoader.java:116)
>        at 
> org.apache.cassandra.config.DatabaseDescriptor.loadConfig(DatabaseDescriptor.java:399)
>        at 
> org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:261)
>        at 
> org.apache.cassandra.config.DatabaseDescriptor.daemonInitialization(DatabaseDescriptor.java:246)
>        at 
> org.apache.cassandra.service.CassandraDaemon.applyConfig(CassandraDaemon.java:780)
>        at 
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:723)
>        at 
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:865)
> Caused by: java.lang.NullPointerException: null
>        at org.apache.cassandra.db.DataRange.allData(DataRange.java:71)
>        at 
> org.apache.cassandra.service.paxos.uncommitted.PaxosUncommittedIndex.<clinit>(PaxosUncommittedIndex.java:90)
>        ... 24 common frames omitted}}


