This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-4.1 by this push:
new 6073eb7344a3 [SPARK-55747][SQL] Fix NPE when accessing elements from an array that is null
6073eb7344a3 is described below
commit 6073eb7344a3157726f2244accbb72ff14bb6279
Author: Wenchen Fan <[email protected]>
AuthorDate: Sun Mar 1 23:06:47 2026 +0800
[SPARK-55747][SQL] Fix NPE when accessing elements from an array that is null
### What changes were proposed in this pull request?
The `GetArrayItem` expression incorrectly computed `nullable = false` when
indexing into arrays with `containsNull = false` (e.g., from split()), even
when the array itself could be null. This caused codegen to skip null checks,
leading to NPE on `array.numElements()` during bounds checking.
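
The nullability rule described above can be sketched as a small standalone model. This is not the Catalyst code: `ArrayExpr` and `NullabilitySketch` are hypothetical names introduced for illustration, and the sketch assumes that `arrayElementNullable` corresponds to the array type's `containsNull` flag. It covers only the branch touched by this commit.

```scala
// Hypothetical, simplified stand-in for an array-typed child expression.
// nullable     = the array value itself may be null
// containsNull = elements inside the array may be null
final case class ArrayExpr(nullable: Boolean, containsNull: Boolean)

object NullabilitySketch {
  // Rule before the fix: only element nullability is considered,
  // so a nullable array with non-null elements is reported as non-nullable.
  def nullableBefore(child: ArrayExpr, failOnError: Boolean): Boolean =
    if (failOnError) child.containsNull else true

  // Rule after the fix: the result is also nullable when the array itself
  // can be null, so generated code keeps its null check.
  def nullableAfter(child: ArrayExpr, failOnError: Boolean): Boolean =
    if (failOnError) child.containsNull || child.nullable else true
}
```

For a child shaped like `split(s, '-')` on a nullable column (modeled here as `ArrayExpr(nullable = true, containsNull = false)`) with `failOnError = true`, the old rule yields `false` while the new rule yields `true`.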
### Why are the changes needed?
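To prevent a NullPointerException in generated code when `GetArrayItem` indexes into a null array whose type has `containsNull = false`.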
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
A new test in `StringFunctionsSuite` that indexes into the result of `split()` on a null string.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #54546 from stevomitric/stevomitric/fix-npe-codegen.
Lead-authored-by: Wenchen Fan <[email protected]>
Co-authored-by: Stevo Mitric <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit c0e367a54b1bb1877d1a367af8d321aca59dff59)
Signed-off-by: Wenchen Fan <[email protected]>
---
.../sql/catalyst/expressions/complexTypeExtractors.scala | 2 +-
.../scala/org/apache/spark/sql/StringFunctionsSuite.scala | 15 +++++++++++++++
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
index dba061eeb870..f40077c53311 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeExtractors.scala
@@ -431,7 +431,7 @@ trait GetArrayItemUtil {
true
}
} else {
- if (failOnError) arrayElementNullable else true
+ if (failOnError) arrayElementNullable || child.nullable else true
}
}
}
diff --git a/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala b/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
index ff0ee19ae971..7bfc8cf4fa61 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/StringFunctionsSuite.scala
@@ -1470,4 +1470,19 @@ class StringFunctionsSuite extends QueryTest with SharedSparkSession {
Seq(Row("abc", "def")))
}
}
+
+ test("SPARK-55747: GetArrayItem NPE on null array from split() with ANSI enabled") {
+ // GetArrayItem.nullable was incorrectly computed as false when the array type has
+ // containsNull=false (e.g., from StringSplit) but the array itself can be null.
+ // This caused codegen to skip null checks, leading to NPE when calling
+ // array.numElements() on a null array during bounds checking.
+ withTable("t") {
+ sql("CREATE TABLE t (s STRING) USING parquet")
+ sql("INSERT INTO t VALUES ('a-b'), (null)")
+ checkAnswer(
+ sql("SELECT split(s, '-')[size(split(s, '-')) - 1] FROM t"),
+ Seq(Row("b"), Row(null))
+ )
+ }
+ }
}