This is an automated email from the ASF dual-hosted git repository.

gengliangwang pushed a commit to branch branch-4.x
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.x by this push:
     new 617b3d7857fd [SPARK-57201][SQL] Skip the null check in UnsafeProjection codegen for statically non-null values
617b3d7857fd is described below

commit 617b3d7857fdb2fdaa526152f09023e1798c7643
Author: Gengliang Wang <[email protected]>
AuthorDate: Tue Jun 2 22:11:52 2026 -0700

    [SPARK-57201][SQL] Skip the null check in UnsafeProjection codegen for statically non-null values
    
    ### What changes were proposed in this pull request?
    
    `GenerateUnsafeProjection` writes each field as `if (${input.isNull}) { setNull } else { write }` whenever the field's schema is nullable. When the bound value is statically non-null, its `isNull` is the literal `false`, so this emits a dead `if (false) { setNullAt(i); } else { write(...); }`.
    
    This PR skips the null check (and the dead `setNull` branch) when `input.isNull == FalseLiteral`, emitting just the write -- the same code the existing `!nullable` path already produces. Symmetrically, when `input.isNull == TrueLiteral` (statically null), it emits only `setNull`. The `FalseLiteral` comparison is the idiom already used in this file for the top-level `zeroOutNullBytes` fast path.
    
    Before (for a non-null value in a nullable-schema field):
    
    ```java
    if (false) {
      rowWriter.setNullAt(0);
    } else {
      rowWriter.write(0, value);
    }
    ```
    
    After:
    
    ```java
    rowWriter.write(0, value);
    ```
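    
    The choice between the three emitted shapes can be sketched as a small standalone Java model. This is only an illustration, not Spark code: the class `NullCheckSketch`, the method `emitField`, and its string arguments are hypothetical stand-ins for the Scala logic in `GenerateUnsafeProjection`, with the strings `"false"` and `"true"` playing the role of `FalseLiteral` and `TrueLiteral`.
    
    ```java
    // Hypothetical stand-in for the branch selection in GenerateUnsafeProjection;
    // the names and signatures here are illustrative, not Spark APIs.
    final class NullCheckSketch {
      // isNull is the literal "false", the literal "true", or a variable name.
      static String emitField(boolean nullable, String isNull, String setNull, String write) {
        if (!nullable || isNull.equals("false")) {
          // Statically non-null: emit only the write.
          return write;
        } else if (isNull.equals("true")) {
          // Statically null: emit only the null-bit update.
          return setNull;
        } else {
          // Nullability is known only at runtime: keep the conditional.
          return "if (" + isNull + ") { " + setNull + " } else { " + write + " }";
        }
      }
    
      public static void main(String[] args) {
        String setNull = "rowWriter.setNullAt(0);";
        String write = "rowWriter.write(0, value);";
        System.out.println(emitField(true, "false", setNull, write));
        System.out.println(emitField(true, "true", setNull, write));
        System.out.println(emitField(true, "isNull_0", setNull, write));
      }
    }
    ```
    
    In this sketch only the third call keeps the `if`/`else`; the first two collapse to a single statement.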
    
    ### Why are the changes needed?
    
    Sub-task of SPARK-56908 (reduce generated Java size in whole-stage codegen). Dumping the TPC-DS whole-stage codegen shows ~358 such dead `if (false) { setNullAt } else { write }` blocks; the schema marks the field nullable while the bound value is statically non-null. Emitting only the write removes the dead branch and the redundant conditional from the generated projection code.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No. The row writer's null bits are already cleared up front (`resetRowWriter` / `zeroOutNullBytes`), so writing a non-null value without `setNullAt` is exactly what the existing `!nullable` path does; results are unchanged.
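    
    Why skipping `setNullAt` is safe can be illustrated with a toy model of a row's null bitset. The class `ToyRowWriter` below is hypothetical and far simpler than Spark's actual row writer; it assumes only what is stated above, namely that the null bits are cleared before any field is written.
    
    ```java
    // Toy model (not Spark's row writer): a single long serves as the null bitset.
    final class ToyRowWriter {
      private long nullBits = ~0L;                 // stale bits left by a previous row
      private final long[] values = new long[64];  // at most 64 fields in this sketch
    
      // Clears every null bit up front, mirroring the role of zeroOutNullBytes.
      void zeroOutNullBytes() { nullBits = 0L; }
    
      void setNullAt(int i) { nullBits |= (1L << i); }
    
      // Writes a value and deliberately leaves the null bits untouched.
      void write(int i, long v) { values[i] = v; }
    
      boolean isNullAt(int i) { return (nullBits & (1L << i)) != 0; }
    
      long get(int i) { return values[i]; }
    
      public static void main(String[] args) {
        ToyRowWriter w = new ToyRowWriter();
        w.zeroOutNullBytes();
        w.write(0, 42L);   // non-null field: no setNullAt and no conditional
        w.setNullAt(1);    // statically null field: only the null bit is set
        System.out.println(w.isNullAt(0) + " " + w.get(0) + " " + w.isNullAt(1));
      }
    }
    ```
    
    Because the bits start cleared, a plain `write` leaves field 0 marked non-null, which is the property the `!nullable` path already relies on.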
    
    ### How was this patch tested?
    
    Behavior-preserving change covered by the existing projection suites: `GeneratedProjectionSuite` and `UnsafeRowConverterSuite` (39 tests) pass. Additionally verified by re-dumping the TPC-DS whole-stage codegen: the ~358 dead `if (false) { setNullAt }` blocks are gone and every generated subtree still compiles.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    Generated-by: Claude Code (Opus 4.8)
    
    Closes #56257 from gengliangwang/spark-unsafeproj-nullcheck.
    
    Authored-by: Gengliang Wang <[email protected]>
    Signed-off-by: Gengliang Wang <[email protected]>
    (cherry picked from commit 8a73a0c2b4848372c740c15db4f30578b97c9c52)
    Signed-off-by: Gengliang Wang <[email protected]>
---
 .../expressions/codegen/GenerateUnsafeProjection.scala         | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
index 0f64eefe1a06..cf2902175ea7 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateUnsafeProjection.scala
@@ -119,11 +119,19 @@ object GenerateUnsafeProjection extends CodeGenerator[Seq[Expression], UnsafePro
         }
 
         val writeField = writeElement(ctx, input.value, index.toString, dt, rowWriter)
-        if (!nullable) {
+        if (!nullable || input.isNull == FalseLiteral) {
+          // The value is statically known to be non-null, so skip the null check and the
+          // (dead) setNull branch and just write the value.
           s"""
              |${input.code}
              |${writeField.trim}
            """.stripMargin
+        } else if (input.isNull == TrueLiteral) {
+          // The value is statically known to be null, so only set the null bit.
+          s"""
+             |${input.code}
+             |${setNull.trim}
+           """.stripMargin
         } else {
           s"""
              |${input.code}

