This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch branch-4.1
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-4.1 by this push:
     new d50e9b7fee5d [SPARK-54454][SQL] Enable variant shredding and variant logical type annotation configs by default
d50e9b7fee5d is described below

commit d50e9b7fee5d1cdb7452461c1d48074353bac133
Author: Harsh Motwani <[email protected]>
AuthorDate: Fri Nov 28 13:59:06 2025 -0800

    [SPARK-54454][SQL] Enable variant shredding and variant logical type annotation configs by default
    
    ### What changes were proposed in this pull request?
    
    This PR enables, by default, annotating variant Parquet columns with the variant logical type and writing and reading shredded variants.
    
    ### Why are the changes needed?
    
    1. Annotating variant data with the variant logical type is required by the parquet variant spec ([source](https://github.com/apache/parquet-format/blob/master/VariantEncoding.md#variant-in-parquet)), so this change is necessary to adhere to the spec.
    2. Variant shredding brings significant performance improvements over regular unshredded variants and should therefore be the default mode.
    
    ### Does this PR introduce _any_ user-facing change?
    
    Yes. Variant data written by Spark is now annotated with the variant logical type, and variant shredding is enabled by default.
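    
    For sessions that need the previous behavior, the new defaults can be switched back via runtime configuration. A minimal sketch, assuming a live `SparkSession` named `spark`; only `spark.sql.variant.writeShredding.enabled` is fully visible in the diff below, and the other toggled config keys are line-wrapped there, so they are not reproduced here:
    
    ```scala
    // Hedged sketch: revert shredded variant writes to the pre-4.1 default
    // for the current session only. Assumes an existing SparkSession `spark`.
    spark.conf.set("spark.sql.variant.writeShredding.enabled", "false")
    ```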
    
    ### How was this patch tested?
    
    Existing tests.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #53164 from harshmotw-db/harshmotw-db/enable_variant_shredding.
    
    Lead-authored-by: Harsh Motwani <[email protected]>
    Co-authored-by: Wenchen Fan <[email protected]>
    Signed-off-by: Dongjoon Hyun <[email protected]>
    (cherry picked from commit 3a06297fb9a66dca9bd5597630e34b4b057e893f)
    Signed-off-by: Dongjoon Hyun <[email protected]>
---
 .../src/main/scala/org/apache/spark/sql/internal/SQLConf.scala    | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index 4b82966b2b6d..951bdb30c701 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -1598,7 +1598,7 @@ object SQLConf {
         "variant logical type.")
       .version("4.1.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   val PARQUET_IGNORE_VARIANT_ANNOTATION =
     buildConf("spark.sql.parquet.ignoreVariantAnnotation")
@@ -5526,7 +5526,7 @@ object SQLConf {
         "requested fields.")
       .version("4.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   val VARIANT_WRITE_SHREDDING_ENABLED =
     buildConf("spark.sql.variant.writeShredding.enabled")
@@ -5534,7 +5534,7 @@ object SQLConf {
       .doc("When true, the Parquet writer is allowed to write shredded variant. ")
       .version("4.0.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   val VARIANT_FORCE_SHREDDING_SCHEMA_FOR_TEST =
     buildConf("spark.sql.variant.forceShreddingSchemaForTest")
@@ -5567,7 +5567,7 @@ object SQLConf {
       .doc("Infer shredding schema when writing Variant columns in Parquet tables.")
       .version("4.1.0")
       .booleanConf
-      .createWithDefault(false)
+      .createWithDefault(true)
 
   val LEGACY_CSV_ENABLE_DATE_TIME_PARSING_FALLBACK =
     buildConf("spark.sql.legacy.csv.enableDateTimeParsingFallback")


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
