This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new 7120e6b88f23 [SPARK-45905][SQL] Least common type between decimal
types should retain integral digits first
7120e6b88f23 is described below
commit 7120e6b88f2327ffb71c4bca14b10b15aeb26c32
Author: Wenchen Fan <[email protected]>
AuthorDate: Wed Nov 15 20:39:38 2023 +0800
[SPARK-45905][SQL] Least common type between decimal types should retain
integral digits first
### What changes were proposed in this pull request?
This is kind of a follow-up to https://github.com/apache/spark/pull/20023.
It's simply wrong to just cut the decimal precision to 38 when a wider decimal
type exceeds the max precision: that drops integral digits and makes the
decimal value very likely to overflow.
In https://github.com/apache/spark/pull/20023, we fixed this issue for
arithmetic operations, but many other operations suffer from the same issue:
Union, binary comparison, IN subquery, create_array, coalesce, etc.
This PR fixes all the remaining operators, without the minimum-scale
limitation, which, according to the SQL Server doc, should be applied to
division and multiplication only:
https://learn.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-ver15
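For reference, here is a minimal standalone sketch of the new widening rule,
mirroring `widerDecimalType` in `DecimalPrecision.scala` together with the new
`DecimalType.boundedPreferIntegralDigits` added by this PR (the object name and
the plain `(precision, scale)` tuple return type are illustrative, not Spark
API):

```scala
object DecimalWidening {
  val MaxPrecision = 38 // Spark's maximum decimal precision

  // Least common type between decimal(p1, s1) and decimal(p2, s2):
  // keep enough integral and fractional digits for both sides, then,
  // if the result exceeds the maximum precision, drop fractional
  // digits first because integral digits are more significant.
  def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): (Int, Int) = {
    val scale = math.max(s1, s2)
    val range = math.max(p1 - s1, p2 - s2)
    val precision = range + scale
    if (precision <= MaxPrecision) {
      (precision, scale)
    } else {
      val diff = precision - MaxPrecision
      (MaxPrecision, math.max(0, scale - diff))
    }
  }
}
```

For example, widening `decimal(28, 0)` with `decimal(38, 20)` asks for
`decimal(48, 20)`, which this rule reduces to `decimal(38, 10)` (the example
used in the updated docs) rather than the previous `decimal(38, 20)`.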
### Why are the changes needed?
To produce a reasonable wider decimal type.
### Does this PR introduce _any_ user-facing change?
Yes, the final data type of these operators changes if it is a decimal type
whose precision exceeds the maximum and whose scale is not 0.
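As a concrete illustration (the numbers come from the updated
`TypeCoercionSuite` cases in this diff), widening `decimal(36, 0)` with
`decimal(36, 35)`:

```scala
val (p1, s1) = (36, 0)
val (p2, s2) = (36, 35)
val scale = math.max(s1, s2)                       // 35
val precision = math.max(p1 - s1, p2 - s2) + scale // 36 + 35 = 71 > 38
// before this PR: DecimalType(38, 35), leaving only 3 integral digits
// after this PR:  DecimalType(38, 35 - (71 - 38)) = DecimalType(38, 2)
```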
### How was this patch tested?
Updated tests.
### Was this patch authored or co-authored using generative AI tooling?
No
Closes #43781 from cloud-fan/decimal.
Lead-authored-by: Wenchen Fan <[email protected]>
Co-authored-by: Wenchen Fan <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
---
docs/sql-ref-ansi-compliance.md | 19 ++
.../org/apache/spark/sql/types/DecimalType.scala | 12 ++
.../sql/catalyst/analysis/DecimalPrecision.scala | 7 +-
.../org/apache/spark/sql/internal/SQLConf.scala | 12 +-
.../catalyst/analysis/AnsiTypeCoercionSuite.scala | 10 +-
.../sql/catalyst/analysis/TypeCoercionSuite.scala | 18 +-
.../typeCoercion/native/mapZipWith.sql.out | 50 ++---
.../results/typeCoercion/native/mapZipWith.sql.out | 42 +----
.../approved-plans-v2_7/q36a.sf100/explain.txt | 6 +-
.../approved-plans-v2_7/q36a/explain.txt | 6 +-
.../tpcds-query-results/v2_7/q36a.sql.out | 202 ++++++++++-----------
.../apache/spark/sql/DataFrameFunctionsSuite.scala | 21 +--
12 files changed, 191 insertions(+), 214 deletions(-)
diff --git a/docs/sql-ref-ansi-compliance.md b/docs/sql-ref-ansi-compliance.md
index 4729db16d63f..90e65e5ce36e 100644
--- a/docs/sql-ref-ansi-compliance.md
+++ b/docs/sql-ref-ansi-compliance.md
@@ -240,6 +240,25 @@ The least common type resolution is used to:
- Derive the result type for expressions such as the case expression.
- Derive the element, key, or value types for array and map constructors.
Special rules are applied if the least common type resolves to FLOAT. With
float type values, if any of the types is INT, BIGINT, or DECIMAL the least
common type is pushed to DOUBLE to avoid potential loss of digits.
+
+Decimal type is a bit more complicated here, as it's not a simple type but has
parameters: precision and scale.
+A `decimal(precision, scale)` means the value can have at most `precision -
scale` digits in the integral part and `scale` digits in the fractional part.
+A least common type between decimal types should have enough digits in both
integral and fractional parts to represent all values.
+More precisely, a least common type between `decimal(p1, s1)` and `decimal(p2,
s2)` has the scale of `max(s1, s2)` and precision of `max(s1, s2) + max(p1 -
s1, p2 - s2)`.
+However, decimal types in Spark have a maximum precision: 38. If the final decimal type needs more precision, we must truncate.
+Since the digits in the integral part are more significant, Spark truncates
the digits in the fractional part first. For example, `decimal(48, 20)` will be
reduced to `decimal(38, 10)`.
+
+Note that arithmetic operations have special rules to calculate the least common type for decimal inputs:
+
+| Operation | Result precision | Result scale |
+|------------|------------------------------------------|---------------------|
+| e1 + e2 | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2) |
+| e1 - e2 | max(s1, s2) + max(p1 - s1, p2 - s2) + 1 | max(s1, s2) |
+| e1 * e2 | p1 + p2 + 1 | s1 + s2 |
+| e1 / e2 | p1 - s1 + s2 + max(6, s1 + p2 + 1) | max(6, s1 + p2 + 1) |
+| e1 % e2 | min(p1 - s1, p2 - s2) + max(s1, s2) | max(s1, s2) |
+
+The truncation rule is also different for arithmetic operations: they retain
at least 6 digits in the fractional part, which means we can only reduce
`scale` to 6. Overflow may happen in this case.
```sql
-- The coalesce function accepts any set of argument types as long as they
share a least common type.
diff --git
a/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
b/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
index af545341dee7..9de34d0b3bc1 100644
--- a/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
+++ b/sql/api/src/main/scala/org/apache/spark/sql/types/DecimalType.scala
@@ -146,6 +146,18 @@ object DecimalType extends AbstractDataType {
DecimalType(min(precision, MAX_PRECISION), min(scale, MAX_SCALE))
}
+ private[sql] def boundedPreferIntegralDigits(precision: Int, scale: Int):
DecimalType = {
+ if (precision <= MAX_PRECISION) {
+ DecimalType(precision, scale)
+ } else {
+ // If we have to reduce the precision, we should retain the digits in
the integral part first,
+ // as they are more significant to the value. Here we reduce the scale
as well to drop the
+ // digits in the fractional part.
+ val diff = precision - MAX_PRECISION
+ DecimalType(MAX_PRECISION, math.max(0, scale - diff))
+ }
+ }
+
private[sql] def checkNegativeScale(scale: Int): Unit = {
if (scale < 0 && !SqlApiConf.get.allowNegativeScaleOfDecimalEnabled) {
throw DataTypeErrors.negativeScaleNotAllowedError(scale)
diff --git
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
index 09cf61a77955..d5e914b343d5 100644
---
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
+++
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/DecimalPrecision.scala
@@ -20,6 +20,7 @@ package org.apache.spark.sql.catalyst.analysis
import org.apache.spark.sql.catalyst.expressions._
import org.apache.spark.sql.catalyst.expressions.Literal._
import org.apache.spark.sql.catalyst.types.DataTypeUtils
+import org.apache.spark.sql.internal.SQLConf
import org.apache.spark.sql.types._
@@ -64,7 +65,11 @@ object DecimalPrecision extends TypeCoercionRule {
def widerDecimalType(p1: Int, s1: Int, p2: Int, s2: Int): DecimalType = {
val scale = max(s1, s2)
val range = max(p1 - s1, p2 - s2)
- DecimalType.bounded(range + scale, scale)
+ if (conf.getConf(SQLConf.LEGACY_RETAIN_FRACTION_DIGITS_FIRST)) {
+ DecimalType.bounded(range + scale, scale)
+ } else {
+ DecimalType.boundedPreferIntegralDigits(range + scale, scale)
+ }
}
override def transform: PartialFunction[Expression, Expression] = {
diff --git
a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
index c12c5a826ba1..2d67a8428d22 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
@@ -4550,6 +4550,16 @@ object SQLConf {
.booleanConf
.createWithDefault(false)
+ val LEGACY_RETAIN_FRACTION_DIGITS_FIRST =
+ buildConf("spark.sql.legacy.decimal.retainFractionDigitsOnTruncate")
+ .internal()
+ .doc("When set to true, we will try to retain the fraction digits first
rather than " +
+ "integral digits as prior Spark 4.0, when getting a least common type
between decimal " +
+ "types, and the result decimal precision exceeds the max precision.")
+ .version("4.0.0")
+ .booleanConf
+ .createWithDefault(false)
+
/**
* Holds information about keys that have been deprecated.
*
@@ -5434,7 +5444,7 @@ class SQLConf extends Serializable with Logging with
SqlApiConf {
}
def legacyRaiseErrorWithoutErrorClass: Boolean =
- getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS)
+ getConf(SQLConf.LEGACY_RAISE_ERROR_WITHOUT_ERROR_CLASS)
/** ********************** SQLConf functionality methods ************ */
diff --git
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala
index cc0f6046de00..38acb56ad1e0 100644
---
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala
+++
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnsiTypeCoercionSuite.scala
@@ -481,9 +481,9 @@ class AnsiTypeCoercionSuite extends TypeCoercionSuiteBase {
:: Literal.create(null, DecimalType(22, 10))
:: Literal.create(null, DecimalType(38, 38))
:: Nil),
- CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38,
38))
- :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 38))
- :: Literal.create(null, DecimalType(38, 38))
+ CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38,
26))
+ :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 26))
+ :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 26))
:: Nil))
}
@@ -530,9 +530,9 @@ class AnsiTypeCoercionSuite extends TypeCoercionSuiteBase {
:: Literal.create(null, DecimalType(38, 38))
:: Nil),
CreateMap(Literal(1)
- :: Literal.create(null, DecimalType(38, 0)).cast(DecimalType(38, 38))
+ :: Literal.create(null, DecimalType(38, 0))
:: Literal(2)
- :: Literal.create(null, DecimalType(38, 38))
+ :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 0))
:: Nil))
// type coercion for both map keys and values
ruleTest(AnsiTypeCoercion.FunctionArgumentConversion,
diff --git
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
index db86e7131446..330252d26dc5 100644
---
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
+++
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercionSuite.scala
@@ -782,7 +782,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
widenTestWithStringPromotion(
ArrayType(DecimalType(36, 0), containsNull = false),
ArrayType(DecimalType(36, 35), containsNull = false),
- Some(ArrayType(DecimalType(38, 35), containsNull = true)))
+ Some(ArrayType(DecimalType(38, 2), containsNull = false)))
// MapType
widenTestWithStringPromotion(
@@ -808,7 +808,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
widenTestWithStringPromotion(
MapType(StringType, DecimalType(36, 0), valueContainsNull = false),
MapType(StringType, DecimalType(36, 35), valueContainsNull = false),
- Some(MapType(StringType, DecimalType(38, 35), valueContainsNull = true)))
+ Some(MapType(StringType, DecimalType(38, 2), valueContainsNull = false)))
widenTestWithStringPromotion(
MapType(IntegerType, StringType, valueContainsNull = false),
MapType(DecimalType.IntDecimal, StringType, valueContainsNull = false),
@@ -816,7 +816,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
widenTestWithStringPromotion(
MapType(DecimalType(36, 0), StringType, valueContainsNull = false),
MapType(DecimalType(36, 35), StringType, valueContainsNull = false),
- None)
+ Some(MapType(DecimalType(38, 2), StringType, valueContainsNull = false)))
// StructType
widenTestWithStringPromotion(
@@ -847,7 +847,7 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
widenTestWithStringPromotion(
new StructType().add("num", DecimalType(36, 0), nullable = false),
new StructType().add("num", DecimalType(36, 35), nullable = false),
- Some(new StructType().add("num", DecimalType(38, 35), nullable = true)))
+ Some(new StructType().add("num", DecimalType(38, 2), nullable = false)))
widenTestWithStringPromotion(
new StructType().add("num", IntegerType),
@@ -1046,9 +1046,9 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
:: Literal.create(null, DecimalType(22, 10))
:: Literal.create(null, DecimalType(38, 38))
:: Nil),
- CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38,
38))
- :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 38))
- :: Literal.create(null, DecimalType(38, 38))
+ CreateArray(Literal.create(null, DecimalType(5, 3)).cast(DecimalType(38,
26))
+ :: Literal.create(null, DecimalType(22, 10)).cast(DecimalType(38, 26))
+ :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 26))
:: Nil))
}
@@ -1095,9 +1095,9 @@ class TypeCoercionSuite extends TypeCoercionSuiteBase {
:: Literal.create(null, DecimalType(38, 38))
:: Nil),
CreateMap(Literal(1)
- :: Literal.create(null, DecimalType(38, 0)).cast(DecimalType(38, 38))
+ :: Literal.create(null, DecimalType(38, 0))
:: Literal(2)
- :: Literal.create(null, DecimalType(38, 38))
+ :: Literal.create(null, DecimalType(38, 38)).cast(DecimalType(38, 0))
:: Nil))
// type coercion for both map keys and values
ruleTest(TypeCoercion.FunctionArgumentConversion,
diff --git
a/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out
b/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out
index bc8909f1c103..4b8d37d22548 100644
---
a/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out
+++
b/sql/core/src/test/resources/sql-tests/analyzer-results/typeCoercion/native/mapZipWith.sql.out
@@ -128,24 +128,13 @@ Project [map_zip_with(double_map#x, cast(float_map#x as
map<double,float>), lamb
SELECT map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1,
v2)) m
FROM various_maps
-- !query analysis
-org.apache.spark.sql.catalyst.ExtendedAnalysisException
-{
- "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES",
- "sqlState" : "42K09",
- "messageParameters" : {
- "functionName" : "`map_zip_with`",
- "leftType" : "\"DECIMAL(36,0)\"",
- "rightType" : "\"DECIMAL(36,35)\"",
- "sqlExpr" : "\"map_zip_with(decimal_map1, decimal_map2,
lambdafunction(struct(k, v1, v2), k, v1, v2))\""
- },
- "queryContext" : [ {
- "objectType" : "",
- "objectName" : "",
- "startIndex" : 8,
- "stopIndex" : 81,
- "fragment" : "map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) ->
struct(k, v1, v2))"
- } ]
-}
+Project [map_zip_with(cast(decimal_map1#x as
map<decimal(38,2),decimal(36,0)>), cast(decimal_map2#x as
map<decimal(38,2),decimal(36,35)>), lambdafunction(struct(k, lambda k#x, v1,
lambda v1#x, v2, lambda v2#x), lambda k#x, lambda v1#x, lambda v2#x, false)) AS
m#x]
++- SubqueryAlias various_maps
+ +- View (`various_maps`,
[boolean_map#x,tinyint_map#x,smallint_map#x,int_map#x,bigint_map#x,decimal_map1#x,decimal_map2#x,double_map#x,float_map#x,date_map#x,timestamp_map#x,string_map1#x,string_map2#x,string_map3#x,string_map4#x,array_map1#x,array_map2#x,struct_map1#x,struct_map2#x])
+ +- Project [cast(boolean_map#x as map<boolean,boolean>) AS
boolean_map#x, cast(tinyint_map#x as map<tinyint,tinyint>) AS tinyint_map#x,
cast(smallint_map#x as map<smallint,smallint>) AS smallint_map#x,
cast(int_map#x as map<int,int>) AS int_map#x, cast(bigint_map#x as
map<bigint,bigint>) AS bigint_map#x, cast(decimal_map1#x as
map<decimal(36,0),decimal(36,0)>) AS decimal_map1#x, cast(decimal_map2#x as
map<decimal(36,35),decimal(36,35)>) AS decimal_map2#x, cast(double_map#x as
map<d [...]
+ +- Project [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x,
bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x,
date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x,
string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x]
+ +- SubqueryAlias various_maps
+ +- LocalRelation [boolean_map#x, tinyint_map#x, smallint_map#x,
int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x,
float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x,
string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x,
struct_map2#x]
-- !query
@@ -178,24 +167,13 @@ Project [map_zip_with(cast(decimal_map1#x as
map<double,decimal(36,0)>), double_
SELECT map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2)) m
FROM various_maps
-- !query analysis
-org.apache.spark.sql.catalyst.ExtendedAnalysisException
-{
- "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES",
- "sqlState" : "42K09",
- "messageParameters" : {
- "functionName" : "`map_zip_with`",
- "leftType" : "\"DECIMAL(36,35)\"",
- "rightType" : "\"INT\"",
- "sqlExpr" : "\"map_zip_with(decimal_map2, int_map,
lambdafunction(struct(k, v1, v2), k, v1, v2))\""
- },
- "queryContext" : [ {
- "objectType" : "",
- "objectName" : "",
- "startIndex" : 8,
- "stopIndex" : 76,
- "fragment" : "map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k,
v1, v2))"
- } ]
-}
+Project [map_zip_with(cast(decimal_map2#x as
map<decimal(38,28),decimal(36,35)>), cast(int_map#x as
map<decimal(38,28),int>), lambdafunction(struct(k, lambda k#x, v1, lambda v1#x,
v2, lambda v2#x), lambda k#x, lambda v1#x, lambda v2#x, false)) AS m#x]
++- SubqueryAlias various_maps
+ +- View (`various_maps`,
[boolean_map#x,tinyint_map#x,smallint_map#x,int_map#x,bigint_map#x,decimal_map1#x,decimal_map2#x,double_map#x,float_map#x,date_map#x,timestamp_map#x,string_map1#x,string_map2#x,string_map3#x,string_map4#x,array_map1#x,array_map2#x,struct_map1#x,struct_map2#x])
+ +- Project [cast(boolean_map#x as map<boolean,boolean>) AS
boolean_map#x, cast(tinyint_map#x as map<tinyint,tinyint>) AS tinyint_map#x,
cast(smallint_map#x as map<smallint,smallint>) AS smallint_map#x,
cast(int_map#x as map<int,int>) AS int_map#x, cast(bigint_map#x as
map<bigint,bigint>) AS bigint_map#x, cast(decimal_map1#x as
map<decimal(36,0),decimal(36,0)>) AS decimal_map1#x, cast(decimal_map2#x as
map<decimal(36,35),decimal(36,35)>) AS decimal_map2#x, cast(double_map#x as
map<d [...]
+ +- Project [boolean_map#x, tinyint_map#x, smallint_map#x, int_map#x,
bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x, float_map#x,
date_map#x, timestamp_map#x, string_map1#x, string_map2#x, string_map3#x,
string_map4#x, array_map1#x, array_map2#x, struct_map1#x, struct_map2#x]
+ +- SubqueryAlias various_maps
+ +- LocalRelation [boolean_map#x, tinyint_map#x, smallint_map#x,
int_map#x, bigint_map#x, decimal_map1#x, decimal_map2#x, double_map#x,
float_map#x, date_map#x, timestamp_map#x, string_map1#x, string_map2#x,
string_map3#x, string_map4#x, array_map1#x, array_map2#x, struct_map1#x,
struct_map2#x]
-- !query
diff --git
a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out
b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out
index 452603c2b1eb..5c00e8a5b63d 100644
---
a/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out
+++
b/sql/core/src/test/resources/sql-tests/results/typeCoercion/native/mapZipWith.sql.out
@@ -79,26 +79,9 @@ struct<m:map<double,struct<k:double,v1:double,v2:float>>>
SELECT map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) -> struct(k, v1,
v2)) m
FROM various_maps
-- !query schema
-struct<>
+struct<m:map<decimal(38,2),struct<k:decimal(38,2),v1:decimal(36,0),v2:decimal(36,35)>>>
-- !query output
-org.apache.spark.sql.catalyst.ExtendedAnalysisException
-{
- "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES",
- "sqlState" : "42K09",
- "messageParameters" : {
- "functionName" : "`map_zip_with`",
- "leftType" : "\"DECIMAL(36,0)\"",
- "rightType" : "\"DECIMAL(36,35)\"",
- "sqlExpr" : "\"map_zip_with(decimal_map1, decimal_map2,
lambdafunction(struct(k, v1, v2), k, v1, v2))\""
- },
- "queryContext" : [ {
- "objectType" : "",
- "objectName" : "",
- "startIndex" : 8,
- "stopIndex" : 81,
- "fragment" : "map_zip_with(decimal_map1, decimal_map2, (k, v1, v2) ->
struct(k, v1, v2))"
- } ]
-}
+{9.22:{"k":9.22,"v1":null,"v2":9.22337203685477897945456575809789456},922337203685477897945456575809789456.00:{"k":922337203685477897945456575809789456.00,"v1":922337203685477897945456575809789456,"v2":null}}
-- !query
@@ -123,26 +106,9 @@
struct<m:map<double,struct<k:double,v1:decimal(36,0),v2:double>>>
SELECT map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k, v1, v2)) m
FROM various_maps
-- !query schema
-struct<>
+struct<m:map<decimal(38,28),struct<k:decimal(38,28),v1:decimal(36,35),v2:int>>>
-- !query output
-org.apache.spark.sql.catalyst.ExtendedAnalysisException
-{
- "errorClass" : "DATATYPE_MISMATCH.MAP_ZIP_WITH_DIFF_TYPES",
- "sqlState" : "42K09",
- "messageParameters" : {
- "functionName" : "`map_zip_with`",
- "leftType" : "\"DECIMAL(36,35)\"",
- "rightType" : "\"INT\"",
- "sqlExpr" : "\"map_zip_with(decimal_map2, int_map,
lambdafunction(struct(k, v1, v2), k, v1, v2))\""
- },
- "queryContext" : [ {
- "objectType" : "",
- "objectName" : "",
- "startIndex" : 8,
- "stopIndex" : 76,
- "fragment" : "map_zip_with(decimal_map2, int_map, (k, v1, v2) -> struct(k,
v1, v2))"
- } ]
-}
+{2.0000000000000000000000000000:{"k":2.0000000000000000000000000000,"v1":null,"v2":1},9.2233720368547789794545657581:{"k":9.2233720368547789794545657581,"v1":9.22337203685477897945456575809789456,"v2":null}}
-- !query
diff --git
a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt
b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt
index 033ff1940d34..4ea7fcca2c1e 100644
---
a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt
+++
b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a.sf100/explain.txt
@@ -146,7 +146,7 @@ Input [4]: [i_category#12, i_class#11, sum#15, sum#16]
Keys [2]: [i_category#12, i_class#11]
Functions [2]: [sum(UnscaledValue(ss_net_profit#4)),
sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_net_profit#4))#17,
sum(UnscaledValue(ss_ext_sales_price#3))#18]
-Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) /
MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as
decimal(38,20)) AS gross_margin#19, i_category#12, i_class#11, 0 AS
t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22]
+Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) /
MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as
decimal(38,11)) AS gross_margin#19, i_category#12, i_class#11, 0 AS
t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22]
(23) ReusedExchange [Reuses operator id: 21]
Output [4]: [i_category#23, i_class#24, sum#25, sum#26]
@@ -174,7 +174,7 @@ Input [5]: [i_category#23, sum#37, isEmpty#38, sum#39,
isEmpty#40]
Keys [1]: [i_category#23]
Functions [2]: [sum(ss_net_profit#31), sum(ss_ext_sales_price#32)]
Aggregate Attributes [2]: [sum(ss_net_profit#31)#41,
sum(ss_ext_sales_price#32)#42]
-Results [6]: [cast((sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42)
as decimal(38,20)) AS gross_margin#43, i_category#23, null AS i_class#44, 0 AS
t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47]
+Results [6]: [(sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) AS
gross_margin#43, i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS
t_class#46, 1 AS lochierarchy#47]
(28) ReusedExchange [Reuses operator id: 21]
Output [4]: [i_category#48, i_class#49, sum#50, sum#51]
@@ -202,7 +202,7 @@ Input [4]: [sum#60, isEmpty#61, sum#62, isEmpty#63]
Keys: []
Functions [2]: [sum(ss_net_profit#54), sum(ss_ext_sales_price#55)]
Aggregate Attributes [2]: [sum(ss_net_profit#54)#64,
sum(ss_ext_sales_price#55)#65]
-Results [6]: [cast((sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65)
as decimal(38,20)) AS gross_margin#66, null AS i_category#67, null AS
i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71]
+Results [6]: [(sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) AS
gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69,
1 AS t_class#70, 2 AS lochierarchy#71]
(33) Union
diff --git
a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt
b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt
index 7c2e7d17af2b..0db1fa04fc6f 100644
---
a/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt
+++
b/sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v2_7/q36a/explain.txt
@@ -146,7 +146,7 @@ Input [4]: [i_category#10, i_class#9, sum#15, sum#16]
Keys [2]: [i_category#10, i_class#9]
Functions [2]: [sum(UnscaledValue(ss_net_profit#4)),
sum(UnscaledValue(ss_ext_sales_price#3))]
Aggregate Attributes [2]: [sum(UnscaledValue(ss_net_profit#4))#17,
sum(UnscaledValue(ss_ext_sales_price#3))#18]
-Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) /
MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as
decimal(38,20)) AS gross_margin#19, i_category#10, i_class#9, 0 AS
t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22]
+Results [6]: [cast((MakeDecimal(sum(UnscaledValue(ss_net_profit#4))#17,17,2) /
MakeDecimal(sum(UnscaledValue(ss_ext_sales_price#3))#18,17,2)) as
decimal(38,11)) AS gross_margin#19, i_category#10, i_class#9, 0 AS
t_category#20, 0 AS t_class#21, 0 AS lochierarchy#22]
(23) ReusedExchange [Reuses operator id: 21]
Output [4]: [i_category#23, i_class#24, sum#25, sum#26]
@@ -174,7 +174,7 @@ Input [5]: [i_category#23, sum#37, isEmpty#38, sum#39,
isEmpty#40]
Keys [1]: [i_category#23]
Functions [2]: [sum(ss_net_profit#31), sum(ss_ext_sales_price#32)]
Aggregate Attributes [2]: [sum(ss_net_profit#31)#41,
sum(ss_ext_sales_price#32)#42]
-Results [6]: [cast((sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42)
as decimal(38,20)) AS gross_margin#43, i_category#23, null AS i_class#44, 0 AS
t_category#45, 1 AS t_class#46, 1 AS lochierarchy#47]
+Results [6]: [(sum(ss_net_profit#31)#41 / sum(ss_ext_sales_price#32)#42) AS
gross_margin#43, i_category#23, null AS i_class#44, 0 AS t_category#45, 1 AS
t_class#46, 1 AS lochierarchy#47]
(28) ReusedExchange [Reuses operator id: 21]
Output [4]: [i_category#48, i_class#49, sum#50, sum#51]
@@ -202,7 +202,7 @@ Input [4]: [sum#60, isEmpty#61, sum#62, isEmpty#63]
Keys: []
Functions [2]: [sum(ss_net_profit#54), sum(ss_ext_sales_price#55)]
Aggregate Attributes [2]: [sum(ss_net_profit#54)#64,
sum(ss_ext_sales_price#55)#65]
-Results [6]: [cast((sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65)
as decimal(38,20)) AS gross_margin#66, null AS i_category#67, null AS
i_class#68, 1 AS t_category#69, 1 AS t_class#70, 2 AS lochierarchy#71]
+Results [6]: [(sum(ss_net_profit#54)#64 / sum(ss_ext_sales_price#55)#65) AS
gross_margin#66, null AS i_category#67, null AS i_class#68, 1 AS t_category#69,
1 AS t_class#70, 2 AS lochierarchy#71]
(33) Union
diff --git a/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out
b/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out
index a74e6211b49a..a3e82c380dcc 100644
--- a/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out
+++ b/sql/core/src/test/resources/tpcds-query-results/v2_7/q36a.sql.out
@@ -1,105 +1,105 @@
-- Automatically generated by TPCDSQueryTestSuite
-- !query schema
-struct<gross_margin:decimal(38,20),i_category:string,i_class:string,lochierarchy:int,rank_within_parent:int>
+struct<gross_margin:decimal(38,11),i_category:string,i_class:string,lochierarchy:int,rank_within_parent:int>
-- !query output
--0.43310777865000000000 NULL NULL 2 1
--0.44057752675000000000 Home
NULL 1 1
--0.43759152110000000000 Music
NULL 1 2
--0.43708103961000000000 NULL NULL 1 3
--0.43616253139000000000 Shoes
NULL 1 4
--0.43567118609000000000 Children
NULL 1 5
--0.43423932352000000000 Sports
NULL 1 6
--0.43342977300000000000 Electronics
NULL 1 7
--0.43243283121000000000 Women
NULL 1 8
--0.43164166900000000000 Men
NULL 1 9
--0.42516187690000000000 Books
NULL 1 10
--0.42448713381000000000 Jewelry
NULL 1 11
--0.73902664238792748962 NULL shirts
0 1
--0.61125804873635587486 NULL country
0 2
--0.53129803597069255822 NULL dresses
0 3
--0.51266635289382758517 NULL athletic
0 4
--0.45290387783638603924 NULL mens
0 5
--0.41288056661656330013 NULL accessories
0 6
--0.40784754677005682440 NULL NULL 0 7
--0.34254844860867375832 NULL baseball
0 8
--0.32511461675631534897 NULL infants
0 9
--0.44733955704648003493 Books
computers 0 1
--0.44221358112622373783 Books
home repair 0 2
--0.44131129175272951442 Books
romance 0 3
--0.43954111564375046074 Books
history 0 4
--0.43921337505389731821 Books
mystery 0 5
--0.43904020269360481109 Books
sports 0 6
--0.42821476999837619396 Books
travel 0 7
--0.42609067296303848297 Books
cooking 0 8
--0.42538995145338568328 Books
fiction 0 9
--0.42446563616188232944 Books
arts 0 10
--0.42424821311884350413 Books
parenting 0 11
--0.41822014479424203008 Books
reference 0 12
--0.41350839325516811781 Books
business 0 13
--0.40935208137315013129 Books
science 0 14
--0.40159380735731858928 Books
self-help 0 15
--0.36957884843305744526 Books
entertainments 0 16
--0.44602461556731552282 Children
school-uniforms 0 1
--0.44141106040000560852 Children
toddlers 0 2
--0.43479886701046623711 Children
infants 0 3
--0.41900662971936329442 Children
newborn 0 4
--0.41526603781609697786 Children
NULL 0 5
--0.45347482218635333366 Electronics
personal 0 1
--0.44349670349829474271 Electronics
stereo 0 2
--0.44262427232850112058 Electronics
automotive 0 3
--0.44115886172705231970 Electronics
portable 0 4
--0.43972786651639318010 Electronics
memory 0 5
--0.43889275271590953040 Electronics
scanners 0 6
--0.43879181695132886061 Electronics
karoke 0 7
--0.43743655149948399284 Electronics
dvd/vcr players 0 8
--0.43737666390514154910 Electronics
cameras 0 9
--0.43390499017233926812 Electronics
wireless 0 10
--0.43163869754114299547 Electronics
audio 0 11
--0.42958938669780912634 Electronics
camcorders 0 12
--0.42872845803629855724 Electronics
musical 0 13
--0.42228240153396399656 Electronics
televisions 0 14
--0.41893847772039275795 Electronics
monitors 0 15
--0.39793878022746331540 Electronics
disk drives 0 16
--0.49051156860507320113 Home
NULL 0 1
--0.48431476750686752965 Home
blinds/shades 0 2
--0.47545837941951440918 Home
bathroom 0 3
--0.45726228921216284093 Home
rugs 0 4
--0.45540507568891021759 Home
furniture 0 5
--0.45303572267019508501 Home
flatware 0 6
--0.44755542058111800358 Home
tables 0 7
--0.44419847780930149402 Home
wallpaper 0 8
--0.44092345226680695671 Home
glassware 0 9
--0.43877591834074789745 Home
decor 0 10
--0.43765482553654514822 Home
accent 0 11
--0.43188199218974854630 Home
bedding 0 12
--0.43107417904272222899 Home
kids 0 13
--0.42474436355625900935 Home
lighting 0 14
--0.41783311109052416746 Home
curtains/drapes 0 15
--0.41767111806961188479 Home
mattresses 0 16
--0.40562188698541221499 Home
paint 0 17
--0.45165056505480816921 Jewelry
jewelry boxes 0 1
--0.44372227804836590137 Jewelry
estate 0 2
--0.44251815032563188894 Jewelry
gold 0 3
--0.43978127753996883542 Jewelry
consignment 0 4
--0.43821750044359339153 Jewelry
custom 0 5
--0.43439645036479672989 Jewelry
bracelets 0 6
--0.43208398325687772942 Jewelry
loose stones 0 7
--0.43060897375114375156 Jewelry
diamonds 0 8
--0.42847505748860847066 Jewelry
costume 0 9
--0.42667449062277843561 Jewelry
rings 0 10
--0.41987969011585456826 Jewelry
mens watch 0 11
--0.41624621972944533035 Jewelry
semi-precious 0 12
--0.41148949162100715771 Jewelry
womens watch 0 13
--0.39725668174847694299 Jewelry
birdal 0 14
--0.39665274051903254057 Jewelry
pendants 0 15
--0.38423525233438861010 Jewelry
earings 0 16
--0.44464388887858793403 Men
shirts 0 1
--0.43719860800637369827 Men
accessories 0 2
--0.43164606665359630905 Men
sports-apparel 0 3
--0.41530906677293519754 Men
pants 0 4
--0.38332708894803499123 Men
NULL 0 5
--0.47339698705534020269 Music
NULL 0 1
--0.44193214675249008923 Music
rock 0 2
--0.44008174913565459246 Music
country 0 3
--0.43863444992223641373 Music
pop 0 4
+-0.43310777865 NULL NULL 2 1
+-0.44057752675 Home NULL
1 1
+-0.43759152110 Music NULL
1 2
+-0.43708103961 NULL NULL 1 3
+-0.43616253139 Shoes NULL
1 4
+-0.43567118609 Children NULL
1 5
+-0.43423932352 Sports NULL
1 6
+-0.43342977300 Electronics NULL
1 7
+-0.43243283121 Women NULL
1 8
+-0.43164166900 Men NULL
1 9
+-0.42516187690 Books NULL
1 10
+-0.42448713381 Jewelry NULL
1 11
+-0.73902664239 NULL shirts
0 1
+-0.61125804874 NULL country
0 2
+-0.53129803597 NULL dresses
0 3
+-0.51266635289 NULL athletic
0 4
+-0.45290387784 NULL mens
0 5
+-0.41288056662 NULL accessories
0 6
+-0.40784754677 NULL NULL 0 7
+-0.34254844861 NULL baseball
0 8
+-0.32511461676 NULL infants
0 9
+-0.44733955705 Books
computers 0 1
+-0.44221358113 Books home
repair 0 2
+-0.44131129175 Books romance
0 3
+-0.43954111564 Books history
0 4
+-0.43921337505 Books mystery
0 5
+-0.43904020269 Books sports
0 6
+-0.42821477000 Books travel
0 7
+-0.42609067296 Books cooking
0 8
+-0.42538995145 Books fiction
0 9
+-0.42446563616 Books arts
0 10
+-0.42424821312 Books
parenting 0 11
+-0.41822014479 Books
reference 0 12
+-0.41350839326 Books
business 0 13
+-0.40935208137 Books science
0 14
+-0.40159380736 Books
self-help 0 15
+-0.36957884843 Books
entertainments 0 16
+-0.44602461557 Children
school-uniforms 0 1
+-0.44141106040 Children
toddlers 0 2
+-0.43479886701 Children infants
0 3
+-0.41900662972 Children newborn
0 4
+-0.41526603782 Children NULL
0 5
+-0.45347482219 Electronics
personal 0 1
+-0.44349670350 Electronics stereo
0 2
+-0.44262427233 Electronics
automotive 0 3
+-0.44115886173 Electronics
portable 0 4
+-0.43972786652 Electronics memory
0 5
+-0.43889275272 Electronics
scanners 0 6
+-0.43879181695 Electronics karoke
0 7
+-0.43743655150 Electronics dvd/vcr
players 0 8
+-0.43737666391 Electronics cameras
0 9
+-0.43390499017 Electronics
wireless 0 10
+-0.43163869754 Electronics audio
0 11
+-0.42958938670 Electronics
camcorders 0 12
+-0.42872845804 Electronics musical
0 13
+-0.42228240153 Electronics
televisions 0 14
+-0.41893847772 Electronics
monitors 0 15
+-0.39793878023 Electronics disk
drives 0 16
+-0.49051156861 Home NULL
0 1
+-0.48431476751 Home
blinds/shades 0 2
+-0.47545837942 Home
bathroom 0 3
+-0.45726228921 Home rugs
0 4
+-0.45540507569 Home
furniture 0 5
+-0.45303572267 Home
flatware 0 6
+-0.44755542058 Home tables
0 7
+-0.44419847781 Home
wallpaper 0 8
+-0.44092345227 Home
glassware 0 9
+-0.43877591834 Home decor
0 10
+-0.43765482554 Home accent
0 11
+-0.43188199219 Home bedding
0 12
+-0.43107417904 Home kids
0 13
+-0.42474436356 Home
lighting 0 14
+-0.41783311109 Home
curtains/drapes 0 15
+-0.41767111807 Home
mattresses 0 16
+-0.40562188699 Home paint
0 17
+-0.45165056505 Jewelry jewelry
boxes 0 1
+-0.44372227805 Jewelry estate
0 2
+-0.44251815033 Jewelry gold
0 3
+-0.43978127754 Jewelry
consignment 0 4
+-0.43821750044 Jewelry custom
0 5
+-0.43439645036 Jewelry
bracelets 0 6
+-0.43208398326 Jewelry loose
stones 0 7
+-0.43060897375 Jewelry
diamonds 0 8
+-0.42847505749 Jewelry costume
0 9
+-0.42667449062 Jewelry rings
0 10
+-0.41987969012 Jewelry mens
watch 0 11
+-0.41624621973 Jewelry
semi-precious 0 12
+-0.41148949162 Jewelry womens
watch 0 13
+-0.39725668175 Jewelry birdal
0 14
+-0.39665274052 Jewelry
pendants 0 15
+-0.38423525233 Jewelry earings
0 16
+-0.44464388888 Men shirts
0 1
+-0.43719860801 Men
accessories 0 2
+-0.43164606665 Men
sports-apparel 0 3
+-0.41530906677 Men pants
0 4
+-0.38332708895 Men NULL
0 5
+-0.47339698706 Music NULL
0 1
+-0.44193214675 Music rock
0 2
+-0.44008174914 Music country
0 3
+-0.43863444992 Music pop
0 4
diff --git
a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
index 135ce834bfe5..bb024b79598e 100644
--- a/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
+++ b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameFunctionsSuite.scala
@@ -1529,23 +1529,10 @@ class DataFrameFunctionsSuite extends QueryTest with
SharedSparkSession {
Seq(Row(false))
)
- if (!conf.ansiEnabled) {
- checkError(
- exception = intercept[AnalysisException] {
- OneRowRelation().selectExpr("array_contains(array(1),
.01234567890123456790123456780)")
- },
- errorClass = "DATATYPE_MISMATCH.ARRAY_FUNCTION_DIFF_TYPES",
- parameters = Map(
- "sqlExpr" -> "\"array_contains(array(1),
0.01234567890123456790123456780)\"",
- "functionName" -> "`array_contains`",
- "dataType" -> "\"ARRAY\"",
- "leftType" -> "\"ARRAY<INT>\"",
- "rightType" -> "\"DECIMAL(38,29)\""
- ),
- queryContext = Array(ExpectedContext("", "", 0, 55,
- "array_contains(array(1), .01234567890123456790123456780)"))
- )
- }
+ checkAnswer(
+ OneRowRelation().selectExpr("array_contains(array(1),
.01234567890123456790123456780)"),
+ Seq(Row(false))
+ )
checkError(
exception = intercept[AnalysisException] {
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]