This is an automated email from the ASF dual-hosted git repository.
wenchen pushed a commit to branch branch-3.0
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.0 by this push:
new b00895c [SPARK-30937][DOC] Group Hive upgrade guides together
b00895c is described below
commit b00895ceded4da49793314833e5442249d05f461
Author: yi.wu <[email protected]>
AuthorDate: Thu Feb 27 21:29:42 2020 +0800
[SPARK-30937][DOC] Group Hive upgrade guides together
### What changes were proposed in this pull request?
This PR groups all Hive-upgrade-related migration notes for Spark 3.0
together.
It also adds another behavior change of `ScriptTransform` to the new Hive
section.
### Why are the changes needed?
Make the doc clearer to users.
### Does this PR introduce any user-facing change?
No, new doc for Spark 3.0.
### How was this patch tested?
N/A.
Closes #27670 from Ngone51/hive_migration.
Authored-by: yi.wu <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 22dfd15a4574a5cccdc54c96f11de28d58363016)
Signed-off-by: Wenchen Fan <[email protected]>
---
docs/sql-migration-guide.md | 10 +++++++---
.../spark/sql/hive/execution/ScriptTransformationSuite.scala | 5 ++---
2 files changed, 9 insertions(+), 6 deletions(-)
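
As a rough illustration of the metastore settings named in the migration guide change in this commit, the Scala sketch below sets them when building a session. It is not part of the commit. The object name and application name are placeholders, the values `1.2.1` and `maven` are simply the example values from the guide text, and the sketch assumes a Spark 3.0 build with Hive support on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical example object; not part of the Spark code base.
object MetastoreConfigSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("metastore-config-sketch") // placeholder name
      // Version of the Hive metastore to connect to (example value from the guide).
      .config("spark.sql.hive.metastore.version", "1.2.1")
      // Where to obtain the matching Hive client jars (example value from the guide).
      .config("spark.sql.hive.metastore.jars", "maven")
      .enableHiveSupport()
      .getOrCreate()

    // Any metastore-backed query exercises the configured client.
    spark.sql("SHOW DATABASES").show()
    spark.stop()
  }
}
```

Both settings are read when the Hive client is first created, so in this sketch they are set on the builder rather than changed on a running session.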
diff --git a/docs/sql-migration-guide.md b/docs/sql-migration-guide.md
index 7e0a536..d241a66 100644
--- a/docs/sql-migration-guide.md
+++ b/docs/sql-migration-guide.md
@@ -254,7 +254,7 @@ license: |
</tr>
</table>
- - Since Spark 3.0, CREATE TABLE without a specific provider will use the value of `spark.sql.sources.default` as its provider. In Spark version 2.4 and earlier, it was hive. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createHiveTableByDefault.enabled` to `true`.
+ - Since Spark 3.0, `CREATE TABLE` without a specific provider will use the value of `spark.sql.sources.default` as its provider. In Spark version 2.4 and earlier, it was hive. To restore the behavior before Spark 3.0, you can set `spark.sql.legacy.createHiveTableByDefault.enabled` to `true`.
- Since Spark 3.0, the unary arithmetic operator plus(`+`) only accepts string, numeric and interval type values as inputs. Besides, `+` with a integral string representation will be coerced to double value, e.g. `+'1'` results `1.0`. In Spark version 2.4 and earlier, this operator is ignored. There is no type checking for it, thus, all type values with a `+` prefix are valid, e.g. `+ array(1, 2)` is valid and results `[1, 2]`. Besides, there is no type coercion for it at all, e.g. in [...]
@@ -332,10 +332,14 @@ license: |
- Since Spark 3.0, `SHOW CREATE TABLE` will always return Spark DDL, even when the given table is a Hive serde table. For generating Hive DDL, please use `SHOW CREATE TABLE AS SERDE` command instead.
- - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3. This may need to set `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars` according to the version of the Hive metastore.
+ - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3 and it brings following impacts:
+
+ - You may need to set `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars` according to the version of the Hive metastore you want to connect to. For example: set `spark.sql.hive.metastore.version` to `1.2.1` and `spark.sql.hive.metastore.jars` to `maven` if your Hive metastore version is 1.2.1.
- - Since Spark 3.0, we upgraded the built-in Hive from 1.2 to 2.3. You need to migrate your custom SerDes to Hive 2.3 or build your own Spark with `hive-1.2` profile. See HIVE-15167 for more details.
+ - You need to migrate your custom SerDes to Hive 2.3 or build your own Spark with `hive-1.2` profile. See HIVE-15167 for more details.
+
+ - The decimal string representation can be different between Hive 1.2 and Hive 2.3 when using `TRANSFORM` operator in SQL for script transformation, which depends on hive's behavior. In Hive 1.2, the string representation omits trailing zeroes. But in Hive 2.3, it is always padded to 18 digits with trailing zeroes if necessary.
## Upgrading from Spark SQL 2.4.4 to 2.4.5
diff --git a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
index 7d01fc5..7153d3f 100644
--- a/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
+++ b/sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/ScriptTransformationSuite.scala
@@ -212,9 +212,8 @@ class ScriptTransformationSuite extends SparkPlanTest with SQLTestUtils with Tes
|FROM v
""".stripMargin)
- // In Hive1.2, it does not do well on Decimal conversion. For example, in this case,
- // it converts a decimal value's type from Decimal(38, 18) to Decimal(1, 0). So we need
- // do extra cast here for Hive1.2. But in Hive2.3, it still keeps the original Decimal type.
+ // In Hive 1.2, the string representation of a decimal omits trailing zeroes.
+ // But in Hive 2.3, it is always padded to 18 digits with trailing zeroes if necessary.
val decimalToString: Column => Column = if (HiveUtils.isHive23) {
c => c.cast("string")
} else {
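
The difference between the two decimal string forms described in this commit can be pictured with a small, Spark-free Scala sketch. It is not Hive's or Spark's implementation. The object and method names are hypothetical, and it assumes that "padded to 18 digits" refers to a scale of 18, as in a `Decimal(38, 18)` column.

```scala
import java.math.{BigDecimal => JBigDecimal, RoundingMode}

// Hypothetical helper object, for illustration only.
object DecimalStringSketch {
  // Hive 1.2-like form (as described in the commit): trailing zeroes are omitted.
  def withoutTrailingZeroes(d: JBigDecimal): String =
    if (d.signum == 0) "0" else d.stripTrailingZeros.toPlainString

  // Hive 2.3-like form (as described in the commit): padded with trailing
  // zeroes up to the given scale. The default of 18 is an assumption.
  def paddedToScale(d: JBigDecimal, scale: Int = 18): String =
    d.setScale(scale, RoundingMode.HALF_UP).toPlainString
}
```

With this sketch, `withoutTrailingZeroes(new JBigDecimal("1.500"))` yields `1.5`, while `paddedToScale(new JBigDecimal("1.5"))` yields `1.5` followed by seventeen zeroes, which is the kind of mismatch the test's `decimalToString` helper has to account for.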