voonhous commented on code in PR #18488:
URL: https://github.com/apache/hudi/pull/18488#discussion_r3064148936
##########
hudi-spark-datasource/hudi-spark3.3.x/src/main/scala/org/apache/spark/sql/parser/HoodieSpark3_3ExtendedSqlParser.scala:
##########
@@ -129,7 +129,8 @@ class HoodieSpark3_3ExtendedSqlParser(session: SparkSession, delegate: ParserInt
normalized.contains("drop index") ||
normalized.contains("show indexes") ||
normalized.contains("refresh index") ||
- normalized.contains(" blob")
+ normalized.contains(" blob") ||
+ normalized.contains(" vector")
Review Comment:
Improve VECTOR DDL test coverage by relaxing the routing check from
`" vector("` to `" vector"` and adding targeted tests.
- Relax the `isHoodieCommand` VECTOR check from `" vector("` to `" vector"` in
all 4 extended parser files.
- The stricter `" vector("` variant only routes SQL in which VECTOR is
immediately followed by an opening parenthesis (e.g. `VECTOR(128)`). A bare
VECTOR without parentheses is delegated to Spark's native parser and never
reaches the Hudi code path.
- Relaxing to `" vector"` routes all VECTOR-related SQL through the Hudi
parser, which lets us exercise the "vector with empty params" branch of the
`case ("vector", _ :: _)` pattern. That pattern was previously reported as
partially covered because the empty-list side of the `_ :: _` check was never
hit.
- This is also consistent with the existing BLOB routing pattern `" blob"`.
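
To make the routing difference concrete, here is a minimal sketch of the two substring checks. The object name, the helper names, and the `normalize` step (lower-casing and collapsing whitespace) are assumptions for illustration only; the real `isHoodieCommand` combines many more checks.

```scala
import java.util.Locale

// Hypothetical stand-in for the VECTOR part of isHoodieCommand.
object VectorRoutingSketch {
  // Assumed normalization: lower-case, trim, collapse whitespace runs.
  private def normalize(sql: String): String =
    sql.toLowerCase(Locale.ROOT).trim.replaceAll("\\s+", " ")

  // Old check: matches only when "vector" is directly followed by "(".
  def routesStrict(sql: String): Boolean =
    normalize(sql).contains(" vector(")

  // New check: matches any token that starts with "vector" after a space.
  def routesRelaxed(sql: String): Boolean =
    normalize(sql).contains(" vector")
}

object VectorRoutingDemo extends App {
  import VectorRoutingSketch._

  val withDim = "CREATE TABLE t (id INT, emb VECTOR(128)) USING hudi"
  val bare    = "CREATE TABLE t (id INT, emb VECTOR) USING hudi"

  println(s"strict/withDim  = ${routesStrict(withDim)}")
  println(s"strict/bare     = ${routesStrict(bare)}")
  println(s"relaxed/withDim = ${routesRelaxed(withDim)}")
  println(s"relaxed/bare    = ${routesRelaxed(bare)}")
}
```

Because this is a plain substring match, the relaxed check in this sketch would also route SQL that merely mentions an identifier starting with `vector` after a space (e.g. `SELECT vector_id FROM t`), in the same way the `" blob"` check does for identifiers starting with `blob`.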

Add two targeted tests:
1. `test create table with INT8 VECTOR column`: an isolated INT8 test that
independently exercises the `case INT8 => ByteType` branch.
2. `test create table with VECTOR without dimension fails`: routes a bare
VECTOR through the Hudi parser to cover the empty-list branch.
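
The empty-list branch can be illustrated with a small, self-contained pattern match. This is a sketch under assumptions, not the Hudi implementation: `resolveVectorDim`, its `Either` return type, and the error messages are hypothetical, and the actual code may let the empty-parameter case fall through to a later `case` rather than handling it explicitly.

```scala
import java.util.Locale
import scala.util.Try

// Hypothetical helper mirroring the shape of `case ("vector", _ :: _)`.
object VectorTypeSketch {
  def resolveVectorDim(typeName: String, params: List[String]): Either[String, Int] =
    (typeName.toLowerCase(Locale.ROOT), params) match {
      // Non-empty parameter list: the `_ :: _` side of the pattern.
      case ("vector", dim :: _) =>
        Try(dim.trim.toInt).toOption
          .filter(_ > 0)
          .toRight(s"invalid VECTOR dimension: $dim")
      // Empty parameter list: only reachable when bare VECTOR is routed
      // to this parser instead of being delegated elsewhere.
      case ("vector", Nil) =>
        Left("VECTOR requires a dimension")
      case (other, _) =>
        Left(s"not a VECTOR type: $other")
    }
}
```

With the stricter `" vector("` routing, an input equivalent to `resolveVectorDim("vector", Nil)` would never be produced, which is why the second branch showed up as uncovered.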