[
https://issues.apache.org/jira/browse/FLINK-39062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
featzhang updated FLINK-39062:
------------------------------
Description:
h1. 1. Motivation
Currently, watermark definitions in Flink SQL are tightly coupled with table DDL.
This introduces several limitations:
# The watermark is treated as catalog metadata instead of a logical relational
property.
# It cannot be flexibly reused across logical abstractions (e.g., view-based
pipelines).
# It complicates modular query design when watermark logic needs to be shared.
An earlier discussion considered extending VIEW DDL to support watermark
definitions. However, this approach raised concerns around:
* Semantic ambiguity between metadata and logical properties
* Increased complexity in view expansion and optimization
* Potential impact on existing DDL semantics
To address these concerns while still enabling logical reuse of watermark
definitions, this FLIP proposes introducing a function-based approach.
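For reference, the DDL-coupled form described above looks roughly like this in current Flink SQL. The table name, columns, and connector are illustrative assumptions; only the WATERMARK clause matters here.
{code:sql}
-- Illustrative table; names, types, and the connector are assumptions.
CREATE TABLE Orders (
  order_id BIGINT,
  amount DOUBLE,
  status STRING,
  order_time TIMESTAMP(3),
  -- The watermark is declared as part of the table definition in the catalog.
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen'
);
{code}
CREATE VIEW has no equivalent clause, so a view over Orders cannot declare a watermark of its own.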
----
h1. 2. Proposed Solution
Introduce a built-in table function:
{code:sql}
APPLY_WATERMARK(
  table,
  DESCRIPTOR(rowtime_column),
  watermark_expression
)
{code}
This function attaches watermark semantics at the logical relational level
instead of the catalog level.
----
h1. 3. Syntax
{code:sql}
SELECT *
FROM APPLY_WATERMARK(
  Orders,
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);
{code}
The first argument supports:
* Base tables
* Views (including non-materialized views)
* Subqueries
This enables logical reuse without introducing watermark semantics into catalog
DDL.
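As a sketch of the other two input kinds, the queries below pass a view and a subquery as the first argument. The syntax is the proposed one and is not implemented yet; the view name PaidOrders and the columns order_id, amount, and status are hypothetical and assume an Orders table that has them.
{code:sql}
-- Hypothetical view; the name and columns are illustrative only.
CREATE VIEW PaidOrders AS
SELECT order_id, amount, order_time
FROM Orders
WHERE status = 'PAID';

-- A view as the first argument (proposed syntax).
SELECT *
FROM APPLY_WATERMARK(
  PaidOrders,
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);

-- A subquery as the first argument (proposed syntax).
SELECT *
FROM APPLY_WATERMARK(
  (SELECT order_id, amount, order_time FROM Orders WHERE amount > 100),
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);
{code}
In both cases the view definition and the catalog entry for Orders stay unchanged; the watermark exists only in the query that calls the function.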
----
h1. 4. Semantics
* The function returns a relation identical to the input relation but with an
attached watermark definition.
* The watermark is scoped to the query.
* It does not modify catalog metadata.
* It does not alter view definitions.
Watermark propagation rules remain consistent with current planner behavior.
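One way to read these semantics is that the result can feed an event-time operator within the same query. The sketch below assumes that the attached watermark makes order_time usable as a rowtime attribute downstream and that a CTE can be passed to a window table-valued function; neither is confirmed by this proposal. WatermarkedOrders and the amount column are hypothetical.
{code:sql}
-- Proposed syntax; assumes the attached watermark makes order_time usable
-- as an event-time attribute by downstream operators.
WITH WatermarkedOrders AS (
  SELECT *
  FROM APPLY_WATERMARK(
    Orders,
    DESCRIPTOR(order_time),
    order_time - INTERVAL '5' SECOND
  )
)
SELECT window_start, window_end, SUM(amount) AS total_amount
FROM TABLE(
  TUMBLE(TABLE WatermarkedOrders, DESCRIPTOR(order_time), INTERVAL '10' MINUTES)
)
GROUP BY window_start, window_end;
{code}
The watermark is visible only inside this statement; another query reading Orders would see whatever the table DDL declares, if anything.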
----
h1. 5. Why Function-Based Instead of VIEW Extension
The function-based approach:
* Keeps watermark as a logical property rather than catalog metadata
* Avoids modifying VIEW DDL semantics
* Avoids ambiguity between physical and logical properties
* Aligns with existing function-style relational extensions
* Minimizes impact on optimizer and catalog layers
This design keeps the scope focused and reduces planner and metadata coupling.
----
h1. 6. Design Evolution
The initial proposal considered allowing watermark definition directly inside
VIEW DDL to support reuse.
The discussion identified that:
* Watermark semantics are fundamentally a logical property rather than catalog
metadata.
* Embedding watermark in VIEW DDL introduces complexity in view expansion.
* Embedding it there may blur the boundary between logical and catalog layers.
Based on community feedback, the proposal pivots to a function-based solution
that preserves logical reuse while avoiding DDL-level changes.
The view-based design is no longer preferred.
----
h1. 7. Compatibility
* No changes to existing table DDL behavior.
* No backward compatibility impact.
* Purely additive feature.
----
h1. 8. Scope
This FLIP only introduces:
* The APPLY_WATERMARK function
* Planner support for logical watermark attachment
It does not:
* Modify catalog metadata
* Change existing VIEW DDL syntax
* Introduce physical materialization behavior
----
h1. 9. Implementation Plan
# Add APPLY_WATERMARK function definition
# Extend planner to attach logical watermark trait
# Add validation rules for descriptor and expression
# Add unit and integration tests
----
h1. 10. Open Questions
* Should nested APPLY_WATERMARK calls be allowed?
* Should conflicting watermark definitions throw validation errors?
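To make both questions concrete, the query below nests one call inside another with two different expressions on the same column. It is written in the proposed syntax and only illustrates the case; whether it should be accepted, rejected, or resolved in favor of the outer call is exactly what remains open.
{code:sql}
-- Nested calls with conflicting watermark expressions (proposed syntax).
SELECT *
FROM APPLY_WATERMARK(
  (SELECT *
   FROM APPLY_WATERMARK(
     Orders,
     DESCRIPTOR(order_time),
     order_time - INTERVAL '5' SECOND
   )),
  DESCRIPTOR(order_time),
  order_time - INTERVAL '10' SECOND
);
{code}
A similar conflict arises when the input is a base table whose DDL already declares a watermark.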
> Support applying watermark on logical relations in SQL
> ------------------------------------------------------
>
> Key: FLINK-39062
> URL: https://issues.apache.org/jira/browse/FLINK-39062
> Project: Flink
> Issue Type: New Feature
> Components: Table SQL / API, Table SQL / Planner
> Reporter: featzhang
> Priority: Major