[ 
https://issues.apache.org/jira/browse/FLINK-39062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

featzhang updated FLINK-39062:
------------------------------
    Description: 
h1. 1. Motivation

Currently, watermark definitions in Flink SQL are tightly coupled with table DDL. 
This introduces several limitations:
 # The watermark is treated as catalog metadata instead of a logical relational property.

 # It cannot be flexibly reused across logical abstractions (e.g., view-based pipelines).

 # It complicates modular query design when watermark logic needs to be shared.
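
For reference, the coupling described above looks like this today: the watermark is declared inside CREATE TABLE and stored with the table in the catalog. This is a minimal sketch; the column names and the datagen connector are illustrative assumptions and are not part of this proposal.
{code:sql}
-- Existing behavior: the watermark is part of the table DDL.
CREATE TABLE Orders (
  order_id   BIGINT,
  amount     DECIMAL(10, 2),
  order_time TIMESTAMP(3),
  WATERMARK FOR order_time AS order_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'datagen'
);
{code}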

In earlier discussion, extending VIEW DDL to support watermark definitions was 
considered. However, that approach raised concerns about:
 * Semantic ambiguity between metadata and logical properties

 * Increased complexity in view expansion and optimization

 * Potential impact on existing DDL semantics

To address these concerns while still enabling logical reuse of watermark 
definitions, this FLIP proposes introducing a function-based approach.
----
h1. 2. Proposed Solution

Introduce a built-in table function: 
 
{code:sql}
APPLY_WATERMARK(
  table,
  DESCRIPTOR(rowtime_column),
  watermark_expression
)
{code}
 
This function attaches watermark semantics at the logical relational level 
instead of the catalog level.
----
h1. 3. Syntax
{code:sql}
SELECT *
FROM APPLY_WATERMARK(
  Orders,
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);
{code}
 
The first argument supports:
 * Base tables

 * Views (including non-materialized views)

 * Subqueries

This enables logical reuse without introducing watermark semantics into catalog 
DDL.
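
As a sketch of how view and subquery inputs might look under the proposed syntax: APPLY_WATERMARK does not exist in Flink yet, and the view LargeOrders and the columns order_id and amount are hypothetical names used only for illustration.
{code:sql}
-- A view without any watermark definition (hypothetical example).
CREATE VIEW LargeOrders AS
SELECT order_id, amount, order_time
FROM Orders
WHERE amount > 100;

-- Applying the watermark to the view at query time.
SELECT *
FROM APPLY_WATERMARK(
  LargeOrders,
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);

-- Applying the watermark to a subquery.
SELECT *
FROM APPLY_WATERMARK(
  (SELECT order_id, order_time FROM Orders WHERE amount > 100),
  DESCRIPTOR(order_time),
  order_time - INTERVAL '5' SECOND
);
{code}
Whether a subquery argument needs extra parentheses or a TABLE keyword is not specified in this proposal; the form above is an assumption.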
----
h1. 4. Semantics
 * The function returns a relation identical to the input relation but with an attached watermark definition.

 * The watermark is scoped to the query.

 * It does not modify catalog metadata.

 * It does not alter view definitions.

Watermark propagation rules remain consistent with current planner behavior.
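
To illustrate the query scope, the following sketch assumes that the proposed function can be used inside a WITH clause and that its result can feed a windowing table-valued function such as TUMBLE. Neither assumption is confirmed by this proposal, and the name WatermarkedOrders is hypothetical.
{code:sql}
WITH WatermarkedOrders AS (
  SELECT *
  FROM APPLY_WATERMARK(
    Orders,
    DESCRIPTOR(order_time),
    order_time - INTERVAL '5' SECOND
  )
)
SELECT window_start, window_end, COUNT(*) AS order_count
FROM TABLE(
  TUMBLE(TABLE WatermarkedOrders, DESCRIPTOR(order_time), INTERVAL '1' MINUTE)
)
GROUP BY window_start, window_end;

-- A separate query that reads Orders directly is unaffected:
-- the query above adds no watermark to the catalog entry for Orders.
SELECT * FROM Orders;
{code}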
----
h1. 5. Why Function-Based Instead of VIEW Extension

The function-based approach:
 * Keeps watermark as a logical property rather than catalog metadata

 * Avoids modifying VIEW DDL semantics

 * Avoids ambiguity between physical and logical properties

 * Aligns with existing function-style relational extensions

 * Minimizes impact on optimizer and catalog layers

This design keeps the scope focused and reduces planner and metadata coupling.
----
h1. 6. Design Evolution

The initial proposal considered allowing watermark definition directly inside 
VIEW DDL to support reuse.

The discussion identified that:
 * Watermark semantics are fundamentally a logical property rather than catalog metadata.

 * Embedding watermark in VIEW DDL introduces complexity in view expansion.

 * It may blur the boundary between logical and catalog layers.

Based on community feedback, the proposal pivots to a function-based solution 
that preserves logical reuse while avoiding DDL-level changes.

The view-based design is no longer preferred.
----
h1. 7. Compatibility
 * No changes to existing table DDL behavior.

 * No backward compatibility impact.

 * Purely additive feature.

----
h1. 8. Scope

This FLIP only introduces:
 * The APPLY_WATERMARK function

 * Planner support for logical watermark attachment

It does not:
 * Modify catalog metadata

 * Change existing VIEW DDL syntax

 * Introduce physical materialization behavior

----
h1. 9. Implementation Plan
 # Add APPLY_WATERMARK function definition

 # Extend planner to attach logical watermark trait

 # Add validation rules for descriptor and expression

 # Add unit and integration tests

----
h1. 10. Open Questions
 * Should nested APPLY_WATERMARK calls be allowed?

 * Should conflicting watermark definitions throw validation errors?
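
For concreteness, a nested call under the proposed syntax might look like the sketch below. How it should behave (inner definition wins, outer definition wins, or a validation error) is exactly what the questions above leave open. The same question arises when the input table already declares a watermark in its DDL.
{code:sql}
SELECT *
FROM APPLY_WATERMARK(
  (SELECT *
   FROM APPLY_WATERMARK(
     Orders,
     DESCRIPTOR(order_time),
     order_time - INTERVAL '5' SECOND
   )),
  DESCRIPTOR(order_time),
  order_time - INTERVAL '10' SECOND
);
{code}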


> Support applying watermark on logical relations in SQL
> ------------------------------------------------------
>
>                 Key: FLINK-39062
>                 URL: https://issues.apache.org/jira/browse/FLINK-39062
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table SQL / API, Table SQL / Planner
>            Reporter: featzhang
>            Priority: Major



