Re: [DISCUSS] FIP-27: Remove Mandatory System Columns From Fluss Lake Tables

Mehul Batra Tue, 03 Mar 2026 11:11:48 -0800

Hi Yuxia,

First of all, thank you for leading this. It's an important change: these
columns add a non-trivial storage cost in Parquet/ORC files even though most
consumers never read them, and they pollute schema introspection too.
I've been going through FIP-27 in detail and have a few questions I'd like
to clarify before implementation begins. Grouping them by area:

*1. Schema & Legacy Detection*

1a. When datalake is re-enabled on an existing table and the lake table
already exists with system columns, where does the schema inspection
happen? Do we add a new method to the LakeCatalog interface (e.g.,
getTableSchema(TablePath)), or is this handled at the Fluss server metadata
level outside the plugin boundary?

1b. Is the legacy/clean mode decision persisted in Fluss table metadata
(e.g., as a property like fluss.lake.schema.mode = legacy | clean), or is
it re-derived by inspecting the lake table schema each time? If re-derived,
what happens if someone manually alters the lake table schema externally?
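
To make the two options concrete, here is a rough sketch of how I picture
them combining: prefer a persisted mode, and only fall back to inspecting
the lake schema when nothing is persisted. Everything in it is an assumption
for discussion, not existing Fluss code: the property key is just the
example above, and SchemaModeSketch, SchemaMode, and resolve are
hypothetical names.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class SchemaModeSketch {
    enum SchemaMode { LEGACY, CLEAN }

    // Hypothetical property key, copied from the example in the question.
    static final String MODE_KEY = "fluss.lake.schema.mode";

    // The three system columns that FIP-27 proposes to stop adding.
    static final Set<String> SYSTEM_COLUMNS =
            Set.of("__bucket", "__offset", "__timestamp");

    // Prefer the persisted decision; otherwise derive it from the lake columns.
    static SchemaMode resolve(Map<String, String> tableProperties,
                              List<String> lakeColumns) {
        String persisted = tableProperties.get(MODE_KEY);
        if (persisted != null) {
            return SchemaMode.valueOf(persisted.toUpperCase(Locale.ROOT));
        }
        return lakeColumns.containsAll(SYSTEM_COLUMNS)
                ? SchemaMode.LEGACY
                : SchemaMode.CLEAN;
    }
}
```

With a persisted mode, an external ALTER on the lake table would not
silently flip the decision; with pure re-derivation it could, which is the
case I am worried about.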

*2. PARTITION_TIMESTAMP Mode*

2a. The FIP shows a day-granularity example for timestamp-to-partition
mapping. Can we document the exact mapping for all supported time-unit
values (hour, day, month, quarter, year)? I assume it follows the same
DateTimeFormatter patterns in PartitionUtils, but it would be good to make
this explicit.
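
As an illustration of the kind of table I would like to see in the FIP, the
sketch below maps a timestamp to a partition name per time unit. The
patterns are my assumption of what PartitionUtils uses and should be checked
against the actual code; PartitionNameSketch and partitionFor are
hypothetical names, and the time zone handling is also an open point.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

public class PartitionNameSketch {
    // Assumed patterns per time unit; to be verified against PartitionUtils.
    static final Map<String, DateTimeFormatter> PATTERNS = new LinkedHashMap<>();

    static {
        PATTERNS.put("hour", DateTimeFormatter.ofPattern("yyyyMMddHH"));
        PATTERNS.put("day", DateTimeFormatter.ofPattern("yyyyMMdd"));
        PATTERNS.put("month", DateTimeFormatter.ofPattern("yyyyMM"));
        PATTERNS.put("quarter", DateTimeFormatter.ofPattern("yyyyQ"));
        PATTERNS.put("year", DateTimeFormatter.ofPattern("yyyy"));
    }

    // Formats the timestamp in the given zone using the pattern for the unit.
    static String partitionFor(long epochMillis, String timeUnit, ZoneId zone) {
        DateTimeFormatter formatter = PATTERNS.get(timeUnit);
        if (formatter == null) {
            throw new IllegalArgumentException("Unsupported time unit: " + timeUnit);
        }
        return formatter.format(Instant.ofEpochMilli(epochMillis).atZone(zone));
    }
}
```

Under these assumed patterns, 2026-03-03 11:11:48 UTC would map to
2026030311 (hour), 20260303 (day), 202603 (month), 20261 (quarter), and
2026 (year).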

2b. Should the Flink connector fail fast at job submission time (via
ValidationException) if PARTITION_TIMESTAMP is used on a
non-auto-partitioned table? Or do we allow it for manually partitioned
tables as well?
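
For the fail-fast variant, I am thinking of a check along these lines at
planning time. This is only a dependency-free sketch: the nested
ValidationException is a stand-in for Flink's exception of the same name,
and the class, method, and parameter names are hypothetical.

```java
public class PartitionTimestampValidationSketch {
    // Stand-in for Flink's ValidationException so the sketch compiles alone.
    static class ValidationException extends RuntimeException {
        ValidationException(String message) {
            super(message);
        }
    }

    // Rejects PARTITION_TIMESTAMP on tables that are not auto-partitioned.
    static void validate(String startupMode, boolean autoPartitioned) {
        if ("PARTITION_TIMESTAMP".equals(startupMode) && !autoPartitioned) {
            throw new ValidationException(
                    "PARTITION_TIMESTAMP requires an auto-partitioned table.");
        }
    }
}
```

If manually partitioned tables are allowed instead, the same hook could
check that the partition key is a timestamp-like value rather than reject
the job outright.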

2c. For PK tables with CDC, how are duplicates at the partition boundary
resolved during the union read? Is it the same snapshot-then-changelog
pattern that FULL mode uses today? The FIP mentions "downstream
idempotency", but CDC duplicate handling is non-trivial, so it would help to
be more specific here.

*3. Union Read Boundary*

3a. How is the exact transition point from lake historical reads to Fluss
log reads determined per partition? Is it the per-partition tiering
watermark stored in Fluss server metadata?

*4. Backward Compatibility*

4a. If a user drops and recreates a table with the same name post-upgrade,
the new lake table will not have system columns. Should we warn users about
this schema change, especially if they have downstream jobs that depend on
__offset or __bucket?

*5. Scope*

5a. The changes apply to both Paimon and Iceberg lake catalogs, correct?
Both PaimonLakeCatalog and IcebergLakeCatalog currently append system
columns independently.

Thanks for the FIP. I'm happy to help with the implementation once these are
clarified.

Best Regards,
Mehul Batra

On Mon, Mar 2, 2026 at 7:26 PM Lorenzo Affetti via dev <[email protected]>
wrote:

> Hello! I went through the FIP another time as I did not remember doing it
> already :)
>
> I have additional questions beyond the first 2.
>
> Let me paste those here and add:
>
> 1. Isn't the scope of the FIP misleading?
> This FIP seems to be about removing system columns, but it primarily
> proposes a new read mode named PARTITION_TIMESTAMP.
> Is this because removing those columns prevents users from accessing data
> on the lake?
> If so:
>  - How are users supposed to do that now?
>  - What would change?
>
> 2. How does this relate to union reads?
> I am quite new to the community and Fluss. Could you explain how the new
> PARTITION_TIMESTAMP mode relates to union reads?
> If the answer is not obvious, perhaps this warrants a section in the FIP.
>
> 3. Why "Only auto partitioned table is supported in this mode"?
> Why only for partitions generated by Fluss, and not for any partition that
> represents a timestamp?
>
> On Wed, Feb 4, 2026 at 4:50 PM Lorenzo Affetti <
> [email protected]> wrote:
>
> > Hello Yuxia!
> > Thanks for the great FIP!
> > I have some questions:
> >
> > 1. Isn't the scope of the FIP misleading?
> > It seems this FIP is about removing system columns, but it primarily
> > proposes a new read mode named PARTITION_TIMESTAMP.
> >
> > 2. How does this relate to union reads?
> > I am quite new to the community and Fluss. Could you explain how the new
> > PARTITION_TIMESTAMP mode relates to union reads?
> > If the answer is not obvious, perhaps this warrants a section in the FIP.
> >
> > Thank you!
> >
> > On Tue, Jan 20, 2026 at 8:20 AM yuxia <[email protected]> wrote:
> >
> >> Hi, all.
> >>
> >> Currently, every Fluss lake table is automatically provisioned with three
> >> mandatory system columns: __bucket, __offset, and __timestamp (intended
> >> for bucket- and offset-based subscription as well as additional
> >> information checks).
> >> While originally designed to allow clients to pinpoint specific data
> >> offsets of specific buckets, the practical evolution of the ecosystem has
> >> rendered this default behavior suboptimal for downstream consumers, since
> >> downstream warehouses and BI tools do not expect these internal metadata
> >> fields.
> >>
> >>
> >> So, I'd like to propose FIP-27: Remove Mandatory System Columns From
> >> Fluss Lake Tables [1] to remove the three mandatory system columns while
> >> still keeping compatibility.
> >>
> >> Welcome your feedback and suggestions on this proposal. Looking forward
> >> to a productive discussion!
> >>
> >> [1]:
> >> https://cwiki.apache.org/confluence/display/FLUSS/FIP-27%3A+Remove+Mandatory+System+Columns+From+Fluss+Lake+Tables
> >>
> >> Best regards,
> >> Yuxia
> >>
> >
> >
> > --
> > Lorenzo Affetti
> > Senior Software Engineer @ Flink Team
> > Ververica <http://www.ververica.com>
> >
>
>
> --
> Lorenzo Affetti
> Senior Software Engineer @ Flink Team
> Ververica <http://www.ververica.com>
>
