The GitHub Actions job "Required Checks" on 
texera.git/fix/5279-csv-empty-header-rename has succeeded.
Run started by GitHub user mengw15 (triggered by mengw15).

Head commit for run:
7f534a91640e0e15502b014dc95d11053a029dde / mengw15 
<[email protected]>
fix(workflow-operator): auto-rename empty CSV column headers

A CSV with an empty column header (e.g. a trailing comma `id,name,,age`)
produced an `Attribute` with an empty name. After #3295 every output
port writes via `IcebergTableWriter -> Parquet`, where Avro rejects
empty names with `IllegalArgumentException: Empty name` at flush time,
losing the operator port's entire result.

Rename blank header positions to `column-<index>` (matching the
convention used by pandas, Spark, R, and DuckDB) in all three CSV scan
operators: `CSVScanSourceOpDesc`, `ParallelCSVScanSourceOpDesc`, and
`CSVOldScanSourceOpDesc`. `ParallelCSVScanSourceOpDesc` is currently
commented out of `LogicalOp`'s registry (unreachable from the UI) but is
fixed for consistency and in case it is re-enabled for experiments.

Closes #5279.

Report URL: https://github.com/apache/texera/actions/runs/26599759972

With regards,
GitHub Actions via GitBox

Reply via email to