pvary commented on code in PR #12691: URL: https://github.com/apache/iceberg/pull/12691#discussion_r2026603208
##########
docs/docs/flink-configuration.md:
##########

@@ -198,4 +198,42 @@
 they are. This is only applicable to {@link StatisticsType#Map} for low-cardinality scenario. For
 {@link StatisticsType#Sketch} high-cardinality sort columns, they are usually not used as partition
 columns. Otherwise, too many partitions and small files may be generated during
-write. Sketch range partitioner simply splits high-cardinality keys into ordered ranges.
\ No newline at end of file
+write. Sketch range partitioner simply splits high-cardinality keys into ordered ranges.
+
+### Exec options
+
+When constructing a Flink Iceberg source via the Java API, options can be set on a `Configuration` like this:
+
+```java
+configuration.setBoolean(FlinkConfigOptions.TABLE_EXEC_ICEBERG_INFER_SOURCE_PARALLELISM, true);
+FlinkSource.forRowData()
+    .flinkConf(configuration)
+    ...
+```
+
+When using the Table API, options can be set on Flink's `TableEnvironment`:
+
+```java
+TableEnvironment tEnv = createTableEnv();
+tEnv.getConfig()
+    .getConfiguration()
+    .setBoolean(FlinkConfigOptions.TABLE_EXEC_ICEBERG_INFER_SOURCE_PARALLELISM, true);
+```
+
+For Flink SQL, options can be set with the `SET` command:
+
+```sql
+SET table.exec.iceberg.infer-source-parallelism.max=10;
+
+SELECT * FROM tableName;
+```
+
+| Flink configuration                             | Default                             | Description                                                                                                         |
+|-------------------------------------------------|-------------------------------------|---------------------------------------------------------------------------------------------------------------------|
+| table.exec.iceberg.infer-source-parallelism     | true                                | If false, the source parallelism is set by the configured value. If true, it is inferred from the number of splits. |
+| table.exec.iceberg.infer-source-parallelism.max | 100                                 | The maximum parallelism that can be inferred for the source operator.                                               |
+| table.exec.iceberg.expose-split-locality-info   | null                                | If true, expose split host information so that Flink's locality-aware split assigner can be used.                   |
+| table.exec.iceberg.fetch-batch-record-count     | 2048                                | The target number of records per fetch batch of the Iceberg reader.                                                 |
+| table.exec.iceberg.worker-pool-size             | ThreadPools.WORKER_THREAD_POOL_SIZE | The size of the worker pool used to plan or scan manifests; the larger of 2 and Runtime.getRuntime().availableProcessors(). |
+| table.exec.iceberg.use-flip27-source            | true                                | If true, use the FLIP-27 based Iceberg source implementation.                                                       |

Review Comment:
   Source specific - maybe it should be at the source?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
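The interaction between `table.exec.iceberg.infer-source-parallelism` and its `.max` companion described in the diff can be sketched as a small self-contained rule: if inference is enabled, the source parallelism follows the number of splits but is capped at the configured maximum; otherwise the explicitly configured parallelism is used. This is a hypothetical illustration of the documented semantics, not Iceberg's actual implementation (the helper name and the floor of 1 are assumptions):

```java
// Hypothetical sketch of the parallelism-inference rule documented above.
// NOT Iceberg's actual code; method name and the clamp-to-1 floor are assumptions.
public class InferParallelismSketch {

    // Mirrors table.exec.iceberg.infer-source-parallelism (inferEnabled),
    // table.exec.iceberg.infer-source-parallelism.max (maxInferred),
    // and an explicitly configured source parallelism (configured).
    static int sourceParallelism(boolean inferEnabled, int splitCount, int maxInferred, int configured) {
        if (!inferEnabled) {
            return configured; // parallelism comes from explicit config
        }
        // Infer from the number of splits, capped by the configured maximum.
        return Math.max(1, Math.min(splitCount, maxInferred));
    }

    public static void main(String[] args) {
        System.out.println(sourceParallelism(true, 12, 100, 4));   // inferred from 12 splits → 12
        System.out.println(sourceParallelism(true, 500, 100, 4));  // capped at max=100 → 100
        System.out.println(sourceParallelism(false, 500, 100, 4)); // explicit config wins → 4
    }
}
```

Under these assumed semantics, raising `.max` only matters when the split count exceeds it, which is why the SQL example in the diff sets `table.exec.iceberg.infer-source-parallelism.max=10` to bound fan-out on large scans.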