pengding-stripe commented on issue #14083: URL: https://github.com/apache/pinot/issues/14083#issuecomment-2449019177
I'm also working on batch upload for a realtime table using a Spark job spec. After reading through the source code, it looks like this is not supported for `uploadedRealtime`. All of the fields are read from `segmentGeneratorConfig` instead of the spec map `segmentNameGeneratorConfigs`: https://github.com/apache/pinot/blob/master/pinot-plugins/pinot-batch-ingestion/pinot-batch-ingestion-common/src/main/java/org/apache/pinot/plugin/ingestion/batch/common/SegmentGenerationTaskRunner.java#L169-L182

I think we need to update this code to read these fields from the job spec. We can reuse `segment.name.prefix`, `use.global.directory.sequence.id` and `segment.partitionId`, which are defined in [BatchConfigProperties](https://github.com/apache/pinot/blob/master/pinot-spi/src/main/java/org/apache/pinot/spi/ingestion/batch/BatchConfigProperties.java#L24).

@Jackie-Jiang do you know why these properties are defined in both `SegmentGenerationTaskRunner` and `BatchConfigProperties`? Could you also help confirm whether my assumptions above are correct?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org