Re: [PR] Spark upsert table backfill support [pinot]

via GitHub Wed, 13 Nov 2024 23:20:42 -0800


rohityadav1993 commented on code in PR #14443:
URL: https://github.com/apache/pinot/pull/14443#discussion_r1841682046



##########
pinot-plugins/pinot-batch-ingestion/pinot-batch-ingestion-common/src/main/java/org/apache/pinot/plugin/ingestion/batch/common/SegmentGenerationTaskRunner.java:
##########
@@ -167,19 +167,25 @@ private SegmentNameGenerator getSegmentNameGenerator(SegmentGeneratorConfig segm
         return new InputFileSegmentNameGenerator(segmentNameGeneratorConfigs.get(FILE_PATH_PATTERN),
             segmentNameGeneratorConfigs.get(SEGMENT_NAME_TEMPLATE), inputFileUri, appendUUIDToSegmentName);
       case BatchConfigProperties.SegmentNameGeneratorType.UPLOADED_REALTIME:
-        Preconditions.checkState(segmentGeneratorConfig.getCreationTime() != null,
-            "Creation time must be set for uploaded realtime segment name generator");
-        Preconditions.checkState(segmentGeneratorConfig.getUploadedSegmentPartitionId() != -1,
+        Preconditions.checkState(segmentNameGeneratorConfigs.get(BatchConfigProperties.SEGMENT_UPLOAD_TIME_MS) != null,
+            "Upload time must be set for uploaded realtime segment name generator");
+        Preconditions.checkState(segmentNameGeneratorConfigs.get(BatchConfigProperties.SEGMENT_PARTITION_ID) != null,

Review Comment:
   How will the task provide the partitionId? A partitionId implies that the data is partitioned, and partitioned the same way as the stream that is ingesting into the realtime table.
   
   There should be corresponding code in Spark to partition the data and derive the id accordingly.
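
To illustrate the concern, here is a minimal sketch of deriving a partition id per record and grouping records by it before segment generation. The class `PartitionIdSketch`, the helpers `partitionIdFor` and `groupByPartition`, and the hash used are all hypothetical and not part of this PR or of Pinot; a real job would have to use the same partition function and partition count as the stream producer, otherwise the derived ids would not line up with the realtime partitions.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionIdSketch {

  // Hypothetical partition function (assumption): a stand-in for whatever
  // function the stream producer actually applies to the primary key.
  static int partitionIdFor(String primaryKey, int numPartitions) {
    int hash = Arrays.hashCode(primaryKey.getBytes(StandardCharsets.UTF_8));
    // floorMod keeps the result in [0, numPartitions) even for negative hashes.
    return Math.floorMod(hash, numPartitions);
  }

  // Groups primary keys by derived partition id, so that each group could be
  // written as a segment whose name carries that single partition id.
  static Map<Integer, List<String>> groupByPartition(List<String> primaryKeys, int numPartitions) {
    Map<Integer, List<String>> groups = new TreeMap<>();
    for (String key : primaryKeys) {
      int partitionId = partitionIdFor(key, numPartitions);
      groups.computeIfAbsent(partitionId, id -> new ArrayList<>()).add(key);
    }
    return groups;
  }

  public static void main(String[] args) {
    List<String> keys = Arrays.asList("order-1", "order-2", "order-3", "order-1");
    Map<Integer, List<String>> groups = groupByPartition(keys, 4);
    groups.forEach((id, group) -> System.out.println("partition " + id + " -> " + group));
  }
}
```

In a Spark job the same idea would presumably apply per row: compute the id from the primary key, repartition on it, and pass the id to the segment name generator through the `BatchConfigProperties.SEGMENT_PARTITION_ID` config checked in this diff.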



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

