chenboat commented on pull request #6567: URL: https://github.com/apache/incubator-pinot/pull/6567#issuecomment-805240112
> > A new integration test UpsertTableSegmentUploadIntegrationTest is added to verify the uploaded segment is added to a realtime upsert enabled table correctly. The segment name used in the test does not follow llc segment convention but it is loaded successfully. The query result is good. > > I do quite understand why we should "Exclude uploaded segment from the processing". During segment load, the upsert table metadata should be processed based on the loaded segment, right? > > > Please take a look at `PinotLLCRealtimeSegmentManager` and ensure it works on uploaded segments: > > > > > > * Segment name for uploaded segment does not follow the llc segment name convention > > > * Exclude uploaded segment from the processing > > I took another look at `PinotLLCRealtimeSegmentManager` and seems non-LLC segments are already excluded (because it might co-exist with HLC segments before). We cannot include them during metadata processing because it relies on the segment name format. > > For the issue within the `processExistingSegment()`, for uploading the new segment it won't hit that code path, but for refreshing the existing segment it won't set the fields properly. > > If possible, I would recommend setting up an actual table, upload some segments and keep it running for some time to ensure all components can properly work on the uploaded segments The integration test I added uploads a segment to a LLC table with all Pinot components up. The test shows the query can be run successfully on a Pinot upsert table with correct query results returned. It also verifies the idealstate. I can add more verification to the test such as more metadata check and more segments uploads. I prefer Pinot integration tests than manual tests. for `processExistingSegment()`, I think we should leave it to another PR and let this PR focus on adding new segments. For now, this PR will reject upload segments of the same name. Notice that for our Upsert table use case, upload new segments would be good enough. I will have a follow up PR for refreshing existing segments. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org