richardstartin commented on issue #7791:
URL: https://github.com/apache/pinot/issues/7791#issuecomment-978385352


   > could you explain what you mean by "this works everywhere"? That makes me 
feel like I don't really understand what you're proposing, and/or that you 
don't understand what I'm proposing. Streaming unpack of files should also 
"work everywhere", and should be highly performant, with minimal code changes.
   
   I understand what you’re proposing. You want to start downloading the 
segment and, relying on the metadata being at the start of the segment, 
essentially abort the download once you have the metadata. The motivation is to 
save the download time for most of the segment file. Is that correct? 
   
   By everywhere, I mean for all filesystem implementations and for all 
segments, no matter how they have been generated or by what. It is not 
currently specified where the metadata file is within the segment file, segment 
files may not have been generated by the code within this repository, and the 
metadata may not be at the start of the file.
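   
   To make the ordering concern concrete, here is a minimal Python sketch (not Pinot code) of the streaming approach. It assumes the segment is a gzipped tar and that the metadata entry is named `metadata.properties`; the entry names, the `CountingReader` wrapper, and the `find_metadata` and `build_segment` helpers are all hypothetical and exist only for illustration. It counts how many compressed bytes must be pulled from the stream before the metadata entry is found.

```python
import io
import os
import tarfile


class CountingReader:
    """Minimal file-like wrapper that counts bytes pulled from a stream."""

    def __init__(self, raw):
        self.raw = raw
        self.bytes_read = 0

    def read(self, size=-1):
        data = self.raw.read(size)
        self.bytes_read += len(data)
        return data


def find_metadata(stream, target="metadata.properties"):
    """Scan a gzipped tar stream and stop at the first entry named `target`.

    Returns (content, compressed_bytes_consumed). The entry name is an
    assumption made for this sketch.
    """
    counter = CountingReader(stream)
    # "r|gz" is tarfile's forward-only streaming mode: no seeking, so this
    # also works on a network stream that is read sequentially.
    with tarfile.open(fileobj=counter, mode="r|gz") as archive:
        for entry in archive:
            if entry.isfile() and entry.name.rsplit("/", 1)[-1] == target:
                content = archive.extractfile(entry).read()
                return content, counter.bytes_read
    return None, counter.bytes_read


def build_segment(metadata_first):
    """Build a toy in-memory segment archive (hypothetical layout)."""
    meta = ("seg/metadata.properties", b"segment.name=seg\n")
    # Random bytes do not compress, so the payload stays about 1 MiB.
    data = ("seg/columns.psf", os.urandom(1 << 20))
    entries = [meta, data] if metadata_first else [data, meta]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for name, payload in entries:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    for first in (True, False):
        _, consumed = find_metadata(io.BytesIO(build_segment(first)))
        print("metadata first" if first else "metadata last", consumed)
```

   In this toy setup the early abort only saves anything when the metadata entry happens to be written first. When it is written last, nearly the whole archive has to be read, which is the case an unspecified entry order cannot rule out.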
   
   > As to the mismatch scenario, I'm assuming the idea is to store segments in 
directory A, and associated metadata in directory B. If an ops person copies 
updated segments to an archive dir, and forgets to copy the associated 
metadata, you would have mismatched data.
   
   Without meaning to appear facetious, this sounds a bit like what happens 
when users occasionally tamper with transaction logs in an RDBMS and hope for 
the same outcomes. Shouldn’t the user just not do that?
   
   Regarding directories, assuming metadata is stored in a separate file from 
the segment, is there a good reason for it to be in a different directory? WDYT 
@mcvsubbu? I know you expressed similar concerns to mine on Slack.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


