snleee commented on PR #10856:
URL: https://github.com/apache/pinot/pull/10856#issuecomment-1580148040

   > Can we also document the user experience when this feature is enabled?
   > 
   > 1. What happens if proper headers are there already? Here the algorithm should not get this wrong and override the proper headers. How do we safeguard against this?
   > 2. What happens when there are no headers and we assign default headers?
   > 3. What happens when there are no headers and we fail to detect and assign defaults?
   
   Good point. 
   
   I added more comments on the user experience. By the way, the current logic 
is the following:
   
   - Check whether the file has a header.
   - If a header is found, keep the existing behavior.
   - If no header is found, fill in a default header (`col_0, col_1, ...`).
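
The three steps above can be sketched roughly as follows. This is a minimal Python illustration, not the code in the PR: `resolve_header`, `default_header`, and `naive_looks_like_header` are hypothetical names, and the "no numeric cells in the first row" check is an assumed stand-in for whatever detection heuristic is actually used.

```python
from typing import Callable, List, Sequence, Tuple

Row = List[str]


def default_header(width: int) -> Row:
    """Build placeholder column names: col_0, col_1, ..."""
    return [f"col_{i}" for i in range(width)]


def naive_looks_like_header(row: Sequence[str]) -> bool:
    """Assumed heuristic: a header row contains no numeric cells."""
    def is_number(cell: str) -> bool:
        try:
            float(cell)
            return True
        except ValueError:
            return False

    return len(row) > 0 and not any(is_number(cell) for cell in row)


def resolve_header(
    rows: Sequence[Sequence[str]],
    looks_like_header: Callable[[Sequence[str]], bool] = naive_looks_like_header,
) -> Tuple[Row, List[Row]]:
    """Return (header, data_rows) following the three steps above."""
    if not rows:
        return [], []
    first = list(rows[0])
    if looks_like_header(first):
        # Header found: existing behavior, the first row names the columns.
        return first, [list(r) for r in rows[1:]]
    # No header found: every row is data, columns get default names.
    return default_header(len(first)), [list(r) for r in rows]
```

For example, `resolve_header([["1", "a"], ["2", "b"]])` yields the header `["col_0", "col_1"]` and keeps both rows as data, because the first row contains a numeric cell under the assumed heuristic.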
   
   There are two ways the detection can go wrong:
   1. False negative: we detect 'no header' when there is a header. In this 
case, we replace the header with `col_0, col_1, ...` instead of honoring the 
header. I do see some reports of this: 
https://github.com/python/cpython/issues/104380
   2. False positive: we detect a header when there is no header. In this case, 
the end behavior is the same as today, because we fall back to the original 
behavior whenever we detect a header (`format.withHeader()`). So this does 
not cause any degradation.
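
To make the two failure modes concrete, here is a small Python sketch that forces each detection outcome by hand. It is an illustration only: `apply_detection` and `existing_behavior` are hypothetical helpers, and the sample rows are made up. It also assumes that, on a false negative, the real header row is kept as the first data row.

```python
from typing import List, Tuple

Row = List[str]


def existing_behavior(rows: List[Row]) -> Tuple[Row, List[Row]]:
    """Today's behavior: always treat the first row as the header."""
    return rows[0], rows[1:]


def apply_detection(rows: List[Row], header_detected: bool) -> Tuple[Row, List[Row]]:
    """New behavior, with the detection result passed in explicitly."""
    if header_detected:
        return existing_behavior(rows)
    # Assumption: with no header detected, all rows are treated as data.
    return [f"col_{i}" for i in range(len(rows[0]))], rows


with_header = [["city", "population"], ["Oslo", "700000"]]
without_header = [["Oslo", "700000"], ["Bergen", "290000"]]

# False negative: a real header exists but is not detected.
fn_header, fn_data = apply_detection(with_header, header_detected=False)

# False positive: no header exists but one is detected.
fp_result = apply_detection(without_header, header_detected=True)
```

Here `fn_header` is `["col_0", "col_1"]` and the real header row ends up in `fn_data`, which is the new failure mode. `fp_result` equals `existing_behavior(without_header)`, so the false-positive case is no worse than today.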
   
   I think the `false negative` cases will cause new issues that don't exist 
today when this feature is turned on. However, since CSV header detection 
cannot be perfect, I think we need to improve the logic incrementally as we 
see more edge cases.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

