[GitHub] [incubator-pinot] timsants commented on a change in pull request #6046: Deep Extraction Support for ORC, Thrift, and ProtoBuf Records

GitBox Sun, 11 Oct 2020 21:31:48 -0700


timsants commented on a change in pull request #6046:
URL: https://github.com/apache/incubator-pinot/pull/6046#discussion_r503031500




##########
File path: pinot-plugins/pinot-input-format/pinot-csv/src/main/java/org/apache/pinot/plugin/inputformat/csv/CSVRecordReader.java
##########
@@ -95,8 +95,13 @@ public void init(File dataFile, Set<String> fieldsToRead, @Nullable RecordReader
     _recordExtractor = new CSVRecordExtractor();
     CSVRecordExtractorConfig recordExtractorConfig = new CSVRecordExtractorConfig();
     recordExtractorConfig.setMultiValueDelimiter(multiValueDelimiter);
-    _recordExtractor.init(fieldsToRead, recordExtractorConfig);
+
     init();
+
+    if (fieldsToRead == null || fieldsToRead.isEmpty()) {

Review comment:
I had the same thought and debated whether to follow the same pattern as the
other extractors. I eventually decided to put the "read all fields" logic in
`CSVRecordReader` because the field names are accessible only through the CSV
header, not through the record object passed to the `extract` method.

The alternative implementation I considered would require setting all the CSV
column names in a new variable within `CSVRecordExtractorConfig`. But if
`fieldsToRead` is set most of the time, that would be a duplicated, unused
`Set` of field names sent to the `CSVRecordExtractor`.
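
To make the trade-off concrete, here is a rough sketch of the reader-side
fallback described above. It is not the code in this PR: the class
`FieldSelectionSketch` and the helper `resolveFields` are hypothetical names
used only for illustration, and the header is assumed to be already parsed into
a list of column names.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Pinot.
class FieldSelectionSketch {

  // Returns the fields the extractor should read. When the caller passes no
  // explicit fields, fall back to every column named in the CSV header,
  // since the header is the only place the column names are available.
  static Set<String> resolveFields(Set<String> fieldsToRead, List<String> headerColumns) {
    if (fieldsToRead == null || fieldsToRead.isEmpty()) {
      // LinkedHashSet keeps the header's column order.
      return new LinkedHashSet<>(headerColumns);
    }
    return fieldsToRead;
  }

  public static void main(String[] args) {
    List<String> header = Arrays.asList("id", "name", "tags");

    // No explicit fields: all header columns are selected.
    System.out.println(resolveFields(null, header)); // prints [id, name, tags]

    // Explicit fields: the caller's set is used unchanged.
    Set<String> explicit = new LinkedHashSet<>(Arrays.asList("name"));
    System.out.println(resolveFields(explicit, header)); // prints [name]
  }
}
```

With this shape the extractor always receives a single, already resolved set of
field names, so the config does not need to carry a second copy of the header
columns.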




