timsants commented on a change in pull request #6046:
URL: https://github.com/apache/incubator-pinot/pull/6046#discussion_r503031500
##########
File path:
pinot-plugins/pinot-input-format/pinot-csv/src/main/java/org/apache/pinot/plugin/inputformat/csv/CSVRecordReader.java
##########
@@ -95,8 +95,13 @@ public void init(File dataFile, Set<String> fieldsToRead,
@Nullable RecordReader
_recordExtractor = new CSVRecordExtractor();
CSVRecordExtractorConfig recordExtractorConfig = new CSVRecordExtractorConfig();
recordExtractorConfig.setMultiValueDelimiter(multiValueDelimiter);
- _recordExtractor.init(fieldsToRead, recordExtractorConfig);
+
init();
+
+ if (fieldsToRead == null || fieldsToRead.isEmpty()) {
Review comment:
I had the same thought and debated whether to follow the same pattern as the
other extractors. I eventually decided to put the "read all fields" logic in
`CSVRecordReader` because the field names are accessible only through the CSV
header, not through the record object passed to the `extract` method.
The alternative implementation I considered would require storing all the CSV
column names in a new variable within `CSVRecordExtractorConfig`. But if
`fieldsToRead` is set most of the time, that would be a duplicate, unused `Set`
of field names sent to the `CSVRecordExtractor`.
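
To make the trade-off concrete, here is a minimal sketch of the reader-side fallback, assuming the header column names are already available as a list. The class `FieldsToReadSketch` and the method `resolveFieldsToRead` are hypothetical names for illustration only, not the actual Pinot code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the reader-side logic; not the actual Pinot classes.
class FieldsToReadSketch {

  /**
   * Returns the fields the extractor should read. When the caller passes no
   * fields (null or empty), fall back to every column name from the CSV header.
   * LinkedHashSet keeps the header's column order.
   */
  static Set<String> resolveFieldsToRead(Set<String> fieldsToRead, List<String> headerColumns) {
    if (fieldsToRead == null || fieldsToRead.isEmpty()) {
      return new LinkedHashSet<>(headerColumns);
    }
    return fieldsToRead;
  }

  public static void main(String[] args) {
    // Assumed example header; a real reader would take this from the parsed CSV.
    List<String> header = Arrays.asList("id", "name", "tags");

    // No fields requested: every header column is read.
    System.out.println(resolveFieldsToRead(null, header));

    // Explicit fields requested: the set is passed through unchanged.
    System.out.println(resolveFieldsToRead(new LinkedHashSet<>(Arrays.asList("name")), header));
  }
}
```

With this shape the extractor only ever receives the one resolved set, so the config would not need to carry a second copy of the column names.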