swaminathanmanish commented on code in PR #10874:
URL: https://github.com/apache/pinot/pull/10874#discussion_r1224345854


##########
pinot-core/src/main/java/org/apache/pinot/core/segment/processing/mapper/SegmentMapper.java:
##########
@@ -122,32 +144,23 @@ public Map<String, GenericRowFileManager> map()
   private Map<String, GenericRowFileManager> doMap()
       throws Exception {
     Consumer<Object> observer = _processorConfig.getProgressObserver();
-    int totalCount = _recordReaders.size();
+    int totalCount = _recordReaderFileConfigs.size();
     int count = 1;
     GenericRow reuse = new GenericRow();
-    for (RecordReader recordReader : _recordReaders) {
-      observer.accept(String.format("Doing map phase on data from RecordReader (%d out of %d)", count++, totalCount));
-      while (recordReader.hasNext()) {
-        reuse = recordReader.next(reuse);
-
-        // TODO: Add ComplexTypeTransformer here. Currently it is not idempotent so cannot add it
-
-        if (reuse.getValue(GenericRow.MULTIPLE_RECORDS_KEY) != null) {
-          //noinspection unchecked
-          for (GenericRow row : (Collection<GenericRow>) reuse.getValue(GenericRow.MULTIPLE_RECORDS_KEY)) {
-            GenericRow transformedRow = _recordTransformer.transform(row);
-            if (transformedRow != null && IngestionUtils.shouldIngestRow(transformedRow)) {
-              writeRecord(transformedRow);
-            }
-          }
-        } else {
-          GenericRow transformedRow = _recordTransformer.transform(reuse);
-          if (transformedRow != null && IngestionUtils.shouldIngestRow(transformedRow)) {
-            writeRecord(transformedRow);
-          }
-        }
-
-        reuse.clear();
+    boolean inited = false;
+    for (RecordReaderFileConfig recordReaderFileConfig : _recordReaderFileConfigs) {

Review Comment:
   Both are related :).
   I initially did not have a RecordReader in RecordReaderFileConfig, but I
noticed that callers passed in OSS RecordReaders as well as custom
RecordReaders. I wanted the flexibility to support both usages with the same
wrapper/container, 'RecordReaderFileConfig'. It's better for the caller not to
have to decide when to use List<RecordReader> vs.
List<RecordReaderFileConfig>; callers only use RecordReaderFileConfig. The
constructors for RecordReaderFileConfig make it clear how it should be used
(either pass in the info needed to initialize the reader, or pass in the
reader instance itself).
   The other parameter, List<RecordReader>, is kept only for backwards
compatibility.
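
To illustrate the two-constructor idea, here is a minimal, self-contained sketch. It is not the code in this PR: `ReaderConfigSketch`, `SimpleReader`, `ListReader`, `ReaderConfig`, and `getOrCreateReader` are hypothetical names, and a `Supplier` stands in for the "info to initialize the reader" so the example runs without any Pinot classes or files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

public class ReaderConfigSketch {
  /** Hypothetical stand-in for a record reader; NOT Pinot's RecordReader. */
  interface SimpleReader extends Iterator<String> {
  }

  /** Trivial reader over an in-memory list, used only for this demo. */
  static final class ListReader implements SimpleReader {
    private final Iterator<String> _rows;

    ListReader(List<String> rows) {
      _rows = rows.iterator();
    }

    @Override
    public boolean hasNext() {
      return _rows.hasNext();
    }

    @Override
    public String next() {
      return _rows.next();
    }
  }

  /** Hypothetical wrapper: holds either a recipe for a reader or a ready instance. */
  static final class ReaderConfig {
    private final Supplier<SimpleReader> _factory; // null when an instance was supplied
    private SimpleReader _reader;

    /** Caller passes the info needed to build the reader; creation is deferred. */
    ReaderConfig(Supplier<SimpleReader> factory) {
      _factory = factory;
    }

    /** Caller passes an already-initialized (e.g. custom) reader instance. */
    ReaderConfig(SimpleReader reader) {
      _factory = null;
      _reader = reader;
    }

    SimpleReader getOrCreateReader() {
      if (_reader == null) {
        _reader = _factory.get(); // built lazily, on first use
      }
      return _reader;
    }
  }

  /** The consumer sees one list type and does not care how each reader came to be. */
  static List<String> readAll(List<ReaderConfig> configs) {
    List<String> out = new ArrayList<>();
    for (ReaderConfig config : configs) {
      SimpleReader reader = config.getOrCreateReader();
      while (reader.hasNext()) {
        out.add(reader.next());
      }
    }
    return out;
  }

  public static List<String> demo() {
    List<ReaderConfig> configs = new ArrayList<>();
    configs.add(new ReaderConfig(() -> new ListReader(Arrays.asList("a", "b")))); // deferred
    configs.add(new ReaderConfig(new ListReader(Arrays.asList("c")))); // ready-made
    return readAll(configs);
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Under these assumptions the mapping loop iterates over a single list type, and deferring reader creation until first use is what lets a wrapper of this kind avoid initializing every reader up front.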
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

