Jackie-Jiang commented on code in PR #12704:
URL: https://github.com/apache/pinot/pull/12704#discussion_r1583790911


##########
pinot-common/src/main/java/org/apache/pinot/common/response/BrokerResponse.java:
##########
@@ -123,6 +127,17 @@ String toJsonString()
    */
   long getMinConsumingFreshnessTimeMs();
 
+  /**
+   * Get the max number of rows seen by a single operator in the query processing chain.
+   * <p>
+   * In single-stage this value is equal to {@link #getNumEntriesScannedPostFilter()} because single-stage operators
+   * cannot generate more rows than the number of rows received. This is not the case in multi-stage operators, where
+   * operators like join or union can emit more rows than the ones received on each of its children operators.
+   */
+  default long getMaxRowsInOperator() {

Review Comment:
   To me this is actually a little confusing. Correct me if my 
understanding is wrong:
   - In `v1` it is a query-level stat on entries (rows * columns projected), 
which is the total `numEntriesScannedPostFilter` across all segments queried.
   - In `v2` it is the maximum number of rows returned by a single operator 
(there can be multiple operators of the same type, but they are counted 
separately) across all the workers. It is essentially the maximum of all the 
`EMITTED_ROWS` values in the stats map.
   
   In `v1` the closest match might be `numDocsScanned`.
   I suggest we discuss this and add it in a separate PR.
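
   To make the mismatch concrete, here is a minimal sketch of the two 
aggregations as described above. It is not Pinot code: `RowStatsSketch`, 
`SegmentStats`, and both helper methods are hypothetical names, and the stats 
map is assumed to be a plain `Map<String, Long>` keyed by `EMITTED_ROWS`.

```java
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical types, not the actual Pinot stats classes.
class RowStatsSketch {

  // Hypothetical per-segment stats for the single-stage (v1) engine.
  static final class SegmentStats {
    final long numDocsScanned;
    final int numColumnsProjected;

    SegmentStats(long numDocsScanned, int numColumnsProjected) {
      this.numDocsScanned = numDocsScanned;
      this.numColumnsProjected = numColumnsProjected;
    }

    // Entries = rows * columns projected, as described in the comment above.
    long entriesScannedPostFilter() {
      return numDocsScanned * numColumnsProjected;
    }
  }

  // v1-style stat: a SUM of entries across all segments queried.
  static long totalEntriesScannedPostFilter(List<SegmentStats> segments) {
    return segments.stream().mapToLong(SegmentStats::entriesScannedPostFilter).sum();
  }

  // v1-style row count: a SUM of documents (rows) across all segments queried.
  static long totalDocsScanned(List<SegmentStats> segments) {
    return segments.stream().mapToLong(s -> s.numDocsScanned).sum();
  }

  // v2-style stat: the MAX of EMITTED_ROWS over every operator instance.
  // Each map is assumed to hold the stats of one operator on one worker.
  static long maxEmittedRows(List<Map<String, Long>> operatorStats) {
    return operatorStats.stream()
        .mapToLong(stats -> stats.getOrDefault("EMITTED_ROWS", 0L))
        .max()
        .orElse(0L);
  }

  public static void main(String[] args) {
    List<SegmentStats> segments =
        List.of(new SegmentStats(100, 3), new SegmentStats(50, 3));
    List<Map<String, Long>> operators = List.of(
        Map.of("EMITTED_ROWS", 150L),   // e.g. a scan
        Map.of("EMITTED_ROWS", 150L),   // e.g. another scan
        Map.of("EMITTED_ROWS", 400L));  // e.g. a join emitting more rows than it received

    System.out.println("v1 entries scanned post filter: " + totalEntriesScannedPostFilter(segments));
    System.out.println("v1 docs scanned: " + totalDocsScanned(segments));
    System.out.println("v2 max rows in operator: " + maxEmittedRows(operators));
  }
}
```

   The first value counts entries and is a sum, while the last counts rows 
and is a maximum, so the two differ in both unit and aggregation.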



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

