gortiz commented on code in PR #14264: URL: https://github.com/apache/pinot/pull/14264#discussion_r1818880977
########## pinot-query-runtime/src/main/java/org/apache/pinot/query/runtime/operator/window/value/FirstValueWindowFunction.java: ########## @@ -113,12 +187,115 @@ private List<Object> processRangeWindow(List<Object[]> rows) { return result; } - private List<Object> processUnboundedPreceding(List<Object[]> rows) { + private List<Object> processRangeWindowIgnoreNulls(List<Object[]> rows) { int numRows = rows.size(); - assert numRows > 0; - Object value = extractValueFromRow(rows.get(0)); - Object[] result = new Object[numRows]; - Arrays.fill(result, value); - return Arrays.asList(result); + + if (_windowFrame.isUnboundedPreceding() && _windowFrame.isUnboundedFollowing()) { + // Find the first non-null value and fill it in all rows + for (int i = 0; i < numRows; i++) { + Object[] row = rows.get(i); + Object value = extractValueFromRow(row); + if (value != null) { + return fillAllWithValue(rows, value); + } + } + // There's no non-null value + return Collections.nCopies(numRows, null); + } + + if (_windowFrame.isUnboundedPreceding() && _windowFrame.isUpperBoundCurrentRow()) { + // Find the first non-null value and fill it in all rows starting from the first row of the peer group of the row + // with the first non-null value + int firstNonNullValueIndex = -1; + Key firstNonNullValueKey = null; + for (int i = 0; i < numRows; i++) { + Object[] row = rows.get(i); + Object value = extractValueFromRow(row); + if (value != null) { + firstNonNullValueIndex = i; + firstNonNullValueKey = AggregationUtils.extractRowKey(row, _orderKeys); + break; + } + } + + // No non-null values + if (firstNonNullValueIndex == -1) { + return Collections.nCopies(numRows, null); + } + + List<Object> result = new ArrayList<>(numRows); + // Find the start of the peer group of the row with the first non-null value + int i; + for (i = 0; i < numRows; i++) { + Object[] row = rows.get(i); + Key orderKey = AggregationUtils.extractRowKey(row, _orderKeys); + if (orderKey.equals(firstNonNullValueKey)) { + break; + } else { + result.add(null); + } + } + + Object firstNonNullValue = extractValueFromRow(rows.get(firstNonNullValueIndex)); + for (; i < numRows; i++) { + result.add(firstNonNullValue); + } + + return result; Review Comment: nit: Here we could reduce allocation cost if using a custom list that could combine two lists. I'm shocked about the lack of a list like that in Guava, but shouldn't be difficult to create one. The idea would be to create two lists like Collections.nCopies. The first would contain just nulls and the second just the first non null value. Finally we wrap these two instances in a view whose get delegates on either the first or the second list depending on whether the index is greater or smaller than firstNonNullValueIndex. If this pattern is seen in more window functions, to create this combine list view would worth the effort -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org