Jackie-Jiang commented on code in PR #14298:
URL: https://github.com/apache/pinot/pull/14298#discussion_r1819836993


##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/transformer/datetime/BaseDateTimeTransformer.java:
##########
@@ -42,14 +46,26 @@ public abstract class BaseDateTimeTransformer<I, O> 
implements DataTransformer<I
   private final DateTimeFormatter _outputDateTimeFormatter;
   private final DateTimeGranularitySpec _outputGranularity;
   private final long _outputGranularityMillis;
-  private final SDFDateTimeTruncate _dateTimeTruncate;
+  private final SDFDateTimeTruncate _sdfDateTimeTruncate;
+  private final MillisDateTimeTruncate _millisDateTimeTruncate;
+  private final Chronology _bucketingChronology;
+  private final MutableDateTime _dateTime;
+  private final StringBuilder _printBuffer;
 
   private interface SDFDateTimeTruncate {
-    String truncate(DateTime dateTime);
+    String truncate(MutableDateTime dateTime);
   }
 
-  public BaseDateTimeTransformer(@Nonnull DateTimeFormatSpec inputFormat, 
@Nonnull DateTimeFormatSpec outputFormat,
-      @Nonnull DateTimeGranularitySpec outputGranularity) {
+  private interface MillisDateTimeTruncate {
+    long truncate(MutableDateTime dateTime);
+  }
+
+  public BaseDateTimeTransformer(
+      @Nonnull DateTimeFormatSpec inputFormat,

Review Comment:
   (Convention) We usually only annotate `Nullable` and treat others as 
`Nonnull`



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/transformer/datetime/EpochToEpochTransformer.java:
##########
@@ -29,14 +31,20 @@
 public class EpochToEpochTransformer extends BaseDateTimeTransformer<long[], 
long[]> {
 
   public EpochToEpochTransformer(DateTimeFormatSpec inputFormat, 
DateTimeFormatSpec outputFormat,
-      DateTimeGranularitySpec outputGranularity) {
-    super(inputFormat, outputFormat, outputGranularity);
+      DateTimeGranularitySpec outputGranularity, @Nullable DateTimeZone 
bucketingTz) {
+    super(inputFormat, outputFormat, outputGranularity, bucketingTz);
   }
 
   @Override
   public void transform(@Nonnull long[] input, @Nonnull long[] output, int 
length) {
-    for (int i = 0; i < length; i++) {
-      output[i] = 
transformMillisToEpoch(transformToOutputGranularity(transformEpochToMillis(input[i])));
+    if (useCustomBucketingTimeZone()) {

Review Comment:
   We can still short-circuit it if it is UTC. Same for other places.
   I guess we can check if the explicitly specified time zone is the same as 
output format time zone, and count it as custom time zone only when they are 
different.



##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/DateTimeConvert.java:
##########
@@ -42,36 +51,43 @@ public Object dateTimeConvert(String timeValueStr, String 
inputFormatStr, String
       _inputFormatSpec = new DateTimeFormatSpec(inputFormatStr);
       _outputFormatSpec = new DateTimeFormatSpec(outputFormatStr);
       _granularitySpec = new DateTimeGranularitySpec(outputGranularityStr);
+      _dateTime = new MutableDateTime(0L, DateTimeZone.UTC);
+      _buffer = new StringBuilder();

Review Comment:
   Consider adding some javadoc explaining we reuse date time and string 
builder to reduce garbage



##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/DateTimeConvert.java:
##########
@@ -83,4 +99,67 @@ public Object dateTimeConvert(String timeValueStr, String 
inputFormatStr, String
       return _outputFormatSpec.fromMillisToFormat(roundedTimeValueMs);
     }
   }
+
+  @ScalarFunction
+  public Object dateTimeConvert(String timeValueStr, String inputFormatStr, 
String outputFormatStr,

Review Comment:
   It is almost the same as the other method. Can we introduce a helper method 
with nullable `bucketingTimeZone`, and make both the scalar function call the 
helper method



##########
pinot-common/src/main/java/org/apache/pinot/common/function/TransformFunctionType.java:
##########
@@ -115,8 +115,14 @@ public enum TransformFunctionType {
   // Date time functions
   TIME_CONVERT("timeConvert", ReturnTypes.BIGINT,
       OperandTypes.family(List.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER, 
SqlTypeFamily.CHARACTER))),
-  DATE_TIME_CONVERT("dateTimeConvert", 
TransformFunctionType::dateTimeConverterReturnTypeInference, 
OperandTypes.family(
-      List.of(SqlTypeFamily.ANY, SqlTypeFamily.CHARACTER, 
SqlTypeFamily.CHARACTER, SqlTypeFamily.CHARACTER))),
+  DATE_TIME_CONVERT("dateTimeConvert", 
TransformFunctionType::dateTimeConverterReturnTypeInference,
+      OperandTypes.or(

Review Comment:
   We can use optional predicate instead. See `DATE_TRUNC` as an example



##########
pinot-common/src/main/java/org/apache/pinot/common/function/scalar/DateTimeConvert.java:
##########
@@ -42,36 +51,43 @@ public Object dateTimeConvert(String timeValueStr, String 
inputFormatStr, String
       _inputFormatSpec = new DateTimeFormatSpec(inputFormatStr);
       _outputFormatSpec = new DateTimeFormatSpec(outputFormatStr);
       _granularitySpec = new DateTimeGranularitySpec(outputGranularityStr);
+      _dateTime = new MutableDateTime(0L, DateTimeZone.UTC);
+      _buffer = new StringBuilder();
     }
 
     long timeValueMs = _inputFormatSpec.fromFormatToMillis(timeValueStr);
     if (_outputFormatSpec.getTimeFormat() == 
DateTimeFieldSpec.TimeFormat.SIMPLE_DATE_FORMAT) {
       DateTimeFormatter outputFormatter = 
_outputFormatSpec.getDateTimeFormatter();
-      DateTime dateTime = new DateTime(timeValueMs, outputFormatter.getZone());
+      _dateTime.setMillis(timeValueMs);
+      _dateTime.setZone(outputFormatter.getZone());

Review Comment:
   We can fix the time zone during the initialization as it won't change. Same 
for other places where we update timezone per value



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/transformer/datetime/BaseDateTimeTransformer.java:
##########
@@ -58,41 +74,145 @@ public BaseDateTimeTransformer(@Nonnull DateTimeFormatSpec 
inputFormat, @Nonnull
     _outputDateTimeFormatter = outputFormat.getDateTimeFormatter();
     _outputGranularity = outputGranularity;
     _outputGranularityMillis = outputGranularity.granularityToMillis();
+    _bucketingChronology = bucketingTimeZone != null ? 
ISOChronology.getInstance(bucketingTimeZone) : null;
+    _dateTime = new MutableDateTime(0L, DateTimeZone.UTC);
+    _printBuffer = new StringBuilder();
+
+    final int size = _outputGranularity.getSize();
 
     // setup date time truncating based on output granularity
-    final int sz = _outputGranularity.getSize();
+    //when size == 1, skip the needless set() calls

Review Comment:
   (nit) Put a space after `//` for consistency



##########
pinot-core/src/main/java/org/apache/pinot/core/operator/transform/transformer/datetime/BaseDateTimeTransformer.java:
##########
@@ -58,41 +74,145 @@ public BaseDateTimeTransformer(@Nonnull DateTimeFormatSpec 
inputFormat, @Nonnull
     _outputDateTimeFormatter = outputFormat.getDateTimeFormatter();
     _outputGranularity = outputGranularity;
     _outputGranularityMillis = outputGranularity.granularityToMillis();
+    _bucketingChronology = bucketingTimeZone != null ? 
ISOChronology.getInstance(bucketingTimeZone) : null;
+    _dateTime = new MutableDateTime(0L, DateTimeZone.UTC);
+    _printBuffer = new StringBuilder();
+
+    final int size = _outputGranularity.getSize();
 
     // setup date time truncating based on output granularity
-    final int sz = _outputGranularity.getSize();
+    //when size == 1, skip the needless set() calls
     switch (_outputGranularity.getTimeUnit()) {
       case MILLISECONDS:
-        _dateTimeTruncate = (dateTime) -> _outputDateTimeFormatter
-            .print(dateTime.withMillisOfSecond((dateTime.getMillisOfSecond() / 
sz) * sz));
+        if (size != 1) {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.setMillisOfSecond((dateTime.getMillisOfSecond() / size) * 
size);
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.setMillisOfSecond((dateTime.getMillisOfSecond() / size) * 
size);
+            return dateTime.getMillis();
+          };
+        } else {
+          _sdfDateTimeTruncate = (dateTime) -> print(dateTime);
+          _millisDateTimeTruncate = (dateTime) -> dateTime.getMillis();
+        }
         break;
       case SECONDS:
-        _dateTimeTruncate = (dateTime) -> _outputDateTimeFormatter.print(
-            dateTime.withSecondOfMinute((dateTime.getSecondOfMinute() / sz) * 
sz).secondOfMinute().roundFloorCopy());
+        if (size != 1) {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.setSecondOfMinute((dateTime.getSecondOfMinute() / size) * 
size);
+            dateTime.secondOfMinute().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.setSecondOfMinute((dateTime.getSecondOfMinute() / size) * 
size);
+            dateTime.secondOfMinute().roundFloor();
+            return dateTime.getMillis();
+          };
+        } else {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.secondOfMinute().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.secondOfMinute().roundFloor();
+            return dateTime.getMillis();
+          };
+        }
         break;
       case MINUTES:
-        _dateTimeTruncate = (dateTime) -> _outputDateTimeFormatter
-            .print(dateTime.withMinuteOfHour((dateTime.getMinuteOfHour() / sz) 
* sz).minuteOfHour().roundFloorCopy());
+        if (size != 1) {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.setMinuteOfHour((dateTime.getMinuteOfHour() / size) * 
size);
+            dateTime.minuteOfHour().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.setMinuteOfHour((dateTime.getMinuteOfHour() / size) * 
size);
+            dateTime.minuteOfHour().roundFloor();
+            return dateTime.getMillis();
+          };
+        } else {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.minuteOfHour().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.minuteOfHour().roundFloor();
+            return dateTime.getMillis();
+          };
+        }
         break;
       case HOURS:
-        _dateTimeTruncate = (dateTime) -> _outputDateTimeFormatter
-            .print(dateTime.withHourOfDay((dateTime.getHourOfDay() / sz) * 
sz).hourOfDay().roundFloorCopy());
+        if (size != 1) {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.setHourOfDay((dateTime.getHourOfDay() / size) * size);
+            dateTime.hourOfDay().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.setHourOfDay((dateTime.getHourOfDay() / size) * size);
+            dateTime.hourOfDay().roundFloor();
+            return dateTime.getMillis();
+          };
+        } else {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.hourOfDay().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.hourOfDay().roundFloor();
+            return dateTime.getMillis();
+          };
+        }
         break;
       case DAYS:
-        _dateTimeTruncate = (dateTime) -> _outputDateTimeFormatter
-            .print(dateTime.withDayOfMonth(((dateTime.getDayOfMonth() - 1) / 
sz) * sz + 1).dayOfMonth()
-                .roundFloorCopy());
+        if (size != 1) {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.setDayOfMonth(((dateTime.getDayOfMonth() - 1) / size) * 
size + 1);
+            dateTime.dayOfMonth().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.setDayOfMonth(((dateTime.getDayOfMonth() - 1) / size) * 
size + 1);
+            dateTime.dayOfMonth().roundFloor();
+            return dateTime.getMillis();
+          };
+        } else {
+          _sdfDateTimeTruncate = (dateTime) -> {
+            dateTime.dayOfMonth().roundFloor();
+            return print(dateTime);
+          };
+          _millisDateTimeTruncate = (dateTime) -> {
+            dateTime.dayOfMonth().roundFloor();
+            return dateTime.getMillis();
+          };
+        }
         break;
       default:
-        _dateTimeTruncate = _outputDateTimeFormatter::print;
+        _sdfDateTimeTruncate = _outputDateTimeFormatter::print;

Review Comment:
   Should we throw exception here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to