gortiz commented on code in PR #15919:
URL: https://github.com/apache/pinot/pull/15919#discussion_r2137677116


##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/ZeroCopyDataBlockSerde.java:
##########
@@ -254,21 +265,37 @@ private long calculateEndOffset(DataBuffer buffer, Header header) {
     return currentOffset;
   }
 
+  /// Deserializes the exceptions and metadata from the stream.
   @VisibleForTesting
-  static Map<Integer, String> deserializeExceptions(PinotInputStream stream, Header header)
+  static ErrorsAndMetadata deserializeExceptions(PinotInputStream stream, Header header)
       throws IOException {
     if (header._exceptionsLength == 0) {
-      return new HashMap<>();
+      return ErrorsAndMetadata.EMPTY;
     }
+    long currentOffset = header.getExceptionsStart();
+
     stream.seek(header.getExceptionsStart());
     int numExceptions = stream.readInt();
-    Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions));
+    // We reserve extra space for the fake error codes storing stageId, workerId and serverId
+    Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions + 3));
     for (int i = 0; i < numExceptions; i++) {
       int errCode = stream.readInt();
       String errMessage = stream.readInt4UTF();
       exceptions.put(errCode, errMessage);
     }
-    return exceptions;
+
+    long readOffset = stream.getCurrentOffset() - currentOffset;
+    if (readOffset >= header._exceptionsLength) {
+      return new ErrorsAndMetadata(exceptions, -1, -1, "");
+    }
+    int errorMetadataVersion = stream.readInt();
+    if (errorMetadataVersion != ERROR_METADATA_VERSION) {
+      return new ErrorsAndMetadata(exceptions, -1, -1, "");
+    }
+    int stageId = stream.readInt();
+    int workerId = stream.readInt();
+    String serverId = stream.readInt4UTF();

Review Comment:
   I'm using "".
   
   I think it is better (it may avoid NPEs) and it simplifies the serialization code, as it is simpler to write an empty string than a null value. We can then transform this empty string into null when converting it to a block, if needed.
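
   A minimal sketch of the empty-string approach, assuming a length-prefixed UTF-8 encoding (a 4-byte length followed by the bytes). The class and method names are hypothetical and are not Pinot's API, and the wire format is an assumption made for illustration only. With such an encoding an absent serverId is just a zero length, so no separate null marker is needed, and the conversion to null can happen at the boundary:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of Pinot: illustrates "" as the "no serverId" sentinel. */
final class ServerIdSentinelSketch {
  private ServerIdSentinelSketch() {
  }

  /** Writes a 4-byte length followed by the UTF-8 bytes; "" is simply a zero length. */
  static byte[] write(String serverId) {
    byte[] utf8 = serverId.getBytes(StandardCharsets.UTF_8);
    return ByteBuffer.allocate(Integer.BYTES + utf8.length)
        .putInt(utf8.length)
        .put(utf8)
        .array();
  }

  /** Reads the length prefix and then that many UTF-8 bytes; never returns null. */
  static String read(byte[] serialized) {
    ByteBuffer buffer = ByteBuffer.wrap(serialized);
    byte[] utf8 = new byte[buffer.getInt()];
    buffer.get(utf8);
    return new String(utf8, StandardCharsets.UTF_8);
  }

  /** Boundary conversion: callers that prefer null can map the sentinel back. */
  static String toNullable(String serverId) {
    return serverId.isEmpty() ? null : serverId;
  }

  public static void main(String[] args) {
    String roundTripped = read(write(""));
    System.out.println("empty after round trip: " + roundTripped.isEmpty());
    System.out.println("mapped to null: " + (toNullable(roundTripped) == null));
    System.out.println("non-empty value: " + toNullable(read(write("Server_1"))));
  }
}
```

   The writer never has to branch on null, which is the simplification the comment refers to.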



##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/ZeroCopyDataBlockSerde.java:
##########
@@ -254,21 +265,37 @@ private long calculateEndOffset(DataBuffer buffer, Header header) {
     return currentOffset;
   }
 
+  /// Deserializes the exceptions and metadata from the stream.
   @VisibleForTesting
-  static Map<Integer, String> deserializeExceptions(PinotInputStream stream, Header header)
+  static ErrorsAndMetadata deserializeExceptions(PinotInputStream stream, Header header)
       throws IOException {
     if (header._exceptionsLength == 0) {
-      return new HashMap<>();
+      return ErrorsAndMetadata.EMPTY;
     }
+    long currentOffset = header.getExceptionsStart();
+
     stream.seek(header.getExceptionsStart());
     int numExceptions = stream.readInt();
-    Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions));
+    // We reserve extra space for the fake error codes storing stageId, workerId and serverId
+    Map<Integer, String> exceptions = new HashMap<>(HashUtil.getHashMapCapacity(numExceptions + 3));
     for (int i = 0; i < numExceptions; i++) {
       int errCode = stream.readInt();
       String errMessage = stream.readInt4UTF();
       exceptions.put(errCode, errMessage);
     }
-    return exceptions;
+
+    long readOffset = stream.getCurrentOffset() - currentOffset;
+    if (readOffset >= header._exceptionsLength) {
+      return new ErrorsAndMetadata(exceptions, -1, -1, "");

Review Comment:
   We cannot use EMPTY, as exceptions could be non-empty.
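
   To illustrate, here is a rough sketch using a hypothetical stand-in class. Only the constructor shape and the EMPTY constant mirror the diff above; the field names, the `withoutMetadata` helper and the error code 200 are assumptions. A payload that contains exceptions but no trailing metadata must keep those exceptions, which a shared EMPTY instance would drop:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the PR's ErrorsAndMetadata; field names are assumptions. */
final class ErrorsAndMetadataSketch {
  /** Shared instance for payloads that carry no exceptions at all. */
  static final ErrorsAndMetadataSketch EMPTY =
      new ErrorsAndMetadataSketch(Collections.<Integer, String>emptyMap(), -1, -1, "");

  final Map<Integer, String> exceptions;
  final int stageId;
  final int workerId;
  final String serverId;

  ErrorsAndMetadataSketch(Map<Integer, String> exceptions, int stageId, int workerId, String serverId) {
    this.exceptions = exceptions;
    this.stageId = stageId;
    this.workerId = workerId;
    this.serverId = serverId;
  }

  /** Exceptions were read, but no metadata follows them: keep the exceptions, default the rest. */
  static ErrorsAndMetadataSketch withoutMetadata(Map<Integer, String> exceptions) {
    return new ErrorsAndMetadataSketch(exceptions, -1, -1, "");
  }

  public static void main(String[] args) {
    Map<Integer, String> exceptions = new HashMap<>();
    exceptions.put(200, "example error message"); // arbitrary code and message for illustration
    ErrorsAndMetadataSketch result = withoutMetadata(exceptions);
    System.out.println("exceptions kept: " + result.exceptions.size());
    System.out.println("exceptions in EMPTY: " + EMPTY.exceptions.size());
  }
}
```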



##########
pinot-common/src/main/java/org/apache/pinot/common/datablock/MetadataBlock.java:
##########
@@ -69,6 +71,17 @@ public static MetadataBlock newEosWithStats(List<DataBuffer> statsByStage) {
   public MetadataBlock(List<DataBuffer> statsByStage) {

Review Comment:
   > Is there scenario where stageId and workerId are unavailable?
   
   Yes, MetadataBlock is used to serialize any EOS, both those that succeed and those that don't, but stageId and workerId are only used on error blocks. We could include them in successful blocks as well, but that wouldn't be very useful and would make SuccessMseBlock stateful, which is not great given that it can currently be used as a singleton.
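
   A small sketch of the trade-off, with hypothetical stand-in types (these are not Pinot's actual block classes): a success block with no fields can be a single shared instance, while an error block that records where the failure happened needs one instance per error.

```java
/** Hypothetical sketch; these types are stand-ins, not Pinot's actual block classes. */
final class EosBlockSketch {
  private EosBlockSketch() {
  }

  interface Eos {
  }

  /** Carries no state, so one shared instance can serve every successful EOS. */
  static final class Success implements Eos {
    static final Success INSTANCE = new Success();

    private Success() {
    }
  }

  /** Carries the origin of the failure, so each error needs its own instance. */
  static final class Failure implements Eos {
    final int stageId;
    final int workerId;

    Failure(int stageId, int workerId) {
      this.stageId = stageId;
      this.workerId = workerId;
    }
  }

  static Success success() {
    return Success.INSTANCE;
  }

  static Failure failure(int stageId, int workerId) {
    return new Failure(stageId, workerId);
  }

  public static void main(String[] args) {
    System.out.println("same success instance: " + (success() == success()));
    Failure failure = failure(2, 5);
    System.out.println("failure origin: stage " + failure.stageId + ", worker " + failure.workerId);
  }
}
```

   Adding stageId and workerId to the success type would force a new allocation per block, like `Failure` here.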



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

