Jackie-Jiang commented on code in PR #13742:
URL: https://github.com/apache/pinot/pull/13742#discussion_r1843021732
########## pinot-connectors/pinot-spark-common/src/main/scala/org/apache/pinot/connector/spark/common/reader/PinotServerDataFetcher.scala: ##########

@@ -40,9 +40,9 @@ import scala.collection.JavaConverters._
  * Eg: offline-server1: segment1, segment2, segment3
  */
 private[reader] class PinotServerDataFetcher(
-    partitionId: Int,
-    pinotSplit: PinotSplit,
-    dataSourceOptions: PinotDataSourceReadOptions)
+  partitionId: Int,

Review Comment:
   (format) The format doesn't seem correct. Can you change it to follow the old format?

########## pinot-common/src/thrift/request.thrift: ##########

@@ -52,4 +52,5 @@ struct InstanceRequest {
   4: optional bool enableTrace;
   5: optional string brokerId;
   6: optional list<string> optionalSegments;
+// 7: required i64 queryHash;

Review Comment:
   Don't add a commented-out field here: a commented field usually indicates that the field is deprecated, and we should not reuse the same id.

########## pinot-common/src/main/java/org/apache/pinot/common/utils/request/BrokerRequestIdUtils.java: ##########

@@ -0,0 +1,38 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.pinot.common.utils.request;
+
+public class BrokerRequestIdUtils {
+  public static final int TABLE_TYPE_OFFSET = 10;

Review Comment:
   Seems this is the core change to split one request into multiple ids. Currently this one will put a hard limit on the total tables queried. Do we need to reverse-engineer the original request id? If not, we can probably simply use consecutive ids to represent each request?

########## pinot-core/src/main/java/org/apache/pinot/core/transport/AsyncQueryResponse.java: ##########

@@ -39,10 +44,13 @@
 @ThreadSafe
 public class AsyncQueryResponse implements QueryResponse {
   private final QueryRouter _queryRouter;
-  private final long _requestId;
+
+  // TODO(egalpin): Remove the concept of truncated request ID in next major release after all servers are expected
+  //  to send back tableName in DataTable metadata
+  private final long _canonicalRequestId;

Review Comment:
   Is the overall plan to use request id + table name as the unique identifier of the query? Can you share how you plan to manage the request id? E.g. only have one request id on the broker side, and the server splits it into multiple requests? I'm also considering whether to enhance `InstanceRequest` to have a list of <query + searchSegments + optionalSegments>.

########## pinot-core/src/main/java/org/apache/pinot/core/plan/GlobalPlanImplV0.java: ##########

@@ -57,6 +57,8 @@ public InstanceResponseBlock execute()
     InstanceResponseBlock instanceResponseBlock = instanceResponseOperator.nextBlock();
     long endTime2 = System.currentTimeMillis();
     LOGGER.debug("InstanceResponseOperator.nextBlock() took: {}ms", endTime2 - endTime1);
+// instanceResponseBlock.addMetadata(DataTable.MetadataKey.TABLE.getName(),

Review Comment:
   This seems unrelated. Could you go over the changes again and clean up the unrelated changes and TODOs that don't apply?
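The trade-off raised in the `BrokerRequestIdUtils` review comment above (offset-based encoding versus consecutive ids) can be sketched roughly as follows. This is an illustrative sketch, not Pinot code: `encodeWithOffset`, `decodeBrokerRequestId`, and `consecutiveIds` are hypothetical names introduced only for the example.

```java
// Hypothetical sketch contrasting two ways a broker-side request id could
// fan out into per-table server request ids.
public class RequestIdSketch {
  // Offset-based scheme: reserves TABLE_TYPE_OFFSET id slots per broker
  // request, so the original broker request id can be recovered from a
  // server-side id, but it caps the tables per query at TABLE_TYPE_OFFSET.
  static final int TABLE_TYPE_OFFSET = 10;

  public static long encodeWithOffset(long brokerRequestId, int tableIndex) {
    return brokerRequestId * TABLE_TYPE_OFFSET + tableIndex;
  }

  public static long decodeBrokerRequestId(long serverRequestId) {
    return serverRequestId / TABLE_TYPE_OFFSET;
  }

  // Consecutive scheme: ids are handed out in order; nothing is recoverable
  // from the id itself, but there is no limit on tables per query.
  public static long[] consecutiveIds(long nextId, int numTables) {
    long[] ids = new long[numTables];
    for (int i = 0; i < numTables; i++) {
      ids[i] = nextId + i;
    }
    return ids;
  }

  public static void main(String[] args) {
    long serverId = encodeWithOffset(42L, 3);       // 42 * 10 + 3 = 423
    System.out.println(serverId);
    System.out.println(decodeBrokerRequestId(serverId));
  }
}
```

The offset scheme lets a response be mapped back to its broker request by integer division, at the cost of a hard fan-out limit; consecutive ids have no such limit but carry no recoverable structure, which is the distinction the reviewer is probing.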
########## pinot-common/src/main/java/org/apache/pinot/common/datatable/DataTableImplV4.java: ##########

@@ -94,6 +94,7 @@ public class DataTableImplV4 implements DataTable {
   protected ByteBuffer _fixedSizeData;
   protected byte[] _variableSizeDataBytes;
   protected ByteBuffer _variableSizeData;
+  // TODO(egalpin): add query hash to metadata, alongside requestId

Review Comment:
   Does this still apply?

########## pinot-core/src/main/java/org/apache/pinot/core/transport/ServerRoutingInstance.java: ##########

@@ -23,43 +23,36 @@
 import java.util.Map;
 import java.util.concurrent.ConcurrentHashMap;
 import javax.annotation.concurrent.ThreadSafe;
-import org.apache.pinot.spi.config.table.TableType;
 import org.apache.pinot.spi.utils.CommonConstants.Helix;

 /**
  * The {@code ServerRoutingInstance} class represents the routing target instance which contains the information of
- * hostname, port, and table type it serves.
- * <p>Different table types on same host and port are counted as different instances. Therefore, one single Pinot Server
- * might be treated as two different routing target instances based on the types of table it serves.
+ * hostname and port.
  */
 @ThreadSafe
 public final class ServerRoutingInstance {

Review Comment:
   Since the table type info is no longer maintained here, we should be able to remove this class and directly use `ServerInstance` as the key of the map.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: commits-h...@pinot.apache.org
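As an aside on the `ServerRoutingInstance` suggestion above: once table type is dropped from the routing key, two entries for the same host and port collapse into one map entry. The sketch below illustrates that effect with a hypothetical value object keyed only on hostname and port; `ServerInstanceKey` is an illustrative stand-in, not Pinot's actual `ServerInstance` class.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a routing key that carries only hostname and port.
// With equals/hashCode defined on those two fields alone, OFFLINE and
// REALTIME responses from the same server share one map entry.
public class RoutingKeySketch {
  public static final class ServerInstanceKey {
    final String hostname;
    final int port;

    public ServerInstanceKey(String hostname, int port) {
      this.hostname = hostname;
      this.port = port;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof ServerInstanceKey)) {
        return false;
      }
      ServerInstanceKey other = (ServerInstanceKey) o;
      return port == other.port && hostname.equals(other.hostname);
    }

    @Override
    public int hashCode() {
      return hostname.hashCode() * 31 + port;
    }
  }

  public static void main(String[] args) {
    Map<ServerInstanceKey, String> responses = new HashMap<>();
    // Same host/port now maps to a single entry regardless of table type.
    responses.put(new ServerInstanceKey("server1", 8098), "offline response");
    responses.put(new ServerInstanceKey("server1", 8098), "realtime response");
    System.out.println(responses.size());
  }
}
```

This is the behavioral consequence of the reviewer's suggestion: whether that collapse is desirable depends on whether any caller still needs per-table-type entries, which is exactly what the PR is changing.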