Copilot commented on code in PR #354:
URL: 
https://github.com/apache/doris-spark-connector/pull/354#discussion_r3019635154


##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/read/DorisFlightSqlReader.java:
##########
@@ -112,6 +124,8 @@ public Object next() throws DorisException {
 
     @Override
     public void close() {
+        log.info("Partition read finished: tablet={}, batches={}, 
totalRows={}",
+                partition.getTablets(), totalBatches, totalRows);

Review Comment:
   `partition.getTablets()` is a `Long[]`, so this summary log will print an 
array identity instead of the tablet IDs. Use 
`Arrays.toString(partition.getTablets())` (or similar) and consider renaming 
the placeholder/key to `tablets` since it may contain multiple IDs.



##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/read/DorisFlightSqlReader.java:
##########
@@ -91,7 +94,16 @@ public DorisFlightSqlReader(DorisReaderPartition partition) 
throws Exception {
     public boolean hasNext() throws DorisException {
         if (!endOfStream.get() && (rowBatch == null || !rowBatch.hasNext())) {
             try {
+                long batchStart = System.currentTimeMillis();
                 endOfStream.set(!arrowReader.loadNextBatch());
+                if (!endOfStream.get()) {
+                    VectorSchemaRoot root = arrowReader.getVectorSchemaRoot();
+                    int rows = root.getRowCount();
+                    totalBatches++;
+                    totalRows += rows;
+                    log.info("Batch loaded: tablet={}, rows={}, cost={}ms",
+                            partition.getTablets(), rows, 
System.currentTimeMillis() - batchStart);

Review Comment:
   `partition.getTablets()` is a `Long[]`, so logging it with `{}` will print 
the Java array identity (e.g., `[Ljava.lang.Long;@...`) rather than the tablet 
IDs. Format it explicitly (e.g., `Arrays.toString(...)` or 
`StringUtils.join(...)`) so the batch log is usable.
   ```suggestion
                               StringUtils.join(partition.getTablets(), ','), 
rows, System.currentTimeMillis() - batchStart);
   ```



##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/read/DorisFlightSqlReader.java:
##########
@@ -91,7 +94,16 @@ public DorisFlightSqlReader(DorisReaderPartition partition) 
throws Exception {
     public boolean hasNext() throws DorisException {
         if (!endOfStream.get() && (rowBatch == null || !rowBatch.hasNext())) {
             try {
+                long batchStart = System.currentTimeMillis();
                 endOfStream.set(!arrowReader.loadNextBatch());
+                if (!endOfStream.get()) {
+                    VectorSchemaRoot root = arrowReader.getVectorSchemaRoot();
+                    int rows = root.getRowCount();
+                    totalBatches++;
+                    totalRows += rows;
+                    log.info("Batch loaded: tablet={}, rows={}, cost={}ms",

Review Comment:
   This per-batch `log.info` runs once for every Arrow batch and can generate 
very high log volume for large scans, impacting both performance and 
operational noise. Consider lowering to `debug` (or gating behind a 
config/threshold) and keeping `info` for partition-level summaries.
   ```suggestion
                       log.debug("Batch loaded: tablet={}, rows={}, cost={}ms",
   ```



##########
spark-doris-connector/spark-doris-connector-base/src/main/java/org/apache/doris/spark/client/read/DorisFlightSqlReader.java:
##########
@@ -147,7 +161,9 @@ private ArrowReader executeQuery() throws AdbcException, 
OptionRequiredException
         String flightSql = generateQuerySql(partition);
         log.info("Query SQL Sending to Doris FE is: {}", flightSql);
         statement.setSqlQuery(flightSql);
+        long start = System.currentTimeMillis();
         AdbcStatement.QueryResult queryResult = statement.executeQuery();
+        log.info("Query submitted, tablet={}, cost={}ms", 
partition.getTablets(), System.currentTimeMillis() - start);

Review Comment:
   Same issue here: `partition.getTablets()` is a `Long[]`, so the log will not 
show the actual tablet IDs. Format the array explicitly (and consider using 
`tablets` instead of `tablet` in the message).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to