amogh-jahagirdar commented on code in PR #14822:
URL: https://github.com/apache/iceberg/pull/14822#discussion_r2628841011


##########
spark/v4.0/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestRemoteScanPlanning.java:
##########
@@ -0,0 +1,93 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.spark.extensions;
+
+import static org.assertj.core.api.Assertions.assertThat;
+
+import org.apache.iceberg.CatalogProperties;
+import org.apache.iceberg.ParameterizedTestExtension;
+import org.apache.iceberg.Parameters;
+import org.apache.iceberg.relocated.com.google.common.collect.ImmutableMap;
+import org.apache.iceberg.rest.RESTCatalogProperties;
+import org.apache.iceberg.spark.SparkCatalogConfig;
+import org.apache.iceberg.spark.sql.TestSelect;
+import org.junit.jupiter.api.Disabled;
+import org.junit.jupiter.api.TestTemplate;
+import org.junit.jupiter.api.extension.ExtendWith;
+
+@ExtendWith(ParameterizedTestExtension.class)
+public class TestRemoteScanPlanning extends TestSelect {
+  @Parameters(name = "catalogName = {0}, implementation = {1}, config = {2}, 
binaryTableName = {3}")
+  protected static Object[][] parameters() {
+    return new Object[][] {
+      {
+        SparkCatalogConfig.REST.catalogName(),
+        SparkCatalogConfig.REST.implementation(),
+        ImmutableMap.builder()
+            .putAll(SparkCatalogConfig.REST.properties())
+            .put(CatalogProperties.URI, 
restCatalog.properties().get(CatalogProperties.URI))
+            // this flag is typically only set by the server, but we set it 
from the client for
+            // testing
+            .put(RESTCatalogProperties.REST_SCAN_PLANNING_ENABLED, "true")
+            .build(),
+        SparkCatalogConfig.REST.catalogName() + ".default.binary_table"
+      }
+    };

Review Comment:
   I think I agree with @nastra here, I'd probably start out with a separate 
set of tests for RemoteScanPlanning. Ideally in the long run we should just be 
able to plug in remote planning basically everywhere but I'm not sure it's a 
good idea to _start_ there at least in main because it may be more noisy than 
it's worth at this stage.  And as @nastra said I believe there are tests which 
are predicated on client side planning. 
   
   We can always remove this down the line but for now I think we just want 
some reasonable confidence in the scan planning integration, rather than 
plugging in tests everywhere.
   
   I also do think test times are an issue  but I think the solution space for 
that is a bit different.
   
   I do think it's a good idea though @singhpk234 to do that separately, and we 
can use that to identify any gaps! 



##########
core/src/main/java/org/apache/iceberg/rest/RequiresRemoteScanPlanning.java:
##########
@@ -0,0 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.iceberg.rest;
+
+/** Marker interface to indicate whether a Table requires remote scan planning 
*/
+public interface RequiresRemoteScanPlanning {}

Review Comment:
   I do think that it's a good idea to decouple the concrete RESTTable type 
from any required planning behavior. Any table implementation can just choose 
to implement the interface rather than having to be tied to RESTTable, so it's 
an easy way to achieve flexibility without too much additional indirection or 
complexity. 
   
   More importantly though after more thought I think there's a conceptual 
difference between the type of the table and the planning mode, enough of a 
separation that justifies the additional interface at least in my head. Sure 
the current implementation could only return a RESTTable if remote planning is 
required but it does not _have_ to be that way, e.g. if I wanted to plugin my 
own client side of scan planning in my own table implementation for example 
(For tests or if I know I only care about a slimmed down version of the 
protocol, or some concurrency requirements that better fit my environment etc).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to