szehon-ho commented on code in PR #6698:
URL: https://github.com/apache/iceberg/pull/6698#discussion_r1131801430


##########
hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java:
##########
@@ -53,12 +70,14 @@ public class CachedClientPool implements 
ClientPool<IMetaStoreClient, TException
             properties,
             CatalogProperties.CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS,
             CatalogProperties.CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT);
+    this.keySuppliers =

Review Comment:
   While I like the keySupplier idea, wouldn't it be much simpler if we just 
use a static map of suppliers, like:
   
   ```
   static final Map<String, Supplier> keySuppliers = ImmutableMap.of(
      "uri", () -> {
                   try {
                     return UserGroupInformation.getCurrentUser();
                   } catch (IOException e) {
                     throw new UncheckedIOException(e);
                   }
                 }),
      "user_name" -> {
               () -> {
                   try {
                     String userName = 
UserGroupInformation.getCurrentUser().getUserName();
                     return UserNameElement.of(userName);
                   } catch (IOException e) {
                     throw new UncheckedIOException(e);
                   }
                 });
   }
   ```
   
   (as i dont see any need to generate this on the fly.)  And then here in 
CachedClientPool CTOR, we can just use key to use the conf to make a key?
   



##########
core/src/main/java/org/apache/iceberg/CatalogProperties.java:
##########
@@ -119,6 +119,26 @@ private CatalogProperties() {}
       "client.pool.cache.eviction-interval-ms";
   public static final long CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT =
       TimeUnit.MINUTES.toMillis(5);
+  /**
+   * A comma separated list of elements that are used to compose the key of 
the client pool cache.
+   *
+   * <p>The following elements are supported:
+   *
+   * <ul>
+   *   <li>URI - as specified by {@link CatalogProperties#URI}. URI will be 
the only element when

Review Comment:
   Actually , I am now thinking I don't see any use-case where you would turn 
off any of these from the key.  What do you think?  From user point of view, 
probably simpler is better and we should have reasonable defaults.  
   
   Maybe we could have a configurable key but just limit it to to add 
additional hive conf if we missed something, as a safety valve.  In any case, 
without dynamic loading, the user cant add additional code suppliers like ugi, 
and thus is limited to just setting config values here.
   
   cc @pvary @RussellSpitzer @flyrain for thoughts.



##########
hive-metastore/src/main/java/org/apache/iceberg/hive/CachedClientPool.java:
##########
@@ -87,4 +106,125 @@ public <R> R run(Action<R, IMetaStoreClient, TException> 
action, boolean retry)
       throws TException, InterruptedException {
     return clientPool().run(action, retry);
   }
+
+  @VisibleForTesting
+  static Key toKey(List<Supplier<Object>> suppliers) {
+    return 
Key.of(suppliers.stream().map(Supplier::get).collect(Collectors.toList()));
+  }
+
+  @VisibleForTesting
+  static List<Supplier<Object>> extractKeySuppliers(String cacheKeys, 
Configuration conf) {
+    URIElement uri = 
URIElement.of(conf.get(HiveConf.ConfVars.METASTOREURIS.varname, ""));
+    if (cacheKeys == null || cacheKeys.isEmpty()) {
+      return Collections.singletonList(() -> uri);
+    }
+
+    // generate key elements in a certain order, so that the Key instances are 
comparable
+    Set<KeyElementType> types = 
Sets.newTreeSet(Comparator.comparingInt(Enum::ordinal));
+    Map<String, String> confElements = Maps.newTreeMap();
+    for (String element : cacheKeys.split(",", -1)) {
+      String trimmed = element.trim();
+      if (trimmed.toLowerCase(Locale.ROOT).startsWith(CONF_ELEMENT_PREFIX)) {
+        String key = trimmed.substring(CONF_ELEMENT_PREFIX.length());
+        ValidationException.check(
+            !confElements.containsKey(key), "Conf key element %s already 
specified", key);
+        confElements.put(key, conf.get(key));
+      } else {
+        KeyElementType type = KeyElementType.valueOf(trimmed.toUpperCase());
+        switch (type) {
+          case URI:
+          case UGI:
+          case USER_NAME:
+            ValidationException.check(
+                types.add(type), "%s key element already specified", 
type.name());
+            break;
+          default:
+            throw new ValidationException("Unknown key element %s", trimmed);
+        }
+      }
+    }
+    ImmutableList.Builder<Supplier<Object>> suppliers = 
ImmutableList.builder();
+    for (KeyElementType type : types) {
+      switch (type) {
+        case URI:
+          suppliers.add(() -> uri);
+          break;
+        case UGI:
+          suppliers.add(
+              () -> {
+                try {
+                  return UserGroupInformation.getCurrentUser();
+                } catch (IOException e) {
+                  throw new UncheckedIOException(e);
+                }
+              });
+          break;
+        case USER_NAME:
+          suppliers.add(
+              () -> {
+                try {
+                  String userName = 
UserGroupInformation.getCurrentUser().getUserName();
+                  return UserNameElement.of(userName);
+                } catch (IOException e) {
+                  throw new UncheckedIOException(e);
+                }
+              });
+          break;
+        default:
+          throw new RuntimeException("Unexpected key element " + type.name());
+      }
+    }
+    for (String key : confElements.keySet()) {
+      ConfElement element = ConfElement.of(key, confElements.get(key));
+      suppliers.add(() -> element);
+    }
+    return suppliers.build();
+  }
+
+  @Value.Immutable

Review Comment:
   I'm strugling to see the value of adding these immutable value, although I 
may be missing something.  Why not just use String?



##########
core/src/main/java/org/apache/iceberg/CatalogProperties.java:
##########
@@ -119,6 +119,26 @@ private CatalogProperties() {}
       "client.pool.cache.eviction-interval-ms";
   public static final long CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT =
       TimeUnit.MINUTES.toMillis(5);
+  /**
+   * A comma separated list of elements that are used to compose the key of 
the client pool cache.
+   *
+   * <p>The following elements are supported:
+   *
+   * <ul>
+   *   <li>URI - as specified by {@link CatalogProperties#URI}. URI will be 
the only element when

Review Comment:
   I think URI should be removed here as it shouldn't be configurable, as it is 
always there.  
   
   Maybe we could mention in the javadoc that uri is always used , no matter 
what configurable keys the user passes in?  Something like:
   
   ```
   A comma separated list of elements used, in addition to the hive metastore 
uri, to compose the key of the client pool cache.
   ```



##########
core/src/main/java/org/apache/iceberg/CatalogProperties.java:
##########
@@ -119,6 +119,26 @@ private CatalogProperties() {}
       "client.pool.cache.eviction-interval-ms";
   public static final long CLIENT_POOL_CACHE_EVICTION_INTERVAL_MS_DEFAULT =
       TimeUnit.MINUTES.toMillis(5);
+  /**
+   * A comma separated list of elements that are used to compose the key of 
the client pool cache.
+   *
+   * <p>The following elements are supported:
+   *
+   * <ul>
+   *   <li>URI - as specified by {@link CatalogProperties#URI}. URI will be 
the only element when
+   *       this property is not set.
+   *   <li>UGI - the Hadoop UserGroupInformation instance that represents the 
current user using the
+   *       cache.
+   *   <li>USER_NAME - similar to UGI but only includes the user's name 
determined by
+   *       UserGroupInformation#getUserName.
+   *   <li>CONF - name of an arbitrary configuration. The value of the 
configuration will be

Review Comment:
   Can we also add a example what the user can pass in here?
   
   Also, what do you think about make the list here show lower-case, to match 
what you wrote (ie, there is CONF but then you mention conf.a.b.c)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to