dimas-b commented on code in PR #3823:
URL: https://github.com/apache/polaris/pull/3823#discussion_r2824674504


##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -91,31 +91,43 @@ public static void enforceFeatureEnabledOrThrow(
           .defaultValue(false)
           .buildFeatureConfiguration();
 
-  public static final FeatureConfiguration<Boolean> 
INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL =
-      PolarisConfiguration.<Boolean>builder()
-          .key("INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL")

Review Comment:
   Normally, we should not remove feature flags without a deprecation cycle, 
but I guess this one was never released, so it's ok... We have to be careful as 
the 1.4.0 release is in progress.



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/aws/AwsSessionTagsBuilder.java:
##########
@@ -40,61 +41,78 @@ private AwsSessionTagsBuilder() {
   }
 
   /**
-   * Builds a list of AWS STS session tags from the principal name and 
credential vending context.
-   * These tags will appear in CloudTrail events for correlation purposes.
+   * Builds a list of AWS STS session tags from the principal name and 
credential vending context,
+   * including only the fields specified in {@code enabledFields}.
    *
-   * <p>The trace ID tag is only included if {@link 
CredentialVendingContext#traceId()} is present.
-   * This is controlled at the source (StorageAccessConfigProvider) based on 
the
-   * INCLUDE_TRACE_ID_IN_SESSION_TAGS feature flag.
+   * <p>Only fields present in {@code enabledFields} will be included in the 
returned list. The
+   * trace ID tag is only included if it is both in {@code enabledFields} and 
present in {@link
+   * CredentialVendingContext#traceId()} (the context's traceId is populated 
by the caller based on
+   * configuration).
    *
    * @param principalName the name of the principal requesting credentials
-   * @param context the credential vending context containing catalog, 
namespace, table, roles, and
-   *     optionally trace ID
+   * @param context the credential vending context
+   * @param enabledFields the set of field names to include (see {@code
+   *     FeatureConfiguration.SUPPORTED_SESSION_TAG_FIELDS} for valid values)
    * @return a list of STS Tags to attach to the AssumeRole request
    */
-  public static List<Tag> buildSessionTags(String principalName, 
CredentialVendingContext context) {
+  public static List<Tag> buildSessionTags(
+      String principalName, CredentialVendingContext context, Set<String> 
enabledFields) {
     List<Tag> tags = new ArrayList<>();
 
-    // Always include all tags with "unknown" placeholder for missing values
-    // This ensures consistent tag presence in CloudTrail for correlation
-    tags.add(
-        Tag.builder()
-            .key(CredentialVendingContext.TAG_KEY_PRINCIPAL)
-            .value(truncateTagValue(principalName))
-            .build());
-    tags.add(
-        Tag.builder()
-            .key(CredentialVendingContext.TAG_KEY_ROLES)
-            
.value(truncateTagValue(context.activatedRoles().orElse(TAG_VALUE_UNKNOWN)))
-            .build());
-    tags.add(
-        Tag.builder()
-            .key(CredentialVendingContext.TAG_KEY_CATALOG)
-            
.value(truncateTagValue(context.catalogName().orElse(TAG_VALUE_UNKNOWN)))
-            .build());
-    tags.add(
-        Tag.builder()
-            .key(CredentialVendingContext.TAG_KEY_NAMESPACE)
-            
.value(truncateTagValue(context.namespace().orElse(TAG_VALUE_UNKNOWN)))
-            .build());
-    tags.add(
-        Tag.builder()
-            .key(CredentialVendingContext.TAG_KEY_TABLE)
-            
.value(truncateTagValue(context.tableName().orElse(TAG_VALUE_UNKNOWN)))
-            .build());
-
-    // Only include trace ID if it's present in the context.
-    // The context's traceId is only populated when 
INCLUDE_TRACE_ID_IN_SESSION_TAGS is enabled.
-    // This allows efficient credential caching when trace IDs are not needed 
in session tags.
-    context
-        .traceId()
-        .ifPresent(
-            traceId ->
-                tags.add(
-                    Tag.builder()
-                        .key(CredentialVendingContext.TAG_KEY_TRACE_ID)
-                        .value(truncateTagValue(traceId))
-                        .build()));
+    if (enabledFields.contains("realm")) {

Review Comment:
   WDYT about making an `enum` for `enabledFields` and putting the tag 
extraction code into it (parameterized in each value)?



##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -91,31 +91,43 @@ public static void enforceFeatureEnabledOrThrow(
           .defaultValue(false)
           .buildFeatureConfiguration();
 
-  public static final FeatureConfiguration<Boolean> 
INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL =
-      PolarisConfiguration.<Boolean>builder()
-          .key("INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL")
-          .description(
-              "If set to true, session tags (catalog, namespace, table, 
principal, roles) will be included\n"
-                  + "in AWS STS AssumeRole requests for credential vending. 
These tags appear in CloudTrail events,\n"
-                  + "enabling correlation between catalog operations and S3 
data access.\n"
-                  + "Requires the IAM role trust policy to allow 
sts:TagSession action.\n"
-                  + "Note that enabling this feature may lead to degradation 
in temporary credential caching as \n"
-                  + "catalog will no longer be able to reuse credentials for 
different tables/namespaces/roles.")
-          .defaultValue(false)
-          .buildFeatureConfiguration();
+  /**
+   * The set of fields that are supported as AWS STS session tag labels in 
credential vending.
+   *
+   * <p>Supported values:
+   *
+   * <ul>
+   *   <li>{@code realm} - The realm identifier for the request
+   *   <li>{@code catalog} - The name of the catalog vending credentials
+   *   <li>{@code namespace} - The namespace being accessed (dot-separated)
+   *   <li>{@code table} - The table name being accessed
+   *   <li>{@code principal} - The principal name requesting credentials
+   *   <li>{@code roles} - Comma-separated list of activated principal roles
+   *   <li>{@code trace_id} - OpenTelemetry trace ID (WARNING: disables 
credential caching)
+   * </ul>
+   */
+  public static final List<String> SUPPORTED_SESSION_TAG_FIELDS =
+      List.of("realm", "catalog", "namespace", "table", "principal", "roles", 
"trace_id");
 
-  public static final FeatureConfiguration<Boolean> 
INCLUDE_TRACE_ID_IN_SESSION_TAGS =
-      PolarisConfiguration.<Boolean>builder()
-          .key("INCLUDE_TRACE_ID_IN_SESSION_TAGS")
+  public static final FeatureConfiguration<List> 
SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL =
+      PolarisConfiguration.<List>builder()
+          .key("SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL")
           .description(
-              "If set to true (and 
INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL is also true), the OpenTelemetry\n"
-                  + "trace ID will be included as a session tag in AWS STS 
AssumeRole requests. This enables\n"
-                  + "end-to-end correlation between catalog operations 
(Polaris events), credential vending (CloudTrail),\n"
-                  + "and metrics reports from compute engines.\n"
-                  + "WARNING: Enabling this feature completely disables 
credential caching because every request\n"
+              "A comma-separated list of fields to include as session tags in 
AWS STS AssumeRole requests\n"
+                  + "for credential vending. These tags appear in CloudTrail 
events, enabling correlation between\n"
+                  + "catalog operations and S3 data access. An empty list 
(default) disables session tags entirely.\n"
+                  + "Requires the IAM role trust policy to allow 
sts:TagSession action.\n"
+                  + "\n"
+                  + "Supported fields: realm, catalog, namespace, table, 
principal, roles, trace_id\n"
+                  + "\n"
+                  + "Note: each additional field may contribute to AWS STS 
packed policy size (max 2048 characters).\n"
+                  + "Fields with long values (e.g. deeply nested namespaces) 
can cause STS policy size limit errors.\n"
+                  + "Choose only the fields needed for your CloudTrail 
correlation requirements.\n"
+                  + "\n"
+                  + "WARNING: Including 'trace_id' completely disables 
credential caching because every request\n"
                   + "has a unique trace ID. This may significantly increase 
latency and STS API costs.\n"
-                  + "Consider leaving this disabled unless end-to-end tracing 
correlation is critical.")
-          .defaultValue(false)
+                  + "Fields that vary per-request (table, namespace, roles) 
degrade caching but do not disable it.")
+          .defaultValue(List.of())

Review Comment:
   Maybe `List.<String>of()`?



##########
CHANGELOG.md:
##########
@@ -43,14 +43,15 @@ request adding CHANGELOG notes for breaking (!) changes and 
possibly other secti
 - The `PolarisMetricsReporter.reportMetric()` method signature has been 
extended to include a `receivedTimestamp` parameter of type `java.time.Instant`.
 - The `ExternalCatalogFactory.createCatalog()` and `createGenericCatalog()` 
method signatures have been extended to include a `catalogProperties` parameter 
of type `Map<String, String>` for passing through proxy and timeout settings to 
federated catalog HTTP clients.
 - Metrics reporting now requires the `TABLE_READ_DATA` privilege on the target 
table for read (scan) metrics and `TABLE_WRITE_DATA` for write (commit) metrics.
+- The `INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL` and 
`INCLUDE_TRACE_ID_IN_SESSION_TAGS` feature flags have been removed and replaced 
by the `SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL` list-based configuration. 
Deployments using either removed flag must migrate to the new config (e.g., 
`SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL=catalog,namespace,table,principal,roles`).

Review Comment:
   `INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL` and 
`INCLUDE_TRACE_ID_IN_SESSION_TAGS` never got released... I do not think we need 
this statement at all.



##########
polaris-core/src/main/java/org/apache/polaris/core/storage/aws/AwsCredentialsStorageIntegration.java:
##########
@@ -97,8 +98,10 @@ public StorageAccessConfig getSubscopedCreds(
 
     boolean includePrincipalNameInSubscopedCredential =
         
realmConfig.getConfig(FeatureConfiguration.INCLUDE_PRINCIPAL_NAME_IN_SUBSCOPED_CREDENTIAL);
-    boolean includeSessionTags =
-        
realmConfig.getConfig(FeatureConfiguration.INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL);
+    List sessionTagFields =
+        
realmConfig.getConfig(FeatureConfiguration.SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL);

Review Comment:
   `getConfig()` is typed, AFAIK... Could we have `List<String>` here?



##########
polaris-core/src/main/java/org/apache/polaris/core/config/FeatureConfiguration.java:
##########
@@ -91,31 +91,43 @@ public static void enforceFeatureEnabledOrThrow(
           .defaultValue(false)
           .buildFeatureConfiguration();
 
-  public static final FeatureConfiguration<Boolean> 
INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL =
-      PolarisConfiguration.<Boolean>builder()
-          .key("INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL")
-          .description(
-              "If set to true, session tags (catalog, namespace, table, 
principal, roles) will be included\n"
-                  + "in AWS STS AssumeRole requests for credential vending. 
These tags appear in CloudTrail events,\n"
-                  + "enabling correlation between catalog operations and S3 
data access.\n"
-                  + "Requires the IAM role trust policy to allow 
sts:TagSession action.\n"
-                  + "Note that enabling this feature may lead to degradation 
in temporary credential caching as \n"
-                  + "catalog will no longer be able to reuse credentials for 
different tables/namespaces/roles.")
-          .defaultValue(false)
-          .buildFeatureConfiguration();
+  /**
+   * The set of fields that are supported as AWS STS session tag labels in 
credential vending.
+   *
+   * <p>Supported values:
+   *
+   * <ul>
+   *   <li>{@code realm} - The realm identifier for the request
+   *   <li>{@code catalog} - The name of the catalog vending credentials
+   *   <li>{@code namespace} - The namespace being accessed (dot-separated)
+   *   <li>{@code table} - The table name being accessed
+   *   <li>{@code principal} - The principal name requesting credentials
+   *   <li>{@code roles} - Comma-separated list of activated principal roles
+   *   <li>{@code trace_id} - OpenTelemetry trace ID (WARNING: disables 
credential caching)
+   * </ul>
+   */
+  public static final List<String> SUPPORTED_SESSION_TAG_FIELDS =
+      List.of("realm", "catalog", "namespace", "table", "principal", "roles", 
"trace_id");
 
-  public static final FeatureConfiguration<Boolean> 
INCLUDE_TRACE_ID_IN_SESSION_TAGS =
-      PolarisConfiguration.<Boolean>builder()
-          .key("INCLUDE_TRACE_ID_IN_SESSION_TAGS")
+  public static final FeatureConfiguration<List> 
SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL =
+      PolarisConfiguration.<List>builder()
+          .key("SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL")
           .description(
-              "If set to true (and 
INCLUDE_SESSION_TAGS_IN_SUBSCOPED_CREDENTIAL is also true), the OpenTelemetry\n"
-                  + "trace ID will be included as a session tag in AWS STS 
AssumeRole requests. This enables\n"
-                  + "end-to-end correlation between catalog operations 
(Polaris events), credential vending (CloudTrail),\n"
-                  + "and metrics reports from compute engines.\n"
-                  + "WARNING: Enabling this feature completely disables 
credential caching because every request\n"
+              "A comma-separated list of fields to include as session tags in 
AWS STS AssumeRole requests\n"
+                  + "for credential vending. These tags appear in CloudTrail 
events, enabling correlation between\n"
+                  + "catalog operations and S3 data access. An empty list 
(default) disables session tags entirely.\n"
+                  + "Requires the IAM role trust policy to allow 
sts:TagSession action.\n"
+                  + "\n"
+                  + "Supported fields: realm, catalog, namespace, table, 
principal, roles, trace_id\n"

Review Comment:
   nit: maybe concatenate with `SUPPORTED_SESSION_TAG_FIELDS`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to