Copilot commented on code in PR #10793:
URL: https://github.com/apache/gravitino/pull/10793#discussion_r3090243301


##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/GroupMetaBaseSQLProvider.java:
##########
@@ -182,4 +182,17 @@ public String deleteGroupMetasByLegacyTimeline(
         + GROUP_TABLE_NAME
         + " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT 
#{limit}";
   }
+
+  public String touchGroupUpdatedAt(@Param("groupId") long groupId, 
@Param("now") long now) {
+    return "UPDATE " + GROUP_TABLE_NAME + " SET updated_at = #{now} WHERE 
group_id = #{groupId}";
+  }
+
+  public String getGroupInfoByUserId(@Param("userId") long userId) {
+    return "SELECT gm.group_id as groupId, gm.updated_at as updatedAt"
+        + " FROM "
+        + GROUP_TABLE_NAME
+        + " gm"
+        + " JOIN group_user_rel gu ON gm.group_id = gu.group_id AND 
gu.deleted_at = 0"

Review Comment:
   `getGroupInfoByUserId` joins against `group_user_rel`, but this table does 
not exist in the shipped SQL schemas (e.g., 
`scripts/mysql/schema-1.2.0-mysql.sql`, `scripts/h2/schema-1.2.0-h2.sql`). This 
query will fail at runtime as soon as the mapper is used. Please either (1) add 
the missing `group_user_rel` table (and its upgrade/migration scripts for each 
supported backend), or (2) change the join to the correct existing relation 
table used to represent user↔group membership in the relational store.



##########
scripts/mysql/upgrade-1.2.0-to-1.3.0-mysql.sql:
##########
@@ -0,0 +1,81 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one
+-- or more contributor license agreements.  See the NOTICE file
+-- distributed with this work for additional information
+-- regarding copyright ownership.  The ASF licenses this file
+-- to you under the Apache License, Version 2.0 (the
+-- "License"). You may not use this file except in compliance
+-- with the License.  You may obtain a copy of the License at
+--
+--  http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing,
+-- software distributed under the License is distributed on an
+-- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+-- KIND, either express or implied.  See the License for the
+-- specific language governing permissions and limitations
+-- under the License.
+--
+
+-- Role privilege tracking (strong consistency -- Step 3 version check)
+ALTER TABLE `role_meta`
+    ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+    COMMENT 'Set to currentTimeMillis() on any privilege grant/revoke for this 
role.
+             JcasbinAuthorizer compares db.updated_at vs cached updated_at per 
request
+             to decide whether to reload JCasbin policies for this role.';
+
+-- User role assignment tracking (strong consistency -- Step 1a version check)
+ALTER TABLE `user_meta`
+    ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+    COMMENT 'Set to currentTimeMillis() on any role assign/revoke for this 
user.
+             JcasbinAuthorizer compares db.updated_at vs cached updated_at per 
request
+             to decide whether to reload the user-role mapping.';
+
+-- Group role assignment tracking (strong consistency -- Step 1b version check)
+ALTER TABLE `group_meta`
+    ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+    COMMENT 'Set to currentTimeMillis() on any role assign/revoke for this 
group.
+             JcasbinAuthorizer compares db.updated_at vs cached updated_at per 
request
+             to decide whether to reload the group-role mapping.';
+
+-- Ownership mutation tracking (eventual consistency -- owner change poller)
+ALTER TABLE `owner_meta`
+    ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+    COMMENT 'Set to currentTimeMillis() on any ownership transfer.
+             The owner change poller reads updated_at > maxSeen to find 
changed rows
+             and invalidates only the specific metadataObjectIds in 
ownerRelCache.';
+
+-- Covering indexes for high-frequency read predicates
+CREATE INDEX idx_user_meta_name_del_upd
+    ON user_meta (metalake_id, user_name, deleted_at, updated_at);
+CREATE INDEX idx_group_meta_del_upd
+    ON group_meta (group_id, deleted_at, updated_at);
+CREATE INDEX idx_role_meta_del_upd
+    ON role_meta (role_id, deleted_at, updated_at);
+CREATE INDEX idx_owner_meta_obj_del_upd
+    ON owner_meta (metadata_object_id, deleted_at, updated_at);

Review Comment:
   The new `idx_owner_meta_obj_del_upd (metadata_object_id, deleted_at, 
updated_at)` index does not support the new polling query `WHERE updated_at > ? 
ORDER BY updated_at` (the leading columns are not constrained), so MySQL cannot 
use it for the range scan/order-by. Consider adding an index that starts with 
`updated_at` (e.g., `(updated_at, metadata_object_id)` and optionally 
`deleted_at`) to match the `selectChangedOwners` access pattern.



##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/EntityChangeLogBaseSQLProvider.java:
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper.provider.base;
+
+import static 
org.apache.gravitino.storage.relational.mapper.EntityChangeLogMapper.ENTITY_CHANGE_LOG_TABLE_NAME;
+
+import org.apache.ibatis.annotations.Param;
+
+public class EntityChangeLogBaseSQLProvider {
+
+  public String selectEntityChanges(
+      @Param("createdAtAfter") long createdAtAfter, @Param("maxRows") int 
maxRows) {
+    return "SELECT metalake_name as metalakeName, entity_type as entityType,"
+        + " full_name as fullName, operate_type as operateType, created_at as 
createdAt"
+        + " FROM "
+        + ENTITY_CHANGE_LOG_TABLE_NAME
+        + " WHERE created_at > #{createdAtAfter} ORDER BY created_at LIMIT 
#{maxRows}";
+  }
+
+  public String insertEntityChange(
+      @Param("metalakeName") String metalakeName,
+      @Param("entityType") String entityType,
+      @Param("fullName") String fullName,
+      @Param("operateType") String operateType,
+      @Param("createdAt") long createdAt) {
+    return "INSERT INTO "
+        + ENTITY_CHANGE_LOG_TABLE_NAME
+        + " (metalake_name, entity_type, full_name, operate_type, created_at)"
+        + " VALUES (#{metalakeName}, #{entityType}, #{fullName}, 
#{operateType}, #{createdAt})";
+  }
+
+  public String pruneOldEntityChanges(@Param("before") long before) {
+    return "DELETE FROM "
+        + ENTITY_CHANGE_LOG_TABLE_NAME
+        + " WHERE created_at < #{before} LIMIT 1000";

Review Comment:
   `pruneOldEntityChanges` uses `DELETE ... LIMIT 1000`, which is not supported 
by PostgreSQL. Since `EntityChangeLogSQLProviderFactory` registers a PostgreSQL 
provider, this will break on that backend. Follow the existing pattern used by 
other PostgreSQL providers (e.g., delete via `WHERE id IN (SELECT id ... LIMIT 
...)` or use `USING`/CTE) by overriding `pruneOldEntityChanges` in a 
PostgreSQL-specific provider.



##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogSQLProviderFactory.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper;
+
+import com.google.common.collect.ImmutableMap;
+import java.util.Map;
+import org.apache.gravitino.storage.relational.JDBCBackend.JDBCBackendType;
+import 
org.apache.gravitino.storage.relational.mapper.provider.base.EntityChangeLogBaseSQLProvider;
+import org.apache.gravitino.storage.relational.session.SqlSessionFactoryHelper;
+import org.apache.ibatis.annotations.Param;
+
+public class EntityChangeLogSQLProviderFactory {
+
+  private static final Map<JDBCBackendType, EntityChangeLogBaseSQLProvider>
+      ENTITY_CHANGE_LOG_SQL_PROVIDER_MAP =
+          ImmutableMap.of(
+              JDBCBackendType.MYSQL, new EntityChangeLogMySQLProvider(),
+              JDBCBackendType.H2, new EntityChangeLogH2Provider(),
+              JDBCBackendType.POSTGRESQL, new 
EntityChangeLogPostgreSQLProvider());

Review Comment:
   `EntityChangeLogSQLProviderFactory` registers providers for 
MYSQL/H2/POSTGRESQL, but this PR only includes a MySQL upgrade script that 
creates the `entity_change_log` table and adds the `updated_at` columns. The 
H2/PostgreSQL schema scripts currently do not define `entity_change_log` and do 
not add `updated_at` to the auth tables, so running on those backends will fail 
once these mappers are exercised. Please add the corresponding DDL/migration 
updates for H2 and PostgreSQL (or restrict registration to backends where the 
schema is guaranteed to exist).



##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/OwnerMetaBaseSQLProvider.java:
##########
@@ -236,4 +236,17 @@ public String deleteOwnerMetasByLegacyTimeline(
         + OWNER_TABLE_NAME
         + " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT 
#{limit}";
   }
+
+  public String selectOwnerByMetadataObjectId(@Param("metadataObjectId") long 
metadataObjectId) {
+    return "SELECT owner_id as ownerId, owner_type as ownerType FROM "
+        + OWNER_TABLE_NAME
+        + " WHERE metadata_object_id = #{metadataObjectId} AND deleted_at = 0";
+  }
+
+  public String selectChangedOwners(@Param("updatedAtAfter") long 
updatedAtAfter) {
+    return "SELECT metadata_object_id as metadataObjectId, updated_at as 
updatedAt"
+        + " FROM "
+        + OWNER_TABLE_NAME
+        + " WHERE updated_at > #{updatedAtAfter} ORDER BY updated_at";

Review Comment:
   `selectChangedOwners` returns an unbounded result set (`WHERE updated_at > 
... ORDER BY updated_at` with no LIMIT). For a lagging poller or high update 
volume, this can lead to very large reads and long-running queries. Consider 
adding a `maxRows` parameter (similar to 
`EntityChangeLogBaseSQLProvider#selectEntityChanges`) and a supporting index on 
`updated_at` so the poller can process changes in bounded batches.



##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogMapper.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *  http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper;
+
+import java.util.List;
+import org.apache.gravitino.storage.relational.po.auth.EntityChangeRecord;
+import org.apache.ibatis.annotations.DeleteProvider;
+import org.apache.ibatis.annotations.InsertProvider;
+import org.apache.ibatis.annotations.Param;
+import org.apache.ibatis.annotations.SelectProvider;
+
+/**
+ * A MyBatis Mapper for entity_change_log table operations.
+ *
+ * <p>This append-only log tracks structural changes to entities (create, 
alter, drop) and is used
+ * by the entity change poller to drive targeted invalidation of the 
metadataIdCache on HA peer
+ * nodes.
+ */
+public interface EntityChangeLogMapper {
+
+  String ENTITY_CHANGE_LOG_TABLE_NAME = "entity_change_log";
+
+  @SelectProvider(
+      type = EntityChangeLogSQLProviderFactory.class,
+      method = "selectEntityChanges")
+  List<EntityChangeRecord> selectChanges(
+      @Param("createdAtAfter") long createdAtAfter, @Param("maxRows") int 
maxRows);
+
+  @InsertProvider(type = EntityChangeLogSQLProviderFactory.class, method = 
"insertEntityChange")
+  void insertChange(
+      @Param("metalakeName") String metalakeName,
+      @Param("entityType") String entityType,
+      @Param("fullName") String fullName,
+      @Param("operateType") String operateType,
+      @Param("createdAt") long createdAt);
+
+  @DeleteProvider(
+      type = EntityChangeLogSQLProviderFactory.class,
+      method = "pruneOldEntityChanges")
+  void pruneOldEntries(@Param("before") long before);

Review Comment:
   This PR adds new mapper operations (`touchUpdatedAt`, `getUserInfo`, 
`getGroupInfoByUserId`, entity_change_log insert/select/prune) but does not add 
tests validating the generated SQL and result mappings. Please add coverage (at 
least against the H2 test backend) for the happy-path and a couple of edge 
cases (e.g., deleted_at filtering, batchGetUpdatedAt with multiple roleIds, 
prune behavior) to prevent regressions.



##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/OwnerMetaBaseSQLProvider.java:
##########
@@ -236,4 +236,17 @@ public String deleteOwnerMetasByLegacyTimeline(
         + OWNER_TABLE_NAME
         + " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT 
#{limit}";
   }
+
+  public String selectOwnerByMetadataObjectId(@Param("metadataObjectId") long 
metadataObjectId) {
+    return "SELECT owner_id as ownerId, owner_type as ownerType FROM "
+        + OWNER_TABLE_NAME
+        + " WHERE metadata_object_id = #{metadataObjectId} AND deleted_at = 0";

Review Comment:
   `selectOwnerByMetadataObjectId` filters only by `metadata_object_id` (and 
`deleted_at = 0`) but returns a single `OwnerInfo`. In the `owner_meta` schema 
the active rows are not unique on `metadata_object_id` alone (it can vary by 
`metadata_object_type` and/or `owner_id`), so this can legitimately return 
multiple rows and cause MyBatis `TooManyResultsException` or nondeterministic 
results. Consider requiring `metadataObjectType` (and possibly `ownerType`) as 
parameters like the existing `select*OwnerMetaByMetadataObjectIdAndType` 
methods, or change the mapper return type to a `List<OwnerInfo>` with 
deterministic selection logic.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to