Copilot commented on code in PR #10793:
URL: https://github.com/apache/gravitino/pull/10793#discussion_r3090243301
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/GroupMetaBaseSQLProvider.java:
##########
@@ -182,4 +182,17 @@ public String deleteGroupMetasByLegacyTimeline(
+ GROUP_TABLE_NAME
+ " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT
#{limit}";
}
+
+ public String touchGroupUpdatedAt(@Param("groupId") long groupId,
@Param("now") long now) {
+ return "UPDATE " + GROUP_TABLE_NAME + " SET updated_at = #{now} WHERE
group_id = #{groupId}";
+ }
+
+ public String getGroupInfoByUserId(@Param("userId") long userId) {
+ return "SELECT gm.group_id as groupId, gm.updated_at as updatedAt"
+ + " FROM "
+ + GROUP_TABLE_NAME
+ + " gm"
+ + " JOIN group_user_rel gu ON gm.group_id = gu.group_id AND
gu.deleted_at = 0"
Review Comment:
`getGroupInfoByUserId` joins against `group_user_rel`, but this table does
not exist in the shipped SQL schemas (e.g.,
`scripts/mysql/schema-1.2.0-mysql.sql`, `scripts/h2/schema-1.2.0-h2.sql`). This
query will fail at runtime as soon as the mapper is used. Please either (1) add
the missing `group_user_rel` table (and its upgrade/migration scripts for each
supported backend), or (2) change the join to the correct existing relation
table used to represent user↔group membership in the relational store.
##########
scripts/mysql/upgrade-1.2.0-to-1.3.0-mysql.sql:
##########
@@ -0,0 +1,81 @@
+--
+-- Licensed to the Apache Software Foundation (ASF) under one
+-- or more contributor license agreements. See the NOTICE file
+-- distributed with this work for additional information
+-- regarding copyright ownership. The ASF licenses this file
+-- to you under the Apache License, Version 2.0 (the
+-- "License"). You may not use this file except in compliance
+-- with the License. You may obtain a copy of the License at
+--
+-- http://www.apache.org/licenses/LICENSE-2.0
+--
+-- Unless required by applicable law or agreed to in writing,
+-- software distributed under the License is distributed on an
+-- "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+-- KIND, either express or implied. See the License for the
+-- specific language governing permissions and limitations
+-- under the License.
+--
+
+-- Role privilege tracking (strong consistency -- Step 3 version check)
+ALTER TABLE `role_meta`
+ ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+ COMMENT 'Set to currentTimeMillis() on any privilege grant/revoke for this
role.
+ JcasbinAuthorizer compares db.updated_at vs cached updated_at per
request
+ to decide whether to reload JCasbin policies for this role.';
+
+-- User role assignment tracking (strong consistency -- Step 1a version check)
+ALTER TABLE `user_meta`
+ ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+ COMMENT 'Set to currentTimeMillis() on any role assign/revoke for this
user.
+ JcasbinAuthorizer compares db.updated_at vs cached updated_at per
request
+ to decide whether to reload the user-role mapping.';
+
+-- Group role assignment tracking (strong consistency -- Step 1b version check)
+ALTER TABLE `group_meta`
+ ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+ COMMENT 'Set to currentTimeMillis() on any role assign/revoke for this
group.
+ JcasbinAuthorizer compares db.updated_at vs cached updated_at per
request
+ to decide whether to reload the group-role mapping.';
+
+-- Ownership mutation tracking (eventual consistency -- owner change poller)
+ALTER TABLE `owner_meta`
+ ADD COLUMN `updated_at` BIGINT NOT NULL DEFAULT 0
+ COMMENT 'Set to currentTimeMillis() on any ownership transfer.
+ The owner change poller reads updated_at > maxSeen to find
changed rows
+ and invalidates only the specific metadataObjectIds in
ownerRelCache.';
+
+-- Covering indexes for high-frequency read predicates
+CREATE INDEX idx_user_meta_name_del_upd
+ ON user_meta (metalake_id, user_name, deleted_at, updated_at);
+CREATE INDEX idx_group_meta_del_upd
+ ON group_meta (group_id, deleted_at, updated_at);
+CREATE INDEX idx_role_meta_del_upd
+ ON role_meta (role_id, deleted_at, updated_at);
+CREATE INDEX idx_owner_meta_obj_del_upd
+ ON owner_meta (metadata_object_id, deleted_at, updated_at);
Review Comment:
The new `idx_owner_meta_obj_del_upd (metadata_object_id, deleted_at,
updated_at)` index does not support the new polling query `WHERE updated_at > ?
ORDER BY updated_at` (the leading columns are not constrained), so MySQL cannot
use it for the range scan/order-by. Consider adding an index that starts with
`updated_at` (e.g., `(updated_at, metadata_object_id)` and optionally
`deleted_at`) to match the `selectChangedOwners` access pattern.
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/EntityChangeLogBaseSQLProvider.java:
##########
@@ -0,0 +1,53 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper.provider.base;
+
+import static
org.apache.gravitino.storage.relational.mapper.EntityChangeLogMapper.ENTITY_CHANGE_LOG_TABLE_NAME;
+
+import org.apache.ibatis.annotations.Param;
+
+public class EntityChangeLogBaseSQLProvider {
+
+ public String selectEntityChanges(
+ @Param("createdAtAfter") long createdAtAfter, @Param("maxRows") int
maxRows) {
+ return "SELECT metalake_name as metalakeName, entity_type as entityType,"
+ + " full_name as fullName, operate_type as operateType, created_at as
createdAt"
+ + " FROM "
+ + ENTITY_CHANGE_LOG_TABLE_NAME
+ + " WHERE created_at > #{createdAtAfter} ORDER BY created_at LIMIT
#{maxRows}";
+ }
+
+ public String insertEntityChange(
+ @Param("metalakeName") String metalakeName,
+ @Param("entityType") String entityType,
+ @Param("fullName") String fullName,
+ @Param("operateType") String operateType,
+ @Param("createdAt") long createdAt) {
+ return "INSERT INTO "
+ + ENTITY_CHANGE_LOG_TABLE_NAME
+ + " (metalake_name, entity_type, full_name, operate_type, created_at)"
+ + " VALUES (#{metalakeName}, #{entityType}, #{fullName},
#{operateType}, #{createdAt})";
+ }
+
+ public String pruneOldEntityChanges(@Param("before") long before) {
+ return "DELETE FROM "
+ + ENTITY_CHANGE_LOG_TABLE_NAME
+ + " WHERE created_at < #{before} LIMIT 1000";
Review Comment:
`pruneOldEntityChanges` uses `DELETE ... LIMIT 1000`, which is not supported
by PostgreSQL. Since `EntityChangeLogSQLProviderFactory` registers a PostgreSQL
provider, this will break on that backend. Follow the existing pattern used by
other PostgreSQL providers (e.g., delete via `WHERE id IN (SELECT id ... LIMIT
...)` or use `USING`/CTE) by overriding `pruneOldEntityChanges` in a
PostgreSQL-specific provider.
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogSQLProviderFactory.java:
##########
@@ -0,0 +1,70 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper;
+
+import com.google.common.collect.ImmutableMap;
+import java.util.Map;
+import org.apache.gravitino.storage.relational.JDBCBackend.JDBCBackendType;
+import
org.apache.gravitino.storage.relational.mapper.provider.base.EntityChangeLogBaseSQLProvider;
+import org.apache.gravitino.storage.relational.session.SqlSessionFactoryHelper;
+import org.apache.ibatis.annotations.Param;
+
+public class EntityChangeLogSQLProviderFactory {
+
+ private static final Map<JDBCBackendType, EntityChangeLogBaseSQLProvider>
+ ENTITY_CHANGE_LOG_SQL_PROVIDER_MAP =
+ ImmutableMap.of(
+ JDBCBackendType.MYSQL, new EntityChangeLogMySQLProvider(),
+ JDBCBackendType.H2, new EntityChangeLogH2Provider(),
+ JDBCBackendType.POSTGRESQL, new
EntityChangeLogPostgreSQLProvider());
Review Comment:
`EntityChangeLogSQLProviderFactory` registers providers for
MYSQL/H2/POSTGRESQL, but this PR only includes a MySQL upgrade script that
creates the `entity_change_log` table and adds the `updated_at` columns. The
H2/PostgreSQL schema scripts currently do not define `entity_change_log` and do
not add `updated_at` to the auth tables, so running on those backends will fail
once these mappers are exercised. Please add the corresponding DDL/migration
updates for H2 and PostgreSQL (or restrict registration to backends where the
schema is guaranteed to exist).
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/OwnerMetaBaseSQLProvider.java:
##########
@@ -236,4 +236,17 @@ public String deleteOwnerMetasByLegacyTimeline(
+ OWNER_TABLE_NAME
+ " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT
#{limit}";
}
+
+ public String selectOwnerByMetadataObjectId(@Param("metadataObjectId") long
metadataObjectId) {
+ return "SELECT owner_id as ownerId, owner_type as ownerType FROM "
+ + OWNER_TABLE_NAME
+ + " WHERE metadata_object_id = #{metadataObjectId} AND deleted_at = 0";
+ }
+
+ public String selectChangedOwners(@Param("updatedAtAfter") long
updatedAtAfter) {
+ return "SELECT metadata_object_id as metadataObjectId, updated_at as
updatedAt"
+ + " FROM "
+ + OWNER_TABLE_NAME
+ + " WHERE updated_at > #{updatedAtAfter} ORDER BY updated_at";
Review Comment:
`selectChangedOwners` returns an unbounded result set (`WHERE updated_at >
... ORDER BY updated_at` with no LIMIT). For a lagging poller or high update
volume, this can lead to very large reads and long-running queries. Consider
adding a `maxRows` parameter (similar to
`EntityChangeLogBaseSQLProvider#selectEntityChanges`) and a supporting index on
`updated_at` so the poller can process changes in bounded batches.
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/EntityChangeLogMapper.java:
##########
@@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.storage.relational.mapper;
+
+import java.util.List;
+import org.apache.gravitino.storage.relational.po.auth.EntityChangeRecord;
+import org.apache.ibatis.annotations.DeleteProvider;
+import org.apache.ibatis.annotations.InsertProvider;
+import org.apache.ibatis.annotations.Param;
+import org.apache.ibatis.annotations.SelectProvider;
+
+/**
+ * A MyBatis Mapper for entity_change_log table operations.
+ *
+ * <p>This append-only log tracks structural changes to entities (create,
alter, drop) and is used
+ * by the entity change poller to drive targeted invalidation of the
metadataIdCache on HA peer
+ * nodes.
+ */
+public interface EntityChangeLogMapper {
+
+ String ENTITY_CHANGE_LOG_TABLE_NAME = "entity_change_log";
+
+ @SelectProvider(
+ type = EntityChangeLogSQLProviderFactory.class,
+ method = "selectEntityChanges")
+ List<EntityChangeRecord> selectChanges(
+ @Param("createdAtAfter") long createdAtAfter, @Param("maxRows") int
maxRows);
+
+ @InsertProvider(type = EntityChangeLogSQLProviderFactory.class, method =
"insertEntityChange")
+ void insertChange(
+ @Param("metalakeName") String metalakeName,
+ @Param("entityType") String entityType,
+ @Param("fullName") String fullName,
+ @Param("operateType") String operateType,
+ @Param("createdAt") long createdAt);
+
+ @DeleteProvider(
+ type = EntityChangeLogSQLProviderFactory.class,
+ method = "pruneOldEntityChanges")
+ void pruneOldEntries(@Param("before") long before);
Review Comment:
This PR adds new mapper operations (`touchUpdatedAt`, `getUserInfo`,
`getGroupInfoByUserId`, entity_change_log insert/select/prune) but does not add
tests validating the generated SQL and result mappings. Please add coverage (at
least against the H2 test backend) for the happy-path and a couple of edge
cases (e.g., deleted_at filtering, batchGetUpdatedAt with multiple roleIds,
prune behavior) to prevent regressions.
##########
core/src/main/java/org/apache/gravitino/storage/relational/mapper/provider/base/OwnerMetaBaseSQLProvider.java:
##########
@@ -236,4 +236,17 @@ public String deleteOwnerMetasByLegacyTimeline(
+ OWNER_TABLE_NAME
+ " WHERE deleted_at > 0 AND deleted_at < #{legacyTimeline} LIMIT
#{limit}";
}
+
+ public String selectOwnerByMetadataObjectId(@Param("metadataObjectId") long
metadataObjectId) {
+ return "SELECT owner_id as ownerId, owner_type as ownerType FROM "
+ + OWNER_TABLE_NAME
+ + " WHERE metadata_object_id = #{metadataObjectId} AND deleted_at = 0";
Review Comment:
`selectOwnerByMetadataObjectId` filters only by `metadata_object_id` (and
`deleted_at = 0`) but returns a single `OwnerInfo`. In the `owner_meta` schema
the active rows are not unique on `metadata_object_id` alone (it can vary by
`metadata_object_type` and/or `owner_id`), so this can legitimately return
multiple rows and cause MyBatis `TooManyResultsException` or nondeterministic
results. Consider requiring `metadataObjectType` (and possibly `ownerType`) as
parameters like the existing `select*OwnerMetaByMetadataObjectIdAndType`
methods, or change the mapper return type to a `List<OwnerInfo>` with
deterministic selection logic.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]