github-actions[bot] commented on code in PR #63310:
URL: https://github.com/apache/doris/pull/63310#discussion_r3427248747
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -574,7 +587,128 @@ public void replayEraseTable(long tableId) {
}
}
- private void erasePartition(long currentTimeMs, int keepNum) {
+ /**
+ * Erase all partitions belonging to the given table.
+ * This handles partitions that were not expired when erasePartition()
collected
+ * its expired IDs, but became expired later due to time advancement
during processing.
+ *
+ * <p>This method acquires its own locks with fine granularity:
+ * <ul>
+ * <li>Read lock to collect partition IDs</li>
+ * <li>Write lock for each individual partition cleanup</li>
+ * </ul>
+ *
+ * @param tableId the table ID whose partitions need to be erased
+ */
+ private void erasePartitionsForTable(long tableId) {
+ // 1. Collect orphan partition IDs under read lock (fast, non-blocking)
+ List<Long> partitionIds = new ArrayList<>();
+ readLock();
+ try {
+ for (Map.Entry<Long, RecyclePartitionInfo> entry :
idToPartition.entrySet()) {
+ if (entry.getValue().getTableId() == tableId) {
+ partitionIds.add(entry.getKey());
+ }
+ }
+ } finally {
+ readUnlock();
+ }
+
+ // 2. Clean each orphan partition with individual write lock (fine
granularity)
+ for (Long partitionId : partitionIds) {
+ writeLock();
+ try {
+ RecyclePartitionInfo pInfo = idToPartition.remove(partitionId);
+ if (pInfo == null) {
+ continue;
+ }
+ Partition partition = pInfo.getPartition();
+ Env.getCurrentEnv().onErasePartition(partition);
+ idToRecycleTime.remove(partitionId);
+
+ dbTblIdPartitionNameToIds.computeIfPresent(
+ Pair.of(pInfo.getDbId(), pInfo.getTableId()), (pair,
partitionMap) -> {
+ partitionMap.computeIfPresent(partition.getName(),
(name, idSet) -> {
+ idSet.remove(partitionId);
+ return idSet.isEmpty() ? null : idSet;
+ });
+ return partitionMap.isEmpty() ? null :
partitionMap;
+ });
+
+
Env.getCurrentEnv().getEditLog().logErasePartition(partitionId);
+ LOG.info("erase orphan partition[{}] when erasing table[{}]",
partitionId, tableId);
+ } finally {
+ writeUnlock();
+ }
+ }
+ }
+
+ /**
+ * Erase all tables belonging to the given database.
+ * This handles tables that were not expired when eraseTable() collected
+ * its expired IDs, but became expired later due to time advancement
during processing.
+ *
+ * <p>This method acquires its own locks with fine granularity:
+ * <ul>
+ * <li>Read lock to collect table IDs</li>
+ * <li>For each table, erase its partitions then erase the table
itself</li>
+ * </ul>
+ *
+ * @param dbInfo the RecycleDatabaseInfo containing the database and its
table metadata
+ */
+ private void eraseTablesForDatabase(RecycleDatabaseInfo dbInfo) {
+ long dbId = dbInfo.getDb().getId();
+ Set<String> tableNames = Sets.newHashSet(dbInfo.getTableNames());
+ Set<Long> tableIdSet = Sets.newHashSet(dbInfo.getTableIds());
+
+ // 1. Collect orphan table IDs under read lock
+ List<Long> tableIds = new ArrayList<>();
+ readLock();
+ try {
+ for (Map.Entry<Long, RecycleTableInfo> entry :
idToTable.entrySet()) {
+ RecycleTableInfo tableInfo = entry.getValue();
+ if (tableInfo.getDbId() == dbId
+ && tableNames.contains(tableInfo.getTable().getName())
Review Comment:
This still skips recycled tables from the same database if they were dropped
before the database drop. Concrete sequence: drop table `t_old` in db `d`, then
later drop database `d`. `RecycleDatabaseInfo` is built from `db.getTables()`
at the database drop, so it does not contain `t_old`. When the db expires, this
helper skips `t_old`, then `eraseDatabase()` removes/logs the db, leaving
`idToTable` with a dbId that is no longer in the catalog or `idToDatabase`.
That table is no longer recoverable and will be rebuilt from image as a
recycle-bin table whose database is missing. For database parent erasure,
please erase all recycled tables/partitions with this `dbId` before logging
`OP_ERASE_DB`, or otherwise explicitly clean the remaining children by dbId.
This is distinct from the existing thread's current-table drop-db case because
it covers tables that were already recycled before the database drop and
therefore are not in `dbInfo.getTableIds()`.
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CatalogRecycleBin.java:
##########
@@ -467,6 +479,7 @@ private void eraseTable(long currentTimeMs, int keepNum) {
if (tableInfo == null) {
continue;
}
+ erasePartitionsForTable(tableId);
Review Comment:
This fixes the normal expired-table path, but the parallel
`eraseTableWithSameName()` path still removes a recycled table without calling
`erasePartitionsForTable`. A concrete case is: drop partition `p` from table
`t`, then drop multiple tables named `t` in the same db so
`max_same_name_catalog_trash_num` evicts the older table after min latency.
`erasePartition()` can leave `p` because it is not expired and not over the
partition keep count, then `eraseTableWithSameName()` removes/logs the table
and leaves `p` with a missing parent table. Please apply the same partition
cascade before removing/logging the table in the same-name path, and add a test
for that path.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]