yuqi1129 commented on code in PR #10738:
URL: https://github.com/apache/gravitino/pull/10738#discussion_r3115747227


##########
core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java:
##########
@@ -877,9 +879,37 @@ private boolean containsUserCreatedSchemas(
     Set<String> availableSchemaNames =
         
Arrays.stream(allSchemas).map(NameIdentifier::name).collect(Collectors.toSet());
 
-    // some schemas are dropped externally, but still exist in the entity 
store, those schemas are
-    // invalid
-    return 
schemaEntities.stream().map(SchemaEntity::name).anyMatch(availableSchemaNames::contains);
+    // Some schemas are dropped externally but still exist in the entity store 
— those are invalid.
+    // Among schemas that exist in the underlying catalog, only those created 
via Gravitino carry a
+    // StringIdentifier in their external properties; imported schemas do not.
+    for (SchemaEntity schemaEntity : schemaEntities) {
+      if (!availableSchemaNames.contains(schemaEntity.name())) {
+        continue;
+      }
+
+      try {
+        Schema schema =
+            catalogWrapper.doWithSchemaOps(ops -> 
ops.loadSchema(schemaEntity.nameIdentifier()));
+        Map<String, String> props = schema.properties();
+        // If the backend cannot store a StringIdentifier (null or empty 
properties, e.g. MySQL
+        // which does not support schema comments), we cannot tell whether the 
schema was created
+        // by Gravitino or imported. Be conservative and treat it as 
user-created to avoid
+        // accidental data loss.
+        // Only skip a schema when properties are non-null, non-empty, and 
contain no
+        // StringIdentifier — the reliable signal that the schema was imported 
from an external
+        // catalog on a backend that does support identifier storage.
+        if (props == null || props.isEmpty() || 
StringIdentifier.fromProperties(props) != null) {

Review Comment:
   Do schemas loaded by  `catalogWrapper.doWithSchemaOps(ops -> 
ops.loadSchema(schemaEntity.nameIdentifier()))` really contain 
`StringIdentifier`? I remember that it will return schema in the external 
storage. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to