yuqi1129 commented on code in PR #10738:
URL: https://github.com/apache/gravitino/pull/10738#discussion_r3115747227
##########
core/src/main/java/org/apache/gravitino/catalog/CatalogManager.java:
##########
@@ -877,9 +879,37 @@ private boolean containsUserCreatedSchemas(
Set<String> availableSchemaNames =
Arrays.stream(allSchemas).map(NameIdentifier::name).collect(Collectors.toSet());
- // some schemas are dropped externally, but still exist in the entity
store, those schemas are
- // invalid
- return
schemaEntities.stream().map(SchemaEntity::name).anyMatch(availableSchemaNames::contains);
+ // Some schemas are dropped externally but still exist in the entity store
— those are invalid.
+ // Among schemas that exist in the underlying catalog, only those created
via Gravitino carry a
+ // StringIdentifier in their external properties; imported schemas do not.
+ for (SchemaEntity schemaEntity : schemaEntities) {
+ if (!availableSchemaNames.contains(schemaEntity.name())) {
+ continue;
+ }
+
+ try {
+ Schema schema =
+ catalogWrapper.doWithSchemaOps(ops ->
ops.loadSchema(schemaEntity.nameIdentifier()));
+ Map<String, String> props = schema.properties();
+ // If the backend cannot store a StringIdentifier (null or empty
properties, e.g. MySQL
+ // which does not support schema comments), we cannot tell whether the
schema was created
+ // by Gravitino or imported. Be conservative and treat it as
user-created to avoid
+ // accidental data loss.
+ // Only skip a schema when properties are non-null, non-empty, and
contain no
+ // StringIdentifier — the reliable signal that the schema was imported
from an external
+ // catalog on a backend that does support identifier storage.
+ if (props == null || props.isEmpty() ||
StringIdentifier.fromProperties(props) != null) {
Review Comment:
Do schemas loaded by `catalogWrapper.doWithSchemaOps(ops ->
ops.loadSchema(schemaEntity.nameIdentifier()))` really contain
`StringIdentifier`? I remember that it will return schema in the external
storage.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]