yuqi1129 commented on code in PR #10383: URL: https://github.com/apache/gravitino/pull/10383#discussion_r2959789684
##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseClusterUtils.java:
##########
@@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.clickhouse.operations;
+
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.commons.lang3.StringUtils;
+
+/**
+ * Utilities for embedding and extracting ClickHouse cluster metadata in object COMMENT fields.
+ *
+ * <p><b>Why COMMENT?</b>
+ *
+ * <p>ClickHouse does not persist {@code ON CLUSTER} information in any queryable system table for
+ * non-Replicated objects:
+ *
+ * <ul>
+ *   <li>{@code SHOW CREATE DATABASE} omits the {@code ON CLUSTER} clause.
+ *   <li>{@code SHOW CREATE TABLE} omits the {@code ON CLUSTER} clause (each node stores the local
+ *       DDL without the distribution directive).
+ *   <li>{@code system.databases.cluster} is only populated for {@code Replicated}-engine databases.
+ * </ul>
+ *
+ * <p>Gravitino therefore embeds the cluster name inside the object's COMMENT field at creation time
+ * using a non-printable SOH separator ({@code \u0001}). The metadata is invisible to end users
+ * because Gravitino strips it before surfacing the comment.
+ *
+ * <p><b>Stored format:</b> {@code userComment\u0001ch.cluster=clusterName}
+ *
+ * <p><b>Limitation:</b> This mechanism only works for databases and tables created through
+ * Gravitino. If a database or table was created directly in ClickHouse (bypassing Gravitino),
+ * Gravitino has no way to determine whether it was created {@code ON CLUSTER} or which cluster name
+ * was used. In that case {@link #extractClusterFromComment} returns {@code null} and the {@code
+ * on-cluster} / {@code cluster-name} properties reported by Gravitino will be absent or inaccurate.
+ */
+public final class ClickHouseClusterUtils {
+
+  /**
+   * Separator character between the user comment and the cluster metadata token. SOH (U+0001) is a
+   * non-printable control character that will not appear in normal user-supplied comments.
+   */
+  @VisibleForTesting public static final char CLUSTER_META_SEP = '\u0001';

Review Comment:
done
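
For readers following the thread, the stored format described in the Javadoc above can be illustrated with a minimal sketch. This is not the code from the PR: only the separator value, the `ch.cluster=` key, and the name `extractClusterFromComment` come from the quoted diff. The class name `ClusterCommentSketch` and the helpers `embedCluster` and `stripClusterMeta` are hypothetical, and the real implementation may handle edge cases differently.

```java
// Illustrative sketch only; not the implementation from the PR.
final class ClusterCommentSketch {

  // Same separator as in the quoted diff: SOH (U+0001).
  static final char CLUSTER_META_SEP = '\u0001';

  // Key taken from the documented stored format: userComment<SOH>ch.cluster=clusterName
  static final String CLUSTER_KEY = "ch.cluster=";

  private ClusterCommentSketch() {}

  /** Hypothetical helper: appends the cluster token to the user-visible comment. */
  static String embedCluster(String userComment, String clusterName) {
    String base = userComment == null ? "" : userComment;
    if (clusterName == null || clusterName.isEmpty()) {
      return base; // nothing to embed
    }
    return base + CLUSTER_META_SEP + CLUSTER_KEY + clusterName;
  }

  /** Returns the embedded cluster name, or null when the comment carries no token. */
  static String extractClusterFromComment(String storedComment) {
    if (storedComment == null) {
      return null;
    }
    int sep = storedComment.lastIndexOf(CLUSTER_META_SEP);
    if (sep < 0 || !storedComment.startsWith(CLUSTER_KEY, sep + 1)) {
      return null; // e.g. an object created directly in ClickHouse
    }
    String name = storedComment.substring(sep + 1 + CLUSTER_KEY.length());
    return name.isEmpty() ? null : name;
  }

  /** Hypothetical helper: removes the metadata token before the comment is shown to users. */
  static String stripClusterMeta(String storedComment) {
    if (storedComment == null) {
      return null;
    }
    int sep = storedComment.lastIndexOf(CLUSTER_META_SEP);
    if (sep < 0 || !storedComment.startsWith(CLUSTER_KEY, sep + 1)) {
      return storedComment; // no metadata present, return unchanged
    }
    return storedComment.substring(0, sep);
  }

  public static void main(String[] args) {
    String stored = embedCluster("orders table", "prod_cluster");
    System.out.println(extractClusterFromComment(stored));
    System.out.println(stripClusterMeta(stored));
    System.out.println(extractClusterFromComment("created outside Gravitino"));
  }
}
```

Under these assumptions, a comment without the SOH token yields `null` from `extractClusterFromComment`, which matches the limitation noted in the Javadoc for objects created outside Gravitino.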
