Copilot commented on code in PR #10562:
URL: https://github.com/apache/gravitino/pull/10562#discussion_r3045197434
##########
catalogs-contrib/catalog-jdbc-clickhouse/src/main/java/org/apache/gravitino/catalog/clickhouse/operations/ClickHouseTableOperations.java:
##########
@@ -1265,12 +1271,18 @@ private List<Index> getSecondaryIndexes(
String name = resultSet.getString("name");
String type = resultSet.getString("type");
String expression = resultSet.getString("expr");
+ String granularity = resultSet.getString("granularity");
try {
String[][] fields = parseIndexFields(expression);
if (ArrayUtils.isEmpty(fields)) {
continue;
}
- secondaryIndexes.add(Indexes.of(getClickHouseIndexType(type),
name, fields));
+ Map<String, String> indexProperties = ImmutableMap.of();
+ if (StringUtils.isNotBlank(granularity)) {
+ indexProperties = ImmutableMap.of(GRANULARITY, granularity);
+ }
+ secondaryIndexes.add(
+ Indexes.of(getClickHouseIndexType(type), name, fields,
indexProperties));
Review Comment:
`getSecondaryIndexes` only adds `properties.granularity` when the
`system.data_skipping_indices.granularity` value is non-blank. However this
PR’s documented behavior says `loadTable` should return the *effective*
granularity (including defaults) for supported data-skipping indexes. To make
load behavior consistent and resilient to missing/blank metadata, consider
populating `properties.granularity` with the default value for the index type
when `granularity` is null/blank (instead of omitting the key).
##########
docs/jdbc-clickhouse-catalog.md:
##########
@@ -240,8 +240,31 @@ If you need Gravitino to manage an existing cluster
database or table, recreate
- `PRIMARY_KEY`
- Data-skipping indexes:
- - `DATA_SKIPPING_MINMAX` (`GRANULARITY` fixed to 1)
- - `DATA_SKIPPING_BLOOM_FILTER` (`GRANULARITY` fixed to 3)
+ - `DATA_SKIPPING_MINMAX`
+ - `DATA_SKIPPING_BLOOM_FILTER`
+
+#### Properties
+
+For data-skipping indexes, Gravitino supports catalog-specific index
`properties`.
+
+##### `granularity`
+
+| Property Name | Description | Validation
|
+|----------------|-----------------------------------------|----------------------------------------------------------------------------|
+| `granularity` | Maps to ClickHouse `GRANULARITY` clause | Property key must
be lowercase `granularity`; value must be a positive integer |
+
+Defaults by index type:
+
+| Index Type | Default `granularity` | Notes
|
+|--------------------------------|------------------------|----------------------------------------------------|
+| `DATA_SKIPPING_MINMAX` | `1` | Used when
`properties.granularity` is not provided |
+| `DATA_SKIPPING_BLOOM_FILTER` | `3` | Used when
`properties.granularity` is not provided |
+
+Behavior:
+
+- If `properties.granularity` is provided, the configured value is used for
`CREATE TABLE` and `ALTER TABLE ... ADD INDEX`.
+- If `properties.granularity` is not provided, Gravitino uses the default
value for each index type.
+- On table load, Gravitino reads ClickHouse index metadata and returns the
effective `granularity` in index properties for supported data-skipping indexes.
Review Comment:
The new `loadTable` behavior is documented as returning the effective
`granularity` in index properties, but the current implementation may omit
`properties.granularity` when ClickHouse reports it as blank/null. Either
ensure the implementation always returns an effective value (preferred), or
clarify the doc to describe when the property may be absent.
```suggestion
- On table load, Gravitino reads ClickHouse index metadata for supported
data-skipping indexes.
- If ClickHouse reports a non-blank `granularity` value, Gravitino returns
it in index properties.
- If ClickHouse reports `granularity` as blank or `null`,
`properties.granularity` may be absent on the loaded table metadata.
```
##########
catalogs-contrib/catalog-jdbc-clickhouse/src/test/java/org/apache/gravitino/catalog/clickhouse/operations/TestClickHouseTableOperationsGranularity.java:
##########
@@ -0,0 +1,352 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.gravitino.catalog.clickhouse.operations;
+
+import static
org.apache.gravitino.catalog.clickhouse.ClickHouseConstants.IndexConstants.DEFAULT_BLOOM_FILTER_GRANULARITY;
+import static
org.apache.gravitino.catalog.clickhouse.ClickHouseConstants.IndexConstants.GRANULARITY;
+import static
org.apache.gravitino.catalog.clickhouse.ClickHouseUtils.getSortOrders;
+import static org.apache.gravitino.rel.Column.DEFAULT_VALUE_NOT_SET;
+
+import java.lang.reflect.InvocationTargetException;
+import java.lang.reflect.Method;
+import java.sql.Connection;
+import java.sql.PreparedStatement;
+import java.sql.ResultSet;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import
org.apache.gravitino.catalog.clickhouse.converter.ClickHouseColumnDefaultValueConverter;
+import
org.apache.gravitino.catalog.clickhouse.converter.ClickHouseExceptionConverter;
+import
org.apache.gravitino.catalog.clickhouse.converter.ClickHouseTypeConverter;
+import org.apache.gravitino.catalog.jdbc.JdbcColumn;
+import org.apache.gravitino.catalog.jdbc.JdbcTable;
+import org.apache.gravitino.rel.TableChange;
+import org.apache.gravitino.rel.expressions.distributions.Distributions;
+import org.apache.gravitino.rel.expressions.sorts.SortOrder;
+import org.apache.gravitino.rel.expressions.transforms.Transform;
+import org.apache.gravitino.rel.expressions.transforms.Transforms;
+import org.apache.gravitino.rel.indexes.Index;
+import org.apache.gravitino.rel.indexes.Indexes;
+import org.apache.gravitino.rel.types.Types;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+import org.mockito.Mockito;
+
+public class TestClickHouseTableOperationsGranularity {
+
+ private TestableClickHouseTableOperations newOps() {
+ TestableClickHouseTableOperations ops = new
TestableClickHouseTableOperations();
+ ops.initialize(
+ null,
+ new ClickHouseExceptionConverter(),
+ new ClickHouseTypeConverter(),
+ new ClickHouseColumnDefaultValueConverter(),
+ new HashMap<>());
+ return ops;
+ }
+
+ @Test
+ void testGenerateCreateTableSqlGranularityDefaultAndConfigured() {
+ TestableClickHouseTableOperations ops = newOps();
+ JdbcColumn[] cols =
+ new JdbcColumn[] {
+ JdbcColumn.builder()
+ .withName("id")
+ .withType(Types.IntegerType.get())
+ .withNullable(false)
+ .build(),
+ JdbcColumn.builder()
+ .withName("amount")
+ .withType(Types.DoubleType.get())
+ .withNullable(true)
+ .build(),
+ JdbcColumn.builder()
+ .withName("note")
+ .withType(Types.StringType.get())
+ .withNullable(true)
+ .build()
+ };
+ Index[] indexes =
+ new Index[] {
+ Indexes.primary(Indexes.DEFAULT_PRIMARY_KEY_NAME, new String[][]
{{"id"}}),
+ Indexes.of(
+ Index.IndexType.DATA_SKIPPING_MINMAX,
+ "idx_amount_mm",
+ new String[][] {{"amount"}},
+ Map.of(GRANULARITY, "9")),
+ Indexes.of(
+ Index.IndexType.DATA_SKIPPING_BLOOM_FILTER, "idx_note_bf", new
String[][] {{"note"}})
+ };
+
+ String sql =
+ ops.buildCreateSql(
+ "t_granularity",
+ cols,
+ "comment",
+ new HashMap<>(),
+ Transforms.EMPTY_TRANSFORM,
+ Distributions.NONE,
+ indexes,
+ getSortOrders("id"));
+
+ Assertions.assertTrue(sql.contains("INDEX `idx_amount_mm` `amount` TYPE
minmax GRANULARITY 9"));
+ Assertions.assertTrue(
+ sql.contains(
+ "INDEX `idx_note_bf` `note` TYPE bloom_filter GRANULARITY "
+ + DEFAULT_BLOOM_FILTER_GRANULARITY));
+ }
+
+ @Test
+ void testGenerateCreateTableSqlRejectInvalidGranularity() {
+ TestableClickHouseTableOperations ops = newOps();
+ JdbcColumn[] cols =
+ new JdbcColumn[] {
+ JdbcColumn.builder()
+ .withName("id")
+ .withType(Types.IntegerType.get())
+ .withNullable(false)
+ .build(),
+ JdbcColumn.builder()
+ .withName("amount")
+ .withType(Types.DoubleType.get())
+ .withNullable(true)
+ .build()
+ };
+
+ assertCreateGranularityInvalid(
+ ops, cols, Map.of("GRANULARITY", "9"), "unsupported property key");
+ assertCreateGranularityInvalid(
+ ops, cols, Map.of(GRANULARITY, " "), "It must be a positive integer.");
+ assertCreateGranularityInvalid(
+ ops, cols, Map.of(GRANULARITY, "abc"), "It must be a positive
integer.");
+ assertCreateGranularityInvalid(
+ ops, cols, Map.of(GRANULARITY, "0"), "It must be a positive integer.");
+ assertCreateGranularityInvalid(
+ ops, cols, Map.of(GRANULARITY, "-1"), "It must be a positive
integer.");
+ }
+
+ @Test
+ void testGenerateAlterTableSqlRejectInvalidGranularity() {
+ TestableClickHouseTableOperations ops = newOps();
+ ops.setTable(buildStubTable());
+
+ assertAlterGranularityInvalid(ops, Map.of("GRANULARITY", "9"),
"unsupported property key");
+ assertAlterGranularityInvalid(ops, Map.of(GRANULARITY, " "), "It must be a
positive integer.");
+ assertAlterGranularityInvalid(
+ ops, Map.of(GRANULARITY, "abc"), "It must be a positive integer.");
+ assertAlterGranularityInvalid(ops, Map.of(GRANULARITY, "0"), "It must be a
positive integer.");
+ assertAlterGranularityInvalid(ops, Map.of(GRANULARITY, "-1"), "It must be
a positive integer.");
+ }
+
+ @Test
+ void testGenerateAlterTableSqlGranularityDefaultAndConfigured() {
+ TestableClickHouseTableOperations ops = newOps();
+ ops.setTable(buildStubTable());
+
+ String minmaxSql =
+ ops.buildAlterSql(
+ "db",
+ "tbl",
+ new TableChange[] {
+ TableChange.addIndex(
+ Index.IndexType.DATA_SKIPPING_MINMAX,
+ "idx_amount_mm",
+ new String[][] {{"amount"}},
+ Map.of(GRANULARITY, "11"))
+ });
+ Assertions.assertTrue(minmaxSql.contains("ADD INDEX `idx_amount_mm`
`amount` TYPE minmax"));
+ Assertions.assertTrue(minmaxSql.contains("GRANULARITY 11"));
+
+ String bloomSql =
+ ops.buildAlterSql(
+ "db",
+ "tbl",
+ new TableChange[] {
+ TableChange.addIndex(
+ Index.IndexType.DATA_SKIPPING_BLOOM_FILTER,
+ "idx_note_bf",
+ new String[][] {{"note"}})
+ });
+ Assertions.assertTrue(
+ bloomSql.contains(
+ "ADD INDEX `idx_note_bf` `note` TYPE bloom_filter GRANULARITY "
+ + DEFAULT_BLOOM_FILTER_GRANULARITY));
+ }
+
+ @Test
+ void testGetSecondaryIndexesSkipsUnsupportedRowsAndLoadsValidGranularity()
throws Exception {
+ TestableClickHouseTableOperations ops = newOps();
+ Connection connection = Mockito.mock(Connection.class);
+ PreparedStatement preparedStatement =
Mockito.mock(PreparedStatement.class);
+ ResultSet resultSet = Mockito.mock(ResultSet.class);
+
+
Mockito.when(connection.prepareStatement(Mockito.anyString())).thenReturn(preparedStatement);
+ Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet);
+
+ // Case 1: valid minmax index with explicit granularity.
+ // Case 2: valid bloom_filter index with blank granularity (no properties
key).
+ // Case 3: invalid expression, should be skipped.
+ // Case 4: unsupported index type, should be skipped.
+ Mockito.when(resultSet.next()).thenReturn(true, true, true, true, false);
+ Mockito.when(resultSet.getString("name"))
+ .thenReturn("idx_valid_mm", "idx_valid_bf", "idx_invalid_expr",
"idx_invalid_type");
+ Mockito.when(resultSet.getString("type"))
+ .thenReturn("minmax", "bloom_filter", "minmax", "unknown_type");
+ Mockito.when(resultSet.getString("expr"))
+ .thenReturn("`amount`", "`note`", "amount + 1", "`note`");
+ Mockito.when(resultSet.getString("granularity")).thenReturn("9", "", "7",
"5");
+
+ List<Index> indexes = ops.invokeGetSecondaryIndexes(connection, "db",
"tbl");
+
+ // Only Case 1 and Case 2 should be loaded.
+ Assertions.assertEquals(2, indexes.size());
+
+ // Case 1 verification: granularity is loaded into index properties.
+ Assertions.assertEquals("idx_valid_mm", indexes.get(0).name());
+ Assertions.assertEquals(Index.IndexType.DATA_SKIPPING_MINMAX,
indexes.get(0).type());
+ Assertions.assertEquals("9", indexes.get(0).properties().get(GRANULARITY));
+
+ // Case 2 verification: blank granularity does not create a granularity
property.
+ Assertions.assertEquals("idx_valid_bf", indexes.get(1).name());
+ Assertions.assertEquals(Index.IndexType.DATA_SKIPPING_BLOOM_FILTER,
indexes.get(1).type());
+
Assertions.assertFalse(indexes.get(1).properties().containsKey(GRANULARITY));
+ }
Review Comment:
This test asserts that a blank `granularity` from
`system.data_skipping_indices` results in *no* `properties.granularity` key.
That contradicts the documented/expected behavior in this PR that `loadTable`
returns the effective granularity (including defaults). If the intent is to
always expose an effective granularity, update this expectation to assert the
default granularity value is returned instead of omitting the property.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]