FANNG1 commented on PR #10721:
URL: https://github.com/apache/gravitino/pull/10721#issuecomment-4244828347
Thanks, both earlier concerns are cleanly addressed. One remaining issue
plus two nits.
## (Should fix) `distributionToProperties` drops `BUCKET_NUM` when `number
== AUTO`
```java
if (number != Distributions.AUTO) {
properties.put(PaimonConstants.BUCKET_NUM, String.valueOf(number));
}
```
Problems:
1. **Depends on Paimon's default for `bucket`**, which has drifted across
versions (historically `1` = single bucket, later `-1` = dynamic). On an older
Paimon, a round-tripped `SHOW CREATE TABLE` that omits `bucket` will silently
re-create as a single-bucket table — the same class of silent semantic change
this PR was meant to prevent.
2. **Inconsistent with server-side
`GravitinoPaimonTable.applyDistribution`** (`GravitinoPaimonTable.java:212`),
which unconditionally writes `BUCKET_NUM` regardless of AUTO.
Suggested fix — always write it, mapping AUTO to Paimon's `-1` sentinel
([docs](https://paimon.apache.org/docs/1.2/maintenance/configurations/#bucket)):
```java
properties.put(
PaimonConstants.BUCKET_NUM,
String.valueOf(number == Distributions.AUTO ? -1 : number));
```
## (Related) `getDistribution` doesn't normalize `bucket=-1`
- `bucket` blank → `Distributions.auto(HASH, ...)`
- `bucket=-1` → `Distributions.hash(-1, ...)`
After the fix above we'll write `-1` on load and re-parse it on the next
create, so these two paths must produce equal `Distribution` objects. Please
normalize:
```java
int parsed = Integer.parseInt(trimmedBucketValue);
return parsed == -1
? Distributions.auto(Strategy.HASH, expressions)
: Distributions.hash(parsed, expressions);
```
And add a test asserting `distributionToProperties(getDistribution(...))` is
idempotent for both the blank-`bucket` and explicit-`bucket=-1` cases.
## Nits
- `distributionToProperties` silently drops non-`NamedReference` expressions
via `.filter(...)`. Prefer `Preconditions.checkArgument` to fail loudly.
- `BaseCatalog.fromGravitinoDistribution` default returns `new HashMap<>()`
per call; `Collections.emptyMap()` works with `putAll`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]