FANNG1 commented on PR #10721:
URL: https://github.com/apache/gravitino/pull/10721#issuecomment-4244828347

   Thanks, both earlier concerns are cleanly addressed. One remaining issue 
plus two nits.
   
   ## (Should fix) `distributionToProperties` drops `BUCKET_NUM` when `number 
== AUTO`
   
   ```java
   if (number != Distributions.AUTO) {
     properties.put(PaimonConstants.BUCKET_NUM, String.valueOf(number));
   }
   ```
   
   Problems:
   
   1. **Depends on Paimon's default for `bucket`**, which has drifted across 
versions (historically `1` = single bucket, later `-1` = dynamic). On an older 
Paimon, a round-tripped `SHOW CREATE TABLE` that omits `bucket` will silently 
re-create as a single-bucket table — the same class of silent semantic change 
this PR was meant to prevent.
   2. **Inconsistent with server-side 
`GravitinoPaimonTable.applyDistribution`** (`GravitinoPaimonTable.java:212`), 
which unconditionally writes `BUCKET_NUM` regardless of AUTO.
   
   Suggested fix — always write it, mapping AUTO to Paimon's `-1` sentinel 
([docs](https://paimon.apache.org/docs/1.2/maintenance/configurations/#bucket)):
   
   ```java
   properties.put(
       PaimonConstants.BUCKET_NUM,
       String.valueOf(number == Distributions.AUTO ? -1 : number));
   ```
   
   ## (Related) `getDistribution` doesn't normalize `bucket=-1`
   
   - `bucket` blank → `Distributions.auto(HASH, ...)`
   - `bucket=-1` → `Distributions.hash(-1, ...)`
   
   After the fix above we'll write `-1` on load and re-parse it on the next 
create, so these two paths must produce equal `Distribution` objects. Please 
normalize:
   
   ```java
   int parsed = Integer.parseInt(trimmedBucketValue);
   return parsed == -1
       ? Distributions.auto(Strategy.HASH, expressions)
       : Distributions.hash(parsed, expressions);
   ```
   
   And add a test asserting `distributionToProperties(getDistribution(...))` is 
idempotent for both the blank-`bucket` and explicit-`bucket=-1` cases.
   
   ## Nits
   
   - `distributionToProperties` silently drops non-`NamedReference` expressions 
via `.filter(...)`. Prefer `Preconditions.checkArgument` to fail loudly.
   - `BaseCatalog.fromGravitinoDistribution` default returns `new HashMap<>()` 
per call; `Collections.emptyMap()` works with `putAll`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to