danielcweeks commented on code in PR #11112:
URL: https://github.com/apache/iceberg/pull/11112#discussion_r1806873426


##########
core/src/main/java/org/apache/iceberg/LocationProviders.java:
##########
@@ -172,10 +193,45 @@ private static String pathContext(String tableLocation) {
     }
 
     private String computeHash(String fileName) {
-      byte[] bytes = TEMP.get();
-      HashCode hash = HASH_FUNC.hashString(fileName, StandardCharsets.UTF_8);
-      hash.writeBytesTo(bytes, 0, 4);
-      return BASE64_ENCODER.encode(bytes);
+      HashCode hashCode = HASH_FUNC.hashString(fileName, StandardCharsets.UTF_8);
+
+      // {@link Integer#toBinaryString} excludes leading zeros, which we want to preserve.
+      // force the first bit to be set to get around that.
+      String hashAsBinaryString = Integer.toBinaryString(hashCode.asInt() | Integer.MIN_VALUE);
+      return dirsFromHash(hashAsBinaryString.substring(HASH_BINARY_STRING_START_INDEX));
+    }
+
+    /**
+     * Divides hash into directories for optimized orphan removal operation using ENTROPY_DIR_DEPTH
+     * and ENTROPY_DIR_LENGTH
+     *
+     * @param hash 10011001100110011001
+     * @return 1001/1001/1001/10011001 with depth 3 and length 4
+     */
+    private String dirsFromHash(String hash) {
+      if (hash.length() < ENTROPY_DIR_DEPTH * ENTROPY_DIR_LENGTH) {

Review Comment:
   We should use a `Preconditions` check here (for example `Preconditions.checkArgument`) instead of a hand-written `if`.
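
   For illustration only, here is a minimal sketch of what such a check might look like. It is not the PR's actual code. The constant values are taken from the Javadoc example in the diff (depth 3, length 4), the error message is hypothetical, and the local `checkArgument` helper is a stand-in for Guava's `Preconditions.checkArgument` so that the snippet compiles without extra dependencies.

```java
final class DirsFromHashSketch {
  // Assumed values, taken from the Javadoc example ("depth 3 and length 4").
  // The real constants are not visible in the quoted hunk.
  private static final int ENTROPY_DIR_DEPTH = 3;
  private static final int ENTROPY_DIR_LENGTH = 4;

  private DirsFromHashSketch() {}

  // Stand-in for Guava's Preconditions.checkArgument(boolean, String, Object...).
  // Throws IllegalArgumentException when the expression is false.
  public static void checkArgument(boolean expression, String template, Object... args) {
    if (!expression) {
      throw new IllegalArgumentException(String.format(template, args));
    }
  }

  // Splits the first DEPTH * LENGTH characters into DEPTH directories and
  // appends whatever remains of the hash as the final path segment.
  public static String dirsFromHash(String hash) {
    int minLength = ENTROPY_DIR_DEPTH * ENTROPY_DIR_LENGTH;
    // The precondition replaces an explicit "if (hash.length() < ...)" branch.
    checkArgument(
        hash.length() >= minLength,
        "Hash %s must have at least %s characters", // hypothetical message
        hash,
        minLength);

    StringBuilder dirs = new StringBuilder();
    for (int i = 0; i < ENTROPY_DIR_DEPTH; i++) {
      int start = i * ENTROPY_DIR_LENGTH;
      dirs.append(hash, start, start + ENTROPY_DIR_LENGTH).append('/');
    }
    dirs.append(hash.substring(minLength));
    return dirs.toString();
  }
}
```

   With these assumed constants, `dirsFromHash("10011001100110011001")` yields `1001/1001/1001/10011001`, matching the Javadoc example, and a hash shorter than 12 characters fails fast with an `IllegalArgumentException`.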



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

