gavinchou commented on code in PR #56861:
URL: https://github.com/apache/doris/pull/56861#discussion_r2446813782


##########
be/src/io/fs/azure_obj_storage_client.cpp:
##########
@@ -53,9 +53,39 @@ std::string wrap_object_storage_path_msg(const doris::io::ObjectStoragePathOptio
                        opts.path.native());
 }
 
-auto base64_encode_part_num(int part_num) {
-    return Aws::Utils::HashingUtils::Base64Encode(
-            {reinterpret_cast<unsigned char*>(&part_num), sizeof(part_num)});
+/**
+ * Encode a 32-bit part number into a Base64 string (blockId).
+ *
+ * Design goals:
+ *  1. Platform-independent: ignores machine endianness (little-endian / big-endian)
+ *  2. Fixed length: always uses 4 bytes (int32_t)
+ *  3. Fixed byte order: Big-Endian (network byte order)
+ *  4. Consistent with Java and front-end (FE):
+ *     - Java: ByteBuffer.order(ByteOrder.BIG_ENDIAN) + Base64 encoding
+ *     - FE: must follow the same Big-Endian + Base64 rule
+ *
+ * Rules:
+ *  - Input: 32-bit integer partNum
+ *  - Manually split into 4 bytes, high byte first (Big-Endian)
+ *  - Base64 encode the 4 bytes → 8-character string
+ *  - Ensures consistency across machines, languages, and platforms
+ *
+ * Example:
+ *  partNum = 1
+ *  buf = {0x00, 0x00, 0x00, 0x01}
+ *  Base64 encoding result = "AAAAAQ=="
+ *
+ * @param part_num the integer part ID to encode
+ * @return Base64-encoded blockId string
+ */
+auto base64_encode_part_num(int32_t part_num) {

Review Comment:
   We already have some encoding functions in be/src/util/coding.h; consider reusing them here.
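
   For reference, the Big-Endian + Base64 rule described in the doc comment above can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions, not the code in the PR: the helper names (`to_big_endian_bytes`, `base64_encode_4_bytes`, `encode_part_num_sketch`) are hypothetical, and the sketch uses neither be/src/util/coding.h nor the AWS SDK Base64 helper, whose exact interfaces are not shown in this thread.

```cpp
#include <array>
#include <cstdint>
#include <string>

// Hypothetical helper: split a 32-bit value into 4 bytes, high byte first,
// so the result does not depend on the host's endianness.
std::array<unsigned char, 4> to_big_endian_bytes(int32_t value) {
    const auto u = static_cast<uint32_t>(value);
    return {static_cast<unsigned char>((u >> 24) & 0xFF),
            static_cast<unsigned char>((u >> 16) & 0xFF),
            static_cast<unsigned char>((u >> 8) & 0xFF),
            static_cast<unsigned char>(u & 0xFF)};
}

// Hypothetical helper: standard Base64 (RFC 4648 alphabet) for exactly 4 bytes.
// 4 input bytes always produce 6 data characters plus "==" padding (8 total).
std::string base64_encode_4_bytes(const std::array<unsigned char, 4>& b) {
    static const char kAlphabet[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(8);

    // First three bytes form one 24-bit group -> four 6-bit characters.
    const uint32_t triple = (static_cast<uint32_t>(b[0]) << 16) |
                            (static_cast<uint32_t>(b[1]) << 8) |
                            static_cast<uint32_t>(b[2]);
    out.push_back(kAlphabet[(triple >> 18) & 0x3F]);
    out.push_back(kAlphabet[(triple >> 12) & 0x3F]);
    out.push_back(kAlphabet[(triple >> 6) & 0x3F]);
    out.push_back(kAlphabet[triple & 0x3F]);

    // Remaining single byte -> two characters, then two '=' padding characters.
    out.push_back(kAlphabet[(b[3] >> 2) & 0x3F]);
    out.push_back(kAlphabet[(b[3] & 0x03) << 4]);
    out.append("==");
    return out;
}

// Sketch of the rule: part number -> Big-Endian bytes -> Base64 block ID.
std::string encode_part_num_sketch(int32_t part_num) {
    return base64_encode_4_bytes(to_big_endian_bytes(part_num));
}
```

   With this sketch, `encode_part_num_sketch(1)` yields "AAAAAQ==", matching the example in the doc comment. Whether an existing helper in coding.h can replace the manual byte split depends on which byte orders that header supports.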



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

