[
https://issues.apache.org/jira/browse/HADOOP-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18009169#comment-18009169
]
ASF GitHub Bot commented on HADOOP-19604:
-----------------------------------------
anmolanmol1234 opened a new pull request, #7819:
URL: https://github.com/apache/hadoop/pull/7819
Jira: https://issues.apache.org/jira/browse/HADOOP-19604
Make BlockId computation consistent across clients for PutBlock and
PutBlockList by using blockCount instead of the offset.
Block IDs were previously derived from the data offset, which could lead to
inconsistency across different clients. The change now uses blockCount (i.e.,
the index of the block) to compute the Block ID, ensuring deterministic and
consistent ID generation for both PutBlock and PutBlockList operations across
clients.
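As a rough illustration of index-based block IDs, the sketch below derives an ID from the block's position in the stream rather than from its byte offset. It is not the code from the pull request: the class name, the `streamPrefix` parameter, and the `%s-%06d` layout are assumptions made for this example. It relies only on the service requirements that block IDs are Base64-encoded and that all IDs within one blob have the same length.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Locale;

// Hypothetical helper, not taken from the PR: shows one way to build a
// deterministic block ID from the block index (blockCount).
final class BlockIdSketch {
    private BlockIdSketch() {
    }

    // streamPrefix is an assumed per-stream identifier; blockIndex is the
    // zero-based index of the block within the stream.
    static String blockIdFromIndex(String streamPrefix, long blockIndex) {
        if (blockIndex < 0) {
            throw new IllegalArgumentException("blockIndex must be >= 0");
        }
        // Zero-padding keeps the raw ID at a fixed width for indices up to
        // 999999, so every encoded ID in the blob has the same length.
        String raw = String.format(Locale.ROOT, "%s-%06d", streamPrefix, blockIndex);
        return Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }
}
```

Because the ID depends only on the prefix and the index, a PutBlock call and the later PutBlockList call can each recompute the same ID independently, regardless of how many bytes the preceding blocks contained.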
Restrict URL encoding of certain JSON metadata during setXAttr calls.
When setting extended attributes (xAttrs), the JSON metadata
(hdi_permission) was previously URL-encoded, which could cause unnecessary
escaping or compatibility issues. This change ensures that only the metadata
that requires encoding is URL-encoded.
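The selective encoding described above could look roughly like the sketch below. It assumes that `hdi_permission` is among the attributes whose JSON value is sent unencoded; the class name, the `RAW_KEYS` set, and the `encodeValue` method are hypothetical and do not come from the pull request.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Collections;
import java.util.Set;

// Hypothetical helper, not taken from the PR: skips URL encoding for a
// fixed set of attribute names and encodes everything else.
final class XAttrEncodingSketch {
    // Assumption: attributes whose JSON values are passed through as-is.
    private static final Set<String> RAW_KEYS = Collections.singleton("hdi_permission");

    private XAttrEncodingSketch() {
    }

    static String encodeValue(String key, String value) {
        if (RAW_KEYS.contains(key)) {
            return value; // JSON metadata is left untouched
        }
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is a charset every Java runtime must support.
            throw new IllegalStateException("UTF-8 not available", e);
        }
    }
}
```

Keeping the exemptions in one set makes the rule easy to audit: any attribute not listed there is encoded as before.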
Maintain the MD5 hash of the whole block to validate data integrity during
flush.
During flush operations, the MD5 hash of the entire block's data is computed
and stored. This hash is later used to validate that the block was persisted
correctly, ensuring data integrity and helping detect corruption or
transmission errors.
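A minimal sketch of the whole-block check, assuming the hash is kept as a Base64 string (the usual Content-MD5 form) and compared against a hash recomputed over the persisted bytes. The class and method names are hypothetical and are not taken from the pull request.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper, not taken from the PR: computes and verifies the
// MD5 hash of a complete block.
final class BlockMd5Sketch {
    private BlockMd5Sketch() {
    }

    // Returns the Base64-encoded MD5 digest of the whole block.
    static String md5Base64(byte[] blockData) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return Base64.getEncoder().encodeToString(md5.digest(blockData));
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Compares the stored hash with the hash of the data that was persisted.
    static boolean matches(String storedMd5, byte[] persistedData) {
        byte[] expected = Base64.getDecoder().decode(storedMd5);
        byte[] actual = Base64.getDecoder().decode(md5Base64(persistedData));
        return MessageDigest.isEqual(expected, actual);
    }
}
```

In this sketch the hash would be computed when the block is uploaded and checked again at flush time; a mismatch indicates that the persisted bytes differ from what the client wrote.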
> ABFS: Fix WASB ABFS compatibility issues
> ----------------------------------------
>
> Key: HADOOP-19604
> URL: https://issues.apache.org/jira/browse/HADOOP-19604
> Project: Hadoop Common
> Issue Type: Sub-task
> Affects Versions: 3.4.1
> Reporter: Anmol Asrani
> Assignee: Anmol Asrani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.1
>
>
> Fix WASB ABFS compatibility issues. Fix issues such as:-
> # BlockId computation to be consistent across clients for PutBlock and
> PutBlockList
> # Restrict url encoding of certain json metadata during setXAttr calls.
> # Maintain the md5 hash of whole block to validate data integrity during
> flush.