[
https://issues.apache.org/jira/browse/HADOOP-19365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18023797#comment-18023797
]
ASF GitHub Bot commented on HADOOP-19365:
-----------------------------------------
ahmarsuhail opened a new pull request, #7723:
URL: https://github.com/apache/hadoop/pull/7723
### Description of PR
Relevant AAL PR: https://github.com/awslabs/analytics-accelerator-s3/pull/280
* S3A will pass in the span_id and operation_name when opening the stream
with AAL.
* AAL will attach these per request as executionAttributes.
* When the `modifyHttpRequest` executionInterceptor in the LoggingAuditor is
called, these are now available in the `ExecutionAttributes` and can be used
for logging.
* The referrer header is updated by setting the spanId and operation in the
builder.
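The flow in the list above can be sketched in plain Java. This is a toy model only: it does not use the AWS SDK, AAL, or S3A classes, and every name in it (`AuditAttributeSketch`, `ReferrerBuilder`, `modifyReferrer`, the attribute keys) is hypothetical. The URL layout simply mirrors the referrer values shown in the example logs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Toy model of passing span id and operation name as per-request
 * attributes and reading them back in an interceptor-like step.
 * All types and names here are hypothetical, not the real APIs.
 */
public class AuditAttributeSketch {

  // Hypothetical attribute keys; the real code uses typed ExecutionAttributes.
  static final String SPAN_ID = "span_id";
  static final String OPERATION_NAME = "operation_name";

  /** Stand-in for a referrer header builder. */
  static final class ReferrerBuilder {
    private final Map<String, String> params = new LinkedHashMap<>();
    private String spanId = "unknown";
    private String operation = "unknown";

    ReferrerBuilder withSpanId(String id) {
      this.spanId = id;
      return this;
    }

    ReferrerBuilder withOperation(String op) {
      this.operation = op;
      return this;
    }

    ReferrerBuilder withParam(String key, String value) {
      params.put(key, value);
      return this;
    }

    /** Builds a URL shaped like the referrer values in the example logs. */
    String build() {
      StringBuilder sb = new StringBuilder("https://audit.example.org/hadoop/1/");
      sb.append(operation).append('/').append(spanId);
      sb.append("/?op=").append(operation);
      params.forEach((k, v) -> sb.append('&').append(k).append('=').append(v));
      sb.append("&id=").append(spanId);
      return sb.toString();
    }
  }

  /**
   * Interceptor-like step: if the request carries a span id and operation
   * name, copy them into the builder before building the header value.
   */
  static String modifyReferrer(Map<String, String> executionAttributes,
      ReferrerBuilder builder) {
    String spanId = executionAttributes.get(SPAN_ID);
    String operation = executionAttributes.get(OPERATION_NAME);
    if (spanId != null) {
      builder.withSpanId(spanId);
    }
    if (operation != null) {
      builder.withOperation(operation);
    }
    return builder.build();
  }

  public static void main(String[] args) {
    // The caller (S3A in the description above) supplies the two values.
    Map<String, String> attrs = new LinkedHashMap<>();
    attrs.put(SPAN_ID, "ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009");
    attrs.put(OPERATION_NAME, "op_open");

    ReferrerBuilder builder = new ReferrerBuilder().withParam("rg", "5-8388612");
    System.out.println(modifyReferrer(attrs, builder));
  }
}
```

When the attributes are absent, the sketch leaves the builder's defaults untouched, so a request that carries no span information still gets a header.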
Example logs:
```
// FIRST HEAD
2025-06-04 16:04:16,414 [s3a-transfer-noaa-cors-pds-unbounded-pool5-t1]
DEBUG impl.LoggingAuditor (LoggingAuditor.java:modifyHttpRequest(429)) - [33]
ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009 Executing op_open with
{action_http_head_request 'raw/2023/017/ohfh/OHFH017d.23_.gz' size=0,
mutating=false};
https://audit.example.org/hadoop/1/op_open/ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009/?op=op_open&p1=raw/2023/017/ohfh/OHFH017d.23_.gz&pr=ahmarsu&ps=6d6c1b1f-5677-4ed6-9706-10b4d5bf8284&id=ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009&t0=33&fs=ad10d217-5092-4323-ba27-d24ca0e8a0f7&t1=33&ts=1749049456399
// FIRST GET
2025-06-04 16:04:18,335 [setup] DEBUG impl.LoggingAuditor
(LoggingAuditor.java:modifyHttpRequest(429)) - [15]
ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009 Executing op_open with
{action_http_get_request 'raw/2023/017/ohfh/OHFH017d.23_.gz' size=8388607,
mutating=false};
https://audit.example.org/hadoop/1/op_open/ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009/?op=op_open&p1=noaa-cors-pds&pr=ahmarsu&ps=6d6c1b1f-5677-4ed6-9706-10b4d5bf8284&rg=5-8388612&id=ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009&t0=15&fs=ad10d217-5092-4323-ba27-d24ca0e8a0f7&t1=15&ts=1749049456389
// SECOND GET
2025-06-04 16:04:18,359 [setup] DEBUG impl.LoggingAuditor
(LoggingAuditor.java:modifyHttpRequest(429)) - [15]
ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009 Executing op_open with
{action_http_get_request 'raw/2023/017/ohfh/OHFH017d.23_.gz' size=8388607,
mutating=false};
https://audit.example.org/hadoop/1/op_open/ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009/?op=op_open&p1=noaa-cors-pds&pr=ahmarsu&ps=6d6c1b1f-5677-4ed6-9706-10b4d5bf8284&rg=8388613-16777220&id=ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009&t0=15&fs=ad10d217-5092-4323-ba27-d24ca0e8a0f7&t1=15&ts=1749049456389
// THIRD GET
2025-06-04 16:04:18,361 [setup] DEBUG impl.LoggingAuditor
(LoggingAuditor.java:modifyHttpRequest(429)) - [15]
ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009 Executing op_open with
{action_http_get_request 'raw/2023/017/ohfh/OHFH017d.23_.gz' size=4733952,
mutating=false};
https://audit.example.org/hadoop/1/op_open/ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009/?op=op_open&p1=noaa-cors-pds&pr=ahmarsu&ps=6d6c1b1f-5677-4ed6-9706-10b4d5bf8284&rg=16777221-21511173&id=ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009&t0=15&fs=ad10d217-5092-4323-ba27-d24ca0e8a0f7&t1=15&ts=1749049456389
```
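To inspect referrer values like the ones in these logs, the query string can be split into key/value pairs. The following is a minimal sketch using only `java.net.URI`; the class and method names are hypothetical, and it does not interpret the fields or decode percent-escapes.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: splits a referrer URL's query into key/value pairs. */
public class ReferrerQuerySketch {

  static Map<String, String> parseQuery(String referrer) {
    Map<String, String> fields = new LinkedHashMap<>();
    String query = URI.create(referrer).getRawQuery();
    if (query == null) {
      return fields;
    }
    for (String pair : query.split("&")) {
      int eq = pair.indexOf('=');
      // Skip malformed pairs that have no key or no '=' separator.
      if (eq > 0) {
        fields.put(pair.substring(0, eq), pair.substring(eq + 1));
      }
    }
    return fields;
  }

  public static void main(String[] args) {
    // Referrer value copied from the FIRST GET entry in the logs above.
    String referrer = "https://audit.example.org/hadoop/1/op_open/"
        + "ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009/"
        + "?op=op_open&p1=noaa-cors-pds&pr=ahmarsu"
        + "&ps=6d6c1b1f-5677-4ed6-9706-10b4d5bf8284&rg=5-8388612"
        + "&id=ad10d217-5092-4323-ba27-d24ca0e8a0f7-00000009"
        + "&t0=15&fs=ad10d217-5092-4323-ba27-d24ca0e8a0f7&t1=15"
        + "&ts=1749049456389";
    Map<String, String> fields = parseQuery(referrer);
    System.out.println("op=" + fields.get("op"));
    System.out.println("rg=" + fields.get("rg"));
    System.out.println("id=" + fields.get("id"));
  }
}
```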
### How was this patch tested?
Tested in eu-west-1 with
`mvn -Dparallel-tests -DtestsThreadCount=16 clean verify`.
### For code changes:
- [ ] Does the title of this PR start with the corresponding JIRA issue id
(e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the
endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies
licensed in a way that is compatible for inclusion under [ASF
2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`,
`NOTICE-binary` files?
> S3A Analytics-Accelerator: Add audit header support
> ----------------------------------------------------
>
> Key: HADOOP-19365
> URL: https://issues.apache.org/jira/browse/HADOOP-19365
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Ahmar Suhail
> Priority: Major
> Labels: pull-request-available
>
> S3A adds audit information as referrer headers, see
> [https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/auditing.md]
> for documentation on this.
> These are attached using execution interceptors, see
> ActiveAuditManagerS3A.createExecutionInterceptors().
>
> analytics-accelerator-s3 now makes the GET requests, but does not update the
> referrer header on those GETs.
>
> See LoggingAuditor.attachRangeFromRequest() for how this is done.
>
> When using analytics-accelerator-s3, audit headers from S3A should be sent
> through.