[
https://issues.apache.org/jira/browse/HADOOP-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948469#comment-17948469
]
ASF GitHub Bot commented on HADOOP-19557:
-----------------------------------------
ahmarsuhail commented on code in PR #7662:
URL: https://github.com/apache/hadoop/pull/7662#discussion_r2068635417
##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java:
##########
@@ -829,7 +829,8 @@ public boolean hasCapability(String capability) {
@Override
public void hflush() throws IOException {
statistics.hflushInvoked();
- handleSyncableInvocation();
+ // do not reject these, but downgrade to a no-op
+ LOG.debug("Hflush invoked");
Review Comment:
@steveloughran is Parquet the only caller of hflush()? I think this changes
behaviour for everyone. Is this something we need to care about?
> S3A: S3ABlockOutputStream to never log/reject hflush(): calls
> -------------------------------------------------------------
>
> Key: HADOOP-19557
> URL: https://issues.apache.org/jira/browse/HADOOP-19557
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 3.4.1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Critical
> Labels: pull-request-available
>
> Parquet's GH-3204 patch calls hflush() just before close().
> This is needless and hurts write performance on HDFS.
> For S3A it will trigger a warning log (Syncable is not supported), or an
> actual failure if fs.s3a.downgrade.syncable.exceptions is false.
> Proposed: hflush() to log at debug; only log/reject on hsync(), which is the
> real place where semantics cannot be met.
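
The proposed behaviour can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual S3ABlockOutputStream code: the class SyncableSketch, its counters, the warning text, and the choice of UnsupportedOperationException are all assumptions made for the example. The boolean flag stands in for fs.s3a.downgrade.syncable.exceptions.

```java
/**
 * Hypothetical stand-in for an object-store output stream.
 * It only models the hflush()/hsync() policy discussed above;
 * it is not the real S3ABlockOutputStream.
 */
class SyncableSketch {
    // Stand-in for fs.s3a.downgrade.syncable.exceptions (assumed mapping).
    private final boolean downgradeSyncableExceptions;
    private int hflushCount;
    private int hsyncCount;

    SyncableSketch(boolean downgradeSyncableExceptions) {
        this.downgradeSyncableExceptions = downgradeSyncableExceptions;
    }

    /** Proposed: count the call and do nothing else; never warn or reject. */
    void hflush() {
        hflushCount++;
        // A real stream would log at debug level here.
    }

    /** hsync() durability cannot be met, so this is where to warn or reject. */
    void hsync() {
        hsyncCount++;
        if (!downgradeSyncableExceptions) {
            // Exception type and message are assumptions for this sketch.
            throw new UnsupportedOperationException("Syncable is not supported");
        }
        System.err.println("WARN: Syncable is not supported; hsync() ignored");
    }

    int getHflushCount() {
        return hflushCount;
    }

    int getHsyncCount() {
        return hsyncCount;
    }
}
```

With the flag set to false, a writer that calls hflush() just before close() would proceed normally under this sketch, while a caller of hsync() would still see the failure.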