steveloughran commented on a change in pull request #2149:
URL: https://github.com/apache/hadoop/pull/2149#discussion_r467960077
##########
File path:
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
##########
@@ -2890,72 +2942,114 @@ S3AFileStatus innerGetFileStatus(final Path f,
// Skip going to s3 if the file checked is a directory. Because if the
// dest is also a directory, there's no difference.
- if (!pm.getFileStatus().isDirectory() &&
+ if (!msStatus.isDirectory() &&
!allowAuthoritative &&
probes.contains(StatusProbeEnum.Head)) {
// a file has been found in a non-auth path and the caller has not said
// they only care about directories
LOG.debug("Metadata for {} found in the non-auth metastore.", path);
- final long msModTime = pm.getFileStatus().getModificationTime();
-
- S3AFileStatus s3AFileStatus;
- try {
- s3AFileStatus = s3GetFileStatus(path, key, probes, tombstones);
- } catch (FileNotFoundException fne) {
- s3AFileStatus = null;
- }
- if (s3AFileStatus == null) {
- LOG.warn("Failed to find file {}. Either it is not yet visible, or "
- + "it has been deleted.", path);
- } else {
- final long s3ModTime = s3AFileStatus.getModificationTime();
-
- if(s3ModTime > msModTime) {
- LOG.debug("S3Guard metadata for {} is outdated;"
- + " s3modtime={}; msModTime={} updating metastore",
- path, s3ModTime, msModTime);
- return S3Guard.putAndReturn(metadataStore, s3AFileStatus,
- ttlTimeProvider);
+ // If the timestamp of the pm is close to "now", we don't need to
Review comment:
Interesting point. It's actually a bug in
https://issues.apache.org/jira/browse/HADOOP-16279 : once the TTL of an entry
has expired, the S3AFS code will probe S3 for a changed object forever. The
change I'm making here is not actually needed for the markers; it is just a fix
which has gone in because this PR is where I found the issue.
If you do want things moved about, we could do this fix as something separate
from this patch, including adding new tests. That would be backportable.
However, the fix would then go in on a different schedule. Given the #802 patch
is only in 3.3.0 and I'm targeting that branch, and editing the same method,
I'd like to just wrap it up in here.
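
To make the failure mode concrete, here is a minimal, self-contained sketch of the "probe forever after TTL expiry" behaviour. It is not the S3AFileSystem code: the `TtlProbeSketch` class, the `Entry` type, and the `refreshOnProbe` flag are all hypothetical, and "refresh the entry's timestamp after a probe" is only one assumed way of stopping the repeated probes, not necessarily what this patch does.

```java
final class TtlProbeSketch {

  /** Hypothetical stand-in for a metastore entry; not a real S3Guard type. */
  static final class Entry {
    long lastUpdated; // time (ms) the entry was last written or refreshed

    Entry(long lastUpdated) {
      this.lastUpdated = lastUpdated;
    }
  }

  /** An entry is expired once at least ttl ms have passed since its last update. */
  static boolean isExpired(Entry e, long ttl, long now) {
    return now - e.lastUpdated >= ttl;
  }

  /**
   * Simulate a series of status lookups at the given times and count how
   * many of them would have to go to S3.
   *
   * @param refreshOnProbe assumed fix: reset the entry's timestamp after a probe
   */
  static int countProbes(boolean refreshOnProbe, long ttl, long[] callTimes) {
    Entry entry = new Entry(0L);
    int probes = 0;
    for (long now : callTimes) {
      if (isExpired(entry, ttl, now)) {
        probes++; // stands in for the HEAD request against S3
        if (refreshOnProbe) {
          entry.lastUpdated = now; // entry is fresh again for another ttl
        }
      }
    }
    return probes;
  }

  public static void main(String[] args) {
    long ttl = 100L;
    long[] calls = {50L, 150L, 160L, 170L, 300L};
    // Without a refresh, every call after expiry probes S3.
    System.out.println("no refresh:   " + countProbes(false, ttl, calls));
    // With a refresh, only the first call after each expiry probes S3.
    System.out.println("with refresh: " + countProbes(true, ttl, calls));
  }
}
```

With the call times above, the non-refreshing variant probes on every lookup from t=150 onwards, while the refreshing variant probes only at t=150 and again at t=300, once the refreshed entry has itself expired.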