[
https://issues.apache.org/jira/browse/HADOOP-13171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15303230#comment-15303230
]
Chris Nauroth commented on HADOOP-13171:
----------------------------------------
The new stats are very thorough. The refactoring of {{ProgressListener}} and
pulling encryption logic out of the output streams is nice too. Thank you,
Steve!
{{TestS3ADirectoryPerformance#testListOperations}} is timing out for me
consistently in my runs against US-west-2. It appears to be making progress,
but not quickly enough.
{code}
int scale = (int)getOperationCount();
int width = 2 * scale;
int depth = 2 * scale;
int files = 2 * scale;
{code}
Are you sure this is what you wanted for these factors? Please check my math
here, but I believe the interaction between width and depth means the total
number of directories grows exponentially with respect to scale, something like
(2*scale) ^ (2*scale) + 1 (for the root). Then, considering the file count per
directory, we have a total of ((2*scale) ^ (2*scale) + 1) * (2 * scale) files.
The default operation count is 2005, so these are big numbers. Even if I
down-tune to scale.test.operation.count=4, it still takes a long time.
Please add an {{@Override}} annotation to
{{ProgressableProgressListener#progressChanged}}.
I think {{ProgressableProgressListener#getLastBytesTransferred}} is no longer
called anywhere and can be removed.
Maybe I'm missing something, but it appears that after execution of the base
{{FileSystem#initialize}}, it's impossible for {{FileSystem#statistics}} to be
null. Therefore, the null guards in the increment methods should be
unnecessary. If you prefer to keep them in the spirit of defensive coding, I'm
fine with that too.
{code}
private Iterator<Map.Entry<Statistic, AtomicLong>> iterator =
opsCount.entrySet().iterator();
{code}
I suggest changing to the following...
{code}
private Iterator<Map.Entry<Statistic, AtomicLong>> iterator =
Collections.unmodifiableSet(opsCount.entrySet()).iterator();
{code}
...so that we block callers from removing entries through the iterator. That's
probably applicable to the equivalent code up in hadoop-common too, but
probably best to keep the scope focused just on the hadoop-aws code here.
The JavaDocs for {{S3ATestUtils#createSubdirs}} aren't at the right indentation
level.
Repeating a comment from HADOOP-13131, {{getS3AFS}} might instead be an
override of {{getFileSystem}} with a covariant return type.
{code}
@Test
public void testCostOfCopyFromLocalFile() throws Throwable {
describe("testCostOfCopyFromLocalFile");
File tmpFile = File.createTempFile("tests3acost", ".txt");
{code}
I'd prefer to avoid accessing /tmp due to the potential for platform-specific
oddities, e.g. HADOOP-761 and MAPREDUCE-5191. Can we switch to something based
on {{getContract().getTestPath()}}?
> Add StorageStatistics to S3A; instrument some more operations
> -------------------------------------------------------------
>
> Key: HADOOP-13171
> URL: https://issues.apache.org/jira/browse/HADOOP-13171
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HADOOP-13171-branch-2-001.patch,
> HADOOP-13171-branch-2-002.patch, HADOOP-13171-branch-2-003.patch,
> HADOOP-13171-branch-2-004.patch, HADOOP-13171-branch-2-005.patch,
> HADOOP-13171-branch-2-006.patch, HADOOP-13171-branch-2-007.patch
>
>
> Add {{StorageStatistics}} support to S3A, collecting the same metrics as the
> instrumentation, but sharing across all instances.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]