[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782825 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jun/22 06:32
            Start Date: 20/Jun/22 06:32
    Worklog Time Spent: 10m 
      Work Description: mehakmeet commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r899217349


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file

Review Comment:
   This isn't the correct description of the functionality; we only support merging in this patch, but our end goal is to parse the audit log into an avro file.
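
For context on the parsing end goal the reviewer mentions, a hypothetical sketch of the first step such a parser would need (this is not part of the patch; the class name and field handling are assumptions): splitting an S3 server-access-log style line into fields while keeping bracketed timestamps and quoted request strings together.

```java
import java.util.ArrayList;
import java.util.List;

public final class AuditLogFieldSplitter {

  /**
   * Split a log line on spaces, but keep [...] and "..." groups intact,
   * since S3 server access logs bracket timestamps and quote request
   * URIs and user agents. Delimiter characters are dropped.
   */
  public static List<String> split(String line) {
    List<String> fields = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    boolean inBrackets = false;
    for (int i = 0; i < line.length(); i++) {
      char c = line.charAt(i);
      if (c == '"' && !inBrackets) {
        inQuotes = !inQuotes;
      } else if (c == '[' && !inQuotes) {
        inBrackets = true;
      } else if (c == ']' && !inQuotes) {
        inBrackets = false;
      } else if (c == ' ' && !inQuotes && !inBrackets) {
        // field boundary: flush the accumulated text, skip the space
        if (current.length() > 0) {
          fields.add(current.toString());
          current.setLength(0);
        }
      } else {
        current.append(c);
      }
    }
    if (current.length() > 0) {
      fields.add(current.toString());
    }
    return fields;
  }
}
```

A line such as `bucket [16/Jun/2022:10:00:00 +0000] 1.2.3.4 "GET /key HTTP/1.1" 200` would split into five fields, with the timestamp and the request kept whole; emitting those fields as an avro record would then be a separate, schema-driven step.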



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java:
##########
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merger class will merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(S3AAuditLogMerger.class);
+
+  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {
+    File auditLogFilesDirectory = new File(auditLogsDirectoryPath);
+    String[] auditLogFileNames = auditLogFilesDirectory.list();
+
+    //Read each audit log file present in directory and writes each and every audit log in it
+    //into a single audit log file

Review Comment:
   Change this comment to be a little simpler, something like "Merging of audit log files present in a directory into a single audit log file".



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger = LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");
+
+  /**
+   * creates the sample directories and files before each test.
+   *
+   * @throws IOException on failure
+   */
+  @Before
+  public void setUp() throws IOException {
+    boolean sampleDirCreation = sampleDirectory.mkdir();
+    boolean emptyDirCreation = emptyDirectory.mkdir();
+    if (sampleDirCreation && emptyDirCreation) {
+      try (FileWriter fw = new FileWriter(firstSampleFile);
+          FileWriter fw1 = new FileWriter(secondSampleFile);
+          FileWriter fw2 = new FileWriter(thirdSampleFile)) {
+        fw.write("abcd");
+        fw1.write("efgh");
+        fw2.write("ijkl");
+      }
+    }
+  }
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class.
+   * by passing a sample directory which contains files with some content in it
+   * and checks if files in a directory are merged into single file
+   *
+   * @throws IOException on any failure

Review Comment:
   cut `@throws` in tests



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java:
##########
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merger class will merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(S3AAuditLogMerger.class);
+
+  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {

Review Comment:
   javadocs
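
A hedged sketch of what the requested javadoc on `mergeFiles` could look like, attached to a minimal, pure-JDK version of the merge loop (the wording and the `AuditLogFile` output name are illustrative, mirroring the patch rather than quoting it):

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

public class MergerSketch {

  /**
   * Merge every audit log file in the given directory into a single
   * audit log file named "AuditLogFile" in the current working
   * directory. Files are concatenated in directory-listing order.
   *
   * @param auditLogsDirectoryPath local directory containing audit log files.
   * @throws IOException if the path is not a directory or on any read/write failure.
   */
  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {
    File dir = new File(auditLogsDirectoryPath);
    String[] names = dir.list();
    if (names == null) {
      throw new IOException("Not a directory: " + auditLogsDirectoryPath);
    }
    try (PrintWriter out = new PrintWriter("AuditLogFile")) {
      for (String name : names) {
        try (BufferedReader in = new BufferedReader(
            new FileReader(new File(dir, name)))) {
          String line;
          while ((line = in.readLine()) != null) {
            out.println(line);
          }
        }
      }
    }
  }
}
```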



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + "  s3a://BUCKET\n";
+
+  private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory");
+
+  public AuditTool() {
+  }
+
+  /**
+   * tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * this run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments
+   * and merge the audit log files present in that path into single file in local system
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code
+   * @throws Exception on any failure
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    println("argv: %s", argv);
+    if (argv.isEmpty()) {
+      errorln(getUsage());
+      throw invalidArgs("No bucket specified");
+    }
+    //path of audit log files in s3 bucket
+    Path s3LogsPath = new Path(argv.get(0));
+
+    //setting the file system
+    URI fsURI = toUri(String.valueOf(s3LogsPath));
+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    RemoteIterator<LocatedFileStatus> listOfS3LogFiles =
+        s3AFileSystem.listFiles(s3LogsPath, true);
+
+    //creating local audit log files directory and
+    //copying audit log files into local files from s3 bucket
+    //so that it will be easy for us to implement merging and parsing classes

Review Comment:
   cut the last line. Just comment what we are doing here.



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + "  s3a://BUCKET\n";
+
+  private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory");
+
+  public AuditTool() {
+  }
+
+  /**
+   * tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * this run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments
+   * and merge the audit log files present in that path into single file in local system
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code
+   * @throws Exception on any failure
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    println("argv: %s", argv);
+    if (argv.isEmpty()) {
+      errorln(getUsage());
+      throw invalidArgs("No bucket specified");
+    }
+    //path of audit log files in s3 bucket
+    Path s3LogsPath = new Path(argv.get(0));
+
+    //setting the file system
+    URI fsURI = toUri(String.valueOf(s3LogsPath));
+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    RemoteIterator<LocatedFileStatus> listOfS3LogFiles =
+        s3AFileSystem.listFiles(s3LogsPath, true);
+
+    //creating local audit log files directory and
+    //copying audit log files into local files from s3 bucket
+    //so that it will be easy for us to implement merging and parsing classes
+    if (!s3aLogsDirectory.exists()) {
+      boolean s3aLogsDirectoryCreation = s3aLogsDirectory.mkdir();
+    }
+    File s3aLogsSubDir = new File(s3aLogsDirectory, s3LogsPath.getName());
+    boolean s3aLogsSubDirCreation = false;
+    if (!s3aLogsSubDir.exists()) {
+      s3aLogsSubDirCreation = s3aLogsSubDir.mkdir();
+    }
+    if (s3aLogsSubDirCreation) {
+      while (listOfS3LogFiles.hasNext()) {
+        Path s3LogFilePath = listOfS3LogFiles.next().getPath();
+        File s3LogLocalFilePath =
+            new File(s3aLogsSubDir, s3LogFilePath.getName());
+        boolean localFileCreation = s3LogLocalFilePath.createNewFile();
+        if (localFileCreation) {
+          FileStatus fileStatus = s3AFileSystem.getFileStatus(s3LogFilePath);
+          long s3LogFileLength = fileStatus.getLen();
+          //reads s3 file data into byte buffer
+          byte[] s3LogDataBuffer =
+              readDataset(s3AFileSystem, s3LogFilePath, (int) s3LogFileLength);
+          //writes byte array into local file
+          FileUtils.writeByteArrayToFile(s3LogLocalFilePath, s3LogDataBuffer);
+        }
+      }
+    }
+
+    //calls S3AAuditLogMerger for implementing merging code
+    //by passing local audit log files directory which are copied from s3 bucket

Review Comment:
   Again keep the comments simple, like merging local audit files into a single file.



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + "  s3a://BUCKET\n";
+
+  private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory");
+
+  public AuditTool() {
+  }
+
+  /**
+   * tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * this run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments
+   * and merge the audit log files present in that path into single file in local system
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code
+   * @throws Exception on any failure
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    println("argv: {}" , argv);

Review Comment:
   Not necessary



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.

Review Comment:
   nit: remove "."



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java:
##########
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merger class will merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private final Logger logger =

Review Comment:
   make it `private static final LOG`, like we do for all LOG instances.



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + "  s3a://BUCKET\n";
+
+  private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory");
+
+  public AuditTool() {
+  }
+
+  /**
+   * tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * this run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments
+   * and merge the audit log files present in that path into single file in local system
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code
+   * @throws Exception on any failure
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    println("argv: %s", argv);
+    if (argv.isEmpty()) {
+      errorln(getUsage());
+      throw invalidArgs("No bucket specified");
+    }
+    //path of audit log files in s3 bucket
+    Path s3LogsPath = new Path(argv.get(0));
+
+    //setting the file system
+    URI fsURI = toUri(String.valueOf(s3LogsPath));
+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    RemoteIterator<LocatedFileStatus> listOfS3LogFiles =
+        s3AFileSystem.listFiles(s3LogsPath, true);
+
+    //creating local audit log files directory and
+    //copying audit log files into local files from s3 bucket
+    //so that it will be easy for us to implement merging and parsing classes
+    if (!s3aLogsDirectory.exists()) {
+      boolean s3aLogsDirectoryCreation = s3aLogsDirectory.mkdir();
+    }
+    File s3aLogsSubDir = new File(s3aLogsDirectory, s3LogsPath.getName());
+    boolean s3aLogsSubDirCreation = false;
+    if (!s3aLogsSubDir.exists()) {
+      s3aLogsSubDirCreation = s3aLogsSubDir.mkdir();
+    }
+    if (s3aLogsSubDirCreation) {
+      while (listOfS3LogFiles.hasNext()) {
+        Path s3LogFilePath = listOfS3LogFiles.next().getPath();
+        File s3LogLocalFilePath =

Review Comment:
   Are these files being closed after we write into them?
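   (For reference, the pattern being asked about: try-with-resources guarantees the stream is closed even when the copy throws. A minimal standalone sketch, not part of the patch; class and method names here are illustrative only.)

   ```java
   import java.io.File;
   import java.io.IOException;
   import java.io.OutputStream;
   import java.nio.file.Files;

   public class CopyCloseSketch {
     // Write bytes to a local file; the try-with-resources block invokes
     // out.close() automatically when the block exits, on success or failure.
     static void writeAndClose(File target, byte[] data) throws IOException {
       try (OutputStream out = Files.newOutputStream(target.toPath())) {
         out.write(data);
       } // out is closed here, even if write() threw
     }

     public static void main(String[] args) throws IOException {
       File f = File.createTempFile("audit", ".log");
       f.deleteOnExit();
       writeAndClose(f, "sample".getBytes());
       System.out.println(f.length()); // 6
     }
   }
   ```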



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(TestS3AAuditLogMerger.class);

Review Comment:
   make this `private static final Logger LOG`



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");
+
+  /**
+   * creates the sample directories and files before each test.
+   *
+   * @throws IOException on failure
+   */
+  @Before
+  public void setUp() throws IOException {
+    boolean sampleDirCreation = sampleDirectory.mkdir();
+    boolean emptyDirCreation = emptyDirectory.mkdir();
+    if (sampleDirCreation && emptyDirCreation) {
+      try (FileWriter fw = new FileWriter(firstSampleFile);
+          FileWriter fw1 = new FileWriter(secondSampleFile);
+          FileWriter fw2 = new FileWriter(thirdSampleFile)) {
+        fw.write("abcd");
+        fw1.write("efgh");
+        fw2.write("ijkl");
+      }
+    }
+  }
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class.
+   * by passing a sample directory which contains files with some content in it
+   * and checks if files in a directory are merged into single file
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");

Review Comment:
   comment why you are doing this



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + "  s3a://BUCKET\n";
+
+  private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory");
+
+  public AuditTool() {
+  }
+
+  /**
+   * tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * this run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments
+   * and merge the audit log files present in that path into single file in local system
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code
+   * @throws Exception on any failure
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    println("argv: %s", argv);
+    if (argv.isEmpty()) {
+      errorln(getUsage());
+      throw invalidArgs("No bucket specified");
+    }
+    //path of audit log files in s3 bucket
+    Path s3LogsPath = new Path(argv.get(0));
+
+    //setting the file system
+    URI fsURI = toUri(String.valueOf(s3LogsPath));
+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    RemoteIterator<LocatedFileStatus> listOfS3LogFiles =
+        s3AFileSystem.listFiles(s3LogsPath, true);
+
+    //creating local audit log files directory and
+    //copying audit log files into local files from s3 bucket
+    //so that it will be easy for us to implement merging and parsing classes
+    if (!s3aLogsDirectory.exists()) {
+      boolean s3aLogsDirectoryCreation = s3aLogsDirectory.mkdir();
+    }
+    File s3aLogsSubDir = new File(s3aLogsDirectory, s3LogsPath.getName());

Review Comment:
   why are we creating a sub-dir?



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");
+
+  /**
+   * creates the sample directories and files before each test.
+   *
+   * @throws IOException on failure
+   */
+  @Before
+  public void setUp() throws IOException {
+    boolean sampleDirCreation = sampleDirectory.mkdir();
+    boolean emptyDirCreation = emptyDirectory.mkdir();
+    if (sampleDirCreation && emptyDirCreation) {
+      try (FileWriter fw = new FileWriter(firstSampleFile);
+          FileWriter fw1 = new FileWriter(secondSampleFile);
+          FileWriter fw2 = new FileWriter(thirdSampleFile)) {
+        fw.write("abcd");
+        fw1.write("efgh");
+        fw2.write("ijkl");
+      }
+    }
+  }
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class.
+   * by passing a sample directory which contains files with some content in it
+   * and checks if files in a directory are merged into single file
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");
+    assertTrue("the string 'abcd' should be in the merged file",
+        fileText.contains("abcd"));
+    assertTrue("the string 'efgh' should be in the merged file",
+        fileText.contains("efgh"));
+    assertTrue("the string 'ijkl' should be in the merged file",
+        fileText.contains("ijkl"));
+  }
+
+  /**
+   * mergeFilesTestEmpty() will test the mergeFiles().

Review Comment:
   Don't need to put the test name in the Javadoc, just saying this test does 
this, is good enough.



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########
@@ -0,0 +1,334 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**.
+ * AuditTool is a Command Line Interface to manage S3 Auditing.
+ * i.e, it is a functionality which directly takes s3 path of audit log files
+ * and merge all those into single audit log file

Review Comment:
   Also "." at the end of all the Javadocs



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");
+
+  /**
+   * creates the sample directories and files before each test.
+   *
+   * @throws IOException on failure
+   */
+  @Before
+  public void setUp() throws IOException {
+    boolean sampleDirCreation = sampleDirectory.mkdir();
+    boolean emptyDirCreation = emptyDirectory.mkdir();
+    if (sampleDirCreation && emptyDirCreation) {
+      try (FileWriter fw = new FileWriter(firstSampleFile);
+          FileWriter fw1 = new FileWriter(secondSampleFile);
+          FileWriter fw2 = new FileWriter(thirdSampleFile)) {
+        fw.write("abcd");
+        fw1.write("efgh");
+        fw2.write("ijkl");
+      }
+    }
+  }
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class.
+   * by passing a sample directory which contains files with some content in it
+   * and checks if files in a directory are merged into single file
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");
+    assertTrue("the string 'abcd' should be in the merged file",
+        fileText.contains("abcd"));
+    assertTrue("the string 'efgh' should be in the merged file",
+        fileText.contains("efgh"));
+    assertTrue("the string 'ijkl' should be in the merged file",
+        fileText.contains("ijkl"));
+  }
+
+  /**
+   * mergeFilesTestEmpty() will test the mergeFiles().
+   * by passing an empty directory and checks if merged file is created or not
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTestEmpty() throws IOException {

Review Comment:
   change the name to testMergeFilesEmptyDir



##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java:
##########
@@ -0,0 +1,77 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merger class will merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(S3AAuditLogMerger.class);
+
+  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {
+    File auditLogFilesDirectory = new File(auditLogsDirectoryPath);
+    String[] auditLogFileNames = auditLogFilesDirectory.list();
+
+    //Read each audit log file present in directory and writes each and every audit log in it
+    //into a single audit log file
+    if (auditLogFileNames != null && auditLogFileNames.length != 0) {
+      File auditLogFile = new File("AuditLogFile");

Review Comment:
   What would happen if this file already exists?
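   (Concretely: `new FileWriter(file)` without `append=true` truncates an existing file on open, so a leftover "AuditLogFile" from an earlier run is silently discarded rather than merged into. A minimal sketch demonstrating the truncation, not part of the patch; names are illustrative only.)

   ```java
   import java.io.File;
   import java.io.FileWriter;
   import java.io.IOException;

   public class OverwriteSketch {
     // Write twice to the same path without append mode; returns final length.
     static long overwriteLength(File f) throws IOException {
       try (FileWriter w = new FileWriter(f)) {
         w.write("old-run-contents"); // 16 bytes
       }
       // Reopening without append=true truncates the file first, silently
       // discarding the previous run's contents.
       try (FileWriter w = new FileWriter(f)) {
         w.write("new"); // 3 bytes remain
       }
       return f.length();
     }

     public static void main(String[] args) throws IOException {
       File f = File.createTempFile("AuditLogFile", null);
       f.deleteOnExit();
       System.out.println(overwriteLength(f)); // 3
     }
   }
   ```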



##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########
@@ -0,0 +1,131 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ *  or more contributor license agreements.  See the NOTICE file
+ *  distributed with this work for additional information
+ *  regarding copyright ownership.  The ASF licenses this file
+ *  to you under the Apache License, Version 2.0 (the
+ *  "License"); you may not use this file except in compliance
+ *  with the License.  You may obtain a copy of the License at
+ *
+ *       http://www.apache.org/licenses/LICENSE-2.0
+ *
+ *  Unless required by applicable law or agreed to in writing, software
+ *  distributed under the License is distributed on an "AS IS" BASIS,
+ *  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ *  See the License for the specific language governing permissions and
+ *  limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");

Review Comment:
   I think it'll be better to move these files into specific tests that need 
them and remove the deletion of them from teardown(). I think you won't have to 
worry about cleanup if you create a local file inside the test. Lookup 
`File.createTempFile()`, these are automatically deleted, check if you could 
use them in your tests.
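   (For reference, a standalone sketch of the `File.createTempFile()` pattern suggested above; note the deletion is via `deleteOnExit()` at JVM shutdown, so no explicit teardown is needed. Names here are illustrative, not part of the patch.)

   ```java
   import java.io.File;
   import java.io.FileWriter;
   import java.io.IOException;

   public class TempFileSketch {
     // Create a uniquely named file under java.io.tmpdir and fill it with
     // content. The unique name means parallel test runs cannot collide on a
     // shared hard-coded path; deleteOnExit() removes it when the JVM stops.
     static File newSampleFile(String content) throws IOException {
       File sample = File.createTempFile("sampleFile", ".txt");
       sample.deleteOnExit();
       try (FileWriter fw = new FileWriter(sample)) {
         fw.write(content);
       }
       return sample;
     }

     public static void main(String[] args) throws IOException {
       File sample = newSampleFile("abcd");
       System.out.println(sample.exists() && sample.length() == 4); // true
     }
   }
   ```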





Issue Time Tracking
-------------------

    Worklog Id:     (was: 782825)
    Time Spent: 1h 40m  (was: 1.5h)

> Merging of S3A Audit Logs
> -------------------------
>
>                 Key: HADOOP-18258
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18258
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sravani Gadey
>            Assignee: Sravani Gadey
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Merging audit log files containing huge number of audit logs collected from a 
> job like Hive or Spark job containing various S3 requests like list, head, 
> get and put requests.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
