ajantha-bhat commented on code in PR #9953:
URL: https://github.com/apache/iceberg/pull/9953#discussion_r1524376106


##########
aws/src/test/java/org/apache/iceberg/aws/s3/TestS3FileIO.java:
##########
@@ -377,6 +384,50 @@ public void testResolvingFileIOLoad() {
     Assertions.assertThat(result).isInstanceOf(S3FileIO.class);
   }
 
+  @Test
+  public void testInputFileWithDataFile() throws IOException {
+    String location = "s3://bucket/path/to/data-file.parquet";
+    DataFile dataFile =
+        DataFiles.builder(PartitionSpec.unpartitioned())
+            .withPath(location)
+            .withFileSizeInBytes(123L)
+            .withFormat(FileFormat.PARQUET)
+            .withRecordCount(123L)
+            .build();
+    OutputStream outputStream = s3FileIO.newOutputFile(location).create();
+    byte[] data = "testing".getBytes();
+    outputStream.write(data);
+    outputStream.close();
+
+    InputFile inputFile = s3FileIO.newInputFile(dataFile);
+    Assertions.assertThat(inputFile.getLength())
+        .as("Data file length should be determined from the file size stats")
+        .isEqualTo(123L);
+  }
+
+  @Test
+  public void testInputFileWithManifest() throws IOException {
+    String dataFileLocation = "s3://bucket/path/to/data-file-2.parquet";
+    DataFile dataFile =
+        DataFiles.builder(PartitionSpec.unpartitioned())
+            .withPath(dataFileLocation)
+            .withFileSizeInBytes(123L)
+            .withFormat(FileFormat.PARQUET)
+            .withRecordCount(123L)
+            .build();
+    String manifestLocation = "s3://bucket/path/to/manifest.avro";
+    OutputFile outputFile = s3FileIO.newOutputFile(manifestLocation);
+    ManifestWriter<DataFile> writer =
+        ManifestFiles.write(PartitionSpec.unpartitioned(), outputFile);
+    writer.add(dataFile);
+    writer.close();
+    ManifestFile manifest = writer.toManifestFile();
+    InputFile inputFile = s3FileIO.newInputFile(manifest);
+    Assertions.assertThat(inputFile.getLength())
+        .as("Manifest file length should be determined from the file size 
stats")
+        .isEqualTo(manifest.length());

Review Comment:
   minor: This will not confirm whether `manifest.length` was used or file size 
is computed by reading the file. As the both values can be same.
   The above `dataFileTestcase` validates right. Maybe we have mock the 
`manifest.length` to confirm that value is used. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to