amogh-jahagirdar commented on code in PR #9953:
URL: https://github.com/apache/iceberg/pull/9953#discussion_r1552280909

##########
aws/src/test/java/org/apache/iceberg/aws/s3/TestS3FileIO.java:
##########
@@ -377,6 +384,50 @@ public void testResolvingFileIOLoad() {
     Assertions.assertThat(result).isInstanceOf(S3FileIO.class);
   }
+
+  @Test
+  public void testInputFileWithDataFile() throws IOException {
+    String location = "s3://bucket/path/to/data-file.parquet";
+    DataFile dataFile =
+        DataFiles.builder(PartitionSpec.unpartitioned())
+            .withPath(location)
+            .withFileSizeInBytes(123L)
+            .withFormat(FileFormat.PARQUET)
+            .withRecordCount(123L)
+            .build();
+    OutputStream outputStream = s3FileIO.newOutputFile(location).create();
+    byte[] data = "testing".getBytes();
+    outputStream.write(data);
+    outputStream.close();
+
+    InputFile inputFile = s3FileIO.newInputFile(dataFile);
+    Assertions.assertThat(inputFile.getLength())
+        .as("Data file length should be determined from the file size stats")
+        .isEqualTo(123L);
+  }
+
+  @Test
+  public void testInputFileWithManifest() throws IOException {
+    String dataFileLocation = "s3://bucket/path/to/data-file-2.parquet";
+    DataFile dataFile =
+        DataFiles.builder(PartitionSpec.unpartitioned())
+            .withPath(dataFileLocation)
+            .withFileSizeInBytes(123L)
+            .withFormat(FileFormat.PARQUET)
+            .withRecordCount(123L)
+            .build();
+    String manifestLocation = "s3://bucket/path/to/manifest.avro";
+    OutputFile outputFile = s3FileIO.newOutputFile(manifestLocation);
+    ManifestWriter<DataFile> writer =
+        ManifestFiles.write(PartitionSpec.unpartitioned(), outputFile);
+    writer.add(dataFile);
+    writer.close();
+    ManifestFile manifest = writer.toManifestFile();
+    InputFile inputFile = s3FileIO.newInputFile(manifest);
+    Assertions.assertThat(inputFile.getLength())
+        .as("Manifest file length should be determined from the file size stats")
+        .isEqualTo(manifest.length());

Review Comment:
   I just added a verification that the s3Mock.headObject is never called when determining the length. It fails before this change, and passes after the fix so I think it's a better test.

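   For readers following along, here is a minimal self-contained sketch of the kind of check described in the comment above: assert that no metadata (HEAD) lookup happens when the length is already known. It is not the code added in the PR. `HeadCallSketch`, `CountingStore`, and `SketchInputFile` are hypothetical stand-ins, and a hand-rolled call counter replaces the mocked S3 client so the example runs with the JDK alone. With Mockito, the equivalent assertion would typically use `verify(mock, never())`.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class HeadCallSketch {

  /** Hypothetical stand-in for an object-store client; counts metadata (HEAD) lookups. */
  static class CountingStore {
    final AtomicInteger headCalls = new AtomicInteger();
    private final long actualLength;

    CountingStore(long actualLength) {
      this.actualLength = actualLength;
    }

    /** Simulates a remote metadata request and records that it happened. */
    long headObjectLength() {
      headCalls.incrementAndGet();
      return actualLength;
    }
  }

  /** Hypothetical input file: uses a supplied length if present, else asks the store. */
  static class SketchInputFile {
    private final CountingStore store;
    private final Long knownLength; // null when no length was supplied up front

    SketchInputFile(CountingStore store, Long knownLength) {
      this.store = store;
      this.knownLength = knownLength;
    }

    long getLength() {
      if (knownLength != null) {
        return knownLength; // no remote call needed
      }
      return store.headObjectLength();
    }
  }

  public static void main(String[] args) {
    // The stored object is 7 bytes ("testing"), but the recorded stats say 123.
    CountingStore store = new CountingStore(7L);

    SketchInputFile withStats = new SketchInputFile(store, 123L);
    if (withStats.getLength() != 123L) {
      throw new AssertionError("length should come from the supplied stats");
    }
    if (store.headCalls.get() != 0) {
      throw new AssertionError("no HEAD lookup expected when the length is known");
    }

    // Without a supplied length, exactly one lookup is made.
    SketchInputFile withoutStats = new SketchInputFile(store, null);
    if (withoutStats.getLength() != 7L) {
      throw new AssertionError("length should come from the store");
    }
    if (store.headCalls.get() != 1) {
      throw new AssertionError("expected exactly one HEAD lookup");
    }
  }
}
```

   Asserting on the call count catches a regression that comparing lengths alone could miss: when the recorded size happens to equal the real object size, a length comparison passes even if a remote lookup was made.
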
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org