amogh-jahagirdar commented on code in PR #9953:
URL: https://github.com/apache/iceberg/pull/9953#discussion_r1552280909
##########
aws/src/test/java/org/apache/iceberg/aws/s3/TestS3FileIO.java:
##########
@@ -377,6 +384,50 @@ public void testResolvingFileIOLoad() {
Assertions.assertThat(result).isInstanceOf(S3FileIO.class);
}
+ @Test
+ public void testInputFileWithDataFile() throws IOException {
+ String location = "s3://bucket/path/to/data-file.parquet";
+ DataFile dataFile =
+ DataFiles.builder(PartitionSpec.unpartitioned())
+ .withPath(location)
+ .withFileSizeInBytes(123L)
+ .withFormat(FileFormat.PARQUET)
+ .withRecordCount(123L)
+ .build();
+ OutputStream outputStream = s3FileIO.newOutputFile(location).create();
+ byte[] data = "testing".getBytes();
+ outputStream.write(data);
+ outputStream.close();
+
+ InputFile inputFile = s3FileIO.newInputFile(dataFile);
+ Assertions.assertThat(inputFile.getLength())
+ .as("Data file length should be determined from the file size stats")
+ .isEqualTo(123L);
+ }
+
+ @Test
+ public void testInputFileWithManifest() throws IOException {
+ String dataFileLocation = "s3://bucket/path/to/data-file-2.parquet";
+ DataFile dataFile =
+ DataFiles.builder(PartitionSpec.unpartitioned())
+ .withPath(dataFileLocation)
+ .withFileSizeInBytes(123L)
+ .withFormat(FileFormat.PARQUET)
+ .withRecordCount(123L)
+ .build();
+ String manifestLocation = "s3://bucket/path/to/manifest.avro";
+ OutputFile outputFile = s3FileIO.newOutputFile(manifestLocation);
+ ManifestWriter<DataFile> writer =
+ ManifestFiles.write(PartitionSpec.unpartitioned(), outputFile);
+ writer.add(dataFile);
+ writer.close();
+ ManifestFile manifest = writer.toManifestFile();
+ InputFile inputFile = s3FileIO.newInputFile(manifest);
+ Assertions.assertThat(inputFile.getLength())
+ .as("Manifest file length should be determined from the file size stats")
+ .isEqualTo(manifest.length());
Review Comment:
I just added a verification that `s3Mock.headObject` is never called when
determining the length. The test fails before this change and passes after the
fix, so I think it's a better test.
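
To illustrate the kind of check described above, here is a minimal, dependency-free sketch. It is not the code in this PR: `CountingHeadClient` and `SketchInputFile` are hypothetical stand-ins for the mocked S3 client and the input file, and the counter plays the role that a Mockito `verify(mock, never())` assertion would play in a real test. The point is only that asserting "no HEAD request happened" catches a regression that a plain length comparison could miss.

```java
import java.util.concurrent.atomic.AtomicInteger;

class HeadCallSketch {
  public static void main(String[] args) {
    CountingHeadClient client = new CountingHeadClient(7L);

    // Length supplied up front (as from file size stats): no HEAD call expected.
    SketchInputFile withStats = new SketchInputFile(client, 123L);
    check(withStats.getLength() == 123L, "length should come from the supplied stats");
    check(client.headCalls() == 0, "HEAD should never be called when the length is known");

    // No length supplied: exactly one HEAD call, and the result is cached.
    SketchInputFile withoutStats = new SketchInputFile(client, null);
    check(withoutStats.getLength() == 7L, "length should come from the HEAD lookup");
    withoutStats.getLength();
    check(client.headCalls() == 1, "HEAD result should be cached after the first lookup");
  }

  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }
}

// Hypothetical stand-in for a mocked S3 client: counts HEAD-style lookups.
class CountingHeadClient {
  private final AtomicInteger headCalls = new AtomicInteger();
  private final long remoteLength;

  CountingHeadClient(long remoteLength) {
    this.remoteLength = remoteLength;
  }

  long headObjectLength() {
    headCalls.incrementAndGet();
    return remoteLength;
  }

  int headCalls() {
    return headCalls.get();
  }
}

// Hypothetical input file: uses a known length if given, otherwise falls back to HEAD.
class SketchInputFile {
  private final CountingHeadClient client;
  private Long length; // null until the length is known

  SketchInputFile(CountingHeadClient client, Long knownLength) {
    this.client = client;
    this.length = knownLength;
  }

  long getLength() {
    if (length == null) {
      length = client.headObjectLength();
    }
    return length;
  }
}
```

The sketch deliberately gives the fake remote object a different length (7) from the supplied stats (123), so a wrong code path shows up in both the length assertion and the call count.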
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]