danielcweeks commented on code in PR #14333:
URL: https://github.com/apache/iceberg/pull/14333#discussion_r2519927619
##########
gcp/src/main/java/org/apache/iceberg/gcp/gcs/GCSInputFile.java:
##########
@@ -64,6 +78,33 @@ public long getLength() {
@Override
public SeekableInputStream newStream() {
+ if (gcpProperties().isGcsAnalyticsCoreEnabled()) {
+ try {
+ GcsFileInfo fileInfo = getGcsFileInfo();
+ return new GoogleCloudStorageInputStreamWrapper(
+ GoogleCloudStorageInputStream.create(gcsFileSystem(), fileInfo));
+ } catch (IOException e) {
+ LOG.error("Failed to create GCS analytics core input stream.", e);
+ throw new RuntimeIOException(
+ e, "Failed to create GCS analytics core input stream for: %s", blobId().toGsUtilUri());
+ }
+ }
+
return new GCSInputStream(storage(), blobId(), blobSize, gcpProperties(), metrics());
}
+
+ GcsFileInfo getGcsFileInfo() {
+ BlobId blobId = blobId();
+ GcsItemId itemId =
+ GcsItemId.builder()
+ .setBucketName(blobId.getBucket())
+ .setObjectName(blobId.getName())
+ .build();
+ GcsItemInfo itemInfo =
+ GcsItemInfo.builder().setItemId(itemId).setSize(getLength()).build();
Review Comment:
I'm a little concerned about the behavior of the `getLength` call since we
may be introducing an extra call just to populate the size. Is this a required
field? Can we provide it only if we know the length as opposed to always
fetching?
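
A minimal sketch of the "only provide the size if it is already known" idea. It uses hypothetical stand-in types (`ItemInfo`, `InputFile`, `toItemInfo`), not the real `GcsItemInfo` or `GCSInputFile` classes. Whether the real builder accepts a missing size is exactly the open question above, so treat the optional field as an assumption:

```java
import java.util.OptionalLong;

public class LazyLengthSketch {

  /** Hypothetical stand-in for the file-info object; NOT the real GcsItemInfo. */
  static final class ItemInfo {
    final String bucket;
    final String name;
    final OptionalLong size; // assumption: size may be left unset

    ItemInfo(String bucket, String name, OptionalLong size) {
      this.bucket = bucket;
      this.name = name;
      this.size = size;
    }
  }

  /** Hypothetical stand-in for an input file that may or may not know its length. */
  static final class InputFile {
    private final String bucket;
    private final String name;
    private Long knownLength; // null until supplied by the caller or fetched
    int metadataCalls = 0; // counts simulated remote metadata requests

    InputFile(String bucket, String name, Long knownLength) {
      this.bucket = bucket;
      this.name = name;
      this.knownLength = knownLength;
    }

    /** Fetches the length on demand; the increment simulates the extra remote call. */
    long getLength() {
      if (knownLength == null) {
        metadataCalls++;
        knownLength = 42L; // placeholder value standing in for the fetched size
      }
      return knownLength;
    }

    /** Passes the size along only when it is already known; never triggers a fetch. */
    ItemInfo toItemInfo() {
      OptionalLong size =
          knownLength != null ? OptionalLong.of(knownLength) : OptionalLong.empty();
      return new ItemInfo(bucket, name, size);
    }
  }

  public static void main(String[] args) {
    InputFile known = new InputFile("bucket", "a.parquet", 1024L);
    InputFile unknown = new InputFile("bucket", "b.parquet", null);

    System.out.println(known.toItemInfo().size.isPresent());
    System.out.println(unknown.toItemInfo().size.isPresent());
    System.out.println(unknown.metadataCalls);
  }
}
```

With this shape, building the file info never calls `getLength()` itself, so opening a stream does not add a metadata request when the length was not supplied up front.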
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]