CalvinKirs commented on code in PR #63400:
URL: https://github.com/apache/doris/pull/63400#discussion_r3361927896
##########
fe/fe-filesystem/fe-filesystem-s3/src/main/java/org/apache/doris/filesystem/s3/S3OutputStream.java:
##########
@@ -15,48 +15,39 @@
// specific language governing permissions and limitations
// under the License.
-package org.apache.doris.filesystem.s3;
-
-import org.apache.doris.filesystem.spi.RequestBody;
+package org.apache.doris.filesystem.spi;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
/**
- * OutputStream that buffers writes in memory and uploads to S3 on close.
+ * OutputStream that buffers writes in memory and uploads to object storage on close.
*
- * <p>This implementation is intentionally simple and suitable for small metadata files (manifests,
- * snapshots, job info, etc.). Writes are rejected when the in-memory buffer would exceed
- * {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large payloads.
- * For large file writes (Hive data files, Backup archives), multipart upload must be used instead.
+ * <p>This implementation is intentionally simple and suitable for small metadata files
+ * (manifests, snapshots, job info, etc.). Writes are rejected when the in-memory buffer
+ * would exceed {@link #MAX_SINGLE_UPLOAD_BYTES} to prevent OOM on large payloads.
+ * For large file writes (Hive data files, Backup archives), multipart upload must be
+ * used instead.
*
- * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is called without any
- * preceding {@code write(...)} call, NO object is uploaded. This avoids polluting the bucket
- * with phantom 0-byte placeholders when the caller opens an output stream and aborts before
- * writing. To explicitly create a zero-byte object, call {@code write(new byte[0])} (or
- * {@code write(b, off, 0)}) prior to {@link #close()} — any write call, even of length 0,
- * marks the stream as "written" and triggers an upload of the empty buffer.
+ * <p><strong>Empty-close semantics (#22):</strong> if {@link #close()} is called without
+ * any preceding {@code write(...)} call, no object is uploaded. This avoids polluting the
+ * bucket with phantom 0-byte placeholders when the caller opens an output stream and
+ * aborts before writing. To explicitly create a zero-byte object, call
+ * {@code write(new byte[0])} or {@code write(b, off, 0)} before {@link #close()}.
*/
-class S3OutputStream extends OutputStream {
+public class ObjectStorageOutputStream extends OutputStream {
- /**
- * Maximum in-memory buffer size before writes are rejected.
- * S3 single-PUT limit is 5 GB, but we enforce a much smaller guard (256 MB) so that an
- * accidental large-file write fails early with a clear message rather than OOMing silently.
- */
private static final long MAX_SINGLE_UPLOAD_BYTES = 256L * 1024 * 1024; // 256 MB
private final String remotePath;
- private final S3ObjStorage objStorage;
+ private final ObjStorage<?> objStorage;
private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
private boolean closed = false;
- // Tracks whether write(...) was called at least once. close() skips the upload entirely
- // when this is false to avoid creating phantom 0-byte objects on accidental empty close.
Review Comment:
The class was moved and renamed to ObjectStorageOutputStream (in
fe-filesystem-spi); the comment was not intentionally dropped. The rationale
for the 256 MB guard versus the 5 GB S3 single-PUT limit is restored there as
an inline comment on MAX_SINGLE_UPLOAD_BYTES.
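
For readers following the thread without the full file, the behavior described in the Javadoc above (buffer in memory, reject writes past the size guard, skip the upload on an empty close) can be sketched roughly as follows. This is a minimal illustration, not the code in the PR: `BlobUploader`, `BufferedUploadStream`, and the configurable `maxBytes` constructor parameter are hypothetical names introduced here, and the real class talks to `ObjStorage<?>` with a fixed 256 MB constant.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the storage client; NOT the Doris ObjStorage API. */
interface BlobUploader {
    void put(String remotePath, byte[] content) throws IOException;
}

/** Sketch: buffers writes in memory and uploads once on close(). */
class BufferedUploadStream extends OutputStream {
    private final String remotePath;
    private final BlobUploader uploader;
    // Assumption: the guard is a constructor argument here so it is easy to
    // exercise; the class under review uses a 256 MB constant instead.
    private final long maxBytes;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed = false;
    // True once any write(...) call succeeded, even one of length 0.
    private boolean written = false;

    BufferedUploadStream(String remotePath, BlobUploader uploader, long maxBytes) {
        this.remotePath = remotePath;
        this.uploader = uploader;
        this.maxBytes = maxBytes;
    }

    @Override
    public void write(int b) throws IOException {
        ensureOpenAndFits(1);
        written = true;
        buffer.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        ensureOpenAndFits(len);
        written = true;
        buffer.write(b, off, len);
    }

    /** Fails early instead of letting the in-memory buffer grow without bound. */
    private void ensureOpenAndFits(int extra) throws IOException {
        if (closed) {
            throw new IOException("Stream already closed: " + remotePath);
        }
        if ((long) buffer.size() + extra > maxBytes) {
            throw new IOException("Write to " + remotePath + " would exceed "
                    + maxBytes + " bytes; use multipart upload for large files");
        }
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // close() is idempotent
        }
        closed = true;
        if (written) {
            // Uploads even an empty buffer if write(...) was called at least once.
            uploader.put(remotePath, buffer.toByteArray());
        }
    }
}

class BufferedUploadDemo {
    public static void main(String[] args) throws IOException {
        List<String> uploads = new ArrayList<>();
        BlobUploader uploader =
                (path, content) -> uploads.add(path + " (" + content.length + " bytes)");

        // Opened and closed without a write: nothing is uploaded.
        new BufferedUploadStream("meta/aborted", uploader, 1024).close();

        // Explicit zero-length write: a 0-byte object is uploaded.
        try (OutputStream out = new BufferedUploadStream("meta/empty", uploader, 1024)) {
            out.write(new byte[0]);
        }

        // Ordinary small payload.
        try (OutputStream out = new BufferedUploadStream("meta/small", uploader, 1024)) {
            out.write(new byte[] {1, 2, 3});
        }

        System.out.println(uploads);
    }
}
```

In this sketch the size check runs before the `written` flag is set, so a rejected oversized write followed by `close()` uploads nothing. Whether the real class behaves the same way on that path is not visible in the hunk above.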
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]