Joerg Hoh created OAK-12165:
-------------------------------
Summary: Review Part Size Values for Azure Blob Upload
Key: OAK-12165
URL: https://issues.apache.org/jira/browse/OAK-12165
Project: Jackrabbit Oak
Issue Type: Task
Components: blob-cloud-azure
Affects Versions: 1.92.0
Reporter: Joerg Hoh
We should revisit the values of these two constants:
* AZURE_BLOB_MIN_MULTIPART_UPLOAD_PART_SIZE (256k)
* AZURE_BLOB_MAX_MULTIPART_UPLOAD_PART_SIZE (4G)
Our implementation relied on these values and allocated a byte[] of up to 4 GB
before sending the buffer to the blob store (using a simple HttpURLConnection),
which blew the heap. For that reason we should revisit the value of
AZURE_BLOB_MAX_MULTIPART_UPLOAD_PART_SIZE.
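One way to avoid the large allocation, sketched here purely as an illustration
(the helper name and buffer size are hypothetical, not Oak identifiers), is to
stream the part in bounded chunks instead of materializing the whole part in a
byte[] first:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Illustrative sketch: copy data in fixed-size chunks so the heap cost is
// bounded by BUFFER_SIZE, not by the (up to 4 GB) part size. With an
// HttpURLConnection this could be combined with setChunkedStreamingMode()
// so the connection never buffers the full part either.
public class ChunkedCopy {
    static final int BUFFER_SIZE = 1 << 20; // 1 MB working buffer (hypothetical)

    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[BUFFER_SIZE];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[3 * (1 << 20) + 17]; // not a multiple of the buffer
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), out);
        System.out.println(copied == data.length
                && java.util.Arrays.equals(data, out.toByteArray()));
    }
}
```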
AZURE_BLOB_MIN_MULTIPART_UPLOAD_PART_SIZE is even more important: when this
value is used to upload a large binary, the implementation returns a huge
collection of URIs, which can pose a problem on its own (we calculated that we
would get 500k DownloadURIs when uploading a 100G file).
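A back-of-the-envelope check of that part count, assuming the binary is split
at the 256k minimum part size (the helper below is illustrative, not an Oak
API), lands in the same ballpark:

```java
// Ceiling division gives the number of parts needed to cover binarySize.
// For 100 GiB at 256 KiB per part: 100 * 1024^2 / 256 = 409,600 parts,
// the same order of magnitude as the ~500k figure above.
public class PartCount {
    static long partCount(long binarySize, long partSize) {
        return (binarySize + partSize - 1) / partSize;
    }

    public static void main(String[] args) {
        long hundredGb = 100L * 1024 * 1024 * 1024; // 100 GiB
        long minPart   = 256L * 1024;               // 256 KiB
        System.out.println(partCount(hundredGb, minPart)); // prints 409600
    }
}
```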
Where do these values come from, and why were they changed so dramatically
from the values used with V8 (min=10M, max=100M), which worked very well
for us?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)