Joerg Hoh created OAK-12165:
-------------------------------

             Summary: Review Part Size Values for Azure Blob Upload
                 Key: OAK-12165
                 URL: https://issues.apache.org/jira/browse/OAK-12165
             Project: Jackrabbit Oak
          Issue Type: Task
          Components: blob-cloud-azure
    Affects Versions: 1.92.0
            Reporter: Joerg Hoh


We should revisit the values of the two constants:
* AZURE_BLOB_MIN_MULTIPART_UPLOAD_PART_SIZE (256k)
* AZURE_BLOB_MAX_MULTIPART_UPLOAD_PART_SIZE (4G)

Our implementation relied on these values and allocated a byte[] buffer of up to 
4G before sending it to the blob store (via a plain HttpUrlConnection), which 
blew the heap. For that reason we should revisit the value of 
AZURE_BLOB_MAX_MULTIPART_UPLOAD_PART_SIZE.
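As a rough illustration of why a 4G maximum is problematic when parts are buffered in memory (a hypothetical sketch, not the actual Oak code): Java arrays are indexed by int, so a byte[] can never even hold a full 4 GiB part, and buffers merely approaching that size exhaust a typical heap.

```java
// Hypothetical sketch (not Oak code): a byte[] cannot represent a 4 GiB
// part because array lengths are ints, and anything close to that size
// blows a typically-sized heap when allocated per upload part.
public class MaxPartSize {
    public static void main(String[] args) {
        long maxPart = 4L * 1024 * 1024 * 1024; // 4 GiB maximum part size
        // true: byte[maxPart] is not even expressible, let alone allocatable
        System.out.println(maxPart > Integer.MAX_VALUE);
    }
}
```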

AZURE_BLOB_MIN_MULTIPART_UPLOAD_PART_SIZE is even more important: when this 
value is used to upload a large binary, the implementation returns a large 
collection of URIs, which can pose a problem of its own (we calculated that we 
would get 500k DownloadURIs when uploading a 100G file).
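As a sanity check on the order of magnitude (a hypothetical sketch using ceiling division and binary units, not Oak code), the part count implied by a given minimum part size can be computed like this:

```java
// Hypothetical sketch (not Oak code): number of upload parts implied by a
// given part size, via ceiling division.
public class PartCount {
    static long partCount(long fileSize, long partSize) {
        return (fileSize + partSize - 1) / partSize; // ceiling division
    }

    public static void main(String[] args) {
        long file = 100L * 1024 * 1024 * 1024; // 100 GiB binary
        long newMin = 256L * 1024;             // current 256k minimum
        long oldMin = 10L * 1024 * 1024;       // V8 minimum of 10M
        System.out.println(partCount(file, newMin)); // 409600 parts
        System.out.println(partCount(file, oldMin)); // 10240 parts
    }
}
```

With binary units this comes out to roughly 400k parts for a 100G file at the 256k minimum, consistent with the ~500k figure above, versus about 10k parts under the old V8 minimum.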

Where do these values come from, and why were they changed so dramatically from 
the values used with V8 (min=10M, max=100M), which worked very well for us?


--
This message was sent by Atlassian Jira
(v8.20.10#820010)