This is an automated email from the ASF dual-hosted git repository. davsclaus pushed a commit to branch fix/CAMEL-23806-split-aws-s3-docs in repository https://gitbox.apache.org/repos/asf/camel.git
commit c56ad16e6c2a33da26ae9545daafc1124842e6ae Author: Claus Ibsen <[email protected]> AuthorDate: Sat Jun 20 16:40:45 2026 +0200 CAMEL-23806: Split aws2-s3 docs into focused sub-pages Split the AWS S3 component documentation (2,515 lines) into 3 focused sub-pages, reducing the main page to 392 lines. - aws2-s3-producer-operations: all 28 producer operation examples + IAM policies - aws2-s3-streaming: streaming upload mode with timestamp grouping - aws2-s3-consumer-examples: consumer patterns (prefix, filtering, done files) Co-Authored-By: Claude <[email protected]> Signed-off-by: Claus Ibsen <[email protected]> --- .../src/main/docs/aws2-s3-component.adoc | 2146 +------------------- .../src/main/docs/aws2-s3-consumer-examples.adoc | 313 +++ ...onent.adoc => aws2-s3-producer-operations.adoc} | 999 +-------- .../src/main/docs/aws2-s3-streaming.adoc | 303 +++ docs/components/modules/others/nav.adoc | 3 + .../others/pages/aws2-s3-consumer-examples.adoc | 1 + .../others/pages/aws2-s3-producer-operations.adoc | 1 + .../modules/others/pages/aws2-s3-streaming.adoc | 1 + 8 files changed, 639 insertions(+), 3128 deletions(-) diff --git a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc index 4edcc0e06d43..ed3923fa2caf 100644 --- a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc +++ b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc @@ -246,1833 +246,13 @@ The order of evaluation for Default Credentials Provider is the following: - Web Identity Token from AWS STS. - The shared credentials and config files. - Amazon ECS container credentials - loaded from the Amazon ECS if the environment variable `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` is set. - - Amazon EC2 Instance profile credentials. 
- -You have also the possibility of using Profile Credentials Provider, by specifying the useProfileCredentialsProvider option to true and profileCredentialsName to the profile name. - -Only one of static, default and profile credentials could be used at the same time. - -For more information about this you can look at https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html[AWS credentials documentation] - -=== S3 Producer Operation examples - -- Single Upload: This operation will upload a file to S3 based on the body content - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camel.txt")) - .setBody(constant("Camel rocks!")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camel.txt</constant></setHeader> - <setBody><constant>Camel rocks!</constant></setBody> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camel.txt" - - setBody: - constant: "Camel rocks!" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - - to: - uri: mock:result ----- -==== - -This operation will upload the file camel.txt with the content "Camel rocks!" 
in the _mycamelbucket_ bucket - -- Multipart Upload: This operation will perform a multipart upload of a file to S3 based on the body content - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("empty.txt")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&multiPartUpload=true&autoCreateBucket=true&partSize=1048576") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>empty.txt</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&multiPartUpload=true&autoCreateBucket=true&partSize=1048576"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "empty.txt" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - multiPartUpload: true - autoCreateBucket: true - partSize: 1048576 - - to: - uri: mock:result ----- -==== - -This operation will perform a multipart upload of the file empty.txt with based on the content the file src/empty.txt in the _mycamelbucket_ bucket - -- CopyObject: this operation copies an object from one bucket to a different one - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3BucketDestinationName", constant("camelDestinationBucket")) - .setHeader("CamelAwsS3Key", constant("camelKey")) - .setHeader("CamelAwsS3DestinationKey", constant("camelDestinationKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=copyObject") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3BucketDestinationName"><constant>camelDestinationBucket</constant></setHeader> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <setHeader 
name="CamelAwsS3DestinationKey"><constant>camelDestinationKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=copyObject"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3BucketDestinationName - constant: "camelDestinationBucket" - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - setHeader: - name: CamelAwsS3DestinationKey - constant: "camelDestinationKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: copyObject - - to: - uri: mock:result ----- -==== - -This operation will copy the object with the name expressed in the header camelDestinationKey to the camelDestinationBucket bucket, from the bucket _mycamelbucket_. - -- DeleteObject: this operation deletes an object from a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObject") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObject"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteObject - - to: - uri: mock:result ----- -==== - -This operation will delete the object camelKey from the bucket _mycamelbucket_. 
- -- ListBuckets: this operation lists the buckets for this account in this region - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=listBuckets") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=listBuckets"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: listBuckets - - to: - uri: mock:result ----- -==== - -This operation will list the buckets for this account - -- DeleteBucket: this operation deletes the bucket specified as URI parameter or header - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucket") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucket"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteBucket - - to: - uri: mock:result ----- -==== - -This operation will delete the bucket _mycamelbucket_ - -- ListObjects: this operation list object in a specific bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=listObjects") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=listObjects"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: 
-+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: listObjects - - to: - uri: mock:result ----- -==== - -This operation will list the objects in the _mycamelbucket_ bucket - -- GetObject: this operation gets a single object in a specific bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObject") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObject"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getObject - - to: - uri: mock:result ----- -==== - -This operation will return an S3Object instance related to the camelKey object in _mycamelbucket_ bucket. 
- -- GetObjectRange: this operation gets a single object range in a specific bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .setHeader("CamelAwsS3RangeStart", constant("0")) - .setHeader("CamelAwsS3RangeEnd", constant("9")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectRange") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <setHeader name="CamelAwsS3RangeStart"><constant>0</constant></setHeader> - <setHeader name="CamelAwsS3RangeEnd"><constant>9</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectRange"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - setHeader: - name: CamelAwsS3RangeStart - constant: "0" - - setHeader: - name: CamelAwsS3RangeEnd - constant: "9" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getObjectRange - - to: - uri: mock:result ----- -==== - -This operation will return an S3Object instance related to the camelKey object in _mycamelbucket_ bucket, containing the bytes from 0 to 9. 
 - -- CreateDownloadLink: this operation will return a download link through S3 Presigner - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?accessKey=xxx&secretKey=yyy&region=region&operation=createDownloadLink") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?accessKey=xxx&secretKey=yyy&region=region&operation=createDownloadLink"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - accessKey: xxx - secretKey: yyy - region: region - operation: createDownloadLink - - to: - uri: mock:result ----- -==== - -This operation will return a download link url for the file camel-key in the bucket _mycamelbucket_ and region _region_. -Parameters (`accessKey`, `secretKey` and `region`) are mandatory for this operation, if S3 client is autowired from the registry. - -NOTE: If checksum validations are enabled, the url will no longer be browser compatible because it adds a signed header that must be included in the HTTP request. 
- -- HeadBucket: this operation checks if a bucket exists and you have permission to access it - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=headBucket") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=headBucket"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: headBucket - - to: - uri: mock:result ----- -==== - -This operation will check if the bucket _mycamelbucket_ exists and is accessible. - -- HeadObject: this operation retrieves metadata from an object without returning the object itself - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=headObject") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=headObject"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: headObject - - to: - uri: mock:result ----- -==== - -This operation will return metadata about the object camelKey in the bucket _mycamelbucket_. 
- -- DeleteObjects: this operation deletes multiple objects from a bucket in a single request - -NOTE: The `CamelAwsS3KeysToDelete` header requires a `List<String>` value, which must be set from a bean or processor. - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .process(exchange -> { - List<String> keys = List.of("key1", "key2", "key3"); - exchange.getIn().setHeader("CamelAwsS3KeysToDelete", keys); - }) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObjects") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <process ref="deleteKeysProcessor"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObjects"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - process: - ref: deleteKeysProcessor - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteObjects - - to: - uri: mock:result ----- -==== - -This operation will delete the objects with keys key1, key2, and key3 from the bucket _mycamelbucket_. 
 - -- CreateUploadLink: this operation will return an upload link through S3 Presigner - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?accessKey=xxx&secretKey=yyy&region=region&operation=createUploadLink") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?accessKey=xxx&secretKey=yyy&region=region&operation=createUploadLink"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - accessKey: xxx - secretKey: yyy - region: region - operation: createUploadLink - - to: - uri: mock:result ----- -==== - -This operation will return an upload link url for uploading to the bucket _mycamelbucket_. 
- -- RestoreObject: this operation restores an archived object from Glacier storage - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .setHeader("CamelAwsS3RestoreDays", constant(1)) - .setHeader("CamelAwsS3RestoreTier", constant("Expedited")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=restoreObject") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <setHeader name="CamelAwsS3RestoreDays"><constant>1</constant></setHeader> - <setHeader name="CamelAwsS3RestoreTier"><constant>Expedited</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=restoreObject"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - setHeader: - name: CamelAwsS3RestoreDays - constant: 1 - - setHeader: - name: CamelAwsS3RestoreTier - constant: "Expedited" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: restoreObject - - to: - uri: mock:result ----- -==== - -This operation will restore the archived object camelKey from Glacier for 1 day using expedited retrieval. 
- -- GetObjectTagging: this operation retrieves the tags associated with an object - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getObjectTagging - - to: - uri: mock:result ----- -==== - -This operation will return the tags for object camelKey in the bucket _mycamelbucket_. - -- PutObjectTagging: this operation sets tags on an object - -NOTE: The `CamelAwsS3ObjectTags` header requires a `Map<String, String>` value, which must be set from a bean or processor. 
- -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .process(exchange -> { - Map<String, String> tags = Map.of("Environment", "Production", "Owner", "TeamA"); - exchange.getIn().setHeader("CamelAwsS3Key", "camelKey"); - exchange.getIn().setHeader("CamelAwsS3ObjectTags", tags); - }) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putObjectTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <process ref="objectTagsProcessor"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putObjectTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - process: - ref: objectTagsProcessor - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: putObjectTagging - - to: - uri: mock:result ----- -==== - -This operation will set tags on the object camelKey in the bucket _mycamelbucket_. 
- -- DeleteObjectTagging: this operation deletes all tags from an object - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObjectTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteObjectTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteObjectTagging - - to: - uri: mock:result ----- -==== - -This operation will delete all tags from the object camelKey in the bucket _mycamelbucket_. - -- GetObjectAcl: this operation retrieves the access control list (ACL) for an object - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectAcl") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getObjectAcl"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getObjectAcl - - to: - uri: mock:result ----- -==== - -This operation will return the ACL for object camelKey in the bucket _mycamelbucket_. 
- -- PutObjectAcl: this operation sets the access control list (ACL) for an object - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3Key", constant("camelKey")) - .setHeader("CamelAwsS3CannedAcl", constant("PublicRead")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putObjectAcl") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3Key"><constant>camelKey</constant></setHeader> - <setHeader name="CamelAwsS3CannedAcl"><constant>PublicRead</constant></setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putObjectAcl"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3Key - constant: "camelKey" - - setHeader: - name: CamelAwsS3CannedAcl - constant: "PublicRead" - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: putObjectAcl - - to: - uri: mock:result ----- -==== - -This operation will set the ACL to public-read for object camelKey in the bucket _mycamelbucket_. - -- CreateBucket: this operation creates a new S3 bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mynewbucket?amazonS3Client=#amazonS3Client&operation=createBucket") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mynewbucket?amazonS3Client=#amazonS3Client&operation=createBucket"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mynewbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: createBucket - - to: - uri: mock:result ----- -==== - -This operation will create a new bucket named _mynewbucket_ in the configured region. 
- -- GetBucketTagging: this operation retrieves the tags associated with a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getBucketTagging - - to: - uri: mock:result ----- -==== - -This operation will return the tags for the bucket _mycamelbucket_. - -- PutBucketTagging: this operation sets tags on a bucket - -NOTE: The `CamelAwsS3BucketTags` header requires a `Map<String, String>` value, which must be set from a bean or processor. - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .process(exchange -> { - Map<String, String> tags = Map.of("Project", "CamelIntegration", "CostCenter", "Engineering"); - exchange.getIn().setHeader("CamelAwsS3BucketTags", tags); - }) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <process ref="bucketTagsProcessor"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - process: - ref: bucketTagsProcessor - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: putBucketTagging - - to: - uri: mock:result ----- -==== - -This operation will set tags on the bucket _mycamelbucket_. 
- -- DeleteBucketTagging: this operation deletes all tags from a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucketTagging") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucketTagging"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteBucketTagging - - to: - uri: mock:result ----- -==== - -This operation will delete all tags from the bucket _mycamelbucket_. - -- GetBucketVersioning: this operation retrieves the versioning configuration of a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketVersioning") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketVersioning"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getBucketVersioning - - to: - uri: mock:result ----- -==== - -This operation will return the versioning configuration for the bucket _mycamelbucket_. 
- -- PutBucketVersioning: this operation sets the versioning configuration of a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .setHeader("CamelAwsS3VersioningStatus", constant("Enabled")) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketVersioning") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <setHeader name="CamelAwsS3VersioningStatus"> - <constant>Enabled</constant> - </setHeader> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketVersioning"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - setHeader: - name: CamelAwsS3VersioningStatus - constant: Enabled - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: putBucketVersioning - - to: - uri: mock:result ----- -==== - -This operation will enable versioning on the bucket _mycamelbucket_. - -- GetBucketPolicy: this operation retrieves the policy of a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketPolicy") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=getBucketPolicy"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: getBucketPolicy - - to: - uri: mock:result ----- -==== - -This operation will return the bucket policy for _mycamelbucket_ as a JSON string. 
- -- PutBucketPolicy: this operation sets the policy on a bucket - -NOTE: The `CamelAwsS3BucketPolicy` header requires a JSON policy string, which must be set from a bean or processor. - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:start") - .process(exchange -> { - String policy = """ - {"Version": "2012-10-17", "Statement": [{"Effect": "Allow", \ - "Principal": "*", "Action": "s3:GetObject", \ - "Resource": "arn:aws:s3:::mycamelbucket/*"}]}"""; - exchange.getIn().setHeader("CamelAwsS3BucketPolicy", policy); - }) - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketPolicy") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <process ref="bucketPolicyProcessor"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=putBucketPolicy"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - process: - ref: bucketPolicyProcessor - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: putBucketPolicy - - to: - uri: mock:result ----- -==== - -This operation will set a bucket policy on _mycamelbucket_. 
- -- DeleteBucketPolicy: this operation deletes the policy from a bucket - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("direct:start") - .to("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucketPolicy") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:start"/> - <to uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&operation=deleteBucketPolicy"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:start - steps: - - to: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - operation: deleteBucketPolicy - - to: - uri: mock:result ----- -==== - -This operation will delete the bucket policy from _mycamelbucket_. - -=== AWS S3 Producer minimum permissions - -For making the producer work, you'll need at least PutObject and ListBuckets permissions. The following policy will be enough: - -[source,json] ----- -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:PutObject", - "Resource": "arn:aws:s3:::*/*" - }, - { - "Effect": "Allow", - "Action": "s3:ListBucket", - "Resource": "arn:aws:s3:::*" - } - ] -} ----- - -A variation to the minimum permissions is related to the usage of Bucket autocreation. In that case the permissions will need to be increased with CreateBucket permission - -[source,json] ----- -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:PutObject", - "Resource": "arn:aws:s3:::*/*" - }, - { - "Effect": "Allow", - "Action": "s3:ListBucket", - "Resource": "arn:aws:s3:::*" - }, - { - "Effect": "Allow", - "Action": "s3:CreateBucket", - "Resource": "arn:aws:s3:::*" - } - ] -} ----- - -=== AWS S3 Consumer minimum permissions - -For making the producer work, you'll need at least GetObject, ListBuckets and DeleteObject permissions. 
The following policy will be enough: - -[source,json] ----- -{ - "Version": "2012-10-17", - "Statement": [ - { - "Effect": "Allow", - "Action": "s3:ListBucket", - "Resource": "arn:aws:s3:::*" - }, - { - "Effect": "Allow", - "Action": "s3:GetObject", - "Resource": "arn:aws:s3:::*/*" - }, - { - "Effect": "Allow", - "Action": "s3:DeleteObject", - "Resource": "arn:aws:s3:::*/*" - } - ] -} ----- - -By Default the consumer will use the deleteAfterRead option, this means the object will be deleted once consumed, this is why the DeleteObject permission is required. - -=== Streaming Upload mode - -With the stream mode enabled, users will be able to upload data to S3 without knowing ahead of time the dimension of the data, by leveraging multipart upload. -The upload will be completed when the batchSize has been completed or the batchMessageNumber has been reached. -There are two possible naming strategies: progressive and random. -With the progressive strategy, each file will have the name composed by keyName option and a progressive counter, and eventually the file extension (if any), while with the random strategy a UUID will be added after keyName and eventually the file extension will be appended. - -Additionally, streaming upload mode supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows based on their timestamps. 
- -As an example: - -._Java-only: Endpoint DSL builder style_ - -[source,java] ----- -from(kafka("topic1").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) - .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); - -from(kafka("topic2").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.random) - .keyName("{{kafkaTopic2}}/{{kafkaTopic2}}.txt")); ----- - -The default size for a batch is 1 Mb, but you can adjust it according to your requirements. - -When you stop your producer route, the producer will take care of flushing the remaining buffered message and complete the upload. - -In Streaming upload, you'll be able to restart the producer from the point where it left. It's important to note that this feature is critical only when using the progressive naming strategy. - -By setting the restartingPolicy to lastPart, you will restart uploading files and contents from the last part number the producer left. - -As example: - - Start the route with progressive naming strategy and keyname equals to camel.txt, with batchMessageNumber equals to 20, and restartingPolicy equals to lastPart - - Send 70 messages. - - Stop the route - - On your S3 bucket you should now see four files: camel.txt, camel-1.txt, camel-2.txt and camel-3.txt, the first three will have 20 messages, while the last one is only 10. - - Restart the route - - Send 25 messages - - Stop the route - - You'll now have two other files in your bucket: camel-5.txt and camel-6.txt, the first with 20 messages and the second with 5 messages. - - Go ahead - -This won't be needed when using the random naming strategy. - -On the opposite, you can specify the override restartingPolicy. 
In that case, you'll be able to override whatever you written before (for that particular keyName) in your bucket. - -[NOTE] -==== -In Streaming upload mode, the only keyName option that will be taken into account is the endpoint option. Using the header will throw an NPE and this is done by design. -Setting the header means potentially change the file name on each exchange, and this is against the aim of the streaming upload producer. The keyName needs to be fixed and static. -The selected naming strategy will do the rest of the work. -==== - -Another possibility is specifying a streamingUploadTimeout with batchMessageNumber and batchSize options. With this option, the user will be able to complete the upload of a file after a certain time passed. -In this way, the upload completion will be passed on three tiers: the timeout, the number of messages and the batch size. - -As an example: - -._Java-only: Endpoint DSL builder style_ - -[source,java] ----- -from(kafka("topic1").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .streamingUploadTimeout(10000) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) - .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); ----- - -In this case, the upload will be completed after 10 seconds. - -==== Timestamp Grouping - -The streaming upload mode also supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows and written to the same S3 file based on their timestamps. This feature enables append-like behavior where messages with timestamps falling within the same time window are combined into a single file. 
- -To enable timestamp grouping, use the following configuration options: - -- `timestampGroupingEnabled`: Set to `true` to enable timestamp-based grouping (default: `false`) -- `timestampWindowSizeMillis`: The size of the time window in milliseconds (default: `300000` = 5 minutes) -- `timestampHeaderName`: The name of the message header containing the timestamp (default: `Exchange.MESSAGE_TIMESTAMP`) - -Messages are grouped based on their timestamps extracted from the specified header. The timestamp can be provided as: - -- Long: Unix timestamp in milliseconds -- Date: Java Date object -- String: String representation of Unix timestamp in milliseconds - -Files are automatically named using a timestamp-based pattern that includes the time window information. - -===== Example Configuration - -Basic timestamp grouping with 5-minute windows: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("timer:messages?period=10000") - .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) - .setBody(constant("Message with timestamp")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="timer:messages?period=10000"/> - <setHeader name="CamelMessageTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> - <constant>Message with timestamp</constant> - </setBody> - <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: timer:messages - parameters: - period: 10000 - steps: - - setHeader: - name: CamelMessageTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: Message with timestamp - - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true -
timestampWindowSizeMillis: 300000 - keyName: grouped-messages.txt ----- -==== - -Custom window size (1 minute) with custom header name: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:timestamped") - .setHeader("MyTimestamp", simple("${date:now:timestamp}")) - .setBody(constant("Custom timestamped message")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:timestamped"/> - <setHeader name="MyTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> - <constant>Custom timestamped message</constant> - </setBody> - <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:timestamped - steps: - - setHeader: - name: MyTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: Custom timestamped message - - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true - timestampWindowSizeMillis: 60000 - timestampHeaderName: MyTimestamp - keyName: custom-grouped.txt ----- -==== - -Large files with multipart and timestamp grouping: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:large-timestamped") - .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) - .setBody(constant("Large message content...")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:large-timestamped"/> - <setHeader name="CamelMessageTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> -
<constant>Large message content...</constant> - </setBody> - <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:large-timestamped - steps: - - setHeader: - name: CamelMessageTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: "Large message content..." - - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true - timestampWindowSizeMillis: 1800000 - multiPartUpload: true - partSize: 5242880 - keyName: large-grouped.txt ----- -==== - -===== File Naming - -Files are automatically named using the following pattern: -``` -{baseFileName}_{YYYYMMDD}_{HHMM}_{HHMM-HHMM}.{extension} -``` - -For time windows smaller than 1 minute, seconds precision is used: -``` -{baseFileName}_{YYYYMMDD}_{HHMMSS}_{HHMMSS-HHMMSS}.{extension} -``` - -Example file names: -- 5-minute window: `messages_20240101_0800_0800-0805.txt` -- 1-minute window: `data_20240315_1430_1430-1431.log` -- 5-second window: `events_20241225_000000_000000-000005.json` + - Amazon EC2 Instance profile credentials. -===== Time Window Examples - -With a 5-minute window (300000ms): -- Window 1: 08:00:00 - 08:04:59 → `file_20240101_0800_0800-0805.txt` -- Window 2: 08:05:00 - 08:09:59 → `file_20240101_0805_0805-0810.txt` -- Window 3: 08:10:00 - 08:14:59 → `file_20240101_0810_0810-0815.txt` - -===== Fallback Behavior - -- If no timestamp header is found, the current system time is used -- If the timestamp header contains invalid data, the current system time is used -- Warning messages are logged when fallback behavior occurs +You have also the possibility of using Profile Credentials Provider, by specifying the useProfileCredentialsProvider option to true and profileCredentialsName to the profile name. 
-===== Performance Considerations +Only one of static, default and profile credentials could be used at the same time. -- Multiple concurrent time windows are supported -- Each window maintains its own upload state -- Memory usage scales with the number of active time windows -- Completed windows are automatically cleaned up -- Works with all existing streaming upload features (multipart uploads, timeouts, etc.) +For more information about this you can look at https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html[AWS credentials documentation] === Bucket Auto-creation @@ -2085,315 +265,6 @@ Some users like to consume stuff from a bucket and move the content in a differe If this is case for you, remember to remove the bucketName header from the incoming exchange of the consumer, otherwise the file will always be overwritten on the same original bucket. -=== MoveAfterRead consumer option - -In addition to deleteAfterRead, it has been added another option, moveAfterRead. With this option enabled, the consumed object will be moved to a target destinationBucket instead of being only deleted. -This will require specifying the destinationBucket option. 
As example: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - steps: - - to: - uri: mock:result ----- -==== - -In this case, the objects consumed will be moved to _myothercamelbucket_ bucket and deleted from the original one (because of deleteAfterRead set to true as default). - -You have also the possibility of using a key prefix/suffix while moving the file to a different bucket. -The options are `destinationBucketPrefix` and `destinationBucketSuffix`. - -Both options support the xref:languages:simple-language.adoc[Simple] expression language. 
-Wrap an expression in `RAW()` to prevent the Camel URI parser from interpreting special characters: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - destinationBucketPrefix: "RAW(pre-)" - destinationBucketSuffix: "RAW(-suff)" - steps: - - to: - uri: mock:result ----- -==== - -In this case, an object named `test` is moved to `myothercamelbucket` with the key `pre-test-suff`. - -Using a Simple expression, you can build dynamic paths at runtime. 
-The following example organises moved objects by date: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - destinationBucketPrefix: "RAW(${date:now:yyyy/MM/dd}/)" - steps: - - to: - uri: mock:result ----- -==== - -An object named `report.csv` consumed on 2026-05-19 is moved with the key `2026/05/19/report.csv`. -Expressions are evaluated once per exchange, so each message can produce a different destination key. - -=== Additional Consumer Examples - -=== Consumer with prefix filtering - -You can configure the consumer to only process objects with a specific prefix: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - prefix: processed/ - delay: 30000 - steps: - - to: - uri: mock:result ----- -==== - -This will only consume objects that start with "processed/" prefix from the _mycamelbucket_ bucket, with a 30-second polling delay. 
- -=== Consumer with custom polling and batch settings - -Configure custom polling intervals and batch sizes: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - delay: 60000 - maxMessagesPerPoll: 5 - includeBody: false - steps: - - to: - uri: mock:result ----- -==== - -This consumer polls every 60 seconds, processes up to 5 objects per poll, and doesn't include the object body in the message (only metadata). - -=== Consumer with file filtering and no deletion - -Configure the consumer to not delete files after reading and include specific file patterns: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - deleteAfterRead: false - fileName: "*.pdf" - steps: - - to: - uri: mock:result ----- -==== - -This consumer will read PDF files but won't delete them after processing. 
- -=== Consumer with done file pattern - -Use a done file pattern to ensure files are completely uploaded before processing: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - doneFileName: "*.done" - steps: - - to: - uri: mock:result ----- -==== - -This consumer will only process files when a corresponding .done file exists in the bucket. - === Using the customer key as encryption We introduced also the customer key support (an alternative of using KMS). The following code shows an example. @@ -2494,6 +365,14 @@ configuration.setUriEndpointOverride("http://ecs1.emc.com:9020"); For more details see https://www.dell.com/support/manuals/en-us/ecs-appliance-/ecs_pub_data_access_guide_3_3_to_3_6/using-the-java-amazon-sdk?guid=guid-149be134-6938-4cd7-9503-cf3ca9d23261&lang=en-us[their documentation]. +== Sub-Pages + +For more details on specific features, see: + +* xref:others:aws2-s3-producer-operations.adoc[Producer Operations] - All producer operation examples with minimum IAM permissions +* xref:others:aws2-s3-streaming.adoc[Streaming Upload] - Streaming upload mode, naming strategies, and timestamp grouping +* xref:others:aws2-s3-consumer-examples.adoc[Consumer Examples] - Consumer patterns including prefix filtering, polling strategies, and done files + == Dependencies Maven users will need to add the following dependency to their pom.xml. @@ -2511,4 +390,3 @@ Maven users will need to add the following dependency to their pom.xml. where `$\{camel-version}` must be replaced by the actual version of Camel. 
- diff --git a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-consumer-examples.adoc b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-consumer-examples.adoc new file mode 100644 index 000000000000..ab28ca5d7e6d --- /dev/null +++ b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-consumer-examples.adoc @@ -0,0 +1,313 @@ += AWS S3 - Consumer Examples +:tabs-sync-option: + +xref:ROOT:aws2-s3-component.adoc[Back to AWS S3 Component] + +== MoveAfterRead consumer option + +In addition to deleteAfterRead, it has been added another option, moveAfterRead. With this option enabled, the consumed object will be moved to a target destinationBucket instead of being only deleted. +This will require specifying the destinationBucket option. As example: + +[tabs] +==== +Java:: ++ +[source,java] +---- + from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + moveAfterRead: true + destinationBucket: myothercamelbucket + steps: + - to: + uri: mock:result +---- +==== + +In this case, the objects consumed will be moved to _myothercamelbucket_ bucket and deleted from the original one (because of deleteAfterRead set to true as default). + +You have also the possibility of using a key prefix/suffix while moving the file to a different bucket. +The options are `destinationBucketPrefix` and `destinationBucketSuffix`. + +Both options support the xref:languages:simple-language.adoc[Simple] expression language. 
+Wrap an expression in `RAW()` to prevent the Camel URI parser from interpreting special characters: + +[tabs] +==== +Java:: ++ +[source,java] +---- +from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + moveAfterRead: true + destinationBucket: myothercamelbucket + destinationBucketPrefix: "RAW(pre-)" + destinationBucketSuffix: "RAW(-suff)" + steps: + - to: + uri: mock:result +---- +==== + +In this case, an object named `test` is moved to `myothercamelbucket` with the key `pre-test-suff`. + +Using a Simple expression, you can build dynamic paths at runtime. 
+The following example organises moved objects by date: + +[tabs] +==== +Java:: ++ +[source,java] +---- +from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + moveAfterRead: true + destinationBucket: myothercamelbucket + destinationBucketPrefix: "RAW(${date:now:yyyy/MM/dd}/)" + steps: + - to: + uri: mock:result +---- +==== + +An object named `report.csv` consumed on 2026-05-19 is moved with the key `2026/05/19/report.csv`. +Expressions are evaluated once per exchange, so each message can produce a different destination key. + +== Additional Consumer Examples + +== Consumer with prefix filtering + +You can configure the consumer to only process objects with a specific prefix: + +[tabs] +==== +Java:: ++ +[source,java] +---- + from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + prefix: processed/ + delay: 30000 + steps: + - to: + uri: mock:result +---- +==== + +This will only consume objects that start with "processed/" prefix from the _mycamelbucket_ bucket, with a 30-second polling delay. 
+ +== Consumer with custom polling and batch settings + +Configure custom polling intervals and batch sizes: + +[tabs] +==== +Java:: ++ +[source,java] +---- + from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + delay: 60000 + maxMessagesPerPoll: 5 + includeBody: false + steps: + - to: + uri: mock:result +---- +==== + +This consumer polls every 60 seconds, processes up to 5 objects per poll, and doesn't include the object body in the message (only metadata). + +== Consumer with file filtering and no deletion + +Configure the consumer to not delete files after reading and include specific file patterns: + +[tabs] +==== +Java:: ++ +[source,java] +---- + from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + deleteAfterRead: false + fileName: "*.pdf" + steps: + - to: + uri: mock:result +---- +==== + +This consumer will read PDF files but won't delete them after processing. 
+ +== Consumer with done file pattern + +Use a done file pattern to ensure files are completely uploaded before processing: + +[tabs] +==== +Java:: ++ +[source,java] +---- + from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done") + .to("mock:result"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done"/> + <to uri="mock:result"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: aws2-s3://mycamelbucket + parameters: + amazonS3Client: "#amazonS3Client" + doneFileName: "*.done" + steps: + - to: + uri: mock:result +---- +==== + +This consumer will only process files when a corresponding .done file exists in the bucket. diff --git a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-producer-operations.adoc similarity index 53% copy from components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc copy to components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-producer-operations.adoc index 4edcc0e06d43..e0f8bf5c5b9e 100644 --- a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-component.adoc +++ b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-producer-operations.adoc @@ -1,260 +1,9 @@ -= AWS S3 Storage Service Component -:doctitle: AWS S3 Storage Service -:shortname: aws2-s3 -:artifactid: camel-aws2-s3 -:description: Store and retrieve objects from AWS S3 Storage Service. -:since: 3.2 -:supportlevel: Stable += AWS S3 - Producer Operations :tabs-sync-option: -:component-header: Both producer and consumer are supported -//Manually maintained attributes -:group: AWS -*Since Camel {since}* +xref:ROOT:aws2-s3-component.adoc[Back to AWS S3 Component] -*{component-header}* - -The AWS2 S3 component supports storing and retrieving objects from/to -https://aws.amazon.com/s3[Amazon's S3] service. 
- -Prerequisites - -You must have a valid Amazon Web Services developer account, and be -signed up to use Amazon S3. More information is available at -https://aws.amazon.com/s3[Amazon S3]. - -== URI Format - ----- -aws2-s3://bucketNameOrArn[?options] ----- - -The bucket will be created if it doesn't already exist. - -You can append query options to the URI in the following format: - -`?options=value&option2=value&...` - - -// component options: START -include::partial$component-configure-options.adoc[] -include::partial$component-endpoint-options.adoc[] -include::partial$component-endpoint-headers.adoc[] -// component options: END - - -Required S3 component options - -You have to provide the amazonS3Client in the -Registry or your accessKey and secretKey to access -the https://aws.amazon.com/s3[Amazon's S3]. - -== Usage - -=== Batch Consumer - -This component implements the Batch Consumer. - -This allows you, for instance, to know how many messages exist in this -batch and for instance, let the Aggregator -aggregate this number of messages. 
- -=== S3 Producer operations - -Camel-AWS2-S3 component provides the following operation on the producer side: - -- copyObject -- deleteObject -- deleteObjects -- listBuckets -- deleteBucket -- listObjects -- getObject (this will return an S3Object instance) -- getObjectRange (this will return an S3Object instance) -- createDownloadLink -- createUploadLink -- headBucket -- headObject -- restoreObject -- getObjectTagging -- putObjectTagging -- deleteObjectTagging -- getObjectAcl -- putObjectAcl -- createBucket -- getBucketTagging -- putBucketTagging -- deleteBucketTagging -- getBucketVersioning -- putBucketVersioning -- getBucketPolicy -- putBucketPolicy -- deleteBucketPolicy - -If you don't specify an operation, explicitly the producer will do: - -- a single file upload -- a multipart upload if multiPartUpload option is enabled - -== Examples - -For example, to read file `hello.txt` from bucket `helloBucket`, use the following snippet: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://helloBucket?accessKey=yourAccessKey&secretKey=yourSecretKey&prefix=hello.txt") - .to("file:/var/downloaded"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://helloBucket?accessKey=yourAccessKey&secretKey=yourSecretKey&prefix=hello.txt"/> - <to uri="file:/var/downloaded"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://helloBucket - parameters: - accessKey: yourAccessKey - secretKey: yourSecretKey - prefix: hello.txt - steps: - - to: - uri: file:/var/downloaded ----- -==== - -=== Advanced AmazonS3 configuration - -If your Camel Application is running behind a firewall or if you need to -have more control over the `S3Client` instance configuration, you can -create your own instance and refer to it in your Camel aws2-s3 component configuration: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://MyBucket?amazonS3Client=#client&delay=5000&maxMessagesPerPoll=5") - .to("mock:result"); ----- - -XML:: -+ 
-[source,xml] ----- -<route> - <from uri="aws2-s3://MyBucket?amazonS3Client=#client&delay=5000&maxMessagesPerPoll=5"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://MyBucket - parameters: - amazonS3Client: "#client" - delay: 5000 - maxMessagesPerPoll: 5 - steps: - - to: - uri: mock:result ----- -==== - -=== Use KMS with the S3 component - -To use AWS KMS to encrypt/decrypt data by using AWS infrastructure, you can use the options introduced in 2.21.x like in the following example - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("file:tmp/test?fileName=test.txt") - .setHeader("CamelAwsS3Key", constant("testFile")) - .to("aws2-s3://mybucket?amazonS3Client=#client&useAwsKMS=true&awsKMSKeyId=3f0637ad-296a-3dfe-a796-e60654fb128c"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="file:tmp/test?fileName=test.txt"/> - <setHeader name="CamelAwsS3Key"> - <constant>testFile</constant> - </setHeader> - <to uri="aws2-s3://mybucket?amazonS3Client=#client&useAwsKMS=true&awsKMSKeyId=3f0637ad-296a-3dfe-a796-e60654fb128c"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: file:tmp/test - parameters: - fileName: test.txt - steps: - - setHeader: - name: CamelAwsS3Key - constant: testFile - - to: - uri: aws2-s3://mybucket - parameters: - amazonS3Client: "#client" - useAwsKMS: true - awsKMSKeyId: "3f0637ad-296a-3dfe-a796-e60654fb128c" ----- -==== - -TIP: The Java example uses the string value `"CamelAwsS3Key"` directly. You can also use the Java constant `AWS2S3Constants.KEY`. - -In this way, you'll ask S3 to use the KMS key 3f0637ad-296a-3dfe-a796-e60654fb128c, to encrypt the file test.txt. -When you ask to download this file, the decryption will be done directly before the download. 
- -=== Static credentials, Default Credential Provider and Profile Credentials Provider - -You have the possibility of avoiding the usage of explicit static credentials by specifying the useDefaultCredentialsProvider option and set it to true. - -The order of evaluation for Default Credentials Provider is the following: - - - Java system properties - `aws.accessKeyId` and `aws.secretKey`. - - Environment variables - `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`. - - Web Identity Token from AWS STS. - - The shared credentials and config files. - - Amazon ECS container credentials - loaded from the Amazon ECS if the environment variable `AWS_CONTAINER_CREDENTIALS_RELATIVE_URI` is set. - - Amazon EC2 Instance profile credentials. - -You have also the possibility of using Profile Credentials Provider, by specifying the useProfileCredentialsProvider option to true and profileCredentialsName to the profile name. - -Only one of static, default and profile credentials could be used at the same time. - -For more information about this you can look at https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/credentials.html[AWS credentials documentation] - -=== S3 Producer Operation examples +== S3 Producer Operation examples - Single Upload: This operation will upload a file to S3 based on the body content @@ -1696,7 +1445,7 @@ YAML:: This operation will delete the bucket policy from _mycamelbucket_. -=== AWS S3 Producer minimum permissions +== AWS S3 Producer minimum permissions For making the producer work, you'll need at least PutObject and ListBuckets permissions. The following policy will be enough: @@ -1745,7 +1494,7 @@ A variation to the minimum permissions is related to the usage of Bucket autocre } ---- -=== AWS S3 Consumer minimum permissions +== AWS S3 Consumer minimum permissions For making the producer work, you'll need at least GetObject, ListBuckets and DeleteObject permissions. 
The following policy will be enough: @@ -1774,741 +1523,3 @@ For making the producer work, you'll need at least GetObject, ListBuckets and De ---- By Default the consumer will use the deleteAfterRead option, this means the object will be deleted once consumed, this is why the DeleteObject permission is required. - -=== Streaming Upload mode - -With the stream mode enabled, users will be able to upload data to S3 without knowing ahead of time the dimension of the data, by leveraging multipart upload. -The upload will be completed when the batchSize has been completed or the batchMessageNumber has been reached. -There are two possible naming strategies: progressive and random. -With the progressive strategy, each file will have the name composed by keyName option and a progressive counter, and eventually the file extension (if any), while with the random strategy a UUID will be added after keyName and eventually the file extension will be appended. - -Additionally, streaming upload mode supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows based on their timestamps. - -As an example: - -._Java-only: Endpoint DSL builder style_ - -[source,java] ----- -from(kafka("topic1").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) - .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); - -from(kafka("topic2").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.random) - .keyName("{{kafkaTopic2}}/{{kafkaTopic2}}.txt")); ----- - -The default size for a batch is 1 Mb, but you can adjust it according to your requirements. 
- -When you stop your producer route, the producer will take care of flushing the remaining buffered message and complete the upload. - -In Streaming upload, you'll be able to restart the producer from the point where it left. It's important to note that this feature is critical only when using the progressive naming strategy. - -By setting the restartingPolicy to lastPart, you will restart uploading files and contents from the last part number the producer left. - -As example: - - Start the route with progressive naming strategy and keyname equals to camel.txt, with batchMessageNumber equals to 20, and restartingPolicy equals to lastPart - - Send 70 messages. - - Stop the route - - On your S3 bucket you should now see four files: camel.txt, camel-1.txt, camel-2.txt and camel-3.txt, the first three will have 20 messages, while the last one is only 10. - - Restart the route - - Send 25 messages - - Stop the route - - You'll now have two other files in your bucket: camel-5.txt and camel-6.txt, the first with 20 messages and the second with 5 messages. - - Go ahead - -This won't be needed when using the random naming strategy. - -On the opposite, you can specify the override restartingPolicy. In that case, you'll be able to override whatever you written before (for that particular keyName) in your bucket. - -[NOTE] -==== -In Streaming upload mode, the only keyName option that will be taken into account is the endpoint option. Using the header will throw an NPE and this is done by design. -Setting the header means potentially change the file name on each exchange, and this is against the aim of the streaming upload producer. The keyName needs to be fixed and static. -The selected naming strategy will do the rest of the work. -==== - -Another possibility is specifying a streamingUploadTimeout with batchMessageNumber and batchSize options. With this option, the user will be able to complete the upload of a file after a certain time passed. 
-In this way, the upload completion will be passed on three tiers: the timeout, the number of messages and the batch size. - -As an example: - -._Java-only: Endpoint DSL builder style_ - -[source,java] ----- -from(kafka("topic1").brokers("localhost:9092")) - .log("Kafka Message is: ${body}") - .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) - .streamingUploadTimeout(10000) - .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) - .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); ----- - -In this case, the upload will be completed after 10 seconds. - -==== Timestamp Grouping - -The streaming upload mode also supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows and written to the same S3 file based on their timestamps. This feature enables append-like behavior where messages with timestamps falling within the same time window are combined into a single file. - -To enable timestamp grouping, use the following configuration options: - -- `timestampGroupingEnabled`: Set to `true` to enable timestamp-based grouping (default: `false`) -- `timestampWindowSizeMillis`: The size of the time window in milliseconds (default: `300000` = 5 minutes) -- `timestampHeaderName`: The name of the message header containing the timestamp (default: `Exchange.MESSAGE_TIMESTAMP`) - -Messages are grouped based on their timestamps extracted from the specified header. The timestamp can be provided as: - -- Long: Unix timestamp in milliseconds -- Date: Java Date object -- String: String representation of Unix timestamp in milliseconds - -Files are automatically named using a timestamp-based pattern that includes the time window information. 
- -===== Example Configuration - -Basic timestamp grouping with 5-minute windows: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("timer:messages?period=10000") - .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) - .setBody(constant("Message with timestamp")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="timer:messages?period=10000"/> - <setHeader name="CamelMessageTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> - <constant>Message with timestamp</constant> - </setBody> - <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: timer:messages - parameters: - period: 10000 - steps: - - setHeader: - name: CamelMessageTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: Message with timestamp - - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true - timestampWindowSizeMillis: 300000 - keyName: grouped-messages.txt ----- -==== - -Custom window size (1 minute) with custom header name: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:timestamped") - .setHeader("MyTimestamp", simple("${date:now:timestamp}")) - .setBody(constant("Custom timestamped message")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:timestamped"/> - <setHeader name="MyTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> - <constant>Custom timestamped message</constant> - </setBody> - <to 
uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:timestamped - steps: - - setHeader: - name: MyTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: Custom timestamped message - - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true - timestampWindowSizeMillis: 60000 - timestampHeaderName: MyTimestamp - keyName: custom-grouped.txt ----- -==== - -Large files with multipart and timestamp grouping: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("direct:large-timestamped") - .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) - .setBody(constant("Large message content...")) - .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="direct:large-timestamped"/> - <setHeader name="CamelMessageTimestamp"> - <simple>${date:now:timestamp}</simple> - </setHeader> - <setBody> - <constant>Large message content...</constant> - </setBody> - <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: direct:large-timestamped - steps: - - setHeader: - name: CamelMessageTimestamp - simple: "${date:now:timestamp}" - - setBody: - constant: "Large message content..." 
- - to: - uri: aws2-s3://my-bucket - parameters: - streamingUploadMode: true - timestampGroupingEnabled: true - timestampWindowSizeMillis: 1800000 - multiPartUpload: true - partSize: 5242880 - keyName: large-grouped.txt ----- -==== - -===== File Naming - -Files are automatically named using the following pattern: -``` -{baseFileName}_{YYYYMMDD}_{HHMM}_{HHMM-HHMM}.{extension} -``` - -For time windows smaller than 1 minute, seconds precision is used: -``` -{baseFileName}_{YYYYMMDD}_{HHMMSS}_{HHMMSS-HHMMSS}.{extension} -``` - -Example file names: -- 5-minute window: `messages_20240101_0800_0800-0805.txt` -- 1-minute window: `data_20240315_1430_1430-1431.log` -- 5-second window: `events_20241225_000000_000000-000005.json` - -===== Time Window Examples - -With a 5-minute window (300000ms): -- Window 1: 08:00:00 - 08:04:59 → `file_20240101_0800_0800-0805.txt` -- Window 2: 08:05:00 - 08:09:59 → `file_20240101_0805_0805-0810.txt` -- Window 3: 08:10:00 - 08:14:59 → `file_20240101_0810_0810-0815.txt` - -===== Fallback Behavior - -- If no timestamp header is found, the current system time is used -- If the timestamp header contains invalid data, the current system time is used -- Warning messages are logged when fallback behavior occurs - -===== Performance Considerations - -- Multiple concurrent time windows are supported -- Each window maintains its own upload state -- Memory usage scales with the number of active time windows -- Completed windows are automatically cleaned up -- Works with all existing streaming upload features (multipart uploads, timeouts, etc.) - -=== Bucket Auto-creation - -With the option `autoCreateBucket` users are able to avoid the auto-creation of an S3 Bucket in case it doesn't exist. The default for this option is `false`. -If set to false, any operation on a not-existent bucket in AWS won't be successful and an error will be returned. 
- -=== Moving stuff between a bucket and another bucket - -Some users like to consume stuff from a bucket and move the content in a different one without using the copyObject feature of this component. -If this is case for you, remember to remove the bucketName header from the incoming exchange of the consumer, otherwise the file will always be overwritten on the same -original bucket. - -=== MoveAfterRead consumer option - -In addition to deleteAfterRead, it has been added another option, moveAfterRead. With this option enabled, the consumed object will be moved to a target destinationBucket instead of being only deleted. -This will require specifying the destinationBucket option. As example: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - steps: - - to: - uri: mock:result ----- -==== - -In this case, the objects consumed will be moved to _myothercamelbucket_ bucket and deleted from the original one (because of deleteAfterRead set to true as default). - -You have also the possibility of using a key prefix/suffix while moving the file to a different bucket. -The options are `destinationBucketPrefix` and `destinationBucketSuffix`. - -Both options support the xref:languages:simple-language.adoc[Simple] expression language. 
-Wrap an expression in `RAW()` to prevent the Camel URI parser from interpreting special characters: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(pre-)&destinationBucketSuffix=RAW(-suff)"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - destinationBucketPrefix: "RAW(pre-)" - destinationBucketSuffix: "RAW(-suff)" - steps: - - to: - uri: mock:result ----- -==== - -In this case, an object named `test` is moved to `myothercamelbucket` with the key `pre-test-suff`. - -Using a Simple expression, you can build dynamic paths at runtime. 
-The following example organises moved objects by date: - -[tabs] -==== -Java:: -+ -[source,java] ----- -from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&moveAfterRead=true&destinationBucket=myothercamelbucket&destinationBucketPrefix=RAW(${date:now:yyyy/MM/dd}/)"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - moveAfterRead: true - destinationBucket: myothercamelbucket - destinationBucketPrefix: "RAW(${date:now:yyyy/MM/dd}/)" - steps: - - to: - uri: mock:result ----- -==== - -An object named `report.csv` consumed on 2026-05-19 is moved with the key `2026/05/19/report.csv`. -Expressions are evaluated once per exchange, so each message can produce a different destination key. - -=== Additional Consumer Examples - -=== Consumer with prefix filtering - -You can configure the consumer to only process objects with a specific prefix: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&prefix=processed/&delay=30000"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - prefix: processed/ - delay: 30000 - steps: - - to: - uri: mock:result ----- -==== - -This will only consume objects that start with "processed/" prefix from the _mycamelbucket_ bucket, with a 30-second polling delay. 
- -=== Consumer with custom polling and batch settings - -Configure custom polling intervals and batch sizes: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&delay=60000&maxMessagesPerPoll=5&includeBody=false"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - delay: 60000 - maxMessagesPerPoll: 5 - includeBody: false - steps: - - to: - uri: mock:result ----- -==== - -This consumer polls every 60 seconds, processes up to 5 objects per poll, and doesn't include the object body in the message (only metadata). - -=== Consumer with file filtering and no deletion - -Configure the consumer to not delete files after reading and include specific file patterns: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&deleteAfterRead=false&fileName=*.pdf"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - deleteAfterRead: false - fileName: "*.pdf" - steps: - - to: - uri: mock:result ----- -==== - -This consumer will read PDF files but won't delete them after processing. 
- -=== Consumer with done file pattern - -Use a done file pattern to ensure files are completely uploaded before processing: - -[tabs] -==== -Java:: -+ -[source,java] ----- - from("aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done") - .to("mock:result"); ----- - -XML:: -+ -[source,xml] ----- -<route> - <from uri="aws2-s3://mycamelbucket?amazonS3Client=#amazonS3Client&doneFileName=*.done"/> - <to uri="mock:result"/> -</route> ----- - -YAML:: -+ -[source,yaml] ----- -- route: - from: - uri: aws2-s3://mycamelbucket - parameters: - amazonS3Client: "#amazonS3Client" - doneFileName: "*.done" - steps: - - to: - uri: mock:result ----- -==== - -This consumer will only process files when a corresponding .done file exists in the bucket. - -=== Using the customer key as encryption - -We introduced also the customer key support (an alternative of using KMS). The following code shows an example. - -._Java-only: programmatic customer key encryption setup_ - -[source,java] ----- -String key = UUID.randomUUID().toString(); -byte[] secretKey = generateSecretKey(); -String b64Key = Base64.getEncoder().encodeToString(secretKey); -String b64KeyMd5 = Md5Utils.md5AsBase64(secretKey); - -String awsEndpoint = "aws2-s3://mycamel?autoCreateBucket=false&useCustomerKey=true&customerKeyId=RAW(" + b64Key + ")&customerKeyMD5=RAW(" + b64KeyMd5 + ")&customerAlgorithm=" + AES256.name(); - -from("direct:putObject") - .setHeader("CamelAwsS3Key", constant("test.txt")) - .setBody(constant("Test")) - .to(awsEndpoint); ----- - -=== Using a POJO as body - -Sometimes building an AWS Request can be complex because of multiple options. We introduce the possibility to use a POJO as the body. 
-In AWS S3 there are multiple operations you can submit, as an example for List objects request, you can do something like: - -._Java-only: POJO request requires building AWS SDK request objects_ - -[source,java] ----- -from("direct:aws2-s3") - .setBody(ListObjectsV2Request.builder().bucket(bucketName).build()) - .to("aws2-s3://test?amazonS3Client=#amazonS3Client&operation=listObjects&pojoRequest=true"); ----- - -In this way, you'll pass the request directly without the need of passing headers and options specifically related to this operation. - -=== Create S3 client and add component to registry - -Sometimes you would want to perform some advanced configuration using AWS2S3Configuration, which also allows to set the S3 client. -You can create and set the S3 client in the component configuration as shown in the following example - -._Java-only: programmatic S3 client creation_ - -[source,java] ----- -String awsBucketAccessKey = "your_access_key"; -String awsBucketSecretKey = "your_secret_key"; - -S3Client s3Client = S3Client.builder() - .credentialsProvider(StaticCredentialsProvider.create( - AwsBasicCredentials.create(awsBucketAccessKey, awsBucketSecretKey))) - .region(Region.US_EAST_1).build(); - -AWS2S3Configuration configuration = new AWS2S3Configuration(); -configuration.setAmazonS3Client(s3Client); -configuration.setAutoDiscoverClient(true); -configuration.setBucketName("s3bucket2020"); -configuration.setRegion("us-east-1"); ----- - -Now you can configure the S3 component (using the configuration object created above) and add it to the registry in the -configure method before initialization of routes. 
- -._Java-only: programmatic component registration_ - -[source,java] ----- -AWS2S3Component s3Component = new AWS2S3Component(getContext()); -s3Component.setConfiguration(configuration); -s3Component.setLazyStartProducer(true); -camelContext.addComponent("aws2-s3", s3Component); ----- - -Now your component will be used for all the operations implemented in camel routes. - -=== Note about using this component for storing and retrieving objects from/to Dell ECS (Elastic Cloud Solutions) - -For storing and retrieving objects from/to Dell ECS, both of the `forcePathStyle` and `overrideEndpoint` options need to be set to `true` and using the `uriEndpointOverride` you need to provide your own ECS endpoint. - -._Java-only: programmatic Dell ECS configuration_ - -[source,java] ----- -String awsBucketAccessKey = "your_access_key"; -String awsBucketSecretKey = "your_secret_key"; - -S3Client s3Client = S3Client.builder() - .credentialsProvider(StaticCredentialsProvider.create( - AwsBasicCredentials.create(awsBucketAccessKey, awsBucketSecretKey))) - .region(Region.US_EAST_1).build(); - -AWS2S3Configuration configuration = new AWS2S3Configuration(); -configuration.setForcePathStyle(true); -configuration.setOverrideEndpoint(true); -configuration.setUriEndpointOverride("http://ecs1.emc.com:9020"); ----- - -For more details see https://www.dell.com/support/manuals/en-us/ecs-appliance-/ecs_pub_data_access_guide_3_3_to_3_6/using-the-java-amazon-sdk?guid=guid-149be134-6938-4cd7-9503-cf3ca9d23261&lang=en-us[their documentation]. - -== Dependencies - -Maven users will need to add the following dependency to their pom.xml. - -*pom.xml* - -[source,xml] ----- -<dependency> - <groupId>org.apache.camel</groupId> - <artifactId>camel-aws2-s3</artifactId> - <version>${camel-version}</version> -</dependency> ----- - -where `$\{camel-version}` must be replaced by the actual version of Camel. 
- - diff --git a/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-streaming.adoc b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-streaming.adoc new file mode 100644 index 000000000000..1c9258dc10cb --- /dev/null +++ b/components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-streaming.adoc @@ -0,0 +1,303 @@ += AWS S3 - Streaming Upload +:tabs-sync-option: + +xref:ROOT:aws2-s3-component.adoc[Back to AWS S3 Component] + +== Streaming Upload mode + +With the stream mode enabled, users will be able to upload data to S3 without knowing ahead of time the dimension of the data, by leveraging multipart upload. +The upload will be completed when the batchSize has been completed or the batchMessageNumber has been reached. +There are two possible naming strategies: progressive and random. +With the progressive strategy, each file will have the name composed by keyName option and a progressive counter, and eventually the file extension (if any), while with the random strategy a UUID will be added after keyName and eventually the file extension will be appended. + +Additionally, streaming upload mode supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows based on their timestamps. 
+ +As an example: + +._Java-only: Endpoint DSL builder style_ + +[source,java] +---- +from(kafka("topic1").brokers("localhost:9092")) + .log("Kafka Message is: ${body}") + .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) + .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) + .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); + +from(kafka("topic2").brokers("localhost:9092")) + .log("Kafka Message is: ${body}") + .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) + .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.random) + .keyName("{{kafkaTopic2}}/{{kafkaTopic2}}.txt")); +---- + +The default size for a batch is 1 Mb, but you can adjust it according to your requirements. + +When you stop your producer route, the producer will take care of flushing the remaining buffered message and complete the upload. + +In Streaming upload, you'll be able to restart the producer from the point where it left. It's important to note that this feature is critical only when using the progressive naming strategy. + +By setting the restartingPolicy to lastPart, you will restart uploading files and contents from the last part number the producer left. + +As example: + - Start the route with progressive naming strategy and keyname equals to camel.txt, with batchMessageNumber equals to 20, and restartingPolicy equals to lastPart + - Send 70 messages. + - Stop the route + - On your S3 bucket you should now see four files: camel.txt, camel-1.txt, camel-2.txt and camel-3.txt, the first three will have 20 messages, while the last one is only 10. + - Restart the route + - Send 25 messages + - Stop the route + - You'll now have two other files in your bucket: camel-5.txt and camel-6.txt, the first with 20 messages and the second with 5 messages. + - Go ahead + +This won't be needed when using the random naming strategy. + +On the opposite, you can specify the override restartingPolicy. 
In that case, you'll be able to override whatever you written before (for that particular keyName) in your bucket. + +[NOTE] +==== +In Streaming upload mode, the only keyName option that will be taken into account is the endpoint option. Using the header will throw an NPE and this is done by design. +Setting the header means potentially change the file name on each exchange, and this is against the aim of the streaming upload producer. The keyName needs to be fixed and static. +The selected naming strategy will do the rest of the work. +==== + +Another possibility is specifying a streamingUploadTimeout with batchMessageNumber and batchSize options. With this option, the user will be able to complete the upload of a file after a certain time passed. +In this way, the upload completion will be passed on three tiers: the timeout, the number of messages and the batch size. + +As an example: + +._Java-only: Endpoint DSL builder style_ + +[source,java] +---- +from(kafka("topic1").brokers("localhost:9092")) + .log("Kafka Message is: ${body}") + .to(aws2S3("camel-bucket").streamingUploadMode(true).batchMessageNumber(25) + .streamingUploadTimeout(10000) + .namingStrategy(AWS2S3EndpointBuilderFactory.AWSS3NamingStrategyEnum.progressive) + .keyName("{{kafkaTopic1}}/{{kafkaTopic1}}.txt")); +---- + +In this case, the upload will be completed after 10 seconds. + +=== Timestamp Grouping + +The streaming upload mode also supports timestamp-based file grouping, which allows messages to be automatically grouped into time windows and written to the same S3 file based on their timestamps. This feature enables append-like behavior where messages with timestamps falling within the same time window are combined into a single file. 
+ +To enable timestamp grouping, use the following configuration options: + +- `timestampGroupingEnabled`: Set to `true` to enable timestamp-based grouping (default: `false`) +- `timestampWindowSizeMillis`: The size of the time window in milliseconds (default: `300000` = 5 minutes) +- `timestampHeaderName`: The name of the message header containing the timestamp (default: `Exchange.MESSAGE_TIMESTAMP`) + +Messages are grouped based on their timestamps extracted from the specified header. The timestamp can be provided as: + +- Long: Unix timestamp in milliseconds +- Date: Java Date object +- String: String representation of Unix timestamp in milliseconds + +Files are automatically named using a timestamp-based pattern that includes the time window information. + +==== Example Configuration + +Basic timestamp grouping with 5-minute windows: + +[tabs] +==== +Java:: ++ +[source,java] +---- +from("timer:messages?period=10000") + .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) + .setBody(constant("Message with timestamp")) + .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="timer:messages?period=10000"/> + <setHeader name="CamelMessageTimestamp"> + <simple>${date:now:timestamp}</simple> + </setHeader> + <setBody> + <constant>Message with timestamp</constant> + </setBody> + <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=300000&keyName=grouped-messages.txt"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: timer:messages + parameters: + period: 10000 + steps: + - setHeader: + name: CamelMessageTimestamp + simple: "${date:now:timestamp}" + - setBody: + constant: Message with timestamp + - to: + uri: aws2-s3://my-bucket + parameters: + streamingUploadMode: true + timestampGroupingEnabled: true + timestampWindowSizeMillis: 
300000 + keyName: grouped-messages.txt +---- +==== + +Custom window size (1 minute) with custom header name: + +[tabs] +==== +Java:: ++ +[source,java] +---- +from("direct:timestamped") + .setHeader("MyTimestamp", simple("${date:now:timestamp}")) + .setBody(constant("Custom timestamped message")) + .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="direct:timestamped"/> + <setHeader name="MyTimestamp"> + <simple>${date:now:timestamp}</simple> + </setHeader> + <setBody> + <constant>Custom timestamped message</constant> + </setBody> + <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=60000&timestampHeaderName=MyTimestamp&keyName=custom-grouped.txt"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: direct:timestamped + steps: + - setHeader: + name: MyTimestamp + simple: "${date:now:timestamp}" + - setBody: + constant: Custom timestamped message + - to: + uri: aws2-s3://my-bucket + parameters: + streamingUploadMode: true + timestampGroupingEnabled: true + timestampWindowSizeMillis: 60000 + timestampHeaderName: MyTimestamp + keyName: custom-grouped.txt +---- +==== + +Large files with multipart and timestamp grouping: + +[tabs] +==== +Java:: ++ +[source,java] +---- +from("direct:large-timestamped") + .setHeader(Exchange.MESSAGE_TIMESTAMP, simple("${date:now:timestamp}")) + .setBody(constant("Large message content...")) + .to("aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"); +---- + +XML:: ++ +[source,xml] +---- +<route> + <from uri="direct:large-timestamped"/> + <setHeader name="CamelMessageTimestamp"> + <simple>${date:now:timestamp}</simple> + </setHeader> + <setBody> + <constant>Large message 
content...</constant> + </setBody> + <to uri="aws2-s3://my-bucket?streamingUploadMode=true&timestampGroupingEnabled=true&timestampWindowSizeMillis=1800000&multiPartUpload=true&partSize=5242880&keyName=large-grouped.txt"/> +</route> +---- + +YAML:: ++ +[source,yaml] +---- +- route: + from: + uri: direct:large-timestamped + steps: + - setHeader: + name: CamelMessageTimestamp + simple: "${date:now:timestamp}" + - setBody: + constant: "Large message content..." + - to: + uri: aws2-s3://my-bucket + parameters: + streamingUploadMode: true + timestampGroupingEnabled: true + timestampWindowSizeMillis: 1800000 + multiPartUpload: true + partSize: 5242880 + keyName: large-grouped.txt +---- +==== + +==== File Naming + +Files are automatically named using the following pattern: +``` +{baseFileName}_{YYYYMMDD}_{HHMM}_{HHMM-HHMM}.{extension} +``` + +For time windows smaller than 1 minute, seconds precision is used: +``` +{baseFileName}_{YYYYMMDD}_{HHMMSS}_{HHMMSS-HHMMSS}.{extension} +``` + +Example file names: +- 5-minute window: `messages_20240101_0800_0800-0805.txt` +- 1-minute window: `data_20240315_1430_1430-1431.log` +- 5-second window: `events_20241225_000000_000000-000005.json` + +==== Time Window Examples + +With a 5-minute window (300000ms): +- Window 1: 08:00:00 - 08:04:59 → `file_20240101_0800_0800-0805.txt` +- Window 2: 08:05:00 - 08:09:59 → `file_20240101_0805_0805-0810.txt` +- Window 3: 08:10:00 - 08:14:59 → `file_20240101_0810_0810-0815.txt` + +==== Fallback Behavior + +- If no timestamp header is found, the current system time is used +- If the timestamp header contains invalid data, the current system time is used +- Warning messages are logged when fallback behavior occurs + +==== Performance Considerations + +- Multiple concurrent time windows are supported +- Each window maintains its own upload state +- Memory usage scales with the number of active time windows +- Completed windows are automatically cleaned up +- Works with all existing streaming upload 
features (multipart uploads, timeouts, etc.) diff --git a/docs/components/modules/others/nav.adoc b/docs/components/modules/others/nav.adoc index e8ea5aefaa88..95d278de07c6 100644 --- a/docs/components/modules/others/nav.adoc +++ b/docs/components/modules/others/nav.adoc @@ -3,6 +3,9 @@ * xref:others:index.adoc[Miscellaneous Components] ** xref:attachments.adoc[Attachments] +** xref:aws2-s3-consumer-examples.adoc[AWS S3 - Consumer Examples] +** xref:aws2-s3-producer-operations.adoc[AWS S3 - Producer Operations] +** xref:aws2-s3-streaming.adoc[AWS S3 - Streaming Upload] *** xref:azure-schema-registry.adoc[Azure Schema Registry] ** xref:camel-yaml-dsl-validator-maven-plugin.adoc[Camel YAML DSL Validator Maven Plugin] ** xref:cli-connector.adoc[CLI Connector] diff --git a/docs/components/modules/others/pages/aws2-s3-consumer-examples.adoc b/docs/components/modules/others/pages/aws2-s3-consumer-examples.adoc new file mode 120000 index 000000000000..e9a449c1c7ca --- /dev/null +++ b/docs/components/modules/others/pages/aws2-s3-consumer-examples.adoc @@ -0,0 +1 @@ +../../../../../components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-consumer-examples.adoc \ No newline at end of file diff --git a/docs/components/modules/others/pages/aws2-s3-producer-operations.adoc b/docs/components/modules/others/pages/aws2-s3-producer-operations.adoc new file mode 120000 index 000000000000..c24b69800957 --- /dev/null +++ b/docs/components/modules/others/pages/aws2-s3-producer-operations.adoc @@ -0,0 +1 @@ +../../../../../components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-producer-operations.adoc \ No newline at end of file diff --git a/docs/components/modules/others/pages/aws2-s3-streaming.adoc b/docs/components/modules/others/pages/aws2-s3-streaming.adoc new file mode 120000 index 000000000000..686c3b7a21f7 --- /dev/null +++ b/docs/components/modules/others/pages/aws2-s3-streaming.adoc @@ -0,0 +1 @@ 
+../../../../../components/camel-aws/camel-aws2-s3/src/main/docs/aws2-s3-streaming.adoc \ No newline at end of file
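
Note on the "File Naming" and "Time Window Examples" sections added to aws2-s3-streaming.adoc in this commit: the mapping from a message timestamp to a windowed file name can be sketched in plain Java as below. This is an illustrative sketch only, not the component's implementation. The class name `TimestampWindowNaming`, the methods `windowStart` and `fileName`, the alignment of windows to the Unix epoch, and the use of UTC are all assumptions made for the example; only the naming pattern and the sample file names come from the documentation itself.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper that mirrors the documented naming pattern; not a Camel class.
public class TimestampWindowNaming {

    // Start of the window containing the timestamp (assumes windows are aligned to the Unix epoch).
    static long windowStart(long timestampMillis, long windowSizeMillis) {
        return Math.floorDiv(timestampMillis, windowSizeMillis) * windowSizeMillis;
    }

    // Builds {baseFileName}_{YYYYMMDD}_{HHMM}_{HHMM-HHMM}.{extension},
    // switching to seconds precision for windows shorter than one minute.
    static String fileName(String keyName, long timestampMillis, long windowSizeMillis) {
        long start = windowStart(timestampMillis, windowSizeMillis);
        long end = start + windowSizeMillis;

        String timePattern = windowSizeMillis < 60_000L ? "HHmmss" : "HHmm";
        // UTC is an assumption; the documentation does not state which time zone is used.
        DateTimeFormatter day = DateTimeFormatter.ofPattern("yyyyMMdd").withZone(ZoneOffset.UTC);
        DateTimeFormatter time = DateTimeFormatter.ofPattern(timePattern).withZone(ZoneOffset.UTC);

        Instant startInstant = Instant.ofEpochMilli(start);
        Instant endInstant = Instant.ofEpochMilli(end);

        // Split the configured key name into base name and extension (extension keeps its dot).
        int dot = keyName.lastIndexOf('.');
        String base = dot < 0 ? keyName : keyName.substring(0, dot);
        String extension = dot < 0 ? "" : keyName.substring(dot);

        return base + "_" + day.format(startInstant)
                + "_" + time.format(startInstant)
                + "_" + time.format(startInstant) + "-" + time.format(endInstant)
                + extension;
    }

    public static void main(String[] args) {
        long ts = Instant.parse("2024-01-01T08:03:27Z").toEpochMilli();
        // A message stamped 08:03:27 falls into the 08:00-08:05 window.
        System.out.println(fileName("messages.txt", ts, 300_000L));
    }
}
```

Under these assumptions, every message whose timestamp falls between 08:00:00 and 08:04:59 resolves to the same file name, which is what lets the producer append them to one upload per window.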
