estebanz01 commented on issue #12547:
URL: https://github.com/apache/pinot/issues/12547#issuecomment-2021333062

   OK, after lots of trial and error, this is what I did to have a working 
hybrid table with minion tasks and S3 deep storage:
   
   <details>
   <summary>Controller helm config</summary>
   
   ```yaml
   controller:
     # We make sure that only this configuration is present, as duplicated configs won't override but merge.
     data:
       dir: s3://<bucket-name>/<custom-path>/controller-data
   
     # If we don't specify the host and port, a `Controller_null_9000` controller will be seen by pinot.
     host: pinot-controller
     port: 9000
   
     # Not sure why a `Controller_null_9000` would appear even with `vip` enabled, but oh well!
     vip:
       enable: true
       host: pinot-controller
       port: 9000
   
     # ...other configs
     configs: |-
       pinot.set.instance.id.to.hostname=true
       controller.task.scheduler.enabled=true
       # Super important! Data will be here until it's offloaded to S3.
       controller.local.temp.dir=/var/pinot/controller/data
       pinot.controller.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
       pinot.controller.storage.factory.s3.region=us-east-1
       pinot.controller.segment.fetcher.protocols=file,http,s3
       pinot.controller.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
       pinot.controller.storage.factory.s3.disableAcl=false
   ```
   
   </details>
   
   <details>
   <summary>Minion helm config</summary>
   
   ```yaml
   minion:
     # ... other configs
     extra:
       configs: |-
         pinot.set.instance.id.to.hostname=true
         pinot.minion.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
         pinot.minion.storage.factory.s3.region=us-east-1
         pinot.minion.segment.fetcher.protocols=file,http,s3
         pinot.minion.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
   ```
   
   </details>
   
   Basically, we had to configure the S3 filesystem on the controller, server and minion so the workers can fetch, upload and download data when needed. I'm not sure what this would look like with other deep storage options, but it seems that all three components must have matching storage configuration, if that makes sense.
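   
   The server side is mentioned above but not shown. As a rough sketch only, assuming the server block mirrors the minion one (the `extra.configs` placement and the `us-east-1` region are assumptions carried over from the minion config, not taken from the original setup), it might look like this:
   
   ```yaml
   server:
     # ... other configs
     extra:
       configs: |-
         pinot.set.instance.id.to.hostname=true
         # Assumed to mirror the minion settings, with the `pinot.server` prefix.
         pinot.server.storage.factory.class.s3=org.apache.pinot.plugin.filesystem.S3PinotFS
         pinot.server.storage.factory.s3.region=us-east-1
         pinot.server.segment.fetcher.protocols=file,http,s3
         pinot.server.segment.fetcher.s3.class=org.apache.pinot.common.utils.fetcher.PinotFSSegmentFetcher
   ```
   
   The only intended difference from the minion block is the `pinot.server` prefix, so that all three components point at the same S3 filesystem plugin and segment fetcher.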
   
   Thanks for the help on this!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
