daniellavoie opened a new pull request #5608:
URL: https://github.com/apache/incubator-pinot/pull/5608


   ## Description
   
   **DISCLAIMER** : This PR looks scarier than it is :)
   
   This change introduces configuration externalization to work with different 
environments. Properties can now be loaded from properties files, YAML files, 
environment variables, and command-line arguments to externalize configuration 
independently.  Properties are now accessible from a `PinotConfiguration` class 
which act has an adapter to Apache Commons  Configuration. This initial phase 
does not change how services are started and how they are configured. A future 
commit will enable properties loaded from `PINOT_X` environment variables. 
   
   ## Motivation
   
   In order to ease cloud native deployments of Pinot, it is a general best 
practice to allow software to be configured with environment variables or CLI 
arguments. Config file based configuration is not considered best practice as 
it requires additional effort to mount volumes and maintain the property files. 
CLI arguments and environment variables are easily referred in an helm chart, a 
docker-compose manifest or simple docker run commands.
   
   ## Features:
   
   ### Configuration overwrite priority:
   
   Constructing a `PinotConfiguration` instance offers the opportunity to 
provide base properties, environment variables. Properties will also be loaded 
from any config file referenced with a `config.paths` property found in the 
base properties or  environment variables. The priority is established as 
follow: `Base properties > Env variables > Config Files`
   
   As explained previously, this initial phase only loads properties from 
existing CLI commands and config files (`Base properties > Config Files`).
   
   ### Relaxed property binding from environment variables
   
   The `PinotConfiguration` class introduces a relaxed property binding 
mechanism to help, as much as possible, to follow operating system strict rules 
around environment variables naming concentions. For example, Linux shell 
variables can contain only letters (a to z or A to Z), numbers (0 to 9) or the 
underscore character (_). By convention, Unix shell variables will also have 
their names in UPPERCASE. To convert a property name in the canonical-form to 
an environment variable name you can follow these rules: 
   
   * Prefix with `PINOT_` to avoid conflict with other services.
   * Replace dots (.) with underscores (_).
   * Remove any dashes (-).
   * Convert to uppercase.
   
   As an example, the `PINOT_CONTROLLER_PORT` would be interpreted as the 
`controller.port` property. 
   
   ## Side effects of the change:
   
   This first phase of the refactoring only introduces the new 
`PinotConfiguration` abstraction. The dependency  
`org.apache.commons.configuration.Configuration` used to be to be referred in a 
big portion of the Pinot codebase. It has  been replaced by the Pinot specific 
`PinotConfiguration` class. This change will simplify the a future upgrade of 
Apache Commons Configuration from 1.x to 2.x that new package namespaces and 
breaking changes.
   
   ## Breaking change and upgrade notes
   
   No impact or configuration changes on standard deployments. With Apache 
Commons Configuration, duplicate properties may be defined and will be made 
available as string array values. Prior to this change, retrieving a property 
with `getString` would return the first occurrences of the duplicate property. 
Using `getStringArray` would return all occurrences. With this new change, 
duplicate properties defined from multiple sources (eg: env variables and 
properties files) will be overwritten using the overwrite priority described 
earlier. Array properties now needs to be defined as a comma separated string 
from a single properties. Duplicate properties from difference sources will not 
be merged to an array property. Only the property highest priority source will 
be available from `PinotConfiguration.getProperty`.
   
   ## Notes for reviewers:
   
   The change impacts a large portion of the codebase. While it can be scary, 
the changes represents mostly a replacement of 
`org.apache.commons.configuration.Configuration` to 
`org.apache.pinot.spi.env.PinotConfiguration`. The following classes should be 
the main area of attention for the review:
   * `org.apache.pinot.spi.env.PinotConfiguration`
   * `org.apache.pinot.controller.ControllerConf`
   * `org.apache.pinot.tools.utils.PinotConfigUtils`
   * `org.apache.pinot.spi.env.CommonsConfigurationUtils`
   * `org.apache.pinot.spi.env.Environment`
   * `org.apache.pinot.spi.env.SystemEnvironment`
   
   ## Release Notes
   
   No release note required since these changes are not user facing yet.
   
   ## Documentation
   
   No documentation change required since these changes are not user facing 
yet. Future phase of this refactoring will enable environment variables 
properties and include documentation changes to expose the relaxed binding 
mechanism.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org
For additional commands, e-mail: commits-h...@pinot.apache.org

Reply via email to