Thanks for starting this thread! If I am reading correctly, there are 2 main improvements proposed: move parallel value to a param and have our “patches” update those and not the jobs themselves, look into / use matrix to define the set of jobs we need…
For the parallel param logic, sounds fine to me. Not sure if this would also work for resource_type, but I still argue that xlarge isn’t needed in 90% of the cases its used… so fixing this may be better than param there…. So yes, I would be cool with this change if it basically removes the patching logic… I had another JIRA to have a python script rewrite the YAML, but this method may solve in a cleaner way. About matrix jobs; I don’t know them in circle but have used in other places, this sounds good to me. I would also enhance to argue that JVM is just 1 config and we sadly have many more: JVM: [8, 11, 17] VNODE: [true, false] CDC: [true, false] COMPRESSION: [true, false] MEMTABLE: [skiplist, shardedskiplist, trie] So, if we can express that with matrix; sounds good to me. One point about JVM support that is subtle and skipped by many: we currently also test compiling with the smallest version and running with the larger version… this could also be a matrix job, but would just be a matrix over 11/17 with 8 removed from that list. > On Oct 28, 2022, at 2:25 PM, Derek Chen-Becker <de...@chen-becker.org> wrote: > > While I've been working on bringing CircleCI to parity with Jenkins, > I've made some notes about ways that the whole config generation > process could be improved. Here are my thoughts. I'm not sure if this > is worthy of a CEP since it's infra and not a feature. > > Cheers, > > Derek > > Problem Statement > ═════════════════ > > The CircleCI configuration subdivides various steps in the test plan > into jobs that can execute independently. This set of jobs are > intended to be run by developers under the free/OSS plan as well as > under paid CircleCI plans for developers or organizations that wish to > spend money to obtain faster test results by committing more resources > (LOWRES, MIDRES, HIGHRES). > > The allocation of resources is currently driven by a shell script that > performs textual modification (e.g. patch and sed) of files before > processing to effect changes in various configuration parameters. > While this approach works, it does not fully utilize the > parameterization features of CircleCI’s configuration processor to > reduce complexity when adding new tests or making changes to the > system, imposing additional burden on developers modifying the CI > configuration. > > This proposal details an initial goal for reducing CircleCI > configuration complexity, and provides a high level overview of > subsequent goals to be investigated. > > > Goal 1: Eliminate patch files > ═════════════════════════════ > > Patch files are targeted as the first goal for this proposal because > there is a significant reduction in configuration complexity for a > relatively modest effort. Patch files themselves are brittle; the > patch tool can accommodate some changes between the original target > file and the current state, but cannot also unambiguously apply > changes. When the CircleCI configuration is changed, the patch files > also need to be changed or regenerated to match line numbers and any > new sections added. This is extra work that does not provide any > benefit. > > The patch files currently apply changes to three types of > configuration: > > • Heap size parameters (only for the HIGHRES config) > • Job resource class > • Executor parallelism > > CircleCI handles this use case via parameterization of the > configuration. Interestingly, our CircleCI configuration already takes > advantage of parameterization in the definition of the executor: > > ┌──── > │ java8-executor: > │ parameters: > │ exec_resource_class: > │ type: string > └──── > > CircleCI additionally allows for parameters to be defined at the top > level of the pipeline, which are then accessible anywhere in the > pipeline definition (e.g. steps, jobs, etc). These parameters can be > overridden by providing a yaml file to the `circleci config process' > command. > > As an example of what a change would entail, consider that the patch > files change the parallelism of all repated dtest executors uniformly. > We could introduce a single pipeline parameter for this value: > > ┌──── > │ parameters: > │ repeated_dtest_parallelism: > │ type: integer > │ default: 4 > └──── > > And then update the configuration of the executors to use the > parameter: > > ┌──── > │ j8_repeated_utest_executor: &j8_repeated_utest_executor > │ executor: > │ name: java8-executor > │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> > │ > │ j8_repeated_dtest_executor: &j8_repeated_dtest_executor > │ executor: > │ name: java8-executor > │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> > │ > │ j8_repeated_upgrade_dtest_executor: &j8_repeated_upgrade_dtest_executor > │ executor: > │ name: java8-executor > │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> > │ > │ j8_repeated_jvm_upgrade_dtest_executor: > &j8_repeated_jvm_upgrade_dtest_executor > │ executor: > │ name: java8-executor > │ parallelism: << pipeline.parameters.repeated_dtest_parallelism >> > └──── > > Then we create a MIDRES-specific override file containing the new > value: > > ┌──── > │ repeated_dtest_parallelism: 25 > └──── > > and a HIGHRES-specific override file: > > ┌──── > │ repeated_dtest_parallelism: 100 > └──── > > And then execute the config processor with the proper override: > > ┌──── > │ circleci config process --pipeline-parameters MIDRES-params.yml > config-2_1.yml > │ # or > │ circleci config process --pipeline-parameters HIGHRES-params.yml > config-2_1.yml > └──── > > If a new job is added that does not use any parameters, only the > `config-2_1.yml' file is modified. If new parameters are introduced, > then the override files also need updated. One option (although not > recommended) to this approach to help catch missed values for MID and > HIGH profiles would be to preclude the use of default values and > provide a LOWRES-params.yml file with the default values. > > One caveat is that there was a bug in the CircleCI CLI prior to > version 0.1.22322 that resulted in the override file being ignored. > One possible workaround, if we do not want to require a minimum > circleci CLI version, would be to define the parameters in their own > file and concatenate the level-specific definition file with the > config-2_1.yml file prior to processing. > > Either approach (yaml file override or concatenation) provides benefit > since there would be no need to modify the level-specific file as the > main configuration is changed, unlike with patch files. > > The changes needed to implement this are relatively small, and are > easy to test, since a change to use pipeline parameters (along with > requisite changes to `generate.sh') should not result in any changes > to the config.yml files. > > > Other benefits for future investigation > ═══════════════════════════════════════ > > Beyond simple replacement of the patch files, there are some > additional capabilities in CircleCI configuration that may benefit the > project long-term. One of the most interesting is the concept of > matrix jobs. Matrix jobs allow you to parameterize a job over multiple > values. In our case, this could potentially allow us to parameterize > jobs over different version of the JDK. This would reduce the overall > size of the configuration, because the current approach duplicates > jobs for each Java version. It would also make it simpler to add/test > new Java versions by making a relatively small number of changes to > the matrix instead of creating a whole set of version-specific jobs. > We would still need to create version-specific executors, but if we > move things like parallelism and resource class out of the executor > and into the job definition, we can parameterize in the workflow and > reduce the number of executors to one per Java version. For example: > > ┌──── > │ jobs: > │ j8_unit_tests: > │ parameters: > │ executor: > │ type: executor > │ executor: << parameters.executor >> > │ parallelism: << pipeline.parameters.unit_test_parallelism >> > │ resource_class: << pipeline.parameters.unit_test_resclass >> > │ steps: > │ - attach_workspace: > │ at: /home/cassandra > │ - create_junit_containers > │ - log_environment > │ - run_parallel_junit_tests > │ > │ workflows: > │ pre-commit_tests: > │ jobs: > │ - j8_unit_tests: > │ requires: > │ - build > │ matrix: > │ parameters: > │ executor: [java8-executor, java11-executor, java17-executor] > └──── > > -- > +---------------------------------------------------------------+ > | Derek Chen-Becker | > | GPG Key available at https://keybase.io/dchenbecker and | > | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org | > | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7 7F42 AFC5 AFEE 96E4 6ACC | > +---------------------------------------------------------------+