Re: A proposal for refactoring the CircleCI config

2022-11-02 Thread David Capwell
Thanks for starting this thread!

If I am reading correctly, there are 2 main improvements proposed: move the
parallel value to a param and have our “patches” update those params and not
the jobs themselves; and look into / use matrix to define the set of jobs we
need…

For the parallel param logic, sounds fine to me.  Not sure if this would also
work for resource_type, but I still argue that xlarge isn’t needed in 90% of
the cases it’s used… so fixing this may be better than a param there… So yes, I
would be cool with this change if it basically removes the patching logic… I
had another JIRA to have a python script rewrite the YAML, but this method may
solve it in a cleaner way.

About matrix jobs: I don’t know them in circle but have used them in other
places, and this sounds good to me.  I would also extend this to argue that JVM
is just 1 config axis and we sadly have many more:

JVM: [8, 11, 17]
VNODE: [true, false]
CDC: [true, false]
COMPRESSION: [true, false]
MEMTABLE: [skiplist, shardedskiplist, trie]

So, if we can express that with a matrix, sounds good to me.
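
As a rough sketch of how those axes might look as a CircleCI matrix (the job name `utests`, the parameter names, and the image are placeholders, not taken from the existing config), a full cross product of the list above would be 3 × 2 × 2 × 2 × 3 = 72 jobs:

```yaml
version: 2.1

jobs:
  utests:                       # hypothetical job name
    parameters:
      jvm:
        type: enum
        enum: ["8", "11", "17"]
      vnode:
        type: boolean
      cdc:
        type: boolean
      compression:
        type: boolean
      memtable:
        type: enum
        enum: ["skiplist", "shardedskiplist", "trie"]
    docker:
      - image: cimg/base:stable # placeholder image
    steps:
      # Placeholder step; a real job would select the JDK and test config here.
      - run: echo "jvm=<< parameters.jvm >> memtable=<< parameters.memtable >>"

workflows:
  matrix_sketch:
    jobs:
      - utests:
          matrix:
            parameters:         # one job per combination of these values
              jvm: ["8", "11", "17"]
              vnode: [true, false]
              cdc: [true, false]
              compression: [true, false]
              memtable: ["skiplist", "shardedskiplist", "trie"]
```

This assumes every combination is something we actually want to run; in practice we would probably pick a subset of axes per test suite rather than the full product.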

One point about JVM support that is subtle and skipped by many: we currently 
also test compiling with the smallest version and running with the larger 
version… this could also be a matrix job, but would just be a matrix over 11/17 
with 8 removed from that list.
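
A minimal sketch of that compile-low/run-high case, assuming hypothetical `build` and `run_tests` jobs (neither name comes from the current config), could keep the build fixed on 8 and put only the runtime JVM in the matrix:

```yaml
version: 2.1

jobs:
  build:                        # hypothetical: compile once with JDK 8
    docker:
      - image: cimg/base:stable # placeholder image
    steps:
      - run: echo "compile with JDK 8"
  run_tests:                    # hypothetical: run the JDK 8 build on a newer JVM
    parameters:
      run_jvm:
        type: enum
        enum: ["11", "17"]      # 8 is left out of this list on purpose
    docker:
      - image: cimg/base:stable
    steps:
      - run: echo "run JDK 8 build on JDK << parameters.run_jvm >>"

workflows:
  cross_jvm_sketch:
    jobs:
      - build
      - run_tests:
          requires: [build]
          matrix:
            parameters:
              run_jvm: ["11", "17"]
```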

> On Oct 28, 2022, at 2:25 PM, Derek Chen-Becker wrote:
> 
> While I've been working on bringing CircleCI to parity with Jenkins,
> I've made some notes about ways that the whole config generation
> process could be improved. Here are my thoughts. I'm not sure if this
> is worthy of a CEP since it's infra and not a feature.
> 
> Cheers,
> 
> Derek
> 
> Problem Statement
> ═════════════════
> 
>  The CircleCI configuration subdivides various steps in the test plan
>  into jobs that can execute independently. This set of jobs is
>  intended to be run by developers under the free/OSS plan as well as
>  under paid CircleCI plans for developers or organizations that wish to
>  spend money to obtain faster test results by committing more resources
>  (LOWRES, MIDRES, HIGHRES).
> 
>  The allocation of resources is currently driven by a shell script that
>  performs textual modification (e.g. patch and sed) of files before
>  processing to effect changes in various configuration parameters.
>  While this approach works, it does not fully utilize the
>  parameterization features of CircleCI’s configuration processor to
>  reduce complexity when adding new tests or making changes to the
>  system, imposing additional burden on developers modifying the CI
>  configuration.
> 
>  This proposal details an initial goal for reducing CircleCI
>  configuration complexity, and provides a high level overview of
>  subsequent goals to be investigated.
> 
> 
> Goal 1: Eliminate patch files
> ═════════════════════════════
> 
>  Patch files are targeted as the first goal for this proposal because
>  there is a significant reduction in configuration complexity for a
>  relatively modest effort. Patch files themselves are brittle; the
>  patch tool can accommodate some changes between the original target
>  file and the current state, but cannot always unambiguously apply
>  changes. When the CircleCI configuration is changed, the patch files
>  also need to be changed or regenerated to match line numbers and any
>  new sections added. This is extra work that does not provide any
>  benefit.
> 
>  The patch files currently apply changes to three types of
>  configuration:
> 
>  • Heap size parameters (only for the HIGHRES config)
>  • Job resource class
>  • Executor parallelism
> 
>  CircleCI handles this use case via parameterization of the
>  configuration. Interestingly, our CircleCI configuration already takes
>  advantage of parameterization in the definition of the executor:
> 
>  ┌
>  │ java8-executor:
>  │   parameters:
>  │     exec_resource_class:
>  │       type: string
>  └
> 
>  CircleCI additionally allows for parameters to be defined at the top
>  level of the pipeline, which are then accessible anywhere in the
>  pipeline definition (e.g. steps, jobs, etc). These parameters can be
>  overridden by providing a yaml file to the `circleci config process'
>  command.
> 
>  As an example of what a change would entail, consider that the patch
>  files change the parallelism of all repeated dtest executors uniformly.
>  We could introduce a single pipeline parameter for this value:
> 
>  ┌
>  │ parameters:
>  │   repeated_dtest_parallelism:
>  │     type: integer
>  │     default: 4
>  └
> 
>  And then update the configuration of the executors to use the
>  parameter:
> 
>  ┌
>  │ j8_repeated_utest_executor: &j8_repeated_utest_executor
>  │   executor:
>  │     name: java8-executor
>  │   parallelism: << pipeline.parameters.repeated_dtest_parallelism >>
>  │
>  │ j8_repeated_dtest_executor: &j8_repeated_dtest_executor
>  │   executor:
>  │     name: java8-executor
>  │   parallelism: << pipeline.parameter

Re: A proposal for refactoring the CircleCI config

2022-11-02 Thread Derek Chen-Becker
> For the parallel param logic, sounds fine to me.  Not sure if this would also 
> work for resource_type, but I still argue that xlarge isn’t needed in 90% of 
> the cases it’s used… so fixing this may be better than param there… So yes, I 
> would be cool with this change if it basically removes the patching logic… I 
> had another JIRA to have a python script rewrite the YAML, but this method 
> may solve in a cleaner way.

Almost any part of a CircleCI definition can be replaced with a
parameter, so basically we want config-2_1.yml to be a template, and
we plug different values in as desired. Would you mind sending a link
to that JIRA so I can understand that use case?
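
To make the "template plus values" idea concrete, here is a sketch of what one per-resource-level values file might look like. The file name and the value are hypothetical; only the `repeated_dtest_parallelism` parameter name comes from the proposal:

```yaml
# highres-params.yml (hypothetical file name): values for one resource level.
# Each key must match a parameter declared under the top-level `parameters:`
# section of config-2_1.yml; the value 25 is illustrative only.
repeated_dtest_parallelism: 25

# Assumed invocation, using the CLI's --pipeline-parameters option:
#   circleci config process config-2_1.yml \
#     --pipeline-parameters highres-params.yml > config.yml
```

With one small file like this per resource level (LOWRES, MIDRES, HIGHRES), the template itself would not need to be patched when jobs are added or moved.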

> About matrix jobs; I don’t know them in circle but have used in other places, 
> this sounds good to me.  I would also enhance to argue that JVM is just 1 
> config and we sadly have many more:
>
> JVM: [8, 11, 17]
> VNODE: [true, false]
> CDC: [true, false]
> COMPRESSION: [true, false]
> MEMTABLE: [skiplist, shardedskiplist, trie]

My understanding is that we could parameterize all of these such that
we could use a matrix as long as all combinations are valid. Let me
get parameterization of basic configuration reviewed first, and then
we can take a look at how to matricize things.

Cheers,

Derek

-- 
+---+
| Derek Chen-Becker |
| GPG Key available at https://keybase.io/dchenbecker and   |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---+


Re: A proposal for refactoring the CircleCI config

2022-11-02 Thread David Capwell
Here is the ticket I was talking about 
https://issues.apache.org/jira/browse/CASSANDRA-17600 
 

> On Nov 2, 2022, at 1:29 PM, Derek Chen-Becker wrote:
> 
>> For the parallel param logic, sounds fine to me.  Not sure if this would 
>> also work for resource_type, but I still argue that xlarge isn’t needed in 
>> 90% of the cases it’s used… so fixing this may be better than param there… 
>> So yes, I would be cool with this change if it basically removes the 
>> patching logic… I had another JIRA to have a python script rewrite the YAML, 
>> but this method may solve in a cleaner way.
> 
> Almost any part of a CircleCI definition can be replaced with a
> parameter, so basically we want config-2_1.yml to be a template, and
> we plug different values in as desired. Would you mind sending a link
> to that JIRA so I can understand that use case?
> 
>> About matrix jobs; I don’t know them in circle but have used in other 
>> places, this sounds good to me.  I would also enhance to argue that JVM is 
>> just 1 config and we sadly have many more:
>> 
>> JVM: [8, 11, 17]
>> VNODE: [true, false]
>> CDC: [true, false]
>> COMPRESSION: [true, false]
>> MEMTABLE: [skiplist, shardedskiplist, trie]
> 
> My understanding is that we could parameterize all of these such that
> we could use a matrix as long as all combinations are valid. Let me
> get parameterization of basic configuration reviewed first, and then
> we can take a look at how to matricize things.
> 
> Cheers,
> 
> Derek