Just to give a bit of context on why I think this is important.

It has never happened before that we had to yank 4 versions of a
provider because of incompatibilities we learned about after the fact.
And it's not anyone's fault - it's just a lesson we should take into
account. And cncf.kubernetes is a very special case that has bitten us
several times in the past because of its tight coupling to the core.
I think if we make any breaking change we should at least think about
how to avoid similar problems in the future. These are not academic
questions - it has already happened (we had similar problems around
the Airflow 2 migration in the early Airflow 2 days), so this is an
indication that this is currently a "property" of the relation between
the provider and the core, and we need to rethink it.

This change introduces what will become a breaking change in the
future, and it might require another breaking change when (or if) we
fix the "real" issue we had and experienced (the one which resulted in
yanking 4 versions of the cncf.kubernetes provider).

My points are:

* we are introducing a change that will (eventually) turn into a
breaking change (and, as you mentioned, a potentially disruptive one),
* which will likely require another breaking change in how KPO is defined,
* which means that our users will likely have to go through
incompatibility pains more than once.

I am not against this change - I am just asking for a bit more forward
thinking (and I am happy to brainstorm in some doc? AIP? Something more
substantial and suitable than just an email thread).

I have some questions (and my current answers are below):

1) Are we sure we want to do it without even attempting to define (not
implement) what the (maybe imagined) target setup will look like?

My answer: I think not, because it might lead to confusion among our
users. K8S is important for many of our users, and any required change
will be amplified by the number of people having to make it, so we
should limit the number of times they have to do it.

2) Are we sure (or is it at least very likely) that the change we
introduce now will also hold when we solve the real problem?

I am not sure. I do not know the answers to questions 3) and 4) well
enough to be sure this change is going to hold.

3) Are we going to have some fixed version relationship between
Airflow and cncf.kubernetes? (Like we have now: Airflow 2.1 and 2.2 ->
use cncf.kubernetes 3.*; Airflow 2.3 -> use cncf.kubernetes 4.*.)

My answer: That's one of the options, but it's a bit limiting in terms
of releasing bug-fixes. It will likely lead to us having to maintain
two branches of the cncf.kubernetes provider (when we find a critical
issue), and people who are using Airflow 2.1 will have to migrate to
2.3 in order to use any new K8S features we develop. This is the
current situation we are in, and it is in stark contrast with how it
works for other providers.
We might deliberately choose that path though - maybe it is better to
keep it this way, with the potential price of maintaining critical
fixes in two or more branches of the provider. But it should be a
deliberate choice made knowing the consequences, not an accidental
by-product of the versioning approach we choose. Maybe we make a
pledge that there will be no incompatible changes and we keep 4.0
for as long as we can (though due to changes in the kubernetes library
it might not be possible - as we already experienced in the 3.0 -> 4.0
move).
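To make the fixed-mapping option concrete, here is a tiny
dependency-free sketch. The version pairs are just the hypothetical
mapping from the question above (there is no published compatibility
table like this), and the function name is made up for illustration:

```python
# Hypothetical Airflow-core -> cncf.kubernetes compatibility table,
# mirroring the mapping in the question: Airflow 2.1/2.2 -> provider 3.*,
# Airflow 2.3 -> provider 4.*. Versions are compared as (major, minor)
# tuples so the sketch needs no third-party packaging library.
COMPAT = [
    ((2, 1), (2, 3), "3.*"),  # [lower, upper) Airflow range -> provider line
    ((2, 3), (2, 4), "4.*"),
]

def provider_line_for(airflow_version: str):
    """Return the cncf.kubernetes version line for a given Airflow version,
    or None if the version falls outside the (hypothetical) table."""
    major, minor = (int(part) for part in airflow_version.split(".")[:2])
    for lower, upper, provider_line in COMPAT:
        if lower <= (major, minor) < upper:
            return provider_line
    return None

print(provider_line_for("2.2.5"))  # -> 3.*
print(provider_line_for("2.3.0"))  # -> 4.*
```

The cost described above follows directly from the shape of such a
table: every row that stays supported is a provider branch that needs
its own critical-fix releases.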

4) Alternatively - do we know the changes needed to put "true
decoupling" in place, so that both provider 4.0 and 5.0 will be usable
with the same core? What needs to be done to get there?

My answer: I do not really know - I do not know the details well
enough to say why the changes we implemented last time were so
disruptive, and whether we could have kept backwards compatibility if
we had really wanted to. Was breaking compatibility deliberate because
we had no other choice? Or was it accidental and we **could** have
kept compatibility if we had really wanted to? I am not sure. Of
course we cannot anticipate what future kubernetes library versions
will bring, but deciding what our "goal" here is, and whether it seems
achievable or not, is something we should do.

5) Or MAYBE we should simply incorporate the cncf.kubernetes provider
entirely into the core of Airflow? Maybe there should be NO
"cncf.kubernetes" provider?

My answer: This point is the real reason for my reluctance here. I see
it as a quite possible scenario that we will drop the provider and all
kubernetes code will simply be embedded in Airflow Core. I think this
is a very interesting and probably the least disruptive scenario. Yes,
it means that bugfixes will only be releasable together with the whole
of Airflow, but K8S is so tightly connected to Airflow Core via the
K8S executor that it might be the best choice. And if we do choose
this path, it likely means that the core settings should simply ...
stay in core rather than be moved out.

I am happy to collaborate on that - but I think we at least need to
have a document and a discussion on where we are going with this
before deciding on any breaking changes in Kubernetes settings.

J.

On Fri, Apr 15, 2022 at 6:44 PM Daniel Standish
<[email protected]> wrote:
>
> Thanks Jarek
>
> I think the settings intermingling is independent from the problem you're 
> concerned with.  I can see how it would be desirable to define the executor 
> interface more robustly, and to allow core to not care about k8s version (so 
> that provider can use whatever k8s version it likes).  But this is really a 
> separate issue.
>
> The issue I'm concerned with is that we have a defined way to configure hooks 
> and operators in Airflow: (1) the Airflow Connection or (2) direct config 
> through operator or hook params.  We do not do this via the `airflow.cfg` 
> file.  Resolving this inconsistency does not solve the problem you are 
> concerned with; but it rectifies a user-facing inconsistency and a source of 
> confusion.
>
> Whether the K8s executor is ever moved out of core or not, it will remain 
> desirable that KPO only takes configuration from Airflow Connection or direct 
> params, because that's how things are done in Airflow.  The core 
> `[kubernetes]` settings should apply to the executor but not the operator or 
> the hook.  And indeed, by and large, this is the case already; there are just 
> a few `airflow.cfg` settings that affect KPO and the vast majority do not.
>
> WDYT?
