Re: [Avocado-devel] Parameter System Overhaul

Cleber Rosa Wed, 16 Aug 2017 13:23:38 -0700


On 08/08/2017 09:21 AM, Ademar Reis wrote:
> On Tue, Aug 08, 2017 at 01:01:26PM +0200, Lukáš Doktor wrote:
>> Hello guys,
>>
>> I'm sorry for such a late response, I totally forgot about this email 
>> (thanks to Ademar, your response reminded it to me).
>>
>> Dne 7.8.2017 v 23:49 Ademar Reis napsal(a):
>>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
>>>> Even though Avocado has had a parameter passing system for
>>>> instrumented tests almost from day one, it has been intertwined with
>>>> the varianter (then multiplexer) and this is fundamentally wrong.  The
>>>> most obvious example of this broken design is the `mux-inject` command
>>>> line option::
>>>>
>>>>   --mux-inject [MUX_INJECT [MUX_INJECT ...]]
>>>>                         Inject [path:]key:node values into the final
>>>> multiplex
>>>>                         tree.
>>>>
>>>> This is broken design not because such a varianter implementations can
>>>> be tweaked over the command line, that's fine.  It's broken because it
>>>> is the recommended way of passing parameters on the command line.
>>>>
>>>> The varianter (or any other subsystem) should be able to act as a
>>>> parameter provider, but can not dictate that parameters must first be
>>>> nodes/key/values of its own internal structure.
>>>
>>> Correct. It's broken because it violates several layers. There would
>>> be nothing wrong with something like "--param [prefix:]<key:value>",
>>> for example (more below).
>>>
>> Well I wouldn't call it broken. The implementation is fine we only lack 
>> other providers which would allow to inject just params so people are 
>> abusing `mux-inject` for that.
>>
>>>>
>>>> The proposed design
>>>> ===================
>>>>
>>>> A diagram has been used on a few different occasions, to describe how
>>>> the parameters and variants generation mechanism should be connected
>>>> to a test and to the overall Avocado architecture.  Here it is, in its
>>>> original form::
>>>>
>>>>    +------------------------------+
>>>>    | Test                         |
>>>>    +------------------------------+
>>>>              |
>>>>              |
>>>>    +---------v---------+    +--------------------------------+
>>>>    | Parameters System |--->| Variants Generation Plugin API |
>>>>    +-------------------+    +--------------------------------+
>>>>              ^                                ^
>>>>              |                                |
>>>>    +--------------------------------------+   |
>>>>    | +--------------+ +-----------------+ |   |
>>>>    | | avocado-virt | | other providers | |   |
>>>>    | +--------------+ +-----------------+ |   |
>>>>    +--------------------------------------+   |
>>>>                                               |
>>>>                  +----------------------------+-----+
>>>>                  |                                  |
>>>>                  |                                  |
>>>>                  |                                  |
>>>>        +--------------------+           +-------------------------+
>>>>        | Multiplexer Plugin |           | Other variant plugin(s) |
>>>>        +--------------------+           +-------------------------+
>>>>              |
>>>>              |
>>>>        +-----v---------------------------+
>>>>        | +------------+ +--------------+ |
>>>>        | | --mux-yaml | | --mux-inject | |
>>>>        | +------------+ +--------------+ |
>>>>        +---------------------------------+
>>>>
>>>>
>>>> Given that the "Parameter System" is the entry point into the parameters
>>>> providers, it should provide two different interfaces:
>>>>
>>>>  1) An interface for its users, that is, developers writing
>>>>     `avocado.Test` based tests
>>>>
>>>>  2) An interface for developers of additional providers, such as the
>>>>     "avocado-virt" and "other providers" box on the diagram.
>>>>
>>>> The current state of the the first interface is the ``self.params``
>>>> attribute.  Hopefully, it will be possible to keep its current interface,
>>>> so that tests won't need any kind of compatibility adjustments.
>>>
>>> Right. The way I envision the parameters system includes a
>>> resolution mechanism, the "path" currently used in params.get().
>>> This adds extra specificity to the user who requests a parameter.
>>>
>>> But these parameters can be provided by any entity. In the diagram
>>> above, they're part of the "Parameter System" box. Examples of
>>> "other providers" could be support for a configuration file or a
>>> "--param=[prefix:]<key:value>" command line option.
>>>
>>> The Test API to request parameters should be shared. To a test it
>>> doesn't matter where the parameter is coming from. They're always
>>> accessed through an API like:
>>>
>>>     params.get(<key>, [prefix], [fallback value])
>>>
>>> Where:
>>>   * key (mandatory) is the configuration variable the user wants.
>>>
>>>   * prefix (optional) is used to find the right key and can be
>>>     partially resolved, from right to left. As an example, a
>>>     `params.get("bar", "name")` would be fulfilled by the parameter
>>>     system if there are entries like "bar:name", "foobar:name" or
>>>     "/foo/bar:name". We're using '/' in the multiplexer to
>>>     separate branch levels, thus instead of "prefix", we're calling
>>>     it "path", but the mechanism is the same and IMO should be
>>>     generic to avoid confusion (a path at the parameter level is
>>>     different than a path at the multiplexer level).
>>>
>>>     params.get() can be even more flexible, supporting regexps and
>>>     blobs...  The objective of [prefix] is to serve as a filter and
>>>     to guarantee the caller is getting the variable he wants. It's
>>>     similar to a namespace.
>>>
>>>   * fallback (optional) is the value returned if the parameter
>>>     system can't resolve prefix+key.
>>>
>>> Users who don't want any specificity and/or have a small test base
>>> with a low chance of clashes could simply ignore the prefix both
>>> when creating parameters and when making calls to params.get().
>>>
>>>>
>>>> The second item will probably mean the definition of a new class to
>>>> the ``avocado.core.plugin_interfaces`` module, together with a new
>>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
>>>>
>>>> Parameters availability: local .vs. external
>>>> ============================================
>>>>
>>>> Right now, all parameters are given to the test at instantiation time.
>>>> Let's say that in this scenario, all parameters are *local* to the
>>>> test.  Local parameters have the benefit that the test is self
>>>> contained and doesn't need any external communication.
>>>>
>>>> In theory, this is also a limitation, because all parameters must be
>>>> available before the test is started.  Even if other parameter system
>>>> implementations are possible, with a local approach, there would be a
>>>> number of limitations.  For long running tests, that may depend on
>>>> parameters generated during the test, this would be a blocker.  Also,
>>>> if a huge number of parameters would be available (think of a huge
>>>> cloud or database of parameters) they would all have to be copied to
>>>> the test at instantiation time.  Finally, it also means that the
>>>> source of parameters would need to be able to iterate over all the
>>>> available parameters, so that they can be copied, which can be a
>>>> further limitation for cascading implementations.
>>>>
>>>> An external approach to parameters, would be one that the test holds a
>>>> handle to a broker of parameter providers.  The parameter resolution
>>>> would be done at run time.  This avoids the copying of parameters, but
>>>> requires runtime communication with the parameter providers.  This can
>>>> make the test execution much more fragile and dependent on the external
>>>> communication.  Even by minimizing the number of communication
>>>> endpoints by communicating with the test runner only, it can still add
>>>> significant overhead, latency and point of failures to the test
>>>> execution.
>>>>
>>>> I believe that, at this time, the limitations imposed by local
>>>> parameter availability do *not* outweigh the problems that an external
>>>> approach can bring.  In the future, if advanced use cases require an
>>>> external approach to parameters availability, this can be reevaluated.
>>>
>>> If I understand your point correctly, this is an implementation
>>> detail. It depends on what the "contract" is between the test runner
>>> (parameter provider) and the test being run (the user of
>>> params.get()).
>>
>> The only difference for the scope of this RFC I see is that the
>> "lazy" params can't be merged with "static" params before test
>> execution (otherwise it'd effectively made them static as well).
>> This is quite important difference to consider further in the
>> design...
>>
>>>
>>> For example, should users assume parameters are dynamic and can
>>> change during the lifetime of a test, an therefore two identical
>>> calls to params.get() might return different values?  Should it be
>>> possible to change params (something like params.put()) at runtime?
>> Definitely no params changing.
>>
>>>
>>> (IMO the answer should be no to both questions).
> 
> The above should be properly specified.
>


Agreed, let's specify it as a negative: users won't have an interface to
add/change/delete parameters during the execution of a test.

>>>
>>> If you have something different in mind, then it would be
>>> interesting to see some real use-cases.
>>>
>>>>
>>>> Namespaces (AKA how/if should we merge parameters)
>>>> ==================================================
>>>>
>>>> Currently, the parameter fetching interface already contains at its
>>>> core the concept of paths[1].  In theory, this is sufficient to prevent
>>>> clashes of keys with the same names, but intended to configure different
>>>> aspects of a test.
>>>>
>>>> Now, with the proposed implementation of multiple providers to the
>>>> parameter system, the question of how they will be combined comes up.
>>>>
>>>> One way is for each implementation to claim, based on some unique
>>>> attribute such as its own name, a part of a tree path.  For instance,
>>>> for two implementations:
>>>>
>>>>  1) variants
>>>>  2) plain_ini
>>>>
>>>> Users could access parameters explicitly defined on those by referring
>>>> to paths such as:
>>>>
>>>>    self.params.get('speed', path='/plain_ini', default=60)
>>>>
>>>> or
>>>>
>>>>    self.params.get('speed', path='/variants/path/inside/varianter',
>>>> default=60)
>>>>
>>>> This clearly solves the clashes, but binds the implementation to the
>>>> tests, which should be avoided at all costs.
>>>
>>> So you're providing this as an example of why it's a bad idea...
>>> OK. :)
>>>
>> Yep, don't see this as a way to go. One should be able to execute the same 
>> test with different providers (eg. json file with params generated by 
>> jenkins)
>>
>>>>
>>>> One other approach would be to merge everything into the tree root
>>>> node.  By doing this, one parameter provider could effectively
>>>> override the values in another parameter provider, given that both
>>>> used the same paths for a number of parameters.
>> This wouldn't be possible with "lazy" providers as it'd require iterating 
>> through all params.
>>
>>>>
>>>> Yet another approach would be to *not* use paths, and resort to
>>>> completely separate namespaces.  A parameter namespace would be an
>>>> additional level of isolation, which can quickly become exceedingly
>>>> complex.
>> If I understood it correctly, this is what I'd go with. The way I envision 
>> this is:
>>
>> 1. varianter yields a variant, which also contains params in form of list of 
>> namespaces associated with params `variant = [{"/run/variant": {"foo": 
>> "bar"}},]`
>> 2. runner attaches params from cmdline `cmdline = [{"/1": {"foo": "bar"}}, 
>> {"/2": {"foo": "bar"}}]`
>> 3. runner attaches params from avocado-virt plugin `virt = [{"/virt": 
>> {"image_name": "foo"}]
>> 4. runner executes test and passes params definition there
>>     {"params": [("varianter", varianter), ("cmdline", cmdline), ("virt", 
>> virt)],
>>      "order": ["varianter", "__ALL_", "cmdline"]}
>>
>> Now the NewAvocadoParams object should create AvocadoParams per each 
>> provider (params, cmdline, virt) being able to query params from any of 
>> them. The process of getting params would get a bit more complicated for 
>> Avocado, but for user nothing changes. Let's say user issues 
>> params.get("foo", "/*"):
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter is asked for a value, reports "bar", as this is a valid 
>> response, no further params is queried and "bar" is returned
>>
>> Now for params.get("foo", "/1/*")
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter is asked and reports no match
>> 3. __ALL__ => means ask providers without assigned priority, which applies 
>> to "virt" in this example, which reports no match
>> 4. cmdline is asked and returns "bar"
>>
>> And params.get("missing")
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter => no match
>> 3. virt => no match
>> 4. cmdline => no match
>> 5. default is reported (None)
>>
>>
>> The "order" is important, by default I'd suggest 
>> varianter=>plugins=>params=>default_params (basically what Ademar suggested 
>> and we all agreed many times), but it should be possible to set it if 
>> necessary (eg. when user have multiple plugins and knows the order).
>>
>> Now in case of clash I believe it should report the clash even though 
>> further providers might be able to report. Let's change the order to 
>> ["cmdline", "__ALL__"] and query for params.get("foo", "/*")
>>
>> 1. cmdline is asked, it raises ValueError("Leaves [/1, /2] contain key 
>> "foo"\n  /1:foo:bar\n  /2:foo:bar") which is forwarded to the test.
>>
>> This schema would work for "lazy" providers as it'd only be asked for 
>> specific key and only when needed.
>>

Your example fits what would be necessary to have "lazy" providers
indeed (what I call parameters that are not "local" to the test).  But,
please take a look at the other points raised in other responses in this
thread.  I think it's a good idea to stick with data only parameters to
tests, which means no "lazy" providers.

> 
> So looks like we agree here, but I'm not familiar with the
> implementation details to follow everything. Cleber, can you confirm
> and clarify what you had in mind?
> 

We're in sync about what users should expect and the general
functionality.  As I just wrote, I'm confident that the test can be kept
isolated from almost all of the parameter code, receiving only the
parameter data.  It'd only need a simple wrapper to access the parameter
data it already possess: that would be the role of self.params.get().

That means a different *implementation* than Lukáš exemplified above.
It also means no "lazy" providers.

>>>
>>> I think using paths is confusing because it mixes concepts which are
>>> exclusive to the multiplexer (a particular implementation of the
>>> varianter) with an API that is shared by all other parameter
>>> providers.
>>>
>>> For example, when you say "merge everything into the tree root
>>> node", are you talking about namespace paths, or paths as used by
>>> the multiplexer when the "!mux" keyword is present?
>> There is no difference between namespace path and path in
>> multiplexer. The only difference is that multiplexer can provide a
>> different slice, but the namespace is always the same, only not
>> existing in some slices:
> 
> The multiplexer is just one of many possible "varianters". In the
> multiplexer the path (tree) has a well defined semantics, which is
> non-trivial: paths can be namespaces or used for "multiplexing" when
> the "!mux" keyword is used.
> 
> I've been thinking of variants as having a prefix, but not a path (a
> dictionary, not a tree).  But maybe a tree is better, assuming it
> works as a good abstraction for other "varianters". We need an
> authoritative specification somewhere.
> 

Right.  Lukáš can you compose a simple section on this topic?  Do you
think the variants (thinking abstractly about them) would match nicely
with the use of a tree based path?

All of the previous responses about the parameter system have mentioned
a regex based "prefix", so this is quite an important point.

- Cleber.


-- 
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]

signature.asc
Description: OpenPGP digital signature

Re: [Avocado-devel] Parameter System Overhaul

Reply via email to