On 08/08/2017 09:21 AM, Ademar Reis wrote:
> On Tue, Aug 08, 2017 at 01:01:26PM +0200, Lukáš Doktor wrote:
>> Hello guys,
>>
>> I'm sorry for such a late response, I totally forgot about this email
>> (thanks to Ademar, your response reminded it to me).
>>
>> Dne 7.8.2017 v 23:49 Ademar Reis napsal(a):
>>> On Tue, Aug 01, 2017 at 03:37:34PM -0400, Cleber Rosa wrote:
>>>> Even though Avocado has had a parameter passing system for
>>>> instrumented tests almost from day one, it has been intertwined with
>>>> the varianter (then multiplexer) and this is fundamentally wrong. The
>>>> most obvious example of this broken design is the `mux-inject` command
>>>> line option::
>>>>
>>>> --mux-inject [MUX_INJECT [MUX_INJECT ...]]
>>>> Inject [path:]key:node values into the final
>>>> multiplex
>>>> tree.
>>>>
>>>> This is broken design not because such a varianter implementations can
>>>> be tweaked over the command line, that's fine. It's broken because it
>>>> is the recommended way of passing parameters on the command line.
>>>>
>>>> The varianter (or any other subsystem) should be able to act as a
>>>> parameter provider, but can not dictate that parameters must first be
>>>> nodes/key/values of its own internal structure.
>>>
>>> Correct. It's broken because it violates several layers. There would
>>> be nothing wrong with something like "--param [prefix:]<key:value>",
>>> for example (more below).
>>>
>> Well I wouldn't call it broken. The implementation is fine we only lack
>> other providers which would allow to inject just params so people are
>> abusing `mux-inject` for that.
>>
>>>>
>>>> The proposed design
>>>> ===================
>>>>
>>>> A diagram has been used on a few different occasions, to describe how
>>>> the parameters and variants generation mechanism should be connected
>>>> to a test and to the overall Avocado architecture. Here it is, in its
>>>> original form::
>>>>
>>>> +------------------------------+
>>>> | Test |
>>>> +------------------------------+
>>>> |
>>>> |
>>>> +---------v---------+ +--------------------------------+
>>>> | Parameters System |--->| Variants Generation Plugin API |
>>>> +-------------------+ +--------------------------------+
>>>> ^ ^
>>>> | |
>>>> +--------------------------------------+ |
>>>> | +--------------+ +-----------------+ | |
>>>> | | avocado-virt | | other providers | | |
>>>> | +--------------+ +-----------------+ | |
>>>> +--------------------------------------+ |
>>>> |
>>>> +----------------------------+-----+
>>>> | |
>>>> | |
>>>> | |
>>>> +--------------------+ +-------------------------+
>>>> | Multiplexer Plugin | | Other variant plugin(s) |
>>>> +--------------------+ +-------------------------+
>>>> |
>>>> |
>>>> +-----v---------------------------+
>>>> | +------------+ +--------------+ |
>>>> | | --mux-yaml | | --mux-inject | |
>>>> | +------------+ +--------------+ |
>>>> +---------------------------------+
>>>>
>>>>
>>>> Given that the "Parameter System" is the entry point into the parameters
>>>> providers, it should provide two different interfaces:
>>>>
>>>> 1) An interface for its users, that is, developers writing
>>>> `avocado.Test` based tests
>>>>
>>>> 2) An interface for developers of additional providers, such as the
>>>> "avocado-virt" and "other providers" box on the diagram.
>>>>
>>>> The current state of the the first interface is the ``self.params``
>>>> attribute. Hopefully, it will be possible to keep its current interface,
>>>> so that tests won't need any kind of compatibility adjustments.
>>>
>>> Right. The way I envision the parameters system includes a
>>> resolution mechanism, the "path" currently used in params.get().
>>> This adds extra specificity to the user who requests a parameter.
>>>
>>> But these parameters can be provided by any entity. In the diagram
>>> above, they're part of the "Parameter System" box. Examples of
>>> "other providers" could be support for a configuration file or a
>>> "--param=[prefix:]<key:value>" command line option.
>>>
>>> The Test API to request parameters should be shared. To a test it
>>> doesn't matter where the parameter is coming from. They're always
>>> accessed through an API like:
>>>
>>> params.get(<key>, [prefix], [fallback value])
>>>
>>> Where:
>>> * key (mandatory) is the configuration variable the user wants.
>>>
>>> * prefix (optional) is used to find the right key and can be
>>> partially resolved, from right to left. As an example, a
>>> `params.get("bar", "name")` would be fulfilled by the parameter
>>> system if there are entries like "bar:name", "foobar:name" or
>>> "/foo/bar:name". We're using '/' in the multiplexer to
>>> separate branch levels, thus instead of "prefix", we're calling
>>> it "path", but the mechanism is the same and IMO should be
>>> generic to avoid confusion (a path at the parameter level is
>>> different than a path at the multiplexer level).
>>>
>>> params.get() can be even more flexible, supporting regexps and
>>> blobs... The objective of [prefix] is to serve as a filter and
>>> to guarantee the caller is getting the variable he wants. It's
>>> similar to a namespace.
>>>
>>> * fallback (optional) is the value returned if the parameter
>>> system can't resolve prefix+key.
>>>
>>> Users who don't want any specificity and/or have a small test base
>>> with a low chance of clashes could simply ignore the prefix both
>>> when creating parameters and when making calls to params.get().
>>>
>>>>
>>>> The second item will probably mean the definition of a new class to
>>>> the ``avocado.core.plugin_interfaces`` module, together with a new
>>>> dispatcher(-like) implementation in ``avocado.core.dispatcher``.
>>>>
>>>> Parameters availability: local .vs. external
>>>> ============================================
>>>>
>>>> Right now, all parameters are given to the test at instantiation time.
>>>> Let's say that in this scenario, all parameters are *local* to the
>>>> test. Local parameters have the benefit that the test is self
>>>> contained and doesn't need any external communication.
>>>>
>>>> In theory, this is also a limitation, because all parameters must be
>>>> available before the test is started. Even if other parameter system
>>>> implementations are possible, with a local approach, there would be a
>>>> number of limitations. For long running tests, that may depend on
>>>> parameters generated during the test, this would be a blocker. Also,
>>>> if a huge number of parameters would be available (think of a huge
>>>> cloud or database of parameters) they would all have to be copied to
>>>> the test at instantiation time. Finally, it also means that the
>>>> source of parameters would need to be able to iterate over all the
>>>> available parameters, so that they can be copied, which can be a
>>>> further limitation for cascading implementations.
>>>>
>>>> An external approach to parameters, would be one that the test holds a
>>>> handle to a broker of parameter providers. The parameter resolution
>>>> would be done at run time. This avoids the copying of parameters, but
>>>> requires runtime communication with the parameter providers. This can
>>>> make the test execution much more fragile and dependent on the external
>>>> communication. Even by minimizing the number of communication
>>>> endpoints by communicating with the test runner only, it can still add
>>>> significant overhead, latency and point of failures to the test
>>>> execution.
>>>>
>>>> I believe that, at this time, the limitations imposed by local
>>>> parameter availability do *not* outweigh the problems that an external
>>>> approach can bring. In the future, if advanced use cases require an
>>>> external approach to parameters availability, this can be reevaluated.
>>>
>>> If I understand your point correctly, this is an implementation
>>> detail. It depends on what the "contract" is between the test runner
>>> (parameter provider) and the test being run (the user of
>>> params.get()).
>>
>> The only difference for the scope of this RFC I see is that the
>> "lazy" params can't be merged with "static" params before test
>> execution (otherwise it'd effectively made them static as well).
>> This is quite important difference to consider further in the
>> design...
>>
>>>
>>> For example, should users assume parameters are dynamic and can
>>> change during the lifetime of a test, an therefore two identical
>>> calls to params.get() might return different values? Should it be
>>> possible to change params (something like params.put()) at runtime?
>> Definitely no params changing.
>>
>>>
>>> (IMO the answer should be no to both questions).
>
> The above should be properly specified.
>
Agreed, let's specify it as a negative: users won't have an interface to
add/change/delete parameters during the execution of a test.
>>>
>>> If you have something different in mind, then it would be
>>> interesting to see some real use-cases.
>>>
>>>>
>>>> Namespaces (AKA how/if should we merge parameters)
>>>> ==================================================
>>>>
>>>> Currently, the parameter fetching interface already contains at its
>>>> core the concept of paths[1]. In theory, this is sufficient to prevent
>>>> clashes of keys with the same names, but intended to configure different
>>>> aspects of a test.
>>>>
>>>> Now, with the proposed implementation of multiple providers to the
>>>> parameter system, the question of how they will be combined comes up.
>>>>
>>>> One way is for each implementation to claim, based on some unique
>>>> attribute such as its own name, a part of a tree path. For instance,
>>>> for two implementations:
>>>>
>>>> 1) variants
>>>> 2) plain_ini
>>>>
>>>> Users could access parameters explicitly defined on those by referring
>>>> to paths such as:
>>>>
>>>> self.params.get('speed', path='/plain_ini', default=60)
>>>>
>>>> or
>>>>
>>>> self.params.get('speed', path='/variants/path/inside/varianter',
>>>> default=60)
>>>>
>>>> This clearly solves the clashes, but binds the implementation to the
>>>> tests, which should be avoided at all costs.
>>>
>>> So you're providing this as an example of why it's a bad idea...
>>> OK. :)
>>>
>> Yep, don't see this as a way to go. One should be able to execute the same
>> test with different providers (eg. json file with params generated by
>> jenkins)
>>
>>>>
>>>> One other approach would be to merge everything into the tree root
>>>> node. By doing this, one parameter provider could effectively
>>>> override the values in another parameter provider, given that both
>>>> used the same paths for a number of parameters.
>> This wouldn't be possible with "lazy" providers as it'd require iterating
>> through all params.
>>
>>>>
>>>> Yet another approach would be to *not* use paths, and resort to
>>>> completely separate namespaces. A parameter namespace would be an
>>>> additional level of isolation, which can quickly become exceedingly
>>>> complex.
>> If I understood it correctly, this is what I'd go with. The way I envision
>> this is:
>>
>> 1. varianter yields a variant, which also contains params in form of list of
>> namespaces associated with params `variant = [{"/run/variant": {"foo":
>> "bar"}},]`
>> 2. runner attaches params from cmdline `cmdline = [{"/1": {"foo": "bar"}},
>> {"/2": {"foo": "bar"}}]`
>> 3. runner attaches params from avocado-virt plugin `virt = [{"/virt":
>> {"image_name": "foo"}]
>> 4. runner executes test and passes params definition there
>> {"params": [("varianter", varianter), ("cmdline", cmdline), ("virt",
>> virt)],
>> "order": ["varianter", "__ALL_", "cmdline"]}
>>
>> Now the NewAvocadoParams object should create AvocadoParams per each
>> provider (params, cmdline, virt) being able to query params from any of
>> them. The process of getting params would get a bit more complicated for
>> Avocado, but for user nothing changes. Let's say user issues
>> params.get("foo", "/*"):
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter is asked for a value, reports "bar", as this is a valid
>> response, no further params is queried and "bar" is returned
>>
>> Now for params.get("foo", "/1/*")
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter is asked and reports no match
>> 3. __ALL__ => means ask providers without assigned priority, which applies
>> to "virt" in this example, which reports no match
>> 4. cmdline is asked and returns "bar"
>>
>> And params.get("missing")
>>
>> 1. NewAvocadoParasms looks at the "order"
>> 2. varianter => no match
>> 3. virt => no match
>> 4. cmdline => no match
>> 5. default is reported (None)
>>
>>
>> The "order" is important, by default I'd suggest
>> varianter=>plugins=>params=>default_params (basically what Ademar suggested
>> and we all agreed many times), but it should be possible to set it if
>> necessary (eg. when user have multiple plugins and knows the order).
>>
>> Now in case of clash I believe it should report the clash even though
>> further providers might be able to report. Let's change the order to
>> ["cmdline", "__ALL__"] and query for params.get("foo", "/*")
>>
>> 1. cmdline is asked, it raises ValueError("Leaves [/1, /2] contain key
>> "foo"\n /1:foo:bar\n /2:foo:bar") which is forwarded to the test.
>>
>> This schema would work for "lazy" providers as it'd only be asked for
>> specific key and only when needed.
>>
Your example fits what would be necessary to have "lazy" providers
indeed (what I call parameters that are not "local" to the test). But,
please take a look at the other points raised in other responses in this
thread. I think it's a good idea to stick with data only parameters to
tests, which means no "lazy" providers.
>
> So looks like we agree here, but I'm not familiar with the
> implementation details to follow everything. Cleber, can you confirm
> and clarify what you had in mind?
>
We're in sync about what users should expect and the general
functionality. As I just wrote, I'm confident that the test can be kept
isolated from almost all of the parameter code, receiving only the
parameter data. It'd only need a simple wrapper to access the parameter
data it already possess: that would be the role of self.params.get().
That means a different *implementation* than Lukáš exemplified above.
It also means no "lazy" providers.
>>>
>>> I think using paths is confusing because it mixes concepts which are
>>> exclusive to the multiplexer (a particular implementation of the
>>> varianter) with an API that is shared by all other parameter
>>> providers.
>>>
>>> For example, when you say "merge everything into the tree root
>>> node", are you talking about namespace paths, or paths as used by
>>> the multiplexer when the "!mux" keyword is present?
>> There is no difference between namespace path and path in
>> multiplexer. The only difference is that multiplexer can provide a
>> different slice, but the namespace is always the same, only not
>> existing in some slices:
>
> The multiplexer is just one of many possible "varianters". In the
> multiplexer the path (tree) has a well defined semantics, which is
> non-trivial: paths can be namespaces or used for "multiplexing" when
> the "!mux" keyword is used.
>
> I've been thinking of variants as having a prefix, but not a path (a
> dictionary, not a tree). But maybe a tree is better, assuming it
> works as a good abstraction for other "varianters". We need an
> authoritative specification somewhere.
>
Right. Lukáš can you compose a simple section on this topic? Do you
think the variants (thinking abstractly about them) would match nicely
with the use of a tree based path?
All of the previous responses about the parameter system have mentioned
a regex based "prefix", so this is quite an important point.
- Cleber.
--
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[ 7ABB 96EB 8B46 B94D 5E0F E9BB 657E 8D33 A5F2 09F3 ]
signature.asc
Description: OpenPGP digital signature
