Re: In search of a CFEngine language specification (for 4.x)

Mike Weilgart Mon, 28 Nov 2016 23:50:22 -0800

> For a start, I can presume that current CFEngine3.x promise file syntax
> may be more than enough to cover all usages and underlying theory. Let me
> provoke that there is no need to introduce any new syntactic elements.

This is almost entirely true. There is *one* syntactic change I believe
*may* be appropriate (and only one). That would be to resolve the
asymmetry between the "depends_on" attribute
<https://docs.cfengine.com/lts/reference-promise-types.html#depends_on> and
the "promisees" list
<https://docs.cfengine.com/lts/guide-language-concepts.html>, either by
adding a dedicated syntax for listing "dependent" promises, or by removing
the dedicated syntax for "promisees."

As hardly anyone uses the "promisees" list and there is a dedicated syntax
for list attributes already, I would favor removing the "promisees" syntax
entirely. (If it is desired it could be defined as a "promisees"
attribute, which would increase syntax regularity.) I will write as though
"promisees" lists don't exist, as I favor removing that dedicated syntax in
CFEngine 4.x.

Also, let's be clear that by *syntax* we mean *only* those elements
described in the very first two code snippets in the Language Concepts
documentation <https://docs.cfengine.com/lts/guide-language-concepts.html>.
These elements are so simple that they may be formally defined within a
two-page document. From a *syntactic* standpoint, with no concern for
semantics
<https://en.wikipedia.org/wiki/Syntax_(programming_languages)#Syntax_versus_semantics>,
there are only two alphanumeric (non-symbolic) keywords: "bundle" and
"body."

*Here is a rough draft of a formal syntax specification:*

------------------

A *policy*, informally, is a collection of *bundles*. More precisely, a
*policy* consists of zero or more *bundles* and zero or more *bodies*.

A *bundle* is a collection of zero or more *promises*. More precisely, a
*bundle* consists of the "bundle" keyword, a *bundle type*, a *bundle name*,
an open curly brace, zero or more *promises*, and a close curly brace.

A *promise* consists of a *promise type* followed by a colon, an optionally
specified *context*, a *promiser*, zero or more comma-separated *attributes*,
and a terminating semicolon. The *promise type* (along with its following
colon) may be omitted from any *promise* other than the first *promise* in
a *bundle*. (Semantically: An omitted *promise type* shall be inferred
from the closest preceding *promise* in the *bundle* for which a *promise
type* is specified. A *promise* with an omitted *context* but a specified
*promise type* shall be inferred to have the *context* "any::". A *promise*
with both an omitted *context* and an omitted *promise type* shall have its
*context* inferred from the immediately preceding *promise* in the
*bundle*.)
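
As a rough illustration only (this is not CFEngine code), the inference
rules in the parenthetical above could be sketched in Python as follows.
The names RawPromise, ResolvedPromise and resolve are hypothetical, and a
context is stored as its class expression without the trailing "::".

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RawPromise:
    """A promise as written; None marks an element omitted in the source."""
    promiser: str
    promise_type: Optional[str] = None
    context: Optional[str] = None


@dataclass
class ResolvedPromise:
    """A promise after the omitted elements have been inferred."""
    promiser: str
    promise_type: str
    context: str


def resolve(promises: List[RawPromise]) -> List[ResolvedPromise]:
    """Apply the draft inference rules to the promises of one bundle."""
    resolved = []
    current_type = None      # closest preceding specified promise type
    current_context = None   # resolved context of the preceding promise
    for p in promises:
        if p.promise_type is not None:
            current_type = p.promise_type
            # A specified type with no context means the context "any".
            current_context = p.context if p.context is not None else "any"
        else:
            if current_type is None:
                raise ValueError(
                    "the first promise in a bundle must specify a promise type")
            if p.context is not None:
                current_context = p.context
            # Otherwise the context carries over from the preceding promise.
        resolved.append(ResolvedPromise(p.promiser, current_type, current_context))
    return resolved
```

Feeding it a "vars" promise, then a promise with only a "linux" context,
then a bare promiser would give all three the type "vars", with the second
and third both in the "linux" context.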

A *body* is a collection of *attributes*. More precisely, a *body* consists
of the "body" keyword, a *body type*, a *body name*, an open curly brace,
one or more semicolon-terminated *attributes* (with optionally specified
*contexts*) and a close curly brace. Any *attribute* within a *body* may
be preceded by a *context*. (Semantically: If a *context* is specified
within a *body*, it applies to all *attributes* forward from that *context*
until either another *context* is specified or the end of the *body* is
reached.
*Attributes* for which no *context* is specified shall be inferred to have
the *context* "any::".)
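
A minimal Python sketch of that forward-applying rule, under the
assumption that a body has already been tokenized into a flat sequence of
("context", expr) and ("attribute", name, value) tuples. That
representation and the function name attribute_contexts are made up for
illustration; they are not part of CFEngine.

```python
from typing import List, Tuple


def attribute_contexts(items: List[tuple]) -> List[Tuple[str, str, str]]:
    """Pair every attribute in a body with the context that governs it.

    `items` lists the body's contents in source order. The result is a
    list of (context, attribute_name, value) triples.
    """
    current = "any"  # attributes before any explicit context get "any"
    out = []
    for item in items:
        if item[0] == "context":
            # A new context applies to everything that follows it.
            current = item[1]
        elif item[0] == "attribute":
            _, name, value = item
            out.append((current, name, value))
        else:
            raise ValueError(f"unknown item kind: {item[0]!r}")
    return out
```

For example, a body holding one attribute, then a "linux" context followed
by two more attributes, yields the first attribute under "any" and the
other two under "linux".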

An *attribute* is a key-value pair. More precisely, an *attribute* consists
of an *attribute name*, the characters "=>", and a *value*.

A *value* can be either a *scalar value* or a *list value*.

A *scalar value* is either an *integer*, a *real number*, or a *string*.

A *list value* is either an *integer list*, a *real number list*, or a
*string list*. Any *list value* consists of an open curly brace, zero or more
comma-separated *scalar values* which are all of the same type, and a
closing curly brace.
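
The "all of the same type" requirement could be checked along these lines.
This is a hedged Python sketch with a hypothetical function name; the
draft does not say how an empty list should be typed, so the sketch
returns None in that case as an assumption.

```python
from typing import List, Optional, Union

Scalar = Union[int, float, str]

# Maps Python types onto the scalar type names used in the draft.
_TYPE_NAMES = {int: "integer", float: "real number", str: "string"}


def list_value_type(values: List[Scalar]) -> Optional[str]:
    """Return the list type named in the draft, or None for an empty list.

    Raises TypeError if an element is not a scalar or if the elements are
    not all of exactly the same type.
    """
    if not values:
        return None  # assumption: an empty list has no determinable type
    first = type(values[0])
    if first not in _TYPE_NAMES:
        raise TypeError(f"not a scalar value: {values[0]!r}")
    for v in values[1:]:
        # Exact type comparison, so 1 and 1.0 do not mix silently.
        if type(v) is not first:
            raise TypeError(
                f"mixed list: expected {_TYPE_NAMES[first]}, got {v!r}")
    return _TYPE_NAMES[first] + " list"
```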

A *context* consists of a *class expression* followed by the characters
"::".

*Body semantics:*

If an *attribute* has an *attribute name* which is a defined *body type*,
the *value* of the *attribute* is expected to be a *string* containing a
*body name* referring to a *body* which is defined somewhere in the *policy*.
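
One way to picture this lookup is the following Python sketch. It assumes
bodies are stored in a dictionary keyed by (body type, body name); the
function name and that data layout are illustrative assumptions, not how
any CFEngine parser is known to work.

```python
from typing import Dict, Optional, Set, Tuple

Body = Dict[str, object]  # attribute name -> value


def resolve_body_reference(
    attr_name: str,
    value: object,
    body_types: Set[str],
    bodies: Dict[Tuple[str, str], Body],
) -> Optional[Body]:
    """Return the referenced body, or None for an ordinary attribute."""
    if attr_name not in body_types:
        return None  # not a body type, so the value is used as-is
    if not isinstance(value, str):
        raise TypeError(
            f"attribute {attr_name!r} must name a body with a string")
    key = (attr_name, value)
    if key not in bodies:
        raise LookupError(f"no body of type {attr_name!r} named {value!r}")
    return bodies[key]
```

With a "copy_from" body named "secure_cp" defined, the attribute
'copy_from => "secure_cp"' resolves to that body, while an attribute whose
name is not a body type passes through untouched.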

------------------

In this draft syntax I only defined some italicized terms, notably *not*
including "string" or "class expression." I believe these are much more
complicated to define precisely in a first draft, particularly considering
the quoting and whitespace rules <https://tracker.mender.io/browse/CFE-1921>.

I do think typing should be strengthened (strong typing) in CFEngine 4
compared to CFEngine 3.7.

--------

One syntactic change I considered but ultimately rejected is the request
for anonymous in-line bodies in promises (as described in CFE-2196
<https://tracker.mender.io/browse/CFE-2196>, the help-cfengine post
"Anonymous bodies - language syntax suggestion,"
<https://groups.google.com/forum/#!topic/help-cfengine/jTvkDyHruPI> and
CFE-2064 <https://tracker.mender.io/browse/CFE-2064>). I wrote the post,
and yet I reject this idea. Why?

Consider the *purpose* of promise bodies: they are for *knowledge
management*.

All attributes of, let us say, a "copy_from" body are *actually* attributes
of a "files" promise. It would not violate the simple, regular,
predictable syntax outlined above if a "files" promise were allowed to
*directly* include those attributes rather than referencing a "copy_from"
body. Why not allow this?

Likewise, why hardcode the allowable body types in the parser? Why not
simply define ALL attribute names related to a promise type as belonging to
that promise type, and allow ANY arbitrary collection of them to be
specified as a body type?

This could be done (within the grammar above) with a special "bundle type."
Call it a "body_def" and you would get the following:

#NOT REAL CODE

bundle body_def copy_from {
  attributes:
    "source";
    "servers";
    "collapse_destination_dir"
      default_string => "false";
    "compare"
      default_string => "???",
      comment => "There is no way to accomplish default behavior explicitly.";
    "copy_backup"
      default_string => "true";
    "encrypt"
      default_string => "false";
    "check_root"
      default_string => "false";
    "copylink_patterns";
    "copy_size"
      default_irange => "0,inf",
      comment => "I have no idea what type this is supposed to be.";
    "findertype";
    "linkcopy_patterns";
    "link_type"
      default_string => "symlink";
    "force_update"
      default_string => "false";
    "force_ipv4"
      default_string => "false";
    "portnumber";
    "preserve"
      default_string => "false";
    "protocol_version"
      default_string => "classic",
      comment => "Can also be an int???";
    "purge"
      default_string => "false";
    "stealth"
      default_string => "false";
    "timeout"
      default_int => "default_timeout",
      comment => "Yes, I know that's not an int.";
    "trustkey"
      default => "false",
      comment => "I forgot '_string' because it wouldn't be intuitive. I like 'default' better.";
    "type_check" default => "true";
    "verify" default => "false";
}

----------------------

Writing this out was an interesting exercise that highlighted a few more
syntax questions:

1. In CFEngine 3, *each attribute name is tied to a specific type.* In
some CFEngine 4 cases I can conceive of (such as the "default" attribute of
an "attributes" promise in a "body_def" bundle, in the made-up example
above), it would be useful not to tie these together so tightly. *Should*
attribute names be tied to specific data types *by the parser*?

2. *Booleans are inconsistently handled* in CFEngine 3. For class
expressions, "any" is true and "!any" is false, and there is evidently a
dedicated internal type for classes
<https://tracker.mender.io/browse/CFE-2255> which is distinct from strings.

Menu items for attributes which may be "true" or "false" are nicely human
readable, but if we are to expand the interoperability of the language to
allow new promise types to be added more easily it could make sense to
allow booleans to be an actual *type*.

For instance, why should the following syntax be necessary in CFEngine 3?

body copy_from conditionally_secure {
  be_secure::
    encrypt => "true";
  !be_secure::
    encrypt => "false";
}

If booleans were fully supported as a type, related to class expressions,
you could simplify this. Incorporating my notions of inline attributes
instead of bodies:

files:
  "/path/to/some/file"
    source => "/path/on/hub/to/file",
    servers => "$(sys.policy_hub)",
    encrypt => "be_secure";
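
To make the intended semantics concrete, here is a deliberately small
Python sketch of how such a boolean-typed attribute value might be
evaluated. It is only an assumption about one possible behavior: the
function name as_boolean is hypothetical, and it handles just the
literals "true" and "false" plus a single class name with an optional
leading "!", not full class expressions.

```python
from typing import Set


def as_boolean(value: str, defined_classes: Set[str]) -> bool:
    """Interpret an attribute value as a boolean.

    "true" and "false" are literals; anything else is treated as a class
    name, optionally negated with a leading "!". The class "any" is
    always considered defined.
    """
    if value == "true":
        return True
    if value == "false":
        return False
    negate = value.startswith("!")
    name = value[1:] if negate else value
    result = name == "any" or name in defined_classes
    return not result if negate else result
```

On a host where the class "be_secure" is defined, 'encrypt => "be_secure"'
would evaluate to true; everywhere else it would evaluate to false, which
matches the two-context body shown earlier.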

3. Should there be some sort of "range" data type? It seems that
"copy_size" could just as well be two separate attributes, which wouldn't
violate the key-value pair concept and would preserve regularity.

4. Very importantly, *CFEngine 3 contains multiple promises packed into a
single "promise."* <https://tracker.mender.io/browse/CFE-1843> This is
viewable in promise outcomes, wherein multiple outcomes can result from a
"single" promise. (Nick Anderson calls them "compound promises.") This is
probably the single biggest discrepancy between Promise Theory *per se* and
the CFEngine 3 implementation.

How should this be coordinated with the language semantics and syntax?

(Note that the "create" attribute of a files promise does not strictly
align with Promise Theory at all. File creation could be made the default
behavior, and the current behavior could be accomplished with "if =>
fileexists("$(this.promiser)")". This would solve accidental empty file
creation <https://tracker.mender.io/browse/CFE-2329> as just a side effect.)

---------

> So, indeed, there must be a core, clear and universal approach to all
> promises, their dependencies etc.
>
> I would dare to add, also, that extending promise types within the
> monolithic binary may not work long-term, making CFEngine's code hard to
> maintain and adapt to all usage scenarios (think the needs of embedded
> devices vs. data-center clusters). Given the above formal language
> specification, parts of CFEngine could be split out to /modules/, as in
> dynamically-linked libraries.

I completely agree with both points here. There will always be new needs
requiring additional *promise types* as computer applications in the
world continue to grow exponentially.

For example, Promise Theory is more than adequate to model cloud
orchestration, yet CFEngine offers no orchestration features. You might
have a "machines:" or "vms:" promise type, with attributes such as 'os =>
"redhat 6"', 'subnet => "10.10.10.0/24"'—all sorts of things. These would
require updates to the language specification but these should be *semantic*
changes rather than *syntactic*
<https://en.wikipedia.org/wiki/Syntax_(programming_languages)#Syntax_versus_semantics>.

*(Note: I actually wrote the text from here down before I wrote any other
part of this post.)*

Drawing on the pure mathematical model of Promise Theory, it should be
possible to define a standard by which *any* promises (and promisees and
promisers) can be expressed, and if that notation is compatible wherever
possible with the CFEngine 3 policy language, the connection can be quite
straightforward. What I'm talking about here would probably be classified
as a formal grammar <https://en.wikipedia.org/wiki/Formal_grammar>. C has
an official formal grammar, though not available for free
<http://www.iso.org/iso/home/store/catalogue_ics/catalogue_detail_ics.htm?csnumber=57853>.
(It's also summarized in K&R
<https://www.safaribooksonline.com/library/view/the-c-programming/9780133086249/app01.html>.)
POSIX shell has a formal grammar
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10>.

Now whether or not CFEngine gets a formal grammar (which it should,
eventually), it should certainly have a language *specification*. They're
not the same thing. In actual fact, because of the (intended) regularity
of CFEngine, I *suspect* that a formal grammar for the CFEngine 4 policy
language would actually be simpler than the full language specification!

The language specification would include the defined *behavior* of all the
promise types and attributes, whereas the grammar would define simply how
elements are structured. If the regularity between bundles and bodies
could be captured in the formal grammar, any manner of promise could be
*written* in the language, and people could learn the language without
necessarily learning the particular bundle types, promise types and
attributes available in a particular *parser* for the "Promises Language."

I love your suggestions regarding splitting out parts of CFEngine into
modules.

Best,
--Mike Weilgart
Vertical Sysadmin, Inc.
