Chapter 4: Using CFEngine

We will now explore how to perform some common configuration tasks using CFEngine. Along the way we will encounter more advanced concepts and structures of the CFEngine language.

Initial System Configuration

After a system is installed, a number of routine tasks need to be performed before declaring it ready for use. These include installation of base software packages, network configuration, file system configuration, user creation, authentication configuration, and configuration of system components. CFEngine can do all of these tasks consistently and predictably.

Throughout this section, we will incrementally build a CFEngine policy that edits a number of configuration files, starting from a single entry point. In the process I will show you some common techniques for passing and processing parameters, and several new CFEngine constructs and concepts.

Running These Policies

While you are writing and testing these policies, it’s easiest to save them in a file that you can run by itself from the command line. As we saw in Your First CFEngine Policy, such a file needs a body common control that loads the appropriate libraries and specifies which bundles should be executed. For all the system configuration examples, add the following at the beginning of the file ("configfiles" is the name of the bundle to execute; adapt it to the different policies and examples):

body common control
{
    inputs => { "/var/cfengine/inputs/libraries/cfengine_stdlib.cf" };
    bundlesequence => { "configfiles" };
}

Once you are ready to integrate these bundles into your main CFEngine policy, you need to follow the steps described in Integrating Your New Policy Into Periodic CFEngine Execution.

Editing /etc/sysctl.conf

One of the files that commonly requires configuration in a new Linux system is /etc/sysctl.conf. This file contains configuration values for some kernel parameters that control different aspects of system behavior. For example, it may contain the following lines:

net.ipv4.tcp_syncookies = 1
net.ipv4.conf.all.accept_source_route = 0
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.log_martians = 1

These particular parameters control behavior of the networking stack in the kernel (net.ipv4).

We can use CFEngine to ensure these parameters are present in the /etc/sysctl.conf file. We will walk through an example that demonstrates this ability, but also shows the different levels at which a CFEngine policy can operate:

  1. At the highest level, the policy simply says “configure the /etc/sysctl.conf file using these parameters.” This part of the policy is a building block that can be added to or removed from an installation at a management level without worrying about how it’s implemented.

  2. The next level down says “set these values in the /etc/sysctl.conf file and in the running system.” This can be changed as a sysadmin decides what options need to be enabled, without thinking about the syntax of the file.

  3. The next level explains the structure of the file and how the parameters should be set. It essentially abstracts the implementation details, which are independent of which options you choose.

  4. The lowest level explains how to perform the field edits in the file, how classes should be handled, and other implementation details.

This is the code:

bundle agent configfiles # (1)
{
  vars:
      # Files to edit
      "files[sysctl]" string => "/etc/sysctl.conf"; # (2)

      # Sysctl variables to set
      "sysctl[net.ipv4.tcp_syncookies]"               string => "1"; # (3)
      "sysctl[net.ipv4.conf.all.accept_source_route]" string => "0";
      "sysctl[net.ipv4.conf.all.accept_redirects]"    string => "0";
      "sysctl[net.ipv4.conf.all.rp_filter]"           string => "1";
      "sysctl[net.ipv4.conf.all.log_martians]"        string => "1";

  methods: # (4)
      "sysctl"  usebundle => edit_sysctl,
        comment => "Configure $(files[sysctl])";
}

bundle agent edit_sysctl
{
  files: # (5)
      "$(configfiles.files[sysctl])"
        handle => "edit_sysctl",
        comment => "Make sure sysctl.conf contains desired configuration",
        create => "true",
        edit_line => set_variable_values("configfiles.sysctl"), # (6)
        classes => if_repaired("sysctl_modified"); # (7)

  commands: # (8)
    sysctl_modified.!no_restarts::
      "/sbin/sysctl -p"
        handle => "reload_sysctl",
        comment => "Make sure new sysctl settings are loaded";
}

This short CFEngine policy ensures that the appropriate lines are present in the /etc/sysctl.conf file to set the parameters we want. If a parameter is already there but with a different value, the policy will fix it. If a parameter does not appear in the file, the policy will add it. Let’s dissect the example part by part.

  1. The configfiles() agent bundle is a “driver” bundle that calls others to actually perform the work (see the methods: section later). This allows us (as we will do in later sections) to add more tasks that are called from the same driver bundle.

  2. In configfiles(), we first define some variables in a vars: section. In it, we define an array called files. Remember, as we saw in Arrays, that arrays in CFEngine are indexed by arbitrary strings. In this case we are specifying an element indexed by the string sysctl, and containing the path of the file to be edited. We will refer back to this array later in the policy, and we will add more elements to it throughout the chapter, to hold the filenames of the different files to edit. To refer back to this array we will use the name configfiles.files, to indicate the files array inside the configfiles() bundle.

  3. We also define an array called sysctl (I used the same name as the element defined above in files just because they both refer to the same file, but I could use any name), indexed by parameter name, and containing the value to set for each parameter. This is a common technique that we will use for passing key/value pairs in CFEngine, as it allows us to succinctly define values for configuration files, users, and many other parameters. Note that we define each element of the array on its own line, each element indexed by the name of a parameter to set in /etc/sysctl.conf, and containing as its value the value to set for that parameter. We define the elements as strings to make them generic and able to contain any type of value. To refer to this array from other bundles, we will use its full name configfiles.sysctl, to identify the bundle where it was defined.

  4. After setting up the variables in configfiles(), we include a methods: section, which allows us to specify multiple bundles to be called in sequence. In this example, we have only the call to edit_sysctl(), which does the work of editing the file. (We will describe edit_sysctl() in a moment.) Each method call has an arbitrary identifier. In this case we use the identifier "sysctl" to identify it as part of the sequence that performs the edits on /etc/sysctl.conf. Later in the chapter we will add calls to other bundles that perform different configuration tasks. We also specify a comment attribute to express the higher-level intention of this promise.

    Using methods: promises to abstract lower-level bundles is a good way of communicating higher-level intentions in CFEngine policies, without the distraction of the actual implementation details.

  5. The edit_sysctl() bundle is called from the methods: section, and contains promises that specify the desired state of the system. The bundle starts with a files: section, in which the promiser is the file to be edited. We use, as the filename, the sysctl element of the configfiles.files array. This is the files array defined in the configfiles() bundle, which gives us the value "/etc/sysctl.conf". We provide handle and comment attributes, which contribute nothing to the configuration activity, but are recommended in all promises because they help you tremendously when observing log output. The create attribute specifies that the file should be created if it does not exist (this may be the case immediately after installation if no custom parameters are set).

  6. Next comes the part that actually does the work, and it is surprisingly simple. The edit_line attribute calls the set_variable_values() bundle with the name of the array that contains the values we want to set. We do not pass the array itself but its name, and this name will be dereferenced inside set_variable_values() to find the actual array.

    You probably realize that the set_variable_values() bundle is very important, since it’s the one that actually performs the work of editing the file. This is not a built-in command, but rather is contained in the CFEngine Standard Library (described in CFEngine Standard Library). We will come back to it in a moment.

  7. The classes attribute tells CFEngine that if the promise is repaired, the sysctl_modified class should be set (the notion of a “class” was explained in Classes and Decision Making). The if_repaired body part is defined in the Standard Library as well.

    Wait a second. What do I mean by repaired? To CFEngine, a promise is repaired when any action needs to be taken as a result of evaluating the promise, and that action results in reaching the desired state of the promise. For example, if the file already had the desired configuration values, CFEngine would see no need to edit it, and it would not be marked as repaired (in this case CFEngine would consider the promise as “kept”). On the other hand, if any of the parameters were not present and the promise adds them, then the promise will be flagged as “repaired.” All possible end states of a promise are described in the documentation for the classes attribute. We are free to execute this bundle as many times as we want, and CFEngine will make changes only when they are needed. Such is the nature of the convergent configuration that CFEngine allows us to perform.

  8. If the file did not contain all the configuration parameters and CFEngine adds any of them (thus “repairing” the state of the file), the sysctl_modified class will be set, thanks to the if_repaired body part we just saw in our configuration. This is useful because when the file gets modified, we have to issue the /sbin/sysctl -p command to instruct the system to reload the values and make them effective immediately. Thus, in the commands: section, you can see that we are issuing this command. The command is preceded by a class expression:

        sysctl_modified.!no_restarts::

    This is a boolean expression in which the dot means AND (you can also use an ampersand if it feels more natural), and the exclamation mark means NOT. (A vertical bar or pipe character, which is not used in this example, would mean OR.) In this particular case, the /sbin/sysctl -p command will be executed only if the sysctl_modified class is set (that is, if /etc/sysctl.conf was modified) AND the no_restarts class is not set. This construct allows us to change the configuration files without executing any restart or reconfiguration commands, by defining the no_restarts class (which we could do, for example, by giving the -Dno_restarts command-line option to cf-agent when executing the policy).
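The class expression can be tried in isolation before wiring it to a real command. The following is a minimal sketch and not part of the policy above: the bundle name class_expression_demo is hypothetical, and the sysctl_modified class is defined unconditionally only to simulate a repaired file-editing promise.

```cf3
body common control
{
    bundlesequence => { "class_expression_demo" };
}

bundle agent class_expression_demo
{
  classes:
      # Assumption for the demo: pretend the file-editing promise was repaired
      "sysctl_modified" expression => "any";

  reports:
    # AND (.) combined with NOT (!): the same expression used in edit_sysctl()
    sysctl_modified.!no_restarts::
      "sysctl.conf changed: the reload command would run";

    # Both classes set: the reload is suppressed
    sysctl_modified.no_restarts::
      "sysctl.conf changed, but no_restarts is set: the reload would be skipped";
}
```

Saved to a file and run with cf-agent using the -f option, this should report the first message. Adding -Dno_restarts to the command line should produce the second message instead.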

This concludes the high-level description of the policy, which, as you can see, describes in a fairly human-readable fashion what we want to achieve. In summary, our configuration file defines two bundles at the top level: configfiles() and edit_sysctl(). The configfiles() bundle provides the entry point for the policy, defines the files we want to edit and the contents we want them to have, and invokes the edit_sysctl() bundle (later in the chapter we will add calls to other bundles). That bundle in turn carries out the edits we want to perform on the /etc/sysctl.conf file. Now we will delve deeper into the implementation details.

First, let’s come back to the set_variable_values() bundle, since it seems so important. If you open lib/3.5/files.cf (or cfengine_stdlib.cf if you have CFEngine 3.5.0 or older), you will find its definition:

bundle edit_line set_variable_values(v) # (1)
{
  vars:
      "index" slist => getindices("$(v)"); # (2)
      "cindex[$(index)]" string => canonify("$(index)"); # (3)
  field_edits: # (4)
      "\s*$(index)\s*=.*" # (5)
        edit_field => col("=","2","$($(v)[$(index)])","set"), # (6)
        classes => if_ok("$(cindex[$(index)])_in_file"),   # (7)
        comment => "Match a line starting like key = something";
  insert_lines: # (8)
      "$(index)=$($(v)[$(index)])",
        comment => "Insert a variable definition",
        ifvarclass => "!$(cindex[$(index)])_in_file";
}

Remember, from Bundles, Bodies, and Namespaces, that bundles in CFEngine are equivalent to subroutines in other programming languages: they are self-contained units that can contain promises of most types, thus allowing us to encapsulate functionality. We will now dissect this set_variable_values() bundle to see how it performs its magic.

  1. This bundle receives as its argument, v, the name of an array. In our current example, the bundle will be invoked with v taking the value configfiles.sysctl, in which the indices are parameter names. The values in the array are the values to which the parameters will be set. So v provides the instructions to edit a file of the form name = value, modifying the values of parameters that already exist, and adding those that do not exist yet.

  2. First, we get a list of all the parameters, and store it in the list variable named index. This is done using the built-in CFEngine function getindices() on the passed array, which returns a list of the indices in the array. Note that getindices() also receives the name of the array on which it will operate, so we can simply pass the $(v) parameter to it. In CFEngine, you can use either braces or parentheses around variable names; they are equivalent, so ${v} would mean the same.

  3. Next, we generate an array of canonified parameter names. In CFEngine, a canonified string is a string that can be readily used as a class name. Because some characters are not valid in CFEngine class names, the CFEngine function canonify() allows us to take an arbitrary string and replace the invalid characters in it with underscores. We store these canonified values in an array named cindex, indexed by the real parameter name so we can relate them to their canonified version. We use CFEngine’s implicit looping to populate the entire array.

  4. The next step in performing the edits on the file is to update the values of parameters that already exist in the file. For this we use a field_edits: section, which also uses implicit looping to apply the editing promise for every parameter.

  5. The field_edits: promise starts with a regular expression that selects the lines in the file that need to have the edit applied. In this case, we want to edit any line that starts with the current parameter name ($(index)), surrounded by optional whitespace (\s*), followed by an equals sign (=), and followed by an arbitrary string (we don’t care about the existing value, since we will replace it with the new one). It is important to note that in field_edits: promises, CFEngine automatically anchors the given regular expressions to the beginning and end of the line, so the regex we give needs to match the whole line.

    Notice again that thanks to CFEngine’s implicit looping, this whole promise will be executed once for every single parameter stored in our configfiles.sysctl array. In our example, the array contains 5 elements, so the field_edits: promise will be evaluated 5 times, with $(index) iterating through the following values:

    • net.ipv4.tcp_syncookies

    • net.ipv4.conf.all.accept_source_route

    • net.ipv4.conf.all.accept_redirects

    • net.ipv4.conf.all.rp_filter

    • net.ipv4.conf.all.log_martians

    With a single promise and without any explicit flow-control instructions, CFEngine allows us to apply all of our edits to the whole file.

    In this example we have a single promise inside the field_edits: section, but we could have several promises if we wanted to apply different types of field-based edits to the file.

  6. If any lines in the file match the regular expression (that is, if they contain a definition of the given parameter), we will apply to them the changes defined by the edit_field attribute of the promise. For this we use yet another definition from the standard library called col(), which allows generic field-based file editing. In this case, the arguments to col() tell it to use = as the field separator, and to set the second field of the line to the value given by the expression "$($(v)[$(index)])".

    There is a bit of variable interpolation magic going on here. Variable values in strings are expanded from the inside-out by CFEngine. First, the value of $(v) is expanded, so in our example the string will now read $(configfiles.sysctl[$(index)]). Next, the value of $(index) will be automatically iterated over each parameter value. As an example, for the net.ipv4.tcp_syncookies parameter, it will expand to $(configfiles.sysctl[net.ipv4.tcp_syncookies]). This now looks like a regular variable reference in CFEngine, which will give us the value we want to set for the given parameter, the string "1" in this case.

  7. If the promise is OK (in CFEngine, this means the promise was either already satisfied, or not satisfied but repaired), the classes attribute sets the "$(cindex[$(index)])_in_file" class. For example, if the parameter net.ipv4.tcp_syncookies already existed in the file, the net_ipv4_tcp_syncookies_in_file class will be set. This is the canonified version of the parameter name, concatenated with the string _in_file.

    Remember from Classes and Decision Making that classes in CFEngine are identifiers that are either set or unset, and that allow us to perform Boolean decisions. In this case, we are setting classes that contain the names of all the parameters that already existed in the file, whether their value was correct already or not. The existence of these classes indicates that there is no further work to be done for those particular parameters.

  8. If a parameter was not found in the file, it needs to be added, and this task is performed by the insert_lines: section of the bundle. The promiser in this case is the line we want to insert, in the form parameter=value, which promises to be inserted into the file only if the class expression given by the ifvarclass attribute is true. In this case, the value of ifvarclass is the negation (!) of the class defined by the field_edits: promise when the parameter was already present in the file. If the class is not defined (which means the parameter was not found in the file), the ifvarclass expression evaluates to true, and the missing line will be inserted.

    As an example, let’s imagine that the net.ipv4.conf.all.log_martians parameter is not present in the file. Then the field_edits: promise will fail (because there is no line that matches the regular expression that searches for a line starting with the parameter name), and so the net_ipv4_conf_all_log_martians_in_file class will not be set. When the insert_lines: promise is executed, the value of the class expression !$(cindex[$(index)])_in_file (which expands to the string !net_ipv4_conf_all_log_martians_in_file) will be true, indicating that the line needs to be inserted.

    You will notice this pattern of behavior often in CFEngine policies: doing some checks and fixes, setting certain classes based on the result, and then triggering other actions based on the existence of those classes. It seems convoluted at first, but it allows a lot of flexibility, and particularly allows policies to be convergent, not making changes unless they are necessary.

    I must note that insert_lines: promises in CFEngine are quite smart. In particular, they will not insert a line that already exists in the file (this is a consequence of the promise theory underpinnings—if the line is already in the file, the promise has already converged to its desired state, and there is no need to insert it again), so in principle we should not need to set a class and then condition the insertion of the line on it. In this particular case, using the class allows us to account for things like spacing differences (e.g., spaces around the equals sign) that would not be considered by an unconditional insert_lines: promise.
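To watch set_variable_values() converge without touching the real /etc/sysctl.conf, the same call can be pointed at a scratch file. This is a minimal sketch under stated assumptions: the bundle name scratch_sysctl, the array name params, the class scratch_modified and the path /tmp/sysctl.conf.test are all hypothetical names chosen for illustration, and the standard library is assumed to live at the path used earlier in this chapter.

```cf3
body common control
{
    inputs => { "/var/cfengine/inputs/libraries/cfengine_stdlib.cf" };
    bundlesequence => { "scratch_sysctl" };
}

bundle agent scratch_sysctl
{
  vars:
      # Hypothetical scratch file, so the real /etc/sysctl.conf is untouched
      "target" string => "/tmp/sysctl.conf.test";

      # Key/value pairs to enforce, indexed by parameter name
      "params[net.ipv4.tcp_syncookies]"     string => "1";
      "params[net.ipv4.conf.all.rp_filter]" string => "1";

  files:
      "$(target)"
        handle => "scratch_sysctl_edit",
        comment => "Exercise set_variable_values() on a scratch file",
        create => "true",
        # Pass the NAME of the array, qualified with the bundle name
        edit_line => set_variable_values("scratch_sysctl.params"),
        classes => if_repaired("scratch_modified");

  reports:
    scratch_modified::
      "$(target) was modified";
}
```

On a first run the file does not exist, so both lines should be inserted and the report should appear. A second run should find nothing to change and stay silent, which is the convergent behavior described above. Editing one of the values in the scratch file by hand and running the policy again should trigger the field_edits: path rather than the insert_lines: path.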

Implicit Looping in CFEngine

Although we have already described implicit looping in Looping in CFEngine, let us look in detail at what is going on in this variable assignment, to refresh your memory. If a list variable is referenced as a scalar (with the $ prefix instead of @), CFEngine automatically loops over all the values in the list, substituting each element in turn. Thus, by accessing the index list as a scalar ($(index) instead of @(index)), we are telling CFEngine to execute the corresponding statement once for every element of the list. In effect, the following line:

      "cindex[$(index)]" string => canonify("$(index)");

will be repeated for every element of @(index), with $(index) taking each value in sequence. This results in the creation, element by element, of an array indexed by parameter names, and containing as values the canonified version of each name.
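The looping and the nested variable expansion can be observed directly with a reports: promise. The following is a minimal sketch, assuming a hypothetical bundle named loop_demo with its own small sysctl array; it reuses the same three vars: lines as set_variable_values(), with v holding the name of the array rather than the array itself.

```cf3
body common control
{
    bundlesequence => { "loop_demo" };
}

bundle agent loop_demo
{
  vars:
      "sysctl[net.ipv4.tcp_syncookies]"     string => "1";
      "sysctl[net.ipv4.conf.all.rp_filter]" string => "1";

      # v plays the role of the bundle parameter: it holds the array's name
      "v"     string => "loop_demo.sysctl";
      "index" slist  => getindices("$(v)");

      # Implicit loop: one cindex element is created per element of @(index)
      "cindex[$(index)]" string => canonify("$(index)");

  reports:
      # Implicit loop again: one report per parameter name
      "$(index): canonified as $(cindex[$(index)]), value $($(v)[$(index)])";
}
```

Each report should show a parameter name, its canonified form (for example net_ipv4_tcp_syncookies), and the value obtained through the inside-out expansion of $($(v)[$(index)]).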

To finish this discussion, we will go down one more level in the implementation chain, to discuss the three low-level body parts if_repaired(), if_ok(), and col(). None of these are native CFEngine functions; rather, they are defined in the standard library as follows:

body classes if_repaired(x)
{
        promise_repaired => { "$(x)" };
}

body classes if_ok(x)
{
        promise_repaired => { "$(x)" };
        promise_kept => { "$(x)" };
}

body edit_field col(split,col,newval,method)
{
        field_separator    => "$(split)";
        select_field       => "$(col)";
        value_separator    => ",";
        field_value        => "$(newval)";
        field_operation    => "$(method)";
        extend_fields      => "true";
        allow_blank_fields => "true";
}

Note

As a general rule, you should not worry too much about the implementation details of bundles in the standard library, just as you don’t worry about the implementation details in the C standard library or in Perl CPAN modules. We are delving into the details here as an opportunity for you to learn more about the CFEngine policy language, and all the different levels at which it operates.

if_repaired() and if_ok() are both classes body parts, which means they can be used as the value of a classes attribute. This attribute is allowed in almost any CFEngine promise, and defines classes to be set depending on the result of the promise.

The two examples shown here should be fairly self-explanatory. In if_repaired(), we are specifying that the class whose name is given as argument $(x) will be defined only when the promise was repaired (that is, when some change had to be made in order to bring the promise to its desired state). In if_ok(), we are specifying that the class will be defined when the promise was either repaired or kept (that is, it was already true). We examined the col() body in detail in Bodies. In this case we are specifying the field separator, the field to be selected for editing, its value, and the operation to be performed.
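Writing your own classes body follows the same pattern. As an illustration only, here is a sketch of a hypothetical body named repaired_or_failed() (it is not part of the standard library) that sets one class when the promise is repaired and a different one when the repair does not succeed. The bundle name edit_sysctl_checked and the class name sysctl_edit_failed are likewise assumptions made for this example; the bundle relies on the configfiles() bundle and the standard library shown earlier.

```cf3
# Hypothetical body, not part of the standard library
body classes repaired_or_failed(ok, bad)
{
        promise_repaired => { "$(ok)" };
        repair_failed    => { "$(bad)" };
        repair_denied    => { "$(bad)" };
        repair_timeout   => { "$(bad)" };
}

# Hypothetical variant of edit_sysctl() that uses the body above
bundle agent edit_sysctl_checked
{
  files:
      "$(configfiles.files[sysctl])"
        handle => "edit_sysctl_checked",
        comment => "Edit sysctl.conf and record success or failure",
        create => "true",
        edit_line => set_variable_values("configfiles.sysctl"),
        classes => repaired_or_failed("sysctl_modified", "sysctl_edit_failed");

  reports:
    sysctl_edit_failed::
      "Could not bring $(configfiles.files[sysctl]) to its desired state";
}
```

Keeping the success and failure classes separate lets later promises react to each outcome independently, for example by reloading settings only on success.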

This finishes our explanation for now. I would like to remind you of the different levels of abstraction present even in this simple example:

  1. At the highest level (configfiles() bundle), the policy simply says “configure the /etc/sysctl.conf file with these parameters.”

  2. The next level (edit_sysctl() bundle) says “set these values in the /etc/sysctl.conf file and in the running system.”

  3. The next level (set_variable_values() bundle) explains the structure of the file and how the parameters should be set.

  4. The lowest level (col(), if_ok(), if_repaired()) explains how to perform the field edits in the file, how classes should be handled, and other implementation details.

The beauty of CFEngine is that you need to work only at the level of abstraction that is needed at the moment. In fact, different sets of people could operate at each level. A policy maker could set the requirements at the highest level (even higher than the levels shown here, in fact), and both system administrators and CFEngine administrators could operate at the lower levels as required.

Editing /etc/ssh/sshd_config

Another common task upon initial installation of a system is to configure certain services, SSH (Secure Shell) being a particularly useful one, and OpenSSH being one of the most popular SSH implementations. By default, the OpenSSH daemon ships with a fairly usable configuration, but you may still want to change it to be more secure, or to adhere to local policies.

Having seen how to edit /etc/sysctl.conf in the previous section, you should already start to see how to perform this configuration. For the sake of our example, let’s say we want to modify the following parameters in /etc/ssh/sshd_config from their default configuration in an OpenSSH installation:

#Protocol 1,2
#X11Forwarding no
#UseDNS yes

In OpenSSH, most configuration parameters appear commented out by default, showing their default values. We would like to modify these parameters to the following:

Protocol 2
X11Forwarding yes
UseDNS no

That is, we want to uncomment the corresponding lines, and modify their values to the ones we want. If the line for the parameter we want does not exist already, we want to add it to the configuration file.

With this in mind, we can rewrite our earlier top-level configfiles() bundle to the following:

bundle agent configfiles
{
  vars:
      # Files to edit
      "files[sysctl]" string => "/etc/sysctl.conf";
      "files[sshd]"   string => "/etc/ssh/sshd_config";

      # Sysctl variables to set
      "sysctl[net.ipv4.tcp_syncookies]"               string => "1";
      "sysctl[net.ipv4.conf.all.accept_source_route]" string => "0";
      "sysctl[net.ipv4.conf.all.accept_redirects]"    string => "0";
      "sysctl[net.ipv4.conf.all.rp_filter]"           string => "1";
      "sysctl[net.ipv4.conf.all.log_martians]"        string => "1";

      # SSHD configuration to set
      "sshd[Protocol]"                                string => "2";
      "sshd[X11Forwarding]"                           string => "yes";
      "sshd[UseDNS]"                                  string => "no";

  methods:
      "sysctl"  usebundle => edit_sysctl;
      "sshd"    usebundle => edit_sshd;
}

You can see that we added a second element to the files array, files[sshd], which contains the path of the /etc/ssh/sshd_config file. We have also added a new array called sshd, containing the parameters we want to set in the configuration file. Finally, in the methods: section, we added a call to an edit_sshd() bundle, which performs the necessary edits. Note again the very clear separation that CFEngine allows in specifying what to do (the values of the parameters that we want to set) from how to do it (the methods: calls, and their respective implementations). Here is the new edit_sshd() bundle:

bundle agent edit_sshd
{
  files:
      "$(configfiles.files[sshd])"   # (1)
        handle => "edit_sshd",
        comment => "Set desired sshd_config parameters",
        edit_line => set_config_values("configfiles.sshd"),
        classes => if_repaired("restart_sshd");

  commands:
    restart_sshd.!no_restarts::   # (2)
      "/etc/init.d/sshd reload"
        handle => "sshd_restart",
        comment => "Restart sshd if the configuration file was modified";

  services:   # (3)
      "ssh"
        service_policy => "start";
}

The edit_sshd() bundle is in general very similar to edit_sysctl(), but there are a few differences worth exploring.

  1. Instead of calling the set_variable_values() bundle for editing the file (which is used to set lines of the form variable=value) we use the set_config_values() bundle, which is used to set lines of the form variable value, with the additional feature of automatically uncommenting lines if they exist already in commented-out form.

  2. The edit_sshd() bundle also has a commands: section, which is used to restart the sshd daemon if the configuration file was changed. As before, we set the restart_sshd class if the file-editing promise was repaired (that is, if any changes were made to the file), and depending on this class, we issue the necessary command.

  3. Independently of restarting the sshd daemon in case of file changes, we want to make sure that the daemon is running. For this, we introduce a new type of promise called services:, which offers us an abstraction for system services and allows us to declare that the ssh service needs to be running. Because ssh is a standard service, we can simply specify its desired state using the service_policy attribute.

    By default, services: promises call the standard_services() bundle defined in the standard library to actually handle the service operations. This bundle defines the information necessary for checking, starting and stopping a large number of common services, including SSH. If you need to handle some service that is not included in this bundle, you can either modify it to add support for it (don’t forget to submit a pull request so that the community can benefit from your additions!) or write your own service-handling bundles. For more details about services: promises, please see [service-management].

    As it evolves, the standard_services() bundle may be able to dynamically manage unknown services without your having to explicitly declare them, but this is not fully implemented at the moment.

Let us now look at the set_config_values() bundle, also defined in the standard library.

bundle edit_line set_config_values(v)
{
  vars:
      "index" slist => getindices("$(v)"); # (1)
      "cindex[$(index)]" string => canonify("$(index)");
  replace_patterns: # (2)
      "^\s*($(index)\s+(?!$($(v)[$(index)])).*|# ?$(index)\s+.*)$"
        replace_with => value("$(index) $($(v)[$(index)])"), # (3)
        classes => always("replace_attempted_$(cindex[$(index)])"); # (4)
  insert_lines:
      "$(index) $($(v)[$(index)])" # (5)
        ifvarclass => "replace_attempted_$(cindex[$(index)])";
}

This bundle uses a completely different logic from set_variable_values(), even though it performs similar functions. This allows me to introduce you to a couple of new concepts and tricks.

  1. The first part of the bundle is already familiar: it gets a list of indices from the array passed to the bundle, stores it in index, and uses it to populate cindex with the canonified versions of those parameter names, to be used in class names later.

  2. The actual line editing is done now by a replace_patterns: section instead of field_edits:, which allows for more flexible transformations. Promises of this type allow us to search for and replace regular expressions in the file.

    The promiser in a replace_patterns: promise is the regular expression we want to match. In this case we are asking it to look for two types of lines, corresponding to the two regular expressions separated by a pipe character (|):

    1. Lines that start (^) with optional whitespace (\s*), followed by the current parameter name ($(index)) followed by whitespace (\s+) and any string that is not the correct value of the current parameter ((?!$($(v)[$(index)])).*). This represents lines that already set the parameter we are looking for, but with an incorrect value.

    2. Lines that start (^) with optional whitespace (\s*), followed by a comment character and an optional space (# ?), followed by the current parameter name ($(index)) followed by whitespace (\s+) and any arbitrary string. This represents lines that contain the parameter, but commented out.

    Again, we are using implicit looping to iterate over all the parameters to be set, by using $(index) instead of @(index) in the promise.

    The last part of the first regular expression is complicated because we need to find lines that do not contain the correct value already, and replace them. For this, we use a negative-lookahead expression ((?!…​)) that indicates that the text after the whitespace must not match the desired value ((?!$($(v)[$(index)]))). The final part (.*) is necessary to match the actual characters that follow the whitespace, because the whole negative-lookahead expression is zero-length, and does not “consume” any characters during the regex evaluation.

  3. The replace_with attribute tells us what to use as the replacement. In this case, the replacement will be the current parameter and its desired value, separated by a space:

    replace_with => value("$(index) $($(v)[$(index)])"),

    value() is another body that specifies the value and characteristics of the replacement text. It is defined in the standard library:

    body replace_with value(x)
    {
            replace_value => "$(x)";
            occurrences => "all";
    }
  4. For reasons I will explain in a moment, we want to remember that the replace_patterns: promise has run, whether or not it actually found its pattern. So it ends by setting the replace_attempted_parameter class (where parameter is the canonified name of the current parameter) using the classes attribute with the always() body. The definition of always() is also found in the standard library:

    body classes always(x)
    {
            promise_repaired => { "$(x)" };
            promise_kept => { "$(x)" };
            repair_failed => { "$(x)" };
            repair_denied => { "$(x)" };
            repair_timeout => { "$(x)" };
    }

    The effect of using always() is that the class given as a parameter is set for any of the conditions listed in it (promise_repaired, promise_kept, repair_failed, repair_denied or repair_timeout). These are all the possible outcomes of a promise in CFEngine, so the net effect is to set the class regardless of what has happened.

  5. Up to this point we have dealt with parameters whose lines are already in the file (maybe commented out), but we also need to insert parameters that do not yet appear in the file. How to do this is a little tricky and counter-intuitive, but it gives us an opportunity to learn more about how CFEngine works.

    As we saw in Normal Ordering, promise sections in a CFEngine policy are executed in a hard-coded sequence known as normal ordering. According to normal ordering, the insert_lines: section is executed before the replace_patterns: section. This poses a problem in our current example because we want to try to fix already-existing parameters (possibly commented out or with incorrect values) before adding any new lines. If we let the insert_lines: promise execute first, we may end up with duplicated definitions of parameters in the configuration file.

    To alter the order of execution, we condition the execution of the insert_lines: promise on the existence of the replace_attempted_parameter class defined when the replace_patterns: promise is evaluated. Because CFEngine does up to three passes over the promises, this makes the insert_lines: promise execute only on the second pass, after the replace_patterns: section has had a chance to uncomment and correct any existing lines. If at this point the line with the correct value still does not exist, then inserting it is the correct behavior.

Normal Ordering in edit_line Bundles

Within edit_line bundles, the sections are executed, up to three times, in the following order: vars, classes, delete_lines, field_edits, insert_lines, replace_patterns and reports.

I know this can be confusing, so here is an example to clarify it. Suppose that our /etc/ssh/sshd_config file contains the following line:

#Protocol 1,2

The behavior of the set_config_values() bundle will be the following (assuming $(index) currently has the value "Protocol"):

  1. (First pass) The insert_lines: promise for "Protocol 2" is not executed because the replace_attempted_protocol class is not defined. Note that the class name contains the canonified version of the parameter name, which includes making it all lowercase.

  2. (First pass) The replace_patterns: promise replaces the original line with its uncommented, correct value, and defines the replace_attempted_protocol class:

    Protocol 2
  3. (Second pass) The insert_lines promise now executes, but because the correct line is already present in the file, it is not inserted again.

Now consider the case when the commented-out "Protocol" line is not present at all in the file. Then the flow would be the following:

  1. (First pass) The insert_lines promise for "Protocol 2" is not executed because the replace_attempted_protocol class is not defined.

  2. (First pass) The replace_patterns promise is executed but does not succeed because the line does not exist. It defines the replace_attempted_protocol class anyway, due to the use of the always() body.

  3. (Second pass) The insert_lines promise now executes, and because the line "Protocol 2" does not exist in the file, it is inserted.

In both cases, the end result is the same: to set the Protocol parameter to its correct value. It is important to note that our previously-examined set_variable_values() bundle could trivially be rewritten using the same technique used by set_config_values(), which would add the functionality of allowing it to handle commented-out lines properly.
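
To watch the two-pass behavior yourself, one option is to point set_config_values() at a scratch file rather than at the real /etc/ssh/sshd_config. The following is a minimal sketch, not part of the policy we are building: the bundle name test_sshd and the path /tmp/sshd_config.test are hypothetical, and the inputs path assumes the standard library location used at the beginning of this chapter.

```cfengine
body common control
{
    inputs => { "/var/cfengine/inputs/libraries/cfengine_stdlib.cf" };
    bundlesequence => { "test_sshd" };
}

bundle agent test_sshd
{
 vars:
   # Hypothetical scratch file, so the real sshd_config is left alone
   "testfile"       string => "/tmp/sshd_config.test";

   # Parameters to enforce, in the same array format used by configfiles()
   "sshd[Protocol]" string => "2";
   "sshd[UseDNS]"   string => "no";

 files:
   "$(testfile)"
     handle => "test_sshd_set_values",
     comment => "Exercise set_config_values() on a scratch file",
     create => "true",
     edit_line => set_config_values("test_sshd.sshd");
}
```

If you seed the scratch file with the line #Protocol 1,2 and then run cf-agent -KI -f ./test_sshd.cf (assuming you saved the policy under that name), the Protocol line should be uncommented and corrected in place, while UseDNS no, which has no existing line, should be inserted.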

Warning

Note that the example we saw in this section assumes each parameter can only appear once in the file. This assumption does not hold true if the file contains “Match” blocks, which allow specifying conditional configuration values. In the interest of clarity I have considered only the simplest example in the book. For the full functionality, please see the networking/ssh sketch in the CFEngine Design Center.

Editing /etc/inittab

Another common initial task when setting up a Unix or Linux system is to customize /etc/inittab. For our example, we will do the following tasks:

  1. Modify the default runlevel from 5 to 3, to disable graphical login by default (this is commonly done on Linux servers, to prevent wasting resources on an unused graphical console).

  2. Disable Ctrl-Alt-Del handling, to prevent this key combination from rebooting the system.

To achieve the first task, we need to modify the second field in the following line:

id:5:initdefault:

This is a fairly simple task, now that you understand the previous editing tasks we have done. Here is the promise that achieves it:

  files: (1)
      "/etc/inittab"
        handle => "inittab_set_initdefault",
        comment => "Ensure graphical mode is disabled (default runmode=3)",
        create    => "false",
        edit_defaults => backup_timestamp, (2)
        edit_line => set_colon_field("id","2","3"); (3)
  1. This is a files: promise that indicates the file to edit, and states that the file must not be created (create => "false") if it does not exist already, since /etc/inittab should always exist in a Unix system.

  2. The edit_defaults attribute specifies the behavior for the file-editing operation. The definition of backup_timestamp can be found in the standard library:

    body edit_defaults backup_timestamp
    {
            empty_file_before_editing => "false";
            edit_backup => "timestamp";
            max_file_size => "300000";
    }

    This states that the file should not be emptied before editing (you can set this to true when the promise will recreate the file in its entirety), that a copy of the old version should be kept, named with a timestamp at the end (this allows you to keep a history of the file, and is particularly advisable for critical system files, so that you can quickly revert any changes if problems arise), and that the file should not be more than 300,000 bytes in size (this is simply a sanity check to ensure that files do not grow beyond normal limits).

    You will notice that we had omitted the edit_defaults attribute in our previous file-editing promises. This is valid and provides sane default behavior. We use edit_defaults now in particular because it is a good idea to keep backup copies of the /etc/inittab file in case anything goes wrong.

    The backup files are by default stored in the same directory as the original file. You can modify this behavior to store them under a separate directory by using the repository attribute for a single files: promise, or the default_repository option in body agent control to use it throughout the entire policy.

  3. The actual editing of /etc/inittab is done by the standard library set_colon_field() bundle, which allows us to edit fields in a colon-separated file. Here is its definition:

    bundle edit_line set_colon_field(key,field,val)
    {
      field_edits:
        "$(key):.*"
          comment => "Edit colon-separated file, using first field as a key",
          edit_field => col(":","$(field)","$(val)","set");
    }

    This bundle uses the same lower-level col() body we employed in Editing /etc/sysctl.conf, only this time using the colon as a separator to set the appropriate field to the value we provide. As used in our promise, col() results in the second field of the line whose first field is "id" being set to "3".
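
The two ways of relocating backup copies mentioned in item 2 can be sketched as follows. This is only an illustration: the directory /var/cfengine/backups is a hypothetical choice that is assumed to exist already, and the bundle name edit_inittab_with_repository is made up for this example.

```cfengine
body common control
{
    inputs => { "/var/cfengine/inputs/libraries/cfengine_stdlib.cf" };
    bundlesequence => { "edit_inittab_with_repository" };
}

# Option 1: store backups elsewhere for this one promise only
bundle agent edit_inittab_with_repository
{
 files:
   "/etc/inittab"
     handle => "inittab_set_initdefault",
     comment => "Ensure graphical mode is disabled (default runmode=3)",
     create => "false",
     repository => "/var/cfengine/backups",   # hypothetical backup directory
     edit_defaults => backup_timestamp,
     edit_line => set_colon_field("id","2","3");
}

# Option 2: store backups elsewhere for every files: promise in the policy
body agent control
{
    default_repository => "/var/cfengine/backups";   # hypothetical, as above
}
```

In practice you would normally pick one of the two options; they are shown together here only to keep the sketch in a single file.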

To achieve the second task, we need to comment out the following line:

ca::ctrlaltdel:/sbin/shutdown -r -t 4 now

We can use the following promise to achieve this:

  files:
      "/etc/inittab"
        handle => "inittab_disable_ctrlaltdel",
        comment => "Ensure handling of ctrl-alt-del is disabled",
        create    => "false",
        edit_defaults => backup_timestamp,
        edit_line => comment_lines_matching("ca::ctrlaltdel:.*", "#");

Again, the actual work in this promise is performed by the edit_line attribute, which in this case calls the comment_lines_matching() bundle. This standard library bundle is used to insert a comment character (given as the second argument, in this case "#") at the beginning of any line that matches the first argument. Here is its definition:

bundle edit_line comment_lines_matching(regex,comment)
{
  replace_patterns:
      "^($(regex))$"
        replace_with => comment("$(comment)"),
        comment => "Search and replace string";
}

It consists, as you might have expected, of a simple replace_patterns: promise. The replacement string is defined by the comment body definition, which is also in the standard library:

body replace_with comment(c)
{
        replace_value => "$(c) $(match.1)";
        occurrences => "all";
}

In the replace_value attribute, $(c) is the comment string we passed as an argument, and $(match.1) refers to the content of the first set of parentheses in the regular expression used to select the line. If you look back at the comment_lines_matching() bundle, you’ll see that the regular expression is given as "^($(regex))$", which has grouping parentheses that capture the whole matched line. This results in the matching line being replaced by the comment character, followed by a space, and then the previous content of the line.

Putting it all together, and extending our previous configfiles() bundle to handle editing the /etc/inittab file, we get the following:

bundle agent configfiles
{
 vars:
   # Files to edit
   "files[sysctl]" string => "/etc/sysctl.conf";
   "files[sshd]" string => "/etc/ssh/sshd_config";
   "files[inittab]"    string => "/etc/inittab";

   # Sysctl variables to set
   "sysctl[net.ipv4.tcp_syncookies]"               string => "1";
   "sysctl[net.ipv4.conf.all.accept_source_route]" string => "0";
   "sysctl[net.ipv4.conf.all.accept_redirects]"    string => "0";
   "sysctl[net.ipv4.conf.all.rp_filter]"           string => "1";
   "sysctl[net.ipv4.conf.all.log_martians]"        string => "1";

   # SSHD configuration to set
   "sshd[Protocol]"                                string => "2";
   "sshd[X11Forwarding]"                           string => "yes";
   "sshd[UseDNS]"                                  string => "no";

 methods:
   "sysctl"  usebundle => edit_sysctl;
   "sshd"    usebundle => edit_sshd;
   "inittab" usebundle => edit_inittab;
}

bundle agent edit_inittab
{
 files:
   "$(configfiles.files[inittab])"
     handle => "inittab_set_initdefault",
     comment => "Ensure graphical mode is disabled (runmode=3)",
     create => "false",
     edit_defaults => backup_timestamp,
     edit_line => set_colon_field("id","2","3");

   "$(configfiles.files[inittab])"
     handle => "inittab_disable_ctrlaltdel",
     comment => "Ensure handling of ctrl-alt-del is disabled",
     create => "false",
     edit_defaults => backup_timestamp,
     edit_line => comment_lines_matching("ca::ctrlaltdel:.*", "#");
}

Here, we have simply moved the filename into the files array that we have been using, and added the call to edit_inittab() to the methods: section.

Configuration Files with Variable Content

So far, we have been making fixed changes to configuration files, which is helpful enough, but CFEngine is able to handle much more complex situations. In a real network not all systems are the same, and you have a mixture of operating systems, releases, and parameters that affect how each machine should be configured. Handling these almost-the-same-but-slightly-different configurations by hand is a certain recipe for disaster: eventually, someone will lose track of the changes that have to be made, forget to make certain changes, or make the wrong set of changes, and a system will stop working. With CFEngine, these configurations can be made consistently and without errors.

Class-based configuration

CFEngine automatically discovers a large amount of information about the system and its current state and sets classes based on them. These are called hard classes in CFEngine terminology because they are set by CFEngine based on system characteristics; they are different from soft classes, which are set by the policy during its execution. Using hard classes, we can instruct CFEngine to act differently depending on characteristics of each system or of the moment when CFEngine is executed.

To know which classes are discovered by CFEngine, we can use the cf-promises command like this:

# cf-promises -v | grep classes
2013-10-04T04:31:00+0000  verbose: Discovered hard classes: 10_0_2_15
127_0_0_1 2_cpus 64_bit Day4 Friday GMT_Hr4 Hr04 Hr04_Q3 Lcycle_0
Min30_35 Min31 Night October PK_MD5_0b595fd7ffc16e9bda575402bd2048de
Q3 Yr2013 any cfengine cfengine_3 cfengine_3_5 cfengine_3_5_2
common community_edition compiled_on_linux_gnu debian debian_wheezy
fe80__a00_27ff_fefe_aaaf have_aptitude inform_mode ipv4_10 ipv4_10_0
ipv4_10_0_2 ipv4_10_0_2_15 ipv4_127 ipv4_127_0 ipv4_127_0_0 ipv4_127_0_0_1
linux linux_3_2_0_23_generic linux_x86_64 linux_x86_64_3_2_0_23_generic
linux_x86_64_3_2_0_23_generic__36_Ubuntu_SMP_Tue_Apr_10_20_39_51_UTC_2012
localhost mac_08_00_27_fe_aa_af net_iface_eth0 net_iface_lo somehost
somehost_cfengine_com ubuntu ubuntu_12 ubuntu_12_4 verbose_mode x86_64
Tip

The community-contributed tool hcgrep allows you to easily search and display hard classes defined by CFEngine.

Let’s look at some of these classes and what the names tell us:

  • Time information is given by classes such as Day4 (4th of the month), Friday, Hr04 (4AM), Min30_35 (it’s between 4:30 and 4:35), Hr04_Q3 and Q3 (the current quarter-hour), Night (it’s at night, or more precisely, between 00-06 hours), October, Yr2013, Lcycle_0 (this is a “lifecycle index” defined as the year modulo 3, and which can be used for long-term scheduling). All times are expressed in the local system timezone except for GMT_Hr4.

  • Network information is given by classes such as 10_0_2_15 (the host’s IP address), ipv4_10, ipv4_10_0, ipv4_10_0_2, ipv4_10_0_2_15 (the different portions of the IP address), net_iface_eth0 and net_iface_lo (the network interfaces defined in the system).

  • System information is given by classes such as somehost (the host name), somehost_cfengine_com (its FQDN, with the periods replaced by underscores since classnames cannot contain periods), linux, ubuntu, ubuntu_12 (operating system type and, in this case, Linux distribution information), x86_64 (system architecture), and linux_3_2_0_23_generic (Linux kernel version and build information).

  • CFEngine information is given by classes such as cfengine_3, cfengine_3_5, cfengine_3_5_2 (version number), community_edition (indicating that this host is running the CFEngine Community edition), PK_MD5_0b59... (an MD5 digest of the host’s CFEngine-generated public key, which can be used to uniquely identify the system), and verbose_mode (which tells us CFEngine was run with the -v option, so you could tie your own verbose output to the use of this option).

Note

Note that all strings that contain periods or other special characters (e.g. IP addresses, host names, etc.) have those characters replaced by underscores when converted to classnames. You can perform this conversion on any string to get a valid class name by using the canonify() function.
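
As a small illustration of this conversion, the following sketch passes two of the strings from the listing above through canonify() and reports the results. The bundle name canonify_demo and the variable names are hypothetical.

```cfengine
body common control
{
    bundlesequence => { "canonify_demo" };
}

bundle agent canonify_demo
{
 vars:
   # Example strings taken from the hard classes listed above
   "raw"           slist  => { "somehost.cfengine.com", "10.0.2.15" };

   # One canonified string per element, through implicit looping
   "canon[$(raw)]" string => canonify("$(raw)");

 reports:
   "$(raw) becomes $(canon[$(raw)])";
}
```

The reported values should correspond to the somehost_cfengine_com and 10_0_2_15 classes seen in the cf-promises output.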

Hard classes allow a great deal of flexibility in writing configurations by offering very detailed information based on which you can perform configuration actions. Of course, you can also define your own classes in policy to identify any arbitrary conditions you need.

As an example, you could use the system type to decide which command to use for a certain task:

bundle agent reboot
{
  commands:
    linux::
      "/sbin/shutdown -r now";
    windows::
      "c:/Windows/system32/shutdown.exe /r /t 01";
}

Remember that in CFEngine, lines that end with a double colon are interpreted as class expressions, which indicate that the lines that follow should be evaluated only if the expression evaluates to true. In this case the selection is very simple: we use one command for rebooting Linux systems, and a different one for Windows machines, using the hard classes linux and windows as class expressions.

We can also combine classes in more complex expressions. Extending our previous example, we could use the and (. or &) operator to condition a reboot on both the existence of the reboot_needed class and the corresponding operating-system class. Additionally, we can produce an error if a reboot is needed but the machine is neither Linux nor Windows, by applying not (!) to a group, enclosed in parentheses, that combines both classes with or (|):

bundle agent reboot
{
  commands:
    reboot_needed.linux::
      "/sbin/shutdown -r now";
    reboot_needed.windows::
      "c:/Windows/system32/shutdown.exe /r /t 01";

  reports:
    reboot_needed.!(linux|windows)::
      "I know how to reboot only Linux and Windows machines.";
}
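
Since reboot_needed is a soft class, a convenient way to experiment with these expressions is to replace the commands with reports and define the class from the command line. The following dry-run sketch assumes a hypothetical bundle name, reboot_dryrun, and only prints what the real bundle would do.

```cfengine
body common control
{
    bundlesequence => { "reboot_dryrun" };
}

bundle agent reboot_dryrun
{
  reports:
    reboot_needed.linux::
      "Would run /sbin/shutdown -r now";
    reboot_needed.windows::
      "Would run c:/Windows/system32/shutdown.exe /r /t 01";
    reboot_needed.!(linux|windows)::
      "I know how to reboot only Linux and Windows machines.";
    !reboot_needed::
      "The reboot_needed class is not set, nothing to do.";
}
```

Running it with cf-agent -KI -f ./reboot_dryrun.cf -D reboot_needed (assuming that file name) defines the class for that run; without -D, only the last report should appear.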

Time-based classes can be used to emulate cron-like behavior using CFEngine. For example:

bundle agent cron_tasks
{
  commands:
    # Commands to run hourly between minute 00-05
    Min00_05::
      "/usr/sbin/updatedb";
    # Commands to run during hours 00 and 03
    Hr00::
      "/usr/local/sbin/logrotate";
      "/usr/sbin/tmpclean";
    Hr03::
      "/usr/local/sbin/run_backups";
    Monday::     # Commands to run during Monday
      "/usr/sbin/usercheck";
    Lcycle_0.March::   # Commands to run during March every three years
      "/usr/sbin/random_catastrophic_failure";
}
Warning

Note that a command (and in general any promise) will execute every time CFEngine runs, as long as its class conditions are true. So the class expression Hr03 means that the /usr/local/sbin/run_backups command will be executed repeatedly throughout the 3AM hour, every time CFEngine runs (by default every 5 minutes). Be mindful to design your class expressions according to what you need. You can limit how often a single promise is evaluated using the ifelapsed attribute of an action body.
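
As a sketch of how ifelapsed could be applied to the backup command, the following policy attaches an action body to the promise. The body name at_most_hourly and the bundle name nightly_backup are hypothetical; ifelapsed is expressed in minutes.

```cfengine
body common control
{
    bundlesequence => { "nightly_backup" };
}

# Hypothetical body name; ifelapsed is given in minutes
body action at_most_hourly
{
    ifelapsed => "60";
}

bundle agent nightly_backup
{
  commands:
    Hr03::
      "/usr/local/sbin/run_backups"
        action => at_most_hourly;
}
```

With a value of 60, the first run during the 3AM hour should execute the command, and later runs within the same hour should skip it. Keep in mind that the -K option of cf-agent ignores these time locks, so the effect will not be visible when testing with -K.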

In a bundle like this, you can define any number of tasks to execute. The other big advantage is that using CFEngine as a cron replacement allows you to schedule not only commands and shell scripts, but also arbitrary CFEngine promises, which you can use to perform more complex tasks than you could using cron alone.

System-information classes allow you to perform different tasks depending on the system state or configuration. For example, you could easily create different network profiles using CFEngine:

bundle agent network_profiles
{
  methods:
    # At home, 192.168.23.0/24, start my backup
    ipv4_192_168_23::
      "openservices" usebundle => openservices;
      "dobackup"     usebundle => backup;
      "printer"      usebundle => configure_printer("home");
    # At work, 9.4.0.0/16, configure the appropriate printers
    ipv4_9_4::
      "openservices" usebundle => openservices;
      "printer"      usebundle => configure_printer("work");
    # Anywhere else, close some services for additional protection
    !(ipv4_192_168_23|ipv4_9_4)::
      "closeservices" usebundle => closeservices;
}

In this case, we are modifying system settings (through bundles we are calling using methods: promises) based on the IP address range in which the system is currently configured. The possibilities are endless.

System-state-based configuration

Another, even more flexible, way of configuring a system involves using its current state to determine the desired end state, making the policy fully dynamic depending on each particular system.

In one of my projects, we had a large number of Linux machines with two network interfaces, one of them connected to the production network (which we called the “green” network) and the other one to the management network (called the “black” network). Due to the characteristics of the networking infrastructure, we had to disable the TSO flag (TCP Segmentation Offload) on the interfaces that were on the green network. In my first attempt at automating this, I observed that the green interface was always eth0 (these were all Linux systems), and hard-coded the CFEngine configuration to add the following line to /etc/inittab:

tso:3:once:/usr/sbin/ethtool -K eth0 tso off

This results in the ethtool command being run upon system boot to disable this flag. The policy to achieve this is very similar to the ones we have seen before, so I will not show its exact implementation.

This worked fine… until exceptions started to appear: systems in which the green interface was not necessarily eth0. Then the rules had to adapt, and with CFEngine this was fairly simple to accomplish.

In this particular case, the two networks could be easily identified by their IP address ranges. The green network was in the 192.168.0.0/16 range, and the black network was in the 10.10.0.0/16 range. With this piece of information, I was able to modify the policy so that the correct interface is used in the ethtool command. Here is the complete bundle:

bundle agent disable_tso_flag
{
 vars:
   "ipregex" string => "192\.168\..*"; (1)
   "nics"    slist  => getindices("sys.ipv4");

 classes:
   "isgreen_$(nics)" expression => regcmp("$(ipregex)",
                                          "$(sys.ipv4[$(nics)])"); (2)

 files: (3)
   "$(configfiles.files[inittab])"
     handle => "inittab_add_ethtool",
     comment => "Ensure ethtool is run on startup to disable the TSO flag",
     create => "false",
     edit_defaults => backup_timestamp,
     edit_line => replace_or_add("tso:3:.*", (4)
                                 "tso:3:once:/usr/sbin/ethtool -K $(nics) tso off"),
     ifvarclass => "isgreen_$(nics)";
}

This bundle is meant to be incorporated into the main policy that we have been developing throughout this chapter, since it refers to the configfiles() bundle. Let’s examine it in more detail.

  1. First, we assign to the $(ipregex) variable the regular expression to select the interfaces we want (the green ones, in this case). Next, we store in the @(nics) list the indices of the special CFEngine array sys.ipv4. This is a special variable created by CFEngine that contains all the IP addresses configured in the system, indexed by interface name. Therefore, getindices("sys.ipv4") gives us a list of all the network interfaces on the system.

  2. Once we have this list, we again make use of CFEngine’s implicit looping to define a number of classes named isgreen_ifname, where ifname represents each of the network interfaces on the system. Each class is true if the IP address of said interface, given by the value "$(sys.ipv4[$(nics)])", matches $(ipregex) (remember that $(nics) is set to each of the interface names in turn). So, for example, if the system has the following network interfaces:

    • eth0: 9.4.21.16

    • eth1: 189.177.231.225

    • eth2: 192.168.13.56

    • eth3: 10.10.54.25

    then the evaluation of the classes will be as follows:

    • isgreen_eth0: unset

    • isgreen_eth1: unset

    • isgreen_eth2: set

    • isgreen_eth3: unset

    This tells us exactly which interface is the one we need to use in the ethtool command.

  3. Armed with this knowledge, we can proceed to the files: promise, which adds to /etc/inittab the line that executes the ethtool command. This command contains the interface name as given by the $(nics) variable (implicit looping in action again), and the promise is evaluated only if the corresponding isgreen_ifname class is set, as indicated by the ifvarclass => "isgreen_$(nics)" clause in the promise.

  4. To actually append the line we use another bundle from the standard library called replace_or_add that does the following: if a line matches the regular expression given by the first argument, it is replaced in its entirety by the second argument. If no match is found, the line given in the second argument is added to the file. The replace_or_add bundle is very simple. It uses the same trick as the set_config_values bundle we discussed before (setting a class unconditionally upon execution of the replace_patterns: promise) to achieve the desired operation:

    bundle edit_line replace_or_add(pattern,line)
    {
      vars:
          "cline" string => canonify("$(line)");
      replace_patterns:
          "^(?!$(line)$)$(pattern)$"
            replace_with => value("$(line)"),
            classes => always("replace_done_$(cline)");
      insert_lines:
          "$(line)"
            ifvarclass => "replace_done_$(cline)";
    }
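
Before letting a bundle like disable_tso_flag edit /etc/inittab, it can be useful to check which interfaces would be classified as green on a given host. The following read-only sketch reuses the vars: and classes: sections from above and replaces the files: promise with reports. The bundle name show_green_nics is hypothetical, and the sketch assumes interface names such as eth0 that are already valid in class names.

```cfengine
body common control
{
    bundlesequence => { "show_green_nics" };
}

bundle agent show_green_nics
{
 vars:
   "ipregex" string => "192\.168\..*";
   "nics"    slist  => getindices("sys.ipv4");

 classes:
   # One class per interface whose address matches the green range
   "isgreen_$(nics)" expression => regcmp("$(ipregex)",
                                          "$(sys.ipv4[$(nics)])");

 reports:
   "Interface $(nics) has address $(sys.ipv4[$(nics)])";
   "Interface $(nics) is on the green network"
     ifvarclass => "isgreen_$(nics)";
}
```

On the example host described in item 2, only eth2 should produce the second report.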

It pays to know the built-in classes, variables and functions in CFEngine, since they help achieve most necessary processing and data extraction tasks. I would strongly encourage you to read through the corresponding sections in the reference manual to get familiar at least in general terms with the available functionality.

We have described the use of CFEngine’s implicit looping several times. This concept isn’t found in most programming languages, so it can be hard to wrap your head around it at the beginning. Once you get the hang of it, you will realize that it can save many lines of flow-control code that would be necessary in other languages, and whose absence in CFEngine allows you to focus on writing the policy. In fact, CFEngine goes out of its way to prevent you from worrying about the flow of execution in a policy, using concepts such as implicit looping and normal ordering to determine how things are executed.

It is a natural tendency at the beginning to fight this level of automation, but true CFEngine mastery lies in letting go of the urge to control everything down to the last detail, and using CFEngine the way it is meant to be used. Tell CFEngine what you want and how to do it, and let CFEngine worry about details like the order in which operations will be executed.

User Management

One of the basic tasks of any system administrator is to control user accounts. Whether they are local accounts or centralized accounts using some network-wide mechanism such as LDAP, CFEngine gives you the exact control you need.

From a high-level perspective, the definition of user accounts may be expressed as follows:

bundle agent manage_users
{
 vars:
   # Users to create
   "users[root][fullname]" string => "System administrator";
   "users[root][uid]"      string => "0";
   "users[root][gid]"      string => "0";
   "users[root][home]"     string => "/root";
   "users[root][shell]"    string => "/bin/bash";
   "users[root][flags]"    string => "-o -m";
   "users[root][password]" string => "FkDMzhB1WnOp2";

   "users[zamboni][fullname]" string => "Diego Zamboni";
   "users[zamboni][uid]"      string => "501";
   "users[zamboni][gid]"      string => "users";
   "users[zamboni][home]"     string => "/home/zamboni";
   "users[zamboni][shell]"    string => "/bin/bash";
   "users[zamboni][flags]"    string => "-m";
   "users[zamboni][password]" string => "dk52ia209rfuh";

 methods:
   "users" usebundle => create_users("manage_users.users");
}

This example stores the user characteristics in a two-dimensional array indexed by username and by the different fields of each user record. The create_users() bundle is called from the methods: section of the policy, passing the configuration array as an argument. Here is the create_users() bundle:

bundle agent create_users(info)
{
 vars:
   "user" slist => getindices("$(info)"); (1)

 classes:
   "add_$(user)" not => userexists("$(user)"); (2)

 commands: (3)
  linux::
   "/usr/sbin/useradd $($(info)[$(user)][flags]) -u $($(info)[$(user)][uid])
       -g $($(info)[$(user)][gid]) -d $($(info)[$(user)][home])
       -s $($(info)[$(user)][shell]) -c '$($(info)[$(user)][fullname])' $(user)"
     ifvarclass => "add_$(user)";
  windows::
   "c:/Windows/system32/net user $(user) $($(info)[$(user)][password]) /add
       \"/fullname:$($(info)[$(user)][fullname])\"
       \"/homedir:$($(info)[$(user)][home])\""
     ifvarclass => "add_$(user)";
   # On Windows we use a command to set the password
   # unconditionally in case it has changed.
   "c:/Windows/system32/net user $(user) $($(info)[$(user)][password])"; (4)

 files:
  linux::
   # This is not conditioned to the add_* classes
   # to always check and reset the passwords if needed.
   "/etc/shadow" (5)
     edit_line => set_user_field("$(user)", 2,
                                 "$($(info)[$(user)][password])");

 reports: (6)
  !linux.!windows::
   "I only know how to create users under Linux and Windows.";
  verbose_mode::
   "Created user $(user)"
     ifvarclass => "add_$(user)";
}

This particular implementation of create_users() handles only local user accounts, both on Linux and on Windows.

  1. In the vars: section, we store in @(user) a list of the user accounts to check (the top-level indices of the configuration array) using the getindices() function. This list is used through CFEngine’s implicit looping to apply the rest of the bundle to each of those accounts.

  2. The classes: section defines a class named add_username for each user account that does not exist, by using the built-in userexists() function to check for the existence of each user in turn.

    The userexists() function does not return valid results on Windows when using the community edition of CFEngine, but it does work properly if you are using one of the commercial editions. Proper Windows support is one of the benefits of the commercial versions of CFEngine.

    Note that we are using implicit looping again, but this time the variable $(user) is being used in two places: as part of the classname, and as argument to the userexists() function.

  3. The commands: section is divided by operating systems, using the predefined OS-type hard classes provided by CFEngine. Here, we issue the necessary command-line instructions to create the users, but only if the user does not exist already. This is controlled by the ifvarclass attribute added to each command promise, which makes the statement execute only if the given class expression is true.

    Note that the other account attributes (other than the password, see below) are not verified for accuracy, in this version of the bundle. If the account exists already, the promise is considered as satisfied.

  4. Since we also want to enforce the passwords for each account, we have to make sure the passwords are checked and, if needed, changed every time.

    In the case of Windows, we issue the command to reset the password to its desired value every time the policy runs. This is done from the commands: section. (In this case, the password has to be given in clear text in the users array.)

  5. For Linux, we reset user passwords in the files: section, by directly editing the /etc/shadow file to set the password field to the value given in the users array (this value has to be the desired password, already encoded using the crypt() function appropriate for the operating system). The set_user_field() bundle can be found in the standard library, and is very similar to the set_colon_field() bundle I described before.

  6. Finally, the reports: section produces a report for each user that was created, if verbose mode is enabled (the verbose_mode class is automatically set when the -v command-line option is given to cf-agent), and also an error if we are running on an unsupported system.

The call to manage_users() could be easily integrated into the overarching configfiles() bundle we have been building, by adding one more line to the methods: section:

  "users"   usebundle => manage_users;

To make things easier to manage, we could also get rid of manage_users() entirely, move the definition of the user accounts from the manage_users() bundle to the configfiles() bundle, where all our other user-configurable variables are being set, and call create_users() directly:

bundle agent configfiles
{
 vars:
   ...
   # Users to create
   "users[root][fullname]" string => "System administrator";
   "users[root][uid]"      string => "0";
   "users[root][gid]"      string => "0";
   "users[root][home]"     string => "/root";
   "users[root][shell]"    string => "/bin/bash";
   "users[root][flags]"    string => "-o -m";
   "users[root][password]" string => "FkDMzhB1WnOp2";

   "users[zamboni][fullname]" string => "Diego Zamboni";
   "users[zamboni][uid]"      string => "501";
   "users[zamboni][gid]"      string => "users";
   "users[zamboni][home]"     string => "/home/zamboni";
   "users[zamboni][shell]"    string => "/bin/bash";
   "users[zamboni][flags]"    string => "-m";
   "users[zamboni][password]" string => "dk52ia209rfuh";

 methods:
   ...
   "users"   usebundle => create_users("configfiles.users");
}

This is a very simple example that manages only local accounts, but which is useful, for example, to set known attributes on common local accounts such as root. However, CFEngine has the ability to manage much more complex scenarios, including centralized user directories. LDAP integration (including Active Directory) is properly supported in the commercial editions of CFEngine.

Software Installation

One of the main tasks of system maintenance is the installation, configuration, upgrading, and removal of software. In the past, most software on a system was (a) part of the operating system, installed or upgraded whenever the whole system was installed or upgraded, (b) commercial software that had its own installation mechanisms, or (c) open-source software that had to be compiled and installed manually. Over time, most operating systems have developed package-management mechanisms, which make it easier to install and manage software of any kind. Unfortunately, package-management mechanisms vary widely in their capabilities and user interfaces, which makes writing software that can interface with any of them a daunting task. Furthermore, there is still the need (however sporadic) to compile and install software manually.

CFEngine provides powerful, generic mechanisms for dealing with this task, which can be adapted to the needs of each particular system.

Package-Based Software Management

CFEngine understands package management as a generic concept. Each package is represented by three attributes: its name, its version, and its architecture. CFEngine allows you to perform operations such as add, delete, and update. The specifics of how to interact with the package-management system are abstracted into discrete components of the policy, and can be customized to interact with any command-line-driven package manager.

All package-management promises in CFEngine occur in the packages: section of an agent bundle. CFEngine allows us to make promises about the state of the packages in the system, and leaves the work of actually modifying the packages to the underlying packaging system. Keep in mind that given the widely varying capabilities of package management systems, we must take their capabilities into consideration when writing package-management promises (for example, if the system uses rpm, we should take into account that it will not automatically fetch and install dependencies of the package being installed).

Let us look at a very simple example:

bundle agent software
{
  vars:
      "pkgs" slist => {
                        "subversion",
                        "tcpdump"
                      };
  packages:
      "$(pkgs)"
        package_policy => "addupdate",
        package_method => apt;   # For Debian and Ubuntu
}

We define a list variable containing the packages we want to install or update (Subversion and tcpdump), and use it in a promise that specifies the addupdate package policy, which means “update the package if it’s installed, install it if not.” We also specify apt as the package method, which is the package management system in Debian-based Linux distributions.
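
The same promise structure covers the other operations. As a minimal sketch, the following bundle promises that a package is absent by using the delete policy. The bundle name and the telnetd package are only illustrative choices, and the apt body is assumed to come from the standard library, as in the example above:

```cf3
bundle agent software_removed
{
  packages:
      # Illustrative package name; replace it with whatever must not be installed
      "telnetd"
        package_policy => "delete",
        package_method => apt;   # For Debian and Ubuntu
}
```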

Some standard package_method bodies, including apt, are defined in the standard library. Let us look at the definition of apt (some lines have been wrapped and summarized for readability):

body package_method apt
{
     package_changes => "bulk";
     package_list_command => "/usr/bin/dpkg -l"; (1)
     package_list_name_regex    => "ii\s+([^\s]+).*"; (2)
     package_list_version_regex => "ii\s+[^\s]+\s+([^\s]+).*";
     package_installed_regex => ".*"; # all reported are installed
     package_name_convention => "$(name)"; (3)

   # set it to "0" to avoid caching of list during upgrade
     package_list_update_ifelapsed => "240";

  have_aptitude:: (4)
     package_add_command => "/usr/bin/aptitude --assume-yes install";
     package_list_update_command => "/usr/bin/aptitude update";
     package_delete_command => "/usr/bin/aptitude --assume-yes -q remove";
     package_update_command => "/usr/bin/aptitude --assume-yes install";
     package_verify_command => "/usr/bin/aptitude show";
     package_noverify_regex =>
       "(State: not installed|E: Unable to locate package .*)";

  !have_aptitude::
     package_add_command => "/usr/bin/apt-get --yes install";
     package_list_update_command => "/usr/bin/apt-get update";
     package_delete_command => "/usr/bin/apt-get --yes -q remove";
     package_update_command =>  "/usr/bin/apt-get --yes install";
     package_verify_command => "/usr/bin/dpkg -s";
     package_noverify_returncode => "1";
}

A package_method body tells CFEngine how to execute the commands that actually perform the operations, and how to process their output to obtain necessary information:

  1. The package_list_command attribute specifies the command to run to generate a list of packages on the system.

  2. Coupled with this, the package_list_name_regex and package_list_version_regex attributes tell CFEngine the regular expressions to apply on the output of the package-listing command to determine each package’s name and version. Additionally, the package_installed_regex is used to determine which of the packages in the listing are actually installed (in this case, because of the command used, all packages in the output are installed, but this may not be the case in other package-management systems).

  3. The package_name_convention attribute tells CFEngine how to specify a package when executing any of the commands. Some package-management systems may require both the name and the version to operate. apt needs only the name, hence it’s specified like this.

  4. The have_aptitude class is a hard class that CFEngine defines automatically on Debian-like systems when the aptitude package management program is installed, since it provides some additional capabilities. Depending on this class, the body sets the specific commands for adding, removing or updating packages.

The standard library includes predefined package_method bodies for several common package managers, including zypper, apt, rpm, yum, Windows MSI installers, the Solaris package manager and the FreeBSD package manager. There is also a generic package method that combines all of the above, and provides the correct values according to the appropriate operating-system hard classes.

It is important to note that a package_method definition specifies exactly how to interact with the package manager, and thus allows you to interact with any packaging mechanism you want by writing the appropriate package_method. Useful examples of this would be package_method definitions for popular language-specific or tool-specific package managers, such as PEAR or RubyGems.

Package promises can be far more complicated. The name, version, and architecture attributes can be used in package promises to define the desired result. We can also use version-comparison operators to further refine the actions. For example:

bundle agent software
{
  vars:
      "version[openssl]"  string => "0.9.8k-7ubuntu8";
      "version[ssl-cert]" string => "1.0.23ubuntu2";

      "architectures" slist => { "x86_64" };
      "allpkgs"       slist => getindices("version");

  packages:
      "$(allpkgs)"
        package_policy => "add",
        package_select => "==",
        package_version => "$(version[$(allpkgs)])",
        package_architectures => @(architectures),
        package_method => apt;   # For Debian and Ubuntu
}

In this case, we are using an array to store the versions we want, indexed by package name. We then use the list of indices from the array, together with implicit looping, to install the specific version we want of each package, also specifying the desired architecture. The package_select attribute with value "==" tells CFEngine that we want exactly the specified version of the package (by default its value is ">=", which accepts the specified version or any newer one).
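
For comparison, here is a sketch of a promise that asks for a minimum version instead of an exact one. The bundle name is hypothetical, and the version string is simply reused from the example above:

```cf3
bundle agent software_minimum
{
  packages:
      "openssl"
        package_policy => "addupdate",
        package_select => ">=",                # this version or a newer one
        package_version => "0.9.8k-7ubuntu8",
        package_method => apt;   # For Debian and Ubuntu
}
```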

When package_policy is verify (this is its default value), all that CFEngine does is check that the desired packages are installed correctly. This can be used to simply report on the correctness of the system, without attempting to fix anything[1]. For example:

bundle agent verify_packages
{
 vars:
   "allpkgoutput"
     string => execresult("/usr/bin/rpm -qa --queryformat \"%{name}\n\"",
                          "noshell");
   "allpkgs"
     slist => splitstring("$(allpkgoutput)", "\s+", 999999);

 packages:
   "$(allpkgs)"
     package_policy => "verify",
     package_method => rpm,
     classes => if_notkept("incorrect_$(allpkgs)");

 reports:
   "Problem: package $(allpkgs) is not installed correctly."
     ifvarclass => "incorrect_$(allpkgs)";
}

This bundle starts by getting a listing of all packages by running an external command with the execresult() function and storing the output in the $(allpkgoutput) string, which is then split by the splitstring() function into the @(allpkgs) list. We then iterate over this list, verifying each package in turn. If the promise is not kept (that is, if the package does not verify correctly), the packages: promise defines an incorrect_packagename class. In the reports: section, we iterate again over @(allpkgs), printing a message for each package whose incorrect_packagename class is defined. We can use this as a general “sanity check” of a system, for example to produce a report of its current state when a new system comes under our management, or to trigger automatic corrective actions.

Manual Software Management

Although package management software is the ideal way to install and uninstall software on a system, there may be cases in which you want or need to manage software manually. One such case would be when the software you need to install is not available in your operating system’s software repository, or if you need to compile or install it in a custom way, or if you need a specific version that is too old or too new to be in the repository.

In this section we develop a CFEngine policy to manually install an application. This requires more manual work and each policy will be unique to the application that is being installed, so you may want to minimize the number of applications you install using this method. However, it is useful to know how to perform this task for the times when it is needed.

For our example, we will install the WordPress blogging and CMS application. From the WordPress documentation, we can see that it has a fairly simple installation procedure:

  1. Install the system requirements: Apache, PHP, and MySQL;

  2. Download and extract the package;

  3. Create a MySQL database and user to use with WordPress;

  4. Set up wp-config.php with the necessary database parameters, using wp-config-sample.php as a starting point.

These steps give us a fairly good guide for implementing the installation using CFEngine. We’ll create a wp_install bundle, but let’s start by thinking how we would like to invoke it:

body common control
{
     bundlesequence => { wp_install("g.wp_config") };
     inputs => { "libraries/cfengine_stdlib.cf", "wordpress.cf" };
}

bundle common g
{
 vars:
   "wp_config[DB_NAME]"       string => "wordpress";
   "wp_config[DB_USER]"       string => "wordpress";
   "wp_config[DB_PASSWORD]"   string => "lopsa10linux";
  debian::
   "wp_config[_htmlroot]"     string => "/var/www";
  redhat::
   "wp_config[_htmlroot]"     string => "/var/www/html";
  any::
   "wp_config[_wp_dir]"       string => "$(wp_config[_htmlroot])/blog";
}

We are defining, in the common g bundle, the wp_config array with our parameters for the installation. The most important parameters are the database name, user and password, and the directory where we want WordPress to be installed. Note that we use classes to assign a different value to the _htmlroot parameter depending on whether we are on a Debian or a RedHat system, to account for slight differences between those distributions.

The wp_install() bundle is called directly from the bundlesequence declaration, passing the name of the configuration array as a parameter. The wp_install() bundle could also be called, for example, from the methods: section of some other bundle, as we did before in our configfiles() bundle.
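
A sketch of that alternative follows, with site_setup as a hypothetical name for the calling bundle, which would then take the place of wp_install() in the bundlesequence:

```cf3
bundle agent site_setup
{
  methods:
      # Same argument as in the bundlesequence: the name of the array
      "wordpress" usebundle => wp_install("g.wp_config");
}
```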

Let’s now walk through the actual implementation of the wp_install policy.[2]

  1. The wp_install() bundle is the point of entry for this policy bundle, which calls all other tasks:

bundle agent wp_install(params)
{
 methods:
   "wp_vars"  usebundle => wp_vars("$(params)");
   "wp_pkgs"  usebundle => wp_packages_installed("wp_vars.conf");
   "wp_svcs"  usebundle => wp_services_up("wp_vars.conf");
   "wp_tar"   usebundle => wp_tarball_is_present("wp_vars.conf");
   "wp_xpnd"  usebundle => wp_tarball_is_unrolled("wp_vars.conf");
   "wp_mysql" usebundle => wp_mysql_configuration("wp_vars.conf");
   "wp_cfgcp" usebundle => wp_config_exists("wp_vars.conf");
   "wp_cfg"   usebundle => wp_is_properly_configured("wp_vars.conf");
}

This bundle receives as its argument a variable called params, which must contain the name of an array holding the adjustable parameters of this bundle, such as the database user and password to use. In the sample invocation we saw before, the argument was the string "g.wp_config"; this is how we have passed configuration arrays before, by name rather than by passing the array itself.

In the first methods: call, to the wp_vars() bundle, the configuration array is extended with default values for any parameters the user did not specify, and with some hard-coded internal parameters related to the operation of the bundle: the file into which wordpress.tar.gz will be downloaded, the URL from which it will be fetched, the path of the service command, and the name by which the Apache web server is identified. The extended parameter array is stored in wp_vars.conf, which is used by all the other bundles (we will examine the operation of the wp_vars() bundle in detail in [passing-name-value-pairs-to-bundles]).

In the rest of the methods: section, we call the other bundles that actually perform the required tasks, passing the name of the wp_vars.conf array to each of them. In methods: promises, the promiser is an arbitrary string that (at least in current CFEngine versions) is not used for any purpose. For clarity, we use a short identifier for each of the methods we are calling.

  1. The first step of the actual installation process is to make sure all WordPress prerequisites are properly installed and working. This is taken care of by two bundles, wp_packages_installed() and wp_services_up(). The first one uses the native package-management facilities to install the prerequisites for WordPress:

    bundle agent wp_packages_installed(params)
    {
      vars:
        debian::  (1)
          "desired_package" slist => {
                                       "apache2",
                                       "php5",
                                       "php5-mysql",
                                       "mysql-server",
                                     };
        redhat::
          "desired_package" slist => {
                                       "httpd",
                                       "php",
                                       "php-mysql",
                                       "mysql-server",
                                     };
      packages:  (2)
          "$(desired_package)"
            package_policy => "add",
            package_method => generic,
            classes => if_repaired("packages_added");
    
      commands:
        packages_added::  (3)
          "$($(params)[_sys_servicecmd]) $($(params)[_sys_apachesrv]) graceful"
            comment => "Restarting httpd so it can pick up new modules.";
    
    }
    1. We first define the list of necessary packages for each operating system that we want to support.

    2. This is then used in the packages: section of the bundle to install them as appropriate. If any of the package promises are repaired (that is, if any of the packages need to be installed), the packages_added class will be defined.

    3. If the packages_added class is defined, Apache needs to be restarted to ensure it uses any newly-available modules. The path of the service command and the name of the service to restart (httpd in RedHat, apache2 in Debian) are taken from the params array as defined in wp_vars().

  2. The wp_services_up() bundle ensures that both MySQL and Apache are running:

    bundle agent wp_services_up(params)
    {
      processes:
        debian:: (1)
          "/usr/sbin/mysqld" restart_class => "start_mysqld";
          "/usr/sbin/apache2"  restart_class => "start_httpd";
        redhat::
          "^mysqld" restart_class => "start_mysqld";
          "^httpd"  restart_class => "start_httpd";
    
      commands: (2)
        start_mysqld::
          "$($(params)[_sys_servicecmd]) mysql start";
    
        start_httpd::
          "$($(params)[_sys_servicecmd]) $($(params)[_sys_apachesrv]) start" ;
    }
    1. First we ensure that both mysqld and httpd are running by using a processes: section. Different process-matching strings are used depending on the Linux distribution used, due to the differences in how the processes appear in the process table. If any of the processes are not running, the corresponding restart_class is defined.

    2. If any of the restart classes are defined (start_mysqld or start_httpd), the corresponding command is run to start the appropriate service.

    Tip

    This bundle could be rewritten to make use of services: promises, which have been available since CFEngine 3.3.0. In this case it would look like this:

    bundle agent wp_services_up(params)
    {
      services:
          "www"   service_policy => "start";
          "mysql" service_policy => "start";
    }

    See [service-management] for more information about service management using services: promises.

    After both of these bundles are called from the methods: section of the main wp_install() bundle, both the HTTP and MySQL daemons will be running, with the appropriate modules installed.

  3. Next, we need to download the WordPress distribution file if it is not present already. This is taken care of by the wp_tarball_is_present() bundle:

    bundle agent wp_tarball_is_present(params)
    {
      classes: (1)
          "wordpress_tarball_is_present"
            expression => fileexists("$($(params)[_tarfile])");
    
      commands: (2)
        !wordpress_tarball_is_present::
          "/usr/bin/wget -q -O $($(params)[_tarfile]) $($(params)[_downloadurl])"
            comment => "Downloading latest version of WordPress.";
    
      reports: (3)
        wordpress_tarball_is_present::
          "WordPress tarball is on disk.";
    }
    1. In the classes: section we define a class depending on the existence of the tar file that contains the WordPress distribution. The location and filename of this file are also contained in the params configuration array.

    2. If the class is not defined (which means the file is not present), the commands: section uses the wget command to download the file to the proper location.

    3. If the file was already there, we don’t download it again, and simply report its existence in the reports: section.

  4. Once we ensure that the WordPress distribution file is present, the wp_tarball_is_unrolled() bundle makes sure it has been expanded:

    bundle agent wp_tarball_is_unrolled(params)
    {
      classes: (1)
          "wordpress_src_dir_is_present"
            expression => fileexists("$($(params)[_htmlroot])/wordpress");
          "wordpress_final_dir_is_present"
            expression => fileexists("$($(params)[_wp_dir])");
    
      reports:
        wordpress_final_dir_is_present::
          "WordPress directory is present.";
    
      commands:
        !wordpress_final_dir_is_present&!wordpress_src_dir_is_present:: (2)
          "/bin/tar -xzf $($(params)[_tarfile])"
            comment => "Unroll WP tar to $($(params)[_htmlroot])/wordpress.",
            contain => in_dir_shell("$($(params)[_htmlroot])");
        wordpress_src_dir_is_present&!wordpress_final_dir_is_present::
          "/bin/mv $($(params)[_htmlroot])/wordpress $($(params)[_wp_dir])"
            comment => "Rename unrolled directory to $($(params)[_wp_dir])";
    }

    This bundle is very similar to the previous one, except that:

    1. The existence check is done on the directory into which the tar file expands, contained in the _wp_dir parameter of the configuration array.

    2. If it does not exist, we use the tar command to expand it under the directory defined by the _htmlroot parameter. This will create a directory named wordpress under that directory, since that is how the WordPress tar file is packaged. After unpacking the distribution, we rename the resulting wordpress directory to its final name as indicated by the _wp_dir parameter, from where it will be served by the web server.

  5. Once the files are in place, it is time to configure WordPress, and the first step is creating a MySQL database and user for WordPress to use. This is done by the wp_mysql_configuration() bundle:

    bundle agent wp_mysql_configuration(params)
    {
      commands:
          "/usr/bin/mysql -u root -e \"
          CREATE DATABASE IF NOT EXISTS $($(params)[DB_NAME]);
          GRANT ALL PRIVILEGES ON $($(params)[DB_NAME]).*
          TO '$($(params)[DB_USER])'@localhost
          IDENTIFIED BY '$($(params)[DB_PASSWORD])';
          FLUSH PRIVILEGES;\"
    ";
    }

    This is a very simple bundle: it just runs the mysql command-line utility with the appropriate SQL commands to perform the task. In this respect, MySQL makes things quite easy, since a single command can be used to create the database if it doesn’t exist already, create the user if it doesn’t exist already, and set the user password.

  6. The second bundle involved in configuring WordPress is wp_config_exists():

    bundle agent wp_config_exists(params)
    {
      classes:
          "wordpress_config_file_exists"  (1)
            expression => fileexists("$($(params)[_wp_config])");
    
      files:
        !wordpress_config_file_exists::  (2)
          "$($(params)[_wp_config])"
            copy_from => backup_local_cp("$($(params)[_wp_cfgsample])");
    
      reports:
        wordpress_config_file_exists::
          "WordPress config file $($(params)[_wp_config]) is present";
        !wordpress_config_file_exists::
          "WordPress config file $($(params)[_wp_config]) is not present";
    }
    1. This bundle first checks whether the wp-config.php file already exists inside the WordPress installation directory. If it does, we do not want to overwrite it, since it may already have some customizations (this is useful when updating WordPress to a new version).

    2. If the file does not exist, the wordpress_config_file_exists class will not be set, and in this case the files: section will create it using the wp-config-sample.php file shipped with WordPress as the starting point.

  7. After making sure the configuration file is in its proper place, we want to ensure that it contains the appropriate parameters. For this we use the wp_is_properly_configured() bundle:

    bundle agent wp_is_properly_configured(params)
    {
      vars:
          "allparams" slist => getindices("$(params)"); (1)
        secondpass::
          "wpparams"  slist => grep("[^_].*", "allparams");
    
      classes:
          "secondpass" expression => isvariable("allparams");
    
      files:
          "$($(params)[_wp_config])" (2)
            edit_line =>
              replace_or_add(
               "define\('$(wpparams)', *(?!'$($(params)[$(wpparams)]))'.*",
               "define('$(wpparams)', '$($(params)[$(wpparams)])');");
    }

    Although this is a short bundle, there is a lot going on behind the scenes, so let us take a moment to understand what it does.

    1. First, we store in the allparams list the indices of the configuration array that is passed into the bundle as the params argument. According to the definition given in our example, this gives us the following values: { DB_NAME, DB_USER, DB_PASSWORD, _htmlroot, _tarfile, … }

    We then use the grep() built-in function to select from allparams only those parameters that do not start with an underscore, to avoid storing internal parameters in the WordPress configuration file. The filtered list is stored in wpparams, which in our example would have the following values: { DB_NAME, DB_USER, DB_PASSWORD }

    Due to an artifact of the way variables are converged in the current version of CFEngine, we need a trick to make sure the wpparams list is assigned only on the second pass of the CFEngine policy evaluation (remember that CFEngine makes three passes over each bundle). Otherwise the wpparams list ends up empty, because it is filled using the original value of allparams and not the final value obtained from the getindices() function. To achieve this, we set the secondpass class in the classes: section based on whether the allparams variable exists, and in the vars: section wpparams is created only when the secondpass class is defined. For details about how this works, see [controlling-promise-execution-order].

    2. To understand the trick we are about to use, we need to look at the lines in the wp-config.php file that we want to modify:

    /** The name of the database for WordPress */
    define('DB_NAME', 'database_name_here');
    
    /** MySQL database username */
    define('DB_USER', 'username_here');
    
    /** MySQL database password */
    define('DB_PASSWORD', 'password_here');

    Notice that we have used, as the indices in the wp_config array, the same parameter names used in the wp-config.php file (plus some others, which we use internally in the policy). This allows the edit_line statement in the files: section to do its magic using CFEngine’s implicit looping (you should by now be getting the idea that this is one of the most powerful features in CFEngine!). In this statement, we replace the following regexp:

    define\('$(wpparams)', *(?!'$($(params)[$(wpparams)]))'.*

    by the following text:

    define('$(wpparams)', '$($(params)[$(wpparams)])');

    Through implicit looping, $(wpparams) takes the value of each index in sequence. So, for example, when $(wpparams) has the value DB_NAME, the regular expression looks like this:

    define\('DB_NAME', *(?!'wordpress)'.*

    and the replacement string looks like this:

    define('DB_NAME', 'wordpress');

    The negative look-ahead (?!…​) in the regular expression is used to match only lines in which the correct value is not already present, and to ensure the proper convergence of the replacement operation. Without this, CFEngine notices that the regular expression matches the line even after the replacement operation has taken place, and considers it to be a non-properly-convergent operation. It still works, but will produce a warning from CFEngine every time it runs.

    This means that we will replace whatever value the DB_NAME parameter has at the moment with the correct one, taken from the wp_config array. If the line does not exist at all, it will be added to the file. This will happen for all the parameters in that array automatically, and the file will be rewritten to disk only if at least one change is actually made to it. A nice side effect of this automation is that we can modify any parameter in the wp-config.php file just by adding a new element to the configuration array. For example, if we needed to set the DB_CHARSET parameter, all we need to do is add the following line to the definition of wp_config:

    "wp_config[DB_CHARSET]"  string => "iso8859-1";

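The effect of the negative look-ahead can be checked in isolation. The following standalone policy is only a sketch: the bundle and class names are made up for this illustration, and it uses the built-in regcmp() function to test the DB_NAME pattern against a line that still holds the placeholder and against one that already holds the desired value. If the reasoning above holds, the first line should match and the second should not, so both reports should be printed:

```cf3
body common control
{
    bundlesequence => { "lookahead_demo" };
}

bundle agent lookahead_demo
{
  vars:
      # Same pattern as above, with the DB_NAME value already filled in
      "pattern" string => "define\('DB_NAME', *(?!'wordpress)'.*";

  classes:
      # A line that still holds the placeholder value
      "stale_line_matches"
        expression => regcmp("$(pattern)",
                             "define('DB_NAME', 'database_name_here');");
      # A line that already holds the desired value
      "fixed_line_matches"
        expression => regcmp("$(pattern)",
                             "define('DB_NAME', 'wordpress');");

  reports:
    stale_line_matches::
      "The placeholder line matches, so it would be replaced.";
    !fixed_line_matches::
      "The corrected line does not match, so it would be left alone.";
}
```
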
There are a few aspects of this configuration to focus on. First, I would like to draw a contrast between this policy and a shell script that would perform the same tasks. The main difference is that in a CFEngine policy, we simply specify the end state we want to achieve (for example, a directory or a file existing), and CFEngine only proceeds with the actions if any aspects of the system are not in the desired state.

Second, notice that we are making use of some of the generic tools and tricks that we have built elsewhere, or that are available in the standard libraries. For example, we use the replace_or_add() bundle from the standard library to edit the WordPress configuration file. And we are using an array to pass parameters, which allows us to do some generic manipulations and traversing of the data, as seen in the wp_is_properly_configured() bundle.

Third, note how using the methods: section allows us to break a task into sub-tasks, thereby providing a single point of entry (in this case, the wp_install() bundle) into a policy that may be arbitrarily complex.

In general, I would advise you to use the built-in package management facilities for handling software in the system, using the interfaces that CFEngine provides to these systems. However, as we have just seen, it is entirely possible to perform ad-hoc software installation and configuration when needed. These are tasks that, when managing systems manually, you would have to perform anyway. CFEngine allows you to automate and perform them in the best possible way. As your mechanism to install and configure the software improves (or, for example, when the package appears in a proper way in the software repository), your CFEngine policy can evolve to adapt to your needs and possibilities. For example, when WordPress becomes available in the software repository, you could keep wp_install() as the main entry point for the policy, and simply replace the first five calls in the methods: section by a single call to a bundle that handles the installation using packages: promises.
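
As a rough sketch of what such a replacement bundle might look like: the bundle name is hypothetical, and "wordpress" is an assumed package name that would have to be checked against the actual repository. The params argument is kept only so that the bundle can be called the same way as the others.

```cf3
bundle agent wp_package_installed(params)
{
  packages:
      # "wordpress" is an assumed package name; check your repository
      "wordpress"
        package_policy => "add",
        package_method => generic;
}
```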

Using CFEngine for Security

A large part of maintaining security in a computer system consists of maintaining proper configuration of the systems, which makes CFEngine well suited for the task of configuring, maintaining, and monitoring security-related state. In this section we will explore some of the applications of CFEngine in this area.

Policy Enforcement

Many organizations have security policies that are in turn translated into specific configuration specifications for computer systems. While CFEngine cannot help with mapping high-level policies into procedures and implementations, it can certainly make sure that the implementations are correctly applied and maintained. Here are some example security-related policies that are common in some systems, and how CFEngine can help in enforcing them. In the process we will learn and revisit some CFEngine concepts and constructs.

Template-based login banners

CFEngine can make sure a login banner is always present, and contains the approved text according to a template that contains the policy-mandated text, plus some variable per-host information. For example:

body common control
{
     bundlesequence => { "login_banner" };
     inputs => { "/var/cfengine/inputs/libraries/cfengine_stdlib.cf" }; (1)
}

bundle agent login_banner
{
 vars:
   "template_file" string => "/var/cfengine/templates/motd_template.txt"; (2)
   "motd_file"     string => "/etc/motd";
   "support"       slist => { "Solaris: John Doe (x3478)",
                              "Linux: Sam Wilson (x7832)",
                              "AIX: Steve Clark (x3212)" };
 files:
   "$(motd_file)" (3)
     handle => "set_login_banner",
     comment => "Ensure the login banner is set to the authorized text",
     create => "true",
     perms => mog("644", "root", "wheel"),
     edit_defaults => empty,
     edit_template => "$(template_file)"; (4)
}

In this example we are making use of templates in CFEngine to populate the /etc/motd file with the appropriate content. A template is a partially complete file that you fill in with the particular values you want. CFEngine then expands the template. Let us examine what is going on in the example.

  1. First, we must include the CFEngine standard library, because the policy makes use of several bodies and bundles defined in it.

  2. We store in variables the filenames of the template file, and of the actual file to be edited. This is not needed, but it is a good practice to have all user-defined values in a single, differentiated place. We will look at the contents of the template file in a moment. Note that we also define some information that will be included in the template when expanded.

  3. In the files: promise we tell CFEngine, among other things, that the /etc/motd file needs to be created if it doesn’t exist, that it needs to have permissions 644 (or rw-r--r--), belong to user root and group wheel, and that it should be emptied completely before inserting the lines. For these specifications we use two bodies from the standard library, namely:

    body perms mog(mode,user,group)
    {
      owners => { "$(user)" };
      groups => { "$(group)" };
      mode   => "$(mode)";
    }
    body edit_defaults empty
    {
      empty_file_before_editing => "true";
      edit_backup => "false";
      max_file_size => "300000";
    }

    Both of these bodies are very simple. The first one simply passes on the permissions it receives, and the second simply specifies that the file must be emptied before starting the editing. Remember that CFEngine does all the editing in memory and writes results to the disk only if they are different from what was there before, so there are no unnecessary edits of the file, even when empty_file_before_editing is used.

  4. The edit_template attribute is the one that actually specifies how the file will be edited. In previous examples we have seen how to use edit_line to add and delete lines, to search and replace using regular expressions, and to edit field-based files. Now we use yet another file-editing facility provided by CFEngine, that of using templates for specifying the contents of a file.

The value of the edit_template attribute must be the filename of the template to expand, which could contain something like this:

[%CFEngine BEGIN %]   (1)
This system may be accessed by authorized users only.
Use of this system implies acceptance of authorized use policies.
Misuse may be subject to prosecution.

Host: $(sys.fqhost) ($(sys.ipv4))   (2)
This system is managed by CFEngine v$(sys.cf_version)
This file was generated from $(login_banner.template_file)
[%CFEngine END %]
[%CFEngine BEGIN %]
Support for $(login_banner.support).   (3)
[%CFEngine END %]
[%CFEngine Night:: %]   (4)
REMEMBER: Nighttime logins are subject to additional scrutiny.

Let us look at some of the constructs that can be used in a template:

  1. Lines containing the special strings [%CFEngine BEGIN %] and [%CFEngine END %] are used to delimit blocks of text within the template that should not be broken. Normally, each line in the template is translated internally to a separate edit_line promise, but when grouping them in a block, CFEngine ensures that all the lines remain together and without alteration (for example, duplicate empty lines are preserved).

  2. Variables in the template are referenced just as you would in any other string. Keep in mind that all variables in the template must be fully qualified with the name of the bundle in which they are defined, as shown in the reference to the login_banner.template_file variable.

  3. Implicit looping is supported within the template. If a list is referenced inside the template, the line or block containing the reference will be repeated for each value in the list. In this particular case, the entire block is repeated for each value contained in the support list in the login_banner() bundle.

  4. Conditional blocks using class expressions are also supported, by using the special line [%CFEngine classexpression:: %]. Everything that follows that line will be included only if the given class expression is true. In this example, the last line of the template will only be included when the Night class is defined, which CFEngine defines only between midnight and 6AM.

When we execute this policy, the output may look like this:

# cf-agent -f ./login_banner.cf
# cat /etc/motd
This system may be accessed by authorized users only.
Use of this system implies acceptance of authorized use policies.
Misuse may be subject to prosecution.

Host: cfma-10022 (192.168.1.140)
This system is managed by CFEngine v3.4.2
This file was generated from /var/cfengine/templates/motd_template.txt

Support for Solaris: John Doe (x3478).
Support for Linux: Sam Wilson (x7832).
Support for AIX: Steve Clark (x3212).
REMEMBER: Nighttime logins are subject to additional scrutiny.

Template files are a powerful mechanism for generating files using CFEngine. They make it easier to modify the contents of a file without having to touch the policy files that maintain it, and make it easier to understand what the final contents of the file will be without having to untangle the logic of the code.

Password expiration periods

Password expiration is another common requirement of security policies, and one that CFEngine can set and maintain. For example, in Linux systems this is commonly done using the /etc/login.defs file. We can use the following bundle to set these parameters appropriately:

bundle agent password_expiration
{
 vars:
   # Maximum password age
   "logindefs[PASS_MAX_DAYS]" string => "180";   (1)
   # Minimum password age (minimum days between changes)
   "logindefs[PASS_MIN_DAYS]" string => "10";
   # Warning period (in days) before password expires
   "logindefs[PASS_WARN_AGE]" string => "5";

   # Position of each parameter in /etc/shadow
   "fieldnum[PASS_MIN_DAYS]"  string => "4";   (2)
   "fieldnum[PASS_MAX_DAYS]"  string => "5";
   "fieldnum[PASS_WARN_AGE]"  string => "6";

   # List of parameters to modify
   "params" slist => getindices("logindefs");   (3)

   # UIDs below this threshold will not be touched
   "uidthreshold" int => "500";   (4)
   # Additionally, these users and UIDs will not be touched.
   # These are comma-separated lists.
   "skipped_users" string => "vboxadd,nobody";   (5)
   "skipped_uids"  string => "1000,1005";

   # Get list of users, and also generate them in canonified form
   "users" slist => getusers("$(skipped_users)", "$(skipped_uids)");   (6)
   "cusers[$(users)]" string => canonify("$(users)");

 classes:
   # Define classes for users that must not be modified,
   # either by UID threshold or by username
   "skip_$(cusers[$(users)])" expression => islessthan(getuid("$(users)"),  (7)
                                                       "$(uidthreshold)");

 files:
   "/etc/login.defs"   (8)
     handle => "edit_logindefs",
     comment => "Set desired login.defs parameters",
     edit_line => set_config_values("password_expiration.logindefs");

   "/etc/shadow"   (9)
     handle => "edit_shadow_$(params)",
     comment => "Modify $(params) for individual users.",
     edit_defaults => backup_timestamp,
     edit_line => set_user_field("$(users)",
                                 "$(fieldnum[$(params)])",
                                 "$(logindefs[$(params)])"),
     ifvarclass => "!skip_$(cusers[$(users)])";
}

The idea is to set in /etc/login.defs new default values for the minimum and maximum password ages, as well as the warning period to users when the password expiration date is approaching. To ensure consistency, we also edit /etc/shadow to change all user-specific expiration settings to the default value. But we don’t want to blindly change all the user entries, because this would most certainly cause problems by changing the password periods for system accounts such as root, lpd, or daemon. To address this, we include a system for excluding certain users by user ID threshold (all UIDs below a set threshold are ignored), and also by specific usernames and user IDs. This bundle brings together several concepts we have discussed before and introduces a couple of new ideas. Let us look in detail at how it works:

  1. We set the value we want for each of the parameters. The parameter names (the array indices) are the names as they appear in /etc/login.defs. In this case, we want to set a maximum password lifetime of 180 days, a minimum of 10 days between password changes, and a warning period of 5 days before the password expires.

  2. As mentioned, we also need to set these parameters in /etc/shadow for preexisting users. For this reason, we define the field number in which each parameter appears in this file. This will allow us to make the promise that edits /etc/shadow generic as well.

  3. The list variable @(params) holds the list of parameters whose values we want to set, obtained automatically from the logindefs array. Defining this list will allow us to write generic file-editing promises, as we will see in a moment.

  4. We now get to the definition of the exceptions. First we define $(uidthreshold), which contains the minimum User ID for which changes in /etc/shadow will be applied. (In this case, all users with UID smaller than 500 will be skipped. This includes most system and application users.)

  5. Continuing with the exceptions, we define $(skipped_users) and $(skipped_uids), both of which contain comma-separated lists of usernames and user IDs to skip. This is meant to allow more fine-grained control over users whose parameters should not be modified.

    The exception definitions are combined: both users with a UID lower than $(uidthreshold), and those listed in $(skipped_users) or $(skipped_uids), will be skipped when making changes.

  6. We get a list of all the users in the system using the built-in function getusers(). This function returns a list of users and takes two parameters, which allow us to specify lists of users and UIDs that should not be returned, so we use our two variables $(skipped_users) and $(skipped_uids) directly. We store the list of users in the @(users) list variable.

    Additionally, we generate canonified versions of the usernames, and store them in the cusers array, indexed by username. Most usernames should be safe to use in class names, but it’s better to do this conversion anyway to be certain that they will not produce errors.

  7. In the classes: section of the policy, we finally start applying the logic of the policy to decide which users must be skipped. For this, we make use one more time of CFEngine’s implicit looping to create per-user classes named skip_<username>. The class is defined using the built-in function islessthan() to compare the user ID of the current user (the current username is contained in $(users) by the magic of implicit looping, and its user ID is obtained using the getuid() function) against the threshold defined in the $(uidthreshold) variable. The class skip_<username> will be defined for all those users for which the condition is true.

    By this point we have the list of users to edit, the list and values of the parameters to modify, and all the per-user classes that tell us which users need to be skipped. Now we will use these pieces of information to edit /etc/login.defs and /etc/shadow.

  8. We use a files: promise to edit the values in /etc/login.defs. This is a fairly simple promise: we use the set_config_values() bundle just like we did in Editing /etc/sshd_config.

  9. The second files: promise does the editing of /etc/shadow for all users in the system. Note that this promise is parameterized using the $(params) variable, which means that in practice it is evaluated as three promises: one for each element of @(params). Note that we use $(params) even in the handle and comment of the promise, so that we can clearly identify which parameters failed.

    The promise also loops over all the available users, thanks to the reference to the $(users) variable. The ifvarclass attribute indicates that only those users for which the skip_<username> class is not defined will be examined.

    The editing work is done, as usual, by the edit_line attribute. This attribute tells CFEngine that the corresponding field for each parameter (as indicated by $(fieldnum[$(params)])) must be set to the correct value, as stored in $(logindefs[$(params)]). The set_user_field() bundle comes from the standard library.
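
While developing a policy like this one, it helps to check which users would be skipped before letting CFEngine touch /etc/shadow. The following stand-alone sketch reuses the same variables and classes but only prints reports. The bundle name show_skipped_users is hypothetical, and for brevity it applies only the UID threshold (the empty strings passed to getusers() mean that no users are excluded by name or by UID).

```cfengine
body common control
{
    bundlesequence => { "show_skipped_users" };
}

# Hypothetical dry-run bundle: reports the decision for each user, edits nothing
bundle agent show_skipped_users
{
 vars:
   "uidthreshold" int => "500";
   # Empty exclusion lists: return every user on the system
   "users" slist => getusers("", "");
   "cusers[$(users)]" string => canonify("$(users)");

 classes:
   # Same per-user class logic as in password_expiration
   "skip_$(cusers[$(users)])" expression => islessthan(getuid("$(users)"),
                                                       "$(uidthreshold)");

 reports:
   "Would skip $(users)"
     ifvarclass => "skip_$(cusers[$(users)])";
   "Would edit $(users)"
     ifvarclass => "!skip_$(cusers[$(users)])";
}
```

Once the reports match your expectations, the same classes can safely guard the real files: promises.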

Security Scanning

Let us now look at another way to use CFEngine as a security tool. A common strategy is to establish mechanisms to detect unwanted file changes—in fact one of the oldest and most respected security tools, Tripwire, does precisely this, and is the centerpiece of a successful business venture. CFEngine can also perform monitoring for file changes. CFEngine keeps cryptographic hash values for all the files it manages in order to detect changes that may trigger certain actions (for example, the file may need to be re-copied from a remote server, or fixed in some way). The trick is to leverage this database to focus on change detection as the end objective.

Looking at the CFEngine reference manual, we find that there exists a changes attribute to files: promises. It is defined as a body, which means it needs to be declared as an external body part. It looks promising, since it supports the following attributes: hash, report_changes, update_hashes and report_diffs.

The standard library is a good source for learning how to use different CFEngine constructs, and in this case it doesn’t disappoint. Looking for “body changes” definitions, we find the following little gem:

body changes detect_all_change
{
        hash           => "best";
        report_changes => "all";
        update_hashes  => "yes";
}

This seems to be exactly what we need. And indeed, it is all we need if we only want to monitor individual files. For example:

bundle agent monitor_files
{
  vars:
      "files" slist => { "/bin/ls", "/etc/passwd", "/etc/motd" };

  files:
      "$(files)"
        changes => detect_all_change;
}

This simple bundle allows us to define an arbitrary list of files to monitor in the @(files) list, and will produce an alert when one of them changes. The first time you run it, you will see something like this, as CFEngine adds the hashes of the files to its database:

# cf-agent -KI -f ./monitor_one_file.cf
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/bin/ls' was not in 'md5' database - new file found
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/bin/ls' was not in 'sha1' database - new file found
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/etc/passwd' was not in 'md5' database - new file found
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/etc/passwd' was not in 'sha1' database - new file found
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/etc/motd' was not in 'md5' database - new file found
2013-10-14T04:34:48+0000    error: /monitor_files/files/'$(files)':
  File '/etc/motd' was not in 'sha1' database - new file found

Tip

I have switched to using the short form of the cf-agent command-line options (-KI instead of --no-lock --inform) now that you have seen them a few times. I will continue using the short form throughout the rest of the book.

Afterward, if any of the files is modified, CFEngine will produce the appropriate alerts:

# echo "Hello world" >> /etc/motd
# cf-agent -KI -f ./monitor_one_file.cf
2013-10-14T04:36:38+0000    error: Hash 'md5' for '/etc/motd' changed!
2013-10-14T04:36:38+0000    error: /monitor_files/files/'$(files)':
  Updating hash for '/etc/motd' to 'MD5=53d50cd5338eef7f35afb9e5bb1c6972'
2013-10-14T04:36:38+0000    error: Hash 'sha1' for '/etc/motd' changed!
2013-10-14T04:36:38+0000    error: /monitor_files/files/'$(files)':
  Updating hash for '/etc/motd' to 'SHA=62c6f6d8a41279e2f07f4818b8563375413a5818'
2013-10-14T04:36:38+0000   notice: Last modified time for '/etc/motd'
  changed 'Mon Oct 14 04:34:44 2013' -> 'Mon Oct 14 04:36:36 2013'

Each file is checked (and reported) twice because we are using hash => "best", which according to the documentation “correlates the best two available algorithms known in the OpenSSL library.” We could name a single algorithm (e.g. "sha256") to check each file only once.

As written, the detect_all_change body will automatically update the hashes database whenever a change is detected, but changing the value of update_hashes to "no" would prevent this from happening, and it would keep warning you about changes until you update the database.
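
As a sketch of those two variations, a body like the following uses a single hash algorithm and leaves the database untouched when a change is found. The name detect_change_no_update is my own and is not part of the standard library.

```cfengine
# Hypothetical variant of detect_all_change
body changes detect_change_no_update
{
        hash           => "sha256";   # one algorithm, so one report per file
        report_changes => "all";
        update_hashes  => "no";       # keep warning until the database is updated
}
```

You would use it in a files: promise by writing changes => detect_change_no_update in place of detect_all_change.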

More useful in many cases would be the ability to monitor whole directories for unauthorized changes. For this we use the same detect_all_change body, but we add additional attributes to the files: promise that uses it, so that it recurses into the directories we specify:

bundle agent monitor_for_changes
{
  vars:
      "files_dirs" slist => { "/bin", "/etc/passwd", "/etc/motd" };

  files:
      "$(files_dirs)"
        changes => detect_all_change,
        depth_search => recurse("inf");
}

Note that we are combining in the same list both directories and files that we want to monitor. When running this bundle for the first time, you will see how CFEngine populates its database of hashes:

# cf-agent -KI -f ./monitor_for_changes.cf
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/chown' was not in 'md5' database - new file found
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/chown' was not in 'sha1' database - new file found
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/tar' was not in 'md5' database - new file found
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/tar' was not in 'sha1' database - new file found
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/kbd_mode' was not in 'md5' database - new file found
2013-10-14T04:39:19+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/bin/kbd_mode' was not in 'sha1' database - new file found
...
2013-10-14T04:39:20+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/etc/passwd' was not in 'md5' database - new file found
2013-10-14T04:39:20+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/etc/passwd' was not in 'sha1' database - new file found
2013-10-14T04:39:20+0000  warning: depth_search (recursion) is promised for a
  base object '/etc/passwd' that is not a directory
2013-10-14T04:39:20+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/etc/motd' was not in 'md5' database - new file found
2013-10-14T04:39:20+0000    error: /monitor_for_changes/files/'$(files_dirs)':
  File '/etc/motd' was not in 'sha1' database - new file found
2013-10-14T04:39:20+0000  warning: depth_search (recursion) is promised for a
  base object '/etc/motd' that is not a directory

Note the two warning messages in the output: CFEngine tells us that it cannot recurse into files. It will still compute the hashes and monitor the files for changes, but if we want to eliminate these spurious warnings, we can change the bundle to use two lists, one for directories and one for files:

bundle agent monitor_for_changes
{
  vars:
      "dirs"  slist => { "/bin/", "/usr/bin/" };
      "files" slist => { "/etc/passwd", "/etc/motd" };

  files:
      "$(dirs)"
        changes => detect_all_change,
        depth_search => recurse("inf");

      "$(files)"
        changes => detect_all_change;
}

When using a recursive search, CFEngine will detect new files, in addition to file changes:

# touch /bin/blah
# cf-agent -KI -f ./monitor_for_changes.cf
2013-10-14T04:43:59+0000    error: /monitor_for_changes/files/'$(dirs)':
  File '/bin/blah' was not in 'md5' database - new file found
2013-10-14T04:43:59+0000    error: /monitor_for_changes/files/'$(dirs)':
  File '/bin/blah' was not in 'sha1' database - new file found

Warning

Unfortunately, as of 3.5.2, there is a bug in CFEngine which prevents it from detecting when a file in a monitored directory is deleted unless you specify update_hashes as “no” in the detect_all_change body. In this case, if a file disappears, you will see a message like this:

2013-10-14T04:51:48+0000    error: /monitor_for_changes:
  File '/bin/blah' no longer exists

The weak point of any file-change monitoring solution such as the one described above, or Tripwire, is the hashes database. If an attacker manages to modify the database, he can update it with the new hash values of any files he modifies, and those changes will be neither detected nor reported.

One way in which CFEngine can help to solve this problem is by performing distributed monitoring of the hash database. CFEngine is able to automatically and transparently distribute the monitoring among groups of hosts so that if the hash database is modified in any one of them, a group of others will detect the change and report it. The idea is that an attacker might modify the database in a single host, but if that database is replicated across several other hosts, it’s very unlikely that the attacker will be able to modify all those copies simultaneously.

For this, we again use CFEngine’s file-comparison abilities, coupled with its ability to automatically determine groups of hosts. The peers() function allows us to break a list of hosts into subsets of arbitrary size, and allows each host to determine its “neighbors” in the group. Using this capability, we can instruct hosts to cross-copy the database file among themselves. For example:

bundle agent neighborhood_watch
{
 vars:
   "neighbors" slist => peers("/var/cfengine/inputs/hostlist","#.*",4),   (1)
     comment => "Get my neighbors from a list of all hosts";
 files:
   "$(sys.workdir)/nw/$(neighbors)_checksum_digests.tcdb"   (2)
     comment => "Watch our peers remote hash tables!",
     copy_from => remote_cp("$(sys.workdir)/checksum_digests.tcdb",
                            "$(neighbors)"),   (3)
     action => neighbor_report("File changes observed on $(neighbors)"),   (4)
     depends_on => { "grant_hash_tables" };   (5)
}

body action neighbor_report(msg)
{
     ifelapsed => "30";
     log_string => "$(msg)";
}

Let’s examine how this works.

  1. We assume each client has a list of all hosts in the network stored at /var/cfengine/inputs/hostlist. This file could be generated by the policy hub using the hostsseen() function (left as an exercise to the reader), and then copied using CFEngine to all other hosts. The peers() function splits this list into chunks of the given size (4 hosts per group in this case), and assigns into the @(neighbors) list the list of the peers of the current host. In each host, peers() will determine the group to which the current host belongs, and then return all the hosts in that group, except the current one.

  2. The files: promise will repeat for each one of the neighbors using implicit looping, and will copy their hash database into a local file under /var/cfengine/nw/, named after the corresponding host name.

  3. The file to be copied from each neighbor is /var/cfengine/checksum_digests.tcdb. (This filename may change depending on the database engine that CFEngine is using.)

  4. We determine that the action to be taken when the promise is repaired is to generate a log message about it, indicating the host in which the discrepancy was found.

    Let’s analyze for a moment the behavior of this code. It’s a simple file-copy operation, like the ones we use to copy updated policies from the policy hub into the clients. However, in this case we are dealing with a file that should very rarely change, so whenever it changes, it’s a noteworthy event. When the hash database is modified in any of the neighbors, the other neighbors will notice this change and re-copy the file to their local disk. The promise is marked as repaired, and a message is generated.

  5. The correct execution of this neighborhood-watch technique depends on being able to copy the hash databases among neighbors. For this reason, we make this promise dependent on another promise that sets the appropriate access rules for the file, and which must be defined in a bundle of type server:

    bundle server access_rules()
    {
     vars:
       # List here the IP masks that we grant access to on the server
       "acl" slist => {
                       "$(sys.ipv4)/24",
                       "128.39.89.233",
                       "2001:700:700:3.*"
       },
         comment => "Define an acl for the machines to be granted accesses",
         handle => "common_def_vars_acl";
    
     access:
       "/var/cfengine/checksum_digests.tcdb"
         handle  => "grant_hash_tables",
         admit   => { @(acl) },
         maproot => { @(acl) };
    }

    Bundles of this type define the behavior of the cf-serverd process and, among other things, define which machines can access which files through it. The cf-serverd process running on each machine is the one that will provide access to the /var/cfengine/checksum_digests.tcdb file so that neighbors can copy it as described before. For this to work, we are using an access: promise to specify who can read this file. The admit attribute indicates which IP addresses will have permission to access the file, and the maproot attribute indicates which machines are allowed to read it with root privileges, which is needed because the hash database is owned by root. We set both of these attributes to the value of the @(acl) list, which we define in the vars: section. The first value in @(acl) is "$(sys.ipv4)/24", which indicates that we want the whole class-C network segment (/24) in which the machine is located ($(sys.ipv4) contains the current IP address) to have access.[3] We also specify, for the sake of example, two individual IP addresses (one IPv4, one IPv6) as part of @(acl).
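
To check how peers() partitions your host list before relying on it, a small sketch like the following prints the neighbors that the current host would watch. The bundle name show_neighbors is hypothetical, and the host list path is the one assumed above.

```cfengine
body common control
{
    bundlesequence => { "show_neighbors" };
}

# Hypothetical helper: reports the peers of the current host, copies nothing
bundle agent show_neighbors
{
 vars:
   # Same call as in neighborhood_watch: groups of 4 hosts,
   # and text matching "#.*" in the host list is treated as a comment
   "neighbors" slist => peers("/var/cfengine/inputs/hostlist", "#.*", 4);

 reports:
   "$(sys.fqhost) watches neighbor $(neighbors)";
}
```

Running this on a few hosts should confirm that each one sees the other members of its group.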

Using this technique, we can have a self-maintaining, self-protecting system for monitoring file changes. We can add more hosts into the peer groups to increase security (by increasing the number from 4 to whatever we need), at the expense of additional file copy operations among the hosts.

Additional CFEngine Features and Information

In this chapter we have seen a number of examples of CFEngine policy, through which we have explored many of the CFEngine language features and abilities. Of course, this is only the beginning, and I cannot possibly show you examples of all the useful features of the CFEngine policy language. In this section I will give you some pointers to some of those features, for you to explore on your own.

Editing XML files

The edit_xml bundle type allows you to specify instructions for editing XML files, analogous to the way edit_line bundles allow you to specify editing operations on plain text files. With edit_xml you can perform XPath-based selection and editing of an XML file to insert, delete, and modify arbitrary nodes and attributes in the file.
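
As a minimal sketch of the idea, the following policy sets one attribute on a node selected by an XPath expression. The file /tmp/server.xml, the bundle names, and the XPath are assumptions for illustration; the sketch also assumes that the file already exists and contains a matching node, and that your CFEngine binaries were built with XML (libxml2) support.

```cfengine
body common control
{
    bundlesequence => { "xml_example" };
}

# Hypothetical example: the file and node names are made up
bundle agent xml_example
{
 files:
   "/tmp/server.xml"
     comment => "Set the name attribute of the Host node",
     edit_xml => set_host_name("cfe_host");
}

bundle edit_xml set_host_name(name)
{
 set_attribute:
   # The promiser is the attribute to set on the selected node
   "name"
     attribute_value => "$(name)",
     select_xpath => "/Server/Service/Engine/Host";
}
```

The structure mirrors edit_line: a files: promise names the file, and the edit_xml bundle describes the desired state of its contents.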

Managing virtual machines

CFEngine allows you to manage virtual machines through guest_environments: promises. If CFEngine was compiled with libvirt support, these promises allow you to interface with Xen, KVM, VMware and other virtualization solutions.

Managing databases

databases: promises allow you to interact with PostgreSQL and MySQL databases (and, in CFEngine Enterprise, with LDAP and Windows Registry) to manage their structure and contents.

CFEngine functions

The CFEngine language includes many functions that allow you to manipulate or obtain data or system information. As with any language, having a mental overview of the types of functions available will help you while writing policies, and just reading through the list may trigger ideas of things you could do with CFEngine to better manage your systems.

Tip

CFEngine 3.5.0 introduced a large number of new functions for data and class manipulation, including sublist(), uniq(), filter(), format(), filestat(), classesmatching(), ifelse(), and many others.

As you work more with CFEngine, you will discover new features and new ways of doing things. Now that you have read through this chapter, I encourage you to go back and look at the list of resources in CFEngine Information Resources. Many of them will probably make much more sense now.


1. The concept of “verify” is dependent on the package manager, and some package_method bodies do not support it.
2. This policy was originally written by Aleksey Tsalolikhin. It is available in the CFEngine Design Center and is used with permission.
3. This is just an example. You would of course need to adapt it according to the specifics of your network.