External Tools and the Puppet Ecosystem

Introduction to Puppet Facts

Puppet is a useful tool by itself, but you can get much greater benefits from using Puppet in combination with other tools and frameworks. We’ll look at some ways of getting data into Puppet, including custom Facter facts, external facts, and Hiera databases, and tools to generate Puppet manifests automatically from existing configuration.

You’ll also learn how to extend Puppet by creating your own custom functions, resource types, and providers; how to use an external node classifier script to integrate Puppet with other parts of your infrastructure; how to use public modules from Puppet Forge; and how to test your code with rspec-puppet.

Creating custom facts

While Facter’s built-in facts are useful, it’s actually quite easy to add your own facts. For example, if you have machines in different data centers or hosting providers, you could add a custom fact for this so that Puppet can determine if any local settings need to be applied (for example, local DNS servers or network routes).

How to do it…

Here’s an example of a simple custom fact:

Run the following command:

ubuntu@cookbook:~/puppet$ mkdir -p modules/facts/lib/facter

Create the file modules/facts/lib/facter/hello.rb with the following contents:
```
Facter.add(:hello) do
  setcode do
    "Hello, world"
  end
end
```
Modify your manifests/nodes.pp file as follows:
```
node 'cookbook' {
  notify { $::hello: }
}
```

Run Puppet:

ubuntu@cookbook:~/puppet$ papply
Notice: Hello, world
Notice: /Stage[main]//Node[cookbook]/Notify[Hello, world]/message: defined 'message' as 'Hello, world'
Notice: Finished catalog run in 0.24 seconds

How it works…

The built-in facts in Facter are defined in the same way as the custom fact that we just created. This architecture makes it very easy to add or modify facts, and provides a standard way for you to read information about the host into your Puppet manifests.

Facts can contain any Ruby code, and the last value evaluated inside the setcode do … end block will be the value returned by the fact. For example, you could make a more useful fact that returns the number of users currently logged in:

Facter.add(:users) do
  setcode do
    %x{/usr/bin/who | wc -l}.chomp
  end
end

To reference the fact in your manifests, just use its name like a built-in fact:

notify { "${::users} users logged in": }
Notice: 2 users logged in

You can add custom facts to any Puppet module. You might like to create a dedicated facts module to contain them, as in the example, or just add facts to whichever existing module seems appropriate.

There’s more…

You can extend the use of facts to build a completely nodeless Puppet configuration; in other words, Puppet can decide what resources to apply to a machine, based solely on the results of facts. Jordan Sissel has written about this approach at:

http://www.semicomplete.com/blog/geekery/puppet-nodeless-configuration.html

You can find out more about custom facts, including how to make sure that OS-specific facts work only on the relevant systems, and how to weight facts so that they’re evaluated in a specific order, at the Puppet Labs website:

http://docs.puppetlabs.com/guides/custom_facts.html
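
As a rough illustration of the OS-specific case, the sketch below restricts the users fact from this recipe to Linux hosts with Facter's confine method. The helper name logged_in_sessions is invented for this example, and the defined?(Facter) guard is only there so that the file can also be run as a plain Ruby script. Treat it as a starting point under those assumptions rather than a reference implementation.

```ruby
# Hypothetical helper: count the non-blank lines in `who` output,
# which prints one line per login session.
def logged_in_sessions(who_output)
  who_output.each_line.count { |line| !line.strip.empty? }
end

# Register the fact only when this file is loaded by Facter itself.
if defined?(Facter)
  Facter.add(:users) do
    # Resolve this fact only on hosts whose kernel fact is 'Linux'.
    confine :kernel => 'Linux'
    setcode do
      logged_in_sessions(%x{/usr/bin/who}).to_s
    end
  end
end
```

Keeping the counting logic in a plain method means it can be checked without Facter installed; on hosts that do not match the confine condition, the fact is simply not resolved.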

Adding external facts

The Creating custom facts recipe describes how to add extra facts to Puppet for use in manifests, but these won’t show up in the command-line version of Facter. If you want to make your facts available to both Facter and Puppet, you can create external facts instead.

External facts live in the /etc/facter/facts.d directory, and have a simple key=value format, like this:

message="Hello, world"

Getting ready…

Here’s what you need to do to prepare your system for adding external facts:

You’ll need at least Facter 1.7 to use external facts, so run this command to check your Facter version:
```
ubuntu@cookbook:~$ facter -v
1.7.1
```
If your version is pre-1.7, you can install a more recent Facter version from the Puppet Labs APT repo. If you haven’t already configured your system to use this repo, do that first. Then run the following commands:
```
ubuntu@cookbook:~$ sudo apt-get update
ubuntu@cookbook:~$ sudo apt-get install facter
```
You’ll also need to create the external facts directory, using the following command:
```
ubuntu@cookbook:~$ sudo mkdir -p /etc/facter/facts.d
```

How to do it…

In this example we’ll create a simple external fact that returns a fixed value, much like the fact in the Creating custom facts recipe.

Create the file /etc/facter/facts.d/myfacts.txt with the following contents:
```
theanswer=42
```
Run the following command:
```
ubuntu@cookbook:~$ facter theanswer
42
```

Well, that was easy! You can add more facts to the same file, or other files, of course:

theanswer=42
world_population='7 billion'
breakfast=johnnycakes

But what if you need to compute a fact in some way; for example, the number of logged-in users? You can create executable facts to do this.

Create the file /etc/facter/facts.d/users.sh with the following contents:
```
#!/bin/sh
echo users=`who |wc -l`
```

Make this file executable with the following command:

ubuntu@cookbook:~$ sudo chmod a+x /etc/facter/facts.d/users.sh

Now check the fact value with the following command:
```
ubuntu@cookbook:~$ facter users
1
```

How it works…

Facter will look in /etc/facter/facts.d and parse any non-executable files there, as lists of key=value pairs, as in the myfacts.txt example. If it finds a key matching the one you requested, it will print the associated value:
```
ubuntu@cookbook:~$ facter theanswer
42
```
In the case of executable files, Facter will assume that their output is a list of key=value pairs. It will run every executable file in the facts.d directory and search the output for the requested key.
In the users example, Facter will execute the users.sh script, which results in the following output:
```
users=1
```
It will then search this output for users and return the matching value:
```
ubuntu@cookbook:~$ facter users
1
```
If there are multiple matches for the key you specified, Facter will return the one parsed first, which is generally the one from the file whose name is alphanumerically first.
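
The parsing behavior described above can be approximated in a few lines of Ruby. This is a sketch for illustration only, not Facter's own code: the function name parse_key_value_facts is made up, and the "first occurrence wins" rule simply mirrors the description in the previous paragraph.

```ruby
# Turn key=value text into a hash of facts.
# Lines without an '=' are ignored, and the first value seen for a key is kept.
def parse_key_value_facts(text)
  facts = {}
  text.each_line do |line|
    # Split on the first '=' only, so values may themselves contain '='.
    key, value = line.chomp.split('=', 2)
    next if key.nil? || value.nil? || key.strip.empty?
    facts[key.strip] ||= value.strip
  end
  facts
end

facts = parse_key_value_facts("theanswer=42\nbreakfast=johnnycakes\n")
puts facts['theanswer']  # prints 42
```

The same function can be applied to the contents of a static text file or to the output of an executable fact such as users.sh.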

There’s more…

You’re not limited to using the key=value format for text facts; you can also use YAML or JSON format, and Facter will detect the format based on the file extension. For example, here are some facts in YAML format:

---
robin: Erithacus rubecula
bluetit: Cyanistes caeruleus
blackbird: Turdus merula

And here are the same facts in JSON format:

{
  "robin": "Erithacus rubecula",
  "bluetit": "Cyanistes caeruleus",
  "blackbird": "Turdus merula"
}

Be careful with the file extension; for YAML files the extension must be .yaml (.yml won’t work, for example). JSON files should have a .json extension.

For executable facts, however, the output has to be in key=value format. Facter can’t auto-detect the format of executable facts (yet).
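
To make the idea of extension-based detection concrete, here is a small Ruby sketch that picks a parser from the file name and shows that the YAML and JSON examples describe the same data. The function name parse_structured_facts and the file names are assumptions for this example, and the code only imitates the behavior described above using Ruby's standard yaml and json libraries.

```ruby
require 'json'
require 'yaml'

# Choose a parser from the file extension. Only the two structured formats
# are handled here; any other extension (including .yml) yields an empty hash.
def parse_structured_facts(filename, content)
  case File.extname(filename)
  when '.yaml' then YAML.safe_load(content) || {}
  when '.json' then JSON.parse(content)
  else {}
  end
end

yaml_facts = parse_structured_facts('birds.yaml', "---\nrobin: Erithacus rubecula\nblackbird: Turdus merula\n")
json_facts = parse_structured_facts('birds.json', '{"robin": "Erithacus rubecula", "blackbird": "Turdus merula"}')
puts yaml_facts == json_facts  # prints true
```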

Debugging external facts

If you’re having trouble getting Facter to recognize your external facts, run Facter in debug mode to see what’s happening:

ubuntu@cookbook:~/puppet$ facter -d robin
Fact file /etc/facter/facts.d/myfacts.json was parsed but returned an
empty data set

The X was parsed but returned an empty data set error means Facter didn’t find any key=value pairs in the file or (in the case of an executable fact) in its output.

Note that if you have external facts present, Facter parses or runs all the facts in the /etc/facter/facts.d directory every time you query Facter. If some of these scripts take a long time to run, that can significantly slow down anything that uses Facter (run Facter with the --timing switch to troubleshoot this). Unless a particular fact needs to be recomputed every time it’s queried, consider replacing it with a cron job that computes it every so often and writes the result to a text file in the Facter directory.
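
One way to implement the cron-job approach is a small script that writes the computed value to a static fact file. The Ruby sketch below is one possible shape for it: the function name write_cached_fact and the .tmp suffix are assumptions, and how Facter treats a briefly present file with an unrecognized extension may vary between versions.

```ruby
# Write a single key=value fact to <dir>/<name>.txt.
# The data goes to a temporary file first and is then renamed into place,
# so a reader never sees a half-written fact file.
def write_cached_fact(dir, name, value)
  path = File.join(dir, "#{name}.txt")
  tmp_path = "#{path}.tmp"  # hypothetical temporary name
  File.write(tmp_path, "#{name}=#{value}\n")
  File.rename(tmp_path, path)
  path
end

# A cron job running as root could then call, for example:
#   write_cached_fact('/etc/facter/facts.d', 'users', `who`.lines.size)
```

Because the result is a plain text file, querying the fact no longer runs the slow command; the value is only as fresh as the last cron run.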

Using external facts in Puppet

Any external facts you create will be available to both Facter and Puppet. To reference external facts in your Puppet manifests, just use the fact name in the same way you would for a built-in or custom fact:

notify { "There are $::users people logged in right now.": }

Make sure you don’t create facts with the same name as an existing or built-in fact, or strange and probably unwelcome things may happen.

Setting facts as environment variables

Another handy way to get information into Puppet and Facter is to pass it in using environment variables. Any environment variable whose name starts with FACTER_ will be interpreted as a fact. For example, try the following command:

ubuntu@cookbook:~/puppet$ FACTER_moonphase=full facter moonphase
full

It works just as well with Puppet, so let’s run through an example.

How to do it…

Follow these steps to see how to set facts using environment variables:

Modify your manifests/nodes.pp file as follows:

node 'cookbook' {
  notify { "The moon is $::moonphase": }
}

Run the following command:

ubuntu@cookbook:~/puppet$ FACTER_moonphase="waxing crescent" puppet apply manifests/site.pp
Notice: The moon is waxing crescent
Notice: /Stage[main]//Node[cookbook]/Notify[The moon is waxing crescent]/message: defined 'message' as 'The moon is waxing crescent'
Notice: Finished catalog run in 0.06 seconds
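
The FACTER_ prefix convention described in this recipe can be pictured as a simple filter over the environment. The following Ruby sketch is illustrative only and is not how Facter is implemented internally; the function name facts_from_environment and the constant FACTER_ENV_PREFIX are invented for the example.

```ruby
FACTER_ENV_PREFIX = 'FACTER_'.freeze

# Collect every variable whose name starts with FACTER_ and strip the prefix
# to obtain the fact name. `env` defaults to the real process environment,
# but any hash of strings can be passed in.
def facts_from_environment(env = ENV)
  env.each_with_object({}) do |(name, value), facts|
    next unless name.start_with?(FACTER_ENV_PREFIX) && name.length > FACTER_ENV_PREFIX.length
    facts[name[FACTER_ENV_PREFIX.length..-1]] = value
  end
end

sample = { 'FACTER_moonphase' => 'waxing crescent', 'HOME' => '/home/ubuntu' }
puts facts_from_environment(sample)['moonphase']  # prints waxing crescent
```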

Importing configuration data with Hiera

A key principle of good programming is to separate data and code. Many Puppet manifests are full of site-specific data that makes it hard to share and re-use the manifests. Grouping all such data into structured text files and moving it outside the Puppet manifests makes it easier to maintain, as well as easier to re-use and share code with other people.

Ideally, we’d not only be able to look up configuration parameters from a data file, but also choose a different data file depending on things like the environment, the operating system, and other facts about the machine. We’d also like to be able to organize the data hierarchically so that we can override certain values with other, higher-priority values.

Puppet has a mechanism named Hiera (as in ‘hierarchy’) to do just this. The hiera function lets you look up configuration data from within your manifest, and Hiera takes care of returning the correct value for the current environment.

Getting ready…

In this example we’ll see how to set up a minimal Hiera data file, and read some information out of it. First, we need to configure Puppet to retrieve data from Hiera. Follow these steps:

Create the file hiera.yaml in your Puppet directory with the following contents:
```
:hierarchy:
  - common
:backends:
  - yaml
:yaml:
  :datadir: '/home/ubuntu/puppet/data'
```
Create the data directory in your Puppet directory:
```
ubuntu@cookbook:~/puppet$ mkdir data
```

Modify your modules/puppet/files/papply.sh script as follows (the sudo puppet apply command is all on one line):

#!/bin/sh
sudo puppet apply /home/ubuntu/puppet/manifests/site.pp --modulepath=/home/ubuntu/puppet/modules/ --hiera_config=/home/ubuntu/puppet/hiera.yaml $*

Make sure that your node includes the puppet module, as follows:
```
node 'cookbook' {
  include puppet
}
```

Run Puppet to update the papply script:

ubuntu@cookbook:~/puppet$ papply
Notice: /Stage[main]/Puppet/File[/usr/local/bin/papply]/content: content changed '{md5}171896840d39664c00909eb8cf47a53c' to '{md5}6d104081905bcb5e1611ac9b6ae6d3b9'
Notice: Finished catalog run in 0.26 seconds

How to do it…

We’ve now enabled Hiera, so we’re ready to use it. Follow these steps to see how to read Hiera data into your Puppet manifest.

Create the file data/common.yaml with the following contents:
```
magic_word : 'xyzzy'
```

Add the following to your manifest:

$message = hiera('magic_word')
notify { $message: }

Run Puppet:

ubuntu@cookbook:~/puppet$ papply
Notice: xyzzy

How it works…

When you look up a bit of data using the hiera function, Hiera has to do several things. First, it has to find its data directory, which is set in hiera.yaml here as /home/ubuntu/puppet/data.
Then it needs to know which data file to read. This is determined by the hierarchy setting, which in our example simply contains common. So Hiera will read common.yaml and look for the parameter magic_word. When it finds it, it will read the value (xyzzy) and return it to the manifest.
Finally, this value is stored in the $message variable and printed out for our enjoyment.

There’s more…

Hiera is a very powerful mechanism for importing configuration data and we don’t have space to explore all its possibilities here. But let’s look at a simple improvement to our Hiera setup: adding node-specific settings.

Setting node-specific data with Hiera

In order to have Hiera read different data files depending on the name of the node, we need to make a couple of changes to the configuration in the previous example, as follows:

Modify your hiera.yaml file as follows:

:hierarchy:
  - "%{hostname}"
  - common
:backends:
  - yaml
:yaml:
  :datadir: '/home/ubuntu/puppet/data'

Create the file data/cookbook.yaml (if your machine is named cookbook; otherwise, name it after your hostname) with the following contents:
```
greeting : "Hello, I'm the cookbook node!"
```
Modify your data/common.yaml file as follows:
```
greeting : "Hello, I'm some other node!"
```

Modify your manifests/nodes.pp file as follows:

node 'cookbook', 'cookbook2' {
  $message = hiera('greeting')
  notify { $message: }
}

Run Puppet:

ubuntu@cookbook:~/puppet$ papply
Notice: Hello, I'm the cookbook node!

What’s happening here? Recall that Hiera uses the hierarchy setting to determine which data files to search, and in which order. In our example, the first value of hierarchy is:

%{hostname}

This tells Hiera to look for a .yaml file matching the machine’s hostname. If it can’t find one, it will go on to the next value in the list:

common

So for all calls to the hiera function, Hiera will first look for data/cookbook.yaml (or whatever the hostname is), and then data/common.yaml. This allows us to set certain bits of data which are specific to a named node, in this case, cookbook:

greeting : "Hello, I'm the cookbook node!"

If Hiera can’t find a data file matching the machine’s hostname, it will look in common.yaml and use the value of the greeting parameter from that file. We can test this by temporarily setting the machine’s hostname to something different as follows:

ubuntu@cookbook:~/puppet$ sudo hostname cookbook2
ubuntu@cookbook:~/puppet$ papply
sudo: unable to resolve host cookbook2
dnsdomainname: Name or service not known
dnsdomainname: Name or service not known
dnsdomainname: Name or service not known
Notice: Hello, I'm some other node!
Notice: /Stage[main]//Node[cookbook2]/Notify[Hello, I'm some other node!]/message: defined 'message' as 'Hello, I'm some other node!'
Notice: Finished catalog run in 0.08 seconds
ubuntu@cookbook:~/puppet$ sudo hostname cookbook
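
The fallback behavior demonstrated above can be modeled in plain Ruby. The sketch below is a simplified stand-in for Hiera's lookup, not its real implementation: the function name hiera_lookup, its parameters, and the temporary data directory are all assumptions made for the example. It interpolates %{...} tokens from a hash of facts, skips data files that are missing or lack the key, and returns the first match.

```ruby
require 'tmpdir'
require 'yaml'

# Walk the hierarchy from highest to lowest priority and return the first
# value found for `key`, or nil if no data file provides it.
def hiera_lookup(key, hierarchy, datadir, scope)
  hierarchy.each do |level|
    # Replace %{name} or %{::name} with the matching value from scope.
    source = level.gsub(/%\{(?:::)?(\w+)\}/) { scope.fetch(Regexp.last_match(1), '') }
    path = File.join(datadir, "#{source}.yaml")
    next unless File.exist?(path)
    data = YAML.safe_load(File.read(path)) || {}
    return data[key] if data.key?(key)
  end
  nil
end

Dir.mktmpdir do |datadir|
  File.write(File.join(datadir, 'cookbook.yaml'), "greeting: \"Hello, I'm the cookbook node!\"\n")
  File.write(File.join(datadir, 'common.yaml'), "greeting: \"Hello, I'm some other node!\"\n")
  hierarchy = ['%{hostname}', 'common']
  puts hiera_lookup('greeting', hierarchy, datadir, { 'hostname' => 'cookbook' })
  puts hiera_lookup('greeting', hierarchy, datadir, { 'hostname' => 'cookbook2' })
end
```

Run as a script, this prints the cookbook greeting first and the fallback greeting from common.yaml second, matching the two Puppet runs shown above.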

The values you specify for hierarchy can include Facter facts (as in the hostname example) or even Puppet variables. For example, if you had a variable $::vagrant which is true on a Vagrant virtual machine and false otherwise, you could specify:

:hierarchy:
- vagrant_%{::vagrant}

Depending on the value of $::vagrant, Hiera will either look for the data in vagrant_true.yaml or vagrant_false.yaml.

This is a great way to modify the node’s configuration depending on some external factor without having to use a lot of conditional statements and selectors in your Puppet code. Plus, you can store and version the Hiera data separately from your Puppet manifests (and for production use, this is a good idea; it means you can use different data depending on the Puppet environment setting).

Looking up data with Hiera

There are several ways to look up config data using Hiera: the hiera function, as we saw in the example, simply returns the first match found in the hierarchy for the specified key. However, you can also use hiera_array and hiera_hash to return multiple values. With these functions, Hiera will return all the matches found, including those at different levels of the hierarchy.
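
The difference between the three lookup styles is easier to see on sample data. The Ruby sketch below imitates them on in-memory hashes, ordered from highest to lowest priority; it is an illustration of the behavior described above rather than Hiera's code, and the function names and sample keys (ntp_servers, users) are invented.

```ruby
# Like hiera: the value from the first level that defines the key.
def first_match(key, levels)
  level = levels.find { |data| data.key?(key) }
  level && level[key]
end

# Like hiera_array: every match from every level, flattened, without duplicates.
def array_merge(key, levels)
  levels.select { |data| data.key?(key) }.map { |data| data[key] }.flatten.uniq
end

# Like hiera_hash: hashes from all levels merged, higher-priority keys winning.
def hash_merge(key, levels)
  levels.reverse.reduce({}) { |merged, data| merged.merge(data.fetch(key, {})) }
end

levels = [
  { 'ntp_servers' => ['10.0.0.1'], 'users' => { 'alice' => 'admin' } },
  { 'ntp_servers' => ['0.pool.ntp.org', '10.0.0.1'], 'users' => { 'alice' => 'dev', 'bob' => 'dev' } }
]

puts first_match('ntp_servers', levels).join(', ')  # prints 10.0.0.1
puts array_merge('ntp_servers', levels).join(', ')  # prints 10.0.0.1, 0.pool.ntp.org
puts hash_merge('users', levels)['alice']           # prints admin
```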

Storing secret data with hiera-gpg

If you’re using Hiera to store your configuration data, there’s a gem available called hiera-gpg which adds an encryption backend to Hiera, so that secret values can be kept encrypted in your data files.

Getting ready…

To set up hiera-gpg, follow these steps:

Run this command to install hiera-gpg:

ubuntu@cookbook:~$ sudo gem install hiera-gpg --no-ri --no-rdoc
Fetching: json_pure-1.8.0.gem (100%)
Fetching: hiera-1.2.1.gem (100%)
Fetching: gpgme-2.0.2.gem (100%)
Building native extensions. This could take a while...
Fetching: hiera-gpg-1.1.0.gem (100%)
Successfully installed json_pure-1.8.0
Successfully installed hiera-1.2.1
Successfully installed gpgme-2.0.2
Successfully installed hiera-gpg-1.1.0
4 gems installed

Modify your hiera.yaml file as follows:

:hierarchy:
  - secret
  - common
:backends:
  - yaml
  - gpg
:yaml:
  :datadir: '/home/ubuntu/puppet/data'
:gpg:
  :datadir: '/home/ubuntu/puppet/data'

How to do it…

In this example we’ll create a piece of encrypted data and retrieve it using hiera-gpg.

Create the file data/secret.yaml with the following contents:
```
top_secret: 'xyzzy'
```
Encrypt the secret.yaml file to your GnuPG key using the following commands (replace john@bitfieldconsulting.com with the e-mail address you specified when creating the key; this assumes you have already generated a key pair). This will create the file secret.gpg:
```
ubuntu@cookbook:~/puppet$ cd data
ubuntu@cookbook:~/puppet/data$ gpg -e -o secret.gpg -r john@bitfieldconsulting.com secret.yaml
```

Remove the plaintext secret.yaml file:

ubuntu@cookbook:~/puppet/data$ rm secret.yaml

Modify your manifests/nodes.pp file as follows:

node 'cookbook' {
  $message = hiera('top_secret')
  notify { $message: }
}

Now run Puppet:

ubuntu@cookbook:~/puppet$ papply
Notice: xyzzy
Notice: /Stage[main]//Node[cookbook]/Notify[xyzzy]/message: defined 'message' as 'xyzzy'
Notice: Finished catalog run in 0.29 seconds

How it works…

When you install hiera-gpg, it gives Hiera the ability to decrypt .gpg files. You can therefore put any secret data into a .yaml file and encrypt it to the appropriate key with GnuPG. Only machines that have the matching secret key will be able to read this data.

For example, you might encrypt the MySQL root password using hiera-gpg and install the corresponding key only on your database servers. Although other machines may also have a copy of the secret.gpg file, it’s not readable to them unless they have the decryption key.

There’s more…

You might also like to know about hiera-eyaml, another secret-data backend for Hiera that supports encryption of individual values within a Hiera data file. This could be handy if you need to mix encrypted and unencrypted values within a single file. Find out more about hiera-eyaml here:

https://github.com/TomPoulton/hiera-eyaml

Summary

In this article we learned how to create custom facts, add external facts, set facts as environment variables, import configuration data with Hiera, and store secret data with hiera-gpg.