Module, Facts, Types and Reporting tools in Puppet


Module Files and Templates

Transferring files with puppet is best done within modules. When you define a file resource, you can either supply the contents inline with content => 'something' or push a file from the puppet master using source. As an example, using our judy database, we could have judy::config with the following file definition:

class judy::config {
  file { '/etc/judy/judy.conf':
    source => 'puppet:///modules/judy/judy.conf',
  }
}

Puppet will now search for this file in the directory judy/files. It is also possible to add full paths and have your module mimic the filesystem: the previous source line would change to source => 'puppet:///modules/judy/etc/judy/judy.conf', and the file would be found in judy/files/etc/judy/judy.conf.

The puppet:/// URL given previously has three forward slashes; optionally, the name of a puppet server may appear between the second and third slash. If it is left blank, the puppet server performing catalog compilation is used to retrieve the file. You can alternatively specify the server by placing its name after puppet:// in the source.

Having files come from specific puppet servers can make maintenance difficult. If you change the name of your puppet server, you have to change all references to that name as well.
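
The mapping from a source URL to a server and a location in the module can be sketched in a few lines of ruby. This is only an illustration of the rule described above, not how puppet itself resolves sources; resolve_source is a hypothetical helper name.

```ruby
# Sketch: split a puppet:// source URL into a server and a path below the
# modulepath. resolve_source is a hypothetical helper, not part of puppet.
SOURCE_PATTERN = %r{\Apuppet://([^/]*)/modules/([^/]+)/(.+)\z}

def resolve_source(url)
  match = SOURCE_PATTERN.match(url)
  return nil unless match

  server, mod, rest = match.captures
  {
    # An empty server part means the master that compiled the catalog is used.
    server: server.empty? ? nil : server,
    # Files for a module live in <module>/files below the modulepath.
    path: File.join(mod, 'files', rest)
  }
end

p resolve_source('puppet:///modules/judy/judy.conf')
p resolve_source('puppet:///modules/judy/etc/judy/judy.conf')
```

Both calls return a nil server, because nothing appears between the second and third slash.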

Templates are searched in a similar fashion, in this example in judy/templates. When specifying the template, you use content => template('judy/template.erb') to have puppet look for the template in your module’s templates directory. As an example, another config file for judy could be:

file { '/etc/judy/judyadm.conf':
  content => template('judy/judyadm.conf.erb'),
}

Puppet will look for the file 'judy/judyadm.conf.erb' in modulepath/judy/templates/judyadm.conf.erb. We haven’t covered ruby templates up to this point; templates are files that are parsed according to ERB syntax rules. If you need to distribute a file in which some settings change based on variables, then a template is the thing to use. ERB syntax is covered in detail in the puppet documentation.
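
To get a feel for ERB outside of puppet, a template can be rendered with ruby’s own erb library. The sketch below is not how puppet evaluates templates internally; the setting names (@port, @admins) and the file contents are hypothetical. It references values as @name, as a puppet template does, and assumes a ruby version recent enough to accept the trim_mode keyword.

```ruby
require 'erb'

# Hypothetical settings; in a puppet template these would be variables from
# the class that declares the file resource, referenced as @name.
class JudyAdmSettings
  def initialize(port, admins)
    @port = port
    @admins = admins
  end

  def render(template)
    # trim_mode '-' lets "-%>" suppress the newline after a tag.
    ERB.new(template, trim_mode: '-').result(binding)
  end
end

TEMPLATE = <<~ERB_TEMPLATE
  # judyadm.conf (illustrative contents)
  port = <%= @port %>
  <% @admins.each do |admin| -%>
  admin = <%= admin %>
  <% end -%>
ERB_TEMPLATE

puts JudyAdmSettings.new(8080, %w[alice bob]).render(TEMPLATE)
```

The <%= %> tags insert a value into the output, while the <% %> tags run ruby code (here a loop) without printing anything themselves.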

In the next section, we will discuss module implementations in a large organization before writing custom modules.

Creating a Custom Fact for use in Hiera

The most useful custom facts are those that return a calculated value that you can use to organize your nodes. Such facts allow you to split your nodes into smaller groups, or to create groups based on functionality or locality. They also let you separate the data component of your modules from the logic or code components, because such a fact can be used in your hiera.yaml file to add a level to the hierarchy.

One aspect of the system that can be used to determine information about the node is its IP address. Assuming you do not reuse IP addresses within your organization, the IP address can be used to determine on which part of the network a node resides: the zone. In this example, we will define three zones in which machines reside: production, development, and sandbox. The IP addresses in each zone are on different subnets. We’ll start by building up a script to calculate the zone and then turn it into a fact like our last example. Our script will need to calculate IP ranges using netmasks, so we’ll import the ipaddr library and use IPAddr objects to calculate ranges.

require('ipaddr')
require('facter')
require('puppet')

Next we’ll define a function that takes an ipaddress as the argument and returns the zone to which that ipaddress belongs:

def zone(ip)
  zones = {
    'production'  => [IPAddr.new('...'), IPAddr.new('...')],
    'development' => [IPAddr.new('...'), IPAddr.new('...')],
    'sandbox'     => [IPAddr.new('...')]
  }
  for zone in zones.keys do
    for subnet in zones[zone] do
      if subnet.include?(ip)
        return zone
      end
    end
  end
  return 'undef'
end

This function will loop through the zones hash looking for a match on IP address. If no match is found, the value of ‘undef’ is returned. We then obtain the ipaddress for the machine using the ipaddress fact from facter.

ip = IPAddr.new(Facter.value('ipaddress'))

Then we call the zone function with this ipaddress to obtain the zone.

print zone(ip),"\n"

Now we can make this script executable and test it.

node1# facter ipaddress
node1# ./example_zone.rb
production
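
The lookup itself can be tried without facter. The following is a minimal sketch of the same idea using only the ipaddr library; the subnets are hypothetical values chosen for illustration and are not the ones used on node1.

```ruby
require 'ipaddr'

# Hypothetical subnets, chosen only for illustration.
ZONES = {
  'production'  => [IPAddr.new('10.0.0.0/24'), IPAddr.new('10.0.1.0/24')],
  'development' => [IPAddr.new('10.1.0.0/24'), IPAddr.new('10.1.1.0/24')],
  'sandbox'     => [IPAddr.new('192.168.122.0/24')]
}.freeze

# Return the name of the first zone with a subnet containing ip, else 'undef'.
def zone(ip)
  address = IPAddr.new(ip)
  ZONES.each do |name, subnets|
    return name if subnets.any? { |subnet| subnet.include?(address) }
  end
  'undef'
end

puts zone('10.0.1.15')       # production
puts zone('192.168.122.40')  # sandbox
puts zone('172.16.0.1')      # undef
```

IPAddr#include? does the netmask arithmetic, so each zone only needs to list its subnets.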

Now all we have to do is replace print zone(ip),"\n" with the following to define the fact.

Facter.add('example_zone') do
  setcode do
    zone(ip)
  end
end

Now when we insert this code into our example_facts module and run puppet on our nodes, the custom fact is available.

# facter -p example_zone
production

Now that we can define a zone based on a custom fact, we can go back to our hiera.yaml file and add %{::example_zone} to the hierarchy. The hiera.yaml hierarchy will now contain the following:

---
:hierarchy:
  - "zones/%{::example_zone}"
  - "hosts/%{::hostname}"
  - "roles/%{::role}"
  - "%{::kernel}/%{::osfamily}/%{::lsbmajdistrelease}"
  - "is_virtual/%{::is_virtual}"
  - common

After restarting httpd to have the hiera.yaml file reread, we create a zones directory in hieradata and add production.yaml with the following contents:

---
welcome: "example_zone - production"

Now when we run puppet on our node1, we see the motd updated with the new welcome message:

node1# cat /etc/motd
PRODUCTION
example_zone - production
Managed Node: node1
Managed by Puppet version 3.4.2
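
To see why the zone file is consulted first, it helps to expand the hierarchy by hand for one node. The sketch below only mimics the %{::fact} substitution on a shortened hierarchy; it is not hiera’s implementation, and the role value is a hypothetical one.

```ruby
# Sketch of how hierarchy entries expand for one node. This mimics the
# %{::fact} substitution only and is not hiera's implementation.
HIERARCHY = [
  'zones/%{::example_zone}',
  'hosts/%{::hostname}',
  'roles/%{::role}',
  'common'
].freeze

def expand_hierarchy(hierarchy, facts)
  hierarchy.map do |entry|
    # Replace each %{::name} with the fact's value, or '' if it is unknown.
    entry.gsub(/%\{::(\w+)\}/) { facts.fetch(Regexp.last_match(1), '') }
  end
end

# Hypothetical fact values for node1.
facts = { 'example_zone' => 'production', 'hostname' => 'node1', 'role' => 'db' }
puts expand_hierarchy(HIERARCHY, facts)
```

The expanded list is searched from top to bottom, so a welcome key in zones/production.yaml is found before the one in common.yaml.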

Creating a few key facts that can be used to build up your hierarchy can greatly reduce the complexity of your modules. There are several workflows available in addition to the custom fact we just described: you can use the /etc/facter/facts.d directory with static files or scripts, or you can have tasks run from other tools dump files into that directory to create custom facts.
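
A script placed in facts.d only has to print key=value lines on standard output. As a minimal sketch, assuming the script is marked executable, it could look like the following; the fact names and values are hypothetical, and external_fact_lines is a helper made up for this example.

```ruby
#!/usr/bin/env ruby
# Sketch of an executable external fact: facter reads the key=value lines
# that the script prints on standard output.

# Hypothetical helper that formats a hash of facts as key=value lines.
def external_fact_lines(facts)
  facts.map { |name, value| "#{name}=#{value}" }
end

# Hypothetical values; a real script would compute or look these up.
puts external_fact_lines('example_zone' => 'production', 'datacenter' => 'east')
```

A static text file containing the same key=value lines works in the same way when nothing needs to be calculated.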

When writing ruby scripts, you can use any other fact by calling Facter.value('factname'), and you can access any ruby library using require. Your custom fact could query the system using lspci or lsusb to determine what hardware is installed on that node. As an example, you could use lspci to determine the make and model of the graphics card on the machine and return that as a fact, such as videocard. In the next section we’ll write our own custom modules that will take such a fact and install the appropriate driver for the videocard based on the custom fact.
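
The parsing half of such a videocard fact might look like the sketch below. The function name videocard_from_lspci is hypothetical, and the sample text merely follows the usual lspci layout; it was not captured from a real machine.

```ruby
# Extract the first VGA controller description from lspci-style text.
# videocard_from_lspci is a hypothetical helper name.
def videocard_from_lspci(output)
  line = output.lines.find { |l| l.include?('VGA compatible controller:') }
  return nil unless line

  # Everything after "VGA compatible controller:" is the make and model.
  line.split('VGA compatible controller:', 2).last.strip
end

# Illustrative input in the usual lspci layout, not real captured output.
SAMPLE = <<~LSPCI
  00:1f.2 SATA controller: Example Vendor SATA Controller
  01:00.0 VGA compatible controller: Example Vendor Model 1000 (rev a1)
LSPCI

puts videocard_from_lspci(SAMPLE)
```

In a real fact, the text would come from running lspci on the node, and the function call would sit inside the setcode block of a Facter.add('videocard') definition.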

Parameterized Classes

Parameterized classes are classes where you have defined several parameters that can be overridden when you instantiate the class for your node. The use case for parameterized classes is when you have something that won’t be repeated within a single node, because you cannot declare the same parameterized class more than once per node. As a simple example, we’ll create a class which installs a database program and starts that database’s service. We’ll call this class example::db; the definition will live in modules/example/manifests/db.pp:

class example::db ($db) {
  case $db {
    'mysql': {
      $dbpackage = 'mysql-server'
      $dbservice = 'mysqld'
    }
    'postgresql': {
      $dbpackage = 'postgresql-server'
      $dbservice = 'postgresql'
    }
  }
  package { $dbpackage: }
  service { $dbservice:
    ensure  => true,
    enable  => true,
    require => Package[$dbpackage],
  }
}

This class takes a single parameter ($db) that specifies the type of the database, in this case either postgresql or mysql. To use this class, we have to instantiate it.

class { 'example::db':
  db => 'mysql',
}

Now when we apply this to a node, we see that mysql-server is installed and mysqld is started and enabled at boot. This works great for something like a database, since we don’t think we will have more than one type of database server on a single node. If we try to instantiate the example::db class a second time with postgresql on the same node, catalog compilation fails with a duplicate declaration error.

Types and Providers

Puppet separates the implementation of a type into the type definition and any one of many providers for that type. For instance, the package type in puppet has multiple providers depending on the platform in use (apt, yum, rpm, and others). Early on in puppet development there were only a few core types defined. Since then, the core types have expanded to the point where anything that I feel should be a type is already defined by core puppet. Modules have also created their own types: the lvm module created a type for defining logical volumes, the concat module created types for defining file fragments, and the firewall module created a type for defining firewall rules. Each of these types represents something on the system with the following properties:

  • unique
  • searchable
  • creatable
  • destroyable
  • atomic

When creating a new type, you have to make sure your new type has these properties. The resource defined by the type has to be unique, which is why the file type uses the path to a file as the naming variable (namevar). A system may have files with the same name (not unique), but it cannot have more than one file with an identical path. As an example, the ldap configuration file for openldap is /etc/openldap/ldap.conf, and the ldap configuration file for the name services library is /etc/ldap.conf; if you used the filename, they would both be the same resource. Resources must be unique.

By atomic I mean that it is indivisible; it cannot be made of smaller components. For instance, the firewall module creates a type for single iptables rules. Creating a type for the chains (INPUT, OUTPUT, FORWARD) within iptables wouldn’t be atomic, since each chain is made up of multiple smaller parts, the rules.

Your type has to be searchable so that puppet can determine the state of the thing you are modifying; a mechanism has to exist to know what the current state is of the thing in question. The last two properties are equally important: puppet must be able to remove the thing (destroy it) and, likewise, puppet must be able to create the thing anew.
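
The searchable, creatable, and destroyable properties correspond to the exists?, create, and destroy methods that a provider supplies for an ensurable type. The following plain-ruby stand-in is only a sketch of that contract; MarkerFileProvider is a hypothetical class, and a real provider would be registered with puppet rather than written as a standalone class.

```ruby
require 'tmpdir'

# Plain-ruby stand-in for a provider, to show the three-method contract.
# MarkerFileProvider is hypothetical and is not registered with puppet.
class MarkerFileProvider
  def initialize(path)
    @path = path # plays the role of the namevar: one path, one resource
  end

  # Searchable: report the current state of the thing.
  def exists?
    File.exist?(@path)
  end

  # Creatable: bring the thing into existence.
  def create
    File.write(@path, '')
  end

  # Destroyable: remove the thing.
  def destroy
    File.delete(@path)
  end
end

Dir.mktmpdir do |dir|
  provider = MarkerFileProvider.new(File.join(dir, 'marker'))
  puts provider.exists? # false
  provider.create
  puts provider.exists? # true
  provider.destroy
  puts provider.exists? # false
end
```

Given these three methods, puppet can compare the current state with the desired one and call create or destroy only when they differ.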

Given these criteria, there are several modules that define new types, some examples include types which manage:

  • git repositories
  • apache virtual hosts
  • ldap entries
  • network routes
  • gem modules
  • perl CPAN modules
  • databases
  • drupal multisites


Foreman

Foreman is more than just a puppet reporting tool; it bills itself as a complete lifecycle management platform. Foreman can act as the ENC (external node classifier) for your entire installation and configure DHCP, DNS, and PXE booting. It’s a one-stop shop. We’ll configure foreman to be our report backend in this example.


mcollective

Mcollective is an orchestration tool created by puppetlabs that is not specific to puppet; plugins exist to work with other configuration management systems. Mcollective uses a message queue (MQ) with active connections from all active nodes to enable parallel job execution on large numbers of nodes.

To understand how mcollective works, we’ll consider the following high level diagram and work through the various components. The configuration of mcollective is still somewhat involved and prone to errors. Still, once mcollective is working properly, the power it provides can become addictive, and it will be worth the effort.

The default MQ install for mcollective uses activemq; the activemq provided by the puppetlabs repo is known to work.

Mcollective uses a message queue and can use your existing message queue infrastructure.

If using activemq, a single server can handle 800 nodes. After that, you’ll need to spread the load across additional servers. We’ll cover the standard mcollective install using puppet’s certificate authority to provide ssl security to mcollective. The theory here is that since we already trust puppet to configure the machines, we can trust it a little more to run arbitrary commands. We’ll also require that users of mcollective have proper ssl authentication.


In this article we looked at module files and templates, created a custom fact for use in Hiera, and covered parameterized classes, types and providers, Foreman, and mcollective.
