Distributed Puppet

Some might say…

Some might say that running puppet with a server is the right way to go; it certainly provides some advantages like PuppetDB, stored configs, external resources etc, but is that really what you want?

If you have a centralised puppet server with 200 or so clients, there are some fancy things that can be done to ensure that not all nodes hit the server at the same time, but that requires setting up and configuring additional tools, and so on.

What if you just didn't need any of that? What if you just needed a git repo with your manifests and modules in it, and puppet installed?
Have a script download and install puppet, pull down the git repo and then run it locally. This method puts more overhead on a per-node basis, but not much; the node had to run puppet anyway. In all cases this can still provide the same level of configuration as the server/client method, you just need to think outside of the server.

Don't do it, it's stupid!

That was my response to my boss some 10 months ago when he said we should ditch puppet servers and per-server manifests, and outlaw all variables. Our mission was to be able to scale our service to over 1 million users, and we realised that manually adding extra node manifests to puppet was not sustainable, so we started on a journey to get rid of the puppet server and redo our entire puppet infrastructure.

Step 1 Set up a git repo. You should already be using one; if you aren't, shame on you! We chose github: why do something yourself when there are better people out there doing a better job, dedicated to doing just one thing? Spend your time looking after your service, not your infrastructure!

Step 2 Remove all manifests based on nodes and replace them with a manifest per tier / role. For us this meant consolidating our prod web role with our qa, test and dev roles so there was just one role file regardless of environment. This forces the management of the environment-specific bits into variables.

Step 3 Implement hiera. Hiera gives puppet the ability to externalise variables into configuration files, so we end up with a configuration file per environment and only one role manifest. This, as my boss would say, "removes the madness". Now if someone asks "what's the difference between prod and test?" you diff two files, regardless of how complicated you want to make your manifests, inherited or not. It's probably worth noting you can set default values for Hiera lookups: hiera("my_var", "default value").
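As a rough sketch of how that looks on disk (the file names and values here are invented for illustration), each environment gets its own yaml file and the role manifest just asks hiera for the value:

# extdata/prod.yaml
db_server: "db.prod.domain.com"

# extdata/test.yaml
db_server: "db.test.domain.com"

# in the role manifest
$db_server = hiera("db_server", "localhost")

Diff the two yaml files and you have the complete difference between the environments.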

Step 4 Parameterise everything. We had lengthy talks about parameterising modules vs just using hiera, but to keep the modules transparent to whatever is calling them (and because I was writing them), we kept parameters. I did, however, move all parameters for all manifests in a module into a "params.pp" file and inherit that everywhere to re-use the variables; within each manifest a parameter always defaults to the params.pp value or is blank (to make it mandatory). This means that if you have sensible defaults you can set them here and reduce the size of your hiera files, which in turn makes it easier to see what is happening. Remember most people don't care about the underlying technology, just the top level settings, and trust that the rest is magic. For the lower level bits see these: Puppet without barriers part one for a generic overview, Puppet without barriers part two for manifest consolidation and Puppet without barriers part three for params & hiera.
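As a minimal sketch of the pattern (module and variable names invented), the params class holds the defaults and every other manifest in the module inherits it:

class myapp::params {
  $conf_dir = "/etc/myapp"    # sensible default, rarely overridden
}

class myapp ( $conf_dir = $myapp::params::conf_dir,  # defaulted: optional
              $db_password                           # no default: mandatory
            ) inherits myapp::params {
  # ...
}

Anything with a sane default never needs to appear in hiera; anything left blank must be supplied or the catalog won't compile.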

This is all good, but what if you are in Amazon and you don't know what your box is? Well, it's in a security group, but that is not enough information, especially if your security groups are dynamic. You can also tag your boxes, and you should make use, where possible, of the AWS CLI tools to do this. We decided a long time ago to set a few details on a per-node basis: Env, Role & Name. From these we know what to set the hostname to, what puppet manifests to apply and what set of hiera variables to apply, as follows…

Step 5 Facts are cool. Write your own custom facts for facter. We did this in two ways. The first was to just pull down the tags from Amazon (where we host) and return them as ec2_<tag>; this works, but AWS has issues so it fails occasionally. Version 2 was to get the tags, cache them locally in files, and have facter pull them from the files… something like this…

#!/bin/bash
# Load the AWS config (credentials and EC2_HOME)
source /tmp/awsconfig.inc

# Grab all tags locally
IFS=$'\n'
for i in $($EC2_HOME/bin/ec2-describe-tags --filter "resource-type=instance" --filter "resource-id=`facter ec2_instance_id`" | grep -v cloudformation | cut -f 4-)
do
        key=$(echo $i | cut -f1)
        value=$(echo $i | cut -f2-)

        if [ ! -d "/opt/facts/tags/" ]
        then
                mkdir -p /opt/facts/tags
        fi
        if [ -n "$value" ]
        then
                echo "$value" > "/opt/facts/tags/$key"
                /usr/bin/logger set fact $key to $value
        fi
done

The AWS config file just contains the same info you would use to set up any of the CLI tools on Linux, and you can turn the cached files into facts with this:

# List the cached tag files and turn each one into an ec2_<tag> fact
tags = `ls /opt/facts/tags/`.split("\n")

tags.each do |keys|
        value = `cat /opt/facts/tags/#{keys}`
        fact = "ec2_#{keys.chomp}"
        Facter.add(fact) { setcode { value.chomp } }
end

Also see: Simple facts with puppet

Step 6 Write your own boot scripts. This is a good one, scripts make the world run. Make a script that installs puppet, make a script that pulls down your git repo, then run puppet at the end; a sketch of roughly what ours does follows.
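Ours are rather longer than this, but a minimal sketch of such a boot script might look like the following (the repo URL and fact name are placeholders):

#!/bin/bash
# Install puppet and git
yum install -y puppet git

# Pull down the manifests and modules
git clone git://github.com/example/puppet.git /etc/puppet

# Run puppet locally, letting a fact (e.g. the cached ec2_role tag)
# select the right node definition out of site.pp
puppet apply --modulepath=/etc/puppet/modules \
             --node_name_fact=ec2_role \
             /etc/puppet/manifests/site.pp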

The node_name_fact is awesome, as it kicks everything into gear and turns your deployed boxes, sitting in a security group with the appropriate tags, into fully built servers.

Summary

So now puppet is on each box; every box, from the time it's built, knows what it is (thanks to tags) and bootstraps itself to a fully working box thanks to your own boot script and puppet. With some well written scripts you can cron the pulling of git and a re-run of puppet if so desired. The main advantage of this method is the distribution: as long as a node manages to pull that git repo it will build itself. And if something changes on the box, puppet will put it back, because it has everything locally, so there are no network issues to worry about.
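Something as small as this cron entry (paths illustrative) is enough to keep a box converging:

# /etc/cron.d/puppet-run: re-pull the repo and re-apply puppet every 30 minutes
*/30 * * * * root cd /etc/puppet && git pull -q && puppet apply --modulepath=/etc/puppet/modules manifests/site.pp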

This time, we survived the AWS outage

Another minor bump

Anyone based in the US East region in AWS knows that yet again there were issues with EBS volumes, although you wouldn't know it if you looked at their website. It's a bit of a joke when you see headlines like "Amazon outage takes down Reddit, Foursquare, others" yet on their status page a tiny little note icon appears stating there's a slight issue, extremely minor, don't worry about it. Yeah right.

The main culprits were EC2 and the API, both of which were EBS related.

“Degraded EBS performance in a single Availability Zone
10:38 AM PDT We are currently investigating degraded performance for a small number of EBS volumes in a single Availability Zone in the US-EAST-1 Region.
11:11 AM PDT We can confirm degraded performance for a small number of EBS volumes in a single Availability Zone in the US-EAST-1 Region. Instances using affected EBS volumes will also experience degraded performance.
11:26 AM PDT We are currently experiencing degraded performance for EBS volumes in a single Availability Zone in the US-EAST-1 Region. New launches for EBS backed instances are failing and instances using affected EBS volumes will experience degraded performance.
12:32 PM PDT We are working on recovering the impacted EBS volumes in a single Availability Zone in the US-EAST-1 Region.
1:02 PM PDT We continue to work to resolve the issue affecting EBS volumes in a single availability zone in the US-EAST-1 region. The AWS Management Console for EC2 indicates which availability zone is impaired. “

The actual message is much, much longer, but you get the gist: a small number of people were affected. Yet most of the major websites that use Amazon were affected; how can that be considered small?

Either way, this time we survived, and we survived because we learnt. Back in June and July we experienced these issues with EBS, so we did something about it; now, why didn't everyone else?

How Alfresco Cloud Survived

So back in June and July we were heavily reliant on EBS, just like everyone else; we had an EBS-backed AMI that we then used puppet to build the OS out on, which is pretty much what everyone does, and this is why everyone was affected. Back then we probably had 100 – 150 EBS volumes, so the likelihood of one of them going funny was quite high. Now we have about 18, and as soon as we can we will ditch those as well.

After being hit twice in relatively quick succession we realised we had a choice: be lazy or be crazy. We went for crazy, and now it has paid off. We could have been lazy and said that Amazon had issues, that they weren't that frequent and were not likely to happen again; instead we chose to be crazy and reduce all of our EBS usage as much as possible, and we did just that.

Over the last few months I've added a number of posts about The Cloud, Amazon and Architecting for the cloud, along with a few funky Abnormal puppet set ups and oddities in the middle. All of this was spawned from the EBS outages; we had to be crazy. Amazon tell us all the time: don't have state, don't rely on anything other than failure, use multiple AZs etc. All of those big players that were affected would have been told that they should use multiple availability zones, but as I pointed out here, their AZs can't be fully independent, and yet again this outage proves it.

Now, up until those outages we had done all of that, but we still trusted Amazon to remain operational. Since July we have made a concerted effort to move our infrastructure onto elements within Amazon that are more stable, hence the removal of EBS. We now only deploy instance-backed EC2 nodes, which means we have no ability to stop and restart a server, but it also means that we can build them quickly and consistently.

We possibly took it to the extreme: our base AMI, now instance backed, consists of a single file that does a git checkout; once it has done that it simply builds itself to the point that chef and puppet can take over and run. The tools used to do this are many, but needless to say it is many hundreds of lines of bash, supported by Ruby, Java, Go and any number of other languages or tools.

We combined this with fully distributing puppet so it runs locally; in theory once a box is built it is there for the long run. We externalised all of the configuration so puppet was simpler and easier to maintain. Puppet, its config, the base OS, and the tools to manage and maintain the systems are all pulled from remote services, including our DNS, which automatically updates itself based on a set of tags.

Summary

So, how did we survive? We decided every box was not important: if some crazy person can't randomly delete a box or service and have the system keep working, then we have failed. I can only imagine that the bigger companies, with a lot more money and people and time looking at this, are still treating Amazon as a datacentre rather than a collection of web services that may or may not work. With the distributed puppet and config, once our servers are built they run happily on a local copy of the data, no network needed, and that is important because AWS's network is not always reliable and nor is their data access. If a box no longer works, delete it; if an environment stops working, rebuild it; if Amazon has a glitch, keep working. Simple.

Release consistency

It’s gotta be… perfect

This is an odd one that the world of devops sits awkwardly on, but it is vital to the operational success of a product. As an operations team we want to ensure the product is performing well, that there are no issues and that there's some control over the release process. Sometimes this leads to the long-winded, bloaty procedures that are common in service providers, where people stop thinking and start following; this is a good sign a sysadmin has died inside. From the more development side, we want to be quick and efficient and reuse tooling. Consistency is important, a known state of release is important; the process side of it should barely exist, as it should be automated with minimal interaction.

So as operations guys, we may have to make ad-hoc changes to ensure the service continues to work; as a result a retrospective change is made to ensure the config is good for the long term. Often this can stay untested, as you're not going to re-build the system from scratch to re-test it, are you?

The development teams want it all: rapid, agile change, quick testing suites and an infallible release process with plenty of resilience, double checks and catches for all sorts of error codes. The important thing is that the code makes it out, but it has to be perfect, which leads on to QA.

The QA team want to test the features, the product, the environment, combinations of each, the impact of going from X to Y, and to track the performance of the environment in between. All QA want is an environment that never changes, with a code base that never changes, so they can test everything in every combination; those bugs must be squashed.

Obviously, everyone's right, but with so many contradicting opinions it is easy to end up in a blame circle, which isn't all that productive. The good news is we know what we want…

  • Quick release
  • Quick accurate testing
  • Known Product build
  • Known configuration
  • No ad-hoc changes to systems
  • Ability to make ad-hoc changes to systems
  • Infallible release process

All in all not that hard, but there are two sticking points here. Point one: infallible releases are a thing of wonder and can only ever be iterated towards; in time they will get better. Day 1 will be crap, day 101 will be better, day 1000 better still. Point two: you can't have both no ad-hoc changes and the ability to make ad-hoc changes, can you? Well, you can.

If you love it, let it go

As a sysadmin, if there's an issue on production I will in most cases fix it in production. If it is risky I will use our staging environment and test it there, but this is usually no good, as staging typically won't show the issues production does, i.e. all of its servers will be working while production is missing one. This means I have to make ad-hoc changes to production, which causes challenges for the QA team, as now the test environments aren't always the same; it then screws up the release process, because we made a change in one place and didn't manually add it to all the other environments.

So, what if we throw away production, staging or any other environment with every release? This is typically a no-no for traditional operations: why would you throw away a working server? Well, it provides several useful functions:

  1. Removes ad-hoc changes
  2. Tests documentation / configuration management
  3. Enhances DR strategy and familiarisation with it
  4. Clean environment

The typical reason the throw-away approach isn't taken is a lack of confidence in the tools. Well, bollocks. What good is having configuration management and DR policies if you aren't using them? If you are making changes to puppet and rolling them forward you achieve some of this, but it's not good enough; you still have the risk of not having tested the configuration on a new system.

With every environment being built from scratch with every release, we can version control a release to a specific build number and a specific git commit, which is better for QA as it's always a known state; if there's any doubt, delete and re-build.

The release process can now start being baked into the startup of a server / puppet run, so consistency is increasing and hopefully the infallibility of the release is improving. Add to this a set of system-wide checks for basic health, a set of checks for unit testing the environment, and a quick user-level test before handing over for testing, and it's more likely, more often, that the environments are at a known consistency.

All of this goodness starts with deleting it and building fresh. Some of it will happen organically, but by at least making sure all of the configuration / deployment is managed at startup you can re-build, and you have some DR. Writing some simple smoke tests and some basic automation is a starting point; from there you can build upon the basics to start making it fully bullet proof.
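A smoke test doesn't need to be clever to be useful; something along these lines (the service name and URL are placeholders) run at the end of a build catches a surprising amount:

#!/bin/bash
# Abort on the first failed check
set -e
# Is the application container up?
service tomcat6 status
# Does the application actually answer?
curl -sf http://localhost:8080/ > /dev/null
echo "Smoke tests passed"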

Summary

Be radical, delete something in production today and see if you survive, I know I will.

Simple facts with Puppet

It’s not that hard

When I first started looking at facter it was magic: things just happened, and when I ran facter a list of variables appeared, all of which are available to use within puppet modules / manifests to help make life easier. After approximately 2 years of thinking how good they were and how nice it would be to have my own, I finally took the time to look at it and try to work it out…

For those of you that don't know, facter is a framework for providing facts about a host system that puppet can use to make intelligent decisions about what to do; it can be used to determine the operating system, its release, local IPs etc. This gives you flexibility in puppet to do things like choose what packages to install based on the Linux distribution, or insert the local IP address into a template.
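For example, facts land in puppet as ordinary variables, so distribution-specific packaging becomes a one-liner (a generic sketch, not from any particular module):

$http_package = $operatingsystem ? {
  "Ubuntu" => "apache2",
  default  => "httpd",
}

package { "$http_package": ensure => installed }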

Writing Facts

So, let's look at a standard fact that comes with facter, so you can see the complexity involved and understand why, after glancing at it, I never went much further.

# Fact: interfaces
#
# Purpose:
#
# Resolution:
#
# Caveats:
#

# interfaces.rb
# Try to get additional Facts about the machine's network interfaces
#
# Original concept Copyright (C) 2007 psychedelys <psychedelys@gmail.com>
# Update and *BSD support (C) 2007 James Turnbull <james@lovedthanlost.net>
#

require 'facter/util/ip'

# Note that most of this only works on a fixed list of platforms; notably, Darwin
# is missing.

Facter.add(:interfaces) do
  confine :kernel => Facter::Util::IP.supported_platforms
  setcode do
    Facter::Util::IP.get_interfaces.collect { |iface| Facter::Util::IP.alphafy(iface) }.join(",")
  end
end

Facter::Util::IP.get_interfaces.each do |interface|

  # Make a fact for each detail of each interface.  Yay.
  #   There's no point in confining these facts, since we wouldn't be able to create
  # them if we weren't running on a supported platform.
  %w{ipaddress ipaddress6 macaddress netmask}.each do |label|
    Facter.add(label + "_" + Facter::Util::IP.alphafy(interface)) do
      setcode do
        Facter::Util::IP.get_interface_value(interface, label)
      end
    end
  end
end

This is stolen, and all it does is provide a comma separated list of interfaces, as follows: eth0, eth1 etc.

Now, when I started looking at facter I knew no ruby and it was a bit daunting, but alas I learnt some, and never bothered looking at facter again until my boss managed to simplify a fact down to its bare essentials, which is this one line…

Facter.add("bob") { setcode { "bob" } }

From this point onwards all you need to do is learn some ruby to make sure you can populate the fact appropriately, or use bash to get the details and populate it. In the next example I just grab the pid of apache from ps:

apachepid=`ps -fu apache | grep apache | awk '{ print $2}'`

Facter.add(:apachepid) {
	setcode { apachepid }
}

So if you know bash, and you can copy and paste, you can do something like the above. Now, this is ruby, so you can do a lot more complex things, but that's not for today.

Okay, so now something more complex is needed… What if you're in Amazon, use the Tags on your EC2 instances and want to use them in puppet? Well, you can just query Amazon, get a result and use that, although it will take forever and a day to run as AWS is not the quickest. This is an issue we had to overcome, so we decided to run a script that would query Amazon in its own time and populate the tags onto the file system, at which point we can read them quickly with facter.

So first, a shell script.

#!/bin/bash
# Load the AWS config (credentials etc)
source /path/to/aws/config.inc

# Grab all tags
IFS=$'\n'
for i in $($EC2_HOME/bin/ec2-describe-tags --filter "resource-type=instance" --filter "resource-id=`facter ec2_instance_id`" | grep -v cloudformation | cut -f 4-)
do
        key=$(echo $i | cut -f1)
        value=$(echo $i | cut -f2-)

        if [ ! -d "/opt/facts/tags/" ]
        then
                mkdir -p /opt/facts/tags
        fi
        if [ -n "$value" ]
        then
                echo "$value" > "/opt/facts/tags/$key"
                /usr/bin/logger set fact $key to $value
        fi
done

So this isn't the best script in the world, but it is simple: it pulls a set of tags out of Amazon and stores them in a directory where the file name is the tag name and the content is the tag value.
So now we have the facts locally, gathered with bash, something we're all a bit more familiar with. We can then take facter, which is alien ruby, force some bash inside it, and still generate facts that provide value:

# List the cached tag files and turn each one into an ec2_<tag> fact
tags = `ls /opt/facts/tags/`.split("\n")

tags.each do |keys|
        value = `cat /opt/facts/tags/#{keys}`
        fact = "ec2_#{keys.chomp}"
        Facter.add(fact) { setcode { value.chomp } }
end

The first thing we do is produce a list of tags (directory list) and then we use some ruby to loop through it and yet more bash to get the values.
None of this is complicated, and hopefully these few examples are enough to encourage people to start writing facts, even if they are an abomination to the ruby language; at least you get value without needing to spend time understanding or learning ruby.

Summary

Facts aren't that hard to write, and thanks to being ruby you can make them as complicated or as simple as you like, and you can even break into bash as needed. Now a caveat: although you can write facts quickly with this half-bash / half-ruby mix, just learning ruby will by far make life easier in the long run; you can then start to incorporate more complex logic into the facts to provide more value within puppet.

A useful link for facter in case you feel like reading more

Puppet without barriers - part three

The examples

Over the last two weeks (part one & part two) I have slowly been going into detail about module set up and some architecture; now it's time for the real world.

To save me writing loads of puppet code I am going to abbreviate and leave some bits out. First things first, a simple module.

init.pp

class javaapp ( $conf_dir = $javaapp::params::conf_dir) inherits javaapp::params {

  class {
    "javaapp::install":
    conf_dir => $conf_dir
  }

}

install.pp

class javaapp::install ( $conf_dir = $javaapp::params::conf_dir ) inherits javaapp::params {

 package {
    "javaapp":
    name => "$javaapp::params::package",
    ensure => installed,
    before => Class["javaapp::config"],
  }

  file {
    "/var/lib/tomcat6/shared":
    ensure => directory,
  }

}

config.pp

class javaapp::config ( $app_var1 = $javaapp::params::app_var1,
                        $app_var2 = $javaapp::params::app_var2 ) inherits javaapp::params {

  file {
    "/etc/javaapp/javaapp.conf":
    content => template("javaapp/javaapp.conf"),
    owner   => 'tomcat',
    group   => 'tomcat'
  }
}

params.pp

class javaapp::params ( ) {

$conf_dir = "/etc/javaapp"
$app_var1 = "1.2.3.4/32"
$app_var2 = "host.domain.com"

}

One simple module in the module directory. As you can see, I have put all parameters into one file; it used to be that you'd specify the same defaults in every file, so in init and config you would duplicate the same variables. That is just insane, and if you have complicated modules with several manifests in each one it gets difficult to maintain all the defaults. This way they are all in one file and are easy to identify and maintain. It is by far not perfect, but it does work; I'm not even sure if puppet officially supports it (and if it doesn't, that is a failing of puppet), but it works with the latest 2.7.18 and I'm sure I've had it on all 2.7 variants at some point.

You should be aiming to set a sensible default for every parameter regardless, but make sure it really is sensible; if you want to enforce that a variable is set, don't put an entry in params, and specify the parameter without a default.

Now the /etc/puppet directory

Matthew-Smiths-MacBook-Pro-2:puppet soimafreak$ ls
auth.conf	extdata		hiera.yaml	modules
autosign.conf	fileserver.conf	manifests	puppet.conf

The auth, autosign and fileserver configs will depend on your infrastructure, but the two important configurations here are puppet.conf and hiera.yaml.

puppet.conf

[master]
certname=server.domain.com
modulepath = /etc/puppet/modules 
[main]
    # The Puppet log directory.
    # The default value is '$vardir/log'.
    logdir = /var/log/puppet

    # Where Puppet PID files are kept.
    # The default value is '$vardir/run'.
    rundir = /var/run/puppet

    # Where SSL certificates are kept.
    # The default value is '$confdir/ssl'.
    ssldir = $vardir/ssl

		autosign = true
		autosign = /etc/puppet/autosign.conf

[agent]
    # The file in which puppetd stores a list of the classes
    # associated with the retrieved configuration.  Can be loaded in
    # the separate ``puppet`` executable using the ``--loadclasses``
    # option.
    # The default value is '$confdir/classes.txt'.
    classfile = $vardir/classes.txt

    # Where puppetd caches the local configuration.  An
    # extension indicating the cache format is added automatically.
    # The default value is '$confdir/localconfig'.
    localconfig = $vardir/localconfig
    pluginsync = true

The only real change worth making to this is in the agent section: pluginsync ensures that any plugins you install in puppet, like Firewall, VCSRepo, hiera etc, are loaded by the agent. Obviously on the agent you do not want all of the master config at the top.

Now the hiera.yaml file

hiera.yaml

---
:hierarchy:
    - "%{env}"
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppet/extdata'

Okay, to the trained eye this is sort of pointless: it tells puppet that it should look in a directory called /etc/puppet/extdata for a file called %{env}.yaml, so in this case if env were to equal bob it would look for the file /etc/puppet/extdata/bob.yaml. The advantage is that at some point, if needed, this file could be changed to, for example:
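So to continue the example, /etc/puppet/extdata/bob.yaml might contain nothing more than (values invented):

---
app_var1: "10.0.0.1/32"
db_server: "db.bob.domain.com"

and any manifest compiled with env=bob gets "10.0.0.1/32" back from hiera('app_var1').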

hiera.yaml

---
:hierarchy:
    - common
    - "%{location}"
    - "%{env}"
    - "%{hostname}"
:backends:
    - yaml
:yaml:
    :datadir: '/etc/puppet/extdata'

This basically provides a location for all variables that you are not able to tie down to a role which will be defined by the manifests.

Matthew-Smiths-MacBook-Pro-2:puppet soimafreak$ ls manifests/roles/
tomcat.pp	default.pp	app.pp	apache.pp

So we’ll look at the default node and tomcat to get a full picture of the variables being passed around.

default.pp

node default {
	
	#
	# Default node - base packages for all systems
	#

  # Define stages
  class {
    "sshd":     stage =>  first;
    "ntp":      stage =>  first;
  }
	
  # Needed for Facter to generate OS related information
  package {
    "redhat-lsb":
    ensure => "installed"
  }

  # mcollective
  class {
    "mcollective":
    mc_password   => "bobbypassword6",
    puppet_server => "puppet.domain.com",
    activemq_host => "mq.domain.com",
  }

  # Manage puppet
  include puppet
}

As you can see, this default node sets up some classes that must be on every box and ensures that vital packages are also installed. If you feel the need to extrapolate this further you could have the default node inherit another node; for example, you may have a company manifest as follows:

company.pp

node company {
$mc_password = "bobbypassword6"
$activemq_host = "mq.domain.com"

$puppet_server = $env ? {
    "bob" => 'bobpuppet.domain.com',
    default => 'puppet.domain.com',
  }
}

This company node manifest could be inherited by the default node, and then instead of having puppet_server => "puppet.domain.com" you could have puppet_server => $puppet_server, which I think is nice and clear. The only recommendation is to keep your default and your role manifests as simple as possible; try to keep if statements out of them. Can you push the decision into hiera? Do you have a company.pp where it would be sensible to put some env logic? Are you able to take some existing logic and turn it into a fact?
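That change is a small one; the default node simply picks up the variables set by the company node it inherits (a sketch of just the relevant part):

node default inherits company {
  class {
    "mcollective":
      mc_password   => $mc_password,
      puppet_server => $puppet_server,
      activemq_host => $activemq_host,
  }
}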

Be ruthless and push as much logic out as possible; use the tools to do the leg work and keep puppet manifests and modules simple to maintain.

Now finally the role,

tomcat.pp

node /tomcat.*/ inherits default {

  include tomcat6

  # Installs the java app using the init/install classes and default params
  include javaapp

  class {
    "javaapp::config":
      app_var1 => hiera('app_var1'),
      app_var2 => $fqdn,
  }
}

The role should be "simple", but it also needs to make it clear what it is setting. If you notice that several roles use the same class and in most cases the params are the same, change the params file and remove the options from the roles; try to keep it so that what is in the roles is only overrides, as minimal as possible. The hiera vars, and any set in the default / other node inheritance, can all be referenced here.

Summary

Hopefully that helps some of you understand the options for placing variables within puppet in different locations. As I mentioned in the other posts, this method has 3 files: the params, the role and the hiera file, that's it. All variables are in one of those three, so there's no need to hunt through all of the manifests in a module to identify where a variable may or may not be set; it is either defaulted or overridden. If it's overridden it will be in the role manifest, and from there you can work out if it's in your default or hiera, and so on.

Puppet without barriers - part two

Magnificent manifests

Back in the day, we used to have maybe 7 node manifests per environment and 6 or 7 environments, which is a lot of individual nodes to manage. As a result, not everything was always the same.

We got to the point where each environment, although using the same modules, was not always quite right. We used variables to set each environment up, but each variable existed in each environment with a different value; yes, you can keep that in check with diff and good process, but it's time consuming. Why not just have one manifest for each tier of the application? That's what we have now: every tier is identical in terms of its use of the manifest, but obviously every environment has its own settings. So how do you have 1 manifest, 100s of environments, and a guarantee that everything is the same across each environment except a small number of differences which can be easily seen in one file?

Now, you could try to manage the different sets of vars through node inheritance, but you then end up with quite a complex set of inheritance which can be a pain to track through. At first I was a little confused at how we would reduce our manifests down and still be able to set variables in the right places. Before going too far, let's look at a typical node inheritance set up.

[12:56]msmith@pinf01:/etc/puppet/manifests$ tree
.
|-- company
|   `-- init.pp
|-- dev
|   `-- init.pp
|-- site1
|   |-- init.pp
|   `-- nodes.pp
|-- site2
|   |-- dev_nodes.pp
|   |-- init.pp
|   |-- nodes.pp
|   `-- test_nodes.pp
|-- site3
|   |-- init.pp
|   `-- nodes.pp
|-- prod
|   `-- init.pp
|-- site.pp
`-- test
    `-- init.pp

In this set up, the company init.pp sets up company-wide details: domain names, IP addresses for firewall rules, or anything that is applicable to multiple sites. The site init.pp inherits the company one and sets up any variables needed for that specific site, like DNS servers. In some of the sites we have additional configuration which inherits that site's init.pp and sets more specific information for, say, the test or dev environment, and finally there is an environment init.pp which sets variables for that type of environment, such as an environment prefix or other settings that are specific to it.
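As a sketch of that chain (names and values invented), each level only sets what it is responsible for:

# company/init.pp
node company {
  $domain_name = "domain.com"
}

# site1/init.pp
node site1 inherits company {
  $dns_servers = ["10.1.0.2", "10.1.0.3"]
}

# a test environment within site1
node site1_test inherits site1 {
  $env_prefix = "test"
}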

This set up gives you plenty of room to grow, but it can be harder to work out where the variables come from, as you have to track them back through many levels of inheritance; this can make it hard for people new to the team to identify where the variables should be set, which is made harder still by parameterised classes and defaults.

Now, with classes you can create a class/subclass that sets the majority of your defaults, and this can be useful if you have multiple companies or distinct groups which always insist on having different settings; the advantage is that your node declarations then remain minimal. However, this can also make things complicated, as you have meta classes that set some of the params needed for a class but not all, so you'll still end up passing params from your nodes into the meta class, which in turn passes them into your modules, and so on. This is typically more inheritance and parameters than most sysadmins can handle.

So we had the same issue: it was never a 5 min talk to explain what is set where and when, because it could be set in 3 or 4 nodes and passed as a param, and this was the same for each of the 6 environments. After much talking and disbelief we eventually started on a journey to reduce 7 environments and what was 50+ node files (each controlling multiple nodes via regexp) down to a handful.

[msmith@puppet manifests]$ tree
.
├── account.pp
├── roles
│   ├── app1.pp
│   ├── app2.pp
│   ├── default.pp
│   ├── app3.pp
│   ├── app4.pp
│   ├── app5.pp
│   └── app6.pp
└── site.pp

So to start with, the modules are still parameterised as before, heavily so, to make it easy to turn features on and off, and where possible everything is set as a default within the module in a params.pp file, so there is one place within a module to change the defaults for all subclasses. Each appX.pp will load whichever sets of classes it needs, and where possible it will set the params to make use of the various internal variables that are needed.

Each appX.pp inherits from the default, which sets most of the classes we want on each app, such as ssh, mcollective, AWS tools. This in turn inherits the account.pp, which sets variables or classes that are relevant to each account we use; the account in this case is an Amazon account. We set different yum servers, infrastructure servers and so on.

Now, we still have environments, and we choose to load the variables for these via hiera, but we have gone to great lengths to restrict the variables in the environment down to those that are absolutely necessary; as such we have not yet needed to make use of hiera's hierarchy, but we understand the necessity of keeping that option open to us.

The new structure means we have, at the moment, no more than 5 places to edit vars for every app / env / node. This particular structure works for us because we run a small set of node types but in multiple environments. As time goes on the roles manifests directory will grow to incorporate new servers, and if we need to deal with nodes of a similar role type apart from option X/Y/Z we will make use of hiera to add another layer to the system.

Next week I’ll go through the node manifests in some detail which should hopefully cement some of this blurb.

Puppet without barriers - part one

Structure is good

Like everyone that has used and is using puppet, the puppet structure we used to manage our nodes was relatively straightforward. But before getting into that, let's go over the two or three ways you probably already know of.

1. Node inheritance. Typically you define a number of nodes that describe the physical and logical structures that suit your business, and inherit from each of these to end up with a host node that has the correct details relevant to the data centre it is in and its logical assignment within it.
2. Class inheritance. Similar to node inheritance: you create your modules, which are typically agnostic and more than likely take a number of parameters, then create a module that contains a set number of node types, for example "web" nodes. The web node would include the apache module and do all of the configuration you expect of all apache nodes; this can be done to the point of each node manifest simply including a web, or wiki, type class.
3. By far the most common, I imagine: a mixture of them both.

Either of these methods is fine, however what will essentially happen is duplication, and you'll be faced with situations where you want 90% of a web node but not that last 10%; you then create a one-off module or node, and before you know it you have several web nodes or several roles defining the same task.

It's fair to say we do not do any one of those; we use node inheritance a little to save duplication of code, but that's about it. I'll touch on this more next week, but this week is about foundations, and in puppet that means modules.

Parameterise everything

Parameterised classes help with this level of flexibility but if your module isn’t fully parameterised it’s a bit of a pain, so a slight detour based on experience.

I imagine, like everyone out there, I write puppet modules that are half arsed on most days. I parameterise the variables I need for my templates and stick to a simple module layout to stop my classes getting too busy. This isn't a bad way of doing things, you get quite a lot of benefit without a lot of investment in time, and I for one will continue to write modules that are half arsed, but with one difference: more parameters.

Occasionally I have to extend my module, and sometimes I don't have time to re-factor and make it perfect, so I work around my module; if you're looking to build technical debt, this is a good way of doing it, and if you continue down this route you will end up with modules that are complicated and hard to follow.
Initially, when introducing module layouts to people, they assume they will make life more complicated because you have split a module into classes, but the reality is that rather than having 1 file with 100 lines you have 10 files with 10 lines, and of those 10 files you may only be using 2 in your configuration, which can make life a lot simpler than sifting through code.

A typical module layout I use will consist of the following

[root@rincewind modules]# ls example/
files  manifests  README  templates
[root@rincewind modules]# ls example/manifests/
install.pp  config.pp  init.pp  params.pp

For those not familiar with such naming schemes: init is loaded by default and will typically include the install class and the config class. So if I want to install / configure an entire module that has had some reasonable params set, I just "include modulename". You can read more about modules and how I typically use them here. It was rather experimental at the time, but with only a few tweaks it is now in a production environment and is by far the easiest module we have to maintain.

The main reason this structure works for us is that we are able to set a number of sensible defaults in params, which reduces the number of params we need to pass in while still allowing a lot of parameters on each class. Typically it means that our actual use of the classes, even though they may have 40+ params, won't normally go past 10.

The big thing we did differently with this module was to parameterise about 90% of all the configuration options. Now, you may think that's fine for small apps, but what about large ones? Well, we did it with our own product, Alfresco, which is quite large and has a lot of options. Granted, we can't account for every possible configuration, but we gave it a shot. Consequently we now have a puppet module that out of the box should work with multiple application containers (tomcat, jboss etc) but also allows for the application to be run in different containers on the same machine.

The advantage of this up-front effort on a module which is core to what we are doing is that we are now able to just change options without having to do a major re-factor of the code. Adding more complexity in terms of configuration is simpler, as it has been split into multiple classes that are very specific to their tasks. If we want to change most of the configuration it is simply a matter of changing a parameter.

Be specific

So you have a module that is nicely parameterised, and that's it? Well, not quite: you have to make sure your module is specific to its task, and this is one that was a little odd to get over. Puppet is great at configuration management; it's not so good at managing dependencies and order. So we made the decision to simplify our installation process: rather than having the puppet module try to roll out the application and multiple versions of it, with different configurations for different versions, it simply sets up the environment and sets the configuration with no application deployed.

So yes, you can install within the module if you wish, and we used to do that, with some complicated expanding / re-zipping of the application to include plugins, but we also have a yum repo, or git, or an ftp server where we can just write a much more specific script to deal with the installation.

By separating out the installation we have been able to cut a lot of code from our module that was trying to do far more than was sensible.

Be sensible

When you look through configuration there are sometimes options that will only ever be set if X or Y is set; in this case write more complicated modules to cope with that logic, and don't just add everything as a parameter. When you look through your module and you start seeing lots of logic, you've gone wrong.

Typically now, when writing modules, the only logic that ends up in the pp file is whether a feature is enabled or disabled; if you try to make a manifest in a module all things to all people you end up with a lot of logic and complications that are not needed. To tackle this, after writing a base for the module I cloned the params file and the application manifest which does most of the config, and made them more specific to the needs of the main task that I'm trying to achieve. It is 90% the same as the normal application one, but with a few extra params, and it allows me to add specific files that are 100% relevant to what we do in the cloud without ruining the main application one, in case we decide to change direction or need a more traditional configuration.

The point is more that you should avoid over complicating each manifest in the module, create more simple manifests that are sensible for the task in hand.

This module layout will give you something to work with rather than against. We spent a lot of time making our whole puppet infrastructure as simple as possible to explain; this doesn't mean it is simple, it just means everyone understands what it does: me, my boss, my boss's boss and anyone else that asks "How does it work?"

Puppet inheritance, revisited

A calmer approach

A few weeks ago I wrote an article which was more of a rant than necessary. I was trying to drastically alter the way we write one of our puppet modules, and it not being a simple module like ntp, it required a bit of flexibility. To give you an understanding of what our puppet module does: we deploy Alfresco, including the automation of upgrades, and configure various aspects of the application, including bespoke overrides here and there and all sorts of wizardry in the middle. The original puppet module was written by Ken Barber and then re-written by Adrian and myself; needless to say there were some big puppet-style brains working on it, and the time had come to make some fundamental changes, mainly to make it easier to deploy directly from built code. So a lot of the work I've been doing on the module meant that it was more important to have the module deploy code from the build servers as well as automatically update itself.

Well, I achieved that within the old module, but quite frankly the module was about 10 times larger than it needed to be, specifically for an operational deployment into the cloud. With this in mind I thought it would be a good use of time to re-write it and make it easier to maintain and hopefully easier to extend, hence the rant around inheritance; in my head I had the perfect solution, which due to some bugs didn't work. Well, time has moved on, and as always progress must continue.

Puppet modules, the easy way.

Okay, so what did I do to overcome the lack of inheritance yet not have all of the duplication that was in the old modules? Simples! Combine the two. I gave it some thought and realised that the best way out of the situation was to make it so that the variables were set from one place, params; so even though there is some duplication, you still only have to set the variables in one place. As a result I wrote a simple puppet module, which has not been tested, as a demonstration; it has the same structure as a live module, so in theory it will work the same way.

Manifests

[matthew@rincewind manifests]$ ll
total 20
-rw-r--r--. 1 root root 1981 Feb 17 21:26 config.pp
-rw-r--r--. 1 root root 1757 Feb 17 21:06 init.pp
-rw-r--r--. 1 root root 1578 Feb 17 21:12 install.pp
-rw-r--r--. 1 root root  921 Feb 17 21:51 params.pp
-rw-r--r--. 1 root root 2480 Feb 17 21:57 soimaapp.pp

Let’s start at the top,

init.pp

# = Class: soimasysadmin
#
#   This is the init class for the soimasysadmin module, it loads other required clases
#
# == Parameters:
#
#   *application_container_class*             = Puppet application container class deploying soimasysadmin i.e. tomcat
#   *application_container_cache*             = Location of the tomcat cache directory
#   *application_container_home*              = Application container home directory i.e. /var/lib/tomcat6/
#   *application_container_user*              = User for application container i.e. tomcat
#   *application_container_group*             = Group for application container i.e. tomcat
#
# == Actions:
#   Include any classes that are needed i.e. params, install etc
#
# == Requires:
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
class soimasysadmin (
  $application_container_class = $soimasysadmin::params::application_container_class,
  $application_container_cache = $soimasysadmin::params::application_container_cache,
  $application_container_home  = $soimasysadmin::params::application_container_home,
  $application_container_user  = $soimasysadmin::params::application_container_user,
  $application_container_group = $soimasysadmin::params::application_container_group,
) inherits soimasysadmin::params {

  include soimasysadmin::install

  # The application_container variables are set here, but the defaults all live in
  # params, which is available thanks to inheritance. Because this is the top of the
  # tree, these variables can be accessed by other classes as needed.

}

install.pp

# = Class: soimasysadmin::install
#
#   This class will install and configure directories and system wide settings that are relavent to soimasysadmin
#
# == Parameters:
#
#   All Parameters should have defaults set in params
#
#   *logrotate_days*      = Number of days to keep soimasysadmin logs for
#
# == Actions:
#   - Set up the application_container
#   - Set up soimasysadmin directories as needed
#
# == Requires:
#   - ${::soimasysadmin::application_container_class},
#
# == Sample Usage:
#
#   class	{ 
#			"soimasysadmin":
#				application_container_class		=> 	"tomcat6",
#				application_container_home		=>	"/srv/tomcat"
#		}
#
#

class soimasysadmin::install (
  $logrotate_days = $soimasysadmin::params::logrotate_days,
) inherits soimasysadmin::params {

  File  {
    require =>  Class["${::soimasysadmin::application_container_class}"],
    owner   =>  "${::soimasysadmin::application_container_user}",
    group   =>  "${::soimasysadmin::application_container_group}"
  }

  #
  # Create folders
  #
  define shared_classpath ( $application_container_conf_dir = "$::soimasysadmin::application_container_conf_dir") {

    file {
      "${::soimasysadmin::application_container_home}/${name}/shared":
      ensure  =>  directory,
      notify  =>  undef,
    }
    file {
      "${::soimasysadmin::application_container_home}/${name}/shared/classes":
      ensure  =>  directory,
      notify  =>  undef,
    }
	}

  file {
    "/etc/logrotate.d/soimasysadmin":
    source => "puppet:///modules/soimasysadmin/soimasysadmin.logrotate",
  }

}

params.pp

# = Class: soimasysadmin::params
#
# This class sets the defaults for soimasysadmin parameters
#
# == Parameters:
#
#
# == Actions:
#   Set parameters that are needed globally across the soimasysadmin class
#
# == Requires:
#
# == Sample Usage:
#
#   class soimasysadmin::foo ( $foo = $soimasysadmin::params::foo,
#                              $bar = $soimasysadmin::params::bar
#                            ) inherits soimasysadmin::params {
#   }
#
#
class soimasysadmin::params  ( ) {

#
# Application Container details
#

  $application_container_class            = "tomcat"
  $application_container_cache            = "/var/cache/tomcat6"
  $application_container_home             = "/srv/tomcat"
  $application_container_user             = "tomcat"
  $application_container_group            = "tomcat"


#
#	soimapp config
#

  $dev_override_enabled  = "false"
  $custom_dev_signup_url = "http://${hostname}/soimaapp"

}

Okay, that's all that's needed to do the install. Notice that each class inherits params. Interestingly enough, Adrian informs me this probably shouldn't work and is possibly a bug, as params is being inherited multiple times. I think the only reason it works is that we have no resources in params.pp; if we add a file resource in it, it will fail. But nonetheless, this works (for now).

config.pp

# = Class: soimasysadmin::config
#
#		This class configures the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*db_name*												= The Database name 
#		*db_username*										= User to connect to the DB with
#		*db_password*										= Password for the user
#		*db_server*											= The DB server host 
#		*db_port*												= DB port i.e. 3306
#		*db_pool_min*										= Min DB connection pool
#		*db_pool_max*										= Max DB connection pool
#		*db_pool_initial*								= Initial DB connection pool size
#
# == Actions:
#   Install and configure soimasysadmin repository component into appropriate
#   application container
#
# == Requires:
#   - Class["${soimasysadmin::params::application_container_class}"],
#
# == Sample Usage:
# 
#   class	{ 
#			"soimasysadmin::config":
#				db_password 	=>	"Hdy^D7D6fvndsakj(*80",
#		}
#
#

class soimasysadmin::config (
  $db_name         = $soimasysadmin::params::db_name,
  $db_username     = $soimasysadmin::params::db_username,
  $db_password,
  $db_server       = $soimasysadmin::params::db_server,
  $db_port         = $soimasysadmin::params::db_port,
  $db_pool_min     = $soimasysadmin::params::db_pool_min,
  $db_pool_max     = $soimasysadmin::params::db_pool_max,
  $db_pool_initial = $soimasysadmin::params::db_pool_initial,
) inherits soimasysadmin::params {

  # Defaults for all file resources in this class
  File {
    owner  => "${::soimasysadmin::application_container_user}",
    group  => "${::soimasysadmin::application_container_group}",
    notify => Service["${::soimasysadmin::soimaapp::application_container_service}"],
  }

  #
  # properties file
  #

  # If extra config is set, concatenate it
  file {
    "${::soimasysadmin::application_container_home}/${::soimasysadmin::application_container_instance}/properties":
    content => template("soimasysadmin/properties"),
    mode    => "0640",
  }
}

NB: in this instance I use config.pp as a way of setting up default config for the application, things that are common, for example the DB, whereas application-specific config is done separately; this obviously depends on your application…

soimaapp.pp

# = Class: soimasysadmin::soimaapp
#
# This class installs the soimaapp front end for the soimasysadmin repository
#
# == Parameters:
#
#		All Parameters should have defaults set in params
#
#		*dev_override_enabled*						= Enable dev configuration override
#		*custom_dev_signup_url*						=	Custom dev signup url
#		*application_container_instance*	= Application container instance i.e. tomcat1, tomcat2, tomcatN
#		*application_container_conf_dir*	= Application container conf directory i.e. /etc/tomcat6/conf/
#		*application_container_service*		= Service name of application container i.e. tomcat6
#
# == Actions:
#   Install and configure soimaapp into appropriate application container
#
# == Requires:
#		- Class["${soimasysadmin::params::soimasysadmin_application_container_class}",
#						"soimasysadmin::install"],
#
# == Sample Usage:
# 
#		include soimasysadmin::soimaapp
#
class soimasysadmin::soimaapp (
  $dev_override_enabled           = $soimasysadmin::params::dev_override_enabled,
  $custom_dev_signup_url          = $soimasysadmin::params::custom_dev_signup_url,
  $application_container_instance = $soimasysadmin::params::application_container_instance,
  $application_container_conf_dir = $soimasysadmin::params::application_container_conf_dir,
  $application_container_service  = $soimasysadmin::params::application_container_service
) inherits soimasysadmin::params {

  # Set default actions for soimasysadmin files
  File {
    notify => Class["${::soimasysadmin::application_container_class}"],
    owner  => "${::soimasysadmin::application_container_user}",
    group  => "${::soimasysadmin::application_container_group}",
  }


  #
  # Install / Upgrade War
  #

  # Push war file to application container
  file {
    "${::soimasysadmin::application_container_home}/${application_container_instance}/webapps/soimaapp.war":
    source => "puppet:///modules/soimasysadmin/soimaapp.war",
    mode   => "0644",
  }

  #
  # Shared classpath created
  #
  soimasysadmin::install::shared_classpath {
    "${application_container_instance}":
      application_container_conf_dir => "${application_container_conf_dir}"
  }

  if ( $dev_override_enabled == "true" ) {
    file {
      "${::soimasysadmin::application_container_home}/${application_container_instance}/shared/classes/soimasysadmin/web-extension/soimaapp-config-custom.xml":
      content => template("soimasysadmin/soimaapp-config-custom.xml"),
    }
  }
}

Okay, that's the basics. In the examples above, soimaapp does application-specific config; any generic settings are in params.pp, any global settings are in init.pp. Hopefully the rest is self explanatory.

I haven't included any templates or files; you can work those out, you are smart after all.

Now the bad news

I will say the above does work; I've even tested it by inheriting soimaapp.pp and applying even more specific config over the top, and all seems well. So what exactly is the bad news…

I tested all of this on the latest version of puppet, 2.7.10, and then came across this. Luckily 2.7.11 is out and available, but I just haven't tried it yet. Let's hope it works!

RHN Satellite vs Puppet, A clear victory?

Is It Clear?

No! It never is. It is all about what your environment looks like and what you are trying to achieve. Both solutions provide configuration management, both have a paid-for and a free version, although you'd be forgiven for not realising you can get RHN Satellite (or close to it) for free. At the end of the day they are different tools aimed at different audiences, and it is important that you, yes you the sysadmin, work out what your environment is; not what it might be, not what you think it is, but what it actually is.

Step 1

Before we even consider the features and the pros and cons of each solution, we (and I mean you) have to work out what the solution needs to provide, and more importantly the skill set of those managing it. There's a quick way to do this and a slightly more laborious one; let's look at what technology is used to run each solution.

It's worth noting the following skills are what I would say are needed to produce an effective system; that is not to say that you couldn't do it with less, or that you shouldn't have more…

RHN Satellite required skills

  • Understanding of python, especially if you want to interact with the APIs or configure templates; if you are not going to use it for templating configuration files you will need no skills in this.
  • Basic sysadmin skills, so RHCT or higher
  • Understanding of networking
  • That’s it; it is designed to be easy to use. The installation and configuration is not overly easy, though: if you know what you are doing (as good as the Red Hat consultants), you’re looking at 3 days; if you have no idea what you’re doing, allow 2 weeks. Personally I would get Red Hat in to set it up and train you at the same time; you’ll be in a better position for it.

Puppet required skills

  • Good Ruby skills, or very good general programming skills
  • Understanding of object-oriented programming languages
  • Good sysadmin skills, RHCE level; if you can’t set up DHCP, DNS, yum repos, Apache, NFS, SVN or basic web applications from the CLI, then this is not you.
  • Understanding of networking
  • A slightly higher set of skills, but achievable by most. It is not going to be difficult to install and get working; I’d think if you had an RHCE you’d have a working puppet server in a matter of minutes (you would need to add a yum repo and install a package, done; see the sketch below).
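You could even use puppet itself to describe that bootstrap. A minimal sketch, assuming the Puppet Labs yum repository on an EL 6 box; the repo URL and package name are from memory, so check them before relying on this:

    # Hypothetical bootstrap manifest, run with `puppet apply`
    yumrepo { "puppetlabs-products":
      descr    => "Puppet Labs products",
      baseurl  => 'http://yum.puppetlabs.com/el/6/products/$basearch/',
      enabled  => 1,
      gpgcheck => 0,   # in real life, enable this and supply the GPG key
    }

    package { "puppet-server":
      ensure  => installed,
      require => Yumrepo["puppetlabs-products"],
    }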

Who’s running the service?

Okay, so you, the super-duper sysadmin, manage to convince yourself that you are the right person to run puppet. You get it installed, set up and configured, write a few basic modules and then pass it off to some other team to “run with it”, a team whose skill set is closer in line to that needed for RHN Satellite. In short: bad sysadmin!

You have to consider who will be using the system, their skill set and their aptitude to learn and progress; just because you want to do puppet doesn’t make it the right solution. You could always go with the simpler RHN Satellite server to start with and, as skills develop, look again at something like puppet in a couple of years.

Step 2

What features do you need? Not what features you want… So, what does that mean? Do you need dynamic configuration files, that is, files that change their contents depending on the state of the node or the configuration around it, using if statements, loops, case statements etc.?
Do you want to be able to easily build and deploy servers from “bare metal”?
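To make “dynamic configuration” concrete, here is a small hypothetical example of the sort of thing puppet makes easy; the class name, fact and values are all invented for illustration:

    # Pick an upstream ntp server based on a fact about the node
    class example::ntp {
      case $::environment {
        "prod":  { $ntp_server = "ntp1.example.com" }
        default: { $ntp_server = "ntp-test.example.com" }
      }

      file { "/etc/ntp.conf":
        content => "server ${ntp_server}\n",
        owner   => "root",
        mode    => "0644",
      }
    }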

Hopefully by this point you will have a good understanding of the skill set needed to support each solution and of what the business actually needs. Now you’ve done this, I can happily tell you that either solution is able to do whatever you are thinking of (within reason), but it was important to get a fuller understanding of what was needed.

Puppet Features, a totally non-exhaustive list

  • Dynamic configuration of files through very powerful Ruby templating; if you can think of it, it can do it
  • Powerful node inheritance for ease of managing sets of servers
  • Classes to manage applications with parameters and basic inheritance (see last week’s post)
  • I did say it was non-exhaustive; for a full list look here, but be warned: just because it says it can do something doesn’t mean it can do it as well as you might think. Doubly true of the RHN Satellite server!

The important thing for puppet is the community behind it and the fact that it is extensible; you can do anything with it. You can create your own providers, resource types, facter variables and so on, and there are always new developments, so you really do need to subscribe to their blogs.

You can get an Enterprise version, which comes with support, a fancy GUI and all that warm fuzzy stuff, and you can even get the “bare metal” build by using something like Foreman.

Enough of Puppet, what about the RHN Satellite!

RHN Satellite Features

  • Repository management through Channels
  • Organisational management – You can create sub-organisations that have access to specific nodes or profiles to apply builds, so they appear to have their own RHN
  • Security control – You can easily manage access to the web interface, nodes, access keys
  • Easy to use – Really it is, anyone with a little tech savvy and some time on their hands could work it out
  • A few more features can be found here, but the real benefit of the RHN Satellite system is the ease of use; if the people running the service are more RHCT than RHCE, then it’s worth considering.

I will say it feels easier to manage your patch release cycle, although in reality it isn’t; what the RHN Satellite does do is allow anyone to manage the flow, move systems between different channels and so on.

One of the features I liked the most was the ability to group servers together, apply patches to those groups, manage a group of servers as if it were a single server, and migrate them from dev to staging and so on.

The ups and the downs

So we’ve looked at what you need, what you want and who should be looking after it, and we’ve even touched on a few features. With all that in mind, I never once mentioned the downsides of either.

The biggest downside with puppet is its flexibility and its pure focus on configuration management: it doesn’t fire up a PXE boot service or easily integrate the OS install with the configuration without additional tools, it just does configuration. As a result you have to provide all of these ancillary services yourself, in addition to the configuration management, to achieve the same completeness of service that you get from the RHN Satellite. It is for this reason that you need the additional skills and experience to cope with it, or a tool like Foreman.

So what about the RHN Satellite server? The biggest let-down here is the configuration management. If you want to push a static file out it’s really straightforward, and when I was looking at this a few years back you could put variables into the files, but from memory the set of options was limited: you could add the local IP address or the hostname of the server, but you couldn’t pass in custom settings.

The biggest benefit of puppet is the combination of generic modules and well-written template files; the principle behind it is that you may very well have a complicated module, but you should be able to switch the configuration at a moment’s notice. This provides a very flexible approach to delivering the configuration. For example, you can have a simple apache module to which you can add additional complexity through inheritance, parameters and defines, as sketched below. With RHN Satellite you just won’t get that, unless you re-package your apache into its own RPM for each type of web service.
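To give a feel for that apache example, here is a minimal, hypothetical sketch; the module layout, the vhost define and its template are invented for illustration, not a real published module:

    # init.pp – the simple base module
    class apache {
      package { "httpd": ensure => installed }
      service { "httpd":
        ensure  => running,
        enable  => true,
        require => Package["httpd"],
      }
    }

    # vhost.pp – a define layers per-site complexity on top
    define apache::vhost ( $docroot, $port = "80" ) {
      include apache
      file { "/etc/httpd/conf.d/${name}.conf":
        content => template("apache/vhost.conf.erb"),   # hypothetical template
        notify  => Service["httpd"],
      }
    }

    # usage, e.g. in a role manifest
    apache::vhost { "www.example.com":
      docroot => "/var/www/www.example.com",
    }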

With the RHN Satellite the biggest advantage is purely the ease of use. It is a jack of all trades and a master of none, but if your aims are simple and your staff are not quite up there on the pillars of excellence, it is a good solution that will let you do most of what you want.

Summary

For me, I’d boil it down to this simple way of deciding:

If your company is predominately Microsoft Windows, or the sysadmins are not dedicated (and yes, that’s sysadmins plural…) to Linux, then I would recommend RHN Satellite; unless you have a very specific use case that cannot be solved by the RHN Satellite, it is worth giving up some flexibility. For example, rule out the RHN Satellite if you need to manage both Red Hat and Debian, or if you know (know, not think…) that you are going to be growing a team, in skills and numbers, of dedicated Linux sysadmins.

If you are an open source company, or have dedicated Linux sysadmins who have been there, done that, bought the T-shirt, ruined the T-shirt redecorating the house, and know the difference between nc -z -t and nc -z -w 10, then I would consider puppet your first choice. It is young and up-and-coming; it’s easy to forget that it has not even been around for 5 years. It has some rough edges, but they are getting better, and with commercial support available it makes total sense.

It’s worth touching on training and the availability of skills. Good luck: RHN Satellite skills are not widespread and are mainly retained within Red Hat. Puppet skills are in very high demand, and people claiming the experience and understanding may be pulling your leg (by a lot, in some cases). However, training is available for both the RHN Satellite and Puppet.

These are not two products you can compare evenly: both can be run for free, and both can be very complicated. The only recommendation is not to choose either one on technical merit alone, but rather on which one fits best with the aims of the project, the hearts and minds of those using the system, and good ol’ fashioned gut feel.

For me, having used both, I would lean more towards Puppet, but I’m lucky: where I am, we have a lot of very technical people who are able to understand and work with puppet.

Puppet inheritance, The insane minefield

I love Puppet

I’m what you would call a newbie when it comes to puppet; I’ve been fortunate enough to be around it for the last 2 years, but I only started using it myself 15 months back. I’ve used other configuration management tools in the past, and one day maybe I will bore you with the inner workings of RHN Satellite Server, but puppet was different.

For me it had one great advantage over the RHN Satellite Server I had used: it allowed me to have dynamic configuration that was structured and sensible. Although the Satellite server allows you to replace certain variables, it was nowhere near as powerful.

Luckily for me I had my own personal puppet guru in the form of Adrian Bridgett, who has been working on puppet since the early days. That enabled me to progress quite quickly through some of the less interesting features and straight into the world of classes and parameters.

As time progressed I learnt all about the different resources and the various structures I could utilise, and the various quirks of working with puppet, such as loops; if you ever need to do a loop, good luck. In short, I have had the privilege of working with some very interesting puppet code created by people who knew puppet a lot better than I did, all the time with a resource on hand to help me learn.
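For the curious, the usual workaround for the lack of loops is to pass an array of titles to a define, which gives you one resource per element; a hypothetical sketch:

    # $name is the resource title, so this runs once per array element
    define example::greet {
      notify { "hello ${name}": }
    }

    $people = [ "alice", "bob", "carol" ]
    example::greet { $people: }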

I hate duplication!

One of my pet peeves with puppet at the moment is the way you have to pass variables around to make things work, and that proper inheritance between classes is just so poor.

For example, the following is a simple set of classes that works with no issues.

init.pp

    class frank ( $foo = "bar" ) {

      include frank::bob

      class { "frank::joe":
        foo => $foo,
      }
    }

bob.pp

    class frank::bob {

      file { "/var/lib/bob":
        ensure => "directory",
      }
    }

joe.pp

    class frank::joe ( $foo = "bob" ) {

      file { "/var/lib/bob/${foo}.txt":
        content => template("module/file.txt"),
        owner   => "frank",
      }
    }

Okay, so we have a class called frank, which takes a param $foo, includes a class bob, and declares another class called joe, passing the param foo down to it. This is a pretty standard way to pass variables around, and it works flawlessly. The downside? It’s rubbish. On the simple classes Puppetlabs always shows you it’s fine: you have a handful of parameters and it’s not too much hassle to maintain, or they show you a very simple ntp class that doesn’t give you a feel for the more complicated modules you may be writing yourself.

You could of course just reference the variables directly and not pass them into the child classes… Wrong. If you are using them in templates you will end up with variables that are not in the scoped path for the template. You could reference the fully scoped variable, i.e. $frank::joe::foo; that works as long as you only want the default, or you can spend ages working out how to get the variables around. If you actually want to set the variable you need to declare the class directly, so you end up with a more complicated nodes manifest (see the sketch below), which isn’t the end of the world, but in the world of keep-it-simple, less is more.
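To illustrate that last point, here is the difference in a hypothetical nodes manifest; the node names are made up:

    # If the default for $foo is fine, the node manifest stays trivial
    node "web01.example.com" {
      include frank
    }

    # But to actually set $foo you have to declare the class directly
    node "web02.example.com" {
      class { "frank":
        foo => "something-else",
      }
    }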

So imagine the above, but where you want to pass in 25 or 40 parameters; believe it or not, we have some complicated configuration files, and we want to keep as much flexibility in the module as possible. You basically end up duplicating a lot of variables, and your init.pp will be huge as a result.

It’s probably worth saying that we try, where possible, to keep the node manifests as simple as they can be, so typically the largest node manifest would be maybe 10 lines.

So here’s what ought to be a better way to achieve the same thing with less configuration.

init.pp

    class frank ( ) inherits frank::joe {

      include frank::bob
    }

bob.pp

    class frank::bob {

      file { "/var/lib/bob":
        ensure => "directory",
      }
    }

joe.pp

    class frank::joe ( $foo = "bob" ) {

      file { "/var/lib/bob/${foo}.txt":
        content => template("module/file.txt"),
        owner   => "frank",
      }
    }

One simple change: inherit! It makes so much sense; the init.pp has everything that is in frank::joe and some more. Wouldn’t it be nice if that worked? Well, it doesn’t, thanks to this and a number of other bugs; the inheritance within puppet just isn’t powerful enough and, more importantly, doesn’t work as expected.

It seems the only way to get around this is the first method, which means that even if you are splitting your classes out, you end up with duplication and more confusing modules.

I encourage everyone to give their own personal viewpoint on this, as it’s an area I’d like to learn more about.

Summary

I want an easy life: I want puppet modules that allow for proper inheritance and dependency management. I don’t want to be working around the quirks of an application; I want to work with the application, and I want it to be simple.

I am not a puppet expert, but I do think that you should be able to pass parameters through to the inherited class without having the variables re-defined at every level. What do you think?