Releasing your first DevOps application

First the worry

When it comes to releasing the first version of an application it’s always worth weighing up the constraints of your environment and the time frame in which the task must be delivered versus the skill set available. Inevitably, as a skilled DevOps professional you want to do a good job, well done you; however, you have to be strong and realise it is not about delivering perfection from day one but about the journey you must take to get there.

I recall the first version 1 deployment I did, and every one I have done since has been better, be it a bit more focused or starting from a better point. The very first one was all over the place: no real configuration management and quite a few manual steps, but a well written process. Unfortunately that project remained in the depths of secrecy and I ended up moving on.

I constantly see over-engineering and complication added to projects, and the root cause of this is worry. I know, I used to be there doing it; it is difficult to step back and be objective about what the business needs, but as a DevOps professional that is your job. When delivering a solution, try to remember these things to help you worry less and focus more:

  1. Before being perfect you must first just “be”
  2. When in doubt, do less
  3. If you do not know when the site is down you will not have a job
  4. Always have a backup

Then the delivery

The above list is rather useful, so use it as a bit of guidance. Starting with point 1, some elaboration: when delivering a solution the most important thing is to deliver the solution. So many people forget this part and focus on the technicalities, or on whether or not it is the “best” way to deliver the solution. In reality, no one will care when you are in that meeting explaining why you’re late and have not got a working solution.

Getting stuck in the detail is a horrible place to be. Sometimes things get too involved or too complicated, leading to much discussion, and inevitably the solution comes out complicated and takes a while to deliver. In these situations point 2 comes in: just do less. It sounds silly, but if you’re rushing around struggling to meet a deadline then you need to take things out of scope and focus on what the solution actually needs to be. Maybe you have to have a manual step; at a later point you can automate it.

The last two points are along the same lines, and those lines are things that get you fired. If your site is down and you don’t know about it, that’s a bad thing; likewise losing data is considered pretty poor. However, do not get stuck in the trap of assuming you must have full monitoring of every server or that the backup needs to be anything more than a cron job for now.

The “trick” is always around identifying what needs to be done versus what could be done; by focusing on what needs to be done first you can then come back to improve the rest.

Build, improve, rinse, repeat

As touched on earlier, you are allowed to cut corners and focus on what is necessary; failure to do this will just lead to delays and a business that is rapidly getting turned off DevOps. The first release you do can be complete and utter crap, it can be all manual, with nothing more than a simple web check on port 80; that is okay. The important thing is you deliver to the deadline. You have mitigated the main risks of not knowing when the site is down or the potential loss of data; heck, even single points of failure are allowed as long as you can clearly identify what the risk is and a solution if it were to happen. In fact, I’d almost go as far as to say this is expected.

The key, as always, is to improve, little and often. Step 1: manual. Step 2: automate what is easy. Step 3: automate the rest. It has never been and will never be about perfection from version 0.1 onwards; you just need to improve a little each time, in line with that golden view of what perfection is. As long as you know what the end goal is you can work towards it, just don’t get carried away by trying to deliver it all in the first version.

Deploying Sinatra based apps to Heroku, a beginner's guide

A bit of background

So last week, I was triumphant, I conquered an almighty task: I managed to migrate my company's website from a static site to a Sinatra backed site using partial templates. I migrated because I was getting fed up of modifying all of the pages, all two of them, whenever I wanted to update the footers or headers; this is where the partial templates came in. Sinatra came in because it had decent documentation and seemed good…

So, at this time, feeling rather pleased with myself, I set about working out how to put this online with my current hosting provider, who I have a few domain names with, though only through an acquisition they made. I thought I’d give them a shot when setting up my company site, better the devil you know etc., and they supported PHP, Ruby and Python, which was fantastic as I knew I would be using Python or Ruby at some point to manage the site. After a frustrating hour of reading, trying to work out how to deploy the Ruby app and finding no docs with the hosting provider, I logged a support ticket asking for help; the reply was along the lines of “I’m afraid our support for Ruby is very limited”. I chased them on Friday to try and get a response on when it would be available: no progress, some excuses because of the platform. So I asked “Do you currently have any servers that do run Ruby?” to which the reply was “I’m afraid we have no servers that run Ruby, it shouldn’t be listed on our site, I didn’t know it was there.”

By this point alarm bells were ringing and I thought I best think about alternatives.

Getting started


Before even signing up to Heroku it’s worth getting a few things sorted in your build environment; I had to implement a lot of this to get it working and it makes sense to have it beforehand. For starters, you need to be able to get your application running using rackup, and I came across this guide (I suggest reading it all). In short, you use Bundler to manage which gems you need installed; you do this by creating a Gemfile listing the gems and specifying the Ruby version (2.0.0 for Heroku).

My Gemfile looks like this:

source 'https://rubygems.org'
ruby '2.0.0'

# Gems
gem 'erubis'
gem 'log4r'
gem 'sinatra'
gem 'sinatra-partial'
gem 'sinatra-static-assets'
gem 'split'

It simply tells rack / bundler what is needed to make your environment work, and with this you can do something I wish I had found sooner: you can execute your project in a contained environment, so you can test that you have the dependencies correct before you push the site, by running a command like this:

bundle exec rackup -p 9292 config.ru &

NB You will need to run

bundle install

first.

By now you should have a directory with a Gemfile, Gemfile.lock, app.rb, config.ru and various directories for your app. The only other thing you need before deploying to Heroku is a Procfile with something like the following in it:

web: bundle exec rackup config.ru -p $PORT

This tells Heroku how to run your app, which combined with the Gemfile, Bundler and the config.ru means you have a nicely contained app.
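
For reference, the config.ru itself can be very small. The sketch below is illustrative only and assumes your app.rb defines a modular Sinatra app called Sinatra::App (as mine does); adjust the require and the class name to match your own code:

# config.ru - minimal sketch, adjust the class name to your own app
require 'bundler'
Bundler.require          # load the gems listed in the Gemfile

require './app'          # app.rb containing the Sinatra application
run Sinatra::App         # rackup (and Heroku's web dyno) serve whatever is passed to run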

Signing up


Now, why would I look at Heroku when I’ve already spent money on hosting? Well, for one, it will run Ruby; two, it’s free for the same level of service I have with my current provider; three, it’s 7 times quicker serving the Ruby app in Heroku than serving the static files with my current host. So step one, sign up: it’s free, no credit card, and you get 1 dyno (think of it as a fraction of a CPU; I'm not convinced you get a whole one).

Create a new app. Now, a good tip here: if you don’t already have a GitHub account, Heroku is going to give you a git repo for free; granted, no fancy graphs, but a nice place to store a website in without forking out for private repos in GitHub. Once your site is in the Heroku git repo you just need to push it up and watch it deploy; at this point you may need to fix a few things but… it’ll be worth it.
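
If you prefer the command line to the web UI, the whole push-and-deploy loop looks roughly like this (illustrative commands using the Heroku toolbelt; substitute your own app name):

heroku login                # authenticate the toolbelt
heroku create my-app-name   # create the app and add a "heroku" git remote
git push heroku master      # push your code; Heroku builds and deploys it
heroku open                 # open the deployed site in the browser
heroku logs --tail          # watch the logs while you fix those few things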

Performance

I don’t want to say it’s the best, so I’m going to balance up the awesomeness of what follows with this; I suggest you read it so you can form your own opinions.

So, using Pingdom’s tool for web performance, I tested the performance of my site, hosted in the UK, vs Heroku in AWS’s European (Ireland) region, and here are the results:

The current site is behind a CDN provided by Cloudflare and already had a few tweaks made to make it quicker, so this is as good as it gets for the static site: results

Now the new site, unpublished due to the aforementioned hosting challenge, does not have a CDN and is not using any compression yet (unless Heroku is doing it), but its performance is significantly quicker, as seen in the results

Now for those of you who can’t be bothered to click the link: the current site loads in 3.43 seconds, which is slow but still faster than most sites; the Heroku based site loads in 459ms, so 7 times quicker, and it’s not CDN’d yet or white space optimised. That’s pretty darn quick.

Sinatra – partial templates

Singing a different song

Firstly, apologies: it’s been over a month since my last blog post, but unfortunately with holidays, illness and a change of jobs I’ve been struggling to find any time to write about what I’ve been doing.

A few months back I did a little research into micro web frameworks and did quite a bit of reading around Sinatra, Bottle and Flask. To be honest they all seem good, and I want to play with Flask or Bottle at some point too, but Sinatra is the one I’ve gone for so far as the documentation was the best and it seemed the easiest to use and the easiest to extend, not that I’ll ever be doing that!

Either way, I thought I’d have a bit of a play with it and see if I could get something up and working, locally for now due to the lack of documentation from my hosting provider… I have re-created my website, Practical DevOps, within Sinatra using rackup, Bundler and ERB templates.

Now there are a few reasons I did this: one, I wanted to stop updating every page whenever I needed to update the header or footer of a page; two, I want to implement split testing (also known as A/B testing) using something like Split. With all of this in mind, and the necessity of having a bit more programmatic control over the website, it seemed like a good idea to go with Sinatra.

The Basics of Sinatra

By default Sinatra looks for static content in a directory called “public” and will look for templates in a folder called “views”, both of which can be configured within the app if needed. For basic sites this works fine and would have worked fine for me, but I really wanted partial templates to save having to enter the same details on multiple pages, and this can be done with sinatra-partial.
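
If you ever do need to move those directories, both are just settings on the app; a quick, hypothetical sketch (the directory names here are made up):

# Hypothetical example: pointing Sinatra at non-default directories
require 'sinatra/base'

class App < Sinatra::Base
  set :public_folder, File.join(__dir__, 'static')    # instead of ./public
  set :views,         File.join(__dir__, 'templates') # instead of ./views
end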

#!/usr/bin/ruby

require 'sinatra'
require 'sinatra/partial'
require 'erb'


module Sinatra
  class App < Sinatra::Base

   register Sinatra::Partial
   set :partial_template_engine, :erb

    #Index page
    ['/?', '/index.html'].each do |path|
      get path do
        erb :index, :locals => {:js_plugins => ["assets/plugins/parallax-slider/js/modernizr.js", "assets/plugins/parallax-slider/js/jquery.cslider.js", "assets/js/pages/index.js"], :js_init => '<script type="text/javascript">
        jQuery(document).ready(function() {
            App.init();
            App.initSliders();
            Index.initParallaxSlider();
        });
        </script>', :css_plugins => ['assets/plugins/parallax-slider/css/parallax-slider.css'], :home_active => true}
      end
    end
  end
end

So let’s look at the above which is simply to serve the index of the site.

   register Sinatra::Partial
   set :partial_template_engine, :erb

The register command is how you extend Sinatra: the sinatra-partial gem, when it is installed, simply drops its code in the Sinatra area, and when you call register all of its public methods are registered. This allows you to do stuff like this with magic, or you can use it in the ERB template like this. The next line simply tells Sinatra to use ERB rather than Haml; I chose this because Puppet and Chef both use ERB and as a result I’m a lot more familiar with it.

 #Index page
    ['/?', '/index.html'].each do |path|
      get path do
        erb :index, :locals => {:js_plugins => ["assets/plugins/parallax-slider/js/modernizr.js", "assets/plugins/parallax-slider/js/jquery.cslider.js", "assets/js/pages/index.js"], :js_init => '<script type="text/javascript">
        jQuery(document).ready(function() {
            App.init();
            App.initSliders();
            Index.initParallaxSlider();
        });
        </script>', :css_plugins => ['assets/plugins/parallax-slider/css/parallax-slider.css'], :home_active => true}
      end
    end

One of the nice things with Sinatra is it’s simple to use the same handler for multiple routes, and the easiest way of doing this is to define an array and simply iterate over it for each path. Sinatra uses the HTTP methods to define its own functions for what should happen, so an HTTP GET requires a route to be defined using the “get [path] block” style syntax, and likewise the same for POST, DELETE etc; see the Routes section of the Sinatra docs for more info.

The last section is calling the template. Typically the syntax could just be “erb :page_minus_extension”, which would load the ERB template from the “views” directory created earlier. If you wanted to pass in variables you would define ‘:locals’, which takes a hash of variables. All of these variables are only available to the template that was called at the beginning, so getting the variables to the partial requires some work within the template.

Now within the views/index.erb file I have the following:

<%= #include header
partial :"partials/header", :locals => {:css_plugins => css_plugins, :home_active => home_active}
%>

Partial calls another template within the views directory, so as I have a partial called header.erb in views/partials/ it loads that, and by defining the locals again from within the template I am able to pass the variables from index into the header or any other partial as needed.
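
For illustration, the views/partials/header.erb on the receiving end might use those locals something like this (a simplified, hypothetical snippet; the real header obviously has far more markup):

<%# views/partials/header.erb - simplified, hypothetical example %>
<% css_plugins.each do |css| %>
  <link rel="stylesheet" href="<%= css %>" />
<% end %>
<ul class="nav">
  <li class="<%= home_active ? 'active' : '' %>"><a href="/index.html">Home</a></li>
</ul>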

Okay, that’s all folks. Hopefully that’s enough to get people up and running; have a good look at the examples in the git projects, they’re useful, and be sure to read the entire intro to Sinatra, very useful!

Helping others

A while back

I was volunteered to help our finance team customise their accounting system, and finally I’ve got to the end, or at least the end is now in sight. It was an interesting situation: a few years back the accounting used to be done in Access and a couple of years back they migrated to a solution provided by NetSuite. I have to say, when I started I spent ages complaining about how awkward and painful it was to do anything in NetSuite, but as I’ve gone on the journey I’ve come to understand a bit more and it’s not bad; it’s still massively complicated, but it is at least not that bad!

One of the most frustrating things I found was using web services. I’d never done it before and on paper it sounded like the best solution, but it was a nightmare; the most annoying thing is the provided WSDL had errors in it, so I ended up having to spend some time finding the bad lines and fixing them from within vi thanks to this link. Needless to say, after a few weeks of struggle I started down the right path, which was to use NetSuite itself to do more and to write a simpler tool to do the rest.

Time for a rest

I found the RESTlets, which were a big help: use their documentation to work out roughly what to do, then hook it in and hope for the best. It's worth noting the documentation is excessive and hard to follow, which makes you wonder why they bother. Either way, after writing that and then spending some time trying to get it up and working it was there; I just needed the other 7 pieces of the puzzle to make it work. Luckily you can call searches remotely to get a list of results and then manipulate them and send them back. I came up with a search (with a lot of help from the finance team) that only returned entries I hadn’t yet changed, then took the details from that and manipulated them into the format I needed to post back to my RESTlet to do the updating.

After sorting out the entry of data I then had to work out how to invalidate the entry if anything was changed. This was harder; it took a bit of reading and then it was clear what to do, just impossible to find out how to do it! I must have spent more time on this project working out how the system worked and just using the tool than actually coding.

Bigger and better things

This initial step was an important one; it sets the foundation for a raft of other changes that will hopefully be simpler and easier to put in place, but like anything unknown, we won’t know until we know. I’m looking forward to taking a system that already provides value and making it provide more, with reporting and customisations; hopefully over the coming weeks it will progress to produce more useful information.

Summary

In short, help people. Sometimes it starts off being a pain, but at least you help someone and you also learn something. Yes, it may lead to a flurry of other bits, but ask yourself this: if you needed help, would you want someone to help you?

Let’s Chat

It’s good to talk

Over the last few weeks I’ve been playing with some forum based technology, trying to find an ideal platform to take and add lots of modifications to, so it integrates fully with another app. The idea is that in said other app you will click “help” and it will take your code and post it directly to a forum, where people can offer up code solutions; you, as the original poster, can then click the “try it” button, which puts the code back in the other app and allows you to test it against your test systems.

There’s a whole other bunch of features and gamification that needs to happen to help build up a desire to help people, with rewards for those that help the most. Needless to say there will be a lot of customisations, so I spent a little bit of time playing around with a few technologies, and the one we’re going to use is NodeBB. It’s also worth mentioning they have a nice Indiegogo campaign going and more support is required!

Everyone we’ve shown the forum to likes it. It is still quite immature, they only started in mid May (2013), but the code has come a long way and it is continuously improving. With help from me ;)

Building a stable platform for the future

I have some constraints around what I’m doing. One is that I’m still new to Javascript, learning, but new; luckily I have people I can bug if I get stuck, which is really helpful. I also have to make the discussion boards easy to support for the people that will be looking after them and easy to integrate with our existing app as and when needed.

The next biggest issue with building a solid platform is a good choice of frameworks: using a framework like PassportJS rather than writing individual login methods, implementing configurable logging with something like Winston, and using templating solutions like Dust (Dust seems really good but is not maintained any more, I guess it’s perfect?) or Jade (Jade is only server side, but ClientJade fills that niche).

All of these frameworks will just make my life easier in the long run; I’ll hopefully be able to work quicker and it should give others one thing less to learn, assuming I choose libraries they are familiar with anyway.

So there are a few things that need to be looked at from a supportability point of view and to enable easier development by us, and then there are all the extras we need to work on. Because of the level of integration, and needing to poke almost all elements, it will be interesting to see how the plugins will work for NodeBB. I’m desperately trying to work out how I can maintain heavy customisation of the code and still contribute back to shared goals; I’d prefer not to fork and become so separated that we can’t push back contributions, so hopefully we’ll be able to work out how we do that!

What challenges you?

Over the last few weeks

I have been wondering what most people find challenging in the “modern” IT world. There’s been a recent upsurge in tools and technology that address most problems, which only leaves me to wonder what gap is left. What is the current big annoying problem? Maybe it’s not being able to push your architecture into multiple clouds, or having to live with the constraints of small root disk volumes; who knows? Hence the poll :)

A week in the Valley

While out and about…

Over the last week I’ve been out in the bay area meeting with an important client, talking about their needs and how we’re going to make things better for them and for us; all in all a good trip (apart from the plane crash). This was my first time in the bay area and it seems like a nice enough place; it lives up to expectations in some areas and not others, and I’m sure with more local knowledge it’s possible to overcome some of the issues I had with the area. The main issue I could see (granted it was only a week) is that it’s not as nice to live there as it is in the UK, or even as nice to be in and around as London.

In London, everything is a walk away or a short tube journey, and better yet, if you’re willing to travel more than an hour each way each day you can live in the countryside and just commute in; but in the valley everything is a short car journey away, the public transport seems a bit hit and miss and taxis aren’t cheap!
I think it's things like this that will be the end of the bay area over the next 10 years unless it changes, and I’m not the only one to think this; as time moves on I think we’ll see a shift in tech start-ups away from Silicon Valley into areas that are nicer to live in.

Which is what brings me back to London: there’s a good start-up culture, there’s more investment going on and there are some good companies starting to appear. Unfortunately for tech startups the UK still isn’t brilliant, but it will get there in time; it probably just needs a few more years and some brave people to blaze the trail.

I think London has the makings of a nice tech hub for Europe and will over the next few years start exceeding the bay area; the only thing it’s really missing at the moment is the massive success stories that appear in the bay area every few years. Sure, there are some good companies, but none are an Apple, Google or Facebook.

I think I could survive out there for a while but not forever. It’s nice being able to go to the beach, forest, mountains, whatever you want, all within a reasonable drive, but there’s too much convenience stuff, like fast food, corner shops and drive-throughs. Like Walmart, it’s got a purpose but not for me; Trader Joe's seemed better, but no soft drinks, just fresh goods and booze… Maybe in time I would have found stuff that felt a little more “me” and a little less American, but I’d have to go and give it a go to find out!

For me personally I don’t really want to live in London, it’s just too busy, but living out in Hampshire makes London an awkward commute, doable but not every day. As time goes on I’m still hopeful that more start-ups will start offering flexible working like we have at Alfresco, where going in for 2-3 days a week is the norm and everyone is trusted to do the work, and who knows, over the next 10 years maybe more will start filling the M3/M4 corridor, which will make living in a nice place and commuting to a nice tech company all possible.

It will certainly be interesting to see how the tech industry in the UK changes over the next few years, but I’m certain it’s picking up speed.

Doing it the hard way

Let's make it really complicated

Over the last two weeks I’ve been playing with an open source project that has a bit of a Kickstarter going to fund their idea. I came across it through my boss, who probably found it on Reddit, but in essence it’s a forum using NodeJS as the backend and some jQuery at the front end, and all in all it looks pretty awesome; sure it has a few flaws, but it’s less than 6 months old.

In my first venture into using it on my laptop I came across a bug with the title of a new post: when you reply it locks the title to stop you editing it, but unfortunately it didn’t unlock it, so when you went to post a new topic you were unable to set a title. I decided that I could raise a ticket or I could just have a look at it, so I had a look and submitted a fix to them; to my amazement they accepted my fix! We decided that this is a good platform to use for what we’re trying to do at work, but it needs a few core fixes followed by quite a bit of integration and customisation. The core things we need to fix in the product (in order of importance) are:

  1. Deployment to bespoke path
  2. Increased logging for debugging
  3. Additional authentication routes

So I decided that, not knowing the code, I should start by working on point one; it makes the product useful and gets me involved in a lot of the code, so hopefully I’ll learn something. It seemed a sensible place to start, and it was a sensible place to start; unfortunately, being a new project, there aren’t a lot of documents or sites to google for this stuff, so I’ve been learning the hard way.

The main challenge is learning the code; the other is working out why things were done in a certain way. One of the issues I have is that currently there are something like 6 config.json files all used for different things, with different config, so I need to reverse engineer all of it. It's a little annoying, seeing as with some better technology choices it could just be one config file, but then I also don’t know why things have been done that way.

Challenging me more!

Up until recently my experience of NodeJS had been rather limited. I had used some cool things like Express, Winston and Jade, but I had always had the luxury of talking to the developer that wrote it to help me understand why it was done that way and how I should use it; this time I’m on hard mode, I have to understand it from reading, and I have questions! Hopefully the people running the project will have some time to spend helping me get up to speed and answering my stupid questions. I read through the code and I just don’t know what’s going on; I think this is partly down to not really being a programmer and partly down to not knowing the language very well, so everything is a little odd. At least I hope that's the case; if it is just bat shit crazy then at least I’m confused for a reason :)

I’m definitely going to persevere as I’m sure it will be useful, and it is in the project team's best interest to help me understand what it does so I can start submitting “awesome” changes back to the project, even if they don’t want them :)

Either way, I’m looking forward to diving into the code a bit more and trying to work out what it’s doing; hopefully I can make it work for our purposes while providing useful (although not needed) features back to the project. I’m also looking forward to increasing the complexity of what I’m doing so I can start on the more bespoke work we need, with integrations and maybe adding an achievements framework or some sort of gamification to the forum tool.

Configuration management alone is not the answer

Everything in one place

Normally businesses starting out building a product, especially those without pre-existing knowledge of configuration management, tend to just throw the config on the server and then forget what it is. This is all fine, it’s a way of life and progression, and sometimes just bashing it out can prove very valuable indeed, but typically this becomes a nightmare to manage. Very quickly, when there are 100 servers all manually built, it’s a pain in the arse, so then everyone jumps into configuration management.

This is sort of phase 1: everything has become too complicated to manage, no one knows what settings are on what boxes and more time is spent working out if box 1 is the same as box 2. This leads to the need for some consistency, which leads to configuration management. The sensible approach is to move an application at a time into configuration management fully, not just the configuration files.

During this phase of execution it is critical to be pedantic and get as much as possible into configuration management. If you only do certain components there will always be the question of “does X affect Y, which isn’t in configuration management?”, and quite frankly, every time you have that conversation a sysadmin dies of embarrassment.

Reduce & Reuse

After getting to phase 1, probably in a hack and slash way, the same problems that caused the need for phase 1 happen again: 100 servers in configuration management, lots of environments with variables set in them, on the servers, and in the manifests themselves, and the questions start to become “is that variable overriding that one?”, “why are there settings for var X in 5 places, and which one wins?”. Granted, in configuration management systems there are hierarchies that determine what takes precedence, but that requires someone to always look through multiple definitions. On top of having the variables set in multiple locations, it is probably becoming clear that more variables are needed, more logic is needed, and what was once a sensible default is now crazy.

This is where phase 2 comes in: aim to move 80%+ of each configuration into variables, have chunks of configuration turned on or off through key variables being set, and set sensible defaults inside a module/cookbook. This is half of phase 2; the second half, and probably the more important side, is to reduce the definitions of the systems down to as few as possible. Back in the day we used to have a server manifest, an environment manifest and a role manifest, each of which set different variables in different places. How do you make sure that your 5 web servers in prod have the same config as the 5 in staging? That’s 14 manifests! Why not have 1? Just define a role and set the variables appropriately; this can then contain the sensible defaults for that role, and all other variables would need to be externalised in something like Hiera, or you would need to push them into Facter / Ohai.

By taking this approach to minimising the definitions of what a server should be, and reducing them down to one, you are able to reuse the same configuration, so all of your roleX servers are now identical except for whatever variables are set in your external data store, which can now easily be diff’d.
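
To make that concrete, here is a rough sketch of what a single role definition might look like in Chef's Ruby DSL (a hypothetical example; the same idea applies to a Puppet role class with the data pushed out to Hiera):

# roles/webserver.rb - hypothetical example of one reusable role definition
name "webserver"
description "Front-end web server, identical in staging and production"

# Sensible defaults live in the role itself...
default_attributes(
  "nginx" => {
    "worker_processes" => 2,
    "gzip"             => "on"
  }
)

run_list(
  "recipe[nginx]",
  "recipe[myapp::web]"   # hypothetical application cookbook
)

# ...and only the values that genuinely differ per environment are overridden
# in the environment definitions (or an external data store), so a diff of
# staging vs production shows exactly how they differ.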

Build, don’t configure

By this point phase 1 & 2 are done and all is well with the world, but still there are some oddities: box X has patch level y and box A has patch level z, or there’s some left over hack to solve a prod issue which causes a problem on one of the servers. Well, treat your servers as configurable and throw-away-able. There are many technologies to help with this, be it cloud based with Amazon and OpenStack, or maybe VMware, or even physical servers with Cobbler. This is phase 3: build everything from scratch every time. At this point the consistency of the environment is pretty good, leaving only the data in each environment to contend with.

Summary

Try and treat configuration management as something more than just config files on servers, and be persistent about making everything as simple as possible while trying to get everything into it. If you’re only going to manage the files you might as well use tars, and if that sounds crazy, it’s the same level as phase 1, which is why you have to get everything in. I realise it can seem a massive task, but start with the application stack you’re running and then cherry pick the modules/cookbooks that already exist for the main OS components like ntp, ssh etc.

Cut down, deliver early and often

Deliver all the things

There are idealistic people in the world and that’s fine (thanks for reading, by the way :) ), and there are pragmatic people. I want to go through how you can provide solutions that take a pragmatic approach to delivering value, doing it in a way that gets it done on time but still helps you get to your idealistic goals.

Often when sitting down and planning with senior management bods, a feature list as long as your arm comes out. Even if this is thought to be the bare minimum list, in reality it probably isn’t and is instead a bloated minimum; there’s always room to cut out features, so agree some prioritisation on the features and operate a bucket approach, one in, one out.

After the features are agreed you can then start delivering them, cutting where necessary.

Deadlines for deadlines’ sake

When it comes to deadlines, people treat them a bit like Marmite: you either love them or you hate them. Some people feel that having a deadline is a sure-fire way of creating a bad product as corners are cut; others think that if you have a deadline you can at least work towards something with an end in sight, rather than running off into the wild forever and ever.

To deliver the features needed it’s better to have a deadline, and one that stretches you and forces you to make some cuts. It’s not about not delivering or delivering badly, it’s just about delivering what is needed; worst case scenario you have to deliver everything, but this way at least you do it in stages.

Certainly with a deadline you can help focus people on delivering what is important; I think sometimes people get caught up in trying to deliver everything perfectly for the deadline rather than delivering the value they already have. Some of my colleagues and myself are currently working on a monitoring and metrics platform that integrates fully with nagios style checks but also allows you to write them in the web browser and test them on a server of your choice before distributing. The idea being that you can take the monitoring up to a real time level while reporting back business level reporting and everything in between, so you have one place to go to find out why something isn’t working, how well it has been doing, how many people have signed up; it’s a devops dashboard really.

Anyway, for a couple of months we had been identifying the core technologies and implementing various key functionality in the product, but at no point was any of it “working”: some bits sort of worked but not quite, some bits just weren’t there. There was no real end date to this project, as it’s something that will keep evolving until it works and is useful, however we needed something to work towards, and after a couple of months of sorting out the technology a deadline was set to do a demo. Within a week we had the product up and working with the pages we needed, with the correct functionality and everything working fine. Writing a nagios check on the fly, pushing it to the server, distributing it to all the others and then reporting, all in less than 30 seconds: wonderful.

I’m not saying what we did in that week was “production ready”, but if our livelihood depended on it, it was good enough, and that's what being agile and lean is about: what is the least amount of work I can do to get me to the minimum product I need with the least amount of effort? The key is obviously not to get stuck delivering the bare minimum all the time; with every sprint you need to improve upon what was there as well as add the new stuff. I think it is necessary to always fix something up when adding new features to get the product better, and it certainly works for us.

Alternatively, of course, we could have not had a deadline and kept drifting aimlessly into the distance ensuring that the technology was “just right” all the time but the reality is we have to deliver something somewhere.

Iterate

Anyone that is familiar with Agile, Scrum, Extreme Programming etc knows it’s better to deliver in small bite size pieces than in large chunks: you can provide value back to the business quicker and you focus on doing the task rather than doing it well. Not all tasks can be done by cutting a few corners, but there’s normally a quick way, a good way and the right way of doing it, so choose one and go for it. If it is a bit of Angular that pulls down a list of plugins, go the quick way; if it’s a graphing engine that needs to draw lots of graphs and is used everywhere, do it the right way; you’re sensible people, find a balance.

I’ve been talking all about software development, which is where most of these methodologies come from, but they can be applied to systems administration as well. I think the same goes for sysadmins as it does for programmers: they tend to get stuck doing the best solution rather than the solution the business needs. Just to dispel any hopes and dreams, maybe save some time by realising that the business cares that it works and is stable, not how elegant or easy to maintain it is. So when coming up with a load balancing solution, maybe version 1 is haproxy with basic config, version 2 is a bit more in depth, version 3 is F5 & haproxy, version 4 is F5, haproxy and caching…. By all means have the hopes and dreams of the gold solution, but deliver the bronze one, okay? If people really use the system and it provides more value, iteratively make it better, maybe a bronze + a bit of silver; little chunks, often.

Summary

Don’t get stuck in the end goal: think about what the client or the business really needs, the bare minimum; deliver that, measure usage, iterate and improve.