It is expensive, time-consuming, and often boring to fix a server. So why fix a server when you can just throw it away and build a new one?
This is the essence of immutable infrastructure design, also known as disposable infrastructure. This week we sat down with Phil Christensen, Sr. Systems Engineer at Logicworks, to talk about immutable infrastructure design on AWS and how to separate the hype around the term from reality.
What is immutable infrastructure and why should we care?
Immutable infrastructure refers to a system where virtual instances are “disposable” and once you instantiate the infrastructure and code, you never change the instance. This way the infrastructure never strays from its initial “known-good” state, operations are simplified, and “failure” is a routine and continuous way of doing business. The system is so good at replacing itself that failure is a non-event.
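One way to picture the "never change the instance" rule is as a replace-only data model. The sketch below is a toy illustration in Python, not AWS code: the `Instance` class, the `launch` and `deploy` functions, and the image and version names are all hypothetical.

```python
from dataclasses import dataclass, FrozenInstanceError
from itertools import count

_ids = count(1)  # hypothetical instance-id generator


@dataclass(frozen=True)
class Instance:
    """Toy stand-in for a virtual instance; frozen means it cannot be edited."""
    instance_id: int
    image: str         # machine image the instance was built from
    code_version: str  # application version present at launch


def launch(image: str, code_version: str) -> Instance:
    """Create a brand-new instance in its known-good state."""
    return Instance(next(_ids), image, code_version)


def deploy(fleet: list[Instance], image: str, code_version: str) -> list[Instance]:
    """Replace every instance with a fresh one; the old ones are discarded."""
    return [launch(image, code_version) for _ in fleet]


fleet = [launch("base-image-v1", "1.0") for _ in range(3)]

# An in-place change is rejected, mirroring "you never change the instance".
try:
    fleet[0].code_version = "1.1"
except FrozenInstanceError:
    print("in-place change rejected")

# A new code version arrives only by building new instances.
fleet = deploy(fleet, "base-image-v1", "1.1")
print([i.code_version for i in fleet])
```

The point of the model is that a deploy and a failure recovery follow the same path: both end with a newly launched instance rather than a repaired one.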
What does immutable infrastructure look like in practice on AWS?
Immutable infrastructure design requires full automation of the environment from AWS resource provisioning to bootstrapping, package installation, and code deployment, which is why true immutable infrastructure is so rare — particularly in enterprise IT environments. But it is far from impossible.
In fact, I recently spent several months building an immutable infrastructure system for a very large enterprise software company. They were launching an internal product with a custom deployment pipeline that included dozens of parallel automated and manual tests, each in a separate development environment. When a new deploy to the development environment occurs, a new set of 45+ dev instances must be created with the new version of code; when the tests are finished, those instances must be terminated. This means that instances in their environment rarely last longer than 24 hours, and over the course of a single week, hundreds of instances are terminated and rebuilt.
This degree of reliability takes significantly more effort than simply placing the instances in Auto Scaling Groups (ASGs). It requires custom tooling built on AWS CloudFormation and Puppet that is closely integrated with the company’s own configuration management tooling and custom deployment script. Logicworks and the client’s team collaborated for several months to get this right, and during that time I probably spun up thousands of instances per week to test the process.
First I built a custom CloudFormation template that performs standard tasks like building and configuring a VPC and access controls. The Puppet agent is then installed and connects to the Puppet master, which configures the OS of the instance. I then created an AMI from this fully configured instance, so that the only items set when a new instance is created are host names and other minor details. This image is continually improved and tested by the Logicworks team, and because instances in this environment are replaced so frequently, any new version of the AMI reaches every single instance within hours. The final step in the Puppet process is kicking off the client’s deploy script, and this “hand off” required the most delicate work. The deploy script then pulls down the most recent version of code from an on-premises box. An identical process works in their production environment, though instances are replaced less frequently.
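To make the "host names and other minor details" step concrete, here is a minimal sketch of how a boot-time script for an instance launched from a pre-configured AMI might be rendered. It is an illustration under stated assumptions, not the tooling built for this client: the `render_boot_script` function, the host name, and the Puppet server name are hypothetical, and the script assumes a systemd-based image that provides `hostnamectl`.

```python
import shlex


def render_boot_script(hostname: str, puppet_server: str) -> str:
    """Render a short shell script to run at first boot of a pre-baked image.

    The heavy configuration is assumed to be in the AMI already, so only
    per-instance details are handled here.
    """
    lines = [
        "#!/bin/bash",
        # Give the new instance its own identity.
        f"hostnamectl set-hostname {shlex.quote(hostname)}",
        # Run the Puppet agent once in the foreground to apply remaining details.
        # In the setup described above, Puppet itself starts the deploy script
        # as its final step, so the deploy script does not appear here.
        f"puppet agent --onetime --no-daemonize --server {shlex.quote(puppet_server)}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical values, for illustration only.
script = render_boot_script("dev-test-01", "puppet.example.internal")
print(script)
```

Keeping this script short is the design goal: the less that happens at boot, the faster a replacement instance is ready and the fewer ways it can drift from the tested image.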
Any results from the project that you can share?
The client has a 0.001% instance failure rate and 100% uptime for their production application, even during very rare AWS outages. Their developers deploy to their development environment with a single click, without any instance configuration tasks, and they know that every testing and production environment is configured to their standard and carries no residual effects from previous tests, whether those tests passed or failed. That is 60% higher deployment efficiency than the company’s other projects in AWS.
Let’s say a company doesn’t think it’s “ready” for immutable infrastructure. How can they get closer?
Even if you are migrating an application that wasn’t built for the cloud, try building your environment using an AWS CloudFormation template and doing some simple bootstrapping with Puppet. That is a good place to get started and learn about the basics of infrastructure automation.
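As a rough starting point, the sketch below builds a small CloudFormation template as a Python dictionary and prints it as JSON. Everything specific in it is an assumption made for illustration: the resource names, the CIDR ranges, the placeholder AMI ID, the instance type, and the manifest path. It also assumes Puppet is already installed on the image and is run without a Puppet master, using `puppet apply`.

```python
import json


def starter_template(ami_id: str, instance_type: str = "t3.micro") -> dict:
    """Return a minimal template: one VPC, one subnet, one bootstrapped instance."""
    user_data = "\n".join([
        "#!/bin/bash",
        "# Assumes Puppet is installed on the image and that a manifest",
        "# exists at this hypothetical path.",
        "puppet apply /etc/puppet/manifests/site.pp",
    ])
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Starter sketch: one VPC, one subnet, one instance",
        "Resources": {
            "AppVpc": {
                "Type": "AWS::EC2::VPC",
                "Properties": {"CidrBlock": "10.0.0.0/16"},
            },
            "AppSubnet": {
                "Type": "AWS::EC2::Subnet",
                "Properties": {
                    "VpcId": {"Ref": "AppVpc"},
                    "CidrBlock": "10.0.1.0/24",
                },
            },
            "AppInstance": {
                "Type": "AWS::EC2::Instance",
                "Properties": {
                    "ImageId": ami_id,
                    "InstanceType": instance_type,
                    "SubnetId": {"Ref": "AppSubnet"},
                    # CloudFormation base64-encodes the script for EC2.
                    "UserData": {"Fn::Base64": user_data},
                },
            },
        },
    }


template = starter_template("ami-00000000")  # placeholder, not a real AMI ID
print(json.dumps(template, indent=2))
```

Generating the template from code rather than editing it by hand keeps the environment definition in version control, which is a small first step toward treating infrastructure as software.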
Your background is in software engineering. How has that influenced your everyday work at Logicworks?
On a philosophical level, it means I approach infrastructure as if it is a piece of software. In many ways, I’m still a software engineer — I just build software for systems. I hate repeating myself and get easily bored, so I have always automated parts of my job. At Logicworks, automation is my job.
If you want to talk to Phil or learn more about how we help build and automate AWS environments, contact us at firstname.lastname@example.org or (212) 625-5454.