Terraform Lifecycle

Okay, I confess: this is a little "clickbaity", in that it's not so much about the lifecycle block as it is about why I've had to start using it a little more.

Stick with me.

Azure, Terraform and "Changes in the Portal"

I'm going to write this in a way that hopefully highlights the issue and makes me not seem like an arsehole.

I'm extremely fortunate to work as part of a (semi) greenfield team. We are generally a "build it, run it" sort of team: we take the idea, plan, lay down the foundations with Terraform and Azure DevOps, then begin the process of writing the code, testing, deploying into Azure and gathering feedback, adjusting as we go along.

We're pretty independent and have (shamefully) relatively little interaction with the Infrastructure and Operations department. That is, until a manually configured change in the Azure portal creates a downstream issue in our Terraform release pipeline!

Obviously, that's an issue, but one that can be solved.

There is no I in team

One thing that I'm "passionate" about in my job is enabling others to learn, grow and make their own choices. I truly believe that empowering people to achieve generates the best outcomes. No man is an island, as they say, and that couldn't (and shouldn't) be any more true in tech.

As frustrating as it is, when my colleagues in the Infra department apply the Azure recommended changes via the portal, they have the best of intentions and are simply doing their job. It's not some malign intent to break the Terraform pipeline or interfere with my team's work that drives those choices; it's a willingness to succeed and achieve the best for the company as a whole.

So what's the solution? Well, it's obvious: help them learn Terraform, Azure DevOps and Git, and empower them to achieve their goals, but with the added safety and culture around the CI/CD pipelines, testing and ways of working that I've come to love.

Hang on, you said "Terraform Lifecycle Block"

You're right, and that's the thing. This week, after some discussion with my colleagues to understand their intentions, desires and reasons, I will be empowering them to (sort of) get their task done by helping them become more confident with Terraform (I'm by no means an expert, but I have a degree of confidence in what I know).

Enter the lifecycle block. This is a core Terraform concept that is resource type agnostic, i.e. the rules can be applied to Azure, AWS, GCP or whatever else you might be Terraforming.

My understanding is that it alters the way Terraform handles certain infrastructure as it attempts to apply any differences. There are 3 main lifecycle elements:

  • ignore_changes
  • create_before_destroy
  • prevent_destroy

I haven't really experimented with the latter two; however, I know that create_before_destroy can be useful in scenarios where you might otherwise get downtime as Terraform replaces infrastructure you have previously provisioned.
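
For completeness, here is a rough sketch of how those two settings might look. I haven't run this against a real subscription; the resource names, the region and the name_suffix variable are all made up for illustration, and the provider configuration is assumed to exist elsewhere.

```hcl
# Hypothetical resources for illustration only; names and values are assumptions.
variable "name_suffix" {
  type    = string
  default = "a"
}

resource "azurerm_resource_group" "example" {
  name     = "rg-lifecycle-demo"
  location = "uksouth"

  lifecycle {
    # Terraform rejects any plan that would destroy this resource.
    prevent_destroy = true
  }
}

resource "azurerm_public_ip" "example" {
  # The old and new objects exist side by side for a moment during a
  # replacement, so the name has to be able to change between them.
  name                = "pip-lifecycle-demo-${var.name_suffix}"
  resource_group_name = azurerm_resource_group.example.name
  location            = azurerm_resource_group.example.location
  allocation_method   = "Static"

  lifecycle {
    # Build the replacement first, then remove the old object.
    create_before_destroy = true
  }
}
```

The thing to watch with create_before_destroy is naming: because both copies briefly exist together, anything that must be unique (like a resource name) needs to differ between the old and the new one.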

But I'm going to concentrate on the first item, ignore_changes.

Simply put, this is a list of the attributes on a resource that you don't want Terraform to manage after creation. Maybe you have a script, sitting outside of Terraform (for whatever reason), that changes some resource in Azure, and the next time you run Terraform it wants to revert those changes back to the configured or default values. This is where you would specify those attributes, in the ignore_changes list of the lifecycle block.

In my case, it was a change to a key vault: the purge protection and soft delete options had been switched on in the portal. They are there to protect against the loss of secrets, so that if a secret is deleted (accidentally or intentionally) it remains recoverable, should the deletion cause issues for the downstream consumers of that secret.

This is a permanent setting and Terraform is unable to put these values back to the default, which is what it tried to do, resulting in an error when we applied the change (I had actually seen the difference in the plan, but pressed on anyway with my own changes, unsure of what those differences were).
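
As a small sketch of that "something outside Terraform keeps changing it" scenario, imagine a resource group whose tags also get edited by a script or by hand. The names and values below are invented for the example, and the provider configuration is assumed to exist elsewhere.

```hcl
# Hypothetical example: a resource group whose tags are also edited outside Terraform.
resource "azurerm_resource_group" "shared" {
  name     = "rg-shared-demo" # assumed name
  location = "uksouth"        # assumed region

  tags = {
    owner = "platform-team"
  }

  lifecycle {
    # The tags above are still applied when the resource is created,
    # but later differences in them no longer show up as changes to make.
    ignore_changes = [tags]
  }
}
```

The trade-off is that Terraform stops correcting drift on whatever you list, so it is worth keeping the list as short as you can get away with.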

So how does this work?

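lifecycle {
  # Leave these Key Vault settings as they are in Azure, even if they differ from the config.
  ignore_changes = [
    purge_protection_enabled,
    # Assumption: only older azurerm provider versions have this attribute; remove it if Terraform reports it as unsupported.
    soft_delete_enabled,
    soft_delete_retention_days,
  ]
}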

That simply goes into the KV resource declaration and, et voilà, the pipeline is all green again.
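
To show where the block sits, here is a minimal sketch of a key vault declaration with the lifecycle block in place. It is not our real configuration: the vault name, resource group, region and SKU are placeholders, and it assumes the azurerm provider is already configured elsewhere.

```hcl
# Sketch only: every name and value here is an assumption for illustration.
data "azurerm_client_config" "current" {}

resource "azurerm_key_vault" "example" {
  name                = "kv-lifecycle-demo"
  location            = "uksouth"
  resource_group_name = "rg-lifecycle-demo"
  tenant_id           = data.azurerm_client_config.current.tenant_id
  sku_name            = "standard"

  # lifecycle is a nested block inside the resource, alongside its arguments.
  lifecycle {
    ignore_changes = [
      purge_protection_enabled,
      soft_delete_retention_days,
      # soft_delete_enabled can be listed too on provider versions that still have it.
    ]
  }
}
```

With this in place, a plan should no longer propose changes to those settings, whatever they have been set to in the portal.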

Simple.

Empowering People

So now I have the little block, the next step this week will be getting the colleague who originally applied the KV changes to add the block to the Terraform declarations. In the process, I'm hoping to build this person's knowledge of and confidence in Terraform and the Azure DevOps pipeline, and to show how the current way of working can be used to test changes in a "safe" environment first, before applying them to the live one.

It's truly the beginning of what I hope will be a wonderful working relationship, where we can lean on our colleagues in infra and get them to assist with changes, design decisions etc.

One step closer to marrying Dev and Ops in glorious DevOps matrimony.