Terraform AzureRM provider 2.0 upgrade

Today I needed to upgrade a set of Terraform configuration to the AzureRM 2.0 provider (technically 2.9.0 as of this writing). My need was primarily to pick up some bug fixes regarding Application Gateway and SSL certificates, but I knew I’d have to move sooner or later, since new resources and properties are only being developed against this new major version.

This post will outline my experience and some issues I faced; they’ll be very specific to my set of configuration but the process may be helpful for others.

To start, I made sure I had the following:

  • A solid source-control version that I could roll back to
  • A snapshot of my state file (sitting in an Azure storage account)

Then, within my ‘terraform’ block where I specify the backend and required versions, I updated the value for my provider version:

terraform {
  backend "azurerm" {
  }
  required_version = "~> 0.12.16"
  required_providers {
    azurerm = "~> 2.9.0"
  }
}

Next, I ran the ‘terraform init’ command with the upgrade switch, plus some other parameters on the command line because my backend is in Azure Storage:

terraform init `
    -backend-config="storage_account_name=$storage_account" `
    -backend-config="container_name=$containerName" `
    -backend-config="access_key=$accountKey" `
    -backend-config="key=prod.terraform.tfstate" `
    -upgrade

Since my required_providers block specified azurerm at 2.9.0, the provider upgrade took place.


Now I could run “terraform validate” and expect to get some syntax errors. I could have combed through the full upgrade guide and preemptively modified my code, but I found it easier to rely upon validate to call out file names and line numbers.

Below are some of the things I found I needed to change; I had to run “terraform validate” multiple times to catch all the items:

Add a “features” block to the provider:

provider "azurerm" {
  subscription_id = var.subscription
  client_id       = "service principal id"
  client_secret   = "service principal secret"
  tenant_id       = "tenant id"
  features {}
}

  • Remove “network_security_group_id” references from subnets
  • Modify “address_prefix” to “address_prefixes” on subnets, with the value as a list
  • Virtual Machine Extension drops the “resource_group”, “location”, and “virtual_machine_name” properties
  • Virtual Machine Extension requires a “virtual_machine_id” property instead (both the subnet and extension changes are sketched below)
  • Storage Account no longer has an “enable_advanced_threat_protection” property
  • Storage Container no longer has a “resource_group_name” property
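To illustrate a couple of these, here is a minimal sketch of a 2.0-style subnet and VM extension. The resource names and values are hypothetical, not my actual configuration:

resource "azurerm_subnet" "app" {
  name                 = "snet-app"
  resource_group_name  = azurerm_resource_group.example.name
  virtual_network_name = azurerm_virtual_network.example.name

  # 1.x used: address_prefix = "10.0.1.0/24"
  address_prefixes = ["10.0.1.0/24"]

  # network_security_group_id is gone; use the separate
  # azurerm_subnet_network_security_group_association resource instead
}

resource "azurerm_virtual_machine_extension" "app" {
  name                 = "custom-script"
  # replaces the old resource group / location / virtual machine name properties
  virtual_machine_id   = azurerm_virtual_machine.app.id
  publisher            = "Microsoft.Compute"
  type                 = "CustomScriptExtension"
  type_handler_version = "1.10"
}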

Finally, the resources for Azure Backup VM policy and protection have been renamed – this is outlined in the upgrade guide (direct link).
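Specifically, “azurerm_recovery_services_protection_policy_vm” becomes “azurerm_backup_policy_vm”, and “azurerm_recovery_services_protected_vm” becomes “azurerm_backup_protected_vm”. As a minimal sketch of the renamed protection resource (hypothetical names, not my actual configuration):

resource "azurerm_backup_protected_vm" "example" {
  # formerly azurerm_recovery_services_protected_vm
  resource_group_name = azurerm_resource_group.example.name
  recovery_vault_name = azurerm_recovery_services_vault.example.name
  source_vm_id        = azurerm_virtual_machine.example.id
  backup_policy_id    = azurerm_backup_policy_vm.example.id # formerly azurerm_recovery_services_protection_policy_vm
}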

It was this last one that caused the most problems. Before I renamed it, my “terraform validate” was crashing with a fatal panic.

Looking in the crash.log, I eventually found an error on line 2572:

2020/05/13 20:30:55 [ERROR] AttachSchemaTransformer: No resource schema available for azurerm_recovery_services_protected_vm.rsv-protect-rapid7console

This reminded me of the resource rename in the upgrade guide, so I updated the resource in my configuration.

Now, “terraform validate” is successful, yay!


Not so fast though; my next step, “terraform plan”, failed:

Error: no schema available for azurerm_recovery_services_protected_vm.rsv-protect-rapid7console while reading state; this is a bug in Terraform and should be reported

I knew this was because there was still a reference in my state file, so my first thought was to try a “terraform state mv” command, to update the reference:

 
terraform state mv old_resource_type.resource_name new_resource_type.resource_name
terraform state mv azurerm_recovery_services_protection_policy_vm.ccsharedeus-mgmt-backuppolicy azurerm_backup_policy_vm.ccsharedeus-mgmt-backuppolicy

Of course, it was too much to hope that would work; it errored out:

Cannot move
azurerm_recovery_services_protection_policy_vm.ccsharedeus-mgmt-backuppolicy
to azurerm_backup_policy_vm.ccsharedeus-mgmt-backuppolicy: resource types
don't match.

I couldn’t find anything else online about converting a pre-existing terraform state to the 2.0 provider with resources changing like this. And from past experience I knew that Azure Backup didn’t like deleting and re-creating VM protection and policies, so I didn’t want to try a “terraform taint” on the resource.

I decided to take a risk and modify my state file directly (after confirming I had my snapshot!).

Connecting to my blob storage container, I downloaded a copy of the state file, and replaced all references of the old resource type with the new resource type.

After replacing the text, I uploaded my modified file back to the blob container, and re-ran “terraform plan”.

This worked! The plan ran successfully, and showed no further changes required in my infrastructure.

Install Visio volume license alongside Office 365

At my company we get a certain number of Visio volume-license seats from our Microsoft Partner benefits, but it is not a subscription product, and the volume key cannot simply be entered to activate Visio because our Office 365 is installed as O365ProPlusRetail.

Here I’ll describe how to install the Visio volume license product side-by-side with our standard installation of Office 365, which is a supported scenario according to this Microsoft doc.

At first I had problems where installing the product “VisioPro2019Volume” would downgrade my O365 build to the 1808 version. I posted a comment on this Microsoft Docs issue because it seemed related, and received a very helpful reply from Martin with another Microsoft Docs page describing how to build lean and dynamic install packages for O365. This was the key to configuring my XML file for proper side-by-side installation.

I also used the super helpful Office Customization Tool in support of figuring this out.

Procedure

    1. Download the Office Deployment Tool
    2. Install the tool to a folder on your workstation
    3. Create a new XML file named “newvisio.xml” to look like this (update the PIDKEY value):
<Configuration ID="6659a04f-037d-4f23-b8a2-64c851090a5e"	>
  <Add Version="MatchInstalled" 	>
    <Product ID="VisioPro2019Volume" PIDKEY="insert Key"	>
      <Language ID="MatchInstalled" TargetProduct="O365ProPlusRetail" /	>
    </Product	>
  </Add	>
</Configuration	>
       Note: You can get the key from the Partner portal (go into MPN > Benefits > Software) and enter it into your XML file
    4. If you have Visio installed already as part of O365, you will need to remove it first. Create a configuration XML file named “removevisio.xml”:
    <Configuration ID="ba49a53d-04c0-44a6-b591-c099d9c4e6ed">
      <Remove OfficeClientEdition="64" Channel="Monthly">
        <Product ID="VisioProRetail">
          <Language ID="en-us" />
          <ExcludeApp ID="Access" />
          <ExcludeApp ID="Excel" />
          <ExcludeApp ID="Groove" />
          <ExcludeApp ID="Lync" />
          <ExcludeApp ID="OneDrive" />
          <ExcludeApp ID="OneNote" />
          <ExcludeApp ID="Outlook" />
          <ExcludeApp ID="PowerPoint" />
          <ExcludeApp ID="Publisher" />
          <ExcludeApp ID="Teams" />
          <ExcludeApp ID="Word" />
        </Product>
      </Remove>
      <Display Level="Full" AcceptEULA="TRUE" />
    </Configuration>
    5. Place this file where the deployment tool was installed, and then run it:
       setup /configure removevisio.xml
       The uninstall will proceed; don’t be alarmed, it is not removing all of Office if you set your XML properly
    6. Once the previous version’s uninstall is complete, install Visio with this command:
       setup /configure newvisio.xml
    7. Now you should have Office 365 on subscription, but Visio on a Volume License.

Build Azure Storage Fuse 1.2 from source

I’ve been looking to test out Azure Storage Fuse on a linux VM (Oracle Linux 7), and wanted to evaluate the authentication options to Azure.

Reading through the main README page, it describes “Managed Identity auth: (Only available for 1.2.0 or above)”; however, Releases have only been built up to 1.1.1.

The Wiki page on installation describes a method to build from source, so I thought I’d give that a shot.

Following the instructions, I hit an error when starting the build:

Unable to find the requested Boost libraries
 Could not find the following static Boost libraries:

          boost_filesystem
          boost_system

  No Boost libraries were found.  You may need to set BOOST_LIBRARYDIR to the
  directory containing Boost libraries or BOOST_ROOT to the location of
  Boost.

 

I confirmed that I had Boost installed from yum (1.53) in /usr/include/boost.

I first looked at the Issue list within the repository on GitHub, and found exactly what I needed! Issue #319 was reported with the same error, and the suggestion from richardharrison was effective for me:

In the file CMakeLists.txt
There are 2 lines of the format
set(Boost_USE_STATIC_LIBS ON)
simply change them to
set(Boost_USE_STATIC_LIBS OFF)

I modified the CMakeLists.txt file on lines #134 and #222, changed the values, and then re-ran ./build.sh

This produced a 23MB file: azure-storage-fuse-master/build/blobfuse

I did a quick test to validate that my build was working:

mkdir /mnt/blobfusetmp
mkdir /mnt/blobfuse
export AZURE_STORAGE_ACCOUNT=myaccountname
export AZURE_STORAGE_ACCESS_KEY=myaccesskey
 
./blobfuse /mnt/blobfuse --tmp-path=/mnt/blobfusetmp -o attr_timeout=240 -o entry_timeout=240 -o negative_timeout=120 --container-name=fuse --log-level=LOG_DEBUG --file-cache-timeout-in-seconds=120

A file within my container is now visible within my mount point!

 

Off to play with Managed Identity!

Terraform dynamic blocks in resources (Azure Backup Retention Policy)

This post will give an example of using Terraform dynamic blocks within an Azure resource. Typically this is done when you need multiple instances of a nested block within a resource, for example multiple “http_listener” within an Azure Application Gateway.

Here I’m using an Azure Backup retention policy, which as a Terraform resource (note I’m using the resource from before the AzureRM 2.0 provider, which is no longer supported) can have blocks for multiple different retention levels. In this case we are only looking to have one of each type of nested block (retention_daily, retention_weekly, etc.), however there are cases where we want zero instances of a block.
I’m sure there would be a way to simplify this code with a list(map(string)) variable, so that the individual nested blocks I’ve identified here aren’t necessary, however I haven’t yet spent the time to make that simplification.

A single-file example of this code can be found on my GitHub repo here.

 

First, create a Map variable containing the desired policy values:

variable "default_rsv_retention_policy" {
  type = map(string)
  default = {
      retention_daily_count = 14
      retention_weekly_count = 0
      #retention_weekly_weekdays = ["Sunday"]
      retention_monthly_count = 0
      #retention_monthly_weekdays = ["Sunday"]
      #retention_monthly_weeks = [] #["First", "Last"]
      retention_yearly_count = 0
      #retention_yearly_weekdays = ["Sunday"]
      #retention_yearly_weeks = [] #["First", "Last"]
      #retention_yearly_months = [] #["January"]
    }
}

In the example above, this policy will retain 14 daily backups, and nothing else.

If you wanted a more typical grandfather scenario, you could retain weekly backups for 52 weeks taken every Sunday, and monthly backups for 36 months from the first Sunday of each month:

variable "default_rsv_retention_policy" {
  type = map(string)
  default = {
      retention_daily_count = 14
      retention_weekly_count = 52
      retention_weekly_weekdays = ["Sunday"]
      retention_monthly_count = 36
      retention_monthly_weekdays = ["Sunday"]
      retention_monthly_weeks = ['First'] #["First", "Last"]
      retention_yearly_count = 0
      #retention_yearly_weekdays = ["Sunday"]
      #retention_yearly_weeks = [] #["First", "Last"]
      #retention_yearly_months = [] #["January"]
    }
}

Now we’re going to use this map variable within a dynamic block. Check the GitHub link above for the full file, as I’m only going to insert the dynamic blocks here for explanation. Each of these blocks would go inside the “azurerm_backup_policy_vm” resource.

 

First up is the daily – in my case, I am making an assumption that there will ALWAYS be a daily value, and thus directly using the variable.

#Assume we will always have daily retention
  retention_daily {
    count = var.default_rsv_retention_policy["retention_daily_count"]
  }

Next we will add in the “retention_weekly” block:

dynamic "retention_weekly" {
    for_each = var.default_rsv_retention_policy["retention_weekly_count"] > 0 ? [1] : []
    content {
      count  = var.default_rsv_retention_policy["retention_weekly_count"]
      weekdays = var.default_rsv_retention_policy["retention_weekly_weekdays"]
    }
  }

The “for_each” is using conditional logic, in this format: condition ? true_val : false_val
It is evaluated as: if the “retention_weekly_count” value of our map variable is greater than zero, then provide one content block, else provide zero content blocks.

From our first example of the variable I provided, since the weekly count is 0, this content block would then not appear in our terraform resource.

In the second example, we did provide a value greater than zero (52) and we also specified a weekday. This is what gets inserted into the content block as the property names and values.
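In other words, with the second variable example the weekly dynamic block effectively renders the same as writing this static block by hand (my illustration of the result, not generated output):

retention_weekly {
  count    = 52
  weekdays = ["Sunday"]
}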

 

Similarly, the monthly retention would look like this:

dynamic "retention_monthly" {
    for_each = var.default_rsv_retention_policy["retention_monthly_count"] > 0 ? [1] : []
    content {
      count  = var.default_rsv_retention_policy["retention_monthly_count"]
      weekdays = var.default_rsv_retention_policy["retention_monthly_weekdays"]
      weeks    = var.default_rsv_retention_policy["retention_monthly_weeks"]
    }
  }

Because of the nature of this Azure resource, there is a new content property named “weeks” which signifies the week of the month to take the backup on (we said “First”).
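For completeness, a yearly retention block would follow the same pattern with an additional “months” property; it isn’t in my original configuration (my yearly count is zero), and it assumes the yearly keys in the variable are uncommented:

dynamic "retention_yearly" {
    for_each = var.default_rsv_retention_policy["retention_yearly_count"] > 0 ? [1] : []
    content {
      count    = var.default_rsv_retention_policy["retention_yearly_count"]
      weekdays = var.default_rsv_retention_policy["retention_yearly_weekdays"]
      weeks    = var.default_rsv_retention_policy["retention_yearly_weeks"]
      months   = var.default_rsv_retention_policy["retention_yearly_months"]
    }
  }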
Using dynamic blocks is an effective way to parameterize your Terraform. Even in the Azure Application Gateway example I mentioned earlier, that resource has many other nested blocks this approach would be useful for, like “disabled_rule_group” within the WAF configuration, “backend_http_settings”, or “probe”. I would expect the same is true for other Azure load balancers or Front Door configurations.

VM Images – Immutable builds and Azure resources

I’ve begun poking around with custom Azure managed images, using Packer as an on-premises build source. The goal is to use the same image within an on-premises test environment and in an Azure production environment, and take a small step towards immutable infrastructure.

There are lots of interesting questions around this topic that may have unique answers in different environments. Here’s a few thoughts on what I see where I am right now:

  • Should I use Packer to build a traditional Hyper-V image, convert it to VHD and upload it to Azure, or directly use the Packer builder for Azure?
    • Because I have the on-prem resources, am building upon a pre-existing framework where local-only images are built, and need to build images for both Hyper-V and Azure, I decided to keep it consistent rather than split the build chain off to a whole new builder.
  • Should I just use the Azure Image Builder to streamline the process?
    • Maybe eventually – again, I wanted to incrementally build upon the successes of existing on-premises deployments. The Image Builder service is very intriguing, and would be a next logical step once it leaves Preview.
  • Why do images need to be built for on-premises in the first place? Why not native Azure resources?
    • While deploying into Azure would offer a lot more flexibility and scale, there are still many reasons to maintain a local presence for resources. It mostly boils down to financials: pre-existing CapEx investments versus new OpEx costs that would be incurred without the appropriate systems in place to constrain size, resource lifetime, and ultimately cost overruns.
  • Why build images, and not do configuration management after deploy?
    • I go back and forth on this question frequently, and I have seen much conversation about it. Like most of IT, “it depends”. Using customized images that are version controlled gives the infrastructure the ability to shift left and ensures quality is built in repeatably and consistently. What if you’re doing post-deployment config management and DNS isn’t available, or a service crashes halfway through, or any number of other things go wrong? Now there is a delay in the availability of the resource you’ve deployed, effort consumed to resolve the problem, and a lack of confidence in its quality.
    • Immutable builds do not natively solve the problem of configuration drift post-deployment, and this is one of the big gaps that I see trying to take traditional IaaS and fit it into a more modern profile. The ‘answer’ is to monitor drift and re-build from source (cattle not pets) when it is detected, but not everyone is working with modern micro-services running in containers orchestrated centrally to achieve this. Instead, there may be an intermediary step where immutable image builds are used, along with configuration management post-deployment to watch for drift.

 

Once I got to a place where an image is ready, I began poring over the Microsoft Docs on managed images and Shared Image Gallery, prior to testing.

I intended on a distribution flow something like this:

Packer drops VHD in Blob storage -> Create Managed Image -> Use Shared Image Gallery definition -> Create Image Version
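As a rough Terraform sketch of that flow (hypothetical names and values, and it assumes the Shared Image Gallery and image definition already exist):

# Managed image created from the VHD that Packer dropped in blob storage
resource "azurerm_image" "packer_build" {
  name                = "img-app-2020-05"
  location            = "eastus"
  resource_group_name = "rg-images"

  os_disk {
    os_type  = "Windows"
    os_state = "Generalized"
    blob_uri = "https://mystorageaccount.blob.core.windows.net/images/app.vhd"
  }
}

# Image version published into the existing Shared Image Gallery definition
resource "azurerm_shared_image_version" "packer_build" {
  name                = "1.0.0"
  gallery_name        = "sig_shared"
  image_name          = "app-server"
  resource_group_name = "rg-images"
  location            = "eastus"
  managed_image_id    = azurerm_image.packer_build.id

  target_region {
    name                   = "eastus"
    regional_replica_count = 1
  }
}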

The documentation left me with a few unanswered questions, which I’ve outlined here:

  • What if I remove the original blob, can I still use the image? Yes, you can continue to deploy the managed image without the source blob
  • What if the blob gets replaced, does it update the image? No, any future deployments of the image will continue to be delivered as when the image was created
  • What if I remove the source managed image, can I still use the Shared Image Gallery definition version?
    • According to the guidance from Microsoft: Yes, but if you plan on adding replica regions, do not delete the source managed image. The source managed image is needed for replicating the image version to additional regions.
  • What if I update the source managed image, does it update the Shared Image Gallery definition version?  Mostly no, similar to the blob-to-image relationship, if you update the source managed image, the version in a SIG definition doesn’t update. What I need to test is what happens if you replace the source managed image, and then replicate an image definition version to a new region – will it contain the updates in the image?

Here’s a few other important design discoveries I’ve made along the way:

  • A VM from an Azure Managed Image can only be deployed within the same Region and Subscription as the Image (i.e. if you want to re-use the image across multiple regions or subscriptions, you’ll have to create additional images to suit)
  • A VM from a Shared Image Gallery definition CAN be deployed outside a subscription, from any region it is replicated, as long as the authentication mechanism performing the VM deployment has RBAC over the Shared Image Gallery resource
  • Microsoft says “as a best practice, we encourage you to keep the resource group, shared image gallery, image definition, and image version in the same location.” I can confirm that if the resource group you place the Shared Image Gallery in is in a different Location than the SIG itself or the image definition, there are no barriers to creating those resources or a VM from them
  • Terraform AzureRM provider support (as of today at least) has limitations in managing Shared Image components:
    • You cannot set properties for VM Generation on the image definitions
    • Resource removal does not respect the dependencies between an image version, definition, and gallery
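For reference, here is a minimal sketch of the gallery and image definition as Terraform resources (hypothetical names and values; as noted above, the VM Generation property could not be set at the time):

resource "azurerm_shared_image_gallery" "sig" {
  name                = "sig_shared"
  resource_group_name = "rg-images"
  location            = "eastus"
}

resource "azurerm_shared_image" "app" {
  name                = "app-server"
  gallery_name        = azurerm_shared_image_gallery.sig.name
  resource_group_name = "rg-images"
  location            = "eastus"
  os_type             = "Windows"

  identifier {
    publisher = "MyCompany"
    offer     = "AppServer"
    sku       = "2019-Datacenter"
  }
}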

 

At the end of the day, I’ve come up with the following flow to build and deliver my images.