Azure VPN Gateway Connection with custom IPSEC Policy

I was recently setting up a VPN tunnel between an Azure VPN Gateay and an on-premise location, and ran into issues with the tunnel connecting.

The connection in Azure kept saying “connecting”. I was trying to use the VPN troubleshooter to log diagnostics to a storage account for parsing, however the diags didn’t contain the actual errors, and the wizard in the Portal wouldn’t refresh from subsequent runs, so it was stuck on an error with the pre-shared key which I had already corrected.

The on-premise device is a Cisco, and so there were accessible error messages from it:

crypo map policy not found for remote traffic selector 0.0.0.0

This led me down a path of searching resulting in the Cisco example configuration from Microsoft. The key part of this is that a Cisco ASA cannot make a connection to a native RouteBased VPN Gateway in Azure.

The fix is to apply a custom IPSec policy to your connection, particularly with this flag: -UsePolicyBasedTrafficSelectors $True

I used a small bit of PowerShell in order to try this out:

$rg          = "default-rg"
$ConnectionHEN = "vpngw"
$connection = Get-AzVirtualNetworkGatewayConnection -Name $ConnectionHEN -ResourceGroupName $rg
$newpolicy   = New-AzIpsecPolicy -IkeEncryption AES256 -ikeintegrity SHA256 -DhGroup DHGroup2 -IpsecEncryption AES256 -IpsecIntegrity SHA256 -PfsGroup none -SALifeTimeSeconds 28800
Set-AzVirtualNetworkGatewayConnection -VirtualNetworkGatewayConnection $connection -IpsecPolicies $newpolicy -UsePolicyBasedTrafficSelectors $true

However, this returned the following error:

A virtual network gateway SKU of Standard or higher is required for Ipsec Policies support on virtual network gateway

My VPN Gateway is of SKU “Basic”, so it does not support IPSec policies, according to this documentation page.

Because it’s Basic, I can’t simply upgrade to a “VpnGW1” – I have to destroy and re-create my gateway as the new SKU, which will also generate a new public IP address.

So I did these things, but I did them using Terraform since this environment is managed with that tool.

First I ‘tainted‘ the existing resource to mark it for deletion and recreation. Then I ran “terraform apply” to make the modifications, based on the resources here below:

 

resource "azurerm_resource_group" "srv-rg" {
  name     = "srv-rg"
  location = "${var.location}"
}
resource "azurerm_public_ip" "vpngw-pip" {
  name = "vpngateway-ip"
  location = "${azurerm_resource_group.srv-rg.location}"
  resource_group_name = "${azurerm_resource_group.srv-rg.name}"
  allocation_method = "Dynamic"
}
resource "azurerm_local_network_gateway" "localgateway" {
  name                = "localgateway"
  resource_group_name = "${azurerm_resource_group.srv-rg.name}"
  location            = "${azurerm_resource_group.srv-rg.location}"
  gateway_address     = "1.2.3.4"
  address_space       = "10.10.0.0/24"
}
resource "azurerm_virtual_network_gateway" "vpngw" {
  name = "vpngw"
  location = "${azurerm_resource_group.srv-rg.location}"
  resource_group_name = "${azurerm_resource_group.srv-rg.name}"
  type = "Vpn"
  vpn_type = "RouteBased"
 
  active_active = false
  enable_bgp = false
    sku = "VpnGw1"
 
  ip_configuration {
    name = "vpngateway_ipconfig"
    public_ip_address_id = "${azurerm_public_ip.vpngw-pip.id}"
    private_ip_address_allocation = "Dynamic"
    subnet_id = "${azurerm_subnet.GatewaySubnet.id}"
  }
}
 
# Client VPN Connection
resource "azurerm_virtual_network_gateway_connection" "vpnconnection" {
  name = "vpnconnection"
  location = "${azurerm_resource_group.srv-rg.location}"
  resource_group_name = "${azurerm_resource_group.srv-rg.name}"
 
  type = "IPsec"
  virtual_network_gateway_id = "${azurerm_virtual_network_gateway.vpngw.id}"
  local_network_gateway_id = "${azurerm_local_network_gateway.localgateway.id}"
  use_policy_based_traffic_selectors = true
 
  shared_key = "${var.ipsec_key}"
 
  ipsec_policy {
    dh_group = "DHGroup2"
    ike_encryption = "AES256"
    ike_integrity = "SHA256"
    ipsec_encryption = "AES256"
    ipsec_integrity = "SHA256"
    pfs_group = "None"
    sa_lifetime = "28800"
 
  }
}

 

 

Azure Application Gateway learnings

I’ve been fine-tuning some Terraform config for Azure Application Gateway lately, and have thus been fine-tuning my understanding of its components. This Microsoft Doc about the App Gateway configuration was quite helpful because of it’s diagram.

Here’s a few items I’ve learned:

  • Terraform: You must separate out private IP and public IP into different front-end configurations. If you wish to utilize both to be associated with listeners, you’d have two config blocks:
    • frontend_ip_configuration {
          name                 = "${local.frontend_ip_configuration_name}"
          public_ip_address_id = "${azurerm_public_ip.test.id}"
        }
    • frontend_ip_configuration {
          name                 = "${local.frontend_ip_configuration_name}"
          subnet_id = "${azurerm_subnet.test.id}"
          private_ip_address_id = "${azurerm_subnet.testsub.id}"
          private_ip_address_allocation = Static
        }
  • General: A listener can only be associated with 1 front end IP (either private or public). Originally I thought that I could have both a private and public front-end that were associated with the same listener, and thus the same rule with a backend. This isn’t possible, and instead you must have unique listeners to each front end configuration.
  • Terraform: Terraform seems to have a problem adding multiple rules touching the same backend. Even though these were unique rules associated with unique listeners, Terraform gave this error: ApplicationGatewayBackendAddressPoolCannotHaveDuplicateAddress
    • This was seen while trying to add an additional rule after an original was created. I haven’t yet tried to perform a ‘terraform apply’ from a fresh start with two rules referencing the same backend.
  • General: You can’t have multiple listeners on the same front-end port across two different front-end configurations. If you do, you receive the following error:
    • For example, if I have port 80 on my private front-end config, I can add multi-site listeners based on hostname here to target multiple rules and backends.  But this means I cannot use port 80 on the public front-end configuration anymore; a different port would be required.
    • Error: Two Http Listeners of Application Gateway are using the same Frontend Port and HostName (ApplicationGatewayFrontendPortsUsingSamePortNumber)

 

VMM Service Fails to start after DB Migration

I migrated a System Center Virtual Machine Manager (VMM) installation to a new virtual machine today. I used this TechGenix article as a guide.

Most things went quite smoothly, however when I completed the VMM installation, the VMMService.exe didn’t start.

System Event logs displayed: “The System Center Virtual Machine Manager service terminated unexpectedly. It has done this 3 time(s).”

Application Event logs displayed:

Faulting application name: vmmservice.exe, version: 4.1.3223.0, time stamp: 0x5a566055
Faulting module name: KERNELBASE.dll, version: 10.0.17763.292, time stamp: 0xb51bba8e
Exception code: 0xe0434352

I found the VMM log located in “C:\ProgramData\VMMLogs\SCVMM<guid>\report.txt“, which had the following error buried in it:

System.Data.SqlClient.SqlException (0x80131904): Cannot execute as the database principal because the principal "dbo" does not exist, this type of principal cannot be impersonated, or you do not have permission.

This SQL error let me to poking around in SQL Server Management Studio, and to the google, where I found this relevant post on Stackoverflow.

The second answer led me to look at the “Owner” property on the “Files” page of the properties of the VirtualManagerDB database. In my case, it was empty. I suspect somehow this happened while restoring the database backup from the original VM.

I attempted to fill this in with my domain VMM service account, however SQL provided an error that this account was already assigned a role for this database. So instead I ran this statement:

USE VirtualManagerDB
GO
SP_DROPUSER ‘DOMAIN VMM service account’ 
GO
SP_CHANGEDBOWNER ‘DOMAIN VMM service account’

This updated the Owner property, and allowed me to start the VMM service.

 

 

 

Azure Monitor Alert rule not stopping

I discovered something interesting while working with Azure Monitor alert rules recently.

If you have an alert that is firing based on a condition, it will continue to fire until the condition is cleared, even if the alert rule itself is modified, disabled, or even deleted.

Here’s the example:

I have a B-Series VM, and want to alert on the “CPU Credits Remaining” metric, to pro-actively catch intervals when CPU usage is causing credit exhaustion and thus reduced compute capacity.

I created an Alert Rule to fire when the “CPU Credits Remaining” to fire when the value is 100 or less. It was configured with a frequency of 1 minute and a period of 5 minutes (because I wasn’t thinking of the implications at the time).

This worked just great for a while! And then a rogue Windows Update process got stuck consuming 60% of the CPU for a period of time, and the credit count dropped all the way to zero.

The alert began to fire as expected, once per minute, which quickly became excessive and drowned a bunch of mailboxes in alert-overload.

“Ok, lets just disable the alert” – nope, it continued to fire. I modified the rule so that the frequency and period were much greater. However, the email alerts received continued to reflect the original values:

Even deleting the rule did not stop the alerts from triggering.

Interestingly, if I modified the action group to use a different target email address, that was immediately effective. This allowed me to black-hole the emails until I had resolved the CPU utilization problem and waiting until the credits built back up.

 

Asp.net Website to run PowerShell Script

Now that I have a reliable and programmatic way of adding a one-time maintenance window in PRTG, I wanted to be able to provide this functionality to end users responsible over their specific set of sensors. Since I have experience with C# Asp.net, and didn’t have the luxury of time learning something new (Asp.net Core, doing it entirely in Javascript, etc) I continued down that path.

Going through my requirements as I built and fine-tuned it, this is what I ended up with:

  • Must use Windows Authentication
  • Provide functionality to select a “logical group” that is a pre-defined set of objects for applying the maintenance window
  • Be able to edit these “logical groups” outside of the code base, preferably in a simple static file
  • Be able to restrict from view and selection certain “logical groups” depending on Active Directory group membership of the user viewing the site.
  • Allow user to supply 2 datetime values, with validation that they exist, and end datetime is later than start datetime
  • Allow user to supply conditional parameters for additional objects to include
  • Display results of the operation for the user
  • Email results of the operation to specific recipients

I started with this post from Jeff Murr, which detailed how to use asp.net to call a PowerShell script, and return some output. This really formed the basis of what I built.

I started by trying to use Jeff’s code as-is, as a proof-of-concept. One of the immediate problems I ran into was getting my project to recognize the system.automation reference. It kept throwing this error:

The type or namespace name 'Automation' does not exist in the namespace 'System.Management' (are you missing an assembly reference?)

I eventually came across this blog post that contained a comment with the resolution for me:

You have to add:
Microsoft PowerShell version (5/4/3/..) Reference Assembly.

You can search for "PowerShell" in NuGet Packages Manager search bar

Once I had done this, the project could be built and I had a functional method of executing PowerShell from my site.

Building out the framework of the site was simple, and I utilized some new learning on CSS FlexBox to layout my conditional panels as I wanted to.

I decided to use an XML file as the data source for my “logical grouping” of information; intending that team members will be able to simply modify and push changes without having to understand anything related to the code. The XML file looks like this:

<?xml version="1.0" standalone="yes"?>
<types>
<type Id ="0" Code ="None">
</type>
<type Id ="1" Code ="Client1">
<TimeZone>MST</TimeZone>
<emailaddress>notificationlist@domain.com,notificationlist2@domain.com</emailaddress>
</type>
<type Id ="2" Code ="Client2">
<TimeZone>MST</TimeZone>
<emailaddress>notificationlist@domain.com,notificationlist2@domain.com</emailaddress>
</type>
<type Id ="3" Code ="Client3">
<TimeZone>MST</TimeZone>
<emailaddress>notificationlist@domain.com,notificationlist2@domain.com</emailaddress>
</type>
</types>

Another issue I had was with choosing a good date/time control. The out-of-the-box ones are clearly inadequate, so I decided to use a jQuery timepicker. jQuery and client-side scripts are a little unfamiliar to me, so I spent quite a bit of time tinkering to get it just right, when in reality it should have only been a brief effort.

In order to get my PowerShell script to run, and return Out-String values back to the page, I had to add: UnobtrusiveValidationMode=”None”. I did this at the top of page declaration, but it could have been done in web.config as well. Without this, when the page attempted to run the PowerShell Invoke-WebRequest, it did so under the user context of my IIS Application Pool, and it tried to run the Internet Explorer first-run wizard. Adding UnobtrusiveValidationMode bypassed this requirement.

Another unique thing I wanted to do was be able to manipulate the location of the PowerShell script and other things like disabling email notifications if I was testing during debug on my local machine. To do that, I used an IF statement to test for HttpContext.Current.Request.IsLocal.

Here’s what the site looks like:

 

You can find the GitHub repository of this code here: https://github.com/jeffwmiles/PrtgMaintenanceWindow