Azure KMS and NSGs

A co-worker recently posed a question to me regarding virtual machines in Azure – “how do they activate Windows?”

The short answer is through KMS – using a KMS key and “kms.core.windows.net:1688”. You can see this on an Azure VM by typing:

slmgr /dlv

The functionality of KMS within Azure isn’t well documented, but there are buried references to things like:

The IP address of the KMS server for the Azure Global cloud is 23.102.135.246. Its DNS name is kms.core.windows.net.

This is a single IP used globally, and according to that same doc, activation “requires that the activation request come from an Azure public IP address.”
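You can confirm that resolution from a VM with a quick DNS check, for example:

Resolve-DnsName -Name kms.core.windows.net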

 

This leads to another question – “In an environment with deny-by-default outbound rules, how does KMS communicate?”

The short answer here is, “Magic?”

Let’s say I have an NSG attached to a subnet, with the following outbound rules:
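Roughly recreated with Az PowerShell, the rule set looked something like this (an illustrative approximation – the rule names and priorities are not the exact originals):

# Allow traffic within the virtual network
$allowVnet = New-AzNetworkSecurityRuleConfig -Name 'Allow-VNet-Outbound' -Direction Outbound -Access Allow `
    -Priority 100 -Protocol '*' -SourceAddressPrefix VirtualNetwork -SourcePortRange '*' `
    -DestinationAddressPrefix VirtualNetwork -DestinationPortRange '*'

# Deny everything going to the Internet service tag
$denyInternet = New-AzNetworkSecurityRuleConfig -Name 'Deny-Internet-Outbound' -Direction Outbound -Access Deny `
    -Priority 4000 -Protocol '*' -SourceAddressPrefix '*' -SourcePortRange '*' `
    -DestinationAddressPrefix Internet -DestinationPortRange '*'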

There is nothing specifically allowing access to the “kms.core.windows.net” IP address, but there IS a deny rule to the Internet Service Tag, so I would expect this traffic to be denied.

But a Test-NetConnection succeeds!
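Specifically, something like this from inside the VM:

Test-NetConnection -ComputerName kms.core.windows.net -Port 1688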

I check Network Watcher, with the IP Flow Verify tool. It says that this traffic will be denied by my Internet deny rule.
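The same check can be scripted against Network Watcher with the Az module – something along these lines, with placeholder resource names and an arbitrary source port:

# Placeholder resource names – substitute your own Network Watcher, resource group, and VM
$nw = Get-AzNetworkWatcher -ResourceGroupName 'NetworkWatcherRG' -Name 'NetworkWatcher_eastus'
$vm = Get-AzVM -ResourceGroupName 'rg-test' -Name 'vm-test'

# LocalIPAddress should be the VM's private IP; 10.2.0.5 is a placeholder
Test-AzNetworkWatcherIPFlow -NetworkWatcher $nw -TargetVirtualMachineId $vm.Id `
    -Direction Outbound -Protocol TCP `
    -LocalIPAddress 10.2.0.5 -LocalPort '50000' `
    -RemoteIPAddress 23.102.135.246 -RemotePort '1688'
# Reports a Deny from the Internet deny rule, even though the real KMS traffic gets through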

I check the NSG Flow Logs, and surprisingly I see zero references to my traffic on ANY rule! But if I change the Test-NetConnection to a different port (say 1687), then it does appear as denied:

 

What is happening here?

I have yet to find anything authoritative in Microsoft’s documentation or any related GitHub issue. But based on my testing and the existing documents, I think there is a hidden default NSG rule (likely below priority 100) which is not visible or logged (even by Network Watcher!) and which allows this traffic despite my best efforts to block it.

This is operating effectively the same way that Azure DNS does, I believe – the IP address 168.63.129.16 is always reachable, regardless of the NSG rules you put in place.

There can be exceptions to that statement – there is a Service Tag named AzurePlatformLKM, which can be used to “disable the defaults for licensing”. I believe (but haven’t yet tested) that using this Service Tag in a deny rule would effectively block this traffic on port 1688.
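If you wanted to test that yourself, I’d expect the rule config to look something like this (untested; the name and priority are arbitrary, and it would still need to be added to the NSG with Add-AzNetworkSecurityRuleConfig and Set-AzNetworkSecurityGroup):

# Untested: explicitly deny outbound traffic to the AzurePlatformLKM service tag on the KMS port
New-AzNetworkSecurityRuleConfig -Name 'Deny-AzurePlatformLKM-Outbound' -Direction Outbound -Access Deny `
    -Priority 110 -Protocol Tcp -SourceAddressPrefix '*' -SourcePortRange '*' `
    -DestinationAddressPrefix AzurePlatformLKM -DestinationPortRange '1688'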

Azure Multiple NICs or Static IPs through Terraform and DSC

A situation came up where I needed two HTTP bindings on port 80 on a web server residing in Azure – one binding on each of two different IIS sites within a single VM. Creating this configuration isn’t as simple as one would initially expect, due to some Azure limitations.

There are two options to achieve this:

  • Add a secondary virtual network interface to the VM
  • Add a second static IP configuration on the primary virtual network interface of the VM.

For each of these, I wanted to deploy the necessary configuration through Terraform and Desired State Configuration (DSC).

Second IP Configuration

In this scenario, it’s quite simple to add a second static IP configuration in Terraform:

resource "azurerm_network_interface" "testnic" {
  name                = "testnic"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.test.name}"
  ip_configuration {
    name                          = "ipconfig1"
    subnet_id                     = "${azurerm_subnet.test.id}"
    private_ip_address_allocation = "static"
    private_ip_address            = "10.2.0.5"
    primary                       = true
  }
  ip_configuration {
    name                          = "ipconfig2"
    subnet_id                     = "${azurerm_subnet.test.id}"
    private_ip_address_allocation = "static"
    private_ip_address            = "10.2.0.6"
  }
}

However, when using static IP addresses like this, Azure requires you to perform configuration within the VM as well. This is necessary for both the primary IP configuration and the secondary.

Here’s how you can configure it within DSC:

Configuration dsctest 
{ 
    Import-DscResource -ModuleName PSDesiredStateConfiguration 
    Import-DscResource -ModuleName xPSDesiredStateConfiguration 
    Import-DscResource -moduleName NetworkingDSC
 
    Node localhost {
        ## Rename VMNic
        NetAdapterName RenameNetAdapter 
        { 
            NewName = "PrimaryNIC"  
            Status = "Up" 
            InterfaceNumber = 1 
        }
 
        DhcpClient DisabledDhcpClient
        {
            State          = 'Disabled'
            InterfaceAlias = 'PrimaryNIC'
            AddressFamily  = 'IPv4'
            DependsOn = "[NetAdapterName]RenameNetAdapter "
        }
 
        IPAddress NewIPv4Address
        {
            #Multiple IPs can be comma delimited like this
            IPAddress      = '10.2.0.5/24','10.2.0.6/24'
            InterfaceAlias = 'PrimaryNIC'
            AddressFamily  = 'IPV4'
            DependsOn = "[NetAdapterName]RenameNetAdapter "
        }
 
        # Skip as source on secondary IP address, in order to prevent DNS registration of this second IP
        IPAddressOption SetSkipAsSource
        {
            IPAddress    = '10.2.0.6'
            SkipAsSource = $true
            DependsOn = "[IPAddress]NewIPv4Address"
        }
        DefaultGatewayAddress SetDefaultGateway 
        { 
            Address = '10.2.0.1' 
            InterfaceAlias = 'PrimaryNIC' 
            AddressFamily = 'IPv4' 
        }  
 
    }
 
}

Both the DhcpClient (disable DHCP) and DefaultGatewayAddress resources are required; otherwise you will lose connectivity to your VM.

Once “SkipAsSource” is applied, the IP addresses have the proper priority, matching the Azure configuration.
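You can verify the result inside the VM with something like:

# The secondary IP should show SkipAsSource = True, so it won't be used as the source for outbound traffic
Get-NetIPAddress -InterfaceAlias 'PrimaryNIC' -AddressFamily IPv4 | Select-Object IPAddress, SkipAsSource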

 

Second NIC

When adding a second NIC in Terraform, you have to add a property (“primary_network_interface_id”) on the VM resource to specify the primary NIC.

resource "azurerm_network_interface" "testnic1" {
  name                = "testnic1"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.test.name}"
  ip_configuration {
    name                          = "ipconfig1"
    subnet_id                     = "${azurerm_subnet.test.id}"
    private_ip_address_allocation = "static"
    private_ip_address            = "10.2.0.5"
  }
 
}
resource "azurerm_network_interface" "testnic2" {
  name                = "testnic2"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.test.name}"
  ip_configuration {
    name                          = "ipconfig1"
    subnet_id                     = "${azurerm_subnet.test.id}"
    private_ip_address_allocation = "static"
    private_ip_address            = "10.2.0.6"
  }
 
}
 
resource "azurerm_virtual_machine" "test" {
  name                  = "helloworld"
  location              = "${var.location}"
  resource_group_name   = "${azurerm_resource_group.test.name}"
  network_interface_ids = ["${azurerm_network_interface.testnic1.id}","${azurerm_network_interface.testnic2.id}"]
  primary_network_interface_id = "${azurerm_network_interface.testnic1.id}"
  vm_size               = "Standard_A1"
 
  storage_image_reference {
    publisher = "MicrosoftWindowsServer"
    offer     = "WindowsServer"
    sku       = "2016-Datacenter"
    version   = "latest"
  }
 
  storage_os_disk {
    name          = "myosdisk1"
    vhd_uri       = "${azurerm_storage_account.test.primary_blob_endpoint}${azurerm_storage_container.test.name}/myosdisk1.vhd"
    caching       = "ReadWrite"
    create_option = "FromImage"
  }
 
  os_profile {
    computer_name  = "helloworld"
    admin_username = "${var.username}"
    admin_password = "${var.password}"
  }
 
  os_profile_windows_config {
    provision_vm_agent = true
    enable_automatic_upgrades = true
  }
 
}

In this scenario, once again Azure requires additional configuration for this to work. The secondary NIC does not get a default gateway and cannot communicate outside of its subnet.

One way to solve this with DSC is to apply a default route for that interface, with a higher metric than the primary interface. In this case, we don’t need to specify static IP addresses inside the OS through DSC, since there’s only one per virtual NIC.

Configuration dsctest 
{ 
    Import-DscResource -ModuleName PSDesiredStateConfiguration 
    Import-DscResource -ModuleName xPSDesiredStateConfiguration 
    Import-DscResource -moduleName NetworkingDSC
 
    Node localhost {
        ## Rename VMNic(s) 
        NetAdapterName RenameNetAdapter 
        { 
            NewName = "PrimaryNIC" 
            Status = "Up" 
            InterfaceNumber = 1 
            Name = "Ethernet 2"
        }
        NetAdapterName RenameNetAdapter_apps
        { 
            NewName = "SecondaryNIC"  
            Status = "Up" 
            InterfaceNumber = 2
            Name = "Ethernet 3"
        }
 
        Route NetRoute1
        {
            Ensure = 'Present'
            InterfaceAlias = 'SecondaryNIC'
            AddressFamily = 'IPv4'
            DestinationPrefix = '0.0.0.0/0'
            NextHop = '10.2.0.1'
            RouteMetric = 200
        }
 
    }
 
}

You could also add a DefaultGatewayAddress resource to the second NIC, although I haven’t specifically tested that.
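If you went that route, I’d expect the resource to look something like this (untested, and the configuration name here is just an example):

Configuration dsctestAltGateway
{
    Import-DscResource -ModuleName NetworkingDSC

    Node localhost {
        # Untested alternative: give the secondary NIC a default gateway instead of a static route
        DefaultGatewayAddress SecondaryDefaultGateway
        {
            Address        = '10.2.0.1'
            InterfaceAlias = 'SecondaryNIC'
            AddressFamily  = 'IPv4'
        }
    }
}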

Generally speaking, I’d rather add a second IP to a single NIC – having two NICs on the same subnet with the same effective default gateway might function, but it doesn’t seem like best practice to me.

Azure Application Gateway through NSG

I’m testing some things with Azure Application Gateway this week, and ran into a problem after trying to lock down a network security group (NSG) to restrict virtual network traffic between subnets and peered VNETs.

Here’s the test layout: an Application Gateway with its frontend subnet (10.8.48.0/24) in “vnet-edge”, peered with “vnet-client”, where the backend servers sit in the subnet “sub-clt-test”.

The NSG applied to “sub-clt-test” has a default inbound rule allowing VirtualNetwork traffic on any port. This is a service tag, and according to Microsoft the VirtualNetwork tag includes all subnets within a VNET as well as peered VNETs. This means that in my layout there’s an “allow all” rule between my “vnet-edge” and “vnet-client”. Not ideal.

I created a new rule on the NSG for “sub-clt-test” to deny all HTTP traffic into the subnet, and then the following rules, intending to allow the Application Gateway to communicate with its backend pool targets:

Source                             Destination    Port
10.8.48.6                          Any            80
AzureLoadBalancer (Service Tag)    Any            80

The IP address listed there is the frontend private IP of my Application Gateway, within the subnet 10.8.48.0/24. I added the second rule during testing in case the Microsoft service tag included the Application Gateway within its dynamic range.

What I discovered is that this configuration broke the Application Gateway’s ability to communicate with the backend targets. The health probes went unresponsive and suggested that the NSG be reviewed.

I knew that this wasn’t due to outbound restrictions on any of my NSGs, because as soon as I removed the inbound port 80 deny on this subnet, it began functioning again.

I removed the deny rule, and then installed Wireshark on the backend web server to collect information about what IP was actually making the connection.

I discovered that while the frontend private IP was listed as 10.8.48.6, it is actually the IP addresses 10.8.48.4 and 10.8.48.5 making the connections to the backend pool. The frustrating part is that I couldn’t find any explanation for this behavior in Microsoft’s documentation – I know that the Application Gateway requires its own subnet, not shared with other resources, but there are no references to the reserved IPs that traffic would be coming from.

Since the subnet is effectively reserved for this resource, it’s easy enough to modify my NSG for the range, but I felt like I was missing something obvious as to why these particular IP addresses are used for the connections.

Halfway through writing this post, though, I came across this blog post by RoudyBob, which provided a bit of insight: each instance of the Application Gateway uses an IP from the assigned subnet, and it is these IPs that communicate with your backend targets.
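With that understanding, allowing the Application Gateway subnet range through the NSG is straightforward – for example, an inbound rule config along these lines, sitting above the deny rule (the name and priority are arbitrary):

# Allow the whole Application Gateway subnet to reach the backend on port 80, since the
# connections come from the instance IPs (10.8.48.4 and 10.8.48.5 here) rather than the frontend IP
New-AzNetworkSecurityRuleConfig -Name 'Allow-AppGw-Subnet-HTTP' -Direction Inbound -Access Allow `
    -Priority 200 -Protocol Tcp -SourceAddressPrefix '10.8.48.0/24' -SourcePortRange '*' `
    -DestinationAddressPrefix '*' -DestinationPortRange '80'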

SonicWall Preempt Secondary Gateway

This is something fairly simple and obvious, but I wanted to note it down anyway.

I wanted to use the SonicWall site-to-site VPN feature called “Preempt Secondary Gateway”, found on the Advanced tab of the VPN properties.

This is effectively VPN failback – if your primary goes down and then returns to service, the VPN will have been established on the secondary gateway and won’t automatically renegotiate back to the primary until the IKE lifetime expires. This can be a disadvantage when the secondary gateway is a sub-par link or has metered bandwidth.

You will want to be careful with this setting, however: if your primary has returned to service but isn’t stable, it could cause a renegotiation loop of your tunnel that would impact its availability.

 

I noticed that on some VPNs this option was missing.

 

This is because a secondary gateway wasn’t specified; as soon as you define anything in that field (even 0.0.0.1), the option dynamically appears on the Advanced tab.

Barracuda Tunnel to Sonicwall going down

A few months ago I solved a problem with a site-to-site VPN tunnel between a SonicWall NSA appliance and an Azure VM running Barracuda NextGen Firewall.

This VPN tunnel would go down every 2-3 weeks, and would only be restored when I manually re-initiated it from the Barracuda side.

I had intended on saving a draft of this post with details on the error I found in the Barracuda log, but apparently failed to do so.

So the short answer is: make sure that “Enable Keepalives” is turned on on the SonicWall side of the tunnel. This has brought long-term stability to the VPN.