SonicWall Preempt Secondary Gateway

This is fairly simple and obvious, but I wanted to note it down anyway.

I wanted to use the SonicWall site-to-site VPN feature called “Preempt Secondary Gateway” found on the Advanced tab of VPN properties:

This is effectively VPN failback. Without it, if your primary goes down and then returns to service, the VPN will have been established on the secondary gateway and won’t renegotiate back to the primary until the IKE lifetime expires. That can be a disadvantage in cases where the secondary gateway is a sub-par link or has metered bandwidth on it.

You will want to be careful with this setting, however: if your primary has returned to service but isn’t stable, it could cause a renegotiation loop of your tunnel that would impact its availability.

 

I noticed on some VPNs this option was missing:

 

This is because a secondary gateway wasn’t specified; as soon as you enter anything in that field (even 0.0.0.1), the option dynamically appears on the Advanced tab.

Azure Site Recovery setup errors

While setting up an Azure Site Recovery proof of concept, I ran into errors: first when associating the replication policy, and then when updating the authentication service.

The background is connecting SCVMM with a Server 2012 R2 Hyper-V Cluster to replicate to Azure. During the final steps of the “Prepare Infrastructure” phase, you need to associate a replication policy. This failed at the following step:

The text of the error was:

Error ID: 10003
Error Message: Protection couldn't be configured for cloud/site POC-ASR.
Provider error
Provider error code: 31408

Provider error message:

	Failed to fetch the version of Microsoft Azure Recovery Services Agent installed on the Hyper-V host server . Error: An internal error has occurred trying to contact the  server: : .

WinRM: URL: [http://:5985], Verb: [INVOKE], Method: [GetStringValue], Resource: [http://schemas.microsoft.com/wbem/wsman/1/wmi/root/cimv2/StdRegProv]

Check that WS-Management service is installed and running on server .

Provider error possible causes:
	It is possible that Registry provider of WMI is corrupted.

Provider error recommended action:
	Build the repository using MOF compiler and retry the operation.

This occurred right before I was distracted by other items so I didn’t directly troubleshoot. When I came back to the Azure Portal (in a fresh session) I had a surprising new message greeting me at the Recovery Services Vault blade:

This was very odd, since I had just installed the latest version of the Site Recovery provider on my VMM host, as well as the MARS agent on my Hyper-V hosts. But when I clicked “Update Now” it listed my VMM host and displayed a new button to “Update Authentication Service”.

This almost immediately errored out:

Error ID: 635
Error Message: Updating authentication service information for server -  failed.
Provider error
Provider error code: 31437

Provider error message:

	Failed to fetch the version of Microsoft Azure Site Recovery Agent installed on the Hyper-V host(s) '' as the host is not reachable.

Provider error possible causes:
	
      1. Windows Management Instrumentation service crashed.
      2. Windows Remote Management (WinRM) service is not running.
      3. Required services may not be running on the Hyper-V host(s)''.
  
Provider error recommended action:
	
      Ensure that
      1. A firewall is not blocking HTTPS/HTTPS traffic on the Hyper-V host.
      2. If the server is running windows Server 2008 R2, ensure that KB 982293 is installed on it. Refer to https://aka.ms/kblink982293 for more details.
      3. The Hyper-V Virtual Machine Management service is running.
      4. Ensure that the Windows Management Instrumentation service is running on the Hyper-V host(s).
      5. Ensure that the Windows Remote Management (WinRM) service is running on the Hyper-V host(s).
      6. Verify that CredSSP authentication is enabled on the service configuration of the Hyper-V host(s). To enable the CredSSP on the service configuration, run the following command on the Hyper-V host, from an elevated command line: winrm set winrm/config/service/auth @{CredSSP="true"}.
      7. The Provider version running on the server is up-to-date. Download and install the latest Microsoft Azure Site Recovery Provider.
      8. If the error persists, retry the operation and contact support.
    

I validated all the components in this list, checked the referenced articles, and ensured WMF was updated to 5.1, all to no avail.

I finally stumbled upon this post on the Microsoft forums where a check was done against WMI for the object “StdRegProv”, which is mentioned in the original error from the replication policy. Turns out this was my problem too! When I ran the WMI query below, it returned the error Exception calling “GetStringValue” : “Provider not found” on 3 of my 4 Hyper-V hosts:

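# HKEY_LOCAL_MACHINE hive constant (0x80000002)
$hklm = 2147483650
# Uninstall key and value that hold the installed agent version
$key = "Software\Microsoft\Windows\CurrentVersion\Uninstall\Windows Azure Backup"
$value = "DisplayVersion"
# Get the WMI registry provider class, then read the value through it
$wmi = Get-WmiObject -List "StdRegProv" -Namespace root\cimv2
($wmi.GetStringValue($hklm,$key,$value)).sValue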

I ran the mofcomp command below on each affected host, and when I re-ran the last line of the previous query ($wmi.GetStringValue), it returned a value instead of an error:

cd c:\windows\system32\wbem
mofcomp regevent.mof

Following this, the “Update Authentication Service” job completed successfully, and I was able to associate my replication policy without further problems.

 

Barracuda Tunnel to SonicWall going down

A few months ago I solved a problem with a site-to-site VPN tunnel between a SonicWall NSA appliance and an Azure VM running Barracuda NextGen Firewall.

This VPN tunnel would go down once every 2-3 weeks, and would only be restored if I manually re-initiated it from the Barracuda side.

I had intended on saving a draft of this post with details on the error I found in the Barracuda log, but apparently failed to do so.

So the short answer is to make sure that “Enable Keepalives” is turned on for the SonicWall side of the tunnel. This has kept the VPN stable long-term.

Get Inner Error from Azure RM command

Today I’m working on an ARM Template to deploy some resources into an Azure subscription. After building my JSON files and prepping parameters, I used the cmdlet “Test-AzureRmResourceGroupDeployment” in order to validate my template.

This failed with the error:

Code    : InvalidTemplateDeployment
Message : The template deployment 'e76887a9' is not valid according to the validation
          procedure. The tracking id is '6df0fffb'. See inner errors for details. Please
          see https://aka.ms/arm-deploy for usage details.
Details : {Microsoft.Azure.Commands.ResourceManager.Cmdlets.SdkModels.PSResourceManagerError}

I found that decidedly unhelpful, but there is an effective way to get at the actual error message.

To retrieve the error details, use the following cmdlet, where the CorrelationID equals the tracking ID mentioned in the error.

Get-AzureRmLog -CorrelationId 6df0fffb -DetailedOutput

This produces detailed log output that you can read through to find where the error lies.

In my case, I needed to create a Core quota increase request with Azure support, as my subscription had reached its limit.

Terraform – Variables in resource names

This is a post about resource naming in Terraform. As I was working on my first production use of Terraform to deploy resources to Azure, my goal was to parameterize the resources so that in the future I could easily re-use the .tf file by simply swapping out my .tfvars file.

I struggled for a while because I was trying to do something like this:

resource "azurerm_virtual_network" "AZ${var.clientcode}Net" {
  name                = "AZ${var.clientcode}"
  address_space       = ["${lookup(var.networkipaddress, "AZ${var.clientcode}")}"]
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.Default.name}"
  dns_servers         = "${var.dnsservers}"
}

Effectively, I wanted my resource to end up as something like “AZABCNet” in both the name property and the resource identifier.
Then I would try to reference that in a different resource attribute, for a subnet like this:

virtual_network_name = "${azurerm_virtual_network.AZ${var.clientcode}Net.name}"

However, this gave me errors when trying to validate:

Error: Error loading C:\terraform\ABC\ABC_Network.tf: Error reading config for azurerm_subnet[${var.location}]: parse error at 1:28: expected expression but found invalid sequence "$"

 

My assumption about how the resource identifier worked was incorrect; resource identifiers should be local and static, as described here on StackOverflow.

This means that I can give my resource an identifier as “AZClientNet”, since that is used only within Terraform while the Name attribute is what will become the deployed name in Azure.
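
A minimal sketch of what the corrected configuration could look like, assuming the same variables and resource group as the failing example above. The identifiers AZClientNet and AZClientSubnet, the subnet name, and the address_prefix value are placeholders for illustration only; the subnet argument names may also differ depending on the azurerm provider version.

```hcl
# The resource identifier ("AZClientNet") is static and only used inside Terraform.
resource "azurerm_virtual_network" "AZClientNet" {
  # The name attribute can still be built from variables; this is what Azure sees.
  name                = "AZ${var.clientcode}"
  address_space       = ["${lookup(var.networkipaddress, "AZ${var.clientcode}")}"]
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.Default.name}"
  dns_servers         = "${var.dnsservers}"
}

# Hypothetical subnet that references the network by its static identifier.
resource "azurerm_subnet" "AZClientSubnet" {
  name                 = "AZ${var.clientcode}Subnet" # placeholder name
  resource_group_name  = "${azurerm_resource_group.Default.name}"
  virtual_network_name = "${azurerm_virtual_network.AZClientNet.name}"
  address_prefix       = "10.0.1.0/24" # placeholder value
}
```

Swapping the .tfvars file then changes the deployed names in Azure, while every reference inside the .tf file keeps pointing at the same static identifier.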

 

This leaves a bit of a gap: I might want to declare several instances of a resource, with their attributes defined in a map variable, without having to write multiple resource blocks.

It appears this functionality is coming to Terraform in the form of a “for each” construct, according to this GitHub discussion. I haven’t used the proposed syntax in a test environment, so I don’t fully have my head around it yet.
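
For reference, resource-level for_each has since shipped in Terraform 0.12.6 and later. The following is only a rough sketch of how the map-driven case might look under that newer syntax. It assumes var.networkipaddress is a map of network name to address space, and that var.location, var.dnsservers and azurerm_resource_group.Default are declared elsewhere as in the earlier example. The identifier ClientNet is hypothetical.

```hcl
# Requires Terraform 0.12.6 or later (resource-level for_each).
# Assumed shape of var.networkipaddress: { "AZABC" = "<address space>", ... }
resource "azurerm_virtual_network" "ClientNet" {
  # One virtual network is created per entry in the map.
  for_each = var.networkipaddress

  name                = each.key     # map key becomes the deployed name
  address_space       = [each.value] # map value becomes the address space
  location            = var.location
  resource_group_name = azurerm_resource_group.Default.name
  dns_servers         = var.dnsservers
}
```

Individual instances would then be referenced by map key, for example azurerm_virtual_network.ClientNet["AZABC"].name, so the identifier itself stays static while the set of deployed networks is driven by the variable.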