Azure NSG discovery

While deploying resources into an Azure virtual network whose subnets have network security groups (NSGs) applied, I discovered something I didn’t previously know. It makes sense in the context of how Azure applies NSG rules, but it doesn’t align with a traditional understanding of firewall ACLs applied at a subnet boundary.

Communication within subnet

If you apply a Deny rule with a lower priority number (so it is evaluated earlier) than the default 65000 “AllowVnetInBound” rule, it will also deny resources within that subnet from communicating with each other.

I discovered this while applying a “Deny inbound” rule in order to restrict lateral movement between subnets, not intending to restrict traffic within a subnet.

For example, I have a “management” subnet, with an NSG applied. Inside this subnet is an AD domain controller, and a member server. I apply a Deny rule for any source, after my “allow incoming” rules have been applied to let other subnets talk to this domain controller.

Now I find that my domain controller cannot reach my member server, despite it residing within the same subnet.

While I do not want to allow service tag “VirtualNetwork” incoming access (again, to restrict lateral movement), I do want “everything inside this subnet can talk to everything inside this subnet”. As such I had to create a specific rule for this behavior.
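To make the behavior concrete, here is a minimal Python sketch of first-match evaluation by ascending priority number. It is a simplified model, not Azure's implementation: it ignores ports, protocols and destinations, and the address ranges, priorities and the `Rule` class are hypothetical values chosen for illustration.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    priority: int  # lower number is evaluated first
    source: str    # CIDR prefix, or "*" for any source
    action: str    # "Allow" or "Deny"


def evaluate(rules, source_ip):
    """Return the action of the first matching rule, by ascending priority."""
    addr = ipaddress.ip_address(source_ip)
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.source == "*" or addr in ipaddress.ip_network(rule.source):
            return rule.action
    return "Deny"  # fallback for this model when nothing matches


# Hypothetical address plan (assumption, not from the post).
MGMT_SUBNET = "10.0.1.0/24"  # domain controller and member server live here
APP_SUBNET = "10.0.2.0/24"   # another subnet allowed to reach the DC

rules = [
    Rule(100, APP_SUBNET, "Allow"),       # explicit "allow incoming" rule
    Rule(4000, "*", "Deny"),              # custom Deny for any source
    Rule(65000, "10.0.0.0/16", "Allow"),  # stands in for the default VNet allow
]

member_server = "10.0.1.5"
before = evaluate(rules, member_server)

# Adding an allow for the subnet's own prefix ahead of the Deny restores
# traffic inside the subnet without opening up the whole virtual network.
fixed_rules = rules + [Rule(3900, MGMT_SUBNET, "Allow")]
after = evaluate(fixed_rules, member_server)

print(before, after)
```

In this model the member server's traffic hits the custom Deny before it ever reaches the 65000 rule, which is why a dedicated rule for the subnet's own prefix is needed.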

Azure Function – Resolve DNS

As part of my search to provide outbound Deny on an Azure NSG with whitelisted FQDN entries, I started looking at Azure Functions.

The idea is that I would have an Automation runbook on a schedule that calls my function for a variety of domain names and receives the resolved IP addresses in return. These would then be compared against the outbound NSG rules, and if a resolved IP differed from what was in the NSG, the runbook would update the rule.

In reality there isn’t much need for this, since you can do the DNS resolution right in the runbook with this:

$currentIpAddress = [System.Net.Dns]::GetHostAddresses($fqdn).IPAddressToString

There are other limitations with this idea as well:

  • For a globally-managed DNS name behind some type of CDN or round-robin mechanism, it’s possible that IP resolution would continually be different. Take “smtp.office365.com” for example.
  • There isn’t a way to manage wildcard whitelists – “*.windowsupdate.com” isn’t something you can resolve to individual IP addresses.
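The compare-and-update step, and the churn a round-robin record would cause, can be sketched roughly as follows. This is an illustrative Python model rather than the runbook itself; the function names are hypothetical and the addresses come from a documentation-only range.

```python
import socket


def resolve_ipv4(fqdn):
    """Resolve an FQDN to a set of IPv4 strings (empty set on failure)."""
    try:
        infos = socket.getaddrinfo(fqdn, None, family=socket.AF_INET)
    except socket.gaierror:
        return set()
    return {info[4][0] for info in infos}


def plan_update(current_rule_ips, resolved_ips):
    """Work out which addresses to add to or remove from an NSG rule."""
    current, resolved = set(current_rule_ips), set(resolved_ips)
    if not resolved:
        # Resolution failed: leave the rule alone rather than emptying it.
        return {"add": [], "remove": []}
    return {
        "add": sorted(resolved - current),
        "remove": sorted(current - resolved),
    }


# Two successive answers from a hypothetical round-robin record.
answer_1 = ["203.0.113.11", "203.0.113.12"]
answer_2 = ["203.0.113.10", "203.0.113.11"]

first_run = plan_update(["203.0.113.10", "203.0.113.11"], answer_1)
second_run = plan_update(answer_1, answer_2)

print(first_run)
print(second_run)
```

Each scheduled run undoes part of the previous one, so the rule never settles. In a real run the second argument would come from something like `resolve_ipv4("smtp.office365.com")`.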

All that being said, I still used this as a learning opportunity for my first function.

To begin, in the Azure Portal I went to the “App Services” blade, clicked “Add”, and searched for Function:

During creation I accepted most of the defaults, and was left with a v2 Function App and the initial “HttpTriggerCSharp1” function.

I am by no means a programmer, and certainly not familiar with C# from ASP.net Core as evidenced by my previous post. With that in mind, here are the contents of the function I ended up with:

#r "Newtonsoft.Json"

using System.Net;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Primitives;
using Newtonsoft.Json;
using System.Text;

public static async Task<string[]> Run(HttpRequest req, ILogger log)
{
    log.LogInformation("C# HTTP trigger function processed a request.");

    string name = req.Query["name"];

    string requestBody = await new StreamReader(req.Body).ReadToEndAsync();
    dynamic data = JsonConvert.DeserializeObject(requestBody);
    name = name ?? data?.name;
    List<string> collectedIP = new List<string>();
    IPAddress[] ipaddresses = null;
    if (name != null){
        try
        {
            // Putting this in the Try because if it errored out I wanted to see that
            // as a defined message rather than failure of the function
            ipaddresses = Dns.GetHostAddresses(name);
        }
        catch (Exception)
        {
            log.LogInformation("Did not resolve IP from: " + name);
            collectedIP.Add("Did not resolve");
        }
        
        if (ipaddresses != null)
        {
            // Knowing that multiple IPs could be returned for a record, used a ForEach
            foreach (IPAddress ip in ipaddresses)
            {
                log.LogInformation("Resolved " + name + " to " + ip.ToString());
                // Add the resolved IP to a string list
                collectedIP.Add(ip.ToString());
                log.LogInformation("Added IP to list");
            }
            log.LogInformation("End of If Ipaddresses isn't null");
        }
        log.LogInformation("End of If Name isn't null");
    }
    else
    {
        // No name was supplied, so return a message instead of an address
        log.LogInformation("No name passed in");
        collectedIP.Add("No name passed in");
    }
    log.LogInformation("Ready to return value");
    // Return the string list as an array to the calling entity
    return collectedIP.ToArray();
}

Now I run this function in Test mode, with a Query parameter as “name”:

And I get results both in my log and in the output.

To bring this into my Automation runbook, I retrieved the function URL. Since this is a private function, the URL includes my key value.

This PowerShell command is then used to invoke the function, with the parameter at the end of the URL:

$IPList = Invoke-WebRequest 'https://functionappname.azurewebsites.net/api/HttpTriggerCSharp1?code=<privatekey>&name=www.microsoft.com'

Due to the limitations I mentioned at the start of this post, I never went far enough in my runbook to connect this $IPList into logic for updating the NSG.
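Had the runbook gone further, the next step would have been pulling usable addresses out of the response. Here is a minimal Python sketch of that step (not the PowerShell runbook). It assumes the function's string array comes back as a JSON array in the response body, which is an assumption worth verifying against a real call. The helper names are hypothetical.

```python
import ipaddress
import json
from urllib.parse import urlencode

# Placeholder URL from the post; substitute the real function app name.
BASE_URL = "https://functionappname.azurewebsites.net/api/HttpTriggerCSharp1"


def build_function_url(function_key, fqdn):
    """Build the request URL, percent-encoding the key and the name."""
    return BASE_URL + "?" + urlencode({"code": function_key, "name": fqdn})


def parse_ip_list(body):
    """Keep only real IP addresses from the (assumed) JSON array body."""
    addresses = []
    for value in json.loads(body):
        try:
            ipaddress.ip_address(value)
        except ValueError:
            continue  # skip placeholder messages such as "Did not resolve"
        addresses.append(value)
    return addresses


url = build_function_url("abc123", "www.microsoft.com")  # hypothetical key
ips = parse_ip_list('["203.0.113.5", "Did not resolve"]')
print(url)
print(ips)
```

Filtering matters because the function reports failures as strings inside the same array, and those must never end up in an NSG rule.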

Azure IaaS Deny outbound considerations

As a general practice, outbound Internet access should be denied except for approved destinations. This is referenced in NIST SP 800-41 as a “deny by default” posture.

Achieving this within Azure Infrastructure as a Service in a practical and economical way, without breaking a large number of services, is quite difficult at the moment.

If Outbound Internet is fully denied, some of the commonly used services of Azure will cease to work:

  • Azure Backup
  • Log Analytics
  • Azure State Configuration (DSC)
  • Azure Update Management
  • Azure Security Center
  • Windows Update

Some of these are not as difficult to solve – Service Tags on NSG rules can allow Azure services where they have been defined by Microsoft. As of Ignite 2018 in late September, there are new service tags covering entire regions, or all of Azure (“AzureCloud”). This means you can allow most of those services above to function and still deny general Internet outbound.

Additional Service Tags for Windows Update, or custom definitions, are supposed to be coming in the future, but this doesn’t fully resolve the problem.

What if your application has a GIS component, and it needs to reach *.arcgisonline.com? What if your users have a legitimate reason to access a particular website? It isn’t good enough to just resolve that IP address one time and add it to an NSG.

What is really needed is a method to allow access to a fully qualified domain name (FQDN), particularly with wildcard support.

Here are some possible solutions:

Implement a 3rd Party network virtual appliance (NVA)

This is the most common response that I see recommended to the outbound problem. Unfortunately, it is really expensive, and overkill if you’re only addressing this one particular problem. One has to consider high availability of the resources, as well as management of them, since you’re just adding more IaaS into your environment. That is what we’re all trying to get away from when we’re using the cloud, isn’t it?

Some vendors may not support wildcard FQDNs in their ACLs (Barracuda CloudGen last I checked), which means you can’t support things like Windows Update where no published IP list exists.

If the implementation is anything like SonicWALL’s method, it will have difficulty being reliable – this relies upon the SonicWALL using the same DNS server as the client (calling it ‘sanctioned’) which may or may not be true in your Azure environment with the use of Azure DNS or external providers.

Implement Azure Firewall

Azure Firewall is new on the scene and released to General Availability as of late September 2018. It supports the use of FQDN references in application rules, and while I haven’t personally tested it, the example deployment template is shown to allow an outbound rule to *microsoft.com.

Confusingly, their documentation states that FQDN tags can’t be custom created, but I believe this just references groups of FQDN, not individual items.

Azure Firewall solves the problem of deploying more IaaS, and it’s natively highly available. However, it again isn’t cheap: at $1.25 USD per hour, it is a high price to pay for just this one feature.
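To put the hourly rate in perspective, here is the arithmetic as a small Python sketch. It uses only the $1.25/hour figure quoted above and assumes an average month of 730 hours; it leaves out any other charges that may apply.

```python
HOURLY_RATE_USD = 1.25  # hourly price quoted in the post
HOURS_PER_MONTH = 730   # assumed average: 365 days * 24 hours / 12 months

monthly_cost = HOURLY_RATE_USD * HOURS_PER_MONTH
yearly_cost = HOURLY_RATE_USD * 24 * 365

print(f"Monthly: ${monthly_cost:,.2f}")  # Monthly: $912.50
print(f"Yearly:  ${yearly_cost:,.2f}")   # Yearly:  $10,950.00
```

That works out to roughly $900 per month for a firewall that runs all the time.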

Wait until FQDN support exists in an NSG rule

It has been noted on the Microsoft feedback site that FQDN support in NSG rules is a roadmap item, but since this hasn’t received the “Planned” designation yet, I expect it is very far down the roadmap, particularly considering this feature is available in Azure Firewall.

Build something custom – Azure Function or runbook which resolves DNS and adds it to your NSG

I’ve toyed with the idea of building a custom Azure Function or Automation runbook which can resolve a record and add it to an NSG. I’ll have a post on the Function side of this coming soon that describes how it would work, and the limitations that made me discard the idea.

Realistically, this isn’t a long-term viable solution as it doesn’t solve the wildcard problem.

Utilize an outbound transparent proxy server

This method involves trusting some other source to proxy your outbound traffic. Depending on that source, it gives a large amount of flexibility to achieve the outbound denial without breaking your services.

This could be an IaaS resource running Squid or WinGate (a product I’m currently testing for this purpose), or it could be an external 3rd party service like Zscaler which specializes in access control of this nature.

To make this work, your proxy must be able to be identified by some kind of static IP to allow it through the NSG, but after that the whitelisting could happen within the proxy service itself.
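The kind of allowlist check a proxy can perform, and an NSG cannot, looks roughly like this. This is a minimal Python sketch of wildcard FQDN matching. It is not how Squid, WinGate or any particular product implements it, and the allowlist entries are simply the example domains mentioned in these posts.

```python
from fnmatch import fnmatchcase

# Example allowlist built from domains mentioned in the text.
ALLOWED_PATTERNS = [
    "*.windowsupdate.com",
    "*.arcgisonline.com",
    "smtp.office365.com",
]


def is_allowed(hostname, patterns=ALLOWED_PATTERNS):
    """Return True if the hostname matches any allowlisted pattern."""
    host = hostname.lower().rstrip(".")  # normalize case and trailing dot
    return any(fnmatchcase(host, pattern.lower()) for pattern in patterns)


for name in ("download.windowsupdate.com",
             "services.arcgisonline.com",
             "evilwindowsupdate.com",
             "www.example.com"):
    print(name, is_allowed(name))
```

Because the match is on the requested name rather than on a resolved address, changing IPs behind a CDN stop mattering.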

I see this as the most viable method of solving the problem until either FQDN support exists for NSG, or Azure Firewall pricing comes down with competition from 3rd party vendors.

Azure monitoring agents

I’ve been having some difficulty sorting out all the monitoring components available within Azure for IaaS resources, and finally sat down to do some dedicated reading on this.

Part of the confusion for me has come from the introduction of the Azure Monitor, Azure Security Center, and the removal of the Operations Management Suite (OMS) branding.

The overview for Azure agents for monitoring describes the following agents as viable for monitoring. I’ve added in additional wording that is used in other areas and ARM templates for these, along with some potential use cases:

Azure Diagnostics Extension

Somewhat a legacy agent at this point (my opinion), it collects Performance counters, System Logs, IIS Logs, and others. These are all stored in a Storage Account. Performance counter information can be sent to Azure Monitor (i.e. Log Analytics), but not system logs and other data sources.

This extension can be deployed with the Set-AzureRmVMDiagnosticsExtension cmdlet.

If using an ARM template, it is referenced as resource name “Microsoft.Insights.VMDiagnosticsSettings” (by default and by recommendation), publisher “Microsoft.Azure.Diagnostics”, and type “IaaSDiagnostics”.

Often Microsoft docs refer to this as “guest-level monitoring”.

Frustratingly, this agent is also known as “Microsoft Monitoring Agent Diagnostics”, particularly in Visual Studio. This nomenclature conflicts with the Log Analytics agent, but doesn’t seem to be very common.

This type of monitoring is needed to enable the full suite of Cloudyn metrics and optimizations, particularly for memory counters. It is called “extended metrics”, and the Microsoft Docs article specifically says that it is not compatible with Log Analytics.

The Azure Diagnostic extension (for VMs) is not to be confused with Azure Monitor Diagnostic logs, which feed data from Azure services into Azure Monitor.

Log Analytics Agent

Commonly referred to as the “Microsoft Monitoring Agent” (MMA), this is used to collect data from many different types of sources and enable Azure Monitor solutions in the workspace on IaaS resources. This is the direct integration of a VM into Azure Monitor beyond the default metrics that are provided. It is also used to support the Hybrid Runbook Worker feature of Azure Automation.

The MMA is also a required component for Azure Update Management.

There is a Log Analytics VM extension that installs the Log Analytics agent and registers it with a particular workspace.

If using an ARM template, it is referenced as a resource with publisher “Microsoft.EnterpriseCloud.Monitoring”, type “MicrosoftMonitoringAgent”, and commonly name “OMSExtension”.
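For reference, the shape of that resource can be sketched as a Python dictionary that serializes to the JSON you would place in a template. The publisher, type and name come from the paragraph above. The apiVersion, handler version, settings keys and function name are assumptions for illustration and should be checked against current documentation.

```python
import json


def mma_extension_resource(vm_name, workspace_id, workspace_key, location):
    """Build an ARM resource dict for the Log Analytics (MMA) VM extension."""
    return {
        "type": "Microsoft.Compute/virtualMachines/extensions",
        "apiVersion": "2018-06-01",  # assumption: adjust to your template
        "name": f"{vm_name}/OMSExtension",
        "location": location,
        "properties": {
            "publisher": "Microsoft.EnterpriseCloud.Monitoring",
            "type": "MicrosoftMonitoringAgent",
            "typeHandlerVersion": "1.0",  # assumption
            "autoUpgradeMinorVersion": True,
            # Assumed setting names: workspace ID in the public settings,
            # workspace key in the protected settings.
            "settings": {"workspaceId": workspace_id},
            "protectedSettings": {"workspaceKey": workspace_key},
        },
    }


# Hypothetical values for illustration only.
resource = mma_extension_resource("vm01", "<workspace-id>", "<workspace-key>", "canadacentral")
print(json.dumps(resource, indent=2))
```

In a real template the workspace values would normally be parameters or references rather than literals.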

If using PowerShell, it can be deployed with this cmdlet: Set-AzureRmVMExtension -ExtensionName "Microsoft.EnterpriseCloud.Monitoring"

Important: the Azure Security Center automatically provisions the Log Analytics agent (MMA) and connects it to a workspace – usually a new one but you can configure it to use an existing one.

A question I’m still investigating is whether enabling a VM for Azure Security Center with the MMA will automatically enable appropriate features within Azure Monitor – will the VM become a data source for Log Analytics with the collection of system logs and performance counters?

Azure Support Plan discovery

I learned some things about Azure Support Plans recently.

A co-worker was tasked with adding an Azure Support Plan to a new subscription that was being created. So they went to portal.azure.com, clicked “Help and Support”, and then from the drop-down selected the new subscription ID.

Then they clicked Change Plan and added a Standard support plan. Upon investigation, this appeared to add a support plan to ALL subscriptions, which was scary. We don’t want to be charged 10 x $100 USD per month!

Working through a support case with Microsoft, I learned the following about support plans that cleared things up.

A support plan is tied to an Azure Account – the account that subscriptions are created under. The support plan is effectively its own subscription, not tied to individual subscriptions themselves. However, you won’t see it displayed this way in portal.azure.com.

It’s not until you log into https://account.windowsazure.com/Subscriptions that you see it itemized by itself and are able to view billing history for the support plan.

So if I have 10 subscriptions, and they are all created under one Azure Account (you can see this on the “Properties” page of your Subscriptions blade), then only a single support plan is needed.

This also clarifies why Microsoft’s instructions on removing a support plan say to “Go to the portal, and click ‘cancel subscription’”.

When I thought that this was on a per-subscription basis, that made me afraid we’d cancel our actual subscription. Knowing now that this is its own subscription, that text makes a lot more sense.