Azure Function called from PRTG REST sensor

Using PRTG for service monitoring has been fairly effective for me, particularly with HTTP Advanced sensors to monitor a website. However, as more and more Azure resources are utilized, I want to continue to centralize my alerting and notifications within a single platform and that means integrating some Azure resource status into PRTG.

At a high level, here’s what needs to happen:

  • Azure Function triggered on HTTP POST to query Azure resources and return data
  • PRTG custom sensor template to interpret the results of the Function data
  • PRTG custom lookup to establish a default up/down threshold
  • PRTG REST sensor to trigger the function, and use the sensor template and custom lookup to properly display results

Azure Function

For my first use case, I wanted to see the health status of the back-end pool members of an Application Gateway.

The goal is that if a member becomes unhealthy, PRTG alerts using our normal mechanisms. I could use an Azure Monitor alert to trigger something when this event happens, but in practice it is easier for PRTG to poll than for Azure Monitor to trigger something in PRTG.

I’m not going to cover the full walk-through of building an Azure Function; instead here is a good starting point.

I’m using a PowerShell function, where the full source can be found here: GitHub link

Here’s a snippet of the part doing the heavy lifting:

#Proceed if all request body parameters are found
if ($appgwname -and $httpsettingname -and $resourcegroupname -and $subscriptionid -and $tenantid) {
    $status = [HttpStatusCode]::OK
    # Make sure we're using the right Subscription
    Select-AzSubscription -SubscriptionID $subscriptionid -TenantID $tenantid
    # Get the health status, using the Expanded Resource parameter
    $healthexpand = Get-AzApplicationGatewayBackendHealth -Name $appgwname -ResourceGroupName $resourcegroupname -ExpandResource "backendhealth/applicationgatewayresource"
    # If serving multiple sites out of one AppGw, use the parameter $httpsettingname to filter so we can better organize in PRTG
    $filtered = $healthexpand.BackEndAddressPools.BackEndhttpsettingscollection | where-object { $_.Backendhttpsettings.Name -eq "$($httpsettingname)-httpsetting" }
    # Return results as boolean integers, either healthy or not. Could modify this to return additional values if desired
    $items = $filtered.Servers | select-object Address, @{Name = 'Health'; Expression = { if ($_.Health -eq "Healthy") { 1 } else { 0 } } }
    # Add a top-level property so that the PRTG custom sensor template can interpret the results properly
    $body = @{ items = $items }

}

You can test this function using the Azure Functions GUI or Postman, or PowerShell like this:

$appsvcname = "appsvc.azurewebsites.net"
$functionName = "Get-AppGw-Health"
$functionKey = "insert key here"
$Body = @"
{
    "httpsettingname": "prodint",
    "resourcegroupname": "rgname",
    "appgwname": "appgw name",
    "subscriptionid": "subid",
    "tenantid": "tenant id"
}
"@
$URI = "https://$($appsvcname)/api/$($functionName)?code=$functionKey"
Invoke-RestMethod -Uri $URI -Method Post -Body $Body -ContentType "application/json"

Expected results would look like this:
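As a rough illustration of the response shape, the sketch below runs the same Select-Object mapping from the snippet above against hypothetical sample data (the addresses and health values are made up, not real output). The resulting body has a top-level "items" array, where each entry carries an Address and a Health value of 1 or 0.

```powershell
# Hypothetical sample data standing in for $filtered.Servers
$servers = @(
    [pscustomobject]@{ Address = '10.0.0.4'; Health = 'Healthy' }
    [pscustomobject]@{ Address = '10.0.0.5'; Health = 'Unhealthy' }
)

# Same mapping as the function: 1 for Healthy, 0 for anything else
$items = @($servers | Select-Object Address, @{
        Name       = 'Health'
        Expression = { if ($_.Health -eq 'Healthy') { 1 } else { 0 } }
    })

# Wrap in a top-level "items" property, as the function does, and show the JSON
$json = @{ items = $items } | ConvertTo-Json -Depth 3
$json
```

Wrapping the pipeline in @( ) is an addition here; it keeps "items" an array even when only one server is returned.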

You can see the “Function Key” parameter in the code above; I’ve created a function key for PRTG to authenticate with, rather than making this function part of a private VNet.

 

PRTG Custom Sensor Template

Now, in order to have PRTG interpret the results of that JSON body, and automatically create channels associated with each Item, we need to use a custom sensor template.

Here’s mine (github link):

{
  "prtg": {
    "description" : {
      "device": "azureapplicationgateway",
      "query": "/api/Get-AppGw-Health?code={key}",
      "comment": "Documentation is in Doc Library"
    },
    "result": [
      {
        "value": {
          #1: $..({ @.Address : @.Health }).*
        },
        "valueLookup": "prtg.customlookups.healthyunhealthy.stateonok",
        "LimitMode": 0,
        "unit": "Custom",
      }
    ]
  }
}

The important part here is the “value” property. The syntax for this isn’t officially documented, but Paessler support has provided a couple of examples that I used, such as here and here. The #1 before the first colon sets the channel name, and uses the first argument referenced within the braces (@.Address in this case). What is inside the braces associates the channel name with the value that is returned, which in this case is the boolean integer for “Health” that the Azure Function returns.

The valueLookup property references the custom lookup explained below.

This Sensor Template file needs to exist on the PRTG Probe that will be calling it, in this location:

Program Files (x86)\PRTG Network Monitor\Custom Sensors\rest

 

PRTG Custom Lookup

I want to be able to control what PRTG detects as “down” based on the values that my Function is returning. To do so, we apply a custom lookup (github link):

<?xml version="1.0" encoding="UTF-8"?>
  <ValueLookup id="prtg.customlookups.healthyunhealthy.stateonok" desiredValue="1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="PaeValueLookup.xsd">
    <Lookups>
      <SingleInt state="Error" value="0">
        Unhealthy
      </SingleInt>
      <SingleInt state="Ok" value="1">
        Healthy
      </SingleInt>
    </Lookups>
  </ValueLookup>

This matches my 0 or 1 return values to the PRTG states that a channel can be in. You can modify this to suit your Function return values.

This lookup file needs to exist on the PRTG Core, in this location:

Program Files (x86)\PRTG Network Monitor\lookups\custom

PRTG REST sensor

So let’s put it all together now. Create a PRTG REST Sensor, and apply the following settings to it:

Your PostData should match the parameters you’re receiving into your Azure Function. The REST query directs to the URL where your function is located, and uses the function key that was generated for authentication purposes.

Important to note: the REST sensor depends upon the Device it is created under to generate the hostname for the URL. This means you’ll need to have a Device created with a hostname matching your Function App URL, where the sensor’s REST Query references the function name itself.

Make sure the REST Configuration is directed to your custom sensor template; this can only be set on creation.

Then, for each channel that gets auto-detected, modify its settings and apply the custom lookup:

With that, you should have a good sensor in PRTG relaying important information collected from Azure!

I’m also using this same method to collect Azure Site Recovery status from Azure and report it within PRTG, using this Function.

Update SCCM Maintenance Window through PowerShell

Sometimes trying to stay bleeding edge is tough – today’s example is when you want to install updates through SCCM a day or two after Patch Tuesday, particularly using Maintenance Windows to allow restarts within a specific time frame.

We use automatic deployment rules to update a Software Update Group every Patch Tuesday – scheduling this is easy because it’s always the second Tuesday of a month.

But I want the updates to install on Wednesday night or Thursday morning. This ensures strict compliance requirements can be met, but allows 24 hours for testing. I can’t just schedule the install and restarts for the “second Wednesday of the month” though, because if the first of the month is a Wednesday (like this month), then our actual install date falls on the THIRD Wednesday of the month.

Previously we solved this by manually updating Maintenance Window schedules every month, painstakingly selecting the right date and hoping we didn’t mess it up.

PowerShell took that risk away:

 

SCCM_UpdateMaintenanceWindowSchedule.ps1 – on GitHub

 

See the comments on the script for details of how it works. As a very brief overview:

I found Tim Curwick’s method of calculating Patch Tuesday, and used that in my script to reliably calculate my Wednesday or Thursday install date.

This script runs as System from an SCCM server on the 1st of every month. It performs the calculation, updates maintenance windows on specific Collections, and outputs a log to file and emails results.
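To illustrate the date problem described above, here is a minimal sketch of one way to find the second Tuesday of a month and derive the install date from it. This is not Tim Curwick’s method or the script on GitHub; the function name Get-PatchTuesday and the 22:00 install time are assumptions made for the example.

```powershell
# Hypothetical helper: returns the second Tuesday of the given month
function Get-PatchTuesday {
    param(
        [int]$Year  = (Get-Date).Year,
        [int]$Month = (Get-Date).Month
    )
    $firstOfMonth = [datetime]::new($Year, $Month, 1)
    # Days from the 1st to the first Tuesday (0 when the 1st is a Tuesday)
    $daysToTuesday = (([int][DayOfWeek]::Tuesday - [int]($firstOfMonth.DayOfWeek)) + 7) % 7
    # The second Tuesday is one week after the first
    $firstOfMonth.AddDays($daysToTuesday + 7)
}

# May 2019 starts on a Wednesday, so Patch Tuesday is 14 May
$patchTuesday = Get-PatchTuesday -Year 2019 -Month 5

# Install the night after Patch Tuesday (22:00 is an assumed start time)
$installStart = $patchTuesday.AddDays(1).AddHours(22)
$installStart
```

Because the install date is derived from Patch Tuesday rather than from “the Nth Wednesday”, it lands on 15 May here, which is the third Wednesday of that month.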

 

Azure Site Recovery – ARM template

Programmatic deployment of Azure Site Recovery for Azure VMs – that’s the target I started with for a project a little while ago.

There is a large amount of information for Azure Site Recovery on Microsoft’s Docs, but there is very little code available online to programmatically deploy a full setup! Much of what Microsoft provides is PowerShell, which isn’t idempotent and doesn’t fit with my current tooling (Terraform and declarative infrastructure-as-code). I did consider Azure CLI, but couldn’t find any references for Site Recovery.

While working on this I came across an Azure Quickstart that is a little incomplete: it starts with some good variable definitions but quickly devolves into hard-coded values from its original environment. I also discovered a blog post from Pratap Bhaskar, which was especially useful for understanding the loop mechanism in the template, but it didn’t go far enough for my purposes.

So I spun up a few VMs, manually configured ASR, and did an ARM template export. What I received was a huge number of properties on the resources that I was sure were relevant only at runtime, not at creation. This was also a good reference, but not exactly where I needed to be.

The final piece of the puzzle that got me on my way was the REST API docs for Site Recovery. With this in hand, and the other sources I had at my disposal, I had the references I needed to begin putting together an ARM Template that would configure my environment end-to-end including Recovery Plans with automation runbooks.

There are a lot of design decisions I made when building this to fit my environment, some of which won’t make sense without additional context; most of which I can’t provide. That’s ok, as I hope it at least serves as a reference for “what’s possible” to others who come across it. Here is the overall structure:

  • Pre-define and create destination resources like resource groups, virtual networks, and subnets with Terraform
  • Deploy ASR for a subset of Virtual Machines, targeting the destination resources
    • Include dependent resources like source-side storage account for cache, and azure automation account in the same region and subscription as the ASR resources
  • Deploy a Recovery Plan that provides runbook functionality to configure a Test Failover environment
    • This environment was intended to be completely isolated, to ensure there’s no chance of contamination with prod
    • Current design of my web servers has multiple ip configurations; these need to be replaced
    • Access to the environment is provided through a Jump Host, which needs a known-in-advance IP address

 

I have documented this output on a GitHub repo called arm-azuresiterecovery

 

 

The majority of my time was actually spent cleaning up the template into proper parameters and variables for effective re-use, and then solving all the syntax challenges and typos that come along with that.

 

There are still some loose ends in what I’ve created, around certain manual steps still required. However, as I’m sure is common in the industry, it is good enough to deploy and I must move on – fine-tuning comes later.

Invoke-AzVMRunCommand log output

While working with the Invoke-AzVmRunCommand cmdlet, I encountered a problem, receiving the following error:

Invoke-AzVMRunCommand : Long running operation failed with status 'Failed'. Additional Info:'VM has reported a failure when processing extension 'RunCommandWindows'. Error message: "Finished executing command".'

I was attempting to push a DSC configuration into a VM and run it, and although the PowerShell ran successfully when run from inside the VM, I still saw this error.

I came across a github issue with a helpful tip: the log output of Invoke-AzVmRunCommand can be viewed from this path:

C:\Packages\Plugins\Microsoft.CPlat.Core.RunCommandWindows\<version>\Status

This helped me determine what the problem was and how to solve it.
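As a minimal sketch of how that folder could be inspected from inside the VM, the snippet below reads the most recently written file from any Status folder under the plugin path, using a wildcard in place of the version folder. The function name Get-LatestRunCommandStatus and its -PluginRoot parameter are hypothetical, and the sketch assumes nothing about the file names or their format.

```powershell
# Hypothetical helper: returns the content of the newest file in the RunCommand Status folder(s)
function Get-LatestRunCommandStatus {
    param(
        [string]$PluginRoot = 'C:\Packages\Plugins\Microsoft.CPlat.Core.RunCommandWindows'
    )
    # The wildcard stands in for the <version> folder
    $statusDirs = Join-Path -Path (Join-Path -Path $PluginRoot -ChildPath '*') -ChildPath 'Status'
    $latest = Get-ChildItem -Path $statusDirs -File -ErrorAction SilentlyContinue |
        Sort-Object -Property LastWriteTime -Descending |
        Select-Object -First 1
    if ($latest) {
        Get-Content -Path $latest.FullName -Raw
    }
}

# Returns nothing if the folder does not exist on this machine
Get-LatestRunCommandStatus
```

Sorting by LastWriteTime is a design choice for convenience; when troubleshooting a specific run it may be better to list every file in the folder instead.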

Use Azure Function to start VMs

A use case came up recently to provide a little bit of self-service to some users with an occasionally used Virtual Machine in Azure. I have configured Auto-Shutdown on the VM, but want users to be able to turn the VM on at their convenience.

Traditionally, this would be done by using Role-Based Access Control (RBAC) over the resource group or VM to allow users to enter portal.azure.com and start the VM with the GUI.

However, I wanted to streamline this process without having to manage individual permissions, given the low risk of the resource. To do so, I’m using an Azure Function (v2, PowerShell) to start all the VMs in a resource group.

 

First create your function app (Microsoft Docs link) as a PowerShell app – this is still in preview as a Function V2 stack, but it is effective.

The next thing I did was create a Managed Identity in my directory for this Function app. I wanted to ensure that the code the Function runs is able to communicate with the Azure Resource Manager, but did not want to create and manage a dedicated Service Principal.

Within the Function App Platform Features section, I created a Managed Identity for it to authenticate against my directory to access resources:

Go to “Identity”:

Switch “System Assigned” to ON and click Save:

With the Managed Identity now created, you can go to your Subscription or Resource Group, and add a Role assignment under “Access control (IAM)”:

Lastly, I developed the following code to place into the function (github gist):

using namespace System.Net

# Input bindings are passed in via param block.
param($Request, $TriggerMetadata)

# Interact with query parameters or the body of the request.
$rgname = $Request.Query.resourcegroup
if (-not $rgname) {
    $rgname = $Request.Body.resourcegroup
}
$action = $Request.Query.action
if (-not $action) {
    $action = $Request.Body.action
}
$subscriptionid = $Request.Query.subscriptionid
if (-not $subscriptionid) {
    $subscriptionid = $Request.Body.subscriptionid
}
$tenantid = $Request.Query.tenantid
if (-not $tenantid) {
    $tenantid = $Request.Body.tenantid
}

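# Proceed only if all required parameters were supplied
if ($rgname -and $action -and $subscriptionid -and $tenantid) {
    $status = [HttpStatusCode]::OK
    # Make sure we're using the right subscription
    Select-AzSubscription -SubscriptionID $subscriptionid -TenantID $tenantid
    if ($action -ceq "get") {
        # Return the name and power state of each VM in the resource group
        $body = Get-AzVM -ResourceGroupName $rgname -Status | Select-Object Name, PowerState
    }
    elseif ($action -ceq "start") {
        # Start every VM in the resource group and return the results
        $body = Get-AzVM -ResourceGroupName $rgname | Start-AzVM
    }
    else {
        # Unknown action: report it instead of returning an empty OK response
        $status = [HttpStatusCode]::BadRequest
        $body = "Unsupported action. Use 'get' or 'start'."
    }
}
else {
    $status = [HttpStatusCode]::BadRequest
    $body = "Please pass resourcegroup, action, subscriptionid, and tenantid on the query string or in the request body."
}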

# Associate values to output bindings by calling 'Push-OutputBinding'.
Push-OutputBinding -Name Response -Value ([HttpResponseContext]@{
    StatusCode = $status
    Body = $body
})

To provide secure access, I left my Function app with anonymous authentication, but added a new Function Key which I could use and control when calling this function. This is found under the “Manage” options of the function:

To test, I called the function externally like this, passing in my request body parameters and using the Function Key that was generated. You can grab the URL for your function right near the top “Save” button when you’re editing it in the Portal.

$Body = @"
{
    "resourcegroup": "source-rg",
    "action": "start",
    "subscriptionid": "SUBID",
    "tenantid": "TENANTID"
}
"@
$URI = "https://hostname.azurewebsites.net/api/startVMs?code=FUNCTIONKEY"
Invoke-RestMethod -Uri $URI -Method Post -Body $Body -ContentType "application/json"

 

If I run this with the “get” action, then the Function will return the status of each VM in the resource group:
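As a sketch of what that call might look like, the snippet below builds the same request with the action set to “get”. The helper name New-VmFunctionRequest is hypothetical, and the URL, key, and IDs are the same placeholders used above; the real call is left commented out because it needs a deployed function.

```powershell
# Hypothetical helper: builds the parameter set for Invoke-RestMethod
function New-VmFunctionRequest {
    param(
        [Parameter(Mandatory)][string]$Uri,
        [Parameter(Mandatory)][string]$ResourceGroup,
        [ValidateSet('get', 'start')][string]$Action = 'get',
        [Parameter(Mandatory)][string]$SubscriptionId,
        [Parameter(Mandatory)][string]$TenantId
    )
    $payload = @{
        resourcegroup  = $ResourceGroup
        # The function compares the action case-sensitively (-ceq), so send lowercase
        action         = $Action.ToLowerInvariant()
        subscriptionid = $SubscriptionId
        tenantid       = $TenantId
    }
    @{
        Uri         = $Uri
        Method      = 'Post'
        Body        = ($payload | ConvertTo-Json)
        ContentType = 'application/json'
    }
}

$request = New-VmFunctionRequest -Uri 'https://hostname.azurewebsites.net/api/startVMs?code=FUNCTIONKEY' `
    -ResourceGroup 'source-rg' -Action 'get' -SubscriptionId 'SUBID' -TenantId 'TENANTID'

# Uncomment to call the function; the response should carry Name and PowerState for each VM
# Invoke-RestMethod @request | Format-Table Name, PowerState
```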