Building a monitoring solution for Office 365 service availability using Azure Logic Apps

Photo by @rgrzybowski / Unsplash.com

I spent some time in early December building an integration between Azure Log Analytics (and Azure Sentinel, obviously) and Office 365 Audit Logs. The latter exposes the logs through Office 365 Management Activity API, which is very handy, albeit slightly cumbersome to use. I wrote about that experience here.

I noticed that there is also a separate Office 365 API I found useful, called the Office 365 Service Communications API. It’s much simpler to use, and exposes a limited set of data – Get Services, Get Current Status (of services), Get Historical Status, Get Messages and show Errors. Using this data, it’s easy to monitor what the status of your Office 365 tenant is, and if any of the services are degraded.

Getting started: Creating an Azure AD App

First, I need to provision an app in Azure Active Directory. This is so that I can then authenticate via Azure AD, and gain access to whatever services I need through OAuth2.

Navigate to Azure Portal, and then select Azure AD. From here, select App registrations. Click New registration. Fill in the name, and leave other selections as defaults.

Once you click Register, you have an Azure AD-supported app! We need to grant permissions next so that we can actually leverage the Office 365 Management APIs. Under the app, select API Permissions.

The permission you need to grant for the app, under API Permissions, is ServiceHealth.Read for the Office 365 Management APIs. Click Add a permission, and select Office 365 Management APIs:

Then, select Application permissions, as we’ll run this solution in the background without a signed-in user. Under Select permissions, choose ServiceHealth.Read:

You’ll also want to grant admin consent next, but first, you have to wait for a few minutes for the permissions to kick in.

Click Grant admin consent on TENANT, when the button becomes available. You’ll need to authenticate once more to perform this task.

Once this completes successfully, admin consent has been granted, and you’ll see a green checkmark:

Finally, generate the client secret under Certificates & secrets, by clicking on New client secret. Usually, it makes sense not to set expiration to Never, so perhaps choose 1 or 2 years, and click Add.

Record the value of the client secret, as you’ll only see it once here:

And that’s it! You should now have an Azure AD-supported app, with permissions to query the Service Health of your Office 365 tenant.

Building a Proof of Concept using PowerShell

I built a 5-minute Proof of Concept hack to try to get data from the API. It worked, much to my surprise.

As before, I needed to define a few variables to hold values – namely, the OAuth2-based authorization data. So first, define the following:

# Placeholders: fill in the values from your own Azure AD app registration
$clientID = "CLIENT_ID"
$clientSecret = "CLIENT_SECRET"
$loginURL = "https://login.microsoftonline.com"
$resource = "https://manage.office.com"
$tenantID = "TENANT_ID"
# Request a token using the OAuth2 client credentials grant
$body = @{grant_type="client_credentials";resource=$resource;client_id=$clientID;client_secret=$clientSecret}
$oauth = Invoke-RestMethod -Method Post -Uri "$loginURL/$tenantID/oauth2/token?api-version=1.0" -Body $body
# Build the Authorization header for the API calls that follow
$headerParams = @{'Authorization'="$($oauth.token_type) $($oauth.access_token)"}
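If you want to reuse the token logic, here is a minimal sketch that wraps it in two functions, so the request details can be inspected before anything is sent. The function names New-TokenRequest and Get-AuthHeader are my own for illustration and are not part of any module; the endpoint and fields are the same ones used in the snippet above.

```powershell
# Hypothetical helper: builds the token request without sending it.
function New-TokenRequest {
    param(
        [Parameter(Mandatory)] [string] $TenantId,
        [Parameter(Mandatory)] [string] $ClientId,
        [Parameter(Mandatory)] [string] $ClientSecret,
        [string] $Resource = 'https://manage.office.com'
    )
    @{
        Uri  = "https://login.microsoftonline.com/$TenantId/oauth2/token?api-version=1.0"
        Body = @{
            grant_type    = 'client_credentials'
            resource      = $Resource
            client_id     = $ClientId
            client_secret = $ClientSecret
        }
    }
}

# Hypothetical helper: sends the request and returns the Authorization header.
function Get-AuthHeader {
    param([Parameter(Mandatory)] [hashtable] $TokenRequest)
    # -ErrorAction Stop turns a failed token request into a terminating error
    $oauth = Invoke-RestMethod -Method Post -Uri $TokenRequest.Uri -Body $TokenRequest.Body -ErrorAction Stop
    @{ Authorization = "$($oauth.token_type) $($oauth.access_token)" }
}

# Build the request from placeholder values; nothing is sent at this point.
$request = New-TokenRequest -TenantId 'TENANT_ID' -ClientId 'CLIENT_ID' -ClientSecret 'CLIENT_SECRET'
```

With real values in place, calling Get-AuthHeader -TokenRequest $request should give you the same kind of header hashtable as $headerParams.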

You can easily check out your Azure AD tenant ID from the Azure Portal, by navigating to Azure AD > Properties:

Alternatively, if you prefer scripting, you can open Azure Cloud Shell and run the following command:

az account show
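The command prints a JSON document, and the tenant ID is in its tenantId field. As a small sketch, this is one way to pull that value out in PowerShell. The sample JSON below is a trimmed-down stand-in with placeholder GUIDs, and the function name Get-TenantIdFromAccountJson is hypothetical.

```powershell
# Trimmed sample of what `az account show` prints (placeholder values only).
$sampleJson = @'
{
  "id": "00000000-0000-0000-0000-000000000000",
  "name": "Example subscription",
  "tenantId": "11111111-1111-1111-1111-111111111111"
}
'@

# Hypothetical helper: extracts the tenantId field from the JSON text.
function Get-TenantIdFromAccountJson {
    param([Parameter(Mandatory)] [string] $Json)
    ($Json | ConvertFrom-Json).tenantId
}

$tenantID = Get-TenantIdFromAccountJson -Json $sampleJson

# Against a live session (assumes the Azure CLI is installed and logged in):
# $tenantID = Get-TenantIdFromAccountJson -Json ((az account show) -join "`n")
```

If you only need the value itself, az account show --query tenantId --output tsv should print just the tenant ID.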

You can run the PowerShell now. So far it only sets a few variables and requests a token, so it should just work:

Next, adding to our PowerShell, we need to call the API, and parse the results. First, call the Office 365 Service Communications API and get the current status for all services:

$status = Invoke-WebRequest -Method GET -Headers $headerParams -Uri "https://manage.office.com/api/v1.0/$tenantID/ServiceComms/CurrentStatus"

We’re interested in the payload, which is exposed through the .Content property and holds a bunch of data in JSON, so let’s add logic for that next:

$stats = $status.Content | ConvertFrom-Json 

And finally, let’s identify the services that are not in a normal operational state, and print them out with relevant data:

$servicesWithIssues = $stats.value | where { $_.Status -ne "ServiceOperational" }
$servicesWithIssues | select WorkloadDisplayName, Status, StatusTime, StatusDisplayName, IncidentIds
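To try the filtering without calling the API, here is a small sketch that runs the same logic against a hand-written sample payload. The sample only mimics the shape of the real response; the incident IDs are made up, and the function name Get-ServiceWithIssue is hypothetical.

```powershell
# Hand-written sample in the shape of the CurrentStatus payload (made-up incident IDs).
$sampleContent = @'
{ "value": [
  { "WorkloadDisplayName": "Exchange Online", "Status": "ServiceRestored", "IncidentIds": ["EX000001"] },
  { "WorkloadDisplayName": "Skype for Business", "Status": "ServiceDegradation", "IncidentIds": ["LY000002"] },
  { "WorkloadDisplayName": "SharePoint Online", "Status": "ServiceOperational", "IncidentIds": [] }
] }
'@

# Hypothetical helper: returns every service whose status is not ServiceOperational.
function Get-ServiceWithIssue {
    param([Parameter(Mandatory)] [string] $Content)
    $stats = $Content | ConvertFrom-Json
    $stats.value | Where-Object { $_.Status -ne 'ServiceOperational' }
}

$issues = Get-ServiceWithIssue -Content $sampleContent
$issues | Select-Object WorkloadDisplayName, Status, IncidentIds
```

With this sample, Exchange Online and Skype for Business are returned, and SharePoint Online is filtered out.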

We can see from the output that several services, including Skype for Business and Exchange Online, have encountered issues. Exchange Online has recovered, while Skype for Business still has service degradation. The relevant incident IDs are also printed. You can query for those specific incident IDs directly via the API, or you can check them out through the Microsoft 365 Admin Center, under Service Health:
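For the API route, here is a rough sketch of building the request URL for a single incident. It assumes the Messages operation of the Service Communications API accepts an OData $filter on Id, so check the API reference before relying on it. The function name Get-IncidentMessageUri and the incident ID are made up for illustration.

```powershell
# Hypothetical helper: builds the Messages URL for one incident ID.
# Assumption: the Messages operation supports an OData $filter on Id.
function Get-IncidentMessageUri {
    param(
        [Parameter(Mandatory)] [string] $TenantId,
        [Parameter(Mandatory)] [string] $IncidentId
    )
    # URL-encode the filter expression so the spaces survive the request
    $filter = [uri]::EscapeDataString("Id eq '$IncidentId'")
    "https://manage.office.com/api/v1.0/$TenantId/ServiceComms/Messages?`$filter=$filter"
}

# Placeholder tenant ID and a made-up incident ID
$incidentUri = Get-IncidentMessageUri -TenantId 'TENANT_ID' -IncidentId 'LY000002'
```

You could then pass $incidentUri to Invoke-WebRequest together with the same $headerParams used for the CurrentStatus call.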

Building the solution using Azure Logic Apps

While PowerShell is great, ideally it would be nice to automate this in a bit more enterprise-focused way. That’s where Logic Apps is highly useful, as it allows us to orchestrate a process, which can then later be modified easily. In essence, the Logic App mimics the behavior we just built with PowerShell!

I provisioned an empty Logic App via Azure Portal and opened the designer. You could use more automated ways for doing this, such as using Visual Studio to create the Logic App, but for this build, I chose to do this manually.

Once you have a Logic App provisioned, open the visual editor. I set my Logic App to use a Recurrence trigger, so it runs on an interval. I set mine to once per day:

Next, I need to specify the OAuth2-specific variables. Logic App has a neat way to define variables, so I added three string-based variables named ClientID, ClientSecret, and TenantID.

While it isn’t ideal to store my secrets and other sensitive data directly in my Logic App, it makes it easier to understand what’s being built here. In a production implementation, you’d want to store these secrets in Azure Key Vault, and fetch them dynamically using the Azure Key Vault connector:

Once we have our three secrets, it’s time to call Azure AD and get our authentication token. Create an HTTP action, and define the following values:
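For reference, here is a rough PowerShell sketch of the form-encoded body such an HTTP action would send. It assumes the same token endpoint and fields as in the PowerShell prototype, that the request is a POST, and that the Content-Type header is application/x-www-form-urlencoded. The function name ConvertTo-FormBody is hypothetical.

```powershell
# Hypothetical helper: turns a dictionary into a URL-encoded form body.
function ConvertTo-FormBody {
    param([Parameter(Mandatory)] [System.Collections.IDictionary] $Fields)
    ($Fields.GetEnumerator() | ForEach-Object {
        '{0}={1}' -f [uri]::EscapeDataString([string]$_.Key), [uri]::EscapeDataString([string]$_.Value)
    }) -join '&'
}

# Same fields as the PowerShell prototype; values are placeholders.
$fields = [ordered]@{
    grant_type    = 'client_credentials'
    client_id     = 'CLIENT_ID'
    client_secret = 'CLIENT_SECRET'
    resource      = 'https://manage.office.com'
}

$formBody = ConvertTo-FormBody -Fields $fields
$formBody
```

In the Logic App, the ClientID and ClientSecret variables take the place of the placeholder values, and the TenantID variable goes into the URL.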

You’ll see that in the body of the request we use the two variables (ClientID and ClientSecret) to call Azure AD with the details of our Azure AD-backed app. Let’s pick up the token that we receive through this call, and store it in a variable:

Next, we’ll perform the actual call to Office 365 Service Communications API, and pass on the relevant details for access:

And that’s it! We’ll simply need to pick up interesting details from the content we receive, and process that. As the content will still be JSON-formatted, I’m creating yet another variable to hold that for easier processing:

I’m then using the Parse JSON action to parse the content:

You can generate a sample payload, or pick content from the .Content property you saw when you built the PowerShell prototype.

I’m sending myself an email once per day, and for convenient handling of the content I’ll create a variable for this:

And now the tricky part – we’ll loop through the JSON-object (via the variable), and pick up the services that are degraded or otherwise not in top-notch condition. For each service we find, we’ll add a line in our email.

I’ll add a For-Each action, loop through the Status field from the JSON-variable, and if that does not match ServiceOperational, we need to look at it.

The details we need to inject into our email are simple:
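The same loop is easy to prototype in PowerShell first. Here is a minimal sketch that builds one line per non-operational service. The line format, the fallback text for a healthy tenant, and the function name New-StatusEmailBody are my own assumptions, not what the Logic App uses.

```powershell
# Hypothetical helper: builds the email text, one line per service with issues.
function New-StatusEmailBody {
    param([object[]] $Services)
    $lines = foreach ($service in $Services) {
        if ($service.Status -ne 'ServiceOperational') {
            '{0}: {1} (since {2})' -f $service.WorkloadDisplayName, $service.StatusDisplayName, $service.StatusTime
        }
    }
    # Assumed fallback text for the case where nothing is wrong
    if (-not $lines) { return 'All services are operational.' }
    @($lines) -join "`n"
}

# Made-up sample data in the shape of the CurrentStatus entries
$services = @(
    [pscustomobject]@{ WorkloadDisplayName = 'Exchange Online'; Status = 'ServiceRestored'; StatusDisplayName = 'Service restored'; StatusTime = '2020-01-07T10:00:00Z' }
    [pscustomobject]@{ WorkloadDisplayName = 'Skype for Business'; Status = 'ServiceDegradation'; StatusDisplayName = 'Service degradation'; StatusTime = '2020-01-07T12:30:00Z' }
    [pscustomobject]@{ WorkloadDisplayName = 'SharePoint Online'; Status = 'ServiceOperational'; StatusDisplayName = 'Normal service'; StatusTime = '2020-01-07T12:30:00Z' }
)

$emailBody = New-StatusEmailBody -Services $services
$emailBody
```

In the Logic App, the condition inside the For-Each does the same comparison, and the email variable plays the role of $emailBody.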

Obviously, you might want to format the StatusTime and other details. I don’t have time for that, as I’m already running out of coffee.

Finally, once we’re out of the For-Each loop, let’s send the email:

And that is it! Here’s the final Logic App orchestration:

You can run it manually to verify it works. Each action will get a green checkmark upon successful completion:

And after 5 seconds or so, you should receive an email:

In conclusion

This was yet another fun build. It taught me a lot about accessing OAuth2-secured APIs using Logic Apps. The variable handling is superb, although you need to rename and track your actions carefully.

Parsing the JSON in Logic App was easy, but once you resort to using loops, it becomes a bit trickier again. Pay attention to what data you are getting, and don’t hesitate to do trial runs to debug any issues.

There are a lot of things you could build on top of this – automated actions based on service degradation status, comparisons to previous days, and letting your IT admins know (via Teams messages and Actionable Cards, for example) when something breaks down.

Logic Apps is highly flexible and versatile, and I’m very happy with how this small build turned out!