Building a secure remote access solution for Azure-based virtual machines using Azure AD and Windows Admin Center
[Update October 1, 2019: I’ve published another blog post on Azure Bastion Host, which complements the findings and services I go through in this post. Perhaps have a look at it here!]
I’ve had some fun times lately with Azure. It seems there really is a second wave of adoption for cloud-based infrastructure and services from organizations. This is especially true in the Nordics, where Azure is commonly accepted as a trusted option for data centers and PaaS services.
This post stemmed from the idea of figuring out what options we have for accessing and managing virtual machines remotely while enforcing a secure approach. Ideally, we’d like to secure authentication with Azure AD, and optionally enforce Multi-Factor Authentication (MFA) – especially for guest users.
Business problem
I needed to set up a few Windows Server 2016-based virtual machines in Azure. They will be provisioned together in a new VNET, and each will have a public IP address. These VMs are meant to run certain business apps that require remote access to the server for continuous configuration and tweaking. I know this is not a common approach, but it is still very much in use.
The business problem came when we had these virtual machines provisioned and it became time to grant permissions. How do we grant and enforce secure remote access to these virtual machines for both in-house IT staff and a select few external consultants?
Mapping the options
I love challenges, if you couldn’t already tell by my previous posts. Back when life was still more datacenters and on-premises focused I remember jokingly saying to myself that there are always technical blockers and challenges to be figured out. And that’s the essence of working in IT!
If it was easy, everyone would be doing it.
So, I started this journey by mapping out the options – or at least researching a bit to find out what I needed to focus on.
Assuming we have a select number of virtual machines in Azure and need to provide access to them, I was able to find the following options:
Open port 3389/TCP (Remote Desktop) for selected IPs or networks. This is the obvious “top of my head” solution, as it’s a classic approach. It is also inherently insecure and hard to manage. External guests often have dynamic IP addresses or use proxy services, which means allowing access from huge and mostly unknown networks.
I also don’t feel comfortable opening RDP to the whole world, as it wouldn’t take more than a few minutes for port scans to hit the VMs. With multiple VMs it also becomes a burden to maintain and manage. Monitoring this would also be a nightmare.
Should I use this solution? No.
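To make the maintenance problem concrete, here is a minimal sketch of the kind of source-address check an allowlist boils down to. It is plain Python and not an Azure API; the networks are documentation ranges chosen for illustration, and the names ALLOWED_SOURCES and rdp_allowed are hypothetical.

```python
import ipaddress

# Hypothetical allowlist; documentation ranges stand in for "trusted" networks.
ALLOWED_SOURCES = [
    ipaddress.ip_network("203.0.113.0/24"),    # assumed office network
    ipaddress.ip_network("198.51.100.17/32"),  # assumed consultant's registered address
]


def rdp_allowed(source_ip: str) -> bool:
    """Return True if source_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in ALLOWED_SOURCES)


print(rdp_allowed("203.0.113.40"))   # office workstation
print(rdp_allowed("198.51.100.17"))  # consultant on the registered address
print(rdp_allowed("198.51.100.99"))  # same consultant after the ISP hands out a new address
```

The last call is the operational issue in a nutshell: every time a guest's address changes, someone has to update the list, or the range has to be widened until it no longer means much.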
Provision a Point-to-Site (P2S) VPN solution. Another option from my 20+ years as a consultant. “Use a VPN!” is so easy to throw around without thinking about the operational issues. Azure P2S VPN is a great solution for very limited scenarios. You have several protocols to choose from (SSTP, OpenVPN, IKEv2), there’s support for all major platforms (Windows and macOS use native clients, Linux uses StrongSwan and even iOS is supported) and VPNs are typically well understood.
The challenge with P2S in a scenario like this is first and foremost with routing. Different VNETs, peering and routing between these and managing it all requires a bit of planning and experimenting to get it right. The other challenge is distributing and provisioning the client settings and supporting all users with their numerous and inevitable VPN-related issues.
I placed this solution as a good candidate and a viable backup option but continued searching for other options.
Create a jump server that is more open, and use that to RDP to internal hosts. Another classic and something I frequently see still being used. While not inherently bad, it still poses several issues. One is that users still need access to the initial jump server – so typically this would entail opening RDP access for the whole world (or just ‘trusted’ networks). From there, users would – after a successful authentication – jump to the next server.
It sometimes leads to RDP inception:
I don’t think using a jump server is in any way a modern approach in 2019. It works, but it’s flawed. It is something we’ve used since 1999, and there certainly has to be something better by now.
I discarded this option altogether.
Provision a Remote Desktop deployment and publish RD Web Access. Now we’re finally looking into a more modern approach. I admit I’ve slightly lost touch with all things Remote Desktop Services (RDS). RemoteApp is no more, Citrix is still a strong contender and Windows Virtual Desktop is something we’ve been waiting for since Ignite last year (at the time of writing it’s not yet in public preview).
I spent a few hours brushing up on my understanding of what a proper RDS infrastructure would require in my scenario. Turns out, it’s far from simple.
In order to set up a web-based Remote Desktop portal, I would need to provision the RD Gateway, RD Connection Broker, and finally RD Web Access. I’d need per-user licensing and I’d also need to purchase trusted certificates. You can review this guidance here.
This brought back memories from a time when I last set something like this up in around 2008. It was a lot of configuration and many moving parts. Typically this implies things will eventually break, and someone needs to maintain and manage this quite actively.
I knew I would be able to meet my goal with this approach, but it just felt like a bit too much work studying and learning all things RDS to get a simple RDP connection published.
For now, I discarded this option, but kept it as a final, ultimate escape solution should everything else fail. Always good to have one or two get out of jail free cards.
Use Azure Security Center’s new Just-In-Time Virtual Machine Access capability. Moving on to something much more modern. I was enticed and simply overjoyed when I realized there’s an exact solution for my exact need. It also looks cool.
How it works is that users request RDP access, and depending on certain factors, Azure Security Center’s JIT capability grants access by reconfiguring the Network Security Group accordingly. In a way, it’s an automated access control list for RDP access. Very clever. You can see it in action here.
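The "automated access control list" idea can be sketched as a toy model in which access grants expire on their own. This is purely illustrative and is not how Azure Security Center is implemented or called; the class name JitAccessList, its methods and the three-hour cap are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class JitAccessList:
    """Toy model of a just-in-time allow list: every grant carries an expiry."""

    max_duration: timedelta = timedelta(hours=3)  # assumed policy limit
    _grants: dict = field(default_factory=dict)   # source IP -> expiry time

    def request_access(self, source_ip: str, duration: timedelta, now: datetime) -> datetime:
        """Open RDP for source_ip, capped at max_duration, and return the expiry."""
        expiry = now + min(duration, self.max_duration)
        self._grants[source_ip] = expiry
        return expiry

    def is_allowed(self, source_ip: str, now: datetime) -> bool:
        """A connection is allowed only while an unexpired grant exists."""
        expiry = self._grants.get(source_ip)
        return expiry is not None and now < expiry


acl = JitAccessList()
start = datetime(2019, 2, 1, 9, 0)
acl.request_access("198.51.100.99", timedelta(hours=8), now=start)  # asks for 8 h, gets 3 h
print(acl.is_allowed("198.51.100.99", now=start + timedelta(hours=1)))  # inside the window
print(acl.is_allowed("198.51.100.99", now=start + timedelta(hours=4)))  # after the capped expiry
print(acl.is_allowed("203.0.113.40", now=start))                        # never requested access
```

The point of the model is that nothing stays open by default: the port is closed unless someone has an active, time-limited grant.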
Sadly, it will also cost you extra. I shouldn’t be complaining about the cost, but it also means there’s something I need to sell – and I typically prefer building solutions to selling solutions that I then need to build. Essentially, you’ll need the Standard tier of Azure Security Center, which you can allocate per resource group or per subscription.
As I’m writing this in early February 2019 the cost is around 13 €/VM/month, which includes 500 MB of data. Additional data is 1.94 €/GB. The data is for analytics, not for RDP traffic, so in essence the cost is around 13 € for each VM within a given resource group. While not inherently much, it still adds up to hundreds of euros each year for RDP access.
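As a quick back-of-the-envelope check on "hundreds of euros", the yearly figure can be worked out from the per-VM price quoted above. The VM counts are examples only, and the calculation assumes the 500 MB data allowance is not exceeded.

```python
MONTHLY_COST_PER_VM_EUR = 13  # approximate Standard tier price quoted above


def annual_cost_eur(vm_count: int) -> int:
    """Yearly Standard tier cost, ignoring any additional data charges."""
    return MONTHLY_COST_PER_VM_EUR * 12 * vm_count


for vm_count in (1, 2, 5):
    print(f"{vm_count} VM(s): {annual_cost_eur(vm_count)} EUR per year")
```

That works out to 156 € per VM per year, so even a handful of VMs lands in the hundreds.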
I love this approach and it’s a strong contender – but let’s see if we have even more options to choose from.
Provision a Network Virtual Appliance with proper firewall and authentication controls. NVAs provide enterprises a secure approach for building hybrid solutions. They provide Layer 7 traffic support and provide much finer control over network traffic than Network Security Groups.
Pricing is not easily accessible, as NVAs are provided by third-party vendors. And there’s plenty to choose from.
(BYOL stands for Bring-Your-Own-License, while PAYG stands for Pay-As-You-Go)
These are some serious firewall and routing solutions. I wanted to understand the pricing, so I chose the Barracuda CloudGen Firewall for Azure (PAYG). The default VM size is Standard F1s (1 vCPU, 2 GB RAM), which is around 77 € (~$87) per month. When I factor in the Barracuda logic and license, this price goes up to about 354 € (~$400) per month. It certainly isn’t cheap – at least not for the purpose I would need it for.
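To put those numbers in perspective, the same kind of rough arithmetic helps. The sketch below uses only the approximate prices quoted in this post (77 € for the bare VM, 354 € in total, and the 13 €/VM/month mentioned earlier for Azure Security Center); the variable names are made up for the example.

```python
NVA_VM_MONTHLY_EUR = 77       # bare Standard F1s VM, as quoted above
NVA_TOTAL_MONTHLY_EUR = 354   # VM plus Barracuda PAYG licensing, as quoted above
JIT_MONTHLY_PER_VM_EUR = 13   # Azure Security Center Standard tier, quoted earlier

license_share_eur = NVA_TOTAL_MONTHLY_EUR - NVA_VM_MONTHLY_EUR
nva_annual_eur = NVA_TOTAL_MONTHLY_EUR * 12
jit_vms_for_same_budget = NVA_TOTAL_MONTHLY_EUR // JIT_MONTHLY_PER_VM_EUR

print(f"License share per month: {license_share_eur} EUR")
print(f"NVA per year: {nva_annual_eur} EUR")
print(f"VMs coverable by JIT for the same monthly budget: {jit_vms_for_same_budget}")
```

In other words, most of the monthly price is licensing, and a single appliance costs about as much as covering 27 VMs with Just-In-Time access.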
I fear introducing an NVA-based solution would add a lot of complexity and cost in this scenario. So, I discarded this option as well.
Use Azure AD Application Proxy to publish the RDP endpoint. I had been thinking about simply publishing the RDP endpoint with Azure AD Application Proxy. It’s a capability that is licensed through Azure AD Premium P1 (or P2) and it allows for intelligent and somewhat clean exposure of internal services. As a bonus, you can enforce Azure AD authentication and other Azure AD related capabilities on the fly.
Azure AD Application Proxy, however, does not support the publishing of a single port. There’s guidance on publishing the whole RDS infrastructure that I wasn’t planning on building. Usually, I see Azure AD Application Proxy is used to publish web-based solutions – thus I couldn’t rely on this approach.
Use the Windows Admin Center as a jump solution. I remember reading about the Windows Admin Center a year ago when Microsoft started marketing it as a reasonable alternative to Server Manager. Many IT admins I see at work frequently just use RDP, or PowerShell remoting for quick changes.
I hadn’t used Windows Admin Center in a production environment before this, so I gave it a proper try. Installing WAC is a breeze, a traditional Setup –> Next –> Next –> Finish. We like that.
In essence, after installing WAC it’s accessible through https://[ip] and requires a local account (or a local administrator account) to grant access.
The browser-based approach to managing remote virtual machines is elegant and very lightweight. I’m running WAC in a VM in Azure and a B2ms size (2 vCPU, 8 GB RAM) provides excellent performance.
Remote PowerShell works, as expected:
Remote Desktop also works, but it’s far from optimal. As it’s based on WebSockets, some advanced features are simply not there. For regular RDP needs it’s more than enough though and authentication against Azure AD is naturally supported also. I also find less need for RDP when the usual Windows admin tools are available as browser-based variants in WAC.
I realize now that hosting WAC in an Azure-based VM is equivalent to having a jump server – without the need to first RDP to the jump server.
I went even further and tried to publish my WAC endpoint through the Azure AD Application Proxy. This would allow me to provide additional security, such as enforcing MFA and requiring Conditional Access. I spent many brave hours on this but eventually had to give up. Although Azure AD Application Proxy nowadays supports WebSockets, it seems the support for redirecting WebSocket connections is not there. I kept hitting an HTTP 401 error because of this and had to remove Azure AD Application Proxy from the overall architecture.
While setting up WAC, I ran into a few small technical issues. One tricky one was the Response URI, which you must configure when you enable Azure AD-based authentication for WAC. It always forced me to use the hostname of the server, not the IP address or FQDN I was hoping to use. To resolve this, I configured Azure AD support by first accessing WAC remotely (and not from the localhost). Simple, but tricky to figure out at first.
Introducing the final solution
It’s great to have so many options for remotely accessing VMs in Azure! Having gone through most, if not all of them, I realize that depending on your requirements and needs, there is always a suitable option to choose.
The final solution ended up looking like this:
I’m using Conditional Access to enforce MFA for remote admins. These admins access WAC through a friendly URL, such as https://wac.company.com. They first authenticate with a local admin account, and then again with Azure AD credentials – which in turn triggers MFA for added security. This is by design for WAC, and while it was a bit cumbersome initially, it works out great.
NSG is there for extra security and simply allows HTTPS for now.
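The effect of that NSG can be sketched as a first-match rule evaluation: rules are checked in priority order (lowest number first), and the first one that matches decides. This is a simplified model and not an Azure SDK call. The rule name Allow-HTTPS-Inbound and its priority are hypothetical, only inbound traffic from the internet is considered, and the other default NSG rules for VNET and load balancer traffic are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    priority: int        # lower number is evaluated first
    port: Optional[int]  # None matches any destination port
    allow: bool
    name: str


RULES = [
    Rule(priority=100, port=443, allow=True, name="Allow-HTTPS-Inbound"),  # hypothetical rule
    Rule(priority=65500, port=None, allow=False, name="DenyAllInBound"),   # default deny rule
]


def evaluate(port: int, rules=RULES) -> Rule:
    """Return the first rule, in priority order, that matches the destination port."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.port is None or rule.port == port:
            return rule
    raise LookupError("no rule matched")


for port in (443, 3389):
    decision = evaluate(port)
    print(port, "allowed" if decision.allow else "denied", "by", decision.name)
```

HTTPS to WAC gets through, while a direct RDP attempt on 3389 falls through to the default deny, which is exactly the posture I wanted.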
Once WAC is accessible, additional authentication to VMs is done through the local domain (AD) accounts without additional MFA prompts.
For now, I chose not to include Azure Security Center JIT in this setup. Should I change to that, I could eliminate WAC altogether and have remote admins request direct RDP access through the Azure Portal. That is something I aim to build, but for now, WAC provides a very nice remote admin experience secured with Azure AD.