An adventure in containers and command-line tools: Running MongoDB in Azure

I’ve used a Synology NAS solution at home since 2012. It’s simply great – very affordable, runs everything I need and has consistently given me uptimes of 60-90 days (I reboot it mostly to patch things up). In 2015 my Synology model (a DS412+) received an upgrade enabling it to run something called Docker containers.

I’d read about Docker and containers by then, of course. It reminded me of 1996 or 1997, when VPN as a technology started to become a reality for early adopters. It was new, a bit weird and hard to understand at first. I’m proud to say I completed the F-Secure Data Fellows VPN+ certification in 1998 (I think), so at least I knew a bit about that area, too.

One of the first things I did once my Synology device received the update was to pull a Minecraft Docker image from Docker Hub and configure that for my kids. I couldn’t believe how easy, quick and practical it was to pull a pre-configured app from the cloud and just… run it! No worrying about OS versions, %TEMP% directory permissions, background services or TCP ports. It felt very modern at the time.

Now, fast forward to today: I’ve been racking my brain for a few days trying to figure out all things containers. I don’t mean to say anything in this realm is exceedingly complex – at least not at the level of functionality I need containers for. But there are so many technologies involved, and plenty of things that keep you sidetracked and busy along the way. That, however, is not the point of this blog post.

Business problem

Like all great stories, this hands-on journey begins with a business problem. I’m a huge fan of using IT and technology to solve business problems, as opposed to using technology to introduce more technology for the sake of it. I love technology, but one needs to harness it to provide tangible benefits.

I’m managing an Azure tenant that has a hard requirement of not introducing any more virtual machines – and generally avoiding IaaS-related services, if possible. As such, most services are based on PaaS offerings from Azure – Web Apps, Azure SQL and similar basic building blocks in the cloud.

One of the services we need to build has a requirement for a MongoDB database. As Microsoft’s own Cosmos DB exposes a Mongo API, I initially suggested we try using that. For numerous reasons (that are probably outside the scope of this post), Cosmos DB used through the Mongo API wasn’t a good fit. One reason was the unpredictable cost of request units; another was the insane queries we needed to run. I realize these are topics that warrant a few other blog posts in the future, but for now, suffice it to say Cosmos DB wasn’t the best fit given our time constraints (I plan to revisit this architecture soon and try to work Cosmos DB back into it).

Thus, the business problem is the need to run MongoDB without a virtual machine in Azure.

Evaluating options for running MongoDB in Azure

A friend of mine once said he used to be an Azure expert back when it was only 2 services. Today, Azure consists of somewhere between 165 and 180 services, and it’s impossible to be an expert in them all, down to the latest features.

As I knew a virtual machine wouldn’t be the solution, I first had a look at whether MongoDB is available as a native cloud offering from Azure. Looking through the Azure Marketplace, I found 4 MongoDB offerings.

Unfortunately, the category is Compute, which directly translates to IaaS – typically a pre-installed virtual machine.

Next, I searched whether MongoDB is available for Azure through any other means. It seems MongoDB worked together with Microsoft last year, and they now offer a limited version of MongoDB Atlas (see the announcement). I’m not that familiar with MongoDB, but after reading up a bit I learned that MongoDB Atlas is a cloud-hosted SaaS version of MongoDB.

Out of curiosity I provisioned a free trial of MongoDB Atlas. It does ask if you want to run the MongoDB cluster in Azure, but it never prompts for your Azure subscription. Technically it runs in Azure – just not in your Azure subscription. The MongoDB Atlas offering through Azure is therefore a SaaS solution, not the PaaS solution I was in need of.

I thought about building the rest of the solution within Azure, and simply having MongoDB Atlas available from a separate cloud or subscription. While this would obviously work, I had a few doubts about this approach. First, running a multi-cloud or multi-tenant deployment tends to introduce unnecessary complexity, and troubleshooting easily becomes nightmarish. Second, the solution must be in Azure, as all other business solutions are deployed within the same Azure subscription. It doesn’t make sense to offload a key component to a platform we have no control over.

Services for running containers in Azure

Now that it was evident that MongoDB was not available natively in Azure, and the other options (deploying a virtual machine with MongoDB, or subscribing to MongoDB Atlas) couldn’t be used, I turned to containers.

Azure has a great track record of supporting containers. As I am truly not an expert with containers – I simply work around the needs I have, quite successfully – I needed to get my ducks in a row on this one.

Searching for ‘container’ in the Azure Marketplace gives me a list of 19 different services!

I knew containers were a thing, but 19 services was more than I expected. Reasoning through this list, I was able to narrow the list of services down to 7:

  • Container Service, which is actually Azure Kubernetes Service (AKS) now. It’s a solution for deploying and managing containers at massive (or smaller) scale using the open-source Kubernetes platform
  • Container Registry, a private registry for your own (Docker) container images. Remember UDDI? A bit like that!
  • Web App for Containers, allowing you to run containers in a Web App
  • Container Instances (ACI), a high-level container execution platform without the overhead and cost of AKS
  • Azure Batch, a service for running large-scale parallel and high-performance computing batch jobs – including support for container applications
  • Windows Server or Ubuntu VM with container support – essentially using a VM as a Docker container platform or a Hyper-V container platform

I also found out that Service Fabric, a service for building and running distributed systems based on microservices, runs containers.

Azure Kubernetes Service

AKS seemed like an obvious fit for me. I spent a considerable amount of time reading through the documentation on AKS. The how-to guides include a respectable amount of content, from cluster operations to virtual nodes, data volumes, security and authentication, monitoring, developing and running applications, and DevOps. Finally, there are the Kubernetes basics, somewhere near the end. It was evident AKS introduces a steep learning curve, and I wasn’t sure I could afford to spend days getting a cluster up and running with my MongoDB container.

For now, I set AKS aside as it would not work for my needs – at least until the business problem demanded more capabilities. I just needed a simple container to run.

Web App for Containers with Azure CLI

Next, I spent an hour with Web App for Containers. There’s some proper guidance available here (although I’m not sure why it links to the es-es locale, the content is in English). It’s a tiny feature hidden in the Web App management blade of the Azure Portal. Behind the scenes it uses Azure Container Registry and Azure Storage.

To test Web App for Containers, I first provisioned a simple (intended to be free-tier) Web App. I chose to use Azure CLI, as I’ve used it the least of all of Microsoft’s command-line interfaces and could use a bit of practice.

To install Azure CLI on Windows 10, I downloaded the MSI package and then ran it in the traditional “next-next-finish” fashion.
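Once the installer finishes, a new command prompt should have the az command available – a quick version check confirms the setup:

az --version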

Next, from a Windows command prompt, I ran the following to log in to Azure:

az login

This opens a browser session for authentication. A successful authentication then returns a list of my subscriptions, of which I have 3:
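With multiple subscriptions available, it’s worth making sure the correct one is active before provisioning anything. If needed, a subscription can be selected by name or ID (the name below is just a placeholder):

az account set --subscription "My Subscription Name"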

Next, I’ll provision a resource group to hold my container tests. It will be named containertest-rg:

az group create --location westeurope --name containertest-rg

And then I’ll provision an app service plan in the containertest-rg resource group. Note that I need to specify the host OS to be Linux with the --is-linux parameter – and this forces me to choose the Standard 1 (S1) tier for the app plan instead of the free tier I was hoping for, as Linux is not available for Free or Shared app plans.

az appservice plan create --name containertestplan --resource-group containertest-rg --sku S1 --number-of-workers 1 --is-linux --location "westeurope"

And finally, I’ll provision the actual web app. As this is not a regular Web App but rather the containerized one, I need to specify the container image as well. I’m using the default Mongo Docker image, which coincidentally is called ‘mongo’:

az webapp create --name containertest31012019 --resource-group containertest-rg --plan containertestplan --deployment-container-image-name mongo

This takes a short while to complete. Opening the web app through Azure Portal I can verify the container settings under Container settings:
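The same container settings can also be verified from Azure CLI:

az webapp config container show --name containertest31012019 --resource-group containertest-rg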

I’m now faced with a new challenge. Although everything provisioned correctly, and I can see from the logs that the Docker image was successfully pulled and instantiated, I don’t have access to a MongoDB instance anywhere. Opening the web app in a browser reveals that MongoDB is running – somewhere:

I knew MongoDB wouldn’t have any of its HTTP interfaces enabled by default. Trying to access the site directly on MongoDB’s default port 27017 produces a network timeout. It turns out I need to configure a custom port (only one is supported) with a WEBSITES_PORT setting on the web app. After doing this and restarting the container I’m still getting the same results: port 27017 doesn’t answer, and I’m unable to do anything worthwhile with the MongoDB container.
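For reference, that setting can be applied with Azure CLI like this, using the names from my test above:

az webapp config appsettings set --resource-group containertest-rg --name containertest31012019 --settings WEBSITES_PORT=27017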

Considering how limited the configuration for running the container is, I decide to leave Web App for Containers for now and visit this service later when I perhaps know more about configuring MongoDB.

To clean up – as the web app with the container is continuously incurring charges for me – I simply delete the resource group containertest-rg with Azure CLI:

az group delete --name containertest-rg
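By default the command asks for confirmation; adding --yes skips the prompt, and --no-wait returns control immediately while the deletion runs in the background:

az group delete --name containertest-rg --yes --no-wait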

Container Instances with Azure CLI

I moved on to Azure Container Instances, as I felt this was the service most promising for my needs. Even though I couldn’t use virtual machines (to simply run Docker natively, or as Hyper-V containers), I knew by now that I had a loophole in case nothing else worked for me.

ACI looks rather simple – very similar to Web App for Containers but without the ‘web app’ part. Let’s start with Azure CLI again, and provision a new resource group:

az group create --name acitest-rg --location westeurope

This took a few tries, but I figured the container that would (hopefully) eventually run my MongoDB wouldn’t persist its data, so I need external storage. And what better to use than Azure Storage.

First, I need to provision a new storage account in the resource group I just created:

az storage account create --name acistorage040219 --resource-group acitest-rg --location "westeurope" --sku Standard_LRS

Now that the storage account is provisioned, I can query its access keys – handy when I next start provisioning containers in the storage account (not to be confused with the containers that actually run Docker images).

az storage account keys list --account-name acistorage040219 --resource-group acitest-rg
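This returns both keys as JSON. To grab just the value of key1, a JMESPath query does the trick:

az storage account keys list --account-name acistorage040219 --resource-group acitest-rg --query "[0].value" --output tsv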

I’ll use key1 from now on. Within the storage account I need a file share that my container instance (MongoDB) can access. The goal is to store all MongoDB databases and other data that must be non-volatile outside the container image – in a container in an Azure Storage account 🙂

az storage share create --name mongodata --account-name acistorage040219 --account-key "Dv14IH<snip>"

By now you probably realize that using variables within Azure CLI scripts is quite handy. Should you need to store the storage account name and access key in variables, use the following in a Windows command prompt:

SET AZ_STORAGE_ACCOUNT="acistorage040219"
SET AZ_STORAGE_ACCOUNT_KEY="insert-key-here"

You can now easily reference these with %AZ_STORAGE_ACCOUNT% and %AZ_STORAGE_ACCOUNT_KEY% in your Azure CLI commands.
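For example, the file share command above could then be rewritten as:

az storage share create --name mongodata --account-name %AZ_STORAGE_ACCOUNT% --account-key %AZ_STORAGE_ACCOUNT_KEY%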

All prerequisites are provisioned, and I can move on to provisioning the actual MongoDB container instance. I need to specify the container image (‘mongo‘), the port I need (27017, MongoDB’s default), and what type of computing resources I want to reserve for my instance. I’ll start with 2 GB of RAM and 2 vCPU cores. I also need to pass in the Azure Storage account details so that MongoDB will use the external storage as the default path for database files.

I needed to go back and forth here quite a bit, as I was also learning the intimate details of MongoDB configuration. The process is simple, but requires a careful approach:

  1. Provision the container instance with MongoDB without authentication enforced – making sure the databases are mounted outside the container image
  2. Connect to MongoDB as an anonymous user, provision the first account and grant it superuser privileges
  3. Re-provision the container instance with authentication enforced, disabling anonymous access

I’ll go through these three steps next.

To provision MongoDB within a container instance without authentication enforced, I used the --command-line argument to pass on --dbpath and --bind_ip_all:

az container create --resource-group acitest-rg --name acimongotest --image mongo --azure-file-volume-account-name acistorage040219 --azure-file-volume-account-key "Dv14I<snip>" --azure-file-volume-share-name mongodata --azure-file-volume-mount-path "/data/mongoaz" --ports 27017 --cpu 2 --ip-address public --memory 2 --os-type Linux --protocol TCP --command-line "mongod --dbpath=/data/mongoaz --bind_ip_all"

To verify everything goes well, I always query the logs:

az container logs --container-name acimongotest --name acimongotest --resource-group acitest-rg 

To view the container details, use the following:

az container show --name acimongotest --resource-group acitest-rg

This also nicely prints out the public IP address of the container – thus the IP address (and port) we need in order to connect with MongoDB.
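The full output is quite verbose; to pick out just the public IP address, the same JMESPath approach helps here too:

az container show --name acimongotest --resource-group acitest-rg --query ipAddress.ip --output tsv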

Now that I have step 1 completed, I need to connect to my MongoDB and provision the user (step 2).

I know very little about MongoDB, as it’s part of the new world that people working on the Microsoft platform typically aren’t heavily exposed to. I remembered that when I had to debug a MongoDB instance some years ago, I used Robo 3T, a handy (and free) graphical tool. It used to be called Robomongo at the time, I think.

To keep learning while doing, I chose to run Robo 3T from Ubuntu. With the Windows 10 1809 update, Hyper-V Quick Create received an update that includes a pre-defined template for Ubuntu 18.04.1 LTS – just what I need!

Provisioning Ubuntu on my local workstation was a breeze, and I was up and running very quickly. As Ubuntu now also supports enhanced session mode for Hyper-V, remote access works very similarly to Windows virtual machines.

From Robo 3T I only need to specify the remote host and choose not to use authentication.

Upon connecting, I run the following MongoDB commands to provision a mongoadmin user to use from now on:

use admin
db.createUser(
  {
    user: "mongoadmin",
    pwd: "Password1",
    roles: [
      { role: "userAdminAnyDatabase", db: "admin" },
      { role: "readWriteAnyDatabase", db: "admin" },
      { role: "dbAdminAnyDatabase", db: "admin" },
      { role: "clusterAdmin", db: "admin" }
    ],
    mechanisms: [ "SCRAM-SHA-1" ]
  }
)

All that is left now is the last step: re-provisioning the container instance with authentication enabled. I simply re-run the previous provisioning command and append the --auth parameter.

az container create --resource-group acitest-rg --name acimongotest --image mongo --azure-file-volume-account-name acistorage040219 --azure-file-volume-account-key "Dv14I<snip>" --azure-file-volume-share-name mongodata --azure-file-volume-mount-path "/data/mongoaz" --ports 27017 --cpu 2 --ip-address public --memory 2 --os-type Linux --protocol TCP --command-line "mongod --dbpath=/data/mongoaz --bind_ip_all --auth"

This seems to take a while longer, around 3 minutes to complete. The IP address typically doesn’t change, but I get to retain all my other settings. In Robo 3T, I should now be able to connect as an authenticated user:
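For reference, an authenticated connection from any MongoDB client uses a connection string along these lines, with the public IP address coming from az container show above:

mongodb://mongoadmin:Password1@&lt;public-ip&gt;:27017/?authSource=admin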

And that’s it! Getting MongoDB up and running in Azure Container Instances sure takes a bit of patience, but eventually it runs – and it’s also very fast.

Should you run into any issues, ACI provides a remote shell within the portal. It’s accessible through the Containers settings in the container blade.
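The same shell can also be opened straight from Azure CLI:

az container exec --resource-group acitest-rg --name acimongotest --exec-command "/bin/bash"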

I also learned that if I kill the mongod process, it kills the container :-). This makes sense, as mongod is the container’s main process – when it exits, so does the container.

To stop the container, simply issue the stop command:

az container stop --resource-group acitest-rg --name acimongotest
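And to bring it back up later:

az container start --resource-group acitest-rg --name acimongotest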

In summary

I started writing this post with a lot of interest in learning more about running containers in Azure – and learn I certainly did. I didn’t anticipate spending so much time tweaking and figuring out the command-line parameters for MongoDB. It’s understandable, though, as I also needed to learn a bit more about Azure CLI. And as Azure doesn’t care what I run within the container instance, it’s up to me to configure MongoDB as I see best.

ACI is quite simple to manage and use, which I quite like. Pricing is not necessarily cheaper than running the same workload in a virtual machine – but given the level of automation, flexibility and security, I feel it’s a fair trade-off for a number of scenarios.

For the actual cost of running ACI, see here. One hour (3,600 seconds) of running the MongoDB container instance is around 0.12 € (or $0.14). Should I choose to run the container non-stop for a month, that works out to roughly 0.12 € × 24 hours × ~30.5 days ≈ 88 € (~$105). Certainly not the cheapest option by any means, but still very competitive for what you get.

I’m still very keen on moving to Cosmos DB, as it provides a native Mongo API and much more as a managed service than ACI does. That’s probably a great topic for another post.

Thanks for reading!