Access Azure’s Blob storage with cURL on a VM/VMSS instance

After 1.5 years of working intensively with Amazon Web Services (AWS), my current project builds its infrastructure in Microsoft Azure. This forces me to learn some new things, concepts and notations. A task I run into often is accessing a single file (for example a Kubernetes configuration) on object storage. Azure’s object storage is called Azure Blob storage.

Blob storage is optimized for storing massive amounts of unstructured data. Unstructured data is data that does not adhere to a particular data model or definition, such as text or binary data.

Azure Blob storage is the equivalent of Amazon Simple Storage Service (Amazon S3).

Let me describe my infrastructure:

  • I have a (managed) Azure Kubernetes Service (AKS) cluster, built with HashiCorp’s Terraform
  • I can export the Kubernetes configuration directly with Terraform
  • I’m using GitLab CI to deploy applications to the Kubernetes cluster

Every pipeline that deploys an application to the cluster needs the Kubernetes configuration. It lets the pipeline talk to the cluster and finally deploy our application.

Where does the K8s configuration come from?

Let’s create a storage account and a container in which we can store our configuration.

resource "azurerm_storage_account" "azure-file" {
  name                     = "${var.project}${var.cluster_name}data"
  location                 = "${azurerm_resource_group.k8s.location}"
  resource_group_name      = "MC_aks_dev_westeurope"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags {
    Environment = "Development"
  }
}

resource "azurerm_storage_account" "k8s-config" {
  name                     = "${var.project}${var.cluster_name}k8sconfig"
  location                 = "${azurerm_resource_group.k8s.location}"
  resource_group_name      = "${azurerm_resource_group.k8s.name}"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags {
    Environment = "Development"
  }
}

resource "azurerm_storage_container" "k8s-config" {
  name                  = "k8s-config"
  resource_group_name   = "${azurerm_resource_group.k8s.name}"
  storage_account_name  = "${azurerm_storage_account.k8s-config.name}"
  container_access_type = "private"
}

After doing the whole Terraform stuff (terraform init, terraform apply), we can now find a storage container within the storage account in the Azure Portal.

This gives us a place to store our Kubernetes configuration.

Let’s assume we have some logic that uploads the Kubernetes configuration automatically to our newly created storage container. Uploading it manually is fine, too, if the configuration changes very rarely. An application deployment pipeline (and its underlying VM) now needs three things to be able to interact with Azure Blob storage:

  • a system assigned managed identity for the virtual machine / virtual machine scale set
  • access control (IAM) set on the storage account
  • a Bearer access token that authenticates us to Blob storage
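
The upload step assumed above is not shown in this post. As a rough sketch, it could be done with the Azure CLI, assuming `az` is installed and logged in with an identity that is allowed to write to the container. The function name `upload_kubeconfig` and its arguments are made up for illustration; the container name `k8s-config` and the blob name `config` match the ones used in this post.

```shell
#!/bin/sh
# Hypothetical helper (not part of the original setup): upload a kubeconfig
# file as blob "config" into the "k8s-config" container created by Terraform.
upload_kubeconfig() {
  account="$1"   # storage account name
  file="$2"      # local path of the Kubernetes configuration

  az storage blob upload \
    --account-name "$account" \
    --container-name k8s-config \
    --name config \
    --file "$file" \
    --auth-mode login \
    --overwrite true
}

# Usage (placeholder account name):
#   upload_kubeconfig mystorageaccount ~/.kube/config
```

`--auth-mode login` uses the Azure AD identity of the logged-in CLI user instead of a storage account key, so that user needs a suitable role on the storage account.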

System assigned managed identity

A system assigned managed identity enables Azure resources to authenticate to cloud services without storing credentials in code. Once enabled, all necessary permissions can be granted via Azure role-based access control (RBAC).

To enable this directly in the Azure Portal, go to your virtual machine (or virtual machine scale set) resource and choose Identity under Settings.

Terraform can do this, too, and in a more reproducible way:

resource "azurerm_virtual_machine" "instance" {
  ...

  identity {
    type = "SystemAssigned"
  }

  ...
}

Access control (IAM)

With system assigned managed identities enabled, we’re now able to request an access token with an HTTP REST call. Before we do so, we need to authorize our VM to access the storage container where the Kubernetes configuration lives. This is done through role-based access control (RBAC). Azure Storage defines a set of built-in RBAC roles that encompass common sets of permissions.

Go to your storage account resource and choose Access control (IAM). Click on Add role assignment and choose the appropriate values:

  • Role: Storage Blob Data Contributor (Preview)
    Use to grant read/write/delete permissions to Blob storage resources.
  • Assign access to: Virtual Machine
  • Subscription: choose your subscription
  • Select: Type the name of your VM resource. The form will auto-complete the whole name.

Save the role assignment with the blue button below.
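
The same role assignment can probably be scripted instead of clicked. The following is a minimal sketch using the Azure CLI, assuming `az` is installed and logged in with permission to assign roles, and that the VM and the storage account live in the same resource group. The function name `assign_blob_role` and its parameters are hypothetical. For a scale set, `az vmss show` would take the place of `az vm show`.

```shell
#!/bin/sh
# Hypothetical helper: grant the VM's system assigned identity the
# "Storage Blob Data Contributor" role on a storage account.
assign_blob_role() {
  rg="$1"        # resource group (assumed to hold both VM and storage account)
  vm="$2"        # name of the virtual machine
  account="$3"   # name of the storage account

  # Object ID of the VM's system assigned managed identity.
  principal_id=$(az vm show --resource-group "$rg" --name "$vm" \
    --query identity.principalId --output tsv) || return 1

  # Full resource ID of the storage account, used as the role scope.
  scope=$(az storage account show --resource-group "$rg" --name "$account" \
    --query id --output tsv) || return 1

  az role assignment create \
    --assignee-object-id "$principal_id" \
    --assignee-principal-type ServicePrincipal \
    --role "Storage Blob Data Contributor" \
    --scope "$scope"
}

# Usage (placeholder names):
#   assign_blob_role my-resource-group my-vm mystorageaccount
```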

Bearer access token

Before accessing the Kubernetes configuration on the storage container, we need an access token. All preconditions have been met now. Create the token directly on your VM with:

curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F' -H Metadata:true | jq -r '.access_token'
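
The one-liner above prints `null` or nothing when the request fails, for example if the identity has not been enabled yet. For use in scripts, a slightly more defensive variant might look like the sketch below. It assumes `curl` and `jq` are available; the function names `token_url` and `get_storage_token` are made up for illustration, while the endpoint, API version and resource are the ones from the command above.

```shell
#!/bin/sh
# Instance Metadata Service endpoint and (URL-encoded) resource,
# taken from the curl command above.
IMDS_URL='http://169.254.169.254/metadata/identity/oauth2/token'
RESOURCE='https%3A%2F%2Fstorage.azure.com%2F'

# Build the full token request URL.
token_url() {
  printf '%s?api-version=2018-02-01&resource=%s\n' "$IMDS_URL" "$RESOURCE"
}

# Hypothetical helper: print the access token, or fail with a message.
get_storage_token() {
  # -f makes curl return non-zero on HTTP errors instead of printing the body.
  response=$(curl -sf --max-time 10 -H Metadata:true "$(token_url)") || {
    echo "token request failed" >&2
    return 1
  }

  # "// empty" yields an empty string instead of "null" if the field is missing.
  token=$(printf '%s' "$response" | jq -r '.access_token // empty')
  if [ -z "$token" ]; then
    echo "response contained no access_token" >&2
    return 1
  fi
  printf '%s\n' "$token"
}

# Usage:
#   ACCESS_TOKEN=$(get_storage_token) || exit 1
```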

Wrap it up

As mentioned above, our final goal was to deploy an application with GitLab CI. This could be a final deploy step:

deploy:
  stage: deploy
  tags:
    - docker
  before_script:
    - apk add --no-cache curl jq
    - mkdir -p ~/.kube
    - export ACCESS_TOKEN=$(curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F' -H Metadata:true | jq -r '.access_token')
    - >-
      curl
      -s
      https://<name of your storage account>.blob.core.windows.net/k8s-config/config
      -H "x-ms-version: 2017-11-09"
      -H "Authorization: Bearer ${ACCESS_TOKEN}"
      > ~/.kube/config
  script:
    - deploy_to_kubernetes.sh k8s.yml

Et voilà.

curl -s https://<name of your storage account>.blob.core.windows.net/k8s-config/config -H "x-ms-version: 2017-11-09" -H "Authorization: Bearer ${ACCESS_TOKEN}"

does the final magic and prints the Kubernetes configuration.
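
One caveat with plain `curl -s` redirected into a file: if the request is rejected, the error response ends up in the target file. A possible way around this is sketched below, assuming a POSIX shell and `curl`. The function name `download_blob` and its parameters are hypothetical; the headers are the same as in the command above.

```shell
#!/bin/sh
# Hypothetical helper: download a blob to a file and only replace the
# target if the request actually succeeded.
download_blob() {
  account="$1"; container="$2"; blob="$3"; target="$4"; token="$5"
  tmp="${target}.tmp"

  # Make sure the target directory (e.g. ~/.kube) exists.
  mkdir -p "$(dirname "$target")"

  # -f lets curl exit non-zero on HTTP errors such as 403 or 404.
  if curl -sf \
      -H "x-ms-version: 2017-11-09" \
      -H "Authorization: Bearer ${token}" \
      -o "$tmp" \
      "https://${account}.blob.core.windows.net/${container}/${blob}"
  then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"
    echo "download of ${blob} failed" >&2
    return 1
  fi
}

# Usage (placeholder account name):
#   download_blob mystorageaccount k8s-config config ~/.kube/config "$ACCESS_TOKEN"
```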