Private Endpoints with Terraform

Warning: This is a fairly lengthy one – if you are just here for the code: jfarrell-examples/private-endpoint-terraform (github.com) – Cheers.

In my previous post I talked about the need for security to be a top consideration when building apps for Azure, or any cloud for that matter. In that post, I offered an explanation of the private endpoint feature Azure services can use to communicate within Virtual Network resources (vnet).

While this is important, I decided to take my goals one step further by leveraging Terraform to create this outside of the portal. Infrastructure as Code (IaC) is a crucial concept for teams that wish to ensure consistency and predictability across environments. While there exists more than one operating model for IaC, the concepts are the same:

  • Infrastructure configuration and definition should be a part of the codebase
  • Changes to infrastructure should ALWAYS be represented in scripts which are run continuously to mitigate drift
  • There should be a defined non-manual action which causes these scripts to be evaluated against reality

Terraform is a very popular tool to accomplish this, for a number of reasons:

  • Its HashiCorp Configuration Language (HCL) tends to be more readable than the formats used by ARM (Azure Resource Manager) or AWS CloudFormation
  • It supports multiple clouds, both in configuration and in practice. This means a single script could manage infrastructure across AWS, Azure, Google Cloud, and others.
  • It is free

However, it is also important to note Terraform’s weakness in being a third-party product. Neither Microsoft nor the other cloud vendors officially support this tool and, as such, its development tends to lag behind native tooling for specific platforms. This means certain bleeding-edge features may not be available in Terraform. Granted, one can mitigate this in Terraform by importing a native script (an ARM template, for example) into the Terraform file.

All in all, Terraform is a worthwhile tool to have at one’s disposal given the use cases it can support. I have yet to observe a situation in which there was something a client was relying on that Terraform did not support.

How Do Private Endpoints Work?

Understanding how Private Endpoints work in Azure is a crucial step to building them into our Infrastructure as Code solution. Here is a diagram:

In this example, I am using an Azure App Service (Standard S1 SKU) which allows me to integrate with a subnet within the vnet. Once my request leaves the App Service it arrives at a Private DNS Zone which is linked to the vnet (it is shown as part of the vnet, which is not strictly true as it is a global resource, but for the purposes of this article we can think of it as part of the vnet).

Within this DNS Zone we deploy an A Record with a specific name matching the resource we are targeting. This gets resolved to the private IP of a network interface (NIC) that effectively represents our service. For its part, the service is not actually in the vnet; rather, it is configured to only allow connections from the private endpoint. In effect, a tunnel is created to the service.

The result of this, as I said in the previous post, is that your traffic NEVER leaves the vnet. This is an improvement over the Service Endpoint offering, which only guarantees that traffic will never leave the Azure backbone. That is fine for most things, but Private Endpoints offer an added level of security for your workloads.

Having said all that, let’s walk through building this provisioning process in Terraform. For those who want to see code, this repo contains the Terraform script in its entirety. As this material is for educational purposes only, this code should not be considered production ready.

Below is the definition for App Service and the Swift connection which supports the integration.

Create a Virtual Network with 3 subnets

Our first step, as it usually is with any secure application, is to create a Virtual Network (vnet). In this case we will give it three subnets. I will take advantage of Terraform’s module concept to enable reuse of the definition logic. For the storage and support subnets we can use the first module shown below; for the apps subnet we can use the second, as its configuration is more complicated and I have not taken the time to unify the definitions.

# normal subnet with service endpoints
# create subnet
resource "azurerm_subnet" "this" {
  name                 = var.name
  resource_group_name  = var.rg_name
  virtual_network_name = var.vnet_name
  address_prefixes     = var.address_prefixes
  service_endpoints    = var.service_endpoints

  enforce_private_link_endpoint_network_policies = true
  enforce_private_link_service_network_policies  = false
}

# output variables
output "subnet_id" {
  value = azurerm_subnet.this.id
}

# delegated subnet, needed for integration with App Service
# create subnet
resource "azurerm_subnet" "this" {
  name                 = var.name
  resource_group_name  = var.rg_name
  virtual_network_name = var.vnet_name
  address_prefixes     = var.address_prefixes
  service_endpoints    = var.service_endpoints

  delegation {
    name = var.delegation_name

    service_delegation {
      name    = var.service_delegation
      actions = var.delegation_actions
    }
  }

  enforce_private_link_endpoint_network_policies = false
  enforce_private_link_service_network_policies  = false
}

# output variables
output "subnet_id" {
  value = azurerm_subnet.this.id
}

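Pay very close attention to the enforce properties. These are set in a very specific way to enable our use case. Somewhat counterintuitively, setting enforce_private_link_endpoint_network_policies to true disables the subnet’s network policies for private endpoints, which is what allows a Private Endpoint to be placed in that subnet. Do not worry though, if you make a mistake the error messages reported back from ARM are pretty helpful for making corrections.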
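The modules above reference several input variables that are not shown in this post. Below is a minimal sketch of what the variables file for the plain subnet module might look like. The variable names come from the var.* references above, but the types, descriptions, and the empty default for service_endpoints are my assumptions, not the contents of the repo.

```hcl
# Hypothetical variables.tf for the plain subnet module.
# Names mirror the var.* references in the module; everything else is assumed.
variable "name" {
  type        = string
  description = "Name of the subnet"
}

variable "rg_name" {
  type        = string
  description = "Resource group that holds the virtual network"
}

variable "vnet_name" {
  type        = string
  description = "Virtual network the subnet belongs to"
}

variable "address_prefixes" {
  type        = list(string)
  description = "CIDR ranges for the subnet, e.g. [\"10.1.2.0/24\"]"
}

variable "service_endpoints" {
  type        = list(string)
  description = "Service endpoints to enable on the subnet"
  default     = [] # assumed default so callers may omit it
}
```

Giving service_endpoints a default keeps the module usable by callers that do not need any service endpoints.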

Here is an example of calling these modules:

# apps subnet
module "apps_subnet" {
  source = "./modules/networking/delegated_subnet"

  rg_name            = azurerm_resource_group.rg.name
  vnet_name          = module.vnet.vnet_name
  name               = "apps"
  delegation_name    = "appservice-delegation"
  service_delegation = "Microsoft.Web/serverFarms"
  delegation_actions = ["Microsoft.Network/virtualNetworks/subnets/action"]
  address_prefixes   = ["10.1.1.0/24"]
}

# storage subnet
module "storage_subnet" {
  source = "./modules/networking/subnet"
  depends_on = [
    module.apps_subnet
  ]

  rg_name           = azurerm_resource_group.rg.name
  vnet_name         = module.vnet.vnet_name
  name              = "storage"
  address_prefixes  = ["10.1.2.0/24"]
  service_endpoints = ["Microsoft.Storage"]
}

One tip I will give you for building up infrastructure: while the Azure documentation is very helpful, I will often create the resource in the portal and choose the Export Template option. Generally, it’s pretty easy to map the ARM syntax to Terraform and glean the appropriate values – I know the above can seem a bit mysterious if you’ve never gone this deep.

Create the Storage Account

Up next we will want to create our storage account. This is due to the fact that our App Service will have a dependency on the storage account as it will hold the Storage Account Primary Connection string in its App Settings (this is not the most secure option, we will cover that another time).

I always advise the teams I work with to ensure a Storage Account is set to completely Deny public traffic – there are just too many reports of security breaches which start with a malicious user finding sensitive data on a publicly accessible storage container. Lock it down from the start.

resource "azurerm_storage_account" "this" {
  name                     = "storage${var.name}jx02"
  resource_group_name      = var.rg_name
  location                 = var.rg_location
  account_tier             = "Standard"
  account_kind             = "StorageV2"
  account_replication_type = "LRS"

  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]
  }
}

# outputs
output "account_id" {
  value = azurerm_storage_account.this.id
}

output "account_connection_string" {
  value = azurerm_storage_account.this.primary_connection_string
}

output "account_name" {
  value = azurerm_storage_account.this.name
}

One piece of advice, however: make sure you add an IP Rule so that your local machine can still communicate with the storage account as you update it – it does support CIDR notation. Additionally, the Terraform documentation notes a property virtual_network_subnet_ids in the network_rules block – you do NOT need this for what we are doing.
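As a rough sketch of that IP Rule, the network_rules block accepts an ip_rules list. The allowed_ips variable and its default (a documentation-only address range) are hypothetical and would need to be replaced with your own public IP or range; name, rg_name, and rg_location are the module’s existing inputs.

```hcl
# Hypothetical input: public IPs or CIDR ranges allowed through the firewall.
variable "allowed_ips" {
  type    = list(string)
  default = ["203.0.113.0/24"] # placeholder range, replace with your own
}

resource "azurerm_storage_account" "this" {
  name                     = "storage${var.name}jx02"
  resource_group_name      = var.rg_name
  location                 = var.rg_location
  account_tier             = "Standard"
  account_kind             = "StorageV2"
  account_replication_type = "LRS"

  network_rules {
    default_action = "Deny"
    bypass         = ["AzureServices"]

    # Lets the listed addresses reach the account despite the Deny default.
    # As far as I know, a single address should be given without a /32 suffix.
    ip_rules = var.allowed_ips
  }
}
```

Keeping the list in a variable makes it easy to leave it empty in environments where no outside access should be allowed.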

Now that this is created we can create the App Service.

Create the App Service

Our App Service needs to be integrated with our vnet (reference the diagram above) so as to allow communication with the Private DNS Zone we will create next. This is accomplished via a swift connection. Below is the definition used to create an Azure App Service which is integrated with a specific Virtual Network.

# create the app service plan
resource "azurerm_app_service_plan" "this" {
  name                = "plan-${var.name}"
  location            = var.rg_location
  resource_group_name = var.rg_name
  kind                = "Linux"
  reserved            = true

  sku {
    tier = "Standard"
    size = "S1"
  }
}

# create the app service
resource "azurerm_app_service" "this" {
  name                = "app-${var.name}ym05"
  resource_group_name = var.rg_name
  location            = var.rg_location
  app_service_plan_id = azurerm_app_service_plan.this.id

  site_config {
    dotnet_framework_version = "v5.0"
  }

  app_settings = {
    "StorageAccountConnectionString" = var.storage_account_connection_string
    "WEBSITE_DNS_SERVER"             = "168.63.129.16"
    "WEBSITE_VNET_ROUTE_ALL"         = "1"
    "WEBSITE_RUN_FROM_PACKAGE"       = "1"
    "EventGridEndpoint"              = var.eventgrid_endpoint
    "EventGridAccessKey"             = var.eventgrid_access_key
  }
}

# create the vnet integration
resource "azurerm_app_service_virtual_network_swift_connection" "swiftConnection" {
  app_service_id = azurerm_app_service.this.id
  subnet_id      = var.subnet_id
}

Critical here is the inclusion of two app settings shown in the Terraform:

  • WEBSITE_DNS_SERVER set to 168.63.129.16
  • WEBSITE_VNET_ROUTE_ALL set to 1

Reference: Integrate app with Azure Virtual Network – Azure App Service | Microsoft Docs

This information is rather buried in the above link and it took me some effort to find it. Each setting has a distinct purpose. WEBSITE_DNS_SERVER indicates where outgoing requests should look for name resolution. You MUST have this value to target the Private DNS Zone linked to the vnet. The WEBSITE_VNET_ROUTE_ALL setting tells the App Service to send ALL outbound calls to the vnet (this may not be practical depending on your use case).

For those eagle-eyed readers, you can see settings for an Event Grid here. In fact, the code shows how to integrate Private Endpoints with Azure Event Grid; the technique is similar. We won’t cover it in full as part of this post, but it’s worth understanding.

Create the Private DNS Rule

Ok, this is where things start to get tricky, mainly due to certain rules you MUST follow to ensure the connection is made successfully. What is effectively going to happen is that our DNS Zone name is PART of the target hostname we need to match. The match will then resolve to the private IP of our NIC (part of the private endpoint connection).

Here is the definition for the storage DNS Zone. The name of the zone is crucial, as such I have included how the module is called as well.

# create dns zone resource
resource "azurerm_private_dns_zone" "this" {
  name                = var.name
  resource_group_name = var.rg_name
}

# create link to vnet
resource "azurerm_private_dns_zone_virtual_network_link" "this" {
  name                  = "vnet-link"
  resource_group_name   = var.rg_name
  private_dns_zone_name = azurerm_private_dns_zone.this.name
  virtual_network_id    = var.vnet_id
}

# define outputs
output "zone_id" {
  value = azurerm_private_dns_zone.this.id
}

output "zone_name" {
  value = azurerm_private_dns_zone.this.name
}

# how it is called from the main Terraform file
module "private_dns" {
  source = "./modules/networking/dns/private_zone"
  depends_on = [
    module.vnet
  ]

  name    = "privatelink.blob.core.windows.net"
  rg_name = azurerm_resource_group.rg.name
  vnet_id = "/subscriptions/${data.azurerm_subscription.current.subscription_id}/resourceGroups/${azurerm_resource_group.rg.name}/providers/Microsoft.Network/virtualNetworks/${module.vnet.vnet_name}"
}

Ok, there is quite a bit to unpack here; let’s start with the name. The name here is mandatory. If your Private Endpoint will target the Blob service of a Storage Account, the name of the DNS Zone MUST be privatelink.blob.core.windows.net. Eagle-eyed readers will recognize the blob.core.windows.net portion as the standard endpoint for the Blob service within a Storage account.

This rule holds true with ANY other service that integrates with Private Endpoint. The full list can be found here: Azure Private Endpoint DNS configuration | Microsoft Docs

A second thing to note in the call is the structure of the value passed to the vnet_id parameter. For reasons unknown, Terraform did NOT resolve this based on context, so I ended up having to build it myself. You can see the usage of the data “azurerm_subscription” block in the source code. All it does is give me a reference to the current subscription so I can get the ID for the resource Id string.
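For reference, here is a minimal sketch of the pieces involved. The data block is the subscription lookup mentioned above. The vnet module itself is not shown in this post, so the resource and variable names under it are hypothetical. If that module exposed the vnet ID as an output, passing module.vnet.vnet_id might be an alternative to assembling the string by hand, though I have not verified that against this setup.

```hcl
# Root module: reference to the subscription the provider is authenticated against.
data "azurerm_subscription" "current" {}

# --- Hypothetical contents of the vnet module (a separate file in practice) ---
variable "name" {
  type = string
}

variable "rg_name" {
  type = string
}

variable "rg_location" {
  type = string
}

variable "address_space" {
  type = list(string) # e.g. ["10.1.0.0/16"], assumed from the subnet ranges used here
}

resource "azurerm_virtual_network" "this" {
  name                = var.name
  resource_group_name = var.rg_name
  location            = var.rg_location
  address_space       = var.address_space
}

output "vnet_name" {
  value = azurerm_virtual_network.this.name
}

# Exposing the ID would let callers use module.vnet.vnet_id directly.
output "vnet_id" {
  value = azurerm_virtual_network.this.id
}
```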

Finally, notice that, following the creation of the Private DNS Zone, we are linking our vnet to it via the azurerm_private_dns_zone_virtual_network_link resource. Effectively, this informs the vnet that it can use this DNS Zone when resolving names for calls made from within the network – this ties back to the flags we set on the Azure App Service.

Now we can create the Private Endpoint resource proper.

Create the Private Endpoint

First, I initially thought that you had to create one Private Endpoint per need; however, later reading suggests that might not be the case. I have not had time to test this so, for this section, I will assume it is one per need.

When you create a private endpoint the resource will get added to your resource group. However, it will also prompt the creation of a Network Interface resource. As I have stated, this interface is effectively your tunnel to the resource connected through the Private Endpoint. This interface will get assigned an IP consistent with the CIDR range of the subnet specified to the private endpoint. We will need this to finish configuring routing within the DNS Zone.

Here is the creation block for the Private Endpoint:

# create the resource
resource "azurerm_private_endpoint" "this" {
  name                = "pe-${var.name}"
  resource_group_name = var.rg_name
  location            = var.rg_location
  subnet_id           = var.subnet_id

  private_service_connection {
    name                           = "${var.name}-privateserviceconnection"
    private_connection_resource_id = var.resource_id
    is_manual_connection           = false
    subresource_names              = var.subresource_names
  }
}

# outputs
output "private_ip" {
  value = azurerm_private_endpoint.this.private_service_connection[0].private_ip_address
}

I am theorizing you can specify multiple private_service_connection blocks, thus allowing the private endpoint resource to be shared. However, I feel this might make resolution of the private IP harder. More research is needed.

The private_service_connection block is critical here as it specifies which resource we are targeting (private_connection_resource_id) and which service(s) (group(s)) within that resource we specifically want access to (subresource_names). In this example we are targeting our Storage Account and want access to the blob service – here is the call from the main file:

# create private endpoint
module "private_storage" {
  source = "./modules/networking/private_endpoint"
  depends_on = [
    module.storage_subnet,
    module.storage_account
  ]

  name              = "private-storage"
  rg_name           = azurerm_resource_group.rg.name
  rg_location       = azurerm_resource_group.rg.location
  subnet_id         = module.storage_subnet.subnet_id
  resource_id       = module.storage_account.account_id
  subresource_names = ["blob"]
}

The key here is the output variable private_ip which we will use to configure the A record next. Without this value, requests from our App Service being routed through the DNS Zone will not be able to determine a destination.

Create the A Record

The final bit here is the creation of an A Record in the DNS Zone to give a destination IP for incoming requests. Here is the creation block (first part) and how it is called from the main Terraform file (second part).

# create the resources
resource "azurerm_private_dns_a_record" "this" {
  name                = var.name
  zone_name           = var.zone_name
  resource_group_name = var.rg_name
  ttl                 = 300
  records             = var.ip_records
}

# calling from main terraform file
module "private_storage_dns_a_record" {
  source = "./modules/networking/dns/a_record"
  depends_on = [
    module.private_dns,
    module.private_storage
  ]

  name       = module.storage_account.account_name
  rg_name    = azurerm_resource_group.rg.name
  zone_name  = module.private_dns.zone_name
  ip_records = [module.private_storage.private_ip]
}

It is that simple. The A Record is added to the DNS Zone and it’s done. But LOOK OUT: we are back to the naming aspect again. The name here MUST be the name of your service, or at least the unique portion of the URL when referencing the service. I will explain in the next section.

Understanding Routing

This is less obvious with the storage account than it is with Event Grid or other services. Consider what your typical storage account endpoint looks like:

mystorageaccount.blob.core.windows.net

Now here is the name of the attached Private DNS Zone: privatelink.blob.core.windows.net

Pretty similar, right? Now look at the name of the A Record – it will be the name of your storage account. Effectively, the name that gets matched here is mystorageaccount.privatelink.blob.core.windows.net. Yet the code we deploy can still call mystorageaccount.blob.core.windows.net and work fine. Why? The answer is here: Use private endpoints – Azure Storage | Microsoft Docs

Effectively, the typical endpoint gets translated to the private one above which then gets matched by the Private DNS Zone. The way I further understand it is, if you were calling this from a peered Virtual Network (on-premise or in Azure) you would NEED to use the privatelink endpoint.
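To make the name composition concrete, here is a small illustrative sketch using Terraform locals. The account name is a placeholder and none of these locals exist in the repo; the point is only to show that the A Record name plus the zone name yields the privatelink hostname, while application code keeps using the public one.

```hcl
locals {
  # Placeholder account name, for illustration only.
  storage_account_name = "mystorageaccount"

  # Zone name required for the Blob service.
  blob_zone_name = "privatelink.blob.core.windows.net"

  # A Record name + zone name = the name the Private DNS Zone answers for.
  private_blob_fqdn = "${local.storage_account_name}.${local.blob_zone_name}"

  # The name application code keeps calling.
  public_blob_fqdn = "${local.storage_account_name}.blob.core.windows.net"
}

output "private_blob_fqdn" {
  # Evaluates to mystorageaccount.privatelink.blob.core.windows.net
  value = local.private_blob_fqdn
}
```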

Where this got hairy for me was with Event Grid because of the values returned relative to the values I needed. Consider the following Event Grid definition:

resource "azurerm_eventgrid_topic" "this" {
  name                          = "eg-topic-${var.name}jx01"
  resource_group_name           = var.rg_name
  location                      = var.rg_location
  input_schema                  = "EventGridSchema"
  public_network_access_enabled = false
}

output "eventgrid_topic_id" {
  value = azurerm_eventgrid_topic.this.id
}

output "event_grid_topic_name" {
  value = azurerm_eventgrid_topic.this.name
}

output "eventgrid_topic_endpoint" {
  value = azurerm_eventgrid_topic.this.endpoint
}

output "eventgrid_topic_access_key" {
  value = azurerm_eventgrid_topic.this.primary_access_key
}

The value of the output variable event_grid_topic_name is simply the name of the Event Grid instance, as expected. However, if you inspect the value of the endpoint you will see that it incorporates the region into the URL. For example:

https://someeventgridtopic.eastus-1.eventgrid.azure.net/api/events

Given the REQUIRED name of a DNS Zone for the Event Grid Private Endpoint is privatelink.eventgrid.azure.net, my matched URL would be someeventgridtopic.privatelink.eventgrid.azure.net, which won’t work – I need the name of the A Record to be someeventgridtopic.eastus-1 but this value was not readily available. Here is how I got it:

module "private_eventgrid_dns_a_record" {
  source = "./modules/networking/dns/a_record"
  depends_on = [
    module.private_dns,
    module.private_storage,
    module.eventgrid_topic
  ]

  name       = "${module.eventgrid_topic.event_grid_topic_name}.${split(".", module.eventgrid_topic.eventgrid_topic_endpoint)[1]}"
  rg_name    = azurerm_resource_group.rg.name
  zone_name  = module.private_eventgrid_dns.zone_name
  ip_records = [module.private_eventgrid.private_ip]
}

It is a bit messy but, the implementation here is not important. I hope this shows how the construction of the Private Link address through the DNS zone is what allows this to work and emphasizes how important the naming of the DNS Zone and A Record are.
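For anyone puzzling over the split call, here is the same expression broken into steps with literal example values. The topic name and region below are placeholders, and the locals are illustrative only.

```hcl
locals {
  # Placeholder values, for illustration only.
  example_topic_name = "someeventgridtopic"
  example_endpoint   = "https://someeventgridtopic.eastus-1.eventgrid.azure.net/api/events"

  # split(".", ...) breaks the endpoint into:
  #   ["https://someeventgridtopic", "eastus-1", "eventgrid", "azure", "net/api/events"]
  # so index 1 is the region segment.
  example_region_segment = split(".", local.example_endpoint)[1]

  # A Record name: "<topic name>.<region segment>"
  example_a_record_name = "${local.example_topic_name}.${local.example_region_segment}"
}

output "example_a_record_name" {
  # Evaluates to someeventgridtopic.eastus-1
  value = local.example_a_record_name
}
```

This relies on the endpoint keeping its current shape, with the region as the second dot-separated segment, so treat it as a convenience rather than a guarantee.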

In Conclusion

I hope this article has shown the power of Private Endpoint and what it can do for the security of your application. Security is often overlooked, especially with the cloud. This is unfortunate. As more and more organizations move their workloads to the cloud, they will have an expectation for security. Developers must embrace these understandings and work to ensure what we create in the Cloud (or anywhere) is secure by default.

Deploying Containerized Azure Functions with Terraform

I am a huge fan of serverless and its ability to create simpler deployments with less for me to worry about, while still giving me the reliability and scalability I need. I am also a fan of containers and IaC (Infrastructure as Code), so the ability to combine all three is extremely attractive from a technical, operational, and cost optimization standpoint.

In this post, I will go through a recent challenge that I completed where I used HashiCorp Terraform to set up an Azure Function app where the backing code is hosted by a Docker Container. I feel this is a much better way to handle serverless deployments than the referenced Zip file I have used in the past.

You need to be Premium

One of the things you first encounter when seeking out this approach is that Microsoft will only allow Function Apps to use Custom Docker Images if they use a Premium or Dedicated App Service Plan (https://docs.microsoft.com/en-us/azure/azure-functions/functions-create-function-linux-custom-image?tabs=nodejs#create-an-app-from-the-image) on Linux.

For this tutorial I will use the basic Premium Plan (SKU P1V2). A quick reminder: your Function App AND its App Service Plan MUST be in the same Azure region. I ran into a problem trying to work with Elastic Premium which, as of this writing, is only available (in the US) in the East and West regions.

Terraform to create the App Service Plan (Premium P1V2)

resource "azurerm_app_service_plan" "plan" {
  name                = "${var.app_name}-premiumPlan"
  resource_group_name = data.azurerm_resource_group.rg.name
  location            = data.azurerm_resource_group.rg.location
  kind                = "Linux"
  reserved            = true

  sku {
    tier = "Premium"
    size = "P1V2"
  }
}


Pretty straightforward. As far as I am aware, container-based hosting is ONLY available on Linux plans. Windows Container support is no doubt coming, but I have no idea when it will be available, if ever.

Create the Dockerfile

No surprise that the Docker image has to have a certain internal structure for the function app to be able to use it. Here is the generic Dockerfile you can get using the func helper via the Azure Functions Core Tools (npm).

func init MyFunctionProj --docker

This will start a new Azure Function App project that targets Docker. You can use this as a starting point or just to get the Dockerfile. Below are the contents of that Dockerfile:

FROM microsoft/dotnet:2.2-sdk AS installer-env

COPY . /src/dotnet-function-app
RUN cd /src/dotnet-function-app && \
    mkdir -p /home/site/wwwroot && \
    dotnet publish *.csproj --output /home/site/wwwroot

# To enable ssh & remote debugging on app service change the base image to the one below
# FROM mcr.microsoft.com/azure-functions/dotnet:2.0-appservice
FROM mcr.microsoft.com/azure-functions/dotnet:2.0
ENV AzureWebJobsScriptRoot=/home/site/wwwroot \
    AzureFunctionsJobHost__Logging__Console__IsEnabled=true

COPY --from=installer-env ["/home/site/wwwroot", "/home/site/wwwroot"]


At the very least it’s a good starting point. I assume that when Azure runs the image it looks to mount known directories, hence the need to conform as this Dockerfile does.

Push the Image

As with any strategy that involves Docker containers, we need to push the source image to a spot where it can be accessed by other services. I won’t go into how to do that, but I will assume you are hosting the image in Azure Container Registry.

Deploy the Function App

Back to our Terraform script, we need to deploy our Function App – here is the script to do this:

resource "azurerm_function_app" "funcApp" {
  name                      = "userapi-${var.app_name}fa-${var.env_name}"
  location                  = data.azurerm_resource_group.rg.location
  resource_group_name       = data.azurerm_resource_group.rg.name
  app_service_plan_id       = azurerm_app_service_plan.plan.id
  storage_connection_string = azurerm_storage_account.storage.primary_connection_string
  version                   = "~2"

  # https_only is an argument of the resource itself, not an app setting
  https_only = true

  app_settings = {
    FUNCTION_APP_EDIT_MODE              = "readOnly"
    DOCKER_REGISTRY_SERVER_URL          = data.azurerm_container_registry.registry.login_server
    DOCKER_REGISTRY_SERVER_USERNAME     = data.azurerm_container_registry.registry.admin_username
    DOCKER_REGISTRY_SERVER_PASSWORD     = data.azurerm_container_registry.registry.admin_password
    WEBSITES_ENABLE_APP_SERVICE_STORAGE = false
  }

  site_config {
    always_on        = true
    linux_fx_version = "DOCKER|${data.azurerm_container_registry.registry.login_server}/${var.image_name}:${var.tag}"
  }
}


The MOST critical AppSetting here is WEBSITES_ENABLE_APP_SERVICE_STORAGE and its value MUST be false. This indicates to Azure to NOT look in storage for metadata (as is normal). The other all-caps AppSettings provide access to the Azure Container Registry – I assume these will change if you use something like Docker Hub to host the container image.

Note also the linux_fx_version setting. If you have visited my blog before you will have seen this when deploying Azure App Service instances (not surprising since a Function App is an App Service under the hood).
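The Function App definition above references a resource group, a container registry, and a storage account that are not shown in this post. Here is a minimal sketch of what those supporting blocks might look like. The block labels (rg, registry, storage) match the references above, but the variable names and the storage settings are my assumptions.

```hcl
# Hypothetical inputs; the names are placeholders.
variable "rg_name" {
  type = string
}

variable "registry_name" {
  type = string
}

variable "storage_account_name" {
  type = string # 3-24 lowercase letters and digits, globally unique
}

# Existing resource group the Function App is deployed into.
data "azurerm_resource_group" "rg" {
  name = var.rg_name
}

# Existing Azure Container Registry holding the image.
# admin_username / admin_password are only populated when the
# admin user is enabled on the registry.
data "azurerm_container_registry" "registry" {
  name                = var.registry_name
  resource_group_name = data.azurerm_resource_group.rg.name
}

# Storage account backing the Function App (settings assumed).
resource "azurerm_storage_account" "storage" {
  name                     = var.storage_account_name
  resource_group_name      = data.azurerm_resource_group.rg.name
  location                 = data.azurerm_resource_group.rg.location
  account_tier             = "Standard"
  account_replication_type = "LRS"
}
```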

Troubleshooting Tips

By far the best way I found to troubleshoot this process was to access the Kudu options from Platform Features for the Azure Function App. Once in, you can access the Docker Container logs (have to click a couple links) and it gives you the Docker output. You can use this to figure out why an image may not be starting.

This was what led me to ultimately discover the WEBSITES_ENABLE_APP_SERVICE_STORAGE setting (above) as the reason why, despite the container starting, I never saw my functions in the navigation.

Hope this helps people out. I think this is a very solid way to deploy Azure Functions moving forward, though I do wish a Premium plan was not required.