Creating Custom OPA Policies with Azure Policy

Governance is a very important component of a strong security and operational posture. Using Azure Policy, organizations can define rules around how the various resources in Azure may be used. These cover a wide range of granularity from the general “resources may only be deployed to these regions” to “VMs must be of certain SKUs and certain resource tiers are prohibited” to finally, “network security groups must contain a rule allow access over port 22”. Policies are essential to giving operators and security professionals peace of mind that rules and standards are being enforced.

As organizations have moved to adopt Cloud Native technologies like Kubernetes, the governance question continues to come up as a point of concern. Kubernetes resources are, after all, outside the true scope of the Cloud Provider, Azure included. Thus, in many cases, teams rely on each other to avoid any pitfalls, in lieu of proper governance.

Open Policy Agent (https://openpolicyagent.org) is a Graduated CNCF (https://cncf.io) project aimed at providing an agnostic platform for enforcing policy rules across a variety platform, one of which is Kubernetes (a fully intro on this topic is not covered here). Open Policy Agent hooks into the Kubernetes Admission Controller and works to prevent the creation of resources that violate defined rules. To aid in applying this at scale, Microsoft created the Azure Policy for Kubernetes (https://docs.microsoft.com/en-us/azure/governance/policy/concepts/policy-for-kubernetes) feature to allow the provisioning of OPA policies (written in Rego) to be created in an assigned AKS or Arc connected cluster.

The feature is enabled via an add-on to the target AKS cluster. Azure Policy authors have already written a large amount of these policies that you can use for free with an Azure subscription. Support for custom OPA policies is, at this time, still in Preview mode but, it is stable enough that walking through it should prove beneficial.

Enable the Add-on

Support for OPA type policies in AKS is done through an add-on, which must be enabled to support proliferation of the policies. Instructions for enabling this add-on can be done using the following command:

az aks enable-addons --addons azure-policy --resource-group $RG_NAME --name $CLUSTER_NAME

As with ANY addon, you should run a list first to see if its already installed. Be prepared for the enablement process to take a non-trivial amount of time.

Create your Rego Policy

Whether you use OPA in the traditional sense or with Azure Policy you start with creating a ConstraintTemplate. This template is what enables the creation of the custom kind (the name of a specific resource type in Kubernetes) that will enable low level assignment of your policy. Below is a simple ConstraintTemplate which restricts a certain for a target namespace:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
name: k8srestricttype
spec:
crd:
spec:
names:
kind: K8sRestrictType
validation:
openAPIV3Schema:
properties:
namespace:
type: string
targets:
target: admission.k8s.gatekeeper.sh
rego: |
package k8srestrictype
violation[{ "msg": msg }] {
object := input.review.object
object.metadata.namespace == input.parameters.namespace
msg := sprintf("%v is not allowed for creation in namespace %v", [input.review.kind.kind, input.parameters.namespace])
}
view raw template.yaml hosted with ❤ by GitHub

This code is not something I would use in a production setting, but it gets the point across without being overly complicated. The policy registers a violation if within a namespace as indicated by an input parameter namespace, if creation of a certain type of resource is attempted.

VERY IMPORTANT!! Note the use of the openAPIV3Schema block to indicate the supported parameters and their type. Including this is vital, otherwise the addon will not generate the relevant constraint with the parameter value provided.

Once you have created the ConstraintTemplate you should store it a place which is accessible, I recommend, for ease of use, Azure Blob Storage with a public container.

Create the Azure Policy

The Azure Policy team has a wonderful extension for VSCode that can aid in generating a strong starting point for a new Azure Policy: https://docs.microsoft.com/en-us/azure/governance/policy/how-to/extension-for-vscode. The below is the Policy JSON I used for the above policy:

{
"properties": {
"policyType": "Custom",
"mode": "Microsoft.Kubernetes.Data",
"policyRule": {
"if": {
"field": "type",
"in": [
"Microsoft.ContainerService/managedClusters"
]
},
"then": {
"effect": "[parameters('effect')]",
"details": {
"templateInfo": {
"sourceType": "PublicURL",
"url": "https://stopapolicyjx01.blob.core.windows.net/templates/template.yaml"
},
"apiGroups": [
""
],
"kinds": [
"[parameters('kind')]"
],
"namespaces": "[parameters('namespaces')]",
"excludedNamespaces": "[parameters('excludedNamespaces')]",
"labelSelector": "[parameters('labelSelector')]",
"values": {
"namespace": "[parameters('namespace')]"
}
}
}
},
"parameters": {
"kind": {
"type": "String",
"metadata": {
"displayName": "The Kind of Resource the policy is restricting",
"description": "This is a Restriction"
},
"allowedValues": [
"Pod",
"Deployment",
"Service"
]
},
"namespace": {
"type": "String",
"metadata": {
"displayName": "The namespace the restriction will be applied to",
"description": ""
}
},
"effect": {
"type": "String",
"metadata": {
"displayName": "Effect",
"description": "'audit' allows a non-compliant resource to be created or updated, but flags it as non-compliant. 'deny' blocks the non-compliant resource creation or update. 'disabled' turns off the policy."
},
"allowedValues": [
"audit",
"deny",
"disabled"
],
"defaultValue": "audit"
},
"excludedNamespaces": {
"type": "Array",
"metadata": {
"displayName": "Namespace exclusions",
"description": "List of Kubernetes namespaces to exclude from policy evaluation."
},
"defaultValue": [
"kube-system",
"gatekeeper-system",
"azure-arc"
]
},
"namespaces": {
"type": "Array",
"metadata": {
"displayName": "Namespace inclusions",
"description": "List of Kubernetes namespaces to only include in policy evaluation. An empty list means the policy is applied to all resources in all namespaces."
},
"defaultValue": []
},
"labelSelector": {
"type": "Object",
"metadata": {
"displayName": "Kubernetes label selector",
"description": "Label query to select Kubernetes resources for policy evaluation. An empty label selector matches all Kubernetes resources."
},
"defaultValue": {},
"schema": {
"description": "A label selector is a label query over a set of resources. The result of matchLabels and matchExpressions are ANDed. An empty label selector matches all resources.",
"type": "object",
"properties": {
"matchLabels": {
"description": "matchLabels is a map of {key,value} pairs.",
"type": "object",
"additionalProperties": {
"type": "string"
},
"minProperties": 1
},
"matchExpressions": {
"description": "matchExpressions is a list of values, a key, and an operator.",
"type": "array",
"items": {
"type": "object",
"properties": {
"key": {
"description": "key is the label key that the selector applies to.",
"type": "string"
},
"operator": {
"description": "operator represents a key's relationship to a set of values.",
"type": "string",
"enum": [
"In",
"NotIn",
"Exists",
"DoesNotExist"
]
},
"values": {
"description": "values is an array of string values. If the operator is In or NotIn, the values array must be non-empty. If the operator is Exists or DoesNotExist, the values array must be empty.",
"type": "array",
"items": {
"type": "string"
}
}
},
"required": [
"key",
"operator"
],
"additionalProperties": false
},
"minItems": 1
}
},
"additionalProperties": false
}
}
}
}
}
view raw policy.json hosted with ❤ by GitHub

Now, that is a rather long definition so, lets hit on the key points, focusing on the then block:

  • OPA supports many standard parameters, and these are listed in the policy JSON under parameters. Important to understand is that OPA ALWAYS expects these, so we do not need to do anything extra. You may also omit them, and the system does not seem to care.
  • Custom parameters should be placed under the values key. Parameters specified here MUST have a corresponding openAPIV3Spec definition.
  • kind and apiGroups relate directly to concepts in Kubernetes and help declare for what resources and actions against those resources the rule applies.
  • Note the templateInfo key value under then. This indicates where Azure Policy can find the OPA constraint template – that which we created earlier. There are a few different sourceType values available. For this example, I am referencing a URL in the storage account where the template was uploaded
  • Note the mode value (Microsoft.Kubernetes.Data). This value must be used so that the Azure Policy runners know this policy contains an OPA

In the case of our example, we are piggybacking off whatever kind the policy applies to and then indicating a specific namespace to make this restriction. We could achieve the same with the namespaces standard parameter but, in this case, I am using a singular namespace property to demonstrate passing properties.

Once this is created, create the Policy definition in Azure Policy using either the Portal or the Azure CLI, instructions here: https://docs.microsoft.com/en-us/azure/governance/policy/tutorials/create-custom-policy-definition#create-a-resource-in-the-portal

I also recommend giving the new definition a very clear name indicating it is custom, I tend to recommend this regardless of whether or not it’s for Kubernetes or uses OPA, just helps to more easily find them in the portal or when searching with the CLI.

Make the Assignment

As with any policy in Azure Policy, you must assign it to a specific scope for it to take effect. Policy assignments in Azure are hierarchical in nature thus, making the assignment at a higher level affects the lower levels. In MCS, we typically recommend assigning policies to Management Groups in order to create easier maintenance – but you may assign all the way down to Resource Groups.

More information on making assignments: https://learn.microsoft.com/en-us/azure/governance/policy/assign-policy-portal

Once the assignment is made you will have to wait for the AKS addon to pick it up and create the new type and constraint definition on your behalf – my experience has had it take around 15 minutes. At the time of this writing, there is no way to speed it up – even using a trigger-scan command from the AZ CLI does not work.

One critical detail while doing the assignment, ensure that the value provided to the effect parameter is in lower case and either audit (which will map to dryrun in OPA) or deny. In some of the built-in templates Audit and Deny are offered as options. In my experience, the addon gets confused by this.

Validate the assignment

After a period of time run a kubectl get <your CRD type name> and it will eventually return a generated resource of that type representing the policy assignment. Once you see the result there, you can attempt to create an invalid resource in your cluster to validate enforcement.

Something to keep in mind, is the CRD Constraints will NOT report on deny actions, only dryrun (at this time warn is not supported, nor do I see much sense in the team supporting it) enforcements get reported on – this makes sense since with deny in place, invalid resources cannot enter the system.

I also recommend starting with dryrun if your governance process is new, this will give teams time to make change per the policies. Starting with deny can cause work disruptions and less then chances of success.

Debugging the Assignment

One thing I found helpful is to use -o yaml to see what the generated CRD and Constraint look like in the cluster. I used this when I was working with the Engineering team to determine why parameters were not being mapped.

I still recommend building your Rego using the Rego Playground (https://play.openpolicyagent.org/)

Closing

I am deeply intrigued by the use OPA in Kubernetes to enforce governance, as I believe strongly in the promise of governance and its criticality in the DevSecOps movement. I also like what support in Azure Policy means for larger organizations looking to adopt OPA at a wider scale. Combine it with what Azure Arc brings to the table and suddenly any organization in any cloud has the ability to create a single pane of glass to monitor the governance status of their cluster regardless of platform or provider.

Please let me know if you have any questions in the comments.

Leave a comment