One of the things I have been focusing on lately is Kubernetes, its always been an interest to me but, I recently decided to pursue the Certified Kubernetes Application Developer (CKAD) and so diving into topics that I was not totally familiar with has been a great deal of fun.
One topic that is of particular interest is storage. In Kubernetes, though really in Containerized applications, state storage is an important topic since the entire design of these systems is aimed at being transient in nature. With this in mind, it is paramount that storage happen in a centralized and highly available way.
A common approach to this is to simply leverage the raw cloud APIs for things like Azure Storage, S3, etc as the providers will do a better job ensuring the data is stored securely and in a way that makes it hard for data loss to occur. However, Kubernetes enables the mounting of the cloud systems directly into Pods through Persistent Volumes and Storage Classes. In this post, I want to show how to use Storage Class with Azure so I wont be going to detail about the ins and out of Storage Classes or their use cases over Persistent Volumes, frankly I dont understand that super well myself, yet.
Creating the Storage Class
The advantage to Storage Class (SC) over something like Persistent Volume (PV) is the former can automatically create the later. That is, a Storage Class can received Claims for volume and will, under the hood, create PVs. This is why SC’s have become very popular with developers, less maintenance.
Here is a sample Storage Class I created for this demo:
This step is actually optional – I only did it for practice. AKS will automatically create 4 default storage classes (they are useless without a Persistent Volume Claim (PVC)). You can see them by running the follow command:
kubectl get storageclass
Use kubectl create -f to create the storage class based on the above, or use one of the built in ones. Remember, by itself, the storage class wont do anything. We need to create a Volume Claim for the magic to actually start.
Create the Persistent Volume Claim
A persistent volume claim (PVC) is used to “claim” a storage mechanism. The PVC can be, depending on its access mode, attached to multiple nodes where its pods reside. Here is a sample PVC claim that I made to go with the SC above:
The way PVCs work (simplistically) is they seek out a Persistent Volume (PV) that can support the claim request (see access mode and resource requests). If nothing is found, the claim is not fulfulled. However, when used with a Storage Class its fulfillment is based on the specifications of the Storage Class provisioner field.
One of the barriers I ran into, for example, was that my original provisioner (azure-disk) does NOT support multi-node (that is it does not support ReadWriteMany used above). This means, the storage medium is ONLY ever attached to a single node which limits where pods using the PVC can be scheduled.
To alleviate this, I opted to use, as you can see, the azure-file provisioner, which allows multi node mounting. A good resource for reading more about this is here: Concepts – Storage in Azure Kubernetes Services (AKS) – Azure Kubernetes Service | Microsoft Docs
Run a kubectl create -f to create this PVC in your cluster. Then run kubectl get pvc – if all things are working your new PVC should have a state of Bound.
Let’s dig a bit deeper into this – run a kubectl describe pvc <pvc name>. If you look at the details there is a value with the name Volume. This is actually the name of the PV that the Storage Class carved out based on the PVC request.
Run kubectl describe pv <pv name>. This gives you some juicy details and you can find the share in Azure now under a common Storage Account that Kubernetes has created for you (look under Source).
This is important to understand, the claim creates the actual storage and Pods just use the claim. Speaking of Pods, let’s now deploy an application to use this volume to store data.
Using a Volume with a Deployment
Right now, AKS has created a storage account for us based on the request from the given PVC that we created. To use this, we have to tell each Pod about this volume.
I have created the following application as Docker image xximjasonxx/fileupload:2.1. Its a basic C# Web API with a single endpoint to support a file upload. Here is the deployment that is associated with this:
The key piece of this the ENV and Volume Mounting specification. The web app looks to a hard coded path for storage if not overridden by the Environment Variable SAVE_PATH. In this spec, we specify a custom path within the container via this environment variable and then mount that directory externally using the Volume created by our PVC.
Run a kubectl create -f on this deployment spec and you will have the web app running in your cluster. To enable external access, create a Load Balancer Service (or Ingress), here is an example:
Run kubectl create -f on this spec file and then run kubectl get svc until you see an External IP for this service indicating it can be addressed from outside the cluster.
I ran the following via Postman to test the endpoint:
If all goes well, the response should be a Guid which indicates the name of the image as stored in our volume.
To see it, simply navigate to the Storage Account from before and select the newly created share under the Files service. If you see the file, congrats, you just used a PVC through a Storage Class to create a place to store data.
What about Blob Storage?
Unfortunately, near as I can tell so far, there is no support for saving these items to object storage, only file storage. To use the former, at least with Azure, you would still need to use the REST APIs.
This also means you wont get notifications when new files are created in the file share as you would with blob storage. Still, its useful and a good way to ensure that data provided and stored is securely and properly preserved as needed.