MVP for another year

It is with great humility that I announce that Microsoft has renewed my MVP status for an additional year. While it is my second renewal, it is the first true renewal I have had since being selected in 2016.

What I mean by that is, after I was selected in 2016, the program's renewal cycle changed and, as part of the change, I was grandfathered into the MVP program from 2017 into 2018. This meant it would be my accomplishments in 2017 that would dictate whether my MVP status continued.

I spoke and blogged quite a bit in early 2017 but shut things down around August to focus on my wedding and honeymoon. What's more, throughout 2017 I was tasked with a large $4mil web project using NodeJS, AWS, and ReactJS for West Monroe. I was worried, as this certainly drew my focus away from what got me my MVP. In addition, I decided to refocus on the web and away from Xamarin (part of an overall decision to focus more on the Cloud side of things).

2018 has also not been easy, as the AWS project finishes up and I celebrate the birth of my first child, my son Ethan. But I am committed to finding the balance: I have already spoken at two conferences (CodeMash and Chicago Code Camp), have been selected to speak at TechBash, and have abstracts out to VSLive.

In the end, my willingness to share my ideas here, and the awesome people who take the time to read what I write and even share some of my articles on forums and StackOverflow, helped earn that MVP renewal. So I send out a thank you to all.


Scratching the Surface of Real Time Apps – Part 3

Part 1
Part 2

This is it, our final step: saving the query results from the Analytics application to a DynamoDB table where they can easily be queried. For this we will again use Lambda as a highly scalable and easy-to-deploy mechanism to facilitate the transfer.

Creating your Lambda function

In Part 1, I talked about installing the AWS Lambda templates for Visual Studio Code. If you list your installed templates using:

dotnet new -l

You will get a listing of all installed templates. A casual observation shows two provided Lambda templates that target Kinesis events. However, neither of these will work. I actually thought I had things working with these templates, but the structure of the event sent from Analytics is very different; the Records property is the only one that maps, and it comes through as empty objects.

I would recommend starting with the serverless.EmptyServerless template. What you actually need is the KinesisAnalyticsOutputDeliveryEvent type, which is available in the Amazon.Lambda.KinesisAnalyticsEvents NuGet package (link). Once you have this installed, ensure the first parameter of your handler is of type KinesisAnalyticsOutputDeliveryEvent and Amazon will handle the rest. For reference, here is my source code which writes to my DynamoDB table:
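The original code embed has not survived here, so the following is a minimal sketch of what such a handler looks like; the table name, the key names, and the exact property used to read each record's base64 data are assumptions and may differ slightly from the real event classes:

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Amazon.DynamoDBv2;
using Amazon.DynamoDBv2.Model;
using Amazon.Lambda.Core;
using Amazon.Lambda.KinesisAnalyticsEvents;
using Newtonsoft.Json.Linq;

public class Functions
{
    private readonly IAmazonDynamoDB _dynamoDb = new AmazonDynamoDBClient();

    public async Task ProcessOutput(KinesisAnalyticsOutputDeliveryEvent input, ILambdaContext context)
    {
        foreach (var record in input.Records)
        {
            // Each record carries one base64-encoded JSON row from our OUTPUT_STREAM
            var json = Encoding.UTF8.GetString(Convert.FromBase64String(record.Base64EncodedData));
            var row = JObject.Parse(json);

            // Composite key: a minute-based timestamp (partition) plus the symbol (sort)
            await _dynamoDb.PutItemAsync(new PutItemRequest
            {
                TableName = "StockTransactions",
                Item = new Dictionary<string, AttributeValue>
                {
                    ["time"] = new AttributeValue(row.Value<string>("time")),
                    ["symbol"] = new AttributeValue(row.Value<string>("symbol")),
                    ["shareChange"] = new AttributeValue { N = row.Value<string>("shareChange") }
                }
            });
        }
    }
}
```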

Here we are using the AWSSDK.DynamoDBv2 NuGet package to connect to DynamoDB (the Lambda role must have access to Dynamo and our specific table ARN). We use AttributeValue to write, essentially, a JSON object to the table.

In my case, I have defined both the partition and sort keys, since the data we expect is a unique combination of a minute-based timestamp and symbol; this is a composite key.

As with the Kinesis Firehose write Lambda, use the following command to deploy (or update) the Lambda function:

dotnet lambda deploy-function <LambdaName>

Again, the Lambda name does NOT have to match the local name; it is the name that will be used when connecting with other AWS services, like our Analytics application.

Returning to our Analytics Application

We need to make sure our new Lambda function is set up as the destination for our Analytics application. Recall that we created a stream when we wrote our SQL query. Using the Destination feature, we can ensure these bounded results wind up in the table. Part 2 explains this in more detail.

Running End to End

So, if you followed all three segments you should now be ready to see if things are working. I would recommend having the DynamoDB table's Items view open so you can see new rows as they appear.

Start your ingestion application and refresh Dynamo. It may take some time for rows to appear. If you are impatient, like me, you can use the SQL Editor to see rows as they come across. I also like to have Console output messages in the Lambda functions that I can see in CloudWatch.

With any luck you should start seeing rows in your application. If you don't, it could be a few things:

  • You really want to hit the Firehose with a lot of data; I recommend a MINIMUM of 200 records. If you send less, the processing can be a bit delayed. Remember, what you are building is a system that is DESIGNED to handle lots of data. If there is not a lot of data, Kinesis may not be the best choice.
  • Check that your Kinesis query is returning results and that you have defined an OUTPUT_STREAM. I feel the AWS docs do not drive this point home well enough and seem to imply the presence of a “default” stream; this is not the case, at least not as far as I found.
  • Use CloudWatch obsessively. If things are not working, use Console.WriteLine to figure out where the break is. I found this a valuable tool.
  • Ensure you are using the right types for your arguments. I spent hours debugging my write to Dynamo only to discover I was using the wrong type for the incoming Kinesis event.

Final Thoughts

Taking a step back and looking at this end-to-end product, the reality is we did not write that much code. Between my Lambdas I wrote remarkably little code, total. Not bad given what this application can do and its level of extensibility (Kinesis streams can send to multiple destinations).

This is what I love about Cloud platforms. In the past, designing and building a system like this would take as long as the application that might be using the data. But now, with the Cloud, it's mostly point and click and the platform handles the rest. Streaming data applications like this will only become more common as more systems move to the Cloud and more businesses want to gather data in new and unique ways.

Kinesis in AWS, Event Grid in Azure, and Pub/Sub in Google Cloud represent tooling that allows us to take these very valuable, highly complicated applications and build them in hours instead of weeks. In my view, AWS has the best support for applications like these, though I expect the others to catch up.

I hope you enjoyed this long example, and I hope it gives you ideas for your next application and shows how quickly you can get a valuable, production-ready real-time data application running.

Scratching the Surface of Real Time Apps – Part 2

See Part 1 here

If you followed Part 1, we have a Lambda that sits behind an API Gateway that writes the contents of the payload it receives to an Amazon Kinesis Firehose Delivery Stream. This stream is currently configured to dump its contents, at set conditions, to S3.

S3 does not really lend itself to real-time processing; though we could use something like Athena to query it, our aim is to see our data change in real time. For that we want to hook our Firehose stream up to a Data Analytics stream whose principal job is to process “bounded” data.


The most important thing with this step is having an idea of what you want to look for before you start. Generally, these are going to be time-based metrics, but they don't have to be. Once you have established these ideas, we can move on to the creation phase.

Create the analytics application

Configure the source of your Analytics Application

Go to your Amazon AWS Console and select Services -> Kinesis. This will bring up your Kinesis Dashboard, you should see the Firehose stream you created previously. Click Create analytics application.

Similar to the Firehose process, we give our new stream an application name and click the Create action button.

An analytics application is composed of three parts: Source, Processor, and Destination. You need to configure these in order. Amazon makes it pretty easy, except for the part on debugging, which we will talk about in a bit.

For the Source, if you are following this tutorial, you will want to connect it to the Firehose delivery stream you created previously; a Kinesis Data Stream works as well. This part is pretty straightforward.

The important thing to take note of is the name of your incoming In-Application stream; this will serve as the “table” in your queries.

In order to save the configuration, the application will want to discover the schema from the incoming data. This is done using the Discover Schema button at the bottom; you need to have data flowing through for it to work.

As a side note, when I wrote my application which sends data, I added logic to limit the size of the result set being sent. This helps speed up development and lets you explore other data scenarios you may want to explore (pre-processing). I find this approach better than truly opening the firehose before you are ready, pun intended.

Configure the Processing of your streaming data

When you think about writing a SQL query against a database like MySQL, you probably don't often think about the data set changing underneath you, but that is exactly what you must consider for many operations within Analytics, because you are dealing with unbounded data.

The term “unbounded data” matters for things like averages, mins, maxes, and sums: values calculated from a complete set of data. These can't be computed naively within Analytics because you are dealing with a continuous stream; you could never compute the average because it would keep changing. To mitigate this, query results need to operate on bounded data, especially for aggregates.

This bounding can be done a few ways, but the most common is the use of a Window. Amazon has some good documentation on this here, as well as on the different types. Our example will use a Tumbling Window, because we will consider the set of data relevant for a given minute: specifically, the number of stock shares purchased or sold within that minute.

For stage 2 (Real Time Analytics) of the Analytics application, you will want to click the Go to SQL Results button, which will take you into the SQL Query editor. This is where you define the SQL to extract the metric you want. Remember, you MUST provide criteria to make the query operate on a bounded data set, otherwise the editor will throw an error.

For my application this is the SQL I used:
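The query embed has not survived here; based on the stream and columns described below, it looked roughly like the following sketch (the source stream name SOURCE_SQL_STREAM_001 and the shares column are assumptions):

```sql
-- The output stream that the destination (Step 3) will read from
CREATE OR REPLACE STREAM "OUTPUT_STREAM" ("time" TIMESTAMP, "symbol" VARCHAR(4), "shareChange" INTEGER);

-- The pump runs the SELECT continuously, inserting each result into the stream
CREATE OR REPLACE PUMP "STREAM_PUMP" AS
  INSERT INTO "OUTPUT_STREAM"
    SELECT STREAM
        STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '60' SECOND) AS "time",
        "symbol",
        SUM("shares") AS "shareChange"
    FROM "SOURCE_SQL_STREAM_001"
    -- STEP is what creates the one-minute tumbling window
    GROUP BY "symbol",
             STEP("SOURCE_SQL_STREAM_001".ROWTIME BY INTERVAL '60' SECOND);
```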

There is a lot going on here so let’s unpack it.

The first step you need to take is to create a stream that will contain the results of your query. You can KIND OF think of this as a “temp table” from SQL land, but it's more sophisticated.

In our example, we define a STREAM called OUTPUT_STREAM, which we will be able to select in Step 3, when we determine what happens after the query. This stream features three columns (time, symbol, and shareChange).

Our second step focuses on getting data from the query into the stream, which is done using a Pump. This will run the INSERT statement with each query result and insert the data into the stream.

The final bit is the query that actually does the lifting. It's pretty standard with the exception of the STEP operator in the GROUP BY clause; this is what creates the tumbling window we mentioned before.

Once you have written your query you can click Save and Run, which will save the query and execute it. You will be much more pleased with the outcome if you are sending data to your Firehose so the Analytics application has data it can show.

Connect to a Destination

The final configuration task for the Analytics Application is to specify where the query results are sent. For our example we will use a Lambda. However, doing this with C# can be somewhat confusing so, we will cover it in the next part.

You can specify this configuration by accessing the Destination section from the application homepage (below):

Alternatively, you can select the Destination tab from within the SQL Editor. I find this way to be better since you can directly select your output stream as opposed to free typing it.

For both of these options, Amazon has made it very easy and as simple as point and click.


Congrats. You now have a web endpoint that automatically scales and writes data to a pipe that stores things in S3. In addition, we created an Analytics application which queries the streamed data to derive metrics based on a window (per minute in this case). We then pass these results to a Lambda which, as we will cover in Part 3, writes the results to a DynamoDB table that we can easily query from a web app to see our results in real time.

Part 3

Scratching the Surface of Real Time Apps – Part 1

Recently, I began experimenting with Amazon Kinesis as a way to delve into the world of real-time data applications. For the uninitiated, this is referred to as “streaming” and it basically deals with continuous data sets. By using streaming, we can make determinations in real time, which allows a system to generate more value for those wanting insights into their data.

Amazon has thoroughly embraced the notion of streaming within many of their services, supported by Lambda. Being event driven is not a new pattern but, by listening for events from within the Cloud, such systems become much easier to develop and enable more scenarios than ever before.

Understanding Our Layout

Perhaps the most important thing when designing this, and really any Cloud-based application, is a coherent plan of what you're after and what services you intend to use. Here is a diagram of what we are going to build (well, parts of it).



Our main focus will be on the Amazon components. For the Console App, I wrote a simple .NET Core app which randomly generates stock prices and shares bought. Really, you can build anything that generates a lot of data for practice.
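As a rough illustration, the generator can be as simple as the following sketch; the payload shape is an assumption, and the URL placeholder stands in for the API Gateway stage URL:

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class Program
{
    private static readonly string[] Symbols = { "MSFT", "AMZN", "GOOG", "AAPL" };
    private static readonly HttpClient Client = new HttpClient();

    static async Task Main()
    {
        var random = new Random();
        for (var i = 0; i < 500; i++)
        {
            // Build a fake transaction: random symbol, price, and share count
            var symbol = Symbols[random.Next(Symbols.Length)];
            var price = Math.Round(random.NextDouble() * 200, 2);
            var shares = random.Next(1, 100);
            var payload = $"{{\"symbol\":\"{symbol}\",\"price\":{price},\"shares\":{shares}}}";

            // POST each record to the API Gateway endpoint fronting the Lambda
            await Client.PostAsync(
                "https://<api-id>.execute-api.<region>.amazonaws.com/<stage>",
                new StringContent(payload, Encoding.UTF8, "application/json"));
        }
    }
}
```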

This data gets written to an API Gateway, which allows us to create a more coherent mapping to our Lambda rather than exposing it directly. API Gateways enable a host of useful scenarios, such as unifying disparate services behind a single front or starting the breakup of a monolith into microservices.

The Lambda takes the request and writes it to a Kinesis Firehose which is, as it sounds, a wide pipe designed to take A LOT of data. We do this to maximize throughput. It's like a queue, but much more than that, and it's all managed by Amazon and so will scale automatically.

The Firehose then writes the data it receives to another Kinesis stream, but an Analytics stream, designed to be queried to see trends and other aspects of the incoming data. This outputs to a stream which is then sent to another Lambda. The thing to remember here is that we are working against a continuous data set, so bounding our queries is important.

The final Lambda takes the query result from Analytics and saves it to Dynamo. This gives us our time-based data, which we can query from a web application.


Creating our Kinesis Ingestion Stream

Start off from the AWS Console, under Analytics selecting Kinesis. You will want to click the Create Firehose Delivery Stream.

For the first screen, you will want to give the stream a name and ensure the Source is set to Direct PUT. Click Next.

On the second screen, you will be asked if you want to do any kind of preprocessing on the items. This can be very important depending on the data set you are ingesting. In our case, we are not going to do any preprocessing; click Next.

On the third screen you will be asked about Destination. This is where the records you write to Firehose go after they pass through the buffer and into the ether. For our purposes, select S3 and create a new bucket (or use an existing one). I encourage you to explore the others as they can serve other use cases. Click Next.

On the fourth screen, you will be asked to configure various settings, including the all-important S3 Buffer Conditions: this is when Firehose will store the data it receives. Be sure to set your IAM Role; you can auto-create one if you need to. Click Next.

On the final screen click Create Delivery Stream. Once the process is complete we can return and write some code for the Lambda which will write to this stream.

Creating the Ingestion Lambda

There are many ways to get data into Kinesis, each of which is geared towards a specific scenario and dependent on the desired ingestion rate. For our example, we are going to use an HTTP-triggered Lambda written in C# (.NET Core 2.0).

Setup: I am quite fond of Visual Studio Code (download) and it has become my editor of choice. To make this process easier, you will want to run the following command to install the Amazon AWS Lambda templates (note: this assumes you have installed the .NET Core SDK on your machine):

dotnet new -i Amazon.Lambda.Templates::*

More information here

Once this is installed we can create our project using the EmptyServerless template, here is the command:

dotnet new lambda.EmptyServerless --name <YourProjectName>

This will create a new directory named after your project with src/ and test/ directories. Our efforts will be in the src/<ProjectName> directory; make sure you cd there, as you must be in the project root to run the installed dotnet lambda commands.

Our basic process with this API call is to receive a payload, via POST, and write its contents to our Firehose stream. Here is the code that I used:
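The embedded code is missing here; below is a minimal sketch of the handler described by the notes that follow. The stream name input_transactions and the handler name PostHandler come from the original text; the rest is illustrative:

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;
using Amazon.KinesisFirehose;
using Amazon.KinesisFirehose.Model;
using Amazon.Lambda.Core;
using Newtonsoft.Json;

public class Functions
{
    private readonly IAmazonKinesisFirehose _firehose = new AmazonKinesisFirehoseClient();

    // First parameter is object: Amazon deserializes the incoming JSON for us
    public async Task PostHandler(object input, ILambdaContext context)
    {
        // We only need the raw contents back, so re-serialize the payload
        var json = JsonConvert.SerializeObject(input);

        // Write the record to the Firehose delivery stream
        await _firehose.PutRecordAsync(new PutRecordRequest
        {
            DeliveryStreamName = "input_transactions",
            Record = new Record { Data = new MemoryStream(Encoding.UTF8.GetBytes(json)) }
        });
    }
}
```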

A few things to take note of here:

  • The first parameter is of type object and it needs to match the incoming JSON schema. Amazon will handle the deserialization from string to object for us, though we end up needing the raw contents anyway.
  • We need to install the AWSSDK.KinesisFirehose NuGet package, which gives us access to AmazonKinesisFirehoseClient.
  • We call PutRecordAsync to write to our given delivery stream (in the case of this code, that stream is named input_transactions).
  • I have changed the name of the function handler from FunctionHandler (which comes with the template) to PostHandler. This is fine, but make sure to update the aws-lambda-tools-defaults.json file so function-handler is correct.

Once you have your code written, we need to deploy the function. I would rather do this from the command line, since it avoids the auto-creation of the API Gateway path. I have done both, and I like the clarity offered by doing it in stages. Run this command to deploy:

dotnet lambda deploy-function <LambdaFunctionName>

LambdaFunctionName is the name of the function as it appears in the AWS Console. It does NOT have to match what you call it in development, which is very useful to know.

As a prerequisite to this, you need to define a role for the Lambda to operate within. As with anytime you define an IAM Role, it should only have access to the bare minimum of services that callers will need.

When you run the deploy, you will receive a prompt to pick an applicable role for the newly created function, future deployments of the same Lambda name will not ask this.

Our next step is to expose this on the web; for that we need to create an API Gateway route. As a side note, there is a concept called lambda-proxy which I could not get to work. If you can, great, but I made do with the more standard approach.

Add Web Access

I think API Gateways are one of the coolest and most useful services offered by Cloud providers (API Management on Azure and Endpoints on Google). They enable us to tackle the task of API generation in a variety of ways, offer standard features prebuilt, and provide an abstraction that lets us make changes behind the scenes.

I will be honest, I can never find API Gateway in the Services list, I always have to search for it. Click on it once you find it. On the next page press Create API. Here are the settings you will need to get started:

  • Name: Anything you want; this doesn't get exposed to the public, so be descriptive.
  • Endpoint type: Regional (at this point I have not explored the differences between the types)

Click Create and you will be dropped on the API configuration main page.

The default endpoint given is /, the root of the API. Often we would add resources (i.e. User, Product, etc.) and then define our operations against those resources. For simplicity, here we will operate against the root; this is not typical, but API organization and development is not the focus of this post.

With / selected, click Actions and select Create Method. Select POST and click the checkmark. API Gateway will create a path to allow POST against / with no authentication requirements, which is what we want.

Here is a sample of the configuration we will specify for our API Gateway to Lambda pipe, note that my function is called PostTransaction, yours will likely differ.


Click Save and you will get a nice diagram showing the stages of the API Gateway route. We can test that things are working by clicking Test and sending a payload to our endpoint.

I won't lie, this part can be maddening. I tried to keep things simple to avoid problems. There is good response logging in the Test output; you just have to play with it.

We are not quite ready to call our application via Postman yet; we need to create a stage. Stages allow us to deploy the API to different environments (Development, QA, Staging, Production). We can actually combine this with Deploy API.

IMPORTANT: Your API changes are NEVER live when you make them; you must deploy, each time. This is important because it can be annoying figuring out why your API doesn't work, only to find you didn't deploy your change, as I did.

When you do Actions -> Deploy API you will be prompted to select a Stage. In the case of our API, we have no stages yet, so we create one; I chose Production, you can choose whatever you like.

Once the deploy finishes, we need to get the actual URL that we will use. Go to Stages in the left-hand navigation and select your stage; the URL will be displayed along the top (remember to append your resource name if you didn't use /). This is the URL we can use to call our Lambda from Postman.

The ideal case here is that you create an app that sends this endpoint a lot of data so we can write a large amount to Firehose for processing.


In this part, we did not write that much code, and the code we did write could be tested rather easily within the testing harnesses for both API Gateway and Lambda. One useful trick is to use Console.WriteLine, as CloudWatch will capture anything written to STDOUT in its Logs.

Next Steps

With this we have an endpoint applications can send data to, and it will get written to Firehose. By using Firehose we can support ingesting A LOT of data, and with Lambda we get automatic scaling. We abstract the Lambda path away using API Gateway, which lets us also include other API endpoints that could point at other services.

In our next part we will create a Kinesis Analytics Data Stream, which allows us to analyze the data as it comes into the application – Part 2

Exploring Cassandra

Continuing on with my theme this year of learning the integral parts of high-performing systems, I decided to explore Cassandra, a multi-node NoSQL database that acts like an RDBMS. Cassandra is a project that was born out of Facebook and is currently maintained by the Apache Foundation. In this post, we are going to scratch the surface of this tool. This is not an in-depth look at the project, but rather a “get it to work” approach.

At its core, Cassandra is designed to feel like an RDBMS but provide the high-availability features that NoSQL is known for, attempting to bridge the best of both worlds. The query language used internally is called CQL (Cassandra Query Language). At the most basic level, we have keyspaces (logical groupings within a cluster), and within those keyspaces we have tables. The big selling point of Cassandra is its ability to quickly replicate data between its nodes, allowing it to fulfill the tenet of Eventual Consistency.

For my purposes, I wanted to see how fast I could read and write with a single node using .NET Core, mostly to understand the general interactions with the database from .NET. I decided to use Docker to stand up the single-node cluster, and generated a Docker Compose file so I could run a CQL script at startup to create my keyspace and table.

The Docker File

I started off trying to use the public cassandra image from Docker Hub, but I found that it does not support the entrypoint-script concept and required me to create the keyspace and table myself. Luckily, I found the dschroe/cassandra-docker image, which extends the public Cassandra image to support this case.

I wanted to just write a bunch of random names to Cassandra, so I created a simple keyspace and table. Here is the code:
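The script embed is missing here; a minimal sketch of such a startup script follows (the keyspace, table, and column names are assumptions, though the firstName/lastName columns match the behavior described later):

```sql
-- Single-node setup, so SimpleStrategy with a replication factor of 1 is enough
CREATE KEYSPACE IF NOT EXISTS names
  WITH REPLICATION = { 'class': 'SimpleStrategy', 'replication_factor': 1 };

-- Unquoted identifiers are lower-cased by Cassandra,
-- which is why reads come back as firstname/lastname
CREATE TABLE IF NOT EXISTS names.person (
  id uuid PRIMARY KEY,
  firstName text,
  lastName text
);
```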

I worked this into a simple Docker Compose file so that I could do all of the mounting and mapping I needed. You can find the compose file in the GitLab repo I reference at the end. Once you have it, simply run docker-compose up and you should have a Cassandra database with a keyspace called names up and running. It's important to inspect the output; you should see the creation of the keyspace and table.


This is the NuGet package I used to handle the connection with Cassandra. At a high level, you need to understand that a cluster can have multiple keyspaces, so you need to specify which one you want to connect to. I found it useful to view keyspaces as databases, since you will see the USE command with them. They are not databases per se, just logical groupings that can have different replication rules.

This connection creates a session which will allow you to manipulate the tables within the keyspace.
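With the DataStax C# driver (my assumption; the original does not name the package), the connection looks roughly like this sketch, where the contact point and keyspace name are also assumptions:

```csharp
using Cassandra;

// Build a connection to the single-node cluster started by docker-compose
var cluster = Cluster.Builder()
    .AddContactPoint("127.0.0.1")
    .Build();

// Connecting to the "names" keyspace yields a session,
// which is what executes CQL against the tables within it
var session = cluster.Connect("names");
```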

Insert the Data

I have always been amused by the funny names Docker gives containers when you don't specify a name. It turns out someone created an endpoint which can return these names. This delighted me to no end, so I used it for my data.

You can find the logic which queries this endpoint in the repo linked at the end.

First we want to get the data into Cassandra. I found the best way to do this, especially since I am generating a lot of names, is to use the BEGIN BATCH and APPLY BATCH wrappers around the INSERT commands.

By doing this you can insert however many rows you like with little chance of draining the cluster connections (I managed to do exactly that when I used one insert per request).
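A sketch of what that batched insert looks like, assuming the person table and the session from earlier (production code should prefer bound parameters over string interpolation):

```csharp
using System.Text;
using Cassandra;

// names is a collection of (firstName, lastName) tuples pulled from the endpoint
var cql = new StringBuilder("BEGIN BATCH ");
foreach (var (first, last) in names)
{
    cql.Append($"INSERT INTO person (id, firstName, lastName) VALUES (uuid(), '{first}', '{last}'); ");
}
cql.Append("APPLY BATCH;");

// One round trip inserts the whole set instead of one connection per row
session.Execute(cql.ToString());
```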

Read the Data

When you perform Execute against a Session, the result is a RowSet, which is enumerable and can be used with LINQ. The interesting thing I found here is that while I specify my column names as firstName and lastName, when I get the rows back from the RowSet the columns are named in all lower case: firstname and lastname. By far this was the most annoying part of building this sample.
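For example (a sketch; the table name is an assumption, and note the lower-cased column names on the way back out):

```csharp
using System.Linq;
using Cassandra;

RowSet rows = session.Execute("SELECT * FROM person");

// Columns must be read back as "firstname"/"lastname", not "firstName"/"lastName"
var fullNames = rows
    .Select(row => $"{row.GetValue<string>("firstname")} {row.GetValue<string>("lastname")}")
    .ToList();
```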

Delete the Data

Cassandra is very particular about DELETE statements. If I had to venture a guess, deletes are likely restricted due to the replication that needs to happen to finalize the operation. It also appears that if you want to delete, you have to provide a condition; a bare DELETE FROM <table> to delete everything does not appear to be supported, again, I think, due to replication considerations.

Instead, you have to use TRUNCATE to clear the table.


Cassandra is a big topic, and I need more than a partial week to fully understand it, but overall my initial impression is good and I can see its use cases. I look forward to using it more, and eventually in a scenario where it is truly needed.

Here is the code for my current Cassandra test; I recommend having Docker installed, as it allows this code to be completely self-contained. Cheers.

Speaking at TechBash

Proud to announce that I have accepted the offer to speak at TechBash in Pennsylvania. I will be giving my talk on Blazor, Microsoft's WebAssembly SPA framework (recently blogged here), and I will also be debuting my newest talk, Building Serverless Microservices with Azure, a topic I have touched on here as part of a larger series.

This will be my first time at TechBash and I have heard great things from past speakers, so I very much look forward to this event. I hope to see you there.

Setting up Kubernetes on GKE

Kubernetes is the new hotness when it comes to orchestrating containers for both development and production scenarios. If that sentence doesn't make sense to you, don't worry, it didn't make sense to me either when I first got started.

One of the biggest trends in server development right now is the microservice pattern, whereby we break up our monolithic backends into smaller services that we can then mash together to support various scenarios. Containers helped make this process more efficient than straight server deploys. But as our deployments got larger and our availability needs grew, things got complicated.

This is where the concept of orchestration comes in. Orchestration is a higher-level tool which manages our sets of containers and ensures that we maintain operational status. Kubernetes is such a tool, developed by Google, which has gained immense popularity for its extensibility, efficiency, and resiliency (it can self-heal). What's more, since multiple containers can run on the same VM, we can now more efficiently utilize our server resources, as opposed to previous models which saw an overuse of VMs to manage load.

Kubernetes has gained such a following that Azure, Amazon, and Google all provide managed versions enabling quick deployments without having to set everything up yourself (which is beyond painful). In this article, we are going to explore setting up a simple microservice on Google Cloud Platform.

Our Microservice

As part of my prototype retail app, I created a simple login service. The code is not important; just know that it utilizes PostgreSQL and Redis to handle user authentication and JWT token persistence.

Our goal is going to be to stand up a Kubernetes Cluster, in Google, to serve out this microservice.

A VERY Quick Intro to Kubernetes

Kubernetes is a HUGE topic that is getting larger every day. This blog post won't even cover a third of those concepts but, principally, there are several you need to know for what I am going to say to make sense:

  • Cluster – a collection of nodes (think VMs, but not really) to which we can provision resources, including Pods, Deployments, Services, and others.
  • Pod – the most basic unit of functionality within Kubernetes. A Pod hosts one or more containers (created from Docker; if you don't know what these are, click here). A Pod can be thought of, conceptually, as a machine.
  • Deployment – a set of rules around how we deploy Pods. This can include how many replicas exist and what the action is if a container fails.
  • Service – these allow Deployments of Pods to communicate either internally or externally. A Service can act as a load balancer for your Deployment (if you have more than one Pod). Logically speaking, it is what provides access into a cluster.

Overall, the goal of Kubernetes (often abbreviated k8s) is to specify and adhere to an ideal state. For example, I can say I want three replicas which hold my API code, and Kubernetes will guarantee that, at any given time, there will always be AT LEAST 3 Pods taking requests via the Deployment specification. I can also specify scaling conditions, which allow Kubernetes to increase the number of Pods if necessary.
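To make the ideal-state idea concrete, here is a sketch of a Deployment (three replicas) with a Service in front of it; all names, the image path, and the port are illustrative assumptions:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: login-service
spec:
  replicas: 3            # Kubernetes keeps at least 3 Pods running at all times
  selector:
    matchLabels:
      app: login-service
  template:
    metadata:
      labels:
        app: login-service
    spec:
      containers:
        - name: login-service
          image: gcr.io/<project-id>/login-service:latest
          ports:
            - containerPort: 80
---
apiVersion: v1
kind: Service            # exposes the Pods, load balancing across the replicas
metadata:
  name: login-service
spec:
  type: LoadBalancer
  selector:
    app: login-service
  ports:
    - port: 80
      targetPort: 80
```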

I often like to refer to Kubernetes as a “mini cloud” and, in fact, that is what it is going for: a native cloud that can live in any Cloud. This lets us have all of the advantages of a Cloud without being tied to one.

Image Repositories

Throughout this process we will create Docker containers which run in our Pods. Google provides an automated way (via Container Registry -> Build Triggers) to automatically run Dockerfile code and build images that can then be used as containers via the Kubernetes configuration file. I won't be covering how to set this up; instead, I will assume you have already published your microservice container somewhere.

Managed External Services

As I said above, we can often think of Kubernetes as our own personal cloud that we can move around and run anywhere (in the cloud or on-prem). In my view, this is great for service code, but it loses its appeal when we start talking about running things like a database or a Redis cluster as well. You can do this, but since most cloud providers offer managed versions of these tools that already auto-scale and are reasonably cost effective, I would rather connect to those than try to run my own. I will be using Google-managed Postgres and Redis in this example.

Create & Connect to Your Cluster

Assuming you have a Google Cloud account and have already created a project, expand the Tools side menu and select Kubernetes Engine from underneath the Compute section, then create a new cluster (the defaults are fine for this walkthrough).

This next part is very important and not really documented from what I saw. When the cluster finishes provisioning you will see a Connect button next to it. This button will give you the credentials needed to connect to the cluster you just created. This is important because MANY of the options you will want to use are not available through the UI yet; this includes access to Secrets, which are necessary to connect to a Google-managed database.

Run the given command in a terminal – the important line you are looking for is marked with kubeconfig. This is the context in which kubectl will run. Now we need to tell kubectl to use this context. To do that, execute the following command in the terminal; note my cluster name is test-cluster:

kubectl config use-context <cluster-name>

Sometimes this won't work; kubectl will say the context does not exist. In that case, run the following command to figure out what name the context is actually registered under:

kubectl config get-contexts

It is usually pretty obvious which one is yours. Rerun the use-context command with the new name. Congrats, you are connected.

Create the Proxy Account

Like most cloud providers, Google advises you not to permit direct access to your managed service but rather to utilize a “sidecar” pattern. In this pattern, we provision an additional container alongside the main container in our Pod, and that container proxies access to Postgres through a proxy account.

This means that our application container does not have access to Postgres by itself but must go through the proxy. I won't bore you with instructions on how to do this; Google has done a good job laying this out:

Work up to Step 5 and then come back so we can talk about Secrets.

Creating our Secrets

I am not going to dive too deeply into how to create a managed Postgres and Redis service on GCP, they have documentation that suits that well enough. However, where things get a bit tricky is connecting to these services.

Kubernetes' goal is to deploy and orchestrate containers relative to your “ideal state”. Containers can be provided environment variables, which allows them to pull in environment-specific values, a necessary element of any application.

On top of this, production environment variables commonly include sensitive values like usernames and passwords for databases. This is where Secrets come in: we can register these values with the Kubernetes cluster and then inject them as environment variables. This also keeps sensitive values out of source control.
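As a sketch of what such a Secret looks like as a manifest (the name cloudsql-db-credentials follows Google's Cloud SQL guide; the username and password values are obviously placeholders):

```
apiVersion: v1
kind: Secret
metadata:
  name: cloudsql-db-credentials   # referenced later from the Pod spec
type: Opaque
stringData:                       # stringData lets you write plain values; Kubernetes encodes them
  username: dbuser                # placeholder
  password: changeme              # placeholder
```

Google's guide creates the same Secret imperatively with kubectl create secret generic; either way, the result can be injected into containers as environment variables.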

Continue with Google's guide from there. This was the hard part for me, because the part about connecting kubectl up to my cluster was not clear.

After you register your Secrets, come back here and I will elaborate more on updating your Pod configuration.

Creating the Configuration File

As it stands today, the primary way to set up Kubernetes resources is with YAML files. There are a host of tools in development to alleviate this burden but, for now, it's the best way to go. I want to present you with this YAML file, which shows how to configure your Deployment so you can connect to Postgres.

Within this file we can see that we specify the DB_HOST as 127.0.0.1. This works because of the sidecar pattern.

If you examine the YAML file you will notice our Pod spec (spec -> containers) contains TWO containers, one of which uses the gce-proxy image; this is the Cloud SQL Proxy. Since the two containers are running in the same Pod they are, effectively, on the same machine. Thus 127.0.0.1 refers to the loopback address of the Pod, which can see both containers.

Next, the DB_USER and DB_PASSWORD values are being extracted from the Secrets we created in previous steps. The names should match whatever you used; I personally like to name my credential Secret with a hint to which service it refers, since I have more than one database being referenced within my cluster (for other services).
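A condensed sketch of what that two-container Pod spec might look like (the application image, Secret name, and Cloud SQL instance connection string are placeholders; the gce-proxy image and flags follow Google's Cloud SQL Proxy documentation):

```
spec:
  containers:
    - name: api
      image: gcr.io/my-project/api:latest       # placeholder application image
      env:
        - name: DB_HOST
          value: "127.0.0.1"                    # loopback to the proxy sidecar
        - name: DB_USER
          valueFrom:
            secretKeyRef:
              name: cloudsql-db-credentials     # Secret created earlier
              key: username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: cloudsql-db-credentials
              key: password
    - name: cloudsql-proxy                      # the sidecar
      image: gcr.io/cloudsql-docker/gce-proxy:1.11
      command: ["/cloud_sql_proxy",
                "-instances=my-project:us-central1:my-db=tcp:5432"]  # placeholder instance
```

Because both containers share the Pod's network namespace, the application reaches Postgres by talking to the proxy on localhost port 5432.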

Honestly, my recommendation here is to follow the YAML file I referenced above as you make your config file. It really is the best way to build it up. Bear in mind you do NOT need to define a Service in this file; this is strictly for your Deployment (which GKE refers to as a Workload). You can define the Service here if you like, and for more complex projects I recommend it, so as to keep everything together.

Once you have completed this file we need to apply it. Run the following command

kubectl apply -f <your yaml file>

This will connect and apply your configuration. Return to GKE and click Workloads. After some time things should refresh and you should see your Pods running.


Debugging Tips

As you can imagine, debugging is pretty hard with Kubernetes, so you will want to get most issues out of the way ahead of time. Here are some tips:

  • Make use of minikube for local development. Get things in order, including Secrets, before you deploy.
  • Add console logging to your code. You can access the Pod log files right from the GKE console. Important: if you have multiple replicas you might need to check each one to figure out which one got hit. I recommend starting with a replica count of 1 at first; it makes this easier.
  • Set your imagePullPolicy to Always. I have found that sometimes Google will recreate a container from an existing image and you won't get your latest changes.
  • If you want to delete things you can use kubectl or the UI. I recommend waiting for everything to be removed before redeploying.

Creating the Service

Your code is now deployed into Kubernetes, but you won't be able to access it without a Service. My preference is to define the Service in my YAML file; here is an example:
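As a sketch (the Service name and ports are placeholders; the selector must match the labels on your Deployment's Pods):

```
apiVersion: v1
kind: Service
metadata:
  name: api-service         # placeholder name
spec:
  type: LoadBalancer        # asks the cloud provider for an external IP
  selector:
    app: api                # must match the Pod labels in your Deployment
  ports:
    - port: 80              # port exposed by the Service
      targetPort: 8080      # port the container actually listens on
```

Once applied, GKE provisions a cloud load balancer and the external IP shows up under Services in the console (or via kubectl get services).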


Alternatively, you can do this straight from the UI. The important things are the port and the type, here set to LoadBalancer. Other types include ClusterIP, which is predominantly used for internal services that are not accessible from outside the cluster.

Closing Thoughts

My goal is eventually to do a much larger series on Kubernetes and microservices, since this has become something I am very focused on and is a huge trend in development right now. I hope this article gets people thinking about microservices and how they might use Kubernetes.

There is a lot of discussion around Docker vs. serverless for microservices. Truthfully, this is a false dichotomy since, in most cases, we would want to host our serverless functions in Docker containers anyway. There is even a project, Kubeless, which allows you to map serverless function events onto Kubernetes-hosted functions. And there is CloudEvents, which seeks to standardize the event schema on cloud platforms so events can be received the same way regardless of platform.

The real power of Kubernetes (and similar cloud-native systems) is what it enables: a platform that can efficiently use the resources given to it by the underlying provider. We have now reached a point where the big three providers offer Kubernetes as a managed service, so we are past the point where it was too hard for most teams to stand up. The next goal needs to be better tooling that makes configuration easier.

Source code for my login service: