Scratching the Surface of Real Time Apps – Part 1

Recently, I began experimenting with Amazon Kinesis as a way to delve into the world of real time data applications. For the uninitiated, this is referred to as “streaming” and it deals with continuous data sets. By processing data as it streams in, we can make determinations in real time, which allows a system to generate more value for those wanting insights into their data.

Amazon has embraced the notion of streaming within many of their services, supported by Lambda. Being event driven is not a new pattern but, by listening for events from within the Cloud, such systems become much easier to develop and enable more scenarios than ever before.

Understanding Our Layout

Perhaps the most important thing when designing this, and really any Cloud based application, is a coherent plan of what you're after and what services you intend to use. Here is a diagram of what we are going to build (well, parts of it).

[Diagram: RealTimeApp architecture]

Our main focus will be on the Amazon components. For the Console App, I wrote a simple .NET Core app which randomly generates stock prices and shares bought. Really you can build anything that will just generate a lot of data for practice.
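As an illustration, a minimal generator might look like the following. The shape of the payload here is my own assumption; the real console app defines its own model.

```csharp
using System;

// Hypothetical shape of the data the console app sends; the real project
// (StockTransactionApi) defines its own model.
public class StockTransaction
{
    public string Symbol { get; set; }
    public decimal Price { get; set; }
    public int Shares { get; set; }
    public DateTime Timestamp { get; set; }
}

public static class TransactionGenerator
{
    private static readonly string[] Symbols = { "MSFT", "AMZN", "GOOG", "AAPL" };
    private static readonly Random Rng = new Random();

    // Generate one random transaction - enough to produce a steady stream of test data
    public static StockTransaction Next()
    {
        return new StockTransaction
        {
            Symbol = Symbols[Rng.Next(Symbols.Length)],
            Price = Math.Round((decimal)(Rng.NextDouble() * 490 + 10), 2), // $10 - $500
            Shares = Rng.Next(1, 1000),
            Timestamp = DateTime.UtcNow
        };
    }
}
```

Loop over `Next()` and POST each result to the endpoint we build below, and you have a serviceable firehose of test data.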

This data gets written to an API Gateway, which allows us to create a more coherent mapping to our Lambda rather than exposing it directly. API Gateways enable a host of useful scenarios, such as unifying disparate services behind a single front or starting the breakup of a monolith into microservices.

The Lambda takes the request and writes it to a Kinesis Firehose, which is, as it sounds, a wide pipe designed to take A LOT of data; we use it to maximize throughput. It's like a queue, but much more than that, and it's all managed by Amazon, so it scales automatically.

The Firehose then writes the data it receives to a Kinesis Analytics stream, designed to be queried to see trends and other aspects of the incoming data. This outputs to a stream which is then sent to another Lambda. The thing to remember here is that we are working against a continuous data set, so bounding our queries is important.
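The idea of “bounding” a continuous data set can be sketched with a tumbling window: carve the stream into fixed time buckets and aggregate each bucket on its own. This is just an illustration of the concept in C#, not the actual Kinesis Analytics SQL:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TumblingWindow
{
    // Bucket (timestamp, price) events into fixed-size windows and average each
    // bucket, mimicking what a windowed query does over an unbounded stream.
    public static Dictionary<long, decimal> AveragePriceByWindow(
        IEnumerable<(DateTime Time, decimal Price)> events, TimeSpan window)
    {
        return events
            .GroupBy(e => e.Time.Ticks / window.Ticks)   // window index
            .ToDictionary(g => g.Key, g => g.Value.Average(e => e.Price));
    }
}
```

Each window is complete the moment time moves past its boundary, which is what makes the query answerable even though the stream itself never ends.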

The final Lambda takes the query result from Analytics and saves it to DynamoDB. This gives us time-based data that we can query from a web application.

 

Creating our Kinesis Ingestion Stream

Start from the AWS Console; under Analytics, select Kinesis. Click Create Firehose Delivery Stream.

For the first screen, you will want to give the stream a name and ensure the Source is set to Direct PUT. Click Next.

On the second screen, you will be asked if you want to do any kind of preprocessing on the items. This can be very important depending on the data set you are ingesting. In our case, we are not going to do any preprocessing; click Next.

On the third screen you will be asked about Destination. This is where the records you write to Firehose go after they pass through the buffer and into the ether. For our purposes, select S3 and create a new bucket (or use an existing one). I encourage you to explore the others as they can serve other use cases. Click Next.

On the fourth screen, you will be asked to configure various settings, including the all important S3 Buffer Conditions, which control when Firehose flushes the data it receives to S3. Be sure to set your IAM Role; you can auto create one if you need to. Click Next.

On the final screen click Create Delivery Stream. Once the process is complete we can return and write some code for the Lambda which will write to this stream.

Creating the Ingestion Lambda

There are many ways to get data into Kinesis, each geared towards a specific scenario and dependent on the desired ingestion rate. For our example, we are going to use an HTTP-triggered Lambda written in C# (.NET Core 2.0).

Setup: I am quite fond of Visual Studio Code (download) and it has become my editor of choice. To make this process easier, you will want to run the following command to install the Amazon AWS Lambda templates (note: this assumes you have installed the .NET Core SDK on your machine):

dotnet new -i Amazon.Lambda.Templates::*

More information here

Once this is installed we can create our project using the EmptyServerless template, here is the command:

dotnet new lambda.EmptyServerless --name <YourProjectName>

This will create a new directory named after your project with src/ and test/ directories. Our efforts will be in the src/<ProjectName> directory; make sure you cd there, as you must be in the project root to run the installed dotnet lambda commands.

Our basic process with this Api call is to receive a payload, via POST, and write those contents to our Firehose stream. Here is the code that I used: https://gitlab.com/RealTimeApp/StockTransactionApi/blob/master/src/StockTransactionApi/PostTransaction.cs

A few things to take note of here:

  • The first parameter is of type object and it needs to match the incoming JSON schema. Amazon will handle the deserialization from string to object for us. Though we end up needing the raw contents anyway.
  • We need to install the AWSSDK.KinesisFirehose (I used version 3.3.5.3) which gives us access to AmazonKinesisFirehoseClient
  • We call PutRecordAsync on the client to write to our given delivery stream (in the case of this code, that stream is named input_transactions)
  • I have changed the name of the function handler from FunctionHandler (which comes with the template) to PostHandler. This is fine, but make sure to update the aws-lambda-tools-defaults.json file to ensure function-handler is correct
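One practical detail when writing to Firehose: records are concatenated verbatim on delivery (for example, into the S3 objects), so it is common to append a newline delimiter to each record so consumers can split them back apart. A small sketch of shaping the payload before handing it to the Firehose client's put call (the linked repo may do this differently):

```csharp
using System;
using System.IO;
using System.Text;

public static class FirehosePayload
{
    // Firehose concatenates records as-is when delivering to S3, so append a
    // newline delimiter so downstream consumers can split records back apart.
    public static MemoryStream ToRecordStream(string json)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(json + "\n"));
    }
}
```

The resulting stream is what you would assign to the Data property of the record you send to the delivery stream.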

Once you have your code written, we need to deploy the function. I prefer to do this from the command line since it avoids the auto creation of the API Gateway path; I have done both, and I like the clarity offered by doing it in stages. Run this command to deploy:

dotnet lambda deploy-function <LambdaFunctionName>

LambdaFunctionName is the name of the function as it appears in the AWS Console. It does NOT have to match what you call it in development, which is very useful to know.

As a prerequisite to this, you need to define a role for the Lambda to operate within. As with anytime you define an IAM Role, it should only have access to the bare minimum of services that callers will need.

When you run the deploy, you will receive a prompt to pick an applicable role for the newly created function, future deployments of the same Lambda name will not ask this.

Our next step is to expose this on the web; for that we need to create an API Gateway route. As a side note, there is a concept of lambda-proxy which I could not get to work. If you can, great, but I made do with the more standard approach.

Add Web Access

I think API gateways are one of the coolest and most useful services offered by Cloud providers (API Management on Azure and Cloud Endpoints on Google). They enable us to tackle the task of API generation in a variety of ways, offer standard features prebuilt, and offer an abstraction that lets us make changes behind the scenes.

I will be honest, I can never find API Gateway in the Services list, I always have to search for it. Click on it once you find it. On the next page press Create API. Here are the settings you will need to get started:

  • Name: anything you want; this doesn't get exposed to the public, so be descriptive
  • Endpoint type: Regional (at this point I have not explored the differences between the types)

Click Create and you will be dropped on the API configuration main page.

The default endpoint given is /, the root of the API. Often we will add resources (e.g. User, Product, etc.) and then have our operations against those resources. For simplicity we will operate against the root here; this is not typical, but API organization and development is not the focus of this post.

With / selected click Actions and select Create Method. Select POST and click the checkmark. API Gateway will create a path to allow POST against / with no authentication requirements, this is what we want.

Here is a sample of the configuration we will specify for our API Gateway to Lambda pipe, note that my function is called PostTransaction, yours will likely differ.

[Screenshot: API Gateway POST method setup pointing at the Lambda]

Click Save and you will get a nice diagram showing the stages of the API Gateway route. We can test that things are working by clicking Test and sending a payload to our endpoint.

I won't lie, this part can be maddening. I tried to keep this simple to avoid problems. There is good response logging in the Test output; you just have to play with it.

We are not quite ready to call our application via Postman just yet; we need to create a stage. Stages allow us to deploy the API to different environments (Development, QA, Staging, Production). We can actually combine this with Deploy API.

IMPORTANT: Your API changes are NEVER live when you make them; you must deploy, each time. This is important because it can be annoying figuring out why your API doesn't work, only to find you didn't deploy your change, as I did.

When you do Actions -> Deploy API you will be prompted to select a Stage. In the case of our API, we have no stages yet, so we can create one; I chose Production, you can choose whatever.

Once the deploy finishes, we need to get the actual Url we will use. Go to Stages in the left-hand navigation and select your stage; the Url will be displayed along the top (remember to append your resource name if you didn't use /). This is the Url we can use to call our Lambda from Postman.

The ideal case here is that you create an app that sends this endpoint a lot of data so we can write a large amount to Firehose for processing.

Debugging

In this part, we did not write that much code, and the code we did write could be tested rather easily within the testing harnesses for both API Gateway and Lambda. One useful trick is Console.WriteLine, as CloudWatch will capture anything written to STDOUT in its Logs.

Next Steps

With this we have an endpoint applications can send data to, and it will get written to Firehose. By using Firehose we can support ingesting A LOT of data, and with Lambda we get automatic scaling. We abstract the Lambda path away using API Gateway, which also lets us include other API endpoints that could point at other services.

In our next part we will create a Kinesis Analytics Data Stream, which allows us to analyze the data as it comes into the application – Part 2


Exploring Cassandra

Continuing on with my theme this year of learning the integral parts of high performing systems, I decided to explore Cassandra, a multi-node NoSQL database that acts like an RDBMS. Cassandra is a project that was born at Facebook and is currently maintained by the Apache Foundation. In this post, we are going to scratch the surface of this tool. This is not an in-depth look at the project, but rather a “get it to work” approach.

At its core, Cassandra is designed to feel like an RDBMS but provide the high availability features that NoSQL is known for, attempting to bridge the best of both worlds. The query language used internally is called CQL (Cassandra Query Language). At the most basic level, we have keyspaces, and within those keyspaces we have tables. The big selling point of Cassandra is its ability to quickly replicate data between its nodes, allowing it to fulfill the tenet of Eventual Consistency.

For my purposes, I wanted to see how fast I could read and write with a single node using .NET Core, mostly to understand the general interactions with the database from .NET. I decided to use Docker to stand up the single node cluster, and generated a Docker Compose file so I could run a CQL script at startup to create my keyspace and table.

The Docker File

I started off trying to use the public cassandra image from Docker Hub but I found that it does not support the entry point concept and required me to create the key space and table myself. Luckily I found the dschroe/cassandra-docker image which extends the public Cassandra image to support this case.

I wanted to just write a bunch of random names to Cassandra, so I created a simple keyspace and table. Here is the code: https://gitlab.com/xximjasonxx/cassandra-sandbox/blob/master/schema/init.cql
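For reference, such an init script boils down to a keyspace plus a table, roughly like the following (my keyspace, table, and column names here are illustrative; see the linked file for the actual schema):

```sql
-- Create a single-node-friendly keyspace
CREATE KEYSPACE IF NOT EXISTS names
  WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 1 };

-- A simple table to hold the generated names
CREATE TABLE IF NOT EXISTS names.name (
  id uuid PRIMARY KEY,
  firstName text,
  lastName text
);
```

SimpleStrategy with a replication factor of 1 is fine for a local sandbox; a real multi-node cluster would use NetworkTopologyStrategy and a higher factor.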

I worked this into a simple Docker Compose file so that I could do all of the mounting and mapping I needed. You can find the compose file in the GitLab repo I reference at the end. Once you have it, simply run docker-compose up and you should have a Cassandra database with a keyspace called names up and running. It's important to inspect the output; you should see the creation of the keyspace and table.

CassandraCSharpDriver

This is the NuGet package I used to handle the connection with Cassandra. At a high level, you need to understand that a cluster can have multiple keyspaces, so you need to specify which one you want to connect to. I found it useful to view keyspaces as databases, since you will see the USE command with them. They are not databases per se, just logical groupings that can have different replication rules.

This connection creates a session which will allow you to manipulate the tables within the keyspace.

Insert the Data

I have always been amused by the funny names Docker gives containers when you don't specify a name. It turns out someone created an endpoint which can return these names: https://frightanic.com/goodies_content/docker-names.php. This delighted me to no end, so I used it for my data.

You can find the logic which queries this in this file: https://gitlab.com/xximjasonxx/cassandra-sandbox/blob/master/Application.cs

First we want to get the data into Cassandra. I found the best way to do this, especially since I am generating a lot of names, is to use the BEGIN BATCH and APPLY BATCH wrappers around INSERT commands.

By doing this you can insert however many rows you like with little chance of draining the cluster connections (which happened to me when I issued one insert per name).
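As a sketch, building such a batch statement by hand might look like this (keyspace, table, and column names are illustrative; in real code prefer the driver's BatchStatement and bound parameters over string concatenation):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CqlBatchBuilder
{
    // Wrap a set of INSERT statements in BEGIN BATCH / APPLY BATCH so they
    // are sent to the cluster as a single request.
    public static string BuildInsertBatch(IEnumerable<(string First, string Last)> names)
    {
        var sb = new StringBuilder();
        sb.AppendLine("BEGIN BATCH");
        foreach (var n in names)
        {
            // String building is for illustration only; use bound parameters in real code
            sb.AppendLine(
                $"INSERT INTO names.name (id, firstName, lastName) VALUES (uuid(), '{n.First}', '{n.Last}');");
        }
        sb.AppendLine("APPLY BATCH;");
        return sb.ToString();
    }
}
```

The resulting string is what you would hand to Session.Execute as a single statement.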

Read the Data

When you perform Execute against a Session the result is a RowSet which is enumerable and can be used with LINQ. The interesting thing I found here is that while I specify my column names as firstName and lastName when I get back the row from RowSet the columns are named in all lower case: firstname and lastname. By far this was the most annoying part when I was building this sample.

Delete the Data

I do not know why, but Cassandra is very particular about DELETE statements. If I had to venture a guess, it's likely restricted due to the replication that needs to happen to finalize the delete operation. It also appears that, if you want to delete, you have to provide a condition; a bare DELETE FROM <table> to delete everything does not appear to be supported, which I again attribute to replication considerations.

Instead you have to use TRUNCATE to clear the table.

Thoughts

Cassandra is a big topic and I need more than a partial week to fully understand it, but overall my initial impression is good and I can see its use case. I look forward to using it more and eventually using it in a scenario where it's needed.

Here is the code for my current Cassandra test, I recommend having Docker installed as it allows this code to be completely self contained, cheers.

https://gitlab.com/xximjasonxx/cassandra-sandbox

Speaking at TechBash

Proud to announce that I have accepted the offer to speak at TechBash in Pennsylvania. I will be giving my talk on Blazor, which is Microsoft's WebAssembly SPA framework (recently blogged here), and I will also be debuting my newest talk, Building Serverless Microservices with Azure, a topic I have touched on here as part of a larger series.

This will be my first time at TechBash and I have heard great things from past speakers, so I very much look forward to this event. I hope to see you there.

Setting up Kubernetes on GKE

Kubernetes is the new hotness when it comes to orchestrating containers for both development and production scenarios. If that sentence doesn't make sense to you, don't worry, it didn't make sense to me either when I first got started.

One of the biggest trends in server development right now is the microservice pattern, whereby we break up our monolithic backends into smaller services that we can then mash together to support various scenarios. Containers helped make this process more efficient than straight server deploys, but as our deployments got larger and our availability needs grew, things got complicated.

This is where the concept of orchestration comes in. Orchestration is a higher level tool which manages our sets of containers and ensures that we maintain operational status. Kubernetes is such a tool, developed by Google, which has gained immense popularity for its extensibility, efficiency, and resiliency (it can self heal). What’s more, since multiple containers can run on the same VM we can now more efficiently utilize our server resources, as opposed to previous models which saw an overuse of VMs to manage load.

Kubernetes has gained such a following that Azure, Amazon, and Google all provide managed versions enabling quick deployments without having to set everything up yourself (which is beyond painful). In this article, we are going to explore setting up a simple microservice on Google Cloud Platform.

Our Microservice

As part of my prototype retail app I created a simple login service. The code is not important; just know that it utilizes PostgreSQL and Redis to handle user authentication and JWT token persistence.

Our goal is going to be to stand up a Kubernetes Cluster, in Google, to serve out this microservice.

A VERY Quick Intro to Kubernetes

Kubernetes is a HUGE topic that is getting larger every day; this blog post won't even cover a third of those concepts but, principally, there are several you need to know for what I am going to say to make sense:

  • Cluster: A collection of nodes (think VMs, but not really) to which we can provision resources, including Pods, Deployments, Services and others.
  • Pod: The most basic unit of functionality within Kubernetes. A Pod hosts one or more containers (created from Docker; if you don't know what these are, click here). A Pod can be thought of, conceptually, as a machine.
  • Deployment: A set of rules around how we deploy Pods. This can include how many replicas exist and what the action is if a container fails.
  • Service: Services allow Deployments of Pods to communicate either internally or externally. A Service can act as a load balancer for your Deployment (if you have more than one Pod). Logically speaking, it is what provides access into a cluster.

Overall, the goal of Kubernetes (often abbreviated k8s) is to specify and adhere to an ideal state. For example, I can say I want three replicas which hold my API code, and Kubernetes will guarantee that, at any given time, there will always be AT LEAST 3 Pods taking requests via the Deployment specification. I can also specify scaling conditions which allow Kubernetes to increase the number of Pods if necessary.
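As a sketch, that “three replicas” ideal state is declared in a Deployment roughly like this (image and names are hypothetical):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: login-api
spec:
  replicas: 3            # Kubernetes keeps at least 3 Pods running
  selector:
    matchLabels:
      app: login-api
  template:
    metadata:
      labels:
        app: login-api
    spec:
      containers:
        - name: login-api
          image: gcr.io/my-project/login-api:latest   # hypothetical image
          ports:
            - containerPort: 80
```

If a Pod dies, the Deployment's controller notices the divergence from the declared state and schedules a replacement automatically.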

I often like to refer to Kubernetes as a “mini cloud” and in fact, that is what it is going for, a native cloud that can live in any Cloud. This lets us have all of the advantages of a Cloud without being tied to it.

Image Repositories

Throughout this process we will create Docker images which run as containers in our Pods. Google provides an automated way (via Container Registry -> Build Triggers) to run Dockerfile code and build images that can then be referenced in the Kubernetes configuration file. I won't be covering how to set this up; instead I will assume you have already published your microservice container somewhere.

Managed External Services

As I said above, we can often think of Kubernetes as our own personal Cloud that we can move around and run anywhere (both in the Cloud and on-prem). In my view, this is great for service code, but it loses its appeal for me when we start talking about also running things like a database or a Redis cluster. You can do this, but since most Cloud providers have managed versions of these tools which already auto-scale and are reasonably cost effective, I would rather connect to those than try to run my own. I will be using Google-managed Postgres and Redis in this example.

Create & Connect to Your Cluster

Assuming you have a Google Cloud account and have already created a project, expand the Tools side menu and select Kubernetes Engine from underneath the Compute section.

This next part is very important and not really documented from what I saw. When the cluster finishes creating, you will see a Connect button next to it. This button will give you the credentials needed to connect to the cluster you just created. This is important because MANY of the options you will want to use are not available through the UI yet; this includes access to Secrets, which are necessary to connect a Google managed database.

Run the given command in a terminal; the important line you are looking for is marked with kubeconfig. This is the context in which kubectl will run. Now we need to tell kubectl to use this context; to do that, execute the following command in the terminal (note my cluster name is test-cluster):

kubectl config use-context <cluster-name>

Sometimes this won't work; it will say the context does not exist. In this case, run the following command to figure out what name the context is actually registered under:

kubectl config get-contexts

It is usually pretty obvious which one is yours. Rerun the use-context command with the correct name. Congrats, you are connected.

Create the Proxy Account

Like most Cloud providers, Google advises you not to permit direct access to your managed service but rather to utilize a “sidecar” pattern. In this pattern, we provision an additional container alongside the application containers in our Deployment's Pods, and access Postgres through that proxy using a proxy account.

This means that our main container does not itself have access to Postgres but rather must go through the proxy. I won't bore you with instructions on how to do this; Google has done a good job laying it out:

https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine

Work up to Step 5 and then come back so we can talk about Secrets.

Creating our Secrets

I am not going to dive too deeply into how to create managed Postgres and Redis services on GCP; they have documentation that covers that well enough. However, where things get a bit tricky is connecting to these services.

Kubernetes' goal is to deploy and orchestrate containers relative to your “ideal state”. Containers can be provided environment variables, which allows them to pull in environment specific values, a necessary element of any application.

On top of this, production environments commonly require sensitive values like usernames and passwords for databases. This is where Secrets come in: we can register these values with the Kubernetes cluster and then inject them as environment variables, which also keeps these sensitive values out of source control.

Continue with https://cloud.google.com/sql/docs/postgres/connect-kubernetes-engine. This was the hard part for me because the part about connecting kubectl up to my cluster was not clear.

After you register your secrets, come back here and I will elaborate more on updating your Pod configuration.

Creating the Configuration File

As it stands today, the primary way to set up Kubernetes is to use YAML files. There are a host of tools in development to alleviate this burden but, for now, it's the best way to go. I want to present you with this YAML file, which shows how to structure your config file so you can connect to Postgres:

https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/master/cloudsql/postgres_deployment.yaml

Within this file we can see that we specify the DB_HOST as 127.0.0.1. This works because of the sidecar pattern.

If you examine the YAML file you will notice our Pod (spec->containers) contains TWO containers, one of which uses the gce-proxy image; this is the Cloud Proxy, and since the two containers are running in the same Pod they are, effectively, on the same machine. Thus 127.0.0.1 refers to the loopback for the Pod, which can see both containers.

Next, the DB_USER and DB_PASSWORD values are being extracted from the secrets we created in previous steps. The names should match whatever you used; I personally like to name my credential file with a hint to which service it refers, since I have more than one database being referenced within my cluster (for other services).
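The relevant portion of such a config, pulling credentials from a Secret into environment variables, looks roughly like this (the secret name cloudsql-db-credentials follows Google's guide; yours should match whatever you registered):

```yaml
env:
  - name: DB_HOST
    value: 127.0.0.1          # the Cloud SQL proxy sidecar on the same Pod
  - name: DB_USER
    valueFrom:
      secretKeyRef:
        name: cloudsql-db-credentials   # secret name from the registration step
        key: username
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: cloudsql-db-credentials
        key: password
```

The application code just reads ordinary environment variables and never knows the values came from a Secret.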

Honestly, my recommendation here is to follow the YAML file I referenced above as you build your config file; it really is the best way. Bear in mind you do NOT need to define a Service in this file; this is strictly for your Deployment (which GKE refers to as a Workload). You can define the Service here if you like, and for more complex projects I recommend it, so as to keep everything together.

Once you have completed this file, we need to apply it. Run the following command:

kubectl apply -f <your yaml file>

This will connect and apply your configuration. Return to GKE and click Workloads. After some time, things should refresh and you should see your Pods running.

Debugging

As you can imagine, debugging is pretty hard with Kubernetes, so you will want to get most of it out of the way ahead of time. Here are some tips:

  • Make use of minikube for local development. Get things in order before you deploy, including secrets.
  • Add console logging to your code. You can access the Pod log files right from the GKE console. Important: if you have multiple replicas, you might need to enter each one to figure out which one got hit. I recommend starting with a replica count of 1, which makes this easier.
  • Set your imagePullPolicy to Always. I have found that sometimes Google will recreate a container from an existing image and you won't get your latest changes.
  • If you want to delete things, you can use kubectl or the UI. I recommend waiting for everything to be removed before redeploying.

Creating the Service

Your code is now deployed into Kubernetes, but you won't be able to access it without a Service. My preference is to define it in my YAML file; here is an example:

[Screenshot: Service definition YAML]

Alternatively, you can do this straight from the UI. The important things are the port and the type, here set to LoadBalancer. Other types include ClusterIP, which is predominantly used for internal services that are not accessible from outside the cluster.
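A sketch of what that Service definition might look like in YAML (names and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: login-api-service
spec:
  type: LoadBalancer     # provisions an external IP on GKE
  selector:
    app: login-api       # must match the labels on your Deployment's Pods
  ports:
    - port: 80           # port exposed by the load balancer
      targetPort: 80     # port the container listens on
```

The selector is what ties the Service to the Pods; the Service then spreads incoming traffic across every matching replica.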

Closing Thoughts

My goal is eventually to do a much larger series on Kubernetes and microservices, since this has become something I am very focused on and is a huge trend in development right now. I hope this article gets people thinking about microservices and how they might use Kubernetes.

There is a lot of discussion around Docker vs Serverless for microservices. Truthfully, this is a false dichotomy since, in most cases, we would want to host our serverless functions in Docker containers. There is even a project, Kubeless, which allows you to map serverless function events into Kubernetes hosted functions. And there is CloudEvents, which seeks to standardize the event schema on Cloud platforms so events can be received the same way regardless of platform.

The real power of Kubernetes (and similar Cloud native systems) is what it enables: a platform that can efficiently use the resources given to it by the underlying provider. We have now reached a point where the big three providers offer Kubernetes as a managed service, so we are past the point where it was too hard for most to stand up. The next goal needs to be making configuration easier through better tooling.

Source code for my login service: https://gitlab.com/RetailApp/retail-app-login-service

An Introduction to Blazor

As a full stack developer it is important to always keep your ear to the ground and keep up with the latest and greatest. One of the more interesting projects to come out in the past couple of years is WebAssembly. No, it's not the fabled “JavaScript killer”, at least not right now. But its maturation could one day legitimately usher in a time when JavaScript is not the only language for the web and the web is simply a compile target.

Blazor is a project put together by Steve Sanderson (of Microsoft fame) that uses WebAssembly and Mono to build C# applications for the browser, without plugins or extensions. It simply compiles the .NET Core code you write to IL, which is then targeted to WebAssembly (put very simply). This project is very much in its early stages and has not (as of this writing) received official backing from Microsoft beyond being an “experiment”.

But Blazor represents an idea, one that I think we will see more of in the coming years as developers seek to utilize WebAssembly to open the web up to new programming models. With that in mind, let's talk about how to get started with Blazor.

The Setup

For Blazor to work you will need Visual Studio 15.7 (or later), .NET Core 2.1 GA (or later), and the Blazor Language Services extension (here). Once all of these are installed, you are free to create a Blazor application using the standard ‘Create New ASP .NET Core Application’ template (there will be two Blazor related options in the subsequent dialog).

[Screenshot: New Blazor app dialog]

Blazor
This template is your typical SPA style application that can be deployed to any web site; I refer to this as the React/Angular SPA style application. In it you can make various web calls to APIs not your own and bring back data. This is the template that I currently favor

Blazor (ASP .NET Core hosted)
Similar to the OG Blazor template, this one includes a .NET Core Web API backend that you can call into. One of the advantages of Blazor is that you can truly share code between the backend and frontend without issue. At the time of this writing, however, this template had stopped working for whatever reason. This is the approach you would take if you want your SPA to work with an API of your own; you would need to deploy the API to a separate location from the web content, as is the case with any SPA deployment.

For now, I suggest using the Blazor template. You can run it straight out of the gate. It has two nice examples of client side operations: Counter, which uses C# to track click counts, and FetchData, which shows async init operations in action by loading a local JSON file (the Hosted template actually retrieves this data from a web service).

Component Driven Design

The concept of a “component” in UI design is nothing new; I remember UserControls from way back in the day in ASP .NET, and now we have Angular and React embracing components as the way to go for SPA applications.

In Blazor, you create Razor files (.cshtml) in which you embed both C# and HTML; these are components and they are surfaced as such. In the Index file of this sample there are three components: ForecastDayList, LocationDisplay and CurrentWeather.

[Screenshot: Index markup showing the three components]

This is a very common approach with web content these days and it makes it easier to design and do things like CSS isolation and emphasize reusability. The cool thing here is these components are surfaced for us automatically by .NET without any using statements.

Routing

This is somewhat new; the first time I saw Blazor at the MVP Summit, it didn't have routing, though its importance as a feature was discussed.

[Screenshot: @page routing directive]

Before, we had to refresh the page entirely, and don't even get me started on adding parameters. Now you can just drop this directive at the top and everything is handled for you (though you can replace it if you like). If you want to add parameters, use the {parameterName} syntax, analogous to the :pName syntax you will see in React, Angular, and others.

The reason this works is because, again, Blazor adopts the component design pattern. All these routes do is say “if you see this path, load this component” – we do the same thing in React and Angular.

Event Handling

Much like in traditional ASP .NET applications, C# code can be embedded in these Razor pages using the Razor syntax (the now famous @ prefix). It offers a clean and simple way to elegantly blend code with markup.

For HTML elements we can use shorthands that let us reference our C# code directly. The default template has a great example of this with Counter.

Selection_002

One thing I will point out is to be careful using reference variables, since they will be null to start in most cases. We can see the use of the onclick attribute, which is a special keyword for Razor in this case and points at our method, IncrementCount. The page automatically re-renders as the value changes; newly released lifecycle methods let you hook into this process, such as OnAfterRender (here).

This is the simplest example of handling interactivity on pages; we can do more complex things, like using async calls to bring down data from a remote source.

Handling Async

One of the coolest features in the .NET world, and one that has begun to find its way outside of it (async/await landed in JavaScript with ES2017), is async/await, .NET's syntactic sugar for writing asynchronous operations. For the uninitiated, async/await is NOT parallel programming: it creates a sort of event loop in the background where code waits for an operation to complete and then resumes at that point. Other code may execute in the interim, whereas parallel programming is doing multiple things at once. With async/await the system is still utilizing a single thread.

The ability to bring this feature to SPA programming is, in a word, awesome.

Selection_005

This file can be found in full on my GitHub here – I tried to get as much of it here as possible. There looks to be a lot going on but, we can take this as a very small slice and not worry too much about the rest.

When you specify an event handler you can choose to return either void or Task (the latter for a method marked as async). This differs from traditional ASP .NET, where the old event handler signature necessitated that void be allowed as a return from an async method, something you don't ever want to do (but that is another story).

Our page defines a number of properties and uses the bind attribute to record the user-entered value. When executed, this operates as any async method does. And as we saw with the Counter page, the page re-renders as the variables change, in this case populating the data for our other components.

You can view this application, deployed as HTML to S3, here.

Dependency Injection

Also in use here is the @inject handler, which allows us to inject dependencies into our components. You define these in the Program.cs file of your main; it is very similar to what we see with ASP .NET Core server apps. Based on the type you register for the dependency, you can @inject that type.

As an aside, I noticed while doing this that you will always want to @inject HttpClient, as there appears to be a special version of sorts in use (or it has certain configurations). The out-of-the-box one in System.Net does not seem to work. Just an FYI.

This is awesome, I want to use it for my next app!!

I wouldn't. Blazor is still VERY experimental and missing some key features, which makes it unsuitable for a mission critical application; the fact that Microsoft has not officially recognized it should make you wary. The key things are:

  1. Lack of debugging within Visual Studio. You are wholly reliant on Console.WriteLine for debug information, as the Mono team has not yet added debugging hooks. I expect this will come soon though
  2. Application size. If you are familiar with Xamarin and Mono you know there is a vast difference in size between the assembly you create during development and the one you deploy to the AppStore. Blazor does not yet support the AOT compilation that would get the binary down to a palatable download size. That is the biggest reason, in my mind, to avoid this in production for right now

The main point here is to play with this and offer feedback to the team working on Blazor at Microsoft. They are very active on this project and many of the requests that I have made have been fulfilled. The team is well aware of the project's current shortcomings and is working to eventually make this a legitimate offering to the larger community.

What is the future?

The reaction to WebAssembly from some of the hardcore web developers I know has been interesting. Some see it as just another would-be silver bullet à la Flash and Silverlight that will ultimately lose to JavaScript. That may be true, but I don't recall Flash or Silverlight ever being a standardized technology with this rate of adoption. If anything, the rise of WebAssembly could alter the landscape for web developers over the next few years and give us more options.

I think a lot of the negativity is born out of people who don't like JavaScript seeing WebAssembly and reaching for the “JavaScript is dead” hammer; I've certainly seen quite a few articles going in that direction. My take: WebAssembly is an alternative. There are many things in JavaScript that are, well, awkward. For example, while I can create a 3D game in JavaScript, chances are C++ might be better for it. WebAssembly gives me the option to run that code natively in the browser. But would I want to use C++ to deliver a line of business application? Probably not.

But that gets at the heart of the problem on the web. Right now, it's only JavaScript. So regardless of what you want to do, you really only have one language you can do it in. Thankfully that language has proven flexible enough to carry us this far but, maybe with WebAssembly, web developers will finally get to a point where we can pick the best language for the job. Is it so bad that I could pick from Swift, Python, C#, or Java as my language and have it run natively in the browser? I don't see that as a problem; I see that as helping us move the web forward.

There is a chance that, one day, WebAssembly could supplant JavaScript, but that day is far off at best and, if you listen to the creators of WebAssembly talk, it has never been the goal. It has always been about filling the gaps and giving developers more options to do things that might be awkward in JavaScript.

Regardless, that conversation goes off in a direction not entirely related to Blazor, but I think Blazor is the first of, I hope, many such tools that give us more options when developing web applications.

Docker: The Barest of Introductions

For the uninitiated, Docker is a tool which enables containerization of an application's dependencies. This helps with adhering more closely to the 12 Factor App principles, which are strategies designed to keep the various environments used in development in sync and to mitigate the “works on my machine” problem.

But it is so much more than that. True, Docker has found its best use as a means of providing consistent deployments, but I see it as much more. I see containerization as changing the way we develop applications because it lets us, as developers, do what we love, which is play with new stuff, while still keeping our development environments consistent. It normalizes the notion that everything should live together as a unit, which makes deployment and management so much easier.

Perhaps it sounds grandiose when put this way, but I truly believe it. The ability for me to do LAMP or LEMP without any of those things installed, or to dabble with Go without installing a compiler, is huge. I imagine this being the way I want development to go from now on: a project starts and the lead (or whomever) creates the Dockerfile or the docker-compose file. Developers can then start almost immediately without having to worry about what is installed on their machine and how it might impact what they are working on. We can store these files with our source, allowing us to take the environment wherever we want to go. I truly find the scenarios enabled by Docker to be amazing and game changing.

The Basics

You download Docker here: https://store.docker.com/search?type=edition&offering=community

Docker is built on the idea of images; effectively these are the templates for the containers which run your apps. The Docker daemon (installed above) will automatically pull an image if you request one that does not exist locally; by default it pulls from Docker Hub. Alternatively, you can pull an image yourself. Here I am pulling the latest version of aspnetcore, which is an image for a container that has the latest .NET Core Runtime installed:

docker pull microsoft/aspnetcore:latest

latest is a tag here to get the newest image available; alternatively you can request a specific version tag such as aspnetcore:2.0.103. By doing this you can pull down a new version of a runtime and see how your code functions in it. A great check before an en masse update.

Once you have the image, you need to create a container. This is done using the run command. You can create many containers from the same image. Containers can be long running (web servers) or throwaway (executors). Below I will run our image as a container:

docker run --name core-app microsoft/aspnetcore:latest

If you run the above it will not do much. This is because, while we can think of a container as a VM conceptually, that is not what it is. I like to think that a container must exist for a purpose, which is contrary to a VM, which exists to be used for something. Considered in this light, our container above simply starts and then exits. Trick: you can actually see it if you run this command

docker container ls -a

This lists all of the containers on our machine, even those that are not running. So how can we give our container a purpose? For aspnetcore, it needs a web server or some other process to run. When dealing with Docker you need to consider the purpose of the container, as that is what will drive what the container needs.

To demonstrate a running container, we are going to go with something a bit simpler: a Go environment. We will write Go code locally and then run it through the container and observe the output. Our container does not need to be long running in this case; it exists only long enough to compile and execute our Go code. Let's get started.

Building a Go Development Environment with Docker

As always, we will be writing code, so you will need an editor; pick your favorite. I tend to use VSCode for most things these days, and it has a Go extension. You will need to disable various popups that complain about not finding Go in the path. It won't be there because we are not going to install it.

Above we talked about some of the basic commands and only referenced the Dockerfile in passing. But this file is crucial: it is the core building block for an application built on Docker, letting you take an existing image, customize it, and create your own.

Here is the file I have created for my environment

FROM golang:1.8
WORKDIR /src
COPY ./src .
RUN go build -o startapp
WORKDIR /app
RUN cp /src/startapp .
ENTRYPOINT [ "./startapp" ]
What this says is:
  • Pull the golang image tagged as 1.8 from a known repository (Docker Hub in this case)
  • Change working directory on the image to /src (will create if it does not exist)
  • Copy the contents of the host at ./src to the working directory (I have a folder at the root of my project called src where all code files reside)
  • Run the command go build -o startapp – this will run the Go compiler and output an executable called startapp
  • Change working directory to /app (create if it does not exist)
  • Run the copy command to move the created executable to /app
  • Set container entrypoint as the startapp executable in the working directory

In effect, this copies our local code into the image, runs a command, and copies the output of that command to a directory. Setting ENTRYPOINT tells Docker what it should call when the container is started. Remember how our run command above just exited? That is because we never told it what to do; here we do.

Here is a basic Go Hello World program; I have stored this at /src/start.go

package main
import "fmt"
func main() {
   fmt.Printf("Hello World")
}
This does nothing more than print “Hello World” to the screen. To do this, first run the following command:
docker build -t go-app-hello-world .

 

This command will directly invoke the Dockerfile in the local directory. Per this, Docker will construct our image using golang:1.8 as a base. The -t option allows us to tag the image with a custom name. Once things finish up, use this command to see all of the images on your machine.

docker images

If this is the first time you have used Docker you should see two images in this list, one of which bears the name you gave above with -t

Ok, so now we have our image and we want to run it as a container. To do that, we use the docker run command, which will also give us the output we want:

docker run --rm --name go-hello go-app-hello-world

Here is a shot of my console.

console

A few things with this run command:

  • In Docker, container names must be unique. Since our container exists SOLELY to run our Go code, we don't want it hanging around, even in a stopped state. The --rm option ensures that the container is removed once we are done
  • --name does what you expect and gives our container a name; if this is omitted, Docker will provide a name for us, some of which can be quite amusing
  • go-app-hello-world is our target image. This is the image we will use as the template

Congratulations, you have run Go on your machine without installing it. Pretty cool eh?

Expanded Conversations

What we have demonstrated here is but a sliver of the scenarios that containerization (here through Docker) opens up. If we go beyond this and consider a general development scenario, we are also locking ourselves to THIS version of Go. That means I can install whatever I want on my local machine in the way of tools, applications, and SDKs with no fear of relying on something that would otherwise not be available in production. This principle of isolation is something we have long sought as a way to ensure consistent production deployments.

But there is more to it. Using containers allows for better resource-use scenarios in the Cloud through orchestration tools like Kubernetes, Mesos, and Docker Swarm. These tools enable codified, resilient architectures that can be deployed and managed in the Cloud. And with containerization your code becomes portable, meaning if you want to move to AWS from Azure you can, or from AWS to Google. It really is amazing. I look forward to sharing more Docker insights with you.

Google Cloud: Backend and Vision API

This post is a follow-up to the first post I made (here) about building a Cloud Function in Google Cloud to accept an image and put it in my blob storage. This is a pretty common scenario for handling file uploads on the web when using the Cloud. But by itself it is not overly useful.

I always like to use this simple example to explore a platform's deferred-processing abilities: using a trigger to perform Computer Vision processing on the image. It is a very common pattern for deferred processing.

Add the Database

For any application like this we need a persistent storage mechanism. Our processing should not happen in real time, so we need to be able to store a reference to the file and update it once processing finishes.

Generally when I do this example I like to use a NoSQL database, since it fits the processing model very well. This time, however, I decided to deviate from my standard approach and opt for a managed MySQL instance through Google.

gcp5

This is all relatively simple; just make sure you are using MySQL and not PG. It is not clear to me that, as of this writing, you can easily connect to PG from a GCF (Google Cloud Function).

The actual provisioning step can take a bit of time, so just keep that in mind.

At this time, there does not appear to be any sort of GUI for managing the MySQL instance, so you will need to remember your SQL and drop into the Google Cloud Shell.

gcp6

Once you are here you can create your database. You will want a table within that database to store the image references. Your schema may vary; here is what I chose:

gcp7

The storageName is unique since I assign it during upload so that the name in the bucket matches the unique name of the image in both spots, which allows me to do lookups. The rest of the values are designed to support a UI.

Inserting the record

You will want to update your upload trigger to insert a record into this table as the file comes across the wire. I won't show that code here; you can look at the backend trigger to see how to connect to the MySQL database. Once you have that figured out, it is a matter of running the INSERT statement.

Building the Backend Trigger

One of the big draws of Serverless is the ability it gives you to integrate with the platform itself. By their nature, Cloud platforms can easily produce events for the various things happening within them. Serverless functions are a great way to listen for these events through the notion of a trigger.

In our case, Google raises an event when new blobs are created (or modified), among other actions. So we can configure our trigger to fire on this event. Because Serverless scales automatically with load, this is an ideal way to handle this sort of deferred-processing model without having to write our own plumbing.

Google makes this quite easy: when you create a Cloud Function you are asked what sort of trigger you want to respond to. For our first part, the upload function, we listened for an HTTP event (a POST specifically). In this case, we want to listen to a particular bucket for when new items are finalized (that is Google's term for created or updated).

gcp8

You can see we also have the ability to listen to a Pub/Sub topic. This gives you an idea of how powerful these triggers can be; normally you would have to write a polling service to listen for events, and this does it for you automatically.

The Code

So, we need to connect to MySQL. Oddly enough, at the time of this writing, this was not officially supported, due to GCF still being in beta from what I am told. I found a good post which discussed this problem and offered some solutions here.

To summarize the link: we can establish a connection, but there appear to be some questions about the scalability of that connection. Personally, this doesn't seem like something Google should leave to a third-party library; they should offer a Google Cloud specific mechanism for hitting MySQL in their platform. We shall see if they offer this once GCF goes GA.

For now, you will want to run npm install --save mysql. Here is the code to make the connection:

(This is NOT the trigger code)

gcp9

You can get the socketPath value from here:

gcp10

While you can use root to connect, I would advise creating an additional user.

From there it's a simple matter of calling update. The event which fires the trigger includes the name of the new/updated blob, so we pass it into our method; I call it imageBucketName.

Getting Our Image Data

So we are kind of going out of order, since the update above only makes sense if you have data to update with, which we don't yet. What we want to do is use the Google Vision API to analyze the image and return a JSON block representing various features of the image.

To start, you will want to navigate to the Vision API in the Cloud Console and make sure the API is enabled for your project. Pretty hard to miss this, since they pop up a dialog to enable it when you first enter.

Use npm install --save @google-cloud/vision to get the necessary library for talking to the Vision API from your GCF.

Ok, I am not going to sugarcoat this: Google has quite a bit of work to do on the documentation for this API; it is hard to understand what is there and how to access it. Initially I was going to use a Promise.all to fire off calls to all of the annotators. However, after examining the response of the Annotate Labels call I realized the API was designed with batching in mind. This led to a great hunt for how to do that. I was able to solve it, but I literally had to spelunk the NPM package to figure out how to tell it what APIs I wanted it to call. This is what I ended up with:

The weird part here is that the docs seemed to suggest I don't need the hardcoded strings, but I couldn't figure out how to reference them through the API. So, I still have to work that out.

gcp11

The updateInDatabase method is the call I discussed above. The call to the Vision API ends up generating a large block of JSON that I drop into a blob column in the MySQL table. This is a big reason I generally go with a NoSQL database: these sorts of JSON responses are easier to work with there than in a relational database like MySQL.

Summary

Here is a diagram of what we have built in the test tutorials:

GoogleImageProcessorFlow

We can see that when the user submits an image, that image is immediately stored in blob storage. Once that is complete we insert the new record into MySQL, while at the same time the Storage Trigger can fire. The reason I opted for this approach is that I don't want images referenced in MySQL that don't exist in storage, since this table is where the query results for the user's image list will come from and I don't want blanks. There is a potential risk that we could return from the Vision API before the record is created but, that is VERY unlikely just due to speed and processing.

The Storage Trigger Cloud Function takes the image, runs it against the Vision API and then updates the record in the database. All pretty standard.

Thoughts

So, in the previous entry I talked about how I tried to use the emulator to develop locally; I didn't here. The emulator just doesn't feel far enough along to be overly useful for a scenario like this. Instead I used the streaming logs feature for Cloud Functions and copied and pasted my code in via the Inline Editor. I would then run the function, with console.log, and address any errors. It was time consuming and inefficient but, ultimately, I got through it. It would terrify me for a more complex project though.

Interestingly, I had assumed that Google's Vision API would be better than Azure's and AWS's; it wasn't. I have a standard test for racy images that I run, and it rated the picture of my wife as more racy than the bikini model pic I use. Not sure if the algorithms have not been trained that well due to lack of use, but I was very surprised that Azure is still, by far, the best and Google comes in dead last.

The one nice thing I found in the GCP Vision platform is the ability to find occurrences of an image on the web. You can give it an image and find out if it appears somewhere else on the public web. I see this being valuable for enforcing copyright.

But my general feeling is that Google lacks maturity compared to Azure and AWS; a friend of mine even called it “the Windows Phone of the Cloud platforms,” which is kind of true. You would think Google would have been first, being that they more or less pioneered the horizontal computing that is the basis for Cloud Computing, and that their ML/AI would be top notch, as that is what they are more or less known for. It was quite surprising to go through this exercise.

Ultimately the big question is: can Google survive in a space so heavily dominated by AWS? Azure has carved out a nice chunk but really, the space belongs to Amazon. It will be interesting to see whether Google keeps up its attention and carves out a nice niche, or ends up abandoning its public cloud offering. We shall see.