Using Blobs with Windows Azure

So before I left for Japan I went to the Azure Bootcamp in Southfield, MI. My goal was to understand how I could potentially use Azure to deliver better solutions for customers. To that end, upon my return I have started to play with what I learned and see where it takes me.

Among the many facets of Azure that intrigue me, I decided to play with storage first. Coming back from Japan, I have about 600 pictures that I wanted to manage somehow, so I figured I would see how I could work with images. To this end, I created a very basic ASP.NET Web Application with a couple of pages: one to add containers and then add blobs to those containers, and another to view those blobs and delete them.

Here is the code for listing the containers in a particular storage account. Note that all of this code is using UseDevelopmentStorage=true, so it's going local.

   1: var account = CloudStorageAccount.
   2:     FromConfigurationSetting("DataConnectionString");
   3: var client = account.CreateCloudBlobClient();
   4:
   5: var containerList = client.ListContainers().ToList();
   6: ddlContainers.Items.Clear();
   7:
   8: if (containerList.Count == 0)
   9:     ddlContainers.Items.Add(new ListItem() {
  10:             Text = "No Containers Available",
  11:             Value = string.Empty
  12:     });
  13: else
  14: {
  15:     foreach (var cloudBlobContainer in containerList)
  16:         ddlContainers.Items.Add(new ListItem {
  17:             Text = cloudBlobContainer.Name,
  18:             Value = cloudBlobContainer.Name.ToLower()
  19:         });
  20: }

So we have the standard call to set up our reference to the storage account in the Cloud on line 1; again, our DataConnectionString is set to UseDevelopmentStorage=true. The Azure SDK contains a number of client creation calls for the various storage types. In this case, as we will be working with Blobs, we create a CloudBlobClient.

It is important to understand that Blobs are partitioned into Containers, which have effectively no size limit (well, it's not unlimited, but it's insanely high, such that you could never hit it unless you really tried). Blobs, however, have a 1TB size limit.

The code should be fairly straightforward. We will get a list of the containers within this storage account and then add them to a Drop Down List in our view.

In my example, the page shows the list of containers so the user can pick which container the newly created blob should go into.

The next piece of code we will look at is the code for creating containers. This code does two things: first it creates a new container in the storage account, then it updates the dropdown list of containers that we referenced earlier. Here is our code snippet:

    var account =
        CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var client = account.CreateCloudBlobClient();

    // Container names must be lowercase, so normalize the user's input.
    var containerName = txtContainerName.Text.Trim().ToLowerInvariant();

    // Create the container only if it is not already there.
    var container = client.GetContainerReference(containerName);
    container.CreateIfNotExist();

    // Allow anonymous read access to the container and its blobs.
    var permissions = container.GetPermissions();
    permissions.PublicAccess = BlobContainerPublicAccessType.Container;
    container.SetPermissions(permissions);

    // Refresh the dropdown so the new container shows up.
    BindContainerList();

The CreateIfNotExist pattern occurs throughout Azure and is the recommended way of creating elements (tables, containers, queues). For our new container we also set the permissions. Permissions can be set to one of the following:

  • No Public Access
  • Full Public Read Access
  • Public Read Access for Blobs only
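
As a rough sketch of how those three options line up with the managed library: the 1.x StorageClient library exposes them as the BlobContainerPublicAccessType enum. The ContainerAccess class and its Describe method below are hypothetical names of my own, and the snippet assumes the Microsoft.WindowsAzure.StorageClient assembly is referenced.

```csharp
using System;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical helper (not part of the SDK): maps each value of the
// BlobContainerPublicAccessType enum to the option it stands for.
public static class ContainerAccess
{
    public static string Describe(BlobContainerPublicAccessType access)
    {
        switch (access)
        {
            case BlobContainerPublicAccessType.Off:
                return "No public access";
            case BlobContainerPublicAccessType.Container:
                return "Full public read access";
            case BlobContainerPublicAccessType.Blob:
                return "Public read access for blobs only";
            default:
                throw new ArgumentOutOfRangeException("access");
        }
    }
}
```

The container-creation snippet above uses Container, which is the full public read access option.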

The documentation for this topic is available at

http://msdn.microsoft.com/en-us/library/dd179391(v=MSDN.10).aspx

Notice that the majority of the Azure documentation speaks to the REST API. Azure is built first as a REST API, but many of these calls have been wrapped in managed code, and it is recommended that the managed library be used for actual development. The vast majority of the features are supported by the managed library.

The call to BindContainerList encapsulates the code we wrote above for listing containers. It is used to give the user access to the newly created container.

The next example is perhaps the most important: how do we get data into the container? In my tests, I worked mostly with pictures that I brought back from Japan. These are my raw pictures, about 2.2MB apiece. This is the code snippet:

   1: string extension = System.IO.Path.GetExtension(fileUpload.FileName);
   2:
   3: var account =
   4:     CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
   5: var client = account.CreateCloudBlobClient();
   6: var container =
   7:     client.GetContainerReference(ddlContainers.SelectedItem.Text);
   8: var blob = container.GetBlobReference(Guid.NewGuid() + extension);
   9:
  10: blob.UploadFromStream(fileUpload.FileContent);
  11: blob.Metadata["FileName"] = fileUpload.FileName;
  12: blob.Metadata["Size"] = fileUpload.PostedFile.ContentLength.ToString();
  13: blob.SetMetadata();
  14:
  15: blob.Properties.ContentType = fileUpload.PostedFile.ContentType;
  16: blob.SetProperties();

Again, we do the normal routine of setting up our reference to the storage account and then creating the client. Then we select the container based on the data the user has sent us. From this container reference we are able to create a blob reference (line 8).

Now you will notice there is no call to CreateIfNotExist for the blob; only a name is provided. For the sake of uniqueness we use a Guid here, combined with the extension that was parsed from the file name. Think of this as a normal file on the file system: your OS does not care what the type of a file is, it's only concerned with whether the file will fit on the disk. The same is true here; however, the max size of a blob is limited to 1TB.
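
To make the naming idea concrete, here is a minimal sketch that needs nothing from the Azure SDK. BlobNaming and CreateUniqueName are hypothetical names I made up for illustration; the logic is just the Guid-plus-extension approach from the snippet.

```csharp
using System;
using System.IO;

// Hypothetical helper: builds a unique blob name that keeps the uploaded
// file's extension, mirroring the Guid + extension approach shown above.
public static class BlobNaming
{
    public static string CreateUniqueName(string uploadedFileName)
    {
        // GetExtension returns the extension including its leading dot
        // (".jpg"), or an empty string when the name has no extension.
        string extension = Path.GetExtension(uploadedFileName ?? string.Empty);

        // Guid.ToString() yields 36 characters, e.g. "xxxxxxxx-xxxx-...".
        return Guid.NewGuid().ToString() + extension;
    }
}
```

Calling BlobNaming.CreateUniqueName("IMG_0001.JPG") gives a different name on every call, each ending in ".JPG". The original file name is lost this way, which is why the upload code keeps it in the blob's metadata.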

Blobs can also contain Metadata and Properties to provide additional information about the Blob. The Metadata collection is a NameValueCollection which can contain any number of keys. Properties are fixed and identical for every blob; they mostly relate to how the blob should be stored, such as the official content type. Each of these has a corresponding set method (SetMetadata and SetProperties) which must be called to persist the information.
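
If you have not worked with a NameValueCollection before, here is a small stand-alone sketch of the kind of collection the upload code fills in. It does not touch Azure at all; MetadataDemo, Describe, and Format are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Specialized;
using System.Text;

public static class MetadataDemo
{
    // Builds a collection with the same two keys the upload code sets.
    // Values are always strings, so the size is converted with ToString().
    public static NameValueCollection Describe(string fileName, int sizeInBytes)
    {
        var metadata = new NameValueCollection();
        metadata["FileName"] = fileName;
        metadata["Size"] = sizeInBytes.ToString();
        return metadata;
    }

    // Renders every pair as "key=value", one pair per line.
    public static string Format(NameValueCollection metadata)
    {
        var sb = new StringBuilder();
        foreach (string key in metadata.AllKeys)
        {
            sb.AppendFormat("{0}={1}\n", key, metadata[key]);
        }
        return sb.ToString();
    }
}
```

On a real blob, changes like these only live in the local collection until SetMetadata is called.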

The second piece is a page to read the container contents, display them, and remove them if so desired. The markup for this is pretty straightforward: a page with a Repeater with a Literal, Image, and LinkButton in the ItemTemplate. Here is a code snippet from the markup:

    <asp:DropDownList ID="ddlContainers" runat="server"
        AutoPostBack="true" OnSelectedIndexChanged="ddlContainers_SelectedIndexChanged" />
    <hr />
    <asp:Repeater ID="rptBlobs" runat="server" OnItemDataBound="rptBlobs_ItemDataBound"
        OnItemCommand="rptBlobs_ItemCommand">
        <ItemTemplate>
            <asp:Literal ID="litName" runat="server" />
            <asp:Image ID="img" runat="server" /><br />
            [ <asp:LinkButton ID="lbDelete" runat="server" CommandName="Delete" Text="Delete"
                    OnClientClick="return confirm('Really delete this entry?');" /> ]
            <p> </p>
        </ItemTemplate>
    </asp:Repeater>

The code which populates the ddlContainers dropdown is identical to the code we showed in the first example, which reads the container list from the current storage account and populates the dropdown list. However, based on what the user selects, we need to get a list of all the blobs in the selected container and set it as the DataSource of the Repeater. Here is the code snippet:

    var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var client = account.CreateCloudBlobClient();
    var container = client.GetContainerReference(containerName);

    var blobList = container.ListBlobs().ToList();
    rptBlobs.DataSource = blobList;
    rptBlobs.DataBind();

Most of this code should be either familiar or self-explanatory. The next code snippet comes from the ItemDataBound event handler for the repeater:

   1: if (ev.Item.ItemType == ListItemType.Item ||
   2:     ev.Item.ItemType == ListItemType.AlternatingItem)
   3: {
   4:     var item = (IListBlobItem) ev.Item.DataItem;
   5:     var blob = BlobContainer.GetBlobReference(item.Uri.ToString());
   6:     blob.FetchAttributes();
   7:
   8:     var litName = (Literal) ev.Item.FindControl("litName");
   9:     StringBuilder sb = new StringBuilder();
  10:     foreach (var key in blob.Metadata.Keys)
  11:     {
  12:         sb.AppendFormat("{0}={1}<br />",
  13:             key, blob.Metadata[key.ToString()]);
  14:     }
  15:     sb.Append("<br />");
  16:     litName.Text = sb.ToString();
  17:
  18:     var img = (Image) ev.Item.FindControl("img");
  19:     img.ImageUrl = string.Format("BlobReader.ashx?Uri={0}", blob.Uri);
  20:
  21:     var lbDelete = (LinkButton) ev.Item.FindControl("lbDelete");
  22:     lbDelete.CommandArgument = item.Uri.ToString();
  23: }

Understand that, despite the fact that we give the blob a name, we retrieve it via its URI (its name is actually within the URI). Also understand that the list you get from ListBlobs is NOT the blobs themselves, but just proxies that allow you to get basic information about each blob. This makes sense when you consider how big a blob could be; loading it all into memory would be crazy. The same principle applies to the Metadata and Properties. Notice the call to FetchAttributes on line 6. This is what populates the Metadata NameValueCollection and the Properties for the blob reference. Remember, what we get from the ListBlobs call is nothing more than a lightweight reference to what we have in Cloud storage.
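
To show what I mean by the name living inside the URI, here is a small sketch that pulls the container and blob names back out of a blob URI using nothing but System.Uri. BlobUriDemo and its methods are hypothetical names, and the sketch assumes the blob name itself contains no "/" characters, which holds for the Guid-based names used here.

```csharp
using System;

public static class BlobUriDemo
{
    // The blob name is the last path segment of the URI.
    public static string GetBlobName(Uri blobUri)
    {
        string[] segments = RequireSegments(blobUri);
        return Uri.UnescapeDataString(segments[segments.Length - 1]);
    }

    // The container name is the segment just before the blob name.
    public static string GetContainerName(Uri blobUri)
    {
        string[] segments = RequireSegments(blobUri);
        return segments[segments.Length - 2].TrimEnd('/');
    }

    // Uri.Segments always starts with "/", so a blob URI needs at least
    // three entries: "/", "container/", and "blob".
    private static string[] RequireSegments(Uri blobUri)
    {
        string[] segments = blobUri.Segments;
        if (segments.Length < 3)
            throw new ArgumentException("Not a blob URI.", "blobUri");
        return segments;
    }
}
```

With development storage a blob URI looks like http://127.0.0.1:10000/devstoreaccount1/japan/photo.jpg (the container and file names here are made up), so GetContainerName returns "japan" and GetBlobName returns "photo.jpg".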

The one thing that may be curious to you is line 22, where I am assigning the URI to the CommandArgument of my LinkButton. I will explain this momentarily. Until then, think about ways we could implement uniqueness among the blobs in a container.

In this application, I am working with images. The Image control in ASP.NET needs a URL rather than the binary data from my storage account, so I am using an ASHX handler to perform the binary read for me. Here is the code snippet:

    var account = CloudStorageAccount.FromConfigurationSetting("DataConnectionString");
    var client = account.CreateCloudBlobClient();
    var blob = client.GetBlobReference(Uri);

    // The content type is hard-coded to JPEG here.
    context.Response.ContentType = "image/jpeg";
    var byteArray = blob.DownloadByteArray();

    // Send the downloaded bytes straight to the browser.
    context.Response.OutputStream.Write(byteArray, 0, byteArray.Length);

The code here is fairly straightforward though the OutputStream.Write call may look kind of weird if you have never seen this sort of approach before. I am providing the source code at the bottom and you are free to ask questions about the strategy if you are curious. That aside, this code should look very similar to our past code snippets. DownloadByteArray() is new, but its name indicates what it does.
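
If the OutputStream.Write call is the part that looks odd, here is the same call shape on a plain in-memory stream, with no ASP.NET or Azure involved. StreamWriteDemo and WriteAll are hypothetical names for this sketch only.

```csharp
using System;
using System.IO;

public static class StreamWriteDemo
{
    // Writes the whole buffer to the target stream. In the handler the
    // target is context.Response.OutputStream; any Stream works the same.
    public static void WriteAll(Stream target, byte[] data)
    {
        // Arguments: the source buffer, the offset in the buffer to start
        // from, and the number of bytes to write.
        target.Write(data, 0, data.Length);
    }
}
```

Passing 0 and data.Length simply means "write everything". Writing to a MemoryStream this way leaves the stream holding exactly the bytes of the array, which is what the browser receives in the handler's case.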

Returning to our main page, I want to speak to how we delete blobs from a container. In this example, the user first selects a container which has blobs in it. The code reads this container and its blobs and outputs them, using an ASHX handler to display the actual contents of the blob. In our ItemDataBound event we set the Uri of the blob to the CommandArgument of the embedded LinkButton. Here is the code snippet for handling the delete action:

    string commandName = ev.CommandName;
    string commandArgument = ev.CommandArgument.ToString();

    if (commandName == "Delete")
    {
        var blobReference = BlobContainer.GetBlobReference(commandArgument);
        blobReference.DeleteIfExists();

        ListContainerContents(ddlContainers.SelectedItem.Text);
    }

Again, we notice the *If[Not]Exists pattern that is commonly seen throughout the Azure SDK. Finally, the call to ListContainerContents encapsulates the code from our sample showing how to bind the blob list to a Repeater.

To conclude, it is very easy to create an Azure-based application, and the potential for clients is immense. When you consider how sites are used, the Azure cloud model is much more appropriate than the traditional approach of provisioning enough capacity for your highest peak. I can remember a project I worked on in the past where the client had 5 web servers, 3 of which were used only when they had a major promotion; the rest of the time they had minimal load. Azure would be a very viable solution for this problem of waste.

Companies tend to know their trends. Good companies understand their trends. Great companies use their trends to maximize profitability. But all companies hate waste. Microsoft did a case study with Domino’s, which had immense waste in the form of a huge datacenter for its online ordering that was only used heavily on Super Bowl Sunday, the company’s busiest day of the year. Using Azure, Domino’s can spin up as many servers as it needs on that one day and spin them down when business returns to normal levels. In addition, because the processing is in the Cloud, Domino’s only pays for usage. This creates savings for the company as a whole and helps improve the business process.

Given the global shrinking of economies, cutting costs and being leaner is going to be even more important than it normally is. I view Azure as a viable solution for helping reduce IT and operating costs for companies while still offering “infinite scalability”.

Notes:

I apologize for the low quality of this code; it comes from more of a playing-around perspective. I recognize a criminal violation of DRY (Don’t Repeat Yourself). I plan to use what I learn from my experiments to create a library that wraps the current managed library and reduces the repetition you see above.

I did not touch on the necessary updates to the WebRole.cs file to permit modification of cloud data. Please see the code example for this piece.

Code is available for download (Visual Studio 2010, 2MB)

http://cid-630ed6f198ebc3a4.skydrive.live.com/embedicon.aspx/Public/PictureStoreTest.zip

2 thoughts on “Using Blobs with Windows Azure”

  1. Great work Jason! Thanks for trying out Azure and making such detailed notes available. If you haven't tried them yet, the Table storage service (no-SQL structured entity database), and SQL Azure (relational SQL database) may be interesting to try too.


  2. This is such a great resource that you are providing and you give it away for free. I enjoy seeing websites that understand the value of providing a prime resource for free. I truly loved reading your post. Thanks!
