Lessons Learned: Taking the best out of Windows Azure Virtual Machines


Now that the Windows Azure IaaS offerings are generally available (GA), a lot of new workloads can run on Windows Azure: SQL Server, SharePoint, System Center 2012, server roles like AD, ADFS and DNS, and even Team Foundation Server. The full list of server software currently supported in Windows Azure Virtual Machines can be found here.

But knowing what we can run in the cloud isn’t enough; every feature has its tricks. So, to get the best performance out of Windows Azure Virtual Machines, here is a list of things you should always do. They will make your life easier and the performance a lot better.

1. Place each data disk in its own storage account to improve IOPS

In November 2012 Windows Azure Storage received an update called “Windows Azure’s Flat Network Storage”, which introduced new scalability targets for storage accounts. The target that matters here went from 5,000 to 20,000 requests per second per storage account, which means we can now get something like 20,000 IOPS from an account.

Having 20,000 IOPS is good, but if a virtual machine has several disks in the same storage account, those disks share that budget. With 2 disks in the same storage account, each one gets roughly 10,000 IOPS. This isn’t optimal.

So, for optimal results, create each disk in a separate storage account. That way each disk has its 20,000 IOPS to itself and doesn’t share them with any other disk.
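As a rough illustration of the sharing argument above, here is a minimal Python sketch that splits an account-level target evenly across the disks in that account. The 20,000 figure is the one quoted in this post; the even split and the function name `iops_share_per_disk` are my own simplifying assumptions, not a description of how the platform actually schedules requests.

```python
def iops_share_per_disk(account_iops_target: int, disks_in_account: int) -> float:
    """Rough even split of a storage account's IOPS target across its disks.

    Assumption: every disk is equally busy, so the budget divides evenly.
    """
    if disks_in_account < 1:
        raise ValueError("a storage account needs at least one disk to share with")
    return account_iops_target / disks_in_account


# Account-level target quoted in this post (treated here as an assumption).
ACCOUNT_IOPS_TARGET = 20_000

if __name__ == "__main__":
    for disks in (1, 2, 4):
        share = iops_share_per_disk(ACCOUNT_IOPS_TARGET, disks)
        print(f"{disks} disk(s) in one account: about {share:,.0f} IOPS each")
```

With one disk per account the share is the whole budget, which is the point of this tip.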

2. Always use Data Disks for Read/Write intensive operations, never the OS Disk

Windows Azure Virtual Machines have two types of disks: OS disks and data disks. The OS disk is meant to hold the OS installation and the installation files of other products, but it isn’t a good place for read/write-intensive software. For that you should use data disks, because their purpose is to provide faster read/write capability and to keep that load separate from the OS disk.

Since data disks handle this load better than the OS disk, it’s easy to see why read/write-intensive operations should always go on data disks. Just be careful with the maximum number of data disks you can attach to your virtual machine, since it differs by VM size. The maximum is 16 data disks, and for that you need an Extra Large virtual machine.

3. Use striped disks to achieve better performance

So we said that you should always place your read/write-intensive software on data disks, and in different storage accounts, because of the IOPS you can get, and we said that was 20k IOPS. But is that enough? Can we live with only 20k IOPS on a disk?

Sometimes the answer is yes, but in some cases it won’t be, because we need more. SQL Server or SharePoint, for example, can require a lot more. So how can we get more IOPS?

The answer is data disks striped together. You’ll need to understand your requirements and know how many IOPS you’re going to need. Based on that, you create several data disks, attach them to the virtual machine, and finally stripe them together as if they were a single disk. To the user of the virtual machine it looks like a single disk, but it’s actually several disks striped together, and each part of that “large disk presented to the user” has its own 20k IOPS capability.

For example, imagine we’re building a virtual machine for SQL Server and that the size of the database is 1TB but requires at least 60k IOPS. What can we do?

Option 1: we could create a single 1TB data disk and place the database files there, but that would max out at 20k IOPS, not the 60k we need.

Option 2: we create 4 data disks of 250GB each and place each of them in its own storage account. Then we attach them to the virtual machine and, in Disk Management, stripe them together. We now have a 1TB disk in the virtual machine that is actually composed of 4 data disks, so we can get a maximum of something like 80k IOPS. A lot better than before.
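The arithmetic behind the two options can be sketched in a few lines of Python. This is only a planning aid under stated assumptions: it treats the stripe's ceiling as the simple sum of the per-disk ceilings, and it takes the per-disk IOPS figure as a parameter because the real limit depends on the platform targets in force (this post uses 20,000; one of the comments below cites 500 per disk). The function names are hypothetical.

```python
def disks_needed(required_iops: int, per_disk_iops: int) -> int:
    """Smallest number of striped disks whose summed ceiling meets the requirement."""
    if required_iops < 1 or per_disk_iops < 1:
        raise ValueError("IOPS figures must be positive")
    return -(-required_iops // per_disk_iops)  # integer ceiling division


def stripe_totals(disk_count: int, disk_size_gb: int, per_disk_iops: int) -> tuple[int, int]:
    """Total capacity (GB) and theoretical IOPS ceiling of a striped set."""
    return disk_count * disk_size_gb, disk_count * per_disk_iops


if __name__ == "__main__":
    # The example from this post: 60k IOPS required, 20k assumed per disk.
    print(disks_needed(60_000, 20_000))    # 3
    # Option 2 from this post: 4 disks of 250GB at an assumed 20k IOPS each.
    print(stripe_totals(4, 250, 20_000))   # (1000, 80000)
    # The same 80k target with the 500 IOPS per-disk figure from the comments.
    print(disks_needed(80_000, 500))       # 160
```

Keeping the per-disk figure as a parameter makes it easy to redo the plan with whatever limit actually applies to your subscription.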

4. Configure Data Disks HostCache for ReadWrite

By now you understand that data disks are your friends, and one of the ways to get better performance out of them is the HostCache setting. Windows Azure provides three HostCache options for data disks: None, ReadOnly and ReadWrite. Most of the time you would choose ReadWrite because it gives you a lot better performance: instead of going straight to the data disk in the storage account, some content is served from cache, which makes IOPS even better. But that doesn’t work in all cases. With SQL Server, for example, you should never use it, since they don’t play well together; in that case use None instead.

5. Always create VMs inside an Affinity Group or VNET to decrease latency

Another big improvement is to always place the VM inside an affinity group, or inside a VNET, which in turn lives inside the affinity group. This is important because when you create the several storage accounts that hold the data disks, OS disks and so on, you want latency to be as low as possible, and affinity groups give you that.

6. Always leverage Availability Sets to get the SLA

Windows Azure Virtual Machines provide a 99.95% SLA, but only if you have 2 or more virtual machines running inside an availability set. So leverage it: always create your virtual machines inside an availability set.
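To get a feel for what 99.95% means in practice, the small Python sketch below converts an SLA percentage into the downtime it still allows. The 30-day month is my assumption for the sake of the arithmetic (the actual SLA terms define the measurement window), and the function name is hypothetical.

```python
def allowed_downtime_minutes(sla_percent: float, days_in_month: int = 30) -> float:
    """Minutes of downtime per month that still satisfy the given SLA percentage."""
    if not 0 <= sla_percent <= 100:
        raise ValueError("SLA must be a percentage between 0 and 100")
    minutes_in_month = days_in_month * 24 * 60
    return minutes_in_month * (100 - sla_percent) / 100


if __name__ == "__main__":
    minutes = allowed_downtime_minutes(99.95)
    print(f"99.95% over a 30-day month allows about {minutes:.1f} minutes of downtime")
```

Under that assumption the budget works out to roughly 21.6 minutes a month, and a single virtual machine outside an availability set doesn't get even that guarantee.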

7. Always sysprep your machines

One of the important parts of working with Windows Azure Virtual Machines is creating a generalized machine that we can use later as a base image. Some people ask me: why is this important? Why should I care?

The answer is simple: we need to be able to provision a new machine quickly when one is required, and if we have a sysprepped machine we can use it as a base, which reduces installation and provisioning time.

Examples of where we would need this would be for Disaster Recovery and Scaling.

8. Never place intensive read/write information on the Windows System Drive for improved performance

As stated before, OS disks aren’t good for intensive IOPS, so avoid using them for read/write-intensive work; use data disks instead.

9. Never place persistent information on the Temporary Drive (D:)

Be careful what you place on the Temporary Drive (D:). It’s temporary, so if the machine recycles its contents will go away. Only place there things that can be deleted without issues, like the IIS temporary files, ASP.NET temp files, or SQL Server TempDB (this has some challenges but can be achieved as shown here, and it’s actually a best practice).
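One simple way to guard against this mistake is to check your configured data paths before deploying. The Python sketch below is just that kind of sanity check: it flags any path that sits on the temporary drive. The drive letter D: comes from this post; the helper names and the sample paths are hypothetical.

```python
from pathlib import PureWindowsPath

# Temporary drive letter as described in this post.
TEMP_DRIVE = "D:"


def is_on_temp_drive(path: str, temp_drive: str = TEMP_DRIVE) -> bool:
    """True when the Windows path lives on the temporary drive."""
    return PureWindowsPath(path).drive.upper() == temp_drive.upper()


def persistent_paths_at_risk(paths: list[str]) -> list[str]:
    """Return the paths meant to be persistent that would be lost on a recycle."""
    return [p for p in paths if is_on_temp_drive(p)]


if __name__ == "__main__":
    # Hypothetical locations of data we expect to survive a machine recycle.
    persistent = [r"F:\SQLData\app.mdf", r"D:\Uploads\invoices", r"G:\Logs\app.ldf"]
    for risky in persistent_paths_at_risk(persistent):
        print(f"WARNING: {risky} is on the temporary drive")
```

`PureWindowsPath` only parses the string, so the check can run from a build machine that doesn't have those drives at all.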


So in summary, Windows Azure Virtual Machines are a great addition to Windows Azure, but there are a lot of tricks to getting the most out of them, and these are some of them. If you need any help, feel free to contact me and I’ll help you in any way possible. But best of all, start getting the best out of Windows Azure Virtual Machines today and take your solutions to the next level.

Also if you need some help doing that, please check Aditi’s offerings around Windows Azure IaaS here.

12 thoughts on “Lessons Learned: Taking the best out of Windows Azure Virtual Machines”

  1. I would test your theory about striping disks giving you optimal performance for SQL Server workloads.

    All tests I have done to date show that letting SQL manage this by having your DB span multiple data files in SQL is better than letting Windows manage this in a striped array.

    Also, you will run into problems with striped disk arrays and geo-replication, so if you go this way be wary of wanting to recover this striped volume.

  2. Thanks Nuno for posting this. Ever since I saw your sessions during TechDays2013 I was waiting for this.

  3. Ryan CrawCour, it’s a challenge, I know, but in the end it’s actually worth it. Also, this doesn’t mean you shouldn’t continue to use multiple files in SQL; that will further increase the odds of getting each file on a separate data disk.

    On the recovery it’s challenging but again in the end if you want the full power you need to balance something.

    Also, I’ll be writing another one just for getting the best out of SQL Server on Windows Azure IaaS and I’ll make sure to include some metrics for it.

  4. I know that Direct Access is not supported. But it works just fine! Do you know the actual reason it’s not? Can we dare to use it in production anyway?

  5. Henrik, I don’t know the reason why this isn’t supported. I haven’t yet tried leveraging Direct Access, but I will, and I’ll try to understand why it isn’t supported.

  6. I have several questions about this post.

    1) How are you achieving 20K IOPS in your Azure virtual machine? We have been doing SQLIO testing on a virtual machine and the results for the most part are between 500 – 1500 IOPS with a throughput of 5-50 MB.

    2) You mention disk striping, but the link below says that it should not be done and a single disk should be used.




  7. I think you missed the fact that Microsoft also has a 500 IOPS limit per disk. You can have 40 disks on a single storage account without them tripping over each other for IOPS, and you would need to stripe 160 disks to be able to get 80K IOPS out of this :-S

    With this in mind, having a storage account per disk is really unnecessary, still good to keep in mind when you have a lot of Disks so you don’t pass over 40 or so…

    source: http://msdn.microsoft.com/library/dn197896.aspx

  8. Thomas and Gonzalo both point out things I have read about IOPS in Azure SQL VMs. Here for example are the standard tier configs. 500 IOPS per disk.

    Your post implies 20K IOPS with a single disk.

    Am I missing something, did something change, or are you wrong about this?

    Furthermore MS best practice is not to stripe your data disk, at least for non-DR configurations, which given the cost of SQL in Azure I’d imagine there are a lot of.

