Last Wednesday I delivered another session at the Visual Studio Live @ Redmond conference, this time about “Tips & Tricks to Build Multi-Tenant Databases with SQL Databases”. The feedback from the attendees was very good, and this post is a quick summary of the most important points.
The first step to being successful is understanding exactly what Windows Azure SQL Databases (formerly SQL Azure) are and how they work. Looking at it from a high level, SQL Databases are:
- SQL Server database technology delivered as a service on Windows Azure
- This is a shared environment where SQL Server capabilities and features are available in a pay-as-you-go, scalable model. Of course this doesn’t mean we get every feature and capability of SQL Server: some of them would cause problems in a shared environment, and others conflict with the goals of High Availability and Scalability.
- Ideal for both simple and complex applications
- It’s a quick and powerful way to get a Relational Database as a Service that can be used in all types of solutions, but to get the best out of it we really need to understand how it works and adjust to it, so we avoid things like being throttled.
- Enterprise-ready with automatic support for HA
- True, since it provides a higher level of management: we no longer deal with physical machines, only with the actual storing of data, so we can focus on what matters most for our solution and our business.
- It also gives us automatic support for High Availability without ‘a lot of work’.
- Designed to scale out elastically with demand
- SQL Databases were created to scale, and that’s part of their DNA, since without the ability to scale elastically they wouldn’t be fit for the Cloud. This is made possible by SQL Federations.
With a high-level understanding of what a SQL Database in Windows Azure is, it’s also important to understand the scaling strategies available, so we can apply them whenever needed. The strategies are:
- Horizontal Partitioning or Sharding
- Spreads data across similar nodes
- All nodes have the same schema
- Allows us to achieve massive scale out both in terms of Data Size and Load
- We need to keep in mind that with this approach we can’t get the complete set of data with a single query, so it’s something we need to plan for.
- Vertical Partitioning
- Spreads data across dissimilar nodes
- Each node has its own schema for the data it stores, e.g. SQL Database, Table Storage, Blob Storage, …
- Allows us to place the data where it makes the most sense, by splitting the data of a solution according to how it’s used and how it should be stored, so we can get the best out of each store.
- In this case we need to understand that we are splitting at the row level, so if we want a complete row (thinking in terms of a regular database) we won’t be able to get it in a single query, since one part can be in a DB, another in Table Storage and another in Blob Storage.
- Hybrid Partitioning
- This is when we spread our data both vertically and horizontally.
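To make the single-query limitation of Horizontal Partitioning concrete, here is a minimal sketch. It assumes three in-memory SQLite databases as stand-ins for the nodes and a simple modulo routing rule; the `orders` table and the helper names are hypothetical and only serve the illustration.

```python
import sqlite3

def make_shard():
    # Every shard gets the same schema, as horizontal partitioning requires.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, tenant_id INTEGER, total REAL)"
    )
    return conn

# Hypothetical stand-ins for three database nodes.
shards = [make_shard() for _ in range(3)]

def shard_for(tenant_id):
    # Modulo routing is an assumption made for this example only.
    return shards[tenant_id % len(shards)]

def insert_order(order_id, tenant_id, total):
    shard_for(tenant_id).execute(
        "INSERT INTO orders VALUES (?, ?, ?)", (order_id, tenant_id, total)
    )

def all_orders():
    # No single query sees every row, so the application fans out to each
    # shard and merges the partial results itself.
    rows = []
    for conn in shards:
        rows.extend(conn.execute("SELECT order_id, tenant_id, total FROM orders").fetchall())
    return sorted(rows)

insert_order(1, 10, 25.0)
insert_order(2, 11, 40.0)
insert_order(3, 12, 15.5)
```

A query for one tenant touches a single node through `shard_for`, while anything that spans tenants, like `all_orders()`, pays the cost of visiting every node.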
Now let’s have a quick look at SQL Federations:
- Integrated database sharding that can scale to hundreds of nodes
- Provides the ability to do Horizontal Partitioning (Sharding) of the data inside SQL Databases in a quick and ‘easy’ way.
- Multi-tenancy via flexible repartitioning
- Provides the ability to achieve multi-tenancy inside the same Database by splitting the data horizontally.
- Online split operations to minimize downtime
- Automatic data discovery regardless of changes in how data is partitioned
Finally, before we get to the actual Tips & Tricks we need to understand the multi-tenancy strategies that are typically used, and they are:
- Separate Servers
- This provides the best isolation possible and it’s regularly done On-Premises, but it’s also the one that doesn’t let us cut costs, since each tenant has its own server, SQL Server instance, license and so on.
- Separate Databases
- Widely used to provide isolation for each customer, because we can associate different logins, permissions and so on with each DB. Considered by many to be the only way to provide isolation for tenants.
- Separate Schema
- Also a very good way to achieve multi-tenancy while sharing some resources, since everything is inside the same DB but each Tenant has its own schema. That even allows us to customize a specific tenant without affecting the others.
- Row Isolation
- Everything is shared in this option: Server, Database and even Schema. Tenants are differentiated only by a TenantId or some other column that exists at the table level.
Now that we have a high-level view of all this, let’s take a look at some of the Tips & Tricks:
- Choose the right Multi-Tenancy Strategy
- One of the most important steps in delivering a Multi-Tenant solution is understanding exactly which approach we should use, and normally the simplest one isn’t the best. For example, if we think only about Isolation, Separate Servers or Separate Databases might seem best, but from an economics standpoint that won’t make us very competitive. If we go for a more shared approach, like Row Isolation, the development investment is bigger at the beginning, but it pays off in the long run.
- It’s also important to realize that separating everything is not the only way to get both Multi-Tenancy and Isolation, since isolation is something we can enforce programmatically, through security permissions and so on. It might just take a bit more effort, but the end result should be that we win customers that we usually wouldn’t because of price.
- Sometimes when we start creating Multi-Tenant Databases we think about sharing everything and forget about the Security side of it. This is a common mistake that can cost a lot, and to make things work we should never forget to:
- Use an intermediate layer that receives all the requests to the DB and infers the Tenant filter, so that nobody has access to something they shouldn’t. Of course this also means that no one should have direct access to the DB.
- Permissions, and how we can control them, are very important in Multi-Tenant Databases. With Separate Servers, Databases or even Schemas, we can associate different logins, roles and so on with the different Tenants, but in a Row Isolation model that isn’t possible. That’s why the intermediate layer, which is really your Data Layer, is so important: it not only provides access to the data while inferring the filter by Tenant, but also lets us introduce permissions for accessing certain parts of the data, for example by leveraging Windows Azure Active Directory Access Control Service.
- This is hugely important: IF YOUR DATA IS SENSITIVE, JUST ENCRYPT IT. SQL Databases don’t have the ‘With Encryption’ option for Columns, but that doesn’t mean we can’t store encrypted data in the database; we just need to encrypt it in another layer, again the intermediate layer.
- It’s also very important to understand the encryption method we’re going to use. If we don’t want anyone outside a Tenant to access its data, and we want different encryption per Tenant so that even someone who gets the full DB can’t read all the data, one of the best methods is normally X.509 Client Certificates. This is a very good approach since the Client is the one holding the Certificate, but it also means that we can’t count on doing background calculations with that data in the Cloud, since we don’t have the certificate there. So it’s a balance.
- A quick reminder: IF YOU SHARE A SECRET IT STOPS BEING A SECRET; it’s just like telling a secret to a child. For this reason, if you use X.509 Client Certificates to encrypt the data and then register all those certificates in a Windows Azure Role, that isn’t the best approach, because anyone who gets access to that role will have the KEYS OF THE KINGDOM.
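The intermediate layer described above can be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite stands in for the SQL Database, the `invoices` table, the `TenantDataLayer` class and the `invoices:read` permission name are hypothetical, and in a real solution the tenant id and permissions would come from the authenticated identity rather than from constructor arguments.

```python
import sqlite3

class TenantDataLayer:
    """Hypothetical data layer: callers never write their own tenant filter."""

    def __init__(self, conn, tenant_id, permissions):
        self._conn = conn
        self._tenant_id = tenant_id            # assumed to come from the caller's identity
        self._permissions = set(permissions)   # assumed to come from an access control service

    def list_invoices(self):
        # Permission check first, then a query that always carries the tenant filter.
        if "invoices:read" not in self._permissions:
            raise PermissionError("caller may not read invoices")
        cursor = self._conn.execute(
            "SELECT invoice_id, amount FROM invoices WHERE tenant_id = ? ORDER BY invoice_id",
            (self._tenant_id,),
        )
        return cursor.fetchall()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoices (invoice_id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, amount REAL)"
)
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?, ?)",
    [(1, 1, 100.0), (2, 2, 250.0), (3, 1, 75.0)],
)

tenant_one = TenantDataLayer(conn, tenant_id=1, permissions=["invoices:read"])
tenant_two_no_access = TenantDataLayer(conn, tenant_id=2, permissions=[])
```

Because the tenant filter lives in one place, application code calling `tenant_one.list_invoices()` cannot forget it, and a caller without the right permission is rejected before any query runs.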
- Partitions should be chosen based on the ‘hotness’ of the data and not on the ‘# of records’. This is a very important premise, since we normally see partitions designed so that they all hold the same amount of records and data, and that doesn’t mean the solution will perform: if one partition holds all the most commonly used data and another holds the less common data, partitioning brings no benefit. So partition your data based on how it is used and how often, even if that means one partition has very few records and another has thousands.
- That’s why it’s important, before partitioning, to understand how the data will be consumed, since that tells us which data is used the most.
- We can have several ways of defining partition keys and the most common are:
- Natural Keys
- Choosing a Natural Key is one of the most common ways to select a partition key, and some examples are the Tenant, the Country, the Region or the Date.
- But sometimes this isn’t the best approach. If we go for the Tenant, we aren’t partitioning based on ‘hotness’, just on the Tenant: a small tenant has less information and so will be faster than a larger tenant that has more data, all in the same partition. This is exactly what we want to avoid. The same thing happens with Country, since we might have more customers from one country than from another, and the same with Region. If we partition by date, everybody is affected by everyone else’s data, since all the current data ends up in the same partition.
- What this means is that while Natural Keys are among the most used partition keys, they aren’t the best choice, because it’s very difficult to find one that partitions the data optimally.
- Another option is a mathematical key, where we use things like a Hash or the Modulo operator to compute the partition and find the ‘hotness’ point.
- While this is a very interesting approach, it’s also really difficult, since you need to understand your data very well and build mathematical formulas, which isn’t everyone’s favorite hobby or skill.
- The last option is Lookup based. This is the best one, since it looks at how the data is actually used and consumed in order to find the best partition key. In some cases this will mean a concatenation like ‘TenantId+Date’, because then every tenant is partitioned independently and each tenant’s own data is partitioned as well, making it faster.
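As one possible reading of the lookup-based idea, the sketch below builds a lookup table from observed access counts, so that partitions are balanced by ‘hotness’ rather than by number of records. The greedy placement, the `build_lookup` function and the sample counts are assumptions made for illustration, not a method from the session.

```python
import heapq

def build_lookup(access_counts, partition_count):
    """Assign each key to a partition so that load, not row count, is balanced."""
    # Min-heap of (accumulated load, partition index): the coolest partition is on top.
    heap = [(0, p) for p in range(partition_count)]
    heapq.heapify(heap)
    lookup = {}
    # Place the hottest keys first, always onto the currently coolest partition.
    for key, hits in sorted(access_counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, partition = heapq.heappop(heap)
        lookup[key] = partition
        heapq.heappush(heap, (load + hits, partition))
    return lookup

# Hypothetical access counts gathered from usage monitoring.
access_counts = {
    "tenant-a": 9000,
    "tenant-b": 300,
    "tenant-c": 250,
    "tenant-d": 200,
    "tenant-e": 150,
}
lookup = build_lookup(access_counts, partition_count=2)
```

With these sample numbers the very hot `tenant-a` gets a partition to itself while the four quieter tenants share the other one, which is the uneven-by-record-count outcome the tip argues for.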
- SQL Federations is a very good way to leverage Horizontal Partitioning (Sharding) of data in your solutions, since it gives us a ‘quick and easy’ way to partition our data and provides Isolation at the same time: each Federation Member is a separate DB that is generated for us, but when we look at it we only see the main DB.
- The current limitation is that SQL Federations only allow the partition key to be of types BigInt, Int, UniqueIdentifier and VarBinary. It would be great if Varchar were supported too, but unfortunately we can’t have it all. Still, a partition key like ‘TenantId+Date’ can be expressed as a number and so fits in a BigInt.
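One way a ‘TenantId+Date’ key could be turned into a number that fits a BigInt is sketched below. The encoding (tenant id followed by the date as YYYYMMDD digits) and the function names are assumptions for illustration; any scheme that stays within the signed 64-bit range would do.

```python
from datetime import date

BIGINT_MAX = 2**63 - 1  # upper bound of a signed 64-bit BigInt

def make_partition_key(tenant_id, day):
    """Pack a tenant id and a date into one integer: tenant digits, then YYYYMMDD."""
    if tenant_id < 0:
        raise ValueError("tenant_id must be non-negative")
    key = tenant_id * 100_000_000 + day.year * 10_000 + day.month * 100 + day.day
    if key > BIGINT_MAX:
        raise OverflowError("key does not fit in a BigInt")
    return key

def split_partition_key(key):
    """Recover the tenant id and the date from a packed key."""
    tenant_id, ymd = divmod(key, 100_000_000)
    year, month_day = divmod(ymd, 10_000)
    month, day = divmod(month_day, 100)
    return tenant_id, date(year, month, day)

key = make_partition_key(42, date(2012, 8, 8))
```

Since the tenant id occupies the high-order digits, all keys of one tenant form a contiguous numeric range ordered by date, which should suit a key that is split by ranges.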
- Sometimes in a multi-tenant environment we forget that we still need to provide backups, and not only for ourselves: most of the time the customer also wants a copy of their data, which is more challenging.
- To do those backups we can use the Export capability of SQL Databases as well as third-party tools like RedGate’s SQL Azure Backup.
- To provide a backup of the data to the customer we can use SQL Data Sync, since it allows us to create filters on the data sets; we can filter by tenant and set up one Data Sync per Tenant, making everyone happy.
So those are some of the Tips & Tricks you can use to be successful building Multi-Tenant Databases in Windows Azure SQL Databases. I hope this helps, and I would love to hear your thoughts about it.