Tips & Tricks to Build Multi-Tenant Databases with SQL Databases

Last Wednesday I delivered another session at the Visual Studio Live @ Redmond conference this time about “Tips & Tricks to Build Multi-Tenant Databases with SQL Databases”. The feedback from the session attendees was very good and this is a quick summary of the most important aspects.

First thing in order to be successful is understanding exactly what is Windows Azure SQL Databases (formally SQL Azure) and how it works. If we look at it from a high level, SQL Databases are actually:

  • SQL Server database technology delivered as a service on Windows Azure
    • This is actually a Shared Environment where we have SQL Server capabilities and features available in a pay as you go, and scalable mode. This of course doesn’t mean we have all the existing features and capabilities from SQL Server, since some of them would probably create some issues since this is a Shared Environment and also because the goal is High Availability and Scalability.
  • Ideal for both simple and complex applications
    • It’s a way for us to have a Relational Database as a Service quickly and powerful which can be used in all types of solutions, but in order to get the best out of it we really need to understand and adjust how it works and avoid things like being throttled, for example.
  • Enterprise-ready with automatic support for HA
    • True since it provide a higher level of management since we don’t deal with physical machines anymore but with the actual storing of data this way focusing more on what’s more important for our solution and our business.
    • Also this provides us an automatic support that enables us to have High Availability without ‘a lot of work’.
  • Designed to scale out elastically with demand
    • SQL Databases were created to scale and of course that’s part of it’s DNA, since without the ability to scale elastically it wouldn’t be fit for the Cloud. This will be allowed with SQL Federations.

After a high-level understanding of what is a SQL Database in Windows Azure, it’s important to understand also what are the scaling strategies that we can use, since this way we can better use them whenever needed. And so the strategies are:

  • Horizontal Partitioning or Sharding
    • Spreads data across similar nodes
      • All nodes have the same schema
    • Allows us to achieve massive scale out both in terms of Data Size and Load
    • We need to understand that while doing that we aren’t going to be able to get the complete list of data within a single query, and so it’s something we need to understand and consider.
  • Vertical Partitioning
    • Spreads data across Dis-Similar nodes
      • Each node has it’s own schema where the data is stored. Eg. SQL Database, Table Storage, Blob Storage, …
    • Allows us to place the data where it makes more sense by slitting the data we have for a solution and understanding how it’s used and how it should be stored so we can take the best out of it.
    • In this case we need to understand that by doing that we a splitting at the row level, so if we want a complete row (if we were to be thinking about a regular database) we won’t be able to get that in 1 only query, since one part can be in a DB, another in Table Storage and another in Blob Storage.
  • Hybrid Partitioning
    • This is when we spread our data both in Vertically and Horizontally.

Now let’s have a quick look at SQL Federations:

  • Integrated database sharding that can scale to hundreds of nodes
    • Provides the ability to do Horizontal Partitioning or Sharding to data inside SQL Databases in a quick and ‘easy’ way.
  • Multi-tenancy via flexible repartitioning
    • Provide the ability to achieve multi-tenancy inside the same Database by providing the ability to split data horizontally.

Online split operations to minimize downtime

  • Automatic data discovery regardless of changes in how data is partitioned

Finally, before we get to the actual Tips & Tricks we need to understand the multi-tenancy strategies that are typically used, and they are:

  • Separate Servers
    • This provides the best isolation possible and it’s regularly done On-Premises, but it’s also the one that doesn’t enable cutting costs, since each tenant has it’s own server, sql, license and so on.
  • Separate Databases
    • Very used in order to provide isolation for customer, because we can associate different logins, permissions and so on to each DB. Considered by many the only way to provide isolation for tenants.
  • Separate Schema
    • Also a very good way to achieve multi-tenancy but at the same time share some resources since everything it’s inside the same DB but the schemas used are different, one for each Tenant. That allows us to even customize a specific tenant without affecting others.
  • Row Isolation
    • Everything is shared in this option, Server, Database and even Schema. The only way they are differentiated is based on a TenantId or some other column that exists on the table level.

So now that we had a high-level view of all this let’s take a look at some of the Tips & Tricks for it, and they are:

  1. Choose the right Multi-Tenancy Strategy
    • One of the most important steps for delivering a Multi-Tenant solution is understand exactly what should be the approach we should use, and normally the simplest one isn’t actually the best. For example, if we think about Isolation the best might seem Separate Server or Separate Database, but that means that from an economics standpoint we aren’t going to be very competitive, and so for this we need to understand that is we go further for a more shared approach, like Row Isolation, the impact in terms of development might be that at the beginning the investment is bigger, but in the long run that will pay off.
    • Also important is that if we want Multi-Tenancy and Isolation the only solution is not separating everything, since that is something we can enforce programmatically, through security permissions and so on. It’s just might take a bit more effort but the end result should be that we achieve some other customers that we usual didn’t because of the prices.
  2. Don’t forget Security Patterns
    • Some time when we start creating Multi-Tenant Databases we start thinking about sharing everything and forget about the Security part of that. This is actually a common mistake that can cost a lot, and in order to make things work we should never forget to:
      1. Filter
        • Use an intermediate layer that will receive all the requests to the DB and infer the Tenant filter, so that nobody has access to something that shouldn’t. Of course this also means that no one should have direct access to the DB.
      2. Permissions
        • Very important when we consider Multi-Tenant Databases is the Permissions and how we can affect them. When we use Separate Servers, Databases or either Schemas, we can actually associate different logins, roles and so on to the different Tenants, but when we are in a Row Isolation model that isn’t possible, and that’s why the intermediate layer, that is actually your Data Layer will be very important since not only provides access to the data inferring the filter by Tenant, but also allows us to introduce permissions to access certain parts of the data. For example by leveraging Windows Azure Active Directory Access Control Service.
      3. Encryption
        • This is of a huge importance. IF YOUR DATA IS SENSITIVE JUST ENCRYPT IT. SQL Databases don’t have the ‘With Encryption’ for Columns but this doesn’t mean I can’t really insert encrypted data in the database, I just need to do it on another layer, again in an intermediate layer.
        • Also very important when we encrypt our data is to understand the method we’re going to use. Normally one of the best methods, if we don’t want anyone that isn’t part of a Tenant to access the data and have different encryptions per Tenant so that even if someone gets the full DB it won’t access the full data, is to use X.509 Client Certificates. This is a very good approach since the Client is actually the one that has the Certificate that is used, but it also means that we cannot count of doing background calculation with that data in the Cloud since we don’t have the certificate. So it’s a balance.
        • A quick reminder is that IF YOU SHARE A SECRET IT STOPS BEING A SECRET, it’s just like telling a secret to a child. So for this reason, if you use X.509 Client Certificates to encrypt the data, and then register all those certificates in a Windows Azure Role, that isn’t the best approach because if someone get access to that role it will have access to the KEYS OF THE KINGDOM.
  3. Choose your partitions wisely
    • When you choose the partitions it should be based on the ‘hotness’ of the data and not on the ‘# of records’. This is a very important premise since we normally see partitions being made in a way that they all have the same amount of records and data, and this doesn’t mean the solution will perform, because if we have one partition that has all the most commonly used data and another one with less common we won’t have any benefit with the partition. So the important part is to partition your data based on how the data is used and how commonly it is used, since in order to get good partitioning sometimes we can have a partition with very few records and another one with thousands.
    • So it’s also important to before partitioning the data understand how that data will be consumed and used, since that will allow us to better understand what is the most used data or not.
  4. Choose your Partition Keys correctly
    • We can have several ways of defining partition keys and the most common are:
      1. Natural Keys
        • Choosing a Natural key is usually one of the most used ways in order to select a partition key, and some samples are:
          • Tenant
          • Country
          • Region
          • Date
        • But sometimes this isn’t the best approach since if we go for the Tenant that doesn’t partition based on the ‘hotness’ just based on the Tenant and if a tenant is small has less information and so it will be faster than a tenant that is larger because it has more data and all in the same partition. This is exactly what we want to avoid. The same thing happens with Country, since really isn’t the best way since if we use this we might have more customers from a specific customer than from another, and the same with region. When we partition by data, it will mean that everybody would be affected by everyone else data, since everything would be in the same partition.
        • What this mean is that while Natural Keys are one of the most used partition keys they aren’t actually the best choice because it’s very difficult to find something that allows us to partition it optimally.
      2. Mathematical
        • This is another option for partition key since what we can do is used things like Hash or Modulo operator and other options to generate a mathematical calculation in order to find the ‘hotness’ point.
        • Being this a very interesting approach is actually really difficult also since you need to understand your data very well as well as built mathematical formulas which isn’t everyone best hobby and capability.
      3. Lookup Based
        • Another option is this one, Lookup based, this is actually the best since it really looks at how the data is used and consumed in order to find out the best partition key. In some cases this will mean a concatenation of something like ‘TenantId+Date’ or something like that, because in this case we’ll be saying that every tenant is partitioned independently and even at the same time is partitioning its own data making it faster.
  5. Leverage SQL Federations to handle Multi-Tenancy
    • SQL Federations is a very good way to leverage Horizontal Partitioning (Sharding) of data for you solutions since allows us a ‘quick and easy’ way to partition our data and provide at the same time Isolation, since each Federation Member is actually a separate DB that is generated, but when we look at it we only see the main DB.
    • Currently the limitations with this is actually the fact the SQL Federations only allow the partition key to be of types BigInt, Int, UniqueIdentifier and VarBinary. It would be great if it would support also Varchar but we can’t have all, unfortunately, but if we go for a partition key like ‘TenantId+Date’ this can be actually a number and so fall inside the BigInt possibility.
  6. Don’t forget that you’ll continue to need to perform backups
    • Sometimes when we are in a multi-tenant environment we forget we still need to provide backups and not only for us, but most of the times the customer wants also to have a copy of the data and that is more challenging.
    • In order to do that backups we can use the Export capability from SQL Databases as well as some third-party tools like RedGate’s SQL Azure Backup.
    • To leverage a backup of the data for the customer we can leverage SQL Data Sync since it allows us to create filters in the data sets and so we can filter by tenant and get one Data Sync per Tenant this way making everyone happy.

 

So those are some of the Tips & Tricks you can use in order to be successful building Multi-Tenant Databases in Windows Azure SQL Databases. I hope that helps and would love to have your thoughts about it.

Tips and Tricks on Building Event Driven Architectures with Windows Azure Service Bus

Just now I did a presentation at Visual Studio Live @ Redmond under the topic “Tips & Tricks on Building Event Driven Architecture with Service Bus”.

One of the very interesting things to understand is that a lot of us already use Event Driven Architectures even without knowing it. That was one of the parts I explained in the session, since a lot of us currently use messaging mechanisms to decouple our solutions, and that’s already building things in an event driven way.

A summary of the session for the people who didn’t attend is:

  • What is Event Driven Architecture?
    • Architecture Pattern that orchestrates behavior around:
      • Production
      • Detection
      • Consumption of events
    • Method for building decoupled enterprise solution
    • Suited for Async and unpredictable environment (like the cloud for example)
    • Has 3 types of processing:
      • Simple Event Processing
        • Events that are directly related to specific and measurable changes
      • Stream Event Processing
        • Events that can be notable (important) or ordinary (not so important)
      • Complex Event Processing
        • Both Notable and Ordinary events allowing to search for patterns

 

  • What is Windows Azure Service Bus?
    • Connectivity
      • Service Relay
      • Protocol Tunnel Eventing
    • Messaging
      • Queuing Pub/Sub
      • Reliable Transfer
    • Svc Management
      • Naming, Discovery
      • Monitoring
    • Integration Routing
      • Coordination Transformation

 

Going now for the Tips & Tricks

  1. Choose your Event Stream wisely
    • Messaging (Applicable when you need reliability)
      • Windows Azure Service Bus Queues
      • Windows Azure Service Bus Topics
      • Windows Azure Storage Queues
    • Relay (Applicable when you don’t need reliability but only a way to sent a message to all the subscribers)
      • NetEventRelayBinding
    • Twitter (Never forget the power of Twitter. For example http://twitter.com/fltadvisor provides you information about flight delays in US. Why not leverage Twitter to do the same for your app?)
    • RSS Feed (Also very interesting. For example Windows Azure Service Dashboard uses RSS Feeds.)
    • Other sources
  2. Create different types of messages, DON’T GENERALIZE IT
    • One of the important parts is never generalize event types.
    • You have 4 types of events you need to work on
      • Business
      • Execution
      • Lifecycle
      • Management
    • Those are the type of events but you shouldn’t generalize because each event has it’s own properties that are notable for them. For example a DataChangeEvent is different from a NotificationSentEvent. Do the two, don’t generalize into one only with different types.
  3. Choose the right Event processing Type
    • Simple Event Processing
      • Use it whenever you need to handle events in a simple way, without having differences between the events. You just need to process all of them.
      • Eg.
        • Coordinating Worker Roles
    • Stream Event Processing
      • Use this whenever you need to handle events in a simple way, but you have different types of events with different importance, like Notable and Ordinary events.
      • Eg.
        • Notification System that received events but some are priority notifications and other non-priority
    • Complex Event Processing
      • Use this whenever, and only, when you need to search for a pattern inside the event stream. if not, don’t use it.
      • Eg.
        • Stock Exchange. If the Index A goes up 20% and Index B goes up more than 5% in a 5 min period. Do Something
  4. Keep your modules Loosely-Coupled
    • Event Driven Architecture is all about decoupling every module of the solution, so continue with that. Do you modules as they don’t need to know who creates the events and why they do it.
  5. ‘Shard’ your Event Stream
    • It’s important to think from the start how you are going to shard your Event Stream, because if the solution is going to be unpredictable in growth you need to be prepared.
    • Some simple way is to think Sharding on the Type (Business, Execution, Lifecycle, Management) but then you can for further.
  6. Consider Windows Azure IaaS with Microsoft StreamInsight for Complex Event Processing
    • Very interesting product for Complex Event Processing and that can easily be placed in the Cloud, and of course Windows Azure Virtual Machines
  7. Consider ‘Project Austin’ from Microsoft
    • This is really good since it’s Microsoft StreamInsight as a Service in Windows Azure

I presented a sample based on this http://www.codeproject.com/Articles/55620/Event-Driven-Architecture-in-the-Clouds-with-Windo that can’t be found here http://eventdrivensample.cloudapp.net.

Also very interesting to check out is the new CQRS Journey from Microsoft Patterns & Practices here.

Hope this helps you in future developments, and also I’d love to hear you thoughts about this, so please comment and let’s build a conversation about it and generate other posts.

Service Bus 1.0 Beta

We all know about Windows Azure Service Bus and how it allows features like Relay, Messaging and even integration. Personally this is one of my favorite features in Windows Azure but all this only worked in the Cloud and sometimes the same features and capabilities would be very good if they were available On-Premises also. That’s exactly what Service Bus 1.0 is, it’s getting those capabilities and features to the On-Premises world also, making it easier to have a complete parity between Cloud and On-Premises.

Of course this version isn’t still the full version of what we see on the Windows Azure version but will help a lot, I’m sure. This Beta is what the Team calls “Service Bus Messaging Engine” release as stated by Clemens Vasters in one of his responses at StackOverflow, and so this version still doesn’t have any features related to the Relay part, but only with the Messaging part.

Another very important part of this release is that by using it we’ll start getting the same (or at least very similar at this moment) API for both Cloud and On-Premises versions, which make things a lot easier when we build solutions that need to be deployed in both.

For more information about Service Bus 1.0 Beta check here.

In terms of Architecture, this version has the following:

(Figure 1 – Service Bus 1.0 Architecture from Windows Azure site)

 

So we can see that all this uses very well known elements like, SQL Server, PowerShell, .NET Framework, and the 3 most important elements of it are:

  • Service Bus Gateway
    • Receives all the incoming requests and provides the necessary Authentication, Authorization as well as address resolution sending then the request for the appropriate block so it can be handled.
    • Currently only allows communications using NET.TCP or REST over HTTP protocols.

(Figure 2 – Service Bus Gateway High-level Architecture – Windows Azure Site)

  • Service Bus Message Broker
    • Responsible for providing the Messaging capabilities, so can be seen as the Service Bus Messaging Engine.
    • Provides Queues and Topics/Subscriptions features to this release.

(Figure 3 – Service Bus Message Broker High-level Architecture – Windows Azure Site)

  • Windows Fabric
    • Provides capabilities like High availability, farm and cluster formation as well as load balancing across farm machines.
    • We can think about it as being the ‘orchestrator’ of all services since all services will connect to it in order to have those capabilities and be able to work in a success way and with high quality.

 

So by releasing Service Bus 1.0 (Beta) Microsoft is actually showing that they were really serious when they were saying that they were working on lowering the differences between Cloud and On-Premises and bringing the best solutions to both worlds.

What I believe this will bring to customer is:

Hope this post helps you better understand what’s Service Bus 1.0 (Beta) is all about and you can expect future post around this topic going deeper in terms of how to use it and benefits of it.