In my previous blog post I explained the Data Storage Maturity Model and how you would get a much more mature and capable application if you used Event Sourcing. That blog post did bring up some interesting questions.
Given that Event Sourcing was at the top of the pyramid you could conclude that you should always aim for the top and use Event Sourcing. Aiming high is a noble cause and sounds like the right thing to do, but it turns out that it isn’t that simple.
If your application is relatively simple and you don’t have much of a domain model, there is little point in Event Sourcing your data storage. A To-Do application, for example, probably has little reason to do so. Maybe if you want to do advanced analysis over the history of to-do items there is a need, but in most cases all you need to do is persist a few items of data to some persistent store for later retrieval. That sounds like level 1, CRUD with structured storage, will do quite well, while adding Event Sourcing would just complicate things.
There is also a differentiation to be made inside applications. Suppose you are building a complex banking application. In that case Event Sourcing would make perfect sense for your domain layer. However, there is more than just your domain layer. Every application has utility data, for example a list of countries in the world. This is a mostly static reference table, and using Event Sourcing for data like that would be over-engineering. Again, just using a CRUD store for this data would be more than enough, even though all financial transaction data is stored using Event Sourcing.
So I guess the answer is: it depends, but probably not for everything in your application, and maybe not at all.
Another question that came up is that of the data access technology to be used. Again this is a hard question to give a simple answer to. It also depends on whether you are looking at the Event Sourced domain model or the projected Read Model.
For the Event Sourcing side I really like Greg Young’s GetEventStore, which can be used as a standalone server or embedded in the client. You can use its HTTP API, but as I mainly use .NET on the server, its native C# client is the way to go for me.
For the projected Read Model it really depends on what you are using as the data storage mechanism. In the case of a relational database you could use NHibernate or Entity Framework, but these are probably a bit of overkill and will hurt performance. In most cases you will be better off with one of the micro ORMs out there, like Dapper, ServiceStack.OrmLite or something similar.
I prefer using a NoSQL database though and really like RavenDB or MongoDB. Currently I am using Redis with the ServiceStack.Redis client in a sample project and that is also working really well for me.
So again it really depends on your preferences, but optimizing for speed and flexibility is a good thing.
There are many ways of storing data when developing applications, some more mature and capable than others. Storing data of some sort or another in an application is common. Extremely common to be exact, as almost every application out there needs to store data in some way or another. After all, even a game usually stores the user’s achievements.
But it’s not games I am interested in. Sure, they are interesting to develop and play, but most developers I know are busy developing line of business (LOB) applications of some sort or another. One thing line of business applications have in common is that they work with, and usually store, data of some sort.
When looking at data oriented applications we can categorize data storage architectures based on different characteristics and capabilities.
The most basic way of working with data is just dumping whatever the user works with in the UI to some proprietary data file. This typically means we are working with really simple Create Read Update Delete (CRUD) style data entry forms and not even storing the data in a structured way. This is extremely limited in capabilities and should generally be avoided at all costs. Any time you have to work with a slightly larger data set or update the structure you are in for a world of hurt.
At level 1 we are still working with CRUD style data entry forms, but at least we have started using a formal database of some sort. The database can be a relational database like SQL Server or MySQL, but a NoSQL database like MongoDB is equally valid. While the database used allows us to do much more, the user interface is not much better. We are still loading complete objects and storing them in a CRUD fashion. This might work reasonably well in a low usage scenario with a low chance of conflicts, but it is really not suitable for anything more complex than a basic business application. We are only storing the current state of the data, and as the database stores whatever is sent from the UI, or business processing code, there is no sense of meaning to any change made.
When we need to develop better and more complex business applications we really should use Command Query Responsibility Segregation (CQRS) as a minimum. In this case we separate the read actions from the write actions. We no longer just send an object to be stored from the user interface to the back end; instead we send commands to update the data. These commands should be related to the business actions the application works with. In other words, if a business analyst sees the command names he should be able to make sense of what they do without looking at the code implementations, as in the sketch below.
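As a rough sketch of the difference (db and commandBus are hypothetical objects here, and the command name is invented for illustration):

// Level 1, CRUD style: the UI sends the whole object and we overwrite it.
// Nothing in this update tells us *why* the data changed.
db.customers.update({ id: 42, name: 'John Doe', address: '1 Main St' });

// Level 2, CQRS style: the UI sends a command named after a business action.
// A business analyst can read the command name and understand the intent.
commandBus.send({
  command: 'RelocateCustomer',
  customerId: 42,
  newAddress: '1 Main St'
});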
While this is a lot better, we are still only storing the current state of the data. And that is a problem, as it can be very hard to figure out how something got to be in a given state. So if a user detects that something is wrong with the data and suspects a bug in the program, we might have a hard time figuring out how it got to be that way. And once we do, fixing the issue might be extremely hard as well.
There are other limitations to just storing the current state, like not being able to produce, or only with great difficulty, reports that are asked for later. Or being unable to apply business rules that are altered after the fact. And if you think that doesn’t happen, just try working on a large government project where the slowness of the decision process means that rules are only definitively updated after the fact.
The most advanced level to be working at is using Event Sourcing (ES). An event sourced application resembles a CQRS style application in a lot of ways, except for one vital part. With an Event Sourced application we no longer store the current state of the data; instead we store all the events that led up to it. All these events are stored as one big stream of changes and are used to deduce the current state of the data in the application. These events typically never change once written; after all, we don’t change history (although our view of it might change over time). This has some large benefits, as we can now track exactly how the state came to be what it is, making it easier to find bugs. And if the bug is in how we used those business events, then fixing the bug is often enough to deduce the correct state.
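As a rough sketch of such an event stream (the event names and account data are invented for this example):

// A stream of events for one bank account; the state is never stored directly.
var events = [
  { type: 'AccountOpened',  accountId: 7, owner: 'Alice' },
  { type: 'MoneyDeposited', accountId: 7, amount: 100 },
  { type: 'MoneyWithdrawn', accountId: 7, amount: 30 }
];

// The current state is deduced by replaying the events in order.
var balance = events.reduce(function (total, e) {
  if (e.type === 'MoneyDeposited') return total + e.amount;
  if (e.type === 'MoneyWithdrawn') return total - e.amount;
  return total;
}, 0);

console.log(balance); // 70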
The usual queries done in an application are much harder to do on an event stream. To fix that issue the events are usually projected out to a read model, making querying much easier. This read model is normally stored in some appropriate database, like SQL Server or a NoSQL database, but it could also just be kept in memory. However, the event stream is the source of truth, not the projections, as these are just a derived result. This means we can delete all projections and completely rebuild them from the existing events, resulting in much more flexibility. Need to do an expensive query in version two of an application? Just create a projection designed for that purpose and rebuild it from all previously stored events. This is similar to our view of history changing.
There are some more benefits to storing events instead of just the current state. We can now do temporal queries, or queries over time, on how the data got to be the way it is. These kinds of queries serve many goals, for example fraud detection. Another possibility is displaying the state at any previous point in time and running reports or analysis on the data as it was then.
It’s kind of hard to say at what level you should be working. Level 0, limited as it is, might be appropriate for your application. Lots of applications are at level 1 and are just basic forms-over-data CRUD applications. In some cases that might be appropriate, but in a lot of cases it is actually suboptimal. Level 2 with CQRS is a pretty sweet place to be. You can capture the business intent with commands and have reasonable flexibility. At level 3 with Event Sourcing you gain a lot of flexibility and power. If you are building a more complex business application you should be working at this level. But as always there is no free lunch, so don’t go there if the application is really not that complex.
In general AngularJS applications are quite fast, especially when compared to more traditional browser based applications that constantly post back to the server. However there are always a few things that will help performance and make an application even faster.
Normally AngularJS adds several things, like CSS classes and some scope related properties, to DOM elements. These are not needed to run the application and are really only there to help development tools like Protractor and Batarang. When the application is in production they are not needed, and you can save some overhead by disabling them using the $compileProvider.debugInfoEnabled() function.
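For example, in a config block (assuming a module named myApp):

angular.module('myApp', [])
  .config(['$compileProvider', function ($compileProvider) {
    // Disable the debug info AngularJS normally adds to DOM elements.
    $compileProvider.debugInfoEnabled(false);
  }]);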
Another option to speed up your application is using explicit dependency injection annotations. If the DI annotations are not present, AngularJS has to parse the functions to find the parameter names, something that can be avoided by adding explicit annotations. The annotations can be added manually, which can be tedious, or automatically using something like ng-annotate with either a Gulp or Grunt task.
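A sketch of the two explicit forms, using an invented controller name:

// Inline array annotation: the parameter names no longer have to be parsed.
angular.module('myApp')
  .controller('OrderController', ['$http', '$log', function ($http, $log) {
    // ...
  }]);

// Alternatively, the $inject property achieves the same thing.
function OrderController($http, $log) { /* ... */ }
OrderController.$inject = ['$http', '$log'];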
Adding the ngStrictDi directive to the same element as the ngApp directive can help you find missing annotations.
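For example:

<!-- With ng-strict-di, functions without explicit annotations fail fast. -->
<html ng-app="myApp" ng-strict-di>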
Another helpful option is reducing the number of $apply() calls that result from $http requests finishing. When you are doing multiple $http requests as a page loads, each will trigger an $apply() call, causing all watches and data bindings to be reevaluated. By combining these into a single $apply() call for requests that finish at almost the same time we can increase the load speed of the application, something that can be done using $httpProvider.useApplyAsync().
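For example (again assuming a module named myApp):

angular.module('myApp')
  .config(['$httpProvider', function ($httpProvider) {
    // Combine $apply() calls for $http responses arriving around the same time.
    $httpProvider.useApplyAsync(true);
  }]);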
Testing AngularJS directives usually isn’t very hard. Most of the time it is just a matter of instantiating the directive using the $compile() function and interacting with the scope or related controller to verify that the behavior is as expected. However, that leaves a bit of a gap, as most of the time the interaction between the directive’s template and its scope isn’t tested. With really simple templates you can include them in the template property, but using templateUrl and loading them on demand is much more common, especially with more complex templates. Now when it comes to unit testing, the HTTP request to load the template is not going to work, and as a result the interaction isn’t tested. Sure, it is possible to use the $httpBackend service to fake the response, but that still doesn’t use the actual template, so it doesn’t really test the interaction.
Once the template files are included in the Karma configuration they are available on the server. There are two problems here though. First of all, when running unit tests the mock $httpBackend is used, and that never does an actual HTTP request. Secondly, the file is hosted at a slightly different URL, as Karma includes ‘/base’ as the root of our files. So just letting AngularJS load the template is out of the question. However, if we use a plain XMLHttpRequest object the mock $httpBackend is completely bypassed and we can load what we want. Using a plain XMLHttpRequest object has a second benefit in that we can do a synchronous request instead of the normal asynchronous request and use the response to pre-populate the $templateCache before the unit test runs. Using synchronous HTTP requests is not advisable for code on the Internet and should be avoided in any production code, but in a unit test like this it works perfectly fine.
So taking an AngularJS directive like this (a hypothetical clickCounter directive, used here for illustration):
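angular.module('demoApp', [])
  .directive('clickCounter', function () {
    return {
      restrict: 'E',
      scope: {},
      templateUrl: 'clickCounter.html',
      controller: ['$scope', function ($scope) {
        $scope.count = 0;
        $scope.increment = function () {
          $scope.count++;
        };
      }]
    };
  });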
And a template like this:
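<!-- clickCounter.html -->
<button ng-click="increment()">Clicked {{count}} times</button>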
Can be easily tested like this:
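describe('clickCounter directive', function () {
  var $compile, $rootScope, $templateCache;

  beforeEach(module('demoApp'));

  beforeEach(inject(function (_$compile_, _$rootScope_, _$templateCache_) {
    $compile = _$compile_;
    $rootScope = _$rootScope_;
    $templateCache = _$templateCache_;

    // Load the real template with a synchronous XMLHttpRequest, bypassing
    // the mock $httpBackend. Karma serves our files under '/base'.
    var request = new XMLHttpRequest();
    request.open('GET', '/base/clickCounter.html', false);
    request.send();
    $templateCache.put('clickCounter.html', request.responseText);
  }));

  it('increments the count when the button is clicked', function () {
    var scope = $rootScope.$new();
    var element = $compile('<click-counter></click-counter>')(scope);
    scope.$digest();

    element.find('button').triggerHandler('click');

    expect(element.isolateScope().count).toBe(1);
  });
});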
Now making any breaking change to the template, like removing the ng-click, will immediately cause the unit test to fail in Karma.
There are two ways to use the angular.module() function. There is the call with one parameter, which returns an existing module, and there is the option of using two parameters, which creates a new module. The second way, where a new module is created, is perfectly fine and should be used. However, the first option, where an existing module is loaded, should be considered an anti-pattern in most cases and should not be used unless there is an exceptional and very good reason.
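To illustrate the two forms (the dependency and controller names are just examples):

// Two parameters: creates a new module, with its dependencies.
angular.module('mainApp', ['ngRoute']);

// One parameter: looks up the existing mainApp module to add to it.
angular.module('mainApp')
  .controller('MainController', [function () { /* ... */ }]);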
Splitting modules introduces a big risk. As soon as you split an AngularJS module into separate files you run the risk of loading a partially configured module. While AngularJS checks whether all module dependencies can be satisfied at load time, it has no way of seeing whether those modules are complete or not. Missing a complete module produces a very clear error message right at startup time, like this:
Uncaught Error: [$injector:modulerr] Failed to instantiate module mainApp due to:
Error: [$injector:modulerr] Failed to instantiate module mainApp.data due to:
Error: [$injector:nomod] Module ‘mainApp.data’ is not available! You either misspelled the module name or forgot to load it. If registering a module ensure that you specify the dependencies as the second argument.
As the complete application fails to load, this is very obvious and hard not to spot.
However, if you fail to load just a part of a module the errors are a lot less obvious. In this case the error doesn’t appear until the missing component is actually needed; everything up to that point will run just fine. The kind of error message you will see is something like:
Error: [$injector:unpr] Unknown provider: productsProvider <- products
The error in itself is clear enough, but discovering it might not be as easy. If the error occurs in a part of the application that is not used often it might go completely unnoticed.
Want to split the functionality into multiple files? By all means go ahead, but make sure to do so in a new module and use module dependencies to make sure everything is loaded right at application start time. And as angular.module(“module”) is only required to load a module defined in another file, there really should almost never be a need to use it.
Consider the following client side code, a hypothetical demo.js:
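// demo.js: uses the global utils object defined in utils.js.
utils.print('Hello world');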
This depends on another piece of script, utils.js, below:
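// utils.js: defines a global utils object with a print() function.
var utils = {
  print: function (message) {
    console.log(message);
  }
};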
And for all of that to work we have to load the scripts in the right order using some HTML as below:
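<!DOCTYPE html>
<html>
  <body>
    <!-- utils.js must be loaded before demo.js or utils is undefined. -->
    <script src="utils.js"></script>
    <script src="demo.js"></script>
  </body>
</html>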
Not really rocket science here, but if we want to update utils.print() to call a printIt() function loaded from yet another library we have to go back to our HTML and make sure we load printIt.js as well. Easy in a small app, but this can become hard and error prone in larger applications.
With Node each module can take a dependency on another module by requiring it using the require() function. And each module can define what it exports to other modules by using module.exports. The NodeJS runtime takes care of loading the files, and adding dependencies inside a module does not require a change anywhere else in the program.
This system works really nicely, but unfortunately the browser doesn’t provide this NodeJS runtime capability. One problem here is that a call to require() is a synchronous call that returns the loaded module, while the browser does all of its IO asynchronously. In the browser you can use something like RequireJS to asynchronously load scripts, but while this works fine it is not very efficient due to its asynchronous nature. As a result people usually use RequireJS during development and then create a bundle with all the code for production.
Browserify on the other hand allows us to use the synchronous NodeJS approach to script loading in the browser. It does this by packaging up all the files required, based on the require() calls, and creating one file to load at runtime. Converting the example above to this style requires some small changes in the code.
The demo.js specifies that it requires utils.js. The syntax “./utils” means that the file should be loaded from the same folder.
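// demo.js: loads utils.js through require() instead of a global.
var utils = require('./utils');

utils.print('Hello world');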
Next, utils.js specifies what it exports:
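// utils.js: exposes the print() function through module.exports.
module.exports = {
  print: function (message) {
    console.log(message);
  }
};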
Next we need to run browserify to bundle the files for use in the browser. As browserify is a Node application we need to install Node and then, through the Node package manager NPM, install browserify with:
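npm install -g browserify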
With browserify installed we can bundle the files into one using:
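browserify demo.js -o bundle.js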
This will create a bundle.js with content along the following lines (abbreviated here; the exact output depends on the browserify version):
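(function e(t,n,r){ /* ... browserify's module loading prelude ... */ })({
1:[function(require,module,exports){
// demo.js: loads utils.js through require() instead of a global.
var utils = require('./utils');

utils.print('Hello world');
},{"./utils":2}],
2:[function(require,module,exports){
// utils.js: exposes the print() function through module.exports.
module.exports = {
  print: function (message) {
    console.log(message);
  }
};
},{}]},{},[1]);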
Not the most readable, but then that is not what it was designed for. Instead we can see that all the code we need is included. Now by just including this generated file we are ready to start our browser application.
Doing the same change as above is simple, and best of all it doesn’t require any change to the HTML to load different files. Just update utils.js to require() printIt.js and explicitly export the function in printIt.js, rerun browserify, and you are all set.
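The updated utils.js and the new printIt.js could look something like this:

// utils.js: now delegates the actual printing to printIt.js.
var printIt = require('./printIt');

module.exports = {
  print: function (message) {
    printIt(message);
  }
};

// printIt.js: exports a single function directly.
module.exports = function (message) {
  console.log(message);
};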
Note that it’s fine to just export a single function here.
And the result of running browserify again is something like:
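(function e(t,n,r){ /* ... browserify's module loading prelude ... */ })({
1:[function(require,module,exports){
var utils = require('./utils');

utils.print('Hello world');
},{"./utils":3}],
2:[function(require,module,exports){
module.exports = function (message) {
  console.log(message);
};
},{}],
3:[function(require,module,exports){
var printIt = require('./printIt');

module.exports = {
  print: function (message) {
    printIt(message);
  }
};
},{"./printIt":2}]},{},[1]);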
Again not the most readable code, but the printIt() function is now included. Nice, and no changes required to the HTML.
Using browserify works really nicely, but this way we do have to start it again after every change. In the next blog post I will show how to use Gulp or Grunt to automate this, making the workflow a lot smoother.
Now that might sound great, but it turns out that Automatic Semicolon Insertion can cause some interesting issues.
However, if we return an object literal and format our code the same way, we run into a problem. Consider the following code, a hypothetical add() function:
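function add(a, b) {
  return
  {
    sum: a + b
  };
}

console.log(add(1, 2));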
You might expect this to print an object with a property sum containing the value 3. However, the code prints “undefined”. Compare that with the following code, which is only formatted differently:
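function add(a, b) {
  return {
    sum: a + b
  };
}

console.log(add(1, 2)); // prints { sum: 3 }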
This will print the expected object with a sum of 3.
This unexpected behavior is caused by semicolon insertion. Instead of the code you most likely think will execute, the following executes:
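function add(a, b) {
  return;     // a semicolon is automatically inserted here
  {           // this block is now unreachable code
    sum: a + b
  };
}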
Notice the semicolon after the return statement?
That actually means return nothing, i.e. undefined, and just have some unreachable code on the next few lines. That is completely valid JavaScript, so that is what happens.
Unfortunately ‘use strict’ doesn’t help here either. It will prevent some errors, but it doesn’t make semicolons required.
Just so you are aware, the URL with the RSS feed for my blog has changed. Please use the following URL now:
Apologies for the inconvenience.