DR test? It’s not a test, it’s an exercise

My colleague in DR advised me that I’m using the wrong word when talking about DR, and that this betrays something of my thinking, and of the thinking of the people I describe in my post “How to pass a Disaster Recovery test”.

It’s not a test, he tells me. As is usual when I’m corrected, I bristle, but I listen.

Using the term “test” leads people immediately to think of scores, and of a pass/fail mentality.

Consider, as a replacement, the term “exercise”. Remember the old joke, “how do you get to Carnegie Hall? Practice, practice, practice!” [I said it was old, I didn’t say it was funny].

Key here is the concept that an exercise is practice for “the main event”. In DR’s case, the main event is some catastrophic loss of business function – your headquarters gets cratered by a tornado or an earthquake, or the propane tank blows up. Maybe you just have a power outage in your data centre, and the exercise is to make sure you know how to get your generator started, supplying power, and monitored for fuel consumption until the electric comes back.

When you think of a DR exercise, as opposed to a test, you realise that this is as much about training your staff, and exercising your recovery drills, as it is about knowing which areas you need to revisit in assessment, analysis and documentation. And then nobody asks about whether you passed or failed.

You can even talk about your DR exercise as an unqualified success – without having to qualify anything.

How to pass a Disaster Recovery test

Last week I went to a party hosted by MVP, all-round good guy and host of SBSMigration.com, Jeff Middleton.

Jeff hails from New Orleans, so while he is well known for "swing migration" techniques that let you move your domain between different versions of Windows, he’s obviously quite familiar with good old-fashioned disaster recovery. (Often, a small business gets bigger and wants to migrate from Small Business Server to a Windows Server environment for more room to expand, or just wants to upgrade its domain controller without taking it down for a couple of days to do so.)

So that reminded me of a number of occasions where I’ve spoken to IT Professionals who have all shared the same misconception about disaster recovery. Here’s a made-up example of the kind of question I mean:

"Alun, my company just plain sucks at disaster recovery testing. Why, every year, we have a DR test, and every year, we fail at something or another. You’d think we’d be passing them by now!"


"Alun, I just don’t get it with these DR tests – they’ve scheduled another test for later this year, and they’re doing it in the middle of the week, when our backups aren’t synchronised. Don’t they realise that’ll make us fail the DR test?"

Okay, you can probably guess where I’m headed.

You’re not supposed to pass a DR test.

A DR test is about spotting the problems that you will have in a disaster, and documenting them so that you can determine what needs altering in your disaster plan, whether it is to improve your response to a disaster, or to accept that some level of loss of service or data will occur.

A DR test where you succeed in recovering everything hasn’t told you anything – except perhaps that your test could have been more rigorously designed.

[A colleague of mine tells me of a disaster recovery test at a company that had been doing well in DR tests largely as a result of the efforts of one talented individual who knew everything. At the start of the test, the DR manager pulls the talented guy aside, and says "bad news, Chuck – you were incapacitated in the disaster, and they’ll have to recover the systems from the documentation you left behind."]

And, for those of you working in a corporate environment where it is important to you to expand your fiefdoms and/or justify your budget requests, bear in mind that where there are shortcomings discovered in DR, there too there will be money allocated to fix those shortcomings to prevent a real disaster from becoming a calamity. [But it goes better if you predict some of the failings beforehand, or at least whine about how snowed under you are in those areas.]

DRM should always be a choice

Jesper’s recent frustration with a bug in the DRM support on his Windows Media Center Edition (MCE) system demonstrates a couple of basic truths in system reliability:

  1. Complexity negatively impacts reliability.
  2. DRM contributes to complexity.

Clearly, this means that DRM makes systems less reliable than they would be without DRM.
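
To put rough numbers on that, here is a minimal sketch. It assumes a simple "series" model, where the system works only if every component works and the components fail independently of one another. The component names and reliability figures are invented purely for illustration; they are not measurements of MCE or of any real DRM component.

```python
from math import prod

def series_reliability(components):
    """Probability that every component works, assuming independent failures."""
    return prod(components.values())

# Hypothetical figures, chosen only to show the direction of the effect.
base = {"tuner": 0.99, "recorder": 0.99, "playback": 0.99}
with_drm = {**base, "drm": 0.95}   # same system plus one extra component

print(f"without DRM: {series_reliability(base):.3f}")
print(f"with DRM:    {series_reliability(with_drm):.3f}")
```

Under this model, any extra component that is less than perfectly reliable can only drag the total down, however good the rest of the system is.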

So, why can’t Jesper simply kill the DRM component in his MCE system and have a more reliable system, without the worry of DRM? Because there are two kinds of DRM, and this is the bad kind.

First of all, let’s review a basic tenet of client-server security. If the server is owned by someone who wants to secure data, all security decisions must be made at the server – client-side security is no security for the server’s owner, unless the server can guarantee that the client is owned by the same individual.

So, with DRM, the content provider wishes to protect his material, and make it available to content consumers – this means that either the content provider needs to not rely on the client for security, or must expect that his security will be broken.

As I’ve mentioned time and again before, this means that DRM is broken in the consumer marketplace – although it works very well for business, because there is an ownership of the client environment. To those willing to break contract with the content provider, or to alter the client or the content, DRM is a barrier to overcome.

Now to the two kinds of DRM.

I haven’t found any documentation that talks about the two kinds of DRM, so I’ll give them names here – Passive DRM and Active DRM. Please accept my apologies if there are other terms for these that I should be using – and correct me, if you can.

Passive DRM protects its content from onlookers who do not have a DRM-enabled client. Encryption is generally used for Passive DRM, so that the content is meaningless garbage unless you have the right bits in your client. I consider this "passive" protection, because the data is inaccessible by default, and only becomes accessible if you have the right kind of client, with the right key.

Active DRM, then, would be a scheme where protection is only provided if the client in use is one that is correctly coded to block access where it has not been specifically granted. This is a scheme in which the data is readily accessible to most normal viewers / players, but has a special code that tells a DRM-enabled viewer/player to hide the content from people who haven’t been approved.
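
The difference between the two can be sketched in a few lines of toy code. Everything here is hypothetical: the class and function names are mine, and the XOR "cipher" is only a stand-in for real encryption (it is not secure). The point is simply where the protection lives: in the data itself for Passive DRM, or in the good behaviour of the player for Active DRM.

```python
from dataclasses import dataclass
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for real encryption; XOR like this is NOT secure."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

@dataclass
class PassiveDisc:
    ciphertext: bytes   # content is stored scrambled

@dataclass
class ActiveDisc:
    content: bytes      # content is stored in the clear
    restricted: bool    # a flag that only a compliant player honours

def plain_player(disc):
    """A player with no DRM support at all."""
    if isinstance(disc, PassiveDisc):
        return disc.ciphertext   # sees only the scrambled bytes
    return disc.content          # never looks at the flag, sees everything

def drm_player(disc, key=None, approved=False):
    """A player with DRM support built in."""
    if isinstance(disc, PassiveDisc):
        if key is None:
            return disc.ciphertext          # no key, still scrambled
        return xor_bytes(disc.ciphertext, key)
    if disc.restricted and not approved:
        return None                         # compliant client blocks playback
    return disc.content

KEY = b"k3y"
movie = b"feature film"
passive = PassiveDisc(xor_bytes(movie, KEY))
active = ActiveDisc(movie, restricted=True)

print(plain_player(passive) == movie)     # False: scrambled without DRM support
print(drm_player(passive, KEY) == movie)  # True: DRM support plus key unlocks it
print(plain_player(active) == movie)      # True: no DRM support, no protection
print(drm_player(active) is None)         # True: DRM support is what blocks you
```

Notice that in the passive case adding DRM support gains you access, while in the active case adding DRM support is the only thing that can take access away.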

Passive DRM offers a choice to consumers between these two options:

  1. Drop all DRM features and support, so that you can’t view the protected content, but you also don’t have the added complexity.
  2. Include DRM features and support, so that you can view the protected content, at the cost of increased complexity.

An example of Passive DRM is that of a DVD’s protection, where the content is encrypted, and can be decrypted by any device that has an appropriate CSS key.

Active DRM, by comparison, offers the following non-choice:

  1. Install the DRM client, adding to complexity, and be blocked from seeing some ‘protected’ content.
  2. Don’t install the DRM client, keeping complexity low, and allowing you to see all content, including that which is protected.

Sony’s DRM for CDs is an example of Active DRM, and a great example of why Active DRM is bad. Put the CD in an ordinary player, and there’s no DRM, because the CD player can’t load the attached software. Put the CD into a PC, and you’re blocked from making copies of the CD, plus you’ve installed an extra root-kit that makes your computer more vulnerable to attack.

Both of these DRM examples have, of course, been cracked. In the first case, that of DVDs, the CSS keys are stored on the DVDs themselves, and can be decrypted once you have extracted just one key by attacking a DVD player. In the second case, of course, you simply play dumb and say "I don’t run non-music content from music CDs" (or you disable AutoPlay).

But there’s a difference to the consumer. Because Active DRM requires all clients to be made compliant, or its ‘protected’ content has no protection, there is an imperative on the content providers to force compliance from all clients.

You see this in Jesper’s MCE example, in that he is unable to use his MCE system to view content that he could happily have viewed with a cheap TV. That’s right – a high-priced personal video recorder is beaten in capabilities by a cheap TV. All because his MCE system was forced to have the Active DRM client software installed – and cannot have it uninstalled even when it is shown to be the cause of a catastrophic failure in the system.

If Passive DRM had been in place – if the output of the Comcast OnDemand signal had been encrypted – then it would not have displayed on an ordinary TV, and maybe Jesper’s MCE would still have crashed when it tried to display it. But Jesper could then have removed the DRM component, abandoning his ability to watch Comcast OnDemand, and gaining a reliable MCE box by doing so.

For a system like MCE, that’s marketed as an appliance, reliability is of paramount importance.

Only Passive DRM gives the consumer the choice to improve their own reliability. Only Passive DRM is appropriate and ethical; Active DRM requires that content producers assert that they have some form of ownership or control over devices that, by rights, belong entirely to the content consumers.

To paraphrase an old sore, if you think that DRM will solve your problem, you now have two problems. If you think that Active DRM is the solution, you have three.