The Manager in the Middle Attack

The first problem any security project has is to get executive support. The second problem is to find a way to make use of and direct that executive support.

So, that was the original tweet that seems to have been a little popular (not fantastically popular, but then I only have a handful of followers).

I’m sure a lot of people thought it was just an amusing pun, but it’s actually a realisation on my part that there’s a real thing that needs naming here.

Executives support security

By and large, the companies I’ve worked for and/or with in the last few years have experienced a glacial but certain shift in perspective.

Where once the security team seemed to be perceived as a necessary nuisance to the executive layers, it seems clear now that there have been sufficient occurrences of bad news (and CEOs being forced to resign) that executives come TO the security team for reassurance that they won’t become the next … well, the next whatever the last big incident was.

TalkTalk had three security incidents in the last year

Obviously, those executives still have purse strings to manage, and most security professionals like to get paid, because that’s largely what distinguishes them from security amateurs. So security can’t get ALL the dollars, but it’s generally easier to get the money and the firepower for security than it ever was in the past.

So executives support security. Some of them even ask what more they can do – and they seem sincere.

Developers support security

Well, some of them do, but that’s a topic for another post.

There are sufficient numbers of developers who care about quality and security these days, that there’s less of a need to be pushing the security message to developers quite how we used to.

We’ve mostly reached those developers who are already on our side.

How developers communicate

And those developers can mentor other developers who aren’t so keen on security.

The security-motivated developers want to learn more from us, they’re aware that security is an issue, and for the most part, they’re capable of finding and even distinguishing good security solutions to use.

Why is security still so crap, then?

Pentester cat wins.

If the guys at the top, and the guys at the bottom (sorry devs, but the way the org structure goes, you don’t manage anyone, so ipso facto you are at the bottom, along with the cleaners, the lawyers, and the guy who makes sure the building doesn’t get stolen in the middle of the night) care about security, why are we still seeing sites get attacked successfully? Why are apps still being successfully exploited?

Why is it that I can exploit a web site with SQL injection, an attack that has been around for as long as many of the developers at your company have been aware of computers?

Someone is getting in the way.

So, who doesn’t support security?

Ask anyone in your organisation if they think security is important, and you’ll get varying answers, most of which are acknowledging that without security in the software being developed, so it’s clear that you can’t actually poll people that way for the right answer.

Ask who’s in the way, instead…

Often it’s the security team – because it’s really hard to fill out a security team, and to stretch out around the organisation.

But that’s not the whole answer.

Ask the security-conscious developers what’s preventing them from becoming a security expert to their team, and they’ll make it clear – they’re rewarded and/or punished at their annual review times by the code they produce that delivers features.

There is no reward for security

And because managers are driving behaviour through performance reviews, it actually doesn’t matter what the manager tells their underlings, even if they give their devs a weekly spiel about how important security is. Even if you have an executive show up at their meetings and tell them security is “Job #1”. Even if he means it.

Those developers will return to their desks, and they’ll look at the goals against which they’ll be reviewed come performance review time.

The Manager in the Middle Attack

If managers don’t specifically reward good security behaviour, most developers will not produce secure code.


This is the Manager in the Middle Attack. Note that it applies in the event that no manager is present (thanks, Dan Kaminsky!)

Managers have to manage

Because I never like to point out a problem without proposing a solution:

Managers have to actively manage their developers into changing their behaviours. Some performance goals will help, along with the support (financial and moral) to make them achievable.

Here are a few sample goals:

  • Implement a security bug-scanning solution in your build/deploy process
  • Track the creation / destruction of bugs just like you track your feature burn-down rate.
    • It’ll be a burn-up rate to begin with, but you can’t incentivise a goal you can’t track
  • Prevent the insertion of new security bugs.
    • No, don’t just turn the graph from “trending up” to “trending less up” – actually ban checkins that add vulnerabilities detected by your scanning tools.
  • Reduce the number of security bugs in your existing code
    • Prioritise which ones to work on first.
      • Use OWASP, or whatever “top N list” floats your boat – until you’ve exhausted those.
      • Read the news, and see which flaws have been the cause of significant problems.
      • My list: Code injection (because if an attacker can run code on my site, it’s not my site); SQL Injection / data access flaws (because the attacker can steal, delete, or modify my data); other injection (including XSS, because it’s a sign you’re a freaking amateur web developer)
    • Don’t be afraid to game the system – if you can modularise your database access and remove all SQL injection bugs with a small change, do so. And call it as big of a win as it truly is!
  • Refactor to make bugs less likely
    • Find a widespread potentially unsecure behaviour and make it into a class or function, so it’s only unsecure in one place.
    • Then secure that place. And comment the heck out of why it’s done that way.
    • Ban checkins that use the old way of doing things.
    • Delete old, unused code, if you didn’t already do that earlier.
  • Share knowledge of security improvements
    • With your colleagues on your dev team
    • With other developers across the enterprise
    • Outside of the company
    • Become a sought-after expert (inside or outside the team or organisation) on a security practice – from a dev perspective
    • Mentor another more junior developer who wants to become a security hot-shot like yourself.

That’s quite a bunch of security-related goals for developers, which managers can implement. All of them can be measured, and I’m not so crass as to suggest that I know which numbers will be appropriate to your appetite for risk, or the size of hole out of which you have to dig yourself.

Auto convert inked shapes in PowerPoint–coming to OneNote

I happened upon a blog post by the Office team yesterday which surprised me, because it talked about a feature in PowerPoint that I’ve wanted ever since I first got my Surface 2.

Shape recognition

Here’s a link to documentation on how to use this feature in PowerPoint.

It seems like the obvious feature a tablet should have.

Here’s a video of me using it to draw a few random shapes:

But not just in PowerPoint – this should be in Word, in OneNote, in Paint, and pretty much any app that accepts ink.

And at last, OneNote

So here’s the blog post from Office noting that this feature will finally be available for OneNote in November.

On iPad, iPhone and Windows 10. Which I presume means it’ll only be on the Windows Store / Metro / Modern / Immersive version of OneNote.

That’s disappointing, because it should really be in every Office app. Hell, I’d update from Office 2013 tomorrow if this was a feature in Office 2016!

Let’s not stop there

Please, Microsoft, don’t stop at the Windows Store version of OneNote.

Shape recognition, along with handwriting recognition (which is apparently also hard), should be a natural part of my use of the Surface Pen. It should work the same across multiple apps.

That’s only going to happen if it’s present in multiple apps, and is a documented API which developers – of desktop apps as well as Store apps – can call into.

Well, desktop apps can definitely get that.

How can I put it into my own app?

I’ll admit that I haven’t had the time yet to build my own sample, but I’m hoping that this still works – there’s an API called “Ink Analysis”, which is exactly how you would achieve this in your app:

It allows you to analyse ink you’ve captured, and decide if it’s text or a drawing, and if it’s a drawing, what kind of drawing it might be.

[I’ve marked this with the tag “Alun’s Code” because I want to write a sample eventually that demonstrates this function.]

NCSAM resume–sorry for the interruption

TL;DR – hardware problems, resuming NCSAM posts when / if I can get time.

Well, that went about as well as can be expected.

I start a month of daily posts, and the first one is all that I achieved.

Perhaps I’ve run out of readers, because nobody asked if I was unwell or had died.

No, I haven’t died, the simple truth is that a combination of hardware failures and beta testing got the better of me.

I’d signed up to the Fast Ring of Windows Insider testing, and had found that Edge and Internet Explorer both seemed to get tired of running Twitter and Facebook, and repeatedly got slower and slower to refresh, until eventually I had to quit and restart them.

Also the SP3 refused to recognise my Microsoft Band as plugged in [actually a hardware failure on the Band, but I’ll come to that another day].

Naturally, I assumed this was all because of the beta build I was using.

So, I did what any good beta tester would do. I filed feedback, and pressed the “Roll back” button.

It didn’t seem to take as long as I expected.


That’s your first sign that something is seriously wrong, and you should take a backup of whatever data is left.

So I did, which is nice, because the next thing that happened is that I tried to open a Windows Store app.

It opened a window and closed immediately.


So did every other Windows Store / Metro / Modern / Immersive app I tried.

Including Windows Store itself.

After a couple of days of dinking around with various ‘solutions’, I decided I’d reached beta death stage, and should FFR (FDISK, FORMAT and Reinstall).

First, make another backup, just because you can’t have too many copies of the data you rely on.

And … we’re good?

That should have been close to the end of the story, with me simply reinstalling all my apps and moving along.

In fact, I started that.

Then my keyboard stopped working. It didn’t even light up.

Plugging the keyboard (it’s a Surface Pro Type Cover) into another Surface (the Surface Pro Type Covers work on, but don’t properly cover, a Surface 2, which we have in my house) demonstrated that the keyboard was just fine on a different system, just not on my main system.

The keyboard for a Surface Pro 3 works on a Surface 2The keyboard for a Surface Pro 3 is not going to fit properly as a cover for your Surface 2

So, I kept a few things running by using my Bluetooth keyboard and mouse, and once I convinced myself it was worth the trip, I took my Surface Pro 3 out to the Microsoft Store in Bellevue for an appointment.

I dream of Jeannie – no, that’s creepy

Jeannie was the tech assigned to help me with my keyboard issue. Helpful and friendly, she didn’t waste time with unnecessary questions or dinking around with stuff that could already be ruled out.

She unplugged the keyboard and tried it on another system. It worked. No need to replace the keyboard.

Can she do a factory reset?

Be my guest – I made another backup before I came out to the store.

So, another quick round of FFR, and the Surface still doesn’t recognise the keyboard.

Definitely a hardware problem, and that’s the advantage of going to the Microsoft Store.

Let me get you a replacement SP3, says Jeannie, and heads out back to the stock room.

Bad news, she says on coming back, We don’t have the exact model you have (an i7 Surface Pro 3 with 256 GB of storage).

Is it OK if we get you the next model up, with twice the storage?

Only if you can’t find any way to upgrade me for free to the shiny Surface Book you have on display up front.

So, now I have a bigger Surface Pro 3

Many thanks to Jeannie for negotiating that upgrade!

But now I have to reinstall all my apps, restore all my data, and get back to functioning before I can engage in fun things like blogging.

I’ll get back into the NCSAM posts, but they’ll be more overview than I was originally planning.

NCSAM post 1: That time again?

Every year, in October, we celebrate National Cyber Security Awareness Month.

Normally, I’m dismissive of anything with the word “Cyber” in it. This is no exception – the adjective “cyber” is a manufactured word, without root, without meaning, and with only a tenuous association to the world it endeavours to describe.

But that’s not the point.

In October, I teach my blog readers about security

And I do it from a very basic level.

This is not the place for me to assume you’ve all been reading and understanding security for years – this is where I appeal to readers with only a vague understanding that there’s a “security” thing out there that needs addressing.

Information Security as a shared responsibility

This first week is all about Information Security – Cyber Security, as the government and military put it – as our shared responsibility.

I’m a security professional, in a security team, and my first responsibility is to remind the thousands of other employees that I can’t secure the company, our customers, our managers, and our continued joint success, without everyone pitching in just a little bit.

I’m also a customer, with private data of my own, and I have a responsibility to take reasonable measures to protect that data, and by extension, my identity and its association with me. But I also need others to take up their responsibility in protecting me.

When we fail in that responsibility…

This year, I’ve had my various identifying factors – name, address, phone number, Social Security Number (if you’re not from the US, that’s a government identity number that’s rather inappropriately used as proof of identity in too many parts of life) – misappropriated by others, and used in an attempt to buy a car, and to file taxes in my name. So, I’ve filed reports of identity theft with a number of agencies and organisations.

I have spent DAYS of time working on preventing further abuse of my identity, and that of my family

Just today, another breach report arrives, from a company I do business with, letting me know that more data has been lost – this time from one of the organisations charged with actually protecting my identity and protecting my credit.

And it’s not just the companies that are at fault

While companies can – and should – do much more to protect customers (and putative customers), and their data, it’s also incumbent on the customers to protect themselves.

Every day, thousands of new credit and debit cards get issued to eager recipients, many of them teenagers and young adults.

Excited as they are, many of these youths share pictures of their new cards on Twitter or Facebook. Occasionally with both sides. There’s really not much your bank can do if you’re going to react in such a thoughtless way, with a casual disregard for the safety of your data.

Sure, you’re only liable for the first $50 of any use of your credit card, and perhaps of your debit card, but it’s actually much better to not have to trace down unwanted charges and dispute them in the first place.

So, I’m going to buy into the first message of National Cyber Security Awareness Month – and I’m going to suggest you do the same:

Stop. Think. Connect.

This is really the base part of all security – before doing a thing, stop a moment. Think about whether it’s a good thing to do, or has negative consequences you hadn’t considered. Connect with other people to find out what they think.

I’ll finish tonight with some examples where stopping a moment to think, and connecting with others to pool knowledge, will improve your safety and security online. More tomorrow.

Example: passwords

The most common password is “12345678”, or “password”. This means that many people are using that simple a password. Many more people are using more secure passwords, but they still make mistakes that could be prevented with a little thought.

Passwords leak – either from their owners, or from the systems that use those passwords to recognise the owners.

When they do, those passwords – and data associated with them – can then be used to log on to other sites those same owners have visited. Either because their passwords are the same, or because they are easily predicted. If my password at Adobe is “This is my Adobe password”, well, that’s strong(ish), but it also gives a hint as to what my Amazon password is – and when you crack the Adobe password leak (that’s already available), you might be able to log on to my Amazon account.

Creating unique passwords – and yes, writing them down (or better still, storing them in a password manager), and keeping them safe – allows you to ensure that leaks of your passwords don’t spread to your other accounts.

Example: Twitter and Facebook

There are exciting events which happen to us every day, and which we want to share with others.

That’s great, and it’s what Twitter and Facebook are there FOR. All kinds of social media available for you to share information with your friends.

Unfortunately, it’s also where a whole lot of bad people hang out – and some of those bad people are, unfortunately, your friends and family.

Be careful what you share, and if you’re sharing about others, get their permission too.

If you’re sharing about children, contemplate that there are predators out there looking for the information you may be giving out. There’s one living just up the road, I can assure you. They’re almost certainly safely withdrawn, and you’re protected from them by natural barriers and instincts. But you have none of those instincts on Facebook unless you stop, think and connect.

So don’t post addresses, locations, your child’s phone number, and really limit things like names of children, friends, pets, teachers, etc – imagine that someone will use that as ‘proof’ to your child of their safety. “It’s OK, I was sent by Aunt Josie, who’s waiting for you to come and see Dobbie the cat”

Example: shared accounts

Bob’s going off on vacation for a month.

Lucky Bob.

Just in case, while he’s gone, he’s left you his password, so that you can log on and access various files.

Two months later, and the office gets raided by the police. They’ve traced a child porn network to your company. To Bob.

Well, actually, to Bob and to you, because the system can’t tell the difference between Bob and you.

Don’t share accounts. Make Bob learn (with the IT department’s help) how to share portions of his networked files appropriately. It’s really not all that hard.

Example: software development

I develop software. The first thing I write is always a basic proof of concept.

The second thing I write – well, who’s got time for a second thing?

Make notes in comments every time you skip a security decision, and make those notes in such a way that you can revisit them and address them – or at least, count them – prior to release, so that you know how badly you’re in the mess.

Ways you haven’t stopped my XSS, Number 3 – helped by the browser / website

Apologies for not having written one of these in a while, but I find that one of the challenges here is to not release details about vulnerable sites while they’re still vulnerable – and it can take oh, so long for web developers to get around to fixing these vulnerabilities.

And when they do, often there’s more work to be done, as the fixes are incomplete, incorrect, or occasionally worse than the original problem.

Sometimes, though, the time goes so slowly, and the world moves on in such a way that you realise nobody’s looking for the vulnerable site, so publishing details of its flaws without publishing details of its identity, should be completely safe.

Helped by the website

So, what sort of attack is actively aided by the website?

Overly-verbose error messages

My favourite “helped by the website” issues are error messages which will politely inform you how your attack failed, and occasionally what you can do to fix it.

Here’s an SQL example:


OK, so now I know I have a SQL statement that contains the sequence “' order by type asc, sequence desc” – that tells me quite a lot. There are two fields called “type” and “sequence”. And my single injected quote was enough to demonstrate the presence of SQL injection.

What about XSS help?

There’s a few web sites out there who will help you by telling you which characters they can’t handle in their search fields:


Now, the question isn’t “what characters can I use to attack the site?”, but “how do I get those characters into the site. [Usually it’s as simple as typing them into the URL instead of using the text box, sometimes it’s simply a matter of encoding]

Over-abundance of encoding / decoding

On the subject of encoding and decoding, I generally advise developers that they should document interface contracts between modules in their code, describing what the data is, what format it’s in, and what isomorphic mapping they have used to encode the data so that it is not possible to confuse it with its surrounding delimiters or code, and so that it’s possible to get the original string back.

An isomorphism, or 1:1 (“one to one”) mapping, in data encoding terms, is a means of making sure that each output can only correspond to one possible input, and vice versa.

Without these contracts, you find that developers are aware that data sometimes arrives in an encoded fashion, and they will do whatever it takes to decode that data. Data arrives encoded? Decode it. Data arrives doubly-encoded? Decode it again. Heck, take the easy way out, as this section of code did:
var input, output;
parms ="&");
input = parms[1];
while (input != output) {
    output = input;
    input = unescape(output);

[That’s from memory, so I apologise if it’s a little incorrect in many, many other ways as well.]

Yes, the programmer had decided to decode the input string until he got back a string that was unchanged.

This meant that an attacker could simply provide a multiply-encoded attack sequence which gets past any filters you have, such as WAFs and the like, and which the application happily decodes for you.

Granted, I don’t think WAFs are much good, compared to actually fixing the code, but they can give you a moment’s piece to fix code, as long as your code doesn’t do things to actively prevent the WAF from being able to help.

Multiple tiers, each decoding

This has essentially the same effect as described above. The request target for an HTTP request may be percent-encoded, and when it is, the server is required to treat it equivalently to the decoded target. This can sometimes have the effect that each server in a multi-tiered service will decode the HTTP request once, achieving the multiple-decode WAF traversal I talk about above.

Spelling correction


OK, that’s illustrative, and it illustrates that Google doesn’t fall for this crap.

But it’s interesting how you’ll find occasionally that such a correction results in executing code.

Stopwords and notwords in searches

When finding XSS in searches, we often concentrate on failed searches – after all, in most product catalogues, there isn’t an item called “<script>prompt()</script>” – unless we put it there on a previous attack.

But often the more complex (and easily attacked) code is in the successful search results – so we want to trigger that page.

Sometimes there’s something called “script”, so we squeak that by (there’s a band called “The Script”, and very often writing on things is desribed as being in a “script” font), but now we have to build Javascript with other terms that match the item on display when we find “script”. Fortunately, there’s a list of words that most search engines are trained to ignore – they are called “stopwords”. These are words that don’t impact the search at all, such as “the”, “of”, “to”, “and”, “by”, etc – words that occur in such a large number of matching items that it makes no sense to allow people to search on those words. Often colours will appear in the list of stopwords, along with generic descriptions of items in the catalogue (“shirt”, “book”, etc).

Well, “alert” is simply “and”[0]+”blue”[1]+”the”[2]+”or”[1]+”the”[0], so you can build function names quickly from stopwords. Once you have String.FromCharCode as a function object, you can create many more strings and functions more quickly. For an extreme example of this kind of “building Javascript from minimal characters”, see this page on how to create all JavaScript from eight basic characters (none of which are alphabetical!)

“Notwords” aren’t a thing, but made the title seem more interesting – sometimes it’d be nice to slip in a string that isn’t a stopword, and isn’t going to be found in the search results. Well, many search functions have a grammar that allow us to say things like “I’d like all your teapots except for the ones made from steel” – or more briefly, “teapot !steel”.

How does this help us execute an attack?

Well, we could just as easily search for “<script> !prompt() </script>” – valid JavaScript syntax, which means “run the prompt() function, and return the negation of its result”. Well, too late, we’ve run our prompt command (or other commands). I even had “book !<script> !prompt()// !</script>” work on one occasion.

Helped by the browser

So, now that we’ve seen some examples of the server or its application helping us to exploit an XSS, what about the browser?

Carry on parsing

One of the fun things I see a lot is servers blocking XSS by ensuring that you can’t enter a complete HTML tag except for the ones they approve of.

So, if I can’t put that closing “>” in my attack, what am I to do? I can’t just leave it out.

Well, strange things happen when you do. Largely because most web pages are already littered with closing angle brackets – they’re designed to close other tags, of course, not the one you’ve put in, but there they are anyway.

So, you inject “<script>prompt()</script>” and the server refuses you. You try “<script prompt() </script” and it’s allowed, but can’t execute.

So, instead, try a single tag, like “<img src=x onerror=prompt()>” – it’s rejected, because it’s a complete tag, so just drop off the terminating angle bracket. “<img src=x onerror=prompt()” – so that the next tag doesn’t interfere, add an extra space, or an “x=”:

<img src=x onerror=prompt() x=

If that gets injected into a <p> tag, it’ll appear as this:

<p><img src=x onerror=prompt() x=</p>

How’s your browser going to interpret that? Simple – open p tag, open img tag with src=x, onerror=prompt() and some attribute called “x”, whose value is “</p”.

If confused, close a tag automatically

Occasionally, browser heuristics and documented standards will be just as helpful to you as the presence of characters in the web page.

Can’t get a “/” character into the page? Then you can’t close a <script> tag. Well, that’s OK, because the <svg> tag can include scripts, and is documented to end at the next HTML tag that isn’t valid in SVG. So… “<svg><script>prompt()<p>” will happily execute as if you’d provided the complete “<svg><script>prompt()</script></svg><p>”

There are many other examples where the browser will use some form of heuristic to “guess” what you meant, or rather to guess what the server meant with the code it sends to the browser with your injected data. See what happens when you leave your attack half-closed.

Can’t comment with // ? Try other comments

When injecting script, you often want to comment the remaining line after your injection, so it isn’t parsed – a failing parse results in none of your injected code being executed.

So, you try to inject “//” to make the rest of the line a comment. Too bad, all “/” characters are encoded or discarded.

Well, did you know that JavaScript in HTML treats “<!—” as a perfectly valid equivalent?

Different browsers help in different ways

Try attacks in different browsers, they each behave in subtly different ways.

Firefox doesn’t have an XSS filter. So it won’t prevent XSS attacks that way.

IE 11 doesn’t encode URI elements, so will sometimes work when your attack would otherwise be encoded.

Chrome – well, I don’t use Chrome often enough to comment on its quirks. Too irritated with it trying to install on my system through Adobe Flash updates.

Well, I think that’s enough for now.

Why didn’t you delete my data?

The recent hack of Ashley Madison, and the subsequent discussion, reminded me of something I’ve been meaning to talk about for some time.

Can a web site ever truly delete your data?

This is usually expressed, as my title suggests, by a user asking the web site who hosted that user’s account (and usually directly as a result of a data breach) why that web site still had the user’s data.

This can be because the user deliberately deleted their account, or simply because they haven’t used the service in a long time, and only remembered that they did by virtue of a breach notification letter (or a web site such as Troy Hunt’s

1. Is there a ‘delete’ feature?

Web sites do not see it as a good idea to have a ‘delete’ feature for their user accounts – after all, what you’re asking is for a site to devote developer resources to a feature that specifically curtails the ability of that web site to continue to make money from the user.

To an accountant’s eye (or a shareholder’s), that’s money out the door with the prospect of reducing money coming in.

To a user’s eye, it’s a matter of security and trust. If the developer deliberately misses a known part of the user’s lifecycle (sunset and deprecation are both terms developers should be familiar with), it’s fairly clear that there are other things likely to be missing or skimped on. If a site allows users to disconnect themselves, to close their accounts, there’s a paradox that says more users will choose to continue their service, because they don’t feel trapped.

So, let’s assume there is a “delete” or “close my account” feature – and that it’s easy to use and functional.

2. Is there a ‘whoops’ feature for the delete?

In the aftermath of the Ashley Madison hack, I’m sure there’s going to be a few partners who are going to engage in retributive behaviours. Those behaviours could include connecting to any accounts that the partners have shared, and cause them to be closed, deleted and destroyed as much as possible. It’s the digital equivalent of cutting the sleeves off the cheating partner’s suit jackets. Probably.

Assuming you’ve finally settled down and broken/made up, you’ll want those accounts back under your control.

So there might need to be a feature to allow for ‘remorse’ over the deletion of an account. Maybe not for the jealous partner reason, even, but perhaps just because you forgot about a service you were making use of by that account, and which you want to resurrect.

OK, so many sites have a ‘resurrect’ function, or a ‘cool-down’ period before actually terminating an account.

Facebook, for instance, will not delete your account until you’ve been inactive for 30 days.

3. Warrants to search your history

Let’s say you’re a terrorist. Or a violent criminal, or a drug baron, or simply someone who needs to be sued for slanderous / libelous statements made online.

OK, in this case, you don’t WANT the server to keep your history – but to satisfy warrants of this sort, a lawyer is likely to tell the server’s operators that they have to keep history for a specific period of time before discarding them. This allows for court orders and the like to be executed against the server to enforce the rule of law.

So your server probably has to hold onto that data for more than the 30 day inactive period. Local laws are going to put some kind of statute on how long a service provider has to hold onto your data.

As an example, a retention notice served under the UK’s rather steep RIPA law could say the service provider has to hold on to some types of data for as much as 12 months after the data is created.

4. Financial and Business records

If you’ve paid for the service being provided, those transaction details have to be held for possible accounting audits for the next several years (in the US, between 3 and 7 years, depending on the nature of the business, last time I checked).

Obviously, you’re not going to expect an audit to go into complete investigations of all your individual service requests – unless you’re billed to that level. Still, this record is going to consist of personal details of every user in the system, amounts paid, service levels given, a la carte services charged for, and some kind of demonstration that service was indeed provided.

So, even if Ashley Madison, or whoever, provided a “full delete” service, there’s a record that they have to keep somewhere that says you paid them for a service at some time in the past.

Eternal data retention – is it inevitable?

I don’t think eternal data retention is appropriate or desirable. It’s important for developers to know data retention periods ahead of time, and to build them into the tools and services they provide.

Data retention shouldn’t be online

Hackers fetch data from online services. Offline services – truly offline services – are fundamentally impossible to steal over the network. An attacker would have to find the facility where they’re stored, or the truck the tapes/drives are traveling in, and steal the data physically.

Not that that’s impossible, but it’s a different proposition from guessing someone’s password and logging into their servers to steal data.

Once data is no longer required for online use, and can be stored, move it into a queue for offline archiving. Developers should make sure their archivist has a data destruction policy in place as well, to get rid of data that’s just too old to be of use. Occasionally (once a year, perhaps), they should practice a data recovery, just to make sure that they can do so when the auditors turn up. But they should also make sure that they have safeguards in place to prevent/limit illicit viewing / use of personal data while checking these backups.

Not everything has to be retained

Different classifications of data have different retention periods, something I alluded to above. Financial records are at the top end with seven years or so, and the minutiae of day-to-day conversations can probably be deleted remarkably quickly. Some services actually hype that as a value of the service itself, promising the messages will vanish in a snap, or like a ghost.

When developing a service, you should consider how you’re going to classify data so that you know what to keep and what to delete, and under what circumstances. You may need a lawyer to help with that.

Managing your data makes service easier

If you lay the frameworks in place when developing a service, so that data is classified and has a documented lifecycle, your service naturally becomes more loosely coupled. This makes it smoother to implement, easier to change, and more compartmentalised. This helps speed future development.

Providing user lifecycle engenders trust and loyalty

Users who know they can quit are more likely to remain loyal (Apple aside). If a user feels hemmed in and locked in place, all that’s required is for someone to offer them a reason to change, and they’ll do so. Often your own staff will provide the reason to change, because if you’re working hard to keep customers by locking them in, it demonstrates that you don’t feel like your customers like your service enough to stay on their own.

So, be careful who you give data to

Yeah, I know, “to whom you give data”, thanks, grammar pedants.

Remember some basic rules here:

1. Data wants to be free

Yeah, and Richard Stallmann’s windows want to be broken.

Data doesn’t want anything, but the appearance is that it does, because when data is disseminated, it essentially cannot be returned. Just like if you go to RMS’s house and break all his windows, you can’t then put the glass fragments back into the frames.

Developers want to possess and collect data – it’s an innate passion, it seems. So if you give data to a developer (or the developer’s proxy, any application they’ve developed), you can’t actually get it back – in the sense that you can’t tell if the developer no longer has it.

2. Sometimes developers are evil – or just naughty

Occasionally developers will collect and keep data that they know they shouldn’t. Sometimes they’ll go and see which famous celebrities used their service recently, or their ex-partners, or their ‘friends’ and acquaintances.

3. Outside of the EU, your data doesn’t belong to you

EU data protection laws start from the basic assumption that factual data describing a person is essentially the non-transferrable property of the person it describes. It can be held for that person by a data custodian, a business with whom the person has a business relationship, or which has a legal right or requirement to that data. But because the data belongs to the person, that person can ask what data is held about them, and can insist on corrections to factual mistakes.

The US, and many other countries, start from the premise that whoever has collected data about a person actually owns that data, or at least that copy of the data. As a result, there’s less emphasis on openness about what data is held about you, and less access to information about yourself.

Ideally, when the revolution comes and we have a socialist government (or something in that direction), the US will pick up this idea and make it clear that service providers are providing a service and acting only as a custodian of data about their customers.

Until then, remember that US citizens have no right to discover who’s holding their data, how wrong it might be, or to ask for it to be corrected.

4. No one can leak data that you don’t give them

Developers should also think about this – you can’t leak data you don’t hold. Similarly, if a user doesn’t give data, or gives incorrect or value-less data, if it leaks, that data is fundamentally worthless.

The fallout from the Ashley Madison leak is probably reduced significantly by the number of pseudonyms and fake names in use. Probably.

Hey, if you used your real name on a cheating web site, that’s hardly smart. But then, as I said earlier today, sometimes security is about protecting bad people from bad things happening to them.

5. Even pseudonyms have value

You might use the same nickname at several places; you might provide information that’s similar to your real information; you might link multiple pseudonymous accounts together. If your data leaks, can you afford to ‘burn’ the identity attached to the pseudonym?

If you have a long message history, you have almost certainly identified yourself pretty firmly in your pseudonymous posts, by spelling patterns, word usages, etc.

Leaks of pseudonymous data are less problematic than leaks of eponymous data, but they still have their problems. Unless you’re really good at OpSec.


Finally, I was disappointed earlier tonight to see that Troy had already covered some aspects of this topic in his weekly series at Windows IT Pro, but I think you’ll see that his thoughts are from a different direction than mine.

Hack Your Friends Next

My buddy Troy Hunt has a popular PluralSight training class called “Hack Yourself First”. This is excellent advice, as it addresses multiple ideas:

  • You have your own permission to hack your own site, which means you aren’t getting into trouble
  • Before looking outward, you get to see how good your own security is
  • Hacking yourself makes it less likely that when you open up to the Internet, you’ll get pwned
  • By trying a few attacks, you’ll get to see what things an attacker might try and how to fend them off

Plenty of other reasons, I’m sure. Maybe I should watch his training.

Every now and again, though, I’ll hack my friends as well. There are a few reasons for this, too:

  • I know enough not to actually break a site – this is important
  • My friends will generally rather hear from me than an attacker that they have an obvious flaw
  • Tools that I use to find vulnerabilities sometimes stay enabled in the background
  • It’s funny

Such is the way with my recent visit to – I’ve been researching reflected XSS issues caused by including script in the Referrer header.

What’s the Referrer header?

Actually, there’s two places that hold the referrer, and it’s important to know the difference between them, because they get attacked in different ways, and attacks can be simulated in different ways.

The Referrer header (actually misspelled as “Referer”) is an HTTP header that the browser sends as part of its request for a new web page. The Referrer header contains a URL to the old page that the browser had loaded and which triggered the browser to fetch the new page.

There are many rules as to when this Referrer header can, and can’t, be sent. It can’t be sent if the user typed a URL. It can’t be sent if the target is HTTP, but the source was HTTPS. But there are still enough places it can be sent that the contents of the Referer header are a source of significant security concern – and why you shouldn’t EVER put sensitive data in the URL or query parameters, even when sending to an HTTPS destination. Even when RESTful.

Forging the Referer when attacking a site is a simple matter of opening up Fiddler (or your other favourite scriptable proxy) and adding a new automatic rule to your CustomRules.js, something like this:

// AMJ
    if (oSession.oRequest.headers.Exists("Referer"))
            if (oSession.oRequest.headers["Referer"].Contains("?"))
                oSession.oRequest.headers["Referer"] += "&\"-prompt()-\"";
                oSession.oRequest.headers["Referer"] += "?\"-prompt()-\"";
            oSession.oRequest.headers["Referer"] = "\"-prompt()-\"";

Something like this code was in place when I visited other recently reported vulnerable sites, but Troy’s I hit manually. Because fun.

JavaScript’s document.referrer

The other referrer is in Javascript, the document.referrer field. I couldn’t find any rules about when this is, or isn’t available. That suggests it’s available for use even in cases where the HTTP Referer header believes it is not safe to do so, at least in some browser or other.

Forging this is harder, and I’m not going to delve into it. I want you to know about it in case you’ve used the Referer header, and referrer-vulnerable code isn’t triggering. Avoids tearing your hair out.

Back to the discovery

So, lately I’ve been testing sites with a URL ending in the magic string ?"-prompt()-" – and happened to try it at Troy’s site, among others.

I’d seen a pattern of advertising being vulnerable to this issue. [It’s not the only one by any means, but perhaps the most prevalent]. It’s difficult accurately reproducing this issue, because advertising mediators will send you to different advertisers each time you visit a site.

And so it was with great surprise that I tried this on Troy’s site and got an immediate hit. Partly because I know Troy will have already tried this on his own site.

Through a URL parameter, I’m injecting script into a hosted component that unwisely includes the Referer header’s contents in its JavaScript without encoding and/or quoting it first.

It’s ONLY Reflected XSS

I hear that one all the time – no big deal, it’s only a reflected XSS, the most you can do with this is to abuse yourself.

Kind of, yeah. Here’s some of my reasons why Reflected XSS is important:

  • It’s an obvious flaw – it suggests your code is weak all over
  • It’s easy to fix – if you don’t fix the easy flaws, do you want me to believe you fix the hard ones?
  • An attacker can send a link to your trusted web site in a spam email, and have thousands of your users clicking on it and being exploited
  • It’s like you’ve hired a new developer on your web site – the good news is, you don’t have to pay them. The bad news is, they don’t turn up to design meetings, and may have completely different ideas about how your web site should work
  • The attacker can change content as displayed to your users without you knowing what changes are made
  • The attacker can redirect your users to other malicious websites, or to your competitors
  • The attacker can perform network scans of your users’ systems
  • The attacker can run keylogging – capturing your users’ username and password, for instance
  • The attacker can communicate with your users – with your users thinking it’s you
  • A reflected XSS can often become stored XSS, because you allow users of your forums / reviews / etc to post links to your site “because they’re safe, trusted links”
  • Once an attacker convinces one of your staff to visit the reflected XSS, the attack becomes internal. Your staff will treat the link as “trusted” and “safe”
  • Any XSS will tend to trump your XSRF protections.

So, for multiple values of “self” outside the attacker, you can abuse yourself with Reflected XSS.

Contacting the vendor and resolving

With all security research, there comes a time when you want to make use of your findings, whether to garner yourself more publicity, or to earn a paycheck, or simply to notify the vendor and have them fix something. I prefer the latter, when it’s possible / easy.

Usually, the key is to find an email address at the vulnerable domain – but wasn’t working, and I couldn’t find any hints of an actual web site at for me to go look at.

Troy was able to start from the other direction – as the owner of a site showing these adverts, he contacted the advertising agent that puts ads onto his site, and get them to fix the issue.

“Developer Media” was the name of the group, and their guy Chris quickly got onto the issue, as did Jamie from Integral Ads, the owners of Developer Media pulled adsafeprotected as a source of ads, and Integral Ads fixed their code.

Sites that were previously vulnerable are now not vulnerable – at least not through that exact attack.

I count that as a win.

There’s more to learn here

Finally, some learning.

1. Reputational risk / impact

Your partners can bring you as much risk as your own developers and your own code. You may be able to transfer risk to them, but you can’t transfer reputational risk as easily. With different notifications, Troy’s brand could have been substantially damaged, as could Developer Media’s and Integral Ads’. As it is, they all responded quickly, quietly and appropriately, reducing the reputational impact.

[As for my own reputational impact – you’re reading this blog entry, so that’s a positive.]

2. Good guy / bad guy hackers

This issue was easy to find. So it’s probably been in use for a while by the bad guys. There are issues like this at multiple other sites, not related to adsafeprotected.

So you should test your site and see if it’s vulnerable to this, or similar, code. If you don’t feel like you’ll do a good job, employ a penetration tester or two.

3. Reducing risk by being paranoid (iframe protection)

There’s a thin line between “paranoia” and “good security practice”. Troy’s blog uses good security practice, by ensuring that all adverts are inside an iframe, where they can’t execute in Troy’s security context. While I could redirect his users, perhaps to a malicious or competing site, I wasn’t able to read his users’ cookies, or modify content on his blog.

There were many other hosts using adsafeprotected without being in an iframe.

Make it a policy that all externally hosted content (beyond images) is required to be inside of an iframe. This acts like a firewall between your partners and you.

4. Make yourself findable

If you’re a developer, you need to have a security contact, and that contact must be findable from any angle of approach. Security researchers will not spend much time looking for your contact information.

Ideally, for each domain you handle, have the address (where you replace “” with your domain) point to a monitored email address. This will be the FIRST thing a security researcher will try when contacting you. Finding the “Contact Us” link on your web page and filling out a form is farther down on the list of things a researcher will do. Such a researcher usually has multiple findings they’re working on, and they’ll move on to notifying someone else rather than spend time looking for how to notify you.

5. Don’t use “safe”, “secure”, “protected” etc in your domain name

This just makes it more ironic when the inevitable vulnerability is found.

6. Vulns protected by XSS Filter are still vulns

As Troy notes, I did have to disable the XSS Filter in order to see this vuln happen.

That doesn’t make the vuln any less important to fix – all it means is that to exploit it, I have to find customers who have also disabled the XSS Filter, or find a way to evade the filter.

There are many sites advising users how to disable the XSS Filter, for various (mostly specious) reasons, and there are new ways every day to evade the filter.

7. Ad security is HARD

The web ad industry is at a crisis point, from my perspective.

Flash has what appear to be daily vulnerabilities, and yet it’s still seen to be the medium of choice for online advertising.

Even without vulnerabilities in Flash, its programmability lends it to being used by bad guys to distribute malicious software. There are logic-based and time-based exploits (display a ‘good’ ad when inspected by the ad hosting provider; display a bad ad, or do something malicious when displayed on customers’ computers) which attackers will use to ensure that their ad passes rigorous inspection, but still deploys bad code to end users.

Any ad that uses JavaScript is susceptible to common vulnerability methods.

Ad blockers are being run by more and more people – even institutions (one college got back 40% of their network bandwidth by employing ad blocking).

Web sites need to be funded. If you’re not paying for the content, someone is. How is that to be done except through advertising? [Maybe you have a good idea that hasn’t been tried yet]

8. Timing of bug reports is a challenge

I’ll admit, I was bored when I found the bug on Troy’s site on a weekend. I decided to contact him straight away, and he responded immediately.

This led to Developer Media being contacted late on a Sunday.

This is not exactly friendly of me and Troy – but at least we didn’t publish, and left it to the developers to decide whether to treat this as a “fire drill”.

A good reason, indeed, to use responsible / coordinated disclosure, and make sure that you don’t publish until teams are actively working on / have resolved the problem.

9. Some browsers are safer – that doesn’t mean your web site is safe

There are people using old and poorly configured browsers everywhere. Perhaps they make up .1% of your users. If you have 100,000 users, that’s a hundred people who will be affected by issues with those browsers.

Firefox escaped because it encoded the quote characters to %22, and the server at adsafeprotected didn’t decode them. Technically, adsafeprotected’s server is not RFC compliant because of this, so Firefox isn’t really protecting anyone here.

Chrome escaped because it encoded the quote characters AND has an XSS filter to block things like my attack. This is not 100% safe, and can be disabled easily by the user.

Internet Explorer up to version 11 escaped if you leave the XSS Filter turned on.

Microsoft Edge in Windows 10 escaped because it encodes the quote characters and has a robust XSS Filter that, as far as I can tell, you can’t turn off.

All these XSS filters can be turned off by setting a header in network traffic.

Nobody would do that.

Until such time as one of these browsers has a significant flaw in their XSS filter.

So, don’t rely on the XSS Filter to protect you – it can’t be complete, and it may wind up being disabled.

Windows 10 – first impressions

I’ve updated from Windows 8.1 to Windows 10 Enterprise Insider Preview over this weekend, on my Surface Pro 3 and a Lenovo tablet. Both machines are used for software development as well as playing games, so seemed the ideal place to practice.

So here’s some initial impressions:

1. VPN still not working properly

I’ve mentioned before (ranted, perhaps) about how the VPN support in Windows 8.1 is great for desktop apps, but broken for Metro / Modern / Immersive / Windows Store apps.

Still, maybe now I’m able to provide feedback, and Windows is in a beta test phase, perhaps they’ll pay attention and fix the bugs.

2. Stuff crashes

It’s a beta, but just in case you were persuaded to install this on a production system, it’s still not release quality.

Every so often, the Edge browser (currently calling itself “Project Spartan”) will just die on you.

I’ve managed to get the “People Hub” to start exactly twice without crashing immediately.

3. Update after you install

Download the most recent version from the Insider’s page, and you still have to apply an update to the entire system before you’re actually up to date. The update takes essentially as long as the initial install.

4. Update before you install – and make a backup

Hey, it’s a beta – what did you expect?

Things will break, you’ll find yourself missing functionality, so you may need to restore to your original state. Update before you install, and fewer things will be as likely to go wrong in the upgrade.

5. Provide feedback – even about the little things

They won’t fix things you don’t provide feedback about.

OK, so maybe they also won’t fix things that you DO provide feedback on, but that’s how life works. Not everything gets fixed. Ever.

But if you don’t report issues, you won’t ever see them fixed.

6. The new People Hub is awful

The People “Hub” in Windows 10, from the couple of times I’ve managed to execute it, basically has my contacts, and can display what’s new from them in Outlook Mail.

I rather enjoy the Windows 8.1 People Hub, where you can see in one place the most recent interactions in Twitter, Facebook, LinkedIn and Skype. Or at least, that’s what it promises, even if it only actually delivers Facebook and Twitter.

7. Videos can now be deleted

It’s always possible to delete a video file, of course, but in Windows 8.1, after you’ve finished watching a video from the Videos app, you had to go find some other tool in which to do so – and hope that you deleted the right one.

In Windows 10 you can use the context menu (right click, or tap and hold) on a video to delete it from your store.

Still needs some more work – it doesn’t display subtitles / closed-captioning, it only orders alphabetically, and there’s no jumping to the letter “Q” by pressing the “Q” key, but this app is already looking very functional even for those of us who collect MP4 files to watch.

8. No Media Center

I really, really liked the Media Center. More than TiVo. We have several Media Center PCs in our house, and now we have to figure out what we’re going to do. I’m not going back to having a made-for-purpose device that can’t do computing, I want my Media Center. I’ll try some of its competitors, but it’d be really nice if Microsoft relents and puts support back for Media Center.

9. Edge / Spartan browser – awesome

Excellent HTML5 compatibility, reduced chance of being hit by third party vulnerabilities, F12 Developer Tools, and still allows me to test for XSS vulnerabilities if I choose to do so.

Pretty much what I want in a browser, although from a security standpoint, the choice to allow two third party vulnerabilities add-ins into the browser, Flash and Reader, seems to be begging future trouble.

Having said that, you can disable Adobe Flash in the Advanced Settings of your Spartan browser. I’m going to recommend that you do that on all your non-gaming machines. Then find out which of your web sites need it, and either fix them, or decide whether you can balance the threat of Flash with the utility of that service.

The F12 Developer Tools continue to be a very useful set of web site debugging tools, and assist me greatly in discovering and expanding on web site vulnerabilities. I personally find them easier than debugging tools in other browsers, and they have the benefit of being always installed in recent Microsoft browsers.

The “Reader” view is a nice feature, although it was present in Windows 8.1, and should be used any time you want to actually read the contents of a story, rather than wade through adverts and constant resizing of other content around the text you’re actually interested in.

9.1 XSS

Because, you know, I’m all about the XSS.

Internet Explorer has a pretty assertive XSS filter built in, and even when you turn it off in your settings, it still comes back to prevent you. I find this to be tricky, because I sometimes need to convince developers of the vulnerabilities in their apps. Firefox is often helpful here, because it has NO filters, but sometimes the behaviour I’m trying to show is specific to Internet Explorer.

Particularly, if I type a quote character into the URL in Internet Explorer, it sends a quote character. Firefox will send a %22 or %27 (double or single quotes). So, sometimes IE will trigger behaviour that Firefox doesn’t.

Sadly, although Spartan does seem to still be useful for XSS testing, the XSS filter can’t be specifically turned off in settings. I’d love to see if I can find a secret setting for this.

10. Microsoft Print to PDF

Windows has needed a PDF printer since, oh, Windows 3.1. A print driver that prompts you for a file name, and saves whatever you’re printing as a PDF file.

With Office, this kind of existed with Save as PDF. With OneNote, you could Print to OneNote, open the View ribbon, and hide the header, before exporting as a PDF. But that’s the long way around.

With Windows 10, Microsoft installed a new printer driver, “Microsoft Print to PDF”. It does what it says on the tin, allowing you to generate PDFs from anywhere that can print.

11. Tablet Mode / PC mode

I use a Surface Pro 3 as my main system, and I have to say that the reversion to a mainly desktop model of operations is nice to my eyes, but a little confusing to the hands – I don’t quite know how to manage things any more.

Sometimes I like to work without the keyboard, because the tablet works well that way. But now I can’t close apps by sliding from top to bottom, even when I’ve expanded them to full screen. Not sure how I’m supposed to do this.

Lessons to learn already from Premera – 2. Prevention

Not much has been released about exactly how Premera got attacked, and certainly nothing from anyone with recognised insider knowledge.

Disclaimer: I worked at Premera in the Information Security team, but it’s so so long ago that any of my internal knowledge is incorrect – so I’ll only talk about those things that I have seen published.

I am, above all, a customer of Premera’s, from 2004 until just a few weeks ago. But I’m a customer with a strong background in Information Security.

What have we read?

Almost everything boils down rather simply to one article as the source of what we know.

Some pertinent dates

February 4, 2015: News stories break about Anthem’s breach (formerly Wellpoint).

January 29, 2015: The date given by Premera as the date when they were first made aware that they’d been attacked.

I don’t think that it’s a coincidence that these dates are so close together. In my opinion, these dates imply that Anthem / Wellpoint found their own issues, notified the network of other health insurance companies, and then published to the news outlets.

As a result of this, Premera recognised the same attack patterns in their own systems.

This suggests that any other health insurance companies attacked by the same group (alleged to be “Deep Panda”) will discover and disclose it shortly.

Why I keep mentioning Wellpoint

I’ve kind of driven in the idea that Anthem used to be called Wellpoint, and the reason I’m bringing this out is that a part of the attack documented by ThreatConnect was to create a site called “” – that’s “”, but with the two letter “els” replaced with two “one” digits.

That’s relevant because the ThreatConnect article also called out that there was a web site called “” created by the same group.

That’s sufficient for an attack

So, given a domain name similar to that of a site you wish to attack, how would you get full access to the company behind that site?

Here’s just one way you might mount that attack. There are other ways to do this, but this is the most obvious, given the limited information above.

  1. Create a domain,
  2. Visit the sites used by employees from the external Internet (these would be VPN sites, Outlook Web Access and similar, HR sites – which often need to be accessed by ex-employees)
  3. Capture data related to the logon process – focus only on the path that represents a failed logon (because without passwords, you don’t know what a successful logon does)
  4. Replicate that onto your network. Test it to make sure it looks good
  5. Now send email to employees, advising them to check on their email, their paystubs, whatever sites you’ve verified, but linking them to the version
  6. Sit back and wait for people to send you their usernames and passwords – when they do, tell them they’ve got it wrong, and redirect them to the proper site
  7. Log on to the target network using the credentials you’ve got

If you’re concerned that I’m telling attackers how to do this, remember that this is obvious stuff. This is already a well known attack strategy, “homograph attacks”. This is what a penetration tester will do if you hire one to test your susceptibility to social engineering.

There’s no vulnerability involved, there’s no particularly obvious technical failing here, it’s just the age-old tactic of giving someone a screen that looks like their logon page, and telling them they’ve failed to logon. I saw this basic form of attack in the eighties, it’s that old.

How to defend against this?

If you’ve been reading my posts to date, you’ll know that I’m aware that security offence is sexy and exciting, but security defence is really where the clever stuff belongs.

I have a few simple recommendations that I think apply in this case:

  1. Separate employee and customer account databases. Maybe your employees are also customers, but their accounts should be separately managed and controlled. Maybe the ID is the same in both cases, but the accounts being referenced must be conceptually and architecturally in different account pools.
  2. Separate employee and customer web domains. This is just a generally good idea, because it means that a session ID or other security context from the customer site cannot be used on the employee site. Cross-Origin Security Policies apply to allow the browser to prevent sharing of credentials and access between the two domains. Separation of environments is one of the tasks a firewall can achieve relatively successfully, and with different domains, that job is assisted by your customers’ and employees’ browsers.
  3. External-to-internal access must be gated by a secondary authentication factor (2FA, MFA – standing for two/multi factor authentication). That way, an attacker who phishes your employees will not get a credential they can use from the outside.

Another tack that’s taken by companies is to engage a reputation management company, to register domain names that are homoglyphs to your own (those that look the same in a browser address bar).  Or, to file lawsuits that take down such domains when they appear. Whichever is cheaper. My perspective on this is that it costs money, and is doomed to fail whenever a new TLD arises, or your company creates a new brand.

[Not that reputation management companies can’t help you with your domain names, mind you – they can prevent you, for instance, from releasing a product with a name that’s already associated with a domain name owned by another company.]

These three steps are somewhat interdependent, and they may cause a certain degree of inconvenience, but they will prevent exactly the kind of attacks I’ve described. [Yes, there are other potential attacks, but none introduced by the suggested changes]

HTML data attributes – stop my XSS

First, a disclaimer for the TL;DR crowd – data attributes alone will not stop all XSS, mine or anyone else’s. You have to apply them correctly, and use them properly.

However, I think you’ll agree with me that it’s a great way to store and reference data in a page, and that if you only handle user data in correctly encoded data attributes, you have a greatly-reduced exposure to XSS, and can actually reduce your exposure to zero.

Next, a reminder about my theory of XSS – that there are four parts to an XSS attack – Injection, Escape, Attack and Cleanup. Injection is necessary and therefore can’t be blocked, Attacks are too varied to block, and Cleanup isn’t always required for an attack to succeed. Clearly, then, the Escape is the part of the XSS attack quartet that you can block.

Now let’s set up the code we’re trying to protect – say we want to have a user-input value accessible in JavaScript code. Maybe we’re passing a search query to Omniture (by far the majority of JavaScript Injection XSS issues I find). Here’s how it often looks:

/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/

Let’s suppose that “SEARCH-STRING” above is the string for which I searched.

I can inject my code as a search for:


The second line then becomes:


Yes, I know you can’t subtract two strings, but JavaScript doesn’t know that until it’s evaluated the function, and by then it’s too late, because it’s already executed the bad thing. A more sensible language would have thrown an error at compile time, but this is just another reason for security guys to hate dynamic languages.

How do data attributes fix this?

A data attribute is an attribute in an HTML tag, whose name begins with the word “data” and a hypen.

These data attributes can be on any HTML tag, but usually they sit in a tag which they describe, or which is at least very close to the portion of the page they describe.

Data attributes on table cells can be associated to the data within that cell, data attributes on a body tag can be associated to the whole page, or the context in which the page is loaded.

Because data attributes are HTML attributes, quoting their contents is easy. In fact, there’s really only a couple of quoting rules needed to consider.

  1. The attribute’s value must be quoted, either in double-quote or single-quote characters, but usually in double quotes because of XHTML
  2. Any ampersand (“&”) characters need to be HTML encoded to “&amp;”.
  3. Quote characters occurring in the value must be HTML encoded to “&quot;

Rules 2 & 3 can simply be replaced with “HTML encode everything in the value other than alphanumerics” before applying rule 1, and if that’s easier, do that.

Sidebar – why those rules?

HTML parses attribute value strings very simply – look for the first non-space character after the “=” sign, which is either a quote or not a quote. If it’s a quote, find another one of the same kind, HTML-decode what’s in between them, and that’s the attribute’s value. If the first non-space after the equal sign is not a quote, the value ends at the next space character.
Contemplate how these are parsed, and then see if you’re right:

  • <a onclick="prompt("1")">&lt;a onclick="prompt("1")"&gt;</a>

  • <a onclick = "prompt( 1 )">&lt;a onclick = "prompt( 1 )"&gt;</a>

  • <a onclick= prompt( 1 ) >&lt;a onclick= prompt( 1 ) &gt;</a>

  • <a onclick= prompt(" 1 ") >&lt;a onclick= prompt(" 1 ") &gt;</a>

  • <a onclick= prompt( "1" ) >&lt;a onclick= prompt( "1" ) &gt;</a>

  • <a onclick= "prompt( 1 )">&lt;a onclick=&amp;#9;"prompt( 1 )"&gt;</a>

  • <a onclick= "prompt( 1 )">&lt;a onclick=&amp;#32;"prompt( 1 )"&gt;</a>

  • <a onclick= thing=1;prompt(thing)>&lt;a onclick= thing=1;prompt(thing)&gt;</a>

  • <a onclick="prompt(\"1\")">&lt;a onclick="prompt(\"1\")"&gt;</a>

Try each of them (they aren’t live in this document – you should paste them into an HTML file and open it in your browser), see which ones prompt when you click on them. Play with some other formats of quoting. Did any of these surprise you as to how the browser parsed them?

Here’s how they look in the Debugger in Internet Explorer 11:


Uh… That’s not right, particularly line 8. Clearly syntax colouring in IE11’s Debugger window needs some work.

OK, let’s try the DOM Explorer:


Much better – note how the DOM explorer reorders some of these attributes, because it’s reading them out of the Document Object Model (DOM) in the browser as it is rendered, rather than as it exists in the source file. Now you can see which are interpreted as attribute names (in red) and which are the attribute values (in blue).

Other browsers have similar capabilities, of course – use whichever one works for you.

Hopefully this demonstrates why you need to follow the rules of 1) quoting with double quotes, 2) encoding any ampersand, and 3) encoding any double quotes.

Back to the data-attributes

So, now if I use those data-attributes, my HTML includes a number of tags, each with one or more attributes named “data-something-or-other”.

Accessing these tags from basic JavaScript is easy. You first need to get access to the DOM object representing the tag – if you’re operating inside of an event handler, you can simply use the “this” object to refer to the object on which the event is handled (so you may want to attach the data-* attributes to the object which triggers the handler).

If you’re not inside of an event handler, or you want to get access to another tag, you should find the object representing the tag in some other way – usually document.getElementById(…)

Once you have the object, you can query an attribute with the function getAttribute(…) – the single argument is the name of the attribute, and what’s returned is a string – and any HTML encoding in the data-attribute will have been decoded once.

Other frameworks have ways of accessing this data attribute more easily – for instance, JQuery has a “.data(…)” function which will fetch a data attribute’s value.

How this stops my XSS

I’ve noted before that stopping XSS is a “simple” matter of finding where you allow injection, and preventing, in a logical manner, every possible escape from the context into which you inject that data, so that it cannot possibly become code.

If all the data you inject into a page is injected as HTML attribute values or HTML text, you only need to know one function – HTML Encode – and whether you need to surround your value with quotes (in a data-attribute) or not (in HTML text). That’s a lot easier than trying to understand multiple injection contexts each with their own encoding function. It’s a lot easier to protect the inclusion of arbitrary user data in your web pages, and you’ll also gain the advantage of not having multiple injection points for the same piece of data. In short, your web page becomes more object-oriented, which isn’t a bad thing at all.

One final gotcha

You can still kick your own arse.

When converting user input from the string you get from getAttribute to a numeric value, what function are you going to use?

Please don’t say “eval”.

Eval is evil. Just like innerHtml and document.write, its use is an invitation to Cross-Site Scripting.

Use parseFloat() and parseInt(), because they won’t evaluate function calls or other nefarious components in your strings.

So, now I’m hoping your Omniture script looks like this:

<div id="myDataDiv" data-search-term="SEARCH-STRING"></div>
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/

You didn’t forget to HTML encode your SEARCH-STRING, or at least its quotes and ampersands, did you?

P.S. Omniture doesn’t cause XSS, but many people implementing its required calls do.