General Security

How do you encrypt a password?

I hate when people ask me this question, because I inevitably respond with a half-dozen questions of my own, which makes me seem like a bit of an arse.

To reduce that feeling, because the questions don’t seem to be going away any time soon, I thought I’d write some thoughts out.

Put enough locks on a thing, it's secure. Or it collapses under the weight of the locks.

Do you want those passwords in the first place?

Passwords are important objects – and because people naturally share IDs and passwords across multiple services, your holding on to a customer’s / user’s password means you are a necessary part of that user’s web of credential storage.

It will be a monumental news story when your password database gets disclosed or leaked, and even more of a story if you’ve chosen a bad way of protecting that data. You will lose customers and you will lose business; you may even lose your whole business.

Take a long hard look at what you’re doing, and whether you actually need to be in charge of that kind of risk.

Do you need those passwords to verify a user or to impersonate them?

If you are going to verify a user, you don’t need encrypted passwords, you need hashed passwords. And those hashes must be salted. And the salt must be large and random. I’ll explain why some other time, but you should be able to find much documentation on this topic on the Internet. Specifically, you don’t need to be able to decrypt the password from storage, you need to be able to recognise it when you are given it again. Better still, use an acknowledged good password hashing mechanism like PBKDF2. (Note, from the “2” that it may be necessary to update this if my advice is more than a few months old)
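As a rough illustration of "hash, salt, recognise" using PBKDF2, here is a minimal sketch with Python's standard library. The iteration count, salt size and function names are my own assumptions for the example, not a recommendation for your system; check current guidance before picking real values.

```python
import hashlib
import hmac
import os

# Assumed parameters for illustration only; tune them to your own hardware and policy.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str):
    """Return (salt, derived_key) for storage; the password itself is never stored."""
    salt = os.urandom(SALT_BYTES)  # large, random, and different for every user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Recompute the hash from the offered password and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, stored_key)
```

Note that nothing here can turn the stored value back into the password; all it can do is recognise the password when it is offered again, and two users with the same password get different stored values because their salts differ.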

Now, do not read the rest of this section – skip to the next question.

Seriously, what are you doing reading this bit? Go to the heading with the next question. You don’t need to read the next bit.

<sigh/>

OK, if you are determined that you will have to impersonate a user (or a service account), you might actually need to store the password in a decryptable form.

First make sure you absolutely need to do this, because there are many other ways to impersonate an incoming user, such as delegation, which don’t require you to store the password.

Explore delegation first.

Finally, if you really have to store the password in an encrypted form, you have to do it incredibly securely. Make sure the key is stored separately from the encrypted passwords, and don’t let your encryption be brute-forcible. A BAD way to encrypt would be to simply encrypt the bare password using your public key – sure, this means only you can decrypt it, but because the public key is public and the result is the same every time, anyone can encrypt a list of likely passwords and compare the results against the ciphertext.
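To make the BAD case concrete, here is a toy sketch of the guess-and-compare attack against unpadded ("textbook") RSA. Everything in it is an assumption for illustration: the two primes are small Mersenne primes picked for convenience, the word list is made up, and no real system should look like this.

```python
# Toy "textbook RSA" with no random padding. The primes are assumptions chosen
# for illustration only (two Mersenne primes); real keys are far larger.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q   # public modulus
E = 65537   # public exponent


def encrypt_deterministic(password: str) -> int:
    """Encrypt with the public key only; the same input always gives the same output."""
    m = int.from_bytes(password.encode("utf-8"), "big")
    if m >= N:
        raise ValueError("password too long for this toy modulus")
    return pow(m, E, N)


def guess_password(ciphertext: int, wordlist):
    """An attacker holding only the public key encrypts guesses and compares."""
    for guess in wordlist:
        if encrypt_deterministic(guess) == ciphertext:
            return guess
    return None


stolen = encrypt_deterministic("letmein")
common = ["123456", "password", "letmein", "qwerty"]
print(guess_password(stolen, common))
```

The attacker never needs the private key. That is why the added randomness matters: with a random component mixed in before encryption, the same password no longer produces the same ciphertext twice, and guesses can't be checked this way.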

A GOOD way to encrypt the password is to add some entropy and padding to it (so I can’t tell how long the password was, and I can’t tell if two users have the same password), and then encrypt it.

Password storage mechanisms such as keychains or password vaults will do this for you.

If you don’t have keychains or password vaults, you can encrypt using a function like Windows’ CryptProtectData, or its .NET equivalent, System.Security.Cryptography.ProtectedData.

[Caveat: CryptProtectData and ProtectedData use DPAPI, which requires careful management if you want it to work across multiple hosts. Read the API and test before deploying.]

[Keychains and password vaults often have the same sort of issue with moving the encrypted password from one machine to another.]

For .NET documentation on password vaults in Windows 8 and beyond, see: Windows.Security.Credentials.PasswordVault

For non-.NET on Windows from XP and later, see: CredWrite

For Apple, see documentation on Keychains

Can you choose how strong those passwords must be?

If you’re protecting data in a business, you can probably tell users how strong their passwords must be. Look for measures that correlate strongly with entropy – how long is the password, does it use characters from a wide range (or is it just the letter ‘a’ repeated over and over?), is it similar to any of the most common passwords, does it contain information that is obvious, such as the user’s ID, or the name of this site?
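A minimal sketch of what such checks might look like follows. The minimum length, the distinct-character threshold and the tiny deny-list are all hypothetical values for the example; a real deny-list would be far longer.

```python
# Hypothetical deny-list; in practice this would be a much larger list of
# commonly used passwords.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
MIN_LENGTH = 12          # assumed policy value
MIN_DISTINCT_CHARS = 5   # assumed policy value


def password_problems(password: str, user_id: str, site_name: str):
    """Return a list of reasons the password looks weak (empty list = no objections)."""
    problems = []
    lowered = password.lower()
    if len(password) < MIN_LENGTH:
        problems.append("too short")
    if len(set(password)) < MIN_DISTINCT_CHARS:
        problems.append("too few distinct characters")
    if lowered in COMMON_PASSWORDS:
        problems.append("one of the most common passwords")
    if user_id and user_id.lower() in lowered:
        problems.append("contains the user ID")
    if site_name and site_name.lower() in lowered:
        problems.append("contains the site name")
    return problems
```

Returning the reasons, rather than a bare yes or no, lets you tell the user what to change.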

Maybe you can reward customers for longer passwords – even something as simple as a “strong account award” sticker on their profile page can induce good behaviour.

Length is mathematically more important to password entropy than the range of characters. An eight character password chosen from 64 characters (less than three hundred trillion combinations – a number with 4 commas) is weaker than a 64 character password chosen from eight characters (a number of combinations with 19 commas in it).

An 8-character password taken from 64 possible characters is actually as strong as a password only twice as long and chosen from 8 characters – this means something like a complex password at 8 characters in length is as strong as the names of the notes in a couple of bars of your favourite tune.
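If you want to check the arithmetic yourself, a few lines of Python will do it. The helper names are mine; the numbers are the ones quoted above.

```python
import math


def combinations(alphabet_size: int, length: int) -> int:
    """Number of possible passwords of exactly this length."""
    return alphabet_size ** length


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits, assuming every character is chosen uniformly at random."""
    return length * math.log2(alphabet_size)


print(f"{combinations(64, 8):,}")             # 8 characters drawn from 64
print(f"{combinations(8, 64):,}".count(","))  # commas in 64 characters drawn from 8
print(entropy_bits(64, 8), entropy_bits(8, 16))
```

Eight characters from an alphabet of 64 and sixteen characters from an alphabet of 8 both come to 48 bits, which is the "only twice as long" equivalence.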

Allowing users to use password safes of their own makes it easier for them to use longer and more complex passwords. This means allowing copy and paste into password fields, and where possible, integrating with any OS-standard password management schemes.

What happens when a user forgets their password?

Everything seems to default to sending a password reset email. This means your users’ email address is equivalent to their credential. Is that strength of association truly warranted?

In the process to change my email address, you should ask me for my password first, or similarly strongly identify me.

What happens when I stop paying my ISP, and they give my email address to a new user? Will they have my account on your site now, too?

Every so often, maybe you should renew the relationship between account and email address – baselining – to ensure that the address still exists and still belongs to the right user.

Do you allow password hints or secret questions?

Password hints push you dangerously into the realm of actually storing passwords. People use hints such as “The password is ‘Oompaloompah’” – so, if you store password hints, you must encrypt them as strongly as if you were encrypting the password itself. Because, much of the time, you are. And see the earlier advice, which says you want to avoid doing that if at all possible.

Other questions that I’m not answering today…

How do you enforce occasional password changes, and why?

What happens when a user changes their password?

What happens when your password database is leaked?

What happens when you need to change hash algorithm?

On Widespread XSS in Ad Networks

Randy Westergren posted a really great piece entitled “Widespread XSS Vulnerabilities in Ad Network Code Affecting Top Tier Publishers, Retailers”.

Go read it – I’ll wait.

The article triggered a lot of thoughts that I’ll enumerate here:

This is not a new thing – and that’s bad

This was reported by SoftPedia as a “new attack”, but it’s really an old attack. This is just another way to execute DOM-based XSS.

That means that web sites are being attacked by old bugs, not because their own coding is bad, but because they choose to make money from advertising.

And because the advertising industry is just waaaay behind on securing their code, despite being effectively a widely-used framework across the web.

You’ve seen previously on my blog how I attacked Troy Hunt’s blog through his advertising provider, and he’s not the first, or by any means the last, “victim” of my occasional searches for flaws.

It’s often difficult to trace which ad provider is responsible for a piece of vulnerable code, and the hosting site may not realise the nature of their relationship and its impact on security. As a security researcher, it’s difficult to get traction on getting these vulnerabilities fixed.

Important note

I’m trying to get one ad provider right now to fix their code. I reported a bug to them, and they pointed out it was similar to the work Randy Westergren had written up.

So they are aware of the problem.

It’s over a month later, and the sites I pointed out to them as proofs of concept are still vulnerable.

Partly, this is because I couldn’t get a reliable repro as different ad providers loaded up, but it’s been two weeks since I sent them a reliable repro – which is still working.

Reported a month ago, reliable repro two weeks ago, and still vulnerable everywhere.

[If you’re defending a site and want to figure out which ad provider is at fault, inject a “debugger” statement into the payload, to have the debugger break at the line that’s causing a problem. You may need to do this by replacing “prompt()” or “alert()” with “(function(){debugger})()” – note that it’ll only break into the debugger if you have the debugger open at the time.]

How the “#” affects the URL as a whole

Randy’s attack example uses a symbol you won’t see at all in some web sites, but which you can’t get away from in others. The “#” symbol, also known as “number” or “hash”. [Don’t call it “pound”, please, that’s a different symbol altogether, “£”] Here’s his example:

http://nypost.com/#1'-alert(1)-'"-alert(1)-"

Different parts of the URL have different names. The “http:” part is the “protocol” (formally, the “scheme”), which tells the browser how to connect and what commands will likely work. “//nypost.com/” is the host part, and tells the browser where to connect to. Sometimes a port number is used – commonly, 80 or 443 – after the host name but before the terminating “/” of the host element. Anything after the host part, and before a question-mark or hash sign, is the “path” – in Randy’s example, the path is left out, indicating he wants the root page. An optional “query” part follows the path, indicated by a question mark at its start, and running up to any “#” character or the end of the URL. Finally, if a “#” character is encountered, this starts the “anchor” part (formally, the “fragment”), which is everything from after the “#” character on to the end of the URL.
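To see that breakdown applied to Randy's example, here is a short sketch using Python's standard URL parser. The parser's attribute names (scheme, netloc, fragment) are its own vocabulary for the parts described above.

```python
from urllib.parse import urlsplit

url = "http://nypost.com/#1'-alert(1)-'\"-alert(1)-\""
parts = urlsplit(url)

print(parts.scheme)    # the "protocol"
print(parts.netloc)    # the host (and port, if one is given)
print(parts.path)      # the path
print(parts.query)     # the query, without its leading "?"
print(parts.fragment)  # the "anchor", without its leading "#"
```

The whole attack string ends up in the last part, and the query is empty.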

The “anchor” has a couple of purposes, one by design, and one by evolution. The designed use is to tell the browser where to place the cursor – where to scroll to. I find this really handy if I want to draw someone’s attention to a particular place in an article, rather than have them read the whole story. [It can also be used to trigger an onfocus event handler in some browsers]

The second use is for communication between components on the page, or even on other pages loaded in frames.

The anchor tag is for the browser only

I want to emphasise this – and while Randy also mentioned it, I think many web site developers need to understand this when dealing with security.

The anchor tag is not sent to the server.

The anchor tag does not appear in your server’s logs.

WAFs cannot filter the anchor tag.

If your site is being attacked through abuse of the anchor tag, you not only can’t detect it ahead of time, you can’t do basic forensic work to find out useful things such as “when did the attack start”, “what sort of things was the attacker doing”, “how many attacks happened”, etc.

[Caveat: pedants will note that when browser code acts on the contents of the anchor tag, some of that action will go back to the server. That’s not the same as finding the bare URL in your log files.]
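A simplified sketch of why the server never sees it: when a client builds the request, only the path and query go on the request line. The function name and the example.com URL are hypothetical, and a real browser does a great deal more than this.

```python
from urllib.parse import urlsplit


def request_line(url: str) -> str:
    """Build the first line of the HTTP request a client would send for this URL."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    # parts.fragment is deliberately unused: it never leaves the browser.
    return f"GET {target} HTTP/1.1"


print(request_line("http://example.com/search?q=shoes#'-alert(1)-'"))
print(request_line("http://nypost.com/#1'-alert(1)-'\"-alert(1)-\""))
```

Whatever comes after the “#” is simply absent from the line that ends up in your logs, which is why neither the logs nor the WAF can help.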

If you have an XSS that can be triggered by code in an anchor tag, it is a “DOM-based XSS” flaw. This means that the exploit happens primarily (or only) in the user’s browser, and no filtering on the server side, or in the WAF (a traditional, but often unreliable, measure against XSS attacks), will protect you.

When trying out XSS attacks to find and fix them, you should try attacks in the anchor tag, in the query string, and in the path elements of the URL if at all possible, because they each will get parsed in different ways, and will demonstrate different bugs.

What does “-alert(1)-” even mean?

The construction Randy uses may seem a little odd:

"-alert(1)-"'-alert(1)-'

With some experience, you can look at this and note that it’s an attempt to inject JavaScript, not HTML, into a quoted string whose injection point doesn’t properly (or at all) escape quotes. The two different quote styles will escape from quoted strings inside double quotes and single quotes alike (I like to put the number ‘2’ in the alert that is escaped by the double quotes, so I know which quote is escaped).
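Here is a small sketch of what the page's script looks like once the payload has been dropped into it. The two templates are hypothetical stand-ins for vulnerable ad code that copies the anchor into a quoted string without escaping quotes, and I've used my habit of putting ‘2’ in the alert reached through the double quotes.

```python
payload = "1'-alert(1)-'\"-alert(2)-\""

# Hypothetical vulnerable templates: the page's script copies the anchor text
# into a quoted JavaScript string without escaping quotes.
single_quoted = "var tag = '{}';"
double_quoted = 'var tag = "{}";'

print(single_quoted.format(payload))
print(double_quoted.format(payload))
```

In the first result, the string ends after the ‘1’ and alert(1) sits outside any quotes; in the second, the string ends just before the second minus sign and alert(2) is the one left outside. Either way, one payload covers both quoting styles.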

But why use a minus sign?

Surely it’s invalid syntax?

While “string minus undefined” isn’t a meaningful operation (JavaScript quietly evaluates it to NaN rather than complaining), in order to discover the types of the two arguments to the “minus” operator, JavaScript actually has to evaluate them – and evaluating the argument is what runs the alert. This is a usual side-effect of a dynamic language – in order to determine whether an operation is valid, its arguments have to be evaluated. Compiled languages are usually able to identify specific types at compile time, and tell you when you have an invalid operand.

So, now that we know you can use any operator in there – minus, times, plus, divide, and, or, etc – why choose the minus? Here’s my reasoning: a plus sign in a URL is converted to a space. A divide (“/”) is often a path component and, like multiplication (“*”), is part of a comment sequence in JavaScript (“//” or “/*”). An “&” is often used to separate arguments in a query string. And a “|” for “or” is possibly going to trigger different flaws such as command injection, and so is best saved for later.
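A quick way to see the first and third of those reasons in action is to run candidate payloads through a standard query-string decoder. This sketch uses Python's decoder with a hypothetical parameter named q; server frameworks typically behave much the same way for query strings, though the anchor part is not decoded like this.

```python
from urllib.parse import parse_qs

# Hypothetical query strings carrying the same payload with different operators.
print(parse_qs("q='+alert(1)+'"))  # the plus signs arrive as spaces
print(parse_qs("q='&alert(1)&'"))  # the ampersands split the payload apart
print(parse_qs("q='-alert(1)-'"))  # the minus signs arrive intact
```

Only the minus version reaches the page in the shape it was sent.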

Also, the minus sign is an unshifted character and quick to type.

There are so many other ways to exploit this – finishing the alert with a line-ending comment (“//” or “<!--”), using “prompt” or “confirm” instead of “alert”, using JavaScript obfuscators, etc, but this is a really good easy injection point.

Another JavaScript syntax abuse is simply to drop “</script>” in the middle of the JavaScript block and then start a new script block, or even just regular HTML. Remember that the HTML parser only hands off to the JavaScript parser once it has found a block between “<script …>” and “</script …>” tags. It doesn’t matter if the closing tag is “within” a JavaScript string, because the HTML parser doesn’t know JavaScript.
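You can watch an HTML parser do exactly this. The sketch below uses Python's built-in HTML parser as a stand-in for the browser's (an assumption: browsers are far more elaborate, but they agree on this point), and the page content is a made-up example of attacker text landing inside a JavaScript string.

```python
from html.parser import HTMLParser


class ScriptSplitter(HTMLParser):
    """Record what the HTML parser treats as script text and as start tags."""

    def __init__(self):
        super().__init__()
        self.in_script = False
        self.script_text = ""
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, dict(attrs)))
        if tag == "script":
            self.in_script = True

    def handle_endtag(self, tag):
        if tag == "script":
            self.in_script = False

    def handle_data(self, data):
        if self.in_script:
            self.script_text += data


# Hypothetical page: attacker-controlled text landed inside a JavaScript string.
page = '<script>var q = "</script><img src=x onerror=alert(1)>";</script>'
parser = ScriptSplitter()
parser.feed(page)
parser.close()
print(repr(parser.script_text))
print(parser.start_tags)
```

The script block handed to the JavaScript engine stops at the opening quote of the string, and the img element, complete with its onerror handler, is parsed as ordinary HTML.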

There’s no single ad provider, and they’re almost all vulnerable

Part of the challenge in repeating these attacks, demonstrating them to others, etc, is that there’s no single ad provider, even on an individual web site.

Two visits to the same web site not only bring back different adverts, but they come through different pieces of code, injected in different ways.

If you don’t capture your successful attack, it may not be possible to reproduce it.

Similarly, if you don’t capture a malicious advert, it may not be possible to prove who provided it to you. I ran into this today with a “fake BSOD” malvert, which pretended to be describing a system error, and filled as much of my screen as it could with a large “alert” dialog, which kept returning immediately, whenever it was dismissed, and which invited me to call for “tech support” to fix my system. Sadly, I wasn’t tracing my every move, so I didn’t get a chance to discover how this ad was delivered, and could only rage at the company hosting the page.

This is one reason why I support ad-blockers

Clearly, ad providers need to improve their security. Until such time as they do so, a great protection is to use an ad-blocker. This may prevent you from seeing actual content at some sites, but you have to ask yourself if that content is worth the security risk of exposing yourself to adverts.

There is a valid argument to be made that ad blockers reduce the ability of content providers to make legitimate profit from their content.

But there is also a valid argument that ad blockers protect users from insecure adverts.

Defence – protect your customers from your ads

Finally, if you’re running a web site that makes its money from ads, you need to behave proactively to prevent your users from being targeted by rogue advertisers.

I’m sure you believe that you have a strong, trusting relationship with the ad providers you have running ads on your web site.

Don’t trust them. They are not a part of your dev team. They see your customers as livestock – product. Your goals are substantially different, and that means that you shouldn’t allow them to write code that runs in your web site’s security context.

What this means is that you should always embed those advertising providers inside an iframe of their own. If they give you code to run, and tell you it’s to create the iframe in which they’ll sit, put that code in an iframe you host on a domain outside your main domain. Because you don’t trust that code.

Why am I suggesting you do that? Because it’s the difference between allowing an advert attack to have limited control, and allowing it to have complete control, over your web site.

If I attack an ad in an iframe, I can modify the contents of the iframe, I can pop up a global alert, and I can send the user to a new page.

If I attack an ad – or its loading code – and it isn’t in an iframe, I can still do all that, but I can also modify the entire page, read secret cookies, insert my own cookies, interact with the user as if I am the site hosting the ad, etc.

If you won’t do it for your customers, at least defend your own page

[Screenshot: capture20160409090703064]

Here’s the front page of a major website with a short script running through an advert with a bug in it.

[I like the tag at the bottom left]

Insist on security clauses with all your ad providers

Add security clauses into your contracts, so that you can pull an ad provider as soon as a security vulnerability is reported to you, and so that the ad providers are aware that you have an interest in the security and integrity of your page and your users. Ask for information on how they enforce security, and how they expect you to securely include them in your page.

[I am not a lawyer, so please talk with someone who is!]

We didn’t even talk about malverts yet

Malverts – malicious advertising – is the term for an attacker getting an ad provider to deliver their attack code to your users, by signing up to provide an ad. Often this is done using apparent advertising copy related to current advertising campaigns, and can look incredibly legitimate. Sometimes, the attack code will be delayed, or region-specific, so an ad provider can’t easily notice it when they review the campaign for inclusion in your web page.

Got a virus you want to distribute? Why write distribution code and try to trick a few people into running it, when for a few dollars, you can get an ad provider to distribute it for you to several thousand people on a popular web site for people who have money?

Why am I so cross?

There are many reasons why Information Security hasn’t had as big an impact as it deserves. Some are external – lack of funding, lack of concern, poor management, distractions from valuable tasks, etc, etc.

But the ones we inflict on ourselves are probably the most irritating. They make me really cross.

Why cross?

OK, “cross” is an English term for “angry”, or “irate”, but as with many other English words, it’s got a few other meanings as well.

It can mean to wrong someone, or go against them – “I can’t believe you crossed Fingers MacGee”.

It can mean to make the sign of a cross – “Did you just cross your fingers?”

It can mean a pair of items, intersecting one another – “I’m drinking at the sign of the Skull and Cross-bones”.

It can mean to breed two different subspecies into a third – “What do you get if you cross a mountaineer with a mosquito? Nothing, you can’t cross a scaler and a vector.”

Or it can mean to traverse something – “I don’t care what Darth Vader says, I always cross the road here”.

[Image: Green_cross_man_take_it]

It’s this last sense that InfoSec people seem obsessed about, to the extent that every other attack seems to require it as its first word.

Such a cross-patch

These are just a list of the attacks at OWASP that begin with the word “Cross”.

Yesterday I had a meeting to discuss how to address three bugs found in a scan, and I swear I spent more than half the meeting trying to ensure that the PM and the Developer in the room were both discussing the same bug. [And here, I paraphrase]

“How long will it take you to fix the Cross-Frame Scripting bug?”

“We just told you, it’s going to take a couple of days.”

“No, that was for the Cross-Site Scripting bug. I’m talking about the Cross-Frame Scripting issue.”

“Oh, that should only take a couple of days, because all we need to do is encode the contents of the field.”

“No, again, that’s the Cross-Site Scripting bug. We already discussed that.”

“I wish you’d make it clear what you’re talking about.”

Yeah, me too.

A modest proposal

The whole point of the word “Cross” as used in the descriptions of these bugs is to indicate that someone is doing something they shouldn’t – and in that respect, it’s pretty much a completely irrelevant word, because we’re already discussing attack types.

In many of these cases, the words “Cross-Site” bring absolutely nothing to the discussion, and just make things confusing. Am I crossing a site from one page to another, or am I saying this attack occurs between sites? What if there’s no other site involved, is that still a cross-site scripting attack? [Yes, but that’s an irrelevant question, and by asking it, or thinking about asking/answering it, you’ve reduced your mental processing abilities to handle the actual issue.]

Check yourself when you utter “cross” as the first word in the description of an attack, and ask if you’re communicating something of use, or just “sounding like a proper InfoSec tool”. Consider whether there’s a better term to use.

I’ve previously argued that “Cross-Site Scripting” is really a poor term for the conflation of HTML Injection and JavaScript Injection.

Cross-Frame Scripting is really Click-Jacking (and yes, that doesn’t exclude clickjacking activities done by a keyboard or other non-mouse source).

Cross-Site Request Forgery is more of a Forced Action – an attacker can guess what URL would cause an action without further user input, and can cause a user to visit that URL in a hidden manner.

Cross-Site History Manipulation is more of a browser failure to protect SOP – I’m not an expert in that field, so I’ll leave it to them to figure out a non-confusing name.

Cross-Site Tracing is just getting silly – it’s Cross-Site Scripting (excuse me, HTML Injection) using the TRACE verb instead of the GET verb. If you allow TRACE, you’ve got bigger problems than XSS.

Cross-User Defacement crosses all the way into crosstalk, requiring as it does that two users be sharing the same TCP connection with no adequate delineation between them. This isn’t really common enough to need a name that gets capitalised. It’s HTTP Response-Splitting over a shared proxy with shitty user segregation.

Even more modestly…

I don’t remotely anticipate that I’ll change the names people give to these vulnerabilities in scanning tools or in pen-test reports.

But I do hope you’ll be able to use these to stop confusion in its tracks, as I did:

“Never mind cross-whatever, let’s talk about how long it’s going to take you to address the clickjacking issue.”

In Summary

Here’s the TL;DR version of the web post:

Prevent or interrupt confusion by referring to bugs using the following non-confusing terms:

Confusing                       | Not Confusing Much, Probably
Cross-Frame Scripting           | Clickjacking
Cross-Site History Manipulation | [Not common enough to name]
Cross-Site Tracing              | TRACE is enabled
Cross-Site Request Forgery      | Forced User Action
Cross-Site Scripting            | HTML Injection / JavaScript Injection
Cross-User Defacement           | Crappy proxy server

Fear the browsing dead!

Browsing Dead

Ding dong, the plugin’s dead!

There’s been a lot of celebration lately from the security community about the impending death of Adobe’s Flash, or Oracle’s Java plugin technology.

You can understand this, because for years these plugins have been responsible for vulnerability on top of vulnerability. Their combination of web-facing access and native code execution means that you have maximum exposure and maximum risk concentrated in one place on the machine.

Browser manufacturers have recognised this risk in their own code, and have made great strides in improving security. Plus, you can always switch browsers if you feel one is more secure than another.

Attackers can rely on Flash and Java.

An attacker can pretty much assume that their target is running Flash from Adobe, and Java from Oracle. [Microsoft used to have a competing Java implementation, but Oracle sued it out of existence.]

Bugs in those implementations are widely published, and not widely patched, whether or not patches are available.

Users don’t upgrade applications (plugins included) as often or as willingly as they update their operating system. So, while your browser may be updated with the operating system, or automatically self-update, it’s likely most users are running a version of Java and/or Flash that’s several versions behind.

Applications never die, they just lose their support

As you can imagine, the declaration by Oracle that Java plugin support will be removed is a step forward in recognising the changing landscape of browser security, but it’s not an indication that this is an area in which security professionals can relax.

Just the opposite.

With the deprecation of plugin support comes the following:

  • Known bugs – without fixes. Ever.
  • No availability of tools to manage old versions.
  • No tools to protect vulnerable plugins.
  • Users desperately finding more baroque (and unsecurable) ways to keep their older setups together to continue to use applications which should have been replaced, but never were.

It’s not like Oracle are going to reach into every machine and uninstall / turn off plugin support. Even if they had the technical means to do so, such an act would be completely inappropriate.

There will be zombies

So, what we’re left with, whenever a company deprecates a product, application or framework, is a group of machines – zombies, if you will – that are operated by people who do not heed the call to cull, and which are going to remain active and vulnerable until such time as someone renders those walking-dead components finally lifeless.

If you’re managing an enterprise from a security perspective, you should follow up every deprecation announcement with a project to decide the impact and schedule the actual death and dismemberment of the component being killed off.

Then you can celebrate!

Assuming, of course, that you followed through successfully on your plan.

Until then, watch out for the zombies.

The Browsing Dead.

Artisan or Labourer?

Back when I started developing code, and that was a fairly long time ago, the vast majority of developers I interacted with had taken that job because they were excited to be working with technology, and enjoyed instructing and controlling computers to an extent that was perhaps verging on the creepy.

Much of what I read about application security strongly reflects this even today, where developers are exhorted to remember that security is an aspect of the overall quality of your work as a developer.

This is great – for those developers who care about the quality of their work. The artisans, if you like.

But who else is there?

For every artisan I meet when talking to developers, there’s about another two or three who are more like labourers.

They turn up on time, they do their daily grind, and they leave on time. Even if the time expected / demanded of them is longer than the usual eight hours a day.

By itself, this isn’t a bad thing. When you need another pair of “OK” and “Cancel” buttons, you want someone to hammer them out, not hand-craft them in bronze. When you need an API to a back-end database, you want it thin and functional, not baroque and beautiful.

Many – perhaps most – of your developers are there to do a job for pay, not because they love code.

And that’s what you interviewed them for, hired them for, and promoted them for.

It’s important to note that these guys mostly do what they are told. They are clever, and can be told to do complex things, but they are not single-mindedly interested in the software they are building, except in as much as you will reward them for delivering it.

What do you tell these guys?

If these developers will build only the software they’re told to build, what are you telling them to build?

At any stage, are you actively telling your developers that they have to adhere to security policies, or that they have to build in any kind of “security best practice”, or even to “think like an attacker”? Much as I hate that last phrase – I’d rather you tell them to “think about all the ways every part of your code can fail, and work to prevent them” [“think like a defender”].

Some of your developers will interject their own ideas of quality.

– But –

Most of your developers will only do as they have been instructed, and as their examples tell them.

How does this affect AppSec?

The first thing to note is that you won’t reach these developers just with optional training, and you might not even reach them just with mandatory training. They will turn up to mandatory training, because it is required of them, and they may turn up to optional training because they get a day’s pay for it. But all the appeals to them to take on board the information you’re giving them will fall upon deaf ears, if they return to their desks and don’t get follow-up from their managers.

Training requires management support, management enforcement, and management follow-through.

When your AppSec program makes training happen, your developers’ managers must make it clear to their developers that they are expected to take part, they are expected to learn something, and they are expected to come back and start using and demonstrating what they have learned.

Curiously enough, that’s also helpful for the artisans.

Second, don’t despair about these developers. They are useful and necessary, and as with all binary distinctions, the lines are not black and white, they are a spectrum of colours. There are developers at all stages between the “I turn up at 10, I work until 6 (as far as you know), and I do exactly what I’m told” end and the “I love this software as if it were my own child, and I want to mould it into a perfect shining paragon of perfection” end.

Don’t despair, but be realistic about who you have hired, and who you will hire as a result of your interview techniques.

Work with the developers you have, not the ones you wish you had.

Third, if you want more artisans and fewer labourers, the only way to do that is to change your hiring and promotion techniques.

Screen for quality-biased developers during the interview process. Ask them “what’s wrong with the code”, and reward them for saying “it’s not very easy to understand, the comments are awful, it uses too many complex constructs for the job it’s doing, etc”.

Reward quality where you find it. “We had feedback from one of the other developers on the team that you spent longer on this project than expected, but produced code that works flawlessly and is easy to maintain – you exceed expectations.”

Security is a subset of quality – encourage quality, and you encourage security.

Labourers as opposed to artisans have no internal “quality itch” to scratch, which means quality bars must be externally imposed, measured, and enforced.

What are you doing to reward developers for securing their development?

SQL injection in unexpected places

Every so often, I write about some real-world problems in this blog, rather than just getting excited about generalities. This is one of those times.

1. In which I am an idiot who thinks he is clever

I had a list of users the other day, exported from a partner with whom we do SSO, and which somehow had some duplicate entries in.

These were not duplicate in the sense of “exactly the same data in every field”, but differed by email address, and sometimes last name. Those of you who manage identity databases will know exactly what I’m dealing with here – people change their last name, through marriage, divorce, adoption, gender reassignment, whim or other reason, and instead of editing the existing entry, a new entry is somehow populated to the list of identities.

What hadn’t changed was that each of these individuals still held their old email address in Active Directory, so all I had to do was look up each email address, relate it to a particular user, and then pull out the canonical email address for that user. [In this case, that’s the first email address returned from AD]

A quick search on the interwebs gave me this as a suggested VBA function to do just that:

Function GetEmail(email as String) as String
' Given one of this user's email addresses, find the canonical one.

    ' Find our default domain base to search from
    Set objRootDSE = GetObject("LDAP://RootDSE")
    strBase = "'LDAP://" & objRootDSE.Get("defaultNamingContext") & "'"

    ' Open a connection to AD
    Set ADOConnection = CreateObject("ADODB.Connection")
    ADOConnection.Provider = "ADsDSOObject"
    ADOConnection.Open "Active Directory Provider"

    ' Create a command
    Set ADCommand = CreateObject("ADODB.Command")
    ADCommand.ActiveConnection = ADOConnection

    ' Find user based on their email address
    ADCommand.CommandText = _
        "SELECT distinguishedName,userPrincipalName,mail FROM " & _
        strBase & " WHERE objectCategory='user' and mail='" & email & "'"

    ' Execute this command
    Set ADRecordSet = ADCommand.Execute

    ' Extract the canonical email address for this user.
    GetEmail = ADRecordSet.Fields("Mail")

    ' Return.
End Function

That did the trick, and I stopped thinking about it. Printed out the source just to demonstrate to a couple of people that this is not rocket surgery.

2. In which I realise I am idiot

Yesterday the printout caught my eye. Here’s the particular line that made me stop:

    ADCommand.CommandText = _
        "SELECT distinguishedName,userPrincipalName,mail FROM " & _
        strBase & " WHERE objectCategory='user' AND mail='" & email & "'"

That looks like a SQL query, doesn’t it?

Probably because it is.

It’s one of two formats that can be used to query Active Directory, the other being the less-readable LDAP syntax.

Both formats have the same problem – when you build the query using string concatenation like this, it’s possible for the input to give you an injection by escaping from the data and into the code.

I checked this out – when I called this function as follows, I got the first email address in the list as a response:

    Debug.Print GetEmail("x' OR mail='*")

You can see my previous SQL injection articles to come up with ideas of other things I can do now that I’ve got the ability to inject.
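To make it obvious why that input escapes from data into code, here is a minimal JavaScript sketch of the same concatenation. The function name buildQuery and the example naming context are my own stand-ins, not part of ADSI or ADODB; the point is only to show the string that ends up being executed.

```javascript
// Hypothetical stand-in for the VBA string concatenation above.
function buildQuery(base, email) {
  return "SELECT distinguishedName,userPrincipalName,mail FROM " + base +
    " WHERE objectCategory='user' AND mail='" + email + "'";
}

// Made-up naming context, purely for illustration.
const base = "'LDAP://DC=example,DC=com'";

const benign = buildQuery(base, "someone@example.com");
const injected = buildQuery(base, "x' OR mail='*");

console.log(benign);
console.log(injected);
// The injected query now ends with: AND mail='x' OR mail='*'
// The attacker's quote closed the string early, and "OR mail='*'" became code.
```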

3. In which I try to be clever again

Normally, I’d suggest developers use Parameterised Queries to solve this problem – and that’s always the best idea, because it not only improves security, but it actually makes the query faster on subsequent runs, because it’s already optimised. Here’s how that ought to look:

    ADCommand.CommandText = _
        "SELECT distinguishedName,userPrincipalName,mail FROM " & _
        strBase & " WHERE objectCategory='user' AND mail=?"

    'Create and bind parameter
    Set ADParam = ADCommand.CreateParameter("", adVarChar, adParamInput, 40, email)
    ADCommand.Parameters.Append ADParam

That way, the question mark “?” gets replaced with “’youremail@example.com’” (including the single quote marks) and my injection attempt gets quoted in magical ways (usually, doubling single-quotes, but the parameter insertion is capable of knowing in what way it’s being inserted, and how exactly to quote the data).

4. In which I realise other people are idiot


That’s the rather meaningful message:

Run-time error ‘-2147467262 (80004002)’:

No such interface supported

It doesn’t actually tell me which interface isn’t supported, so of course I spend a half hour trying to figure out what changed that might have gone wrong – whether I’m using a question mark where perhaps I might need a named variable, possibly preceded by an “@” sign, but no, that’s SQL stored procedures, which are almost never the SQL injection solution they claim to be, largely because the same idiot who uses concatenation in his web service also does the same stupid trick in his SQL stored procedures, but I’m rambling now and getting far away from the point if I ever had one, so…

The interface that isn’t supported is the ability to set parameters.

The single best solution to SQL injection just plain isn’t provided in the ADODB library and/or the ADsDSOObject provider.

Why on earth would you miss that out, Microsoft?

5. I get clever

So, the smart answer here is input validation where possible, and if you absolutely have to accept any and all input, you must quote the strings that you’re passing in.

In my case, because I’m dealing with email addresses, I think I can reasonably restrict my input to alphanumerics, the “@” sign, full stops, hyphens and underscores.
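As a rough illustration of that allow-list, here is a small JavaScript sketch. The function name isAllowedEmailInput is my own invention, and the check is deliberately crude: it only says "nothing outside the permitted characters", not "this is a valid email address".

```javascript
// Allow-list check: alphanumerics, "@", full stop, hyphen and underscore only.
// The hyphen sits last in the character class so it is taken literally.
function isAllowedEmailInput(s) {
  return /^[A-Za-z0-9_@.-]+$/.test(s);
}

console.log(isAllowedEmailInput("first.last-name@example.com")); // acceptable characters
console.log(isAllowedEmailInput("x' OR mail='*"));               // quotes, spaces, "=" and "*" rejected
```

Note that an empty string is rejected too, because the pattern asks for at least one permitted character.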

Input validation depends greatly on the type of your input. If it’s a string, that will need to be provided in your SQL request surrounded with single quotes – that means that any single quote in the string will need to be encoded safely. Usually that means doubling the quote mark, although you might choose to replace them with double quotes or back ticks.
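The quote-doubling convention looks something like this in JavaScript. This is a sketch of the general SQL habit only: quoteSqlString is a name I made up, and whether the ADsDSOObject provider treats a doubled quote this way is an assumption you would need to test before relying on it.

```javascript
// Wrap a value in single quotes, doubling any single quote inside it.
// Generic SQL convention; not verified against any particular provider.
function quoteSqlString(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

console.log(quoteSqlString("O'Brien"));        // 'O''Brien'
console.log(quoteSqlString("x' OR mail='*"));  // 'x'' OR mail=''*'
```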

If your input is a number, you can be more restrictive in your input validation – only those characters that are actually parts of a number. That’s not necessarily as easy as it sounds – the letter “e” is often part of numbers, for instance, and you have to decide whether you’re going to accept bases other than 10. But from the perspective of securing against SQL injection, again that’s not too difficult to enforce.
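Here is one way that decision might look, as a JavaScript sketch that accepts only base-10 integers and so turns away the “e” notation and other bases mentioned above. The name parseStrictInteger is hypothetical, and your own rules for what counts as a number may well be looser.

```javascript
// Accept an optional minus sign followed by decimal digits, nothing else.
// Returns null for anything that does not match, rather than guessing.
function parseStrictInteger(s) {
  if (!/^-?[0-9]+$/.test(s)) {
    return null;
  }
  // Very long digit strings will lose precision as a JavaScript Number.
  return Number(s);
}

console.log(parseStrictInteger("42"));    // 42
console.log(parseStrictInteger("1e5"));   // null
console.log(parseStrictInteger("0x1F"));  // null
```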

Finally, of course, you have to decide what to do when bad input comes in – an error response, a static value, throw an exception, ignore the input and refuse to respond, etc. If you choose to signal an error back to the user, be careful not to provide information an attacker could find useful.

What’s useful to an attacker?

Sometimes the mere presence of an error is useful.

Certainly if you feed back to the attacker the full detail of the SQL query that went wrong – and people do sometimes do this! – you give the attacker far too much information.

Even feeding back the incorrect input can be a bad thing in many cases. In the Excel case I’m running into, that’s probably not easily exploitable, but you probably should be cautious anyway – if it’s an attacker causing an error, they may want you to echo back their input to exploit something else.
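A minimal sketch of that caution, in JavaScript: keep the detail for yourself and hand back a fixed message. The function rejectInput and the array used as a log are stand-ins of my own; a real log would need its own protection against hostile content.

```javascript
// Record the offending input privately, and return a message that
// reveals neither the input nor the query that was being built.
function rejectInput(rawInput, internalLog) {
  internalLog.push({ when: new Date().toISOString(), input: rawInput });
  return "Invalid request";
}

const log = [];
const reply = rejectInput("x' OR mail='*", log);

console.log(reply);      // the caller sees only the fixed message
console.log(log.length); // the detail is kept on our side
```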

Call to Microsoft

Seriously, Microsoft, this is an unforgivable lapse – not only is there no ability to provide the single best protection, because you didn’t implement the parameter interface, but also your own samples provide examples of code that is vulnerable to SQL injections. [Here and here – the other examples I was able to find use hard-coded search filters.]

Microsoft, update your samples to demonstrate how to securely query AD through the ADODB library, and consider whether it’s possible to extend the provider with the parameter interface so that we can use the gold-standard protection.

Call to developers

Parse your parameters – make sure they conform to expected values. Complain to the user when they don’t. Don’t use lack of samples as a reason not to deliver secure components.

Finally – how I did it right

And, because I know a few of you will hope to copy directly from my code, here’s how I wound up doing this exact function.

Please, by all means review it for mistakes – I don’t guarantee that this is correct, just that it’s better than I found originally. For instance, one thing it doesn’t check for is if the user actually has a value set for the “mail” field in Active Directory – I can tell you for certain, it’ll give a null reference error if you have one of these users come back from your search.

Function GetEmail(email As String) As String
' Given one of this user's email addresses, find the canonical one.

    ' Pre-execution input validation - email must contain only recognised characters.
    ' The trailing hyphen in the character list matches a literal hyphen.
    If email Like "*[!a-zA-Z0-9_@.-]*" Then
        GetEmail = "Illegal characters"
        Exit Function
    End If

    ' Find our default domain base to search from
    Set objRootDSE = GetObject("LDAP://RootDSE")
    strBase = "'LDAP://" & objRootDSE.Get("defaultNamingContext") & "'"

    ' Open a connection to AD
    Set ADOConnection = CreateObject("ADODB.Connection")
    ADOConnection.Provider = "ADsDSOObject"
    ADOConnection.Open "Active Directory Provider"

    ' Create a command
    Set ADCommand = CreateObject("ADODB.Command")
    ADCommand.ActiveConnection = ADOConnection

    ' Find user based on their email address
    ADCommand.CommandText = _
        "SELECT distinguishedName,userPrincipalName,mail FROM " & _
        strBase & " WHERE objectCategory='user' AND mail='" & email & "'"

    ' Execute this command
    Set ADrecordset = ADCommand.Execute

    ' Post execution validation - we should have exactly one answer.
    ' VBA evaluates both sides of Or, so test for Nothing on its own first.
    If ADrecordset Is Nothing Then
        GetEmail = "Not found"
        Exit Function
    ElseIf ADrecordset.EOF And ADrecordset.BOF Then
        GetEmail = "Not found"
        Exit Function
    End If
    If ADrecordset.RecordCount > 1 Then
        GetEmail = "Many matches"
        Exit Function
    End If

    ' Extract the canonical email address for this user.
    GetEmail = ADrecordset.Fields("Mail")

    ' Return.
End Function

As always, let me know if you find this at all useful.

The Manager in the Middle Attack

The first problem any security project has is to get executive support. The second problem is to find a way to make use of and direct that executive support.


So, that was the original tweet that seems to have been a little popular (not fantastically popular, but then I only have a handful of followers).

I’m sure a lot of people thought it was just an amusing pun, but it’s actually a realisation on my part that there’s a real thing that needs naming here.

Executives support security

By and large, the companies I’ve worked for and/or with in the last few years have experienced a glacial but certain shift in perspective.

Where once the security team seemed to be perceived as a necessary nuisance to the executive layers, it seems clear now that there have been sufficient occurrences of bad news (and CEOs being forced to resign) that executives come TO the security team for reassurance that they won’t become the next … well, the next whatever the last big incident was.

TalkTalk had three security incidents in the last year

Obviously, those executives still have purse strings to manage, and most security professionals like to get paid, because that’s largely what distinguishes them from security amateurs. So security can’t get ALL the dollars, but it’s generally easier to get the money and the firepower for security than it ever was in the past.

So executives support security. Some of them even ask what more they can do – and they seem sincere.

Developers support security

Well, some of them do, but that’s a topic for another post.

There are sufficient numbers of developers who care about quality and security these days, that there’s less of a need to be pushing the security message to developers quite how we used to.

We’ve mostly reached those developers who are already on our side.

How developers communicate

And those developers can mentor other developers who aren’t so keen on security.

The security-motivated developers want to learn more from us, they’re aware that security is an issue, and for the most part, they’re capable of finding and even distinguishing good security solutions to use.

Why is security still so crap, then?

Pentester cat wins.

If the guys at the top, and the guys at the bottom (sorry devs, but the way the org structure goes, you don’t manage anyone, so ipso facto you are at the bottom, along with the cleaners, the lawyers, and the guy who makes sure the building doesn’t get stolen in the middle of the night) care about security, why are we still seeing sites get attacked successfully? Why are apps still being successfully exploited?

Why is it that I can exploit a web site with SQL injection, an attack that has been around for as long as many of the developers at your company have been aware of computers?

Someone is getting in the way.

So, who doesn’t support security?

Ask anyone in your organisation if they think security is important, and you’ll get varying answers, most of which acknowledge that security matters in the software being developed, so it’s clear that you can’t actually poll people that way for the right answer.

Ask who’s in the way, instead…

Often it’s the security team – because it’s really hard to fill out a security team, and to stretch out around the organisation.

But that’s not the whole answer.

Ask the security-conscious developers what’s preventing them from becoming a security expert to their team, and they’ll make it clear – they’re rewarded and/or punished at their annual review times by the code they produce that delivers features.

There is no reward for security

And because managers are driving behaviour through performance reviews, it actually doesn’t matter what the manager tells their underlings, even if they give their devs a weekly spiel about how important security is. Even if you have an executive show up at their meetings and tell them security is “Job #1”. Even if he means it.

Those developers will return to their desks, and they’ll look at the goals against which they’ll be reviewed come performance review time.

The Manager in the Middle Attack

If managers don’t specifically reward good security behaviour, most developers will not produce secure code.

 

This is the Manager in the Middle Attack. Note that it applies in the event that no manager is present (thanks, Dan Kaminsky!)

Managers have to manage

Because I never like to point out a problem without proposing a solution:

Managers have to actively manage their developers into changing their behaviours. Some performance goals will help, along with the support (financial and moral) to make them achievable.

Here are a few sample goals:

  • Implement a security bug-scanning solution in your build/deploy process
  • Track the creation / destruction of bugs just like you track your feature burn-down rate.
    • It’ll be a burn-up rate to begin with, but you can’t incentivise a goal you can’t track
  • Prevent the insertion of new security bugs.
    • No, don’t just turn the graph from “trending up” to “trending less up” – actually ban checkins that add vulnerabilities detected by your scanning tools.
  • Reduce the number of security bugs in your existing code
    • Prioritise which ones to work on first.
      • Use OWASP, or whatever “top N list” floats your boat – until you’ve exhausted those.
      • Read the news, and see which flaws have been the cause of significant problems.
      • My list: Code injection (because if an attacker can run code on my site, it’s not my site); SQL Injection / data access flaws (because the attacker can steal, delete, or modify my data); other injection (including XSS, because it’s a sign you’re a freaking amateur web developer)
    • Don’t be afraid to game the system – if you can modularise your database access and remove all SQL injection bugs with a small change, do so. And call it as big of a win as it truly is!
  • Refactor to make bugs less likely
    • Find a widespread potentially unsecure behaviour and make it into a class or function, so it’s only unsecure in one place.
    • Then secure that place. And comment the heck out of why it’s done that way.
    • Ban checkins that use the old way of doing things.
    • Delete old, unused code, if you didn’t already do that earlier.
  • Share knowledge of security improvements
    • With your colleagues on your dev team
    • With other developers across the enterprise
    • Outside of the company
    • Become a sought-after expert (inside or outside the team or organisation) on a security practice – from a dev perspective
    • Mentor another more junior developer who wants to become a security hot-shot like yourself.

That’s quite a bunch of security-related goals for developers, which managers can implement. All of them can be measured, and I’m not so crass as to suggest that I know which numbers will be appropriate to your appetite for risk, or the size of hole out of which you have to dig yourself.
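As one illustration of how the “ban checkins that add vulnerabilities” goal could be measured, here is a small JavaScript sketch. It assumes your scanning tool can give you a list of finding identifiers for the existing code and for the proposed checkin; the identifier format and the function names are invented for the example.

```javascript
// Return the findings present in the proposed checkin but not in the baseline.
function newFindings(baseline, current) {
  const known = new Set(baseline);
  return current.filter((id) => !known.has(id));
}

// A checkin passes only if it introduces no findings the baseline lacked.
function checkinAllowed(baseline, current) {
  return newFindings(baseline, current).length === 0;
}

// Made-up finding identifiers, for illustration only.
const baseline = ["sqli:orders.js:42", "xss:search.js:17"];
const fixedOne = ["xss:search.js:17"];
const addedOne = ["sqli:orders.js:42", "xss:search.js:17", "xss:profile.js:9"];

console.log(checkinAllowed(baseline, fixedOne)); // fewer bugs than before
console.log(checkinAllowed(baseline, addedOne)); // introduces a new one
```

The baseline list doubles as the burn-down count: its length is the number you are trying to drive towards zero.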

NCSAM post 1: That time again?

Every year, in October, we celebrate National Cyber Security Awareness Month.

Normally, I’m dismissive of anything with the word “Cyber” in it. This is no exception – the adjective “cyber” is a manufactured word, without root, without meaning, and with only a tenuous association to the world it endeavours to describe.

But that’s not the point.

In October, I teach my blog readers about security

And I do it from a very basic level.

This is not the place for me to assume you’ve all been reading and understanding security for years – this is where I appeal to readers with only a vague understanding that there’s a “security” thing out there that needs addressing.

Information Security as a shared responsibility

This first week is all about Information Security – Cyber Security, as the government and military put it – as our shared responsibility.

I’m a security professional, in a security team, and my first responsibility is to remind the thousands of other employees that I can’t secure the company, our customers, our managers, and our continued joint success, without everyone pitching in just a little bit.

I’m also a customer, with private data of my own, and I have a responsibility to take reasonable measures to protect that data, and by extension, my identity and its association with me. But I also need others to take up their responsibility in protecting me.

When we fail in that responsibility…

This year, I’ve had my various identifying factors – name, address, phone number, Social Security Number (if you’re not from the US, that’s a government identity number that’s rather inappropriately used as proof of identity in too many parts of life) – misappropriated by others, and used in an attempt to buy a car, and to file taxes in my name. So, I’ve filed reports of identity theft with a number of agencies and organisations.

I have spent DAYS of time working on preventing further abuse of my identity, and that of my family

Just today, another breach report arrives, from a company I do business with, letting me know that more data has been lost – this time from one of the organisations charged with actually protecting my identity and protecting my credit.

And it’s not just the companies that are at fault

While companies can – and should – do much more to protect customers (and putative customers), and their data, it’s also incumbent on the customers to protect themselves.

Every day, thousands of new credit and debit cards get issued to eager recipients, many of them teenagers and young adults.

Excited as they are, many of these youths share pictures of their new cards on Twitter or Facebook. Occasionally with both sides. There’s really not much your bank can do if you’re going to react in such a thoughtless way, with a casual disregard for the safety of your data.

Sure, you’re only liable for the first $50 of any use of your credit card, and perhaps of your debit card, but it’s actually much better to not have to trace down unwanted charges and dispute them in the first place.

So, I’m going to buy into the first message of National Cyber Security Awareness Month – and I’m going to suggest you do the same:

Stop. Think. Connect.

This is really the base part of all security – before doing a thing, stop a moment. Think about whether it’s a good thing to do, or has negative consequences you hadn’t considered. Connect with other people to find out what they think.

I’ll finish tonight with some examples where stopping a moment to think, and connecting with others to pool knowledge, will improve your safety and security online. More tomorrow.

Example: passwords

The most common password is “12345678”, or “password”. This means that many people are using that simple a password. Many more people are using more secure passwords, but they still make mistakes that could be prevented with a little thought.

Passwords leak – either from their owners, or from the systems that use those passwords to recognise the owners.

When they do, those passwords – and data associated with them – can then be used to log on to other sites those same owners have visited. Either because their passwords are the same, or because they are easily predicted. If my password at Adobe is “This is my Adobe password”, well, that’s strong(ish), but it also gives a hint as to what my Amazon password is – and when you crack the Adobe password leak (that’s already available), you might be able to log on to my Amazon account.

Creating unique passwords – and yes, writing them down (or better still, storing them in a password manager), and keeping them safe – allows you to ensure that leaks of your passwords don’t spread to your other accounts.

Example: Twitter and Facebook

There are exciting events which happen to us every day, and which we want to share with others.

That’s great, and it’s what Twitter and Facebook are there FOR. All kinds of social media available for you to share information with your friends.

Unfortunately, it’s also where a whole lot of bad people hang out – and some of those bad people are, unfortunately, your friends and family.

Be careful what you share, and if you’re sharing about others, get their permission too.

If you’re sharing about children, contemplate that there are predators out there looking for the information you may be giving out. There’s one living just up the road, I can assure you. They’re almost certainly safely withdrawn, and you’re protected from them by natural barriers and instincts. But you have none of those instincts on Facebook unless you stop, think and connect.

So don’t post addresses, locations, your child’s phone number, and really limit things like names of children, friends, pets, teachers, etc – imagine that someone will use that as ‘proof’ to your child of their safety. “It’s OK, I was sent by Aunt Josie, who’s waiting for you to come and see Dobbie the cat”

Example: shared accounts

Bob’s going off on vacation for a month.

Lucky Bob.

Just in case, while he’s gone, he’s left you his password, so that you can log on and access various files.

Two months later, and the office gets raided by the police. They’ve traced a child porn network to your company. To Bob.

Well, actually, to Bob and to you, because the system can’t tell the difference between Bob and you.

Don’t share accounts. Make Bob learn (with the IT department’s help) how to share portions of his networked files appropriately. It’s really not all that hard.

Example: software development

I develop software. The first thing I write is always a basic proof of concept.

The second thing I write – well, who’s got time for a second thing?

Make notes in comments every time you skip a security decision, and make those notes in such a way that you can revisit them and address them – or at least, count them – prior to release, so that you know how badly you’re in the mess.
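Counting those notes can be very simple if you pick a consistent marker. The sketch below assumes a marker of “SECURITY-TODO”, which is just a tag I have chosen for the example, and counts the lines of a source file's text that carry it.

```javascript
// Hypothetical marker; use whatever tag your team agrees on.
const MARKER = "SECURITY-TODO";

// Count the lines in a block of source text that contain the marker.
function countSecurityNotes(sourceText) {
  return sourceText.split("\n").filter((line) => line.includes(MARKER)).length;
}

const sample = [
  "// SECURITY-TODO: validate the email parameter before querying",
  "runQuery(email);",
  "// SECURITY-TODO: stop echoing raw input in the error path",
].join("\n");

console.log(countSecurityNotes(sample)); // prints 2
```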

Ways you haven’t stopped my XSS, Number 3 – helped by the browser / website

Apologies for not having written one of these in a while, but I find that one of the challenges here is to not release details about vulnerable sites while they’re still vulnerable – and it can take oh, so long for web developers to get around to fixing these vulnerabilities.

And when they do, often there’s more work to be done, as the fixes are incomplete, incorrect, or occasionally worse than the original problem.

Sometimes, though, the time goes so slowly, and the world moves on in such a way that you realise nobody’s looking for the vulnerable site, so publishing details of its flaws without publishing details of its identity, should be completely safe.

Helped by the website

So, what sort of attack is actively aided by the website?

Overly-verbose error messages

My favourite “helped by the website” issues are error messages which will politely inform you how your attack failed, and occasionally what you can do to fix it.

Here’s an SQL example:

image

OK, so now I know I have a SQL statement that contains the sequence “' order by type asc, sequence desc” – that tells me quite a lot. There are two fields called “type” and “sequence”. And my single injected quote was enough to demonstrate the presence of SQL injection.

What about XSS help?

There are a few web sites out there that will help you by telling you which characters they can’t handle in their search fields:

image

Now, the question isn’t “what characters can I use to attack the site?”, but “how do I get those characters into the site?” [Usually it’s as simple as typing them into the URL instead of using the text box, sometimes it’s simply a matter of encoding]

Over-abundance of encoding / decoding

On the subject of encoding and decoding, I generally advise developers that they should document interface contracts between modules in their code, describing what the data is, what format it’s in, and what isomorphic mapping they have used to encode the data so that it is not possible to confuse it with its surrounding delimiters or code, and so that it’s possible to get the original string back.

An isomorphism, or 1:1 (“one to one”) mapping, in data encoding terms, is a means of making sure that each output can only correspond to one possible input, and vice versa.
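Percent-encoding is a handy example of such a mapping. The JavaScript sketch below uses the standard encodeURIComponent and decodeURIComponent functions to show that each sample string comes back unchanged after exactly one decode, and that a literal “%3C” in the data stays distinct from an encoded “<”. The sample strings are my own.

```javascript
const samples = ["a&b=c", "50% off", "<script>", "%3C"];

for (const original of samples) {
  const encoded = encodeURIComponent(original);
  const decoded = decodeURIComponent(encoded); // decode exactly once
  console.log(JSON.stringify(original), "->", encoded, "->", JSON.stringify(decoded));
}

// "<" and the literal text "%3C" remain distinguishable after encoding.
const encodedBracket = encodeURIComponent("<");   // "%3C"
const encodedLiteral = encodeURIComponent("%3C"); // "%253C"
console.log(encodedBracket !== encodedLiteral);
```

The contract between modules then only has to say “this field is percent-encoded once”, and the receiver knows to decode once and stop.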

Without these contracts, you find that developers are aware that data sometimes arrives in an encoded fashion, and they will do whatever it takes to decode that data. Data arrives encoded? Decode it. Data arrives doubly-encoded? Decode it again. Heck, take the easy way out, as this section of code did:

var input, output;
parms = document.location.search.substr(1).split("&");
input = parms[1];
while (input != output) {
    output = input;
    input = unescape(output);
}

[That’s from memory, so I apologise if it’s a little incorrect in many, many other ways as well.]

Yes, the programmer had decided to decode the input string until he got back a string that was unchanged.

This meant that an attacker could simply provide a multiply-encoded attack sequence which gets past any filters you have, such as WAFs and the like, and which the application happily decodes for you.

Granted, I don’t think WAFs are much good, compared to actually fixing the code, but they can give you a moment’s peace to fix code, as long as your code doesn’t do things to actively prevent the WAF from being able to help.
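To show how a decode-until-unchanged loop undoes a filter, here is a JavaScript sketch. The filter is a deliberately naive stand-in of my own (it decodes once and looks for an angle bracket), not a model of any real WAF, and I have used decodeURIComponent rather than unescape.

```javascript
// Naive stand-in for a filter: decode once, block if "<" appears.
function naiveFilterBlocks(requestValue) {
  return decodeURIComponent(requestValue).includes("<");
}

// The application's habit: keep decoding until nothing changes.
function decodeUntilStable(input) {
  let previous;
  let current = input;
  while (current !== previous) {
    previous = current;
    current = decodeURIComponent(previous);
  }
  return current;
}

const payload = "%253Cscript%253E"; // "<script>" percent-encoded twice

console.log(naiveFilterBlocks(payload)); // the filter sees only "%3Cscript%3E"
console.log(decodeUntilStable(payload)); // the application ends up with "<script>"
```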

Multiple tiers, each decoding

This has essentially the same effect as described above. The request target for an HTTP request may be percent-encoded, and when it is, the server is required to treat it equivalently to the decoded target. This can sometimes have the effect that each server in a multi-tiered service will decode the HTTP request once, achieving the multiple-decode WAF traversal I talk about above.

Spelling correction

image

OK, that’s illustrative, and it illustrates that Google doesn’t fall for this crap.

But it’s interesting how you’ll find occasionally that such a correction results in executing code.

Stopwords and notwords in searches

When finding XSS in searches, we often concentrate on failed searches – after all, in most product catalogues, there isn’t an item called “<script>prompt()</script>” – unless we put it there on a previous attack.

But often the more complex (and easily attacked) code is in the successful search results – so we want to trigger that page.

Sometimes there’s something called “script”, so we squeak that by (there’s a band called “The Script”, and very often writing on things is described as being in a “script” font), but now we have to build Javascript with other terms that match the item on display when we find “script”. Fortunately, there’s a list of words that most search engines are trained to ignore – they are called “stopwords”. These are words that don’t impact the search at all, such as “the”, “of”, “to”, “and”, “by”, etc – words that occur in such a large number of matching items that it makes no sense to allow people to search on those words. Often colours will appear in the list of stopwords, along with generic descriptions of items in the catalogue (“shirt”, “book”, etc).

Well, “alert” is simply "and"[0]+"blue"[1]+"the"[2]+"or"[1]+"the"[0], so you can build function names quickly from stopwords. Once you have String.fromCharCode as a function object, you can create many more strings and functions more quickly. For an extreme example of this kind of “building Javascript from minimal characters”, see this page on how to create all JavaScript from eight basic characters (none of which are alphabetical!)
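Here is that construction as runnable JavaScript. The object fakeWindow is a stand-in I have made up so the sketch runs outside a browser; in a real page the lookup would be against the window object.

```javascript
// Index into stopwords to spell out a function name one letter at a time.
const name = "and"[0] + "blue"[1] + "the"[2] + "or"[1] + "the"[0];
console.log(name); // prints "alert"

// Stand-in for the browser's window object, for illustration only.
const fakeWindow = {
  alert: (message) => "alert called with: " + message,
};

// Property lookup by string turns the built name into a callable function.
console.log(fakeWindow[name]("hello"));
```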

“Notwords” aren’t a thing, but made the title seem more interesting – sometimes it’d be nice to slip in a string that isn’t a stopword, and isn’t going to be found in the search results. Well, many search functions have a grammar that allows us to say things like “I’d like all your teapots except for the ones made from steel” – or more briefly, “teapot !steel”.

How does this help us execute an attack?

Well, we could just as easily search for “<script> !prompt() </script>” – valid JavaScript syntax, which means “run the prompt() function, and return the negation of its result”. Well, too late, we’ve run our prompt command (or other commands). I even had “book !<script> !prompt()// !</script>” work on one occasion.
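If the “too late” part isn’t obvious, this JavaScript sketch shows it. The function fakePrompt is my own stand-in for the browser’s prompt(), so that the example runs anywhere: the negation is applied to the return value, which means the call has already happened.

```javascript
let calls = 0;

// Stand-in for prompt(): records that it ran, then returns some text.
function fakePrompt() {
  calls += 1;
  return "typed text";
}

// The search grammar's "!" is, to JavaScript, just logical negation.
const result = !fakePrompt();

console.log(calls);  // the function ran once
console.log(result); // the negated return value, which nobody cares about
```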

Helped by the browser

So, now that we’ve seen some examples of the server or its application helping us to exploit an XSS, what about the browser?

Carry on parsing

One of the fun things I see a lot is servers blocking XSS by ensuring that you can’t enter a complete HTML tag except for the ones they approve of.

So, if I can’t put that closing “>” in my attack, what am I to do? I can’t just leave it out.

Well, strange things happen when you do. Largely because most web pages are already littered with closing angle brackets – they’re designed to close other tags, of course, not the one you’ve put in, but there they are anyway.

So, you inject “<script>prompt()</script>” and the server refuses you. You try “<script prompt() </script” and it’s allowed, but can’t execute.

So, instead, try a single tag, like “<img src=x onerror=prompt()>” – it’s rejected, because it’s a complete tag, so just drop off the terminating angle bracket. “<img src=x onerror=prompt()” – so that the next tag doesn’t interfere, add an extra space, or an “x=”:

<img src=x onerror=prompt() x=

If that gets injected into a <p> tag, it’ll appear as this:

<p><img src=x onerror=prompt() x=</p>

How’s your browser going to interpret that? Simple – open p tag, open img tag with src=x, onerror=prompt() and some attribute called “x”, whose value is “</p”.

If confused, close a tag automatically

Occasionally, browser heuristics and documented standards will be just as helpful to you as the presence of characters in the web page.

Can’t get a “/” character into the page? Then you can’t close a <script> tag. Well, that’s OK, because the <svg> tag can include scripts, and is documented to end at the next HTML tag that isn’t valid in SVG. So… “<svg><script>prompt()<p>” will happily execute as if you’d provided the complete “<svg><script>prompt()</script></svg><p>”

There are many other examples where the browser will use some form of heuristic to “guess” what you meant, or rather to guess what the server meant with the code it sends to the browser with your injected data. See what happens when you leave your attack half-closed.

Can’t comment with // ? Try other comments

When injecting script, you often want to comment the remaining line after your injection, so it isn’t parsed – a failing parse results in none of your injected code being executed.

So, you try to inject “//” to make the rest of the line a comment. Too bad, all “/” characters are encoded or discarded.

Well, did you know that JavaScript in HTML treats “<!--” as a perfectly valid equivalent?

Different browsers help in different ways

Try attacks in different browsers, because they each behave in subtly different ways.

Firefox doesn’t have an XSS filter. So it won’t prevent XSS attacks that way.

IE 11 doesn’t encode URI elements, so will sometimes work when your attack would otherwise be encoded.

Chrome – well, I don’t use Chrome often enough to comment on its quirks. Too irritated with it trying to install on my system through Adobe Flash updates.

Well, I think that’s enough for now.

Why didn’t you delete my data?

The recent hack of Ashley Madison, and the subsequent discussion, reminded me of something I’ve been meaning to talk about for some time.

Can a web site ever truly delete your data?

This is usually expressed, as my title suggests, by a user asking the web site that hosted that user’s account (and usually directly as a result of a data breach) why that web site still had the user’s data.

This can be because the user deliberately deleted their account, or simply because they haven’t used the service in a long time, and only remembered that they did by virtue of a breach notification letter (or a web site such as Troy Hunt’s haveibeenpwned.com).

1. Is there a ‘delete’ feature?

Web sites do not see it as a good idea to have a ‘delete’ feature for their user accounts – after all, what you’re asking is for a site to devote developer resources to a feature that specifically curtails the ability of that web site to continue to make money from the user.

To an accountant’s eye (or a shareholder’s), that’s money out the door with the prospect of reducing money coming in.

To a user’s eye, it’s a matter of security and trust. If the developer deliberately misses a known part of the user’s lifecycle (sunset and deprecation are both terms developers should be familiar with), it’s fairly clear that there are other things likely to be missing or skimped on. If a site allows users to disconnect themselves, to close their accounts, there’s a paradox that says more users will choose to continue their service, because they don’t feel trapped.

So, let’s assume there is a “delete” or “close my account” feature – and that it’s easy to use and functional.

2. Is there a ‘whoops’ feature for the delete?

In the aftermath of the Ashley Madison hack, I’m sure there are going to be a few partners who engage in retributive behaviours. Those behaviours could include connecting to any accounts that the partners have shared, and causing them to be closed, deleted and destroyed as much as possible. It’s the digital equivalent of cutting the sleeves off the cheating partner’s suit jackets. Probably.

Assuming you’ve finally settled down and broken/made up, you’ll want those accounts back under your control.

So there might need to be a feature to allow for ‘remorse’ over the deletion of an account. Maybe not for the jealous partner reason, even, but perhaps just because you forgot about a service you were making use of by that account, and which you want to resurrect.

OK, so many sites have a ‘resurrect’ function, or a ‘cool-down’ period before actually terminating an account.

Facebook, for instance, will not delete your account until you’ve been inactive for 30 days.
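A cool-down like that is not hard to model. Here’s a minimal sketch, assuming a 30-day grace period (borrowed from the Facebook example) and a hypothetical Account class of my own; a real service would also have to deal with persistence, authentication of the person asking for the restore, and the purge job itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed grace period; the 30 days mirrors the Facebook example above.
COOL_DOWN = timedelta(days=30)


@dataclass
class Account:
    user_id: str
    deletion_requested_at: Optional[datetime] = None

    def request_deletion(self, now: datetime) -> None:
        """Mark the account for deletion without destroying anything yet."""
        self.deletion_requested_at = now

    def restore(self, now: datetime) -> bool:
        """Cancel a pending deletion if the cool-down has not expired."""
        if self.deletion_requested_at is None:
            return False  # nothing to restore
        if now - self.deletion_requested_at >= COOL_DOWN:
            return False  # too late: the account is due for purging
        self.deletion_requested_at = None
        return True

    def is_due_for_purge(self, now: datetime) -> bool:
        """True once the cool-down has fully elapsed."""
        return (
            self.deletion_requested_at is not None
            and now - self.deletion_requested_at >= COOL_DOWN
        )
```

The design choice worth noticing is that “delete” only records a timestamp; the destructive step is a separate, later decision, which is exactly what gives the remorseful user a window to change their mind.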

3. Warrants to search your history

Let’s say you’re a terrorist. Or a violent criminal, or a drug baron, or simply someone who needs to be sued for slanderous / libelous statements made online.

OK, in this case, you don’t WANT the server to keep your history – but to satisfy warrants of this sort, a lawyer is likely to tell the server’s operators that they have to keep history for a specific period of time before discarding it. This allows for court orders and the like to be executed against the server to enforce the rule of law.

So your server probably has to hold onto that data for more than the 30 day inactive period. Local laws are going to put some kind of statute on how long a service provider has to hold onto your data.

As an example, a retention notice served under the UK’s rather steep RIPA law could say the service provider has to hold on to some types of data for as much as 12 months after the data is created.

4. Financial and Business records

If you’ve paid for the service being provided, those transaction details have to be held for possible accounting audits for the next several years (in the US, between 3 and 7 years, depending on the nature of the business, last time I checked).

Obviously, you’re not going to expect an audit to go into complete investigations of all your individual service requests – unless you’re billed to that level. Still, this record is going to consist of personal details of every user in the system, amounts paid, service levels given, a la carte services charged for, and some kind of demonstration that service was indeed provided.

So, even if Ashley Madison, or whoever, provided a “full delete” service, there’s a record that they have to keep somewhere that says you paid them for a service at some time in the past.

Eternal data retention – is it inevitable?

I don’t think eternal data retention is appropriate or desirable. It’s important for developers to know data retention periods ahead of time, and to build them into the tools and services they provide.

Data retention shouldn’t be online

Hackers fetch data from online services. Offline services – truly offline services – are fundamentally impossible to steal over the network. An attacker would have to find the facility where they’re stored, or the truck the tapes/drives are traveling in, and steal the data physically.

Not that that’s impossible, but it’s a different proposition from guessing someone’s password and logging into their servers to steal data.

Once data is no longer required for online use, and can be stored, move it into a queue for offline archiving. Developers should make sure their archivist has a data destruction policy in place as well, to get rid of data that’s just too old to be of use. Occasionally (once a year, perhaps), they should practice a data recovery, just to make sure that they can do so when the auditors turn up. But they should also make sure that they have safeguards in place to prevent/limit illicit viewing / use of personal data while checking these backups.

Not everything has to be retained

Different classifications of data have different retention periods, something I alluded to above. Financial records are at the top end with seven years or so, and the minutiae of day-to-day conversations can probably be deleted remarkably quickly. Some services actually hype that as a value of the service itself, promising the messages will vanish in a snap, or like a ghost.

When developing a service, you should consider how you’re going to classify data so that you know what to keep and what to delete, and under what circumstances. You may need a lawyer to help with that.
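As a rough illustration of what “classify, then decide” might look like in code, here’s a sketch of a retention schedule. The class names and every period in the table are placeholders of mine: the seven years and twelve months echo the figures mentioned above, the 30 days for messages is a pure assumption, and none of it is legal advice.

```python
from datetime import date, timedelta
from enum import Enum


class DataClass(Enum):
    FINANCIAL = "financial"
    LEGAL_HOLD = "legal_hold"
    CHAT_MESSAGE = "chat_message"


# Illustrative retention periods only; real values need a lawyer's input.
RETENTION = {
    DataClass.FINANCIAL: timedelta(days=7 * 365),  # "seven years or so"
    DataClass.LEGAL_HOLD: timedelta(days=365),     # e.g. a 12-month notice
    DataClass.CHAT_MESSAGE: timedelta(days=30),    # assumed short lifetime
}


def destroy_after(data_class: DataClass, created: date) -> date:
    """Earliest date on which a record of this class may be destroyed."""
    return created + RETENTION[data_class]


def should_destroy(data_class: DataClass, created: date, today: date) -> bool:
    """True once the record has outlived its retention period."""
    return today >= destroy_after(data_class, created)
```

A nightly job could then walk the archive and call should_destroy for each record, which keeps the policy in one table rather than scattered through the code.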

Managing your data makes service easier

If you lay the frameworks in place when developing a service, so that data is classified and has a documented lifecycle, your service naturally becomes more loosely coupled. This makes it smoother to implement, easier to change, and more compartmentalised. This helps speed future development.

Providing user lifecycle engenders trust and loyalty

Users who know they can quit are more likely to remain loyal (Apple aside). If a user feels hemmed in and locked in place, all that’s required is for someone to offer them a reason to change, and they’ll do so. Often your own staff will provide the reason to change, because if you’re working hard to keep customers by locking them in, it demonstrates that you don’t feel like your customers like your service enough to stay on their own.

So, be careful who you give data to

Yeah, I know, “to whom you give data”, thanks, grammar pedants.

Remember some basic rules here:

1. Data wants to be free

Yeah, and Richard Stallman’s windows want to be broken.

Data doesn’t want anything, but the appearance is that it does, because when data is disseminated, it essentially cannot be returned. Just like if you go to RMS’s house and break all his windows, you can’t then put the glass fragments back into the frames.

Developers want to possess and collect data – it’s an innate passion, it seems. So if you give data to a developer (or the developer’s proxy, any application they’ve developed), you can’t actually get it back – in the sense that you can’t tell if the developer no longer has it.

2. Sometimes developers are evil – or just naughty

Occasionally developers will collect and keep data that they know they shouldn’t. Sometimes they’ll go and see which famous celebrities used their service recently, or their ex-partners, or their ‘friends’ and acquaintances.

3. Outside of the EU, your data doesn’t belong to you

EU data protection laws start from the basic assumption that factual data describing a person is essentially the non-transferable property of the person it describes. It can be held for that person by a data custodian, a business with whom the person has a business relationship, or which has a legal right or requirement to that data. But because the data belongs to the person, that person can ask what data is held about them, and can insist on corrections to factual mistakes.

The US, and many other countries, start from the premise that whoever has collected data about a person actually owns that data, or at least that copy of the data. As a result, there’s less emphasis on openness about what data is held about you, and less access to information about yourself.

Ideally, when the revolution comes and we have a socialist government (or something in that direction), the US will pick up this idea and make it clear that service providers are providing a service and acting only as a custodian of data about their customers.

Until then, remember that US citizens have no right to discover who’s holding their data, how wrong it might be, or to ask for it to be corrected.

4. No one can leak data that you don’t give them

Developers should also think about this – you can’t leak data you don’t hold. Similarly, if a user doesn’t give data, or gives incorrect or value-less data, then whatever leaks is fundamentally worthless.

The fallout from the Ashley Madison leak is probably reduced significantly by the number of pseudonyms and fake names in use. Probably.

Hey, if you used your real name on a cheating web site, that’s hardly smart. But then, as I said earlier today, sometimes security is about protecting bad people from bad things happening to them.

5. Even pseudonyms have value

You might use the same nickname at several places; you might provide information that’s similar to your real information; you might link multiple pseudonymous accounts together. If your data leaks, can you afford to ‘burn’ the identity attached to the pseudonym?

If you have a long message history, you have almost certainly identified yourself pretty firmly in your pseudonymous posts, by spelling patterns, word usages, etc.

Leaks of pseudonymous data are less problematic than leaks of eponymous data, but they still have their problems. Unless you’re really good at OpSec.

Finally

Finally, I was disappointed earlier tonight to see that Troy had already covered some aspects of this topic in his weekly series at Windows IT Pro, but I think you’ll see that his thoughts are from a different direction than mine.