Programmer Hubris

Heartbleed–musings while it’s still (nearly) topical

Hopefully, you’ll all know by now what Heartbleed is about. It’s not a virus; it’s a bug in a new feature that was added to a version of OpenSSL, that wasn’t checked very thoroughly before it became part of the standard build, and that has meant, for the last couple of years, that any vulnerable system can have its running memory leached by an attacker who can connect to it. I have a number of observations to make on this, which I haven’t seen elsewhere:

Behavioural Changes to Prevent “the next” Heartbleed

You know me, I’m all about the “defence against the dark arts” side of information security – it’s fun to attack systems, but it’s more interesting to be able to find ways to defend.

Here are some of my suggestions about programming practices that would help:

  1. Don’t adopt new features into established protocols without a clear need to do so. Why was enabling the Heartbeat extension a necessary thing to foist on everyone? Was it a MUST in the RFC? Heartbeat, “keep-alive” and similar measures are a waste of time on most of the Internet’s traffic, either because the application layer already keeps up a constant communication, or because it’s easy to recover and restart. Think very carefully before making a new feature mandatory in a security-related protocol.
  2. This was not a bug in security logic; it was a general secure-coding bug – a missing bounds check – in a critical piece of security code. Secure coding practices should be a part of all developers’ training and process, much as hand-washing is important for doctors whether they’re performing surgery or checking your throat for swelling. In this case, the code submitted should have tripped a “this looks odd” sense in the reviewer, particularly given its paucity of comments and self-explanation (but then, OpenSSL is by and large that way anyway).
  3. Check lengths of buffers. When data is structured, or wrapped in layers, as SSL records are, transform it back into structures and verify it at each layer (there’s a sketch of this after the list). I’m actually trying to say: write object-oriented code to represent objects – whether the language is object-oriented or not. [It’s a matter of some pride for me that I reviewed some of the Fortran code I wrote as a student back in the mid-eighties, and I can see object-orientedness trying to squeeze its way out.]
  4. Pay someone to review code. Make it their job, pay them well to cover the boredom, and hold them responsible for reviewing it properly. If it matters enough to be widely used, it matters enough to be supported.
  5. Stop using magic numbers. No “1 + 2 + 16” – use sizeof, even when it’s bleedin’ obvious.
  6. Unit tests. And then tests written by a QA guy, who’s trying to make it fail.
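
To make items 3 and 5 concrete, here’s a minimal sketch of what defensive record parsing looks like in C. This is not OpenSSL’s code – the structure and names are invented for illustration – but it shows the two habits that would have stopped Heartbleed: validate the claimed length against what actually arrived, and use sizeof rather than magic numbers.

    #include <stddef.h>
    #include <stdint.h>

    /* Invented wire format for illustration: a one-byte type, a two-byte
       big-endian payload length, then the payload. (The real heartbeat
       message also carries at least 16 bytes of padding - the "1 + 2 + 16"
       of item 5 - omitted here for brevity.) */
    struct heartbeat_msg {
        uint8_t type;
        uint16_t payload_len;
        const uint8_t *payload;
    };

    /* Returns 0 on success, -1 if the record doesn't hold together. */
    int parse_heartbeat(const uint8_t *rec, size_t rec_len,
                        struct heartbeat_msg *out)
    {
        /* sizeof, not "1 + 2": a type byte plus a length field. */
        const size_t header_len = sizeof out->type + sizeof out->payload_len;

        if (rec == NULL || out == NULL || rec_len < header_len)
            return -1;                 /* too short to hold even a header */

        out->type = rec[0];
        out->payload_len = (uint16_t)((rec[1] << 8) | rec[2]);
        out->payload = rec + header_len;

        /* The check Heartbleed lacked: the CLAIMED length must fit inside
           the record we ACTUALLY received. */
        if ((size_t)out->payload_len > rec_len - header_len)
            return -1;                 /* the sender is lying about the length */

        return 0;
    }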

Those are just a few ideas off the top of my head. It’s true that this was a HARD bug to find in automated code review, or even in manual code review (though item 2 above tells you that I think the code looked perverse enough for a reviewer to demand better, cleaner code that could at least be read).

National Security reaction – inappropriate!

Clearly, from the number of sites (in all countries) affected negatively by this flaw, from the massive hysteria that has resulted, as well as the significant thefts disclosed to date, this bug was a National Security issue.

So, how does the US government respond to the allegations going around that they had knowledge of this bug for a long time?

By denying the allegations? By asserting they have a mandate to protect?

No, by reminding us that they’ll protect US (and world) industries UNLESS there’s a benefit to spying in withholding and exploiting the bug.

There was even a quote in the New York Times saying:

“You are not going to see the Chinese give up on ‘zero days’ just because we do.”

No, you’re going to see “the Chinese” [we always have to have an identifiable bogeyman] give up on zero days when our response to finding them is to PATCH THEM, not hold them in reserve to exploit at our leisure.

Specifically, if we patch zero days when we find them, those weapons disappear from our adversaries’ arsenals.

If we hold on to zero days when we find them, those weapons are a part of our adversaries’ arsenals (because the bad guys share better than the good guys).

National Security officials should recognise that in cyberwar – which consists essentially of people sending postcards saying “please hit yourself” to one another, and then expressing satisfaction when the recipient does so – you win by defending far more than by attacking.

Many eyeballs review is surprisingly incomplete

It’s often been stated that “many eyeballs” review open source code, and that as a result the reviews are implicitly of better quality than those of closed source code.

Clearly, OpenSSL is an important and widely used piece of security software, and yet this change was, by all accounts, reviewed by three people before being published and widely deployed. Only one of those people works full time for OpenSSL, and another was the author of the feature in question.

There are not “many” eyeballs working on this review. Closed source will often substitute paid eyeballs for quantity of eyeballs, and as a result will often achieve better reviews.

Remember, it’s the quality of the review that counts, and not whether the source is closed or open.

Closed source that is thoroughly reviewed by experts is better than open source that’s barely reviewed at all.

Finally, in case you’re not yet tired of Heartbleed analogies

Yes, XKCD delivered perhaps the most widely used analogy.

But here’s the one I use to describe it to family members.

Imagine you’re manning a reception desk.

Calls come in, you write down messages, and you send them off.

At some point, you realise that this is a waste of paper, so you start writing your messages on a whiteboard.

Wiping the whole whiteboard for each message is a waste of effort, so you only wipe out enough space to write each incoming message.

Some messages are long, some are short.

One day, you are asked to read a message back to the person who left it, just after you wrote it down.

And to make it easy, they tell you how long their message is.

If someone gave you a six-letter message, and asked you to read all six hundred letters of it back to them, you’d be upset, because that’s not how many letters they gave you.

Computers aren’t so smart, they are just really fast idiots.

The computer doesn’t get surprised when you send six characters and ask for six hundred back, so it reads off the entire whiteboard, containing bits and pieces of every message that’s been sent through you.

And because most messages are small, and only some are large, there’s almost an entire message in each response.

Microsoft’s (new!) SDL Threat Modeling Tool 2014

Amid almost no fanfare whatsoever, Microsoft yesterday released a tool I’ve been begging them for over the last five or six years.

[It is not unusual for me to be so persistently demanding – I’ve found it’s often the only way to get what I want.]

As you’ve guessed from the title, this tool is the “SDL Threat Modeling Tool 2014”. Sexy name, indeed.

Don’t they already have one of those?

Well, yeah, kind of. There’s the TAM Threat Analysis & Modeling Tool, which is looking quite creaky with age now, and which I never found to be particularly usable (though some people have had success with it, so I’m not completely dismissive of it). Then there’s the previous versions of the SDL Threat Modeling Tool.

These have had their uses – and certainly it’s noticeable that when I work with a team of developers, one of whom has worked at Microsoft, it’s encouraging to ask “show me your threat model” and have them turn around with something useful to dissect.

So what’s wrong with the current crop of TM tools?

In a word, Cost.

Threat modeling tools from vendors other than Microsoft are pretty pricey. If you’re a government or military contractor, they’re probably great and wonderful. Otherwise, you’ll probably draw your DFDs in PowerPoint (yes, that’s one of the easier DFD tools available to most of you!) and write your threat models in Word.

Unless, of course, you download and use the Microsoft SDL Threat Modeling Tool, which has always been free.

So where’s the cost?

The SDL TM tool itself was free, but it had a rather significant dependency.

Visio.

Visio is not cheap.

As a result, those of us who championed threat modeling at all in our enterprises found it remarkably difficult to get approval to use a free tool that depended on an expensive tool that nobody was going to use.

What’s changed today?

With the release of Microsoft SDL Threat Modeling Tool 2014, Microsoft has finally delivered a tool that allows for the creation of moderately complex DFDs (you don’t want more complex DFDs than that, anyway!), and a threat library-based analysis of those DFDs, without making it depend on anything more expensive or niche than Windows and .NET. [So, essentially, just Windows.]

Yes, that means no Visio required.

Is there anything else good about this new tool?

A quick bullet list of some of the features you’ll like, besides the lack of Visio requirement:

  • Imports from the previous SDL Threat Modeling Tool (version 3), so you don’t have to re-work your existing threat models
  • Multiple diagrams per model, for different levels of DFD
  • Analysis is per-interaction, rather than per-object [scary, but functionally equivalent to per-object]
  • The file format is XML, and is reasonably resilient to modification
  • Objects and data flows can represent multiple types, defined in an XML KnowledgeBase
  • These types can have customised data elements, also defined in XML
  • The rules about what threats to generate are also defined in XML
  • [These together mean an enterprise can create a library of threats for their commonly-used components]
  • Trust boundaries can be lines, or boxes (demonstrating that trust boundaries surround regions of objects)
  • Currently supported by a development team who are responsive to feature requests

Call to Action?

Yes, every good blog post has to have one of these, doesn’t it? What am I asking you to do with this information?

Download the tool. Try it out on a relatively simple project, and see how easy it is to generate a few threats.

Once you’re familiar with the tool, visit the KnowledgeBase directory in the tool’s installation folder, and read the XML files that were used to create your threats.

Add an object type.

Add a data flow type.

Add custom properties that describe your custom types.

Use those custom properties in a rule you create to generate one of the common threats in your environment.

Work with others in your security and development teams to generate a good threat library, and embody it in XML rules that you can distribute to other users of the threat modeling tool in your enterprise.

Document and mitigate threats. Measure how successful you are at predicting threats, at reducing risk, and at impacting security earlier in your development cycle.

Then do a better job on each project.

Ways you haven’t stopped my XSS, Number 2–backslash doesn’t encode quotes in HTML attributes

Last time in this series, I posted an example where XSS was possible because a site’s developer was unaware of the implications of his JavaScript being hosted inside HTML.

This is sort of the opposite of that, noting that time-worn JavaScript (and C, Java, C++, C#, etc) methods don’t always apply to HTML.

The XSS mantra for HTML attributes

I teach that XSS is prevented absolutely by appropriate contextual encoding of user data on its way out of your application and into the page.

The context dictates what encoding you need, whether the context is “JavaScript string”, “JavaScript code”, “HTML attribute”, “HTML content”, “URL”, “CSS expression”, etc, etc.

In the case of HTML attributes, it’s actually fairly simple.

Unless you are putting a URL into an attribute, there are three simple rules:

  1. Every attribute’s value must be quoted, whether with single quotes or double quotes.
  2. If the quote you use appears in the attribute value, it must be encoded.
  3. You must encode any characters which could confuse the encoding. [Encode the encoding characters]

Seems easy, right?

This is all kinds of good, except when you run into a site where the developer hasn’t really thought about their encoding very well.

You see, HTML attribute values are encoded using HTML encoding, not C++ encoding.

To HTML, the back-slash has no particular meaning.

I see this all the time – I want to inject script, but the site only lets me put user data into an attribute value:

<meta name="keywords" content="Wot I searched for">

That’s lovely. I’d like to put "><script>prompt(1)</script> in there as a proof of concept, so that it reads:

<meta name="keywords" content=""><script>prompt(1)</script>">

The dev sees this, and cuts me off, by preventing me from ending the quoted string that makes up the value of the content attribute:

<meta name="keywords" content="\"><script>prompt(1)</script>">

Nice try, Charlie, but that back-slash, it’s just a back-slash. It means nothing to HTML, and so my quote character still ends the string. My prompt still executes, and you have to explain why your ‘fix’ got broken as soon as you released it.

Oh, if only you had chosen the correct HTML encoding, and replaced my quote with “&quot;” [and therefore, also replace every “&” in my query with “&amp;”], we’d be happy.
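
To show what that looks like in practice, here’s a minimal sketch of an HTML attribute encoder in C – my illustration, not any particular library’s API. Note that the ampersand is encoded too, because otherwise the encoding can’t be reversed unambiguously:

    #include <stdio.h>

    /* Encode one user-supplied string for inclusion in a quoted HTML
       attribute value, following the three rules above. A whitelist
       ("encode everything but alphanumerics") would be stronger still. */
    void html_attr_encode(const char *in, FILE *out)
    {
        for (; *in != '\0'; ++in) {
            switch (*in) {
            case '&':  fputs("&amp;", out);  break;  /* encode the encoding character */
            case '"':  fputs("&quot;", out); break;  /* can no longer end the attribute */
            case '\'': fputs("&#39;", out);  break;  /* in case single quotes delimit it */
            case '<':  fputs("&lt;", out);   break;  /* belt and braces */
            default:   fputc(*in, out);      break;
            }
        }
    }

Run my proof of concept through that, and it arrives as &quot;>&lt;script>prompt(1)&lt;/script> – inert text inside the attribute value.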

And this, my friends, is why every time you implement a mitigation, you must test it. And why you follow the security team’s guidance.

Exercise for the reader – how do you exploit this example if I don’t encode the quotes, but I do strip out angle brackets?

Apple’s “goto fail” SSL issue–how do you avoid it?

Context – Apple releases security fix; everyone sees what they fixed

Last week, Apple released a security update for iOS, indicating that the vulnerability being fixed is one that allows SSL / TLS connections to continue even though the server has not been properly authenticated. This is how they described it:

Impact: An attacker with a privileged network position may capture or modify data in sessions protected by SSL/TLS

Description: Secure Transport failed to validate the authenticity of the connection. This issue was addressed by restoring missing validation steps.

Secure Transport is their library for handling SSL / TLS, meaning that the bulk of applications written for these platforms would not adequately validate the authenticity of servers to which they are connected.

Ignore “An attacker with a privileged network position” – this is the very definition of a Man-in-the-Middle (MITM) attacker. We used to be more blasé about this when networking was done with wires, but now that much of our use is wireless (possibly ALL of it, in the case of iOS), the MITM attacker can easily insert themselves into that privileged position on the network.

The other reason to ignore that terminology is that SSL / TLS takes as its core assumption that it is protecting against exactly such a MITM. By using SSL / TLS in your service, you are noting that there is a significant risk that an attacker has assumed just such a privileged network position.

Also note that “failed to validate the authenticity of the connection” means “allowed the attacker to attack you through an encrypted channel which you believed to be secure”. If the attacker can force your authentication to incorrectly succeed, you believe you are talking to the right server, and you open an encrypted channel to the attacker. That attacker can then open an encrypted channel to the server to which you meant to connect, and echo your information straight on to the server, so you get the same behaviour you expect, but the attacker can see everything that goes on between you and your server, and modify whatever parts of that communication they choose.

So this lack of authentication is essentially a complete failure of your secure connection.

As always happens when a patch is released, within hours (minutes?) of the release, the patch has been reverse engineered, and others are offering their description of the changes made, and how they might have come about.

In this case, the reverse engineering was made easier by the availability of open source copies of the source code in use. Note that this is not an intimation that open source is, in this case, any less secure than closed source, because the patches can be reverse engineered quickly – but it does give us a better insight into exactly the code as it’s seen by Apple’s developers.

Here’s the code:

    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;

Yes, that’s a second “goto fail”, and it executes unconditionally – so the last “if”, the one that completes the validation, never runs. Because the check before it succeeded, the “fail” label is then reached with “err” set to 0, and the function reports success without ever having finished validating the connection.
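
As an aside, here’s the same fragment under the “braces around all conditionally-executed statements” style I’ll advocate below – the names are Apple’s, the bracing is mine. With braces, the duplicated line becomes a harmless statement inside the error path, rather than a change of control flow (and an unreachable-code warning would still flag it):

    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0) {
        goto fail;
        goto fail;      /* still a duplicate, but now trapped inside the braces */
    }
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0) {
        goto fail;
    }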

Initial reaction – lots of haha, and suggestions of finger pointing

So, of course, the Internet being what it is, the first reaction is to laugh at the clowns who made such a simple mistake, that looks so obvious.

T-shirts are printed with “goto fail; goto fail;” on them. Nearly 200 have been sold already (not for me – I don’t generally wear black t-shirts).

But really, these are smart guys – “be smarter” is not the answer

This is SSL code. You don’t get let loose on SSL code unless you’re pretty smart to begin with. You don’t get to work as a developer at Apple on SSL code unless you’re very smart.

Clearly “be smart” is already in evidence.

There is a possibility that this is too much in evidence – that the arrogance of those with experience and a track record may have led these guys to avoid some standard protective measures. The evidence certainly fits that view, but then many developers start with that perspective anyway, so in the spirit of working with the developers you have, rather than the ones you theorise might be possible, let’s see how to address this issue long term:

Here are my suggested answers – what are yours?

Enforce indentation in your IDE / check-in process

OK, so it’s considered macho to not rely on an IDE. I’ve never understood that. It’s rather like saying how much you prefer pounding nails in with your bare fists, because it demonstrates how much more of a man you are than the guy with a hammer. It doesn’t make sense when you compare how fast the job gets done, or the silly and obvious errors that turn up clearly when the IDE handles your indenting, colouring, and style for you.

Yes, colouring. I know, colour-blind people exist – and those people should adjust the colours in the IDE so that they make sense. Even a colour-blind person can get shade information to help them. I know syntax colouring often helps me spot when an XSS injection is just about ready to work, when I would otherwise have missed it in all the surrounding garbage of HTML code. The same is true when building code, you can spot when keywords are being interpreted as values, when string delimiters are accidentally unescaped, etc.

The same is true for indentation. Indentation, when it’s caused by your IDE based on parsing your code, rather than by yourself pounding the space bar, is a valuable indication of program flow. If your indentation doesn’t match control flow, it’s because you aren’t enforcing indentation with an automated tool.

What the heck, enforce all kinds of style

Your IDE and your check-in process are a great place to enforce style standards to ensure that code is not confusing to the other developers on your team – or to yourself.

A little secret – one of the reasons I’m in this country in the first place is that I sent an eight-page fax to my bosses in the US, criticising their programming style and blaming (rightly) a number of bugs on the use of poor and inconsistent coding standards. This was true two decades ago using Fortran, and it’s true today in any number of different languages.

The style that was missed in this case – put braces around all your conditionally-executed statements.

I have other style recommendations that have worked for me in the past – meaningful variable names, enforced indenting, maximum level of indenting, comment guidelines, constant-on-the-left of comparisons, don’t include comparisons and assignments in the same line, one line does one thing, etc, etc.

Make sure you back the style requirements with statements as to what you are trying to do with the style recommendation. “Make the code look the same across the team” is a good enough reason, but “prevent incorrect flow” is better.

Make sure your compiler warns on unreachable code

gcc has the option “-Wunreachable-code”.

gcc disabled the option in 2010.

gcc silently disabled the option, because they didn’t want anyone’s build to fail.

This is not (IMHO) a smart choice. If someone has a warning enabled, and has enabled the setting to produce a fatal error on warnings, they WANT their build to fail if that warning is triggered, and they WANT to know when that warning can no longer be relied upon.

So, without a warning on unreachable code, you’re basically screwed when it comes to control flow going where you don’t want it to.

Compile with warnings set to fatal errors

And of course there’s the trouble that’s caused when you have dozens and dozens of warnings, so warnings are ignored. Don’t get into this state – every warning is a place where the compiler is confused enough by your code that it doesn’t know whether you intended to do that bad thing.

Let me stress – if you have a warning, you have confused the compiler.

This is a bad thing.

You can individually silence warnings (with much comments in your code, please!) if you are truly in need of a confusing operation, but for the most part, it’s a great saving on your code cleanliness and clarity if you address the warnings in a smart and simple fashion.
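
In gcc and clang, for instance, that individual silencing looks something like the sketch below. The diagnostic pragmas are real; the function around them is invented for illustration:

    #include <stdint.h>

    uint16_t length_for_wire(uint32_t computed_len)
    {
        /* We really do want the truncating conversion: the wire format
           defines this field as 16 bits, and the caller has already
           range-checked the value. Silence -Wconversion here only,
           with a comment, and restore it straight afterwards. */
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wconversion"
        uint16_t wire_len = computed_len;
    #pragma GCC diagnostic pop
        return wire_len;
    }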

Don’t over-optimise or over-clean your code

The compiler has an optimiser.

It’s really good at its job.

It’s better than you are at optimising code, unless you’re going to get more than a 10-20% improvement in speed.

Making code shorter in its source form does not make it run faster. It may make it harder to read. For instance, this is a perfectly workable form of strstr:

const char * strstr(const char *s1, const char *s2)
{
  return (!s1||!s2||!*s2)?s1:((!*s1)?0:((*s1==*s2&&s1==strstr(s1+1,s2+1)-1)?s1:strstr(s1+1,s2)));
}

Can you tell me if it has any bugs in it?

What’s its memory usage? Processor usage? How would you change it to make it work on case-insensitive comparisons? Does it overflow buffers?

Better still: does it compile to smaller or more performant code, if you rewrite it so that an entry-level developer can understand how it works?

Now go and read the implementation from your CRT. It’s much clearer, isn’t it?
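
For comparison, here’s a hedged sketch of the boring version – the guard clauses mirror the one-liner’s, and your CRT’s implementation will likely do more, and do it faster:

    #include <stddef.h>

    /* Plain, readable strstr: for each position in the haystack, check
       whether the needle matches there. Returns the first match, or
       NULL if there is none; an empty needle matches immediately. */
    const char *strstr_readable(const char *haystack, const char *needle)
    {
        if (haystack == NULL || needle == NULL)
            return haystack;                /* same guards as the one-liner */

        for (; *haystack != '\0'; ++haystack) {
            const char *h = haystack;
            const char *n = needle;
            while (*n != '\0' && *h == *n) {
                ++h;
                ++n;
            }
            if (*n == '\0')
                return haystack;            /* every needle character matched */
        }
        return (*needle == '\0') ? haystack : NULL;
    }

Now the questions above have answers you can reason about: no recursion, no reads past either terminator, and the case-insensitive variant is a one-line change to the comparison.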

Release / announce patches when your customers can patch

Releasing the patch on Friday for iOS and on Tuesday for OS X may have actually been the correct move – but it brings home the point that you should release patches when you maximise the payoff between having your customers patch the issue and having your attackers reverse engineer it and build attacks.

Make your security announcements findable

Where is the security announcement at Apple? I go to apple.com and search for “iOS 7.0.6 security update”, and I get nothing. It’d be really nice to find the bulletin right there. If it’s easier to find your documentation from outside your web site than from inside, you have a bad search engine.

Finally, a personal note

People who know me may have the impression that I hate Apple. It’s a little more nuanced than that.

I accept that other people love their Apple devices. In many ways, I can understand why.

I have previously owned Apple devices – and I have tried desperately to love them, and to find why other people are so devoted to them. I have failed. My attempts at devotion are unrequited, and the device stubbornly avoids helping me do anything useful.

Instead of a MacBook Pro, I now use a ThinkPad. Instead of an iPad (remember, I won one for free!), I now use a Surface 2.

I feel like Steve Jobs turned to me and quoted Dr Frank N Furter: “I didn’t make him for you.”

So, no, I don’t like Apple products FOR ME. I’m fine if other people want to use them.

This article is simply about a really quick and easy example of how simple faults cause major errors, and what you can do, even as an experienced developer, to prevent them from happening to you.

Thoughts on a New Year

It’s about this time of year that I think…

  • Why do reporters talk so much about NSA spying and Advanced Persistent Threats, when half the websites in existence will cough up cookies if you search for "-alert(document.cookie)-" ?
  • How can we expect people to write secure code when:
    • they don’t know what it is?
    • they can’t recognise insecure code?
    • it’s easier (more clicks, more thinks, etc) to write insecure code?
  • What does it take for a developer to get:
    • fired?
    • a bad performance review?
    • just mildly discomforted?
  • What is it about developers that makes us all believe that nobody else has written this piece of code before? (or that we can write it better)
  • Every time a new fad comes along, whether it’s XML, PHP, Ruby, etc, why do we spend so much time recognising that it has the same issues as the old ones – but without the fixes?
  • Can we have an article on “the death of passwords” which will explain what the replacement is – and without that replacement turning out to be “a layer in front of a big password”?
  • Should you let your application out (publish it, make it available on the Internet, etc) if it is so fragile that:
    • you can’t patch it?
    • you can’t update the framework or libraries on which it depends (aka patch them)?
    • you don’t want a security penetration test to be performed on it?
  • Is it right to hire developers on the basis that they can:
    • steer a whiteboard to a small function which looks like it might work?
    • understand an obfuscated sample that demonstrates an obscure feature of your favourite framework?
    • tell you how to weigh twelve coins, one of which might be a fake?
    • bamboozle the interviewer with tales of technological wonders the likes of which he/she cannot fathom?
    • sing the old school song?

Ah, who am I kidding, I think those kinds of things all the time.

Ways you haven’t stopped my XSS–Number 1, JavaScript Strings

I saw this again today. I tried smiling, but could only manage a weak grin.

You think you’ve defeated my XSS attack. How did you do that?

Encoding or back-slash quoting the back-slash and quote characters in JavaScript strings

Sure, I can no longer turn this:

<script>
s_prop0="[user-input here]";
</script>

into this, by providing user input that consists of ";nefarious();// :

<script>
s_prop0="";nefarious();//";
</script>

Instead, I get this:

<script>
s_prop0="\";nefarious();//";
</script>

But, and this surprises many web developers, if that’s all you’ve done, I can still close that script tag.

INSIDE THE STRING

Yes, that’s bold, italic and underlined, because developers see this, and think “I have no idea how to parse this”:

<script>
s_prop0="</script><script>nefarious();</script>";
</script>

Fortunately, your browser does.

First it parses it as HTML.

This is important.

The HTML parser knows nothing about your JavaScript, it uses HTML rules to parse HTML bodies, and to figure out where scripts start and end.

So, when the HTML parser sees “<script>”, it creates a buffer. It starts filling that buffer with the first character after the tag, and it ends it with whatever character precedes the very next “</script>” tag it sees.

This means the HTML above gets interpreted as:

1. a block of script that won’t run, because it’s not complete code and generates a syntax error.

s_prop0="

2. a block of script that will run, because it parses properly.

nefarious();

3. a double-quote character, a semi-colon, and an unnecessary end tag that it discards

Obviously, your code is more complex than mine, so this kind of injection has all kinds of nasty effects – but it’s possible for an attacker to hide those (not that the attacker needs to!)

So then, the fix is … what?

If you truly have to insert data from users into a JavaScript string, remember what it’s embedded in – HTML.

There are three approaches:

  1. Validate.
    If at all possible, discard characters willy-nilly. Does the user really need to input anything other than alphanumeric characters and spaces? Maybe you can just reject all those other characters.
  2. Encode.
    Yeah, you fell afoul of encoding, but let’s think about it scientifically this time.
    What are you embedded in? A JavaScript string embedded in HTML. You can’t HTML-encode your JavaScript content (try it and you’ll see it doesn’t work that way), so you have to JavaScript-string-encode anything that might make sense either to the HTML parser OR the JavaScript parser.
    You know I don’t like blacklists, but in this case, the only characters you actually need to encode are the double-quote, the back-slash (because otherwise you can’t uniquely reverse the encoding), and either the less-than or forward-slash.
    But, since I don’t like blacklists, I’d rather you chose to encode everything other than alphanumerics and spaces – it doesn’t cost that much (there’s a sketch of this approach after the list).
  3. Span / Div.
    OK, this is a weird idea, but if you care to follow me, how about putting the user-supplied data into a hidden <span> or <div> element?
    Give it an id, and the JavaScript can reference it by that id. This means you only have to protect the user-supplied data in one place, and it won’t appear a dozen times throughout the document.
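
Here’s a minimal sketch of option 2’s whitelist – my illustration in C, not any particular library. Everything outside the whitelist becomes a JavaScript \xHH escape, which means nothing special to either the HTML parser or the JavaScript parser:

    #include <ctype.h>
    #include <stdio.h>

    /* Encode user data for a double-quoted JavaScript string that is
       itself embedded in HTML. This byte-level sketch assumes
       single-byte characters; real code must handle UTF-8 properly. */
    void js_string_encode(const char *in, FILE *out)
    {
        for (; *in != '\0'; ++in) {
            unsigned char c = (unsigned char)*in;
            if (isalnum(c) || c == ' ')
                fputc(c, out);                 /* the whitelist: alphanumerics and space */
            else
                fprintf(out, "\\x%02X", c);    /* '<', '/', '"' and '\\' all land here */
        }
    }

The </script><script>nefarious();</script> payload then comes out as \x3C\x2Fscript\x3E… – the HTML parser never sees an end tag, and the JavaScript parser sees only an ordinary string.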

A note on why I don’t like the blacklists

OK, aside from last weekend’s post, where I demonstrated how a weak blacklist is no defence, it’s important to remember that the web changes day by day. Not every browser is standard, and they each try to differentiate themselves from the other browsers by introducing “killer features” that the other browsers don’t have for a few weeks.

As a result, you can’t really rely on the HTML standard as the one true documentation of all things a browser may do to your code.

Tags change, who knows if tomorrow a <script> tag might not be “pausable” by a <pause>Some piece of text</pause> tag? Ludicrous, maybe, until someone decides it’s a good idea. Or something else.

As a result, if you want to be a robust developer who produces robust code, you need to think less in terms of “what’s the minimum I have to encode?”, and more in terms of “what’s the cost of encoding, and what’s the cost of failure if I don’t encode something that needs it?”

There is no such thing as “small sample code”

Every few months, something encourages me to make the tweet that:

There is no such thing as “small sample code”, every sample you publish is an SDK of its own

OK, so the choice of calling these “SDKs” is rooted in my Microsoft dev background, where “sample code” didn’t need documentation or bug tracking, whereas an SDK does. You can adjust the terminology to suit.

The basic point here is to remind you that you do not get to abrogate all responsibility by saying “this is sample code, you will need to add error checking and security”, even if you do say it in the article – even if you say it in the comments of the sample!

Why do I care so much? It’s only three lines of code!

Simply stated, I’ve seen too many cases where people have included three lines of code (or five, or twenty, the count doesn’t matter) into a program, and they’ve stepped away and shipped that code.

“It wasn’t my fault,” they say, when the incident happens, “I copied that code from a sample online.”

This is the point at which the re-education machine is engaged – because, of course, it totally is your fault, if you include code in your development without treating it with the same rigour as if you had written every line of it yourself. You will get punished – usually by having to stay late and fix it.

It’s also the sample writer’s fault.

He gave you the mini-SDK that you imported blindly into your application, without testing it, without checking errors in it, without appropriate security measures, and he brushed you off with “well, of course, you should add your own error checks and security magic to it”.

Here’s an example of what I’m talking about, courtesy of Troy Hunt linking to an ASP forum.

No, if you’re providing sample code on the Internet, it’s important to make sure it doesn’t embody BAD design; this is code that will be taken up by people by definition less keen, less eager, less smart and less motivated to do things right than you are – after all, rather than figuring out how to write this code for themselves, they are allowing you to do it for them, to teach them how it’s done. If you then teach them how it’s done badly, that’s how they will learn to do it – badly. And they will teach others.

So, instead, make your three line samples five lines, and add enough error checking that unexpected issues or other bad things will break the sample’s execution.
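
As an invented illustration of the difference, compare the classic three-line file-reading sample with the version I’d rather see published – the same lesson, but failures stop it loudly instead of letting it limp onward:

    #include <stdio.h>

    /* The three-line sample that gets copied into production:
       FILE *f = fopen(path, "r");
       fgets(buf, sizeof buf, f);
       fclose(f);
       If fopen fails, fgets dereferences NULL. */

    /* The slightly longer version: same lesson, plus error checking. */
    int read_first_line(const char *path, char *buf, size_t buf_len)
    {
        FILE *f = fopen(path, "r");
        if (f == NULL) {
            perror(path);            /* say what went wrong, and where */
            return -1;
        }
        if (fgets(buf, (int)buf_len, f) == NULL)
            buf[0] = '\0';           /* empty file: return an empty string */
        fclose(f);
        return 0;
    }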

Oh yeah, and what about updates, when you find a horrendous bug – how do you distribute those?

Why don’t we do that?

Reading a story on the consequences of the theft of Adobe’s source code by hackers, I come across this startling phrase:

The hackers seem to be targeting vulnerabilities they find within the stolen code. The prediction is that they’re sifting through the code, attempting to find widespread weaknesses, intending to exploit them with maximum effect by using zero-day attacks.

What I’d love to know is why we aren’t seeing a flood of developers crying out to be educated in how they, too, can learn to sift through their own code and find widespread weaknesses, so they can shore those up and prevent their code from being exploited.

An example of the sort of comments we are seeing can be found here, and they are fairly predictable – “does this mean Open Source is flawed, if having access to the source code is a security risk”, schadenfreude at Adobe’s misfortune, all manner of assertions that Adobe weren’t a very secure company anyway, etc.

Something that’s missing is an acknowledgement that we are all subject to the same pool of developers.

And attackers.

So, if you’re in the business of developing software – whether to sell, licence, give away, or simply to use in your own endeavours, you’re essentially in the same boat as Adobe prior to the hackers breaching their defences. Possibly the same boat as Adobe after the breach, but prior to the discovery.

Unless you are doing something different to what Adobe did, you are setting yourself up to be the next Adobe.

Obviously, Adobe isn’t giving us entire details of their own security program, and what’s gone right or wrong with it, but previous stories (as early as mid-2009) indicated that they were working closely with Microsoft to create an SDL (Security Development Lifecycle) for Adobe’s development.

So, instead of being all kinds of smug that Adobe got hacked, and you didn’t, maybe you should spend your time wondering if you can improve your processes to even reach the level Adobe was at when they got hacked.

And, to bring the topic back to what started the discussion – are you even doing to your software what these unidentified attackers are doing to Adobe’s code?

Are you poring over your own source code to find flaws?

How long are you spending to do that, and what tools are you using to do so?

Training developers to write secure code

I’ve done a fair amount of developer training recently, and it seems like there are a number of different kinds of responses to my security message.

[You can safely assume that there’s also something that’s wrong with the message and the messenger, but I want to learn about the thing I likely can’t control or change – the supply of developers]

Here are some unfairly broad descriptions of stereotypes I’ve encountered along the way. The truth, as ever, is more nuanced, but I think if I can reach each of these target personas, I should have just about everyone covered.

Is there anyone I’ve missed?

The previous victim

I’m always happy to have one or more of these people in the room – the sort of developer who has some experience, and has been on a project that was attacked successfully at some point or another.

This kind of developer has likely quickly learned the lesson that even his own code is subject to attack, vulnerable and weak to the persistent probes of attackers. Perhaps his experience has also included examples of his own failures in more ordinary ways – mere bugs, with no particular security implications.

Usually, this will be an older developer, because experience is required – and his tales of terror, unrehearsed and true, can sometimes provide the “scared straight” lesson I try to deliver to my students.

The previous attacker

This guy is usually a smart, younger individual. He may have had some previous nefarious activity, or simply researched security issues by attacking systems he owns.

But for my purposes, this guy can be too clever, because he distracts from my talk of ‘least privilege’ and ‘defence in depth’ with questions about race conditions, side-channel attacks, sub-millisecond time deltas across multi-second latency routes, and the like. IF those were the worst problems we see in this industry, I’d focus on them – but sadly, sites are still vulnerable to simple attacks, like my favourite – Reflected XSS in the Search field. [Simple exercise – watch a commercial break, and see how many of the sites advertised there have this vulnerability in them.]

But I like this guy for other reasons – he’s a possible future hire for my team, and a probable future assistant in finding, reporting and addressing vulnerabilities. Keeping this guy interested and engaged is key to making sure that he tells me about his findings, rather than sharing them with friends on the outside, or exploiting them himself.

“I did a security class at college”

Unbelievably to me, there are people who “done a project on it”, and therefore know all they want to about security. If what I was about to tell them was important, they’d have been told it by their professor at college, because their professor knew everything of any importance.

I personally wonder if this is going to be the kind of SDE who will join us for a short while, and not progress – because the impression they give to me is that they’ve finished learning right before their last final exam.

Salaryman

Related to the previous category is the developer who only does what it takes to get paid and to receive a good performance review.

I think this is the developer I should work the hardest to try and reach, because this attitude lies at the heart of every developer on their worst days at their desk. When the passion wanes, or the task is uninteresting, the desire to keep your job, continue to get paid, and progress through your career while satisfying your boss is the grinding cog that keeps you moving forward like a wind-up toy.

This is why it is important to keep searching to find ways of measuring code quality, and rewarding people who exhibit it – larger rewards for consistent prolonged improvement, smaller but more frequent rewards to keep the attention of the developer who makes a quick improvement to even a small piece of code.

Sadly, this guy is in my class because his boss told him he ought to attend. So I tell him at the end of my class that he needs to report back to his boss the security lesson that he learned – that all of his development-related goals should have the adverb “securely” appended to them. So “develop feature X” becomes “develop feature X securely”. If that is the one change I can make to this developer’s goals, I believe it will make a difference.

Fanboy

I’ve been doing this for long enough that I see the same faces in the crowd over and over again. I know I used to be a fanboy myself, and so I’m aware that sometimes this is because these folks learn something new each time. That’s why I like to deliver a different talk each time, even if it’s on the same subject as a previous lesson.

Or maybe they just didn’t get it all last time, and need to hear it again to get a deeper understanding. Either way, repeat visitors are definitely welcome – but I won’t get anywhere if that’s all I get in my audience.

Vocational

Some developers do the development thing because they can’t NOT write code. If they were independently wealthy and could do whatever they want, they’d be behind a screen coding up some fun little app.

I like the ones with a calling to this job, because I believe I can give them enough passion in security to make it a part of their calling as well. [Yes, I feel I have a calling to do security – I want to save the world from bad code, and would do it if I was independently wealthy.]

Stereotypical / The Surgeon

Sadly, the hardest person to reach – harder even than the Salaryman – is the developer who matches the stereotypical perception of the developer mindset.

Convinced of his own superiority and cleverness, even if he doesn’t express it directly in such conceited terms, this person will see every suggested approach as beneath him, and every example of poor code as yet more proof of his own superiority.

“Sure, you’ve had problems with other developers making stupid security mistakes,” he’ll think to himself, “But I’m not that dumb. I’ve never written code that bad.”

I certainly hope you won’t ever write code as bad as the examples I give in my classes – those are errant samples of code written in haste, and which I wouldn’t include in my class if they didn’t clearly illustrate my point. But my point is that your colleagues – everyone around you – are going to write this bad a piece of code one day, and it is your job to find it. It is also their job to find it in the code you write, so either you had better be truly as good as you think you are, or you had better apply good security practices so they don’t find you at your worst coding moment.

Playing with security blogs

I’ve found a new weekend hobby – it takes only a few minutes, is easily interruptible, and reminds me that the state of web security is such that I will never be out of a job.

I open my favourite search engine (I’m partial to Bing, partly because I get points, but mostly because I’ve met the guys who built it), search for “security blog”, and then pick one at random.

Once I’m at the security blog site – often one I’ve never heard of, despite it being high up in the search results – I find the search box and throw a simple reflected XSS attack at it.

If that doesn’t work, I view the source code for the results page I got back, and use the information I see there to figure out what reflected XSS attack will work. Then I try that.

[Note: I use reflected XSS, because I know I can only hurt myself. I don’t play stored XSS or SQL injection games, which can easily cause actual damage at the server end, unless I have permission and I’m being paid.]

Finally, I try to find who I should contact about the exploitability of the site.

It’s interesting just how many of these sites are exploitable – some of them falling to the simplest of XSS attacks – and even more interesting to see how many sites don’t have a good, responsive contact address (or prefer simply not to engage with vuln discoverers).

So, what do you find?

I clearly wouldn’t dream of disclosing any of the vulnerabilities I’ve found until well after they’re fixed. Of course, after they’re fixed, I’m happy to see a mention that I’ve helped move the world forward a notch on some security scale. [Not sure why I’m not called out on the other version of that changelog.] I might allude to them on my twitter account, but not in any great detail.

From clicking the link to exploit is either under ten minutes or not at all – and reporting generally takes another ten minutes or so, most of which is hunting for the right address. The longer portion of the game is helping some of these guys figure out what action needs to be taken to fix things.

Try using a WAF – NOT!

You can try using a WAF to solve your XSS problem, but then you’ve got two problems – a vulnerable web site, and that you have to manage your WAF settings. If you have a lot of spare time, you can use a WAF to shore up known-vulnerable fields and trap known attack strings. But it really doesn’t ever fix the problem.

Don’t echo my search query

If you can, don’t echo back to me what I sent you, because that’s how these attacks usually start. Don’t even include it in comments, because a good attack will just terminate the comment and start injecting HTML or script.

Remove my strange characters

Unless you’re running a source code site, you probably don’t need me to search for angle brackets, or a number of other characters. So strip them from my search – or reject the search outright if I include them.

Encode everything

OK, so you don’t have to encode the basics – what are the basics? I tend to start with alphabetic and numeric characters, maybe also a space. Encode everything else.

Which encoding?

Yeah, that’s always the hard part. Encode it using the right encoding. That’s the short version. The long version is that you figure out what’s going to decode it, and make sure you encode for every layer that will decode. If you’re putting my text into a web page as a part of the page’s content, HTML encode it. If it’s in an attribute string, quote the characters using HTML attribute encoding – and make sure you quote the entire attribute value! If it’s an attribute string that will be used as a URL, you should URL encode it. Then you can HTML encode it, just to be sure.

[Then, of course, check that your encoding hasn’t killed the basic function of the search box!]
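
To make the layering concrete, here’s a sketch in C – invented helpers, not any particular library – of user input landing in a URL inside an HTML attribute. The browser decodes outside-in (HTML attribute first, then URL), so you encode inside-out (URL first, then HTML attribute):

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Percent-encode everything outside the unreserved set (RFC 3986). */
    static void url_encode(const char *in, char *out, size_t out_len)
    {
        size_t pos = 0;
        for (; *in != '\0' && pos + 4 < out_len; ++in) {
            unsigned char c = (unsigned char)*in;
            if (isalnum(c) || strchr("-._~", c) != NULL)
                out[pos++] = (char)c;
            else
                pos += (size_t)snprintf(out + pos, out_len - pos, "%%%02X", c);
        }
        out[pos] = '\0';
    }

    /* HTML-attribute-encode the handful of characters that matter. */
    static void attr_encode(const char *in, char *out, size_t out_len)
    {
        size_t pos = 0;
        for (; *in != '\0' && pos + 7 < out_len; ++in) {
            const char *rep = NULL;
            switch (*in) {
            case '&': rep = "&amp;";  break;
            case '"': rep = "&quot;"; break;
            case '<': rep = "&lt;";   break;
            default:  break;
            }
            if (rep != NULL) {
                strcpy(out + pos, rep);
                pos += strlen(rep);
            } else {
                out[pos++] = *in;
            }
        }
        out[pos] = '\0';
    }

    int main(void)
    {
        /* The attacker's search term from the posts above. */
        const char *user_query = "\"><script>prompt(1)</script>";
        char urlenc[256], attrenc[512];

        /* Encode for the innermost decoder first, the outermost last. */
        url_encode(user_query, urlenc, sizeof urlenc);
        attr_encode(urlenc, attrenc, sizeof attrenc);
        printf("<a href=\"/search?q=%s\">Search again</a>\n", attrenc);
        return 0;
    }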

Respond to security reports

You should definitely respond to security reports. I understand that not everyone can have a 24/7 response team watching their blog (I certainly don’t), but you should try to respond within a couple of days, and anything under a week is probably going to be alright. Some vuln discoverers are upset if they don’t get a response much sooner, and see that as cause to publish their findings.

Me, I send a message first to ask if I’ve found the right place to send a security vulnerability report to, and only when I receive a positive acknowledgement do I send on the actual details of the exploit.

Be like Billy – Mind your XSS Manners!

I’ve said before that I wish programmers would respond to reports of XSS as if I’d told them I caught them writing a bubble sort implementation in Cobol. Full of embarrassment at being such a beginner.