On Widespread XSS in Ad Networks

Randy Westergren posted a really great piece entitled “Widespread XSS Vulnerabilities in Ad Network Code Affecting Top Tier Publishers, Retailers”.

Go read it – I’ll wait.

The article triggered a lot of thoughts that I’ll enumerate here:

This is not a new thing – and that’s bad

This was reported by SoftPedia as a “new attack”, but it’s really an old attack. This is just another way to execute DOM-based XSS.

That means that web sites are being attacked by old bugs, not because their own coding is bad, but because they choose to make money from advertising.

And because the advertising industry is just waaaay behind on securing their code, despite being effectively a widely-used framework across the web.

You’ve seen previously on my blog how I attacked Troy Hunt’s blog through his advertising provider, and he’s not the first, or by any means the last, “victim” of my occasional searches for flaws.

It’s often difficult to trace which ad provider is responsible for a piece of vulnerable code, and the hosting site may not realise the nature of their relationship and its impact on security. As a security researcher, it’s difficult to get traction on getting these vulnerabilities fixed.

Important note

I’m trying to get one ad provider right now to fix their code. I reported a bug to them, and they pointed out that it was similar to the work Randy Westergren had written up.

So they are aware of the problem.

It’s over a month later, and the sites I pointed out to them as proofs of concept are still vulnerable.

Partly, this is because I couldn’t get a reliable repro as different ad providers loaded up, but it’s been two weeks since I sent them a reliable repro – which is still working.

Reported a month ago, reliable repro two weeks ago, and still vulnerable everywhere.

[If you’re defending a site and want to figure out which ad provider is at fault, inject a “debugger” statement into the payload, to have the debugger break at the line that’s causing a problem. You may need to do this by replacing “prompt()” or “alert()” with “(function(){debugger})()” – note that it’ll only break into the debugger if you have the debugger open at the time.]
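As a rough illustration of that substitution, here is a minimal sketch that rewrites a test payload so it breaks into the debugger instead of popping a dialog. The name toDebuggerPayload is made up for this example, and the pattern only handles simple calls such as alert(1) with no nested parentheses.

```javascript
// Hypothetical helper (not part of any tool): swap alert(...) / prompt(...)
// calls in a test payload for an expression that triggers the debugger.
function toDebuggerPayload(payload) {
  // Matches simple calls with no nested parentheses, e.g. alert(1) or prompt('x').
  const dialogCall = /\b(?:alert|prompt)\([^()]*\)/g;
  return payload.replace(dialogCall, "(function(){debugger})()");
}

console.log(toDebuggerPayload("'-alert(1)-'"));
// '-(function(){debugger})()-'
```

As noted above, the rewritten payload only stops execution if the browser's developer tools are already open when it runs.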

How the “#” affects the URL as a whole

Randy’s attack example uses a symbol you won’t see at all in some web sites, but which you can’t get away from in others: the “#” or “hash” symbol, also known as the “number” sign. [Don’t call it “pound”, please; that’s a different symbol altogether, “£”.] Here’s his example:

http://nypost.com/#1'-alert(1)-'"-alert(1)-"

Different parts of the URL have different names. The “http:” part is the “protocol” (formally, the “scheme”), which tells the browser how to connect and what commands will likely work. “//nypost.com/” is the host part, and tells the browser where to connect to. Sometimes a port number is used – commonly, 80 or 443 – after the host name and a colon, but before the terminating “/” of the host element. Anything after the host part, and before a question-mark or hash sign, is the “path” – in Randy’s example, the path is left out, indicating he wants the root page. An optional “query” part follows the path, indicated by a question mark at its start, often taking up the rest of the URL. Finally, if a “#” character is encountered, this starts the “anchor” part (formally, the “fragment”), which is everything from after the “#” character on to the end of the URL.
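To make those names concrete, here is a small sketch using the standard URL class found in browsers and Node.js. The example.com address, port, path and query are placeholders chosen for illustration, not taken from Randy’s article.

```javascript
// Illustrative URL only; example.com, the port and the path are placeholders.
const url = new URL("https://example.com:8443/news/story?id=42#comments");

const parts = {
  protocol: url.protocol, // "https:"
  host: url.host,         // "example.com:8443" (host name plus port)
  path: url.pathname,     // "/news/story"
  query: url.search,      // "?id=42"
  anchor: url.hash,       // "#comments"
};
console.log(parts);

// With no path given, as in Randy's example, the path is the root page.
console.log(new URL("http://nypost.com/#1").pathname); // "/"
```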

The “anchor” has a couple of purposes, one by design, and one by evolution. The designed use is to tell the browser where to place the cursor – where to scroll to. I find this really handy if I want to draw someone’s attention to a particular place in an article, rather than have them read the whole story. [It can also be used to trigger an onfocus event handler in some browsers]

The second use is for communication between components on the page, or even on other pages loaded in frames.

The anchor tag is for the browser only

I want to emphasise this – and while Randy also mentioned it, I think many web site developers need to understand this when dealing with security.

The anchor tag is not sent to the server.

The anchor tag does not appear in your server’s logs.

WAFs cannot filter the anchor tag.

If your site is being attacked through abuse of the anchor tag, you not only can’t detect it ahead of time, you can’t do basic forensic work to find out useful things such as “when did the attack start”, “what sort of things was the attacker doing”, “how many attacks happened”, etc.

[Caveat: pedants will note that when browser code acts on the contents of the anchor tag, some of that action will go back to the server. That’s not the same as finding the bare URL in your log files.]
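A minimal sketch of why the logs stay empty: an HTTP client builds the request line from the path and query only. The helper name requestTarget and the example.com URL are assumptions made for this illustration.

```javascript
// Hypothetical helper: the part of a URL that an HTTP client puts on the
// request line, and therefore the most a server or WAF gets to see.
function requestTarget(urlString) {
  const url = new URL(urlString);
  return url.pathname + url.search; // url.hash is deliberately left out
}

console.log(requestTarget("https://example.com/article?id=7#'-alert(1)-'"));
// "/article?id=7"
```

Whatever follows the “#” stays in the browser, where only script running on the page can read it.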

If you have an XSS that can be triggered by code in an anchor tag, it is a “DOM-based XSS” flaw. This means that the exploit happens primarily (or only) in the user’s browser, and no filtering on the server side, or in the WAF (a traditional, but often unreliable, measure against XSS attacks), will protect you.

When trying out XSS attacks to find and fix them, you should try attacks in the anchor tag, in the query string, and in the path elements of the URL if at all possible, because they each will get parsed in different ways, and will demonstrate different bugs.

What does “-alert(1)-” even mean?

The construction Randy uses may seem a little odd:

"-alert(1)-"'-alert(1)-'

With some experience, you can look at this and note that it’s an attempt to inject JavaScript, not HTML, into a quoted string whose injection point doesn’t properly (or at all) escape quotes. The two different quote styles will escape from quoted strings inside double quotes and single quotes alike (I like to put the number ‘2’ in the alert that is escaped by the double quotes, so I know which quote is escaped).
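Here is a rough sketch of how that plays out. The two injection points are hypothetical stand-ins for vulnerable ad code, the payload has the same shape as Randy’s but with a 2 in the double-quoted half, and alert() is replaced by a recording function so the example runs outside a browser.

```javascript
// Two hypothetical injection points that copy attacker text into a quoted
// JavaScript string without escaping quotes.
const inSingleQuotes = (text) => "var ref = '" + text + "';";
const inDoubleQuotes = (text) => 'var ref = "' + text + '";';

// Same shape as Randy's payload, with 2 in the half that escapes double quotes.
const payload = `1'-alert(1)-'"-alert(2)-"`;

// Run the generated code with a stand-in for alert() that records its argument.
function alertsFiredBy(code) {
  const seen = [];
  new Function("alert", code)((value) => { seen.push(value); });
  return seen;
}

console.log(inSingleQuotes(payload)); // var ref = '1'-alert(1)-'"-alert(2)-"';
console.log(alertsFiredBy(inSingleQuotes(payload))); // only alert(1) runs
console.log(alertsFiredBy(inDoubleQuotes(payload))); // only alert(2) runs
```

In each case the half of the payload that matches the surrounding quote style closes the string early, and the other half ends up as harmless string content.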

But why use a minus sign?

Surely it’s invalid syntax?

JavaScript can’t reject “string minus void” up front: in order to discover the types of the two arguments to the “minus” operator, it actually has to evaluate them, which means alert() runs before the subtraction is even attempted. (The subtraction itself then quietly produces NaN rather than an error.) This is a usual side-effect of a dynamic language – in order to determine whether an operation is valid, its arguments have to be evaluated. Compiled languages are usually able to identify specific types at compile time, and tell you when you have an invalid operand.
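A small sketch of that evaluation order, with a made-up fakeAlert function standing in for alert() so it can run anywhere:

```javascript
const calls = [];

// Stand-in for alert(): records that it ran and, like alert(), returns undefined.
function fakeAlert(value) {
  calls.push(value);
}

// Both operands are evaluated before the subtraction is attempted.
const result = "some text" - fakeAlert(1) - "more text";

console.log(calls);                // the call happened: calls holds the value 1
console.log(Number.isNaN(result)); // true
```

The arithmetic result is useless, but by the time the engine works that out the function call has already happened, which is all an attacker needs.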

So, now that we know you can use any operator in there – minus, times, plus, divide, and, or, etc – why choose the minus? Here’s my reasoning. A plus sign in a URL is often converted to a space. A divide (“/”) is often a path component and, like multiplication (“*”), is part of a comment sequence in JavaScript (“//” or “/*”). An “&” is often used to separate arguments in a query string. And a “|” for “or” is possibly going to trigger different flaws such as command injection, and so is best saved for later.

Also, the minus sign is an unshifted character and quick to type.

There are so many other ways to exploit this – finishing the alert with a line-ending comment (“//” or “<!--”), using “prompt” or “confirm” instead of “alert”, using JavaScript obfuscators, etc – but this is a really good, easy injection point.

Another JavaScript syntax abuse is simply to drop “</script>” in the middle of the JavaScript block and then start a new script block, or even just regular HTML. Remember that the HTML parser only hands off to the JavaScript parser once it has found a block between “<script …>” and “</script …>” tags. It doesn’t matter if the closing tag is “within” a JavaScript string, because the HTML parser doesn’t know JavaScript.
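One common way to defend an inline script block against this is to make sure a literal “<” never appears in data written into it. The sketch below assumes the data is being serialised with JSON.stringify; the helper name toInlineScriptLiteral is invented for this example, and it is not a complete encoder for every context.

```javascript
// Hypothetical helper: serialise a value for use inside an inline <script> block.
function toInlineScriptLiteral(value) {
  // JSON.stringify escapes quotes and backslashes but leaves "<" alone, so a
  // "</script>" in the data would survive. The \u003c escape means the same
  // thing to JavaScript and is invisible to the HTML parser.
  return JSON.stringify(value).replace(/</g, "\\u003c");
}

const hostile = "</script><script>alert(1)</script>";

console.log(JSON.stringify(hostile).includes("</script>"));          // true
console.log(toInlineScriptLiteral(hostile).includes("</script>"));   // false
console.log(JSON.parse(toInlineScriptLiteral(hostile)) === hostile); // true
```

The last line shows the escaped literal still decodes to the original string, so the page’s own script sees the same data.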

There’s no single ad provider, and they’re almost all vulnerable

Part of the challenge in repeating these attacks, demonstrating them to others, etc, is that there’s no single ad provider, even on an individual web site.

Two visits to the same web site not only bring back different adverts, but they come through different pieces of code, injected in different ways.

If you don’t capture your successful attack, it may not be possible to reproduce it.

Similarly, if you don’t capture a malicious advert, it may not be possible to prove who provided it to you. I ran into this today with a “fake BSOD” malvert, which pretended to be describing a system error, and filled as much of my screen as it could with a large “alert” dialog, which kept returning immediately, whenever it was dismissed, and which invited me to call for “tech support” to fix my system. Sadly, I wasn’t tracing my every move, so I didn’t get a chance to discover how this ad was delivered, and could only rage at the company hosting the page.

This is one reason why I support ad-blockers

Clearly, ad providers need to improve their security. Until such time as they do so, a great protection is to use an ad-blocker. This may prevent you from seeing actual content at some sites, but you have to ask yourself if that content is worth the security risk of exposing yourself to adverts.

There is a valid argument to be made that ad blockers reduce the ability of content providers to make legitimate profit from their content.

But there is also a valid argument that ad blockers protect users from insecure adverts.

Defence – protect your customers from your ads

Finally, if you’re running a web site that makes its money from ads, you need to behave proactively to prevent your users from being targeted by rogue advertisers.

I’m sure you believe that you have a strong, trusting relationship with the ad providers you have running ads on your web site.

Don’t trust them. They are not a part of your dev team. They see your customers as livestock – product. Your goals are substantially different, and that means that you shouldn’t allow them to write code that runs in your web site’s security context.

What this means is that you should always embed those advertising providers inside an iframe of their own. If they give you code to run, and tell you it’s to create the iframe in which they’ll sit, put that code in an iframe you host on a domain outside your main domain. Because you don’t trust that code.

Why am I suggesting you do that? Because it’s the difference between allowing an advert attack to have limited control, and allowing it to have complete control, over your web site.

If I attack an ad in an iframe, I can modify the contents of the iframe, I can pop up a global alert, and I can send the user to a new page.

If I attack an ad – or its loading code – and it isn’t in an iframe, I can still do all that, but I can also modify the entire page, read secret cookies, insert my own cookies, interact with the user as if I am the site hosting the ad, etc.
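As a sketch of that separation, the helper below generates iframe markup for an ad slot and refuses to do so if the frame would share the page’s origin. Everything named here is an assumption for illustration: the adFrameMarkup function, the example.com and example-frames.net hosts, and the slot URL. The sandbox attribute is an extra restriction beyond what is described above, and real ad code may need more permissions than allow-scripts to function.

```javascript
// Hypothetical helper: build markup for an ad slot hosted away from the page.
function adFrameMarkup(pageUrl, frameUrl) {
  const page = new URL(pageUrl);
  const frame = new URL(frameUrl);

  // Minimum check only: a different origin. Hosting the frame on a separate
  // registered domain, as suggested above, is the stronger separation.
  if (page.origin === frame.origin) {
    throw new Error("Ad frame must not share the page's origin");
  }

  // Escape the characters that matter inside a double-quoted HTML attribute.
  const src = frame.href.replace(/&/g, "&amp;").replace(/"/g, "&quot;");

  // sandbox without allow-same-origin gives the frame a unique origin.
  return `<iframe src="${src}" sandbox="allow-scripts"></iframe>`;
}

console.log(adFrameMarkup(
  "https://www.example.com/news",
  "https://ads.example-frames.net/slot.html?slot=1&size=300x250"
));
```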

If you won’t do it for your customers, at least defend your own page

[Screenshot: capture20160409090703064]

Here’s the front page of a major website with a short script running through an advert with a bug in it.

[I like the tag at the bottom left]

Insist on security clauses with all your ad providers

Add security clauses into your contracts, so that you can pull an ad provider as soon as a security vulnerability is reported to you, and so that the ad providers are aware that you have an interest in the security and integrity of your page and your users. Ask for information on how they enforce security, and how they expect you to securely include them in your page.

[I am not a lawyer, so please talk with someone who is!]

We didn’t even talk about malverts yet

Malverts – malicious advertising – is the term for an attacker getting an ad provider to deliver their attack code to your users, by signing up to provide an ad. Often this is done using apparent advertising copy related to current advertising campaigns, and can look incredibly legitimate. Sometimes, the attack code will be delayed, or region-specific, so an ad provider can’t easily notice it when they review the campaign for inclusion in your web page.

Got a virus you want to distribute? Why write distribution code and try to trick a few people into running it, when for a few dollars, you can get an ad provider to distribute it for you to several thousand people on a popular web site for people who have money?
