Programmer Hubris – Tales from the Crypto

Programmer Hubris

1 2 3 13

What does “input validation” even mean any more?

Information Security is full of terminology.

Sometimes we even understand what we mean. I’ve yet to come across a truly awesome, yet brief, definition of “threat”, for instance.

But one that bugs me, because it shouldn’t be that hard to get right, and because I hear it from people I otherwise respect greatly, is that of “input validation”.

What is “validation”?

Fight me on this, but I think that validation is essentially a yes/no decision on a set of input, whether it’s textual, binary, or whatever other format you care to define.

Exactly what you are validating is up for debate, whether you’re looking at syntax or semantics – is it formatted correctly, versus does it actually make sense?

Syntax versus semantics

“Green ideas sleep furiously” is a famous example of a sentence that is syntactically correct – it follows a standard “Adjective noun verb adverb” pattern that is common in English – but semantically, it makes no sense: ideas can’t be green, and they can’t sleep, and nothing can sleep furiously (although my son used to sleep with his fists clenched really tight when he was a little baby).

“0 / 0” is a syntactically correct mathematical expression, but you can argue if it’s semantically correct.

“Sell 1000 shares” might be a syntactically correct instruction, but semantically, it could be you don’t have 1000 shares, or there’s a business logic limit, which says such a transaction requires extra authentication.

So there’s a difference between syntactical validation and semantic validation, but


What’s that got to do with injection exploits?

Injection attacks occur when an input data – a string of characters – is semantically valid in the language of the enclosing code, as code itself, and not just as data. Sometimes (but not always) this means the data contains a character or character sequence that allows the data to “escape” from its data context to a code context.

Can validation stop injection exploits?

This is a question I ask, in various round-about ways, in a lot of job interviews, so it’s quite an important question.

The answer is really simple.

Yes. And no.

If you can validate your input, such that it is always syntactically and semantically correct, you can absolutely prevent injection exploits.

But this is really only possible for relatively simple sets of inputs, and where the processing is safe for that set of inputs.

How about an example?

An example – suppose I’ve got a product ordering site, and I’m selling books.

You can order an integer number of books. Strictly speaking, positive integers, and 0 makes no sense, so start at 1. You probably want to put a maximum limit on that field, perhaps restricting people to buying no more than a hundred of that book. If they’re buying more, they’ll want to go wholesale anyway.

So, your validation is really simple – “is the field an integer, and is the integer value between 1 and 100?”

What about a counter-example?

Having said “yes, and no”, I have to show you an example of the “no”, right?

OK, let’s say you’re asking for validation of names of people – what’s your validation rules?

Let’s assume you’re expecting everyone to have ‘latinised’ their name, to make it easy. All the letters are in the range a-z, or A-Z if there’s a capital letter.

Great, so there’s a rule – only match “[A-Za-z]”

Unless, you know, Leonardo da Vinci. Or di Caprio. So you need spaces.

Or Daniel Day-Lewis. So there’s also hyphens to add.

And if you have an O’Reilly, an O’Brian, or a D’Artagnan, or a N’Dour – yes, you’re going to add apostrophes.

Now your validation rule is letting in a far broader range of characters than you start out with, and there’s enough there to allow for SQL injection to happen.

Input can now be syntactically correct by your validation rule, and yet semantically equivalent to data plus SQL code.

Validation alone is insufficient to block injection attacks.

Why do people say validation is sufficient, then?

I have a working hypothesis. It goes like this.

As a neophyte in information security, you learn a trick.

That trick is validation, and it’s a great thing to share with developers.

They don’t need to be clever or worry hard about the input that comes in, they simply need to validate it.

It actually feels good to reject incorrect input, because you know you’re keeping the bad guys out, and the good guys in.

Then you find an input field where validation alone isn’t sufficient.

Something else must be done

But you’ve told everyone – and had other security folk agree with you – that validation is the way to solve injection attacks.

So you learn a new trick – a new way of protecting inputs.

And you call this ‘validation’, too

After all, it 
 uhh, kind of does the same thing. It stops injection attacks, so it must be validation.

What is this ‘new trick’?

This new trick is encoding, quoting, or in some way transforming the data, so the newly transformed data is safe to accept.

Every one of those apostrophes? Turn them into the sequence “'” if they’re going into HTML, or double them if they’re in a SQL string, or – and this is FAR better – use parameterised queries so you don’t have to even know how the input string is being encoded on its way into the SQL command.

Now your input can be validated – and injection attacks are stopped.

But it’s not validation any more

In fact, once you’ve encoded your inputs properly, your validation can be entirely open and empty! At least from the security standpoint, because you’ve made the string semantically entirely meaningless to the code in which it is to be embedded as data. There are no escape characters or sequences, because they, too, have been encoded or transformed into semantically safe data.

It’s encoding
 or transformation.

And I happen to think it’s important to separate the two concepts of validation and encoding.

Validation is saying “yes” or “no” to the question “is this string ‘good’ data?” You can validate in a number of different ways, and with good defence in depth, you’ll validate at different locations, based on different knowledge about what is “good”. This matches very strongly with the primary dictionary definition of “validation” – it’s awesome when a technical term matches very closely with a common language term, because teaching it to others becomes easier.

Encoding doesn’t say “yes” or “no”, encoding simply takes whatever input it’s given, and makes it safe for the next layer to which the data will be handed.

Stop calling encoding “validation”

It’s not.

Lessons from my first official bounty

Sometimes, it’s just my job to find vulnerabilities, and while that’s kind of fun, it’s also a little unexciting compared to the thrill of finding bugs in other people’s software and getting an actual “thank you”, whether monetarily or just a brief word.

About a year ago, I found a minor Cross-Site Scripting (XSS) flaw in a major company’s web page, and while it wasn’t a huge issue, I decided to report it, as I had a few years back with a similar issue in the same web site. I was pleased to find that the company was offering a bounty programme, and simply emailing them would submit my issue.

Lesson 1: The WAF
 it does nothing!

The first thing to notice, as with all XSS issues, is that there were protections in place that had to be got around. In this case, some special characters or sequences were being blocked. But not all. And it’s really telling that there are still many websites which have not implemented widespread input validation / output encoding as their XSS protection. So, while the WAF slowed me down even when I knew the flaw existed, it only added about 20 minutes to the exploit time. So, my example had to use “confirm()” instead of “alert()” or “prompt()”. But really, if I was an attacker, my exploit wouldn’t have any of those functions, and would probably include an encoded script that wouldn’t be detected by the WAF either. WAFs are great for preventing specific attacks, but aren’t a strong protection against an adversary with a little intelligence and understanding.

Lesson 2: A quick response is always welcome

My email resulted in an answer that same day, less than an hour after my initial report. A simple “thank you”, and “we’re forwarding this to our developers” goes a long way to keeping a security researcher from idly playing with the thought of publishing their findings and moving on to the next game.

Lesson 3: The WAF
 still does nothing!

In under a week, I found that the original demo exploit was being blocked by the WAF – but if I replaced “onclick” with “oclick”, “onmouseover” with “omouseover”, and “confirm” with “cofirm”, I found the blocking didn’t get in the way. Granted, since those aren’t real event handlers or JavaScript functions, I can’t use those in a real exploit, but it does mean that once again, all the WAF does is block the original example of the attack, and it took only a few minutes again to come up with another exploit string.

Lesson 4: Keep communicating

If they’d told me “hey, we’re putting in a WAF rule while we work on fixing the actual bug”, I wouldn’t have been so eager to grump back at them and say they hadn’t fixed the issue by applying their WAF and by the way, here’s another URL to exploit it. But they did at least respond to my grump and reassure me that, yes, they were still going to fix the application.

Lesson 5: Keep communicating

I heard nothing after that, until in February of this year, over six months later, I replied to the original thread and asked if the report qualified for a bounty, since I noticed that they had actually fixed the vulnerability.

Lesson 6: Keep communicating

No response. Thinking of writing this up as an example of how security researchers still get shafted by businesses – bear in mind that my approach is not to seek out bounties for reward, but that I really think it’s common courtesy to thank researchers for reporting to you rather than pwning your website and/or your customers.

Lesson 7: Use a broker

About a month later, while looking into other things, I found that the company exists on HackerOne, where they run a bug bounty. This renewed my interest in seeing this fixed. So I reported the email exchange from earlier, noted that the bug was fixed, and asked if it constituted a rewardable finding. Again, a simple “thanks for the report, but this doesn’t really rise to the level of a bounty” is something I’ve been comfortable with from many companies (though it is nice when you do get something, even if it’s just a keychain or a t-shirt, or a bag full of stickers).

Lesson 8: Keep communicating

3/14: I got a reply the next day, indicating that “we are investigating”.

3/28: Then nothing for two weeks, so I posted another response asking where things were going.

4/3: Then a week later, a response. “We’re looking into this and will be in touch soon with an update.”

4/18: Me: Ping?

5/7: Me: Hey, how are we doing?

5/16: Anything happening?

Lesson 9: Deliver

5/18: Finally, over two months after my report to the company through HackerOne, and ten months after my original email to the first bug bounty address, it’s addressed.

5/19: The severity of the bug report is lowered (quite rightly, the questionnaire they used pushed me to a priority of “high”, which was by no means warranted). A very welcome bounty, and a bonus for my patience – unexpected but welcome, are issued.

Lesson 10: Learn

The cheapest way to learn things is from someone else’s mistakes. So I decided to share with my readers the things I picked up from this experience.

Here are a few other lessons I’ve picked up from bug bounties I’ve observed:

Lesson 11: Don’t start until you’re ready

If you start a bug bounty, consider how ready you might be. Are you already fixing all the security bugs you can find for yourself? Are you at least fixing those bugs faster than you can find more? Do your developers actually know how to fix a security bug, or how to verify a vulnerability report? Do you know how to expand on an exploit, and find occurrences of the same class of bug? [If you don’t, someone will milk your bounty programme by continually filing variations on the same basic flaw]

Lesson 12: Bug Bounty programmes may be the most expensive way to find vulnerabilities

How many security vulnerabilities do you think you have? Multiply that by an order of magnitude or two. Now multiply that by the average bounty you expect to offer. Add the cost of the personnel who are going to handle incoming bugs, and the cost of the projects they could otherwise be engaged in. Add the cost of the developers, whose work will be interrupted to fix security bugs, and add the cost of the features that didn’t get shipped on time before they were fixed. Sure, some of that is just a normal cost of doing business, when a security report could come at you out of the blue and interrupt development until it’s fixed, but starting a bug bounty paints a huge target on you.

Hiring a penetration tester, or renting a tool to scan for programming flaws, has a fixed cost – you can simply tell them how much you’re willing to pay, and they’ll work for that long. A bug bounty may result in multiple orders of magnitude more findings than you expected. Are you going to pay them all? What happens when your bounty programme runs out of money?

Finding bugs internally using bug bashes, software scanning tools or dedicated development staff, has a fixed cost, which is probably still smaller than the amount of money you’re considering on putting into that bounty programme.

12.1: They may also be the least expensive way to find vulnerabilities

That’s not to say bug bounties are always going to be uneconomical. At some point, in theory at least, your development staff will be sufficiently good at resolving and preventing security vulnerabilities that are discovered internally, that they will be running short of security bugs to fix. They still exist, of course, but they’re more complex and harder to find. This is where it becomes economical to lure a bunch of suckers – excuse me, security researchers – to pound against your brick walls until one of them, either stronger or smarter than the others, finds the open window nobody saw, and reports it to you. And you give them a few hundred bucks – or a few thousand, if it’s a really good find – for the time that they and their friends spent hammering away in futility until that one successful exploit.

At that point, your bug bounty programme is actually the least expensive tool in your arsenal.

Security Questions are Bullshit

 

I’m pretty much unhappy with the use of “Security Questions” – things like “what’s your mother’s maiden name”, or “what was your first pet”. These questions are sometimes used to strengthen an existing authentication control (e.g. “you’ve entered your password on a device that wasn’t recognised, from a country you normally don’t visit – please answer a security question”), but far more often they are used as a means to recover an account after the password has been lost, stolen or changed.

 

I’ve been asked a few times, given that these are pretty widely used, to explain objectively why I have such little disregard for them as a security measure. Here’s the Too Long; Didn’t Read summary:

  1. Most questions are equivalent to a low-entropy password
  2. Answers to many of these questions can be found online in public documents
  3. Answers can be squeezed out of you, by fair means or foul
  4. The same questions – and answers – are shared at multiple sites
  5. Questions (and answers) don’t get any kind of regulatory protection
  6. Because the best answers are factual and unchanging, you cannot change them if they are exposed

Let’s take them one by one:

Questions are like a low-entropy password

What’s your favourite colour? Blue, or Green. At the outside, red, yellow, orange or purple. That covers most people’s choices, in less than 3 bits of entropy.

What’s your favourite NBA team? There’s 29 of those – 30, if you count the 76ers. That’s 6 bits of entropy.

Obviously, there are questions that broaden this, but are relatively easy to guess with a small number of tries – particularly when you can use the next fact about Security Questions.

The Answers are available online

What’s your mother’s maiden name? It’s a matter of public record.

What school did you go to? If we know where you grew up, it’s easy to guess this, since there were probably only a handful of schools you could possibly have attended.

Who was your first boyfriend/girlfriend? Many people go on about this at length in Facebook posts, I’m told. Or there’s this fact:

You’ll tell people the answers

What’s your porn name? What’s your Star Wars name? What’s your Harry Potter name?

All these stupid quizzes, and they get you to identify something about yourself – the street you grew up on, the first initial of your secret crush, how old you were when you first heard saxophones.

And, of course, because of the next fact, all I really have to do is convince you that you want a free account at my site.

Answers are shared everywhere

Every site that you visit asks you variants of the same security questions – which means that you’ll have told multiple sites the same answers.

You’ve been told over and over not to share your password across multiple sites – but here you are, sharing the security answers that will reset your password, and doing so across multiple sites that should not be connected.

And do you think those answers (and the questions they refer back to) are kept securely by these various sites? No, because:

Questions & Answers are not protected like passwords

There’s regulatory protection, under regimes such as PCI, etc, telling providers how to protect your passwords.

There is no such advice for protecting security questions (which are usually public) and the answers to them, which are at least presumed to be stored in a back-end database, but are occasionally sent to the client for comparison against the answers! That’s truly a bad security measure, because of course you’re telling the attacker.

Even assuming the security answers are stored in a database, they’re generally stored in plain text, so that they can be accessed by phone support staff to verify your answers when you call up crying that you’ve forgotten your password. [Awesome pen-testing trick]

And because the answers are shared everywhere, all it takes is a breach at one provider to make the security questions and answers they hold have no security value at all any more.

If they’re exposed, you can’t change them

There’s an old joke in security circles, “my password got hacked, and now I have to rename my dog”. It’s really funny, because there are so many of these security answers which are matters of historical fact – while you can choose different questions, you can’t generally choose a different answer to the same question.

Well, obviously, you can, but then you’ve lost the point of a security question and answer – because now you have to remember what random lie you used to answer that particular question on that particular site.

And for all you clever guys out there


Yes, I know you can lie, you can put in random letters or phrases, and the system may take them (“Your place of birth cannot contain spaces” – so, Las Vegas, New York, Lake Windermere are all unusable). But then you’ve just created another password to remember – and the point of these security questions is to let you log on once you’ve forgotten your password.

So, you’ve forgotten your password, but to get it back, you have to remember a different password, one that you never used. There’s not much point there.

In summary

Security questions and answers, when used for password recovery / reset, are complete rubbish.

Security questions are low-entropy, predictable and discoverable password substitutes that are shared across multiple sites, are under- or un-protected, and (like fingerprints) really can’t be changed if they become exposed. This makes them totally unsuited to being used as password equivalents in account recovery / password reset schemes.

If you have to implement an account recovery scheme, find something better to use. In an enterprise, as I’ve said before, your best bet is to use something that the enterprise does well – the management hierarchy. Every time you forget your password, you have to get your manager, or someone at the next level up from them, to reset your password for you, or to vouch for you to tech support. That way, someone who knows you, and can affect your behaviour in a positive way, will know that you keep forgetting your password and could do with some assistance. In a social network, require the

Password hints are bullshit, too

Also, password hints are bullshit. Many of the Adobe breach’s “password hints” were actually just the password in plain-text. And, because Adobe didn’t salt their password hashes, you could sort the list of password hashes, and pick whichever of the password hints was either the password itself, or an easy clue for the password. So, even if you didn’t use the password hint yourself, or chose a really cryptic clue, some other idiot came up with the same password, and gave a “Daily Express Quick Crossword” quality clue.

How to do password reset

First, a quick recap:

Credentials include a Claim and a Proof (possibly many).

The Claim is what states one or more facts about your identity.

A Username is one example of a Claim. So is Group Membership, Age, Eye Colour, Operating System, Installed Software, etc


The Proof is what allows someone to reliably trust the Claim is true.

A Password is one example of a Proof. So is a Signature, a Passport, etc


Claims are generally public, or at least non-secret, and if not unique, are at least specific (e.g. membership of the group “Brown eyes” isn’t open to people with blue eyes).

Proofs are generally secret, and may be shared, but such sharing should not be discoverable except by brute force. (Which is why we salt passwords).

Now, the topic – password resets

Password resets can occur for a number of reasons – you’ve forgotten your password, or the password change functionality is more cumbersome than the password reset, or the owner of the account has changed (is that allowable?) – but the basic principle is that an account needs a new password, and there needs to be a way to achieve that without knowledge of the existing password.

Let’s talk as if it’s a forgotten password.

So we have a Claim – we want to assert that we possess an identity – but we have to prove this without using the primary Proof.

Which means we have to know of a secondary Proof. There are common ways to do this – alternate ID, issued by someone you trust (like a government authority, etc). It’s important in the days of parody accounts, or simply shared names (is that Bob Smith, his son, Bob Smith, or his unrelated neighbour, Bob Smith?) that you have associated this alternate ID with the account using the primary Proof, or as a part of the process of setting up the account with the primary Proof. Otherwise, you’re open to account takeover by people who share the same name as their target.

And you can legally change your name.

What’s the most common alternate ID / secondary Proof?

E-mail.

Pretty much every public web site relies on the use of email for password reset, and uses that email address to provide a secondary Proof.

It’s not enough to know the email address – that’s unique and public, and so it matches the properties of a Claim, not a Proof, of identity.

We have to prove that we own the email address.

It’s not enough to send email FROM the email address – email is known to be easily forged, and so there’s no actual proof embodied in being able to send an email.

That leaves the server with the prospect of sending something TO the email address, and the recipient having proved that they received it.

You could send a code-word, and then have the recipient give you the code-word back. A shared secret, if you like.

And if you want to do that without adding another page to the already-too-large security area of the site, you look for the first place that allows you to provide your Claim and Proof, and you find the logon page.

By reusing the logon page, you’re going to say that code-word is a new password.

[This is not to say that email is the only, or even the best, way to reset passwords. In an enterprise, you have more reliable proofs of identity than an email provider outside of your control. You know people who should be able to tell you with some surety that a particular person is who they claim to be. Another common secondary identification is the use of Security Questions. See my upcoming article, “Security Questions are Bullshit” for why this is a bad idea.]

So it’s your new password, yes?

Well, yes and no. No, actually. Pretty much definitely no, it’s not your new password.

Let’s imagine what can go wrong. If I don’t know your password, but I can guess your username (because it’s not secret), I can claim to be you wanting to reset your password. That not only creates opportunity for me to fill your mailbox with code-words, but it also prevents you from logging on while the code-words are your new password. A self-inflicted denial of service.

So your old password should continue working, and if you never use the code-word, because you’re busy ignoring and deleting the emails that come in, it should keep working for you.

I’ve frequently encountered situations in my own life where I’ve forgotten my password, gone through the reset process, and it’s only while typing in the new password, and being told what restrictions there are on characters allowed in the new password, that I remember what my password was, and I go back to using that one.

In a very real sense, the code-word sent to you is NOT your new password, it’s a code-word that indicates you’ve gone the password reset route, and should be given the opportunity to set a new password.

Try not to think of it as your “temporary password”, it’s a special flag in the logon process, just like a “duress password”. It doesn’t replace your actual password.

It’s a shared secret, so keep it short-lived

Shared secrets are fantastic, useful, and often necessary – TLS uses them to encrypt data, after the initial certificate exchange.

But the trouble with shared secrets is, you can’t really trust that the other party is going to keep them secret very long. So you have to expire them pretty quickly.

The same is true of your password reset code-word.

In most cases, a user will forget their password, click the reset link, wait for an email, and then immediately follow the password reset process.

Users are slow, in computing terms, and email systems aren’t always directly linked and always-connected. But I see no reason why the most usual automated password reset process should allow the code-word to continue working after an hour.

[If the process requires a manual step, you have to count that in, especially if the manual step is something like “contact a manager for approval”, because managers aren’t generally 24/7 workers, the code-word is going to need to last much longer. But start your discussion with an hour as the base-point, and make people fight for why it’ll take longer to follow the password reset process.]

It’s not a URL

You can absolutely supply a URL in the email that will take the user to the right page to enter the code-word. But you can’t carry the code-word in the URL.

Why? Check out these presentations from this year’s Black Hat and DefCon, showing the use of a malicious WPAD server on a local – or remote – network whose purpose is to trap and save URLs, EVEN HTTPS URLs, and their query strings.

Every URL you send in an email is an HTTP or HTTPS GET, meaning all the parameters are in the URL or in the query string portion of the URL.

This means the code-word can be sniffed and usurped if it’s in the URL. And the username is already assumed to be known, since it’s a non-secret. [Just because it’s assumed to be known, don’t give the attacker an even break – your message should simply say “you requested a password reset on an account at our website” – a valid request will come from someone who knows which account at your website they chose to request.]

So, don’t put the code-word in the URL that you send in the email.

Log it, and watch for repeats, errors, other bad signs

DON’T LOG THE PASSWORD

I have to say that, because otherwise people do that, as obviously wrong as it may seem.

But log the fact that you’ve changed a password for that user, along with when you did it, and what information you have about where the user reset their password from.

Multiple users resetting their password from the same IP address – that’s a bad sign.

The same user resetting their password multiple times – that’s a bad sign.

Multiple expired code-words – that’s a bad sign.

Some of the bad things being signaled include failures in your own design – for instance, multiple expired code-words could mean that your password reset function has stopped working and needs checking. You have code to measure how many abandoned shopping carts you have, so include code that measures how many abandoned password reset attempts you have.

In summary, here’s how to do email password reset properly

  1. Consider whether email is an appropriately reliable secondary proof of identification.
    1. A phone call from the user, to their manager, followed by the manager resetting the user’s password is trustworthy, and self-correcting. A little slow and awkward, but stronger.
    2. Other forms of identification are almost all stronger and more resistant to forgery and attack than email.
    3. Bear in mind that, by trusting to email, you’re trusting that someone who just forgot their password can remember their other password, and hasn’t shared it, or used an easily-guessed password.
  2. When an account is created, associate it with an email address, and allow & encourage the validated account holder to keep that updated.
  3. Allow anyone to begin a password reset process – but don’t allow them to repeatedly do so, for fear of allowing DoS attacks on your users.
  4. When someone has begun a password reset process, send the user a code-word in email. This should be a random sequence, of the same sort of entropy as your password requirements.
    1. Encoding a time-stamp in the code-word is useful.
    2. Do not put the code-word in a URL in the message.
    3. Do not specify the username in the message, not even in the URL.
  5. Do not change the user’s password for them yet. This code-word is NOT their new password.
  6. Wait for up to an hour (see item 4.1. – the time stamp) during which you’ll accept the code-word. Be very cautious about extending this period.
  7. The code-word, when entered at the password prompt for the associated user, is an indication to change the password, and that’s ALL it can be used for.
  8. Allow the user to cancel out of the password change, up to the point where they have entered a new password and its verification code and hit OK.

And here’s how you’re going to screw it up (don’t do these!)

  • You’ll use email for password resets on a high-security account. Email is relatively low-security.
  • You’ll use email when there’s something more reliable and/or easier to use, because “it’s what everyone does”.
  • You’ll use email to an external party with whom you don’t have a business relationship, as the foundation on which you build your corporate identity tower of cards.
  • You won’t have a trail of email addresses associated with the user, so you don’t have reason to trust you’re emailing the right user, and you won’t be able to recover when an email address is hacked/stolen.
  • You’ll think the code-word is a new password, leading to:
    • Not expiring the code-word.
    • Sending passwords in plain-text email (“because we already do it for password reset”).
    • Invalidating the user’s current password, even though they didn’t want it reset.
  • You’ll include the credentials (username and/or password) in a URL in the email message, so it can be picked up by malicious proxies.
  • You’ll fail to notice that someone’s DoSing some or all of your users by repeatedly initiating a password reset.
  • You’ll fail to notice that your password reset process is failing, leading to accounts being dropped as people can no longer get in.
  • You won’t include enough randomness in your code-word, so an attacker can reset two accounts at once – their own, and a victim’s, and use the same code-word to unlock both.
  • You’ll rely on security questions
  • You’ll use a poor-quality generation on your code-words, leading to easy predictability, reuse, etc.
  • You won’t think about other ways that your design will fail outside of this list here.

Let the corrections begin!

Did I miss something, or get something wrong? Let me know by posting a comment!

Got on with Git

In which I move my version control from ComponentSoftware’s CS-RCS Pro to Git while preserving commit history.

[If you don’t want the back story, click here for the instructions!]

OK, so having watched the video I linked to earlier, I thought I’d move some of my old projects to Git.

I picked one at random, and went looking for tools.

I’m hampered a little by the fact that all my old projects used ComponentSoftware’s “CS-RCS Pro”.

Why did you choose CS-RCS Pro?

A couple of really good reasons:

  • It works on Windows
  • It integrates moderately well with Visual Studio through the VSS functionality
  • It’s compatible with GNU RCS, which I had some familiarity with
  • It was free if you’re the only dev on your projects

But you know who doesn’t use CS-RCS Pro any more?

That’s right, ComponentSoftware.

It’s a dead platform, unsupported, unpatched, and belongs off my systems.

So why’s it still there?

One simple reason – if I move off the platform, I face the usual choice when migrating from one version control system to another:

  • Carry all my history, so that I can review earlier versions of the code (for instance, when someone says they’ve got a new bug that never happened in the old version, or when I find a reversion, or when there’s a fix needed in one area of the code tree that I know I already made in a different area and just need to copy)
  • Lose all the history by starting fresh with the working copy of the source code

The second option seems a bit of a waste to me.

OK, so yes, technically I could mix the two modes, by using CS-RCS Pro to browse the ancient history when I need to, and Git to browse recent history, after starting Git from a clean working folder. But I could see a couple of problems:

  • Of course the bug I’m looking through history for is going to be across the two source control packages
  • It would mean I still have CS-RCS Pro sitting around installed, unpatched and likely vulnerable, on one of my dev systems

So, really, I wanted to make sure that I could move my files, history and all.

What stopped you?

I really didn’t have a good way to do it.

Clearly, any version control system can be moved to any other version control system by the simple expedient of:

  • For each change X:
    • Set the system date to X’s date
    • Fetch the old source control’s files from X into the workspace
    • Commit changes to the new source control, with any comments from X
    • Next change

But, as you can imagine, that’s really long-winded and manual. That should be automatable.

In fact, given the shared APIs of VSS-compatible source control services, I’m truly surprised that nobody has yet written a tool to do basically this task. I’d get on it myself, but I have other things to do. Maybe someone will write a “VSS2Git” or “VSS2VSS” toolkit to do just this.

There is a format for creating a single-file copy of a Git repository, which Git can process using the command “git fast-import”. So all I have to find is a tool that goes from a CS-RCS repository to the fast-import file format.

Nobody uses CS-RCS Pro

So, clearly there’s no tool to go from CS-RCS Pro to Git. There’s a tool to go from CS-RCS Pro to CVS, or there was, but that was on the now-defunct CS-RCS web site.

But
 Remember I said that it’s compatible with GNU RCS.

And there’s scripts to go from GNU RCS to Git.

What you waiting for? Do it!

OK, so the script for this is written in Ruby, and as I read it, there seemed to be a few things that made it look like it might be for Linux only.

I really wasn’t interested in making a Linux VM (easy though that may be) just so I could convert my data.

So why are you writing this?

Everything changed with the arrival of the recent Windows 10 Anniversary Update, because along with it came a new component.

bashonubu

Bash on Ubuntu on Windows.

It’s like a Linux VM, without needing a VM, without having to install Linux, and it works really well.

With this, I could get all the tools I needed – GNU RCS, in case I needed it; Ruby; Git command line – and then I could try this out for myself.

Of course, I wouldn’t be publishing this if it wasn’t somewhat successful. But there are some caveats, OK?

Here’s the caveats

I’ve tried this a few times, on ONE of my own projects. This isn’t robustly tested, so if something goes all wrong, please by all means share, and people who are interested (maybe me) will probably offer suggestions, some of them useful. I’m not remotely warrantying this or suggesting it’s perfect. It may wipe your development history out of your one and only copy of version control
 so don’t do it on your one and only copy. Make a backup first.

GNU RCS likes to store files in one of two places – either in the same directory as the working files, but with a “,v” pseudo-extension added to the filename, or in a sub-directory off each working folder, called “RCS” and with the same “,v” extension on the files. If you did either of these things, there’s no surprises. But


CS-RCS Pro doesn’t do this. It has a separate RCS Repository Root. I put mine in C:\RCS, but you may have yours somewhere else. Underneath that RCS Repository Root is a full tree of the drives you’ve used CS-RCS to store (without the “:”), and a tree under that. I really hope you didn’t embed anything too deep, because that might bode ill.

Initially, this seemed like a bad thing, but because you don’t actually need the working files for this task, you can pretend that the RCS Repository is actually your working space.

Maybe this is obvious, but it took me a moment of thinking to decide I didn’t have to move files into RCS sub-folders of my working directories.

Make this a “flag day”. After you do this conversion, never use CS-RCS Pro again. It was good, and it did the job, and it’s now buried in the garden next to Old Yeller. Do not sprinkle the zombification water on that hallowed ground to revive it.

This also means you MUST check in all your code before converting, because checking it in afterwards will be 
 difficult.

Enough already, how do we do this?

Assumption: You have Windows 10.

  1. Install Windows 10 Anniversary Update – this is really easy, it’s an update, you’ve probably been offered it already, and you may even have installed it. This is how you’ll know you have it:
    capture20160826194922505
  2. Install Bash on Ubuntu on Windows – everyone else has written an article on how to do this, so here’s a link (I was going to link to the PC World article, but the full-page ad that popped up and obscured the screen, without letting me click the “no thanks” button persuaded me otherwise).
  3. Run the following commands in the bash shell:
    sudo apt-get update
    sudo apt-get install git
    sudo apt-get install ruby
  4. [Optional] Run “sudo apt-get instal rcs”, if you want to use the GNU RCS toolset to play with your original source control tree. Not sure I’d recommend doing too much of that.
  5. Change directory in the bash shell to a new, blank workspace folder you can afford to mess around in.
  6. Now a long bash command, but this really simply downloads the file containing rcs-fast-export:
    curl http://git.oblomov.eu/rcs-fast-export/blob_plain/c8a2bd6edbb21c1bfaf269ad1ec0e82af72c911a:/rcs-fast-export.rb -o rcs-fast-export.rb
  7. Make it executable with the command “chmod +x rcs-fast-export.rb”
  8. Git uses email addresses, rather than owner names, and it insists on them having angle brackets. If your username in CS-RCS Pro was “bob”, and your email address is “kate@example.com”, create an authors file with a bash command like this:
    echo “bob=Kate Smith <kate@example.com>” > AuthorsFile
  9. Now do the actual creation of the file to be imported, with this bash command:
    ./rcs-fast-export.rb -A AuthorsFile /mnt/c/RCS/
path-to-project
 > project-name.gitexport
    [Note a couple of things here – starting with “./”, because that isn’t automatically in the PATH in Linux. Your Windows files are logically mounted in drives under /mnt, so C:\RCS is in /mnt/c/RCS. Case is important. Your “
path-to-project
” probably starts with “c/”, so that’s going to look like “/mnt/c/RCS/c/
” which might look awkward, but is correct. Use TAB-completion on folder names to help you.]
  10. Read the errors and correct any interesting ones.
  11. Now import the file into Git. We’re going to initialise a Git repository in the “.git” folder under the current folder, import the file, reset the head, and finally checkout all the files into the “master” branch under the current directory “.”. These are the bash commands to do this:
    git init
    git fast-import < project-name.gitexport
    git reset
    git checkout master .
  12. Profit!
  13. If you’re using Visual Studio and want to connect to this Git repository, remember that your Linux home directory sits under “%userprofile%\appdata\local\lxss\home”

This might look like a lot of instructions, but I mostly just wanted to be clear. This is really quick work. If you screw up after the “git init” command, simply “rm –rf .git” to remove the new repository.

The blame game: It’s always never human error

The Ubuntu “Circle of Friends” logo.

Depending on the kind of company you work at, it’s either:

  • a group of three friends holding hands and dancing in a merry circle
  • a group of three colleagues each pointing at the other two to tell you who to blame
  • three guys tied to a pole desperately trying to escape the rising orange flood waters

If you work at the first place, reach out to me on LinkedIn – I know some people who might want to work with you.

If you’re at the third place, you should probably get out now. Whatever they’re paying you, or however much the stock might be worth come the IPO, it’s not worth the pain and suffering.

If you’re at the second place, congratulations – you’re at a regular, ordinary workplace that could do with a little better management.

What’s this to do with security?

A surprisingly great deal.

Whenever there’s a security incident, there should be an investigation as to its cause.

Clearly the cause is always human error. Machines don’t make mistakes, they act in predictable ways – even when they are acting randomly, they can be stochastically modeled, and errors taken into consideration. Your computer behaves like a predictable machine, but at various levels it actually routinely behaves like it’s rolling dice, and there are mechanisms in place to bias those random results towards the predictable answers you expect from it.

Humans, not so much.

Humans make all the mistakes. They choose to continue using parts that are likely to break, because they are past their supported lifecycle; they choose to implement only part of a security mechanism; they forget to finish implementing functionality; they fail to understand the problem at hand; etc, etc.

It always comes back to human error.

Or so you think

Occasionally I will experience these great flashes of inspiration from observing behaviour, and these flashes dramatically affect my way of doing things.

One such was when I attended the weekly incident review board meetings at my employer of the time – a health insurance company.

Once each incident had been resolved and addressed, they were submitted to the incident review board for discussion, so that the company could learn from the cause of the problem, and make sure similar problems were forestalled in future.

These weren’t just security incidents, they could be system outages, problems with power supplies, really anything that wasn’t quickly fixed as part of normal process.

But the principles I learned there apply just as well to security incident.

Root cause analysis

The biggest principle I learned was “root cause analysis” – that you look beyond the immediate cause of a problem to find what actually caused it in the long view.

At other companies, who can’t bear to think that they didn’t invent absolutely everything, this is termed differently, for instance, “the five whys” (suggesting if you ask “why did that happen?” five times, you’ll get to the root cause). Other names are possible, but the majority of the English-speaking world knows it as ‘root cause analysis’

This is where I learned that if you believe the answer is that a single human’s error caused the problem, you don’t have the root cause.

But!

Whenever I discuss this with friends, they always say “But! What about this example, or that?”

You should always ask those questions.

Here’s some possible individual causes, and some of their associated actual causes:

Bob pulled the wrong lever Who trained Bob about the levers to pull? Was there documentation? Were the levers labeled? Did anyone assess Bob’s ability to identify the right lever to pull by testing him with scenarios?
Kate was evil and did a bad thing Why was Kate allowed to have unsupervised access? Where was the monitoring? Did we hire Kate? Why didn’t the background check identify the evil?
Jeremy told everyone the wrong information Was Jeremy given the right information? Why was Jeremy able to interpret the information from right to wrong? Should this information have been automatically communicated without going through a Jeremy? Was Jeremy trained in how to transmute information? Why did nobody receiving the information verify it?
Grace left her laptop in a taxi Why does Grace have data that we care about losing – on her laptop? Can we disable the laptop remotely? Why does she even have a laptop? What is our general solution for people, who will be people, leaving laptops in a taxi?
Jane wrote the algorithm with a bug in it Who reviews Jane’s code? Who tests the code? Is the test automated? Was Jane given adequate training and resources to write the algorithm in the first place? Is this her first time writing an algorithm – did she need help? Who hired Jane for that position – what process did they follow?

 

I could go on and on, and I usually do, but it’s important to remember that if you ever find yourself blaming an individual and saying “human error caused this fault”, it’s important to remember that humans, just like machines, are random and only stochastically predictable, and if you want to get predictable results, you have to have a framework that brings that randomness and unpredictability into some form of logical operation.

Many of the questions I asked above are also going to end up with the blame apparently being assigned to an individual – that’s just a sign that it needs to keep going until you find an organisational fix. Because if all you do is fix individuals, and you hire new individuals and lose old individuals, your organisation itself will never improve.

[Yes, for the pedants, your organisation is made up of individuals, and any organisational fix is embodied in those individuals – so blog about how the organisation can train individuals to make sure that organisational learning is passed on.]

Finally, if you’d like to not use Ubuntu as my “circle of blame” logo, there’s plenty of others out there – for instance, Microsoft Alumni:

Microsoft Alumni

On Widespread XSS in Ad Networks

Randy Westergren posted a really great piece entitled “Widespread XSS Vulnerabilities in Ad Network Code Affecting Top Tier Publishers, Retailers”

Go read it – I’ll wait.

The article triggered a lot of thoughts that I’ll enumerate here:

This is not a new thing – and that’s bad

This was reported by SoftPedia as a “new attack”, but it’s really an old attack. This is just another way to execute DOM-based XSS.

That means that web sites are being attacked by old bugs, not because their own coding is bad, but because they choose to make money from advertising.

And because the advertising industry is just waaaay behind on securing their code, despite being effectively a widely-used framework across the web.

You’ve seen previously on my blog how I attacked Troy Hunt’s blog through his advertising provider, and he’s not the first, or by any means the last, “victim” of my occasional searches for flaws.

It’s often difficult to trace which ad provider is responsible for a piece of vulnerable code, and the hosting site may not realise the nature of their relationship and its impact on security. As a security researcher, it’s difficult to get traction on getting these vulnerabilities fixed.

Important note

I’m trying to get one ad provider right now to fix their code. I reported a bug to them, they pointed out it was similar to the work Randy Westergren had written up.

So they are aware of the problem.

It’s over a month later, and the sites I pointed out to them as proofs of concept are still vulnerable.

Partly, this is because I couldn’t get a reliable repro as different ad providers loaded up, but it’s been two weeks since I sent them a reliable repro – which is still working.

Reported a month ago, reliable repro two weeks ago, and still vulnerable everywhere.

[If you’re defending a site and want to figure out which ad provider is at fault, inject a “debugger” statement into the payload, to have the debugger break at the line that’s causing a problem. You may need to do this by replacing “prompt()” or “alert()” with “(function(){debugger})()” – note that it’ll only break into the debugger if you have the debugger open at the time.]

How the “#” affects the URL as a whole

Randy’s attack example uses a symbol you won’t see at all in some web sites, but which you can’t get away from in others. The “#” or “hash” symbol, also known as “number” or “hash”. [Don’t call it “pound”, please, that’s a different symbol altogether, “£”] Here’s his example:

http://nypost.com/#1'-alert(1)-'"-alert(1)-"

Different parts of the URL have different names. The “http:” part is the “protocol”, which tells the browser how to connect and what commands will likely work. “//nypost.com/” is the host part, and tells the browser where to connect to. Sometimes a port number is used – commonly, 80 or 443 – after the host name but before the terminating “/” of the host element. Anything after the host part, and before a question-mark or hash sign, is the “path” – in Randy’s example, the path is left out, indicating he wants the root page. An optional “query” part follows the path, indicated by a question mark at its start, often taking up the rest of the URL. Finally, if a “#” character is encountered, this starts the “anchor” part, which is everything from after the “#” character on to the end of the URL.

The “anchor” has a couple of purposes, one by design, and one by evolution. The designed use is to tell the browser where to place the cursor – where to scroll to. I find this really handy if I want to draw someone’s attention to a particular place in an article, rather than have them read the whole story. [It can also be used to trigger an onfocus event handler in some browsers]

The second use is for communication between components on the page, or even on other pages loaded in frames.

The anchor tag is for the browser only

I want to emphasise this – and while Randy also mentioned it, I think many web site developers need to understand this when dealing with security.

The anchor tag is not sent to the server.

The anchor tag does not appear in your server’s logs.

WAFs cannot filter the anchor tag.

If your site is being attacked through abuse of the anchor tag, you not only can’t detect it ahead of time, you can’t do basic forensic work to find out useful things such as “when did the attack start”, “what sort of things was the attacker doing”, “how many attacks happened”, etc.

[Caveat: pedants will note that when browser code acts on the contents of the anchor tag, some of that action will go back to the server. That’s not the same as finding the bare URL in your log files.]

If you have an XSS that can be triggered by code in an anchor tag, it is a “DOM-based XSS” flaw. This means that the exploit happens primarily (or only) in the user’s browser, and no filtering on the server side, or in the WAF (a traditional, but often unreliable, measure against XSS attacks), will protect you.

When trying out XSS attacks to find and fix them, you should try attacks in the anchor tag, in the query string, and in the path elements of the URL if at all possible, because they each will get parsed in different ways, and will demonstrate different bugs.

What does “-alert(1)-“ even mean?

The construction Randy uses may seem a little odd:

"-alert(1)-"'-alert(1)-'

With some experience, you can look at this and note that it’s an attempt to inject JavaScript, not HTML, into a quoted string whose injection point doesn’t properly (or at all) escape quotes. The two different quote styles will escape from quoted strings inside double quotes and single quotes alike (I like to put the number ‘2’ in the alert that is escaped by the double quotes, so I know which quote is escaped).

But why use a minus sign?

Surely it’s invalid syntax?

While JavaScript knows that “string minus void” isn’t a valid operation, in order to discover the types of the two arguments to the “minus” operator, it actually has to evaluate them. This is a usual side-effect of a dynamic language – in order to determine whether an operation is valid, its arguments have to be evaluated. Compiled languages are usually able to identify specific types at compile time, and tell you when you have an invalid operand.

So, now that we know you can use any operator in there – minus, times, plus, divide, and, or, etc – why choose the minus? Here’s my reasoning: a plus sign in a URL is converted to a space. A divide (“/”) is often a path component, and like multiplication (“*”) is part of a comment sequence in JavaScript, “//” or “/*”, an “&” is often used to separate arguments in a query string, and a “|” for “or” is possibly going to trigger different flaws such as command injection, and so is best saved for later.

Also, the minus sign is an unshifted character and quick to type.

There are so many other ways to exploit this – finishing the alert with a line-ending comment (“//” or “<–”), using “prompt” or “confirm” instead of “alert”, using JavaScript obfuscaters, etc, but this is a really good easy injection point.

Another JavaScript syntax abuse is simply to drop “</script>” in the middle of the JavaScript block and then start a new script block, or even just regular HTML. Remember that the HTML parser only hands off to the JavaScript parser once it has found a block between “<script 
>” and “</script 
>” tags. It doesn’t matter if the closing tag is “within” a JavaScript string, because the HTML parser doesn’t know JavaScript.

There’s no single ad provider, and they’re almost all vulnerable

Part of the challenge in repeating these attacks, demonstrating them to others, etc, is that there’s no single ad provider, even on an individual web site.

Two visits to the same web site not only bring back different adverts, but they come through different pieces of code, injected in different ways.

If you don’t capture your successful attack, it may not be possible to reproduce it.

Similarly, if you don’t capture a malicious advert, it may not be possible to prove who provided it to you. I ran into this today with a “fake BSOD” malvert, which pretended to be describing a system error, and filled as much of my screen as it could with a large “alert” dialog, which kept returning immediately, whenever it was dismissed, and which invited me to call for “tech support” to fix my system. Sadly, I wasn’t tracing my every move, so I didn’t get a chance to discover how this ad was delivered, and could only rage at the company hosting the page.

This is one reason why I support ad-blockers

Clearly, ad providers need to improve their security. Until such time as they do so, a great protection is to use an ad-blocker. This may prevent you from seeing actual content at some sites, but you have to ask yourself if that content is worth the security risk of exposing yourself to adverts.

There is a valid argument to be made that ad blockers reduce the ability of content providers to make legitimate profit from their content.

But there is also a valid argument that ad blockers protect users from insecure adverts.

Defence – protect your customers from your ads

Finally, if you’re running a web site that makes its money from ads, you need to behave proactively to prevent your users from being targeted by rogue advertisers.

I’m sure you believe that you have a strong, trusting relationship with the ad providers you have running ads on your web site.

Don’t trust them. They are not a part of your dev team. They see your customers as livestock – product. Your goals are substantially different, and that means that you shouldn’t allow them to write code that runs in your web site’s security context.

What this means is that you should always embed those advertising providers inside an iframe of their own. If they give you code to run, and tell you it’s to create the iframe in which they’ll site, put that code in an iframe you host on a domain outside your main domain. Because you don’t trust that code.

Why am I suggesting you do that? Because it’s the difference between allowing an advert attack to have limited control, and allowing it to have complete control, over your web site.

If I attack an ad in an iframe, I can modify the contents of the iframe, I can pop up a global alert, and I can send the user to a new page.

If I attack an ad – or its loading code – and it isn’t in an iframe, I can still do all that, but I can also modify the entire page, read secret cookies, insert my own cookies, interact with the user as if I am the site hosting the ad, etc.

If you won’t do it for your customers, at least defend your own page

capture20160409090703064

Here’s the front page of a major website with a short script running through an advert with a bug in it.

[I like the tag at the bottom left]

Insist on security clauses with all your ad providers

Add security clauses in to your contracts , so that you can pull an ad provider immediately a security vulnerability is reported to you, and so that the ad providers are aware that you have an interest in the security and integrity of your page and your users. Ask for information on how they enforce security, and how they expect you to securely include them in your page.

[I am not a lawyer, so please talk with someone who is!]

We didn’t even talk about malverts yet

Malverts – malicious advertising – is the term for an attacker getting an ad provider to deliver their attack code to your users, by signing up to provide an ad. Often this is done using apparent advertising copy related to current advertising campaigns, and can look incredibly legitimate. Sometimes, the attack code will be delayed, or region-specific, so an ad provider can’t easily notice it when they review the campaign for inclusion in your web page.

Got a virus you want to distribute? Why write distribution code and try to trick a few people into running it, when for a few dollars, you can get an ad provider to distribute it for you to several thousand people on a popular web site for people who have money?

Leap Day again

I’ve mentioned before how much I love the vagaries of dates and times in computing, and I’m glad it’s not a part of my regular day-to-day work or hobby coding.

Here’s some of the things I expect to happen this year as a result of the leap year:

  • Hey, it’s February 29 – some programs, maybe even operating systems, will refuse to recognise the day and think it’s actually March 1. Good luck figuring out how to mesh that with other calendar activities. Or maybe you’ll be particularly unlucky, and the app/OS will break completely.
  • But the fun’s not over, as every day after February 29, until March 1 NEXT YEAR, you’re a full 366 days ahead of the same date last year. So, did you create a certificate that expires next year, last year? If so, I hope you have a reminder well ahead of time to renew the certificate, because otherwise, your certificate probably expires 365 days ahead, not one year. Or maybe it’ll just create an invalid certificate when you renew one today.
  • The same is true for calendar reminders – some reminders for “a year ahead” will be 365 days ahead, not one year. Programmers often can’t tell the difference between AddDays(365) and AddYears(1) – and why would they, when the latter is difficult to define unambiguously (add a year to today’s date, what do you get? February 28 or March 1?)
  • But the fun’s not over yet – we’ve still got December 31 to deal with. Why’s that odd? Normal years have a December 31, so that’s no problem, right? Uh, yeah, except that’s day 366. And that’s been known to cause developers a problem – see what it did to the Zune a few years back.
  • Finally, please don’t tell me I have an extra day and ask me what I’m going to do with it – the day, unless you got a day off, or are paid hourly, belongs to your employer, not to you – they have an extra day’s work from you this year, without adding to your salary at all.

And then there’s the ordinary issues with dates that programmers can’t understand – like the fact that there are more than 52 weeks in a year. “ASSERT(weeknum>0 && weeknum<53);”, anyone? 52 weeks is 364 days, and every year has more days than that. [Pedantic mathematical note – maybe this somewhat offsets the “employer’s extra day” item above]

Happy Leap Day – and always remember to test your code in your head as well as in real life, to find its extreme input cases and associated behaviours. They’ll get tested anyway, but you don’t want it to be your users who find the bugs.

Why am I so cross?

There are many reasons why Information Security hasn’t had as big an impact as it deserves. Some are external – lack of funding, lack of concern, poor management, distractions from valuable tasks, etc, etc.

But the ones we inflict on ourselves are probably the most irritating. They make me really cross.

Why cross?

OK, “cross” is an English term for “angry”, or “irate”, but as with many other English words, it’s got a few other meanings as well.

It can mean to wrong someone, or go against them – “I can’t believe you crossed Fingers MacGee”.

It can mean to make the sign of a cross – “Did you just cross your fingers?”

It can mean a pair of items, intersecting one another – “I’m drinking at the sign of the Skull and Cross-bones”.

It can mean to breed two different subspecies into a third – “What do you get if you cross a mountaineer with a mosquito? Nothing, you can’t cross a scaler and a vector.”

Or it can mean to traverse something – “I don’t care what Darth Vader says, I always cross the road here”.

Green_cross_man_take_it

It’s this last sense that InfoSec people seem obsessed about, to the extent that every other attack seems to require it as its first word.

Such a cross-patch

These are just a list of the attacks at OWASP that begin with the word “Cross”.

Yesterday I had a meeting to discuss how to address three bugs found in a scan, and I swear I spent more than half the meeting trying to ensure that the PM and the Developer in the room were both discussing the same bug. [And here, I paraphrase]

“How long will it take you to fix the Cross-Frame Scripting bug?”

“We just told you, it’s going to take a couple of days.”

“No, that was for the Cross-Site Scripting bug. I’m talking about the Cross-Frame Scripting issue.”

“Oh, that should only take a couple of days, because all we need to do is encode the contents of the field.”

“No, again, that’s the Cross-Site Scripting bug. We already discussed that.”

“I wish you’d make it clear what you’re talking about.”

Yeah, me too.

A modest proposal

The whole point of the word “Cross” as used in the descriptions of these bugs is to indicate that someone is doing something they shouldn’t – and in that respect, it’s pretty much a completely irrelevant word, because we’re already discussing attack types.

In many of these cases, the words “Cross-Site” bring absolutely nothing to the discussion, and just make things confusing. Am I crossing a site from one page to another, or am I saying this attack occurs between sites? What if there’s no other site involved, is that still a cross-site scripting attack? [Yes, but that’s an irrelevant question, and by asking it, or thinking about asking/answering it, you’ve reduced your mental processing abilities to handle the actual issue.]

Check yourself when you utter “cross” as the first word in the description of an attack, and ask if you’re communicating something of use, or just “sounding like a proper InfoSec tool”. Consider whether there’s a better term to use.

I’ve previously argued that “Cross-Site Scripting” is really a poor term for the conflation of HTML Injection and JavaScript Injection.

Cross-Frame Scripting is really Click-Jacking (and yes, that doesn’t exclude clickjacking activities done by a keyboard or other non-mouse source).

Cross-Site Request Forgery is more of a Forced Action – an attacker can guess what URL would cause an action without further user input, and can cause a user to visit that URL in a hidden manner.

Cross-Site History Manipulation is more of a browser failure to protect SOP – I’m not an expert in that field, so I’ll leave it to them to figure out a non-confusing name.

Cross-Site Tracing is just getting silly – it’s Cross-Site Scripting (excuse me, HTML Injection) using the TRACE verb instead of the GET verb. If you allow TRACE, you’ve got bigger problems than XSS.

Cross-User Defacement crosses all the way into crosstalk, requiring as it does that two users be sharing the same TCP connection with no adequate delineation between them. This isn’t really common enough to need a name that gets capitalised. It’s HTTP Response-Splitting over a shared proxy with shitty user segregation.

Even more modestly


I don’t remotely anticipate that I’ll change the names people give to these vulnerabilities in scanning tools or in pen-test reports.

But I do hope you’ll be able to use these to stop confusion in its tracks, as I did:

“Never mind cross-whatever, let’s talk about how long it’s going to take you to address the clickjacking issue.”

In Summary

Here’s the TL;DR version of the web post:

Prevent or interrupt confusion by referring to bugs using the following non-confusing terms:

Confusing Not Confusing Much, Probably
Cross-Frame Scripting Clickjacking
Cross-Site History Manipulation [Not common enough to name]
Cross-Site Tracing TRACE is enabled
Cross-Site Request Forgery Forced User Action
Cross-Site Scripting HTML Injection
JavaScript Injection
Cross-User Defacement Crappy proxy server

Fear the browsing dead!

Browsing Dead

Ding dong, the plugin’s dead!

There’s been a lot of celebration lately from the security community about the impending death of Adobe’s Flash, or Oracle’s Java plugin technology.

You can understand this, because for years these plugins have been responsible for vulnerability on top of vulnerability. Their combination of web-facing access and native code execution means that you have maximum exposure and maximum risk concentrated in one place on the machine.

Browser manufacturers have recognised this risk in their own code, and have made great strides in improving security. Plus, you can always switch browsers if you feel one is more secure than another.

Attackers can rely on Flash and Java.

An attacker can pretty much assume that their target is running Flash from Adobe, and Java from Oracle. [Microsoft used to have a competing Java implementation, but Oracle sued it out of existence.]

Bugs in those implementations are widely published, and not widely patched, whether or not patches are available.

Users don’t upgrade applications (plugins included) as often or as willingly as they update their operating system. So, while your browser may be updated with the operating system, or automatically self-update, it’s likely most users are running a version of Java and/or Flash that’s several versions behind.

Applications never die, they just lose their support

As you can imagine, the declaration by Oracle that Java plugin support will be removed is a step forward in recognising the changing landscape of browser security, but it’s not an indication that this is an area in which security professionals can relax.

Just the opposite.

With the deprecation of plugin support comes the following:

  • Known bugs – without fixes. Ever.
  • No availability of tools to manage old versions.
  • No tools to protect vulnerable plugins.
  • Users desperately finding more baroque (and unsecurable) ways to keep their older setups together to continue to use applications which should have been replaced, but never were.

It’s not like Oracle are going to reach into every machine and uninstall / turn off plugin support. Even if they had the technical means to do so, such an act would be a completely inappropriate act.

There will be zombies

So, what we’re left with, whenever a company deprecates a product, application or framework, is a group of machines – zombies, if you will – that are operated by people who do not heed the call to cull, and which are going to remain active and vulnerable until such time as someone renders those walking-dead components finally lifeless.

If you’re managing an enterprise from a security perspective, you should follow up every deprecation announcement with a project to decide the impact and schedule the actual death and dismemberment of the component being killed off.

Then you can celebrate!

Assuming, of course, that you followed through successfully on your plan.

Until then, watch out for the zombies.

The Browsing Dead.

1 2 3 13