OK, let me back up a little and explain.
There’s an argument I’ve been a part of at many places, with many people, as to which method of protection is best to guard against injection attacks.
What’s an injection attack?
Put simply, the attacker abuses your site’s or application’s need to collect data from users – instead of providing simple, useful data, the attacker provides data that contains code, along with some magical sequence or other that causes it to execute as code, rather than merely be displayed as data.
As a simplistic example, imagine you’re writing a shopping list. On it, you write things like “Carrots”, “Potatoes”, “Steak”, etc. Then someone comes along and writes “P.S. Don’t forget to rob the bank”. Lists of items to buy at the grocery store don’t often contain instructions – but if you then give that list to a rather literally-minded, but stupid, assistant, there’s a chance the “P.S.” makes them drop out of “grocery list” mode, and into “follow instructions” mode.
Sure, but who would be that dumb?
No person you’d send to the grocery store on their own, of course – most people have an in-built filter that prevents them from mindlessly doing stupid stuff.
Computers aren’t so fortunate, sadly – any such filter would have to be built in by the programmers. When it comes to the equivalent of reading the grocery list, the computer would have to be told specifically not to rob banks – or steal cars, murder the grocery clerks, steal from the register, or do anything else bad that might be on the list.
That’s clearly a non-starter – there are so many items in the “list of bad things you might be told to do”, it’s really much easier to start from the opposite end, and build a filter that accepts only those things that are known to be good. It can make life a little tricky, because your first run at the filter might not list all of the things that you might want to get at the grocery store – but it’s still the only way to be safe.
That’s a white-list, right?
Yes – unless you work at Microsoft, where the term is “allow list”. It’s coupled with the concept of “default deny”, a very basic security premise, which says that if you don’t know what to do, just don’t do anything. Refuse to process the grocery list, if you care to continue using the metaphor.
But let’s break out of the metaphor and go into technical details. What we have just described is “input validation”. Make sure that all the input that’s given to you matches a restricted set of possible good inputs.
This is an easy win, and should be applied whenever possible, simply because it quickly rejects bad input in most cases. You’re asking for a quantity to purchase? Make sure it’s a positive integer – if anything other than a series of digits is given to you, refuse the value. For even more win, make sure there are no leading zeroes, and that it’s smaller than the largest expected order you can fulfill.
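As a sketch of that quantity check (in Python – the post doesn’t name a language, and the function name and order limit here are my own invention), an allow-list validator with default deny looks like this:

```python
import re

MAX_ORDER = 1000  # hypothetical largest order we can fulfil

def validate_quantity(raw: str) -> int:
    """Allow-list validation of a 'quantity to purchase' field.

    Default deny: only a series of digits with no leading zero,
    within the expected order size, is accepted. Everything else
    is refused before any further processing happens.
    """
    if not re.fullmatch(r"[1-9][0-9]*", raw):
        raise ValueError(f"not a positive integer: {raw!r}")
    value = int(raw)
    if value > MAX_ORDER:
        raise ValueError(f"order too large: {value}")
    return value
```

Note that the filter never tries to enumerate bad inputs – “drop”, quotes, angle brackets and so on are all rejected automatically, simply because they aren’t digits.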
That’s great for such a limited case, and there are many more cases you can analyse to determine that they are indeed possible to limit.
But what happens if you need to accept a character that you know is bad, or that you don’t know is good?
How does that work with Input Validation?
Many places I’ve worked have seen me run into fanatical developers who heard about SQL Injection, and Input Validation, and determined that they needed to protect against it. Away with the word “drop” – that’s not allowed. Away with the single quote character – not allowed, either. Away with angle brackets and the word “script” in case we’re susceptible to Cross-Site Scripting (XSS) too. Probably thanks to this cartoon from xkcd.
[I’ve often thought it might be funny for there to be a follow-up cartoon, in which the school gets its revenge and assigns Bobby a GPA of 3.”>, knowing that every University web site will discard that as an attempted attack.]
Fine, but you go tell Mr O’Reilly that he can’t enter his name, or order any lemon drops; tell the French they can’t put anything in quotes (you did know that French quotation marks are double angle brackets, oui?); and as for the word “script”, I’m sure you can come up with your own examples. Start with this blog post, for instance, which uses the word “script” or “scripting” about a half-dozen times.
The pathological case of a complete inability to do input validation comes, ironically enough, in software designed to log security incidents. How are you supposed to report the exact string that triggers an XSS attack or SQL Injection, if you are afraid that the string itself will break the security incident reporting tool? [If you’ve read Douglas Hofstadter’s “Gödel, Escher, Bach: an Eternal Golden Braid”, you’ll recognise this as the record that breaks the player. If you’re under thirty you may well ask “what’s a record?”]
Quite a quandary. But you know there has to be something to solve it, and the key comes from what I said earlier – if you can’t distinguish between code and data, you may find yourself executing as code something that you should be processing or displaying or storing as data.
So you have to say “this thing that looks like code is really data”. And you have to say it unambiguously, because ambiguity means that you can’t tell between data and code.
So, that’s where Output Encoding comes in.
Where Input Validation essentially says “I will only accept data that is obviously data”, Output Encoding says “I will only pass on data and code that can be told apart from each other”.
You already see the flaw, of course – to do this as the caller, I have to know how my recipient distinguishes between code and data. I have to have a contract of sorts – and developers often refer to a “code contract” – that tells me how I should tell my partner which pieces I’m sending him are code, and which are data.
There are several ways to do this, and they depend on the library / language / platform your code is running in, as much as they depend on the communication mechanism (protocol) with the partner.
Give us a simple example, then
A simple example is sending an email through SMTP – the Simple Mail Transfer Protocol. Assuming you’re the mail sender, once you’re connected and ready to go, you send the commands “MAIL FROM:<firstname.lastname@example.org>” and “RCPT TO:<email@example.com>” to say who you’re sending mail from and to, followed by the command “DATA”.
After the DATA command, as you might expect, comes the data of the message, line by line. [This actually includes header information, but I’m ignoring that to keep this simple.]
And how does the DATA end? With a single line, containing simply a dot (full-stop, period, whatever you want to call it).
That’s great, except of course an attacker could send an email with a single dot on its own line, followed by some evil SMTP commands, and you, the mail sender, would essentially ask the mail server to execute those commands. Or someone could accidentally include a single dot on its own, and trigger some random command to execute.
Fortunately, the SMTP designers thought of that – when sending a line from an email that begins with a dot, you are required to add another dot before it. So, a dot on its own becomes two dots, two dots become three, etc, etc. And the server knows that this is a data line, and not a command to be executed.
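That dot-stuffing rule is simple enough to sketch in a few lines of Python (the function names are mine; the rule itself is SMTP’s “transparency” mechanism from RFC 5321, section 4.5.2):

```python
def dot_stuff(body: str) -> str:
    """Sender side: prepend an extra dot to any line that begins
    with a dot, so a lone '.' inside the message can never be
    mistaken for the end-of-DATA marker."""
    return "\r\n".join(
        "." + line if line.startswith(".") else line
        for line in body.split("\r\n")
    )

def un_stuff(wire: str) -> str:
    """Receiver side: strip one leading dot from dotted lines,
    recovering the original message exactly."""
    return "\r\n".join(
        line[1:] if line.startswith(".") else line
        for line in wire.split("\r\n")
    )
```

The two functions are exact inverses, which is the whole point of an encoding: the data arrives unchanged, but nothing in it can be mistaken for a command.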
That’s perhaps the simplest example of Output Encoding there is.
Other examples of Output Encoding would include calling HtmlEncode, or a similar function for your framework, on data which you know should not be executed as HTML. If the data is HTML-clean, the HtmlEncode function won’t touch it, but otherwise your simple call to that one function will prevent XSS attacks (aka HTML injection, remember).
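In Python, for instance, the standard library’s html.escape plays the HtmlEncode role (the hostile comment string below is just an illustration):

```python
from html import escape  # Python's analogue of HtmlEncode

user_comment = '<script>alert("pwned")</script>'

# Encoding on output: markup characters become entities, so the
# browser displays them as text instead of executing them.
safe = escape(user_comment)
print(safe)
# &lt;script&gt;alert(&quot;pwned&quot;)&lt;/script&gt;

# Data that is already HTML-clean passes through untouched.
assert escape("Carrots and Potatoes") == "Carrots and Potatoes"
```
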
Another example is when passing data to a SQL database – instead of concatenating code and data to make a string that you execute, all SQL libraries have parameterised queries, allowing you to pass data in a manner that will allow the SQL server to recognise it as data, rather than code. [There is a side issue here, in that the SQL code itself may build a command by concatenating code and data in a string – in that case, your SQL developers need a quick session with the clue-by-four.]
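A minimal sketch of the difference, using Python’s built-in sqlite3 module (the table and the Bobby-Tables-style name are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT)")

name = "Robert'); DROP TABLE students;--"  # little Bobby Tables

# Wrong: concatenating code and data. The quote in the name
# changes the meaning of the statement:
#   conn.execute("INSERT INTO students VALUES ('" + name + "')")

# Right: a parameterised query. The '?' placeholder tells the
# driver "this slot is data, never code", whatever it contains.
conn.execute("INSERT INTO students (name) VALUES (?)", (name,))

stored = conn.execute("SELECT name FROM students").fetchone()[0]
assert stored == name  # the hostile string is stored verbatim, as data
```
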
So, which one is better? Input Validation or Output Encoding?
Here we come back to the topic of this article.
Neither is better than the other. You have to do both.
You see, injection attacks aren’t an example of input validation failures, or an example of output encoding failures.
Injection attacks are caused by a failure to do throughput handling correctly.
Throughput handling requires that you do input validation and output encoding. Input validation is cheap, easy, understandable by all, and allows you to dismiss bad data immediately, before a server wastes its time on it. But it doesn’t catch everything, and it can’t be used in all cases. So, output encoding must be used, difficult though it may be, to ensure that the data which does seep through input validation looking like code doesn’t actually get passed to the next layer as code.
Yep. Good luck.