Posted By Grant Forrest | last update 22-Sep 2011
Whether you're an experienced or novice web programmer, SQL injection is something that you need to know about. As the Wikipedia page suggests, it's your code that does queries against a database that is vulnerable to SQL injection.
I used to think that the protection offered by the PHP mysql_real_escape_string() function was sufficient but recent experience suggests otherwise. For those not familar with mysql_real_escape_string(), it's a PHP library function that will do a search-and-destroy on rogue characters such as single quotes (') and line breaks. This essential sanitisation of user input prevents hackers from 'injecting' their own SQL code into your query...or so I was led to believe.
I started getting repeated messages about a SQL injection vulnerability on a web site which I ignored initially because I thought it was spam, but on closer inspection, I noticed some strings of text that looked like 41-character password hashes and most astonishingly, one of the hashes had been reverse-engineered to reveal the actual password, something I'd understood to be impossible.
With my attention well and truly grabbed, I began the investigation. As it turned out, someone from Romania (elicited by IP geolocation) had found the vulnerability but rather than exploiting it, they'd given me a clear example of how it worked and advised me that I needed to do some patching up. Nice to get a helpful message from East Europe for a change.
Here's how it works. A URL takes a numeric parameter ($id) and returns content based on a matching row in a table. For instance :
The returned page content will contain something related to the content of row id=5 in a table.
Consider the following URL:
Ignoring the detail for the moment, what caught me by surprise is that there is nothing in the 'injected' code that would be sanitised by mysql_real_escape_string(). The whole query string is passed unaltered to the function that is intended to pull out record id=5 from the table.
In effect, what the injected code does is to append a second query to our basic one, pulling out the content of the `users` table. This is very bad. Not only does the hacker get the names of all our users, but if our passwords were stored in plain text, they'd get those too. This is a classic example of why you should never store passwords in plain text in a database. But the story doesn't end there.
A sensible way to protect passwords is to use a hashing algorithm (of which there are many) to generate a long string of characters that represents the password. It's important to note that this is different from encryption. Whereas encryption is a two-way process (encrypt-decrypt), hashing is one-way i.e. the original password is not supposed to be recoverable from the hash. All you can say is that if two hashes match, then they were both generated from the same password, but you have no way to figure out what that password was.
So how could it be that alongside one of the hashes was the original 4-digit password ? The clue is that the original password was very weak. A 4-digit code has only 10,000 permutations. It should come as no surprise then, that every permutation from 0000 to 9999 has been hashed using all of the popular algorithms and the resulting hashes are made available to be queried by botnets. So the sequence is :
1. URLs are probed for SQL injection vulnerability
2. If available, hashed password info is obtained
3. Hashes are compared against the databases of known hashes and if matched, the original password is returned
So how do you properly sanitise your user input to prevent this sort of attack ? As it says very helpfully on the PHP web site, it's a good idea to use one of the character type functions such as is_numeric() or ctype_digit() on your input data, which is fine if the data you're expecting is numeric, but if your data is character-based then you're into the realms of PERL regular expressions, and if that's the case, then as Sponge Bob says : "Good luck with that! "
Posted by Rory C-L on 08-Nov 2011 23:11
A useful way of avoiding exposure to SQL injection is to use procedural SQL to ensure that important database queries are parameterised. Although it may be too much work to convert all of the SQL queries in a normal site to procedural SQL, it is certainly worth considering ensuring that authentication queries are written in this manner.
An excellent guide to procedural SQL (admittedly for Postgresql) can be found here:
Posted by Matt F on 22-Nov 2011 20:11
I would also add that sanitisiing should include type casting as a simple initial step. When you design your script, you know what data types you should receive. You should limit these to only those types with the least capability for destructive consequences such as detailed in this article.
In the id example we know that this should only be a integer. You could clean that using a regex or simply type casting. In php the string "-394%20union%20select%201,2,3,4,5,group_concat%28username,0x3a,password%29,7,8,9,10,11,12,13%20from%20users--" becomes "-394" when cast as an integer (int)$var