Monday, July 15, 2013

Blind SQL injection optimization

I was recently performing some blind SQL exploitation and was very frustrated at how long it was taking me to dump out the database.  The code I was using to exploit the blind SQL was retrieved from some old blind SQLi I had written a while ago (2004-ish, the only backups that were in place since the great HD crash of 2012) when I first learned this exploitation technique, and it consisted of something like this:

s = ‘’
# ‘ and 1=1 and substr(USER, 1, 1)=’S’; --‘ #example injection returning true or false
front_sqli = ‘\’ and 1=1 and substr(USER,’
middle_sqli = ‘, 1)=\’’
end_sqli = ‘\’; --\’’
try:
  for i in range(0,255):
    for j in range(0,255):
      if exploit(front_sqli + chr(i) + middle_sqli + chr(j) + end_sqli):
        s += chr(j)
except:
  pass
print s

Coming from a newby background, this is completely sufficient to enumerate strings in the database.  It’s really slow.

Really, really, really slow.

So, what are some optimizations we can do to this?  Firstly, we have to take in the constraints of what sort of return we get, and with this application I am getting success or failures (true / false).  We aren't just limited to comparisons of equality (=), we can do quality comparisons (<>), as well as use any SQL to assist in lowering the searchable space.

The first impulse I have is to perform a number of operations to limit the search space, so:
  1. Is the character lower case?
    1. Sequentially search through lower case alphabet a-z
  2. Is the character upper case?
    1.  Sequentially search through upper case alphabet A-Z
  3. If neither,
    1. sequentially search through numbers and symbols
But a couple of caveats come into place here.  The average search time is the following:
  • 2 operations to find the character type (on average)
  • Search space is roughly 32 characters long (a-z=26, A-Z=26, non-az=42 ASCII), average search time is 16 requests (This is not including binary data).
This comes out to an average of 18 operations on the characters for a sequential search.
What about implementing a more efficient search?  The one that comes in immediately is one of the most efficient, the binary search.  So, we now have the following number of operations:
  • 2 operations to find the character type (on average)
  • Search space is still roughly 32 characters long, so worst case we have 5 additional operations.
7 operations are MUCH more efficient than the 16, and given the average response time from the server (due to latency, internet traffic, etc.) I’m getting ~20 requests per second (using the python requests library), which means instead of an average of 1-2 characters per second I’m getting 2-3.

BUT, this current implementation has some problems, as the worst case scenario happens quite frequently when dumping passwords or other non-alpha data (like binary).  So, for a more efficient algo, I decided to use the following:
  1. If dumping a database name / table name / column title, use a binary search WITHOUT the upper/lower comparison but using a printable ASCII searchspace (94 characters) for a worst case search time of 6 requests.
  2. If dumping a column field, use a binary search WITHOUT the upper/lower comparison using the full ASCII searchspace (0-255), for a worst case search time of 8 requests.

The average search request for [1] was 4.21 and for [2] was 6.37, but this is heavily dependent upon the data being parsed.

There are plenty of other ways to optimize and get an extra instruction or two, but 4.21 and 6.37 are massive improvements upon the average of 18.21 requests I was having previously; I was very pleased at the improvement of performance as I am able to pull about 3-5 and 2-4 characters out of the database per second, respectively.