Padding Oracle 3–making it usable

Just a quick note, because I’ve been sick this week, but last weekend, I put a little more work into my Padding Oracle exploit tool.

You can find the new code up at https://github.com/alunmj/PaddingOracle, and because of all the refactoring, it’s going to look like a completely new batch of code. But I promise that most of it is just moving code from Program.cs into classes, and adding parsing of command-line arguments.

I don’t pretend to be the world’s greatest programmer by any stretch, so if you can tell me a better way to do what I’ve done here, do let me know, and I’ll make changes and post something about them here.

Also, please let me know if you use the tool, and how well it worked (or didn’t!) for you.

Arguments

The arguments currently supported are:

URL

The only parameter unadorned with an option letter – this is the URL for the resource the Padding Oracle code will be pounding to test guesses at the encrypted code.

-c ciphertext

Also, –cipher. This provides a .NET regular expression which matches the ciphertext in the URL.

-t encoding:b64|b64URL|hex|HEX

Also, –textencoding, –encoding. This sets the encoding that’s used to specify the ciphertext (and IV) in the URL. The default is b64

  • b64 – standard base64, URL encoded (so ‘=’ is ‘%3d’, ‘+’ is ‘%2b’, and ‘/’ is ‘%2f’)
  • b64URL – “URL safe” base64, which uses ‘!’, ‘-‘ and ‘~’ instead of the base64 characters that would be URL encoded.
  • hex – hexadecimal encoding with lower case alphabetic characters a-f.
  • HEX – hexadecimal encoding with upper case alphabetic characters A-F.

-i iv

Also, –iv. This provides a .NET regular expression which matches the IV in the URL if it’s not part of the ciphertext.

-b blocksize

Also, –blocksize. This sets the block size in bytes for the encryption algorithm. It defaults to 16, but should work for values up to 32.

-v

Also, –verbose. Verbose – output information about the packets we’re decrypting, and statistics on speed at the end.

-h

Also, –help. Outputs a brief help message

-p parallelism:-1|1|#

Also –parallelism. Dictates how much to parallelise. Specifying ‘1’ means to use one thread, which can be useful to see what’s going on. –1 means “maximum parallelisation” – as many threads as possible. Any other integer is roughly akin to saying “no more than this number of threads”, but may be overridden by other aspects of the Windows OS. The default is –1.

-e encryptiontext

Instead of decrypting, this will encrypt the provided text, and provide a URL in return that will be decrypted by the endpoint to match your provided text.

Examples

These examples are run against the WebAPI project that’s included in the PadOracle solution.

Example 1

Let’s say you’ve got an example URL like this:

http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d

This strongly suggests (because who would use “iv” and “ciphertext” to mean anything other than the initialisation vector and cipher text?) that you have an IV and a ciphertext, separate from one another. We have the IV, so let’s use it – here’s the command line I’d try:

PadOracle "http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d" -c "087gb.*%3d%3d" –i "WnfvRL.*2Q%3d%3d"

This is the result of running that command:

capture20181111175736366

Notes:

  • The IV and the Ciphertext both end in Q==, which means we have to specify the regular expressions carefully to avoid the expression being greedy enough to catch the whole query string.
  • I didn’t use the “-v” output to watch it run and to get statistics.
  • That “12345678” at the end of the decrypted string is actually there – it’s me trying to push the functionality – in this case, to have an entirely padding last block. [I should have used the letter “e” over and over – it’d be faster.]

Example 2

Same URL, but this time I want to encrypt some text.

Our command line this time is:

PadOracle "http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d" -c "087gb.*%3d%3d" –i "WnfvRL.*2Q%3d%3d" –e "Here’s some text I want to encrypt"

When we run this, it warns us it’s going to take a very long time, and boy it’s not kidding – we don’t get any benefit from the frequency table, and we can’t parallelise the work.

capture20181111215602359

And you can see it took about two hours.

Padding Oracle 2: Speeding things up

Last time, I wrote about how I’d decided to write a padding oracle exploit tool from scratch, as part of a CTF, and so that I could learn a thing or two. I promised I’d tell you how I made it faster… but first, a question.

Why build, when you can borrow?

One question I’ve had from colleagues is “why didn’t you just run PadBuster?”

It’s a great question, and in general, you should always think first about whether there’s an existing tool that will get the job done quickly and easily.

Time

Having said that, it took me longer to install PadBuster and the various language components it required than it did to open Visual Studio and write the couple of hundred lines of C# that I used to solve this challenge.

So, from a time perspective, at least, I saved time by doing it myself – and this came as something of a surprise to me.

The time it used up was my normally non-productive time, while I’m riding the bus into Seattle with spotty-to-zero network connectivity (there’s none on the bus, and my T-Mobile hot-spot is useful, but neither fast nor reliable down the I-5 corridor). This is time I generally use to tweet, or to listen to the BBC.

Interest

I just plain found it interesting to take what I thought I knew about padding oracles, and demonstrate that I had it solidly in my head.

That’s a benefit that really can’t be effectively priced.

Plus, I learned a few things doing it myself:

  • Parallelisation in C# is easier than it used to be.
  • There’s not much getting around string conversions in trying to speed up the construction of a base64-encoded URL, but then again, when executing against a crypto back-end, that’s not your bottleneck.
  • Comments and blank lines are still important, especially if you’re going to explain the code to someone else.

Performance

The other thing that comes with writing your own code is that it’s easier to adjust it for performance – you know where the bottlenecks might lie, and you can dive in and change them without as much of a worry that you’re going to kill the function of the code. Because you know at a slightly more intuitive level how it all works.

You can obviously achieve that intuitive level over time with other people’s code, but I wasn’t really going to enjoy that.

Looking at some of the chat comments directed at the PadBuster author, it’s clear that other people have tried to suggest optimisations to him, but he believes them not to be possible.

Guessing

Specifically, he doesn’t see that it’s possible to use guesses as to the plaintext’s likely contents to figure out what values should be in the ciphertext. You just plug the values 0..255 into the N-1 ciphertext block until your padding error from the N block goes away, and then that value can be XORed with the padding value to get the intermediate value from the N block. Then the intermediate value gets XORed with the original ciphertext value from the N-1 block to give the original plaintext.

Let’s see how that works in the case of the last block – where we’re expecting to see some padding anyway. Let’s say our block size is 4. Here’s what two of our ciphertext blocks might look like:

CN-1 CN
0xbe 0x48 0x45 0x30 0x71 0x4f 0xcc 0x63

Pretty random, right? Yeah, those are actually random numbers, but they’ll work to illustrate how we work here.

We iterate through values of CN-1[3] from 0..255, until we get a response that indicates no padding errors.

0x30 comes back without any padding errors. That’s convenient. So, we’ve sent “be484530714fcc63”, and we know now that we’ve got a padding byte correct. Buuut that isn’t the only right padding byte, because this is the last block, which also has a valid padding byte.

In fact, we can see that 0x30 matches the original value of the CN-1 block’s last byte, so that’s not terribly useful. Our padding count has a good chance of not being 1, and we’re trying to find the value that will set it to 1.

Keep iterating, and we get 0x32, giving us a request that doesn’t contain a padding exception. Two values. Which one made our padding byte 0x1, so we can use it to determine the intermediate value?

The only way we get two matches will be because the real plaintext ends in a padding count that isn’t 0x1. One of those values corresponds to 0x1, the other corresponds to the padding count, which could be 0x2..0x4. [Because we’re using four byte blocks as an example – a real-life example might have a 16-byte block size, so the padding count could be up to 0x10]

The clue is in the original plaintext – 0x30 MUST be the value that corresponds to the original padding count, so 0x32 MUST correspond to 0x1.

[If the original padding count was 0x1, we would only find one value that matched, and that would be the original value in CN-1]

That means the Intermediate value is 0x32 XOR 0x1 = 0x33 – which means the plaintext value is 0x3 – there’s three bytes of padding at the end of this block.

We can actually write down the values of the last three plaintext and intermediate blocks now:

CN-1 CN
0xbe 0x48 0x45 0x30 0x71 0x4f 0xcc 0x63
IN
?? 0x4b 0x46 0x33
C’N-1 PN
?? 0x4f 0x42 0x37 ?? 0x3 0x3 0x3

Wow – that’s easy! How’d we do that? Really simple. We know the last padding must be three bytes of 0x3, so we write those down. Then the intermediate bytes must be the XOR of 0x3 with the value in the CN-1 block.

[I chose in the code, instead of just “writing down” the values for each of those bytes, to check each one as I did so, to make sure that things were working. This adds one round-trip for each byte of padding, which is a relatively low cost, compared to the rest of the process.]

Now, if we want to detect the next byte, we want to change the last three bytes of CN-1, so they’ll set the PN values to 0x4, and then iterate through the target byte until we get a lack of padding errors.

So, each new value of the last few bytes of CN-1 will be C’[i] = C[i] XOR 0x3 XOR 0x4 – taking the value in the original, XORing it with the original plaintext, and then with the desired plaintext to get a new value for the ciphertext.

I’ve put those values of C’N-1 in the table above.

This trick doesn’t just stop with the padding bytes, though. I’m going to guess this is a JSON object, so it’s going to end with a ‘}’ character (close-brace), which is 0x7d.

So, C’ = C XOR 0x7d XOR 0x4 = 0xbe XOR 0x7d XOR 0x4 = 0xc7.

Let’s try that – we now send “c74f4237” – no padding error!

A successful guess for the last four bytes of PN. Now we can fill in more of the table:

CN-1 CN
0xbe 0x48 0x45 0x30 0x71 0x4f 0xcc 0x63
IN
0xba 0x4b 0x46 0x33
C’N-1 PN
0xc7 0x4f 0x42 0x37 0x7d 0x3 0x3 0x3

Awesome.

That does require me making the right guess, surely, though?

Yes, but it’s amazing how easy it is to either make completely correct guesses, or just pick a set of values that are more likely to be good guesses, and start by trying those, failing back to the “plod through the rest of the bytes” approach when you need to.

I’ve coded an English-language frequency table into my padding oracle code, because that was appropriate for the challenge I was working on.

This code is available for you to review and use at https://github.com/alunmj/PaddingOracle/blob/master/PadOracle/Program.cs

You can imagine all kinds of ways to improve your guesses – when proceeding backward through a JSON object, for instance, a ‘}’ character will be at the end; it’ll be preceded by white space, double quotes, or brackets/braces, or maybe numerics. A 0x0a character will be preceded by a 0x0d (mostly), etc.

Parallelisation

The other big performance improvement I made was to parallelise the search. You can work on one block entirely independently from another.

I chose to let the Parallel.For() function from C# decide exactly how it was going to split up work between different blocks, and the result is a whole lot faster. There are some wrinkles to manage when parallelising an algorithm, but I’m not going to get into that here. This is not a programming blog, really!

15x performance improvement

I figured I’d put that in big letters, because it’s worth calling out – the parallelisation alone obviously multiplies your performance by the number of cores you’ve got (or the number of cores the web server has, if it’s underpowered), and the predictive work on the text does the rest. Obviously, the predictive approach only works if you can separate between “likely” and “unlikely” characters – if the plaintext consists of random binary data, you’re not going to get much of a benefit. But most data is formatted, and/or is related to English/Latin text.

Bonus stage – use a decryptor to encrypt!

I haven’t published the code for this part yet, but you can use this same breach to encrypt data without knowing the key.

This is really fun and simple once you get all the previous stuff. Here goes.

Let’s encrypt a block.

Encrypting a block requires the generation of two ciphertext blocks from one plaintext block. What the second block is, actually doesn’t matter. We can literally set it to random data, or (which is important) specific data of our choosing.

The first block of the pair, acting like an IV, we can set to 0. There’s a reason for this which we’ll come to in a minute.

With these two initial blocks, we run the decrypter. This will give us a ‘plaintext’ block as output. Remember how the intermediate block is the plaintext block XORed with the first of the pair of blocks? Well, because we set that first block to all zeroes, that means the plaintext block IS the same as the intermediate block. And that intermediate block was generated by decrypting the second block of the pair. In order for that decryption to result in the plaintext we want instead, we can simply take the intermediate block, XOR it with the plaintext block we want, and then put that into the first ciphertext block. [We’re actually XORing this with the first ciphertext block, but that’s a straight copy in this case, because the first ciphertext block is zeroes.]

Now, draw the rest of the owl

Do the same thing for each of the rest of the blocks.

Sadly, there’s no parallelising this approach, and the guessing doesn’t help you either. You have to start with CN (randomly generated) and CN-1 (deduced with the approach above), then when you’ve established what CN-1 is, you can use the same approach to get CN-2, and so on back to the IV (C0). So this process is just plain slow. But it allows you to encrypt an arbitrary set of data.

Padding Oracles for a thousand, please

We did a CTF at work.

I have to say it was loads of fun – I’ve never really participated in a CTF before, so it was really good to test my hacking skills against my new colleagues.

We had one instruction from my manager – “don’t let the interns beat you”. I was determined to make sure that didn’t happen, but I was also determined to share as much knowledge and excitement for InfoSec as possible. This meant that once or twice, I may have egged an intern on in the face of the fact that they were about to discover it anyway, and it just seemed like a really good way to keep them interested.

This is not that story.

This is about me turning the corner from knowing about a security failure, to understanding how it works. Let’s see if I can help you guys understand, too.

Tales from the Crypto

That’s the title of my blog, and there’s not a whole lot of cryptography here. It’s just a play on words, which was a little more relevant when I first started the blog back in 2005. So here’s some crypto, at last.

There’s several aspects to cryptography that you have to get right as a developer:

  • Identify whether you’re doing hashing, signing, encryption, encoding, etc.
  • If you have a key, create and store it securely
  • Pick correct algorithms – modern algorithms with few known issues
  • Use the algorithms in a way that doesn’t weaken them

Having tried to teach all of these to developers in various forms, I can tell you that the first one, which should be the simplest, is still surprisingly hard for some developers to master. Harder still for managers – the number of breach notifications that talk about passwords being “encrypted” is a clear sign of this – encrypted passwords mean either your developers don’t understand and implemented the wrong thing, or your manager doesn’t understand what the developer implemented and thinks “encrypted sounds better than hashed”, and puts that down without checking that it’s still technically accurate.

Key creation (so it’s not predictable), and storage (so it can’t be found by an attacker) is one of those issues that seems to go perennially unsolved – I’m not happy with many of the solutions I’ve seen, especially for self-hosted services where you can’t just appeal to a central key vault such as is currently available in all good cloud platforms.

Picking correct algorithms is a moving target. Algorithms that were considered perfectly sound ten or twenty years ago are now much weaker, and can result in applications being much weaker if they aren’t updated to match new understanding of cryptography, and processor and memory speed and quantity improvements. You can store rainbow tables in memory now that were unthinkable on disk just a decade or two ago.

Finally, of course, if all that wasn’t enough to make cryptography sound really difficult (spoiler: it is, which is why you get someone else to do it for you), there are a number of ways in which you can mess up the way in which you use the algorithm.

Modes, block-sizes, and padding

There are a large number of parameters to set even when you’ve picked which algorithms you’re using. Key sizes, block sizes, are fairly obvious – larger is (generally) better for a particular algorithm. [There are exceptions, but it’s a good rule of thumb to start from.]

There are a number of different modes available, generally abbreviated to puzzling TLAs – ECB, CFB, OFB, CBC, GCM, CTR, and so on and so forth. It’s bewildering. Each of these modes just defines a different order in which to apply various operations to do things like propagating entropy, so that it’s not possible to infer anything about the original plaintext from the ciphertext. That’s the idea, at least. ECB, for instance, fails on this because any two blocks of plaintext that are the same will result in two blocks of ciphertext that are the same.

And if you’re encrypting using a block cipher, you have to think about what to do with the last block – which may not be a complete block. This requires that the block be filled out with “padding” to make a full block. Even if you’re just filling it out with zeroes, you’re still padding – and those zeroes are the padding. (And you have to then answer the question “what if the last block ended with a zero before you padded it?”)

There’s a number of different padding schemes to choose from, too, such as “bit padding”, where after the last bit, you set the next bit to 1, and the remaining bits in the block to 0. Or there’s padding where the last byte is set to the count of how many padding bytes there are, and the remaining bytes are set to 0 – or a set of random bytes – or the count repeated over and over. It’s this latter that is embodied as PKCS#5 or PKCS#7 padding. For the purposes of this discussion, PKCS#7 padding is a generalised version of PKCS#5 padding. PKCS#5 padding works on eight-byte blocks, and PKCS#7 padding works on any size blocks (up to 256 bytes, presumably).

So, if you have a three-byte last block, and the block size is 16 bytes, the last block is ** ** ** 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d 0x0d (where “**” represents the last three bytes of data, and 0x0d represents the hexadecimal value for 13, the number of bytes in the padding). If your last block is full, PKCS#7 covers this by making you create an extra 16-byte block, with the value 0x10 (decimal 16) in every byte.

Tying this in to the CTF

It’s not at all unlikely that you wind up with the scenario with which we were presented in the CTF – a service that communicated with AES encryption, in CBC mode, using PKCS#7 padding. The fact that this was described as such was what tipped me off in the first place. This is the perfect setup for a Padding Oracle attack.

An Oracle is simply a device/function/machine/system/person that you send a request to, and get a response back, and which gives you some information as a result. The classical Oracles of Ancient Greece & Roman times were confusing and unhelpful at best, and that’s really something we want from any cryptographic oracle. The term “Random Oracle” refers to a hypothetical system which returns random information to every query. A good cryptographic system is one that is indistinguishable from a Random Oracle.

Sadly, CBC with PKCS#7 padding is generally very different from a Random Oracle. It is a Padding Oracle, because it will tell us when the padding is correct or incorrect. And that’s our vulnerability.

At this point, I could have done what one of my colleagues did, and download PadBuster, choosing parameters and/or modifying code, to crack the encryption.

But… I’ve been attacking this CTF somewhat … non-traditionally, using tools other than the normal ones, and so I thought I’d try and understand the algorithm’s weaknesses, and implement my own attack. I wrote it on the bus on my way into work, and was pleased to see when I got in that it worked – albeit slowly – first time.

How CBC and PKCS#5/7 is vulnerable

When decrypting each block using CBC, we say that PN = DK(CN)⊕CN-1 – which is just a symbolic way of saying that the recipient Decrypts (with key “K”) the current Ciphertext block (block N), and then XORs the result with the previous Ciphertext block (the N-1th block). Let’s also assume that we’re only decrypting those two blocks, N-1 and N, with N being the last block provided to the recipient.

In other modes, the padding check may not deliver the helpful information we’re looking for, but CBC is special. The way CBC decrypts data is to decrypt the current block of ciphertext (CN), which creates an intermediate block DK(CN). That intermediate block is combined with the previous ciphertext block, CN-1, to give the plaintext block, PN. This combining of blocks is done using the XOR (exclusive-or) operation, which has interesting properties any developer should be familiar with. Particularly, it’s important to note that XOR (represented here as “⊕”) is reversible. If X⊕Y=Z, you know also that Z⊕Y=X and Z⊕X=Y. This is one of the reasons the XOR operation is used in a lot of cryptographic algorithms.

If we want to change things in the inputs to produce a different output, we can really only change two things – the current and the previous block of Ciphertext – CN and CN-1. We should really only alter one input at a time. If we alter CN, that’s going to be decrypted, and a small change will be magnified into a big difference to the DK(CN) value – all the bytes will have changed. But if we alter CN-1, just a bit, what we wind up with is a change in the plaintext value PN which matches that change. If we alter the 23rd bit of CN-1, it will alter the 23rd bit of PN, and only that one bit. Now if we can find what we’ve changed that bit to, we can then figure out what that means we must have changed it from.

If we change the last byte of CN-1, to create C’N-1 (pronounced “C prime of N minus 1”) and cycle it through all the possible values it can take, the decryption will occur, and the recipient will reject our new plain text, P’N (“P prime of N”) because it is poorly formed – it will have a bad padding. With one (two, but I’ll come to that in a minute) notable exception. If the last byte of the plaintext decrypted is the value 0x01, it’s a single byte of padding – and it’s correct padding. For that value of the last byte of C’N-1, we know that the last byte of P’N is 1. We can rewrite PN = DK(CN)⊕CN-1 as DK(CN) = CN-1⊕PN – and then we can put the values in for the last byte: DK(CN)[15] = C’N-1[15]⊕0x01.

Let’s say, for illustration’s sake, that the value we put in that last byte of C’N-1 was 0xa5, when our padding was accepted. That means DK(CN)[15] = 0xa5 ⊕ 0x01 = 0xa4. Note the lack of any “prime” marks there – we’ve figured out what the original value of the decrypted last byte was. Note that this isn’t the same as the last byte of the plain text. No, we get that by taking this new value and XORing it with the original last byte of the previous block of ciphertext – that’s CN-1[15]. For illustration, let’s say that value is 0xc5. We calculate PN[15] = DK(CN)[15]⊕CN-1[15] = 0xa4⊕0xc5 = 0x61. That’s the lower case letter ‘a’.

OK, so we got the first piece of plaintext out – the last byte.

[Remember that I said I’d touch on another case? If CN is the original last block of ciphertext, it already contains valid padding! But not necessarily the 0x01 we’re trying to force into place.]

Let’s get the next byte!

Almost the same process is used to get the next byte, with a couple of wrinkles. First, obviously, we’re altering the second-to-last byte through all possible values. Second, and not quite so obvious, we have to tweak the last byte once as well, because we’re looking to get the sequence 0x02 0x02 (two twos) to happen at the end of P’N. The last byte of C’N-1 to achieve this is simply the last byte of C’N-1 that we used to get 0x01, XORed by 0x03 (because that’s 0x02 ⊕ 0x01). In our illustrative example, that’s 0xa6.

And the next, and the next…

Each time, you have to set the end values of the ciphertext block, so that the end of P’N will look like 0x03 0x03 0x03, 0x04 0x04 0x04 0x04, etc, all the way up to 0x10 … 0x10 (sixteen 16s).

Code, or it didn’t happen

So here’s the 200 lines or so that I wrote on the bus. I also wrote a test harness so that this would work even after the CTF finished and got shut down. You’ll find that in the same repo.

I’ve massaged the code so it’s easier to understand, or to use as an explainer for what’s going on.

I plan on expanding this in a couple of ways – first, to make it essentially command-line compatible with ‘PadBuster’, and second, to produce a graphical demo of how the cracking happens.

And in the next post, I’m going to talk a little about how I optimised this code, so that it was nearly 15x faster than PadBuster.