NISCC IPSec Vulnerability?

OK, I know I promised that the next article would be about a non-existent DDI,
but work has been so busy lately that I’ve been unable to cobble together the
little demo program I had in mind at the time. One day soon, I promise.

Anyway, there is something more interesting to talk about. Stupid security
vulnerabilities! Security is such an important topic that I believe that all
coders should understand the basics. NTDEV is full of e-mails about how to
correctly (or incorrectly!) apply encryption and authentication to drivers, and
these issues pop up at least as often in user space too.

Now, I’ll be the first to point out that Security Is Hard. Bruce Schneier is
fond of saying that anyone can design a cryptosystem that he himself cannot
break. Really, good security design is 1/3 intelligence, 1/3 experience, and
1/3 peer review. You cannot hope to get it right without experience and peer
review, no matter how smart you (think you) are.

With that in mind, I was eager to read Friday’s NISCC Vulnerability Advisory
IPSEC – 004033, which detailed a new security threat allegedly found in IPSec.
(And by the way, it’s
“IPSec”, not “IPSEC”, IMNSHO. Looks prettier that way.) IPSec has a bit of a
history of security problems, but to my knowledge they have all been well
addressed, and anyway, the problems were only found because of the vast number
of eyeballs pointed at it. Other protocols (e.g. proprietary protocols) have
nothing to brag about until the same number of eyeballs examine them. This
story seemed to get a lot of press, including articles on Slashdot and elsewhere.

Well, I have analyzed this, and it is my opinion that this advisory is basically
junk, mainly because the problem it describes has been known for years. Most vendors,
including FreeSWAN, OpenSWAN, and StrongSwan, prevent the mis-configuration that
this advisory’s vulnerabilities rely on. Still, there are other reasons why
this is less significant than the media would have you believe.

To recap, there are three attack vectors that all basically exploit the same
mis-configuration. The mis-configuration is choosing to encrypt the data without
choosing to authenticate that same data. In other words, you choose to use ESP
(one of the payload protocols in IPSec), but you don’t turn on any sort of hash
to make sure that the encrypted message arrives intact. Now why would you care
if a few bits are flipped in the ciphertext? Because it will change the
plaintext. Sometimes, it’ll change the plaintext dramatically. So when you
decrypt the ciphertext, what you end up with is NOT what the sender sent.

Some who know a thing or two about crypto might point out that any good
encryption algorithm (and the ones generally in use in the IPSec world, like
3DES and AES certainly are good algorithms) will exhibit a property known as the
avalanche effect. This means, roughly speaking, that flipping any single bit of
the ciphertext should flip each bit of the decrypted plaintext with probability
as close as possible to 50%. In other words, very tiny changes made to the
ciphertext should result in major changes to the plaintext. So, with that in
mind, how can this attack be practical? Wouldn’t the receiving IP stack just
trash such a badly-mangled datagram?

Well, there was an implicit assumption contained in that theory; namely, that
the cipher was being used in Electronic Codebook (ECB) mode. ECB mode
does the naive thing and encrypts each block of data (64 bits
in 3DES, 128 bits in AES) independently. The problem with that is that
identical blocks of plaintext will encrypt to identical blocks of ciphertext.
This reveals patterns in the plaintext, and can wind up leaking sensitive
information through a statistical analysis.

Most cryptosystems use Cipher Block Chaining (CBC) mode instead, to address this
issue. IPSec uses 3DES-CBC and AES-CBC. CBC mode XORs the output from an
encryption operation (i.e. a 64/128-bit block) with the input of the next
operation, thereby creating an encrypted chain of data. An explicit
Initialization Vector (IV) is chosen for the first XOR operation. In CBC mode,
the statistical properties of the plaintext that are revealed by ECB mode are
hidden. However, this comes at a price: CBC is open to a relatively
straightforward plaintext modification attack. Because the decrypted plaintext
is based on an XOR of the previous block, that block can be manipulated to
affect specific bits pretty reliably in the next block to be decrypted. As
usual, Wikipedia has amazingly good pictures and details of this
stuff. Wikipedia is the Best Thing Ever™.

So, this attack takes advantage of the CBC modification attack to
deterministically manipulate the plaintext in a few key ways. The ways differ
depending on which vulnerability we’re exploiting. In every case, you have to
be in the packet path. The vulnerabilities addressed in the advisory are:

  • Destination Address Rewriting – using the above-mentioned technique,
    we modify the destination address in the IP datagram. Some computers (although
    very few by default) will inspect the destination address of the packet, and if
    it isn’t one of the local host’s addresses, will forward the datagram on to the
    real destination based on the routing table. This is just good old fashioned IP
    routing, and the attack takes advantage of the fact that the IPSec endpoint
    probably won’t re-encrypt the packet unless it happens to have an SA with the
    target. Even if it does, the new destination will be able to read the complete
    plaintext. This attack only works if the host has IP routing on, which is
    unusual in general but more common on boxes that also terminate IPSec tunnels.

  • IP Options Rewriting – The attacker tweaks the IP options fields to
    attempt to elicit an ICMP Parameter Problem message. The source address of the
    packet must also be modified so that the ICMP message won’t be encrypted on
    return to sender. This attack takes advantage of the fact that ICMP error
    messages quote the IP header plus the first 8 bytes of the payload of the
    datagram that elicited the error, so that the sender can attempt to
    correlate the error to a particular application. This exposes 8 bytes of
    plaintext, and doesn’t require
    any special configuration on the target host. Remember, however, that those 8
    bytes of plaintext are almost always transport layer headers – TCP’s header is
    at least 20 bytes, and UDP’s and ICMP’s are 8. Because anyone can define a new
    IP protocol, I can’t say for sure which ones are vulnerable to having
    their payload revealed with this attack, but I can say that there aren’t
    many.

  • Protocol Field Rewriting – This attack relies on the fact that IP
    datagrams targeted at invalid IP protocol IDs elicit an ICMP
    Destination Unreachable (Protocol Unreachable) message that likewise quotes
    the first 8 bytes of the original datagram’s payload. This attack works
    similarly to IP Options Rewriting.

These attacks would certainly work if you could manage to get into the packet
path, and if the IPSec tunnel you’re attacking doesn’t use authentication on its
packets. Getting into the packet path is difficult in general, but often
possible. However, as mentioned above, most vendors explicitly forbid
configuring IPSec security associations that use encryption without some sort of
authentication. From the OpenSWAN source:

notification_t parse_ipsec_sa_body(
        [...]
        switch (esp_attrs.auth)
        {
            case AUTH_ALGORITHM_NONE:
                if (!ah_seen)
                {
                    DBG(DBG_CONTROL | DBG_CRYPT
                        , DBG_log("ESP from %s must either have AUTH or be combined with AH"
                            , ip_str(&c->spd.that.host_addr)));
                    continue;   /* try another */
                }
                break;
        [...]

So, to summarize, these attacks are real, they have been known about for a
while, and it is a well-understood best practice to never configure encryption
without also having a way to authenticate the encrypted data. Every major
encryption protocol I know of does this, or at least enables it.

Frustration

Complaining in public – some say the whole point of having a blog!

There are many problems with working 80-plus-hour weeks, and I’m well-acquainted with all of them. But the worst one, clearly, is this: I managed to miss not ONE but TWO shows tonight that I’ve been looking forward to for months, because I was too overloaded to remember they were coming. The Shins were in Lawrence, KS tonight, and Nickel Creek was in Des Moines, IA, a couple hours away. I’ve seen Nickel Creek a few times, but I’m a little Shins-crazy at the moment and would happily have shelled out the $20 or whatever it was for the show. How frustrating!

Michael Kaplan, whose blog I read every day (despite the fact that it’s not linked at the side… gotta update the blogroll some day… I hate .Text…) has a post where he points out a pretty funny post on Language Log dealing with split infinitives. I’m kind of the Resident Grammar Goon at my office (although I can’t spell worth a damn, which you know if you have been reading this blog for any appreciable period of time), and I love LL (even though it’s not on my blog list either). This is what I love about The Blogosphere™ – it crosslinks the web in the most amazingly interesting ways. This paragraph gets the award for the most parentheticals ever (which it richly deserves).

In kernel-mode news, Mark Roddy has updated his popular DDKBUILD script. For those of you who haven’t figured out Vim yet, this is the ticket – it lets you use Visual Studio to develop drivers using the DDK. The latest update adds support for the current Windows Driver Framework beta. Mark does this as a community service, and his efforts are much appreciated.

Incidentally, if you haven’t checked out the WDF beta yet, you should. I was very impressed with their work, and Johan (whose last name slips my mind atm), the PM for the team, really seemed like he has his ducks in a row. My wife has been wearing around an OSR shirt I got at DevCon that clearly illustrates the fact that you can save an entire shirt-front of WDM code using two or three lines of WDF code on the back. If that doesn’t sell you, imagine never writing PnP and power code again. Did that get you? No? Then you’ve never written a WDM driver. Or you did it wrong. 🙂

Okay, rant over. Next up: ExInterlockedRemoveEntryList(). Hmm, wait, that doesn’t exist. . .

Dummy Pages

Hmm, that’s not a very Politically Correct name for a page, now is it?

My favorite new bug that I found (or rather, had pointed out to me) at DevCon has to do with “dummy pages”. Landy Wang, the guy who seems to own the memory manager these days, pulled together a group of people after his most excellent talk about the future directions of MM. I wish I could give out details, but they’re most certainly NDA material. For me, it was easily the best presentation of the event. Anyway, after the presentation, Landy described a very interesting bug that has just been discovered at Microsoft concerning these dummy pages, and he gave me explicit permission to talk about it. As if I’d refuse. 🙂

A little background: when MM wants to bring in a page off of disk due to a page fault, it issues an I/O request and brings it in. No big deal there. But if you have a row of four pages like this (x = paged out, v = valid):

 [x|x|v|v]

MM will try to bring in both missing pages at once, because it is more efficient to issue one big I/O request than lots of little ones. Still not complicated. The fun happens when you have this:

 [x|v|x]

If the first missing page takes a fault, MM would like to bring in the third one with it. BUT – if the middle page is dirty, it cannot be paged in from disk without destroying data. The solution MM uses to avoid the performance hit is to substitute the valid (dirty) page in the middle with a dummy page. The dummy page is a sacrificial page of memory whose only purpose is to receive data from a large MM inpage request. Its contents then go straight to the bit bucket, and the net result is that pages 1 and 3 are updated, while page 2 remains untouched. Neat trick, eh?

There is a big problem here for drivers that are in the paging path: their dummy page data could become invalid at any point, changed out from under them by another read (or conceivably something else entirely). This means that if drivers have to depend on a consistent copy of the data (crypto drivers, compression drivers, etc.), they must double-buffer the read.

The fun part about this bug is that it has apparently been in the code since Windows XP, five-ish years ago. Microsoft seems to have just discovered this problem in the last two months, though, because Longhorn has changed an aspect of how dummy pages are used. In Longhorn, there is a very high probability that your dummy page will get scribbled on while you think you have control of it.

The moral of the story is that you have to treat read data in a similar way to write data – it can change out from under you at any time, so if you need a consistent view of the bits, you had better make a copy. Incidentally, this whole dummy page mechanism works on the write path as well, but that doesn’t matter as much since most people expect this sort of weirdness on the write path. Or if not, they should. 🙂

So: go out and fix your drivers, now. They are probably crashing on most current Microsoft OSes, and they will almost certainly crash in Longhorn.

Pointer Stew, Revisited

Here’s my take on Pointer Stew, for those who are curious.


#include <stdio.h>

char *c[]={
        "ENTER",
        "NEW",
        "POINT",
        "FIRST"
};

char **cp[]={c+3,c+2,c+1,c};
/*
cp[0] == &c[3] == &"FIRST"
cp[1] == &c[2] == &"POINT"
cp[2] == &c[1] == &"NEW"
cp[3] == &c[0] == &"ENTER"
*/

char ***cpp=cp;
/*
cpp[0] == cp[0] == &c[3] == &"FIRST"
cpp[1] == cp[1] == &c[2] == &"POINT"
cpp[2] == cp[2] == &c[1] == &"NEW"
cpp[3] == cp[3] == &c[0] == &"ENTER"
*/

int main(void)
{
        printf("%s",**++cpp);
/*
RULE: prefix unary operators are all the same precedence and associate right->left
RULE: prefix unary increment returns incremented value to the expression

1)  **++cpp ---> *(*(++cpp))
 - increment cpp.  the new table looks like:
cpp[0] == cp[1] == &c[2] == &"POINT"
cpp[1] == cp[2] == &c[1] == &"NEW"
cpp[2] == cp[3] == &c[0] == &"ENTER"

2)  *(*cpp)
 - dereference cpp: *cpp == cpp[0] == cp[1] == &c[2] == &"POINT"

3) *(&"POINT")
 - dereference the pointer, leading to: "POINT"

4) printf("%s", "POINT")
 - output "POINT"
*/

        printf("%s ",*--*++cpp+3);
/*
RULE: prefix unary operators are all the same precedence and associate right->left
RULE: binary addition/subtraction operators bind below prefix unary operators
RULE: binary addition/subtraction on pointer types adds/subtracts the size of the object pointed to

1) (*(--(*(++cpp)))) + 3
 - increment cpp.  the new table looks like:
cpp[0] == cp[2] == &c[1] == &"NEW"
cpp[1] == cp[3] == &c[0] == &"ENTER"

2) (*(--(*cpp))) + 3
 - dereference cpp: *cpp == cpp[0] == cp[2] == &c[1] == &"NEW"

3) (*(--&"NEW")) + 3
 - prefix-decrement the char** by one element (sizeof(char*)): &c[1]
   becomes &c[0] == &"ENTER" (ref. c[])

4) (*(&"ENTER")) + 3
 - dereference the pointer --> "ENTER"

5) "ENTER" + 3
 - add 3 to the char* --> "ER"

6) printf("%s ", "ER")
 - output "ER "

*/
        printf("%s",*cpp[-2]+3);
/*
RULE: postfix unary operators ([] in this case) bind tightest of the operators here
RULE: negative subscripts do the obvious thing (index backwards)
RULE: prefix unary operators bind next
RULE: binary plus/minus are last of the ones here

1) (*(cpp[-2])) + 3
 - we've bumped cpp twice so far, so indexing back by 2 lands where it started
 - cpp[-2] == cp[0] == &c[3] == &"FIRST"

2) (*(&"FIRST")) + 3
 - dereference the pointer --> "FIRST"

3) "FIRST" + 3 --> "ST"

4) printf("%s", "ST")
 - output "ST"

 */

        printf("%s\n",cpp[-1][-1]+1);
/*
RULE: postfix unary operators ([] in this case) bind tightest and associate L->R
RULE: binary plus/minus are least precedent
RULE: array subscripting is done by multiplying the integer by the sizeof the
      array element and adding it to the pointer, followed by applying the
      dereference operator, unless the result is itself an array

1) ((cpp[-1])[-1]) + 1
 - again, we can decrement cpp by one due to previous two increments
 - cpp[-1] == cp[1] == &c[2] == &"POINT"

2) ((&"POINT")[-1]) + 1
 - per rule #3, back up one char* from &c[2] to &c[1] == &"NEW" (cf. c[])

3) (*(&"NEW")) + 1
 - dereference the pointer --> "NEW"

4) "NEW" + 1
 - pointer math: add 1 (sizeof(char)) --> "EW"

5) printf("%s\n", "EW");
 - output: "EW\n"

final output:  "POINT" "ER " "ST" "EW\n" --> "POINTER STEW\n"

NOTE that this clever little thing managed to get the NUL terminators right
every time; otherwise, the printf calls would have crashed
 */
        return 0;
}

More Sharp Languages: F#

I was perusing The Blogosphere™ today and ran across this page about the new F# language, via a Jason Matusow post about MS Research. It seems to be a newish functionalish language based on ML. I just hired a programmer fresh out of CS school last autumn, and he tells me that he learned all about functional languages in school. I just kind of smiled and nodded and hired him anyway. After all, he didn’t say the word “Java” during his interview.

Well, I thought I’d burn 5 minutes of my Sunday morning reading about F#, so I downloaded their introductory presentation. It turns out that F# is in use at Microsoft, integrates with VS2k3 (and 2k5), and *was used to write Static Driver Verifier*. I haven’t covered SDV yet, but suffice it to say that it’s kind of a static code analysis tool that verifies that you’ve operated the APIs correctly before you actually run the driver. It’s a neat tool, and I’ll be making more use of it in the future.

Microsoft has obviously turned up the heat to an unprecedented level as it relates to testing, and they’ve made tons of great tools available for that purpose. Other initiatives like the kernel-mode and user-mode driver frameworks are clear attempts to improve code reliability, both by creating sandboxes (usermode) and using proven code (both). The cancel-safe queue library is another example of this; safe strings are a third.

The thing that interests me is that SDV, a new testing tool (and presumably an important effort within Microsoft) was written in a functional language. The functional language people will tell you that a big advantage to that style of programming is that your programs don’t have (as much) state. You’ll have to look at some sample code to get an idea of what I mean, but in general, once you define a variable, there’s no changing it. The lack of state makes programs easier to prove correct mathematically. In other words, using functional languages well ostensibly leads to programs that have fewer bugs, increased testability, and most importantly, provability.

I’m usually skeptical about “major advances” in programming. I am equally capable of writing excellent bugs in asm, C, and C++. Then again, proving a program correct is one of the more difficult exercises in computer science, so it just doesn’t get done. Anything that can help in that regard is worth a look. As I said, I know very little about functional programming, so please don’t cite any of this as authoritative. But it is interesting.

In totally unrelated news, there was no Ask Adrian session at this DevCon. That was one of the best parts of last year, so when I found out he wasn’t going to be back, I asked a Microsoft person why. She said something to the effect of “ooh, um, well, nobody can ask Adrian what he’s working on right now!” Adrian, one of the lead devs on the kernel team, had said something about having a lot to do with Driver Verifier last time. The mind wanders. 🙂

Now Playing: One by U2.