Why is PKI so hard? – Page 2 – Tales from the Crypto

Why is PKI so hard?

My take on the SSL MitM Attacks – part 2 – clarifications

Since the last post I made on the topic of SSL renegotiation attacks, I’ve had a few questions in email. Let’s see how well I can answer them:

Q. Some stories talk about SSL, others about TLS, what’s the difference?

A. For trademark reasons, when SSL became an open standard, it had to change its name from SSL to TLS. TLS 1.0 is essentially SSL 3.1 – it even claims to be version “3.1” in its communication. I’ll just call it SSL from here on out to remind you that it’s a problem with SSL and TLS both.

Q. All the press coverage seems to be talking about HTTPS – is this limited to HTTPS?

A. No, this isn’t an HTTPS-only attack, although it is true that most people’s exposure to SSL is through HTTPS. There are many other protocols that use SSL to protect their connections and traffic, and they each may be vulnerable in their own special ways.

Q. I’ve seen some posts saying that SSH and SFTP are not vulnerable – how did they manage that?

A. Simply by being “not SSL”. SFTP is a protocol on top of SSH, and SSH is not related to SSL. That’s why it’s not affected by this issue. Of course, if there’s a vulnerability discovered in SSH, it’ll affect SSH and SFTP, but won’t affect SSL or SSL-based protocols such as HTTPS and FTPS.

Q. Is it OK to disable SSL renegotiation to fix this bug?

A. Obviously, if SSL didn’t need renegotiation at all, it wouldn’t be there. So, in some respects, if you disable SSL renegotiation, you may be killing functionality. There are a few reasons that you might be using SSL renegotiation:

  1. Because that’s how client authentication works – while you can do client authentication without renegotiation, most HTTPS implementations use renegotiation to request the client certificate. Disabling renegotiation will generally prevent most clients from authenticating with client authentication.
  2. After 10 hours, renegotiation is required, so as to refresh the session key. Do you have SSL connections lasting 10 hours? You probably should be looking at some disconnect/reconnect scenario instead.
  3. Because you can’t disable SSL renegotiation in all cases. In OpenSSL, you can only disable renegotiation if you download and install the new version, and in other SSL implementations, there is no way to disable renegotiation outside of modifying the application.

Q. Since this attack requires the attacker to become a man-in-the-middle, doesn’t that make it fundamentally difficult, esoteric, or close to impossible?

A. If becoming a man-in-the-middle (MitM) was impossible or difficult, there would be little-to-no need for SSL in the first place. SSL is designed specifically to protect against MitM attacks by authenticating and encrypting the channel. If a MitM can alter traffic and make it seem as if everything’s secure between client and server over SSL, then there’s a failure in SSL’s basic goal of protecting against men-in-the-middle.

Once you assume that an attacker can intercept, read, and modify (but not decrypt) the SSL traffic, this attack is actually relatively easy. There are demonstration programs available already to show how to exploit it.

I was asked earlier today how someone could become a man-in-the-middle, and off the top of my head I came up with six ways that are either recently or frequently used to do just that.

Q. Am I safe at a coffee shop using the wifi?

A. No, not really – over wifi is the easiest way for an attacker to insert himself into your stream.

When using a public wifi spot, always connect as soon as possible to a secured VPN. Ironically, of course, most VPNs are SSL-based, these days, and so you’re relying on SSL to protect you against possible attacks that might lead to SSL issues. This is not nearly as daft as it sounds.

Q. Is this really the most important vulnerability we face right now?

A. No, it just happens to be one that I understood quickly and can blather on about. I think it’s under-discussed, and I don’t think we’ve seen the last entertaining use of it. I’d like to make sure developers of SSL-dependent applications are at least thinking about what attacks can be performed against them using this step, and how they can prevent these attacks. I know I’m working to do something with WFTPD Pro.

Q. Isn’t the solution to avoid executing commands outside the encrypted tunnel?

A. Very nearly, yes. The answer is to avoid executing commands sent across two encrypted sessions, and to deal harshly with those connections who try to send part of their content in one session and the rest in a differently negotiated session.

In testing WFTPD Pro out against FTPS clients, I found that some would send two encrypted packets for each command – one containing the command itself, the other containing the carriage return and linefeed. This is bad in itself, but if the two packets straddle either side of a renegotiation, disconnect the client. That should prevent the HTTPS Request-Splitting using renegotiation.

One key behaviour HTTPS has is that when you request a protected resource, it will ask for authentication and then hand you the resource. What it should probably be doing is to ask for authentication and then wait for you to re-request the resource. That action alone would have prevented the client-certificate attacks discussed so far.

Q. What is the proposed solution?

A. The proposed solution, as I understand it, is for client and server to state in their renegotiation handshake what the last negotiated session state was. That way, an interloper cannot hand off a previously negotiated session to the victim client without the client noticing.

Note that, because this is implemented as a TLS handshake extension, it cannot be implemented in SSLv3. Those of you who just got done with mandating SSLv2 removal throughout your organisations, prepare for the future requirement that SSLv3 be similarly disabled.

Q. Can we apply the solution today?

A. It’s not been ratified as a standard yet, and there needs to be some discussion to avoid rushing into a solution that might, in retrospect, turn out to be no better – or perhaps worse – than the problem it’s trying to solve.

Even when the solution is made available, consider that PCI auditors are still working hard to persuade their customers to stop using SSLv2, which was deprecated over twelve years ago. I keep thinking that this is rather akin to debating whether we should disable the Latin language portion of our web pages.

However, it does demonstrate that users and server operators alike do not like to change their existing systems. No doubt IDS and IPS vendors will step up and provide modules that can disconnect unwarranted renegotiations.

Update: Read Part 3 for a discussion of the possible threats to FTPS.

My take on the SSL MITM Attacks – part 1 – the HTTPS attack

If you’re in the security world, you’ve probably heard a lot lately about new and deadly flaws in the SSL and TLS protocols – so-called “Man in the Middle” attacks (aka MITM).

These aren’t the same as old-style MITM attacks, which relied on the attacker somehow pretending strongly to be the secure site being connected to – those attacks allowed the attacker to get the entire content of the transmission, but they required the attacker to already have some significant level of access. The access required included that the attacker had to be able to intercept and change the network traffic as it passed through him, and also that the attacker had to provide a completely trusted certificate representing himself as the secure server. [Note – you can always perform a man-in-the-middle attack if you own a trusted certificate authority.]

The current SSL MITM attack follows a different pattern, because of the way HTTPS authentication works in practice. This means it has more limited effect, but requires less in the way of access. You gain some security advantage, you lose some. The attacker still needs to be able to intercept and modify the traffic between client and server, but does not get to see the content of traffic between client and server. All the attacker gets to do is to submit data to the server before the client gets its turn.

Imagine you’re ordering a pizza over the phone. Normally, the procedure is that you call and tell them what the pizza order is (type of pizza, delivery address), and they ask you for your credit card number as verification. Sometimes, though, the phone operator asks for your credit card number first, and then takes your order. So, you’re comfortable working either way.

Now, suppose an attacker can hijack your call to the pizza restaurant and mimic your voice. While playing you a ringing tone to keep you on the line, he talks to the phone operator, specifying the pizza he wants and the address to which it is to be delivered. Immediately after that, he connects you to your pizza restaurant, you’re asked for your credit card number, which you supply, and then you place your pizza order.

Computers are as dumb as a bag of rocks. Not very smart rocks at that. So, imagine that this phone operator isn’t smart enough to say “what, another pizza? You just ordered one.”

That’s a rough, non-technical description of the HTTPS attack. There’s another subtle variation, in which the caller states his pizza order, then says “oh, and ignore my attempt to order a pizza in a few seconds”. The computer is dumb enough to accept that, too.

For a more technical description, go see Eric Rescorla’s summary at Understanding the TLS Renegotiation Attack, or Marsh Ray’s original report.

Let’s call these the HTTPS client-auth attack and the HTTPS request-splitting attack. That’s a basic description of what they do.

HTTPS client-authentication attack

The client-authentication attack is getting the biggest press, because it allows the attacker one go (per try) at persuading the server to perform an action in the context of the authenticated user. From ordering a pizza to pretty any activity that can be caused in a single request to a web site can be achieved with this attack.

Preventing the attack at the server.

Servers have been poorly designed in this respect – but out of some necessity. Eric Rescorla explains this in the SSL and TLS bible, “SSL and TLS” [Subtitle: Designing and Building Secure Systems] on page 322, section 9.18.

“The commonly used approach is for the server to negotiate an ordinary SSL connection for all clients. Then, once the request has been received, the server determines whether client authentication is required… If it is required, the server requests a rehandshake using HelloRequest. In this second handshake, the server requests client authentication.”

How does HTTP handle other authentication, such as Forms, Digest, Basic, Windows Integrated, etc? Is it different from the above description?

A client can provide credentials along with its original request using the WWW-Authenticate header, or the server can refuse an unauthorised (anonymous) request with a 401 error code indicating that authentication is necessary (and listing WWW-Authenticate headers containing appropriate challenges). In the latter case, the client resends the request with the appropriate WWW-Authenticate header.

HTTPS Mutual Authentication (another term for client authentication) doesn’t do this. Why on earth not? I’m not sure, but I think it’s probably because SSL already has a mostly unwarranted reputation for being slow, and this would add another turnaround to the process.

Whatever the reason, a sudden dose of unexpected ‘401’ errors would lead to clients failing, because they aren’t coded to re-request the page with mutual auth in place.

So, we can’t redesign from scratch to fix this immediately – how do we fix what’s in place?

The best way is to realise what the attack can do, and make sure that the effects are as limited as possible. The attack can make the client engage in one action – the first action it performs after authenticating – using the credentials sent immediately after requesting the action to be performed.

A change of application design is warranted, then, to ensure that the first thing your secure application does on authenticating with a client certificate is to display a welcome screen, and not to perform an action. Reject any action requested prior to authentication having been received.

Sadly, while this is technically possible using SSL if you’ve written your own server to go along with the application, or can tie into information about the underlying SSL connection, it’s likely that most HTTPS servers operate on the principle that HTTP is stateless, and the app should have no knowledge of the SSL state beyond “have I been authenticated or not”.

Doubtless web server vendors are going to be coming out with workarounds, advice and fixes – and you should, of course, be looking to their advice on how to fix this behaviour.

The best defence against the client-authentication attack, of course, is to not use client authentication.

Preventing the attack at the client

Not much you can do here, I’m afraid – the client can’t tell if the server has already received a request. Perhaps it would work to not provide client certificates to a server unless you already have an existing SSL connection, but that would kill functionality to perfectly good web sites that are operating properly. Assuming that most web sites operate in the mode of “accept a no-client-auth connection before requesting authentication”, you could rework your client to insist on this happening all the time. Prepare for failures to be reported.

Again, the best defence is not to use client authentication right now. Perhaps split your time between browsers – one with client certificates built in for those few occasions when you need them, and the other without client certs, for your main browsing. That will, at least, limit your exposure.

HTTPS Request-splitting attack

Preventing the attack at the server

The HTTPS Request-splitting attack is technically a little easier to block at the server, if you write the server’s SSL interface – there should be absolutely no reason for an HTTP Request to be split across an SSL renegotiation. So, an HTTPS server should be able to discard any connection state, including headers already sent, when renegotiation happens. Again, consult with your web server developer / vendor for their recommendations.

Preventing the attack at the client?

Again, you’re pretty much out of luck here – even sending a double carriage return to terminate any previous request would cause the attacker’s request to succeed.

The long term approach – fix the protocol

As you can imagine, there are some changes that can be made to TLS to fix all of this. The basic thought is to have client and server add a little information in the renegotiation handshake that checks that client and server both agree about what has already come before in their communication. This allows client and server both to tell when an interloper has added his own communication before the renegotiation has taken place.

Details of the current plan can be found at draft-rescorla-tls-renegotiate.txt

Final thoughts

Yeah, this is a significant attack against SSL, or particularly HTTPS. There are few, if any, options for protecting yourself as a client, and not very many for protecting yourself as a server.

Considering how long it’s taken some places to get around to ditching SSLv2 after its own security flaws were found and patched 14 years ago with the development of SSLv3 and TLS, it seems like we’ll be trying to cope with these issues for many years to come.

Like it or not, though, the long-term approach of revising TLS is our best protection, and it’s important as users that we consider keeping our software up-to-date with changes in the security / threat landscape.

Update: read Part 2 of this discussion for answers to a number of questions.

Update: read Part 3 for some details on FTPS and the potential for attacks.

When “All” isn’t everything you need – Terminal Services Gateway certificates.

Setting up Terminal Services Gateway on Windows Server 2008 the other day.

It’s an excellent technology, and one I’ve been waiting for for some time – after all, it’s fairly logical to want to have one “bounce point” into which you connect, and have your connection request forwarded to the terminal server of your choice. Before this, if you were tied to Terminal Services, you had to deal with the fact that your terminal connection was taking up far more traffic than it should, and that the connection optimisation settings couldn’t reliably tell that your incoming connection was at WAN speeds, rather than LAN speeds.

image But to get TS Gateway working properly, it needs a valid server certificate that matches the name you provide for the gateway, and that certificate needs to be trusted by the client. Not usually a problem, even for a small business operating on the cheap – if you can’t afford a third-party trusted certificate, there are numerous ways to deploy a self-signed certificate so that your client computers will trust it.

I have a handily-created certificate that’s just right for the job.

I ran into a slight problem when I tried to install the certificate, however.


The certificate isn’t there! In this machine, it isn’t even possible for me to “Browse Certificates” to find the certificate I’m looking for. On another machine, the option is present:


That’s promising, but my certificate doesn’t appear in the list of certificates available for browsing:


I checked in the Local Computer’s Personal Certificates store, which is where this certificate should be, and sure enough, on both machines, it’s right there, ready to be used by TSG.


So, why isn’t TSG offering this certificate to me to select? The clue is in the title.

The certificate that doesn’t show up is the one with “Intended purposes: <All>” – the cert that shows up has only “Server Authentication” enabled. Opening the certificate’s properties, I see this:


Simply selecting the radio-button “Enable only the following purposes”, I click “OK”:


And now, back over in the TSG properties, when I Browse Certficates, the Install Certificate dialog shows me exactly the certificates I expected to see:


This isn’t a solution I would have expected, and if that one certificate hadn’t shown up there, I wouldn’t have had the one clue that let me solve this issue.

Hopefully my little story will help someone solve this issue on their system.

Debugging SSTP error -2147023660

Setting up an SSTP (Secure Socket Tunneling Protocol) connection earlier, I encountered a vaguely reminiscent problem. [SSTP allows virtual private network – VPN – connections between clients running Vista Service Pack 1 and later and servers running Windows Server 2008 and later, using HTTP over SSL, usually on port 443. Port 443 is the usual HTTPS port, and creating a VPN over just that port and no other allows it to operate over most firewalls.]

The connection just didn’t seem to want to take, even though I had already followed the step-by-step instructions for setting up the SSTP server. I thought I had resolved the issue originally by ensuring that I installed the certificate (it was self-signed) in the Trusted Roots certificate store. [If the certificate was not self-signed, I would have ensured that the root certificate itself was installed in Trusted Roots]

The first thing I did was to check the event viewer on the client, where I found numerous entries.

I found error -2147023660 in the Application event log from RasClient. This translates to 0x800704D4, ERROR_CONNECTION_ABORTED. That was pretty much the same information I already had, that the connection was being prevented from completing. So I visited the server to see if there was more information there.

On the server, I couldn’t find any entries from the time around when I was trying to connect. Not too good, because of course that’s where you’re going to look. In some cases, particularly errors that Microsoft thinks are going to happen too frequently, the conditions are checked at boot-time, and an error reported then, rather than every time the service is called on to perform an action.

Fortunately, it hadn’t been that long since I last booted (and I had a hint or two from the RRAS team at Microsoft), so my eyes were quickly drawn to an Event with ID 24 in the System Log, sourced at Microsoft-Windows-RasSstp. The text said:

The certificates bound to the HTTPS listener for IPv4 and IPv6 do not match. For SSTP connections, certificates should be configured for for IPv4, and [::]:Port for IPv6. The port is the listener port configured to be used with SSTP.

Note that this happens even if your RRAS server isn’t configured to offer IPv6 addresses to clients.

So, here’s some documentation on event ID 24 :


This is one of those nasty areas where there is no user interface other than the command-line. Don’t get me wrong, I love being able to do things using the command line, because it’s easy to script, simple to email to people who need to implement it, and it works well with design-approve-implement processes, where a designer puts a plan together that is approved by someone else and finally implemented by a third party. With command-line or other scripts, you can be sure that if the script didn’t change on its way through the system, then what was designed is what was approved, and is also what was implemented.

But it’s also easy to get things wrong in a script, whereas a selection in a UI is generally much more intuitive. It’s particularly easy to get long strings of hexadecimal digits wrong, as you will see when you try and follow the instructions above. Make sure to use copy-and-paste when assembling your script, and read the output for any possible errors.

The CWE Top 25 Programming Mistakes

I’ve read some debate about the top 25 programming mistakes as documented by the CWE (Common Weakness Enumeration) project, in collaboration with the SANS Institute and the MITRE . That the list isn’t complete, that there are some items that aren’t in the list, but should be, or vice-versa.

I think we should look at the CWE top-25 as something like the PCI Data Security Standard – it’s not the be-all and end-all of security, it’s not universally applicable, it’s not even a “gold standard”. It’s just the very bare minimum that you should be paying attention to, if you’ve got nowhere else to start in securing your application.

As noted by the SANS Institute, the top 25 list will allow schools and colleges to more confidently teach secure development as a part of their classes.

I personally would like to see a more rigorous taxonomy, although in this field, it’s really hard to do that, because in large part it’s a field that feeds off publicity – and you just can’t get publicity when you use phrases like “rigorous taxonomy”. Here’s my take on the top 25 mistakes, in the order presented:

Insecure Interaction Between Components

“These weaknesses are related to insecure ways in which data is sent and received between separate components, modules, programs, processes, threads, or systems.”

  • CWE-20: Improper Input Validation
    • What’s proper input validation? Consider the thought that there is no input, no output, only throughput. A string is received at the browser, and turned into a byte encoding; this byte encoding is sent to the web server, and possibly re-encoded, before being held in storage, or passed to a processing unit. For every input, there is an output, even if it’s only to local in-memory storage.
    • Validating the input portion falls broadly into two categories – validating for length, and validating for content. Validating for length seems simple – is it longer than the output medium is expecting? You should, however, check your assumptions about an encoding – sometimes encodings will add, and sometimes they will remove, counts of the members of the sequence – and sometimes they may do both.
    • Validating for content can similarly be broken into two groups – validating for correctness against the encoding expected, and then validating for content as to “business logic” (have you supplied a telephone number with a square-root sign or an apostrophe in it, say). Decide whether to strip invalid codes, or simply to reject the entire transaction. Usually, it is best (safest) to reject the entire transaction.
  • CWE-116: Improper Encoding or Escaping of Output
    • The other part of “throughput validation” – and while we constantly tell programmers that they should refuse to trust input, that should not be held as an excuse to produce untrustworthy output. There are many times when your code is trusted to produce good quality output. Some examples:
      • When you write a web application visited by a user, that user trusts you not to forward other people’s code on to them. Just your own, and that of your business partners. [See Cross-Site Scripting, below]
      • When your application is used internally [See SQL Injection, below]
    • Be conservative in what you send – make sure it rigorously follows whatever protocol or design-time contract has been agreed to. And above all, when sending data that isn’t code, make sure to encode it so that it can’t be read as code!
  • CWE-89: Failure to Preserve SQL Query Structure (aka ‘SQL Injection’)
    • SQL Injection is a throughput validation issue. In its essence, it involves an attacker who feeds SQL command codes into an interface, and that interface passes them on to a SQL database server.
    • This is almost an inexcusable error, as it is relatively easy to fix. The fix is usually hampered somewhat in that the SQL database server is required to trust the web server interface code, but that means only that the web server interface code must either encode, or remove, elements of the data that is being passed in the SQL command sequence being sent to the server. The most reliable way to do this is to use parameterised queries or stored procedures. Avoid building SQL commands through concatenation at almost any cost.
  • CWE-79: Failure to Preserve Web Page Structure (aka ‘Cross-site Scripting’)
    • I hate the term “cross-site scripting”. It’s far easier to understand if you just call it “HTML injection”. Like SQL injection, it’s about an attacker injecting HTML code into a web page (or other HTML page) by including it as data, in such a way that it is provided to the user as code.
    • Again, a throughput content validation issue, anything that came in as data and needs to go out as a part of an HTML page should be HTML encoded, ideally so that only the alphanumerics are unencoded.
  • CWE-78: Failure to Preserve OS Command Structure (aka ‘OS Command Injection’)
    • Like SQL injection, this is about generating code and including data. Don’t use your data as part of the generation of code.
    • There are many ways to fix this kind of an issue – my favourite is to save the data to a file, and make the code read the file. Don’t derive the name or location of the file from the user-supplied data.
  • CWE-319: Cleartext Transmission of Sensitive Information
    • What’s sensitive information? You decide, based on an analysis of the data you hold, and a reading of appropriate laws and contractual regulations. For example, with PCI DSS, sensitive information would include the credit card number, magnetic track data, and personal information included with that data. Depending on your state, personal contact information is generally sensitive, and you may also decide that certain business information is also sensitive.
    • Seriously, SSL and IPsec are not significant performance drains – if your system is already so overburdened that it cannot handle the overhead of encrypting sensitive data, you are ALREADY too slow, and only providence has saved you from problems.
    • Especially where the data is not your own, make an informed decision as to whether you will be communicating in clear text.
  • CWE-352: Cross-Site Request Forgery (CSRF)
    • Another confusing term, CSRF refers to the ability of one web page to send you HTML code that your browser will execute against another web page. This really is cross-site, and forges requests that look to come from the user, but really come from a web page being viewed in the user’s browser.
    • The fix for this is that every time you display a form (or even a solitary button, if that button’s effects should be unforgeable), you should include a hidden value that contains a random number. Then, when the “submit” (or equivalent) button is pressed, this hidden value will be sent back with the other contents of the form. Your server must, of course, validate this number is correct, and must not allow the number to be long-lived, or be used a second time. A simple fix, but one that you have to apply to each form.
    • This really falls under a category of guaranteeing that you are talking to the user (or the user’s trusted agent), and not someone pretending to be the user. Related to non-repudiation.
  • CWE-362: Race Condition
    • Race conditions refer to any situation in which the execution of two parallel threads or processes behaves differently when the order of execution is altered. If I tell my wife and son to go get a bowl and some flour, and to pour the flour into the bowl, there’s going to be a mess if my wife doesn’t get the bowl as quickly as my son gets the flour. Similarly, programs are full of occasions where a precedence is expected or assumed by the designer or programmer, but where that precedence is not guaranteed by the system.
    • There are books written on the topic of thread synchronisation and resource locking, so I won’t attempt to address fixing this class of issues.
  • CWE-209: Error Message Information Leak
    • Be helpful, but not too helpful. Give the user enough information to fix his side of the error, but not so much that he has the ability to learn sensitive information from the error message.
    • “Incorrect user name or password” is so much better than “Incorrect password for that user name”.
    • “Internal error, please call technical support, or wait a few minutes and try again” is better than “Buffer length exceeded at line 543 in file c:\dev\web\creditapp\cardcruncher.c”
    • Internal information like that should be logged in a file that is accessible to you when fixing your system, but not accessible to the general end users.
Risky Resource Management

“The weaknesses in this category are related to ways in which software does not properly manage the creation, usage, transfer, or destruction of important system resources.”

  • CWE-119: Failure to Constrain Operations within the Bounds of a Memory Buffer
    • The old “buffer overflow” – a throughput length validation issue.  Any time you take data from one source and place it into another destination, you have to reliably predict whether the destination is large enough to hold it, and you also have to decide what you will do if it is not.
    • Don’t rely solely on .NET or Java “protecting you from buffer overruns” – when you try and access an element outside of a buffer’s limits, they will simply throw an exception – crashing your program dead in its tracks. This in itself could cause half-complete files or other communications, which could feed into and damage other processes. [And simply catching all exceptions and continuing blindly is something I’ve complained about before]
  • CWE-642: External Control of Critical State Data
    • By “Critical State Data”, this refers to information about where in the processing your user is. The obvious example of bad external control of critical state data is sending the price to the user, and then reading it back from the user. It obviously isn’t too hard from an attacker to simply modify the value before sending it to the server.
    • Other examples of poorly chosen state being passed includes the use of customer ID numbers in URLs, in such a way that it is obvious how to select a different customer’s number.
    • State data such as this should generally be held at the server, and a ‘reference’ value exchanged to allow the server to regain state when a user responds. If this value is populated among users sufficiently sparsely, it’s close to impossible for an attacker to steal someone else’s state.
  • CWE-73: External Control of File Name or Path
    • This is related to forced-browsing, path-traversal, and other attacks. The idea is that any time you have external paths (such as URLs) with a direct 1:1 relationship to internal paths (directories and paths), it is usually possible to pass path control from the external representation into the internal representation.
    • Make sure that all files requested can only come from a known set of files; disable path representations (such as “..”, for ‘parent directory’) that your code doesn’t actually make use of.
    • Instead of trying to parse the strings yourself to guess what file name the operating system will use, always use the operating system to tell you what file name it’s going to access. Where possible, open the file and then query the handle to see what file it really represents.
  • CWE-426: Untrusted Search Path
    • Windows’ LoadLibrary is the classic example of this flaw in design – although the implicit inclusion of the current directory in Windows’ execution PATH searched is another.
    • When writing programs, you can only trust the code that you load or call if you can verify where you are loading or calling it from.
    • A favourite trick at college was to place ‘.’ at the front of your path, add a malicious shell file called ‘rm’, and invite a system administrator to show you how to kill a print job. The “lprm” command he’d run would call “rm”, and would run the local version, rather than the real command. Bingo, instant credentials!
    • Don’t search for code that you trust – know where it is, and if it isn’t there, fail.
  • CWE-94: Failure to Control Generation of Code (aka ‘Code Injection’)
    • I find it hard to imagine the situation that makes it safe to generate code in any way based off user input.
    • Perhaps you could argue that this is what you do when you generate HTML that contains, as part of its display, user input. OK then, the answer here is to properly encode that which you embed, so that the code processor cannot become confused as to what is code and what is data.
  • CWE-494: Download of Code Without Integrity Check
    • Either review the code that you download, or insist that it is digitally signed by a party with whom you have contracted for that purpose. Otherwise you don’t know what you are downloading or what you are executing.
  • CWE-404: Improper Resource Shutdown or Release
    • This covers a large range of issues:
      • Don’t “double-free” resources. Make sure you meticulously enforce one free / delete for every allocation you make. Otherwise, you wind up releasing a resource that you wanted to hang onto, or you may crash your program.
      • If the memory you’re about to release (or file you’re about to close) contained sensitive information, make sure it is wiped before release. Verify in the release build that the optimiser hasn’t optimised away this wiping!
      • Make sure you release resources when they are no longer in use, so that there are no memory leaks or other resource overuse problems that will lead to your application becoming bloated and fragile.
  • CWE-665: Improper Initialization
    • Lazy languages like Javascript, where a mistype becomes an instant variable assignment, should be avoided.
    • Define all variables’ types – no “IMPLICIT INTEGER*4 (I-N)” (Am I showing my age?)
    • Put something into your variables, so that you know what’s there. Don’t rely on the compiler unless the compiler is documented to guarantee initialisation.
    • By “variable”, I mean anything that might act like a variable – stretches of memory, file contents, etc.
  • CWE-682: Incorrect Calculation
    • Again, a multitude of sins:
      • “should have used sin, but we actually used cos”
      • divide by zero – or some similar operation – that causes the program to halt
      • length validation / numeric overflow – in a single byte, 128 + 128 = 0
    • As you can see, a denial of service can definitely occur, as can remote execution (usually a result of calculating too short a buffer, as a result of numeric overflow, and then overflowing the buffer itself)
    • Don’t underestimate the possible results of just plain getting the answer wrong – cryptographic implementations have been brought to their knees (and resulted in approving untrustworthy access) because they couldn’t add up properly.
Porous Defenses

“The weaknesses in this category are related to defensive techniques that are often misused, abused, or just plain ignored.”

  • CWE-285: Improper Access Control (Authorization)
    • This one pretty much speaks for itself. There’s public parts of your application, and there’s non-public parts. Make sure that you have to provide authentication before crossing that boundary, and make sure that the user account verified in authentication is the one that’s used for authorisation to access resources.
    • Carry user authentication information around carefully, without letting it be exposed to other forms of attack, but also to make sure that the information is available the next time you need to authorise access to resources.
  • CWE-327: Use of a Broken or Risky Cryptographic Algorithm
    • Translation – get a crypto expert to manage your crypto. [Note – this is why I recommend using CryptoAPI rather than OpenSSL, because you have to be your own expert to use OpenSSL.]
    • New algorithms arise, and old ones become obsolete. In the case of cryptographic algorithms, obsolete means “no longer effectively cryptographic”. In other words, if you use an old algorithm, or a broken algorithm, or don’t use an existing algorithm the right way, your data isn’t as protected as you thought it was.
    • Where possible, use a cryptographic framework such as SSL, where the choice of cryptographic algorithms available can be adjusted over time to deal with changing realities.
  • CWE-259: Hard-Coded Password
    • If there’s a hard-coded password, it will be discovered. And when discovered, it will be disseminated, and then you have to figure out how to get the message out to all of your users that they can now be owned because of your application. Not an easy conversation to have, at a guess.
    • This is a “just don’t do it” recommendation, not a “do it this way” or “do it that way”.
  • CWE-732: Insecure Permission Assignment for Critical Resource
    • If a low-privilege user can lock, or corrupt, a resource that is required for high-importance transactions, you’ve created an easy denial-of-service.
    • If a low-privilege user can modify something that is used as a basis for trust assignments, there’s an elevation of privilege attack.
    • And if a low-privilege user can write to your code base, you’re owned.
  • CWE-330: Use of Insufficiently Random Values
    • Give me a random number. 7. Give me another random number. 7. And another? 7.
    • How do you tell if a number is random enough? You hire a mathematician to do a statistical analysis to see if the next number is predictable if you know any or all of the previous numbers.
    • This mostly ties into CWE-327, don’t do your own crypto if you’re not a crypto expert (and by the way, you’re not a crypto expert). However, if you’re hosting a poker web site, it’s pretty important to be able to shuffle cards in an unpredictable manner!
    • Remember that the recent Kaminsky DNS attack, as well as the MD5 collision issues, could have been avoided entirely by the use of unpredictable numbers.
  • CWE-250: Execution with Unnecessary Privileges
    • Define “unnecessary”? No, define “necessary”. That which is required to do the job. Start your development and testing process as a restricted user. When you run into a function that fails because of lack of privileges, ask yourself “is this because I need this privilege, or can I continue without?”
    • Too many applications have been written that ask for “All” access to a file, when they only need “Read”.
    • Too many applications demand administrator access when they don’t really need it. I’m talking to you, Sansa Media Converter.
  • CWE-602: Client-Side Enforcement of Server-Side Security
    • I’ve seen this one hundreds of times. “We prompt the user for their birth date, and we reject invalid day numbers”; “Where do you reject those?”; “In the user interface so it’s nice and quick”. Great, so I can go in and make a copy of your web page, delete the checks, and input any number I like. Don’t consider it impossible that an attacker has written his own copy of the web browser, or can interfere with the information passing through the network.

What’s missing?

Glaringly absent, as usual, is any mention of logging or auditing.

Protections will fail, always, or they will be evaded. When this happens, it’s vital to have some idea of what might have happened – that’s impossible if you’re not logging information, if your logs are wiped over, or if you simply can’t trust the information in your logs.

Maybe I say this because my own “2ndAuth” tool is designed to add useful auditing around shared accounts that are traditionally untraceable – or maybe it’s the other way around, that I wrote 2ndAuth, because I couldn’t deal with the fact that shared accounts are essentially unaudited without it?

Of course, that leads to other subtleties – the logs should not provide interesting information to an attacker, for instance, and you can achieve this either by secreting them away (which makes them less handy), or by limiting the information in the logs (which makes them less useful).

Another missing issue is that of writing software to serve the user (all users) – and not to frustrate the attacker. [Some software reverses the two, frustrating the user and serving the attacker.] We developers are all trained to write code that does stuff – we don’t tend to get a lot of instruction on how to write code that doesn’t do stuff.

Another mistake, though it isn’t a coding mistake as such, is the absence of code review. You really can’t find all issues with code review alone, or with code analysis tools alone, or with testing alone, or with penetration testing alone, etc. You have to do as many of them as you can afford, and if you can’t afford enough to protect your application, perhaps there are other applications you’d be better off producing.

Other mistakes that I’d like to face head-on? Trusting the ‘silver bullet’ promises of languages and frameworks that protect you; releasing prototypes as production, or using prototype languages (hello, Perl, PHP!) to develop production software; feature creep; design by coding (the design is whatever you can get the code to do); undocumented deployment; fear/lack of dead code removal (“someone might be using that”); deploy first, secure later; lack of security training.

Microsoft Security Advisory – MD5 collisions

I would hardly be able to call my blog “Tales from the Crypto” if I didn’t pass at least some comment on the recent Microsoft Security Advisory, and the technical pre-paper on which it is based.

To an uninformed reader, the advisory (and especially the paper) doesn’t make a whole lot of sense, as with most cryptography documents. If there’s an attack on a cryptographic technology, doesn’t that mean it’s broken and we should stop using it?

Not really, no. We should stop using, or shore up, those components that have an increased vulnerability.

First, let’s remember that cryptography is necessarily full of mathematical theory and that it is very much a developing field. If I say something along the lines of “magic happens here”, please accept that at face value. It means that there is something hugely full of mathematical complexity that I don’t understand, but which has been assessed by mathematicians who know more than I do about the subject.

How do certificates work?

So, a little background, and an explanation of the attack, before we get to the mitigations.

Every time you use HTTPS (HTTP over SSL / TLS), there’s an identifying exchange – at the very least, the server identifies itself to you, and possibly you identify yourself to the server. In SSL, this is almost always done using certificates – strictly speaking, X.509 certificates.

A certificate is a list of statements about the identity of the party it represents, followed by a mathematically-derived encrypted value called a “signature”. The signature is based on a hash function, which is chosen to be resistant to attack. Typical hash functions are MD5, SHA1, and the “SHA-2” family which are identified by the number of bits of output they produce (i.e. how well they uniquely represent the original information to be hashed). The signature is the hash of the identity statements, encrypted using the issuer’s private key. This means that anyone can decrypt the hash, but in doing so, they will recognise both that only the issuer can have created the signature, and that the identity claims made in the certificate are accepted as valid by the issuer.

This allows you to trust the owner of the certificate, on the basis that you trust the issuer. Sometimes you don’t know if you can trust the issuer, either, and so you have to find out if you can trust the issuer – by looking at their certificate, seeing what claims it made, and what other issuer signed it, and so on, up a “chain of trust”, until you either meet a certificate you do trust, or you meet a certificate that is “self-signed” – that is, that it claims to be its own issuer, and has no other signatory.

So, from this description, you should be able to envisage a chain of trust, where the “leaf certificate” of the site whose identity you want to verify, is signed by an intermediate certificate authority (CA), which may in turn be signed by an intermediate CA, and so on, until you meet a certificate that is signed by a “root CA” – a self-signed certificate whose trust you can use as a basis for trusting the leaf certificate.

Many root CAs are installed by default in operating systems, or applications that use SSL, with the intent that you should be able to trust all certificates issued by those CAs, because they take adequate steps to verify the certificates they issue, and because they use modern technology.

Where’s the attack?

There’s nothing surprising about this attack to those of us who follow cryptography news. One of the problems with hashes is that it is possible to generate two paired documents, that have different content, but whose hash is the same. It has been known since 2004 that you can generate such colliding documents using MD5 as a hash without quite as much effort as the “brute force” technique of trying to generate documents and see if they match. From that, we have (or should have) predicted that this attack was possible, though not easy.

The attack is this – the attacker requests a bona fide web-site, or email (or any other) certificate from a reputable certificate authority. The certificate request is generated along with a second ‘shadow’ certificate – the two differ in areas chosen by the attacker, and with sufficient care to make sure that the issued certificates will both match the same signature.

This gives two certificates, which each appear to have been issued by the certificate authority, but only one of which actually contains information that was seen by the certificate authority.

The method of attack beyond this point will depend on what the shadow certificate was. The simplest way to attack this would be to have both certificates be web site certificates (or both be email certificates, etc), so that you could ask the CA for a certificate for your own name, but wind up with a certificate for someone else’s name – a big company or an important individual, say. That’s useful, but it only gives you one usable certificate per request. Keep that up, and you are sure to be detected.

The method outlined in the research paper, however, goes a step further than that – the certificate request that the CA sees is, as before, a simple web site certificate request. But the shadow certificate is designed to be that of an intermediate CA itself. Once this attack is successful, you can use the intermediate CA to issue any number of web site, email, code-signing, and even other CA certificates. Because these certificates chain up through your bogus intermediate CA, and then to a trusted root, they too will be trusted.

What about defence?

There are several defences to consider, and I’ll address them from the perspective of various different parties.

1. The Certificate Authority

First of all, all certificate authorities need to move to stop using MD5 when signing other people’s certificates. They should have stopped doing this some time ago, as it was clear that the generation of colliding certificate requests was an ever-increasing possibility. Also on the way out should be SHA1 (although that does mean older systems and software may have issues, because they may not be able to support newer SHA-based hash and signature algorithms). Note that this (particularly the dropping of SHA1) is a recommendation that should be followed with glacial slowness, over years, rather than days. We’re not that broken yet.

Even if the CA continues to use MD5 and SHA1, they can adequately protect against this attack by using non-predictable serial numbers when generating the certificate signatures. This is essentially the area where the CA can most easily and most effectively prevent this attack from succeeding, relying as it does on being able to predict precisely the contents of the returned certificate. This will continue to work so long as the attackers can only generate two colliding paired documents – if there is ever a sustainable attack that allows creating a document that matches the hash of another document without generating them together, this too will be a cause to doubt those certificates.

Another defence against this (but not the simpler form of the attack) is to ensure that you use different CAs to issue leaf certificates than you use to issue intermediate CA certificates, and that you set limits on how long the chain may be as signed by your CAs. That way, a leaf certificate request cannot be used to create a shadow intermediate CA certificate, because verification of the chain will fail because of length constraints.

Check your certificate requests, and make sure that you have not seen a large number of certificate requests from substantially the same source, in an attempt to generate a desired serial number. Offer your existing customers, if they are worried about MD5-signed certificates, the option to replace their certificates with certificates signed by other hash schemes.

2. The Web-Site Owner

There’s really not anything the web-site owner can do, beyond checking any reports of hijacked sessions, or web sites not appearing to be correctly identified, and then taking legal action to remove such pretender sites when they are found.

One thing that can be done is to champion the use of Enhanced Validation (EV) SSL Certificates, as specified by the Browser Forum. These certificates are required to use a chain that has no MD5 signatures in anything other than the root CA. Push the message to your customers and users that the green bar indicates a higher level of trustworthiness. You’ve not only identified yourself to the CA’s satisfaction, but your CA and you are committed to a more up-to-date technical configuration.

Ask your CA if you need to take action with your existing certificates – if they are signed by using MD5 hashes, it may be that some customers will refuse to accept your certificates. Your CA may have a reasonable offer on replacing your certificates with ones signed by SHA1 or other hashes.

3. The Web-Site Visitor

These are the guys that really matter – because if they can be fooled, then the attack has succeeded.

The first thing that has to be drummed into web-site users’ heads is that a certificate error message should be reason for you to stop your visit to the web site with the error, and to not place any orders with them, or supply it with your private information (password, personal details, etc) until you have resolved with their technical support what the issue is. This step alone is something that I have emphasised before, and I emphasise it again now, not because it is the best fix for this issue (because a clever attacker will try to produce a certificate that doesn’t error), but because it’s something that protects against the far easier attacks, and it is still not a habit that users have gotten into.

Next, keep up-to-date with patches. If there are interesting ways to block this at the browser, those will be distributed through security patches to your browser or other applications. If you use a lot of OpenSSL-based applications, keep looking for updates to those; if you use a lot of CryptoAPI-based apps, updates should come to you automatically through Windows Update.

Read Microsoft’s Security Advisory, as well as entries on the Microsoft Security Response Center Blog and the Microsoft Security Vulnerability Research & Defense Blog.

4. The software developer

Consider, if you already verify certificate chains yourself, adding or documenting features to refuse chains that flow through CA certificates signed with MD5; also to refuse chains that flow through CA certificates with too much ‘cruft’ (this attack uses the “Netscape Comment” field and fills it with binary that doesn’t look very comment-like).

Make sure that your verification routines check for chain length constraints, as well as corrupt or absent revocation list locations. Again, this attack had no space to put a valid CRL location in place.

If you develop IDS solutions, you may want to try and check for an SSL negotiation that includes certificates signed by intermediate CAs that are themselves signed by using the MD5 hash algorithm – although this is a little complex to track, it shouldn’t be completely impossible.

And, in summary (phew – at last!)

This is a proof of concept of a theoretical attack, and has generated some interest because it’s a shoe we’ve been waiting to see drop. Repeating the work with the information supplied by Sotirov et al would require a lot of significant and serious mathematics. I know that’s not something to make it impossible, but I think it’s enough to suggest that the sort of people with enough resources to hire advanced mathematicians would find it cheaper and easier to just use something more like social engineering to achieve the effect of having visitors trust your web site.

In several months, the tools will become more widely available, but by then, CAs should be smart enough to stop using MD5, and be considering a move to SHA256 and above. And if they aren’t, I’m sure there will be further advisories with instructions on which root CAs to remove from your trusts.

This is a thoroughly interesting attack, and exciting to people like me. That shouldn’t be taken as an indication that the world is about to collapse, or that you can’t go on trusting HTTPS the way you currently do. Even though we now have the ‘perfect storm’ of a serious DNS flaw backed with a way to subvert SSL, it doesn’t appear to be in use at the present, and with the information on how this attack was achieved, it’s possible for a root CA to comb back through their records and find suspicious behaviours that match this attack.

Link: Verisign’s statement (they own RapidSSL, the CA that was the subject of this attack).

Searching for Weak Debian / Ubuntu SSL Certificates

Tuxkeys_2 I’ve seen a number of people promote packages that have shipped for Debian and Ubuntu, which allow users to scan their collected keys – OpenSSH or OpenSSL or OpenVPN, to discover whether they’re too weak to be of any functional use. [See my earlier story on Debian and the OpenSSL PRNG]

These tools all have one problem.

They run on the Linux systems in question, and they scan the certificates in place.

Given that the keys in question could be as old as 2 years, it seems likely that many of them have migrated off the Linux platforms on which they have started, and onto web sites outside of the Linux platform.

Or, there may simply be a requirement for a Windows-centric security team to be able to scan existing sites for those Linux systems that have been running for a couple of years without receiving maintenance (don’t nod like that’s a good thing).

So, I’ve updated my SSLScan program. I’m attaching a copy of the tool to this blog post, (along with a copy of the Ubuntu OpenSSL blacklists for 1024-bit and 2048-bit keys if I can get approval), though of course I would suggest keeping up with your own copies of these blacklists. It took a little research to find out how to calculate the quantity being used for the fingerprint by Debian, but I figure that it’s best to go with the most authoritative source to begin with.

Please let me know if there are other, non-authoritative blacklists that you’d like to see the code work with – for now, the tool will simply search for “blacklist.RSA-1024” and “blacklist.RSA-2048” in the current directory to build a list of weak key fingerprints.

I’ve found a number of surprising certificates that haven’t been reissued yet, and I’ll let you know about them after the site owners have been informed.

[Sadly, I didn’t find https://whitehouse.gov before it was changed – its certificate is shared with, of all places, https://www.gov.cn – yes, the White House, home of the President of America, is hosted from the same server as the Chinese government. The certificate was changed yesterday, 2008/5/21. https://www.cacert.org’s certificate was issued two days ago, 2008/5/20 – coincidence?]

My examples are from the web, but the tool will work on any TCP service that responds immediately with an attempt to set up an SSL connection – so LDAP over SSL will work, but FTP over SSL will not. It won’t work with SSH, because that apparently uses a different key format.

Simply run SSLScan, and enter the name of a web site you’d like to test, such as www.example.com– don’t enter “http://” at the beginning, but remember that you can test a host at a non-standard port (which you will need to do for LDAP over SSL!) by including the port in the usual manner, such as www.example.com:636.

If you’re scanning a larger number of sites, simply put the list of addresses into a fie, and supply the file’s name as the argument to SSLScan.

Let me know if you think of any useful additions to the tool.

Here is some slightly modified output from a sample run of the tool (the names have been changed to protect the innocent):Image-0195_2

The text to look for here is “>>>This Key Is A Weak Debian Key<<<“.

Debian and the OpenSSL PRNG

[PRNG is an abbreviation for “Pseudo-Random Number Generator”, a key core component of the key-generation in any cryptographic library.]

Warning: Choking HazardA few people have already commented on the issue itself – Debian issued, in 2006, a version of their Linux build that contained a modified version of OpenSSL. The modification has been found to drastically reduce the randomness of the keys generated by OpenSSL on Debian Linux and any Linux derived from that build (such as Ubuntu, Edubuntu, Xubuntu, and any number of other buntus). Instead of being able to generate 1024-bit RSA keys that have a 1-in-2^1024 chance of being the same, the Debian build generated 1024-bit RSA keys that have a 1-in-2^15 chance of being the same (that’s 1 in 32,768).

Needless to say, that makes life really easy on a hacker who wants to pretend to be a server or a user who is identifed as the owner of one of these keys.

The fun comes when you go to http://metasploit.com/users/hdm/tools/debian-openssl/ and see what the change actually was that caused this. Debian fetched the source for OpenSSL, and found that Purify flagged a line as accessing uninitialised memory in the random number generator’s pre-seeding code.

So. They. Removed. The. Line.

I thought I’d state that slowly for dramatic effect.

If they’d bothered researching Purify and OpenSSL, they’d have found this:


Which states (in 2003, three years before Debian applied teh suck patch) “No, it’s fine – the problem is Purify and Valgrind assume all use of uninitialised data is inherently bad, whereas a PRNG implementation has nothing but positive (or more correctly, non-negative) things to say about the idea.”

So, Debian removed a source of random information used to generate the key. Silly Debian.

But there’s a further wrinkle to this.

If I understand HD Moore’s assertions correctly, this means that the sole sources of entropy (essentially, “randomness”) for the random numbers used to generate keys in Debian are:

  1. The Process ID (from 1 to 32,767)
  2. The contents of an uninitialised area in the process’ memory
  3. uh… that’s it.

[Okay, so that’s not strictly true in all cases – there are other ways to initialise randomness, but these two are the fallback position – the minimum entropy that can be used to create a key. In the absence of a random number source, these are the two things that will be used to create randomness.]

If you compile C++ code using Microsoft’s Visual C++ compiler in DEBUG mode, or with the /GZ, /RTC1, or /RTCs flags, you are asking the compiler to automatically initialise all uninitialised memory to 0xcc. I’m sure there’s some similar behaviour on Linux compilers, because this aids with debugging accidental uses of uninitialised memory.

But what if you don’t set those flags?

What does “uninitialised memory” contain?

It would be bad if “uninitialised memory” contained memory from other processes – previous processes that had owned memory but were now defunct – because that would potentially mean that your new process had access to secrets that it shouldn’t.

So, “uninitialised memory” has to be initialised to something, at least the first time it is accessed.

Is it really going to be initialised to random values? That would be such a huge waste of processor time – and anyway, we’re looking at this from the point of view of a cryptographic process, which needs to have strongly random numbers.

No, random would be bad. Perhaps in some situations, the memory will be filled with copies of ‘public’ data – environment variables, say. But most likely, because it’s a fast easy thing to do, uninitialised memory will be filled with zeroes.

Of course, after a few functions are called, and returned from, and after a few variables are created and go out of scope, the stack will contain values indicative of the course that the program has taken so far – it may look randomish, but it will probably vary very little, if any, from one execution of the program to another.

In the absence of a random number seed file, or a random number generator providing /dev/urand or /dev/random, then, an OpenSSL key is going to have a 1 in 32,768 chance of being the same as a key created on a similar build of OpenSSL – higher, if you consider that most PIDs fall in a smaller range.

So, here’s some lessons to learn about compiling other people’s cryptographic code:

  1. Don’t ever compile cryptographic code in release mode, because you will optimize away lines that clear secrets from memory.
  2. Don’t ever compile cryptographic code in debug mode, because you will initialize memory that is expected to be uninitialised and random.
  3. Don’t ever modify cryptographic code, even if it throws up warnings. You don’t understand what you’re doing.
  4. Don’t ever compile cryptographic code, because you don’t know what you are doing.

Why I use CryptoAPI

This is one reason why I prefer to use Microsoft’s CryptoAPI, rather than libraries such as OpenSSL. There are others:

  1. It’s not my fault if something goes wrong with the crypto.
  2. The users will apply patches to the crypto, and I don’t have to go persuading my users to apply the patches.
  3. There’s a central place where administrators will expect to find crypto keys, and it’s well-protected.
  4. The documentation for CryptoAPI is far better than the documentation for OpenSSL, which is at best confusing, and at worst, non-existent.

In fairness, there are reasons not to use CryptoAPI:

  1. New algorithms are made available for new versions of Windows, and not backported readily to older versions. With a library you ship, you get to decide which version customers can run – unless someone else comes and installs another version.
  2. Microsoft’s documentation is better, but it’s still not perfect. Once in a while, it’s not even correct. At least if you have the source code, and are insanely motivated, you can find out what the truth of a matter is.

We’ll still be learning lessons for a while…

The lessons to learn from this episode are almost certainly not yet over. I expect someone to find in the next few weeks that OpenSSL with no extra source of entropy on some operating system or family of systems generates easily guessed keys, even using the “uninitialised memory” as entropy. I wait with ‘bated breath.

In Defence of the Self-Signed Certificate


Recently I discussed using EFS as a simple, yet reliable, form of file encryption. Among the doubts raised was the following from an article by fellow MVP Deb Shinder on EFS:

EFS generates a self-signed certificate. However, there are problems inherent in using self-signed certificates:

  • Unlike a certificate issued by a trusted third party (CA), a self-signed certificate signifies only self-trust. It’s sort of like relying on an ID card created by its bearer, rather than a government-issued card. Since encrypted files aren’t shared with anyone else, this isn’t really as much of a problem as it might at first appear, but it’s not the only problem.
  • If the self-signed certificate’s key becomes corrupted or gets deleted, the files that have been encrypted with it can’t be decrypted. The user can’t request a new certificate as he could do with a CA.

Well, she’s right, but that only really gives a part of the picture, and it verges on out-and-out recommending that self-signed certificates are completely untrustworthy. Certainly that’s how self-signed certificates are often viewed.

Let’s take the second item first, shall we?

“Request a new certificate” isn’t quite as simple as all that. If the user has deleted, or corrupted, the private key, and didn’t save a copy, then requesting a new certificate will merely allow the user to encrypt new files, and won’t let them recover old files. [The exception is, of course, if you use something called “Key Recovery” at your certificate authority (CA) – but that’s effectively an automated “save a copy”.]

Even renewing a certificate changes its thumbprint, so to decrypt your old EFS-encrypted files, you should keep your old EFS certificates and private keys around, or use CIPHER to re-encrypt with current certificates.

So, the second point is dependent on whether the CA has set up Key Recovery – this isn’t a problem if you make a copy of your certificate and private key, onto removable storage. And keep it very carefully stored away.

As to the first point – you (or rather, your computer) already trust dozens of self-signed certificates. Without them, Windows Update would not work, nor would many of the secured web sites that you use on a regular basis.



Hey, look – they’ve all got the same thing in “Issued To” as they have in “Issued By”!

Yes, that’s right – every single “Trusted Root” certificate is self-signed!

If you’re new to PKI and cryptography, that’s going to seem weird – but a moment’s thought should set you at rest.

Every certificate must be signed. There must be a “first certificate” in any chain of signed certificates, and if that “first certificate” is signed by anyone other than itself, then it’s not the first certificate. QED.

The reason we trust any non-root certificate is that we trust the issuer to choose to sign only those certificates whose identity can be validated according to their policy.

So, if we can’t trust these trusted roots because of who they’re signed by, why should we trust them?

The reason we trust self-signed certificates is that we have a reason to trust them – and that reason is outside of the certificate and its signature. The majority (perhaps all) of the certificates in your Trusted Root Certificate Store come from Microsoft – they didn’t originate there, but they were distributed by Microsoft along with the operating system, and updates to the operating system.

You trusted the operating system’s original install disks implicitly, and that trust is where the trust for the Trusted Root certificates is rooted. That’s a trust outside of the certificate chains themselves.

So, based on that logic, you can trust the self-signed certificates that EFS issues in the absence of a CA, only if there is something outside of the certificate itself that you trust.

What could that be?

For me, it’s simple – I trust the operating system to generate the certificate, and I trust my operational processes that keep the private key associated with the EFS certificate secure.

There are other reasons to be concerned about using the self-signed EFS certificates that are generated in the absence of a CA, though, and I’ll address those in the next post on this topic.

Can You Write Good Code for an OS you Despise?

No, this isn’t another of my anti-Mac frothing rants.

This is one of my “here’s what I hate about many of the open-source projects I deal with” rants.

I’m trying to find an SFTP client for Windows that works the way I want it to.

All I seem to be able to find are SFTP clients for Unix shoe-horned in to Windows.

[Perhaps the Unix guys feel the same way about playing Halo under Wine.]

What do I mean?

Here’s an example – Windows has a certificate store. It’s well-protected, in that there haven’t been any disclosures of significant vulnerabilities that allow you to read certificates without first having got the credentials that would allow you to do so.

So, I want an SFTP client that lets me store my private keys in the Windows certificate store. Or at least, that uses DPAPI to protect its data.

Can’t find one.

Can’t find ONE. And I’m known for being good at finding stuff.

PuTTY is recommended to me. It, too, requires that the private key be stored in a file, not in the certificate store. Its alternative is to use its own certificate store, called Pageant (it’s an authorization “Age-Ant” for PuTTY, get it?) Maybe I could do something with that – write a variant of Pageant that directly accesses certificates stored in the certificate store.

But no, there’s no protocol definition or API, or service contract that I can see in the documentation, that would allow me to rejigger this. I could edit the source code, but that’s an awful lot of effort compared to building a clean implementation of only those parts of the API that I’d need.

What I do find in the documentation for Pageant are comments such as these:

  • Windows unfortunately provides no way to protect pieces of memory from being written to the system swap file. So if Pageant is holding your private keys for a long period of time, it’s possible that decrypted private key data may be written to the system swap file, and an attacker who gained access to your hard disk later on might be able to recover that data. (However, if you stored an unencrypted key in a disk file they would certainly be able to recover it.)
  • Although, like most modern operating systems, Windows prevents programs from accidentally accessing one another’s memory space, it does allow programs to access one another’s memory space deliberately, for special purposes such as debugging. This means that if you allow a virus, trojan, or other malicious program on to your Windows system while Pageant is running, it could access the memory of the Pageant process, extract your decrypted authentication keys, and send them back to its master.

I’ll address the second comment first – it’s a strange way of noting that Windows, like other modern operating systems, assumes that every process run by the user has the same access as the user. Typically, this is addressed by simply minimising the amount of time that a secret is held in memory in its decrypted form, and using something like DPAPI to store the secret encrypted.

The first comment, though, indicates a lack of experience with programming for Windows, and an inability to search. Five minutes at http://msdn.microsoft.com gets you a reference to VirtualLock, which allows you to lock 4kB at a time into physical memory, aka non-paged pool. Of course, there are other options – encrypting the Pagefile using EFS also helps protect against this kind of attack, and the aforementioned trick of holding the secret decrypted in memory for as short a time as possible also reduces the risk of having it exposed.

Now I’m really stretching to assert that this single author despises Windows and that’s why he’s completely unaware of some of its obvious security features and common modes of use. But it does seem to be a trend prevalent in some of the more religious of open source developers – “Windows sucks because it can’t do X, Y and Z” – without actually learning for certain whether that’s true. Often, X and Y can be done, and Z is only necessary on other operating systems due to quirks of their design.

Back when I first started writing Windows server software, the same religious folks would tell me “don’t bother writing servers for Windows – it’s not stable enough”. True enough, Windows 3.1 wasn’t exactly blessed with great uptime. But instead of saying “you can’t build a server on Windows”, I realised that there was a coming market in Windows NT, which was supposed to be server class. So I wrote for Windows NT, I assumed it was capable of server functionality, and any time I felt like I’d hit a “Windows can’t do this”, I bugged Microsoft until they fixed it.

Had I simply walked away and gone to a different platform, I’d be in a different place – but my point is that if you believe that your target OS is incapable, you will find it to be so. If you believe it should be capable, you will find it to be so.