What does “input validation” even mean any more?

Information Security is full of terminology.

Sometimes we even understand what we mean. I’ve yet to come across a truly awesome, yet brief, definition of “threat”, for instance.

But one that bugs me, because it shouldn’t be that hard to get right, and because I hear it from people I otherwise respect greatly, is that of “input validation”.

What is “validation”?

Fight me on this, but I think that validation is essentially a yes/no decision on a set of input, whether it’s textual, binary, or whatever other format you care to define.

Exactly what you are validating is up for debate, whether you’re looking at syntax or semantics – is it formatted correctly, versus does it actually make sense?

Syntax versus semantics

“Green ideas sleep furiously” is a famous example of a sentence that is syntactically correct – it follows a standard “Adjective noun verb adverb” pattern that is common in English – but semantically, it makes no sense: ideas can’t be green, and they can’t sleep, and nothing can sleep furiously (although my son used to sleep with his fists clenched really tight when he was a little baby).

“0 / 0” is a syntactically correct mathematical expression, but you can argue if it’s semantically correct.

“Sell 1000 shares” might be a syntactically correct instruction, but semantically, it could be you don’t have 1000 shares, or there’s a business logic limit, which says such a transaction requires extra authentication.

So there’s a difference between syntactical validation and semantic validation, but…

What’s that got to do with injection exploits?

Injection attacks occur when an input data – a string of characters – is semantically valid in the language of the enclosing code, as code itself, and not just as data. Sometimes (but not always) this means the data contains a character or character sequence that allows the data to “escape” from its data context to a code context.

Can validation stop injection exploits?

This is a question I ask, in various round-about ways, in a lot of job interviews, so it’s quite an important question.

The answer is really simple.

Yes. And no.

If you can validate your input, such that it is always syntactically and semantically correct, you can absolutely prevent injection exploits.

But this is really only possible for relatively simple sets of inputs, and where the processing is safe for that set of inputs.

How about an example?

An example – suppose I’ve got a product ordering site, and I’m selling books.

You can order an integer number of books. Strictly speaking, positive integers, and 0 makes no sense, so start at 1. You probably want to put a maximum limit on that field, perhaps restricting people to buying no more than a hundred of that book. If they’re buying more, they’ll want to go wholesale anyway.

So, your validation is really simple – “is the field an integer, and is the integer value between 1 and 100?”

What about a counter-example?

Having said “yes, and no”, I have to show you an example of the “no”, right?

OK, let’s say you’re asking for validation of names of people – what’s your validation rules?

Let’s assume you’re expecting everyone to have ‘latinised’ their name, to make it easy. All the letters are in the range a-z, or A-Z if there’s a capital letter.

Great, so there’s a rule – only match “[A-Za-z]”

Unless, you know, Leonardo da Vinci. Or di Caprio. So you need spaces.

Or Daniel Day-Lewis. So there’s also hyphens to add.

And if you have an O’Reilly, an O’Brian, or a D’Artagnan, or a N’Dour – yes, you’re going to add apostrophes.

Now your validation rule is letting in a far broader range of characters than you start out with, and there’s enough there to allow for SQL injection to happen.

Input can now be syntactically correct by your validation rule, and yet semantically equivalent to data plus SQL code.

Validation alone is insufficient to block injection attacks.

Why do people say validation is sufficient, then?

I have a working hypothesis. It goes like this.

As a neophyte in information security, you learn a trick.

That trick is validation, and it’s a great thing to share with developers.

They don’t need to be clever or worry hard about the input that comes in, they simply need to validate it.

It actually feels good to reject incorrect input, because you know you’re keeping the bad guys out, and the good guys in.

Then you find an input field where validation alone isn’t sufficient.

Something else must be done

But you’ve told everyone – and had other security folk agree with you – that validation is the way to solve injection attacks.

So you learn a new trick – a new way of protecting inputs.

And you call this ‘validation’, too

After all, it … uhh, kind of does the same thing. It stops injection attacks, so it must be validation.

What is this ‘new trick’?

This new trick is encoding, quoting, or in some way transforming the data, so the newly transformed data is safe to accept.

Every one of those apostrophes? Turn them into the sequence “'” if they’re going into HTML, or double them if they’re in a SQL string, or – and this is FAR better – use parameterised queries so you don’t have to even know how the input string is being encoded on its way into the SQL command.

Now your input can be validated – and injection attacks are stopped.

But it’s not validation any more

In fact, once you’ve encoded your inputs properly, your validation can be entirely open and empty! At least from the security standpoint, because you’ve made the string semantically entirely meaningless to the code in which it is to be embedded as data. There are no escape characters or sequences, because they, too, have been encoded or transformed into semantically safe data.

It’s encoding… or transformation.

And I happen to think it’s important to separate the two concepts of validation and encoding.

Validation is saying “yes” or “no” to the question “is this string ‘good’ data?” You can validate in a number of different ways, and with good defence in depth, you’ll validate at different locations, based on different knowledge about what is “good”. This matches very strongly with the primary dictionary definition of “validation” – it’s awesome when a technical term matches very closely with a common language term, because teaching it to others becomes easier.

Encoding doesn’t say “yes” or “no”, encoding simply takes whatever input it’s given, and makes it safe for the next layer to which the data will be handed.

Stop calling encoding “validation”

It’s not.

Padding Oracle 3–making it usable

Just a quick note, because I’ve been sick this week, but last weekend, I put a little more work into my Padding Oracle exploit tool.

You can find the new code up at https://github.com/alunmj/PaddingOracle, and because of all the refactoring, it’s going to look like a completely new batch of code. But I promise that most of it is just moving code from Program.cs into classes, and adding parsing of command-line arguments.

I don’t pretend to be the world’s greatest programmer by any stretch, so if you can tell me a better way to do what I’ve done here, do let me know, and I’ll make changes and post something about them here.

Also, please let me know if you use the tool, and how well it worked (or didn’t!) for you.

Arguments

The arguments currently supported are:

URL

The only parameter unadorned with an option letter – this is the URL for the resource the Padding Oracle code will be pounding to test guesses at the encrypted code.

-c ciphertext

Also, –cipher. This provides a .NET regular expression which matches the ciphertext in the URL.

-t encoding:b64|b64URL|hex|HEX

Also, –textencoding, –encoding. This sets the encoding that’s used to specify the ciphertext (and IV) in the URL. The default is b64

  • b64 – standard base64, URL encoded (so ‘=’ is ‘%3d’, ‘+’ is ‘%2b’, and ‘/’ is ‘%2f’)
  • b64URL – “URL safe” base64, which uses ‘!’, ‘-‘ and ‘~’ instead of the base64 characters that would be URL encoded.
  • hex – hexadecimal encoding with lower case alphabetic characters a-f.
  • HEX – hexadecimal encoding with upper case alphabetic characters A-F.

-i iv

Also, –iv. This provides a .NET regular expression which matches the IV in the URL if it’s not part of the ciphertext.

-b blocksize

Also, –blocksize. This sets the block size in bytes for the encryption algorithm. It defaults to 16, but should work for values up to 32.

-v

Also, –verbose. Verbose – output information about the packets we’re decrypting, and statistics on speed at the end.

-h

Also, –help. Outputs a brief help message

-p parallelism:-1|1|#

Also –parallelism. Dictates how much to parallelise. Specifying ‘1’ means to use one thread, which can be useful to see what’s going on. –1 means “maximum parallelisation” – as many threads as possible. Any other integer is roughly akin to saying “no more than this number of threads”, but may be overridden by other aspects of the Windows OS. The default is –1.

-e encryptiontext

Instead of decrypting, this will encrypt the provided text, and provide a URL in return that will be decrypted by the endpoint to match your provided text.

Examples

These examples are run against the WebAPI project that’s included in the PadOracle solution.

Example 1

Let’s say you’ve got an example URL like this:

http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d

This strongly suggests (because who would use “iv” and “ciphertext” to mean anything other than the initialisation vector and cipher text?) that you have an IV and a ciphertext, separate from one another. We have the IV, so let’s use it – here’s the command line I’d try:

PadOracle "http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d" -c "087gb.*%3d%3d" –i "WnfvRL.*2Q%3d%3d"

This is the result of running that command:

capture20181111175736366

Notes:

  • The IV and the Ciphertext both end in Q==, which means we have to specify the regular expressions carefully to avoid the expression being greedy enough to catch the whole query string.
  • I didn’t use the “-v” output to watch it run and to get statistics.
  • That “12345678” at the end of the decrypted string is actually there – it’s me trying to push the functionality – in this case, to have an entirely padding last block. [I should have used the letter “e” over and over – it’d be faster.]

Example 2

Same URL, but this time I want to encrypt some text.

Our command line this time is:

PadOracle "http://localhost:31140/api/encrypted/submit?iv=WnfvRLbKsbYufMWXnOXy2Q%3d%3d&ciphertext=087gbLKbFeRcyPUR2tCTajMQAeVp0r50g07%2bLKh7zSyt%2fs3mHO96JYTlgCWsEjutmrexAV5HFyontkMcbNLciPr51LYPY%2f%2bfhB9TghbR9kZQ2nQBmnStr%2bhI32tPpaT6Jl9IHjOtVwI18riyRuWMLDn6sBPWMAoxQi6vKcnrFNLkuIPLe0RU63vd6Up9XlozU529v5Z8Kqdz2NPBvfYfCQ%3d%3d" -c "087gb.*%3d%3d" –i "WnfvRL.*2Q%3d%3d" –e "Here’s some text I want to encrypt"

When we run this, it warns us it’s going to take a very long time, and boy it’s not kidding – we don’t get any benefit from the frequency table, and we can’t parallelise the work.

capture20181111215602359

And you can see it took about two hours.

One Simple Thing

I’ve been a little absent from this blog for a while, mostly because I’ve been settling in to a new job where I’ve briefly changed my focus almost completely from application security to being a software developer.

The blog absence is going to change now, and I’d like to start that with a renewed effort to write something every week. In addition to whatever grabs my attention from the security news feeds I still suck up, I want to get across some of knowledge and approaches I’ve used while working as an application security guy. I’ll likely be an application security guy in my next job, whenever that is, so it’ll stand me in good stead to write what I think.

“One Simple Thing”

The phrase “One Simple Thing” underscores what I try to return to repeatedly in my work – that if you can get to the heart of what you’re working on, everything else flows easily and smoothly.

This does not mean that there’s only one thing to think about with regards to security, but that when you start asking clarifying questions about the “one simple thing” that drives – or stops – a project in the moment, it’s a great way to make tremendous progress.

The Simplest Security Thing

I’ll start by discussing the One Simple Thing I pick up by default whenever I’m given a security challenge.

What are we protecting?

This is the first question I ask on joining a new security team – often as early as the first interviews. Everyone has a different answer, and it’s a great way to find out what approaches you’re likely to encounter. The question also has several cling-on questions that it demands be asked and answered at the same time:

Why are we protecting it?

Who are we protecting it from?

Why do they want it?

Why shouldn’t they get it?

What are our resources?

These come very quickly out of the One Simple Thing of “what are we protecting?”.

Here’s some typical answers:

  • Our systems need to continue running and be accessible at all times
  • We need to keep making money
  • We need to stop [the risk of] losing money
  • Our customers trust us, and we need that to continue
  • We operate in a regulatory compliance climate, and need to keep our lawsuits down to close to zero
  • We don’t even have a product yet, and haven’t a clue what we need to secure
  • We don’t want a repeat of last week’s / month’s / year’s very public security / privacy debacle
  • We want to balance the amount we spend on security staff against the benefit it has to our bottom line – and demonstrate results

You can see from the selection of answers that not everyone has anything like the same approach, and that they don’t all line up exactly under the typical buckets of Confidentiality, Integrity and Availability.

Do you think someone can solve your security issues or set up a security team without first finding out what it is you’re protecting?

Do you think you can engage with a team on security issues without understanding what they think they’re supposed to be protecting?

Answers vary

You’ve seen from my [short] list above that there are many answers to be had between different organisations and companies.

I’d expect there to be different answers within an organisation, within a team, within a meeting room, and even depending on the time I ask the question.

“What are we protecting” on the day of the Equifax leak quickly becomes a conversation on personal data, and the damaging effect of a leak to “customers”. [I prefer to call them “data subjects”, because they aren’t always your customers.]

On the day that Yahoo gets bought by Verizon for substantially less than initially offered, the answer becomes more about company value, and even perhaps executive stability.

Next time you’re confused by a security problem, step back and ask yourself – and others – “What are we protecting?” and see how much it clarifies your understanding.

Lessons from my first official bounty

Sometimes, it’s just my job to find vulnerabilities, and while that’s kind of fun, it’s also a little unexciting compared to the thrill of finding bugs in other people’s software and getting an actual “thank you”, whether monetarily or just a brief word.

About a year ago, I found a minor Cross-Site Scripting (XSS) flaw in a major company’s web page, and while it wasn’t a huge issue, I decided to report it, as I had a few years back with a similar issue in the same web site. I was pleased to find that the company was offering a bounty programme, and simply emailing them would submit my issue.

Lesson 1: The WAF… it does nothing!

The first thing to notice, as with all XSS issues, is that there were protections in place that had to be got around. In this case, some special characters or sequences were being blocked. But not all. And it’s really telling that there are still many websites which have not implemented widespread input validation / output encoding as their XSS protection. So, while the WAF slowed me down even when I knew the flaw existed, it only added about 20 minutes to the exploit time. So, my example had to use “confirm()” instead of “alert()” or “prompt()”. But really, if I was an attacker, my exploit wouldn’t have any of those functions, and would probably include an encoded script that wouldn’t be detected by the WAF either. WAFs are great for preventing specific attacks, but aren’t a strong protection against an adversary with a little intelligence and understanding.

Lesson 2: A quick response is always welcome

My email resulted in an answer that same day, less than an hour after my initial report. A simple “thank you”, and “we’re forwarding this to our developers” goes a long way to keeping a security researcher from idly playing with the thought of publishing their findings and moving on to the next game.

Lesson 3: The WAF… still does nothing!

In under a week, I found that the original demo exploit was being blocked by the WAF – but if I replaced “onclick” with “oclick”, “onmouseover” with “omouseover”, and “confirm” with “cofirm”, I found the blocking didn’t get in the way. Granted, since those aren’t real event handlers or JavaScript functions, I can’t use those in a real exploit, but it does mean that once again, all the WAF does is block the original example of the attack, and it took only a few minutes again to come up with another exploit string.

Lesson 4: Keep communicating

If they’d told me “hey, we’re putting in a WAF rule while we work on fixing the actual bug”, I wouldn’t have been so eager to grump back at them and say they hadn’t fixed the issue by applying their WAF and by the way, here’s another URL to exploit it. But they did at least respond to my grump and reassure me that, yes, they were still going to fix the application.

Lesson 5: Keep communicating

I heard nothing after that, until in February of this year, over six months later, I replied to the original thread and asked if the report qualified for a bounty, since I noticed that they had actually fixed the vulnerability.

Lesson 6: Keep communicating

No response. Thinking of writing this up as an example of how security researchers still get shafted by businesses – bear in mind that my approach is not to seek out bounties for reward, but that I really think it’s common courtesy to thank researchers for reporting to you rather than pwning your website and/or your customers.

Lesson 7: Use a broker

About a month later, while looking into other things, I found that the company exists on HackerOne, where they run a bug bounty. This renewed my interest in seeing this fixed. So I reported the email exchange from earlier, noted that the bug was fixed, and asked if it constituted a rewardable finding. Again, a simple “thanks for the report, but this doesn’t really rise to the level of a bounty” is something I’ve been comfortable with from many companies (though it is nice when you do get something, even if it’s just a keychain or a t-shirt, or a bag full of stickers).

Lesson 8: Keep communicating

3/14: I got a reply the next day, indicating that “we are investigating”.

3/28: Then nothing for two weeks, so I posted another response asking where things were going.

4/3: Then a week later, a response. “We’re looking into this and will be in touch soon with an update.”

4/18: Me: Ping?

5/7: Me: Hey, how are we doing?

5/16: Anything happening?

Lesson 9: Deliver

5/18: Finally, over two months after my report to the company through HackerOne, and ten months after my original email to the first bug bounty address, it’s addressed.

5/19: The severity of the bug report is lowered (quite rightly, the questionnaire they used pushed me to a priority of “high”, which was by no means warranted). A very welcome bounty, and a bonus for my patience – unexpected but welcome, are issued.

Lesson 10: Learn

The cheapest way to learn things is from someone else’s mistakes. So I decided to share with my readers the things I picked up from this experience.

Here are a few other lessons I’ve picked up from bug bounties I’ve observed:

Lesson 11: Don’t start until you’re ready

If you start a bug bounty, consider how ready you might be. Are you already fixing all the security bugs you can find for yourself? Are you at least fixing those bugs faster than you can find more? Do your developers actually know how to fix a security bug, or how to verify a vulnerability report? Do you know how to expand on an exploit, and find occurrences of the same class of bug? [If you don’t, someone will milk your bounty programme by continually filing variations on the same basic flaw]

Lesson 12: Bug Bounty programmes may be the most expensive way to find vulnerabilities

How many security vulnerabilities do you think you have? Multiply that by an order of magnitude or two. Now multiply that by the average bounty you expect to offer. Add the cost of the personnel who are going to handle incoming bugs, and the cost of the projects they could otherwise be engaged in. Add the cost of the developers, whose work will be interrupted to fix security bugs, and add the cost of the features that didn’t get shipped on time before they were fixed. Sure, some of that is just a normal cost of doing business, when a security report could come at you out of the blue and interrupt development until it’s fixed, but starting a bug bounty paints a huge target on you.

Hiring a penetration tester, or renting a tool to scan for programming flaws, has a fixed cost – you can simply tell them how much you’re willing to pay, and they’ll work for that long. A bug bounty may result in multiple orders of magnitude more findings than you expected. Are you going to pay them all? What happens when your bounty programme runs out of money?

Finding bugs internally using bug bashes, software scanning tools or dedicated development staff, has a fixed cost, which is probably still smaller than the amount of money you’re considering on putting into that bounty programme.

12.1: They may also be the least expensive way to find vulnerabilities

That’s not to say bug bounties are always going to be uneconomical. At some point, in theory at least, your development staff will be sufficiently good at resolving and preventing security vulnerabilities that are discovered internally, that they will be running short of security bugs to fix. They still exist, of course, but they’re more complex and harder to find. This is where it becomes economical to lure a bunch of suckers – excuse me, security researchers – to pound against your brick walls until one of them, either stronger or smarter than the others, finds the open window nobody saw, and reports it to you. And you give them a few hundred bucks – or a few thousand, if it’s a really good find – for the time that they and their friends spent hammering away in futility until that one successful exploit.

At that point, your bug bounty programme is actually the least expensive tool in your arsenal.

Security Questions are Bullshit

 

I’m pretty much unhappy with the use of “Security Questions” – things like “what’s your mother’s maiden name”, or “what was your first pet”. These questions are sometimes used to strengthen an existing authentication control (e.g. “you’ve entered your password on a device that wasn’t recognised, from a country you normally don’t visit – please answer a security question”), but far more often they are used as a means to recover an account after the password has been lost, stolen or changed.

 

I’ve been asked a few times, given that these are pretty widely used, to explain objectively why I have such little disregard for them as a security measure. Here’s the Too Long; Didn’t Read summary:

  1. Most questions are equivalent to a low-entropy password
  2. Answers to many of these questions can be found online in public documents
  3. Answers can be squeezed out of you, by fair means or foul
  4. The same questions – and answers – are shared at multiple sites
  5. Questions (and answers) don’t get any kind of regulatory protection
  6. Because the best answers are factual and unchanging, you cannot change them if they are exposed

Let’s take them one by one:

Questions are like a low-entropy password

What’s your favourite colour? Blue, or Green. At the outside, red, yellow, orange or purple. That covers most people’s choices, in less than 3 bits of entropy.

What’s your favourite NBA team? There’s 29 of those – 30, if you count the 76ers. That’s 6 bits of entropy.

Obviously, there are questions that broaden this, but are relatively easy to guess with a small number of tries – particularly when you can use the next fact about Security Questions.

The Answers are available online

What’s your mother’s maiden name? It’s a matter of public record.

What school did you go to? If we know where you grew up, it’s easy to guess this, since there were probably only a handful of schools you could possibly have attended.

Who was your first boyfriend/girlfriend? Many people go on about this at length in Facebook posts, I’m told. Or there’s this fact:

You’ll tell people the answers

What’s your porn name? What’s your Star Wars name? What’s your Harry Potter name?

All these stupid quizzes, and they get you to identify something about yourself – the street you grew up on, the first initial of your secret crush, how old you were when you first heard saxophones.

And, of course, because of the next fact, all I really have to do is convince you that you want a free account at my site.

Answers are shared everywhere

Every site that you visit asks you variants of the same security questions – which means that you’ll have told multiple sites the same answers.

You’ve been told over and over not to share your password across multiple sites – but here you are, sharing the security answers that will reset your password, and doing so across multiple sites that should not be connected.

And do you think those answers (and the questions they refer back to) are kept securely by these various sites? No, because:

Questions & Answers are not protected like passwords

There’s regulatory protection, under regimes such as PCI, etc, telling providers how to protect your passwords.

There is no such advice for protecting security questions (which are usually public) and the answers to them, which are at least presumed to be stored in a back-end database, but are occasionally sent to the client for comparison against the answers! That’s truly a bad security measure, because of course you’re telling the attacker.

Even assuming the security answers are stored in a database, they’re generally stored in plain text, so that they can be accessed by phone support staff to verify your answers when you call up crying that you’ve forgotten your password. [Awesome pen-testing trick]

And because the answers are shared everywhere, all it takes is a breach at one provider to make the security questions and answers they hold have no security value at all any more.

If they’re exposed, you can’t change them

There’s an old joke in security circles, “my password got hacked, and now I have to rename my dog”. It’s really funny, because there are so many of these security answers which are matters of historical fact – while you can choose different questions, you can’t generally choose a different answer to the same question.

Well, obviously, you can, but then you’ve lost the point of a security question and answer – because now you have to remember what random lie you used to answer that particular question on that particular site.

And for all you clever guys out there…

Yes, I know you can lie, you can put in random letters or phrases, and the system may take them (“Your place of birth cannot contain spaces” – so, Las Vegas, New York, Lake Windermere are all unusable). But then you’ve just created another password to remember – and the point of these security questions is to let you log on once you’ve forgotten your password.

So, you’ve forgotten your password, but to get it back, you have to remember a different password, one that you never used. There’s not much point there.

In summary

Security questions and answers, when used for password recovery / reset, are complete rubbish.

Security questions are low-entropy, predictable and discoverable password substitutes that are shared across multiple sites, are under- or un-protected, and (like fingerprints) really can’t be changed if they become exposed. This makes them totally unsuited to being used as password equivalents in account recovery / password reset schemes.

If you have to implement an account recovery scheme, find something better to use. In an enterprise, as I’ve said before, your best bet is to use something that the enterprise does well – the management hierarchy. Every time you forget your password, you have to get your manager, or someone at the next level up from them, to reset your password for you, or to vouch for you to tech support. That way, someone who knows you, and can affect your behaviour in a positive way, will know that you keep forgetting your password and could do with some assistance. In a social network, require the

Password hints are bullshit, too

Also, password hints are bullshit. Many of the Adobe breach’s “password hints” were actually just the password in plain-text. And, because Adobe didn’t salt their password hashes, you could sort the list of password hashes, and pick whichever of the password hints was either the password itself, or an easy clue for the password. So, even if you didn’t use the password hint yourself, or chose a really cryptic clue, some other idiot came up with the same password, and gave a “Daily Express Quick Crossword” quality clue.

How to do password reset

First, a quick recap:

Credentials include a Claim and a Proof (possibly many).

The Claim is what states one or more facts about your identity.

A Username is one example of a Claim. So is Group Membership, Age, Eye Colour, Operating System, Installed Software, etc…

The Proof is what allows someone to reliably trust the Claim is true.

A Password is one example of a Proof. So is a Signature, a Passport, etc…

Claims are generally public, or at least non-secret, and if not unique, are at least specific (e.g. membership of the group “Brown eyes” isn’t open to people with blue eyes).

Proofs are generally secret, and may be shared, but such sharing should not be discoverable except by brute force. (Which is why we salt passwords).

Now, the topic – password resets

Password resets can occur for a number of reasons – you’ve forgotten your password, or the password change functionality is more cumbersome than the password reset, or the owner of the account has changed (is that allowable?) – but the basic principle is that an account needs a new password, and there needs to be a way to achieve that without knowledge of the existing password.

Let’s talk as if it’s a forgotten password.

So we have a Claim – we want to assert that we possess an identity – but we have to prove this without using the primary Proof.

Which means we have to know of a secondary Proof. There are common ways to do this – alternate ID, issued by someone you trust (like a government authority, etc). It’s important in the days of parody accounts, or simply shared names (is that Bob Smith, his son, Bob Smith, or his unrelated neighbour, Bob Smith?) that you have associated this alternate ID with the account using the primary Proof, or as a part of the process of setting up the account with the primary Proof. Otherwise, you’re open to account takeover by people who share the same name as their target.

And you can legally change your name.

What’s the most common alternate ID / secondary Proof?

E-mail.

Pretty much every public web site relies on the use of email for password reset, and uses that email address to provide a secondary Proof.

It’s not enough to know the email address – that’s unique and public, and so it matches the properties of a Claim, not a Proof, of identity.

We have to prove that we own the email address.

It’s not enough to send email FROM the email address – email is known to be easily forged, and so there’s no actual proof embodied in being able to send an email.

That leaves the server with the prospect of sending something TO the email address, and the recipient having proved that they received it.

You could send a code-word, and then have the recipient give you the code-word back. A shared secret, if you like.

And if you want to do that without adding another page to the already-too-large security area of the site, you look for the first place that allows you to provide your Claim and Proof, and you find the logon page.

By reusing the logon page, you’re going to say that code-word is a new password.

[This is not to say that email is the only, or even the best, way to reset passwords. In an enterprise, you have more reliable proofs of identity than an email provider outside of your control. You know people who should be able to tell you with some surety that a particular person is who they claim to be. Another common secondary identification is the use of Security Questions. See my upcoming article, “Security Questions are Bullshit” for why this is a bad idea.]

So it’s your new password, yes?

Well, yes and no. No, actually. Pretty much definitely no, it’s not your new password.

Let’s imagine what can go wrong. If I don’t know your password, but I can guess your username (because it’s not secret), I can claim to be you wanting to reset your password. That not only creates opportunity for me to fill your mailbox with code-words, but it also prevents you from logging on while the code-words are your new password. A self-inflicted denial of service.

So your old password should continue working, and if you never use the code-word, because you’re busy ignoring and deleting the emails that come in, it should keep working for you.

I’ve frequently encountered situations in my own life where I’ve forgotten my password, gone through the reset process, and it’s only while typing in the new password, and being told what restrictions there are on characters allowed in the new password, that I remember what my password was, and I go back to using that one.

In a very real sense, the code-word sent to you is NOT your new password, it’s a code-word that indicates you’ve gone the password reset route, and should be given the opportunity to set a new password.

Try not to think of it as your “temporary password”, it’s a special flag in the logon process, just like a “duress password”. It doesn’t replace your actual password.

It’s a shared secret, so keep it short-lived

Shared secrets are fantastic, useful, and often necessary – TLS uses them to encrypt data, after the initial certificate exchange.

But the trouble with shared secrets is, you can’t really trust that the other party is going to keep them secret very long. So you have to expire them pretty quickly.

The same is true of your password reset code-word.

In most cases, a user will forget their password, click the reset link, wait for an email, and then immediately follow the password reset process.

Users are slow, in computing terms, and email systems aren’t always directly linked and always-connected. But I see no reason why the most usual automated password reset process should allow the code-word to continue working after an hour.

[If the process requires a manual step, you have to count that in, especially if the manual step is something like “contact a manager for approval”, because managers aren’t generally 24/7 workers, the code-word is going to need to last much longer. But start your discussion with an hour as the base-point, and make people fight for why it’ll take longer to follow the password reset process.]

It’s not a URL

You can absolutely supply a URL in the email that will take the user to the right page to enter the code-word. But you can’t carry the code-word in the URL.

Why? Check out these presentations from this year’s Black Hat and DefCon, showing the use of a malicious WPAD server on a local – or remote – network whose purpose is to trap and save URLs, EVEN HTTPS URLs, and their query strings.

Every URL you send in an email is an HTTP or HTTPS GET, meaning all the parameters are in the URL or in the query string portion of the URL.

This means the code-word can be sniffed and usurped if it’s in the URL. And the username is already assumed to be known, since it’s a non-secret. [Just because it’s assumed to be known, don’t give the attacker an even break – your message should simply say “you requested a password reset on an account at our website” – a valid request will come from someone who knows which account at your website they chose to request.]

So, don’t put the code-word in the URL that you send in the email.

Log it, and watch for repeats, errors, other bad signs

DON’T LOG THE PASSWORD

I have to say that, because otherwise people do that, as obviously wrong as it may seem.

But log the fact that you’ve changed a password for that user, along with when you did it, and what information you have about where the user reset their password from.

Multiple users resetting their password from the same IP address – that’s a bad sign.

The same user resetting their password multiple times – that’s a bad sign.

Multiple expired code-words – that’s a bad sign.

Some of the bad things being signaled include failures in your own design – for instance, multiple expired code-words could mean that your password reset function has stopped working and needs checking. You have code to measure how many abandoned shopping carts you have, so include code that measures how many abandoned password reset attempts you have.

In summary, here’s how to do email password reset properly

  1. Consider whether email is an appropriately reliable secondary proof of identification.
    1. A phone call from the user, to their manager, followed by the manager resetting the user’s password is trustworthy, and self-correcting. A little slow and awkward, but stronger.
    2. Other forms of identification are almost all stronger and more resistant to forgery and attack than email.
    3. Bear in mind that, by trusting to email, you’re trusting that someone who just forgot their password can remember their other password, and hasn’t shared it, or used an easily-guessed password.
  2. When an account is created, associate it with an email address, and allow & encourage the validated account holder to keep that updated.
  3. Allow anyone to begin a password reset process – but don’t allow them to repeatedly do so, for fear of allowing DoS attacks on your users.
  4. When someone has begun a password reset process, send the user a code-word in email. This should be a random sequence, of the same sort of entropy as your password requirements.
    1. Encoding a time-stamp in the code-word is useful.
    2. Do not put the code-word in a URL in the message.
    3. Do not specify the username in the message, not even in the URL.
  5. Do not change the user’s password for them yet. This code-word is NOT their new password.
  6. Wait for up to an hour (see item 4.1. – the time stamp) during which you’ll accept the code-word. Be very cautious about extending this period.
  7. The code-word, when entered at the password prompt for the associated user, is an indication to change the password, and that’s ALL it can be used for.
  8. Allow the user to cancel out of the password change, up to the point where they have entered a new password and its verification code and hit OK.

And here’s how you’re going to screw it up (don’t do these!)

  • You’ll use email for password resets on a high-security account. Email is relatively low-security.
  • You’ll use email when there’s something more reliable and/or easier to use, because “it’s what everyone does”.
  • You’ll use email to an external party with whom you don’t have a business relationship, as the foundation on which you build your corporate identity tower of cards.
  • You won’t have a trail of email addresses associated with the user, so you don’t have reason to trust you’re emailing the right user, and you won’t be able to recover when an email address is hacked/stolen.
  • You’ll think the code-word is a new password, leading to:
    • Not expiring the code-word.
    • Sending passwords in plain-text email (“because we already do it for password reset”).
    • Invalidating the user’s current password, even though they didn’t want it reset.
  • You’ll include the credentials (username and/or password) in a URL in the email message, so it can be picked up by malicious proxies.
  • You’ll fail to notice that someone’s DoSing some or all of your users by repeatedly initiating a password reset.
  • You’ll fail to notice that your password reset process is failing, leading to accounts being dropped as people can no longer get in.
  • You won’t include enough randomness in your code-word, so an attacker can reset two accounts at once – their own, and a victim’s, and use the same code-word to unlock both.
  • You’ll rely on security questions
  • You’ll use a poor-quality generation on your code-words, leading to easy predictability, reuse, etc.
  • You won’t think about other ways that your design will fail outside of this list here.

Let the corrections begin!

Did I miss something, or get something wrong? Let me know by posting a comment!

The Automatic Gainsaying of Anything the Other Person Says

Sometimes I think that title is the job of the Security Engineer – as a Subject Matter Expert, we’re supposed to meet with teams and tell them how their dreams are going to come crashing down around their ears because of something they hadn’t thought of, but which is obvious to us.

This can make us just a little bit unpopular.

But being argumentative and sceptical isn’t entirely a bad trait to have.

Sometimes it comes in handy when other security guys spread their various statements of doom and gloom – or joy and excitement.

Examples in a single line

“Rename your administrator account so it’s more secure” – or lengthen the password and achieve the exact same effect without breaking scripts or requiring extra documentation so people know what the administrator is called this week.

“Encrypt your data at rest by using automatic database encryption” – which means any app that authenticates to the database can read that data back out, voiding the protection that was the point of encrypting at rest. If fields need encrypting, maybe they need field-level access control, too.

“Complex passwords, one lower case, one upper case, one number, one symbol, no repeated letters” – or else, measure strength in more interesting ways, and display to users how strong their password is, so that a longish phrase, used by a competent typist, becomes an acceptable password.

Today’s big example – password expiration

Now I’m going to commit absolute heresy, as I’m going against the biggest recent shock news in security advice.

capture20160919063258336capture20160919063534856

I understand the arguments, and I know I’m frequently irritated with the unnecessary requirements to change my password after sixty days, and even more so, I know that the reasons behind password expiration settings are entirely arbitrary.

But…

There’s a good side to password expiry.

capture20160919063926113

These aren’t the only ways in which passwords are discovered.

The method that frequently gets overlooked is when they are deliberately shared.

Sharing means scaring

“Bob’s out this week, because his mother died, and he has to arrange details in another state. He didn’t have time to set up access control changes before he left, but he gave me a sticky-note with his password on it, so that we don’t need to bother him for anything”

“Everyone on this team has to monitor and interact with so many shared service accounts, we just print off a list of all the service account passwords. You can photocopy my laminated card with the list, if you like.”

Yes, those are real situations I’ve dealt with, and they have some pretty obvious replacement solutions:

Bob

Bob (or Bob’s manager, if Bob is too distraught to talk to anyone, which isn’t at all surprising) should notify a system administrator who can then respond to requests to open up ACLs as needed, rather than someone using Bob’s password. But he didn’t.

When Bob comes back, is he going to change his password?

No, because he trusts his assistant, Dave, with his communications.

But, of course, Dave handed out his password to the sales VP, because it was easier for him than fetching up the document she wanted. And sales VPs just can’t be trusted. Now the entire sales team knows Bob’s password. And then one of them gets fired, or hired on at a new competitor. The temptation to log on to Bob’s account – just once – is immense, because that list of customers is just so enticing. And really, who would ever know? And if they did know, everyone has Bob’s password, so it’s not like they could prosecute you, because they couldn’t prove it was you.

What’s going to save Bob is if he is required to change his password when he returns.

Laminated Password List

Yes, this also happened. Because we found the photocopy of the laminated sheet folded up on the floor of a hallway outside the lavatory door.

There was some disciplining involved. Up to, and possibly including, the termination of employment, as policy allows.

Then the bad stuff happened.

The team who shared all these passwords pointed out that, as well as these being admin-level accounts, they had other special privileges, including the avoidance of any requirement to change passwords.

These passwords hadn’t changed in six years.

And the team had no idea what would break if they changed the passwords.

Maybe one of those passwords is hard-coded into a script somewhere, and vital business processes would grind to a halt if the password was changed.

When I left six months later, they were still (successfully) arguing that it would be too dangerous to try changing the passwords.

To change a behaviour, you must first accept it

I’m not familiar with any company that acknowledges in policy that users share passwords, nor the expected behaviour when they do [log when you shared it, and who you shared it with, then change it as soon as possible after it no longer needs to be shared].

Once you accept that passwords are shared for valid reasons, even if you don’t enumerate what those reasons are, you can come up with processes and tools to make that sharing more secure.

If there was a process for Bob to share with Dave what his password is, maybe outlining the creation of a temporary password, reading Dave in on when Dave can share the password (probably never), and how Dave is expected to behave, and becomes co-responsible for any bad things done in Bob’s account, suddenly there’s a better chance Dave’s not going to share. “I can’t give you Bob’s password, but I can get you that document you’re after”

If there was a tool in which the team managing shared service accounts could find and unlock access to passwords, that tool could also be configured to distribute changed passwords to the affected systems after work had been performed.

Without acceptance, it is expiry which saves you

If you don’t have these processes or tools, the only protection you have against password sharing (apart from the obviously failing advice to “just don’t do it”) is regular password expiry.

Password expiry as a Business Continuity Practice

I’m also fond of talking about password expiration as being a means to train your users.

Do you even know how to change your passwords?

Certificates expire once a year, and as a result, programmers write code as if it’s never going to happen. After all, there’s plenty of time between now and next year to write the “renew certificate” logic, and by the time it’s needed, I’ll be working on another project anyway.

If passwords don’t expire – or don’t expire often enough – users will not have changed their password anything like recently enough to remember how to do so if they have to in an emergency.

So, when a keylogger has been discovered to be caching all the logons in the conference room, or the password hashes have been posted on Pastebin, most of your users – even the ones approving the company-wide email request for action – will fight against a password change request, because they just don’t know how to do it, or what might break when they do.

Unless they’ve been through it before, and it were no great thing. In which case, they’ll briefly sigh, and then change their password.

So, password expiry – universally good?

This is where I equivocate and say, yes, I broadly think the advice to reduce or remove password expiration is appropriate. My arguments above are mainly about things we’ve forgotten to remember that are reasons why we might have included password expiry to begin with.

Here, in closing, are some ways in which password expiry is bad, just to balance things out:

  • Some people guard their passwords closely, and generate secure passwords, or use hardware token devices to store passwords. These people should be rewarded for their security skills, not punished under the same password expiry regimen as the guy who adds 1 to his password, and is now up to “p@55w0rd73”.
  • Expiring passwords too frequently drives inexorably to “p@55word73”, and to password reuse across sites, as it takes effort to think up complex passwords, and you’re not going to waste that effort thinking up unique words every couple of months.
  • Password expiration often means that users spend significant time interrupted in their normal process, by the requirement to think up a password and commit it to the multiple systems they use.

Where do we stand?

Right now, this is where we stand:

  • Many big security organisations – NIST, CESG, etc – talk about how password expiry is an outdated and unnecessary concept
  • I think they’re missing some good reasons why password expiry is in place (and some ways in which that could be fixed)
  • Existing security standards that businesses are expected to comply with have NOT changed their stance.

As a result, especially of this last item, I don’t think businesses can currently afford to remove password expiry from their accounts.

But any fool can see which way the wind is blowing – at some point, you will be able to excuse your company from password expiry, but just in case your compliance standard requires it, you should have a very clear and strong story about how you have addressed the risks that were previously resolved by expiring passwords as frequently as once a quarter.

Explaining Blockchains – without understanding them.

I tweeted this the other day, after reading about Microsoft’s Project Bletchley:

I’ve been asked how I can tweet something as specific as this, when in a subsequent tweet, I noted:

[I readily admit I didn’t understand the announcement, or what it’s /supposed/ to be for, but that didn’t stop me thinking about it]

— Alun Jones (@ftp_alun) June 17, 2016

Despite having  reasonably strong background in the use of crypto, and a little dabbling into the analysis of crypto, I don’t really follow the whole “blockchain” thing.

So, here’s my attempt to explain what little I understand of blockchains and their potential uses, with an open invitation to come and correct me.

What’s a blockchain used for?

The most widely-known use of blockchains is that of Bit Coin and other “digital currencies”.

Bit Coins are essentially numbers with special properties, that make them progressively harder to find as time goes on. Because they are scarce and getting scarcer, it becomes possible for people of a certain mindset to ascribe a “value” to them, much as we assign value to precious metals or gemstones aside from their mere attractiveness. [Bit Coins have no intrinsic attractiveness as far as I can tell] That there is no actual intrinsic value leads me to refer to Bit Coin as a kind of shared madness, in which everyone who believes there is value to the Bit Coin shares this delusion with many others, and can use that shared delusion as a basis for trading other valued objects. Of course, the same kind of shared madness is what makes regular financial markets and country-run money work, too.

Because of this value, people will trade them for other things of value, whether that’s shiny rocks, or other forms of currency, digital or otherwise. It’s a great way to turn traceable goods into far less-traceable digital commodities, so its use for money laundering is obvious. Its use for online transactions should also be obvious, as it’s an irrevocable and verifiable transfer of value, unlike a credit card which many vendors will tell you from their own experiences can be stolen, and transactions can be revoked as a result, whether or not you’ve shipped valuable goods.

What makes this an irrevocable and verifiable transfer is the principle of a “blockchain”, which is reported in a distributed ledger. Anyone can, at any time, at least in theory, download the entire history of ownership of a particular Bit Coin, and verify that the person who’s selling you theirs is truly the current and correct owner of it.

How does a blockchain work?

I’m going to assume you understand how digital signatures work at this point, because that’s a whole ‘nother explanation.

Remember that a Bit Coin starts as a number. It could be any kind of data, because all data can be represented as a number. That’s important, later.

The first owner of that number signs it, and then distributes the number and signature out to the world. This is the “distributed ledger”. For Bit Coins, the “world” in this case is everyone else who signs up to the Bit Coin madness.

When someone wants to buy that Bit Coin (presumably another item of mutually agreed similar value exchanges hands, to buy the Bit Coin), the seller signs the buyer’s signature of the Bit Coin, acknowledging transfer of ownership, and then the buyer distributes that signature out to the distributed ledger. You can now use the distributed ledger at any time to verify that the Bit Coin has a story from creation and initial signature, unbroken, all the way up to current ownership.

I’m a little flakey on what, other than a search in the distributed ledger for previous sales of this Bit Coin, prevents a seller from signing the same Bit Coin over simultaneously to two other buyers. Maybe that’s enough – after all, if the distributed ledger contains a demonstration that you were unreliable once, your other signed Bit Coins will presumably have zero value.

So, in this perspective, a blockchain is simply an unbroken record of ownership or provenance of a piece of data from creation to current owner, and one that can be extended onwards.

In the world of financial use, of course, there are some disadvantages – the most obvious being that if I can make you sign a Bit Coin against your well, it’s irrevocably mine. There is no overarching authority that can say “no, let’s back up on that transaction, and say it never happened”. This is also pitched as an advantage, although many Bit Coin owners have been quite upset to find that their hugely-valuable piles of Bit Coins are now in someone else’s ownership.

Now, explain your tweet.

With the above perspective in the back of my head, I read the Project Bletchley report.

I even looked at the pictures.

I still didn’t really understand it, but something went “ping” in my head.

Maybe this is how C-level executives feel.

Here’s my thought:

Businesses get data from customers, users, partners, competitors, outright theft and shenanigans.

Maybe in environments where privacy is respected, like the EU, blockchains could be an avenue by which regulators enforce companies describing and PROVING where their data comes from, and that it was not acquired or used in an inappropriate manner?

When I give you my data, I sign it as coming from me, and sign that it’s now legitimately possessed by you (I won’t say “owned”, because I feel that personal data is irrevocably “owned” by the person it describes). Unlike Bit Coin, I can do this several times with the same packet of data, or different packets of data containing various other information. That information might also contain details of what I’m approving you to do with that information.

This is the start of a blockchain.

When information is transferred to a new party, that transfer will be signed, and the blockchain can be verified at that point. Further usage restrictions can be added.

Finally, when an information commissioner wants to check whether a company is handling data appropriately, they can ask for the blockchains associated with data that has been used in various ways. That then allows the commissioner to verify whether reported use or abuse has been legitimately approved or not.

And before this sounds like too much regulatory intervention, it also allows businesses to verify the provenance of the data they have, and to determine where sensitive data resides in their systems, because if it always travels with its blockchain, it’s always possible to find and trace it.

[Of course, if it travels without its blockchain, then it just looks like you either have old outdated software which doesn’t understand the blockchain and needs to be retired, or you’re doing something underhanded and inappropriate with customers’ data.]

It even allows the revocation of a set of data to be performed – when a customer moves to another provider, for instance.

Yes, there’s the downside of hugely increased storage requirements. Oh well.

Oh, and that revocation request on behalf of the customer, that would then be signed by the business to acknowledge it had been received, and would be passed on to partners – another blockchain.

So, maybe I’ve misunderstood, and this isn’t how it’s going to be used, but I think it’s an intriguing thought, and would love to hear your comments.

Untrusting the Blue Coat Intermediate CA from Windows

So, there was this tweet that got passed around the security community pretty quickly:

Kind of confusing and scary if you’re not quite sure what this all means – perhaps clear and scary if you do.

BlueCoat manufactures “man in the middle” devices – sometimes used by enterprises to scan and inspect / block outbound traffic across their network, and apparently also used by governments to scan and inspect traffic across the network.

The first use is somewhat acceptable (enterprises can prevent their users from distributing viruses or engaging in illicit behaviour from work computers, which the enterprises quite rightly believe they own and should control), but the second use is generally not acceptable, depending on how much you trust your local government.

Filippo helpfully gives instructions on blocking this from OSX, and a few people in the Twitter conversation have asked how to do this on Windows.

Disclaimer!

Don’t do this on a machine you don’t own or manage – you may very well be interfering with legitimate interference in your network traffic. If you’re at work, your employer owns your computer, and may intercept, read and modify your network traffic, subject to local laws, because it’s their network and their computer. If your government has ruled that they have the same rights to intercept Internet traffic throughout your country, you may want to consider whether your government shouldn’t be busy doing other things like picking up litter and contributing to world peace.

The simple Windows way

As with most things on Windows, there’s multiple ways to do this. Here’s one, which can be followed either by regular users or administrators. It’s several steps, but it’s a logical progression, and will work for everyone.

Step 1. Download the certificate. Really, literally, follow the link to the certificate and click “Open”. It’ll pop up as follows:

5-26-2016 3-49-29 PM

Step 2. Install the certificate. Really, literally, click the button that says “Install Certificate…”. You’ll see this prompt asking you where to save it:

5-26-2016 3-49-41 PM

Step 3. If you’re a non-administrator, and just want to untrust this certificate for yourself, leave the Store Location set to “Current User”. If you want to set this for the machine as a whole, and you’re an administrator, select Local Machine, like this:

5-26-2016 3-49-50 PM

Step 4: Click Next, to be asked where you’re putting the certificate:

5-26-2016 3-50-02 PM

Step 5: Select “Place all certificates in the following store”:

5-26-2016 3-50-16 PM

Step 6: Click the “Browse…” button to be given choices of where to place this certificate:

5-26-2016 3-50-23 PM

Step 7: Don’t select “Personal”, because that will explicitly trust the certificate. Scroll down and you’ll see “Untrusted Certificates”. Select that and hit OK:

5-26-2016 3-50-35 PM

Step 8: You’re shown the store you plan to install into:

5-26-2016 3-50-47 PM

Step 9: Click “Next” – and you’ll get a final confirmation option. Read the screen and make sure you really want to do what’s being offered – it’s reversible, but check that you didn’t accidentally install the certificate somewhere wrong. The only place this certificate should go to become untrusted is in the Untrusted Certificates store:

5-26-2016 3-50-52 PM

Step 10: Once you’re sure you have it right, click “Finish”. You’ll be congratulated with this prompt:

5-26-2016 3-50-59 PM

Step 11: Verification. Hit OK on the “import was successful” box. If you still have the Certificate open, close it. Now reopen it, from the link or from the certificate store, or if you downloaded the certificate, from there. It’ll look like this:

5-26-2016 3-51-44 PM

The certificate hasn’t actually been revoked, and you can open up the Untrusted Certificates store to remove this certificate so it’s trusted again if you find any difficulties.

Other methods

There are other methods to do this – if you’re a regular admin user on Windows, I’ll tell you the quicker way is to open MMC.EXE, add the Certificates Snap-in, select to manage either the Local Computer or Current User, navigate to the Untrusted Certificates store and Import the certificate there. For wide scale deployment, there are group policy ways to do this, too.

OK, OK, because you asked, here’s a picture of how to do it by GPO:

5-26-2016 4-49-38 PM

How do you encrypt a password?

I hate when people ask me this question, because I inevitably respond with a half-dozen questions of my own, which makes me seem like a bit of an arse.

To reduce that feeling, because the questions don’t seem to be going away any time soon, I thought I’d write some thoughts out.

Put enough locks on a thing, it's secure. Or it collapses under the weight of the locks.

Do you want those passwords in the first place?

Passwords are important objects – and because people naturally share IDs and passwords across multiple services, your holding on to a customer’s / user’s password means you are a necessary part of that user’s web of credential storage.

It will be a monumental news story when your password database gets disclosed or leaked, and even more of a story if you’ve chosen a bad way of protecting that data. You will lose customers and you will lose business; you may even lose your whole business.

Take a long hard look at what you’re doing, and whether you actually need to be in charge of that kind of risk.

Do you need those passwords to verify a user or to impersonate them?

If you are going to verify a user, you don’t need encrypted passwords, you need hashed passwords. And those hashes must be salted. And the salt must be large and random. I’ll explain why some other time, but you should be able to find much documentation on this topic on the Internet. Specifically, you don’t need to be able to decrypt the password from storage, you need to be able to recognise it when you are given it again. Better still, use an acknowledged good password hashing mechanism like PBKDF2. (Note, from the “2” that it may be necessary to update this if my advice is more than a few months old)

Now, do not read the rest of this section – skip to the next question.

Seriously, what are you doing reading this bit? Go to the heading with the next question. You don’t need to read the next bit.

<sigh/>

OK, if you are determined that you will have to impersonate a user (or a service account), you might actually need to store the password in a decryptable form.

First make sure you absolutely need to do this, because there are many other ways to impersonate an incoming user using delegation, etc, which don’t require you storing the password.

Explore delegation first.

Finally, if you really have to store the password in an encrypted form, you have to do it incredibly securely. Make sure the key is stored separately from the encrypted passwords, and don’t let your encryption be brute-forcible. A BAD way to encrypt would be to simply encrypt the password using your public key – sure, this means only you can decrypt it, but it means anyone can brute-force an encryption and compare it against the ciphertext.

A GOOD way to encrypt the password is to add some entropy and padding to it (so I can’t tell how long the password was, and I can’t tell if two users have the same password), and then encrypt it.

Password storage mechanisms such as keychains or password vaults will do this for you.

If you don’t have keychains or password vaults, you can encrypt using a function like Windows’ CryptProtectData, or its .NET equivalent, System.Security.Cryptography.ProtectedData.

[Caveat: CryptProtectData and ProtectedData use DPAPI, which requires careful management if you want it to work across multiple hosts. Read the API and test before deploying.]

[Keychains and password vaults often have the same sort of issue with moving the encrypted password from one machine to another.]

For .NET documentation on password vaults in Windows 8 and beyond, see: Windows.Security.Credentials.PasswordVault

For non-.NET on Windows from XP and later, see: CredWrite

For Apple, see documentation on Keychains

Can you choose how strong those passwords must be?

If you’re protecting data in a business, you can probably tell users how strong their passwords must be. Look for measures that correlate strongly with entropy – how long is the password, does it use characters from a wide range (or is it just the letter ‘a’ repeated over and over?), is it similar to any of the most common passwords, does it contain information that is obvious, such as the user’s ID, or the name of this site?

Maybe you can reward customers for longer passwords – even something as simple as a “strong account award” sticker on their profile page can induce good behaviour.

Length is mathematically more important to password entropy than the range of characters. An eight character password chosen from 64 characters (less than three hundred trillion combinations – a number with 4 commas) is weaker than a 64 character password chosen from eight characters (a number of combinations with 19 commas in it).

An 8-character password taken from 64 possible characters is actually as strong as a password only twice as long and chosen from 8 characters – this means something like a complex password at 8 characters in length is as strong as the names of the notes in a couple of bars of your favourite tune.

Allowing users to use password safes of their own makes it easier for them to use longer and more complex passwords. This means allowing copy and paste into password fields, and where possible, integrating with any OS-standard password management schemes

What happens when a user forgets their password?

Everything seems to default to sending a password reset email. This means your users’ email address is equivalent to their credential. Is that strength of association truly warranted?

In the process to change my email address, you should ask me for my password first, or similarly strongly identify me.

What happens when I stop paying my ISP, and they give my email address to a new user? Will they have my account on your site now, too?

Every so often, maybe you should renew the relationship between account and email address – baselining – to ensure that the address still exists and still belongs to the right user.

Do you allow password hints or secret questions?

Password hints push you dangerously into the realm of actually storing passwords. Those password hints must be encrypted as well as if they were the password themselves. This is because people use hints such as “The password is ‘Oompaloompah’” – so, if storing password hints, you must encrypt them as strongly as if you were encrypting the password itself. Because, much of the time, you are. And see the previous rule, which says you want to avoid doing that if at all possible.

Other questions that I’m not answering today…

How do you enforce occasional password changes, and why?

What happens when a user changes their password?

What happens when your password database is leaked?

What happens when you need to change hash algorithm?