alunj

Explaining Blockchains – without understanding them.

I tweeted this the other day, after reading about Microsoft’s Project Bletchley:

I’ve been asked how I can tweet something as specific as this, when in a subsequent tweet, I noted:

[I readily admit I didn’t understand the announcement, or what it’s /supposed/ to be for, but that didn’t stop me thinking about it]

— Alun Jones (@ftp_alun) June 17, 2016

Despite having  reasonably strong background in the use of crypto, and a little dabbling into the analysis of crypto, I don’t really follow the whole “blockchain” thing.

So, here’s my attempt to explain what little I understand of blockchains and their potential uses, with an open invitation to come and correct me.

What’s a blockchain used for?

The most widely-known use of blockchains is that of Bit Coin and other “digital currencies”.

Bit Coins are essentially numbers with special properties, that make them progressively harder to find as time goes on. Because they are scarce and getting scarcer, it becomes possible for people of a certain mindset to ascribe a “value” to them, much as we assign value to precious metals or gemstones aside from their mere attractiveness. [Bit Coins have no intrinsic attractiveness as far as I can tell] That there is no actual intrinsic value leads me to refer to Bit Coin as a kind of shared madness, in which everyone who believes there is value to the Bit Coin shares this delusion with many others, and can use that shared delusion as a basis for trading other valued objects. Of course, the same kind of shared madness is what makes regular financial markets and country-run money work, too.

Because of this value, people will trade them for other things of value, whether that’s shiny rocks, or other forms of currency, digital or otherwise. It’s a great way to turn traceable goods into far less-traceable digital commodities, so its use for money laundering is obvious. Its use for online transactions should also be obvious, as it’s an irrevocable and verifiable transfer of value, unlike a credit card which many vendors will tell you from their own experiences can be stolen, and transactions can be revoked as a result, whether or not you’ve shipped valuable goods.

What makes this an irrevocable and verifiable transfer is the principle of a “blockchain”, which is reported in a distributed ledger. Anyone can, at any time, at least in theory, download the entire history of ownership of a particular Bit Coin, and verify that the person who’s selling you theirs is truly the current and correct owner of it.

How does a blockchain work?

I’m going to assume you understand how digital signatures work at this point, because that’s a whole ‘nother explanation.

Remember that a Bit Coin starts as a number. It could be any kind of data, because all data can be represented as a number. That’s important, later.

The first owner of that number signs it, and then distributes the number and signature out to the world. This is the “distributed ledger”. For Bit Coins, the “world” in this case is everyone else who signs up to the Bit Coin madness.

When someone wants to buy that Bit Coin (presumably another item of mutually agreed similar value exchanges hands, to buy the Bit Coin), the seller signs the buyer’s signature of the Bit Coin, acknowledging transfer of ownership, and then the buyer distributes that signature out to the distributed ledger. You can now use the distributed ledger at any time to verify that the Bit Coin has a story from creation and initial signature, unbroken, all the way up to current ownership.

I’m a little flakey on what, other than a search in the distributed ledger for previous sales of this Bit Coin, prevents a seller from signing the same Bit Coin over simultaneously to two other buyers. Maybe that’s enough – after all, if the distributed ledger contains a demonstration that you were unreliable once, your other signed Bit Coins will presumably have zero value.

So, in this perspective, a blockchain is simply an unbroken record of ownership or provenance of a piece of data from creation to current owner, and one that can be extended onwards.

In the world of financial use, of course, there are some disadvantages – the most obvious being that if I can make you sign a Bit Coin against your well, it’s irrevocably mine. There is no overarching authority that can say “no, let’s back up on that transaction, and say it never happened”. This is also pitched as an advantage, although many Bit Coin owners have been quite upset to find that their hugely-valuable piles of Bit Coins are now in someone else’s ownership.

Now, explain your tweet.

With the above perspective in the back of my head, I read the Project Bletchley report.

I even looked at the pictures.

I still didn’t really understand it, but something went “ping” in my head.

Maybe this is how C-level executives feel.

Here’s my thought:

Businesses get data from customers, users, partners, competitors, outright theft and shenanigans.

Maybe in environments where privacy is respected, like the EU, blockchains could be an avenue by which regulators enforce companies describing and PROVING where their data comes from, and that it was not acquired or used in an inappropriate manner?

When I give you my data, I sign it as coming from me, and sign that it’s now legitimately possessed by you (I won’t say “owned”, because I feel that personal data is irrevocably “owned” by the person it describes). Unlike Bit Coin, I can do this several times with the same packet of data, or different packets of data containing various other information. That information might also contain details of what I’m approving you to do with that information.

This is the start of a blockchain.

When information is transferred to a new party, that transfer will be signed, and the blockchain can be verified at that point. Further usage restrictions can be added.

Finally, when an information commissioner wants to check whether a company is handling data appropriately, they can ask for the blockchains associated with data that has been used in various ways. That then allows the commissioner to verify whether reported use or abuse has been legitimately approved or not.

And before this sounds like too much regulatory intervention, it also allows businesses to verify the provenance of the data they have, and to determine where sensitive data resides in their systems, because if it always travels with its blockchain, it’s always possible to find and trace it.

[Of course, if it travels without its blockchain, then it just looks like you either have old outdated software which doesn’t understand the blockchain and needs to be retired, or you’re doing something underhanded and inappropriate with customers’ data.]

It even allows the revocation of a set of data to be performed – when a customer moves to another provider, for instance.

Yes, there’s the downside of hugely increased storage requirements. Oh well.

Oh, and that revocation request on behalf of the customer, that would then be signed by the business to acknowledge it had been received, and would be passed on to partners – another blockchain.

So, maybe I’ve misunderstood, and this isn’t how it’s going to be used, but I think it’s an intriguing thought, and would love to hear your comments.

Capital ‘I’ Internet

The Atlantic today published a reminder that the Associated Press has declared in their style guide as of today that the word “Internet” will be spelt with a lowercase ‘i’ rather than an uppercase ‘I’.

The title is “Elegy for the Capital-I Internet”, but manages to be neither elegy nor eulogy, and misses the mark entirely, focusing as it does on the awe-inspiring size of the Internet being why the upper-case initial was important; then moving to describe how its sheer ubiquity should lead us to associating it with a lower-case i.

Here’s my take

The "Internet", capital I, gives the information that this is the only one of its kind, anywhere, ever. There is only one Internet. A lower-case I would indicate that there are several "internets". And, sure enough, there are several lower-class networks-of-networks (which is the definition of “internet” as a lower-case noun).

I’d like to inform the people who are engaging in this navel-gazing debate over big-I or small-i, that there functionally is only exactly one Internet. When their cable company came to "install the Internet", there was no question on the form to say "which internet do you want to connect to?" and people would have been rightly upset if there had been.

So, from that perspective, very much capital-I is still the right term for the Internet. There’s only one. Those other smaller internets are not comparable to “the Internet”.

From a technical perspective, we’re actually at the time when it’s closest to being true that there’s two internets. We’re in the midst of the long, long switch from IPv4 to IPv6. We’ve never done that before. And, while there are components of each that will talk to the other, it’s possible to describe the IPv6 and IPv4 collections of networks as two different "internets". So, maybe small-i is appropriate, but for none of the reasons this article describes.

Having said that, IPv6 engineers work really really hard to make sure that users just plain don’t notice that there’s a second internet while they’re building it, and it just feels exactly like it would if there was still only one Internet.

Again, you come back to "there is only one Internet", you don’t get to check a box that selects which of several internets you are going to connect to, it’s not like "the cloud", where there are multiple options. You are either connected to the one Internet, or you’re not connected to any internet at all.

The winner is…

Capital I, and bollocks to the argument from the associated press – lower-cased, because it’s not really that big or important, and neither is the atlantic. So, with their own arguments (which I believe are fallacious anyway), I don’t see why they deserve an upper-case initial.

The Atlantic, on the other hand – that’s huge and I wouldn’t want to cross it under my own steam.

And the Internet, different from many other internets, deserves its capital I as a designation of its singular nature. Because it’s a proper noun.

The blame game: It’s always never human error

The Ubuntu “Circle of Friends” logo.

Depending on the kind of company you work at, it’s either:

  • a group of three friends holding hands and dancing in a merry circle
  • a group of three colleagues each pointing at the other two to tell you who to blame
  • three guys tied to a pole desperately trying to escape the rising orange flood waters

If you work at the first place, reach out to me on LinkedIn – I know some people who might want to work with you.

If you’re at the third place, you should probably get out now. Whatever they’re paying you, or however much the stock might be worth come the IPO, it’s not worth the pain and suffering.

If you’re at the second place, congratulations – you’re at a regular, ordinary workplace that could do with a little better management.

What’s this to do with security?

A surprisingly great deal.

Whenever there’s a security incident, there should be an investigation as to its cause.

Clearly the cause is always human error. Machines don’t make mistakes, they act in predictable ways – even when they are acting randomly, they can be stochastically modeled, and errors taken into consideration. Your computer behaves like a predictable machine, but at various levels it actually routinely behaves like it’s rolling dice, and there are mechanisms in place to bias those random results towards the predictable answers you expect from it.

Humans, not so much.

Humans make all the mistakes. They choose to continue using parts that are likely to break, because they are past their supported lifecycle; they choose to implement only part of a security mechanism; they forget to finish implementing functionality; they fail to understand the problem at hand; etc, etc.

It always comes back to human error.

Or so you think

Occasionally I will experience these great flashes of inspiration from observing behaviour, and these flashes dramatically affect my way of doing things.

One such was when I attended the weekly incident review board meetings at my employer of the time – a health insurance company.

Once each incident had been resolved and addressed, they were submitted to the incident review board for discussion, so that the company could learn from the cause of the problem, and make sure similar problems were forestalled in future.

These weren’t just security incidents, they could be system outages, problems with power supplies, really anything that wasn’t quickly fixed as part of normal process.

But the principles I learned there apply just as well to security incident.

Root cause analysis

The biggest principle I learned was “root cause analysis” – that you look beyond the immediate cause of a problem to find what actually caused it in the long view.

At other companies, who can’t bear to think that they didn’t invent absolutely everything, this is termed differently, for instance, “the five whys” (suggesting if you ask “why did that happen?” five times, you’ll get to the root cause). Other names are possible, but the majority of the English-speaking world knows it as ‘root cause analysis’

This is where I learned that if you believe the answer is that a single human’s error caused the problem, you don’t have the root cause.

But!

Whenever I discuss this with friends, they always say “But! What about this example, or that?”

You should always ask those questions.

Here’s some possible individual causes, and some of their associated actual causes:

Bob pulled the wrong lever Who trained Bob about the levers to pull? Was there documentation? Were the levers labeled? Did anyone assess Bob’s ability to identify the right lever to pull by testing him with scenarios?
Kate was evil and did a bad thing Why was Kate allowed to have unsupervised access? Where was the monitoring? Did we hire Kate? Why didn’t the background check identify the evil?
Jeremy told everyone the wrong information Was Jeremy given the right information? Why was Jeremy able to interpret the information from right to wrong? Should this information have been automatically communicated without going through a Jeremy? Was Jeremy trained in how to transmute information? Why did nobody receiving the information verify it?
Grace left her laptop in a taxi Why does Grace have data that we care about losing – on her laptop? Can we disable the laptop remotely? Why does she even have a laptop? What is our general solution for people, who will be people, leaving laptops in a taxi?
Jane wrote the algorithm with a bug in it Who reviews Jane’s code? Who tests the code? Is the test automated? Was Jane given adequate training and resources to write the algorithm in the first place? Is this her first time writing an algorithm – did she need help? Who hired Jane for that position – what process did they follow?

 

I could go on and on, and I usually do, but it’s important to remember that if you ever find yourself blaming an individual and saying “human error caused this fault”, it’s important to remember that humans, just like machines, are random and only stochastically predictable, and if you want to get predictable results, you have to have a framework that brings that randomness and unpredictability into some form of logical operation.

Many of the questions I asked above are also going to end up with the blame apparently being assigned to an individual – that’s just a sign that it needs to keep going until you find an organisational fix. Because if all you do is fix individuals, and you hire new individuals and lose old individuals, your organisation itself will never improve.

[Yes, for the pedants, your organisation is made up of individuals, and any organisational fix is embodied in those individuals – so blog about how the organisation can train individuals to make sure that organisational learning is passed on.]

Finally, if you’d like to not use Ubuntu as my “circle of blame” logo, there’s plenty of others out there – for instance, Microsoft Alumni:

Microsoft Alumni

Untrusting the Blue Coat Intermediate CA from Windows

So, there was this tweet that got passed around the security community pretty quickly:

Kind of confusing and scary if you’re not quite sure what this all means – perhaps clear and scary if you do.

BlueCoat manufactures “man in the middle” devices – sometimes used by enterprises to scan and inspect / block outbound traffic across their network, and apparently also used by governments to scan and inspect traffic across the network.

The first use is somewhat acceptable (enterprises can prevent their users from distributing viruses or engaging in illicit behaviour from work computers, which the enterprises quite rightly believe they own and should control), but the second use is generally not acceptable, depending on how much you trust your local government.

Filippo helpfully gives instructions on blocking this from OSX, and a few people in the Twitter conversation have asked how to do this on Windows.

Disclaimer!

Don’t do this on a machine you don’t own or manage – you may very well be interfering with legitimate interference in your network traffic. If you’re at work, your employer owns your computer, and may intercept, read and modify your network traffic, subject to local laws, because it’s their network and their computer. If your government has ruled that they have the same rights to intercept Internet traffic throughout your country, you may want to consider whether your government shouldn’t be busy doing other things like picking up litter and contributing to world peace.

The simple Windows way

As with most things on Windows, there’s multiple ways to do this. Here’s one, which can be followed either by regular users or administrators. It’s several steps, but it’s a logical progression, and will work for everyone.

Step 1. Download the certificate. Really, literally, follow the link to the certificate and click “Open”. It’ll pop up as follows:

5-26-2016 3-49-29 PM

Step 2. Install the certificate. Really, literally, click the button that says “Install Certificate…”. You’ll see this prompt asking you where to save it:

5-26-2016 3-49-41 PM

Step 3. If you’re a non-administrator, and just want to untrust this certificate for yourself, leave the Store Location set to “Current User”. If you want to set this for the machine as a whole, and you’re an administrator, select Local Machine, like this:

5-26-2016 3-49-50 PM

Step 4: Click Next, to be asked where you’re putting the certificate:

5-26-2016 3-50-02 PM

Step 5: Select “Place all certificates in the following store”:

5-26-2016 3-50-16 PM

Step 6: Click the “Browse…” button to be given choices of where to place this certificate:

5-26-2016 3-50-23 PM

Step 7: Don’t select “Personal”, because that will explicitly trust the certificate. Scroll down and you’ll see “Untrusted Certificates”. Select that and hit OK:

5-26-2016 3-50-35 PM

Step 8: You’re shown the store you plan to install into:

5-26-2016 3-50-47 PM

Step 9: Click “Next” – and you’ll get a final confirmation option. Read the screen and make sure you really want to do what’s being offered – it’s reversible, but check that you didn’t accidentally install the certificate somewhere wrong. The only place this certificate should go to become untrusted is in the Untrusted Certificates store:

5-26-2016 3-50-52 PM

Step 10: Once you’re sure you have it right, click “Finish”. You’ll be congratulated with this prompt:

5-26-2016 3-50-59 PM

Step 11: Verification. Hit OK on the “import was successful” box. If you still have the Certificate open, close it. Now reopen it, from the link or from the certificate store, or if you downloaded the certificate, from there. It’ll look like this:

5-26-2016 3-51-44 PM

The certificate hasn’t actually been revoked, and you can open up the Untrusted Certificates store to remove this certificate so it’s trusted again if you find any difficulties.

Other methods

There are other methods to do this – if you’re a regular admin user on Windows, I’ll tell you the quicker way is to open MMC.EXE, add the Certificates Snap-in, select to manage either the Local Computer or Current User, navigate to the Untrusted Certificates store and Import the certificate there. For wide scale deployment, there are group policy ways to do this, too.

OK, OK, because you asked, here’s a picture of how to do it by GPO:

5-26-2016 4-49-38 PM

How do you encrypt a password?

I hate when people ask me this question, because I inevitably respond with a half-dozen questions of my own, which makes me seem like a bit of an arse.

To reduce that feeling, because the questions don’t seem to be going away any time soon, I thought I’d write some thoughts out.

Put enough locks on a thing, it's secure. Or it collapses under the weight of the locks.

Do you want those passwords in the first place?

Passwords are important objects – and because people naturally share IDs and passwords across multiple services, your holding on to a customer’s / user’s password means you are a necessary part of that user’s web of credential storage.

It will be a monumental news story when your password database gets disclosed or leaked, and even more of a story if you’ve chosen a bad way of protecting that data. You will lose customers and you will lose business; you may even lose your whole business.

Take a long hard look at what you’re doing, and whether you actually need to be in charge of that kind of risk.

Do you need those passwords to verify a user or to impersonate them?

If you are going to verify a user, you don’t need encrypted passwords, you need hashed passwords. And those hashes must be salted. And the salt must be large and random. I’ll explain why some other time, but you should be able to find much documentation on this topic on the Internet. Specifically, you don’t need to be able to decrypt the password from storage, you need to be able to recognise it when you are given it again. Better still, use an acknowledged good password hashing mechanism like PBKDF2. (Note, from the “2” that it may be necessary to update this if my advice is more than a few months old)

Now, do not read the rest of this section – skip to the next question.

Seriously, what are you doing reading this bit? Go to the heading with the next question. You don’t need to read the next bit.

<sigh/>

OK, if you are determined that you will have to impersonate a user (or a service account), you might actually need to store the password in a decryptable form.

First make sure you absolutely need to do this, because there are many other ways to impersonate an incoming user using delegation, etc, which don’t require you storing the password.

Explore delegation first.

Finally, if you really have to store the password in an encrypted form, you have to do it incredibly securely. Make sure the key is stored separately from the encrypted passwords, and don’t let your encryption be brute-forcible. A BAD way to encrypt would be to simply encrypt the password using your public key – sure, this means only you can decrypt it, but it means anyone can brute-force an encryption and compare it against the ciphertext.

A GOOD way to encrypt the password is to add some entropy and padding to it (so I can’t tell how long the password was, and I can’t tell if two users have the same password), and then encrypt it.

Password storage mechanisms such as keychains or password vaults will do this for you.

If you don’t have keychains or password vaults, you can encrypt using a function like Windows’ CryptProtectData, or its .NET equivalent, System.Security.Cryptography.ProtectedData.

[Caveat: CryptProtectData and ProtectedData use DPAPI, which requires careful management if you want it to work across multiple hosts. Read the API and test before deploying.]

[Keychains and password vaults often have the same sort of issue with moving the encrypted password from one machine to another.]

For .NET documentation on password vaults in Windows 8 and beyond, see: Windows.Security.Credentials.PasswordVault

For non-.NET on Windows from XP and later, see: CredWrite

For Apple, see documentation on Keychains

Can you choose how strong those passwords must be?

If you’re protecting data in a business, you can probably tell users how strong their passwords must be. Look for measures that correlate strongly with entropy – how long is the password, does it use characters from a wide range (or is it just the letter ‘a’ repeated over and over?), is it similar to any of the most common passwords, does it contain information that is obvious, such as the user’s ID, or the name of this site?

Maybe you can reward customers for longer passwords – even something as simple as a “strong account award” sticker on their profile page can induce good behaviour.

Length is mathematically more important to password entropy than the range of characters. An eight character password chosen from 64 characters (less than three hundred trillion combinations – a number with 4 commas) is weaker than a 64 character password chosen from eight characters (a number of combinations with 19 commas in it).

An 8-character password taken from 64 possible characters is actually as strong as a password only twice as long and chosen from 8 characters – this means something like a complex password at 8 characters in length is as strong as the names of the notes in a couple of bars of your favourite tune.

Allowing users to use password safes of their own makes it easier for them to use longer and more complex passwords. This means allowing copy and paste into password fields, and where possible, integrating with any OS-standard password management schemes

What happens when a user forgets their password?

Everything seems to default to sending a password reset email. This means your users’ email address is equivalent to their credential. Is that strength of association truly warranted?

In the process to change my email address, you should ask me for my password first, or similarly strongly identify me.

What happens when I stop paying my ISP, and they give my email address to a new user? Will they have my account on your site now, too?

Every so often, maybe you should renew the relationship between account and email address – baselining – to ensure that the address still exists and still belongs to the right user.

Do you allow password hints or secret questions?

Password hints push you dangerously into the realm of actually storing passwords. Those password hints must be encrypted as well as if they were the password themselves. This is because people use hints such as “The password is ‘Oompaloompah’” – so, if storing password hints, you must encrypt them as strongly as if you were encrypting the password itself. Because, much of the time, you are. And see the previous rule, which says you want to avoid doing that if at all possible.

Other questions that I’m not answering today…

How do you enforce occasional password changes, and why?

What happens when a user changes their password?

What happens when your password database is leaked?

What happens when you need to change hash algorithm?

On Widespread XSS in Ad Networks

Randy Westergren posted a really great piece entitled “Widespread XSS Vulnerabilities in Ad Network Code Affecting Top Tier Publishers, Retailers

Go read it – I’ll wait.

The article triggered a lot of thoughts that I’ll enumerate here:

This is not a new thing – and that’s bad

This was reported by SoftPedia as a “new attack”, but it’s really an old attack. This is just another way to execute DOM-based XSS.

That means that web sites are being attacked by old bugs, not because their own coding is bad, but because they choose to make money from advertising.

And because the advertising industry is just waaaay behind on securing their code, despite being effectively a widely-used framework across the web.

You’ve seen previously on my blog how I attacked Troy Hunt’s blog through his advertising provider, and he’s not the first, or by any means the last, “victim” of my occasional searches for flaws.

It’s often difficult to trace which ad provider is responsible for a piece of vulnerable code, and the hosting site may not realise the nature of their relationship and its impact on security. As a security researcher, it’s difficult to get traction on getting these vulnerabilities fixed.

Important note

I’m trying to get one ad provider right now to fix their code. I reported a bug to them, they pointed out it was similar to the work Randy Westergren had written up.

So they are aware of the problem.

It’s over a month later, and the sites I pointed out to them as proofs of concept are still vulnerable.

Partly, this is because I couldn’t get a reliable repro as different ad providers loaded up, but it’s been two weeks since I sent them a reliable repro – which is still working.

Reported a month ago, reliable repro two weeks ago, and still vulnerable everywhere.

[If you’re defending a site and want to figure out which ad provider is at fault, inject a “debugger” statement into the payload, to have the debugger break at the line that’s causing a problem. You may need to do this by replacing “prompt()” or “alert()” with “(function(){debugger})()” – note that it’ll only break into the debugger if you have the debugger open at the time.]

How the “#” affects the URL as a whole

Randy’s attack example uses a symbol you won’t see at all in some web sites, but which you can’t get away from in others. The “#” or “hash” symbol, also known as “number” or “hash”. [Don’t call it “pound”, please, that’s a different symbol altogether, “£”] Here’s his example:

http://nypost.com/#1'-alert(1)-'"-alert(1)-"

Different parts of the URL have different names. The “http:” part is the “protocol”, which tells the browser how to connect and what commands will likely work. “//nypost.com/” is the host part, and tells the browser where to connect to. Sometimes a port number is used – commonly, 80 or 443 – after the host name but before the terminating “/” of the host element. Anything after the host part, and before a question-mark or hash sign, is the “path” – in Randy’s example, the path is left out, indicating he wants the root page. An optional “query” part follows the path, indicated by a question mark at its start, often taking up the rest of the URL. Finally, if a “#” character is encountered, this starts the “anchor” part, which is everything from after the “#” character on to the end of the URL.

The “anchor” has a couple of purposes, one by design, and one by evolution. The designed use is to tell the browser where to place the cursor – where to scroll to. I find this really handy if I want to draw someone’s attention to a particular place in an article, rather than have them read the whole story. [It can also be used to trigger an onfocus event handler in some browsers]

The second use is for communication between components on the page, or even on other pages loaded in frames.

The anchor tag is for the browser only

I want to emphasise this – and while Randy also mentioned it, I think many web site developers need to understand this when dealing with security.

The anchor tag is not sent to the server.

The anchor tag does not appear in your server’s logs.

WAFs cannot filter the anchor tag.

If your site is being attacked through abuse of the anchor tag, you not only can’t detect it ahead of time, you can’t do basic forensic work to find out useful things such as “when did the attack start”, “what sort of things was the attacker doing”, “how many attacks happened”, etc.

[Caveat: pedants will note that when browser code acts on the contents of the anchor tag, some of that action will go back to the server. That’s not the same as finding the bare URL in your log files.]

If you have an XSS that can be triggered by code in an anchor tag, it is a “DOM-based XSS” flaw. This means that the exploit happens primarily (or only) in the user’s browser, and no filtering on the server side, or in the WAF (a traditional, but often unreliable, measure against XSS attacks), will protect you.

When trying out XSS attacks to find and fix them, you should try attacks in the anchor tag, in the query string, and in the path elements of the URL if at all possible, because they each will get parsed in different ways, and will demonstrate different bugs.

What does “-alert(1)-“ even mean?

The construction Randy uses may seem a little odd:

"-alert(1)-"'-alert(1)-'

With some experience, you can look at this and note that it’s an attempt to inject JavaScript, not HTML, into a quoted string whose injection point doesn’t properly (or at all) escape quotes. The two different quote styles will escape from quoted strings inside double quotes and single quotes alike (I like to put the number ‘2’ in the alert that is escaped by the double quotes, so I know which quote is escaped).

But why use a minus sign?

Surely it’s invalid syntax?

While JavaScript knows that “string minus void” isn’t a valid operation, in order to discover the types of the two arguments to the “minus” operator, it actually has to evaluate them. This is a usual side-effect of a dynamic language – in order to determine whether an operation is valid, its arguments have to be evaluated. Compiled languages are usually able to identify specific types at compile time, and tell you when you have an invalid operand.

So, now that we know you can use any operator in there – minus, times, plus, divide, and, or, etc – why choose the minus? Here’s my reasoning: a plus sign in a URL is converted to a space. A divide (“/”) is often a path component, and like multiplication (“*”) is part of a comment sequence in JavaScript, “//” or “/*”, an “&” is often used to separate arguments in a query string, and a “|” for “or” is possibly going to trigger different flaws such as command injection, and so is best saved for later.

Also, the minus sign is an unshifted character and quick to type.

There are so many other ways to exploit this – finishing the alert with a line-ending comment (“//” or “<–”), using “prompt” or “confirm” instead of “alert”, using JavaScript obfuscaters, etc, but this is a really good easy injection point.

Another JavaScript syntax abuse is simply to drop “</script>” in the middle of the JavaScript block and then start a new script block, or even just regular HTML. Remember that the HTML parser only hands off to the JavaScript parser once it has found a block between “<script …>” and “</script …>” tags. It doesn’t matter if the closing tag is “within” a JavaScript string, because the HTML parser doesn’t know JavaScript.

There’s no single ad provider, and they’re almost all vulnerable

Part of the challenge in repeating these attacks, demonstrating them to others, etc, is that there’s no single ad provider, even on an individual web site.

Two visits to the same web site not only bring back different adverts, but they come through different pieces of code, injected in different ways.

If you don’t capture your successful attack, it may not be possible to reproduce it.

Similarly, if you don’t capture a malicious advert, it may not be possible to prove who provided it to you. I ran into this today with a “fake BSOD” malvert, which pretended to be describing a system error, and filled as much of my screen as it could with a large “alert” dialog, which kept returning immediately, whenever it was dismissed, and which invited me to call for “tech support” to fix my system. Sadly, I wasn’t tracing my every move, so I didn’t get a chance to discover how this ad was delivered, and could only rage at the company hosting the page.

This is one reason why I support ad-blockers

Clearly, ad providers need to improve their security. Until such time as they do so, a great protection is to use an ad-blocker. This may prevent you from seeing actual content at some sites, but you have to ask yourself if that content is worth the security risk of exposing yourself to adverts.

There is a valid argument to be made that ad blockers reduce the ability of content providers to make legitimate profit from their content.

But there is also a valid argument that ad blockers protect users from insecure adverts.

Defence – protect your customers from your ads

Finally, if you’re running a web site that makes its money from ads, you need to behave proactively to prevent your users from being targeted by rogue advertisers.

I’m sure you believe that you have a strong, trusting relationship with the ad providers you have running ads on your web site.

Don’t trust them. They are not a part of your dev team. They see your customers as livestock – product. Your goals are substantially different, and that means that you shouldn’t allow them to write code that runs in your web site’s security context.

What this means is that you should always embed those advertising providers inside an iframe of their own. If they give you code to run, and tell you it’s to create the iframe in which they’ll site, put that code in an iframe you host on a domain outside your main domain. Because you don’t trust that code.

Why am I suggesting you do that? Because it’s the difference between allowing an advert attack to have limited control, and allowing it to have complete control, over your web site.

If I attack an ad in an iframe, I can modify the contents of the iframe, I can pop up a global alert, and I can send the user to a new page.

If I attack an ad – or its loading code – and it isn’t in an iframe, I can still do all that, but I can also modify the entire page, read secret cookies, insert my own cookies, interact with the user as if I am the site hosting the ad, etc.

If you won’t do it for your customers, at least defend your own page

capture20160409090703064

Here’s the front page of a major website with a short script running through an advert with a bug in it.

[I like the tag at the bottom left]

Insist on security clauses with all your ad providers

Add security clauses in to your contracts , so that you can pull an ad provider immediately a security vulnerability is reported to you, and so that the ad providers are aware that you have an interest in the security and integrity of your page and your users. Ask for information on how they enforce security, and how they expect you to securely include them in your page.

[I am not a lawyer, so please talk with someone who is!]

We didn’t even talk about malverts yet

Malverts – malicious advertising – is the term for an attacker getting an ad provider to deliver their attack code to your users, by signing up to provide an ad. Often this is done using apparent advertising copy related to current advertising campaigns, and can look incredibly legitimate. Sometimes, the attack code will be delayed, or region-specific, so an ad provider can’t easily notice it when they review the campaign for inclusion in your web page.

Got a virus you want to distribute? Why write distribution code and try to trick a few people into running it, when for a few dollars, you can get an ad provider to distribute it for you to several thousand people on a popular web site for people who have money?

A quick April Fools’ Day reminder

Tomorrow is April 1, also known as April Fools’ Day.

As a result, you should expect that anything I say on this blog is fabrication, fantasy, foolery and snark.

Apparently, this hasn’t previously been completely stupidly blindly obvious.

Leap Day again

I’ve mentioned before how much I love the vagaries of dates and times in computing, and I’m glad it’s not a part of my regular day-to-day work or hobby coding.

Here’s some of the things I expect to happen this year as a result of the leap year:

  • Hey, it’s February 29 – some programs, maybe even operating systems, will refuse to recognise the day and think it’s actually March 1. Good luck figuring out how to mesh that with other calendar activities. Or maybe you’ll be particularly unlucky, and the app/OS will break completely.
  • But the fun’s not over, as every day after February 29, until March 1 NEXT YEAR, you’re a full 366 days ahead of the same date last year. So, did you create a certificate that expires next year, last year? If so, I hope you have a reminder well ahead of time to renew the certificate, because otherwise, your certificate probably expires 365 days ahead, not one year. Or maybe it’ll just create an invalid certificate when you renew one today.
  • The same is true for calendar reminders – some reminders for “a year ahead” will be 365 days ahead, not one year. Programmers often can’t tell the difference between AddDays(365) and AddYears(1) – and why would they, when the latter is difficult to define unambiguously (add a year to today’s date, what do you get? February 28 or March 1?)
  • But the fun’s not over yet – we’ve still got December 31 to deal with. Why’s that odd? Normal years have a December 31, so that’s no problem, right? Uh, yeah, except that’s day 366. And that’s been known to cause developers a problem – see what it did to the Zune a few years back.
  • Finally, please don’t tell me I have an extra day and ask me what I’m going to do with it – the day, unless you got a day off, or are paid hourly, belongs to your employer, not to you – they have an extra day’s work from you this year, without adding to your salary at all.

And then there’s the ordinary issues with dates that programmers can’t understand – like the fact that there are more than 52 weeks in a year. “ASSERT(weeknum>0 && weeknum<53);”, anyone? 52 weeks is 364 days, and every year has more days than that. [Pedantic mathematical note – maybe this somewhat offsets the “employer’s extra day” item above]

Happy Leap Day – and always remember to test your code in your head as well as in real life, to find its extreme input cases and associated behaviours. They’ll get tested anyway, but you don’t want it to be your users who find the bugs.

Why am I so cross?

There are many reasons why Information Security hasn’t had as big an impact as it deserves. Some are external – lack of funding, lack of concern, poor management, distractions from valuable tasks, etc, etc.

But the ones we inflict on ourselves are probably the most irritating. They make me really cross.

Why cross?

OK, “cross” is an English term for “angry”, or “irate”, but as with many other English words, it’s got a few other meanings as well.

It can mean to wrong someone, or go against them – “I can’t believe you crossed Fingers MacGee”.

It can mean to make the sign of a cross – “Did you just cross your fingers?”

It can mean a pair of items, intersecting one another – “I’m drinking at the sign of the Skull and Cross-bones”.

It can mean to breed two different subspecies into a third – “What do you get if you cross a mountaineer with a mosquito? Nothing, you can’t cross a scaler and a vector.”

Or it can mean to traverse something – “I don’t care what Darth Vader says, I always cross the road here”.

Green_cross_man_take_it

It’s this last sense that InfoSec people seem obsessed about, to the extent that every other attack seems to require it as its first word.

Such a cross-patch

These are just a list of the attacks at OWASP that begin with the word “Cross”.

Yesterday I had a meeting to discuss how to address three bugs found in a scan, and I swear I spent more than half the meeting trying to ensure that the PM and the Developer in the room were both discussing the same bug. [And here, I paraphrase]

“How long will it take you to fix the Cross-Frame Scripting bug?”

“We just told you, it’s going to take a couple of days.”

“No, that was for the Cross-Site Scripting bug. I’m talking about the Cross-Frame Scripting issue.”

“Oh, that should only take a couple of days, because all we need to do is encode the contents of the field.”

“No, again, that’s the Cross-Site Scripting bug. We already discussed that.”

“I wish you’d make it clear what you’re talking about.”

Yeah, me too.

A modest proposal

The whole point of the word “Cross” as used in the descriptions of these bugs is to indicate that someone is doing something they shouldn’t – and in that respect, it’s pretty much a completely irrelevant word, because we’re already discussing attack types.

In many of these cases, the words “Cross-Site” bring absolutely nothing to the discussion, and just make things confusing. Am I crossing a site from one page to another, or am I saying this attack occurs between sites? What if there’s no other site involved, is that still a cross-site scripting attack? [Yes, but that’s an irrelevant question, and by asking it, or thinking about asking/answering it, you’ve reduced your mental processing abilities to handle the actual issue.]

Check yourself when you utter “cross” as the first word in the description of an attack, and ask if you’re communicating something of use, or just “sounding like a proper InfoSec tool”. Consider whether there’s a better term to use.

I’ve previously argued that “Cross-Site Scripting” is really a poor term for the conflation of HTML Injection and JavaScript Injection.

Cross-Frame Scripting is really Click-Jacking (and yes, that doesn’t exclude clickjacking activities done by a keyboard or other non-mouse source).

Cross-Site Request Forgery is more of a Forced Action – an attacker can guess what URL would cause an action without further user input, and can cause a user to visit that URL in a hidden manner.

Cross-Site History Manipulation is more of a browser failure to protect SOP – I’m not an expert in that field, so I’ll leave it to them to figure out a non-confusing name.

Cross-Site Tracing is just getting silly – it’s Cross-Site Scripting (excuse me, HTML Injection) using the TRACE verb instead of the GET verb. If you allow TRACE, you’ve got bigger problems than XSS.

Cross-User Defacement crosses all the way into crosstalk, requiring as it does that two users be sharing the same TCP connection with no adequate delineation between them. This isn’t really common enough to need a name that gets capitalised. It’s HTTP Response-Splitting over a shared proxy with shitty user segregation.

Even more modestly…

I don’t remotely anticipate that I’ll change the names people give to these vulnerabilities in scanning tools or in pen-test reports.

But I do hope you’ll be able to use these to stop confusion in its tracks, as I did:

“Never mind cross-whatever, let’s talk about how long it’s going to take you to address the clickjacking issue.”

In Summary

Here’s the TL;DR version of the web post:

Prevent or interrupt confusion by referring to bugs using the following non-confusing terms:

Confusing Not Confusing Much, Probably
Cross-Frame Scripting Clickjacking
Cross-Site History Manipulation [Not common enough to name]
Cross-Site Tracing TRACE is enabled
Cross-Site Request Forgery Forced User Action
Cross-Site Scripting HTML Injection
JavaScript Injection
Cross-User Defacement Crappy proxy server

Fear the browsing dead!

Browsing Dead

Ding dong, the plugin’s dead!

There’s been a lot of celebration lately from the security community about the impending death of Adobe’s Flash, or Oracle’s Java plugin technology.

You can understand this, because for years these plugins have been responsible for vulnerability on top of vulnerability. Their combination of web-facing access and native code execution means that you have maximum exposure and maximum risk concentrated in one place on the machine.

Browser manufacturers have recognised this risk in their own code, and have made great strides in improving security. Plus, you can always switch browsers if you feel one is more secure than another.

Attackers can rely on Flash and Java.

An attacker can pretty much assume that their target is running Flash from Adobe, and Java from Oracle. [Microsoft used to have a competing Java implementation, but Oracle sued it out of existence.]

Bugs in those implementations are widely published, and not widely patched, whether or not patches are available.

Users don’t upgrade applications (plugins included) as often or as willingly as they update their operating system. So, while your browser may be updated with the operating system, or automatically self-update, it’s likely most users are running a version of Java and/or Flash that’s several versions behind.

Applications never die, they just lose their support

As you can imagine, the declaration by Oracle that Java plugin support will be removed is a step forward in recognising the changing landscape of browser security, but it’s not an indication that this is an area in which security professionals can relax.

Just the opposite.

With the deprecation of plugin support comes the following:

  • Known bugs – without fixes. Ever.
  • No availability of tools to manage old versions.
  • No tools to protect vulnerable plugins.
  • Users desperately finding more baroque (and unsecurable) ways to keep their older setups together to continue to use applications which should have been replaced, but never were.

It’s not like Oracle are going to reach into every machine and uninstall / turn off plugin support. Even if they had the technical means to do so, such an act would be a completely inappropriate act.

There will be zombies

So, what we’re left with, whenever a company deprecates a product, application or framework, is a group of machines – zombies, if you will – that are operated by people who do not heed the call to cull, and which are going to remain active and vulnerable until such time as someone renders those walking-dead components finally lifeless.

If you’re managing an enterprise from a security perspective, you should follow up every deprecation announcement with a project to decide the impact and schedule the actual death and dismemberment of the component being killed off.

Then you can celebrate!

Assuming, of course, that you followed through successfully on your plan.

Until then, watch out for the zombies.

The Browsing Dead.