How would your behaviour change if you knew the person one cubicle over was about to work for a competitor?
What if you knew that your cubicle neighbour was going to lose her job (be fired or laid off) in the next six months? Do you think she’d be looking to work in a different industry, or the one where she had the most recent experience?
What if the economic situation was such that you just couldn’t be sure who in your office would still be with you a year from now?
My point is not so much that the current economic situation resembles this harsh, threatening landscape; rather, it’s to ask you to consider that the answer to these questions is actually the answer you should give all the time.
A recent study from Ponemon stated that six out of ten departing employees will take data with them as they leave, whether that’s customer data or business intelligence. Why do they do this? Well, we could get into the whole motivation of why, but the real answer is simple:
Because they can, and because they think they can benefit from doing so. Not because they won’t get caught – because, really, what are you going to do, fire them?
Design your data and processes around the idea that important, private, or proprietary data should only rest with individuals or in stores for as long as it is needed to do the job at hand.
After that, then what?
If you no longer need it, or can reconstruct or re-collect it when you next need it, why not just destroy the data?
If you need it, return it to a secure data store, from which it can’t be fetched again without business need, and appropriate authorisation.
If you never needed it in the first place, why collect it at all?
Protecting systems, networks, applications – that’s just resiliency and protection of a few thousand dollars of assets. The real money – and the real requirement for security protection – is in the data.
I used to say that people should “act like the data isn’t yours in the first place” – makes logical sense, doesn’t it?
Sure, if you think that way – if you think that you should be careful with other people’s possessions that they’ve loaned to you.
Over several jobs and several years, I’ve come to realise that we aren’t all of the same species of thought. Some of us are careless with other people’s possessions, and are only concerned with taking care of what’s ours.
So, my explanation has changed – now, the explanation is still that the data doesn’t belong to us, but we have possession of it, and therefore we, as application designers and architects, have a double requirement to be careful with it. We must protect it because it isn’t ours, and we must protect it because it is in our care. To be loose with other people’s data would be to cause them damage, and to be loose with data in our care would be to cause our business damage by reducing the value that we get from holding that data.
As we mentioned in the first part of this series, FTP is a more complex protocol than many, using one control connection and one data connection.
In typical Stream Mode operation, a new data connection is opened and closed for each data transfer, whether that’s an upload, a download, or a directory listing. To avoid confusion between different data connections, and as a recognition of the fact that networks may have old packets shuttling around for some time, these connections need to be distinguishable from one another.
In the previous article, we noted that two network sockets are distinguished by the five elements of “Local Address”, “Local Port”, “Protocol”, “Remote Address”, and “Remote Port”. For a data connection associated with any particular request, the local and remote addresses are fixed, as the addresses of the client and server. The protocol is TCP, and only the two ports are variable.
For a PASV, or passive data connection, the client-side port is chosen randomly by the client, and the server-side port is similarly chosen randomly by the server. The client connects to the server.
For a PORT, or active data connection, the client-side port is chosen randomly by the client, and the server-side port is set to port 20. The server connects to the client.
All of these work through firewalls and NAT routers, because firewalls and NAT routers contain an Application Layer Gateway (ALG) that watches for PORT and PASV commands, modifies the control traffic (in the case of a NAT), and/or uses the values provided to open up a firewall hole.
For the default data connection (what happens if no PORT or PASV command is sent before the first data transfer command), the client-side port is predictable (it’s the same as the source port the client used when connecting the control channel), and the server-side port is 20. Again, the server connects to the client.
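The predictability of the default data connection can be sketched as follows. This is a minimal illustration, not part of any FTP library – `default_data_endpoints` is a hypothetical helper, with U and L used as described above:

```python
# Sketch of the endpoints of the FTP default data connection, assuming the
# control connection runs from client port U to server port L (usually 21).
# The server is the side that connects, from port L-1 to the client's port U.

def default_data_endpoints(client_port_u, server_port_l=21):
    """Return the predictable endpoints of the default data connection."""
    return {
        "server_source_port": server_port_l - 1,  # 20 when L is 21
        "client_dest_port": client_port_u,        # same port the client used
        "connecting_side": "server",              # server connects to client
    }
```

With the usual control port of 21, the server-side source port works out to 20, matching the description above.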
Because firewalls and NATs open up a ‘reverse’ hole for TCP sockets, the default data port works with firewalls and NATs that aren’t running an ALG, or whose ALG cannot scan for PORT and PASV commands.
There are a couple of reasons why an ALG might fail to scan the control connection – the first is that it doesn’t know that the service connected to is running the FTP protocol. This is common if the server is running on a port other than the usual port 21.
The second reason is that the FTP control connection doesn’t look like it contains FTP commands – usually because the connection is encrypted. This can happen because you’re tunneling the FTP control connection through an encrypted tunnel such as SSH (don’t laugh – it does happen!), or hopefully it’s because you’re running FTP over SSL, so that the control and data connections can be encrypted, and you can authenticate the identity of the FTP server.
In the words of Deep Thought: “Hmm… tricky”.
There are a couple of classic solutions:
The astute reader can probably see where I’m going with this.
The default data port is predictable – if the client connects from port U to port L at the server (L is usually 21), then the default data port will be opened from port L-1 at the server to port U at the client.
The default data port doesn’t need the firewall to do anything other than allow reverse connections back along the port that initiated the connection. You don’t need to open huge ranges at the server’s firewall (in fact you should be able to simply open port 21 inbound to your server).
The default data port is required to be supported by FTP servers going back a long way – at least a couple of decades. Yes, really, that long.
Good point, that, and a great sentence to use whenever you wish to halt innovation in its tracks.
Okay, it’s obvious that there are some drawbacks:
Even with those drawbacks, there are still further solutions to apply – the first being to use Block-mode instead of Stream-mode. In Stream-mode, each data transfer requires opening and closing the data connection; in Block-mode, which is a little like HTTP’s chunked mode, blocks of data are sent, and followed by an “EOF” marker (End of File), so that the data connection doesn’t need to be closed. If you can convince your FTP client to request Block-mode with the default data connection, and your FTP server supports it (WFTPD Pro has done so for several years), you can achieve FTP over SSL through NATs and firewalls simply by opening port 21.
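The Block-mode framing described above can be sketched in a few lines. This is a minimal illustration assuming the three-byte block header from the FTP specification (a descriptor byte plus a 16-bit byte count), with the EOF bit set in the final block’s descriptor; `encode_blocks` is an illustrative name, not part of any real FTP implementation:

```python
import struct

EOF_BIT = 0x40  # descriptor bit marking the last block of the file

def encode_blocks(data, block_size=1024):
    """Yield Block-mode frames: a 3-byte header (descriptor, big-endian
    byte count) followed by the payload. The final block carries the EOF
    bit, so end-of-file is signalled in-band and the data connection can
    stay open for the next transfer."""
    chunks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    for i, chunk in enumerate(chunks):
        descriptor = EOF_BIT if i == len(chunks) - 1 else 0
        yield struct.pack("!BH", descriptor, len(chunk)) + chunk
```

The in-band EOF marker is exactly what removes the need to close the connection after each transfer – which is what makes the mode attractive over SSL through a firewall.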
For the second problem, it’s worth noting that many FTP client authors implemented default data connections out of a sense of robustness, so default data connections will often work if you can convince the PORT and PASV commands to fail – by, for instance, putting restrictive firewalls or NATs in the way, or perhaps by preventing the FTP server from accepting PORT or PASV commands in some way.
Clearly, since Microsoft’s IIS 7.5 downloadable FTP Server supports FTPS in block mode with the default data port, there has been some consideration given to my whispers to them that this could solve the FTP over SSL through firewall problem.
Other than my own WFTPD Explorer, I am not aware of any particular clients that support the explicit use of FTP over SSL with Block-mode on the default data connection – I’d love to hear of your experiments with this mode of operation, to see if it works as well for you as it does for me.
Here are some technologies I just can’t wait for:
I’ve been wanting to post this comment for some time, but never seemed to get around to it.
I’ve been through a number of different laptops over the last decade or so – Compaq, Dell, Gateway, and Toshiba – and each time, I’ve found that they just don’t seem to last. I can’t point to anything in particular – it’s never the same thing twice, but for one reason or another, I don’t get more than a couple of years’ life out of a laptop. Sometimes it’s physical failure – the screen breaks, the drive fails, the battery stops holding a charge – and sometimes it’s simply that the machine is too slow and impossible to upgrade to support me as new software is needed.
Unless I buy a ThinkPad.
It’s not that the ThinkPad doesn’t have its problems – it’s more that IBM support always made things right. When the CD-R drive on my first ThinkPad started failing, I called them up, and they quickly sent me a replacement (taking, as usual, my credit card number as guarantee in case I didn’t send them the drive back). The replacement turned out to be a DVD-R drive, so I was ahead on that deal – particularly since the failure happened right at the end of the warranty period.
So my more recent ThinkPad concerned me, coming as it did with a Lenovo sticker instead of IBM.
As usual, problems with the laptop happened once in a while. About six months in, the laptop battery stopped retaining its charge. I’m used to companies telling me that the battery is only warranted for 90 days, and that when batteries stop holding their charge, it’s because of my usage patterns (whatever that means – isn’t a battery supposed to be used when you’re on the bus or train, or in a meeting?)
Not these guys, no, they sent me a replacement battery (after the ritual exchange of credit card numbers).
One persistent problem stayed with me from the first few months after I bought the laptop – the sound stuttered. Now, I should note here what I mean by “stuttered”, because I gather others have sound stuttering that isn’t the same problem as mine.
Imagine, if you will, that the speakers can handle sounds only “so” loud. Pass any sounds louder than that to them, and the sound ceases until the level drops back below that threshold. So, the timing of the sound is unaffected – it’s just as if someone’s repeatedly hammering the ‘mute’ button. Not a problem if everything’s normalised to below 70%, say, but then it’s difficult to listen to because it’s so quiet.
That’s the problem I had – the other sort of problem appears to be where the processing of the sound signal is held up, so the timing of the sound is affected, as if someone is hammering a ‘pause’ button repeatedly on and off.
I called Lenovo a couple of times about this, and assumed it was simply not going to be fixed, as they kept suggesting new drivers, or that I take it to a service centre where they would decide if it could be fixed there or had to be sent away. I wasn’t keen on the service centres they were suggesting.
Finally I reached the end of my warranty, and also the end of my patience with the problem – I was playing more and more stuff from BBC Radio (see a theme here?), and they were coming through normalised properly, rather than dead quiet. So, I either had to re-normalise everything myself, or get the problem fixed.
I called Lenovo, spoke to a nice man in North Carolina, and was told they’d have to look at the system. I’d have to send it in.
I hate being without my laptop – all the more so because I had to send in my hard drive as well. So, it was make-a-backup time, plus delete-all-the-secrets. A box arrived, with paid shipping; I stuck the laptop in the box and sent it back. I sent it over Thanksgiving, so that “5 business days” naturally stretched closer to two weeks – and because it eventually took a while to fix the problem, closer to three.
When I received the system back, I noticed a few things:
You’ll often hear people bad-mouthing non-US companies for having poor technical support that doesn’t speak English and can’t often help – and though this may be true for Lenovo’s online support ‘chat’ (where you type into a browser window), it’s not true for their phone support, and I really can’t argue with the quality of the warranty work they’ve done for me (and how comfortable they were stretching the warranty in the instance that I had been complaining for a while before the warranty expired).
Perhaps it’s a little sad that I have to post a glowing review like this of support that matches roughly what I would expect. But I think Lenovo deserves a pat on the back for this support, and I can only apologise that it has taken me so long to get around to doing so.
I will likely be buying another Lenovo ThinkPad when I finally need to dispose of this one.
This will be the first of a couple of articles on FTP, as I’ve been asked to post this information in an easy-to-read format in a public place where it can be referred to. I think my expertise in developing and supporting WFTPD and WFTPD Pro allows me to be reliable on this topic. Oh, that and the fact that I’ve contributed to a number of RFCs on the subject.
First, a quick refresher on TCP – every TCP connection can be thought of as being associated with a “socket” at each device along the way – from one computer, through routers, to the other computer. The socket is identified by five individual items – the local IP address, the local port, the remote IP address, the remote port, and the protocol (in this case, the protocol is TCP).
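As a quick illustration, those five identifying values can be modelled as a simple tuple – the field names here are just for illustration, not any standard API:

```python
from collections import namedtuple

# The five values that together identify a TCP socket, as described above.
Socket = namedtuple(
    "Socket", "local_addr local_port remote_addr remote_port protocol")

# An example FTP control connection, as seen from the client's side.
ctrl = Socket("10.9.8.7", 52000, "10.1.2.3", 21, "tcp")
```

Change any one of the five values and you have a different socket – which is exactly why FTP’s many data connections can coexist without confusion.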
Firewalls are essentially a special kind of router, with rules not only for how to forward data, but also rules on which connection requests to drop or allow. Once a connection request is allowed, the entire flow of traffic associated with that connection request is also allowed – any traffic flow not associated with a previously allowed connection request is discarded.
When you set up a firewall to allow access to a server, you have to consider the first segment – the “SYN”, or connection request from the TCP client to the TCP server. The rule can refer to any data that would identify the socket to be created, such as “allow any connection request where the source IP address is 10.1.1.something, and the destination port is 54321”.
Typically, an external-facing firewall will allow all outbound connections, and have rules only for inbound connections. As a result, firewall administrators are used to saying things like “to enable access to the web server, simply open port 80”, whereas what they truly mean is “add a rule that applies to incoming TCP connection requests whose source address and source port could be anything, but whose destination port is 80, and whose destination address is that of the web server”. This is usually written in some shorthand, such as “allow tcp 0.0.0.0:0 10.1.2.3:80”, where “0.0.0.0” stands for “any address” and “:0” stands for “any port”.
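The wildcard matching implied by that shorthand can be sketched like this – a toy model, not how any real firewall is implemented, with `rule_matches` as a hypothetical helper:

```python
# Toy model of firewall rule matching against a SYN: "0.0.0.0" and port 0
# act as wildcards, matching any address or any port respectively.

def rule_matches(rule, syn):
    """Both arguments are (src_addr, src_port, dst_addr, dst_port) tuples;
    return True if the rule allows the connection request."""
    r_sa, r_sp, r_da, r_dp = rule
    s_sa, s_sp, s_da, s_dp = syn
    return ((r_sa == "0.0.0.0" or r_sa == s_sa)
            and (r_sp == 0 or r_sp == s_sp)
            and (r_da == "0.0.0.0" or r_da == s_da)
            and (r_dp == 0 or r_dp == s_dp))
```

So the rule “allow tcp 0.0.0.0:0 10.1.2.3:80” matches a request from any client to the web server’s port 80, and nothing else.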
For an FTP server, firewall rules are known to be a little trickier than for most other servers.
Sure, you can set up the rule “allow tcp 0.0.0.0:0 10.1.2.3:21”, because the default port for the control connection of FTP is 21. That only allows the control connection, though.
What other connections are there?
In the default transfer mode of “Stream”, every file transfer gets its own data connection. Of course, it’d be lovely if this data connection was made on port 21 as well, but that’s not the way the protocol was built. Instead, Stream mode data connections are opened either as “Active” or “Passive” connections.
The terms "Active" and "Passive" refer to how the FTP server connects. The choice of connection method is initiated by the client, although the server can choose to refuse whatever the client asked for, at which point the client should fail over to using the other method.
In the Active method, the FTP server connects to the client (the server is the “active” participant, the client just lies back and thinks of England), on a random port chosen by the client. Obviously, that will work if the client’s firewall is configured to allow the connection to that port, and doesn’t depend on the firewall at the server to do anything but allow connections outbound. The Active method is chosen by the client sending a “PORT” command, containing the IP address and port to which the server should connect.
In the Passive method, the FTP client connects to the server (the server is now the “passive” participant), on a random port chosen by the server. This requires the server’s firewall to allow the incoming connection, and depends on the client’s firewall only to allow outbound connections. The Passive method is chosen by the client sending a “PASV” command, to which the server responds with a message containing the IP address and port at the server that the client should connect to.
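Both the PORT argument and the address in the server’s PASV reply carry the IPv4 address and port as six comma-separated decimal numbers – h1,h2,h3,h4 for the address and p1,p2 for the port, which works out to p1×256 + p2. A minimal parser sketch (`parse_host_port` is just an illustrative name):

```python
# Decode the "h1,h2,h3,h4,p1,p2" form used by the FTP PORT command and
# by the server's reply to PASV. The port is p1*256 + p2.

def parse_host_port(six_numbers):
    """Parse 'h1,h2,h3,h4,p1,p2' into ('h1.h2.h3.h4', port)."""
    h1, h2, h3, h4, p1, p2 = (int(n) for n in six_numbers.split(","))
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2
```

This six-number encoding is what an ALG looks for on the control connection when it decides which hole to open.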
So in theory, your firewall now needs to know what ports are going to be requested by the PORT and PASV commands. For some situations, this is true, and you need to consider this – we’ll talk about that in part 2. For now, let’s assume everything is “normal”, and talk about how the firewall helps the FTP user or administrator.
If you use port 21 for your FTP server, and the firewall is able to read the control connection, just about every firewall in existence will recognise the PORT and PASV commands, and open up the appropriate holes. This is because those firewalls have an Application Level Gateway, or ALG, which monitors port 21 traffic for FTP commands, and opens up the appropriate holes in the firewall. We’ve discussed the FTP ALG in the Windows Vista firewall before.
Where does port 20 come in? A rather simplistic view is that administrators read the “Services” file, and see the line that tells them that port 20 is “ftp-data”. They assume that this means that opening port 20 as a destination port on the firewall will allow FTP data connections to flow. By the “elephant repellent” theory, this is proved “true” when their firewalls allow FTP data connections after they open ports 21 and 20. Nobody bothers to check that it also works if they only open port 21, because of the ALG.
OK, so if port 20 isn’t needed, why is it associated with “ftp-data”? For that, you’ll have to remember what I said early on in the article – that every socket has five values associated with it – two addresses, two ports, and a protocol. When the data connection is made from the server to the client (remember, that’s an Active data connection, in response to a PORT command), the source port at the server is port 20. It’s totally that simple, and since nobody makes firewall rules that look at source port values, it’s relatively unimportant. That “ftp-data” in the Services file is simply so that the output from “netstat” has a meaningful service name instead of “:20” as a source port.
Next time, we’ll expand on this topic, to go into the inability of the ALG to process encrypted FTP control traffic, and the resultant issues and solutions that face encrypted FTP.
I was starting to wonder why other people were getting news stories before me.
Then I realised I just wasn’t getting news at all.
Looking at my Unread RSS Feeds search folder in Outlook 2007, I noticed that I hadn’t received a single post since June 10th 2009. Coincidentally, this is when I installed a number of updates:
None of these updates had any “Known Issues” listed in the Knowledge Base articles associated with them that would stop feeds from updating, so I went searching.
First I went searching at Microsoft’s support page (a supported fix or workaround is generally so much safer and more reliable than an unsupported one), and found that this problem had indeed been fixed in the February 2009 Cumulative Update for Outlook 2007 (“RSS feeds become dormant and do not reactivate.”), which was incorporated into Outlook 2007 Service Pack 2. I’ve already installed those.
Great. They’re obviously talking about a completely different problem cause.
Next I went searching the web in general – I use Bing, simply because it’s easy to get to, and Google when I think the answer is more likely to be in the Usenet newsgroups (is it too much to ask Microsoft to maintain their own Usenet archive and search it from Bing?)
In this case, the web had sporadic references to people deleting “~last~.sharing.xml.obi” and “Outlook.sharing.xml.obi” – I would generally avoid doing this sort of change without a backup and a box of tissues to cry into when things go wrong. Deleting temporary files and hoping they get rebuilt is sometimes a miracle, and sometimes more of a magic trick, making things disappear without a trace. So I continued looking.
One question that was asked – and that I should have asked myself – is what kind of “feeds not updating” issue I was having. There are several kinds:
I was in the latter category – when I opened the Tools menu and selected Account Settings, the RSS Feeds tab contained only a few items, rather than the several dozen I was expecting to see. This is what I was expecting:
As it turns out, there is a simple and stupid workaround for this issue, which requires no deletion of files.
Navigate to the RSS Feeds folder (mine is under an RSS Feeds PST file, but if you selected the default, it’ll still be in your Personal Folders file), and for each feed that you’re missing, simply select the feed’s folder, as shown to the right.
For each folder you select, Outlook will display the downloaded items from that feed – and will slyly go behind the scenes to make sure that the feed is in the RSS Feeds tab.
For my several dozen feeds, this took a while, but wasn’t too bad.
[Note: Don’t try to navigate back through the folder history by holding down the ‘back’ key on your keyboard or Alt-Left Arrow – when I did this, Outlook crashed after zipping through a few folders.]
As you can see from my later screenshot of the “RSS Feeds” tab above, all my feeds are re-added, and a new sync caused them to be updated with new content.
It’d be really nice if this process could be automated for a number of folders at a time, to “refresh feeds from RSS Folders” – but for now, this is at least a workaround when you notice that you’re just not as well-informed as you used to be.