This will be the first of a couple of articles on FTP, as I’ve been asked to post this information in an easy-to-read format in a public place where it can be referred to. I think my expertise in developing and supporting WFTPD and WFTPD Pro allows me to be reliable on this topic. Oh, that and the fact that I’ve contributed to a number of RFCs on the subject.
First, a quick refresher on TCP: every TCP connection can be thought of as being associated with a “socket” at each device along the way, from one computer, through routers, to the other computer. The socket is identified by five individual items: the local IP address, the local port, the remote IP address, the remote port, and the protocol (in this case, the protocol is TCP).
Firewalls are essentially a special kind of router, with rules not only for how to forward data, but also for which connection requests to drop or allow. Once a connection request is allowed, the entire flow of traffic associated with that connection request is allowed as well; any traffic flow not associated with a previously allowed connection request is discarded.
When you set up a firewall to allow access to a server, you have to consider the first segment: the “SYN”, or connection request, from the TCP client to the TCP server. The rule can refer to any data that would identify the socket to be created, such as “allow any connection request where the source IP address is 10.1.1.something, and the destination port is 54321”.
Typically, an external-facing firewall will allow all outbound connections, and have rules only for inbound connections. As a result, firewall administrators are used to saying things like “to enable access to the web server, simply open port 80”, whereas what they truly mean is “add a rule that applies to incoming TCP connection requests whose source address and source port could be anything, but whose destination port is 80, and whose destination address is that of the web server.” This is usually written in some shorthand, such as “allow tcp 0.0.0.0:0 10.1.2.3:80”, where “0.0.0.0” stands for “any address” and “:0” stands for “any port”.
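To make the shorthand concrete, here is a minimal sketch of how such a rule could be parsed and matched against a connection request. The names Rule, parse_rule and allows are hypothetical, invented for this illustration, and the shorthand itself is only the informal notation used above, not the syntax of any particular firewall product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical in-memory form of "allow tcp SRC:PORT DST:PORT"
    protocol: str
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int


def parse_rule(text):
    """Parse shorthand such as 'allow tcp 0.0.0.0:0 10.1.2.3:80'."""
    action, protocol, src, dst = text.split()
    if action != "allow":
        raise ValueError("only 'allow' rules are handled in this sketch")
    src_addr, src_port = src.rsplit(":", 1)
    dst_addr, dst_port = dst.rsplit(":", 1)
    return Rule(protocol, src_addr, int(src_port), dst_addr, int(dst_port))


def _addr_ok(rule_addr, addr):
    # 0.0.0.0 is the wildcard meaning "any address"
    return rule_addr == "0.0.0.0" or rule_addr == addr


def _port_ok(rule_port, port):
    # Port 0 is the wildcard meaning "any port"
    return rule_port == 0 or rule_port == port


def allows(rule, protocol, src_addr, src_port, dst_addr, dst_port):
    """Return True if a connection request (SYN) matches the rule."""
    return (
        rule.protocol == protocol
        and _addr_ok(rule.src_addr, src_addr)
        and _port_ok(rule.src_port, src_port)
        and _addr_ok(rule.dst_addr, dst_addr)
        and _port_ok(rule.dst_port, dst_port)
    )


web_rule = parse_rule("allow tcp 0.0.0.0:0 10.1.2.3:80")
```

With this rule, a request from any source address and port to 10.1.2.3 port 80 matches, while the same request aimed at port 21 does not.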
For an FTP server, firewall rules are known to be a little trickier than for most other servers.
Sure, you can set up the rule “allow tcp 0.0.0.0:0 10.1.2.3:21”, because the default port for the control connection of FTP is 21. That only allows the control connection, though.
What other connections are there?
In the default transfer mode of “Stream”, every file transfer gets its own data connection. Of course, it’d be lovely if this data connection was made on port 21 as well, but that’s not the way the protocol was built. Instead, Stream mode data connections are opened either as “Active” or “Passive” connections.
The terms “Active” and “Passive” describe the FTP server’s role in setting up the data connection. The client chooses the method, although the server can refuse whatever the client asked for, at which point the client should fall back to the other method.
In the Active method, the FTP server connects to the client (the server is the “active” participant, the client just lies back and thinks of England), on a random port chosen by the client. Obviously, that will work if the client’s firewall is configured to allow the connection to that port, and doesn’t depend on the firewall at the server to do anything but allow connections outbound. The Active method is chosen by the client sending a “PORT” command, containing the IP address and port to which the server should connect.
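For reference, RFC 959 encodes the PORT argument as six comma-separated decimal numbers: the four bytes of the IPv4 address, followed by the high and low bytes of the port. The sketch below builds and parses that argument; the function names are hypothetical, and the address 10.1.1.5 is just an example value.

```python
def format_port_command(ip, port):
    """Build the PORT command a client would send for ip:port (IPv4 only)."""
    octets = ip.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    # The port travels as two bytes: high byte first, then low byte
    fields = octets + [str(port // 256), str(port % 256)]
    return "PORT " + ",".join(fields) + "\r\n"


def parse_port_command(line):
    """Return (ip, port) from a line such as 'PORT 10,1,1,5,212,49'."""
    verb, _, argument = line.strip().partition(" ")
    if verb.upper() != "PORT":
        raise ValueError("not a PORT command")
    fields = [int(field) for field in argument.split(",")]
    if len(fields) != 6 or any(not 0 <= f <= 255 for f in fields):
        raise ValueError("malformed PORT argument")
    ip = ".".join(str(f) for f in fields[:4])
    return ip, fields[4] * 256 + fields[5]
```

For example, a client at 10.1.1.5 listening on port 54321 would send “PORT 10,1,1,5,212,49”, since 212 × 256 + 49 = 54321.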
In the Passive method, the FTP client connects to the server (the server is now the “passive” participant), on a random port chosen by the server. This requires the server’s firewall to allow the incoming connection, and depends on the client’s firewall only to allow outbound connections. The Passive method is chosen by the client sending a “PASV” command, to which the server responds with a message containing the IP address and port at the server that the client should connect to.
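The server’s answer to PASV is a 227 reply carrying the same six-number encoding that PORT uses. The wording around the numbers varies between servers, so this sketch (with a hypothetical function name and an example address) simply looks for the six comma-separated numbers rather than relying on the surrounding text.

```python
import re

# Six comma-separated decimal numbers: four address bytes, two port bytes
_SIX_NUMBERS = re.compile(
    r"(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})"
)


def parse_pasv_reply(reply):
    """Return (ip, port) from a reply such as
    '227 Entering Passive Mode (10,1,2,3,195,80)'."""
    if not reply.startswith("227"):
        raise ValueError("not a 227 reply")
    match = _SIX_NUMBERS.search(reply[3:])
    if match is None:
        raise ValueError("no address found in 227 reply")
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        raise ValueError("address or port byte out of range")
    ip = ".".join(str(n) for n in numbers[:4])
    return ip, numbers[4] * 256 + numbers[5]
```

Given “227 Entering Passive Mode (10,1,2,3,195,80)”, the client would connect to 10.1.2.3 on port 50000 (195 × 256 + 80).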
So in theory, your firewall now needs to know what ports are going to be requested by the PORT and PASV commands. For some situations, this is true, and you need to consider this; we’ll talk about that in part 2. For now, let’s assume everything is “normal”, and talk about how the firewall helps the FTP user or administrator.
If you use port 21 for your FTP server, and the firewall is able to read the control connection, just about every firewall in existence will recognise the PORT and PASV commands, and open up the appropriate holes. This is because those firewalls have an Application Level Gateway, or ALG, which monitors port 21 traffic for FTP commands, and opens up the appropriate holes in the firewall. We’ve discussed the FTP ALG in the Windows Vista firewall before.
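As a rough illustration of what an ALG does, the sketch below inspects one line of control-connection traffic and, if it is a PORT command or a 227 reply, derives the temporary rule to add, written in the same informal shorthand used earlier. This is a simplified assumption about ALG behaviour, not the logic of any specific firewall: the function name is hypothetical, and a real ALG would also track which direction the line travelled and remove the rule once the data connection closes.

```python
import re

_SIX_NUMBERS = re.compile(
    r"(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3}),(\d{1,3})"
)


def _endpoint(text):
    """Extract (ip, port) from the six-number encoding, or None."""
    match = _SIX_NUMBERS.search(text)
    if match is None:
        return None
    numbers = [int(group) for group in match.groups()]
    if any(n > 255 for n in numbers):
        return None
    return ".".join(str(n) for n in numbers[:4]), numbers[4] * 256 + numbers[5]


def pinhole_for(line, client_ip, server_ip):
    """Return the shorthand rule an ALG might add for this control line,
    or None if the line does not set up a data connection."""
    stripped = line.strip()
    if stripped.upper().startswith("PORT "):
        connecting_side = server_ip  # Active: the server connects to the client
        payload = stripped[5:]
    elif stripped.startswith("227"):
        connecting_side = client_ip  # Passive: the client connects to the server
        payload = stripped[3:]
    else:
        return None
    endpoint = _endpoint(payload)
    if endpoint is None:
        return None
    address, port = endpoint
    # ":0" means "any source port", as in the shorthand used above
    return "allow tcp {}:0 {}:{}".format(connecting_side, address, port)
```

Note that this only works because the ALG can read the commands as plain text on the control connection.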
Where does port 20 come in? A rather simplistic view is that administrators read the “Services” file, and see the line that tells them that port 20 is “ftp-data”. They assume that this means that opening port 20 as a destination port on the firewall will allow FTP data connections to flow. By the “elephant repellant” theory, this is proved “true” when their firewalls allow FTP data connections after they open ports 21 and 20. Nobody bothers to check that it also works if they only open port 21, because of the ALG.
OK, so if port 20 isn’t needed, why is it associated with “ftp-data”? For that, you’ll have to remember what I said early on in the article: that every socket has five values associated with it (two addresses, two ports, and a protocol). When the data connection is made from the server to the client (remember, that’s an Active data connection, in response to a PORT command), the source port at the server is port 20. It’s totally that simple, and since nobody makes firewall rules that look at source port values, it’s relatively unimportant. That “ftp-data” in the Services file is simply so that the output from “netstat” has a meaningful service name instead of “:20” as a source port.
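The way a server gets a particular source port is by binding the socket before connecting. The sketch below shows that sequence with a hypothetical helper; it is not taken from WFTPD or any other server, and note that binding to port 20 normally requires administrative privileges, so the source port is left as a parameter.

```python
import socket


def connect_active_data(client_addr, client_port, source_port=20, timeout=5.0):
    """Open an Active-mode data connection to the address and port the
    client gave in its PORT command, using source_port as the local port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Allow the same local port to be reused for successive transfers
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # Binding before connect() is what fixes the source port
        sock.bind(("", source_port))
        sock.settimeout(timeout)
        sock.connect((client_addr, client_port))
    except OSError:
        sock.close()
        raise
    return sock
```

On the client side, the accepted connection’s remote port is then whatever source_port was, which is where netstat’s “ftp-data” label comes from when that port is 20.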
Next time, we’ll expand on this topic, to go into the inability of the ALG to process encrypted FTP control traffic, and the resultant issues and solutions that face encrypted FTP.