As promised, here begins my attempt to help shine some light on how to write network code for .NET in C#. Before we begin though, I would like to share a couple of links to resources that, while not specifically about .NET, are still in my opinion “must-read” material for anyone writing socket-based code on Windows. Those resources are:
In particular, the .NET networking classes are mostly built on top of Winsock, so a lot of what’s described in those references is pertinent to .NET too. In addition, much of what’s important to know about Winsock is really about TCP/IP, the main network protocol used these days, and those references discuss many TCP/IP issues that are not unique to Winsock.
Now, that said, even among those issues there are a handful that seem to come up on a regular basis. They are described more fully in those references, but I’d like to start by touching on them here before getting to any actual coding. The rest of this post will be devoted to that…
What does “protocol” mean?
For better or worse, the word “protocol” is used in a number of ways. In general, it means “a defined standard for the interchange of data”. But in the context of network programming, there are many levels at which this can be applied. TCP/IP is a protocol (strictly speaking, a suite of protocols). But then, so too are TCP and UDP, which are implemented on top of IP. And then there’s what I call the “application protocol”, that is, the application-defined format of the data, such as FTP, HTTP, or some custom protocol unique to the application, which is itself implemented using TCP, UDP, or perhaps even some other protocol.
It’s my hope that in context, the word “protocol” will always have a clear meaning. Please feel free to point out if it doesn’t.
TCP or UDP?
One of the first questions that will come up when writing a new network application is which protocol to use. Now, as it happens, Winsock supports a much broader range of network protocols than just TCP/IP, and so there are actually more options than just “TCP or UDP”. But for most people, and especially for beginners, TCP/IP is likely to be the main, if not the only, network protocol being used, and the choice really is for practical purposes limited to “TCP or UDP”.
So how do you choose? Well, it depends mainly on the needs of the application, and on how much work the programmer wants to have to do. UDP (“User Datagram Protocol”) is “unreliable”. That is, it provides no guarantees other than that if you receive a datagram (a single self-contained message), it’s a datagram that was sent by the remote endpoint. In particular, a couple of important guarantees it does not make are the order of the datagrams, and the uniqueness of the datagrams. That’s right. Not only may datagrams not be received at all, or received in a different order than that in which they were sent, you might actually receive a datagram more than once!
Some applications can tolerate this sort of issue very well, sometimes without even doing much, if any, extra work. For those kinds of applications, UDP works very well. But for others, TCP is often a better option. You could in fact write a reliable protocol on top of UDP, but why reinvent the wheel?
The one thing TCP does not guarantee is that the data is received in the same grouping as that used when sending it (more on that in a moment). However, it does guarantee that data will be received uniquely, in the same order, and without gaps. That is, you can be sure that you won’t receive byte N until you’ve already received bytes 0 through N-1, and that any bytes received will be exactly the bytes that were sent.
If you cannot afford to have any of your data go missing, then TCP is usually the way to go.
What’s a connection? How do I know if it’s broken/reset/lost?
In addition to the above, there’s another crucial difference between UDP and TCP: UDP is “connectionless” while TCP is “connection-oriented”. That is, with UDP you just send a datagram to a given address and hope it gets there. Each datagram is treated independently of any other datagram. With TCP, you establish a logical connection with the remote endpoint and this connection is used to preserve state with respect to the communication between endpoints (for example, to handle all the lower-level packet reordering, confirmation, and verification needed to make TCP reliable).
One implication of the above is that once a TCP connection has been established, data can only be sent and received on that connection by one of the two endpoints involved in creating the connection. That is, for each endpoint, the socket associated with the connection will only ever be used to communicate with the other endpoint to which the connection was made. With UDP, a single socket can send to any endpoint, and can receive data from any endpoint.
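To make the UDP half of that concrete, here is a minimal sketch in which a single receiving socket accepts datagrams from two different senders. It runs entirely on the loopback interface. The class name `UdpDemo`, the payload strings, and the two-second timeout are just my choices for illustration.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

static class UdpDemo
{
    // Sends one datagram from each of two independent sockets to a single
    // receiving socket, and returns the payloads the receiver saw.
    public static string[] Run()
    {
        // Port 0 asks the OS to pick any free port for the receiver.
        using var receiver = new UdpClient(new IPEndPoint(IPAddress.Loopback, 0));
        var receiverEndPoint = (IPEndPoint)receiver.Client.LocalEndPoint;
        receiver.Client.ReceiveTimeout = 2000; // don't block forever if a datagram is lost

        using var senderA = new UdpClient();
        using var senderB = new UdpClient();
        byte[] a = Encoding.ASCII.GetBytes("from A");
        byte[] b = Encoding.ASCII.GetBytes("from B");

        // No connection is established; each Send names its destination.
        senderA.Send(a, a.Length, receiverEndPoint);
        senderB.Send(b, b.Length, receiverEndPoint);

        var results = new string[2];
        for (int i = 0; i < results.Length; i++)
        {
            // Receive fills in the address of whoever sent this datagram.
            var remote = new IPEndPoint(IPAddress.Any, 0);
            byte[] data = receiver.Receive(ref remote);
            results[i] = Encoding.ASCII.GetString(data);
        }
        return results;
    }
}
```

Note that the code makes no assumption about the order in which the two datagrams arrive, and a real application could not assume that both arrive at all.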
Since UDP has no connection, obviously the question of the connection being reset is irrelevant. The OS will in fact generate an error if it can tell right away that a specific datagram is undeliverable (e.g. by virtue of there simply not being a route to the recipient). But otherwise, errors that might occur during delivery go unreported to the sender. But with TCP, the network driver is doing some work to manage the connection, and can report back if some unexpected failure occurs.
The one thing that tends to surprise beginners, though, is that this error detection only happens if you try to send some data. The physical connection between endpoints can come and go without any problem being noted, as long as neither end tries to actually use the connection during an interruption. It’s only if one end tries to send data and fails that a connection error will be noted. This is actually a good thing, because it means that the connection is more robust, but some people just learning network programming are surprised when they don’t get errors they thought they would.
It’s very unusual to need to change this behavior, but if one decides that’s a requirement, the solution is simple: send data periodically. This can either be part of the application protocol, or enabled specifically for the TCP socket. In either case, the technique is known as “keep alive”, which is ironic because the main thing it does is kill your connection in situations when it otherwise would have been fine. :)
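For the socket-level variant, a minimal sketch looks something like this. The class and method names are hypothetical; the `SetSocketOption` call is the standard .NET way to turn the option on. How often probes are actually sent is left to the operating system's defaults here, which are typically very long (on the order of hours), so treat this as a starting point only.

```csharp
using System.Net.Sockets;

static class KeepAliveExample
{
    // Creates an unconnected TCP socket with the keep-alive option enabled.
    public static Socket CreateSocketWithKeepAlive()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        // Ask the TCP stack to probe an otherwise idle connection periodically.
        socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.KeepAlive, true);
        return socket;
    }
}
```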
Why are my messages sent over TCP getting all squashed together? Why are my messages sent over TCP getting broken apart?
These two questions are really part of the same behavior: TCP guarantees the order of the bytes you send, but not the grouping. If you send several logical “messages” in quick succession, the network driver may coalesce them into a single transmission or, if they are large, at least into groups that fit the underlying network protocol instead of whatever grouping you sent them in. Likewise, even a single send of some block of data can be broken apart and received in smaller pieces, especially if the block of data is large and there are delays in transmitting some of the pieces of the data.
Note: while the two above questions are the precise manifestations of this behavior, at first glance when this is going on it often looks to the programmer as though some of the data is simply not being sent at all (usually because the code has received the data, but then ignored it because it wasn’t designed to deal with multiple sends being received together). So, if you’re using TCP and you think that you’re losing data, there’s a good chance this is the mistake you made.
When using TCP, if some sort of message-based communication is desired, it’s up to the application to implement that. The simplest mechanism is to send one block of data with each connection, closing the connection when the block has been completed. This is inefficient, but if the blocks are large and the number of them is small, it can work fine.
The other options are either to preface each transmission with the number of bytes to follow (the length field itself must be well-defined, either as a fixed-size value or as a terminated string), or to delimit the data in some way (for example, sending null-terminated strings).
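Here is a minimal sketch of the length-prefix approach. It assumes a 4-byte big-endian length field, which is purely my choice for illustration, and the class and method names are hypothetical. It works against any `Stream`, so the same code would apply to a `NetworkStream`. The important part is the read loop: a single `Read` may return fewer bytes than requested, so the code keeps reading until it has the whole header or payload.

```csharp
using System;
using System.Buffers.Binary;
using System.IO;

static class LengthPrefixFraming
{
    // Writes a 4-byte big-endian length followed by the payload bytes.
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        byte[] header = new byte[4];
        BinaryPrimitives.WriteInt32BigEndian(header, payload.Length);
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Returns false if the stream ended cleanly before a new message began.
    public static bool TryReadMessage(Stream stream, out byte[] payload)
    {
        payload = Array.Empty<byte>();
        byte[] header = new byte[4];
        if (!TryFill(stream, header)) return false;

        int length = BinaryPrimitives.ReadInt32BigEndian(header);
        if (length < 0) throw new InvalidDataException("Negative length prefix.");

        byte[] body = new byte[length];
        if (length > 0 && !TryFill(stream, body))
            throw new EndOfStreamException("Stream ended before the payload.");
        payload = body;
        return true;
    }

    // Loops until the buffer is full, because one Read call may return only part
    // of the requested data. Returns false only if the stream ends before any
    // byte was read; ending partway through is reported as an error.
    static bool TryFill(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
            {
                if (offset == 0) return false;
                throw new EndOfStreamException("Stream ended in the middle of a message.");
            }
            offset += read;
        }
        return true;
    }
}
```

A real protocol would probably also want an upper bound on the length it is willing to accept, so that a corrupt or malicious header can't make the receiver allocate an enormous buffer.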
Note that if delimiting is used, this often means including some way to quote the delimiter. If the data being sent is textual, a null-terminator may be sufficient and not need quoting because it will never show up in the data actually being sent. But most other situations involve sending data without any restrictions, and so some way to distinguish a true delimiter from just some data that happens to look like one is required.
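As a rough illustration of quoting, here is a sketch of a delimited format for arbitrary binary data. The byte values chosen for the delimiter (0x00) and the escape marker (0x1B), along with all the names, are assumptions of mine and not part of any standard. Any payload byte that collides with either special value is preceded by the escape byte, and the decoder tracks whether the previous byte was an escape.

```csharp
using System.Collections.Generic;

static class DelimitedFraming
{
    const byte Delimiter = 0x00; // marks the end of a message (illustrative choice)
    const byte Escape = 0x1B;    // the next byte is literal data (illustrative choice)

    // Produces the escaped payload followed by one delimiter byte.
    public static byte[] Encode(byte[] payload)
    {
        var output = new List<byte>(payload.Length + 1);
        foreach (byte b in payload)
        {
            if (b == Delimiter || b == Escape) output.Add(Escape);
            output.Add(b);
        }
        output.Add(Delimiter);
        return output.ToArray();
    }

    // Splits received bytes into complete messages. In this sketch, bytes after
    // the last delimiter are dropped; real code would keep them, along with the
    // escape state, until the next receive completes the message.
    public static List<byte[]> Decode(byte[] received)
    {
        var messages = new List<byte[]>();
        var current = new List<byte>();
        bool escaped = false;
        foreach (byte b in received)
        {
            if (escaped) { current.Add(b); escaped = false; }
            else if (b == Escape) { escaped = true; }
            else if (b == Delimiter) { messages.Add(current.ToArray()); current.Clear(); }
            else { current.Add(b); }
        }
        return messages;
    }
}
```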
Why is my data getting corrupted?
It’s important to keep in mind that the code executing at each end of a networked system is not only running on a different computer, it might not have even been compiled with the same compiler, written in the same language, etc. It’s unusual to run into issues when you’re writing both ends yourself using the same tools and especially unusual if you’re running both ends on the same physical computer. But even that’s not impossible. The important thing to keep in mind is that you can’t take anything for granted. Structure layout, data type sizes, character encoding, etc. are all very much language-, compiler-, and environment-dependent.
The solution is to decide ahead of time on a precise definition of how the data will be formatted, and then make sure that each end of the networked code is written to translate (if necessary) between that precise definition and whatever the “natural” format for the data is in that environment.
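As a small sketch of what such a “precise definition” might look like in code, consider a made-up six-byte message consisting of a 32-bit id and a 16-bit count, both big-endian. The layout and all the names are hypothetical. The point is that every field has a stated size, position, and byte order, and nothing depends on how the local machine happens to lay out its data in memory.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical wire format, for illustration only:
//   bytes 0-3: message id, 32-bit signed integer, big-endian
//   bytes 4-5: item count, 16-bit unsigned integer, big-endian
static class WireFormat
{
    public const int Size = 6;

    // Translates from the local representation to the agreed wire format.
    public static byte[] Encode(int id, ushort count)
    {
        byte[] buffer = new byte[Size];
        BinaryPrimitives.WriteInt32BigEndian(buffer.AsSpan(0, 4), id);
        BinaryPrimitives.WriteUInt16BigEndian(buffer.AsSpan(4, 2), count);
        return buffer;
    }

    // Translates from the agreed wire format back to the local representation.
    public static (int Id, ushort Count) Decode(byte[] buffer)
    {
        if (buffer.Length != Size)
            throw new ArgumentException("Expected exactly " + Size + " bytes.", nameof(buffer));
        int id = BinaryPrimitives.ReadInt32BigEndian(buffer.AsSpan(0, 4));
        ushort count = BinaryPrimitives.ReadUInt16BigEndian(buffer.AsSpan(4, 2));
        return (id, count);
    }
}
```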
For example, when sending text data you might be using an environment in which either ASCII or Unicode is a permissible format. Failing to standardize your application protocol on one or the other can lead to each end sending data in a different format than that expected by the other end. In C#, this is less of a problem because there’s no practical way to get directly at the bytes in the string data type; you have to go through some form of text encoding/decoding anyway, and so it’s simple enough to just declare ahead of time what character encoding will be used. But do make sure you make that decision and stick with it.
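One way to “make that decision and stick with it” is to define the encoding in exactly one place and route all conversions through it. The sketch below assumes UTF-8 as the agreed encoding, which is my choice for the example (the point is only that both ends agree), and the names are hypothetical.

```csharp
using System.Text;

static class TextOnTheWire
{
    // The one agreed-upon encoding for the application protocol. No byte order
    // mark is emitted, and invalid byte sequences cause an exception on decode
    // instead of being silently replaced.
    static readonly Encoding WireEncoding =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    public static byte[] ToWire(string text) => WireEncoding.GetBytes(text);

    public static string FromWire(byte[] bytes) => WireEncoding.GetString(bytes);
}
```

The same string produces different bytes under different encodings, which is exactly why both ends have to agree. For instance, `Encoding.Unicode` (UTF-16) uses two bytes for characters that UTF-8 encodes in one.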
How do I make sure data I send is sent from a specific IP address?
You don’t. It’s the network driver’s job to decide what the best way to send data is.
Well, what can I control then?
You can control the address that others must use in order to send data to you. Every network protocol has a standard way of specifying addresses. For TCP/IP, this is an IP address and a port number. Generally speaking, the IP address describes an actual network adapter and the port number describes some specific application using that adapter.
When creating a network object (e.g. a socket), both of these need to be specified, either explicitly or implicitly. Most common would be for an application expecting to receive connections (often described as the “server”) to decide on a port (these are well-defined for protocols like HTTP, FTP, POP3, SMTP, etc. which use ports 80, 21, 110, and 25, respectively) and then use the special “any” IP address to indicate that it wants to receive traffic sent to the specified port on any of the network adapters present. An application initiating a connection (often described as the “client”) would specify not only “any” for the IP address, but also 0 for the port number. This allows the network driver to select an available port for the client to use.
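The following sketch shows both roles in one process, using the loopback interface so it can run on a single machine. The names are hypothetical. A real server would pass its agreed-upon port; the default of 0 here is only so that the example runs without colliding with a port that is already in use. The client never specifies a local address or port at all, and the sketch reads back whatever was chosen for it.

```csharp
using System.Net;
using System.Net.Sockets;

static class BindingExample
{
    // Starts a listener on the "any" address, connects a client to it over
    // loopback, and returns the ports that each side ended up using.
    public static (int ServerPort, int ClientPort) Run(int serverPort = 0)
    {
        // Server: "any" address plus a port (0 means "let the OS choose").
        var listener = new TcpListener(IPAddress.Any, serverPort);
        listener.Start();
        try
        {
            int actualServerPort = ((IPEndPoint)listener.LocalEndpoint).Port;

            // Client: no explicit local address or port is given.
            using var client = new TcpClient(AddressFamily.InterNetwork);
            client.Connect(IPAddress.Loopback, actualServerPort);
            using TcpClient accepted = listener.AcceptTcpClient();

            // The local port the network stack selected for the client.
            int clientPort = ((IPEndPoint)client.Client.LocalEndPoint).Port;
            return (actualServerPort, clientPort);
        }
        finally
        {
            listener.Stop();
        }
    }
}
```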
As a general rule, if you create a socket and use it before you’ve bound it to a particular endpoint address, the platform (.NET, Winsock, etc.) will attempt to bind the socket implicitly to the “any:0” address the first time the socket is used in a way that requires a bound address. Note that for servers, it is generally easier if you use a consistently defined port. Numbers between 5000 and 49151 are best. See the Winsock FAQ for more details.
The above doesn’t even come close to covering all the potential “gotchas” to be found when writing network code. But I hope that it does a sufficient job of describing the most common and/or most important ones. At the very least, it should help emphasize that while the essentials of network I/O are actually reasonably simple, there are lots of little details that are important to get right. Otherwise, things simply don’t work.
For the rest of this series, I’ll be posting code and explanations for a variety of different kinds of network applications. For simplicity, in all cases, the server will do nothing but just send back to the client whatever it received (sometimes called an “echo server”). We’ll start with a simple peer-to-peer, single-connection implementation and move up from there. See you next time!