I'm starting to take a close look at multithreading and I thought it would be a good idea to start writing about this topic. This should help me understand the concepts better (I understand things better when I try to explain them to others, and that's one of the reasons I'm writing this series) and it might even help me clear up some doubts that I'll have along this long road (yes, I'm counting on the readers of this blog to help me improve my knowledge in this area). Let's get started, shall we?
State is an important concept for any program. It's also the reason for most of the problems we encounter when we write multithreaded programs. In the old days, we had only one thread accessing state and everything was fine. With the appearance of multithreading, that is no longer the case: most of the time, you'll end up needing shared state for reading and writing operations. This can lead to several unwanted scenarios. For instance, a thread might read a variable that has only been partially written (e.g., reading a 64-bit value on a 32-bit architecture, where the write may be performed as two separate 32-bit stores). Because of that, I'm convinced that the first step in understanding multithreading is understanding state and how it influences the way we write code.
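To make the 64-bit example a bit more concrete, here is a minimal sketch of how a shared long could be read and written atomically through the Interlocked class. The Ticks class and its member names are hypothetical, made up just for this illustration.

```csharp
using System.Threading;

// Hypothetical wrapper around a 64-bit value shared between threads.
public class Ticks
{
    private long _value; // 64-bit field; plain reads/writes may tear in a 32-bit process

    // Writes all 64 bits as a single atomic operation.
    public void Set(long value)
    {
        Interlocked.Exchange(ref _value, value);
    }

    // Reads all 64 bits as a single atomic operation.
    public long Get()
    {
        return Interlocked.Read(ref _value);
    }
}
```

With this in place, a reader can only observe values that were actually passed to Set, never half of one value combined with half of another.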
In fact, it helps to see state as being one of two types: shared and private. As we'll see, problems generally happen when we deal with shared state. We've been talking about shared state for some time, but what is *shared* state? Since I'll be using C#, let's discuss this topic in its context. In this case, shared state includes:
- class fields (instance and static);
- state passed to a thread during creation;
- all objects reachable from a shared root reference.
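The list above can be sketched in code. This is only an illustration under my own assumptions: Counter and StateDemo are hypothetical names, and the comments mark which pieces of state I would consider shared and which private.

```csharp
using System.Threading;

// Hypothetical type used only to illustrate the kinds of state listed above.
public class Counter
{
    // Static field: shared by every thread in the process.
    public static int TotalCreated;

    // Instance field: shared by every thread that can reach this object.
    public int Value;
}

public static class StateDemo
{
    public static int Run()
    {
        // Local value-type variable that is never captured: private to this thread.
        int privateLocal = 10;

        // This object becomes shared as soon as another thread gets a reference to it.
        var shared = new Counter { Value = 1 };

        // State passed to a thread during creation is shared with that thread.
        var worker = new Thread(state =>
        {
            if (state is Counter counter) // same object as 'shared'
            {
                counter.Value = 42;       // write to shared state
            }
        });
        worker.Start(shared);
        worker.Join(); // wait until the worker has finished before reading

        return shared.Value + privateLocal;
    }
}
```

Calling StateDemo.Run() returns 52: the worker's write to the shared object (42) plus the local that never left the calling thread (10).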
When writing multithreaded code, working with shared state is the main pain point we'll face. For instance, let's consider the following code:
int a = 1; //suppose this is shared
a++;
Take a close look at the code. Do you think that the a++ line is thread safe? (Suppose that a is shared after being created; for instance, think of it as a field of a class that is shared across several threads.) The answer is no, because a++ is really a = a + 1 which, in practice, is typically translated into machine instructions that perform the following steps:
- load a into a register;
- increment the value of that register;
- store the value of that register back into the memory location occupied by variable a.
When you think of that simple statement as a set of instructions, and once you notice that a sequence of several hardware instructions isn't atomic by default, we can surely agree that things might not go as expected when several threads execute that statement and variable a is shared between those threads! Ok, let me tell you what I consider to be correct results with two threads t1 and t2: starting from a = 1, if each thread executes a++ once, a should end up as 3. That is what happens when one thread runs all three steps before the other thread starts. If both threads load a before either one stores its result, both store 2 and one of the increments is lost.
As you can see, if we can ensure that the a++ operation is executed atomically, then we will end up with the correct results (assuming, of course, that letting the a variable be incremented several times is expected and allowed).
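To see the difference between the two outcomes without depending on thread timing, here is a small single-threaded simulation. It is just a sketch: the locals r1 and r2 stand in for the registers of t1 and t2, and the class and method names are hypothetical.

```csharp
// Single-threaded simulation of two threads (t1, t2) each executing a++.
// r1 and r2 play the role of each thread's register.
public static class LostUpdateDemo
{
    // t1 runs all three steps, then t2 runs all three steps.
    public static int OneAfterTheOther()
    {
        int a = 1;

        int r1 = a;   // t1: load a (1)
        r1 = r1 + 1;  // t1: increment register (2)
        a = r1;       // t1: store back (a == 2)

        int r2 = a;   // t2: load a (2)
        r2 = r2 + 1;  // t2: increment register (3)
        a = r2;       // t2: store back (a == 3)

        return a;
    }

    // Both threads load a before either one stores its result.
    public static int Interleaved()
    {
        int a = 1;

        int r1 = a;   // t1: load a (1)
        int r2 = a;   // t2: load a (still 1)
        r1 = r1 + 1;  // t1: increment register (2)
        r2 = r2 + 1;  // t2: increment register (2)
        a = r1;       // t1: store back (a == 2)
        a = r2;       // t2: store back (a == 2, t1's update is lost)

        return a;
    }
}
```

OneAfterTheOther() returns 3, while Interleaved() returns 2. With real threads you can't choose the interleaving, which is exactly why the unprotected a++ isn't safe.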
Atomicity is an important concept, especially in the multithreaded world. An operation or a set of operations is atomic if it appears to happen all at once: other threads observe either none of its effects or all of them, never an intermediate state. For instance, loading an aligned, word-sized value into a register is an atomic operation. However, the operations performed as a result of the a++ statement aren't (at least, not by default).
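One way of making the increment atomic in C# is Interlocked.Increment, which performs the load, add and store as a single indivisible operation. The sketch below is only an illustration; the class name, the field and the parameters are assumptions of mine.

```csharp
using System.Threading;

public static class AtomicIncrementDemo
{
    private static int _a; // shared between all the worker threads

    // Starts threadCount threads that each increment _a incrementsPerThread times.
    public static int Run(int threadCount, int incrementsPerThread)
    {
        _a = 0;
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                {
                    // Atomic replacement for _a++: no increment can be lost.
                    Interlocked.Increment(ref _a);
                }
            });
            threads[i].Start();
        }

        // Wait for every worker before reading the final value.
        foreach (var thread in threads)
        {
            thread.Join();
        }

        return _a;
    }
}
```

AtomicIncrementDemo.Run(4, 100000) always returns 400000. If you swap the Interlocked call for a plain _a++, you will typically see a smaller number on a multicore machine.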
Besides atomicity, there are two important concepts that might help us understand what's going on. I'm talking about serializability and linearizability. Two operations are serializable when one happens before the other. For instance, if we look at the set of operations that result from the a++ statement in a *single* threaded program, we can say that we have achieved serializability because all three operations are executed one after another. With multithreaded programming, that doesn't happen by default (and this is one of the things we need to ensure in order to guarantee that our program works correctly in multithreaded environments).
Linearizability is related to serializability, and you can think of it as the point at which a set of updates becomes visible to other threads. Achieving linearizability is fundamental for ensuring the correctness of multithreaded applications. In practice, and as we'll see in the future, you achieve this by wrapping code in critical regions.
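As a small preview of what a critical region can look like in C#, here is a sketch that uses the lock statement so that two related updates become visible to other threads together. Account, Deposit, Snapshot and CriticalRegionDemo are hypothetical names, used here only for illustration.

```csharp
using System.Threading;

// Hypothetical type: two fields that must always be updated together.
public class Account
{
    private readonly object _sync = new object();
    private int _balance;
    private int _operations;

    public void Deposit(int amount)
    {
        // Critical region: only one thread at a time can execute this block,
        // so nobody holding the lock sees _balance updated without _operations.
        lock (_sync)
        {
            _balance += amount;
            _operations++;
        }
    }

    public (int Balance, int Operations) Snapshot()
    {
        // Readers take the same lock to get a consistent pair of values.
        lock (_sync)
        {
            return (_balance, _operations);
        }
    }
}

public static class CriticalRegionDemo
{
    public static (int Balance, int Operations) Run()
    {
        var account = new Account();
        var threads = new Thread[4];

        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 10000; j++)
                {
                    account.Deposit(1);
                }
            });
            threads[i].Start();
        }

        foreach (var thread in threads)
        {
            thread.Join();
        }

        return account.Snapshot();
    }
}
```

CriticalRegionDemo.Run() returns (40000, 40000). The important part is that both Deposit and Snapshot take the same lock; a reader that skipped it could see a balance that doesn't match the operation count.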
And I guess that is all for today. More thoughts about threading in future posts.