May 25

Multithreading: more on CLR locks

In the previous post we took a quick peek at how to use the CLR lock primitive. In this post we’re going to talk about some interesting things you might need to know when you work with these locks. According to Joe’s reference, CLR locks spin internally for a while before waiting on a kernel object. Unfortunately, there’s really no way for you to change how long they spin (if you’re a C/C++ programmer, you probably know that Win32 critical sections let you set the spin count used before a thread falls back to waiting).

One thing that we didn’t discuss is what to expect when we have exceptions. For instance, consider the traditional code we’ve seen before:

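Monitor.Enter(locker); // locker: any reference-type object (illustrative name)
try { /* work with the shared state */ }
finally { Monitor.Exit(locker); }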

Ok, now, what happens if you get an exception between Monitor.Enter and the try block? Impossible, you say… I thought that too, but it seems the JIT compiler can emit a simple nop instruction between the Monitor.Enter call and the try block (apparently the x64 JIT does this on pre-3.5 versions!). If an async exception hits while the instruction pointer is at that nop, the lock is never released: the thread already owns the monitor but hasn’t entered the try block yet, so the finally block won’t run.

Fortunately, it seems like the C# lock keyword ensures that there are no IL instructions between Monitor.Enter and the try block (at least in non-debug builds for x64; it seems this problem does not affect the x86 JIT compiler). Since the Enter method is written entirely in *unmanaged* code, it can’t be interrupted by a managed exception either (meaning that even if a managed async exception is raised during that method, you’ll only see it inside the try block).

So, I guess it’s fair to say that you should be fairly safe as long as you’re not running a debug build on x64.
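To make the two shapes concrete, here’s a minimal sketch, assuming you can target .NET 4 or later, where Monitor.Enter has an overload that takes a ref bool lockTaken argument. The class name, the Gate field and the counter are made up for illustration. The first method is the classic pattern discussed above; the second moves the Enter call inside the try block, so the finally block always runs and only releases the lock if it was really taken.

```csharp
using System;
using System.Threading;

public static class LockPatterns
{
    // Hypothetical lock target and shared state, used only for this sketch.
    private static readonly object Gate = new object();
    private static int _counter;

    // Classic pattern: Enter sits outside the try block, so an exception
    // raised between Enter and try would leave the monitor owned.
    public static int IncrementClassic()
    {
        Monitor.Enter(Gate);
        try
        {
            return ++_counter;
        }
        finally
        {
            Monitor.Exit(Gate);
        }
    }

    // .NET 4+ pattern: Enter sits inside the try block and reports through
    // lockTaken whether the monitor was actually acquired.
    public static int IncrementWithLockTaken()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(Gate, ref lockTaken);
            return ++_counter;
        }
        finally
        {
            if (lockTaken)
            {
                Monitor.Exit(Gate);
            }
        }
    }
}
```

Both methods behave the same in the normal case; the difference only matters when an exception arrives around the Enter call.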

As we’ve seen, all managed objects can be used for “locking on”. Understanding why and how this works is a good exercise (and that’s what I’ll try to do here, based on what I’ve learned from Richter’s and Duffy’s books; it goes without saying that both are required reading for anyone working in .NET). All CLR objects have an object header (a chunk of memory) which sits just before the address a reference points to. The CLR uses this header for storing the hash code (after you’ve called GetHashCode, that is), COM interop info and the so-called thin lock.
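Before digging into the header, here’s a quick reminder of what “any object” means in practice. This is just a sketch: the class name, the thread count and the iteration count are arbitrary choices. A plain object with no state of its own is enough to protect a shared counter, because the lock lives in the object’s header and not in any field.

```csharp
using System;
using System.Threading;

public static class AnyObjectLockDemo
{
    // Increments a shared counter from several threads, using a plain
    // object as the lock target, and returns the final count.
    public static int Run()
    {
        object gate = new object(); // no fields, no special base class
        int total = 0;
        Thread[] workers = new Thread[4];

        for (int i = 0; i < workers.Length; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int n = 0; n < 1000; n++)
                {
                    lock (gate)
                    {
                        total++;
                    }
                }
            });
            workers[i].Start();
        }

        foreach (Thread worker in workers)
        {
            worker.Join();
        }

        return total; // 4 threads x 1000 increments each
    }
}
```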

The thin lock is of special interest to us: it contains the ID of the monitor’s owning thread, encoded in fewer bits than a native word. Whenever possible, the CLR will use this thin lock in the header for locking. Unfortunately, things start to get a little “tight” when you need to put all these things in the header. For instance, if we need to allocate an event handle for waiting purposes (which is done by the monitor, as we’ll see in the next paragraphs), more space will be needed. When this happens, the CLR does its magic and performs the so-called header inflation.

Whenever the CLR starts, it creates an array of sync blocks. By default, an object’s header doesn’t point to any of these blocks. When the CLR sees that an object needs more space than its header offers, it looks for an available sync block in that array and sets the header to the index of the sync block that should be used. As you might expect, the contents that were stored in the header are copied to the sync block.
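The inflation step can be pictured with a toy model. To be clear, this is not how the CLR lays things out internally: every type, field and encoding below is hypothetical, and the model isn’t thread safe. It only illustrates the idea of copying the header contents into a sync block and leaving an index behind.

```csharp
using System;
using System.Collections.Generic;

// Toy stand-in for an object header (hypothetical layout).
public sealed class ToyHeader
{
    public int ThinLockOwnerId;     // 0 means "nobody owns the thin lock"
    public int SyncBlockIndex = -1; // -1 means "not inflated"
}

// Toy stand-in for a sync block (hypothetical fields).
public sealed class ToySyncBlock
{
    public int OwnerThreadId;
    public bool HasWaitEvent;
}

// Toy stand-in for the sync block array created at startup.
public sealed class ToySyncBlockTable
{
    private readonly List<ToySyncBlock> _blocks = new List<ToySyncBlock>();

    // "Inflates" the header: copies what it held into a sync block and
    // stores the block's index in the header.
    public int Inflate(ToyHeader header)
    {
        if (header.SyncBlockIndex >= 0)
        {
            return header.SyncBlockIndex; // already inflated
        }

        _blocks.Add(new ToySyncBlock { OwnerThreadId = header.ThinLockOwnerId });
        header.SyncBlockIndex = _blocks.Count - 1;
        header.ThinLockOwnerId = 0; // the owner now lives in the sync block
        return header.SyncBlockIndex;
    }

    public ToySyncBlock Get(int index)
    {
        return _blocks[index];
    }
}
```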

Btw, since we mentioned sync blocks, you should also keep in mind that the CLR is able to increase the number of blocks in the array and that it also manages the deflation of the header (done during garbage collection), all in a thread-safe manner.

Back to the topic, which is how monitors implement locks…

As we’ve seen, locks always spin before waiting on a kernel object. Whenever spinning isn’t enough for a thread to acquire the monitor, an auto-reset event is created and a reference to it is put in the associated sync block (as we’ve seen, inflation is always needed when we have to wait on a kernel object). Deflation is always performed at GC time for all the objects whose sync blocks aren’t needed any more (don’t forget that a sync block holds other info besides “locking” and it won’t be “deleted” if that info is still needed). Regarding our topic of threading, it’s important to understand that sync blocks aren’t “cleaned up” while a thread owns the monitor or is waiting for it (which, in practice, means that you might end up leaking kernel objects if you don’t release the monitor when you’re done with it).
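The spin-then-wait idea can be sketched with a tiny hand-rolled lock. This is only an illustration of the technique, not the CLR’s implementation: the class name, the spin counts and the waiter counter are my own assumptions, the lock isn’t reentrant and it never disposes its event. What it does show is the kernel event being allocated lazily, only when spinning fails.

```csharp
using System;
using System.Threading;

public sealed class SpinThenWaitLock
{
    private const int SpinAttempts = 100; // arbitrary value for this sketch

    private int _held;               // 0 = free, 1 = held
    private int _waiters;            // threads on the slow path
    private AutoResetEvent _wakeup;  // created only under contention

    public void Enter()
    {
        // Fast path: spin for a while hoping the owner leaves soon.
        for (int i = 0; i < SpinAttempts; i++)
        {
            if (TryAcquire()) return;
            Thread.SpinWait(20);
        }

        // Slow path: make sure the event exists, then block on it.
        AutoResetEvent wakeup = GetOrCreateEvent();
        Interlocked.Increment(ref _waiters);
        while (!TryAcquire())
        {
            wakeup.WaitOne();
        }
        Interlocked.Decrement(ref _waiters);
    }

    public void Exit()
    {
        Interlocked.Exchange(ref _held, 0);
        // Waiters register only after creating the event, so it exists here.
        if (Volatile.Read(ref _waiters) > 0)
        {
            GetOrCreateEvent().Set();
        }
    }

    private bool TryAcquire()
    {
        return Interlocked.CompareExchange(ref _held, 1, 0) == 0;
    }

    private AutoResetEvent GetOrCreateEvent()
    {
        AutoResetEvent existing = Volatile.Read(ref _wakeup);
        if (existing != null) return existing;

        AutoResetEvent created = new AutoResetEvent(false);
        existing = Interlocked.CompareExchange(ref _wakeup, created, null);
        if (existing == null) return created;

        created.Dispose(); // another thread published its event first
        return existing;
    }
}
```

In real code you’d simply use lock or Monitor; the point of the sketch is that the kernel object is a cost you only pay under contention.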

And I guess that’s all for today. Stay tuned for more on multithreading.