LA.NET [EN]

Multithreading Archive

Oct 08

Transforming loops in methods which invoke delegates

Posted in Multithreading       Comments Off on Transforming loops in methods which invoke delegates

In a previous post, we’ve started talking about data parallelism. Before getting started with the “real stuff”, I thought it would be a good idea to write a small post that explains how to break our loops into units of work. For instance, consider the following snippet:

int[] arr = { 1, 2, 3 };
for (var i = 0; i < arr.Length; i++) {
    //do some work here
}

Before thinking about executing the loop in parallel, we need to identify the main units of work. It probably won't take you long to see that the work you're doing in the loop will need to be wrapped inside an action. For instance, here's a possible solution for creating a general for method which is capable of doing the same thing as the previous example:

void DoLoop(int startIndex, int endIndex, Action<int> loopBody)
{
    for (var i = startIndex; i < endIndex; i++)
    {
        loopBody(i);
    }
}

As you can see, we’ve introduced an Action delegate which “moved” the work you’re doing in the loop into an Action which you can define. That means that you’d now use the following method for iterating over the elements in the array:

DoLoop(0, arr.Length, i => { /* do some work */ });

In practice, this means that we can concentrate our multithreading efforts in the DoLoop method. The consumer of the method stays “ignorant” and will only have to worry about defining the Action that is executed (don’t forget to respect the assumptions presented in the previous post).

Things aren’t as straightforward when you don’t know the size of the collection. For instance, when you’re working with an IEnumerator, there’s no way for you to get the number of items without going through all the elements in the collection (and this is something which you might not really want to do in order to improve speed). One of the best approaches here is to build buffers of x items and handle them until you reach the end of the collection (we’ll leave this code for a future post).
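Just to make the buffering idea a little more concrete, here's a minimal sketch of what such a helper might look like (ToChunks is a name I'm making up purely for illustration; we'll see a proper implementation in that future post):

static IEnumerable<List<T>> ToChunks<T>(IEnumerable<T> items, int chunkSize) {
  var buffer = new List<T>(chunkSize);
  foreach (var item in items) {
    buffer.Add(item);
    if (buffer.Count == chunkSize) {
      yield return buffer; //hand a full chunk to the caller
      buffer = new List<T>(chunkSize);
    }
  }
  if (buffer.Count > 0) {
    yield return buffer; //the last (possibly smaller) chunk
  }
}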

Anyway, I digress. The main purpose of this post is completed! Now that we’ve identified the units of our loop, we can easily update the code of the DoLoop method so that it uses threads for doing its work. In the next post, we’ll start with the simplest approach there is: using the ThreadPool to add parallelism to our DoLoop method (there’s a rough sketch below, just to give you an idea of where we’re heading). Stay tuned!
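Here's that hedged sketch (ParallelDoLoop is my own name, and this is not the final implementation we'll be discussing): it splits the range into one chunk per processor, queues each chunk on the ThreadPool and blocks until every chunk is done:

void ParallelDoLoop(int startIndex, int endIndex, Action<int> loopBody) {
  var workerCount = Environment.ProcessorCount;
  var chunkSize = (endIndex - startIndex + workerCount - 1) / workerCount;
  var pending = workerCount;
  using (var done = new ManualResetEvent(false)) {
    for (var w = 0; w < workerCount; w++) {
      var chunkStart = startIndex + w * chunkSize;
      var chunkEnd = Math.Min(chunkStart + chunkSize, endIndex);
      ThreadPool.QueueUserWorkItem(_ => {
        //run this chunk of the loop on a pool thread
        for (var i = chunkStart; i < chunkEnd; i++) {
          loopBody(i);
        }
        //last chunk to finish wakes up the caller
        if (Interlocked.Decrement(ref pending) == 0) {
          done.Set();
        }
      });
    }
    done.WaitOne();
  }
}

Again, treat this as a sketch: the next post will go through the details (and the gotchas) properly.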

Oct 04

Improving performance with data parallelism

Posted in Multithreading       Comments Off on Improving performance with data parallelism

I know that it’s been a long time since I’ve written on multithreading (I’ve been really busy in JavaScript land!), but that is still an area which interests me and that’s why I’ll be writing on it from time to time.

As we’ve seen along this series, there are several things you might do to improve the performance of your apps. One of the most used techniques for improving performance is relying on something called data parallelism. The idea is to divide the received data into smaller chunks and then give each chunk to a different processor in order to maximize the throughput of your algorithm. Generally, this is a good approach for those cases where you have several items and need to perform some sort of operation over each one of them. In other words, this might be a good option when you need to parallelize a loop.

Notice that there are several things you should keep in mind before starting to apply this technique. The first thing you need to understand is that you should only apply this when you have a big enough collection of items (big enough depends on the work you’ll be doing with each item and you should always check your “gut feeling” with appropriate measurements). Don’t forget that data parallelism adds overhead to the main operation you’re performing: you’ll be splitting data and then you’ll probably need to get it all back again before returning the processed data back to the user.

If you’ve already decided that this is the way to go, then you need to make sure that the body of the for loop is thread safe. It goes without saying that thread safety shouldn’t be achieved through the use of locks. Doing that would completely defeat the purpose of executing several operations in parallel and would mean that the total time of execution wouldn’t really be improved (in many scenarios, you’d probably end up degrading your algorithm’s performance).

There’s still one last thing you should keep in mind: you should only use this technique when the order of execution for each item in the loop doesn’t matter. If you look at existing literature, you’ll see that some authors tend to refer to these as non-associative loops.

And that’s it for this short introduction. In the next couple of posts we’ll be seeing some code for improving the performance of your loops (always keep in mind the observations I’ve made in this post). Keep tuned for more on multithreading.

Jul 06

[Update: Brad detected a flaw in the code: I had forgotten to initialize the _initialized field. Thanks!]

[Update2: Brad detected another flaw in the code: return _instance must be outside the if. Thanks!]

In the previous post we’ve seen how we can use the C# volatile keyword to guarantee that those nasty load-load reorderings stay away from our code. As I’ve said before, we can also use the static Thread.VolatileRead or Thread.VolatileWrite methods for having more control over the way fences are applied to our code. Going back to our previous volatile example, the question is: do we really need a fence whenever we access our instance variable?

Looking at the code, I guess that we can get away by just using an acquire fence on the initialization of the instance. Recall that an acquire fence is an optimization of the full fence and ensures that no load or store that comes after the fence can be moved before the fence (it’s just what we need to ensure proper initialization and eliminate the possible load/load reorderings allowed by the CLR).

With this in mind, let’s update our sample, ok? Btw, we’ll be using another variable for controlling initialization (we’re picking an integer). This is your best option for initializing value types since you can’t control its size or check it for null (don’t forget our previous discussion on word size, alignment and .NET). Here’s the final code:

class Lazy {
  private Object _locker = new Object();
  private SomeObject _instance = null;
  private Int32 _initialized = 0;
  public SomeObject SomeObject {
    get {
      if (Thread.VolatileRead(ref _initialized) == 0) {
        lock (_locker) {
          if (_initialized == 0) {
            _instance = new SomeObject();
            _initialized = 1;
          }
        }
      }
      return _instance;
    }
  }
}

This code is also correct and will behave properly in all the current architectures that run Windows and the CLR. There’s no need for running another VolatileRead on the inner comparison due to a thing called control dependency (check this post by Joe Duffy for more info). Notice that in these posts our main objective is ensuring that you end up getting only one instance of a specific type. As I’ve said, if you don’t care about creating multiple instances and only need to ensure that you’ll have only one active instance, you can rely on the Interlocked.CompareExchange method alone. We’ll see how in the next post. Keep tuned!

Jul 06

In the last post, I showed you some code I’ve written in the past and asked if there was anything wrong with it. Here’s the code again:

class Lazy {
  private SomeObject _object;
  private Object _locker = new Object();
  public SomeObject SomeObject {
    get {
      if (_object == null) {
        lock (_locker) {
          if (_object == null) {
            _object = new SomeObject();
          }
        }
      }
      return _object;
    }
  }
}

Here’s the main idea (of the double-checked locking pattern): you start by checking the instance against null. If it is null, then you acquire the lock and test again. We’re testing it again because someone might have already initialized the instance in the time that passed between the initial test and the lock acquisition. Ok, so is there anything wrong with this?

To answer this question correctly, we need to go back to the memory model supported by the CLR. As we’ve seen, the CLR won’t allow store-store reorderings (meaning that all the other types are allowed). If we assume that SomeObject has fields (and this is really a valid assumption), then they will be initialized during construction. So, if we’re using the CLR, everything should be ok because store-store reorderings aren’t allowed.

However, the CLR allows load-load reorderings, meaning that the load of the instance can be moved after the load of its fields, meaning that we could get into trouble. And what happens if we’re writing code that should be run against another ECMA CLI implementation? For instance, say we want to write code that will also run in Mono (I don’t really know Mono, so I don’t know if it follows the CLR 2.0 tighter rules). In this case, store-store reorderings are possible and our code might break if the store of the _object instance occurs before the store of its fields. In this case, the test will see a non-null _object whose fields haven’t been written to yet. Solving this is as simple as adding volatile (recall that volatile allows only store-load reordering!). Here’s the code again:

class Lazy {
  private volatile SomeObject _object;
  private Object _locker = new Object();
  public SomeObject SomeObject {
    get {
      if (_object == null) {
        lock (_locker) {
          if (_object == null) {
            _object = new SomeObject();
          }
        }
      }
      return _object;
    }
  }
}

And that’s it. Adding the volatile keyword solves all the problems mentioned before. Keep tuned for more on multithreading.

Jul 05

Today we’re only going to talk about the volatile keyword. The volatile keyword can be used on the declaration of a field, transforming it into a volatile field. Currently, you can only annotate a field with this keyword if it is:

  • a reference type;
  • a pointer type (unsafe code);
  • one of the following types: sbyte, byte, short, ushort, int, uint, char, float or bool;
  • an enum with a base type of byte, sbyte, short, ushort, int or uint.

As we’ve seen, volatile ensures that proper fences are applied when someone accesses that field (ie, reading means having an acquire fence and writing ends up injecting a release fence). As you know by now, load and store reordering can happen at several levels and you might be wondering if using volatile is enough for ensuring that fences are applied on all levels. Fortunately, the answer is yes: the volatile keyword is respected by the compiler and by the processor.
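As a quick illustration of those semantics, here's the classic stop-flag example (a minimal sketch of my own, not code from this series):

class Worker {
  private volatile Boolean _shouldStop;
  public void Stop() {
    //volatile write: a release fence is injected with this store
    _shouldStop = true;
  }
  public void DoWork() {
    //volatile read on every iteration: an acquire fence is injected,
    //so the read can't be hoisted out of the loop
    while (!_shouldStop) {
      //do some work here
    }
  }
}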

Ok, so when should you use this keyword? Well, probably an example is in order, right? Let’s take a look at the following code, which I’ve written in the past for lazy loading:

class Lazy {
  private SomeObject _object;
  private Object _locker = new Object();
  public SomeObject SomeObject {
    get {
      if (_object == null) {
        lock (_locker) {
          if (_object == null) {
            _object = new SomeObject();
          }
        }
      }
      return _object;
    }
  }
}

What is your opinion? Do you see anything wrong (btw, suppose SomeObject is a reference type with some properties)? I’ll return tomorrow and we’ll come back to this discussion. Keep tuned!

Jul 04

Multithreading: a final example on how CompareExchange might help you

Posted in C#, Multithreading       Comments Off on Multithreading: a final example on how CompareExchange might help you

In the last posts we’ve been poking around memory models, memory fences and other interesting things that might lead to the so-called lock-free programming. Before we keep looking at how we can use those features from our C# code, I’d like to add one more example that shows how the interlocked operations might help you in the real world. Do you recall our implementation of the IAsyncResult interface?

At the time, we’ve used a lock for ensuring proper access and initialization of our manual reset event used for signaling the end of the operation. Now that we’ve met interlocked operations, we can improve that code and get rid of the lock. Let’s focus on the private GetEvtHandle method:

private ManualResetEvent GetEvtHandle() {
  var newEvt = new ManualResetEvent(false);
  if (Interlocked.CompareExchange(ref _evt, newEvt, null) != null) {
    newEvt.Close();
  }
  if (_isCompleted) {
    _evt.Set();
  }
  return _evt;
}

As you can see, we’ve replaced the lock with a CompareExchange method call for ensuring a proper atomic update: if the _evt field is null, then set it to newEvt (which, if you recall our previous post on interlocked operations, will only happen when that value is null!). Since the method returns the old value, if we get anything different from null, then it means that some other thread already set the field to a valid value. When that happens, we need to clean up and close the manual reset event we’ve just created. And there you go: the lock is gone.

As you’ve probably guessed, this strategy can be adapted to other objects that implement the IDisposable interface and you can use it when the creation of new objects isn’t too expensive and you need to have only a single instance of an item (in other words, it might be a valid option for implementing singletons in multithreading scenarios).
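To make that generalization concrete, here's a hedged sketch (EnsureInitialized is just a name I'm making up; the factory may run on several threads, but only one instance "wins" and the losers get disposed):

static T EnsureInitialized<T>(ref T field, Func<T> factory)
    where T : class, IDisposable {
  if (field == null) {
    var candidate = factory();
    if (Interlocked.CompareExchange(ref field, candidate, null) != null) {
      //another thread won the race: discard our freshly created instance
      candidate.Dispose();
    }
  }
  return field;
}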

And that’s it. Keep tuned for more on multithreading.

Jul 03

In the last post, we’ve talked about memory fences. Today we’re going to keep looking at this topic, but we’re turning our attention to coding, ie, we’re going to talk about the options we have to add fences to our classes. In .NET, things are relatively straightforward.

Whenever we use one of the interlocked methods we’ve met in the past, we’re adding a full fence to our code. Locking will also end up using full fences. As you can see, you’re already using full fences in several places. The good news is that you can also be specific about them, ie, there’s a method you can call if you want to add a fence to your code in a specific place: I’m talking about the Thread.MemoryBarrier static method (btw, this method will also add a full fence).

Volatile reads or writes generate fences too! In C#, you can use the volatile keyword to mark a field as volatile. Reading a volatile field is the “equivalent” of having an acquire fence. Writing to a volatile field can be seen as “adding” a release fence. Now, it’s important to notice that you’ll always get these read and write behaviors whenever you use a volatile field. If you’re looking for more control (ex.: you’re only interested in having an acquire fence for a read), then you should rely on the Thread.VolatileRead or Thread.VolatileWrite methods for ensuring the desired behavior.
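Here's a small sketch of that finer-grained approach (reusing the SomeObject type from the other posts; this is illustration code of mine, not a pattern lifted from the docs):

class Publisher {
  private Int32 _flag;
  private SomeObject _data;
  public void Publish(SomeObject data) {
    _data = data;
    //we only pay for a fence on this specific write...
    Thread.VolatileWrite(ref _flag, 1);
  }
  public SomeObject TryGet() {
    //...and on this specific read; all other accesses stay "plain"
    return Thread.VolatileRead(ref _flag) == 1 ? _data : null;
  }
}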

And I guess that’s all there’s time for today. In the next post, we’ll keep looking at multithreading and see how we can use these features on a C# program. Keep tuned!

Jul 03

Multithreading: introducing memory fences

Posted in C#, Multithreading       Comments Off on Multithreading: introducing memory fences

A few posts back, we’ve introduced the concept of load and store reordering. As we’ve seen, reordering operations exist as a way of improving performance and can be introduced on several levels (starting at compilation time and ending at runtime when the processor executes the instructions). We saw that even though things can get chaotic quickly, there are some guarantees we can hold on to when writing multithreaded code. One of those guarantees is that all platforms respect a specific memory model and that’s what we’ll be talking about in this post.

A memory model defines which kinds of moves may occur (ie, which loads and stores can be moved). If you’ve got a weak memory model, then you’ll get plenty of allowable moves and this will lead to superior performance. However, you’ll also need to pay lots of attention to the code you write. Allowing fewer moves will, of course, lead to less complexity, but won’t give you (hum…I mean, the compiler and processor) that many chances for optimizing your code.

Since we’re talking about .NET here, we’ll focus the rest of the post on the valid assumptions for the CLR. The CLR has a strong memory model. In practice, this means that several compiler optimizations are forbidden and that it should be fairly easy (sort of…) to write code that is portable across the several architectures where the code might run. Before going on, it’s important to notice that the CLR memory model is tighter than the one you get in the ECMA spec.

In the CLR, you can get reordering for load/load, load/store and store/load. The only one which isn’t permitted is store/store reordering (meaning that a store can never move past another store). Volatile loads and stores are different and only allow store/load reordering (we’ll be talking about volatiles in future posts). Btw, the ECMA specification allows all these move types.

Ok, so if those store and load reorderings are allowed, how can we stop them from happening? Ah, glad you asked! We can use fences (or barriers) to ensure that they don’t occur at specific times.

A fence (aka barrier) prevents memory load and store reordering from happening. There are several types of fences. Full fences are probably the best known and most used type. A full fence ensures that no load or store moves across the fence (ie, no load or store before the fence can move after it, nor may any load or store placed after the fence move before it).

Besides the ubiquitous full fence (which is available everywhere), there are other variations. With Store fences, no store can move over the fence (it’s ok for reorders to happen with loads). Load fences are similar, but in this case, only loads are “fixed”.

Finally, there are also a couple of “one way” fences: acquire and release fences. Acquire fences ensure that no memory operation that happens after the fence can be moved before the fence. Release fences work the other way around: instructions defined after the fence may happen before the fence, but no “pre-fence” instruction may happen after the fence.

As you might have guessed by now, fences lead to a more sequential model which will, without any doubt, degrade your application’s performance. This means that you should apply them carefully. Yes, we do need fences, but do keep in mind that using them reduces the ability to reorder and optimize the code you write.

By now, I guess that we’ve covered most of the theory around fences. You might be asking: how do I use fences in my .NET code? Good question, but we’ll leave the answer for the next post. Keep tuned for more on multithreading.

Jul 02

In the previous post, we’ve started looking at interlocked operations. As we’ve seen, interlocked operations are great at what they do but they won’t be usable in all scenarios (ie, don’t think that they’ll solve all your locking problems). To show how things might go awry when using interlocked operations, I’ll reuse a great example written by Raymond Chen a few years ago (I’m updating it to C#):

class Program {
  private static Object _lock = new Object();

  public static Int64 InterlockedMultiply(
      ref Int64 multiplicand, Int64 multiplier) {
    Int64 result = 0;
    lock (_lock) {
      var aux = multiplicand;
      Thread.Sleep(100); //oops!!!
      result = multiplicand = aux * multiplier;
    }
    return result;
  }

  static void Main(string[] args) {
    Int64 a = 5;
    new Thread(() => InterlockedMultiply(ref a, 5)).Start();
    new Thread(() => {
      Thread.Sleep(50);
      Interlocked.Increment(ref a);
    }).Start();
    Thread.Sleep(2000);
    Console.WriteLine(a);
  }
}

The idea is to add a safe multiplier method. As we’ve seen in the previous post, interlocked increments are atomic. That means that they’re executed as a “single” operation by the processor. Since we didn’t have a method that performs the same operation for multiplication, we’ve decided to mimic that behavior by adding a new method which uses a lock to ensure proper multiplication.

If I asked you what Console.WriteLine(a) would print, what would you say? For now, forget those nasty Sleep invocations (they’re there to force the wrong behavior)… I’m guessing that you’d probably say that Console.WriteLine will only write 26 or 30. It will write 26 if InterlockedMultiply “beats” Increment or 30 if Increment is run before InterlockedMultiply. Ah, well, with those nasty sleep instructions, I’ve managed to get 25 here on my machine. Wtf? How? Why?

Well, what happened is logical… Interlocked.Increment will always update the value in a single atomic operation (this means it will load, update and then store the value in a “single” step). However, InterlockedMultiply only ensures that the code wrapped by the lock will be executed by one thread at a time. Look at that method carefully… can you see a load followed by a store? Those two operations aren’t performed atomically like the one you get through the Interlocked.Increment method!

There is a solution to this problem, but it involves looping until you get a valid result. Take a look at the method updated to work correctly:

public static Int64 InterlockedMultiply(
    ref Int64 multiplicand, Int64 multiplier) {
  Int64 result = 0;
  Int64 aux = 0;
  do {
    aux = multiplicand;
    result = aux * multiplier;
  } while (Interlocked.CompareExchange(
      ref multiplicand, result, aux) != aux);
  return result;
}

As you can see, we’re using the Interlocked.CompareExchange method to ensure that multiplicand will only be updated if it hasn’t changed during the execution of that loop. That happens because the Interlocked.CompareExchange method will always return the value that was stored in multiplicand at the time of the call (recall that CompareExchange always returns the original value of the 1st parameter passed to the method at the time of the call).

As you can see, interlocked operations don’t ensure proper serialization of your code. They only guarantee that the interlocked operation is done atomically. And I guess that’s all for now. Keep tuned for more on multithreading.

Jul 02

Multithreading: introducing the interlocked operations

Posted in C#, Multithreading       Comments Off on Multithreading: introducing the interlocked operations

As we’ve seen in the previous post, most processors give us important guarantees regarding memory loads and stores. However, even though those guarantees are important and can be used in several scenarios, the truth is that they aren’t enough for all real world tasks.

Fortunately, most processors also offer a group of interlocked operations which enable atomic compare and swap scenarios. These operations rely on hardware and interprocessor synchronization. Notice that these operations aren’t as simple as they might seem at first sight. For instance, it’s important to recall that in today’s architectures, these kinds of operations need to play well with caches. My point is that even though these operations tend to be cheaper than the traditional locks, they’re still not cheap (for instance, there are cases where an interlocked operation ends up locking the bus and that is not a good thing).

Currently, there are several kinds of interlocked operations. In .NET, all interlocked operations are exposed as members of the Interlocked class (there’s a short usage snippet after the list):

  • Add: adds two integers (int or long) and replaces the first with the value of the sum;
  • CompareExchange: compares two values and, if they’re equal, replaces the first with a third value (notice that this method receives three parameters). This is probably the most important method of this class (it supports compares and exchanges on reference types too!);
  • Decrement: similar to Add, but in this case it subtracts one from the variable;
  • Exchange: updates a variable with another value;
  • Increment: adds one to an existing variable;
  • Read: reads a value from a variable.
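Here's a quick, contrived tour of the most common ones (illustration code of mine, not from the docs):

var counter = 0;
Interlocked.Increment(ref counter);                   //counter: 0 -> 1
Interlocked.Add(ref counter, 10);                     //counter: 1 -> 11
var previous = Interlocked.Exchange(ref counter, 42); //returns 11; counter is now 42
//stores 100 only if counter is still 42; returns the value it found there
var found = Interlocked.CompareExchange(ref counter, 100, 42);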

As I’ve said before, the advantage of using these methods is that all the operations are performed atomically. The Read method might not seem necessary at first until you look at its signature:

public static long Read(ref long location);

As you can see, it should only be used on 32-bit systems when you need to access 64-bit integers. Atomically setting a value on 64 bits can be done through the Exchange method:

public static extern long Exchange(ref long location1, long value);

This means that you can easily build a generic write or read routine and reuse them across your programs:

public static class Generic64Helper {
  private const int WordSize32 = 4;
  public static void WriteLong(ref long location, long value) {
    if (IntPtr.Size == WordSize32) {
        //32 bits: a plain 64-bit store isn't atomic, so use Interlocked
        Interlocked.Exchange(ref location, value);
    } else {
        //64 bits: an aligned 64-bit store is already atomic
        location = value;
    }
  }
  public static long ReadLong(ref long location) {
    if (IntPtr.Size == WordSize32) {
        //32 bits: use Interlocked.Read to get an atomic 64-bit load
        return Interlocked.Read(ref location);
    }
    return location;
  }
}

There are a couple of interesting things going on here. First, we use the IntPtr.Size property to get the size of the “current” word. On 64 bits, we really don’t want to pay the price of the interlocked operation and will simply delegate to a hardware atomic write or read. However, on 32-bit systems (where the word size is 4), we really need to use those methods to have atomicity. Reusing this class is rather simple, as you can see from the next snippet (t is initialized with 10 and y is initialized with the value of t):

long t = 0;
Generic64Helper.WriteLong(ref t, 10);
long y = Generic64Helper.ReadLong(ref t);

There are a couple of gotchas with the previous approach (notice that only the read and the write are atomic), but we’ll leave that for a future post. Now, the important thing is to concentrate on introducing most of these methods. If you use Reflector, you should see that Interlocked.Read depends directly on the CompareExchange method:

public static long Read(ref long location) {
  return CompareExchange(ref location, 0L, 0L);
}

This is really a cool trick!

As I’ve said before, the CompareExchange method is one of the most important methods exposed by the class. The CompareExchange method receives three parameters: if the first and third parameters are equal, then it updates the first with the value of the second parameter. The method will always return the initial value of the first parameter (even when there’s no update).

So, the code used on the Read method will only end up doing a write (ie, a store) if the current stored value is 0. And even when this happens, there really isn’t any update in the value (notice that if its value is 0 we’re writing 0 again!).

The remaining methods exposed by the class are fairly obvious, so I won’t really go into examples here (the docs do a good job of showing how to use these methods). The important thing to take from this post is that these operations are atomic and are performed at the hardware level. In the next post, we’ll keep talking about this class and about how you can reuse these methods to add some lock-free programming to your apps. Keep tuned!

Jun 29

In the previous post, we’ve started looking at memory load and store reordering. In this post, we’re going to continue our study and introduce atomicity. Atomicity is a really important concept which we’ve met in the past. We’ve already talked about it at a higher level: do you recall our discussion on critical regions (and how they’re implemented through critical sections)?

With critical regions we were able to get atomicity at a high level (though at a logical level – recall that using a critical region you can have a “logical” instruction which is really a group of low level instructions). In this post, we’re really into understanding memory loads and stores and that’s why we’re interested in low level atomicity. At this level, atomic operations are performed as a single instruction between the processor and memory and ensure that a thread never sees a corrupt value.

Regarding memory access, all processors ensure that we have atomic loads and stores of aligned word sized values (note that I’m talking about the current processors on which Windows can run).

Since I’m also a beginner, I think that explaining these concepts a little bit more might be a good idea. Let’s start with the concept of word sized values. Word sized values, aka pointer sized values, represent the maximum amount of memory a processor can handle at a time. For instance, on a 32-bit processor, the word sized value is 32 bits, ie, 4 bytes. On a 64-bit processor, you get an 8 byte word sized value. Notice that we generally use bytes for memory sizes instead of bits. I guess that you’ve got the general idea, right?

Now, the second part, which is also important: aligned. We say that a value is aligned if its address begins at a position which is evenly divisible by a certain memory unit size. For instance, a value is said to be 4 byte aligned if its memory starts at a position which is evenly divisible by 4. Here’s a practical example: if you’re loading a word sized value which “starts” at 0x28, then you can be sure that you’re accessing a value that can be 4 or 8 byte aligned (notice that 0x28 = 40 in decimal, which is evenly divisible by 4 or by 8).

Since this is really important, I guess I’ll repeat myself again: you’ll *only* get atomicity when you load or store an aligned word sized value. If you’re loading or storing a value which is smaller than the processor’s word size, you’ll still need to respect the current alignment. For instance, if you’re on a 32-bit processor, where words are 4 bytes long, then that value should be placed at an address which is evenly divisible by 4 (note that you’ll probably need to “fill” the other 3 bytes with padding – relax, this is generally done by the compiler 🙂 – so that the next value is also aligned to ensure proper atomicity).

On the other hand, if you need more space than is available in the current processor’s word size, then you will not get hardware atomicity for that load or store (and this happens even if the value is aligned). In these cases, you’ll need to watch out because you can’t simply load and store a value without any further consideration. If you do this, then you can end up with a thread loading the value before another has completed storing it! (btw, this is known as a torn read)

(It’s important to understand that this behavior is also observed for chunks of memory smaller than or equal to the processor’s word size if that value isn’t aligned.)

For instance, this means that if you’re writing multithreaded code that uses long variables and you’re running that code on 32-bit processors, then you shouldn’t forget to protect those write and read operations! (in future posts, we’ll talk about interlocked operations; if you’re not using them, then you’ll need at least a lock – but don’t get too smart using a lock for “writing only” because that will not work in all the scenarios).

Now that we’re clear on alignment and processor word sizes, it’s time to take a quick look at how things work in the CLR. The good news is that the C# compiler and the JIT ensure proper alignment in all cases. In practice, values bigger than 4 bytes on 32-bit processors and values bigger than 8 bytes on 64-bit processors always start on 4 or 8 byte aligned boundaries. When we’re working with smaller values, the CLR will also ensure proper placement, filling the remaining space with padding.

If you want, you can have more control over the way fields are aligned. If you’ve done interop programming, then you’ve surely met the StructLayoutAttribute class. This class allows you to control the way fields are laid out in a specific type. If you’re thinking about using this feature (for instance, to control the amount of wasted memory), then proceed with care (in fact, think thrice – yes, I learned this word a few days ago and I could hardly wait to use it in a post 🙂 – before going down this path!). The problem is that you can easily end up losing the CLR’s type safety, which means you’ll probably end up getting exceptions from your code at runtime.
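Here's a contrived sketch of what that (mis)use might look like (assuming the usual System.Runtime.InteropServices import; PackedSample is just a name I'm using for illustration):

//Pack = 1 removes the padding the CLR would normally insert...
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedSample {
  public Byte Flag;   //1 byte
  public Int64 Value; //starts at offset 1: no longer 8 byte aligned!
}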

It’s important to understand that whenever you work with values that fall out of the aligned word size value (a non-aligned value or a value bigger than the current word size), the compiler will end up generating multiple instructions. As we’ve seen, these might end up leading to torn reads if you don’t take the necessary precautions.

Notice that even though stores and loads of aligned word sized values are atomic, they don’t really let us do much. Why? Simply because there are several scenarios where we need to check a value before updating it and this means that we end up with a load followed by a store. In these cases and in order to ensure atomicity, we’re back to locks (interesting: have you ever thought about how locks are implemented?)… or maybe not! The truth is that we’ve got a couple of interlocked operations which ensure atomicity and that are perfect for these scenarios. We’ll talk about them in the next posts. Don’t you think that things are getting rather interesting? Keep tuned!

Jun 29

Multithreading: load and store reordering

Posted in C#, Multithreading       Comments Off on Multithreading: load and store reordering

Until now, we’ve been busy taking a look at several interesting topics associated with multithreaded programming. As we’ve seen, one of the most problematic areas in multithreaded programs is sharing state across multiple threads. As we’ve seen in several posts along this series, we can use critical regions for ensuring that shared state is accessed by one thread at a time. In other words, using a critical region ensures that access to shared state is serialized, which preserves the correctness of the program.

One thing that surprises many people is learning that even access to a single variable isn’t always safe. Understanding why leads us to a new concept: memory load and store reordering. Even though most of us have grown used to thinking that a program executes precisely in the order we wrote it, that isn’t really guaranteed. Currently, memory operations like loads (ie, accessing a variable) and stores (ie, putting a value into a variable) can be reordered under the optimization “banner”:

  • the first “optimization” you might get will be performed by the compiler. Compilers might move, eliminate (or even add) memory loads and stores in your app’s code. However, compilers will always preserve sequential behavior (though their reordering can break multithreaded code);
  • processors might also change the way compiled code gets executed. For instance, modern processors tend to use branch prediction to improve the performance of your program. This is just one of the optimizations that may break your code when run in parallel (ie, after adding multithreading to your app);
  • caches may give you wrong results too. Most modern processor architectures employ several levels of caches. Some are shared between all processors, while others are processor specific. Caches tend to break the “memory as a big array” vision, leading to an effective reordering of loads and stores (at least, this is the perception you get when caching starts breaking your code).

After this small introduction, you should be worried about the code you write because it seems like nothing is safe (if even a simple variable access smaller than the “current” processor word size isn’t safe, then what can we do?). Fortunately, there are some guarantees which we can use to write safe multithreaded programs:

  • instruction reordering cannot break the sequential evaluation of the code. What this means is that your code should always run safely and correctly if you run it on a single thread (meaning that we only need to worry about reordering when we write multithreaded code);
  • data dependency will always be respected. As an example, this means that if you have something like x = 10; y = x; then memory access won’t be reordered (ie, you will never get y=x; x = 10) because there is a data dependency between x and y;
  • finally, all platforms conform to a specific memory model which defines the rules that are to be followed for memory load and store reordering.

I guess that the main thing you should take from these points is that what is run isn’t always what you’ve written. On the next post, we’ll keep looking at these issues and see how critical regions help with memory reordering issues. Keep tuned!

Jun 25

Multithreading: the BackgroundWorker class

Posted in C#, Multithreading       Comments Off on Multithreading: the BackgroundWorker class

In the last posts of the series, we’ve been looking at many important features related to GUIs and multithreading. Today, we’re going to wrap up this topic with the BackgroundWorker class. The previous posts are really targeted at library developers.

The truth is that having to implement the EAP pattern to get asynchronous behavior is simply too much work. Interacting with the SynchronizationContext object directly is really too cumbersome and involves too much boilerplate code.

Wouldn’t it be good if we had a reusable class where we’d simply need to write the code for the asynchronous operation and that would be able to let us know when things are done? The good news is that there’s already a component you can use for that: the BackgroundWorker class.

The BackgroundWorker class can be seen as a helper class which encapsulates all that boilerplate code we’ve seen in previous posts. Here’s the public API of this class:

public class BackgroundWorker : Component {
  public event DoWorkEventHandler DoWork;
  public event ProgressChangedEventHandler ProgressChanged;
  public event RunWorkerCompletedEventHandler RunWorkerCompleted;
  public BackgroundWorker();
  public void CancelAsync();
  public void ReportProgress(int percentProgress);
  public void ReportProgress(int percentProgress, object userState);
  public void RunWorkerAsync();
  public void RunWorkerAsync(object argument);
  public bool CancellationPending { get; }
  public bool IsBusy { get; }
  public bool WorkerReportsProgress { get; set; }
  public bool WorkerSupportsCancellation { get; set; }
}

As you can see, this API is really similar to the one we ended up with after adding cancellation to the initial EAP implementation. The truth is that you can see this class as a reusable EAP implementation. You’ve got several events used for:

  • signaling the start of an asynchronous operation;
  • for reporting progress;
  • for notifying about the completion of an operation.

Asynchronous operations can be started through one of the two overloads of the RunWorkerAsync method. This method ends up firing the DoWork event on a separate thread. You’re supposed to handle this event and put the code that should run asynchronously in that handler.

Cancellation is also available through the CancelAsync method. If you recall our previous discussion about the EAP pattern, then you know that this class supports only one asynchronous operation at a time (notice that CancelAsync does not receive any object parameter). It’s also important to notice that you need to allow cancellations by setting the WorkerSupportsCancellation property to true.

Reporting is also available: you can use the ReportProgress method for marshalling progress information back to the “main thread”. Anyone that is interested in progress reports should handle the ProgressChanged event. The same observation we’ve made about cancellation is also valid for progress reporting: you need to enable it by setting the WorkerReportsProgress property to true.
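Putting the pieces together, here's a minimal usage sketch (progressBar1 is a hypothetical control, and the work loop is a stand-in for the real thing):

var worker = new BackgroundWorker {
  WorkerReportsProgress = true,
  WorkerSupportsCancellation = true
};
worker.DoWork += (sender, e) => {
  var bw = (BackgroundWorker)sender;
  for (var i = 0; i < 100; i++) {
    if (bw.CancellationPending) {
      e.Cancel = true;
      return;
    }
    //do a slice of the real work here
    bw.ReportProgress(i + 1);
  }
};
//these two handlers run on the thread that started the worker
//(ie, the GUI thread, if that's where RunWorkerAsync was called)
worker.ProgressChanged += (sender, e) =>
  progressBar1.Value = e.ProgressPercentage;
worker.RunWorkerCompleted += (sender, e) =>
  MessageBox.Show(e.Cancelled ? "Cancelled" : "Done");
worker.RunWorkerAsync();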

As you might expect, all the operations are supported through an internal AsyncOperation instance which is created when you invoke the RunWorkerAsync method. And I guess that’s it. With this post, we’ve ended our quick analysis on GUIs and multithreading. However, there’s still a lot of things to talk about multithreading, so I’ll keep writing about it. Stay tuned for more!

Jun 24

Multithreading: AsyncOperationManager and AsyncOperation helpers

Posted in C#, Multithreading       Comments Off on Multithreading: AsyncOperationManager and AsyncOperation helpers

In the last posts, we’ve taken a look at how synchronization contexts help marshal work across threads. Today we’re going to talk about two classes (which we’ve already met in the past when we implemented the EAP) that abstract even further the use of synchronization contexts: I’m talking about the AsyncOperationManager and AsyncOperation classes. Let’s start with the AsyncOperationManager class.

The AsyncOperationManager is a simple helper class with very few lines of code in it, as you can see from the next snippet:

public static class AsyncOperationManager {
  public static AsyncOperation CreateOperation(object userSuppliedState);
  public static SynchronizationContext SynchronizationContext { get; set; }
}

As you can see, this is a static class with only two members. The SynchronizationContext property lets you access or set the current SynchronizationContext. The biggest advantage it offers when compared with accessing the synchronization context through the SynchronizationContext.Current property (which we’ve used in the previous post) is that you’ll always get a valid context (notice the comment in the previous post’s sample).

The CreateOperation method is the only way you have to create an instance of the AsyncOperation type. Each AsyncOperation instance can only track one asynchronous operation. The AsyncOperation type exposes the following members:

public sealed class AsyncOperation {
  public void OperationCompleted();
  public void Post(SendOrPostCallback d, object arg);
  public void PostOperationCompleted(SendOrPostCallback d, object arg);
  public SynchronizationContext SynchronizationContext { get; }
  public object UserSuppliedState { get; }
}

A couple of observations about the previous methods exposed by the AsyncOperation type:

  • you can access the current synchronization context through the SynchronizationContext property;
  • the UserSuppliedState property lets you access the optional state parameter that you’ve passed through the userSuppliedState parameter (AsyncOperationManager.CreateOperation method);
  • you’re supposed to use the Post operation to marshal a notification back to the “main” thread;
  • you’re supposed to signal the end of the asynchronous operation by calling the PostOperationCompleted method.

As you can see, the AsyncOperation will always perform the operation in asynchronous fashion (ie, it will always invoke the Post method of the internal SynchronizationContext). The main difference between Post and PostOperationCompleted resides in an internal flag that is set by the 2nd method. After this flag is set, all invocations of the Post/PostOperationCompleted methods end up throwing exceptions.
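To see how these pieces fit together, here's a hedged sketch of a tiny component that uses them (SimpleAsyncComponent, StartWorkAsync and WorkCompleted are names I'm making up for illustration):

class SimpleAsyncComponent {
  public event EventHandler WorkCompleted;
  public void StartWorkAsync() {
    var operation = AsyncOperationManager.CreateOperation(null);
    ThreadPool.QueueUserWorkItem(ignored => {
      //...do the real work on the worker thread...
      //then marshal the completion notification back to the "main" thread
      operation.PostOperationCompleted(state => {
        if (WorkCompleted != null) {
          WorkCompleted(this, EventArgs.Empty);
        }
      }, null);
    });
  }
}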

By now, I’m thinking that you’ve got enough information for understanding how these types were used when we implemented the EAP pattern in a previous post, so I’m not going to waste your time repeating what has been said before.

And that’s all. With this post, I guess that there’s only one thing to talk about: the BackgroundWorker class. We’ll leave that topic for the next post. Keep tuned for more.

Jun 23

In the previous post, we’ve started looking at synchronization contexts. In this post, we’ll take a close look at the windows forms custom synchronization context. Whenever you run a windows forms app, you might end up interacting with the WindowsFormsSynchronizationContext. The constructor of this class is responsible for getting a reference to the GUI thread so that it can then create a control which will be used for marshalling the results back to the UI thread.

With a valid reference to a control and after understanding how work is marshaled back to the GUI thread (which involves posting win32 messages to the UI thread – as we’ve seen in this post), it shouldn’t be a surprise to learn that the Post and Send methods are overridden and that the new implementation relies on the BeginInvoke and Invoke methods of the special “marshal” control. Here’s the current implementation copied from reflector:

public override void Post(SendOrPostCallback d, object state) {
  if (this.controlToSendTo != null) {
    this.controlToSendTo.BeginInvoke(d, new object[] { state });
  }
}

public override void Send(SendOrPostCallback d, object state) {
  Thread destinationThread = this.DestinationThread;
  if ((destinationThread == null) || !destinationThread.IsAlive) {
    throw new InvalidAsynchronousStateException(
        SR.GetString("ThreadNoLongerValid"));
  }
  if (this.controlToSendTo != null) {
    this.controlToSendTo.Invoke(d, new object[] { state });
  }
}

As you can see, the implementation performs several auxiliary verifications to ensure that everything is still ok before going on with the invocation of the methods.

You might be wondering how things get hooked up, ie, how is the WindowsFormsSynchronizationContext set up in the current thread’s ExecutionContext? The answer is rather simple: the base Control class performs that task from its constructor, ensuring that whenever you create any window, you end up with the correct synchronization context.

So, how can we use this information for building multithreaded code for GUIs? The first thing you should do is base your code on synchronization contexts. This ensures that you get the correct behavior and don’t need to worry about GUI threads (notice that this should work in other custom environments that have their own custom synchronization contexts). To show you how you can use this class in code, we’re going to update the code of one of the previous posts so that it uses synchronization contexts instead of relying on a control to marshal work back into the GUI thread:

var number = GetNumberFromSomewhere();
button1.Enabled = false;
//should check for null here!
var syncContext = SynchronizationContext.Current;
ThreadPool.QueueUserWorkItem(state => {
  var ctx = state as SynchronizationContext;
  ctx.OperationStarted();
  var isPrime = false;
  Exception thrownException = null;
  try {
    //algorithm for checking if number is prime
  }
  catch (Exception ex) {
    thrownException = ex;
  }
  finally {
    ctx.Send(marshaledState => {
      var result = marshaledState as PrimeVerifierResult;
      button1.Enabled = true;
      if (result.ThrownException != null) {
        throw result.ThrownException;
      }
      if (result.IsPrime) {
        MessageBox.Show("Prime number");
      }
    }, new PrimeVerifierResult(isPrime, thrownException));
    ctx.OperationCompleted();
  }
}, syncContext);

A couple of observations on the previous snippet:

  • you should always check the context obtained from the Current property (though that is not done in this snippet);
  • updating the button’s state needs to be done from the GUI thread. This is a piece of cake because we’ve already got a reference to the current SynchronizationContext (obtained on the GUI thread and passed to the secondary thread through the state parameter) and the only thing we need to do is call Send or Post;
  • notice how we signal the beginning and ending of an operation by calling the OperationStarted and OperationCompleted methods. This allows us to notify the synchronization context that an asynchronous operation began so that it can perform any operation it sees fit.

Looking at the previous sample, you might think that it’s not as simple as the first one. And you’re right: if I was writing this code, I’d always go with option 1.

If instead of putting this code on the GUI, I told you that you’d need to write a class that could be reused across several GUIs, then you can probably start seeing value in the previous code. In fact, if you look at it and pay enough attention, you can start to see that it looks a lot like the code we had when we implemented the EAP pattern: the main difference is that this code relies on a SynchronizationContext instance while that old code used the AsyncOperation and AsyncOperationManager classes (btw, I’m talking about the internals here!).

The truth is that these classes are just helpers and in the next post we’ll see how they relate with synchronization contexts. Until then, stay tuned!

Jun 23

In the last post, we’ve seen how to marshal back the results obtained on a secondary thread so that controls are updated on the GUI thread. Today we’re going to start looking at synchronization contexts. Synchronization contexts are abstractions for marshalling between threads. In other words, they abstract those scenarios where you cannot call a method from the current thread and need to make sure that the method is executed on a specific thread (as we’ve seen, GUIs fall in this kind of scenario).

As we’ve seen in the previous example, GUI controls can only be updated from the GUI thread and in that example, we’ve relied on the ISynchronizeInvoke interface for running the code on the correct thread. In practice, whenever we’ve got a reference to a control, we can use its Invoke or BeginInvoke method to marshal work back to the GUI thread. The problem is getting a reference to the control. When we build the form, this is simple because we know the controls that are placed on the form and it is easy to get a reference to any of them.

However, suppose we’re building a general component that needs to be reused on several GUI apps. Now things become harder and we’d need to write boilerplate code for getting a control so that we can marshal work back into the GUI thread. This is where synchronization contexts step in and save the day. Before delving into the specifics of the synchronization contexts used on GUIs, let’s take a step back and study the general API introduced by the base SynchronizationContext class:

public class SynchronizationContext {
  public virtual SynchronizationContext CreateCopy();
  public bool IsWaitNotificationRequired();
  public virtual void OperationCompleted();
  public virtual void OperationStarted();
  public virtual void Post(SendOrPostCallback d, object state);
  public virtual void Send(SendOrPostCallback d, object state);
  public virtual int Wait(IntPtr[] waitHandles, bool waitAll,
      int millisecondsTimeout);
  public static SynchronizationContext Current { get; }
  public static void SetSynchronizationContext(
      SynchronizationContext syncContext);
}

As you can see, you can get (Current property) or set (by calling the static SetSynchronizationContext method) a thread’s synchronization context. Synchronization contexts are always obtained through the “current” ExecutionContext (which you can get through the ExecutionContext property of the Thread class). Notice that if you want to interact with synchronization contexts, you must be prepared for getting a null reference.

Even though a synchronization context is part of the ExecutionContext, it’s important to keep in mind that queuing an item on the thread pool will never propagate the “current” SynchronizationContext (not even when you use the QueueUserWorkItem method – even though the execution context is propagated, the synchronization context is one of the items that will not be propagated to the new thread).

Post and Send are probably the most important methods exposed by the class. As you can see, both expect a SendOrPostCallback delegate and an optional object parameter for passing some state to that delegate. The difference is that Post performs the operation in asynchronous fashion while Send performs a synchronous callback. The default implementation of these methods is rather simple: Post uses the thread pool while Send simply invokes the delegate.

The OperationStarted and OperationCompleted methods let you receive notifications when an operation starts and ends. The default SynchronizationContext doesn’t do anything in these methods, but as we’ll see, they can be used for (for example) tracking running operations.

Finally, we’ve got a Wait method. The CLR uses this method for waiting when IsWaitNotificationRequired returns true. What this means is that whenever you wait on a thread whose synchronization context requires wait notification, your wait calls end up being redirected to this Wait method. Custom synchronization contexts override this method to perform some specific actions whenever a thread needs to wait on a specific action.

And that’s it. After this brief introduction, we’re ready to take a look at the synchronization contexts used by windows forms. But we’ll leave that to the next post. Keep tuned!

Jun 22

In the last post, we’ve seen that multithreading is almost a necessity in GUIs. We’ve also seen that there are some gotchas associated with it: a control can only be updated from the GUI thread. In practice, this means that we’ll need to marshal the results back to the main thread when they’re ready.

In .NET, the Windows Forms platform introduces the ISynchronizeInvoke interface for performing that kind of operation:

public interface ISynchronizeInvoke {
  IAsyncResult BeginInvoke(Delegate method, object[] args);
  object EndInvoke(IAsyncResult result);
  object Invoke(Delegate method, object[] args);
  bool InvokeRequired { get; }
}

The Control class (which is reused by all the existing controls) implements this interface, letting you marshal the results back to the main thread by calling one of the Invoke methods.

As you can see from the interface API, you can block the secondary thread until the GUI is updated (in this case, you use the Invoke method, which is equivalent to calling BeginInvoke followed by EndInvoke), or you can perform that work in asynchronous fashion, by invoking the BeginInvoke method and using one of the available approaches for waiting on the IAsyncResult returned. Besides these methods, the Control class offers two extra helper methods which you can use when you don’t need to pass parameters to the delegate:

public IAsyncResult BeginInvoke(Delegate method);
public object Invoke(Delegate method);

These are just shortcuts to the previous methods and don’t offer any benefits over the interface methods.

The InvokeRequired property is there for checking if marshalling is needed. After all, you don’t want to marshal if you don’t have to, right? Checking if marshalling is needed involves getting the Win32 control’s handle and seeing if its window’s thread is the same as the current one.
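This leads to the canonical pattern for methods that may be called from any thread. Here's a minimal sketch (UpdateStatus and label1 are hypothetical names):

private void UpdateStatus(String text) {
  if (label1.InvokeRequired) {
    //wrong thread: marshal the call back to the GUI thread
    label1.Invoke(new Action<String>(UpdateStatus), text);
  } else {
    //already on the GUI thread: update the control directly
    label1.Text = text;
  }
}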

Internally, the Control class performs several interesting steps for executing the update on the GUI through the Invoke methods (sync or async):

  • it starts by getting a reference to the control (or parent control) which has an associated Win32 handle;
  • it checks the associated thread ID (the unmanaged thread ID) and sees if marshaling will be needed (this happens whenever the control’s window thread is different from the current thread that is calling the Invoke method – this relies on using the GetWindowThreadProcessId Win32 function);
  • it propagates the current ExecutionContext (more on this in future posts) so that it flows across threads;
  • if needed, it posts a message by using the Win32 PostMessage method (that’s why it needs to get a reference to a control that has a Win32 handle) and sets up a callback that will be fired when the message is processed.

As you’d expect, there’s a custom IAsyncResult implementation which performs many of the things we’ve seen before (like allocating a lazy event so that it is created only if it’s needed, etc, etc). To show you how easy it is to use marshalling, we’re going to create a Windows Forms test project which will use a secondary thread for calculating if a number is prime. We’ll start with a really simple form which has only one button for starting the operation:


Here’s the code we’ve added to the button’s Click event:

private void button1_Click(object sender, EventArgs e) {
  button1.Enabled = false;
  ThreadPool.UnsafeQueueUserWorkItem(state => {
    var isPrime = CheckIfNumberIsPrime((Int32)state);
    Action updater = () => {
      MessageBox.Show(isPrime.ToString());
      button1.Enabled = true;
    };
    button1.Invoke(updater, null);
  }, 19); // hardcoded number
}

As you can see, we’re making sure that all UI code runs on the GUI thread. The important thing here is making sure that the Enabled property is set from the correct thread. In pre-.NET 2.0 days, you could go ahead and set a property from a secondary thread. Most of the time, things would work and you’d get occasional crashes which were difficult to debug. From .NET 2.0 onward, the behavior changed: when you’re running in a debugger, you’ll always get additional checks which verify if the code is being called from the correct thread.

As a final optimization, you’ll probably want to stop the ExecutionContext from flowing in most GUI apps (especially for full trust apps). Doing this is as simple as calling the SuppressFlow method:

ExecutionContext.SuppressFlow();

And that’s it for today. Keep tuned for more on multithreading.

Jun 21

Multithreading: why multithreading on GUIs

Posted in C#, Multithreading       Comments Off on Multithreading: why multithreading on GUIs

In the latest posts, we’ve seen how to implement the event based asynchronous pattern (EAP). When we introduced the main features of the EAP, I said that you should use this pattern when the consumers of your classes are GUI programmers.

Since we’ve already learned lots of things about the EAP, now it’s probably a good time to start looking at asynchronous programming and GUIs. GUIs are probably one of the areas which get the most from multithreading. To understand why, we need to take a small detour on how GUIs work in Windows. Since I’m focusing on managed code, from now on I’ll be using the Windows Forms platform for illustration purposes (notice that the main concepts are common to unmanaged and managed code, and it looks like things will remain that way for several years).

On Windows, GUI apps rely on messages for signaling several kinds of operations. What this means is that mouse events, key events, etc end up generating messages that are sent to the GUI thread’s message queue. The main job of this thread is to pump the messages that are queued on the message queue.

In unmanaged code, pumping a message generally involves writing a loop that gets a message from the queue (GetMessage) and dispatches it (DispatchMessage) to a special window function (notice there’s always a window function associated with a window – if you don’t create a custom one, you’ll end up with the default one). Window functions inspect the message and run some code in response to specific messages.

Back in the managed world and Windows Forms, you’ll notice that all these details are hidden by several classes. For instance, pumping messages is supported through the Application.Run call which you get by default on all Windows Forms projects. Notice that this method “transforms” the thread on which it’s called into a GUI thread.

Windows Forms also encapsulates the so-called window function. In fact, if you look at the Control class, you’ll see that it exposes a protected WndProc method. Internally, this method translates the Windows messages into events which you might (or might not) handle from your app. Notice that even though there’s plenty of stuff exposed as events, you can still override this method if you need to handle a Windows message directly.
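
As a quick sketch of that override (WM_LBUTTONDOWN is just an example message; its value comes from the Win32 headers):

public class MyControl : Control {
    private const Int32 WM_LBUTTONDOWN = 0x0201; //from the Win32 headers

    protected override void WndProc(ref Message m) {
        if (m.Msg == WM_LBUTTONDOWN) {
            //handle the raw Windows message here, before it gets
            //translated into the usual managed events
        }
        base.WndProc(ref m); //let the default translation run
    }
}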

As you can see, we can say that the final result of pumping a message is running some code that performs a specific action. In Windows Forms apps, this generally means handling an event. Now, after the previous paragraph, it should be obvious why you shouldn’t do much work from within your event handlers: if you run a lengthy operation on the GUI thread, you’re not letting it pump more messages, leading to a blocked window (something which we’ve all seen in the past – and probably cursed the guys that wrote that app for it :)).

Blocking the GUI thread isn’t such a good idea either. However, it’s important to recall that on Windows you can block a thread while still pumping messages from the message queue. The CLR does this automatically for you and you have no control over it (which is good and bad, as we’ve seen in the past).

Now, since you already know a lot of things about multithreading, you know that you can easily execute that lengthy operation on a separate thread. The problem is that all Windows controls have thread affinity and can only be updated from the GUI thread. This means that when we are on a secondary thread, we need to marshal the results back to the GUI thread if we need to change a property of a control.

As you can see, using separate threads in GUI programming is a must for ensuring proper responsiveness of our application. However, it also needs special care to guarantee that everything works as expected. In the next posts we’ll start looking at some practical examples which show how to use multithreading on GUIs and at the internal details that support these kinds of operations. Keep tuned!

Jun 20

Multithreading: updating the code to support multiple asynchronous executions of one method

Posted in C#, Multithreading       Comments Off on Multithreading: updating the code to support multiple asynchronous executions of one method

In the previous posts, we’ve seen how to implement the asynchronous event based pattern and how to support several of its features.

Today we’re going to update our initial code to support several simultaneous asynchronous calls of the same method. If you look at that code, you’ll notice that for supporting a single operation, we’re using several fields:

  • an AsyncOperation (_currentOperation) instance that represents the current operation;
  • a boolean field (_isRunning) which indicates if there’s currently a running operation;
  • a boolean field used for supporting cancellation.

Supporting multiple concurrent calls is not really complicated, though it means we need to pay more attention to the way the internal fields of the class are accessed. The first thing we’ll need is a container for storing several items. You probably recall from previous posts that we mentioned an optional Object parameter for several of the methods exposed by a class. This optional parameter stops being optional when we want to support several asynchronous operations. In practice, it’s used as a key that identifies a specific asynchronous execution. Notice that the consumer of the API must remember that key if it wants to get progress info or if it needs to cancel a running operation. In this case, we’ll be using a Dictionary<Object, AsyncOperation> as the container that stores all the running asynchronous operations.
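
Before looking at the implementation, here’s a quick hypothetical usage sketch that shows the role played by the key (the signatures are the ones we’ll end up with by the end of this post):

var work = new DoSomeWork();
work.IsPrimeAsync(7919, "first");    //the key identifies this execution
work.IsPrimeAsync(104729, "second"); //a second, concurrent execution
//later on, we can cancel just one of the running operations:
work.CancelAsync("second");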

Using a dictionary means that we’ll have to translate the operations that relied on the fields into methods that use the dictionary. Here are the methods we’ve ended up adding to the previous class:

private IDictionary<Object, AsyncOperation> _operations =
                  new Dictionary<Object, AsyncOperation>();
private void AddOperation(Object key, AsyncOperation operation) {
  lock (_operations) {
    if (_operations.ContainsKey(key)) {
      throw new ArgumentException();
    }
    _operations.Add(key, operation);
  }
}
private AsyncOperation GetOperation(Object key) {
  AsyncOperation operation = null;
  lock (_operations) {
    operation = _operations[key];
  }
  return operation;
}
private void CancelOperation(Object key) {
  lock (_operations) {
    _operations[key] = null;
  }
}
private void EndOperation(Object key) {
  lock (_operations) {
    _operations.Remove(key);
  }
}
private Boolean IsRunning(Object key) {
  var isRunning = false;
  lock (_operations) {
    isRunning = _operations.ContainsKey(key) &&
                _operations[key] != null;
  }
  return isRunning;
}
private Boolean OperationWasCancelled(Object key) {
  Boolean cancelled = false;
  lock (_operations) {
    cancelled = _operations.ContainsKey(key) &&
                _operations[key] == null;
  }
  return cancelled;
}

As you can see, we’ve added several methods that let us cancel an operation (CancelOperation), check if an operation is running (IsRunning), check if an operation was cancelled (OperationWasCancelled) and clear an operation which has ended from the private dictionary (EndOperation). Notice that we’re using locks for ensuring proper dictionary access.

There are a couple of interesting changes between the previous snippet and the old code:

  • the first thing you should notice is that the asynchronous operations can only be identified by a key (we’ll see where that key comes from in the next paragraphs) and that’s why all the helper methods receive one;
  • if you compare the “running” concept with the one we had in the old code, you should notice that running now means having a valid (non-null) entry for the specified key (in the previous, simpler code, we used a simple field for checking if a task was running);
  • cancelling means that we set the associated AsyncOperation to null but don’t remove the entry from the dictionary;
  • we’ve also got a helper method that should only be called when the asynchronous operation ends and is responsible for removing the entry from the dictionary.

The next thing we need to change is the IsPrimeAsync method. Besides adding the Object parameter used for identifying an asynchronous operation, we also need to change the internal code used by that method:

public void IsPrimeAsync(Int32 number, Object key) {
  if (IsRunning(key)) {
      throw new InvalidOperationException();
  }
  var currentOperation = AsyncOperationManager.CreateOperation(null);
  AddOperation(key, currentOperation);
  ThreadPool.QueueUserWorkItem(state =>  {
    var numberToCheck = (Int32)number;
    var isPrime = false;
    Exception throwException = null;
    try {
      if (number > 2) {
        isPrime = true;
        var half = number / 2;
        var currentProgress = 0f;
        for (var i = 2; i <= half &&
                  !OperationWasCancelled(key); i++) {
            if (number % i == 0) {
                isPrime = false;
                break;
            }
            currentProgress = ((float)i  / (float)half) * 100;
            currentOperation.Post(
              evtState => OnProgressChanged(
                  (ProgressChangedEventArgs)evtState),
                  new ProgressChangedEventArgs(
                    (Int32)currentProgress, key));
          }
      }
    }
    catch (Exception ex) {
        throwException = ex;
    }
    finally {
        NotifyEndOfOperation(
          numberToCheck,
          isPrime,
          OperationWasCancelled(key),
          throwException,
          key,
          currentOperation);
    }
  }, number);
}

I think there’s not much to say about this method: the principles are the same as before but now everything is encapsulated by several helper methods. As you can see, we’ve updated the code used for propagating progress (now we pass the key that identifies the operation, instead of passing null) and we’ve also changed the API of NotifyEndOfOperation so that it also receives the associated AsyncOperation instance and its key (we could always get the key from the UserSuppliedState property of the AsyncOperation instance, but I preferred to be explicit about it). Here’s the code for that method:

private void NotifyEndOfOperation( Int32 numberToTest,
  Boolean isPrime,
  Boolean cancelled,
  Exception thrownException,
  Object key,
  AsyncOperation currentOperation ){
    var evt = new PrimeNumberVerificationCompletedEventArgs(
            numberToTest, isPrime, thrownException, cancelled, key);
    currentOperation.PostOperationCompleted(
          evtState => {
            EndOperation(key);
            OnPrimeNumberCompleted(
              (PrimeNumberVerificationCompletedEventArgs)evtState);
         },
        evt);
}

The main point of interest here is that we’ve updated the instantiation of the PrimeNumberVerificationCompletedEventArgs so that it is able to flow the key that identifies the operation that ended. Notice also that we clear the current AsyncOperation from the dictionary before notifying the user of the completion of the task.

The only thing needed is updating the code used for cancelling a running task:

public void CancelAsync(Object key) {
  if (!IsRunning(key)) {
      //nothing to cancel: returning here also avoids adding a phantom
      //"cancelled" entry for a key that was never started
      return;
  }
  CancelOperation(key);
}

And that’s it. With this post, I’d say that there really isn’t much more to say about this pattern. Keep tuned for more on multithreading apps!

Jun 20

I know that I’ve said that the next post of the series would be on how to support several asynchronous tasks. However, I was reminded by a friend that I hadn’t written any post on how to support the optional progress info feature. So, I decided to take a small detour today and we’ll see how we can modify our class so that it is able to report progress back to the user.

As we’ve seen, progress is exposed through an event named ProgressChanged, of type ProgressChangedEventHandler (notice that if you have several different asynchronous methods, then you should name each event XXXProgressChanged, where XXX is the name of the asynchronous method).

Here’s the code we’re using to fire the event:

public event ProgressChangedEventHandler ProgressChanged;
protected void OnProgressChanged(ProgressChangedEventArgs evt){
    if (ProgressChanged != null) {
        ProgressChanged(this, evt);
    }
}

As you can see, this is standard stuff, ie, we’ve added the event and then an OnEventName method which is responsible for firing the event (btw, keep in mind that if the class you’re building isn’t sealed, then you should make the method virtual so that a derived class can override it as a way to “handle” the event).

As you might guess by now, progress reporting must come from the method that does the work. In this case, we’re talking about the anonymous method that is passed to the ThreadPool.QueueUserWorkItem method. In this example, calculating the progress is easy and depends on the total number of elements used in the for loop:

public void IsPrimeAsync(Int32 number) {
  if (_isRunning) {
      throw new InvalidOperationException();
  }
  _isRunning = true;
  _currentOperation = AsyncOperationManager.CreateOperation(null);
  ThreadPool.QueueUserWorkItem(state =>  {
      var numberToCheck = (Int32)number;
      var isPrime = false;
      Exception throwException = null;
      try {
        if (number > 2) {
          isPrime = true;
          var half = number / 2;
          var currentProgress = 0f;
          for (var i = 2; i <= half && !_cancelOperation; i++) {
              if (number % i == 0) {
                  isPrime = false;
                  break;
              }
              currentProgress = ((float)i / (float)half) * 100;
              _currentOperation.Post(
                      evtState => OnProgressChanged(
                             (ProgressChangedEventArgs)evtState),
                      new ProgressChangedEventArgs(
                             (Int32)currentProgress, null));
          }
        }
      }
      catch (Exception ex) {
          throwException = ex;
      }
      finally {
          NotifyEndOfOperation(numberToCheck, isPrime,
                               _cancelOperation, throwException);
      }
  }, number);
}

If you compare this code with the previous snippet, you’ll see that we added a currentProgress local variable for tracking the progress of the current operation and that we’re using our AsyncOperation instance (the _currentOperation field) for marshalling the results and invoking the OnProgressChanged method on the “main thread” (where “main thread” is the thread that started the operation). And that’s it. From this point on, you can also get progress information by hooking up the ProgressChanged event:

work.ProgressChanged += (sender, e) => { /* do something here with progress info */ };

And that’s it for today. On the next post, we’ll talk about the changes we need to make to the current code so that we can support several asynchronous executions of the IsPrimeAsync method. Keep tuned!

PS: I’ve followed Tuna’s twitter suggestion and I’ve used a plug-in for showing code in color. Do you guys prefer it this way?

Jun 18

Today we’re going to improve the code we started writing yesterday by adding cancellation support. As you might recall, in our last post we built a simple class which supports a single asynchronous operation at a time. Since only one asynchronous operation can be running at any given moment, we only need to add a parameterless CancelAsync method.

Here’s how you might implement the CancelAsync method:

public void CancelAsync() {
    if (!_isRunning) {
       return;
    }
    _cancelOperation = true;
}

As you can see, we need to add a _cancelOperation field which is used for notifying the helper method that runs in asynchronous mode that it should stop. This means that we need to change our asynchronous method slightly so that it checks that value in the for loop and passes it along in the PrimeNumberVerificationCompletedEventArgs instance delivered to whoever consumes the PrimeNumberCompleted event. Since I cheated a little bit in the previous post, we’ve already got the hooking points for flowing the needed information. This means that the changes are mostly concentrated on the IsPrimeAsync method:

public void IsPrimeAsync(Int32 number) {
  if (_isRunning) {
      throw new InvalidOperationException();
  }
  _isRunning = true;
  _currentOperation = AsyncOperationManager.CreateOperation(null);
  ThreadPool.QueueUserWorkItem(state =>
                  {
                      var numberToCheck = (Int32)number;
                      var isPrime = false;
                      Exception throwException = null;
                      try {
                          if (number > 2) {
                              isPrime = true;
                              var half = number / 2;
                               for (var i = 2; i <= half && !_cancelOperation; i++) {
                                  if (number % i == 0) {
                                      isPrime = false;
                                      break;
                                  }
                              }                               
                          }
                      }
                      catch (Exception ex) {
                          throwException = ex;
                      }
                      finally {
                          NotifyEndOfOperation(numberToCheck,
                                               isPrime,
                                               _cancelOperation,
                                               throwException);
                      }

                  }, number);
}
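
For reference, here’s how the two flag fields used above might be declared (an assumption on my part, since the original snippets don’t show them; marking them volatile is one simple way of ensuring that a write made on one thread is promptly seen by the other):

//assumed declarations: written by CancelAsync and NotifyEndOfOperation,
//read by the worker running on the thread pool
private volatile Boolean _isRunning;
private volatile Boolean _cancelOperation;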

There’s still one thing missing: we need to clean up the _cancelOperation flag when that asynchronous operation ends. This can be done from within the anonymous method we’re using inside the NotifyEndOfOperation method:

private void NotifyEndOfOperation( Int32 numberToTest,
  Boolean isPrime,
  Boolean cancelled,
  Exception thrownException ){
      var evt = new PrimeNumberVerificationCompletedEventArgs(
                  numberToTest,
                  isPrime,
                  thrownException,
                  cancelled,
                  null);
      _currentOperation.PostOperationCompleted(
          state => {
                      _isRunning = false;
                      _cancelOperation = false;
                      OnPrimeNumberCompleted((PrimeNumberVerificationCompletedEventArgs)state);
                   },
                  evt);
}

And that’s it. As you can see, supporting cancellation is not really that hard. We’ll keep improving this sample: in the next post we’ll see how we can change our code so that it supports multiple asynchronous operations. Keep tuned.

Jun 17

In the last post of the series, we’ve taken a look at the main features offered by the event based pattern. Today, we’re going to look at how we can implement that pattern. We’re going to start small, reusing the IAsyncResult sample to show how it can be done.

As we’ve seen in the previous post, we need to (at least) add a method (named XXXAsync) that will start the asynchronous operation and an event (named XXXCompleted) that will fire when that operation completes. In practice, this means that we need something like this:

class DoSomeWork {
    public Boolean IsPrimeNumber(Int32 number) {
        /* same as before*/
    }
    public event EventHandler<PrimeNumberVerificationCompletedEventArgs> PrimeNumberCompleted;
    protected void OnPrimeNumberCompleted( PrimeNumberVerificationCompletedEventArgs evtArgs ){
        if (PrimeNumberCompleted != null) {
            PrimeNumberCompleted(this, evtArgs);
        }
    }
    public void IsPrimeAsync(Int32 number) {
     //some code here…
    }
}

As you can see, we’ve got a PrimeNumberVerificationCompletedEventArgs that is passed back. As we’ve said, this must be an AsyncCompletedEventArgs (or derived) class. In this case, we need to return a value, so we’ll need to create a new derived class, which I called PrimeNumberVerificationCompletedEventArgs. Here’s the code for that class:

public class PrimeNumberVerificationCompletedEventArgs: AsyncCompletedEventArgs {
    private readonly Int32 _testedNumber;
    private readonly Boolean _isPrime;
    internal PrimeNumberVerificationCompletedEventArgs( Int32 testedNumber,
        Boolean isPrime,
        Exception exception,
        Boolean calculationCanceled,
        Object state )
        : base(exception, calculationCanceled, state) {
        _testedNumber = testedNumber;
        _isPrime = isPrime;
    }
    public Int32 TestedNumber {
        get{
            RaiseExceptionIfNecessary();
            return _testedNumber;
        }
    }
    public Boolean IsPrime{
        get{
            RaiseExceptionIfNecessary();
            return _isPrime;
        }
    }
}

As you can see, I’ve added two properties to the derived class: TestedNumber (returns the number that was passed to the IsPrimeAsync method) and IsPrime (returns the result of the processing). Both properties call the RaiseExceptionIfNecessary method before returning the value back to the client. Internally, this method will throw the exception (when it isn’t null) that originated during the asynchronous operation and that was passed to the constructor.
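
In practice, this means that a consumer that doesn’t want the exception re-thrown should check the base class’ Error and Cancelled properties before touching the result. Here’s a small sketch of what a careful handler might look like (the handler shape is an assumption on my part):

work.PrimeNumberCompleted += (sender, e) => {
    if (e.Error != null) {
        //the async operation threw: touching e.IsPrime would re-throw
        Console.WriteLine(e.Error.Message);
    }
    else if (!e.Cancelled) {
        Console.WriteLine(e.IsPrime); //safe: no error and not cancelled
    }
};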

As you might expect, most of the action happens in the IsPrimeAsync method. From within that method, we need to start the processing on a different thread. For starters, we’re allowing only one asynchronous call at a time. Here’s a possible implementation for the IsPrimeAsync method:

public void IsPrimeAsync(Int32 number) {
    if (_isRunning) {
        throw new InvalidOperationException();
    }
    _isRunning = true;
    _currentOperation = AsyncOperationManager.CreateOperation(null);
    ThreadPool.QueueUserWorkItem(state =>
                    {
                        var numberToCheck = (Int32)number;
                        var isPrime = false;
                        Exception throwException = null;
                        try {
                            if (number > 2) {
                                isPrime = true;
                                var half = number / 2;
                                for (var i = 2; i <= half; i++) {
                                    if (number % i == 0) {
                                        isPrime = false;
                                        break;
                                    }
                                }
                            }
                        }
                        catch (Exception ex) {
                            throwException = ex;
                        }
                        finally {
                            NotifyEndOfOperation(numberToCheck, isPrime, false, throwException);
                        }

                    }, number);
}

There are a couple of interesting points here:

  • we start by creating an AsyncOperation instance (_currentOperation is a field of the class) by using the helper AsyncOperationManager class. As you’ll see, this class handles most of the work for us and I’ll have much more to say about it in future posts (I guess that when I finish this topic, I’ll go straight into GUI and multithreading);
  • we use an auxiliary field (_isRunning) which is set when we start a new operation. Since this first version does not support more than one asynchronous call at a time, we end up throwing an exception whenever the method is called before the “current” asynchronous operation ends;
  • we need to wrap our code in a try/catch block so that we catch all the exceptions that might happen during the execution of that code (this is not without its problems – what to do with an out of memory exception? – but it ensures that a less “critical” exception doesn’t crash the process). Notice that any exception that occurs is saved in a local variable so that it can be passed to the consumer through the event.

The NotifyEndOfOperation method is responsible for “firing“ the event back on the thread that started the request (which is really needed when you’re writing code for GUIs). To do that, it must first pack all the info into the PrimeNumberVerificationCompletedEventArgs instance expected by the consumer of the API. As you’ll see, the method ends up being really simple because we end up relying once again on the “mysterious” AsyncOperation class:

private void NotifyEndOfOperation( Int32 numberToTest,
                Boolean isPrime,
                Boolean cancelled,
                Exception thrownException ){
  var evt = new PrimeNumberVerificationCompletedEventArgs(
                                numberToTest,
                                isPrime,
                                thrownException,
                                cancelled,
                                null);
  _currentOperation.PostOperationCompleted(
           state => {
              _isRunning = false;
              OnPrimeNumberCompleted((PrimeNumberVerificationCompletedEventArgs)state);
           },
           evt);
}

As I’ve said before, we’ll spend a couple of posts on GUIs and asynchronous processing. Until then, just know that calling this method signals the end of an asynchronous operation (notice also that this means that you cannot make further calls on this AsyncOperation instance). From within the delegate we pass to the method, we turn off our running flag and fire the event by calling the auxiliary OnPrimeNumberCompleted method:

protected void OnPrimeNumberCompleted( PrimeNumberVerificationCompletedEventArgs evtArgs ){
    if (PrimeNumberCompleted != null) {
        PrimeNumberCompleted(this, evtArgs);
    }
}

And now that we’ve got the class ready, here’s how you’re expected to use it:

var evt = new ManualResetEvent(false);
var work = new DoSomeWork();
work.PrimeNumberCompleted += (sender, e) => {
                                 Console.WriteLine(e.IsPrime);
                                 evt.Set();
                             };
work.IsPrimeAsync(19);
evt.WaitOne(); //keep the console app alive until the event fires

There’s still more to say about this pattern, so we’ll return to it in future posts. Keep tuned!

Jun 16

In the last posts we’ve looked at several details associated with the use of the APM pattern. Today we’re going to start looking at the second pattern for doing asynchronous work: the event based asynchronous pattern.

This pattern was introduced with .NET 2.0 and it targets components that are going to be used in GUIs. In other words, if you’re building components that are going to be used by developers that build GUIs, then you should prefer this pattern instead of the APM.

To implement this pattern, a class needs to:

  • have a method with the name XXXAsync, where XXX is the name of the synchronous method;
  • the XXXAsync method receives the same parameters that are expected by the synchronous version plus an optional object parameter (used to pass extra state that should be consumed later);
  • expose an event with the name XXXCompleted, which should be fired when the asynchronous task completes;
  • the EventArgs type of that event should be an AsyncCompletedEventArgs derived class;
  • expose a CancelAsync method (which might optionally receive an object parameter) that is responsible for cancelling the async operation;
  • optionally expose a ProgressChanged event (of type ProgressChangedEventHandler), that can be consumed for getting info on the progress of the operation (the sketch right after this list shows the resulting class shape).
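
Putting the checklist together, here’s a hedged sketch of the resulting public surface for a hypothetical synchronous method named IsPrimeNumber (all names are illustrative; AsyncCompletedEventArgs and ProgressChangedEventHandler live in System.ComponentModel):

class PrimeChecker {
    //the original synchronous method
    public Boolean IsPrimeNumber(Int32 number) { /* ... */ return false; }

    //same parameters plus the optional state object that identifies the call
    public void IsPrimeNumberAsync(Int32 number, Object userState) { /* ... */ }

    //cancels the operation identified by userState
    public void CancelAsync(Object userState) { /* ... */ }

    //fired when the async task completes; the args should derive from
    //AsyncCompletedEventArgs so that Error, Cancelled and UserState flow back
    public event EventHandler<AsyncCompletedEventArgs> IsPrimeNumberCompleted;

    //optional progress reporting
    public event ProgressChangedEventHandler IsPrimeNumberProgressChanged;
}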

Unlike the APM, the XXXAsync method that starts the asynchronous task does not return an IAsyncResult instance (in fact, it always returns nothing, ie, void). This means that the only option available for “waiting” for the completion of the task is handling the event.

Notice that a class that has one XXXAsync method might support the execution of several concurrent tasks. In those cases, you do need the extra state parameter I’ve mentioned above because it is used for distinguishing between the multiple executing operations (if you don’t want to support multiple executions, then you don’t need that extra parameter).

If you’re building GUIs (ex.: Windows Forms apps), you don’t really care much about the flexibility of the APM. In fact, in these scenarios, the event based pattern is your best option. Why? Simple: because the method that handles the event will be fired on the GUI thread (which, btw, is required for interacting with the controls – ie, whenever you need to update a control, you need to do it from the GUI thread).

Another great feature of the event based pattern is that it offers first class cancellation support (which does not happen with the APM). As we’ve seen, a class that implements this pattern needs to expose a CancelAsync method, which the consuming developer can use for cancelling a running asynchronous operation. Notice that when you support multiple concurrent executions, you need to pass the state parameter to this method so that the class knows which operation should be cancelled.

Reporting progress is another great feature which is very important for GUIs (generally, giving feedback about the status of the current operation matters a lot in this kind of app). As we’ve seen, a component that implements this pattern might decide to support this feature by exposing a ProgressChanged event. Notice that if the class exposes several XXXAsync methods, then you should expose several events named XXXProgressChanged.

And I guess this sums it up. More about this pattern and multithreading on the next posts. Keep tuned.

Jun 15

Today we’re going to wrap up our study of the APM pattern by seeing how we can implement the IAsyncResult interface. For those that can’t remember, here’s the interface API again:

public interface IAsyncResult{
    object AsyncState { get; }
    WaitHandle AsyncWaitHandle { get; }
    bool CompletedSynchronously { get; }
    bool IsCompleted { get; }
}

As we’ve seen in the previous post, we can encapsulate all the asynchronous code in a custom IAsyncResult implementation. And that’s what we’ll do here. Here’s the code for our class:

class TestAsyncResult<T> : IAsyncResult {
    private volatile Boolean _isCompleted;
    private ManualResetEvent _evt;
    private readonly AsyncCallback _cbMethod;
    private readonly Object _state;
    private T _result;
    private Exception _exception;

    public TestAsyncResult( Func<T> workToBeDone, AsyncCallback cbMethod, Object state ){
        _cbMethod = cbMethod;
        _state = state;
        QueueWorkOnThreadPool(workToBeDone);
    }
    private void QueueWorkOnThreadPool(Func<T> workToBeDone) {
        ThreadPool.QueueUserWorkItem(state => {
                        try {
                            _result = workToBeDone();
                        } catch (Exception ex) {
                            _exception = ex;
                        } finally {
                            UpdateStatusToComplete(); //1 and 2
                            NotifyCallbackWhenAvailable(); //3 callback invocation
                        }
        });
    }
    public T FetchResultsFromAsyncOperation() {
        if (!_isCompleted) {
            AsyncWaitHandle.WaitOne();
            AsyncWaitHandle.Close();
        }
        if (_exception != null) {
            throw _exception;
        }
        return _result;
    }
    private void NotifyCallbackWhenAvailable() {
        if (_cbMethod != null) {
            _cbMethod(this);
        }
    }
    public object AsyncState {
        get { return _state; }
    }
    public WaitHandle AsyncWaitHandle {
        get { return GetEvtHandle(); }
    }
    public bool CompletedSynchronously {
        get { return false; }
    }
    public bool IsCompleted {
        get { return _isCompleted; }
    }
    private readonly Object _locker = new Object();
    private ManualResetEvent GetEvtHandle() {
        lock (_locker) {
            if (_evt == null) {
                _evt = new ManualResetEvent(false);
            }
            if (_isCompleted) {
                _evt.Set();
            }
        }
        return _evt;
    }
    private void UpdateStatusToComplete() {
        _isCompleted = true; //1. set _iscompleted to true
        lock (_locker) {
            if (_evt != null) {
                _evt.Set(); //2. set the event, when it exists
            }
        }
    }

}

Our custom IAsyncResult type receives a Func<T> (a function with no arguments which returns T) which it runs as the asynchronous task on a thread from the thread pool. Besides that, it also receives references to the callback and state parameters that can be passed to the BeginXXX method.

The constructor is responsible for queuing the passed Func<T> on the thread pool. Since it uses the thread pool, the work is always executed asynchronously (that’s why the CompletedSynchronously property always returns false). Notice that we need to catch all exceptions so that we can propagate them later. As you might guess, this will destroy the stack trace, reducing its utility. However, since this is the expected behavior, we need to comply with it (do keep in mind that not handling an exception from the pool leads to a process crash from .NET 2.0 onwards).

The code in the finally block is important. When the operation is completed, you should:

  1. set the completed flag to true;
  2. signal the event;
  3. invoke the callback,when available.

The order in which you perform these steps is important because it ensures things work out correctly (Joe Duffy presents some problems that might happen when you don’t use the correct order). The previous implementation initializes the event object lazily, and that’s why we need a lock to ensure that it gets properly initialized. Locking might seem unnecessary, but it is not: it ensures correctness. You can try more exotic things here (take a look at this post by Joe Duffy). However, I prefer to play safe and use a simple lock in these scenarios.

Since we’re talking about the event object, let’s take a look at the GetEvtHandle method which is responsible for getting a reference to the internal ManualResetEvent field. As you can see, the first thing it does is check for a null event reference. When the reference is null, it creates a new ManualResetEvent instance in the non-signaled state. After that, it checks if the current async processing has already ended; if it has, it signals the event before returning a reference to it.

FetchResultsFromAsyncOperation is a helper method which we can use from within the EndXXX method. As you can see, it checks the _isCompleted flag before waiting on the wait handle (WaitOne is always a costly operation). If there was an exception, it will be thrown from the End method. If not, we return the result.

Since the hard work is encapsulated in your custom IAsyncResult implementation, the BeginXXX and EndXXX methods are simple wrappers. Here’s a possible use for this custom IAsyncResult implementation:

class DoSomeWork {
    public Boolean IsPrimeNumber(Int32 number) {
        if (number < 2) {
            return false;
        }
        var half = number / 2;
        for (var i = 2; i <= half; i++) { //<= : half itself may be a divisor (ex.: 4)
            if (number % i == 0) {
                return false;
            }
        }
        return true;
    }
    public IAsyncResult BeginIsPrimeNumber(Int32 number, AsyncCallback cbMethod, Object state) {
        return new TestAsyncResult<Boolean>( () => IsPrimeNumber(number),
                            cbMethod,
                            state );
    }
    public Boolean EndIsPrimeNumber(IAsyncResult result) {
        var customAsync = result as TestAsyncResult<Boolean>;
        if(customAsync == null ){
            throw new ArgumentException();
        }
        return customAsync.FetchResultsFromAsyncOperation();
    }
}

As you can see, the BeginIsPrimeNumber method returns a new TestAsyncResult instance which kicks off the async processing (in this case, it will simply invoke the synchronous version on the thread pool). The EndIsPrimeNumber method delegates all the work to FetchResultsFromAsyncOperation.

Now, you’ve got several options for using this class:

//1. using a callback
var waiter = new ManualResetEvent(false);
var work = new DoSomeWork();           
var asyncResult = work.BeginIsPrimeNumber(
    51243,
    result => { Console.WriteLine(work.EndIsPrimeNumber(result)); waiter.Set(); },
    null);
waiter.WaitOne();

//2. polling IsCompleted
var work = new DoSomeWork();   
var async = work.BeginIsPrimeNumber(
    51243,
    null,
    null);
while (!async.IsCompleted) {
    Console.WriteLine("Sleeping");
    Thread.Sleep(100);               
}
Console.WriteLine(work.EndIsPrimeNumber(async));

//3. waiting on the handle
var work = new DoSomeWork();   
var async = work.BeginIsPrimeNumber(
                51243,
                null,
                null);
while (!async.AsyncWaitHandle.WaitOne(100, false)) {
    Console.WriteLine("Sleeping");               
}
Console.WriteLine(work.EndIsPrimeNumber(async));

//4. blocking until it completes
var work = new DoSomeWork();   
var async = work.BeginIsPrimeNumber(
                51243,
                null,
                null);
Console.WriteLine(work.EndIsPrimeNumber(async));

Even though you can reuse this class in several places, I couldn’t finish this post without pointing out that Jeffrey Richter has created a much more reusable implementation of the interface in his MSDN Concurrent Affairs column (which, btw, also shows how you can put the code that starts the async action in the main class itself, instead of putting it all in the IAsyncResult implementation).

And that’s all for today. Keep tuned for more on multithreading.

Jun 14

Multithreading: understanding the BeginXXX and EndXXX methods

Posted in C#, Multithreading       Comments Off on Multithreading: understanding the BeginXXX and EndXXX methods

Now that we’ve looked at the available waiting options for the APM pattern, it’s time to start digging on its internals. This post is all about understanding the work of the BeginXXX and EndXXX methods (which isn’t much, as we’ll see).

As we’ve seen in previous posts, the BeginXXX method is responsible for kicking off the asynchronous processing. In theory, you’d expect it to create an IAsyncResult which will be responsible for:

  • executing the synchronous action in a different thread (probably by using a thread from the thread pool);
  • creating a kernel object that will be signaled when the synchronous action completes;
  • capturing any exceptions that might have been thrown during the execution of the task so that they can be re-thrown when you call the EndXXX method and pass it the returned IAsyncResult.

As you can see, the IAsyncResult ends up doing all the work and that’s why we’ll dedicate a couple of posts to that topic.

The EndXXX method is responsible for checking that the IAsyncResult instance passed in has the correct type and for getting the value of the asynchronous calculation. As you might expect, this means that it needs to interact with that IAsyncResult instance. You should also keep in mind that this might block if the action has not completed when you call the EndXXX method. Again, all the hard work is delegated to the IAsyncResult implementation.

If you’re curious enough and take a look at the internals of some of the existing .NET framework classes, you’ll see that some of them rely on delegates for executing the action in an asynchronous way (as you probably know, delegates get asynchronous BeginInvoke/EndInvoke methods which are automatically generated for you). For instance, the base Stream class relies on delegates.
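
For instance, here’s a hedged sketch of that delegate-based approach, reusing the IsPrimeNumber method from the earlier sample (BeginInvoke/EndInvoke are the compiler-generated asynchronous methods; work is assumed to be a DoSomeWork instance):

Func<Int32, Boolean> checker = work.IsPrimeNumber; //wrap the sync method
var ar = checker.BeginInvoke(19, null, null);      //runs on a pool thread
//...do something else in the meantime...
var isPrime = checker.EndInvoke(ar); //blocks if needed, re-throws exceptions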

This is convenient, but not good for the performance of your app. According to Joe Duffy, delegates rely on the remoting infrastructure for executing an asynchronous task, which means you end up paying an overhead for that operation. In other words, you’re probably better off using the pool directly instead of creating a ”dumb” delegate just to get the asynchronous operation “for free”.

And that’s it for today. On the next post we’ll take a look at how you might create a custom IAsyncResult type. Keep tuned for more.

Jun 10

Multithreading: APM and the continuation passing style

Posted in C#, Multithreading       Comments Off on Multithreading: APM and the continuation passing style

In this post, we’ll take a look at the last option we can use for the rendezvous phase of the APM pattern. This option is based on the continuation passing style principle. In practice, it means that we need to wrap the work that should be performed after the async task ends in a closure and pass it to the BeginXXX method.

There are some cases where this might not be a viable option. After all, having to pack everything that is needed into a closure so that it can be executed on the completion of the async task might not be as easy as it appears at first sight (keep in mind that the thread pool should be used for “small” tasks only!). Before going on, let’s see the code that uses this approach:

var request = WebRequest.Create("http://msmvps.com/blogs/luisabreu");
var evt = new ManualResetEvent(false);
var result = request.BeginGetResponse( asyncResult => {
        WebResponse response = null;
        try {
            response = request.EndGetResponse(asyncResult);                       
        } catch (WebException ex) {
            //log error…                       
        }
        //rest of work needs to go here                   
        evt.Set();
    }
    , null);
evt.WaitOne();

As you can see (and since I was using a console app), I had to use an event to stop the program from ending before I received the results (don’t forget that threads from the pool are background threads). The comment “rest of work” marks the place where you should put the rest of the things that need to be done.

And that’s it. Now that we’ve explored the four available waiting options, it’s time to explore how we’d implement the Begin/End methods and talk about the details that go into an IAsyncResult implementation. Keep tuned for more on multithreading.

Jun 10

Multithreading: the APM pattern and polling for completion

Posted in C#, Multithreading       Comments Off on Multithreading: the APM pattern and polling for completion

In this post we’re going to take a look at how we can use polling to see if an asynchronous task has completed. As we’ve seen, the IAsyncResult instance returned from the BeginXXX method (that started an asynchronous task) has an IsCompleted property that returns true when the task is completed.

In practice, this means that you can use this property for polling the status of the task. For instance, here’s some code based on the previous example that relies on this property:

var request = WebRequest.Create("http://msmvps.com/blogs/luisabreu");
var result = request.BeginGetResponse(null, null);
Console.WriteLine("async request fired at {0}", DateTime.Now);
while (!result.IsCompleted) {
   //do something "inexpensive"
}
WebResponse response = null;
try {
    response = request.EndGetResponse(result);
}
catch (WebException ex) {
    //log error…
}

As you can see, it’s similar to the previous example. The main difference between polling IsCompleted and waiting on the AsyncWaitHandle property is that the IsCompleted property won’t block the thread. However, you should be careful with the instructions you put inside the while loop: don’t build a manual spin that does nothing (that is really inefficient!) and ends up consuming precious CPU resources, ok?
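
For instance, a hedged variation that yields the CPU between polls might look like this (DoSomethingCheap is a hypothetical placeholder for real work):

while (!result.IsCompleted) {
    DoSomethingCheap(); //hypothetical "inexpensive" work between polls
    Thread.Sleep(100);  //yield the CPU instead of spinning
}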

And that’s it. On the next post we’ll talk about the last option (which is the one I use more often) for handling the completion of an async task started through the APM pattern.

Jun 09

Multithreading: waiting on the APM’s WaitHandle

Posted in C#, Multithreading       Comments Off on Multithreading: waiting on the APM’s WaitHandle

Today, we’re going to keep looking at the available options for waiting for the conclusion of an asynchronous task started through the APM model. In this post we’re going to see how we can use the WaitHandle which we can get through the IAsyncResult’s AsyncWaitHandle property.

After getting a valid reference to it, you can do what you normally do with handles: make the thread wait until it gets signaled. You might say: “dude, what’s the difference between this approach and the previous one?” And I say: “dude, you have more control with this approach”. “How”, you ask… glad you did :)

In the previous option, you got blocked waiting until the operation completed. On the other hand, if you get a reference to the WaitHandle, then you can block and specify a *timeout*! This means that you can specify a maximum amount of time for the execution of the task, or you could use a loop with a small timeout for giving feedback to the user.

Besides the timeout option, it’s also important to keep in mind that, once you have a reference to the WaitHandle, you have the option to wait on several handles (by calling the WaitAll or WaitAny methods). To show how this works, let’s migrate the latest sample to use the WaitHandle:

var request = WebRequest.Create("http://msmvps.com/blogs/luisabreu");
var result = request.BeginGetResponse(null, null);
Console.WriteLine("async request fired at {0}", DateTime.Now);
while (!result.AsyncWaitHandle.WaitOne(500, false)) {
    //give feedback
    Console.WriteLine("Operation executing: {0}", DateTime.Now);
}
WebResponse response;
try {
    response = request.EndGetResponse(result);
}
catch (WebException ex) {
   //same as before
}

As you can see, the code is similar to the one we had before: however, in this case we’re accessing the WaitHandle and waiting on it for 500 ms. When the method returns false, we give feedback to the user. When it returns true, we end up calling EndGetResponse for getting the response.
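
And since WaitAny was mentioned above, here’s a quick hedged sketch of waiting on two in-flight operations at once (result1 and result2 are assumed to be IAsyncResult instances returned by two BeginXXX calls):

var handles = new WaitHandle[] {
    result1.AsyncWaitHandle,
    result2.AsyncWaitHandle
};
var signaled = WaitHandle.WaitAny(handles, 1000, false); //1 second timeout
if (signaled == WaitHandle.WaitTimeout) {
    //neither operation completed within the timeout
}
else {
    //handles[signaled] belongs to the operation that finished first
}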

Before wrapping up, there’s still time to remind you that you should always call the EndXXX method so that the resources used can be cleaned up. It’s important to keep this in mind! For instance, if in the previous sample you decide to wait only for 500ms and then discard the results (if they aren’t computed in that time), you can’t simply move on without calling the EndXXX method… in these cases, the best option is to use a thread from the pool to do that:

if (!result.AsyncWaitHandle.WaitOne(500, false)) {
    ThreadPool.UnsafeQueueUserWorkItem(
        state => {
            try {
                request.EndGetResponse(result);
            }
            catch (WebException ex) {
                //log error…
            }
        },
        null);
}

As you can see, we don’t care about propagating the context and that’s why we’re using the UnsafeQueueUserWorkItem method.

And that’s it for today. There are still two other options for waiting on the APM, but we’ll leave them for future posts. Keep tuned for more on asynchronous multithreading.

Jun 08

Multithreading: APM and options for waiting until work completes

Posted in C#, Multithreading       Comments Off on Multithreading: APM and options for waiting until work completes

In the last post, we’ve started looking at the oldest asynchronous pattern of the .NET framework: the APM model. Today, we’re going to list the available options for waiting for the completion of an asynchronous operation started through an API that implements the APM pattern.

After kicking off an asynchronous operation, we have four available options:

  • we can block the thread until work is completed by calling the EndXXX method directly. This is a good approach when we have a couple of extra things to do and, after that, we really need to wait until the asynchronous operation completes;
  • we can wait on the IAsyncResult’s WaitHandle (AsyncWaitHandle property) until the work that is being executed in parallel ends;
  • we can poll the IAsyncResult’s IsCompleted property. This is also a good option when we want to perform other tasks while the asynchronous operation is running;
  • we can pass a callback (the AsyncCallback parameter passed to the BeginXXX method) that is to be called when the operation completes. This is a good option when we don’t need to block the “main” thread until the operation completes; it relies on a “continuation passing” style, where a closure is executed when the async operation ends.

As you can see, the APM model gives us several options and we’ll take a detailed look at each one in the next posts. Keep tuned for more.