Eduasync part 7: generated code from a simple async method

In part 6 we tried to come up with a "manual" translation of a very simple async method. This time we’ll see what the compiler really generates.

As a reminder, here’s the async method in question:

private static async Task<int> ReturnValueAsyncWithAssistance()
{
    Task<int> task = Task<int>.Factory.StartNew(() => 5);

    return await task;
}

Whereas our manual code still ended up with a single method (admittedly involving a lambda expression), the generated code is split into the method itself (which ends up being quite small), and a whole extra class representing a state machine.

The "stub" method

We looked at a similar translation before, but didn’t really explain it. This time let’s take a closer look:

private static Task<int> ReturnValueAsyncWithStateMachine()
{
    StateMachine stateMachine = new StateMachine(0);
    stateMachine.moveNextDelegate = stateMachine.MoveNext;
    stateMachine.builder = AsyncTaskMethodBuilder<int>.Create();
    stateMachine.MoveNext();
    return stateMachine.builder.Task;
}

Note that I’ve renamed all the variables and the generated type itself to be valid C# identifiers with useful names.

This is pretty much the minimal stub for an async method. If the async method took parameters, each parameter value would end up being copied into a field in the state machine – but other than that, this is what all async methods end up looking like. (Void async methods just won’t return the builder’s task at the end.)

The method doesn’t have much work to do:

  • It creates a new instance of the state machine, passing in an initial state of 0. (I’d expect this constructor parameter to be removed by release – it’ll always be 0, after all.)
  • It creates a new Action delegate instance corresponding to the MoveNext method in the state machine. This is always passed to awaiters as the continuation – by using a field in the state machine, we can avoid creating a new Action instance each time we call OnCompleted.
  • It creates a new AsyncTaskMethodBuilder which the state machine uses to set the result and/or exception, and which exposes a task we can return to the caller.
  • It calls MoveNext() synchronously: this will execute at least as far as the first "await" before returning, and may get further than that if the awaited tasks have completed before the await expression.
  • It returns the Task<int> from the builder. By this point there may already be a result, if all the await expressions had already completed. In real async methods, most of the time the returned Task or Task<T> will represent an ongoing task which the state machine will keep track of.
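To see the "already completed" case in action, here’s a small sketch using the released form of the feature rather than the CTP. The names SynchronousStartDemo and AlreadyCompletedAsync are made up for illustration, and Task.FromResult is an assumption in that it comes from .NET 4.5 onwards rather than the CTP library. Because the awaited task has completed before the await expression, the whole method body runs before the call returns:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical demo type: not part of the Eduasync code.
static class SynchronousStartDemo
{
    // Awaits a task which has already completed, logging either side of the await.
    public static async Task<int> AlreadyCompletedAsync(List<string> log)
    {
        log.Add("before await");
        int value = await Task.FromResult(5);
        log.Add("after await");
        return value;
    }

    public static void Main()
    {
        var log = new List<string>();
        Task<int> task = AlreadyCompletedAsync(log);

        // No continuation was needed, so everything has happened already.
        Console.WriteLine(task.IsCompleted);        // True
        Console.WriteLine(string.Join(", ", log));  // before await, after await
        Console.WriteLine(task.Result);             // 5
    }
}
```

If the awaited task were still running instead, only "before await" would have been logged by the time the caller got the task back.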

Obviously, all the smarts are in the state machine…

The state machine (complete)

Here’s the complete code of the generated state machine, as of the second CTP of the async feature.

[CompilerGenerated]
private sealed class StateMachine
{
    // Fields representing local variables
    public Task<int> task;

    // Fields representing awaiters
    private TaskAwaiter<int> awaiter;

    // Fields common to all async state machines
    public AsyncTaskMethodBuilder<int> builder;
    private int state;
    public Action moveNextDelegate;

    public StateMachine(int state)
    {
        // Pointless: will always be 0. Expect this to be removed from later builds.
        this.state = state;
    }

    public void MoveNext()
    {
        int result;
        try
        {
#pragma warning disable 0219 // doFinallyBodies is never used
            bool doFinallyBodies = true;
#pragma warning restore
            if (state != 1)
            {
                if (state != -1)
                {
                    task = Task<int>.Factory.StartNew(() => 5);
                    awaiter = task.GetAwaiter();
                    if (awaiter.IsCompleted)
                    {
                        goto Label_GetResult;
                    }
                    state = 1;
                    doFinallyBodies = false;
                    awaiter.OnCompleted(moveNextDelegate);
                }
                return;
            }
            state = 0;
          Label_GetResult:
            int awaitResult = awaiter.GetResult();
            awaiter = new TaskAwaiter<int>();
            result = awaitResult;
        }
        catch (Exception e)
        {
            state = -1;
            builder.SetException(e);
            return;
        }
        state = -1;
        builder.SetResult(result);
    }


    // Obsolete: will be removed from later builds.
#pragma warning disable 0414
    private bool disposing;
#pragma warning restore

    [DebuggerHidden]
    public void Dispose()
    {
        disposing = true;
        MoveNext();
        state = -1;
    }
}

Yes, it’s pretty complicated. Let’s break it down a bit…

Unused code

Some of the code here is pointless. It’s a side-effect of the fact that the CTP uses the same code for iterator blocks and async methods – that’s why the primary method within the state machine is called MoveNext. So everything below the comment line starting "Obsolete" can be ignored.

Additionally, in this case we don’t have any finally blocks – so the doFinallyBodies local variable is unused too. Later on we’ll see an example which does use it, so don’t worry about it for now.

Fields and initialization

I’ve explained all the fields in the comments, to some extent. Broadly speaking, there are three types of field in async method translations:

  • Variables for the infrastructure:
    • The state number, used to work out where to jump back into the method when the continuation is called (or on our first call from the stub method)
    • The builder, used to communicate with the task returned to the caller, via SetResult and SetException
    • The moveNextDelegate, always passed to the OnCompleted method when an awaiter indicates an incomplete task
  • Awaiter variables: each await expression (currently) has its own awaiter variable. (This may be optimized in a later build.) Each one is set back to the type’s default value (e.g. null for a reference type) after use, but we need to maintain this as a field so we can get the result of an awaiter when we return from a continuation.
  • "User" variables representing the local variables of the async method. If the method uses language features such as collection initializers, there may be more of these than expected, but basically they’re whatever you’d see as local variables in a normal method. This also includes parameters for the async method, which are copied into the state machine by the stub method.

Standard skeleton

All async methods (well, those returning results) have a standard skeleton for the MoveNext() method. Void async methods differ in the obvious way. You may recognise this as being extremely similar to part of our manually-written method, too:

public void MoveNext()
{
    int result; // Type varies based on result type
    try
    {
        // Method-specific code
    }
    catch (Exception e)
    {
        state = -1;
        builder.SetException(e);
        return;
    }
    state = -1;
    builder.SetResult(result);
}

Hopefully this doesn’t need much explanation: while catching Exception is normally a bad idea, it’s appropriate here, so that it can be rethrown in the caller’s context (or detected without being rethrown). Obviously the method-specific code in the middle needs to make sure that we only reach the bottom of the method in the case where we’re really done – if it ever calls OnCompleted, it needs to return directly rather than falling through to this code.
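Here’s a rough illustration of what that catch block means for the caller, again written against the released version of the feature rather than the CTP. ExceptionCaptureDemo and FailAsync are hypothetical names for this sketch. The call to the async method doesn’t throw; the exception ends up in the returned task, where it can either be inspected or rethrown:

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical demo type: not part of the Eduasync code.
static class ExceptionCaptureDemo
{
    public static async Task<int> FailAsync()
    {
        await Task.FromResult(0);
        throw new InvalidOperationException("Bang");
    }

    public static void Main()
    {
        // The call itself doesn't throw: the exception was passed to the builder.
        Task<int> task = FailAsync();
        Console.WriteLine(task.IsFaulted); // True

        // Detecting the failure without rethrowing it
        Console.WriteLine(task.Exception.InnerException.Message); // Bang

        // Rethrowing it in the caller's context
        try
        {
            task.GetAwaiter().GetResult();
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine("Caught: " + e.Message); // Caught: Bang
        }
    }
}
```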

Actual method-specific code…

Phew, we’ve finally got to the bit which varies depending on the async method we’re dealing with. Fortunately now that we’ve removed all the extraneous code, it’s easier to see what’s going on. I’ve added some comments for clarity:

  if (state != 1)
  {
      if (state != -1)
      {
          // Code before the first await expression
          task = Task<int>.Factory.StartNew(() => 5);

          // Boiler-plate for the first part of an await
          awaiter = task.GetAwaiter();
          if (awaiter.IsCompleted)
          {
              goto Label_GetResult;
          }
          state = 1;
          doFinallyBodies = false;
          awaiter.OnCompleted(moveNextDelegate);
      }
      return;
  }
  // Boiler-plate for the second part of an await
  state = 0;
Label_GetResult:
  int awaitResult = awaiter.GetResult();
  awaiter = new TaskAwaiter<int>();
                    
  // Code after the await (return, basically)
  result = awaitResult;

Now the state machine for our async method can be in any of three states at the start/end of MoveNext():

  • -1: The method has already completed (possibly with an exception). (It’s unclear how we’d end up getting back into the state machine at this point, but the generated code makes sure it just returns immediately.)
  • 0: The initial state - this is also used just before calling GetResult() for situations where we’ve not needed to use a continuation. I’ll go into this slight oddity in another post :)
  • 1: We’re waiting to be called back by the awaiter. When the continuation is executed, we want to jump straight to "Label_GetResult" so we can fetch the result and continue with the rest of the method.

I suggest you look at the code path taken by the above code if you start in each of those states – convince yourself that when we exit the method, we’ll be in an appropriate state, whether it’s normally or via a continuation.
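If you’d rather watch the transitions than trace them by hand, the following is a hand-written imitation of the state machine which records the state on entry to and exit from MoveNext(). It’s only a sketch: TracingStateMachine, StateLog and the constructor parameter are invented for this example, it leaves out doFinallyBodies and the obsolete Dispose code, and it assumes that AsyncTaskMethodBuilder<int> and TaskAwaiter<int> in the released framework behave like their CTP counterparts for this purpose.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Hypothetical imitation of the generated class, with added state logging.
sealed class TracingStateMachine
{
    public readonly List<int> StateLog = new List<int>();
    public AsyncTaskMethodBuilder<int> Builder = AsyncTaskMethodBuilder<int>.Create();

    private readonly Task<int> source;
    private readonly Action moveNextDelegate;
    private TaskAwaiter<int> awaiter;
    private int state;

    public TracingStateMachine(Task<int> source)
    {
        this.source = source;
        moveNextDelegate = MoveNext;
    }

    public void MoveNext()
    {
        StateLog.Add(state); // State on entry
        int result;
        try
        {
            if (state != 1)
            {
                if (state != -1)
                {
                    awaiter = source.GetAwaiter();
                    if (awaiter.IsCompleted)
                    {
                        goto GetResult;
                    }
                    state = 1;
                    awaiter.OnCompleted(moveNextDelegate);
                }
                StateLog.Add(state); // State on exit (waiting, or already finished)
                return;
            }
            state = 0;
        GetResult:
            result = awaiter.GetResult();
            awaiter = default(TaskAwaiter<int>);
        }
        catch (Exception e)
        {
            state = -1;
            StateLog.Add(state); // State on exit (failed)
            Builder.SetException(e);
            return;
        }
        state = -1;
        StateLog.Add(state); // State on exit (succeeded)
        Builder.SetResult(result);
    }
}

static class Program
{
    static void Main()
    {
        // Already-completed task: straight from state 0 to -1 in one call.
        var done = new TracingStateMachine(Task.FromResult(5));
        done.MoveNext();
        Console.WriteLine(string.Join(",", done.StateLog)); // 0,-1

        // Incomplete task: exit in state 1, then re-enter via the continuation.
        var tcs = new TaskCompletionSource<int>();
        var pending = new TracingStateMachine(tcs.Task);
        pending.MoveNext();
        tcs.SetResult(5);
        Console.WriteLine(pending.Builder.Task.Result);        // 5
        Console.WriteLine(string.Join(",", pending.StateLog)); // 0,1,1,-1
    }
}
```

The log is written before the builder is told about the result, so waiting on the builder’s task before reading the log is enough to see every entry, whichever thread the continuation runs on.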

Even within this code, we can see boilerplate material: any time you reach an await expression in an async method, the compiler will generate code like the middle section, to get the awaiter, then either skip to the "get result" part or add a continuation. The fiddly bit is really getting your head round how the code flow works, but hopefully as there are relatively few states in play here, it’s not too bad.

Conclusion

This was a very long post for such a short async method. However, now that we’ve sorted out how it all hangs together, we should be able to look at more complicated async methods without looking at things like the obsolete code and the stub method. Next time we’ll see what happens when we introduce loops, catch blocks and finally blocks.