Versioning long running Workflow Services in WF4

One of the problems with the current version of Windows Workflow Foundation is how to handle different versions of your workflows. With short-running workflows this is no big deal: a workflow does whatever it is supposed to do and finishes, so you can deploy an updated version of your XAMLX files whenever you want. Provided the public-facing SOAP interface doesn’t change, no one will notice a difference and everything will work just fine.

However, as soon as we get into long-running workflows and the SQL Workflow Instance Store, things get quite a bit more complicated. As soon as you add or remove activities from your XAMLX (the workflow service definition), you can no longer load any of the workflow instances currently saved in the SQL Workflow Instance Store. This is a problem because you would either have to wait until all running workflows have finished before upgrading your workflow definition, or abort all running instances. Neither is an acceptable solution in most cases.

How workflow data is stored

The SQL Workflow Instance Store keeps track of the WCF address used to start a workflow and stores that along with the actual workflow state. It uses this data to differentiate between workflow service definitions. This can actually help us fix our versioning problem: just leave the existing workflow definition as is and deploy the new definition alongside it at a different address.

This solves the problem of separating the state of each workflow version, but it means that the client application needs to be updated each time a new version of the workflow service is deployed. Not only that, but the client needs to keep track of which workflow was started using which service and send every subsequent request to the same address. This puts a big extra burden on our client app, and that is something we don’t want.

The WCF 4 RoutingService to the rescue

We can solve this problem by adding the WCF RoutingService, a new .NET 4 feature, to the mix. In this case the client only talks to the routing service, and the routing service is aware of each workflow service version and knows how to route each request to the correct address. This way the client never needs to know when new workflow service versions are deployed; all it knows about is the WCF RoutingService address.

So how does the WCF RoutingService know where to send messages?

There are several ways this can be done, but the easiest is to have the workflow service return a version number in the response to the initial request that started the workflow. This version number is then a required argument for each subsequent request into the workflow. The WCF RoutingService can now use the version part of the message, or the lack thereof, to determine where to route the message. If there is no version information, the message is always routed to the latest version of the workflow service, so new instance requests as well as WSDL requests are always sent to the most recent version.
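As a rough illustration of this kind of version-based routing, the configuration below is a minimal sketch of a RoutingService filter table, not the exact setup used here. It assumes the version number travels in the message body as an element named version in the http://tempuri.org/ namespace, and that two versions of a hypothetical OrderService.xamlx are deployed side by side. The endpoint names, addresses, element name and namespace are all made up for the example.

```xml
<!-- Sketch only: service names, addresses and the version element are hypothetical. -->
<system.serviceModel>
  <services>
    <service name="System.ServiceModel.Routing.RoutingService"
             behaviorConfiguration="versionRouting">
      <!-- The single address the client talks to. -->
      <endpoint address=""
                binding="basicHttpBinding"
                contract="System.ServiceModel.Routing.IRequestReplyRouter" />
    </service>
  </services>

  <behaviors>
    <serviceBehaviors>
      <behavior name="versionRouting">
        <!-- routeOnHeadersOnly="false" lets the filters inspect the message body,
             which is where this sketch assumes the version argument lives. -->
        <routing filterTableName="versionTable" routeOnHeadersOnly="false" />
      </behavior>
    </serviceBehaviors>
  </behaviors>

  <client>
    <!-- One client endpoint per deployed workflow service version. -->
    <endpoint name="OrderServiceV1"
              address="http://localhost/Workflows/V1/OrderService.xamlx"
              binding="basicHttpBinding" contract="*" />
    <endpoint name="OrderServiceV2"
              address="http://localhost/Workflows/V2/OrderService.xamlx"
              binding="basicHttpBinding" contract="*" />
  </client>

  <routing>
    <namespaceTable>
      <add prefix="wf" namespace="http://tempuri.org/" />
    </namespaceTable>
    <filters>
      <!-- Matches messages that carry version 1 anywhere in the message. -->
      <filter name="IsVersion1" filterType="XPath" filterData="//wf:version = '1'" />
      <!-- Catch-all for everything else, including messages without a version. -->
      <filter name="Everything" filterType="MatchAll" />
    </filters>
    <filterTables>
      <filterTable name="versionTable">
        <!-- Higher priority filters are evaluated first. -->
        <add filterName="IsVersion1" endpointName="OrderServiceV1" priority="1" />
        <add filterName="Everything" endpointName="OrderServiceV2" priority="0" />
      </filterTable>
    </filterTables>
  </routing>
</system.serviceModel>
```

With a table like this, deploying a third version would mean adding a filter for version 2 and pointing the catch-all entry at the new endpoint, while the client keeps using the router address unchanged.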

 

So does this solve all our problems?

Unfortunately not. This solves the problem of updating the workflow service definition while keeping the different versions apart, but it does leave the workflows that are already running on their old definition. That might be exactly what you want in some cases, but if there is a bug in the existing definition you still can’t fix it for those instances. And that is a problem that can’t really be solved properly with WF4. This feature was promised at the 2010 PDC for the next version of .NET, but that doesn’t help us now.

Enjoy!

 


One thought on “Versioning long running Workflow Services in WF4”

  1. Interesting.

    So how DO you address the problem of bug fixing an existing long-running workflow, where that bugfix requires a change to the xamlx?

    Let’s suppose you were to use the PersistenceIOParticipant model to store the definition in the database, and load it back up when the instance resumes.

    Could you then change the xamlx definition in a running workflow via a database update and get the current clients to call the new definition, hence allowing a bugfix without a version change?

    Or does the heart of the problem lie with how WCF addresses the definition? How and why is the xamlx change detected?
