Assembly Organization in MEF

The purpose of MEF is to hook up parts. Parts are identified by string contracts.

MEF (and tools like it) empowers fully composable systems. I use that term to describe a system in which the system itself no longer knows what is going on. What you write empowers an ecosystem specific to a problem domain; it does not solve the problem itself.

In my case, MEF empowers a harness where processes, generally code generation templates, are run in a specific manner (such as with pre and post processing). I don’t care what your process does. At one level I don’t care about what services it needs or what metadata it uses (although, as I’ll explain in another post, I actually do care at another level; the harness definitely doesn’t). This is a tremendously powerful environment for certain problems, precisely because it becomes an ecosystem solution, not a prescribed solution.

It is also a very different environment where we’re still working out the rules as an industry. One area where the rules shift is in assembly organization. There are always two core drivers for assembly (project) organization – deployment and isolation. How does this play out in a MEF system?

Let’s assume UI isolation. MEF systems generally solve a complex but finite problem. It makes sense for them to be class libraries that are distinct from any single user interface. In the code generation case, we know we will have a graphical UI and a command line. I suspect I’ll have a kind of sucky graphical UI, someone will make a pretty one, and starting in 2010 embedding in Visual Studio becomes more realistic. That’s two assemblies in any single deployment – one of the UIs and a class library that is the core of the system.

A core goal of a composable system is to replace direct references with indirect references. This means interactions will generally use interfaces, which belong in their own library for isolation. These interfaces will generally have a default implementation. While technically these defaults could reside in either the interface library or the core library, putting them in either location means they will always be available, and you will have made it harder for the user to replace them. Thus I think four class libraries – a UI, the system core, the interfaces, and the default implementations – are the minimum complexity for a real world MEF solution.
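To make the split concrete, here’s a minimal sketch of the non-UI pieces. The names (IOutputService and friends) are illustrative, not the harness’s actual interfaces, and each piece would live in its own project:

Imports System.ComponentModel.Composition

' --- Contracts assembly: nothing but interfaces ---
Public Interface IOutputService
   Sub Write(ByVal text As String)
End Interface

' --- Default implementation assembly: references only the contracts assembly ---
<Export(GetType(IOutputService))> _
Public Class ConsoleOutputService
   Implements IOutputService

   Public Sub Write(ByVal text As String) Implements IOutputService.Write
      Console.WriteLine(text)
   End Sub
End Class

' --- Core assembly: imports the contract and never names an implementation ---
Public Class GenerationCore
   <Import()> _
   Private mOutput As IOutputService
End Class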

Almost all fully composed systems should go at least one additional set of assemblies further. Fully composed systems contain interfaces which are directly referenced and used by the system core, and interfaces that do not have this direct interaction. It’s software entropy if I force you to design an interface for extracted database metadata, so I’ll give you one, although the core system couldn’t care less about it. For isolation and for clarifying intent, I think separation is a better approach.

You could place the default implementations for indirect services (such as metadata extraction) into the same assembly as the default implementations for services directly referenced by the core. Nothing will break. But they will always be in the same catalog and export provider, requiring the complexity of selecting the correct one. Deployment is simpler if these are separate. I also think a parallel between implementation and interface assemblies makes the system overall easier to understand.
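As a sketch of how that plays out at composition time – assuming the shipped MEF hosting types (the preview bits used slightly different names) and illustrative directory names – each group of implementations gets its own directory and catalog, so dropping a set of defaults is as simple as not deploying its directory:

Imports System.ComponentModel.Composition.Hosting

Public Module CatalogSetup
   Public Function CreateContainer() As CompositionContainer
      ' Each directory holds one group of implementation assemblies.
      Dim catalog = New AggregateCatalog( _
            New DirectoryCatalog("CoreDefaults"), _
            New DirectoryCatalog("IndirectServiceDefaults"), _
            New DirectoryCatalog("UserParts"))
      Return New CompositionContainer(catalog)
   End Function
End Module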

In some cases, including the code generation harness, there are specific sets of roles parts can play. A given use of the harness may use any combination of these. If the assemblies are partitioned across these boundaries, people setting up the MEF system can employ only those items they need. Among other things, this speeds the MEF discovery process. For example, in the code generator there are a couple of categories of metadata, plus output, naming, and miscellaneous services.

The specific roles the parts play are defined via the interface assemblies. If these assemblies are partitioned, programmers extending the system later reference only the specific set of interfaces they want. With partitioned assemblies, they might need to reference more than one, but once referenced, the complexity of what the programmer sees is lower (which could also be done with interfaces).

Let’s explore the role of the extending programmer a bit more. If there is no extending programmer (you or someone else), there is no point in a composable system. This extending programmer might be part of the original team, if the purpose of composition is simply flexibility in deployment, such as including particular modules based on customer status. In many cases, including the generation harness, the system is really being created for the convenience of the extending programmer, and it is these scenarios that should drive the application design. You may be able to design your system so certain changes can be made without MEF knowledge (new templates in the code generation case), but the reason you’re using a composable system is to support programmers creating new parts.

This extending programmer works in their own assembly, which among other things allows you to issue updates to default assemblies without breaking their work. These assemblies reference interface assemblies. It is easier for that programmer to select well partitioned interface assemblies and use them to better understand how the system works. If you partition your implementation assemblies to parallel the interfaces, deployment is simpler, and if you fully implement an assembly, you can remove the default one.

There are a few blog posts, including this one by Chad Myers and this one by Davy Brion, that speak disparagingly about solutions with many projects. I do not actually agree with either post (because solutions are convenience-only features and our architectures should not reflect Visual Studio bugs), but I do think they make some good points. And the fundamental agreement is that you should understand why you have a lot of projects if you do. In this case, it’s for isolation (a core tenet of MEF) and deployment. If you want only a couple of projects in your solution, you don’t want a composable system (although you might still use MEF for other reasons).

Prioritizing MEF Parts – Or When Things Are Easy

So much of what I do with technology turns out ten times harder than I think it should be that I’ll admit I walked very softly, fearful of quicksand, when I wanted to implement my own prioritization strategy for MEF parts. In the end, the MEF interaction was dirt simple.

For review, MEF has a prioritization strategy. When you are looking for parts, the first export provider that fulfills the request (contract and cardinality) ends the part search process (in the GetExport(s) variations). However, the generation harness has more complex requirements than this allows.

To create an application, you’ll use a set of templates. Templates will have a specific purpose, such as a Select stored procedure. I’ll supply a default for this, but you may not like my default and will want to replace it.

I want to do this while maintaining complete ignorance in the template about what contracts you are working with. I can’t use the multiple export provider strategy because I want all the templates in all of the catalogs considered.

You also will probably want to replace metadata and service parts. This scenario is different because the template knows the specific contract. Because there is a specific contract request, the MEF native prioritization could potentially work, except that I want the simplicity of a single export provider to manage the templates and do not want multiple export providers/containers.

Parts, including templates, have a contract, which in MEF is a string. There will be many templates doing multiple jobs in the generation process. MEF just grabs everything that implements IProcessWrapper when it grabs templates. I want another level of selection.

All parts involved in the prioritization strategy have an Id and a priority as MEF metadata. MEF metadata is just a dictionary of values, so I can check whether an Id or priority exists in the export definition. I want to run only the highest priority template for each Id. An Id might be “SelectStoredProc” or some other descriptive name. For convenience I’ll supply constants for the common templates. My templates will have a priority of 0 and will only run if no other templates are available. I’ll write a different post about why I think we can all play nice with priorities and about how to override priorities, after I figure out how I’m going to do that.
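For illustration, a part opting into prioritization might look like the sketch below. IProcessWrapper is stubbed in here, and the Id and priority values are examples, not harness constants:

Imports System.ComponentModel.Composition

' Stand-in for the harness's template contract.
Public Interface IProcessWrapper
   Sub Execute()
End Interface

<Export(GetType(IProcessWrapper))> _
<ExportMetadata("Id", "SelectStoredProc")> _
<ExportMetadata("Priority", 10)> _
Public Class CustomSelectStoredProcTemplate
   Implements IProcessWrapper

   Public Sub Execute() Implements IProcessWrapper.Execute
      ' Generate the Select stored procedure; priority 10 beats the default of 0.
   End Sub
End Class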

To accomplish this…

I created a new aggregate catalog that derives from the MEF AggregateCatalog and overrides the GetExports method:

Imports System
Imports System.Linq
Imports System.ComponentModel.Composition
Imports System.Collections.Generic

Public Class PrioritizedPartCatalog
   Inherits Hosting.AggregateCatalog

   Public Overrides Function GetExports( _
            ByVal definition As Primitives.ImportDefinition) _
            As IEnumerable(Of TempTuple(Of Primitives.ComposablePartDefinition, Primitives.ExportDefinition))

 

[[ In case I confused the C# folks, VB has “partial namespaces” which means it tries to tack namespace pieces like “Primitives” onto the stated namespaces and applies a little more clarity to namespace organization. I could have alternatively imported System.ComponentModel.Composition.Primitives]]

I want to grab all of the matching parts and do some additional filtering.

      Dim allExports = MyBase.GetExports(definition)

 

The next part involves some KinkyLinq (I should probably grab that url). When I’m doing hard stuff in Linq, I split things up into multiple queries. Linq will optimize this if it can, and my brain won’t explode from one overly complex statement. The first query creates groups by Id. The second grabs the highest priority item in each group:

      Dim q1 = From y In allExports _
               Group y By key = GetId(y) _
               Into Group _
               Select key, Group
      Dim q2 = From g In q1 _
               Select _
               ( _
               From y In g.Group _
               Where GetPriority(y) = _
                     g.Group.Max(Function(z) GetPriority(z)) _
               ).FirstOrDefault
      Return q2.ToList()
   End Function

Not all parts will opt into prioritization. This is entirely optional. Managing this with priority is relatively easy: any part without a priority has a priority of zero. However, managing Ids for things that opt out of prioritization is a little more complex.

   Private Function GetId( _
            ByVal tuple As TempTuple( _
                Of Primitives.ComposablePartDefinition, _
                   Primitives.ExportDefinition)) _
            As String
      ' The guid means any export that opted out of prioritization is included
      Dim idObject As Object = Nothing
      tuple.Second.Metadata.TryGetValue("Id", idObject)
      Dim id = CStr(idObject)
      If String.IsNullOrEmpty(id) Then
         Return Guid.NewGuid().ToString()
      End If
      Return id
   End Function

 

If the Id exists, it’s used. If the Id does not exist, a Guid takes its place for the purpose of the filter. That means anything without an Id will always be considered the sole member of a unique group and will always be included, since one member is selected for each Id.
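The GetPriority() helper isn’t shown above; here’s a minimal sketch, mirroring GetId() and assuming a numeric “Priority” metadata value that defaults to zero when a part opts out:

   Private Function GetPriority( _
            ByVal tuple As TempTuple( _
                Of Primitives.ComposablePartDefinition, _
                   Primitives.ExportDefinition)) _
            As Integer
      ' Parts that opted out of prioritization get the default priority of zero
      Dim priorityObject As Object = Nothing
      tuple.Second.Metadata.TryGetValue("Priority", priorityObject)
      If priorityObject Is Nothing Then
         Return 0
      End If
      Return CInt(priorityObject)
   End Function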

The amusing thing about this solution is looking back in retrospect at where I spent my time. I spent a lot of time figuring out what could be done with MEF. I spent some time doing something wrong with Ids at the metadata attribute level (which does not need to be complex) and spent some time on that KinkyLinq statement. Thanks to Glenn Block, Jay Harlow, Rob Teixaira, and Andreas for their help.

What I didn’t spend much time on is actually fitting the solution into MEF. Adding this additional filtering involved only the code here (plus the GetPriority() helper sketched above). How simple can you get? The extensibility of MEF is what really makes playing with it worthwhile. If you hit a problem, like my unique filtering needs, you can probably find a way to work around it.

 

What’s Wrong with T4?

First, hugs and kisses to the people at Microsoft who made the decision to include T4 in Visual Studio 2008 instead of having it only available within the separately downloaded DSL toolkit. This indicated an important commitment by Microsoft to T4 and was a really positive step for code generation overall. Thanks guys!

I think the option wasn’t to have T4 done right or wrong in VS 2008, but to have T4 done wrong or not to have it. I’m happy the decision to include it was made, but I also think people need to understand what’s wrong with it.

First, the T4 template engine runs code in place and puts the result in the same project as the template, in a dependent location. Huh? When would you ever want this? A core purpose of templates is to reuse them across many projects, and once you get to working with and debugging the project, the template should be in the background, not the foreground. Oh, and this design strongly implies a single output file, an approach no one outside Microsoft has pursued (and by the way, Microsoft, we hate single file output). The folks working on the T4 Toolbox have shown us ways around this, but they are hacks giving templates inappropriate responsibilities for organizing the generation process. Templates should have the single responsibility of defining what to output.

Second, templates do not have a mechanism for passing parameters. Huh? We always pass parameters to templates, and we are almost always looping when we do it. There are hooks to pass parameters in T4, but you must understand some pretty complex plumbing to write into these hooks and it takes a few hundred lines of ugly code and internal, on the fly code generation.

Third, templates represent two interleaved sets of code. That’s the nature of templates, and in T4 this is C# or VB code embedded within the artifact definition, which is often itself the definition of C# or VB code. And in the Microsoft provided editor it’s all the same color (black). Huh? Did you ever notice that these are nearly unreadable when displayed in a single color?

Sure you can fix these problems on your own, but everyone doing things their own way both wastes time and increases software entropy. Software entropy is the converse of feedback systems, and I’ll be talking more about that in relation to architecture expression in the next few days.

There’s good news here. Clarius provides a colorizing editor as a community edition to get you started. If you want fancier features, their main product is just $100. We’re releasing a community generation tool (also free) as the AppVenture Community Generation Harness. This harness is a fully composable generation harness that will initially deliver with support for T4 and class based generation (such as VB 9 XML literal generation). It’s got the basic feature list in this entry. That’s in review at the moment and I’ll post here on my blog when it’s available at www.AppVenture.com. Using these two tools together, you’ll have a very workable generation environment based on what is really a very sweet template language in T4, or on VB 9 XML literal code generation, or on any other type of generation someone writes a wrapper for. And thanks to Clarius and AppVenture, it’s all free to you.

MEF Assembly Granularity

I’ve been contemplating how to organize MEF assemblies. I think the process I went through establishing the first cut at organization, and the shakedown of that strategy, may be interesting to other people designing MEF systems.


As a quick review, MEF lets you throw parts into a MEF container and sort out how parts work together at runtime. Parts are recognized by a string identifier. I’m almost always using interfaces as the contract and the interface names as the identifiers. Parts reside within assemblies and in the common case assemblies are discovered because they are grouped into anticipated directories.


With this approach, only the part implementing the interface and the part that is using the interface need to understand the interface or explicitly reference the interface’s assembly. And since parts are discovered and loaded at an assembly level, the granularity of implementing assemblies also controls the granularity of the load. I care about the granularity of contract/interface assemblies so excess stuff can be avoided and naming conflicts (resolved via namespaces) are minimized. I care about the granularity of implementation assemblies because until I add a priority system with additional granularity, prioritization/defaults are only as granular as their containing assemblies.


At one extreme, all interfaces reside in one assembly and all implementations reside in another. (It doesn’t make sense to put them into the same assembly, because then hard coded references exist and ensuring isolation is difficult.) At the other extreme, every interface and every implementation resides in its own assembly. I think both of these extremes are terrible solutions. That’s because this composable system (and I would think any composable system) has parts with very different roles and lineages/histories. In the simplest sense for a generator – metadata providers and templates are fundamentally different and could easily be provided by different teams.


Initially I thought the primary consideration should be the implementation deployment, but Phil Spidey pointed out in the MEF discussions that the interface organization is more important, because once released to the wild it might be hard to fix.


I decided on six contract assemblies:


  • CommonContracts – Interfaces referenced by the template harness itself
  • CommonDatabaseMetadataContracts – Interfaces sharing database structure
  • CommonDomainMetadataContracts – Interfaces sharing business object structure
  • CommonNamingServiceContracts – Interfaces for a naming service
  • CommonOutputServiceContracts – Interfaces for outputting data, including hashing
  • CommonServiceContracts – Miscellaneous interfaces that don’t fit elsewhere



I’ve used a few criteria for this design:


Interfaces that are used by the system and therefore can’t easily be changed reside together in CommonContracts. The template harness also references CommonOutputServiceContracts, but this is a separate assembly because it has a distinct purpose, may evolve on a different time frame, and you are far more likely to provide alternate implementations for output than for the core interfaces.


The naming service is also a separate assembly because it serves a distinct purpose and some people will certainly supply alternate implementations to manage human languages other than US English. Both the output service and the naming service are a few distinct interfaces that work together. I also had a few oddball interfaces and decided to go with a grab bag of miscellaneous interfaces rather than a separate assembly for each. Time will tell whether that is a good decision.


I initially put the two metadata interfaces into a single assembly, but I think it’s quite likely that these interfaces will evolve separately and almost certain that they will be implemented independently.


I’d like to note that the first version of the harness, which is almost, almost done (a separate blog post), will be a CTP/alpha level release. I will take feedback on the interfaces and I do expect them to change. A core part of the composable design is that you can spin off your own interfaces/implementations, so while these changes will be breaking, you can adopt them at your own pace.

Talking to T4 – When NOT to MEF-ify

If you aren’t currently creating T4 templates, skip this post as a rather geeky exploration of something you should never have to touch. Hopefully in another few days you’ll have a harness to take care of this ickiness.

T4 templates should know their own output file name! This is not a function of the host or the harness or anything else. We want to wrap up the responsibility of creating a template in one location (with any redirection necessary by the class to get the job done and support single responsibility).

How to do this? My first idea was way nerdy wacko cool – use MEF! It’s new, it’s cool, everyone loves something with such a cute name!

The first step to MEF-ify was moderately easy – add a class in the T4 template that has an export (or a property, or whatever). This just involves the usual ickiness of setting T4 references, imports, etc. Messy, but if you are working with T4 hosts and engines, you know this stuff. Got that done.

The code for the MEF export approach might look like:

<#+
[Export(typeof(IT4Template))]
public class TemplateSupport : IT4Template
{
    public string GetFileName()
    {
        return "Fred";
    }
}
#>

Then comes the harder part – where do you deal with the MEF container? Someplace in looking through this, I realized why this is NOT a good use for MEF. Two reasons, actually. First, I know exactly where the value should come from. I suppose I could argue for some sort of default service that uses an embedded naming pattern, but I already know that embedded naming patterns get ugly over time. XSLT will require a naming pattern (or call backs, which are problematic during testing), but MEF, XML literals, etc. will not. We need a single location, the template, to tell us the output file name. And it needs to run code to determine the name. I can’t predict for a pattern language everything someone will want to do, and if I leave people to extend a pattern service – well, complexity goes up.

The second reason is that there is another way that is demonstrably less complex. To meet this bar, we need code simpler than what I showed above.

The second approach I considered was a value holder service for the host. Basically the template could request a service (which could be the template itself) that supported the service and simply set the value. Then after the template ran, the engine could retrieve the filename from the host, which was holding it for everyone’s convenience:

<# (this.Host as IValueHolderService).SetValue("FileName", "Fred"); #>

 

That’s still pretty ugly. It occurred to me that we already have a value holder service; we just call it properties, and it contains the parameters to the T4 template. I’m just going to swipe a slot in the properties dictionary for my use, which by implication means you can’t use the property name I use. I decided it would not be nice to force you to avoid having a property named FileName with some purpose other than the output file name. The name needs to be exquisitely precise, because I am messing in your logical symbol space: TemplateOutputFileName does the trick for me (I’ll be happy to take other suggestions).

Now, you can just set the name of the output file in code, using whatever additional functionality you need. At the level of the T4 template code, you are just setting the value of a variable. The result is something like:

<# TemplateOutputFileName = "Fred"; #>

 

This fulfills my goal of a very simple way for you to set the filename inside your T4 template while allowing you to run any code you need. There will be some T4 geeks that do a double take at the fact this is not a directive, but directives do not allow internal template code to run. If this gives people heartburn they can either declare the property themselves (I won’t redeclare it) or I’ll also add support for a directive if people really want it.
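For completeness, here’s a hypothetical sketch of the harness side of this handshake. The dictionary parameter, key handling, and fallback name are my assumptions, not part of any T4 API:

Imports System.Collections.Generic
Imports System.IO

Public Module TemplateOutputWriter
   ' Hypothetical: after the template runs, the harness pulls the output file
   ' name out of the same dictionary the template sees as its properties.
   Public Sub WriteTemplateOutput( _
            ByVal properties As IDictionary(Of String, Object), _
            ByVal generatedText As String)
      Dim value As Object = Nothing
      Dim fileName As String = "GeneratedOutput.txt"   ' fallback; assumption only
      If properties.TryGetValue("TemplateOutputFileName", value) _
            AndAlso Not String.IsNullOrEmpty(CStr(value)) Then
         fileName = CStr(value)
      End If
      File.WriteAllText(fileName, generatedText)
   End Sub
End Module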

Binding to a MEF ExportCollection

Josh said this in the MEF discussion list on CodePlex:

MEF is just a composition engine; its job isn’t really to offer choices for users.

While that is true, it’s not that hard to present choices to users and it gives you a bit of insight into the very powerful strongly typed metadata attributes of MEF.

MEF is a tool for matching up parts. Its discoverability is not against the actual class, because that would require too deep a dive into the assembly. Its discoverability is based on class attributes which announce that a part fulfills a contract. In MEF, a contract ID is always a string, and you’re committing to fulfill expectations someone will have about something that claims that contract. I almost always use one of two types of contracts – type contracts against interfaces that formalize my commitments, and simple strings when I have a random piece of information floating around, like a database or server name. (To clarify how it works: MEF turns the type into a string for its internal use.)
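As a quick illustration of the two contract styles (the interface, class, and property names here are examples, not harness types):

Imports System.ComponentModel.Composition

Public Interface INamingService
   Function Pluralize(ByVal name As String) As String
End Interface

' Type contract: MEF derives the contract string from the interface type.
<Export(GetType(INamingService))> _
Public Class DefaultNamingService
   Implements INamingService

   Public Function Pluralize(ByVal name As String) As String _
         Implements INamingService.Pluralize
      Return name & "s"   ' naive pluralization, illustration only
   End Function
End Class

' String contract: handy for a loose piece of information like a server name.
Public Class ConnectionInfo
   <Export("DatabaseServerName")> _
   Public ReadOnly Property ServerName() As String
      Get
         Return "localhost"
      End Get
   End Property
End Class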

When you create an interface for MEF you have the option to create metadata to use with it. Metadata generally (always in my case) exists as two pieces – an attribute and an interface. What happens internally is that MEF creates a dictionary with the data provided by the attribute. At runtime, you can request that you only see parts that fulfill the contract and metadata by supplying contract interface and metadata interface. If you strongly type this via the generic overload you have a MetadataView property which is of the type of your metadata interface.

You can explore this more in the documentation for strongly typed attributes at the MEF site on CodePlex.
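Here’s a minimal sketch of such an attribute/interface pair, following the strongly typed metadata pattern from those docs (the names are illustrative):

Imports System
Imports System.ComponentModel.Composition

' The metadata view: MEF projects the export's metadata dictionary onto this.
Public Interface IFriendlyNameMetadata
   ReadOnly Property FriendlyName() As String
End Interface

' The matching attribute: each public property becomes a dictionary entry.
<MetadataAttribute()> _
<AttributeUsage(AttributeTargets.Class)> _
Public Class FriendlyNameAttribute
   Inherits Attribute

   Private ReadOnly mFriendlyName As String

   Public Sub New(ByVal friendlyName As String)
      mFriendlyName = friendlyName
   End Sub

   Public ReadOnly Property FriendlyName() As String
      Get
         Return mFriendlyName
      End Get
   End Property
End Class

A part then stacks its export attribute and <FriendlyName("...")> on the class, and an importer that names the metadata interface sees FriendlyName through the strongly typed MetadataView property.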

From a binding perspective, if you want to bind something you create an ExportCollection(Of IFoo, IFooMetadata). This is a collection of export objects which have a IFooMetadata strongly typed MetadataView property. With me so far?

   <Import()> _
   Private mDatabaseServices As _
         ExportCollection(Of IDatabaseServices, _
                             IDatabaseServicesComposition)

 

And in the Load method:

      providerComboBox.ItemsSource = mDatabaseServices

 

You can bind to this export collection (FriendlyName is a property on my metadata interface):

      <ComboBox Grid.Row="1" Grid.Column="1" Name="providerComboBox"
                SelectionChanged="providerComboBox_SelectionChanged">
         <ComboBox.ItemTemplate>
            <DataTemplate>
               <TextBlock Text="{Binding Path=MetadataView.FriendlyName}"/>
            </DataTemplate>
         </ComboBox.ItemTemplate>
      </ComboBox>

 

ExportCollection is not an ObservableCollection, so if you want those extra features, you can create one yourself.  In the Load method this is simply:

      availableOutputs.ItemsSource = OutputItem.GetBindableList(mOutputListeners)

 

where OutputItem is

Public Class OutputItem

   Private mExport As Export(Of IOutputListener, IOutputListenerComposition)

   Public Shared Function GetBindableList( _
               ByVal list As IEnumerable(Of Export(Of IOutputListener, IOutputListenerComposition))) _
               As ObservableCollection(Of OutputItem)
      Dim ret = New ObservableCollection(Of OutputItem)
      For Each item In list
         ret.Add(New OutputItem(item))
      Next
      Return ret
   End Function

   Private Sub New(ByVal export As Export(Of IOutputListener, IOutputListenerComposition))
      mExport = export
   End Sub

   Public ReadOnly Property FriendlyName() As String
      Get
         Return mExport.MetadataView.FriendlyName
      End Get
   End Property

   ' The IsActive property bound in the XAML below is omitted here.
End Class

 

The XAML for binding this is:

      <ListBox Grid.Row="5" Grid.Column="1" Name="availableOutputs">
         <ListBox.ItemTemplate>
            <DataTemplate>
               <CheckBox Content="{Binding Path=FriendlyName}"
                         IsChecked="{Binding Path=IsActive}" />
            </DataTemplate>
         </ListBox.ItemTemplate>
      </ListBox>

 

Differences other than in the binding exist because I’m solving a different problem.

 

What Does a T4/Code Generation Harness Need to do?

I’m struggling to get the AppVenture Community Edition Code Generation harness into release because I can’t figure out where the boundaries should be. I initially thought I could just reuse my old stuff in the area of data extraction and mapping/morphing, but too much has changed.

What’s in:

  • Multi-UI supporting core code generation engine
  • Full composability via MEF
  • Configuration driven ordering of automatically discovered templates
  • Template focused generation (templates know stuff)
  • Multi-file output via a simple naming mechanism
  • Support for VB 9 XML literal templates
  • Support for T4 templates
  • A few example templates at Hello <item> level
  • Key service interfaces defined
  • T4 property extraction/value setting
  • Partial data extraction (tables and a few others)
  • Rudimentary US English pluralization service

What’s delayed:

  • Full data extraction (views, foreign keys, etc)
  • Mapping across data/object impedance mismatch boundary
  • Full project templates (stored procs and biz layer)
  • Hashing output files to protect handcrafted code
  • Services to load stored procedures
  • Services to create project files
  • XSLT template support
  • More services such as a better naming service

The core of this design is that I do not want to stand between you and your templates, metadata or services. Thus if you have metadata, you can hook this up and generate anything you can write T4 or VB9 XML literal templates for. You can create any services, metadata or templates you want.
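To give a taste of what a VB 9 XML literal template can look like, here’s an illustrative sketch (not one of the shipped example templates):

Public Class HelloClassTemplate
   ' An XML literal with embedded expressions; .Value extracts the text.
   Public Function Generate(ByVal className As String) As String
      Dim code = <code>
Public Class <%= className %>
   Public Sub SayHello()
      Console.WriteLine("Hello from <%= className %>")
   End Sub
End Class
</code>
      Return code.Value
   End Function
End Class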

Delaying stuff is hard. I want this to fully meet the capabilities of the GenDotNet harness so I can retire it. This is a vastly superior way to approach the problem.

So, where’s the boundary? For this tool to be worth a look, what does it need to do? This is V1, the rest will come. I’m trying to balance when to release V1 and I’m doing this all in my “spare” time so some patience will be needed.

Comments please!

A Quiet Conversation about DDD and Data First Design

At the MVP Summit I had the pleasure of sitting down at a party for some one on one time with Don Smith. I’m trying to think of my blog as a nice little corner to talk, rather than a soapbox. I want to share something that is not shouted from the rooftops. Eeegads, I don’t want to start another debate on this.

A few months ago, the EF team started a wiki where Ward Bell and I felt quite attacked for suggesting that DDD is not always the best approach. And thus, it is with some trepidation that I touch this topic. But today’s Database Weekly has a column on it and I really feel there’s stuff worth hearing. If you’re here in my nice intimate corner, you can hear it.

The question of whether to start with a database or a domain (business object) model makes no sense. The answer is easy: start with the one most likely to bring you success, and don’t ignore the impedance mismatch problem.

A well structured application has a good domain model and a good (relational) database and a good strategy to cross the impedance mismatch boundary. That boundary exists because neither the domain nor the database should drive the structure of the other.

A database might be a more successful starting point if you have good, stubborn, or available DBAs, or if your DBAs are good analysts. If you’re a small shop – which do you build better, and have you ever tried building it the other way? Database first is also often a good starting point if you have an existing database. Even if the database is bad, it contains the existing business, and it’s my belief we should never close our eyes to a way the business has already expressed itself if we can get hold of it (it’s not in code). While we should consider available expressions of the business, we should not blindly accept any piece without also exploring its problems.

A domain might be a more successful starting point if you have good, stubborn, or available coders, or if your coders are good analysts. If you’re a small shop – which do you build better, and have you ever tried building it the other way? Domain first (DDD) can also be a good starting point if you have an existing database. If you build a domain model that you constantly validate against the existing database, you can base your thinking on experience while not being stuck in that experience. While we should consider available expressions of the business, we should not blindly accept any piece without also exploring its problems.

If it’s an even match, consider DDD. The issues are more subtle and getting them out of the way might be helpful to your project.

The monumental disservice that resulted from the EF wiki (which has thankfully now died a formal death) is that this decision appeared to be a religious one or one that marked you in one camp, or perhaps to some even something about your level of coding. All of that is stupid.

- Do DDD or database first based on what makes sense in your specific scenario

- Whichever way you start, attention to the impedance mismatch will minimize negative consequences to the other side of the boundary

It comes down to the obvious. It’s your team, it’s your project. Make decisions based on your reality, not dogma. Learn from the debates in our industry. Don’t pick sides and follow blindly (even my side.)

So, now we can go back to the rest of the party. If this kicks off another brawl, I suggest slipping out by the side door.

Dancing with Red Herrings

I don’t know the history of the term red herring, but to me it means something you’re looking at thinking it’s the problem, when it’s just not the right problem.

I have to return to the red herring in my recent MEF debugging post. I wasted a lot of calendar time thinking the problem was in MEF. It wasn’t. It was in my code.

The biggest hurdle to effective debugging, the biggest time waster, the mind killer of debugging is the red herring. When you are debugging the quickest way to become ineffective is to latch on to a perceived solution and cease to look in other directions. Nancy Folsom and I wrote about this years ago as I was adopting her style of debugging. Her answer is very simple – debug using a scientific method which means you attempt to prove yourself wrong. When you think a problem is in a particular location, figure out a test that will tell you that you are wrong, and do not cling to the incorrect assumption that the converse is true.

There are few tests that can prove you are correct, because there are many reasons for a particular result. If you look back at my previous blog post, there’s a profound example of this. While I didn’t perceive it that way at the time, my test of the dictionary was actually evidence of an inheritance problem – but in the attribute, while I thought the inheritance problem was in the interface. And remember, it wasn’t an inheritance problem at all. Had I taken that test as “strong indication” (similar to, but not quite as strong as, “proof”) that my theory was correct, I would have continued to dance with the red herring and not proceeded toward the solution.

Debugging is full of red herrings. If you watch your debugging process you’ll find you make dozens of incorrect guesses about the problem. This is good. You don’t know what the problem is and you must creatively come up with dozens of ideas. The trick to effective debugging is to come up with these ideas, phrase them as questions that can be answered, prioritize them based on likelihood and ease of testing, and then run tests that prove each idea wrong so the wrong ideas can be discarded as quickly as possible.

Team this with “Price is Right” techniques. There was a game on that show where someone had a short amount of time to guess the right price, hearing only whether each guess was high or low. The smart player guessed any number, then split the difference between that and a boundary condition. This can also be called divide and conquer debugging. While the scenarios where divide and conquer is effective are limited, it’s a valuable way of thinking/debugging in those scenarios. For example, if you have the wrong value displayed: is the database returning the right value, does the object contain the right value, and so on. It’s a subset of scientific method debugging, but it’s a significant enough subset to call out.

Enjoy your debugging! It’s where you spend most of your time.