T4 – Good, Bad and Ugly

So, there’s a buzz on T4 and you’re trying to work out what it means to you?

But first, what does it mean to me? Why do I spend more than half my time on unpaid research? The next generation of software development will have three legs to stand on: automation, composition, and development process. That’s the whole story. Code generation is the most accessible tool in the automation space – and that space is one-third of the future. I’m ready for that future. I believe that technically the industry is ready to be there. I think we have a way to go before we actually get there together, and currently my research is in the compositional architecture space.

Code generation is relatively mature. It’s entirely usable. You’re probably using it today. You can probably use it to make your applications better today. You can probably use it better today.

T4 (*) is a very sweet template language. Gareth Jones and the totally awesome DSL team at Microsoft brought you this tool. Its original purpose was to express DSLs in 3GL code (VB/C#), but ultimately all transformations are just that: transformations. Generative transformations are almost always merge transformations: you’ve got something called metadata that describes aspects of your application, you’ve got something called a template, and running the template produces something, in our case code. Any transformation engine (XSLT, T4, VB XML literals, etc.) can produce code. T4 can produce any kind of text output. (I want to see blog posts from people who customized their Christmas letters by applying category metadata to the people on their list and then ran T4 to produce a sensible letter. Extra points for outputting Office XML in a pretty format instead of cutting and pasting the results.)
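To make the merge idea concrete, here is a minimal sketch in Python rather than T4. It only illustrates the shape of metadata plus template producing code. The names CUSTOMER_METADATA, CLASS_TEMPLATE, and generate_class are made up for this example and are not part of T4 or any real tool.

```python
from string import Template

# Hypothetical metadata: describes one aspect of the application
# (an entity and the fields it carries).
CUSTOMER_METADATA = {
    "class_name": "Customer",
    "fields": ["name", "email"],
}

# Hypothetical template: the fixed shape of the code we want to emit.
CLASS_TEMPLATE = Template(
    "class $class_name:\n"
    "    def __init__(self, $args):\n"
    "$assignments\n"
)


def generate_class(metadata: dict) -> str:
    """Merge the metadata into the template and return source text."""
    fields = metadata["fields"]
    assignments = "\n".join(f"        self.{f} = {f}" for f in fields)
    return CLASS_TEMPLATE.substitute(
        class_name=metadata["class_name"],
        args=", ".join(fields),
        assignments=assignments,
    )


if __name__ == "__main__":
    # Prints the generated class definition as text.
    print(generate_class(CUSTOMER_METADATA))
```

Changing the metadata (say, adding a field) changes the output without touching the template, and changing the template changes every generated class without touching the metadata. That separation is the point of the merge.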

This post is the overview of a four-part series. Follow-on posts will cover the good, bad, and ugly of T4.

· The good is that it works. It’s a great opportunity.

· The bad is that the level of support for it is way below what we expect from our tools.

· The ugly is an attitude that generation templates are just some under-the-hood gunk that doesn’t matter. It’s code. You decide whether to write good code or bad code.

I believe we do our work on top of two sciences: the hard science of traditional Computer Science and a fluffy science that is the collision of economics, business and management engineering, human ergonomics, and Computer Science. CS offers the gears – the science of building the machine is more interesting. If there’s a science, we can discover (not invent) principles. I think the code generation principles I published in 2002 still stand:

1) The application remains the responsibility of and in the control of the end programmers

a. It’s that desk someone beats on

b. Primarily in the automation sense, this means they control templates

c. Templates must be in an accessible language

d. Templates must be of high quality to allow modification and reuse

e. Separation of concern applies to templates

2) Metadata is distinct, available and friendly to templates

a. Friendly is necessary to achieve the first principle (There is an approximate N^M ratio of template complexity, where M is the metadata friendliness factor. The exercise to prove this is to create templates directly against XSD metadata, or the slightly better EDMX, and then treat yourself to something friendly)

b. Metadata is a major point of debugging, it must be viewable/reportable by humans

c. Metadata should come from many sources

d. Separation of concerns applies to metadata

e. Metadata is the business: it must be owned by it and morph between technology

3) Generation should be automatic and simple

a. Simple inclusion in development workflow is necessary to achieve the first principle

b. No checklists, it just happens, one click or no click

c. This means “no friction” in today’s parlance

4) Handcrafted code is sacred and protected

a. Handcrafted code is critical to achieving the first principle

b. Humans are creative, they don’t do things the same way twice

c. Handcrafted code must be embraced by code gen architectures

d. Handcrafted code must be physically protected from mistakes

5) Generated applications are of equal or higher quality

a. High quality applications are consistent with the goals of the first principle

b. Higher quality code

c. Quality is a broad metric including economics
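One way to read “physically protected” in principle 4 is that the generator should be unable to overwrite a file a human wrote. The sketch below is only an illustration under that assumption: the marker text and the write_generated helper are hypothetical, and a real code gen architecture might instead rely on separate folders or on language features such as partial classes in .NET.

```python
from pathlib import Path

# Hypothetical marker that the generator stamps on every file it owns.
GENERATED_MARKER = "# <auto-generated> Do not edit by hand.\n"


def write_generated(path: Path, body: str) -> bool:
    """Write generated output unless the target looks handcrafted.

    Returns True if the file was written, False if it was left untouched.
    """
    if path.exists() and not path.read_text().startswith(GENERATED_MARKER):
        # No marker: assume a human owns this file and never overwrite it.
        return False
    path.write_text(GENERATED_MARKER + body)
    return True
```

Regeneration stays a one-click or no-click operation, because the check runs on every write and nobody has to remember which files are safe to clobber.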

(*) T4 stands for Text Template Transformation Toolkit. Amaze your friends at cocktail parties with that trivia.

About a Dozen Things

It’s very easy to believe that our code does a lot of different things – we live in a world of incredible complexity with well over 10,000 classes and 100,000 members in the .NET framework (1). We write code of great complexity – sometimes it’s amazing that it even works.

Take a minute to stand back from your code and consider what it actually does. I have the privilege of being part of the Northern Colorado Architects, and we asked ourselves this question last spring. In several talks I’ve mentioned that we found about a dozen things, and I was asked to share the list. I’d like you to challenge this list if you think your code does anything else. Well, that and because I think I am forgetting at least one:

- Persistence

- Validation

- Authorization

- Localization

- Display/Edit

- Report

- Log/audit

- Test

- Exception avoidance and recovery

- Process

- Calculate

- Workflow (*)

The complexity of our world comes from the thousands of ways we can do each of these things and the billions of combinations. The complexity of a particular software application comes from tossing all of these concerns together along with a top-dressing of entropy and stirring vigorously.

The first three items on the list are pretty straightforward. There is similarity between validation and authorization because they are both guards, but validation focuses on what and authorization on who, and we tend to use different techniques.
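A small sketch of the two guards side by side may help. Everything here (User, Order, the “buyer” role, the submit_order action) is invented for illustration; the only point is that one guard inspects the data and the other inspects the caller.

```python
from dataclasses import dataclass


@dataclass
class User:
    name: str
    roles: frozenset


@dataclass
class Order:
    quantity: int


def validate(order: Order) -> list:
    """Guard on *what*: is the data itself acceptable?"""
    errors = []
    if order.quantity <= 0:
        errors.append("quantity must be positive")
    return errors


def authorize(user: User, action: str) -> bool:
    """Guard on *who*: may this user perform the action?"""
    required_role = {"submit_order": "buyer"}  # hypothetical action-to-role map
    return required_role.get(action) in user.roles


def submit_order(user: User, order: Order) -> str:
    """Run both guards before doing any work."""
    if not authorize(user, "submit_order"):
        raise PermissionError(f"{user.name} may not submit orders")
    errors = validate(order)
    if errors:
        raise ValueError("; ".join(errors))
    return "submitted"
```

Both functions sit in front of the same operation, yet they take different inputs and fail in different ways, which is one reason we tend to implement them with different techniques.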

Localization may eventually fold into Display/Edit but today code on that front is rather different. Display/edit and reporting are also very similar. They differ because reporting is complex read-only analysis which might not be done over current data and may or may not ever appear on paper.

Three of the items on that list we do for internal purposes: logging/auditing, testing, and exception management. Assertions and code contracts are part of exception management and what I mean by exception avoidance. Something is wrong and we’re responding to that, rather than crashing our systems – but it’s still a response to something being wrong. Logging and auditing covers all system health reporting and hopefully testing is a straightforward concept – even if it’s not straightforward in practice.

You might consider it cheating to have two buckets as big as “process” and “calculate.” Certainly they are critical and complex. But from a concerns point of view, it doesn’t really matter what you are doing – you’re doing it. I make a distinction between a process that changes the state of the universe (often by altering a database or an external system) and calculation which supplies information without altering the state of the universe.

Only three things (other than workflow) on this list should alter the state of the universe – persistence, logging, and process.
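The process/calculate split can be sketched in a few lines. The names below (calculate_total, OrderStore, place_order) are hypothetical, and the in-memory list stands in for a database or external system.

```python
def calculate_total(prices: list, tax_rate: float) -> float:
    """Calculation: supplies information and changes nothing."""
    return round(sum(prices) * (1 + tax_rate), 2)


class OrderStore:
    """Hypothetical stand-in for a database or external system."""

    def __init__(self) -> None:
        self.orders = []


def place_order(store: OrderStore, prices: list, tax_rate: float) -> int:
    """Process: alters state by recording the order, then returns its id."""
    total = calculate_total(prices, tax_rate)
    store.orders.append(total)
    return len(store.orders)
```

Calling calculate_total twice with the same arguments gives the same answer and leaves the store alone. Calling place_order twice leaves two orders behind. Keeping the calculation free of side effects is what lets the process reuse it safely.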

Workflow falls into a very special category. I mean a specific kind of workflow – the interaction between the application and the non-software business world it lives in. I don’t mean using workflow or business integration tools to do processing. I’m not actually sure workflow should be on the list, because in practice, it merely uses the other eleven kinds of code. At the end of my contemplation, I include it because I’d rather not spend time arguing about whether it should be there. I do believe that this sense of workflow is one of the most important perspectives we can have when we step back from our code and think as analysts. It’s a perspective that includes user stories as a subset.

There’s a lot of grey area where the things our code does overlap. Is there any code in your space that does not fall into one of these buckets?

It’s a profound view. Everything is a cross-cutting concern.

(1) Brad Abrams, Number of Types in the .NET Framework, http://blogs.msdn.com/b/brada/archive/2008/03/17/number-of-types-in-the-net-framework.aspx