RoslynDom Quick Start

This document is about an early version of RoslynDom, focusing mostly on working features, with notes on the impact of missing certain upcoming features. You can also see notes on missing features in GitHub issues.

You can find the code for these quickstarts in the RoslynDomExampleTests NuGet package.

For more information see these documents in the “Documents” folder on GitHub (Creation of these documents is currently in progress):

  • See the RoslynDom Project Layout if you are curious about why there are five projects and the dependencies these projects have on the .NET Compiler Platform (Microsoft.CodeAnalysis), CSharp compiler (Microsoft.CodeAnalysis.CSharp) and Unity (Microsoft.Practices.Unity.*)
  • See the RoslynDom Design Overview for a discussion of how RoslynDom is built
  • See the RoslynDom Extensibility if you’re interested in doing more with RoslynDom
  • See the RoslynDom Roadmap.ppt for a vision of RoslynDom

What is RoslynDom

RoslynDom is an alternative view of your code.

The most efficient, best way to express your code in ASCII is your code in your favorite language.

The most efficient, best way to organize your code for your compiler is within the compiler, and exposed as part of the .NET Compiler Platform, Roslyn.

Another, ephemeral expression of your code is the one in your head. This is the one that comes out in words in your meetings, and you have entire meetings without phrases like “angle bracket.”

RoslynDom models this third expression of code in memory. The model has several features:

  • You can load existing code into the RoslynDom model and easily explore, navigate and analyze it. The RoslynDom model is language agnostic.

This feature is currently affected by not yet having multi-file support.

  • RoslynDom is mutable. You can alter your code in a load, alter, alter, alter, build output model. Since you can easily navigate your code, finding the location for change is easy.
  • RoslynDom entirely isolates the language dependent load process from the model itself. At a simplistic level, when the VB factory is in place, you can load from C# and output to VB and vice versa.
  • RoslynDom models can be created in any manner you desire. RoslynDom views can be created without loading code, and then brand new code created.

 

The basic model

Code exists in the following hierarchy

  • Root groups which are groups of files (not yet implemented)
  • Roots, which are files or a single conceptual load unit
  • Stem members – Namespaces and types that can be contained in roots
  • Type members – nested types, methods, properties, etc. that can be contained in types
  • Statements – code statements that are held primarily in methods and property accessors
  • Expressions – sub parts of statements that return values

Most major features, including most statements, are complete; see GitHub issues for what is missing.

Expressions are currently handled via strings by design.

Walkthrough 1: Load and check code

Step 1: Load your code

Add a using statement for RoslynDom.CSharp.

Retrieve the singleton factory instance from the RDomCSharp.Factory property and call its GetRootFromFile method to open a specific file:

var factory = RDomCSharp.Factory;
var root = factory.GetRootFromFile(fileName);


NOTE: Other overloads support loading source code from strings or trees.

NOTE: You can iteratively work through the files in your project or solution. This approach will be hampered because specifying references and multiple syntax trees for the underlying model isn’t yet supported.

Of course you can assign the factory property to a local variable or class field if you prefer.

RDomCSharp is the code that creates the language agnostic RoslynDom tree from C# code, and that can recreate C# code from the RoslynDom tree. You can create a RoslynDom tree from scratch as well. You will later be able to load from other languages, in particular VB.NET.


Step 2: Check your code


Output your code to a string to test the output. You can do this by outputting to a new file and comparing the files:


var output = factory.BuildSyntax(root).ToString();
File.WriteAllText(outputFileName, output);


Conclusion


You now know how to load and output code from RoslynDom.

Walkthrough 2: Navigate and interrogate code


One of the major user scenarios intended for RoslynDom is to allow you to answer questions about your code. This is just a small sampling of the kinds of things you can do.

At present, RoslynDom supports structural features (classes, methods, etc) and statements. It does not support expressions because user stories with value aren’t yet clear.

Step 1: Load and check code


Load and check your code as shown in Walkthrough 1.

Step 2: Ask general questions about code


LINQ is your friend.

You’ll often find it convenient to make an array for easier sequential requests in testing.


var root = RDomCSharp.Factory.GetRootFromFile(fileName);
Assert.AreEqual(1, root.Usings.Count());
Assert.AreEqual("System", root.Usings.First().Name);
Assert.AreEqual(1, root.Namespaces.Count());
Assert.AreEqual(1, root.RootClasses.Count());


Assigning intermediate values to variables in tests can help clarity:


var methods = root.RootClasses.First().Methods.ToArray();
Assert.AreEqual(0, methods[0].Parameters.Count());
Assert.AreEqual(1, methods[1].Parameters.Count());
Assert.AreEqual("x", methods[1].Parameters.First().Name);


The difference between Classes and RootClasses is that root classes include all classes under the root, regardless of namespace. Classes are only those directly under the root. The same distinction applies to Interfaces, Enums and Structures.

Step 3: Place a breakpoint and query code


Place a breakpoint, run the test in debug mode and ask questions in the immediate window about the code. Sometimes you’ll have to use the Watch window because of the .NET Compiler Platform CTP behavior. Have fun!

Step 4: Ask harder questions


That might have been fun, but the real value from RoslynDom comes from asking complex questions. I’ll introduce LINQ in this walkthrough, and then show something I really wanted to accomplish in the next.

Ensure RoslynDom.Common and System.Linq are included in the using statements.

Let’s say you’re concerned about unsigned int variables in your code and want to examine their names. I don’t know why, I just had to make something up.

You can retrieve the RoslynDom entries with:


var uintVars = root
.Descendants.OfType<IVariable>()
.Where(x => x.Type.Name.StartsWith("UInt"))
.Select(x => x.Name);


NOTE: Aliases are language specific, but RoslynDom entries are language agnostic, so use the .NET name of the type. The CSharp factory is responsible for straightening this out on output.


As another example, say you want all the methods and variables where unsigned ints are used:


var uintCode = (from cl in root.Descendants.OfType<IStatementContainer>()
from v in cl.Descendants.OfType<IVariable>()
where v.Type.Name.StartsWith("UInt")
select new
{
containerName = cl.Name,
variableName = v.Name
} )
.ToArray();


Walkthrough 3: Finding questionable implicit variable typing


I have a sin when I code. I really like ignoring types. When I write code I use var everywhere. This saves me time. But, I realize it can result in code that’s less readable.

I can accept a rule that implicit variable typing should only be used on object instantiation, strings, Int32 (int), and DateTime in VB. VB isn’t yet supported.

This combination of selecting types based on the implemented interfaces, and examining additional properties like types and names, is very powerful in finding particular locations in code. I want to find all the implicitly typed variables that are not object instantiations, assignments of literal strings, or assignments of integers.

Since this is a complicated question, I’ll ask in steps, although you can certainly refactor this into a single statement if you prefer. LINQ doesn’t evaluate until requested, so the piecewise creation is not a performance issue.
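The deferred evaluation mentioned above is easy to see outside RoslynDom. This is a minimal sketch using only System.Linq on a plain array (the counter and variable names are mine, not from the RoslynDom tests): it counts how often the predicate runs before and after the query is enumerated.

```csharp
using System;
using System.Linq;

// Count how many times the predicate runs to see when LINQ does its work.
var evaluations = 0;
var numbers = new[] { 1, 2, 3, 4, 5, 6 };

// Building the query runs nothing yet.
var evens = numbers.Where(n => { evaluations++; return n % 2 == 0; });
var countAfterBuild = evaluations;

// Enumerating (ToArray, foreach, Count, ...) runs the predicate once per element.
var result = evens.ToArray();
var countAfterEnumerate = evaluations;

Console.WriteLine($"{countAfterBuild} {countAfterEnumerate} {string.Join(",", result)}");
// prints "0 6 2,4,6"
```

One consequence worth keeping in mind: each new enumeration runs the query again, so if you plan to reuse a result several times, capture it with ToArray first.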

Find all implicitly typed local variables:


var implicitlyTyped = root
.Descendants.OfType<IDeclarationStatement>()
.Where(x => x.IsImplicitlyTyped);


Find all instantiations, because they’re OK:


var instantiations = implicitlyTyped
.Where(x => x.Initializer.ExpressionType == ExpressionType.ObjectCreation);


Find all string, integer (32 bit) and DateTime literals, because they’re OK:


var literals = implicitlyTyped
.Where(x => x.Initializer.ExpressionType == ExpressionType.Literal &&
( x.Type.Name == "String"
|| x.Type.Name == "Int32"
|| x.Type.Name == "DateTime" )// for VB
);


Find all the implicitly typed variables that aren’t instantiations or string/int literals:


var candidates = implicitlyTyped
.Except(instantiations)
.Except(literals);
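If you want to try the shape of this query without loading any code, here is a small sketch of the same Where/Except pipeline over stand-in data. The tuples and the string values for the initializer kind are hypothetical placeholders for IDeclarationStatement and ExpressionType, chosen only so the example runs on its own.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for implicitly typed declarations:
// variable name, .NET type name, and the kind of initializer expression.
var implicitlyTyped = new (string Name, string TypeName, string InitKind)[]
{
    ("customer", "Customer", "ObjectCreation"),
    ("title",    "String",   "Literal"),
    ("count",    "Int32",    "Literal"),
    ("ret",      "String",   "Identifier"),
    ("x3",       "Int32",    "Identifier"),
};

// Instantiations are acceptable uses of var.
var instantiations = implicitlyTyped
    .Where(x => x.InitKind == "ObjectCreation");

// Literals of a few well-known types are acceptable too.
var allowedLiteralTypes = new[] { "String", "Int32", "DateTime" };
var literals = implicitlyTyped
    .Where(x => x.InitKind == "Literal" && allowedLiteralTypes.Contains(x.TypeName));

// Except is a set difference: what remains are the declarations worth a look.
var candidates = implicitlyTyped
    .Except(instantiations)
    .Except(literals)
    .Select(x => x.Name)
    .ToArray();

Console.WriteLine(string.Join(", ", candidates)); // prints "ret, x3"
```

Because Except is a set operation, duplicate entries in the first sequence collapse to one. That does not matter here, since each declaration is distinct.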


Step 6: Reporting


The code discussed here is in the ReportCodeLines method.

Once you get the information you’re interested in, you’ll probably want to output it. Obviously in reporting, you’d like file, line and column positions. RoslynDom is an abstract tree that does not directly understand text. But it holds references to key aspects of the underlying .NET Compiler Platform (Roslyn) objects, so it can report on the code it was created from. As long as you haven’t changed the tree, these aspects remain correct.

If you change the tree, the only safe way to report positions of RoslynDom elements is to recreate the underlying syntax tree, and then reload that tree into RoslynDom – generally also searching again for the elements of interest.

Because we haven’t changed the tree since loading it, this isn’t a problem.

Create a SyntaxTree from part of the RoslynDom item:


private string GetNewCode(IDom item)
{
return RDomCSharp.Factory.BuildSyntax(item).ToString();
}


Retrieve the original code that was used to create the RoslynDom element:


private string GetOldCode(IDom item)
{
var node = item.RawItem as SyntaxNode;
if (node == null)
{ return "<no syntax node>"; }
else
{
return node.ToFullString();
}
}


Retrieve the original code position:


private LinePosition GetPosition(IDom item)
{
var node = item.RawItem as SyntaxNode;
if (node == null)
{ return default(LinePosition); }
else
{
var location = node.GetLocation();
var linePos = location.GetLineSpan().StartLinePosition;
return linePos;
}
}


Retrieve the original code filename:


private string GetFileName(IDom item)
{
var root = item.Ancestors.OfType<IRoot>().FirstOrDefault();
if (root != null)
{ return root.FilePath; }
else
{
var top = item.Ancestors.Last();
var node = top as SyntaxNode;
if (node == null)
{ return "<no file name>"; }
else
{ return node.SyntaxTree.FilePath; }
}
}


You can use these helper methods in LINQ to create an IEnumerable of an anonymous type:


var lineItems = from x in items
select new
{
item = x,
fileName = GetFileName(x),
position = GetPosition(x),
code = GetNewCode(x)
};


I’ll use a string formatting trick to make pretty columnar output. I’ll first determine the length of each segment of the string output – such as the maximum file path length. I’ll replace dummy values in a format string, such as fMax, to create a custom format string for the sizes in this data:


var filePathMax = lineItems.Max(x => x.fileName.Length);
var itemMax = lineItems.Max(
x => x.item.ToString().Trim().Length);
var lineMax = lineItems.Max(
x => x.position.Line.ToString().Trim().Length);
var format = "{0, -fMax}({1,lineMax},{2,3}) {3, -itemMax} {4}"
.Replace("fMax", filePathMax.ToString())
.Replace("itemMax", itemMax.ToString())
.Replace("lineMax", lineMax.ToString());


I can then iterate across the IEnumerable of anonymous type:


var sb = new StringBuilder();
foreach (var line in lineItems)
{
sb.AppendFormat(format, line.fileName,
line.position.Line, line.position.Character,
line.item.ToString().Trim(), line.code);
sb.AppendLine();
}
return sb.ToString();



This results in nice output like (which would be nicer if I wasn’t wrapping):

Walkthrough_1_code.cs(13, 16) RoslynDom.RDomDeclarationStatement : ret {String} var ret = lastName;

Walkthrough_1_code.cs(51, 16) RoslynDom.RDomDeclarationStatement : x3 {Int32} var x3 = x2;
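The column-width trick from this step works on any data, so here it is in isolation. This is a sketch over made-up rows (the file names, line numbers and the fMax/lMax placeholders are illustrative only): measure the widest value per column, patch those widths into a composite format string, then format each row.

```csharp
using System;
using System.Linq;
using System.Text;

// Hypothetical report rows: file name, line number, and a code snippet.
var rows = new (string FileName, int Line, string Code)[]
{
    ("A.cs", 13, "var ret = lastName;"),
    ("LongerName.cs", 151, "var x3 = x2;"),
};

// Measure the widest value in each column.
var fileMax = rows.Max(r => r.FileName.Length);
var lineMax = rows.Max(r => r.Line.ToString().Length);

// Patch the widths into the format string. In composite formatting a
// negative alignment left-justifies and a positive one right-justifies.
var format = "{0,-fMax}({1,lMax}) {2}"
    .Replace("fMax", fileMax.ToString())
    .Replace("lMax", lineMax.ToString());

var sb = new StringBuilder();
foreach (var r in rows)
{
    sb.AppendFormat(format, r.FileName, r.Line, r.Code);
    sb.AppendLine();
}
Console.Write(sb.ToString());
```

With these rows the format string becomes "{0,-13}({1,3}) {2}", so the short file name is padded on the right and the two-digit line number is padded on the left.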

Walkthrough 4: Fixing questionable implicit variable typing


What good would it be to find issues if you couldn’t fix them? But I’m tired, so I’m going to mostly let you figure out how this code works based on what you’ve already learned:


[TestMethod]
public void Walkthrogh_4_Fix_implicit_variables_of_concern()
{
// Assumes Walkthrough_3 passes
var root = RDomCSharp.Factory.GetRootFromFile(fileName);
var candidates = FindImplicitVariablesOfConcern(root);
foreach (var candidate in candidates)
{
candidate.IsImplicitlyTyped = false; // All you need
}
var output = RDomCSharp.Factory.BuildSyntax( root.RootClasses.First());
// For testing, force changes through secondary mechanism
var initialCode = File.ReadAllText(fileName);
var newCode = initialCode
.Replace("var ret = lastName;", "System.String ret = lastName;")
.Replace("var x3 = x2;", "System.Int32 x3 = x2;")
.SubstringAfter("Walkthrough_1_code\r\n{\r\n")
.SubstringBeforeLast("}")
;
Assert.AreEqual(newCode, output.ToFullString());
}


The only thing that’s required is to state that these declarations should not be implicitly typed by setting IsImplicitlyTyped to false for each candidate. The rest of the code is to create a test.

But this results in the rather ugly System.String declaration. That’s jarring in a file that uses the C# aliases. That fix is in the next test:


[TestMethod]
public void Walkthrogh_4_Fix_non_aliased()
{
// Assumes Walkthrough_3 passes
var root = RDomCSharp.Factory.GetRootFromFile(fileName);
var candidates = FindImplicitVariablesOfConcern(root);
foreach (var candidate in candidates)
{
candidate.IsImplicitlyTyped = false;// All you need
candidate.Type.DisplayAlias = true; // All you need
}
var output = RDomCSharp.Factory.BuildSyntax( root.RootClasses.First());
// For testing, force changes through secondary mechanism
var initialCode = File.ReadAllText(fileName);
var newCode = initialCode
.Replace("var ret = lastName;", "string ret = lastName;")
.Replace("var x3 = x2;", "int x3 = x2;")
.SubstringAfter("Walkthrough_1_code\r\n{\r\n")
.SubstringBeforeLast("}")
;
Assert.AreEqual(newCode, output.ToFullString());
}


Here, in addition to setting IsImplicitlyTyped to false, I set DisplayAlias to true. Normally, this is set when the code is loaded based on whether the alias is used. Since you’re changing how the type is managed, you also have to request that you want to use the alias.
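To make the effect of the flag concrete, here is a rough sketch of the decision an output step has to make. It is not how the CSharp factory is implemented; the TypeText helper, the small dictionary and the assumption that the type lives in the System namespace are all mine.

```csharp
using System;
using System.Collections.Generic;

// A few of the C# keyword aliases for .NET type names (not the full list).
var csharpAliases = new Dictionary<string, string>
{
    ["String"] = "string",
    ["Int32"] = "int",
    ["UInt32"] = "uint",
    ["Boolean"] = "bool",
};

// Hypothetical helper: choose the text to emit for a type name.
// Assumes the type is in the System namespace when no alias is used.
string TypeText(string dotNetName, bool displayAlias) =>
    displayAlias && csharpAliases.TryGetValue(dotNetName, out var alias)
        ? alias
        : "System." + dotNetName;

Console.WriteLine(TypeText("String", displayAlias: false)); // prints "System.String"
Console.WriteLine(TypeText("String", displayAlias: true));  // prints "string"
Console.WriteLine(TypeText("Int32", displayAlias: true));   // prints "int"
```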

RoslynDom Update 1.0.10-alpha

 

After reviewing the changes in 1.0.10-alpha I think trying to expand on the features in the context of the update document is not realistic. The document has some minor updates in the documents folder, and I’m archiving the update document for Updates 1.0.1-alpha to 1.0.10-alpha and starting a new document for 1.0.11-alpha. I’ll highlight the changes here in the context of goals accomplished. This has been an enormous leap and I now know where the last five or six weeks of my life have gone.

Language independence

There are three vertical slices to the overall RoslynDom.

  • The interfaces, which have the unrealistically broad goal of being somewhat platform independent and supplying feature-based access to the RoslynDom silo. I’ll discuss them separately; until then, ignore the interfaces to the extent you can, as they might give you a headache.
  • RoslynDom itself, which is a language independent representation of your source code that is designed from the perspective of .NET and the .NET Compiler Platform, Roslyn. It diverges significantly from the .NET Compiler Platform where that aided these goals:
    • Mutability
    • Layout independent format
    • Language independent format
    • Simple access
    • SameIntent support
    • Support for comments and XML documentation as first class citizens
    • Support for compiler directives as first class citizens
  • RoslynDomCSharpFactories, C# factories to load RoslynDom and recreate a .NET Compiler Platform, Roslyn, SyntaxTree. Loading and unloading is via the SyntaxTree to allow parsing and consistent structures.

Each language element, such as a method, is represented by a composed interface (IMethod), a language independent RoslynDom class (RDomMethod) and a C# specific factory (RDomMethodTypeMemberFactory). Each factory has CreateFrom… and BuildSyntax… methods. CreateFrom… methods create RoslynDom entities and BuildSyntax… methods recreate syntax elements for the SyntaxTree.

Dependency Injection

The groundwork for extensibility is in the dependency injection approach to retrieving factories. Since factories instantiate RoslynDom entities and recreate syntax, they are the key player in modifications and extensions. Since they interact to build RoslynDom and SyntaxTree entities, their retrieval is crucial to an extensibility story.

One known extensibility story is a Visual Basic factory. It seems likely that user stories for building RoslynDom trees from scratch, as an easier way to build SyntaxTrees from scratch, will want to tie into extensibility, and therefore a raw helper factory might make sense. I am hoping that more ambitious factories like VB6 can also be created.

There has not yet been any testing of extensibility and more work is required in refactoring the BuildSyntaxHelper methods.

Comments, Vertical Whitespace and XML Documentation Comments

The .NET Compiler Platform, Roslyn, places all whitespace and all comments into language trivia attached to the first token following where the trivia should appear.

Of necessity I use comments for “public annotations” (in earlier versions) which provide information for RoslynDom clients. This is entirely separate from the private annotations the .NET Compiler Platform, Roslyn provides. Also, for any use you will have, XML Documentation (also called Structured Documentation because the use of XML is compiler dependent) should be available on the language element (such as the class or method) it belongs to. Similar arguments raise directives to being first class citizens.

There are four levels where vertical whitespace and comments can logically occur: file, stem, type and code (method or property). Each of these has a MembersAll property that includes comments and vertical whitespace, as well as appropriate code elements. The Members property includes all code elements except comments and vertical whitespace.
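The relationship between the two properties can be pictured as a filter. This sketch uses plain tuples with made-up Kind strings rather than real RoslynDom types, so treat the names as placeholders:

```csharp
using System;
using System.Linq;

// Hypothetical flat view of one container's contents, in source order.
var membersAll = new (string Kind, string Text)[]
{
    ("Comment",    "// Helpers"),
    ("Method",     "Foo"),
    ("Whitespace", ""),
    ("Property",   "Bar"),
};

// Members is the same sequence without comments and vertical whitespace.
var members = membersAll
    .Where(m => m.Kind != "Comment" && m.Kind != "Whitespace")
    .ToArray();

Console.WriteLine(string.Join(", ", members.Select(m => m.Text))); // prints "Foo, Bar"
```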

Structured documentation is extracted and placed on the corresponding element. At present, you access the raw XML. Breaking this into a true structure is a lower priority because I don’t have user stories for it.

Horizontal Whitespace

Earlier versions of RoslynDom were very heavy handed in formatting. This version manages horizontal whitespace. About 25-30% of the code in the factories is now dedicated to managing horizontal whitespace. Three weeks, three redesigns, and a few tears went into this, but the current approach appears solid.

Report Hierarchy and ToString()

A ReportHierarchy method allows better information about the RoslynDom tree, particularly in the immediate window. More work will go into this, so do not take a dependency on the current structure if you don’t want your code broken.

Added Statement support

There are six main levels of elements in your code base: file, stem, types, code container, statement and expressions. Previous versions of RoslynDom supported only file, stem, types and code containers. This version supports a variety of statements.

Statements are logically nested in code blocks, particularly the code blocks of conditional (if) and looping statements.

Expressions are minimally supported – RoslynDom uses conditions and assignment expressions without breaking them down or understanding them. I’m not sure whether it ever will. I have compelling user stories for understanding statements and statement parts (see the walkthroughs for one example). I do not yet have compelling user stories for breaking down expressions. If the only user story is intelligent VB.NET/C# conversions, support may be minimal.

Added Ancestors and Descendants

You can now query the ancestors and descendants of RoslynDom trees. See the walkthroughs for an example of why you might find this interesting.

Interfaces made non-immutable (mutable)

The RoslynDom tree is mutable. This is because of my intended usage and because I think one of the things an alternative to the .NET Compiler Platform, Roslyn, should offer is mutability. I absolutely agree that the .NET Compiler Platform, Roslyn, should be immutable, since it’s a compiler structure. However, this pretty much forces a rewriter for any non-trivial changes to the SyntaxTree. I believe there will be scenarios where it’s much easier to load into RoslynDom, do interrogation and mutating in that structure, then output to a new .NET Compiler Platform, Roslyn, SyntaxTree.

In my initial vision, the interfaces were immutable (IMethod) and the RoslynDom implementation was mutable (RDomMethod). This proved impractical because of excess casting for mutations. My new vision is that if there’s a need for an immutable set of interfaces, the current set will inherit from the immutable set.
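If an immutable set is ever needed, the inheritance could look something like this. It is only a sketch of the idea: IReadOnlyMethod is a name I made up, and the IMethod and RDomMethod shown here are one-property stand-ins, not the real RoslynDom types.

```csharp
using System;

// Code holding the mutable view can assign Name with no cast,
// while code holding the read-only view can only read it.
IMethod method = new RDomMethod { Name = "Foo" };
IReadOnlyMethod view = method;
method.Name = "Bar";
Console.WriteLine(view.Name); // prints "Bar"

// Hypothetical immutable base interface (not part of RoslynDom today).
interface IReadOnlyMethod
{
    string Name { get; }
}

// The mutable interface inherits the immutable one and adds a setter.
interface IMethod : IReadOnlyMethod
{
    new string Name { get; set; }
}

// A single auto-property satisfies both interface members.
class RDomMethod : IMethod
{
    public string Name { get; set; } = "";
}
```

The `new` modifier on IMethod.Name hides the getter-only declaration, which is what lets callers mutate through IMethod without casting to the concrete class.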

As an implication, and allowing for errors, if something is not mutable in the interfaces, such as the RDomLists, it isn’t supposed to be changed.

There’s still a lot of work to do

GitHub lists known issues. The next version or two will be clean up and documentation improvements.

Following that I’ll plug the holes of the most important language features I’m not yet supporting. These include regions, lambdas and async because they are hard, and side cases like destructors and explicit operators.

I want to solidify a single file before I work across multiple files. Multiple file usage will make the underlying model much more useful and allow more interesting interrogation of non-mutated RoslynDom structures.

The biggest help I need right now is user stories, even vague ones, and failing unit tests, particularly ones that crash RoslynDom. Of course, if you’d like to help further please be in touch. If you want to fork the code, it would be lovely to see what you’re doing.

This is still a very early release. Everything is up for change.

I will try to keep the NuGet release from getting as out of date as it has been for the last month.

Refactoring Unit Tests

Llewellyn Falco and I paired on an introduction to his ApprovalTests tool. I really like that tool for evaluating objects during testing in an easy, flexible and evolutionary way. It’s a great tool, but that’s not what this post is about.

Rob Hughes (@rhughesjr) heard via Twitter that Llewellyn and I also refactored a bunch of RoslynDom tests to remove redundant code, and asked that I do a blog post about this aspect of our pairing. That’s what this post is about.

I wrote this about some later refactoring that Llewellyn inspired – so he should get all of the credit and none of the blame. I don’t think there is anything groundbreaking here. Just a detailed walkthrough of refactoring a set of unit tests, along with the logic behind the changes.

Removing redundant code from unit tests

When I write normal, non-unit test code, I think about refactoring common code from the very beginning. A lack of redundancy and flexibility/extensibility are primary forces I think about in software.

But not in unit tests. I believe that unit test creation should be a bit more like an amoeba eating up everything it can touch. A rigid shape caused by code reuse can hide a reduction in logical coverage and in LOC coverage. So, when Llewellyn and I began there were almost no helper methods in the RoslynDom tests.

RoslynDom is very simple in goal – load C# code into a language agnostic structure, allow interrogation and changes to the structure (yes, it’s mutable), and output C# code which looks like the original code. Oh, and you can ask whether two code structures have the same intent.

Because it does a few things across a mid-size number of different elements, there are a lot of very similar tests.

I believe it is best to discover where your tests are redundant by refactoring them at a later date, after you have a big pile of tests.

I do not recommend strategizing ahead of time about how to maximize code reuse in unit tests (been there, done that). It’s the only place in your code where I think the copy-paste-fix-differences cycle is OK. I’m referring only to strategizing and designing around code reuse too early.

Strategizing isolation of unit tests very early in the process is extremely helpful. I would say it is necessary, but if you have no tests, I don’t really care if you isolate your first ten tests.

So, why ever remove the redundant code?

Too often unit tests are a static pile that rots during our application development process. If we’re deeply committed, we fix all the broken tests. If we aren’t, tests are disabled or removed as the schedule requires. Regardless, unit tests generally become rotten and stinky.

If tests aren’t isolated, rotting tests may become impossible to run. In the days before mocking, I saw a team toss >1,000 tests because of a database change. But RoslynDom tests are isolated because of the nature of the problem.

Beyond maintaining the ability to run your tests, the universal problem is that rotting tests become impossible to read and understand. Your unit tests are the best view into your system. It’s why we have test names that explain why we wrote the test (RoslynDom test naming is mediocre and adequate, not brilliant).

As you map the technical changes in the rest of this post onto the tests in your own projects, think about how to increase clarity in what each test is accomplishing.

OK, already! What are the changes?

Here’s a simple test before refactoring:

[TestMethod, TestCategory(SimpleAttributeCategory)]
public void Can_get_attributes_on_class()
{
var csharpCode = @"
[Serializable]
public class MyClass
{ }
"
;
var root = RDomFactory.GetRootFromString(csharpCode);
var class1 = root.Classes.First();
var attributes = class1.Attributes;
Assert.AreEqual(1, attributes.Count());
Assert.AreEqual("Serializable", attributes.First().Name);
}


RoslynDom has almost 600 unit tests. I like test categories and use constants to avoid mistyping them.

There are eight tests nearly identical to this that change what the attribute is placed on (class, structure, method, parameter, etc), and thus different code strings. There are also variations with different numbers of attributes. So, clearly there is a lot of redundant code (about 32 tests).

The two lines that retrieve the attributes are problematic. They will be different for every test. That’s a job for our mild-mannered super-hero: the delegate!

Refactoring the test part of the code with a delegate results in this method:


private static void VerifyAttributes(string csharpCode,
Func<IRoot, IEnumerable<IAttribute>> makeAttributes,
int count, params string[] names)
{
var root = RDomCSharp.Factory.GetRootFromString(csharpCode);
var attributes = makeAttributes(root).ToArray();
Assert.AreEqual(count, attributes.Count());
for (int i = 0; i < attributes.Count(); i++)
{
Assert.AreEqual(names[i], attributes[i].Name);
}
}


The things that change between tests are the input code (csharpCode), how the attributes are retrieved (makeAttributes), the count of attributes expected (count) and the expected attribute names (names).

The test calls this method with:


[TestMethod, TestCategory(SimpleAttributeCategory)]
public void Can_get_attributes_on_class()
{
var csharpCode = @"
[Serializable]
public class MyClass
{ }
"
;
VerifyAttributes(csharpCode, root => root.Classes.First().Attributes,
1, "Serializable");
}


 

The value of this call isn’t removing five lines of code – it’s making it more clear what those five lines of code did.

This change simplified 32 tests and made them more readable.

All tests aren’t that simple


The next set of tests looked at attribute values. The initial test was:


[TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_values_on_class()
{
var csharpCode = @"
[LocalizationResources(""Fred"", ""Joe"", Cats=42)]
[Name(""KadGen-Test-Temp"")]
[SemanticLog]
public class MyClass
{ }
"
;
var root = RDomFactory.GetRootFromString(csharpCode);
var attributes = root.Classes.First().Attributes;
Assert.AreEqual(3, attributes.Count());
var first = attributes.First();
Assert.AreEqual("LocalizationResources", first.Name);
Assert.AreEqual(3, first.AttributeValues.Count());
var current = first.AttributeValues.First();
Assert.AreEqual("LocalizationResources", current.Name);
Assert.AreEqual("Fred", current.Value);
Assert.AreEqual(LiteralKind.String, current.ValueType);
current = first.AttributeValues.Skip(1).First();
Assert.AreEqual("LocalizationResources", current.Name);
Assert.AreEqual("Joe", current.Value);
Assert.AreEqual(LiteralKind.String, current.ValueType);
current = first.AttributeValues.Last();
Assert.AreEqual("Cats", current.Name);
Assert.AreEqual(42, current.Value);
Assert.AreEqual(LiteralKind.Numeric, current.ValueType);
Assert.AreEqual("Name", attributes.Skip(1).First().Name);
Assert.AreEqual("SemanticLog", attributes.Last().Name);
}


 

I doubt you can glance at that and understand what it does.

One approach would be to pass a complex data structure to the previous Verify method. I could probably have created something slightly readable with JSON, or XML literals if I was in Visual Basic. But unit tests demand a KISS (Keep it Simple Silly) approach.

If the VerifyAttributes method returns the IEnumerable of IAttribute it’s already creating, the first five lines (and a couple of others) can be replaced with:


var attributes = VerifyAttributes(csharpCode, 
root => root.Classes.First().Attributes,
3, "LocalizationResources", "Name", "SemanticLog")
.ToArray();



Making it an array simplifies accessing individual elements.
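For reference, here is roughly what a verify helper looks like once it hands back the sequence it checked. This is a self-contained sketch, not the RoslynDom test code: plain strings and a dictionary stand in for IAttribute and IRoot, VerifyNames is a made-up name, and exceptions replace the MSTest asserts so it runs without a test framework.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a loaded root: class name -> attribute names (hypothetical shape).
var root = new Dictionary<string, string[]>
{
    ["MyClass"] = new[] { "LocalizationResources", "Name", "SemanticLog" },
};

// Checks the count and the supplied names, then returns the sequence
// so the caller can keep testing the individual items.
IEnumerable<string> VerifyNames(
    Dictionary<string, string[]> source,
    Func<Dictionary<string, string[]>, IEnumerable<string>> select,
    int count, params string[] names)
{
    var actual = select(source).ToArray();
    if (actual.Length != count)
        throw new InvalidOperationException($"Expected {count} items, found {actual.Length}");
    for (var i = 0; i < Math.Min(names.Length, actual.Length); i++)
    {
        if (actual[i] != names[i])
            throw new InvalidOperationException($"Item {i}: expected {names[i]}, found {actual[i]}");
    }
    return actual;
}

var attributes = VerifyNames(root, r => r["MyClass"],
        3, "LocalizationResources", "Name", "SemanticLog")
    .ToArray();
Console.WriteLine(attributes[1]); // prints "Name"
```

The delegate keeps the one thing that differs per test (how to reach the items) at the call site, and the return value avoids a second lookup when later assertions need the same items.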

For the rest of the test, it makes sense to apply the same refactoring approach that worked on attributes. But here, there’s a name, a value, and a literal kind. Again, one approach is a complex structure, but a simpler approach is to test the count and return the IEnumerable of IAttributeValue for more testing:


private IEnumerable<IAttributeValue> VerifyAttributeValues(
IAttribute attribute, int count)
{
var attributeValues = attribute.AttributeValues;
Assert.AreEqual(count, attributeValues.Count());
return attributeValues;
}



 

An additional method simplifies the testing of individual attribute values:


private void VerifyAttributeValue(IAttributeValue attributeValue, string name, object value, LiteralKind kind)
{
Assert.AreEqual(name, attributeValue.Name);
Assert.AreEqual(value, attributeValue.Value);
Assert.AreEqual(kind, attributeValue.ValueType);
}



 

Calling these methods is a great opportunity for named parameters. Take a minute to compare the readability of this code to the same test at the start of this section (and yep, I wish I’d also used named parameters for the VerifyAttributes calls):


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_simple_attribute_values_on_property()
{
var csharpCode = @"
public class MyClass
{
[Version(2)]
[Something(3, true)]
public string foo {get; set; }
}
"
;
var attributes = VerifyAttributes(csharpCode,
root => root.Classes.First().Properties.First().Attributes,
2, "Version", "Something")
.ToArray();
var attributeValues = VerifyAttributeValues(attributes[0], count: 1)
.ToArray();
VerifyAttributeValue(attributeValues[0], name: "", value: 2, kind: LiteralKind.Numeric);
attributeValues = VerifyAttributeValues(attributes[1], count: 2)
.ToArray();
VerifyAttributeValue(attributeValues[0], name: "", value: 3, kind: LiteralKind.Numeric);
VerifyAttributeValue(attributeValues[1], name: "", value: true, kind: LiteralKind.Boolean);
}


Does that really fit every circumstance of the area you’re testing?


Rarely will there be such a large number of tests doing such trivial comparisons. In this same test file/test topic, there are also tests that pass types, instead of literals, to attributes. This appears in only three places in the file:


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_value_of_typeof_identifier_only_on_class()
{
    var csharpCode = @"
[Test(TypeTest = typeof(Foo))]
public class MyClass
{ }
";
    var attributes = VerifyAttributes(csharpCode,
            root => root.Classes.First().Attributes,
            1, "Test")
        .ToArray();
    var current = VerifyAttributeValues(attributes[0], count: 1)
        .First();
    Assert.AreEqual(LiteralKind.Type, current.ValueType);
    var refType = current.Value as RDomReferencedType;
    Assert.IsNotNull(refType);
    Assert.AreEqual("Foo", refType.Name);
}

Honestly, I might not bother refactoring this if it was in a test file that was full of variations and refactoring opportunities with more payback. But in this nice clean test file, it’s jarring.

Using a different name, rather than an overload, clarifies that something different is being checked:


private static void VerifyTypeOfAttributeValue(IAttributeValue current, string name)
{
    Assert.AreEqual(LiteralKind.Type, current.ValueType);
    var refType = current.Value as RDomReferencedType;
    Assert.IsNotNull(refType);
    Assert.AreEqual(name, refType.Name);
}

Making the call:


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_value_of_typeof_referenced_on_class()
{
    var csharpCode = @"
[Test(TypeTest = typeof(DateTime))]
public class MyClass
{ }
";
    var attributes = VerifyAttributes(csharpCode,
            root => root.Classes.First().Attributes,
            1, "Test")
        .ToArray();
    var current = VerifyAttributeValues(attributes[0], count: 1)
        .First();
    VerifyTypeOfAttributeValue(current, name: "DateTime");
}


Yes, you could make it smaller


The actual change from all these refactorings was about 130 lines of code: this test class went from 860 to 730 vertical lines. Because the same set of tests was repeated multiple times, and the C# code I’m testing is so similar across contexts, I could have reduced the code much further, maybe even to half the size.

But reducing the code size in unit tests beyond the point of maximum clarity is not helpful. The main driving forces for tests are that they be stand-alone and readable. Each test in the resulting file is stand-alone and more readable than without the refactoring. Each should be less than a screen in size, but once you reach this point, clarity trumps size.

Write clear verify tests and allow the reader to correctly assume that each verify method tests the parameters passed, and nothing else.

And then there’s change…


RoslynDom does a handful of things. One of the tricky things that is not tested by these unit tests is round-tripping of attributes, which has some tests in another part of the test suite.

While I’m curious how well the code in this test file runs, I know there are presently some low-priority issues with round-tripping attributes. Before creating the common code, it would have been a lot of bother to experiment with round-tripping to see how serious these issues might be. After the changes, I just need to add a couple of lines of code to the VerifyAttributes method.

When I actually did this, I got a lot of the expected messages. I know RoslynDom reads any kind of attribute layout, but it is opinionated (for now) about outputting them as separate attributes:

Result Message:
Assert.AreEqual failed. 
Expected:<
[Serializable, TestClass]
public interface MyInterface
{ }>. 
Actual:<
[Serializable]
[TestClass]
public interface MyInterface
{ }>.



What was unexpected was that three tests, the typeof tests, crashed on outputting the code. I get excited anytime I find and fix a problem, because it’s quite challenging to test the myriad code possibilities that RoslynDom is intended to support.

I liked this test enhancement so much I left it in. I have a rule that all tests that crash RoslynDom should result in new tests – so I had to work it into the test suite one way or another. I added a Boolean value to the VerifyAttributes method to skip the BuildSyntax test where I know it will fail just because of attribute layout.

Here I used a refactoring trick.

I added the new Boolean value as the first parameter – even though that’s a sucky place for it.

I replaced “VerifyAttributes(csharpCode,” with “VerifyAttributes(false, csharpCode,” using Replace In Files scoped to just the current document, so I could check the changes. That check was worthwhile: my search text initially had a space at the end, which missed the occurrences where I had wrapped the delegate to the next line.

Once everything built with the new parameter, I used a change-signature refactoring to put the Boolean where I wanted it, and then changed the value to true on the tests where I wanted to skip the assertion that the input and output match. I always call BuildSyntax to ensure it doesn’t crash, but I don’t expect the code to round-trip perfectly when attributes are combined (at present).

This will also make it dirt simple to find these tests if/when I decide to support round-tripping multiple layouts of attributes. I’ll just ignore and then remove the parameter.
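As a rough illustration of the flag described in this section, here is a minimal, self-contained sketch of a round-trip check that always runs the output step but only compares input and output when asked to. Everything in it is hypothetical: RoundTripCheck, Verify, and skipOutputComparison are invented names, the roundTrip delegate stands in for whatever load and BuildSyntax calls the real VerifyAttributes makes, and a plain exception stands in for Assert.AreEqual so the sketch does not depend on a test framework.

```csharp
using System;

// Hypothetical stand-in for the round-trip part of a verify helper.
// Nothing here is RoslynDom API: 'roundTrip' represents "load the code,
// then build the output again", whatever calls that takes in practice.
public static class RoundTripCheck
{
    public static string Verify(
        string inputCode,
        Func<string, string> roundTrip,
        bool skipOutputComparison)
    {
        // Always run the round trip, so a crash while building output
        // still fails the test even when the comparison is skipped.
        string outputCode = roundTrip(inputCode);

        // Compare only where the attribute layout is expected to survive.
        if (!skipOutputComparison && inputCode != outputCode)
        {
            throw new InvalidOperationException(
                "Round trip changed the code." + Environment.NewLine +
                "Expected:<" + inputCode + ">." + Environment.NewLine +
                "Actual:<" + outputCode + ">.");
        }

        return outputCode;
    }
}
```

In a real test class the throw would likely be an Assert.AreEqual(inputCode, outputCode) call. A test with combined attributes such as [Serializable, TestClass] would pass skipOutputComparison: true, and searching for that named argument later finds every test that depends on the current output layout.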

Take a look at your tests and see whether you can make some of them easier to understand with some tactically applied helper methods.