Want to Help Design C# 6.0 This Weekend?

The team is asking for your feedback. If you have a couple of minutes, answer these questions or respond to this thread.

Declaration Expressions

C# 6 is introducing a new feature called declaration expressions. It’s cool because it lets you write this code:

if (int.TryParse(s, out int i)) { … i … }

No more separate variable appearing above the conditional. Less ceremony, less ugliness, yippee!!!!!
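
For comparison, here is a minimal sketch of the pattern the feature replaces. It compiles on earlier versions of C#. The names Before and ParseOrDefault are made up for illustration.

```csharp
using System;

static class Before
{
    // Hypothetical helper: the out variable needs its own declaration line,
    // so it stays in scope for the rest of the method.
    public static int ParseOrDefault(string s, int fallback)
    {
        int i;
        if (int.TryParse(s, out i))
        {
            return i;
        }
        return fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));
        Console.WriteLine(ParseOrDefault("forty-two", -1));
    }
}
```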

The initial implementation expanded the scope of the variable to the next larger scope. This allowed you to do stuff like:

GetCoordinates(out var x, out var y);
… // use x and y;

While that might look useful, I hate it because it obscures the declaration of variables that extend across the entire scope – such as the method. This is called spill out and the team plans to remove it.

For clarity, in the if statement above, the integer i is in scope within the statement or block that is effectively the “then” clause of the conditional. Most of the enclosing statements you use (do, for, using) have a single place to declare a variable.

Yippeee!

The Issue to Resolve…


The question the team is asking for help resolving involves code like this:


if (int.TryParse(s, out int i)) Console.WriteLine("Got integer {0}", i);
else Console.WriteLine("Got no integer {0}", i);
// Do you expect "i" to be in scope in the "else" clause?


 

The way C# itself is designed (in the .NET Compiler Platform, Roslyn, trees), an else clause is contained inside the if statement. That means the above use of the variable i is legal (Q1 below).

The scenario most impacted by this decision is a series of else if statements (Q2 below):


if (int.TryParse(s, out var v)) Console.WriteLine("Got integer {0}", v);
else if (double.TryParse(s, out var v)) Console.WriteLine("Got double {0}", v);
else Console.WriteLine("Ain't got nuffink");
// Do you expect you can re-use the name "v" in both clauses?

if ((var v = o as string) != null) Console.WriteLine(v.Length);
else if ((var v = o as Exception) != null) Console.WriteLine(v.Message);
// Do you expect you can re-use the name "v" in both clauses?
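
One way to see why Q2 is in question: an else if is not a separate construct, it is just an if statement nested in the else clause of the outer if. The sketch below writes such a chain with explicit braces, using variables declared up front so it compiles on earlier versions of C#. The names Nesting and Describe are made up for illustration.

```csharp
using System;
using System.Globalization;

static class Nesting
{
    public static string Describe(string s)
    {
        int i;
        double d;
        if (int.TryParse(s, out i))
        {
            return "Got integer " + i;
        }
        else
        {
            // This inner if is what "else if" means: it lives inside the outer else.
            if (double.TryParse(s, NumberStyles.Float, CultureInfo.InvariantCulture, out d))
            {
                return "Got double " + d.ToString(CultureInfo.InvariantCulture);
            }
            else
            {
                return "Ain't got nuffink";
            }
        }
    }

    static void Main()
    {
        Console.WriteLine(Describe("42"));
        Console.WriteLine(Describe("4.5"));
        Console.WriteLine(Describe("cat"));
    }
}
```

If a variable declared in the outer condition is in scope in the else clause, it is also in scope in the nested if, which is why reusing the same name there is up for debate.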


 

And the way you manually refactor might be affected, because if (b) S1 else S2 will no longer mean precisely what if (!b) S2 else S1 means – you might have to do some variable switcheroos (Q3):


if (int.TryParse(s, out int i)) Console.WriteLine("Got integer {0}", i);
else Console.WriteLine("no integer");
// Do you expect you can negate the condition and switch around "then" and "else" clauses?

if (!int.TryParse(s, out int i)) Console.WriteLine("no integer");
else Console.WriteLine("Got integer {0}", i);


 


The Poll


I’ll hand the results to the team as a non-scientific (because you aren’t randomly chosen) poll, along with any comments you want to make – although if you want to be part of the discussion, this is the place to be. Suggested answers include: yes, no, maybe, don’t care, don’t understand.

Q1: Would you expect a variable declared in the “if” condition to be in scope and available for use in the code of the “else” clause?

Q2: Would you expect to be able to reuse the same variable in multiple “if” clauses that appear as a series of “else if” statements?

Q3: Do you expect to be able to rearrange if and else clauses with no risk of compiler errors regarding out of scope variables?

Q4: Do you think it matters whether C# and Visual Basic.NET do the same thing?

Q5: Are there other scenarios you’re worried or curious about regarding the new declaration expression features?

Video Series on C# 6.0, Visual Basic 14 and Visual Studio 14

I am really excited to be sharing a series of short videos on C# 6.0, Visual Basic 14 and Visual Studio 14. The series will be free and available at www.WintellectNOW.com.

The first video is “The New Compilers” and is an overview of the next releases.

The second video “Simplifying Classes with C# 6.0” shows how to use auto-property initialization, getter-only auto-properties and primary constructors to create classes with simple code and immutable or mutable properties.

Next week I’ll dive deeper into auto-properties and primary constructors in C#.

Visual Basic folks can watch these videos for the basic concepts in this release, and I’ll focus some upcoming videos on Visual Basic 14 features.

I’ll show Visual Studio features in context. In the first three videos, you’ll see features like Alt period refactoring and introduce local variable. Later in the series I’ll dedicate at least one, probably long, show to an overview of new features.

While you’re at www.Wintellect.com, check out some other videos. For a free trial, use the code KDOLLARD14.

RoslynDom Quick Start

This document is about an early version of RoslynDom, focusing mostly on working features, with notes on the impact of missing certain upcoming features. You can also see notes on missing features in GitHub issues.

You can find the code for these quickstarts in the RoslynDomExampleTests NuGet package.

For more information see these documents in the “Documents” folder on GitHub (Creation of these documents is currently in progress):

  • See the RoslynDom Project Layout if you are curious about why there are five projects and the dependencies these projects have on the .NET Compiler Platform (Microsoft.CodeAnalysis), the C# compiler (Microsoft.CodeAnalysis.CSharp) and Unity (Microsoft.Practices.Unity.*)
  • See the RoslynDom Design Overview for a discussion of how RoslynDom is built
  • See the RoslynDom Extensibility if you’re interested in doing more with RoslynDom
  • See the RoslynDom Roadmap.ppt for a vision of RoslynDom

What is RoslynDom

RoslynDom is an alternative view of your code.

The most efficient, best way to express your code in ASCII is your code in your favorite language.

The most efficient, best way to organize your code for your compiler is within the compiler, and exposed as part of the .NET Compiler Platform, Roslyn.

Another, ephemeral expression of your code is the one in your head. This is the one that comes out in words in your meetings, and you have entire meetings without phrases like “angle bracket.”

RoslynDom models this third expression of code in memory which has several features:

  • You can load existing code into the RoslynDom model and easily explore, navigate and analyze it. The RoslynDom model is language agnostic.

This feature is currently affected by not yet having multi-file support

  • RoslynDom is mutable. You can alter your code in a load, alter, alter, alter, build output model. Since you can easily navigate your code, finding the location for a change is easy.
  • RoslynDom entirely isolates the language dependent load process from the model itself. At a simplistic level, when the VB factory is in place, you can load from C# and output to VB and vice versa.
  • RoslynDom models can be created in any manner you desire. RoslynDom views can be created without loading any code, and brand new code can then be generated from them.

 

The basic model

Code exists in the following hierarchy

  • Root groups which are groups of files (not yet implemented)
  • Roots, which are files or a single conceptual load unit
  • Stem members – Namespaces and types that can be contained in roots
  • Type members – nested types, methods, properties, etc. that can be contained in types
  • Statements – code statements that are held primarily in methods and property accessors
  • Expressions – sub parts of statements that return values

Most major features, including most statements, are complete; see GitHub issues.

Expressions are currently handled via strings by design.

Walkthrough 1: Load and check code

Step 1: Load your code

Add a using statement for RoslynDom.CSharp.

Retrieve the singleton factory instance from the RDomCSharp.Factory property and call its GetRootFromFile method to open a specific file:

var factory = RDomCSharp.Factory;
var root = factory.GetRootFromFile(fileName);


NOTE: Other overloads support loading source code from strings or trees.

NOTE: You can iteratively work through the files in your project or solution. This approach will be hampered because specifying references and multiple syntax trees for the underlying model isn’t yet supported.

Of course you can assign the factory property to a local variable or class field if you prefer.

RDomCSharp is the code that creates the language agnostic RoslynDom tree from C# code, and that can recreate C# code from the RoslynDom tree. You can create a RoslynDom tree from scratch as well. You will later be able to load from other languages, in particular VB.NET.


Step 2: Check your code


Output your code to a string to test the output. You can do this by outputting to a new file and comparing the files:


var output = factory.BuildSyntax(root).ToString();
File.WriteAllText(outputFileName, output);


Conclusion


You now know how to load and output code from RoslynDom.

Walkthrough 2: Navigate and interrogate code


One of the major user scenarios intended for RoslynDom is to allow you to answer questions about your code. This is just a small sampling of the kinds of things you can do.

At present, RoslynDom supports structural features (classes, methods, etc) and statements. It does not support expressions because user stories with value aren’t yet clear.

Step 1: Load and check code


Load and check your code as shown in Walkthrough 1.

Step 2: Ask general questions about code


LINQ is your friend.

You’ll often find it convenient to make an array for easier sequential requests in testing.


var root = RDomCSharp.Factory.GetRootFromFile(fileName);
Assert.AreEqual(1, root.Usings.Count());
Assert.AreEqual("System", root.Usings.First().Name);
Assert.AreEqual(1, root.Namespaces.Count());
Assert.AreEqual(1, root.RootClasses.Count());


Assigning intermediate values to variables in tests can help clarity:


var methods = root.RootClasses.First().Methods.ToArray();
Assert.AreEqual(0, methods[0].Parameters.Count());
Assert.AreEqual(1, methods[1].Parameters.Count());
Assert.AreEqual("x", methods[1].Parameters.First().Name);


The difference between Classes and RootClasses is that RootClasses includes all classes under the root, regardless of namespace, while Classes holds only those directly under the root. The same applies to Interfaces, Enums and Structures.

Step 3: Place a breakpoint and query code


Place a breakpoint, run the test in debug mode and ask questions in the immediate window about the code. Sometimes you’ll have to use the Watch window because of the .NET Compiler Platform CTP behavior. Have fun!

Step 4: Ask harder questions


That might have been fun, but the real value from RoslynDom comes from asking complex questions. I’ll introduce LINQ in this walkthrough, and then show something I really wanted to accomplish in the next.

Ensure RoslynDom.Common and System.Linq are included in the using statements.

Let’s say you’re concerned about unsigned int variables in your code and want to examine their names. I don’t know why, I just had to make something up.

You can retrieve the RoslynDom entries with:


var uintVars = root
    .Descendants.OfType<IVariable>()
    .Where(x => x.Type.Name.StartsWith("UInt"))
    .Select(x => x.Name);


NOTE: Aliases are language specific, while RoslynDom entries are language agnostic, so use the .NET name of the type. The CSharp factory is responsible for straightening this out on output.


As another example, say you want all the methods and variables where unsigned ints are used:


var uintCode = (from cl in root.Descendants.OfType<IStatementContainer>()
                from v in cl.Descendants.OfType<IVariable>()
                where v.Type.Name.StartsWith("UInt")
                select new
                {
                    containerName = cl.Name,
                    variableName = v.Name
                })
               .ToArray();


Walkthrough 3: Finding questionable implicit variable typing


I have a sin when I code. I really like ignoring types. When I write code I use var everywhere. This saves me time. But, I realize it can result in code that’s less readable.

I can accept a rule that implicit variable typing should only be used on object instantiation, strings, Int32 (int), and DateTime in VB. VB isn’t yet supported.

This combination of selecting types based on the implemented interfaces and examining additional properties, like types and names, is very powerful in finding particular locations in code. I want to find all the implicitly typed variables that are not object instantiations, assignments of literal strings, or assignments of integers.

Since this is a complicated question, I’ll ask in steps, although you can certainly refactor this into a single statement if you prefer. LINQ doesn’t evaluate until requested, so the piecewise creation is not a performance issue.

Find all implicitly typed local variables:


var implicitlyTyped = root
    .Descendants.OfType<IDeclarationStatement>()
    .Where(x => x.IsImplicitlyTyped);


Find all instantiations, because they’re OK:


var instantiations = implicitlyTyped
    .Where(x => x.Initializer.ExpressionType == ExpressionType.ObjectCreation);


Find all string, integer (32 bit) and DateTime literals, because they’re OK:


var literals = implicitlyTyped
    .Where(x => x.Initializer.ExpressionType == ExpressionType.Literal &&
        (x.Type.Name == "String"
         || x.Type.Name == "Int32"
         || x.Type.Name == "DateTime") // for VB
    );


Find all the implicitly typed variables that aren’t instantiations or string/int literals:


var candidates = implicitlyTyped
    .Except(instantiations)
    .Except(literals);
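
The piecewise queries above rely on LINQ's deferred execution. Here is a minimal sketch of that behavior using plain integers instead of RoslynDom items; the names DeferredDemo and CountFilterCalls are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class DeferredDemo
{
    // Returns { filter calls before enumeration, filter calls after, result count }.
    public static int[] CountFilterCalls()
    {
        var calls = 0;
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

        // Building the queries runs nothing yet.
        var evens = numbers.Where(n => { calls++; return n % 2 == 0; });
        var small = evens.Where(n => n < 4);
        var candidates = evens.Except(small);
        var before = calls;

        // Enumerating forces the whole chain to run.
        var result = candidates.ToArray();
        return new[] { before, calls, result.Length };
    }

    static void Main()
    {
        var counts = CountFilterCalls();
        Console.WriteLine("Filter calls before enumeration: {0}", counts[0]);
        Console.WriteLine("Filter calls after enumeration:  {0}", counts[1]);
        Console.WriteLine("Items left: {0}", counts[2]);
    }
}
```

One thing to watch: each enumeration re-runs the underlying filters. In this sketch the evens query is walked twice (once directly and once through small), so call ToArray on an intermediate query if you plan to reuse its results.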


Step 6: Reporting


The code discussed here is in the ReportCodeLines method.

Once you get the information you’re interested in, you’ll probably want to output it. Obviously in reporting, you’d like file, line and column positions. RoslynDom is an abstract tree that does not directly understand text. But it can report on the code it was created from, because it holds references to key aspects of the underlying .NET Compiler Platform (Roslyn) objects. As long as you haven’t changed the tree, these aspects remain correct.

If you change the tree, the only safe way to report positions of RoslynDom elements is to recreate the underlying syntax tree, and then reload that tree into RoslynDom – generally also searching again for the elements of interest.

Because we haven’t changed the tree since loading it, this isn’t a problem.

Create new syntax from the RoslynDom item and convert it to a string:


private string GetNewCode(IDom item)
{
    return RDomCSharp.Factory.BuildSyntax(item).ToString();
}


Retrieve the original code that was used to create the RoslynDom element:


private string GetOldCode(IDom item)
{
    var node = item.RawItem as SyntaxNode;
    if (node == null)
    { return "<no syntax node>"; }
    else
    {
        return node.ToFullString();
    }
}


Retrieve the original code position:


private LinePosition GetPosition(IDom item)
{
    var node = item.RawItem as SyntaxNode;
    if (node == null)
    { return default(LinePosition); }
    else
    {
        var location = node.GetLocation();
        var linePos = location.GetLineSpan().StartLinePosition;
        return linePos;
    }
}


Retrieve the original code filename:


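private string GetFileName(IDom item)
{
    var root = item.Ancestors.OfType<IRoot>().FirstOrDefault();
    if (root != null)
    { return root.FilePath; }
    else
    {
        // No IRoot ancestor: fall back to the syntax node behind the topmost ancestor.
        // Assumes ancestors expose RawItem like any other IDom item.
        var top = item.Ancestors.LastOrDefault();
        var node = top == null ? null : top.RawItem as SyntaxNode;
        if (node == null)
        { return "<no file name>"; }
        else
        { return node.SyntaxTree.FilePath; }
    }
}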


You can use these helper methods in LINQ to create an IEnumerable of an anonymous type:


var lineItems = from x in items
                select new
                {
                    item = x,
                    fileName = GetFileName(x),
                    position = GetPosition(x),
                    code = GetNewCode(x)
                };


I’ll use a string formatting trick to make pretty columnar output. I’ll first determine the length of each segment of the string output – such as the maximum file path length. I’ll replace dummy values in a format string, such as fMax, to create a custom format string for the sizes in this data:


var filePathMax = lineItems.Max(x => x.fileName.Length);
var itemMax = lineItems.Max(
    x => x.item.ToString().Trim().Length);
var lineMax = lineItems.Max(
    x => x.position.Line.ToString().Trim().Length);
var format = "{0, -fMax}({1,lineMax},{2,3}) {3, -itemMax} {4}"
    .Replace("fMax", filePathMax.ToString())
    .Replace("itemMax", itemMax.ToString())
    .Replace("lineMax", lineMax.ToString());


I can then iterate across the IEnumerable of anonymous type:


var sb = new StringBuilder();
foreach (var line in lineItems)
{
    sb.AppendFormat(format, line.fileName,
        line.position.Line, line.position.Character,
        line.item.ToString().Trim(), line.code);
    sb.AppendLine();
}
return sb.ToString();



This results in nice output like (which would be nicer if I wasn’t wrapping):

Walkthrough_1_code.cs(13, 16) RoslynDom.RDomDeclarationStatement : ret {String} var ret = lastName;

Walkthrough_1_code.cs(51, 16) RoslynDom.RDomDeclarationStatement : x3 {Int32} var x3 = x2;
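
If you want to try the format string trick in isolation, here is a minimal sketch that doesn't need RoslynDom. The rows are hypothetical stand-ins for the report items, and ColumnDemo and BuildFormat are names I made up for this example.

```csharp
using System;
using System.Linq;
using System.Text;

static class ColumnDemo
{
    // Swap the dummy names for real widths; a negative width left-aligns the column.
    public static string BuildFormat(int fileMax, int lineMax)
    {
        return "{0,-fMax}({1,lineMax},{2,3}) {3}"
            .Replace("fMax", fileMax.ToString())
            .Replace("lineMax", lineMax.ToString());
    }

    static void Main()
    {
        // Hypothetical rows standing in for the RoslynDom report items.
        var rows = new[]
        {
            new { File = "A.cs", Line = 13, Column = 16, Code = "var ret = lastName;" },
            new { File = "LongerName.cs", Line = 151, Column = 8, Code = "var x3 = x2;" }
        };

        var format = BuildFormat(
            rows.Max(r => r.File.Length),
            rows.Max(r => r.Line.ToString().Length));

        var sb = new StringBuilder();
        foreach (var r in rows)
        {
            sb.AppendFormat(format, r.File, r.Line, r.Column, r.Code);
            sb.AppendLine();
        }
        Console.Write(sb.ToString());
    }
}
```

The dummy names only work if none of them is a substring of another, since each Replace call scans the whole format string.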

Walkthrough 4: Fixing questionable implicit variable typing


What good would it be to find issues if you couldn’t fix them? But I’m tired, so I’m going to mostly let you figure out how this code works based on what you’ve already learned.


[TestMethod]
public void Walkthrogh_4_Fix_implicit_variables_of_concern()
{
    // Assumes Walkthrough_3 passes
    var root = RDomCSharp.Factory.GetRootFromFile(fileName);
    var candidates = FindImplicitVariablesOfConcern(root);
    foreach (var candidate in candidates)
    {
        candidate.IsImplicitlyTyped = false; // All you need
    }
    var output = RDomCSharp.Factory.BuildSyntax(root.RootClasses.First());
    // For testing, force changes through secondary mechanism
    var initialCode = File.ReadAllText(fileName);
    var newCode = initialCode
        .Replace("var ret = lastName;", "System.String ret = lastName;")
        .Replace("var x3 = x2;", "System.Int32 x3 = x2;")
        .SubstringAfter("Walkthrough_1_code\r\n{\r\n")
        .SubstringBeforeLast("}");
    Assert.AreEqual(newCode, output.ToFullString());
}


The only thing that’s required is to state that these declarations should not be implicitly typed by setting IsImplicitlyTyped to false for each candidate. The rest of the code is to create a test.

But this results in the rather ugly System.String declaration. That’s jarring in a file that uses the C# aliases. That fix is in the next test:


[TestMethod]
public void Walkthrogh_4_Fix_non_aliased()
{
    // Assumes Walkthrough_3 passes
    var root = RDomCSharp.Factory.GetRootFromFile(fileName);
    var candidates = FindImplicitVariablesOfConcern(root);
    foreach (var candidate in candidates)
    {
        candidate.IsImplicitlyTyped = false; // All you need
        candidate.Type.DisplayAlias = true;  // All you need
    }
    var output = RDomCSharp.Factory.BuildSyntax(root.RootClasses.First());
    // For testing, force changes through secondary mechanism
    var initialCode = File.ReadAllText(fileName);
    var newCode = initialCode
        .Replace("var ret = lastName;", "string ret = lastName;")
        .Replace("var x3 = x2;", "int x3 = x2;")
        .SubstringAfter("Walkthrough_1_code\r\n{\r\n")
        .SubstringBeforeLast("}");
    Assert.AreEqual(newCode, output.ToFullString());
}


Here, in addition to setting IsImplicitlyTyped to false, I set DisplayAlias to true. Normally, this is set when the code is loaded based on whether the alias is used. Since you’re changing how the type is managed, you also have to request that you want to use the alias.

RoslynDom Update 1.0.10-alpha

 

After reviewing the changes in 1.0.10-alpha I think trying to expand on the features in the context of the update document is not realistic. The document has some minor updates in the documents folder, and I’m archiving the update document for Updates 1.0.1-alpha to 1.0.10-alpha and starting a new document for 1.0.11-alpha. I’ll highlight the changes here in the context of goals accomplished. This has been an enormous leap and I now know where the last five or six weeks of my life have gone.

Language independence

There are three vertical slices to the overall RoslynDom.

  • The interfaces, which I’ll discuss separately with the unrealistically broad goal of being somewhat platform independent and supplying feature based access to the RoslynDom silo. To the extent you can, ignore the interfaces until I discuss them further as they might give you a headache.
  • RoslynDom itself, which is a language independent representation of your source code that is designed from the perspective of .NET and the .NET Compiler Platform, Roslyn. There is significant divergence from the .NET Compiler Platform where it aided these goals:
    • Mutability
    • Layout independent format
    • Language independent format
    • Simple access
    • SameIntent support
    • Support for comments and XML documentation as first class citizens
    • Support for compiler directives as first class citizens
  • RoslynDomCSharpFactories, C# factories to load RoslynDom and recreate a .NET Compiler Platform, Roslyn, SyntaxTree. Loading and unloading is via the SyntaxTree to allow parsing and consistent structures.

Each language element, such as a method, is represented by a composed interface (IMethod), a language independent RoslynDom class (RDomMethod) and a C# specific factory (RDomMethodTypeMemberFactory). Each factory has CreateFrom… and BuildSyntax… methods. CreateFrom… methods create RoslynDom entities and BuildSyntax… methods recreate syntax elements for the SyntaxTree.

Dependency Injection

The groundwork for extensibility is in the dependency injection approach to retrieving factories. Since factories instantiate RoslynDom entities and recreate syntax, they are the key player in modifications and extensions. Since they interact to build RoslynDom and SyntaxTree entities, their retrieval is crucial to an extensibility story.

One known extensibility story is a Visual Basic factory. It seems likely that user stories for building RoslynDom trees from scratch, as an easier way to build SyntaxTrees from scratch, will want to tie into extensibility, and therefore a raw helper factory might make sense. I am hoping that more ambitious factories like VB6 can also be created.

There has not yet been any testing of extensibility and more work is required in refactoring the BuildSyntaxHelper methods.

Comments, Vertical Whitespace and XML Documentation Comments

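The .NET Compiler Platform, Roslyn, places all whitespace and all comments into language trivia attached to tokens. Trivia that follows a token on the same line trails that token; everything else, including a comment on its own line, is attached to the first token following where the trivia appears.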
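
To see the trivia attachment for yourself, here is a minimal sketch that parses a two-line snippet and lists the leading trivia on its first token. It assumes a current Microsoft.CodeAnalysis.CSharp NuGet package rather than the CTP discussed here, so the exact calls may differ in older builds. TriviaDemo and LeadingTriviaKinds are names made up for illustration.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

static class TriviaDemo
{
    // Lists the kinds of trivia attached in front of the first token.
    public static string[] LeadingTriviaKinds(string code)
    {
        var tree = CSharpSyntaxTree.ParseText(code);
        var firstToken = tree.GetRoot().GetFirstToken();
        return firstToken.LeadingTrivia
            .Select(t => t.Kind().ToString())
            .ToArray();
    }

    static void Main()
    {
        // The comment sits on its own line, so it rides along with the 'class' keyword.
        var code = "// About MyClass\nclass MyClass { }";
        foreach (var kind in LeadingTriviaKinds(code))
        {
            Console.WriteLine(kind);
        }
    }
}
```

This is the reason RoslynDom has to do extra work: the comment belongs to a token, not to the class it describes.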

Of necessity I use comments for “public annotations” (in earlier versions) which provide information for RoslynDom clients. This is entirely separate from the private annotations the .NET Compiler Platform, Roslyn, provides. Also, for any use you are likely to have, XML Documentation (also called Structured Documentation because the use of XML is compiler dependent) should be available on the language element (such as the class or method) it belongs to. Similar arguments make directives first class citizens.

There are four levels where vertical whitespace and comments can logically occur: file, stem, type and code (method or property). Each of these has a MembersAll property that includes comments and vertical whitespace, as well as appropriate code elements. The Members property includes all code elements except comments and vertical whitespace.

Structured documentation is extracted and placed on the corresponding element. At present, you access the raw XML; breaking this into a true structure is a lower priority because I don’t have user stories.

Horizontal Whitespace

Earlier versions of RoslynDom were very heavy handed in formatting. This version manages horizontal whitespace. About 25-30% of the code in the factories is now dedicated to managing horizontal whitespace. Three weeks, three redesigns, and a few tears went into this, but the current approach appears solid.

Report Hierarchy and ToString()

A ReportHierarchy method allows better information about the RoslynDom tree, particularly in the immediate window. More work will go into this, so do not take a dependency on the current structure you don’t want broken.

Added Statement support

There are six main levels of elements in your code base: file, stem, types, code container, statement and expressions. Previous versions of RoslynDom supported only file, stem, types and code containers. This version supports a variety of statements.

Statements are logically nested in code blocks, particularly the code blocks of conditional (if) and looping statements.

Expressions are minimally supported – RoslynDom uses conditions and assignment expressions without breaking them down or understanding them. I’m not sure whether it ever will. I have compelling user stories for understanding statements and statement parts (see the walkthroughs for one example). I do not yet have compelling user stories for breaking down expressions. If the only user story is intelligent VB.NET/C# conversions, support may be minimal.

Added Ancestors and Descendants

You can now query the ancestors and descendants of RoslynDom trees. See the walkthroughs for an example of why you might find this interesting.

Interfaces made non-immutable (mutable)

The RoslynDom tree is mutable. This is because of my intended usage and because I think one of the things an alternative to the .NET Compiler Platform, Roslyn, should offer is mutability. I absolutely agree that the .NET Compiler Platform, Roslyn, should be immutable – it’s a compiler structure. However, this pretty much forces a rewriter for any non-trivial changes to the SyntaxTree. I believe there will be scenarios where it’s much easier to load into RoslynDom, do interrogation and mutating in that structure, then output to a new .NET Compiler Platform, Roslyn, SyntaxTree.

In my initial vision, the interfaces were immutable (IMethod) and the RoslynDom implementation was mutable (RDomMethod). This proved impractical because of excess casting for mutations. My new vision is that if there’s a need for an immutable set of interfaces, the current set will inherit from the immutable set.

As an implication, and allowing for errors, if something is not mutable in the interfaces, such as the RDomLists, it isn’t supposed to be changed.
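
As a rough sketch of what "the current set will inherit from the immutable set" could look like, here is one possible shape. Nothing like this exists in RoslynDom today; IReadOnlyMethod, IMutableMethod and MethodSketch are hypothetical names used only for this example.

```csharp
using System;

// Hypothetical read-only view of a method.
interface IReadOnlyMethod
{
    string Name { get; }
}

// The mutable interface extends the read-only one and adds a setter.
interface IMutableMethod : IReadOnlyMethod
{
    new string Name { get; set; }
}

// One auto-property satisfies both interfaces.
class MethodSketch : IMutableMethod
{
    public string Name { get; set; }
}

static class InterfaceDemo
{
    public static string RenameAndRead(string original, string renamed)
    {
        var method = new MethodSketch { Name = original };

        IMutableMethod mutable = method;
        mutable.Name = renamed; // no cast to the concrete class needed

        IReadOnlyMethod readOnly = method;
        return readOnly.Name;
    }

    static void Main()
    {
        Console.WriteLine(RenameAndRead("Foo", "Bar"));
    }
}
```

Code that should only look can take the read-only interface, while code that mutates takes the mutable one, and neither needs to cast.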

There’s still a lot of work to do

GitHub lists known issues. The next version or two will be clean up and documentation improvements.

Following that I’ll plug the holes of the most important language features I’m not yet supporting. These include regions, lambdas and async because they are hard, and side cases like destructors and explicit operators.

I want to solidify a single file before I work across multiple files. Multiple file usage will make the underlying model much more useful and allow more interesting interrogation of non-mutated RoslynDom structures.

The biggest help I need right now is user stories, even vague ones, and failing unit tests – particularly if they crash RoslynDom. Of course, if you’d like to help further please be in touch. If you want to fork the code, it would be lovely to see what you’re doing.

This is still a very early release. Everything is up for change.

I will try to keep the NuGet release from getting as out of date as it has been for the last month.

Refactoring Unit Tests

Llewellyn Falco and I paired on an introduction to his ApprovalTests tool. I really like that tool for evaluating objects during testing in an easy, flexible and evolutionary way. It’s a great tool, but that’s not what this post is about.

Rob Hughes (@rhughesjr) heard via Twitter that Llewellyn and I also refactored a bunch of RoslynDom tests to remove redundant code, and asked that I do a blog post about this aspect of our pairing. That’s what this post is about.

I wrote this about some later refactoring that Llewellyn inspired – so he should get all of the credit and none of the blame. I don’t think there is anything groundbreaking here. Just a detailed walkthrough of refactoring a set of unit tests, along with the logic behind the changes.

Removing redundant code from unit tests

When I write normal, non-unit test code, I think about refactoring common code from the very beginning. A lack of redundancy and flexibility/extensibility are primary forces I think about in software.

But not in unit tests. I believe that unit test creation should be a bit more like an amoeba eating up everything it can touch. A rigid shape caused by code reuse can hide a reduction in logical coverage and in LOC coverage. So, when Llewellyn and I began there were almost no helper methods in the RoslynDom tests.

RoslynDom is very simple in goal – load C# code into a language agnostic structure, allow interrogation and changes to the structure (yes, it’s mutable), and output C# code which looks like the original code. Oh, and you can ask whether two code structures have the same intent.

Because it does a few things across a mid-size number of different elements, there are a lot of very similar tests.

I believe it is best to discover where your tests are redundant, by refactoring them at a later date, and after you have a big pile of tests.

I do not recommend strategizing ahead of time about how to maximize code reuse in unit tests (been there, done that). It’s the only place in your code where I think the copy-paste-fix-differences cycle is OK. I’m referring only to strategizing and designing around code reuse too early.

Strategizing isolation of unit tests very early in the process is extremely helpful. I would say it is necessary, but if you have no tests, I don’t really care if you isolate your first ten tests.

So, why ever remove the redundant code?

Too often unit tests are a static pile that rots during our application development process. If we’re deeply committed, we fix all the broken tests. If we aren’t, tests are disabled or removed as the schedule requires. Regardless, unit tests generally become rotten and stinky.

If tests aren’t isolated, rotting tests may become impossible to run. In the days before mocking, I saw a team toss >1,000 tests because of a database change. But RoslynDom tests are isolated because of the nature of the problem.

Beyond maintaining the ability to run your tests, the universal problem is that rotting tests become impossible to read and understand. Your unit tests are the best view into your system. It’s why we have test names that explain why we wrote the test (RoslynDom test naming is mediocre and adequate, not brilliant).

As you project the technical changes in the rest of this post onto the tests in your own projects, think about how to increase clarity in what each test is accomplishing.

OK, already! What are the changes?

Here’s a simple test before refactoring:

[TestMethod, TestCategory(SimpleAttributeCategory)]
public void Can_get_attributes_on_class()
{
    var csharpCode = @"
        [Serializable]
        public class MyClass
        { }
        ";
    var root = RDomFactory.GetRootFromString(csharpCode);
    var class1 = root.Classes.First();
    var attributes = class1.Attributes;
    Assert.AreEqual(1, attributes.Count());
    Assert.AreEqual("Serializable", attributes.First().Name);
}


RoslynDom has almost 600 unit tests. I like test categories and use constants to avoid mistyping them.

There are eight tests nearly identical to this that change what the attribute is placed on (class, structure, method, parameter, etc), and thus different code strings. There are also variations with different numbers of attributes. So, clearly there is a lot of redundant code (about 32 tests).

The two lines that retrieve the attributes are problematic. They will be different for every test. That’s a job for our mild-mannered super-hero: the delegate!

Refactoring the test part of the code with a delegate results in this method:


private static void VerifyAttributes(string csharpCode,
    Func<IRoot, IEnumerable<IAttribute>> makeAttributes,
    int count, params string[] names)
{
    var root = RDomCSharp.Factory.GetRootFromString(csharpCode);
    var attributes = makeAttributes(root).ToArray();
    Assert.AreEqual(count, attributes.Count());
    for (int i = 0; i < attributes.Count(); i++)
    {
        Assert.AreEqual(names[i], attributes[i].Name);
    }
}


The things that change between tests are the input code (csharpCode), how the attributes are retrieved (makeAttributes), the count of attributes expected (count) and the expected attribute names (names).

The test calls this method with:


[TestMethod, TestCategory(SimpleAttributeCategory)]
public void Can_get_attributes_on_class()
{
    var csharpCode = @"
        [Serializable]
        public class MyClass
        { }
        ";
    VerifyAttributes(csharpCode, root => root.Classes.First().Attributes,
        1, "Serializable");
}

The value of this call isn’t removing five lines of code – it’s making it more clear what those five lines of code did.

This change simplified 32 tests and made them more readable.

All tests aren’t that simple


The next set of tests looked at attribute values. The initial test was:


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_values_on_class()
{
    var csharpCode = @"
        [LocalizationResources(""Fred"", ""Joe"", Cats=42)]
        [Name(""KadGen-Test-Temp"")]
        [SemanticLog]
        public class MyClass
        { }
        ";
    var root = RDomFactory.GetRootFromString(csharpCode);
    var attributes = root.Classes.First().Attributes;
    Assert.AreEqual(3, attributes.Count());
    var first = attributes.First();
    Assert.AreEqual("LocalizationResources", first.Name);
    Assert.AreEqual(3, first.AttributeValues.Count());
    var current = first.AttributeValues.First();
    Assert.AreEqual("LocalizationResources", current.Name);
    Assert.AreEqual("Fred", current.Value);
    Assert.AreEqual(LiteralKind.String, current.ValueType);
    current = first.AttributeValues.Skip(1).First();
    Assert.AreEqual("LocalizationResources", current.Name);
    Assert.AreEqual("Joe", current.Value);
    Assert.AreEqual(LiteralKind.String, current.ValueType);
    current = first.AttributeValues.Last();
    Assert.AreEqual("Cats", current.Name);
    Assert.AreEqual(42, current.Value);
    Assert.AreEqual(LiteralKind.Numeric, current.ValueType);
    Assert.AreEqual("Name", attributes.Skip(1).First().Name);
    Assert.AreEqual("SemanticLog", attributes.Last().Name);
}

I doubt you can glance at that and understand what it does.

One approach would be to pass a complex data structure to the previous Verify method. I could probably have created something slightly readable with JSON, or XML literals if I was in Visual Basic. But unit tests demand a KISS (Keep it Simple Silly) approach.

If the VerifyAttributes method returns the IEnumerable of IAttribute it’s already creating, the first five lines (and a couple of others) can be replaced with:


var attributes = VerifyAttributes(csharpCode, 
root => root.Classes.First().Attributes,
3, "LocalizationResources", "Name", "SemanticLog")
.ToArray();



Making it an array simplifies accessing individual elements.

For the rest of the test, it makes sense to apply the same refactoring approach that worked on attributes. But here, there’s a name, a value, and a literal kind. Again, one approach is a complex structure, but a simpler approach is to test the count and return the IEnumerable of IAttributeValue for more testing:


private IEnumerable<IAttributeValue> VerifyAttributeValues(
    IAttribute attribute, int count)
{
    var attributeValues = attribute.AttributeValues;
    Assert.AreEqual(count, attributeValues.Count());
    return attributeValues;
}

An additional method simplifies the testing of individual attribute values:


private void VerifyAttributeValue(IAttributeValue attributeValue, string name, object value, LiteralKind kind)
{
    Assert.AreEqual(name, attributeValue.Name);
    Assert.AreEqual(value, attributeValue.Value);
    Assert.AreEqual(kind, attributeValue.ValueType);
}

Calling these methods is a great opportunity for named parameters. Take a minute to compare the readability of this code to the same test at the start of this section (and yep, I wish I’d also used named parameters for the VerifyAttributes calls):


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_simple_attribute_values_on_property()
{
    var csharpCode = @"
        public class MyClass
        {
            [Version(2)]
            [Something(3, true)]
            public string foo {get; set; }
        }
        ";
    var attributes = VerifyAttributes(csharpCode,
            root => root.Classes.First().Properties.First().Attributes,
            2, "Version", "Something")
        .ToArray();
    var attributeValues = VerifyAttributeValues(attributes[0], count: 1)
        .ToArray();
    VerifyAttributeValue(attributeValues[0], name: "", value: 2, kind: LiteralKind.Numeric);
    attributeValues = VerifyAttributeValues(attributes[1], count: 2)
        .ToArray();
    VerifyAttributeValue(attributeValues[0], name: "", value: 3, kind: LiteralKind.Numeric);
    VerifyAttributeValue(attributeValues[1], name: "", value: true, kind: LiteralKind.Boolean);
}


Does that really fit every circumstance of the area you’re testing?


Rarely will there be such a large number of tests doing such trivial comparisons. In this same test file/test topic, there are also tests of passing types, instead of literals, to attributes. This only appears in three places in the file:


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_value_of_typeof_identifier_only_on_class()
{
    var csharpCode = @"
        [Test(TypeTest = typeof(Foo))]
        public class MyClass
        { }
        ";
    var attributes = VerifyAttributes(csharpCode,
            root => root.Classes.First().Attributes,
            1, "Test")
        .ToArray();
    var current = VerifyAttributeValues(attributes[0], count: 1)
        .First();
    Assert.AreEqual(LiteralKind.Type, current.ValueType);
    var refType = current.Value as RDomReferencedType;
    Assert.IsNotNull(refType);
    Assert.AreEqual("Foo", refType.Name);
}

Honestly, I might not bother refactoring this if it was in a test file that was full of variations and refactoring opportunities with more payback. But in this nice clean test file, it’s jarring.

Using a different name, rather than an overload, clarifies that something different is being checked:


private static void VerifyTypeOfAttributeValue(IAttributeValue current, string name)
{
    Assert.AreEqual(LiteralKind.Type, current.ValueType);
    var refType = current.Value as RDomReferencedType;
    Assert.IsNotNull(refType);
    Assert.AreEqual(name, refType.Name);
}

Making the call:


[TestMethod, TestCategory(AttributeValuesCategory)]
public void Can_get_attribute_value_of_typeof_referenced_on_class()
{
    var csharpCode = @"
        [Test(TypeTest = typeof(DateTime))]
        public class MyClass
        { }
        ";
    var attributes = VerifyAttributes(csharpCode,
            root => root.Classes.First().Attributes,
            1, "Test")
        .ToArray();
    var current = VerifyAttributeValues(attributes[0], count: 1)
        .First();
    VerifyTypeOfAttributeValue(current, name: "DateTime");
}


Yes, you could make it smaller


The actual change with all these refactorings was about 130 lines of code, 860 to 730 vertical lines in this test class. Because the same set of tests were repeated multiple times, and the C# code I’m testing is so similar for different contexts, I could have reduced the code much further, maybe even to half the size.

But reducing the code size in unit tests beyond the point of maximum clarity is not helpful. The main driving forces for tests are that they be stand-alone and readable. Each test in the resulting file is stand-alone and more readable than without the refactoring. Each should be less than a screen in size, but once you reach this point, clarity trumps size.

Write clear verify methods and allow the reader to correctly assume that each verify method tests the parameters passed, and nothing else.

And then there’s change…


RoslynDom does a handful of things. One of the tricky things that is not tested by these unit tests is round-tripping of attributes, which has some tests in another part of the test suite.

While I’m curious how well the code in this test file runs, I know there are presently some low-priority issues round-tripping attributes. Before creating the common code, it would have been a lot of bother to experiment with round-tripping to see how serious these issues might be. After the changes, I just need to add a couple of lines of code to the VerifyAttributes method.

When I actually did this, I got a lot of the expected messages. I know RoslynDom reads any kind of attribute layout, but it is opinionated (for now) about outputting them as separate attributes:

Result Message:
Assert.AreEqual failed. 
Expected:<
[Serializable, TestClass]
public interface MyInterface
{ }>. 
Actual:<
[Serializable]
[TestClass]
public interface MyInterface
{ }>.



What was unexpected was that three tests – those typeof tests – crashed on outputting the code. I get excited anytime I find and fix a problem, because it’s quite challenging to test the myriad code possibilities that RoslynDom is intended to support.

I liked this test enhancement so much I left it in. I have a rule that all tests that crash RoslynDom should result in new tests – so I had to work it into the test suite one way or another. I added a Boolean value to the VerifyAttributes method to skip the BuildSyntax test where I know it will fail just because of attribute layout.

Here I used a refactoring trick.

I added the new Boolean value as the first parameter – even though that’s a sucky place for it.

I did a replace of “VerifyAttributes(csharpCode,” with “VerifyAttributes(false, csharpCode,” with a Replace In Files for just the current document so I could check the changes. That was good because I initially had a space at the end, which missed occurrences where I wrapped the delegate to the next line.

Once everything built with the new parameter, I refactored with a signature change to put the Boolean where I wanted it, and then changed the Boolean value to true on the tests where I wanted to skip the assertion that the input and output match. I always call BuildSyntax to ensure it doesn’t crash, but I don’t expect to roundtrip the code perfectly when attributes are combined (at present).

This will also make it dirt simple to find these tests if/when I decide to support round-tripping multiple layouts of attributes. I’ll just ignore and then remove the parameter.
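
The shape of that Boolean switch can be sketched without RoslynDom. This is a minimal illustration, not the actual VerifyAttributes code: RoundTripCheck, Verify, and the rebuild delegate are hypothetical names, and rebuild stands in for whatever regenerates source text (BuildSyntax in the post). The rebuild step always runs, so a crash still fails the test, and only the exact-match comparison is optional.

```csharp
using System;

public static class RoundTripCheck
{
    // Hypothetical helper: 'rebuild' is a stand-in for the code that
    // regenerates source text from the parsed model.
    public static string Verify(string input, Func<string, string> rebuild, bool skipExactMatch)
    {
        // Always run the rebuild so that a crash fails the test.
        var output = rebuild(input);

        // Compare input and output only when the caller expects a perfect round trip.
        if (!skipExactMatch && output != input)
        {
            throw new InvalidOperationException("Round trip changed the code:\r\n" + output);
        }
        return output;
    }

    public static void Main()
    {
        // Stand-in rebuild that always writes combined attributes as separate ones.
        Func<string, string> splitAttributes = code => code.Replace("[A, B]", "[A]\r\n[B]");

        // Separate attributes survive the round trip, so the strict check passes.
        Console.WriteLine(Verify("[A]\r\nclass C { }", splitAttributes, skipExactMatch: false));

        // Combined attributes are rewritten, so this call opts out of the comparison.
        Console.WriteLine(Verify("[A, B]\r\nclass C { }", splitAttributes, skipExactMatch: true));
    }
}
```

Searching for calls that pass true then lists every test that is known not to round-trip exactly.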

Take a look at your tests and see whether you can make some of them easier to understand with some tactically applied helper methods.

What I learned about coding against the .NET Compiler Framework this week…

July 24, 2014

I don’t know if I’ll do this every week, but this week I hit two spots of the .NET Compiler Platform API quicksand. I did not get out of either alone, so wanted to share what I learned.

ToFullString()

I struggled fantastically with creating code for XML documentation. Run the Roslyn quoter against a simple comment and you’ll get the gist of it.

For my work with RoslynDom I need to go both ways after modifying the documentation:

- Code -> Compiler API (syntax tree) -> RoslynDom items

- RoslynDom items -> Compiler API (syntax tree) -> to code


The first works great. Grab the symbol and you can grab the documentation:

Symbol.GetDocumentationCommentXml()

This gives you the XML as a string. Just load it as an XDocument and run as much LINQ to XML as you like. All is good.
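
As a rough sketch of that step, the snippet below loads a documentation string with XDocument and queries it with LINQ to XML. The sample XML is made up for illustration, and its shape (a member element wrapping summary and param) is an assumption, so check what GetDocumentationCommentXml() actually returns for your symbols. DocCommentReader and its methods are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class DocCommentReader
{
    // Returns the trimmed text of the first <summary> element, or "" if there is none.
    public static string GetSummary(string documentationXml)
    {
        var doc = XDocument.Parse(documentationXml);
        var summary = doc.Descendants("summary").FirstOrDefault();
        return summary == null ? "" : summary.Value.Trim();
    }

    // Returns the name attribute of every <param> element, in document order.
    public static string[] GetParamNames(string documentationXml)
    {
        return XDocument.Parse(documentationXml)
            .Descendants("param")
            .Select(p => (string)p.Attribute("name"))
            .ToArray();
    }

    public static void Main()
    {
        // Illustrative input only; the real string comes from the symbol.
        var xml = "<member name=\"M:Sample.AddOne(System.Int32)\">"
                + "<summary> Adds one. </summary>"
                + "<param name=\"value\">The input.</param>"
                + "</member>";

        Console.WriteLine(GetSummary(xml));
        Console.WriteLine(string.Join(", ", GetParamNames(xml)));
    }
}
```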

But then… I needed to recreate the syntax tree. I really, really felt I should be able to build it up from primitives. After a few hours banging my head against that wall, I had to accept the core rule of …

The .NET Compiler Platform is a compiler; what it does really, really well is parse code.

So, even though it made me feel dirty, I wrote out the XML to a string, split the string into lines, iterated over the lines inserting the three slashes, and asked the SyntaxFactory to parse it. If you’re struggling to build something, see if you can parse into what you need.
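
The string-handling half of that workaround might look like the sketch below. DocCommentWriter and ToTripleSlashComment are hypothetical names, and the space after the three slashes is my own choice. The post does not say which parse method was used afterwards; SyntaxFactory.ParseLeadingTrivia is one candidate that accepts text like this.

```csharp
using System;
using System.Linq;

public static class DocCommentWriter
{
    // Turns an XML fragment into "///"-prefixed comment lines.
    // Every line, including the last, ends with \r\n.
    public static string ToTripleSlashComment(string xml)
    {
        // Accept either Windows or Unix line endings in the input.
        var lines = xml.Trim().Split(new[] { "\r\n", "\n" }, StringSplitOptions.None);
        return string.Join("\r\n", lines.Select(line => "/// " + line)) + "\r\n";
    }

    public static void Main()
    {
        var xml = "<summary>\nAdds one.\n</summary>";
        Console.Write(ToTripleSlashComment(xml));
    }
}
```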

In this particular case, it failed. I mean I had the output and it looked good, but the first three slashes were missing and the end of line at the end was missing. Specifically, I mean when I wrote it out in the immediate window these were missing. Crap.

Happily I have friends. Anthony D Green (ADG) on the team pointed out that I wasn’t using ToFullString(). At various points in working with the API, ToString() may do surprising things – working too hard or just getting nuts on your behalf. Perhaps someone somewhere needs the stripped version.

If you’re looking at a string output from the API, check it also with ToFullString().
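
The difference is easy to see on an ordinary statement. This sketch assumes a reference to the Microsoft.CodeAnalysis.CSharp package; the statement text is arbitrary. ToString() leaves out the leading trivia of the first token and the trailing trivia of the last token, while ToFullString() keeps both.

```csharp
using System;
using Microsoft.CodeAnalysis.CSharp;

public static class FullStringDemo
{
    public static void Main()
    {
        // Leading whitespace, a trailing comment, and a line ending are all trivia.
        var statement = SyntaxFactory.ParseStatement("    int answer = 42; // the answer\r\n");

        // Without surrounding trivia: "int answer = 42;"
        Console.WriteLine("[" + statement.ToString() + "]");

        // With surrounding trivia: the original text, indentation and comment included.
        Console.WriteLine("[" + statement.ToFullString() + "]");
    }
}
```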

The Formatter is picky, and EndOfLineTrivia requires \r\n

The .NET Compiler Platform is designed, and massively tested, with code that can happen in the real world from its own parsing. When you build trees, there is a large number of ways you can mess up that could never happen through parsing. I’d say infinite, but my son is an astrophysicist and doesn’t let me say things are infinite.

In my case, I naively thought that EndOfLineTrivia would understand that it was supposed to, well, you know, output an end of line. I did not anticipate that I would also need to pass a \r\n. I also did not anticipate that it would silently create an object that would later cause an exception – deep in the heart of the Formatter API. This time Balaji Soundrarajan did a little telepathic debugging and guessed that I’d failed to include \r\n. Thanks to him and all the folks that took a look at that one!
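
A minimal sketch of building the end of line with explicit text, again assuming a reference to the Microsoft.CodeAnalysis.CSharp package. EndOfLineDemo and WithLineEnding are hypothetical names. SyntaxFactory.CarriageReturnLineFeed is a ready-made trivia with the same text, if you prefer not to pass the string yourself.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

public static class EndOfLineDemo
{
    // Parses one statement and ends it with an explicit \r\n end-of-line trivia.
    public static string WithLineEnding(string statementText)
    {
        // The factory does not supply the line-ending text; it is passed in.
        SyntaxTrivia endOfLine = SyntaxFactory.EndOfLine("\r\n");

        var statement = SyntaxFactory.ParseStatement(statementText)
            .WithTrailingTrivia(endOfLine);

        return statement.ToFullString();
    }

    public static void Main()
    {
        Console.Write(WithLineEnding("int x = 1;"));
    }
}
```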

RoslynDom: Structural Interrogation Walk-throughs

The goal of RoslynDom is to present information about your code in the way you think about your code.

A note on VB: I’m building out the C# version first, but I know VB very well and am designing to support later VB creation. If something is at odds with good C# support, I’ll cross that bridge when I get there.

You can get RoslynDom on NuGet via the Package Manager in Visual Studio and here on GitHub. Keep in mind that it is an early experimental release.

RoslynDom celebrates the awesome .NET Compiler Platform, but also respects that the .NET Compiler Platform is built as a compiler, and you are not a compiler.

Introduction

I started from the outside, the highest level of code in a single file, and am working inward – beginning with the structure, then statements and eventually expressions. Support for multiple files is coming – but not until I’ve completed work on statements.

By structural, I mean artifacts that organize your code – namespaces, classes, structures, etc. This post shows how to use RoslynDom to query code. Changing code is a different post. You can rather easily change the RoslynDom, but outputting a new tree with your changes is currently buggy. In the meantime, most RoslynDom items expose the SyntaxNode they were created from, and where practical the corresponding ISymbol (SyntaxNode and ISymbol are part of the .NET Compiler Platform). You can use RoslynDom to get to the right location in your code, and then use .NET Compiler Platform techniques.

You can find more about the scenarios I wrote RoslynDom to support here. If you have a tool idea and want me to make RoslynDom friendly to what you’re doing, let’s talk.

This post has walk-throughs of how you can use RoslynDom today. RoslynDom is a library to build tools from – it is not itself a tool. One tool that has been built on top of it is Jim Christopher’s RoslynDom-Provider.

Retrieving Namespaces

A namespace is a logical container. It’s orthogonal to the structure of your running application and tools like ObjectBrowser offer alternate physical (assembly/module) and logical (namespace) trees.

RoslynDom sets out to give access to your code the way you think about it, and you might think about it differently at different times. Both of these statements are true:

  • A namespace is a dot delimited string attached to a class or other type to give it a more complete and hopefully unique name (in->out)
  • A namespace is an identifier that you put at the top of a file that groups the contained code with related code in different files (out->in)

A namespace can be nested – the namespace System contains the namespace System.Diagnostics.

The nesting of namespaces in code is entirely arbitrary – these code fragments are logically identical:

namespace RoslynDom
{
   namespace Common.Test
   {
      public class Foo { }
   }
}

namespace RoslynDom.Common.Test
{
   public class Foo { }
}
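
One way to convince yourself of this is to compile both forms and ask reflection for the namespace. The sketch below uses only the base class library; Bar and NamespaceNestingDemo are names made up for the example, with Bar standing in for the second fragment so that both types can live in one file.

```csharp
using System;

namespace RoslynDom
{
    namespace Common.Test
    {
        public class Foo { }
    }
}

namespace RoslynDom.Common.Test
{
    // Same logical namespace as Foo, declared with a single statement.
    public class Bar { }
}

public static class NamespaceNestingDemo
{
    public static void Main()
    {
        // Both lines print RoslynDom.Common.Test
        Console.WriteLine(typeof(RoslynDom.Common.Test.Foo).Namespace);
        Console.WriteLine(typeof(RoslynDom.Common.Test.Bar).Namespace);
    }
}
```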

The .NET Compiler Platform manages namespaces differently in the two trees. RoslynDom is committed to expressing code the way you think of it, and you probably don’t think of your code in terms of different access mechanisms, each good for different things.

To access namespace information in RoslynDom, you first load your code. You can do this from a file, a source code string, a project document, or a SyntaxTree. For example:

IRoot root = RDomFactory.GetRootFromFile(@"..\..\TestFile.cs");



There are three properties regarding namespaces, one of which is still evolving (is the fully expanded view ever valuable to a person?):

var nspaces1 = root.Namespaces;
var nspaces3 = root.NonemptyNamespaces;



These provide your namespaces as you wrote them and where the namespace is actually in use in this root. The following code would have two members in the Namespaces property and one member in the NonemptyNamespaces property:

namespace RoslynDom
{
   namespace Common.Test
   {
      public class Foo { }
   }
}



If you use the NonemptyNamespaces property, the behavior will not change if someone refactors this code to use a single namespace statement or three.

Now that you’ve seen RoslynDom in action, you may be able to predict how to retrieve using statements:

var usings = root.Usings;



Since namespaces and using directives can appear within other namespaces, both of these properties also appear on the RoslynDom INamespace interface.

Retrieving Classes


The next step down the structural hierarchy is classes, structures and other types. These may appear at the root or in a namespace. You probably have a single namespace in your file and probably do not perceive your file as a nested structure of namespace(s) containing types. RoslynDom supports both approaches:

var nspace = root.Namespaces.First();
var namespaceClasses = nspace.Classes;
var rootClasses = root.RootClasses;



Again, I think the second will generally be more useful. Each class includes a property containing its namespace/fully qualified name.

Retrieving Methods


RoslynDom reflects the four fundamental levels of code in .NET:

  • Root attachable, which I call “stem members:” root, namespaces, using directives, classes, interfaces, structures, enums, and (in the future) delegates
  • Type attachable, or type members: methods, properties, fields, (soon) enum values and (soon) events (constructors are currently a special case of a method, but waiting for a final understanding of primary constructors)
  • Statements attachable to methods and property accessors
  • Expressions that can be attached to statements and to fields as initializers (and now properties)

Remembering how you access namespaces, you can probably predict the code to access type members in RoslynDom:

IRoot root = RDomFactory.GetRootFromFile(@"..\..\TestFile.cs");
var class1 = root.Namespaces.Last().Classes.First();
var methods = class1.Methods;
var fields = class1.Fields;
var properties = class1.Properties;

Methods can have parameters:

var method = class1.Methods.First();
var parameters = method.Parameters;



Wow, that was easy!

Retrieving information about items


Here’s some code:

namespace Namespace2
{
   public class FooClass
   {
      public string FooMethod(int bar1, string bar2)
      { }
      public string FooProperty { get; set; }
   }
}



Let’s say you want the name and type of a parameter:

var parameters = method.Parameters.ToArray();
Assert.AreEqual("bar1", parameters[0].Name);
Assert.AreEqual("Int32", parameters[0].Type.Name);



Except in the case of CLR types, you may not have access to the Reflection runtime type. To avoid taking a dependency on the .NET Compiler Platform, RoslynDom has its own type class. Alas, that’s another post.

Here is the set of features RoslynDom makes available for methods and parameters:

var method = class1.Methods.First();
var parameters = method.Parameters.ToArray();
Assert.AreEqual(2, parameters.Count());
Assert.AreEqual("FooMethod", method.Name);
Assert.AreEqual("String", method.ReturnType.Name);
Assert.AreEqual(AccessModifier.Public, method.AccessModifier );
Assert.IsFalse(method.IsAbstract);
Assert.IsFalse(method.IsExtensionMethod);
Assert.IsFalse(method.IsOverride);
Assert.IsFalse(method.IsSealed);
Assert.IsFalse(method.IsStatic);
Assert.IsFalse(method.IsVirtual);
Assert.AreEqual("bar1", parameters[0].Name);
Assert.AreEqual("Int32", parameters[0].Type.Name);
Assert.AreEqual(0, parameters[0].Ordinal);
Assert.AreEqual("bar2", parameters[1].Name);
Assert.AreEqual("String", parameters[1].Type.Name);
Assert.AreEqual(1, parameters[1].Ordinal);
Assert.IsFalse(parameters[1].IsOptional);
Assert.IsFalse(parameters[1].IsOut);
Assert.IsFalse(parameters[1].IsParamArray);




Attributes


Attributes are another area where the .NET Compiler Platform syntax tree keeps track of arbitrary differences in how code is written – differences that you don’t think about when reading code. These two fragments of code have the same intent.

[SomeAttr, SomeAttr2]
struct Foo<T>
{ }

[SomeAttr]
[SomeAttr2]
struct Foo<T>
{ }



And that’s without even considering the optional parentheses on the attributes.
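
Reflection agrees that the layout carries no meaning. In this sketch, which uses only the base class library, the two attribute classes and the struct names Combined and Separate are made up so that both layouts can sit in one file. The names are sorted before comparing because reflection does not promise an order.

```csharp
using System;
using System.Linq;

[AttributeUsage(AttributeTargets.Struct)]
public sealed class SomeAttrAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Struct)]
public sealed class SomeAttr2Attribute : Attribute { }

// Both attributes in one bracketed list.
[SomeAttr, SomeAttr2]
public struct Combined<T> { }

// The same attributes in separate lists.
[SomeAttr]
[SomeAttr2]
public struct Separate<T> { }

public static class AttributeLayoutDemo
{
    // Returns the sorted type names of the attributes declared on a type.
    public static string[] AttributeNames(Type type)
    {
        return type.GetCustomAttributes(false)
            .Select(a => a.GetType().Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        var combined = AttributeNames(typeof(Combined<>));
        var separate = AttributeNames(typeof(Separate<>));

        // Prints True: the compiled types carry the same attributes.
        Console.WriteLine(combined.SequenceEqual(separate));
    }
}
```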

RoslynDom collapses these differences and has an Attributes property on every item that allows it in .NET:

var attributes = class1.Attributes;



Attributes have names. One way to use them is with LINQ expressions. This retrieves any attribute that is on a class and has a particular name:

var classAttributes = from x in root.RootClasses
                      from a in x.Attributes
                      where a.Name == name
                      select a;



A similar LINQ expression could return the matching, or non-matching classes.

Attributes may have values, which would be the parameters to the attributes. Since RoslynDom does not yet support multiple files, the attributes aren’t fully resolved and positional arguments are currently problematic.

If you have code like these attributes (used in a rather silly way):

[ExcludeFromCodeCoverage]
[EventSource(Name ="George")]
public class FooClass
{}



You can retrieve named arguments like this:

var class1 = root.Namespaces.Last().Classes.First();
var attributes = class1.Attributes.ToArray();
Assert.AreEqual(2, attributes.Count());
Assert.AreEqual("ExcludeFromCodeCoverage", attributes[0].Name);
Assert.AreEqual("EventSource", attributes[1].Name);
Assert.AreEqual("Name", attributes[1].AttributeValues.First().Name);
Assert.AreEqual("George", attributes[1].AttributeValues.First().Value);
Assert.AreEqual(LiteralKind.String, attributes[1].AttributeValues.First().ValueType);

Summary


RoslynDom is in a preliminary stage, and I’d be happy to hear your thoughts. The goal of RoslynDom is to enhance the .NET Compiler Platform to make humans like you and me happy accessing the fantastic information the compiler is exposing!

Updates to RoslynDom 1.0.9 Alpha

New Parent property on all items

IDom now contains a Parent property of IDom type.

In any tree, things can become, well, interesting if nodes appear in more than one location. This is particularly damaging in a tree that takes characteristics from context, which happens with naming (namespaces and nested classes) in the .NET class model. Thus, by intent, no item may appear in more than one location in the tree.

When a member is cloned, its parent is not copied with it. Also, parent and parent properties are not used in determining same intent.

Real-time Namespace property

Previously, Namespace was stored from the symbol when the instance was created. Because Namespace is contextual, this was incorrect. Namespace is now calculated from the parent hierarchy when the namespace is requested for all classes except RDomReferencedType. This resulted in some changes in Namespace results, including the result from

namespace testing.Foo

which previously returned Foo and now returns testing.Foo.

The Namespace in RDomReferencedType is the namespace of the type being referenced, so it is still retrieved from the symbol on load.
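To illustrate the idea of calculating a namespace from the parent hierarchy rather than storing it at load time, here is a minimal sketch. NamespaceNode and QualifiedName are hypothetical names made up for this example; they are not RoslynDom types, and the real implementation may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a tree item; not a RoslynDom type.
public class NamespaceNode
{
    public string Name { get; }
    public NamespaceNode Parent { get; }

    public NamespaceNode(string name, NamespaceNode parent = null)
    {
        Name = name;
        Parent = parent;
    }

    // Walk up the parent chain on every request instead of caching a value
    // at load time, so the result reflects the node's current context.
    public string QualifiedName
    {
        get
        {
            var parts = new List<string>();
            for (var node = this; node != null; node = node.Parent)
            {
                parts.Add(node.Name);
            }
            parts.Reverse(); // collected innermost first, so flip to outermost first
            return string.Join(".", parts);
        }
    }
}
```

With a node named Foo whose parent is a node named testing, QualifiedName yields testing.Foo, which matches the corrected behavior described above.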

AddOrMoveMember and RemoveMember methods

Methods to add members to containers have been added to new IRDomStemContainer, IRDomTypeContainer and IRDomCodeContainer interfaces.

As discussed under the heading “New Parent property on all items,” IDom items may not appear in more than one location in the tree. The AddOrMove semantics reflect this. I actually think moving will be a rare task, but if you accidentally add an item to a new location in the tree, RoslynDom will remove it from the prior location, and I wanted the naming to make this clear.
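The add-or-move behavior can be sketched with a small stand-alone tree. TreeItem and its members are hypothetical and exist only for this illustration; the actual RoslynDom container interfaces are not shown here and may look quite different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node; the names are illustrative, not RoslynDom's API.
public class TreeItem
{
    private readonly List<TreeItem> _members = new List<TreeItem>();

    public string Name { get; }
    public TreeItem Parent { get; private set; }
    public IReadOnlyList<TreeItem> Members => _members;

    public TreeItem(string name)
    {
        Name = name;
    }

    // Adding an item that already has a parent moves it, so the item
    // never appears in two places in the tree at once.
    public void AddOrMoveMember(TreeItem item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        if (item.Parent == this) return;   // already in this container
        item.Parent?.RemoveMember(item);   // detach from the prior location
        _members.Add(item);
        item.Parent = this;
    }

    public bool RemoveMember(TreeItem item)
    {
        if (!_members.Remove(item)) return false;
        item.Parent = null;
        return true;
    }
}
```

This sketch does not guard against adding an item beneath one of its own descendants; a fuller version would need that check.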

I may add an “AddCloneOfMember” to simplify the process of cloning a member and adding it to a new location after changes. This is the anticipated use case.

ICodeContainer and ICodeMember interfaces

There are new ICodeContainer and ICodeMember interfaces. Support for intra-member features (code) remains almost non-existent in this version.

RawItem and OriginalRawItem semantic changes

RawItem and the new OriginalRawItem on the IDom interface represent the underlying data in an agnostic way. IDom is agnostic on mutability so there may be future implementations where RawItem and OriginalRawItem are always the same. I want the semantics to be clear that RawItem is the best current capturing of the tree, and OriginalRawItem is the original unchanged item. This intentionally implies that the original must be maintained.

TypedSyntax and OriginalTypedSyntax are the RDom implementations of these generalized ideas.

AddMember method added to RDomStemContainer and RDomBaseType

To support mutability, AddMember methods were added to these two base classes. This makes the ability to add types and type members available to appropriate types, namespaces, and the root.

Changed return of PublicAnnotationList.GetValue(string key)

Previously this returned the default value, which blocked access to other values. It now returns the PublicAnnotation. The default value remains accessible by GetValue(name, name).

Changed PublicAnnotation to a Class

PublicAnnotation was a struct. This was the only struct in the system and I felt the value/reference semantic difference would be detrimental to maintenance. As part of this, I removed the equality testing and added a SameIntent method.

Added IHasSameIntentMethod interface

Another characteristic interface was added for the SameIntent methods. This is for consistency with other characteristic interface usage.

Moved SameIntent to a subsystem in RoslynDom.Common

This code may eventually run with a DI, but for now, if the interface data matches, they match.

Changed SameIntent method type parameter

Previously the SameIntent method appeared on the strongly typed IDom<T> interface and could only be called on items of the same type. This was overly restrictive, so the method now takes its own type parameter, constrained only to be a class. Comparing different IDom types in the current implementations will always return false. It is possible, though, that a derived class could be created with different behavior but the same intent as one of the existing implementation classes, and it could therefore return true. This change also supports scenarios where the type is not known, such as public annotations that might be IDom types.

Changed inheritance semantics of SameIntent() method

The previous inheritance semantics of the SameIntent method were to directly override the public SameIntent method. This method is no longer virtual. Instead, override the protected CheckSameIntent method. Be sure to call the base CheckSameIntent method for correct behavior.
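As a rough sketch of this pattern (a non-virtual public method delegating to a protected virtual one), consider the following. ItemBase and MethodItem are hypothetical classes written for this example, and the exact-type check is a simplification of my own; the actual RoslynDom signatures may differ.

```csharp
using System;

// Hypothetical base class illustrating the pattern; not RoslynDom source.
public class ItemBase
{
    public string Name { get; set; }

    // The public entry point is non-virtual, so the shared guard checks
    // cannot be skipped by a derived class.
    public bool SameIntent<T>(T other) where T : class
    {
        if (other == null) return false;
        if (ReferenceEquals(this, other)) return true;

        // Simplification for this sketch: different runtime types never match.
        var otherItem = other as ItemBase;
        if (otherItem == null || otherItem.GetType() != GetType()) return false;

        return CheckSameIntent(otherItem);
    }

    // Derived classes extend the comparison here.
    protected virtual bool CheckSameIntent(ItemBase other)
    {
        return Name == other.Name;
    }
}

public class MethodItem : ItemBase
{
    public string ReturnType { get; set; }

    protected override bool CheckSameIntent(ItemBase other)
    {
        // Call the base first so the name comparison still applies.
        if (!base.CheckSameIntent(other)) return false;
        return ReturnType == ((MethodItem)other).ReturnType;
    }
}
```

Forgetting the base call in MethodItem would silently drop the name comparison, which is why calling the base method matters.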

SameIntent and names

Type members (fields, properties, methods and enum) do not include outer name when considering same intent.

Stem members (types, namespaces) do not include namespace/qualified name in same intent.

Added IHasLookupValues interface

Added this interface to reduce dependencies in an upcoming project.

Virtual Matches method added to IDom

Immediately this allows CheckSameIntentChild to better find the other child to compare to. It also provides a generalized way to find items in a list.

Changed name of RDomTypeParameter.HasReferenceTypeConstraint

Was previously HasReferenceConstraint. Changed for consistency. Also changed ITypeParameter.

Changed name of MemberKind, StemMemberKind and LiteralKind

The suffix “type” is confusing. Switched these enums and property names to “kind.”

BuildSyntax

Implementation of syntax recreation from changed nodes has begun but is not complete.

Internal cleanup

- Separated RDomBase classes into separate files

- Created SameIntentHelper

- Changes to how IHasNamespace properties are stored and used

- IHasNamespace moved from IStemMember to IType and INamespace

- IUsing now includes IStemMember

- StemMembers property of Namespaces and Root now include usings

- Fixed some bugs in RDomField attributes

Updates to RoslynDom 1.0.8 Alpha

Thanks to Llewellyn Falco for his ongoing support and insight. He is encouraging me to release RoslynDom frequently and to get a preliminary release of CodeFirstMetadata onto NuGet as well as GitHub real soon.

You can get the bits here and download the NuGet package through Visual Studio package manager or another NuGet client.

These are experimental releases, and as such are not signed.

SameIntent methods

For the work I am doing, I am more interested in the intent of the code than the details of it. There are a number of ways different code can result in identical behavior, including ordering of members, attribute syntax details, namespace nesting, and use of named parameters. The first version of the SameIntent methods is fairly conservative – not all code with identical results will be found, just the big, common issues.

Cloning as Copy methods

I added a feature to clone RoslynDom items. This is a precursor to adding mutability, but mutability is not yet available. This involved changing a number of items from direct access to the underlying trees to retrieving this information into local fields. All tests pass, but if you find a missing feature or anything funny, let me know.

PublicAnnotationList replaces IEnumerable<PublicAnnotation>

Previously RDomBase managed a list of PublicAnnotation. This was a bad refactoring of concerns, so I added a PublicAnnotationList class. This cleaned up the code in RDomBase and will make it easier to evolve the PublicAnnotationList.

Removed RDomSyntaxNodeBase from hierarchy

At one point this class seemed appropriate in the hierarchy. It wasn’t doing anything and was removed.

NonEmptyNamespaces renamed to NonemptyNamespaces

Cleanup issue found by FxCop.

Improved code analysis (FxCop) and test coverage

I may separately blog about how positive the code analysis exercise was – in spite of my deep dread of what I would find. The recommended rules had only one issue – which I thought was pretty cool. Switching to All Microsoft Rules for the non-testing libraries resulted in about 100 issues. I dropped this to under 25 and almost all the changes were things I was really happy to find – insufficient checks for nulls on method entry, a couple of naming fixes.

Public Annotations

Code is data that communicates your intent. If you have no special relationship with your compiler, you don’t need any special data to communicate additional intent.


Once you’re in an open compiler world, you may need to communicate with your compiler. This feature has been called “design time attributes” and “annotations.” I’m adding this feature to RoslynDom and calling it “PublicAnnotations.”


I’ll try to always remember to say “public annotations” to differentiate it from Roslyn private annotations.


 


Why not just use attributes?


Attributes do not work very well to communicate with the compiler for at least these reasons:


  • Can’t tell what’s available at runtime
  • If design attributes are visible at runtime, they become a contract
  • Can result in build dependency
    o If one player removes attributes it’s done with to avoid runtime contracts
  • Must follow attribute syntax
  • Only constants allowed
    o No lambda expressions
    o No generic types
    o No expressions
  • Can’t be placed in all desired locations
    o Not on namespaces, files, or in random locations inside or outside methods

I think the first is actually the biggest issue. I think it’s important to differentiate communications with the compiler pipeline, including design time with the Visual Studio/Roslyn linkage, and runtime attributes. But even if you disagree with that, attributes simply don’t work because of limitations in attribute content and attribute location.


 


The Syntax


Eventually there will be enough examples of public annotations that an obvious syntax can be included in the languages. I’m not willing to wait as I need public annotations right now, like today.


The current syntax has to reside inside a comment. That’s the only way to solve the content and location limitations of attributes without changing the compiler.


The syntax should be clearly differentiated from all other lines of code to allow easy recognition by human, parser/loader and later IDE colorization. The syntax should also be easily found via RegEx to allow updates if we get language support.


RoslynDom now supports the following, with any desired whitespace within the line.


//[[ NameOfAnnotation(stuff) ]]


This currently requires that the annotation appear on a single line, and I’m not currently supporting end-of-line comments.


Because it’s familiar to you, the annotation looks like an attribute, except for that funny double square bracket. It does not need an attribute class to exist anywhere, and one generally will not exist.


Just like attributes, the following variations are all supported, along with the logical combinations:


//[[ NameOfAnnotation() ]]

//[[ NameOfAnnotation(stuff) ]]

//[[ NameOfAnnotation(name:stuff, name2 : stuff) ]]

//[[ NameOfAnnotation(name=stuff, name2 = stuff) ]]
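Since the syntax is meant to be easy to find with a regular expression, here is one possible pattern, offered as a sketch only. AnnotationLine and TryParse are hypothetical names for this example and are not how RoslynDom itself parses annotations. The pattern assumes a whole-line comment, so end-of-line comments are deliberately not matched.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; not part of RoslynDom.
public static class AnnotationLine
{
    // Matches a whole-line comment of the form //[[ Name(args) ]]
    // with optional whitespace between the pieces.
    private static readonly Regex Pattern = new Regex(
        @"^\s*//\s*\[\[\s*(?<name>\w+)\s*\((?<args>.*)\)\s*\]\]\s*$",
        RegexOptions.Compiled);

    // Returns true and fills name/args when the line is an annotation.
    // The argument text is returned raw (trimmed); splitting it into
    // positional and named values is left out of this sketch.
    public static bool TryParse(string line, out string name, out string args)
    {
        var match = Pattern.Match(line ?? string.Empty);
        name = match.Success ? match.Groups["name"].Value : null;
        args = match.Success ? match.Groups["args"].Value.Trim() : null;
        return match.Success;
    }
}
```

Calling AnnotationLine.TryParse on each comment line separates annotations from ordinary comments; a real parser would still need to handle quoted strings and commas inside the argument text.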


The common way to add annotations will be to include them in your source code. You can also add annotations explicitly with the AddPublicAnnotationValue(string name, object value) and the AddPublicAnnotationValue(string name, string key, object value) methods.


Public annotations with no parameters are just accessed to see if the public annotation exists via its name.


A single positional value is supported and is accessed via the public annotation name.


Named values are accessed by the public annotation name and the value name as a key.
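The three access shapes (existence check, single positional value, named values) can be pictured with a small in-memory store. AnnotationStore and its Add, Has, and Get methods are hypothetical names for this sketch, and storing the positional value under an empty key is my own assumption rather than a description of RoslynDom internals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory store; not RoslynDom's PublicAnnotationList.
public class AnnotationStore
{
    // Assumption of this sketch: the positional value lives under an empty key.
    private const string PositionalKey = "";

    private readonly Dictionary<string, Dictionary<string, object>> _values =
        new Dictionary<string, Dictionary<string, object>>();

    private Dictionary<string, object> GetOrCreate(string name)
    {
        Dictionary<string, object> named;
        if (!_values.TryGetValue(name, out named))
        {
            named = new Dictionary<string, object>();
            _values[name] = named;
        }
        return named;
    }

    // No parameters: only the annotation's existence is recorded.
    public void Add(string name) { GetOrCreate(name); }

    // Single positional value: keyed by the annotation name alone.
    public void Add(string name, object value) { GetOrCreate(name)[PositionalKey] = value; }

    // Named value: keyed by annotation name plus value name.
    public void Add(string name, string key, object value) { GetOrCreate(name)[key] = value; }

    public bool Has(string name) { return _values.ContainsKey(name); }

    public object Get(string name) { return Get(name, PositionalKey); }

    public object Get(string name, string key)
    {
        Dictionary<string, object> named;
        object value;
        if (_values.TryGetValue(name, out named) && named.TryGetValue(key, out value))
        {
            return value;
        }
        return null; // missing annotation or missing key
    }
}
```

Values come back as object, so callers cast to the type they expect, much as a typed overload would do on their behalf.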


 


Legal locations for public annotations


Annotations are currently legal on using statements, namespaces, types and members. They are also legal at the file or root level.


var csharpCode = @"
//[[ file: kad_Test4(val1 = ""George"", val2 = 43) ]]
//[[ kad_Test1(val1 : ""Fred"", val2 : 40) ]]
using Foo;

//[[ kad_Test2(""Bill"", val2 : 41) ]]
//[[ kad_Test3(val1 =""Percy"", val2 : 42) ]]
public class MyClass
{ }
";

This illustrates a challenge. A likely location for public annotations is the file level, but I then need to distinguish between a file or root level public annotation and an annotation on the first item in the file. I decided to do this by prefixing the file-level public annotation. I am currently supporting both file and root as prefixes.


Accessing your values


The following methods are available for accessing public annotations:


bool HasPublicAnnotation(string name);

void AddPublicAnnotationValue(string name, string key, object value);
void AddPublicAnnotationValue(string name, object value);

object GetPublicAnnotationValue(string name, string key);
object GetPublicAnnotationValue(string name);
T GetPublicAnnotationValue<T>(string name);
T GetPublicAnnotationValue<T>(string name, string key);

These methods are available on all items via the IDom interface.


 


Full example


Here’s the full example from the Scenario_PatternMatchingSelection class of the RoslynDomExampleTests project.


[TestMethod]
public void Can_get_and_retrieve_public_annotations()
{
    var csharpCode = @"
//[[ file: kad_Test4(val1 = ""George"", val2 = 43) ]]
//[[ kad_Test1(val1 : ""Fred"", val2 : 40) ]]
using Foo;

//[[ kad_Test2(""Bill"", val2 : 41) ]]
//[[ kad_Test3(val1 =""Percy"", val2 : 42) ]]
public class MyClass
{ }
";
    var root = RDomFactory.GetRootFromString(csharpCode);

    // Annotations on the using statement, via typed and untyped access
    var using1 = root.Usings.First();
    Assert.AreEqual("Fred", using1.GetPublicAnnotationValue<string>("kad_Test1", "val1"));
    Assert.AreEqual("Fred", using1.GetPublicAnnotationValue("kad_Test1", "val1"));
    Assert.AreEqual(40, using1.GetPublicAnnotationValue<int>("kad_Test1", "val2"));
    Assert.AreEqual(40, using1.GetPublicAnnotationValue("kad_Test1", "val2"));

    // Annotations on the class, including a positional value
    var class1 = root.RootClasses.First();
    Assert.AreEqual("Bill", class1.GetPublicAnnotationValue("kad_Test2"));
    Assert.AreEqual(41, class1.GetPublicAnnotationValue("kad_Test2", "val2"));
    Assert.AreEqual("Percy", class1.GetPublicAnnotationValue("kad_Test3", "val1"));
    Assert.AreEqual(42, class1.GetPublicAnnotationValue("kad_Test3", "val2"));

    // File-level annotation, read from the root
    Assert.AreEqual("George", root.GetPublicAnnotationValue("kad_Test4", "val1"));
    Assert.AreEqual(43, root.GetPublicAnnotationValue("kad_Test4", "val2"));
}
