RoslynDom: Structural Interrogation Walk-throughs

The goal of RoslynDom is to present information about your code in the way you think about your code.

A note on VB: I’m building out the C# version first, but I know VB very well and am designing to support later VB creation. If something is at odds with good C# support, I’ll cross that bridge when I get there.

You can get RoslynDom on NuGet via the Package Manager in Visual Studio and here on GitHub. Keep in mind that it is an early experimental release.

RoslynDom celebrates the awesome .NET Compiler Platform, but also respects that the .NET Compiler Platform is built as a compiler, and you are not a compiler.

Introduction

I started from the outside, at the highest level of code in a single file, and am working inward: first structure, then statements, and eventually expressions. Support for multiple files is coming, but not until I’ve completed work on statements.

By structural, I mean artifacts that organize your code – namespaces, classes, structures, etc. This post shows how to use RoslynDom to query code. Changing code is a different post. You can change the RoslynDom rather easily, but outputting a new tree with your changes is currently buggy. In the meantime, most RoslynDom items expose the SyntaxNode they were created from, and where practical the corresponding ISymbol (SyntaxNode and ISymbol are part of the .NET Compiler Platform). You can use RoslynDom to get to the right location in your code, and then use .NET Compiler Platform techniques.

You can find more about the scenarios I wrote RoslynDom to support here. If you have a tool idea and want me to make RoslynDom friendly to what you’re doing, let’s talk.

This post has walk-throughs of how you can use RoslynDom today. RoslynDom is a library to build tools from – it is not itself a tool. One tool that has been built on top of it is Jim Christopher’s RoslynDom-Provider.

Retrieving Namespaces

A namespace is a logical container. It’s orthogonal to the structure of your running application and tools like ObjectBrowser offer alternate physical (assembly/module) and logical (namespace) trees.

RoslynDom sets out to give access to your code the way you think about it, and you might think about it differently at different times. Both of these statements are true:

  • A namespace is a dot delimited string attached to a class or other type to give it a more complete and hopefully unique name (in->out)
  • A namespace is an identifier that you put at the top of a file that groups the contained code with related code in different files (out->in)

A namespace can be nested – the namespace System contains the namespace System.Diagnostics.

The nesting of namespaces in code is entirely arbitrary – these code fragments are logically identical:

namespace RoslynDom
{
   namespace Common.Test
   {
      public class Foo { }
   }
}

namespace RoslynDom.Common.Test
{
   public class Foo { }
}

The .NET Compiler Platform manages namespaces differently in the two trees. RoslynDom is committed to expressing code the way you think of it, and you probably don’t think of your code in terms of different access mechanisms, each good for different things.

To access namespace information in RoslynDom, you first load your code. You can do this from a file, a source code string, a project document, or a SyntaxTree. For example:

IRoot root = RDomFactory.GetRootFromFile(@"..\..\TestFile.cs");


There are three properties regarding namespaces, one of which is still evolving (is the fully expanded view ever valuable to a person?). Two of them are shown here:

var nspaces1 = root.Namespaces;
var nspaces3 = root.NonemptyNamespaces;


These provide your namespaces as you wrote them and where the namespace is actually in use in this root. The following code would have two members in the Namespaces property and one member in the NonemptyNamespaces property:

namespace RoslynDom
{
   namespace Common.Test
   {
      public class Foo { }
   }
}


If you use the NonemptyNamespaces property, the behavior will not change if someone refactors this code to use one namespace declaration or three.
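
To make that concrete, here is a minimal sketch of why the nested and the single-declaration forms collapse to the same non-empty namespace. NamespaceDecl and NamespaceSketch.NonemptyNames are hypothetical stand-ins invented for this illustration; they are not RoslynDom types, and RoslynDom's own implementation may work quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a namespace declaration (NOT a RoslynDom type).
public sealed class NamespaceDecl
{
    public NamespaceDecl(string name,
                         IEnumerable<string> typeNames = null,
                         IEnumerable<NamespaceDecl> children = null)
    {
        Name = name;
        TypeNames = (typeNames ?? Enumerable.Empty<string>()).ToList();
        Children = (children ?? Enumerable.Empty<NamespaceDecl>()).ToList();
    }

    public string Name { get; }
    public IReadOnlyList<string> TypeNames { get; }
    public IReadOnlyList<NamespaceDecl> Children { get; }
}

public static class NamespaceSketch
{
    // Yields the fully qualified name of every namespace that directly
    // contains at least one type, however the declarations were nested.
    public static IEnumerable<string> NonemptyNames(NamespaceDecl decl, string prefix = "")
    {
        var fullName = prefix.Length == 0 ? decl.Name : prefix + "." + decl.Name;
        if (decl.TypeNames.Count > 0) { yield return fullName; }
        foreach (var child in decl.Children)
        {
            foreach (var name in NonemptyNames(child, fullName)) { yield return name; }
        }
    }
}
```

Building the nested form (RoslynDom containing Common.Test containing Foo) and the flat form (RoslynDom.Common.Test containing Foo) with these stand-ins, both produce the single name RoslynDom.Common.Test, which is the stability the NonemptyNamespaces property is meant to give you.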

Now that you’ve seen RoslynDom in action, you may be able to predict how to retrieve using statements:

var usings = root.Usings;


Since namespaces and using directives can appear within other namespaces, both of these properties also appear on the RoslynDom INamespace interface.

Retrieving Classes


The next step down the structural hierarchy is classes, structures and other types. These may appear at the root or in a namespace. You probably have a single namespace in your file and probably do not perceive your file as a nested structure of namespace(s) containing types. RoslynDom supports both approaches:

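// First approach: through the namespace that contains the classes
var nspace = root.Namespaces.First();
var classesInNamespace = nspace.Classes;
// Second approach: directly from the root, without walking namespaces
var rootClasses = root.RootClasses;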


Again, I think the second will generally be more useful. Each class includes a property containing its namespace/fully qualified name.

Retrieving Methods


RoslynDom reflects the four fundamental levels of code in .NET:

  • Root attachable, which I call “stem members:” root, namespaces, using directives, classes, interfaces, structures, enums, and (in the future) delegates
  • Type attachable, or type members: methods, properties, fields, (soon) enum values and (soon) events (constructors are currently a special case of a method, but waiting for a final understanding of primary constructors)
  • Statements attachable to methods and property accessors
  • Expressions that can be attached to statements and to fields as initializers (and now properties)

Remembering how you access namespaces, you can probably predict the code to access type members in RoslynDom:

IRoot root = RDomFactory.GetRootFromFile(@"..\..\TestFile.cs");
var class1 = root.Namespaces.Last().Classes.First();
var methods = class1.Methods;
var fields = class1.Fields;
var properties = class1.Properties;

Methods can have parameters:

var method = class1.Methods.First();
var parameters = method.Parameters;


Wow, that was easy!

Retrieving information about items


Here’s some code

namespace Namespace2
{
   public class FooClass
   {
      public string FooMethod(int bar1, string bar2)
      { }
      public string FooProperty { get; set; }
   }
}


Let’s say you want the name and type of a parameter:

var parameters = method.Parameters.ToArray();
Assert.AreEqual("bar1", parameters[0].Name);
Assert.AreEqual("Int32", parameters[0].Type.Name);


Except in the case of CLR types, you may not have access to the Reflection runtime type. To avoid taking a dependency on the .NET Compiler Platform, RoslynDom has its own type class. Alas, that’s another post.

Here is the set of features RoslynDom makes available for methods and parameters:

var method = class1.Methods.First();
var parameters = method.Parameters.ToArray();
Assert.AreEqual(2, parameters.Count());
Assert.AreEqual("FooMethod", method.Name);
Assert.AreEqual("String", method.ReturnType.Name);
Assert.AreEqual(AccessModifier.Public, method.AccessModifier );
Assert.IsFalse(method.IsAbstract);
Assert.IsFalse(method.IsExtensionMethod);
Assert.IsFalse(method.IsOverride);
Assert.IsFalse(method.IsSealed);
Assert.IsFalse(method.IsStatic);
Assert.IsFalse(method.IsVirtual);
Assert.AreEqual("bar1", parameters[0].Name);
Assert.AreEqual("Int32", parameters[0].Type.Name);
Assert.AreEqual(0, parameters[0].Ordinal);
Assert.AreEqual("bar2", parameters[1].Name);
Assert.AreEqual("String", parameters[1].Type.Name);
Assert.AreEqual(1, parameters[1].Ordinal);
Assert.IsFalse(parameters[1].IsOptional);
Assert.IsFalse(parameters[1].IsOut);
Assert.IsFalse(parameters[1].IsParamArray);



Attributes


Attributes are another area where the .NET Compiler Platform syntax tree keeps track of arbitrary differences in how code is written – differences that you don’t think about when reading code. These two fragments of code have the same intent.

[SomeAttr, SomeAttr2]
struct Foo<T>
{ }

[SomeAttr]
[SomeAttr2]
struct Foo<T>
{ }


And that’s without even considering the optional parentheses on the attributes.

RoslynDom collapses these differences and has an Attributes property on every item that allows it in .NET:

var attributes = class1.Attributes;


Attributes have names. One way to use them is with LINQ expressions. This retrieves any attribute that is on a class and has a particular name:

var classAttributes = from x in root.RootClasses
                      from a in x.Attributes
                      where a.Name == name
                      select a;


A similar LINQ expression could return the matching, or non-matching classes.
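
As a rough sketch of what such a query could look like, the following uses two hypothetical stand-in classes (ClassSketch and AttributeSketch, invented here so the example compiles on its own) in place of the RoslynDom items. Against RoslynDom you would presumably query root.RootClasses and each class's Attributes property in the same shape, but treat that as an assumption rather than a tested recipe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the RoslynDom attribute and class items.
public sealed class AttributeSketch
{
    public string Name { get; set; }
}

public sealed class ClassSketch
{
    public string Name { get; set; }
    public List<AttributeSketch> Attributes { get; } = new List<AttributeSketch>();
}

public static class AttributeQuerySketch
{
    // Classes that carry at least one attribute with the given name.
    public static IEnumerable<ClassSketch> WithAttribute(
        IEnumerable<ClassSketch> classes, string name)
    {
        return from c in classes
               where c.Attributes.Any(a => a.Name == name)
               select c;
    }

    // Classes that carry no attribute with the given name.
    public static IEnumerable<ClassSketch> WithoutAttribute(
        IEnumerable<ClassSketch> classes, string name)
    {
        return from c in classes
               where !c.Attributes.Any(a => a.Name == name)
               select c;
    }
}
```

Filtering on Any inside the where clause keeps each class in the result at most once, even if the attribute is applied to it several times.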

Attributes may have values, which would be the parameters to the attributes. Since RoslynDom does not yet support multiple files, the attributes aren’t fully resolved and positional arguments are currently problematic.

If you have code like these attributes (used in a rather silly way):

[ExcludeFromCodeCoverage]
[EventSource(Name ="George")]
public class FooClass
{}


You can retrieve named arguments like this:

var class1 = root.Namespaces.Last().Classes.First();
var attributes = class1.Attributes.ToArray();
Assert.AreEqual(2, attributes.Count());
Assert.AreEqual("ExcludeFromCodeCoverage", attributes[0].Name);
Assert.AreEqual("EventSource", attributes[1].Name);
Assert.AreEqual("Name", attributes[1].AttributeValues.First().Name);
Assert.AreEqual("George", attributes[1].AttributeValues.First().Value);
Assert.AreEqual(LiteralKind.String, attributes[1].AttributeValues.First().ValueType);

Summary


RoslynDom is in a preliminary stage, and I’d be happy to hear your thoughts. The goal of RoslynDom is to enhance the .NET Compiler Platform to make humans like you and me happy accessing the fantastic information the compiler is exposing!

Updates to RoslynDom 1.0.8 Alpha

Thanks to Llewellyn Falco for his ongoing support and insight. He is encouraging me to release RoslynDom frequently, and to get a preliminary release of CodeFirstMetadata onto NuGet as well as GitHub real soon.

You can get the bits here and download the NuGet package through the Visual Studio package manager or another NuGet client.

These are experimental releases, and as such are not signed.

SameIntent methods

For the work I am doing, I am more interested in the intent of the code than the details of it. There are a number of ways different code can result in identical behavior, including ordering of members, attribute syntax details, namespace nesting, and use of named parameters. The first version of the SameIntent methods is fairly conservative – not all code with identical results will be found, just the big, common issues.

Cloning as Copy methods

I added a feature to clone RoslynDom items. This is a precursor to adding mutability, but mutability is not yet available. This involved changing a number of items from direct access to the underlying trees to retrieving this information into local fields. All tests pass, but if you find a missing feature or anything funny, let me know.

PublicAnnotationList replaces IEnumerable<PublicAnnotation>

Previously RDomBase managed a list of PublicAnnotation. This was a bad refactoring of concerns, so I added a PublicAnnotationList class. This cleaned up the code in RDomBase and will make it easier to evolve the PublicAnnotationList.

Removed RDomSyntaxNodeBase from hierarchy

At one point this class seemed appropriate in the hierarchy. It wasn’t doing anything and was removed.

NonEmptyNamespaces renamed to NonemptyNamespaces

Cleanup issue found by FxCop.

Improved code analysis (FxCop) and test coverage

I may separately blog about how positive the code analysis exercise was – in spite of my deep dread of what I would find. The recommended rules had only one issue – which I thought was pretty cool. Switching to All Microsoft Rules for the non-testing libraries resulted in about 100 issues. I dropped this to under 25 and almost all the changes were things I was really happy to find – insufficient checks for nulls on method entry, a couple of naming fixes.

Public Annotations

Code is data that communicates your intent. If you have no special relationship with your compiler, you don’t need any special data to communicate additional intent.


Once you’re in an open compiler world, you may need to communicate with your compiler. This feature has been called “design time attributes” and “annotations.” I’m adding this feature to RoslynDom and calling it “PublicAnnotations.”


I’ll try to always remember to say “public annotations” to differentiate it from Roslyn private annotations.


 


Why not just use attributes?


Attributes do not work very well to communicate with the compiler for at least these reasons:


  • Can’t tell what’s available at runtime
  • If design attributes are visible at runtime, they become a contract
  • Can result in build dependency
    o If one player removes attributes it’s done with to avoid runtime contracts
  • Must follow attribute syntax
  • Only constants allowed
    o No lambda expressions
    o No generic types
    o No expressions
  • Can’t be placed in all desired locations
    o Not on namespaces, files, or in random locations inside or outside methods

I think the first is actually the biggest issue. I think it’s important to differentiate communications with the compiler pipeline, including design time with the Visual Studio/Roslyn linkage, and runtime attributes. But even if you disagree with that, attributes simply don’t work because of limitations in attribute content and attribute location.


 


The Syntax


Eventually there will be enough examples of public annotations that an obvious syntax can be included in the languages. I’m not willing to wait as I need public annotations right now, like today.


The current syntax has to reside inside a comment. That’s the only way to solve the content and location limitations of attributes without changing the compiler.


The syntax should be clearly differentiated from all other lines of code to allow easy recognition by human, parser/loader and later IDE colorization. The syntax should also be easily found via RegEx to allow updates if we get language support.


RoslynDom now supports the following, with any desired whitespace within the line.


//[[ NameOfAnnotation(stuff) ]]


This currently requires the annotation appear on a single line and I’m not currently supporting end of line comments.


Because it’s familiar to you, the annotation looks like an attribute, except for that funny double square bracket. It does not need an attribute class to exist anywhere, and one generally will not exist.


Just like attributes, the following variations are all supported, along with the logical combinations:


//[[ NameOfAnnotation() ]]

//[[ NameOfAnnotation(stuff) ]]

//[[ NameOfAnnotation(name:stuff, name2 : stuff) ]]

//[[ NameOfAnnotation(name=stuff, name2 = stuff) ]]
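
To illustrate the "easily found via RegEx" goal, here is a minimal sketch of a pattern that recognizes a single-line public annotation and pulls out its name and raw argument text. The pattern and the AnnotationSketch class are assumptions made for this example; this is not the pattern RoslynDom uses internally, and it deliberately does not try to split or interpret the arguments.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AnnotationSketch
{
    // Illustrative pattern only: a comment that consists of [[ Name(args) ]]
    // and nothing else, with optional whitespace around each part.
    private static readonly Regex AnnotationLine = new Regex(
        @"^\s*//\s*\[\[\s*(?<name>\w+)\s*\((?<args>.*)\)\s*\]\]\s*$");

    // Returns true when the line looks like a public annotation.
    // 'args' is the raw text between the outer parentheses, trimmed.
    public static bool TryParse(string line, out string name, out string args)
    {
        var match = AnnotationLine.Match(line);
        name = match.Success ? match.Groups["name"].Value : null;
        args = match.Success ? match.Groups["args"].Value.Trim() : null;
        return match.Success;
    }
}
```

With this sketch, the line //[[ NameOfAnnotation(name:stuff, name2 : stuff) ]] yields the name NameOfAnnotation and the raw argument text name:stuff, name2 : stuff, while an ordinary comment is rejected. Anchoring the pattern to the whole line matches the current restriction that the annotation sits on a line of its own.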


The common way to add annotations will be to include them in your source code. You can also add annotations explicitly with the AddPublicAnnotationValue(string name, object value) and the AddPublicAnnotationValue(string name, string key, object value) methods.


Public annotations with no parameters are just accessed to see if the public annotation exists via its name.


A single positional value is supported and is accessed via the public annotation name.


Named values are accessed by the public annotation name and the value name as a key.


 


Legal locations for public annotations


Annotations are currently legal on using statements, namespaces, types and members. They are also legal at the file or root level.


var csharpCode = @"
//[[ file: kad_Test4(val1 = ""George"", val2 = 43) ]]
//[[ kad_Test1(val1 : ""Fred"", val2 : 40) ]]
using Foo;

//[[ kad_Test2(""Bill"", val2 : 41) ]]
//[[ kad_Test3(val1 =""Percy"", val2 : 42) ]]
public class MyClass
{ }
";

This illustrates a challenge. A likely location for public annotations is the file level. But I then need to distinguish between a file or root level public annotation and annotations on the first item in the file. I decided to do this by prefixing the file-level public annotation, as with the file: prefix in the code above. I am currently supporting both file and root as prefixes.


Accessing your values


The following methods are available for accessing public annotations


bool HasPublicAnnotation(string name);

void AddPublicAnnotationValue(string name, string key, object value);
void AddPublicAnnotationValue(string name, object value);

object GetPublicAnnotationValue(string name, string key);
object GetPublicAnnotationValue(string name);
T GetPublicAnnotationValue<T>(string name);
T GetPublicAnnotationValue<T>(string name, string key);

These methods are available on all items via the IDom interface.
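
To show how these overloads fit together, here is a small dictionary-backed sketch with the same method shapes. AnnotationStoreSketch is a hypothetical stand-in written for this post, not RoslynDom's implementation. In particular, storing the single positional value under an empty key is my assumption for the sketch; the real library may track it differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in that mirrors the method shapes listed above.
public sealed class AnnotationStoreSketch
{
    // Assumption for this sketch: the positional value lives under an empty key.
    private const string PositionalKey = "";

    // annotation name -> (value key -> value)
    private readonly Dictionary<string, Dictionary<string, object>> _values =
        new Dictionary<string, Dictionary<string, object>>();

    public bool HasPublicAnnotation(string name)
    {
        return _values.ContainsKey(name);
    }

    public void AddPublicAnnotationValue(string name, string key, object value)
    {
        Dictionary<string, object> byKey;
        if (!_values.TryGetValue(name, out byKey))
        {
            byKey = new Dictionary<string, object>();
            _values[name] = byKey;
        }
        byKey[key] = value;
    }

    public void AddPublicAnnotationValue(string name, object value)
    {
        AddPublicAnnotationValue(name, PositionalKey, value);
    }

    // Returns null when the annotation or the key is missing.
    public object GetPublicAnnotationValue(string name, string key)
    {
        Dictionary<string, object> byKey;
        object value;
        if (_values.TryGetValue(name, out byKey) && byKey.TryGetValue(key, out value))
        {
            return value;
        }
        return null;
    }

    public object GetPublicAnnotationValue(string name)
    {
        return GetPublicAnnotationValue(name, PositionalKey);
    }

    // The generic overloads cast; a missing value throws for value types like int.
    public T GetPublicAnnotationValue<T>(string name, string key)
    {
        return (T)GetPublicAnnotationValue(name, key);
    }

    public T GetPublicAnnotationValue<T>(string name)
    {
        return (T)GetPublicAnnotationValue(name, PositionalKey);
    }
}
```

The two-level lookup reflects the access rules described earlier: the annotation name alone tells you whether it exists and gives you the positional value, and the name plus a key gives you a named value.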


 


Full example


Here’s the full example from the Scenario_PatternMatchingSelection class of the RoslynDomExampleTests project.


[TestMethod]
public void Can_get_and_retrieve_public_annotations()
{
    var csharpCode = @"
//[[ file: kad_Test4(val1 = ""George"", val2 = 43) ]]
//[[ kad_Test1(val1 : ""Fred"", val2 : 40) ]]
using Foo;

//[[ kad_Test2(""Bill"", val2 : 41) ]]
//[[ kad_Test3(val1 =""Percy"", val2 : 42) ]]
public class MyClass
{ }
";
    var root = RDomFactory.GetRootFromString(csharpCode);

    var using1 = root.Usings.First();
    Assert.AreEqual("Fred", using1.GetPublicAnnotationValue<string>("kad_Test1", "val1"));
    Assert.AreEqual("Fred", using1.GetPublicAnnotationValue("kad_Test1", "val1"));
    Assert.AreEqual(40, using1.GetPublicAnnotationValue<int>("kad_Test1", "val2"));
    Assert.AreEqual(40, using1.GetPublicAnnotationValue("kad_Test1", "val2"));

    var class1 = root.RootClasses.First();
    Assert.AreEqual("Bill", class1.GetPublicAnnotationValue("kad_Test2"));
    Assert.AreEqual(41, class1.GetPublicAnnotationValue("kad_Test2", "val2"));
    Assert.AreEqual("Percy", class1.GetPublicAnnotationValue("kad_Test3", "val1"));
    Assert.AreEqual(42, class1.GetPublicAnnotationValue("kad_Test3", "val2"));

    Assert.AreEqual("George", root.GetPublicAnnotationValue("kad_Test4", "val1"));
    Assert.AreEqual(43, root.GetPublicAnnotationValue("kad_Test4", "val2"));
}

Creating Strong-typed Metadata Classes

This post is about an aspect of the CodeFirstMetadata library. You can find out more about this library and where to get it here and here.

You can find out more about strong-typed metadata classes in this post.

You can find out about code-first (generalized, not Entity Framework) here.

This post talks about the two existing examples to explain how strong typing works in real code and to show how instances of these examples are created.

At present, in order to create a set of strong typed classes to solve a new problem you need to create a fairly messy set of classes. Feel free to ping me if you think you have a good problem or you want to extend the existing problems and I’ll help guide you. In the long run I want to automate that process, so I probably won’t document it until then.

Because part will be automated/generated, it comes in two parts. I’m currently combining them with inheritance, rather than partial classes, to make this code approachable for non-.NET programmers, and because virtual/override are simpler concepts.

These classes all derive from a common base class – CodeFirstMetadata<T> – to provide common features like naming. Below this are code element specific classes like CodeFirstMetadataClass<T> that help with the conversion. I may later replace this with a shallow hierarchy and interfaces, so don’t get dependent on this implementation.

For the semantic log class, the predictable part, which I’ll later generate, looks like:

using System.Collections.Generic;
using CodeFirst.Common;

namespace CodeFirstMetadataTest.SemanticLog
{
    // TODO: Generate this base class based on expected attributes
    public abstract class CodeFirstSemanticLogBase : CodeFirstMetadataClass<CodeFirstSemanticLog>
    {
        public CodeFirstSemanticLogBase()
        {
            this.Events = new List<CodeFirstLogEvent>();
        }

        public virtual string UniqueName { get; set; }
        public virtual string LocalizationResources { get; set; }

        public IEnumerable<CodeFirstLogEvent> Events { get; private set; }
    }
}



 



The manual changes I’ve made, which are by far the most complex I’ve needed so far, are:



using System.Linq;
// TODO: Attempt to remove this line after generating base class

namespace CodeFirstMetadataTest.SemanticLog
{
    public class CodeFirstSemanticLog : CodeFirstSemanticLogBase
    {
        private string _uniqueName;
        public override string UniqueName
        {
            get
            {
                if (string.IsNullOrWhiteSpace(_uniqueName))
                { return Namespace.Replace(".", "-") + "-" + ClassName; }
                return _uniqueName;
            }
            set
            { _uniqueName = value; }
        }

        public bool IncludesInterface
        { get { return this.ImplementedInterfaces.Count() > 0; } }

        public bool IsLocalized
        { get { return !string.IsNullOrWhiteSpace(this.LocalizationResources); } }

        public override bool ValidateAndUpdateCore()
        {
            var isOk = base.ValidateAndUpdateCore();
            if (isOk)
            { return CheckAndUpdateEventIds(); }
            return false;
        }

        /// <summary>
        /// This is a weird algorithm because it numbers implicit events from
        /// the top, regardless of whether other events have event IDs. But
        /// while I wouldn't have chosen this, I think it's important to match
        /// EventSource implicit behavior exactly.
        /// </summary>
        private bool CheckAndUpdateEventIds()
        {
            var i = 0;
            foreach (var evt in this.Events)
            {
                i++;
                if (evt.EventId == 0) evt.EventId = i;
            }
            // PERF: The following is an O(n^2) algorithm, probably a better way
            var dupes = this.Events
                            .Where(x => this.Events
                                .Any(y => (y != x) && x.EventId == y.EventId));
            return (dupes.Count() == 0);
        }
    }
}



 



EventSource, and presumably any other log system, requires a unique name, and I want to help you create that. Also, whether there is an interface and whether the class is localized have a significant impact on the template, so I simplify access to this information.



Loading strong-typed metadata is an opportunity for validation of the model. I use this to provide unique numeric ids to each of the log events, which are needed by EventSource and potentially other log mechanisms.
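
The PERF comment in the code above hints that the duplicate check could be cheaper. One possible single-pass variant, sketched here with a HashSet, is shown below. LogEventSketch is a hypothetical stand-in for CodeFirstLogEvent so the example compiles on its own, and this is an alternative I have not wired into CodeFirstMetadata, not the code the library runs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for CodeFirstLogEvent; only EventId matters here.
public sealed class LogEventSketch
{
    public string Name { get; set; }
    public int EventId { get; set; }
}

public static class EventIdSketch
{
    // Assigns implicit ids by 1-based position from the top (matching the
    // loop in CheckAndUpdateEventIds), then reports whether all ids are unique.
    public static bool CheckAndUpdateEventIds(IList<LogEventSketch> events)
    {
        var seen = new HashSet<int>();
        var allUnique = true;
        for (var i = 0; i < events.Count; i++)
        {
            // An id of 0 means "not set"; the position counts every event,
            // including those that already have an explicit id.
            if (events[i].EventId == 0) { events[i].EventId = i + 1; }

            // HashSet<int>.Add returns false when the id was already present.
            if (!seen.Add(events[i].EventId)) { allUnique = false; }
        }
        return allUnique;
    }
}
```

Because implicit ids are positional, an explicit id can collide with a later implicit one: an event with explicit id 2 followed by an event with no id gives both the id 2, and validation fails. That is the same outcome as the original two-step version, just found in one pass.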



Mapping Between Code-first and Strong-typed Metadata



A bunch of ugly Roslyn and reflection code maps between code-first and strong typed metadata. This is the code that drove creation of the RoslynDom library – directly hitting the .NET Compiler Platform/Roslyn API within this code was monstrous.



var root = RDomFactory.GetRootFromFile(fileName);
var cfNamespace = root.Namespaces.First();
var returnType = typeof(T);
var mapping = TargetMapping.DeriveMapping("root", "root", returnType.GetTypeInfo()) as TargetNamespaceMapping;
var mapper = new CodeFirstMapper();
var newObj = mapper.Map(mapping, cfNamespace);



  • cfNamespace is the first namespace in the RoslynDom root
  • T is the type to return – the strong-typed metadata
  • mapping is derived data about the mapping of the target – just create it as shown
  • mapper is the class that does the hard work
  • newObj is the new strong-typed metadata object


In the end, you have an object that is the strong-typed metadata for the initial code.



OK, but how does that work?



For metaprogramming:



  • I create a minimal description in a file with a .cfcs extension
  • I lie to Visual Studio and tell it that this is a C# file (Tools/Options/Text Editor/File Extensions), so I get nice IntelliSense for most features (more work to be done later)
  • MSBuild doesn’t see it as a C# file, so the .cfcs files are ignored as source in compilation
  • Generation creates .g.cs files that are included in compilation


The intent is to have this automated as part of your normal development pipeline, through one or more mechanisms – build, custom tools, VS extension/PowerShell. The pipeline part is not done yet, but you can grab the necessary pieces from the console application in the example.



Getting CodeFirstMetadata



You can get this project on GitHub. I’ll add this to NuGet when the samples are more accessible from your Visual Studio project.

RoslynDom and Friends – Just the Facts

See this post for the Roadmap of these projects

RoslynDom

A wrapper for the .NET Compiler Platform – the roadmap has further plans

Project on GitHub

See the RoslynDomExampleTests project in the solution for the 20 things you’re most likely to do

Download via Visual Studio NuGet Package Manager if you want to play with that

RoslynDom-Provider

By Jim Christopher

A PowerShell provider for RoslynDom

Project on GitHub

CodeFirstMetadata

Strong-typed metadata from code-first (general sense, not Entity Framework sense)

Project on GitHub

See the ConsoleRunT4Example project in the solution along with strong-typed files and T4 usage

Roadmap for RoslynDom, CodeFirst Strong-typed Metadata and ExpansionFirst Templates

I’ve been working on three interleaved projects: RoslynDom, CodeFirst Strong-typed Metadata and ExpansionFirst Templates. Also, Jim Christopher (aka beefarino) built a PowerShell provider. This post is an overview of these projects and a roadmap of how they relate to each other.


You can find the short version here.


[Image: roadmap of the RoslynDom, CodeFirst Strong-typed Metadata and ExpansionFirst Templates projects]


In the roadmap, blue indicates full (almost) test coverage and that the library has had more than one user, orange indicates preliminary released code, and grey indicates code that’s really not ready to go and not yet available.


I’m working left to right, waiting to complete some features of the RoslynDom library until I have the full set of projects available in preliminary form.


RoslynDom Library


The .NET Compiler Platform, or Roslyn, does exactly what it was intended to do, which is exactly what we want it to do. It’s a very good compiler, now released as open source, and exposing all of its internals. It’s great that we get access to the internal trees, but it’s not happy code for you and me to use – it’s compiler internals.


At the same time, these trees hold a wealth of information we want – it’s more complete information than reflection, holds design information like comments and XML documentation, and it’s available even when the source code doesn’t compile.


When you and I ask questions about our code, we ask simple things – what are the classes in this file? We don’t care about whitespace, or precisely how we defined namespaces. In fact, most of the time, we don’t even care about namespaces at all. And we certainly don’t care whether a piece of information is available in the syntactic or semantic tree or whether attributes were defined with this style or that style.


RoslynDom wraps the Roslyn compiler trees and exposes the information in a programmer friendly way. Goals include


  • Easy access to the tree in the way(s) programmers think about code as a hierarchy
  • Easy access to common information about the code as parameters
  • Access to the applicable SyntaxNode when you need it
  • Access to the applicable Symbol when you need it
  • Planned: Access to the full logical model – solution to smallest code detail
    (Currently, file down to member)
  • Planned: A kludged public annotation/design time attribute system until we get a real one
    (Currently, attribute support only)
  • Planned: Ability to morph and output changes
    (Currently, readonly)

Getting RoslynDom


You can get the source code on GitHub, and there’s a RoslynDomExampleTests project which shows how to do about 20 common things.


The project is also available via NuGet. It’s preliminary, use cautiously. Download with the Visual Studio NuGet package manager.


RoslynDom-Provider


Jim Christopher created a PowerShell provider for RoslynDom. PowerShell providers allow you to access the underlying tree of information in the same way you access the file system. In other words, you can mount your source code as though it were a drive.


I’m really happy about the RoslynDom-Provider. It shows one way to use a .NET Compiler Platform/library to access the information that’s otherwise locked into the compiler trees. It’s also another way for you to find out about the amazing power of PowerShell providers. If you’re new to PowerShell, and you’re a Pluralsight subscriber, check out “Discovering PowerShell with Mark Minasi”. It uses Active Directory as the underlying problem and a few parts may be slow for a developer, but it will give you the gist of it. Follow up with Jim Christopher’s “Everyday PowerShell for Developers” and “PowerShell Gotchas.” If you’d rather read, there are a boatload of awesome books including PowerShell Deep Dives and Windows PowerShell for Developers, and too many Internet sites for me to keep straight.


Getting RoslynDomProvider


This project is available on GitHub.


Code-first Strong-typed Metadata


You can find out more about strong-typed metadata here and code-first strong-typed metadata here.


As a first step, I have samples in runtime T4. These run from the command line at present. These templates inherit from a generic base class that has a property named Meta. This property is typed to the underlying strong-typed metadata item – in the samples either CodeFirstSemanticLog or CodeFirstClass. The EventSource template and problem is significantly more complex, but avoids some extra mind twisting with a strong-typed metadata class around a class. These templates are preliminary and do not handle all scenarios.



Metaprogramming


While there are a couple of ways to solve a metaprogramming expansion or code first problem, I’ve settled on an alternate file extension. The code-first minimal description is in a file with a .cfcs extension. Because I lie to Visual Studio and tell it that this is a C# file (Tools/Options/Text Editor/File Extensions) I get nice IntelliSense for most features (more work to be done later). But because MSBuild doesn’t see it as a C# file, the .cfcs file is ignored as a source file in compilation.


Generation produces an actual source code file with a .g.cs extension. This file becomes part of your project. This is the “real” code and you debug in this “real” code because it’s all the compiler and debugger know about. As a result:


- You write the minimal code that only you can write


- You understand your application through either the minimal or expanded code


- You easily recognize expanded code via a .g.cs extension


- You can place the minimal and expanded code side by side to understand the expansion


- You debug in real code


- You protect the generated code by allowing only the build server to check in these files


Again this happens because there are two clearly differentiated files in your project – the .cfcs file and the .g.cs file.


The intent is to have this automated as part of your normal development pipeline, through one or more mechanisms – build, custom tools, VS extension/PowerShell. The pipeline part is not done yet, but you can grab the necessary pieces from the console application in the example.
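Until that pipeline exists, the file-naming convention itself is easy to sketch. The helper below is a hypothetical illustration (GeneratedFileNaming is not part of CodeFirstMetadata) of pairing a .cfcs file with its .g.cs output and recognizing generated files by extension.

```csharp
using System;
using System.IO;

public static class GeneratedFileNaming
{
    // Maps a minimal description such as "Normal.cfcs" to "Normal.g.cs"
    // in the same directory.
    public static string GetOutputPath(string minimalPath)
    {
        var extension = Path.GetExtension(minimalPath);
        if (!string.Equals(extension, ".cfcs", StringComparison.OrdinalIgnoreCase))
        {
            throw new ArgumentException("Expected a .cfcs file", nameof(minimalPath));
        }
        // ChangeExtension swaps only the final extension, so ".cfcs" becomes ".g.cs".
        return Path.ChangeExtension(minimalPath, ".g.cs");
    }

    // Expanded code is recognizable purely by its extension.
    public static bool IsGenerated(string path)
    {
        return path.EndsWith(".g.cs", StringComparison.OrdinalIgnoreCase);
    }
}
```

A build step or custom tool could use a check like IsGenerated to keep hand edits out of expanded files.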


You can also find more here.



Getting CodeFirstMetadata


You can get this project on GitHub.


I’ll add this to NuGet when the samples are more accessible from your Visual Studio project.


ExpansionFirst Templates


T4 has brought us a very long way. It and CodeSmith have had the lion’s share of code generation templating in the .NET world for about a decade. I have enormous respect for people like Gareth Jones who wrote it and kept it alive and Oleg Sych who taught so many people to use it. But I think it’s time to move on. Look for more upcoming on this – my current bits are so preliminary that I’ll wait to post.


Summary


I look forward to sharing the unfinished pieces of this roadmap in the coming weeks and months.


I’d like to offer a special thanks to the folks in my April DevIntersection workshop. The challenges of explaining the .NET Compiler Platform/Roslyn pieces to you led me to take a step back and isolate those pieces from the rest of the work. While this put me way behind schedule, in the end I think it’s valuable both in simplifying the metaprogramming steps and in offering a wrapper for the .NET Compiler Platform/Roslyn.

Code-first Metadata

This is “code first” in the general sense, not the specific sense of Entity Framework. This has nothing to do with Entity Framework at all, except that team showed us how valuable simple access to code-like metadata is.


Code first is a powerful mechanism for expressing your metadata because code is the most concise way to express many things. There’s 60 years of evolution behind today’s computer languages being efficient in expressing explicit concepts based on a natural contextualization. You can’t get this in JSON, XML or other richer and less-opinionated formats.


Code first is just one approach to getting strong-typed metadata. The keys to the kingdom, the keys to your code, lie in expressing the underlying problems of your code in a strong-typed manner, which you can read about here.


The problem is that the description of the problem is wrapped up with an enormous amount of ceremony about how to do what we’re trying to do. Let’s look at this in relation to metaprogramming where the goal is generally to reduce ceremony and


Only write the code that only you can write



In other words, don’t write any code that isn’t part of the minimum definition of the problem, divorced of all technology artifacts.


For example, you can create a SemanticLog definition that you can later output as an EventSource class, or any other kind of log output – even in a different language or on a different platform.


To do this, describe the SemanticLog in the simplest way possible, devoid of technology artifacts.


namespace ConsoleRunT4Example
{
    [SemanticLog()]
    public class Normal
    {
        public void Message(string Message) { }
        [Event(2)]
        public void AccessByPrimaryKey(int PrimaryKey) { }
    }

}

Instead of the EventSource version:


using System;
using System.Diagnostics.Tracing;

namespace ConsoleRunT4Example
{
    [EventSource(Name = "ConsoleRunT4Example-Normal")]
    public sealed partial class Normal : EventSource
    {
        #region Standard class stuff
        // Private constructor blocks direct instantiation of class
        private Normal() { }

        // Readonly access to cached, lazily created singleton instance
        private static readonly Lazy<Normal> _lazyLog =
            new Lazy<Normal>(() => new Normal());
        public static Normal Log
        {
            get { return _lazyLog.Value; }
        }
        // Readonly access to private cached, lazily created singleton inner class instance
        private static readonly Lazy<Normal> _lazyInnerlog =
            new Lazy<Normal>(() => new Normal());
        private static Normal innerLog
        {
            get { return _lazyInnerlog.Value; }
        }
        #endregion

        #region Your trace event methods

        [Event(1)]
        public void Message(System.String Message)
        {
            if (IsEnabled()) WriteEvent(1, Message);
        }
        [Event(2)]
        public void AccessByPrimaryKey(System.Int32 PrimaryKey)
        {
            if (IsEnabled()) WriteEvent(2, PrimaryKey);
        }
        #endregion
    }
}

Writing less code (10 lines instead of 47) because we are lazy is a noble goal. But the broader benefit here is that the first requires very little effort to understand and very little trust about whether the pattern is followed. The second requires much more effort to read the code and ensure that everything in the class is doing what’s expected. The meaning of the code requires that you know what an EventSource is.


Code-first allows you to just write the code that only you can write, and leave it to the system to create the rest of the code based on your minimal definition.
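To make "the system creates the rest" a little more concrete, here is a minimal sketch that expands one event from the minimal definition into the method text seen in the EventSource version. EventMethodExpander is a hypothetical helper built on plain string concatenation; the actual samples use T4 templates, and this handles only the simple case shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EventMethodExpander
{
    // Builds the text of one EventSource-style method.
    // Each parameter is a (type, name) pair, e.g. ("System.Int32", "PrimaryKey").
    public static string Expand(
        string name, int eventId, IEnumerable<KeyValuePair<string, string>> parameters)
    {
        var list = parameters.ToList();
        // "System.Int32 PrimaryKey" style declarations for the signature.
        var declared = string.Join(", ", list.Select(p => p.Key + " " + p.Value));
        // ", PrimaryKey" style arguments appended after the event id.
        var passed = string.Concat(list.Select(p => ", " + p.Value));

        var lines = new[]
        {
            "[Event(" + eventId + ")]",
            "public void " + name + "(" + declared + ")",
            "{",
            "    if (IsEnabled()) WriteEvent(" + eventId + passed + ");",
            "}"
        };
        return string.Join("\n", lines);
    }
}
```

Calling Expand with "AccessByPrimaryKey", 2 and a single ("System.Int32", "PrimaryKey") pair yields the same five lines as the second method in the expanded class, apart from indentation.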

Strong-typed Metadata

Your code is code and your code is data.

Metaprogramming opens up worlds where you care very much that your code is data. Editor enhancements open up worlds where you care very much that your code is data. Visualizations open up worlds where you care very much that your code is data. And I think that’s only the beginning.

There’s nothing really new about thinking of code as data. Your compiler does it, metaprogramming techniques do it, and delegates and functional programming do it.

So, let’s make your code data. Living breathing strongly-typed data. Strong typing means describing the code in terms of the underlying problem and providing this view as a first class citizen rather than a passing convenience.

Describing the Underlying Problem

I’ll use logging as an example, because the simpler problem of PropertyChanged just happens to have an underlying problem of classes and properties, making it nearly impossible to think about with appropriate abstractions. Class/property/method is only interesting if the underlying problem is about classes, properties and methods.

The logging problem is not class/method – it’s log/log event. When you strongly type the metadata to classes that describe the problem being solved you can reason about code in a much more effective manner. Alternate examples would be classes that express a service, a UI, a stream or an input device like a machine.

I use EventSource for logging, but my metadata describes the problem in a more generalized way – it describes it as a SemanticLog. A SemanticLog looks like a class, and once you create metadata from it, you can create any logging system you want.
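A minimal sketch of what log/log event metadata might look like follows. The class and member names are assumptions made for illustration (the real classes live in the CodeFirstMetadata project and may differ); the point is only that questions get asked in terms of logs and events rather than classes and methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes for illustration; not the CodeFirstMetadata classes.
public class LogEventInfo
{
    public string Name { get; set; } = "";
    public int EventId { get; set; }
}

public class SemanticLogInfo
{
    public string Name { get; set; } = "";
    public string UniqueName { get; set; } = "";
    public bool IncludesInterface { get; set; }
    public bool IsLocalized { get; set; }
    public List<LogEventInfo> Events { get; } = new List<LogEventInfo>();
}

public static class SemanticLogQueries
{
    // A question phrased in terms of the problem: which events
    // still need an id before the log can be generated?
    public static IEnumerable<string> EventsMissingIds(SemanticLogInfo log)
    {
        return log.Events.Where(e => e.EventId == 0).Select(e => e.Name);
    }
}
```

Nothing in these types mentions EventSource, so the same metadata could in principle feed a template for any logging system.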

Your application has a handful of conceptual groups like this. Each conceptual group has a finite set of appropriate types of customization. Your application problem also has a small number of truly unique classes.

Treating Metadata as a First Class Citizen

In the past, metadata has been a messy affair. The actual metadata description of the underlying patterns of your application has been sufficiently difficult to extract that you’ve had no reason to care. Thus, tools like the compiler that treated your code as data simply created the data view they needed and tossed it out as rubbish when they were done.

The .NET Compiler Platform, Roslyn, stops throwing away its data view. It exposes it for us to play with.

Usage Examples

I’m interested in strongly typed metadata to write templates for metaprogramming. I want these templates to be independent of how you are running them – whether they are part of code generation, metaprogramming, a code refactoring or whatever. I also want these templates to be independent of how the metadata is loaded.

Strongly typed metadata works today in T4 templates. My CodeFirstMetadata project has examples.

I’m starting work on expansion first templates and there are many other ways to use strong-typed metadata – both for other metaprogramming techniques and completely different uses. One of the reasons I’m so excited about this project is to see what interesting things people do, once their code is in a strong-typed form. At the very least, I think it will be an approach to visualizations and ensuring your code follows expected patterns. It will be better at ensuring large scale patterns than code analysis rules. Whew! So much fun work to do!!!

Strong-typed Metadata in a T4 Template

Here’s a sample of strong typing in a T4 template

 
[Image: T4 template sample using strong-typed metadata]
 



There’s some gunk at the top to add some assemblies and some using statements for the template itself. The important piece at the top is that the class created by this template is a generic type with a type argument – CodeFirstSemanticLog – that is a strong-typed metadata class. Thus the Meta property of the CodeFirstT4CSharpBase class is a SemanticLog class and understands concepts specific to the SemanticLog, like IncludesInterface. I’ve removed a few variable declarations that are specific to the included T4 files.
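Since the template itself is only available as an image, here is a minimal sketch of the generic-base-class idea in plain C#. TemplateBase, SemanticLogMeta and LogClassTemplate are hypothetical names rather than the CodeFirstT4CSharpBase API; only the Meta property, IncludesInterface, and the TransformText method that runtime T4 templates generate are taken as given, and the interface naming is an assumption.

```csharp
using System;

// Hypothetical strong-typed metadata; stands in for CodeFirstSemanticLog.
public class SemanticLogMeta
{
    public string Name { get; set; } = "";
    public bool IncludesInterface { get; set; }
}

// Generic base: the type argument decides what Meta understands.
public abstract class TemplateBase<TMeta>
{
    public TMeta Meta { get; set; }

    // Runtime T4 templates expose their output through TransformText.
    public abstract string TransformText();
}

public class LogClassTemplate : TemplateBase<SemanticLogMeta>
{
    public override string TransformText()
    {
        // Meta is strong-typed, so IncludesInterface is checked at compile time.
        // The "I" + Name interface naming is an assumption for this sketch.
        var baseList = Meta.IncludesInterface
            ? "EventSource, I" + Meta.Name
            : "EventSource";
        return "public sealed partial class " + Meta.Name + " : " + baseList;
    }
}
```

Because Meta is typed, a misspelled property in the template fails at compile time rather than silently producing wrong output.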