Micro-optimization: the surprising inefficiency of readonly fields

Introduction

Recently I’ve been optimizing the heck out of Noda Time. Most of the time this has been a case of the normal measurement, find bottlenecks, carefully analyse them, lather, rinse, repeat. Yesterday I had a hunch about a particular cost, and decided to experiment… leading to a surprising optimization.

Noda Time’s core types are mostly value types – date/time values are naturally value types, just as DateTime and DateTimeOffset are in the BCL. Noda Time’s types are a bit bigger than most value types, however – the largest being ZonedDateTime, weighing in at 40 bytes in an x64 CLR at the moment. (I can shrink it down to 32 bytes with a bit of messing around, although it’s not terribly pleasant to do so.) The main reason for the bulk is that we have two reference types involved (the time zone and the calendar system), and in Noda Time 2.0 we’re going to have nanosecond resolution instead of tick resolution (so we need 12 bytes just to store a point in time). While this goes against the Class Library Design Guidelines, it would be odd for the smaller types (LocalDate, LocalTime) to be value types and the larger ones to be reference types. Overall, these still feel like value types.

A lot of these value types are logically composed of each other:

  • A LocalDate is a YearMonthDay and a CalendarSystem reference
  • A LocalDateTime is a LocalDate and a LocalTime
  • An OffsetDateTime is a LocalDateTime and an Offset
  • A ZonedDateTime is an OffsetDateTime and a DateTimeZone reference

This leads to a lot of delegation, potentially – asking a ZonedDateTime for its Year could mean asking the OffsetDateTime, which would ask the LocalDateTime, which would ask the LocalDate, which would ask the YearMonthDay. Very nice from a code reuse point of view, but potentially inefficient due to copying data.

Why would there be data copying involved? Well, that’s where this blog post comes in.

Behaviour of value type member invocations

When an instance member (method or property) belonging to a value type is invoked, the exact behaviour depends on the kind of expression it is called on. From the C# 5 spec, section 7.5.5 (where E is the expression the member M is invoked on, and the type declaring M is a value type):

If E is not classified as a variable, then a temporary local variable of E’s type is created and the value of E is assigned to that variable. E is then reclassified as a reference to that temporary local variable. The temporary variable is accessible as this within M, but not in any other way. Thus, only when E is a true variable is it possible for the caller to observe the changes that M makes to this.

So when is a variable not a variable? When it’s readonly… from section 7.6.4 (emphasis mine) :

If T is a struct-type and I identifies an instance field of that class-type:

  • If E is a value, or if the field is readonly and the reference occurs outside an instance constructor of the struct in which the field is declared, then the result is a value, namely the value of the field I in the struct instance given by E.

(There’s a very similar bullet for T being a class-type; the important part is that the field type is a value type

The upshot is that if you have a method call of:

int result = someField.Foo();

then it’s effectively converted into this:

var tmp = someField;
int result = tmp.Foo();

Now if the type of the field is quite a large value type, but Foo() doesn’t modify the value (which it never does within my value types), that’s performing a copy completely unnecessarily.

To see this in action outside Noda Time, I’ve built a little sample app.

Show me the code!

Our example is a simple 256-bit type, composed of 4 Int64 values. The type itself doesn’t do anything useful – it just holds the four values, and exposes them via properties. We then measure how long it takes to sum the four properties lots of times.

using System;
using System.Diagnostics;

public struct Int256
{
    private readonly long bits0;
    private readonly long bits1;
    private readonly long bits2;
    private readonly long bits3;
    
    public Int256(long bits0, long bits1, long bits2, long bits3)
    {
        this.bits0 = bits0;
        this.bits1 = bits1;
        this.bits2 = bits2;
        this.bits3 = bits3;
    }
    
    public long Bits0 { get { return bits0; } }
    public long Bits1 { get { return bits1; } }
    public long Bits2 { get { return bits2; } }
    public long Bits3 { get { return bits3; } }
}

class Test
{
    private readonly Int256 value;

    public Test()
    {
        value = new Int256(1L, 5L, 10L, 100L);
    }
    
    public long TotalValue 
    { 
        get 
        {
            return value.Bits0 + value.Bits1 + value.Bits2 + value.Bits3; 
        }
    }
    
    public void RunTest()
    {
        // Just make sure it’s JITted…
        var sample = TotalValue;
        Stopwatch sw = Stopwatch.StartNew();
        long total = 0;
        for (int i = 0; i < 1000000000; i++)
        {
            total += TotalValue;
        }
        sw.Stop();
        Console.WriteLine("Total time: {0}ms", sw.ElapsedMilliseconds);
    }
    
    static void Main()
    {
        new Test().RunTest();
    }
}

Building this from the command line with /o+ /debug- and running (in a 64-bit CLR, but no RyuJIT) this takes about 20 seconds to run on my laptop. We can make it much faster with just one small change:

class Test
{
    private Int256 value;

    // Code as before
}

The same test now takes about 4 seconds – a 5-fold speed improvement, just by making a field non-readonly. If we look at the IL for the TotalValue property, the copying becomes obvious. Here it is when the field is readonly:

.method public hidebysig specialname instance int64 
        get_TotalValue() cil managed
{
  // Code size       60 (0x3c)
  .maxstack  2
  .locals init (valuetype Int256 V_0,
           valuetype Int256 V_1,
           valuetype Int256 V_2,
           valuetype Int256 V_3)
  IL_0000:  ldarg.0
  IL_0001:  ldfld      valuetype Int256 Test::’value’
  IL_0006:  stloc.0
  IL_0007:  ldloca.s   V_0
  IL_0009:  call       instance int64 Int256::get_Bits0()
  IL_000e:  ldarg.0
  IL_000f:  ldfld      valuetype Int256 Test::’value’
  IL_0014:  stloc.1
  IL_0015:  ldloca.s   V_1
  IL_0017:  call       instance int64 Int256::get_Bits1()
  IL_001c:  add
  IL_001d:  ldarg.0
  IL_001e:  ldfld      valuetype Int256 Test::’value’
  IL_0023:  stloc.2
  IL_0024:  ldloca.s   V_2
  IL_0026:  call       instance int64 Int256::get_Bits2()
  IL_002b:  add
  IL_002c:  ldarg.0
  IL_002d:  ldfld      valuetype Int256 Test::’value’
  IL_0032:  stloc.3
  IL_0033:  ldloca.s   V_3
  IL_0035:  call       instance int64 Int256::get_Bits3()
  IL_003a:  add
  IL_003b:  ret
} // end of method Test::get_TotalValue

And here it is when the field’s not readonly:

.method public hidebysig specialname instance int64 
        get_TotalValue() cil managed
{
  // Code size       48 (0x30)
  .maxstack  8
  IL_0000:  ldarg.0
  IL_0001:  ldflda     valuetype Int256 Test::’value’
  IL_0006:  call       instance int64 Int256::get_Bits0()
  IL_000b:  ldarg.0
  IL_000c:  ldflda     valuetype Int256 Test::’value’
  IL_0011:  call       instance int64 Int256::get_Bits1()
  IL_0016:  add
  IL_0017:  ldarg.0
  IL_0018:  ldflda     valuetype Int256 Test::’value’
  IL_001d:  call       instance int64 Int256::get_Bits2()
  IL_0022:  add
  IL_0023:  ldarg.0
  IL_0024:  ldflda     valuetype Int256 Test::’value’
  IL_0029:  call       instance int64 Int256::get_Bits3()
  IL_002e:  add
  IL_002f:  ret
} // end of method Test::get_TotalValue

Note that it’s still loading the field address (ldflda) four times. You might expect that copying the field onto the stack once via a temporary variable would be faster, but that ends up at about 6.5 seconds on my machine.

There is an optimization which is even faster – moving the totalling property into Int256. That way (with the non-readonly field, still) the total time is less than a second – twenty times faster than the original code!

Conclusion

This isn’t an optimization I’d recommend in general. Most code really doesn’t need to be micro-optimized this hard, and most code doesn’t deal with large value types like the ones in Noda Time. However, I regard Noda Time as a sort of "system level" library, and I don’t ever want someone to decide not to use it on  performance grounds. My benchmarks show that for potentially-frequently-called operations (such as the properties on ZonedDateTime) it really does make a difference, so I’m going to go for it.

I intend to apply a custom attribute to each of these "would normally be readonly" fields to document the intended behaviour of the field – and then when Roslyn is fully released, I’ll probably write a test to validate that all of these fields would still compile if the field were made readonly (e.g. that they’re never assigned to outside the constructor).

Aside from anything else, I find the subtle difference in behaviour between a readonly field and a read/write field fascinating… it’s something I’d been vaguely aware of in the past, but this is the first time that it’s had a practical impact on me. Maybe it’ll never make any difference to your code… but it’s probably worth being aware of anyway.

Anti-pattern: parallel collections

(Note that I’m not talking about "processing collections in parallel, which is definitely not an anti-pattern…)

I figured it was worth starting to blog about anti-patterns I see frequently on Stack Overflow. I realize that some or all of these patterns may be collected elsewhere, but it never hurts to express such things yourself… it’s a good way of internalizing information, aside from anything else. I don’t guarantee that my style of presenting these will stay consistent, but I’ll do what I can…

The anti-patterns themselves are likely to be somewhat language-agnostic, or at the very least common between Java and C#. I’m likely to post code samples in C#, but I don’t expect it to be much of a hindrance to anyone coming from a background in a similar language.

Context

You have related pieces of data about each of several items, and want to keep this data in memory. For example, you’re writing a game and have multiple players, each with a name, score and health.

Anti-pattern

Each kind of data is stored (all the names, all the scores, all the health values) in a separate collection. Typically I see this with arrays. Then each time you need to access related values, you need to make sure you’re using the same index for each collection.

So the code might look like this:

class Game 

    string[] names;
    int[] scores;
    int[] health;

    public Game() 
    { 
        // Initialize all the values appropriately
    } 

    public void PrintScores() 
    { 
        for (int i = 0; i < names.Length; i++) 
        { 
            Console.WriteLine("{0}: {1}", names[i], scores[i]); 
        } 
    } 
}

Preferred approach

The code above fails to represent an entity which seems pretty obvious when you look at the description of the data: a player. Whenever you find yourself describing pieces of data which are closely related, you should make sure you have some kind of representation of that in your code. (In some cases an anonymous type is okay, but often you’ll want a separate named class.)

Once you’ve got that type, you can use a single collection, which makes the code much cleaner to work with.

class Player
{
    public string Name { get; private set; }
    public int Score { get; set; }
    public int Health { get; set; }

    public Player(string name, int health)
    {
        Name = name;
        Health = health;
    }

    // Consider other player-specific operations
}

class Game
{
    List<Player> players;

    public Game()
    {
        // Initialize players appropriately
    }

    public void PrintScores()
    {
        foreach (var player in players)
        {
            Console.WriteLine("{0}: {1}", player.Name, player.Score);
        }
    }
}

Note how we can now use a foreach loop to iterate over our players, because we don’t care need to use the same index for two different collections.

Once you perform this sort of refactoring, you may well find that there are other operations within the Game class which would be better off in the Player class. For example, if you also had a Level property, and increasing that would automatically increase a player’s health and score, then it makes much more sense for that "level up" operation to be in Player than in Game. Without the Player concept, you’d have nowhere else to put the code, but once you’ve identified the relationship between the values, it becomes much simpler to work with.

It’s also much easier to modify a single collection than multiple ones. For example, if we wanted to add or remove a player, we now just need to make a change to a single collection, instead of making sure we perform the same operation to each "single value" collection in the original code. This may sound like a small deal, but it’s easy to make a mistake and miss out on one of the collections somewhere. Likewise if you need to add another related value – like the "level" value described above – it’s much easier to add that in one place than adding another collection and then making sure you do the right thing in every piece of code which changes any of the other collections.

Summary

Any time you find yourself with multiple collections sharing the same keys (whether those are simple list indexes or dictionary keys), think about whether you could have a single collection of a type which composes the values stored in each of the original collections. As well as making it easier to handle the collection data, you may find the new type allows you to encapsulate other operations more sensibly.

Update: what about performance?

As some readers have noted, this transformation can have an impact on performance. Originally, all the scores were kept close together in memory, all the health etc. If you perform a bulk operation on the scores (finding the average score, for example) that locality of reference can have a significant impact. In some cases that may be enough justification to use the parallel collections instead… but this should be a conscious decision, having weighed up the pros and cons and measured the performance impact. Even then, I’d be tempted to encapsulate that PlayerCollection in a separate type, allowing it to implement IEnumerable<Player> where useful. (If you wanted the Player to be mutable, you’d need it to be aware of PlayerCollection itself.)

In almost all these anti-patterns, there will be cases where they’re the lesser of two evils – but novice developers need to be aware of them as anti-patterns to start with. As ever with performance trade-offs, I believe in first deciding on concrete performance goals, then implementing the code in the simplest possible way that meets the non-performance goals, measuring against the performance goals, and tweaking if necessary to achieve them, relying heavily on measurement.

Diagnosing Portable Class Libraries

I’ve always found Portable Class Library (PCL) configuration to be a bit of a mystery. In simple cases, it’s simple: start a new PCL project in Visual Studio, select the environments you want to support, and away you go. But what’s going on under the hood, and what do all the options mean? How do I do this without Visual Studio?

Background: supporting Universal Windows Applications in Noda Time

Recently, I had a feature request for Noda Time to support Windows Phone 8.1 and Universal Windows Applications. This is something I haven’t done before, but it doesn’t sound too hard – it’s just a matter of changing an option in Visual Studio, surely… Unfortunately it’s not that simple for Noda Time. In order to support .NET 3.5 (and some other features which aren’t available in PCLs) the core NodaTime.dll project has multiple configurations which effectively split into "Desktop" and "PCL". This has been achieved by hacking around with the .csproj file manually – nothing particularly clever, but the sort of thing which can’t be done directly in Visual Studio. This means that when I open the project properties in Visual Studio – even when I’m building a PCL version – it treats the project as a regular class library. (It confuses the heck out of ReSharper too in some cases, unfortunately.)

That’s not too hard to work around – I just need to know the name of the profile I want to support. A profile name is basically a single identifier which refers to the configuration of environments you want to support. Unfortunately, it’s pretty opaque – Noda Time 1.2 ships with "Profile2" which means ".NET 4.0, Windows 8 (store apps), Silverlight 4, Windows Phone 7. The simplest way to work out what profile I wanted was to create a new "regular" PCL in Visual Studio, unload the project, manually set the profile to Profile2 in the project file, reload the project, and then enable "Windows Phone 8.1".

That then led to the next problem: Visual Studio 2013 (update 2) wanted to upgrade that sample project… because Profile2 is no longer really an option. You can’t manually set that combination in Visual Studio any more – it doesn’t display anything earlier than Silverlight 5 and Windows Phone 8. I now have two problems:

  • Given that Visual Studio changes the project file for me with a warning when I open it, how can I check what I’ve actually built against? Does msbuild silently do the same thing?
  • How can I find out what profiles are available, in order to select the most appropriate one?

Time to do some research…

Detecting a profile in a built assembly

First things first: if someone gives you a DLL, how can you detect what profile it was built against, and therefore which environments it can run in?

After hunting around in vain for a while, Petr K helped me out on Twitter: the crucial information is in the TargetFrameworkAttribute and its FrameworkName property, which is something like ".NETPortable,Version=v4.0,Profile=Profile2". That can then be used with the FrameworkName class to parse out the profile name without writing any string manipulation code.

Then you just need to find the details of the profile…

Where are all the profiles?

Windows stores the profiles along with the reference assembles, in the .NETPortable framework directory – on my machine for example, that’s "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable". There are then subdirectories v4.0, v4.5 and v4.6 – although it’s not 100% clear what those refer to. Within each of those, there’s a Profile directory, which contains a bunch of Profile[n] directories. Then under each Profile[n] directory, a SupportedFrameworks directory contains XML documents describing the supported frameworks for that profile. Phew! So as an example:

C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable\v4.0\Profile\Profile2

contains the profile information for Profile2, including files:

  • C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable\v4.0\Profile\Profile2\SupportedFrameworks\.NET Framework 4.xml
  • C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable\v4.0\Profile\Profile2\SupportedFrameworks\Silverlight.xml
  • C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable\v4.0\Profile\Profile2\SupportedFrameworks\Windows 8.xml
  • C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETPortable\v4.0\Profile\Profile2\SupportedFrameworks\Windows Phone Silverlight 7.xml

Each of those XML files consists of a single Framework element, with a variety of attributes. Not all elements have the same set of attributes, although everything has an Identifier, Family, Profile, MinimumVersion and DisplayName as far as I’ve seen. Here’s an example – the "Windows Phone Silverlight 7.xml" file listed above:

<?xml version="1.0" encoding="utf-8"?>
<Framework
    Identifier="Silverlight"
    Profile="WindowsPhone*"
    MinimumVersion="4.0"
    Family="WindowsPhone"
    MinimumVisualStudioVersion="10.0"
    MaximumVisualStudioVersion="11.0"
    DisplayName="Windows Phone Silverlight"
    MinimumVersionDisplayName="7"
    PlatformArchitectures="AnyCPU" />

As noted by David Kean in comments, the "Identifier", "Profile" and "MinimumVersion" are mapped to the corresponding <TargetFramework*> properties in MSBuild; the combination of the three is unique for a target, but you need to be careful to look at all the relevant information… don’t just look at the Identifier attribute. Additionally, if there’s a MinimumVersionDisplayName, that’s probably the version you’ll be more familiar with – so in the example above, the displayed minimum version number is 7, which is what people think about for Windows Phone versions, even though the real minimum version number is 4.0 as far as the build is concerned.

Read David’s comment for more details – I don’t want to paraphrase it too much, in case I accidentally write something misleading.

What about the Xamarin frameworks?

As you can see above, Profile2 doesn’t include anything that looks like it’s a Xamarin framework. (Compare that with Profile136 for example, which includes Xamarin.iOS.xml and Xamarin.Android.xml.) But haven’t I previously said that Noda Time 1.2 works seamlessly in Xamarin.iOS and Xamarin.Android? Indeed I have. This is all very confusing, but as far as I’m aware, the Xamarin.* frameworks are only required if you want to build within Xamarin Studio and you’re not on a Windows machine. I could be wrong about that, but certainly there are some profiles which will work with some versions of Xamarin without a Xamarin-specific framework being on the list.

I would welcome more clarity on this, and if I learn any more about it (either in comments here or via other channels) I’ll update this post.

Where should the files go in NuGet packages?

The next issue to consider is that of NuGet packages. A NuGet package can contain multiple binaries for multiple target profiles – for example, the NodaTime NuGet package includes both the desktop and PCL versions. NuGet adds a reference to the right one when you install a package. How can it tell which binary is which? Through a directory structure. Here’s a piece of Noda Time’s nuspec package:

<files>
  <file src="bin\Signed Release\NodaTime.dll" target="lib\net35-Client" />
  <file src="bin\Signed Release\NodaTime.pdb" target="lib\net35-Client" />
  <file src="bin\Signed Release\NodaTime.xml" target="lib\net35-Client" />
  <file src="bin\Signed Release Portable\NodaTime.dll" target="lib\portable-win+net40+sl4+wp7" />
  <file src="bin\Signed Release Portable\NodaTime.pdb" target="lib\portable-win+net40+sl4+wp7" />
  <file src="bin\Signed Release Portable\NodaTime.xml" target="lib\portable-win+net40+sl4+wp7" />
  <file src="**\*.cs" target="src" />
</files>

As you can see, the PCL files end up in lib\portable-win+net40+sl4+wp7. That works for Profile2, but it’s not the right directory for the profile I’ll end up using. So how do I work out what directory to use in the next version? I can look at the NuGet source code, or this handy table of profiles and corresponding NuGet targets. (Interestingly, that lists the directory for Profile2 as portable-net4+sl4+netcore45+wp7, so I’m not entirely sure what’s going on. I suspect the table is more reliable than whenever I got the name from when preparing a Noda Time release.)

This all sounds a bit fiddly. Shouldn’t there be a tool to make it simple?

Yes, yes there should. As it happens, Stephen Cleary has already written one, although I didn’t realise this until I’d also written one myself. Indeed, the table linked above is the output from Stephen’s tool.

My own tool is called PclPal, and the source is available on GitHub. Don’t expect polished code (or tests) – this was very much a quick and dirty hack to make life slightly easier. It could be the basis of a more polished tool – possibly even a "PCL profile explorer" or something similar. Oh, and it requires the Roslyn End User Preview.

Currently it’s just a command line tool with three options:

  • "list" – lists the profiles on your system
  • "dll <filename>" – shows the details of the profile a particular assembly was built against
  • "show <profile name>" – shows the details of a specify profile

Note that currently I’m not doing anything in terms of NuGet package directories, although that’s probably the next step, if I take this any further. (I may just use Stephen’s tool, of course…)

Conclusion

The world of PCLs is a bit of a mess, to say the least. It’s possible that there’s lots of really clear, up-to-date documentation explaining everything somewhere – but I have yet to find it. I suspect the main problem is that this is a moving target in all kinds of ways. Versioning is never easy, and between all the different variations possible here, I’m not entirely surprised it’s confusing.

Hopefully this post will help to shed a bit more light on things – please feel very free to leave corrections, as I definitely don’t want to propagate any falsehoods. I’ve definitely learned a thing or two in the last few days, and with tools such as PclPal and Stephen’s PortableLibraryProfiles, life should be that little bit easier…

Extension methods, explicitly implemented interfaces and collection initializers

This post is the answer to yesterday’s brainteaser. As a reminder, I was asking what purpose this code might have:

public static class Extensions 

    public static void Add<T>(this ICollection<T> source, T item) 
    { 
        source.Add(item); 
    } 
}

There are plenty of answers, varying from completely incorrect (sorry!) to pretty much spot on.

As many people noticed, ICollection<T> already has an Add method taking an item of type T. So what difference could this make? Well, consider LinkedList<T>, which implements ICollection<T>, used as below:

// Invalid
LinkedList<int> list = new LinkedList<int>();
list.Add(10);

That’s not valid code (normally)…. whereas this is:

// Valid
ICollection<int> list = new LinkedList<int>();
list.Add(10);

The only difference is the compile-time type of the list variable – and that changes things because LinkedList<T> implements ICollection<T>.Add using explicit interface implementation. (Basically you’re encouraged to use AddFirst and AddLast instead, to make it clear which end you’re adding to. Add is equivalent to AddLast.)

Now consider the invalid code above, but with the brainteaser extension method in place. Now it’s a perfectly valid call to the extension method, which happens to delegate straight to the ICollection<T> implementation. Great! But why bother? Surely we can just cast list if we really want to:

LinkedList<int> list = new LinkedList<int>();
((IList<int>)list).Add(10);

That’s ugly (really ugly) – but it does work. But what about situations where you can’t cast? They’re pretty rare, but they do exist. Case in point: collection initializers. This is where the C# 6 connection comes in. As of C# 6 (at least so far…) collection initializers have changed so that an appropriate Add extension method is also permitted. So for example:

// Sometimes valid :)
LinkedList<int> list = new LinkedList<int> { 10, 20 };

That’s invalid in C# 5 whatever you do, and it’s only valid in C# 6 when you’ve got a suitable extension method in place, such as the one in yesterday’s post. There’s nothing to say the extension method has to be on ICollection<T>. While it might feel nice to be general, most implementations of ICollection<T> which use explicit interface implementation for ICollection<T>.Add do so for a very good reason. With the extension method in place, this is valid too…

// Valid from a language point of view…
ReadOnlyCollection<int> collection = new ReadOnlyCollection<int>(new[] { 10, 20 }) { 30, 40 };

That will compile, but it’s obviously not going to succeed at execution time. (It throws NotSupportedException.)

Conclusion

I don’t think I’d ever actually use the extension method I showed yesterday… but that’s not quite the same thing as it being useless, particularly when coupled with C# 6’s new-and-improved collection initializers. (The indexer functionality means you can use collection initializers with ConcurrentDictionary<,> even without extension methods, by the way.)

Explicit interface implementation is an interesting little corner to C# which is easy to forget about when you look at code – and which doesn’t play nicely with dynamic typing, as I’ve mentioned before.

And finally…

Around the same time as I posted the brainteaser yesterday, I also remarked on how unfortunate it was that StringBuilder didn’t implement IEnumerable<char>. It’s not that I really want to iterate over a StringBuilder… but if it implemented IEnumerable, I could use it with a collection initializer, having added some extension methods. This would have been wonderfully evil…

using System;
using System.Text; 

public static class Extensions  
{  
    public static void Add(this StringBuilder builder, string text)
    {  
        builder.AppendLine(text);
    }  

    public static void Add(this StringBuilder builder, string format, params object[] arguments)
    {  
        builder.AppendFormat(format, arguments);
        builder.AppendLine();
    }  

class Test
{
    static void Main()
    {
        // Sadly invalid :(
        var builder = new StringBuilder
        {
            "Just a plain message",
            { "A message with formatting, recorded at {0}", DateTime.Now }
        };
    }
}

Unfortunately it’s not to be. But watch this space – I’m sure I’ll find some nasty ways of abusing C# 6…

Quick brainteaser

Just a really quick one today…

What’s the point of this code? Does it have any point at all?

public static class Extensions
{
    public static void Add<T>(this ICollection<T> source, T item)
    {
        source.Add(item);
    }
}

Bonus marks if you can work out what made me think about it.

I suggest you ROT-13 answers to avoid spoilers for other readers.

C# 6: First reactions

It’s been a scandalously long time since I’ve blogged about C#, and now that the first C# 6 preview bits are available, that feels like exactly the right thing to set the keys clacking again. Don’t expect anything massively insightful from me just yet; I’d heard Mads and Dustin (individually) talk about some new features of C# 6 at conferences, but this is the first opportunity I’ve had to play with the bits. There are more features to come, and I suspect that the ones we’ve got aren’t in quite the shape they’ll be in the end.

First up, if you haven’t been following Build, here are some of the resources to get you going:

Firstly, the fact that Roslyn is now open source (under the Apache 2 licence, no less) – this is incredible news. I gather it’s already working under Mono (not sure whether that’s in its entirety, or just the most important parts) which can only be a good thing. I don’t know whether I’ll have the time and energy to fork the code and try to implement any little features of my own, but it’s definitely nice to know that I could. I’ll definitely try to poke around the source code to learn good patterns for working with immutability in C#, if nothing else.

I’m not going to list the C# 6 features and go through them – read the docs that come with Roslyn for that. This is more in the way of commentary and early feedback.

Initializers for automatically implemented properties and primary constructors

I’ll talk about these two together because they’re so clearly related… which is part my beef with them. Let’s start out with the positive side: for very simple cases, these really will be wonderful. For times where you’re really just trying to wrap up a bunch of values with no additional validation, it’s really great. For example, in Noda Time I have a struct that currently looks like this:

internal struct Transition : IEquatable<Transition>
{
    private readonly Instant instant;
    private readonly Offset oldOffset;
    private readonly Offset newOffset;

    // Note: no property for oldOffset 
    public Offset NewOffset { get { return newOffset; } } 
    public Instant Instant { get { return instant; } }

    public Transition(Instant instant, Offset oldOffset, Offset newOffset)
    {
        this.instant = instant;
        this.oldOffset = oldOffset;
        this.newOffset = newOffset;
    }

    // Methods
}

(In case you’re wondering: no, there’s no reason for the constructor and properties to be public within an internal struct. I’ll fix that shortly :)

In C# 6, that would be as simple as:

internal struct Transition(Instant instant, private Offset oldOffset, Offset newOffset) : IEquatable<Transition>
{
    public Instant Instant { get; } = instant;
    public Offset NewOffset { get; } = newOffset;

    // Methods
}

This example, entirely coincidentally, shows how you can generate both fields and properties from constructor parameters.

Yay for read-only automatically implemented properties (although at-declaration initialization isn’t just restricted to read-only ones), yay for read-only fields without boilerplate constructor code, and all is generally goodness. Except… a lot of my types don’t end up looking like that. There are three main issues – validation, accessibility, and structs. Validation is already mentioned in the release notes as a potential cause for concern, and that one actually can be ameliorated to some extent. Accessibility is the assumption – as far as I can see – that the primary constructor should have the same accessibility as the class itself. Struct initialization is a relatively rare issue, but one worth considering.

LocalDateTime in Noda Time highlights all three of these issues in one handy parcel. Here’s some of the code:

public struct LocalDateTime // and interfaces
{
    private readonly CalendarSystem calendar;
    private readonly LocalInstant localInstant;

    internal LocalInstant LocalInstant { get { return localInstant; } }
    public CalendarSystem Calendar
    {
        get { return calendar ?? CalendarSystem.Iso; }
    }
    internal LocalDateTime(LocalInstant localInstant, CalendarSystem calendar)
    {
        Preconditions.CheckNotNull(calendar, "calendar");
        this.localInstant = localInstant;
        this.calendar = calendar;
    }

    // Other constructors etc
}

In C# 6, I could *sort of* achieve a similar result, like this:

public struct LocalDateTime(LocalInstant localInstant, CalendarSystem calendar) // and interfaces
{
    private CalendarSystem Calendar { get; }  = Preconditions.CheckNotNull(calendar, "calendar");
    internal LocalInstant LocalInstant { get; } = localInstant;

    // Other constructors etc
}

I’ve worked around the validation, by putting it in the property initialization. That’s not too bad, but it does potentially mean that your previously-centralized validation is scattered around the code a bit more – it lacks clarity.

We now have a public constructor instead of an internal one – which in this case wouldn’t work for me at all, as LocalInstant is an internal type. In fact, LocalDateTime does have some public constructors, but quite often I create types with no public constructors at all, just private constructors and static factory methods. Unless I’ve missed some way of separating the accessibility of the constructor from the accessibility of the type, primary constructors simply won’t be applicable for those types.

Finally, the struct part. Note how the Calendar property in the original code uses the null-coalescing operator. That’s because values of structs can always be created without going through a constructor, making the validation less helpful than it would otherwise be. In Noda Time I’ve attempted to make the default values for all structs act as if they’re valid values, even if they’re not useful values. (ZonedDateTime uses a time zone of UTC in a similar fashion.) I suspect there’s no equivalent for this using read-only automatically implemented properties – but the fix here would probably be to use an "auto read-only private field" and back that with a property – not too much of a hardship.

I think what bothers me most about this pair of features is how tightly coupled they are. If you don’t want to have a primary constructor (because you want to have more logic than it allows, or because you want to change the accessibility), but still want to initialize read-only properties, you’re forced back to declaring a separate private backing field. I think it would actually be reasonable to treat read-only properties like read-only fields, and allow them to be initialized from normal constructors too. I doubt that I’ll prevail, but I’ll certainly make the point to the team. (Heck, this may be the thing I try to implement in a Roslyn fork, just for kicks. How hard can it be? ;)

Again, I want to reiterate that in simple cases – even including where there’s simple validation, using helper methods – these features will be absolutely wonderful. I’d just like to be able to use them more often than it currently looks like I’d be able to.

Using static

Hooray! We’ll be able to select extension methods from a single type instead of from a whole namespace, like I asked for in 2005. I’m not the only one who’s been badgering the team on this for a while, so it’s great to see that it’s made it. I’m not 100% keen on the fact that it looks like any other using directive – I think it would help with clarify if it were "using static …" instead, but I’m not going to beef about that. (EDIT: I see from the language design notes that this option was considered as recently as February 2014.)

Note that this will further simplify the validation in the example above – I can see myself adding a using directive for Preconditions very frequently. It will also mean that I might get round to adding a load of extension methods into NodaTime.Testing – I didn’t want to introduce one namespace per kind of extension method, but this will be much cleaner. (I’ll still use a separate namespace to shield any pre-C#-6 users from seeing a slew of them, mind you.)

Declaration expressions

The language guide has this to say near the end:

Declaration expressions take a little getting used to, and can probably be abused in new and interesting ways.

I’m going to reserve judgement here. I’m instinctively against them as I’m sure they’ll encourage out parameter usage… but I suspect that’s the point – that out parameters aren’t too bad when they’re easy to use. It still feels like a little bit of a workaround – the language is still really designed to return a single value, and out parameters help you out when you’ve got some good reason to want to return more than one value. The big question is whether returning more than one value is a good thing or not. If it’s not – if it should only be done in extremis – then we shouldn’t be making it easier. If it is a good thing, might there not be better ways of making it easier? Maybe not – aside from anything else, this is a relatively small change, compared with alternatives. It seems likely that a platform/language designed with multiple return values in mind would not use the same approach, but that ship has sailed.

The beneficial side effect is the cleaner way of using the "as" operator to both assign and test in one go, as per the example in the guide:

if ((string s = o as string) != null) { … }

While this clearly lack the joyous vulgarity of my alternative workaround:

for (string s = o as string; s != null; s = null) { … }

… I have to grudgingly say I prefer it overall.

I’m not quite egocentric enough to think that Mads specifically had me in mind with the phrase "and can probably be abused in new and interesting ways" but I’ll obviously do my best to find something.

Anything else?

The other features (exception filters, binary literals, separators in literals, indexed members and element initializers, await in catch and finally blocks, and extension Add methods) all seem fairly reasonable to me – or at least, I don’t think I have anything interesting to say about them yet. I’m hopeful that exception filters could create some really interesting mayhem due to the timing of when the condition is evaluated, but I haven’t got as far as abusing it in a genuinely glorious way yet.

Conclusion

I’m really, really looking forward to C# 6. Despite my reservations about primary constructors and read-only automatically implemented properties, they’re still my favourite features – and ones which I hope can be made more widely useful before the final release.

The question I naturally ask myself is, "Where will this make Noda Time cleaner?" (Bear in mind that I’m still just a hobbyist in C# – Noda Time is as close to a "production" codebase as I have.) I can certainly see myself updating Noda Time to require C# 6 to build (but keeping the .NET 3.5 library requirement) fairly soon after the final release – there’s plenty that will make life simpler… and more to come, I hope.

So many thanks, C# team – and I’ll look forward to the next preview!

How many 32-bit types might we want?

I was recently directed to an article on "tiny types" – an approach to static typing which introduces distinct types for the sake of code clarity, rather than to add particular behaviour to each type. As I understand it, they’re like type aliases with no conversions between the various types. (Unlike plain aliases, an object is genuinely an instance of the relevant tiny type – it doesn’t have "alias erasure" as a language-based solution could easily do.)

I like the idea, and wish it were better supported in languages – but it led me to thinking more about the existing numeric types that we’ve got and how they’re used. In particular, I was considering how in C# the "byte" type is relatively rarely used as a number, with a useful magnitude which has arithmetic performed on it. That does happen, but more often it’s used either as part of other types (e.g. converting 4 bytes from a stream into an integer) or as a sequence of 8 bits.

It then struck me that the situations where we perform bitwise operations and the situations where we perform arithmetic operations are reasonably distinct. I wonder whether it would be worth having five separate types – which could be purely at the language level, if we wanted:

  • Float32 – regular IEEE-754 32-bit binary floating point type with arithmetic operations but no bitwise operations
  • Int32 – regular signed integer with arithmetic operations but no bitwise operations
  • UInt32 – unsigned integer with arithmetic operations but no bitwise operations
  • Bit32 – a bit sequence with bitwise operations but no arithmetic operations
  • Identity32 – a 32-bit value which only defines equality

The last type would be used for identities which happened to consist of 32 bits but where the actual bit values were only useful in terms of comparison with other identities. (In this case, an Identity64 might be more useful in practice.)

Explicit conversions which preserved the underlying bit pattern would be available, so you could easily generate a sequence of Identity32 values by having a simple Int32 counter, for example.

At the same time, I’d want to introduce bitwise operations for Bit8 and Bit16 values, rather than the "everything is promoted to 32 bits" that we currently have in Java and C#, reducing the number of pointless casts for code that performs bitwise operations.

The expected benefits would be the increased friction when using a type in an "unexpected" way for the value being represented – you’d need to explicitly cast to the right type – but maintaining a low-friction path when using a value in the expected way.

I haven’t yet figured out what I’d want a Stream.Read(…) call to use. Probably Bit32, but I’m not sure yet.

Anyway, I figured I’d just throw it out there as a thought experiment… any reactions?

Diagnosing issues with reversible data transformations

I see a lot of problems which look somewhat different at first glance, but all have the same cause:

  • Text is losing "special characters" when I transfer it from one computer to another
  • Decryption ends up with garbage
  • Compressed data can’t be decompressed
  • I can transfer text but not binary data

These are all cases of transforming and (usually) transferring data, and then performing the reverse transformation. Often there are multiple transformations involved, and they need to be carefully reversed in the appropriate order. For example:

  1. Convert text to binary using UTF-8
  2. Compress
  3. Encrypt
  4. Base64-encode
  5. Transfer (e.g. as text in XML)
  6. Base64-decode
  7. Decrypt
  8. Decompress
  9. Convert binary back to text using UTF-8

The actual details of each question can be different, but the way I’d diagnose them is the same in each case. That’s what this post is about – partly so that I can just link to it when such questions arise. Although I’ve numbered the broad steps, it’s one of those constant iteration situations – you may well need to tweak the logging before you can usefully reduce the problem, and so on.

1. Reduce the problem as far as possible

This is just my normal advice for almost any problem, but it’s particularly relevant in this kind of question.

  • Start by assembling a complete program demonstrating nothing but the transformations. Using a single program which goes in both directions is simpler than producing two programs, one in each direction.
  • Remove pairs of transformations (e.g. encrypt/decrypt) at a time, until you’ve got the minimal set which demonstrates the problem
  • Avoid file IO if possible: hard-code short sample data which demonstrates the problem, and use in-memory streams (ByteArrayInputStream/ByteArrayOutputStream in Java; MemoryStream in .NET) for temporary results
  • If you’re performing encryption, hard-code a dummy key or generate it as part of the program.
  • Remove irrelevant 3rd party dependencies if possible (it’s simpler to reproduce an issue if I don’t need to download other libraries first)
  • Include enough logging (just console output, usually) to make it obvious where the discrepancy lies

In my experience, this is often enough to help you fix the problem for yourself – but if you don’t, you’ll be in a much better position for others to help you.

2. Make sure you’re really diagnosing the right data

It’s quite easy to get confused when comparing expected and actual (or before and after) data… particularly strings:

  • Character encoding issues can sometimes be hidden by spurious control characters being introduced invisibly
  • Fonts that can’t display all characters can make it hard to see the real data
  • Debuggers sometimes "helpfully" escape data for you
  • Variable-width fonts can make whitespace differences hard to spot

For diagnostic purposes, I find it useful to be able to log the raw UTF-16 code units which make up a string in both .NET and Java. For example, in .NET:

static void LogUtf16(string input)
{
    // Replace Console.WriteLine with your logging approach
    Console.WriteLine("Length: {0}", input.Length);
    foreach (char c in input)
    {
        Console.WriteLine("U+{0:x4}: {1}", (uint) c, c);
    }
}

Binary data has different issues, mostly in terms of displaying it in some form to start with. Our diagnosis tools are primarily textual, so you’ll need to perform some kind of conversion to a string representation in order to see it at all. If you’re really trying to diagnose this as binary data (so you’re interested in the raw bytes) do not treat it as encoded text using UTF-8 or something similar. Hex is probably the simplest representation that allows differences to be pinpointed pretty simply. Again, logging the length of the data in question is a good first step.

In both cases you might want to include a hash in your diagnostics. It doesn’t need to be a cryptographically secure hash in any way, shape or form. Any form of hash that is likely to change if the data changes is a good start. Just make sure you can trust your hashing code! (Every piece of code you write should be considered suspect – including whatever you decide to use for converting binary to hex, for example. Use trusted third parties and APIs provided with your target platform where possible. Even though adding an extra dependency for diagnostics makes it slightly harder for others to reproduce the problem, it’s better than the diagnostics themselves being suspect.)

3. Analyze a clean, full, end-to-end log

This can be tricky when you’ve got multiple systems and platforms (which is why if you can possibly reproduce it in a single program it makes life simpler) but it’s really important to look at one log for a complete run.

Make sure you’re talking about the same data end-to-end. If you’re analyzing live traffic (which should be rare; unless the problem is very intermittent, this should all be done in test environments or a developer machine) or you have a shared test environment you need to be careful that you don’t use part of the data from one test and part of the data from another test. I know this sounds trivial, but it’s a really easy mistake to make. In particular, don’t assume that the data you’ll get from one part of the process will be the same run-to-run. In many cases it should be, but if the overall system isn’t working, then you already know that one of your expectations is invalid.

Compare "supposed to be equal" parts of the data. As per the steps in the introduction, there should be pairs of equal data, moving from the "top and bottom" of the transformation chain towards the middle. Initially, you shouldn’t care about whether you view the transformation as being correct – you’re only worried about whether the output is equal to the input. If you’ve managed to preserve all the data, the function of the transformation (encryption, compression etc) becomes relevant – but if you’re losing data, anything else is secondary. This is where the hash from the bottom of step 2 is relevant: you want to be able to determine whether the data is probably right as quickly as possible. Between "length" and "hash", you should have at least some confidence, which will let you get to the most likely problem as quickly as possible.

4. Profit! (Conclusion…)

Once you’ve compared the results at each step, you should get an idea of which transformations are working and which aren’t. This may allow you to reduce the problem further, until you’ve just got a single transformation to diagnose. At that point, the problem becomes about encryption, or about compression, or about text encoding.

Depending on the situation you’re in, at this point you may be able to try multiple implementations or potentially multiple platforms to work out what’s wrong: for example, if you’re producing a zip file and then trying to decompress it, you might want to try using a regular decompression program to open your intermediate results, or decompress the results of compressing with a standard compression tool. Or if you’re trying to encrypt on Java and decrypt in C#, implement the other parts in each platform, so you can at least try to get a working "in-platform" solution – that may well be enough to find out which half has the problem.

To some extent all this blog post is about is reducing the problem as far as possible, with some ideas of how to do that. I haven’t tried to warn you much about the problems you can run into in any particular domain, but you should definitely read Marc Gravell’s excellent post on "How many ways can you mess up IO?" and my earlier post on understanding the meaning of your data is pretty relevant too.

As this is a "hints and tips" sort of post, I’ll happily modify it to include reader contributions from comments. With any luck it’ll be a useful resource for multiple Stack Overflow questions in the months and years to come…

A tale of two puzzles

As I begin to write this, I’m in a small cubicle in Philadelphia airport, on my way back from CodeMash – a wonderful conference (yet again) which I feel privileged to have attended. Personal top highlights definitely include Dustin Campbell’s talk on C# 6 (I’m practically dribbling with anticipation – bits please!) and playing Settlers of Catan on an enormous board. Kevin Pilch-Bisson’s talk on scriptcs was fabulous too, not least because he demonstrated its NuGet support using Noda Time. I’m very likely to use scriptcs as a tool to ease folks into the language gently if I ever get round to writing my introductory C# book. (To follow my own advice from the talk I gave, I should really just start writing it – whether the initial attempt is good or not, it’s better than having nothing.)

First puzzle

During Dustin’s talk, he set some puzzles around refactoring – where an "inline" operation had to be smarter than it originally appeared, for various reasons. The final puzzle particularly piqued my interest, as Dustin explained that various members of the C# team hadn’t actually worked out why the code behaved the way it did… the refactoring in Roslyn worked in that it didn’t change the behaviour, but more through a good general strategy than any attempt to handle this specific case.

So, the code in question is:

using System;

class X
{
    static int M(Func<int?, byte> x, object y) { return 1; }
    static int M(Func<X, byte> x, string y) { return 2; }

    const int Value = 1000;

    static void Main()
    {
        var a = M(X => (byte) X.Value, null);

        unchecked
        {
            Console.WriteLine(a);
            Console.WriteLine(M(X => (byte) X.Value, null));
        }
    }
}

This produces output of:

1
2

Is this correct? Is it a compiler bug from the old native compiler, faithfully reproduced in Roslyn? Why would moving the expression into an "unchecked" block cause different behaviour?

In an attempt to give you enough space and time to avoid accidentally reading the answer, which is explained below, I’d like to take the time to mention the source of this puzzle. A few years ago, Microsoft hired Vladmir Reshetnikov into testing part of the C#/VB team. Vladmir had previously worked at JetBrains on Resharper, so it’s not like he was new to C#. Every year when I see Kevin and Dustin at CodeMash, they give me more reports of the crazy things Vladmir has come up with – perversions of the language way beyond my usual fare. The stories are always highly entertaining, and I hope to meet Vladmir in person one day. He has a twitter account which I’ve only just started following, but which I suspect I’ll find immensely distracting.

Explanation

I massively overthought this for a while. I then significantly underthought it for a while. I played around with the code by:

  • Removing the second parameter from M
  • Changing the name of the parameter
  • Changing the parameter types in various ways
  • Making X.Value non-const (just static readonly)
  • Changing the value of X.Value to 100
  • Changing the lambda expression to make a call to a method using (byte) X.Value instead of returning it directly
  • Using a statement lambda

These gave me clues which I then failed to follow up on properly – partly because I was looking for something complicated. I was considering the categorization of a "cast of a constant value to a different numeric type" and similar details – instead of following through the compiler rules in a systematic way.

The expression we need to focus on is this:

M(X => (byte) X.Value, null)

This is just a method call using a lambda expression, using overload resolution to determine which overload to call and type inference to determine the type arguments to the method. In a very crude description of overload resolution, we perform the following steps:

  • Determine which overloads are applicable (i.e. which ones make sense in terms of the supplied arguments and the corresponding parameters)
  • Compare the applicable overload to each other in terms of "betterness"
  • If one applicable overload is "better" than all the others, use that (modulo a bit more final checking)
  • If there are no applicable overload, or no one applicable method is better than all the others, then the call is invalid and leads to a compile-time error

My mistake was jumping straight to the second bullet point, assuming that both overloads are valid in each case.

Overload 1 – a simple parameter

First let’s look at the first overload: the one where the first parameter is of type Func<int?, byte>. What does the lambda expression of X => (byte) X.Value mean when converted to that delegate type? Is it always valid?

The tricky part is working out what the simple-name X refers to as part of X.Value within the lambda expression. The important part of the spec here is the start of section 7.6.2 (simple names):

A simple-name is either of the form I or of the form I<A1, …, AK>, where I is a single identifier and <A1, …, AK> is an optional type-argument-list. When no type-argument-list is specified, consider K to be zero. The simple-name is evaluated and classified as follows:

  • If K is zero and the simple-name appears within a block and if the block’s (or an enclosing block’s) local variable declaration space (§3.3) contains a local variable, parameter or constant with name I, then the simple-name refers to that local variable, parameter or constant and is classified as a variable or value.

(The spec continues with other cases.) So X refers to the lambda expression parameter, which is of type int? – so X.Value refers to the underlying value of the parameter, in the normal way.

Overload 2 – a bit more complexity

What about the second overload? Here the first parameter to M is of type Func<X, byte>, so we’re trying to convert the same lambda expression to that type. Here, the same part of section 7.6.2 is used, but also section 7.6.4.1 comes into play when determining the meaning of the member access expression X.Value:

In a member access of the form E.I, if E is a single identifier, and if the meaning of E as a simple-name (§7.6.2) is a constant, field, property, local variable, or parameter with the same type as the meaning of E as a type-name (§3.8), then both possible meanings of E are permitted. The two possible meanings of E.I are never ambiguous, since I must necessarily be a member of the type E in both cases. In other words, the rule simply permits access to the static members and nested types of E where a compile-time error would otherwise have occurred.

It’s not spelled out explicitly here, but the reason that it can’t be ambiguous is that you can’t have two members of the same name within the same type (other than for method overloads). You can have method overloads that are both static and instance methods, but then normal method overloading rules are enforced.

So in this case, X.Value doesn’t involve the use of the parameter called X at all. Instead, it’s the constant called value within the class X. Note that this is only the case because the type of the parameter is the same as the type that the parameter’s name refers to. (It isn’t quite the same as the names being the same. If you have a using directive introducing an alias, such as using Y = X; then the condition can be satisfied with different names.)

So what difference does unchecked make?

Now that we know what the conversions from the lambda expression to the two different delegate types mean, we can consider whether they’re always valid. Taking the overload resolution out of the picture, when is each of these lines valid at compile-time?

Func<int?, byte> foo = X => (byte) X.Value;
Func<X, byte> bar = X => (byte) X.Value;

Note that we’re not trying to consider whether they might throw an exception – clearly if you call foo(null) then that will fail. But will it at least compile?

The first line is always valid… but the second depends on your context, in terms of checked or unchecked arithmetic. For the most part, if you’re not explicitly in a checked or unchecked context, the language assumes a default context of unchecked, but allows "external factors" (such as a compiler flag). The Microsoft compiler is unchecked by default, and you can make an assembly use checked arithmetic "globally" using the /checked compiler argument (or going to the Advanced settings in the build properties of a Visual Studio project.

However, one exception to this "default" rule is constant expressions. From section 7.6.12 of the spec:

For constant expressions (expressions that can be fully evaluated at compile-time), the default overflow checking context is always checked. Unless a constant expression is explicitly placed in an unchecked context, overflows that occur during the compile-time evaluation of the expression always cause compile-time errors.

The cast of X.Value to byte is still a constant expression – and it’s one that overflows at compile-time, because 1000 is outside the range of byte. So unless we’re explicitly in an unchecked context, the code snippet above will fail for the second line.

Back to overloading

Given all of that, how does it fit in with our problem? Well, at this point it’s reasonably simple – so long as we’re careful. Look at the first assignment, which occurs outside the unchecked statement:

var a = M(X => (byte) X.Value, null);

Which overloads are applicable here? The second argument isn’t a problem – the null literal is convertible to both object and string. So can we convert the first argument (the lambda expression) to the first parameter of each overload?

As noted earlier, it’s fine to convert it to a Func<int?, byte>. So the first method is definitely applicable. However, the second method requires us to convert the lambda expression to Func<X, byte>. We’re not in an explicitly unchecked context, so the conversion doesn’t work. Where the rule quoted above talks about "overflows that occur during the compile-time evaluation of the expression always cause compile-time errors" it doesn’t mean we actually see an error message when the compiler speculatively tries to convert the lambda expression to the Func<X, byte> – it just means that the conversion isn’t valid.

So, there’s only one applicable method, and that gets picked and executed. As a result, the value of a is 1.

Now in the second call (inside the unchecked statement) both methods are applicable: the conversion from the lambda expression to the Func<X, byte> is valid because the constant conversion occurs within an explicitly unchecked context. We end up with two applicable methods, and need to see if one of them is definitively better than the other. This comes down to section 7.5.3.2 of the C #5 spec, which gives details of a "better function member" test. This in turn ends up talking about better conversions from arguments to parameter types. The lambda expression conversions are effectively equal, but the conversion of null to string is "better than" the conversion to object, because there’s an implicit conversion from string to object but not vice versa. (In that sense the string conversion is "more specific" to borrow Java terminology.) So the second overload is picked, and we print 2 to the console.

Phew!

Summary

This all comes down to the cast from a constant to byte being valid in an explicitly unchecked context, but not otherwise.

One odd part is that I spotted that reasonably quickly, and assumed the problem was actually fairly straightforward. It’s only when I started writing it up for this blog post that I dug into the spec to check the exact rules for the meaning of simple names in different situations. I’d solved the crux of the puzzle, but the exact details were somewhat involved. Indeed, while writing this blog post I’ve started two separate emails to the C# team reporting potential (equal and opposite) bugs – before discarding those emails as I got further into the specification.

This is reasonably common, in my experience: the C# language has been well designed so that we can intuitively understand the meaning of code without knowing the details of the exact rules. Those rules can be pretty labyrinthine sometimes, because the language is very precisely specified – but most of the time you don’t need to read through the rules in detail.

Didn’t you mention two puzzles?

Yes, yes I did.

The second puzzle is much simpler to state – and much harder to solve. To quote Vladmir’s tweet verbatim:

C# quiz: write a valid C# program containing a sequence of three tokens `? null :` that remains valid after we remove the `null`.

Constructing a valid program containing "? null :" is trivial.

Constructing a valid program containing "? :" is slightly trickier, but not really hard.

Constructing a program which is valid both with and without the null token is too hard for me, at the moment. I’m stuck – and I know I’m in good company, as Kevin, Dustin and Bill Wagner were all stuck too, last time I heard. I know of two people who’ve solved this so far… but I’m determined to get it…