Overload resolution and null

My colleague Soundar discovered this rather interesting behavior.

  1: class Test
  2: {
  3:     public static void Main()
  4:     {
  5:         Test test = null;
  6: 
  7:         Console.WriteLine("{0}", test);
  8:         Console.WriteLine("{0}", null);
  9:     }
 10: }


If you run this code, you’ll find that line 7 prints an empty line, whereas line 8 throws an ArgumentNullException. The test reference is null too, so it should certainly surprise you that the two lines behave differently at runtime.



It certainly surprised me enough to make me dig deeper into the reason for the difference. I reasoned that since the argument values are identical at runtime, the discrepancy must come from something the compiler does, most likely overload resolution. And sure enough, the two calls resolve to different overloads.



Line 7 resolves to



public static void WriteLine(string format, object arg0);


whereas Line 8 resolves to



public static void WriteLine(string format, params object[] arg);


A peek at the source code using Reflector showed that the second overload throws if arg is null, whereas the first one packs arg0 into an object array and calls the second overload.
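
To make that runtime difference concrete, here’s a minimal sketch of the behavior described above. It is not the actual framework code: WriteLineSketch, FormatOne and FormatMany are made-up stand-ins for the two WriteLine overloads, given different names so that overload resolution doesn’t enter the picture yet.

```csharp
using System;

static class WriteLineSketch
{
    // Stand-in for WriteLine(string format, object arg0):
    // wraps the single argument in an array and forwards it.
    public static string FormatOne(string format, object arg0)
    {
        return FormatMany(format, new object[] { arg0 });
    }

    // Stand-in for WriteLine(string format, params object[] arg):
    // rejects a null array outright.
    public static string FormatMany(string format, object[] arg)
    {
        if (arg == null)
            throw new ArgumentNullException("arg");
        return string.Format(format, arg);
    }
}
```

With these, WriteLineSketch.FormatOne("{0}", null) returns an empty string, because the null ends up as an element of a one-element array, while WriteLineSketch.FormatMany("{0}", null) throws, because the null is the array itself.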



Ok, but why did the compiler pick different overloads?



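Intuitively, for a call that passes a single argument after the format string, you’d expect the overload resolution algorithm to prefer the method that takes a single object over the one that takes a variable number of arguments. And that’s what the compiler did on line 7: a Test is not implicitly convertible to object[], so the params overload is applicable only in its expanded form, and the overload that applies as written wins.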



On line 8, the situation is different: the null literal is directly assignable to both arg0 and arg, so both overloads are applicable as written. The overload resolution algorithm had to choose the better of the two, and it chose the one with the object array.



That appears counterintuitive, until you have code like



  1: class Test
  2: {
  3:     public static void Main()
  4:     {
  5:         SubTest subTest = null;
  6:         M(subTest);
  7:     }
  8: 
  9:     static void M(Test t) { Console.WriteLine("Test"); }
 10:     static void M(SubTest s) { Console.WriteLine("SubTest"); }
 11: }
 12: 
 13: class SubTest : Test { }


You wouldn’t be surprised if the call at line 6 resolved to M(SubTest), would you?



The C# spec’s rules for determining the best match say that



“ Given an implicit conversion C1 that converts from a type S to a type T1, and an implicit conversion C2 that converts from a type S to a type T2, the better conversion of the two conversions is determined as follows:



  • If T1 and T2 are the same type, neither conversion is better.
  • If S is T1, C1 is the better conversion.
  • If S is T2, C2 is the better conversion.
  • If an implicit conversion from T1 to T2 exists, and no implicit conversion from T2 to T1 exists, C1 is the better conversion.
  • If an implicit conversion from T2 to T1 exists, and no implicit conversion from T1 to T2 exists, C2 is the better conversion.
  • … “


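In this case the argument type S is SubTest, which is exactly T1, so the second rule already makes the conversion to SubTest the better one, and the compiler picks M(SubTest). Had the argument been a bare null, the fourth rule would give the same answer, because SubTest (T1) is implicitly convertible to Test (T2) but not the other way around.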
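
A small variation on the example shows the same rule at work for an untyped null. This is only a sketch: the BetterConversion class and its Demo method are made up for illustration, and the M overloads return the name of the chosen overload instead of printing it.

```csharp
class Test { }
class SubTest : Test { }

static class BetterConversion
{
    static string M(Test t) { return "Test"; }
    static string M(SubTest s) { return "SubTest"; }

    public static string[] Demo()
    {
        SubTest subTest = null;
        return new string[]
        {
            M(subTest),     // "SubTest": the argument type is exactly SubTest
            M(null),        // "SubTest": SubTest converts to Test, not vice versa
            M((Test)null)   // "Test": the cast makes M(SubTest) inapplicable
        };
    }
}
```

The cast in the last call is the usual way to steer the compiler toward a particular overload when the argument is a null literal.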



Now in our case, the compiler had to pick the better of two conversions: null to object and null to object[]. By the fourth rule above, object[] is implicitly convertible to object but not the other way around, so the conversion to object[] is better, and the overload resolution algorithm chose WriteLine(string format, params object[] arg). The params modifier didn’t play a part in overload resolution in this (null) case: the method was applicable in its normal, non-expanded form, with null passed as the array itself.
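
You can watch the compiler make this choice without involving Console at all. The sketch below uses a hypothetical Pick method with the same parameter lists as the two WriteLine overloads; the Resolution class, the Arg placeholder type and the returned strings are assumptions made for the demonstration.

```csharp
class Arg { }   // plays the role of Test in the original listing

static class Resolution
{
    public static string Pick(string format, object arg0)
    {
        return "object";
    }

    public static string Pick(string format, params object[] arg)
    {
        return arg == null ? "object[], null array" : "object[], expanded";
    }

    public static string[] Demo()
    {
        Arg test = null;
        return new string[]
        {
            Pick("{0}", test),          // "object"
            Pick("{0}", null),          // "object[], null array"
            Pick("{0}", (object)null)   // "object"
        };
    }
}
```

Casting the literal to object, as in the last call, gives the argument a type that is not convertible to object[], so the single-argument overload is chosen again.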



Interesting, ain’t it?

Volatile and local

If you’ve done any multithreaded programming at all, you must be aware of the volatile modifier. Marking a field volatile tells

1. the JIT compiler that it can’t hoist reads of the field (out of a loop, say), because the field may be modified by multiple threads

2. the CLR that the field must be read from and written to with acquire and release semantics, respectively.

Given what you’ve read above, the post’s title doesn’t make sense. A local variable, by definition, cannot be accessed from multiple threads. An object referred to by a local variable can be shared among threads, but never the variable itself.

Well, that was true as long as local variables remained just that: local variables. The 2.0 release of C# brought closures to the language in the form of anonymous methods, and the compiler implements capturing of local variables by turning them into fields of a generated class. Now do you see the problem?
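
Roughly speaking, a method that captures a local gets rewritten along the following lines. This is a hand-written sketch, not compiler output: CapturedLocals, ThreadBody and LoweredProgram are made-up names (the real generated class has a name you can’t type in C#), and Thread.Sleep stands in for whatever the main thread does in the meantime.

```csharp
using System;
using System.Threading;

// Stand-in for the compiler-generated class that holds captured locals.
sealed class CapturedLocals
{
    // The former local variable: now an ordinary, non-volatile field.
    public bool stopRunning;

    // The body of the lambda becomes a method on the same class.
    public void ThreadBody()
    {
        while (!stopRunning)
            Console.WriteLine("Hello");
    }
}

static class LoweredProgram
{
    public static void Run()
    {
        CapturedLocals locals = new CapturedLocals();
        locals.stopRunning = false;

        Thread t = new Thread(locals.ThreadBody);
        t.Start();
        Thread.Sleep(10);            // placeholder for real work
        locals.stopRunning = true;   // same field the other thread is reading
    }
}
```

Both threads hold a reference to the same CapturedLocals instance, so what looked like a local variable is in fact a field shared across threads.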

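    using System;
    using System.Threading;

    class Program
    {
        public static void Main()
        {
            // Captured by the lambda below, so the compiler turns it into a
            // field of a generated class that both threads can reach.
            bool stopRunning = false;

            Thread t = new Thread(() =>
                {
                    while (!stopRunning)
                        Console.WriteLine("Hello");
                });
            t.Start();
            DoSomethingElse();
            stopRunning = true;   // written on the main thread, read on t
        }

        // Placeholder for whatever the main thread does in the meantime.
        static void DoSomethingElse() { Thread.Sleep(100); }
    }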


Nothing out of the ordinary here – I’m creating a thread, passing a lambda to the Thread constructor, and capturing stopRunning inside the lambda.



This code isn’t correct though. For the reasons listed at the start of this post, stopRunning needs to be declared with the volatile modifier; without it, the JIT is in principle free to hoist the read out of the loop, and the thread might never see the update. Unfortunately, you can’t make stopRunning volatile: the compiler complains that local variables cannot be marked volatile.



Oops.



Making stopRunning a static field of the class will solve the immediate problem: you can then mark the field volatile, and all is good. Left at that, however, the method is no longer thread-safe: two threads could call Main, and they would share the same stopRunning.
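
One possible way to keep the flag per call and still get volatile semantics is to do by hand what the compiler does: put the flag in a small class of your own, where the field can be marked volatile, and capture a reference to that object instead of a bool. A minimal sketch, assuming the names StopFlag and Worker.RunAndStop, which are made up for illustration and not part of any library:

```csharp
using System;
using System.Threading;

// Holder for a per-call stop flag; a field, unlike a local, may be volatile.
sealed class StopFlag
{
    private volatile bool _stop;

    public bool IsSet { get { return _stop; } }
    public void Set() { _stop = true; }
}

static class Worker
{
    // Starts a worker, asks it to stop, and reports whether it stopped in time.
    public static bool RunAndStop()
    {
        StopFlag flag = new StopFlag();   // the local that gets captured

        Thread t = new Thread(() =>
        {
            // Busy-wait; every iteration performs a volatile read.
            while (!flag.IsSet) { }
        });
        t.Start();

        Thread.Sleep(50);                 // placeholder for real work
        flag.Set();
        return t.Join(5000);              // true if the worker exited
    }
}
```

Each call to RunAndStop creates its own StopFlag, so concurrent callers don’t share state, and the captured variable is only a reference that never changes after the thread starts.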



I guess this is the price to pay for compiler magic – magic that enables seamless access to local variables from anonymous methods.