Performance in .NET – Part 1

Updated: thanks, Paulo Morgado!

Introduction

Over the years I have written a couple of posts about performance in the .NET world. Some were tied to specific frameworks, such as NHibernate or Entity Framework, while others focused on the generic bits. In this series of posts I will summarize my findings on .NET in general, namely:

  • Object creation (this post)
  • Object cloning
  • Value Types versus Reference Types
  • Collections
  • Possibly other stuff

I won’t be talking about object serialization, as there are lots of serializers out there, each with its pros and cons. In general, I’d say serializing to and from JSON or a binary format seems to be the most in-demand scenario, and each has quite a few options, provided either by Microsoft or by third parties. The actual usage also affects what we want – is it a general-purpose serializer, or one for a particular usage that needs classes prepared accordingly? Let’s keep it out of this discussion.

As always, feel free to reach out to me if you want to discuss any of these! So, let’s start with object creation.

Object Creation

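Let’s start by defining our purpose: we want to be able to create object instances of a certain type as fast as possible. We have several strategies:

  • Using the new operator
  • Using reflection
  • Using FormatterServices.GetUninitializedObject
  • Using System.Reflection.Emit code generation
  • Using Activator.CreateInstance
  • Using LINQ expressions
  • Using delegates
  • Using Roslyn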

Let’s cover them all one by one.

Using the new Operator

This is the most obvious (and fastest) way, but it does not play well with dynamic instantiation, meaning that the type to instantiate needs to be hardcoded. I call it direct instantiation, and it goes like this (you know, you know…):

var obj = new Xpto();

This should be the baseline for all performance comparisons, as it should offer the best possible performance.

Using Reflection

Here I’m caching the public parameterless constructor and invoking it, then casting the result to the target type:

var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var obj = ci.Invoke(null) as Xpto;

Just avoid getting the constructor over and over again: look it up once for each type, then cache it somewhere.
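One way to do the caching mentioned above is a thread-safe dictionary keyed by type. This is only a minimal sketch: the ConstructorCache class and its Create methods are names I made up for illustration, and the empty Xpto class is a stand-in for the type used in this post.

```csharp
using System;
using System.Collections.Concurrent;
using System.Reflection;

// Hypothetical helper (not part of .NET): caches the public
// parameterless constructor of each type the first time it is requested.
public static class ConstructorCache
{
    private static readonly ConcurrentDictionary<Type, ConstructorInfo> Cache =
        new ConcurrentDictionary<Type, ConstructorInfo>();

    public static object Create(Type type)
    {
        // The reflection lookup only runs when the type is not cached yet.
        var ci = Cache.GetOrAdd(type, t =>
            t.GetConstructor(Type.EmptyTypes)
            ?? throw new MissingMethodException(t.FullName, ".ctor"));

        return ci.Invoke(null);
    }

    public static T Create<T>() where T : class => (T)Create(typeof(T));
}

// Stand-in for the Xpto type used throughout the post.
public class Xpto { }
```

Usage would be something like var obj = ConstructorCache.Create<Xpto>(); note that the call to Invoke itself still goes through reflection, so only the lookup cost is saved.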

Using FormatterServices.GetUninitializedObject

The GetUninitializedObject method is used internally by some serializers. It merely allocates memory for the target type and zeroes all of its fields, without running any constructor. As a consequence, any values assigned by field or property initializers are lost, so use it with care. It is available in .NET Core:

var obj = FormatterServices.GetUninitializedObject(typeof(Xpto)) as Xpto;

Note that none of the constructors of your type is executed, and no fields or properties receive their initial values; each one holds the default for its type (null for reference types, the zeroed default for value types).
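To make the effect concrete, here is a small sketch. It assumes a modern .NET runtime and uses RuntimeHelpers.GetUninitializedObject, which, as far as I know, behaves the same way as the FormatterServices method shown above. The members of Xpto and the UninitializedDemo class are invented for the demonstration.

```csharp
using System;
using System.Runtime.CompilerServices;

// Stand-in type with initializers and a constructor that leaves a trace.
public class Xpto
{
    public int Number = 42;                        // field initializer
    public string Name { get; set; } = "default";  // property initializer
    public bool ConstructorRan;

    public Xpto() { ConstructorRan = true; }
}

public static class UninitializedDemo
{
    // Allocates an Xpto without running its constructor or initializers.
    public static Xpto CreateRaw() =>
        (Xpto)RuntimeHelpers.GetUninitializedObject(typeof(Xpto));
}
```

With this, UninitializedDemo.CreateRaw() gives an instance where Number is 0, Name is null and ConstructorRan is false, whereas new Xpto() gives 42, "default" and true. Field and property initializers are compiled into the constructors, which is why they are skipped too.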

Using System.Reflection.Emit code generation

This one uses the code generation library that is built into .NET (but not .NET Core, for the time being):

var m = new DynamicMethod(string.Empty, typeof(object), null, typeof(Xpto), true);
var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var il = m.GetILGenerator();
il.Emit(OpCodes.Newobj, ci);
il.Emit(OpCodes.Ret);
var creator = m.CreateDelegate(typeof(Func<object>)) as Func<object>;
var obj = creator() as Xpto;

As you can see, we are just generating code for a dynamic method, giving it a simple body that does “new Xpto()”, and executing it.

Using Activator.CreateInstance

This is essentially a wrapper around the reflection code I’ve shown earlier, with the drawback that it does not cache each type’s public parameterless constructor:

var obj = Activator.CreateInstance(typeof(Xpto)) as Xpto;

Using LINQ expressions

The major drawback of this approach is the time it takes to build the actual code (the first call to Compile). After that, it should be fast:

var ci = typeof(Xpto).GetConstructor(Type.EmptyTypes);
var expr = Expression.New(ci);
var del = Expression.Lambda(expr).Compile();
var obj = del.DynamicInvoke() as Xpto;

Of course, if you are to call this a number of times for the same type, it is worth caching the compiled delegate for each type, since Compile is the expensive step.
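A possible way to keep the compiled code around is a generic static class, so that the runtime holds one compiled factory per type. This is a sketch under my own assumptions: Creator<T> is a hypothetical name, and it compiles to a strongly typed Func<T>, which also avoids DynamicInvoke.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper: one compiled, strongly typed factory per type T.
public static class Creator<T> where T : class
{
    // Built once, the first time Creator<T> is used for a given T.
    public static readonly Func<T> Create = Build();

    private static Func<T> Build()
    {
        var ci = typeof(T).GetConstructor(Type.EmptyTypes)
                 ?? throw new MissingMethodException(typeof(T).FullName, ".ctor");

        // Equivalent to the lambda () => new T(), compiled at run time.
        return Expression.Lambda<Func<T>>(Expression.New(ci)).Compile();
    }
}

// Stand-in for the Xpto type used throughout the post.
public class Xpto { }
```

Usage would be var obj = Creator<Xpto>.Create(); the cost of Compile is paid only on first use. If T has no public parameterless constructor, the failure surfaces wrapped in a TypeInitializationException, because it happens inside a static initializer.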

Using Delegates

The LINQ expressions approach actually compiles to this one, but this is strongly typed:

Func<Xpto> del = () => new Xpto();
var obj = del();

Using Roslyn

This one is relatively new in .NET. As you may know, Microsoft now uses Roslyn to both parse and generate code dynamically. The scripting capabilities are made available through the Microsoft.CodeAnalysis.CSharp.Scripting NuGet package. The actual code for instantiating a class (or actually executing any code) dynamically goes like this:

var obj = CSharpScript.EvaluateAsync("new Xpto()").GetAwaiter().GetResult() as Xpto;

Do keep in mind that Roslyn is asynchronous by nature, so you need to wait for the result. Also, do add the full namespace of your type, which I omitted for brevity. There are other APIs that allow you to compile code and reuse the compilation:

var script = CSharpScript.Create<Xpto>("new Xpto()", ScriptOptions.Default.AddReferences(typeof(Xpto).Assembly));
var runner = script.CreateDelegate();
var obj = runner().GetAwaiter().GetResult();

Conclusion

Feel free to run your tests, with a few iterations, and look at the results. Always compare with the normal way to create objects, the new operator. Do not forget the problems with each approach, like the need to cache something or any limitations on the instantiated object.

On my machine, for 1000 iterations, repeated a couple of times per run, I get these average results (elapsed ticks):

Technique                                   Average elapsed ticks
Direct                                      0.148
FormatterServices.GetUninitializedObject    0.324
Activator.CreateInstance                    0.296
Reflection                                  0.6
IL                                          0.557
LINQ Expression                             4.085
Delegate                                    0.109
Roslyn                                      2400.796

Some of these may be surprising to you, as they were to me! It seems that reflection is not as much slower than direct instantiation as one might think… hmmm…

As usual, I’d love to hear your thoughts on this! More to come soon!

Detecting Default Values of Value Types

Introduction

I don’t know if this has happened to you: the need to find out whether some value is the default value for its type. For reference types, and also for nullable value types, the default is always null; for other value types, it is the value that you get if you do not initialize a field of that type or if you call its default parameterless constructor – false for Boolean, 0 for numbers, the 0 member of an enumeration, etc. So, the problem is: how can we tell if some instance represents this default value dynamically, that is, for any value type, not just a specific one?

Again, nothing special, just the result of a lazy Saturday afternoon; still, I hope you see some value in it!

Take 0: Direct Comparison

How would we compare a built-in type instance, like int, with its default value? Just compare it with 0. This is useful as a baseline to see how each of the other techniques compares to it.

static bool IsDefaultDirect(int obj)
{
    return obj.Equals(0);
}

Take 1: Comparison With a New Instance

An easy way is to compare the value we have with a new instance of the same type, like this:

static bool IsDefaultUsingConstruction(ValueType obj)
{
    return obj.Equals(Activator.CreateInstance(obj.GetType()));
}

Activator.CreateInstance knows how to create a default instance of a value type, so we’re good.
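As a quick illustration of what that means for different value types, here is the same comparison wrapped in a small class so that it can be tried on its own. The DefaultCheck name is mine; the body is the same as in the method above.

```csharp
using System;

public static class DefaultCheck
{
    // Same body as IsDefaultUsingConstruction: Activator.CreateInstance
    // returns a boxed, zero-initialized instance of the value type.
    public static bool IsDefault(ValueType obj) =>
        obj.Equals(Activator.CreateInstance(obj.GetType()));
}
```

For example, DefaultCheck.IsDefault(0), DefaultCheck.IsDefault(default(DateTime)) and DefaultCheck.IsDefault(DayOfWeek.Sunday) all return true (Sunday is the 0 member of its enumeration), while DefaultCheck.IsDefault(5) and DefaultCheck.IsDefault(DayOfWeek.Monday) return false.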

Take 2: Using Generics Directly

Another option is to use the ability to compare a generic variable with the default value for the generic type. This cannot be used in exactly the same way as #1, because here we need to explicitly call the comparison method with a generic parameter:

static bool IsDefaultUsingGeneric<T>(T obj) where T : struct
{
    return obj.Equals(default(T));
}

Notice the struct constraint: it is similar in effect to declaring the parameter as ValueType, the base class for all value types, except that the actual type is known at compile time and nullable value types are excluded.

Take 3: Using Generics Dynamically

You can make the previous example dynamic, at the cost of an additional method invocation, like this:

static bool IsDefaultUsingReflection(ValueType obj)
{
    //cache this, please
    var isDefaultUsingGenericMethod = typeof(Program).GetMethod("IsDefaultUsingGeneric", BindingFlags.Static | BindingFlags.NonPublic);
    var method = isDefaultUsingGenericMethod.MakeGenericMethod(obj.GetType());
    return (bool) method.Invoke(null, new object[] { obj });
}

Take 4: Using a LINQ Expression Bound to a Specific Type

Another option is to dynamically compile a LINQ expression that performs the comparison, something like this:

Func<T, bool> func = (obj) => obj.Equals(default(T));

We can create this expression dynamically, and bind it to the desired value type:

static bool IsDefaultUsingLinq(ValueType obj)
{
    var argType = obj.GetType();
    var arguments = new Expression[] { Expression.Default(argType) };
    var paramExpression = Expression.Parameter(argType, "x");

    var equalsMethod = argType.GetMethod("Equals", new Type[] { argType });
    var call = Expression.Call(paramExpression, equalsMethod, arguments);

    var lambdaArgType = typeof(Func<,>).MakeGenericType(argType, typeof(bool));
    var lambdaMethod = LambdaMethod.MakeGenericMethod(lambdaArgType);

    var expression = lambdaMethod.Invoke(null, new object[] { call, new ParameterExpression[] { paramExpression } }) as LambdaExpression;

    //cache this, please
    Delegate func = expression.Compile();

    return (bool)func.DynamicInvoke(obj);
}

Take 5: Using a LINQ Expression Bound to Object

A very similar option to #4 is to use Object.Equals instead of the value type’s specific Equals method, like this:

Func<object, bool> func = (obj) => obj.Equals(default(int));

Of course, the int parameter depends on the actual type of the value type parameter being passed:

static readonly MethodInfo LambdaMethod = typeof(Expression).GetMethods(BindingFlags.Static | BindingFlags.Public).First(x => x.Name == "Lambda" && x.GetParameters()[1].ParameterType == typeof(ParameterExpression[]));
static readonly MethodInfo EqualsMethod = typeof (object).GetMethod("Equals", BindingFlags.Instance | BindingFlags.Public);

static bool IsDefaultUsingLinqObject(ValueType obj)
{
    var argType = typeof(object);
    var arguments = new Expression[] { Expression.Convert(Expression.Default(obj.GetType()), argType) };
    var equalsMethod = EqualsMethod;
    var paramExpression = Expression.Parameter(argType, "x");
    var call = Expression.Call(paramExpression, equalsMethod, arguments);
    var lambdaArgType = typeof(Func<object, bool>);
    var lambdaMethod = LambdaMethod.MakeGenericMethod(lambdaArgType);
    var expression = lambdaMethod.Invoke(null, new object[] { call, new ParameterExpression[] { paramExpression } }) as Expression<Func<object, bool>>;

    //cache this, please
    Func<object, bool> func = expression.Compile();

    return func(obj);
}

Because the comparison expression, of type Func<object, bool>, is strongly typed, we avoid the need to call Delegate.DynamicInvoke, and the performance increases substantially.

Take 6: Using Formatter Services

A long, long time ago, in a distant galaxy, I already mentioned, en passant, the usage of FormatterServices.GetUninitializedObject to create instances of a type. Picking up example #1, let’s replace Activator.CreateInstance with FormatterServices.GetUninitializedObject and see the gains:

static bool IsDefaultUsingFormatterServices(ValueType obj)
{
    return obj.Equals(FormatterServices.GetUninitializedObject(obj.GetType()));
}

Take 7: Using a LINQ Expression Bound to a Specific Type and Using Invocation Through Dynamics

What a long name… Well, this one is identical to #4, but without Delegate.DynamicInvoke. Instead, I make use of the dynamic type’s late binding to invoke the delegate, which results in even better performance:

//same field as in #5; declare it only once if both methods live in the same class
static readonly MethodInfo LambdaMethod = typeof(Expression).GetMethods(BindingFlags.Static | BindingFlags.Public).First(x => x.Name == "Lambda" && x.GetParameters()[1].ParameterType == typeof(ParameterExpression[]));

static bool IsDefaultUsingLinqAndDynamic(ValueType obj)
{
    var argType = obj.GetType();
    var arguments = new Expression[] { Expression.Default(argType) };
    var paramExpression = Expression.Parameter(argType, "x");
    var equalsMethod = argType.GetMethod("Equals", new Type[] { argType });
    var call = Expression.Call(paramExpression, equalsMethod, arguments);
    var lambdaArgType = typeof(Func<,>).MakeGenericType(argType, typeof(bool));
    var lambdaMethod = LambdaMethod.MakeGenericMethod(lambdaArgType);
    var expression = lambdaMethod.Invoke(null, new object[] { call, new ParameterExpression[] { paramExpression } }) as LambdaExpression;

    //cache this, please
    Delegate func = expression.Compile();

    //late binding picks the right Func<T, bool> signature at run time
    dynamic arg = obj;
    dynamic del = func;

    return del(arg);
}

Measuring

I put in two methods for measuring calls and doing averages:

static long MeasureTicks(Action action)
{
    var watch = Stopwatch.StartNew();

    action();

    return watch.ElapsedTicks;
}

static float Measure(int times, Action action)
{
    var avg = 0L;

    for (var i = 0; i < times; ++i)
    {
        avg += MeasureTicks(action);
    }

    return (float)avg / times;
}

I used a Stopwatch to obtain the ElapsedTicks of the method being exercised. I changed the methods I presented, namely #4, #5 and #7, so as to cache the types and delegates created dynamically. This is crucial, and I leave it as an exercise to you – just remember that each method can potentially be called with different values, of different types. Then I added a warm-up step, which exercises the code using an integer parameter:

static void Warmup(int value)
{
    var times = 1;
    Measure(times, () => IsDefaultDirect(value));
    Measure(times, () => IsDefaultUsingConstruction(value));
    Measure(times, () => IsDefaultUsingGeneric(value));
    Measure(times, () => IsDefaultUsingReflection(value));
    Measure(times, () => IsDefaultUsingLinq(value));
    Measure(times, () => IsDefaultUsingLinqObject(value));
    Measure(times, () => IsDefaultUsingFormatterServices(value));
    Measure(times, () => IsDefaultUsingLinqAndDynamic(value));
}

In the past, I learned that a warm-up method – or lack of it – makes a huge difference.
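For the caching of dynamically created delegates mentioned earlier, one possible shape for technique #5 is a dictionary of compiled delegates keyed by the value’s type. This is only a sketch of how it could be done, not the code I measured: CachedDefaultCheck is a made-up name, and it calls the generic Expression.Lambda overload directly instead of going through reflection.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Hypothetical cached variant of technique #5.
public static class CachedDefaultCheck
{
    // One compiled delegate per value type, since the method can be
    // called with values of different types.
    private static readonly ConcurrentDictionary<Type, Func<object, bool>> Cache =
        new ConcurrentDictionary<Type, Func<object, bool>>();

    public static bool IsDefault(ValueType obj) =>
        Cache.GetOrAdd(obj.GetType(), Build)(obj);

    // Builds and compiles: x => x.Equals((object)default(valueType))
    private static Func<object, bool> Build(Type valueType)
    {
        var param = Expression.Parameter(typeof(object), "x");
        var boxedDefault = Expression.Convert(Expression.Default(valueType), typeof(object));
        var equals = typeof(object).GetMethod("Equals", new[] { typeof(object) });
        var call = Expression.Call(param, equals, boxedDefault);

        return Expression.Lambda<Func<object, bool>>(call, param).Compile();
    }
}
```

The first call for a given type pays for building and compiling the expression; later calls only pay for a dictionary lookup and a strongly typed delegate invocation.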

I executed each option 100 times and recorded the average for each:

static void Measure()
{
    var times = 100;
    var value = 100;

    Warmup(value);

    var m0 = Measure(times, () => IsDefaultDirect(value));
    var m1 = Measure(times, () => IsDefaultUsingConstruction(value));
    var m2 = Measure(times, () => IsDefaultUsingGeneric(value));
    var m3 = Measure(times, () => IsDefaultUsingReflection(value));
    var m4 = Measure(times, () => IsDefaultUsingLinq(value));
    var m5 = Measure(times, () => IsDefaultUsingLinqObject(value));
    var m6 = Measure(times, () => IsDefaultUsingFormatterServices(value));
    var m7 = Measure(times, () => IsDefaultUsingLinqAndDynamic(value));
}

The results I got were:


Method                                                                                        Ticks   Difference (relative to #7)
#0: Direct Comparison                                                                         1.82    131.88%
#1: Comparison With a New Instance                                                            1.92    139.13%
#2: Using Generics Directly                                                                   1.46    105.80%
#3: Using Generics Dynamically                                                                6.9     500%
#4: Using a LINQ Expression Bound to a Specific Type                                          3.05    221.01%
#5: Using a LINQ Expression Bound to Object                                                   1.61    116.67%
#6: Using Formatter Services                                                                  1.53    110.87%
#7: Using a LINQ Expression Bound to a Specific Type and Using Invocation Through Dynamics    1.38    100%

Conclusion

I was really surprised that the direct comparison is actually – at least for integers – not the best way to see if a value is the default for its type! There’s a big range in results, and I can say that I was expecting that for #3. I knew that FormatterServices.GetUninitializedObject would give better results than Activator.CreateInstance, but I imagine this cannot be used with all types, because it doesn’t run the type’s constructor, possibly skipping some default initializations. I also knew that the performance of Delegate.DynamicInvoke is less than ideal, but it was interesting to see that dynamics can improve it.

As always, I’d love to see what you have to say! Do you see flaws in my approach, or do you know of any better solutions? Fire away!

Lesser-Known NHibernate Features: Filtering an Unloaded Collection

Suppose you have an entity with a collection of other entities (one to many, many to many); this collection is normally represented by some property implementing IEnumerable<T>, or some more specific interface.

If we want to query this collection, we normally have to load all of its items. For indexed collections, it is possible to use extra-lazy loading, which means NHibernate will only load one item at a time, instead of all at once, but this might be even worse, because it would result in multiple SELECT statements.

NHibernate offers a nice mechanism to filter and order a collection without actually loading all of its items; it’s the CreateFilter method of the ISession, and we use it like this:

//get an order
var order = session.Query<Order>().First();

//all order details of this order where quantity is greater than one
var detailsWithMoreThanOneItem = session.CreateFilter(order.Details, "WHERE Quantity > 1").List<OrderDetail>();

//all order details of this order sorted by their quantity in descending order
var detailsSortedByQuantityInDescendingOrder = session.CreateFilter(order.Details, "ORDER BY Quantity DESC").List<OrderDetail>();

I even once wrote a helper method for querying collections with LINQ without actually loading them; its usage is:

//all order details of this order where quantity is greater than one
var detailsWithMoreThanOneItem = order.Details.Query().Where(d => d.Quantity > 1).ToList();

The Query extension method will check to see if the collection has already been loaded, in which case, it resorts to LINQ to Objects, otherwise, it generates a proper LINQ query. You can find the source code here.