Hydrating Objects With Expression Trees – Part III

LINQ With C# (Portuguese)

To finalize this series on object hydration, I’ll show some performance comparisons between the different methods of hydrating objects.

For the purpose of this exercise, I’ll use this class:

class SomeType
{
    public int Id { get; set; }
    public string Name { get; set; }
    public DateTimeOffset CreationTime { get; set; }
    public Guid UniqueId { get; set; }
}


and this set of data:



var data = (
    from i in Enumerable.Range(1, ObjectCount)
    select new object[] { i, i.ToString(), DateTimeOffset.Now, Guid.NewGuid() }
).ToArray();


The data below shows the time (in seconds) taken by different runs for different values of ObjectCount (on the same machine, with approximately the same load), as well as each method’s availability in different versions of the .NET Framework and the C# programming language:



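| Method | Setup (10000) | Hydrate (10000) | Total (10000) | Setup (100000) | Hydrate (100000) | Total (100000) | Setup (1000000) | Hydrate (1000000) | Total (1000000) | Valid for Framework version | Valid for C# version |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Activation and Reflection setter | 0.060 | 0.101 | 0.161 | 0.055 | 0.736 | 0.791 | 0.054 | 6.822 | 6.876 | 1.0, 1.1, 2.0, 3.5, 4.0 | 1.0, 2.0, 3.0, 4.0 |
| Activation and Expression Tree setter | 0.300 | 0.003 | 0.303 | 0.313 | 0.049 | 0.359 | 0.293 | 0.578 | 0.871 | 4.0 | none |
| Member Initializer | 0.035 | 0.001 | 0.036 | 0.039 | 0.027 | 0.066 | 0.041 | 0.518 | 0.559 | 3.5, 4.0 | 3.0, 4.0 |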


These values will vary with the number of objects being hydrated and the number of their properties, but in all of these runs the method using the Member Initializer was the fastest, both in setup and in hydration.
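As a rough sketch of how the setup and hydration phases could be timed separately, each phase can be wrapped in a Stopwatch. This is not the harness used to produce the numbers above; TimingSketch and Measure are hypothetical names used only for illustration.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the action once and returns the elapsed wall-clock time in seconds.
    public static double Measure(Action action)
    {
        var stopwatch = Stopwatch.StartNew();
        action();
        stopwatch.Stop();
        return stopwatch.Elapsed.TotalSeconds;
    }
}
```

The setup time would come from measuring the code that builds and compiles the expressions, and the hydration time from measuring the loop that creates the objects. A single run like this is noisy, so repeated runs are advisable.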



Code samples for this series of posts (and the one about object dumping with expression trees) can be found on my MSDN Code Gallery: Dump And Hydrate Objects With Expression Trees

Hydrating Objects With Expression Trees – Part II


In my previous post I showed how to hydrate objects by creating instances and setting properties in those instances.

But if the intent is to hydrate the objects from data, why not have an expression that does just that? That’s what the member initialization expression is for.

To create such an expression we need the constructor expression and the property binding expressions:

var properties = objectType.GetProperties();
var bindings = new MemberBinding[properties.Length];
var valuesArrayExpression = Expression.Parameter(typeof(object[]), "v");

for (int p = 0; p < properties.Length; p++)
{
    var property = properties[p];

    bindings[p] = Expression.Bind(
        property,
        Expression.Convert(
            Expression.ArrayAccess(
                valuesArrayExpression,
                Expression.Constant(p, typeof(int))
            ),
            property.PropertyType
        )
    );
}

var memberInitExpression = Expression.MemberInit(
    Expression.New(objectType),
    bindings
);

var objectHidrationExpression = Expression.Lambda<Func<object[], object>>(memberInitExpression, valuesArrayExpression);

var compiledObjectHidrationExpression = objectHidrationExpression.Compile();


This might seem more complex than the previous solution, but using it is a lot simpler:



for (int o = 0; o < objects.Length; o++)
{
    newObjects[o] = compiledObjectHidrationExpression(objects[o]);
}
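Putting the pieces together, here is a minimal self-contained sketch of the same technique. The Customer type, the MemberInitHydrator class and its Build method are hypothetical names used only for illustration. Unlike the code above, this version takes the property names explicitly, because GetProperties() does not guarantee any particular order, and it converts the result to object so that the lambda is also valid for value types.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type used only for this illustration.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class MemberInitHydrator
{
    // Builds v => (object)new T { P0 = (T0)v[0], P1 = (T1)v[1], ... }
    // where values are matched to propertyNames by position.
    public static Func<object[], object> Build(Type objectType, params string[] propertyNames)
    {
        var values = Expression.Parameter(typeof(object[]), "v");
        var bindings = new MemberBinding[propertyNames.Length];

        for (int p = 0; p < propertyNames.Length; p++)
        {
            var property = objectType.GetProperty(propertyNames[p]);
            if (property == null)
            {
                throw new ArgumentException("Unknown property: " + propertyNames[p]);
            }

            // P = (PropertyType)v[p]
            bindings[p] = Expression.Bind(
                property,
                Expression.Convert(
                    Expression.ArrayIndex(values, Expression.Constant(p)),
                    property.PropertyType));
        }

        // Requires a public parameterless constructor on objectType.
        var body = Expression.Convert(
            Expression.MemberInit(Expression.New(objectType), bindings),
            typeof(object));

        return Expression.Lambda<Func<object[], object>>(body, values).Compile();
    }
}
```

A call such as MemberInitHydrator.Build(typeof(Customer), "Id", "Name") returns a delegate that can be reused for every row of data. The values must already have the exact property types, since the conversion from object to a value type is an unboxing operation.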

Hydrating Objects With Expression Trees – Part I


After my post about dumping objects using expression trees, I was asked whether the same could be done for hydrating objects.

Sure it can, but it might not be that easy.

What we are looking for is a way to set properties on objects of an unknown type. For that, we need to generate methods to set each property of the objects.

Such methods would look like this expression:

Expression<Action<object, object>> expression = (o, v) => ((SomeType)o).Property1 = (PropertyType)v;


Unfortunately, we cannot use the .NET Reflector trick because, if you try to compile this, you’ll get this error:



error CS0832: An expression tree may not contain an assignment operator


Fortunately, the assignment corresponds to a valid node in .NET 4.0 expression trees (Expression.Assign); the C# compiler just won’t generate it from a lambda. We have to build the tree by hand.



So, for a given type, the set of property setters would be built this way:



var compiledExpressions = (from property in objectType.GetProperties()
                           // (o, v) => ((ObjectType)o).Property = (PropertyType)v
                           let objectParameter = Expression.Parameter(typeof(object), "o")
                           let convertedObjectParameter = Expression.ConvertChecked(objectParameter, objectType)
                           let valueParameter = Expression.Parameter(typeof(object), "v")
                           let convertedValueParameter = Expression.ConvertChecked(valueParameter, property.PropertyType)
                           let propertyExpression = Expression.Property(convertedObjectParameter, property)
                           select
                                Expression.Lambda<Action<object, object>>(
                                    Expression.Assign(
                                        propertyExpression,
                                        convertedValueParameter
                                    ),
                                    objectParameter,
                                    valueParameter
                                ).Compile()).ToArray();


And hydrating objects would be like this:



for (int o = 0; o < objects.Length; o++)
{
    var objectProperties = objects[o];

    var newObject = newObjects[o] = Activator.CreateInstance(objectType);

    for (int p = 0; p < compiledExpressions.Length; p++)
    {
        compiledExpressions[p](newObject, objectProperties[p]);
    }
}
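For a single property, the setter construction can be sketched as a small self-contained helper. Product, SetterBuilder and Build are hypothetical names used only for illustration, and the sketch assumes that the target type is a class (a reference type) with a writable property.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical type used only for this illustration.
public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class SetterBuilder
{
    // Builds (o, v) => ((T)o).Property = (TProperty)v
    public static Action<object, object> Build(Type objectType, string propertyName)
    {
        var property = objectType.GetProperty(propertyName);
        if (property == null)
        {
            throw new ArgumentException("Unknown property: " + propertyName);
        }

        var target = Expression.Parameter(typeof(object), "o");
        var value = Expression.Parameter(typeof(object), "v");

        // Expression.Assign is available from .NET 4.0 onwards.
        var assign = Expression.Assign(
            Expression.Property(Expression.Convert(target, objectType), property),
            Expression.Convert(value, property.PropertyType));

        // An Action delegate ignores the value produced by the assignment.
        return Expression.Lambda<Action<object, object>>(assign, target, value).Compile();
    }
}
```

Each delegate is built once per property and then invoked for every object, which is where the time saved over reflection comes from.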

Mastering Expression Trees With .NET Reflector

Following my last post, I received lots of enquiries about how I got to master the creation of expression trees.

The answer is: .NET Reflector

In that post I needed to generate an expression tree for this expression:

Expression<Func<object, object>> expression = o => ((object)((SomeType)o).Property1);


I just compiled that code in Visual Studio 2010, loaded the assembly in .NET Reflector, and disassembled it to C# without optimizations (View –> Options –> Disassembler –> Optimization: None).



The disassembled code looked like this:



Expression<Func<object, object>> expression;
ParameterExpression CS$0$0000;
ParameterExpression[] CS$0$0001;
expression = Expression.Lambda<Func<object, object>>(Expression.Convert(Expression.Property(Expression.Convert(CS$0$0000 = Expression.Parameter(typeof(object), "o"), typeof(SomeType)), (MethodInfo) methodof(SomeType.get_Property1)), typeof(object)), new ParameterExpression[] { CS$0$0000 });


After giving valid C# names to the variables and tidying up the code a bit, I came up with this:



ParameterExpression parameter = Expression.Parameter(typeof(object), "o");
Expression<Func<object, object>> expression =
    Expression.Lambda<Func<object, object>>(
        Expression.Convert(
            Expression.Property(
                Expression.Convert(
                    parameter,
                    typeof(SomeType)
                ),
                "Property1"
            ),
            typeof(object)
        ),
        parameter
    );


Easy! Isn’t it?

Dumping Objects Using Expression Trees



No. I’m not proposing to get rid of objects.


A colleague of mine asked me if I knew a way to dump a list of objects of unknown type into a DataTable with better performance than the way he was using.


The objects being dumped usually have over a dozen properties, but, for the sake of this post, let’s assume they look like this:


class SomeClass
{
    public int Property1 { get; set; }
    public long Property2 { get; set; }
    public string Property3 { get; set; }
    public object Property4 { get; set; }
    public DateTimeOffset Property5 { get; set; }
}

The code he was using was something like this:


var properties = objectType.GetProperties();

foreach (object obj in objects)
{
    foreach (var property in properties)
    {
        property.GetValue(obj, null);
    }
}

For a list of one million objects, this takes a little over 6000 milliseconds on my machine.


I immediately thought: Expression Trees!


If the type of the objects were known at compile time, it would be something like this:


Expression<Func<SomeClass, int>> expression = o => o.Property1;
var compiled = expression.Compile();
var propertyValue = compiled.Invoke(obj);

But, at compile time, the type of the object and, consequently, the type of its properties, is unknown. So, we’ll need, for each property, an expression tree like this:


Expression<Func<object, object>> expression = o => ((SomeClass)o).Property1;

The previous expression converts the parameter of type object to the actual type of the object and then gets the value of the property from the converted object. The result must also be converted to type object because the type of the result must match the type of the return value of the expression.


For the same type of objects, the collection of property accessors would be built this way:


var compiledExpressions = (from property in properties
                           let objectParameter = Expression.Parameter(typeof(object), "o")
                           select
                             Expression.Lambda<Func<object, object>>(
                                 Expression.Convert(
                                     Expression.Property(
                                         Expression.Convert(
                                             objectParameter,
                                             objectType
                                         ),
                                         property
                                     ),
                                     typeof(object)
                                 ),
                                 objectParameter
                             ).Compile()).ToArray();

It looks a bit overcomplicated, but reading all properties of all objects for the same object set with this code:


foreach (object obj in objects)
{
    foreach (var compiledExpression in compiledExpressions)
    {
        compiledExpression(obj);
    }
}

takes a little over 150 milliseconds on my machine.


That’s right. 2.5% of the previous time.
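To connect this back to the original goal, here is a hedged sketch of how the compiled accessors could be used to fill a DataTable. Reading, DataTableDumper and Dump are hypothetical names; this is not the code my colleague ended up with, just one way the pieces might fit together.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical type used only for this illustration.
public class Reading
{
    public int Id { get; set; }
    public string Label { get; set; }
}

public static class DataTableDumper
{
    public static DataTable Dump(Type objectType, IEnumerable<object> objects)
    {
        // Skip write-only properties and indexers, which have no simple getter.
        var properties = objectType.GetProperties()
            .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
            .ToArray();

        var table = new DataTable(objectType.Name);
        var getters = new Func<object, object>[properties.Length];

        for (int i = 0; i < properties.Length; i++)
        {
            var property = properties[i];

            // DataTable columns cannot be Nullable<T>, so use the underlying type.
            table.Columns.Add(
                property.Name,
                Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType);

            // o => (object)((T)o).Property
            var parameter = Expression.Parameter(typeof(object), "o");
            getters[i] = Expression.Lambda<Func<object, object>>(
                Expression.Convert(
                    Expression.Property(Expression.Convert(parameter, objectType), property),
                    typeof(object)),
                parameter).Compile();
        }

        foreach (var obj in objects)
        {
            var row = new object[getters.Length];
            for (int i = 0; i < getters.Length; i++)
            {
                // DataTable represents missing values as DBNull, not null.
                row[i] = getters[i](obj) ?? DBNull.Value;
            }
            table.Rows.Add(row);
        }

        return table;
    }
}
```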

C#: More On Array Variance

In a previous post, I went through how arrays are covariant in relation to the type of their elements, but not safely covariant.

In the following example, the last assignment (of a new object to objectArray[1]) fails at run time with an ArrayTypeMismatchException because, although the type of the objectArray variable is array of object, the real type of the array is array of string, and an object cannot be assigned to a string.

object[] objectArray = new string[] { "string 1", "string 2" };
objectArray[0] = "string 3";
objectArray[1] = new object();
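A minimal sketch that makes the run-time check visible. ArrayCovarianceDemo and StoreObjectFails are hypothetical names used only for illustration.

```csharp
using System;

public static class ArrayCovarianceDemo
{
    // Returns true when storing a plain object into a string[]
    // referenced through an object[] variable throws.
    public static bool StoreObjectFails()
    {
        object[] objectArray = new string[] { "string 1", "string 2" };
        objectArray[0] = "string 3"; // fine: a string stored in a string[]

        try
        {
            objectArray[1] = new object(); // the runtime checks the element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }
}
```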


On the other hand, because arrays are not contravariant in relation to the type of their elements, in the following code, the second line will fail at run time with an InvalidCastException: string[] ≤ object[] (every string[] can be used as an object[]), but not the other way around.



object[] objectArray = new object[] { "string 1", "string 2" };
string[] stringArray = (string[])objectArray;
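The failing cast can be observed with a small sketch like this one. ArrayCastDemo and CastFails are hypothetical names used only for illustration.

```csharp
using System;

public static class ArrayCastDemo
{
    // Returns true when casting an object[] that only holds strings
    // to string[] throws at run time.
    public static bool CastFails()
    {
        object[] objectArray = new object[] { "string 1", "string 2" };

        try
        {
            // Compiles, because some object[] references do point to a string[],
            // but this one points to a real object[].
            string[] stringArray = (string[])objectArray;
            return stringArray == null;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```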


The fact that all elements in the object array are strings doesn’t make it convertible to an array of string. To convert this object array of strings into a string array, you’ll need to create a new string array and copy each converted element.



The conversion can be as easy as this:



string[] stringArray = objectArray.Cast<string>().ToArray();


The above code is just shorthand for traversing the whole array while converting its elements to string and creating a string array with all the converted elements.
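Written out by hand, the conversion could look like the following sketch. ArrayConversion and ToStringArray are hypothetical names, and this is an equivalent loop, not the actual implementation of Cast<T>() or ToArray().

```csharp
using System;

public static class ArrayConversion
{
    // Creates a new string[] and copies each element, converting it to string.
    public static string[] ToStringArray(object[] source)
    {
        var result = new string[source.Length];

        for (int i = 0; i < source.Length; i++)
        {
            // Throws InvalidCastException if an element is not a string (null is allowed).
            result[i] = (string)source[i];
        }

        return result;
    }
}
```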



Arrays are a good storage structure because they are the only structure provided by the runtime to store groups of items of the same type. However, because of limitations like the above, their use in APIs should be very carefully considered.



If all you need is to traverse all elements of some collection, you should use an IEnumerable<T> (IEnumerable<out T> in .NET 4.0). This way, the cost of using Enumerable.Cast<T>() is minimal.
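A short sketch of what this looks like with C# 4.0 or later. EnumerableVarianceDemo, CountAll and Demo are hypothetical names used only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableVarianceDemo
{
    // An API that only needs to traverse its input.
    public static int CountAll(IEnumerable<object> items)
    {
        return items.Count();
    }

    public static int Demo()
    {
        var strings = new List<string> { "string 1", "string 2" };

        // Covariance: an IEnumerable<string> is accepted where an
        // IEnumerable<object> is expected, without copying anything.
        IEnumerable<object> objects = strings;

        // Going the other way needs Cast<T>(), which converts lazily
        // while the sequence is being traversed.
        IEnumerable<string> back = objects.Cast<string>();

        return CountAll(strings) + back.Count();
    }
}
```

Because IEnumerable<T> only hands elements out and never accepts them, this conversion has none of the run-time store checks that arrays need.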