LINQ

Hydrating Objects With Expression Trees – Part II

LINQ With C# (Portuguese)

In my previous post I showed how to hydrate objects by creating instances and setting properties in those instances.

But, if the intent is to hydrate the objects from data, why not having an expression that does just that? That’s what the member initialization expression is for.

To create such an expression we need the constructor expression and the property binding expressions:

var properties = objectType.GetProperties();
var bindings = new MemberBinding[properties.Length];
var valuesArrayExpression = Expression.Parameter(typeof(object[]), "v");

for (int p = 0; p < properties.Length; p++)
{
    var property = properties[p];

    bindings[p] = Expression.Bind(
        property,
        Expression.Convert(
            Expression.ArrayAccess(
                valuesArrayExpression,
                Expression.Constant(p, typeof(int))
            ),
            property.PropertyType
        )
    );
}

var memberInitExpression = Expression.MemberInit(
    Expression.New(objectType),
    bindings
);

var objectHidrationExpression = Expression.Lambda<Func<object[], object>>(memberInitExpression, valuesArrayExpression);

var compiledObjectHidrationExpression = objectHidrationExpression.Compile();

This might seem more complex than the previous solution, but using it is a lot more simple:

for (int o = 0; o < objects.Length; o++)
{
    newObjects[o] = compiledObjectHidrationExpression(objects[o]);
}

Hydrating Objects With Expression Trees – Part I

LINQ With C# (Portuguese)

After my post about dumping objects using expression trees, I’ve been asked if the same could be done for hydrating objects.

Sure it can, but it might not be that easy.

What we are looking for is a way to set properties on objects of an unknown type. For that, we need to generate methods to set each property of the objects.

Such methods would look like this expression:

Expression<Action<object, object>> expression = (o, v) => ((SomeType)o).Property1 = (PropertyType)v;

Unfortunately, we cannot use the .NET Reflector trick because, if you try to compile this, you’ll get this error:

error CS0832: An expression tree may not contain an assignment operator

Fortunately, that corresponds to a valid .NET expression tree. We just have to build it by hand.

So, for a given type, the set of property setters would be built this way:

var compiledExpressions = (from property in objectType.GetProperties()
                           let objectParameterExpression = Expression.Parameter(typeof(object), "o")
                           let convertedObjectParameteExpressionr = Expression.ConvertChecked(objectParameter, objectType)
                           let valueParameter = Expression.Parameter(propertyType, "v")
                           let convertedValueParameter = Expression.ConvertChecked(valueParameter, property.PropertyType)
                           let propertyExpression = Expression.Property(convertedObjectParameter, property)
                           select
                                Expression.Lambda<Action<object, object>>(
                                    Expression.Assign(
                                        propertyExpression,
                                        convertedValueParameter
                                    ),
                                    objectParameter,
                                    valueParameter
                                ).Compile()).ToArray();

And hydrating objects would be like this:

for (int o = 0; o < objects.Length; o++)
{
    var objectProperties = objects[o];

    var newObject = newObjects[o] = Activator.CreateInstance(objectType);

    for (int p = 0; p < compiledExpressions.Length; p++)
    {
        compiledExpressions[p](newObject, objectProperties[p]);
    }
}

Mastering Expression Trees With .NET Reflector

Following my last post, I received lots of enquiries about how got to master the creation of expression trees.

The answer is: .NET Reflector

On that post I needed to to generate an expression tree for this expression:

Expression<Func<object, object>> expression = o => ((object)((SomeType)o).Property1);

I just compiled that code in Visual Studio 2010, loaded the assembly in .NET Reflector, and disassembled it to C# without optimizations (View –> Options –> Disassembler –> Optimization: None).

The disassembled code looked like this:

Expression<Func<object, object>> expression;
ParameterExpression CS$0$0000;
ParameterExpression[] CS$0$0001;
expression = Expression.Lambda<Func<object, object>>(Expression.Convert(Expression.Property(Expression.Convert(CS$0$0000 = Expression.Parameter(typeof(object), "o"), typeof(SomeType)), (MethodInfo) methodof(SomeType.get_Property1)), typeof(object)), new ParameterExpression[] { CS$0$0000 });

After giving valid C# names to the variables and tidy up the code a bit, I came up with this:

ParameterExpression parameter = Expression.Parameter(typeof(object), "o");
Expression<Func<object, object>> expression =
    Expression.Lambda<Func<object, object>>(
        Expression.Convert(
            Expression.Property(
                Expression.Convert(
                    parameter,
                    typeof(SomeType)
                ),
                "Property1"
            ),
            typeof(object)
        ),
        parameter
    );

Easy! Isn’t it?

Dumping Objects Using Expression Trees

LINQ With C# (Portuguese)

No. I’m not proposing to get rid of objects.

A colleague of mine was asked if I knew a way to dump a list of objects of unknown type into a DataTable with better performance than the way he was using.

The objects being dumped usually have over a dozen of properties, but, for the sake of this post, let’s assume they look like this:

class SomeClass
{
    public int Property1 { get; set; }
    public long Property2 { get; set; }
    public string Property3 { get; set; }
    public object Property4 { get; set; }
    public DateTimeOffset Property5 { get; set; }
}

The code he was using was something like this:

var properties = objectType.GetProperties();

foreach (object obj in objects)
{
    foreach (var property in properties)
    {
        property.GetValue(obj, null);
    }
}

For a list of one million objects, this is takes a little over 6000 milliseconds on my machine.

I immediately thought: Expression Trees!

If the type of the objects was know at compile time, it would be something like this:

Expression<Func<SomeClass, int>> expression = o => o.Property1;
var compiled = expression.Compile();
var propertyValue = compiled.Invoke(obj);

But, at compile time, the type of the object and, consequently, the type of its properties, is unknown. So, we’ll need, for each property, an expression tree like this:

Expression<Func<object, object>> expression = o => ((SomeClass)o).Property1;

The previous expression gets the value of a property of the conversion of the parameter of type object to the type of the object. The result must also be converted to type object because the type of the result must match the type of the return value of the expression.

For the same type of objects, the collection of property accessors would be built this way:

var compiledExpressions = (from property in properties
                           let objectParameter = Expression.Parameter(typeof(object), "o")
                           select
                             Expression.Lambda<Func<object, object>>(
                                 Expression.Convert(
                                     Expression.Property(
                                         Expression.Convert(
                                             objectParameter,
                                             objectType
                                         ),
                                         property
                                     ),
                                     typeof(object)
                                 ),
                                 objectParameter
                             ).Compile()).ToArray();

Looks bit overcomplicated, but reading all properties of all objects for the same object set with this code:

foreach (object obj in objects)
{
    foreach (var compiledExpression in compiledExpressions)
    {
        compiledExpression (obj);
    }
}

takes a little over 150 milliseconds on my machine.

That’s right. 2.5% of the previous time.

LINQ: Enhancing Distinct With The SelectorEqualityComparer

LINQ With C# (Portuguese)

On my last post, I introduced the PredicateEqualityComparer and a Distinct extension method that receives a predicate to internally create a PredicateEqualityComparer to filter elements.

Using the predicate, greatly improves readability, conciseness and expressiveness of the queries, but it can be even better. Most of the times, we don’t want to provide a comparison method but just to extract the comaprison key for the elements.

So, I developed a SelectorEqualityComparer that takes a method that extracts the key value for each element. Something like this:

public class SelectorEqualityComparer<TSource, Tkey> : EqualityComparer<TSource>
    where Tkey : IEquatable<Tkey>
{
    private Func<TSource, Tkey> selector;

    public SelectorEqualityComparer(Func<TSource, Tkey> selector)
        : base()
    {
        this.selector = selector;
    }

    public override bool Equals(TSource x, TSource y)
    {
        Tkey xKey = this.GetKey(x);
        Tkey yKey = this.GetKey(y);

        if (xKey != null)
        {
            return ((yKey != null) && xKey.Equals(yKey));
        }

        return (yKey == null);
    }

    public override int GetHashCode(TSource obj)
    {
        Tkey key = this.GetKey(obj);

        return (key == null) ? 0 : key.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        SelectorEqualityComparer<TSource, Tkey> comparer = obj as SelectorEqualityComparer<TSource, Tkey>;
        return (comparer != null);
    }

    public override int GetHashCode()
    {
        return base.GetType().Name.GetHashCode();
    }

    private Tkey GetKey(TSource obj)
    {
        return (obj == null) ? (Tkey)(object)null : this.selector(obj);
    }
}

Now I can write code like this:

.Distinct(new SelectorEqualityComparer<Source, Key>(x => x.Field))

And, for improved readability, conciseness and expressiveness and support for anonymous types the corresponding Distinct extension method:

public static IEnumerable<TSource> Distinct<TSource, TKey>(this IEnumerable<TSource> source, Func<TSource, TKey> selector)
    where TKey : IEquatable<TKey>
{
    return source.Distinct(new SelectorEqualityComparer<TSource, TKey>(selector));
}

And the query is now written like this:

.Distinct(x => x.Field)

For most usages, it’s simpler than using a predicate.

LINQ: Enhancing Distinct With The PredicateEqualityComparer

LINQ With C# (Portuguese)

Today I was writing a LINQ query and I needed to select distinct values based on a comparison criteria.

Fortunately, LINQ’s Distinct method allows an equality comparer to be supplied, but, unfortunately, sometimes, this means having to write custom equality comparer.

Because I was going to need more than one equality comparer for this set of tools I was building, I decided to build a generic equality comparer that would just take a custom predicate. Something like this:

public class PredicateEqualityComparer<T> : EqualityComparer<T>
{
    private Func<T, T, bool> predicate;

    public PredicateEqualityComparer(Func<T, T, bool> predicate)
        : base()
    {
        this.predicate = predicate;
    }

    public override bool Equals(T x, T y)
    {
        if (x != null)
        {
            return ((y != null) && this.predicate(x, y));
        }

        if (y != null)
        {
            return false;
        }

        return true;
    }

    public override int GetHashCode(T obj)
    {
        // Always return the same value to force the call to IEqualityComparer<T>.Equals
        return 0;
    }
}

Now I can write code like this:

.Distinct(new PredicateEqualityComparer<Item>((x, y) => x.Field == y.Field))

But I felt that I’d lost all conciseness and expressiveness of LINQ and it doesn’t support anonymous types. So I came up with another Distinct extension method:

public static IEnumerable<TSource> Distinct<TSource>(this IEnumerable<TSource> source, Func<TSource, TSource, bool> predicate)
{
    return source.Distinct(new PredicateEqualityComparer<TSource>(predicate));
}

And the query is now written like this:

.Distinct((x, y) => x.Field == y.Field)

Looks a lot better, doesn’t it? And it works wit anonymous types.

Update: I, accidently, had published the wrong version of the IEqualityComparer<T>.Equals method,

LINQ: Single vs. SingleOrDefault

LINQ With C# (Portuguese)

Like other LINQ API methods that extract a scalar value from a sequence, Single has a companion SingleOrDefault.

The documentation of SingleOrDefault states that it returns a single, specific element of a sequence of values, or a default value if no such element is found, although, in my opinion, it should state that it returns THE single, specific element of a sequence of ONE value, or a default value if no such element is found. Nevertheless, what this method does is return the default value of the source type if the sequence is empty or, like Single, throws an exception if the sequence has more than one element.

I received several comments to my last post saying that SingleOrDefault could be used to avoid an exception.

Well, it only “solves” half of the “problem”. If the sequence has more than one element, an exception will be thrown anyway.

In the end, it all comes down to semantics and intent. If it is expected that the sequence may have none or one element, than SingleOrDefault should be used. If it’s not expect that the sequence is empty and the sequence is empty, than it’s an exceptional situation and an exception should be thrown right there. And, in that case, why not use Single instead? In my opinion, when a failure occurs, it’s best to fail fast and early than slow and late.

Other methods in the LINQ API that use the same companion pattern are: ElementAt/ElementAtOrDefault, First/FirstOrDefault and Last/LastOrDefault.

LINQ: Single vs. First

LINQ With C# (Portuguese)

I’ve witnessed and been involved in several discussions around the correctness or usefulness of the Single method in the LINQ API.

The most common argument is that you are querying for the first element on the result set and an exception will be thrown if there’s more than one element. The First method should be used instead, because it doesn’t throw if the result set has more than one item.

Although the documentation for Single states that it returns a single, specific element of a sequence of values, it actually returns THE single, specific element of a sequence of ONE value. When you use the Single method in your code you are asserting that your query will result in a scalar result instead of a result set of arbitrary length.

On the other hand, the documentation for First states that it returns the first element of a sequence of arbitrary length.

Imagine you want to catch a taxi. You go the the taxi line and catch the FIRST one, no matter how many are there.

On the other hand, if you go the the parking lot to get your car, you want the SINGLE one specific car that’s yours. If your “query” “returns” more than one car, it’s an exception. Either because it “returned” not only your car or you happen to have more than one car in that parking lot. In either case, you can only drive one car at once and you’ll need to refine your “query”.

Playing With LINQ: Getting Interface Property Implementations

LINQ With C# (Portuguese)

Today, my friend Nuno was writing some code to get the PropertyInfos of a class implementation of an interface.

Given this interface:

public interface ISomeInterface { int IntProperty { get; set; } string StringProperty { get; } void Method(); }

and this class:

public class SomeClass : ISomeInterface
{
    int ISomeInterface.IntProperty { get; set; }
    public int IntProperty { get; private set; }
    public string StringProperty { get; private set; }
    public void Method() { }
}

Nuno wanted to retrieve:

  • Int32 ISomeInterface.IntProperty
  • System.String StringProperty

The code is fairly simple. First we need to get the interface mappings for the class:

typeof(SomeClass).GetInterfaceMap(typeof(ISomeInterface)).TargetMethods

and get all PropertyInfos for which the MethodInfo in the list is part of the implementation of the property (is either the set method or the get method).

Something like his:

public static bool Implements(this MethodInfo methodInfo, PropertyInfo propertyInfo)
{
    return (propertyInfo.GetGetMethod(true) == methodInfo) || (propertyInfo.GetSetMethod(true) == methodInfo);
}

But what caught my eye was that, with the above extension methods, I can use LINQ to retrieve a list of the desired PropertyInfos.

Something like this:

public static IEnumerable<PropertyInfo> GetInterfacePropertyImplementation(Type implementer, Type implemented)
{
    return (from propertyInfo in implementer.GetProperties(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic).AsEnumerable()
            from methodInfo in implementer.GetInterfaceMap(implemented).TargetMethods.AsEnumerable()
            where methodInfo.Implements(propertyInfo)
            select propertyInfo).Distinct();
}

For the sample class and interface, the usage would be something like this:

var q = GetInterfacePropertyImplementation(typeof(SomeClass), typeof(ISomeInterface));

foreach (var p in q)
{
    Console.WriteLine(p);
}

Which would produce the following output:

Int32 ISomeInterface.IntProperty
System.String StringProperty

UPDATED: The previous implementation was overcomplicated and had some string based logic. Kudos to Nuno.

How To Set Elements Of An Array Of A Private Type Using Visual Studio Shadows

Visual Studio uses Publicize to create accessors public for private members and types of a type.


But when you try to set elements of a private array of elements of a private type, things get complicated.


Imagine this hypothetic class to test:

public static class MyClass
{
private static readonly MyInnerClass[] myArray = new MyInnerClass[10];

public static bool IsEmpty()
{
foreach (var item in myArray)
{
if ((item != null) && (!string.IsNullOrEmpty(item.Field)))
{
return false;
}
}

return true;
}

private class MyInnerClass
{
public string Field;
}
}


If I want to write a test for the case when the array has “non empty” entries, I need to setup the array first.


Using the accessors generated by Visual Studio, I would write something like this:

[TestClass()]
public class MyClassTest
{
[TestMethod()]
public void IsEmpty_NotEmpty_ReturnsFalse()
{
for (int i = 0; i < 10; i++)
{
MyClass_Accessor.myArray[i] = new MyClass_Accessor.MyInnerClass { Field = i.ToString() };
}

bool expected = false;
bool actual;

actual = MyClass.IsEmpty();

Assert.AreEqual(expected, actual);
}
}


But the test will fail because, although the elements of the private array myArray can be read as MyClass_Accessor.MyInnerClass instances, they can’t be written as such.


To do so, the test would have to be written like this:

[TestClass()]
public class MyClassTest
{
[TestMethod()]
public void IsEmpty_NotEmpty_ReturnsFalse()
{
for (int i = 0; i < 10; i++)
{
MyClass_Accessor.ShadowedType.SetStaticArrayElement("myArray", new MyClass_Accessor.MyInnerClass { Field = i.ToString() }.Target, i);
}

bool expected = false;
bool actual;

actual = MyClass.IsEmpty();

Assert.AreEqual(expected, actual);
}
}


But, this way, we loose all the strong typing of the accessors because we need to write the name of the array field.


Because the accessor for the field is a property, we could write a set of extension methods that take care of getting the field name for us. Something like this:

public static class PrivateypeExtensions
{
public static void SetStaticArrayElement<T>(this PrivateType self, Expression<Func<T[]>> expression, T value, params int[] indices)
{
object elementValue = (value is BaseShadow) ? (value as BaseShadow).Target : value;

self.SetStaticArrayElement(
((PropertyInfo)((MemberExpression)(expression.Body)).Member).Name,
elementValue,
indices);
}

public static void SetStaticArrayElement<T>(this PrivateType self, Expression<Func<T[]>> expression, BindingFlags invokeAttr, T value, params int[] indices)
{
object elementValue = (value is BaseShadow) ? (value as BaseShadow).Target : value;

self.SetStaticArrayElement(
((PropertyInfo)((MemberExpression)(expression.Body)).Member).Name,
invokeAttr,
elementValue,
indices);
}
}


Now, we can write the test like this:

[TestClass()]
public class MyClassTest
{
[TestMethod()]
public void IsEmpty_NotEmpty_ReturnsFalse()
{
for (int i = 0; i < 10; i++)
{
MyClass_Accessor.ShadowedType.SetStaticArrayElement(() => MyClass_Accessor.myArray, new MyClass_Accessor.MyInnerClass { Field = i.ToString() }, i);
}

bool expected = false;
bool actual;

actual = MyClass.IsEmpty();

Assert.AreEqual(expected, actual);
}
}


It’s not the same as the first form, but it’s strongly typed and we’ll get a compiler error instead of a test run error if we change the name of the myArray field.


You can find this and other tools on the PauloMorgado.TestTools on CodePlex.