Reimplementing LINQ to Objects: Part 37 – Guiding principles

Now that I’m "done" reimplementing LINQ to Objects – in that I’ve implemented all the methods in System.Linq.Enumerable – I wanted to write a few posts looking at the bigger picture. I’m not 100% sure of what this will consist of yet; I want to avoid this blog series continuing forever. However, I’m confident it will contain (in no particular order):

- This post: principles governing the behaviour of LINQ to Objects
- Missing operators: what else I’d have liked to see in Enumerable
- Optimization: where the .NET implementation could be further optimized, and why some obvious-sounding optimizations may be inappropriate
- How …

Continue reading Reimplementing LINQ to Objects: Part 37 – Guiding principles

Reimplementing LINQ to Objects: Part 36 – AsEnumerable

Our last operator is the simplest of all. Really, really simple.

What is it?

AsEnumerable has a single signature:

    public static IEnumerable<TSource> AsEnumerable<TSource>(this IEnumerable<TSource> source)

I can describe its behaviour pretty easily: it returns source. That’s all it does. There’s no argument validation, and it doesn’t create another iterator. It just returns source. You may well be wondering what the point is… and it’s all about changing the compile-time type of the expression. I’m going to talk about IQueryable<T> in another post (although probably not implement anything related to it) but hopefully you’re aware that it’s usually used for "out of process" … Continue reading Reimplementing LINQ to Objects: Part 36 – AsEnumerable
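To make the "it just returns source" description concrete, here is a minimal sketch. The class name AsEnumerableSketch and the method name AsSequence are hypothetical, chosen only so the code does not clash with the real Enumerable.AsEnumerable; this is not the code from the post itself.

```csharp
using System.Collections.Generic;

public static class AsEnumerableSketch
{
    // Hypothetical stand-in for AsEnumerable.
    // No argument validation and no iterator block: the very same reference
    // comes back, and only the compile-time type of the expression changes.
    public static IEnumerable<TSource> AsSequence<TSource>(this IEnumerable<TSource> source)
    {
        return source;
    }
}
```

Calling list.AsSequence() on a List<int> hands back the same object, but typed as IEnumerable<int>, so later extension method calls bind against IEnumerable<T> rather than anything more specific.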

Reimplementing LINQ to Objects: Part 35 – Zip

Zip will be a familiar operator to any readers who use Python. It was introduced in .NET 4 – it’s not entirely clear why it wasn’t part of the first release of LINQ, to be honest. Perhaps no-one thought of it as a useful operator until it was too late in the release cycle, or perhaps implementing it in the other providers (e.g. LINQ to SQL) took too long. Eric Lippert blogged about it in 2009, and I find it interesting to note that aside from braces, layout and names we’ve got exactly the same code. (I read the post … Continue reading Reimplementing LINQ to Objects: Part 35 – Zip

Reimplementing LINQ to Objects: Part 34 – SequenceEqual

Nearly there now…

What is it?

SequenceEqual has two overloads – the obvious two given that we’re dealing with equality:

    public static bool SequenceEqual<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second)

    public static bool SequenceEqual<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer)

The purpose of the operator is to determine if two sequences are equal; that is, if they consist of the same elements, in the same order. A custom equality comparer can be used to compare each individual pair of elements.

Characteristics:

- The first and second parameters mustn’t be null, and are validated immediately.
- The comparer parameter can be null, …

Continue reading Reimplementing LINQ to Objects: Part 34 – SequenceEqual
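The pairwise comparison described above can be sketched roughly as follows. The names SequenceEqualSketch and SameSequence are hypothetical, and treating a null comparer as "use the default equality comparer" is an assumption here (it matches the documented behaviour of Enumerable.SequenceEqual, but the excerpt is cut off before it says so).

```csharp
using System;
using System.Collections.Generic;

public static class SequenceEqualSketch
{
    public static bool SameSequence<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second)
    {
        return first.SameSequence(second, null);
    }

    public static bool SameSequence<TSource>(
        this IEnumerable<TSource> first,
        IEnumerable<TSource> second,
        IEqualityComparer<TSource> comparer)
    {
        // Both sequences are validated immediately.
        if (first == null) throw new ArgumentNullException(nameof(first));
        if (second == null) throw new ArgumentNullException(nameof(second));
        // Assumption: a null comparer means the default equality comparer.
        comparer = comparer ?? EqualityComparer<TSource>.Default;

        using (IEnumerator<TSource> iterator1 = first.GetEnumerator())
        using (IEnumerator<TSource> iterator2 = second.GetEnumerator())
        {
            while (true)
            {
                bool next1 = iterator1.MoveNext();
                bool next2 = iterator2.MoveNext();
                // One sequence ended before the other: different lengths.
                if (next1 != next2) return false;
                // Both ended together with no mismatch: equal.
                if (!next1) return true;
                if (!comparer.Equals(iterator1.Current, iterator2.Current)) return false;
            }
        }
    }
}
```

For example, new[] { "a", "B" }.SameSequence(new[] { "A", "b" }, StringComparer.OrdinalIgnoreCase) is true, while the same call without the comparer is false.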

Reimplementing LINQ to Objects: Part 33 – Cast and OfType

More design decisions around optimization today, but possibly less controversial ones…

What are they?

Cast and OfType are somewhat unusual LINQ operators. They are extension methods, but they work on the non-generic IEnumerable type instead of the generic IEnumerable<T> type:

    public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)

    public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source)

It’s worth mentioning what Cast and OfType are used for to start with. There are two main purposes:

- Using a non-generic collection (such as a DataTable or an ArrayList) within a LINQ query (DataTable has the AsEnumerable extension method too)
- Changing the type of a generic collection, usually to …

Continue reading Reimplementing LINQ to Objects: Part 33 – Cast and OfType
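As a rough illustration of the difference between the two operators, here is a minimal sketch without any of the optimizations the post goes on to discuss. The names CastAndOfTypeSketch, CastEach and OfTypeOnly are hypothetical, and the eager null check with deferred iteration is an assumption modelled on how the other operators in this series are described.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CastAndOfTypeSketch
{
    public static IEnumerable<TResult> CastEach<TResult>(this IEnumerable source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return CastEachImpl<TResult>(source);
    }

    private static IEnumerable<TResult> CastEachImpl<TResult>(IEnumerable source)
    {
        foreach (object item in source)
        {
            // A failed cast surfaces as InvalidCastException during iteration.
            yield return (TResult) item;
        }
    }

    public static IEnumerable<TResult> OfTypeOnly<TResult>(this IEnumerable source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return OfTypeOnlyImpl<TResult>(source);
    }

    private static IEnumerable<TResult> OfTypeOnlyImpl<TResult>(IEnumerable source)
    {
        foreach (object item in source)
        {
            // Items of the wrong type (and nulls) are skipped rather than failing.
            if (item is TResult)
            {
                yield return (TResult) item;
            }
        }
    }
}
```

Given an ArrayList containing 1, "two" and 3, OfTypeOnly<int>() yields 1 and 3, whereas iterating CastEach<int>() throws when it reaches the string.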

Reimplementing LINQ to Objects: Part 32 – Contains

After the dubious optimizations of ElementAt/ElementAtOrDefault yesterday, we meet an operator which is remarkably good at defying optimization. Sort of. Depending on how you feel it should behave.

What is it?

Contains has two overloads, which only differ by whether or not they take an equality comparer – just like Distinct, Intersect and the like:

    public static bool Contains<TSource>(
        this IEnumerable<TSource> source,
        TSource value)

    public static bool Contains<TSource>(
        this IEnumerable<TSource> source,
        TSource value,
        IEqualityComparer<TSource> comparer)

The operator simply returns a Boolean indicating whether or not "value" was found in "source". The salient points of its behaviour should be predictable … Continue reading Reimplementing LINQ to Objects: Part 32 – Contains
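Setting the optimization question aside entirely, the basic behaviour can be sketched as a plain linear scan. The names ContainsSketch and ContainsValue are hypothetical, and falling back to the default equality comparer when comparer is null is an assumption based on the documented behaviour of Enumerable.Contains, since the excerpt stops before describing it.

```csharp
using System;
using System.Collections.Generic;

public static class ContainsSketch
{
    public static bool ContainsValue<TSource>(
        this IEnumerable<TSource> source,
        TSource value)
    {
        return source.ContainsValue(value, null);
    }

    public static bool ContainsValue<TSource>(
        this IEnumerable<TSource> source,
        TSource value,
        IEqualityComparer<TSource> comparer)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        // Assumption: a null comparer means the default equality comparer.
        comparer = comparer ?? EqualityComparer<TSource>.Default;

        // Immediate execution: stop as soon as a match is found.
        foreach (TSource item in source)
        {
            if (comparer.Equals(item, value))
            {
                return true;
            }
        }
        return false;
    }
}
```

This version deliberately never asks the source collection for its own Contains method, so a HashSet<T> built with a custom comparer is still scanned element by element using the comparer passed in here.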

Reimplementing LINQ to Objects: Part 31 – ElementAt / ElementAtOrDefault

A nice easy pair of operators tonight. I should possibly have covered them at the same time as First/Last/Single and the OrDefault variants, but never mind…

What are they?

ElementAt and ElementAtOrDefault have a single overload each:

    public static TSource ElementAt<TSource>(
        this IEnumerable<TSource> source,
        int index)

    public static TSource ElementAtOrDefault<TSource>(
        this IEnumerable<TSource> source,
        int index)

Isn’t that blissfully simple after the overload storm of the past few days? The two operators work in very similar ways:

- They use immediate execution.
- The source parameter must not be null, and this is validated immediately.
- They return the element at the …

Continue reading Reimplementing LINQ to Objects: Part 31 – ElementAt / ElementAtOrDefault
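One way to sketch the pair is with a shared helper that walks the sequence, leaving out the IList<T> shortcut entirely. The names ElementAtSketch, ItemAt, ItemAtOrDefault and TryItemAt are hypothetical. The out-of-range behaviour (ArgumentOutOfRangeException for ItemAt, default(TSource) for ItemAtOrDefault) is an assumption taken from the documented behaviour of the real operators, because the excerpt is cut off before that point.

```csharp
using System;
using System.Collections.Generic;

public static class ElementAtSketch
{
    public static TSource ItemAt<TSource>(this IEnumerable<TSource> source, int index)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        TSource element;
        if (!TryItemAt(source, index, out element))
        {
            throw new ArgumentOutOfRangeException(nameof(index));
        }
        return element;
    }

    public static TSource ItemAtOrDefault<TSource>(this IEnumerable<TSource> source, int index)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        TSource element;
        // On failure the helper leaves element as default(TSource).
        TryItemAt(source, index, out element);
        return element;
    }

    private static bool TryItemAt<TSource>(
        IEnumerable<TSource> source, int index, out TSource element)
    {
        element = default(TSource);
        if (index < 0) return false;

        using (IEnumerator<TSource> iterator = source.GetEnumerator())
        {
            // Advance index + 1 times; counting down avoids overflow
            // when index is int.MaxValue.
            for (int remaining = index; remaining >= 0; remaining--)
            {
                if (!iterator.MoveNext()) return false;
            }
            element = iterator.Current;
            return true;
        }
    }
}
```

With new[] { 10, 20, 30 }, ItemAt(1) returns 20, ItemAtOrDefault(5) returns 0, and ItemAt(5) throws.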

Reimplementing LINQ to Objects: Part 30 – Average

This is the final aggregation operator, after which I suspect we won’t need to worry about floating point difficulties any more. Between this and the unexpected behaviour of Comparer<string>.Default, I’ve covered two of my "big three" pain points. It’s hard to see how I could get dates and times into Edulinq naturally; it’s even harder to see how time zones could cause problems. I’ve still got a few operators to go though, so you never know…

What is it?

Average has 20 overloads, all like the following but for long, decimal, float and double as well as int: public static double Average(this … Continue reading Reimplementing LINQ to Objects: Part 30 – Average

Reimplementing LINQ to Objects: Part 29 – Min/Max

The second and third AOOOD operators today… if I’m brave enough to tackle Average tomorrow, I’ll have done them all. More surprises here today, this time in terms of documentation…

What are they?

Min and Max are both extension methods with 22 overloads each. Min looks like this:

    public static int Min(this IEnumerable<int> source)

    public static int Min<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, int> selector)

    public static int? Min(this IEnumerable<int?> source)

    public static int? Min<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, int?> selector)

    // Repeat the above four overloads for long, float, double and decimal,
    // then add two more generic ones:
    public static TSource Min<TSource>(this IEnumerable<TSource> source)
    …

Continue reading Reimplementing LINQ to Objects: Part 29 – Min/Max

Reimplementing LINQ to Objects: Part 28 – Sum

Okay, I’ve bitten the bullet. The first of the four Aggregation Operators Of Overload Doom (AOOOD) that I’ve implemented is Sum. It was far from difficult to implement – just tedious.

What is it?

Sum has 20 overloads – a set of 4 for each of the types that it covers (int, long, float, double, decimal). Here are the overloads for int:

    public static int Sum(this IEnumerable<int> source)

    public static int? Sum(this IEnumerable<int?> source)

    public static int Sum<T>(
        this IEnumerable<T> source,
        Func<T, int> selector)

    public static int? Sum<T>(
        this IEnumerable<T> source,
        Func<T, int?> selector)

As you can see, there are basically two variations: A … Continue reading Reimplementing LINQ to Objects: Part 28 – Sum
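To give a feel for why this is tedious rather than difficult, here is a minimal sketch of the four int overloads. The names SumSketch and SumOf are hypothetical. Two behaviours are assumptions taken from the documented behaviour of Enumerable.Sum rather than from the excerpt: the arithmetic is checked, so overflow throws, and null values in the nullable overloads are skipped, leaving a non-null total.

```csharp
using System;
using System.Collections.Generic;

public static class SumSketch
{
    public static int SumOf(this IEnumerable<int> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        checked
        {
            int total = 0;
            foreach (int item in source)
            {
                total += item; // throws OverflowException on overflow
            }
            return total;
        }
    }

    public static int? SumOf(this IEnumerable<int?> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        checked
        {
            int total = 0;
            foreach (int? item in source)
            {
                // Null values contribute nothing to the total.
                total += item.GetValueOrDefault();
            }
            return total;
        }
    }

    public static int SumOf<T>(this IEnumerable<T> source, Func<T, int> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        checked
        {
            int total = 0;
            foreach (T item in source)
            {
                total += selector(item);
            }
            return total;
        }
    }

    public static int? SumOf<T>(this IEnumerable<T> source, Func<T, int?> selector)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (selector == null) throw new ArgumentNullException(nameof(selector));
        checked
        {
            int total = 0;
            foreach (T item in source)
            {
                total += selector(item).GetValueOrDefault();
            }
            return total;
        }
    }
}
```

Repeating these four methods for long, float, double and decimal gives the full set of 20; only the element type (and whether checked arithmetic is meaningful) changes between sets.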