Reimplementing LINQ to Objects: Part 33 – Cast and OfType

More design decisions around optimization today, but possibly less controversial ones… What are they? Cast and OfType are somewhat unusual LINQ operators. They are extension methods, but they work on the non-generic IEnumerable type instead of the generic IEnumerable<T> type: public static IEnumerable<TResult> Cast<TResult>(this IEnumerable source)          public static IEnumerable<TResult> OfType<TResult>(this IEnumerable source) It’s worth mentioning what Cast and OfType are used for to start with. There are two main purposes: Using a non-generic collection (such as a DataTable or an ArrayList) within a LINQ query (DataTable has the AsEnumerable extension method too) Changing the type of a generic collection, usually to … Continue reading Reimplementing LINQ to Objects: Part 33 – Cast and OfType

Reimplementing LINQ to Objects: Part 32 – Contains

After the dubious optimizations of ElementAt/ElementAtOrDefault yesterday, we meet an operator which is remarkably good at defying optimization. Sort of. Depending on how you feel it should behave. What is it? Contains has two overloads, which only differ by whether or not they take an equality comparer – just like Distinct, Intersect and the like: public static bool Contains<TSource>(     this IEnumerable<TSource> source,     TSource value) public static bool Contains<TSource>(     this IEnumerable<TSource> source,     TSource value,     IEqualityComparer<TSource> comparer) The operator simply returns a Boolean indicating whether or not "value" was found in "source". The salient points of its behaviour should be predictable … Continue reading Reimplementing LINQ to Objects: Part 32 – Contains

Reimplementing LINQ to Objects: Part 31 – ElementAt / ElementAtOrDefault

A nice easy pair of operators tonight. I should possibly have covered them at the same time as First/Last/Single and the OrDefault variants, but never mind… What are they? ElementAt and ElementAtOrDefault have a single overload each: public static TSource ElementAt<TSource>(     this IEnumerable<TSource> source,     int index) public static TSource ElementAtOrDefault<TSource>(     this IEnumerable<TSource> source,     int index) Isn’t that blissfully simple after the overload storm of the past few days? The two operators work in very similar ways: They use immediate execution. The source parameter must not be null, and this is validated immediately. They return the element at the … Continue reading Reimplementing LINQ to Objects: Part 31 – ElementAt / ElementAtOrDefault

Reimplementing LINQ to Objects: Part 30 – Average

This is the final aggregation operator, after which I suspect we won’t need to worry about floating point difficulties any more. Between this and the unexpected behaviour of Comparer<string>.Default, I’ve covered two of my "big three" pain points. It’s hard to see how I could get dates and times into Edulinq naturally; it’s even harder to see how time zones could cause problems. I’ve still got a few operators to go though, so you never know… What is it? Average has 20 overloads, all like the following but for long, decimal, float and double as well as int: public static double Average(this … Continue reading Reimplementing LINQ to Objects: Part 30 – Average

Reimplementing LINQ to Objects: Part 29 – Min/Max

The second and third AOOOD operators today… if I’m brave enough to tackle Average tomorrow, I’ll have done them all. More surprises here today, this time in terms of documentation… What are they? Min and Max are both extension methods with 22 overloads each. Min looks like this: public static int Min(this IEnumerable<int> source) public static int Min<TSource>(     this IEnumerable<TSource> source,     Func<TSource, int> selector) public static int? Min(this IEnumerable<int?> source) public static int? Min<TSource>(     this IEnumerable<TSource> source,     Func<TSource, int?> selector) // Repeat the above four overloads for long, float, double and decimal, // then add two more generic ones: public static TSource Min<TSource>(this IEnumerable<TSource> source) … Continue reading Reimplementing LINQ to Objects: Part 29 – Min/Max

Reimplementing LINQ to Objects: Part 28 – Sum

Okay, I’ve bitten the bullet. The first of the four Aggregation Operators Of Overload Doom (AOOOD) that I’ve implemented is Sum. It was far from difficult to implement – just tedious. What is it? Sum has 20 overloads – a set of 4 for each of the types that it covers (int, long, float, double, decimal). Here are the overloads for int: public static int Sum(this IEnumerable<int> source) public static int? Sum(this IEnumerable<int?> source) public static int Sum<T>(     this IEnumerable<T> source,     Func<T, int> selector) public static int? Sum<T>(     this IEnumerable<T> source,     Func<T, int?> selector) As you can see, there are basically two variations: A … Continue reading Reimplementing LINQ to Objects: Part 28 – Sum

Reimplementing LINQ to Objects: Part 27 – Reverse

Time for a change of pace after the deep dive into sorting. Reversing is pretty simple… which is not to say there’s nothing to discuss, of course. What is it? Reverse only has a single, simple signature: public static IEnumerable<TSource> Reverse<TSource>(     this IEnumerable<TSource> source) The behaviour is pretty simple to describe: source cannot be null; this is validated eagerly. The operator uses deferred execution: until you start reading from the result, it won’t read anything from the input sequence As soon as you start reading from the result sequence, the input sequence is read in its entirety The result sequence … Continue reading Reimplementing LINQ to Objects: Part 27 – Reverse

Reimplementing LINQ to Objects: Part 26d – Fixing the key selectors, and yielding early

I feel I need a voice over. "Previously, on reimplementing LINQ to Objects…" Well, we’d got as far as a working implementation of OrderedEnumerable which didn’t have terrible performance – unless you had an expensive key selector. Oh, and it didn’t make use of the fact that we may only want the first few results. Executing key selectors only once Our first problem is to do with the key selectors. For various reasons (mentioned in part 26b) life is better if we execute the key selector once per input element. While we can do that with lazy evaluation, it makes … Continue reading Reimplementing LINQ to Objects: Part 26d – Fixing the key selectors, and yielding early

Reimplementing LINQ to Objects: Part 26c – Optimizing OrderedEnumerable

Part 26b left us with a working implementation of the ordering operators, with two caveats: The sort algorithm used was awful We were performing the key selection on every comparison, instead of once to start with Today’s post is just going to fix the first bullet – although I’m pretty sure that fixing the second will require changing it again completely. Choosing a sort algorithm There are lots of sort algorithms available. In our case, we need the eventual algorithm to: Work on arbitrary pair-based comparisons Be stable Go like the clappers 🙂 (Ideally) allow the first results to be … Continue reading Reimplementing LINQ to Objects: Part 26c – Optimizing OrderedEnumerable

Reimplementing LINQ to Objects: Part 26b – OrderBy{,Descending}/ThenBy{,Descending}

Last time we looked at IOrderedEnumerable<TElement> and I gave an implementation we could use in order to implement the public extension methods within LINQ. I’m still going to do that in this post, but it’s worth mentioning something else that’s coming up in another part (26d) – I’m going to revisit my OrderedEnumerable implementation. There may be trouble ahead… A comment on the previous post mentioned how my comparer executes the keySelector on each element every time it makes a comparison. I didn’t think of that as a particularly awful problem, until I thought of this sample query to rank … Continue reading Reimplementing LINQ to Objects: Part 26b – OrderBy{,Descending}/ThenBy{,Descending}