Reimplementing LINQ to Objects: Part 26d – Fixing the key selectors, and yielding early

I feel I need a voice over. "Previously, on reimplementing LINQ to Objects…" Well, we’d got as far as a working implementation of OrderedEnumerable which didn’t have terrible performance – unless you had an expensive key selector. Oh, and it didn’t make use of the fact that we may only want the first few results.

Executing key selectors only once

Our first problem is to do with the key selectors. For various reasons (mentioned in part 26b) life is better if we execute the key selector once per input element. While we can do that with lazy evaluation, it makes more sense in my opinion to do it up-front. That means we need to separate out the key selector from the key comparer – in other words, we need to get rid of the handy ProjectionComparer we used to simplify the arguments to OrderBy/ThenBy/etc.

If we’re going to keep the key selectors in a strongly typed way, that means our OrderedEnumerable (or at least some type involved in the whole business) needs to become generic in the key type. Let’s bite the bullet and make it OrderedEnumerable. Now we have a slight problem right away in the fact that the "CreateOrderedEnumerable" method is generic, introducing a new type parameter TKey… so we shouldn’t use TKey as the name of the new type parameter for OrderedEnumerable. We could rename the type parameter in the generic method implementation, but I’m becoming a big believer in leaving the signatures of methods alone when I implement an interface. For type parameters it’s not too bad, but for normal parameters it can be awful if you mess around with the names – particularly for those using named arguments.

Thinking ahead, our single "key" type parameter in OrderedEnumerable could well end up being a composite key. After all, if we have OrderBy(…).ThenBy(…).ThenBy(…) we’re going to have to have some way of representing the key formed by the three selectors. It makes sense to use a "nested" key type, where the key type of OrderedEnumerable is always the "composite key so far". Thus I named the type parameter TCompositeKey, and introduced an appropriate field. Here’s the skeleton of the new class:

internal class OrderedEnumerable<TElement, TCompositeKey> : IOrderedEnumerable<TElement>
{
    private readonly IEnumerable<TElement> source;
    private readonly Func<TElement, TCompositeKey> compositeSelector;
    private readonly IComparer<TCompositeKey> compositeComparer;

    internal OrderedEnumerable(IEnumerable<TElement> source,
        Func<TElement, TCompositeKey> compositeSelector,
        IComparer<TCompositeKey> compositeComparer)
    {
        this.source = source;
        this.compositeSelector = compositeSelector;
        this.compositeComparer = compositeComparer;
    }

    // Interface implementations here
}

(I’m aware this is very "stream of consciousness" – I’m assuming that presenting the decisions in the order in which I addressed them is a good way of explaining the necessary changes. Apologies if the style doesn’t work for you.)

ThenBy and ThenByDescending don’t have to change at all – they were already just using the interface. OrderBy and OrderByDescending become a little simpler, as we don’t need to build the projection comparer. Here’s the new version of OrderBy:

public static IOrderedEnumerable<TSource> OrderBy<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IComparer<TKey> comparer)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    if (keySelector == null)
    {
        throw new ArgumentNullException("keySelector");
    }
    return new OrderedEnumerable<TSource, TKey>
        (source, keySelector, comparer ?? Comparer<TKey>.Default);
}

Lovely – we just call a constructor, basically.

So far, so good. Now what about the implementation of IOrderedEnumerable? We should expect this to get messy, because there are three types of key involved:

  • The current key type
  • The secondary key type
  • The composite key type

Currently we don’t even have a type which can represent the composite key. We could use something like KeyValuePair<TKey, TValue>, but that doesn’t really give the right impression. Instead, let’s create our own simple type:

internal struct CompositeKey<TPrimary, TSecondary>
{
    private readonly TPrimary primary;
    private readonly TSecondary secondary;

    internal TPrimary Primary { get { return primary; } }
    internal TSecondary Secondary{ get { return secondary; } }

    internal CompositeKey(TPrimary primary, TSecondary secondary)
    {
        this.primary = primary;
        this.secondary = secondary;
    }
}

Now we can easily create a projection from two key selectors to a new one which selects a composite key. However, we’ll need to do the same thing for a comparer. We could use the CompoundComparer class we created before, but that will end up with quite a bit of indirection. Instead, it would be nice to have a type to work directly with CompositeKey – something which knew it was dealing with comparers of different types, one for each part of the key.

We could create a completely separate top-level type for that… but specifying the type parameters again seems a bit daft when we can reuse them by simply creating a nested class within CompositeKey:

internal struct CompositeKey<TPrimary, TSecondary>
{
    // Other members as shown above

    internal sealed class Comparer : IComparer<CompositeKey<TPrimary, TSecondary>>
    {
        private readonly IComparer<TPrimary> primaryComparer;
        private readonly IComparer<TSecondary> secondaryComparer;

        internal Comparer(IComparer<TPrimary> primaryComparer,
                          IComparer<TSecondary> secondaryComparer)
        {
            this.primaryComparer = primaryComparer;
            this.secondaryComparer = secondaryComparer;
        }

        public int Compare(CompositeKey<TPrimary, TSecondary> x,
                           CompositeKey<TPrimary, TSecondary> y)
        {
            int primaryResult = primaryComparer.Compare(x.Primary, y.Primary);
            if (primaryResult != 0)
            {
                return primaryResult;
            }
            return secondaryComparer.Compare(x.Secondary, y.Secondary);
        }
    }
}

This may look a little odd to begin with, but the two types really are quite deeply connected.

Now that we can compose keys in terms of both selection and comparison, we can implement CreateOrderedEnumerable:

public IOrderedEnumerable<TElement> CreateOrderedEnumerable<TKey>(
    Func<TElement, TKey> keySelector,
    IComparer<TKey> comparer,
    bool descending)
{
    if (keySelector == null)
    {
        throw new ArgumentNullException("keySelector");
    }
    comparer = comparer ?? Comparer<TKey>.Default;
    if (descending)
    {
        comparer = new ReverseComparer<TKey>(comparer);
    }

    // Copy to a local variable so we don’t need to capture "this"
    Func<TElement, TCompositeKey> primarySelector = compositeSelector;
    Func<TElement, CompositeKey<TCompositeKey, TKey>> newKeySelector = 
        element => new CompositeKey<TCompositeKey, TKey>(primarySelector(element), keySelector(element));

    IComparer<CompositeKey<TCompositeKey, TKey>> newKeyComparer =
        new CompositeKey<TCompositeKey, TKey>.Comparer(compositeComparer, comparer);

    return new OrderedEnumerable<TElement, CompositeKey<TCompositeKey, TKey>>
        (source, newKeySelector, newKeyComparer);
}

I’m not going to pretend that the second half of the method is anything other than ghastly. I’m not sure I’ve ever written code which is so dense in type arguments. IComparer<CompositeKey<TCompositeKey, TKey>> is a particularly "fine" type. Ick.

However, it works – and once you’ve got your head round what each of the type parameters actually means at any one time, it’s not really complicated code – it’s just verbose and clunky.

The only bit which might require a bit of explanation is the primarySelector variable. I could certainly have just used compositeSelector within the lambda expression used to create the new key selector – it’s not like it’s going to change, after all. The memory benefits of not having a reference to "this" (where the intermediate OrderedEnumerable is likely to be eligible for GC collection immediately, in a typical OrderBy(…).ThenBy(…) call) are almost certainly not worth it. It just feels right to have both the primary and secondary key selectors in the same type, which is what will happen with the current code. They’re both local variables, they’ll be captured together, all will be well.

I hope you can see the parallel between the old code and the new code. Previously we composed a new (element-based) comparer based on the existing comparer, and a projection comparer from the method parameters. Now we’re composing a new key selector and a new key comparer. It’s all the same idea, just maintaining the split between key selection and key comparison.

Now let’s sort…

So far, we haven’t implemented GetEnumerator – and that’s all. As soon as we’ve done that to our satisfaction, we’re finished with ordering.

There are several approaches to how we could sort. Here are a few of them:

  • Project each element to its key, and create a KeyValuePair for each item. Merge sort in the existing way to achieve stability. This will involve copying a lot of data around – particularly if the element and key types end up being large value types.
  • Project each element to a { key, index } pair, and create another composite comparer which uses the index as a tie-breaker to achieve stability. This still involves copying keys around, but it means we could easily use a built-in sort (such as List<T>).
  • Project each element to a key, and separately create an array of indexes (0, 1, 2, 3…). Sort the indexes by accessing the relevant key at any point, using indexes as tie-breakers. This requires a more fiddly sort, as we need to keep indexing into the indexes array.
  • Build up "chunks" of sorted data as we read it in, keeping some number of chunks and merging them appropriate when we want to. We can then yield the results without ever performing a full sort, by effectively performing the "merge" operation of merge sort, just yielding values instead of copying them to temporary storage. (Obviously this is trivial with 2 chunks, but can be extended to more.)
  • Do something involving a self-balancing binary tree :)

I decided to pick the middle option, using quicksort as the sorting algorithm. This comes with the normal problems of possibly picking bad pivots, but it’s usually a reasonable choice. I believe there are cunning ways of improving the worst-case performance, but I haven’t implemented any of those.

Here’s the non-quicksort part of the code, just to set the scene.

public IEnumerator<TElement> GetEnumerator()
{
    // First copy the elements into an array: don’t bother with a list, as we
    // want to use arrays for all the swapping around.
    int count;
    TElement[] data = source.ToBuffer(out count);

    int[] indexes = new int[count];
    for (int i = 0; i < indexes.Length; i++)
    {
        indexes[i] = i;
    }

    TCompositeKey[] keys = new TCompositeKey[count];
    for (int i = 0; i < keys.Length; i++)
    {
        keys[i] = compositeSelector(data[i]);
    }

    QuickSort(indexes, keys, 0, count – 1);

    for (int i = 0; i < indexes.Length; i++)
    {
        yield return data[indexes[i]];
    }
}

I could certainly have combined the first two loops – I just liked the separation provided in this code. One tiny micro-optimization point to note is that for each loop I’m using the Length property of the array rather than "count" as the upper bound, as I believe that will reduce the amount of array boundary checking the JIT will generate. I very much doubt that it’s relevant, admittedly :) I’ve left the code here as it is in source control – but looking at it now, I could certainly have used a foreach loop on the final yield part. We wouldn’t be able to later, admittedly… but I’ll come to that all in good time.

The actual quicksort part is reasonably standard except for the fact that I pass in both the arrays for both indexes and keys – usually there’s just the one array which is being sorted. Here’s the code for both the recursive call and the partition part:

private void QuickSort(int[] indexes, TCompositeKey[] keys, int left, int right)
{
    if (right > left)
    {
        int pivot = left + (right – left) / 2;
        int pivotPosition = Partition(indexes, keys, left, right, pivot);
        QuickSort(indexes, keys, left, pivotPosition – 1);
        QuickSort(indexes, keys, pivotPosition + 1, right);
    }
}

private int Partition(int[] indexes, TCompositeKey[] keys, int left, int right, int pivot)
{
    // Remember the current index (into the keys/elements arrays) of the pivot location
    int pivotIndex = indexes[pivot];
    TCompositeKey pivotKey = keys[pivotIndex];

    // Swap the pivot value to the end
    indexes[pivot] = indexes[right];
    indexes[right] = pivotIndex;
    int storeIndex = left;
    for (int i = left; i < right; i++)
    {
        int candidateIndex = indexes[i];
        TCompositeKey candidateKey = keys[candidateIndex];
        int comparison = compositeComparer.Compare(candidateKey, pivotKey);
        if (comparison < 0 || (comparison == 0 && candidateIndex < pivotIndex))
        {
            // Swap storeIndex with the current location
            indexes[i] = indexes[storeIndex];
            indexes[storeIndex] = candidateIndex;
            storeIndex++;
        }
    }
    // Move the pivot to its final place
    int tmp = indexes[storeIndex];
    indexes[storeIndex] = indexes[right];
    indexes[right] = tmp;
    return storeIndex;
}

It’s interesting to observe how similar the quicksort and merge sort recursive parts are – both picking a midpoint, recursing on the left of it, recursing on the right of it, and performing some operation on the whole sublist. Of course the "some operation" is very different between partition and merge, and it occurs at a different time – but it’s an interesting parallel nonetheless.

One significant difference between merge sort and quicksort is the use of the pivot. Once Partition has returned where the pivot element ended up, quicksort doesn’t touch that element itself (we already know it will be in the right place). It recurses on the sublist entirely to the left of the pivot and the sublist entirely to the right of the pivot. Compare this with merge sort with recurses on two sublists which together comprise the whole list for that call.

The overloading of the word "index" here is unfortunate, but that is unfortunately life. Both sorts of "index" here really are indexes… you just need to keep an eye on which is which.

The final point to note is how we’re using the indexes in the comparison, as a tie-break to keep stability. It’s an ugly expression, but it does the job.

(As a small matter of language, I wasn’t sure whether to use indexes or indices. I far prefer the former, so I used it. Having just checked in the dictionary, it appears both are correct. This reminds me of when I was writing C# in Depth – I could never decide between appendixes and appendices. Blech.)

Now, do you want to hear the biggest surprise I received last night? After I’d fixed up the compile-time errors to arrive at the code above, it worked first time. I’m not kidding. I’m not quite sure how I pulled that off (merge sort didn’t take long either, but it did at least have a few tweaks to fix up) but it shocked the heck out of me. So, are we done? Well, not quite.

Yielding early

Just as a reminder, one of my aims was to be able to use iterator blocks to return some values to anyone iterating over the result stream without having to do all the sorting work. This means that in the case of calling OrderBy(…).Take(5) on a large collection, we can end up saving a lot of work… I hope!

This is currently fairly normal quicksort code, leaving the "dual arrays" aspect aside… but it’s not quite amenable to early yielding. We’re definitely computing the earliest results first, due to the order of the recursion – but we can’t yield from the recursive method – iterator blocks just don’t do that.

So, we’ll have to fake the recursion. Fortunately, quicksort is only directly recursive – we don’t need to worry about mutually recursive routines: A calling B which might call C or it might call back to A, etc. Instead, we can just keep a Stack<T> of "calls" to quicksort that we want to make, and execute the appropriate code within our GetEnumerator() method, so we can yield at the right point. Now in the original code, quicksort has four parameters, so you might expect our Stack<T> to have those four values within T too… but no! Two of those values are just the keys and indexes… and we already have those in two local variables. We only need to keep track of "right" and "left". Again, for the sake of clarity I decided to implement this using a custom struct – nested within OrderedEnumerable as there’s no need for it to exist anywhere else:

private struct LeftRight
{
    internal int left, right;
    internal LeftRight(int left, int right)
    {
        this.left = left;
        this.right = right;
    }
}

Purists amongst you may curse at the use of internal fields rather than properties. I’m not bothered – this is a private class, and we’re basically using this as a tuple. Heck, I would have used anonymous types if it weren’t for two issues:

  • I wanted to use Stack<T>, and there’s no way of creating one of those for an anonymous type (without introducing more generic methods to use type inference)
  • I wanted to use a struct – we’ll end up creating a lot of these values, and there’s simply no sense in them being individual objects on the heap. Anonymous types are always classes.

So, as a first step we can transform our code to use this "fake recursion" but still yield at the very end:

var stack = new Stack<LeftRight>();
stack.Push(new LeftRight(0, count – 1));
while (stack.Count > 0)
{
    LeftRight leftRight = stack.Pop();
    int left = leftRight.left;
    int right = leftRight.right;
    if (right > left)
    {
        int pivot = left + (right – left) / 2;
        int pivotPosition = Partition(indexes, keys, left, right, pivot);
        stack.Push(new LeftRight(pivotPosition + 1, right));
        stack.Push(new LeftRight(left, pivotPosition – 1));
    }
}

for (int i = 0; i < indexes.Length; i++) 

    yield return data[indexes[i]]; 
}

We initially push a value of (0, count – 1) to simulate the call to QuickSort(0, count – 1) which started it all before. The code within the loop is very similar to the original QuickSort method, with three changes:

  • We have to grab the next value of LeftRight from the stack, and then separate it into left and right values
  • Instead of calls to QuickSort, we have calls to stack.Push
  • We’ve reversed the order of the recursive calls: in order to sort the left sublist first, we have to push it onto the stack last.

Happy so far? We’re getting very close now. All we need to do is work out when to yield. This is the bit which caused me the most headaches, until I worked out that the "if (right > left)" condition really meant "if we’ve got work to do"… and we’re interested in the exact opposite scenario – when we don’t have any work to do, as that means everything up to and including "right" is already sorted. There are two situations here: either right == left, i.e. we’re sorting one element, or right == left – 1, which will occur if we picked a pivot which was the maximum or minimum value in the list at the previous recursive step.

It’s taken me a little bit of thinking (and just running the code) to persuade me that we will always naturally reach a situation where we end up seeing right == count and right <= left, i.e. a place where we know we’re completely done. But it’s okay – it does happen.

It’s not just a case of yielding the values between left and right though – because otherwise we’d never yield a pivot. Remember how I pointed out that quick sort missed out the pivot when specifying the sublists to recurse into? Well, that’s relevant here. Fortunately, it’s really easy to work out what to do. Knowing that everything up to and including "right" has been sorted means we just need to keep a cursor representing the next index to yield, and then just move that cursor up until it’s positioned beyond "right". The code is probably easier to understand than the description:

int nextYield = 0;

var stack = new Stack<LeftRight>();
stack.Push(new LeftRight(0, count – 1));
while (stack.Count > 0)
{
    LeftRight leftRight = stack.Pop();
    int left = leftRight.left;
    int right = leftRight.right;
    if (right > left)
    {
        int pivot = left + (right – left) / 2;
        int pivotPosition = Partition(indexes, keys, left, right, pivot);
        // Push the right sublist first, so that we *pop* the
        // left sublist first
        stack.Push(new LeftRight(pivotPosition + 1, right));
        stack.Push(new LeftRight(left, pivotPosition – 1));
    }
    else
    {
        while (nextYield <= right)
        {
            yield return data[indexes[nextYield]];
            nextYield++;
        }
    }
}

Tada! It works (at least according to my tests).

I have tried optimizing this a little further, to deal with the case when right == left + 1, i.e. we’re only sorting two elements. It feels like that ought to be cheaper to do explicitly than via pivoting and adding two pointless entries to the stack… but the code gets a lot more complicated (to the point where I had to fiddle significantly to get it working) and from what I’ve seen, it doesn’t make much performance difference. Odd. If this were a production-quality library to be used in performance-critical situations I’d go further in the testing, but as it is, I’m happy to declare victory at this point.

Performance

So, how well does it perform? I’ve only performed crude tests, and they perplex me somewhat. I’m sure that last night, when I was running the "yield at the end" code, my tests were running twice as slowly in Edulinq as in LINQ to Objects. Fair enough – this is just a hobby, Microsoft have no doubt put a lot of performance testing effort into this. (That hasn’t stopped them from messing up "descending" comparers, admittedly, as I found out last night to my amusement.) That was on my "meaty" laptop (which is 64-bit with a quad core i7). On my netbook this morning, the same Edulinq code seemed to be running slightly faster than LINQ to Objects. Odd.

This evening, having pulled the "early out" code from the source repository, the Edulinq implementation is running faster than the LINQ to Objects implementation even when the "early out" isn’t actually doing much good. That’s just plain weird. I blame my benchmarking methodology, which is far from rigorous. I’ve tweaked the parameters of my tests quite a bit, but I haven’t tried all kinds of different key and element types, etc. The basic results are very roughly:

  • When evaluating the whole ordered list, Edulinq appears to run about 10% faster than LINQ to Objects
  • When evaluating only the top 5 of a large ordered list, Edulinq can be much faster. How much faster depends on the size of the list of course, and it still has to perform the initial complete partitioning step – but on 100,000 items it’s regularly about 10x faster than LINQ to Objects.

That makes me happy :) Of course, the code is all open source, so if Microsoft wish to include the Edulinq implementation in .NET 5, they’re quite at liberty to do so, as long as they abide by the terms of the licence. I’m not holding my breath ;)

More seriously, I fully expect there are a bunch of scenarios where my knocked-up-in-an-evening code performs slower than that in the framework. Maybe my approach takes a lot more memory. Maybe it has worse locality of reference in some scenarios. There are all kinds of possibilities here. Full performance analysis was never meant to be the goal of Edulinq. I’m doing this in the spirit of learning more about LINQ – but it’s fun to try to optimize just a little bit. I’m going to delete the increasingly-inaccurately-named MergeSortTest project now – I may institute a few more benchmarks later on though. I’m also removing CompoundComparer and ProjectionComparer, which are no longer used. They’ll live on in part 26a though…

Conclusions

Well that was fun, wasn’t it? I’m pretty pleased with the result. The final code has some nasty generic complexity in it, but it’s not too bad if you keep all the types clear in your mind.

None of the remaining operators will be nearly as complex as this, unless I choose to implement AsQueryable (which I wasn’t planning on doing). On the other hand, as I’ve mentioned before, Max/Sum/etc have oodles of overloads. While I’ll certainly implement all of them, I’m sure I’ll only present the code for selected interesting overloads.

As a bit of light relief, I think I’ll tackle Reverse. That’s about as simple as it gets – although it could still present some interesting options.

Addendum

An earlier version of this post (and the merge sort implementation) had a flawed piece of code for choosing the pivot. Here’s both the old and the new code:

// Old code
int pivot = (left + right) / 2;

// New code
int pivot = left + (right – left) / 2;

The difference is whether or not the code can overflow when left and right are very large. Josh Bloch wrote about it back in 2006. A colleague alerted me to this problem shortly after posting, but it’s taken until now to correct it. (I fixed the source repository almost immediately, but deferred writing this addendum.) Why was I not too worried? Because .NET restricts each object to be less than 2GB in size, even in .NET 4.0, even on a 64-bit CLR. As we’ve created an array of integers, one per entry, that means we can only have just under (int.MaxValue / 4) elements. Within those limits, there’s no problem in the original pivot code. However, it’s still worth fixing of course – one never knows when the restriction will be lifted. The CLR team blogged about the issue back in 2005 (when the 64-bit CLR was new) – I haven’t seen any mentions of plans to remove the limitation, but I would imagine it’s discussed periodically.

One oddity about this is that the Array class itself has some API support for large arrays, such as the LongLength property. To be honest, I can’t see large arrays ever being particularly pleasant to work with – what would they return for the normal Length property, for example, or their implementation of IList<T> etc? I suspect we may see support for larger objects before we see support for arrays with more than int.MaxValue elements, but that’s a complete guess.

13 thoughts on “Reimplementing LINQ to Objects: Part 26d – Fixing the key selectors, and yielding early”

  1. Wow, really long post… but really interesting too, especially the part about yielding early. And congratulations on making it faster than Linq to Objects ;)

    There’s one thing that bothers me in your implementation, though : by calculating all the keys before doing any sorting, you actually evaluate *all* parts of the composite key for each item, which might not be necessary. Typically, if you sort a sequence according to 5 keys, you will rarely need to access the 4th and 5th key, since the comparison will usually have been resolved by the previous keys. The only case where you really need to evaluate the 5th key of each element is when the 4 first keys are equal for all elements. So I think using lazy evaluation for the keys could be a significant improvement… Of course, I didn’t try it, so I’m not sure how big an improvement it would be; it probably depends on how expensive it is to compute the keys.

  2. @Thomas: Yes, that’s certainly a possibility. It will depend not only on how expensive it is to compute the keys, but on how expensive it is to access them lazily. (If nothing else, there’s likely to be an associated memory cost.) Interesting idea though. I don’t know what LINQ to Objects does there, but it would be easy to find out using a side-effecting key comparer.

  3. Indices is the plural of index, where it refers to a subscript (or superscript), e.g. the 3 in x[3] or x^3.

    Indexes is the plural of index, where it refers to the index of a book.

    —-

    Something minor surprised me about your code: you’ve got a mutable struct in there. I know you’re not mutating it, but you’re the biggest opposer of mutable structs I know…

    Your posts lately got me thinking, what is the complexity of getting the first m numbers in an array of size n using your yielding sort. In other words, in a collection of size n, how many times will the comparer x be used in collection.OrderBy(x).Take(m)?

    It seems to me like in quicksort this is O(n log(n)); am I wrong here? I was thinking which sorting method would be the best for this and came up with a surprising conclusion: I have no idea :)
    Seriously though, a heapsort would probably perform much better here. If I’m not mistaken, it would take O(n) to perform the original heapify and then O(log(n)) for each element – so the total complexity would be O(n + m log(n)) which is identical (of course) for taking the entire sequence, but considerably shorter for smaller values of m.
    (And of course smoothsort would be much better but I can’t begin to understand that algorithm :)

    What do you think? Am I right in my complexity calculations? Are there other sorts that would perform even better for smaller values of m without sacrificing the m=n case?

  4. I just had a look at the “real” Linq to Objects code in the reference source. It’s not surprising your implementation is much faster when you only take the first few items: the .NET implementation doesn’t yield early. It sorts the whole buffer, then starts yielding results. I guess MS should take some ideas from Edulinq ;)

    Linq to Objects takes an approach similar to yours: it also computes all the keys eagerly, like you did, and sorts an array of indexes. The structure of the code is quite different, though… much harder to understand IMHO

  5. @configurator: I’m going to stick with indexes as it’s simpler to read :) I imagine it’s also *much* easier to understand for non-native English speakers.

    I’ve fixed the lack of readonly in LeftRight, thanks.

    As for complexity – it’s been a while since I’ve looked at heapsort, to be honest. The trick is to come up with an algorithm which still works well when you take all items, but is faster when you only take a few. I believe heapsort is “usually” slower than quicksort.

    One thing I’ve been thinking about is the possibility of a new extension method called TopBy which would be like OrderBy, but express the limit at the start. It could return IOrderedEnumerable, so that ThenBy would still work.

  6. Jon,

    Alternatively, you could add:

    public static IOrderedEnumerable Take(this IOrderedEnumerable source, int count);

    .. that would return a new IOrderedEnumerable with the count specified in a field? Then you can keep the existing seq.OrderBy(…).ThenBy(…).Take(n) syntax.

  7. edit: That should, of course, say TElement instead of TSource.

    It should also probably return IEnumerable so callers couldn’t misuse it like this: seq.OrderBy(…).ThenBy(…).Take(n).ThenBy(…).

  8. I don’t know that quicksort is faster than heapsort… Doesn’t seem that way according to http://www.sorting-algorithms.com/

    Also, from my quick search I found that the main reason quicksort is faster is locality of reference, and we’ve already forgone that. ( http://stackoverflow.com/q/1853208/9536 )

    Jason, the problem with this approach is that it is very brittle and would break with minor code changes.

    list.OrderBy(…).ThenBy(…).Select(…).Take(….) would use the ‘old’ version of Take, for example.

  9. @configurator: Wikipedia on heap sort claims this:

    “Although somewhat slower in practice on most machines than a well implemented quicksort, it has the advantage of a more favorable worst-case Θ(n log n) runtime.”

    I realise that Wikipedia isn’t always right, but I believe it to be accurate in this case. Not that I’m claiming mine is a particularly well-implemented quicksort :)

    My main reason for not implementing heapsort at the moment is that I believe it would be significantly more work than the two algorithms I’ve implemented so far :)

  10. @skeet

    Knuth (CoCP sec 5.2.3) agrees with Wikipedia (or more a case of visa versa :-)).

    Knuth also notes that Heapsort’s asymptotic running time has a larger constant factor than Quicksort’s average. So, unless one encounter’s one of Quicksort’s poor cases Quicksort will be better.

    I would expect the framwork’s Quicksort takes steps to avoid those bad cases. Actually I think it would be a bug if it didn’t.

  11. Nice post,
    but your code has at least two bugs:
    1)
    int pivot = (left + right) / 2;
    should be
    left + (right – left + 1)/2
    to avoid overflow

    2)
    this does not protect you against
    the pathological case where
    quick sort becomes n^2. Use some
    randomness or select the middle of at
    least 3 elements

    Someone asked what’s the complexity of
    getting the first m elements.
    For 1 element its around 2*n
    For 2 elements its 2n + n
    For 3 elements its 2n + n + n/2

  12. @stefan: I’m aware of the overflow problem, and plan to come back to it with an addendum soon. On the .NET framework itself this isn’t an issue though – because we couldn’t create an array with enough elements for it to be a problem. No object can take more than 1GB of memory, which means that you can’t have an int[] array of more than int.MaxValue/4 elements.

    I don’t regard “not avoiding pathological cases” as a *bug*, but it’s clearly an area where we could improve. I’ve no idea whether LINQ to Objects avoids pathological cases, btw.

Comments are closed.