Category Archives: 8701

Hash Function Added To The PredicateEqualityComparer

Sometime ago I wrote a predicate equality comparer to be used with LINQ’s Distinct operator.

The Distinct operator uses an instance of an internal Set class to maintain the collection of distinct elements in the source collection which in turn checks the hash code of each element (by calling the GetHashCode method of the equality comparer) and only if there’s already an element with the same hash code in the collection calls the Equals method of the comparer to disambiguate.

At the time I provided only the possibility to specify the comparison predicate, but, in some cases, comparing a hash code instead of calling the provided comparer predicate can be a significant performance improvement, I’ve added the possibility to had a hash function to the predicate equality comparer.

You can get the updated code from the PauloMorgado.Linq project on CodePlex,

LINQ: Implementing The SkipLastWhile Operator

LINQ Com C#

Following my last posts (>)(>), in this post I’ll introduce the implementation of the SkipLastWhile operator.

The SkipLastWhile returns all but the last contiguous elements from a a sequence that satisfy the specified criteria and is implemented as the SkipLastWhile extension methods:

public static IEnumerable<TSource> SkipLastWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)


public static IEnumerable<TSource> SkipLastWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)


The implementation of these methods is very simple. We start with an empty buffer and buffer every item that satisfies the criteria implemented by a predicate. Whenever an item doesn’t satisfy the criteria, all buffered items are yield, the buffer is cleared and the the item that doesn’t satisfy the criteria is yield:



var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}


The overload that takes in account the index of the item only differs in the call the predicate that implements the criteria:



var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}



You can find the complete implementation of this operator (and more) CodePlex project for LINQ utilities and operators: PauloMorgado.Linq

LINQ: Implementing The SkipLast Operator

LINQ With C# (Portuguese)

Following my last post, in this post I’ll introduce the implementation of the SkipLast operator.

The SkipLast operator returns all but a specified number of contiguous elements from the end of a sequence and is implemented as the SkipLast extension method:

public static IEnumerable<TSource> SkipLast<TSource>(this IEnumerable<TSource> source, int count)


To implement this operator, first we start by buffering, at most, a count number of items from the source sequence in an array that acts as a circular buffer:



var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++)
{
    buffer[idx] = sourceEnumerator.Current;
}


Next, we iterate over the rest of the items of the source sequence circularly buffering them and yielding the previously buffered item at the same postition:



idx = 0;

while (sourceEnumerator.MoveNext())
{
    var item = buffer[idx];

    buffer[idx] = sourceEnumerator.Current;

    idx = (idx + 1) % count;

    yield return item;
}


If the number of items to skip is greater or equal to the number of items in the source sequence, sourceEnumerator.MoveNext() will return false on the first iteration of the while loop and an empty sequence will be produced.



As with the TakeLast operator, a few optimizations can be made here.



The first is obvious, if the requested number of items is 0 (zero) or lower, we just return an equivalent sequence:



if (count <= 0)
{
    return source.Select(i => i);
}


The second is if the source sequence is known to implement the IList<T> interface. Objects implementing this interface have a Count property and indexed access to its items which allows us to optimize the production of the final sequence.



If the number of items to skip is equal to or greater than the number of items in the source sequence, we just return an empty sequence:



var list = source as IList<TSource>;

if (list != null)
{
    if (count >= list.Count)
    {
        return System.Linq.Enumerable.Empty<TSource>();
    }

    // ...
}


If the number of items in the source sequence is greater than the number of items to skip, producing the final sequence consists of yielding all the items in the source sequence except the last count items:



int returnCount = list.Count - count;

for (int idx = 0; idx < returnCount; idx++)
{
    yield return list[idx];
}


You can find the complete implementation of this operator (and more) CodePlex project for LINQ utilities and operators: PauloMorgado.Linq

LINQ: Introducing The Skip Last Operators

LINQ With C# (Portuguese)

After having introduced the TakeLast operators (>)(>)(>), it makes sense to introduce their duals: the SkipLast operators.

Name Description Example
SkipLast<TSource>(IEnumerable<TSource>)

Returns all but a specified number of contiguous elements from the end of a sequence.

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades
                  .OrderBy(g => g)
                  .SkipLast(3);

Console.WriteLine("All grades except the top three are:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades except the top three are:
56
59
70
82
*/
SkipLastWhile<TSource>(IEnumerable<TSource>, Func<TSource, Boolean>)

Returns all the elements from sequence skipping those at the end as long as the specified condition is true.

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades
                 .OrderBy(grade => grade)
                 .SkipLastWhile(grade => grade >= 80);

Console.WriteLine("All grades below 80:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades below 80:
56
59
70
*/
SkipLastWhile<TSource>(IEnumerable<TSource>, Func<TSource, Int32, Boolean>)

Returns all the elements from sequence skipping those at the end as long as the specified condition is true.

int[] amounts =
    {
        5000,
        2500,
        5500,
        8000,
        6500,
        4000,
        1500,
        9000
    };

var query = amounts
            .SkipLastWhile((amount, index) => amount > index * 1000);

foreach (int amount in query)
{
    Console.WriteLine(amount);
}

/*
 This code produces the following output:

9000
*/


You can find these (and more) operators in my CodePlex project for LINQ utilities and operators: PauloMorgado.Linq

LINQ: Implementing The TakeLastWhile Operator

LINQ With C# (Portuguese)

Following my last posts (>)(>), in this post I’ll introduce the implementation of the TakeLastWhile operator.

The TakeLastWhile operator returns last contiguous elements from a sequence that satisfy the specified criteria and is implemented as the TakeLastWhile extension methods:

public static IEnumerable<TSource> TakeLastWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate)


public static IEnumerable<TSource> TakeLastWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)


The implementation of the first method is very simple. We start with an empty buffer and buffer every item that satisfies the criteria implemented by a predicate. Whenever an item doesn’t satisfy the criteria, the buffer is cleared:



var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}


After traversing the source sequence, we just yield all the items, if any, in the buffer:



foreach (var item in buffer)
{
    yield return item;
}


The overload that takes in account the index of the item only differs in the call the predicate that implements the criteria:



var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}

foreach (var item in buffer)
{
    yield return item;
}


You can find the complete implementation of this operator (and more) CodePlex project for LINQ utilities and operators: PauloMorgado.Linq

LINQ: Implementing The TakeLast Operator

LINQ With C# (Portuguese)

Following my last post, in this post I’ll introduce the implementation of the TakeLast operator.

The TakeLast operator returns a specified number of contiguous elements from the end of a sequence and is implemented as the TakeLast extension method:

public static IEnumerable<TSource> TakeLast<TSource>(this IEnumerable<TSource> source, int count)


To implement this operator, first we start by buffering, at most, a count number of items from the source sequence in an array that acts as a circular buffer:



var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
var numOfItems = 0;
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++, numOfItems++)
{
    buffer[idx] = sourceEnumerator.Current;
}


If the number of buffered items (numOfItems) is less than the requested number of items (count), we just yield all the buffered items:



if (numOfItems < count)
{
    for (idx = 0; idx < numOfItems; idx++)
    {
        yield return buffer[idx];
    }

    yield break;
}


Next, we iterate over the rest of the items circularly buffering them:



for (idx = 0; sourceEnumerator.MoveNext(); idx = (idx + 1) % count)
{
    buffer[idx] = sourceEnumerator.Current;
}


And finally, we just iterate over the buffered items and yield them:



for (; numOfItems > 0; idx = (idx + 1) % count, numOfItems--)
{
    yield return buffer[idx];
}


There are two optimizations you can make here.



The first is obvious, if the requested number of items is 0 (zero), we just return an empty sequence:



if (count <= 0)
{
    return System.Linq.Enumerable.Empty<TSource>();
}


The second is if the source sequence is known to implement the IList<T> interface. Objects implementing this interface have a Count property and indexed access to its items which allows us to optimize the production of the final sequence.



Producing the final sequence consists of yielding the required number of items from the end of the list (or all of them if the list contains less items than required):



int listCount = list.Count;

for (int idx = listCount - ((count < listCount) ? count : listCount); idx < listCount; idx++)
{
    yield return list[idx];
}


You can find the complete implementation of this operator (and more) CodePlex project for LINQ utilities and operators: PauloMorgado.Linq

LINQ: Introducing The Take Last Operators

LINQ With C# (Portuguese)

Some time ago I needed to retrieve the last items of a sequence that satisfied some criteria and, looking at the operators available in the Enumerable class, I noticed that there wasn’t such operator.

The only way to achieve this was to reverse the sequence, take the items that satisfied the criteria and reverse the resulting sequence. Something like this:

sequence.Reverse().TakeWhile(criteria).Reverse();


Looks quite simple, right? First we call the Reverse method to produce a new sequence with the same items as the original sequence but in the reverse order, then we call the TakeWhile method to take the first items that satisfy the criteria and then call the Reverse method again to restore the original order of the items.



The problem with this approach is that the Reverse method buffers the entire sequence before iterating through its items in the reverse order – and the above code uses it twice. This means iterating over all items in the original sequence and buffer them all, iterating over first items of the resulting sequence that satisfy the criteria and buffer them all and, finally, iterate over that result to produce the final sequence.



If you’re counting, you’ve come to the conclusion that all items in the original sequence will be iterated over once and the ones in the resulting sequence will be iterated again three times. If the original sequence is large, this can take lots of memory and time.



There’s another issue if you’re using the variant the uses the index of the item in the original sequence in the evaluation of the selection criteria (>). When we reverse the order of the items, the indexes will be reversed and the predicate must take that in account, which might not be possible if you don’t know the number of items in the original sequence.



There must be a better way, and that’s why I implemented the Take Last Operators:



Name Description Example
TakeLast<TSource>(IEnumerable<TSource>)

Returns a specified number of contiguous elements from the end of a sequence.

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var topThreeGrades = grades
                     .OrderBy(grade => grade)
                     .TakeLast(3);

Console.WriteLine("The top three grades are:");
foreach (int grade in topThreeGrades)
{
    Console.WriteLine(grade);
}
/*
This code produces the following output:

The top three grades are:
98
92
85
*/
TakeLastWhile<TSource>(IEnumerable<TSource>, Func<TSource, Boolean>)

Returns the elements from the end of a sequence as long as the specified condition is true.

string[] fruits =
    {
        "apple",
        "passionfruit",
        "banana",
        "mango",
        "orange",
        "blueberry",
        "grape",
        "strawberry"
    };

var query = fruits
            .TakeLastWhile(fruit => string.Compare("orange", fruit, true) != 0);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:
blueberry
grape
strawberry
*/
TakeLastWhile<TSource>(IEnumerable<TSource>, Func<TSource, Int32, Boolean>)

Returns the elements from the end of a sequence as long as the specified condition is true.

string[] fruits =
    {
        "apple",
        "passionfruit",
        "banana",
        "mango",
        "orange",
        "blueberry",
        "grape",
        "strawberry"
    };

var query = fruits
            .TakeLastWhile((fruit, index) => fruit.Length >= index);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:

strawberry
*/


You can find these (and more) operators in my CodePlex project for LINQ utilities and operators: PauloMorgado.Linq